TWI705696B - Improvement on interleaved prediction - Google Patents

Improvement on interleaved prediction

Info

Publication number
TWI705696B
Authority
TW
Taiwan
Prior art keywords
block
prediction
sub
current video
current
Prior art date
Application number
TW108123130A
Other languages
Chinese (zh)
Other versions
TW202007154A (en)
Inventor
Kai Zhang (張凱)
Li Zhang (張莉)
Hongbin Liu (劉鴻彬)
Yue Wang (王悅)
Original Assignee
Beijing Bytedance Network Technology Co., Ltd.
ByteDance Inc.
Priority date
Filing date
Publication date
Application filed by Beijing Bytedance Network Technology Co., Ltd. and ByteDance Inc.
Publication of TW202007154A publication Critical patent/TW202007154A/en
Application granted granted Critical
Publication of TWI705696B publication Critical patent/TWI705696B/en


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/577 Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
    • H04N19/537 Motion estimation other than block-based
    • H04N19/513 Processing of motion vectors
    • H04N19/55 Motion estimation with spatial constraints, e.g. at image or region borders
    • H04N19/573 Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Devices, systems and methods for video processing are described. In a representative aspect, a video processing method includes determining, based on a component type of a current video block, whether an interleaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block, and, upon determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction includes dividing a portion of the current video block into at least one sub-block using more than one dividing pattern and generating a predictor for the current video block as a weighted average of the predictors determined for each of the more than one dividing patterns.

Description

Improvement on interleaved prediction

This patent document relates to video encoding and decoding technologies, devices and systems.

Despite advances in video compression, digital video still accounts for the largest share of bandwidth use on the Internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video grows, the bandwidth demand for digital video usage is expected to continue to increase.

This document discloses techniques that can be used in video encoding and decoding embodiments to improve the performance of sub-block-based coding, in particular when an affine motion coding mode is used.

In one embodiment, a video processing method is provided, comprising: determining, based on a component type of a current video block, whether an interleaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block; and, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction comprises dividing a portion of the current video block into at least one sub-block using more than one dividing pattern, and generating a predictor for the current video block as a weighted average of the predictors determined for each of the more than one dividing patterns.

In another embodiment, a video processing method is provided, comprising: determining, based on a prediction direction of a current video block, whether an interleaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block; and, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction comprises dividing a portion of the current video block into at least one sub-block using more than one dividing pattern, and generating a predictor for the current video block as a weighted average of the predictors determined for each of the more than one dividing patterns.

In another embodiment, a video processing method is provided, comprising: determining, based on a low-delay mode of a current picture, whether an interleaved prediction mode is applicable to a conversion between a current video block in the current picture and a bitstream representation of the current video block; and, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction comprises dividing a portion of the current video block into at least one sub-block using more than one dividing pattern, and generating a predictor for the current video block as a weighted average of the predictors determined for each of the more than one dividing patterns.

In another embodiment, a video processing method is provided, comprising: determining, based on use of a current picture containing a current video block as a reference, whether an interleaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block; and, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction comprises dividing a portion of the current video block into at least one sub-block using more than one dividing pattern, and generating a predictor for the current video block as a weighted average of the predictors determined for each of the more than one dividing patterns.

In another embodiment, a video processing method is provided, comprising: selectively performing, based on a video condition, interleaved-prediction-based coding of one or more of a luma component, a first chroma component and a second chroma component of a video frame of a video, wherein performing interleaved prediction includes determining a prediction block for a current block of the video component by: selecting a set of pixels of the component of the video frame to form a block; partitioning the block into a first set of sub-blocks according to a first pattern; generating a first intermediate prediction block based on the first set of sub-blocks; partitioning the block into a second set of sub-blocks according to a second pattern, wherein at least one sub-block of the second set is not in the first set; generating a second intermediate prediction block based on the second set of sub-blocks; and determining the prediction block based on the first intermediate prediction block and the second intermediate prediction block.
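The weighted-average combination used in the embodiments above can be sketched numerically. This is a minimal illustration, not the patent's implementation; the function and variable names are hypothetical, and the weight pair {3, 1} is one of the example pairs later mentioned for Figure 15.

```python
import numpy as np

def interleaved_prediction(pred_pattern0: np.ndarray,
                           pred_pattern1: np.ndarray,
                           w0: int = 3, w1: int = 1) -> np.ndarray:
    """Combine the two intermediate prediction blocks (one per dividing
    pattern) into the final predictor as an integer weighted average,
    with rounding, as video codecs typically do."""
    total = w0 + w1
    return (w0 * pred_pattern0 + w1 * pred_pattern1 + total // 2) // total

# Two hypothetical 4x4 intermediate predictors from the two patterns.
p0 = np.full((4, 4), 100, dtype=np.int64)
p1 = np.full((4, 4), 104, dtype=np.int64)
final = interleaved_prediction(p0, p1)  # (3*100 + 1*104 + 2) // 4 = 101
```

In practice the weights would vary per sample position (see Figure 15), but the combination rule is the same weighted average.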

In yet another embodiment, a video encoder apparatus that implements a video encoding method described herein is disclosed.

In yet another representative aspect, the various techniques described herein are embodied as a computer program product stored on a non-transitory computer-readable medium. The computer program product includes program code for carrying out the methods described herein.

In yet another representative aspect, a video decoder apparatus may implement the methods described herein.

The details of one or more implementations are set forth in the accompanying attachments, the drawings, and the description below. Other features will be apparent from the description, the drawings, and the claims.

v1, v0, vx, vy, MVx, MVy: vectors

A to E, 1300 to 1305, P, Pi: video blocks

Wa, Wb: weight values

τ0, τ1: distances

Ref0, Ref1: reference frames

1800: video processing apparatus

1802: processor

1804: memory

1806: video processing circuitry

1900, 2000, 2100, 2200, 2300: methods

1902 to 1904, 2002 to 2004, 2102 to 2104, 2202 to 2204, 2302 to 2312: steps

Figure 1 shows an example of sub-block-based prediction.

Figure 2 shows an example of the simplified affine motion model.

Figure 3 shows an example of the affine motion vector field (MVF) per sub-block.

Figure 4 shows an example of motion vector prediction (MVP) for AF_INTER mode.

Figures 5A and 5B show examples of candidates for the AF_MERGE coding mode.

Figure 6 shows an example process of alternative temporal motion vector prediction (ATMVP) for a coding unit (CU).

Figure 7 shows an example of one CU with four sub-blocks (A-D) and its neighbouring blocks (a-d).

Figure 8 shows an example of an optical flow trajectory in video coding.

Figures 9A and 9B show examples of the bi-directional optical flow (BIO) coding technique without block extension. Figure 9A shows an example of access positions outside of a block, and Figure 9B shows an example of the padding used to avoid extra memory accesses and calculation.

Figure 10 shows an example of bilateral matching.

Figure 11 shows an example of template matching.

Figure 12 shows an example of unilateral motion estimation (ME) in frame-rate up conversion (FRUC).

Figure 13 shows an example implementation of interleaved prediction.

Figures 14A to 14C show examples of partial interleaved prediction. Dotted lines represent the first dividing pattern, solid lines represent the second dividing pattern, and bold lines mark the region where interleaved prediction is applied; outside this region, interleaved prediction is not applied.

Figure 15 shows an example of the weight values in a sub-block. Exemplary weight pairs {Wa, Wb} are {3, 1}, {7, 1}, {5, 3}, {13, 3}, and so on.

Figure 16 shows an example of interleaved prediction with two dividing patterns in accordance with the disclosed technology.

Figure 17A shows an example dividing pattern in which a block is divided into 4×4 sub-blocks, in accordance with the disclosed technology.

Figure 17B shows an example dividing pattern in which a block is divided into 8×8 sub-blocks, in accordance with the disclosed technology.

Figure 17C shows an example dividing pattern in which a block is divided into 4×8 sub-blocks, in accordance with the disclosed technology.

Figure 17D shows an example dividing pattern in which a block is divided into 8×4 sub-blocks, in accordance with the disclosed technology.

Figure 17E shows an example dividing pattern in which a block is divided into non-uniform sub-blocks, in accordance with the disclosed technology.

Figure 17F shows another example dividing pattern in which a block is divided into non-uniform sub-blocks, in accordance with the disclosed technology.

Figure 17G shows yet another example dividing pattern in which a block is divided into non-uniform sub-blocks, in accordance with the disclosed technology.

Figure 18 is a block diagram of an example hardware platform for implementing the video processing methods described in this document.

Figure 19 is a flowchart of an example method of video processing described in this document.

Figure 20 is a flowchart of another example method of video processing described in this document.

Figure 21 is a flowchart of another example method of video processing described in this document.

Figure 22 is a flowchart of another example method of video processing described in this document.

Figure 23 is a flowchart of another example method of video processing described in this document.

Section headings are used in this document to improve readability and do not limit the techniques and embodiments described in a section to that section only.

To improve the compression ratio of video, researchers continually seek new techniques for encoding video.

1. Introduction

The present invention relates to video/image coding technologies. Specifically, it relates to sub-block-based prediction in video/image coding. It may be applied to existing video coding standards such as HEVC, or to the standard to be finalized (Versatile Video Coding). It may also be applicable to future video/image coding standards or video/image codecs. The present invention can further improve P1805026601.

Brief discussion

Sub-block-based prediction was first introduced into video coding standards by HEVC Annex I (3D-HEVC). With sub-block-based prediction, a block, such as a coding unit (CU) or a prediction unit (PU), is divided into several non-overlapping sub-blocks. Different sub-blocks may be assigned different motion information, such as reference indices or motion vectors (MVs), and motion compensation (MC) is performed individually for each sub-block. Figure 1 illustrates the concept of sub-block-based prediction.
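The dividing step described above can be sketched as follows; this is an illustrative helper (names and sizes are hypothetical), showing how a block is enumerated as non-overlapping sub-blocks, each of which would then be motion compensated with its own MV.

```python
def split_into_subblocks(width: int, height: int, sw: int, sh: int):
    """Enumerate the top-left corners (x, y) of the non-overlapping
    sw-by-sh sub-blocks of a width-by-height block, in raster order."""
    return [(x, y) for y in range(0, height, sh) for x in range(0, width, sw)]

# A 16x16 CU divided into 4x4 sub-blocks yields 16 sub-blocks.
subs = split_into_subblocks(16, 16, 4, 4)
```

Each entry in `subs` would carry its own motion information in a real codec; here only the geometry is shown.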

To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded jointly by VCEG and MPEG in 2015. Since then, JVET has adopted many new methods and incorporated them into reference software named the Joint Exploration Model (JEM).

In JEM, sub-block-based prediction is adopted in several coding tools, such as affine prediction, alternative temporal motion vector prediction (ATMVP), spatial-temporal motion vector prediction (STMVP), bi-directional optical flow (BIO), and frame-rate up conversion (FRUC).

2.1 Affine prediction

In HEVC, only a translational motion model is applied for motion compensation prediction (MCP), whereas in the real world there are many kinds of motion, such as zoom in/out, rotation, perspective motion, and other irregular motions. In JEM, a simplified affine transform motion compensation prediction is applied. As shown in Figure 2, the affine motion field of a block is described by two control-point motion vectors.

The motion vector field (MVF) of the block is described by the following equation:

$$v_x=\frac{(v_{1x}-v_{0x})}{w}\,x-\frac{(v_{1y}-v_{0y})}{w}\,y+v_{0x},\qquad v_y=\frac{(v_{1y}-v_{0y})}{w}\,x+\frac{(v_{1x}-v_{0x})}{w}\,y+v_{0y} \tag{1}$$

where (v0x, v0y) is the motion vector of the top-left corner control point and (v1x, v1y) is the motion vector of the top-right corner control point.

To further simplify motion compensation prediction, sub-block-based affine transform prediction is applied. The sub-block size M×N is derived as in equation (2), where MvPre is the motion vector fractional accuracy (1/16 in JEM) and (v2x, v2y) is the motion vector of the bottom-left control point, calculated according to equation (1).

$$M=\operatorname{clip3}\!\left(4,\;w,\;\frac{w\cdot MvPre}{\max\left(\lvert v_{1x}-v_{0x}\rvert,\;\lvert v_{1y}-v_{0y}\rvert\right)}\right),\qquad N=\operatorname{clip3}\!\left(4,\;h,\;\frac{h\cdot MvPre}{\max\left(\lvert v_{2x}-v_{0x}\rvert,\;\lvert v_{2y}-v_{0y}\rvert\right)}\right) \tag{2}$$

After being derived by equation (2), M and N should be adjusted downward, if necessary, to make them divisors of w and h, respectively.

As shown in Figure 3, to derive the motion vector of each M×N sub-block, the motion vector of the centre sample of each sub-block is calculated according to equation (1) and rounded to 1/16 fractional accuracy. Then, motion compensation interpolation filters are applied to generate the prediction of each sub-block with the derived motion vector.
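The centre-sample derivation above follows directly from equation (1). A minimal sketch, with illustrative function names and without the 1/16 rounding step, assuming the two-control-point model stated in the text:

```python
def affine_mv(v0, v1, w, x, y):
    """Equation (1): the MV at position (x, y) inside a block of width w,
    given the top-left control-point MV v0 and the top-right one v1."""
    a = (v1[0] - v0[0]) / w
    b = (v1[1] - v0[1]) / w
    return (a * x - b * y + v0[0], b * x + a * y + v0[1])

def subblock_center_mvs(w, h, sw, sh, v0, v1):
    """MV at the centre sample of every sw-by-sh sub-block (Figure 3),
    keyed by the sub-block's top-left corner."""
    return {(x, y): affine_mv(v0, v1, w, x + sw / 2, y + sh / 2)
            for y in range(0, h, sh) for x in range(0, w, sw)}

# Hypothetical zoom-like field: top-right control point displaced by (4, 0).
mvs = subblock_center_mvs(16, 16, 4, 4, v0=(0.0, 0.0), v1=(4.0, 0.0))
```

With these control points the per-sub-block MVs grow linearly away from the top-left corner, as the model predicts.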

After MCP, the high-accuracy motion vector of each sub-block is rounded and saved with the same accuracy as a normal motion vector.

In JEM, there are two affine motion modes: AF_INTER mode and AF_MERGE mode. For CUs with both width and height larger than 8, AF_INTER mode can be applied. An affine flag at the CU level is signalled in the bitstream to indicate whether AF_INTER mode is used. In this mode, a candidate list with motion vector pairs {(v0, v1) | v0 = {vA, vB, vC}, v1 = {vD, vE}} is constructed using the neighbouring blocks. As shown in Figure 4, v0 is selected from the motion vectors of block A, B or C. The motion vector from a neighbouring block is scaled according to the reference list and according to the relationship among the POC of the reference for the neighbouring block, the POC of the reference for the current CU, and the POC of the current CU. The approach for selecting v1 from neighbouring blocks D and E is similar. If the number of candidates in the list is smaller than 2, the list is padded with motion vector pairs composed by duplicating each of the AMVP candidates. When the candidate list is larger than 2, the candidates are first sorted according to the consistency of the neighbouring motion vectors (the similarity of the two motion vectors in a candidate pair) and only the first two candidates are kept. An RD cost check is used to determine which motion vector pair candidate is selected as the control-point motion vector prediction (CPMVP) of the current CU, and an index indicating the position of the CPMVP in the candidate list is signalled in the bitstream. After the CPMVP of the current affine CU is determined, affine motion estimation is applied and the control-point motion vector (CPMV) is found. The difference between the CPMV and the CPMVP is then signalled in the bitstream.

When a CU is coded in AF_MERGE mode, it gets the first block coded with affine mode from the valid neighbouring reconstructed blocks, with the selection order for the candidate blocks being from left, above, above-right, below-left to above-left, as shown in Figure 5A. If the neighbouring below-left block A is coded in affine mode, as shown in Figure 5B, the motion vectors v2, v3 and v4 of the top-left corner, above-right corner and below-left corner of the CU containing block A are derived, and the motion vector v0 of the top-left corner of the current CU is calculated according to v2, v3 and v4. Secondly, the motion vector v1 of the above-right of the current CU is calculated.

After the CPMVs v0 and v1 of the current CU are derived, the MVF of the current CU is generated according to the simplified affine motion model of equation (1). To identify whether the current CU is coded with AF_MERGE mode, an affine flag is signalled in the bitstream when there is at least one neighbouring block coded in affine mode.

2.2 ATMVP

In the alternative temporal motion vector prediction (ATMVP) method, temporal motion vector prediction (TMVP) is modified by fetching multiple sets of motion information (including motion vectors and reference indices) from blocks smaller than the current CU. As shown in Figure 6, the sub-CUs are square N×N blocks (N is set to 4 by default).

ATMVP predicts the motion vectors of the sub-CUs within a CU in two steps. The first step is to identify the corresponding block in a reference picture with a so-called temporal vector. The reference picture is also called the motion source picture. The second step is to split the current CU into sub-CUs and to obtain the motion vector as well as the reference index of each sub-CU from the block corresponding to that sub-CU, as shown in Figure 6.

In the first step, the reference picture and the corresponding block are determined by the motion information of the spatial neighbouring blocks of the current CU. To avoid a repetitive scanning process over the neighbouring blocks, the first merge candidate in the merge candidate list of the current CU is used. The first available motion vector and its associated reference index are set to be the temporal vector and the index of the motion source picture. This way, the corresponding block can be identified more accurately in ATMVP than in TMVP, where the corresponding block (sometimes called a collocated block) is always at the bottom-right or centre position relative to the current CU.

In the second step, the corresponding block of a sub-CU is identified by the temporal vector in the motion source picture, by adding the temporal vector to the coordinates of the current CU. For each sub-CU, the motion information of its corresponding block (the smallest motion grid that covers the centre sample) is used to derive the motion information for the sub-CU. After the motion information of the corresponding N×N block is identified, it is converted to the motion vectors and reference indices of the current sub-CU, in the same way as in the TMVP of HEVC, where motion scaling and other procedures apply. For example, the decoder checks whether the low-delay condition is fulfilled (i.e., the POCs of all reference pictures of the current picture are smaller than the POC of the current picture) and possibly uses the motion vector MVx (the motion vector corresponding to reference picture list X) to predict the motion vector MVy for each sub-CU (with X being equal to 0 or 1 and Y being equal to 1−X).
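The low-delay condition mentioned above is a simple POC comparison; a minimal sketch, with an illustrative function name not taken from the patent:

```python
def is_low_delay(current_poc: int, reference_pocs: list) -> bool:
    """ATMVP's low-delay check: every reference picture of the current
    picture has a smaller POC, i.e. all references precede the current
    picture in output order."""
    return all(poc < current_poc for poc in reference_pocs)

# All references in the past of POC 8: low-delay holds.
low_delay = is_low_delay(8, [0, 4])
```

Only when this check passes may the list-X motion vector be used to predict the list-Y motion vector as described in the text.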

3. STMVP

In this method, the motion vectors of the sub-CUs are derived recursively, following raster scan order. Figure 7 illustrates this concept. Consider an 8×8 CU 700 that contains four 4×4 sub-CUs A, B, C and D. The neighbouring 4×4 blocks in the current frame are labelled a, b, c and d.

The motion derivation for sub-CU A starts by identifying its two spatial neighbours. The first neighbour is the N×N block above sub-CU A (block c). If this block c is not available or is intra coded, the other N×N blocks above sub-CU A are checked (from left to right, starting at block c). The second neighbour is a block to the left of sub-CU A (block b). If block b is not available or is intra coded, other blocks to the left of sub-CU A are checked (from top to bottom, starting at block b). The motion information obtained from the neighbouring blocks for each list is scaled to the first reference frame for the given list. Next, the temporal motion vector predictor (TMVP) of sub-block A is derived by following the same procedure of TMVP derivation as specified in HEVC. The motion information of the collocated block at position D is fetched and scaled accordingly. Finally, after retrieving and scaling the motion information, all available motion vectors (up to 3) are averaged separately for each reference list. The averaged motion vector is assigned as the motion vector of the current sub-CU.
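The final averaging step above can be sketched as follows; the names are illustrative, and unavailable candidates are modelled as `None`:

```python
def stmvp_average(candidates: list) -> tuple:
    """Average the available motion vectors for one reference list
    (spatial-above, spatial-left and TMVP candidates, up to 3).
    Unavailable candidates are passed as None and skipped."""
    avail = [mv for mv in candidates if mv is not None]
    n = len(avail)
    return (sum(mv[0] for mv in avail) / n,
            sum(mv[1] for mv in avail) / n)

# Above neighbour (4, 0) and TMVP (0, 2) available; left neighbour missing.
mv = stmvp_average([(4, 0), None, (0, 2)])
```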

4. BIO

Bi-directional optical flow (BIO) is a sample-wise motion refinement performed on top of block-wise motion compensation for bi-prediction. The sample-level motion refinement does not use signalling.

Let I^(k) denote the luma value from reference k (k = 0, 1) after block motion compensation, and let ∂I^(k)/∂x and ∂I^(k)/∂y denote the horizontal and vertical components of the gradient of I^(k), respectively. Assuming the optical flow is valid, the motion vector field (vx, vy) is given by the equation:

∂I^(k)/∂t + vx·∂I^(k)/∂x + vy·∂I^(k)/∂y = 0

Combining this optical flow equation with Hermite interpolation of the motion trajectory of each sample yields a unique third-order polynomial that matches both the function values I^(k) and the derivatives ∂I^(k)/∂x, ∂I^(k)/∂y at its ends. The value of this polynomial at t = 0 is the BIO prediction:

predBIO = 1/2 · (I^(0) + I^(1) + vx/2·(τ1·∂I^(1)/∂x − τ0·∂I^(0)/∂x) + vy/2·(τ1·∂I^(1)/∂y − τ0·∂I^(0)/∂y))

Here, τ0 and τ1 denote the distances to the reference frames, as shown in Figure 8. The distances τ0 and τ1 are computed based on the POCs of Ref0 and Ref1: τ0 = POC(current) − POC(Ref0), τ1 = POC(Ref1) − POC(current). If both predictions come from the same temporal direction (either both from the past or both from the future), the signs are different (i.e., τ0·τ1 < 0). In this case, BIO is applied only if the predictions do not come from the same time instant (i.e., τ0 ≠ τ1), both reference regions have non-zero motion (i.e., MVx0, MVy0, MVx1, MVy1 ≠ 0), and the block motion vectors are proportional to the temporal distances (i.e., MVx0/MVx1 = MVy0/MVy1 = −τ0/τ1).
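The POC-based distance computation and the same-direction check above can be sketched as follows; the function name and return convention are illustrative assumptions.

```python
def bio_temporal_distances(poc_cur, poc_ref0, poc_ref1):
    """Compute tau0 = POC(current) - POC(Ref0) and
    tau1 = POC(Ref1) - POC(current), and flag whether both predictions
    come from the same temporal direction (tau0 * tau1 < 0)."""
    tau0 = poc_cur - poc_ref0
    tau1 = poc_ref1 - poc_cur
    same_direction = tau0 * tau1 < 0
    return tau0, tau1, same_direction
```

For a current picture at POC 8 with references at POC 4 and POC 16, the distances are τ0 = 4 and τ1 = 8, and the predictions come from opposite directions.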

The motion vector field (vx, vy) is determined by minimizing the difference Δ between the values at points A and B (the intersection points of the motion trajectory with the reference frame planes in Figures 9A and 9B). The model uses only the first linear term of the local Taylor expansion for Δ:

Δ = (I^(0) − I^(1)) + vx·(τ1·∂I^(1)/∂x + τ0·∂I^(0)/∂x) + vy·(τ1·∂I^(1)/∂y + τ0·∂I^(0)/∂y)    (5)

All values in Equation (5) depend on the sample position (i′, j′), which has been omitted so far. Assuming the motion is consistent in the local surrounding area, Δ is minimized inside a (2M+1)×(2M+1) square window Ω centered on the current prediction point (i, j), where M equals 2:

(vx, vy) = argmin over (vx, vy) of Σ over [i′,j′]∈Ω of Δ²[i′, j′]

For this optimization problem, the JEM uses a simplified approach that first minimizes in the vertical direction and then in the horizontal direction. This results in:

vx = ((s1 + r) > m) ? clip3(−thBIO, thBIO, −s3/(s1 + r)) : 0    (7)

vy = ((s5 + r) > m) ? clip3(−thBIO, thBIO, −(s6 − vx·s2/2)/(s5 + r)) : 0    (8)

where

s1 = Σ over [i′,j′]∈Ω of (τ1·∂I^(1)/∂x + τ0·∂I^(0)/∂x)²
s2 = Σ over [i′,j′]∈Ω of (τ1·∂I^(1)/∂x + τ0·∂I^(0)/∂x)·(τ1·∂I^(1)/∂y + τ0·∂I^(0)/∂y)
s3 = Σ over [i′,j′]∈Ω of (I^(1) − I^(0))·(τ1·∂I^(1)/∂x + τ0·∂I^(0)/∂x)
s5 = Σ over [i′,j′]∈Ω of (τ1·∂I^(1)/∂y + τ0·∂I^(0)/∂y)²
s6 = Σ over [i′,j′]∈Ω of (I^(1) − I^(0))·(τ1·∂I^(1)/∂y + τ0·∂I^(0)/∂y)    (9)

To avoid division by zero or by a very small value, regularization parameters r and m are introduced in Equations (7) and (8):

r = 500·4^(d−8)    (10)

m = 700·4^(d−8)    (11)

Here d is the bit depth of the video samples.

To keep the memory access for BIO the same as for regular bi-predictive motion compensation, all prediction and gradient values I^(k), ∂I^(k)/∂x, ∂I^(k)/∂y are computed only for positions inside the current block. In Equation (9), a (2M+1)×(2M+1) square window Ω centered on a prediction point on the boundary of the prediction block needs to access positions outside the block (as shown in Figure 9A). In the JEM, the values of I^(k), ∂I^(k)/∂x, ∂I^(k)/∂y outside the block are set equal to the nearest available value inside the block. For example, this can be implemented as padding, as shown in Figure 9B.

With BIO, the motion field can be refined for each sample. To reduce the computational complexity, however, a block-based design of BIO is used in the JEM: the motion refinement is computed based on 4×4 blocks. In block-based BIO, the values sn of Equation (9) are aggregated over all samples in a 4×4 block, and the aggregated values of sn are then used to derive the BIO motion vector offset for that 4×4 block. More specifically, the following equation is used for block-based BIO derivation:

s(n,bk) = Σ over (x,y)∈bk of sn(x, y)

where bk denotes the set of samples belonging to the k-th 4×4 block of the prediction block. sn in Equations (7) and (8) is replaced by (s(n,bk) >> 4) to derive the associated motion vector offsets.

In some cases, the MV regiment (refinement) of BIO may be unreliable due to noise or irregular motion. Therefore, in BIO, the magnitude of the MV regiment is clipped to a threshold thBIO. The threshold is determined based on whether all the reference pictures of the current picture come from one direction. If all the reference pictures of the current picture come from one direction, the value of the threshold is set to 12×2^(14−d); otherwise, it is set to 12×2^(13−d).
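Equations (7) through (11) and the thBIO clipping can be illustrated with the following sketch. The floating-point arithmetic and function names are illustrative assumptions (a real codec works in fixed point); s1, s2, s3, s5, s6 are the window sums of Equation (9).

```python
def clip3(lo, hi, x):
    # Standard clip3 operation: clamp x to the range [lo, hi].
    return max(lo, min(hi, x))

def bio_motion_refinement(s1, s2, s3, s5, s6, bit_depth, all_refs_one_direction):
    """Solve Equations (7) and (8) with the regularization parameters of
    Equations (10) and (11) and the thBIO threshold described above."""
    d = bit_depth
    r = 500 * 4 ** (d - 8)   # Equation (10)
    m = 700 * 4 ** (d - 8)   # Equation (11)
    # Threshold depends on whether all reference pictures come from one direction.
    th_bio = 12 * 2 ** (14 - d) if all_refs_one_direction else 12 * 2 ** (13 - d)
    vx = clip3(-th_bio, th_bio, -s3 / (s1 + r)) if (s1 + r) > m else 0
    vy = clip3(-th_bio, th_bio, -(s6 - vx * s2 / 2) / (s5 + r)) if (s5 + r) > m else 0
    return vx, vy
```

When the denominators are too small ((s1 + r) <= m or (s5 + r) <= m), the corresponding refinement component is simply set to zero.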

The gradients for BIO are computed at the same time as the motion compensation interpolation, using operations consistent with the HEVC motion compensation process (2D separable FIR). The input to this 2D separable FIR is the same reference frame as for the motion compensation process, together with the fractional position (fracX, fracY) derived from the fractional part of the block motion vector. For the horizontal gradient ∂I/∂x, the signal is first interpolated vertically using BIOfilterS corresponding to the fractional position fracY with de-scaling shift d−8; the gradient filter BIOfilterG corresponding to the fractional position fracX with de-scaling shift 18−d is then applied in the horizontal direction. For the vertical gradient ∂I/∂y, the gradient filter is first applied vertically using BIOfilterG corresponding to the fractional position fracY with de-scaling shift d−8; signal displacement is then performed in the horizontal direction using BIOfilterS corresponding to the fractional position fracX with de-scaling shift 18−d. The interpolation filters for gradient calculation (BIOfilterG) and signal displacement (BIOfilterS) are kept short (6-tap) to maintain reasonable complexity. The first table below shows the filters used for gradient calculation at the different fractional positions of the block motion vector in BIO; the second table shows the interpolation filters used for prediction signal generation in BIO.

[Table: gradient calculation filters (BIOfilterG) for the fractional positions of the block motion vector in BIO]

[Table: interpolation filters (BIOfilterS) for prediction signal generation in BIO]

In the JEM, BIO is applied to all bi-predicted blocks when the two predictions come from different reference pictures. When LIC is enabled for a CU, BIO is disabled.

In the JEM, OBMC is applied to a block after the normal MC process. To reduce computational complexity, BIO is not applied during the OBMC process. This means that BIO is applied in the MC process of a block only when its own MV is used, and is not applied in the MC process when the MV of a neighboring block is used during the OBMC process.

2.5 FRUC

When the merge flag of a CU is true, a FRUC flag is signaled for the CU. When the FRUC flag is false, a merge index is signaled and the regular merge mode is used. When the FRUC flag is true, an additional FRUC mode flag is signaled to indicate which method (bilateral matching or template matching) is to be used to derive the motion information of the block.

On the encoder side, the decision on whether to use FRUC merge mode for a CU is based on RD cost selection, as is done for normal merge candidates. That is, the two matching modes (bilateral matching and template matching) are both checked for a CU by using RD cost selection. The mode leading to the minimal cost is further compared with other CU modes. If the FRUC matching mode is the most efficient one, the FRUC flag is set to true for the CU and the related matching mode is used.

The motion derivation process in FRUC merge mode has two steps: a CU-level motion search is performed first, followed by sub-CU-level motion refinement. At the CU level, an initial motion vector is derived for the whole CU based on bilateral matching or template matching. First, a list of MV candidates is generated, and the candidate leading to the minimum matching cost is selected as the starting point for further CU-level refinement. Then a local search based on bilateral matching or template matching is performed around the starting point, and the MV resulting in the minimum matching cost is taken as the MV for the whole CU. Subsequently, the motion information is further refined at the sub-CU level with the derived CU motion vectors as the starting points.

For example, the following derivation process is performed for a W×H CU motion information derivation. At the first stage, the MV for the whole W×H CU is derived. At the second stage, the CU is further split into M×M sub-CUs. The value of M is calculated as in (16), where D is a predefined splitting depth that is set to 3 by default in the JEM. Then the MV for each sub-CU is derived.

M = max{4, min{W, H}/2^D}    (16)
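The sub-CU size of Equation (16) can be computed as below; the function name is an illustrative assumption.

```python
def fruc_sub_cu_size(width, height, depth=3):
    """M = max(4, min(W, H) / 2^D); D is the predefined splitting depth,
    set to 3 by default in the JEM."""
    return max(4, min(width, height) >> depth)
```

For example, a 64×32 CU is split into 4×4 sub-CUs, while a 128×128 CU is split into 16×16 sub-CUs.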

As shown in Figure 10, bilateral matching is used to derive the motion information of the current CU by finding the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures. Under the assumption of a continuous motion trajectory, the motion vectors MV0 and MV1 pointing to the two reference blocks should be proportional to the temporal distances between the current picture and the two reference pictures, i.e., TD0 and TD1. As a special case, when the current picture is temporally between the two reference pictures and the temporal distances from the current picture to the two reference pictures are the same, bilateral matching becomes a mirror-based bidirectional MV.

As shown in Figure 11, template matching is used to derive the motion information of the current CU by finding the closest match between a template in the current picture (the top and/or left neighboring blocks of the current CU) and a block in a reference picture (of the same size as the template). Apart from the aforementioned FRUC merge mode, template matching is also applied to AMVP mode. In the JEM, as in HEVC, AMVP has two candidates. A new candidate is derived using the template matching method. If the candidate newly derived by template matching differs from the first existing AMVP candidate, it is inserted at the very beginning of the AMVP candidate list, and the list size is then set to two (which means the second existing AMVP candidate is removed). When applied to AMVP mode, only the CU-level search is applied.

CU-level MV candidate set

The MV candidate set at the CU level consists of: i. the original AMVP candidates if the current CU is in AMVP mode; ii. all merge candidates; iii. several MVs from the interpolated MV field (described later); iv. the top and left neighboring motion vectors.

When bilateral matching is used, each valid MV of a merge candidate is used as an input to generate an MV pair under the assumption of bilateral matching. For example, one valid MV of a merge candidate is (MVa, refa) in reference list A. Then the reference picture refb of its paired bilateral MV is found in the other reference list B, such that refa and refb are temporally on different sides of the current picture. If such a refb is not available in reference list B, refb is determined as a reference different from refa whose temporal distance to the current picture is the minimum in list B. After refb is determined, MVb is derived by scaling MVa according to the temporal distances between the current picture and refa, refb.
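The MVb derivation above amounts to scaling MVa by the ratio of the temporal distances. A minimal sketch, assuming signed POC distances and floating-point scaling for clarity (HEVC-style fixed-point scaling is used in practice):

```python
def derive_paired_mv(mva, td_a, td_b):
    """Derive MVb for bilateral matching by scaling MVa according to the
    temporal distances td_a (current picture to refa) and td_b (current
    picture to refb). Opposite signs of td_a and td_b mean refa and refb
    lie on different temporal sides of the current picture."""
    scale = td_b / td_a
    return (mva[0] * scale, mva[1] * scale)
```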

Four MVs from the interpolated MV field are also added to the CU-level candidate list. More specifically, the interpolated MVs at the positions (0, 0), (W/2, 0), (0, H/2), and (W/2, H/2) of the current CU are added.

When FRUC is applied in AMVP mode, the original AMVP candidates are also added to the CU-level MV candidate set.

At the CU level, up to 15 MVs for AMVP CUs and up to 13 MVs for merge CUs are added to the candidate list.

Sub-CU-level MV candidate set

The MV candidate set at the sub-CU level consists of: i. the MV determined from the CU-level search; ii. the top, left, top-left, and top-right neighboring MVs; iii. scaled versions of collocated MVs from the reference pictures; iv. up to 4 ATMVP candidates; v. up to 4 STMVP candidates.

The scaled MVs from the reference pictures are derived as follows. All the reference pictures in both lists are traversed, and the MV at the collocated position of the sub-CU in each reference picture is scaled to the reference of the starting CU-level MV.

The ATMVP and STMVP candidates are limited to the first four.

At the sub-CU level, up to 17 MVs are added to the candidate list.

Generation of the interpolated MV field

Before coding a frame, an interpolated motion field is generated for the whole picture based on unilateral ME. The motion field can then be used later as CU-level or sub-CU-level MV candidates.

First, the motion field of each reference picture in both reference lists is traversed at the 4×4 block level. For each 4×4 block, if the motion associated with the block passes through a 4×4 block in the current picture (as shown in Figure 12) and the block has not been assigned any interpolated motion, the motion of the reference block is scaled to the current picture according to the temporal distances TD0 and TD1 (in the same way as the MV scaling of TMVP in HEVC), and the scaled motion is assigned to the block in the current frame. If no scaled MV is assigned to a 4×4 block, the block's motion is marked as unavailable in the interpolated motion field.
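The per-4×4-block interpolation pass above can be sketched as follows. The data layout, the integer scaling, and the assumption that the mapping from a reference block to the current-picture 4×4 block its motion passes through is precomputed are all illustrative simplifications.

```python
def interpolate_motion_field(ref_blocks, td0, td1, grid_w, grid_h):
    """Build an interpolated motion field for the current picture.

    ref_blocks: list of (bx, by, mvx, mvy) entries for 4x4 blocks of a
    reference picture, where (bx, by) is the 4x4 block of the CURRENT
    picture that the reference block's motion passes through (this
    geometric mapping is assumed to be precomputed). The first motion
    reaching a block wins; unreached blocks stay None (unavailable).
    """
    field = [[None] * grid_w for _ in range(grid_h)]
    for bx, by, mvx, mvy in ref_blocks:
        if 0 <= bx < grid_w and 0 <= by < grid_h and field[by][bx] is None:
            # Scale to the current picture by the ratio of temporal distances,
            # analogous to TMVP MV scaling in HEVC (truncating integer form).
            field[by][bx] = (mvx * td0 // td1, mvy * td0 // td1)
    return field
```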

Interpolation and matching cost

When a motion vector points to a fractional sample position, motion compensated interpolation is needed. To reduce complexity, bilinear interpolation instead of the regular 8-tap HEVC interpolation is used for both bilateral matching and template matching.

The calculation of the matching cost differs slightly at the different steps. When selecting a candidate from the candidate set at the CU level, the matching cost is the absolute difference sum (SAD) of bilateral matching or template matching. After the starting MV is determined, the matching cost C of bilateral matching at the sub-CU-level search is calculated as follows:

C = SAD + w·(|MVx − MVx_s| + |MVy − MVy_s|)

where w is a weighting factor that is empirically set to 4, and MV and MV_s indicate the current MV and the starting MV, respectively. SAD is still used as the matching cost of template matching at the sub-CU-level search.
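The sub-CU-level bilateral matching cost can be computed directly from the formula above; the function name is an illustrative assumption.

```python
def bilateral_matching_cost(sad, mv, mv_start, w=4):
    """C = SAD + w * (|MVx - MVx_s| + |MVy - MVy_s|), with the weighting
    factor w empirically set to 4 as described above."""
    return sad + w * (abs(mv[0] - mv_start[0]) + abs(mv[1] - mv_start[1]))
```

The MV-distance term penalizes candidates that drift far from the starting MV, so two candidates with equal SAD are ranked by their distance to the starting point.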

In FRUC mode, the MV is derived by using luma samples only. The derived motion will be used for both luma and chroma in MC inter prediction. After the MV is decided, final MC is performed using the 8-tap interpolation filter for luma and the 4-tap interpolation filter for chroma.

MV refinement

MV refinement is a pattern-based MV search with the criterion of the bilateral matching cost or the template matching cost. In the JEM, two search patterns are supported: an unrestricted center-biased diamond search (UCBDS) and an adaptive cross search, for MV refinement at the CU level and the sub-CU level, respectively. For both CU-level and sub-CU-level MV refinement, the MV is directly searched at quarter-luma-sample MV accuracy, followed by one-eighth-luma-sample MV refinement. The search range of the MV refinement for the CU step and the sub-CU step is set equal to 8 luma samples.

Selection of the prediction direction in template matching FRUC merge mode

In bilateral matching merge mode, bi-prediction is always applied, since the motion information of a CU is derived based on the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures. There is no such limitation for template matching merge mode. In template matching merge mode, the encoder can choose among uni-prediction from list0, uni-prediction from list1, or bi-prediction for a CU. The selection is based on the template matching cost, as follows:

if costBi <= factor × min(cost0, cost1), bi-prediction is used;
otherwise, if cost0 <= cost1, uni-prediction from list0 is used;
otherwise, uni-prediction from list1 is used;

where cost0 is the SAD of the list0 template matching, cost1 is the SAD of the list1 template matching, and costBi is the SAD of the bi-prediction template matching. The value of factor is equal to 1.25, which means that the selection process is biased toward bi-prediction.
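The selection rule above maps directly to the following sketch; the function name and the return labels are illustrative assumptions.

```python
def select_prediction_direction(cost0, cost1, cost_bi, factor=1.25):
    """Choose among bi-prediction, uni-prediction from list0, and
    uni-prediction from list1 based on the template matching costs;
    the factor of 1.25 biases the selection toward bi-prediction."""
    if cost_bi <= factor * min(cost0, cost1):
        return 'bi'
    return 'list0' if cost0 <= cost1 else 'list1'
```

For instance, with cost0 = 100 and cost1 = 120, bi-prediction is selected as long as costBi does not exceed 125.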

The inter prediction direction selection is applied only to the CU-level template matching process.

Example of interleaved prediction

With interleaved prediction, a block is divided into sub-blocks with more than one dividing pattern. A dividing pattern is defined as the way to divide a block into sub-blocks, including the size and the position of the sub-blocks. For each dividing pattern, a corresponding prediction block may be generated by deriving the motion information of each sub-block based on that dividing pattern. Therefore, even for one prediction direction, multiple prediction blocks may be generated by multiple dividing patterns. Alternatively, for each prediction direction, only one dividing pattern may be applied.

Suppose there are X dividing patterns, and X prediction blocks of the current block, denoted as P0, P1, ..., P(X−1), are generated by sub-block based prediction with the X dividing patterns. The final prediction of the current block, denoted as P, can be generated as

P(x, y) = (Σ over i = 0..X−1 of wi(x, y)·Pi(x, y)) >> N

where (x, y) is the coordinate of a pixel in the block and wi(x, y) is the weight value of Pi. Without loss of generality, it is assumed that Σ over i = 0..X−1 of wi(x, y) = 1 << N, where N is a non-negative integer. Figure 13 shows an example of interleaved prediction with two dividing patterns.

3. Exemplary problems solved by the described embodiments

There are two potential drawbacks in the affine merge MV derivation process, as illustrated in Figure 5.

First, the top-left corner position of the CU and the size of the CU must be stored by each 4×4 block belonging to that CU. This information does not need to be stored in HEVC.

Second, the decoder must access the MVs of 4×4 blocks that are not adjacent to the current CU. In HEVC, the decoder only needs to access the MVs of the 4×4 blocks adjacent to the current CU.

4. Examples of embodiments

We propose several methods to further improve sub-block based prediction, including interleaved prediction and the affine merge MV derivation process.

The detailed inventions below should be considered as examples to explain general concepts. These inventions should not be interpreted in a narrow way. Furthermore, these inventions can be combined in any manner. Combinations between the present invention and other inventions are also applicable.

Use of interleaved prediction

1. In one embodiment, whether and how to apply interleaved prediction may depend on the color component.

a. For example, interleaved prediction is applied only to the luma component, not to the chroma components; b. For example, the dividing patterns are different for different color components; c. For example, the weight values are different for different color components.

2. In one embodiment, whether and how to apply interleaved prediction may depend on the inter prediction direction and/or on whether the reference pictures are identical.

a. For example, interleaved prediction may be applied only to uni-prediction, not to bi-prediction.

b. For example, interleaved prediction may be applied only to bi-prediction in which the two reference pictures of the two reference picture lists are identical.

c. In one example, interleaved prediction is disabled for the low-delay P (LDP) case.

d. In one example, interleaved prediction is also enabled when the current block is predicted from the current picture.

Partial interleaved prediction

1. In one embodiment, interleaved prediction may be applied to only a part of the whole block.

a. The second dividing pattern may cover only a part of the whole block. Samples outside that part are not influenced by interleaved prediction.

b. That part may exclude samples located at the block boundary, e.g., the first/last n rows or the first/last m columns.

c. That part may exclude samples located in sub-blocks whose size differs from the size of the majority of the sub-blocks of the second dividing pattern within the block.

d. Figures 14A and 14B show some examples of partially interleaved affine prediction. The first dividing pattern is the same as that in the JEM, i.e., the top-left corner of the (i, j)-th sub-block is at (i×w, j×h), and the MV of this sub-block is calculated from Equation (1) with (x, y) = (i×w + w/2, j×h + h/2). The sub-block size is w×h for both dividing patterns, e.g., w = h = 4 or w = h = 8.

i. In Figure 14A, the top-left corner of the (i, j)-th sub-block of the second dividing pattern is at (i×w + w/2, j×h), and the MV of this sub-block is calculated from Equation (1) with (x, y) = (i×w + w, j×h + h/2).

ii. In Figure 14B, the top-left corner of the (i, j)-th sub-block of the second dividing pattern is at (i×w, j×h + h/2), and the MV of this sub-block is calculated from Equation (1) with (x, y) = (i×w + w/2, j×h + h).

iii. In Figure 14C, the top-left corner of the (i, j)-th sub-block of the second dividing pattern is at (i×w + w/2, j×h + h/2), and the MV of this sub-block is calculated from Equation (1) with (x, y) = (i×w + w, j×h + h).

Figures 14A-14C show examples of partial interleaved prediction. Dashed lines represent the first dividing pattern; solid lines represent the second dividing pattern; thick lines mark the region to which interleaved prediction is applied. Outside this region, interleaved prediction is not applied.

Weight values in interleaved prediction

2. In one embodiment, there are two possible weight values, Wa and Wb, satisfying Wa + Wb = 2^N. Exemplary weight value pairs {Wa, Wb} are {3, 1}, {7, 1}, {5, 3}, {13, 3}, etc.

a. If the weight value w1 associated with the prediction sample P1 generated by the first dividing pattern and the weight value w2 associated with the prediction sample P2 generated by the second dividing pattern are the same (both equal to Wa or both equal to Wb), the final prediction P for this sample is calculated as P = (P1 + P2) >> 1 or P = (P1 + P2 + 1) >> 1.

b. If the weight value w1 associated with the prediction sample P1 generated by the first dividing pattern differs from the weight value w2 associated with the prediction sample P2 generated by the second dividing pattern ({w1, w2} = {Wa, Wb} or {w1, w2} = {Wb, Wa}), the final prediction P for this sample is calculated as P = (w1×P1 + w2×P2 + offset) >> N, where the offset can be 1 << (N−1) or 0.

c. This can be extended in a similar way to the case where there are more than two dividing patterns.

3. In one embodiment, if sample A is closer than sample B to the position from which the MV of the sub-block is derived, the weight value of sample A in the sub-block is larger than the weight value of sample B. Exemplary weight values for 4×4, 4×2, 2×4, and 2×2 sub-blocks are shown in Figure 15.

Figure 15 shows examples of weight values within a sub-block. Exemplary weight value pairs {Wa, Wb} are {3, 1}, {7, 1}, {5, 3}, {13, 3}, etc.
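The per-sample weighting rules of items a and b above can be sketched as follows; the function name is an illustrative assumption, and the rounding variants shown are the ones listed above.

```python
def combine_sample(p1, w1, p2, w2, n):
    """Final prediction for one sample from two dividing patterns.
    w1 + w2 must equal 2^n. Equal weights reduce to a simple rounded
    average; unequal weights use the weighted sum with offset 1 << (n-1)."""
    assert w1 + w2 == 1 << n
    if w1 == w2:
        return (p1 + p2 + 1) >> 1
    return (w1 * p1 + w2 * p2 + (1 << (n - 1))) >> n
```

For example, with the weight pair {3, 1} (N = 2), samples P1 = 100 and P2 = 60 combine to (300 + 60 + 2) >> 2 = 90.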

Example of interleaved prediction

Figure 16 shows an example of interleaved prediction with two dividing patterns in accordance with the disclosed technology. A current block 1300 can be divided with multiple patterns. For example, as shown in Figure 16, the current block is divided with both pattern 0 (1301) and pattern 1 (1302). Two prediction blocks, P0 (1303) and P1 (1304), are generated. A final prediction block P (1305) of the current block 1300 can be generated by computing a weighted sum of P0 (1303) and P1 (1304).

In general, given X division patterns, X prediction blocks of the current block, denoted P0, P1, ..., P_{X-1}, can be generated by sub-block-based prediction with the X division patterns. The final prediction of the current block, denoted P, can be generated as:

P(x,y) = ( Σ_{i=0}^{X-1} w_i(x,y) × P_i(x,y) ) >> N    (8)

Here, (x, y) is the coordinate of a pixel in the block, and w_i(x, y) is the weighting coefficient of P_i. By way of example and not limitation, the weights can be expressed as:

Σ_{i=0}^{X-1} w_i(x,y) = 1 << N

N is a non-negative value. Alternatively, the bit-shifting operation in equation (8) can also be expressed as:

P(x,y) = ( Σ_{i=0}^{X-1} w_i(x,y) × P_i(x,y) + 1 << (N-1) ) >> N

Because the sum of the weights is a power of 2, the weighted sum P can be calculated more efficiently by performing a bit-shifting operation instead of a floating-point division.
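As a sketch of how equation (8) avoids floating-point division, consider the following; this is our illustration with made-up sample and weight values (a real encoder would obtain each P_i from sub-block motion compensation):

```python
# Combine X per-pattern prediction blocks with per-pixel integer weights that
# sum to 1 << n at every position; the division becomes a right shift.
def weighted_sum(pred_blocks, weights, n):
    """pred_blocks and weights are parallel lists of equal-sized 2-D lists."""
    h, w = len(pred_blocks[0]), len(pred_blocks[0][0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = sum(p[y][x] * wt[y][x]
                      for p, wt in zip(pred_blocks, weights))
            out[y][x] = (acc + (1 << (n - 1))) >> n  # rounded shift, no division
    return out
```

With weights 3 and 1 at every pixel (N = 2), samples 100 and 60 combine to (300 + 60 + 2) >> 2 = 90.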

Division patterns can have different sub-block shapes, sizes, or positions. In some embodiments, a division pattern can include irregular sub-block sizes. Figs. 17A-17G show examples of several division patterns for a 16×16 block. In Fig. 17A, a block is divided into 4×4 sub-blocks in accordance with the disclosed technology; this pattern is also used in JEM. Fig. 17B shows an example of dividing a block into 8×8 sub-blocks in accordance with the disclosed technology. Fig. 17C shows an example of dividing a block into 8×4 sub-blocks in accordance with the disclosed technology. Fig. 17D shows an example of dividing a block into 4×8 sub-blocks in accordance with the disclosed technology. In Fig. 17E, a portion of the block is divided into 4×4 sub-blocks in accordance with the disclosed technology, while the pixels at the block boundary are divided into smaller sub-blocks with sizes such as 2×4, 4×2, or 2×2. Some sub-blocks can be merged to form larger sub-blocks: Fig. 17F shows an example of adjacent sub-blocks, such as 4×4 and 2×4 sub-blocks, that are merged to form larger sub-blocks with sizes 6×4, 4×6, or 6×6. In Fig. 17G, a portion of the block is divided into 8×8 sub-blocks, while the pixels at the block boundary are divided into smaller sub-blocks such as 8×4, 4×8, or 4×4.
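A pattern like the one described for Fig. 17E can be enumerated as below. This is a hypothetical sketch (the 4×4 interior size and the offset of 2 are our example parameters, not mandated by the text): the interior is tiled with 4×4 sub-blocks shifted by the offset, and the block boundary is covered by smaller 2×4, 4×2, and 2×2 sub-blocks.

```python
# Enumerate (x, y, width, height) sub-block rectangles for an offset pattern:
# interior sub x sub sub-blocks shifted by (off, off), smaller ones on the rim.
def offset_pattern(width, height, sub=4, off=2):
    xs = [0] + list(range(off, width, sub))   # column start positions
    ys = [0] + list(range(off, height, sub))  # row start positions
    rects = []
    for j, y in enumerate(ys):
        h = (ys[j + 1] if j + 1 < len(ys) else height) - y
        for i, x in enumerate(xs):
            w = (xs[i + 1] if i + 1 < len(xs) else width) - x
            rects.append((x, y, w, h))
    return rects
```

For a 16×16 block this yields 2×2 corner sub-blocks, 2×4 and 4×2 edge sub-blocks, and 4×4 interior sub-blocks, and the rectangles tile the block exactly.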

In sub-block-based prediction, the shape and size of the sub-blocks can be determined based on the shape and/or size of the coding block and/or on coding block information. For example, in some embodiments, when the current block has size M×N, the sub-block size is 4×N (or 8×N, etc.), i.e., the sub-blocks have the same height as the current block. In some embodiments, when the current block has size M×N, the sub-block size is M×4 (or M×8, etc.), i.e., the sub-blocks have the same width as the current block. In some embodiments, when the current block has size M×N with M>N, the sub-block size is A×B with A>B (e.g., 8×4); alternatively, the sub-block size is B×A (e.g., 4×8).

In some embodiments, the current block has size M×N. When M×N<=T (or min(M,N)<=T, or max(M,N)<=T, etc.), the sub-block size is A×B; when M×N>T (or min(M,N)>T, or max(M,N)>T, etc.), the sub-block size is C×D, where A<=C and B<=D. For example, if M×N<=256, the sub-block size can be 4×4; in some implementations, the sub-block size is 8×8.
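The threshold rule can be sketched as follows; this is our illustration, using the example values given above (T = 256, A×B = 4×4, C×D = 8×8):

```python
# Pick the sub-block size from the current block's dimensions: blocks with
# M * N <= T use the smaller A x B = 4 x 4, larger blocks use C x D = 8 x 8.
T = 256

def subblock_size(m, n, t=T):
    return (4, 4) if m * n <= t else (8, 8)
```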

In some embodiments, whether to apply interleaved prediction can be determined based on the inter-prediction direction. For example, in some embodiments, interleaved prediction can be applied to bi-prediction but not to uni-prediction. As another example, when multiple-hypothesis prediction is applied and there is more than one reference block, interleaved prediction can be applied to one prediction direction.

In some embodiments, how to apply interleaved prediction can also be determined based on the inter-prediction direction. In some embodiments, a bi-predicted block using sub-block-based prediction is divided into sub-blocks with two different division patterns for the two different reference lists. For example, when predicted from reference list 0 (L0), the bi-predicted block is divided into 4×8 sub-blocks, as shown in Fig. 17D. When predicted from reference list 1 (L1), the same block is divided into 8×4 sub-blocks, as shown in Fig. 17C. The final prediction P is calculated as:

P(x,y) = ( w_0(x,y) × P_0(x,y) + w_1(x,y) × P_1(x,y) ) >> N    (16)

Here, P0 and P1 are the predictions from L0 and L1, respectively, and w0 and w1 are the weight values for L0 and L1, respectively. As shown in equation (16), the weight values can be determined as w_0(x,y)+w_1(x,y)=1<<N (where N is a non-negative integer value). Because fewer sub-blocks are used for the prediction in each direction (e.g., 4×8 sub-blocks as opposed to 8×8 sub-blocks), the computation requires less bandwidth than existing sub-block-based methods. By using larger sub-blocks, the prediction results are also less susceptible to noise.

In some embodiments, a uni-predicted block using sub-block-based prediction is divided into sub-blocks with two or more different division patterns for the same reference list. For example, the prediction P^L for list L (L=0 or 1) is calculated as:

P^L(x,y) = ( Σ_{i=0}^{X_L-1} w_i^L(x,y) × P_i^L(x,y) ) >> N_L

Here, X_L is the number of division patterns used for list L, P_i^L is the prediction generated with the i-th division pattern, and w_i^L(x,y) is the weight value of P_i^L. For example, when X_L is 2, two division patterns are applied for list L: in the first division pattern, the block is divided into 4×8 sub-blocks, as shown in Fig. 17D; in the second division pattern, the block is divided into 8×4 sub-blocks, as shown in Fig. 17C.

In some embodiments, a bi-predicted block using sub-block-based prediction is regarded as the combination of two uni-predicted blocks, one from L0 and one from L1. The prediction from each list can be derived as described in the examples above. The final prediction P can be calculated as:

P(x,y) = ( a × P^0(x,y) + b × P^1(x,y) ) / ( a + b )

Here, the parameters a and b are two additional weights applied to the two intermediate prediction blocks. In this specific example, both a and b can be set to 1. Similar to the example above, because fewer sub-blocks are used for the prediction in each direction (e.g., 4×8 sub-blocks as opposed to 8×8 sub-blocks), the bandwidth usage is better than, or on par with, existing sub-block-based methods. At the same time, the prediction results can be improved by using larger sub-blocks.
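This combination can be sketched as below; it is our illustration, using integer division by a + b, which reduces to a simple average when a = b = 1 as in the text:

```python
# Combine the two per-list interleaved predictions p0 (from L0) and p1
# (from L1) with additional weights a and b.
def combine_lists(p0, p1, a=1, b=1):
    h, w = len(p0), len(p0[0])
    return [[(a * p0[y][x] + b * p1[y][x]) // (a + b) for x in range(w)]
            for y in range(h)]
```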

In some embodiments, a single non-uniform pattern can be used in each uni-predicted block. For example, for each list L (e.g., L0 or L1), the block is divided according to a different pattern (e.g., as shown in Fig. 17E or Fig. 17F). Using a smaller number of sub-blocks reduces the bandwidth demand. The non-uniformity of the sub-blocks also increases the robustness of the prediction results.

In some embodiments, for a block coded with multiple hypotheses, there can be more than one prediction block generated with different division patterns for each prediction direction (or reference picture list). The multiple prediction blocks can be used to generate the final prediction with additional weights applied. For example, the additional weight can be set to 1/M, where M is the total number of generated prediction blocks.

In some embodiments, the encoder can determine whether and how to apply interleaved prediction. The encoder can then send information corresponding to the determination to the decoder at the sequence level, picture level, view level, slice level, coding tree unit (CTU) (also known as largest coding unit (LCU)) level, CU level, PU level, tree unit (TU) level, or region level (which may contain multiple CUs/PUs/TUs/LCUs). The information can be signaled in the sequence parameter set (SPS), view parameter set (VPS), picture parameter set (PPS), slice header (SH), CTU/LCU, CU, PU, TU, or the first block of a region.

In some implementations, interleaved prediction is applied to existing sub-block methods such as affine prediction, ATMVP, STMVP, FRUC, or BIO. In such cases, no additional signaling cost is needed. In some implementations, new sub-block merge candidates generated by interleaved prediction can be inserted into the merge list, e.g., interleaved prediction + ATMVP, interleaved prediction + STMVP, interleaved prediction + FRUC, and so on.

In some embodiments, the division patterns to be used by the current block can be derived based on information from spatially and/or temporally neighboring blocks. For example, instead of relying on the encoder to signal the relevant information, both the encoder and the decoder can adopt a set of predetermined rules to obtain the division patterns based on temporal adjacency (e.g., the division patterns previously used for the same block) or spatial adjacency (e.g., the division patterns used by neighboring blocks).

In some embodiments, the weight values w can be fixed. For example, all division patterns can be weighted equally: w_i(x,y)=1. In some embodiments, the weight values can be determined based on the positions of the blocks as well as the division patterns used. For example, w_i(x,y) can differ for different (x, y). In some embodiments, the weight values can further depend on the sub-block prediction coding technique (e.g., affine or ATMVP) and/or other coded information (e.g., skip or non-skip mode, and/or MV information).

In some embodiments, the encoder can determine the weight values and send the values to the decoder at the sequence level, picture level, slice level, CTU/LCU level, CU level, PU level, or region level (which may contain multiple CUs/PUs/TUs/LCUs). The weight values can be signaled in the sequence parameter set (SPS), picture parameter set (PPS), slice header (SH), CTU/LCU, CU, PU, or the first block of a region. In some embodiments, the weight values can be derived from the weight values of spatially and/or temporally neighboring blocks.

Note that the interleaved prediction techniques disclosed herein can be applied to one, some, or all of the sub-block-based prediction coding techniques. For example, the interleaved prediction techniques can be applied to affine prediction while other sub-block-based coding techniques (e.g., ATMVP, STMVP, FRUC, or BIO) do not use interleaved prediction. As another example, affine, ATMVP, and STMVP can all apply the interleaved prediction techniques disclosed herein.

Fig. 18 is a block diagram of an exemplary video processing apparatus 1800. The apparatus 1800 can be used to implement one or more of the methods described herein. The apparatus 1800 can be embodied as a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on. The apparatus 1800 can include one or more processors 1802, one or more memories 1804, and video processing hardware 1806. The processor(s) 1802 can be configured to implement one or more of the methods described in this document. The memory (or memories) 1804 can be used to store data and code for implementing the methods and techniques described herein. The video processing circuitry 1806 can be used to implement some of the techniques described in this document in hardware circuitry.

Fig. 19 shows a flowchart of an exemplary method 1900 of video processing. The method 1900 includes, at step 1902, determining, based on a component type of a current video block, whether an interleaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block. The method 1900 further includes, at step 1904, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction includes dividing a portion of the current video block into at least one sub-block using more than one division pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one division patterns.

Fig. 20 shows a flowchart of an exemplary method 2000 of video processing. The method 2000 includes, at step 2002, determining, based on a prediction direction of a current video block, whether an interleaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block. The method 2000 further includes, at step 2004, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction includes dividing a portion of the current video block into at least one sub-block using more than one division pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one division patterns.

Fig. 21 shows a flowchart of an exemplary method 2100 of video processing. The method 2100 includes, at step 2102, determining, based on a low-delay mode of a current picture, whether an interleaved prediction mode is applicable to a conversion between a current video block in the current picture and a bitstream representation of the current video block. The method 2100 further includes, at step 2104, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction includes dividing a portion of the current video block into at least one sub-block using more than one division pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one division patterns.

Fig. 22 shows a flowchart of an exemplary method 2200 of video processing. The method 2200 includes, at step 2202, determining, based on use of a current picture containing a current video block as a reference, whether an interleaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block. The method 2200 further includes, at step 2204, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction includes dividing a portion of the current video block into at least one sub-block using more than one division pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one division patterns.

Another exemplary method of video processing is provided. The method includes selectively performing (2300), based on a video condition, interleaved-prediction-based coding of one or more of a luma component, a first chroma component, and a second chroma component of a video from a video frame. Performing interleaved prediction includes determining a prediction block for a current block of a component of the video by: selecting (2302) a set of pixels of the component of the video frame to form a block; dividing (2304) the block into a first set of sub-blocks according to a first pattern; generating (2306) a first intermediate prediction block based on the first set of sub-blocks; dividing (2308) the block into a second set of sub-blocks according to a second pattern, wherein at least one sub-block in the second set is not in the first set; generating (2310) a second intermediate prediction block based on the second set of sub-blocks; and determining (2312) the prediction block based on the first intermediate prediction block and the second intermediate prediction block.
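Steps 2302-2312 can be sketched end to end as follows. This is our illustration only: the per-sub-block predictor is a stand-in that fills each sub-block with the mean of the co-located reference samples (a real codec would perform per-sub-block motion compensation), and equal weights are assumed so the final step is a rounded average:

```python
def intermediate_prediction(ref, rects):
    """Predict a block from one division pattern (a list of sub-block rects)."""
    h, w = len(ref), len(ref[0])
    out = [[0] * w for _ in range(h)]
    for (x, y, bw, bh) in rects:
        vals = [ref[j][i] for j in range(y, y + bh) for i in range(x, x + bw)]
        mean = sum(vals) // len(vals)   # stand-in for motion compensation
        for j in range(y, y + bh):
            for i in range(x, x + bw):
                out[j][i] = mean
    return out

def interleaved_prediction(ref, pattern0, pattern1):
    p0 = intermediate_prediction(ref, pattern0)  # first division pattern
    p1 = intermediate_prediction(ref, pattern1)  # second division pattern
    h, w = len(ref), len(ref[0])
    # Equal weights: the final prediction is the rounded average (rule (a)).
    return [[(p0[y][x] + p1[y][x] + 1) >> 1 for x in range(w)]
            for y in range(h)]
```

For example, dividing a 4×4 block once into four 2×2 sub-blocks and once into a single 4×4 sub-block yields two intermediate prediction blocks whose average is the final prediction.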

The following describes additional features and embodiments of the above methods/techniques using a clause-based format.

1. A method of video processing, comprising: determining, based on a component type of a current video block, whether an interleaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block; and, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction comprises dividing a portion of the current video block into at least one sub-block using more than one division pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one division patterns.

2. The method of clause 1, wherein interleaved prediction is applied in response to the component type being equal to a luma component.

3. The method of clause 1, wherein the more than one division patterns used for a first color component of the video are different from the more than one division patterns used for a second color component of the video.

4. The method of clause 1, wherein the weighted average uses weights whose values depend on the component type.

5. A method of video processing, comprising: determining, based on a prediction direction of a current video block, whether an interleaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block; and, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction comprises dividing a portion of the current video block into at least one sub-block using more than one division pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one division patterns.

6. The method of clause 5, wherein interleaved prediction is applied in response to the prediction direction being equal to uni-prediction.

7. The method of clause 5, wherein interleaved prediction is applied in response to the prediction direction being equal to bi-prediction.

8. A method of video processing, comprising: determining, based on a low-delay mode of a current picture, whether an interleaved prediction mode is applicable to a conversion between a current video block in the current picture and a bitstream representation of the current video block; and, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction comprises dividing a portion of the current video block into at least one sub-block using more than one division pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one division patterns.

9. The method of clause 8, wherein interleaved prediction is disabled for the low-delay mode of the current picture.

10. A method of video processing, comprising: determining, based on use of a current picture containing a current video block as a reference, whether an interleaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block; and, in response to determining that the interleaved prediction mode is applicable to the current video block, performing the conversion by applying the interleaved prediction mode, wherein applying interleaved prediction comprises dividing a portion of the current video block into at least one sub-block using more than one division pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one division patterns.

11. The method of clause 10, wherein interleaved prediction is enabled when the current video block is predicted from the current picture.

12. The method of clause 1, 5, 8, or 10, wherein the portion of the current video block comprises less than all of the current video block.

13. The method of any one of clauses 1-12, wherein the interleaved prediction mode is applied to a portion of the current video block.

14. The method of clause 13, wherein at least one of the division patterns covers only the portion of the current video block.

15. The method of clause 14, wherein the portion excludes samples located at a boundary of the current video block.

16. The method of clause 14, wherein the portion excludes samples located in sub-blocks having a size different from the size of the majority of sub-blocks within the current video block.

17. The method of clause 1, 5, 8, 10, or 12, wherein dividing the portion of the current video block further comprises: dividing the current video block into a first set of sub-blocks according to a first pattern of the division patterns; generating a first intermediate prediction block based on the first set of sub-blocks; dividing the current video block into a second set of sub-blocks according to a second pattern of the division patterns, wherein at least one sub-block in the second set is not in the first set; and generating a second intermediate prediction block based on the second set of sub-blocks.

18. The method of clause 17, wherein a weight value w1 associated with a prediction sample generated by the first pattern and a weight value w2 associated with a prediction sample generated by the second pattern are the same, and a final prediction P is calculated as P=(P1+P2)>>1 or P=(P1+P2+1)>>1.

19. The method of clause 17, wherein a weight value w1 associated with a prediction sample generated by the first pattern and a weight value w2 associated with a prediction sample generated by the second pattern are different, and a final prediction P is calculated as P=(w1×P1+w2×P2+offset)>>N, where the offset is 1<<(N-1) or 0.

20. The method of clause 17, wherein a first weight Wa of the first intermediate prediction block and a second weight Wb of the second intermediate prediction block satisfy the condition Wa+Wb=2^N, where N is an integer.

21. The method of clause 17, wherein a weight value of a first sample in a sub-block is greater than a weight value of a second sample in the sub-block if the first sample is closer than the second sample to a position from which a motion vector of the sub-block is derived.

22. A method of video processing, comprising: selectively performing, based on a video condition, interleaved-prediction-based coding of one or more of a luma component, a first chroma component, and a second chroma component of a video from a video frame, wherein performing interleaved prediction comprises determining a prediction block for a current block of a component of the video by: selecting a set of pixels of the component of the video frame to form a block; dividing the block into a first set of sub-blocks according to a first pattern; generating a first intermediate prediction block based on the first set of sub-blocks; dividing the block into a second set of sub-blocks according to a second pattern, wherein at least one sub-block in the second set is not in the first set; generating a second intermediate prediction block based on the second set of sub-blocks; and determining the prediction block based on the first intermediate prediction block and the second intermediate prediction block.

23. The method of clause 22, wherein interleaved prediction is used to form the prediction block only for the luma component.

24. The method of clause 22, wherein different first patterns or second patterns are used to divide different components of the video.

25. The method of any one of clauses 22 to 24, wherein the video condition comprises a direction of prediction, and wherein interleaved prediction is performed for only one of uni-prediction and bi-prediction, and not for the other of uni-prediction and bi-prediction.

26、根據條款22所述的方法,其中視頻條件包括使用低延遲P編碼模式,並且其中在使用低延遲P模式的情況下,方法包含抑制進行交織預測。 26. The method of clause 22, wherein the video conditions include the use of a low-delay P coding mode, and wherein in the case of using the low-delay P-mode, the method includes suppressing interleaving prediction.

27、根據條款22所述的方法,其中視頻條件包括使用包含當前塊的當前圖片作為預測的參考。 27. The method of clause 22, wherein the video conditions include using a current picture containing the current block as a reference for prediction.

28、根據條款1-27中任一項所述的方法,其中基於交織的預測編碼包括使用僅來自部分當前塊的子塊的第一集合和子塊的第二集合。 28. The method according to any one of clauses 1-27, wherein interleaving-based predictive coding includes using only a first set of sub-blocks from part of the current block and a second set of sub-blocks.

29、根據條款28所述的方法,其中當前塊的較小部分排除在當前塊的邊界區域中的樣本。 29. The method of clause 28, wherein a smaller part of the current block excludes samples in the boundary area of the current block.

30、根據條款28所述的方法,其中使用部分當前塊的基於交織的預測編碼包含使用部分當前塊進行仿射預測。 30. The method of clause 28, wherein using a part of the current block for interleaving-based predictive coding includes using a part of the current block for affine prediction.

31、根據條款22所述的方法,其中確定預測塊包含使用第一中間預測塊和第二中間預測塊的加權平均來確定預測塊。 31. The method of clause 22, wherein determining the prediction block comprises using a weighted average of the first intermediate prediction block and the second intermediate prediction block to determine the prediction block.

32、附加條款31所述的方法,其中第一中間預測塊的第一權重Wa和第二中間預測塊的第二權重Wb滿足條件Wa+Wb=2 N ,其中N是整數。 32. The method of additional clause 31, wherein the first weight Wa of the first intermediate prediction block and the second weight Wb of the second intermediate prediction block satisfy the condition Wa+Wb=2 N , where N is an integer.

33、根據條款32所述的方法,其中Wa=3且Wb=1。 33. The method according to clause 32, wherein Wa=3 and Wb=1.

34、根據條款22所述的方法,其中當第一樣本比第二樣本更接近於推導子塊的運動向量的位置時,子塊中的第一樣本的權重值大於子塊中的第二樣本的權重值。 34. The method according to clause 22, wherein when the first sample is closer to the position of deriving the motion vector of the sub-block than the second sample, the weight value of the first sample in the sub-block is greater than that of the first sample in the sub-block. The weight value of the second sample.
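The position-dependent weighting of clause 34 can be sketched as a per-sample weight mask. The choices below — the motion vector being derived at the sub-block centre (as in affine sub-block modes), Chebyshev distance, and a linear falloff between a hypothetical `wmax` and `wmin` — are illustrative assumptions; the clause only requires that samples nearer the derivation position receive the larger weight.

```python
import numpy as np

def subblock_weight_mask(h, w, wmax=3, wmin=1):
    """Per-sample weights inside an h x w sub-block: samples nearer the
    assumed MV-derivation position (the sub-block centre) get larger weights,
    falling off linearly from wmax at the centre to wmin at the farthest
    sample, rounded to integers."""
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.maximum(np.abs(ys - cy), np.abs(xs - cx))  # Chebyshev distance
    if dist.max() == 0:  # 1x1 sub-block: only the centre sample exists
        return np.full((h, w), wmax, dtype=int)
    return np.rint(wmax - (wmax - wmin) * dist / dist.max()).astype(int)
```

For an even-sized sub-block no sample sits exactly at the centre, so the largest weight actually produced can be below `wmax`; what matters for the clause is only the ordering of the weights by distance.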

35. An apparatus comprising a processor and a non-transitory memory having instructions stored thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of one or more of clauses 1 to 34.

36. A computer program product, stored on a non-transitory computer-readable medium, the computer program product comprising program code for carrying out the method of one or more of clauses 1 to 34.

From the foregoing, it will be appreciated that specific embodiments of the presently disclosed technology have been described herein for purposes of illustration, but that various modifications may be made without departing from the scope of the invention. Accordingly, the presently disclosed technology is not limited except as by the appended claims.

The disclosed and other embodiments, modules, and functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, a data processing apparatus. The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing apparatus" encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special-purpose logic circuitry, e.g., an FPGA (field-programmable gate array) or an ASIC (application-specific integrated circuit).

Processors suitable for the execution of a computer program include, by way of example, both general- and special-purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random-access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to, one or more mass storage devices for storing data, e.g., magnetic disks or optical discs, to receive data from, or transfer data to, the one or more mass storage devices, or both. However, a computer need not have such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special-purpose logic circuitry.

While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.

Only a few implementations and examples are described, and other implementations, enhancements, and variations can be made based on what is described and illustrated in this patent document.

The foregoing descriptions are merely preferred embodiments of the present invention; all equivalent changes and modifications made within the scope of the claims of the present invention shall fall within the scope of the present invention.

2300: method

2302 to 2312: steps

Claims (36)

1. A method of video processing, comprising: determining, based on a component type of a current video block, whether an interweaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block; and in response to determining that the interweaved prediction mode is applicable to the current video block, performing the conversion by applying the interweaved prediction mode; wherein applying the interweaved prediction includes subdividing a portion of the current video block into at least one sub-block using more than one subdivision pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one subdivision patterns.

2. The method of claim 1, wherein the interweaved prediction is applied in response to the component type being equal to a luma component.

3. The method of claim 1, wherein the more than one subdivision pattern used for a first color component of the video is different from the further more than one subdivision pattern used for a second color component of the video.

4. The method of claim 1, wherein the weighted average uses weights whose values depend on the component type.
5. A method of video processing, comprising: determining, based on a prediction direction of a current video block, whether an interweaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block; and in response to determining that the interweaved prediction mode is applicable to the current video block, performing the conversion by applying the interweaved prediction mode, wherein applying the interweaved prediction includes subdividing a portion of the current video block into at least one sub-block using more than one subdivision pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one subdivision patterns.

6. The method of claim 5, wherein the interweaved prediction is applied in response to the prediction direction being equal to uni-directional prediction.

7. The method of claim 5, wherein the interweaved prediction is applied in response to the prediction direction being equal to bi-directional prediction.
8. A method of video processing, comprising: determining, based on a low-delay mode of a current picture, whether an interweaved prediction mode is applicable to a conversion between a current video block in the current picture and a bitstream representation of the current video block; and in response to determining that the interweaved prediction mode is applicable to the current video block, performing the conversion by applying the interweaved prediction mode, wherein applying the interweaved prediction includes subdividing a portion of the current video block into at least one sub-block using more than one subdivision pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one subdivision patterns.

9. The method of claim 8, wherein the interweaved prediction is disabled for the low-delay mode of the current picture.
10. A method of video processing, comprising: determining, based on use of a current picture containing a current video block as a reference, whether an interweaved prediction mode is applicable to a conversion between the current video block and a bitstream representation of the current video block; and in response to determining that the interweaved prediction mode is applicable to the current video block, performing the conversion by applying the interweaved prediction mode, wherein applying the interweaved prediction includes subdividing a portion of the current video block into at least one sub-block using more than one subdivision pattern, and generating a predictor of the current video block as a weighted average of predictors determined for each of the more than one subdivision patterns.

11. The method of claim 10, wherein the interweaved prediction is enabled when the current video block is predicted from the current picture.

12. The method of claim 1, 5, 8, or 10, wherein the portion of the current video block comprises less than all of the current video block.

13. The method of any one of claims 1, 5, 8, or 10, wherein the interweaved prediction mode is applied to the portion of the current video block.

14. The method of claim 13, wherein at least one of the subdivision patterns covers only the portion of the current video block.

15. The method of claim 14, wherein the portion excludes samples located at a boundary of the current video block.
16. The method of claim 14, wherein the portion excludes samples located at sub-blocks having a size different from the size of the majority of sub-blocks within the current video block.

17. The method of any one of claims 1, 5, 8, or 10, wherein subdividing the portion of the current video block further comprises: partitioning the current video block into a first set of sub-blocks according to a first pattern of the subdivision patterns; generating a first intermediate prediction block based on the first set of sub-blocks; partitioning the current video block into a second set of sub-blocks according to a second pattern of the subdivision patterns, wherein at least one sub-block in the second set is not in the first set; and generating a second intermediate prediction block based on the second set of sub-blocks.

18. The method of claim 17, wherein a weight value w1 associated with prediction samples generated with the first pattern is the same as a weight value w2 associated with prediction samples generated with the second pattern, and the final prediction P is computed as P = (P1 + P2) >> 1 or P = (P1 + P2 + 1) >> 1.

19. The method of claim 17, wherein a weight value w1 associated with prediction samples generated with the first pattern is different from a weight value w2 associated with prediction samples generated with the second pattern, and the final prediction P is computed as P = (w1 × P1 + w2 × P2 + offset) >> N, where the offset is 1 << (N − 1) or 0.
20. The method of claim 17, wherein a first weight Wa of the first intermediate prediction block and a second weight Wb of the second intermediate prediction block satisfy the condition Wa + Wb = 2^N, where N is an integer.

21. The method of claim 17, wherein a weight value of a first sample in a sub-block is greater than a weight value of a second sample in the sub-block if the first sample is closer than the second sample to a position from which the motion vector of the sub-block is derived.

22. A method of video processing, comprising: selectively performing, based on a video condition, interweaved-prediction-based coding of one or more of a luma component, a first chroma component, and a second chroma component of the video from a video frame, wherein performing the interweaved prediction includes determining a prediction block for a current block of a component of the video by: selecting a set of pixels of the component of the video frame to form a block; partitioning the block into a first set of sub-blocks according to a first pattern; generating a first intermediate prediction block based on the first set of sub-blocks; partitioning the block into a second set of sub-blocks according to a second pattern, wherein at least one sub-block in the second set is not in the first set; generating a second intermediate prediction block based on the second set of sub-blocks; and determining the prediction block based on the first intermediate prediction block and the second intermediate prediction block.
23. The method of claim 22, wherein the interweaved prediction is used to form the prediction block only for the luma component.

24. The method of claim 22, wherein different first patterns or second patterns are used to partition different components of the video.

25. The method of any one of claims 22 to 24, wherein the video condition includes a direction of prediction, and wherein the interweaved prediction is performed only for one of uni-directional prediction and bi-directional prediction, and not for the other of uni-directional prediction and bi-directional prediction.

26. The method of claim 22, wherein the video condition includes use of a low-delay P coding mode, and wherein, in case the low-delay P mode is used, the method includes refraining from performing the interweaved prediction.

27. The method of claim 22, wherein the video condition includes use of a current picture containing the current block as a reference for the prediction.

28. The method of any one of claims 1, 5, 8, 10, or 22, wherein the interweaved-prediction-based coding includes using the first set of sub-blocks and the second set of sub-blocks from only a portion of the current block.

29. The method of claim 28, wherein the portion of the current block excludes samples in a boundary region of the current block.
30. The method of claim 28, wherein the interweaved-prediction-based coding using the portion of the current block includes performing affine prediction using the portion of the current block.

31. The method of claim 22, wherein determining the prediction block includes determining the prediction block using a weighted average of the first intermediate prediction block and the second intermediate prediction block.

32. The method of claim 31, wherein a first weight Wa of the first intermediate prediction block and a second weight Wb of the second intermediate prediction block satisfy the condition Wa + Wb = 2^N, where N is an integer.

33. The method of claim 32, wherein Wa = 3 and Wb = 1.

34. The method of claim 22, wherein a weight value of a first sample in a sub-block is greater than a weight value of a second sample in the sub-block when the first sample is closer than the second sample to a position from which the motion vector of the sub-block is derived.

35. An apparatus comprising a processor and a non-transitory memory having instructions stored thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of any one of claims 1, 5, 8, 10, or 22.

36. A computer program product, stored on a non-transitory computer-readable medium, the computer program product comprising program code for carrying out the method of any one of claims 1, 5, 8, 10, or 22.
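The final-prediction formulas recited in claims 18 and 19 can be written as one small helper. The default arguments below are illustrative choices; the function simply evaluates P = (w1·P1 + w2·P2 + offset) >> N, which reduces to (P1 + P2 + 1) >> 1 (or (P1 + P2) >> 1 with offset 0) for equal weights.

```python
def final_prediction(p1, p2, w1=1, w2=1, n=1, offset=None):
    """Final prediction P from two intermediate predictions (claims 18-19).

    Equal weights (claim 18): P = (P1 + P2) >> 1 with offset=0, or
    P = (P1 + P2 + 1) >> 1 with the default rounding offset.
    Unequal weights (claim 19): P = (w1*P1 + w2*P2 + offset) >> N,
    where offset is 1 << (N - 1) or 0."""
    if offset is None:
        offset = 1 << (n - 1)  # rounding variant; pass offset=0 to truncate
    return (w1 * p1 + w2 * p2 + offset) >> n
```

With Wa = 3, Wb = 1 (claim 33, so N = 2 from Wa + Wb = 2^N), a sample pair (P1, P2) = (8, 4) combines to (3·8 + 1·4 + 2) >> 2 = 7.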
TW108123130A 2018-07-01 2019-07-01 Improvement on inter-layer prediction TWI705696B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2018093943 2018-07-01
WOPCT/CN2018/093943 2018-07-01

Publications (2)

Publication Number Publication Date
TW202007154A TW202007154A (en) 2020-02-01
TWI705696B true TWI705696B (en) 2020-09-21

Family

ID=67297214

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108123130A TWI705696B (en) 2018-07-01 2019-07-01 Improvement on inter-layer prediction

Country Status (3)

Country Link
CN (1) CN110677674B (en)
TW (1) TWI705696B (en)
WO (1) WO2020008325A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019229683A1 (en) 2018-05-31 2019-12-05 Beijing Bytedance Network Technology Co., Ltd. Concept of interweaved prediction
CN117915081A (en) 2019-01-02 2024-04-19 北京字节跳动网络技术有限公司 Video processing method
CN113348669A (en) * 2019-01-13 2021-09-03 北京字节跳动网络技术有限公司 Interaction between interleaved prediction and other coding tools
US11025951B2 (en) * 2019-01-13 2021-06-01 Tencent America LLC Method and apparatus for video coding
EP3959878A4 (en) * 2020-06-03 2022-07-27 Beijing Dajia Internet Information Technology Co., Ltd. Chroma coding enhancement in cross-component correlation
CN117296324A (en) * 2021-05-17 2023-12-26 抖音视界有限公司 Video processing method, apparatus and medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140003527A1 (en) * 2011-03-10 2014-01-02 Dolby Laboratories Licensing Corporation Bitdepth and Color Scalable Video Coding
EP3078196B1 (en) * 2013-12-06 2023-04-05 MediaTek Inc. Method and apparatus for motion boundary processing
US10230980B2 (en) * 2015-01-26 2019-03-12 Qualcomm Incorporated Overlapped motion compensation for video coding

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Description of Core Experiment 10: Combined and multi-hypothesis prediction" Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11 10th Meeting: San Diego, US, 10–20 Apr. 2018 *
"Performance of JEM1.0 tools analysis by Samsung" Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11 2nd Meeting: San Diego, USA, 20–26 February 2016 *

Also Published As

Publication number Publication date
CN110677674A (en) 2020-01-10
WO2020008325A1 (en) 2020-01-09
TW202007154A (en) 2020-02-01
CN110677674B (en) 2023-03-31

Similar Documents

Publication Publication Date Title
TWI731363B (en) Efficient affine merge motion vector derivation
TWI727338B (en) Signaled mv precision
TWI705696B (en) Improvement on inter-layer prediction
TWI729402B (en) Weighted interweaved prediction
CN112997480B (en) Rounding in paired average candidate calculation
CN113056914B (en) Partial position based difference calculation
CN113039802B (en) Use of history-based affine parameters
CN110740321B (en) Motion prediction based on updated motion vectors
CN113841396B (en) Simplified local illumination compensation
TWI722465B (en) Boundary enhancement for sub-block
CN110876063B (en) Fast coding method for interleaving prediction
CN110876064B (en) Partially interleaved prediction