TWI719524B - Complexity reduction of non-adjacent merge design - Google Patents

Info

Publication number
TWI719524B
TWI719524B
Authority
TW
Taiwan
Prior art keywords
merge candidate
block
nax
merge
adjacent
Prior art date
Application number
TW108123172A
Other languages
Chinese (zh)
Other versions
TW202007146A (en)
Inventor
張莉
張凱
劉鴻彬
王悅
Original Assignee
Beijing Bytedance Network Technology Co., Ltd.
ByteDance Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Bytedance Network Technology Co., Ltd. and ByteDance Inc.
Publication of TW202007146A publication Critical patent/TW202007146A/en
Application granted granted Critical
Publication of TWI719524B publication Critical patent/TWI719524B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/109 Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/124 Quantisation
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/176 Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H04N19/61 Transform coding in combination with predictive coding
    • H04N19/625 Transform coding using discrete cosine transform [DCT]
    • H04N19/70 Syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/91 Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
    • H04N19/96 Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Devices, systems and methods for complexity reduction of non-adjacent merge design are described. In a representative aspect, a method for video processing includes receiving a current block of video data, selecting, based on a rule, a first non-adjacent block that is not adjacent to the current block, constructing a merge candidate list including a first merge candidate comprising motion information based on the first non-adjacent block, and processing the current block based on the merge candidate list.

Description

Reducing the complexity of the non-adjacent Merge design

Under the applicable patent law and/or rules pursuant to the Paris Convention, this application timely claims the priority of and benefits from International Patent Application No. PCT/CN2018/093944, filed on July 1, 2018, and International Patent Application No. PCT/CN2018/104982, filed on September 11, 2018. Under U.S. law, the entire disclosures of International Patent Application No. PCT/CN2018/093944 and International Patent Application No. PCT/CN2018/104982 are incorporated herein by reference as part of the disclosure of this application.

This patent document relates generally to image and video coding and decoding technologies.

Digital video accounts for the largest bandwidth usage on the Internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, the bandwidth demand for digital video usage is expected to continue to grow.

This document describes devices, systems, and methods for reducing the complexity of the non-adjacent Merge design. For example, the presently disclosed technology specifies rules for selecting non-adjacent Merge candidates so that the size of the line buffer is kept below a threshold. The described methods may be applied to existing video coding standards (e.g., High Efficiency Video Coding (HEVC)) as well as future video coding standards or video codecs.

In one representative aspect, the disclosed technology may be used to provide a method for video processing. The method includes receiving a current block of video data; selecting, based on a rule, a first non-adjacent block that is not adjacent to the current block; constructing a Merge candidate list that includes a first Merge candidate comprising motion information based on the first non-adjacent block; and processing the current block based on the Merge candidate list.
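The claimed steps can be illustrated with a small sketch. The concrete selection rule used here (a non-adjacent block is accepted only if it lies in the same CTU row as the current block, which bounds the line-buffer size) is a hypothetical stand-in for the rules defined in the embodiments, and the helper names and the CTU size are assumptions, not part of the claims.

```python
# Hypothetical sketch of the claimed method: build a Merge candidate list
# that may include motion information from a non-adjacent block, where the
# non-adjacent block is accepted only if it satisfies a rule. The concrete
# rule below (the block must not lie above the current CTU row) is an
# assumption chosen to bound the line buffer, not the patent's exact rule.

CTU_SIZE = 128  # assumed CTU dimension

def rule_allows(current_y, cand_y, ctu_size=CTU_SIZE):
    """Allow a candidate block only if it lies in the same CTU row as the
    current block, so no motion data above the CTU row must be buffered."""
    return (cand_y // ctu_size) == (current_y // ctu_size)

def build_merge_list(current_block, adjacent_candidates, non_adjacent_blocks,
                     max_candidates=6):
    """Construct a Merge candidate list: adjacent candidates first, then
    motion info from non-adjacent blocks that pass the rule."""
    merge_list = list(adjacent_candidates)
    for blk in non_adjacent_blocks:
        if len(merge_list) >= max_candidates:
            break
        if rule_allows(current_block["y"], blk["y"]):
            mv = blk["mv"]
            if mv not in merge_list:  # pruning (redundancy check)
                merge_list.append(mv)
    return merge_list

cur = {"x": 256, "y": 256}          # current block in CTU row 2 (y 256..383)
adjacent = [(3, -1)]
non_adjacent = [
    {"y": 300, "mv": (5, 2)},       # same CTU row: allowed
    {"y": 100, "mv": (7, 7)},       # above the CTU row: rejected by the rule
]
print(build_merge_list(cur, adjacent, non_adjacent))
```
The current block is then predicted from whichever candidate in this list the bitstream selects, exactly as with an ordinary Merge list.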

In yet another representative aspect, the above-described method is embodied in the form of processor-executable code and stored in a computer-readable program medium.

In yet another representative aspect, a device that is configured or operable to perform the above-described method is disclosed. The device may include a processor that is programmed to implement this method.

In yet another representative aspect, a video decoder apparatus may implement a method as described herein.

The above and other aspects and features of the disclosed technology are described in greater detail in the drawings, the description, and the claims.

1510: initial list

1520: final list

1800, 2000: coding units (CUs)

1801, 1851, 2001 to 2004: sub-coding units (sub-CUs)

1850, 2110, 2111, 2210: reference pictures

2011 to 2014: blocks

2100: current picture

2101, 2102, MV0, MV1, MV0', MV1': motion vectors

2103, 2104, TD0, TD1: temporal distances

2200: current coding unit (current CU)

2300: unilateral motion estimation

2800: method

2810 to 2840: steps

2900: video processing apparatus

2902: processor

2904: memory

2906: video processing hardware

A0, A1, B0, B1, B2, C0, C1: positions; candidates

tb, td: POC distances

Figure 1 shows an example block diagram of a typical High Efficiency Video Coding (HEVC) video encoder and decoder.

Figure 2 shows examples of macroblock partitions in H.264/AVC.

Figure 3 shows examples of splitting a coding block (CB) into prediction blocks (PBs).

Figure 4A and Figure 4B show an example of the subdivision of a coding tree block (CTB) into CBs and transform blocks (TBs), and the corresponding quadtrees.

Figure 5A and Figure 5B show an example of the subdivision of a largest coding unit (LCU) and the corresponding QTBT (quadtree plus binary tree).

Figures 6A to 6E show examples of partitioning a coding block.

Figure 7 shows an example subdivision of a CB based on a QTBT.

Figures 8A to 8I show examples of the partitions of a CB supported by the multi-tree type (MTT), which is a generalization of the QTBT.

Figure 9 shows an example of constructing a Merge candidate list.

Figure 10 shows an example of the positions of spatial candidates.

Figure 11 shows an example of candidate pairs subject to the redundancy check of spatial Merge candidates.

Figure 12A and Figure 12B show examples of the position of a second prediction unit (PU) based on the size and shape of the current block.

Figure 13 shows an example of motion vector scaling for the temporal Merge candidate.

Figure 14 shows an example of candidate positions for the temporal Merge candidate.

Figure 15 shows an example of generating a combined bi-predictive Merge candidate.

Figure 16 shows an example of constructing motion vector prediction candidates.

Figure 17 shows an example of motion vector scaling for spatial motion vector candidates.

Figure 18 shows an example of motion prediction for a coding unit (CU) using the alternative temporal motion vector prediction (ATMVP) algorithm.

Figure 19 shows an example of the identification of a source block and a source picture.

Figure 20 shows an example of a coding unit (CU) with sub-blocks and neighboring blocks used by the spatial-temporal motion vector prediction (STMVP) algorithm.

Figure 21 shows an example of bilateral matching in the pattern-matched motion vector derivation (PMMVD) mode, which is a special Merge mode based on the frame-rate up-conversion (FRUC) algorithm.

Figure 22 shows an example of template matching in the FRUC algorithm.

Figure 23 shows an example of unilateral motion estimation in the FRUC algorithm.

Figure 24 shows an example of the decoder-side motion vector refinement (DMVR) algorithm based on bilateral template matching.

Figure 25 shows an example of the spatial neighboring blocks used to derive spatial Merge candidates.

Figure 26 shows exemplary pseudocode for adding non-adjacent Merge candidates.

Figure 27 shows an example of a restricted region for non-adjacent blocks.

Figure 28 shows a flowchart of an example method for video processing in accordance with the presently disclosed technology.

Figure 29 is a block diagram of an example of a hardware platform for implementing the visual media decoding or visual media encoding techniques described in this document.

Due to the increasing demand for higher-resolution video, video coding methods and techniques are ubiquitous in modern technology. Video codecs typically include electronic circuits or software that compress or decompress digital video, and they are continually being improved to provide higher coding efficiency. A video codec converts uncompressed video into a compressed format, and vice versa. There are complex relationships among the video quality, the amount of data used to represent the video (determined by the bit rate), the complexity of the encoding and decoding algorithms, the sensitivity to data losses and errors, the ease of editing, random access, and end-to-end delay (latency). Compressed formats usually conform to a standard video compression specification, e.g., the High Efficiency Video Coding (HEVC) standard (also known as H.265 or MPEG-H Part 2), the Versatile Video Coding standard to be finalized, or other current and/or future video coding standards.

Embodiments of the disclosed technology may be applied to existing video coding standards (e.g., HEVC, H.265) and future standards to improve compression performance. Section headings are used in this document to improve the readability of the description, and they do not in any way limit the discussion or the embodiments (and/or implementations) to the respective sections only.

1. Example embodiments of video coding

Figure 1 shows an example block diagram of a typical HEVC video encoder and decoder. An encoding algorithm producing an HEVC-compliant bitstream typically proceeds as follows. Each picture is split into block-shaped regions, with the exact block partitioning conveyed to the decoder. The first picture of a video sequence (and the first picture at each clean random access point into a video sequence) is coded using only intra-picture prediction (which uses some prediction of data spatially from region to region within the same picture but has no dependence on other pictures). For all remaining pictures of the sequence, or for the pictures between random access points, inter-picture temporal prediction coding modes are typically used for most blocks. The encoding process for inter-picture prediction consists of choosing motion data comprising the selected reference picture and the motion vector (MV) to be applied for predicting the samples of each block. The encoder and decoder generate identical inter-picture prediction signals by applying motion compensation (MC) using the MV and mode decision data, which are transmitted as side information.

The residual signal of intra- or inter-picture prediction, which is the difference between the original block and its prediction, is transformed by a linear spatial transform. The transform coefficients are then scaled, quantized, entropy coded, and transmitted together with the prediction information.

The encoder duplicates the decoder processing loop (see the gray-shaded boxes in Figure 1) so that both will generate identical predictions for subsequent data. Therefore, the quantized transform coefficients are constructed by inverse scaling and are then inverse transformed to duplicate the decoded approximation of the residual signal. The residual is then added to the prediction, and the result of that addition may then be fed into one or two loop filters to smooth out artifacts induced by the block-wise processing and quantization. The final picture representation (i.e., a duplicate of the decoder's output) is stored in a decoded picture buffer to be used for the prediction of subsequent pictures. In general, the order of the encoding or decoding processing of pictures often differs from the order in which they arrive from the source; it is necessary to distinguish between the decoding order (i.e., bitstream order) and the output order (i.e., display order) for a decoder.
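The shared reconstruction path described above can be sketched in miniature. This is not HEVC-conformant: the transform stage is omitted and a single assumed quantization step stands in for the real scaling, but it shows why the encoder and the decoder end up holding identical reference data.

```python
# Minimal sketch (not HEVC-conformant) of the shared reconstruction path:
# the encoder quantizes the residual, then reconstructs it with the same
# inverse scaling the decoder uses, so both sides hold identical references.
# A real codec would transform the residual first; here we quantize directly.

QP_STEP = 8  # assumed uniform quantization step

def quantize(residual, step=QP_STEP):
    # round-to-nearest scalar quantization
    return [round(r / step) for r in residual]

def dequantize(levels, step=QP_STEP):
    return [l * step for l in levels]

def reconstruct(prediction, levels):
    """Used by BOTH encoder and decoder: prediction + dequantized residual."""
    return [p + r for p, r in zip(prediction, dequantize(levels))]

original   = [100, 104, 96, 90]
prediction = [98, 98, 98, 98]
residual   = [o - p for o, p in zip(original, prediction)]
levels     = quantize(residual)          # what gets entropy coded and sent

enc_recon = reconstruct(prediction, levels)
dec_recon = reconstruct(prediction, levels)  # decoder sees only `levels`
print(enc_recon, enc_recon == dec_recon)
```
Because both sides reconstruct from the same quantized levels, the pictures stored in their decoded picture buffers match exactly, which is what keeps subsequent inter-picture predictions in sync.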

Video material to be encoded by HEVC is generally expected to be input as progressive-scan imagery (either because the source video originates in that format or because it results from de-interlacing prior to encoding). No explicit coding features are present in the HEVC design to support the use of interlaced scanning, as interlaced scanning is no longer used for displays and is becoming substantially less common for distribution. However, a metadata syntax has been provided in HEVC to allow an encoder to indicate that interlace-scanned video has been sent by coding each field (i.e., the even or odd numbered lines of each video frame) of the interlaced video as a separate picture, or that it has been sent by coding each interlaced frame as an HEVC coded picture. This provides an efficient method of coding interlaced video without burdening decoders with a need to support a special decoding process for it.

1.1 Examples of partition tree structures in H.264/AVC

The core of the coding layer in previous standards was the macroblock, containing a 16×16 block of luma samples and, in the usual case of 4:2:0 color sampling, two corresponding 8×8 blocks of chroma samples.

An intra-coded block uses spatial prediction to exploit the spatial correlation among pixels. Two partitions are defined: 16×16 and 4×4.

An inter-coded block uses temporal prediction, instead of spatial prediction, by estimating the motion between pictures. Motion can be estimated independently for either a 16×16 macroblock or any of its sub-macroblock partitions: 16×8, 8×16, 8×8, 8×4, 4×8, or 4×4 (as shown in Figure 2). Only one motion vector (MV) is allowed per sub-macroblock partition.
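A quick sketch of the rule just stated: since each (sub-)macroblock partition carries exactly one MV, the number of MVs a 16×16 macroblock needs follows directly from the partition shape. The helper name below is invented for illustration.

```python
# Sketch: enumerate the H.264/AVC inter-partition shapes named above and
# count how many motion vectors a 16x16 macroblock needs for each, given
# that each (sub-)macroblock partition carries exactly one MV.

MB = 16

def mv_count(part_w, part_h, mb=MB):
    """Number of MVs when the macroblock is tiled by part_w x part_h blocks."""
    assert mb % part_w == 0 and mb % part_h == 0
    return (mb // part_w) * (mb // part_h)

partitions = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]
counts = {f"{w}x{h}": mv_count(w, h) for w, h in partitions}
print(counts)
```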

1.2 Examples of partition tree structures in HEVC

In HEVC, a coding tree unit (CTU) is split into coding units (CUs) by using a quadtree structure denoted as a coding tree, to adapt to various local characteristics. The decision whether to code a picture area using inter-picture (temporal) or intra-picture (spatial) prediction is made at the CU level. Each CU can be further split into one, two, or four prediction units (PUs) according to the PU splitting type. Inside one PU, the same prediction process is applied, and the relevant information is transmitted to the decoder on a PU basis. After obtaining the residual block by applying the prediction process based on the PU splitting type, a CU can be partitioned into transform units (TUs) according to another quadtree structure similar to the coding tree for the CU. One of the key features of the HEVC structure is that it has multiple partition concepts, including CU, PU, and TU.
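The recursive quadtree splitting of a CTU into CUs can be sketched as follows. The split decision is normally made by the encoder through rate-distortion optimization, which is not modeled here; a caller-supplied predicate stands in for it, and the function name is invented for illustration.

```python
# Sketch of the recursive quadtree partitioning of a CTU into CUs. The
# split decision is normally made by rate-distortion optimization; here a
# caller-supplied predicate `should_split` stands in for that decision.

def quadtree_partition(x, y, size, min_size, should_split):
    """Return the list of leaf CUs as (x, y, size) tuples."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
            leaves += quadtree_partition(x + dx, y + dy, half, min_size,
                                         should_split)
        return leaves
    return [(x, y, size)]

# Example: split everything down to 32x32 inside a 64x64 CTU.
cus = quadtree_partition(0, 0, 64, 32, lambda x, y, s: True)
print(cus)
```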

Some of the features involved in hybrid video coding using HEVC include:

(1) Coding tree units (CTUs) and coding tree block (CTB) structure: The analogous structure in HEVC is the coding tree unit (CTU), which has a size selected by the encoder and can be larger than a traditional macroblock. A CTU consists of a luma CTB and the corresponding chroma CTBs, together with syntax elements. The size L×L of a luma CTB can be chosen as L = 16, 32, or 64 samples, with the larger sizes typically enabling better compression. HEVC then supports the partitioning of the CTBs into smaller blocks using a tree structure and quadtree-like signaling.

(2) Coding units (CUs) and coding blocks (CBs): The quadtree syntax of the CTU specifies the size and positions of its luma and chroma CBs. The root of the quadtree is associated with the CTU. Hence, the size of the luma CTB is the largest supported size for a luma CB. The splitting of a CTU into luma and chroma CBs is signaled jointly. One luma CB and ordinarily two chroma CBs, together with the associated syntax, form a coding unit (CU). A CTB may contain only one CU or may be split to form multiple CUs, and each CU has an associated partitioning into prediction units (PUs) and a tree of transform units (TUs).

(3) Prediction units (PUs) and prediction blocks (PBs): The decision whether to code a picture area using inter-picture or intra-picture prediction is made at the CU level. The root of a PU partitioning structure is at the CU level. Depending on the basic prediction-type decision, the luma and chroma CBs can then be further split in size and predicted from luma and chroma prediction blocks (PBs). HEVC supports variable PB sizes from 64×64 down to 4×4 samples. Figure 3 shows examples of the allowed PBs for an M×M CU.

(4) TUs and transform blocks (TBs): The prediction residual is coded using block transforms. The root of a TU tree structure is at the CU level. The luma CB residual may be identical to the luma transform block (TB) or may be further split into smaller luma TBs. The same applies to the chroma TBs. Integer basis functions similar to those of a discrete cosine transform (DCT) are defined for the square TB sizes 4×4, 8×8, 16×16, and 32×32. For the 4×4 transform of luma intra-picture prediction residuals, an integer transform derived from a form of the discrete sine transform (DST) may alternatively be specified.
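As an illustration of the transform bases mentioned above, the sketch below builds the real-valued DCT-II basis that HEVC's integer core transforms approximate. HEVC itself specifies scaled integer matrices, not this floating-point form; the sketch only shows the underlying basis functions and their orthonormality.

```python
# Sketch: the real-valued DCT-II basis that HEVC's integer core transforms
# approximate. HEVC uses scaled integer matrices; this floating-point
# version just shows the underlying basis functions and their orthonormality.

import math

def dct2_basis(n):
    """Orthonormal N-point DCT-II basis; rows are the basis functions."""
    basis = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        basis.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                      for i in range(n)])
    return basis

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

B = dct2_basis(4)
# Row 0 is the flat (DC) basis function; rows are mutually orthogonal.
print([round(x, 4) for x in B[0]])
```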

1.2.1 Examples of tree-structured partitioning into TBs and TUs

For residual coding, a CB can be recursively partitioned into transform blocks (TBs). The partitioning is signaled by a residual quadtree. Only square CB and TB partitioning is specified, where a block can be recursively split into quadrants, as illustrated in Figure 4A and Figure 4B. For a given luma CB of size M×M, a flag signals whether it is split into four blocks of size M/2×M/2. If further splitting is possible, as signaled by the maximum depth of the residual quadtree indicated in the sequence parameter set (SPS), each quadrant is assigned a flag that indicates whether it is split into four quadrants. The leaf node blocks resulting from the residual quadtree are the transform blocks that are further processed by transform coding. The encoder indicates the maximum and minimum luma TB sizes that it will use. Splitting is implicit when the CB size is larger than the maximum TB size. Not splitting is implicit when splitting would result in a luma TB size smaller than the indicated minimum. The chroma TB size is half the luma TB size in each dimension, except when the luma TB size is 4×4, in which case a single 4×4 chroma TB is used for the region covered by four 4×4 luma TBs. In the case of intra-picture-predicted CUs, the decoded samples of the nearest-neighboring TBs (within or outside the CB) are used as reference data for intra-picture prediction.
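The implicit-split rules just described can be sketched as a small recursion: splitting is forced while the block exceeds the maximum TB size, forbidden at the minimum TB size, and otherwise governed by the signalled flag (modeled here as a caller-supplied predicate). The function name and sizes are illustrative.

```python
# Sketch of the residual quadtree rules described above: a block larger
# than the maximum TB size is split implicitly; a block at the minimum TB
# size is never split; in between, a signalled flag (modeled here by a
# predicate) decides.

def split_to_tbs(size, max_tb, min_tb, split_flag=lambda s: False):
    """Return the list of luma TB sizes produced for a luma CB of `size`."""
    if size > max_tb:                  # implicit split
        return 4 * split_to_tbs(size // 2, max_tb, min_tb, split_flag)
    if size > min_tb and split_flag(size):
        return 4 * split_to_tbs(size // 2, max_tb, min_tb, split_flag)
    return [size]

# A 64x64 CB with a 32x32 maximum TB size must split at least once:
print(split_to_tbs(64, 32, 4))        # -> [32, 32, 32, 32]
# With a flag that always splits, recursion stops at the minimum TB size.
```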

與先前的標準相反,HEVC設計允許TB跨越多個PB以用於幀間預測的CU,以使四元樹結構的TB劃分的潛在編碼效率益處最大化。 Contrary to previous standards, the HEVC design allows TBs to span multiple PBs for inter-predicted CUs, so as to maximize the potential coding efficiency benefits of TB partitioning in a quad-tree structure.

1.2.2父節點和子節點1.2.2 Parent node and child node

根據四元樹結構對CTB進行劃分,其節點為編碼單元。四元樹結構中的多個節點包含葉節點和非葉節點。葉節點在樹結構中沒有子節點(即,葉節點不會進一步劃分)。非葉節點包含樹結構的根節點。根節點對應於視訊資料的初始視訊塊(例如,CTB)。對於多個節點的每個各自的非根節點,各自的非根節點對應於視訊塊,該視訊塊是對應於各自非根節點的樹結構中的父節點的視訊塊的子塊。多個非葉節點的每個各自的非葉節點在樹結構中具有一個或多個子節點。 The CTB is divided according to the quaternary tree structure, and its nodes are coding units. Multiple nodes in the quaternary tree structure include leaf nodes and non-leaf nodes. The leaf node has no child nodes in the tree structure (that is, the leaf node will not be further divided). Non-leaf nodes include the root node of the tree structure. The root node corresponds to the initial video block (for example, CTB) of the video data. For each respective non-root node of the plurality of nodes, the respective non-root node corresponds to a video block, and the video block is a child block of the video block corresponding to the parent node in the tree structure of the respective non-root node. Each non-leaf node of the plurality of non-leaf nodes has one or more child nodes in the tree structure.

1.3 JEM中具有較大CTU的四元樹加二元樹塊結構的示例1.3 An example of a quaternary tree with a larger CTU and a binary tree block structure in JEM

在一些實施例中,使用稱為聯合探索模型(JEM)的參考軟體來探索未來的視訊編碼技術。除二元樹結構外,JEM還描述了四元樹加二元樹(QTBT)和三元樹(TT)結構。 In some embodiments, reference software called Joint Exploration Model (JEM) is used to explore future video coding technologies. In addition to the binary tree structure, JEM also describes the quaternary tree plus binary tree (QTBT) and ternary tree (TT) structure.

1.3.1 QTBT塊劃分結構的示例1.3.1 Example of QTBT block partition structure

與HEVC不同,QTBT結構去除了多種劃分類型的概念,即,它去除了CU、PU和TU概念的分離,並且支持CU劃分形狀的更大靈活性。在QTBT塊結構中,CU可以具有正方形或矩形形狀。如圖5A中所示,編碼樹單元(CTU)首先被四元樹結構劃分。四元樹葉節點被二元樹結構進一步劃分。在二元樹劃分中有兩種劃分類型:對稱水平劃分和對稱垂直劃分。二元樹葉節點被稱為編碼單元(CU),並且該劃分被用於預測和變換處理而無需任何進一步的劃分。這意味著CU、PU和TU在QTBT編碼塊結構中具有相同的塊尺寸。在JEM中,CU有時由不同顏色分量的編碼塊(CB)組成,例如,在4:2:0色度格式的P和B切片(slice)的情況下,一個CU包含一個亮度CB和兩個色度CB;並且CU有時由單個分量的CB組成,例如,在I切片的情況下,一個CU僅包含一個亮度CB或僅包含兩個色度CB。 Unlike HEVC, the QTBT structure removes the concept of multiple partition types, that is, it removes the separation of the concepts of CU, PU, and TU, and supports greater flexibility in CU partition shapes. In the QTBT block structure, the CU may have a square or rectangular shape. As shown in FIG. 5A, the coding tree unit (CTU) is first divided by the quad tree structure. The quaternary leaf nodes are further divided by the binary tree structure. There are two types of divisions in binary tree division: symmetric horizontal division and symmetric vertical division. The binary leaf node is called a coding unit (CU), and the division is used for prediction and transformation processing without any further division. This means that CU, PU and TU have the same block size in the QTBT coding block structure. In JEM, a CU is sometimes composed of coding blocks (CB) of different color components. For example, in the case of P and B slices in the 4:2:0 chroma format, one CU contains one luminance CB and two And a CU is sometimes composed of a single component CB, for example, in the case of I slice, one CU contains only one luma CB or only two chroma CBs.

為QTBT劃分方案定義以下參數。 Define the following parameters for the QTBT division scheme.

-- CTU尺寸:四元樹的根節點尺寸,與HEVC中的概念相同 - CTU size: the size of the root node of the quaternary tree, the same concept as in HEVC

-- MinQTSize:最小允許的四元樹葉節點尺寸 - MinQTSize : the minimum allowable size of four-element leaf nodes

-- MaxBTSize:最大允許的二元樹根節點尺寸 - MaxBTSize : the maximum allowable size of the root node of the binary tree

-- MaxBTDepth:最大允許的二元樹深度 - MaxBTDepth : Maximum allowable binary tree depth

-- MinBTSize:最小允許的二元樹葉節點尺寸 - MinBTSize : the minimum allowable size of a binary leaf node

在QTBT劃分結構的一個示例中,CTU尺寸被設置為具有兩個對應的64×64色度樣本塊的128×128亮度樣本,MinQTSize被設置為16×16,MaxBTSize被設置為64×64,MinBTSize(寬度和高度)被設置為4×4,並且MaxBTDepth被設置為4。首先將四元樹劃分應用於CTU以產生四元樹葉節點。四元樹葉節點可以具有從16×16(即,MinQTSize)到128×128(即,CTU尺寸)的尺寸。如果葉四元樹節點是128×128,則由於該尺寸超過MaxBTSize(即,64×64),所以它不會被 二元樹進一步劃分。否則,葉四元樹節點可以被二元樹進一步劃分。因此,四元樹葉節點也是二元樹的根節點,並且二元樹深度為0。當二元樹深度達到MaxBTDepth(即,4)時,不考慮進一步的劃分。當二元樹節點的寬度等於MinBTSize(即,4)時,不考慮進一步的水平劃分。類似地,當二元樹節點的高度等於MinBTSize時,不考慮進一步的垂直劃分。通過預測和變換處理進一步處理二元樹的葉節點,而無需任何進一步的劃分。在JEM中,最大CTU尺寸為256×256亮度樣本。 In an example of the QTBT partition structure, the CTU size is set to 128×128 luma samples with two corresponding 64×64 chroma sample blocks, MinQTSize is set to 16×16, MaxBTSize is set to 64×64, MinBTSize (Width and Height) is set to 4×4, and MaxBTDepth is set to 4. First, the quaternary tree division is applied to the CTU to generate quaternary leaf nodes. The quaternary leaf node may have a size from 16×16 (ie, MinQTSize ) to 128×128 (ie, CTU size). If the leaf quadtree node is 128×128, since the size exceeds MaxBTSize (ie, 64×64), it will not be further divided by the binary tree. Otherwise, the leaf quaternary tree nodes can be further divided by the binary tree. Therefore, the quaternary leaf node is also the root node of the binary tree, and the depth of the binary tree is zero. When the depth of the binary tree reaches MaxBTDepth (ie, 4), no further division is considered. When the width of the binary tree node is equal to MinBTSize (ie, 4), no further horizontal division is considered. Similarly, when the height of the binary tree node is equal to MinBTSize , no further vertical division is considered. The leaf nodes of the binary tree are further processed through prediction and transformation processing without any further division. In JEM, the maximum CTU size is 256×256 luminance samples.
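The gating conditions in the example above can be sketched as follows. This is an illustrative sketch under the example parameter values; the helper names are hypothetical, and blocks are described only by their width/height.

```python
MinQTSize, MaxBTSize, MaxBTDepth, MinBTSize = 16, 64, 4, 4

def qt_can_split(size):
    # A quadtree node may be split further only while larger than MinQTSize.
    return size > MinQTSize

def bt_can_start(size):
    # A quadtree leaf may serve as a binary-tree root only if it does not
    # exceed MaxBTSize (a 128x128 leaf is therefore never BT-split).
    return size <= MaxBTSize

def bt_splits(width, height, bt_depth):
    # Per the text: no split beyond MaxBTDepth; width equal to MinBTSize rules
    # out further horizontal splitting, and height equal to MinBTSize rules
    # out further vertical splitting.
    if bt_depth >= MaxBTDepth:
        return []
    splits = []
    if width > MinBTSize:
        splits.append("horizontal")
    if height > MinBTSize:
        splits.append("vertical")
    return splits
```

Under these parameters a 128×128 quadtree leaf must keep quad-splitting, while a 64×64 leaf may become a binary-tree root with depth 0.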

圖5A繪示了通過使用QTBT進行塊劃分的示例,圖5B繪示了對應的樹表示。實線表示四元樹劃分,虛線表示二元樹劃分。在二元樹的每個劃分(即,非葉)節點中,信令通知一個標誌以指示使用哪種劃分類型(即,水平或垂直),其中0表示水平劃分並且1表示垂直劃分。對於四元樹劃分,不需要指示劃分類型,因為四元樹劃分總是水平地且垂直地劃分塊以產生具有相等尺寸的4個子塊。 FIG. 5A shows an example of block division by using QTBT, and FIG. 5B shows the corresponding tree representation. The solid line represents the quaternary tree division, and the dashed line represents the binary tree division. In each partition (ie, non-leaf) node of the binary tree, a flag is signaled to indicate which partition type (ie, horizontal or vertical) is used, where 0 represents horizontal partition and 1 represents vertical partition. For the quad-tree division, there is no need to indicate the division type, because the quad-tree division always divides the blocks horizontally and vertically to generate 4 sub-blocks of equal size.

另外,QTBT方案支持使亮度和色度具有單獨的QTBT結構的能力。目前,對於P和B切片,一個CTU中的亮度CTB和色度CTB共享相同的QTBT結構。然而,對於I切片,通過QTBT結構將亮度CTB劃分為CU,並且通過另一QTBT結構將色度CTB劃分為色度CU。這意味著I切片中的CU由亮度分量的編碼塊或兩個色度分量的編碼塊組成,並且P切片或B切片中的CU由所有三個顏色分量的編碼塊組成。 In addition, the QTBT scheme supports the ability to have a separate QTBT structure for luminance and chrominance. Currently, for P and B slices, the luminance CTB and chrominance CTB in a CTU share the same QTBT structure. However, for I slices, the luma CTB is divided into CUs through the QTBT structure, and the chrominance CTB is divided into chrominance CUs through another QTBT structure. This means that a CU in an I slice is composed of coding blocks of a luma component or two chrominance components, and a CU in a P slice or a B slice is composed of coding blocks of all three color components.

在HEVC中,小塊的幀間預測受限以減少運動補償的記憶體存取,使得對於4×8和8×4塊不支持雙向預測,並且對於4×4塊不支持幀間預測。在JEM的QTBT中,這些限制被去除。 In HEVC, inter prediction for small blocks is restricted in order to reduce the memory access of motion compensation, such that bi-prediction is not supported for 4×8 and 8×4 blocks, and inter prediction is not supported for 4×4 blocks. In the QTBT of JEM, these restrictions are removed.

1.4多功能視訊編碼(VVC)的三元樹(TT)1.4 Triple Tree (TT) of Multifunctional Video Coding (VVC)

圖6A繪示了四元樹(QT)劃分的示例,並且圖6B和圖6C分別繪示了垂直和水平二元樹(BT)劃分的示例。在一些實施例中,除了四元樹和二元樹之外,還支持三元樹(TT)劃分,例如水平和垂直中心側三元樹(如圖6D和圖6E所示)。 Fig. 6A illustrates an example of quaternary-tree (QT) partitioning, and Figs. 6B and 6C illustrate examples of vertical and horizontal binary-tree (BT) partitioning, respectively. In some embodiments, in addition to quaternary trees and binary trees, ternary-tree (TT) partitioning is also supported, e.g., horizontal and vertical center-side ternary trees (as shown in Figs. 6D and 6E).

在一些實現中,支持兩個層次的樹:區域樹(四元樹)和預測樹(二元樹或三元樹)。首先用區域樹(RT)對CTU進行劃分。可以進一步用預測樹(PT)劃分RT葉。也可以用PT進一步劃分PT葉,直到達到最大PT深度。PT葉是基本的編碼單元。為了方便起見,它仍然被稱為CU。CU不能進一步劃分。預測和變換都以與JEM相同的方式應用於CU。整個劃分結構被稱為“多類型樹”。 In some implementations, two levels of trees are supported: regional trees (quaternary trees) and prediction trees (binary trees or ternary trees). First, use the regional tree (RT) to divide the CTU. The prediction tree (PT) can be further used to divide the RT leaves. PT can also be used to further divide the PT leaf until the maximum PT depth is reached. The PT leaf is the basic coding unit. For convenience, it is still called CU. CU cannot be further divided. Both prediction and transformation are applied to CU in the same way as JEM. The whole partition structure is called "multi-type tree".

1.5可選視訊編碼技術中的劃分結構的示例1.5 Example of the partition structure in optional video coding technology

在一些實施例中,支持被稱為多樹型(MTT)的樹結構,其是QTBT的廣義化。在QTBT中,如圖7所示,首先用四元樹結構對編碼樹單元(CTU)進行劃分。然後用二元樹結構對四元樹葉節點進行進一步劃分。 In some embodiments, a tree structure called multi-tree (MTT) is supported, which is a generalization of QTBT. In QTBT, as shown in Figure 7, the coding tree unit (CTU) is first divided by a quad tree structure. Then use the binary tree structure to further divide the quaternary leaf nodes.

MTT的結構由兩種類型的樹節點組成:區域樹(RT)和預測樹(PT),支持九種類型的劃分,如圖8A至圖8I所示。區域樹可以遞歸地將CTU劃分為方形塊,直至4x4尺寸的區域樹葉節點。在區域樹的每個節點上,可以從三種樹類型中的一種形成預測樹:二元樹、三元樹和非對稱二元樹。在PT劃分中,禁止在預測樹的分支中進行四元樹劃分。和JEM一樣,亮度樹和色度樹在I切片中被分開。 The structure of MTT is composed of two types of tree nodes: regional tree (RT) and prediction tree (PT), which supports nine types of divisions, as shown in Figs. 8A to 8I. The regional tree can recursively divide the CTU into square blocks, up to 4x4 size regional leaf nodes. At each node of the regional tree, a prediction tree can be formed from one of three tree types: binary tree, ternary tree, and asymmetric binary tree. In the PT division, it is forbidden to divide the quaternary tree in the branches of the prediction tree. Like JEM, the luminance tree and the chrominance tree are separated in I slices.

2. HEVC/H.265中的幀間預測的示例2. Example of inter prediction in HEVC/H.265

多年來,視訊編碼標準已經顯著改進,並且現在部分地提供高編碼效率和對更高解析度的支持。諸如HEVC和H.265的最近的標準基於混合視訊編碼結構,其中利用時間預測加變換編碼。 Over the years, video coding standards have improved significantly and now partly provide high coding efficiency and support for higher resolutions. Recent standards such as HEVC and H.265 are based on a hybrid video coding structure in which temporal prediction plus transform coding is used.

2.1 預測模式的示例2.1 Examples of prediction modes

每個幀間預測PU(預測單元)具有一個或兩個參考圖片列表的運動 參數。在一些實施例中,運動參數包含運動向量和參考圖片索引。在其他實施例中,還可以使用inter_pred_idc來信令通知兩個參考圖片列表中的一個的使用。在又一其他實施例中,可以將運動向量明確地編碼為相對於預測器的增量。 Each inter prediction PU (prediction unit) has motion parameters of one or two reference picture lists. In some embodiments, the motion parameters include motion vectors and reference picture indexes. In other embodiments, inter_pred_idc can also be used to signal the use of one of the two reference picture lists. In yet another embodiment, the motion vector can be explicitly coded as an increment relative to the predictor.

當用跳過(skip)模式對編碼單元進行編碼時,一個PU與CU相關聯,並且不存在顯著的殘差係數、沒有編碼的運動向量增量或參考圖片索引。指定Merge模式(Merge mode),從而從相鄰PU獲得當前PU的運動參數,包含空間和時間候選。Merge模式可以應用於任何幀間預測的PU,而不僅應用於跳過模式。Merge模式的替代是運動參數的顯式傳輸,其中,對於每個PU,明確地用信令通知運動向量、每個參考圖片列表的對應參考圖片索引和參考圖片列表使用。 When coding a coding unit in skip mode, one PU is associated with a CU, and there are no significant residual coefficients, no coding motion vector increments, or reference picture indexes. Specify the Merge mode to obtain the motion parameters of the current PU from the neighboring PU, including space and time candidates. The Merge mode can be applied to any inter-predicted PU, not only to the skip mode. The alternative to the Merge mode is the explicit transmission of motion parameters, in which, for each PU, the motion vector, the corresponding reference picture index of each reference picture list, and the reference picture list use are explicitly signaled.

當信令指示將使用兩個參考圖片列表中的一個時,從一個樣本塊產生PU。這被稱為“單向預測(uni-prediction)”。單向預測可用於P切片和B切片兩者。 When the signaling indicates that one of the two reference picture lists will be used, the PU is generated from one sample block. This is called "uni-prediction". Unidirectional prediction can be used for both P slices and B slices.

當信令指示將使用兩個參考圖片列表時,從兩個樣本塊產生PU。這被稱為“雙向預測(bi-prediction)”。雙向預測僅適用於B切片。 When the signaling indicates that two reference picture lists will be used, the PU is generated from two sample blocks. This is called "bi-prediction". Bidirectional prediction is only applicable to B slices.

2.1.1 構建Merge模式的候選的實施例2.1.1 Examples of candidates for constructing Merge mode

當使用Merge模式預測PU時,從位元流解析指向Merge候選列表中的條目的索引並將其用於檢索運動資訊。該列表的構建可以根據以下步驟順序進行概述: When using the Merge mode to predict the PU, the index pointing to the entry in the Merge candidate list is parsed from the bit stream and used to retrieve motion information. The construction of this list can be summarized according to the following sequence of steps:

步驟1:初始候選推導 Step 1: Initial candidate derivation

步驟1.1:空間候選推導 Step 1.1: Spatial candidate derivation

步驟1.2:空間候選的冗餘檢查 Step 1.2: Redundancy check of spatial candidates

步驟1.3:時間候選推導 Step 1.3: Time candidate derivation

步驟2:插入額外的候選 Step 2: Insert additional candidates

步驟2.1:雙向預測候選的創建 Step 2.1: Creation of bidirectional prediction candidates

步驟2.2:零運動候選的插入 Step 2.2: Insertion of zero motion candidates

圖9繪示了基於上面概述的步驟序列構建Merge候選列表的示例。對於空間Merge候選推導,在位於五個不同位置的候選中最多選擇四個Merge候選。對於時間Merge候選推導,在兩個候選中最多選擇一個Merge候選。由於在解碼器處假設恆定數量的候選用於每個PU,因此當候選的數量未達到在切片報頭中用信令通知的最大Merge候選數量(MaxNumMergeCand)時,產生額外的候選。由於候選的數量是恆定的,因此使用截斷的一元二值化(Truncated Unary binarization,TU)來編碼最佳Merge候選的索引。如果CU的尺寸等於8,則當前CU的所有PU共享單個Merge候選列表,其與2N×2N預測單元的Merge候選列表相同。 Figure 9 illustrates an example of constructing a Merge candidate list based on the sequence of steps outlined above. For the derivation of spatial Merge candidates, at most four Merge candidates are selected among candidates located at five different positions. For the derivation of temporal Merge candidates, at most one Merge candidate is selected among the two candidates. Since a constant number of candidates is assumed for each PU at the decoder, when the number of candidates does not reach the maximum number of Merge candidates (MaxNumMergeCand) signaled in the slice header, additional candidates are generated. Since the number of candidates is constant, truncated unary binarization (TU) is used to encode the index of the best Merge candidate. If the size of the CU is equal to 8, all PUs of the current CU share a single Merge candidate list, which is the same as the Merge candidate list of the 2N×2N prediction unit.
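The step sequence above can be condensed into a list-construction skeleton. This is an illustrative sketch: candidates are abstracted to opaque motion-information values, and the detailed availability and pruning rules of the following subsections are collapsed into simple membership tests.

```python
def build_merge_list(spatial, temporal, extra, max_num_merge_cand):
    """Skeleton of the Merge-list construction order described above."""
    merge_list = []
    for cand in spatial:                 # steps 1.1/1.2: spatial candidates, pruned
        if cand not in merge_list and len(merge_list) < 4:
            merge_list.append(cand)
    merge_list += temporal[:1]           # step 1.3: at most one temporal candidate
    for cand in extra:                   # step 2: combined bi-pred, then zero candidates
        if len(merge_list) == max_num_merge_cand:
            break
        merge_list.append(cand)
    return merge_list[:max_num_merge_cand]
```

With at most four spatial candidates and one temporal candidate, the additional candidates of step 2 only fill the list up to MaxNumMergeCand.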

2.1.2構建空間Merge候選2.1.2 Build Space Merge Candidate

在空間Merge候選的推導中,在位於圖10描繪的位置的候選當中選擇最多四個Merge候選。推導的順序是A1、B1、B0、A0和B2。僅當位置A1、B1、B0、A0的任何PU不可用(例如,因為它屬另一切片或區塊)或者是幀內編碼時,才考慮位置B2。在添加位置A1處的候選之後,對剩餘候選的添加進行冗餘檢查,其確保具有相同運動資訊的候選被排除在列表之外,使得編碼效率提高。 In the derivation of the spatial Merge candidates, a maximum of four Merge candidates are selected among the candidates located at the positions depicted in FIG. 10. The order of derivation is A1, B1, B0, A0 and B2. Position B2 is considered only when any PU of positions A1, B1, B0, A0 is not available (e.g., because it belongs to another slice or tile) or is intra-coded. After the candidate at position A1 is added, the addition of the remaining candidates is subject to a redundancy check, which ensures that candidates with the same motion information are excluded from the list, so that coding efficiency is improved.

為了降低計算複雜度,在所提到的冗餘檢查中並未考慮所有可能的候選對。相反,僅考慮圖11中用箭頭連接的對,並且僅在用於冗餘檢查的對應候選具有不一樣的運動資訊時,才將候選添加到列表。重複運動資訊的另一來源是與不同於2N×2N的分區相關聯的“第二PU”。作為示例,圖12A和圖12B分別描繪了針對N×2N和2N×N的情況的第二PU。當當前PU被分區為N×2N時,位置A1處的候選不被考慮用於列表建構。在一些實施例中,通過添加該候選可能導致具有相同運動資訊的兩個預測單元,這對於在編碼單元中僅具有一個PU是多餘的。類似地,當當前PU被分區為2N×N時,不考慮位置B1。 To reduce computational complexity, not all possible candidate pairs are considered in the mentioned redundancy check. Instead, only the pairs linked with an arrow in FIG. 11 are considered, and a candidate is added to the list only if the corresponding candidate used for the redundancy check does not have the same motion information. Another source of duplicated motion information is the "second PU" associated with partitions different from 2N×2N. As an example, FIGS. 12A and 12B depict the second PU for the N×2N and 2N×N cases, respectively. When the current PU is partitioned as N×2N, the candidate at position A1 is not considered for list construction. In some embodiments, adding this candidate could lead to two prediction units having the same motion information, which is redundant to having just one PU in the coding unit. Similarly, position B1 is not considered when the current PU is partitioned as 2N×N.
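The limited pairwise redundancy check can be sketched as follows. This is an illustrative sketch: `motion` maps each position name to an opaque motion-information value, or `None` when the position is unavailable or intra-coded, and the pair set mirrors the arrows of Fig. 11.

```python
# Only these pairs are compared in the redundancy check (arrows in Fig. 11).
PRUNE_PAIRS = {"B1": ("A1",), "B0": ("B1",), "A0": ("A1",), "B2": ("A1", "B1")}

def spatial_merge_candidates(motion):
    """Derive spatial Merge candidates in the order A1, B1, B0, A0, B2."""
    out = []
    for pos in ("A1", "B1", "B0", "A0", "B2"):
        info = motion.get(pos)
        if info is None:
            continue  # unavailable or intra-coded position
        if any(motion.get(p) == info for p in PRUNE_PAIRS.get(pos, ())):
            continue  # identical motion information: pruned
        out.append(info)
        if len(out) == 4:  # at most four spatial candidates
            break
    return out
```

Because the loop stops at four candidates, B2 is reached only when the first four positions did not all contribute, matching the text above.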

2.1.3構建時間Merge候選2.1.3 Build Time Merge Candidate

在此步驟中,只有一個候選添加到列表中。特別地,在這個時間Merge候選的推導中,基於與給定參考圖片列表中當前圖片具有最小POC差異的共位的PU推導了縮放運動向量。用於推導共位的PU的參考圖片列表在切片報頭中顯式地發信令。 In this step, only one candidate is added to the list. In particular, in the derivation of this temporal Merge candidate, a scaled motion vector is derived based on the co-located PU belonging to the picture that has the smallest POC difference with the current picture within the given reference picture list. The reference picture list to be used for the derivation of the co-located PU is explicitly signaled in the slice header.

圖13繪示了針對時間Merge候選(如虛線所示)的縮放運動向量的推導的示例,其使用POC距離tb和td從共位的PU的運動向量進行縮放,其中tb定義為當前圖片的參考圖片和當前圖片之間的POC差異,並且td定義為共位的圖片的參考圖片與共位的圖片之間的POC差異。時間Merge候選的參考圖片索引設置為零。對於B切片,得到兩個運動向量(一個是對於參考圖片列表0,另一個是對於參考圖片列表1)並將其組合使其成為雙向預測Merge候選。 Figure 13 illustrates an example of the derivation of the scaled motion vector for a temporal Merge candidate (shown as a dashed line), which is scaled from the motion vector of the co-located PU using the POC distances tb and td, where tb is defined as the POC difference between the reference picture of the current picture and the current picture, and td is defined as the POC difference between the reference picture of the co-located picture and the co-located picture. The reference picture index of the temporal Merge candidate is set to zero. For a B slice, two motion vectors (one for reference picture list 0 and the other for reference picture list 1) are obtained and combined to form the bi-predictive Merge candidate.
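The tb/td scaling can be written in HEVC-style fixed-point arithmetic. This is a sketch rather than the normative text: the standard additionally clips tb and td to [-128, 127], and the sketch assumes td > 0 because Python's floor division differs from the specification's divide-toward-zero for negative td.

```python
def clip3(lo, hi, v):
    return max(lo, min(hi, v))

def scale_mv(mv, tb, td):
    # tb: POC(current picture) - POC(current picture's reference)
    # td: POC(co-located picture) - POC(co-located picture's reference)
    tx = (16384 + (abs(td) >> 1)) // td
    dist_scale_factor = clip3(-4096, 4095, (tb * tx + 32) >> 6)
    scaled = dist_scale_factor * mv
    sign = 1 if scaled >= 0 else -1
    return clip3(-32768, 32767, sign * ((abs(scaled) + 127) >> 8))
```

When tb equals td the scaled vector equals the input; halving tb halves the vector, as the POC-distance ratio suggests.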

在屬參考幀的共位的PU(Y)中,在候選C0和C1之間選擇時間候選的位置,如圖14所示。如果位置C0處的PU不可用、是幀內編碼的或在當前CTU之外,則使用位置C1。否則,位置C0被用於時間Merge候選的推導。 In the co-located PU (Y) belonging to the reference frame, the position of the temporal candidate is selected between candidates C0 and C1, as shown in FIG. 14. If the PU at position C0 is not available, is intra-coded, or is outside the current CTU, position C1 is used. Otherwise, position C0 is used for the derivation of the temporal Merge candidate.


2.1.4 構建Merge候選的額外類型2.1.4 Constructing additional types of Merge candidates

除了空時Merge候選之外,還存在兩種額外類型的Merge候選:組合的雙向預測Merge候選和零Merge候選。通過利用空時Merge候選來產生組合的雙向預測Merge候選。組合的雙向預測Merge候選僅用於B切片。通過將初始候選的第一參考圖片列表運動參數與另一候選的第二參考圖片列表運動參數組合來產生組合的雙向預測候選。如果這兩個元組提供不同的運動假設,它們將形成一個新的雙向預測候選。 In addition to space-time Merge candidates, there are two additional types of Merge candidates: combined bidirectional predictive Merge candidates and zero Merge candidates. The combined bi-directional predictive Merge candidate is generated by using the space-time Merge candidate. The combined bidirectional prediction Merge candidate is only used for B slices. The combined bi-directional prediction candidate is generated by combining the first reference picture list motion parameter of the initial candidate with the second reference picture list motion parameter of another candidate. If these two tuples provide different motion hypotheses, they will form a new bi-prediction candidate.

圖15繪示了該處理的示例,其中初始列表(1510,左側)中具有mvL0和refIdxL0或mvL1和refIdxL1的兩個候選被用於創建添加到最終列表(1520,右側)的組合的雙向預測Merge候選。 Figure 15 shows an example of this process, where two candidates with mvL0 and refIdxL0 or mvL1 and refIdxL1 in the initial list (1510, left) are used to create a combined bidirectional prediction Merge that is added to the final list (1520, right) Candidate.

插入零運動候選以填充Merge候選列表中的剩餘條目,從而達到MaxNumMergeCand容量。這些候選具有零空間位移和參考圖片索引,該參考圖片索引從零開始並且每當新的零運動候選被添加到列表時增加。這些候選使用的參考幀的數量是1和2,分別用於單向和雙向預測。在一些實施例中,不對這些候選執行冗餘檢查。 Insert zero motion candidates to fill the remaining entries in the Merge candidate list to reach the MaxNumMergeCand capacity. These candidates have a zero spatial displacement and a reference picture index, which starts at zero and increases every time a new zero motion candidate is added to the list. The number of reference frames used by these candidates is 1 and 2, which are used for unidirectional and bidirectional prediction, respectively. In some embodiments, no redundancy check is performed on these candidates.
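Both padding mechanisms can be sketched together. This is an illustrative sketch: candidates are represented as (mvL0, refIdxL0, mvL1, refIdxL1) tuples with None marking an unused list, which is an assumed representation rather than the normative one.

```python
def pad_merge_list(cands, max_num, num_ref, is_b_slice):
    out = list(cands)
    if is_b_slice:
        # Combined bi-predictive candidates: list-0 parameters of one
        # candidate paired with list-1 parameters of another.
        for a in cands:
            for b in cands:
                if len(out) == max_num:
                    return out
                if a is not b and a[0] is not None and b[2] is not None:
                    combined = (a[0], a[1], b[2], b[3])
                    if combined not in out:
                        out.append(combined)
    # Zero-motion candidates: zero MV, reference index counting up from zero.
    ref = 0
    while len(out) < max_num:
        if is_b_slice:
            out.append(((0, 0), ref, (0, 0), ref))
        else:
            out.append(((0, 0), ref, None, None))
        ref = min(ref + 1, num_ref - 1)
    return out
```

A uni-directional L0 candidate and a uni-directional L1 candidate thus combine into one bi-predictive candidate before zero candidates fill the rest of the list.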

2.1.5並行處理的運動估計區域的示例2.1.5 Examples of motion estimation regions processed in parallel

為了加速編碼處理,可以並行執行運動估計,從而同時推導給定區域內的所有預測單元的運動向量。從空間鄰域推導Merge候選可能干擾並行處理,因為一個預測單元直到其相關聯的運動估計完成時才能從相鄰PU推導運動參數。為了減輕編碼效率和處理等待時間之間的折衷,可以定義運動估計區域(MER)。MER的尺寸可以在圖片參數集(PPS)中使用“log2_parallel_merge_level_minus2”語法元素信令通知。當定義了MER時,落入同一區域的Merge候選被標記為不可用,因此在列表建構中不予考慮。 To speed up the encoding process, motion estimation can be performed in parallel, whereby the motion vectors of all prediction units inside a given region are derived simultaneously. Deriving Merge candidates from the spatial neighborhood may interfere with parallel processing, because one prediction unit cannot derive motion parameters from adjacent PUs until its associated motion estimation is completed. To mitigate the trade-off between coding efficiency and processing latency, a motion estimation region (MER) can be defined. The size of the MER is signaled in the picture parameter set (PPS) using the "log2_parallel_merge_level_minus2" syntax element. When a MER is defined, Merge candidates falling into the same region are marked as unavailable and are therefore not considered in the list construction.

表1中呈現了圖片參數集(PPS)原始字節序列有效載荷(RBSP)語法,其中log2_parallel_merge_level_minus2加2指定變量Log2ParMrgLevel的值,該變量用於如現有視訊編碼標準中規定的Merge模式亮度運動向量的推導過程以及空間Merge候選的推導過程。log2_parallel_merge_level_minus2的值應在0到CtbLog2SizeY-2的範圍內,包含0和CtbLog2SizeY-2。 Table 1 presents the picture parameter set (PPS) raw byte sequence payload (RBSP) syntax, where log2_parallel_merge_level_minus2 plus 2 specifies the value of the variable Log2ParMrgLevel, which is used in the derivation process for luma motion vectors in Merge mode and in the derivation process for spatial Merge candidates, as specified in existing video coding standards. The value of log2_parallel_merge_level_minus2 shall be in the range of 0 to CtbLog2SizeY−2, inclusive.

變量Log2ParMrgLevel推導如下:Log2ParMrgLevel=log2_parallel_merge_level_minus2+2 The variable Log2ParMrgLevel is derived as follows: Log2ParMrgLevel=log2_parallel_merge_level_minus2+2

注意Log2ParMrgLevel的值表示Merge候選列表的並行推導的內置能力。例如,當Log2ParMrgLevel等於6時,可以並行推導64×64塊中包含的所有預測單元(PU)和編碼單元(CU)的Merge候選列表。 Note that the value of Log2ParMrgLevel represents the built-in capability of parallel derivation of the Merge candidate list. For example, when Log2ParMrgLevel is equal to 6, the Merge candidate list of all prediction units (PU) and coding units (CU) contained in a 64×64 block can be derived in parallel.
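The availability rule implied by Log2ParMrgLevel can be sketched as a MER membership test (the function name is hypothetical; positions are luma sample coordinates):

```python
def in_same_mer(x_cur, y_cur, x_nbr, y_nbr, log2_par_mrg_level):
    # Two luma positions lie in the same motion estimation region when their
    # coordinates agree after dropping log2_par_mrg_level low-order bits; a
    # neighboring block in the same MER is treated as unavailable for Merge.
    return ((x_cur >> log2_par_mrg_level) == (x_nbr >> log2_par_mrg_level)
            and (y_cur >> log2_par_mrg_level) == (y_nbr >> log2_par_mrg_level))
```

With Log2ParMrgLevel equal to 6, for example, any two positions inside the same 64×64 region are in the same MER, consistent with the note above.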

[表1(PPS RBSP語法表)在原始專利文件中以圖像形式呈現,此處未轉錄。 Table 1 (the PPS RBSP syntax table) appears as an image in the original patent document and is not transcribed here.]

2.2運動向量預測的實施例2.2 Examples of motion vector prediction

運動向量預測利用運動向量與相鄰的PU的空時相關性,其用於運動參數的顯式傳輸。通過首先檢查左上方的時間相鄰的PU位置的可用性、移除冗餘的候選位置並且加上零向量以使候選列表長度恆定來構建運動向量候選列表。然後,編碼器可以從候選列表中選擇最佳的預測器,並發送指示所選候選的對應索引。與Merge索引信令類似,最佳運動向量候選的索引使用截斷的一元進行編碼。 Motion vector prediction exploits the spatio-temporal correlation of motion vectors with neighboring PUs, and is used for the explicit transmission of motion parameters. The motion vector candidate list is constructed by first checking the availability of the left and above spatially neighboring PU positions and of the temporally neighboring PU positions, removing redundant candidates, and adding zero vectors to make the candidate list length constant. The encoder can then select the best predictor from the candidate list and transmit the corresponding index indicating the selected candidate. Similarly to Merge index signaling, the index of the best motion vector candidate is encoded using truncated unary.

2.2.1構建運動向量預測候選的示例2.2.1 Example of constructing motion vector prediction candidates

圖16概括了運動向量預測候選的推導過程,並且可以針對每個參考圖片列表以refidx作為輸入實現。 Figure 16 summarizes the derivation process of motion vector prediction candidates, and can be implemented with refidx as input for each reference picture list.

在運動向量預測中,考慮了兩種類型的運動向量候選:空間運動向量候選和時間運動向量候選。對於空間運動向量候選的推導,基於位於圖10先前所示的五個不同位置的每個PU的運動向量最終推導兩個運動向量候選。 In motion vector prediction, two types of motion vector candidates are considered: spatial motion vector candidates and temporal motion vector candidates. For the derivation of the spatial motion vector candidates, two motion vector candidates are finally derived based on the motion vector of each PU located at the five different positions previously shown in FIG. 10.

對於時間運動向量候選的推導,從兩個候選中選擇一個運動向量候選,這兩個候選是基於兩個不同的共位位置推導的。在作出第一個空時候選列表後,移除列表中重複的運動向量候選。如果潛在候選的數量大於二,則從列表中移除在相關聯的參考圖片列表中參考圖片索引大於1的運動向量候選。如果空時運動向量候選數小於二,則會在列表中添加額外的零運動向量候選。 For the derivation of the temporal motion vector candidate, one motion vector candidate is selected from two candidates, which are derived based on two different co-located positions. After the first list of space-time candidates is made, duplicated motion vector candidates in the list are removed. If the number of potential candidates is larger than two, motion vector candidates whose reference picture index within the associated reference picture list is larger than 1 are removed from the list. If the number of space-time motion vector candidates is smaller than two, additional zero motion vector candidates are added to the list.
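The trimming steps above can be sketched as follows. This is an illustrative sketch: candidates are (mv, ref_idx) pairs, and the detailed spatial/temporal derivation of the following subsections is abstracted away.

```python
def build_amvp_list(spatial, temporal):
    out = []
    for cand in spatial + temporal:
        if cand not in out:      # remove duplicated candidates
            out.append(cand)
    if len(out) > 2:
        # Drop candidates whose reference picture index is larger than 1.
        out = [c for c in out if c[1] <= 1]
    out = out[:2]
    while len(out) < 2:          # pad with zero motion vector candidates
        out.append(((0, 0), 0))
    return out
```

The list therefore always holds exactly two entries, matching the constant-length requirement mentioned earlier.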

2.2.2構建空間運動向量候選2.2.2 Construction of spatial motion vector candidates

在推導空間運動向量候選時,在五個潛在候選中最多考慮兩個候選,這五個候選來自圖10先前所示位置上的PU,這些位置與運動Merge的位置相同。當前PU左側的推導順序定義為A0、A1、和縮放的A0、縮放的A1。當前PU上方的推導順序定義為B0、B1、B2、縮放的B0、縮放的B1、縮放的B2。因此,每側有四種情況可以用作運動向量候選,其中兩種情況不需要使用空間縮放,並且兩種情況使用空間縮放。四種不同的情況概括如下: When deriving the spatial motion vector candidates, at most two candidates are considered among the five potential candidates. These five candidates are from the PU at the positions previously shown in FIG. 10, and these positions are the same as the positions of the motion Merge. The derivation sequence on the left side of the current PU is defined as A 0 , A 1 , and scaled A 0 , scaled A 1 . The derivation sequence above the current PU is defined as B 0 , B 1 , B 2 , scaled B 0 , scaled B 1 , scaled B 2 . Therefore, there are four cases on each side that can be used as motion vector candidates, two cases do not need to use spatial scaling, and two cases use spatial scaling. The four different situations are summarized as follows:

--無空間縮放 --Zoom without space

(1)相同的參考圖片列表,並且相同的參考圖片索引(相同的POC) (1) The same reference picture list and the same reference picture index (same POC)

(2)不同的參考圖片列表,但是相同的參考圖片(相同的POC) (2) Different reference picture lists, but the same reference picture (same POC)

--空間縮放 --Space zoom

(3)相同的參考圖片列表,但是不同的參考圖片(不同的POC) (3) The same reference picture list, but different reference pictures (different POC)

(4)不同的參考圖片列表,並且不同的參考圖片(不同的POC) (4) Different reference picture lists, and different reference pictures (different POCs)

首先檢查無空間縮放的情況,然後檢查允許空間縮放的情況。當POC在相鄰PU的參考圖片與當前PU的參考圖片之間不同時,考慮空間縮放,而不考慮參考圖片列表。如果左側候選的所有PU都不可用或是幀內編碼,則允許對上述運動向量進行縮放,以幫助左側和上方MV候選的並行推導。否則,不允許對上述運動向量進行空間縮放。 The cases with no spatial scaling are checked first, followed by the cases where spatial scaling is allowed. Spatial scaling is considered when the POC differs between the reference picture of the neighboring PU and that of the current PU, regardless of the reference picture list. If all PUs of the left candidates are unavailable or intra-coded, scaling of the above motion vector is allowed to aid the parallel derivation of the left and above MV candidates. Otherwise, spatial scaling is not allowed for the above motion vector.
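The four cases reduce to a simple test: scaling is needed exactly when the reference-picture POC differs, regardless of which list the neighbor uses. A sketch (the function name is hypothetical):

```python
def amvp_case(same_list, same_ref_poc):
    # Cases (1)/(2): same reference picture (same POC) -> MV reused as is.
    # Cases (3)/(4): different reference picture -> spatial scaling required.
    if same_ref_poc:
        return "case 1, no scaling" if same_list else "case 2, no scaling"
    return "case 3, spatial scaling" if same_list else "case 4, spatial scaling"
```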

如在圖17中的示例所示,對於空間縮放情況,相鄰PU的運動向量以與時間縮放相似的方式縮放。一個區別在於,給出了當前PU的參考圖片列表和索引作為輸入,實際縮放處理與時間縮放處理相同。 As shown in the example in FIG. 17, for the spatial scaling case, the motion vectors of neighboring PUs are scaled in a similar manner to temporal scaling. One difference is that the reference picture list and index of the current PU are given as input, and the actual scaling process is the same as the time scaling process.

2.2.3構建時間運動向量候選2.2.3 Construction of temporal motion vector candidates

除了參考圖片索引的推導外,時間Merge候選的所有推導過程與空間運動向量候選的推導過程相同(如圖14中的示例所示)。在一些實施例中,將參考圖片索引用信令通知給解碼器。 Except for the derivation of the reference picture index, all the derivation processes of the temporal Merge candidates are the same as the derivation process of the spatial motion vector candidates (as shown in the example in FIG. 14). In some embodiments, the reference picture index is signaled to the decoder.

2.2.4 AMVP資訊的信令2.2.4 Signaling of AMVP Information

對於AMVP模式,在位元流中可以信令通知四個部分,例如預測方向、參考索引、MVD和MV預測候選索引,其在表2和表3中所示的語法的上下文中描述。 For the AMVP mode, four parts can be signaled in the bit stream, such as prediction direction, reference index, MVD and MV prediction candidate index, which are described in the context of the syntax shown in Table 2 and Table 3.

[表2和表3(AMVP相關語法表)在原始專利文件中以圖像形式呈現,此處未轉錄。 Tables 2 and 3 (the AMVP-related syntax tables) appear as images in the original patent document and are not transcribed here.]

3 聯合探索模型(JEM)中幀間預測方法的示例3 Examples of inter prediction methods in the Joint Exploration Model (JEM)

在一些實施例中，使用稱為聯合探索模型(JEM)的參考軟體來探索未來視訊編碼技術。在JEM中，在若干編碼工具中採用基於子塊的預測，諸如仿射預測、可選時間運動向量預測(ATMVP)、空時運動向量預測(STMVP)、雙向光流(BIO)、幀速率上轉換(FRUC)、局部自適應運動向量解析度(LAMVR)、重疊塊運動補償(OBMC)、局部照明補償(LIC)和解碼器側運動向量細化(DMVR)。 In some embodiments, reference software called the Joint Exploration Model (JEM) is used to explore future video coding technologies. In JEM, sub-block based prediction is adopted in several coding tools, such as affine prediction, optional temporal motion vector prediction (ATMVP), spatial-temporal motion vector prediction (STMVP), bi-directional optical flow (BIO), frame rate up-conversion (FRUC), locally adaptive motion vector resolution (LAMVR), overlapped block motion compensation (OBMC), local illumination compensation (LIC), and decoder-side motion vector refinement (DMVR).

3.1基於子CU的運動向量預測的示例3.1 Example of motion vector prediction based on sub-CU

在具有四元樹加二元樹(QTBT)的JEM中,每個CU可以針對每個預測方向具有至多一組運動參數。在一些實施例中,通過將大CU劃分成子CU並且推導大CU的所有子CU的運動資訊,在編碼器中考慮兩個子CU級運動向量預測方法。可選時間運動向量預測(ATMVP)方法允許每個CU從比共位的參考圖片中的當前CU小的多個塊中提取多組運動資訊。在空時運動向量預測(STMVP)方法中,通過使用時間運動向量預測器和空間相鄰運動向量來遞歸地推導子CU的運動向量。在一些實施例中,為了保留用於子CU運動預測的更準確的運動場,可以禁用參考幀的運動壓縮。 In JEM with a quad tree plus a binary tree (QTBT), each CU can have at most one set of motion parameters for each prediction direction. In some embodiments, by dividing the large CU into sub-CUs and deriving the motion information of all sub-CUs of the large CU, two sub-CU-level motion vector prediction methods are considered in the encoder. The optional temporal motion vector prediction (ATMVP) method allows each CU to extract multiple sets of motion information from multiple blocks smaller than the current CU in the co-located reference picture. In the space-time motion vector prediction (STMVP) method, the motion vector of the sub-CU is recursively derived by using a temporal motion vector predictor and spatial neighboring motion vectors. In some embodiments, in order to retain a more accurate motion field for sub-CU motion prediction, motion compression of reference frames may be disabled.

3.1.1可選時間運動向量預測(ATMVP)的示例3.1.1 Example of optional temporal motion vector prediction (ATMVP)

在ATMVP方法中,時間運動向量預測(TMVP)方法通過從小於當前CU的塊中提取多組運動資訊(包含運動向量和參考索引)來修改。 In the ATMVP method, the Temporal Motion Vector Prediction (TMVP) method is modified by extracting multiple sets of motion information (including motion vectors and reference indexes) from blocks smaller than the current CU.

圖18繪示了CU 1800的ATMVP運動預測處理的示例。ATMVP方法分兩步預測CU 1800內的子CU 1801的運動向量。第一步是用時間向量識別參考圖片1850中的對應塊1851。參考圖片1850還被稱為運動源圖片。第二步是將當前CU 1800劃分成子CU1801,並從與每個子CU對應的塊中獲取運動向量以及每個子CU的參考索引。 FIG. 18 shows an example of the ATMVP motion prediction processing of the CU 1800. The ATMVP method predicts the motion vector of the sub-CU 1801 in the CU 1800 in two steps. The first step is to use the time vector to identify the corresponding block 1851 in the reference picture 1850. The reference picture 1850 is also called a motion source picture. The second step is to divide the current CU 1800 into sub-CU 1801, and obtain the motion vector and the reference index of each sub-CU from the block corresponding to each sub-CU.

在第一步中，由當前CU 1800的空間相鄰塊的運動資訊確定參考圖片1850和對應塊。為了避免相鄰塊的重複掃描過程，使用當前CU 1800的Merge候選列表中的第一Merge候選。第一可用運動向量及其相關聯的參考索引被設置為時間向量和運動源圖片的索引。這樣，與TMVP相比，可以更準確地識別對應塊，其中在TMVP中對應塊(有時稱為共位塊)總是相對於當前CU位於右下或中心位置。 In the first step, the reference picture 1850 and the corresponding block are determined from the motion information of the spatially neighbouring blocks of the current CU 1800. To avoid the repetitive scanning process of neighbouring blocks, the first Merge candidate in the Merge candidate list of the current CU 1800 is used. The first available motion vector and its associated reference index are set to be the temporal vector and the index of the motion source picture. In this way, the corresponding block can be identified more accurately than in TMVP, where the corresponding block (sometimes called a collocated block) is always in the bottom-right or center position relative to the current CU.

在一個示例中，如果第一Merge候選來自左相鄰塊(即，圖19中的A1)，則使用相關的MV和參考圖片來識別源塊和源圖片。 In one example, if the first Merge candidate is from the left neighbouring block (i.e., A1 in FIG. 19), the associated MV and reference picture are used to identify the source block and the source picture.

在第二步中，通過向當前CU的座標添加時間向量，通過運動源圖片1850中的時間向量來識別子CU的對應塊1851。對於每個子CU，其對應塊(例如，覆蓋中心樣本的最小運動網格)的運動資訊用於推導子CU的運動資訊。在識別出對應的N×N塊的運動資訊之後，以與HEVC的TMVP相同的方式將其轉換為當前子CU的參考索引和運動向量，其中運動縮放和其他過程也適用。例如，解碼器檢查是否滿足低延遲條件(例如，當前圖片的所有參考圖片的POC小於當前圖片的POC)並且可能使用運動向量MVx(例如，對應於參考圖片列表X的運動向量)來預測每個子CU的運動向量MVy(例如，其中X等於0或1並且Y等於1-X)。 In the second step, the corresponding block 1851 of a sub-CU is identified by the temporal vector in the motion source picture 1850, by adding the temporal vector to the coordinates of the current CU. For each sub-CU, the motion information of its corresponding block (e.g., the smallest motion grid covering the center sample) is used to derive the motion information of the sub-CU. After the motion information of the corresponding N×N block is identified, it is converted to the reference indices and motion vectors of the current sub-CU in the same way as the TMVP of HEVC, in which motion scaling and other procedures also apply. For example, the decoder checks whether the low-delay condition is fulfilled (e.g., the POCs of all reference pictures of the current picture are smaller than the POC of the current picture) and possibly uses the motion vector MVx (e.g., the motion vector corresponding to reference picture list X) to predict the motion vector MVy of each sub-CU (e.g., with X being equal to 0 or 1 and Y being equal to 1-X).
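The low-delay check and the X-to-Y list mapping described above can be sketched as follows; the helper names are illustrative, not from the specification:

```python
def low_delay(cur_poc, ref_pocs):
    # Low-delay condition: every reference picture of the current picture
    # precedes the current picture in output order (POC).
    return all(poc < cur_poc for poc in ref_pocs)

def predicted_list(x):
    # Under the low-delay condition, the MV of list X (MVx) may be used to
    # predict the MV of list Y of each sub-CU, with Y = 1 - X.
    assert x in (0, 1)
    return 1 - x
```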

3.1.2空時運動向量預測(STMVP)的示例3.1.2 Example of Space-Time Motion Vector Prediction (STMVP)

在STMVP方法中,子CU的運動向量是按照光柵掃描順序遞歸推導的。圖20繪示了一個具有四個子塊及相鄰塊的CU的示例。考慮一個8×8的CU 2000,它包含四個4×4的子CU A(2001)、B(2002)、C(2003)和D(2004)。當前幀中相鄰的4×4的塊標記為a(2011)、b(2012)、c(2013)和d(2014)。 In the STMVP method, the motion vector of the sub-CU is derived recursively according to the raster scan order. Figure 20 shows an example of a CU with four sub-blocks and adjacent blocks. Consider an 8×8 CU 2000, which contains four 4×4 sub-CUs A (2001), B (2002), C (2003) and D (2004). The adjacent 4×4 blocks in the current frame are marked as a(2011), b(2012), c(2013) and d(2014).

子CU A的運動推導由識別其兩個空間鄰居開始。第一個鄰居是子CU A 2001上方的N×N塊(塊c 2013)。如果該塊c 2013不可用或者是幀內編碼的，則檢查子CU A(2001)上方的其它N×N塊(從左到右，從塊c 2013處開始)。第二個鄰居是子CU A 2001左側的塊(塊b 2012)。如果塊b(2012)不可用或者是幀內編碼的，則檢查子CU A 2001左側的其它塊(從上到下，從塊b 2012處開始)。每個列表從相鄰塊獲得的運動資訊被縮放到給定列表的第一個參考幀。接下來，按照與HEVC中規定的TMVP相同的程序，推導子塊A 2001的時間運動向量預測(TMVP)。提取塊D 2004處的共位塊的運動資訊並進行相應的縮放。最後，在檢索和縮放運動資訊後，對每個參考列表分別平均所有可用的運動向量。將平均的運動向量指定為當前子CU的運動向量。 The motion derivation of sub-CU A starts by identifying its two spatial neighbours. The first neighbour is the N×N block above sub-CU A 2001 (block c 2013). If block c 2013 is not available or is intra-coded, the other N×N blocks above sub-CU A 2001 are checked (from left to right, starting at block c 2013). The second neighbour is the block to the left of sub-CU A 2001 (block b 2012). If block b 2012 is not available or is intra-coded, the other blocks to the left of sub-CU A 2001 are checked (from top to bottom, starting at block b 2012). The motion information obtained from the neighbouring blocks for each list is scaled to the first reference frame of the given list. Next, the temporal motion vector predictor (TMVP) of sub-block A 2001 is derived by following the same procedure as the TMVP specified in HEVC. The motion information of the collocated block at block D 2004 is fetched and scaled accordingly. Finally, after retrieving and scaling the motion information, all available motion vectors are averaged separately for each reference list. The averaged motion vector is assigned as the motion vector of the current sub-CU.
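The final averaging step can be sketched as below. Each argument is one of the up-to-three candidates (above neighbour, left neighbour, TMVP), already scaled to the first reference frame of the list, or None when unavailable or intra-coded; the integer truncation is a simplification of this sketch, not necessarily the exact rounding of the reference software:

```python
def stmvp_mv(above_mv, left_mv, tmvp_mv):
    """Average the available STMVP candidates for one reference list."""
    avail = [mv for mv in (above_mv, left_mv, tmvp_mv) if mv is not None]
    if not avail:
        return None
    n = len(avail)
    # Component-wise average of the available motion vectors.
    return (sum(mv[0] for mv in avail) // n, sum(mv[1] for mv in avail) // n)
```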

3.1.3子CU運動預測模式信令的示例3.1.3 Example of sub-CU motion prediction mode signaling

在一些實施例中,子CU模式作為額外的Merge候選模式啟用,並且不需要額外的語法元素來信令通知該模式。將另外兩個Merge候選添加到每個CU的Merge候選列表中,以表示ATMVP模式和STMVP模式。在其他實施例中,如果序列參數集指示啟用了ATMVP和STMVP,則最多可以使用七個Merge候選。額外Merge候選的編碼邏輯與HM中的Merge候選的編碼邏輯相同,這意味著對於P切片或B切片中的每個CU,可能需要對兩個額外Merge候選進行兩次額外的RD檢查。在一些實施例中,例如JEM,Merge索引的所有二進制位(bin)都由CABAC(基於上下文的自適應二進制算數編碼)進行上下文編碼。在其他實施例中,例如HEVC,只有第一個二進制位是上下文編碼的,並且其餘的二進制位是上下文旁路編碼的。 In some embodiments, the sub-CU mode is enabled as an additional Merge candidate mode, and no additional syntax elements are required to signal the mode. The other two Merge candidates are added to the Merge candidate list of each CU to indicate ATMVP mode and STMVP mode. In other embodiments, if the sequence parameter set indicates that ATMVP and STMVP are enabled, a maximum of seven Merge candidates can be used. The coding logic of the additional Merge candidate is the same as that of the Merge candidate in the HM, which means that for each CU in the P slice or B slice, two additional RD checks may be required for two additional Merge candidates. In some embodiments, such as JEM, all bins of the Merge index are context-encoded by CABAC (Context-based Adaptive Binary Arithmetic Coding). In other embodiments, such as HEVC, only the first binary bit is context coded, and the remaining binary bits are context bypass coded.

3.2自適應運動向量差解析度的示例3.2 Example of adaptive motion vector difference resolution

在一些實施例中,當切片報頭中的use_integer_mv_flag等於0時,以四分之一亮度樣本為單位信令通知(PU的運動向量和預測運動向量之間的)運動向量差(MVD)。在JEM中,引入了局部自適應運動向量解析度(LAMVR)。在JEM中,MVD可以以四分之一亮度樣本、整數亮度樣本或4亮度樣本為單位進行編碼。在編碼單元(CU)級控制MVD解析度,並且對於具有至少一個非零MVD分量的每個CU有條件地信令通知MVD解析度標誌。 In some embodiments, when the use_integer_mv_flag in the slice header is equal to 0, the motion vector difference (MVD) (between the motion vector of the PU and the predicted motion vector) is signaled in units of a quarter luminance sample. In JEM, local adaptive motion vector resolution (LAMVR) is introduced. In JEM, MVD can be coded in units of one-quarter luminance samples, integer luminance samples, or 4 luminance samples. The MVD resolution is controlled at the coding unit (CU) level, and the MVD resolution flag is conditionally signaled for each CU with at least one non-zero MVD component.

對於具有至少一個非零MVD分量的CU，信令通知第一標誌以指示在CU中是否使用四分之一亮度樣本MV精度。當第一標誌(等於1)指示不使用四分之一亮度樣本MV精度時，信令通知另一標誌以指示是使用整數亮度樣本MV精度還是4亮度樣本MV精度。 For a CU that has at least one non-zero MVD component, a first flag is signaled to indicate whether quarter-luma-sample MV precision is used in the CU. When the first flag (equal to 1) indicates that quarter-luma-sample MV precision is not used, another flag is signaled to indicate whether integer-luma-sample MV precision or four-luma-sample MV precision is used.

當CU的第一MVD解析度標誌為零或未針對CU編碼(意味著CU中的所有MVD均為零)時,對於CU使用四分之一亮度樣本MV解析度。當CU使用整數亮度樣本MV精度或4亮度樣本MV精度時,CU的AMVP候選列表中的MVP被取整到對應的精度。 When the first MVD resolution flag of the CU is zero or is not coded for the CU (meaning that all MVDs in the CU are zero), a quarter luma sample MV resolution is used for the CU. When the CU uses integer luma sample MV precision or 4 luma sample MV precision, the MVP in the AMVP candidate list of the CU is rounded to the corresponding precision.
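The rounding of the MVPs to the selected precision can be sketched as below, assuming MVs stored in quarter-luma-sample units; the exact rounding mode is an assumption of this sketch (round half away from zero):

```python
def round_mvp(mv, resolution):
    """Round an MVP, stored in quarter-luma-sample units, to the MVD precision.

    resolution: 'quarter' (no rounding), 'integer' (multiples of 4
    quarter-samples), or 'four' (multiples of 16 quarter-samples).
    """
    step = {'quarter': 1, 'integer': 4, 'four': 16}[resolution]
    def one(c):
        q = (abs(c) + step // 2) // step * step
        return q if c >= 0 else -q
    return (one(mv[0]), one(mv[1]))
```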

在編碼器中，CU級RD檢查用於確定將哪個MVD解析度用於CU。即，對每個MVD解析度各執行一次，共執行三次CU級RD檢查。為了加快編碼器速度，在JEM中應用以下編碼方案。 In the encoder, CU-level RD checks are used to determine which MVD resolution is to be used for a CU. That is, the CU-level RD check is performed three times, once for each MVD resolution. To speed up the encoder, the following coding schemes are applied in JEM.

--在具有正常四分之一亮度樣本MVD解析度的CU的RD檢查期間，儲存當前CU的運動資訊(整數亮度樣本準確度)。儲存的運動資訊(在取整之後)被用作在RD檢查期間針對具有整數亮度樣本和4亮度樣本MVD解析度的相同CU的進一步小範圍運動向量細化的起點，使得耗時的運動估計過程不重複三次。 -- During the RD check of a CU with the normal quarter-luma-sample MVD resolution, the motion information of the current CU (integer-luma-sample accuracy) is stored. The stored motion information (after rounding) is used as the starting point for further small-range motion vector refinement during the RD check for the same CU with the integer-luma-sample and four-luma-sample MVD resolutions, so that the time-consuming motion estimation process is not repeated three times.

--有條件地調用具有4亮度樣本MVD解析度的CU的RD檢查。對於CU，當整數亮度樣本MVD解析度的RD成本遠大於四分之一亮度樣本MVD解析度的RD成本時，跳過針對CU的4亮度樣本MVD解析度的RD檢查。 -- The RD check of a CU with the four-luma-sample MVD resolution is invoked conditionally. For a CU, when the RD cost of the integer-luma-sample MVD resolution is much larger than that of the quarter-luma-sample MVD resolution, the RD check of the four-luma-sample MVD resolution for the CU is skipped.

3.3模式匹配運動向量推導(PMMVD)的示例3.3 Example of Pattern Matching Motion Vector Derivation (PMMVD)

PMMVD模式是基於幀速率上轉換(FRUC)方法的特殊Merge模式。利用該模式,在解碼器側推導塊的運動資訊,而不是信令通知塊的運動資訊。 The PMMVD mode is a special Merge mode based on the frame rate up conversion (FRUC) method. Using this mode, the motion information of the block is derived on the decoder side instead of signaling the motion information of the block.

當CU的Merge標誌為真時,可以向CU信令通知FRUC標誌。當FRUC標誌為假時,可以信令通知Merge索引並使用常規Merge模式。當FRUC標誌為真時,可以信令通知額外的FRUC模式標誌以指示將使用哪種方法(例如,雙邊匹配或模板匹配)來推導該塊的運動資訊。 When the Merge flag of the CU is true, the FRUC flag can be signaled to the CU. When the FRUC flag is false, the Merge index can be signaled and the regular Merge mode can be used. When the FRUC flag is true, an additional FRUC mode flag can be signaled to indicate which method (for example, bilateral matching or template matching) will be used to derive the motion information of the block.

在編碼器側，關於是否對CU使用FRUC Merge模式的決定是基於對正常Merge候選所做的RD成本選擇。例如，通過使用RD成本選擇來檢查CU的兩種匹配模式(雙邊匹配和模板匹配)兩者。引起最小成本的匹配模式與其他CU模式進一步比較。如果FRUC匹配模式是最有效的模式，則對於CU將FRUC標誌設置為真，並且使用相關的匹配模式。 On the encoder side, the decision on whether to use the FRUC Merge mode for a CU is based on RD cost selection, as done for normal Merge candidates. For example, both matching modes (bilateral matching and template matching) are checked for a CU by using RD cost selection. The matching mode leading to the minimal cost is further compared to other CU modes. If the FRUC matching mode is the most efficient one, the FRUC flag is set to true for the CU and the related matching mode is used.

通常,FRUC Merge模式中的運動推導過程具有兩個步驟:首先執行CU級運動搜索,然後進行子CU級運動細化。在CU級,基於雙邊匹配或模板匹配,推導整個CU的初始運動向量。首先,產生MV候選列表,並且選擇引起最小匹配成本的候選作為進一步CU級細化的起點。然後,在起點附近執行基於雙邊匹配或模板匹配的局部搜索。將最小匹配成本的MV結果作為整個CU的MV。隨後,以推導的CU運動向量作為起點,進一步在子CU級細化運動資訊。 Generally, the motion derivation process in the FRUC Merge mode has two steps: first, perform a CU-level motion search, and then perform sub-CU-level motion refinement. At the CU level, based on bilateral matching or template matching, the initial motion vector of the entire CU is derived. First, a list of MV candidates is generated, and the candidate that causes the smallest matching cost is selected as the starting point for further CU-level refinement. Then, a local search based on bilateral matching or template matching is performed near the starting point. The MV result with the smallest matching cost is taken as the MV of the entire CU. Subsequently, the derived CU motion vector is used as a starting point to further refine the motion information at the sub-CU level.

例如,對於W×H CU運動資訊推導執行以下推導過程。在第一階段,推導整個W×H CU的MV。在第二階段,該CU進一步被劃分成M×M個子CU。M的值的計算方法如等式(3)所示,D是預定義的劃分深度,在JEM中預設設置為3。然後推導每個子CU的MV。 For example, for W × H CU motion information derivation, the following derivation process is performed. In the first stage, the MV of the entire W × H CU is derived. In the second stage, the CU is further divided into M × M sub-CUs. The calculation method of the value of M is shown in equation (3), and D is the predefined division depth, which is preset to 3 in JEM. Then derive the MV of each sub-CU.

Figure 108123172-A0305-02-0029-5
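Equation (3) is rendered as an image above. In the JEM description, the sub-CU size is commonly given as M = max(4, min(W, H)/2^D) with D = 3 by default; a sketch under that assumption:

```python
def fruc_subcu_size(w, h, depth=3):
    # Assumed form of equation (3): M = max(4, min(W, H) / 2^D), D = 3 by
    # default in JEM, so a CU is never split below a 4x4 sub-CU.
    return max(4, min(w, h) >> depth)
```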

圖21繪示了在幀速率上轉換(FRUC)方法中使用的雙邊匹配的示例。通過沿當前CU的運動軌跡在兩個不同的參考圖片(2110、2111)中找到兩個塊之間最接近的匹配，使用雙邊匹配來推導當前CU(2100)的運動資訊。在連續運動軌跡假設下，指向兩個參考塊的運動向量MV0(2101)和MV1(2102)與當前圖片和兩個參考圖片之間的時間距離(例如，TD0(2103)和TD1(2104))成正比。在一些實施例中，當當前圖片2100在時間上位於兩個參考圖片(2110、2111)之間並且當前圖片到兩個參考圖片的時間距離相同時，雙邊匹配成為基於鏡像的雙向MV。 FIG. 21 shows an example of the bilateral matching used in the frame rate up-conversion (FRUC) method. Bilateral matching is used to derive the motion information of the current CU (2100) by finding the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures (2110, 2111). Under the assumption of a continuous motion trajectory, the motion vectors MV0 (2101) and MV1 (2102) pointing to the two reference blocks are proportional to the temporal distances between the current picture and the two reference pictures (e.g., TD0 (2103) and TD1 (2104)). In some embodiments, when the current picture 2100 is temporally between the two reference pictures (2110, 2111) and the temporal distances from the current picture to the two reference pictures are the same, the bilateral matching becomes a mirror-based bi-directional MV.
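The proportionality constraint can be sketched as follows, with TD0 and TD1 taken as signed POC distances so that equal distances on opposite sides yield the mirrored MV; the rounding is an assumption of this sketch:

```python
def mirrored_mv(mv0, td0, td1):
    """Derive MV1 from MV0 under the continuous-motion-trajectory assumption.

    MV0 / TD0 = MV1 / TD1, where TD0 and TD1 are signed temporal distances
    from the current picture to the two reference pictures.
    """
    return (round(mv0[0] * td1 / td0), round(mv0[1] * td1 / td0))
```

With td0 = 2 (past reference) and td1 = -2 (future reference at the same distance), the result is the exact mirror of MV0.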

圖22繪示了在幀速率上轉換(FRUC)方法中使用的模板匹配的示例。通過在當前圖片中的模板(例如，當前CU的頂部和/或左側相鄰塊)和參考圖片2210中的塊(例如，與模板尺寸相同)之間找到最接近的匹配，使用模板匹配來推導當前CU 2200的運動資訊。除了上述的FRUC Merge模式外，模板匹配也可以應用於AMVP模式。在JEM和HEVC兩者中，AMVP有兩個候選。利用模板匹配方法，可以推導新的候選。如果由模板匹配新推導的候選與第一個現有AMVP候選不同，則將其插入AMVP候選列表的最開始處，並且然後將列表尺寸設置為2(例如，通過移除第二個現有AMVP候選)。當應用於AMVP模式時，僅應用CU級搜索。 FIG. 22 shows an example of the template matching used in the frame rate up-conversion (FRUC) method. Template matching is used to derive the motion information of the current CU 2200 by finding the closest match between a template (e.g., the top and/or left neighbouring blocks of the current CU) in the current picture and a block (e.g., of the same size as the template) in the reference picture 2210. In addition to the aforementioned FRUC Merge mode, template matching can also be applied to the AMVP mode. In both JEM and HEVC, AMVP has two candidates. With the template matching method, a new candidate can be derived. If the candidate newly derived by template matching is different from the first existing AMVP candidate, it is inserted at the very beginning of the AMVP candidate list, and then the list size is set to two (e.g., by removing the second existing AMVP candidate). When applied to the AMVP mode, only the CU-level search is applied.

CU級的MV候選集可以包含以下：(1)初始AMVP候選，如果當前CU處於AMVP模式，(2)所有Merge候選，(3)插值MV場(稍後描述)中的幾個MV，以及(4)頂部和左側相鄰運動向量。 The MV candidate set at the CU level can include the following: (1) the original AMVP candidates if the current CU is in AMVP mode, (2) all Merge candidates, (3) several MVs in the interpolated MV field (described later), and (4) the top and left neighbouring motion vectors.

當使用雙邊匹配時，Merge候選的每個有效MV可以用作輸入，以產生假設雙邊匹配的情況下的MV對。例如，在參考列表A中，Merge候選的一個有效MV是(MVa,refa)。然後，在其他參考列表B中找到其配對的雙邊MV的參考圖片refb，使得refa和refb在時間上位於當前圖片的不同側。如果這樣的refb在參考列表B中不可用，則refb被確定為與refa不同的參考，並且其到當前圖片的時間距離是列表B中的最小值。在確定refb之後，通過基於當前圖片與refa和refb之間的時間距離縮放MVa來推導MVb。 When bilateral matching is used, each valid MV of a Merge candidate can be used as an input to generate an MV pair under the assumption of bilateral matching. For example, one valid MV of a Merge candidate in reference list A is (MVa, refa). Then, the reference picture refb of its paired bilateral MV is found in the other reference list B, so that refa and refb are temporally on different sides of the current picture. If such a refb is not available in reference list B, refb is determined to be a reference different from refa, and its temporal distance to the current picture is the minimal one in list B. After refb is determined, MVb is derived by scaling MVa based on the temporal distances between the current picture and refa and refb, respectively.
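A sketch of the refb selection and MVb derivation, under the assumption that when several opposite-side pictures exist the temporally closest one is taken (the text does not specify this tie-break), and with simple floating-point scaling in place of the fixed-point arithmetic:

```python
def paired_bilateral_mv(mva, ref_a_poc, cur_poc, list_b_pocs):
    """Find ref_b in the other list B and derive MVb for bilateral matching.

    list_b_pocs: POCs of the reference pictures in list B (assumed non-empty
    and containing at least one picture different from ref_a).
    """
    side = cur_poc - ref_a_poc  # > 0 means ref_a is in the past
    # Prefer a picture on the opposite temporal side of the current picture.
    opposite = [p for p in list_b_pocs if (cur_poc - p) * side < 0]
    if opposite:
        ref_b_poc = min(opposite, key=lambda p: abs(cur_poc - p))
    else:
        # Otherwise: a reference different from ref_a, closest to the current picture.
        candidates = [p for p in list_b_pocs if p != ref_a_poc]
        ref_b_poc = min(candidates, key=lambda p: abs(cur_poc - p))
    td_a, td_b = cur_poc - ref_a_poc, cur_poc - ref_b_poc
    mvb = (round(mva[0] * td_b / td_a), round(mva[1] * td_b / td_a))
    return ref_b_poc, mvb
```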

在一些實現中,還將來自插值MV場中的四個MV添加到CU級候選列表中。更具體地,添加當前CU的位置(0,0),(W/2,0),(0,H/2)和(W/2,H/2)處插值的MV。當在AMVP模式下應用FRUC時,初始的AMVP候選也添加到CU級的MV候選集。在一些實現中,在CU級,可以將AMVP CU的15個MV和Merge CU的13個MV添加到候選列表中。 In some implementations, four MVs from the interpolated MV field are also added to the CU-level candidate list. More specifically, the interpolated MVs at positions (0, 0), (W/2, 0), (0, H/2), and (W/2, H/2) of the current CU are added. When FRUC is applied in AMVP mode, the initial AMVP candidate is also added to the MV candidate set at the CU level. In some implementations, at the CU level, 15 MVs of AMVP CU and 13 MVs of Merge CU can be added to the candidate list.

子CU級設置的MV候選包含：(1)從CU級搜索確定的MV，(2)頂部、左側、左上方和右上方相鄰的MV，(3)來自參考圖片的共位的MV的縮放版本，(4)一個或多個ATMVP候選(例如，最多4個)，以及(5)一個或多個STMVP候選(例如，最多4個)。來自參考圖片的縮放MV推導如下。兩個列表中的參考圖片被遍歷。參考圖片中子CU的共位位置處的MV被縮放為起始CU級MV的參考。ATMVP和STMVP候選可以為前四個。在子CU級，一個或多個MV(最多17個)被添加到候選列表中。 The MV candidate set at the sub-CU level includes: (1) the MV determined from the CU-level search, (2) the top, left, top-left and top-right neighbouring MVs, (3) scaled versions of collocated MVs from reference pictures, (4) one or more ATMVP candidates (e.g., up to four), and (5) one or more STMVP candidates (e.g., up to four). The scaled MVs from reference pictures are derived as follows. The reference pictures in both lists are traversed. The MV at the collocated position of the sub-CU in a reference picture is scaled to the reference of the starting CU-level MV. The ATMVP and STMVP candidates can be the first four. At the sub-CU level, one or more MVs (up to 17) are added to the candidate list.

插值MV場的產生。在對幀進行編碼之前，基於單向ME產生整個圖片的插值運動場。然後，該運動場可以隨後用作CU級或子CU級的MV候選。 Generation of an interpolated MV field. Before coding a frame, an interpolated motion field is generated for the whole picture based on unilateral ME. Then, the motion field can subsequently be used as CU-level or sub-CU-level MV candidates.

在一些實施例中，兩個參考列表中每個參考圖片的運動場在4×4的塊級別上被遍歷。圖23繪示了FRUC方法中的單邊運動估計(ME)2300的示例。對於每個4×4塊，如果與塊相關聯的運動通過當前圖片中的4×4塊並且該塊未被分配任何插值運動，則參考塊的運動根據時間距離TD0和TD1(以與HEVC中的TMVP的MV縮放的方式相同的方式)被縮放到當前圖片，並且將縮放的運動分配給當前幀中的塊。如果沒有縮放的MV被分配給4×4塊，則在插值運動場中將塊的運動標記為不可用。 In some embodiments, the motion field of each reference picture in both reference lists is traversed at the 4×4 block level. FIG. 23 shows an example of unilateral motion estimation (ME) 2300 in the FRUC method. For each 4×4 block, if the motion associated with the block passes through a 4×4 block in the current picture and the block has not been assigned any interpolated motion, the motion of the reference block is scaled to the current picture according to the temporal distances TD0 and TD1 (in the same way as the MV scaling of TMVP in HEVC), and the scaled motion is assigned to the block in the current frame. If no scaled MV is assigned to a 4×4 block, the block's motion is marked as unavailable in the interpolated motion field.

插值和匹配成本。當運動向量指向分數樣本位置時,需要運動補償插值。為了降低複雜度,替代常規8抽頭HEVC插值,可以將雙線性插值用於雙邊匹配和模板匹配。 Interpolation and matching costs. When the motion vector points to the position of the fractional sample, motion compensation interpolation is required. In order to reduce complexity, instead of conventional 8-tap HEVC interpolation, bilinear interpolation can be used for bilateral matching and template matching.

匹配成本的計算在不同步驟處有點不同。當從CU級的候選集中選擇候選時，匹配成本可以是雙邊匹配或模板匹配的絕對差之和(SAD)。在確定起始MV之後，子CU級搜索的雙邊匹配的匹配成本C計算如下：C = SAD + w · (|MVx - MVx^s| + |MVy - MVy^s|) 等式(4) The calculation of the matching cost is a bit different at different steps. When a candidate is selected from the candidate set at the CU level, the matching cost can be the sum of absolute differences (SAD) of bilateral matching or template matching. After the starting MV is determined, the matching cost C of the bilateral matching at the sub-CU-level search is calculated as follows: C = SAD + w · (|MVx - MVx^s| + |MVy - MVy^s|)  Equation (4)

這裏，w是權重係數。在一些實施例中，w被經驗地設置為4。MV和MV^s分別指示當前MV和起始MV。仍然可以將SAD用作子CU級搜索的模板匹配的匹配成本。 Here, w is a weighting factor. In some embodiments, w is empirically set to 4. MV and MV^s indicate the current MV and the starting MV, respectively. SAD can still be used as the matching cost of template matching at the sub-CU-level search.
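Equation (4) maps directly to a one-line helper; w defaults to the empirical value 4:

```python
def bilateral_match_cost(sad, mv, mv_start, w=4):
    # C = SAD + w * (|MVx - MVx^s| + |MVy - MVy^s|), with w = 4 in JEM.
    return sad + w * (abs(mv[0] - mv_start[0]) + abs(mv[1] - mv_start[1]))
```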

在FRUC模式中,僅通過使用亮度樣本來推導MV。推導的運動將用於MC幀間預測的亮度和色度兩者。在確定MV之後,使用用於亮度的8抽頭插值濾波器和用於色度的4抽頭插值濾波器來執行最終MC。 In FRUC mode, MV is derived only by using luminance samples. The derived motion will be used for both luma and chroma for MC inter prediction. After the MV is determined, the final MC is performed using an 8-tap interpolation filter for luma and a 4-tap interpolation filter for chroma.

MV細化是基於模式的MV搜索，以雙邊匹配成本或模板匹配成本為標準。在JEM中，支持兩種搜索模式，即無限制中心偏置菱形搜索(UCBDS)和自適應交叉搜索，分別在CU級和子CU級進行MV細化。對於CU和子CU級MV細化兩者，以四分之一亮度樣本MV精度直接搜索MV，並且接著是八分之一亮度樣本MV細化。將用於CU和子CU步驟的MV細化的搜索範圍設置為等於8個亮度樣本。 MV refinement is a pattern-based MV search with the criterion of bilateral matching cost or template matching cost. In JEM, two search patterns are supported, namely the unrestricted center-biased diamond search (UCBDS) and the adaptive cross search, for MV refinement at the CU level and the sub-CU level, respectively. For both CU-level and sub-CU-level MV refinement, the MV is directly searched at quarter-luma-sample MV accuracy, followed by one-eighth-luma-sample MV refinement. The search range of the MV refinement for the CU and sub-CU steps is set equal to 8 luma samples.

在雙邊匹配Merge模式中，應用雙向預測，因為CU的運動資訊是基於在兩個不同的參考圖片中沿當前CU的運動軌跡的兩個塊之間的最近匹配推導的。在模板匹配Merge模式中，編碼器可以從列表0中的單向預測、列表1中的單向預測或雙向預測當中為CU選擇。選擇可以基於如下的模板匹配成本：如果costBi<=factor*min(cost0,cost1)則使用雙向預測；否則，如果cost0<=cost1則使用列表0中的單向預測；否則，使用列表1中的單向預測。這裏，cost0是列表0模板匹配的SAD，cost1是列表1模板匹配的SAD，並且costBi是雙向預測模板匹配的SAD。例如，當factor的值等於1.25時，意味著選擇處理偏向於雙向預測。幀間預測方向選擇可以應用於CU級模板匹配處理。 In the bilateral matching Merge mode, bi-prediction is applied because the motion information of a CU is derived based on the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures. In the template matching Merge mode, the encoder can choose among uni-prediction from list0, uni-prediction from list1, or bi-prediction for a CU. The selection can be based on template matching cost as follows: if costBi <= factor * min(cost0, cost1), bi-prediction is used; otherwise, if cost0 <= cost1, uni-prediction from list0 is used; otherwise, uni-prediction from list1 is used. Here, cost0 is the SAD of the list0 template matching, cost1 is the SAD of the list1 template matching, and costBi is the SAD of the bi-prediction template matching. For example, a value of factor equal to 1.25 means that the selection process is biased toward bi-prediction. The inter prediction direction selection can be applied to the CU-level template matching process.
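The three-way decision rule above can be sketched as:

```python
def select_prediction_direction(cost0, cost1, cost_bi, factor=1.25):
    """Pick the prediction direction from template matching SADs.

    A factor greater than 1 biases the decision toward bi-prediction.
    """
    if cost_bi <= factor * min(cost0, cost1):
        return 'bi'
    return 'list0' if cost0 <= cost1 else 'list1'
```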

3.4解碼器側運動向量細化(DMVR)的示例3.4 Example of Decoder-side Motion Vector Refinement (DMVR)

在雙向預測操作中，對於一個塊區域的預測，將分別使用list0的運動向量(MV)和list1的MV形成的兩個預測塊進行組合以形成單個預測信號。在解碼器側運動向量細化(DMVR)方法中，通過雙邊模板匹配處理進一步細化雙向預測的兩個運動向量。雙邊模板匹配應用在解碼器中，以在雙邊模板和參考圖片中的重建樣本之間執行基於失真的搜索，以便獲得細化的MV而無需傳輸附加的運動資訊。 In the bi-prediction operation, for the prediction of one block region, two prediction blocks, formed using the motion vector (MV) of list0 and the MV of list1 respectively, are combined to form a single prediction signal. In the decoder-side motion vector refinement (DMVR) method, the two motion vectors of the bi-prediction are further refined by a bilateral template matching process. The bilateral template matching is applied in the decoder to perform a distortion-based search between the bilateral template and the reconstructed samples in the reference pictures, in order to obtain refined MVs without transmission of additional motion information.

在DMVR中,分別從列表0的初始MV0和列表1的MV1,將雙邊模板產生為兩個預測塊的加權組合(即平均),如圖24所示。模板匹配操作包含計算所產生的模板與參考圖片中的(在初始預測塊周圍的)樣本區域之間的成本度量。對於兩個參考圖片中的每個,將產生最小模板成本的MV考慮為該列表的更新MV以替換初始MV。在JEM中,對每個列表搜索九個MV候選。該九個MV候選包含初始MV和8個與初始MV在水平或垂直方向上或兩個方向上具有一個亮度樣本偏移的環繞的MV。最後,將兩個新的MV,即如圖24中所示的MV0’和MV1’,用於產生最終的雙向預測結果。將絕對差之和(SAD)用作成本度量。 In DMVR, from the initial MV0 of list 0 and MV1 of list 1, the bilateral template is generated as a weighted combination (ie, average) of two prediction blocks, as shown in FIG. 24. The template matching operation includes calculating the cost metric between the generated template and the sample area (around the initial prediction block) in the reference picture. For each of the two reference pictures, the MV that produces the smallest template cost is considered as the updated MV of the list to replace the original MV. In JEM, each list is searched for nine MV candidates. The nine MV candidates include the initial MV and 8 surrounding MVs with one luminance sample offset from the initial MV in the horizontal or vertical direction or in both directions. Finally, two new MVs, MV0' and MV1' as shown in Fig. 24, are used to generate the final bidirectional prediction result. The sum of absolute difference (SAD) is used as a cost metric.
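The nine-candidate search around one initial MV can be sketched as below; template_cost stands in for the SAD between the bilateral template and the reference samples addressed by a candidate MV, and its exact computation is not modelled here:

```python
def dmvr_refine(initial_mv, template_cost):
    """Search the 9 DMVR candidates around one initial MV (per reference list).

    Candidates are the initial MV plus the 8 neighbours at one-luma-sample
    offsets in the horizontal, vertical, or both directions; the candidate
    with the minimal template cost becomes the refined MV.
    """
    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    candidates = [(initial_mv[0] + dx, initial_mv[1] + dy) for dx, dy in offsets]
    return min(candidates, key=template_cost)
```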

將DMVR應用於雙向預測的Merge模式,其中一個MV來自過去的參考圖片,另一MV來自未來的參考圖片,而無需傳輸額外的語法元素。在JEM中,當對CU啟用LIC、仿射運動、FRUC或子CU Merge候選時,不應用DMVR。 DMVR is applied to the Merge mode of bidirectional prediction, where one MV comes from a reference picture in the past and the other MV comes from a reference picture in the future, without the need to transmit additional syntax elements. In JEM, when LIC, affine motion, FRUC, or sub-CU Merge candidates are enabled for the CU, DMVR is not applied.

3.5 具有雙邊匹配細化的合併/跳過模式的示例3.5 Example of merge/skip mode with two-sided matching refinement

首先如下構建Merge候選列表：利用冗餘檢查將空間相鄰和時間相鄰塊的運動向量和參考索引插入候選列表，直到可用候選的數量達到最大候選尺寸19。根據預定的插入順序，通過插入空間候選、時間候選、仿射候選、高級時間MVP(ATMVP)候選、空時MVP(STMVP)候選以及HEVC中使用的額外候選(組合候選和零候選)來構建Merge/跳過模式的Merge候選列表，並且在圖25中所示的編號的塊的上下文中： The Merge candidate list is first constructed as follows: the motion vectors and reference indices of the spatially neighbouring and temporally neighbouring blocks are inserted into the candidate list with a redundancy check, until the number of available candidates reaches the maximum candidate size of 19. The Merge candidate list for the merge/skip modes is constructed by inserting spatial candidates, temporal candidates, affine candidates, advanced temporal MVP (ATMVP) candidates, spatial-temporal MVP (STMVP) candidates, and the additional candidates as used in HEVC (combined candidates and zero candidates), according to a predefined insertion order, and in the context of the numbered blocks shown in FIG. 25:

(1)塊1-4的空間候選 (1) Spatial candidates for blocks 1-4

(2)塊1-4的外插仿射候選 (2) Extrapolated affine candidates for blocks 1-4

(3)ATMVP (3) ATMVP

(4)STMVP (4) STMVP

(5)虛擬仿射候選 (5) Virtual affine candidate

(6)空間候選(塊5)(僅在可用候選數小於6時使用) (6) Spatial candidates (block 5) (only used when the number of available candidates is less than 6)

(7)外插仿射候選(塊5) (7) Extrapolate affine candidates (block 5)

(8)時間候選(如在HEVC中推導的) (8) Temporal candidates (as deduced in HEVC)

(9)非相鄰空間候選,隨後是外插仿射候選(塊6至49) (9) Non-adjacent spatial candidates, followed by extrapolated affine candidates (blocks 6 to 49)

(10)組合候選 (10) Combination candidates

(11)零候選 (11) Zero candidates

可以注意到，除了STMVP和仿射之外，IC標誌也從Merge候選繼承。此外，對於前四個空間候選，在單預測的候選之前插入雙預測候選。 It can be noted that, except for STMVP and affine, the IC flag is also inherited from Merge candidates. Moreover, for the first four spatial candidates, the bi-prediction candidates are inserted before the candidates with uni-prediction.
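The list construction with a redundancy check and the size cap of 19 can be sketched generically; candidates are modelled here as hashable tuples of motion parameters, which is an assumption of this sketch (the real redundancy check compares MVs and reference indices field by field):

```python
def build_merge_list(candidate_groups, max_size=19):
    """Insert candidates group by group, in the predefined insertion order.

    candidate_groups: an iterable of candidate lists ordered as in the text
    (spatial, extrapolated affine, ATMVP, STMVP, ...).
    """
    merge_list, seen = [], set()
    for group in candidate_groups:
        for cand in group:
            if len(merge_list) >= max_size:
                return merge_list
            if cand not in seen:        # redundancy check
                seen.add(cand)
                merge_list.append(cand)
    return merge_list
```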

3.5.1非相鄰Merge候選3.5.1 Non-adjacent Merge Candidates

如果可用Merge候選的總數尚未達到最大允許Merge候選,則可以將非相鄰Merge候選添加到Merge候選列表。在現有實現中,可以將非相鄰Merge候選插入到Merge候選列表中TMVP Merge候選之後。添加非相鄰Merge候選的處理可以通過圖26所示的偽代碼來執行。 If the total number of available Merge candidates has not reached the maximum allowable Merge candidates, non-adjacent Merge candidates can be added to the Merge candidate list. In existing implementations, non-adjacent Merge candidates can be inserted after the TMVP Merge candidates in the Merge candidate list. The process of adding non-adjacent Merge candidates can be performed by the pseudo code shown in FIG. 26.

4 現有實現的示例4 Examples of existing implementations

在現有實現中,使用從非相鄰塊獲得運動資訊的非相鄰Merge候選可能導致次優的性能。 In existing implementations, using non-adjacent Merge candidates that obtain motion information from non-adjacent blocks may result in sub-optimal performance.

在一個示例中,根據位於CTU行上方的非相鄰塊的運動資訊的預測可能顯著增加行緩衝器尺寸。 In one example, prediction based on the motion information of non-adjacent blocks located above the CTU line may significantly increase the line buffer size.

在另一示例中，根據非相鄰塊的運動資訊的預測可以以將所有運動資訊(通常在4×4級上)儲存到快取記憶體中為成本(其顯著增加了硬體實現的複雜性)帶來額外的編碼增益。 In another example, prediction from the motion information of non-adjacent blocks may bring additional coding gain at the cost of storing all the motion information (usually at the 4×4 level) into a cache, which significantly increases the complexity of hardware implementations.

5 用於構建非相鄰Merge候選的方法的示例5 Examples of methods used to construct non-adjacent Merge candidates

當前公開的技術的實施例克服了現有實現方式的缺點,從而提供具有較低記憶體和複雜度要求以及較高編碼效率的視訊編碼。基於所公開的技術選擇非相鄰Merge候選可以增強現有和未來的視訊編碼標準,其在以下針對各種實現方式所描述的示例中闡明。以下提供的所公開技術的示例解釋了一般概念,並不意味著被解釋為限制。在示例中,除非明確地相反指示,否則可以組合這些示例中描述的各種特徵。 The embodiments of the currently disclosed technology overcome the shortcomings of existing implementations, thereby providing video coding with lower memory and complexity requirements and higher coding efficiency. The selection of non-adjacent Merge candidates based on the disclosed technology can enhance existing and future video coding standards, which are illustrated in the examples described below for various implementations. The examples of the disclosed technology provided below explain general concepts and are not meant to be interpreted as limitations. In the examples, unless explicitly indicated to the contrary, the various features described in these examples may be combined.

所公開的技術的實施例減少了非相鄰Merge候選所需的快取記憶體/行緩衝器尺寸，並提供了用於進一步改進非相鄰Merge候選的編碼性能的方法。 Embodiments of the disclosed technology reduce the cache memory/line buffer size required by non-adjacent Merge candidates, and provide methods for further improving the coding performance of non-adjacent Merge candidates.

對於下面討論的示例，假設當前塊的左上樣本座標為(Cx,Cy)，並且一個非相鄰塊中的左上樣本的座標為(NAx,NAy)，原點(0,0)是圖片/切片/片/LCU行/LCU的左上角的點。座標差(即，從當前塊的偏移)由(offsetX,offsetY)表示，其中offsetX=Cx-NAx並且offsetY=Cy-NAy。 For the examples discussed below, suppose the coordinates of the top-left sample of the current block are (Cx, Cy), the coordinates of the top-left sample of a non-adjacent block are (NAx, NAy), and the origin (0, 0) is the top-left point of the picture/slice/tile/LCU row/LCU. The coordinate difference (i.e., the offset from the current block) is denoted (offsetX, offsetY), where offsetX=Cx-NAx and offsetY=Cy-NAy.

示例1有利地至少提供了記憶體和緩衝器的減少。 Example 1 advantageously provides at least a reduction in memory and buffers.

示例1:在一個示例中,在構建Merge候選時,僅存取位於特定位置的非相鄰塊。 Example 1: In an example, when constructing a Merge candidate, only non-adjacent blocks located at a specific position are accessed.

(a)在一個示例中，NAx和NAy應該滿足NAx%M=0且NAy%N=0，其中M和N是兩個非零整數，諸如M=N=8或16。 (a) In one example, NAx and NAy should satisfy NAx%M=0 and NAy%N=0, where M and N are two non-zero integers, such as M=N=8 or 16.

(b)在一個示例中，如果一個非相鄰塊中的左上樣本不滿足給定條件，則跳過與該塊相關聯的運動資訊的檢查。因此，相關聯的運動資訊不能添加到Merge候選列表。 (b) In one example, if the top-left sample of a non-adjacent block does not satisfy the given condition, the check of the motion information associated with that block is skipped. Therefore, the associated motion information cannot be added to the Merge candidate list.

(c)可替代地，如果一個非相鄰塊中的左上樣本不滿足給定條件，則可以移位、截斷或取整該塊的位置以確保滿足條件。例如，(NAx,NAy)可以被修改為((NAx/M)* M,(NAy/N)* N)，其中“/”是整數除法。 (c) Alternatively, if the top-left sample of a non-adjacent block does not satisfy the given condition, the position of the block may be shifted, truncated, or rounded to ensure that the condition is satisfied. For example, (NAx, NAy) may be modified to ((NAx/M)*M, (NAy/N)*N), where "/" is integer division.
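The position constraint and rounding in items (a)–(c) above can be sketched as follows. This is an illustrative sketch only (not text from the patent), assuming M=N=8, one of the values suggested in item (a):

```python
# Sketch of Example 1(a)-(c): a non-adjacent block at (NAx, NAy) is accessed
# only if its top-left sample is aligned to an M x N grid; otherwise its
# position may be rounded down so that the condition holds.

M, N = 8, 8  # grid granularity; the text also suggests M = N = 16

def is_aligned(NAx: int, NAy: int) -> bool:
    """Condition from Example 1(a): NAx % M == 0 and NAy % N == 0."""
    return NAx % M == 0 and NAy % N == 0

def round_position(NAx: int, NAy: int) -> tuple:
    """Rounding from Example 1(c): (NAx, NAy) -> ((NAx/M)*M, (NAy/N)*N),
    with "/" as integer division."""
    return (NAx // M) * M, (NAy // N) * N

# Example: (13, 27) is not aligned, so it is rounded down to (8, 24).
```

With this rounding, every accessed position is grid-aligned, which bounds the set of positions whose motion information must be kept in the cache/line buffer.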

(d)可以預定義/用信令通知覆蓋所有非相鄰塊的受限區域尺寸。在這種情況下，當由給定偏移(OffsetX,OffsetY)計算的非相鄰塊在該區域之外時，它被標記為不可用或被視為幀內編碼模式。相應的運動資訊不應當添加到候選列表中作為候選。圖27中描繪了一個示例。 (d) The size of a restricted region covering all the non-adjacent blocks may be predefined/signaled. In this case, when a non-adjacent block computed from a given offset (OffsetX, OffsetY) is outside the region, it is marked as unavailable or treated as intra-coded mode, and the corresponding motion information should not be added to the candidate list as a candidate. An example is depicted in FIG. 27.

(i)在一個示例中,區域尺寸被定義為一個或多個CTB。 (i) In one example, the area size is defined as one or more CTBs.

(ii)在一個示例中,區域尺寸定義為W*H(例如,W=64且H=64)。可替代地,此外,具有座標(NAx,NAy)的所有非相鄰塊應當滿足以下條件中的至少一個:NAx>=((Cx/W)* W) (ii) In an example, the area size is defined as W*H (for example, W=64 and H=64). Alternatively, in addition, all non-adjacent blocks with coordinates (NAx, NAy) should satisfy at least one of the following conditions: NAx>=((Cx/W)*W)

NAx<=((Cx/W)* W)+W. NAx<=((Cx/W)* W)+W.

NAy>=((Cy/H)* H) NAy>=((Cy/H)* H)

NAy<=((Cy/H)* H)+H. NAy<=((Cy/H)* H)+H.

其中，上述條件中的“>=”和/或“<=”可以用“>”和/或“<”代替，運算“/”表示整數除法運算，其中除法結果的小數部分被丟棄。 Here, the ">=" and/or "<=" in the above conditions may be replaced by ">" and/or "<", and the operation "/" denotes an integer division operation in which the fractional part of the division result is discarded.
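The restricted-region test of item (d)(ii) can be sketched as below. This is an illustrative assumption, not the patent's own code; in particular, the four listed bounds are interpreted conjunctively here, i.e. a block is usable only when all of them hold, which corresponds to the block lying inside the W×H region containing the current block:

```python
# Sketch of Example 1(d)(ii): the restricted region is the W x H area
# containing the current block's top-left sample (Cx, Cy); "//" is integer
# division with the fractional part discarded, as in the text.

def in_restricted_region(NAx: int, NAy: int, Cx: int, Cy: int,
                         W: int = 64, H: int = 64) -> bool:
    """Return True if (NAx, NAy) lies inside the W x H region of (Cx, Cy)."""
    rx = (Cx // W) * W  # left edge of the region
    ry = (Cy // H) * H  # top edge of the region
    return rx <= NAx <= rx + W and ry <= NAy <= ry + H

# E.g. with W = H = 64 and (Cx, Cy) = (100, 100), the region is
# [64, 128] x [64, 128]: (70, 70) is inside, (0, 0) is outside.
```

A block failing this test would be marked unavailable (or treated as intra-coded) and its motion information skipped, as described in item (d).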

(iii)可替代地，覆蓋當前塊的LCU行上方的所有塊被標記為不可用或被視為幀內編碼模式。相應的運動資訊不應當添加到候選列表中作為候選。 (iii) Alternatively, all blocks above the LCU row covering the current block are marked as unavailable or treated as intra-coded mode, and the corresponding motion information should not be added to the candidate list as a candidate.

(iv)可替代地,假設覆蓋當前塊的LCU的左上樣本座標為(LX,LY)。(LX-NAx)、和/或abs(LX-NAx)、和/或(LY-NAy)、和/或abs(LY-NAy)應該在臨界值內。 (iv) Alternatively, suppose that the upper left sample coordinates of the LCU covering the current block are (LX, LY). (LX-NAx), and/or abs(LX-NAx), and/or (LY-NAy), and/or abs(LY-NAy) should be within the critical value.

(v)可以預定義一個或多個臨界值。它們可以進一步取決於CU高度的最小尺寸/寬度的最小尺寸/LCU尺寸等。例如,(LY-NAy)應當小於CU高度的最小尺寸,或(LY-NAy)應當小於CU高度的最小尺寸的兩倍。 (v) One or more critical values can be predefined. They may further depend on the minimum size of the CU height/the minimum size of the width/LCU size, etc. For example, (LY-NAy) should be smaller than the minimum size of the CU height, or (LY-NAy) should be smaller than twice the minimum size of the CU height.
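The LCU-relative constraint of items (iv)–(v) can be sketched as follows. This is a sketch under stated assumptions: the minimum CU height of 8 and the factor of 2 follow the example given in (v), but the actual critical values are left open by the text:

```python
# Sketch of Example 1(d)(iv)-(v): with the covering LCU's top-left sample at
# (LX, LY), a quantity such as (LY - NAy) must stay within a critical value,
# e.g. twice the minimum CU height.

MIN_CU_HEIGHT = 8  # assumed minimum CU height, for illustration only

def within_lcu_threshold(LY: int, NAy: int,
                         critical: int = 2 * MIN_CU_HEIGHT) -> bool:
    """Example 1(d)(v): (LY - NAy) should be smaller than the critical value."""
    return (LY - NAy) < critical

# E.g. with LY = 64, a non-adjacent block at NAy = 56 is allowed
# (64 - 56 = 8 < 16), while one at NAy = 40 is rejected (24 >= 16).
```

Bounding (LY - NAy) this way limits how far above the current LCU a non-adjacent position may lie, which is what keeps the line buffer small.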

(vi)可以在序列參數集(SPS)、圖片參數集(PPS)、視訊參數集(VPS)、切片報頭、片報頭等中用信令通知區域尺寸或臨界值。 (vi) The region size or the critical value(s) may be signaled in the sequence parameter set (SPS), picture parameter set (PPS), video parameter set (VPS), slice header, tile header, etc.

(vii)在一個示例中，用於並行編碼的當前切片/片/其他種類的單元之外的所有非相鄰塊被標記為不可用，並且對應的運動資訊不應當添加到候選列表作為候選。 (vii) In one example, all non-adjacent blocks outside the current slice/tile/other kind of unit used for parallel coding are marked as unavailable, and the corresponding motion information should not be added to the candidate list as a candidate.

示例2有利地至少提供了降低的計算複雜度。 Example 2 advantageously provides at least reduced computational complexity.

示例2:當插入新的非相鄰Merge候選時,可以對部分可用Merge候選應用修剪(pruning)。 Example 2: When a new non-adjacent Merge candidate is inserted, pruning can be applied to some of the available Merge candidates.

(a)在一個示例中,新的非相鄰Merge候選不與其他插入的非相鄰Merge候選進行修剪。 (a) In one example, the new non-adjacent Merge candidate is not pruned with other inserted non-adjacent Merge candidates.

(b)在一個示例中,新的非相鄰Merge候選不與諸如TMVP或ATMVP的時間Merge候選進行修剪。 (b) In one example, new non-adjacent Merge candidates are not pruned with temporal Merge candidates such as TMVP or ATMVP.

(c)在一個示例中,新的非相鄰Merge候選與來自某些特定相鄰塊的一些Merge候選進行修剪,但不與來自某些其它特定相鄰塊的一些其他Merge候選進行修剪。 (c) In one example, the new non-adjacent Merge candidates are pruned with some Merge candidates from some specific neighboring blocks, but not with some other Merge candidates from some other specific neighboring blocks.
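The selective pruning of Example 2 can be sketched as below. This is an illustrative assumption, not the patent's algorithm: the candidate representation and the equality test are simplified, and the particular policy shown (prune only against spatial candidates, skip temporal and other non-adjacent ones) is one combination of sub-items (a)–(c):

```python
# Sketch of Example 2: when a new non-adjacent Merge candidate is inserted,
# it is compared (pruned) only against a subset of the existing list.

from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    mv: tuple      # motion vector, simplified to a single MV here
    source: str    # "spatial", "temporal", or "non_adjacent"

def try_insert_non_adjacent(cand_list: list, new_cand: Candidate,
                            max_cands: int) -> bool:
    """Insert a non-adjacent candidate, pruning only against spatial ones."""
    if len(cand_list) >= max_cands:
        return False
    for c in cand_list:
        # Skip comparisons with temporal (TMVP/ATMVP) and other non-adjacent
        # candidates, per sub-items (a) and (b).
        if c.source == "spatial" and c.mv == new_cand.mv:
            return False  # duplicate of a spatial candidate: do not insert
    cand_list.append(new_cand)
    return True
```

Limiting the comparisons this way reduces the number of pruning checks per insertion, which is the complexity reduction the example is after.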

以上描述的示例可以並入下面描述的方法(例如,方法2800)的上下文中,該方法可以在視訊解碼器和/或視訊編碼器處實現。 The examples described above can be incorporated into the context of the methods described below (for example, method 2800), which can be implemented at a video decoder and/or a video encoder.

圖28繪示了用於視訊編碼的示例性方法的流程圖,其可在視訊編碼器中實施。方法2800包含,在步驟2810,接收視訊資料的當前塊。 FIG. 28 shows a flowchart of an exemplary method for video encoding, which can be implemented in a video encoder. Method 2800 includes, in step 2810, receiving the current block of video data.

方法2800包含,在步驟2820,基於規則選擇與當前塊不相鄰的第一非相鄰塊。 The method 2800 includes, in step 2820, selecting a first non-adjacent block that is not adjacent to the current block based on a rule.

在一些實施例中，並且如在示例1的上下文中所描述的，第一非相鄰塊是從多個非相鄰塊中選擇的，並且其中受限區域包含多個非相鄰塊中的每一個。在示例中，受限區域對應於當前塊旁邊的一個或多個編碼樹塊(CTB)。 In some embodiments, and as described in the context of Example 1, the first non-adjacent block is selected from multiple non-adjacent blocks, and the restricted region contains each of the multiple non-adjacent blocks. In an example, the restricted region corresponds to one or more coding tree blocks (CTBs) next to the current block.

在一些實施例中，圖片片段的左上樣本座標是(0,0)，其中第一非相鄰塊的左上樣本座標是(NAx,NAy)。在第一示例中，規則指定(NAx%M=0)且(NAy%N=0)，其中%表示取模函數，並且M和N是整數。在第二示例中，規則指定(NAx,NAy)可以被修改為((NAx/M)×M)和((NAy/N)×N)，其中/表示整數除法。在第三示例中，覆蓋當前塊的最大編碼單元(LCU)的左上樣本是(Lx,Ly)，並且(Lx-NAx)、abs(Lx-NAx)、(Ly-NAy)和abs(Ly-NAy)中的至少一個小於預定臨界值，其中abs( )表示絕對值函數。 In some embodiments, the top-left sample coordinates of the picture segment are (0, 0), and the top-left sample coordinates of the first non-adjacent block are (NAx, NAy). In a first example, the rule specifies (NAx % M = 0) and (NAy % N = 0), where % denotes the modulo function and M and N are integers. In a second example, the rule specifies that (NAx, NAy) may be modified to ((NAx/M)×M) and ((NAy/N)×N), where / denotes integer division. In a third example, the top-left sample of the largest coding unit (LCU) covering the current block is (Lx, Ly), and at least one of (Lx-NAx), abs(Lx-NAx), (Ly-NAy), and abs(Ly-NAy) is smaller than a predetermined critical value, where abs( ) denotes the absolute value function.

方法2800包含,在步驟2830,構建包含第一Merge候選的Merge候選列表,第一Merge候選包含基於第一非相鄰塊的運動資訊。 The method 2800 includes, in step 2830, constructing a Merge candidate list including a first Merge candidate, the first Merge candidate including motion information based on the first non-adjacent block.

在一些實施例中,方法2800另包含將第一Merge候選插入到Merge候選列表中,並且如在示例3的上下文中所描述的,其可以與特定類型的Merge候選進行修剪或不進行修剪。在第一示例中,第一Merge候選不與從Merge候選列表中的其他非相鄰塊構建的其他Merge候選進行修剪。在第二示例中,第一Merge候選不與Merge候選列表中的時間Merge候選進行修剪。在第三示例中,第一Merge候選與來自第一組相鄰塊的其他候選進行修剪,並且第一Merge候選不與來自與第一組相鄰塊不同的第二組相鄰塊的候選進行修剪。 In some embodiments, the method 2800 further includes inserting the first Merge candidate into the Merge candidate list, and as described in the context of Example 3, it may be pruned or not pruned with a specific type of Merge candidate. In the first example, the first Merge candidate is not pruned with other Merge candidates constructed from other non-adjacent blocks in the Merge candidate list. In the second example, the first Merge candidate is not pruned with the temporal Merge candidate in the Merge candidate list. In the third example, the first Merge candidate is pruned with other candidates from the first set of neighboring blocks, and the first Merge candidate is not pruned with candidates from the second set of neighboring blocks that are different from the first set of neighboring blocks. prune.

在一些實施例中,並且如在示例4的上下文中所描述的,第一非相鄰塊可以限於或可以不限於某些類型的編碼。例如,第一非相鄰塊是高級運動向量預測(AMVP)編碼的非相鄰塊。例如,未利用空間Merge候選對第一非相鄰塊進行編碼。例如,利用Merge模式和運動細化過程對第一非相鄰塊進行編碼。例如,未利用解碼器側運動向量細化(DMVR)對第一非相鄰塊進行編碼。 In some embodiments, and as described in the context of Example 4, the first non-adjacent block may or may not be limited to certain types of encoding. For example, the first non-adjacent block is a non-adjacent block encoded by Advanced Motion Vector Prediction (AMVP). For example, the spatial Merge candidate is not used to encode the first non-adjacent block. For example, the Merge mode and the motion refinement process are used to encode the first non-adjacent block. For example, the decoder side motion vector refinement (DMVR) is not used to encode the first non-adjacent block.

在一些實施例中，方法2800另包含選擇第二非相鄰塊。在示例中，第一非相鄰塊和第二非相鄰塊是當前塊的空間鄰居，第一非相鄰塊利用第一模式進行編碼，第二非相鄰塊利用第二模式進行編碼，並且在選擇第二非相鄰塊之前選擇第一非相鄰塊。 In some embodiments, the method 2800 further includes selecting a second non-adjacent block. In an example, the first non-adjacent block and the second non-adjacent block are spatial neighbors of the current block, the first non-adjacent block is coded using a first mode, the second non-adjacent block is coded using a second mode, and the first non-adjacent block is selected before the second non-adjacent block is selected.

在一些實施例中,非相鄰塊的運動資訊用作高級運動向量預測(AMVP)模式中的預測器。 In some embodiments, the motion information of non-neighboring blocks is used as a predictor in Advanced Motion Vector Prediction (AMVP) mode.

方法2800包含，在步驟2840，基於Merge候選列表處理當前塊。在一些實現中，通過修剪Merge候選列表並使用修剪之後的Merge候選列表來對當前塊進行解碼。 The method 2800 includes, at step 2840, processing the current block based on the Merge candidate list. In some implementations, the current block is decoded by pruning the Merge candidate list and using the pruned Merge candidate list.

6 所公開的技術的示例實現6 Example implementation of the disclosed technology

圖29是視訊處理裝置2900的方塊圖。裝置2900可用於實施本文所述的一種或多種方法。裝置2900可以在智能手機、平板電腦、電腦、物聯網(IoT)接收器等中實施。裝置2900可包含一個或多個處理器2902、一個或多個記憶體2904和視訊處理硬體2906。處理器2902可以配置為實現本文中描述的一個或多個方法(包含但不限於方法2800)。一個或多個記憶體2904可用於儲存用於實現本文所述方法和技術的資料和代碼。視訊處理硬體2906可用於在硬體電路中實現本文中描述的一些技術。 FIG. 29 is a block diagram of the video processing device 2900. The device 2900 can be used to implement one or more of the methods described herein. The device 2900 may be implemented in a smart phone, a tablet computer, a computer, an Internet of Things (IoT) receiver, and the like. The device 2900 may include one or more processors 2902, one or more memories 2904, and video processing hardware 2906. The processor 2902 may be configured to implement one or more methods described herein (including but not limited to the method 2800). One or more memories 2904 can be used to store data and codes used to implement the methods and techniques described herein. Video processing hardware 2906 can be used to implement some of the technologies described in this article in hardware circuits.

在一些實施例中,視訊解碼器裝置可實現使用如本文中所描述的零單元的方法,以用於視訊解碼。該方法的各種特徵可以類似於上述方法2800。 In some embodiments, the video decoder device may implement a method using zero units as described herein for video decoding. The various features of the method can be similar to the method 2800 described above.

在一些實施例中,視訊解碼方法可以使用在硬體平台上實現的解碼裝置來實現,如關於圖29所描述的。 In some embodiments, the video decoding method may be implemented using a decoding device implemented on a hardware platform, as described with respect to FIG. 29.

下面使用基於條款的描述格式來描述上述方法/技術的附加特徵和實施例。 The following uses a clause-based description format to describe additional features and embodiments of the above methods/techniques.

1.一種視訊處理方法,包含:接收視訊資料的當前塊;基於規則選擇與所述當前塊不相鄰的第一非相鄰塊;構建包含第一Merge候選的Merge候選列表,所述第一Merge候選包含基於所述第一非相鄰塊的運動資訊;以及基於所述Merge候選列表處理所述當前塊。 1. A video processing method, comprising: receiving a current block of video data; selecting a first non-adjacent block that is not adjacent to the current block based on a rule; constructing a Merge candidate list including a first Merge candidate, the first Merge candidates include motion information based on the first non-neighboring block; and processing the current block based on the Merge candidate list.

2.根據條款1所述的方法,其中,通過修剪所述Merge候選列表並使用所述修剪後的Merge候選列表來處理所述當前塊。 2. The method according to clause 1, wherein the current block is processed by pruning the Merge candidate list and using the pruned Merge candidate list.

3.根據條款1所述的方法，其中圖片片段的左上樣本座標是(0,0)，其中所述第一非相鄰塊的左上樣本座標是(NAx,NAy)，並且所述規則指定(NAx%M=0)且(NAy%N=0)，其中%是取模函數，並且其中M和N是整數。 3. The method according to clause 1, wherein the top-left sample coordinates of a picture segment are (0, 0), wherein the top-left sample coordinates of the first non-adjacent block are (NAx, NAy), and the rule specifies (NAx % M = 0) and (NAy % N = 0), where % is the modulo function, and where M and N are integers.

4.根據條款1所述的方法，另包含確定所述第一非相鄰塊的左上樣本是否滿足所述規則，所述第一Merge候選包含與滿足所述規則的所述第一非相鄰塊相關聯的運動資訊。 4. The method according to clause 1, further comprising determining whether the top-left sample of the first non-adjacent block satisfies the rule, the first Merge candidate comprising motion information associated with the first non-adjacent block satisfying the rule.

5.根據條款3所述的方法，其中，所述規則指定將左上樣本座標(NAx,NAy)修改為((NAx/M)×M)和((NAy/N)×N)，其中/是整數除法，其中M和N是整數。 5. The method according to clause 3, wherein the rule specifies modifying the top-left sample coordinates (NAx, NAy) to ((NAx/M)×M) and ((NAy/N)×N), where / is integer division, and where M and N are integers.

6.根據條款1所述的方法，其中，所述規則指定從多個非相鄰塊中選擇所述第一非相鄰塊，並且其中受限區域包含所述多個非相鄰塊中的每一個。 6. The method according to clause 1, wherein the rule specifies that the first non-adjacent block is selected from a plurality of non-adjacent blocks, and wherein a restricted area includes each of the plurality of non-adjacent blocks.

7.根據條款1所述的方法,其中,受限區域的尺寸是預定義的或用信令通知的。 7. The method according to clause 1, wherein the size of the restricted area is predefined or signaled.

8.根據條款5所述的方法,其中,所述受限區域對應於與所述當前塊相鄰的一個或多個編碼樹塊(CTB)。 8. The method of clause 5, wherein the restricted area corresponds to one or more coding tree blocks (CTB) adjacent to the current block.

9.根據條款5所述的方法，其中，所述受限區域包含W樣本乘H樣本的矩形區域，或者其中具有座標(NAx,NAy)的非相鄰塊滿足以下條件之一：NAx>=((Cx/W)* W)，或NAx<=((Cx/W)* W)+W，或NAy>=((Cy/H)* H)，或NAy<=((Cy/H)* H)+H。 9. The method according to clause 5, wherein the restricted area comprises a rectangular area of W samples by H samples, or wherein a non-adjacent block therein with coordinates (NAx, NAy) satisfies one of the following conditions: NAx>=((Cx/W)*W), or NAx<=((Cx/W)*W)+W, or NAy>=((Cy/H)*H), or NAy<=((Cy/H)*H)+H.

10.根據條款5所述的方法，其中，所述受限區域包含W樣本乘H樣本的矩形區域，或者其中具有座標(NAx,NAy)的非相鄰塊滿足以下條件之一：NAx>((Cx/W)* W)，或NAx<((Cx/W)* W)+W，或NAy>((Cy/H)* H)，或NAy<((Cy/H)* H)+H。 10. The method according to clause 5, wherein the restricted area comprises a rectangular area of W samples by H samples, or wherein a non-adjacent block therein with coordinates (NAx, NAy) satisfies one of the following conditions: NAx>((Cx/W)*W), or NAx<((Cx/W)*W)+W, or NAy>((Cy/H)*H), or NAy<((Cy/H)*H)+H.

11.根據條款9或10所述的方法,其中操作“/”指示整數除法運算,其中丟棄結果的小數部分。 11. The method according to clause 9 or 10, wherein the operation "/" indicates an integer division operation in which the fractional part of the result is discarded.

12.根據條款6所述的方法,其中覆蓋所述當前塊的最大編碼單元(LCU)的左上樣本是(Lx,Ly),並且其中(Lx-NAx)、abs(Lx-NAx)、(Ly-NAy)和abs(Ly-NAy)中的至少一個小於一個或多個臨界值。 12. The method according to clause 6, wherein the upper left sample of the largest coding unit (LCU) covering the current block is (Lx, Ly), and wherein (Lx-NAx), abs(Lx-NAx), (Ly -At least one of NAy) and abs (Ly-NAy) is less than one or more critical values.

13.根據條款12所述的方法,其中,所述一個或多個臨界值基於編碼單元(CU)或LCU的高度的最小尺寸,或者CU或LCU的寬度的最小尺寸。 13. The method of clause 12, wherein the one or more critical values are based on a minimum size of a height of a coding unit (CU) or LCU, or a minimum size of a width of a CU or LCU.

14.根據條款12所述的方法，其中，在視訊參數集(VPS)、序列參數集(SPS)、圖片參數集(PPS)、切片報頭、片報頭、編碼樹單元(CTU)或CU中用信令通知所述受限區域的尺寸或所述一個或多個臨界值。 14. The method according to clause 12, wherein the size of the restricted area or the one or more critical values is signaled in a video parameter set (VPS), a sequence parameter set (SPS), a picture parameter set (PPS), a slice header, a tile header, a coding tree unit (CTU), or a CU.

15.根據條款1所述的方法,另包含:將所述第一Merge候選插入Merge候選列表。 15. The method according to clause 1, further comprising: inserting the first Merge candidate into a Merge candidate list.

16.根據條款15所述的方法，其中，第一Merge候選不與Merge候選列表中從其他非相鄰塊構建的其他Merge候選進行修剪。 16. The method according to clause 15, wherein the first Merge candidate is not pruned with other Merge candidates constructed from other non-adjacent blocks in the Merge candidate list.

17.根據條款15所述的方法，其中，第一Merge候選不與Merge候選列表中的時間Merge候選進行修剪，並且其中所述時間Merge候選包含時間運動向量預測(TMVP)或可選時間運動向量預測(ATMVP)。 17. The method according to clause 15, wherein the first Merge candidate is not pruned with a temporal Merge candidate in the Merge candidate list, and wherein the temporal Merge candidate comprises temporal motion vector prediction (TMVP) or alternative temporal motion vector prediction (ATMVP).

18.根據條款15所述的方法,其中,第一Merge候選與來自於第一組相鄰塊的候選進行修剪,並且其中所述第一Merge候選不與來自與所述第一組相鄰塊不同的第二組相鄰塊的候選進行修剪。 18. The method according to clause 15, wherein the first Merge candidate is pruned with candidates from the first set of neighboring blocks, and wherein the first Merge candidate is not the same as those from the first set of neighboring blocks. The different candidates of the second set of neighboring blocks are pruned.

19.一種裝置,包含處理器和其上具有指令的非暫時性記憶體,其中所述指令在由所述處理器執行時使所述處理器實現條款1至18中任一項所述的方法。 19. A device comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions when executed by the processor cause the processor to implement the method of any one of clauses 1 to 18 .

20.一種儲存在非暫時性電腦可讀取媒介上的電腦程式產品,所述電腦程式產品包含用於執行條款1至18中的任一項所述的方法的程式代碼。 20. A computer program product stored on a non-transitory computer readable medium, the computer program product containing program code for executing the method described in any one of clauses 1 to 18.

從前述內容可以理解,本文已經出於說明的目的描述了本公開技術的具體實施例,但是可以在不脫離本發明範圍的情況下進行各種修改。因此,本公開的技術除了所附請求項外不受限制。 It can be understood from the foregoing that the specific embodiments of the disclosed technology have been described herein for illustrative purposes, but various modifications can be made without departing from the scope of the present invention. Therefore, the technology of the present disclosure is not limited except for the appended claims.

本專利文件中描述的主題和功能操作的實現可以在各種系統、數位電子電路、或電腦軟體、韌體或硬體中實現,包含本說明書中公開的結構及其結構等同物,或者以它們中的一個或多個的組合實現。本說明書中描述的主題的實現可以實現為一個或多個電腦程式產品,即,在有形和非瞬時電腦可讀取媒介上編碼的一個或多個電腦程式指令模塊,用於由資料處理裝置執行或控制資料處理裝置的操作。電腦可讀取媒介可以是機器可讀取儲存設備、機器可讀取儲存基板、記憶體設備、影響機器可讀取傳播信號的物質組合、或者它們中的一個或多個的組合。術語“資料處理單元”或“資料處理裝置”涵蓋用於處理資料的所有裝置、設備和機器,包含例如可編程處理器、電腦或多個處理器或電腦。除了硬體之外,該裝置還可以包含為所討論的電腦程式創建執行環境的代碼,例如,構成處理器韌體、協議疊、資料庫管理系統、作業系統、或者它們中的一個或多個的組合的代碼。 The subject and functional operations described in this patent document can be implemented in various systems, digital electronic circuits, or computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in them A combination of one or more of them. The realization of the subject described in this manual can be realized as one or more computer program products, that is, one or more computer program instruction modules encoded on a tangible and non-transitory computer readable medium for execution by a data processing device Or control the operation of the data processing device. The computer-readable medium may be a machine-readable storage device, a machine-readable storage substrate, a memory device, a combination of substances that affect a machine-readable propagation signal, or a combination of one or more of them. The term "data processing unit" or "data processing device" covers all devices, equipment, and machines used to process data, including, for example, programmable processors, computers, or multiple processors or computers. In addition to hardware, the device may also contain code that creates an execution environment for the computer program in question, for example, constituting processor firmware, protocol stack, database management system, operating system, or one or more of them The code of the combination.

電腦程式(也稱為程式、軟體、軟體應用、脚本或代碼)可以用任何形式的編程語言編寫,包含編譯或解釋語言,並且可以以任何形式來部署電腦程式,包含獨立程式或適合在計算環境中使用的模塊、組件、子程式或其他單元。電腦程式並不必需對應於檔案系統中的檔案。程式可以儲存在檔案的保存其他程式或資料(例如,儲存在標記語言文件中的一個或多個脚本)的部分中,儲存在專用於所討論的程式的單個檔案中,或儲存在多個協調檔案中(例如,儲存一個或多個模塊、子程式或代碼部分的檔案)。可以部署電腦程式以在 一個電腦上或在位於一個站點上或分布在多個站點上並通過通訊網絡互連的多個電腦上執行。 Computer programs (also called programs, software, software applications, scripts or codes) can be written in any form of programming language, including compiled or interpreted languages, and computer programs can be deployed in any form, including stand-alone programs or suitable for computing environments Modules, components, subprograms or other units used in The computer program does not necessarily correspond to the files in the file system. Programs can be stored in the section of the file that holds other programs or data (for example, one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinations In files (for example, files that store one or more modules, subprograms, or code parts). Computer programs can be deployed to It is executed on one computer or on multiple computers located on one site or distributed on multiple sites and interconnected by a communication network.

本說明書中描述的過程和邏輯流程可以由執行一個或多個電腦程式的一個或多個可編程處理器執行,以通過對輸入資料進行操作並產生輸出來執行功能。過程和邏輯流程也可以由專用邏輯電路執行,並且裝置也可以實現為專用邏輯電路,例如FPGA(現場可編程門陣列)或ASIC(專用積體電路)。 The processes and logic flows described in this specification can be executed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The process and logic flow can also be executed by a dedicated logic circuit, and the device can also be implemented as a dedicated logic circuit, such as FPGA (Field Programmable Gate Array) or ASIC (Dedicated Integrated Circuit).

舉例來說,適合於執行電腦程式的處理器包含通用和專用微處理器、以及任何種類的數位電腦的任何一個或多個處理器。通常,處理器將從只讀記憶體或隨機存取記憶體或兩者接收指令和資料。電腦的基本元件是用於執行指令的處理器和用於儲存指令和資料的一個或多個記憶體設備。通常,電腦還將包含或可操作地耦合到用於儲存資料的一個或多個大容量儲存設備,例如磁碟、磁光碟或光碟,以從該一個或多個大容量儲存設備接收資料,或將資料傳輸到該一個或多個大容量儲存設備,或者既接收又傳遞資料。然而,電腦不需要具有這樣的設備。適用於儲存電腦程式指令和資料的電腦可讀取媒介包含所有形式的非揮發性記憶體、媒介和記憶體設備,包含例如半導體記憶體設備,例如EPROM、EEPROM和快閃記憶體設備。處理器和記憶體可以由專用邏輯電路補充或並入專用邏輯電路中。 For example, processors suitable for executing computer programs include general-purpose and special-purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, the processor will receive commands and data from read-only memory or random access memory or both. The basic components of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, the computer will also include or be operatively coupled to one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical discs, to receive data from the one or more mass storage devices, or Transmit data to the one or more mass storage devices, or both receive and transmit data. However, the computer does not need to have such equipment. Computer readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices such as EPROM, EEPROM, and flash memory devices. The processor and memory can be supplemented by or incorporated into a dedicated logic circuit.

旨在將說明書與圖式一起僅視為示例性的,其中示例性意味著示例。如這裏所使用的,除非上下文另有明確說明,單數形式“一”、“一個”和“該”旨在也包含複數形式。另外,除非上下文另有明確說明,否則“或”的使用旨在包含“和/或”。 It is intended to treat the description together with the drawings as merely exemplary, where exemplary means an example. As used herein, unless the context clearly dictates otherwise, the singular forms "a", "an" and "the" are intended to also encompass the plural forms. In addition, the use of "or" is intended to include "and/or" unless the context clearly dictates otherwise.

雖然本專利文件包含許多細節,但這些細節不應被解釋為對任何發明或可要求保護的範圍的限制,而是作為特定於特定發明的特定實施例的特徵的描述。在本專利文件中,在單獨的實施例的上下文中描述的某些特徵也可以 在單個實施例中組合實現。相反,在單個實施例的上下文中描述的各種特徵也可以單獨地或以任何合適的子組合在多個實施例中實現。此外,儘管上面的特徵可以描述為以某些組合起作用並且甚至最初如此要求權利保護,但是在某些情況下,可以從所要求保護的組合中去除來自該組合的一個或多個特徵,並且所要求保護的組合可以指向子組合或子組合的變型。 Although this patent document contains many details, these details should not be construed as limitations on the scope of any invention or claimable, but as a description of the features specific to a particular embodiment of a particular invention. In this patent document, certain features described in the context of separate embodiments may also be Implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments individually or in any suitable subcombination. Furthermore, although the above features may be described as working in certain combinations and even as claimed initially, in some cases, one or more features from the combination may be removed from the claimed combination, and The claimed combination may refer to a sub-combination or a variant of the sub-combination.

類似地,雖然在圖式中以特定順序描繪了操作,但是這不應該被理解為要求以所示的特定順序或按順序執行這樣的操作,或者執行所有繪示的操作,以實現期望的結果。此外,在本專利文件中描述的實施例中的各種系統組件的分離不應被理解為在所有實施例中都要求這種分離。 Similarly, although operations are depicted in a specific order in the drawings, this should not be understood as requiring that such operations be performed in the specific order shown or in order, or that all operations depicted are performed to achieve the desired result . In addition, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.

僅描述了幾個實現方式和示例,並且可以基於本專利文件中描述和繪示的內容來做出其他實現方式、增強和變型。 Only a few implementations and examples are described, and other implementations, enhancements and modifications can be made based on the content described and illustrated in this patent document.

以上所述僅為本發明之較佳實施例,凡依本發明申請專利範圍所做之均等變化與修飾,皆應屬本發明之涵蓋範圍。 The foregoing descriptions are only preferred embodiments of the present invention, and all equivalent changes and modifications made in accordance with the scope of the patent application of the present invention should fall within the scope of the present invention.

2800:方法 2800: method

2810至2840:步驟 2810 to 2840: steps

Claims (19)

1. 一種視訊處理方法，包含：接收一視訊資料的一當前塊；根據一規則選擇與該當前塊不相鄰的一第一非相鄰塊；構建包含一第一Merge候選的一Merge候選列表，該第一Merge候選包含基於該第一非相鄰塊的運動資訊；以及根據該Merge候選列表，處理該當前塊；其中覆蓋當前塊的圖片片段的左上樣本座標是(0,0)，其中該第一非相鄰塊的左上樣本座標是(NAx,NAy)，並且該規則指定(NAx%M=0)且(NAy%N=0)，其中%是取模函數，並且其中M和N是整數。 A video processing method, comprising: receiving a current block of video data; selecting, according to a rule, a first non-adjacent block that is not adjacent to the current block; constructing a Merge candidate list including a first Merge candidate, the first Merge candidate including motion information based on the first non-adjacent block; and processing the current block according to the Merge candidate list; wherein the top-left sample coordinates of the picture segment covering the current block are (0, 0), wherein the top-left sample coordinates of the first non-adjacent block are (NAx, NAy), and the rule specifies (NAx % M = 0) and (NAy % N = 0), where % is the modulo function, and where M and N are integers.

2. 如請求項1所述的方法，其中該當前塊是藉由修剪該Merge候選列表並使用該修剪後的Merge候選列表而被處理。 The method according to claim 1, wherein the current block is processed by pruning the Merge candidate list and using the pruned Merge candidate list.

3. 如請求項1所述的方法，另包含確定該第一非相鄰塊的左上樣本是否滿足該規則，該第一Merge候選包含與滿足該規則的該第一非相鄰塊相關聯的運動資訊。 The method according to claim 1, further comprising determining whether the top-left sample of the first non-adjacent block satisfies the rule, the first Merge candidate including motion information associated with the first non-adjacent block satisfying the rule.

4. 如請求項1所述的方法，另包含確定該第一非相鄰塊的左上樣本是否滿足該規則，如果不滿足該規則，則將左上樣本座標(NAx,NAy)修改為((NAx/M)×M)和((NAy/N)×N)，其中/是整數除法，其中M和N是整數。 The method according to claim 1, further comprising determining whether the top-left sample of the first non-adjacent block satisfies the rule, and if the rule is not satisfied, modifying the top-left sample coordinates (NAx, NAy) to ((NAx/M)×M) and ((NAy/N)×N), where / is integer division, and where M and N are integers.
5. 如請求項1所述的方法，其中該規則指定從多個非相鄰塊中選擇該第一非相鄰塊，並且其中一受限區域包含該多個非相鄰塊中的每一個。 The method according to claim 1, wherein the rule specifies selecting the first non-adjacent block from a plurality of non-adjacent blocks, and wherein a restricted region contains each of the plurality of non-adjacent blocks.

6. 如請求項5所述的方法，其中一受限區域的尺寸是預定義的或用信令通知的。 The method according to claim 5, wherein the size of the restricted region is predefined or signaled.

7. 如請求項5所述的方法，其中該受限區域對應於與該當前塊相鄰的一個或多個編碼樹塊(CTB)。 The method according to claim 5, wherein the restricted region corresponds to one or more coding tree blocks (CTBs) adjacent to the current block.

8. 如請求項5所述的方法，其中該受限區域包含W樣本乘H樣本的矩形區域，或者其中具有座標(NAx,NAy)的非相鄰塊滿足以下條件之一：NAx>=((Cx/W)* W)，或NAx<=((Cx/W)* W)+W，或NAy>=((Cy/H)* H)，或NAy<=((Cy/H)* H)+H。 The method according to claim 5, wherein the restricted region comprises a rectangular region of W samples by H samples, or wherein a non-adjacent block therein with coordinates (NAx, NAy) satisfies one of the following conditions: NAx>=((Cx/W)*W), or NAx<=((Cx/W)*W)+W, or NAy>=((Cy/H)*H), or NAy<=((Cy/H)*H)+H.

9. 如請求項5所述的方法，其中該受限區域包含W樣本乘H樣本的矩形區域，或者其中具有座標(NAx,NAy)的非相鄰塊滿足以下條件之一：NAx>((Cx/W)* W)，或NAx<((Cx/W)* W)+W，或NAy>((Cy/H)* H)，或NAy<((Cy/H)* H)+H。 The method according to claim 5, wherein the restricted region comprises a rectangular region of W samples by H samples, or wherein a non-adjacent block therein with coordinates (NAx, NAy) satisfies one of the following conditions: NAx>((Cx/W)*W), or NAx<((Cx/W)*W)+W, or NAy>((Cy/H)*H), or NAy<((Cy/H)*H)+H.

10. 如請求項8或9所述的方法，其中操作“/”指示整數除法運算，其中丟棄結果的小數部分。 The method according to claim 8 or 9, wherein the operation "/" denotes an integer division operation in which the fractional part of the result is discarded.
11. The method of claim 5, wherein the top-left sample of the largest coding unit (LCU) covering the current block is (Lx, Ly), and wherein at least one of (Lx-NAx), abs(Lx-NAx), (Ly-NAy), and abs(Ly-NAy) is smaller than one or more thresholds. 12. The method of claim 11, wherein the one or more thresholds are based on a minimum size of the height of a coding unit (CU) or of the largest coding unit, or a minimum size of the width of a coding unit or of the largest coding unit. 13. The method of claim 11, wherein the size of the restricted region or the one or more thresholds are signaled in a video parameter set (VPS), a sequence parameter set (SPS), a picture parameter set (PPS), a slice header, a tile header, a coding tree unit (CTU), or a coding unit. 14. The method of claim 1, further comprising: inserting the first Merge candidate into the Merge candidate list. 15. The method of claim 14, wherein the first Merge candidate is not pruned against other Merge candidates in the Merge candidate list that are constructed from other non-adjacent blocks. 16. The method of claim 14, wherein the first Merge candidate is not pruned against a temporal Merge candidate in the Merge candidate list, and wherein the temporal Merge candidate comprises a temporal motion vector prediction (TMVP) or an alternative temporal motion vector prediction (ATMVP).
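The LCU-distance constraint of claim 11 can be sketched as below. The claim admits either the signed differences or their absolute values; this sketch uses the absolute-value variant, and the threshold of two 128-sample LCU widths is an assumed example, not a value from the patent.

```python
def within_lcu_distance(lx, ly, nax, nay, thresh_x, thresh_y):
    """Claim 11: the candidate's top-left sample (NAx, NAy) must be
    close enough to the top-left sample (Lx, Ly) of the LCU covering
    the current block, per-axis, against the given thresholds."""
    return abs(lx - nax) < thresh_x and abs(ly - nay) < thresh_y

# Assumed thresholds of 256 samples (two 128-sample LCUs) on each axis:
print(within_lcu_distance(256, 128, 200, 100, 256, 256))  # True:  |56| and |28| are both < 256
print(within_lcu_distance(256, 128, 800, 100, 256, 256))  # False: |256 - 800| = 544 >= 256
```

Per claim 12, the thresholds would in practice be derived from the minimum CU or LCU width/height rather than hard-coded, and per claim 13 they may be signaled in the bitstream (VPS/SPS/PPS, slice or tile header, CTU, or CU level).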
17. The method of claim 15, wherein the first Merge candidate is pruned against candidates from a first set of adjacent blocks, and wherein the first Merge candidate is not pruned against candidates from a second set of adjacent blocks that is different from the first set. 18. An apparatus for video processing, comprising a processor and a non-transitory memory having instructions stored thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of any one of claims 1 to 18. 19. A computer program product stored on a non-transitory computer-readable medium, the computer program product comprising program code for carrying out the method of any one of claims 1 to 17.
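The selective pruning of claims 15 to 17 can be sketched as follows: a new non-adjacent candidate is compared only against a designated subset of the existing list, not the whole list, which caps the number of comparisons. The tuple representation of a candidate and the split into "spatial" and "temporal" groups are assumptions for illustration.

```python
def insert_candidate(merge_list, cand, prune_against):
    """Claims 15-17: prune (compare) the new candidate only against
    the given subset of existing candidates. Duplicates of candidates
    outside that subset are tolerated to limit comparison cost."""
    for existing in prune_against:
        if existing == cand:  # stands in for a full motion-information comparison
            return merge_list  # duplicate within the pruned subset: skip insertion
    merge_list.append(cand)
    return merge_list

# Toy candidates as (mv_x, mv_y, ref_idx) tuples:
spatial = [(1, 0, 0), (0, 2, 1)]   # first group: new candidates ARE pruned against these
temporal = [(3, 3, 0)]             # second group: NOT compared (claim 16: e.g. TMVP/ATMVP)
lst = spatial + temporal
lst = insert_candidate(lst, (1, 0, 0), prune_against=spatial)  # duplicate in first group: skipped
lst = insert_candidate(lst, (3, 3, 0), prune_against=spatial)  # matches only the temporal group: inserted
print(len(lst))  # 4
```

Skipping comparisons against the temporal candidates and against other non-adjacent candidates trades a slightly more redundant list for far fewer motion-vector comparisons per inserted candidate.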
TW108123172A 2018-07-01 2019-07-01 Complexity reduction of non-adjacent merge design TWI719524B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN2018093944 2018-07-01
WOPCT/CN2018/093944 2018-07-01
WOPCT/CN2018/104982 2018-09-11
CN2018104982 2018-09-11

Publications (2)

Publication Number Publication Date
TW202007146A TW202007146A (en) 2020-02-01
TWI719524B true TWI719524B (en) 2021-02-21

Family

ID=67226319

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108123172A TWI719524B (en) 2018-07-01 2019-07-01 Complexity reduction of non-adjacent merge design

Country Status (3)

Country Link
CN (1) CN110677650B (en)
TW (1) TWI719524B (en)
WO (1) WO2020008322A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021249363A1 (en) * 2020-06-08 2021-12-16 Beijing Bytedance Network Technology Co., Ltd. Constraints on intra block copy using non-adjacent neighboring blocks

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101952103B1 (en) * 2010-10-08 2019-02-27 지이 비디오 컴프레션, 엘엘씨 Picture coding supporting block partitioning and block merging
US9648334B2 (en) * 2011-03-21 2017-05-09 Qualcomm Incorporated Bi-predictive merge mode based on uni-predictive neighbors in video coding
MX341355B (en) * 2011-08-29 2016-08-16 Ibex Pt Holdings Co Ltd Apparatus for decoding merge mode motion information.
EP3358848B1 (en) * 2015-09-29 2021-04-21 LG Electronics Inc. Method of filtering image in image coding system
US10602180B2 (en) * 2017-06-13 2020-03-24 Qualcomm Incorporated Motion vector prediction

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
H. Yang et al., Description of Core Experiment 4 (CE4): Inter prediction and motion vector coding, JVET-J1024, Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 10th Meeting: San Diego, US, 10–20 Apr. 2018
Y. Chen et al., Description of SDR, HDR and 360° video coding technology proposal by Qualcomm and Technicolor – low and high complexity versions, JVET-J0021, Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 10th Meeting: San Diego, US, 10–20 Apr. 2018

Also Published As

Publication number Publication date
WO2020008322A1 (en) 2020-01-09
CN110677650A (en) 2020-01-10
CN110677650B (en) 2023-06-09
TW202007146A (en) 2020-02-01

Similar Documents

Publication Publication Date Title
TWI723444B (en) Concept of using one or multiple look up tables to store motion information of previously coded in order and use them to code following blocks
TWI744662B (en) Conditions for updating luts
TWI728390B (en) Look up table size
TWI731365B (en) Merge index coding
TWI724442B (en) Selection of coded motion information for lut updating
TW202015415A (en) Extended Merge prediction
TWI728389B (en) Priority-based non-adjacent merge design
TWI719524B (en) Complexity reduction of non-adjacent merge design