TWI820169B - Extension of look-up table based motion vector prediction with temporal information - Google Patents


Info

Publication number
TWI820169B
TWI820169B (application TW108124973A)
Authority
TW
Taiwan
Prior art keywords
motion
candidates
video
block
prediction
Prior art date
Application number
TW108124973A
Other languages
Chinese (zh)
Other versions
TW202032991A (en)
Inventor
張莉
張凱
劉鴻彬
王悅
Original Assignee
大陸商北京字節跳動網絡技術有限公司
美商字節跳動有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 大陸商北京字節跳動網絡技術有限公司 and 美商字節跳動有限公司
Publication of TW202032991A
Application granted
Publication of TWI820169B

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/423 — characterised by implementation details or hardware specially adapted for video compression or decompression, characterised by memory arrangements
    • H04N19/176 — using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/186 — using adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/52 — using predictive coding involving temporal prediction, processing of motion vectors by predictive encoding
    • H04N19/60 — using transform coding
    • H04N19/625 — using transform coding using discrete cosine transform [DCT]
    • H04N19/96 — Tree coding, e.g. quad-tree coding

Abstract

The application provides a method for video processing. This method includes: determining a new candidate for video processing by averaging motion vectors of two or more selected motion candidates; adding the new candidate to a candidate list; and performing a conversion between a first video block of a video and a bitstream representation of the video by using the determined new candidate in the candidate list.

Description

Extending look-up table-based motion vector prediction with temporal information

This patent document relates to video coding technologies, devices, and systems.

Cross-reference to related applications: In accordance with applicable patent law and/or the provisions of the Paris Convention, this application timely claims the priority and benefit of International Patent Application No. PCT/CN2018/095716, filed on July 14, 2018, and International Patent Application No. PCT/CN2018/095719, filed on July 15, 2018. The entire disclosures of International Patent Applications No. PCT/CN2018/095716 and No. PCT/CN2018/095719 are incorporated herein by reference as part of the disclosure of this application.

Despite advances in video compression, digital video still accounts for the largest share of bandwidth used on the Internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video grows, the bandwidth demand for digital video usage is expected to continue to increase.

This document discloses methods, systems, and devices for encoding and decoding digital video using Merge lists of motion vectors.

In one example aspect, a video processing method is disclosed. The method includes: determining a new candidate for video processing by averaging motion vectors of two or more selected motion candidates; adding the new candidate to a candidate list; and performing a conversion between a first video block of a video and a bitstream representation of the video by using the determined new candidate in the candidate list.
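The averaging step above can be sketched in a few lines. This is an illustrative sketch only: the MotionCandidate structure, the truncating integer average, and the assumption that all selected candidates point to the same reference picture are not taken from the patent text (a real codec would scale motion vectors to a common reference first).

```python
# Hypothetical sketch of deriving a new "average" candidate from two or
# more selected motion candidates and adding it to a candidate list.

class MotionCandidate:
    def __init__(self, mv_x, mv_y, ref_idx):
        self.mv_x = mv_x        # horizontal motion vector component
        self.mv_y = mv_y        # vertical motion vector component
        self.ref_idx = ref_idx  # reference picture index

def average_candidate(cands):
    """Average the motion vectors of two or more selected candidates.

    Assumes all candidates share the same reference picture; the
    truncating average is an illustrative rounding choice.
    """
    n = len(cands)
    avg_x = sum(c.mv_x for c in cands) // n
    avg_y = sum(c.mv_y for c in cands) // n
    return MotionCandidate(avg_x, avg_y, cands[0].ref_idx)

def add_to_candidate_list(cand_list, new_cand, max_size):
    # Append the derived candidate only if the list is not yet full.
    if len(cand_list) < max_size:
        cand_list.append(new_cand)
    return cand_list
```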

In one example aspect, a video processing method is disclosed. The method includes: determining a new motion candidate for video processing by using one or more motion candidates from one or more tables, wherein each table includes one or more motion candidates and each motion candidate is associated with corresponding motion information; and performing a conversion between a video block and a coded representation of the video block based on the new candidate.

In one example aspect, a video processing method is disclosed. The method includes: determining a new candidate for video processing by always using motion information from more than one spatially neighboring block of a first video block in a current picture, without using motion information from a temporal block in a picture different from the current picture; and performing a conversion between the first video block in the current picture of the video and a bitstream representation of the video by using the determined new candidate.

In one example aspect, a video processing method is disclosed. The method includes: determining a new candidate for video processing by using motion information from at least one spatially non-adjacent block of a first video block in a current picture, together with other candidates that are derived either from spatially non-adjacent blocks of the first video block or not from spatially non-adjacent blocks of the first video block; and performing a conversion between the first video block of the video and a bitstream representation of the video by using the determined new candidate.

In one example aspect, a video processing method is disclosed. The method includes: determining a new candidate for video processing by using motion information from one or more tables for a first video block in a current picture and motion information from a temporal block in a picture different from the current picture; and performing a conversion between the first video block in the current picture of the video and a bitstream representation of the video by using the determined new candidate.

In one example aspect, a video processing method is disclosed. The method includes: determining a new candidate for video processing by using motion information from one or more tables of a first video block and motion information from one or more spatially neighboring blocks of the first video block; and performing a conversion between the first video block in the current picture of the video and a bitstream representation of the video by using the determined new candidate.

In one example aspect, a video processing method is disclosed. The method includes: maintaining a set of tables, wherein each table includes motion candidates and each motion candidate is associated with corresponding motion information; performing a conversion between a first video block and a bitstream representation of a video that includes the first video block; and updating one or more tables by selectively pruning existing motion candidates in the one or more tables based on the coding/decoding mode of the first video block.

In one example aspect, a video processing method is disclosed. The method includes: maintaining a set of tables, wherein each table includes motion candidates and each motion candidate is associated with corresponding motion information; performing a conversion between a first video block and a bitstream representation of a video that includes the first video block; and updating one or more tables to include motion information from one or more temporally neighboring blocks of the first video block as new motion candidates.

In one example aspect, a method of updating a motion candidate table is disclosed. The method includes: selectively pruning existing motion candidates in a table based on the coding/decoding mode of a video block being processed, each motion candidate being associated with corresponding motion information; and updating the table to include the motion information of the video block as a new motion candidate.
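The prune-then-update behaviour described in these table-maintenance aspects can be sketched as follows. This is a hypothetical illustration: the equality-based redundancy check and the FIFO eviction when the table is full are assumptions for the sketch, not claims of the patent.

```python
# Hypothetical sketch of updating a motion-candidate table: existing
# candidates may first be pruned (redundant entries removed), then the
# current block's motion information is appended as a new candidate.

def update_motion_table(table, new_motion, max_size, apply_pruning):
    """Update a motion-candidate table with a coded block's motion info.

    apply_pruning models the mode-dependent decision to prune; the
    identical-entry check and FIFO eviction are illustrative choices.
    """
    if apply_pruning:
        # Remove any existing candidate identical to the new one, so the
        # table keeps only distinct entries.
        table = [c for c in table if c != new_motion]
    table.append(new_motion)
    if len(table) > max_size:
        table.pop(0)  # evict the oldest candidate (FIFO assumption)
    return table
```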

In one example aspect, a method of updating a motion candidate table is disclosed. The method includes: maintaining a motion candidate table in which each motion candidate is associated with corresponding motion information; and updating the table to include motion information from one or more temporally neighboring blocks of the video block being processed as new motion candidates.

In one example aspect, a video processing method is disclosed. The method includes: determining a new motion candidate for video processing by using one or more motion candidates from one or more tables, wherein each table includes one or more motion candidates and each motion candidate is associated with motion information; and performing a conversion between a video block and a coded representation of the video block based on the new candidate.

In one example aspect, an apparatus in a video system is disclosed. The apparatus includes a processor and a non-transitory memory with instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the various methods described herein.

The various techniques described herein may be embodied as a computer program product stored on non-transitory computer-readable media. The computer program product includes program code for carrying out the methods described herein.

The details of one or more implementations are set forth in the attachments, the drawings, and the description below. Other features will be apparent from the description and drawings, and from the claims.

To improve the compression ratio of video, researchers are continually looking for new techniques by which to encode video.

1.1. Introduction

This patent document is related to video coding technologies. Specifically, it is related to the coding of motion information (e.g., Merge mode and AMVP mode) in video coding. It may be applied to the existing video coding standard HEVC, or to the standard to be finalized, Versatile Video Coding (VVC). It may also be applicable to future video coding standards or video codecs.

Brief discussion

Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. The ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/HEVC standards. Since H.262, video coding standards have been based on a hybrid video coding structure in which temporal prediction plus transform coding is utilized. An example of a typical HEVC encoder framework is shown in FIG. 1.

2.1 Partition structure

2.1.1 Partition tree structure in H.264/AVC

The core of the coding layer in prior standards was the macroblock, containing a 16×16 block of luma samples and, in the usual case of 4:2:0 color sampling, two corresponding 8×8 blocks of chroma samples.

An intra-coded block uses spatial prediction to exploit the spatial correlation among pixels. Two partitions are defined: 16×16 and 4×4.

An inter-coded block uses temporal prediction, instead of spatial prediction, by estimating motion among pictures. Motion can be estimated independently for a 16×16 macroblock or for any of its sub-macroblock partitions: 16×8, 8×16, 8×8, 8×4, 4×8, 4×4 (see FIG. 2). Only one motion vector (MV) is allowed per sub-macroblock partition.

2.1.2 Partition tree structure in HEVC

In HEVC, a CTU is split into CUs using a quadtree structure (denoted as a coding tree) to adapt to various local characteristics. The decision whether to code a picture area using inter (temporal) or intra (spatial) prediction is made at the CU level. Each CU can be further split into one, two, or four PUs according to the PU splitting type. Inside one PU, the same prediction process is applied and the relevant information is transmitted to the decoder on a PU basis. After obtaining the residual block by applying the prediction process based on the PU splitting type, a CU can be partitioned into transform units (TUs) according to another quadtree structure similar to the coding tree for the CU. One key feature of the HEVC structure is that it has multiple partition concepts, including CU, PU, and TU.

In the following, the various features involved in hybrid video coding using HEVC are highlighted.

1) Coding tree units and coding tree block (CTB) structure: the analogous structure in HEVC is the coding tree unit (CTU), which has a size selected by the encoder and can be larger than a traditional macroblock. A CTU consists of a luma CTB and the corresponding chroma CTBs, together with syntax elements. The size L×L of a luma CTB can be chosen as L = 16, 32, or 64 samples, with the larger sizes typically enabling better compression. HEVC then supports partitioning of the CTBs into smaller blocks using a tree structure and quadtree-like signaling.

2) Coding units (CUs) and coding blocks (CBs): the quadtree syntax of the CTU specifies the size and positions of its luma and chroma CBs. The root of the quadtree is associated with the CTU. Hence, the size of the luma CTB is the largest supported size for a luma CB. The splitting of a CTU into luma and chroma CBs is signaled jointly. One luma CB and ordinarily two chroma CBs, together with associated syntax, form a coding unit (CU). A CTB may contain only one CU or may be split to form multiple CUs, and each CU has an associated partitioning into prediction units (PUs) and a tree of transform units (TUs).

3) Prediction units (PUs) and prediction blocks (PBs): the decision whether to code a picture area using inter or intra prediction is made at the CU level. A PU partitioning structure has its root at the CU level. Depending on the basic prediction-type decision, the luma and chroma CBs can then be further split in size and predicted from luma and chroma prediction blocks (PBs). HEVC supports variable PB sizes from 64×64 down to 4×4 samples. FIG. 3 shows examples of allowed PBs for an M×M CU.

4) Transform units (TUs) and transform blocks (TBs): the prediction residual is coded using block transforms. A TU tree structure has its root at the CU level. The luma CB residual may be identical to the luma TB or may be further split into smaller luma TBs. The same applies to the chroma TBs. Integer basis functions similar to those of the discrete cosine transform (DCT) are defined for square TB sizes of 4×4, 8×8, 16×16, and 32×32. For the 4×4 transform of luma intra-prediction residuals, an integer transform derived from a form of the discrete sine transform (DST) may alternatively be specified.

FIG. 4 shows an example of the subdivision of a CTB into CBs (and transform blocks (TBs)). Solid lines indicate CB boundaries and dotted lines indicate TB boundaries. (a) CTB with its partitioning. (b) Corresponding quadtree.

2.1.2.1 Tree-structured partitioning into transform blocks and units

For residual coding, a CB can be recursively partitioned into transform blocks (TBs). The partitioning is signaled by a residual quadtree. Only square CB and TB partitioning is specified, where a block can be recursively split into quadrants, as illustrated in FIG. 4. For a given luma CB of size M×M, a flag signals whether it is split into four blocks of size M/2×M/2. If further splitting is possible, as signaled by the maximum depth of the residual quadtree indicated in the sequence parameter set (SPS), each quadrant is assigned a flag that indicates whether it is split into four quadrants. The leaf node blocks resulting from the residual quadtree are the transform blocks that are further processed by transform coding. The encoder indicates the maximum and minimum luma TB sizes that it will use. Splitting is implicit when the CB size is larger than the maximum TB size. Not splitting is implicit when splitting would result in a luma TB size smaller than the indicated minimum. The chroma TB size is half the luma TB size in each dimension, except when the luma TB size is 4×4, in which case a single 4×4 chroma TB is used for the region covered by four 4×4 luma TBs. In the case of intra-predicted CUs, the decoded samples of the nearest-neighboring TBs (within or outside the CB) are used as reference data for intra prediction.
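The implicit split rules and the chroma TB rule in the paragraph above can be summarized compactly. This is a simplified sketch: function names are illustrative, and the residual-quadtree depth limit and the actual flag signalling are omitted.

```python
# Hypothetical sketch of the residual-quadtree split decision: splitting
# is implied when the CB exceeds the maximum TB size, forbidden when it
# would undershoot the minimum TB size, and otherwise follows the flag.

def split_decision(cb_size, max_tb_size, min_tb_size, split_flag):
    """Return True if a square block of cb_size is split into quadrants."""
    if cb_size > max_tb_size:
        return True    # split implied: CB larger than the maximum TB size
    if cb_size // 2 < min_tb_size:
        return False   # no split: result would be below the minimum TB size
    return split_flag  # otherwise follow the signalled flag

def chroma_tb_size(luma_tb_size):
    """Chroma TB is half the luma TB size per dimension, except that the
    region covered by four 4x4 luma TBs uses a single 4x4 chroma TB."""
    return max(luma_tb_size // 2, 4)
```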

In contrast to previous standards, the HEVC design allows a TB to span multiple PBs for inter-predicted CUs, to maximize the potential coding efficiency benefits of the quadtree-structured TB partitioning.

2.1.2.2 Parent and child nodes

A CTB is divided according to a quadtree structure, the nodes of which are coding units. The plurality of nodes in a quadtree structure includes leaf nodes and non-leaf nodes. The leaf nodes have no child nodes in the tree structure (i.e., the leaf nodes are not further split). The non-leaf nodes include a root node of the tree structure. The root node corresponds to an initial video block of the video data (e.g., a CTB). For each respective non-root node of the plurality of nodes, the respective non-root node corresponds to a video block that is a sub-block of the video block corresponding to the parent node, in the tree structure, of the respective non-root node. Each respective non-leaf node of the plurality of non-leaf nodes has one or more child nodes in the tree structure.
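The node relationships described above can be sketched as follows (the Node class is purely illustrative; it simply encodes that a non-leaf node has four children, each covering a quadrant of its parent's block, while a leaf node has no children):

```python
# Hypothetical sketch of quadtree parent/child relationships: the root
# corresponds to the initial video block (e.g., a CTB); each child block
# is a quadrant (sub-block) of its parent's block.

class Node:
    def __init__(self, size, parent=None):
        self.size = size        # side length of this node's square block
        self.parent = parent    # parent node (None for the root)
        self.children = []      # empty for a leaf node

    def split(self):
        # Splitting turns this node into a non-leaf with four children,
        # each covering one quadrant (half the side length).
        self.children = [Node(self.size // 2, parent=self)
                         for _ in range(4)]
        return self.children
```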

2.1.3 Quadtree plus binary tree block structure with larger CTUs in the Joint Exploration Model (JEM)

To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded jointly by VCEG and MPEG in 2015. Since then, many new methods have been adopted by JVET and put into the reference software named the Joint Exploration Model (JEM).

2.1.3.1 QTBT block partitioning structure

Unlike HEVC, the QTBT structure removes the concept of multiple partition types; that is, it removes the separation of the CU, PU, and TU concepts, and supports more flexibility for CU partition shapes. In the QTBT block structure, a CU can be either square or rectangular. As shown in FIG. 5, a coding tree unit (CTU) is first partitioned by a quadtree structure. The quadtree leaf nodes are further partitioned by a binary tree structure. There are two split types in the binary tree partitioning: symmetric horizontal splitting and symmetric vertical splitting. The binary tree leaf nodes are called coding units (CUs), and that segmentation is used for prediction and transform processing without any further partitioning. This means the CU, PU, and TU have the same block size in the QTBT coding block structure. In the JEM, a CU sometimes consists of coding blocks (CBs) of different color components; for example, one CU contains one luma CB and two chroma CBs in the case of P and B slices of the 4:2:0 chroma format, and a CU sometimes consists of a CB of a single component; for example, one CU contains only one luma CB or only two chroma CBs in the case of I slices.

The following parameters are defined for the QTBT partitioning scheme.

– CTU size: the root node size of a quadtree, the same concept as in HEVC.

– MinQTSize: the minimum allowed quadtree leaf node size

– MaxBTSize: the maximum allowed binary tree root node size

– MaxBTDepth: the maximum allowed binary tree depth

– MinBTSize: the minimum allowed binary tree leaf node size

In one example of the QTBT partitioning structure, the CTU size is set to 128×128 luma samples with two corresponding 64×64 blocks of chroma samples, MinQTSize is set to 16×16, MaxBTSize is set to 64×64, MinBTSize (for both width and height) is set to 4×4, and MaxBTDepth is set to 4. The quadtree partitioning is applied to the CTU first to generate quadtree leaf nodes. The quadtree leaf nodes may have a size from 16×16 (i.e., MinQTSize) to 128×128 (i.e., the CTU size). If a leaf quadtree node is 128×128, it will not be further split by the binary tree, since its size exceeds MaxBTSize (i.e., 64×64). Otherwise, the leaf quadtree node can be further partitioned by the binary tree. Therefore, the quadtree leaf node is also the root node for the binary tree, and its binary tree depth is 0. When the binary tree depth reaches MaxBTDepth (i.e., 4), no further splitting is considered. When a binary tree node has a width equal to MinBTSize (i.e., 4), no further horizontal splitting is considered. Similarly, when a binary tree node has a height equal to MinBTSize, no further vertical splitting is considered. The leaf nodes of the binary tree are further processed by prediction and transform processing without any further partitioning. In the JEM, the maximum CTU size is 256×256 luma samples.
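Under the example parameters above, the splitting constraints can be sketched as a few predicates. This is a simplified check, not the normative decision process; the function names are illustrative, and the width/height conditions follow the wording of the paragraph above.

```python
# Hypothetical sketch of the QTBT constraints using the example
# parameters: CTU 128, MinQTSize 16, MaxBTSize 64, MaxBTDepth 4,
# MinBTSize 4.

MIN_QT_SIZE = 16
MAX_BT_SIZE = 64
MAX_BT_DEPTH = 4
MIN_BT_SIZE = 4

def can_qt_split(size):
    # A quadtree node may be split further only while above MinQTSize.
    return size > MIN_QT_SIZE

def can_start_bt(size):
    # A quadtree leaf can become a binary-tree root only if it does not
    # exceed MaxBTSize (e.g., a 128x128 leaf is too large).
    return size <= MAX_BT_SIZE

def can_bt_split(width, height, bt_depth):
    """Return (horizontal_allowed, vertical_allowed) for a BT node."""
    if bt_depth >= MAX_BT_DEPTH:
        return False, False   # depth limit reached: no further splitting
    horizontal = width > MIN_BT_SIZE   # width at MinBTSize blocks horizontal
    vertical = height > MIN_BT_SIZE    # height at MinBTSize blocks vertical
    return horizontal, vertical
```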

FIG. 5 (left) illustrates an example of block partitioning using QTBT, and FIG. 5 (right) illustrates the corresponding tree representation. The solid lines indicate quadtree splitting and the dotted lines indicate binary tree splitting. In each splitting (i.e., non-leaf) node of the binary tree, one flag is signaled to indicate which splitting type (i.e., horizontal or vertical) is used, where 0 indicates horizontal splitting and 1 indicates vertical splitting. For the quadtree splitting, there is no need to indicate the splitting type, since quadtree splitting always splits a block both horizontally and vertically to produce four sub-blocks of equal size.

In addition, the QTBT scheme supports the ability for the luma and chroma to have separate QTBT structures. Currently, for P and B slices, the luma and chroma CTBs in one CTU share the same QTBT structure. However, for I slices, the luma CTB is partitioned into CUs by a QTBT structure, and the chroma CTB is partitioned into chroma CUs by another QTBT structure. This means that a CU in an I slice consists of a coding block of the luma component or coding blocks of two chroma components, while a CU in a P or B slice consists of coding blocks of all three color components.

In HEVC, to reduce the memory access of motion compensation, inter prediction for small blocks is restricted, such that 4×8 and 8×4 blocks do not support bi-prediction and 4×4 blocks do not support inter prediction. In the QTBT of the JEM, these restrictions are removed.

2.1.4 Ternary trees in Versatile Video Coding (VVC)

As proposed in JVET-D0117, tree types other than quadtree and binary tree are also supported. In this implementation, two additional ternary tree (TT) partitions are introduced, i.e., horizontal and vertical center-side ternary trees, as shown in Figures 6(d) and (e).

Figure 6 shows: (a) quadtree partitioning, (b) vertical binary tree partitioning, (c) horizontal binary tree partitioning, (d) vertical center-side ternary tree partitioning, and (e) horizontal center-side ternary tree partitioning.

In some implementations, there are two levels of trees: the region tree (quadtree) and the prediction tree (binary or ternary tree). A CTU is first partitioned by a region tree (RT). An RT leaf may be further split with a prediction tree (PT). A PT leaf may also be further split with PT until the maximum PT depth is reached. A PT leaf is the basic coding unit; for convenience, it is still called a CU, and a CU cannot be further split. Prediction and transform are both applied to a CU in the same way as in the JEM. The whole partition structure is named the "multi-type tree".

2.1.5 Partition structure in JVET-J0021

The tree structure called the multi-tree type (MTT) is a generalization of QTBT. In QTBT, as shown in Figure 5, a coding tree unit (CTU) is first partitioned by a quadtree structure. The quadtree leaf nodes are then further partitioned by a binary tree structure.

The basic structure of MTT consists of two types of tree nodes: region tree (RT) and prediction tree (PT), supporting nine types of partitions, as shown in Figure 7.

Figure 7 illustrates: (a) quadtree partitioning, (b) vertical binary tree partitioning, (c) horizontal binary tree partitioning, (d) vertical ternary tree partitioning, (e) horizontal ternary tree partitioning, (f) horizontal-up asymmetric binary tree partitioning, (g) horizontal-down asymmetric binary tree partitioning, (h) vertical-left asymmetric binary tree partitioning, and (i) vertical-right asymmetric binary tree partitioning.

A region tree can recursively split a CTU into square blocks down to region tree leaf nodes of 4×4 size. At each node of a region tree, a prediction tree can be formed from one of three tree types: binary tree (BT), ternary tree (TT), and asymmetric binary tree (ABT). In the PT split, quadtree partitioning is prohibited in the branches of the prediction tree. As in the JEM, the luma tree and the chroma tree are separated in I slices. The signaling methods for RT and PT are shown in Figure 8.

2.2 Inter prediction in HEVC/H.265

Each inter-predicted PU has motion parameters for one or two reference picture lists. Motion parameters include a motion vector and a reference picture index. The usage of one of the two reference picture lists may also be signaled using inter_pred_idc. Motion vectors may be explicitly coded as deltas relative to predictors; such a coding mode is called advanced motion vector prediction (AMVP) mode.

When a CU is coded with skip mode, one PU is associated with the CU, and there are no significant residual coefficients, no coded motion vector delta, and no reference picture index. A Merge mode is specified whereby the motion parameters for the current PU are obtained from neighboring PUs, including spatial and temporal candidates. The Merge mode can be applied to any inter-predicted PU, not only to skip mode. The alternative to the Merge mode is the explicit transmission of motion parameters, where the motion vector, the corresponding reference picture index for each reference picture list, and the reference picture list usage are signaled explicitly for each PU.

When signaling indicates that one of the two reference picture lists is to be used, the PU is produced from one block of samples. This is referred to as "uni-prediction". Uni-prediction is available for both P slices and B slices.

When signaling indicates that both of the reference picture lists are to be used, the PU is produced from two blocks of samples. This is referred to as "bi-prediction". Bi-prediction is available only for B slices.

The following text provides details on the inter prediction modes specified in HEVC. The description starts with the Merge mode.

2.2.1 Merge mode

2.2.1.1 Derivation of candidates for Merge mode

When a PU is predicted using the Merge mode, an index pointing to an entry in the Merge candidate list is parsed from the bitstream and used to retrieve the motion information. The construction of this list is specified in the HEVC standard and can be summarized according to the following sequence of steps:

Step 1: Initial candidate derivation

Step 1.1: Spatial candidate derivation

Step 1.2: Redundancy check for spatial candidates

Step 1.3: Temporal candidate derivation

Step 2: Additional candidate insertion

Step 2.1: Creation of bi-predictive candidates

Step 2.2: Insertion of zero motion candidates

These steps are also schematically depicted in Figure 9. For spatial Merge candidate derivation, a maximum of four Merge candidates are selected among candidates located in five different positions. For temporal Merge candidate derivation, a maximum of one Merge candidate is selected among two candidates. Since a constant number of candidates per PU is assumed at the decoder, additional candidates are generated when the number of candidates does not reach the maximum number of Merge candidates (MaxNumMergeCand) signaled in the slice header. Since the number of candidates is constant, the index of the best Merge candidate is encoded using truncated unary binarization (TU). If the size of a CU is equal to 8, all the PUs of the current CU share a single Merge candidate list, which is identical to the Merge candidate list of the 2N×2N prediction unit.
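The truncated unary binarization used for the Merge index can be sketched as follows. This is an illustrative helper, assuming the maximum codable value cMax equals MaxNumMergeCand − 1 as in HEVC; the function name is chosen here.

```python
def truncated_unary(value, c_max):
    """Truncated unary binarization: `value` is coded as that many 1-bins
    followed by a terminating 0-bin, except that the terminating 0 is
    omitted when value equals c_max (the maximum codable value)."""
    assert 0 <= value <= c_max
    bins = "1" * value
    if value < c_max:
        bins += "0"
    return bins
```

With five Merge candidates (cMax = 4), index 0 costs a single bin while indices 3 and 4 both cost four bins, which is why frequently chosen candidates are placed early in the list.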

The operations associated with the aforementioned steps are described in detail below.

2.2.1.2 Spatial candidate derivation

In the derivation of spatial Merge candidates, a maximum of four Merge candidates are selected among candidates located in the positions depicted in Figure 10. The order of derivation is A1, B1, B0, A0 and B2. Position B2 is considered only when any PU of positions A1, B1, B0, A0 is not available (e.g., because it belongs to another slice or tile) or is intra coded. After the candidate at position A1 is added, the addition of the remaining candidates is subject to a redundancy check, which ensures that candidates with the same motion information are excluded from the list, so that coding efficiency is improved. To reduce computational complexity, not all possible candidate pairs are considered in the mentioned redundancy check. Instead, only the pairs linked with an arrow in Figure 11 are considered, and a candidate is only added to the list if the corresponding candidate used for the redundancy check does not have the same motion information. Another source of duplicate motion information is the "second PU" associated with partitions different from 2N×2N. As an example, Figure 12 depicts the second PU for the N×2N and 2N×N cases, respectively. When the current PU is partitioned as N×2N, the candidate at position A1 is not considered for list construction. In some embodiments, adding this candidate could lead to two prediction units having the same motion information, which is redundant to having just one PU in a coding unit. Similarly, position B1 is not considered when the current PU is partitioned as 2N×N.

2.2.1.3 Temporal candidate derivation

In this step, only one candidate is added to the list. In particular, in the derivation of this temporal Merge candidate, a scaled motion vector is derived based on the co-located PU belonging to the picture that has the smallest picture order count (POC) difference with the current picture within the given reference picture list. The reference picture list to be used for the derivation of the co-located PU is explicitly signaled in the slice header. The scaled motion vector for the temporal Merge candidate is obtained as illustrated by the dashed line in Figure 13; it is scaled from the motion vector of the co-located PU using the POC distances tb and td, where tb is defined as the POC difference between the reference picture of the current picture and the current picture, and td is defined as the POC difference between the reference picture of the co-located picture and the co-located picture. The reference picture index of the temporal Merge candidate is set equal to zero. A practical realization of the scaling process is described in the HEVC specification. For a B slice, two motion vectors (one for reference picture list 0 and the other for reference picture list 1) are obtained and combined to form the bi-predictive Merge candidate. Figure 13 illustrates the motion vector scaling for the temporal Merge candidate.

In the co-located PU (Y) belonging to the reference frame, the position for the temporal candidate is selected between candidates C0 and C1, as depicted in Figure 14. If the PU at position C0 is not available, is intra coded, or is outside of the current CTU, position C1 is used. Otherwise, position C0 is used in the derivation of the temporal Merge candidate.
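The tb/td scaling described above can be sketched as follows. This is an illustrative approximation of the fixed-point scaling equations in the HEVC specification (16-bit clipping, rounding offsets as in the spec); it assumes td is a positive POC distance, and the function names are chosen here.

```python
def clip3(lo, hi, x):
    """Clamp x to the inclusive range [lo, hi], as in the HEVC spec."""
    return max(lo, min(hi, x))

def scale_mv(mv, tb, td):
    """Scale one motion-vector component by the POC-distance ratio tb/td,
    following the fixed-point scheme of HEVC (illustrative sketch;
    assumes td > 0)."""
    tx = (16384 + (abs(td) >> 1)) // td
    dist_scale = clip3(-4096, 4095, (tb * tx + 32) >> 6)
    prod = dist_scale * mv
    sign = 1 if prod >= 0 else -1
    # Round the magnitude and clip to the 16-bit motion-vector range.
    return clip3(-32768, 32767, sign * ((abs(prod) + 127) >> 8))
```

When tb equals td the vector passes through unchanged, and halving tb halves the vector, matching the linear-scaling intuition behind the dashed line in Figure 13.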

2.2.1.4 Additional candidate insertion

Besides spatio-temporal Merge candidates, there are two additional types of Merge candidates: combined bi-predictive Merge candidates and zero Merge candidates. Combined bi-predictive Merge candidates are generated by utilizing the spatio-temporal Merge candidates. Combined bi-predictive Merge candidates are used for B slices only. A combined bi-predictive candidate is generated by combining the first-reference-picture-list motion parameters of an initial candidate with the second-reference-picture-list motion parameters of another candidate. If these two tuples provide different motion hypotheses, they form a new bi-predictive candidate. As an example, Figure 15 depicts the case in which two candidates in the original list (on the left), which have mvL0 and refIdxL0 or mvL1 and refIdxL1, are used to create a combined bi-predictive Merge candidate added to the final list (on the right). There are numerous rules regarding the combinations that are considered to generate these additional Merge candidates.

Zero motion candidates are inserted to fill the remaining entries in the Merge candidate list and therefore reach the MaxNumMergeCand capacity. These candidates have zero spatial displacement and a reference picture index that starts from zero and increases every time a new zero motion candidate is added to the list. The number of reference frames used by these candidates is one and two for uni-directional and bi-directional prediction, respectively. Finally, no redundancy check is performed on these candidates.
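The zero-candidate fill described above can be sketched as follows. This is an illustrative helper; the candidate representation and the parameter name `num_ref_idx` (number of reference pictures available in the list) are assumptions made here.

```python
def fill_zero_candidates(merge_list, max_num_merge_cand, num_ref_idx):
    """Append zero-MV candidates until the list reaches MaxNumMergeCand.
    The reference index starts at 0 and increases with each new candidate,
    saturating at the last valid index. No redundancy check is applied."""
    ref_idx = 0
    while len(merge_list) < max_num_merge_cand:
        merge_list.append({"mv": (0, 0), "ref_idx": ref_idx})
        if ref_idx < num_ref_idx - 1:
            ref_idx += 1
    return merge_list
```

Saturating the reference index keeps every appended candidate valid even when more slots remain than there are reference pictures.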

2.2.1.5 Motion estimation regions for parallel processing

To speed up the encoding process, motion estimation can be performed in parallel, whereby the motion vectors for all the prediction units inside a given region are derived simultaneously. The derivation of Merge candidates from spatial neighborhoods may interfere with parallel processing, since one prediction unit cannot derive the motion parameters from an adjacent PU until its associated motion estimation is completed. To mitigate the trade-off between coding efficiency and processing latency, HEVC defines a motion estimation region (MER), whose size is signaled in the picture parameter set using the syntax element "log2_parallel_merge_level_minus2", as described below. When a MER is defined, Merge candidates falling into the same region are marked as unavailable and are therefore not considered in the list construction.

Picture parameter set raw byte sequence payload (RBSP) syntax

General picture parameter set RBSP syntax

pic_parameter_set_rbsp( ) {                                      Descriptor
    pps_pic_parameter_set_id                                     ue(v)
    pps_seq_parameter_set_id                                     ue(v)
    dependent_slice_segments_enabled_flag                        u(1)
    ...
    pps_scaling_list_data_present_flag                           u(1)
    if( pps_scaling_list_data_present_flag )
        scaling_list_data( )
    lists_modification_present_flag                              u(1)
    log2_parallel_merge_level_minus2                             ue(v)
    slice_segment_header_extension_present_flag                  u(1)
    pps_extension_present_flag                                   u(1)
    ...
    rbsp_trailing_bits( )
}

log2_parallel_merge_level_minus2 plus 2 specifies the value of the variable Log2ParMrgLevel, which is used in the derivation process for luma motion vectors for the Merge mode as specified in clause 8.5.3.2.2, and in the derivation process for spatial Merge candidates as specified in clause 8.5.3.2.3. The value of log2_parallel_merge_level_minus2 shall be in the range of 0 to CtbLog2SizeY − 2, inclusive.

The variable Log2ParMrgLevel is derived as follows:

Log2ParMrgLevel = log2_parallel_merge_level_minus2 + 2   (7-37)

NOTE 3 – The value of Log2ParMrgLevel indicates the built-in capability of parallel derivation of the Merge candidate lists. For example, when Log2ParMrgLevel is equal to 6, the Merge candidate lists for all the prediction units (PUs) and coding units (CUs) contained in a 64×64 block can be derived in parallel.
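A minimal sketch of the MER availability rule implied by Log2ParMrgLevel follows. This is an illustration of the idea, not the normative condition from the standard; positions are assumed to be top-left luma coordinates of the current PU and of the candidate.

```python
def same_mer(x_cur, y_cur, x_cand, y_cand, log2_par_mrg_level):
    """Two positions fall into the same motion estimation region (MER)
    when their coordinates match after dropping log2_par_mrg_level
    low-order bits. A spatial candidate lying in the same MER as the
    current PU is treated as unavailable for Merge list construction."""
    return (x_cur >> log2_par_mrg_level) == (x_cand >> log2_par_mrg_level) and \
           (y_cur >> log2_par_mrg_level) == (y_cand >> log2_par_mrg_level)
```

With Log2ParMrgLevel equal to 6, any two positions inside the same 64×64 region compare equal, which is what allows all PUs in that region to build their lists in parallel.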

2.2.2 Motion vector prediction in AMVP mode

Motion vector prediction exploits the spatio-temporal correlation of a motion vector with neighboring PUs, which is used for the explicit transmission of motion parameters. A motion vector candidate list is constructed by first checking the availability of the left and above spatially neighboring PU positions and the temporally neighboring PU positions, removing redundant candidates, and adding zero vectors to make the candidate list a constant length. The encoder can then select the best predictor from the candidate list and transmit the corresponding index indicating the chosen candidate. Similarly to Merge index signaling, the index of the best motion vector candidate is encoded using truncated unary. The maximum value to be encoded in this case is 2 (see, e.g., Figures 2 to 8). In the following sections, the derivation process of the motion vector prediction candidates is described in detail.

2.2.2.1 Derivation of motion vector prediction candidates

Figure 16 summarizes the derivation process for the motion vector prediction candidates.

In motion vector prediction, two types of motion vector candidates are considered: spatial motion vector candidates and temporal motion vector candidates. For spatial motion vector candidate derivation, two motion vector candidates are eventually derived based on the motion vectors of each PU located in the five different positions depicted in Figure 11.

For temporal motion vector candidate derivation, one motion vector candidate is selected from two candidates, which are derived based on two different co-located positions. After the first list of spatio-temporal candidates is made, duplicated motion vector candidates in the list are removed. If the number of potential candidates is larger than two, the motion vector candidates whose reference picture index within the associated reference picture list is larger than 1 are removed from the list. If the number of spatio-temporal motion vector candidates is smaller than two, additional zero motion vector candidates are added to the list.
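The pruning and padding rules above can be sketched as follows. This is an illustrative helper; candidates are represented here as (mv, ref_idx) tuples, and the target list length of two follows the text.

```python
def finalize_amvp_list(candidates):
    """Deduplicate the candidate list, drop candidates with a reference
    index larger than 1 when more than two candidates remain, and pad
    with zero-MV candidates up to the fixed AMVP list length of two."""
    pruned = []
    for cand in candidates:
        if cand not in pruned:          # remove duplicated candidates
            pruned.append(cand)
    if len(pruned) > 2:
        pruned = [c for c in pruned if c[1] <= 1]
    while len(pruned) < 2:
        pruned.append(((0, 0), 0))      # zero motion vector candidate
    return pruned[:2]
```

Because the final list always has exactly two entries, the MVP index signaled per PU is a single flag.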

2.2.2.2 Spatial motion vector candidates

In the derivation of spatial motion vector candidates, a maximum of two candidates are considered among five potential candidates, which are derived from PUs located in the positions depicted in Figure 11, those positions being the same as those of motion Merge. The order of derivation for the left side of the current PU is defined as A0, A1, scaled A0, scaled A1. The order of derivation for the above side of the current PU is defined as B0, B1, B2, scaled B0, scaled B1, scaled B2. For each side there are therefore four cases that can be used as motion vector candidates, with two cases not requiring the use of spatial scaling and two cases where spatial scaling is used. The four different cases are summarized as follows:

– No spatial scaling

(1) Same reference picture list, and same reference picture index (same POC)

(2) Different reference picture list, but same reference picture (same POC)

– Spatial scaling

(3) Same reference picture list, but different reference picture (different POC)

(4) Different reference picture list, and different reference picture (different POC)
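The four-way classification can be sketched as follows. This is illustrative only; a candidate is described here by its reference picture list and the POC of its reference picture, and the names are assumptions made for this sketch.

```python
def classify_spatial_candidate(cand_list, cand_ref_poc, cur_list, cur_ref_poc):
    """Return which of the four cases above applies and whether spatial
    scaling of the candidate motion vector is required. Scaling is
    needed exactly when the reference-picture POCs differ, regardless
    of whether the reference picture lists match."""
    same_list = (cand_list == cur_list)
    same_poc = (cand_ref_poc == cur_ref_poc)
    if same_poc:
        case = 1 if same_list else 2   # no spatial scaling
    else:
        case = 3 if same_list else 4   # spatial scaling required
    return case, not same_poc
```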

The no-spatial-scaling cases are checked first, followed by the spatial scaling cases. Spatial scaling is considered when the POC differs between the reference picture of the neighboring PU and that of the current PU, regardless of the reference picture list. If all PUs of the left candidates are not available or are intra coded, scaling for the above motion vector is allowed to help the parallel derivation of the left and above MV candidates. Otherwise, spatial scaling is not allowed for the above motion vector.

In a spatial scaling process, the motion vector of the neighboring PU is scaled in a similar manner as for temporal scaling, as depicted in Figure 17. The main difference is that the reference picture list and the index of the current PU are given as input; the actual scaling process is the same as that of temporal scaling.

2.2.2.3 Temporal motion vector candidates

Apart from the reference picture index derivation, the whole derivation process for temporal Merge candidates is the same as that for spatial motion vector candidates (see, e.g., Figure 6). The reference picture index is signaled to the decoder.

2.2.2.4 Signaling of AMVP information

For the AMVP mode, four parts may be signaled in the bitstream: the prediction direction, the reference index, the MVD, and the MV prediction candidate index.

Syntax table:

prediction_unit( x0, y0, nPbW, nPbH ) {                                  Descriptor
    if( cu_skip_flag[ x0 ][ y0 ] ) {
        if( MaxNumMergeCand > 1 )
            merge_idx[ x0 ][ y0 ]                                        ae(v)
    } else { /* MODE_INTER */
        merge_flag[ x0 ][ y0 ]                                           ae(v)
        if( merge_flag[ x0 ][ y0 ] ) {
            if( MaxNumMergeCand > 1 )
                merge_idx[ x0 ][ y0 ]                                    ae(v)
        } else {
            if( slice_type  = =  B )
                inter_pred_idc[ x0 ][ y0 ]                               ae(v)
            if( inter_pred_idc[ x0 ][ y0 ]  !=  PRED_L1 ) {
                if( num_ref_idx_l0_active_minus1 > 0 )
                    ref_idx_l0[ x0 ][ y0 ]                               ae(v)
                mvd_coding( x0, y0, 0 )
                mvp_l0_flag[ x0 ][ y0 ]                                  ae(v)
            }
            if( inter_pred_idc[ x0 ][ y0 ]  !=  PRED_L0 ) {
                if( num_ref_idx_l1_active_minus1 > 0 )
                    ref_idx_l1[ x0 ][ y0 ]                               ae(v)
                if( mvd_l1_zero_flag  &&  inter_pred_idc[ x0 ][ y0 ]  = =  PRED_BI ) {
                    MvdL1[ x0 ][ y0 ][ 0 ] = 0
                    MvdL1[ x0 ][ y0 ][ 1 ] = 0
                } else
                    mvd_coding( x0, y0, 1 )
                mvp_l1_flag[ x0 ][ y0 ]                                  ae(v)
            }
        }
    }
}

Motion vector difference syntax

mvd_coding( x0, y0, refList ) {                          Descriptor
    abs_mvd_greater0_flag[ 0 ]                           ae(v)
    abs_mvd_greater0_flag[ 1 ]                           ae(v)
    if( abs_mvd_greater0_flag[ 0 ] )
        abs_mvd_greater1_flag[ 0 ]                       ae(v)
    if( abs_mvd_greater0_flag[ 1 ] )
        abs_mvd_greater1_flag[ 1 ]                       ae(v)
    if( abs_mvd_greater0_flag[ 0 ] ) {
        if( abs_mvd_greater1_flag[ 0 ] )
            abs_mvd_minus2[ 0 ]                          ae(v)
        mvd_sign_flag[ 0 ]                               ae(v)
    }
    if( abs_mvd_greater0_flag[ 1 ] ) {
        if( abs_mvd_greater1_flag[ 1 ] )
            abs_mvd_minus2[ 1 ]                          ae(v)
        mvd_sign_flag[ 1 ]                               ae(v)
    }
}
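From these syntax elements, the decoded MVD component is reconstructed as follows. This is an illustrative sketch of the relationship implied by the table; the function name is chosen here.

```python
def reconstruct_mvd(greater0_flag, greater1_flag, abs_mvd_minus2, sign_flag):
    """Rebuild one MVD component from the parsed flags: the magnitude is
    0 (not greater than 0), 1 (greater than 0 but not greater than 1),
    or abs_mvd_minus2 + 2; mvd_sign_flag selects the sign."""
    if not greater0_flag:
        return 0
    magnitude = abs_mvd_minus2 + 2 if greater1_flag else 1
    return -magnitude if sign_flag else magnitude
```

Splitting the magnitude across greater0/greater1 flags lets the common small MVD values be coded with very few bins.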

2.3 New inter prediction methods in the Joint Exploration Model (JEM)

2.3.1 Sub-CU based motion vector prediction

In the JEM with QTBT, each CU can have at most one set of motion parameters for each prediction direction. Two sub-CU level motion vector prediction methods are considered in the encoder by splitting a large CU into sub-CUs and deriving motion information for all the sub-CUs of the large CU. The alternative temporal motion vector prediction (ATMVP) method allows each CU to fetch multiple sets of motion information from multiple blocks smaller than the current CU in the co-located reference picture. In the spatial-temporal motion vector prediction (STMVP) method, the motion vectors of the sub-CUs are derived recursively by using the temporal motion vector predictor and the spatially neighboring motion vectors.

To preserve a more accurate motion field for sub-CU motion prediction, motion compression for the reference frames is currently disabled.

2.3.1.1 Alternative temporal motion vector prediction

In the alternative temporal motion vector prediction (ATMVP) method, temporal motion vector prediction (TMVP) is modified by fetching multiple sets of motion information (including motion vectors and reference indices) from blocks smaller than the current CU. As shown in Figure 18, the sub-CUs are square N×N blocks (N is set to 4 by default).

ATMVP predicts the motion vectors of the sub-CUs within a CU in two steps. The first step is to identify the corresponding block in a reference picture with a so-called temporal vector. The reference picture is called the motion source picture. The second step is to split the current CU into sub-CUs and obtain the motion vectors as well as the reference indices of each sub-CU from the block corresponding to each sub-CU, as shown in Figure 18.

In the first step, a reference picture and the corresponding block are determined by the motion information of the spatially neighboring blocks of the current CU. To avoid the repetitive scanning process of the neighboring blocks, the first Merge candidate in the Merge candidate list of the current CU is used. The first available motion vector and its associated reference index are set to be the temporal vector and the index of the motion source picture. This way, in ATMVP, the corresponding block may be identified more accurately than with TMVP, where the corresponding block (sometimes called the collocated block) is always in a bottom-right or center position relative to the current CU. In one example, if the first Merge candidate is from the left neighboring block (i.e., A1 in Figure 19), the associated MV and reference picture are utilized to identify the source block and source picture.

Figure 19 shows an example of the identification of the source block and source picture.

In the second step, a corresponding block of each sub-CU is identified by the temporal vector in the motion source picture, by adding the temporal vector to the coordinates of the current CU. For each sub-CU, the motion information of its corresponding block (the smallest motion grid that covers the center sample) is used to derive the motion information for the sub-CU. After the motion information of a corresponding N×N block is identified, it is converted to the motion vectors and reference indices of the current sub-CU, in the same way as for TMVP in HEVC, wherein motion scaling and other procedures apply. For example, the decoder checks whether the low-delay condition is fulfilled (i.e., the POCs of all reference pictures of the current picture are smaller than the POC of the current picture) and possibly uses the motion vector MVx (the motion vector corresponding to reference picture list X) to predict the motion vector MVy for each sub-CU (with X being equal to 0 or 1 and Y being equal to 1 − X).
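Locating the corresponding block for each sub-CU in the motion source picture can be sketched as follows. This is an illustration of the coordinate arithmetic only (coordinates in luma samples, sub-CU size N = 4 as in the text); the function and variable names are assumptions made here.

```python
N = 4  # default sub-CU size in luma samples

def corresponding_positions(cu_x, cu_y, cu_w, cu_h, temporal_vector):
    """For every N x N sub-CU of the current CU, compute the center
    position of its corresponding block in the motion source picture
    by adding the temporal vector to the current coordinates."""
    tvx, tvy = temporal_vector
    positions = {}
    for sy in range(0, cu_h, N):
        for sx in range(0, cu_w, N):
            center_x = cu_x + sx + N // 2 + tvx
            center_y = cu_y + sy + N // 2 + tvy
            positions[(sx // N, sy // N)] = (center_x, center_y)
    return positions
```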

2.3.1.2 Spatial-temporal motion vector prediction

In this method, the motion vectors of the sub-CUs are derived recursively, following raster scan order. Figure 20 illustrates this concept. Consider an 8×8 CU which contains four 4×4 sub-CUs A, B, C, and D. The neighboring 4×4 blocks in the current frame are labeled a, b, c, and d.

子CU A的運動推導由識別其兩個空間鄰居開始。第一個鄰居是子CU A上方的N×N塊(塊c)。如果該塊c不可用或是幀內編碼,則檢查子CU A上方的其它N×N塊(從左到右,從塊c處開始)。第二個鄰居是子CU A左側的一個塊(塊b)。如果塊b不可用或是幀內編碼,則檢查子CU A左側的其它塊(從上到下,從塊b處開始)。每個清單從相鄰塊獲得的運動資訊被縮放到給定清單的第一個參考幀。接下來,按照HEVC中規定的與TMVP相同的程式,推導子塊A的時域運動向量預測(TMVP)。提取位置D處的並置塊的運動資訊並進行相應的縮放。最後,在檢索和縮放運動資訊後,對每個參考列表分別平均所有可用的運動向量(最多3個)。將平均運動向量指定為當前子CU的運動向量。The motion derivation of sub-CU A starts by identifying its two spatial neighbors. The first neighbor is the N×N block above sub-CU A (block c). If block c is unavailable or intra coded, the other N×N blocks above sub-CU A are checked (from left to right, starting at block c). The second neighbor is a block to the left of sub-CU A (block b). If block b is unavailable or intra coded, the other blocks to the left of sub-CU A are checked (from top to bottom, starting at block b). The motion information obtained from the neighboring blocks for each reference list is scaled to the first reference frame of the given list. Next, the temporal motion vector predictor (TMVP) of sub-block A is derived by following the same procedure as the TMVP derivation specified in HEVC. The motion information of the collocated block at position D is extracted and scaled accordingly. Finally, after retrieving and scaling the motion information, all available motion vectors (up to 3) are averaged separately for each reference list. The averaged motion vector is assigned as the motion vector of the current sub-CU.
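The per-list averaging step described above can be sketched as follows. This is an illustrative sketch only, not code from any codec: motion vectors are modeled as integer (x, y) pairs and `None` marks an unavailable neighbor.

```python
def stmvp_average(mvs):
    """Average the available motion vectors (up to 3: above neighbor,
    left neighbor, TMVP) for one reference list; None = unavailable."""
    avail = [mv for mv in mvs if mv is not None]
    if not avail:
        return None  # no usable motion information for this sub-CU
    n = len(avail)
    # integer average of each component (MVs are stored as integers)
    return (sum(mv[0] for mv in avail) // n,
            sum(mv[1] for mv in avail) // n)

# above neighbor, left neighbor, and the scaled TMVP for sub-CU A
print(stmvp_average([(4, -2), (6, 0), (2, 2)]))  # (4, 0)
```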

圖20示出了具有四個子塊(A-D)及其相鄰塊(a-d)的一個CU的示例。Figure 20 shows an example of one CU with four sub-blocks (A-D) and their neighboring blocks (a-d).

2.3.1.32.3.1.3 子CU運動預測模式信令通知Sub-CU motion prediction mode signaling

子CU模式作為附加的Merge候選模式啟用,並且不需要附加的語法元素來對該模式發信令。將另外兩個Merge候選添加到每個CU的Merge候選列表中,以表示ATMVP模式和STMVP模式。如果序列參數集指示啟用了ATMVP和STMVP,則最多使用七個Merge候選。附加Merge候選的編碼邏輯與HM中的Merge候選的編碼邏輯相同,這意味著對於P條帶或B條帶中的每個CU,需要對兩個附加Merge候選進行兩次額外的RD檢查。Sub-CU mode is enabled as an additional Merge candidate mode, and no additional syntax elements are required to signal this mode. Two additional Merge candidates are added to each CU's Merge candidate list to represent ATMVP mode and STMVP mode. If the sequence parameter set indicates that ATMVP and STMVP are enabled, up to seven Merge candidates are used. The encoding logic of additional Merge candidates is the same as that of Merge candidates in HM, which means that for each CU in a P-slice or B-slice, two additional RD checks are required for the two additional Merge candidates.

在JEM中,Merge索引的所有bin都由CABAC進行上下文編碼。然而在HEVC中,只有第一個bin是上下文編碼的,並且其餘的bin是旁路編碼的。In JEM, all bins of the Merge index are context coded by CABAC. In HEVC, however, only the first bin is context coded and the remaining bins are bypass coded.

2.3.22.3.2 自適應運動向量差解析度Adaptive motion vector difference resolution

在HEVC中,當在條帶標頭中use_integer_mv_flag等於0時,運動向量差(MVD)(在PU的運動向量和預測運動向量之間)以四分之一亮度樣本為單位發信令。在JEM中,引入了局部自適應運動向量解析度(LAMVR)。在JEM中,MVD可以用四分之一亮度樣本、整數亮度樣本或四亮度樣本的單位進行編碼。MVD解析度控制在編碼單元(CU)級別,並且MVD解析度標誌有條件地為每個至少有一個非零MVD分量的CU發信令。In HEVC, when use_integer_mv_flag is equal to 0 in the slice header, the motion vector difference (MVD) (between the motion vector of the PU and the predicted motion vector) is signaled in units of quarter luma samples. In JEM, locally adaptive motion vector resolution (LAMVR) is introduced. In JEM, the MVD can be coded in units of quarter luma samples, integer luma samples, or four luma samples. The MVD resolution is controlled at the coding unit (CU) level, and an MVD resolution flag is conditionally signaled for each CU that has at least one non-zero MVD component.

對於具有至少一個非零MVD分量的CU,第一個標誌將發信令以指示CU中是否使用四分之一亮度樣本MV精度。當第一個標誌(等於1)指示不使用四分之一亮度樣本MV精度時,另一個標誌發信令以指示是使用整數亮度樣本MV精度還是使用四亮度樣本MV精度。For a CU with at least one non-zero MVD component, a first flag is signaled to indicate whether quarter-luma-sample MV precision is used in the CU. When the first flag (equal to 1) indicates that quarter-luma-sample MV precision is not used, another flag is signaled to indicate whether integer-luma-sample or four-luma-sample MV precision is used.

當CU的第一個MVD解析度標誌為零或沒有為CU編碼(意味著CU中的所有MVD都為零)時,CU使用四分之一亮度樣本MV解析度。當一個CU使用整數亮度樣本MV精度或四亮度樣本MV精度時,該CU的AMVP候選列表中的MVP將取整到對應的精度。When the first MVD resolution flag of the CU is zero or is not encoded for the CU (meaning all MVDs in the CU are zero), the CU uses quarter-luma sample MV resolution. When a CU uses integer luma sample MV precision or quad luma sample MV precision, the MVPs in the CU's AMVP candidate list will be rounded to the corresponding precision.
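The rounding of AMVP candidates to the selected precision can be sketched as follows; the rounding rule used here (round half away from zero) is an illustrative assumption, and MV components are assumed to be stored in quarter-luma-sample units.

```python
def round_mv_component(v, shift):
    """Round one MV component (in quarter-luma units) to a multiple of
    2**shift quarter units: shift=2 -> integer-sample precision,
    shift=4 -> four-sample precision."""
    offset = 1 << (shift - 1)
    if v >= 0:
        return ((v + offset) >> shift) << shift
    return -((((-v) + offset) >> shift) << shift)

print(round_mv_component(7, 2))   # 8: 7/4 luma samples rounds to 2 samples
print(round_mv_component(10, 4))  # 16: rounds to the nearest 4-sample step
```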

在編碼器中,CU級別的RD檢查用於確定哪個MVD解析度將用於CU。也就是說,對每個MVD解析度執行三次CU級別的RD檢查。為了加快編碼器速度,在JEM中應用以下編碼方案。In the encoder, a CU-level RD check is used to determine which MVD resolution will be used for the CU. That is, three CU-level RD checks are performed for each MVD resolution. To speed up the encoder, the following encoding scheme is applied in JEM.

在對具有正常四分之一亮度採樣MVD解析度的CU進行RD檢查期間,存儲當前CU(整數亮度採樣精度)的運動資訊。在對具有整數亮度樣本和4亮度樣本MVD解析度的同一個CU進行RD檢查時,將存儲的運動資訊(取整後)用作進一步小範圍運動向量細化的起始點,從而使耗時的運動估計處理不會重複三次。During the RD check of a CU with the normal quarter-luma-sample MVD resolution, the motion information (at integer-luma-sample accuracy) of the current CU is stored. When performing the RD check on the same CU with integer-luma-sample and 4-luma-sample MVD resolution, the stored motion information (after rounding) is used as the starting point for further small-range motion vector refinement, so that the time-consuming motion estimation process is not repeated three times.

有條件地調用具有4 亮度樣本MVD解析度的CU的RD檢查。對於CU,當整數亮度樣本MVD解析度的RD檢查成本遠大於四分之一亮度樣本MVD解析度的RD檢查成本時,將跳過對CU的4 亮度樣本MVD解析度的RD檢查。Conditionally calls the RD check for CUs with 4 luminance sample MVD resolution. For a CU, when the RD check cost for the integer luma sample MVD resolution is much greater than the RD check cost for the quarter luma sample MVD resolution, the RD check for the CU's 4-luma sample MVD resolution will be skipped.

2.3.32.3.3 模式匹配運動向量推導Pattern matching motion vector derivation

模式匹配運動向量推導(PMMVD)模式是基於畫面播放速率上轉換(FRUC)技術的特殊Merge模式。在這種模式下,塊的運動資訊不會被發信令,而是在解碼器側推導。Pattern Matching Motion Vector Derivation (PMMVD) mode is a special Merge mode based on Frame Rate Up Conversion (FRUC) technology. In this mode, the motion information of the block is not signaled but derived on the decoder side.

對於CU,當其Merge標誌為真時,對FRUC標誌發信令。當FRUC標誌為假時,對Merge索引發信令並且使用常規Merge模式。當FRUC標誌為真時,對另一個FRUC模式標誌發信令來指示將使用哪種模式(雙邊匹配或範本匹配)來推導該塊的運動資訊。For a CU, when its Merge flag is true, the FRUC flag is signaled. When the FRUC flag is false, the Merge index is signaled and the regular Merge mode is used. When the FRUC flag is true, an additional FRUC mode flag is signaled to indicate which mode (bilateral matching or template matching) is to be used to derive the motion information of the block.

在編碼器側,基於對正常Merge候選所做的RD成本選擇決定是否對CU使用FRUC Merge模式。即通過使用RD成本選擇來檢查CU的兩個匹配模式(雙邊匹配和範本匹配)。導致最低成本的模式進一步與其它CU模式相比較。如果FRUC匹配模式是最有效的模式,那麼對於CU,FRUC標誌設置為真,並且使用相關的匹配模式。On the encoder side, it is decided whether to use FRUC Merge mode for the CU based on the RD cost selection made on the normal Merge candidates. That is, two matching modes (bilateral matching and template matching) of CU are checked by using RD cost selection. The mode that resulted in the lowest cost was further compared with other CU modes. If the FRUC match mode is the most efficient mode, then for CU the FRUC flag is set to true and the associated match mode is used.

FRUC Merge模式中的運動推導過程有兩個步驟:首先執行CU級運動搜索,然後執行子CU級運動優化。在CU級,基於雙邊匹配或範本匹配,推導整個CU的初始運動向量。首先,生成一個MV候選列表,並且選擇導致最低匹配成本的候選作為進一步優化CU級的起點。然後在起始點附近執行基於雙邊匹配或範本匹配的局部搜索,並且將最小匹配成本的MV結果作為整個CU的MV值。接著,以推導的CU運動向量為起點,進一步在子CU級細化運動資訊。The motion derivation process in FRUC Merge mode has two steps: first perform CU-level motion search, and then perform sub-CU-level motion optimization. At the CU level, the initial motion vector of the entire CU is derived based on bilateral matching or template matching. First, an MV candidate list is generated, and the candidate that results in the lowest matching cost is selected as the starting point for further optimization at the CU level. Then a local search based on bilateral matching or template matching is performed near the starting point, and the MV result of the minimum matching cost is used as the MV value of the entire CU. Then, taking the derived CU motion vector as the starting point, the motion information is further refined at the sub-CU level.

例如,對於W×H CU運動資訊推導執行以下推導過程。在第一階段,推導了整個W×H CU的MV。在第二階段,該CU進一步被分成M×M子CU。M的值按照(3)計算,D是預先定義的劃分深度,在JEM中默認設置為3。然後推導每個子CU的MV值。For example, for W×H CU motion information derivation, the following derivation process is performed. In the first stage, the MV of the entire W×H CU is derived. In the second stage, the CU is further divided into M×M sub-CUs. The value of M is calculated according to (3), where D is a predefined splitting depth, which is set to 3 by default in JEM. Then the MV value of each sub-CU is derived.

M = max{4, min{W/2^D, H/2^D}}  (3)
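The sub-CU size computation can be written as a small helper, assuming equation (3) is the JEM rule M = max{4, min{W/2^D, H/2^D}} (a sketch; names are illustrative):

```python
def fruc_subcu_size(w, h, d=3):
    """M = max(4, min(W/2**D, H/2**D)); D defaults to 3 as in JEM."""
    return max(4, min(w >> d, h >> d))

print(fruc_subcu_size(64, 64))  # 8: the CU is split into 8x8 sub-CUs
print(fruc_subcu_size(16, 32))  # 4: clamped to the minimum sub-CU size
```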

如圖21所示,通過沿當前CU的運動軌跡在兩個不同的參考圖片中找到兩個塊之間最接近的匹配,使用雙邊匹配來推導當前CU的運動資訊。在連續運動軌跡假設下,指向兩個參考塊的運動向量MV0和MV1與當前圖片和兩個參考圖片之間的時間距離(即,TD0和TD1)成正比。作為特殊情況,當當前圖片在時間上位於兩個參考圖片之間並且當前圖片到兩個參考圖片的時間距離相同時,雙邊匹配成為基於鏡像的雙向MV。As shown in Figure 21, bilateral matching is used to derive the motion information of the current CU by finding the closest match between two blocks in two different reference pictures along the motion trajectory of the current CU. Under the assumption of a continuous motion trajectory, the motion vectors MV0 and MV1 pointing to the two reference blocks are proportional to the temporal distances between the current picture and the two reference pictures (i.e., TD0 and TD1). As a special case, when the current picture is temporally between the two reference pictures and the temporal distances from the current picture to the two reference pictures are the same, bilateral matching becomes mirror-based bidirectional MV.

如圖22所示,通過在當前圖片中的範本(當前CU的頂部和/或左側相鄰塊)和參考圖片中的塊(與範本尺寸相同)之間找到最接近的匹配,使用範本匹配來推導當前CU的運動資訊。除了上述的FRUC Merge模式外,範本匹配也應用於AMVP模式。在JEM中,正如在HEVC中一樣,AMVP有兩個候選。利用範本匹配方法,推導了新的候選。如果由範本匹配新推導的候選與第一個現有AMVP候選不同,則將其插入AMVP候選列表的最開始處,並且然後將列表尺寸設置為2(即移除第二個現有AMVP候選)。當應用於AMVP模式時,僅應用CU級搜索。As shown in Figure 22, template matching is used by finding the closest match between the template in the current picture (the top and/or left adjacent block of the current CU) and the block in the reference picture (same size as the template). Derive the current CU's motion information. In addition to the above-mentioned FRUC Merge mode, template matching is also applied to AMVP mode. In JEM, as in HEVC, there are two candidates for AMVP. Using the template matching method, new candidates are derived. If a newly derived candidate by template matching is different from the first existing AMVP candidate, it is inserted at the very beginning of the AMVP candidate list, and the list size is then set to 2 (i.e., the second existing AMVP candidate is removed). When applied to AMVP mode, only CU-level searches are applied.

2.3.3.12.3.3.1 CU級MV候選集CU-level MV candidate set

CU級的MV候選集包括:The CU-level MV candidate set includes:

(i)原始AMVP候選,如果當前CU處於AMVP模式,(i) Original AMVP candidate, if the current CU is in AMVP mode,

(ii)所有Merge候選,(ii) all Merge candidates,

(iii)插值MV場中的幾個MV。(iii) Interpolate several MVs in the MV field.

(iv)頂部和左側相鄰運動向量(iv) Top and left adjacent motion vectors

當使用雙邊匹配時,Merge候選的每個有效MV用作輸入,以生成假設為雙邊匹配的MV對。例如,Merge候選在參考列表A處的一個有效MV為(MVa,refa )。然後在另一個參考列表B中找到其配對的雙邊MV的參考圖片refb ,以便refa 和refb 在時間上位於當前圖片的不同側。如果參考列表B中的參考refb 不可用,則將參考refb 確定為與參考refa 不同的參考,並且其到當前圖片的時間距離是清單B中的最小距離。確定參考refb 後,通過基於當前圖片和參考refa 、參考refb 之間的時間距離縮放MVa推導MVb。When using bilateral matching, each valid MV of the Merge candidate is used as input to generate pairs of MVs that are assumed to be bilaterally matched. For example, a valid MV of the Merge candidate at reference list A is (MVa, ref a ). The reference picture ref b of its paired bilateral MV is then found in another reference list B, so that ref a and ref b are temporally located on different sides of the current picture. If reference ref b in reference list B is not available, reference ref b is determined to be a reference different from reference ref a , and its temporal distance to the current picture is the minimum distance in list B. After the reference ref b is determined, MVb is derived by scaling the MVa based on the time distance between the current picture and the reference ref a and the reference ref b .
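The MV pairing can be sketched as follows, under the simplifying assumption that temporal distances are signed POC differences, so that references on opposite sides of the current picture have opposite signs (illustrative only; normative MV scaling uses clipped integer arithmetic):

```python
def derive_paired_mv(mv_a, td_a, td_b):
    """Scale MVa (pointing to ref_a at signed temporal distance td_a)
    by td_b/td_a to get the hypothesized bilateral partner MVb that
    points to ref_b at signed temporal distance td_b."""
    return (round(mv_a[0] * td_b / td_a),
            round(mv_a[1] * td_b / td_a))

# ref_a one picture in the past (td=-1), ref_b one in the future (td=+1):
print(derive_paired_mv((8, -4), -1, 1))  # (-8, 4), the mirrored MV
```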

還將來自插值MV場中的四個MV添加到CU級候選列表中。更具體地,添加當前CU的位置(0,0),(W/2,0),(0,H/2)和(W/2,H/2)處插值的MV。Four MVs from the interpolated MV field are also added to the CU-level candidate list. More specifically, the interpolated MVs at the positions (0, 0), (W/2, 0), (0, H/2) and (W/2, H/2) of the current CU are added.

當在AMVP模式下應用FRUC時,原始的AMVP候選也添加到CU級的MV候選集。When applying FRUC in AMVP mode, the original AMVP candidates are also added to the CU-level MV candidate set.

在CU級,可以將AMVP CU的最多15個 MV和Merge CU的最多13個 MV添加到候選列表中。At the CU level, up to 15 MVs for AMVP CUs and up to 13 MVs for Merge CUs can be added to the candidate list.

2.3.3.22.3.3.2 子CU級MV候選集Sub-CU-level MV candidate set

在子CU級設置的MV候選包括:MV candidates set at the sub-CU level include:

(i)從CU級搜索確定的MV,(i) Search the determined MV from the CU level,

(ii)頂部、左側、左上方和右上方相鄰的MV,(ii) Top, left, upper left and upper right adjacent MVs,

(iii)來自參考圖片的並置MV的縮放版本,(iii) a scaled version of the juxtaposed MV from the reference image,

(iv)最多4個ATMVP候選,(iv) Up to 4 ATMVP candidates,

(v)最多4個STMVP候選。(v) Up to 4 STMVP candidates.

來自參考圖片的縮放MV推導如下。兩個清單中的所有參考圖片都被遍歷。參考圖片中子CU的並置位置處的MV被縮放為起始CU級MV的參考。The scaled MV from the reference image is derived as follows. All reference images in both lists are traversed. The MV at the collocated position of the sub-CU in the reference picture is scaled to the reference of the starting CU-level MV.

ATMVP和STMVP候選被限制為前四個。在子CU級,最多17個MV被添加到候選列表中。ATMVP and STMVP candidates are limited to the top four. At the sub-CU level, up to 17 MVs are added to the candidate list.

2.3.3.32.3.3.3 插值MV場的生成Interpolated MV field generation

在對幀進行編碼之前,基於單向ME生成整個圖片的內插運動場。然後,該運動場可以隨後用作CU級或子CU級的MV候選。Before encoding the frame, an interpolated motion field of the entire picture is generated based on unidirectional ME. This sports field can then be subsequently used as an MV candidate at the CU level or sub-CU level.

首先,兩個參考清單中每個參考圖片的運動場在4×4的塊級別上被遍歷。對於每個4×4塊,如果與塊相關聯的運動通過當前圖片中的4×4塊(如圖23所示),並且該塊沒有被分配任何內插運動,則根據時間距離TD0和TD1將參考塊的運動縮放到當前圖片(與HEVC中TMVP的MV縮放相同),並且在當前幀中將該縮放運動指定給該塊。如果沒有縮放的MV指定給4×4塊,則在插值運動場中將塊的運動標記為不可用。First, the motion field of each reference picture in the two reference lists is traversed at the 4 × 4 block level. For each 4×4 block, if the motion associated with the block passes through the 4×4 block in the current picture (as shown in Figure 23), and the block is not assigned any interpolated motion, then according to the temporal distance TD0 and TD1 Scale the motion of the reference block to the current picture (same as MV scaling of TMVP in HEVC), and assign that scaled motion to the block in the current frame. If no scaled MV is assigned to a 4×4 block, the block's motion is marked as unavailable in the interpolated motion field.

2.3.3.42.3.3.4 插值和匹配成本Interpolation and matching cost

當運動向量指向分數採樣位置時,需要運動補償插值。為了降低複雜度,對雙邊匹配和範本匹配都使用雙線性插值而不是常規的8抽頭HEVC插值。When motion vectors point to fractional sampling locations, motion compensated interpolation is required. In order to reduce complexity, bilinear interpolation is used instead of conventional 8-tap HEVC interpolation for both bilateral matching and template matching.

匹配成本的計算在不同的步驟處有點不同。當從CU級的候選集中選擇候選時,匹配成本是雙邊匹配或範本匹配的絕對差之和(SAD)。在確定起始MV後,雙邊匹配在子CU級搜索的匹配成本C計算如下:The calculation of the matching cost is a bit different at different steps. When selecting a candidate from the CU-level candidate set, the matching cost is the sum of absolute differences (SAD) of bilateral matching or template matching. After the starting MV is determined, the matching cost C of bilateral matching at the sub-CU-level search is calculated as follows:

C = SAD + w·(|MVx − MVxs| + |MVy − MVys|)  (4)

這裡,w是權重係數,被經驗地設置為4。MV和MVs分別指示當前MV和起始MV。仍然將SAD用作模式匹配在子CU級搜索的匹配成本。Here, w is a weighting factor which is empirically set to 4. MV and MVs indicate the current MV and the starting MV, respectively. SAD is still used as the matching cost of pattern matching at the sub-CU-level search.
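The sub-CU-level cost of equation (4) can be sketched as follows, with the SAD computation abstracted into an input (illustrative only):

```python
def bilateral_cost(sad, mv, mv_start, w=4):
    """C = SAD + w * (|MVx - MVx_s| + |MVy - MVy_s|): the block-matching
    SAD plus a penalty on the deviation from the starting MV (w = 4)."""
    return sad + w * (abs(mv[0] - mv_start[0]) + abs(mv[1] - mv_start[1]))

print(bilateral_cost(100, (5, 3), (4, 1)))  # 112 = 100 + 4*(1+2)
```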

在FRUC模式下,MV通過僅使用亮度樣本推導。推導的運動將用於亮度和彩度的MC幀間預測。確定MV後,對亮度使用8抽頭(8-taps)插值濾波器並且對彩度使用4抽頭(4-taps)插值濾波器執行最終MC。In FRUC mode, MV is derived by using only luminance samples. The derived motion will be used for MC inter prediction of luma and chroma. After determining the MV, a final MC is performed using an 8-tap interpolation filter for luma and a 4-tap interpolation filter for chroma.

2.3.3.52.3.3.5 MV細化MV refinement

MV細化是基於模式的MV搜索,以雙邊成本或範本匹配成本為標準。在JEM中,支援兩種搜索模式—無限制中心偏置菱形搜索(UCBDS)和自適應交叉搜索,分別在CU級別和子CU級別進行MV細化。對於CU級和子CU級的MV細化,都在四分之一亮度樣本精度下直接搜索MV,接著是八分之一亮度樣本MV細化。將CU和子CU步驟的MV細化的搜索範圍設置為8個亮度樣本。MV refinement is a pattern-based MV search with the criterion of bilateral matching cost or template matching cost. In JEM, two search patterns are supported: the unrestricted center-biased diamond search (UCBDS) and the adaptive cross search, for MV refinement at the CU level and the sub-CU level, respectively. For both CU-level and sub-CU-level MV refinement, the MV is directly searched at quarter-luma-sample accuracy, followed by one-eighth-luma-sample MV refinement. The search range of MV refinement for the CU and sub-CU steps is set to 8 luma samples.

2.3.3.62.3.3.6 範本匹配FRUC Merge模式下預測方向的選擇Selection of prediction direction in template matching FRUC Merge mode

在雙邊Merge模式下,總是應用雙向預測,因為CU的運動資訊是在兩個不同的參考圖片中基於當前CU運動軌跡上兩個塊之間的最近匹配得出的。範本匹配Merge模式沒有這種限定。在範本匹配Merge模式下,編碼器可以從清單0的單向預測、列表1的單向預測或者雙向預測中為CU做出選擇。該選擇基於如下的範本匹配成本:In bilateral Merge mode, bidirectional prediction is always applied because the motion information of the CU is obtained in two different reference pictures based on the closest match between the two blocks on the current CU motion trajectory. Template matching Merge mode does not have this limitation. In template matching Merge mode, the encoder can choose for the CU from unidirectional prediction in List 0, unidirectional prediction in List 1, or bidirectional prediction. This selection is based on the following template matching costs:

如果 costBi<=factor*min(cost0,cost1)If costBi<=factor*min(cost0,cost1)

則使用雙向預測;Then use bidirectional prediction;

否則,如果 cost0<=cost1Otherwise, if cost0<=cost1

則使用列表0中的單向預測;Then use the one-way prediction in list 0;

否則,Otherwise,

使用列表1中的單向預測;Use the one-way prediction from Listing 1;

其中cost0是清單0範本匹配的SAD,cost1是清單1範本匹配的SAD,並且costBi是雙向預測範本匹配的SAD。factor的值等於1.25,意味著選擇處理朝雙向預測偏移。幀間預測方向選擇可以僅應用於CU級範本匹配處理。Where cost0 is the SAD of the list 0 template matching, cost1 is the SAD of the list 1 template matching, and costBi is the SAD of the bi-prediction template matching. The value of factor is equal to 1.25, which means that the selection process is biased toward bi-prediction. The inter prediction direction selection can be applied only to the CU-level template matching process.
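The selection rule can be sketched as follows, assuming the JEM convention that bi-prediction is kept when costBi <= factor * min(cost0, cost1); the string return values are purely illustrative.

```python
def select_prediction_direction(cost0, cost1, cost_bi, factor=1.25):
    """cost0/cost1: template-matching SAD of list 0 / list 1 uni-prediction;
    cost_bi: template-matching SAD of bi-prediction. A factor > 1 biases
    the decision toward bi-prediction."""
    if cost_bi <= factor * min(cost0, cost1):
        return "bi"
    return "uni_list0" if cost0 <= cost1 else "uni_list1"

print(select_prediction_direction(100, 120, 118))  # bi: 118 <= 1.25*100
print(select_prediction_direction(100, 120, 130))  # uni_list0
```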

2.3.42.3.4 解碼器側運動向量細化Decoder side motion vector refinement

在雙向預測操作中,對於一個塊區域的預測,將兩個分別由列表0的運動向量(MV)和列表1的MV形成的預測塊組合形成單個預測信號。在解碼器側運動向量細化(DMVR)方法中,通過雙邊範本匹配處理進一步細化雙向預測的兩個運動向量。解碼器中應用的雙邊範本匹配用於在雙邊範本和參考圖片中的重建樣本之間執行基於失真的搜索,以便在不傳輸附加運動資訊的情況下獲得細化的MV。In the bidirectional prediction operation, for prediction of one block region, two prediction blocks formed by the motion vector (MV) of list 0 and the MV of list 1 respectively are combined to form a single prediction signal. In the decoder-side motion vector refinement (DMVR) method, the two motion vectors of bidirectional prediction are further refined through bilateral template matching processing. Bilateral template matching applied in the decoder is used to perform a distortion-based search between bilateral templates and reconstructed samples in reference pictures to obtain refined MVs without transmitting additional motion information.

在DMVR中,雙邊範本被生成為兩個預測塊的加權組合(即平均),其中兩個預測塊分別來自列表0的初始MV0和列表1的MV1。範本匹配操作包括計算生成的範本與參考圖片中的樣本區域(在初始預測塊周圍)之間的成本度量。對於兩個參考圖片中的每一個,產生最小範本成本的MV被視為該列表的更新MV,以替換原始MV。在JEM中,為每個列表搜索九個MV候選。九個MV候選包括原始MV和8個周邊MV,這八個周邊MV在水準或垂直方向上或兩者與原始MV具有一個亮度樣本的偏移。最後,使用圖24所示的兩個新的MV(即MV0′和MV1′)生成最終的雙向預測結果。絕對差異之和(SAD)被用作成本度量。In DMVR, the bilateral template is generated as a weighted combination (i.e., average) of two prediction blocks, where the two prediction blocks are from the initial MV0 of list 0 and the MV1 of list 1. The template matching operation consists of computing a cost metric between the generated template and the sample region in the reference picture (around the initial prediction block). For each of the two reference pictures, the MV that yields the minimum template cost is considered as the updated MV of the list, replacing the original MV. In JEM, nine MV candidates are searched for each list. The nine MV candidates include the original MV and eight peripheral MVs that have an offset of one luminance sample from the original MV in the horizontal or vertical direction or both. Finally, the two new MVs (i.e., MV0′ and MV1′) shown in Figure 24 are used to generate the final bidirectional prediction results. The sum of absolute differences (SAD) was used as the cost measure.
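The nine-point search grid described above can be sketched as follows (one-luma-sample offsets around the initial MV; illustrative):

```python
def dmvr_search_points(mv):
    """The initial MV plus its 8 neighbors offset by one luma sample
    horizontally, vertically, or both: 9 candidates per reference list."""
    return [(mv[0] + dx, mv[1] + dy)
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)]

points = dmvr_search_points((10, -3))
print(len(points))         # 9
print((10, -3) in points)  # True: the original MV is among the candidates
```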

在不傳輸附加語法元素的情況下,將DMVR應用於雙向預測的Merge模式,其中一個MV來自過去的參考圖片,並且另一個MV來自未來的參考圖片。在JEM中,當為CU啟用LIC、仿射運動、FRUC或子CU Merge候選時,不應用DMVR。Without transmitting additional syntax elements, DMVR is applied to bidirectionally predicted Merge mode, where one MV comes from a past reference picture and the other MV comes from a future reference picture. In JEM, DMVR is not applied when LIC, Affine Motion, FRUC, or Sub-CU Merge candidates are enabled for a CU.

2.3.52.3.5 局部光照補償local illumination compensation

局部光照補償(IC)基於用於光照改變的線性模型,使用縮放因數a和偏移b。並且針對每個幀間模式編碼的編碼單元(CU)自適應地啟用或禁用局部光照補償。Local illumination compensation (IC) is based on a linear model for illumination changes, using a scaling factor a and an offset b. It is adaptively enabled or disabled for each inter-mode coded coding unit (CU).

當IC應用於CU時,採用最小平方誤差方法通過使用當前CU的相鄰樣點及其對應的參考樣點來推導參數a和b。更具體地,如圖25所示,使用CU的子採樣的(2:1子採樣)相鄰樣點和參考圖片中的(由當前CU或子CU的運動資訊識別的)對應樣點。IC參數被推導並分別應用於每個預測方向。When IC is applied to a CU, the least square error method is adopted to derive the parameters a and b by using the adjacent sample points of the current CU and their corresponding reference sample points. More specifically, as shown in Figure 25, subsampled (2:1 subsampled) adjacent samples of the CU and corresponding samples (identified by the motion information of the current CU or sub-CU) in the reference picture are used. IC parameters are derived and applied separately to each prediction direction.
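The least-squares derivation of a and b can be sketched as a plain linear regression of the current neighboring samples against the reference neighboring samples (the 2:1 subsampling and the integer arithmetic of a real codec are omitted here):

```python
def lic_params(cur_neigh, ref_neigh):
    """Fit cur ~= a * ref + b over neighboring sample pairs by least squares."""
    n = len(cur_neigh)
    sx = sum(ref_neigh)
    sy = sum(cur_neigh)
    sxx = sum(x * x for x in ref_neigh)
    sxy = sum(x * y for x, y in zip(ref_neigh, cur_neigh))
    denom = n * sxx - sx * sx
    if denom == 0:  # constant reference neighbors: fall back to a pure offset
        return 1.0, (sy - sx) / n
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b

# neighbors whose illumination doubled and gained an offset of +5:
print(lic_params([25, 45, 65], [10, 20, 30]))  # (2.0, 5.0)
```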

當以Merge模式對CU進行編碼時,以與Merge模式中的運動資訊複製類似的方式從相鄰塊複製IC標誌;否則,對CU信令通知IC標誌以指示LIC是否適用。When a CU is encoded in Merge mode, the IC flag is copied from neighboring blocks in a similar manner to the motion information copy in Merge mode; otherwise, the IC flag is signaled to the CU to indicate whether LIC is applicable.

當針對圖片啟用IC時,需要附加的CU級別RD檢查以確定是否將LIC應用於CU。當針對CU啟用IC時,對於整數像素運動搜索和分數像素運動搜索,分別使用均值移除絕對差之和(Mean-Removed Sum of Absolute Difference,MR-SAD)以及均值移除絕對Hadamard變換差之和(Mean-Removed Sum of Absolute Hadamard-Transformed Difference,MR-SATD),而不是SAD和SATD。When IC is enabled for a picture, an additional CU-level RD check is needed to determine whether to apply LIC to a CU. When IC is enabled for a CU, the mean-removed sum of absolute differences (MR-SAD) and the mean-removed sum of absolute Hadamard-transformed differences (MR-SATD) are used, instead of SAD and SATD, for integer-pel motion search and fractional-pel motion search, respectively.

為了降低編碼複雜度,在JEM中應用以下編碼方案。當當前圖片與其參考圖片之間沒有明顯的光照改變時,針對整個圖片禁用IC。為了識別這種情況,在編碼器處計算當前圖片和當前圖片的每個參考圖片的長條圖。如果當前圖片與當前圖片的每個參考圖片之間的長條圖差異小於給定閾值,則針對當前圖片禁用IC;否則,針對當前圖片啟用IC。To reduce the encoding complexity, the following encoding scheme is applied in JEM. IC is disabled for the entire picture when there is no obvious illumination change between the current picture and its reference pictures. To identify this situation, histograms of the current picture and of every reference picture of the current picture are calculated at the encoder. If the histogram difference between the current picture and every reference picture of the current picture is smaller than a given threshold, IC is disabled for the current picture; otherwise, IC is enabled for the current picture.

2.3.62.3.6 具有雙向匹配細化的Merge/跳過模式Merge/skip mode with bilateral matching refinement

首先通過利用冗餘檢查將空間相鄰和時間相鄰塊的運動向量和參考索引插入候選清單中來構造Merge候選列表,直到可用候選的數量達到最大候選尺寸19。通過根據預定義的插入順序,在插入空間候選(圖26)、時間候選、仿射候選、高級時間MVP(Advanced Temporal MVP,ATMVP)候選、時空MVP(Spatial Temporal,STMVP)候選和HEVC中使用的附加候選(組合候選和零候選)來構造Merge/跳過模式的Merge候選清單:The Merge candidate list is first constructed by inserting the motion vectors and reference indices of spatially adjacent and temporally adjacent blocks into the candidate list using redundancy checking until the number of available candidates reaches the maximum candidate size of 19. By inserting spatial candidates (Figure 26), temporal candidates, affine candidates, Advanced Temporal MVP (ATMVP) candidates, spatiotemporal MVP (Spatial Temporal, STMVP) candidates and HEVC candidates according to the predefined insertion order. Append candidates (combined candidates and zero candidates) to construct a Merge candidate list for Merge/Skip mode:

(1)塊1-4的空間候選(1) Spatial candidates for blocks 1-4

(2)塊1-4的外推(extrapolated)仿射候選(2) Extrapolated affine candidates for blocks 1-4

(3)ATMVP(3)ATMVP

(4)STMVP(4) STMVP

(5)虛擬仿射候選(5) Virtual affine candidates

(6)空間候選(塊5)(僅當可用候選的數量小於6時使用)(6) Spatial Candidates (Block 5) (only used when the number of available candidates is less than 6)

(7)外推仿射候選(塊5)(7) Extrapolate affine candidates (Block 5)

(8)時間候選(如在HEVC中推導的)(8) Temporal candidates (as derived in HEVC)

(9)非鄰近空間候選,其後是外推仿射候選(塊6至49)(9) Non-contiguous spatial candidates, followed by extrapolated affine candidates (blocks 6 to 49)

(10)組合候選(10) Combination candidates

(11)零候選(11) Zero candidate

注意到,除了STMVP和仿射之外,IC標誌也從Merge候選繼承。而且,對於前四個空間候選,在具有單向預測的候選之前插入雙向預測候選。Note that in addition to STMVP and affine, the IC flag is also inherited from the Merge candidate. Furthermore, for the first four spatial candidates, bidirectional prediction candidates are inserted before candidates with unidirectional prediction.

2.3.7 JVET-K01612.3.7 JVET-K0161

在本提議中,提出了非子塊STMVP作為空間-時間Merge模式。所提出的方法使用共位塊,其與HEVC/JEM(僅1個圖片,此處沒有時間向量)相同。所提出的方法還檢查上方和左側的空間位置,這些位置在本提議中被調整。具體地,為了檢查相鄰的幀間預測資訊,對於上方和左側各檢查最多兩個位置。確切的位置如圖27所示。In this proposal, a non-sub-block STMVP is proposed as a spatial-temporal Merge mode. The proposed method uses a collocated block, which is the same as in HEVC/JEM (only 1 picture, no temporal vector here). The proposed method also checks upper and left spatial positions, which are adjusted in this proposal. Specifically, to check neighboring inter prediction information, at most two positions are checked for each of the upper and left sides. The exact positions are shown in Figure 27.

Afar: (nPbW * 5 / 2, -1), Amid (nPbW / 2, -1)  (注意:當前塊上方的空間塊的偏移量)Afar: (nPbW * 5 / 2, -1), Amid (nPbW / 2, -1) (Note: the offsets of the spatial blocks above the current block)

Lfar: (-1, nPbH * 5 / 2), Lmid (-1, nPbH/2)   (注意:當前塊左側的空間塊的偏移)Lfar: (-1, nPbH * 5 / 2), Lmid (-1, nPbH/2) (Note: the offsets of the spatial blocks to the left of the current block)

如果3個參考幀間預測塊可用,則按照與BMS軟體實現方式相同的方式計算上方塊、左側塊和時間塊的運動向量的平均值。If 3 reference inter prediction blocks are available, the average of the motion vectors of the above block, the left block and the temporal block is calculated in the same way as in the BMS software implementation.

mvLX[0] = ((mvLX_A[0] + mvLX_L[0] + mvLX_C[0]) * 43) / 128mvLX[0] = ((mvLX_A[0] + mvLX_L[0] + mvLX_C[0]) * 43) / 128

mvLX[1] = ((mvLX_A[1] + mvLX_L[1] + mvLX_C[1]) * 43) / 128mvLX[1] = ((mvLX_A[1] + mvLX_L[1] + mvLX_C[1]) * 43) / 128

如果僅有兩個或一個幀間預測塊可用,則使用兩個的平均值或者僅使用一個mv。If only two or one inter prediction block is available, the average of the two is used or only one mv is used.
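The averaging above, including the multiply-shift approximation of the division by 3, can be sketched as follows (illustrative; for the integer ranges involved, (x * 43) >> 7 approximates x / 3):

```python
def k0161_average(mv_a, mv_l, mv_c):
    """Average the above (A), left (L) and temporal (C) MVs per component.
    With three MVs available, (sum * 43) // 128 approximates sum / 3
    without a division; with two, a plain /2 average; with one, copy it."""
    avail = [mv for mv in (mv_a, mv_l, mv_c) if mv is not None]
    if not avail:
        return None
    if len(avail) == 3:
        return tuple((sum(c) * 43) // 128 for c in zip(*avail))
    if len(avail) == 2:
        return tuple(sum(c) // 2 for c in zip(*avail))
    return avail[0]

print(k0161_average((4, 8), (4, 8), (4, 8)))  # (4, 8): 12*43//128 == 4
```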

2.3.8 JVET-K01352.3.8 JVET-K0135

為了生成平滑的細粒度運動場,圖28給出了平面運動向量預測過程的簡要描述。In order to generate a smooth, fine-granularity motion field, Figure 28 gives a brief description of the planar motion vector prediction process.

通過如下在4×4塊的基礎上對水準和垂直線性插值求平均,來實現平面運動向量預測。Planar motion vector prediction is achieved by averaging horizontal and vertical linear interpolation on a 4×4 block basis as follows.

W和H表示塊的寬度和高度。(x,y)是當前子塊相對於左上角子塊的座標。所有距離由像素距離除以4表示。P(x,y)是當前子塊的運動向量。W and H denote the width and the height of the block. (x, y) is the coordinate of the current sub-block relative to the top-left sub-block. All distances are expressed as pixel distances divided by 4. P(x, y) is the motion vector of the current sub-block.

位置(x,y)的水準預測Ph(x,y)和垂直預測Pv(x,y)計算如下:The horizontal prediction Ph(x, y) and the vertical prediction Pv(x, y) at position (x, y) are calculated as follows:

Ph(x,y) = (W−1−x)·L(−1,y) + (x+1)·R(W,y)

Pv(x,y) = (H−1−y)·A(x,−1) + (y+1)·B(x,H)

P(x,y) = (H·Ph(x,y) + W·Pv(x,y) + H·W) / (2·H·W)

其中L(−1,y)和R(W,y)是當前塊左側和右側的4x4塊的運動向量。A(x,−1)和B(x,H)是當前塊上方和底部的4x4塊的運動向量。where L(−1, y) and R(W, y) are the motion vectors of the 4x4 blocks to the left and right of the current block, and A(x, −1) and B(x, H) are the motion vectors of the 4x4 blocks above and below the current block.

從當前塊的空間相鄰塊推導出左側列和上方行相鄰塊的參考運動資訊。The reference motion information of the left column and upper row neighboring blocks is derived from the spatial neighboring blocks of the current block.

右側列和底部行相鄰塊的參考運動資訊如下推導出。The reference motion information for adjacent blocks in the right column and bottom row is derived as follows.

推導右下方時間相鄰4×4塊的運動資訊Derive the motion information of the temporally adjacent 4×4 blocks in the lower right corner

使用推導出的右下方相鄰4×4塊的運動資訊以及右上方相鄰4×4塊的運動資訊,來計算右側列相鄰4×4塊的運動向量,如公式K1中所描述。Using the derived motion information of the adjacent 4×4 block in the lower right and the motion information of the adjacent 4×4 block in the upper right, the motion vector of the adjacent 4×4 block in the right column is calculated, as described in Equation K1.

使用推導出的右下方相鄰4×4塊的運動資訊以及左下方相鄰4×4塊的運動資訊,來計算底部行相鄰4×4塊的運動向量,如公式K2中所描述。The motion vector of the bottom row adjacent 4×4 block is calculated using the derived motion information of the lower right adjacent 4×4 block and the motion information of the lower left adjacent 4×4 block, as described in Equation K2.

R(W,y) = ((H−y−1)·AR + (y+1)·BR)/H    公式K1 (Equation K1)

B(x,H) = ((W−x−1)·BL + (x+1)·BR)/W    公式K2 (Equation K2)

其中AR 是右上方空間相鄰4×4塊的運動向量,BR 是右下方時間相鄰4×4塊的運動向量,並且BL 是左下方空間相鄰4×4塊的運動向量。where AR is the motion vector of the spatially adjacent 4×4 block in the upper right, BR is the motion vector of the temporally adjacent 4×4 block in the lower right, and BL is the motion vector of the spatially adjacent 4×4 block in the lower left.
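Equations K1 and K2 can be sketched per MV component as follows (integer division is an illustrative simplification):

```python
def right_column_mv(ar, br, y, h):
    """K1: R(W, y) = ((H - y - 1) * AR + (y + 1) * BR) / H, interpolating
    between the top-right (AR) and bottom-right (BR) motion vectors."""
    return tuple(((h - y - 1) * a + (y + 1) * b) // h for a, b in zip(ar, br))

def bottom_row_mv(bl, br, x, w):
    """K2: B(x, H) = ((W - x - 1) * BL + (x + 1) * BR) / W, interpolating
    between the bottom-left (BL) and bottom-right (BR) motion vectors."""
    return tuple(((w - x - 1) * a + (x + 1) * b) // w for a, b in zip(bl, br))

# half-way down an 8-unit-high block, between AR=(8,0) and BR=(0,8):
print(right_column_mv((8, 0), (0, 8), 3, 8))  # (4, 4)
```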

對於每個清單從相鄰塊獲得的運動資訊被縮放到給定清單的第一參考圖片。For each list the motion information obtained from adjacent blocks is scaled to the first reference picture for the given list.

3.3. 通過本文公開的實施例解決的問題的示例Examples of problems solved by embodiments disclosed herein

發明人先前已提出基於查閱資料表的運動向量預測技術,其使用存儲有至少一個運動候選的一個或多個查閱資料表以預測塊的運動資訊,其可以在各種實施例中實現以提供具有更高編碼效率的視頻編碼。每個LUT可以包括一個或多個運動候選,每個運動候選與對應的運動資訊相關聯。運動候選的運動資訊可包括預測方向、參考索引/圖片、運動向量、LIC標誌、仿射標誌、運動向量差(MVD)精度和/或MVD值。運動資訊還可以包括塊位置資訊,以指示運動資訊來自哪裡。The inventors have previously proposed a lookup-table-based motion vector prediction technique, which uses one or more lookup tables storing at least one motion candidate to predict the motion information of a block, and which can be implemented in various embodiments to provide video coding with higher coding efficiency. Each LUT may include one or more motion candidates, each associated with corresponding motion information. The motion information of a motion candidate may include the prediction direction, reference index/picture, motion vector, LIC flag, affine flag, motion vector difference (MVD) precision and/or MVD value. The motion information may also include block position information to indicate where the motion information comes from.

基於所公開的技術的基於LUT的運動向量預測可以增強現有和未來的視頻編碼標準,其在以下針對各種實現方式所描述的示例中闡明。因為LUT允許基於歷史資料(例如,已經被處理的塊)執行編碼/解碼過程,所以基於LUT的運動向量預測也可以被稱為基於歷史的運動向量預測(HMVP)方法。在基於LUT的運動向量預測方法中,在編碼/解碼過程期間保持具有來自先前被編碼的塊的運動資訊的一個或多個表。存儲在LUT中的這些運動候選被命名為HMVP候選。在一個塊的編碼/解碼期間,可以將LUT中的相關聯的運動資訊添加到運動候選清單(例如,Merge/ AMVP候選列表),並且在對一個塊進行編碼/解碼之後,可以更新LUT。然後使用更新後的LUT來編碼後續塊。也就是說,LUT中的運動候選的更新基於塊的編碼/解碼順序。以下示例應被視為解釋一般概念的示例。不應以狹窄的方式解釋這些示例。此外,這些實例可以以任何方式組合。LUT-based motion vector prediction based on the disclosed techniques can enhance existing and future video coding standards, which is illustrated in the examples described below for various implementations. Because LUT allows the encoding/decoding process to be performed based on historical information (eg, blocks that have been processed), LUT-based motion vector prediction may also be called a history-based motion vector prediction (HMVP) method. In LUT-based motion vector prediction methods, one or more tables with motion information from previously encoded blocks are maintained during the encoding/decoding process. These motion candidates stored in the LUT are named HMVP candidates. During encoding/decoding of a block, associated motion information in the LUT can be added to a motion candidate list (eg, Merge/AMVP candidate list), and after encoding/decoding of a block, the LUT can be updated. The updated LUT is then used to encode subsequent blocks. That is, the motion candidates in the LUT are updated based on the encoding/decoding order of the blocks. The following examples should be considered as examples to explain the general concept. These examples should not be interpreted in a narrow way. Furthermore, these instances can be combined in any way.
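The table maintenance described above (append after each inter-coded block, with a redundancy check and first-in-first-out eviction) can be sketched as follows; the table size and the entry layout are illustrative assumptions, not normative values:

```python
class HmvpTable:
    """Minimal sketch of a history-based MVP (HMVP) table."""

    def __init__(self, max_size=6):
        self.max_size = max_size
        self.candidates = []  # each entry e.g. (mvx, mvy, ref_idx)

    def update(self, cand):
        """Called after a block is encoded/decoded with motion info `cand`."""
        if cand in self.candidates:        # redundancy check:
            self.candidates.remove(cand)   # move the duplicate to newest slot
        elif len(self.candidates) == self.max_size:
            self.candidates.pop(0)         # FIFO: drop the oldest candidate
        self.candidates.append(cand)

table = HmvpTable(max_size=2)
for mv in [(1, 0, 0), (2, 0, 0), (1, 0, 0)]:
    table.update(mv)
print(table.candidates)  # [(2, 0, 0), (1, 0, 0)]
```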

一些實施例可以使用存儲有至少一個運動候選的一個或多個查閱資料表,以預測塊的運動資訊。實施例可以使用運動候選來指示存儲在查閱資料表中的一組運動資訊。對於傳統的AMVP或Merge模式,實施例可以使用AMVP或Merge候選來存儲運動資訊。Some embodiments may use one or more lookup tables storing at least one motion candidate to predict motion information for a block. Embodiments may use motion candidates to indicate a set of motion information stored in a lookup table. For traditional AMVP or Merge modes, embodiments may use AMVP or Merge candidates to store motion information.

儘管當前的基於LUT的運動向量預測技術通過使用歷史資料克服了HEVC的缺點,但是,僅考慮來自空間相鄰塊的資訊。Although current LUT-based motion vector prediction techniques overcome the shortcomings of HEVC by using historical data, only information from spatially adjacent blocks is considered.

當將來自LUT的運動候選用於AMVP或Merge列表構建過程時,直接繼承它而不做任何改變。When a motion candidate from a LUT is used in the AMVP or Merge list construction process, it is inherited directly without any modification.

JVET-K0161的設計有益於編碼性能。然而,它需要額外推導TMVP,這增加了計算複雜性和記憶體頻寬。The design of JVET-K0161 is beneficial to coding performance. However, it requires an additional TMVP derivation, which increases computational complexity and memory bandwidth.

4.4. 一些示例 Some Examples

以下示例應被視為解釋一般概念的示例。不應以狹窄的方式解釋這些示例。此外,這些實例可以以任何方式組合。The following examples should be considered as examples to explain the general concept. These examples should not be interpreted in a narrow way. Furthermore, these instances can be combined in any way.

使用當前公開的技術的一些實施例可以聯合使用來自LUT的運動候選和來自時間相鄰塊的運動資訊。此外,還提出了JVET-K0161的複雜性降低。Some embodiments using currently disclosed techniques may jointly use motion candidates from LUTs and motion information from temporal neighboring blocks. Additionally, complexity reduction of JVET-K0161 is proposed.

利用來自LUT的運動候選 Exploiting Motion Candidates from the LUT

1. 提出通過利用來自LUT的運動候選來構造新的AMVP /Merge候選。1. Propose to construct new AMVP/Merge candidates by utilizing motion candidates from LUT.

a. 在一個示例中,可以通過對來自LUT的運動候選的運動向量添加/減去偏移(或多個偏移),推導出新的候選。a. In one example, new candidates can be derived by adding/subtracting an offset (or offsets) to the motion vector of the motion candidate from the LUT.

b. 在一個示例中,可以通過對來自LUT的所選運動候選的運動向量求平均,推導出新的候選。b. In one example, new candidates can be derived by averaging the motion vectors of selected motion candidates from the LUT.

i. 在一個實施例中,可以在沒有除法運算的情況下近似地實現平均。例如,MVa、MVb和MVc可以被平均為(MVa+MVb+MVc)×⌊2^N/3⌋/2^N或(MVa+MVb+MVc)×⌈2^N/3⌉/2^N。例如,當N = 7時,平均值是(MVa+MVb+MVc)×42/128或(MVa+MVb+MVc)×43/128。請注意,可以預先計算⌊2^N/3⌋或⌈2^N/3⌉並將其存儲在查閱資料表中。i. In one embodiment, the averaging can be implemented approximately without a division operation. For example, MVa, MVb and MVc can be averaged as (MVa+MVb+MVc)×⌊2^N/3⌋/2^N or (MVa+MVb+MVc)×⌈2^N/3⌉/2^N. For example, when N = 7, the average is (MVa+MVb+MVc)×42/128 or (MVa+MVb+MVc)×43/128. Note that ⌊2^N/3⌋ or ⌈2^N/3⌉ can be precomputed and stored in a lookup table.
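The division-free averaging of item i can be sketched as an integer multiply-and-shift; the factor (42 or 43 when N = 7, i.e. the floor or ceiling of 2^N/3) would in practice come from a precomputed lookup table. This sketch is illustrative only:

```python
def average3_no_div(mva, mvb, mvc, n=7, round_up=True):
    """Approximate (mva + mvb + mvc) / 3 without a division.

    The factor floor(2**n / 3) (42 for n = 7) or ceil(2**n / 3)
    (43 for n = 7) is assumed precomputed and stored in a lookup
    table in a real implementation; it is computed inline here.
    """
    factor = (2**n + 2) // 3 if round_up else 2**n // 3
    avg_x = (mva[0] + mvb[0] + mvc[0]) * factor >> n
    avg_y = (mva[1] + mvb[1] + mvc[1]) * factor >> n
    return (avg_x, avg_y)
```

For three identical vectors (1, 1) the rounded-up factor 43/128 recovers (1, 1), while the floored factor 42/128 rounds the result down — the choice between the two variants is exactly the ⌊·⌋ versus ⌈·⌉ alternative named in the text.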

ii. 在一個示例中,僅選擇具有相同參考圖片(在兩個預測方向上)的運動向量。ii. In one example, only motion vectors with the same reference picture (in both prediction directions) are selected.

iii. 在一個示例中,預先確定每個預測方向上的參考圖片,並且如果必要,將運動向量縮放到預先確定的參考圖片。iii. In one example, the reference picture in each prediction direction is predetermined, and if necessary, the motion vector is scaled to the predetermined reference picture.

1. 在一個示例中,參考圖片清單X中的第一條目(X = 0或1)被選擇作為參考圖片。1. In one example, the first entry in the reference picture list X (X = 0 or 1) is selected as the reference picture.

2. 可替代地,對於每個預測方向,選擇LUT中最頻繁使用的參考圖片作為參考圖片。2. Alternatively, for each prediction direction, select the most frequently used reference picture in the LUT as the reference picture.

c. 在一個示例中,對於每個預測方向,首先選擇具有與預先確定的參考圖片相同的參考圖片的運動向量,然後選擇其他運動向量。c. In one example, for each prediction direction, first select a motion vector with the same reference picture as the predetermined reference picture, and then select other motion vectors.
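One conventional way to realize the scaling of item iii is HEVC-style scaling by the ratio of picture-order-count (POC) distances. The sketch below assumes simple ratio scaling is enough for illustration; a production codec additionally clips the intermediate values, and the function name is an assumption:

```python
def scale_mv(mv, cur_poc, ref_poc, target_ref_poc):
    """Scale mv (pointing to ref_poc) so it points to target_ref_poc.

    td is the POC distance of the original reference, tb that of the
    predetermined target reference; the vector is scaled by tb / td.
    """
    td = cur_poc - ref_poc
    tb = cur_poc - target_ref_poc
    if td == tb or td == 0:
        return mv  # already points to the target (or degenerate case)
    return (mv[0] * tb // td, mv[1] * tb // td)
```

With this helper, motion vectors gathered from the LUT for one prediction direction can all be brought to the predetermined reference picture before they are averaged.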

2. 提出通過來自LUT的一個或多個運動候選和來自時間相鄰塊的運動資訊的函數來構造新的AMVP /Merge候選。2. Propose to construct new AMVP/Merge candidates as a function of one or more motion candidates from LUT and motion information from temporal adjacent blocks.

a. 在一個示例中,類似於STMVP或JVET-K0161,可以通過對來自LUT和TMVP的運動候選求平均,推導出新的候選。a. In one example, similar to STMVP or JVET-K0161, new candidates can be derived by averaging the motion candidates from LUT and TMVP.

b. 在一個示例中,上述塊(例如,圖27中的Amid和Afar)可以被來自LUT的一個或多個候選替換。可替代地,此外,其他過程可以保持不變,就像在JVET-K0161中已實現的那樣。b. In one example, the above blocks (eg, Amid and Afar in Figure 27) can be replaced by one or more candidates from the LUT. Alternatively, furthermore, other procedures can remain unchanged, as already implemented in JVET-K0161.

3. 提出通過來自LUT的一個或多個運動候選、來自空間相鄰塊和/或空間非緊鄰的相鄰塊的AMVP/Merge候選、以及來自時間塊的運動資訊的函數,來構造新的AMVP/Merge候選。3. Propose to construct new AMVP/Merge candidates as a function of one or more motion candidates from the LUT, AMVP/Merge candidates from spatially adjacent blocks and/or spatially non-immediately adjacent blocks, and motion information from temporal blocks.

a. 在一個示例中,上述塊中的一個或多個(例如,圖27中的Amid和Afar)可以被來自LUT的候選替換。可替代地,此外,其他過程可以保持不變,就像在JVET-K0161中已實現的那樣。a. In one example, one or more of the above blocks (eg, Amid and Afar in Figure 27) can be replaced by candidates from the LUT. Alternatively, furthermore, other procedures can remain unchanged, as already implemented in JVET-K0161.

b. 在一個示例中,左側塊中的一個或多個(例如,圖27中的Amid和Afar)可以被來自LUT的候選替換。可替代地,此外,其他過程可以保持不變,就像在JVET-K0161中已實現的那樣。b. In one example, one or more of the left blocks (e.g., Amid and Afar in Figure 27) can be replaced by candidates from the LUT. Alternatively, furthermore, other procedures can remain unchanged, as already implemented in JVET-K0161.

4. 提出當將塊的運動資訊插入LUT時,是否對LUT中的現有條目進行修剪可以取決於塊的編碼模式。4. It is proposed that when a block's motion information is inserted into a LUT, whether existing entries in the LUT are pruned can depend on the coding mode of the block.

a. 在一個示例中,如果以Merge模式對塊編碼,則不執行修剪。a. In one example, if a block is encoded in Merge mode, no pruning is performed.

b. 在一個示例中,如果以AMVP模式對塊編碼,則不執行修剪。b. In one example, if the block is encoded in AMVP mode, no pruning is performed.

c. 在一個示例中,如果以AMVP /Merge模式對塊編碼,則僅對LUT的最新M個條目進行修剪。c. In one example, if the block is encoded in AMVP/Merge mode, only the latest M entries of the LUT are pruned.

d. 在一個示例中,當以子塊模式(例如,仿射或ATMVP)對塊編碼時,始終禁用修剪。d. In one example, pruning is always disabled when encoding blocks in sub-block mode (e.g., affine or ATMVP).

5. 提出將來自時間塊的運動資訊添加到LUT。5. Propose adding motion information from time blocks to the LUT.

a. 在一個示例中,運動資訊可以來自共位的塊。a. In one example, the motion information may come from a co-located block.

b. 在一個示例中,運動資訊可以來自不同參考圖片中的一個或多個塊。b. In one example, the motion information may come from one or more blocks in different reference pictures.

與STMVP相關 STMVP Related

1. 提出始終使用空間Merge候選推導出新的Merge候選,而不考慮TMVP候選。1. It is proposed to always use spatial Merge candidates to derive new Merge candidates, regardless of TMVP candidates.

a. 在一個示例中,可以利用兩個運動Merge候選的平均值。a. In one example, the average of two motion Merge candidates can be used.

b. 在一個示例中,可以聯合使用空間Merge候選和來自LUT的運動候選推導出新的候選。b. In one example, new candidates can be derived by jointly using spatial Merge candidates and motion candidates from the LUT.

2. 提出可以利用非緊鄰塊(其不是右或左相鄰塊)推導出STMVP候選。2. Propose that STMVP candidates can be derived using non-immediately adjacent blocks (which are not right or left adjacent blocks).

a. 在一個示例中,用於STMVP候選推導的上方塊保持不變,而使用的左側塊從相鄰塊改變為非緊鄰塊。a. In one example, the upper block used for STMVP candidate derivation remains unchanged, while the left block used changes from adjacent blocks to non-immediate blocks.

b. 在一個示例中,用於STMVP候選推導的左側塊保持不變,而所使用的上方塊從相鄰塊改變為非緊鄰塊。b. In one example, the left block used for STMVP candidate derivation remains unchanged, while the upper block used changes from adjacent blocks to non-immediate blocks.

c. 在一個示例中,可以聯合使用非緊鄰塊的候選和來自LUT的運動候選推導出新的候選。c. In one example, new candidates can be derived using candidates from non-immediate blocks jointly with motion candidates from the LUT.

3. 提出始終使用空間Merge候選推導出新的Merge候選,而不考慮TMVP候選。3. Proposed to always use spatial Merge candidates to derive new Merge candidates, regardless of TMVP candidates.

a. 在一個示例中,可以利用兩個運動Merge候選的平均值。a. In one example, the average of two motion Merge candidates can be used.

b. 可替代地,可以利用來自與當前塊相鄰或不相鄰的不同位置的兩個、三個或更多MV的平均值。b. Alternatively, the average of two, three or more MVs from different positions adjacent or not adjacent to the current block can be utilized.

i. 在一個實施例中,MV僅可以從當前LCU(也稱為CTU)中的位置獲取。i. In one embodiment, the MV can only be obtained from the location in the current LCU (also called CTU).

ii. 在一個實施例中,MV僅可以從當前LCU行中的位置獲取。ii. In one embodiment, the MV can only be obtained from the position in the current LCU row.

iii. 在一個實施例中,MV僅可以從當前LCU行中或挨著當前LCU行的位置獲取。圖29中示出了示例。塊A、B、C、D、E和F挨著當前LCU行。iii. In one embodiment, MVs can only be obtained from positions in or next to the current LCU row. An example is shown in Figure 29. Blocks A, B, C, D, E and F are next to the current LCU row.

iv. 在一個實施例中,MV僅可以從當前LCU行中或挨著當前LCU行但不在左上角相鄰塊的左側的位置獲取。圖29中示出了示例。塊T是左上角相鄰塊。塊B、C、D、E和F挨著當前LCU行,但不在左上角相鄰塊的左側。iv. In one embodiment, MVs can only be obtained from positions in or next to the current LCU row, but not to the left of the top-left neighboring block. An example is shown in Figure 29. Block T is the top-left neighboring block. Blocks B, C, D, E and F are next to the current LCU row, but not to the left of the top-left neighboring block.

c. 在一個實施例中,可以聯合使用空間Merge候選和來自LUT的運動候選推導出新的候選。c. In one embodiment, new candidates can be derived by jointly using spatial Merge candidates and motion candidates from the LUT.

4. 提出圖28中用於平面運動預測的BR塊的MV不是從時間MV預測獲取的,而是從LUT的一個條目獲取的。4. It is proposed that the MV of the BR block used for plane motion prediction in Figure 28 is not obtained from temporal MV prediction, but from an entry of the LUT.

5. 提出來自LUT的運動候選可以與其他類型的Merge/ AMVP候選(例如,空間Merge/ AMVP候選、時間Merge/ AMVP候選、默認運動候選)聯合使用以推導出新的候選。5. It is proposed that motion candidates from LUTs can be used jointly with other types of Merge/AMVP candidates (e.g., spatial Merge/AMVP candidates, temporal Merge/AMVP candidates, default motion candidates) to derive new candidates.

在本示例和本專利檔中公開的其他示例的各種實施方式中,修剪可以包括:a)將運動資訊與現有條目進行唯一性比較,以及b)如果唯一,則將運動資訊添加到清單,或者c)如果不唯一,則要麼c1)不添加運動資訊,要麼c2)添加運動資訊並刪除匹配的現有條目。在一些實現方式中,當將運動候選從表添加到候選列表時,不調用修剪操作。In various implementations of this example and the other examples disclosed in this patent document, pruning may include: a) comparing the motion information with existing entries for uniqueness, and b) if unique, adding the motion information to the list, or c) if not unique, either c1) not adding the motion information, or c2) adding the motion information and removing the matching existing entry. In some implementations, the pruning operation is not invoked when adding motion candidates from a table to the candidate list.
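The pruning alternatives a)–c2) above can be sketched as follows; the function name and the `mode` switch are illustrative assumptions, not terminology from the document:

```python
def insert_with_pruning(table, motion_info, mode="c2"):
    """Insert motion_info into table (a list, oldest entry first).

    a) the entry is compared against existing entries for uniqueness;
    b) if unique, it is added;
    c1) if a duplicate and mode is "c1", it is simply not added;
    c2) if a duplicate and mode is "c2", the matching old entry is
        removed and the new copy is appended as the most recent one.
    """
    if motion_info not in table:
        table.append(motion_info)      # b) unique -> add to the list
    elif mode == "c2":
        table.remove(motion_info)      # c2) drop the matching entry
        table.append(motion_info)      #     and re-add as the newest
    # mode "c1": duplicate is simply dropped
    return table
```

Skipping the call entirely corresponds to the final sentence above, where no pruning is invoked when candidates are copied from the table into the candidate list.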

圖30是圖示可以用於實現本公開技術的各個部分的電腦系統或其它控制設備3000的結構的示例的示意圖。在圖30中,電腦系統3000包括通過內部連接3025連接的一個或多個處理器3005和記憶體3010。內部連接3025可以表示由適當的橋、適配器或控制器連接的任何一條或多條單獨的物理匯流排、點對點連接或兩者。因此,內部連接3025可以包括例如系統匯流排、周邊元件連接(PCI)匯流排、超傳輸或工業標準架構(ISA)匯流排、小型電腦系統介面(SCSI)匯流排、通用序列匯流排(USB)、IIC(I2C)匯流排或電氣與電子工程師協會(IEEE)標準1394匯流排(有時被稱為“火線”)。FIG. 30 is a schematic diagram illustrating an example of the structure of a computer system or other control device 3000 that can be used to implement various portions of the disclosed technology. In FIG. 30, the computer system 3000 includes one or more processors 3005 and memory 3010 connected via an internal connection 3025. The internal connection 3025 may represent any one or more separate physical buses, point-to-point connections, or both, connected by appropriate bridges, adapters, or controllers. Thus, the internal connection 3025 may include, for example, a system bus, a Peripheral Component Interconnect (PCI) bus, a HyperTransport or Industry Standard Architecture (ISA) bus, a Small Computer System Interface (SCSI) bus, a Universal Serial Bus (USB), an IIC (I2C) bus, or an Institute of Electrical and Electronics Engineers (IEEE) standard 1394 bus (sometimes referred to as "FireWire").

處理器3005可以包括中央處理器(CPU),來控制例如主機的整體操作。在一些實施例中,處理器3005通過執行存儲在記憶體3010中的軟體或固件來實現這一點。處理器3005可以是或可以包括一個或多個可程式設計通用或專用微處理器、數位訊號處理器(DSP)、可程式設計控制器、專用積體電路(ASIC)、可程式設計邏輯器件(PLD)等,或這些器件的組合。Processor 3005 may include a central processing unit (CPU) to control, for example, the overall operation of the host computer. In some embodiments, processor 3005 does this by executing software or firmware stored in memory 3010 . Processor 3005 may be or may include one or more programmable general or special purpose microprocessors, digital signal processors (DSPs), programmable controllers, application specific integrated circuits (ASICs), programmable logic devices ( PLD), etc., or a combination of these devices.

記憶體3010可以是或包括電腦系統的主記憶體。記憶體3010表示任何適當形式的隨機存取記憶體(RAM)、唯讀記憶體(ROM)、快閃記憶體等,或這些設備的組合。在使用中,記憶體3010除其它外可包含一組機器指令,當處理器3005執行該指令時,使處理器3005執行操作以實現本公開技術的實施例。Memory 3010 may be or include the main memory of the computer system. Memory 3010 represents any suitable form of random access memory (RAM), read only memory (ROM), flash memory, etc., or a combination of these devices. In use, memory 3010 may contain, among other things, a set of machine instructions that, when executed by processor 3005, cause processor 3005 to perform operations to implement embodiments of the disclosed technology.

通過內部連接3025連接到處理器3005的還有(可選的)網路介面卡3015。網路介面卡3015為電腦系統3000提供與遠端設備(諸如存儲客戶機和/或其它存儲伺服器)通信的能力,並且可以是例如乙太網適配器或光纖通道適配器。Also connected to processor 3005 via internal connection 3025 is an (optional) network interface card 3015. Network interface card 3015 provides computer system 3000 with the ability to communicate with remote devices, such as storage clients and/or other storage servers, and may be, for example, an Ethernet adapter or a Fiber Channel adapter.

圖31示出了可以用於實施本公開技術的各個部分的移動設備3100的示例實施例的框圖。移動設備3100可以是筆記型電腦、智慧手機、平板電腦、攝像機或其它能夠處理視頻的設備。移動設備3100包括處理器或控制器3101來處理資料,以及與處理器3101通信的記憶體3102來存儲和/或緩衝資料。例如,處理器3101可以包括中央處理器(CPU)或微控制器單元(MCU)。在一些實現中,處理器3101可以包括現場可程式設計閘陣列(FPGA)。在一些實現中,移動設備3100包括或與圖形處理單元(GPU)、視頻處理單元(VPU)和/或無線通訊單元通信,以實現智慧手機設備的各種視覺和/或通信資料處理功能。例如,記憶體3102可以包括並存儲處理器可執行代碼,當處理器3101執行該代碼時,將移動設備3100配置為執行各種操作,例如接收資訊、命令和/或資料、處理資訊和資料,以及將處理過的資訊/資料發送或提供給另一個資料設備,諸如執行器或外部顯示器。為了支援移動設備3100的各種功能,記憶體3102可以存儲資訊和資料,諸如指令、軟體、值、圖像以及處理器3101處理或引用的其它資料。例如,可以使用各種類型的隨機存取記憶體(RAM)設備、唯讀記憶體(ROM)設備、快閃記憶體設備和其它合適的存儲介質來實現記憶體3102的存儲功能。在一些實現中,移動設備3100包括輸入/輸出(I/O)單元3103,來將處理器3101和/或記憶體3102與其它模組、單元或設備進行介面。例如,I/O單元3103可以與處理器3101和記憶體3102進行介面,以利用與典型資料通信標準相容的各種無線介面,例如,在雲中的一台或多台電腦和使用者設備之間。在一些實現中,移動設備3100可以通過I/O單元3103使用有線連接與其它設備進行介面。移動設備3100還可以與其它外部介面(例如資料記憶體)和/或可視或音訊顯示裝置3104連接,以檢索和傳輸可由處理器處理、由記憶體存儲或由顯示裝置3104或外部設備的輸出單元上顯示的資料和資訊。例如,顯示裝置3104可以根據所公開的技術顯示包括基於該塊是否是使用運動補償演算法編碼的而應用幀內塊複製的塊(CU、PU或TU)的視頻幀。31 illustrates a block diagram of an example embodiment of a mobile device 3100 that may be used to implement various portions of the disclosed technology. Mobile device 3100 may be a laptop, smartphone, tablet, video camera, or other device capable of processing video. Mobile device 3100 includes a processor or controller 3101 to process data, and a memory 3102 in communication with processor 3101 to store and/or buffer data. For example, processor 3101 may include a central processing unit (CPU) or a microcontroller unit (MCU). In some implementations, processor 3101 may include a field programmable gate array (FPGA). In some implementations, the mobile device 3100 includes or communicates with a graphics processing unit (GPU), a video processing unit (VPU), and/or a wireless communication unit to implement various visual and/or communication data processing functions of the smartphone device. 
For example, the memory 3102 may include and store processor-executable code that, when executed by the processor 3101, configures the mobile device 3100 to perform various operations, such as receiving information, commands, and/or data, processing information and data, and sending or providing processed information/data to another device, such as an actuator or an external display. To support various functions of the mobile device 3100, the memory 3102 can store information and data, such as instructions, software, values, images, and other data processed or referenced by the processor 3101. For example, various types of random access memory (RAM) devices, read-only memory (ROM) devices, flash memory devices, and other suitable storage media can be used to implement the storage functions of the memory 3102. In some implementations, the mobile device 3100 includes an input/output (I/O) unit 3103 to interface the processor 3101 and/or the memory 3102 with other modules, units, or devices. For example, the I/O unit 3103 can interface with the processor 3101 and the memory 3102 to utilize various wireless interfaces compatible with typical data communication standards, for example, between one or more computers in the cloud and a user device. In some implementations, the mobile device 3100 can interface with other devices through the I/O unit 3103 using a wired connection. The mobile device 3100 can also interface with other external interfaces (such as a data memory) and/or a visual or audio display device 3104 to retrieve and transmit data and information that can be processed by the processor, stored by the memory, or displayed on the display device 3104 or an output unit of an external device. For example, the display device 3104 can, in accordance with the disclosed technology, display a video frame that includes a block (CU, PU, or TU) to which intra block copy is applied based on whether the block was encoded using a motion-compensation algorithm.

在一些實施例中,可以實現如本文所述的基於子塊的預測的方法的視頻解碼器裝置可用於視頻解碼。In some embodiments, a video decoder device that can implement methods of sub-block based prediction as described herein can be used for video decoding.

在一些實施例中,可以使用實現在如圖30和圖31所述的硬體平臺上的解碼裝置來實現視頻解碼方法。In some embodiments, the video decoding method may be implemented using a decoding device implemented on the hardware platform as shown in FIG. 30 and FIG. 31 .

在本文件中公開的各種實施例和技術可以在以下示例的列表中描述。Various embodiments and techniques disclosed in this document may be described in the following list of examples.

圖32是根據當前公開的技術的用於視頻處理的示例方法3200的流程圖。方法3200包括,在操作3202,通過平均兩個或更多選擇的運動候選,確定用於視頻處理的新候選。方法3200包括,在操作3204,將所述新候選增加到候選列表。方法3200包括,在操作3206,通過使用所述候選列表中的確定的新候選,執行視頻的第一視頻塊和視頻的位元流表示之間的轉換。32 is a flow diagram of an example method 3200 for video processing in accordance with currently disclosed technology. Method 3200 includes, at operation 3202, determining new candidates for video processing by averaging two or more selected motion candidates. Method 3200 includes, at operation 3204, adding the new candidate to a candidate list. Method 3200 includes, at operation 3206, performing a conversion between a first video block of the video and a bitstream representation of the video using the determined new candidate in the candidate list.

在一些實施例中,所述候選列表是Merge候選列表,以及確定的新候選是Merge候選。In some embodiments, the candidate list is a Merge candidate list, and the determined new candidates are Merge candidates.

在一些實施例中,所述Merge候選列表是幀間預測Merge候選列表或幀內塊複製預測Merge候選列表。In some embodiments, the Merge candidate list is an inter prediction Merge candidate list or an intra block copy prediction Merge candidate list.

在一些實施例中,所述一個或多個表包括從視頻資料中所述第一視頻塊之前處理的在先處理視頻塊推導的運動候選。In some embodiments, the one or more tables include motion candidates derived from previously processed video blocks in the video material that were processed before the first video block.

在一些實施例中,在所述候選列表中不存在可用的空間候選和時間候選。In some embodiments, there are no spatial candidates and no temporal candidates available in the candidate list.

在一些實施例中,所述選擇的運動候選來自一個或多個表。In some embodiments, the selected motion candidates are from one or more tables.

在一些實施例中,在沒有除法運算的情況下實現所述平均。In some embodiments, the averaging is performed without division operations.

在一些實施例中,通過所述選擇的運動候選的運動向量的和與縮放因數的乘法,實現所述平均。In some embodiments, the averaging is achieved by multiplying the sum of the motion vectors of the selected motion candidates by a scaling factor.

在一些實施例中,將所述選擇的運動候選的運動向量的水準分量進行平均以推導新候選的水準分量。In some embodiments, the horizontal components of the motion vectors of the selected motion candidates are averaged to derive the horizontal components of the new candidate.

在一些實施例中,將所述選擇的運動候選的運動向量的垂直分量進行平均以推導新候選的垂直分量。In some embodiments, the vertical components of the motion vectors of the selected motion candidates are averaged to derive the vertical components of the new candidate.

在一些實施例中,所述縮放因數被預先計算並存儲在查閱資料表中。In some embodiments, the scaling factors are pre-computed and stored in a lookup table.

在一些實施例中,僅選擇具有相同參考圖片的運動向量。In some embodiments, only motion vectors with the same reference picture are selected.

在一些實施例中,在兩個預測方向上僅選擇在兩個預測方向上具有相同參考圖片的運動向量。In some embodiments, only motion vectors having the same reference picture in both prediction directions are selected.

在一些實施例中,預先確定每個預測方向上的目標參考圖片,以及將所述運動向量縮放到預先確定的參考圖片。In some embodiments, the target reference picture in each prediction direction is predetermined, and the motion vector is scaled to the predetermined reference picture.

在一些實施例中,選擇參考圖片清單X中的第一條目作為用於參考圖片清單的目標參考圖片,X為0或1。In some embodiments, the first entry in the reference picture list X is selected as the target reference picture for the reference picture list, with X being 0 or 1.

在一些實施例中,對於每個預測方向,選擇表中最常使用的參考圖片作為目標參考圖片。In some embodiments, for each prediction direction, the most commonly used reference picture in the table is selected as the target reference picture.

在一些實施例中,對於每個預測方向,首先選擇具有與預先確定的目標參考圖片相同的參考圖片的運動向量,然後選擇其他運動向量。In some embodiments, for each prediction direction, a motion vector having the same reference picture as the predetermined target reference picture is first selected, and then other motion vectors are selected.
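The target-reference-picture selection and the ordering preference described in the preceding embodiments can be sketched as follows, here using the most-frequently-used-reference alternative for one prediction direction; the function name and data layout are illustrative assumptions:

```python
from collections import Counter

def select_target_and_order(candidates):
    """Pick a target reference picture and order candidate MVs.

    candidates: list of (mv, ref_idx) pairs for one prediction
    direction.  The target is the most frequently used reference
    index among the candidates (one of the alternatives above);
    MVs already using the target come first, the rest follow.
    """
    if not candidates:
        return None, []
    target = Counter(ref for _, ref in candidates).most_common(1)[0][0]
    same = [c for c in candidates if c[1] == target]
    other = [c for c in candidates if c[1] != target]
    return target, same + other
```

The "other" motion vectors at the tail of the ordered list are the ones that would then be scaled to the target reference picture before being used.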

在一些實施例中,來自表的運動候選與運動資訊關聯,所述運動資訊包括以下的至少一種:預測方向、參考圖片索引、運動向量值、強度補償標誌、仿射標誌、運動向量差精度或運動向量差值。In some embodiments, motion candidates from the table are associated with motion information including at least one of the following: prediction direction, reference picture index, motion vector value, intensity compensation flag, affine flag, motion vector difference accuracy, or Motion vector difference.

在一些實施例中,方法3200還包括基於所述轉換更新一個或多個表。In some embodiments, method 3200 further includes updating one or more tables based on the transformation.

在一些實施例中,一個或多個表的更新包括在執行所述轉換後基於視頻的第一視頻塊的運動資訊更新一個或多個表。In some embodiments, the updating of the one or more tables includes updating the one or more tables based on the motion information of the first video block of the video after performing the conversion.

在一些實施例中,方法3200還包括基於更新的表,執行視頻的隨後視頻塊和視頻的位元流表示之間的轉換。In some embodiments, method 3200 further includes performing a conversion between subsequent video blocks of the video and a bitstream representation of the video based on the updated table.

在一些實施例中,所述轉換包括編碼處理和/或解碼處理。In some embodiments, the conversion includes an encoding process and/or a decoding process.

在一些實施例中,視頻編碼裝置可以在為後續視頻重建視頻期間執行本文所述的方法2900和其他方法。In some embodiments, a video encoding device may perform method 2900 and other methods described herein during reconstruction of video for subsequent videos.

在一些實施例中,視頻系統中的裝置可以包括被配置為執行本文描述的方法的處理器。In some embodiments, an apparatus in a video system may include a processor configured to perform the methods described herein.

在一些實施例中,所描述的方法可以體現為存儲在電腦可讀程式介質上的電腦可執行代碼。In some embodiments, the described methods may be embodied as computer-executable code stored on a computer-readable program medium.

圖33是根據當前公開的技術的用於視頻處理的示例方法3300的流程圖。方法3300包括在操作3302,通過使用來自一個或多個表的一個或多個運動候選來確定用於視頻處理的新運動候選,其中表包括一個或多個運動候選,並且每個運動候選是關聯的運動資訊。方法3300包括在操作3304,基於新候選者在視頻塊和視頻塊的編碼表示之間執行轉換。33 is a flow diagram of an example method 3300 for video processing in accordance with currently disclosed technology. Method 3300 includes, at operation 3302, determining new motion candidates for video processing by using one or more motion candidates from one or more tables, wherein the table includes the one or more motion candidates and each motion candidate is associated sports information. Method 3300 includes, at operation 3304, performing a conversion between the video block and a coded representation of the video block based on the new candidate.

在一些實施例中,通過對與來自所述一個或多個表的運動候選相關聯的運動向量加上或減去偏移,來推導出所述新的運動候選。In some embodiments, the new motion candidates are derived by adding or subtracting offsets to motion vectors associated with motion candidates from the one or more tables.

在一些實施例中,確定新的運動候選包括:作為來自一個或多個表的一個或多個運動候選和來自時間相鄰塊的運動資訊的函數,來確定新的運動候選。In some embodiments, determining the new motion candidate includes determining the new motion candidate as a function of one or more motion candidates from one or more tables and motion information from temporally adjacent blocks.

在一些實施例中,確定新運動候選包括:對來自一個或多個表的運動候選和時間運動向量預測器進行平均。In some embodiments, determining new motion candidates includes averaging motion candidates from one or more tables and a temporal motion vector predictor.

在一些實施例中,對所選運動候選進行平均包括與所選運動候選相關聯的運動向量的加權平均或平均。In some embodiments, averaging the selected motion candidates includes a weighted averaging or averaging of motion vectors associated with the selected motion candidates.

在一些實施例中,在沒有除法運算的情況下實現所述平均。In some embodiments, the averaging is performed without division operations.

在一些實施例中,通過來自所述一個或多個表的運動候選的運動向量之和與所述時間運動向量預測器與縮放因數的乘法運算,來實現所述平均。In some embodiments, the averaging is implemented by multiplying the sum of the motion vectors of the motion candidates from the one or more tables and the temporal motion vector predictor by a scaling factor.

在一些實施例中,對來自所述一個或多個表的運動候選的運動向量的水準分量與時間運動向量預測器進行平均,以推導出新的運動候選的水準分量。In some embodiments, horizontal components of motion vectors of motion candidates from the one or more tables are averaged with a temporal motion vector predictor to derive horizontal components of new motion candidates.

在一些實施例中,對所選擇的水準分量進行平均包括與所選運動候選相關聯的水準分量的加權平均或平均。In some embodiments, averaging the selected level components includes a weighted average or average of the level components associated with the selected motion candidate.

在一些實施例中,對所選擇的垂直分量進行平均包括與所選運動候選相關聯的垂直分量的加權平均或平均。In some embodiments, averaging the selected vertical components includes a weighted average or average of the vertical components associated with the selected motion candidates.

在一些實施例中,作為來自一個或多個表的一個或多個運動候選、來自空間相鄰塊和/或空間非緊鄰的相鄰塊的Merge候選、以及來自時間相鄰塊的運動資訊的函數,來確定新的運動候選。In some embodiments, as one or more motion candidates from one or more tables, Merge candidates from spatial neighboring blocks and/or spatially non-immediate neighboring blocks, and motion information from temporal neighboring blocks. function to determine new motion candidates.

在一些實施例中,確定新候選者包括:作為來自一個或多個表的一個或多個運動候選、來自空間相鄰塊和/或空間非緊鄰的相鄰塊的高級運動向量預測(AMVP)候選的函數、以及來自時間相鄰塊的運動資訊,來確定新的運動候選。In some embodiments, determining the new candidate includes: advanced motion vector prediction (AMVP) as one or more motion candidates from one or more tables, from spatially adjacent blocks, and/or from spatially non-immediate adjacent blocks. Candidate functions, as well as motion information from temporal adjacent blocks, are used to determine new motion candidates.

在一些實施例中,確定新候選者包括:作為來自一個或多個表的一個或多個運動候選、以及高級運動向量預測(AMVP)候選列表中的AMVP候選或Merge候選清單中的Merge候選的函數,來確定新的運動候選。In some embodiments, determining the new candidate includes being one or more motion candidates from one or more tables and an AMVP candidate in an advanced motion vector prediction (AMVP) candidate list or a Merge candidate in a Merge candidate list. function to determine new motion candidates.

在一些實施例中,將所述新的運動候選添加到Merge候選列表。In some embodiments, the new motion candidate is added to the Merge candidate list.

在一些實施例中,將所述新的運動候選添加到AMVP候選列表。In some embodiments, the new motion candidate is added to the AMVP candidate list.

在一些實施例中,一個或多個表中的每一個包括一組運動候選,其中每個運動候選與對應的運動資訊相關聯。In some embodiments, each of the one or more tables includes a set of motion candidates, where each motion candidate is associated with corresponding motion information.

在一些實施例中,運動候選與運動資訊相關聯,所述運動資訊包括以下的至少一種:預測方向、參考圖片索引、運動向量值、強度補償標誌、仿射標誌、運動向量差精度或運動向量差值。In some embodiments, motion candidates are associated with motion information including at least one of: prediction direction, reference picture index, motion vector value, intensity compensation flag, affine flag, motion vector difference accuracy, or motion vector difference.

在一些實施例中,該方法還包括基於所述轉換更新一個或多個表。In some embodiments, the method further includes updating one or more tables based on the transformation.

在一些實施例中,更新一個或多個表包括在執行所述轉換之後基於第一視頻塊的運動資訊來更新一個或多個表。In some embodiments, updating the one or more tables includes updating the one or more tables based on the motion information of the first video block after performing the conversion.

在一些實施例中,該方法還包括基於更新後的表,執行視頻的後續視頻塊與視頻的位元流表示之間的轉換。In some embodiments, the method further includes performing a conversion between subsequent video blocks of the video and a bitstream representation of the video based on the updated table.

圖34是根據當前公開的技術的用於視頻處理的示例方法3400的流程圖。方法3400包括在操作3402,通過始終使用來自當前圖片中的第一視頻塊的多於一個空間相鄰塊的運動資訊,並且不使用來自與當前圖片不同的圖片中的時間塊的運動資訊,來確定用於視頻處理的新候選。方法3400包括在操作3404,通過使用所確定的新候選來執行視頻的當前圖片中的第一視頻塊與該視頻的位元流表示之間的轉換。34 is a flow diagram of an example method 3400 for video processing in accordance with currently disclosed technology. Method 3400 includes, at operation 3402, by always using motion information from more than one spatially neighboring block of a first video block in the current picture and not using motion information from temporal blocks in a different picture than the current picture. Identifying new candidates for video processing. Method 3400 includes, at operation 3404, performing a conversion between a first video block in a current picture of the video and a bitstream representation of the video using the determined new candidate.

在一些實施例中,所確定的新候選被添加到候選列表,所述候選列表包括Merge候選列表或高級運動向量預測(AMVP)候選列表。In some embodiments, the determined new candidates are added to a candidate list, including a Merge candidate list or an Advanced Motion Vector Prediction (AMVP) candidate list.

在一些實施例中,來自多於一個空間相鄰塊的運動資訊是從相對於當前圖片中的第一視頻塊的預定義的空間相鄰塊推導出的候選、或來自一個或多個表的運動候選。In some embodiments, the motion information from more than one spatially neighboring block is a candidate derived from predefined spatially neighboring blocks relative to the first video block in the current picture, or a motion candidate from one or more tables.

在一些實施例中,所述一個或多個表包括從在視頻資料中的第一視頻塊之前處理的先前處理過的視頻塊推導出的運動候選。In some embodiments, the one or more tables include motion candidates derived from previously processed video blocks processed before the first video block in the video material.

在一些實施例中,從相對於當前圖片中的第一視頻塊的預定義的空間相鄰塊推導出的候選是空間Merge候選。In some embodiments, candidates derived from predefined spatial neighboring blocks relative to the first video block in the current picture are spatial Merge candidates.

在一些實施例中,通過對至少兩個空間Merge候選求平均,來推導出所述新候選。In some embodiments, the new candidates are derived by averaging at least two spatial Merge candidates.

在一些實施例中,通過聯合使用空間Merge候選和來自一個或多個表的運動候選,來推導出所述新候選。In some embodiments, the new candidates are derived by jointly using spatial Merge candidates and motion candidates from one or more tables.

在一些實施例中,通過對與從不同位置推導出的候選相關聯的至少兩個運動向量求平均,來推導出所述新候選。In some embodiments, the new candidate is derived by averaging at least two motion vectors associated with candidates derived from different locations.

在一些實施例中,所述不同位置與所述第一視頻塊相鄰。In some embodiments, the different locations are adjacent to the first video block.

在一些實施例中,僅從所述第一視頻塊所屬的當前最大編碼單元中的位置獲取所述運動向量。In some embodiments, the motion vector is obtained only from the position in the current largest coding unit to which the first video block belongs.

在一些實施例中,僅從當前最大編碼單元行中的位置獲取所述運動向量。In some embodiments, the motion vector is obtained only from the position in the current largest coding unit row.

在一些實施例中,僅從當前最大編碼單元行中或挨著當前最大編碼單元行的位置獲取所述運動向量。In some embodiments, the motion vector is obtained only from a position in or next to the current maximum coding unit row.

在一些實施例中,僅從當前最大編碼單元行中或挨著當前最大編碼單元行但不在左上角相鄰塊的左側的位置獲取運動向量。In some embodiments, the motion vector is obtained only from a position in or next to the current maximum coding unit row but not to the left of the upper left neighboring block.

在一些實施例中,用於平面運動預測的右下塊的運動向量不從時間運動向量預測候選中獲取,而是從所述表的一個條目獲取。In some embodiments, the motion vector for the lower right block used for planar motion prediction is not obtained from the temporal motion vector prediction candidates, but is obtained from an entry of the table.

在一些實施例中,通過聯合使用來自一個或多個表的運動候選和其他種類的Merge/ AMVP候選,來推導出所述新候選。In some embodiments, the new candidates are derived by jointly using motion candidates from one or more tables and other kinds of Merge/AMVP candidates.

在一些實施例中, 所述一個或多個表中的運動候選與運動資訊相關聯,所述運動資訊包括以下中的至少一個:預測方向、參考圖片索引、運動向量值、強度補償標誌、仿射標誌、運動向量差精度或運動向量差值。In some embodiments, the motion candidates in the one or more tables are associated with motion information, the motion information including at least one of the following: a prediction direction, a reference picture index, a motion vector value, an intensity compensation flag, an affine flag, a motion vector difference precision, or a motion vector difference value.

在一些實施例中,該方法還包括基於所述轉換更新一個或多個表。In some embodiments, the method further includes updating one or more tables based on the transformation.

在一些實施例中,更新一個或多個表包括在執行轉換之後基於所述第一視頻塊的運動資訊來更新一個或多個表。In some embodiments, updating the one or more tables includes updating the one or more tables based on the motion information of the first video block after performing the conversion.

在一些實施例中,該方法還包括基於更新後的表,執行該視頻的後續視頻塊與該視頻的位元流表示之間的轉換。In some embodiments, the method further includes performing a conversion between subsequent video blocks of the video and a bitstream representation of the video based on the updated table.

在一些實施例中,所述轉換包括編碼過程和/或解碼過程。In some embodiments, the conversion includes an encoding process and/or a decoding process.

圖35是根據當前公開的技術的用於視頻處理的示例方法3500的流程圖。方法3500包括在操作3502,通過使用來自當前圖片中的第一視頻塊的至少一個空間非緊鄰塊的運動資訊、以及從第一視頻塊的空間非緊鄰塊或並非從第一視頻塊的空間非緊鄰塊推導出的其他候選,來確定用於視頻處理的新候選。方法3500包括在操作3504,通過使用所確定的新候選,來執行視頻的第一視頻塊與該視頻的位元流表示之間的轉換。FIG. 35 is a flow diagram of an example method 3500 for video processing in accordance with currently disclosed technology. Method 3500 includes, at operation 3502, determining a new candidate for video processing by using motion information from at least one spatially non-adjacent block of a first video block in a current picture, together with other candidates that are derived from spatially non-adjacent blocks of the first video block or that are not derived from spatially non-adjacent blocks of the first video block. Method 3500 includes, at operation 3504, performing a conversion between the first video block of a video and a bitstream representation of the video by using the determined new candidate.

在一些實施例中,所確定的新候選被添加到候選列表,所述候選列表包括Merge或高級運動向量預測(AMVP)候選列表。In some embodiments, the determined new candidate is added to a candidate list, the candidate list comprising a Merge or advanced motion vector prediction (AMVP) candidate list.

在一些實施例中,來自多於一個空間非緊鄰塊的運動資訊是從相對於當前圖片中的第一視頻塊的預定義的空間非緊鄰塊推導出的候選。In some embodiments, motion information from more than one spatially non-immediate block is a candidate derived from a predefined spatially non-immediate block relative to the first video block in the current picture.

在一些實施例中,從相對於當前圖片中的第一視頻塊的預定義的空間非緊鄰塊推導出的候選是空間-時間運動向量預測(STMVP)候選。In some embodiments, candidates derived from predefined spatially non-immediate blocks relative to the first video block in the current picture are spatial-temporal motion vector prediction (STMVP) candidates.

在一些實施例中,所述視頻塊的非緊鄰塊不是所述第一視頻塊的右相鄰塊或左相鄰塊。In some embodiments, a non-immediate neighbor of the video block is not a right neighbor or a left neighbor of the first video block.

在一些實施例中,所述第一視頻塊的用於STMVP候選推導的上方塊保持不變,而所使用的左側塊從相鄰塊改變為非緊鄰塊。In some embodiments, the upper block of the first video block used for STMVP candidate derivation remains unchanged, while the left block used changes from a neighboring block to a non-immediate block.

在一些實施例中,所述第一視頻塊的用於STMVP候選推導的左側塊保持不變,而所使用的上方塊從相鄰塊改變為非緊鄰塊。In some embodiments, the left block of the first video block used for STMVP candidate derivation remains unchanged, while the upper block used changes from a neighboring block to a non-immediate block.

圖36是根據當前公開的技術的用於視頻處理的示例方法3600的流程圖。方法3600包括在操作3602,通過使用來自當前圖片中的第一視頻塊的一個或多個表的運動資訊和來自不同於當前圖片的圖片中的時間塊的運動資訊,來確定用於視頻處理的新候選。方法3600包括在操作3604,通過使用所確定的新候選,來執行視頻的當前圖片中的第一視頻塊與該視頻的位元流表示之間的轉換。FIG. 36 is a flow diagram of an example method 3600 for video processing in accordance with currently disclosed technology. Method 3600 includes, at operation 3602, determining a new candidate for video processing by using motion information from one or more tables of a first video block in a current picture and motion information from a temporal block in a picture different from the current picture. Method 3600 includes, at operation 3604, performing a conversion between the first video block in the current picture of a video and a bitstream representation of the video by using the determined new candidate.

在一些實施例中,所確定的新候選被添加到候選列表,所述候選列表包括Merge或AMVP候選列表。In some embodiments, the determined new candidate is added to a candidate list, the candidate list comprising a Merge or AMVP candidate list.

在一些實施例中,來自當前圖片中的一個或多個表的運動資訊與從一個或多個表中選擇的一個或多個歷史運動向量預測(HMVP)候選相關聯,並且來自不同於當前圖片的圖片中的時間塊的運動資訊是時間運動候選。In some embodiments, the motion information from the one or more tables in the current picture is associated with one or more historical motion vector prediction (HMVP) candidates selected from the one or more tables, and the motion information from the temporal block in the picture different from the current picture is a temporal motion candidate.

在一些實施例中,通過對一個或多個HMVP候選與一個或多個時間運動候選求平均,來推導出所述新候選。In some embodiments, the new candidates are derived by averaging one or more HMVP candidates with one or more temporal motion candidates.

在一些實施例中,所述一個或多個表包括從在視頻資料中的第一視頻塊之前處理的先前處理過的視頻塊推導出的運動候選。In some embodiments, the one or more tables include motion candidates derived from previously processed video blocks processed before the first video block in the video material.

圖37是根據當前公開的技術的用於視頻處理的示例方法3700的流程圖。方法3700包括在操作3702,通過使用來自第一視頻塊的一個或多個表的運動資訊和來自第一視頻塊的一個或多個空間相鄰塊的運動資訊,來確定用於視頻處理的新候選。方法3700包括在操作3704,通過使用所確定的新候選,來執行視頻的當前圖片中的第一視頻塊與該視頻的位元流表示之間的轉換。FIG. 37 is a flow diagram of an example method 3700 for video processing in accordance with currently disclosed technology. Method 3700 includes, at operation 3702, determining a new candidate for video processing by using motion information from one or more tables of a first video block and motion information from one or more spatially adjacent blocks of the first video block. Method 3700 includes, at operation 3704, performing a conversion between the first video block in a current picture of a video and a bitstream representation of the video by using the determined new candidate.

在一些實施例中,所確定的新候選被添加到候選列表,所述候選列表包括Merge或AMVP候選列表。In some embodiments, the determined new candidate is added to a candidate list, the candidate list comprising a Merge or AMVP candidate list.

在一些實施例中,來自第一視頻塊的一個或多個表的運動資訊與從一個或多個表中選擇的一個或多個歷史運動向量預測(HMVP)候選相關聯,並且來自第一視頻塊的一個或多個空間相鄰塊的運動資訊是從相對於第一視頻塊的預定義的空間塊推導出的候選。In some embodiments, the motion information from the one or more tables of the first video block is associated with one or more historical motion vector prediction (HMVP) candidates selected from the one or more tables, and the motion information from the one or more spatially adjacent blocks of the first video block is a candidate derived from predefined spatial blocks relative to the first video block.

在一些實施例中,從相對於第一視頻塊的預定義的空間塊推導出的候選是空間Merge候選。In some embodiments, the candidates derived from a predefined spatial block relative to the first video block are spatial Merge candidates.

在一些實施例中,通過對一個或多個HMVP候選和一個或多個空間Merge候選求平均,來推導出所述新候選。In some embodiments, the new candidates are derived by averaging one or more HMVP candidates and one or more spatial Merge candidates.

在一些實施例中,所述一個或多個表包括從在視頻資料中的第一視頻塊之前處理的先前處理過的視頻塊推導出的運動候選。In some embodiments, the one or more tables include motion candidates derived from previously processed video blocks processed before the first video block in the video material.

在一些實施例中,來自表的運動候選與運動資訊相關聯,所述運動資訊包括以下中的至少一個:預測方向、參考圖片索引、運動向量值、強度補償標誌、仿射標誌、運動向量差精度或運動向量差值。In some embodiments, the motion candidates from the table are associated with motion information, the motion information including at least one of the following: a prediction direction, a reference picture index, a motion vector value, an intensity compensation flag, an affine flag, a motion vector difference precision, or a motion vector difference value.

在一些實施例中,該方法還包括基於所述轉換更新一個或多個表。In some embodiments, the method further includes updating one or more tables based on the transformation.

在一些實施例中,更新一個或多個表包括在執行所述轉換之後基於當前視頻塊的運動資訊來更新一個或多個表。In some embodiments, updating the one or more tables includes updating the one or more tables based on motion information of the current video block after performing the conversion.

在一些實施例中,該方法還包括基於更新後的表,在所述視頻資料的後續視頻塊與所述視頻資料的位元流表示之間執行轉換。In some embodiments, the method further includes performing a conversion between subsequent video blocks of the video material and a bitstream representation of the video material based on the updated table.

圖38是根據當前公開的技術的用於視頻處理的示例方法3800的流程圖。方法3800包括,在操作3802,保持一組表,其中每個表包括運動候選,並且每個運動候選與對應的運動資訊相關聯;在操作3804,執行第一視頻塊與包括該第一視頻塊的視頻的位元流表示之間的轉換;以及在操作3806,通過基於第一視頻塊的編碼/解碼模式選擇性地對一個或多個表中的現有運動候選進行修剪,來更新該一個或多個表。FIG. 38 is a flow diagram of an example method 3800 for video processing in accordance with currently disclosed technology. Method 3800 includes, at operation 3802, maintaining a set of tables, wherein each table includes motion candidates and each motion candidate is associated with corresponding motion information; at operation 3804, performing a conversion between a first video block and a bitstream representation of a video that includes the first video block; and at operation 3806, updating one or more of the tables by selectively pruning existing motion candidates in the one or more tables based on an encoding/decoding mode of the first video block.

在一些實施例中,基於該組表中的一個或多個表,執行第一視頻塊與包括該第一視頻塊的視頻的位元流表示之間的轉換。In some embodiments, the conversion between the first video block and a bitstream representation of the video including the first video block is performed based on one or more tables in the set of tables.

在一些實施例中,在以Merge模式對第一視頻塊編碼/解碼的情況下,省略修剪。In some embodiments, where the first video block is encoded/decoded in Merge mode, pruning is omitted.

在一些實施例中,在以高級運動向量預測模式對第一視頻塊編碼/解碼的情況下,省略修剪。In some embodiments, pruning is omitted where the first video block is encoded/decoded in advanced motion vector prediction mode.

在一些實施例中,在以Merge模式或高級運動向量預測模式對第一視頻塊編碼/解碼的情況下,對所述表的最新M個條目進行修剪,其中M是預先指定的整數。In some embodiments, where the first video block is encoded/decoded in Merge mode or advanced motion vector prediction mode, the latest M entries of the table are pruned, where M is a pre-specified integer.
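A minimal sketch of restricting the redundancy check to the latest M table entries, as in the embodiment above. The value of M, the list representation of the table, and the equality test are all illustrative assumptions, not normative choices.

```python
# Sketch: prune only among the newest M entries of the table before
# appending the motion information of the just-coded block.

def prune_latest_m(table, motion_info, m=2):
    start = max(0, len(table) - m)
    for i in range(start, len(table)):
        if table[i] == motion_info:   # redundancy found among the latest M
            del table[i]              # drop the duplicate
            break
    table.append(motion_info)         # newest candidate goes last
    return table
```

Limiting the comparison window to M entries bounds the number of checks per update, which is the point of this embodiment.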

在一些實施例中,在以子塊模式對第一視頻塊編碼/解碼的情況下,禁用修剪。In some embodiments, pruning is disabled where the first video block is encoded/decoded in sub-block mode.

在一些實施例中,所述子塊模式包括仿射模式、可選時間運動向量預測模式。In some embodiments, the sub-block modes include an affine mode and an alternative temporal motion vector prediction (ATMVP) mode.

在一些實施例中,所述修剪包括檢查所述表中是否存在冗余的現有運動候選。In some embodiments, the pruning includes checking the table for redundant existing motion candidates.

在一些實施例中,所述修剪還包括:如果所述表中存在冗余的現有運動候選,則將與所述第一視頻塊相關聯的運動資訊插入到所述表中,並且刪除所述表中的所述冗余的現有運動候選。In some embodiments, the pruning further includes inserting motion information associated with the first video block into the table if there are redundant existing motion candidates in the table, and deleting the The redundant existing motion candidates in the table.
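The insert-and-remove-duplicate behaviour just described, combined with a bounded table size, can be sketched as below. The table size and the FIFO eviction rule are assumptions for illustration; the equality test on whole motion-information tuples is likewise a simplification.

```python
# Sketch of a look-up-table (HMVP-style) update: a redundant existing
# candidate is removed and the new motion information is appended; when
# the table is full and no duplicate exists, the oldest entry is evicted.

TABLE_SIZE = 6  # hypothetical maximum number of stored candidates

def update_table(table, motion_info):
    if motion_info in table:          # redundancy check ("pruning")
        table.remove(motion_info)     # delete the redundant existing entry
    elif len(table) >= TABLE_SIZE:
        table.pop(0)                  # FIFO eviction of the oldest entry
    table.append(motion_info)         # the new candidate goes last
    return table
```

Removing the duplicate before appending keeps the table free of identical entries while moving the repeated motion information to the most-recent position.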

在一些實施例中,如果所述表中存在冗余的現有運動候選,則不使用與所述第一視頻塊相關聯的運動資訊來更新所述表。In some embodiments, if there are redundant existing motion candidates in the table, the table is not updated with motion information associated with the first video block.

在一些實施例中,該方法還包括基於更新後的表,執行該視頻的後續視頻塊與該視頻的位元流表示之間的轉換。In some embodiments, the method further includes performing a conversion between subsequent video blocks of the video and a bitstream representation of the video based on the updated table.

圖39是根據當前公開的技術的用於視頻處理的示例方法3900的流程圖。方法3900包括,在操作3902,保持一組表,其中每個表包括運動候選,並且每個運動候選與對應的運動資訊相關聯;在操作3904,執行第一視頻塊與包括該第一視頻塊的視頻的位元流表示之間的轉換;以及在操作3906,更新一個或多個表,以包括來自所述第一視頻塊的一個或多個時間相鄰塊的運動資訊作為新的運動候選。FIG. 39 is a flow diagram of an example method 3900 for video processing in accordance with currently disclosed technology. Method 3900 includes, at operation 3902, maintaining a set of tables, wherein each table includes motion candidates and each motion candidate is associated with corresponding motion information; at operation 3904, performing a conversion between a first video block and a bitstream representation of a video that includes the first video block; and at operation 3906, updating one or more tables to include motion information from one or more temporally adjacent blocks of the first video block as new motion candidates.

在一些實施例中,基於該組表中的一個或多個表,執行第一視頻塊與包括該第一視頻塊的視頻的位元流表示之間的轉換。In some embodiments, the conversion between the first video block and a bitstream representation of the video including the first video block is performed based on one or more tables in the set of tables.

在一些實施例中,所述一個或多個時間相鄰塊是共位的塊。In some embodiments, the one or more temporally adjacent blocks are co-located blocks.

在一些實施例中,所述一個或多個時間相鄰塊包括來自不同參考圖片的一個或多個塊。In some embodiments, the one or more temporal neighboring blocks include one or more blocks from different reference pictures.

在一些實施例中,該方法還包括基於更新後的表,執行所述視頻的後續視頻塊與該視頻的位元流表示之間的轉換。In some embodiments, the method further includes performing a conversion between subsequent video blocks of the video and a bitstream representation of the video based on the updated table.

圖40是根據當前公開的技術的用於更新運動候選表的示例方法4000的流程圖。方法4000包括,在操作4002,基於正被處理的視頻塊的編碼/解碼模式,選擇性地對表中的現有運動候選進行修剪,每個運動候選與對應的運動資訊相關聯;以及在操作4004,更新所述表,以包括視頻塊的運動資訊作為新的運動候選。FIG. 40 is a flow diagram of an example method 4000 for updating a motion candidate table in accordance with currently disclosed technology. Method 4000 includes, at operation 4002, selectively pruning existing motion candidates in a table based on an encoding/decoding mode of a video block being processed, each motion candidate being associated with corresponding motion information; and at operation 4004, updating the table to include motion information of the video block as a new motion candidate.

在一些實施例中,在以Merge模式或高級運動向量預測模式對所述視頻塊編碼/解碼的情況下,對所述表的最新M個條目進行修剪,其中M是預先指定的整數。In some embodiments, where the video block is encoded/decoded in Merge mode or advanced motion vector prediction mode, the latest M entries of the table are pruned, where M is a pre-specified integer.

在一些實施例中,在以子塊模式對所述視頻塊編碼/解碼的情況下,禁用修剪。In some embodiments, pruning is disabled where the video block is encoded/decoded in sub-block mode.

在一些實施例中,所述子塊模式包括仿射模式、可選時間運動向量預測模式。In some embodiments, the sub-block modes include an affine mode and an alternative temporal motion vector prediction (ATMVP) mode.

在一些實施例中,所述修剪包括檢查所述表中是否存在冗余的現有運動候選。In some embodiments, the pruning includes checking the table for redundant existing motion candidates.

在一些實施例中,所述修剪還包括:如果所述表中存在冗餘的運動候選,則將與正被處理的視頻塊相關聯的運動資訊插入到所述表中,並且刪除所述表中的所述冗餘的運動候選。In some embodiments, the pruning further includes: if a redundant motion candidate exists in the table, inserting motion information associated with the video block being processed into the table, and deleting the redundant motion candidate from the table.

在一些實施例中,如果所述表中存在冗余的現有運動候選,則不使用與正被處理的視頻塊相關聯的運動資訊來更新所述表。In some embodiments, if there are redundant existing motion candidates in the table, the table is not updated with motion information associated with the video block being processed.

圖41是根據當前公開的技術的用於更新運動候選表的示例方法4100的流程圖。方法4100包括,在操作4102,保持運動候選表,每個運動候選與對應的運動資訊相關聯;以及在操作4104,更新所述表,以包括來自正被處理的視頻塊的一個或多個時間相鄰塊的運動資訊作為新的運動候選。FIG. 41 is a flow diagram of an example method 4100 for updating a motion candidate table in accordance with currently disclosed technology. Method 4100 includes, at operation 4102, maintaining a table of motion candidates, each motion candidate being associated with corresponding motion information; and at operation 4104, updating the table to include motion information from one or more temporally adjacent blocks of a video block being processed as new motion candidates.

在一些實施例中,所述一個或多個時間相鄰塊是共位的塊。In some embodiments, the one or more temporally adjacent blocks are co-located blocks.

在一些實施例中,所述一個或多個時間相鄰塊包括來自不同參考圖片的一個或多個塊。In some embodiments, the one or more temporal neighboring blocks include one or more blocks from different reference pictures.

在一些實施例中, 運動候選與運動資訊相關聯,所述運動資訊包括以下中的至少一個:預測方向、參考圖片索引、運動向量值、強度補償標誌、仿射標誌、運動向量差精度或運動向量差值。In some embodiments, a motion candidate is associated with motion information, the motion information including at least one of the following: a prediction direction, a reference picture index, a motion vector value, an intensity compensation flag, an affine flag, a motion vector difference precision, or a motion vector difference value.

在一些實施例中,所述運動候選對應於用於幀內模式編碼的幀內預測模式的運動候選。In some embodiments, the motion candidates correspond to motion candidates for intra prediction mode for intra mode encoding.

在一些實施例中,所述運動候選對應於包括用於IC參數編碼的亮度補償參數的運動候選。In some embodiments, the motion candidates correspond to motion candidates that include illumination compensation (IC) parameters used for IC parameter coding.

圖42是根據當前公開的技術的用於視頻處理的示例方法4200的流程圖。方法4200包括,在操作4202,通過使用來自一個或多個表的一個或多個運動候選,來確定用於視頻處理的新的運動候選,其中表包括一個或多個運動候選,並且每個運動候選與運動資訊相關聯;以及在操作4204,基於新候選,在視頻塊和視頻塊的編碼表示之間執行轉換。FIG. 42 is a flow diagram of an example method 4200 for video processing in accordance with currently disclosed technology. Method 4200 includes, at operation 4202, determining a new motion candidate for video processing by using one or more motion candidates from one or more tables, wherein a table includes one or more motion candidates and each motion candidate is associated with motion information; and at operation 4204, performing a conversion between a video block and a coded representation of the video block based on the new candidate.

在一些實施例中,所確定的新候選被添加到候選列表,所述候選列表包括Merge或高級運動向量預測(AMVP)候選列表。In some embodiments, the determined new candidate is added to a candidate list, the candidate list comprising a Merge or advanced motion vector prediction (AMVP) candidate list.

在一些實施例中,確定所述新候選包括:作為來自一個或多個表的一個或多個運動候選、以及高級運動向量預測(AMVP)候選列表中的AMVP候選或Merge候選清單中的Merge候選的函數,來確定所述新的運動候選。In some embodiments, determining the new candidate includes determining the new motion candidate as a function of one or more motion candidates from the one or more tables and either an AMVP candidate in an advanced motion vector prediction (AMVP) candidate list or a Merge candidate in a Merge candidate list.

從上述來看,應當理解的是,為了便於說明,本發明公開的技術的具體實施例已經在本文中進行了描述,但是可以在不偏離本發明範圍的情況下進行各種修改。因此,除了所附申請專利範圍所限定的之外,本發明公開的技術不受限制。From the foregoing, it should be understood that specific embodiments of the presently disclosed technology have been described herein for purposes of illustration, but that various modifications may be made without departing from the scope of the invention. Accordingly, the presently disclosed technology is not limited except as defined by the appended claims.

本文中公開的和其他描述的實施例、模組和功能操作可以在數位電子電路、或電腦軟體、固件或硬體中實現,包括本文中所公開的結構及其結構等效體,或其中一個或多個的組合。公開的實施例和其他實施例可以實現為一個或多個電腦程式產品,即一個或多個編碼在電腦可讀介質上的電腦程式指令的模組,以供資料處理裝置執行或控制資料處理裝置的操作。電腦可讀介質可以是機器可讀存放裝置、機器可讀存儲基板、存放裝置、影響機器可讀傳播信號的物質組成或其中一個或多個的組合。術語“資料處理裝置”包括用於處理資料的所有裝置、設備和機器,包括例如可程式設計處理器、電腦或多處理器或電腦組。除硬體外,該裝置還可以包括為電腦程式創建執行環境的代碼,例如,構成處理器固件的代碼、協定棧、資料庫管理系統、作業系統或其中一個或多個的組合。傳播信號是人為產生的信號,例如機器產生的電信號、光學信號或電磁信號,生成這些信號以對資訊進行編碼,以便傳輸到適當的接收裝置。The embodiments, modules, and functional operations disclosed and otherwise described herein may be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed herein and their structural equivalents, or a combination of one or more of them. The disclosed embodiments and other embodiments may be implemented as one or more computer program products, that is, one or more modules of computer program instructions encoded on a computer-readable medium for execution by a data processing device or to control the operation of a data processing device. The computer-readable medium may be a machine-readable storage device, a machine-readable storage substrate, a storage device, a composition of matter affecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing device" includes all devices, equipment, and machines for processing data, including, for example, a programmable processor, a computer, or multiple processors or a group of computers. In addition to hardware, the device may include code that creates an execution environment for computer programs, such as code that makes up processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of these. A propagated signal is an artificially generated signal, such as a machine-generated electrical, optical, or electromagnetic signal, generated to encode information for transmission to an appropriate receiving device.

電腦程式(也稱為程式、軟體、軟體應用、腳本或代碼)可以用任何形式的程式設計語言(包括編譯語言或解釋語言)編寫,並且可以以任何形式部署,包括作為獨立程式或作為模組、元件、副程式或其他適合在計算環境中使用的單元。電腦程式不一定與檔案系統中的檔對應。程式可以存儲在保存其他程式或資料的檔的部分中(例如,存儲在標記語言文件中的一個或多個腳本)、專用於該程式的單個檔中、或多個協調檔(例如,存儲一個或多個模組、副程式或部分代碼的檔)中。電腦程式可以部署在一台或多台電腦上來執行,這些電腦位於一個網站上或分佈在多個網站上,並通過通信網路互連。A computer program (also called a program, software, software application, script, or code) may be written in any form of programming language, including a compiled or interpreted language, and may be deployed in any form, including as a stand-alone program or as a module, component, subprogram, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program may be stored in a portion of a file that holds other programs or data (for example, one or more scripts stored in a markup language file), in a single file dedicated to the program, or in multiple coordinated files (for example, files that store one or more modules, subprograms, or portions of code). A computer program may be deployed to be executed on one or more computers located at one site or distributed across multiple sites and interconnected by a communication network.

本文中描述的處理和邏輯流可以通過一個或多個可程式設計處理器執行,該處理器執行一個或多個電腦程式,通過在輸入資料上操作並生成輸出來執行功能。處理和邏輯流也可以通過特殊用途的邏輯電路來執行,並且裝置也可以實現為特殊用途的邏輯電路,例如,FPGA(現場可程式設計閘陣列)或ASIC(專用積體電路)。The processing and logic flow described herein may be performed by one or more programmable processors that execute one or more computer programs to perform functions by operating on input data and generating output. Processing and logic flow may also be performed by, and devices may be implemented as, special purpose logic circuits, such as, for example, an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit).

例如,適於執行電腦程式的處理器包括通用和專用微處理器,以及任何類型數位電腦的任何一個或多個。通常,處理器將從唯讀記憶體或隨機存取記憶體或兩者接收指令和資料。電腦的基本元件是執行指令的處理器和存儲指令和資料的一個或多個存放裝置。通常,電腦還將包括一個或多個用於存儲資料的大型存放區設備,例如,磁片、磁光碟或光碟,或通過操作耦合到一個或多個大型存放區設備來從其接收資料或將資料傳輸到一個或多個大型存放區設備,或兩者兼有。然而,電腦不一定具有這樣的設備。適用於存儲電腦程式指令和資料的電腦可讀介質包括所有形式的非易失性記憶體、介質和記憶體設備,包括例如半導體記憶體設備,例如EPROM、EEPROM和快閃記憶體設備;磁片,例如內部硬碟或抽取式磁碟;磁光磁片;以及CDROM和DVD-ROM光碟。處理器和記憶體可以由專用邏輯電路來補充,或合併到專用邏輯電路中。For example, processors suitable for the execution of a computer program include general and special purpose microprocessors, and any one or more processors of any type of digital computer. Typically, a processor will receive instructions and data from a read-only memory or a random-access memory, or both. The basic elements of a computer are a processor that executes instructions and one or more storage devices that store instructions and data. Typically, a computer will also include one or more mass storage devices, such as magnetic disks, magneto-optical disks, or optical disks, for storing data, or be operatively coupled to one or more mass storage devices to receive data from them, transfer data to them, or both. However, a computer does not necessarily have such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices such as EPROM, EEPROM, and flash memory devices; magnetic disks, such as internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM discs. The processor and the memory may be supplemented by, or incorporated in, special purpose logic circuitry.

旨在將說明書與附圖一起僅視為示例性的,其中示例性意味著示例。 如這裡所使用的,單數形式“一”,“一個”和“所述”旨在也包括複數形式,除非上下文另有明確說明。另外,除非上下文另有明確說明,否則“或”的使用旨在包括“和/或”。It is intended that the specification, together with the drawings, be regarded as illustrative only, where illustrative means example. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. In addition, the use of "or" is intended to include "and/or" unless the context clearly dictates otherwise.

雖然本專利檔包含許多細節,但不應將其解釋為對任何發明或申請專利範圍範圍的限制,而應解釋為對特定發明的特定實施例的特徵的描述。本專利檔在單獨實施例的上下文描述的某些特徵也可以在單個實施例中組合實施。相反,在單個實施例的上下文中描述的各種功能也可以在多個實施例中單獨實施,或在任何合適的子組合中實施。此外,儘管上述特徵可以描述為在某些組合中起作用,甚至最初要求是這樣,但在某些情況下,可以從組合中刪除申請專利範圍組合中的一個或多個特徵,並且申請專利範圍的組合可以指向子組合或子組合的變體。Although this patent document contains many details, they should not be construed as limitations on the scope of any invention or of the claims, but rather as descriptions of features of particular embodiments of particular inventions. Certain features of this patent document that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various functions that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although features may be described above as acting in certain combinations, and even initially claimed as such, in some cases one or more features may be deleted from a claimed combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.

同樣,儘管圖紙中以特定順序描述了操作,但這不應理解為要獲得想要的結果必須按照所示的特定順序或循序執行此類操作,或執行所有說明的操作。此外,本專利檔所述實施例中各種系統元件的分離不應理解為在所有實施例中都需要這樣的分離。Likewise, although operations are described in a particular order in the drawings, this should not be understood to imply that such operations must be performed in the specific order or sequence shown, or that all operations illustrated, to obtain desired results. Furthermore, the separation of various system components in the embodiments described in this patent document should not be construed as requiring such separation in all embodiments.

僅描述了一些實現和示例,其他實現、增強和變體可以基於本專利檔中描述和說明的內容做出。Only some implementations and examples are described, other implementations, enhancements, and variations may be made based on what is described and illustrated in this patent document.

100:視訊轉碼器 3000:電腦系統 3005:處理器 3010:記憶體 3015:網路介面卡 3025:內部連接 3100:移動設備 3101:處理器/控制器 3102:記憶體 3103:輸入/輸出(I/O)單元 3104:顯示器 3200、3300、3400、3500、3600、3700、3800、3900、4000、4100、4200:方法 3202、3204、3206、3302、3304、3402、3404、3502、3504、3602、3604、3702、3704、3802、3804、3806、3902、3904、3906、4002、4004、4102、4104、4202、4204:操作 A0、A1、B1、B0、B2、C0、C1:位置 a、b、c、d、A、B、C、D、E、F、T:塊 AL、AR、BL、BR、MV0、MV0′、MV1、MV1′:運動向量 tb、td:POC距離 TD0、TD1:時間距離100: Video transcoder 3000: Computer system 3005: Processor 3010: Memory 3015: Network interface card 3025: Internal connection 3100: Mobile device 3101: Processor/controller 3102: Memory 3103: Input/output (I /O) Unit 3104: Display 3200, 3300, 3400, 3500, 3600, 3700, 3800, 3900, 4000, 4100, 4200: Method 3202, 3204, 3206, 3302, 3304, 3402, 3404, 3502, 3504, 3602, 3604, 3702, 3704, 3802, 3804, 3806, 3902, 3904, 3906, 4002, 4004, 4102, 4104, 4202, 4204: Operations A 0 , A 1 , B 1 , B 0 , B 2 , C 0 , C 1 : Positions a, b, c, d, A, B, C, D, E, F, T: Blocks AL, AR, BL, BR, MV0, MV0′, MV1, MV1′: Motion vectors tb, td: POC distance TD0, TD1: time distance

圖1是示出視訊轉碼器實現的示例的框圖。 圖2圖示了H.264視頻編碼標準中的巨集塊分割。 圖3圖示了將編碼塊(CB)劃分成預測塊(PB)的示例。 圖4圖示了將編碼樹塊(CTB)細分成CB和轉換塊(TB)的示例實現。實線表示CB邊界,且虛線表示TB邊界,包括帶分割的示例CTB和相應的四叉樹。 圖5示出了用於分割視頻資料的四叉樹二叉樹(QTBT)結構的示例。 圖6示出了視頻塊分割的示例。 圖7示出了四叉樹分割的示例。 圖8示出了樹型信令的示例。 圖9示出了Merge候選列表構造的推導過程的示例。 圖10示出了空間Merge候選的示例位置。 圖11示出了考慮到空間Merge候選的冗餘檢查的候選對的示例。 圖12示出了Nx2N和2NxN分割的第二個PU的位置的示例。 圖13圖示了時域Merge候選的示例運動向量縮放。 圖14示出了時域Merge候選的候選位置以及它們的並置圖片。 圖15示出了組合雙向預測Merge候選的示例。 圖16示出了運動向量預測候選的推導過程的示例。 圖17示出了空間運動向量候選的運動向量縮放的示例。 圖18示出了編碼單元(CU)的運動預測的示例可選時域運動向量預測(ATMVP)。 圖19圖示地描繪了源塊和源圖片的識別的示例。 圖20示出了具有四個子塊和相鄰塊的一個CU的示例。 圖21圖示了雙邊匹配的示例。 圖22圖示了範本匹配的示例。 圖23描繪了畫面播放速率上轉換(FRUC)中的單邊運動估計(ME)的示例。 圖24示出了基於雙邊範本匹配的解碼器側運動向量細化(DMVR)的示例。 圖25示出了用於推導光照補償IC參數的空間相鄰塊的示例。 圖26示出了用於推導空間Merge候選的空間相鄰塊的示例。 圖27示出了使用相鄰幀間預測塊的示例。 圖28示出了平面運動向量預測過程的示例。 圖29示出了挨著當前編碼單元(CU)行的位置的示例。 圖30是說明可用於實現本公開技術的各個部分的電腦系統或其它控制設備的結構的示例的框圖。 圖31示出了可用於實現本公開技術的各個部分的移動設備的示例實施例的框圖。 圖32示出了根據當前公開的技術的用於視頻處理的示例方法的流程圖。 圖33示出了根據當前公開的技術的用於視頻處理的示例方法的流程圖。 圖34示出了根據當前公開的技術的用於視頻處理的示例方法的流程圖。 圖35示出了根據當前公開的技術的用於視頻處理的示例方法的流程圖。 圖36示出了根據當前公開的技術的用於視頻處理的示例方法的流程圖。 圖37示出了根據當前公開的技術的用於視頻處理的示例方法的流程圖。 圖38示出了根據當前公開的技術的用於視頻處理的示例方法的流程圖。 圖39示出了根據當前公開的技術的用於視頻處理的示例方法的流程圖。 圖40示出了根據當前公開的技術的用於更新運動候選表的示例方法的流程圖。 圖41示出了根據當前公開的技術的用於更新運動候選表的示例方法的流程圖。 圖42示出了根據當前公開的技術的用於視頻處理的示例方法的流程圖。Figure 1 is a block diagram illustrating an example of a video transcoder implementation. Figure 2 illustrates macroblock partitioning in the H.264 video coding standard. Figure 3 illustrates an example of dividing a coding block (CB) into prediction blocks (PB). Figure 4 illustrates an example implementation of subdividing coding tree blocks (CTBs) into CBs and transform blocks (TBs). Solid lines represent CB boundaries, and dashed lines represent TB boundaries, including example CTBs with partitions and corresponding quadtrees. Figure 5 shows an example of a Quad Binary Tree (QTBT) structure for segmenting video material. Figure 6 shows an example of video block segmentation. Figure 7 shows an example of quadtree partitioning. 
Figure 8 shows an example of tree signaling. Figure 9 shows an example of the derivation process for Merge candidate list construction. Figure 10 shows example locations of spatial Merge candidates. Figure 11 shows an example of candidate pairs taking into account redundancy checking of spatial Merge candidates. Figure 12 shows an example of the location of the second PU for Nx2N and 2NxN partitions. Figure 13 illustrates example motion vector scaling for temporal Merge candidates. Figure 14 shows the candidate locations of the temporal Merge candidates and their collocated pictures. Figure 15 shows an example of combining bidirectional prediction Merge candidates. FIG. 16 shows an example of the derivation process of motion vector prediction candidates. FIG. 17 shows an example of motion vector scaling of spatial motion vector candidates. Figure 18 shows an example alternative temporal motion vector prediction (ATMVP) for motion prediction of a coding unit (CU). Figure 19 diagrammatically depicts an example of identification of source blocks and source pictures. Figure 20 shows an example of one CU with four sub-blocks and adjacent blocks. Figure 21 illustrates an example of bilateral matching. Figure 22 illustrates an example of template matching. Figure 23 depicts an example of one-sided motion estimation (ME) in frame rate up conversion (FRUC). Figure 24 shows an example of decoder-side motion vector refinement (DMVR) based on bilateral exemplar matching. Figure 25 shows an example of spatial neighbor blocks used to derive illumination compensation IC parameters. Figure 26 shows an example of spatial neighboring blocks used to derive spatial Merge candidates. Figure 27 shows an example of using adjacent inter prediction blocks. FIG. 28 shows an example of the plane motion vector prediction process. Figure 29 shows an example of the position next to the current coding unit (CU) row. 
30 is a block diagram illustrating an example of the structure of a computer system or other control device that may be used to implement various portions of the disclosed technology. 31 illustrates a block diagram of an example embodiment of a mobile device that may be used to implement various portions of the disclosed technology. 32 illustrates a flowchart of an example method for video processing in accordance with currently disclosed technology. 33 illustrates a flowchart of an example method for video processing in accordance with currently disclosed technology. 34 illustrates a flowchart of an example method for video processing in accordance with currently disclosed technology. 35 illustrates a flowchart of an example method for video processing in accordance with currently disclosed technology. 36 illustrates a flowchart of an example method for video processing in accordance with currently disclosed technology. 37 illustrates a flowchart of an example method for video processing in accordance with currently disclosed technology. 38 illustrates a flow diagram of an example method for video processing in accordance with currently disclosed technology. 39 illustrates a flowchart of an example method for video processing in accordance with currently disclosed technology. 40 illustrates a flowchart of an example method for updating a motion candidate table in accordance with currently disclosed technology. 41 illustrates a flowchart of an example method for updating a motion candidate table in accordance with currently disclosed technology. 42 illustrates a flow diagram of an example method for video processing in accordance with currently disclosed technology.

3200: Method

3202, 3204, 3206: Steps

Claims (23)

1. A method for video processing, comprising: determining a new candidate for video processing by averaging two or more motion candidates selected from one or more look-up tables; adding the new candidate to a candidate list; and performing a conversion between a first video block of a video and a bitstream representation of the video by using the determined new candidate in the candidate list; wherein the one or more look-up tables allow an encoding or decoding process to be performed based on historical data, the one or more look-up tables holding motion information from previously coded blocks being maintained during the encoding or decoding process.
2. The method of claim 1, wherein the candidate list is a Merge candidate list, and the determined new candidate is a Merge candidate.
3. The method of claim 2, wherein the Merge candidate list is an inter prediction Merge candidate list or an intra block copy prediction Merge candidate list.
4. The method of claim 1, wherein the one or more look-up tables include motion candidates derived from previously processed video blocks that were processed before the first video block in the video data.
5. The method of claim 4, wherein no spatial candidates and no temporal candidates are available in the candidate list.
6. The method of claim 1, wherein the averaging is achieved without division operations.
7. The method of claim 1, wherein the averaging is achieved by multiplying the sum of the motion vectors of the selected motion candidates by a scaling factor.
8. The method of claim 1, wherein the horizontal components of the motion vectors of the selected motion candidates are averaged to derive the horizontal component of the new candidate.
9. The method of claim 1, wherein the vertical components of the motion vectors of the selected motion candidates are averaged to derive the vertical component of the new candidate.
10. The method of claim 6, wherein the scaling factor is pre-calculated and stored in a look-up table.
11. The method of any one of claims 1 to 10, wherein only motion vectors with the same reference picture are selected.
12. The method of claim 11, wherein, for two prediction directions, only motion vectors having the same reference picture in both prediction directions are selected.
13. The method of any one of claims 1 to 10, wherein a target reference picture in each prediction direction is predetermined, and the motion vectors are scaled to the predetermined reference picture.
14. The method of claim 13, wherein the first entry in reference picture list X is selected as the target reference picture for that reference picture list, X being 0 or 1.
15. The method of claim 13, wherein, for each prediction direction, the most frequently used reference picture in the table is selected as the target reference picture.
16. The method of claim 13, wherein, for each prediction direction, motion vectors having the same reference picture as the predetermined target reference picture are selected first, and other motion vectors are selected afterwards.
17. The method of any one of claims 1 to 10, wherein the motion candidates from the one or more look-up tables are associated with motion information including at least one of: a prediction direction, a reference picture index, a motion vector value, an intensity compensation flag, an affine flag, a motion vector difference precision, or a motion vector difference value.
18. The method of any one of claims 1 to 10, further comprising: updating the one or more look-up tables based on the conversion.
19. The method of claim 18, wherein updating the one or more look-up tables comprises updating the one or more look-up tables based on the motion information of the first video block of the video after performing the conversion.
20. The method of claim 19, further comprising: performing a conversion between a subsequent video block of the video and the bitstream representation of the video based on the updated one or more look-up tables.
21. The method of any one of claims 1 to 10, wherein the conversion comprises an encoding process and/or a decoding process.
22. An apparatus in a video system, comprising a processor configured to implement the method of any one or more of claims 1 to 21.
23. A non-transitory computer-readable program medium having code stored thereon, the code comprising instructions that, when executed by a processor, cause the processor to implement the method of any one or more of claims 1 to 21.
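The core of claims 1, 6, 7 and 10 can be illustrated with a short sketch. This is a minimal, hypothetical illustration, not the normative history-based scheme: a FIFO look-up table holds motion vectors of previously coded blocks, and a new candidate is derived by averaging selected entries without any division, using a pre-computed multiply-and-shift scaling table. The table capacity, the shift amount, and all names are assumptions made for the example.

```python
# Hypothetical sketch of the claimed averaging scheme (not a normative
# implementation): a FIFO look-up table of motion candidates (claim 1),
# with division-free averaging via a pre-computed scaling table
# (claims 6, 7 and 10) applied per component (claims 8-9).

from collections import deque

TABLE_SIZE = 6          # illustrative history-table capacity
SHIFT = 10
# Pre-computed scaling factors: SCALE[n] is approximately (1 << SHIFT) / n,
# stored in a look-up table so no division is done at run time (claim 10).
SCALE = {2: 512, 3: 341, 4: 256}

history = deque(maxlen=TABLE_SIZE)   # oldest entries are evicted (FIFO)

def push(mv):
    """Record the motion vector of a just-coded block (table update)."""
    history.append(mv)

def averaged_candidate(selected):
    """Average 2 to 4 selected motion candidates component-wise."""
    n = len(selected)
    sx = sum(mv[0] for mv in selected)   # sum of horizontal components
    sy = sum(mv[1] for mv in selected)   # sum of vertical components
    rnd = 1 << (SHIFT - 1)               # rounding offset
    # Multiply the sums by the stored factor and shift, instead of
    # dividing by n (claim 7).
    return ((sx * SCALE[n] + rnd) >> SHIFT,
            (sy * SCALE[n] + rnd) >> SHIFT)

# Motion vectors (in quarter-pel units) of three previously coded blocks.
for mv in [(4, -2), (8, 6), (3, 5)]:
    push(mv)

print(averaged_candidate(list(history)[:2]))   # (6, 2)
print(averaged_candidate(list(history)))       # (5, 3)
```

For n = 2 the multiply-and-shift is exact; for other n the pre-computed factor introduces at most a small rounding deviation from the true mean, which is why a rounding offset is added before the shift.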
TW108124973A 2018-07-14 2019-07-15 Extension of look-up table based motion vector prediction with temporal information TWI820169B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN2018095716 2018-07-14
WOPCT/CN2018/095716 2018-07-14
CN2018095719 2018-07-15
WOPCT/CN2018/095719 2018-07-15

Publications (2)

Publication Number Publication Date
TW202032991A TW202032991A (en) 2020-09-01
TWI820169B true TWI820169B (en) 2023-11-01

Family

ID=67989034

Family Applications (2)

Application Number Title Priority Date Filing Date
TW108124973A TWI820169B (en) 2018-07-14 2019-07-15 Extension of look-up table based motion vector prediction with temporal information
TW108124975A TWI826486B (en) 2018-07-14 2019-07-15 Extension of look-up table based motion vector prediction with temporal information

Family Applications After (1)

Application Number Title Priority Date Filing Date
TW108124975A TWI826486B (en) 2018-07-14 2019-07-15 Extension of look-up table based motion vector prediction with temporal information

Country Status (3)

Country Link
CN (2) CN110719463B (en)
TW (2) TWI820169B (en)
WO (2) WO2020016745A2 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4088464A4 (en) 2020-02-05 2023-06-07 Beijing Bytedance Network Technology Co., Ltd. Deblocking parameters for chroma component
EP4088456A4 (en) * 2020-02-05 2023-06-21 Beijing Bytedance Network Technology Co., Ltd. Palette mode for local dual tree
CN115176467A (en) 2020-02-14 2022-10-11 抖音视界有限公司 Use of generic constraint flags in video bitstreams
CN117581539A (en) * 2021-04-10 2024-02-20 抖音视界有限公司 GPM motion refinement

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018052986A1 (en) * 2016-09-16 2018-03-22 Qualcomm Incorporated Offset vector identification of temporal motion vector predictor

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102685479A (en) * 2011-03-11 2012-09-19 华为技术有限公司 Video encoding and decoding processing method and device
US20130329007A1 (en) * 2012-06-06 2013-12-12 Qualcomm Incorporated Redundancy removal for advanced motion vector prediction (amvp) in three-dimensional (3d) video coding
JP2015519834A (en) * 2012-07-09 2015-07-09 三菱電機株式会社 Method and system for processing multi-view video for view synthesis using motion vector predictor lists
US10027981B2 (en) * 2014-09-01 2018-07-17 Hfi Innovation Inc. Method of intra picture block copy for screen content and video coding
US9918105B2 (en) * 2014-10-07 2018-03-13 Qualcomm Incorporated Intra BC and inter unification
WO2016068685A1 (en) * 2014-10-31 2016-05-06 삼성전자 주식회사 Video encoding device and video decoding device using high-precision skip encoding and method thereof
EP3357245A4 (en) * 2015-11-05 2019-03-13 MediaTek Inc. Method and apparatus of inter prediction using average motion vector for video coding

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018052986A1 (en) * 2016-09-16 2018-03-22 Qualcomm Incorporated Offset vector identification of temporal motion vector predictor

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Online document: Haitao Yang et al., "Description of CE4: Inter prediction and motion vector coding," Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 10th Meeting: San Diego, US, 10-20 Apr. 2018, https://jvet-experts.org/doc_end_user/documents/10_San%20Diego/wg11/JVET-J1024-v3.zip *
Online document: Li Zhang et al., "CE4-related: History-based Motion Vector Prediction," Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 11th Meeting: Ljubljana, SI, 11 Jul. 2018, https://jvet-experts.org/doc_end_user/documents/11_Ljubljana/wg11/JVET-K0104-v2.zip *

Also Published As

Publication number Publication date
WO2020016745A3 (en) 2020-04-16
CN110719463A (en) 2020-01-21
WO2020016745A2 (en) 2020-01-23
CN110719476B (en) 2023-01-20
CN110719463B (en) 2022-11-22
TWI826486B (en) 2023-12-21
TW202032991A (en) 2020-09-01
CN110719476A (en) 2020-01-21
WO2020016744A1 (en) 2020-01-23
TW202021360A (en) 2020-06-01

Similar Documents

Publication Publication Date Title
TWI832905B (en) Bi-prediction with weights in video coding and decoding
TWI820211B (en) Conditions for starting checking hmvp candidates depend on total number minus k
TWI743506B (en) Selection from multiple luts
TW202032982A (en) Partial/full pruning when adding a hmvp candidate to merge/amvp
CN113383554A (en) Interaction between LUTs and shared Merge lists
CN113273186A (en) Invocation of LUT update
KR20210024503A (en) Reset the lookup table per slice/tile/LCU row
KR20210025538A (en) Update of lookup table: FIFO, constrained FIFO
TWI820169B (en) Extension of look-up table based motion vector prediction with temporal information
CN110662030B (en) Video processing method and device
WO2020008330A1 (en) Priority-based non-adjacent merge design
TWI819030B (en) Extension of look-up table based motion vector prediction with temporal information
CN110719465B (en) Extending look-up table based motion vector prediction with temporal information
WO2020008322A1 (en) Complexity reduction of non-adjacent merge design
TWI839388B (en) Simplified history based motion vector prediction