TWI719522B - Symmetric bi-prediction mode for video coding - Google Patents

Symmetric bi-prediction mode for video coding

Info

Publication number
TWI719522B
TWI719522B
Authority
TW
Taiwan
Prior art keywords
motion vector
video
difference information
patent application
scope
Prior art date
Application number
TW108123164A
Other languages
Chinese (zh)
Other versions
TW202017375A (en
Inventor
莊孝強
張莉
王悅
Original Assignee
大陸商北京字節跳動網絡技術有限公司
美商字節跳動有限公司
Priority date
Filing date
Publication date
Application filed by 大陸商北京字節跳動網絡技術有限公司 and 美商字節跳動有限公司
Publication of TW202017375A
Application granted
Publication of TWI719522B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/463 Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
    • H04N19/172 Adaptive coding characterised by the coding unit, the unit being an image region, e.g. a picture, frame or field
    • H04N19/176 Adaptive coding characterised by the coding unit, the unit being an image region, e.g. a block or macroblock
    • H04N19/513 Motion estimation or motion compensation; processing of motion vectors
    • H04N19/52 Processing of motion vectors by predictive encoding
    • H04N19/577 Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video bitstream processing method comprising: generating, in response to a mirror mode flag in the video bitstream, second motion vector difference information based on a symmetry rule and first motion vector difference information; and reconstructing a video block using the first motion vector difference information and the second motion vector difference information, wherein the reconstruction is performed bi-predictively.

Description

Symmetric bi-prediction mode for video coding

This document relates to image and video coding technologies. [Cross-reference to related applications] Under the applicable patent law and/or the rules of the Paris Convention, this application timely claims the priority of and benefit from International Patent Application No. PCT/CN2018/093897, filed on June 30, 2018. The entire disclosure of International Patent Application No. PCT/CN2018/093897 is incorporated by reference as part of the disclosure of this application.

Digital video accounts for the largest share of bandwidth use on the Internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video grows, the bandwidth demand for digital video usage is expected to continue to increase.

The disclosed techniques can be used in visual media decoder or encoder embodiments in which the symmetry of motion vectors is exploited to reduce the number of bits used to signal motion information, thereby improving coding efficiency.

In one example aspect, a video bitstream processing method is disclosed. The method includes generating, in response to a mirror mode flag in the video bitstream, second motion vector difference information based on a symmetry rule and first motion vector difference information. The method further includes reconstructing a video block in a current picture using the first motion vector difference information and the second motion vector difference information, wherein the reconstruction is performed using bi-prediction.
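The mirror-mode derivation described above can be sketched as follows. This is an illustrative toy model, not the normative patent text: the function names are assumptions, and the symmetry rule is assumed here to mirror (negate) the signaled list-0 MVD to obtain the list-1 MVD.

```python
# Hypothetical sketch: when the mirror mode flag is set, only the list-0
# MVD is signaled, and the list-1 MVD is inferred by the symmetry rule.
# Motion vectors and MVDs are (x, y) integer tuples.

def derive_mvd_l1(mvd_l0, mirror_flag):
    """Derive the list-1 MVD from the list-0 MVD under the symmetry rule."""
    if mirror_flag:
        # Assumed symmetry rule: the second MVD mirrors the first one.
        return (-mvd_l0[0], -mvd_l0[1])
    return None  # otherwise the list-1 MVD must be parsed from the bitstream

def reconstruct_mvs(mvp_l0, mvp_l1, mvd_l0, mvd_l1):
    """Each final motion vector is predictor + difference for its list."""
    mv_l0 = (mvp_l0[0] + mvd_l0[0], mvp_l0[1] + mvd_l0[1])
    mv_l1 = (mvp_l1[0] + mvd_l1[0], mvp_l1[1] + mvd_l1[1])
    return mv_l0, mv_l1
```

The two motion vectors then drive a standard bi-predictive reconstruction of the block.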

In another example aspect, another video bitstream processing method is disclosed. The method includes receiving motion vector difference information for a first set of motion vectors for a first reference picture list associated with a video block. The method further includes deriving, using a multi-hypothesis symmetry rule, motion vector difference information associated with a second set of motion vectors for a second reference picture list associated with the video block from the motion vector difference information of the first set of motion vectors, wherein the multi-hypothesis symmetry rule specifies that the second motion vector difference value is (0, 0) and the corresponding motion vector predictor is set to a mirrored motion vector value derived from the first motion vector difference information, and performing a conversion between the video block and a bitstream representation of the video block using the derived result.

In another example aspect, another video bitstream processing method is disclosed. The method includes receiving, for a video block, first motion vector difference information associated with a first reference picture list. The method further includes receiving, for the video block, second motion vector difference information associated with a second reference picture list, and deriving, using a multi-hypothesis symmetry rule, third motion vector difference information associated with the first reference picture list and fourth motion vector difference information associated with the second reference picture list from the first motion vector difference information and the second motion vector difference information, wherein the multi-hypothesis symmetry rule specifies that the second motion vector difference value is (0, 0) and the corresponding motion vector predictor is set to a mirrored motion vector value derived from the first motion vector difference information.

In another example aspect, another video bitstream processing method is disclosed. The method includes receiving a future frame of a video relative to a reference frame of the video, receiving motion vectors relating the future frame of the video to a past frame of the video, applying a predetermined relationship between the future frame of the video and the past frame of the video, and reconstructing the past frame of the video based on the future frame, the motion vectors, and the predetermined relationship between the past frame and the future frame.

In another example aspect, the above-described methods may be implemented by a video decoder apparatus that includes a processor.

In another example aspect, the above-described methods may be implemented by a video encoder apparatus that includes a processor for decoding encoded video during a video encoding process.

In yet another example aspect, these methods may be embodied in the form of processor-executable instructions and stored on a computer-readable program medium.

These and other aspects are further described in this document.

Section headings are used in this document to facilitate understanding and do not limit the embodiments disclosed in a section to that section only. Thus, embodiments from one section can be combined with embodiments from other sections. Furthermore, although certain embodiments are described with reference to specific video codecs, the disclosed techniques are also applicable to other video coding technologies. Moreover, although some embodiments describe video encoding steps in detail, it should be understood that the corresponding decoding steps that undo the encoding will be implemented by a decoder. In addition, the term video processing encompasses video encoding or compression, video decoding or decompression, and video transcoding, in which video pixels are represented from one compressed format into another compressed format or at a different compressed bitrate.

This document provides various techniques that can be used by a decoder of a video bitstream to improve the quality of decompressed or decoded digital video. Furthermore, a video encoder may also implement these techniques during the encoding process in order to reconstruct decoded frames used for further encoding.

Signaling of bi-prediction in HEVC

In HEVC, inter-PU-level signaling can be divided into three different modes. Table 1 and Table 2 show the relevant syntax elements for inter-PU signaling in HEVC. The first mode is the skip mode, in which only a single Merge index (merge_idx) needs to be signaled. The second mode is the Merge mode, in which only the Merge flag (merge_flag) and the Merge index (merge_idx) are signaled. The third mode is the AMVP mode, in which a direction index (inter_pred_idc), reference indices (ref_idx_l0/ref_idx_l1), MVP indices (mvp_l0_flag/mvp_l1_flag), and MVDs (mvd_coding) are signaled.
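The branching among the three modes can be sketched as below. This is an illustrative toy parser, not the HEVC reference decoder: `TokenReader` stands in for an entropy decoder, the string values for inter_pred_idc are assumed names, and only the syntax elements named above are modeled.

```python
class TokenReader:
    """Toy stand-in for an entropy decoder: yields pre-decoded syntax values
    in order and checks that they are requested by the expected name."""
    def __init__(self, tokens):
        self.tokens = list(tokens)

    def read(self, name):
        key, value = self.tokens.pop(0)
        assert key == name, f"expected {name}, got {key}"
        return value

def parse_inter_pu(r):
    """Sketch of the three inter-PU signaling modes described above."""
    if r.read("cu_skip_flag"):                  # 1) skip: Merge index only
        return {"mode": "skip", "merge_idx": r.read("merge_idx")}
    if r.read("merge_flag"):                    # 2) merge: flag + Merge index
        return {"mode": "merge", "merge_idx": r.read("merge_idx")}
    pu = {"mode": "amvp",                       # 3) AMVP: full motion info
          "inter_pred_idc": r.read("inter_pred_idc")}
    for lst in ("l0", "l1"):                    # per used reference list:
        if pu["inter_pred_idc"] in ("bi", f"pred_{lst}"):
            pu[f"ref_idx_{lst}"] = r.read(f"ref_idx_{lst}")   # reference index
            pu[f"mvd_{lst}"] = r.read(f"mvd_{lst}")           # MVD
            pu[f"mvp_{lst}_flag"] = r.read(f"mvp_{lst}_flag") # MVP index
    return pu
```

In the bi-predictive AMVP case, the loop runs for both lists, which is exactly the per-list signaling cost the next paragraph discusses.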

Among all three modes, the bi-predictive AMVP mode is the most rate-expensive, but it provides the freedom to capture various motion models, including acceleration and other non-linear motion models. The motion vectors of the two lists are signaled separately to provide this degree of freedom.

AMVP derivation in HEVC

Motion vector prediction in AMVP mode

Motion vector prediction exploits the spatio-temporal correlation of a motion vector with those of neighboring PUs, which is used for explicit transmission of motion parameters. It constructs a motion vector candidate list by first checking the availability of the left and above neighboring PU positions, removing redundant candidates, and adding zero vectors to make the candidate list a constant length. The encoder can then select the best predictor from the candidate list and transmit the corresponding index indicating the chosen candidate. Similarly to Merge index signaling, the index of the best motion vector candidate is encoded using a truncated unary code. The maximum value to be encoded in this case is 2. The following sections provide details about the derivation process of motion vector prediction candidates.

Table 1. Inter-PU syntax elements in HEVC (table not reproduced here; available only as an image in the original)

Table 2. Syntax elements of MVD coding in HEVC (table not reproduced here; available only as an image in the original)

Motion vector prediction candidates

Figure 1 summarizes the derivation process of motion vector prediction candidates.

In motion vector prediction, two types of motion vector candidates are considered: spatial motion vector candidates and temporal motion vector candidates. For spatial motion vector candidate derivation, two motion vector candidates are eventually derived based on the motion vectors of PUs located at five different positions, as shown in Figure 2.

For temporal motion vector candidate derivation, one motion vector candidate is selected from two candidates, which are derived based on two different co-located positions. After the first list of spatio-temporal candidates is made, duplicate motion vector candidates in the list are removed. If the number of potential candidates is greater than two, motion vector candidates whose reference picture index within the associated reference picture list is greater than 1 are removed from the list. If the number of spatio-temporal motion vector candidates is smaller than two, additional zero motion vector candidates are added to the list.

Spatial motion vector candidates

In the derivation of spatial motion vector candidates, at most two candidates are considered among five potential candidates, which are derived from PUs located at the positions shown in Figure 2; those positions are the same as those of motion Merge. The derivation order for the left side of the current PU is defined as A0, A1, scaled A0, scaled A1. The derivation order for the above side of the current PU is defined as B0, B1, B2, scaled B0, scaled B1, scaled B2. For each side there are therefore four cases that can be used as motion vector candidates, two of which do not require spatial scaling and two of which use spatial scaling. The four different cases are summarized as follows.

No spatial scaling
(1) Same reference picture list, and same reference picture index (same picture order count, POC)
(2) Different reference picture list, but same reference picture index (same POC)

Spatial scaling
(3) Same reference picture list, but different reference picture index (different POC)
(4) Different reference picture list, and different reference picture index (different POC)
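The four cases reduce to a simple predicate, sketched below as an illustrative helper (the function name and case numbering are ours, not from the standard): scaling is needed exactly when the reference-picture POCs differ, regardless of which list is involved.

```python
def classify_spatial_case(same_ref_list, neigh_ref_poc, cur_ref_poc):
    """Map a neighboring PU's reference to one of the four cases above.
    Returns (case_number, needs_spatial_scaling). Spatial scaling is needed
    exactly in cases (3) and (4), i.e. whenever the POCs differ."""
    same_poc = neigh_ref_poc == cur_ref_poc
    if same_poc:
        return (1 if same_ref_list else 2), False   # no spatial scaling
    return (3 if same_ref_list else 4), True        # spatial scaling required
```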

The no-spatial-scaling cases are checked first, followed by the spatial-scaling cases. Spatial scaling is considered when the POC of the reference picture of the neighboring PU differs from that of the reference picture of the current PU, regardless of the reference picture list. If all PUs of the left candidates are unavailable or intra-coded, scaling for the above motion vector is allowed to aid the parallel derivation of the left and above MV candidates. Otherwise, spatial scaling is not allowed for the above motion vector.

In the spatial scaling process, the motion vector of the neighboring PU is scaled in a similar manner as for temporal scaling, as shown in Figure 3. The main difference is that the reference picture list and index of the current PU are given as input; the actual scaling process is the same as that of temporal scaling.

Temporal motion vector candidates

Apart from the reference picture index derivation, all processes for the derivation of temporal Merge candidates are the same as for the derivation of spatial motion vector candidates. The reference picture index is signaled to the decoder.

Merge mode in HEVC

Merge mode candidates

When a PU is predicted using Merge mode, an index pointing to an entry in the Merge candidate list is parsed from the bitstream and used to retrieve the motion information. The construction of this list is specified in the HEVC standard and can be summarized according to the following sequence of steps:

Step 1: Initial candidate derivation
-Step 1.1: Spatial candidate derivation
-Step 1.2: Redundancy check for spatial candidates
-Step 1.3: Temporal candidate derivation

Step 2: Additional candidate insertion
-Step 2.1: Creation of bi-predictive candidates
-Step 2.2: Insertion of zero motion candidates

These steps are also schematically depicted in Figure 4. For spatial Merge candidate derivation, at most four Merge candidates are selected among candidates located at five different positions. For temporal Merge candidate derivation, at most one Merge candidate is selected between two candidates. Since a constant number of candidates per PU is assumed at the decoder, additional candidates are generated when the number of candidates does not reach the maximum number of Merge candidates (MaxNumMergeCand) signaled in the slice header. Since the number of candidates is constant, the index of the best Merge candidate is encoded using truncated unary binarization (TU). If the size of the CU is equal to 8, all PUs of the current CU share a single Merge candidate list, which is identical to the Merge candidate list of the 2N×2N prediction unit.
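The Merge-list construction steps above can be sketched as a toy skeleton; this is an illustrative simplification (combined bi-predictive candidates of step 2.1 are omitted, and candidates are simplified to `(motion_vector, ref_idx)` tuples), not the normative process.

```python
def build_merge_list(spatial, temporal, max_num):
    """Toy skeleton of Merge-list construction: spatial candidates with a
    redundancy check, one temporal candidate, then zero-motion padding up
    to MaxNumMergeCand. `None` marks an unavailable position."""
    cands = []
    for c in spatial:                          # steps 1.1 + 1.2: spatial
        if c is not None and c not in cands:   # candidates, dropping exact
            cands.append(c)                    # duplicates of earlier entries
    cands = cands[:4]                          # at most four spatial candidates
    if temporal is not None and len(cands) < max_num:
        cands.append(temporal)                 # step 1.3: one temporal candidate
    ref_idx = 0
    while len(cands) < max_num:                # step 2.2: zero-motion padding;
        cands.append(((0, 0), ref_idx))        # the reference index starts at
        ref_idx += 1                           # zero and grows per new entry
    return cands[:max_num]
```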

The following subsections describe the detailed operations of each of the above steps.

Spatial candidates

In the derivation of spatial Merge candidates, at most four Merge candidates are selected among candidates located at the positions shown in Figure 2. The order of derivation is A1, B1, B0, A0, B2. Position B2 is considered only when any PU at positions A1, B1, B0, A0 is not available (for example, because it belongs to another slice or tile) or is intra-coded. After the candidate at position A1 is added, the addition of the remaining candidates is subject to a redundancy check, which ensures that candidates with the same motion information are excluded from the list, thereby improving coding efficiency. To reduce computational complexity, not all possible candidate pairs are considered in the mentioned redundancy check. Instead, only the pairs linked with an arrow in Figure 5 are considered, and a candidate is added to the list only if the corresponding candidate used for the redundancy check does not have the same motion information. Another source of duplicate motion information is the "second PU" associated with partitions other than 2N×2N. As an example, Figure 6 depicts the second PU for the N×2N and 2N×N cases, respectively. When the current PU is partitioned as N×2N, the candidate at position A1 is not considered for list construction. In fact, adding this candidate would lead to two prediction units having the same motion information, which is redundant for having just one PU inside the coding unit. Similarly, position B1 is not considered when the current PU is partitioned as 2N×N.

Temporal candidates

In this step, only one candidate is added to the list. In particular, in the derivation of this temporal Merge candidate, a scaled motion vector is derived based on the co-located PU belonging to the picture which has the smallest POC difference with the current picture within the given reference picture list. The reference picture list to be used for the derivation of the co-located PU is explicitly signaled in the slice header. As illustrated by the dashed line in Figure 7, the scaled motion vector of the temporal Merge candidate is obtained by scaling from the motion vector of the co-located PU using the POC distances tb and td, where tb is defined as the POC difference between the reference picture of the current picture and the current picture, and td is defined as the POC difference between the reference picture of the co-located picture and the co-located picture. The reference picture index of the temporal Merge candidate is set equal to zero. The actual realization of the scaling process is described in the HEVC specification. For a B slice, two motion vectors are obtained and combined, one for reference picture list 0 (list0) and the other for reference picture list 1 (list1), to form a bi-predictive Merge candidate.
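The POC-distance scaling using tb and td can be sketched with HEVC-style fixed-point arithmetic. This is a close sketch of the formula's shape, not the normative text of the HEVC specification; consult the standard for the exact clipping and rounding rules.

```python
def scale_mv(mv, tb, td):
    """Scale motion vector `mv` by the POC-distance ratio tb/td using
    fixed-point arithmetic in the style of the HEVC scaling process.
    tb: POC distance between the current picture and its reference picture;
    td: POC distance between the co-located picture and its reference."""
    tx = (16384 + (abs(td) >> 1)) // abs(td)   # reciprocal of td, Q14
    if td < 0:
        tx = -tx
    dist_scale = max(-4096, min(4095, (tb * tx + 32) >> 6))  # tb/td, Q8

    def scale_component(v):
        s = dist_scale * v
        mag = (abs(s) + 127) >> 8              # rounding right-shift by 8
        return max(-32768, min(32767, mag if s >= 0 else -mag))

    return (scale_component(mv[0]), scale_component(mv[1]))
```

When tb equals td the vector passes through unchanged; halving tb halves the scaled vector, which matches the linear-motion assumption behind temporal candidates.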

In the co-located PU (Y) belonging to the reference frame, the position for the temporal candidate is selected between candidates C0 and C1, as depicted in Figure 8. If the PU at position C0 is not available, is intra-coded, or is outside the current coding tree unit (CTU), position C1 is used. Otherwise, position C0 is used in the derivation of the temporal Merge candidate.

Additional candidate insertion

Besides spatio-temporal Merge candidates, there are two additional types of Merge candidates: combined bi-predictive Merge candidates and zero Merge candidates. Combined bi-predictive Merge candidates are generated by utilizing the spatio-temporal Merge candidates, and are used for B slices only. A combined bi-predictive candidate is generated by combining the first-reference-picture-list motion parameters of an initial candidate with the second-reference-picture-list motion parameters of another. If these two tuples provide different motion hypotheses, they form a new bi-predictive candidate. As an example, Figure 9 depicts the case when two candidates in the original list (on the left), which have mvL0 and refIdxL0 or mvL1 and refIdxL1, are used to create a combined bi-predictive Merge candidate added to the final list (on the right). There are numerous rules regarding the combinations that are considered to generate these additional Merge candidates.
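The pairing idea can be sketched as follows; this is an illustrative toy version (real HEVC walks a fixed order of index pairs and applies further combination rules), where each candidate is a dict with optional "l0"/"l1" motion of the form `(mv, ref_idx)`.

```python
def combined_bipred_candidates(cands, max_extra):
    """Toy version of combined bi-predictive candidate creation for B slices:
    pair the list-0 motion of one candidate with the list-1 motion of
    another to form new bi-predictive candidates, up to `max_extra`."""
    out = []
    for i, a in enumerate(cands):
        for j, b in enumerate(cands):
            if i == j or a.get("l0") is None or b.get("l1") is None:
                continue
            combo = {"l0": a["l0"], "l1": b["l1"]}
            if combo not in out:        # keep only distinct combinations
                out.append(combo)
            if len(out) == max_extra:
                return out
    return out
```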

Zero motion candidates are inserted to fill the remaining entries in the Merge candidate list and hence reach the MaxNumMergeCand capacity. These candidates have zero spatial displacement and a reference picture index which starts from zero and increases every time a new zero motion candidate is added to the list. The number of reference frames used by these candidates is one for uni-directional prediction and two for bi-directional prediction, respectively. Finally, no redundancy check is performed on these candidates.

Pattern-matched motion vector derivation

Pattern-matched motion vector derivation (PMMVD) mode is a special Merge mode based on frame-rate up-conversion (FRUC) techniques. With this mode, the motion information of a block is not signaled but derived at the decoder side.

When the Merge flag of a CU is true, a FRUC flag is signaled for the CU. When the FRUC flag is false, a Merge index is signaled and the regular Merge mode is used. When the FRUC flag is true, an additional FRUC mode flag is signaled to indicate which method (bilateral matching or template matching) is to be used to derive the motion information of the block.

At the encoder side, the decision on whether to use the FRUC Merge mode for a CU is based on RD cost selection, as is done for normal Merge candidates. That is, both matching modes (bilateral matching and template matching) are checked for the CU using RD cost selection. The one leading to the minimal cost is further compared with the other CU modes. If the FRUC matching mode is the most efficient one, the FRUC flag is set to true for the CU and the related matching mode is used.

The motion derivation process in FRUC Merge mode has two steps: a CU-level motion search is performed first, followed by sub-CU-level motion refinement. At the CU level, an initial motion vector is derived for the whole CU based on bilateral matching or template matching. First, a list of MV candidates is generated, and the candidate leading to the minimum matching cost is selected as the starting point for further CU-level refinement. Then a local search based on bilateral matching or template matching around the starting point is performed, and the MV resulting in the minimum matching cost is taken as the MV for the whole CU. Subsequently, the motion information is further refined at the sub-CU level with the derived CU motion vector as the starting point.

For example, the following derivation process is performed for W×H CU motion information derivation. In the first stage, the MV of the whole W×H CU is derived. In the second stage, the CU is further split into M×M sub-CUs. The value of M is calculated as in (1), where D is a predefined splitting depth that is set to 3 by default in JEM. The MV of each sub-CU is then derived.

M = max{4, min{W, H} / 2^D}    (1)
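A one-line sketch of equation (1), assuming the division by 2^D is an integer (right-shift) operation as is conventional in JEM:

```python
def sub_cu_size(w, h, d=3):
    """Sub-CU size for FRUC refinement: M = max(4, min(W, H) / 2**D),
    where d is the predefined splitting depth (3 by default in JEM)."""
    return max(4, min(w, h) >> d)
```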

As shown in Figure 10, bilateral matching is used to derive the motion information of the current CU by finding the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures. Under the assumption of a continuous motion trajectory, the motion vectors MV0 and MV1 pointing to the two reference blocks shall be proportional to the temporal distances between the current picture and the two reference pictures (i.e., TD0 and TD1). As a special case, when the current picture is temporally between the two reference pictures and the temporal distances from the current picture to the two reference pictures are equal, the bilateral matching becomes a mirror-based bidirectional MV.

As shown in Figure 11, template matching is used to derive the motion information of the current CU by finding the closest match between a template (the top and/or left neighbouring blocks of the current CU) in the current picture and a block (of the same size as the template) in a reference picture. Besides the FRUC Merge mode, template matching is also applied to AMVP mode. In JEM there are two AMVP candidates. With the template matching method, a new candidate is derived. If the newly derived candidate from template matching differs from the first existing AMVP candidate, it is inserted at the very beginning of the AMVP candidate list, and the list size is then set to two (which means the second existing AMVP candidate is removed). When applied to AMVP mode, only the CU-level search is applied.

CU-level MV candidate set

The MV candidate set at the CU level consists of:
- the original AMVP candidates, if the current CU is in AMVP mode,
- all Merge candidates,
- several MVs from the interpolated MV field, and
- the top and left neighbouring motion vectors.

When bilateral matching is used, each valid MV of a Merge candidate is used as an input to generate an MV pair under the assumption of bilateral matching. For example, one valid MV of a Merge candidate is (MVa, refa) in reference list A. Then the reference picture refb of its paired bilateral MV is found in the other reference list B, such that refa and refb are temporally on different sides of the current picture. If no such refb is available in reference list B, refb is determined as a reference different from refa whose temporal distance to the current picture is the minimal one in list B. After refb is determined, MVb is derived by scaling MVa based on the temporal distances between the current picture and refa and refb.
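A minimal sketch of the scaling step, using signed POC distances as an assumed convention (references on opposite temporal sides of the current picture carry opposite signs, so the scaled MV flips direction):

```python
def scale_paired_mv(mva, td_a, td_b):
    """Derive MVb from MVa by scaling with the signed POC distances
    td_a (current picture to refa) and td_b (current picture to refb)."""
    scale = td_b / td_a
    return (round(mva[0] * scale), round(mva[1] * scale))
```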

Four MVs from the interpolated MV field are also added to the CU-level candidate list. More specifically, the interpolated MVs at the positions (0, 0), (W/2, 0), (0, H/2), and (W/2, H/2) of the current CU are added.

When FRUC is applied in AMVP mode, the original AMVP candidates are also added to the CU-level MV candidate set.

At the CU level, up to 15 MVs for AMVP CUs and up to 13 MVs for Merge CUs are added to the candidate list.

Sub-CU-level MV candidate set

The MV candidate set at the sub-CU level consists of:
- the MV determined from the CU-level search,
- the top, left, top-left, and top-right neighbouring MVs,
- scaled versions of collocated MVs from reference pictures,
- up to 4 ATMVP candidates, and
- up to 4 STMVP candidates.

The scaled MVs from reference pictures are derived as follows. All the reference pictures in both lists are traversed. The MV at the collocated position of the sub-CU in a reference picture is scaled to the reference of the starting CU-level MV.

The ATMVP and STMVP candidates are limited to the first four.

At the sub-CU level, up to 17 MVs are added to the candidate list.

Generation of the interpolated MV field

Before encoding a frame, an interpolated motion field is generated for the whole picture based on unilateral ME. The motion field may then be used later as CU-level or sub-CU-level MV candidates.

First, the motion field of each reference picture in both reference lists is traversed at a 4×4 block level. For each 4×4 block, if the motion associated with the block passes through a 4×4 block in the current picture (as shown in Figure 12) and the block has not been assigned any interpolated motion, the motion of the reference block is scaled to the current picture according to the temporal distances TD0 and TD1 (in the same way as the MV scaling of TMVP in HEVC), and the scaled motion is assigned to the block in the current frame. If no scaled MV is assigned to a 4×4 block, the block's motion is marked as unavailable in the interpolated motion field.

Interpolation and matching cost

When a motion vector points to a fractional sample position, motion-compensated interpolation is needed. To reduce complexity, bilinear interpolation instead of the regular 8-tap HEVC interpolation is used for both bilateral matching and template matching.

The calculation of the matching cost differs slightly at different steps. When selecting a candidate from the candidate set at the CU level, the matching cost is the sum of absolute differences (SAD) of bilateral matching or template matching. After the starting MV is determined, the matching cost C of bilateral matching in the sub-CU-level search is calculated as follows:

C = SAD + w ∙ (|MV_x − MV_x^s| + |MV_y − MV_y^s|)    (Equation 2)

where w is a weighting factor empirically set to 4, and MV and MV^s indicate the current MV and the starting MV, respectively. SAD is still used as the matching cost of template matching in the sub-CU-level search.
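The sub-CU-level bilateral matching cost above can be sketched directly; the helper name and argument layout are illustrative only:

```python
def bilateral_match_cost(sad, mv, mv_start, w=4):
    """Equation 2: C = SAD + w * (|MVx - MVx_s| + |MVy - MVy_s|),
    where w is the weighting factor (empirically 4) and mv_start is
    the starting MV from the CU-level search."""
    return sad + w * (abs(mv[0] - mv_start[0]) + abs(mv[1] - mv_start[1]))
```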

In FRUC mode, the MV is derived by using the luma samples only. The derived motion will be used for both luma and chroma in MC inter prediction. After the MV is decided, final MC is performed using the 8-tap interpolation filter for luma and the 4-tap interpolation filter for chroma.

MV refinement

MV refinement is a pattern-based MV search with the criterion of bilateral matching cost or template matching cost. In JEM, two search patterns are supported: an unrestricted center-biased diamond search (UCBDS) and an adaptive cross search for MV refinement at the CU level and the sub-CU level, respectively. For both CU-level and sub-CU-level MV refinement, the MV is directly searched at quarter-luma-sample MV accuracy, followed by one-eighth-luma-sample MV refinement. The search range of MV refinement for the CU and sub-CU steps is set equal to 8 luma samples.

Selection of prediction direction in template matching FRUC Merge mode

In the bilateral matching Merge mode, bi-prediction is always applied, because the motion information of a CU is derived based on the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures. There is no such limitation for the template matching Merge mode. In the template matching Merge mode, the encoder can choose among uni-prediction from list 0, uni-prediction from list 1, or bi-prediction for a CU. The selection is based on the template matching cost, as follows:
If costBi <= factor * min(cost0, cost1)
    bi-prediction is used;
Otherwise, if cost0 <= cost1
    uni-prediction from list 0 is used;
Otherwise,
    uni-prediction from list 1 is used;

where cost0 is the SAD of the list 0 template matching, cost1 is the SAD of the list 1 template matching, and costBi is the SAD of the bi-prediction template matching. The value of factor is equal to 1.25, which means that the selection process is biased toward bi-prediction.
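The selection rule can be sketched as follows (the return labels are illustrative; a factor larger than 1 keeps bi-prediction even when its cost is slightly above the best uni-prediction cost):

```python
def select_prediction_direction(cost0, cost1, cost_bi, factor=1.25):
    """Template-matching-cost based direction choice for FRUC Merge:
    bi-prediction unless its cost exceeds factor * min(cost0, cost1)."""
    if cost_bi <= factor * min(cost0, cost1):
        return "bi"
    return "uni_l0" if cost0 <= cost1 else "uni_l1"
```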

The inter prediction direction selection is applied only to the CU-level template matching process.

Decoder-side motion vector refinement

In a bi-prediction operation, for the prediction of one block region, two prediction blocks formed using a motion vector (MV) of list 0 and an MV of list 1, respectively, are combined to form a single prediction signal. In the decoder-side motion vector refinement (DMVR) method, the two motion vectors of the bi-prediction are further refined by a bilateral template matching process. Bilateral template matching is applied in the decoder to perform a distortion-based search between the bilateral template and the reconstructed samples in the reference pictures, in order to obtain refined MVs without the transmission of additional motion information.

In DMVR, the bilateral template is generated as the weighted combination (i.e., average) of the two prediction blocks from the initial MV0 of list 0 and MV1 of list 1, respectively, as shown in Figure 10. The template matching operation consists of calculating cost measures between the generated template and the sample region (around the initial prediction block) in the reference picture. For each of the two reference pictures, the MV that yields the minimum template cost is considered as the updated MV of that list to replace the original one. In JEM, nine MV candidates are searched for each list. The nine MV candidates include the original MV and eight surrounding MVs with one luma-sample offset to the original MV in the horizontal direction, the vertical direction, or both. Finally, the two new MVs, i.e., MV0′ and MV1′ as shown in Figure 10, are used to generate the final bi-prediction results. The sum of absolute differences (SAD) is used as the cost measure.
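A simplified sketch of the DMVR candidate search for one list (the cost callable stands in for the bilateral-template SAD computation, which is not reproduced here):

```python
def dmvr_search_candidates(mv):
    """The nine DMVR candidates for one list: the original MV plus the
    eight positions offset by one luma sample horizontally, vertically,
    or both."""
    return [(mv[0] + dx, mv[1] + dy)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def dmvr_refine(mv, template_cost):
    """Pick the candidate minimizing the bilateral-template cost;
    template_cost is a callable mapping an MV to its cost."""
    return min(dmvr_search_candidates(mv), key=template_cost)
```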

DMVR is applied for the Merge mode of bi-prediction with one MV from a reference picture in the past and another from a reference picture in the future, without the transmission of additional syntax elements. In JEM, when LIC, affine motion, FRUC, or sub-CU Merge candidates are enabled for a CU, DMVR is not applied.

Adaptive motion vector difference resolution

In HEVC, a motion vector difference (MVD) (between the motion vector and the predicted motion vector of a PU) is signaled in units of quarter luma samples when use_integer_mv_flag is equal to 0 in the slice header. In JEM, a locally adaptive motion vector resolution (LAMVR) is introduced. In JEM, an MVD can be coded in units of quarter luma samples, integer luma samples, or four luma samples. The MVD resolution is controlled at the coding unit (CU) level, and MVD resolution flags are conditionally signaled for each CU that has at least one non-zero MVD component.

For a CU that has at least one non-zero MVD component, a first flag is signaled to indicate whether quarter-luma-sample MV precision is used in the CU. When the first flag (equal to 1) indicates that quarter-luma-sample MV precision is not used, another flag is signaled to indicate whether integer-luma-sample MV precision or four-luma-sample MV precision is used.

When the first MVD resolution flag of a CU is zero, or is not coded for a CU (meaning all MVDs in the CU are zero), the quarter-luma-sample MV resolution is used for the CU. When a CU uses integer-luma-sample MV precision or four-luma-sample MV precision, the MVPs in the AMVP candidate list for the CU are rounded to the corresponding precision.
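A sketch of the MVP rounding step, assuming MVs stored internally in quarter-luma-sample units; the exact tie-handling convention here is an assumption, as codecs differ in how they round negative components:

```python
def round_mvp(mv, shift):
    """Round an MV in quarter-luma-sample units to the coarser LAMVR
    precision: shift = 0 (quarter), 2 (integer), 4 (four-sample).
    Uses round-half-up via an added offset before the shift."""
    if shift == 0:
        return mv
    offset = 1 << (shift - 1)
    return tuple(((c + offset) >> shift) << shift for c in mv)
```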

In the encoder, CU-level RD checks are used to determine which MVD resolution is to be used for a CU. That is, the CU-level RD check is performed three times, once for each MVD resolution. To accelerate the encoder, the following encoding schemes are applied in JEM.

During the RD check of a CU with the normal quarter-luma-sample MVD resolution, the motion information of the current CU (integer-luma-sample accuracy) is stored. The stored motion information (after rounding) is used as the starting point for further small-range motion vector refinement during the RD checks for the same CU with the integer-luma-sample and 4-luma-sample MVD resolutions, so that the time-consuming motion estimation process is not duplicated three times.

The RD check of a CU with 4-luma-sample MVD resolution is invoked conditionally. For a CU, when the RD cost of the integer-luma-sample MVD resolution is much larger than that of the quarter-luma-sample MVD resolution, the RD check of the 4-luma-sample MVD resolution for the CU is skipped.

Sub-CU based motion vector prediction

In JEM, each CU can have at most one set of motion parameters for each prediction direction. Two sub-CU-level motion vector prediction methods are considered in the encoder by splitting a large CU into sub-CUs and deriving the motion information for all the sub-CUs of the large CU. The alternative temporal motion vector prediction (ATMVP) method allows each CU to fetch multiple sets of motion information from multiple blocks smaller than the current CU in the collocated reference picture. In the spatial-temporal motion vector prediction (STMVP) method, the motion vectors of the sub-CUs are derived recursively by using the temporal motion vector predictor and the spatial neighbouring motion vectors.

To preserve a more accurate motion field for sub-CU motion prediction, the motion compression for the reference frames is currently disabled.

Alternative temporal motion vector prediction

In the alternative temporal motion vector prediction (ATMVP) method, temporal motion vector prediction (TMVP) is modified by fetching multiple sets of motion information (including motion vectors and reference indices) from blocks smaller than the current CU. As shown in Figure 11, the sub-CUs are square N×N blocks (N is set to 4 by default).

Figure 13 shows an example of the bilateral template matching process. In the first step, a bilateral template is generated from the prediction blocks. In the second step, bilateral template matching is used to find the best-matching blocks.

ATMVP predicts the motion vectors of the sub-CUs within a CU in two steps. The first step is to identify the corresponding block in a reference picture with a so-called temporal vector. The reference picture is called the motion source picture. The second step is to split the current CU into sub-CUs and obtain the motion vectors as well as the reference indices of each sub-CU from the block corresponding to each sub-CU, as shown in Figure 14.

In the first step, the reference picture and the corresponding block are determined by the motion information of the spatial neighbouring blocks of the current CU. To avoid the repetitive scanning process of the neighbouring blocks, the first Merge candidate in the Merge candidate list of the current CU is used. The first available motion vector and its associated reference index are set as the temporal vector and the index of the motion source picture. In this way, the corresponding block can be identified more accurately in ATMVP than in TMVP, where the corresponding block (sometimes called the collocated block) is always in a bottom-right or center position relative to the current CU. In one example, if the first Merge candidate is from the left neighbouring block (i.e., A1 in Figure 15), the associated MV and reference picture are utilized to identify the source block and the source picture.

In the second step, a corresponding block of a sub-CU is identified by the temporal vector in the motion source picture, by adding the temporal vector to the coordinates of the current CU. For each sub-CU, the motion information of its corresponding block (the smallest motion grid that covers the center sample) is used to derive the motion information for the sub-CU. After the motion information of a corresponding N×N block is identified, it is converted to the motion vectors and reference indices of the current sub-CU in the same way as the TMVP of HEVC, wherein motion scaling and other procedures apply. For example, the decoder checks whether the low-delay condition is fulfilled (i.e., the POCs of all reference pictures of the current picture are smaller than the POC of the current picture) and possibly uses the motion vector MVx (the motion vector corresponding to reference picture list X) to predict the motion vector MVy (with X being equal to 0 or 1 and Y being equal to 1−X) for each sub-CU.

Spatial-temporal motion vector prediction

In this method, the motion vectors of the sub-CUs are derived recursively, following raster scan order. Figure 16 illustrates this concept. Consider an 8×8 CU that contains four 4×4 sub-CUs A, B, C, and D. The neighbouring 4×4 blocks in the current frame are labelled a, b, c, and d.
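The STMVP derivation described next ends by averaging the available motion vectors (up to three) separately for each reference list; that final step can be sketched as follows (the helper name and the None-for-unavailable convention are assumptions):

```python
def stmvp_average(candidates):
    """Average the available MVs (up to three: two spatial neighbours
    and the TMVP) for one reference list. Entries are (mvx, mvy)
    tuples already scaled to the list's first reference frame, or
    None when unavailable."""
    mvs = [mv for mv in candidates if mv is not None]
    if not mvs:
        return None
    n = len(mvs)
    return (sum(mv[0] for mv in mvs) / n, sum(mv[1] for mv in mvs) / n)
```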

The motion derivation for sub-CU A starts by identifying its two spatial neighbours. The first neighbour is the N×N block above sub-CU A (block c). If this block c is not available or is intra coded, the other N×N blocks above sub-CU A are checked (from left to right, starting at block c). The second neighbour is the block to the left of sub-CU A (block b). If block b is not available or is intra coded, the other blocks to the left of sub-CU A are checked (from top to bottom, starting at block b). The motion information obtained from the neighbouring blocks for each list is scaled to the first reference frame of the given list. Next, the temporal motion vector predictor (TMVP) of sub-block A is derived by following the same procedure as the TMVP derivation specified in HEVC. The motion information of the collocated block at position D is fetched and scaled accordingly. Finally, after retrieving and scaling the motion information, all available motion vectors (up to three) are averaged separately for each reference list. The averaged motion vector is assigned as the motion vector of the current sub-CU.

Sub-CU motion prediction mode signaling

The sub-CU modes are enabled as additional Merge candidates, and no additional syntax element is required to signal the modes. Two additional Merge candidates are added to the Merge candidate list of each CU to represent the ATMVP mode and the STMVP mode. Up to seven Merge candidates are used if the sequence parameter set indicates that ATMVP and STMVP are enabled. The encoding logic of the additional Merge candidates is the same as that of the Merge candidates in HM, which means that for each CU in a P or B slice, two more RD checks are needed for the two additional Merge candidates.

In JEM, all bins of the Merge index are context coded by CABAC, while in HEVC only the first bin is context coded and the remaining bins are context bypass coded.

Examples of problems solved by the embodiments

Although the MVD provides great flexibility to adapt to various kinds of motion in a video signal, it constitutes a large portion of the bitstream. Especially during bi-prediction, both the MVD of L0 and the MVD of L1 need to be signaled, and they introduce a large overhead, especially for low-rate visual communication. Some properties regarding motion symmetry can be exploited to save the rate spent on coding the motion information.

The current AMVP mode (including both the MVP index and the reference index) is signaled separately for L0 and L1, whereas when the motion follows a symmetry model, they can be represented more efficiently.

Example embodiments

1.     在雙向預測期間,可以利用運動向量的對稱性的屬性來生成用於AMVP模式的基礎MV集。具體地,僅針對單個方向(列表),信令通知MVD,並且使用鏡像條件來設置另一方向的MV。替代地,此外,可以進一步細化MV。這種模式稱為對稱雙向預測模式(sym-bi-mode)。本文,雙向預測是指通過按顯示順序使用來自過去的一個參考幀和來自未來的另一參考幀進行預測。在一些示例實施例中,通用視頻編碼(versatile video coding,VVC)(例如,JVET-N1001-v5和其他版本和標準)包括對稱性運動向量差(symmetric motion vector difference,SMVD)模式,其可以跳過L1 MVD的信令通知。被跳過的L1 MVD可以被設置為L0 MVD的鏡像而無需縮放。 a.          在一個示例中,當發送L(1-N ) (N=0 或1 )MVD時,不發送LN的MVD值(即,繼承為(0,0)),並且MVP被設置為來自L(1-N ) MV的鏡像MV。之後,可以將運動細化應用於LN 運動向量。 (i)   在一個示例中,可以應用DMVR細化過程。替代地,可以應用FRUC細化過程來細化LN 運動向量。 (ii) 在一個示例中,細化的搜索範圍可以通過SPS(Sequence Parameter Set,序列參數集)、PPS(Picture Parameter Set,圖片參數集)、VPS(Video Parameter set,視頻參數集)或條帶頭來預定義或信令通知。 (iii)          在一個示例中,運動細化可以應用於特定網格。例如,具有網格距離d 的均勻採樣網格可用于定義搜索點。網格距離d 可以被預定義,或者經由SPS、PPS、VPS或條帶頭用信令通知。採樣網格的使用可以被認為是子採樣的搜索區域,因此具有減少搜索所需的記憶體頻寬的益處。 (iv)           在一個示例中,鏡像模式的信令通知可以在CU級、CTU級、區域級(覆蓋多個CU/CTU)或條帶級中進行。當它在CU級進行時,當它是sym-bi-mode時,需要用信令通知一位元(one-bit)標誌。也就是說,當該標誌被信令通知為1時,可以跳過相關聯的LN MVD以及其MVP索引。當在CTU級、區域級或條帶級完成時,所有sym-bi-mode都不會信令通知LN MVD值及其MVP索引。在一些示例實施例中,SMVD標誌的信令通知發生在CU級。 b.          在一個示例中,在條帶頭/圖片參數集/序列參數集中存在一位元標誌,用於信令通知是否應該調用細化過程。替代地,也可以在CU/CTU/區域級進行信令通知。 c.          在一個示例中,在雙向預測期間,可以信令通知要跳過哪個MVD列表。信令通知可以在CU級、區域級、CTU級或條帶級發生。當在CU級進行信令通知時,需要在sym-bi-mode中信令通知一位元標誌。當在區域級、CTU級或條帶級用信令通知時,所有屬於雙向預測CU的都將跳過指定列表的MVD的信令通知,並使用鏡像MVP作為其起始點來找到最終運動向量。 d.          在一個示例中,僅需要將鏡像MVP存儲在MV緩衝器中以用於後續塊的運動預測(AMVP,Merge)。細化的運動向量不需要存儲在MV緩衝器中。 e.          在一個示例中,MVP可以隨常規MVP索引放置,並且需要一個附加位元(總共2個)來信令通知三個MVP索引。在一些實施例中,在SMVD模式中,兩個MVP索引都被信令通知為常規AMVP模式。 f.           在一個示例中,添加鏡像MVP候選來代替第二AMVP候選。儘管如此,只需要一位元來信令通知MVP索引。 g.          在一個示例中,當兩個參考幀之間的POC距離相等時,可以應用鏡像MVP模式。在一些實施例中,在SMVD模式中,導出兩個參考作為L0和L1中與當前幀最接近的參考幀。 h.          在一個示例中,由鏡像引入的縮放可以使用源幀和目標幀之間的相對時間距離。例如,如果使用L(1-N )的參考幀和LN 的參考幀,並且決定跳過LN 的MVD信令通知,LN (N = 0 或1 )的初始運動向量可以計算為:MVPN = (τN /τ(1-N ))∙MV(1-N ),其中τ0和τ1分別表示L0的當前幀和參考幀之間的POC距離和L1的當前幀和參考幀之間的POC距離。1. During the bidirectional prediction, the symmetry property of the motion vector can be used to generate the basic MV set for the AMVP mode. 
Specifically, only for a single direction (list), MVD is signaled, and the mirroring condition is used to set the MV in the other direction. Alternatively, in addition, the MV can be further refined. This mode is called symmetric bi-directional prediction mode (sym-bi-mode). In this paper, bidirectional prediction refers to prediction by using one reference frame from the past and another reference frame from the future in display order. In some example embodiments, universal video coding (VVC) (for example, JVET-N1001-v5 and other versions and standards) includes a symmetric motion vector difference (SMVD) mode, which can skip Signaling through L1 MVD. The skipped L1 MVD can be set as a mirror image of the L0 MVD without scaling. a. In an example, when L( 1-N ) ( N=0 or 1 ) MVD is sent, the MVD value of LN is not sent (that is, inherited as (0, 0)), and the MVP is set from L ( 1-N ) Mirror MV of MV. After that, motion refinement can be applied to the L N motion vector. (i) In one example, the DMVR refinement process can be applied. Alternatively, the FRUC refinement process can be applied to refine the L N motion vector. (ii) In an example, the refined search range can be through SPS (Sequence Parameter Set), PPS (Picture Parameter Set), VPS (Video Parameter Set, video parameter set) or strip header To be pre-defined or signaled. (iii) In one example, motion refinement can be applied to a specific grid. For example, a uniform sampling grid with a grid distance d can be used to define search points. The grid distance d can be predefined or signaled via SPS, PPS, VPS or slice header. The use of the sampling grid can be considered as a sub-sampling search area, and therefore has the benefit of reducing the memory bandwidth required for searching. (iv) In an example, the signaling of the mirroring mode can be performed at the CU level, CTU level, regional level (covering multiple CU/CTU), or stripe level. 
When it is performed at the CU level, when it is sym-bi-mode, a one-bit flag needs to be signaled. That is, when the flag is signaled as 1, the associated L N MVD and its MVP index can be skipped. When completed at the CTU level, regional level, or stripe level, all sym-bi-modes will not signal the L N MVD value and its MVP index. In some example embodiments, the signaling of the SMVD flag occurs at the CU level. b. In an example, there is a one-bit flag in the slice header/picture parameter set/sequence parameter set for signaling whether the refinement process should be invoked. Alternatively, signaling can also be performed at the CU/CTU/regional level. c. In one example, during bidirectional prediction, it is possible to signal which MVD list to skip. Signaling can occur at the CU level, regional level, CTU level, or stripe level. When signaling at the CU level, a one-bit flag needs to be signaled in sym-bi-mode. When signaling at the regional level, CTU level, or stripe level, all bi-predictive CUs will skip the MVD signaling of the specified list and use the mirror MVP as their starting point to find the final motion vector . d. In one example, only the mirrored MVP needs to be stored in the MV buffer for use in the motion prediction (AMVP, Merge) of subsequent blocks. The refined motion vector does not need to be stored in the MV buffer. e. In an example, the MVP can be placed along with the regular MVP index, and an additional bit (2 in total) is required to signal three MVP indexes. In some embodiments, in the SMVD mode, both MVP indexes are signaled as the regular AMVP mode. f. In one example, add a mirrored MVP candidate to replace the second AMVP candidate. Nevertheless, only one bit is needed to signal the MVP index. g. In an example, when the POC distance between two reference frames is equal, the mirror MVP mode can be applied. 
In some embodiments, in the SMVD mode, the two references are derived as the reference frames closest to the current frame in L0 and L1. h. In one example, the scaling introduced by mirroring can use the relative temporal distance between the source frame and the target frame. For example, given the L(1-N) reference frame and the LN reference frame, where the LN MVD signaling is skipped, the initial motion vector of LN (N=0 or 1) can be calculated as: MVPN = (τN / τ(1-N)) ∙ MV(1-N), where τ0 and τ1 represent the POC distance between the current frame and the L0 reference frame and the POC distance between the current frame and the L1 reference frame, respectively.
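As a rough sketch of the scaled mirror MVP in bullet h above (the function and variable names are illustrative assumptions, not from the patent), signed POC distances are assumed here, so that reference pictures lying on opposite sides of the current picture produce a negative scale factor, i.e. a mirrored vector:

```python
def mirrored_mvp(mv_other, poc_cur, poc_ref_n, poc_ref_other):
    """Scaled mirror MVP of bullet h: MVP_N = (tau_N / tau_(1-N)) * MV_(1-N).

    Signed POC distances are assumed, so reference pictures on opposite
    sides of the current picture yield a negative ratio (a mirrored MV).
    """
    tau_n = poc_cur - poc_ref_n          # signed POC distance for list N
    tau_other = poc_cur - poc_ref_other  # signed POC distance for list 1-N
    scale = tau_n / tau_other
    return (mv_other[0] * scale, mv_other[1] * scale)

# Equal, opposite-side POC distances reduce to a pure mirror (scale = -1):
assert mirrored_mvp((4, -2), poc_cur=8, poc_ref_n=12, poc_ref_other=4) == (-4.0, 2.0)
```

When the two POC distances differ, the same formula yields the proportionally scaled (and possibly sign-flipped) initial motion vector.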

2.     Various matching schemes can be used to complete the refinement process. Let the patches from the L0 and L1 pictures be P0 and P1, respectively. A patch is defined as the prediction samples generated by the interpolation process of an MV. a. The similarity between P0 and P1 is used as the criterion for selecting the refined MV. In one example, the refinement finds the MVN (N=0 or 1) that minimizes the sum of absolute differences (SAD) between P0 and P1. b. A temporary patch is generated from P0 and P1, and the criterion can be defined as finding the MV with the highest correlation between the prediction patch and the temporary patch. For example, a separate patch P' = (P0+P1)/2 can be created and used to find the MVN (N=0 or 1) that minimizes the SAD between P' and PN. More generally, P' can be generated by the following formula: P' = ω∙P0 + (1-ω)∙P1, where ω is a weighting factor between 0 and 1. c. In one example, a template-based matching scheme can be used to define the refinement process. The top template, the left template, or both the top and left templates can be used to find MVN (N=0 or 1). The process of finding MVN (N=0 or 1) is similar to the processes described in the above two examples. d. In one example, depending on the distance from a search point to the initial mirrored MVP position, the interpolation process can be skipped for some of the search points. When searching those points whose distance to MVPN (N=0 or 1) exceeds a threshold T, no interpolation process is involved; only integer-pixel reference samples are used as patches to derive the motion vector. T can be predefined or signaled via the SPS, PPS, VPS, or slice header. e. In one example, the cost metric used to find MVN includes the estimated rate introduced by moving from the search point to the mirrored MVP: C = SAD + λ∙R, where λ is a weighting factor that weights the importance of the estimated rate in the refinement process. The value of λ can be predefined or signaled through the SPS, PPS, VPS, or slice header. Note that MVDN, MVN, and MVPN defined below are two-dimensional vectors. i. In one example, R = ||MVDN||, where MVDN = MVN − MVPN. Here, the function ||∙|| denotes the L1 norm. ii. In one example, R = round(log2(||MVDN||)), where the function round denotes rounding the input argument to the nearest integer. iii. In one example, R = mvd_coding(MVDN), where the function mvd_coding denotes the standard-compliant binarization process of the input MVD value.
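The rate-biased cost metric of bullet 2.e above can be sketched as follows. The SAD values are supplied externally here, since patch generation by interpolation is outside the scope of this toy example; the function names and the value of λ are illustrative assumptions:

```python
def refine_cost(sad, mv, mvp, lam):
    """Cost metric of bullet 2.e: C = SAD + lambda * R, with R = ||MV - MVP||_1."""
    r = abs(mv[0] - mvp[0]) + abs(mv[1] - mvp[1])
    return sad + lam * r

def refine(mvp, candidates, lam=4.0):
    """Return the candidate MV with the lowest rate-biased cost.

    `candidates` is a list of (mv, sad) pairs, one per search point; the SAD
    computation (patch generation by interpolation) is abstracted away.
    """
    return min(candidates, key=lambda c: refine_cost(c[1], c[0], mvp, lam))[0]

mvp = (0, 0)
# A distant point with a slightly lower SAD loses to a near point once the
# estimated rate is taken into account: 100 + 0 versus 95 + 4 * 10.
assert refine(mvp, [((0, 0), 100), ((5, 5), 95)]) == (0, 0)
```

The rate term thus biases the search toward motion vectors close to the mirrored MVP, trading a small SAD increase for fewer MVD bits.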

3.     MVD_L1_ZERO_FLAG is a slice-level flag that imposes a strong constraint on L1 MVD signaling by removing all L1 MVD values. The mirrored MV and the refinement can be combined with this design in the following ways. f. In one example, when MVD_L1_ZERO_FLAG is enabled, the MVP index is not signaled, and the mirrored MVP constraint and the refinement process can still be applied. g. In one example, when MVD_L1_ZERO_FLAG is enabled, the MVP index is still signaled (for example, as in 1.e or 1.f above) and the mirrored MVP constraint is not applied. However, the MV refinement process can still be applied. h. In one example, when MVD_L1_ZERO_FLAG is enabled, the mirrored MVP is added to the MVP candidate list, followed by the MV refinement process.

4.     When the signaling of the reference index and the MVP index of LN (N = 0 or 1) is involved, a joint MVP list can be created to support the mirrored MVD mode. That is, the MVP list is jointly derived for L0 and L1 (a given pair of specific reference indexes), and only a single index needs to be signaled. i. In one example, the signaling of refIdxN can be skipped, and the reference frame closest to the mirror position of the L(1-N) reference frame is selected as the reference frame used for MVP scaling. In some embodiments, in the SMVD mode, both reference indexes are skipped, because in both lists the reference frames closest to the current frame are selected. j. In one example, during the derivation process, MVP candidates for which a bi-predictor cannot be created should be treated as invalid. k. In one example, the derivation can be completed by following the existing MVP derivation process for L(1-N), except that when scaling occurs, candidate pairs whose motion vectors fall on L0 and L1 reference frames held in the decoded picture buffer (DPB) are considered valid candidates. l. The mirrored MVD mode can be expressed as:
if( sym_mvd_flag[ x0 ][ y0 ] ) {
    MvdL1[ x0 ][ y0 ][ 0 ] = −MvdL0[ x0 ][ y0 ][ 0 ]
    MvdL1[ x0 ][ y0 ][ 1 ] = −MvdL0[ x0 ][ y0 ][ 1 ]
} else
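The mirror MVD rule quoted in bullet l above can be sketched as decoder-side logic (the function name is illustrative; the surrounding parsing context is omitted):

```python
def derive_mvd_l1(sym_mvd_flag, mvd_l0, signaled_mvd_l1=None):
    """When sym_mvd_flag is set, MvdL1 is never parsed and is derived as the
    negation of MvdL0; otherwise the signaled L1 MVD is used as-is."""
    if sym_mvd_flag:
        return (-mvd_l0[0], -mvd_l0[1])
    return signaled_mvd_l1

assert derive_mvd_l1(True, (3, -7)) == (-3, 7)
assert derive_mvd_l1(False, (3, -7), (1, 1)) == (1, 1)
```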

5.     The proposed methods can also be applied to the multi-hypothesis mode. m. In this case, when there are two sets of MV information for each reference picture list, the MV information can be signaled for one reference picture list, while the MVD of the MV information set of the other reference picture list can be derived. Each set of MV information for a reference picture list can be processed in the same way as in sym-bi-mode. n. Alternatively, when there are two sets of MV information for each reference picture list, one set of MV information for each of the two reference picture lists can be signaled, while the other two sets of MV information for the two reference picture lists can be derived on the fly using sym-bi-mode.

Many video coding standards are based on a hybrid video coding structure in which temporal prediction plus transform coding is used. An example of a typical HEVC encoder framework is depicted in FIG. 17.

FIG. 18 is a block diagram of a video processing apparatus 1800. The apparatus 1800 can be used to implement one or more of the methods described herein. The apparatus 1800 may be embodied in a smartphone, a tablet computer, a computer, an Internet of Things (IoT) receiver, and so on. The apparatus 1800 may include one or more processors 1802, one or more memories 1804, and video processing hardware 1806. The processor(s) 1802 may be configured to implement one or more methods described in this document. The memory (or memories) 1804 may be used to store data and code used to implement the methods and techniques described herein. The video processing hardware 1806 may be used to implement, in hardware circuitry, some of the techniques described in this document.

FIG. 19 is a flowchart of an example method 1900 of video bitstream processing. The method 1900 includes: in response to a mirror mode flag in the video bitstream, generating (1902) second motion vector difference information based on a symmetry rule and first motion vector difference information; and reconstructing (1904) a video block using the first motion vector difference information and the second motion vector difference information, where the reconstruction is performed bi-predictively.

FIG. 20 is a flowchart of an example method 2000 of video bitstream processing. The method 2000 includes: for a first reference picture list associated with a video block, receiving (2002) motion vector difference information of a first set of motion vectors; and deriving (2004), using a multi-hypothesis symmetry rule, from the motion vector difference information of the first set of motion vectors, motion vector difference information associated with a second set of motion vectors of a second reference picture list, the second reference picture list being associated with the video block. This information can be generated using the received motion vector difference information of the first set of motion vectors.

In some embodiments, a method of video bitstream processing may include a variant of the method 2000 in which, in the multi-hypothesis case, part of the motion vector difference information is signaled in an interleaved manner. Such a method includes: for a video block, receiving first motion vector difference information associated with a first reference picture list and second motion vector difference information associated with a second reference picture list; and deriving, using a multi-hypothesis symmetry rule, from the first motion vector difference information and the second motion vector difference information, third motion vector difference information associated with the first reference picture list and fourth motion vector difference information associated with the second reference picture list.

With respect to the methods 1900 and 2000, bitstream processing may include generating a bitstream representing the video in compressed form. Alternatively, bitstream processing may include reconstructing the video from its compressed-form representation using the bitstream.

With respect to the methods 1900 and 2000, in some embodiments the symmetry rule and the multi-hypothesis symmetry rule may be the same or different. In particular, the multi-hypothesis symmetry rule may be used only when the video block (or picture) is coded using multi-hypothesis motion prediction.

With respect to the methods 1900 and 2000, the symmetry rule may specify that the second motion vector prediction difference is to be (0, 0) and that the corresponding motion vector predictor is set to a mirrored motion vector whose value is derived from the first motion vector difference information. In addition, motion vector refinement may further be performed on the mirrored motion vector value. As described in the examples above, the mirror mode can be selectively used based on an indication in the bitstream at the CU/CTU/region level. Similarly, a refinement flag may also be signaled to control whether motion vector refinement is used (or not used). The refinement flag may be used at the slice header, picture parameter set, sequence parameter set, region level, coding unit level, or coding tree unit level.

With respect to the methods 1900 and 2000, using a symmetry-rule-based technique to generate the mirrored motion vector can make it possible to skip sending the motion vector difference information in the bitstream (because that information can be generated by the decoder). The skip operation can be selectively controlled via a flag in the bitstream. In one advantageous aspect, the mirrored MVP computed using the above techniques can be used on the decoder side to improve the decoding of subsequent blocks without the adverse effect of the computational dependency that may arise when a refined motion vector is used for the prediction of subsequent blocks.

With respect to the methods 1900 and 2000, in some embodiments the symmetry rule may be used to generate the mirrored motion vector only when the two reference frames are at the same distance. Otherwise, scaling of the motion vector can be performed based on the relative temporal distances of the reference frames.

With respect to the methods 1900 and 2000, in some embodiments a patch-based technique may be used to compute the mirrored motion vector, and it may include generating a first patch of prediction samples using the first motion vector difference from reference frame list 0, generating a second patch of prediction samples using the first motion vector difference from reference frame list 1, and determining the motion vector refinement as the value that minimizes an error function between the first patch and the second patch. Various optimization criteria (for example, rate distortion, SAD, and so on) can be used to determine the refined motion vector.
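As a minimal, hedged sketch of the patch-based error computation (pure-Python lists stand in for interpolated prediction samples; all names are illustrative assumptions), the SAD between two patches, and the weighted temporary patch P' = ω∙P0 + (1-ω)∙P1 from item 2.b above, can be written as:

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized patches."""
    return sum(abs(x - y) for x, y in zip(a, b))

def weighted_patch(p0, p1, w=0.5):
    """Temporary patch of item 2.b: P' = w*P0 + (1 - w)*P1, elementwise."""
    return [w * x + (1 - w) * y for x, y in zip(p0, p1)]

p0 = [10, 20, 30, 40]   # prediction samples from the list-0 reference
p1 = [14, 18, 34, 38]   # prediction samples from the list-1 reference
p_avg = weighted_patch(p0, p1)   # (P0 + P1) / 2
# A candidate patch close to P' yields a lower SAD and would be preferred.
assert sad([12, 19, 32, 39], p_avg) < sad([0, 0, 0, 0], p_avg)
```

The refinement loop would evaluate such an error function for each search point and keep the motion vector giving the minimum.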

It should be understood that techniques are disclosed for reducing the number of bits used to represent motion in a compressed video bitstream. Using the disclosed techniques, bidirectional prediction can be signaled using only half of the motion information of conventional techniques, and the other half of the motion information can be generated at the decoder using the mirror symmetry of the motion of objects in the video. A symmetry flag and a refinement flag can be used to signal the use (or non-use) of this mode and the further refinement of the motion vector. A symmetry rule can be used to compute the mirrored motion vector. One assumption made in the symmetry rule is that an object maintains its translational motion between the time of the current block and the times of the reference blocks used for bidirectional prediction. For example, using one symmetry rule, a motion vector pointing to a reference region displaced by delx and dely from the current block in one temporal direction can be assumed to change to a scaled version of delx and dely in the other direction (the scaling may also include negative scaling, which may be due to a change in the direction of the motion vector). The scaling can depend on the temporal distance and other considerations, and is described in this document.

The disclosed and other solutions, examples, embodiments, modules, and functional operations described in this document can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this document and their structural equivalents, or in combinations of one or more of them. The disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer-readable medium, for execution by, or to control the operation of, a data processing apparatus. The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing apparatus" encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. In addition to hardware, the apparatus can include code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them. A propagated signal is an artificially generated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to a suitable receiver apparatus.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this document can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special-purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).

Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory, or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical, or optical disks. However, a computer need not have such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special-purpose logic circuitry.

While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.

Only a few implementations and examples are described, and other implementations, enhancements, and variations can be made based on what is described and illustrated in this patent document.

1800: video processing apparatus
1802: processor
1804: memory
1806: video processing hardware
1900, 2000: methods
1902, 1904, 2002, 2004: steps
A0, A1, B0, B1, B2, C0, C1: positions
A, B, C, D: sub-CUs
a, b, c, d: blocks
mvL0, mvL1, refIdxL0, refIdxL1: candidates
MV0, MV1, MV0′, MV1′: motion vectors
refa, refb: reference pictures
tb, td: positions
TD0, TD1: temporal distances

FIG. 1 shows an example of the derivation process for Merge candidate list construction.
FIG. 2 shows example positions of spatial Merge candidates.
FIG. 3 is an illustration of motion vector scaling for spatial motion vector candidates.
FIG. 4 shows an example derivation process for motion vector prediction candidates.
FIG. 5 shows an example of candidate pairs considered for the redundancy check of spatial Merge candidates.
FIG. 6 shows example positions of the second PU for Nx2N and 2NxN partitions.
FIG. 7 is an example of motion vector scaling for temporal Merge candidates.
FIG. 8 shows an example of candidate positions, marked C0 and C1, of temporal Merge candidates.
FIG. 9 shows an example of combined bi-predictive Merge candidates.
FIG. 10 shows an example of a bilateral matching process.
FIG. 11 shows an example of a template matching process.
FIG. 12 shows an example of unilateral motion estimation (ME) in frame rate up-conversion (FRUC).
FIG. 13 shows an example of a bilateral template matching process.
FIG. 14 shows an example of an alternative temporal motion vector prediction (ATMVP) method.
FIG. 15 shows an example of identifying a source block and a source picture.
FIG. 16 is an example of a coding unit (CU) with four sub-blocks (A-D) and its neighboring sub-blocks (a-d).
FIG. 17 shows an example of a block diagram of a video encoding apparatus.
FIG. 18 is a block diagram of an example of a video processing apparatus.
FIG. 19 is a flowchart of an example of a video bitstream processing method.
FIG. 20 is a flowchart of another example of a video bitstream processing method.

1900: method

1902, 1904: steps

Claims (34)

A method of processing a video bitstream, comprising: in response to a mirror mode flag in the video bitstream, generating second motion vector difference information based on a symmetry rule and first motion vector difference information; and reconstructing a video block in a current picture using the first motion vector difference information and the second motion vector difference information, wherein the reconstruction is performed using bidirectional prediction, and wherein the symmetry rule specifies that the second motion vector difference information is not sent, the second motion vector difference being set as a mirror of the first motion vector difference without scaling.

The method of claim 1, further comprising: setting a second motion vector prediction to a mirrored motion vector of the first motion vector; and performing motion vector refinement of the mirrored motion vector value to generate a motion vector refinement value.

The method of claim 1, wherein the mirror mode flag is present at a coding unit (CU) level, a coding tree unit (CTU) level, a region level covering multiple CUs/CTUs, or a slice level.

The method of claim 2, wherein the motion vector refinement is selectively performed based on a refinement flag in the video bitstream.
The method of claim 4, wherein the refinement flag is included at least in a slice header, a picture parameter set, a sequence parameter set, a region level, a coding unit level, or a coding tree unit level.

The method of claim 1, wherein the video bitstream includes skip information indicating a list whose motion vector difference is skipped from signaling in the video bitstream.

The method of claim 6, wherein the skip information is at a coding unit level, a region level, a coding tree unit level, or a slice level.

The method of claim 7, wherein, in a case where the skip information is at the region level, the coding tree unit level, or the slice level, a coding unit uses the symmetry rule to generate second motion vector information.

The method of claim 1, further comprising: storing a motion vector predictor generated using the symmetry rule for processing prediction information of a subsequent video block.

The method of claim 9, wherein the motion vector predictor is used together with regular motion vector predictors, and wherein a two-bit field signals the motion vector predictor in the video bitstream.

The method of claim 9, wherein the motion vector predictor is used in place of one of the regular motion vector predictors, and signaling is performed using a single bit in the video bitstream.
The method of any one of claims 1 to 11, wherein the symmetry rule is used only in a case where the picture order count distances to the two reference frames used for bidirectional prediction are equal.

The method of claim 1, wherein the video bitstream omits signaling of a motion vector difference for reference picture list 1, and wherein the bidirectional prediction is performed using: (a first picture order count (POC) of a first reference picture in reference picture list 0) − (a second POC of the current picture) = (the POC of the current picture) − (a third POC of another reference picture in reference picture list 1).

The method of claim 1, wherein the two reference pictures used for the bidirectional prediction are the reference pictures closest to the current picture, derived from a past frame and a future frame.

The method of claim 1, wherein the video bitstream jointly signals the reference indexes and motion vector prediction indexes of reference list 0 and reference list 1, using a single reference index and a single motion vector prediction index for each video block.
A method of processing a video bitstream, comprising: in response to a mirror mode flag in the video bitstream, generating second motion vector difference information based on a symmetry rule and first motion vector difference information; and reconstructing a video block in a current picture using the first motion vector difference information and the second motion vector difference information, wherein the reconstruction is performed using bi-prediction, and wherein a mirrored motion vector value is determined using scaling proportional to the relative temporal distance between a source frame and a target reference frame of the video block.
The method according to claim 2, wherein performing the motion vector refinement comprises: generating a first patch of prediction samples using a third motion vector from a reference frame associated with a first reference picture list; generating a second patch of prediction samples using the mirrored motion vector value from a reference frame associated with a second reference picture list; and determining the motion vector refinement value as the value that minimizes an error function between the first patch and the second patch.
The method according to claim 17, wherein the error function includes a sum of absolute differences measure.
The method according to claim 17, wherein the error function includes a correlation between the motion vector refinement value and a weighted linear average of the first patch and the second patch.
The method according to claim 17, wherein the error function is a rate-distortion function using the motion vector refinement value.
The method according to claim 2, wherein performing the motion vector refinement comprises: determining the motion vector refinement value as the value that minimizes an error function using top and left reference or interpolated samples between the reference frames associated with the two reference picture lists.
The method according to claim 2, wherein performing the motion vector refinement comprises: when the motion vector refinement value is greater than a threshold, determining the motion vector refinement value as the value that minimizes an error function using integer reference samples between the two reference frames associated with the two reference picture lists.
The method according to claim 1, wherein the symmetry rule is responsive to a flag comprising MVD_L1_ZERO_FLAG included in slice-level signaling for the video block.
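The refinement claims describe bilateral matching: two prediction patches are fetched with mirrored offsets and the offset minimizing an error function (here, the sum-of-absolute-differences variant of claim 18) is kept. A self-contained sketch, where `fetch_l0`/`fetch_l1` are hypothetical callbacks that return a prediction patch for a given motion vector:

```python
def sad(p0, p1):
    """Sum of absolute differences between two equally sized patches
    given as lists of rows."""
    return sum(abs(a - b)
               for row0, row1 in zip(p0, p1)
               for a, b in zip(row0, row1))

def refine_mv(fetch_l0, fetch_l1, mv0, mv1, search=1):
    """Bilateral-matching refinement sketch: try small offsets around
    the initial motion vectors, apply each offset to the list-0 patch
    and its mirror to the list-1 patch, and return the offset whose
    two patches match best under SAD."""
    best_cost, best_off = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            p0 = fetch_l0((mv0[0] + dx, mv0[1] + dy))
            p1 = fetch_l1((mv1[0] - dx, mv1[1] - dy))  # mirrored offset
            cost = sad(p0, p1)
            if best_cost is None or cost < best_cost:
                best_cost, best_off = cost, (dx, dy)
    return best_off
```

The other error functions named in the claims (a correlation against a weighted average of the patches, or a rate-distortion cost) would slot in as drop-in replacements for `sad`.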
A video bitstream processing method, comprising: for a first reference picture list associated with a video block, receiving motion vector difference information of a first set of motion vectors; deriving, from the motion vector difference information of the first set of motion vectors and using a multi-hypothesis symmetry rule, motion vector difference information associated with a second set of motion vectors of a second reference picture list associated with the video block, wherein the multi-hypothesis symmetry rule specifies that the second motion vector difference value is (0, 0) and the corresponding motion vector predictor is set to a mirrored motion vector value derived from the first motion vector difference information without scaling; and performing a conversion between the video block and a bitstream representation of the video block using a result of the deriving.
The method according to claim 24, comprising: deriving, using the multi-hypothesis symmetry rule, further motion vector difference information associated with the first reference picture list associated with the video block; and deriving, using the multi-hypothesis symmetry rule, further motion vector difference information associated with the second reference picture list associated with the video block.
A method of processing a video bitstream, comprising: for a video block, receiving first motion vector difference information associated with a first reference picture list; for the video block, receiving second motion vector difference information associated with a second reference picture list; and deriving, from the first motion vector difference information and the second motion vector difference information and using a multi-hypothesis symmetry rule, third motion vector difference information associated with the first reference picture list and fourth motion vector difference information associated with the second reference picture list, wherein the multi-hypothesis symmetry rule specifies that the second motion vector difference value is (0, 0) and the corresponding motion vector predictor is set to a mirrored motion vector value derived from the first motion vector difference information.
The method according to any one of claims 24 to 26, further comprising: performing motion vector refinement of the mirrored motion vector value to generate a motion vector refinement value.
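Under the multi-hypothesis symmetry rule of claims 24 and 26, the list-1 MVD is pinned to (0, 0) and the list-1 predictor is the unscaled mirror of the reconstructed list-0 motion vector. A sketch of that derivation (names are illustrative assumptions, not from the patent):

```python
def derive_l1_mv(mv_pred_l0, mvd_l0):
    """Multi-hypothesis symmetry rule sketch: reconstruct the list-0
    motion vector from its predictor and signaled MVD, then form the
    list-1 motion vector from a (0, 0) MVD and the mirrored, unscaled
    list-0 vector as predictor."""
    mv_l0 = (mv_pred_l0[0] + mvd_l0[0], mv_pred_l0[1] + mvd_l0[1])
    mvd_l1 = (0, 0)                      # fixed by the rule, never signaled
    mv_pred_l1 = (-mv_l0[0], -mv_l0[1])  # mirror of mv_l0, no scaling
    return (mv_pred_l1[0] + mvd_l1[0], mv_pred_l1[1] + mvd_l1[1])

derive_l1_mv((1, 2), (3, 4))  # mv_l0 = (4, 6), so mv_l1 = (-4, -6)
```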
A video processing method, comprising: receiving a future frame of a video relative to a reference frame of the video; receiving a motion vector relating the future frame of the video and a past frame of the video; applying a predetermined relationship between the future frame of the video and the past frame of the video; and reconstructing the past frame of the video based on the future frame of the video, the motion vector, and the predetermined relationship between the past frame of the video and the future frame of the video, wherein the predetermined relationship is that the future frame of the video and the past frame of the video are related by a mirroring condition, the mirroring condition requiring no scaling.
The method according to claim 28, wherein the mirroring condition means that an object having coordinates (x, y) in the future frame of the video has coordinates (-x, -y) in the past frame of the video.
A video processing method, comprising: receiving a past frame of a video relative to a reference frame of the video; receiving a motion vector relating the past frame of the video and a future frame of the video; applying a predetermined relationship between the future frame of the video and the past frame of the video; and reconstructing the future frame of the video based on the past frame of the video, the motion vector, and the predetermined relationship between the past frame of the video and the future frame of the video, wherein the predetermined relationship is that the future frame of the video and the past frame of the video are related by a mirroring condition, the mirroring condition requiring no scaling.
The method according to claim 30, wherein the mirroring condition means that an object having coordinates (x, y) in the past frame of the video has coordinates (-x, -y) in the future frame of the video.
A video decoding apparatus, comprising: a processor configured to implement the method according to any one of claims 1 to 31.
A video encoding apparatus, comprising: a processor configured to implement the method according to any one of claims 1 to 31.
A computer program product having computer code stored thereon, wherein the code, when executed by a processor, causes the processor to implement the method according to any one of claims 1 to 31.
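The mirroring condition in the frame-reconstruction claims is a fixed coordinate map: an object at (x, y) in one of the two frames appears at (-x, -y) in the other, with no scaling applied in either direction. A minimal sketch:

```python
def mirrored_position(coord):
    """Mirroring condition sketch: map an object's coordinates (x, y)
    in one frame to (-x, -y) in the mirrored frame, without scaling."""
    x, y = coord
    return (-x, -y)

mirrored_position((5, -3))  # object reappears at (-5, 3)
```

Note the map is its own inverse, which is why the same condition serves both directions: reconstructing a past frame from a future frame (claims 28-29) and a future frame from a past frame (claims 30-31).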
TW108123164A 2018-06-30 2019-07-01 Symmetric bi-prediction mode for video coding TWI719522B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
WOPCT/CN2018/093897 2018-06-30
CN2018093897 2018-06-30

Publications (2)

Publication Number Publication Date
TW202017375A TW202017375A (en) 2020-05-01
TWI719522B true TWI719522B (en) 2021-02-21

Family

ID=67185530

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108123164A TWI719522B (en) 2018-06-30 2019-07-01 Symmetric bi-prediction mode for video coding

Country Status (3)

Country Link
CN (2) CN110662077B (en)
TW (1) TWI719522B (en)
WO (1) WO2020003262A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3840386A4 (en) * 2018-09-04 2021-09-22 Huawei Technologies Co., Ltd. Method and apparatus for acquiring reference frame applied to bidirectional inter-frame prediction
US11025936B2 (en) 2019-01-25 2021-06-01 Tencent America LLC Method and apparatus for video coding
WO2020184920A1 (en) * 2019-03-08 2020-09-17 한국전자통신연구원 Image encoding/decoding method and apparatus, and recording medium for storing bitstream
KR20210129213A (en) * 2019-03-24 2021-10-27 엘지전자 주식회사 Video encoding/decoding method using SMVD (SYMMETRIC MOTION VECTOR DIFFERENCE), apparatus and method for transmitting bitstream
CN117941344A (en) * 2021-09-15 2024-04-26 抖音视界有限公司 Method, apparatus and medium for video processing
US20230093129A1 (en) * 2021-09-17 2023-03-23 Tencent America LLC Method and apparatus for intra block copy mode coding with search range switching
US11943448B2 (en) * 2021-11-22 2024-03-26 Tencent America LLC Joint coding of motion vector difference
US20230328227A1 (en) * 2022-04-07 2023-10-12 Tencent America LLC Systems and methods for joint coding of motion vector difference using template matching based scaling factor derivation

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104104960B (en) * 2013-04-03 2017-06-27 华为技术有限公司 Multistage bidirectional method for estimating and equipment
US10200711B2 (en) * 2015-03-27 2019-02-05 Qualcomm Incorporated Motion vector derivation in video coding
CN107222742B (en) * 2017-07-05 2019-07-26 中南大学 Video coding Merge mode quick selecting method and device based on time-space domain correlation

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Huanbang Chen et.al, Symmetrical mode for bi-prediction, JVET-J0063, Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11 10th Meeting: San Diego, US, 10–20 Apr. 2018
Y. Chen et.al, Description of SDR, HDR and 360° video coding technology proposal by Qualcomm and Technicolor – low and high complexity versions, JVET-J0021, Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11 10th Meeting: San Diego, US, 10–20 Apr. 2018

Also Published As

Publication number Publication date
WO2020003262A1 (en) 2020-01-02
TW202017375A (en) 2020-05-01
CN115396677A (en) 2022-11-25
CN110662077A (en) 2020-01-07
CN110662077B (en) 2022-07-05

Similar Documents

Publication Publication Date Title
TWI815974B (en) Modification of motion vector with adaptive motion vector resolution
TWI727338B (en) Signaled mv precision
TWI818086B (en) Extended merge prediction
CN112913249B (en) Simplified coding and decoding of generalized bi-directional prediction index
TWI723430B (en) Multi-candidates of different precisions
US20220150508A1 (en) Restrictions on decoder side motion vector derivation based on coding information
TWI719522B (en) Symmetric bi-prediction mode for video coding
CN113287317A (en) Collocated local illumination compensation and modified interframe coding and decoding tool
TWI736923B (en) Extended merge mode
CN115086677A (en) Motion candidate derivation
CN115086675A (en) Motion candidate derivation
CN115086676A (en) Motion candidate derivation
TWI753280B (en) Mv precision in bio
TWI846727B (en) Two-step inter prediction