TW202337216A - Method and apparatus for video coding using merge with mvd mode - Google Patents
Method and apparatus for video coding using merge with MVD mode
- Publication number: TW202337216A
- Application number: TW112102681A
- Authority
- TW
- Taiwan
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
- H04N19/57—Motion estimation characterised by a search window with variable size or shape
Description
The present invention relates to video coding systems that use the Merge Mode with Motion Vector Difference (MMVD) coding tool. More specifically, the present invention relates to the design of search positions that enhance the performance associated with MMVD.
Versatile Video Coding (VVC) is the latest international video coding standard, developed by the Joint Video Experts Team (JVET) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). The standard has been published as an ISO standard: ISO/IEC 23090-3:2021, Information technology - Coded representation of immersive media - Part 3: Versatile video coding, February 2021. VVC was developed from its predecessor HEVC (High Efficiency Video Coding) by adding more coding tools that improve coding efficiency and also handle various types of video sources, including three-dimensional (3D) video signals.
Fig. 1A illustrates an exemplary adaptive inter/intra video coding system incorporating in-loop processing. For intra prediction 110, the prediction data are derived from previously coded video data in the current picture (also referred to hereafter as a frame). For inter prediction 112, motion estimation (ME) is performed at the encoder side, and motion compensation (MC) is performed based on the ME results to provide prediction data derived from other pictures and motion data. Switch 114 selects intra prediction 110 or inter prediction 112, and the selected prediction data are supplied to adder 116 to form the prediction error, also called the residual. The prediction error is then processed by transform (T) 118 followed by quantization (Q) 120. The transformed and quantized residual is then coded by entropy encoder 122 for inclusion in the video bitstream corresponding to the compressed video data. The bitstream associated with the transform coefficients is then packed with side information, such as the motion and coding modes associated with intra and inter prediction, and other information such as parameters of the in-loop filters applied to the underlying image area. The side information associated with intra prediction 110, inter prediction 112 and in-loop filter 130 is provided to entropy encoder 122, as shown in Fig. 1A. When an inter-prediction mode is used, one or more reference pictures must also be reconstructed at the encoder side. Consequently, the transformed and quantized residual is processed by inverse quantization (IQ) 124 and inverse transform (IT) 126 to recover the residual. The residual is then added back to the prediction data 136 at reconstruction (REC) 128 to reconstruct the video data. The reconstructed video data may be stored in reference picture buffer 134 and used for the prediction of other frames.
As shown in Fig. 1A, the incoming video data undergo a series of processing steps in the encoding system. The reconstructed video data from REC 128 may therefore be subject to various impairments. Consequently, in-loop filter 130 is often applied to the reconstructed video data before they are stored in reference picture buffer 134, in order to improve video quality. For example, a deblocking filter (DF), a sample adaptive offset (SAO) filter and an adaptive loop filter (ALF) may be used. The loop-filter information may need to be incorporated into the bitstream so that the decoder can properly recover the required information; therefore, the loop-filter information is also provided to entropy encoder 122 for incorporation into the bitstream. In Fig. 1A, loop filter 130 is applied to the reconstructed video before the reconstructed samples are stored in reference picture buffer 134. The system in Fig. 1A is intended to illustrate an exemplary structure of a typical video encoder. It may correspond to a High Efficiency Video Coding (HEVC) system, VP8, VP9, H.264 or VVC.
As shown in Fig. 1B, the decoder can use functional blocks similar or identical to those of the encoder, except for transform 118 and quantization 120, since the decoder needs only inverse quantization 124 and inverse transform 126. Instead of entropy encoder 122, the decoder uses entropy decoder 140 to decode the video bitstream into the quantized transform coefficients and the required coding information (e.g. ILPF information, intra prediction information and inter prediction information). Intra prediction 150 at the decoder side does not need to perform a mode search; instead, the decoder only needs to generate the intra prediction according to the intra prediction information received from entropy decoder 140. Furthermore, for inter prediction, the decoder only needs to perform motion compensation (MC 152) according to the inter prediction information received from entropy decoder 140, without motion estimation.
According to VVC, an input picture is partitioned, similarly to HEVC, into non-overlapping square block regions called CTUs (coding tree units). Each CTU can be split into one or more smaller coding units (CUs). The resulting CU partitions can be square or rectangular. Furthermore, VVC divides a CTU into prediction units (PUs) as the units to which prediction processes, such as inter prediction and intra prediction, are applied.
The VVC standard incorporates various new coding tools to further improve coding efficiency over the HEVC standard. Among these, some coding tools relevant to the present invention are reviewed below. For example, the Merge Mode with MVD (MMVD) technique reuses the same merge candidates as in VVC, and a selected candidate can be further expanded by a motion vector expression method. It is desirable to develop techniques that further improve MMVD.
A method and apparatus for video coding using the MMVD (Merge with MVD (Motion Vector Difference)) mode are disclosed. According to the method, input data associated with a current block are received, where the input data comprise pixel data of the current block to be encoded at the encoder side or coded data associated with the current block to be decoded at the decoder side. Two or more base merge motion vectors (MVs) from a merge list are determined for the current block. If at least one of the two or more base merge MVs is close to another of the two or more base merge MVs, a modified extended merge candidate is determined for the at least one base merge MV using a modified set of search positions, where at least one search position differs between a nominal set of search positions and the modified set of search positions, and where the nominal set of search positions comprises a set of nominal distances in one or more defined directions around a target base merge MV. The current block is then encoded or decoded using motion information comprising the modified extended merge candidate.
In one embodiment, the one or more defined directions correspond to the horizontal direction, the vertical direction, or both. In one embodiment, the modified set of search positions includes modified search positions in non-horizontal and non-vertical directions. In one embodiment, the modified set of search positions includes modified search positions with at least one distance different from the set of nominal distances. In another embodiment, the modified set of search positions corresponds to a set of modified distances that are normalized from the set of nominal distances according to the length of the at least one of the two or more base merge MVs. In yet another embodiment, the modified set of search positions includes modified search positions in non-horizontal and non-vertical directions, with at least one distance different from the set of nominal distances.
In one embodiment, when B base merge MVs are close to one another, a common base merge MV is derived from the B base merge MVs and the modified set of search positions is applied to the common base merge MV, where B is an integer greater than 1. In one embodiment, the modified set of search positions includes at least one direction other than the horizontal and vertical directions. In another embodiment, the modified set of search positions includes B sets of search directions. In one embodiment, the common base merge MV corresponds to the midpoint of the B base merge MVs. In another embodiment, the common base merge MV corresponds to the one of the B base merge MVs with the smallest base index.
In one embodiment, when a first base merge MV is close to a second base merge MV, the search directions for the second base merge MV depend on the first base merge MV. In one embodiment, the modified set of search positions for the second base merge MV includes at least one non-horizontal and non-vertical search direction pointing away from the first base merge MV. In another embodiment, the modified set of search positions for the second base merge MV includes two non-horizontal and non-vertical search directions, one horizontal search direction and one vertical search direction, all pointing away from the first base merge MV.
In one embodiment, when the first base merge MV is close to the second base merge MV, the modified set of search positions for the first base merge MV uses modified search directions respectively parallel and perpendicular to the line connecting the first base merge MV and the second base merge MV. In one embodiment, the modified set of search positions for the second base merge MV uses rotated search directions, where the rotated search directions are formed by rotating the modified search directions.
In one embodiment, when the distance between two base merge MVs is small, an offset is added to one of the two base merge MVs to generate a new base merge MV so that the distance becomes sufficiently large.
In another embodiment, when the distance between two base merge MVs is not large enough, one of the two base merge MVs is replaced by another base merge MV.
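The embodiments above can be illustrated with a small sketch. This is a hypothetical illustration only: the function name, the closeness threshold and the choice of rotated directions (parallel and perpendicular to the line joining the two base MVs) are assumptions for exposition, not the exact rules of any claim.

```python
import math

# Nominal MMVD search directions: +x, -x, +y, -y.
NOMINAL_DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def mmvd_search_positions(base_mv, other_base_mv, distances, close_thresh=4):
    """Return (x, y) search positions around base_mv.

    If base_mv is close to other_base_mv (illustrative threshold),
    replace the nominal horizontal/vertical directions with directions
    parallel and perpendicular to the line joining the two base MVs,
    in the spirit of the embodiments described above.
    """
    dx = other_base_mv[0] - base_mv[0]
    dy = other_base_mv[1] - base_mv[1]
    dist = math.hypot(dx, dy)

    if 0 < dist < close_thresh:
        # Unit vector along the line joining the two base MVs, plus its
        # perpendicular: a modified, generally non-horizontal and
        # non-vertical set of search directions.
        ux, uy = dx / dist, dy / dist
        directions = [(ux, uy), (-ux, -uy), (-uy, ux), (uy, -ux)]
    else:
        directions = NOMINAL_DIRECTIONS

    return [(base_mv[0] + d * vx, base_mv[1] + d * vy)
            for d in distances
            for (vx, vy) in directions]
```

For two well-separated base MVs the nominal cross-shaped positions are produced; for two nearby base MVs the search directions rotate to follow the line between them.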
It will be readily understood that the components of the present invention, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the systems and methods of the present invention, as represented in the figures, is not intended to limit the scope of the invention as claimed, but is merely representative of selected embodiments of the invention. Reference throughout this specification to "one embodiment", "an embodiment" or similar language means that a particular feature, structure or characteristic described in connection with the embodiment may be included in at least one embodiment of the present invention. Thus, appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment.
Furthermore, the described features, structures or characteristics may be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components and so on. In other instances, well-known structures or operations are not shown or described in detail to avoid obscuring aspects of the invention. The illustrated embodiments of the invention will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout. The following description is intended only by way of example, and simply illustrates certain selected embodiments of apparatus and methods that are consistent with the invention as claimed herein.
Current Picture Referencing (CPR)
Motion compensation is one of the key technologies in hybrid video coding; it exploits the pixel correlation between adjacent pictures. It is generally assumed that, in a video sequence, the patterns corresponding to objects or background in a frame are displaced to form the corresponding objects in a subsequent frame or correlated with other patterns within the current frame. With an estimate of such displacement (e.g. using block-matching techniques), a pattern can be mostly reproduced without having to be coded again. Similarly, block matching and copying have also been tried to allow a reference block to be selected from within the same picture as the current block. When this concept was applied to camera-captured video, it was observed to be inefficient. Part of the reason is that a textual pattern in a spatially neighbouring area may be similar to the current coding block but usually exhibits some gradual spatial variation. It is therefore difficult for a block to find an exact match within the same picture of camera-captured video, and the improvement in coding performance is limited.
For screen content, however, the situation of spatial correlation between pixels within the same picture is different. For typical video with text and graphics, there are usually repetitive patterns within the same picture. Hence, intra-picture block compensation has been observed to be very effective. A new prediction mode, the intra block copy (IBC) mode, also known as current picture referencing (CPR), was introduced for screen content coding to exploit this characteristic. In CPR mode, a prediction unit (PU) is predicted from a previously reconstructed block within the same picture, and a displacement vector (called a block vector, or BV) is used to signal the relative displacement from the position of the current block to that of the reference block. The prediction errors are then coded using transform, quantization and entropy coding. An example of CPR compensation is shown in Fig. 2, where block 212 is the corresponding block of block 210 and block 222 is the corresponding block of block 220. In this technique, the reference samples correspond to the reconstructed samples of the currently decoded picture prior to the in-loop filter operations (deblocking and sample adaptive offset (SAO) filtering in HEVC).
The first version of CPR was proposed for HEVC Range Extensions (RExt) development in JCTVC-M0350 (Budagavi et al., AHG8: Video coding using Intra motion compensation, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC 1/SC 29/WG11, 13th Meeting: Incheon, KR, 18-26 Apr. 2013, Document: JCTVC-M0350). In that version, CPR compensation was limited to a small local area, with only 1-D block vectors and only for block size 2Nx2N. Later, a more advanced CPR design was developed during the standardization of HEVC SCC (Screen Content Coding).
When CPR is used, only part of the current picture can be used as the reference picture. Some bitstream conformance constraints are imposed to regulate the valid MV values that refer to the current picture. First, one of the following two conditions must be true:
BV_x + offsetX + nPbSw + xPbs – xCbs <= 0                              (1)
BV_y + offsetY + nPbSh + yPbs – yCbs <= 0                              (2)
Second, the following WPP (wavefront parallel processing) condition must be true:
( xPbs + BV_x + offsetX + nPbSw − 1 ) / CtbSizeY – xCbs / CtbSizeY <= yCbs / CtbSizeY − ( yPbs + BV_y + offsetY + nPbSh − 1 ) / CtbSizeY    (3)
In equations (1) to (3), (BV_x, BV_y) is the luma block vector (the motion vector for CPR) of the current PU; nPbSw and nPbSh are the width and height of the current PU; (xPbs, yPbs) is the location of the top-left pixel of the current PU relative to the current picture; (xCbs, yCbs) is the location of the top-left pixel of the current CU relative to the current picture; and CtbSizeY is the size of the CTU. offsetX and offsetY are two adjusted offsets in the two dimensions that account for chroma sample interpolation in CPR mode:
offsetX = BVC_x & 0x7 ? 2 : 0                                          (4)
offsetY = BVC_y & 0x7 ? 2 : 0                                          (5)
(BVC_x, BVC_y) is the chroma block vector, in 1/8-pel resolution in HEVC.
Third, the reference block for CPR must be within the same tile/slice boundary.
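The constraints above can be expressed as a small validity check. This is a sketch of equations (1) to (5) only; the third (tile/slice boundary) condition is omitted, and Python's floor division is assumed for the divisions in (3).

```python
def cpr_bv_valid(bv_x, bv_y, bvc_x, bvc_y,
                 x_pbs, y_pbs, n_pb_sw, n_pb_sh,
                 x_cbs, y_cbs, ctb_size_y):
    """Check conformance constraints (1)-(3) on a CPR block vector,
    using the chroma-interpolation offsets (4)-(5)."""
    # Offsets (4) and (5): 2 extra samples when the 1/8-pel chroma BV
    # has a fractional part, leaving room for interpolation.
    offset_x = 2 if (bvc_x & 0x7) else 0
    offset_y = 2 if (bvc_y & 0x7) else 0

    # At least one of (1), (2) must hold: the reference block lies
    # entirely to the left of or above the current CU.
    cond1 = bv_x + offset_x + n_pb_sw + x_pbs - x_cbs <= 0
    cond2 = bv_y + offset_y + n_pb_sh + y_pbs - y_cbs <= 0
    if not (cond1 or cond2):
        return False

    # WPP condition (3), in CTU units.
    lhs = ((x_pbs + bv_x + offset_x + n_pb_sw - 1) // ctb_size_y
           - x_cbs // ctb_size_y)
    rhs = (y_cbs // ctb_size_y
           - (y_pbs + bv_y + offset_y + n_pb_sh - 1) // ctb_size_y)
    return lhs <= rhs
```

For example, a 16x16 PU at (64, 64) with a block vector pointing 16 luma samples to the left passes the check, while the same block vector pointing to the right fails condition (1)/(2).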
Merge Mode with MVD (MMVD)
The MMVD technique was proposed in JVET-J0024. MMVD is used for skip or merge modes with a proposed motion vector expression method. MMVD reuses the same merge candidates as in VVC. Among the merge candidates, a candidate can be selected and further expanded by the proposed motion vector expression method. MMVD provides a new motion vector expression with simplified signalling, consisting of prediction direction information, a starting point (also referred to as a base in this disclosure), a motion magnitude (also referred to as a distance in this disclosure) and a motion direction. Fig. 3 illustrates an example of the MMVD search process, where the current block 312 in the current frame 310 is processed by bi-prediction using L0 reference frame 320 and L1 reference frame 330. Pixel location 350 is projected to pixel location 352 in L0 reference frame 320 and pixel location 354 in L1 reference frame 330. According to the MMVD search process, updated locations are searched by adding offsets in a selected direction. For example, the updated locations correspond to locations along line 342 or 344 in the horizontal direction at distances of s, 2s or 3s.
The proposed technique uses the merge candidate list as it is. However, only candidates of the default merge type (MRG_TYPE_DEFAULT_N) are considered for the MMVD expansion. The prediction direction information indicates a prediction direction among L0, L1, and L0-and-L1 predictions. For a B slice, the proposed method can generate bi-prediction candidates from merge candidates with uni-prediction by using a mirroring technique. For example, if a merge candidate is a uni-prediction with L1, the reference index of L0 is determined by searching list 0 for a reference picture that is mirrored with the reference picture of list 1. If there is no corresponding picture, the reference picture nearest to the current picture is used. The MV of L0 is derived by scaling the MV of L1, with the scaling factor computed from the POC (picture order count) distances.
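The scaling step of the mirroring technique can be sketched as follows. This is a simplified float sketch for exposition; actual codecs use fixed-point arithmetic with clipping, and the function name is an assumption.

```python
def scale_mv(mv, poc_cur, poc_ref_from, poc_ref_to):
    """Scale an MV from one reference list to the other by the ratio
    of POC distances, as in the MMVD mirroring step."""
    td = poc_cur - poc_ref_from   # POC distance of the existing reference
    tb = poc_cur - poc_ref_to     # POC distance of the mirrored reference
    scale = tb / td
    return (round(mv[0] * scale), round(mv[1] * scale))
```

For a current picture at POC 16 with an L1 reference at POC 20, mirroring to an L0 reference at POC 12 gives a scale of -1, i.e. the L0 MV points in the opposite direction of the L1 MV.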
In MMVD, after a merge candidate is selected, it is further expanded or refined by the signalled MVD information. The further information includes a merge candidate flag, an index to specify the motion magnitude and an index to indicate the motion direction. In MMVD mode, one of the first two candidates in the merge list is selected to be used as the MV base. The MMVD candidate flag is signalled to specify which one is used between the first and second merge candidates. The initial MV selected from the merge candidate list (i.e., the merge candidate) is also referred to as a base in this disclosure. After the set of positions is searched, the selected MV candidate is referred to as an expanded MV candidate in this disclosure.
If the prediction direction of the MMVD candidate is the same as one of the original merge candidates, an index with value 0 is signalled as the MMVD prediction direction; otherwise, an index with value 1 is signalled. After the first bit is sent, the remaining prediction direction is signalled according to a predefined priority order of MMVD prediction directions. The priority order is L0/L1 prediction, L0 prediction and L1 prediction. If the prediction direction of the merge candidate is L1, signalling '0' indicates that the MMVD prediction direction is L1, signalling '10' indicates that the MMVD prediction directions are L0 and L1, and signalling '11' indicates that the MMVD prediction direction is L0. If the L0 and L1 prediction lists are the same, the MMVD prediction direction information is not signalled.
As shown in Table 1, the base candidate index defines the starting point. The base candidate index indicates the best candidate among the candidates in the list, as follows.
Table 1. Base candidate IDX

  Base candidate IDX   0         1         2         3
  N-th MVP             1st MVP   2nd MVP   3rd MVP   4th MVP
The distance index specifies the motion magnitude information and indicates a predefined offset from the starting points (412 and 422) of the L0 reference block 410 and the L1 reference block 420, as shown in Fig. 4. In Fig. 4, an offset is added to either the horizontal component or the vertical component of the starting MV, where the differently styled small circles correspond to different offsets from the centre. The relation between the distance index and the predefined offset is specified in Table 2.
Table 2. Distance IDX

  Distance IDX     0     1     2    3    4    5    6     7
  Pixel distance   1/4   1/2   1    2    4    8    16    32
The direction index represents the direction of the MVD relative to the starting point. The direction index can represent the four directions shown in Table 3. Note that the meaning of the MVD sign can vary according to the information of the starting MV. When the starting MV is a uni-prediction MV, or a bi-prediction MV with both lists pointing to the same side of the current picture (i.e. the POCs of both references are larger than the POC of the current picture, or both are smaller), the sign in Table 3 specifies the sign of the MV offset added to the starting MV. When the starting MV is a bi-prediction MV with the two MVs pointing to different sides of the current picture (i.e. the POC of one reference is larger than the POC of the current picture while the POC of the other reference is smaller), and the POC difference in list 0 is larger than that in list 1, the sign in Table 3 specifies the sign of the MV offset added to the list 0 MV component of the starting MV, and the sign for the list 1 MV has the opposite value. Otherwise, if the POC difference in list 1 is larger than that in list 0, the sign in Table 3 specifies the sign of the MV offset added to the list 1 MV component of the starting MV, and the sign for the list 0 MV has the opposite value.
表 3. 方向索引 Table 3. Direction IDX
為了降低編碼器複雜度,應用塊限制。如果 CU 的寬度或高度小於 4,則不執行 MMVD。To reduce encoder complexity, block restrictions are applied. If the width or height of the CU is less than 4, MMVD is not performed.
多假設預測(Multi-Hypothesis Prediction，簡寫為MH)技術 Multi-Hypothesis Prediction (MH) Technology
提出多假設預測以改進現有的幀間圖片中預測模式，包括高級運動向量預測(AMVP)模式的單向預測、跳過和合併模式以及幀內模式。一般概念是將現有的預測模式與額外的合併索引預測(merge indexed prediction)相結合。合併索引預測以與常規合併模式相同的方式執行，其中傳訊合併索引以獲取用於運動補償的預測(motion compensated prediction)的運動資訊。最終預測是合併索引預測和現有預測模式生成的預測的加權平均，其中根據組合應用不同的權重。詳細資訊可以在 JVET-K1030(Chih-Wei Hsu等人, Description of Core Experiment 10: Combined and multi-hypothesis prediction, ITU-T SG16 WP3和 ISO/IEC JTC 1/SC 29/WG11視訊編解碼聯合協作組(JCT-VC)，第 11 次會議：盧布爾雅那，SI，2018 年 7 月 10-18 日，文件：JVET-K1030)或JVET-L0100(Man-Shu Chiang 等人，CE10.1.1: Multi-hypothesis prediction for improving AMVP mode, skip or merge mode, and intra mode，ITU-T SG16 WP3和ISO/IEC JTC 1/SC 29/WG11視訊編碼聯合協作組(JCT-VC)，第12次會議：澳門, CN, 2018 年 10 月 3-12 日，文件：JVET-L0100)中找到。Multi-hypothesis prediction is proposed to improve existing inter-picture prediction modes, including unidirectional prediction of advanced motion vector prediction (AMVP) mode, skip and merge modes, and intra mode. The general concept is to combine existing prediction modes with an additional merge indexed prediction. Merge indexed prediction is performed in the same manner as regular merge mode, where a merge index is signaled to obtain motion information for motion compensated prediction. The final prediction is a weighted average of the merge indexed prediction and the prediction generated by the existing prediction mode, where different weights are applied depending on the combination. Detailed information can be found in JVET-K1030 (Chih-Wei Hsu et al., Description of Core Experiment 10: Combined and multi-hypothesis prediction, ITU-T SG16 WP3 and ISO/IEC JTC 1/SC 29/WG11 Video Codec Joint Collaboration Group (JCT-VC), Meeting 11: Ljubljana, SI, 10-18 July 2018, File: JVET-K1030) or in JVET-L0100 (Man-Shu Chiang et al., CE10.1.1: Multi-hypothesis prediction for improving AMVP mode, skip or merge mode, and intra mode, ITU-T SG16 WP3 and ISO/IEC JTC 1/SC 29/WG11 Joint Collaboration Group on Video Coding (JCT-VC), 12th meeting: Macau, CN, October 3-12, 2018, File: JVET-L0100).
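The combination step described above amounts to a per-sample weighted average of two prediction blocks. The sketch below is a minimal illustration of that idea; the function name and the default weight of 0.5 are assumptions, since the actual weights depend on the combination as stated above.

```python
# Minimal sketch of the multi-hypothesis combination: the final prediction is a
# weighted average of the merge-indexed prediction and the prediction from the
# existing mode. Blocks are represented as lists of rows of sample values.

def combine_hypotheses(pred_existing, pred_merge, w_merge=0.5):
    """Per-sample weighted average of two prediction blocks."""
    return [[(1 - w_merge) * a + w_merge * b
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pred_existing, pred_merge)]
```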
成對平均合併候選(Pairwise Averaged Merge Candidates)
成對平均候選是通過對當前合併候選列表中的預定義候選對進行平均來生成的，並且預定義對被定義為{(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)}，其中數字表示合併候選列表的合併索引。為每個參考列表分別計算平均運動向量。如果兩個運動向量在一個列表中可用，則即使這兩個運動向量指向不同的參考圖片，也會對其進行平均；如果只有一個運動向量可用，則直接使用一個；如果沒有可用的運動向量，則將此列表視為無效。Pairwise average candidates are generated by averaging predefined candidate pairs in the current merge candidate list, and the predefined pairs are defined as {(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)}, where the numbers represent the merge indices of the merge candidate list. The average motion vector is calculated separately for each reference list. If two motion vectors are available in a list, they are averaged even if they point to different reference pictures; if only one motion vector is available, it is used directly; if no motion vector is available, the list is considered invalid.
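The per-list averaging rule above can be sketched as follows. The candidate representation (a dict mapping reference list 'L0'/'L1' to an MV tuple, with missing keys meaning the list is unused) is an assumption chosen for illustration.

```python
# Sketch of pairwise-average candidate generation as described above.

PAIRS = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]  # merge-index pairs

def pairwise_average(cand_a, cand_b):
    """Average two merge candidates separately per reference list."""
    avg = {}
    for lst in ('L0', 'L1'):
        a, b = cand_a.get(lst), cand_b.get(lst)
        if a is not None and b is not None:
            # averaged even if the two MVs point to different reference pictures
            avg[lst] = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        elif a is not None:
            avg[lst] = a          # only one MV available: use it directly
        elif b is not None:
            avg[lst] = b
        # neither available: the list stays invalid (key absent)
    return avg
```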
合併模式 Merge Mode
為了提高 HEVC 中運動向量 (MV) 編解碼的編解碼效率，HEVC 具有跳過和合併模式。跳過和合併模式從空間上相鄰的塊(空間候選)或時間上的同位(co-located)塊(時間候選)中獲取運動資訊。當 PU 為跳過或合併模式時，不對運動資訊進行編解碼，而是僅對所選候選的索引進行編解碼。對於跳過模式，殘差訊號被強制為零且不被編解碼。在 HEVC 中，如果特定塊被編碼為跳過或合併，則傳訊候選索引以指示候選集中的哪個候選用於合併。每個合併的 PU 重新使用所選候選的 MV、預測方向和參考圖片索引。To improve the coding efficiency of motion vector (MV) coding in HEVC, HEVC has skip and merge modes. Skip and merge modes obtain motion information from spatially adjacent blocks (spatial candidates) or temporally co-located blocks (temporal candidates). When the PU is in skip or merge mode, motion information is not coded; instead, only the index of the selected candidate is coded. For skip mode, the residual signal is forced to zero and is not coded. In HEVC, if a particular block is coded as skip or merge, a candidate index is signaled to indicate which candidate in the candidate set is used for merging. Each merged PU reuses the MV, prediction direction, and reference picture index of the selected candidate.
對於 HEVC 中 HM-4.0 中的合併模式，如第5圖所示，從 A 0、A 1、B 0和 B 1導出最多四個空間 MV 候選，並且從 T BR或 T CTR(T BR首先使用，如果 T BR不可用，則使用 T CTR)導出一個時間MV。請注意，如果四個空間 MV 候選中的任何一個不可用，則位置 B 2將用於導出另一個 MV 候選作為替代。在四個空間 MV 候選和一個時間 MV 候選的推導過程之後，應用去除冗餘(修剪)來去除冗餘 MV 候選。如果在去除冗餘(修剪)之後，可用的 MV 候選的數量小於五個，則導出三種額外的候選並添加到候選集(候選列表)中。編碼器根據速率失真優化 (rate-distortion optimization,簡寫為RDO) 決策在候選集中為跳過或合併模式選擇一個最終候選，並將索引傳輸給解碼器。 For the merge mode in HM-4.0 in HEVC, as shown in Figure 5, up to four spatial MV candidates are derived from A 0 , A 1 , B 0 and B 1 , and one temporal MV is derived from T BR or T CTR (T BR is used first; if T BR is not available, T CTR is used). Note that if any of the four spatial MV candidates is not available, location B 2 will be used to derive another MV candidate as an alternative. After the derivation process of four spatial MV candidates and one temporal MV candidate, redundancy removal (pruning) is applied to remove redundant MV candidates. If, after removing redundancy (pruning), the number of available MV candidates is less than five, three additional candidates are derived and added to the candidate set (candidate list). The encoder selects a final candidate for skip or merge mode in the candidate set based on rate-distortion optimization (RDO) decisions and transmits the index to the decoder.
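The list-construction steps above can be sketched in simplified form. This is not HM-4.0 code: candidates are plain MV tuples here (real candidates also carry reference indices and prediction directions), the additional-candidate step is reduced to zero-MV padding, and all names are assumptions.

```python
# Simplified sketch of merge-list construction: gather up to four spatial
# candidates (with B2 as fallback), add one temporal candidate, prune
# duplicates, then pad the list toward five entries.

def build_merge_list(spatial, b2, temporal, pad=(0, 0), size=5):
    """spatial: MVs from A0, A1, B0, B1 (None if unavailable)."""
    avail = [mv for mv in spatial if mv is not None]
    if len(avail) < 4 and b2 is not None:   # B2 replaces a missing candidate
        avail.append(b2)
    if temporal is not None:                # TBR first, else TCTR (resolved by caller)
        avail.append(temporal)
    pruned = []
    for mv in avail:                        # redundancy removal (pruning)
        if mv not in pruned:
            pruned.append(mv)
    while len(pruned) < size:               # stand-in for the three extra candidates
        pruned.append(pad)
        if pad in pruned[:-1]:
            break                           # avoid padding with duplicates forever
    return pruned[:size]
```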
在下文中,我們將跳過和合併模式表示為“合併模式”。換句話說,當在以下說明書中提到“合併模式”時,“合併模式”可以指跳過和合併模式。In the following, we denote skip and merge modes as “merge mode”. In other words, when "merge mode" is mentioned in the following specification, "merge mode" may refer to skip and merge modes.
MMVD 中的適應性 MVD 距離和方向 Adaptive MVD Distance and Direction in MMVD
在當前的 MMVD 設計中，對於每個基點，使用相同的距離和方向組合生成 MVD 候選。但是，如果有兩個基點彼此靠近，對兩個基點應用相同的MVD會導致兩個相似的運動向量，這可能是冗餘的。本發明公開了通過考慮基點之間的差異以及針對每個基點適應性地改變距離、方向或兩者來減少此類相似候選的方法。根據常規MMVD的一組搜索位置在本公開中被稱為標稱組搜索位置。根據常規MMVD的一組搜索距離在本公開中被稱為標稱組搜索距離。In the current MMVD design, for each base point, the same combination of distance and direction is used to generate MVD candidates. However, if there are two base points close to each other, applying the same MVD to both base points will result in two similar motion vectors, which may be redundant. The present invention discloses methods to reduce such similar candidates by taking into account the differences between base points and adaptively changing the distance, direction, or both for each base point. The set of search locations according to conventional MMVD is referred to in this disclosure as the nominal set of search locations. The set of search distances according to conventional MMVD is referred to in this disclosure as the nominal set of search distances.
第一種方法在第6A-B圖中示出。給定兩個基點 b 0 = (x b0 , y b0 ) 和 b 1 = (x b1 , y b1 )，考慮兩個基點的差異 b 0 – b 1 。如果 x 差異足夠小(如第6A圖所示)，這意味著搜索垂直方向可能是多餘的。因此，本發明的實施例搜索 b 1 的其他方向(例如，對角線方向)而不是垂直方向。在第6A圖中，左側的搜索位置610對應於傳統的MMVD搜索。與由橢圓612包圍的基點 b 1 相關聯的搜索位置(在第6A圖的左側顯示為“x”)可能是冗餘的，因為它們與基點 b 0 相關聯的搜索位置(在第6A圖的左側顯示為圓圈)的位置非常接近。在第6A圖的右側，示出了根據本發明的一個實施例的搜索，其中搜索方向是對角線並且新的搜索位置620由橢圓622指示。類似地，如果 y 差異足夠小(如第6B圖所示)，則搜索其他方向(例如對角線方向)而不是 b 1 的水平方向。在第6B圖中，左側的搜索位置630對應於傳統的MMVD搜索。與由橢圓632包圍的基點 b 1 相關聯的搜索位置(在第6B圖的左側顯示為“x”)可能是冗餘的，因為它們與基點 b 0 相關聯的搜索位置(在第6B圖的左側顯示為圓圈)的位置非常接近。在第6B圖的右側，示出了根據本發明的一個實施例的搜索位置640，其中搜索方向是對角線並且新的搜索位置由橢圓642指示。如果 x 和 y 差異都不夠小，由於可能沒有多餘的候選，不進行任何更改。如上所述，本發明修改標稱搜索位置以避免搜索位置中的冗餘。換句話說，本發明的實施例使用一組修改的搜索位置用於MMVD。雖然傳統的MMVD總是在水平方向和垂直方向上搜索，但是如第6A圖和第6B圖所示的實施例使用包括非水平和非垂直方向的一組修改的搜索位置。
The first method is shown in Figures 6A-B. Given two base points b 0 = (x b0 , y b0 ) and b 1 = (x b1 , y b1 ), consider the difference of the two base points b 0 – b 1 . If the x difference is small enough (as shown in Figure 6A), this means that searching the vertical direction may be redundant. Therefore, embodiments of the present invention search other directions of b 1 (eg, diagonal directions) instead of the vertical direction. In Figure 6A, the
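The direction-switching rule of this first method can be sketched as follows. The closeness threshold and all names are illustrative assumptions; the document does not specify a threshold value.

```python
# Sketch of the first method: if two base points nearly share an x (or y)
# coordinate, the second base point searches diagonal directions instead of
# the nominal horizontal/vertical set, avoiding near-duplicate candidates.

HV_DIRS   = [(1, 0), (-1, 0), (0, 1), (0, -1)]    # nominal MMVD directions
DIAG_DIRS = [(1, 1), (1, -1), (-1, 1), (-1, -1)]  # replacement diagonal set

def directions_for_b1(b0, b1, thr=1):
    """Pick the direction set used when expanding base point b1."""
    dx, dy = abs(b0[0] - b1[0]), abs(b0[1] - b1[1])
    if dx <= thr or dy <= thr:   # close in one coordinate: vertical or
        return DIAG_DIRS         # horizontal search would be redundant
    return HV_DIRS               # far apart: keep the nominal set
```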
在另一種方法中，給定一個以上的基點，如果基點中的 B 個足夠接近，其中 B 是大於 1 的整數，則使用這 B 個基點定義一個公共基點 b c ，並基於該公共基點搜索 B 組不同的方向，而不是基於 B 個不同的基點搜索一組方向。公共基點可以是最小基點索引對應的基點，也可以是 B 個基點的中點。在這樣的修改後，保證了 B 個基點生成的候選不重複。對於其他基點，沒有做任何改變。示例如第7A-B圖，存在四個基點(即 b 0 、b 1 、b 2 和 b 3 )，數字 B 等於三(即 b 0 、b 1 和 b 2 )。第7A圖圖示了基於傳統MMVD的搜索位置，其中與基點 b 0 、b 1 和 b 2 相關聯的搜索位置(由等高線710指示)集中在搜索位置簇710的中心附近，而與基點 b 3 相關聯的搜索位置(由等高線720指示)與搜索位置簇710完全分開。第7B圖圖示了根據本發明的實施例的搜索位置。在第7B圖中，公共基點 b c 732是三個基點(即 b 0 、b 1 和 b 2 )的中點。如第7B圖所示，根據本發明的搜索位置(由等高線730指示)展開以覆蓋更大的區域。
In another method, given more than one base point, if B of the base points are close enough, where B is an integer greater than 1, these B base points are used to define a common base point b c , and B different sets of directions are searched based on this common base point, instead of searching one set of directions based on B different base points. The common base point can be the base point corresponding to the minimum base point index, or the midpoint of the B base points. After such a modification, the candidates generated from the B base points are guaranteed not to repeat. For the other base points, no changes are made. For example, in Figures 7A-B, there are four base points (i.e., b 0 , b 1 , b 2 and b 3 ), and the number B is equal to three (i.e., b 0 , b 1 and b 2 ). Figure 7A illustrates search locations based on traditional MMVD, where the search locations (indicated by contours 710) associated with base points b 0 , b 1 and b 2 are concentrated near the center of the search location cluster 710, while the search locations associated with base point b 3 (indicated by contour 720) are completely separate from the search location cluster 710. Figure 7B illustrates search locations according to an embodiment of the present invention. In Figure 7B, the common base point b c 732 is the midpoint of the three base points (i.e., b 0 , b 1 and b 2 ). As shown in Figure 7B, the search locations according to the present invention (indicated by contour 730) are spread out to cover a larger area.
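The common-base-point construction can be sketched as below. The midpoint variant is one of the two options the text allows; the way the B direction sets are made distinct (here, rotating the nominal cross by equal angular steps) is an illustrative assumption, since the document does not prescribe it.

```python
# Sketch: replace B clustered base points by one common base point (midpoint)
# and give it B distinct direction sets so the generated candidates differ.
import math

def common_base_point(points):
    """Midpoint of a cluster of base points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def rotated_direction_sets(num_sets):
    """num_sets distinct sets: the nominal cross rotated by i*(90/num_sets) deg."""
    sets = []
    for i in range(num_sets):
        ang = math.pi / 2 * i / num_sets
        sets.append([(math.cos(ang + k * math.pi / 2),
                      math.sin(ang + k * math.pi / 2)) for k in range(4)])
    return sets
```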
在另一種方法中,給定兩個基點b 0和 b 1,我們總是沿水平和垂直方向生成 b 0的候選。 b 1的方向由 b 1相對於 b 0的位置決定。第8A圖示出了根據傳統MMVD的搜索位置的示例,其中分別從基點b 0和b 1沿著垂直和水平方向執行搜索。在第8A圖的例子中,基點b 1位於 b 0的第一象限。根據本發明的一個實施例,如第8B圖所示,在第一象限中沿指向遠離基b 0的方向(810、812、814和816)生成b 1的搜索候選,以防止冗餘候選。四個搜索方向(810、812、814和816)包括兩個非水平和非垂直搜索方向,一個水平搜索方向和一個垂直搜索方向,均指向遠離第一基點合併MV的方向。如第8B圖所示,b 1的搜索位置與 b 0的搜索位置完全分開。 In another method, given two base points b 0 and b 1 , we always generate candidates for b 0 along the horizontal and vertical directions. The direction of b 1 is determined by the position of b 1 relative to b 0 . Figure 8A shows an example of a search position according to conventional MMVD, where searches are performed along the vertical and horizontal directions from base points b0 and b1, respectively. In the example of Figure 8A, the base point b 1 is located in the first quadrant of b 0 . According to one embodiment of the present invention, as shown in Figure 8B, search candidates for b 1 are generated in the first quadrant along directions (810, 812, 814, and 816) pointing away from base b 0 to prevent redundant candidates. The four search directions (810, 812, 814 and 816) include two non-horizontal and non-vertical search directions, one horizontal search direction and one vertical search direction, all pointing away from the first base point merged MV. As shown in Figure 8B, the search position of b 1 is completely separated from the search position of b 0 .
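The quadrant rule of this method can be sketched as follows. The exact slopes of the two diagonal directions are not specified in the text, so the (2, 1) and (1, 2) slopes used here are assumptions; only the property that all four directions point away from b 0 is taken from the description above.

```python
# Sketch: b0 keeps the nominal horizontal/vertical directions, while b1's four
# directions (one horizontal, one vertical, two diagonal) all point into the
# quadrant containing b1 relative to b0, i.e., away from b0.

def away_directions(b0, b1):
    """Four search directions for b1 pointing away from b0."""
    sx = 1 if b1[0] >= b0[0] else -1   # sign of the quadrant in x
    sy = 1 if b1[1] >= b0[1] else -1   # sign of the quadrant in y
    return [(sx, 0), (0, sy), (2 * sx, sy), (sx, 2 * sy)]
```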
在另一種方法中，給定兩個基點 b 0 和 b 1 ，沿著與 b 1 - b 0 的方向平行或垂直的方向生成 b 0 的搜索候選。第9A圖示出了根據傳統MMVD的搜索位置的示例，其中分別從基點 b 0 和 b 1 沿著垂直和水平方向執行搜索。在第9B圖中，搜索方向910平行於 b 1 - b 0 的方向並且搜索方向920垂直於 b 1 - b 0 的方向。通過旋轉 b 0 的方向來確定 b 1 的方向(930和940)。
In another method, given two base points b 0 and b 1 , search candidates for b 0 are generated along a direction parallel or perpendicular to the direction of b 1 -b 0 . Figure 9A shows an example of a search position according to conventional MMVD, where searches are performed along the vertical and horizontal directions from base points b0 and b1, respectively. In Figure 9B,
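The parallel/perpendicular construction for b 0 can be sketched as below; the helper name is an assumption, and the rotation applied for b 1 is left to the caller since the text does not fix the rotation angle.

```python
# Sketch: b0 searches along the unit direction of (b1 - b0), its opposite, and
# the two perpendicular directions, matching search directions 910/920 above.
import math

def aligned_directions(b0, b1):
    """+/- parallel and +/- perpendicular unit directions w.r.t. (b1 - b0)."""
    dx, dy = b1[0] - b0[0], b1[1] - b0[1]
    n = math.hypot(dx, dy)
    ux, uy = dx / n, dy / n                  # unit vector along b1 - b0
    return [(ux, uy), (-ux, -uy), (-uy, ux), (uy, -ux)]
```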
如果多個基點彼此靠近，除了在其他方向上搜索之外，在不同距離上搜索也會產生具有較少冗餘的候選。在當前的 MMVD 中，距離是距離表(表 2){1/4, 1/2, 1, 2, 4, 8, 16, 32}中的值之一。在本發明的一個實施例中，如果兩個基點彼此接近，則一個基點按照原始距離表生成候選，而另一個基點使用新的距離表，例如{3/4, 3/2, 3, 6, 12, 24, 48, 96}，生成候選，以防止重複候選。If multiple base points are close to each other, searching at different distances, in addition to searching in other directions, will also produce candidates with less redundancy. In the current MMVD, the distance is one of the values in the distance table (Table 2) {1/4, 1/2, 1, 2, 4, 8, 16, 32}. In one embodiment of the present invention, if two base points are close to each other, one base point generates candidates according to the original distance table, while the other base point uses a new distance table, such as {3/4, 3/2, 3, 6, 12, 24, 48, 96}, to generate candidates, preventing duplicate candidates.
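This distance-table switch can be sketched directly; the two tables come from the text above, while the closeness test and its threshold are illustrative assumptions.

```python
# Sketch: when two base points are close, the second one draws its offsets from
# the shifted distance table so its expanded candidates cannot coincide with
# those of the first base point.

NOMINAL_DISTS   = [1/4, 1/2, 1, 2, 4, 8, 16, 32]
ALTERNATE_DISTS = [3/4, 3/2, 3, 6, 12, 24, 48, 96]

def distance_table(b0, b1, thr=1):
    """Distance table used for b1: the alternate set if b1 is close to b0."""
    close = abs(b0[0] - b1[0]) <= thr and abs(b0[1] - b1[1]) <= thr
    return ALTERNATE_DISTS if close else NOMINAL_DISTS
```

Note that the two tables share no value, so even identical base points would generate disjoint candidate positions.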
在另一種方法中,基點的距離表通過與基點長度相關的因子(即,MV的幅度)歸一化。潛在的假設是具有較大長度的基點往往具有較大的 MVD。 因此,距離表應相應更改。In another approach, the distance table of base points is normalized by a factor related to the length of the base point (i.e., the magnitude of the MV). The underlying assumption is that base points with larger lengths tend to have larger MVDs. Therefore, the distance table should be changed accordingly.
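The normalization idea can be sketched as below. The specific scaling rule (linear in the MV magnitude beyond a reference length, never shrinking the table) is an illustrative assumption; the text only states that the table should change with the base point's length.

```python
# Sketch: scale the nominal distance table by a factor tied to the magnitude of
# the base MV, reflecting the assumption that longer base MVs tend to need
# larger MVDs.
import math

NOMINAL = [1/4, 1/2, 1, 2, 4, 8, 16, 32]

def scaled_distances(base_mv, ref_len=16.0):
    """Distance table normalized by the base MV's magnitude."""
    scale = max(1.0, math.hypot(*base_mv) / ref_len)
    return [d * scale for d in NOMINAL]
```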
此外,所提出的適應性地改變方向的方法可以與所提出的適應性地改變距離的方法相結合。Furthermore, the proposed method of adaptively changing the direction can be combined with the proposed method of adaptively changing the distance.
除了適應性地改變方向或距離之外,向基點生成過程添加約束也可以防止冗餘MMVD候選。 在一個實施例中,約束對應於以下情況:任何兩個基點之間的距離應該足夠大(例如大於閾值)。如果兩個現有的基點不滿足約束,其中一個應該被刪除或替換為另一個滿足約束的 MVP。在另一個實施例中,如果兩個基點之間的距離在一個方向上足夠小,則向其中一個或兩個基點添加一個偏移量以保持它們分開。In addition to adaptively changing directions or distances, adding constraints to the base point generation process can also prevent redundant MMVD candidates. In one embodiment, the constraint corresponds to the following situation: the distance between any two base points should be large enough (eg, greater than a threshold). If two existing base points do not satisfy the constraints, one of them should be removed or replaced with another MVP that satisfies the constraints. In another embodiment, if the distance between two base points is small enough in one direction, an offset is added to one or both of the base points to keep them separated.
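The constraint-based variant can be sketched as a small filtering pass over the base point list. The threshold value, the Chebyshev distance metric, and the offset value are all illustrative assumptions; the text leaves them unspecified.

```python
# Sketch: enforce a minimum separation while building the base point list,
# either by dropping a too-close candidate or by nudging it with an offset.

def _far(p, q, thr):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1])) > thr  # Chebyshev distance

def prune_base_points(candidates, thr=2.0, offset=(2, 0), use_offset=False):
    """Keep base points pairwise farther apart than thr; optionally shift a
    violating candidate by `offset` instead of dropping it."""
    kept = []
    for c in candidates:
        if all(_far(c, k, thr) for k in kept):
            kept.append(c)
        elif use_offset:
            # simplified: the shifted point is not re-checked against `kept`
            kept.append((c[0] + offset[0], c[1] + offset[1]))
    return kept
```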
上述任何 MMVD 方法都可以在編碼器和/或解碼器中實現。例如，所提出的任何方法都可以在編碼器的幀間編解碼模組(例如第1A 圖中的幀間預測 112)、運動補償模組(例如第1B 圖中的 MC 152)、或解碼器的合併候選推導模組中實現。或者，所提出的方法中的任何一個都可以實現為耦合到編碼器的幀間編解碼模組和/或運動補償模組、或解碼器的合併候選推導模組的電路。雖然幀間預測 112和 MC 152 被示為支持 MMVD 方法的獨立處理單元，它們可能對應於存儲在媒體(例如硬盤或閃存)上的可執行軟體或韌體代碼，用於 CPU(中央處理單元)或可程式化設備( 例如 DSP(數位訊號處理器)或 FPGA(現場可程式化門陣列))。Any of the above MMVD methods can be implemented in the encoder and/or the decoder. For example, any of the proposed methods can be implemented in an inter coding module of the encoder (e.g., Inter Prediction 112 in Figure 1A), a motion compensation module (e.g., MC 152 in Figure 1B), or a merge candidate derivation module of the decoder. Alternatively, any of the proposed methods may be implemented as circuitry coupled to the inter coding module and/or the motion compensation module of the encoder, or to the merge candidate derivation module of the decoder. Although Inter Prediction 112 and MC 152 are shown as independent processing units supporting the MMVD methods, they may correspond to executable software or firmware code stored on media (such as a hard disk or flash memory) for a CPU (Central Processing Unit) or a programmable device (such as a DSP (Digital Signal Processor) or an FPGA (Field Programmable Gate Array)).
第10圖圖示了根據本發明的實施例的利用針對MMVD的修改的搜索位置的另一示例性視訊編解碼系統的流程圖。流程圖中所示的步驟可以實現為可在編碼器側的一個或多個處理器(例如，一個或多個CPU)上執行的程式代碼。流程圖中所示的步驟也可以基於硬體來實現，諸如被佈置為執行流程圖中的步驟的一個或多個電子設備或處理器。根據該方法，在步驟1010中接收與當前塊相關聯的輸入資料，其中輸入資料包括在編碼器側待編碼的當前塊的像素資料或在解碼器側待解碼的與當前塊相關聯的編碼資料。在步驟1020中從當前塊的合併列表中確定兩個或更多個基點合併MV。在步驟1030中，如果兩個或更多個基點合併MV中的至少一個接近所述兩個或更多個基點合併MV中的另一個基點MV，則使用一組修改的搜索位置為所述兩個或更多個基點合併MV中的所述至少一個確定修改的擴展合併候選，其中搜索位置的標稱組和搜索位置的修改組之間至少有一個搜索位置不同，並且其中搜索位置的標稱組包括在圍繞目標基點合併 MV周圍的一個或多個定義的方向的一組標稱距離處。在步驟1040中使用包括修改的擴展合併候選的運動資訊對當前塊進行編碼或解碼。Figure 10 illustrates a flowchart of another exemplary video codec system utilizing modified search locations for MMVD, in accordance with an embodiment of the present invention. The steps shown in the flowchart may be implemented as program code executable on one or more processors (e.g., one or more CPUs) on the encoder side. The steps shown in the flowchart may also be implemented on a hardware basis, such as one or more electronic devices or processors arranged to perform the steps in the flowchart. According to the method, input data associated with the current block is received in step 1010, where the input data includes pixel data of the current block to be encoded at the encoder side or encoded data associated with the current block to be decoded at the decoder side. Two or more base point merge MVs are determined from the merge list of the current block in step 1020. In step 1030, if at least one of the two or more base point merge MVs is close to another base point merge MV of the two or more base point merge MVs, a modified set of search positions is used to determine a modified extended merge candidate for the at least one of the two or more base point merge MVs, wherein at least one search position differs between the nominal set of search positions and the modified set of search positions, and wherein the nominal set of search positions comprises a set of nominal distances in one or more defined directions around the target base point merge MV. In step 1040, the current block is encoded or decoded using motion information including the modified extended merge candidate.
所示流程圖旨在說明根據本發明的視訊編解碼的示例。在不脫離本發明的精神的情況下,所屬領域具有通常知識者可以修改每個步驟、重新安排步驟、拆分步驟或組合步驟來實施本發明。在本公開中,已經使用特定句法和語義來說明示例以實現本發明的實施例。在不脫離本發明的精神的情況下,技術人員可以通過用等同的句法和語義替換句法和語義來實施本發明。The flow chart shown is intended to illustrate an example of video encoding and decoding according to the present invention. Without departing from the spirit of the present invention, one of ordinary skill in the art may modify each step, rearrange the steps, split the steps, or combine the steps to implement the present invention. In this disclosure, examples have been illustrated using specific syntax and semantics to implement embodiments of the invention. A skilled person may implement the invention by replacing syntax and semantics with equivalent syntax and semantics without departing from the spirit of the invention.
提供以上描述是為了使所屬領域具有通常知識者能夠實踐在特定應用及其要求的上下文中提供的本發明。對所描述的實施例的各種修改對於所屬領域具有通常知識者而言將是顯而易見的,並且本文定義的一般原理可以應用於其他實施例。 因此,本發明並不旨在限於所示出和描述的特定實施例,而是符合與本文公開的原理和新穎特徵一致的最寬範圍。在以上詳細描述中,舉例說明了各種具體細節以提供對本發明的透徹理解。然而,所屬領域具有通常知識者將理解可以實施本發明。The above description is provided to enable one of ordinary skill in the art to practice the invention in the context of a particular application and its requirements. Various modifications to the described embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments. Therefore, the present invention is not intended to be limited to the specific embodiments shown and described but is to be accorded the widest scope consistent with the principles and novel features disclosed herein. In the foregoing detailed description, various specific details are illustrated to provide a thorough understanding of the invention. However, one of ordinary skill in the art will understand that the present invention can be practiced.
如上所述的本發明的實施例可以以各種硬體、軟體代碼或兩者的組合來實現。例如，本發明的一個實施例可以是集成到視訊壓縮晶片中的一個或多個電路或者集成到視訊壓縮軟體中的程式碼以執行這裡描述的處理。本發明的實施例還可以是要在數位訊號處理器(DSP)上執行以執行這裡描述的處理的程式碼。本發明還可以涉及由電腦處理器、數位訊號處理器、微處理器或現場可程式設計閘陣列(FPGA)執行的許多功能。這些處理器可以被配置為通過執行定義由本發明體現的特定方法的機器可讀軟體代碼或韌體代碼來執行根據本發明的特定任務。軟體代碼或韌體代碼可以以不同的程式設計語言和不同的格式或風格來開發。也可以為不同的目標平臺編譯軟體代碼。然而，軟體代碼的不同代碼格式、風格和語言以及配置代碼以執行根據本發明的任務的其他方式都不會脫離本發明的精神和範圍。The embodiments of the present invention as described above can be implemented in various hardware, software code, or a combination of both. For example, one embodiment of the invention may be one or more circuits integrated into a video compression chip or program code integrated into video compression software to perform the processes described herein. Embodiments of the invention may also be program code to be executed on a digital signal processor (DSP) to perform the processes described herein. The present invention may also involve many functions performed by a computer processor, digital signal processor, microprocessor or field programmable gate array (FPGA). These processors may be configured to perform specific tasks in accordance with the present invention by executing machine-readable software code or firmware code that defines specific methods embodied by the present invention. Software code or firmware code can be developed in different programming languages and in different formats or styles. Software code can also be compiled for different target platforms. However, different code formats, styles and languages of the software code, as well as other ways of configuring the code to perform tasks in accordance with the invention, do not depart from the spirit and scope of the invention.
在不脫離其精神或基本特徵的情況下,本發明可以以其他特定形式體現。所描述的示例在所有方面都應被視為說明性而非限制性的。因此,本發明的範圍由所附申請專利範圍而不是由前述描述來指示。落入申請專利範圍等同物的含義和範圍內的所有變化都應包含在其範圍內。The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples should be considered in all respects as illustrative and not restrictive. The scope of the invention is therefore indicated by the appended claims rather than by the foregoing description. All changes that fall within the meaning and scope of equivalents to the scope of the patent claimed shall be included within its scope.
110:幀內預測
112:幀間預測
114:開關
116:加法器
118:變換
120:量化
122:熵編碼器
130:環路濾波器
124:逆量化
126:逆變換
128:重建
134:參考圖片緩衝器
136:預測資料
140:熵解碼器
150:幀內預測
152:MC
210、212、220、222:塊
310:當前幀
312:當前塊
320、330:參考幀
350、352、354:像素位置
410、420:參考塊
412、422:起點
610、620、630、640:搜索位置
612、622、632、642:橢圓
710:搜索位置簇
710、720、730:等高線
732:公共基點b c
810~816、910~940:搜索方向
1010~1040:步驟
110: Intra prediction 112: Inter prediction 114: Switch 116: Adder 118: Transform 120: Quantization 122: Entropy encoder 130: Loop filter 124: Inverse quantization 126: Inverse transform 128: Reconstruction 134: Reference picture buffer 136: prediction data 140: entropy decoder 150: intra prediction 152: MC 210, 212, 220, 222: block 310: current frame 312: current block 320, 330: reference frame 350, 352, 354: pixel position 410 , 420: Reference block 412, 422:
第1A圖說明了包含循環處理的示例性適應性幀間/幀內視訊編解碼系統。 第1B圖圖示了第1A圖中的編碼器的相應解碼器。 第2圖圖示了當前圖片參考(Current Picture Referencing,簡寫為CPR)補償的示例,其中塊由相同圖片中的對應塊預測。 第3圖圖示了使用運動向量差的合併模式 (MMVD)搜索過程的示例,其中使用L0參考幀和L1參考幀通過雙向預測來處理當前幀中的當前塊。 第4圖示出了根據MMVD的L0參考塊410和L1參考塊在水平和垂直方向上的偏移距離。 第5圖圖示了從空間和時間鄰域塊(neighboring block)導出合併模式候選的示例。 第6A圖示出了根據本發明的一個實施例的當兩個基點在水平方向上接近時針對兩個基點之一的修改的搜索位置的示例,其中修改的搜索位置包括傾斜向下的搜索方向。 第6B圖示出了根據本發明的一個實施例的當兩個基點在垂直方向上接近時針對兩個基點之一的修改的搜索位置的示例,其中修改的搜索位置包括向右傾斜的搜索方向。 第7A圖示出了根據傳統MMVD的四個基點的搜索位置的示例,其中三個基點彼此靠近。 第7B圖示出了根據本發明實施例的四個基點的搜索位置的示例,其中三個基點彼此靠近,其中MMVD搜索位置基於從三個定位靠近的基點導出的公共基點。 第8A圖示出了根據傳統MMVD的搜索位置的示例,其中分別從基點b0和b1沿著垂直和水平方向執行搜索。 第8B圖示出了根據本發明實施例的搜索位置的示例,其中第二基點的搜索位置取決於相對於第一基點的相對位置。 第9A圖示出了根據傳統MMVD的搜索位置的示例,其中分別從基點b0和b1沿著垂直和水平方向執行搜索。 第9B圖示出了根據本發明實施例的搜索位置的示例,其中對於第一基點的搜索方向包括平行於(b 1-b 0)的方向以及垂直於(b 1-b 0)的方向 ,而第二基點的搜索方向從第一基點的搜索方向旋轉。 第10圖圖示了根據本發明的實施例的利用針對MMVD的修改的搜索位置的另一示例性視訊編解碼系統的流程圖。 Figure 1A illustrates an exemplary adaptive inter/intra video codec system including loop processing. Figure 1B illustrates the corresponding decoder of the encoder in Figure 1A. Figure 2 illustrates an example of Current Picture Referencing (CPR) compensation, where blocks are predicted from corresponding blocks in the same picture. Figure 3 illustrates an example of a merge mode using motion vector difference (MMVD) search process, where the current block in the current frame is processed by bidirectional prediction using the L0 reference frame and the L1 reference frame. Figure 4 shows the offset distance in the horizontal and vertical directions of the L0 reference block 410 and the L1 reference block according to MMVD. Figure 5 illustrates an example of deriving merge mode candidates from spatial and temporal neighborhood blocks. Figure 6A shows an example of a modified search position for one of the two base points when the two base points are close in the horizontal direction according to an embodiment of the present invention, wherein the modified search position includes a search direction tilted downwards . 
Figure 6B shows an example of a modified search position for one of the two base points when the two base points are close in the vertical direction according to an embodiment of the present invention, wherein the modified search position includes a search direction tilted to the right. Figure 7A shows an example of search positions of four base points according to conventional MMVD, where three base points are close to each other. Figure 7B shows an example of search positions of four base points, where three base points are close to each other, in which the MMVD search positions are based on a common base point derived from the three closely located base points according to an embodiment of the present invention. Figure 8A shows an example of search positions according to conventional MMVD, in which searches are performed along vertical and horizontal directions from base points b0 and b1, respectively. Figure 8B shows an example of search positions according to an embodiment of the present invention, in which the search positions of the second base point depend on its relative position with respect to the first base point. Figure 9A shows an example of search positions according to conventional MMVD, in which searches are performed along vertical and horizontal directions from base points b0 and b1, respectively. Figure 9B shows an example of search positions according to an embodiment of the present invention, in which the search directions for the first base point include a direction parallel to (b 1 -b 0 ) and a direction perpendicular to (b 1 -b 0 ), while the search directions of the second base point are rotated from the search directions of the first base point. Figure 10 illustrates a flowchart of another exemplary video codec system utilizing modified search locations for MMVD, in accordance with an embodiment of the present invention.
1010~1040:步驟 1010~1040: steps
Claims (19)
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202263304010P | 2022-01-28 | 2022-01-28 | |
US63/304,010 | 2022-01-28 | ||
WOPCT/CN2023/072978 | 2023-01-18 | ||
PCT/CN2023/072978 WO2023143325A1 (en) | 2022-01-28 | 2023-01-18 | Method and apparatus for video coding using merge with mvd mode |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202337216A true TW202337216A (en) | 2023-09-16 |
TWI822567B TWI822567B (en) | 2023-11-11 |
Family
ID=87470738
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW112102681A TWI822567B (en) | 2022-01-28 | 2023-01-19 | Method and apparatus for video coding using merge with mvd mode |
Country Status (2)
Country | Link |
---|---|
TW (1) | TWI822567B (en) |
WO (1) | WO2023143325A1 (en) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020098790A1 (en) * | 2018-11-16 | 2020-05-22 | Mediatek Inc. | Method and apparatus of improved merge with motion vector difference for video coding |
CN113228643A (en) * | 2018-12-28 | 2021-08-06 | 韩国电子通信研究院 | Image encoding/decoding method and apparatus, and recording medium for storing bit stream |
WO2020141928A1 (en) * | 2019-01-04 | 2020-07-09 | 엘지전자 주식회사 | Method and apparatus for decoding image on basis of prediction based on mmvd in image coding system |
US10869050B2 (en) * | 2019-02-09 | 2020-12-15 | Tencent America LLC | Method and apparatus for video coding |
2023
- 2023-01-18 WO PCT/CN2023/072978 patent/WO2023143325A1/en unknown
- 2023-01-19 TW TW112102681A patent/TWI822567B/en active
Also Published As
Publication number | Publication date |
---|---|
TWI822567B (en) | 2023-11-11 |
WO2023143325A1 (en) | 2023-08-03 |