TW202406348A

TW202406348A - Video coding method and apparatus thereof

Info

Publication number: TW202406348A
Application number: TW112127132A
Authority: TW
Inventors: 林郁晟; 莊子德; 徐志瑋; 陳慶曄
Original assignee: 聯發科技股份有限公司
Priority date: 2022-07-22
Filing date: 2023-07-20
Publication date: 2024-02-01

Abstract

A method for coding video pictures by reordering a reference picture list (RPL) is provided. A video coder receives an RPL for a current coding tree unit (CTU) of a current picture. The RPL identifies a plurality of reference pictures. The video coder assigns indices to the plurality of reference pictures in the RPL of the current CTU. The video coder receives data to be encoded or decoded as a plurality of blocks of the current CTU. The video coder encodes or decodes the plurality of blocks of the CTU by using the assigned indices to select one or more reference pictures from the RPL to generate inter-predictions.

Description

Video encoding and decoding method and device

The present disclosure relates generally to video coding. In particular, the present disclosure relates to methods of coding pixel blocks by using reference lists for inter prediction.

Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims listed below and are not admitted as prior art by inclusion in this section.

High-Efficiency Video Coding (HEVC) is an international video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC). HEVC is based on a hybrid block-based, motion-compensated, DCT-like transform coding architecture. The basic unit of compression, termed a coding unit (CU), is a 2Nx2N square block of pixels. Each CU can be recursively split into four smaller CUs until a predefined minimum size is reached. Each CU contains one or more prediction units (PUs).

Versatile Video Coding (VVC) is the latest international video coding standard, developed by the Joint Video Expert Team (JVET) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11. The input video signal is predicted from a reconstructed signal, which is derived from coded picture regions. The prediction residual signal is processed by a block transform. The transform coefficients are quantized and entropy coded together with other side information in the bitstream. The reconstructed signal is generated from the prediction signal and the reconstructed residual signal, which is obtained by inverse transforming the de-quantized transform coefficients. The reconstructed signal is further processed by in-loop filtering to remove coding artifacts. The decoded pictures are stored in a frame buffer and used to predict future pictures in the input video signal.

In VVC, a coded picture is partitioned into non-overlapping square block regions represented by associated coding tree units (CTUs). A coded picture can be represented by a collection of slices, each containing an integer number of CTUs. The individual CTUs in a slice are processed in raster-scan order. A bi-predictive (B) slice may be decoded using intra prediction or inter prediction with at most two motion vectors and reference indices to predict the sample values of each block. A predictive (P) slice is decoded using intra prediction or inter prediction with at most one motion vector and reference index to predict the sample values of each block. An intra (I) slice is decoded using intra prediction only.

A CTU can be partitioned into one or more non-overlapping coding units (CUs) using a quadtree (QT) with a nested multi-type tree (MTT) structure, to adapt to various local motion and texture characteristics. A CU can be further split into smaller CUs using one of five split types: quadtree partitioning, vertical binary tree partitioning, horizontal binary tree partitioning, vertical center-side ternary tree partitioning, and horizontal center-side ternary tree partitioning.

Each CU contains one or more prediction units (PUs). The prediction unit, together with the associated CU syntax, works as a basic unit for signaling the prediction information. The specified prediction process is used to predict the values of the associated pixel samples inside the PU. Each CU may contain one or more transform units (TUs) for representing the prediction residual blocks. A TU consists of a transform block (TB) of luma samples and two corresponding transform blocks of chroma samples, and each TB corresponds to one residual block of samples from one color component. An integer transform is applied to a transform block. The level values of the quantized coefficients, together with other side information, are entropy coded in the bitstream. The terms coding tree block (CTB), coding block (CB), prediction block (PB), and transform block (TB) are defined to specify the 2-D sample array of one color component associated with the CTU, CU, PU, and TU, respectively. Thus, a CTU consists of one luma CTB, two chroma CTBs, and associated syntax elements. A similar relationship holds for CU, PU, and TU.

For each inter-predicted CU, motion parameters consisting of motion vectors, reference picture indices, and a reference picture list usage index, together with additional information, are used for inter-predicted sample generation. The motion parameters can be signaled explicitly or implicitly. When a CU is coded in skip mode, the CU is associated with one PU and has no significant residual coefficients, no coded motion vector delta, and no reference picture index. A merge mode is specified whereby the motion parameters for the current CU are obtained from neighboring CUs, including spatial and temporal candidates, and additional schedules introduced in VVC. The merge mode can be applied to any inter-predicted CU. The alternative to merge mode is the explicit transmission of motion parameters, where the motion vector, the corresponding reference picture index for each reference picture list, the reference picture list usage flag, and other needed information are signaled explicitly for each CU.

The following summary is illustrative only and is not intended to be limiting in any way. That is, the following summary is provided to introduce concepts, highlights, benefits, and advantages of the novel and non-obvious techniques described herein. Select, but not all, implementations are further described in the detailed description below. Accordingly, the following summary is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining the scope of the claimed subject matter.

Some embodiments of the present disclosure provide a method for coding video pictures by reordering a reference picture list (RPL). A video coder receives an RPL for a current coding tree unit (CTU) of a current picture. The RPL identifies a plurality of reference pictures. The video coder assigns indices to the plurality of reference pictures in the RPL of the current CTU. The video coder receives data to be encoded or decoded as a plurality of blocks of the current CTU. The video coder encodes or decodes the plurality of blocks of the CTU by using the assigned indices to select one or more reference pictures from the RPL to generate inter predictions.

In some embodiments, the indices are assigned to the plurality of reference pictures in the RPL based on explicit signaling. In some embodiments, the indices are assigned to the plurality of reference pictures in the RPL based on a history-based table that records the distribution of reference picture selections made when encoding or decoding each CTU of the current picture.

In some embodiments, the video coder derives a representative MV for the current CTU and calculates costs for the plurality of reference pictures. The cost of each reference picture is calculated based on (i) neighboring samples of the current CTU and (ii) reference samples in the reference picture that are identified by the representative MV. The indices are then assigned to the plurality of reference pictures in the RPL based on the calculated costs.

In some embodiments, the representative MV is derived from MVs used to reconstruct one or more blocks neighboring the current CTU, and the representative MV may be a weighted average of the MVs used to reconstruct the blocks neighboring the current CTU. In some embodiments, the representative MV is derived from MVs of one or more blocks in the current CTU, and the representative MV may be a weighted average of the MVs of one or more blocks in the current CTU. In some embodiments, the representative MV is derived from temporal MVs from a collocated CTU or a reference picture CTU, and the representative MV may be a weighted average of the temporal MVs from the collocated CTU or the reference picture CTU. In some embodiments, the representative MV is derived from an MV inherited from a neighboring position of the current CTU. The video coder may also use a motion vector predictor (MVP) of a block in the current CTU as the representative MV of the CTU.

In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant teachings. Any variations, derivatives, and/or extensions based on the teachings described herein are within the protective scope of the present disclosure. In some instances, well-known methods, procedures, components, and/or circuitry pertaining to one or more example implementations disclosed herein may be described at a relatively high level without detail, in order to avoid unnecessarily obscuring aspects of the teachings of the present disclosure.

I. Reference Picture Management

A video coding system that uses multiple reference pictures performs reference picture management for inter prediction. Reference picture management handles the storage of reference pictures in, and their removal from, the decoded picture buffer (DPB), and places the reference pictures in the proper order in a reference picture list (RPL). When a picture is referenced for inter prediction (either as a temporal reference or as an inter-layer reference), an index into the RPL is used to select which picture in the DPB is being referenced. Reference picture management may use the DPB and up to two RPLs. Reference picture management may also mark reference pictures as "used for short-term reference", "used for long-term reference", or "unused for reference".

In the coding process, the picture order count (POC) is a variable that is derived as an output-order indicator and is used as an identifier of a picture, including for DPB management and reference picture management. To minimize the signaling overhead while maintaining robustness against data loss, the most significant bits (MSBs) of the POC value need not be sent in the bitstream, because usually only the differences between POC values are necessary for correct operation of the coding process.

The least significant bits (LSBs) of the POC, which are used to derive the POC value and have the same value for all slices of a picture, are sent in the picture header (PH) or the slice header (SH). A POC MSB cycle value may be sent in the PH so that the POC value can be derived without tracking the POC MSBs in a way that depends on the POC information of earlier coded pictures. This allows, for example, mixing intra random access point (IRAP) pictures and non-IRAP pictures within an access unit (AU) of a multi-layer bitstream. POC LSB information may be sent for each picture, including instantaneous decoder refresh (IDR) pictures. The POC LSBs can be sent for IDR pictures, which saves some bits for IDR pictures. Sending POC LSB information for IDR pictures also makes it easier to merge an IDR picture and a non-IDR picture from different bitstreams into a single coded picture.
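The LSB-only signaling described above can be illustrated with a short sketch. It follows the conventional HEVC/VVC-style rule for inferring the POC MSBs from the previous picture when no MSB cycle value is sent; the disclosure does not spell this rule out, and the function and parameter names (`derive_poc`, `prev_poc_lsb`, `prev_poc_msb`, `max_poc_lsb`) are illustrative assumptions.

```python
def derive_poc(poc_lsb, prev_poc_lsb, prev_poc_msb, max_poc_lsb):
    """Reconstruct a full POC value from its transmitted LSBs.

    Assumed inputs: LSB/MSB of the previous reference picture's POC and
    max_poc_lsb = 2 ** (number of LSB bits sent in the bitstream).
    """
    half = max_poc_lsb // 2
    if poc_lsb < prev_poc_lsb and prev_poc_lsb - poc_lsb >= half:
        poc_msb = prev_poc_msb + max_poc_lsb  # LSBs wrapped around
    elif poc_lsb > prev_poc_lsb and poc_lsb - prev_poc_lsb > half:
        poc_msb = prev_poc_msb - max_poc_lsb  # picture lies before the wrap
    else:
        poc_msb = prev_poc_msb
    return poc_msb + poc_lsb


# 4-bit LSBs (max_poc_lsb = 16); the previous picture had POC 14.
print(derive_poc(poc_lsb=1, prev_poc_lsb=14, prev_poc_msb=0, max_poc_lsb=16))  # 17
```

Only the POC differences matter to the decoder, which is why the wrap-around inference is sufficient in the common case.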

In some embodiments, for all types of slices (e.g., B, P, and I slices), two RPLs, called list 0 (L0 or RPL 0) and list 1 (L1 or RPL 1), are directly signaled and derived without using an RPL initialization or modification process. In some embodiments, the RPLs are not based on reference picture sets or on a sliding window plus memory management control operation process.

Reference picture marking is directly based on RPL 0 and RPL 1, which indicate active and inactive entries in the RPLs; only active entries may be used by reference indices in inter prediction of the current picture.

Figures 1A-B illustrate various syntax structures and elements related to reference picture management signaling that are included in the sequence parameter set (SPS), the picture parameter set (PPS), the PH, and the SH. In the figures, syntax elements that are always present are shown in solid rectangles, while syntax elements that are conditionally present are shown in dashed rectangles. For RPL 0 and RPL 1 (if different from those of RPL 0), multiple predefined candidate RPL syntax structures (e.g., ref_pic_list_struct(listIdx, rplsIdx) syntax structures) may be sent in the SPS, to be used by referring to them in the PH or SH.

Syntax elements in the PPS indicate the default number of active entries for RPL 0 and RPL 1, a flag that controls the presence of the RPL 1 syntax in the ref_pic_lists() structure, and a flag that specifies whether the RPL information is contained in the PH or the SH. The information used to derive the two RPLs (i.e., the ref_pic_lists() structure) is sent in the PH if all slices of the picture have the same RPLs, or otherwise in the SH. Instead of referring to a predefined candidate RPL structure (identified by rpl_idx[i]), another RPL structure (i.e., the ref_pic_list_struct(i, sps_num_ref_pic_lists[i]) structure) can also be sent directly in the PH or SH.

Each RPL syntax structure includes information on the number of reference picture entries of the particular RPL. A reference picture entry in an RPL is a short-term reference picture entry, a long-term reference picture entry, or an inter-layer reference picture entry. The default number of active entries for RPL 0 and RPL 1 is sent in the PPS (i.e., pps_num_ref_idx_default_active_minus1[i]) and can be overridden in the SH (using the syntax elements sh_num_ref_idx_active_override_flag and sh_num_ref_idx_active_minus1[i]).

II. Reference List Reordering

In some embodiments, a reference picture reordering method is used to allow block-level adaptation of the reference picture index assignment. The reference picture reordering can be based on template matching cost. For uni-prediction AMVP mode, the reference pictures in list 0 and list 1 are interleaved to generate a joint list. For each hypothesis of reference picture in the joint list, motion information is derived accordingly and template matching is performed to calculate a cost. The joint list is then reordered in ascending order of template matching cost. The index of the selected reference picture in the reordered joint list is sent in the bitstream. For bi-prediction AMVP mode, a list of pairs of reference pictures from list 0 and list 1 is generated and similarly reordered based on template matching cost. The index of the selected pair is sent in the bitstream.
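The uni-prediction case above (interleave list 0 and list 1, then sort by ascending cost) can be sketched as follows. This is a minimal illustration, not the disclosed implementation: reference pictures are represented by hypothetical POC values, and the template-matching costs are made-up numbers standing in for a real cost computation.

```python
from itertools import zip_longest


def build_joint_list(list0, list1):
    """Interleave entries as L0[0], L1[0], L0[1], L1[1], ..."""
    joint = []
    for a, b in zip_longest(list0, list1):
        if a is not None:
            joint.append(("L0", a))
        if b is not None:
            joint.append(("L1", b))
    return joint


def reorder_by_cost(candidates, cost_fn):
    """Sort in ascending cost; sorted() is stable, so ties keep list order."""
    return sorted(candidates, key=cost_fn)


# Hypothetical template-matching costs keyed by (list, POC).
costs = {("L0", 8): 120, ("L1", 16): 45, ("L0", 4): 300, ("L1", 32): 45}
joint = build_joint_list([8, 4], [16, 32])
reordered = reorder_by_cost(joint, costs.__getitem__)

# The encoder would signal the position of its chosen picture in `reordered`.
signaled_index = reordered.index(("L0", 8))
print(reordered, signaled_index)
```

Because encoder and decoder compute the same costs from already reconstructed samples, both arrive at the same order, and only the index into the reordered list needs to be transmitted.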

In some embodiments, the number of active reference pictures in the random access configuration is extended by setting it equal to the reported number of available reference pictures. In some embodiments, a block-level reference picture reordering method based on template matching may be used. For uni-prediction AMVP mode, the reference pictures in list 0 and list 1 are interleaved to generate a joint list. For each hypothesis of reference picture in the joint list, template matching is performed to calculate a cost. The joint list is reordered in ascending order of template matching cost. The index of the selected reference picture in the reordered joint list is sent in the bitstream. For bi-prediction AMVP mode, a list of pairs of reference pictures from list 0 and list 1 is generated and similarly reordered based on template matching cost. The index of the selected pair is sent in the bitstream.

In some embodiments, sign prediction of the motion vector difference (MVD) is applied to regular and affine AMVP modes. Deriving the predicted MVD signs requires the reference picture to be known, whereas the reference picture reordering method requires the MVD to be known. To resolve this circular dependency, the minimum template matching cost among all MVD sign hypotheses is assigned to each reference picture hypothesis. The selected reference picture can then be determined from the decoded index and the reordered reference picture list. MVD sign prediction is performed afterwards by reusing the calculated template matching costs. For simplicity, in the bi-prediction case, MVD sign prediction in list 1 is enabled only when the MVD in list 0 is zero.

III. CTU-Based Reference Picture List Reordering

Some embodiments of the present disclosure provide a CTU-based reference picture list (RPL) reordering method. Specifically, when a CTU is coded, the reference picture list is reordered, either implicitly or explicitly (by signaling), before the first block, the first AMVP block, or the first inter block of the CTU is encoded or decoded. All blocks in the CTU then use the same reordered reference pictures, i.e., the same reference picture order. The CTU-based reference picture order can be determined from the template matching cost or SAD cost between the reconstructed samples neighboring the current CTU and their corresponding reference samples (or prediction samples) in the reference pictures.

In the CTU-based reference list reordering method, the reconstructed samples neighboring the current CTU can be used to perform the reference picture reordering. Figure 2 conceptually illustrates the reconstructed samples used in CTU-based reference picture list reordering. In the figure, the samples in the shaded area neighboring the current CTU 200 can be used for the CTU-based reference picture list reordering of the current CTU 200. These neighboring reconstructed samples may be samples of blocks neighboring the current CTU 200. The neighboring blocks may be blocks of CTU A, CTU B, and CTU D, which are the CTUs adjacent to the current CTU 200.

The CTU 200 is associated with a CTU-based RPL 240, which includes reference pictures A, B, C, and D. In some embodiments, the cost used to order the RPL is the calculated difference between the reconstructed samples neighboring the current CTU and their corresponding reference samples in the different reference pictures of the RPL. These reference samples are located in reference blocks identified by a motion source of the current CTU, which may be a representative MV 220 of the current CTU 200. The representative MV 220 of the current CTU 200 may be derived or determined from: neighboring reconstructed blocks in neighboring CTUs, one or more inter-predicted blocks in the current CTU, one or more AMVP-mode blocks in the current CTU, temporal MVs in a collocated picture or a reference picture, history-based MVs, or history-based motion information. The cost metric used to determine the order of the reference pictures may be a template matching cost, a SAD cost, a SATD cost, an SSE cost, or another measure of the difference between the reconstructed samples neighboring the current CTU and the corresponding reference samples in the different reference pictures referenced/identified by the representative MV.
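Two of the difference measures named above can be written down directly. The sketch below shows SAD (sum of absolute differences) and SSE (sum of squared errors) over two equally sized sample blocks stored as nested lists; the sample values are arbitrary, and SATD is omitted because it additionally needs a Hadamard transform.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(x - y)
               for row_a, row_b in zip(block_a, block_b)
               for x, y in zip(row_a, row_b))


def sse(block_a, block_b):
    """Sum of squared errors between two equally sized 2-D blocks."""
    return sum((x - y) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for x, y in zip(row_a, row_b))


neighbor = [[10, 12], [8, 9]]    # reconstructed samples next to the CTU
reference = [[11, 10], [8, 12]]  # samples located by the representative MV
print(sad(neighbor, reference), sse(neighbor, reference))  # 6 14
```

Either measure yields one scalar per reference picture, which is all the ordering step needs.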

Figure 3 conceptually illustrates using a representative MV of a CTU to reorder the reference picture list (RPL) of the CTU. For the CTU 200, the RPL 240 identifies four reference pictures 301-304 (reference pictures A-D). The representative MV 220 is used to locate reference blocks or samples in these reference pictures 301-304, and these reference blocks or samples are in turn used to calculate the costs of the reference pictures.

As illustrated, the current CTU 200 is in a current picture 210. Neighboring samples or neighboring blocks 230 of the current CTU 200 are used to calculate the template matching (TM) costs. The representative MV 220 of the CTU 200 is derived for reordering the RPL 240. The representative MV 220 is used to identify reference blocks or reference samples 331-334 in the different reference pictures 301-304. The reference blocks or reference samples 331-334 provide the reference samples that correspond to the neighboring samples 230 of the current CTU 200.

The cost associated with reference picture 301 (reference picture A) is calculated as the difference between the reference blocks/samples 331 and the neighboring samples/blocks 230. The cost associated with reference picture 302 (reference picture B) is calculated as the difference between the reference blocks/samples 332 and the neighboring samples/blocks 230. The cost associated with reference picture 303 (reference picture C) is calculated as the difference between the reference blocks/samples 333 and the neighboring samples/blocks 230. The cost associated with reference picture 304 (reference picture D) is calculated as the difference between the reference blocks/samples 334 and the neighboring samples/blocks 230.
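The per-picture cost computation described above can be sketched under simplifying assumptions: pictures are small 2-D lists, the template is an L-shaped strip above and to the left of the CTU, the representative MV has integer-sample precision, and displaced coordinates are clamped to the picture. Real codecs use fractional MVs with interpolation; the helper names are hypothetical.

```python
def template_positions(x0, y0, size, t=1):
    """(x, y) coordinates of t rows above and t columns left of the CTU."""
    above = [(x, y) for y in range(y0 - t, y0) for x in range(x0, x0 + size)]
    left = [(x, y) for y in range(y0, y0 + size) for x in range(x0 - t, x0)]
    return above + left


def reference_cost(cur, ref, positions, mv):
    """SAD between template samples and their MV-displaced reference samples."""
    height, width = len(ref), len(ref[0])
    mvx, mvy = mv
    total = 0
    for x, y in positions:
        rx = min(max(x + mvx, 0), width - 1)   # clamp to picture bounds
        ry = min(max(y + mvy, 0), height - 1)
        total += abs(cur[y][x] - ref[ry][rx])
    return total


# Toy 6x6 pictures: ref_a is the current picture shifted right by one sample,
# ref_b is flat. The CTU is the 2x2 block whose top-left corner is (2, 2).
cur = [[10 * y + x for x in range(6)] for y in range(6)]
ref_a = [[10 * y + (x - 1) for x in range(6)] for y in range(6)]
ref_b = [[0] * 6 for _ in range(6)]
pos = template_positions(2, 2, 2)
costs = {"A": reference_cost(cur, ref_a, pos, (1, 0)),
         "B": reference_cost(cur, ref_b, pos, (1, 0))}
print(costs)  # {'A': 0, 'B': 77}
```

With the MV (1, 0) matching the true shift, picture A reproduces the template exactly and receives the lowest cost.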

In this example, the cost calculated for reference picture B (reference picture 302) is the lowest among all reference pictures in the RPL 240, so it is assigned the reordered index 0. The cost calculated for reference picture C (reference picture 303) is the second lowest among all reference pictures in the RPL 240, so it is assigned the reordered index 1. Reference picture D has the third lowest cost in the RPL and is assigned index 2. Reference picture A has the fourth lowest cost in the RPL and is assigned index 3, and so on.
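The index assignment in this example amounts to ranking the reference pictures by cost. A minimal sketch follows; the cost values are hypothetical and were chosen only so that the ranking matches the example (B, then C, then D, then A).

```python
def assign_reordered_indices(costs):
    """Map each reference picture to its rank in ascending cost order.

    Ties keep the original RPL order because sorted() is stable.
    """
    ordered = sorted(costs, key=costs.get)
    return {pic: idx for idx, pic in enumerate(ordered)}


# Hypothetical costs for reference pictures A-D of the RPL.
costs = {"A": 410, "B": 95, "C": 130, "D": 260}
indices = assign_reordered_indices(costs)
print(indices)  # {'B': 0, 'C': 1, 'D': 2, 'A': 3}
```

Pictures that are more likely to be selected end up with smaller indices, which are typically cheaper to signal.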

In some embodiments, the MVs or motion information from one or more neighboring reconstructed blocks in the CTUs adjacent to the current CTU are used as the representative MV of the current CTU. In some embodiments, the MVs or motion from the last N neighboring reconstructed blocks (N≥0) in the neighboring CTUs are used as the representative MV of the current CTU. In some embodiments, the MVs or motion from one or more neighboring reconstructed blocks in the neighboring CTUs are weighted or averaged, and the weighted or averaged MV is used as the representative MV of the current CTU.

In some embodiments, the MVs or motion from one or more blocks in the current CTU are used as the representative MV of the current CTU. In some embodiments, the MVs or motion from one or more inter-predicted blocks or one or more AMVP-mode blocks in the current CTU are used as the representative MV of the current CTU.

In some embodiments, the MVs or motion from the first N blocks (N≥0) in the current CTU are used as the representative MV of the current CTU. In some embodiments, the MVs or motion from one or more blocks in the current CTU are weighted or averaged, and the weighted or averaged MV is used as the representative MV of the current CTU.

In some embodiments, a temporal MV from a collocated CTU or a temporal MV from a reference picture CTU is used as the representative MV of the current CTU. In some embodiments, the temporal MVs from the collocated CTU or from the reference picture CTU are weighted or averaged, and the weighted or averaged MV is used as the representative MV of the current CTU.
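The weighted or averaged MV mentioned in the preceding paragraphs can be sketched as below. The weights and the rounding to integer MV units are assumptions made for illustration; the disclosure does not specify either, and the same helper applies whether the source MVs come from neighboring blocks, blocks inside the current CTU, or temporal candidates.

```python
def weighted_average_mv(mvs, weights=None):
    """Combine candidate MVs (x, y) into one representative MV.

    With weights=None every candidate counts equally (plain average).
    Rounding uses Python's round(), an illustrative choice.
    """
    if not mvs:
        raise ValueError("at least one MV is required")
    if weights is None:
        weights = [1] * len(mvs)
    total = sum(weights)
    x = sum(w * mv[0] for w, mv in zip(weights, mvs)) / total
    y = sum(w * mv[1] for w, mv in zip(weights, mvs)) / total
    return (round(x), round(y))


candidates = [(4, -2), (8, 2), (12, 4)]
print(weighted_average_mv(candidates))             # (8, 1)
print(weighted_average_mv(candidates, [1, 1, 2]))  # (9, 2)
```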

In some embodiments, the MV (or one of the MVs) from the neighboring positions of the current CTU is inherited as the representative MV of the current CTU. Figure 4 illustrates the neighboring positions used in CTU-based reference list reordering. The figure shows a current CTU 400. The neighboring positions (e.g., A0, A1, A2, B0, B1, C0, and C1 in the figure) may be neighboring CTUs or neighboring reconstructed blocks, and the neighboring reconstructed samples of the current CTU are used to determine the reference picture order.

In some embodiments, history-based motion information or a history-based MV is used as the representative MV of the current CTU.

In some embodiments, the reference picture reordering is signaled per frame or picture rather than per CTU. Before the current picture is encoded or decoded, the reference picture list can be reordered according to explicit signaling or flags. The order of the reference pictures can also be determined implicitly for each frame or picture.

In some embodiments, before encoding or decoding, a joint list (of reference pictures) for unidirectional prediction and a joint list (of reference pictures) for bidirectional prediction are formed, and the reordering process is then performed. No additional redundancy check is performed when forming the joint lists, so more reference pictures can be inserted into them.

In some embodiments, when motion vector difference (MVD) sign derivation is enabled, reordering is performed before the MVD sign derivation. Because the sign of the MVD has not yet been determined at that point, the motion vector predictor (MVP) is first used (as the representative MV of the CTU) to determine the template matching cost or the sum of absolute transformed differences (SATD) cost between the neighboring reconstructed blocks and the corresponding reference blocks of the reference pictures in the RPL. These costs then determine the order of the reference picture list.

在一些實施例中，當當前圖片和當前參考圖片之間的差值/成本被計算時，來自不同參考圖片的所有MVP被縮放到當前參考圖片。In some embodiments, when the difference/cost between the current picture and the current reference picture is calculated, all MVPs from different reference pictures are scaled to the current reference picture.
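The text does not state how an MVP is scaled to the current reference picture. One common approach in block-based codecs, assumed here purely for illustration, is to scale linearly by the ratio of picture order count (POC) distances. The function name `scale_mv` and its parameters are hypothetical, and floating-point arithmetic stands in for the fixed-point arithmetic with clipping that practical codecs use.

```python
from typing import Tuple

MV = Tuple[float, float]


def scale_mv(mv: MV, poc_current: int, poc_source_ref: int,
             poc_target_ref: int) -> MV:
    """Scale an MV that points to one reference picture so that it points
    to another, assuming linear motion over POC distance (an assumption).
    """
    source_distance = poc_current - poc_source_ref
    target_distance = poc_current - poc_target_ref
    if source_distance == 0:
        raise ValueError("source reference must differ from the current picture")
    factor = target_distance / source_distance
    return (mv[0] * factor, mv[1] * factor)
```

Under this assumption, an MVP of (8, -4) that points two pictures back becomes (16.0, -8.0) when scaled to a reference four pictures back, and flips sign when scaled to a reference on the other side of the current picture.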

In some embodiments, when the current CTU is encoded or decoded, the order of the reference pictures is determined based on the reference picture distribution of previously coded CTUs. (Figure 5 shows the current CTU and the previously coded CTUs in the current picture.) For example, the order of the reference pictures in the current CTU may depend on the reference picture distribution of one previously coded CTU. As another example, the order may depend on the reference picture distribution of a previously coded CTU row. As yet another example, a history-based table records the distribution of reference picture selections made when each CTU is encoded or decoded, and the table is used to order the reference pictures of the current CTU.
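The history-based table can be illustrated with a short sketch. It assumes that the table is a simple usage counter keyed by a reference picture identifier and that more frequently selected pictures should come first; neither detail is specified in the text, and the names `record_ctu` and `order_from_history` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def record_ctu(history: Counter, selected_pictures: Iterable[int]) -> None:
    """Update the table with the reference pictures selected while coding one CTU."""
    history.update(selected_pictures)


def order_from_history(rpl: List[int], history: Counter) -> List[int]:
    """Return the RPL reordered so that frequently selected pictures come first.

    sorted() is stable, so pictures with equal counts keep their default
    RPL order (a tie-breaking rule assumed for this example).
    """
    return sorted(rpl, key=lambda picture: -history[picture])
```

Because the table is built only from selections made in already coded CTUs, an encoder and a decoder that update it identically would arrive at the same order without extra signaling.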

Any of the foregoing proposed methods may be implemented in an encoder and/or a decoder. For example, any of the proposed methods may be implemented in a predictor derivation module of the encoder and/or a predictor derivation module of the decoder. Alternatively, any of the proposed methods may be implemented as circuitry coupled to the predictor derivation module of the encoder and/or the predictor derivation module of the decoder, so as to provide the information required by the predictor derivation module.

IV. Example Video Encoder

Figure 6 illustrates an example video encoder 600 that may implement a CTU-based reference picture list. As shown, the video encoder 600 receives an input video signal from a video source 605 and encodes the signal into a bitstream 695. The video encoder 600 has several components or modules for encoding the signal from the video source 605, including at least some components selected from the following: a transform module 610, a quantization module 611, an inverse quantization module 614, an inverse transform module 615, an intra estimation module 620, an intra prediction module 625, a motion compensation module 630, a motion estimation module 635, a loop filter 645, a reconstructed picture buffer 650, an MV buffer 665, an MV prediction module 675, and an entropy encoder 690. The motion compensation module 630 and the motion estimation module 635 are part of an inter prediction module 640.

在一些實施例中，模組610-690是由計算設備或電子裝置的一個或多個處理單元（例如，處理器）執行的軟體指令模組。在一些實施例中，模組610-690是由電子裝置的一個或多個積體電路（integrated circuit，簡稱IC）實現的硬體電路模組。儘管模組610-690被示為單獨的模組，但一些模組可以組合成單個模組。In some embodiments, modules 610-690 are modules of software instructions executed by one or more processing units (eg, processors) of a computing device or electronic device. In some embodiments, the modules 610-690 are hardware circuit modules implemented by one or more integrated circuits (ICs) of the electronic device. Although modules 610-690 are shown as individual modules, some modules may be combined into a single module.

視訊源605提供原始視訊訊號，其呈現每個視訊幀的像素資料而不進行壓縮。減法器608計算視訊源605的原始視訊像素資料與來自運動補償模組630或幀內預測模組625的預測像素資料613之間的差值作為預測殘差609。變換模組610將差值（或殘差像素資料或殘差訊號）轉換成變換係數（例如，藉由執行離散余弦變換或DCT）。量化模組611將變換係數量化成量化資料（或量化係數）612，其由熵編碼器690編碼成位元流695。Video source 605 provides a raw video signal, which represents the pixel data of each video frame without compression. The subtractor 608 calculates the difference between the original video pixel data of the video source 605 and the predicted pixel data 613 from the motion compensation module 630 or the intra prediction module 625 as the prediction residual 609 . Transform module 610 converts the difference values (or residual pixel data or residual signal) into transform coefficients (eg, by performing a discrete cosine transform or DCT). The quantization module 611 quantizes the transform coefficients into quantized data (or quantized coefficients) 612, which is encoded into a bit stream 695 by the entropy encoder 690.

The inverse quantization module 614 dequantizes the quantized data (or quantized coefficients) 612 to obtain transform coefficients, and the inverse transform module 615 performs an inverse transform on the transform coefficients to produce a reconstructed residual 619. The reconstructed residual 619 is added to the predicted pixel data 613 to produce reconstructed pixel data 617. In some embodiments, the reconstructed pixel data 617 is temporarily stored in a line buffer (not shown) for intra prediction and spatial MV prediction. The reconstructed pixels are filtered by the loop filter 645 and stored in the reconstructed picture buffer 650. In some embodiments, the reconstructed picture buffer 650 is a memory external to the video encoder 600. In some embodiments, the reconstructed picture buffer 650 is a memory internal to the video encoder 600.
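The forward path (subtraction and quantization) and the reconstruction path (inverse quantization and addition) can be illustrated with a deliberately simplified sketch. It omits the transform and inverse transform stages entirely and assumes a uniform scalar quantizer with step size `step`; the function names and the one-dimensional sample lists are hypothetical.

```python
from typing import List


def quantize_residual(original: List[int], predicted: List[int],
                      step: int) -> List[int]:
    """Subtract the prediction and quantize the residual (transform omitted)."""
    residual = [o - p for o, p in zip(original, predicted)]
    return [round(r / step) for r in residual]


def reconstruct(levels: List[int], predicted: List[int], step: int) -> List[int]:
    """Dequantize the levels and add the prediction back, as a decoder would."""
    return [p + level * step for p, level in zip(predicted, levels)]
```

The sketch shows why the encoder carries its own inverse quantization path: the reconstructed samples differ from the original samples by the quantization error, and later predictions have to be formed from the same reconstructed values that the decoder will have.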

幀內估計模組620基於重構的像素資料617執行幀內預測以產生幀內預測資料。幀內預測資料被提供至熵編碼器690以被編碼成位元流695。幀內預測資料還被幀內預測模組625用來產生預測像素資料613。The intra estimation module 620 performs intra prediction based on the reconstructed pixel data 617 to generate intra prediction data. The intra prediction data is provided to an entropy encoder 690 to be encoded into a bit stream 695 . The intra prediction data is also used by the intra prediction module 625 to generate predicted pixel data 613 .

運動估計模組635藉由產生MV以參考存儲在重構圖片緩衝器650中的先前解碼幀的像素資料來執行幀間預測。這些MV被提供至運動補償模組630以產生預測像素資料。The motion estimation module 635 performs inter prediction by generating MVs to reference pixel data of previously decoded frames stored in the reconstructed picture buffer 650 . These MVs are provided to the motion compensation module 630 to generate predicted pixel data.

Rather than encoding the complete actual MVs in the bitstream, the video encoder 600 uses MV prediction to generate predicted MVs, and the difference between the MVs used for motion compensation and the predicted MVs is encoded as residual motion data and stored in the bitstream 695.

基於為編碼先前視訊幀而生成的參考MV，即用於執行運動補償的運動補償MV，MV預測模組675生成預測的MV。MV預測模組675從MV緩衝器665中獲取來自先前視訊幀的參考MV。視訊編碼器600將對當前視訊幀生成的MV存儲在MV緩衝器665中作為用於生成預測MV的參考MV。The MV prediction module 675 generates a predicted MV based on the reference MV generated for encoding the previous video frame, ie, the motion compensation MV used to perform motion compensation. The MV prediction module 675 obtains the reference MV from the previous video frame from the MV buffer 665 . The video encoder 600 stores the MV generated for the current video frame in the MV buffer 665 as a reference MV for generating a predicted MV.

MV預測模組675使用參考MV來創建預測的MV。預測的MV可以藉由空間MV預測或時間MV預測來計算。預測的MV和當前幀的運動補償MV（MC MV）之間的差值（殘差運動資料）由熵編碼器690編碼到位元流695中。The MV prediction module 675 uses the reference MV to create predicted MVs. The predicted MV can be calculated by spatial MV prediction or temporal MV prediction. The difference between the predicted MV and the motion compensated MV (MC MV) of the current frame (residual motion data) is encoded in the bit stream 695 by the entropy encoder 690 .
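The relationship between the MC MV, the predicted MV, and the residual motion data can be written out as a small sketch. The function names and the integer tuple representation of an MV are assumptions for illustration only.

```python
from typing import Tuple

MV = Tuple[int, int]


def residual_motion(mc_mv: MV, predicted_mv: MV) -> MV:
    """Encoder side: the residual motion data is the MC MV minus the predicted MV."""
    return (mc_mv[0] - predicted_mv[0], mc_mv[1] - predicted_mv[1])


def recover_mc_mv(residual: MV, predicted_mv: MV) -> MV:
    """Decoder side: add the residual motion data back to the same predicted MV."""
    return (residual[0] + predicted_mv[0], residual[1] + predicted_mv[1])
```

Only the residual is written to the bitstream, so the scheme relies on the encoder and the decoder deriving the same predicted MV from their respective MV buffers.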

熵編碼器690藉由使用諸如上下文適應性二進位算術編解碼（context-adaptive binary arithmetic coding，簡稱CABAC）或霍夫曼編碼的熵編解碼技術將各種參數和資料編碼到位元流695中。熵編碼器690將各種標頭元素、標誌連同量化的變換係數612和作為語法元素的殘差運動資料編碼到位元流695中。位元流695繼而被存儲在存放裝置中或藉由比如網路等通訊媒介傳輸到解碼器。Entropy encoder 690 encodes various parameters and data into bit stream 695 by using entropy coding and decoding techniques such as context-adaptive binary arithmetic coding (CABAC) or Huffman coding. The entropy encoder 690 encodes various header elements, flags along with the quantized transform coefficients 612 and residual motion data as syntax elements into the bit stream 695. The bit stream 695 is then stored in a storage device or transmitted to the decoder via a communication medium such as a network.

The loop filter 645 performs filtering or smoothing operations on the reconstructed pixel data 617 to reduce coding artifacts, particularly at the boundaries of pixel blocks. In some embodiments, the filtering operations performed by the loop filter 645 include a deblocking filter (DBF), sample adaptive offset (SAO), and/or an adaptive loop filter (ALF).

第7圖示出視訊編碼器600的實現基於CTU或基於幀的參考圖片清單的部分。具體地，該圖示出了視訊編碼器600的幀間預測模組640的組件。如圖所示，幀間預測模組640從MV緩衝器665獲取候選運動向量以及搜索重構圖片緩衝器650的內容以藉由運動補償來生成預測像素資料613。Figure 7 illustrates a portion of video encoder 600 that implements a CTU-based or frame-based reference picture list. Specifically, this figure shows the components of inter prediction module 640 of video encoder 600. As shown, the inter prediction module 640 obtains candidate motion vectors from the MV buffer 665 and searches the contents of the reconstructed picture buffer 650 to generate predicted pixel data 613 through motion compensation.

The inter prediction module 640 includes the motion compensation module 630, the motion estimation module 635, a representative MV selector 705, a reference picture list reordering module 710, and a reference picture list (RPL) 730 for the current CTU or frame.

The representative MV selector 705 obtains candidate motion vectors from the MV buffer 665 to derive or select a representative MV for the current CTU or frame. The representative MV selector 705 may derive or select the representative MV from MVs used to reconstruct one or more blocks neighboring the current CTU, from MVs of one or more blocks in the current CTU, from temporal MVs of a co-located CTU or a reference picture CTU, from an MV inherited from a neighboring position of the current CTU, or from a motion vector predictor (MVP) of a block in the current CTU.

RPL重新排序模組720使用所選擇的或導出的代表性MV（來自705）來計算與RPL 730中的參考圖片相關聯的模板匹配（template matching，簡稱TM）成本。在一些實施例中，與RPL 730中的每個參考圖片相關聯的成本基於(i)當前CTU的相鄰樣本或相鄰塊與(ii)由代表性MV識別的參考圖片中的對應參考樣本之間的差值測量來決定。（鄰近樣本和參考樣本從重構圖片緩衝器650獲取。）基於計算的成本，RPL重新排序模組720向RPL 730中的參考圖片分配索引。The RPL reordering module 720 uses the selected or derived representative MVs (from 705 ) to calculate a template matching (TM) cost associated with the reference picture in the RPL 730 . In some embodiments, the cost associated with each reference picture in RPL 730 is based on (i) neighboring samples or neighboring blocks of the current CTU and (ii) corresponding reference samples in the reference picture identified by the representative MV The difference between them is measured to determine. (Neighbor samples and reference samples are obtained from reconstructed picture buffer 650.) Based on the calculated cost, RPL reordering module 720 assigns indices to reference pictures in RPL 730.
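The cost computation and index assignment can be sketched as follows. The sketch makes several assumptions that the text does not state: the difference measure is a sum of absolute differences (SAD), the representative MV has integer precision, reference positions outside the picture are clamped to the border, and the picture with the lowest cost receives the smallest index. The names `template_cost` and `assign_indices` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Picture = List[List[int]]              # picture[y][x] holds one sample value
TemplateSample = Tuple[int, int, int]  # (x, y, reconstructed value)


def template_cost(template: Sequence[TemplateSample], reference: Picture,
                  mv: Tuple[int, int]) -> int:
    """SAD between the neighboring reconstructed samples of the current CTU
    and the samples that the representative MV points to in one reference picture.
    """
    height, width = len(reference), len(reference[0])
    dx, dy = mv
    cost = 0
    for x, y, value in template:
        ref_x = min(max(x + dx, 0), width - 1)   # clamp to the picture border
        ref_y = min(max(y + dy, 0), height - 1)
        cost += abs(value - reference[ref_y][ref_x])
    return cost


def assign_indices(template: Sequence[TemplateSample],
                   rpl: Dict[str, Picture],
                   mv: Tuple[int, int]) -> Dict[str, int]:
    """Assign index 0 to the reference picture with the lowest cost, and so on."""
    costs = {name: template_cost(template, picture, mv)
             for name, picture in rpl.items()}
    ordered = sorted(rpl, key=lambda name: costs[name])
    return {name: index for index, name in enumerate(ordered)}
```

Giving the smallest index to the best-matching reference picture is useful if smaller indices cost fewer bits to signal, which is the usual motivation for this kind of reordering.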

運動估計模組635執行運動估計以向運動補償模組630提供一個或多個運動向量以執行運動補償。運動估計模組635還藉由使用分配給RPL 730中的參考圖片的索引將所選擇的運動向量發送給熵編碼器。Motion estimation module 635 performs motion estimation to provide one or more motion vectors to motion compensation module 630 to perform motion compensation. The motion estimation module 635 also sends the selected motion vector to the entropy encoder using the index assigned to the reference picture in the RPL 730 .

第8圖概念性地示出處理800，處理800用於使用基於CTU的參考圖片清單來對像素塊進行編碼。在一些實施例中，實現編碼器600的計算設備的一個或多個處理單元（例如，處理器）藉由執行存儲在電腦可讀介質中的指令來執行處理800。在一些實施例中，實現編碼器600的電子裝置執行處理800。Figure 8 conceptually illustrates a process 800 for encoding pixel blocks using a CTU-based reference picture list. In some embodiments, one or more processing units (eg, processors) of a computing device implementing encoder 600 perform process 800 by executing instructions stored in a computer-readable medium. In some embodiments, an electronic device implementing encoder 600 performs process 800.

The video encoder receives (at block 810) a reference picture list (RPL) for a current coding tree unit (CTU) of a current picture. The RPL identifies a plurality of reference pictures.

The video encoder assigns (at block 820) indices to the plurality of reference pictures in the RPL of the current CTU. In some embodiments, the indices are assigned to the plurality of reference pictures in the RPL based on explicit signaling. In some embodiments, the indices are assigned based on a history-based table that records the distribution of reference picture selections made when each CTU of the current picture is encoded or decoded.

In some embodiments, the video encoder derives a representative motion vector (MV) for the current CTU and calculates costs for the plurality of reference pictures. The cost of each reference picture is calculated based on (i) neighboring samples of the current CTU and (ii) reference samples in the reference picture that are identified by the representative MV, and the indices are assigned to the plurality of reference pictures in the RPL based on the calculated costs.

In some embodiments, the representative MV is derived from MVs used to reconstruct one or more blocks neighboring the current CTU, and the representative MV may be a weighted average of those MVs. In some embodiments, the representative MV is derived from MVs of one or more blocks in the current CTU, and the representative MV may be a weighted average of the MVs of those blocks. In some embodiments, the representative MV is derived from temporal MVs of a co-located CTU or a reference picture CTU, and the representative MV may be a weighted average of those temporal MVs. In some embodiments, the representative MV is derived from an MV inherited from a neighboring position of the current CTU. The video encoder may also use a motion vector predictor (MVP) of a block in the current CTU as the representative MV of the CTU.

The video encoder receives (at block 830) data to be encoded as a plurality of blocks of the current CTU. The video encoder encodes (at block 840) the plurality of blocks of the CTU by using the assigned indices to select one or more reference pictures from the RPL to generate inter-predictions for the plurality of blocks of the current CTU.

V. Example Video Decoder

在一些實施例中，編碼器可以發送（或生成）位元流中的一個或多個語法元素，使得解碼器可以從位元流中解析所述一個或多個語法元素。In some embodiments, the encoder may send (or generate) one or more syntax elements in the bitstream such that the decoder may parse the one or more syntax elements from the bitstream.

第9圖示出可以實現基於CTU或基於幀的參考圖片列表的示例視訊解碼器900。如圖所示，視訊解碼器900是圖像解碼或視訊解碼電路，該圖像解碼或視訊解碼電路接收位元流995以及將位元流的內容解碼為視訊幀的像素資料以供顯示。視訊解碼器900具有用於解碼位元流995的若干組件或模組，包括選自以下的一些組件：逆量化模組911，逆變換模組910，幀內預測模組925，運動補償模組930，環路濾波器的945，解碼圖片緩衝器950，MV緩衝器965，MV預測模組975和解析器990。運動補償模組930是幀間預測模組940的一部分。Figure 9 illustrates an example video decoder 900 that may implement a CTU-based or frame-based reference picture list. As shown in the figure, the video decoder 900 is an image decoding or video decoding circuit that receives a bit stream 995 and decodes the content of the bit stream into pixel data of a video frame for display. Video decoder 900 has several components or modules for decoding bit stream 995, including some components selected from the following: inverse quantization module 911, inverse transform module 910, intra prediction module 925, motion compensation module 930, loop filter 945, decoded picture buffer 950, MV buffer 965, MV prediction module 975 and parser 990. Motion compensation module 930 is part of inter prediction module 940 .

在一些實施例中，模組910-990是由計算設備的一個或多個處理單元（例如，處理器）執行的軟體指令模組。在一些實施例中，模組910-990是由電子設備的一個或多個IC實現的硬體電路模組。儘管模組910-990被示為單獨的模組，但一些模組可以組合成單個模組。In some embodiments, modules 910-990 are modules of software instructions executed by one or more processing units (eg, processors) of a computing device. In some embodiments, modules 910-990 are hardware circuit modules implemented by one or more ICs of an electronic device. Although modules 910-990 are shown as individual modules, some modules may be combined into a single module.

The parser 990 (or entropy decoder) receives the bitstream 995 and performs initial parsing according to the syntax defined by a video coding or image coding standard. The parsed syntax elements include various header elements, flags, and quantized data (or quantized coefficients) 912. The parser 990 parses out the various syntax elements by using entropy coding techniques such as context-adaptive binary arithmetic coding (CABAC) or Huffman coding.

逆量化模組911對量化資料（或量化係數）912進行去量化以獲得變換係數，以及逆變換模組910對變換係數916進行逆變換以產生重構殘差訊號919。重構殘差訊號919與來自幀內預測模組925或運動補償模組930的預測像素資料913相加以產生解碼像素資料917。解碼像素資料由環路濾波器945濾波以及存儲在解碼圖片緩衝器950中。在一些實施例中，解碼圖片緩衝器950是視訊解碼器900外部的記憶體。在一些實施例中，解碼圖片緩衝器950是視訊解碼器900內部的記憶體。The inverse quantization module 911 dequantizes the quantized data (or quantized coefficients) 912 to obtain transform coefficients, and the inverse transform module 910 inversely transforms the transform coefficients 916 to generate a reconstructed residual signal 919 . The reconstructed residual signal 919 is added to the predicted pixel data 913 from the intra prediction module 925 or the motion compensation module 930 to generate decoded pixel data 917 . The decoded pixel data is filtered by loop filter 945 and stored in decoded picture buffer 950. In some embodiments, the decoded picture buffer 950 is a memory external to the video decoder 900 . In some embodiments, the decoded picture buffer 950 is an internal memory of the video decoder 900 .

幀內預測模組925從位元流995接收幀內預測資料，以及據此，從存儲在解碼圖片緩衝器950中的解碼像素資料917產生預測像素資料913。在一些實施例中，解碼像素資料917也被存儲在行緩衝器（未展示出）中，用於幀內預測和空間MV預測。Intra prediction module 925 receives intra prediction data from bitstream 995 and, accordingly, generates predicted pixel data 913 from decoded pixel data 917 stored in decoded picture buffer 950 . In some embodiments, decoded pixel data 917 is also stored in a line buffer (not shown) for intra prediction and spatial MV prediction.

In some embodiments, the contents of the decoded picture buffer 950 are used for display. A display device 955 either retrieves the contents of the decoded picture buffer 950 for direct display or retrieves the contents of the decoded picture buffer into a display buffer. In some embodiments, the display device receives pixel values from the decoded picture buffer 950 through a pixel transport.

運動補償模組930根據運動補償MV（MC MV）從解碼圖片緩衝器950中存儲的解碼像素資料917產生預測像素資料913。藉由將從位元流995接收的殘差運動資料與從MV預測模組975接收的預測MV相加，這些運動補償MV被解碼。The motion compensation module 930 generates predicted pixel data 913 from the decoded pixel data 917 stored in the decoded picture buffer 950 according to the motion compensated MV (MC MV). These motion compensated MVs are decoded by adding the residual motion data received from bit stream 995 to the predicted MV received from MV prediction module 975 .

MV預測模組975基於為解碼先前視訊幀而生成的參考MV（例如，用於執行運動補償的運動補償MV）生成預測的MV。MV預測模組975從MV緩衝器965中獲取先前視訊幀的參考MV。視訊解碼器900將用於解碼當前視訊幀而生成的運動補償MV存儲在MV緩衝器965中作為用於產生預測MV的參考MV。The MV prediction module 975 generates predicted MVs based on reference MVs generated for decoding previous video frames (eg, motion compensation MVs used to perform motion compensation). The MV prediction module 975 obtains the reference MV of the previous video frame from the MV buffer 965 . The video decoder 900 stores the motion compensated MV generated for decoding the current video frame in the MV buffer 965 as a reference MV for generating the predicted MV.

The loop filter 945 performs filtering or smoothing operations on the decoded pixel data 917 to reduce coding artifacts, particularly at the boundaries of pixel blocks. In some embodiments, the filtering or smoothing operations performed by the loop filter 945 include a deblocking filter (DBF), sample adaptive offset (SAO), and/or an adaptive loop filter (ALF).

第10圖示出實現基於CTU或基於幀的參考圖片列表的視訊解碼器900的部分。具體地，該圖示出視訊解碼器900的幀間預測模組940的組件。如圖所示，幀間預測模組940從MV緩衝器965和解碼圖片緩衝器950的內容獲取候選運動向量以藉由運動補償產生預測像素資料913。Figure 10 illustrates portions of a video decoder 900 that implements a CTU-based or frame-based reference picture list. Specifically, this figure illustrates the components of the inter prediction module 940 of the video decoder 900 . As shown, the inter prediction module 940 obtains candidate motion vectors from the contents of the MV buffer 965 and the decoded picture buffer 950 to generate predicted pixel data 913 through motion compensation.

幀間預測模組940包括運動補償模組930、運動解碼器1035、代表性MV選擇器1005、參考圖片列表重新排序模組1010以及當前CTU或幀的參考圖片清單（RPL）1030。The inter prediction module 940 includes a motion compensation module 930, a motion decoder 1035, a representative MV selector 1005, a reference picture list reordering module 1010, and a reference picture list (RPL) 1030 for the current CTU or frame.

The representative MV selector 1005 obtains candidate motion vectors from the MV buffer 965 to derive or select a representative MV for the current CTU or frame. The representative MV selector 1005 may derive or select the representative MV from MVs used to reconstruct one or more blocks neighboring the current CTU, from MVs of one or more blocks in the current CTU, from temporal MVs of a co-located CTU or a reference picture CTU, from an MV inherited from a neighboring position of the current CTU, or from a motion vector predictor (MVP) of a block in the current CTU.

The RPL reordering module 1020 uses the selected or derived representative MV (from 1005) to calculate the template matching (TM) costs associated with the reference pictures in the RPL 1030. In some embodiments, the cost associated with each reference picture in the RPL 1030 is determined based on a difference measure between (i) neighboring samples or neighboring blocks of the current CTU and (ii) corresponding reference samples in the reference picture that are identified by the representative MV. (The neighboring samples and the reference samples are obtained from the decoded picture buffer 950.) Based on the calculated costs, the RPL reordering module 1020 assigns indices to the reference pictures in the RPL 1030.

The entropy decoder 990 receives signaling that indicates the motion information of the current block and relays the motion information to the motion decoder 1035 as MVs for motion compensation (MC MVs). The motion decoder 1035 may use one or more indices provided by the entropy decoder 990 to identify one or more reference pictures in the RPL 1030. The motion compensation module 930 performs motion compensation according to the MC MVs by using prediction samples obtained from the decoded picture buffer 950. The obtained prediction samples are samples of the identified reference pictures.

第11圖概念性地示出用於使用基於CTU的參考圖片列表來解碼區塊的處理1100。在一些實施例中，實現解碼器900的計算設備的一個或多個處理單元（例如，處理器）藉由執行存儲在電腦可讀介質中的指令來執行處理1100。在一些實施例中，實現解碼器900的電子裝置執行處理1100。Figure 11 conceptually illustrates a process 1100 for decoding blocks using a CTU-based reference picture list. In some embodiments, one or more processing units (eg, processors) of a computing device implementing decoder 900 perform process 1100 by executing instructions stored in a computer-readable medium. In some embodiments, an electronic device implementing decoder 900 performs process 1100 .

The video decoder receives (at block 1110) a reference picture list (RPL) for a current coding tree unit (CTU) of a current picture. The RPL identifies a plurality of reference pictures.

The video decoder assigns (at block 1120) indices to the plurality of reference pictures in the RPL of the current CTU. In some embodiments, the indices are assigned to the plurality of reference pictures in the RPL based on explicit signaling. In some embodiments, the indices are assigned based on a history-based table that records the distribution of reference picture selections made when each CTU of the current picture is encoded or decoded.

In some embodiments, the video decoder derives a representative motion vector (MV) for the current CTU and calculates costs for the plurality of reference pictures. The cost of each reference picture is calculated based on (i) neighboring samples of the current CTU and (ii) reference samples in the reference picture that are identified by the representative MV, and the indices are assigned to the plurality of reference pictures in the RPL based on the calculated costs.

In some embodiments, the representative MV is derived from MVs used to reconstruct one or more blocks neighboring the current CTU, and the representative MV may be a weighted average of those MVs. In some embodiments, the representative MV is derived from MVs of one or more blocks in the current CTU, and the representative MV may be a weighted average of the MVs of those blocks. In some embodiments, the representative MV is derived from temporal MVs of a co-located CTU or a reference picture CTU, and the representative MV may be a weighted average of those temporal MVs. In some embodiments, the representative MV is derived from an MV inherited from a neighboring position of the current CTU. The video decoder may also use a motion vector predictor (MVP) of a block in the current CTU as the representative MV of the CTU.

The video decoder receives (at block 1130) data to be decoded as a plurality of blocks of the current CTU. The video decoder reconstructs (at block 1140) the plurality of blocks of the CTU by using the assigned indices to select one or more reference pictures from the RPL to generate inter-predictions for the plurality of blocks of the current CTU. The decoder may then provide the reconstructed current block for display as part of the reconstructed current picture.

VI. Example Electronic System

Many of the above-described features and applications are implemented as software processes that are specified as a set of instructions recorded on a computer-readable storage medium (also referred to as a computer-readable medium). When these instructions are executed by one or more computational or processing units (e.g., one or more processors, processor cores, or other processing units), they cause the processing units to perform the actions indicated in the instructions. Examples of computer-readable media include, but are not limited to, compact disc read-only memory (CD-ROM), flash drives, random-access memory (RAM) chips, hard drives, erasable programmable read-only memory (EPROM), and electrically erasable programmable read-only memory (EEPROM). Computer-readable media do not include carrier waves and electronic signals passing over wireless or wired connections.

在本說明書中，術語“軟體”意在包括駐留在唯讀記憶體中的韌體或存儲在磁記憶體中的應用程式，其可以讀入記憶體以供處理器處理。此外，在一些實施例中，多個軟體發明可以實現為更大程式的子部分，同時保留不同的軟體發明。在一些實施例中，多個軟體發明也可以實現為單獨的程式。最後，共同實現此處描述的軟體發明的單獨程式的任一組合都在本公開的範圍內。在一些實施例中，軟體程式，在被安裝以在一個或多個電子系統上運行時，定義一個或多個特定機器實施方式，該實施方式處理和執行軟體程式的操作。In this specification, the term "software" is intended to include firmware that resides in read-only memory or applications stored in magnetic memory that can be read into memory for processing by a processor. Furthermore, in some embodiments, multiple software inventions may be implemented as sub-portions of a larger program while retaining distinct software inventions. In some embodiments, multiple software inventions may also be implemented as separate programs. Finally, any combination of individual programs that together implement the software inventions described herein is within the scope of this disclosure. In some embodiments, a software program, when installed to run on one or more electronic systems, defines one or more specific machine implementations that process and perform the operations of the software program.

Figure 12 conceptually illustrates an electronic system 1200 with which some embodiments of the present disclosure are implemented. The electronic system 1200 may be a computer (e.g., a desktop computer, personal computer, tablet computer, etc.), a phone, a PDA, or any other type of electronic device. Such an electronic system includes various types of computer-readable media and interfaces for various other types of computer-readable media. The electronic system 1200 includes a bus 1205, a processing unit 1210, a graphics-processing unit (GPU) 1215, a system memory 1220, a network 1225, a read-only memory 1230, a permanent storage device 1235, an input device 1240, and an output device 1245.

The bus 1205 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 1200. For example, the bus 1205 communicatively connects the processing unit 1210 with the GPU 1215, the read-only memory 1230, the system memory 1220, and the permanent storage device 1235.

From these various memory units, the processing unit 1210 retrieves instructions to execute and data to process in order to execute the processes of the present disclosure. The processing unit may be a single processor or a multi-core processor in different embodiments. Some instructions are passed to and executed by the GPU 1215. The GPU 1215 can offload various computations or complement the image processing provided by the processing unit 1210.

The read-only memory (ROM) 1230 stores static data and instructions that are used by the processing unit 1210 and other modules of the electronic system. The permanent storage device 1235, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 1200 is off. Some embodiments of the present disclosure use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 1235.

Other embodiments use a removable storage device (such as a floppy disk, a flash memory device, etc., and its corresponding disk drive) as the permanent storage device. Like the permanent storage device 1235, the system memory 1220 is a read-and-write memory device. However, unlike the permanent storage device 1235, the system memory 1220 is a volatile read-and-write memory, such as a random-access memory. The system memory 1220 stores some of the instructions and data that the processor uses at runtime. In some embodiments, processes in accordance with the present disclosure are stored in the system memory 1220, the permanent storage device 1235, and/or the read-only memory 1230. For example, the various memory units include instructions for processing multimedia clips in accordance with some embodiments. From these various memory units, the processing unit 1210 retrieves instructions to execute and data to process in order to execute the processes of some embodiments.

The bus 1205 also connects to the input device 1240 and the output device 1245. The input devices 1240 enable the user to communicate information and select commands to the electronic system. The input devices 1240 include alphanumeric keyboards and pointing devices (also called "cursor control devices"), cameras (e.g., webcams), microphones or similar devices for receiving voice commands, and the like. The output devices 1245 display images generated by the electronic system or otherwise output data. The output devices 1245 include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD), as well as speakers or similar audio output devices. Some embodiments include devices, such as a touchscreen, that function as both input and output devices.

Finally, as shown in Figure 12, the bus 1205 also couples the electronic system 1200 to a network 1225 through a network interface card (not shown). In this manner, the computer can be a part of a network of computers (such as a local area network ("LAN"), a wide area network ("WAN"), or an intranet), or a network of networks, such as the Internet. Any or all components of the electronic system 1200 may be used in conjunction with the present disclosure.

Some embodiments include electronic components, such as microprocessors, storage, and memory, that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media). Some examples of such computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic and/or solid-state hard drives, read-only and recordable Blu-Ray® discs, ultra-density optical discs, any other optical or magnetic media, and floppy disks. The computer-readable media may store a computer program that is executable by at least one processing unit and that includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as that produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.

While the above discussion primarily refers to microprocessors or multi-core processors that execute software, many of the above-described features and applications are performed by one or more integrated circuits, such as application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself. In addition, some embodiments execute software stored in programmable logic devices (PLDs), ROM, or RAM devices.

As used in this specification and any claims of this application, the terms "computer", "server", "processor", and "memory" all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of this specification, the terms "display" or "displaying" mean displaying on an electronic device. As used in this specification and any claims of this application, the terms "computer-readable medium", "computer-readable media", and "machine-readable medium" are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.

While the present disclosure has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the present disclosure can be embodied in other specific forms without departing from the spirit of the present disclosure. In addition, a number of the figures (including Figures 8 and 11) conceptually illustrate processes. The specific operations of these processes may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments. Furthermore, a process could be implemented using several sub-processes, or as part of a larger macro-process. Thus, one of ordinary skill in the art will understand that the present disclosure is not limited by the foregoing illustrative details, but rather is defined by the appended claims.

Additional Notes

The subject matter described herein sometimes illustrates different components contained within, or connected with, other different components. It is to be understood that the depicted architectures are merely examples, and that in fact many other architectures can be implemented that achieve the same functionality. In a conceptual sense, any arrangement of components that achieves the same functionality is effectively "associated" such that the desired functionality is achieved. Hence, any two components combined to achieve a particular functionality can be seen as "associated with" each other such that the desired functionality is achieved, irrespective of architectures or intermediate components. Likewise, any two components so associated can also be viewed as being "operably connected" or "operably coupled" to each other to achieve the desired functionality, and any two components capable of being so associated can also be viewed as being "operably couplable" to each other to achieve the desired functionality. Specific examples of operably couplable components include, but are not limited to, physically mateable and/or physically interacting components, and/or wirelessly interactable and/or wirelessly interacting components, and/or logically interacting and/or logically interactable components.

Further, with respect to the use of substantially any plural and/or singular terms herein, those of ordinary skill in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application. The various singular/plural permutations may be expressly set forth herein for the sake of clarity.

Moreover, those of ordinary skill in the art will understand that, in general, terms used herein, and especially in the appended claims (e.g., the bodies of the appended claims), are generally intended as "open" terms: for example, the term "including" should be interpreted as "including but not limited to," the term "having" should be interpreted as "having at least," the term "includes" should be interpreted as "includes but is not limited to," and so on. Those of ordinary skill in the art will further understand that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following claims may contain the introductory phrases "at least one" and "one or more" to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles "a" or "an" limits any particular claim containing such introduced claim recitation to implementations containing only one such recitation, even when the same claim includes the introductory phrases "one or more" or "at least one" and indefinite articles such as "a" or "an"; such articles should be interpreted to mean "at least one" or "one or more." The same holds true for the use of definite articles used to introduce claim recitations. In addition, even if a specific number of an introduced claim recitation is explicitly recited, those of ordinary skill in the art will recognize that such recitation should be interpreted to mean at least the recited number; for example, the bare recitation of "two recitations," without other modifiers, means at least two recitations, or two or more recitations. Furthermore, in those instances where a convention analogous to "at least one of A, B, and C" is used, in general such a construction is intended in the sense that one of ordinary skill in the art would understand the convention; for example, "a system having at least one of A, B, and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together. Those of ordinary skill in the art will further understand that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" will be understood to include the possibilities of "A" or "B" or "A and B."

From the foregoing, it will be appreciated that various implementations of the present disclosure have been described herein for purposes of illustration, and that various modifications may be made without departing from the scope and spirit of the present disclosure. Accordingly, the various implementations disclosed herein are not intended to be limiting, with the true scope and spirit being indicated by the following claims.

200: Current CTU
210: Current picture
220: Representative MV
230: Neighboring samples/blocks
240: RPL
400: Current CTU
605: Video source
608: Subtractor
610: Transform module
611: Quantization module
612: Transform coefficients
613: Predicted pixel data
614: Inverse quantization module
615: Inverse transform module
616: Transform coefficients
617: Reconstructed pixel data
619: Reconstructed residual
620: Intra-picture estimation module
625: Intra-prediction module
630: Motion compensation module
635: Motion estimation module
640: Inter-prediction module
645: Loop filter
650: Reconstructed picture buffer
665: MV buffer
675: MV prediction module
695: Bitstream
705: Representative MV selector
720: RPL reordering module
730: RPL
800: Process
810, 820, 830, 840: Steps
900: Video decoder
910: Inverse transform module
911: Inverse quantization module
912: Quantized data
913: Predicted pixel data
916: Transform coefficients
917: Decoded pixel data
919: Reconstructed residual signal
925: Intra-prediction module
930: Motion compensation module
940: Inter-prediction module
950: Decoded picture buffer
955: Display device
965: MV buffer
975: MV prediction module
990: Entropy decoder
995: Bitstream
1005: Representative MV selector
1020: RPL reordering module
1030: RPL
1035: Motion decoder
1100: Process
1110, 1120, 1130, 1140: Steps
1200: Electronic system
1205: Bus
1210: Processing unit
1215: GPU
1220: System memory
1225: Network
1230: Read-only memory
1235: Permanent storage device
1240: Input device
1245: Output device

The accompanying drawings are included to provide a further understanding of the present disclosure, and are incorporated in and constitute a part of the present disclosure. The drawings illustrate implementations of the present disclosure and, together with the description, serve to explain the principles of the present disclosure. Note that the drawings are not necessarily to scale, as some components may be shown out of proportion to their size in an actual implementation in order to clearly illustrate the concepts of the present disclosure.

Figures 1A-B illustrate various syntax structures and elements related to reference picture management signaling.
Figure 2 conceptually illustrates reconstructed samples used in coding tree unit (CTU) based reference picture list reordering.
Figure 3 conceptually illustrates using a representative motion vector (MV) of a CTU to reorder the reference picture list (RPL) of the CTU.
Figure 4 illustrates neighboring positions used in CTU-based reference picture list reordering.
Figure 5 illustrates a current CTU and previously coded CTUs in the current picture.
Figure 6 illustrates an example video encoder that may implement CTU-based reference picture lists.
Figure 7 conceptually illustrates portions of the video encoder that implement CTU-based reference picture lists.
Figure 8 conceptually illustrates a process for encoding blocks of pixels using CTU-based reference picture lists.
Figure 9 illustrates an example video decoder that may implement CTU-based reference picture lists.
Figure 10 illustrates portions of the video decoder that implement CTU-based reference picture lists.
Figure 11 conceptually illustrates a process for decoding blocks of pixels using CTU-based reference picture lists.
Figure 12 conceptually illustrates an electronic system with which some embodiments of the present disclosure are implemented.


Claims

1. A video coding method, comprising: receiving a reference picture list (RPL) for a current coding tree unit (CTU) of a current picture, the reference picture list identifying a plurality of reference pictures; assigning a plurality of indices to the reference pictures in the reference picture list of the current CTU; receiving data to be encoded or decoded as a plurality of blocks of the current CTU; and encoding or decoding the blocks of the current CTU by using the assigned indices to select one or more reference pictures from the reference picture list to generate inter-predictions.

2. The video coding method of claim 1, wherein the indices are assigned to the reference pictures in the reference picture list based on explicit signaling.

3. The video coding method of claim 1, further comprising: deriving a representative motion vector (MV) for the current CTU; and computing a plurality of costs for the reference pictures, wherein the cost for each reference picture is computed based on (i) a plurality of neighboring samples of the current CTU and (ii) a plurality of reference samples of the reference picture identified by the representative MV, wherein the indices are assigned to the reference pictures in the reference picture list based on the computed costs.
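The cost-based index assignment recited in the preceding claim can be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the claim does not fix the cost metric or the sort direction, so the sum of absolute differences (SAD), integer-sample MVs, clamping at picture boundaries, and "lower cost gets a smaller index" are all assumptions, and the names `template_cost` and `assign_indices` are hypothetical.

```python
from typing import List, Sequence, Tuple

Picture = List[List[int]]  # 2-D sample array, addressed as picture[y][x]


def template_cost(neighbors: Sequence[Tuple[int, int, int]],
                  ref: Picture, mv: Tuple[int, int]) -> int:
    """SAD between neighboring samples and MV-displaced reference samples.

    neighbors holds (x, y, value) triples for reconstructed samples next to
    the current CTU. SAD and integer-sample MVs are assumptions of this sketch.
    """
    height, width = len(ref), len(ref[0])
    cost = 0
    for x, y, value in neighbors:
        # Clamp so the displaced position always falls inside the picture.
        rx = min(max(x + mv[0], 0), width - 1)
        ry = min(max(y + mv[1], 0), height - 1)
        cost += abs(value - ref[ry][rx])
    return cost


def assign_indices(neighbors: Sequence[Tuple[int, int, int]],
                   ref_pictures: Sequence[Picture],
                   mv: Tuple[int, int]) -> List[int]:
    """Return original RPL positions ordered by ascending cost (assumed rule)."""
    costs = [template_cost(neighbors, ref, mv) for ref in ref_pictures]
    # sorted() is stable, so equal costs keep their original relative order.
    return sorted(range(len(ref_pictures)), key=lambda i: costs[i])
```

With two 4x4 reference pictures, one all zeros and one all tens, and neighboring samples of value 10, the second picture has zero cost under this sketch and would therefore receive the smaller index.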

4. The video coding method of claim 3, wherein the representative MV is derived from a plurality of motion vectors used to reconstruct one or more blocks neighboring the current CTU.

5. The video coding method of claim 4, wherein the representative MV is a weighted average of the motion vectors used to reconstruct the blocks neighboring the current CTU.
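A weighted average of motion vectors, as recited in the preceding claim, could be computed along the following lines. The claim does not say how the weights are chosen or how the result is rounded, so the caller-supplied weights (for example, block areas) and rounding to the nearest integer are assumptions of this sketch, and `weighted_average_mv` is a hypothetical name.

```python
from typing import Sequence, Tuple


def weighted_average_mv(mvs: Sequence[Tuple[int, int]],
                        weights: Sequence[float]) -> Tuple[int, int]:
    """Combine several MVs into one representative MV by weighted averaging.

    The weights are hypothetical (e.g. neighboring block areas); Python's
    round() is used, which rounds exact halves to the nearest even integer.
    """
    if len(mvs) != len(weights) or not mvs:
        raise ValueError("need one weight per motion vector")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    avg_x = sum(w * mv[0] for mv, w in zip(mvs, weights)) / total
    avg_y = sum(w * mv[1] for mv, w in zip(mvs, weights)) / total
    return round(avg_x), round(avg_y)
```

For instance, averaging the MVs (4, 0) and (0, 8) with weights 1 and 3 gives (1, 6) under these assumptions.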

6. The video coding method of claim 3, wherein the representative MV is derived from a plurality of motion vectors of one or more blocks in the current CTU.

7. The video coding method of claim 6, wherein the representative MV is a weighted average of the motion vectors of the one or more blocks in the current CTU.

8. The video coding method of claim 3, wherein the representative MV is derived from a plurality of temporal MVs from a co-located CTU or a reference picture CTU.

9. The video coding method of claim 8, wherein the representative MV is a weighted average of the temporal MVs from the co-located CTU or the reference picture CTU.

10. The video coding method of claim 3, wherein the representative MV is derived from a plurality of MVs inherited from a plurality of neighboring positions of the current CTU.

11. The video coding method of claim 3, wherein a motion vector predictor of a block in the current CTU is used as the representative MV of the current CTU.

12. The video coding method of claim 3, wherein the neighboring samples of the current CTU are located in a plurality of CTUs neighboring the current CTU.

13. The video coding method of claim 1, wherein the indices are assigned to the reference pictures in the reference picture list according to a history-based table that records a distribution of reference picture selections made when encoding or decoding each CTU of the current picture.
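One possible reading of the history-based table in the preceding claim is a per-picture counter of how often each reference picture was selected by already coded CTUs. The sketch below assumes that more frequently selected pictures receive smaller indices and that ties keep their original RPL order; neither rule is stated in the claim, and the class `ReferenceHistory` and its methods are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Sequence


class ReferenceHistory:
    """Counts reference picture selections within the current picture."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def record(self, ref_id: Hashable) -> None:
        """Note that a coded block selected the given reference picture."""
        self.counts[ref_id] += 1

    def reorder(self, rpl: Sequence[Hashable]) -> List[Hashable]:
        """Order the RPL by descending selection count (assumed rule).

        Counter returns 0 for pictures never selected, and sorted() is
        stable, so unseen or tied pictures keep their original order.
        """
        return sorted(rpl, key=lambda ref_id: -self.counts[ref_id])
```

A coder following this sketch would call `record` while coding each CTU, call `reorder` before coding the next CTU, and start a fresh table for every new picture.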

14. An electronic apparatus, comprising: a video coder circuit configured to perform operations comprising: receiving a reference picture list (RPL) for a current coding tree unit (CTU) of a current picture, the reference picture list identifying a plurality of reference pictures; assigning a plurality of indices to the reference pictures in the reference picture list of the current CTU; receiving data to be encoded or decoded as a plurality of blocks of the current CTU; and encoding or decoding the blocks of the current CTU by using the assigned indices to select one or more reference pictures from the reference picture list to generate inter-predictions.

15. A video decoding method, comprising: receiving a reference picture list (RPL) for a current coding tree unit (CTU) of a current picture, the reference picture list identifying a plurality of reference pictures; assigning a plurality of indices to the reference pictures in the reference picture list of the current CTU; receiving data to be decoded as a plurality of blocks of the current CTU; and reconstructing the blocks of the current CTU by using the assigned indices to select one or more reference pictures from the reference picture list to generate inter-predictions.