TW202412522A - Constraining convolution model coefficient - Google Patents

Publication number: TW202412522A
Application number: TW112128822A
Authority: TW (Taiwan)
Other languages: Chinese (zh)
Inventors: 莊政彥, 陳慶曄
Original assignee: 聯發科技股份有限公司 (MediaTek Inc.)

Abstract

A method for performing cross component prediction by constraining the coefficients of a component prediction model is provided. A video coder receives data for a block of pixels to be encoded or decoded as a current block of a current picture of a video. The video coder derives a set of coefficients based on corresponding input and output component samples. The video coder constrains the derived set of coefficients based on a set of constraints. The video coder applies the constrained set of coefficients as a component prediction model to generate a predictor for the current block. The video coder encodes or decodes the current block by using the generated predictor.

Description

Constraining the Coefficients of a Convolution Model

The present application relates generally to video coding. In particular, the present application relates to methods of coding a block of pixels by cross-component prediction.

Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims listed below and are not admitted to be prior art by inclusion in this section.

High Efficiency Video Coding (HEVC) is an international video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC). HEVC is based on a hybrid block-based motion-compensated DCT-like transform coding architecture. The basic unit of compression, termed a coding unit (CU), is a 2Nx2N square block, and each CU can be recursively split into four smaller CUs until a predetermined minimum size is reached. Each CU contains one or more prediction units (PUs).

Versatile Video Coding (VVC) is the latest international video coding standard developed by the Joint Video Experts Team (JVET) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11. The input video signal is predicted from reconstructed signals, which are derived from already-coded picture regions. The prediction residual signal is processed by a block transform. The transform coefficients are quantized and entropy coded together with other side information in the bitstream. The reconstructed signal is generated from the prediction signal and the reconstructed residual signal after inverse-transforming the de-quantized transform coefficients. The reconstructed signal is further processed by in-loop filtering to remove coding artifacts. The decoded pictures are stored in a frame buffer for predicting future pictures in the input video signal.

In VVC, a coded picture is partitioned into non-overlapping square block regions represented by coding tree units (CTUs). The leaf nodes of a coding tree correspond to coding units (CUs). A coded picture can be represented by a collection of slices, each comprising an integer number of CTUs. The individual CTUs in a slice are processed in raster-scan order. A bi-predictive (B) slice may be decoded using intra prediction or inter prediction, with at most two motion vectors and reference indices to predict the sample values of each block. A predictive (P) slice is decoded using intra prediction or inter prediction, with at most one motion vector and reference index to predict the sample values of each block. An intra (I) slice is decoded using intra prediction only.

A CTU can be partitioned into one or more non-overlapping coding units (CUs) using a quadtree (QT) with a nested multi-type tree (MTT) structure to adapt to various local motion and texture characteristics. A CU can be further split into smaller CUs using one of five split types: quadtree partitioning, vertical binary-tree partitioning, horizontal binary-tree partitioning, vertical center-side ternary-tree partitioning, and horizontal center-side ternary-tree partitioning.

Each CU contains one or more prediction units (PUs). The prediction unit, together with the associated CU syntax, works as a basic unit for signaling predictor information. The specified prediction process is applied to predict the values of the associated pixel samples inside the PU. Each CU may contain one or more transform units (TUs) for representing the prediction residual blocks. A transform unit (TU) is comprised of a transform block (TB) of luma samples and two corresponding transform blocks of chroma samples, each TB corresponding to one residual block of samples of one color component. An integer transform is applied to a transform block. The level values of the quantized coefficients, together with the other side information, are entropy coded in the bitstream. The terms coding tree block (CTB), coding block (CB), prediction block (PB), and transform block (TB) are defined to specify the two-dimensional sample array of one color component associated with the CTU, CU, PU, and TU, respectively. Thus, a CTU consists of one luma CTB, two chroma CTBs, and associated syntax elements. A similar relationship is valid for CU, PU, and TU.

For each inter-predicted CU, motion parameters consisting of motion vectors, reference picture indices, a reference picture list usage index, and additional information are used for inter-predicted sample generation. The motion parameters can be signaled in an explicit or implicit manner. When a CU is coded with skip mode, the CU is associated with one PU and has no significant residual coefficients, no coded motion vector delta, and no reference picture index. A merge mode is specified whereby the motion parameters for the current CU are obtained from neighboring CUs, including spatial and temporal candidates, as well as additional schedules introduced in VVC. The merge mode can be applied to any inter-predicted CU. The alternative to the merge mode is the explicit transmission of motion parameters, where the motion vector, the corresponding reference picture index for each reference picture list, the reference picture list usage flag, and other needed information are signaled explicitly per CU.

The following summary is illustrative only and is not intended to be limiting in any manner. That is, the following summary is provided to introduce concepts, highlights, benefits, and advantages of the novel and non-obvious techniques described herein. Select, and not all, implementations are further described below in the detailed description. Thus, the following summary is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining the scope of the claimed subject matter.

Some embodiments of the present disclosure provide a method for performing cross-component prediction by constraining the coefficients of a component prediction model. A video coder receives data for a block of pixels to be encoded or decoded as a current block of a current picture of a video. The video coder derives a set of coefficients based on corresponding input and output component samples. The video coder constrains the derived set of coefficients based on a set of constraints. The video coder applies the constrained set of coefficients as a component prediction model to generate a predictor for the current block. The video coder encodes or decodes the current block by using the generated predictor.

In some embodiments, the component prediction model is a cross-component model that generates predicted chroma samples based on reconstructed luma samples of the current block. The corresponding input and output component samples are corresponding luma and chroma samples of a template region adjacent to the current block. The predictor may include the generated predicted chroma samples. In some embodiments, the component prediction model is a convolutional model, and the set of coefficients is derived by solving a matrix equation between the corresponding input and output component samples.

In some embodiments, the coder constrains the derived set of coefficients by clipping the coefficients at a clipping threshold, by clipping different coefficients at different clipping thresholds, or by limiting the coefficients to a predefined range. In some embodiments, the derived set of coefficients and the constrained set of coefficients are represented in the coder in fixed point, whose fractional portion includes N bits, and the size of the predefined range is scaled up by 1<<N. In some embodiments, when a derived coefficient falls outside the predefined range, the set of coefficients is set to be equal to an identity filter, or the derived set of coefficients is not used to encode or decode the current block (i.e., CCCM mode is disabled).

In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant teachings. Any variations, derivatives, and/or extensions based on the teachings described herein are within the protective scope of the present disclosure. In some instances, well-known methods, procedures, components, and/or circuitry pertaining to one or more example implementations disclosed herein may be described at a relatively high level without detail, in order to avoid unnecessarily obscuring aspects of the teachings of the present disclosure.

I. Cross-Component Linear Model (CCLM)

The Cross-Component Linear Model (CCLM) or Linear Model (LM) mode is a cross-component prediction mode in which the chroma components of a block are predicted from the collocated reconstructed luma samples by a linear model. The parameters (e.g., scale and offset) of the linear model are derived from the already reconstructed luma and chroma samples adjacent to the block. For example, in VVC, the CCLM mode makes use of inter-channel dependencies to predict the chroma samples from reconstructed luma samples. This prediction is carried out using a linear model in the form of:

    predC(i,j) = α · recL′(i,j) + β                                          (1)

In equation (1), predC(i,j) represents the predicted chroma samples in a CU (or the predicted chroma samples of the current CU), and recL′(i,j) represents the down-sampled reconstructed luma samples of the same CU (or the corresponding reconstructed luma samples of the current CU).
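As an illustration, equation (1) can be sketched as follows (the `cclm_predict` helper and its floating-point coefficients are hypothetical; the actual codec operates in fixed-point arithmetic):

```python
def cclm_predict(rec_luma, alpha, beta, bit_depth=10):
    """Apply the CCLM linear model predC = alpha * recL' + beta to a list
    of down-sampled reconstructed luma samples, clipping each result to
    the valid chroma sample range for the given bit depth."""
    max_val = (1 << bit_depth) - 1
    return [min(max(int(alpha * l + beta), 0), max_val) for l in rec_luma]
```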

The CCLM model parameters α (scaling parameter) and β (offset parameter) are derived based on at most four neighboring chroma samples and their corresponding down-sampled luma samples. In LM_A mode (also denoted as LM-T mode), only the above or top neighboring template is used to calculate the linear model coefficients. In LM_L mode (also denoted as LM-L mode), only the left template is used to calculate the linear model coefficients. In LM-LA mode (also denoted as LM-LT mode), both the left and above templates are used to calculate the linear model coefficients.

FIG. 1 conceptually illustrates the chroma and luma samples used for deriving the linear model parameters. The figure shows a current block 100 having luma component samples and chroma component samples in 4:2:0 format. The luma and chroma samples neighboring the current block are reconstructed samples. These reconstructed samples are used to derive the cross-component linear model (parameters α and β). Since the current block is in 4:2:0 format, the luma samples are first down-sampled before being used in linear model derivation. In this example, there are 16 pairs of reconstructed (down-sampled) luma and chroma samples neighboring the current block. These 16 pairs of luma-versus-chroma values are used to derive the linear model parameters.

Assuming that the current chroma block dimensions are W×H, then W' and H' are set as:
- W' = W, H' = H when LM-LT mode is applied;
- W' = W + H when LM-T mode is applied;
- H' = H + W when LM-L mode is applied.

The above neighboring positions are denoted as S[0, -1]...S[W'-1, -1], and the left neighboring positions are denoted as S[-1, 0]...S[-1, H'-1]. The four samples are then selected as:
- S[W'/4, -1], S[3*W'/4, -1], S[-1, H'/4], S[-1, 3*H'/4] when LM mode is applied (both above and left neighboring samples are available);
- S[W'/8, -1], S[3*W'/8, -1], S[5*W'/8, -1], S[7*W'/8, -1] when LM-T mode is applied (only the above neighboring samples are available);
- S[-1, H'/8], S[-1, 3*H'/8], S[-1, 5*H'/8], S[-1, 7*H'/8] when LM-L mode is applied (only the left neighboring samples are available).
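The position selection above can be sketched as follows (a hypothetical helper; it folds in the W' = W + H and H' = H + W template extensions for the single-side modes):

```python
def cclm_sample_positions(w, h, mode):
    """Return the four neighbouring (x, y) sample positions used for CCLM
    parameter derivation. mode is 'LM' (above + left), 'LM-T' (above only,
    template extended to W + H) or 'LM-L' (left only, extended to H + W)."""
    if mode == 'LM':
        wp, hp = w, h
        return [(wp // 4, -1), (3 * wp // 4, -1),
                (-1, hp // 4), (-1, 3 * hp // 4)]
    if mode == 'LM-T':
        wp = w + h
        return [(wp // 8, -1), (3 * wp // 8, -1),
                (5 * wp // 8, -1), (7 * wp // 8, -1)]
    hp = h + w  # 'LM-L'
    return [(-1, hp // 8), (-1, 3 * hp // 8),
            (-1, 5 * hp // 8), (-1, 7 * hp // 8)]
```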

The four neighboring luma samples at the selected positions are down-sampled and compared four times to find the two larger values x0A and x1A and the two smaller values x0B and x1B. Their corresponding chroma sample values are denoted as y0A, y1A, y0B, and y1B. Then XA, XB, YA, and YB are derived as:

    Xa = (x0A + x1A + 1) >> 1;  Xb = (x0B + x1B + 1) >> 1                    (2)
    Ya = (y0A + y1A + 1) >> 1;  Yb = (y0B + y1B + 1) >> 1                    (3)

The linear model parameters α and β are obtained according to the following equations:

    α = (Ya − Yb) / (Xa − Xb)                                                (4)
    β = Yb − α · Xb                                                          (5)
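Equations (2) through (5) can be sketched together as follows (a hypothetical helper using floating-point division; the actual derivation replaces the division with a look-up table, as described below):

```python
def derive_cclm_params(luma4, chroma4):
    """Derive (alpha, beta) from four neighbouring (luma, chroma) pairs:
    average the two larger and the two smaller luma values (and the chroma
    values at the same positions), then fit a line through the two points."""
    pairs = sorted(zip(luma4, chroma4), key=lambda p: p[0])
    (x0b, y0b), (x1b, y1b), (x0a, y0a), (x1a, y1a) = pairs
    xa = (x0a + x1a + 1) >> 1   # equation (2)
    xb = (x0b + x1b + 1) >> 1
    ya = (y0a + y1a + 1) >> 1   # equation (3)
    yb = (y0b + y1b + 1) >> 1
    alpha = (ya - yb) / (xa - xb) if xa != xb else 0.0   # equation (4)
    beta = yb - alpha * xb                               # equation (5)
    return alpha, beta
```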

The operations to calculate the α and β parameters according to equations (4) and (5) may be implemented by a look-up table. In some embodiments, to reduce the memory required for storing the look-up table, the diff value (the difference between the maximum and minimum values) and the parameter α are expressed in an exponential notation. For example, diff is approximated with a 4-bit significand and an exponent. Consequently, the table for 1/diff is reduced to 16 elements for the 16 values of the significand as follows:

    DivTable[] = {0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0}           (6)

This reduces both the complexity of the calculations and the memory size required for storing the needed tables.

In some embodiments, to get more samples for calculating the CCLM model parameters α and β, the above template is extended to contain (W+H) samples for LM-T mode, and the left template is extended to contain (H+W) samples for LM-L mode. For LM-LT mode, both the extended left template and the extended above template are used to calculate the linear model coefficients.

To match the chroma sample positions of a 4:2:0 video sequence, two types of down-sampling filters are applied to the luma samples to achieve a 2-to-1 down-sampling ratio in both the horizontal and vertical directions. The selection of the down-sampling filter is specified by a sequence parameter set (SPS) level flag. The two down-sampling filters, corresponding to "type-0" and "type-2" content respectively, are as follows:

    recL′(i,j) = [ recL(2i−1, 2j) + 2·recL(2i, 2j) + recL(2i+1, 2j)
                 + recL(2i−1, 2j+1) + 2·recL(2i, 2j+1) + recL(2i+1, 2j+1) + 4 ] >> 3   (7)

    recL′(i,j) = [ recL(2i, 2j−1) + recL(2i−1, 2j) + 4·recL(2i, 2j)
                 + recL(2i+1, 2j) + recL(2i, 2j+1) + 4 ] >> 3                           (8)
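A sketch of the two down-sampling filters, assuming VVC-style kernel weights ([1 2 1; 1 2 1] / 8 for the 6-tap filter and a plus-shaped [1; 1 4 1; 1] / 8 for the 5-tap filter); the helper names and the `luma[row][col]` indexing are illustrative:

```python
def downsample_type0(luma, x, y):
    """6-tap 'type-0' filter: average a 3x2 luma neighbourhood with
    weights [1 2 1; 1 2 1] / 8 (chroma sited between two luma rows)."""
    return (luma[2*y][2*x - 1] + 2 * luma[2*y][2*x] + luma[2*y][2*x + 1]
            + luma[2*y + 1][2*x - 1] + 2 * luma[2*y + 1][2*x]
            + luma[2*y + 1][2*x + 1] + 4) >> 3

def downsample_type2(luma, x, y):
    """5-tap 'type-2' filter: plus-shaped kernel [0 1 0; 1 4 1; 0 1 0] / 8
    centred on the luma sample collocated with the chroma sample."""
    return (luma[2*y - 1][2*x] + luma[2*y][2*x - 1] + 4 * luma[2*y][2*x]
            + luma[2*y][2*x + 1] + luma[2*y + 1][2*x] + 4) >> 3
```

On a flat region both filters return the input value, since their weights sum to 8 and are normalized by the >> 3 shift.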

In some embodiments, the α and β parameter computations are performed as part of the decoding process, and not just as an encoder search operation. As a result, no syntax is used to convey the α and β values to the decoder.

For chroma intra mode coding, a total of 8 intra modes are allowed. Those modes include five traditional intra modes and three cross-component linear model modes (LM_LA, LM_A, and LM_L). Chroma intra mode coding may directly depend on the intra prediction mode of the corresponding luma block. The chroma intra mode signaling and the corresponding luma intra prediction modes are shown in the following table:

    Chroma intra       Corresponding luma intra prediction mode
    prediction mode     0    50    18     1    X (0 <= X <= 66)
    0                  66     0     0     0     0
    1                  50    66    50    50    50
    2                  18    18    66    18    18
    3                   1     1     1    66     1
    4                   0    50    18     1     X
    5                  81    81    81    81    81
    6                  82    82    82    82    82
    7                  83    83    83    83    83

Since separate block partitioning structures for the luma and chroma components are enabled in I slices, one chroma block may correspond to multiple luma blocks. Therefore, for the chroma derived mode (DM), the intra prediction mode of the corresponding luma block covering the center position of the current chroma block is directly inherited.

A unified binarization table (mapping modes to bin strings) is used for the chroma intra prediction modes, according to the following table:

    Chroma intra prediction mode    Bin string
    4                               00
    0                               0100
    1                               0101
    2                               0110
    3                               0111
    5                               10
    6                               110
    7                               111

In the table, the first bin indicates whether the mode is a regular mode (0) or an LM mode (1). If it is an LM mode, the next bin indicates whether it is LM_CHROMA (0). If it is not LM_CHROMA, the next bin indicates whether it is LM_L (0) or LM_A (1). For this case, when sps_cclm_enabled_flag is 0, the first bin of the binarization table for the corresponding intra_chroma_pred_mode can be discarded prior to entropy coding. In other words, the first bin is inferred to be 0 and hence not coded. This single binarization table is used for both the sps_cclm_enabled_flag equal to 0 and equal to 1 cases. The first two bins in the table are context coded with their own context models, and the rest of the bins are bypass coded.

In addition, in order to reduce luma-chroma latency in dual tree, when the 64x64 luma coding tree node is not split (and ISP is not used for the 64x64 CU) or is partitioned with QT, the chroma CUs in a 32x32 / 32x16 chroma coding tree node are allowed to use CCLM in the following way:
- If the 32x32 chroma node is not split or is partitioned with QT split, all chroma CUs in the 32x32 node can use CCLM.
- If the 32x32 chroma node is partitioned with Horizontal BT, and its 32x16 child node is not split or uses Vertical BT split, all chroma CUs in the 32x16 chroma node can use CCLM.
- In all the other luma and chroma coding tree split conditions, CCLM is not allowed for the chroma CUs.

II. Multi-Model CCLM (MMLM)

The Multi-Model CCLM mode (MMLM) uses two models for predicting the chroma samples from the luma samples of the whole CU. Similar to CCLM, three multi-model CCLM modes (MMLM_LA, MMLM_A, and MMLM_L) are used to indicate whether both the above and left neighboring samples, only the above neighboring samples, or only the left neighboring samples are used in model parameter derivation.

In MMLM, the neighboring luma samples and the neighboring chroma samples of the current block are classified into two groups, and each group is used as a training set to derive a linear model (i.e., a particular α and a particular β are derived for a particular group). Furthermore, the samples of the current luma block are also classified based on the same rule used for the classification of the neighboring luma samples.

FIG. 2 shows an example of classifying the neighboring samples into two groups. Threshold is calculated as the average value of the neighboring reconstructed luma samples. A neighboring sample at [x,y] with Rec′L[x,y] <= Threshold is classified into group 1, while a neighboring sample at [x,y] with Rec′L[x,y] > Threshold is classified into group 2. The multi-model CCLM prediction of the chroma samples is therefore:

    PredC[x,y] = α1 × Rec′L[x,y] + β1    if Rec′L[x,y] ≤ Threshold
    PredC[x,y] = α2 × Rec′L[x,y] + β2    if Rec′L[x,y] > Threshold

III. Convolutional Cross-Component Model (CCCM)
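The two-model classification and prediction above can be sketched as follows (a hypothetical helper; the threshold is computed with integer averaging, and the per-group (α, β) pairs are passed in rather than derived):

```python
def mmlm_predict(rec_luma, neigh_luma, models):
    """Two-model MMLM sketch: Threshold is the average of the neighbouring
    reconstructed luma samples; samples at or below it use model 1, samples
    above it use model 2. models = ((a1, b1), (a2, b2))."""
    threshold = sum(neigh_luma) // len(neigh_luma)
    out = []
    for l in rec_luma:
        a, b = models[0] if l <= threshold else models[1]
        out.append(a * l + b)
    return out
```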

In some embodiments, a convolutional cross-component model (CCCM) is applied to improve cross-component prediction performance. For some embodiments, the convolutional model has a 7-tap filter that includes a 5-tap plus-sign-shaped spatial component, a nonlinear term, and a bias term. The input to the spatial 5-tap component of the filter includes a center (C) luma sample, which is collocated with the chroma sample to be predicted, and its above/north (N), below/south (S), left/west (W), and right/east (E) neighboring samples. FIG. 3 conceptually illustrates the spatial components of the convolutional filter. The nonlinear term (denoted as P) is represented as the square of the center luma sample C, scaled to the sample value range of the content:

    P = (C*C + midVal) >> bitDepth                                           (9)

Thus, for 10-bit content, the nonlinear term P is calculated as:

    P = (C*C + 512) >> 10
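Equation (9) can be sketched directly (a hypothetical helper; `mid_val` is taken as half the sample range, i.e. 512 for 10-bit content):

```python
def nonlinear_term(c, bit_depth=10):
    """Compute the nonlinear CCCM input P of equation (9): the squared
    centre luma sample, rounded and scaled back into the sample range."""
    mid_val = 1 << (bit_depth - 1)
    return (c * c + mid_val) >> bit_depth
```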

The bias term (denoted as B) represents a scalar offset between the input and the output (similar to the offset term in CCLM) and is set to the middle chroma value (512 for 10-bit content). The output of the filter is calculated as a convolution between the filter coefficients ci and the input values, and clipped to the range of valid chroma samples:

    predChromaVal = c0·C + c1·N + c2·S + c3·E + c4·W + c5·P + c6·B           (10)
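Evaluating equation (10) for one sample can be sketched as follows (a hypothetical helper; the P and B inputs are formed internally from the centre sample and the bit depth):

```python
def cccm_predict(coeffs, c, n, s, e, w, bit_depth=10):
    """Evaluate the 7-tap CCCM filter of equation (10) for one chroma
    sample. coeffs = (c0, ..., c6); P is the nonlinear term of equation
    (9) and B is the middle chroma value. The output is clipped to the
    valid chroma sample range."""
    p = (c * c + (1 << (bit_depth - 1))) >> bit_depth   # nonlinear term
    b = 1 << (bit_depth - 1)                            # bias input B
    c0, c1, c2, c3, c4, c5, c6 = coeffs
    val = c0 * c + c1 * n + c2 * s + c3 * e + c4 * w + c5 * p + c6 * b
    return min(max(int(val), 0), (1 << bit_depth) - 1)
```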

The filter coefficients ci are calculated by minimizing the MSE between the reconstructed (or target) chroma samples and the corresponding predicted chroma samples over a reference area. Each predicted chroma sample is generated from a collocated luma sample and its surrounding luma samples using the derived component prediction model (e.g., equation (10)). Equation (10) is a convolutional model based on the center sample and 4 surrounding samples (C, N, S, E, W). Equation (10) can be extended to include taps of the center sample and 8 surrounding samples (C, N, S, E, W, NE, NW, SE, SW). Equation (10) and its extended forms may be referred to as component prediction models (as they may be used for cross-component or intra-component prediction).

FIG. 4 shows a reference area used to derive the filter coefficients of the convolutional model for a current block. The reference area includes (reference) lines of (chroma) samples above and to the left of the current block 400. (In this example, the current block 400 is a PU.) The reference area extends one PU width to the right and one PU height below the PU boundaries. The area may be adjusted to include only available samples. The extensions to the reference area are used to support the "side samples" of the plus-sign-shaped spatial filter (e.g., the N, E, W, S samples, as well as the NW, NE, SW, SE samples, described in FIG. 3) and are padded when in unavailable areas.

The MSE minimization may be performed by calculating the autocorrelation matrix for the luma input and a cross-correlation vector between the luma input and the chroma output. The autocorrelation matrix is LDL decomposed, and the final filter coefficients are calculated using back substitution. The process is similar to the calculation of the ALF filter coefficients in ECM; however, in some embodiments, LDL decomposition is chosen instead of Cholesky decomposition to avoid using square root operations.
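The least-squares derivation above can be sketched as follows. This is a minimal illustration, not the normative process: plain Gaussian elimination with partial pivoting stands in for the LDL decomposition, and floating-point arithmetic stands in for the fixed-point implementation.

```python
def solve_filter_coeffs(inputs, targets):
    """Least-squares sketch of the coefficient derivation: build the
    autocorrelation matrix A = X^T X and the cross-correlation vector
    b = X^T y from the reference-area samples (inputs = rows of filter
    inputs, targets = reference chroma values), then solve A c = b."""
    n = len(inputs[0])
    a = [[sum(r[i] * r[j] for r in inputs) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(inputs, targets)) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= f * a[col][k]
            b[r] -= f * b[col]
    # Back substitution.
    c = [0.0] * n
    for i in range(n - 1, -1, -1):
        c[i] = (b[i] - sum(a[i][k] * c[k] for k in range(i + 1, n))) / a[i][i]
    return c
```

Fitting the points (1, 3), (2, 5), (3, 7) with a two-tap model (sample value plus a constant bias input of 1) recovers the generating line y = 2x + 1.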

In some embodiments, a higher-order model, rather than a linear model, may be used to predict the chroma samples. The higher-order model may include k-tap spatial terms, a nonlinear term (denoted as P), and a bias term (denoted as B). The higher-order model may be specified as:

    predC(i,j) = Σk ak · Lk(i,j) + γ·P + δ·B                                 (11)

In equation (11), recL′(i,j) is the down-sampled reconstructed luma sample at position (i,j), Lk(i,j) is a neighboring sample around recL′(i,j), and ak, γ, and δ are the model parameters. The higher-order model equation may be used to derive model parameters between color components, or between samples of the current frame/picture and reference samples of a reference frame/picture.

IV. Constraints on Model Coefficients

In some embodiments, after the matrix equation solving process, the final derived filter/model coefficients are used to generate the CCCM predictor. However, the derived filter coefficients may be unreasonable; for example, the model may over-fit the reference samples, or the filter coefficients may be too large. Some embodiments of the present disclosure provide methods for constraining the derived model coefficients before they are used to generate the CCCM predictor.

FIG. 5 conceptually illustrates a data path 500 of a video coder that derives and constrains the coefficients of a model (e.g., for CCCM). The constrained model coefficients are used to generate a predictor for the current block. As illustrated, the data path 500 begins at a matrix preparation module 510, which generates an autocorrelation matrix and a cross-correlation vector 515. The autocorrelation matrix is prepared based on corresponding input component samples 505 (X samples). The cross-correlation vector is prepared based on corresponding input and output component samples 505 (X samples and Y samples). The corresponding input and output component samples may be corresponding reconstructed luma and chroma samples of a reference template region adjacent to the current block.

該生成的自相關矩陣和交叉相關向量515被提供給矩陣方程求解器模組530,以生成優化係數集535。係數約束模組540對優化係數535進行約束,以產生受約束的最終係數集545。該受約束的最終係數545最終被用作分量預測模型550(例如,CCLM或CCCM),以基於參考分量樣本560(例如,當前塊的亮度)生成預測分量樣本565(例如,當前塊的色度)。該生成的預測分量樣本565可用作CCCM預測器。The generated autocorrelation matrix and cross-correlation vector 515 are provided to a matrix equation solver module 530 to generate an optimized coefficient set 535. The coefficient constraint module 540 constrains the optimized coefficient 535 to generate a constrained final coefficient set 545. The constrained final coefficient 545 is ultimately used as a component prediction model 550 (e.g., CCLM or CCCM) to generate a predicted component sample 565 (e.g., chrominance of the current block) based on a reference component sample 560 (e.g., luminance of the current block). The generated predicted component sample 565 can be used as a CCCM predictor.

可以在優化係數535上應用不同類型的約束,以生成該受約束的最終係數545。例如,在一些實施例中,在生成受約束的最終係數545之前,使用預定義的剪切閾值以剪切該優化係數535。在一些實施例中,為該優化係數535預先定義多個剪切閾值,並可發出信號指示所選閾值的語法。在一些實施例中,預先定義該優化係數535的多個剪切閾值,所選閾值可以明確地從鄰近的重建樣本或邊資訊中推導得出。在一些實施例中,不同係數的剪切閾值可以全部不同或部分不同。Different types of constraints may be applied to the optimized coefficients 535 to generate the constrained final coefficients 545. For example, in some embodiments, a predefined clipping threshold is used to clip the optimized coefficients 535 before generating the constrained final coefficients 545. In some embodiments, multiple clipping thresholds are predefined for the optimized coefficients 535, and a syntax indicating the selected threshold may be signaled. In some embodiments, multiple clipping thresholds are predefined for the optimized coefficients 535, and the selected threshold may be explicitly derived from neighboring reconstructed samples or side information. In some embodiments, the clipping thresholds for different coefficients may be all different or partially different.
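不同係數採用不同剪切閾值的做法可草擬如下(閾值數值僅為假設的示例): Clipping each coefficient with its own threshold can be sketched as follows (the threshold values are hypothetical examples):

```python
def clip_coeffs(coeffs, thresholds):
    """Clip each derived coefficient to its own symmetric range
    [-t, t]; the thresholds may differ for all or some coefficients."""
    return [max(-t, min(t, c)) for c, t in zip(coeffs, thresholds)]
```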

在一些實施例中,分量預測模型550的係數用定點表示,其整數部分的位元寬被限制為一定值。在本示例中,在係數約束模組540約束之前,優化係數535的定點格式為整數部分48位元,小數部分16位元。經過係數約束模組540的約束後,在第5圖所示的示例中,約束係數545的定點格式為整數部分36位元,小數部分16位元。在其他示例中,約束係數545的整數部分可以有不同的位元寬。在一些實施例中,每個係數的整數和/或小數部分的位元寬可以完全不同或部分不同。例如,約束係數可以用定點格式表示,整數部分36位元,小數部分14位元。In some embodiments, the coefficients of the component prediction model 550 are represented in fixed-point format, and the bit width of the integer part is limited to a certain value. In this example, before the coefficient constraint module 540 constrains, the fixed-point format of the optimization coefficient 535 is 48 bits for the integer part and 16 bits for the decimal part. After the constraint of the coefficient constraint module 540, in the example shown in Figure 5, the fixed-point format of the constraint coefficient 545 is 36 bits for the integer part and 16 bits for the decimal part. In other examples, the integer part of the constraint coefficient 545 can have different bit widths. In some embodiments, the bit width of the integer and/or decimal part of each coefficient can be completely different or partially different. For example, the constraint coefficient can be represented in a fixed-point format, with 36 bits for the integer part and 14 bits for the decimal part.
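將定點係數的整數部分位元寬限制為一定值,可草擬如下(假設為二補數表示,且符號位元計入整數部分;此約定為此處的假設): Limiting the bit width of the integer part of a fixed-point coefficient can be sketched as follows (assuming two's-complement representation with the sign bit counted in the integer part; this convention is an assumption here):

```python
def clamp_bit_width(coeff_fp, int_bits, frac_bits):
    """Clamp a fixed-point coefficient (already scaled by 2**frac_bits)
    so it fits a signed int_bits.frac_bits format, e.g. turning a
    48.16 value into 36.16 with int_bits=36, frac_bits=16."""
    limit = 1 << (int_bits + frac_bits - 1)
    return max(-limit, min(limit - 1, coeff_fp))
```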

在一些實施例中,視訊編解碼器可對超出範圍的係數進行剪切操作。在一些實施例中,如果係數超出範圍,則可推斷不啟用CCCM或CCLM模式。In some embodiments, the video codec may perform a clipping operation on the coefficients that are out of range. In some embodiments, if the coefficients are out of range, it may be inferred that the CCCM or CCLM mode is not enabled.

在一些實施例中,在生成最終預測器565之前,優化係數535被剪切到預定義範圍內。例如,就浮點精度而言,範圍可以是[-1,1)、[-2,2)、[-4,4)、或[-8,8)。如果使用定點精度表示模型係數,且小數部分的位元數為N位,則該預定義範圍將按(1<<N)放大。例如,如果浮點精度的支援範圍是[-8,8),而小數部分的位元數是5位,則支援範圍將從浮點精度的[-8,8)變為具有5位元小數部分的定點精度的[-8*32,8*32)。In some embodiments, the optimized coefficients 535 are clipped to a predefined range before the final predictor 565 is generated. For example, in terms of floating-point precision, the range can be [-1, 1), [-2, 2), [-4, 4), or [-8, 8). If the model coefficients are represented in fixed-point precision and the fractional part has N bits, the predefined range is scaled up by (1<<N). For example, if the supported range in floating-point precision is [-8, 8) and the fractional part has 5 bits, the supported range changes from [-8, 8) in floating-point precision to [-8*32, 8*32) in fixed-point precision with a 5-bit fractional part.
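上述按(1<<N)放大預定義範圍的做法可草擬如下: The scaling of the predefined range by (1<<N) described above can be sketched as follows:

```python
def scale_range_to_fixed(lo, hi, frac_bits):
    """Scale a floating-point coefficient range by 1 << frac_bits to
    get the equivalent fixed-point range, e.g. [-8, 8) with a 5-bit
    fractional part becomes [-256, 256)."""
    scale = 1 << frac_bits
    return lo * scale, hi * scale

def clip_to_range(coeff_fp, lo, hi, frac_bits):
    """Clip a fixed-point coefficient into the scaled half-open range."""
    lo_fp, hi_fp = scale_range_to_fixed(lo, hi, frac_bits)
    return max(lo_fp, min(hi_fp - 1, coeff_fp))
```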

在一些實施例中,不同模型係數的預定義範圍可能不同。例如,預定義範圍取決於模型係數的空間位置。在一些實施例中,預定義範圍取決於相應輸入的順序。在一些實施例中,所有係數只使用一個預定義範圍。在一些實施例中,當一個推導出的係數超出預定義範圍時,通過將該係數剪切到預定義範圍內以推導出最終係數545。在一些實施例中,當一個推導出係數超出預定範圍時,最終係數545設置為零。在一些實施例中,當一個推導出係數超出預定範圍時,CCCM模式的係數被設置為等於標識濾波器(即CCCM模式不應用濾波)。In some embodiments, the predefined ranges for different model coefficients may differ. For example, the predefined range depends on the spatial position of the model coefficient. In some embodiments, the predefined range depends on the order of the corresponding inputs. In some embodiments, a single predefined range is used for all coefficients. In some embodiments, when a derived coefficient exceeds the predefined range, the final coefficient 545 is derived by clipping that coefficient into the predefined range. In some embodiments, when a derived coefficient exceeds the predefined range, the final coefficient 545 is set to zero. In some embodiments, when a derived coefficient exceeds the predefined range, the coefficients of the CCCM mode are set equal to the identity filter (i.e., the CCCM mode applies no filtering).
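上述三種處理超出範圍係數的方式(剪切、置零、或改用標識濾波器)可草擬如下: The three ways of handling an out-of-range coefficient described above (clipping, zeroing, or falling back to the identity filter) can be sketched as follows:

```python
def apply_range_policy(coeffs, lo, hi, policy, center_idx=0):
    """Handle derived coefficients outside the range [lo, hi].

    'clip'    : clip each offending coefficient into [lo, hi]
    'zero'    : set each offending coefficient to zero
    'identity': if any coefficient is out of range, replace the whole
                filter with the identity filter (1 at center_idx,
                0 elsewhere), i.e. apply no filtering
    """
    offending = [c < lo or c > hi for c in coeffs]
    if policy == 'identity':
        if any(offending):
            return [1.0 if i == center_idx else 0.0
                    for i in range(len(coeffs))]
        return list(coeffs)
    if policy == 'zero':
        return [0.0 if bad else c for c, bad in zip(coeffs, offending)]
    return [min(max(c, lo), hi) for c in coeffs]   # 'clip'
```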

上述提出的任何方法都可以在編碼器和/或解碼器中實現。例如,所提出的任何方法可以在編碼器的幀間/幀內預測(inter/intra prediction)模組,和/或解碼器的幀間/幀內預測模組中實現。或者,所提出的任何方法可以實現為與編碼器的幀間/幀內預測模組和/或解碼器的幀間/幀內預測模組耦合的電路,以便提供幀間/幀內預測模組所需的資訊。 V. 視訊編碼器實例 Any of the methods proposed above can be implemented in an encoder and/or a decoder. For example, any of the proposed methods can be implemented in an inter/intra prediction module of an encoder and/or an inter/intra prediction module of a decoder. Alternatively, any of the proposed methods can be implemented as a circuit coupled to the inter/intra prediction module of an encoder and/or the inter/intra prediction module of a decoder, so as to provide the information required by the inter/intra prediction module. V. Video Encoder Example

第6圖示出了可使用分量預測模型對圖元塊進行編碼的示例性視訊編碼器600。如圖所示,視訊編碼器600接收來自視訊源605的輸入視訊訊號,並將該訊號編碼為位元流695。視訊編碼器600具有複數個元件或模組用於對來自視訊源605的訊號進行編碼,至少包括選自變換模組610、量化模組611、反量化模組614、反變換模組615、幀內預測估計模組620,幀內預測模組625,運動補償模組630,運動估計模組635,環內濾波器645,重建圖片緩衝器650,MV緩衝器665,MV預測模組675,以及熵編碼器690中的一些元件。運動補償模組630和運動估計模組635是幀間預測模組640的一部分。FIG6 shows an exemplary video encoder 600 that can encode a picture element block using a component prediction model. As shown, the video encoder 600 receives an input video signal from a video source 605 and encodes the signal into a bit stream 695. The video encoder 600 has a plurality of components or modules for encoding a signal from a video source 605, including at least a transform module 610, a quantization module 611, an inverse quantization module 614, an inverse transform module 615, an intra-frame prediction estimation module 620, an intra-frame prediction module 625, a motion compensation module 630, a motion estimation module 635, an intra-loop filter 645, a reconstructed picture buffer 650, an MV buffer 665, an MV prediction module 675, and some components in an entropy encoder 690. The motion compensation module 630 and the motion estimation module 635 are part of the inter-frame prediction module 640.

在一些實施例中,模組610-690是由計算裝置或電子裝置的一個或複數個處理單元(例如,處理器)執行的軟體指令的模組。在一些實施例中,模組610-690是由電子裝置的一個或複數個積體電路(IC)實現的硬體電路的模組。雖然模組610-690被圖示為是獨立的模組,但一些模組可以被組合成一個單一的模組。In some embodiments, modules 610-690 are modules of software instructions executed by one or more processing units (e.g., processors) of a computing device or electronic device. In some embodiments, modules 610-690 are modules of hardware circuits implemented by one or more integrated circuits (ICs) of an electronic device. Although modules 610-690 are illustrated as separate modules, some modules may be combined into a single module.

視訊源605提供了原始的視訊訊號,呈現了每個視訊幀的圖元資料,沒有壓縮。減法器608計算視訊源605的原始視訊圖元資料與來自運動補償模組630或幀內預測模組625的預測圖元資料613之間的差異,作為預測殘差609。變換模組610將差值(或殘餘圖元資料或殘餘訊號)轉換為變換係數616(例如,通過執行離散餘弦變換,或DCT)。量化模組611將變換係數616量化為量化資料(或量化係數)612,由熵編碼器690將其編碼到位元流695。The video source 605 provides a raw video signal, presenting pixel data for each video frame, without compression. The subtractor 608 calculates the difference between the raw video pixel data of the video source 605 and the predicted pixel data 613 from the motion compensation module 630 or the intra-frame prediction module 625 as a prediction residue 609. The transform module 610 converts the difference (or residual pixel data or residual signal) into a transform coefficient 616 (e.g., by performing a discrete cosine transform, or DCT). The quantization module 611 quantizes the transform coefficient 616 into quantized data (or quantized coefficient) 612, which is encoded by the entropy encoder 690 into a bit stream 695.

反量化模組614對量化資料(或量化係數)612進行去量化,以獲得變換係數616,反變換模組615對變換係數616進行反變換,以產生重建殘差619。重建殘差619與預測圖元資料613相加,產生重建圖元資料617。在一些實施例中,重建圖元資料617被暫時存儲在行緩衝器中(未圖示),用於圖內預測和空間MV預測。重建圖元被環內濾波器645過濾並存儲在重建圖片緩衝器650中。在一些實施例中,重建圖片緩衝器650是視訊編碼器600的外部存儲。在一些實施例中,重建圖片緩衝器650是視訊編碼器600的內部存儲。The inverse quantization module 614 dequantizes the quantized data (or quantized coefficient) 612 to obtain a transform coefficient 616, and the inverse transform module 615 inversely transforms the transform coefficient 616 to generate a reconstruction residue 619. The reconstruction residue 619 is added to the predicted pixel data 613 to generate a reconstructed pixel data 617. In some embodiments, the reconstructed pixel data 617 is temporarily stored in a row buffer (not shown) for intra-picture prediction and spatial MV prediction. The reconstructed pixel is filtered by an intra-loop filter 645 and stored in a reconstructed picture buffer 650. In some embodiments, the reconstructed picture buffer 650 is an external storage of the video encoder 600. In some embodiments, the reconstructed picture buffer 650 is internal storage of the video encoder 600.

幀內預測估計模組620基於重建圖元資料617執行幀內預測,以產生幀內預測資料。幀內預測資料被提供給熵編碼器690以被編碼到位元流695。幀內預測資料也被幀內預測模組625用來產生預測圖元資料613。The intra-prediction estimation module 620 performs intra prediction based on the reconstructed pixel data 617 to generate intra-prediction data. The intra-prediction data is provided to the entropy encoder 690 to be encoded into the bitstream 695. The intra-prediction data is also used by the intra-prediction module 625 to generate the predicted pixel data 613.

運動估計模組635通過產生MV以參考存儲在重建圖片緩衝器650中的先前解碼幀的圖元資料來執行幀間預測。這些MV被提供給運動補償模組630以產生預測圖元資料。The motion estimation module 635 performs inter-frame prediction by generating MVs to refer to the pixel data of the previous decoded frame stored in the reconstructed picture buffer 650. These MVs are provided to the motion compensation module 630 to generate predicted pixel data.

視訊編碼器600使用MV預測來產生預測MV,而不是在位元流中編碼完整的實際MV,用於運動補償的MV和預測MV之間的差異被編碼為殘餘運動資料並存儲在位元流695中。The video encoder 600 uses MV prediction to generate a predicted MV, instead of encoding the complete actual MV in the bitstream, and the difference between the MV used for motion compensation and the predicted MV is encoded as residual motion data and stored in the bitstream 695.

MV預測模組675基於為編碼先前視訊幀而生成的參考MV,即用於執行運動補償的運動補償MV,生成預測MV。MV預測模組675從MV緩衝器665中檢索先前視訊幀的參考MV。視訊編碼器600將為當前視訊幀生成的MV存儲在MV緩衝器665中,作為用於生成預測MV的參考MV。The MV prediction module 675 generates a predicted MV based on a reference MV generated for encoding a previous video frame, i.e., a motion compensation MV for performing motion compensation. The MV prediction module 675 retrieves the reference MV of the previous video frame from the MV buffer 665. The video encoder 600 stores the MV generated for the current video frame in the MV buffer 665 as a reference MV for generating the predicted MV.

MV預測模組675使用參考MV來創建預測MV。預測MV可以通過空間MV預測或時間MV預測來計算。預測MV和當前幀的運動補償MV(MC MV)之間的差異(殘餘運動資料)由熵編碼器690編碼到位元流695。The MV prediction module 675 uses the reference MV to create a predicted MV. The predicted MV can be calculated by spatial MV prediction or temporal MV prediction. The difference (residual motion data) between the predicted MV and the motion compensation MV (MC MV) of the current frame is encoded by the entropy encoder 690 into the bit stream 695.

熵編碼器690通過使用熵編碼技術,例如上下文自我調整二進位算術編碼(CABAC)或哈夫曼(Huffman)編碼,將各種參數和資料編碼到位元流695中。熵編碼器690將各種頭元素、標誌、以及量化的變換係數612和殘餘運動資料作為語法元素編碼到位元流695中。位元流695又被存儲在存放裝置中,或通過通信介質如網路傳輸給解碼器。The entropy encoder 690 encodes various parameters and data into a bit stream 695 by using an entropy coding technique, such as context-adjusting binary arithmetic coding (CABAC) or Huffman coding. The entropy encoder 690 encodes various header elements, flags, and quantized transform coefficients 612 and residual motion data as syntax elements into the bit stream 695. The bit stream 695 is then stored in a storage device or transmitted to a decoder via a communication medium such as a network.

環內濾波器645對重建圖元資料617進行過濾或平滑操作,以減少編解碼的偽影,特別是在塊的邊界。在一些實施例中,由環內濾波器645執行的濾波或平滑操作包括去塊濾波器(DBF)、樣本自適應偏移(SAO)、和/或自適應環路濾波器(ALF)。The in-loop filter 645 performs filtering or smoothing operations on the reconstructed pixel data 617 to reduce encoding and decoding artifacts, especially at block boundaries. In some embodiments, the filtering or smoothing operations performed by the in-loop filter 645 include a deblocking filter (DBF), a sample adaptive offset (SAO), and/or an adaptive loop filter (ALF).

第7圖示出了通過對分量預測模型的係數進行約束來推導和使用該模型的視訊編碼器600的部分。如圖式所示,初始預測器生成模組720向分量預測模型710提供初始預測器715。初始預測器715可包括用於預測當前塊的分量樣本的參考塊的分量樣本(亮度或色度),或用於交叉分量預測的當前塊的分量樣本。分量預測模型710應用於初始預測器715,以生成精細預測器725。精細預測器725的樣本可用作該預測圖元資料613。初始預測器715可以是當前塊的重建亮度樣本,而精細預測器725可以是當前塊的預測色度樣本。FIG. 7 shows a portion of a video encoder 600 that derives and uses a component prediction model by constraining its coefficients. As shown, an initial predictor generation module 720 provides an initial predictor 715 to a component prediction model 710. The initial predictor 715 may include component samples (luminance or chrominance) of a reference block for predicting component samples of a current block, or component samples of the current block for cross-component prediction. The component prediction model 710 is applied to the initial predictor 715 to generate a fine predictor 725. Samples of the fine predictor 725 may be used as the prediction primitive data 613. The initial predictor 715 may be a reconstructed luminance sample of the current block, while the fine predictor 725 may be a predicted chrominance sample of the current block.

當當前塊通過幀間預測編解碼時,運動估計模組635提供MV,運動補償模組630使用該MV將參考圖片中的參考塊識別為初始預測器715。當當前塊通過幀內預測編解碼時,幀內預測估計模組620提供幀內模式,由幀內預測模組625用來生成當前塊的幀內預測,作為初始預測器715。然後,初始預測器715的分量樣本可用作分量預測模型710的輸入。When the current block is coded by inter-frame prediction, the motion estimation module 635 provides an MV, which is used by the motion compensation module 630 to identify the reference block in the reference picture as the initial predictor 715. When the current block is coded by intra-frame prediction, the intra-frame prediction estimation module 620 provides an intra-frame mode, which is used by the intra-frame prediction module 625 to generate an intra-frame prediction of the current block as the initial predictor 715. Then, the component samples of the initial predictor 715 can be used as input to the component prediction model 710.

為了得出分量預測模型710,回歸資料選擇模組730會從重建圖片緩衝區650中檢索所需的分量樣本作為回歸資料。回歸資料可以取自當前圖片中當前塊內和/或周圍的區域,以及參考圖片中參考塊內和/或周圍的區域。檢索到的回歸資料(即所需的分量樣本)包括用於確定分量預測模型710的係數或參數的相應輸入(X)和輸出(Y)分量樣本。In order to obtain the component prediction model 710, the regression data selection module 730 retrieves the required component samples as regression data from the reconstructed image buffer 650. The regression data can be taken from the area within and/or around the current block in the current image, and the area within and/or around the reference block in the reference image. The retrieved regression data (i.e., the required component samples) include corresponding input (X) and output (Y) component samples for determining the coefficients or parameters of the component prediction model 710.

模型構造器705使用回歸資料(X和Y),利用諸如消除法、反覆運算法、或分解法等技術推導出分量預測模型710的係數。在一些實施例中,模型構造器705在提供用作分量預測模型710的受約束的係數之前,會對推導出的係數應用某些約束。模型構造器705可以通過在閾值處剪切或將係數限制在預定義範圍內來約束係數。在一些實施例中,模型構造器705可以對不同的係數應用不同的剪切閾值。在一些實施例中,係數由定點表示,其小數部分具有N位元,預定義範圍按1<<N放大。對分量預測模型的係數進行約束已在上文第三部分(III)中描述。 The model builder 705 uses the regression data (X and Y) to derive the coefficients of the component prediction model 710 using techniques such as elimination, iterative methods, or decomposition. In some embodiments, the model builder 705 applies certain constraints to the derived coefficients before providing the constrained coefficients for use as the component prediction model 710. The model builder 705 can constrain the coefficients by clipping them at a threshold or limiting them to a predefined range. In some embodiments, the model builder 705 can apply different clipping thresholds to different coefficients. In some embodiments, the coefficients are represented in fixed point with an N-bit fractional part, and the predefined range is scaled up by 1<<N. Constraining the coefficients of the component prediction model is described in Section III above.

第8圖概念性地說明了對分量預測模型的係數進行約束的過程800。在一些實施例中,實現編碼器600的計算設備的一個或多個處理單元(例如處理器)通過執行存儲在電腦可讀介質中的指令來執行流程800。在一些實施例中,實現編碼器600的電子設備執行過程800。FIG. 8 conceptually illustrates a process 800 for constraining coefficients of a component prediction model. In some embodiments, one or more processing units (e.g., processors) of a computing device implementing the encoder 600 perform the process 800 by executing instructions stored in a computer-readable medium. In some embodiments, an electronic device implementing the encoder 600 performs the process 800.

編碼器(在塊810處)接收將被編碼的資料,作為視訊的當前圖片的當前塊。The encoder (at block 810) receives data to be encoded as a current block of a current picture of a video.

編碼器(在塊820處)基於相應的輸入和輸出分量樣本推導出用於分量預測模型的係數集。在一些實施例中,分量預測模型是基於當前塊的重建亮度樣本生成預測色度樣本的交叉分量模型,該相應的輸入和輸出分量樣本是與當前塊相鄰的範本區域的相應亮度和色度樣本。在一些實施例中,該分量預測模型是卷積模型,其基於該相應的輸入和輸出分量樣本之間的自相關矩陣,通過分解和反向置換推導出該係數集。The encoder (at block 820) derives a set of coefficients for a component prediction model based on corresponding input and output component samples. In some embodiments, the component prediction model is a cross-component model that generates predicted chrominance samples based on reconstructed luminance samples of the current block, and the corresponding input and output component samples are corresponding luminance and chrominance samples of a template region adjacent to the current block. In some embodiments, the component prediction model is a convolution model that derives the set of coefficients by decomposition and back substitution based on an autocorrelation matrix between the corresponding input and output component samples.
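透過分解與反向置換求解法線方程的一種方式可草擬如下(此處以Cholesky分解為例,僅為示意;實際實現可能採用LDLᵀ等其他分解): One decomposition-with-back-substitution way of solving the normal equations can be sketched as follows (Cholesky decomposition is used here for illustration only; an actual implementation may use another decomposition such as LDLᵀ):

```python
import math

def solve_by_decomposition(A, b):
    """Solve A c = b for a symmetric positive-definite autocorrelation
    matrix A: factor A = L L^T (Cholesky), then do forward substitution
    (L y = b) followed by back substitution (L^T c = y)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    y = [0.0] * n                      # forward substitution
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    c = [0.0] * n                      # back substitution
    for i in range(n - 1, -1, -1):
        c[i] = (y[i] - sum(L[k][i] * c[k]
                           for k in range(i + 1, n))) / L[i][i]
    return c
```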

編碼器(在塊830處)根據約束條件集對推導出的係數集進行約束。在一些實施例中,編碼器通過在剪切閾值處剪切係數,或在不同的剪切閾值處剪切不同的係數,或將係數限制在預定義範圍內來約束該推導出的係數集。在一些實施例中,該推導出的係數集和該受約束的係數集在編碼器中用浮點表示。在一些實施例中,係數由定點表示,其小數部分有N位元,預定義範圍的大小基於1<<N放大。在一些實施例中,當推導出的係數超出預定義範圍時,係數集被設置為等於標識濾波器,或對超出範圍的係數應用剪切操作,或不使用該推導出的係數集對當前塊進行編碼或解碼(禁用CCCM模式)。The encoder (at block 830) constrains the derived set of coefficients according to a set of constraints. In some embodiments, the encoder constrains the derived set of coefficients by clipping the coefficients at a clipping threshold, or clipping different coefficients at different clipping thresholds, or limiting the coefficients to a predefined range. In some embodiments, the derived set of coefficients and the constrained set of coefficients are represented in floating point in the encoder. In some embodiments, the coefficients are represented in fixed point with an N-bit fractional part, and the size of the predefined range is scaled up based on 1<<N. In some embodiments, when a derived coefficient is outside the predefined range, the coefficient set is set equal to the identity filter, or a clipping operation is applied to the out-of-range coefficients, or the derived coefficient set is not used to encode or decode the current block (the CCCM mode is disabled).

編碼器(在塊840處)應用該受約束的係數集作為分量預測模型,以生成當前塊的預測器。該預測器可包括生成的預測色度樣本。The encoder applies (at block 840) the constrained set of coefficients as a component prediction model to generate a predictor for the current block. The predictor may include the generated predicted chrominance samples.

編碼器(在塊850處)通過使用生成的預測器來生成預測殘差,從而對當前區塊進行編碼。 VI. 視訊解碼器實例 The encoder (at block 850) encodes the current block by using the generated predictor to generate a prediction residual. VI. Video Decoder Example

在一些實施例中,編碼器可以在位元流中發出訊號(或產生)一個或複數個語法元素,從而解碼器可以從位元流中解析所述一個或複數個語法元素。In some embodiments, an encoder may signal (or generate) one or more syntax elements in a bitstream, such that a decoder may parse the one or more syntax elements from the bitstream.

第9圖示出了可使用分量預測模型對圖元塊進行解碼的視訊解碼器900的示例。如圖式所示,視訊解碼器900是圖像解碼或視訊解碼電路,它接收位元流995,並將位元流的內容解碼成視訊幀的圖元資料,以供顯示。視訊解碼器900具有複數個用於解碼位元流995的元件或模組,包括選自反量化模組911、反變換模組910、幀內預測模組925、運動補償模組930、環內濾波器945、解碼圖片緩衝器950、MV緩衝器965、MV預測模組975、和解析器990的一些元件。運動補償模組930是幀間預測模組940的一部分。FIG. 9 shows an example of a video decoder 900 that can use a component prediction model to decode a pixel block. As shown in the figure, the video decoder 900 is an image decoding or video decoding circuit that receives a bitstream 995 and decodes the content of the bitstream into pixel data of video frames for display. The video decoder 900 has a plurality of components or modules for decoding the bitstream 995, including some components selected from an inverse quantization module 911, an inverse transform module 910, an intra prediction module 925, a motion compensation module 930, an in-loop filter 945, a decoded picture buffer 950, an MV buffer 965, an MV prediction module 975, and a parser 990. The motion compensation module 930 is part of the inter-frame prediction module 940.

在一些實施例中,模組910-990是由計算裝置的一個或多個處理單元(例如,處理器)執行的軟體指令的模組。在一些實施例中,模組910-990是由電子裝置的一個或多個IC實現的硬體電路的模組。雖然模組910-990被說明為是獨立的模組,但其中一些模組可以被組合成一個單一的模組。In some embodiments, modules 910-990 are modules of software instructions executed by one or more processing units (e.g., processors) of a computing device. In some embodiments, modules 910-990 are modules of hardware circuits implemented by one or more ICs of an electronic device. Although modules 910-990 are illustrated as separate modules, some of these modules may be combined into a single module.

解析器990(或熵解碼器)接收位元流995,並根據視訊編解碼或圖像編解碼標準定義的語法執行初始解析。被解析的語法元素包括各種頭元素、標誌、以及量化資料(或量化係數)912。解析器990通過使用熵編解碼技術,如上下文自我調整二進位算術編解碼(CABAC)或哈夫曼編碼,解析出各種語法元素。The parser 990 (or entropy decoder) receives the bit stream 995 and performs initial parsing according to the syntax defined by the video codec or image codec standard. The parsed syntax elements include various header elements, flags, and quantization data (or quantization coefficients) 912. The parser 990 parses out various syntax elements by using entropy coding and decoding techniques, such as context-adjusting binary arithmetic coding (CABAC) or Huffman coding.

反量化模組911對量化資料(或量化係數)912進行去量化,以獲得變換係數,反變換模組910對變換係數916進行反變換,以產生重建殘餘訊號919。重建殘餘訊號919與來自幀內預測模組925或運動補償模組930的預測圖元資料913相加,產生解碼圖元資料917。解碼後的圖元資料由環內濾波器945過濾並存儲在解碼圖片緩衝器950中。在一些實施例中,解碼圖片緩衝器950是視訊解碼器900的外部存儲。在一些實施例中,解碼圖片緩衝器950是視訊解碼器900的內部存儲。The inverse quantization module 911 dequantizes the quantized data (or quantized coefficients) 912 to obtain the transform coefficients, and the inverse transform module 910 inversely transforms the transform coefficients 916 to generate a reconstructed residual signal 919. The reconstructed residual signal 919 is added to the predicted pixel data 913 from the intra-frame prediction module 925 or the motion compensation module 930 to generate decoded pixel data 917. The decoded pixel data is filtered by the in-loop filter 945 and stored in the decoded picture buffer 950. In some embodiments, the decoded picture buffer 950 is an external storage of the video decoder 900. In some embodiments, the decoded picture buffer 950 is internal storage of the video decoder 900.

幀內預測模組925從位元流995接收幀內預測資料,並據此從存儲在解碼圖片緩衝器950中的解碼圖元資料917產生預測圖元資料913。在一些實施例中,解碼圖元資料917也被存儲在行緩衝器中(未圖示),用於圖內預測和空間MV預測。The intra prediction module 925 receives intra prediction data from the bitstream 995 and generates prediction pixel data 913 from the decoded pixel data 917 stored in the decoded picture buffer 950. In some embodiments, the decoded pixel data 917 is also stored in a row buffer (not shown) for intra prediction and spatial MV prediction.

在一些實施例中,解碼圖片緩衝器950的內容被用於顯示。顯示裝置955直接檢索解碼圖片緩衝器950的內容用於顯示,或者將解碼圖片緩衝器的內容檢索到顯示緩衝器。在一些實施例中,顯示裝置通過圖元傳輸從解碼圖片緩衝器950接收圖元值。In some embodiments, the contents of the decoded picture buffer 950 are used for display. The display device 955 directly retrieves the contents of the decoded picture buffer 950 for display, or retrieves the contents of the decoded picture buffer to the display buffer. In some embodiments, the display device receives the pixel value from the decoded picture buffer 950 through pixel transmission.

運動補償模組930根據運動補償MV(MC MV)從存儲在解碼圖片緩衝器950中的解碼圖元資料917產生預測圖元資料913。這些運動補償MV是通過將從位元流995收到的殘餘運動資料與從MV預測模組975收到的預測MV相加而解碼的。The motion compensation module 930 generates predicted pixel data 913 from the decoded pixel data 917 stored in the decoded picture buffer 950 according to the motion compensation MV (MC MV). These motion compensation MVs are decoded by adding the residual motion data received from the bitstream 995 to the predicted MV received from the MV prediction module 975.

MV預測模組975基於為解碼先前視訊幀而生成的參考MV,例如,用於執行運動補償的運動補償MV,生成預測MV。MV預測模組975從MV緩衝器965檢索先前視訊幀的參考MV。視訊解碼器900將為解碼當前視訊幀而生成的運動補償MV存儲在MV緩衝器965中,作為用於產生預測MV的參考MV。The MV prediction module 975 generates a predicted MV based on a reference MV generated for decoding a previous video frame, for example, a motion compensation MV for performing motion compensation. The MV prediction module 975 retrieves the reference MV of the previous video frame from the MV buffer 965. The video decoder 900 stores the motion compensation MV generated for decoding the current video frame in the MV buffer 965 as a reference MV for generating the predicted MV.

環內濾波器945對解碼的圖元資料917進行過濾或平滑操作,以減少編解碼的偽影,特別是在塊的邊界。在一些實施例中,由環內濾波器945執行的過濾或平滑操作包括去塊濾波器(DBF)、樣本自適應偏移(SAO)、和/或自適應環路濾波器(ALF)。The in-loop filter 945 performs filtering or smoothing operations on the decoded primitive data 917 to reduce encoding and decoding artifacts, particularly at block boundaries. In some embodiments, the filtering or smoothing operations performed by the in-loop filter 945 include a deblocking filter (DBF), a sample adaptive offset (SAO), and/or an adaptive loop filter (ALF).

第10圖示出了通過對分量預測模型的係數進行約束來推導和使用該模型的視訊解碼器900的部分。如圖式所示,初始預測器生成模組1020向分量預測模型1010提供初始預測器1015。初始預測器1015可包括用於預測當前塊的分量樣本的參考塊的分量樣本(亮度或色度),或用於交叉分量預測的當前塊的分量樣本。分量預測模型1010應用於初始預測器1015,以生成精細預測器1025。精細預測器1025的樣本可用作預測圖元資料913。初始預測器1015可以是當前塊的重建亮度樣本,而精細預測器1025可以是當前塊的預測色度樣本。FIG. 10 shows a portion of a video decoder 900 that derives and uses a component prediction model by constraining its coefficients. As shown, an initial predictor generation module 1020 provides an initial predictor 1015 to a component prediction model 1010. The initial predictor 1015 may include component samples (luminance or chrominance) of a reference block for predicting component samples of a current block, or component samples of the current block for cross-component prediction. The component prediction model 1010 is applied to the initial predictor 1015 to generate a refined predictor 1025. Samples of the refined predictor 1025 may be used as prediction primitives 913. The initial predictor 1015 may be a reconstructed luminance sample of the current block, while the refined predictor 1025 may be a predicted chrominance sample of the current block.

當當前塊通過幀間預測編解碼時,熵解碼器990提供MV,運動補償模組930使用該MV將參考圖片中的參考塊識別為初始預測器1015。當當前塊通過幀內預測編解碼時,熵解碼器990提供幀內模式,由幀內預測模組925用來生成當前塊的幀內預測,作為初始預測器1015。然後,初始預測器1015的分量樣本可用作分量預測模型1010的輸入。When the current block is coded by inter-frame prediction, the entropy decoder 990 provides an MV, which is used by the motion compensation module 930 to identify the reference block in the reference picture as the initial predictor 1015. When the current block is coded by intra-frame prediction, the entropy decoder 990 provides an intra-frame mode, which is used by the intra-frame prediction module 925 to generate an intra-frame prediction of the current block as the initial predictor 1015. The component samples of the initial predictor 1015 can then be used as input to the component prediction model 1010.

為了得出分量預測模型1010,回歸資料選擇模組1030會從解碼圖片緩衝區950中檢索所需的分量樣本作為回歸資料。回歸資料可以取自當前圖片中當前塊內和/或周圍的區域,以及參考圖片中參考塊內和/或周圍的區域。檢索到的回歸資料(即所需的分量樣本)包括用於確定分量預測模型1010的係數或參數的相應輸入(X)和輸出(Y)分量樣本。In order to obtain the component prediction model 1010, the regression data selection module 1030 retrieves the required component samples as regression data from the decoded picture buffer 950. The regression data can be taken from the area within and/or around the current block in the current picture, and the area within and/or around the reference block in the reference picture. The retrieved regression data (i.e., the required component samples) include corresponding input (X) and output (Y) component samples for determining the coefficients or parameters of the component prediction model 1010.

模型構造器1005使用回歸資料(X和Y),利用諸如消除法、反覆運算法、或分解法等技術推導出分量預測模型1010的係數。在一些實施例中,模型構造器1005在提供用作分量預測模型1010的受約束的係數之前,會對推導出的係數應用某些約束。模型構造器1005可以通過在閾值處剪切或將係數限制在預定義範圍內來約束係數。在一些實施例中,模型構造器1005可對不同係數應用不同的剪切閾值。在一些實施例中,係數由定點表示,其小數部分具有N位元,預定義範圍按1<<N放大。對分量預測模型的係數進行約束已在上文第三部分(III)中描述。 The model builder 1005 uses the regression data (X and Y) to derive the coefficients of the component prediction model 1010 using techniques such as elimination, iterative methods, or decomposition. In some embodiments, the model builder 1005 applies certain constraints to the derived coefficients before providing the constrained coefficients for use as the component prediction model 1010. The model builder 1005 can constrain the coefficients by clipping them at a threshold or limiting them to a predefined range. In some embodiments, the model builder 1005 can apply different clipping thresholds to different coefficients. In some embodiments, the coefficients are represented in fixed point with an N-bit fractional part, and the predefined range is scaled up by 1<<N. Constraining the coefficients of the component prediction model is described in Section III above.

第11圖概念性地說明了對分量預測模型的係數進行約束的過程1100。在一些實施例中,實現解碼器900的計算設備的一個或多個處理單元(例如處理器)通過執行存儲在電腦可讀介質中的指令來執行過程1100。在一些實施例中,實現解碼器900的電子設備執行過程1100。FIG. 11 conceptually illustrates a process 1100 for constraining coefficients of a component prediction model. In some embodiments, one or more processing units (e.g., processors) of a computing device implementing the decoder 900 perform the process 1100 by executing instructions stored in a computer-readable medium. In some embodiments, an electronic device implementing the decoder 900 performs the process 1100.

解碼器(在塊1110處)接收將被解碼的資料,作為視訊的當前圖片的當前塊。The decoder (at block 1110) receives data to be decoded as the current block of the current picture of the video.

解碼器(在塊1120處)基於相應的輸入和輸出分量樣本推導出用於分量預測模型的係數集。在一些實施例中,分量預測模型是基於當前塊的重建亮度樣本生成預測色度樣本的交叉分量模型,該相應的輸入和輸出分量樣本是與當前塊相鄰的範本區域的相應亮度和色度樣本。在一些實施例中,該分量預測模型是卷積模型,其基於該相應的輸入和輸出分量樣本之間的自相關矩陣,通過分解和反向置換推導出該係數集。The decoder (at block 1120) derives a set of coefficients for a component prediction model based on corresponding input and output component samples. In some embodiments, the component prediction model is a cross-component model that generates predicted chrominance samples based on reconstructed luminance samples of the current block, and the corresponding input and output component samples are corresponding luminance and chrominance samples of a template region adjacent to the current block. In some embodiments, the component prediction model is a convolution model that derives the set of coefficients by decomposition and back substitution based on an autocorrelation matrix between the corresponding input and output component samples.

The decoder constrains (at block 1130) the derived set of coefficients according to a set of constraints. In some embodiments, the video coder constrains the derived set of coefficients by clipping the coefficients at a clipping threshold, clipping different coefficients at different clipping thresholds, or limiting the coefficients to a predefined range. In some embodiments, the derived set of coefficients and the constrained set of coefficients are represented in the video coder in floating point. In some embodiments, the coefficients are represented in fixed point with an N-bit fractional part, and the size of the predefined range is scaled up by 1<<N. In some embodiments, when a derived coefficient falls outside the predefined range, the set of coefficients is set equal to an identity filter, or a clipping operation is applied to the out-of-range coefficient, or the derived set of coefficients is not used to encode or decode the current block (the CCCM mode is disabled).
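The three out-of-range fallbacks just listed (identity filter, clipping, disabling the mode) could be organized as in the sketch below. The policy names and the single-center-tap identity layout are assumptions for illustration, not the codec's actual behavior.

```python
# Hedged sketch of the out-of-range fallbacks: return (coefficients, usable).
# "identity" replaces the set with an identity filter (weight 1 on the centre
# tap, 0 elsewhere); "clip" clamps offending coefficients; "disable" reports
# that the derived set should not be used for this block (CCCM disabled).

def handle_out_of_range(coeffs, lo, hi, policy="clip", center=0):
    if all(lo <= c <= hi for c in coeffs):
        return coeffs, True                      # nothing out of range
    if policy == "identity":
        ident = [0.0] * len(coeffs)
        ident[center] = 1.0
        return ident, True
    if policy == "clip":
        return [min(max(c, lo), hi) for c in coeffs], True
    return coeffs, False                         # "disable": skip this model
```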

The decoder applies (at block 1140) the constrained set of coefficients as a component prediction model to generate a predictor for the current block. The predictor may include the generated predicted chroma samples.
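Applying the constrained coefficients amounts to running them as a filter over the reconstructed luma samples to produce predicted chroma samples. The 3-tap one-dimensional shape below is an assumption chosen for brevity; actual convolution-model filters use a larger two-dimensional tap pattern plus nonlinear and bias terms.

```python
# Illustrative application of constrained coefficients as a cross-component
# filter: each predicted chroma sample is a weighted sum of neighbouring
# reconstructed luma samples plus a bias term. 3-tap shape is assumed.

def predict_chroma(luma_row, coeffs, bias):
    preds = []
    for i in range(1, len(luma_row) - 1):
        taps = (luma_row[i - 1], luma_row[i], luma_row[i + 1])
        preds.append(sum(c * t for c, t in zip(coeffs, taps)) + bias)
    return preds
```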

The decoder reconstructs (at block 1150) the current block by using the generated predictor. The decoder may then provide the reconstructed current block for display as part of the reconstructed current picture.
VII. Electronic System Example

Many of the features and applications described above are implemented as software processes that are specified as a set of instructions recorded on a computer-readable storage medium (also referred to as a computer-readable medium). When these instructions are executed by one or more computing or processing units (e.g., one or more processors, processor cores, or other processing units), they cause the processing units to perform the actions indicated in the instructions. Examples of computer-readable media include, but are not limited to, CD-ROMs, flash memory drives, random access memory (RAM) chips, hard disks, erasable programmable read-only memory (EPROM), and electrically erasable programmable read-only memory (EEPROM). Computer-readable media do not include carrier waves and electronic signals transmitted wirelessly or over wired connections.

In this specification, the term "software" is meant to include firmware residing in read-only memory or applications stored in magnetic storage, which can be read into memory for processing by a processor. Also, in some embodiments, multiple software inventions can be implemented as sub-parts of a larger program while remaining distinct software inventions. In some embodiments, multiple software inventions can also be implemented as separate programs. Finally, any combination of separate programs that together implement the software inventions described here is within the scope of the present application. In some embodiments, a software program, when installed to operate on one or more electronic systems, defines one or more specific machine implementations that execute and perform the operations of the software program.

FIG. 12 conceptually illustrates an electronic system 1200 with which some embodiments of the present application are implemented. The electronic system 1200 may be a computer (e.g., a desktop computer, a personal computer, a tablet computer, etc.), a phone, a PDA, or any other kind of electronic device. Such an electronic system includes various types of computer-readable media and interfaces for various other types of computer-readable media. The electronic system 1200 includes a bus 1205, a processing unit 1210, a graphics processing unit (GPU) 1215, a system memory 1220, a network 1225, a read-only memory 1230, a permanent storage device 1235, input devices 1240, and output devices 1245.

The bus 1205 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 1200. For example, the bus 1205 communicatively connects the processing unit 1210 with the GPU 1215, the read-only memory 1230, the system memory 1220, and the permanent storage device 1235.

From these various memory units, the processing unit 1210 retrieves instructions to execute and data to process in order to perform the processes of the present application. The processing unit may be a single processor or a multi-core processor in different embodiments. Some instructions are passed to and executed by the GPU 1215. The GPU 1215 can offload various computations or complement the image processing provided by the processing unit 1210.

The read-only memory (ROM) 1230 stores static data and instructions that are used by the processing unit 1210 and other modules of the electronic system. The permanent storage device 1235, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 1200 is off. Some embodiments of the present application use a mass storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 1235.

Other embodiments use a removable storage device (such as a floppy disk, flash memory device, etc., and its corresponding drive) as the permanent storage device. Like the permanent storage device 1235, the system memory 1220 is a read-and-write memory device. However, unlike the storage device 1235, the system memory 1220 is a volatile read-and-write memory, such as random access memory. The system memory 1220 stores some of the instructions and data that the processor uses at runtime. In some embodiments, processes in accordance with the present application are stored in the system memory 1220, the permanent storage device 1235, and/or the read-only memory 1230. For example, the various memory units include instructions for processing multimedia clips in accordance with some embodiments. From these various memory units, the processing unit 1210 retrieves instructions to execute and data to process in order to perform the processes of some embodiments.

The bus 1205 also connects to the input and output devices 1240 and 1245. The input devices 1240 enable the user to communicate information and select commands to the electronic system. The input devices 1240 include alphanumeric keyboards and pointing devices (also called "cursor control devices"), cameras (e.g., webcams), microphones or similar devices for receiving voice commands, etc. The output devices 1245 display images generated by the electronic system or otherwise output data. The output devices 1245 include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD), as well as speakers or similar audio output devices. Some embodiments include devices, such as a touchscreen, that function as both input and output devices.

Finally, as shown in FIG. 12, the bus 1205 also couples the electronic system 1200 to a network 1225 through a network adapter (not shown). In this manner, the computer can be a part of a network of computers (such as a local area network ("LAN"), a wide area network ("WAN"), or an intranet), or a network of networks, such as the Internet. Any or all components of the electronic system 1200 may be used in conjunction with the present application.

Some embodiments include electronic components, such as microprocessors, storage, and memory, that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media). Some examples of such computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic and/or solid state hard drives, read-only and recordable Blu-Ray® discs, ultra-density optical discs, any other optical or magnetic media, and floppy disks. The computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.

While the above discussion primarily refers to microprocessors or multi-core processors that execute software, many of the above-described features and applications are performed by one or more integrated circuits, such as application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself. In addition, some embodiments execute software stored in programmable logic devices (PLDs), ROM, or RAM devices.

As used in this specification and any claims of this application, the terms "computer", "server", "processor", and "memory" all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of this specification, the term "display" or "displaying" means displaying on an electronic device. As used in this specification and any claims of this application, the terms "computer-readable medium", "computer-readable media", and "machine-readable medium" are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.

While the present application has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the present application can be embodied in other specific forms without departing from the spirit of the present application. In addition, a number of the figures (including FIG. 8 and FIG. 11) conceptually illustrate processes. The specific operations of these processes may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments. Furthermore, a process could be implemented using several sub-processes, or as part of a larger macro process. Thus, one of ordinary skill in the art would understand that the present application is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims.
Supplementary Description

The herein-described subject matter sometimes illustrates different components contained within, or connected with, different other components. It is to be understood that such depicted architectures are merely examples, and that in fact many other architectures can be implemented which achieve the same functionality. In a conceptual sense, any arrangement of components to achieve the same functionality is effectively "associated" such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality can be seen as "associated with" each other such that the desired functionality is achieved, irrespective of architectures or intermediate components. Likewise, any two components so associated can also be viewed as being "operably connected", or "operably coupled", to each other to achieve the desired functionality, and any two components capable of being so associated can also be viewed as being "operably couplable" to each other to achieve the desired functionality. Specific examples of operably couplable include, but are not limited to, physically mateable and/or physically interacting components, and/or wirelessly interactable and/or wirelessly interacting components, and/or logically interacting and/or logically interactable components.

Further, with respect to the use of substantially any plural and/or singular terms herein, one of ordinary skill in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application. The various singular/plural permutations may be expressly set forth herein for the sake of clarity.

Moreover, it will be understood by one of ordinary skill in the art that, in general, terms used herein, and especially in the appended claims, e.g., bodies of the appended claims, are generally intended as "open" terms. For example, the term "including" should be interpreted as "including but not limited to", the term "having" should be interpreted as "having at least", the term "includes" should be interpreted as "includes but is not limited to", etc. It will be further understood by one of ordinary skill in the art that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the introductory phrases "at least one" and "one or more" to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles "a" or "an" limits any particular claim containing such introduced claim recitation to implementations containing only one such recitation, even when the same claim includes the introductory phrases "one or more" or "at least one" and indefinite articles such as "a" or "an"; e.g., "a" and/or "an" should be interpreted to mean "at least one" or "one or more". The same holds true for the use of definite articles used to introduce claim recitations. In addition, even if a specific number of an introduced claim recitation is explicitly recited, one of ordinary skill in the art will recognize that such recitation should be interpreted to mean at least the recited number; e.g., the bare recitation of "two recitations", without other modifiers, means at least two recitations, or two or more recitations. Furthermore, in those instances where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense of the convention as understood by one of ordinary skill in the art. For example, "a system having at least one of A, B, and C" would include, but not be limited to, systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc. It will be further understood by one of ordinary skill in the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" will be understood to include the possibilities of "A" or "B" or "A and B".

From the foregoing, it will be appreciated that various embodiments of the present application have been described herein for purposes of illustration, and that various modifications may be made without departing from the scope and spirit of the present application. Accordingly, the various embodiments disclosed herein are not intended to be limiting, with the true scope and spirit being indicated by the following claims.

100, 400: current block
500: data path
505: corresponding input component samples
510: matrix preparation module
515: autocorrelation matrix and cross-correlation vector
530: matrix equation solver module
535: optimized coefficients
540: coefficient constraint module
545: constrained coefficients
550: component prediction model
560: reference component samples
565: predicted component samples
600: video encoder
605: video source
608: subtractor
609: prediction residual
610: transform module
611: quantization module
612, 912: quantized coefficients
613, 913: predicted pixel data
614, 911: inverse quantization module
615, 910: inverse transform module
616, 916: transform coefficients
617: reconstructed pixel data
619: reconstructed residual
620: intra-picture estimation module
625, 925: intra prediction module
630, 930: motion compensation module
635: motion estimation module
640: inter prediction module
645, 945: in-loop filter
650, 950: reconstructed picture buffer
665, 965: MV buffer
675, 975: MV prediction module
690: entropy encoder
695, 995: bitstream
705, 1005: model builder
710, 1010: component prediction model
715, 1015: initial predictor
720, 1020: initial predictor generation module
725, 1025: refined predictor
730, 1030: regression data selection module
735: constraints
800, 810, 820, 830, 840, 850: steps
900: video decoder
940: inter prediction module
990: parser (entropy decoder)
917: decoded pixel data
919: reconstructed residual signal
955: display device
1100, 1110, 1120, 1130, 1140, 1150: steps
1200: electronic system
1205: bus
1210: processing unit
1215: graphics processing unit (GPU)
1220: system memory
1225: network
1230: read-only memory
1235: permanent storage device
1240: input device
1245: output device

The accompanying drawings are included to provide a further understanding of the present application, and are incorporated in and constitute a part of the present application. The drawings illustrate implementations of the present application and, together with the description, serve to explain the principles of the present application. It is appreciable that the drawings are not necessarily drawn to scale, as some components may be shown out of proportion to their size in actual implementation in order to clearly illustrate the concepts of the present application.
FIG. 1 conceptually illustrates the chroma and luma samples used for deriving linear model parameters.
FIG. 2 shows an example of classifying neighboring samples into two groups.
FIG. 3 conceptually illustrates the spatial components of a convolutional filter.
FIG. 4 shows the reference area used to derive the filter coefficients of the convolution model of the current block.
FIG. 5 conceptually illustrates the data path of a video coder that derives and constrains the coefficients of a model.
FIG. 6 illustrates an example video encoder that may use a component prediction model to encode a block of pixels.
FIG. 7 illustrates portions of the video encoder that derive and use a component prediction model by constraining the coefficients of the model.
FIG. 8 conceptually illustrates a process for constraining the coefficients of a component prediction model.
FIG. 9 illustrates an example video decoder that may use a component prediction model to decode a block of pixels.
FIG. 10 illustrates portions of the video decoder that derive and use a component prediction model by constraining the coefficients of the model.
FIG. 11 conceptually illustrates a process for constraining the coefficients of a component prediction model.
FIG. 12 conceptually illustrates an electronic system with which some embodiments of the present application are implemented.

800, 810, 820, 830, 840, 850: steps

Claims (15)

1. A video coding-decoding method, comprising:
receiving data for a block of pixels to be encoded or decoded as a current block of a current picture of a video;
deriving a set of coefficients based on corresponding input and output component samples;
constraining the derived set of coefficients based on a set of constraints;
applying the constrained set of coefficients as a component prediction model to generate a predictor for the current block; and
encoding or decoding the current block by using the generated predictor.
2. The method of claim 1, wherein the component prediction model is a cross-component model that generates predicted chroma samples based on reconstructed luma samples of the current block.
3. The method of claim 1, wherein the corresponding input and output component samples are corresponding luma and chroma samples of a template region adjacent to the current block.
4. The method of claim 1, wherein the component prediction model is a convolution model, and the set of coefficients is derived by solving a matrix equation between the corresponding input and output component samples.
5. The method of claim 1, wherein constraining the derived set of coefficients comprises clipping the coefficients at a clipping threshold.
6. The method of claim 1, wherein constraining the derived set of coefficients comprises clipping different coefficients at different clipping thresholds.
7. The method of claim 1, wherein the derived set of coefficients and the constrained set of coefficients are represented in floating point in a video coder.
8. The method of claim 1, wherein constraining the derived set of coefficients comprises limiting the coefficients to a predefined range.
9. The method of claim 8, wherein the derived set of coefficients and the constrained set of coefficients are represented in fixed point in a video coder, with a fractional part comprising N bits, and a size of the predefined range is scaled up based on 1<<N.
10. The method of claim 8, wherein, when a derived coefficient is outside the predefined range, the set of coefficients is set equal to an identity filter.
11. The method of claim 8, wherein, when a derived coefficient is outside the predefined range, a clipping operation is applied to the out-of-range coefficient.
12. The method of claim 11, wherein, when a derived coefficient is outside the predefined range, the derived set of coefficients is not used to encode or decode the current block.
13. An electronic apparatus, comprising:
a video coding-decoding circuit configured to perform operations comprising:
receiving data for a block of pixels to be encoded or decoded as a current block of a current picture of a video;
deriving a set of coefficients based on corresponding input and output component samples;
constraining the derived set of coefficients based on a set of constraints;
applying the constrained set of coefficients as a component prediction model to generate a predictor for the current block; and
encoding or decoding the current block by using the generated predictor.
14. A video decoding method, comprising:
receiving data for a block of pixels to be decoded as a current block of a current picture of a video;
deriving a set of coefficients based on corresponding input and output component samples;
constraining the derived set of coefficients based on a set of constraints;
applying the constrained set of coefficients as a component prediction model to generate a predictor for the current block; and
reconstructing the current block by using the generated predictor.
15. A video encoding method, comprising:
receiving data for a block of pixels to be encoded as a current block of a current picture of a video;
deriving a set of coefficients based on corresponding input and output component samples;
constraining the derived set of coefficients based on a set of constraints;
applying the constrained set of coefficients as a component prediction model to generate a predictor for the current block; and
encoding the current block by using the generated predictor.
TW112128822A 2022-08-02 2023-08-01 Constraining convolution model coefficient TW202412522A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US63/370,133 2022-08-02
WOPCT/CN2023/109712 2023-07-28

Publications (1)

Publication Number Publication Date
TW202412522A 2024-03-16
