TWI830548B - Video encoding method and electronic equipment thereof

Info

Publication number: TWI830548B
Application number: TW111149216A
Authority: TW
Inventors: 莊政彥; 江嫚書; 蕭裕霖; 陳俊嘉; 徐志瑋; 莊子德; 陳慶曄; 黃毓文
Original assignee: 聯發科技股份有限公司
Priority date: 2021-12-21
Filing date: 2022-12-21
Publication date: 2024-01-21
Also published as: US20230199217A1; TW202327363A; CN116366857A

Abstract

A video encoder receives raw pixel data to be encoded as a current block of a current picture of a video into a bitstream. The video encoder identifies multiple candidate bi-prediction positions for the current block, including a center position, a first set of offset positions, and a second set of offset positions. The first set of offset positions and the second set of offset positions interleave each other. The encoder computes distortion values for each of the candidate bi-prediction positions based on several possible weighting parameter values. The distortion values for the center position are based on each of the several possible weighting parameter values. The distortion values for the first set of offset positions are based on a first subset of the possible weighting parameter values. The distortion values for the second set of offset positions are based on a second subset of the possible weighting parameter values.

Description

Video encoding method and related electronic equipment

The present disclosure relates generally to video coding. In particular, it relates to hardware architectures configured to support multiple different coding modes.

Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims listed below and are not admitted as prior art by inclusion in this section.

High Efficiency Video Coding (HEVC) is an international video coding standard developed by the Joint Collaborative Team on Video Coding (JCT-VC). HEVC is based on a hybrid block-based, motion-compensated, DCT-like transform coding architecture. The basic unit of compression, termed a coding unit (CU), is a 2Nx2N square block, and each CU can be recursively split into four smaller CUs until a predefined minimum size is reached. Each CU contains one or more prediction units (PUs).

To achieve the best coding efficiency of the hybrid coding architecture in HEVC, each PU has two kinds of prediction modes: intra prediction and inter prediction. For intra prediction modes, spatially neighboring reconstructed pixels are used to generate directional predictions; there are up to 35 directions in HEVC. For inter prediction modes, temporally reconstructed reference frames are used to generate motion-compensated predictions. There are three inter modes: Skip, Merge, and Inter Advanced Motion Vector Prediction (AMVP).

When a PU is coded in Inter AMVP mode, motion-compensated prediction is performed with a transmitted motion vector difference (MVD), which is combined with a motion vector predictor (MVP) to derive the motion vector (MV). To decide the MVP in Inter AMVP mode, the advanced motion vector prediction (AMVP) scheme selects a motion vector predictor from an AMVP candidate set that includes two spatial MVPs and one temporal MVP. In AMVP mode, therefore, the MVP index and the corresponding MVD must be encoded and transmitted. In addition, the inter prediction direction, which specifies bi-prediction or uni-prediction from list 0 (L0) or list 1 (L1), is encoded and transmitted together with the reference frame index for each list.
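The relation MV = MVP + MVD described above can be illustrated with a minimal Python sketch. The names `MV`, `reconstruct_mv`, and `select_mvp` are hypothetical, the quarter-sample unit is an assumption, and the L1-norm selection rule is only a crude stand-in for the rate-based choice a real encoder would make.

```python
from typing import List, NamedTuple, Tuple


class MV(NamedTuple):
    x: int  # horizontal component (assumed to be in quarter-sample units)
    y: int  # vertical component


def reconstruct_mv(mvp: MV, mvd: MV) -> MV:
    """Decoder side: MV = MVP + MVD."""
    return MV(mvp.x + mvd.x, mvp.y + mvd.y)


def select_mvp(candidates: List[MV], target: MV) -> Tuple[int, MV]:
    """Encoder side (illustrative): pick the predictor whose MVD has the
    smallest L1 norm, a rough proxy for the bits needed to code the MVD.
    Returns the MVP index and the MVD to transmit."""
    def mvd_size(i: int) -> int:
        return abs(target.x - candidates[i].x) + abs(target.y - candidates[i].y)

    idx = min(range(len(candidates)), key=mvd_size)
    mvp = candidates[idx]
    return idx, MV(target.x - mvp.x, target.y - mvp.y)


if __name__ == "__main__":
    # Two spatial MVPs and one temporal MVP, as in the AMVP candidate set.
    amvp_set = [MV(4, 0), MV(10, -2), MV(0, 0)]
    idx, mvd = select_mvp(amvp_set, MV(9, -1))
    print(idx, mvd, reconstruct_mv(amvp_set[idx], mvd))
```

In Skip and Merge modes the same reconstruction applies with a zero MVD, so only the candidate index needs to be signaled.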

When a PU is coded in Skip or Merge mode, no motion information is transmitted except the merge index of the selected candidate. This is because Skip and Merge modes use motion inference (MV = MVP + MVD, where the MVD is zero) to obtain motion information from spatially neighboring blocks (spatial candidates) or from a temporal block (temporal candidate) located in a co-located picture, where the co-located picture is the first reference picture in list 0 or list 1, as signaled in the slice header. For a Skip PU, the residual signal is also omitted. To determine the merge index for Skip and Merge modes, the merge scheme selects a motion vector predictor from a merge candidate set containing four spatial MVPs and one temporal MVP.

In view of this, the present invention provides the following technical solutions:

The present invention provides a video encoding method, comprising: receiving raw pixel data for a block of pixels to be encoded into a bitstream as a current block of a current picture of a video; identifying a plurality of candidate bi-prediction positions including a center position, a first set of offset positions, and a second set of offset positions; computing a distortion value for each of the plurality of candidate bi-prediction positions based on a plurality of possible weighting parameter values, wherein (i) the distortion values computed for the center position are based on each of the plurality of possible weighting parameter values, (ii) the distortion values computed for the first set of offset positions are based on a first subset of the plurality of possible weighting parameter values, and (iii) the distortion values computed for the second set of offset positions are based on a second subset of the plurality of possible weighting parameter values, the first subset being different from the second subset; selecting a weighting parameter value for a candidate bi-prediction position based on the distortion values computed for the plurality of candidate bi-prediction positions of the current block; and encoding the current block using bi-prediction based on the selected weighting parameter value.
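The search described in this method can be sketched as follows. This is only an illustration under stated assumptions: the function name `search_positions_and_weights`, the callable `distortion`, the split of the offsets into a cross-shaped set and a diagonal set, and the particular weight subsets are all hypothetical and are not taken from the disclosure.

```python
from typing import Callable, Iterable, Sequence, Tuple

Position = Tuple[int, int]


def search_positions_and_weights(
    distortion: Callable[[Position, int], float],
    center: Position,
    first_offsets: Iterable[Position],
    second_offsets: Iterable[Position],
    all_weights: Sequence[int],
    first_subset: Sequence[int],
    second_subset: Sequence[int],
):
    """Evaluate the center with every weight, and each offset set with its
    own (different) subset of weights. Returns the best
    (position, weight, distortion) triple and the number of evaluations."""
    plan = [(center, all_weights)]
    plan += [(p, first_subset) for p in first_offsets]
    plan += [(p, second_subset) for p in second_offsets]

    best = None
    evaluated = 0
    for pos, weights in plan:
        for w in weights:
            d = distortion(pos, w)
            evaluated += 1
            if best is None or d < best[2]:
                best = (pos, w, d)
    return best, evaluated


if __name__ == "__main__":
    # Hypothetical interleaved sets: cross neighbours vs. diagonal neighbours.
    cross = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    diag = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    toy = lambda p, w: abs(p[0] - 1) + abs(p[1]) + abs(w - 5)  # toy distortion
    print(search_positions_and_weights(
        toy, (0, 0), cross, diag, [-2, 3, 4, 5, 10], [3, 5], [4, 10]))
```

With five weights and eight offsets, this toy configuration needs 21 distortion evaluations instead of the 45 required when every position is tested with every weight.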

The present invention also provides an electronic apparatus, comprising an encoder circuit configured to perform operations comprising: receiving raw pixel data for a block of pixels to be encoded into a bitstream as a current block of a current picture of a video; identifying a plurality of candidate bi-prediction positions including a center position, a first set of offset positions, and a second set of offset positions; computing a distortion value for each of the plurality of candidate bi-prediction positions based on a plurality of possible weighting parameter values, wherein (i) the distortion values computed for the center position are based on each of the plurality of possible weighting parameter values, (ii) the distortion values computed for the first set of offset positions are based on a first subset of the plurality of possible weighting parameter values, and (iii) the distortion values computed for the second set of offset positions are based on a second subset of the plurality of possible weighting parameter values, the first subset being different from the second subset; selecting a weighting parameter value for a candidate bi-prediction position based on the distortion values computed for the plurality of candidate bi-prediction positions of the current block; and encoding the current block using bi-prediction based on the selected weighting parameter value.

The video encoding method and the corresponding electronic apparatus of the present invention can be shared by multiple different coding tools.

In the following description, numerous specific details are set forth. It should be understood, however, that embodiments of the invention may be practiced without these specific details. In other instances, well-known circuits, structures, and techniques are not shown in detail so as not to obscure the understanding of this description. Those of ordinary skill in the art will nevertheless appreciate that the invention may be practiced without such specific details and, given the included description, will be able to implement appropriate functionality without undue experimentation.

The following description is of the best-contemplated mode of carrying out the invention. It is intended to illustrate the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.

In the detailed description that follows, numerous specific details are set forth by way of example in order to provide a thorough understanding of the relevant teachings. Any variations, derivatives, and/or extensions based on the teachings described herein are within the protective scope of the present disclosure. In some instances, well-known methods, procedures, components, and/or circuitry pertaining to one or more example implementations disclosed herein may be described at a relatively high level without detail, in order to avoid unnecessarily obscuring aspects of the teachings of the present disclosure.

In the latest video coding standard, Versatile Video Coding (VVC), many new coding tools have been proposed to improve coding efficiency. Examples of such coding tools include Advanced Motion Vector Prediction (AMVP), Geometric Prediction Mode (GPM), Adaptive Motion Vector Resolution (AMVR), Merge Mode with Motion Vector Difference (MMVD), Decoder-side Motion Vector Refinement (DMVR), Bi-prediction with CU-level Weights (BCW), and Sub-Block Transform (SBT). A high-performance video coder for the VVC standard may achieve high coding gain by supporting as many of these coding tools as possible. However, designing a dedicated hardware circuit for each coding tool is very inefficient.

Some embodiments of the present disclosure provide a video coder that includes circuits configured to be shared by multiple different coding tools. The video coder may include a low-complexity (LC) rate-distortion optimization stage (LC-RDO) as a first-stage RDO and a high-complexity (HC) rate-distortion optimization stage (HC-RDO) as a second-stage RDO. The video coder tries various coding tools and selects the coding tool with the lowest rate-distortion cost (RD cost). Since a coding tool may have multiple possible candidates, the LC-RDO stage is used to select the candidate with the lowest RD cost. In some embodiments, the LC-RDO stage uses simpler measures, such as the Sum of Absolute Differences (SAD) or the Sum of Absolute Transformed Differences (SATD), to simplify the distortion calculation. LC-RDO may select candidates by performing simpler transforms and/or by not performing a full search. The HC-RDO stage then uses the candidates provided by LC-RDO to finalize the selection of the coding tool used to encode the current CU, including by performing standard transforms such as DCT2.
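The low-complexity distortion measures named above can be sketched in a few lines of Python. This is an illustrative sketch, not the disclosed circuit: the 4x4 block size, the unnormalized Hadamard transform, and the Lagrangian form of the cost (distortion plus lambda times rate) are assumptions, and the function names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# 4x4 Hadamard matrix (symmetric, so it equals its own transpose).
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def satd4x4(a, b):
    """Sum of absolute Hadamard-transformed differences of two 4x4 blocks.
    No normalization is applied; real encoders often scale the result."""
    diff = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    transformed = _matmul(_matmul(H4, diff), H4)
    return sum(abs(v) for row in transformed for v in row)


def rd_cost(distortion, rate_bits, lam):
    """Assumed Lagrangian cost: J = D + lambda * R."""
    return distortion + lam * rate_bits


if __name__ == "__main__":
    src = [[10] * 4 for _ in range(4)]
    pred = [[8] * 4 for _ in range(4)]
    print(sad(src, pred), satd4x4(src, pred), rd_cost(sad(src, pred), 4, 2.0))
```

For a constant difference block, all of the energy lands in the DC coefficient, so the unnormalized SATD equals the SAD in this example.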

In some embodiments, the LC-RDO includes circuits configured to be shared by multiple different coding tools. Thus, to encode a CU, the video coder may run LC-RDO under multiple different configurations for multiple different coding tools. In some embodiments, LC-RDO identifies one best candidate per coding tool for HC-RDO.

Figure 1 conceptually illustrates a portion of a video coder 100 that uses an LC-RDO stage 110 and an HC-RDO stage 120 to select the coding tool and/or the prediction candidate for coding the current block.

As illustrated, LC-RDO 110 and HC-RDO 120 together perform the RDO function for the video coder 100, so that the current block is encoded using the coding tool and prediction candidate with the lowest cost in terms of rate and distortion. The LC-RDO stage 110 examines coding tools and prediction candidates to provide intermediate results to the HC-RDO stage 120. The intermediate results produced by LC-RDO 110 may include information such as the identity of the best prediction candidate for a particular coding tool and its associated cost. HC-RDO may use the intermediate results to produce the final coding tool selection and prediction candidate selection. The video coder 100 then encodes the current block using the selected coding tool and prediction candidate.
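The division of labor between the two stages can be sketched as follows. The functions `lc_rdo`, `hc_rdo`, and `choose_tool` and the two cost callables are hypothetical names used only for illustration; the cost functions stand in for the low-complexity and high-complexity evaluations.

```python
def lc_rdo(candidates, cheap_cost):
    """First stage: return (best_candidate, cost) for one coding tool,
    using a low-complexity cost function."""
    best = min(candidates, key=cheap_cost)
    return best, cheap_cost(best)


def hc_rdo(intermediate, full_cost):
    """Second stage: given {tool: (candidate, lc_cost)}, return the
    (tool, candidate) pair with the lowest high-complexity cost."""
    tool = min(intermediate, key=lambda t: full_cost(t, intermediate[t][0]))
    return tool, intermediate[tool][0]


def choose_tool(tools, cheap_cost, full_cost):
    """Run the first stage once per tool, then let the second stage decide."""
    intermediate = {
        name: lc_rdo(cands, lambda c, n=name: cheap_cost(n, c))
        for name, cands in tools.items()
    }
    return hc_rdo(intermediate, full_cost)


if __name__ == "__main__":
    tools = {"AMVP": [1, 5, 9], "MMVD": [2, 4]}      # toy candidate lists
    cheap = lambda tool, c: abs(c - 4)                 # toy LC cost
    overhead = {"AMVP": 3, "MMVD": 1}                  # toy per-tool rate
    full = lambda tool, c: overhead[tool] + abs(c - 4) # toy HC cost
    print(choose_tool(tools, cheap, full))
```

Only one candidate per tool reaches the expensive second stage, which is the point of splitting the search in two.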

Both LC-RDO 110 and HC-RDO 120 can retrieve data from a data cache 130, which may include storage for pixel data of the current picture (reconstructed or original source) and of several reference pictures. The data cache 130 may also include storage for reference motion information, such as motion vectors previously used for other CUs. LC-RDO and HC-RDO may selectively retrieve data from the data cache 130 based on the position of the current block and on the coding tool and prediction candidate being examined.

LC-RDO 110 includes circuits that are shared by different coding tools and can be configured to produce intermediate results for different coding tools. Thus, the same LC-RDO 110 may be configured under a first configuration 111 to produce a first intermediate result 121 for a first coding tool, then configured under a second configuration 112 to produce a second intermediate result 122 for a second coding tool, and so on. HC-RDO 120 uses the intermediate results of the different coding tools to finalize the selection of the coding tool and/or prediction candidate for encoding the current block. In the figure, coding tool configurations 111-114 are used by LC-RDO 110 to produce intermediate results 121-124, respectively, for HC-RDO 120.

In some embodiments, the LC-RDO stage 110 may perform at least some of the following operations: (1) fetching data from the data cache 130, (2) generating candidates, (3) computing distortion using low-complexity transforms, and (4) comparing the RD costs of different candidates. The winning candidate identified by the LC-RDO stage 110 is then provided to the HC-RDO stage 120. The same four LC-RDO operations can be used to identify the best candidate for different coding tools (e.g., AMVP, MMVD, and GPM).

Figure 2 conceptually illustrates the components of the LC-RDO. The circuits of the LC-RDO are configured by configuration data 200 of a particular coding tool to determine the best candidate for that coding tool and to provide the cost of using that candidate.

For each of several candidates, LC-RDO 110 fetches data from the data cache 130, applies interpolation 210, performs distortion calculation 220, and performs rate calculation 230. Based on the distortion and rate calculations, LC-RDO 110 determines a cost value for each candidate. An overall comparator (also called a candidate comparator) 240 compares the cost values of the different candidates to determine the best candidate for the particular coding tool.

LC-RDO 110 performs operations 210-230 according to the configuration data 200. The configuration data 200 may also configure LC-RDO to omit one or more of operations 210-230. In some embodiments, LC-RDO may have circuits that can be configured to examine multiple candidates in parallel (e.g., independent sets of circuits capable of examining different candidates simultaneously). In some embodiments, LC-RDO may have circuits that are shared by different candidates and can be configured to examine the candidates sequentially.

Figure 3 illustrates an example implementation of LC-RDO 110 with circuits shared by multiple different coding tools. LC-RDO 110 includes an L0 uni-prediction section 301, an L0 bi-prediction section 302, an L1 uni-prediction section 303, and an L1 bi-prediction section 304. When configured for a coding tool, the circuits of each section compute the distortion and rate of that section's candidates. Each section has one or more local comparators to determine the best, lowest-cost candidate of that section. The overall comparator 240 compares the best candidates of the four sections 301-304 to identify the overall best candidate for the given coding tool, based on costs derived from the computed distortion and rate.

As illustrated, the L0 uni-prediction section 301 has an interpolation filter 310, a SATD array 320, rate calculators 331-334, and a local comparator 340. The interpolation filter 310 receives reference samples ("reference data") and uses a horizontal filter array 311, a shift register 312, a vertical filter array 313, and an interpolation buffer 314 to generate filtered reference samples (e.g., for fractional positions). A reference picture buffer 315 stores pixel data of the reference picture as reference samples. The interpolation filter 310 can provide the output of any of components 311-314, and the SATD array 320 can use any of these outputs for its distortion calculation. The SATD array 320 performs the distortion calculation based on source data (from the video source) and the (uni-prediction) filtered reference samples from the interpolation filter 310. The SATD array 320 may also perform the distortion calculation based on blended reference samples from the L0 bi-prediction section 302. The output of the SATD array 320 is provided to the quarter-pel, half-pel, 1-pel, and 4-pel comparators 341-344.

The outputs of the rate calculators 331-334, together with the output of the SATD array 320, are fed to the comparators 341-344. The comparators 341-344 in turn provide cost values for different candidates at different pixel resolutions. The local comparator 340 compares these candidates and identifies the best candidate for L0 uni-prediction.

The L0 bi-prediction section 302 has a bi-prediction blending module 319, a bi-prediction SATD array 325, a rate calculator 335, and a bi-prediction comparator 345. The bi-prediction blending module 319 performs a weighted average to blend reference sample pixels from two reference pictures at different temporal positions (e.g., the two uni-predictors of L0 and L1). The bi-prediction SATD array 325 computes the distortion of the blended reference samples and provides the distortion values to the bi-prediction comparator 345 and the rate calculator 335. In some embodiments, when L0 bi-prediction is performed, the SATD array 320 may be reused to compute the distortion, and the comparators 341-344 may be reused as well.
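The weighted average performed by the blending module can be sketched as below. The formula is an assumption borrowed from the BCW tool of VVC, where the weight w applies to the L1 predictor, 8 - w applies to the L0 predictor, and the sum is rounded and shifted right by 3; the disclosure itself does not spell out the arithmetic here. The function name `blend_bi_pred` is hypothetical, and clipping to the sample range is omitted.

```python
def blend_bi_pred(p0, p1, w, shift=3):
    """Blend two uni-predictors with integer weights.

    p0, p1: 2-D lists of samples from the L0 and L1 reference blocks.
    w:      weight applied to p1; (2**shift - w) is applied to p0.
    Assumed form: ((2**shift - w) * p0 + w * p1 + rounding) >> shift.
    """
    total = 1 << shift
    rounding = total >> 1
    return [
        [((total - w) * a + w * b + rounding) >> shift for a, b in zip(r0, r1)]
        for r0, r1 in zip(p0, p1)
    ]


if __name__ == "__main__":
    l0 = [[100, 50]]
    l1 = [[60, 90]]
    for w in (-2, 3, 4, 5, 10):  # BCW weight set used in VVC (assumed here)
        print(w, blend_bi_pred(l0, l1, w))
```

With w = 4 the blend reduces to the ordinary rounded average of the two predictors.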

The figure illustrates the components of the L0 uni-prediction section 301 and the L0 bi-prediction section 302. The components of the L1 uni-prediction section 303 and the L1 bi-prediction section 304 are not illustrated, because they are similar to those of the L0 uni-prediction section 301 and the L0 bi-prediction section 302.

Each candidate is associated with a cost value computed from its distortion value and rate value. The overall comparator 240 compares the best candidates from the L0 uni-prediction section 301, the L0 bi-prediction section 302, the L1 uni-prediction section 303, and the L1 bi-prediction section 304.

The various components and circuits of LC-RDO 110 can be shared by different coding tools. In some embodiments, at least some components of the LC-RDO can be configured to identify the lowest-cost candidate for various coding tools.

Merge Mode with Motion Vector Difference (MMVD) is a coding tool used by the Versatile Video Coding (VVC) standard. Unlike regular merge mode, in which implicitly derived motion information is used directly to generate the prediction samples of the current CU, in MMVD the derived motion information is further refined by a motion vector difference (MVD). MMVD also extends the candidate list of merge mode by adding extra MMVD candidates based on predefined offsets (also called MMVD offsets).

In some embodiments, the circuits of LC-RDO 110 can be used to identify candidates for MMVD mode. For example, the horizontal filter array 311 and the vertical filter array 313 are general-coefficient filters that can be configured for MMVD mode. The reference picture buffer 315 is large enough to store the reference samples required by MMVD. The shift register 312 and the interpolation buffer 314 are likewise large enough to hold the temporary results of vertical and horizontal filtering. The bi-prediction blending module 319 and the bi-prediction SATD array 325 of the bi-prediction section can be used directly to compute distortion for MMVD. The overall comparator 240 can also be reused for MMVD.

Decoder-side Motion Vector Refinement (DMVR) is a coding tool that can be applied to regular merge candidates. Under DMVR, the decoder refines the MV in predefined steps: (i) split the current CU into multiple 16x16/8x16/16x8 sub-blocks, (ii) generate 25 integer-offset refinement candidates around the regular merge candidate using a bilinear filter, (iii) compute even-row SAD costs by mirror matching, and (iv) apply fractional-offset refinement if predefined conditions are met. Figure 4 conceptually illustrates mirror matching in DMVR. The figure shows the mirror matching between the 25 refinement candidate positions of the L0 merge MV and the 25 refinement candidate positions of the L1 merge MV.
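Steps (ii) and (iii) can be sketched as a small integer search. The sketch assumes a search radius of 2 (giving 5 x 5 = 25 offsets), assumes that "mirror matching" means an offset (dx, dy) applied to the L0 block is paired with (-dx, -dy) applied to the L1 block, and omits the bilinear interpolation step by working on integer positions. The function name and argument layout are hypothetical, and the caller must ensure the block plus the search margin stays inside both pictures.

```python
def dmvr_integer_search(ref0, ref1, x0, y0, x1, y1, w, h, radius=2):
    """Return (cost, dx, dy) of the best mirrored integer offset.

    ref0, ref1: 2-D lists of samples (L0 and L1 reference pictures).
    (x0, y0), (x1, y1): top-left corners pointed to by the merge MVs.
    Only even rows of the block contribute to the SAD cost.
    """
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cost = 0
            for r in range(0, h, 2):              # even rows only
                row0 = ref0[y0 + dy + r]          # L0 moves by (+dx, +dy)
                row1 = ref1[y1 - dy + r]          # L1 moves by (-dx, -dy)
                for c in range(w):
                    cost += abs(row0[x0 + dx + c] - row1[x1 - dx + c])
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Because the two offsets are mirrored, each of the 25 candidates needs one SAD over two reference blocks rather than a search over both lists independently.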

The DMVR MV offset derivation process has many similarities with the LC-RDO stage. The circuits of the LC-RDO stage 110 can therefore be shared with or reused for DMVR, with some modules configured differently than for other coding tools. For DMVR, LC-RDO can be configured to (i) read data from the cache, (ii) generate bilinear candidates, (iii) compute even-row SAD costs, and (iv) compare the cost of each candidate, perform fractional refinement, and output a refined merge MV. The refined merge MV is provided to HC-RDO 120. For example, the horizontal filter array 311 and the vertical filter array 313 can be configured to perform bilinear filtering. The SATD array 320 can be configured to compute the SAD cost of even rows. The bi-prediction blending module 319 can be configured to support mirror matching. The reference picture buffer 315, the shift register 312, the interpolation buffer 314, and the local comparator 340 can be reused directly for DMVR. The comparators 341-344 can be configured to perform fractional refinement, for example by constructing an error surface and then finding the offset with the minimum cost on that surface.
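One common way to find the minimum on an error surface is to fit a parabola through the cost at the best integer position and its two neighbors along each axis. The sketch below assumes that model (it is the parametric error surface used by DMVR in VVC; the disclosure does not state which surface the comparators construct), and the function name is hypothetical.

```python
def parametric_offset(e_center, e_minus, e_plus):
    """Fractional offset of the minimum of a parabola fitted through the
    costs at positions -1, 0 and +1 along one axis.

    Returns 0.0 when the three costs do not form a strict valley.
    """
    denom = 2 * (e_minus + e_plus - 2 * e_center)
    if denom <= 0:
        return 0.0
    return (e_minus - e_plus) / denom


if __name__ == "__main__":
    # Costs sampled from E(x) = 8 * (x - 0.25)**2 + 3 at x = -1, 0, +1.
    print(parametric_offset(3.5, 15.5, 7.5))
```

The same function is applied once horizontally and once vertically, using the left/right and top/bottom neighbor costs respectively.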

Adaptive Motion Vector Resolution (AMVR) allows motion vector differences (MVDs) to be coded at different precisions. VVC has four AMVR precisions: four-sample (4-pel), integer-sample (1-pel), half-sample (H-pel), and quarter-sample (Q-pel). In some embodiments, the example architecture of LC-RDO 110 in Figure 3 can be shared directly with AMVR, with the four comparators 341-344 corresponding to the four AMVR precisions. More commonly, the four AMVR precisions are handled by four separate PE calls (e.g., using four separate hardware processing elements).

In some embodiments, LC-RDO can process different AMVR precisions with a single PE call (using the same processing element), specifically by aligning the center MVs of the different AMVR precisions and performing RDO for them together. Figure 5 conceptually illustrates aligning the four AMVR precisions at their center MVs. Each circle in the figure represents a pixel position (integer or fractional). Circles labeled "Q" are quarter-pel positions interpolated with an 8-tap filter, and circles labeled "H" are half-pel positions interpolated with a 6-tap filter. Circles labeled "HQ" are positions where interpolation is performed for both half-pel and quarter-pel. Circles labeled "1" and "4" are 1-pel and 4-pel positions, respectively, which require no interpolation.

By aligning the four precisions at their center MVs, all four AMVR precisions can be processed by one PE call, or a subset of the AMVR precisions (e.g., 1-pel, half-pel, and quarter-pel, but not 4-pel) can be merged into one PE call. Experiments show that, under certain conditions, this has little impact on BD-rate. It is also observed that the AMVR side information becomes less significant when the current CU is large. Partial interpolation results can be shared among the four precisions to further reduce hardware cost.
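The effect of aligning the precisions at a common center can be illustrated by enumerating the search positions in quarter-sample units. The square search pattern, the `radius` parameter, and the names `STEP_QPEL` and `aligned_candidates` are assumptions made for this sketch; the actual pattern is the one shown in Figure 5.

```python
# Step size of each AMVR precision, expressed in quarter-sample units.
STEP_QPEL = {"Q-pel": 1, "H-pel": 2, "1-pel": 4, "4-pel": 16}


def aligned_candidates(center, precisions=STEP_QPEL, radius=1):
    """Map each search position (quarter-sample units) to the set of
    AMVR precisions that test it, with all precisions sharing one center."""
    cx, cy = center
    positions = {}
    for name, step in precisions.items():
        for j in range(-radius, radius + 1):
            for i in range(-radius, radius + 1):
                pos = (cx + i * step, cy + j * step)
                positions.setdefault(pos, set()).add(name)
    return positions


if __name__ == "__main__":
    shared = aligned_candidates((0, 0), radius=2)
    print(len(shared), sorted(shared[(0, 0)]), sorted(shared[(2, 0)]))
```

Positions tested by more than one precision, such as the shared center or the half-sample points that also lie on the quarter-sample grid, need to be interpolated only once, which is where the hardware saving comes from.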

Sub-block Transform (SBT) is a coding tool for inter-predicted CUs that allows the video encoder to apply the transform to only a portion of the residual block or transform block. The coded portion of the transform block is coded with an implicitly determined transform, and the non-coded portion is zeroed out. SBT specifies several candidate modes for dividing the transform block into a coded portion and a non-coded portion. Figure 6 illustrates the candidate SBT modes for dividing a transform block into a coded portion and a zero portion. In the figure, for each candidate SBT mode, the coded portion of the transform block is labeled "A" and the non-coded portion is labeled "0". The width of the transform block is denoted tbWidth, and the width of the coded portion is denoted trafoWidth.

The candidate SBT modes in Figure 6 can be enabled or disabled through cu_sbt_quad_flag, cu_sbt_horizontal_flag, and cu_sbt_pos_flag. cu_sbt_quad_flag indicates whether the ratio trafoWidth:(tbWidth-trafoWidth) is 2:2, or is 1:3 or 3:1. cu_sbt_horizontal_flag indicates whether the transform block is split horizontally or vertically for SBT. cu_sbt_pos_flag indicates whether the left/top sub-block has non-zero transform coefficients and the right/bottom sub-block is zeroed out, or the right/bottom sub-block has non-zero transform coefficients and the left/top sub-block is zeroed out.
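The three flags together select one of eight partitions, which can be enumerated as in the sketch below. It is illustrative only: the function name, the rectangle representation, and the reading that the quarter-size part is the coded one when the quad flag is set (as in the VVC design) are assumptions rather than statements from the disclosure.

```python
from itertools import product


def sbt_partition(tb_width, tb_height, quad, horizontal, pos):
    """Return (coded_rect, zero_rect), each as (x, y, w, h), for one SBT mode.

    Assumptions: quad=0 splits 2:2 and quad=1 codes a quarter-size part;
    horizontal=1 splits along the height (top/bottom), horizontal=0 along
    the width (left/right); pos=0 codes the left/top part, pos=1 the other.
    """
    full = tb_height if horizontal else tb_width
    coded = full // 4 if quad else full // 2
    coded_start = 0 if pos == 0 else full - coded
    zero_start = coded if pos == 0 else 0
    if horizontal:
        return ((0, coded_start, tb_width, coded),
                (0, zero_start, tb_width, full - coded))
    return ((coded_start, 0, coded, tb_height),
            (zero_start, 0, full - coded, tb_height))


# Enumerate all eight (quad, horizontal, pos) combinations for a 16x16 block.
modes = {flags: sbt_partition(16, 16, *flags)
         for flags in product((0, 1), repeat=3)}
for flags, (coded_rect, zero_rect) in modes.items():
    print(flags, "coded:", coded_rect, "zero:", zero_rect)
```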

Rather than trying every possible SBT candidate mode in the HC-RDO stage, in some embodiments the LC-RDO stage is used to identify the candidate mode with the lowest cost for coding the transform block. In some embodiments, LC-RDO computes the residual cost of each SBT candidate mode as a weighted average of the SSD of the transform-coded portion and the SSD of the zeroed portion. Specifically, the cost of each candidate mode is computed as follows:

(Eq. 1)

LC-RDO selects the candidate mode with the lowest cost (computed according to Equation 1) to be tested in HC-RDO. This is referred to as SBT pre-selection. Thus, in some embodiments, instead of having HC-RDO try every possible SBT candidate mode, the video encoder performs SBT pre-selection in LC-RDO, and HC-RDO then makes its candidate selection based on the result of the SBT pre-selection.

LC-RDO 110 may be configured to perform SBT pre-selection. Specifically, the residual of each SBT candidate mode may be provided to the SATD array 320, which then computes the cost of each candidate mode according to Equation 1 above.
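SBT pre-selection can be sketched as computing one cost per candidate mode and keeping the minimum. Because Equation 1 is not reproduced in this text, the weights `w_coded` and `w_zero` below are placeholders chosen only for illustration, and the function names and the rectangle layout are likewise hypothetical.

```python
def region_ssd(residual, rect):
    """Sum of squared residual samples inside rect = (x, y, w, h)."""
    x, y, w, h = rect
    return sum(residual[r][c] ** 2
               for r in range(y, y + h)
               for c in range(x, x + w))


def preselect_sbt(residual, candidates, w_coded=0.25, w_zero=0.75):
    """Return (best_mode, cost) under an assumed weighted-average cost.

    w_coded and w_zero are placeholders, not the weights of Equation 1.
    """
    costs = {
        mode: w_coded * region_ssd(residual, coded)
        + w_zero * region_ssd(residual, zero)
        for mode, (coded, zero) in candidates.items()
    }
    best = min(costs, key=costs.get)
    return best, costs[best]


# Toy 4x4 residual whose energy is concentrated in the left half.
residual = [[4, 3, 0, 0],
            [5, 2, 0, 1],
            [3, 4, 0, 0],
            [2, 6, 1, 0]]
candidates = {
    "vertical, left coded": ((0, 0, 2, 4), (2, 0, 2, 4)),
    "vertical, right coded": ((2, 0, 2, 4), (0, 0, 2, 4)),
    "horizontal, top coded": ((0, 0, 4, 2), (0, 2, 4, 2)),
    "horizontal, bottom coded": ((0, 2, 4, 2), (0, 0, 4, 2)),
}
best_mode, best_cost = preselect_sbt(residual, candidates)
print(best_mode, best_cost)  # vertical, left coded 31.25
```

With these placeholder weights, zeroing a high-energy region is penalized more than coding it, so the mode that codes the left half wins; only that mode would then be handed to HC-RDO.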

Bi-prediction with CU-level Weights (BCW) is a coding tool that enhances bi-prediction. BCW allows different weights to be applied to the L0 prediction and the L1 prediction before they are combined into the bi-prediction for a CU. For a CU coded with BCW, one weighting parameter w is signaled for the L0 and L1 predictions, and the bi-prediction result P_bi-pred is computed from w according to the following formula:

(Eq. 2)

P₀ denotes the pixel values predicted by the L0 MV (the L0 prediction), and P₁ denotes the pixel values predicted by the L1 MV (the L1 prediction). P_bi-pred is the weighted average of P₀ and P₁ according to w. For low-delay pictures, i.e., pictures that use reference frames with smaller picture order counts (POCs), the possible values of w include {-2, 3, 4, 5, 10}. For non-low-delay pictures, the possible values of w include {3, 4, 5}. In some embodiments, to find the best w for coding the current CU, the LC-RDO stage does not search all possible values of w at every candidate bi-prediction MV position; instead, it uses an interleaved search pattern to find the best value of the BCW weighting parameter w.
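Equation 2 is not reproduced in this text. In the VVC design from which BCW is taken, the weighted average is commonly written as P_bi-pred = ((8 - w) * P₀ + w * P₁ + 4) >> 3, and the sketch below assumes that form. The function name and the sample values are illustrative, and clipping to the valid sample range is omitted.

```python
BCW_WEIGHTS = (-2, 3, 4, 5, 10)  # indexed by BCWIdx 0..4, as listed in the text


def bcw_blend(p0, p1, w):
    """Blend two lists of prediction samples with weight w.

    Assumes the VVC-style form ((8 - w) * P0 + w * P1 + 4) >> 3.
    """
    return [((8 - w) * a + w * b + 4) >> 3 for a, b in zip(p0, p1)]


p0 = [100, 120, 140]  # toy L0 prediction samples
p1 = [108, 120, 100]  # toy L1 prediction samples
print(bcw_blend(p0, p1, 4))   # [104, 120, 120]  (equal weighting)
print(bcw_blend(p0, p1, 10))  # [110, 120, 90]   (L1 weighted beyond 1)
```

With w = 4 both predictions receive a weight of 4/8, which is why that value corresponds to ordinary equal-weight bi-prediction.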

Figure 7 conceptually illustrates the interleaved search pattern used to find the best BCW weighting parameter value. In the figure, weight indices (BCWIdx) 0, 1, 2, 3, and 4 correspond to the possible BCW weighting parameter values -2, 3, 4, 5, and 10, respectively. The search pattern is applied to an array or arrangement of candidate bi-prediction MV positions. The candidate bi-prediction MV positions in the arrangement include a center position 700 and multiple positions offset vertically and/or horizontally (by ±1, ±2, etc.) from the center position 700.

For each candidate bi-prediction MV position, LC-RDO computes a distortion value for each of the possible w values evaluated at that position. As illustrated, at the center bi-prediction MV position 700, the LC-RDO stage looks for the best value of w by trying weight indices 0, 1, 2, 3, and 4, which correspond to BCW weight values -2, 3, 4, 5, and 10, respectively. At each candidate bi-prediction MV position that is offset from the center position, the BCW weights are checked selectively. At offset positions of the first type, the LC-RDO stage tries weight indices (BCWIdx) 0, 2, and 3, which correspond to weighting parameter values -2, 4, and 5, respectively. At offset positions of the second type, the LC-RDO stage tries weight indices 1, 2, and 4, which correspond to BCW weight values 3, 4, and 10, respectively.

LC-RDO computes the distortion value for a possible w value at a candidate bi-prediction position by using that w value to compute the bi-prediction (P_bi-pred) of the current block, i.e., the weighted average (Equation 2) of the L0 prediction (P₀) based on the L0 MV and the L1 prediction (P₁) based on the L1 MV. The L0 and L1 MVs are identified from the candidate bi-prediction position. The distortion for a possible w value is the difference between the raw pixel data of the current block and the bi-prediction.

As illustrated, the two types of offset positions are arranged in an interleaved pattern. Specifically, offset positions of the first type lie to the upper right, upper left, lower right, and lower left of the center position 700, while offset positions of the second type lie above, to the right of, to the left of, and below the center position 700. The pattern extends such that every offset position of one type has diagonal neighbors (upper right, upper left, lower right, lower left) of the same type and horizontal and vertical neighbors (above, right, left, below) of the opposite type.
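The pattern described above is a checkerboard around the center, so the type of an offset position can be derived from the parity of its horizontal and vertical offsets. The sketch below illustrates this; the function name and the use of `(dx + dy) % 2` to decide the type are assumptions that happen to reproduce the described layout, not a description of the actual hardware.

```python
ALL_IDX = (0, 1, 2, 3, 4)   # tried at the center position
TYPE1_IDX = (0, 2, 3)       # first type: diagonal neighbors of the center
TYPE2_IDX = (1, 2, 4)       # second type: horizontal/vertical neighbors


def weight_indices(dx, dy):
    """Return the BCWIdx values tried at offset (dx, dy) from the center."""
    if dx == 0 and dy == 0:
        return ALL_IDX
    # Even parity gives the first type, odd parity the second type.
    return TYPE1_IDX if (dx + dy) % 2 == 0 else TYPE2_IDX


# Print the pattern for a 5x5 arrangement of candidate positions.
for dy in range(-2, 3):
    print("  ".join("".join(str(i) for i in weight_indices(dx, dy)).ljust(5)
                    for dx in range(-2, 3)))
```

Every position includes BCWIdx 2, so the equal-weight case is always evaluated regardless of type.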

In general, LC-RDO tries all possible BCW weighting parameter values only at the center position. At all other candidate bi-prediction MV positions (the offset positions), LC-RDO tries only a subset of the possible weighting parameter values. Moreover, the offset positions are divided into different groups (e.g., according to the interleaved pattern), and LC-RDO tries a different subset of the possible w values for each group of offset positions. Furthermore, BCWIdx=2 (w=4) is tried at every candidate bi-prediction MV position, because BCWIdx=2 corresponds to weighting the L1 prediction and the L0 prediction equally when computing the bi-prediction result.

By reducing the number of possible w values checked according to the interleaved pattern, the video encoder reduces the number of SATD operations performed when finding the best bi-prediction MV position and the best BCW weighting parameter value for that position.
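The saving can be estimated with simple counting. The sketch below assumes one SATD evaluation per (position, weight) pair and a square arrangement of positions of a given radius; both are illustrative assumptions, as is the function name.

```python
def satd_counts(radius, num_weights=5, subset_size=3):
    """Return (exhaustive, interleaved) SATD evaluation counts.

    Assumes a (2 * radius + 1)^2 arrangement of candidate positions and
    one SATD evaluation per (position, weight) pair.
    """
    positions = (2 * radius + 1) ** 2
    exhaustive = positions * num_weights
    interleaved = num_weights + (positions - 1) * subset_size
    return exhaustive, interleaved


print(satd_counts(1))  # (45, 29)
print(satd_counts(2))  # (125, 77)
```

Under these assumptions the interleaved pattern approaches a 40% reduction as the arrangement grows, since each offset position evaluates three weights instead of five.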

As described above, LC-RDO 110 is a hardware circuit that can be shared by multiple coding tools, including BCW. The bi-directional blending module 319 may be used to compute the weighted average of the L0 and L1 predictions for BCW. The bi-directional SATD array 325 may be used to compute the distortion values for the different weighting parameter values at the various bi-prediction MV positions. The bi-prediction comparator 345 may be used to identify the best weighting parameter value and the best bi-prediction MV position by comparing the costs associated with the different weighting parameter values at the different candidate bi-prediction MV positions.

Example video encoder

Figure 8 illustrates an example video encoder 800 that can use an LC-RDO stage and an HC-RDO stage to select the coding tool and/or prediction candidate for coding the current block. As illustrated, the video encoder 800 receives an input video signal from a video source 805 and encodes the signal into a bitstream 895. The video encoder 800 has several components or modules for encoding the signal from the video source 805, including at least some components selected from: a transform module 810, a quantization module 811, an inverse quantization module 814, an inverse transform module 815, an intra-picture estimation module (also called intra estimation module) 820, an intra-prediction module 825, a motion compensation module 830, a motion estimation module 835, a loop filter 845, a reconstructed picture buffer 850, an MV buffer 865, an MV prediction module 875, and an entropy encoder 890. The motion compensation module 830 and the motion estimation module 835 are part of an inter-prediction module 840.

In some embodiments, the modules 810-890 are modules of software instructions executed by one or more processing units (e.g., processors) of a computing device or electronic apparatus. In some embodiments, the modules 810-890 are hardware circuit modules implemented by one or more integrated circuits (ICs) of an electronic apparatus. Although the modules 810-890 are shown as separate modules, some of them can be combined into a single module.

The video source 805 provides a raw video signal that presents the pixel data of each video frame without compression. A subtractor 808 computes the difference between the raw video pixel data of the video source 805 and the predicted pixel data 813 from the motion compensation module 830 or the intra-prediction module 825. The transform module 810 converts the difference (or the residual pixel data or residual signal 808) into transform coefficients 816 (e.g., by performing a discrete cosine transform, or DCT). The quantization module 811 quantizes the transform coefficients into quantized data (or quantized coefficients) 812, which is encoded into the bitstream 895 by the entropy encoder 890.

The inverse quantization module 814 de-quantizes the quantized data (or quantized coefficients) 812 to obtain transform coefficients, and the inverse transform module 815 performs an inverse transform on the transform coefficients to produce a reconstructed residual 819. The reconstructed residual 819 is added to the predicted pixel data 813 to produce reconstructed pixel data 817. In some embodiments, the reconstructed pixel data 817 is temporarily stored in a line buffer (not illustrated) for intra-picture prediction and spatial MV prediction. The reconstructed pixels are filtered by the loop filter 845 and stored in the reconstructed picture buffer 850. In some embodiments, the reconstructed picture buffer 850 is a memory external to the video encoder 800. In some embodiments, the reconstructed picture buffer 850 is a memory internal to the video encoder 800.
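The data flow from the subtractor through quantization and back to the reconstructed pixels can be traced on a few samples. The sketch below is deliberately simplified and rests on assumptions: the transform and inverse transform are treated as identity (no DCT is applied), the quantizer is a uniform scalar quantizer with a hypothetical step size `qstep`, and the function names are illustrative.

```python
def quantize(value, qstep):
    """Uniform scalar quantization, rounding half away from zero."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + qstep // 2) // qstep)


def encode_reconstruct(source, predicted, qstep):
    """Return (levels, reconstructed) for one row of samples.

    The transform stage is omitted here (treated as identity) for brevity.
    """
    residual = [s - p for s, p in zip(source, predicted)]       # subtractor
    levels = [quantize(r, qstep) for r in residual]              # quantization
    recon_residual = [lv * qstep for lv in levels]               # inverse quantization
    reconstructed = [p + r for p, r in zip(predicted, recon_residual)]
    return levels, reconstructed


source = [53, 59, 61, 57]
predicted = [50, 50, 60, 64]
levels, reconstructed = encode_reconstruct(source, predicted, qstep=4)
print(levels)         # [1, 2, 0, -2]
print(reconstructed)  # [54, 58, 60, 56]
```

The reconstructed samples differ slightly from the source because quantization is lossy; the encoder predicts later blocks from these reconstructed values rather than from the source, which keeps it in step with the decoder.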

The intra-picture estimation module 820 performs intra-prediction based on the reconstructed pixel data 817 to produce intra-prediction data. The intra-prediction data is provided to the entropy encoder 890 to be encoded into the bitstream 895. The intra-prediction data is also used by the intra-prediction module 825 to produce the predicted pixel data 813.

The motion estimation module 835 performs inter-prediction by producing MVs that reference the pixel data of previously decoded frames stored in the reconstructed picture buffer 850. These MVs are provided to the motion compensation module 830 to produce predicted pixel data.

Instead of encoding the complete actual MVs in the bitstream, the video encoder 800 uses MV prediction to generate predicted MVs, and the difference between the MVs used for motion compensation and the predicted MVs is encoded as residual motion data and stored in the bitstream 895.

The MV prediction module 875 generates the predicted MVs based on reference MVs that were generated for encoding previous video frames, i.e., the motion compensation MVs that were used to perform motion compensation. The MV prediction module 875 retrieves the reference MVs of previous video frames from the MV buffer 865. The video encoder 800 stores the MVs generated for the current video frame in the MV buffer 865 as reference MVs for generating predicted MVs.

The MV prediction module 875 uses the reference MVs to create the predicted MVs. The predicted MVs can be computed by spatial MV prediction or temporal MV prediction. The difference between the predicted MVs and the motion compensation MVs (MC MVs) of the current frame (the residual motion data) is encoded into the bitstream 895 by the entropy encoder 890.
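The relationship between the motion compensation MV, the predicted MV, and the residual motion data is a simple difference that the decoder inverts. The sketch below illustrates the round trip; the function names and the sample MV values are hypothetical, and how the predicted MV itself is derived (spatially or temporally) is outside its scope.

```python
def mv_difference(mc_mv, predicted_mv):
    """Residual motion data: MC MV minus predicted MV, per component."""
    return (mc_mv[0] - predicted_mv[0], mc_mv[1] - predicted_mv[1])


def reconstruct_mv(predicted_mv, mvd):
    """Decoder side: predicted MV plus the signaled difference."""
    return (predicted_mv[0] + mvd[0], predicted_mv[1] + mvd[1])


mc_mv = (18, -7)         # MV chosen by motion estimation (toy values)
predicted_mv = (16, -4)  # MV produced by MV prediction (toy values)
mvd = mv_difference(mc_mv, predicted_mv)
print(mvd)                                 # (2, -3)
print(reconstruct_mv(predicted_mv, mvd))   # (18, -7)
```

Only the small difference is entropy coded, which generally costs fewer bits than the full MV when the prediction is good.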

The entropy encoder 890 encodes various parameters and data into the bitstream 895 using entropy coding techniques such as context-adaptive binary arithmetic coding (CABAC) or Huffman coding. The entropy encoder 890 encodes various header elements and flags, along with the quantized transform coefficients 812 and the residual motion data, as syntax elements into the bitstream 895. The bitstream 895 is in turn stored in a storage device or transmitted to a decoder over a communication medium such as a network.

The loop filter 845 performs filtering or smoothing operations on the reconstructed pixel data 817 to reduce coding artifacts, particularly at the boundaries of pixel blocks. In some embodiments, the filtering operations performed include sample adaptive offset (SAO). In some embodiments, the filtering operations include an adaptive loop filter (ALF).

Figure 9 illustrates the portions of the video encoder 800 that implement LC-RDO and HC-RDO. Specifically, the figure illustrates the components of the inter-prediction module 840 of the video encoder 800. As illustrated, the inter-prediction module 840 includes LC-RDO 110 and HC-RDO 120, which perform rate-distortion optimization to identify the most suitable coding tool and/or prediction candidate.

Both RDO stages 110 and 120 receive source pixel data from the video source 805, reference pixel data from the reconstructed picture buffer 850, and reference MV data from the MV buffer 865. LC-RDO 110 uses the received data to identify prediction candidates and to compute the costs (e.g., rate and distortion values) of using the various coding tools. For each coding tool, LC-RDO 110 may identify the prediction candidate with the best cost as an intermediate RDO result for HC-RDO 120. HC-RDO 120 in turn finalizes the selection of the coding tool and prediction candidate. The motion compensation module 830 (which may be part of HC-RDO 120) uses the selected coding tool and prediction candidate to perform motion compensation and produce the predicted pixel data 813.

Figure 10 conceptually illustrates a process 1000 for finding the weighting parameter value when a block is encoded using BCW. In some embodiments, one or more processing units (e.g., processors) of a computing device implementing the encoder 800 perform the process 1000 by executing instructions stored in a computer-readable medium. In some embodiments, an electronic apparatus implementing the encoder 800, specifically the LC-RDO stage 110, performs the process 1000.

The encoder receives (at block 1010) raw pixel data for a block of pixels to be encoded into a bitstream as a current block of a current picture of a video.

The encoder identifies (at block 1020) multiple candidate bi-prediction positions, including a center position, a first set of offset positions, and a second set of offset positions. The first and second sets of offset positions are positions offset from the center position. In some embodiments, the first set of offset positions and the second set of offset positions interleave each other.

The encoder computes (at block 1030) distortion values for the center position based on each of several possible weighting parameter values. The encoder computes (at block 1040) distortion values for each of the first set of offset positions based on a first subset of the possible weighting parameter values. The encoder computes (at block 1050) distortion values for each of the second set of offset positions based on a second, different subset of the possible weighting parameter values.

In some embodiments, the encoder computes the distortion value for a weighting parameter value at a candidate bi-prediction position by using the weighting parameter value to compute a bi-prediction that is a weighted average of a first prediction based on a first motion vector and a second prediction based on a second motion vector (e.g., according to Equation 2). The first and second motion vectors are identified based on the candidate bi-prediction position. The distortion for a possible weighting parameter value at a candidate bi-prediction position is the difference between the raw pixel data of the current block and the bi-prediction computed with that weighting parameter value at that position.

In some embodiments, the several possible weighting parameter values include first, second, third, fourth, and fifth values. For the BCW coding tool, these five possible weighting parameter values correspond to -2, 3, 4, 5, and 10. The first subset of the possible weighting parameter values includes the second, third, and fifth values (3, 4, and 10). The second subset of the possible weighting parameter values includes the first, third, and fourth values (-2, 4, and 5). In some embodiments, the first and second subsets share one possible weighting parameter value (the third value, i.e., 4). The shared weighting parameter value corresponds to weighting the first and second predictions equally (according to Equation 2).

The encoder selects (at block 1060) a weighting parameter value and a candidate bi-prediction position based on the distortion values computed for the multiple candidate bi-prediction positions of the current block. The encoder encodes (at block 1070) the current block using bi-prediction based on the selected weighting parameter value and the selected candidate bi-prediction position.
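Blocks 1020 through 1060 can be tied together in a small search loop. The following is a sketch under stated assumptions rather than the encoder's actual implementation: the blend uses the VVC-style form ((8 - w) * P0 + w * P1 + 4) >> 3, distortion is a plain sum of absolute differences standing in for SATD, the assignment of subsets to interleaved sets follows offset parity, and the prediction callbacks and toy data are hypothetical.

```python
WEIGHTS = (-2, 3, 4, 5, 10)   # possible weighting parameter values
FIRST_SUBSET = (3, 4, 10)     # second, third and fifth values
SECOND_SUBSET = (-2, 4, 5)    # first, third and fourth values


def blend(p0, p1, w):
    """Assumed VVC-style weighted average of two predictions."""
    return [((8 - w) * a + w * b + 4) >> 3 for a, b in zip(p0, p1)]


def distortion(source, prediction):
    """Sum of absolute differences (a stand-in for SATD)."""
    return sum(abs(s - p) for s, p in zip(source, prediction))


def weights_for(offset):
    """Weights tried at an offset; the parity mapping is an assumption."""
    dx, dy = offset
    if (dx, dy) == (0, 0):
        return WEIGHTS                      # block 1030: all weights
    return FIRST_SUBSET if (dx + dy) % 2 else SECOND_SUBSET


def search(source, predict_l0, predict_l1, radius=1):
    """Return (distortion, offset, w) with the lowest distortion."""
    best = None
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            p0, p1 = predict_l0((dx, dy)), predict_l1((dx, dy))
            for w in weights_for((dx, dy)):              # blocks 1030-1050
                d = distortion(source, blend(p0, p1, w))
                if best is None or d < best[0]:          # block 1060
                    best = (d, (dx, dy), w)
    return best


def toy_l0(offset):
    """Toy L0 prediction that degrades with distance from offset (1, -1)."""
    return [90 + 8 * (abs(offset[0] - 1) + abs(offset[1] + 1))] * 4


def toy_l1(offset):
    """Toy L1 prediction that degrades with distance from offset (1, -1)."""
    return [106 + 8 * (abs(offset[0] - 1) + abs(offset[1] + 1))] * 4


source = [100, 100, 100, 100]
print(search(source, toy_l0, toy_l1))  # (0, (1, -1), 5)
```

In this toy case the best match is at a diagonal offset with w = 5, a combination that the second subset covers, so the reduced search finds it without evaluating all five weights at that position.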

Example electronic system

Many of the features and applications described above are implemented as software processes that are specified as a set of instructions recorded on a computer-readable storage medium (also referred to as a computer-readable medium). When these instructions are executed by one or more computational or processing units (e.g., one or more processors, cores of processors, or other processing units), they cause the processing unit(s) to perform the actions indicated in the instructions. Examples of computer-readable media include, but are not limited to, CD-ROMs, flash drives, random-access memory (RAM) chips, hard drives, erasable programmable read-only memories (EPROMs), and electrically erasable programmable read-only memories (EEPROMs). The computer-readable media do not include carrier waves and electronic signals passing wirelessly or over wired connections.

In this specification, the term "software" is meant to include firmware residing in read-only memory or applications stored in magnetic storage, which can be read into memory for processing by a processor. Also, in some embodiments, multiple software inventions can be implemented as sub-parts of a larger program while remaining distinct software inventions. In some embodiments, multiple software inventions can also be implemented as separate programs. Finally, any combination of separate programs that together implement a software invention described here is within the scope of the present disclosure. In some embodiments, the software programs, when installed to operate on one or more electronic systems, define one or more specific machine implementations that execute and perform the operations of the software programs.

Figure 11 conceptually illustrates an electronic system 1100 with which some embodiments of the present disclosure are implemented. The electronic system 1100 may be a computer (e.g., a desktop computer, personal computer, tablet computer, etc.), a phone, a PDA, or any other kind of electronic device. Such an electronic system includes various types of computer-readable media and interfaces for various other types of computer-readable media. The electronic system 1100 includes a bus 1105, processing unit(s) 1110, a graphics processing unit (GPU) 1115, a system memory 1120, a network 1125, a read-only memory 1130, a permanent storage device 1135, input devices 1140, and output devices 1145.

The bus 1105 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 1100. For instance, the bus 1105 communicatively connects the processing unit(s) 1110 with the GPU 1115, the read-only memory 1130, the system memory 1120, and the permanent storage device 1135.

From these various memory units, the processing unit(s) 1110 retrieve instructions to execute and data to process in order to execute the processes of the present disclosure. The processing unit(s) may be a single processor or a multi-core processor in different embodiments. Some instructions are passed to and executed by the GPU 1115. The GPU 1115 can offload various computations or complement the image processing provided by the processing unit(s) 1110.

The read-only memory (ROM) 1130 stores static data and instructions that are used by the processing unit(s) 1110 and other modules of the electronic system. The permanent storage device 1135, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 1100 is off. Some embodiments of the present disclosure use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 1135.

Other embodiments use a removable storage device (such as a floppy disk or flash memory device, and its corresponding disk drive) as the permanent storage device. Like the permanent storage device 1135, the system memory 1120 is a read-and-write memory device. However, unlike the storage device 1135, the system memory 1120 is a volatile read-and-write memory, such as random-access memory. The system memory 1120 stores some of the instructions and data that the processor uses at runtime. In some embodiments, processes in accordance with the present disclosure are stored in the system memory 1120, the permanent storage device 1135, and/or the read-only memory 1130. For example, the various memory units include instructions for processing multimedia clips in accordance with some embodiments. From these various memory units, the processing unit(s) 1110 retrieve instructions to execute and data to process in order to execute the processes of some embodiments.

The bus 1105 also connects to the input and output devices 1140 and 1145. The input devices 1140 enable the user to communicate information and select commands to the electronic system. The input devices 1140 include alphanumeric keyboards and pointing devices (also called "cursor control devices"), cameras (e.g., webcams), and microphones or similar devices for receiving voice commands. The output devices 1145 display images generated by the electronic system or otherwise output data. The output devices 1145 include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD), as well as speakers or similar audio output devices. Some embodiments include devices, such as a touchscreen, that function as both input and output devices.

Finally, as shown in Figure 11, the bus 1105 also couples the electronic system 1100 to a network 1125 through a network adapter (not shown). In this manner, the computer can be part of a network of computers (such as a local area network ("LAN"), a wide area network ("WAN"), or an intranet) or a network of networks (such as the Internet). Any or all components of the electronic system 1100 may be used in conjunction with the present disclosure.

Some embodiments store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as a computer-readable storage medium, machine-readable medium, or machine-readable storage medium). Some examples of such computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic and/or solid-state hard drives, read-only and recordable Blu-Ray® discs, ultra-density optical discs, any other optical or magnetic media, and floppy disks. The computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as that produced by a compiler, and files containing higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.

While the above discussion primarily refers to microprocessors or multi-core processors that execute software, many of the above-described features and applications are performed by one or more integrated circuits, such as application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself. In addition, some embodiments execute software stored in programmable logic devices (PLDs), ROM, or RAM devices.

As used in this specification and any claims of this application, the terms "computer", "server", "processor", and "memory" all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of this specification, the term "display" means displaying on an electronic device. As used in this specification and any claims of this application, the terms "computer-readable medium," "computer-readable media," and "machine-readable medium" are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.

While the present disclosure has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the present disclosure can be embodied in other specific forms without departing from its spirit. In addition, a number of the figures (including Figure 10) conceptually illustrate processes. The specific operations of these processes may not be performed in the exact order shown and described. The specific operations may not be performed in one continuous series of operations, and different specific operations may be performed in different embodiments. Furthermore, a process could be implemented using several sub-processes, or as part of a larger macro process. Thus, one of ordinary skill in the art would understand that the present disclosure is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims.

The subject matter described herein sometimes illustrates different components contained within, or connected with, different other components. It is to be understood that such depicted architectures are merely examples, and that in fact many other architectures can be implemented that achieve the same functionality. In a conceptual sense, any arrangement of components to achieve the same functionality is effectively "associated" such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality can be seen as "associated with" each other such that the desired functionality is achieved, irrespective of architectures or intermediate components. Likewise, any two components so associated can also be viewed as being "operably connected" or "operably coupled" to each other to achieve the desired functionality, and any two components capable of being so associated can also be viewed as being "operably couplable" to each other to achieve the desired functionality. Specific examples of "operably couplable" include, but are not limited to, physically mateable and/or physically interacting components, and/or wirelessly interactable and/or wirelessly interacting components, and/or logically interacting and/or logically interactable components.

Further, with respect to the use of substantially any plural and/or singular terms herein, those having ordinary skill in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application. The various singular/plural permutations may be expressly set forth herein for the sake of clarity.

Moreover, it will be understood by those having ordinary skill in the art that, in general, terms used herein, and especially in the appended claims (e.g., bodies of the appended claims), are generally intended as "open" terms (e.g., the term "including" should be interpreted as "including but not limited to," the term "having" should be interpreted as "having at least," the term "includes" should be interpreted as "includes but is not limited to," etc.). It will be further understood by those having ordinary skill in the art that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the appended claims may contain usage of the introductory phrases "at least one" and "one or more" to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles "a" or "an" limits any particular claim containing such introduced claim recitation to inventions containing only one such recitation, even when the same claim includes the introductory phrases "one or more" or "at least one" and indefinite articles such as "a" or "an" (e.g., "a" and/or "an" should generally be interpreted to mean "at least one" or "one or more"); the same holds true for the use of definite articles used to introduce claim recitations. In addition, even if a specific number of an introduced claim recitation is explicitly recited, those having ordinary skill in the art will recognize that such recitation should generally be interpreted to mean at least the recited number (e.g., the bare recitation of "two recitations," without other modifiers, generally means at least two recitations, or two or more recitations). Furthermore, in those instances where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense that one having ordinary skill in the art would understand the convention (e.g., "a system having at least one of A, B, and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). In those instances where a convention analogous to "at least one of A, B, or C, etc." is used, in general such a construction is intended in the sense that one having ordinary skill in the art would understand the convention (e.g., "a system having at least one of A, B, or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). It will be further understood by those having ordinary skill in the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" will be understood to include the possibilities of "A," "B," or "A and B."

From the foregoing, it will be appreciated that various implementations of the present disclosure have been described herein for purposes of illustration, and that various modifications may be made without departing from the scope and spirit of the present disclosure. Accordingly, the various implementations disclosed herein are not intended to be limiting, with the true scope and spirit being indicated by the appended claims.

100: video transcoder
110: LC-RDO
111~114, 200: configurations
120: HC-RDO
121~124: intermediate results
130: data cache
210: interpolation
220: distortion calculation
230: rate calculation
240: candidate comparator
301: L0 uni-prediction part
302: L0 bi-prediction part
303: L1 uni-prediction part
304: L1 bi-prediction part
310: interpolation filter
320: SATD array
331-334: rate calculators
340: local comparator
311: horizontal filter array
312: shift register
313: vertical filter array
314: interpolation buffer
315: reference picture buffer
331-335: rate calculators
341-344: comparators
319: bi-prediction blending module
325: bi-prediction SATD array
345: bi-prediction comparator
800: video encoder
805: video source
810: transform module
811: quantization module
814: inverse quantization module
815: inverse transform module
820: intra-picture estimation module
825: intra prediction module
830: motion compensation module
835: motion estimation module
840: inter prediction module
845: loop filter
850: reconstructed picture buffer
865: MV buffer
875: MV prediction module
890: entropy encoder
895: bitstream
808: residual signal
812: quantized coefficients
816: transform coefficients
813, 817: pixel data
1000: process
1010~1070: blocks
1100: electronic system
1105: bus
1110: processing unit
1115: graphics processing unit
1120: system memory
1125: network
1130: read-only memory
1135: permanent storage device
1140: input devices
1145: output devices

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention:
Figure 1 conceptually illustrates a portion of a video transcoder that uses an LC-RDO stage and an HC-RDO stage to select coding tools and/or prediction candidates for coding the current block.
Figure 2 conceptually illustrates the components of the LC-RDO.
Figure 3 illustrates an example implementation of the LC-RDO with circuitry shared by multiple different coding tools.
Figure 4 conceptually illustrates mirror matching in DMVR.
Figure 5 conceptually illustrates aligning the four AMVR precisions by aligning their center MVs.
Figure 6 illustrates various candidate modes of SBT for dividing a transform block into a coded part and a zero part.
Figure 7 conceptually illustrates the interleaved search pattern used to find the best BCW weighting parameter value.
Figure 8 illustrates an example video encoder that may use an LC-RDO stage and an HC-RDO stage to select coding tools and/or prediction candidates for coding the current block.
Figure 9 illustrates the parts of the video encoder that implement the LC-RDO and the HC-RDO.
Figure 10 conceptually illustrates a process for finding the weighting parameter value when encoding a block using BCW.
Figure 11 conceptually illustrates an electronic system with which some embodiments of the present disclosure are implemented.

Claims

A video encoding method comprising: receiving raw pixel data for a block of pixels to be encoded as a current block of a current picture of a video into a bitstream; identifying a plurality of candidate bi-prediction positions comprising a center position, a first set of offset positions, and a second set of offset positions; computing distortion values for each of the plurality of candidate bi-prediction positions based on a plurality of possible weighting parameter values, wherein: (i) the distortion values computed for the center position are based on each of the plurality of possible weighting parameter values, (ii) the distortion values computed for the first set of offset positions are based on a first subset of the plurality of possible weighting parameter values, and (iii) the distortion values computed for the second set of offset positions are based on a second subset of the plurality of possible weighting parameter values, wherein the first subset of the plurality of possible weighting parameter values is different from the second subset of the plurality of possible weighting parameter values; selecting a weighting parameter value for a candidate bi-prediction position based on the computed distortion values of the plurality of candidate bi-prediction positions of the current block; and encoding the current block using bi-prediction based on the selected weighting parameter value.

The video encoding method of claim 1, wherein the first set of offset positions and the second set of offset positions are interleaved.

The video encoding method of claim 1, wherein computing a distortion value for coding the current block according to a weighting parameter value at a candidate bi-prediction position comprises: computing a bi-prediction using the weighting parameter value, wherein the bi-prediction is an average, weighted by the weighting parameter value, of a first prediction based on a first motion vector and a second prediction based on a second motion vector, wherein the first motion vector and the second motion vector are identified based on the candidate bi-prediction position.

The video encoding method of claim 3, wherein the first subset and the second subset of the plurality of possible weighting parameter values have one possible weighting parameter value in common.

The video encoding method of claim 4, wherein computing the distortion value for coding the current block based on the common possible weighting parameter value comprises weighting the first prediction and the second prediction equally.
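As an illustration of the weighted bi-prediction that the three preceding claims describe, one familiar form is the BCW blend used in VVC. It is brought in here as an assumption; the claims themselves fix neither the formula nor the weight values. With $P_0$ and $P_1$ denoting the first and second predictions and $w$ the weighting parameter value:

$$
P_{\text{bi}} = \bigl((8 - w)\,P_0 + w\,P_1 + 4\bigr) \gg 3, \qquad w \in \{-2,\ 3,\ 4,\ 5,\ 10\}
$$

Under this assumption, $w = 4$ gives $P_{\text{bi}} = (P_0 + P_1 + 1) \gg 1$, which is the equal weighting of the two predictions.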

The video encoding method of claim 3, wherein the distortion value for a possible weighting parameter value at a candidate bi-prediction position is a difference between the raw pixel data of the current block and the bi-prediction computed using the possible weighting parameter value at the candidate bi-prediction position.

The video encoding method of claim 1, wherein: the plurality of possible weighting parameter values comprises first, second, third, fourth, and fifth values; the first subset of the plurality of possible weighting parameter values comprises the second, third, and fifth values; and the second subset of the plurality of possible weighting parameter values comprises the first, third, and fourth values.
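The subset arrangement recited above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed hardware implementation: the five weight values are the VVC BCW weights, the checkerboard rule assigning offset positions to the first or second set is hypothetical (the claims only require the two sets to be interleaved), SAD stands in for the distortion measure, and the names `weights_for`, `blend`, `sad`, and `search` are invented for this example.

```python
# Assumption: the five BCW weights of VVC (weight applied to the second
# prediction, in units of 1/8). The claims do not fix these values.
BCW_WEIGHTS = (-2, 3, 4, 5, 10)
# First subset: second, third and fifth values; second subset: first, third
# and fourth values. Both contain the third value (equal weighting).
FIRST_SUBSET = (BCW_WEIGHTS[1], BCW_WEIGHTS[2], BCW_WEIGHTS[4])
SECOND_SUBSET = (BCW_WEIGHTS[0], BCW_WEIGHTS[2], BCW_WEIGHTS[3])


def weights_for(dx, dy):
    """Weights tested at offset (dx, dy) from the center position."""
    if dx == 0 and dy == 0:
        return BCW_WEIGHTS  # the center position tries every weight
    # Hypothetical checkerboard interleave of the two offset sets.
    return FIRST_SUBSET if (dx + dy) % 2 == 0 else SECOND_SUBSET


def blend(p0, p1, w):
    """Weighted bi-prediction of two flat sample lists (VVC-style rounding)."""
    return [((8 - w) * a + w * b + 4) >> 3 for a, b in zip(p0, p1)]


def sad(raw, pred):
    """Sum of absolute differences, a simple stand-in for the distortion."""
    return sum(abs(r - p) for r, p in zip(raw, pred))


def search(raw, preds):
    """Return (distortion, position, weight) with the lowest distortion.

    `preds` maps each candidate position (dx, dy) to its (p0, p1) pair.
    """
    best = None
    for (dx, dy), (p0, p1) in preds.items():
        for w in weights_for(dx, dy):
            cost = sad(raw, blend(p0, p1, w))
            if best is None or cost < best[0]:
                best = (cost, (dx, dy), w)
    return best
```

With four offset positions around the center, this pattern evaluates 5 + 4 × 3 = 17 position/weight pairs rather than the 25 an exhaustive search would need, while the shared equal weight is still tested at every position.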

The video encoding method of claim 1, wherein the distortion values are computed by circuitry shared by multiple different coding tools.

An electronic device comprising: encoder circuitry configured to perform operations comprising: receiving raw pixel data for a block of pixels to be encoded as a current block of a current picture of a video into a bitstream; identifying a plurality of candidate bi-prediction positions comprising a center position, a first set of offset positions, and a second set of offset positions; computing distortion values for each of the plurality of candidate bi-prediction positions based on a plurality of possible weighting parameter values, wherein: (i) the distortion values computed for the center position are based on each of the plurality of possible weighting parameter values, (ii) the distortion values computed for the first set of offset positions are based on a first subset of the plurality of possible weighting parameter values, and (iii) the distortion values computed for the second set of offset positions are based on a second subset of the plurality of possible weighting parameter values, wherein the first subset of the plurality of possible weighting parameter values is different from the second subset of the plurality of possible weighting parameter values; selecting a weighting parameter value for a candidate bi-prediction position based on the computed distortion values of the plurality of candidate bi-prediction positions of the current block; and encoding the current block using bi-prediction based on the selected weighting parameter value.