TW202013960A - Shape dependent interpolation order - Google Patents

Shape dependent interpolation order

Info

Publication number
TW202013960A
TW202013960A (application number TW108124952A)
Authority
TW
Taiwan
Prior art keywords
interpolation
video block
video
block
prediction
Prior art date
Application number
TW108124952A
Other languages
Chinese (zh)
Other versions
TWI704799B
Inventor
劉鴻彬
張莉
張凱
王悅
Original Assignee
大陸商北京字節跳動網絡技術有限公司
大陸商字節跳動有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 大陸商北京字節跳動網絡技術有限公司 and 大陸商字節跳動有限公司
Publication of TW202013960A
Application granted
Publication of TWI704799B

Classifications

    • H (Electricity) › H04 (Electric communication technique) › H04N (Pictorial communication, e.g. television) › H04N19/00 (Methods or arrangements for coding, decoding, compressing or decompressing digital video signals)
    • H04N19/117: Filters, e.g. for pre-processing or post-processing
    • H04N19/157: Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/176: Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/186: Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/513: Processing of motion vectors (predictive coding involving temporal prediction, motion estimation or motion compensation)
    • H04N19/82: Details of filtering operations specially adapted for video compression, involving filtering within a prediction loop

Abstract

The application provides a video processing method, comprising: determining a first prediction mode applied to a first video block; performing a first conversion between the first video block and a coded representation of the first video block by applying horizontal interpolation and/or vertical interpolation to the first video block; determining a second prediction mode applied to a second video block; and performing a second conversion between the second video block and a coded representation of the second video block by applying horizontal interpolation and/or vertical interpolation to the second video block, wherein, based on the determination that the first prediction mode is a multi-hypothesis prediction mode and the second prediction mode is not a multi-hypothesis prediction mode, one or both of the horizontal interpolation and the vertical interpolation for the first video block use a shorter-tap filter compared to that used for the second video block.

Description

Shape dependent interpolation order

This document relates to video coding technologies, devices, and systems. [Cross-reference to related applications] Under the applicable Patent Law and/or the Paris Convention, this application timely claims the priority and benefit of International Patent Application No. PCT/CN2018/095576, filed on July 13, 2018. The entire disclosure of International Patent Application No. PCT/CN2018/095576 is incorporated herein by reference as part of this disclosure.

Despite advances in video compression, digital video still accounts for the largest share of bandwidth use on the Internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video grows, the bandwidth demand for digital video usage is expected to continue to increase.

The disclosed techniques may be used by video decoder or encoder embodiments in which a block-shape-dependent interpolation order is used to improve interpolation.

In one example aspect, a video bitstream processing method is disclosed. The method includes: determining a shape of a video block; determining, based on the shape of the video block, an interpolation order that indicates the sequence in which horizontal interpolation and vertical interpolation are performed; and performing horizontal interpolation and vertical interpolation on the video block in the sequence indicated by the interpolation order to reconstruct a decoded representation of the video block.

In another example aspect, a video bitstream processing method includes: determining a characteristic of a motion vector related to a video block; determining, based on the characteristic of the motion vector, an interpolation order that indicates the sequence in which horizontal interpolation and vertical interpolation are performed; and performing horizontal interpolation and vertical interpolation on the video block in the sequence indicated by the interpolation order to reconstruct a decoded representation of the video block.

In another example aspect, a video bitstream processing method is disclosed. The method includes: determining a shape of a video block; determining, based on the shape of the video block, an interpolation order that indicates the sequence in which horizontal interpolation and vertical interpolation are performed; and performing horizontal interpolation and vertical interpolation on the video block in the sequence indicated by the interpolation order to construct a coded representation of the video block.

In another example aspect, a video bitstream processing method is disclosed. The method includes: determining a characteristic of a motion vector related to a video block; determining, based on the characteristic of the motion vector, an interpolation order that indicates the sequence in which horizontal interpolation and vertical interpolation are performed; and performing horizontal interpolation and vertical interpolation on the video block in the sequence indicated by the interpolation order to construct a coded representation of the video block.
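The shape-dependent ordering in the aspects above can be illustrated with a small sketch. This is not the patent's normative procedure: the rule of picking the order that minimizes the intermediate buffer of a separable two-pass interpolation, and the 8-tap filter length, are assumptions chosen only to show why the order of the two passes matters for a non-square block.

```python
def intermediate_samples(width, height, taps, horizontal_first):
    """Number of intermediate samples produced by the first 1-D pass of a
    separable 2-D interpolation with a `taps`-tap filter: the first pass
    must also cover the extra rows (or columns) the second pass needs."""
    if horizontal_first:
        return (height + taps - 1) * width
    return (width + taps - 1) * height

def interpolation_order(width, height, taps=8):
    """Hypothetical rule: pick the order with the smaller intermediate
    buffer, preferring horizontal-first on ties."""
    if intermediate_samples(width, height, taps, True) <= \
       intermediate_samples(width, height, taps, False):
        return ("horizontal", "vertical")
    return ("vertical", "horizontal")
```

For a wide 16×4 block the vertical pass first produces 23×4 = 92 intermediate samples versus 11×16 = 176 the other way around, so block shape alone determines the cheaper order.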

In one example aspect, a video processing method is disclosed. The method includes: determining a first prediction mode applied to a first video block; performing a first conversion between the first video block and a coded representation of the first video block by applying horizontal interpolation and/or vertical interpolation to the first video block; determining a second prediction mode applied to a second video block; and performing a second conversion between the second video block and a coded representation of the second video block by applying horizontal interpolation and/or vertical interpolation to the second video block, wherein, based on the determination that the first prediction mode is a multi-hypothesis prediction mode and the second prediction mode is not a multi-hypothesis prediction mode, one or both of the horizontal interpolation and the vertical interpolation for the first video block use a shorter-tap filter compared to the filter used for the second video block.
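The tap-length rule of this aspect reduces to a small selector. The concrete tap counts below (8 as the regular length, 4 as the shorter one) are illustrative assumptions; the aspect only requires that the multi-hypothesis case use a shorter filter than the non-multi-hypothesis case.

```python
def interp_filter_taps(is_multi_hypothesis, regular_taps=8, short_taps=4):
    """Return the interpolation filter length for a block: a block coded
    with multi-hypothesis prediction gets the shorter-tap filter, while
    other prediction modes keep the regular-length filter."""
    return short_taps if is_multi_hypothesis else regular_taps
```

A shorter filter reduces both the multiplications per interpolated sample and the reference-sample footprint, which offsets the extra prediction hypotheses.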

In another example aspect, a video decoding apparatus that implements a video processing method described herein is disclosed.

In yet another example aspect, a video encoding apparatus that implements a video processing method described herein is disclosed.

In yet another representative aspect, the various techniques described herein may be embodied as a computer program product stored on a non-transitory computer-readable medium. The computer program product includes program code for carrying out the methods described herein.

In yet another example aspect, an apparatus in a video system is disclosed. The apparatus includes a processor and a non-transitory memory with instructions thereon, wherein the instructions, upon execution by the processor, cause the processor to implement the above methods.

The details of one or more implementations are set forth in the accompanying attachments, the drawings, and the description below. Other features will be apparent from the description and drawings, and from the claims.

This document provides various techniques that can be used by a decoder of a video bitstream to improve the quality of decompressed or decoded digital video. Furthermore, a video encoder may also implement these techniques during the process of encoding in order to reconstruct the decoded frames used for further encoding.

Section headings are used in this document to ease understanding, and they do not limit the embodiments and techniques to the corresponding sections. As such, embodiments from one section can be combined with embodiments from other sections.

1. Summary

This invention relates to video coding technologies, and specifically to interpolation in video coding. It may be applied to existing video coding standards such as HEVC, or to a standard to be finalized (Versatile Video Coding). It may also be applicable to future video coding standards or video codecs.

2. Background

Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. The ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/HEVC standards. Since H.262, video coding standards have been based on a hybrid video coding structure in which temporal prediction plus transform coding is utilized. To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded jointly by VCEG and MPEG in 2015. Since then, many new methods have been adopted by JVET and incorporated into reference software named the Joint Exploration Model (JEM). In April 2018, the Joint Video Experts Team (JVET) between VCEG (Q6/16) and ISO/IEC JTC1 SC29/WG11 (MPEG) was created to work on the VVC standard, targeting a 50% bitrate reduction compared to HEVC.

Figure 18 is a block diagram of an example implementation of a video encoder.

2.1 Quadtree plus binary tree (QTBT) block structure with larger CTUs

In HEVC, a CTU is split into CUs by using a quadtree structure, denoted as a coding tree, to adapt to various local characteristics. The decision whether to code a picture area using inter-picture (temporal) or intra-picture (spatial) prediction is made at the CU level. Each CU can be further split into one, two, or four PUs according to the PU splitting type. Inside one PU, the same prediction process is applied, and the relevant information is transmitted to the decoder on a PU basis. After obtaining the residual block by applying the prediction process based on the PU splitting type, a CU can be partitioned into transform units (TUs) according to another quadtree structure similar to the coding tree of the CU. One key feature of the HEVC structure is that it has multiple partition concepts, including CU, PU, and TU.

The QTBT structure removes the concept of multiple partition types; that is, it removes the separation of the CU, PU, and TU concepts, and supports more flexibility in CU partition shapes. In the QTBT block structure, a CU can have either a square or a rectangular shape. As shown in Figure 1, a coding tree unit (CTU) is first partitioned by a quadtree structure. The quadtree leaf nodes are further partitioned by a binary tree structure. There are two splitting types in the binary tree splitting: symmetric horizontal splitting and symmetric vertical splitting. The binary tree leaf nodes are called coding units (CUs), and this partition is used for prediction and transform processing without any further splitting. This means that the CU, PU, and TU have the same block size in the QTBT coding block structure. In JEM, a CU sometimes consists of coding blocks (CBs) of different color components; for example, one CU contains one luma CB and two chroma CBs in the case of P slices and B slices of the 4:2:0 chroma format, and a CU sometimes consists of a CB of a single component; for example, one CU contains only one luma CB or only two chroma CBs in the case of I slices.

The following parameters are defined for the QTBT partitioning scheme.

– CTU size: the root node size of a quadtree, the same concept as in HEVC.

– MinQTSize: the minimum allowed quadtree leaf node size

– MaxBTSize: the maximum allowed binary tree root node size

– MaxBTDepth: the maximum allowed binary tree depth

– MinBTSize: the minimum allowed binary tree leaf node size

In one example of the QTBT partitioning structure, the CTU size is set to 128×128 luma samples with two corresponding 64×64 blocks of chroma samples, MinQTSize is set to 16×16, MaxBTSize is set to 64×64, MinBTSize (for both width and height) is set to 4×4, and MaxBTDepth is set to 4. The quadtree partitioning is applied to the CTU first to generate quadtree leaf nodes. The quadtree leaf nodes may have a size from 16×16 (i.e., MinQTSize) to 128×128 (i.e., the CTU size). If a leaf quadtree node is 128×128, it will not be further split by the binary tree, since its size exceeds MaxBTSize (i.e., 64×64). Otherwise, the leaf quadtree node can be further partitioned by the binary tree. Therefore, a quadtree leaf node is also the root node of a binary tree, and its binary tree depth is 0. When the binary tree depth reaches MaxBTDepth (i.e., 4), no further splitting is considered. When a binary tree node has a width equal to MinBTSize (i.e., 4), no further horizontal splitting is considered. Similarly, when a binary tree node has a height equal to MinBTSize, no further vertical splitting is considered. The leaf nodes of the binary tree are further processed by prediction and transform processing without any further partitioning. In JEM, the maximum CTU size is 256×256 luma samples.
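The size and depth constraints in this example can be collected into a small checker. A sketch only: the split names (`split_width`, `split_height`) and the treatment of the constraints are my own framing of the paragraph above, with the example parameter values MaxBTSize=64, MaxBTDepth=4, MinBTSize=4 as defaults.

```python
def allowed_bt_splits(width, height, bt_depth,
                      max_bt_size=64, max_bt_depth=4, min_bt_size=4):
    """Binary-tree splits permitted for a node under the example QTBT
    parameters: a node larger than MaxBTSize never enters the binary tree,
    a node at MaxBTDepth stops, and a dimension already at MinBTSize
    cannot be halved further."""
    if max(width, height) > max_bt_size:
        return []  # e.g. a 128x128 quadtree leaf: exceeds MaxBTSize
    if bt_depth >= max_bt_depth:
        return []  # maximum binary-tree depth reached
    splits = []
    if width > min_bt_size:
        splits.append("split_width")   # halve the width
    if height > min_bt_size:
        splits.append("split_height")  # halve the height
    return splits
```

So a 128×128 quadtree leaf stops immediately, while a 4×8 node at depth 1 can only be split in the direction that halves its height.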

Figure 1 (left) illustrates an example of block partitioning by using QTBT, and Figure 1 (right) illustrates the corresponding tree representation. The solid lines indicate quadtree splitting and the dashed lines indicate binary tree splitting. In each splitting (i.e., non-leaf) node of the binary tree, one flag is signaled to indicate which splitting type (i.e., horizontal or vertical) is used, where 0 indicates horizontal splitting and 1 indicates vertical splitting. For the quadtree splitting, there is no need to indicate the splitting type, since quadtree splitting always splits a block both horizontally and vertically to produce four sub-blocks of equal size.

In addition, the QTBT scheme supports the ability for luma and chroma to have separate QTBT structures. Currently, for P slices and B slices, the luma and chroma CTBs in one CTU share the same QTBT structure. However, for I slices, the luma CTB is partitioned into CUs by a QTBT structure, and the chroma CTBs are partitioned into chroma CUs by another QTBT structure. This means that a CU in an I slice consists of a coding block of the luma component or coding blocks of the two chroma components, and a CU in a P slice or B slice consists of coding blocks of all three color components.

In HEVC, inter prediction for small blocks is restricted to reduce the memory access of motion compensation, such that bi-prediction is not supported for 4×8 and 8×4 blocks, and inter prediction is not supported for 4×4 blocks. In the QTBT of JEM, these restrictions are removed.

2.2 Inter prediction in HEVC/H.265

Each inter-predicted PU has motion parameters for one or two reference picture lists. Motion parameters include a motion vector and a reference picture index. The usage of one of the two reference picture lists may also be signaled using inter_pred_idc. Motion vectors may be explicitly coded as deltas relative to predictors.

When a CU is coded with skip mode, one PU is associated with the CU, and there are no significant residual coefficients, no coded motion vector delta, and no reference picture index. A Merge mode is specified whereby the motion parameters for the current PU are obtained from neighboring PUs, including spatial and temporal candidates. The Merge mode can be applied to any inter-predicted PU, not only to skip mode. The alternative to Merge mode is the explicit transmission of motion parameters, where the motion vector (to be more precise, the motion vector difference compared to a motion vector predictor), the corresponding reference picture index for each reference picture list, and the reference picture list usage are signaled explicitly per each PU. In this document, this mode is named advanced motion vector prediction (AMVP).

When signaling indicates that one of the two reference picture lists is to be used, the PU is produced from one block of samples. This is referred to as "uni-prediction". Uni-prediction is available for both P slices and B slices.

When signaling indicates that both of the reference picture lists are to be used, the PU is produced from two blocks of samples. This is referred to as "bi-prediction". Bi-prediction is available for B slices only.

The following text provides details on the inter prediction modes specified in HEVC. The description will start with the Merge mode.

2.2.1 Merge mode

2.2.1.1 Derivation of candidates for Merge mode

When a PU is predicted using Merge mode, an index pointing to an entry in the Merge candidate list is parsed from the bitstream and used to retrieve the motion information. The construction of this list is specified in the HEVC standard and can be summarized according to the following sequence of steps:

Step 1: Initial candidate derivation

Step 1.1: Spatial candidate derivation

Step 1.2: Redundancy check for spatial candidates

Step 1.3: Temporal candidate derivation

Step 2: Additional candidate insertion

Step 2.1: Creation of bi-predictive candidates

Step 2.2: Insertion of zero motion candidates

These steps are also schematically depicted in Figure 2. For spatial Merge candidate derivation, a maximum of four Merge candidates are selected among candidates located in five different positions. For temporal Merge candidate derivation, a maximum of one Merge candidate is selected from two candidates. Since a constant number of candidates for each PU is assumed at the decoder, additional candidates are generated when the number of candidates obtained from Step 1 does not reach the maximum number of Merge candidates (MaxNumMergeCand) signaled in the slice header. Since the number of candidates is constant, the index of the best Merge candidate is encoded using truncated unary binarization (TU). If the size of a CU is equal to 8, all PUs of the current CU share a single Merge candidate list, which is identical to the Merge candidate list of the 2N×2N prediction unit.
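The list-construction steps above can be sketched as a skeleton. This is a deliberate simplification: candidates are modeled as hashable motion-info tuples, the standard's selective pair-wise redundancy check is replaced by a full duplicate check, and Step 2.1 (combined bi-predictive candidates) is omitted for brevity.

```python
def build_merge_list(spatial_cands, temporal_cands, max_num_merge_cand=5):
    """Assemble a Merge candidate list following the step order above."""
    merge_list = []
    # Step 1.1 / 1.2: up to four spatial candidates, dropping duplicates.
    for cand in spatial_cands:
        if len(merge_list) == 4:
            break
        if cand not in merge_list:
            merge_list.append(cand)
    # Step 1.3: at most one temporal candidate.
    if temporal_cands and len(merge_list) < max_num_merge_cand:
        merge_list.append(temporal_cands[0])
    # Step 2.2: pad with zero-motion candidates; the reference picture
    # index starts at zero and increases with each new zero candidate.
    ref_idx = 0
    while len(merge_list) < max_num_merge_cand:
        merge_list.append(((0, 0), ref_idx))
        ref_idx += 1
    return merge_list
```

Because the padded list always reaches MaxNumMergeCand entries, the decoder can parse the Merge index with a fixed-size binarization regardless of how many real candidates were found.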

The operations associated with the aforementioned steps are described in detail below.

2.2.1.2 Spatial candidate derivation

In the derivation of spatial Merge candidates, a maximum of four Merge candidates are selected among candidates located in the positions depicted in Figure 3. The order of derivation is A1, B1, B0, A0, and B2. Position B2 is considered only when any PU of positions A1, B1, B0, A0 is not available (e.g., because it belongs to another slice or tile) or is intra-coded. After the candidate at position A1 is added, the addition of the remaining candidates is subject to a redundancy check, which ensures that candidates with the same motion information are excluded from the list, so that coding efficiency is improved. To reduce computational complexity, not all possible candidate pairs are considered in the mentioned redundancy check. Instead, only the pairs linked with an arrow in Figure 4 are considered, and a candidate is only added to the list if the corresponding candidate used for the redundancy check does not have the same motion information. Another source of duplicate motion information is the "second PU" associated with partitions different from 2N×2N. As an example, Figure 5 depicts the second PU for the cases of N×2N and 2N×N, respectively. When the current PU is partitioned as N×2N, the candidate at position A1 is not considered for list construction. In some embodiments, adding this candidate may lead to two prediction units having the same motion information, which is redundant to having just one PU inside a coding unit. Similarly, position B1 is not considered when the current PU is partitioned as 2N×N.

2.2.1.3 Temporal candidate derivation

In this step, only one candidate is added to the list. Particularly, in the derivation of this temporal Merge candidate, a scaled motion vector is derived based on the co-located PU belonging to the picture that has the smallest picture order count (POC) difference with the current picture within the given reference picture list. The reference picture list to be used for the derivation of the co-located PU is explicitly signaled in the slice header. The dashed line in Figure 6 illustrates the derivation of the scaled motion vector for the temporal Merge candidate, which is scaled from the motion vector of the co-located PU using the POC distances tb and td, where tb is defined as the POC difference between the reference picture of the current picture and the current picture, and td is defined as the POC difference between the reference picture of the co-located picture and the co-located picture. The reference picture index of the temporal Merge candidate is set equal to zero. A practical realization of the scaling process is described in the HEVC specification. For a B slice, two motion vectors, one for reference picture list 0 and the other for reference picture list 1, are obtained and combined to make the bi-predictive Merge candidate.

Figure 6 is an illustration of motion vector scaling for the temporal Merge candidate.

In the co-located PU (Y) belonging to the reference frame, the position for the temporal candidate is selected between candidates C0 and C1, as depicted in Figure 7. If the PU at position C0 is not available, is intra-coded, or is outside the current CTU row, position C1 is used. Otherwise, position C0 is used in the derivation of the temporal Merge candidate.
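The tb/td scaling described above amounts to multiplying the co-located motion vector by the ratio of the two POC distances. The sketch below uses a simplified floating-point computation with round-half-away-from-zero; the actual HEVC process uses a fixed-point approximation with clipping, as the text notes.

```python
def scale_temporal_mv(mv, tb, td):
    """Scale one component of the co-located PU's motion vector by the
    POC-distance ratio tb/td (simplified; not the fixed-point formula
    of the HEVC specification)."""
    scaled = mv * tb / td
    # Round half away from zero, as motion vectors can be negative.
    return int(scaled + 0.5) if scaled >= 0 else int(scaled - 0.5)
```

For example, a co-located vector of 8 with tb = 2 and td = 4 yields a scaled vector of 4, i.e. half the motion for half the temporal distance.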

2.2.1.4 Additional candidate insertion

Besides spatial and temporal Merge candidates, there are two additional types of Merge candidates: the combined bi-predictive Merge candidate and the zero Merge candidate. Combined bi-predictive Merge candidates are generated by utilizing spatial and temporal Merge candidates, and are used for B slices only. A combined bi-predictive candidate is generated by combining the first-reference-picture-list motion parameters of an initial candidate with the second-reference-picture-list motion parameters of another candidate. If these two tuples provide different motion hypotheses, they form a new bi-predictive candidate. As an example, FIG. 8 shows the case where two candidates in the original list (on the left), which have mvL0 and refIdxL0 or mvL1 and refIdxL1, are used to create a combined bi-predictive Merge candidate that is added to the final list (on the right). Numerous rules regarding the combinations to be considered for generating these additional Merge candidates are defined in the prior art.

Zero motion candidates are inserted to fill the remaining entries in the Merge candidate list and thereby reach the MaxNumMergeCand capacity. These candidates have zero spatial displacement and a reference picture index that starts from zero and is increased every time a new zero motion candidate is added to the list. The number of reference frames used by these candidates is one for uni-prediction and two for bi-prediction, respectively. Finally, no redundancy check is performed on these candidates.
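The filling step above can be sketched as follows. This is an illustrative sketch, not the normative HEVC process; the dictionary representation of a candidate and the function name are our own assumptions.

```python
def fill_zero_merge_candidates(merge_list, max_num_merge_cand, num_ref_idx, is_b_slice):
    """Append zero-motion candidates until the list reaches MaxNumMergeCand.

    Each new candidate has a zero motion vector and a reference picture index
    that starts at 0 and increases with every inserted candidate, capped at the
    number of available reference pictures. No redundancy check is applied.
    """
    ref_idx = 0
    while len(merge_list) < max_num_merge_cand:
        mv = (0, 0)
        if is_b_slice:  # bi-prediction: one zero MV per reference picture list
            merge_list.append({"mvL0": mv, "mvL1": mv, "refIdx": ref_idx})
        else:           # uni-prediction: a single reference frame
            merge_list.append({"mvL0": mv, "refIdx": ref_idx})
        if ref_idx < num_ref_idx - 1:
            ref_idx += 1
    return merge_list
```

For example, with two reference pictures and a list capacity of five, the inserted candidates use reference indices 0, 1, 1, 1, 1.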

2.2.1.5 Motion estimation regions for parallel processing

To speed up the encoding process, motion estimation can be performed in parallel, whereby the motion vectors for all prediction units inside a given region are derived simultaneously. Deriving Merge candidates from the spatial neighborhood may interfere with parallel processing, because one prediction unit cannot derive the motion parameters from an adjacent PU until its associated motion estimation is completed. To mitigate the trade-off between coding efficiency and processing latency, HEVC defines a motion estimation region (MER), whose size is signaled in the picture parameter set using the syntax element "log2_parallel_merge_level_minus2". When an MER is defined, Merge candidates falling into the same region are marked as unavailable and are therefore not considered in the list construction.

2.2.2 AMVP

AMVP exploits the spatio-temporal correlation of a motion vector with neighboring PUs, which is used for the explicit transmission of motion parameters. For each reference picture list, a motion vector candidate list is first constructed by checking the availability of left and above temporally neighboring PU positions, removing redundant candidates, and adding a zero vector to make the candidate list a constant length. The encoder can then select the best predictor from the candidate list and transmit the corresponding index indicating the chosen candidate. Similarly to Merge index signaling, the index of the best motion vector candidate is encoded using a truncated unary code. The maximum value to be encoded in this case is 2 (see FIG. 9). In the following sections, the derivation process of the motion vector prediction candidates is described in detail.

2.2.2.1 Derivation of AMVP candidates

FIG. 9 summarizes the derivation process for the motion vector prediction candidates.

In motion vector prediction, two types of motion vector candidates are considered: spatial motion vector candidates and temporal motion vector candidates. For the derivation of the spatial motion vector candidates, two motion vector candidates are eventually derived based on the motion vectors of each PU located at the five different positions shown in FIG. 3.

For the derivation of the temporal motion vector candidates, one motion vector candidate is selected from two candidates, which are derived based on two different co-located positions. After the first list of spatio-temporal candidates is made, duplicated motion vector candidates in the list are removed. If the number of potential candidates is larger than two, motion vector candidates whose reference picture index within the associated reference picture list is larger than 1 are removed from the list. If the number of spatio-temporal motion vector candidates is smaller than two, additional zero motion vector candidates are added to the list.

2.2.2.2 Spatial motion vector candidates

In the derivation of the spatial motion vector candidates, at most two candidates are considered among five potential candidates, which are derived from PUs located at the positions depicted in FIG. 3; those positions are the same as those of motion Merge. The order of derivation for the left side of the current PU is defined as A0, A1, scaled A0, scaled A1. The order of derivation for the above side of the current PU is defined as B0, B1, B2, scaled B0, scaled B1, scaled B2. For each side there are therefore four cases that can be used as a motion vector candidate, with two cases not requiring spatial scaling and two cases where spatial scaling is used. The four different cases are summarized as follows:

-- No spatial scaling

(1) Same reference picture list, and same reference picture index (same POC)

(2) Different reference picture list, but same reference picture index (same POC)

-- Spatial scaling

(3) Same reference picture list, but different reference picture index (different POC)

(4) Different reference picture list, and different reference picture index (different POC)

The no-spatial-scaling cases are checked first, followed by the spatial-scaling cases. Spatial scaling is considered whenever the POC differs between the reference picture of the neighboring PU and that of the current PU, regardless of the reference picture list. If all PUs of the left candidates are not available or are intra coded, scaling for the above motion vector is allowed to help the parallel derivation of the left and above MV candidates. Otherwise, spatial scaling is not allowed for the above motion vector.
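The four cases above can be expressed as a small classification helper. This is an illustrative sketch; the function name and its POC-based arguments are our own simplification of the normative availability checks.

```python
def spatial_candidate_case(same_ref_list, neighbor_ref_poc, current_ref_poc):
    """Map a neighboring spatial candidate to one of the four cases above.

    Cases 1 and 2 (same POC) need no spatial scaling; cases 3 and 4
    (different POC) need spatial scaling, independent of which reference
    picture list the candidate comes from.
    """
    same_poc = neighbor_ref_poc == current_ref_poc
    if same_poc:
        return 1 if same_ref_list else 2   # no spatial scaling
    return 3 if same_ref_list else 4       # spatial scaling required
```

Note that whether scaling is needed depends only on the POC comparison, mirroring the statement that the reference picture list is not considered for that decision.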

FIG. 10 is an illustration of motion vector scaling for spatial motion vector candidates.

In a spatial scaling process, the motion vector of the neighboring PU is scaled in a similar manner as for temporal scaling, as depicted in FIG. 10. The main difference is that the reference picture list and index of the current PU are given as input; the actual scaling process is the same as that of temporal scaling.

2.2.2.3 Temporal motion vector candidates

Apart from the derivation of the reference picture index, the whole derivation process for a temporal Merge candidate is the same as for the derivation of spatial motion vector candidates (see FIG. 7). The reference picture index is signaled to the decoder.

2.3 New inter Merge candidates in JEM

2.3.1 Sub-CU based motion vector prediction

In the JEM with QTBT, each CU can have at most one set of motion parameters for each prediction direction. Two sub-CU level motion vector prediction methods are considered in the encoder by splitting a large CU into sub-CUs and deriving the motion information for all the sub-CUs of the large CU. The alternative temporal motion vector prediction (ATMVP) method allows each CU to fetch multiple sets of motion information from multiple blocks smaller than the current CU in the co-located reference picture. In the spatial-temporal motion vector prediction (STMVP) method, the motion vectors of the sub-CUs are derived recursively by using the temporal motion vector predictor and the spatially neighboring motion vectors.

To preserve a more accurate motion field for sub-CU motion prediction, the motion compression for the reference frames is currently disabled.

2.3.1.1 Alternative temporal motion vector prediction

In the alternative temporal motion vector prediction (ATMVP) method, temporal motion vector prediction (TMVP) is modified by fetching multiple sets of motion information (including motion vectors and reference indices) from blocks smaller than the current CU. As shown in FIG. 11, the sub-CUs are square N×N blocks (N is set to 4 by default).

ATMVP predicts the motion vectors of the sub-CUs within a CU in two steps. The first step is to identify the corresponding block in a reference picture with a so-called temporal vector. The reference picture is called the motion source picture. The second step is to split the current CU into sub-CUs and to obtain the motion vectors as well as the reference indices of each sub-CU from the block corresponding to that sub-CU, as shown in FIG. 11.

In the first step, the reference picture and the corresponding block are determined by the motion information of the spatially neighboring blocks of the current CU. To avoid a repetitive scanning process over the neighboring blocks, the first Merge candidate in the Merge candidate list of the current CU is used. The first available motion vector and its associated reference index are set to be the temporal vector and the index of the motion source picture. This way, the corresponding block can be identified more accurately in ATMVP than in TMVP, where the corresponding block (sometimes called the co-located block) is always located at the bottom-right or center position relative to the current CU.

In the second step, the corresponding block of each sub-CU is identified by the temporal vector in the motion source picture, by adding the temporal vector to the coordinates of the current CU. For each sub-CU, the motion information of its corresponding block (the smallest motion grid that covers the center sample) is used to derive the motion information for the sub-CU. After the motion information of a corresponding N×N block is identified, it is converted to the motion vectors and reference indices of the current sub-CU, in the same way as the TMVP method of HEVC, wherein motion scaling and other procedures apply. For example, the decoder checks whether the low-delay condition is fulfilled (i.e., the POCs of all reference pictures of the current picture are smaller than the POC of the current picture) and possibly uses the motion vector MVx (the motion vector corresponding to reference picture list X) to predict the motion vector MVy for each sub-CU (with X being equal to 0 or 1 and Y being equal to 1−X).

2.3.1.2 Spatial-temporal motion vector prediction

In this method, the motion vectors of the sub-CUs are derived recursively, following raster-scan order. FIG. 12 illustrates this concept. Consider an 8×8 CU that contains four 4×4 sub-CUs A, B, C, and D. The neighboring 4×4 blocks in the current frame are labeled a, b, c, and d.

The motion derivation for sub-CU A starts by identifying its two spatial neighbors. The first neighbor is the N×N block above sub-CU A (block c). If block c is not available or is intra coded, the other N×N blocks above sub-CU A are checked (from left to right, starting at block c). The second neighbor is a block to the left of sub-CU A (block b). If block b is not available or is intra coded, the other blocks to the left of sub-CU A are checked (from top to bottom, starting at block b). The motion information obtained from the neighboring blocks for each list is scaled to the first reference frame for the given list. Next, the temporal motion vector predictor (TMVP) of sub-block A is derived by following the same procedure for TMVP derivation as specified in HEVC: the motion information of the co-located block at position D is fetched and scaled accordingly. Finally, after retrieving and scaling the motion information, all available motion vectors (up to 3) are averaged separately for each reference list. The averaged motion vector is assigned as the motion vector of the current sub-CU.
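The final averaging step can be sketched as follows. This is illustrative only; the exact integer rounding used by the JEM implementation is an assumption here, as is the tuple representation of a motion vector.

```python
def stmvp_average(motion_vectors):
    """Average all available motion vectors (up to 3: the above neighbor,
    the left neighbor, and the TMVP) for one reference list of a sub-CU.

    Assumes each motion vector is an (x, y) tuple of integers that have
    already been scaled to the first reference frame of the list.
    """
    assert 1 <= len(motion_vectors) <= 3
    n = len(motion_vectors)
    sum_x = sum(mv[0] for mv in motion_vectors)
    sum_y = sum(mv[1] for mv in motion_vectors)
    # Simple rounded division; the normative rounding rule is an assumption.
    return (round(sum_x / n), round(sum_y / n))
```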

2.3.1.3 Sub-CU motion prediction mode signaling

The sub-CU modes are enabled as additional Merge candidates, and no additional syntax element is required to signal the modes. Two additional Merge candidates are added to the Merge candidate list of each CU to represent the ATMVP mode and the STMVP mode. Up to seven Merge candidates are used if the sequence parameter set indicates that ATMVP and STMVP are enabled. The encoding logic of the additional Merge candidates is the same as for the Merge candidates in the HM, which means that, for each CU in a P or B slice, two more RD checks are needed for the two additional Merge candidates.

In the JEM, all bins of the Merge index are context coded by CABAC, while in HEVC only the first bin is context coded and the remaining bins are bypass coded.

2.3.2 Non-adjacent Merge candidates

In J0021, Qualcomm proposed to derive additional spatial Merge candidates from non-adjacent neighboring positions, marked as 6 to 49 in FIG. 13. The derived candidates are added after the TMVP candidate in the Merge candidate list.

In J0058, Tencent proposed to derive additional spatial Merge candidates from positions in an outer reference area that has an offset of (-96, -96) relative to the current block.

As shown in FIG. 14, the positions are marked as A(i, j), B(i, j), C(i, j), D(i, j), and E(i, j). Each candidate B(i, j) or C(i, j) has an offset of 16 in the vertical direction compared with its previous B or C candidate. Each candidate A(i, j) or D(i, j) has an offset of 16 in the horizontal direction compared with its previous A or D candidate. Each E(i, j) has an offset of 16 in both the horizontal and the vertical direction compared with its previous E candidate. The candidates are checked from the inside to the outside, and the order of the candidates is A(i, j), B(i, j), C(i, j), D(i, j), E(i, j). Whether the number of Merge candidates can be reduced further is under study. The candidates are added after the TMVP candidate in the Merge candidate list.

In J0059, the extended spatial positions from 6 to 27 in FIG. 15 are checked according to their numerical order, after the temporal candidate. To save the MV line buffer, all the spatial candidates are restricted to within two CTU rows.

2.4 Intra prediction in JEM

2.4.1 Intra mode coding with 67 intra prediction modes

For luma interpolation filtering, an 8-tap separable DCT-based interpolation filter is used for the 2/4-precision samples and a 7-tap separable DCT-based interpolation filter is used for the 1/4-precision samples, as shown in Table 1.

Table 1: 8-tap DCT-IF coefficients for 1/4 luma interpolation.

Figure 108124952-A0304-0001

Similarly, a 4-tap separable DCT-based interpolation filter is used for the chroma interpolation filter, as shown in Table 2.

Table 2: 4-tap DCT-IF coefficients for 1/8 chroma interpolation.

Figure 108124952-A0304-0002

For the vertical interpolation in the 4:2:2 format and the horizontal and vertical interpolation of the 4:4:4 chroma channels, the odd positions in Table 2 are not used, resulting in 1/4 chroma interpolation.

For bi-directional prediction, the bit depth of the output of the interpolation filter is maintained at 14-bit accuracy, regardless of the source bit depth, before the averaging of the two prediction signals. The actual averaging process is done implicitly with the bit-depth reduction process:

predSamples[x, y] = ( predSamplesL0[x, y] + predSamplesL1[x, y] + offset ) >> shift

where shift = (15 − BitDepth) and offset = 1 << (shift − 1).
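The averaging with implicit bit-depth reduction can be written directly from the formula above. This is a sketch; the function and variable names are ours, and the inputs are assumed to be 14-bit intermediate prediction samples.

```python
def biprediction_average(pred_l0, pred_l1, bit_depth):
    """Average two 14-bit intermediate prediction samples and reduce the
    result back to the output bit depth in a single rounded shift."""
    shift = 15 - bit_depth
    offset = 1 << (shift - 1)  # rounding offset, half of the divisor
    return (pred_l0 + pred_l1 + offset) >> shift
```

For an 8-bit output, shift is 7, so two 14-bit samples each representing the value 100 (i.e., 100 << 6 = 6400) average back to 100.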

If both the horizontal and the vertical components of a motion vector point to sub-pixel positions, horizontal interpolation is always performed first, followed by vertical interpolation. For example, to interpolate the sub-pixel j0,0 shown in FIG. 16, b0,k (k = −3, −2, ..., 4) are first interpolated according to Equation 2-1, and then j0,0 is interpolated according to Equation 2-2. Here, shift1 = Min(4, BitDepthY − 8) and shift2 = 6, where BitDepthY is the bit depth of the video block, or more specifically, the bit depth of the luma component of the video block.

b0,k = ( −A−3,k + 4 * A−2,k − 11 * A−1,k + 40 * A0,k + 40 * A1,k − 11 * A2,k + 4 * A3,k − A4,k ) >> shift1 (2-1)

j0,0 = ( −b0,−3 + 4 * b0,−2 − 11 * b0,−1 + 40 * b0,0 + 40 * b0,1 − 11 * b0,2 + 4 * b0,3 − b0,4 ) >> shift2 (2-2)
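Equations 2-1 and 2-2 can be implemented as two one-dimensional passes of the same 8-tap half-pel filter. This is a sketch: the sample access is abstracted into a callable A(x, y), which is our own simplification.

```python
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)  # 8-tap DCT-IF, coefficients sum to 64

def filter8(samples, shift):
    """Apply the 8-tap half-pel filter to 8 consecutive samples."""
    acc = sum(c * s for c, s in zip(HALF_PEL_TAPS, samples))
    return acc >> shift

def interpolate_j00(A, bit_depth_y):
    """Horizontal-first interpolation of j0,0 (Equations 2-1 and 2-2).

    A(x, y) returns the integer sample at column x, row y relative to A0,0.
    """
    shift1 = min(4, bit_depth_y - 8)
    shift2 = 6
    # Equation 2-1: horizontal half-pel values b0,k for the 8 rows k = -3..4
    b = [filter8([A(x, k) for x in range(-3, 5)], shift1) for k in range(-3, 5)]
    # Equation 2-2: vertical filtering of the intermediate b values
    return filter8(b, shift2)
```

For a constant input, the result is the input value left-shifted into the 14-bit intermediate range (e.g., 100 becomes 6400 at 8-bit depth, since shift1 = 0 there).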

Alternatively, the vertical interpolation can be performed first, followed by the horizontal interpolation. In this case, to interpolate j0,0, hk,0 (k = −3, −2, ..., 4) are first interpolated according to Equation 2-3, and then j0,0 is interpolated according to Equation 2-4. When BitDepthY is smaller than or equal to 8, shift1 is 0 and there is no loss in the first interpolation stage; therefore, the final interpolation result is not changed by the interpolation order. However, when BitDepthY is larger than 8, shift1 is larger than 0. In this case, the final interpolation results may be different when different interpolation orders are applied.

hk,0 = ( −Ak,−3 + 4 * Ak,−2 − 11 * Ak,−1 + 40 * Ak,0 + 40 * Ak,1 − 11 * Ak,2 + 4 * Ak,3 − Ak,4 ) >> shift1 (2-3)

j0,0 = ( −h−3,0 + 4 * h−2,0 − 11 * h−1,0 + 40 * h0,0 + 40 * h1,0 − 11 * h2,0 + 4 * h3,0 − h4,0 ) >> shift2 (2-4)

3. Examples of problems solved by the embodiments

For a luma block of size W×H, if horizontal interpolation is always performed first, the required number of interpolation operations (per pixel) is shown in Table 3.

Table 3: Interpolation operations required by HEVC/JEM for a W×H luma component.

Figure 108124952-A0304-0003

On the other hand, if vertical interpolation is always performed first, the required number of interpolation operations is shown in Table 4. Obviously, the best interpolation order is whichever of Table 3 and Table 4 requires the smaller number of interpolation operations.

Table 4: Interpolation operations required for a W×H luma component when the interpolation order is reversed.

Figure 108124952-A0304-0004

For the chroma components, if horizontal interpolation is always performed first, the required number of interpolations per pixel is ((H + 3) × W + W × H) / (W × H) = 2 + 3 / H. If vertical interpolation is always performed first, the required number is ((W + 3) × H + W × H) / (W × H) = 2 + 3 / W.
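The per-pixel counts behind Tables 3 and 4 and the chroma formulas above follow from a simple operation count: the first 1-D pass must produce taps − 1 extra rows (or columns) so that the second pass has enough support. A sketch, with the filter length as a parameter (8 for luma, 4 for chroma):

```python
def interpolation_ops(w, h, taps, horizontal_first):
    """Total 1-D filter applications for a W×H block when both MV
    components point to fractional positions."""
    extra = taps - 1
    if horizontal_first:
        # (h + extra) rows of horizontal filtering, then w*h vertical filters
        return (h + extra) * w + w * h
    # (w + extra) columns of vertical filtering, then w*h horizontal filters
    return (w + extra) * h + w * h
```

Dividing by w * h reproduces the chroma formulas 2 + 3/H (horizontal first) and 2 + 3/W (vertical first), and shows that horizontal-first is cheaper for tall blocks while vertical-first is cheaper for wide blocks.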

As described above, when the bit depth of the input video is greater than 8, different interpolation orders can lead to different interpolation results. Therefore, the interpolation order should be defined implicitly at both the encoder and the decoder.

4. Examples of embodiments

To address these problems and provide other benefits, we propose a shape-dependent interpolation order.

The detailed examples below should be considered as examples that explain the general concepts. These inventions should not be interpreted in a narrow way. Furthermore, these inventions can be combined in any manner.

1. It is proposed that the interpolation order depend on the shape of the current coding block (for example, the coding block is a CU).

a. In one example, for blocks with width > height (such as a CU, PU, or sub-block used in sub-block based prediction, e.g., affine, ATMVP, or BIO), vertical interpolation is performed first, followed by horizontal interpolation; for example, the pixels dk,0, hk,0, and nk,0 are interpolated first, and then e0,0 through r0,0 are interpolated. An example for j0,0 is shown in Equations 2-3 and 2-4.

i. Alternatively, for blocks with width >= height (such as a CU, PU, or sub-block used in sub-block based prediction, e.g., affine, ATMVP, or BIO), vertical interpolation is performed first, followed by horizontal interpolation.

b. In one example, for blocks with width >= height (such as a CU, PU, or sub-block used in sub-block based prediction, e.g., affine, ATMVP, or BIO), horizontal interpolation is performed first, followed by vertical interpolation.

i. Alternatively, for blocks with width > height (such as a CU, PU, or sub-block used in sub-block based prediction, e.g., affine, ATMVP, or BIO), horizontal interpolation is performed first, followed by vertical interpolation.

c. In one example, the luma component and the chroma components follow the same interpolation order.

d. Alternatively, when one chroma coding block corresponds to multiple luma coding blocks (for example, for the 4:2:0 color format, one 4×4 chroma block may correspond to two 8×4 or 4×8 luma blocks), luma and chroma may use different interpolation orders.

e. In one example, when different interpolation orders are utilized, the scaling factors of the multiple stages (i.e., shift1 and shift2) may be further changed accordingly.
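Taken together, items 1.a and 1.b amount to a rule of the following form. This is a sketch of one possible configuration; the tie-breaking for width == height is a design choice the alternatives above deliberately leave open.

```python
def interpolation_order(width, height):
    """Choose the interpolation order from the block shape.

    Wide blocks (width > height) interpolate vertically first, so the
    first pass, which must cover extra support samples, runs over the
    shorter dimension; otherwise the conventional horizontal-first
    order is kept.
    """
    if width > height:
        return ("vertical", "horizontal")
    return ("horizontal", "vertical")
```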

2. Alternatively, in addition, it is proposed that the interpolation order of the luma component may also depend on the MV.

a. In one example, if the vertical MV component points to a quarter-pel position and the horizontal MV component points to a half-pel position, horizontal interpolation is performed first, followed by vertical interpolation.

b. In one example, if the vertical MV component points to a half-pel position and the horizontal MV component points to a quarter-pel position, vertical interpolation is performed first, followed by horizontal interpolation.

c. In one example, the proposed method is applied only to square coding blocks.
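Items 2.a and 2.b can be sketched as a check on the fractional parts of the MV. We assume quarter-pel MV storage (units of 1/4 pel), where a fractional part of 2 is a half-pel position and 1 or 3 are quarter-pel positions; this representation is our own assumption.

```python
def mv_dependent_order(mv_x, mv_y, default=("horizontal", "vertical")):
    """Pick the interpolation order from the sub-pel phases of an MV
    given in quarter-pel units (hypothetical representation)."""
    frac_x, frac_y = mv_x & 3, mv_y & 3
    half_x, half_y = frac_x == 2, frac_y == 2
    quarter_x, quarter_y = frac_x in (1, 3), frac_y in (1, 3)
    if quarter_y and half_x:   # item 2.a: horizontal interpolation first
        return ("horizontal", "vertical")
    if half_y and quarter_x:   # item 2.b: vertical interpolation first
        return ("vertical", "horizontal")
    return default             # other phase combinations: unchanged
```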

3. The proposed methods may be applied to certain modes, block sizes/shapes, and/or certain sub-block sizes.

a. The proposed methods may be applied to certain modes, such as the bi-prediction mode.

b. The proposed methods may be applied to certain block sizes.

i. In one example, they are applied only to blocks with w×h >= T1, where w and h are the width and height of the current block, and T1 is a first threshold, which may be a predefined value depending on design requirements, such as 16, 32, or 64.

ii. In one example, they are applied only to blocks with h >= T2, where T2 is a second threshold, which may be a predefined value depending on design requirements, such as 4 or 8.

c. The proposed methods may be applied to certain color components (such as only the luma component).

4. It is proposed that, when multi-hypothesis prediction is applied to a block, shorter-tap or different interpolation filters may be applied, compared with those applied in the normal prediction modes.

a. In one example, a bilinear filter may be used.

b. The shorter-tap or second interpolation filter may be applied to the reference picture list that involves multiple reference blocks, while for the other reference picture with only one reference block, the same filter as used for the normal prediction modes may be applied.

c. The proposed methods may be applied under certain conditions, such as the block belonging to certain temporal layer(s), or the quantization parameter of the block/tile/slice/picture being within a range (such as greater than a threshold).
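Item 4 can be sketched as a filter-length selection per reference picture list. This is illustrative only: the 2-tap bilinear choice follows item 4.a, the 8-tap length is the normal HEVC luma filter, and the function name and arguments are our own.

```python
def select_filter_taps(is_multi_hypothesis, num_ref_blocks_in_list=1):
    """Return the interpolation filter length for one reference picture list.

    Under multi-hypothesis prediction, the list that contributes multiple
    reference blocks uses a shorter filter (e.g., 2-tap bilinear, item 4.a),
    while a list with a single reference block keeps the normal 8-tap
    prediction-mode filter (item 4.b).
    """
    if is_multi_hypothesis and num_ref_blocks_in_list > 1:
        return 2   # bilinear
    return 8       # filter of the normal prediction modes
```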

FIG. 17 is a block diagram of a video processing apparatus 1700. The apparatus 1700 may be used to implement one or more of the methods described herein. The apparatus 1700 may be embodied in a smartphone, a tablet, a computer, an Internet of Things (IoT) receiver, and so on. The apparatus 1700 may include one or more processors 1702, one or more memories 1704, and video processing hardware 1706. The processor(s) 1702 may be configured to implement one or more methods described in this document. The memory (memories) 1704 may be used to store data and code used for implementing the methods and techniques described herein. The video processing hardware 1706 may be used to implement, in hardware circuitry, some of the techniques described in this document.

FIG. 19 is a flowchart of a method 1900 of video bitstream processing. The method 1900 includes determining (1905) a shape of a video block, determining (1910), based on the video block, an interpolation order that indicates a sequence in which horizontal interpolation and vertical interpolation are performed, and performing the horizontal interpolation and the vertical interpolation according to the interpolation order of the video block to reconstruct (1915) a decoded representation of the video block.

FIG. 20 is a flowchart of a method 2000 of video bitstream processing. The method 2000 includes determining (2005) a characteristic of a motion vector associated with a video block, determining (2010), based on the characteristic of the motion vector, an interpolation order of the video block that indicates a sequence in which horizontal interpolation and vertical interpolation are performed, and performing the horizontal interpolation and the vertical interpolation according to the interpolation order of the video block to reconstruct (2015) a decoded representation of the video block.

With reference to methods 1900 and 2000, some examples of sequences for performing horizontal interpolation and vertical interpolation, and their use, are described in Section 4 of this document. For example, as described in Section 4, depending on the shape of the video block it may be preferable to perform either horizontal or vertical interpolation first. In some embodiments horizontal interpolation is performed before vertical interpolation, and in other embodiments vertical interpolation is performed before horizontal interpolation.

Also with reference to methods 1900 and 2000, a video block may be encoded in the video bitstream, and bit efficiency may be achieved through bitstream generation rules that are tied to an interpolation order that in turn depends on the shape of the video block.

It should be appreciated that the disclosed techniques may be embedded in a video encoder or decoder to improve compression efficiency when the coding units being compressed have shapes significantly different from traditional square or nearly square rectangular blocks. For example, new coding tools that use long or tall coding units, such as units of size 4×32 or 32×4, may benefit from the disclosed techniques.
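The benefit for such elongated blocks can be made concrete with a simple operation count. Under a simplified cost model (each filter application counts as one operation; rounding, clipping, and memory traffic are ignored, so this is an illustration, not the patent's analysis), two-stage separable interpolation with a K-tap filter produces (height + K - 1) intermediate rows when run horizontally first, and (width + K - 1) intermediate columns when run vertically first:

```python
def filter_ops(width, height, taps=8, horizontal_first=True):
    """Per-block count of filter applications for two-stage separable
    interpolation (8-tap by default, as in HEVC luma filtering)."""
    if horizontal_first:
        # (height + taps - 1) intermediate rows of `width` samples,
        # then one vertical filtering per output sample.
        return width * (height + taps - 1) + width * height
    # (width + taps - 1) intermediate columns of `height` samples,
    # then one horizontal filtering per output sample.
    return height * (width + taps - 1) + width * height

# For a wide 32x4 block, vertical-first is cheaper (284 vs 480 ops);
# for a tall 4x32 block the opposite order wins.
assert filter_ops(32, 4, horizontal_first=False) < filter_ops(32, 4, horizontal_first=True)
assert filter_ops(4, 32, horizontal_first=True) < filter_ops(4, 32, horizontal_first=False)
```

The asymmetry, (taps - 1)·width versus (taps - 1)·height extra first-stage operations, is why the order can matter for 4×32 or 32×4 units while being irrelevant for square blocks.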

FIG. 21 is a flowchart of an example video processing method 2100. The method 2100 includes: determining (2102) a first prediction mode applied to a first video block; performing (2104) a first conversion between the first video block and an encoded representation of the first video block by applying horizontal interpolation and/or vertical interpolation to the first video block; determining (2106) a second prediction mode applied to a second video block; and performing (2108) a second conversion between the second video block and an encoded representation of the second video block by applying horizontal interpolation and/or vertical interpolation to the second video block, wherein, based on a determination that the first prediction mode is a multi-hypothesis prediction mode and the second prediction mode is not a multi-hypothesis prediction mode, one or both of the horizontal interpolation and the vertical interpolation of the first video block use a shorter-tap filter than the filter used for the second video block.
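As a concrete illustration of the shorter-tap option, the sketch below contrasts a 2-tap bilinear filter, which example 4 below names as a possible shorter-tap filter, with the standard 8-tap HEVC half-pel luma filter. The integer routine itself is illustrative and not taken from the source.

```python
def half_pel_interp(samples, use_bilinear):
    """1-D half-pel interpolation with integer filters whose
    coefficients sum to 64, so flat signals are preserved."""
    taps = [32, 32] if use_bilinear else [-1, 4, -11, 40, 40, -11, 4, -1]
    n = len(taps)
    return [(sum(c * s for c, s in zip(taps, samples[i:i + n])) + 32) >> 6
            for i in range(len(samples) - n + 1)]

row = [10] * 8
# Both filters reproduce a flat signal, but the bilinear filter needs
# only 1 neighbouring sample per side instead of 3 or 4, shrinking the
# reference-area fetch that each extra prediction hypothesis costs.
assert half_pel_interp(row, use_bilinear=True) == [10] * 7
assert half_pel_interp(row, use_bilinear=False) == [10]
```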

FIG. 22 is a flowchart of a method 2200 of video bitstream processing. The method includes: determining (2205) a shape of a video block; determining (2210), based on the shape of the video block, an interpolation order that indicates the sequence in which horizontal interpolation and vertical interpolation are performed; and performing horizontal interpolation and vertical interpolation on the video block in the sequence indicated by the interpolation order to construct (2215) an encoded representation of the video block.

FIG. 23 is a flowchart of a method 2300 of video bitstream processing. The method includes: determining (2305) a characteristic of a motion vector associated with a video block; determining (2310), based on the characteristic of the motion vector, an interpolation order that indicates the sequence in which horizontal interpolation and vertical interpolation are performed; and performing horizontal interpolation and vertical interpolation on the video block in the sequence indicated by the interpolation order to construct (2315) an encoded representation of the video block.

Various embodiments and techniques disclosed in this document may be described by the following list of examples.

1. A video processing method, comprising: determining a first prediction mode applied to a first video block; performing a first conversion between the first video block and an encoded representation of the first video block by applying horizontal interpolation and/or vertical interpolation to the first video block; determining a second prediction mode applied to a second video block; and performing a second conversion between the second video block and an encoded representation of the second video block by applying horizontal interpolation and/or vertical interpolation to the second video block, wherein, based on a determination that the first prediction mode is a multi-hypothesis prediction mode and the second prediction mode is not a multi-hypothesis prediction mode, one or both of the horizontal interpolation and the vertical interpolation of the first video block use a shorter-tap filter than the filter used for the second video block.

2. The method of example 1, wherein the first video block is converted using more than two reference blocks for bidirectional prediction, and more than two reference blocks are used for at least one reference picture list.

3. The method of example 1, wherein the first video block is converted using more than one reference block for unidirectional prediction.

4. The method of any one of examples 1-3, wherein the shorter-tap filter is a bilinear filter.

5. The method of any one of examples 1-3, wherein one or both of the horizontal interpolation and the vertical interpolation use the shorter-tap filter for a reference picture list associated with multiple reference blocks.

6. The method of any one of examples 1-5, wherein, when a reference picture list is associated with a single reference block, one or both of the horizontal interpolation and the vertical interpolation use the same filter as used for a normal prediction mode.

7. The method of any one of examples 1-6, wherein the method is applied based on a determination of one or more of the following: use of a temporal layer, or that a quantization parameter of one or more blocks, tiles, slices, or pictures containing the video block is within a threshold range.

8. The method of example 7, wherein a quantization parameter within the threshold range includes a quantization parameter greater than a threshold.

9. The method of example 6, wherein the normal prediction mode includes a unidirectional prediction mode or a bidirectional inter prediction mode, the unidirectional prediction mode predicting sample values of samples in a block using inter prediction with at most one motion vector and one reference index, and the bidirectional inter prediction mode predicting sample values of samples in a block using inter prediction with at most two motion vectors and reference indices.

10. A video decoding apparatus comprising a processor configured to implement the method of one or more of examples 1 to 9.

11. A video encoding apparatus comprising a processor configured to implement the method of one or more of examples 1 to 9.

12. A computer-readable program medium having code stored thereon, the code comprising instructions that, when executed by a processor, cause the processor to implement the method of one or more of examples 1 to 9.

13. A method of video bitstream processing, comprising: determining a shape of a video block; determining, based on the shape of the video block, an interpolation order that indicates the sequence in which horizontal interpolation and vertical interpolation are performed; and performing horizontal interpolation and vertical interpolation on the video block in the sequence indicated by the interpolation order to reconstruct a decoded representation of the video block.

14. The method of example 13, wherein the shape of the video block is represented by a width and a height of the video block, and determining the interpolation order further comprises: determining, when the width of the video block is greater than the height of the video block, that vertical interpolation is performed before horizontal interpolation as the interpolation order.

15. The method of example 13, wherein the shape of the video block is represented by a width and a height, and determining the interpolation order further comprises: determining, when the width of the video block is greater than or equal to the height of the video block, that vertical interpolation is performed before horizontal interpolation as the interpolation order.

16. The method of example 13, wherein the shape of the video block is represented by a width and a height, and determining the interpolation order further comprises: determining, when the height of the video block is greater than or equal to the width of the video block, that horizontal interpolation is performed before vertical interpolation as the interpolation order.

17. The method of example 13, wherein the shape of the video block is represented by a width and a height, and determining the interpolation order further comprises: determining, when the height of the video block is greater than the width of the video block, that horizontal interpolation is performed before vertical interpolation as the interpolation order.

18. The method of example 13, wherein luma and chroma components of the video block are interpolated based on the interpolation order or based on different interpolation orders.

19. The method of example 13, wherein, when each chroma block of the chroma component corresponds to multiple luma blocks of the luma component, the luma and chroma components of the video block are interpolated using different interpolation orders.

20. The method of example 13, wherein the luma and chroma components of the video block are interpolated using different interpolation orders, and wherein the scaling factors used in the horizontal interpolation and the vertical interpolation differ between the luma component and the chroma component.

21. A method of video bitstream processing, comprising: determining a characteristic of a motion vector associated with a video block; determining, based on the characteristic of the motion vector, an interpolation order that indicates the sequence in which horizontal interpolation and vertical interpolation are performed; and performing horizontal interpolation and vertical interpolation on the video block in the sequence indicated by the interpolation order to reconstruct a decoded representation of the video block.

22. The method of example 21, wherein the characteristic of the motion vector is represented by the quarter-pixel and half-pixel positions to which the motion vector points, the motion vector includes a vertical component and a horizontal component, and determining the interpolation order comprises: determining, when the vertical component points to a quarter-pixel position and the horizontal component points to a half-pixel position, that horizontal interpolation is performed before vertical interpolation as the interpolation order.

23. The method of example 21, wherein the characteristic of the motion vector is represented by the quarter-pixel and half-pixel positions to which the motion vector points, the motion vector includes a vertical component and a horizontal component, and determining the interpolation order comprises: determining, when the vertical component points to a half-pixel position and the horizontal component points to a quarter-pixel position, that vertical interpolation is performed before horizontal interpolation.

24. The method of any one of examples 21-23, wherein the shape of the video block is square.

25. The method of any one of examples 21-24, wherein the method is applied in a bi-prediction mode.

26. The method of any one of examples 21-25, wherein the method is applied when the height of the video block multiplied by the width of the video block is less than or equal to T1, T1 being a first threshold.

27. The method of any one of examples 21-25, wherein the method is applied when the video block has a height less than or equal to T2, T2 being a second threshold.

28. The method of any one of examples 21-25, wherein the method is applied to the luma component of the video block.

29. A method of video bitstream processing, comprising: determining a shape of a video block; determining, based on the shape of the video block, an interpolation order that indicates the sequence in which horizontal interpolation and vertical interpolation are performed; and performing horizontal interpolation and vertical interpolation on the video block in the sequence indicated by the interpolation order to construct an encoded representation of the video block.

30. A method of video bitstream processing, comprising: determining a characteristic of a motion vector associated with a video block; determining, based on the characteristic of the motion vector, an interpolation order that indicates the sequence in which horizontal interpolation and vertical interpolation are performed; and performing horizontal interpolation and vertical interpolation on the video block in the sequence indicated by the interpolation order to construct an encoded representation of the video block.

31. A video decoding apparatus comprising a processor configured to implement the method of one or more of examples 21 to 28.

32. A video encoding apparatus comprising a processor configured to implement the method of example 29 or 30.

33. A computer program product having computer code stored thereon, the code, when executed by a processor, causing the processor to implement the method of any one of examples 13 to 30.

34. An apparatus in a video system, comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of any one of examples 13 to 30.

From the foregoing, it should be appreciated that specific embodiments of the disclosed technology have been described herein for purposes of illustration, and that various modifications may be made without departing from the scope of the invention. Accordingly, the disclosed technology is not limited except as by the appended claims.

Implementations of the subject matter and the functional operations described in this patent document can be implemented in various systems, in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Implementations of the subject matter described in this specification can be implemented as one or more computer program products, that is, one or more modules of computer program instructions encoded on a tangible and non-transitory computer-readable medium, for execution by, or to control the operation of, a data processing apparatus. The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing unit" or "data processing apparatus" encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, for example, code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (for example, one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (for example, files that store one or more modules, subprograms, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and the apparatus can also be implemented as, special-purpose logic circuitry, for example, an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).

Processors suitable for the execution of a computer program include, by way of example, both general-purpose and special-purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random-access memory, or both. The essential elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, for example, magnetic disks, magneto-optical disks, or optical discs. However, a computer need not have such devices. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including by way of example semiconductor memory devices such as EPROM, EEPROM, and flash memory devices. The processor and the memory can be supplemented by, or incorporated in, special-purpose logic circuitry.

The specification and drawings are intended to be regarded as exemplary only, where exemplary means an example. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Additionally, the use of "or" is intended to include "and/or", unless the context clearly indicates otherwise.

While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features of particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations, and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.

Only a few implementations and examples are described; other implementations, enhancements, and variations can be made based on what is described and illustrated in this patent document.

tb, td: distance
1700: apparatus
1702: processor
1704: memory
1706: video processing hardware
1900, 2000, 2100, 2200, 2300: methods
1905, 1910, 1915, 2005, 2010, 2015, 2102, 2104, 2106, 2108, 2205, 2210, 2215, 2305, 2310, 2315: steps

FIG. 1 is an illustration of a quadtree plus binary tree (QTBT) structure.
FIG. 2 shows an example derivation process for merge candidate list construction.
FIG. 3 shows example positions of spatial merge candidates.
FIG. 4 shows an example of candidate pairs considered for the redundancy check of spatial merge candidates.
FIG. 5 shows an example of positions of a second prediction unit (PU) for N×2N and 2N×N partitions.
FIG. 6 is an illustration of motion vector scaling for a temporal merge candidate.
FIG. 7 shows example candidate positions C0 and C1 for a temporal merge candidate.
FIG. 8 shows an example of a combined bi-predictive merge candidate.
FIG. 9 shows an example derivation process for motion vector prediction candidates.
FIG. 10 is an illustration of motion vector scaling for a spatial motion vector candidate.
FIG. 11 shows an example of advanced temporal motion vector prediction (ATMVP) motion prediction for a coding unit (CU).
FIG. 12 shows an example of one CU with four sub-blocks (A-D) and its neighbouring blocks (a-d).
FIG. 13 shows the non-adjacent merge candidates proposed in J0021.
FIG. 14 shows the non-adjacent merge candidates proposed in J0058.
FIG. 15 shows the non-adjacent merge candidates proposed in J0059.
FIG. 16 shows an example of integer-sample and fractional-sample positions for quarter-sample luma interpolation.
FIG. 17 is a block diagram of an example of a video processing apparatus.
FIG. 18 shows a block diagram of an example implementation of a video encoder.
FIG. 19 is a flowchart of an example method of video bitstream processing.
FIG. 20 is a flowchart of an example method of video bitstream processing.
FIG. 21 is a flowchart of an example video processing method.
FIG. 22 is a flowchart of an example method of video bitstream processing.
FIG. 23 is a flowchart of an example method of video bitstream processing.

2100: method

2102, 2104, 2106, 2108: steps

Claims (12)

1. A video processing method, comprising: determining a first prediction mode applied to a first video block; performing a first conversion between the first video block and an encoded representation of the first video block by applying horizontal interpolation and/or vertical interpolation to the first video block; determining a second prediction mode applied to a second video block; and performing a second conversion between the second video block and an encoded representation of the second video block by applying horizontal interpolation and/or vertical interpolation to the second video block, wherein, based on a determination that the first prediction mode is a multi-hypothesis prediction mode and the second prediction mode is not a multi-hypothesis prediction mode, one or both of the horizontal interpolation and the vertical interpolation of the first video block use a shorter-tap filter than the filter used for the second video block.

2. The method of claim 1, wherein the first video block is converted using more than two reference blocks for bidirectional prediction, and at least two reference blocks are used for at least one reference picture list.

3. The method of claim 1, wherein the first video block is converted using more than one reference block for unidirectional prediction.

4. The method of any one of claims 1-3, wherein the shorter-tap filter is a bilinear filter.
5. The method of any one of claims 1-3, wherein one or both of the horizontal interpolation and the vertical interpolation use the shorter-tap filter for a reference picture list associated with multiple reference blocks.

6. The method of any one of claims 1-5, wherein, when a reference picture list is associated with a single reference block, one or both of the horizontal interpolation and the vertical interpolation use the same filter as used for a normal prediction mode.

7. The method of any one of claims 1-6, wherein the method is applied based on a determination of one or more of the following: use of a temporal layer, or that a quantization parameter of one or more blocks, tiles, slices, or pictures containing the video block is within a threshold range.

8. The method of claim 7, wherein a quantization parameter within the threshold range includes a quantization parameter greater than a threshold.
9. The method of claim 6, wherein the normal prediction mode includes a unidirectional prediction mode or a bidirectional inter prediction mode, the unidirectional prediction mode predicting sample values of samples in a block using inter prediction with at most one motion vector and one reference index, and the bidirectional inter prediction mode predicting sample values of samples in a block using inter prediction with at most two motion vectors and reference indices.

10. A video decoding apparatus comprising a processor configured to implement the method of one or more of claims 1 to 9.

11. A video encoding apparatus comprising a processor configured to implement the method of one or more of claims 1 to 9.

12. A computer-readable program medium having code stored thereon, the code comprising instructions that, when executed by a processor, cause the processor to implement the method of one or more of claims 1 to 9.
TW108124952A 2018-07-13 2019-07-15 Shape dependent interpolation order TWI704799B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
WOPCT/CN2018/095576 2018-07-13
CN2018095576 2018-07-13

Publications (2)

Publication Number Publication Date
TW202013960A true TW202013960A (en) 2020-04-01
TWI704799B TWI704799B (en) 2020-09-11

Family

ID=67989031

Family Applications (2)

Application Number Title Priority Date Filing Date
TW108124952A TWI704799B (en) 2018-07-13 2019-07-15 Shape dependent interpolation order
TW108124953A TWI722486B (en) 2018-07-13 2019-07-15 Shape dependent interpolation order

Family Applications After (1)

Application Number Title Priority Date Filing Date
TW108124953A TWI722486B (en) 2018-07-13 2019-07-15 Shape dependent interpolation order

Country Status (3)

Country Link
CN (2) CN110719475B (en)
TW (2) TWI704799B (en)
WO (2) WO2020012448A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023198120A1 (en) * 2022-04-13 2023-10-19 Beijing Bytedance Network Technology Co., Ltd. Method, apparatus, and medium for video processing

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6807231B1 (en) * 1997-09-12 2004-10-19 8×8, Inc. Multi-hypothesis motion-compensated video image predictor
AU2003246987A1 (en) * 2002-07-09 2004-01-23 Nokia Corporation Method and system for selecting interpolation filter type in video coding
EP2127391A2 (en) * 2007-01-09 2009-12-02 Nokia Corporation Adaptive interpolation filters for video coding
CN101527847B (en) * 2009-01-04 2012-01-04 炬力集成电路设计有限公司 Motion compensation interpolation device and method
US20120008686A1 (en) * 2010-07-06 2012-01-12 Apple Inc. Motion compensation using vector quantized interpolation filters
WO2012100085A1 (en) * 2011-01-19 2012-07-26 General Instrument Corporation High efficiency low complexity interpolation filters
US20120230393A1 (en) * 2011-03-08 2012-09-13 Sue Mon Thet Naing Methods and apparatuses for encoding and decoding video using adaptive interpolation filter length
US9313519B2 (en) * 2011-03-11 2016-04-12 Google Technology Holdings LLC Interpolation filter selection using prediction unit (PU) size
CN102665080B (en) * 2012-05-08 2015-05-13 开曼群岛威睿电通股份有限公司 Electronic device for motion compensation and motion compensation method
US11122262B2 (en) * 2014-06-27 2021-09-14 Samsung Electronics Co., Ltd. System and method for motion compensation in video coding
CN104881843A (en) * 2015-06-10 2015-09-02 京东方科技集团股份有限公司 Image interpolation method and image interpolation apparatus

Also Published As

Publication number Publication date
CN110719475A (en) 2020-01-21
TWI704799B (en) 2020-09-11
WO2020012448A2 (en) 2020-01-16
WO2020012448A3 (en) 2020-04-16
CN110719475B (en) 2022-12-09
CN110719466B (en) 2022-12-23
CN110719466A (en) 2020-01-21
WO2020012449A1 (en) 2020-01-16
TWI722486B (en) 2021-03-21
TW202023276A (en) 2020-06-16

Similar Documents

Publication Publication Date Title
TWI818086B (en) Extended merge prediction
TWI743506B (en) Selection from multiple luts
JP7446297B2 (en) Decoder side motion vector improvement
TWI696384B (en) Motion vector prediction for affine motion models in video coding
CN110771163A (en) Combination of inter-prediction and intra-prediction in video coding
WO2018126163A1 (en) Motion vector generation for affine motion model for video coding
TW201743619A (en) Confusion of multiple filters in adaptive loop filtering in video coding
TW201924350A (en) Affine motion vector prediction in video coding
CN113273186A (en) Invocation of LUT update
CN110677668B (en) Spatial motion compression
JP2022521979A (en) Restrictions on improving motion vector on the decoder side
JP2015514340A (en) Disparity vector prediction in video coding
TW202021354A (en) Motion vector predictor list generation
CN112534820A (en) Signaling sub-prediction unit motion vector predictor
TW202110188A (en) Overlapped block motion compensation using spatial neighbors
CN113196777B (en) Reference pixel padding for motion compensation
TWI704799B (en) Shape dependent interpolation order
CN113273216B (en) Improvement of MMVD
CN110677650A (en) Reducing complexity of non-adjacent Merge designs
TWI839388B (en) Simplified history based motion vector prediction
CN113574867B (en) MV precision constraint