CN113676733A - System and method for improved inter-frame intra joint prediction - Google Patents


Info

Publication number
CN113676733A
CN113676733A (application number CN202111033487.XA)
Authority
CN
China
Prior art keywords
prediction
coding block
current coding
gradient value
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111033487.XA
Other languages
Chinese (zh)
Other versions
CN113676733B (en)
Inventor
修晓宇
陈漪纹
王祥林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority claimed from CN202080008692.8A (external priority; published as CN113661704A)
Publication of CN113676733A
Application granted
Publication of CN113676733B
Legal status: Active (current); anticipated expiration status is an estimate

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/527 Global motion vector estimation
    • H04N19/109 Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H04N19/11 Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N19/513 Processing of motion vectors
    • H04N19/577 Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present disclosure relates to a system and method for improving inter-frame intra joint prediction for video coding and decoding. The method includes obtaining a first reference picture and a second reference picture associated with a current prediction block; obtaining a first prediction L0 based on a first motion vector MV0 from the current prediction block to a reference block in the first reference picture; obtaining a second prediction L1 based on a second motion vector MV1 from the current prediction block to a reference block in the second reference picture; determining whether to apply a bi-directional optical flow (BDOF) operation; and calculating a bi-prediction of the current prediction block based on the first prediction L0, the second prediction L1, and the first and second gradient values.

Description

System and method for improved inter-frame intra joint prediction
Cross Reference to Related Applications
This application is based on and claims priority from provisional application No. 62/790421, filed on January 9, 2019, which is incorporated herein by reference in its entirety.
Technical Field
This application relates to video coding and compression. More particularly, the present application relates to a method and apparatus related to an inter-frame intra joint prediction (CIIP) method for video coding and decoding.
Background
Various video codec techniques may be used to compress video data. Video coding is performed according to one or more video coding standards. Video coding standards include, for example, Versatile Video Coding (VVC), the Joint Exploration test Model (JEM), High Efficiency Video Coding (H.265/HEVC), Advanced Video Coding (H.264/AVC), Moving Picture Experts Group (MPEG) coding, and so on. Video coding typically uses prediction methods (e.g., inter prediction, intra prediction) that exploit redundancy present in video pictures or sequences. An important goal of video codec techniques is to compress video data into a form that uses a lower bit rate while avoiding or minimizing degradation of video quality.
Disclosure of Invention
Examples of the present disclosure provide methods for improving the efficiency of syntax signaling for merge-related modes.
According to a first aspect of the present disclosure, a video encoding and decoding method includes: obtaining a first reference picture and a second reference picture associated with a current prediction block, where the first reference picture precedes the current picture and the second reference picture follows the current picture in display order; obtaining a first prediction L0 based on a first motion vector MV0 from the current prediction block to a reference block in the first reference picture; obtaining a second prediction L1 based on a second motion vector MV1 from the current prediction block to a reference block in the second reference picture; determining whether to apply a bi-directional optical flow (BDOF) operation, where BDOF calculates first horizontal and vertical gradient values ∂I^(0)/∂x and ∂I^(0)/∂y of the prediction samples associated with the first prediction L0 and second horizontal and vertical gradient values ∂I^(1)/∂x and ∂I^(1)/∂y associated with the second prediction L1; and calculating a bi-prediction of the current prediction block based on the first prediction L0, the second prediction L1, the first gradient values ∂I^(0)/∂x and ∂I^(0)/∂y, and the second gradient values ∂I^(1)/∂x and ∂I^(1)/∂y.
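The gradient computation referred to in this aspect can be illustrated with a small sketch. This is illustrative only: the function name `bdof_gradients` and the floating-point central differences are assumptions; the normative VVC BDOF derives gradients from differences of neighboring integer samples with right-shifts.

```python
import numpy as np

def bdof_gradients(pred):
    """Horizontal and vertical gradients of a prediction block.

    Illustrative sketch: the same central-difference idea as BDOF's
    gradient derivation, shown in floating point for clarity.
    """
    pred = np.asarray(pred, dtype=np.float64)
    grad_x = np.zeros_like(pred)  # dI/dx
    grad_y = np.zeros_like(pred)  # dI/dy
    # central differences over the block interior; borders stay zero here
    grad_x[:, 1:-1] = (pred[:, 2:] - pred[:, :-2]) / 2.0
    grad_y[1:-1, :] = (pred[2:, :] - pred[:-2, :]) / 2.0
    return grad_x, grad_y
```

Applied to the L0 and L1 prediction blocks, this yields the four gradient fields ∂I^(0)/∂x, ∂I^(0)/∂y, ∂I^(1)/∂x, and ∂I^(1)/∂y referenced above.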
According to a second aspect of the present disclosure, a video encoding and decoding method includes: obtaining a reference picture in a reference picture list associated with a current prediction block; generating an inter prediction based on a first motion vector from the current picture to a first reference picture; obtaining an intra prediction mode associated with the current prediction block; generating an intra prediction of the current prediction block based on the intra prediction mode; generating a final prediction of the current prediction block by averaging the inter prediction and the intra prediction; and determining whether the current prediction block is treated as inter mode or intra mode for Most Probable Mode (MPM) based intra mode prediction.
According to a third aspect of the disclosure, a non-transitory computer-readable storage medium having instructions stored therein is provided. The instructions, when executed by one or more processors, cause a computing device to perform operations comprising: obtaining a first reference picture and a second reference picture associated with a current prediction block, where the first reference picture precedes the current picture and the second reference picture follows the current picture in display order; obtaining a first prediction L0 based on a first motion vector MV0 from the current prediction block to a reference block in the first reference picture; obtaining a second prediction L1 based on a second motion vector MV1 from the current prediction block to a reference block in the second reference picture; determining whether to apply a bi-directional optical flow (BDOF) operation, where BDOF calculates first horizontal and vertical gradient values ∂I^(0)/∂x and ∂I^(0)/∂y of the prediction samples associated with the first prediction L0 and second horizontal and vertical gradient values ∂I^(1)/∂x and ∂I^(1)/∂y associated with the second prediction L1; and calculating a bi-prediction of the current prediction block.
According to a fourth aspect of the disclosure, a non-transitory computer-readable storage medium having instructions stored therein is provided. The instructions, when executed by one or more processors, cause a computing device to perform operations comprising: obtaining a reference picture in a reference picture list associated with a current prediction block; generating an inter prediction based on a first motion vector from the current picture to a first reference picture; obtaining an intra prediction mode associated with the current prediction block; generating an intra prediction of the current prediction block based on the intra prediction mode; generating a final prediction of the current prediction block by averaging the inter prediction and the intra prediction; and determining whether the current prediction block is treated as inter mode or intra mode for Most Probable Mode (MPM) based intra mode prediction.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate examples consistent with the disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a block diagram of an encoder according to an example of the present disclosure.
Fig. 2 is a block diagram of a decoder according to an example of the present disclosure.
Fig. 3 is a flow diagram illustrating a method for generating inter-frame intra joint prediction (CIIP) according to an example of the present disclosure.
Fig. 4 is a flow chart illustrating a method for generating a CIIP according to an example of the present disclosure.
Fig. 5A is a diagram illustrating block partitions in a multi-type tree structure, according to an example of the present disclosure.
Fig. 5B is a diagram illustrating block partitions in a multi-type tree structure, according to an example of the present disclosure.
Fig. 5C is a diagram illustrating block partitions in a multi-type tree structure, according to an example of the present disclosure.
Fig. 5D is a diagram illustrating block partitions in a multi-type tree structure, according to an example of the present disclosure.
Fig. 5E is a diagram illustrating block partitions in a multi-type tree structure, according to an example of the present disclosure.
Fig. 6A is a diagram illustrating inter-frame intra joint prediction (CIIP) according to an example of the present disclosure.
Fig. 6B is a diagram illustrating inter-frame intra joint prediction (CIIP) according to an example of the present disclosure.
Fig. 6C is a diagram illustrating inter-frame intra joint prediction (CIIP) according to an example of the present disclosure.
Fig. 7A is a flow chart of an MPM candidate list generation process according to an example of the present disclosure.
Fig. 7B is a flow chart of an MPM candidate list generation process according to an example of the present disclosure.
Fig. 8 is a diagram illustrating a workflow of an existing CIIP design in a VVC, according to an example of the present disclosure.
Fig. 9 is a diagram illustrating a workflow of a CIIP method proposed by removing BDOF according to an example of the present disclosure.
Fig. 10 is a diagram illustrating a workflow of CIIP based on unidirectional prediction, where a prediction list is selected based on POC distances, according to an example of the present disclosure.
Fig. 11A is a flow chart of a method for generating the MPM candidate list when CIIP blocks are enabled, according to an example of the present disclosure.
Fig. 11B is a flow chart of a method for generating the MPM candidate list when CIIP blocks are disabled, according to an example of the present disclosure.
Fig. 12 is a diagram illustrating a computing environment coupled with a user interface, according to an example of the present disclosure.
Detailed Description
Reference will now be made in detail to examples of the present disclosure, some of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings in which the same numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments set forth in the following description of the examples of the present disclosure do not represent all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with aspects related to the disclosure set forth in the claims below.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term "and/or" as used herein is intended to mean and include any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may be referred to as second information without departing from the scope of the present disclosure; similarly, second information may also be referred to as first information. As used herein, the term "if" may be understood, depending on context, to mean "when", "upon", or "in response to determining".
The first version of the HEVC standard, finalized in October 2013, provides a bit rate saving of about 50% at equivalent perceptual quality compared to the previous-generation video codec standard H.264/MPEG AVC. Although the HEVC standard provides significant coding improvements over its predecessor, there is evidence that higher coding efficiency than HEVC can be achieved with additional coding tools. Based on this, both VCEG and MPEG started exploring new coding technologies for the standardization of future video codecs. In October 2015, ITU-T VCEG and ISO/IEC MPEG established the Joint Video Exploration Team (JVET), and important research on advanced technologies capable of greatly improving coding efficiency began. By integrating several additional coding tools on top of the HEVC test model (HM), JVET maintains a reference software named the Joint Exploration Model (JEM).
In October 2017, ITU-T and ISO/IEC published a joint Call for Proposals (CfP) on video compression with capability beyond HEVC. In April 2018, 23 CfP responses were received and evaluated at the 10th JVET meeting, demonstrating compression efficiency gains of about 40% over HEVC. Based on these evaluation results, JVET launched a new project to develop a new-generation video codec standard, named Versatile Video Coding (VVC). In the same month, a reference software codebase called the VVC Test Model (VTM) was established for demonstrating a reference implementation of the VVC standard.
Like HEVC, VVC is built on a block-based hybrid video coding framework. Fig. 1 (described below) presents a block diagram of a generic block-based hybrid video codec system. The input video signal is processed block by block; each block is called a Coding Unit (CU). In VTM-1.0, a CU may be as large as 128 × 128 pixels. However, unlike HEVC, which partitions blocks based only on quadtrees, in VVC one Coding Tree Unit (CTU) is partitioned into CUs based on quadtree/binary-tree/ternary-tree structures to accommodate varying local characteristics. Furthermore, the concept of multiple partition unit types in HEVC is removed, i.e., VVC no longer distinguishes between CU, Prediction Unit (PU), and Transform Unit (TU); instead, each CU is always used as the basic unit for both prediction and transform without further partitioning. In the multi-type tree structure, one CTU is first divided by a quadtree structure. Each quadtree leaf node may then be further partitioned by binary-tree and ternary-tree structures. As shown in Figs. 5A, 5B, 5C, 5D, and 5E (described below), there are five partition types: quaternary partition, horizontal binary partition, vertical binary partition, horizontal ternary partition, and vertical ternary partition.
In fig. 1 (described below), spatial prediction and/or temporal prediction may be performed. Spatial prediction (or "intra prediction") uses pixels from already-coded samples (called reference samples) of neighboring blocks in the same video picture/slice to predict the current video block. Spatial prediction reduces the spatial redundancy inherent in the video signal. Temporal prediction (also referred to as "inter prediction" or "motion compensated prediction") uses reconstructed pixels from already-coded video pictures to predict the current video block. Temporal prediction reduces the temporal redundancy inherent in the video signal. The temporal prediction signal for a given CU is typically signaled by one or more motion vectors (MVs), which indicate the amount and direction of motion between the current CU and its temporal reference. Further, when multiple reference pictures are supported, a reference picture index is additionally transmitted to identify which reference picture in the reference picture store the temporal prediction signal comes from. After spatial and/or temporal prediction, a mode decision block in the encoder selects the best prediction mode, e.g., based on a rate-distortion optimization method. The prediction block is then subtracted from the current video block, and the prediction residual is decorrelated using transform and quantization.
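The rate-distortion mode decision described above can be sketched as a minimal Lagrangian cost comparison. The helper name `select_best_mode` and the example costs are hypothetical; a real encoder evaluates many more candidates and cost terms.

```python
def select_best_mode(candidates, lam):
    """Pick the candidate minimizing the Lagrangian cost D + lam * R.

    candidates: iterable of (mode_name, distortion, rate_bits) tuples.
    lam: Lagrange multiplier trading distortion against rate.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]
```

For example, with candidates [("intra", 160.0, 40), ("inter", 90.0, 55)] and lam = 1.0, the inter mode wins (cost 145 vs. 200).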
The quantized residual coefficients are inverse quantized and inverse transformed to form the reconstructed residual, which is then added back to the prediction block to form the reconstructed signal of the CU. Further in-loop filtering, such as deblocking filtering, sample adaptive offset (SAO), and adaptive loop filtering (ALF), may be applied to the reconstructed CU before it is placed in the reference picture store and used to code future video blocks. To form the output video bitstream, the coding mode (inter or intra), prediction mode information, motion information, and quantized residual coefficients are all sent to an entropy coding unit to be further compressed and packed into the bitstream.
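The reconstruction loop just described (quantize, dequantize, add back to the prediction) can be sketched with a toy uniform scalar quantizer. The transform step is omitted, and the function name `reconstruct` and the `qstep` parameter are assumptions for illustration, not the actual VVC quantizer design.

```python
import numpy as np

def reconstruct(pred, residual, qstep):
    """Encoder-side reconstruction of one block.

    Quantize the residual with a uniform scalar quantizer, dequantize,
    and add the result back to the prediction -- mirroring what the
    decoder will reproduce from the bitstream.
    """
    levels = np.round(np.asarray(residual, dtype=np.float64) / qstep)  # quantization
    dequant = levels * qstep                                           # inverse quantization
    return np.asarray(pred, dtype=np.float64) + dequant                # reconstructed signal
```

Because the encoder reconstructs exactly what the decoder will, both sides stay in sync when the reconstructed block is later used as a prediction reference.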
Fig. 2 (described below) presents a general block diagram of a block-based video decoder. The video bitstream is first entropy decoded in an entropy decoding unit. The coding mode and prediction information are sent to a spatial prediction unit (when intra coded) or a temporal prediction unit (when inter coded) to form a prediction block. The residual transform coefficients are sent to an inverse quantization unit and an inverse transform unit to reconstruct the residual block. The prediction block and the residual block are then added together. The reconstructed block may further undergo in-loop filtering before being stored in the reference picture library. The reconstructed video in the reference picture library is then sent out to drive the display device and used to predict future video blocks.
Fig. 1 shows a typical encoder 100. The encoder 100 has a video input 110, motion compensation 112, motion estimation 114, intra/inter mode decision 116, block predictor 140, adder 128, transform 130, quantization 132, prediction related information 142, intra prediction 118, picture buffer 120, inverse quantization 134, inverse transform 136, adder 126, memory 124, in-loop filter 122, entropy coding 138, and bitstream 144.
Fig. 2 shows a block diagram of a typical decoder 200. The decoder 200 has a bitstream 210, entropy decoding 212, inverse quantization 214, inverse transform 216, adder 218, intra/inter mode selection 220, intra prediction 222, memory 230, in-loop filter 228, motion compensation 224, picture buffer 226, prediction related information 234, and video output 232.
Fig. 3 illustrates an example method 300 for generating inter-frame intra joint prediction (CIIP) in accordance with this disclosure.
At step 310, a first reference picture and a second reference picture associated with the current prediction block are obtained, wherein the first reference picture precedes the current picture and the second reference picture follows the current picture in display order.
At step 312, a first prediction L0 is obtained based on a first motion vector MV0 from the current prediction block to a reference block in a first reference picture.
At step 314, a second prediction L1 is obtained based on a second motion vector MV1 from the current prediction block to a reference block in a second reference picture.
At step 316, it is determined whether a bi-directional optical flow (BDOF) operation is to be applied. BDOF computes first horizontal and vertical gradient values ∂I^(0)(i,j)/∂x and ∂I^(0)(i,j)/∂y for the predicted samples associated with the first prediction L0, and second horizontal and vertical gradient values ∂I^(1)(i,j)/∂x and ∂I^(1)(i,j)/∂y for the predicted samples associated with the second prediction L1.
At step 318, a bi-prediction of the current prediction block is calculated based on the first and second predictions L0 and L1 and the first and second gradient values, i.e., the first gradient values ∂I^(0)(i,j)/∂x and ∂I^(0)(i,j)/∂y and the second gradient values ∂I^(1)(i,j)/∂x and ∂I^(1)(i,j)/∂y.
FIG. 4 illustrates an example method for generating CIIPs in accordance with this disclosure. For example, the method includes uni-directional prediction based inter prediction and MPM based intra prediction for generating CIIP.
At step 410, a reference picture in a reference picture list associated with the current prediction block is acquired.
At step 412, an inter prediction is generated based on a first motion vector from the current prediction block to a reference block in the reference picture.
At step 414, the intra prediction mode associated with the current prediction block is obtained.
At step 416, an intra prediction of the current prediction block is generated based on the intra prediction mode.
At step 418, a final prediction of the current prediction block is generated by averaging the inter prediction and the intra prediction.
At step 420, for Most Probable Mode (MPM) based intra-mode prediction, it is determined whether the current prediction block is considered as an inter-mode or an intra-mode.
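The equal-weight combination in step 418 can be sketched in a few lines. The integer rounding form `(inter + intra + 1) >> 1` and the function name are illustrative assumptions, not the normative VVC operation:

```python
def ciip_average(inter_pred, intra_pred):
    # Equal-weight combination (w = 0.5 each), done in fixed point as
    # (inter + intra + 1) >> 1; the +1 rounding offset is an assumption.
    return [(a + b + 1) >> 1 for a, b in zip(inter_pred, intra_pred)]
```

Because the weights are equal, the result is independent of which prediction is labeled inter or intra.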
Fig. 5A illustrates a diagram showing block quad-partitioned in a multi-type tree structure according to an example of the present disclosure.
Fig. 5B illustrates a diagram showing block vertical binary partitions in a multi-type tree structure, according to an example of the present disclosure.
Fig. 5C illustrates a diagram showing block horizontal binary partitions in a multi-type tree structure, according to an example of the present disclosure.
Fig. 5D illustrates a diagram showing block vertical trifurcated partitioning in a multi-type tree structure, according to an example of the present disclosure.
Fig. 5E illustrates a diagram showing block horizontal trifurcated partitions in a multi-type tree structure, according to an example of the present disclosure.
Inter-frame intra joint prediction
As shown in fig. 1 and 2, inter and intra prediction methods are used in a hybrid video coding scheme, in which each PU is only allowed to select either inter prediction or intra prediction to exploit correlation in the temporal or spatial domain, and never both. However, as indicated in prior literature, the residual signals generated by inter-predicted blocks and intra-predicted blocks may exhibit very different characteristics from each other. Thus, when the two kinds of prediction can be combined in an efficient manner, a more accurate prediction can be expected, reducing the energy of the prediction residual and thus improving coding efficiency. Furthermore, in natural video content, the motion of moving objects can be complex. For example, there may be regions that contain both old content (e.g., objects included in previously coded pictures) and emerging new content (e.g., objects not included in previously coded pictures). In this case, neither inter prediction nor intra prediction alone can provide an accurate prediction of the current block.
To further improve prediction efficiency, inter-frame intra joint prediction (CIIP), which combines the inter and intra predictions of one CU coded in merge mode, is adopted in the VVC standard. Specifically, for each merge CU, an additional flag is signaled to indicate whether CIIP is enabled for the current CU. For the luma component, CIIP supports four frequently used intra modes: the planar (PLANAR), DC, horizontal (HORIZONTAL), and vertical (VERTICAL) modes. For the chroma component, DM (i.e., chroma reuses the same intra mode as the luma component) is always applied without additional signaling. In addition, in the existing CIIP design, weighted averaging is applied to combine the inter-predicted samples and intra-predicted samples of one CIIP CU. Specifically, when the PLANAR or DC mode is selected, an equal weight (i.e., 0.5) is applied. Otherwise (i.e., the HORIZONTAL or VERTICAL mode is applied), the current CU is first split horizontally (for the HORIZONTAL mode) or vertically (for the VERTICAL mode) into four equal-sized regions.
Denote the weight set applied to the i-th region as (w_intra_i, w_inter_i), where i = 0 and i = 3 represent the regions closest to and farthest from the reconstructed neighboring samples used for intra prediction. In the current CIIP design, the values of the weight sets are set to (w_intra_0, w_inter_0) = (0.75, 0.25), (w_intra_1, w_inter_1) = (0.625, 0.375), (w_intra_2, w_inter_2) = (0.375, 0.625), and (w_intra_3, w_inter_3) = (0.25, 0.75). Figs. 6A, 6B, and 6C (described below) provide examples to illustrate the CIIP mode.
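A minimal sketch of the region-based weighted combination for the HORIZONTAL mode, using the weight set above; the helper name and the flat one-row representation of the CU are hypothetical:

```python
W_INTRA = [0.75, 0.625, 0.375, 0.25]  # w_intra_0 .. w_intra_3

def ciip_combine_horizontal(intra_pred, inter_pred):
    """Blend one row of intra/inter predictions region by region.

    Region 0 is closest to the reconstructed neighbors used for intra
    prediction, so it gets the largest intra weight.
    """
    width = len(intra_pred)
    region_w = width // 4
    out = []
    for x in range(width):
        i = min(x // region_w, 3)      # region index 0..3
        w_intra = W_INTRA[i]
        w_inter = 1.0 - w_intra        # weights in each set sum to 1
        out.append(w_intra * intra_pred[x] + w_inter * inter_pred[x])
    return out
```

Note that each weight pair sums to 1, so the blend never changes the dynamic range of the prediction.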
Furthermore, in the current VVC working specification, the intra mode of one CIIP CU can be used as a predictor to predict the intra mode of its neighboring CIIP CUs through the Most Probable Mode (MPM) mechanism. Specifically, for each CIIP CU, when its neighboring blocks are also CIIP CUs, the intra modes of those neighboring blocks are first rounded to the closest of the PLANAR, DC, HORIZONTAL, and VERTICAL modes and then added to the MPM candidate list of the current CU. However, when constructing the MPM list of an intra CU, if one of its neighboring blocks is coded in CIIP mode, that neighboring block is deemed unavailable, i.e., the intra mode of a CIIP CU is not allowed to predict the intra mode of a neighboring intra CU. Figs. 7A and 7B (described below) compare the MPM list generation processes for an intra CU and a CIIP CU.
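The rounding of a neighboring CIIP CU's intra mode to the closest of the four CIIP-supported modes could be sketched as follows, assuming the VVC intra mode indices (PLANAR = 0, DC = 1, HORIZONTAL = 18, VERTICAL = 50); the split at the diagonal mode 34 is an assumption for illustration, not the normative rule:

```python
PLANAR, DC, HOR, VER = 0, 1, 18, 50  # VVC intra mode indices

def round_to_ciip_mode(mode):
    """Map an intra mode index to the closest CIIP-supported mode.

    Angular modes below the diagonal (mode 34) round to HORIZONTAL,
    the rest to VERTICAL (tie-break rule assumed here).
    """
    if mode in (PLANAR, DC):
        return mode
    return HOR if mode < 34 else VER
```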
Bi-directional optical flow
Conventional bi-prediction in video coding is a simple combination of two temporally predicted blocks taken from already reconstructed reference pictures. However, due to the limitation of block-based motion compensation, small residual motion may still be observed between the samples of the two prediction blocks, reducing the efficiency of motion-compensated prediction. To solve this problem, bi-directional optical flow (BDOF) is applied in VVC to reduce the effect of such motion on every sample within a block.
Specifically, as shown in fig. 6A, 6B, and 6C (described below), BDOF is a sample-wise motion refinement performed on top of block-based motion-compensated prediction when bi-prediction is used. After BDOF is applied, the motion refinement (v_x, v_y) of each 4 × 4 sub-block is calculated by minimizing the difference between the L0 and L1 prediction samples inside a 6 × 6 window Ω around the sub-block. Specifically, the value of (v_x, v_y) is derived as follows:

  v_x = S_1 > 0 ? clip3(−th_BDOF, th_BDOF, −((S_3 · 2^3) ≫ ⌊log2 S_1⌋)) : 0    (1)
  v_y = S_5 > 0 ? clip3(−th_BDOF, th_BDOF, −((S_6 · 2^3 − ((v_x · S_2,m) ≪ 12 + v_x · S_2,s) / 2) ≫ ⌊log2 S_5⌋)) : 0

where ⌊∙⌋ is the floor function; clip3(min, max, x) clips a given value x to the range [min, max]; the symbol ≫ represents a bit-wise right shift operation and ≪ a bit-wise left shift operation; th_BDOF is a motion refinement threshold that prevents propagation errors caused by irregular local motion, which is equal to 2^(13−BD), where BD is the bit depth of the input video. In (1), S_2,m = S_2 ≫ 12 and S_2,s = S_2 & (2^12 − 1).

The values of S_1, S_2, S_3, S_5, and S_6 are calculated as follows:

  S_1 = Σ_{(i,j)∈Ω} ψ_x(i,j) · ψ_x(i,j),   S_3 = Σ_{(i,j)∈Ω} θ(i,j) · ψ_x(i,j)
  S_2 = Σ_{(i,j)∈Ω} ψ_x(i,j) · ψ_y(i,j)
  S_5 = Σ_{(i,j)∈Ω} ψ_y(i,j) · ψ_y(i,j),   S_6 = Σ_{(i,j)∈Ω} θ(i,j) · ψ_y(i,j)

where

  ψ_x(i,j) = (∂I^(1)(i,j)/∂x + ∂I^(0)(i,j)/∂x) ≫ 3
  ψ_y(i,j) = (∂I^(1)(i,j)/∂y + ∂I^(0)(i,j)/∂y) ≫ 3
  θ(i,j) = (I^(1)(i,j) ≫ 6) − (I^(0)(i,j) ≫ 6)

where I^(k)(i,j) is the sample value at coordinate (i, j) of the prediction signal in list k (k = 0, 1), which is generated at intermediate high precision (i.e., 16 bits); ∂I^(k)(i,j)/∂x and ∂I^(k)(i,j)/∂y are the horizontal and vertical gradient values of the sample, obtained by directly calculating the difference between its two neighboring samples, i.e.,

  ∂I^(k)(i,j)/∂x = (I^(k)(i+1, j) − I^(k)(i−1, j)) ≫ 4
  ∂I^(k)(i,j)/∂y = (I^(k)(i, j+1) − I^(k)(i, j−1)) ≫ 4

Based on the motion refinement derived in (1), the final bi-prediction samples of the CU are calculated by interpolating the L0/L1 prediction samples along the motion trajectory based on the optical-flow model, as indicated by the following equations:

  pred_BDOF(x, y) = (I^(0)(x, y) + I^(1)(x, y) + b + o_offset) ≫ shift
  b = rnd((v_x · (∂I^(1)(x, y)/∂x − ∂I^(0)(x, y)/∂x)) / 2) + rnd((v_y · (∂I^(1)(x, y)/∂y − ∂I^(0)(x, y)/∂y)) / 2)

where shift and o_offset are the right-shift amount and the offset value applied to combine the L0 and L1 prediction signals for bi-prediction, equal to 15 − BD and 1 ≪ (14 − BD), respectively.
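To make the shift amounts concrete, the per-sample gradient and correlation inputs of BDOF can be sketched as below; the function name is hypothetical and the predictions are plain 2-D lists of intermediate-precision (16-bit) samples:

```python
def bdof_intermediate(i0, i1, x, y):
    """For one sample (x, y), compute the BDOF intermediate quantities:
    psi_x, psi_y, and theta, using the documented shift amounts
    (gradients >> 4, gradient sums >> 3, samples >> 6)."""
    gx0 = (i0[y][x + 1] - i0[y][x - 1]) >> 4   # horizontal gradient, L0
    gy0 = (i0[y + 1][x] - i0[y - 1][x]) >> 4   # vertical gradient, L0
    gx1 = (i1[y][x + 1] - i1[y][x - 1]) >> 4   # horizontal gradient, L1
    gy1 = (i1[y + 1][x] - i1[y - 1][x]) >> 4   # vertical gradient, L1
    psi_x = (gx1 + gx0) >> 3
    psi_y = (gy1 + gy0) >> 3
    theta = (i1[y][x] >> 6) - (i0[y][x] >> 6)
    return psi_x, psi_y, theta
```

The correlation terms S_1 through S_6 are then just sums of products of these quantities over the 6 × 6 window Ω.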
Fig. 6A illustrates a diagram showing inter-frame intra joint prediction for the HORIZONTAL mode according to an example of the present disclosure.
Fig. 6B illustrates a diagram showing inter-frame intra joint prediction for VERTICAL mode according to an example of the present disclosure.
Fig. 6C illustrates a diagram showing inter-frame intra joint prediction for PLANAR and DC modes according to an example of the present disclosure.
Fig. 7A shows a flowchart of an MPM candidate list generation process for an intra CU according to an example of the present disclosure.
Fig. 7B shows a flow diagram of an MPM candidate list generation process for a CIIP CU according to an example of the present disclosure.
Improvements in CIIP
Although CIIP can improve the efficiency of conventional motion-compensated prediction, its design can be further improved. Specifically, the following problems in the existing CIIP design in VVC are identified in this disclosure.
First, as discussed in the "inter-frame intra joint prediction" section, because CIIP combines inter-and intra-predicted samples, each CIIP CU needs to use its reconstructed neighboring samples to generate the prediction signal. This means that the decoding of one CIIP CU depends on the complete reconstruction of its neighboring blocks. Due to this interdependency, for practical hardware implementations, CIIP needs to be performed at the reconstruction stage where neighboring reconstructed samples become available for intra prediction. Since the decoding of CUs in the reconstruction stage has to be performed sequentially (i.e. one after the other), the number of computational operations involved in the CIIP process (e.g. multiplication, addition and bit shifting) cannot be too high in order to ensure a sufficient throughput for real-time decoding.
As mentioned in the "Bi-directional optical flow" section, when an inter-coded CU is predicted from two reference blocks in the forward and backward temporal directions, BDOF is enabled to improve prediction quality. As shown in fig. 8 (described below), in the current VVC, BDOF is also invoked to generate the inter prediction samples of the CIIP mode. Given the additional complexity of BDOF, such a design may severely reduce the encoding/decoding throughput of a hardware codec when CIIP is enabled.
Second, in the current CIIP design, when a CIIP CU refers to a bi-directionally predicted merge candidate, motion-compensated prediction signals must be generated for both lists L0 and L1. When one or more MVs do not have integer precision, an additional interpolation process must be invoked to interpolate samples at fractional sample positions. Such a process increases not only computational complexity but also memory bandwidth, because more reference samples need to be accessed from external memory.
Third, as discussed in the "inter-frame intra joint prediction" section, in current CIIP designs, the intra-mode of a CIIP CU and the intra-mode of an intra-CU are treated differently when building the MPM lists of their neighboring blocks. Specifically, when a current CU is encoded by the CIIP mode, its neighboring CIIP CUs are considered as intra-frames, i.e., the intra-frames mode of the neighboring CIIP CUs may be added to the MPM candidate list. However, when the current CU is encoded by intra mode, its neighboring CIIP CUs are considered as inter, i.e., the intra mode of the neighboring CIIP CU is not included in the MPM candidate list. This non-uniform design may not be optimal for the final version of the VVC standard.
Fig. 8 illustrates a diagram showing a workflow of an existing CIIP design in a VVC, according to an example of the present disclosure.
Simplified CIIP
In the present disclosure, methods are provided that simplify existing CIIP designs to facilitate hardware codec implementations. In general, the main aspects of the technology presented in this disclosure are summarized below.
First, in order to improve CIIP encoding/decoding throughput, it is proposed to exclude BDOF from the generation of inter-frame prediction samples in CIIP mode.
Second, in order to reduce computational complexity and storage bandwidth consumption, when one CIIP CU is bi-directionally predicted (i.e., has L0 MV and L1 MV), a method of converting a block from bi-directional prediction to uni-directional prediction to generate inter prediction samples is proposed.
Third, two methods are proposed to coordinate intra modes of a CIIP CU and an intra CU when forming MPM candidates for neighboring blocks of the CU.
CIIP without BDOF
As noted above, BDOF is always enabled to generate the inter prediction samples of the CIIP mode when the current CU is bi-directionally predicted. Due to the additional complexity of BDOF, the existing CIIP design can significantly reduce encoding/decoding throughput, especially making real-time decoding difficult for VVC decoders. On the other hand, for CIIP CUs, the final prediction samples are generated by averaging the inter-predicted samples and the intra-predicted samples. In other words, the prediction samples refined by BDOF are not used directly as the prediction signal of a CIIP CU. Thus, the corresponding improvement obtained from BDOF is less effective for CIIP CUs than for conventional bi-predicted CUs (where BDOF is applied directly to generate the prediction samples). Therefore, based on the above considerations, it is proposed to disable BDOF when generating the inter prediction samples of the CIIP mode. Fig. 9 (described below) shows the corresponding workflow of the proposed CIIP process after BDOF removal.
Fig. 9 illustrates a diagram showing a workflow of the CIIP method proposed by removing BDOF according to an example of the present disclosure.
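The proposed rule reduces to a one-line check; the helper name is hypothetical and shown only to make the decision explicit:

```python
def use_bdof(is_bi_pred, is_ciip):
    # Proposed rule: BDOF applies only to regular bi-predicted CUs,
    # and is always disabled when the CU is coded in CIIP mode.
    return is_bi_pred and not is_ciip
```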
CIIP based on unidirectional prediction
As discussed above, when the merging candidates referred to by one CIIP CU are bidirectionally predicted, both L0 and L1 prediction signals are generated to predict samples within the CU. To reduce memory bandwidth and interpolation complexity, in one embodiment of the present disclosure, it is proposed to use only inter-predicted samples generated with uni-directional prediction (even when the current CU is bi-directionally predicted) in combination with intra-predicted samples in the CIIP mode. Specifically, when the current CIIP CU is predicted uni-directionally, inter-prediction samples will be directly combined with intra-prediction samples. Otherwise (i.e., the current CU is bi-predicted), inter prediction samples used by the CIIP are generated based on uni-directional prediction from one prediction list (L0 or L1). To select the prediction list, different methods may be applied. In the first approach, it is proposed to always select the first prediction (i.e., list L0) for any CIIP block predicted by two reference pictures.
In the second approach, it is proposed to always select the second prediction (i.e., list L1) for any CIIP block predicted by two reference pictures. In a third approach, an adaptive approach is applied, where a prediction list associated with one reference picture is selected, the reference picture having a smaller Picture Order Count (POC) distance from the current picture. Fig. 10 (described below) illustrates a unidirectional prediction based CIIP workflow where the prediction list is selected based on POC distances.
Finally, in the last approach, it is proposed to enable the CIIP mode only when the current CU is uni-directionally predicted. Furthermore, to reduce overhead, the signaling of the CIIP enable/disable flag depends on the prediction direction of the current CIIP CU. When the current CU is uni-directionally predicted, a CIIP flag is signaled in the bitstream to indicate whether CIIP is enabled or disabled. Otherwise (i.e., the current CU is bi-directionally predicted), signaling of the CIIP flag is skipped, and it is always inferred to be false, i.e., CIIP is always disabled.
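The adaptive list selection from the third approach might look like the sketch below; the tie-break toward L0 is an assumption, and the function name is hypothetical:

```python
def select_uni_list(poc_cur, poc_l0, poc_l1):
    """Pick the prediction list whose reference picture is closer to the
    current picture in Picture Order Count (POC) distance; L0 on ties
    (tie-break assumed)."""
    if abs(poc_cur - poc_l0) <= abs(poc_cur - poc_l1):
        return "L0"
    return "L1"
```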
Fig. 10 illustrates a diagram showing a workflow of unidirectional prediction based CIIP of a POC distance based selection prediction list according to one example of the present disclosure.
Coordination of intra modes for CIIP CU and intra CU for MPM candidate list construction
As discussed above, current CIIP designs are not uniform in how the intra-modes of a CIIP CU and an intra-CU are used to form the MPM candidate list of their neighboring blocks. In particular, the intra-modes of both the CIIP CU and the intra-CU may predict the intra-modes of their neighboring blocks encoded in CIIP mode. However, only the intra mode of the intra CU can predict the intra mode of the intra CU. To achieve a more uniform design, this section proposes two methods to coordinate the use of intra modes for CIIP CUs and intra CUs in MPM list construction.
In the first approach, for MPM list construction, it is proposed to treat the CIIP mode as inter mode. Specifically, when generating an MPM list for a CIIP CU or an intra CU, its neighboring blocks are marked as unavailable in intra mode when they are coded in CIIP mode. In this way, intra modes without CIIP blocks can be used to construct the MPM list. In contrast, in the second method, for MPM list construction, it is suggested to treat the CIIP mode as an intra mode. Specifically, in this method, the intra mode of a CIIP CU may predict the intra modes of its neighboring CIIP blocks and intra blocks. Fig. 11A and 11B (described below) show an MPM candidate list generation process when the above-described two methods are applied.
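The two coordination methods differ only in how a CIIP-coded neighbor is treated during MPM construction, which can be sketched as follows (the dict-based block representation is hypothetical):

```python
def neighbor_intra_mode_for_mpm(neighbor, treat_ciip_as="inter"):
    """Return the neighbor's intra mode for MPM construction, or None
    if the neighbor is unavailable. 'neighbor' is a dict such as
    {"mode_type": "ciip", "intra_mode": 18}."""
    t = neighbor["mode_type"]
    if t == "intra":
        return neighbor["intra_mode"]
    if t == "ciip":
        # Method 1 treats CIIP as inter (mode unavailable);
        # method 2 treats CIIP as intra (mode usable).
        return None if treat_ciip_as == "inter" else neighbor["intra_mode"]
    return None  # a regular inter block carries no intra mode
```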
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise examples described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. It is intended that the scope of the disclosure be limited only by the claims appended hereto.
Fig. 11A illustrates a flow diagram of a method when generating CIIP-enabled blocks for an MPM candidate list according to an example of the present disclosure.
Fig. 11B illustrates a flow chart of a method when generating a forbidden CIIP block for an MPM candidate list according to an example of the present disclosure.
FIG. 12 illustrates a computing environment 1210 coupled with a user interface 1260. The computing environment 1210 may be part of a data processing server. Computing environment 1210 includes a processor 1220, memory 1240, and I/O interfaces 1250.
The processor 1220 typically controls the overall operation of the computing environment 1210, such as operations associated with display, data acquisition, data communication, and picture processing. The processor 1220 may include one or more processors to execute instructions to perform all or some of the steps of the methods described above. Further, the processor 1220 may include one or more circuits that facilitate interaction between the processor 1220 and other components. The processor may be a central processing unit (CPU), a microprocessor, a single-chip microcomputer, a GPU, etc.
The memory 1240 is configured to store various types of data to support the operation of the computing environment 1210. Examples of such data include instructions for any application or method operating on computing environment 1210, video data, picture data, and so forth. The memory 1240 may be implemented using any type or combination of volatile or non-volatile storage devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
I/O interface 1250 provides an interface between processor 1220 and peripheral interface modules, such as a keyboard, click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a start scan button, and a stop scan button. The I/O interface 1250 may be coupled with an encoder and a decoder.
In an embodiment, a non-transitory computer readable storage medium comprising a plurality of programs, such as included in memory 1240, executable by processor 1220 in computing environment 1210, for performing the above-described methods is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium has stored therein a plurality of programs for execution by a computing device having one or more processors, wherein the plurality of programs, when executed by the one or more processors, cause the computing device to perform the above-described method for motion prediction.
In an embodiment, the computing environment 1210 may be implemented with one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), Graphics Processing Units (GPUs), controllers, micro-controllers, microprocessors, or other electronic components for performing the above-described methods.

Claims (14)

1. A method of video decoding, the method comprising:
obtaining a first reference picture and a second reference picture associated with a current coding block of a current picture, wherein the first reference picture precedes the current picture and the second reference picture follows the current picture in display order;
obtaining a first prediction based on a first motion vector from the current coding block to a reference block in the first reference picture;
obtaining a second prediction based on a second motion vector from the current coding block to a reference block in the second reference picture; and
calculating a bi-prediction for the current coding block based at least on the first prediction and the second prediction, comprising: in response to determining that inter-frame intra joint prediction is not applied to compute bi-directional prediction for the current coding block, enabling bi-directional optical flow (BDOF) in computing bi-directional prediction for the current coding block.
2. The method of claim 1, wherein calculating a bi-prediction for the current coding block based at least on the first prediction and the second prediction further comprises:
in response to determining to apply inter-frame intra joint prediction to compute bi-prediction for the current coding block, disabling BDOF in computing bi-prediction for the current coding block.
3. The method of claim 2, wherein, in response to determining to apply inter-frame intra joint prediction to compute the bi-prediction for the current coding block, computing the bi-prediction for the current coding block further comprises:
calculating the bi-prediction for the current coding block based on averaging the first prediction and the second prediction.
4. The method of claim 1, wherein, in response to determining that inter-frame intra joint prediction is not applied to compute the bi-prediction for the current coding block, computing the bi-prediction for the current coding block further comprises:
calculating first horizontal and first vertical gradient values ∂I^(0)(i,j)/∂x and ∂I^(0)(i,j)/∂y, respectively, of the predicted samples associated with the first prediction, and calculating second horizontal and second vertical gradient values ∂I^(1)(i,j)/∂x and ∂I^(1)(i,j)/∂y, respectively, of the predicted samples associated with the second prediction, wherein I^(0)(i,j) is the predicted sample at sample position (i, j) associated with the first prediction, and I^(1)(i,j) is the predicted sample at sample position (i, j) associated with the second prediction; and
calculating the bi-prediction for the current coding block based on the first prediction, the second prediction, the first horizontal gradient value, the first vertical gradient value, the second horizontal gradient value, and the second vertical gradient value.
5. The method of claim 4, wherein, in response to determining that inter-frame intra joint prediction is not applied to compute the bi-prediction for the current coding block, computing the bi-prediction for the current coding block further comprises:
calculating a motion correction for each sub-block by minimizing the difference between the predicted samples of the first prediction and the second prediction; and
calculating the bi-prediction for the current coding block based on the motion correction, the first horizontal gradient value, the first vertical gradient value, the second horizontal gradient value, the second vertical gradient value, the first prediction, and the second prediction.
6. The method of claim 5, wherein, in response to determining that inter-frame intra joint prediction is not applied to compute the bi-prediction for the current coding block, computing the bi-prediction for the current coding block further comprises:
calculating a BDOF value based on the motion correction, the first horizontal gradient value, the first vertical gradient value, the second horizontal gradient value, and the second vertical gradient value; and
calculating the bi-prediction for the current coding block based on the BDOF value, the first prediction, and the second prediction.
7. A video decoding device comprising one or more processors and one or more memories coupled to the one or more processors, the video decoding device configured to perform operations comprising:
obtaining a first reference picture and a second reference picture associated with a current coding block of a current picture, wherein the first reference picture precedes the current picture and the second reference picture follows the current picture in display order;
obtaining a first prediction based on a first motion vector from the current coding block to a reference block in the first reference picture;
obtaining a second prediction based on a second motion vector from the current coding block to a reference block in the second reference picture; and
calculating a bi-prediction for the current coding block based at least on the first prediction and the second prediction, comprising: in response to determining that inter-frame intra joint prediction is not applied to compute bi-directional prediction for the current coding block, enabling bi-directional optical flow (BDOF) in computing bi-directional prediction for the current coding block.
8. The video decoding device of claim 7, wherein calculating a bi-prediction for the current coding block based at least on the first prediction and the second prediction further comprises: in response to determining to apply inter-frame intra joint prediction to compute the bi-prediction for the current coding block, disabling BDOF in computing the bi-prediction for the current coding block.
9. The video decoding device of claim 8, wherein, in response to determining to apply inter-frame intra joint prediction to compute the bi-prediction for the current coding block, computing the bi-prediction for the current coding block further comprises:
calculating the bi-prediction for the current coding block based on averaging the first prediction and the second prediction.
10. The video decoding device of claim 7, wherein, in response to determining that inter-frame intra joint prediction is not applied to compute the bi-prediction for the current coding block, computing the bi-prediction for the current coding block further comprises:
calculating first horizontal and first vertical gradient values ∂I^(0)(i,j)/∂x and ∂I^(0)(i,j)/∂y, respectively, of the predicted samples associated with the first prediction, and calculating second horizontal and second vertical gradient values ∂I^(1)(i,j)/∂x and ∂I^(1)(i,j)/∂y, respectively, of the predicted samples associated with the second prediction, wherein I^(0)(i,j) is the predicted sample at sample position (i, j) associated with the first prediction, and I^(1)(i,j) is the predicted sample at sample position (i, j) associated with the second prediction; and
calculating the bi-prediction for the current coding block based on the first prediction, the second prediction, the first horizontal gradient value, the first vertical gradient value, the second horizontal gradient value, and the second vertical gradient value.
11. The video decoding device of claim 10, wherein, in response to determining that inter-frame intra joint prediction is not applied to compute the bi-prediction for the current coding block, computing the bi-prediction for the current coding block further comprises:
calculating a motion correction for each sub-block by minimizing the difference between the predicted samples of the first prediction and the second prediction; and
calculating the bi-prediction for the current coding block based on the motion correction, the first horizontal gradient value, the first vertical gradient value, the second horizontal gradient value, the second vertical gradient value, the first prediction, and the second prediction.
12. The video decoding device of claim 11, wherein, in response to determining that inter-frame intra joint prediction is not applied to compute the bi-prediction for the current coding block, computing the bi-prediction for the current coding block further comprises:
calculating a BDOF value based on the motion correction, the first horizontal gradient value, the first vertical gradient value, the second horizontal gradient value, and the second vertical gradient value; and
calculating the bi-prediction for the current coding block based on the BDOF value, the first prediction, and the second prediction.
13. A non-transitory computer readable storage medium storing a plurality of programs for execution by a computing device having one or more processors, wherein the plurality of programs, when executed by the one or more processors, cause the computing device to perform the method of any of claims 1-6.
14. A computer program product comprising instructions which, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-6.
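For illustration only (not part of the claims): the per-sample horizontal and vertical gradient values recited in claim 10 can be sketched with central differences over each prediction block. This is a minimal NumPy approximation; the normative VVC BDOF process uses integer sample differences with bit-shifts over an extended prediction border, which edge padding stands in for here.

```python
import numpy as np

def gradients(pred):
    """Central-difference horizontal/vertical gradients of a prediction block.

    Sketch of the per-sample gradients used by BDOF; the VVC spec uses
    integer sample differences with shifts and a padded prediction border,
    approximated here by edge padding.
    """
    p = np.pad(pred.astype(np.int64), 1, mode="edge")
    gx = (p[1:-1, 2:] - p[1:-1, :-2]) // 2  # horizontal gradient
    gy = (p[2:, 1:-1] - p[:-2, 1:-1]) // 2  # vertical gradient
    return gx, gy

# Gradients are computed once per prediction direction, e.g.:
# gx0, gy0 = gradients(pred0)
# gx1, gy1 = gradients(pred1)
```

On a linear ramp the sketch recovers the exact slope in the block interior, matching the intuition that BDOF gradients measure local sample change per pel.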
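The motion correction of claim 11 minimizes the mismatch between the two motion-compensated predictions. A hedged sketch using a floating-point least-squares solve (the standard instead uses sign-based integer correlation sums with clipping; the function name is illustrative):

```python
import numpy as np

def motion_correction(p0, p1, gx0, gy0, gx1, gy1):
    """Least-squares (vx, vy) minimizing the optical-flow model error
    theta + vx*(gx0+gx1) + vy*(gy0+gy1) over a sub-block, where
    theta = p1 - p0 is the difference between the two predictions."""
    theta = (p1 - p0).ravel().astype(np.float64)
    psi_x = (gx0 + gx1).ravel().astype(np.float64)
    psi_y = (gy0 + gy1).ravel().astype(np.float64)
    a = np.stack([psi_x, psi_y], axis=1)
    v, *_ = np.linalg.lstsq(a, -theta, rcond=None)
    return float(v[0]), float(v[1])
```

For a constant-gradient sub-block whose two predictions differ by a pure translation, the returned (vx, vy) drives the model error to zero.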
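The final combination of claim 12 adds a per-sample BDOF correction to the average of the two predictions. A floating-point sketch (the normative version uses integer shifts and rounding offsets; vx and vy are the motion correction of claim 11, passed in explicitly here):

```python
import numpy as np

def bdof_biprediction(p0, p1, gx0, gy0, gx1, gy1, vx, vy):
    """Average of the two predictions plus the per-sample BDOF
    gradient correction b (floating-point sketch)."""
    b = vx * (gx1 - gx0) / 2.0 + vy * (gy1 - gy0) / 2.0
    return (p0 + p1 + b) / 2.0
```

With vx = vy = 0 the correction term vanishes and the result degenerates to plain bi-prediction averaging, the behavior when no motion mismatch is found between the two predictions.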
CN202111033487.XA 2019-01-09 2020-01-09 Video decoding method, apparatus, non-transitory computer-readable storage medium Active CN113676733B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201962790421P 2019-01-09 2019-01-09
US62/790421 2019-01-09
CN202080008692.8A CN113661704A (en) 2019-01-09 2020-01-09 System and method for improved inter-frame intra joint prediction
PCT/US2020/012826 WO2020146562A1 (en) 2019-01-09 2020-01-09 System and method for improving combined inter and intra prediction

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202080008692.8A Division CN113661704A (en) 2019-01-09 2020-01-09 System and method for improved inter-frame intra joint prediction

Publications (2)

Publication Number Publication Date
CN113676733A true CN113676733A (en) 2021-11-19
CN113676733B CN113676733B (en) 2023-02-24

Family

ID=78230437

Family Applications (6)

Application Number Title Priority Date Filing Date
CN202310876401.2A Active CN117014615B (en) 2019-01-09 2020-01-09 Video encoding method, apparatus, and non-transitory computer readable storage medium
CN202310936271.7A Pending CN116800962A (en) 2019-01-09 2020-01-09 Video encoding and decoding method, apparatus, and non-transitory computer readable storage medium
CN202310315285.7A Active CN116347102B (en) 2019-01-09 2020-01-09 Video encoding method, apparatus, non-transitory computer readable storage medium
CN202111033487.XA Active CN113676733B (en) 2019-01-09 2020-01-09 Video decoding method, apparatus, non-transitory computer-readable storage medium
CN202110784990.2A Active CN113542748B (en) 2019-01-09 2020-01-09 Video encoding and decoding method, apparatus, and non-transitory computer readable storage medium
CN202311175593.0A Pending CN117294842A (en) 2019-01-09 2020-01-09 Video encoding and decoding method, apparatus, and non-transitory computer readable storage medium


Country Status (1)

Country Link
CN (6) CN117014615B (en)


Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108632608B (en) * 2011-09-29 2022-07-29 Sharp Corporation Image decoding device, image decoding method, image encoding device, and image encoding method
WO2013109124A1 (en) * 2012-01-19 2013-07-25 Samsung Electronics Co., Ltd. Method and device for encoding video to limit bidirectional prediction and block merging, and method and device for decoding video
WO2017043816A1 (en) * 2015-09-10 2017-03-16 LG Electronics Inc. Joint inter-intra prediction mode-based image processing method and apparatus therefor
US10375413B2 (en) * 2015-09-28 2019-08-06 Qualcomm Incorporated Bi-directional optical flow for video coding
CN115278229A (en) * 2015-11-11 2022-11-01 Samsung Electronics Co., Ltd. Apparatus for decoding video and apparatus for encoding video
KR20170058838A (en) * 2015-11-19 2017-05-29 Electronics and Telecommunications Research Institute Method and apparatus for encoding/decoding of improved inter prediction
US11032550B2 (en) * 2016-02-25 2021-06-08 Mediatek Inc. Method and apparatus of video coding
WO2018048265A1 (en) * 2016-09-11 2018-03-15 LG Electronics Inc. Method and apparatus for processing video signal by using improved optical flow motion vector
US10623737B2 (en) * 2016-10-04 2020-04-14 Qualcomm Incorporated Peak sample adaptive offset
US10805630B2 (en) * 2017-04-28 2020-10-13 Qualcomm Incorporated Gradient based matching for motion search and derivation
EP3410724A1 (en) * 2017-05-31 2018-12-05 Thomson Licensing Method and apparatus for signalling bi-directional intra prediction in video encoding and decoding
CN110800302A (en) * 2017-06-07 2020-02-14 MediaTek Inc. Method and apparatus for intra-inter prediction for video encoding and decoding
US10757420B2 (en) * 2017-06-23 2020-08-25 Qualcomm Incorporated Combination of inter-prediction and intra-prediction in video coding

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108781294A (en) * 2016-02-05 2018-11-09 MediaTek Inc. Method and apparatus of motion compensation based on bi-directional optical flow techniques for video coding
WO2017188566A1 (en) * 2016-04-25 2017-11-02 LG Electronics Inc. Inter-prediction method and apparatus in image coding system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XIAOYU XIU et al.: "CE9-related: Complexity reduction and bit-width control for bi-directional optical flow (BIO)", Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11 *

Also Published As

Publication number Publication date
CN113542748B (en) 2023-07-11
CN117014615A (en) 2023-11-07
CN113542748A (en) 2021-10-22
CN116347102B (en) 2024-01-23
CN117014615B (en) 2024-03-12
CN116347102A (en) 2023-06-27
CN117294842A (en) 2023-12-26
CN116800962A (en) 2023-09-22
CN113676733B (en) 2023-02-24

Similar Documents

Publication Publication Date Title
US11172203B2 (en) Intra merge prediction
CN114363612B (en) Method and apparatus for bit width control of bi-directional optical flow
JP7313533B2 (en) Method and Apparatus in Predictive Refinement by Optical Flow
US20230051193A1 (en) System and method for combined inter and intra prediction
CN114125441B (en) Bidirectional optical flow method for decoding video signal, computing device and storage medium
JP2023100979A (en) Methods and apparatuses for prediction refinement with optical flow, bi-directional optical flow, and decoder-side motion vector refinement
CN114009033A (en) Method and apparatus for signaling symmetric motion vector difference mode
JP2023063506A (en) Method for deriving constructed affine merge candidate
CN113676733B (en) Video decoding method, apparatus, non-transitory computer-readable storage medium
KR102450491B1 (en) System and method for combined inter and intra prediction
JP7303255B2 (en) Video coding method, video coding device, computer readable storage medium and computer program
WO2024016955A1 (en) Out-of-boundary check in video coding
WO2023250047A1 (en) Methods and devices for motion storage in geometric partitioning mode

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant