CN113841396B - Simplified local illumination compensation


Info

Publication number
CN113841396B
Authority
CN
China
Prior art keywords: lic, samples, block, video block, mode
Prior art date
Legal status
Active
Application number
CN202080037225.8A
Other languages: Chinese (zh)
Other versions: CN113841396A (en)
Inventor
张娜
张莉
张凯
刘鸿彬
王悦
Current Assignee
Beijing ByteDance Network Technology Co Ltd
ByteDance Inc
Original Assignee
Beijing ByteDance Network Technology Co Ltd
ByteDance Inc
Priority date
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd, ByteDance Inc filed Critical Beijing ByteDance Network Technology Co Ltd
Publication of CN113841396A publication Critical patent/CN113841396A/en
Application granted granted Critical
Publication of CN113841396B publication Critical patent/CN113841396B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/176 Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/182 Adaptive coding characterised by the coding unit, the unit being a pixel
    • H04N19/186 Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N19/52 Processing of motion vectors by predictive encoding
    • H04N19/583 Motion compensation with overlapping blocks
    • H04N19/70 Syntax aspects related to video coding, e.g. related to compression standards

Abstract

Simplified Local Illumination Compensation (LIC) is described. An exemplary method for video processing comprises: deriving a motion candidate list for a conversion between a video block of a video and a bitstream representation of the video block, wherein a first candidate of the motion candidate list is set to have a Local Illumination Compensation (LIC) flag; and performing the conversion using the motion candidate list, wherein, during the conversion, upon selection of the first candidate from the motion candidate list, whether LIC is enabled is determined based on the LIC flag of the first candidate.

Description

Simplified local illumination compensation
Cross Reference to Related Applications
The present application claims priority to and the benefit of international patent application No. PCT/CN2019/087620, filed on May 20, 2019, in accordance with the applicable patent laws and/or the Paris Convention. The entire disclosure of international patent application No. PCT/CN2019/087620 is incorporated herein by reference as part of the disclosure of the present application.
Technical Field
This patent document relates to the field of video encoding and decoding.
Background
Currently, efforts are being made to improve the performance of current video codec technologies to provide better compression ratios or to provide video encoding and decoding schemes that allow lower complexity or parallel implementations. Experts in the field have recently proposed several new video coding tools, which are currently being tested to determine their effectiveness.
Disclosure of Invention
Techniques are provided for incorporating simplified local illumination compensation in a video encoder or decoder.
In one exemplary aspect, a video processing method is disclosed. The method comprises the following steps: based on a codec mode of a current video block, making a decision regarding selectively applying a Local Illumination Compensation (LIC) model to at least a portion of the current video block; and performing a transition between the current video block and a bitstream representation of the current video block based on the decision.
In another exemplary aspect, a method of video processing is disclosed. The method comprises the following steps: configuring, for a current video block comprising a plurality of sub-regions, a set of Local Illumination Compensation (LIC) parameters for each of the plurality of sub-regions; and based on the configuration, performing a conversion between the current video block and a bitstream representation of the current video block.
In yet another exemplary aspect, a method of video processing is disclosed. The method comprises the following steps: making a decision regarding selectively enabling a filtering process for a current video block based on use of a Local Illumination Compensation (LIC) model on the current video block; and performing a conversion between the current video block and a bitstream representation of the current video block based on the decision.
In yet another exemplary aspect, a method of video processing is disclosed. The method comprises the following steps: during a transition between a current video block and a bitstream representation of the current video block, a Local Illumination Compensation (LIC) model applied to the current video block is configured, wherein the LIC model is trained using a fixed number of one or more samples of neighboring blocks of the current video block.
In yet another exemplary aspect, a method of video processing is disclosed. The method comprises the following steps: deriving a motion candidate list for a transition between a video block of a video and a bitstream representation of the video block, wherein a first candidate of the motion candidate list is set to have a Local Illumination Compensation (LIC) flag; and performing a conversion using the motion candidate list, wherein, during the conversion, upon selection of a first candidate from the motion candidate list, it is determined whether LIC is enabled based on a flag of the first candidate.
In yet another exemplary aspect, a method of video processing is disclosed. The method comprises the following steps: for a conversion between a video block of a video and a bitstream representation of the video block, determining, based on properties of the video block, at least one of: whether Local Illumination Compensation (LIC) is enabled or disabled for at least a portion of the video block, whether LIC is enabled for a reference picture list, and LIC parameters of at least one reference picture list; and performing the conversion based on the determination.
In yet another exemplary aspect, a method of video processing is disclosed. The method comprises the following steps: determining whether and/or how to apply a loop filter process and/or a post-reconstruction filtering process based on a use of Local Illumination Compensation (LIC) for transitions between video blocks of a video and a bitstream representation of the video blocks, wherein the loop filter process comprises a deblocking filter, a Sample Adaptive Offset (SAO), an Adaptive Loop Filter (ALF), and the post-reconstruction filtering process comprises a bilateral filter; and performing a conversion based on the determination.
In yet another exemplary aspect, a method of video processing is disclosed. The method comprises the following steps: for a conversion between a video block of a video and a bitstream representation of the video block, deriving Local Illumination Compensation (LIC) parameters in an LIC model applied to the video block by using a fixed number of neighboring samples of the video block; and performing the conversion based on the LIC parameters.
In yet another exemplary aspect, the various techniques described herein may be implemented as a computer program product stored on a non-transitory computer-readable medium. The computer program product comprises program code for performing the method described herein.
In yet another exemplary aspect, a video decoder device may implement the methods described herein.
The details of one or more implementations are set forth in the accompanying drawings, and the description below. Other features will be apparent from the description and drawings, and from the claims.
Drawings
Fig. 1 shows an example of a derivation process for merge candidate list construction.
Fig. 2 illustrates exemplary locations of spatial merge candidates.
Fig. 3 shows an example of candidate pairs considered for redundancy checking of spatial merge candidates.
Fig. 4 shows exemplary locations of a second PU for both N × 2N and 2N × N partitions.
Fig. 5 is an illustration of motion vector scaling for a temporal merge candidate.
Fig. 6 shows examples C0 and C1 of candidate positions of the time-domain merge candidate.
Figure 7 shows an example of a combined bi-predictive merge candidate.
Fig. 8 shows an example of a derivation process of a motion vector prediction candidate.
Fig. 9 is an exemplary illustration of motion vector scaling for spatial motion vector candidates.
Fig. 10 shows an example of Alternative Temporal Motion Vector Prediction (ATMVP) for a Coding Unit (CU).
Fig. 11 shows an example of one CU with four sub-blocks (a-D) and their neighboring blocks (a-D).
Fig. 12 shows an example of a planar motion vector prediction process.
Fig. 13 is a flowchart of an example of encoding with different Motion Vector (MV) precision.
Fig. 14A and 14B are examples of sub-blocks for which OBMC is applicable.
Fig. 15 shows an example of adjacent samples used to derive IC parameters.
Fig. 16 shows an example of a Local Illumination Compensation (LIC) method for 16 × 16 unit processing.
Fig. 17 is an illustration of partitioning a Codec Unit (CU) into two triangle prediction units.
Fig. 18 shows an example of the positions of adjacent blocks.
Fig. 19 shows an example of CU applying the first weighting factor set.
Figure 20 shows an example of a motion vector storage implementation.
FIG. 21 shows an example of a simplified affine motion model.
Fig. 22 shows an example of affine MVF of each sub-block.
Fig. 23 shows examples of (a) a 4-parameter affine model and (b) a 6-parameter affine model.
Fig. 24 shows an example of a motion vector predictor (MVP) for the AF_INTER mode.
Fig. 25A-25B show examples of candidates for the AF_MERGE mode.
Fig. 26 shows candidate positions for affine merge mode.
FIG. 27 illustrates an exemplary process for bilateral matching.
Fig. 28 illustrates an exemplary process of template matching.
Fig. 29 shows an embodiment of one-way Motion Estimation (ME) in Frame Rate Up Conversion (FRUC).
Fig. 30 illustrates an embodiment of a final motion vector representation (UMVE) search process.
FIG. 31 illustrates an example of UMVE search points.
FIG. 32 shows an example of a distance index and distance offset mapping.
FIG. 33 shows an example of an optical flow trace.
Fig. 34A-34B illustrate examples of bidirectional optical flow (BDOF) without block extension: a) access positions outside the block; b) padding used to avoid additional memory accesses and computations.
Fig. 35 shows an example of using decoder-side motion vector refinement (DMVR) based on two-sided template matching.
Fig. 36 shows an example of an architecture for luma mapping with chroma scaling (LMCS).
Fig. 37 shows an example of sample positions for deriving parameters of a cross-component linear model (CCLM) mode.
Fig. 38A and 38B show example positions of upper adjacent row samples and left adjacent column samples, respectively.
Fig. 39 shows an example of a straight line between the maximum and minimum luminance values.
Fig. 40A and 40B show examples of adjacent samples selected for deriving LIC parameters using 4 samples.
Fig. 41A and 41B show examples of adjacent samples selected for deriving LIC parameters using 8 samples.
Fig. 42A and 42B show other examples of neighboring samples selected for deriving LIC parameters using 8 samples.
43A-43D illustrate an exemplary method for video processing.
FIG. 44 is a block diagram of a hardware platform for implementing the video encoding or decoding techniques described herein.
Fig. 45 illustrates an exemplary method for video processing.
Fig. 46 illustrates an exemplary method for video processing.
Fig. 47 illustrates an exemplary method for video processing.
Fig. 48 illustrates an exemplary method for video processing.
Detailed Description
Several techniques are provided herein that may be implemented into digital video encoders and decoders. Section headings are used herein for clarity of understanding, and the scope of the techniques and embodiments disclosed in each section is not limited to that section.
1. Overview
This patent document relates to video encoding and decoding techniques. In particular, it relates to simplified Local Illumination Compensation (LIC) in video coding. It may be applied to existing video coding standards, such as HEVC, or to standards to be finalized (Versatile Video Coding, VVC). It may also be applicable to future video coding standards or video codecs.
2. Examples of video codec techniques
Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. ITU-T developed H.261 and H.263, ISO/IEC developed MPEG-1 and MPEG-4 Visual, and the two organizations jointly developed the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC), and H.265/HEVC standards. Since H.262, video coding standards have been based on a hybrid video coding structure in which temporal prediction plus transform coding is employed. To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded by VCEG and MPEG jointly in 2015. Since then, JVET has adopted many new methods and put them into a reference software named Joint Exploration Model (JEM). In April 2018, the Joint Video Experts Team (JVET) between VCEG (Q6/16) and ISO/IEC JTC1 SC29/WG11 (MPEG) was created to work on the VVC standard targeting a 50% bitrate reduction compared to HEVC.
2.1. Inter prediction in HEVC/H.265
Each inter-predicted PU has motion parameters for one or two reference picture lists. The motion parameters include a motion vector and a reference picture index. The use of one of the two reference picture lists may also be signaled using inter_pred_idc. Motion vectors may be explicitly coded as deltas relative to a predictor.
When a CU is coded with skip mode, one PU is associated with the CU, and there are no significant residual coefficients, no coded motion vector delta, and no reference picture index. A merge mode is specified whereby the motion parameters of the current PU are obtained from neighboring PUs, including spatial and temporal candidates. Merge mode can be applied to any inter-predicted PU, not only to those coded in skip mode. The alternative to merge mode is the explicit transmission of motion parameters, where the motion vector (more precisely, the motion vector difference relative to a motion vector predictor), the corresponding reference picture index of each reference picture list, and the reference picture list usage are signaled explicitly per PU. Such a mode is referred to as advanced motion vector prediction (AMVP) in this disclosure.
When signaling indicates that one of the two reference picture lists is to be used, the PU is generated from one block of samples. This is called "uni-prediction". Uni-prediction is available for both P slices and B slices.
When signaling indicates that both reference picture lists are to be used, the PU is generated from two blocks of samples. This is called "bi-prediction". Bi-prediction is only available for B slices.
Details regarding these inter prediction modes specified in HEVC are provided below. The description starts with merge mode.
2.1.1. Merge mode
2.1.1.1. Derivation of candidates for merge mode
When a PU is predicted using merge mode, an index pointing to an entry in the merge candidate list is parsed from the bitstream and used to retrieve the motion information. The construction of this list is specified in the HEVC standard and can be summarized according to the following sequence of steps:
step 1: initial candidate derivation
Step 1.1: spatial domain candidate derivation
O step 1.2: redundancy check for null field candidates
Step 1.3: time domain candidate derivation
Step 2: additional candidate insertions
Step 2.1: creation of bi-directional prediction candidates
Step 2.2: insertion of zero motion candidates
A schematic illustration of these steps is also given in fig. 1. For spatial merge candidate derivation, a maximum of four merge candidates are selected from among candidates located at five different positions. For temporal merge candidate derivation, at most one merge candidate is selected from two candidates. Since a constant number of candidates is assumed for each PU at the decoder, additional candidates are generated when the number of candidates obtained from Step 1 does not reach the maximum number of merge candidates (MaxNumMergeCand) signaled in the slice header. Since the number of candidates is constant, the index of the best merge candidate is encoded using truncated unary binarization (TU). If the size of the CU is equal to 8, all PUs of the current CU share a single merge candidate list, which is identical to the merge candidate list of the 2Nx2N prediction unit.
Hereinafter, operations associated with the foregoing steps will be described in detail.
2.1.1.2. Spatial domain candidate derivation
In the derivation of spatial merge candidates, a maximum of four merge candidates are selected among candidates located at the positions shown in fig. 2. The order of derivation is A1, B1, B0, A0, and B2. Position B2 is considered only when any PU at positions A1, B1, B0, A0 is not available (e.g., because it belongs to another slice or tile) or is intra coded. After the candidate at position A1 is added, the addition of the remaining candidates is subject to a redundancy check, which ensures that candidates with the same motion information are excluded from the list, thereby improving coding efficiency. To reduce computational complexity, not all possible candidate pairs are considered in the mentioned redundancy check. Instead, only the pairs linked by the arrows in fig. 3 are considered, and a candidate is added to the list only if the corresponding candidate used for the redundancy check does not have the same motion information. Another source of duplicate motion information is the "second PU" associated with partitions other than 2Nx2N. As an example, fig. 4 shows the second PU for the Nx2N and 2NxN cases, respectively. When the current PU is partitioned as Nx2N, the candidate at position A1 is not considered for list construction. In fact, adding this candidate would lead to two prediction units having the same motion information, which is redundant when there is only one PU in a coding unit. Similarly, position B1 is not considered when the current PU is partitioned as 2NxN.
2.1.1.3. Time domain candidate derivation
In this step, only one candidate is added to the list. In particular, in the derivation of this temporal merge candidate, a scaled motion vector is derived based on the co-located PU belonging to the picture that has the smallest POC difference with the current picture within the given reference picture list. The reference picture list to be used for the derivation of the co-located PU is explicitly signaled in the slice header. The scaled motion vector for the temporal merge candidate is obtained as illustrated by the dashed line in fig. 5: it is scaled from the motion vector of the co-located PU using the POC distances tb and td, where tb is defined as the POC difference between the reference picture of the current picture and the current picture, and td is defined as the POC difference between the reference picture of the co-located picture and the co-located picture. The reference picture index of the temporal merge candidate is set equal to zero. A practical implementation of the scaling process is described in the HEVC specification. For a B slice, two motion vectors are obtained, one for reference picture list 0 and the other for reference picture list 1, and combined to form the bi-predictive merge candidate.
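As an illustration only, the POC-distance scaling described above can be sketched with HEVC-style integer arithmetic; the function name is an assumption, and the clipping of tb and td to [-128, 127] performed by the standard is omitted for brevity.
#include <algorithm>
#include <cstdlib>
static int clip3(int lo, int hi, int v) { return std::min(hi, std::max(lo, v)); }
// Scale one motion vector component of the co-located PU from POC distance td to POC distance tb.
int scaleMvComponent(int mvCol, int tb, int td) {
    if (td == 0 || tb == td) return mvCol;                          // nothing to scale
    int tx = (16384 + (std::abs(td) >> 1)) / td;                    // fixed-point approximation of 1/td
    int distScaleFactor = clip3(-4096, 4095, (tb * tx + 32) >> 6);
    int scaled = distScaleFactor * mvCol;
    return clip3(-32768, 32767, scaled >= 0 ? (scaled + 127) >> 8 : -((-scaled + 127) >> 8));
}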
In the co-located PU (Y) belonging to the reference frame, the position of the temporal candidate is selected between candidates C0 and C1, as shown in fig. 6. If the PU at position C0 is not available, is intra coded, or is outside the current CTU row, position C1 is used. Otherwise, position C0 is used in the derivation of the temporal merge candidate.
2.1.1.4. Additional candidate insertions
In addition to spatial and temporal merge candidates, there are two additional types of merge candidates: combined bi-predictive merge candidates and zero merge candidates. Combined bi-predictive merge candidates are generated by using the spatial and temporal merge candidates, and are used for B slices only. A combined bi-predictive candidate is generated by combining the first-reference-picture-list motion parameters of an initial candidate with the second-reference-picture-list motion parameters of another. If these two tuples provide different motion hypotheses, they form a new bi-predictive candidate. As an example, fig. 7 shows the case where two candidates in the original list (left), which have mvL0 and refIdxL0 or mvL1 and refIdxL1, are used to create a combined bi-predictive merge candidate that is added to the final list (right). There are numerous rules regarding the combinations that are considered to generate these additional merge candidates.
Zero motion candidates are inserted to fill the remaining entries in the merge candidate list and thus reach the MaxNumMergeCand capacity. These candidates have zero spatial displacement and reference picture indices that start at zero and increase each time a new zero motion candidate is added to the list. The number of reference frames used by these candidates is one for uni-prediction and two for bi-prediction. Finally, no redundancy check is performed on these candidates.
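As a sketch of the combination illustrated in fig. 7, a combined bi-predictive candidate takes the list-0 motion of one candidate and the list-1 motion of another; the structure and function names are assumptions, and the rules deciding which index pairs are combined are omitted.
struct MergeCand {
    int mvL0[2]; int refIdxL0; bool hasL0;   // list-0 motion (mvL0, refIdxL0)
    int mvL1[2]; int refIdxL1; bool hasL1;   // list-1 motion (mvL1, refIdxL1)
};
// Combine the list-0 motion of candidate a with the list-1 motion of candidate b.
MergeCand makeCombinedBiPred(const MergeCand& a, const MergeCand& b) {
    MergeCand c{};
    c.mvL0[0] = a.mvL0[0]; c.mvL0[1] = a.mvL0[1]; c.refIdxL0 = a.refIdxL0; c.hasL0 = true;
    c.mvL1[0] = b.mvL1[0]; c.mvL1[1] = b.mvL1[1]; c.refIdxL1 = b.refIdxL1; c.hasL1 = true;
    return c;   // only kept if it forms a motion hypothesis different from existing candidates
}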
2.1.1.5. Motion estimation region for parallel processing
To speed up the encoding process, motion estimation may be performed in parallel, whereby the motion vectors of all prediction units inside a given region are derived simultaneously. Deriving merge candidates from spatial neighbors may interfere with parallel processing because a prediction unit cannot derive motion parameters from neighboring PUs until its associated motion estimation is complete. To mitigate the trade-off between coding efficiency and processing delay, HEVC defines a Motion Estimation Region (MER), the size of which is signaled in the picture parameter set using a "log 2_ parallel _ merge _ level _ minus 2" syntax element. Merge candidates that fall in the same region are marked as unavailable when defining MER and are therefore not considered when building the list.
2.1.2.AMVP
AMVP exploits the spatio-temporal correlation of motion vectors with neighboring PUs and is used for the explicit transmission of motion parameters. For each reference picture list, a motion vector candidate list is constructed by first checking the availability of the left and above spatially neighboring PU positions and of the temporally neighboring PU positions, then removing redundant candidates, and finally adding zero vectors so that the candidate list has a constant length. The encoder can then select the best predictor from the candidate list and transmit a corresponding index indicating the chosen candidate. Similarly to merge index signaling, the index of the best motion vector candidate is encoded using a truncated unary code. The maximum value to be encoded in this case is 2 (see fig. 8). In the following sections, details about the derivation process of motion vector prediction candidates are provided.
2.1.2.1. Derivation of AMVP candidates
In motion vector prediction, two types of motion vector candidates are considered: spatial motion vector candidates and temporal motion vector candidates. For spatial motion vector candidate derivation, two motion vector candidates are finally derived based on the motion vectors of each PU located at five different positions as shown in fig. 2.
For temporal motion vector candidate derivation, one motion vector candidate is selected from two candidates, which are derived based on two different co-located positions. After the first list of spatio-temporal candidates is made, duplicated motion vector candidates in the list are removed. If the number of potential candidates is larger than two, motion vector candidates whose reference picture index within the associated reference picture list is larger than 1 are removed from the list. If the number of spatio-temporal motion vector candidates is smaller than two, additional zero motion vector candidates are added to the list.
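A minimal sketch of the trimming of the AMVP candidate list described above, assuming the candidates have already been gathered; the names are hypothetical, and the removal of candidates with reference picture index larger than 1 is left out for brevity.
#include <vector>
struct MvCand { int mvx, mvy; };
void finalizeAmvpList(std::vector<MvCand>& list) {
    // remove duplicated motion vector candidates
    for (size_t i = 0; i < list.size(); ++i)
        for (size_t j = list.size(); j-- > i + 1; )
            if (list[j].mvx == list[i].mvx && list[j].mvy == list[i].mvy)
                list.erase(list.begin() + j);
    if (list.size() > 2) list.resize(2);              // keep at most two candidates
    while (list.size() < 2) list.push_back({0, 0});   // pad with zero motion vectors
}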
2.1.2.2. Spatial motion vector candidates
In the derivation of spatial motion vector candidates, a maximum of two candidates are considered among five possible candidates, which are derived from PUs located at the positions shown in fig. 2; those positions are the same as those of motion merge. The order of derivation for the left side of the current PU is defined as A0, A1, scaled A0, scaled A1. The order of derivation for the above side of the current PU is defined as B0, B1, B2, scaled B0, scaled B1, scaled B2. For each side there are therefore four cases that can be used as motion vector candidates, two of which do not require spatial scaling and two of which use spatial scaling. The four different cases are summarized as follows:
no spatial domain scaling
- (1) same reference Picture List and same reference Picture index (same POC)
- (2) different reference picture lists, but the same reference picture (same POC)
Spatial scaling
- (3) same reference picture list, but different reference picture indices (different POCs)
- (4) different reference Picture lists and different reference pictures (different POCs)
The no-spatial-scaling cases are checked first, followed by the spatial-scaling cases. Spatial scaling is considered when the POC differs between the reference picture of the neighboring PU and that of the current PU, regardless of the reference picture list. If all PUs of the left candidates are not available or are intra coded, scaling for the above motion vector is allowed to help parallel derivation of the left and above MV candidates. Otherwise, spatial scaling is not allowed for the above motion vector.
In the spatial scaling process, the motion vectors of neighboring PUs are scaled in a similar manner as the temporal scaling, as shown in fig. 9. The main difference is that the reference picture list and index of the current PU are given as input; the actual scaling procedure is the same as for the time domain scaling.
2.1.2.3. Time domain vector candidates
All the procedures for the derivation of temporal merge candidates are the same as the derivation of spatial motion vector candidates (see fig. 6), except for the reference picture index derivation. The reference picture index is signaled to the decoder.
2.2. New inter prediction methods in JEM
2.2.1. sub-CU-based motion vector prediction
In JEM with QTBT, each CU can have at most one set of motion parameters for each prediction direction. Two sub-CU-level motion vector prediction methods are considered in the encoder by splitting a large CU into sub-CUs and deriving motion information for all the sub-CUs of the large CU. The Alternative Temporal Motion Vector Prediction (ATMVP) method allows each CU to fetch multiple sets of motion information from multiple blocks smaller than the current CU in the co-located reference picture. In the Spatial-Temporal Motion Vector Prediction (STMVP) method, the motion vectors of the sub-CUs are derived recursively by using the temporal motion vector predictor and the spatial neighboring motion vectors.
In order to preserve more accurate motion fields for sub-CU motion prediction, motion compression for reference frames is currently disabled.
2.2.1.1. Alternative temporal motion vector prediction
In the Alternative Temporal Motion Vector Prediction (ATMVP) method, temporal motion vector prediction (TMVP) is modified by fetching multiple sets of motion information (including motion vectors and reference indices) from blocks smaller than the current CU. As shown in fig. 10, the sub-CUs are square NxN blocks (N is set to 4 by default).
ATMVP predicts the motion vectors of the sub-CUs within a CU in two steps. The first step is to identify the corresponding block in a reference picture with a so-called temporal vector. The reference picture is called the motion source picture. The second step is to split the current CU into sub-CUs and to obtain the motion vector and the reference index of each sub-CU from the block corresponding to that sub-CU, as shown in fig. 10.
In a first step, a reference picture and a corresponding block are determined by motion information of spatial neighboring blocks of the current CU. To avoid a repeated scanning process of neighboring blocks, the first merge candidate in the list of merge candidates in the current CU is used. The first available motion vector and its associated reference index are set to the temporal vector and the index to the motion source picture. In this way, in ATMVP, the corresponding block can be identified more accurately than in TMVP, where the corresponding block (sometimes referred to as a co-located block) is always in a lower right or center position with respect to the current CU.
In the second step, the corresponding block of a sub-CU is identified by the temporal vector in the motion source picture, by adding the temporal vector to the coordinates of the current CU. For each sub-CU, the motion information of its corresponding block (the smallest motion grid that covers the center sample) is used to derive the motion information of that sub-CU. After the motion information of a corresponding NxN block is identified, it is converted into the motion vectors and reference indices of the current sub-CU in the same way as the TMVP of HEVC, where motion scaling and other procedures apply. For example, the decoder checks whether the low-delay condition is fulfilled (i.e., the POCs of all reference pictures of the current picture are smaller than the POC of the current picture) and possibly uses the motion vector MVx (the motion vector corresponding to reference picture list X) to predict the motion vector MVy for each sub-CU (with X equal to 0 or 1 and Y equal to 1-X).
2.2.1.2. Spatio-temporal motion vector prediction
In this method, the motion vectors of the sub-CUs are derived recursively, following raster scan order. Fig. 11 illustrates this concept. Consider an 8x8 CU that contains four 4x4 sub-CUs A, B, C, and D. The neighboring 4x4 blocks in the current frame are labeled a, b, c, and d.
The motion derivation for sub-CU A starts by identifying its two spatial neighbors. The first neighbor is the NxN block above sub-CU A (block c). If this block c is not available or is intra coded, the other NxN blocks above sub-CU A are checked (from left to right, starting at block c). The second neighbor is the block to the left of sub-CU A (block b). If block b is not available or is intra coded, the other blocks to the left of sub-CU A are checked (from top to bottom, starting at block b). The motion information obtained from the neighboring blocks for each list is scaled to the first reference frame for the given list. Next, the temporal motion vector predictor (TMVP) of sub-block A is derived by following the same procedure of TMVP derivation as specified in HEVC. The motion information of the co-located block at location D is fetched and scaled accordingly. Finally, after retrieving and scaling the motion information, all available motion vectors (up to 3) are averaged separately for each reference list. The averaged motion vector is assigned as the motion vector of the current sub-CU.
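The final averaging step can be sketched as below for one sub-CU and one reference list, assuming the up-to-three motion vectors (above neighbor, left neighbor, TMVP) have already been fetched and scaled to the first reference frame of the list; the names and the plain integer division are assumptions of this sketch.
struct Mv { int x, y; };
// Average the available motion vectors (1 to 3 of them) to obtain the sub-CU motion vector.
Mv averageStmvpCandidates(const Mv* cands, int numAvail) {
    int sx = 0, sy = 0;
    for (int i = 0; i < numAvail; ++i) { sx += cands[i].x; sy += cands[i].y; }
    return { sx / numAvail, sy / numAvail };
}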
2.2.1.3. sub-CU motion prediction mode signaling
The sub-CU modes are enabled as additional merge candidates, and no additional syntax element is needed to signal the modes. Two additional merge candidates are added to the merge candidate list of each CU to represent the ATMVP mode and the STMVP mode. Up to seven merge candidates are used if the sequence parameter set indicates that ATMVP and STMVP are enabled. The encoding logic of the additional merge candidates is the same as for the merge candidates in HM, which means that, for each CU in a P or B slice, two more RD checks are needed for the two additional merge candidates.
In JEM, all bins of the merge index are context coded by CABAC, whereas in HEVC only the first bin is context coded and the remaining bins are context bypass coded.
2.2.2. Pairwise average candidates
Pairwise average candidates are generated by averaging predefined pairs of candidates in the current merge candidate list; the predefined pairs are defined as {(0,1), (0,2), (1,2), (0,3), (1,3), (2,3)}, where the numbers denote the merge indices into the merge candidate list. The averaged motion vectors are calculated separately for each reference list. If both motion vectors are available in one list, they are averaged even when they point to different reference pictures; if only one motion vector is available, it is used directly; if no motion vector is available, the list is kept invalid. The pairwise average candidates replace the combined candidates in the HEVC standard.
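A minimal sketch of the per-list averaging rule described above; the names are assumptions and the exact rounding used by the codec is not shown.
struct Mv { int x, y; };
// Returns true if this list of the pairwise average candidate is valid.
bool pairwiseAverageForList(const Mv* mvA, const Mv* mvB, Mv& out) {
    if (mvA && mvB) {   // both MVs exist: average them, even if they point to different reference pictures
        out = { (mvA->x + mvB->x) / 2, (mvA->y + mvB->y) / 2 };
        return true;
    }
    if (mvA) { out = *mvA; return true; }   // only one MV: use it directly
    if (mvB) { out = *mvB; return true; }
    return false;                           // no MV: this list stays invalid
}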
2.2.3. Planar motion vector prediction
Planar motion vector prediction is proposed.
To generate a smooth fine-grained motion field, fig. 12 gives a brief description of the planar motion vector prediction process.
Planar motion vector prediction is achieved by averaging horizontal and vertical linear interpolations based on 4x4 blocks as follows.
P(x,y) = (H × Ph(x,y) + W × Pv(x,y) + H × W) / (2 × H × W)
W and H denote the width and the height of the block. (x,y) are the coordinates of the current sub-block relative to the top-left sub-block. All distances are represented by the pixel distance divided by 4. P(x,y) is the motion vector of the current sub-block.
The horizontal prediction Ph(x,y) and the vertical prediction Pv(x,y) for position (x,y) are calculated as follows:
Ph(x,y) = (W - 1 - x) × L(-1,y) + (x + 1) × R(W,y)
Pv(x,y) = (H - 1 - y) × A(x,-1) + (y + 1) × B(x,H)
where L(-1,y) and R(W,y) are the motion vectors of the 4x4 blocks to the left and right of the current block, and A(x,-1) and B(x,H) are the motion vectors of the 4x4 blocks above and below the current block.
Reference motion information of left-column and upper-row neighboring blocks is derived from spatial neighboring blocks of the current block.
The reference motion information of the right column and bottom row neighboring blocks is derived as follows.
1) The motion information of the bottom-right temporal neighboring 4x4 block is derived.
2) The motion vectors of the right-column neighboring 4x4 blocks are calculated using the derived motion information of the bottom-right neighboring 4x4 block together with the motion information of the above-right neighboring 4x4 block, as described in Equation 1.
3) The motion vectors of the bottom-row neighboring 4x4 blocks are calculated using the derived motion information of the bottom-right neighboring 4x4 block together with the motion information of the bottom-left neighboring 4x4 block, as described in Equation 2.
R(W,y) = ((H - y - 1) × AR + (y + 1) × BR) / H    (Equation 1)
B(x,H) = ((W - x - 1) × BL + (x + 1) × BR) / W    (Equation 2)
where AR is the motion vector of the above-right spatial neighboring 4x4 block, BR is the motion vector of the bottom-right temporal neighboring 4x4 block, and BL is the motion vector of the bottom-left spatial neighboring 4x4 block.
The motion information obtained from the neighboring blocks for each list is scaled to the first reference picture for the given list.
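A direct transcription of the formulas above into integer arithmetic is sketched below for one 4x4 sub-block at sub-block coordinates (x, y); L, R, A and B follow the definitions above, the names are assumptions, and the division-free implementation used in practice is not shown.
struct Mv { int x, y; };
Mv planarMvPredict(int x, int y, int W, int H, Mv L, Mv R, Mv A, Mv B) {
    // horizontal interpolation Ph(x,y) and vertical interpolation Pv(x,y)
    Mv Ph = { (W - 1 - x) * L.x + (x + 1) * R.x, (W - 1 - x) * L.y + (x + 1) * R.y };
    Mv Pv = { (H - 1 - y) * A.x + (y + 1) * B.x, (H - 1 - y) * A.y + (y + 1) * B.y };
    // P(x,y) = (H * Ph + W * Pv + H * W) / (2 * H * W)
    return { (H * Ph.x + W * Pv.x + H * W) / (2 * H * W),
             (H * Ph.y + W * Pv.y + H * W) / (2 * H * W) };
}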
2.2.4. Adaptive motion vector difference resolution
In HEVC, when use_integer_mv_flag is equal to 0 in the slice header, motion vector differences (MVDs) (between the motion vector and the predicted motion vector of a PU) are signaled in units of quarter luma samples. In JEM, a Locally Adaptive Motion Vector Resolution (LAMVR) is introduced. In JEM, the MVD can be coded in units of quarter luma samples, full luma samples, or four luma samples. The MVD resolution is controlled at the Coding Unit (CU) level, and MVD resolution flags are conditionally signaled for each CU that has at least one non-zero MVD component.
For a CU with at least one non-zero MVD component, a first flag is signaled to indicate whether quarter luma sample MV precision is used in the CU. When the first flag (equal to 1) indicates that quarter-luma sample MV accuracy is not used, another flag is signaled to indicate whether full luma sample MV accuracy or four luma sample MV accuracy is used.
When the first MVD resolution flag of a CU is zero, or when the flag is not coded for the CU (meaning that all MVDs within the CU are zero), a quarter-luma sample MV resolution is used for the CU. When a CU uses full luma sample MV precision or four luma sample MV precision, the MVP in the AMVP candidate list of the CU is rounded with respect to the corresponding precision.
In the encoder, a CU-level RD check is used to determine which MVD resolution to use for a CU. That is, three CU-level RD checks are performed for each MVD resolution. To speed up the encoder speed, the following encoding scheme is applied in JEM.
During the RD check of a CU with normal quarter-luminance sample MVD resolution, the motion information (full luminance sample accuracy) of the current CU is stored. During the RD check for the same CU with full luma sample and 4 luma sample MVD resolutions, the stored motion information (after rounding) is used as a starting point for smaller range motion vector refinement, so that the time-consuming motion estimation process is not repeated three times.
The RD check of CUs with 4-luma-sample MVD resolution is invoked conditionally. For a CU, when the RD cost of the full-luma-sample MVD resolution is much larger than that of the quarter-luma-sample MVD resolution, the RD check of the 4-luma-sample MVD resolution for that CU is skipped.
The encoding process is shown in fig. 13. First, the 1/4-pel MV is tested and the RD cost is calculated and denoted as RDCost0; then the integer MV is tested and the RD cost is denoted as RDCost1. If RDCost1 < th × RDCost0 (where th is a positive value), the 4-pel MV is tested; otherwise, the 4-pel MV is skipped. Basically, the motion information, RD cost, etc. are already known for the 1/4-pel MV when checking the integer MV or the 4-pel MV, and they can be reused to speed up the encoding process of the integer MV or 4-pel MV.
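The conditional 4-pel check can be sketched as follows; the threshold value and the names are assumptions made for illustration.
// Decide whether the 4-pel MVD resolution should be RD-checked, given the RD costs of
// the 1/4-pel check (rdCost0) and the integer-pel check (rdCost1).
bool shouldTestFourPelMv(double rdCost0, double rdCost1) {
    const double th = 1.3;            // assumed example value of the positive threshold th
    return rdCost1 < th * rdCost0;    // otherwise the 4-pel MVD check is skipped
}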
2.2.5. Higher motion vector storage accuracy
In HEVC, the motion vector precision is one-quarter pixel (for 4:2:0 video, one-quarter luma samples and one-eighth chroma samples). In JEM, the accuracy of the intra motion vector storage and merge candidate is improved to 1/16 pixels. This higher motion vector precision (1/16 pixels) is used when motion compensated inter prediction is performed on CUs coded with skip/merge mode. For CUs that are coded using normal AMVP mode, integer-pixel or quarter-pixel motion is used, as described in section 0.
The SHVC upsampling interpolation filter has the same filter length and normalization factor as the HEVC motion compensated interpolation filter and is used as a motion compensated interpolation filter for the extra fractional pixel positions. The chroma component motion vector accuracy is 1/32 samples in JEM, and an additional interpolation filter for the 1/16 pixel fractional position is derived by using the average of the filters for the two adjacent 1/16 pixel fractional positions.
2.2.6. Overlapped block motion compensation
Overlapped Block Motion Compensation (OBMC) has previously been used in H.263. In JEM, unlike in H.263, OBMC can be switched on and off using syntax at the CU level. When OBMC is used in JEM, it is performed for all Motion Compensation (MC) block boundaries except the right and bottom boundaries of a CU. Moreover, it is applied to both the luma and chroma components. In JEM, an MC block corresponds to a coding block. When a CU is coded with a sub-CU mode (including sub-CU merge, affine and FRUC modes), each sub-block of the CU is an MC block. To process CU boundaries in a uniform fashion, OBMC is performed at the sub-block level for all MC block boundaries, where the sub-block size is set equal to 4x4, as shown in fig. 14.
When OBMC is applied to the current sub-block, the motion vectors of the four connected neighboring sub-blocks (if any and different from the current motion vector) are used in addition to the current motion vector to derive a prediction block for the current sub-block. These multiple prediction blocks based on multiple motion vectors are combined to generate a final prediction signal for the current sub-block.
The prediction block based on the motion vector of a neighboring sub-block is denoted as PN, where N indicates an index for the neighboring above, below, left and right sub-blocks, and the prediction block based on the motion vector of the current sub-block is denoted as PC. When PN is based on the motion information of a neighboring sub-block that contains the same motion information as the current sub-block, OBMC is not performed from PN. Otherwise, every sample of PN is added to the same sample in PC, i.e., four rows/columns of PN are added to PC. The weighting factors {1/4, 1/8, 1/16, 1/32} are used for PN and the weighting factors {3/4, 7/8, 15/16, 31/32} are used for PC. The exception is small MC blocks (i.e., when the height or width of the coding block is equal to 4 or when a CU is coded with a sub-CU mode), for which only two rows/columns of PN are added to PC. In this case the weighting factors {1/4, 1/8} are used for PN and the weighting factors {3/4, 7/8} are used for PC. For PN generated based on the motion vectors of vertically (horizontally) neighboring sub-blocks, the samples in the same row (column) of PN are added to PC with the same weighting factor.
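The blending described above can be sketched as follows for the case of a prediction PN obtained from the above neighboring sub-block, blended row by row into the current prediction PC of a 4x4 sub-block; the rounding offset and the names are assumptions.
// Blend PN into PC with per-row weights {1/4, 1/8, 1/16, 1/32} for PN and
// {3/4, 7/8, 15/16, 31/32} for PC (expressed here in 1/32 units).
void blendObmcFromAbove(int PC[4][4], const int PN[4][4], bool smallMcBlock) {
    const int wN[4] = { 8, 4, 2, 1 };
    const int wC[4] = { 24, 28, 30, 31 };
    int rows = smallMcBlock ? 2 : 4;          // small MC blocks blend only two rows
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < 4; ++x)
            PC[y][x] = (wC[y] * PC[y][x] + wN[y] * PN[y][x] + 16) >> 5;
}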
In JEM, for a CU with size less than or equal to 256 luma samples, a CU-level flag is signaled to indicate whether OBMC is applied to the current CU. For CUs with size larger than 256 luma samples or not coded with AMVP mode, OBMC is applied by default. At the encoder, when OBMC is applied to a CU, its impact is taken into account during the motion estimation stage. The prediction signal formed by OBMC using the motion information of the top neighboring block and the left neighboring block is used to compensate the top and left boundaries of the original signal of the current CU, and then the normal motion estimation process is applied.
2.2.7. Local illumination compensation
Local Illumination Compensation (LIC) is based on a linear model for illumination changes, using a scaling factor a and an offset b. It is enabled or disabled adaptively for each inter-mode coded Coding Unit (CU).
When LIC is applied to a CU, a least-squares-error method is employed to derive the parameters a and b by using the neighboring samples of the current block and of its reference block (identified by the motion information of the current block or sub-CU). To avoid integer division, the number of samples should be a power of 2 so that the division can be performed by a right shift. The maximum number of neighboring samples N used by LIC (in case both sides are available) is:
minDim=min(cuHeight,cuWidth)
minStepBit=minDim>8?1:0;
numSteps=minDim>>minStepBit;
N=2×numSteps
here, cuHeight and cuWidth are the height and width of the current block.
More specifically, as shown in fig. 15, neighboring samples of the 32 × 16CU and corresponding samples in the reference picture are used.
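A minimal sketch of the least-squares derivation and application of the LIC parameters, assuming ref[] holds the N neighboring samples of the reference block and cur[] the N neighboring samples of the current block; floating point is used here for clarity, whereas the codec uses integer arithmetic with right shifts, and the names are assumptions.
#include <cstdint>
// Derive a and b such that cur is approximately a * ref + b in the least-squares sense.
void deriveLicParams(const int* ref, const int* cur, int N, double& a, double& b) {
    int64_t sumX = 0, sumY = 0, sumXX = 0, sumXY = 0;
    for (int i = 0; i < N; ++i) {
        sumX  += ref[i];
        sumY  += cur[i];
        sumXX += (int64_t)ref[i] * ref[i];
        sumXY += (int64_t)ref[i] * cur[i];
    }
    int64_t denom = (int64_t)N * sumXX - sumX * sumX;
    a = denom ? double((int64_t)N * sumXY - sumX * sumY) / double(denom) : 1.0;
    b = (double(sumY) - a * double(sumX)) / N;
}
// Each inter-prediction sample is then compensated as pred'(x, y) = a * pred(x, y) + b.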
When a CU is coded with merge mode, the LIC flag is copied from the spatial, HMVP and uni-directional pairwise merge candidates in a way similar to how motion information is copied in merge mode; otherwise, an LIC flag is signaled for the CU to indicate whether LIC applies.
In addition to the MV and reference index, LIC flag is included as part of the motion information. When constructing the merge candidate list, the LIC flag is inherited from the neighboring block for the merge candidate, but is not used for motion vector pruning for simplification purposes.
The LIC flag is not stored in the motion vector buffer of the reference picture, so the LIC flag is always set equal to false for TMVP. For bi-directional merge candidates, such as pairwise mean candidates and zero motion candidates, the LIC flag is also set equal to false.
The LIC flag is context coded with a single context, and the LIC flag is not signaled when the LIC tool is not applicable.
When using loop luma reshaping, inverse reshaping is applied to neighboring samples prior to LIC parameter derivation, since the neighbors of the current block are in the reshaped domain, but the reference picture samples are in the original (non-reshaped) domain.
When LIC is enabled for a picture, an additional CU-level RD check is needed to determine if LIC is to be applied to a CU. When LIC is enabled for a CU, the sum of absolute differences with mean removed (MR-SAD) and the sum of absolute Hadamard transform differences with mean removed (MR-SATD), rather than SAD and SATD, are used for integer-pixel motion search and fractional-pixel motion search, respectively.
To reduce the coding complexity, the following coding scheme is applied in JEM.
When there is no significant illumination change between the current picture and its reference pictures, LIC is disabled for the whole picture. To identify this situation, histograms of the current picture and of every reference picture of the current picture are calculated at the encoder. If the histogram difference between the current picture and every reference picture of the current picture is smaller than a given threshold, LIC is disabled for the current picture; otherwise, LIC is enabled for the current picture.
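A sketch of this picture-level test, assuming 8-bit luma and a sum-of-absolute-differences histogram metric; the metric, the threshold handling, and the names are assumptions.
#include <array>
#include <cstdlib>
#include <vector>
using Histogram = std::array<int, 256>;   // one bin per 8-bit luma value (assumption)
static int histogramDiff(const Histogram& h0, const Histogram& h1) {
    int d = 0;
    for (int i = 0; i < 256; ++i) d += std::abs(h0[i] - h1[i]);
    return d;
}
// LIC is disabled for the current picture when every reference picture is similar enough.
bool licEnabledForPicture(const Histogram& cur, const std::vector<Histogram>& refs, int threshold) {
    for (const Histogram& r : refs)
        if (histogramDiff(cur, r) >= threshold) return true;   // significant illumination change
    return false;
}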
LIC also has the following constraints:
Reconstructed samples of intra, CIIP and IBC coded blocks are not used.
LIC does not have to be disabled above the CTU boundary.
LIC is disabled for CIIP and IBC blocks.
LIC is disabled for blocks with fewer than 64 luma samples.
The LIC flag has no temporal inheritance.
No pruning based on the LIC flag is performed when generating the merge candidate list.
LIC is disabled for all sub-block modes, including triangle mode, and for BDOF.
LIC is not applied to 128xN and Nx128 blocks.
LIC is not applied to bi-prediction.
Using intra reconstructed samples introduces a dependency among blocks: in order to perform the prediction or reconstruction of an inter-coded block, it is necessary to wait until the neighboring intra block is reconstructed, and the additional processing can only start at that time. To make the implementation more friendly and avoid possible delays, LIC uses only reconstructed samples from inter-coded neighbors. If a neighboring reconstructed sample is intra, CIIP or IBC coded, it is replaced by the corresponding reference sample.
To reduce the local memory size of the hardware pipeline, it is desirable to process inter blocks in 16×16 units. However, the existing LIC uses all neighboring samples of the current block to derive the LIC parameters. It has been proposed to introduce constraints that enable small pipeline-unit processing for LIC. Fig. 16 shows the proposed method for 16×16 unit processing. The LIC parameters are calculated at the first 16×16 block of the current block, referring only to the neighboring samples of this 16×16 block, and the LIC parameters are shared with the other 16×16 blocks in the current block.
vpdu_blk_size = (compID == COMPONENT_Y) ? INTER_LIC_VPDU_SIZE : (INTER_LIC_VPDU_SIZE >> 1);
cuWidth = (cu.blocks[compID].width > vpdu_blk_size) ? vpdu_blk_size : cu.blocks[compID].width;
cuHeight = (cu.blocks[compID].height > vpdu_blk_size) ? vpdu_blk_size : cu.blocks[compID].height;
INTER_LIC_VPDU_SIZE is 16. For the luma component, vpdu_blk_size is 16; for the chroma components, vpdu_blk_size is 8. cu.blocks[compID].width is the width of the current block for component compID (COMPONENT_Y, COMPONENT_Cb, COMPONENT_Cr); cu.blocks[compID].height is the height of the current block for component compID (COMPONENT_Y, COMPONENT_Cb, COMPONENT_Cr).
Maximum number of adjacent samples N used by LIC (in case both sides are available):
minDim=min(cuHeight,cuWidth)
minStepBit = minDim > 8 ? 1 : 0;
numSteps=minDim>>minStepBit;
N=2×numSteps
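As an illustration, the following C++ sketch (an assumption-laden illustration, not the reference software; the struct and function names are invented here) derives the constrained LIC unit size and the maximum number of neighboring samples N according to the formulas above:

#include <algorithm>
constexpr int INTER_LIC_VPDU_SIZE = 16;
struct LicUnit { int cuWidth; int cuHeight; int maxNeighborSamples; };
// isLuma stands in for the COMPONENT_Y check; blockWidth/blockHeight are the dimensions
// of the current block for that component.
LicUnit deriveLicUnit(bool isLuma, int blockWidth, int blockHeight) {
    const int vpduBlkSize = isLuma ? INTER_LIC_VPDU_SIZE : (INTER_LIC_VPDU_SIZE >> 1);
    const int cuWidth  = std::min(blockWidth,  vpduBlkSize);
    const int cuHeight = std::min(blockHeight, vpduBlkSize);
    // Maximum number of neighboring samples N when both the above and left sides are available.
    const int minDim     = std::min(cuWidth, cuHeight);
    const int minStepBit = (minDim > 8) ? 1 : 0;
    const int numSteps   = minDim >> minStepBit;
    const int N          = 2 * numSteps;
    return { cuWidth, cuHeight, N };
}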
2.2.8. Hybrid intra and inter prediction
Multi-hypothesis prediction is proposed, where mixed intra and inter prediction is one way to generate multiple hypotheses.
When multi-hypothesis prediction is applied to improve intra mode, it combines one intra prediction and one merge indexed prediction. In a merge CU, a flag is signaled for merge mode to select an intra mode from an intra candidate list when the flag is true. For the luma component, the intra candidate list is derived from 4 intra prediction modes, including DC, planar, horizontal and vertical modes, and the size of the intra candidate list can be 3 or 4 depending on the block size. When the CU width is larger than twice the CU height, the horizontal mode is excluded from the intra mode list, and when the CU height is larger than twice the CU width, the vertical mode is removed from the intra mode list. The intra prediction mode selected by the intra mode index and the merge indexed prediction selected by the merge index are combined using a weighted average. For the chroma component, DM is always applied without extra signaling. The weights for combining the predictions are as follows. Equal weights are applied when DC or planar mode is selected, or when the CB width or height is smaller than 4. For those CBs with width and height larger than or equal to 4, when the horizontal/vertical mode is selected, one CB is first vertically/horizontally split into four equal-area regions. A weight set denoted (w_intra_i, w_inter_i), where i is from 1 to 4, is applied to the corresponding region, with (w_intra_1, w_inter_1) = (6, 2), (w_intra_2, w_inter_2) = (5, 3), (w_intra_3, w_inter_3) = (3, 5) and (w_intra_4, w_inter_4) = (2, 6). (w_intra_1, w_inter_1) is used for the region closest to the reference samples and (w_intra_4, w_inter_4) is used for the region farthest from the reference samples. The combined prediction can then be calculated by summing the two weighted predictions and right-shifting by 3 bits. Furthermore, the intra prediction mode of the intra hypothesis of the predictor can be saved for reference by subsequent neighboring CUs.
Such a method is also known as combined intra-inter prediction (CIIP).
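As an illustration of the weighting just described, the following hedged C++ sketch combines the intra and inter predictions for one sample; the array names and the regionIdx parameter are assumptions made here, with region 1 closest to the reference samples and region 4 farthest away:

static const int W_INTRA[4] = { 6, 5, 3, 2 };
static const int W_INTER[4] = { 2, 3, 5, 6 };
// equalWeight covers the DC/planar case and CBs with width or height less than 4.
int ciipCombine(int pIntra, int pInter, int regionIdx /*1..4*/, bool equalWeight) {
    const int wIntra = equalWeight ? 4 : W_INTRA[regionIdx - 1];
    const int wInter = equalWeight ? 4 : W_INTER[regionIdx - 1];
    // The weights always sum to 8, so the combined prediction is right-shifted by 3 bits.
    return (wIntra * pIntra + wInter * pInter) >> 3;
}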
2.2.9. Triangle prediction unit mode
The idea of the triangle prediction unit mode is to introduce a new triangle partition for motion compensated prediction. As shown in fig. 17, it divides a CU into two triangle Prediction Units (PUs) in the diagonal or anti-diagonal direction. Each triangle prediction unit in a CU uses its own uni-directional prediction motion vector and a reference frame index derived from the uni-directional prediction candidate list for inter prediction. After the triangle prediction unit is predicted, an adaptive weighting process is performed on the diagonal edges. Then, the transform and quantization process is applied to the entire CU. It is noted that this mode only applies to skip and merge modes.
Unidirectional prediction candidate list
The unidirectional prediction candidate list is composed of five unidirectional prediction motion vector candidates. It is derived from seven neighboring blocks, including five spatial neighboring blocks (1 to 5) and two temporal co-located blocks (6 to 7), as shown in fig. 18. The motion vectors of the seven neighboring blocks are collected and put into the uni-directional prediction candidate list in the order of the uni-directional prediction motion vector, the L0 motion vector of the bi-directional prediction motion vector, the L1 motion vector of the bi-directional prediction motion vector, and the average motion vectors of the L0 and L1 motion vectors of the bi-directional prediction motion vector. If the number of candidates is less than five, then zero motion vectors are added to the list.
Adaptive weighting process
After predicting each triangle prediction unit, an adaptive weighting process is applied to the diagonal edges between two triangle prediction units to derive a final prediction for the entire CU. Two weight factor groups are listed as follows:
first set of weighting factors: use {7/8,6/8,4/8,2/8,1/8} and {7/8,4/8,1/8} for luma and chroma samples, respectively;
second set of weighting factors: {7/8,6/8,5/8,4/8,3/8,2/8,1/8} and {6/8,4/8,2/8} are used for luminance and chrominance samples, respectively.
A set of weighting factors is selected based on a comparison of the motion vectors of the two triangular prediction units. The second weight factor set is used when the reference pictures of the two triangular prediction units are different from each other or have a motion vector difference greater than 16 pixels. Otherwise, the first set of weighting factors is used. An example is shown in fig. 19.
Motion vector storage
The motion vectors of the triangle prediction units (Mv 1 and Mv2 in fig. 20) are stored in a 4 × 4 mesh. For each 4x4 mesh, either uni-directional predicted or bi-directional predicted motion vectors are stored, depending on the position of the 4x4 mesh in the CU. As shown in fig. 20, the uni-directional predictive motion vector Mv1 or Mv2 is stored for a 4 × 4 mesh located in an unweighted region. On the other hand, for a 4 × 4 mesh located in the weighting region, the bidirectional predictive motion vector is stored. Bi-predictive motion vectors are derived from Mv1 and Mv2 according to the following rules:
1) in the case of Mv1 and Mv2 with motion vectors from different directions (L0 or L1), Mv1 and Mv2 are simply combined to form a bi-predictive motion vector.
2) In the case where Mv1 and Mv2 both come from the same L0 (or L1) direction,
2.a) if the reference picture of Mv2 is the same as a picture in the L1 (or L0) reference picture list, then Mv2 is scaled to that picture. The Mv1 and the scaled Mv2 are combined to form a bi-predictive motion vector.
2, b) if the reference picture of Mv1 is the same as a picture in the L1 (or L0) reference picture list, then Mv1 is scaled to that picture. The scaled Mv1 and Mv2 are combined to form a bi-predictive motion vector.
2, c) otherwise, only Mv1 is stored for the weighted region.
2.2.10. Affine motion compensated prediction
In HEVC, only the translational motion model is applied for Motion Compensated Prediction (MCP). In the real world, there are many kinds of movements, such as zoom-in/zoom-out, rotation, perspective movement, and other irregular movements. In JEM, a simplified affine transform motion compensated prediction is applied. As shown in fig. 21, the affine motion field of the block is described by two control point motion vectors.
The Motion Vector Field (MVF) of a block is described by the following equation:
Figure BDA0003363224490000201
where (v_0x, v_0y) is the motion vector of the top-left corner control point and (v_1x, v_1y) is the motion vector of the top-right corner control point.
To further simplify motion compensated prediction, sub-block based affine transform prediction is applied. The sub-block size M×N is derived as in equation 2, where MvPre is the motion vector fractional precision (1/16 in JEM), and (v_2x, v_2y) is the motion vector of the bottom-left control point, calculated according to equation 1.
Figure BDA0003363224490000202
After derivation by equation 2, M and N should be adjusted downward, if necessary, to be divisors of w and h, respectively.
To derive the motion vector for each M × N sub-block, the motion vector for the center sample of each sub-block is calculated according to equation 1 (as shown in fig. 22) and rounded to 1/16 fractional accuracy. Then, the motion compensated interpolation filter mentioned in section 0 is applied to generate a prediction for each sub-block by means of the derived motion vector.
After MCP, the high precision motion vector of each sub-block is rounded and stored with the same precision as the normal motion vector.
2.2.10.1.AF _ INTER mode
In JEM, there are two affine motion modes: AF_INTER mode and AF_MERGE mode. For CUs with both width and height larger than 8, AF_INTER mode can be applied. An affine flag at the CU level is signaled in the bitstream to indicate whether AF_INTER mode is used. In this mode, a candidate list of motion vector pairs {(v_0, v_1) | v_0 = {v_A, v_B, v_C}, v_1 = {v_D, v_E}} is constructed using neighboring blocks. As shown in fig. 24, v_0 is selected from the motion vectors of blocks A, B or C. The motion vector from the neighboring block is scaled according to the reference list and the relationship among the reference POC of the neighboring block, the reference POC of the current CU, and the POC of the current CU. The way of selecting v_1 from the neighboring blocks D and E is similar. If the number of candidates in the list is smaller than 2, the list is filled with motion vector pairs formed by duplicating each AMVP candidate. When the candidate list is larger than 2, the candidates are first sorted according to the consistency of the neighboring motion vectors (the similarity of the two motion vectors in a candidate pair), and only the first two candidates are kept. An RD cost check is used to determine which motion vector pair candidate is selected as the control point motion vector predictor (CPMVP) of the current CU, and an index indicating the position of the CPMVP in the candidate list is signaled in the bitstream. After the CPMVP of the current affine CU is determined, affine motion estimation is applied and the control point motion vectors (CPMVs) are found. The difference between the CPMVs and the CPMVP is then signaled in the bitstream.
In AF_INTER mode, when the 4/6-parameter affine mode is used, 2/3 control points are required, and therefore 2/3 MVDs need to be coded for these control points, as shown in fig. 23. It is proposed to derive the MVs such that mvd_1 and mvd_2 are predicted from mvd_0.
Figure BDA0003363224490000211
Figure BDA0003363224490000212
Figure BDA0003363224490000213
where
Figure BDA0003363224490000214
Here, the predicted motion vector, the motion vector difference mvd_i, and the motion vector mv_i refer to the top-left pixel (i = 0), the top-right pixel (i = 1), or the bottom-left pixel (i = 2), respectively, as shown in fig. 23. Note that the addition of two motion vectors (e.g., mvA(xA, yA) and mvB(xB, yB)) is equal to the sum of the two components separately; that is, newMV = mvA + mvB, and the two components of newMV are (xA + xB) and (yA + yB), respectively.
2.2.10.2. Fast affine ME algorithm in AF_INTER mode
In affine mode, the MVs of 2 or 3 control points need to be jointly determined. Direct joint search for multiple MVs is computationally complex. A fast ME algorithm is proposed and employed in VTM/BMS.
The fast affine ME algorithm is described for a 4-parameter affine model and the idea can be extended to a 6-parameter affine model.
Figure BDA0003363224490000221
Figure BDA0003363224490000222
Replace (a-1) with a', then the motion vector can be rewritten as:
Figure BDA0003363224490000223
assuming that the motion vectors of the two control points (0,0) and (0, w) are known, affine parameters can be derived from equation (5),
Figure BDA0003363224490000224
the motion vector can be rewritten in vector form as:
Figure BDA0003363224490000225
wherein
Figure BDA0003363224490000226
Figure BDA0003363224490000227
P = (x, y) is the pixel location.
At the encoder, the MVD of AF_INTER is derived iteratively. Denote the MV derived in the i-th iteration for position P as MV^i(P), and denote the delta updated for MV_C in the i-th iteration as dMV_C^i. Then, in the (i+1)-th iteration,
Figure BDA0003363224490000228
Denote the reference picture by Pic_ref and the current picture by Pic_cur, and let Q = P + MV^i(P). Assuming MSE is used as the matching criterion, the following needs to be minimized:
Figure BDA0003363224490000229
suppose that
Figure BDA00033632244900002210
is small enough, the expression can be approximately rewritten with a first-order Taylor expansion as follows
Figure BDA0003363224490000231
where
Figure BDA0003363224490000232
denotes E^{i+1}(P) = Pic_cur(P) − Pic_ref(Q),
Figure BDA0003363224490000233
May be derived by setting the derivative of the error function to zero
Figure BDA0003363224490000234
Then, based on
Figure BDA0003363224490000235
the delta MVs of the control points (0,0) and (0, w) can be calculated,
Figure BDA0003363224490000236
Figure BDA0003363224490000237
Figure BDA0003363224490000238
Figure BDA0003363224490000239
assuming that such an MVD derivation process is iterated n times, the final MVD is calculated as follows,
Figure BDA00033632244900002310
Figure BDA00033632244900002311
Figure BDA00033632244900002312
Figure BDA00033632244900002313
Since the delta MV of the control point (0, w), represented by mvd_1, is predicted from the delta MV of the control point (0,0), represented by mvd_0, for mvd_1 only
Figure BDA00033632244900002314
Figure BDA00033632244900002315
is actually encoded.
2.2.10.3.AF _ MERGE mode
When a CU is coded in AF_MERGE mode, it gets the first block coded in affine mode from the valid neighboring reconstructed blocks, and the selection order of the candidate blocks is from the left, above-right, below-left to the above-left, as shown in fig. 25A. The motion vectors v_2, v_3 and v_4 of the top-left corner, top-right corner and bottom-left corner of the CU containing block A are derived, and the motion vector v_0 of the top-left corner of the current CU is calculated according to v_2, v_3 and v_4. Then, the top-right motion vector v_1 of the current CU is calculated.
After the CPMVs v_0 and v_1 of the current CU are derived, the MVF of the current CU is generated according to the simplified affine motion model of equation 1. To identify whether the current CU is coded with AF_MERGE mode, an affine flag is signaled in the bitstream when there is at least one neighboring block coded in affine mode.
Fig. 25A and 25B show examples of candidates for AF _ MERGE.
The affine merge candidate list is constructed by the following steps:
1) inserting inherited affine candidates
Inherited affine candidates refer to: candidates are derived from affine motion models of their valid neighbor affine codec blocks. In the common basis, as shown in fig. 26, the scanning order for the candidate positions is: a1, B1, B0, a0 and B2.
In deriving the candidates, a full pruning procedure is performed to check whether the same candidate has been inserted into the list. If the same candidate exists, the derived candidate is discarded.
2) Inserting constructed affine candidates
If the number of candidates in the affine merge candidate list is less than MaxNumAffineCand (set to 5 herein), constructed affine candidates are inserted into the candidate list. A constructed affine candidate means a candidate constructed by combining the neighboring motion information of each control point.
The motion information of the control points is first derived from the specified spatial and temporal neighbors shown in fig. 26. CPk (k = 1, 2, 3, 4) denotes the k-th control point. A0, A1, A2, B0, B1, B2 and B3 are the spatial positions used to predict CPk (k = 1, 2, 3); T is the temporal position used to predict CP4.
The coordinates of CP1, CP2, CP3 and CP4 are (0, 0), (W, 0), (0, H) and (W, H), respectively, where W and H are the width and height of the current block.
The motion information of each control point is obtained according to the following priority order:
For CP1, the checking priority is B2 -> B3 -> A2. If B2 is available, B2 is used. Otherwise, if B2 is not available, B3 is used. If neither B2 nor B3 is available, A2 is used. If none of the three candidates is available, no motion information can be obtained for CP1 (see the sketch after this list).
For CP2, the check priority is B1- > B0.
For CP3, the check priority is a1- > a 0.
For CP4, T is used.
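A small C++ sketch of the priority check for CP1 follows (the struct and function names are illustrative only; CP2, CP3 and CP4 follow the same pattern with their own neighbor positions):

#include <optional>
struct CpMotion { int mvx, mvy, refIdx; };
std::optional<CpMotion> deriveCp1(const std::optional<CpMotion>& B2,
                                  const std::optional<CpMotion>& B3,
                                  const std::optional<CpMotion>& A2) {
    if (B2) return B2;      // highest priority
    if (B3) return B3;      // used only when B2 is unavailable
    if (A2) return A2;      // used only when B2 and B3 are unavailable
    return std::nullopt;    // no motion information available for CP1
}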
Next, affine merge candidates are constructed using a combination of control points.
Motion information of three control points is required to construct a 6-parameter affine candidate. The three control points may be selected from one of the following four combinations ({ CP1, CP2, CP4}, { CP1, CP2, CP3}, { CP2, CP3, CP4}, { CP1, CP3, CP4 }). The combinations CP1, CP2, CP3, CP2, CP3, CP4, CP1, CP3, CP4 will be converted into a 6-parameter motion model represented by an upper left control point, an upper right control point and a lower left control point.
Motion information of two control points is required to construct a 4-parameter affine candidate. The two control points may be selected from one of the following six combinations ({ CP1, CP4}, { CP2, CP3}, { CP1, CP2}, { CP2, CP4}, { CP1, CP3}, { CP3, CP4 }). The combinations CP1, CP4, CP2, CP3, CP2, CP4, CP1, CP3, CP3, CP4 will be converted into a 4-parameter motion model represented by the upper left and upper right control points.
The combination of affine candidates constructed is inserted into the candidate list in the following order:
{CP1,CP2,CP3}、{CP1,CP2,CP4}、{CP1,CP3,CP4}、{CP2,CP3,CP4}、{CP1,CP2}、{CP1,CP3}、{CP2,CP3}、{CP1,CP4}、{CP2,CP4}、{CP3,CP4}
For the reference list X (X being 0 or 1) of a combination, the reference index with the highest usage rate among the control points is selected as the reference index of list X, and motion vectors pointing to a different reference picture are scaled.
In deriving the candidates, a full pruning process is performed to check if the same candidate has been inserted into the list. If the same candidate exists, the derived candidate is discarded.
3) Filling with zero motion vectors
If the number of candidates in the affine merge candidate list is less than 5, a zero motion vector with zero reference index is inserted into the candidate list until the list is full.
2.2.11. Pattern-matched motion vector derivation
The mode-matched motion vector derivation (PMMVD) mode is a special merge mode based on Frame Rate Up Conversion (FRUC) techniques. For this mode, the motion information of the block is not signaled but derived at the decoder side.
Signaling the FRUC flag when the merge flag of the CU is true. When the FRUC flag is false, the merge index is signaled and the normal merge mode is used. When the FRUC flag is true, an additional FRUC flag is signaled to indicate which method (bilateral matching or template matching) to use to derive motion information for the block.
At the encoder side, the decision whether to use FRUC merge mode for a CU is based on RD cost selection, as for normal merge candidates. That is, both matching patterns (bilateral matching and template matching) are checked for CUs using RD cost selection. The one that results in the lowest cost is compared to the other CU modes. If the FRUC matching pattern is the most efficient one, the FRUC flag is set to true for the CU and the associated matching pattern is used.
The motion derivation process in FRUC merge mode has two steps. CU-level motion search is performed first, followed by sub-CU-level motion refinement. At the CU level, an initial motion vector is derived for the entire CU based on bilateral matching or template matching. First, a list of MV candidates is generated and the candidate that results in the lowest matching cost is selected as the starting point for further CU-level refinement. Then, a local search is performed around the starting point based on bilateral matching or template matching, and the MV result with the minimum matching cost is taken as the MV of the entire CU. Next, the motion information is further refined at the sub-CU level, taking the derived CU motion vector as a starting point.
For example, the following derivation process is performed for W × H CU motion information derivation. In the first stage, the MV of the entire W × H CU is derived. In the second stage, the CU is further split into M × M sub-CUs. The value of M is calculated as in (16), D is a predefined splitting depth, which is set to 3 by default in JEM. Then the MV of each sub-CU is derived.
Figure BDA0003363224490000261
As shown in fig. 27, motion information of the current CU is derived using bilateral matching by finding the closest match between two blocks along the motion trajectory of the current CU in two different reference pictures. Under the assumption of a continuous motion trajectory, the motion vectors MV0 and MV1 pointing to the two reference blocks should be proportional to the temporal distance between the current picture and the two reference pictures (i.e., TD0 and TD 1). As a special case, the bilateral matching becomes a mirror-based bi-directional MV when the current picture is temporally between two reference pictures and the temporal distances from the current picture to the two reference pictures are the same.
As shown in fig. 28, with template matching, the motion information of the current CU is derived by finding the closest match between a template (the top and/or left neighboring blocks of the current CU) in the current picture and a block (of the same size as the template) in a reference picture. Apart from the FRUC merge mode described previously, template matching is also applied to AMVP mode. In JEM, as in HEVC, AMVP has two candidates. A new candidate is derived using the template matching method. If the candidate newly derived by template matching is different from the first existing AMVP candidate, it is inserted at the very beginning of the AMVP candidate list and then the list size is set to two (meaning that the second existing AMVP candidate is removed). When applied to AMVP mode, only CU-level search is applied.
2.2.11.1.CU level MV candidate set
The MV candidates at the CU level are composed of:
(i) the initial AMVP candidates, if the current CU is in AMVP mode,
(ii) all merge candidates,
(iii) several MVs from the interpolated MV field, introduced in section 0,
(iv) top and left neighboring motion vectors.
When using bilateral matching, each valid MV of the merge candidate is used as an input to generate MV pairs using the assumption of bilateral matching. For example, one valid MV of merge candidate is at reference list a (MVa, refa). Then, the reference picture of its paired bilateral MV is found in another reference list B, so that refa and refb are on different sides of the current picture in the time domain. If there is no such refb in the reference list B, refb is determined to be a different reference from refa and its temporal distance from the current picture is the smallest one in list B. After refb is determined, MVb is derived by scaling MVa based on the temporal distance between the current picture and refa, refb.
Four MVs from the interpolated MV field are also added to the CU level candidate list. More specifically, the interpolated MVs at positions (0,0), (W/2,0), (0, H/2), and (W/2, H/2) of the current CU are added.
When FRUC is applied in AMVP mode, the initial AMVP candidate is also added to the CU-level MV candidate set.
At the CU level, a maximum of 15 MVs for AMVP CU and a maximum of 13 MVs for merge CU are added to the candidate list.
2.2.11.2. sub-CU level MV candidate set
The MV candidates at the sub-CU level consist of:
(i) the determined MV is searched from the CU level,
(ii) top, left, top left and top right adjacent MVs,
(iii) a scaled version of the co-located MV from the reference picture,
(iv) a maximum of 4 ATMVP candidates,
(v) a maximum of 4 STMVP candidates
The scaled MV from the reference picture is derived as follows. All reference pictures in both lists are traversed. The MV at the co-located position of the sub-CU in the reference picture is scaled with reference to the starting CU level MV.
ATMVP and STMVP candidates are limited to the first four.
At the sub-CU level, a maximum of 17 MVs are added to the candidate list.
2.2.11.3. Generation of interpolated MV fields
Before encoding and decoding the frame, an interpolation motion field is generated for the whole picture based on the one-way ME. The motion field can then be used as MV candidates at the CU level or sub-CU level.
First, the motion field of each reference picture in the two reference lists is traversed at the level of 4 × 4 blocks. For each 4x4 block, if the motion associated to that block passes through the 4x4 block in the current picture (as shown in fig. 29) and no interpolated motion is assigned to that block, the motion of the reference block is scaled to the current picture according to temporal distances TD0 and TD1 (in the same way as MV scaling of TMVP in HEVC) and the scaled motion is assigned to that block in the current frame. If no scaled MV is assigned to a 4x4 block, the motion of that block is marked as unavailable in the interpolated motion field.
2.2.11.4. Interpolation and matching costs
When the motion vector points to a fractional sample position, motion compensated interpolation is required. To reduce complexity, bilinear interpolation is used for both bilateral matching and template matching instead of the conventional 8-tap HEVC interpolation.
The matching costs are calculated slightly differently at different steps. When selecting candidates from the candidate set at the CU level, the matching cost is the Sum of Absolute Differences (SAD) of the bilateral matching or the template matching. After determining the starting MV, the matching cost C of the bilateral match under the sub-CU level search is calculated as follows:
Figure BDA0003363224490000281
where w is a weighting factor empirically set to 4, and MV and MV^s indicate the current MV and the starting MV, respectively. SAD is still used as the matching cost of template matching under the sub-CU level search.
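For clarity, a minimal C++ sketch of this sub-CU level bilateral matching cost follows (the function name is illustrative; the SAD term is assumed to be computed elsewhere):

#include <cstdlib>
int bilateralMatchCost(int sad, int mvX, int mvY, int startMvX, int startMvY) {
    const int w = 4;  // weighting factor, empirically set to 4
    return sad + w * (std::abs(mvX - startMvX) + std::abs(mvY - startMvY));
}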
In FRUC mode, the MV is derived by using only the luminance samples. The derived motion will be used for both luma and chroma for MC inter prediction. After the MV is determined, final MC is performed using an 8-tap interpolation filter for luminance and a 4-tap interpolation filter for chrominance.
2.2.11.5. MV refinement
MV refinement is a pattern-based MV search, based on a bilateral matching cost or a template matching cost. In JEM, two search modes are supported, an unconstrained center-biased diamond search (UCBDS) and an adaptive cross search, respectively, for MV refinement at the CU level and sub-CU level. For both CU-level and sub-CU-level MV refinement, the MV is searched directly at the quarter-luma sample MV precision, followed by one-eighth luma sample MV refinement. The search range for MV refinement for the CU and sub-CU steps is set equal to 8 luma samples.
2.2.11.6. Selection of prediction direction in template matching FRUC merge mode
In the bilateral matching merge mode, bi-prediction is always applied, since the motion information of a CU is derived along the motion trajectory of the current CU in two different reference pictures based on the closest match between the two blocks. There is no such restriction on template matching merge patterns. In template matching merge mode, the encoder may select among uni-directional prediction from list0, uni-directional prediction from list1, or bi-directional prediction for a CU. The selection is based on the template matching cost as follows:
If costBi <= factor × min (cost0, cost1),
Using bi-directional prediction;
Otherwise, if cost0 <= cost1,
Using one-way prediction from list 0;
if not, then,
using one-way prediction from list 1;
where cost0 is the SAD of the list0 template match, cost1 is the SAD of the list1 template match, and cost Bi is the SAD of the bi-prediction template match. The value of factor equals 1.25, which indicates that the selection process is biased towards bi-prediction.
Inter prediction direction selection is only applied to the CU-level template matching process.
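A minimal C++ sketch of this selection rule, with factor = 1.25 biasing the decision toward bi-prediction as noted above (names are illustrative):

#include <algorithm>
enum class PredDir { BI, UNI_L0, UNI_L1 };
PredDir selectPredDir(double cost0, double cost1, double costBi) {
    const double factor = 1.25;
    if (costBi <= factor * std::min(cost0, cost1)) return PredDir::BI;   // bi-prediction
    return (cost0 <= cost1) ? PredDir::UNI_L0 : PredDir::UNI_L1;         // uni-prediction
}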
2.2.12. Generalized bi-directional prediction
In conventional bi-directional prediction, predictors from L0 and L1 are averaged to generate a final predictor with equal weight of 0.5. The predictor Generation equation is shown in equation (3)
P_TraditionalBiPred = (P_L0 + P_L1 + RoundingOffset) >> shiftNum,   (3)
In equation (3), P_TraditionalBiPred is the final predictor of conventional bi-directional prediction, P_L0 and P_L1 are the predictors from L0 and L1, respectively, and RoundingOffset and shiftNum are used to normalize the final predictor.
Generalized bi-prediction (GBI) is proposed to allow different weights to be applied to predictors from L0 and L1. Predictor generation is shown in equation (4)
P_GBi = ((1 − w_1) * P_L0 + w_1 * P_L1 + RoundingOffset_GBi) >> shiftNum_GBi,   (4)
In equation (4), P_GBi is the final predictor of GBi, (1 − w_1) and w_1 are the selected GBi weights applied to the predictors of L0 and L1, respectively, and RoundingOffset_GBi and shiftNum_GBi are used to normalize the final predictor in GBi.
The supported values of w_1 are {−1/4, 3/8, 1/2, 5/8, 5/4}. One equal-weight set and four unequal-weight sets are supported. For the equal-weight case, the process of generating the final predictor is exactly the same as in the conventional bi-directional prediction mode. For the true bi-prediction case under random access (RA) conditions, the number of candidate weight sets is reduced to three.
For the Advanced Motion Vector Prediction (AMVP) mode, if this CU is bi-predictive codec, the weight selection in the GBI is explicitly signaled at the CU level. For merge mode, the weight selection inherits from the merge candidate. In this proposal, GBI supports DMVR to generate a weighted average of the templates and a final predictor of BMS-1.0.
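To illustrate equation (4), the following hedged C++ sketch computes the GBi predictor; representing the supported w_1 weights as multiples of 1/8 (i.e., {−2, 3, 4, 5, 10}/8) is an assumption made here so that integer arithmetic can be used:

int gbiPredict(int pL0, int pL1, int w1TimesEight) {
    const int shiftNum = 3;                          // normalization by 8
    const int roundingOffset = 1 << (shiftNum - 1);
    // (1 - w1) * P_L0 + w1 * P_L1, with both weights scaled by 8.
    return ((8 - w1TimesEight) * pL0 + w1TimesEight * pL1 + roundingOffset) >> shiftNum;
}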
2.2.13. Multi-hypothesis inter prediction
In the multi-hypothesis inter prediction mode, one or more additional prediction signals are signaled in addition to the conventional uni/bi prediction signal. The resulting overall prediction signal is obtained by sample-wise weighted superposition. With the uni/bi prediction signal p_uni/bi and the first additional inter prediction signal/hypothesis h_3, the resulting prediction signal p_3 is obtained as follows:
p_3 = (1 − α) · p_uni/bi + α · h_3
The following illustrates changes to the syntax structure of the prediction unit:
Figure BDA0003363224490000301
the weighting factor α is specified by the syntax element add _ hyp _ weight _ idx according to the following mapping:
add_hyp_weight_idx α
0 1/4
1 -1/8
note that for the additional prediction signal, the concept of prediction list0/list1 is eliminated and a combined list is used. This combined list is generated by alternately inserting reference frames with increased reference indices from list0 and list1 and omitting already inserted reference frames, thereby avoiding double entries.
Similar to the above, more than one additional prediction signal may be used. The resulting overall predicted signal is iteratively accumulated with each additional predicted signal.
p_{n+1} = (1 − α_{n+1}) · p_n + α_{n+1} · h_{n+1}
The last p_n (i.e., the p_n with the largest index n) is used as the resulting overall prediction signal.
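A hedged C++ sketch of this iterative accumulation follows; the container of hypotheses, the function name and the use of floating point are assumptions made here, and the idx-to-α mapping follows the table above:

#include <vector>
#include <utility>
double accumulateHypotheses(double pUniBi, const std::vector<std::pair<double,int>>& hyps) {
    double p = pUniBi;
    for (const auto& [h, addHypWeightIdx] : hyps) {
        const double alpha = (addHypWeightIdx == 0) ? 0.25 : -0.125;   // 1/4 or -1/8
        p = (1.0 - alpha) * p + alpha * h;  // p_{n+1} = (1 - alpha_{n+1}) p_n + alpha_{n+1} h_{n+1}
    }
    return p;  // the last p_n is the resulting overall prediction signal
}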
Note that for inter-predicted blocks using the Merge mode (instead of Skip mode), an additional inter-prediction signal may also be specified. It is also noted that in the case of Merge, not only uni/bi-directional prediction parameters, but also additional prediction parameters of the selected prediction candidate may be used for the current block.
2.2.14. Multi-hypothesis prediction for unidirectional prediction of AMVP mode
When applying multi-hypothesis prediction to improve the unidirectional prediction of the AMVP mode, a flag is signaled to enable or disable multi-hypothesis prediction for inter dir equal to 1 or 2, where 1,2, and 3 represent list0, list1, and bi-prediction, respectively. In addition, when the flag is true, a merge index is signaled. In this way, multi-hypothesis prediction changes uni-directional prediction to bi-directional prediction, where one motion is obtained using the initial syntax element in AMVP mode and the other motion is obtained using merge scheme. The final prediction uses 1:1 weights, combining the two predictions as in bi-directional prediction. A merge candidate list is first derived from merge mode, with sub-CU candidates excluded (e.g., affine, optional temporal motion vector prediction (ATMVP)). Next, it is split into two individual lists, one for list 0(L0) containing all L0 motions from the candidates, and the other for list 1(L1) containing all L1 motions. After eliminating redundancy and filling in the blank, two merge lists are generated for L0 and L1, respectively. There are two constraints when applying multi-hypothesis prediction to improve AMVP mode. First, it is enabled for those CUs that have a luma Codec Block (CB) area greater than or equal to 64. Second, it is only applied to L1 when in low-delay B pictures.
2.2.15. Multi-hypothesis prediction for skip/merge mode
When multi-hypothesis prediction is applied to skip or merge mode, whether multi-hypothesis prediction is enabled is explicitly signaled. In addition to the initial prediction, an additional merge indexed prediction is selected. Thus, each candidate for multi-hypothesis prediction implies a pair of merge candidates, one for the first merge indexed prediction and the other for the second merge indexed prediction. However, in each pair, the merge candidate for the second merge indexed prediction is implicitly derived as the subsequent merge candidate (i.e., the already signaled merge index plus one) without signaling any additional merge index. After eliminating redundancy by excluding pairs that contain similar merge candidates and filling in the vacancies, the candidate list for multi-hypothesis prediction is formed. Then, the motion of a pair of two merge candidates is acquired to generate the final prediction, where 5:3 weights are applied to the first and second merge indexed predictions, respectively. Furthermore, a merge or skip CU with multi-hypothesis prediction enabled may save the motion information of the additional hypotheses, in addition to the motion information of the existing hypotheses, for reference by subsequent neighboring CUs. Note that sub-CU candidates (e.g., affine, ATMVP) are excluded from the candidate list, and multi-hypothesis prediction is not applied to skip mode for low-delay B pictures. Moreover, when multi-hypothesis prediction is applied to merge or skip mode, bilinear interpolation filters are used in motion compensation of multiple hypotheses for CUs with CU width or CU height less than 16, or with both CU width and CU height equal to 16. Hence, the worst-case bandwidth (samples that need to be accessed per sample) of each merge or skip CU with multi-hypothesis prediction enabled is calculated in table 1, and each number is less than half of the worst-case bandwidth of each 4×4 CU with multi-hypothesis prediction disabled.
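As a simple illustration of the 5:3 combination just described, the following sketch combines the two merge indexed predictions for one sample; the rounding offset and the normalization by 8 are assumptions made for this illustration:

int combineMergeHypotheses(int pFirst, int pSecond) {
    // 5:3 weights for the first and second merge indexed predictions, normalized by 8.
    return (5 * pFirst + 3 * pSecond + 4) >> 3;
}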
2.2.16. Final motion vector representation
The final motion vector representation (UMVE) will be introduced. UMVE is used for skip mode or merge mode in combination with the proposed motion vector expression method.
UMVE reuses merge candidates in the same way as used in VVC. Among these merge candidates, a candidate may be selected and further expanded by the proposed motion vector expression method.
UMVE provides a new motion vector representation with simplified signaling. The expression method comprises a starting point, a motion amplitude and a motion direction.
FIG. 30 illustrates an example of a UMVE search process.
FIG. 31 illustrates an example of UMVE search points.
This proposed technique uses the merge candidate list as is, but only candidates of the default merge type (MRG_TYPE_DEFAULT_N) are considered for the UMVE expansion.
The base candidate index defines a starting point. The base candidate index indicates the best candidate among the candidates in the list, as described below.
Table 1. Base candidate IDX
Base candidate IDX: 0, 1, 2, 3
N-th MVP: first MVP, second MVP, third MVP, fourth MVP
If the number of base candidates is equal to 1, no signaling is made of the base candidate IDX.
The distance index is motion amplitude information. The distance index indicates a predefined distance by the start point information. The predefined distance is as follows:
table 2. Distance IDX
Figure BDA0003363224490000321
Figure BDA0003363224490000331
The direction index indicates the direction of the MVD with respect to the starting point. The direction index may represent four directions as shown below.
Table 3. Direction IDX
Direction IDX: 00, 01, 10, 11
x-axis: +, −, N/A, N/A
y-axis: N/A, N/A, +, −
The UMVE flag is signaled immediately after the skip flag and merge flag are sent. If the skip and merge flags are true, the UMVE flag is parsed. If the UMVE flag is equal to 1, the UMVE syntax is parsed. Otherwise, the AFFINE flag is parsed. If the AFFINE flag is equal to 1, AFFINE mode is used; otherwise, the skip/merge index is parsed for the skip/merge mode of the VTM.
No additional line buffer is needed for UMVE candidates, since the skip/merge candidates of the software are directly used as the base candidates. Using the input UMVE index, the supplement to the MV is decided right before motion compensation. There is no need to reserve a long line buffer for this.
2.2.17. Affine merge mode with prediction offset
UMVE is extended to affine merge mode, which will be referred to as UMVE affine mode hereinafter. The proposed method selects the first available affine merge candidate as the base predictor. A motion vector offset is then applied to the motion vector value from each control point of the base predictor. If no affine merge candidate is available, the proposed method will not be used.
The inter prediction direction of the selected base predictor and the reference index of each direction are used without modification.
In the current implementation, the affine model of the current block is assumed to be a 4-parameter model, so only 2 control points need to be derived. Thus, only the first 2 control points of the base predictor are used as the control point predictors.
For each control point, a zero _ MVD flag is used to indicate whether the control point for the current block has the same MV value as the corresponding control point predictor. If the zero _ MVD flag is true, no further signaling is required by the control point. Otherwise, signaling the distance index and the offset direction index for the control point.
A distance offset table of size 5 was used as shown in the table below. The distance index is signaled to indicate which distance offset to use. The mapping of the distance index and the distance offset value is shown in fig. 32.
Table 1. Distance offset table
Distance IDX: 0, 1, 2, 3, 4
Distance offset: 1/2-pel, 1-pel, 2-pel, 4-pel, 8-pel
Fig. 31 shows an example of distance index and distance offset mapping.
The direction index may represent four directions as shown below, where only the x and y directions may have MV differences, but not both directions.
Offset direction IDX: 00, 01, 10, 11
x-dir-factor: +1, −1, 0, 0
y-dir-factor: 0, 0, +1, −1
If the inter prediction is unidirectional, a signaled distance offset is applied to each control point predictor in the offset direction. The result will be the MV value for each control point.
For example, when the base predictor is uni-directional, the motion vector value of a control point is MVP(v_px, v_py). When the distance offset and the direction index are signaled, the motion vector of the corresponding control point of the current block is calculated as follows:
MV(v_x, v_y) = MVP(v_px, v_py) + MV(x-dir-factor × distance-offset, y-dir-factor × distance-offset);
If the inter prediction is bi-directional, applying the signaled distance offset to the L0 motion vector of the control point predictor in the signaled offset direction; and applies the same distance offset with opposite direction to the L1 motion vector of the control point predictor. The result will be MV values for each control point in each inter prediction direction.
For example, when the base predictor is bi-directional, the motion vector value of a control point on L0 is MVP_L0(v_0px, v_0py) and the motion vector of the control point on L1 is MVP_L1(v_1px, v_1py). When the distance offset and the direction index are signaled, the motion vectors of the corresponding control point of the current block are calculated as follows:
MV_L0(v_0x, v_0y) = MVP_L0(v_0px, v_0py) + MV(x-dir-factor × distance-offset, y-dir-factor × distance-offset);
MV_L1(v_1x, v_1y) = MVP_L1(v_1px, v_1py) + MV(−x-dir-factor × distance-offset, −y-dir-factor × distance-offset);
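A hedged C++ sketch of applying the signaled distance offset to a control point MV predictor for the uni-directional and bi-directional cases above (struct and function names are illustrative):

struct Mv { int x, y; };
Mv applyOffsetUni(Mv mvp, int xDirFactor, int yDirFactor, int distanceOffset) {
    return { mvp.x + xDirFactor * distanceOffset, mvp.y + yDirFactor * distanceOffset };
}
// For bi-prediction the offset is added to the L0 predictor and mirrored (opposite sign) on L1.
void applyOffsetBi(Mv& mvpL0, Mv& mvpL1, int xDirFactor, int yDirFactor, int distanceOffset) {
    mvpL0.x += xDirFactor * distanceOffset;  mvpL0.y += yDirFactor * distanceOffset;
    mvpL1.x -= xDirFactor * distanceOffset;  mvpL1.y -= yDirFactor * distanceOffset;
}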
2.2.18. Bidirectional light stream
Bi-directional optical flow (BIO) is sample-by-sample motion refinement performed on block-by-block motion compensation for bi-directional prediction. Sample-level motion refinement does not use signaling.
Let I^(k) be the luma value from reference k (k = 0, 1) after block motion compensation, and let ∂I^(k)/∂x and ∂I^(k)/∂y be the horizontal and vertical components of the I^(k) gradient, respectively. Assuming that the optical flow is valid, the motion vector field (v_x, v_y) is given by the following equation:
∂I^(k)/∂t + v_x · ∂I^(k)/∂x + v_y · ∂I^(k)/∂y = 0.
Combining this optical flow equation with Hermite interpolation of the motion trajectory of each sample yields a unique third-order polynomial that matches both the function values I^(k) and the derivatives ∂I^(k)/∂x, ∂I^(k)/∂y at the ends. The value of this polynomial at t = 0 is the BIO prediction:
Figure BDA0003363224490000353
Here, τ_0 and τ_1 denote the distances to the reference frames, as shown in fig. 33. The distances τ_0 and τ_1 are calculated based on the POC of Ref0 and Ref1: τ_0 = POC(current) − POC(Ref0), τ_1 = POC(Ref1) − POC(current). If both predictions come from the same temporal direction (either both from the past or both from the future), the signs are different (i.e., τ_0 · τ_1 < 0). In this case, BIO is applied only if the predictions are not from the same time instant (i.e., τ_0 ≠ τ_1), both reference regions have non-zero motion (MVx_0, MVy_0, MVx_1, MVy_1 ≠ 0), and the block motion vectors are proportional to the temporal distances (MVx_0/MVx_1 = MVy_0/MVy_1 = −τ_0/τ_1).
The motion vector field (v_x, v_y) is determined by minimizing the difference Δ between the values at points A and B (the intersections of the motion trajectory with the reference frame planes in fig. 33). The model uses only the first linear term of the local Taylor expansion of Δ:
Figure BDA0003363224490000354
All values in equation 26 depend on the sample position (i′, j′), which has been omitted from the notation so far. Assuming that the motion is consistent in the local surrounding area, Δ is minimized inside a (2M+1) × (2M+1) square window Ω centered on the current predicted point (i, j), where M equals 2:
Figure BDA0003363224490000355
for this optimization problem, JEM uses a simplified approach, first minimizing in the vertical direction and then minimizing in the horizontal direction. Thus obtaining
Figure BDA0003363224490000356
Figure BDA0003363224490000357
where
Figure BDA0003363224490000358
Figure BDA0003363224490000359
Figure BDA00033632244900003510
to avoid division by zero or a minimum value, regularization parameters r and m are introduced in equations 28 and 29.
r = 500 · 4^(d−8)   (31)
m = 700 · 4^(d−8)   (32)
Here, d is the bit depth of the video samples.
To keep the memory access for BIO the same as for conventional bi-predictive motion compensation, all prediction and gradient values I^(k), ∂I^(k)/∂x, ∂I^(k)/∂y are computed only for positions inside the current block. In equation 30, a (2M+1) × (2M+1) square window Ω centered on a current prediction point on the boundary of the prediction block needs to access positions outside the block (as shown in fig. 34A). In JEM, values of I^(k), ∂I^(k)/∂x, ∂I^(k)/∂y outside of the block are set equal to the nearest available value inside the block. This may be implemented as padding, for example, as shown in fig. 34B.
With BIO, the motion field can be refined for each sample. To reduce the computational complexity, a block-based design of BIO is used in JEM. The motion refinement is calculated based on 4×4 blocks. In block-based BIO, the values of s_n in equation 30 for all samples in a 4×4 block are aggregated, and then the aggregated values of s_n are used to derive the BIO motion vector offset of the 4×4 block. More specifically, the following formula is used for block-based BIO derivation:
Figure BDA0003363224490000363
Figure BDA0003363224490000364
Figure BDA0003363224490000365
where b_k denotes the set of samples belonging to the k-th 4×4 block of the prediction block. s_n in equations 28 and 29 is replaced by ((s_{n,bk}) >> 4) to derive the associated motion vector offsets.
In some cases, the MV refinement of BIO may be unreliable due to noise or irregular motion. Therefore, in BIO, the magnitude of the MV refinement is clipped to a threshold thBIO. The threshold is determined based on whether all the reference pictures of the current picture come from one direction. If all the reference pictures of the current picture come from one direction, the threshold is set to 12 × 2^(14−d); otherwise, it is set to 12 × 2^(13−d).
Concurrently with motion compensated interpolation, the gradients for BIO are calculated using operations that conform to the HEVC motion compensation process (2D separable FIR). The input to this 2D separable FIR is the same reference frame samples as for the motion compensation process, and the fractional position (fracX, fracY) according to the fractional part of the block motion vector. For the horizontal gradient ∂I/∂x, the signal is first interpolated vertically using BIOfilterS corresponding to the fractional position fracY with de-scaling shift d−8, and then the gradient filter BIOfilterG is applied in the horizontal direction corresponding to the fractional position fracX with de-scaling shift 18−d. For the vertical gradient ∂I/∂y, the gradient filter BIOfilterG is first applied vertically corresponding to the fractional position fracY with de-scaling shift d−8, and then signal displacement is performed using BIOfilterS in the horizontal direction corresponding to the fractional position fracX with de-scaling shift 18−d. The lengths of the interpolation filter BIOfilterG for gradient calculation and the interpolation filter BIOfilterS for signal displacement are kept shorter (6 taps) in order to maintain reasonable complexity. Table 2 shows the filters used for gradient calculation for different fractional positions of the block motion vector in BIO. Table 3 shows the interpolation filters used for prediction signal generation in BIO.
TABLE 2 Filter for gradient calculation in BIO
Fractional pixel position Interpolation filter for gradient (BIOfiltrG)
0 {8,-39,-3,46,-17,5}
1/16 {8,-32,-13,50,-18,5}
1/8 {7,-27,-20,54,-19,5}
3/16 {6,-21,-29,57,-18,5}
1/4 {4,-17,-36,60,-15,4}
5/16 {3,-9,-44,61,-15,4}
3/8 {1,-4,-48,61,-13,3}
7/16 {0,1,-54,60,-9,2}
1/2 {-1,4,-57,57,-4,1}
TABLE 3 interpolation Filter for prediction Signal Generation in BIO
Figure BDA0003363224490000371
Figure BDA0003363224490000381
In JEM, BIO is applied to all bi-prediction blocks when the two predictions are from different reference pictures. When LIC is enabled for a CU, BIO is disabled.
In JEM, OBMC is applied for blocks after the normal MC process. To reduce computational complexity, no BIO is applied during the OBMC process. This means that when using the block's own MV, the BIO is only applied in the MC procedure for the block, and when using the neighboring block's MV during the OBMC procedure, the BIO is not applied in the MC procedure.
2.2.19. Decoder-side motion vector refinement
In the bi-directional prediction operation, for prediction of a region of one block, two prediction blocks formed using a Motion Vector (MV) of list0 and an MV of list1, respectively, are combined to form a single prediction signal. In the decoder-side motion vector refinement (DMVR) method, the two motion vectors for bi-prediction are further refined by a two-sided template matching process. Applying bilateral template matching in the decoder to perform a distortion-based search between the bilateral template and reconstructed samples in the reference picture to obtain refined MVs without the need to transmit additional motion information.
In DMVR, the two-sided template is generated as a weighted combination (i.e., average) of two prediction blocks derived from the original MV0 of list0 and MV1 of list1, respectively, as shown in fig. 35. The template matching operation consists of computing a cost metric between the generated template and the sample region of the reference picture (surrounding the initial prediction block). For each of the two reference pictures, the MV that results in the lowest template cost is considered as the updated MV of the list to replace the original one. In JEM, nine MV candidates are searched for each list. The nine MV candidates include the original MV and 8 surrounding MVs that have an offset of one luma sample from the original MV in the horizontal direction or the vertical direction or both. Finally, two new MVs, MV0 'and MV 1', as shown in fig. 35, are used to generate the final bi-directional prediction result. The Sum of Absolute Differences (SAD) is used as a cost measure. Note that, in calculating the cost of a prediction block generated from one surrounding MV, a rounded MV (to the integer-pixel level) is actually used to obtain the prediction block instead of the actual MV.
DMVR is applied for the merge mode of bi-prediction, with one MV from a past reference picture and another MV from a future reference picture, without the transmission of additional syntax elements. In JEM, DMVR is not applied when LIC, affine motion, FRUC, or sub-CU merge candidates are enabled for a CU.
2.2.20. History-based merge candidate derivation
After spatial MVP and TMVP, history-based MVP (hmvp) merge candidates are added to the merge list. In this method, motion information of a previously coded block is stored in a table and used as an MVP of a current block. A table with multiple HMVP candidates is maintained during the encoding/decoding process. The table is reset (cleared) when a new CTU row is encountered. Whenever there is a non-subblock inter-frame codec CU, the associated motion information is added as a new HMVP candidate to the last entry of the table.
In VTM4, the HMVP table size S is set to 6, which means that up to 6 history-based mvp (HMVP) candidates can be added to the table. When inserting a new motion candidate into the table, a constrained first-in-first-out (FIFO) rule is utilized, wherein a redundancy check is first applied to find whether there is an identical HMVP in the table. If found, the same HMVP is removed from the table and all HMVP candidates behind are moved forward.
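A hedged C++ sketch of this constrained FIFO update follows (table size 6 as stated above; the MotionInfo fields and the function name are illustrative):

#include <deque>
#include <cstddef>
struct MotionInfo {
    int mvx, mvy, refIdx, interDir;
    bool operator==(const MotionInfo& o) const {
        return mvx == o.mvx && mvy == o.mvy && refIdx == o.refIdx && interDir == o.interDir;
    }
};
void updateHmvpTable(std::deque<MotionInfo>& table, const MotionInfo& cand, std::size_t maxSize = 6) {
    // Redundancy check: if an identical HMVP exists, remove it so later entries move forward.
    for (auto it = table.begin(); it != table.end(); ++it) {
        if (*it == cand) { table.erase(it); break; }
    }
    if (table.size() >= maxSize) table.pop_front();  // FIFO: drop the oldest entry when full
    table.push_back(cand);                           // the new candidate becomes the last entry
}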
The HMVP candidate may be used in the merge candidate list construction process. The latest HMVP candidates in the table are checked in order and inserted after the TMVP candidate in the candidate list. Redundancy checks are applied to the HMVP candidates relative to spatial or temporal merge candidates.
To reduce the number of redundancy check operations, the following simplifications are introduced:
(1) The number of HMVP candidates used for merge list generation is set to (N <= 4) ? M : (8 − N), where N denotes the number of candidates already in the merge list and M denotes the number of available HMVP candidates in the table.
(2) Once the total number of available merge candidates reaches the maximum allowed merge candidate minus 1, the merge candidate list construction process from the HMVP is terminated.
2.3. Other related coding and decoding tools
2.3.1. Luminance mapping with chroma scaling (LMCS)
In VTM4, a codec tool called luma map with chroma scaling (LMCS) is added as a new processing block before the loop filter. LMCS has two main components: 1) a loop map of the luminance component based on an adaptive piece-by-piece linear model; 2) for the chroma component, luma-dependent chroma residual scaling is applied. Figure 36 shows the LMCS architecture from the decoder perspective. The light shaded blocks in FIG. 36 indicate where in the map domain to apply the process; these include inverse quantization, inverse transformation, luma intra prediction, and adding luma prediction along with luma residual. The unshaded blocks in FIG. 36 indicate where in the initial (i.e., unmapped) domain the process applies; these include loop filters such as deblocking, ALF and SAO, motion compensated prediction, chroma intra prediction, adding chroma prediction along with chroma residual, and storing decoded pictures as reference pictures. The lighter shaded blocks in figure 36 are the new LMCS functional blocks, including the forward and inverse mapping of the luminance signal and the luminance-dependent chroma scaling process. Like most other tools in VVC, the LMCS may be enabled/disabled using the SPS flag at the sequence level.
2.3.2. Intra block copy
Intra Block Copy (IBC) is one tool adopted by HEVC extensions for SCC. It is well known that it significantly improves the codec efficiency of screen content material. Since the IBC mode is implemented as a block-level codec mode, Block Matching (BM) is performed at the encoder to find the optimal block vector (or motion vector) for each CU. Here, the motion vector is used to indicate a displacement from the current block to a reference block that has been reconstructed inside the current picture. The luma motion vector of a CU for IBC coding is at integer precision. The chroma motion vectors are also clipped to integer precision. When combined with AMVR, IBC mode can switch between 1-pixel and 4-pixel motion vector accuracy. The IBC-coded CU is treated as a third prediction mode in addition to the intra or inter prediction modes.
To reduce memory consumption and decoder complexity, IBC in VTM4 only allows the use of the reconstructed portion of the predefined region that includes the current CTU. This restriction allows IBC mode to be implemented using local on-chip memory for hardware implementation.
At the encoder side, hash-based motion estimation is performed for IBC. The encoder performs an RD check on blocks that are not larger than 16 luminance samples in width or height. For non-merge mode, a block vector search is first performed using a hash-based search. If the hash search does not return a valid candidate, a local search based on block matching will be performed.
In a hash-based search, hash-key matching (32-bit CRC) between the current block and the reference block is extended to all allowed block sizes. The hash key calculation for each position in the current picture is based on 4x4 sub-blocks. For a larger size current block, when all hash keys of all 4x4 sub-blocks match the hash key of the corresponding reference location, it is determined that the hash key matches the hash key of the reference block. If the hash keys of multiple reference blocks are found to match the hash key of the current block, the block vector cost of each matching reference is calculated and the one with the lowest cost is selected.
In the block matching search, a search range is set to N samples on the left and above the current block within the current CTU. At the beginning of the CTU, the value of N is initialized to 128 if there is no temporal reference picture and to 64 if there is at least one temporal reference picture. The hash hit rate is defined as the percentage of samples in the CTU that find a match using a hash-based search. When encoding the current CTU, N is reduced by half if the hash hit rate is below 5%.
At the CU level, IBC mode is signaled with a flag, which may be signaled as IBC AMVP mode or IBC skip/merge mode as follows:
○ IBC skip/merge mode: a merge candidate index is used to indicate which of the block vectors in the list, formed from neighboring candidate IBC-coded blocks, is used to predict the current block. The merge list consists of spatial, HMVP and pairwise candidates.
○ IBC AMVP mode: the block vector difference is coded in the same way as a motion vector difference. The block vector prediction method uses two candidates as predictors, one from the left neighbor and one from the above neighbor (if IBC coded). When either neighbor is not available, a default block vector is used as the predictor. A flag is signaled to indicate the block vector predictor index.
2.3.3. Cross-component linear model (CCLM) prediction
To reduce cross-component redundancy, a cross-component linear model (CCLM) prediction mode, also called LM, is used in the JEM, for which the chroma samples are predicted based on the reconstructed luma samples of the same CU by using a linear model as follows:
pred_C(i,j) = α·rec_L′(i,j) + β
where pred_C(i,j) denotes the predicted chroma samples in the CU and rec_L′(i,j) denotes the down-sampled reconstructed luma samples of the same CU for color formats 4:2:0 and 4:2:2, or the reconstructed luma samples of the same CU for color format 4:4:4. The CCLM parameters α and β are derived by minimizing the regression error between the neighboring reconstructed luma and chroma samples around the current block as follows:
α = ( N·Σ(L(n)·C(n)) - ΣL(n)·ΣC(n) ) / ( N·Σ(L(n)·L(n)) - ΣL(n)·ΣL(n) )
β = ( ΣC(n) - α·ΣL(n) ) / N
where L(n) represents the down-sampled (for color formats 4:2:0 and 4:2:2) or original (for color format 4:4:4) above and left neighboring reconstructed luma samples, C(n) represents the above and left neighboring reconstructed chroma samples, and the value of N is equal to twice the minimum of the width and height of the current chroma codec block. For a codec block with a square shape, the above two equations are applied directly. For a non-square codec block, the neighboring samples of the longer boundary are first subsampled to have the same number of samples as the shorter boundary. Fig. 37 shows the positions of the left and above reconstructed samples and the samples of the current block involved in the CCLM mode.
This regression error minimization calculation is performed as part of the decoding process, not just as an encoder search operation, so that no syntax is used to convey the values of α and β.
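The two regression formulas above translate directly into the following sketch; floating-point arithmetic is used here purely for readability, whereas a codec implementation would use an integer approximation, and the function name is an illustrative assumption.

#include <vector>

// Least-squares CCLM parameters from N neighboring luma/chroma pairs:
//   alpha = (N*sum(L*C) - sum(L)*sum(C)) / (N*sum(L*L) - sum(L)*sum(L))
//   beta  = (sum(C) - alpha*sum(L)) / N
void deriveCclmParamsLS(const std::vector<int>& L, const std::vector<int>& C,
                        double& alpha, double& beta) {
    const int N = static_cast<int>(L.size());
    long long sumL = 0, sumC = 0, sumLC = 0, sumLL = 0;
    for (int n = 0; n < N; ++n) {
        sumL  += L[n];
        sumC  += C[n];
        sumLC += static_cast<long long>(L[n]) * C[n];
        sumLL += static_cast<long long>(L[n]) * L[n];
    }
    const long long denom = static_cast<long long>(N) * sumLL - sumL * sumL;
    // Fallback alpha = 0 when all luma neighbors are identical (assumption).
    alpha = (denom != 0) ? static_cast<double>(N * sumLC - sumL * sumC) / denom : 0.0;
    beta  = (static_cast<double>(sumC) - alpha * sumL) / N;
}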
Multi-directional LM (MDLM) is proposed. In MDLM, two additional CCLM modes are proposed: LM-A, in which the linear model parameters are derived based only on the above neighboring samples, as shown in fig. 38A, and LM-L, in which the linear model parameters are derived based only on the left neighboring samples, as shown in fig. 38B.
It is proposed to replace the LMS algorithm for deriving the linear model parameters α and β with a straight-line equation, the so-called two-point method. The 2 points (pairs of luma and chroma values) (A, B) are the minimum and maximum values inside the set of neighboring luma samples, as shown in fig. 39.
The linear model parameters α and β are obtained according to the following equations:
α = (y_B - y_A) / (x_B - x_A)
β = y_A - α·x_A
the division to derive a is avoided and replaced by multiplication and shifting.
The derivation of the CCLM parameters involves a considerable number of comparison operations, which is undesirable in hardware or software designs.
It is proposed to derive the CCLM parameters using at most four neighboring chroma samples and their corresponding down-sampled luma samples.
Assuming that the current chroma block dimension is W × H, W′ and H′ are set as follows:
○ W′ = W, H′ = H when the LM mode is applied;
○ W′ = W + H when the LM-A mode is applied;
○ H′ = H + W when the LM-L mode is applied;
the upper adjacent position is denoted as S0, -1 … S W '-1, and the left adjacent position is denoted as S-1, 0 … S-1, H' -1. Then, four samples were selected as follows:
○ S[W′/4, -1], S[3W′/4, -1], S[-1, H′/4], S[-1, 3H′/4] are selected when the LM mode is applied and both the above and left neighboring samples are available;
○ S[W′/8, -1], S[3W′/8, -1], S[5W′/8, -1], S[7W′/8, -1] are selected when the LM-A mode is applied or only the above neighboring samples are available;
○ S[-1, H′/8], S[-1, 3H′/8], S[-1, 5H′/8], S[-1, 7H′/8] are selected when the LM-L mode is applied or only the left neighboring samples are available;
The four neighboring luma samples at the selected positions are down-sampled and compared four times to find two smaller values, x0_A and x1_A, and two larger values, x0_B and x1_B. Their corresponding chroma sample values are denoted y0_A, y1_A, y0_B and y1_B. Then x_A, x_B, y_A and y_B are derived as:
x_A = (x0_A + x1_A + 1) >> 1; x_B = (x0_B + x1_B + 1) >> 1; y_A = (y0_A + y1_A + 1) >> 1; y_B = (y0_B + y1_B + 1) >> 1.
The linear model parameters α and β are then derived as described above.
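The simplified four-sample derivation just described can be sketched as follows: the four selected neighboring luma samples are split into the two smaller and two larger values, averaged with rounding, and α and β are obtained with the two-point rule applied in fixed point. The plain integer division stands in for the multiplication-and-shift division avoidance mentioned above; the shift value and all names are illustrative assumptions.

#include <algorithm>
#include <array>
#include <cstdint>

constexpr int CCLM_SHIFT = 16;

struct CclmParams { int alpha; int beta; };  // alpha scaled by 2^CCLM_SHIFT

CclmParams deriveCclmTwoPoint(std::array<int, 4> luma, std::array<int, 4> chroma) {
    std::array<int, 4> idx = {0, 1, 2, 3};
    std::sort(idx.begin(), idx.end(),
              [&](int a, int b) { return luma[a] < luma[b]; });

    const int xA = (luma[idx[0]] + luma[idx[1]] + 1) >> 1;   // mean of two smaller
    const int xB = (luma[idx[2]] + luma[idx[3]] + 1) >> 1;   // mean of two larger
    const int yA = (chroma[idx[0]] + chroma[idx[1]] + 1) >> 1;
    const int yB = (chroma[idx[2]] + chroma[idx[3]] + 1) >> 1;

    CclmParams p;
    // Integer division as a stand-in for the table-based division avoidance.
    p.alpha = (xB != xA) ? (yB - yA) * (1 << CCLM_SHIFT) / (xB - xA) : 0;
    // Assumes arithmetic right shift for negative values (typical in practice).
    p.beta  = yA - static_cast<int>((static_cast<int64_t>(p.alpha) * xA) >> CCLM_SHIFT);
    return p;
}

// pred_C = ((alpha * rec_L' + rounding) >> CCLM_SHIFT) + beta
int predictChroma(const CclmParams& p, int recLuma) {
    const int64_t v = static_cast<int64_t>(p.alpha) * recLuma + (1LL << (CCLM_SHIFT - 1));
    return static_cast<int>(v >> CCLM_SHIFT) + p.beta;
}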
3. Disadvantages of the existing embodiments
In LIC, adjacent samples of the current block or the first 16 × 16 block in the current block are used to derive LIC parameters. Although not all neighboring samples are used in most cases, the computational complexity is still high, especially for large blocks.
4. Some exemplary embodiments and techniques
The inventions detailed below should be considered as examples to explain the general concepts. These inventions should not be construed narrowly. Furthermore, these inventions may be combined in any manner.
Assuming that the top-left coordinate of the current block is (x, y), cuWidth and cuHeight are the width and height of the block. minDimBit, minDim, minStepBit, numSteps and dimShift are integer variables used in the LIC parameter derivation process, which in one example may be defined as: minDimBit = Log2[min(cuHeight, cuWidth)]; minDim = min(cuHeight, cuWidth); minStepBit = minDim > 8 ? 1 : 0; numSteps = minDim >> minStepBit; dimShift = minDimBit - minStepBit.
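Under these definitions, the helper variables and the position-scaling functions f1/f2 used by the examples below can be computed as in the following sketch; the function and type names are illustrative assumptions.

#include <algorithm>

// Helper variables for LIC parameter derivation, following the definitions
// minDimBit = Log2[min(cuHeight, cuWidth)], minDim = min(cuHeight, cuWidth),
// minStepBit = minDim > 8 ? 1 : 0, numSteps = minDim >> minStepBit,
// dimShift = minDimBit - minStepBit.
struct LicDeriveVars {
    int minDimBit, minDim, minStepBit, numSteps, dimShift;
};

static int log2Int(int v) { int b = 0; while ((1 << (b + 1)) <= v) ++b; return b; }

LicDeriveVars licDeriveVars(int cuWidth, int cuHeight) {
    LicDeriveVars v;
    v.minDim     = std::min(cuHeight, cuWidth);
    v.minDimBit  = log2Int(v.minDim);
    v.minStepBit = v.minDim > 8 ? 1 : 0;
    v.numSteps   = v.minDim >> v.minStepBit;
    v.dimShift   = v.minDimBit - v.minStepBit;
    return v;
}

// Position-scaling functions used for the selected neighbor coordinates:
// f1(K) = (K * W) >> dimShift, f2(K) = (K * H) >> dimShift.
inline int f1(int K, int W, int dimShift) { return (K * W) >> dimShift; }
inline int f2(int K, int H, int dimShift) { return (K * H) >> dimShift; }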
1. For paired candidates, it is proposed to always set the IC flag to false.
a. In one example, LIC should be disabled when a block is coded with paired candidates.
b. Alternatively, the LIC flag associated with the pair candidate may be derived from the two merge candidates from which the pair candidate is derived.
i. In one example, if the LIC flag associated with at least one of the two candidates is true, the IC flag associated with the paired candidate may be set to true.
in one example, if the LIC flag associated with both candidates is true, the IC flag associated with the paired candidate may be set to true.
in one example, if at least one of the two candidates is a TMVP candidate, the IC flag associated with the paired candidate may be set to false.
in one example, if at least one of the two candidates is an HMVP candidate, the IC flag associated with the paired candidate may be set to false.
2. For the zero motion merge candidate, it is proposed to always set the LIC flag to false.
a. Alternatively, how to set the LIC flag may depend on the LIC flag associated with other merge candidates.
i. In one example, if the first merge candidate is associated with an LIC flag equal to true, the LIC flag is set to true for the zero motion merge candidate.
3. LIC may be enabled for ATMVP-coded blocks.
a. In one example, whether LIC is enabled may depend on the spatial merge candidate derived from a neighboring block, e.g., the 'A1' block, as shown in fig. 10.
b. In one example, a set of LIC parameters for each reference picture list may be derived for the entire block. In this case, if LIC is enabled for a sub-block, all sub-blocks may share the same LIC parameters.
i. Alternatively, also one or more motion vectors of one sub-block may be used to identify neighboring reference samples of the whole block, which may be used to derive LIC parameters of the whole block.
For example, the sub-block may be the top left sub-block.
For example, the sub-block may be a central sub-block.
Different sub-blocks may be selected for different color components.
4. Different sub-regions within a block may use different LIC parameters.
a. In one example, multiple sets of LIC parameters for each reference picture list may be derived, and each sub-block within the block may select one set from the multiple sets.
b. In one example, the sub-region may be defined as a sub-block for a block utilizing sub-block based techniques, such as ATMVP, affine coding.
5. LIC may be enabled for one reference picture list and disabled for another reference picture list.
a. Alternatively, also for each block, two LIC flags may be stored.
b. Alternatively, also for the pair merge candidates, the LIC flag may be inherited from each of the two merge candidates used to derive the pair candidate.
6. For bi-predictive coding blocks, LIC parameters may be derived once for both reference picture lists.
a. In one example, it may be derived from motion information of one reference picture list.
b. Alternatively, it may be derived from the motion information of the two reference picture lists.
i. In one example, neighboring samples with respect to two reference pictures in two reference picture lists may be utilized.
in one example, how to select neighboring samples relative to two reference pictures in two reference picture lists may be different for the two reference picture lists.
7. The LIC may be enabled for some sub-areas within one block and disabled for the remaining sub-areas.
a. In one example, for blocks coded by ATMVP, one sub-block (e.g., an 8 x 8 block) may enable LIC and another sub-block may disable LIC.
b. In one example, for an affine codec block, one sub-block (e.g., an 8 x 8 block) may enable LIC and another sub-block may disable LIC.
8. Whether LIC is enabled may depend on the location of the blocks.
a. In one example, LIC may be disabled for blocks located at picture/slice/tile/brick boundaries.
9. LIC and transform bypass (TransBypass) modes (e.g., TS mode, QR-BDPCM) may be mutually exclusive.
a. In one example, if a block is coded with LIC mode, the TransBypass mode (e.g., TS mode) should be disabled.
i. Alternatively, in addition, the signaling of the transform skip mode may be skipped.
b. Side information of the LIC mode may be conditionally signaled based on an indication of the use of the transform skip mode.
i. Alternatively, also if TS mode is enabled for one block, signaling of side information of LIC mode may be skipped.
10. Whether and/or how loop filter processes (e.g., deblocking filters, SAO, ALF) and/or post-reconstruction filtering processes (e.g., bilateral filters) are enabled and/or applied may depend on the use of the LIC.
a. In one example, for two contiguous blocks with different LIC flags, the edge between them may still be filtered.
i. In one example, if two contiguous blocks are coded with different LIC flags (e.g., one true and one false), the boundary strength may be set to M (M != 0).
b. In one example, for two contiguous blocks that both have an LIC flag equal to 1, the edge between them may be filtered.
i. In one example, if both contiguous blocks are coded with the LIC flag equal to 1, the boundary strength may be set to M (M != 0).
11. It is proposed that LIC train the linear model parameters using a fixed number of neighboring (adjacent or non-adjacent) samples.
a. In one example, the LIC may use a fixed denominator or value for shifting in the linear model parameter derivation process.
b. How many neighboring samples are used to derive the LIC parameters may depend on the availability of neighboring samples or/and the dimensions of the block.
c. In one example, the selected samples may be located at (x-1, y + offsetY + Ky*H/Fy) and/or (x + offsetX + Kx*W/Fx, y-1), where Fx samples are selected from the above neighboring row, Kx may be from 0 to Fx-1, Fy samples are selected from the left neighboring column, and Ky may be from 0 to Fy-1. offsetX and offsetY are integers. For example, offsetX = W/8 and offsetY = H/8.
d. In one example, four neighboring samples of the current block are utilized.
i. If both the left-hand adjacent column and the top-hand adjacent row are available, SH samples of the left-hand column and SW samples of the top row may be selected, where SH ≦ H and SW ≦ W.
1. In one example, SH = SW = 2.
2. They can be located (assuming the upper left coordinate of the current block is (x, y), the width and height of the current block are W and H, respectively):
(a)(x+W/4,y-1),(x+3W/4,y-1),(x-1,y+H/4),(x-1,y+3H/4)
(b)(x,y-1),(x+W-1,y-1),(x-1,y),(x-1,y+H-1)
(c) (x, y-1), (x + W-1, y-1), (x-1, y), (x-1, y + H-H/W). For example, when H is greater than W.
(d) (x, y-1), (x + W-W/H, y-1), (x-1, y), (x-1, y + H-1). For example, when H is less than W.
(e)(x,y-1),(x+W-max(1,W/H),y-1),(x-1,y),(x-1,y+H-max(1,H/W))。
(f)(x+W/2-1,y-1),(x+W/2,y-1),(x-1,y+H/2-1),(x-1,y+H/2)
(h)(x+W/2-1,y-1),(x+W-1,y-1),(x-1,y+H/2-1),(x-1,y+H-1)
(i)(x,y-1),(x+W/2,y-1),(x-1,y),(x-1,y+H/2)
3. In an alternative example, they may be located at the following positions, where f1(K) = ((K × W) >> dimShift), f2(K) = ((K × H) >> dimShift), and N is an integer, such as numSteps. In another example, N may depend on W and/or H.
(a)(x+f1(N/4),y-1),(x+f1(3*N/4),y-1),(x-1,y+f2(N/4)),(x-1,y+f2(3*N/4))
(b)(x,y-1),(x+f1(N-1),y-1),(x-1,y),(x-1,y+f2(N-1))。
(c)(x+f1(N/2-1),y-1),(x+f1(N/2),y-1),(x-1,y+f2(N/2-1)),(x-1,y+f2(N/2))
(d)(x+f1(N/2-1),y-1),(x+f1(N-1),y-1),(x-1,y+f2(N/2-1)),(x-1,y+f2(N-1))
(e)(x,y-1),(x+f1(N/2),y-1),(x-1,y),(x-1,y+f2(N/2))。
Select samples from the top row only if there is only a top adjacent row.
(a) For example, SW samples in the upper row may be selected, where SW ≦ W.
(1) For example, SW = 2 or SW = 4.
(b) How the sampling points are selected may depend on the width/height.
(c) For example, four samples are selected when W >2, and two samples are selected when W equals 2.
(d) The selected samples can be located (assuming the upper left coordinates of the current block are (x, y), the width and height of the current block are W and H, respectively):
(1)(x,y-1),(x+W/4,y-1),(x+2*W/4,y-1),(x+3*W/4,y-1)
(2)(x,y-1),(x+W/4,y-1),(x+3*W/4,y-1),(x+W-1,y-1)
(3)(x+W/8,y-1),(x+3W/8,y-1),(x+5W/8,y-1),(x+7W/8,y-1)
(4)(x,y-1),(x+W/2-1,y-1),(x+W/2,y-1),(x+W-1,y-1)。
(e) In an alternative example, the selected samples may be located at the following positions (assuming that the top-left coordinate of the current block is (x, y), and the width and height of the current block are W and H, respectively), where f1(K) = ((K × W) >> dimShift) and f2(K) = ((K × H) >> dimShift). In one example, N is an integer, such as numSteps. In another example, N may depend on W and/or H.
(1)(x,y-1),(x+f1(N/4),y-1),(x+f1(2*N/4),y-1),(x+f1(3*N/4),y-1)
(2)(x,y-1),(x+f1(N/4),y-1),(x+f1(3*N/4),y-1),(x+f1(N-1),y-1)
(3)(x+f1(N/8),y-1),(x+f1(3N/8),y-1),(x+f1(5N/8),y-1),(x+f1(7N/8),y-1)
(4)(x,y-1),(x+f1(N/2-1),y-1),(x+f1(N/2),y-1),(x+f1(N-1),y-1)。
Select samples only from the left column if there is only a left adjacent column.
1. For example, the SH samples in the left column may be selected, where SH ≦ H.
a. For example, SH = 2 or SH = 4.
2. How the sampling points are selected may depend on the height/width.
3. For example, four samples are selected when H >2, and two samples are selected when H equals 2.
4. The selected samples may be located at:
(1)(x-1,y),(x-1,y+H/4),(x-1,y+2*H/4),(x-1,y+3*H/4)
(2)(x-1,y),(x-1,y+H/4),(x-1,y+3*H/4),(x-1,y+H-1)
(3)(x-1,y+H/8),(x-1,y+3H/8),(x-1,y+5H/8),(x-1,y+7H/8)
(4)(x-1,y),(x-1,y+H/2-1),(x-1,y+H/2),(x-1,y+H-1)
(d) In an alternative example, the selected samples may be located at the following positions, where f1(K) = ((K × W) >> dimShift) and f2(K) = ((K × H) >> dimShift). In one example, N is an integer, such as numSteps. In another example, N may depend on W and/or H.
(1)(x-1,y),(x-1,y+f2(N/4)),(x-1,y+f2(2*N/4)),(x-1,y+f2(3*N/4))
(2)(x-1,y),(x-1,y+f2(N/4)),(x-1,y+f2(3*N/4)),(x-1,y+f2(N-1))
(3)(x-1,y+f2(N/8)),(x-1,y+f2(3N/8)),(x-1,y+f2(5N/8)),(x-1,y+f2(7N/8))
(4)(x-1,y),(x-1,y+f2(N/2-1)),(x-1,y+f2(N/2)),(x-1,y+f2(N-1))
e. In one example, eight neighboring samples of the current block and its reference block are utilized.
i. If both the left adjacent column and the top adjacent row are available, the four samples of the left column and the four samples of the top row may be selected.
1. For example, when W >2, four samples in the upper row are selected, and when W equals 2, two samples are selected.
2. For example, four samples in the left column are selected when H >2, and two samples are selected when H is equal to 2.
3. For example, they may be located (assuming that the upper left coordinate of the current block is (x, y), the width and height of the current block are W and H, respectively):
a.(x+W/8,y-1),(x+3W/8,y-1),(x+5W/8,y-1),(x+7W/8,y-1),(x-1,y+H/8),(x-1,y+3H/8),(x-1,y+5H/8),(x-1,y+7H/8)
b.(x,y-1),(x+W/2-1,y-1),(x+W/2,y-1),(x+W-1,y-1),(x-1,y),(x-1,y+H/2-1),(x-1,y+H/2),(x-1,y+H-1)
c.(x,y-1),(x+W/4,y-1),(x+2*W/4,y-1),(x+3*W/4,y-1),(x-1,y),(x-1,y+H/4),(x-1,y+2*H/4),(x-1,y+3*H/4)
d.(x,y-1),(x+W/4,y-1),(x+3*W/4,y-1),(x+W-1,y-1),(x-1,y),(x-1,y+H/4),(x-1,y+3*H/4),(x-1,y+H-1)
4. In an alternative example, they may be located at the following positions (assuming that the top-left coordinate of the current block is (x, y), and the width and height of the current block are W and H, respectively), where f1(K) = ((K × W) >> dimShift) and f2(K) = ((K × H) >> dimShift). In one example, N is an integer, such as numSteps. In another example, N may depend on W and/or H:
a.(x+f1(N/8),y-1),(x+f1(3N/8),y-1),(x+f1(5N/8),y-1),(x+f1(7N/8),y-1),(x-1,y+f2(N/8)),(x-1,y+f2(3N/8)),(x-1,y+f2(5N/8)),(x-1,y+f2(7N/8))
b.(x,y-1),(x+f1(N/2-1),y-1),(x+f1(N/2),y-1),(x+f1(N-1),y-1),(x-1,y),(x-1,y+f2(N/2-1)),(x-1,y+f2(N/2)),(x-1,y+f2(N-1))
c.(x,y-1),(x+f1(N/4),y-1),(x+f1(2*N/4),y-1),(x+f1(3*N/4),y-1),(x-1,y),(x-1,y+f2(N/4)),(x-1,y+f2(2*N/4)),(x-1,y+f2(3*N/4))
d.(x,y-1),(x+f1(N/4),y-1),(x+f1(3*N/4),y-1),(x+f1(N-1),y-1),(x-1,y),(x-1,y+f2(N/4)),(x-1,y+f2(3*N/4)),(x-1,y+f2(N-1))
if there is only an upper adjacent row, then only the sample points from the upper row are selected.
1. For example, the top row of eight samples may be selected.
2. For example, the top row of four samples may be selected.
3. For example, two samples in the upper row may be selected.
4. How the samples are selected may depend on the width/height.
5. For example, eight samples are selected when W >4, four samples are selected when 8> W >2, and two samples are selected when W equals 2.
a. The selected samples may be located (assuming the upper left coordinate of the current block is (x, y), the width and height of the current block are W and H, respectively):
i.(x,y-1),(x+W/8,y-1),(x+2*W/8,y-1),(x+3*W/8,y-1),(x+4*W/8,y-1),(x+5*W/8,y-1),(x+6*W/8,y-1),(x+7*W/8,y-1)
ii.(x,y-1),(x+W/8,y-1),(x+2*W/8,y-1),(x+3*W/8,y-1),(x+5*W/8,y-1),(x+6*W/8,y-1),(x+7*W/8,y-1),(x+W-1,y-1)
iii.(x+W/16,y-1),(x+3W/16,y-1),(x+5W/16,y-1),(x+7W/16,y-1),(x+9W/16,y-1),(x+11W/16,y-1),(x+13W/16,y-1),(x+15W/16,y-1)
iv.(x,y-1),(x+W/4-1,y-1),(x+W/4,y-1),(x+W/2-1,y-1),(x+W/2,y-1),(x+3W/4-1,y-1),(x+3W/4,y-1),(x+W-1,y-1),
6. In an alternative example, the selected samples may be located at the following positions (assuming that the top-left coordinate of the current block is (x, y), and the width and height of the current block are W and H, respectively), where f1(K) = ((K × W) >> dimShift) and f2(K) = ((K × H) >> dimShift). In one example, N is an integer, such as numSteps. In another example, N may depend on W and/or H:
a.(x,y-1),(x+f1(N/8),y-1),(x+f1(2*N/8),y-1),(x+f1(3*N/8),y-1),(x+f1(4*N/8),y-1),(x+f1(5*N/8),y-1),(x+f1(6*N/8),y-1),(x+f1(7*N/8),y-1)
b.(x,y-1),(x+f1(N/8),y-1),(x+f1(2*N/8),y-1),(x+f1(3*N/8),y-1),(x+f1(5*N/8),y-1),(x+f1(6*N/8),y-1),(x+f1(7*N/8),y-1),(x+f1(N-1),y-1)
c.(x+f1(N/16),y-1),(x+f1(3N/16),y-1),(x+f1(5N/16),y-1),(x+f1(7N/16),y-1),(x+f1(9N/16),y-1),(x+f1(11N/16),y-1),(x+f1(13N/16),y-1),(x+f1(15N/16),y-1)
d.(x,y-1),(x+f1(N/4-1),y-1),(x+f1(N/4),y-1),(x+f1(N/2-1),y-1),(x+f1(N/2),y-1),(x+f1(3N/4-1),y-1),(x+f1(3N/4),y-1),(x+f1(N-1),y-1)。
select samples only from the left column if there is only a left adjacent column.
1. For example, the left column of eight samples may be selected;
2. for example, the left column of four samples may be selected;
3. for example, the left column of two samples may be selected;
4. how the sampling points are selected may depend on the height/width.
5. For example, eight samples are selected when H >4, four samples are selected when 8> H >2, and two samples are selected when H is equal to 2.
6. The selected samples may be located at:
a.(x-1,y),(x-1,y+H/8),(x-1,y+2*H/8),(x-1,y+3*H/8),(x-1,y+4*H/8),(x-1,y+5*H/8),(x-1,y+6*H/8),(x-1,y+7*H/8)
b.(x-1,y),(x-1,y+H/8),(x-1,y+2*H/8),(x-1,y+3*H/8),(x-1,y+5*H/8),(x-1,y+6*H/8),(x-1,y+7*H/8),(x-1,y+H-1)
c.(x-1,y+H/16),(x-1,y+3H/16),(x-1,y+5H/16),(x-1,y+7H/16),(x-1,y+9H/16),(x-1,y+11H/16),(x-1,y+13H/16),(x-1,y+15H/16)
d.(x-1,y),(x-1,y+H/4-1),(x-1,y+H/4),(x-1,y+H/2-1),(x-1,y+H/2),(x-1,y+3H/4-1),(x-1,y+3H/4),(x-1,y+H-1)
7. In an alternative example, the selected samples may be located at the following positions, where f1(K) = ((K × W) >> dimShift) and f2(K) = ((K × H) >> dimShift). In one example, N is an integer, such as numSteps. In another example, N may depend on W and/or H:
a.(x-1,y),(x-1,y+f2(N/8)),(x-1,y+f2(2*N/8)),(x-1,y+f2(3*N/8)),(x-1,y+f2(4*N/8)),(x-1,y+f2(5*N/8)),(x-1,y+f2(6*N/8)),(x-1,y+f2(7*N/8))
b.(x-1,y),(x-1,y+f2(N/8)),(x-1,y+f2(2*N/8)),(x-1,y+f2(3*N/8)),(x-1,y+f2(5*N/8)),(x-1,y+f2(6*N/8)),(x-1,y+f2(7*N/8)),(x-1,y+f2(N-1))
c.(x-1,y+f2(N/16)),(x-1,y+f2(3N/16)),(x-1,y+f2(5N/16)),(x-1,y+f2(7N/16)),(x-1,y+f2(9N/16)),(x-1,y+f2(11N/16)),(x-1,y+f2(13N/16)),(x-1,y+f2(15N/16))
d.(x-1,y),(x-1,y+f2(N/4-1)),(x-1,y+f2(N/4)),(x-1,y+f2(N/2-1)),(x-1,y+f2(N/2)),(x-1,y+f2(3N/4-1)),(x-1,y+f2(3N/4)),(x-1,y+f2(N-1))
f. For the above examples, the distance between selected neighboring samples is greater than or equal to S for both the left column and the above row, e.g., S = 1.
g. In the above example, when the width of the current block is greater than 16, W may be set equal to 16; when the height of the current block is greater than 16, H may be set equal to 16.
i. Alternatively, W and H are set to be equal to the width and height of the current block, respectively.
h. In the above example, the LIC parameters may be derived using a least squares error method.
i. In the above example, the LIC parameters may be derived using a two-point method.
i. The 2 points x_A and x_B are the minimum and maximum samples within the set of selected neighboring samples of the current block, as shown in 4.11.d or 4.11.e. Their corresponding samples in the reference picture are denoted y_A and y_B.
Four or eight neighboring samples of the current block at the selected positions shown in 4.11.d or 4.11.e are compared to find two minimum samples, x0_A and x1_A, and two maximum samples, x0_B and x1_B. Their corresponding samples in the reference picture are denoted y0_A, y1_A, y0_B and y1_B. Then x_A, x_B, y_A and y_B are derived as:
x_A = (x0_A + x1_A + off) >> 1; x_B = (x0_B + x1_B + off) >> 1; y_A = (y0_A + y1_A + off) >> 1; y_B = (y0_B + y1_B + off) >> 1, where off may be equal to 1 or 0. (A sketch of this averaging is given after this list, just before section 5.)
The eight neighboring samples of the current block at the selected positions shown in 4.11.e are compared to find four smaller samples, x0_A, x1_A, x2_A and x3_A, and four larger values, x0_B, x1_B, x2_B and x3_B. Their corresponding samples in the reference picture are denoted y0_A, y1_A, y2_A, y3_A, y0_B, y1_B, y2_B and y3_B. Then x_A, x_B, y_A and y_B are derived as:
x_A = (x0_A + x1_A + x2_A + x3_A + off) >> 2; x_B = (x0_B + x1_B + x2_B + x3_B + off) >> 2;
y_A = (y0_A + y1_A + y2_A + y3_A + off) >> 2; y_B = (y0_B + y1_B + y2_B + y3_B + off) >> 2,
where off may be equal to 2 or 0.
In the above examples, the average value, calculated within the set of selected neighboring samples of the current block and of its reference block as shown in 4.11.d or 4.11.e, is used instead of the minimum value (x_A, y_A) to derive β.
12. It is proposed that only upper or left neighboring samples are involved in the LIC parameter derivation process (i.e., one-sided selection), even though both upper and left neighboring samples are available.
a. Alternatively, one-side selection may be invoked only if the current block is non-square.
b. In one example, which side (top or left) may depend on the dimensions of the block, e.g., the longer side may be used to derive the LIC parameters.
c. In one example, if the height is less than the width, the LIC parameters are trained using only the top neighboring samples of the current block and its reference block.
d. In one example, if the height is greater than the width, the LIC parameters are trained using only the left-side neighboring samples of the current block and its reference block.
e. In one example, if the height is less than the width, the LIC parameters are trained using only the above neighboring samples of the current block and its reference block. Alternatively, in addition, numSteps = numSteps << 1 and dimShift += 1, and the numSteps above neighboring samples of the current block and its reference block are utilized.
f. In one example, if the height is greater than the width, the LIC parameters are trained using only the left neighboring samples of the current block and its reference block. Alternatively, in addition, numSteps = numSteps << 1 and dimShift += 1, and the numSteps left neighboring samples of the current block and its reference block are utilized.
13. It is proposed that if neighboring samples of a current block are coded using intra mode and/or mixed intra and inter mode or/and IBC mode, they may be considered "unavailable" and replaced with "available" (e.g., samples coded using non-intra mode and/or non-CIIP mode or/and non-IBC mode) neighboring samples.
a. For example, an "unavailable" sample may be replaced by its nearest "available" neighbor.
i. In one example, for an above neighboring sample, the nearest available sample is the sample coded using non-intra mode and/or non-CIIP mode and/or non-IBC mode that is at the shortest distance before or after the current "unavailable" sample in acquisition order; the situation is similar for left neighboring samples.
b. For example, an "unavailable" selected neighboring sample may be replaced by its nearest "available" selected neighboring sample.
c. For example, LIC can be disabled depending on the dimensions of the block.
i. For example, for a 4 × 4 block, LIC may be disabled.
d. In one example, padding may be applied to replace unavailable samples, e.g., copied from available samples.
14. It is proposed that for a block of LIC coding, neighboring samples coded with intra mode and/or mixed intra and inter mode or/and IBC mode are excluded from the derivation of LIC parameters.
a. To ensure that a total of 2^N samples are obtained to solve the least squares error, samples obtained later are discarded, i.e., if 9 samples are available, the last sample is discarded.
b. In one example, for a block of LIC coding, selected neighboring samples that utilize intra-mode and/or mixed intra and inter mode or/and IBC mode coding are excluded from the derivation of LIC parameters.
15. It is proposed that for blocks of LIC codec, neighboring samples with non-intra mode and/or non-ciap mode or/and non-IBC mode codec may be included in the derivation of LIC parameters.
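As referenced in item 11 above, the following sketch illustrates the two-point LIC derivation with a rounding offset: the selected neighboring samples of the current block and the co-located neighboring samples of the reference block are paired, the two smallest and two largest pairs are averaged, and the linear parameters follow the two-point rule. The floating-point division, the fallback values and the function name are illustrative assumptions; an actual design would replace the division by a multiplication and a shift.

#include <algorithm>
#include <vector>

// Two-point LIC parameter derivation from N selected neighbor pairs
// (cur[i] from the current picture, ref[i] from the reference picture),
// following item 11.i: average the two smaller and the two larger samples
// with rounding offset 'off', then a = (yB - yA)/(xB - xA), b = yA - a*xA.
// At least four pairs are assumed to be provided.
void deriveLicTwoPoint(const std::vector<int>& cur, const std::vector<int>& ref,
                       int off, double& a, double& b) {
    std::vector<int> idx(cur.size());
    for (size_t i = 0; i < idx.size(); ++i) idx[i] = static_cast<int>(i);
    // Sort indices by the current-block neighbor value.
    std::sort(idx.begin(), idx.end(),
              [&](int p, int q) { return cur[p] < cur[q]; });

    const int n  = static_cast<int>(idx.size());
    const int xA = (cur[idx[0]] + cur[idx[1]] + off) >> 1;          // two minima
    const int xB = (cur[idx[n - 2]] + cur[idx[n - 1]] + off) >> 1;  // two maxima
    const int yA = (ref[idx[0]] + ref[idx[1]] + off) >> 1;
    const int yB = (ref[idx[n - 2]] + ref[idx[n - 1]] + off) >> 1;

    // Fallback a = 1, b = 0-like behavior when the neighbors are flat (assumption).
    a = (xB != xA) ? static_cast<double>(yB - yA) / (xB - xA) : 1.0;
    b = yA - a * xA;
}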
5. Exemplary embodiments
5.1 method #1 with a maximum of four samples
Assume that the width and height of the current block are W and H, respectively, and that the top-left coordinate of the current block is [0, 0]. The LIC parameters are derived using the least squares error method. The selection of the neighboring samples to be used for LIC parameter derivation is defined as follows:
(1) if both the left column and the top row are available, then the two samples of the left column and the two samples of the top row are selected.
The coordinates of the two upper samples are [W/4, -1] and [3W/4, -1].
The coordinates of the two left samples are [-1, H/4] and [-1, 3H/4].
The selected samples are marked in black, as shown in fig. 40A.
(2) If only the upper row is available, samples are selected only from the upper row. Four samples are selected when W > 2, and two samples are selected when W equals 2. The coordinates of the four selected upper samples are [W/8, -1], [W/8 + W/4, -1], [W/8 + 2*W/4, -1] and [W/8 + 3*W/4, -1]. The selected samples are marked in black, as shown in fig. 40B.
(3) If only the left column is available, samples are selected only from the left column. Four samples are selected when H > 2, and two samples are selected when H equals 2. The coordinates of the four selected left samples are [-1, H/8], [-1, H/8 + H/4], [-1, H/8 + 2*H/4] and [-1, H/8 + 3*H/4].
(4) If neither the left column nor the upper row is available, the default prediction is used, where α equals (1 << shift), β equals 0, and shift is 5.
The position selection above is illustrated by the sketch following this list.
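The position selection of method #1 can be written compactly as below; the container type, the function name and the positions used in the two-sample cases (which the description above does not enumerate) are illustrative assumptions.

#include <utility>
#include <vector>

// Neighbor positions for LIC parameter derivation, method #1 (at most four
// samples). The current block has top-left coordinate [0, 0] and size W x H.
std::vector<std::pair<int, int>> selectLicSamplesMethod1(int W, int H,
                                                         bool aboveAvail,
                                                         bool leftAvail) {
    std::vector<std::pair<int, int>> pos;  // (x, y) pairs
    if (aboveAvail && leftAvail) {
        // Two samples from the upper row and two from the left column.
        pos = {{W / 4, -1}, {3 * W / 4, -1}, {-1, H / 4}, {-1, 3 * H / 4}};
    } else if (aboveAvail) {
        // Only the upper row: four samples when W > 2, two when W == 2.
        if (W > 2)
            pos = {{W / 8, -1}, {W / 8 + W / 4, -1},
                   {W / 8 + 2 * W / 4, -1}, {W / 8 + 3 * W / 4, -1}};
        else
            pos = {{0, -1}, {W - 1, -1}};  // assumption: the two available samples
    } else if (leftAvail) {
        // Only the left column: four samples when H > 2, two when H == 2.
        if (H > 2)
            pos = {{-1, H / 8}, {-1, H / 8 + H / 4},
                   {-1, H / 8 + 2 * H / 4}, {-1, H / 8 + 3 * H / 4}};
        else
            pos = {{-1, 0}, {-1, H - 1}};  // assumption: the two available samples
    }
    // If pos is empty, the default prediction alpha = (1 << 5), beta = 0 applies.
    return pos;
}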
5.2. Method #2 with a maximum of eight samples
Assume that the width and height of the current block are W and H, respectively, and that the top-left coordinate of the current block is [0, 0]. The LIC parameters are derived using the least squares error method. The selection of the neighboring samples to be used for LIC parameter derivation is defined as follows:
(1) If both the left column and the upper row are available, four samples of the left column and four samples of the upper row are selected. Four samples of the upper row are selected when W > 2, and two samples are selected when W equals 2; four samples of the left column are selected when H > 2, and two samples are selected when H equals 2.
The coordinates of the four upper samples are [W/8, -1], [W/8 + W/4, -1], [W/8 + 2*W/4, -1] and [W/8 + 3*W/4, -1].
The coordinates of the four left samples are [-1, H/8], [-1, H/8 + H/4], [-1, H/8 + 2*H/4] and [-1, H/8 + 3*H/4].
The selected samples are marked in black, as shown in fig. 41A.
(2) If only the upper row is available, then only the samples from the upper row are selected. Eight samples are selected when W >4, four samples are selected when 8> W >2, and two samples are selected when W equals 2.
The coordinates of the eight selected upper samples are [W/16, -1], [W/16 + W/8, -1], [W/16 + 2*W/8, -1], [W/16 + 3*W/8, -1], [W/16 + 4*W/8, -1], [W/16 + 5*W/8, -1], [W/16 + 6*W/8, -1] and [W/16 + 7*W/8, -1].
The selected samples are marked in black, as shown in fig. 41B.
(3) If only the left column is available, the samples are selected only from the left column. Eight samples are selected when H >4, four samples are selected when 8> H >2, and two samples are selected when H equals 2.
The coordinates of the eight selected left samples are [-1, H/16], [-1, H/16 + H/8], [-1, H/16 + 2*H/8], [-1, H/16 + 3*H/8], [-1, H/16 + 4*H/8], [-1, H/16 + 5*H/8], [-1, H/16 + 6*H/8] and [-1, H/16 + 7*H/8].
(4) If neither the left column nor the top row is available, the default prediction is used, where α equals (1 << shift), β equals 0, and shift is 5.
5.3. Method #3 with a maximum of eight samples
Assume that the width and height of the current block are W and H, respectively, and that the top-left coordinate of the current block is [0, 0]. The LIC parameters are derived using the least squares error method. The selection of the neighboring samples to be used for LIC parameter derivation is defined as follows:
When the width of the current block is larger than 16, W is set to 16; when the height of the current block is greater than 16, H is set to 16.
(1) If both the left column and the top row are available, four samples of the left column and four samples of the top row are selected. Four samples of the top row are selected when W > 2, and two samples are selected when W equals 2; four samples of the left column are selected when H > 2, and two samples are selected when H equals 2.
The coordinates of the four upper samples are [W/8, -1], [W/8 + W/4, -1], [W/8 + 2*W/4, -1] and [W/8 + 3*W/4, -1].
The coordinates of the four left samples are [-1, H/8], [-1, H/8 + H/4], [-1, H/8 + 2*H/4] and [-1, H/8 + 3*H/4].
The selected samples are marked in black, as shown in fig. 42A.
(2) If only the upper row is available, then only the samples from the upper row are selected. Eight samples are selected when W >4, four samples are selected when 8> W >2, and two samples are selected when W equals 2.
The coordinates of the eight selected upper samples are [W/16, -1], [W/16 + W/8, -1], [W/16 + 2*W/8, -1], [W/16 + 3*W/8, -1], [W/16 + 4*W/8, -1], [W/16 + 5*W/8, -1], [W/16 + 6*W/8, -1] and [W/16 + 7*W/8, -1].
The selected samples are marked in black, as shown in fig. 42B.
(3) If only the left column is available, the samples are selected from the left column only. Eight samples are selected when H >4, four samples are selected when 8> H >2, and two samples are selected when H equals 2.
The coordinates of the eight selected left samples are [-1, H/16], [-1, H/16 + H/8], [-1, H/16 + 2*H/8], [-1, H/16 + 3*H/8], [-1, H/16 + 4*H/8], [-1, H/16 + 5*H/8], [-1, H/16 + 6*H/8] and [-1, H/16 + 7*H/8].
(4) If neither the left column nor the top row is available, the default prediction is used, where α equals (1 << shift), β equals 0, and shift is 5.
Additional exemplary methods for simplifying LIC
The examples described above may be incorporated into the context of methods described below (e.g., methods 4310, 4320, 4330, and 4340), which may be implemented at a video decoder or a video encoder.
Fig. 43A shows a flow diagram of an exemplary method for video processing. The method 4310 comprises: at step 4312, a decision is made regarding selectively applying a Local Illumination Compensation (LIC) model to at least a portion of the current video block based on a codec mode of the current video block.
The method 4310 comprises: at step 4314, a transition between the current video block and a bitstream representation of the current video block is performed based on the decision.
In some embodiments, the current video block is coded using a pairwise candidate, and wherein application of the LIC model is disabled.
In some embodiments, the current video block is coded based on the zero motion merge candidate, and wherein application of the LIC model is disabled.
In some embodiments, the coding mode is an alternative temporal motion vector prediction (ATMVP) mode, and wherein application of the LIC model is enabled.
In some embodiments, the decision is further based on the location of the current video block in the video unit. In an example, the location of the current video block is a boundary of a video unit, wherein the video unit is a picture, a slice, or a brick, and wherein application of the LIC model is disabled.
In some embodiments, the codec mode is a transform bypass mode, and application of the LIC model is disabled. In other embodiments, the codec mode excludes the transform bypass mode and application of the LIC model is enabled. In an example, the transform bypass mode is a transform skip (TS) mode or a quantized residual block differential pulse codec modulation (QR-BDPCM) mode.
Fig. 43B shows a flow diagram of an exemplary method for video processing. The method 4320 comprises: at step 4322, a set of Local Illumination Compensation (LIC) parameters is configured for each of the plurality of sub-regions for a current video block comprising the plurality of sub-regions.
Method 4320 includes performing a conversion between the current video block and a bitstream representation of the current video block based on the configuration in step 4324.
In some embodiments, the set of LIC parameters for one sub-region is different from the set of LIC parameters for any other sub-region of the plurality of sub-regions.
In some embodiments, each of the plurality of sub-regions comprises a sub-block of the current video block that is coded using a sub-block based mode.
In some embodiments, the LIC model is enabled for a first of the plurality of sub-regions, and wherein the LIC model is disabled for a second of the plurality of sub-regions.
In some embodiments, the current video block is coded using an alternative temporal motion vector prediction (ATMVP) mode or an affine mode.
Fig. 43C shows a flow diagram of an exemplary method for video processing. The method 4330 comprises: at step 4332, a decision is made regarding selectively enabling a filtering process for the current video block based on the use of a Local Illumination Compensation (LIC) model by the current video block.
The method 4330 comprises: at step 4334, a conversion between the current video block and a bitstream representation of the current video block is performed based on the decision.
In some embodiments, the filtering process includes at least one of a deblocking filter, a Sample Adaptive Offset (SAO) filter, an adaptive loop filter, or a bilateral filter. In other embodiments, the filtering process is applied to the current video block and the edges of the neighboring blocks. In an example, the use of the LIC model for the current video block is enabled and the use of the LIC mode for the neighboring blocks is disabled. In another example, the use of the LIC model for the current video block is disabled and the use of the LIC model for the neighboring blocks is enabled. In yet another example, the use of the LIC model for the current video block is enabled and the use of the LIC model for the neighboring blocks is enabled.
Fig. 43D shows a flow diagram of an exemplary method for video processing. The method 4340 comprises: at step 4342, during a transition between a current video block and a bitstream representation of the current video block, a Local Illumination Compensation (LIC) model applied to the current video block is configured, wherein the LIC model is trained using a fixed number of one or more samples of neighboring blocks of the current video block.
In some embodiments, the fixed number is based on availability of one or more samples of neighboring blocks or at least one dimension of the current video block.
In some embodiments, the current video block has a size of W×H, the one or more samples include samples having coordinates of (x-1, y + offsetY + Ky*H/Fy) or (x + offsetX + Kx*W/Fx, y-1), Fx samples are selected from the above adjacent row, Fy samples are selected from the left adjacent column, Kx ranges from 0 to (Fx-1), Ky ranges from 0 to (Fy-1), and H, W, Kx, Ky, Fx, Fy, offsetX, and offsetY are integers. In an example, offsetX is W/8 and offsetY is H/8.
In some embodiments, the current video block has a size of W H, the one or more samples include SW ≦ W samples for the top adjacent row and SH ≦ H samples for the left adjacent column, SH, SW, H, and W being integers.
In some embodiments, SH = 2 and SW = 2. In an example, the top-left coordinate of the current video block is represented as (x, y), and the one or more samples include (x + W/4, y-1), (x + 3*W/4, y-1), (x-1, y + H/4), and (x-1, y + 3*H/4). In another example, the top-left coordinate of the current video block is represented as (x, y), and the one or more samples include (x, y-1), (x + W - max(1, W/H), y-1), (x-1, y), and (x-1, y + H - max(1, H/W)).
In some embodiments, SW = 2 or SW = 4, and SH = 0. In an example, the top-left coordinate of the current video block is represented as (x, y), and the one or more samples include (x, y-1), (x + W/4, y-1), (x + 2*W/4, y-1), and (x + 3*W/4, y-1). In another example, the top-left coordinate of the current video block is represented as (x, y), and the one or more samples include (x + W/8, y-1), (x + 3*W/8, y-1), (x + 5*W/8, y-1), and (x + 7*W/8, y-1).
In some embodiments, SH = 2 or SH = 4, and SW = 0. In an example, the top-left coordinate of the current video block is represented as (x, y), and the one or more samples include (x-1, y), (x-1, y + H/4), (x-1, y + 2*H/4), and (x-1, y + 3*H/4). In another example, the top-left coordinate of the current video block is represented as (x, y), and the one or more samples include (x-1, y + H/8), (x-1, y + 3*H/8), (x-1, y + 5*H/8), and (x-1, y + 7*H/8).
In some embodiments, the one or more samples of the neighboring block comprise a set of samples from a left neighboring column and an above neighboring row, a pixel distance between the set of samples is greater than or equal to S, and S is an integer. In an example, S = 1.
In some embodiments, method 4340 further comprises the step of determining that one or more samples in both the upper adjacent row and the left adjacent column are available, wherein the one or more samples of the adjacent block are selected from either the upper adjacent row or the left adjacent column. In an example, the current video block is non-square.
In some embodiments, the one or more samples of the neighboring blocks exclude samples that were coded using at least one of intra mode, mixed intra and inter mode, or Intra Block Copy (IBC) mode.
In some embodiments, the one or more samples of the neighboring blocks comprise samples that were coded using one or more of non-intra mode, non-CIIP (combined intra-inter prediction) mode, or non-IBC (intra block copy) mode.
Fig. 44 is a block diagram of the video processing device 4400. Device 4400 may be used to implement one or more of the methods described herein. The device 4400 may be embodied in a smartphone, tablet, computer, internet of things (IoT) receiver, or the like. The device 4400 may include one or more processors 4402, one or more memories 4404, and video processing hardware 4406. Processor 4402 may be configured to implement one or more methods described herein (including, but not limited to, methods 4310, 4320, 4330, and 4340). Memory 4404 (or memories) may be used to store data and code for implementing the methods and techniques described herein. Video processing hardware 4406 may be used to implement some of the techniques described in this document in hardware circuits.
In some embodiments, the video codec method may be implemented using an apparatus implemented on a hardware platform as described in connection with fig. 44.
Fig. 45 shows a flow diagram of an exemplary method for video processing. The method 4500 includes: in step 4502, a motion candidate list is derived for a transition between a video block of a video and a bitstream representation of the video block, wherein a first candidate of the motion candidate list is set to have a Local Illumination Compensation (LIC) flag; and at step 4504, performing the conversion using the motion candidate list, wherein, during the conversion, upon selection of a first candidate from the motion candidate list, it is determined whether LIC is enabled based on a flag of the first candidate.
In some embodiments, the LIC flag of the first candidate is set according to at least one of: a type of the first candidate; a LIC flag for deriving a second candidate of the first candidate; a type of a second candidate in the motion candidate list for deriving the first candidate; LIC flags for other candidates in the motion candidate list.
In some embodiments, the first candidate is a pair candidate in a merge candidate list for the video block.
In some embodiments, the LIC flag associated with a pair candidate is always set to false.
In some embodiments, when the video block is coded using the pairwise candidate, the LIC flag is set to false and LIC is disabled.
In some embodiments, the LIC flag associated with the pair candidate is set based on the two merge candidates used to derive the pair candidate.
In some embodiments, if the LIC flag associated with at least one of the two merge candidates is true, then the IC flag associated with that pair candidate is set to true.
In some embodiments, if the LIC flag associated with both merge candidates is true, the LIC flag associated with the paired candidate is set to true.
In some embodiments, if at least one of the two merge candidates is a Temporal Motion Vector Prediction (TMVP) candidate, the LIC flag associated with the pair of candidates is set to false.
In some embodiments, if at least one of the two merge candidates is a history-based motion vector prediction (HMVP) candidate, the LIC flag associated with the pair of candidates is set to false.
In some embodiments, the first candidate is a zero motion merge candidate in a merge candidate list for the video block.
In some embodiments, the LIC flag associated with the zero motion merge candidate is always set to false.
In some embodiments, the LIC flag associated with a zero motion merge candidate is set according to LIC flags associated with other merge candidates in the merge candidate list.
In some embodiments, if a particular merge candidate in the merge candidate list is associated with an LIC flag equal to true, then the LIC flag associated with the zero motion merge candidate is set to true.
In some embodiments, the particular merge candidate is the first merge candidate.
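One possible realization of the LIC flag rules for pairwise and zero-motion merge candidates listed above is sketched below; the enum, struct and parameter names are illustrative assumptions that merely cover the alternatives named in the text.

enum class CandType { Spatial, Temporal, Hmvp, Pairwise, ZeroMotion };

struct MergeCand {
    CandType type;
    bool licFlag;
};

// LIC flag of a pairwise candidate derived from its two source candidates.
// 'requireBoth' selects between the "at least one true" and "both true"
// variants described above; TMVP/HMVP sources force the flag to false.
bool pairwiseLicFlag(const MergeCand& c0, const MergeCand& c1, bool requireBoth) {
    if (c0.type == CandType::Temporal || c1.type == CandType::Temporal ||
        c0.type == CandType::Hmvp     || c1.type == CandType::Hmvp)
        return false;
    return requireBoth ? (c0.licFlag && c1.licFlag)
                       : (c0.licFlag || c1.licFlag);
}

// LIC flag of a zero-motion candidate: false by default, or optionally
// inherited from the first merge candidate in the list.
bool zeroMotionLicFlag(const MergeCand* firstCand, bool inheritFromFirst) {
    if (inheritFromFirst && firstCand != nullptr)
        return firstCand->licFlag;
    return false;
}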
Fig. 46 shows a flow diagram of an exemplary method for video processing. The method 4600 includes: in step 4602, determining, for a transition between a video block of a video and a bitstream representation of the video block and based on a property of the video block, at least one of: whether Local Illumination Compensation (LIC) is enabled or disabled for at least a portion of the video block; whether LIC is enabled for a reference picture list; and LIC parameters of at least one reference picture list; and in step 4604, performing a conversion based on the determination.
In some embodiments, the property includes a codec mode of a video block, and the LIC is enabled when the video block is coded using an alternative temporal motion vector prediction (ATMVP) mode.
In some embodiments, whether LIC is enabled further depends on spatial merge candidates derived from neighboring blocks of the video block.
In some embodiments, when LIC is enabled for a video block, a set of LIC parameters for each reference picture list is derived for the entire video block.
In some embodiments, if LIC is enabled for one sub-block of a video block, all sub-blocks share the same LIC parameters.
In some embodiments, one or more motion vectors of a sub-block are used to identify neighboring reference samples of the entire video block, which are used to derive LIC parameters for the entire video block.
In some embodiments, the one sub-block is the top left sub-block.
In some embodiments, the one sub-block is a central sub-block.
In some embodiments, different sub-blocks are selected for different color components.
In some embodiments, when LIC is enabled for a video block, different sub-regions within the video block use different LIC parameters.
In some embodiments, multiple sets of LIC parameters for each reference picture list are derived, and each sub-region within the video block selects one of the multiple sets of LIC parameters.
In some embodiments, the sub-region is a sub-block of the block that is coded using a sub-block based technique that includes at least one of an ATMVP mode and an affine mode.
In some embodiments, LIC is enabled for one reference picture list and disabled for another reference picture list.
In some embodiments, two LIC flags are stored for each video block.
In some embodiments, for a paired merge candidate, the LIC flag is inherited from each of the two merge candidates used to derive the paired candidate.
In some embodiments, when the video block is a bi-predictive coding block, the LIC parameters are derived once for both reference picture lists.
In some embodiments, LIC parameters are derived from motion information of one reference picture list.
In some embodiments, the LIC parameters are derived from motion information of two reference picture lists.
In some embodiments, neighboring samples with respect to both reference pictures in both reference picture lists are utilized.
In some embodiments, the selection of adjacent samples with respect to two reference pictures in the two reference picture lists is different for the two reference picture lists.
In some embodiments, LIC is enabled for a portion of sub-regions within a video block and disabled for the remaining sub-regions.
In some embodiments, when the video block is an ATMVP codec block, LIC is enabled for one sub-block and disabled for another sub-block.
In some embodiments, when the video block is an affine codec block, LIC is enabled for one sub-block and disabled for another sub-block.
In some embodiments, the sub-block is an 8 x 8 block.
In some embodiments, the property includes a location of the video block.
In some embodiments, LIC is disabled when a video block is located at a boundary of at least one of a picture, a slice, a tile, and a brick.
In some embodiments, the property includes a codec mode.
In some embodiments, the LIC mode and the transform bypass (TransBypass) modes are mutually exclusive, the TransBypass modes including at least one of a transform skip (TS) mode and a quantized residual block differential pulse codec modulation (QR-BDPCM) mode.
In some embodiments, if the video block is coded with LIC mode, the TransBypass mode, including TS mode, is disabled.
In some embodiments, if TS mode is disabled, signaling of TS mode is skipped.
In some embodiments, the LIC mode side information is conditionally signaled based on an indication of the use of the TS mode.
In some embodiments, if TS mode is enabled for the video block, signaling of side information for LIC mode is skipped.
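A decoder-side parsing sketch of the mutual exclusion between the TS mode and the LIC mode described in these embodiments is given below; the bit-reader interface, the flag names and the signalling order switch are illustrative assumptions and not the actual bitstream syntax.

// Hypothetical bitstream reader interface used only for illustration.
struct BitReader {
    bool readFlag() { return false; }  // stub standing in for a context-coded flag read
};

struct BlockModes {
    bool licFlag = false;
    bool tsFlag  = false;
};

// Parse the LIC and transform-skip flags under the exclusion rules above:
// the LIC side information is signalled only when TS is not used, and the
// TS flag is skipped (inferred false) when the block is LIC coded.
BlockModes parseLicAndTs(BitReader& br, bool licAllowed, bool tsAllowed,
                         bool tsSignalledFirst) {
    BlockModes m;
    if (tsSignalledFirst) {
        if (tsAllowed) m.tsFlag = br.readFlag();
        if (licAllowed && !m.tsFlag) m.licFlag = br.readFlag();  // skip when TS on
    } else {
        if (licAllowed) m.licFlag = br.readFlag();
        if (tsAllowed && !m.licFlag) m.tsFlag = br.readFlag();   // skip when LIC on
    }
    return m;
}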
Fig. 47 shows a flow diagram of an exemplary method for video processing. The method 4700 comprises: at step 4702, determining whether and/or how to enable and/or apply a loop filter process and/or a post-reconstruction filtering process based on the use of Local Illumination Compensation (LIC) for transitions between video blocks of the video and bitstream representations of the video blocks, wherein the loop filter process includes a deblocking filter, a Sample Adaptive Offset (SAO), an Adaptive Loop Filter (ALF), and the post-reconstruction filtering process includes a bilateral filter; and at step 4704, performing a conversion based on the determination.
In some embodiments, for two contiguous blocks with different LIC flags, the edge between the two contiguous blocks is filtered.
In some embodiments, the boundary strength is set to M if two adjacent blocks are coded with different LIC flags, M not being equal to 0.
In some embodiments, for two contiguous blocks that both have an LIC flag equal to 1, the edge between the two contiguous blocks is filtered.
In some embodiments, if both contiguous blocks are coded with LIC flag equal to 1, the boundary strength is set to M, which is not equal to 0.
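The boundary-strength behaviour described in these embodiments can be summarized by the small decision function below; the particular value of M and the fallback to a caller-provided default strength are illustrative assumptions.

// Deblocking boundary strength for the edge between two adjacent blocks,
// taking their LIC flags into account as described above. M is the non-zero
// strength used when the LIC flags require filtering; other cases fall back
// to a caller-provided default strength.
int boundaryStrengthWithLic(bool licFlagP, bool licFlagQ,
                            int defaultBs, int M = 1) {
    if (licFlagP != licFlagQ)  // different LIC flags: filter the edge
        return M;              // M != 0
    if (licFlagP && licFlagQ)  // both coded with LIC flag equal to 1
        return M;              // M != 0
    return defaultBs;          // otherwise, the usual derivation applies
}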
Fig. 48 shows a flow diagram of an exemplary method for video processing. Method 4800 comprises: at step 4802, deriving Local Illumination Compensation (LIC) parameters in a LIC model applied to a video block by using a fixed number of adjacent samples of the video block for transitions between the video block and a bitstream representation of the video block; and at step 4804, a conversion is performed based on the LIC parameters.
In some embodiments, the adjacent samples include contiguous or non-contiguous neighboring samples.
In some embodiments, the LIC mode uses a fixed denominator or value for shifting in the LIC parameter derivation process.
In some embodiments, the number of adjacent samples selected for deriving the LIC parameters depends on the availability of adjacent samples or/and the dimensions of the video block, where the width of the video block is W, the height of the video block is H, and the upper left coordinate of the video block is (x, y).
In some embodiments, the selected samples are located at (x-1, y + offsetY + Ky*H/Fy) and/or (x + offsetX + Kx*W/Fx, y-1), wherein Fx samples are selected from the top adjacent row, Kx may be from 0 to Fx-1, Fy samples are selected from the left adjacent column, Ky may be from 0 to Fy-1, and offsetX and offsetY are integers.
In some embodiments, offsetX is W/8 and offsetY is H/8.
In some embodiments, the fixed number is four.
In some embodiments, if both the left adjacent column and the upper adjacent row of the video block are available, SH samples of the left column and SW samples of the upper row are selected, where SH and SW are integers, SH ≤ H and SW ≤ W.
In some embodiments, SH = SW = 2.
In some embodiments, the selected sampling points are located in one of the following positions:
(a)(x+W/4,y-1),(x+3W/4,y-1),(x-1,y+H/4),(x-1,y+3H/4);
(b)(x,y-1),(x+W-1,y-1),(x-1,y),(x-1,y+H-1);
(c) (x, y-1), (x + W-1, y-1), (x-1, y), (x-1, y + H-H/W), when H is greater than W;
(d) (x, y-1), (x + W-W/H, y-1), (x-1, y), (x-1, y + H-1) when H is less than W;
(e)(x,y-1),(x+W-max(1,W/H),y-1),(x-1,y),(x-1,y+H-max(1,H/W));
(f)(x+W/2-1,y-1),(x+W/2,y-1),(x-1,y+H/2-1),(x-1,y+H/2);
(h)(x+W/2-1,y-1),(x+W-1,y-1),(x-1,y+H/2-1),(x-1,y+H-1);
(i)(x,y-1),(x+W/2,y-1),(x-1,y),(x-1,y+H/2)。
in some embodiments, the selected sampling points are located at one of the following positions:
(a)(x+f1(N/4),y-1),(x+f1(3*N/4),y-1),(x-1,y+f2(N/4)),(x-1,y+f2(3*N/4));
(b)(x,y-1),(x+f1(N-1),y-1),(x-1,y),(x-1,y+f2(N-1));
(c)(x+f1(N/2-1),y-1),(x+f1(N/2),y-1),(x-1,y+f2(N/2-1)),(x-1,y+f2(N/2));
(d)(x+f1(N/2-1),y-1),(x+f1(N-1),y-1),(x-1,y+f2(N/2-1)),(x-1,y+f2(N-1));
(e)(x,y-1),(x+f1(N/2),y-1),(x-1,y),(x-1,y+f2(N/2)),
wherein f1(K) = ((K × W) >> dimShift), f2(K) = ((K × H) >> dimShift), minDimBit = Log2[min(H, W)], minDim = min(H, W), minStepBit = minDim > 8 ? 1 : 0, numSteps = minDim >> minStepBit, dimShift = minDimBit - minStepBit, and K and N are integers.
In some embodiments, N = numSteps.
In some embodiments, N depends on W and/or H.
In some embodiments, samples are selected only from the upper adjacent row if only the upper adjacent row is available.
In some embodiments, SW samples of the top adjacent row are selected, where SW is an integer and SW ≤ W.
In some embodiments, SW = 2 or SW = 4.
In some embodiments, the selection of the sampling points depends on the width/height.
In some embodiments, when W >2, four samples are selected.
In some embodiments, when W equals 2, two samples are selected.
In some embodiments, the selected sampling points are located at one of the following positions:
(1)(x,y-1),(x+W/4,y-1),(x+2*W/4,y-1),(x+3*W/4,y-1);
(2)(x,y-1),(x+W/4,y-1),(x+3*W/4,y-1),(x+W-1,y-1);
(3)(x+W/8,y-1),(x+3W/8,y-1),(x+5W/8,y-1),(x+7W/8,y-1);
(4)(x,y-1),(x+W/2-1,y-1),(x+W/2,y-1),(x+W-1,y-1)。
in some embodiments, the selected sampling points are located in one of the following positions:
(1)(x,y-1),(x+f1(N/4),y-1),(x+f1(2*N/4),y-1),(x+f1(3*N/4),y-1),
(2)(x,y-1),(x+f1(N/4),y-1),(x+f1(3*N/4),y-1),(x+f1(N-1),y-1),
(3)(x+f1(N/8),y-1),(x+f1(3N/8),y-1),(x+f1(5N/8),y-1),(x+f1(7N/8),y-1),
(4)(x,y-1),(x+f1(N/2-1),y-1),(x+f1(N/2),y-1),(x+f1(N-1),y-1),
wherein f1(K) = ((K × W) >> dimShift), f2(K) = ((K × H) >> dimShift), minDimBit = Log2[min(H, W)], minDim = min(H, W), minStepBit = minDim > 8 ? 1 : 0, numSteps = minDim >> minStepBit, dimShift = minDimBit - minStepBit, and K and N are integers.
In some embodiments, N = numSteps.
In some embodiments, N depends on W and/or H.
In some embodiments, if only the left-adjacent column is available, the samples are selected only from the left-adjacent column.
In some embodiments, SH samples of the left adjacent column are selected, where SH is an integer and SH ≤ H.
In some embodiments, SH = 2 or SH = 4.
In some embodiments, the selection of the sampling points depends on the width/height.
In some embodiments, four samples are selected when H > 2.
In some embodiments, two samples are selected when H equals 2.
In some embodiments, the selected sampling points are located in one of the following positions:
(1)(x-1,y),(x-1,y+H/4),(x-1,y+2*H/4),(x-1,y+3*H/4);
(2)(x-1,y),(x-1,y+H/4),(x-1,y+3*H/4),(x-1,y+H-1);
(3)(x-1,y+H/8),(x-1,y+3H/8),(x-1,y+5H/8),(x-1,y+7H/8);
(4)(x-1,y),(x-1,y+H/2-1),(x-1,y+H/2),(x-1,y+H-1)。
in some embodiments, the selected sampling points are located at one of the following positions:
(1)(x-1,y),(x-1,y+f2(N/4)),(x-1,y+f2(2*N/4)),(x-1,y+f2(3*N/4)),
(2)(x-1,y),(x-1,y+f2(N/4)),(x-1,y+f2(3*N/4)),(x-1,y+f2(N-1)),
(3)(x-1,y+f2(N/8)),(x-1,y+f2(3N/8)),(x-1,y+f2(5N/8)),(x-1,y+f2(7N/8)),
(4)(x-1,y),(x-1,y+f2(N/2-1)),(x-1,y+f2(N/2)),(x-1,y+f2(N-1)),
wherein f1(K) = (K*W) >> dimShift, f2(K) = (K*H) >> dimShift, minDimBit = Log2[min(H, W)], minDim = min(H, W), minStepBit = minDim > 8 ? 1 : 0, numSteps = minDim >> minStepBit, dimShift = minDimBit - minStepBit, and K and N are integers.
In some embodiments, N = numSteps.
In some embodiments, N depends on W and/or H.
In some embodiments, the fixed number is eight.
In some embodiments, if both the left-adjacent column and the upper-adjacent row of the video block are available, then the four samples of the left-column and the four samples of the upper row are selected.
In some embodiments, when W >2, the four samples in the upper row are selected.
In some embodiments, when W equals 2, the two samples in the upper row are selected.
In some embodiments, when H >2, the four samples of the left column are selected.
In some embodiments, when H equals 2, the two samples of the left column are selected.
In some embodiments, the selected sampling points are located in one of the following positions:
a)(x+W/8,y-1),(x+3W/8,y-1),(x+5W/8,y-1),(x+7W/8,y-1),(x-1,y+H/8),(x-1,y+3H/8),(x-1,y+5H/8),(x-1,y+7H/8);
b)(x,y-1),(x+W/2-1,y-1),(x+W/2,y-1),(x+W-1,y-1),(x-1,y),(x-1,y+H/2-1),(x-1,y+H/2),(x-1,y+H-1);
c)(x,y-1),(x+W/4,y-1),(x+2*W/4,y-1),(x+3*W/4,y-1),(x-1,y),(x-1,y+H/4),(x-1,y+2*H/4),(x-1,y+3*H/4);
d)(x,y-1),(x+W/4,y-1),(x+3*W/4,y-1),(x+W-1,y-1),(x-1,y),(x-1,y+H/4),(x-1,y+3*H/4),(x-1,y+H-1)。
In some embodiments, the selected sampling points are located in one of the following positions:
a)(x+f1(N/8),y-1),(x+f1(3N/8),y-1),(x+f1(5N/8),y-1),(x+f1(7N/8),y-1),(x-1,y+f2(N/8)),(x-1,y+f2(3N/8)),(x-1,y+f2(5N/8)),(x-1,y+f2(7N/8));
b)(x,y-1),(x+f1(N/2-1),y-1),(x+f1(N/2),y-1),(x+f1(N-1),y-1),(x-1,y),(x-1,y+f2(N/2-1)),(x-1,y+f2(N/2)),(x-1,y+f2(N-1));
c)(x,y-1),(x+f1(N/4),y-1),(x+f1(2*N/4),y-1),(x+f1(3*N/4),y-1),(x-1,y),(x-1,y+f2(N/4)),(x-1,y+f2(2*N/4)),(x-1,y+f2(3*N/4));
d)(x,y-1),(x+f1(N/4),y-1),(x+f1(3*N/4),y-1),(x+f1(N-1),y-1),(x-1,y),(x-1,y+f2(N/4)),(x-1,y+f2(3*N/4)),(x-1,y+f2(N-1)),
wherein f1(K) = (K*W) >> dimShift, f2(K) = (K*H) >> dimShift, minDimBit = Log2[min(H, W)], minDim = min(H, W), minStepBit = minDim > 8 ? 1 : 0, numSteps = minDim >> minStepBit, dimShift = minDimBit - minStepBit, and K and N are integers.
In some embodiments, N = numSteps.
In some embodiments, N depends on W and/or H.
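For illustration only, a minimal sketch of option (c) in the eight-sample case above, assuming both the above row and the left column are available and W > 2, H > 2; the function name is hypothetical.

```python
def select_eight_samples_option_c(x, y, W, H):
    """Illustrative sketch of option (c): four evenly spaced samples from the
    above row and four from the left column, at quarter positions."""
    above = [(x + k * W // 4, y - 1) for k in range(4)]   # k = 0..3
    left = [(x - 1, y + k * H // 4) for k in range(4)]    # k = 0..3
    return above + left
```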
In some embodiments, samples are selected only from the top adjacent row if only the top adjacent row is available.
In some embodiments, SW samples of the upper adjacent row are selected, where SW is an integer.
In some embodiments, SW = 2, SW = 4, or SW = 8.
In some embodiments, the selection of the sampling points depends on the width/height.
In some embodiments, eight samples are selected when W >4, or four samples are selected when W >2, or two samples are selected when W equals 2.
In some embodiments, the selected sampling points are located at one of the following positions:
(a)(x,y-1),(x+W/8,y-1),(x+2*W/8,y-1),(x+3*W/8,y-1),(x+4*W/8,y-1),(x+5*W/8,y-1),(x+6*W/8,y-1),(x+7*W/8,y-1);
(b)(x,y-1),(x+W/8,y-1),(x+2*W/8,y-1),(x+3*W/8,y-1),(x+5*W/8,y-1),(x+6*W/8,y-1),(x+7*W/8,y-1),(x+W-1,y-1);
(c)(x+W/16,y-1),(x+3W/16,y-1),(x+5W/16,y-1),(x+7W/16,y-1),(x+9W/16,y-1),(x+11W/16,y-1),(x+13W/16,y-1),(x+15W/16,y-1);
(d)(x,y-1),(x+W/4-1,y-1),(x+W/4,y-1),(x+W/2-1,y-1),(x+W/2,y-1),(x+3W/4-1,y-1),(x+3W/4,y-1),(x+W-1,y-1)。
In some embodiments, the selected sampling points are located at one of the following positions:
a)(x-1,y),(x-1,y+f2(N/8)),(x-1,y+f2(2*N/8)),(x-1,y+f2(3*N/8)),(x-1,y+f2(4*N/8)),(x-1,y+f2(5*N/8)),(x-1,y+f2(6*N/8)),(x-1,y+f2(7*N/8))
b)(x-1,y),(x-1,y+f2(N/8)),(x-1,y+f2(2*N/8)),(x-1,y+f2(3*N/8)),(x-1,y+f2(5*N/8)),(x-1,y+f2(6*N/8)),(x-1,y+f2(7*N/8)),(x-1,y+f2(N-1))
c)(x-1,y+f2(N/16)),(x-1,y+f2(3N/16)),(x-1,y+f2(5N/16)),(x-1,y+f2(7N/16)),(x-1,y+f2(9N/16)),(x-1,y+f2(11N/16)),(x-1,y+f2(13N/16)),(x-1,y+f2(15N/16))
d)(x-1,y),(x-1,y+f2(N/4-1)),(x-1,y+f2(N/4)),(x-1,y+f2(N/2-1)),(x-1,y+f2(N/2)),(x-1,y+f2(3N/4-1)),(x-1,y+f2(3N/4)),(x-1,y+f2(N-1)),
wherein f1(K) = (K*W) >> dimShift, f2(K) = (K*H) >> dimShift, minDimBit = Log2[min(H, W)], minDim = min(H, W), minStepBit = minDim > 8 ? 1 : 0, numSteps = minDim >> minStepBit, dimShift = minDimBit - minStepBit, and K and N are integers.
In some embodiments, N = numSteps.
In some embodiments, N depends on W and/or H.
In some embodiments, for both the left adjacent column and the top adjacent row of the video block, the selected adjacent samples have a pixel distance greater than or equal to S, where S is an integer.
In some embodiments, S = 1.
In some embodiments, when the width of the current video block is greater than 16, W is set equal to 16; when the height of the current video block is greater than 16, H is set equal to 16.
In some embodiments, when the width of the current video block is greater than 16, W is set equal to the width of the current video block; when the height of the current video block is greater than 16, H is set equal to the height of the current video block.
In some embodiments, the LIC parameters are derived by using a least squares error method.
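As an illustration of the least-squares alternative, a minimal floating-point sketch follows; it assumes the LIC model maps a reference-block neighboring sample x to the co-located current-block neighboring sample y via y ≈ a*x + b, which is one common convention and not a statement of the exact integer arithmetic used here. The function name is hypothetical.

```python
def lic_params_least_squares(ref_neighbors, cur_neighbors):
    """Illustrative sketch: fit y = a*x + b by least squares over the selected
    neighbor pairs (x from the reference block, y from the current block).
    A real codec would use shift-based integer arithmetic instead of division."""
    n = len(ref_neighbors)
    sum_x = sum(ref_neighbors)
    sum_y = sum(cur_neighbors)
    sum_xx = sum(v * v for v in ref_neighbors)
    sum_xy = sum(rx * cy for rx, cy in zip(ref_neighbors, cur_neighbors))
    denom = n * sum_xx - sum_x * sum_x
    if denom == 0:
        return 1.0, 0.0                    # degenerate case: fall back to identity
    a = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - a * sum_x) / n
    return a, b
```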
In some embodiments, the LIC parameters are derived by using a two-point method.
In some embodiments, the two points x_A and x_B are the minimum and maximum samples within a set of selected adjacent samples of the video block, and their corresponding samples in the reference picture are denoted y_A and y_B.
In some embodiments, four or eight adjacent samples of the video block at the selected locations are compared to find the two smallest samples, x0_A and x1_A, and the two largest samples, x0_B and x1_B; their corresponding samples in the reference picture are denoted y0_A, y1_A, y0_B and y1_B, and x_A, x_B, y_A and y_B are derived as:
x_A = (x0_A + x1_A + off) >> 1; x_B = (x0_B + x1_B + off) >> 1; y_A = (y0_A + y1_A + off) >> 1; y_B = (y0_B + y1_B + off) >> 1, where off equals 1 or 0.
In some embodiments, eight adjacent samples of the video block at the selected locations are compared to find the four smaller samples, x0_A, x1_A, x2_A and x3_A, and the four larger samples, x0_B, x1_B, x2_B and x3_B; their corresponding samples in the reference picture are denoted y0_A, y1_A, y2_A, y3_A, y0_B, y1_B, y2_B and y3_B, and x_A, x_B, y_A and y_B are derived as:
x_A = (x0_A + x1_A + x2_A + x3_A + off) >> 2; x_B = (x0_B + x1_B + x2_B + x3_B + off) >> 2;
y_A = (y0_A + y1_A + y2_A + y3_A + off) >> 2; y_B = (y0_B + y1_B + y2_B + y3_B + off) >> 2,
where off may be equal to 2 or 0.
In some embodiments, the LIC parameter β is derived by using an average value rather than the minimum value (x_A, y_A), the average being computed within the set of selected neighboring samples of the video block and the set of selected neighboring samples of its reference block.
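A minimal sketch of the two-point derivation described above is shown next; following the text, the x-values come from the current block's selected neighbors and the y-values from the co-located reference-picture samples, and the two smallest and two largest pairs are averaged with the rounding offset off. The function name is an assumption, and the division stands in for whatever shift or look-up arithmetic an actual codec would use.

```python
def lic_params_two_point(cur_neighbors, ref_neighbors, off=1):
    """Illustrative sketch: average the two smallest and two largest pairs,
    then fit a line through (x_A, y_A) and (x_B, y_B)."""
    # Sort pairs by the current-block sample value (the x-values in the text).
    pairs = sorted(zip(cur_neighbors, ref_neighbors))
    (x0A, y0A), (x1A, y1A) = pairs[0], pairs[1]      # two smallest
    (x0B, y0B), (x1B, y1B) = pairs[-2], pairs[-1]    # two largest
    xA = (x0A + x1A + off) >> 1
    xB = (x0B + x1B + off) >> 1
    yA = (y0A + y1A + off) >> 1
    yB = (y0B + y1B + off) >> 1
    if xB == xA:
        return 1.0, 0.0                              # degenerate case
    a = (yB - yA) / (xB - xA)
    b = yA - a * xA
    return a, b
```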
In some embodiments, one-sided selection is involved in the LIC parameter derivation process, wherein only the upper neighboring samples or the left neighboring samples are involved in the LIC parameter derivation process, even if both upper neighboring samples and left neighboring samples are available.
In some embodiments, one-side selection is invoked only when the current video block is non-square.
In some embodiments, the selection of a side depends on the dimension of the video block, where the side is the top side or the left side.
In some embodiments, the longer of the upper and left sides is selected to derive the LIC parameters.
In some embodiments, if the height of a video block is less than the width, the LIC parameters are derived using only the top neighboring samples of the video block and its reference block.
In some embodiments, if the height of a video block is greater than the width, the LIC parameters are derived using only the left neighboring samples of the video block and its reference block.
In some embodiments, if the height of a video block is less than the width, then only the above-neighboring samples of the video block and its reference block are used to derive the LIC parameters, where numSteps = numSteps << 1 and dimShift += 1, and the numSteps above-neighboring samples of the video block and its reference block are used.
In some embodiments, if the height of a video block is greater than the width, then only the left-neighboring samples of the video block and its reference block are used to derive the LIC parameters, where numSteps = numSteps << 1 and dimShift += 1, and the numSteps left-neighboring samples of the video block and its reference block are used.
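For illustration, a small sketch of the one-side selection just described, assuming the shorter side is dropped for a non-square block and the step count along the longer side is doubled so that the total number of samples is preserved; all names here are hypothetical.

```python
import math

def one_side_selection(W, H):
    """Illustrative sketch: choose the longer side and double numSteps
    (with dimShift incremented) when the block is non-square."""
    minDim = min(W, H)
    minDimBit = int(math.log2(minDim))
    minStepBit = 1 if minDim > 8 else 0
    numSteps = minDim >> minStepBit
    dimShift = minDimBit - minStepBit
    if H == W:
        return "both", numSteps, dimShift
    side = "above" if H < W else "left"
    numSteps <<= 1        # numSteps = numSteps << 1
    dimShift += 1         # dimShift += 1
    return side, numSteps, dimShift
```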
In some embodiments, if neighboring samples of a video block are coded using intra mode and/or mixed intra and inter mode or/and IBC mode, the samples are considered unavailable and replaced with available neighboring samples selected from samples coded using non-intra mode and/or non-CIIP (combined intra-inter prediction) mode and/or non-IBC (intra block copy) mode.
In some embodiments, an unavailable sample is replaced by its nearest available neighbor.
In some embodiments, if the currently unavailable sample is in the top row of the video block, the nearest available sample is the sample, coded using non-intra mode and/or non-CIIP mode or/and non-IBC mode, that is the shortest distance before or after the currently unavailable sample in the acquisition order of the top adjacent samples.
In some embodiments, if the currently unavailable sample point is in the left column of the video block, the nearest available sample point is the sample point that is coded and decoded with the non-intra mode and/or the non-CIIP mode or/and the non-IBC mode at the shortest distance before or after the currently unavailable sample point in the acquisition order of the left neighboring sample point.
In some embodiments, the unavailable selected neighboring samples are replaced by their nearest available selected neighboring samples.
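A minimal sketch of the nearest-available replacement described above; the neighbor list is assumed to be given in acquisition order together with per-sample availability flags (samples coded in intra, CIIP or IBC mode would be flagged unavailable), and the function name is hypothetical.

```python
def replace_unavailable(samples, available):
    """Illustrative sketch: replace every unavailable sample with the value of
    its nearest available sample in acquisition order (before or after it)."""
    available_indices = [i for i, ok in enumerate(available) if ok]
    out = list(samples)
    for i, ok in enumerate(available):
        if not ok and available_indices:
            nearest = min(available_indices, key=lambda j: abs(j - i))
            out[i] = samples[nearest]
    return out

# Example (hypothetical values): the third sample is unavailable
print(replace_unavailable([100, 102, 0, 98], [True, True, False, True]))
# -> [100, 102, 102, 98] (ties are resolved toward the earlier index here)
```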
In some embodiments, the LIC mode is disabled according to the dimensionality of the video block.
In some embodiments, LIC is disabled when the video block is a 4x4 block.
In some embodiments, padding is applied in place of the unavailable spots.
In some embodiments, the padding comprises copying from available samples.
In some embodiments, for a video block that is LIC coded, neighboring samples that are coded with intra mode and/or mixed intra and inter mode or/and IBC mode are excluded from the derivation of LIC parameters.
In some embodiments, one or more samples acquired later are discarded to ensure that a total of 2^N samples are obtained to solve for the least squares error.
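As a small illustrative sketch of that truncation, assuming the collected neighbor samples are kept in acquisition order; the function name is hypothetical.

```python
def keep_power_of_two(samples):
    """Illustrative sketch: discard the samples acquired last so that the count
    kept is the largest power of two not exceeding the number collected."""
    n = len(samples)
    if n == 0:
        return samples
    keep = 1 << (n.bit_length() - 1)   # largest 2**N <= n
    return samples[:keep]
```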
In some embodiments, for a video block coded with LIC, selected neighboring samples coded with intra mode and/or mixed intra and inter mode or/and IBC mode are excluded from the derivation of LIC parameters.
In some embodiments, for a video block coded in LIC, neighboring samples coded in non-intra mode and/or non-CIIP mode or/and non-IBC mode are included in the derivation of LIC parameters.
In some embodiments, the conversion generates a video block of the video from the bitstream representation.
In some embodiments, the conversion generates a bitstream representation from video blocks of the video.
From the foregoing, it will be appreciated that specific embodiments of the technology disclosed herein have been described herein for purposes of illustration, but that various modifications may be made without deviating from the scope of the invention. Accordingly, the disclosed technology is not limited except as by the appended claims.
Embodiments of the subject matter and the functional operations described in this patent document can be implemented in various systems, digital electronic circuitry, or computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or combinations of one or more of them. Implementations of the subject matter described in this specification can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible and non-transitory computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term "data processing unit" or "data processing apparatus" encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus may include, in addition to hardware, code that creates an execution environment for the computer program, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
A computer program (also known as a program, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this specification can be performed by one or more computer programs executed by one or more programmable processors to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer does not necessarily have such a device. Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
It is intended that the specification, together with the drawings, be considered exemplary only, with examples being shown therein. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Further, the use of "or" is intended to include "and/or" unless the context clearly dictates otherwise.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Furthermore, although certain features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are shown in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the sequential order or at the particular times shown, or that all illustrated operations be performed, to achieve desirable results. Moreover, the division of various system components among the embodiments described in this patent document should not be understood as requiring such division in all embodiments.
Only a few embodiments and examples have been described and other embodiments, enhancements and variations can be made based on what is described and illustrated in this patent document.

Claims (139)

1. A method for video processing, comprising:
deriving, for a conversion between a video block of a video and a bitstream representation of the video block, LIC parameters in a local illumination compensation, LIC, model applied to the video block by using a fixed number of adjacent samples of the video block; and
performing the conversion based on the LIC parameters.
2. The method of claim 1, wherein the adjacent samples comprise adjacent contiguous or non-contiguous samples.
3. The method of claim 2, wherein LIC mode uses a fixed denominator or value for shifting in the LIC parameter derivation process.
4. The method of any of claims 1-3, wherein the number of adjacent samples selected for deriving the LIC parameters depends on availability of the adjacent samples or/and dimensions of the video block, wherein the width of the video block is W, the height of the video block is H, and the upper left coordinate of the video block is (x, y).
5. The method of claim 4, wherein the selected samples are located at (x-1, y + offsetY + Ky*H/Fy) and/or (x + offsetX + Kx*W/Fx, y-1), wherein Fx samples are selected from the top adjacent row, Kx is from 0 to Fx-1, Fy samples are selected from the left adjacent column, Ky is from 0 to Fy-1, and offsetX and offsetY are integers.
6. The method of claim 5, wherein offsetX = W/8 and offsetY = H/8.
7. The method of claim 4, wherein the fixed number is four.
8. The method of claim 7, wherein if both a left adjacent column and an upper adjacent row of the video block are available, SH samples of the left column and SW samples of the upper row are selected, wherein SH and SW are integers, SH <= H and SW <= W.
9. The method of claim 8, wherein SH = SW = 2.
10. The method of claim 7, wherein the selected sampling point is located at one of:
(a) (x + W/4, y -1),(x + 3W/4, y -1),(x -1, y+H/4), (x -1, y+3H/4);
(b) (x, y-1),(x + W-1, y-1),(x-1, y),(x-1, y+H-1);
(c) (x, y-1), (x + W-1, y-1), (x-1, y), (x-1, y + H-H/W), when H is greater than W;
(d) (x, y-1), (x + W-W/H, y-1), (x-1, y), (x-1, y + H-1) when H is less than W;
(e) (x, y-1),(x + W- max(1, W/H), y-1), (x-1, y),(x-1, y + H – max(1, H/W));
(f) (x+W/2-1, y-1),(x+W/2, y-1),(x-1, y+H/2-1),(x-1, y+H/2);
(h) (x+W/2-1, y-1),(x + W-1, y-1),(x-1, y+H/2-1),(x-1, y+H-1);
(i) (x, y-1),(x+W/2, y-1),(x-1, y),(x-1, y+H/2)。
11. The method of claim 7, wherein the selected sampling point is located at one of:
(a) (x + f1(N/4), y -1),(x + f1(3*N/4), y - 1),(x -1, y+f2(N/4)),(x -1, y+f2(3*N/4));
(b) (x, y-1),(x + f1(N-1), y-1),(x-1, y),(x-1, y+f2(N-1));
(c) (x+f1(N/2-1), y-1),(x+f1(N/2), y-1),(x-1, y+f2(N/2-1)),(x-1, y+f2(N/2));
(d) (x+f1(N/2-1), y-1),(x + f1(N-1), y-1),(x-1, y+f2(N/2-1)),(x-1, y+f2(N-1));
(e) (x, y-1),(x+f1(N/2), y-1),(x-1, y),(x-1, y+f2(N/2)),
wherein f1(K) = (K*W) >> dimShift, f2(K) = (K*H) >> dimShift, minDimBit = Log2[min(H, W)], minDim = min(H, W), minStepBit = minDim > 8 ? 1 : 0, numSteps = minDim >> minStepBit, dimShift = minDimBit - minStepBit, and K and N are integers.
12. The method of claim 11, wherein N = numSteps.
13. The method of claim 11, wherein N depends on W and/or H.
14. The method of claim 7, wherein samples are selected only from an upper adjacent row if the upper adjacent row is only available.
15. The method of claim 14, wherein SW samples of the top adjacent row are selected, wherein SW is an integer and SW <= W.
16. The method of claim 15, wherein SW = 2 or SW = 4.
17. The method of claim 14, wherein the sampling points are selected depending on width/height.
18. The method of claim 17, wherein when W > 2, four samples are selected.
19. The method of claim 17, wherein two samples are selected when W equals 2.
20. The method of claim 14, wherein the selected sampling points are located at one of:
(1) (x, y-1),(x + W/4, y-1),(x + 2*W/4, y-1),(x + 3*W/4, y – 1);
(2) (x, y-1),(x + W/4, y-1),(x + 3*W/4, y – 1),(x + W-1, y -1);
(3) (x +W/8, y-1),(x +3W/8, y-1),(x +5W/8, y-1),(x +7W/8, y-1);
(4) (x, y-1),(x+W/2-1, y-1),(x+W/2, y-1),(x + W-1, y-1)。
21. The method of claim 14, wherein the selected sampling point is located at one of:
(1) (x, y-1),(x + f1(N/4), y-1),(x + f1(2*N/4), y-1),(x + f1(3*N/4), y – 1),
(2) (x, y-1),(x + f1(N/4), y-1),(x + f1(3*N/4), y – 1),(x + f1(N-1), y -1),
(3) (x + f1(N/8), y-1),(x + f1(3N/8), y-1),(x + f1(5N/8), y-1),(x + f1(7N/8), y-1),
(4) (x, y-1),(x+ f1(N/2-1), y-1),(x+ f1(N/2), y-1),(x + f1(N-1), y-1),
wherein f1(K) = (K*W) >> dimShift, f2(K) = (K*H) >> dimShift, minDimBit = Log2[min(H, W)], minDim = min(H, W), minStepBit = minDim > 8 ? 1 : 0, numSteps = minDim >> minStepBit, dimShift = minDimBit - minStepBit, and K and N are integers.
22. The method of claim 21, wherein N = numSteps.
23. The method of claim 21, wherein N depends on W and/or H.
24. The method of claim 7, wherein if only a left-side neighboring column is available, sampling points are selected only from the left-side neighboring column.
25. The method of claim 24, wherein SH samples of the left adjacent column are selected, wherein SH is an integer and SH <= H.
26. the method of claim 25, wherein SH = 2 or SH = 4.
27. The method of claim 24, wherein the sampling points are selected depending on width/height.
28. The method of claim 27, wherein when H > 2, four samples are selected.
29. The method of claim 27, wherein two samples are selected when H equals 2.
30. The method of claim 24, wherein the selected sampling point is located at one of:
(1) (x-1, y),(x -1, y + H/4),(x -1, y + 2*H/4),(x -1, y + 3*H/4);
(2) (x-1, y),(x - 1, y+ H/4),(x -1, y + 3*H/4),(x -1, y + H-1);
(3) (x-1, y+H/8),(x-1,y+3H/8),(x-1, y+5H/8),(x-1, y+7H/8);
(4) (x-1, y),(x-1, y+H/2-1),(x-1, y+H/2),(x-1, y+H-1)。
31. The method of claim 24, wherein the selected sampling points are located at one of:
(1) (x-1, y),(x -1, y + f2(N/4)),(x -1, y +f2(2*N/4)),(x -1, y + f2(3*N/4)),
(2) (x-1, y),(x - 1, y+f2(N/4)),(x -1, y +f2(3*N/4)),(x -1, y + f2(N-1)),
(3) (x-1, y+f2(N/8)),(x-1,y+f2(3N/8)),(x-1, y+f2(5N/8)),(x-1, y+f2(7N/8)),
(4) (x-1, y),(x-1, y+f2(N/2-1)),(x-1, y+f2(N/2)),(x-1, y+f2(N-1)),
wherein f1(K) = (K*W) >> dimShift, f2(K) = (K*H) >> dimShift, minDimBit = Log2[min(H, W)], minDim = min(H, W), minStepBit = minDim > 8 ? 1 : 0, numSteps = minDim >> minStepBit, dimShift = minDimBit - minStepBit, and K and N are integers.
32. The method of claim 31, wherein N = numSteps.
33. The method of claim 31, wherein N depends on W and/or H.
34. The method of claim 4, wherein the fixed number is eight.
35. The method of claim 34, wherein if both a left adjacent column and an upper adjacent row of the video block are available, then four samples of the left column and four samples of the upper row are selected.
36. The method of claim 35, wherein when W > 2, the four samples in the upper row are selected.
37. The method of claim 35, wherein the two samples in the upper row are selected when W equals 2.
38. The method of claim 35, wherein when H > 2, the four samples in the left column are selected.
39. The method of claim 35, wherein two samples of the left column are selected when H equals 2.
40. The method of claim 34, wherein the selected sampling points are located at one of:
(a)(x +W/8, y-1),(x +3W/8, y-1),(x +5W/8, y-1),(x +7W/8, y-1),(x-1, y+H/8),(x-1,y+3H/8),(x-1, y+5H/8),(x-1, y+7H/8);
(b)(x, y-1),(x+W/2-1, y-1),(x+W/2, y-1),(x + W-1, y-1),(x-1, y),(x-1, y+H/2-1),(x-1, y+H/2),(x-1, y+H-1);
(c)(x, y-1),(x + W/4, y-1),(x + 2*W/4, y-1),(x + 3*W/4, y – 1),(x-1, y),(x -1, y + H/4),(x -1, y + 2*H/4),(x -1, y + 3*H/4);
(d)(x, y-1),(x + W/4, y-1),(x + 3*W/4, y – 1),(x + W-1, y -1),(x-1, y),(x - 1, y+ H/4),(x -1, y + 3*H/4),(x -1, y + H-1)。
41. The method of claim 34, wherein the selected sampling point is located at one of:
(a)(x+f1(N/8), y-1),(x+f1(3N/8), y-1),(x+f1(5N/8), y-1),(x+f1(7N/8), y-1),(x-1, y+f2(N/8)),(x-1,y+f2(3N/8)),(x-1, y+f2(5N/8)),(x-1, y+f2(7N/8));
(b)(x, y-1),(x+f1(N/2-1), y-1),(x+f1(N/2), y-1),(x+f1(N-1), y-1),(x-1, y),(x-1, y+f2(N/2-1)),(x-1, y+f2(N/2)),(x-1, y+f2(N-1));
(c)(x, y-1),(x+f1(N/4), y-1),(x+f1(2*N/4), y-1),(x+f1(3*N/4), y – 1),(x-1, y),(x -1, y+f2(N/4)),(x -1, y+f2(2*N/4)),(x -1, y+f2(3*N/4));
(d) (x, y-1),(x+f1(N/4), y-1),(x+ f1(3*N/4), y – 1),(x+f1(N-1), y -1),(x-1, y),(x - 1, y+f2(N/4)),(x -1, y+f2(3*N/4)),(x -1, y+f2(N-1)),
wherein f1(K) = (K*W) >> dimShift, f2(K) = (K*H) >> dimShift, minDimBit = Log2[min(H, W)], minDim = min(H, W), minStepBit = minDim > 8 ? 1 : 0, numSteps = minDim >> minStepBit, dimShift = minDimBit - minStepBit, and K and N are integers.
42. The method of claim 41, wherein N = numSteps.
43. The method of claim 41, wherein N depends on W and/or H.
44. The method of claim 34, wherein if only an upper adjacent row is available, then only samples from the upper adjacent row are selected.
45. The method of claim 44, wherein SW samples of the top adjacent row are selected, wherein SW is an integer.
46. The method of claim 45, wherein SW = 2 or SW = 4, or SW = 8.
47. The method of claim 45, wherein the sampling points are selected depending on width/height.
48. The method of claim 47, wherein when W > 4, eight samples are selected, or when W > 2, four samples are selected, or two samples are selected when W is equal to 2.
49. The method of claim 44, wherein the selected samples are located at one of:
(a)(x, y-1),(x + W/8, y-1),(x + 2*W/8, y-1),(x + 3*W/8, y – 1),(x + 4*W/8, y-1),(x + 5*W/8, y – 1),(x + 6*W/8, y-1),(x + 7*W/8, y – 1);
(b)(x, y-1),(x + W/8, y-1),(x + 2*W/8, y-1),(x + 3*W/8, y – 1),(x + 5*W/8, y – 1),(x + 6*W/8, y-1),(x + 7*W/8, y – 1),(x + W-1, y -1);
(c)(x +W/16, y-1),(x +3W/16, y-1),(x +5W/16, y-1),(x +7W/16, y-1),(x +9W/16, y-1),(x +11W/16, y-1),(x +13W/16, y-1),(x +15W/16, y-1);
(d)(x, y-1),(x+W/4-1, y-1),(x+W/4, y-1),(x+W/2-1, y-1),(x+W/2, y-1),(x+3W/4-1, y-1),(x+3W/4, y-1),(x + W-1, y-1)。
50. The method of claim 44, wherein the selected samples are located at one of:
(a) (x-1, y),(x -1, y+f2(N/8)),(x -1, y+f2(2*N/8)),(x -1, y+f2(3*N/8)),(x -1, y+f2(4*N/8)),(x-1, y+f2(5*N/8)),(x-1, y+f2(6*N/8)),(x-1, y+f2(7*N/8))
(b) (x-1, y),(x -1, y+f2(N/8)),(x -1, y+f2(2*N/8)),(x -1, y+f2(3*N/8)),(x -1, y+f2( 5*N/8)),(x -1, y+f2( 6*N/8)),(x -1, y+f2(7*N/8)),(x -1, y+f2( N-1))
(c) (x-1, y+f2(N/16)),(x-1,y+f2(3N/16)),(x-1, y+f2(5N/16)),(x-1, y+f2(7N/16)),(x-1, y+f2(9N/16)),(x-1,y+f2(11N/16)),(x-1, y+f2(13N/16)),(x-1, y+f2(15N/16))
(d) (x-1, y),(x-1, y+f2(N/4-1)),(x-1, y+f2(N/4)),(x-1, y+f2(N/2-1)),(x-1, y+f2(N/2)),(x-1, y+f2(3N/4-1)),(x-1, y+f2(3N/4)),(x-1, y+f2(N-1)),
wherein f1(K) = (K*W) >> dimShift, f2(K) = (K*H) >> dimShift, minDimBit = Log2[min(H, W)], minDim = min(H, W), minStepBit = minDim > 8 ? 1 : 0, numSteps = minDim >> minStepBit, dimShift = minDimBit - minStepBit, and K and N are integers.
51. The method of claim 50, wherein N = numSteps.
52. The method of claim 50, wherein N depends on W and/or H.
53. The method of claim 4, wherein the selected adjacent samples have a pixel distance greater than or equal to S, S being an integer, for both a left adjacent column and an upper adjacent row of the video block.
54. The method of claim 53, wherein S = 1.
55. The method of claim 4, wherein, when the width of a current video block is greater than 16, W is set equal to 16; h is set equal to 16 when the height of the current video block is greater than 16.
56. The method of claim 4, wherein when the width of a current video block is greater than 16, W is set equal to the width of the current video block; h is set equal to the height of the current video block when the height of the current video block is greater than 16.
57. The method as claimed in claim 4, wherein the LIC parameters are derived by using a least squares error method.
58. The method of claim 4, wherein the LIC parameters are derived by using a two-point method.
59. The method of claim 58, wherein the two points x_A and x_B are the minimum and maximum samples within a set of selected neighboring samples of the video block, the corresponding samples in a reference picture being denoted y_A and y_B.
60. The method of claim 59, wherein four or eight adjacent samples of the video block at selected locations are compared to find the two minimum samples, x0_A and x1_A, and the two maximum samples, x0_B and x1_B, and their corresponding samples in the reference picture are denoted y0_A, y1_A, y0_B and y1_B, wherein x_A, x_B, y_A and y_B are derived as: x_A = (x0_A + x1_A + off) >> 1; x_B = (x0_B + x1_B + off) >> 1; y_A = (y0_A + y1_A + off) >> 1; y_B = (y0_B + y1_B + off) >> 1, where off equals 1 or 0.
61. The method of claim 59, wherein eight adjacent samples of the video block at selected locations are compared to find the four smaller samples, x0_A, x1_A, x2_A and x3_A, and the four larger samples, x0_B, x1_B, x2_B and x3_B, and their corresponding samples in the reference picture are denoted y0_A, y1_A, y2_A, y3_A, y0_B, y1_B, y2_B and y3_B, wherein x_A, x_B, y_A and y_B are derived as: x_A = (x0_A + x1_A + x2_A + x3_A + off) >> 2; x_B = (x0_B + x1_B + x2_B + x3_B + off) >> 2; y_A = (y0_A + y1_A + y2_A + y3_A + off) >> 2; y_B = (y0_B + y1_B + y2_B + y3_B + off) >> 2, where off may be equal to 2 or 0.
62. The method of claim 58, wherein the LIC parameter β is derived by using an average value rather than the minimum value (x_A, y_A), the average being calculated within the set of selected neighboring samples of the video block and of its reference block.
63. The method according to any of claims 1-3, wherein one-sided selection is involved in the LIC parameter derivation process, wherein only upper-neighboring samples or left-neighboring samples are involved in the LIC parameter derivation process even if both upper-neighboring and left-neighboring samples are available.
64. The method of claim 63, wherein the one-side selection is invoked only when a current video block is non-square.
65. The method of claim 63, wherein the side selection depends on a dimension of the video block, wherein the side is an upper side or a left side.
66. The method of claim 65, wherein a longer one of the upper side and the left side is selected to derive the LIC parameter.
67. The method of claim 65, wherein if the height of the video block is less than the width, the LIC parameters are derived using only the above-neighboring samples of the video block and its reference block.
68. The method of claim 65, wherein if the height of the video block is greater than the width, the LIC parameter is derived using only the left neighboring samples of the video block and its reference block.
69. The method of claim 65, wherein if the height of the video block is less than the width, the LIC parameters are derived using only the above-neighboring samples of the video block and its reference block, wherein numSteps = numSteps << 1 and dimShift += 1, and numSteps above-neighboring samples of the video block and its reference block are used.
70. The method of claim 65, wherein if the height of the video block is greater than the width, the LIC parameters are derived using only the left-neighboring samples of the video block and its reference block, wherein numSteps = numSteps << 1 and dimShift += 1, and numSteps left-neighboring samples of the video block and its reference block are used.
71. The method of any of claims 1-3, wherein if neighboring samples of the video block are coded using intra mode and/or mixed intra and inter mode or/and IBC mode, the samples are considered unavailable and replaced with available neighboring samples selected from samples coded using non-intra mode and/or non-CIIP (combined intra-inter prediction) mode and/or non-IBC (intra block copy) mode.
72. The method of claim 71, wherein an unavailable sample is replaced by its nearest available neighbor.
73. The method of claim 72, wherein if a currently unavailable sample is in an upper row of the video block, the nearest available sample is a sample that is coded using non-Intra mode and/or non-CIIP mode or/and non-IBC mode that is the shortest distance before or after the currently unavailable sample in acquisition order of upper neighboring samples.
74. The method of claim 72, wherein if a currently unavailable sample is in a left column of the video block, the nearest available sample is a sample that is coded using non-Intra mode and/or non-CIIP mode or/and non-IBC mode that is the shortest distance before or after the currently unavailable sample in an acquisition order of left neighboring samples.
75. The method of claim 71, wherein an unavailable selected neighboring sample is replaced by its nearest available selected neighboring sample.
76. The method of claim 71, wherein an LIC mode is disabled according to a dimension of the video block.
77. The method of claim 76, wherein LIC is disabled when the video block is a 4x4 block.
78. The method of claim 71, wherein padding is applied to replace the unavailable samples.
79. The method of claim 71, wherein padding comprises copying from available samples.
80. A method according to any of claims 1-3, wherein for LIC coded video blocks, neighboring samples coded with intra mode and/or mixed intra and inter mode or/and IBC mode are excluded from the derivation of LIC parameters.
81. The method of claim 80, wherein one or more samples acquired later are discarded to ensure that a total of 2^N samples are obtained to solve for the least squares error.
82. The method of claim 80, wherein for LIC coded video blocks, selected neighboring samples coded with intra mode and/or mixed intra and inter mode or/and IBC mode are excluded from the derivation of the LIC parameters.
83. The method according to any of claims 1-3, wherein for LIC coded video blocks, the deriving of the LIC parameters comprises using the neighboring samples of non-intra mode and/or non-CIIP mode or/and non-IBC mode coding.
84. The method according to any one of claims 1-3, further comprising:
deriving a motion candidate list for the transitions between video blocks of the video and the bitstream representation of the video blocks, wherein a first candidate in the motion candidate list is set to have a local illumination compensation, LIC, flag; and
performing the conversion using the motion candidate list, wherein, during the conversion, upon selection of the first candidate from the motion candidate list, it is determined whether LIC is enabled based on the flag of the first candidate.
85. The method of claim 84, wherein the LIC flag of the first candidate is set according to at least one of:
a type of the first candidate;
a LIC flag for deriving a second candidate of the first candidates;
a type of a second candidate in the motion candidate list used to derive the first candidate;
LIC flags of other candidates in the motion candidate list.
86. The method of claim 85, wherein the first candidate is a pair candidate in a merge candidate list for the video block.
87. The method of claim 86, wherein the LIC flag associated with a paired candidate is always set to false.
88. The method of claim 86, wherein the LIC flag is set to false and LIC is disabled when the video block is coded with paired candidates.
89. The method of claim 86, wherein the LIC flag associated with a pair candidate is set based on two merge candidates used to derive the pair candidate.
90. The method of claim 89 wherein if an LIC flag associated with at least one of said two merge candidates is true, setting an LIC flag associated with said pair candidate to true.
91. The method of claim 89, wherein if an LIC flag associated with both of said merge candidates is true, setting an LIC flag associated with said pair of candidates to true.
92. The method of claim 89, wherein if at least one of said two merge candidates is a Temporal Motion Vector Prediction (TMVP) candidate, setting a LIC flag associated with the paired candidate to false.
93. The method of claim 89, wherein if at least one of said two merge candidates is a history-based motion vector prediction (HMVP) candidate, then setting a LIC flag associated with a paired candidate to false.
94. The method of claim 85, wherein the first candidate is a zero motion merge candidate in a merge candidate list for the video block.
95. The method of claim 94 wherein the LIC flag associated with a zero motion merge candidate is always set to false.
96. The method of claim 11, wherein the LIC flag associated with a zero motion merge candidate is set according to LIC flags associated with other merge candidates in the merge candidate list.
97. The method of claim 96, wherein if a particular merge candidate in the list of merge candidates is associated with an LIC flag equal to true, then setting the LIC flag associated with a zero motion merge candidate to true.
98. The method of claim 97, wherein the particular merge candidate is a first merge candidate.
99. The method of any of claims 1-3, further comprising:
determining, for the transitions between the video blocks of the video and the bitstream representations of the video blocks, at least one of:
enabling or disabling Local Illumination Compensation (LIC) for at least a portion of the video block based on a property of the video block,
whether LIC is enabled for a reference picture list, an
LIC parameter of at least one reference picture list; and
performing the conversion based on the determination.
100. The method of claim 99, wherein the property comprises a coding mode of the video block, and LIC is enabled when the video block is coded using an alternative temporal motion vector prediction (ATMVP) mode.
101. The method of claim 100, wherein whether LIC is enabled further depends on spatial merge candidates derived from neighboring blocks of the video block.
102. The method of claim 100, wherein when LIC is enabled for the video block, a set of LIC parameters for each reference picture list is derived for the entire video block.
103. The method of claim 102, wherein if LIC is enabled for a sub-block of the video block, all sub-blocks share the same LIC parameters.
104. The method of claim 103, wherein one or more motion vectors of a sub-block are used to identify neighboring reference samples of the entire video block, the neighboring reference samples being used to derive LIC parameters of the entire video block.
105. The method of claim 104, wherein the one sub-block is the top-left sub-block.
106. The method of claim 104, wherein the one sub-block is a center sub-block.
107. The method of claim 104, wherein different sub-blocks are selected for different color components.
108. The method of claim 99, wherein when LIC is enabled for the video chunk, different sub-regions within the video chunk use different LIC parameters.
109. The method of claim 108, wherein a plurality of sets of LIC parameters are derived for each reference picture list, and each sub-region within the video block selects one of the plurality of sets of LIC parameters.
110. The method of claim 108, wherein the sub-region is a sub-block of a block that is coded using a sub-block based technique that includes at least one of an ATMVP mode and an affine mode.
111. The method of claim 99, wherein LIC is enabled for one reference picture list and disabled for another reference picture list.
112. The method of claim 99, wherein two LIC flags are stored for each video block.
113. The method of claim 99, wherein, for a pair of merge candidates, the LIC flag is inherited from each of the two merge candidates used to derive the pair.
114. The method of claim 99, wherein, when the video block is a bi-predictive coded block, LIC parameters are derived once for two reference picture lists.
115. The method of claim 114, wherein the LIC parameters are derived from motion information of one reference picture list.
116. The method of claim 114, wherein the LIC parameters are derived from motion information of two reference picture lists.
117. The method of claim 116, wherein neighboring samples with respect to two reference pictures in the two reference picture lists are both utilized.
118. The method of claim 116, wherein the selection of neighboring samples relative to the two reference pictures in the two reference picture lists is different for the two reference picture lists.
119. The method of claim 99, wherein LIC is enabled for a portion of sub-regions within the video block and disabled for remaining sub-regions.
120. The method of claim 119, wherein LIC is enabled for one sub-block and LIC is disabled for another sub-block when the video block is an ATMVP codec block.
121. The method of claim 119, wherein, when the video block is an affine codec block, LIC is enabled for one sub-block and LIC is disabled for another sub-block.
122. The method of claim 120, wherein the sub-blocks are 8 x 8 blocks.
123. The method of claim 99, wherein the property comprises a location of the video block.
124. The method of claim 123, wherein LIC is disabled when the video block is located at a boundary of at least one of a picture, a slice, and a brick.
125. The method of claim 99, wherein the property comprises a codec mode.
126. The method of claim 125, wherein LIC mode and a transform bypass (TransBypass) mode are used exclusively of each other, the transform bypass mode comprising at least one of a transform skip (TS) mode and a quantized residual block differential pulse code modulation (QR-BDPCM) mode.
127. The method of claim 126, wherein if the video block is coded with LIC mode, a TransBypass mode comprising TS mode is disabled.
128. The method of claim 126, wherein the signaling of the TS mode is skipped if the TS mode is disabled.
129. The method of claim 126, wherein the LIC mode side information is conditionally signaled based on an indication of the use of the TS mode.
130. The method of claim 129, wherein signaling of side information for LIC mode is skipped if TS mode is enabled for the video block.
131. The method of any of claims 1-3, further comprising:
determining whether and/or how to enable and/or apply a loop filter process and/or a post-reconstruction filtering process based on a use of local illumination compensation, LIC, for the transitions between video blocks of the video and the bitstream representations of the video blocks, wherein the loop filter process comprises a deblocking filter, a sample adaptive offset, SAO, an adaptive loop filter, ALF, and the post-reconstruction filtering process comprises a bilateral filter; and
performing the conversion based on the determination.
132. The method of claim 131, wherein for two contiguous blocks with different LIC flags, filtering is performed on an edge between the two contiguous blocks.
133. The method of claim 132, wherein the boundary strength is set to M if the two contiguous blocks are coded with different LIC flags, M not equal to 0.
134. The method of claim 131, wherein for two contiguous blocks that both have an LIC flag equal to 1, filtering is performed on an edge between the two contiguous blocks.
135. The method of claim 133, wherein if both of the two neighboring blocks are coded with a LIC flag equal to 1, then setting a boundary strength to M, M not equal to 0.
136. The method of any of claims 1-3, wherein the converting generates the video block of video from the bitstream representation.
137. The method of any of claims 1-3, wherein the converting generates the bitstream representation from the video blocks of the video.
138. An apparatus in a video system comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to implement the method of any of claims 1-137.
139. A non-transitory computer readable medium storing instructions that cause a processor to implement the method of any of claims 1-137.
CN202080037225.8A 2019-05-20 2020-05-20 Simplified local illumination compensation Active CN113841396B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CNPCT/CN2019/087620 2019-05-20
CN2019087620 2019-05-20
PCT/CN2020/091299 WO2020233600A1 (en) 2019-05-20 2020-05-20 Simplified local illumination compensation

Publications (2)

Publication Number Publication Date
CN113841396A CN113841396A (en) 2021-12-24
CN113841396B true CN113841396B (en) 2022-09-13

Family

ID=73458289

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080037225.8A Active CN113841396B (en) 2019-05-20 2020-05-20 Simplified local illumination compensation

Country Status (2)

Country Link
CN (1) CN113841396B (en)
WO (1) WO2020233600A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117121484A (en) * 2021-02-01 2023-11-24 交互数字Ce专利控股有限公司 Method and apparatus for encoding or decoding video
CN117581542A (en) * 2021-09-06 2024-02-20 北京达佳互联信息技术有限公司 Candidate derivation for affine merge mode in video codec
WO2023134452A1 (en) * 2022-01-11 2023-07-20 Beijing Bytedance Network Technology Co., Ltd. Method, apparatus, and medium for video processing
WO2023200561A1 (en) * 2022-04-13 2023-10-19 Qualcomm Incorporated Methods for adaptive signaling of maximum number of merge candidates in multiple hypothesis prediction
WO2024037649A1 (en) * 2022-08-19 2024-02-22 Douyin Vision Co., Ltd. Extension of local illumination compensation

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101193302A (en) * 2006-12-01 2008-06-04 三星电子株式会社 Illumination compensation method and apparatus and video encoding and decoding method and apparatus
CN102215389A (en) * 2010-04-09 2011-10-12 华为技术有限公司 Video coding and decoding methods and devices capable of realizing local luminance compensation
CN107690810A (en) * 2015-06-09 2018-02-13 高通股份有限公司 It is determined that the system and method for the illumination compensation state for video coding
WO2018056603A1 (en) * 2016-09-22 2018-03-29 엘지전자 주식회사 Illumination compensation-based inter-prediction method and apparatus in image coding system
CN108462873A (en) * 2017-02-21 2018-08-28 联发科技股份有限公司 The method and apparatus that the Candidate Set of block determines is split for quaternary tree plus binary tree
EP3468193A1 (en) * 2017-10-05 2019-04-10 Thomson Licensing Method and apparatus for adaptive illumination compensation in video encoding and decoding

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3013049A4 (en) * 2013-06-18 2017-02-22 Sharp Kabushiki Kaisha Illumination compensation device, lm predict device, image decoding device, image coding device
US10390015B2 (en) * 2016-08-26 2019-08-20 Qualcomm Incorporated Unification of parameters derivation procedures for local illumination compensation and cross-component linear model prediction
WO2018045944A1 (en) * 2016-09-06 2018-03-15 Mediatek Inc. Methods and apparatuses of candidate set determination for binary-tree splitting blocks
CN107147911B (en) * 2017-07-05 2019-07-26 中南大学 Quick interframe coding mode selection method and device based on local luminance compensation LIC
US11310535B2 (en) * 2017-07-05 2022-04-19 Telefonaktiebolaget Lm Ericsson (Publ) Deblocking filtering control for illumination compensation

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101193302A (en) * 2006-12-01 2008-06-04 三星电子株式会社 Illumination compensation method and apparatus and video encoding and decoding method and apparatus
CN102215389A (en) * 2010-04-09 2011-10-12 华为技术有限公司 Video coding and decoding methods and devices capable of realizing local luminance compensation
CN107690810A (en) * 2015-06-09 2018-02-13 高通股份有限公司 It is determined that the system and method for the illumination compensation state for video coding
WO2018056603A1 (en) * 2016-09-22 2018-03-29 엘지전자 주식회사 Illumination compensation-based inter-prediction method and apparatus in image coding system
CN108462873A (en) * 2017-02-21 2018-08-28 联发科技股份有限公司 The method and apparatus that the Candidate Set of block determines is split for quaternary tree plus binary tree
EP3468193A1 (en) * 2017-10-05 2019-04-10 Thomson Licensing Method and apparatus for adaptive illumination compensation in video encoding and decoding

Also Published As

Publication number Publication date
WO2020233600A1 (en) 2020-11-26
CN113841396A (en) 2021-12-24

Similar Documents

Publication Publication Date Title
CN112868240B (en) Collocated local illumination compensation and modified inter prediction codec
US11889108B2 (en) Gradient computation in bi-directional optical flow
CN110581999B (en) Chroma decoder side motion vector refinement
CN112868239B (en) Collocated local illumination compensation and intra block copy codec
US20240098295A1 (en) Efficient affine merge motion vector derivation
CN113056914B (en) Partial position based difference calculation
US11641467B2 (en) Sub-block based prediction
CN113841396B (en) Simplified local illumination compensation
CN113302918A (en) Weighted prediction in video coding and decoding
CN112956197A (en) Restriction of decoder-side motion vector derivation based on coding information
CN113316933A (en) Deblocking filtering using motion prediction
JP2023145563A (en) Inclination calculation in different motion vector fine adjustment
CN110740321B (en) Motion prediction based on updated motion vectors
CN110677674B (en) Method, apparatus and non-transitory computer-readable medium for video processing
CN113316935A (en) Motion candidate list using local illumination compensation
CN113383548A (en) Interaction between MV precision and MV differential coding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant