WO2019079206A2 - Multiple improvements to FRUC template matching - Google Patents

Multiple improvements to FRUC template matching

Info

Publication number
WO2019079206A2
WO2019079206A2 (PCT/US2018/055933)
Authority
WO
WIPO (PCT)
Prior art keywords
candidates
video
slice
template matching
block
Prior art date
Application number
PCT/US2018/055933
Other languages
English (en)
Other versions
WO2019079206A3 (fr)
Inventor
Vijayaraghavan Thirumalai
Xiang Li
Nan HU
Hsiao-Chiang Chuang
Marta Karczewicz
Original Assignee
Qualcomm Incorporated
Priority date
Filing date
Publication date
Application filed by Qualcomm Incorporated filed Critical Qualcomm Incorporated
Priority to AU2018350913A priority Critical patent/AU2018350913A1/en
Priority to CN201880065805.0A priority patent/CN111201794B/zh
Priority to BR112020007329-6A priority patent/BR112020007329A2/pt
Priority to KR1020207010186A priority patent/KR20200069303A/ko
Priority to SG11202001988QA priority patent/SG11202001988QA/en
Priority to EP18796349.1A priority patent/EP3698545A2/fr
Publication of WO2019079206A2 publication Critical patent/WO2019079206A2/fr
Publication of WO2019079206A3 publication Critical patent/WO2019079206A3/fr

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/517 Processing of motion vectors by encoding
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H04N19/537 Motion estimation other than block-based
    • H04N19/56 Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
    • H04N19/577 Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression

Definitions

  • This application relates to FRUC (frame-rate up conversion) template matching in the field of video encoding and decoding.
  • Video coding standards include ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multi-view Video Coding (MVC) extensions.
  • SVC Scalable Video Coding
  • MVC Multi-view Video Coding
  • HEVC High Efficiency Video Coding
  • ITU-T H.265 including its range extension, multiview extension (MV- HEVC) and scalable extension (SHVC)
  • JCT-VC Joint Collaboration Team on Video Coding
  • JCT-3V Joint Collaboration Team on 3D Video Coding Extension Development
  • ITU-T VCEG Q6/16
  • ISO/IEC MPEG JTC 1/SC 29/WG 11
  • JVET Joint Video Exploration Team
  • JEM5 Joint Exploration Test Model 5
  • the device for video decoding may include a memory configured to store video data.
  • the device may also include a processor configured to receive a bitstream including encoded video data.
  • the processor may be configured to select a number of template matching (TM) candidates for a temporal layer or slice during the video decoding. The number of TM candidates selected is fixed prior to the video decoding, or adaptively calculated during the video decoding.
  • the processor may be configured to generate a prediction block and residual block, based on a template matching candidate, to reconstruct the video data.
  • the techniques are also directed to a method of video coding, comprising selecting a number of template matching (TM) candidates based on a temporal layer or slice.
  • the number of TM candidates selected for the temporal layer or slice may be selectively fixed prior to coding, or adaptively calculated during coding.
  • the techniques further include determining a number of allowed TM candidates for the temporal layer or slice in the slice header, sequence parameter set (SPS), or picture parameter set (PPS).
  • SPS sequence parameter set
  • PPS picture parameter set
  • the techniques may include that, when the video coding is video decoding, the determining comprises receiving a bitstream including encoded video data, and generating a prediction block and residual block, based on a template matching candidate, to reconstruct the video data.
  • the techniques may include that, when the video coding is video encoding, the selected number of template matching candidates is signaled in a bitstream output by the video encoder.
  • This disclosure also relates to a device for video encoding.
  • the device may include a memory configured to store video data.
  • the device may also include a processor configured to select a number of template matching (TM) candidates based on a temporal layer or slice.
  • the number of TM candidates selected for the temporal layer or slice is selectively fixed prior to the video encoding, or adaptively calculated during the video encoding.
  • the processor may be configured to signal a number of allowed TM candidates for a temporal layer or slice in the slice header, sequence parameter set (SPS), or picture parameter set (PPS).
  • This disclosure also relates to a computer-readable medium having stored thereon instructions that, when executed by a processor, perform selecting a number of template matching (TM) candidates based on a temporal layer or slice.
  • the number of TM candidates selected for the temporal layer or slice is selectively fixed prior to encoding or decoding, or adaptively calculated during encoding or decoding.
  • the instructions, when executed by the processor, may also perform signaling a number of allowed TM candidates for the temporal layer or slice in the slice header, sequence parameter set (SPS), or picture parameter set (PPS).
  • This disclosure also relates to an apparatus that includes means for selecting a number of template matching (TM) candidates based on a temporal layer or slice.
  • the number of TM candidates selected for the temporal layer or slice is selectively fixed prior to encoding or decoding, or adaptively calculated during encoding or decoding.
  • the apparatus may also include means for signaling a number of allowed TM candidates for the temporal layer or slice in the slice header, sequence parameter set (SPS), or picture parameter set (PPS).
  • FIG. 1 (a) illustrates a conceptual diagram of spatial neighboring MV candidates for merge mode.
  • FIG. 1 (b) illustrates a conceptual diagram of AMVP mode.
  • FIG. 2 (a) illustrates a conceptual diagram of TMVP candidates.
  • FIG. 2 (b) illustrates a conceptual diagram of MV scaling.
  • FIG. 3 illustrates a conceptual diagram of Bilateral matching.
  • FIG. 4 illustrates a conceptual diagram of Template matching.
  • FIG. 5 (a) illustrates a flowchart of an existing FRUC template matching mode.
  • FIG. 5 (b) illustrates a flowchart of a proposed FRUC template matching mode.
  • FIG. 6 illustrates a conceptual diagram of optical flow trajectory.
  • FIG. 7 (a)-(c) illustrate an example of BIO for an 8x4 block.
  • FIG. 8 illustrates a proposed DMVD based on bilateral template matching.
  • FIG. 9 (a)-(b) illustrate examples of sub-blocks where OBMC applies.
  • FIG. 10 (a)-(d) illustrates examples of OBMC weightings.
  • FIG. 11 illustrates a flowchart of a process to decide between carrying out bi-prediction template matching or uni-prediction template matching.
  • FIG. 12 illustrates an exemplary video encoder that may be used to implement one or more of the techniques described in this disclosure.
  • FIG. 13 illustrates an exemplary video decoder that may be used to implement one or more of the techniques described in this disclosure.
  • FRUC template matching provides significant bit-rate reduction as the motion vector can be derived at the decoder side.
  • the coding complexity of the FRUC template matching method is high, especially at the encoder, due to motion vector refinement and rate-distortion (RD) calculations.
  • TM template matching
  • the number of TM candidates selected for a given temporal layer or slice can be fixed prior to encoding or decoding, or it can be adaptively calculated during the encoding or decoding process. For example, during encoding or decoding of a temporal layer or slice, the number of allowed TM candidates for the given temporal layer or slice may be included in a slice header, sequence parameter set (SPS), or picture parameter set (PPS). The inclusion of the number of TM candidates may be signaled.
  • SPS sequence parameter set
  • PPS picture parameter set
  • the signaling may be explicit, i.e., the number of allowed TM candidates may be part of the slice header, SPS, or PPS that a video encoder sends in a bitstream.
  • the signaling may be implicit, and the decoder may derive the number of allowed TM candidates, e.g., using DMVD (decoder- side motion vector derivation). Additional context will be described with reference to the figures.
  • CTB coding tree block
  • CTU coding tree unit
  • the size of a CTB can range from 16x16 to 64x64 in the HEVC main profile (although technically 8x8 CTB sizes can be supported).
  • a coding unit (CU) can be the same size as a CTB, and as small as 8x8.
  • Each coding unit is coded with one mode. When a CU is inter coded, it may be further partitioned into 2 or 4 prediction units (PUs), or become just one PU when further partitioning does not apply. When two PUs are present in one CU, they can be half-size rectangles, or two rectangles with 1/4 and 3/4 the size of the CU.
  • PUs prediction units
  • When the CU is inter coded, one set of motion information is present for each PU. In addition, each PU is coded with a unique inter-prediction mode to derive the set of motion information.
  • merge (skip is considered a special case of merge)
  • AMVP advanced motion vector prediction
  • a motion vector (MV) candidate list is maintained for multiple motion vector predictors.
  • the motion vector(s), as well as reference indices in the merge mode, of the current PU are generated by taking one candidate from the MV candidate list.
  • the MV candidate list contains up to 5 candidates for the merge mode and only two candidates for the AMVP mode.
  • a merge candidate may contain a set of motion information, e.g., motion vectors corresponding to both reference picture lists (list 0 and list 1) and the reference indices. When a merge candidate is identified by a merge index, the reference pictures used for the prediction of the current block are determined, as well as the associated motion vectors. Under AMVP mode, however, for each potential prediction direction from either list 0 or list 1, a reference index needs to be explicitly signaled, together with an MV predictor (MVP) index into the MV candidate list, since an AMVP candidate contains only a motion vector. In AMVP mode, the predicted motion vectors can be further refined.
  • MVP MV predictor
  • a merge candidate corresponds to a full set of motion information while an AMVP candidate contains just one motion vector for a specific prediction direction and reference index.
  • Spatial MV candidates are derived from the neighboring blocks shown in FIG. 1 for a specific PU (PU0), although the methods for generating the candidates from the blocks differ for the merge and AMVP modes.
  • FIG. 1 (a) illustrates a conceptual diagram of Spatial neighboring MV candidates for merge mode.
  • up to four spatial MV candidates can be derived in the numbered order shown in FIG. 1 (a): left (0, A1), above (1, B1), above right (2, B0), below left (3, A0), and above left (4, B2).
  • FIG. 1(b) illustrates a conceptual diagram of AMVP mode.
  • the neighboring blocks are divided into two groups: a left group consisting of blocks 0 and 1, and an above group consisting of blocks 2, 3, and 4, as shown in FIG. 1 (b).
  • the potential candidate in a neighboring block referring to the same reference picture as that indicated by the signaled reference index has the highest priority to be chosen to form the final candidate of the group. It is possible that no neighboring block contains a motion vector pointing to the same reference picture. In that case, the first available candidate is scaled to form the final candidate, so that the temporal distance differences can be compensated.
  • Temporal motion vector predictor (TMVP) candidate if enabled and available, is added into the MV candidate list after spatial motion vector candidates.
  • the process of motion vector derivation for a TMVP candidate is the same for both merge and AMVP modes; however, the target reference index for the TMVP candidate in merge mode is always set to 0.
  • FIG. 2 (a) illustrates a conceptual diagram of TMVP candidates.
  • FIG. 2 (b) illustrates a conceptual diagram of MV scaling.
  • the primary block location for TMVP candidate derivation is the bottom-right block outside of the collocated PU, shown as block "T" in FIG. 2 (a), to compensate for the bias toward the above and left blocks used to generate spatial neighboring candidates. However, if that block is located outside of the current CTB row, or its motion information is not available, the block is substituted with a center block of the PU.
  • The motion vector for the TMVP candidate is derived from the co-located PU of the co-located picture, indicated at the slice level.
  • the motion vector for the co-located PU is called collocated MV.
  • the co-located MV needs to be scaled to compensate for the temporal distance differences, as shown in FIG. 2 (b).
  • For motion vector scaling, it is assumed that the value of a motion vector is proportional to the distance between pictures in presentation time.
  • a motion vector associates two pictures, the reference picture, and the picture containing the motion vector (namely the containing picture).
  • the distance of the containing picture and the reference picture is calculated based on the Picture Order Count (POC) values.
  • POC Picture Order Count
  • For a predicted motion vector, both its associated containing picture and reference picture may be different. Therefore, a new distance (based on POC) is calculated, and the motion vector is scaled based on these two POC distances.
  • the containing pictures for the two motion vectors are the same, while the reference pictures are different.
  • motion vector scaling applies to both TMVP and AMVP for spatial and temporal neighboring candidates.
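  • As a concrete illustration of the POC-based scaling described above, the following C++ sketch scales a candidate MV by the ratio of the two POC distances. This is a simplified floating-point rendering; HEVC and JEM use an equivalent fixed-point formulation with additional clipping, and the type and function names here are illustrative only.

```cpp
#include <algorithm>

struct MotionVector { int x; int y; };

// Simplified sketch of POC-distance-based MV scaling.
// tb: POC distance between the current picture and the target reference.
// td: POC distance between the candidate's containing picture and its reference.
MotionVector scaleMv(const MotionVector& mv,
                     int pocCurr, int pocTargetRef,
                     int pocContaining, int pocCandRef) {
    const int tb = std::clamp(pocCurr - pocTargetRef, -128, 127);
    const int td = std::clamp(pocContaining - pocCandRef, -128, 127);
    if (td == 0 || tb == td) return mv;   // no scaling needed or possible
    const double scale = static_cast<double>(tb) / td;
    return { static_cast<int>(mv.x * scale), static_cast<int>(mv.y * scale) };
}
```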
  • bi-directional combined motion vector candidates are derived by a combination of the motion vector of the first candidate referring to a picture in the list 0 and the motion vector of a second candidate referring to a picture in the list 1.
  • Pruning process for candidate insertion: candidates from different blocks may happen to be the same, which decreases the efficiency of a merge/AMVP candidate list.
  • a pruning process is applied to solve this problem. It compares one candidate against the others in the current candidate list to avoid, to a certain extent, inserting identical candidates. To reduce complexity, only a limited number of pruning comparisons is applied, instead of comparing each potential candidate with all the existing ones.
  • Pattern matched motion vector derivation (PMMVD) mode is a special merge mode based on Frame-Rate Up Conversion (FRUC) techniques. With this mode, motion information of a block is not signaled but derived at decoder side. This technology was included in JEM.
  • a FRUC flag is signalled for a CU when its merge flag is true.
  • FRUC flag is false, a merge index is signalled, and the regular merge mode is used.
  • FRUC flag is true, an additional FRUC mode flag is signalled to indicate which method (bilateral matching or template matching) is to be used to derive motion information for the block.
  • a syntax table is used to code the flags for FRUC.
  • an initial motion vector is first derived for the whole CU based on bilateral matching or template matching.
  • the merge list of the CU (also called the PMMVD seeds) is checked, and the candidate which leads to the minimum matching cost is selected as the starting point.
  • a local search based on bilateral matching or template matching around the starting point is performed, and the MV that results in the minimum matching cost is taken as the MV for the whole CU.
  • the motion information is then further refined at the sub-block level, with the derived CU motion vectors as the starting points.
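  • The two-stage derivation above can be summarized with the following C++ sketch, which picks the minimum-cost PMMVD seed and then refines it locally. The cost and refine callbacks stand in for the bilateral/template matching cost and the local search; their internals are not fixed by this sketch.

```cpp
#include <functional>
#include <limits>
#include <vector>

struct Mv { int x, y; };

// Sketch of CU-level FRUC derivation: among the merge-list candidates
// ("PMMVD seeds"), pick the one with the minimum matching cost, then
// refine it with a local search around that starting point.
Mv deriveCuLevelMv(const std::vector<Mv>& seeds,
                   const std::function<double(Mv)>& cost,
                   const std::function<Mv(Mv)>& refine) {
    Mv best{0, 0};
    double bestCost = std::numeric_limits<double>::max();
    for (const Mv& s : seeds) {
        const double c = cost(s);
        if (c < bestCost) { bestCost = c; best = s; }
    }
    return refine(best);  // MV for the whole CU; sub-block refinement follows
}
```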
  • FIG. 3 illustrates bilateral matching.
  • the bilateral matching is used to derive motion information of the current block by finding the best match between two reference blocks along the motion trajectory of the current block in two different reference pictures.
  • the motion vectors MV0 and MV1 pointing to the two reference blocks shall be proportional to the temporal distances between the current picture and the two reference pictures.
  • when the temporal distances from the current picture to the two reference pictures are the same, the bilateral matching becomes a mirror-based bi-directional MV derivation.
  • FIG. 4 illustrates template matching.
  • template matching is used to derive motion information of the current block by finding the best match between a template (top and/or left neighbouring blocks of the current block) in the current picture and a block (same size to the template) in a reference picture.
  • the decision on whether to use FRUC merge mode for a CU is based on RD cost selection, as done for normal merge candidates. That is, the two matching modes (bilateral matching and template matching) are both checked for a CU using RD cost selection. The one leading to the minimal cost is further compared to the other CU modes. If a FRUC matching mode is the most efficient one, the FRUC flag is set to true for the CU and the related matching mode is used.
  • A flowchart of the existing FRUC template matching mode is shown in FIG. 5 (a).
  • a template T0 (and its corresponding motion information MV0) is found to match the current template Tc of the current block from list0 reference pictures.
  • a template T1 (and its corresponding motion information MV1) is found from list1 reference pictures. The obtained motion information MV0 and MV1 is used to perform bi-prediction to generate the predictor of the current block.
  • the existing FRUC template matching mode is enhanced by introducing bidirectional template matching and adaptive selection between uni-prediction and bi- prediction.
  • the proposed modifications for FRUC template matching mode are shown in FIG. 5 (b), compared to FIG. 5 (a), which illustrates the existing FRUC template matching mode.
  • a proposed bi-directional template matching is implemented based on the existing uni-directional template matching.
  • a matched template T0 is first found in the first step of template matching from list0 reference pictures. (Note that list0 here is only taken as an example; whether list0 or list1 is used in the first step is adaptive to the initial distortion cost between the current template and the initial template in the corresponding reference picture.)
  • the initial template can be determined with the initial motion information of the current block, which is available before performing the first template matching.
  • the updated current template T'c, instead of the current template Tc, is utilized to find another matched template T1 from list1 reference pictures in the second template matching.
  • the matched template T1 is thus found by jointly using list0 and list1 reference pictures. This matching process is called bi-directional template matching.
  • the proposed selection between uni-prediction and bi-prediction for motion compensation prediction (MCP) is based on template matching distortion.
  • the distortion between template T0 and Tc (the current template) can be calculated as cost0.
  • the distortion between template T1 and T'c (the updated current template) can be calculated as cost1. If cost0 is less than 0.5*cost1, uni-prediction based on MV0 is applied to FRUC template matching mode; otherwise, bi-prediction based on MV0 and MV1 is applied.
  • cost0 is compared to 0.5*cost1 because cost1 indicates the difference between template T1 and T'c (the updated current template), which is twice the difference between Tc (the current template) and its prediction 0.5*(T0+T1). A sketch of this decision is given below. It is noted that the proposed methods are applied only to PU-level motion refinement; sub-PU-level motion refinement remains unchanged.
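  • A minimal C++ sketch of this decision, assuming the updated current template is formed as T'c = 2*Tc - T0 (which is what makes cost1 twice the bi-prediction error) and that plain SAD is the distortion measure; both are illustrative simplifications.

```cpp
#include <cstdlib>
#include <vector>

// Sum of absolute differences over equally sized sample vectors.
static int sad(const std::vector<int>& a, const std::vector<int>& b) {
    int s = 0;
    for (size_t i = 0; i < a.size(); ++i) s += std::abs(a[i] - b[i]);
    return s;
}

// Returns true if uni-prediction (MV0 only) should be used.
// cost0 = distortion(Tc, T0); cost1 = distortion(T'c, T1), where
// T'c = 2*Tc - T0, so cost1 measures twice the error of the bi-prediction
// 0.5*(T0 + T1) against Tc. Hence the 0.5 factor in the comparison.
bool useUniPrediction(const std::vector<int>& Tc,
                      const std::vector<int>& T0,
                      const std::vector<int>& T1) {
    std::vector<int> TcUpd(Tc.size());
    for (size_t i = 0; i < Tc.size(); ++i) TcUpd[i] = 2 * Tc[i] - T0[i];
    const int cost0 = sad(Tc, T0);
    const int cost1 = sad(TcUpd, T1);
    return 2 * cost0 < cost1;   // equivalent to cost0 < 0.5 * cost1
}
```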
  • FIG. 6 illustrates optical flow trajectory.
  • Bi-directional optical flow (BIO) is a pixel-wise motion refinement performed on top of block-wise motion compensation in the case of bi-prediction. Since it compensates for the fine motion inside the block, enabling BIO effectively results in enlarging the block size for motion compensation.
  • Sample-level motion refinement does not require exhaustive search or signaling, since there is an explicit equation which gives the fine motion vector for each sample.
  • τ0 and τ1 denote the distances to the reference frames, as shown in FIG. 6.
  • Distances τ0 and τ1 are calculated based on POC for Ref0 and Ref1: τ0 = POC(current) - POC(Ref0) and τ1 = POC(Ref1) - POC(current).
  • the motion vector field (vx, vy) is determined by minimizing the difference Δ between the values at the matched points of the two references.
  • the model uses only the first linear term of the local Taylor expansion for Δ: Δ = (I(0) - I(1)) + vx (τ1 ∂I(1)/∂x + τ0 ∂I(0)/∂x) + vy (τ1 ∂I(1)/∂y + τ0 ∂I(0)/∂y).
  • d is internal bit-depth of the input video.
  • the MV refinement of BIO might be unreliable due to noise or irregular motion. Therefore, in BIO, the magnitude of the MV refinement is clipped to a certain threshold thBIO.
  • the threshold value is determined based on whether all the reference pictures of the current picture are from one direction. If all the reference pictures of the current picture are from one direction, the value of the threshold is set to 12 × 2^(14-d); otherwise, it is set to 12 × 2^(13-d).
  • for the vertical gradient, the gradient filter is applied vertically using BIOfilterG corresponding to the fractional position fracY with de-scaling shift d-8.
  • signal displacement is then performed using BIOfilterS in the horizontal direction corresponding to the fractional position fracX with de-scaling shift 18-d.
  • the length of the interpolation filters for gradient calculation (BIOfilterG) and signal displacement (BIOfilterS) is kept shorter (6-tap) in order to maintain reasonable complexity.
  • Table 1 shows the filters used for gradients calculation for different fractional positions of block motion vector in BIO.
  • Table 2 shows the interpolation filters used for prediction signal generation in BIO.
  • FIG. 7 shows an example of the gradient calculation for an 8x4 block.
  • it is necessary to fetch the motion-compensated predictors and calculate the HOR/VER gradients of all the pixels within the current block, as well as of the outer two lines of pixels, because solving vx and vy for each pixel needs the HOR/VER gradient values and motion-compensated predictors of the pixels within the window Ω centered on that pixel, as shown in equation (4).
  • the size of this window is set to 5x5; it is therefore necessary to fetch the motion-compensated predictors and calculate the gradients for the outer two lines of pixels, as sketched below.
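  • To make the data-access pattern concrete, the following C++ sketch computes per-pixel horizontal and vertical gradients for a W x H block from a prediction buffer padded by two samples on each side. A simple central difference is used purely for illustration; as noted above, JEM's actual gradient filters (BIOfilterG) are 6-tap.

```cpp
#include <vector>

// Simplified gradient computation for BIO on a W x H block. The prediction
// signal `pred` is assumed to be padded by two extra lines/columns on each
// side (buffer size (W + 4) * (H + 4)), matching the note that the
// motion-compensated predictors of the outer two lines must be fetched.
void bioGradients(const std::vector<int>& pred, int W, int H,
                  std::vector<int>& gradX, std::vector<int>& gradY) {
    const int stride = W + 4;                 // 2-sample padding on each side
    gradX.assign(W * H, 0);
    gradY.assign(W * H, 0);
    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) {
            const int p = (y + 2) * stride + (x + 2);  // position in padded buffer
            gradX[y * W + x] = (pred[p + 1] - pred[p - 1]) / 2;
            gradY[y * W + x] = (pred[p + stride] - pred[p - stride]) / 2;
        }
    }
}
```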
  • BIO is applied to all bi-directionally predicted blocks when the two predictions are from different reference pictures; otherwise, BIO is disabled.
  • FIG. 8 illustrates the proposed DMVD based on bilateral template matching.
  • a bilateral template is generated as the weighted combination of the two prediction blocks, from the initial MV0 of list0 and MV1 of list1, respectively, as shown in FIG. 8.
  • the template matching operation consists of calculating cost measures between the generated template and the sample region (around the initial prediction block) in the reference picture. For each of the two reference pictures, the MV that yields the minimum template cost is considered as the updated MV of that list to replace the original one.
  • the two new MVs, i.e., MV0' and MV1' as shown in FIG. 8, are used for regular bi-prediction. As is common in block-matching motion estimation, the sum of absolute differences (SAD) is utilized as the cost measure.
  • SAD sum of absolute differences
  • the proposed decoder-side motion vector derivation is applied for the merge mode of bi-prediction, with one MV from a reference picture in the past and the other from a reference picture in the future, without the transmission of an additional syntax element.
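  • A minimal C++ sketch of the template-generation step: the bilateral template is formed as the equal-weight average of the two initial predictions, and SAD against candidate blocks drives the refinement. The surrounding search loop over candidate positions is omitted.

```cpp
#include <cstdlib>
#include <vector>

// Bilateral template: (rounded) average of the prediction blocks obtained
// with the initial MV0 (list0) and MV1 (list1). Each refined MV is then the
// one minimizing SAD between this template and candidate blocks around the
// initial prediction in the corresponding reference picture.
std::vector<int> makeBilateralTemplate(const std::vector<int>& pred0,
                                       const std::vector<int>& pred1) {
    std::vector<int> tmpl(pred0.size());
    for (size_t i = 0; i < tmpl.size(); ++i)
        tmpl[i] = (pred0[i] + pred1[i] + 1) / 2;   // rounded average
    return tmpl;
}

// SAD cost between the bilateral template and one candidate block.
int templateSad(const std::vector<int>& tmpl, const std::vector<int>& cand) {
    int s = 0;
    for (size_t i = 0; i < tmpl.size(); ++i) s += std::abs(tmpl[i] - cand[i]);
    return s;
}
```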
  • the bilateral template matching may also be described as a hierarchical temporal matching process.
  • the base temporal layer may be viewed as compression of a sequence of consecutive frames without considering a higher-level operation that takes into account a relationship between the sequence of consecutive frames; e.g., a base temporal layer may include an independent frame (I-frame) and prediction frames (P-frames), where the prediction frames are predicted based on an I-frame.
  • a bilateral matching technique may make use of a B-frame, which allows for predicting a frame by taking into account differences between the current frame and both the previous frame and the following frame (e.g., as seen in FIG. 8).
  • a B-frame represents a higher temporal layer.
  • the base temporal layer is a lower temporal layer.
  • FIG. 9 illustrates an example of sub-blocks where OBMC applies.
  • Overlapped Block Motion Compensation (OBMC) has been used in early generations of video standards, e.g., in H.263.
  • OBMC is performed for all motion compensated (MC) block boundaries except the right and bottom boundaries of a CU. Moreover, it is applied to both the luma and chroma components.
  • an MC block corresponds to a coding block.
  • sub-CU mode includes the sub-CU merge, affine, and FRUC modes [3].
  • each sub-block of the CU is a MC block.
  • OBMC is performed at sub-block level for all MC block boundaries, where sub-block size is set equal to 4x4, as illustrated in FIG. 9.
  • FIG. 10 illustrates OBMC weightings.
  • a prediction block based on motion vectors of a neighbouring sub-block is denoted as PN, with N indicating an index for the above, below, left, and right neighbouring sub-blocks, and a prediction block based on motion vectors of the current sub-block is denoted as Pc.
  • when PN is based on the motion information of a neighbouring sub-block that contains the same motion information as the current sub-block, OBMC is not performed from PN. Otherwise, every pixel of PN is added to the same pixel in Pc, i.e., four rows/columns of PN are added to Pc.
  • the weighting factors ⁇ 1/4, 1/8, 1/16, 1/32 ⁇ are used for PN and the weighting factors ⁇ 3/4, 7/8, 15/16, 31/32 ⁇ are used for Pc.
  • the exception is small MC blocks (i.e., when the height or width of the coding block is equal to 4, or a CU is coded with sub-CU mode), for which only two rows/columns of PN are added to Pc.
  • weighting factors ⁇ 1/4, 1/8 ⁇ are used for PN and weighting factors ⁇ 3/4, 7/8 ⁇ are used for Pc.
  • for PN generated based on motion vectors of a vertically (horizontally) neighbouring sub-block, pixels in the same rows (columns) of PN are added to Pc with the same weighting factor, as sketched below. It is noted that BIO is also applied for the derivation of the prediction block PN.
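  • The weighting scheme can be illustrated with the following C++ sketch, which blends the four rows of PN from the above neighbour into Pc using fixed-point weights n/32 (1/4 = 8/32, 1/8 = 4/32, 1/16 = 2/32, 1/32 = 1/32). The column case for left/right neighbours is symmetric; the two-row small-block case uses only the first two weights.

```cpp
#include <vector>

// OBMC row blending for one sub-block, neighbour above. PN rows get weights
// {1/4, 1/8, 1/16, 1/32}; Pc keeps the complementary {3/4, 7/8, 15/16, 31/32}.
void obmcBlendFromAbove(std::vector<int>& Pc, const std::vector<int>& PN,
                        int width /* sub-block width, e.g. 4 */) {
    static const int wN[4] = {8, 4, 2, 1};          // PN weights, in 1/32 units
    for (int row = 0; row < 4; ++row) {
        for (int x = 0; x < width; ++x) {
            const int i = row * width + x;
            Pc[i] = (wN[row] * PN[i] + (32 - wN[row]) * Pc[i] + 16) / 32;
        }
    }
}
```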
  • a CU level flag is signalled to indicate whether OBMC is applied or not for the current CU.
  • OBMC is applied by default.
  • On the encoder side, the prediction signal formed using the motion information of the top neighboring block and the left neighboring block is used to compensate the top and left boundaries of the original signal of the current CU, and then the normal motion estimation process is applied.
  • FRUC template matching provides significant bit-rate reduction as the motion vector can be derived at the decoder side.
  • the coding complexity of the FRUC template matching method is high, especially at the encoder due to motion vector refinement and RD calculations.
  • the number of TM candidates selected for a given temporal layer or slice can be fixed prior to encoding or decoding, or it can be adaptively calculated during the encoding or decoding process.
  • a bitstream may be received by a video decoder.
  • the bitstream may include encoded video data.
  • To reconstruct the video data, a prediction block and residual block may be generated based on a template matching candidate.
  • the video decoder may receive a bitstream that also includes a syntax element representing the number of template matching candidates for the given temporal layer or slice.
  • the reconstructed video data may be used to determine the number of template matching candidates for the given temporal layer or slice.
  • the number of allowed TM candidates for a given temporal layer or slice may be included in a slice header, sequence parameter set (SPS), or picture parameter set (PPS).
  • the inclusion of the number of TM candidates may be signaled.
  • the signaling may be explicit, i.e., the number of allowed TM candidates may be part of the slice header, SPS, or PPS that a video encoder includes in a bitstream output by the video encoder.
  • a slice may be one frame or part of a frame (e.g., a set of rows, or even partial rows, of a frame).
  • the encoder may output a slice header to indicate how the components of the frame or part of the frame are represented in a particular slice. Included in the slice header may be the number of allowed TM candidates. If there are multiple slices in a picture, the PPS may include different parameters for each slice. One of these sets of parameters may include the number of allowed TM candidates per PPS.
  • a sequence of pictures may include sets of parameters that are common or different as part of the encoding or decoding. One of these sets of parameters may include the number of allowed TM candidates per SPS.
  • a syntax element may be included in the bitstream which represents the number of allowed TM candidates per a given slice or temporal layer (i.e., a base layer or higher layer).
  • the number of TM candidates may be larger for a lower temporal layer than for a higher temporal layer. This may occur because there may be less correlation between the content of frames at a lower temporal layer than at a higher temporal layer.
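  • A C++ sketch of the explicit-signaling case. The syntax element name num_allowed_tm_cand_minus1 and its ue(v) coding are assumptions for illustration; the disclosure fixes neither the element name nor its binarization, nor whether it lives in the slice header, SPS, or PPS.

```cpp
#include <cstdint>

// Minimal stand-in for an entropy/bitstream reader; readUe() would decode an
// unsigned Exp-Golomb ue(v) syntax element as in HEVC-style headers.
struct BitReader {
    uint32_t readUe() { /* Exp-Golomb decoding elided in this sketch */ return 0; }
};

// Parse a hypothetical slice-header syntax element carrying the number of
// allowed TM candidates for the slice (or temporal layer).
uint32_t parseNumAllowedTmCandidates(BitReader& br) {
    return br.readUe() + 1;   // at least one TM candidate is allowed
}
```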
  • the signaling may be implicit, and the decoder may derive the number of allowed TM candidates using DMVD.
  • the DMVD may generate motion vectors.
  • FRUC mode is a special merge mode, with which motion information of a block is not signaled but derived at the decoder side.
  • the number of allowed TM candidates may be derived.
  • too many template matching candidates (e.g., 50) may increase the complexity of the encoding or decoding process, as too many comparisons may take place. With too few template matching candidates (e.g., 2), fewer comparisons are needed, but there may not be enough coding efficiency gained.
  • the solution of providing TM candidates prior to encoding or decoding, or during encoding or decoding may also depend on the rate of change of the motion vectors generated based on the content of the video data.
  • a check may be performed to see if there is a fast rate of change of the motion vectors.
  • the mean or variance of a history of motion vectors may be analyzed.
  • if the variances of the horizontal and vertical components are both greater than 10, the number of temporally matched candidates may be set to three or more. If the variances of the horizontal and vertical components are both less than 10, the number of temporal candidates may be set to one less than in the case where both components were greater than 10, i.e., 2, 3, or 4.
  • the number 10 is just an example; depending on the distribution of the motion vectors, a different variance value may be used, e.g., a number between 4 and 9, or 11 and 15, or possibly larger if the density of the pixels in a frame is larger. If the number of selected temporally matched candidates is not adaptively calculated, it may be set to a pre-defined default value, e.g., 10. A sketch of such an adaptive rule follows.
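  • A C++ sketch of adaptive selection from motion-vector statistics. The variance threshold (10), the candidate counts, and the default (10) follow the examples in the text; all are tunable assumptions rather than fixed values.

```cpp
#include <vector>

struct Mv { int x, y; };

// Choose the TM candidate count from the variance of a history of MVs.
int chooseNumTmCandidates(const std::vector<Mv>& history) {
    if (history.empty()) return 10;          // pre-defined default from the text
    double mx = 0, my = 0;
    for (const Mv& v : history) { mx += v.x; my += v.y; }
    mx /= history.size(); my /= history.size();
    double varX = 0, varY = 0;
    for (const Mv& v : history) {
        varX += (v.x - mx) * (v.x - mx);
        varY += (v.y - my) * (v.y - my);
    }
    varX /= history.size(); varY /= history.size();
    if (varX > 10 && varY > 10) return 3;    // fast-changing motion: 3 or more
    if (varX < 10 && varY < 10) return 2;    // stable motion: one fewer
    return 3;                                // mixed case: keep the larger count
}
```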
  • the number of template matching candidates when illumination compensation (IC) is on may be signaled in a slice header, an SPS, or a PPS.
  • Illumination compensation may be used to compensate for non-uniform distributions of illuminations in a frame.
  • when IC is on, it may be more challenging to find coding efficiencies based on motion vectors alone.
  • the number of TM candidates may be signaled (implicitly or explicitly) taking into account the status of the IC block, i.e., whether the IC flag is on or off.
  • the number of TM candidates per temporal layer or slice may be applied only when IC flag is OFF for a given CU. In another example, when the IC flag is ON for a given CU, the number of TM candidates may be fixed irrespective of the temporal layer or slice.
  • a metric used by an IC-enabled frame-rate up-conversion search is based on the number of neighboring samples in the template used to perform the search.
  • an edge-preserving denoising filter, such as a bilateral filter, may be applied to the current, L0, and L1 templates before a search is performed.
  • a mean-removal-based metric, such as Mean-Removed Sum of Absolute Differences (MR-SAD) or Normalized Cross-Correlation (NCC), can be used to find the closest motion vector, as sketched below.
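  • For concreteness, a C++ sketch of MR-SAD between two equally sized templates: each template's own mean is removed before the absolute differences are summed, which makes the metric insensitive to the uniform offsets that illumination compensation models.

```cpp
#include <cstdlib>
#include <vector>

// Mean-Removed SAD between the current template and a candidate template.
int mrSad(const std::vector<int>& cur, const std::vector<int>& ref) {
    const long n = static_cast<long>(cur.size());
    long sumCur = 0, sumRef = 0;
    for (size_t i = 0; i < cur.size(); ++i) { sumCur += cur[i]; sumRef += ref[i]; }
    const int meanCur = static_cast<int>(sumCur / n);
    const int meanRef = static_cast<int>(sumRef / n);
    int s = 0;
    for (size_t i = 0; i < cur.size(); ++i)
        s += std::abs((cur[i] - meanCur) - (ref[i] - meanRef));
    return s;
}
```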
  • the reference samples in a template may be a multiple of the number of rows or columns of a coding unit.
  • the threshold may be 16 or 32 samples in either the horizontal or vertical direction, or both.
  • the regular metric of SAD or sum of squared differences (SSD) may be used.
  • the threshold may be pre-defined or signaled as part of a SPS, a PPS, or a slice header.
  • the maximum number of candidates for FRUC TM is signaled as part of an SPS, a PPS, or a slice header.
  • the maximum number of candidates may refer to the combined number of candidates between IC-enabled and non-illumination-compensated (non-IC) FRUC TM.
  • the maximum number of candidates may be based on a combination of IC-enabled and non-IC-enabled TM candidates.
  • in one example, the maximum number of TM candidates is N, and N-1 TM candidates are designated as IC-enabled while one TM candidate is designated as non-IC-enabled.
  • in another example, the number of either the IC-enabled TM candidates or the non-IC-enabled TM candidates uses a default value, e.g., at least two IC-enabled TM candidates and four non-IC-enabled TM candidates.
  • when the neighboring blocks of the current block are coded as IC blocks, it may be desirable to assign N-1 TM candidates to the IC-enabled case, i.e., when IC is on. The remaining single TM candidate may be assigned to the non-IC case.
  • the number of candidates for either the IC or non-IC case may have a default value, including but not limited to two candidates for the IC-enabled case and four TM candidates for the non-IC case.
  • the ratio of the neighboring 4x4 blocks which are coded with IC flags of 1 may be used as a context to encode the IC flag for FRUC TM.
  • the maximum number of TM candidates may be determined prior to encoding or decoding, or may be calculated during encoding or decoding. For example, as mentioned previously, having too many TM candidates may increase the complexity of coding; thus, setting a maximum number per slice or temporal layer may be desired. Similarly, the maximum number of TM candidates may be set as part of the SPS or PPS. Over a given sequence of frames or slices, it may be desirable to signal a limited number of TM candidates.
  • the number of TM candidates may be signaled as part of the SPS, PPS or slice header.
  • the difference between the maximum number of TM candidates and the actual number of allowed TM candidates for a given slice may be signaled in the slice header. Note that the actual number of TM candidates for a given slice must be less than or equal to the maximum number of TM candidates.
  • whether two candidates are identical or similar may be based on the values of the reference index and motion vector.
  • the following rules proposed in IDF 180270 may be used: (i) the reference indexes of both candidates are the same; (ii) the absolute difference of the horizontal motion vector components is less than mvd_th; and (iii) the absolute difference of the vertical motion vector components is less than mvd_th.
  • the candidate motion vectors are first scaled to a common frame, such as the reference frame with reference index 0 in the current reference list. If the difference between the scaled motion vectors, e.g., in terms of the L1 norm or L2 norm, is below (or no larger than) a threshold, the candidate motion vectors are regarded as the same or similar.
  • the first and third candidates in list L0 are checked for similarity. If they are not similar, the third candidate is added to the list; otherwise it is not added. This process is repeated until N non-similar candidates are selected from list L0. If N non-similar candidates cannot be selected from list L0, default candidates are added to the candidate list.
  • the default candidate may be a motion vector equal to (0,0) with a reference index equal to the first picture in list L0. In another example, a default candidate can be marked unavailable by setting its reference index to -1. The above-described process is also carried out to select N non-similar candidates from list L1.
  • TM is carried out for each of the N candidates, and the best candidate is picked.
  • in one example, mvd_th = 0.
  • any scaled value of mvd_th can be used for the similarity check, i.e., c*mvd_th, where c is a positive integer. A sketch of this similarity-based selection follows.
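  • A C++ sketch of the similarity rule and of selecting N mutually non-similar candidates, appending the zero-MV default when the list runs short. The <= comparison is an assumption so that mvd_th = 0 degenerates to an exact-identity check; the zero-MV default follows the example above.

```cpp
#include <cstdlib>
#include <vector>

struct Cand { int refIdx; int mvx; int mvy; };

// Two candidates are treated as similar when their reference indexes match
// and both MV component differences are within mvd_th (or c * mvd_th).
bool similar(const Cand& a, const Cand& b, int mvdTh) {
    return a.refIdx == b.refIdx &&
           std::abs(a.mvx - b.mvx) <= mvdTh &&
           std::abs(a.mvy - b.mvy) <= mvdTh;
}

// Select up to N mutually non-similar candidates from one list (L0 or L1),
// padding with default candidates (zero MV, reference index 0) if needed.
std::vector<Cand> selectNonSimilar(const std::vector<Cand>& list,
                                   size_t N, int mvdTh) {
    std::vector<Cand> out;
    for (const Cand& c : list) {
        bool dup = false;
        for (const Cand& s : out) if (similar(c, s, mvdTh)) { dup = true; break; }
        if (!dup) out.push_back(c);
        if (out.size() == N) return out;
    }
    while (out.size() < N) out.push_back({0, 0, 0});  // default candidate
    return out;
}
```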
  • bi-directional prediction could be skipped. In one example, only uni-directional prediction is allowed for a block with the IC flag true. In another example, after uni-directional prediction for L0 and L1, MV0 and MV1 are derived. The bi-directional motion search is then skipped by using MV0 and MV1 directly to get the bi-directionally predicted template. The cost is calculated based on this predicted template, T0, and T1 for the uni-/bi-directional decision.
  • cost0, cost1, and costBi are calculated as follows.
  • cost0 refers to the distortion between the current template Tc and the matched template T0.
  • cost1 refers to the distortion between the current template Tc and the matched template T1.
  • the matched template T1 is identified using the current template and the motion information in list L1, without refinement steps.
  • costBi refers to the bi-prediction cost, which is calculated as the distortion between the current template Tc and the average of the matched templates, (T0+T1)/2.
  • costBi <= factor*min(cost0, cost1) is one example of a criterion to decide between uni-directional and bi-directional prediction.
  • the disclosed method can be applied even if other methods are used to decide between the two.
  • joint/bi-directional template matching is used for bi-prediction.
  • the disclosed method can be applied even if bi-directional template matching is not used for bi-prediction; e.g., two uni-predictions can be performed first and the results combined/averaged for bi-prediction. A sketch of the cost-based decision follows.
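  • A C++ sketch of the three-cost decision, assuming SAD as the distortion measure and a rounded average for the bi-directionally predicted template. The factor parameter is tunable; its value is not fixed by the disclosure.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

enum class Pred { Uni0, Uni1, Bi };

// Sum of absolute differences over equally sized sample vectors.
static int sad(const std::vector<int>& a, const std::vector<int>& b) {
    int s = 0;
    for (size_t i = 0; i < a.size(); ++i) s += std::abs(a[i] - b[i]);
    return s;
}

// cost0 = distortion(Tc, T0); cost1 = distortion(Tc, T1);
// costBi = distortion(Tc, (T0 + T1) / 2). Bi-prediction is chosen when
// costBi <= factor * min(cost0, cost1); otherwise the cheaper uni direction.
Pred decidePrediction(const std::vector<int>& Tc, const std::vector<int>& T0,
                      const std::vector<int>& T1, double factor) {
    std::vector<int> avg(Tc.size());
    for (size_t i = 0; i < Tc.size(); ++i) avg[i] = (T0[i] + T1[i] + 1) / 2;
    const int cost0 = sad(Tc, T0);
    const int cost1 = sad(Tc, T1);
    const int costBi = sad(Tc, avg);
    if (costBi <= factor * std::min(cost0, cost1)) return Pred::Bi;
    return cost0 <= cost1 ? Pred::Uni0 : Pred::Uni1;
}
```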
  • FIG. 12 is a block diagram illustrating an example video encoder 20 that may implement the techniques described in this disclosure.
  • Video encoder 20 may perform intra- and inter-coding of video blocks within video slices.
  • Intra-coding relies on spatial prediction to reduce or remove spatial redundancy in video within a given video frame or picture.
  • Inter-coding relies on temporal prediction to reduce or remove temporal redundancy in video within adjacent frames or pictures of a video sequence.
  • Intra-mode may refer to any of several spatial based compression modes.
  • Inter-modes such as uni-directional prediction (P mode) or bi-prediction (B mode), may refer to any of several temporal-based compression modes.
  • video encoder 20 includes video data memory 33, partitioning unit 35, prediction processing unit 41, summer 50, transform processing unit 52, quantization unit 54, and entropy encoding unit 56.
  • Prediction processing unit 41 includes motion estimation unit (MEU) 42, motion compensation unit (MCU) 44, and intra prediction unit 46.
  • MEU motion estimation unit
  • MCU motion compensation unit
  • video encoder 20 also includes inverse quantization unit 58, inverse transform processing unit 60, summer 62, filter unit 64, and decoded picture buffer (DPB) 66.
  • DPB decoded picture buffer
  • video encoder 20 receives video data from a camera and stores the received video data, along with metadata (e.g., sequence parameter set (SPS) or picture parameter set (PPS) data), in video data memory 33.
  • Video data memory 33 may store video data to be encoded by the components of video encoder 20.
  • the video data stored in video data memory 33 may be obtained, for example, from video source 18.
  • DPB 66 may be a reference picture memory that stores reference video data for use in encoding video data by video encoder 20, e.g., in intra- or inter-coding modes.
  • Video data memory 33 and DPB 66 may be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. Video data memory 33 and DPB 66 may be provided by the same memory device or separate memory devices. In various examples, video data memory 33 may be on-chip with other components of video encoder 20, or off-chip relative to those components.
  • DRAM dynamic random access memory
  • SDRAM synchronous DRAM
  • MRAM magnetoresistive RAM
  • RRAM resistive RAM
  • Partitioning unit 35 retrieves the video data from video data memory 33 and partitions the video data into video blocks. This partitioning may also include partitioning into slices, tiles, or other larger units, as well as video block partitioning, e.g., according to a quadtree structure of LCUs and CUs. For example, in a different embodiment, partitioning unit 35 may generate the sequence parameter set (SPS) and/or picture parameter set (PPS).
  • Video encoder 20 generally illustrates the components that encode video blocks within a video slice to be encoded. The slice may be divided into multiple video blocks (and possibly into sets of video blocks referred to as tiles).
  • Prediction processing unit 41 may select one of a plurality of possible coding modes, such as one of a plurality of intra coding modes or one of a plurality of inter coding modes, for the current video block based on error results (e.g., coding rate and the level of distortion). Prediction processing unit 41 may provide the resulting intra- or inter- coded block to summer 50 to generate residual block data and to summer 62 to reconstruct the encoded block for use as a reference picture.
  • error results e.g., coding rate and the level of distortion
  • the prediction processing unit 41 may be part of a processor which may be configured to generate a first prediction block for a block of a picture according to an intra-prediction mode, and generate a second prediction block for the block of the picture according to an inter-prediction mode. After the first and second prediction blocks are generated, the prediction processing unit 41 may be configured to propagate motion information to the first prediction block based upon motion information from the second prediction block and generate a final prediction block for the block of the picture based on a combination of the first and second prediction blocks.
  • the first prediction block is used in the construction of a candidate list.
  • the candidate list may be a merging candidate list, or alternatively the candidate list may be an AMVP list.
  • the first prediction block and the second prediction block are neighboring blocks.
  • the first prediction block and the second prediction block are spatially neighboring blocks.
  • the first prediction block and the second prediction block are temporally neighboring blocks.
  • the neighboring blocks are within the same slice, tile, LCU, row, or picture.
  • the neighboring blocks are located in one or more previously coded frames.
  • the first prediction block inherits motion information from the second prediction block, and the relative position of the second prediction block with respect to the first prediction block is pre-defined.
  • the second prediction block is selected from a plurality of neighboring blocks according to a predetermined rule.
  • Intra prediction unit 46 within prediction processing unit 41 may perform intra- predictive coding of the current video block relative to one or more neighboring blocks in the same frame or slice as the current block to be coded to provide spatial compression.
  • DMVD decoder-side motion vector derivation
  • Motion estimation unit 42A and DMVD 42B may be configured to determine the inter-prediction mode for a video slice according to a predetermined pattern for a video sequence.
  • the predetermined pattern may designate video slices in the sequence as P slices or B slices.
  • Motion estimation unit 42A and DMVD 42B and motion compensation unit 44 may be highly integrated, but are illustrated separately for conceptual purposes.
  • Motion estimation, performed by motion estimation unit 42A is the process of generating motion vectors, which estimate motion for video blocks.
  • a motion vector for example, may indicate the displacement of a PU of a video block within a current video frame or picture relative to a predictive block within a reference picture.
  • motion estimation unit 42A may generate motion vectors.
  • the decision on which FRUC merge mode to use for a CU is based on rate-distortion (RD) cost selection, as done for normal merge candidates. That is, the two matching modes (bilateral matching and template matching) are both checked for a CU using RD cost selection. The one leading to the minimal cost is further compared to the other CU modes. If a FRUC mode is the most efficient one, the FRUC flag is set to true for the CU and the related matching mode is used.
  • RD rate distortion
  • LIC Local Illumination Compensation
  • CU inter-mode coded coding unit
  • a least-squares error method is employed to derive the parameters a and b by using the neighbouring samples of the current CU and their corresponding reference samples.
  • when a CU is coded with merge mode, the LIC flag is copied from neighbouring blocks, in a way similar to motion information copying in merge mode; otherwise, an LIC flag is signalled for the CU to indicate whether LIC applies or not.
  • the motion estimation unit 42A outputs a motion vector.
  • the DMVD 42B outputs the motion vector.
  • the motion compensation unit 44 may decide which motion vector is better to use based on comparing the output of the motion estimation unit 42A and the decoder side motion vector derivation 42B.
  • a predictive block is a block that is found to closely match the PU of the video block to be coded in terms of pixel difference, which may be determined by sum of absolute difference (SAD), sum of square difference (SSD), or other difference metrics.
  • video encoder 20 may calculate values for sub-integer pixel positions of reference pictures stored in DPB 66. For example, video encoder 20 may interpolate values of one-quarter pixel positions, one-eighth pixel positions, or other fractional pixel positions of the reference picture. Therefore, motion estimation unit 42A may perform a motion search relative to the full pixel positions and fractional pixel positions and output a motion vector with fractional pixel precision.
  • Motion estimation unit 42A calculates a motion vector for a PU of a video block in an inter-coded slice by comparing the position of the PU to the position of a predictive block of a reference picture.
  • the reference picture may be selected from a first reference picture list (List 0) or a second reference picture list (List 1), each of which identify one or more reference pictures stored in DPB 66.
  • Motion estimation unit 42A sends the calculated motion vector to entropy encoding unit 56 and motion compensation unit 44.
  • Motion compensation performed by motion compensation unit 44, may involve fetching or generating the predictive block based on the motion vector determined by motion estimation, possibly performing interpolations to sub-pixel precision.
  • motion compensation unit 44 may locate the predictive block to which the motion vector points in one of the reference picture lists.
  • Video encoder 20 forms a residual video block by subtracting pixel values of the predictive block from the pixel values of the current video block being coded, forming pixel difference values.
  • the pixel difference values form residual data for the block, and may include both luma and chroma difference components.
  • Summer 50 represents the component or components that perform this subtraction operation.
  • Motion compensation unit 44 may also generate syntax elements associated with the video blocks and the video slice for use by video decoder 30 in decoding the video blocks of the video slice.
  • video encoder 20 forms a residual video block by subtracting the predictive block from the current video block.
  • the residual video data in the residual block may be included in one or more TUs and applied to transform processing unit 52.
  • Transform processing unit 52 transforms the residual video data into residual transform coefficients using a transform, such as a discrete cosine transform (DCT) or a conceptually similar transform.
  • Transform processing unit 52 may convert the residual video data from a pixel domain to a transform domain, such as a frequency domain.
  • DCT discrete cosine transform
  • Transform processing unit 52 may send the resulting transform coefficients to quantization unit 54.
  • Quantization unit 54 quantizes the transform coefficients to further reduce bit rate. The quantization process may reduce the bit depth associated with some or all of the coefficients. The degree of quantization may be modified by adjusting a quantization parameter.
  • quantization unit 54 may then perform a scan of the matrix including the quantized transform coefficients.
  • entropy encoding unit 56 may perform the scan.
  • entropy encoding unit 56 entropy encodes the quantized transform coefficients.
  • entropy encoding unit 56 may perform context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), syntax-based context-adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding or another entropy encoding methodology or technique.
  • CAVLC context adaptive variable length coding
  • CABAC context adaptive binary arithmetic coding
  • SBAC syntax-based context-adaptive binary arithmetic coding
  • PIPE probability interval partitioning entropy
  • the encoded bitstream may be transmitted to video decoder 30, or archived for later transmission or retrieval by video decoder 30.
  • Entropy encoding unit 56 may also entropy encode the motion vectors and the other syntax elements for the current video slice being coded.
  • Inverse quantization unit 58 and inverse transform processing unit 60 apply inverse quantization and inverse transformation, respectively, to reconstruct the residual block in the pixel domain for later use as a reference block of a reference picture.
  • Motion compensation unit 44 may calculate a reference block by adding the residual block to a predictive block of one of the reference pictures within one of the reference picture lists. Motion compensation unit 44 may also apply one or more interpolation filters to the reconstructed residual block to calculate sub-integer pixel values for use in motion estimation.
  • Summer 62 adds the reconstructed residual block to the motion compensated prediction block produced by motion compensation unit 44 to produce a reconstructed block.
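Continuing the hypothetical sketches above (reusing `idct2`, `quantize`, `coeffs`, and `prediction` from them), the reconstruction path of inverse quantization, inverse transform, and summation might look like:

```python
import numpy as np

def reconstruct(quantized: np.ndarray, qstep: float, prediction: np.ndarray) -> np.ndarray:
    """Dequantize, inverse-transform, add the prediction block, and clip
    to the 8-bit range, mirroring the summer-62 step described above."""
    dequant = quantized.astype(np.float64) * qstep
    recon_residual = idct2(dequant)          # idct2 from the transform sketch
    return np.clip(np.rint(recon_residual) + prediction, 0, 255).astype(np.uint8)

recon = reconstruct(quantize(coeffs, 16.0), 16.0, prediction)
```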
  • Filter unit 64 filters the reconstructed block (e.g., the output of summer 62) and stores the filtered reconstructed block in DPB 66 for use as a reference block.
  • the reference block may be used by motion estimation unit 42 and motion compensation unit 44 as a reference block to inter-predict a block in a subsequent video frame or picture.
  • Filter unit 64 may perform any type of filtering, such as deblock filtering, SAO filtering, ALF, GALF, and/or other types of loop filters.
  • a deblock filter may, for example, apply deblocking filtering to filter block boundaries to remove blockiness artifacts from reconstructed video.
  • An SAO filter may apply offsets to reconstructed pixel values in order to improve overall coding quality. Additional loop filters (in loop or post loop) may also be used.
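As one concrete illustration of the SAO idea, a band-offset sketch is shown below; the 32-band classification corresponds to 8-bit samples, but the offset values are invented for the example, whereas a real encoder would derive and signal them:

```python
import numpy as np

def sao_band_offset(block: np.ndarray, offsets: dict) -> np.ndarray:
    """Classify 8-bit samples into 32 intensity bands (8 values per band)
    and add a signed offset to samples falling in the selected bands."""
    out = block.astype(np.int16)
    bands = block.astype(np.int16) >> 3   # band index 0..31
    for band, offset in offsets.items():
        out[bands == band] += offset
    return np.clip(out, 0, 255).astype(np.uint8)

# Example: brighten band 10 slightly and darken band 11.
filtered = sao_band_offset(np.full((8, 8), 85, dtype=np.uint8), {10: 2, 11: -1})
```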
  • FIG. 13 is a block diagram illustrating an example video decoder 30 that may implement the techniques described in this disclosure.
  • Video decoder 30 of FIG. 13 may, for example, be configured to receive the signaling described above with respect to video encoder 20 of FIG. 12.
  • video decoder 30 includes video data memory 78, entropy decoding unit 80, prediction processing unit 81, inverse quantization unit 86, inverse transform processing unit 88, summer 90, and DPB 94.
  • Prediction processing unit 81 includes motion compensation unit 82 and intra prediction unit 84.
  • Video decoder 30 may, in some examples, perform a decoding pass generally reciprocal to the encoding pass described with respect to video encoder 20.
  • video decoder 30 receives an encoded video bitstream that represents video blocks of an encoded video slice and associated syntax elements from video encoder 20.
  • Video decoder 30 stores the received encoded video bitstream in video data memory 78.
  • Video data memory 78 may store video data, such as an encoded video bitstream, to be decoded by the components of video decoder 30.
  • the video data stored in video data memory 78 may be obtained, for example, via link 16, from storage device 26, or from a local video source, such as a camera, or by accessing physical data storage media.
  • Video data memory 78 may form a coded picture buffer (CPB) that stores encoded video data from an encoded video bitstream.
  • DPB 94 may be a reference picture memory that stores reference video data for use in decoding video data by video decoder 30, e.g., in intra- or inter-coding modes.
  • Video data memory 78 and DPB 94 may be formed by any of a variety of memory devices, such as DRAM, SDRAM, MRAM, RRAM, or other types of memory devices.
  • Video data memory 78 and DPB 94 may be provided by the same memory device or separate memory devices.
  • video data memory 78 may be on-chip with other components of video decoder 30, or off-chip relative to those components.
  • Entropy decoding unit 80 of video decoder 30 entropy decodes the video data stored in video data memory 78 to generate quantized coefficients, motion vectors, and other syntax elements. Entropy decoding unit 80 forwards the motion vectors and other syntax elements to prediction processing unit 81. Video decoder 30 may receive the syntax elements at the video slice level and/or the video block level.
  • intra prediction unit 84 of prediction processing unit 81 may generate prediction data for a video block of the current video slice based on a signaled intra prediction mode and data from previously decoded blocks of the current frame or picture.
  • motion compensation unit 82 of prediction processing unit 81 produces final generated predictive blocks for a video block of the current video slice based on the motion vectors and other syntax elements received from entropy decoding unit 80.
  • the final generated predictive blocks may be produced from one of the reference pictures within one of the reference picture lists.
  • the prediction processing unit 81 may be part of a processor which may be configured to reconstruct a first prediction block for a block of a picture according to an intra-prediction mode, and reconstruct a second prediction block for the block of the picture according to an inter-prediction mode.
  • the prediction processing unit 81 may be configured to propagate motion information to the first generated prediction block based upon motion information from the second generated prediction block, and to generate a final generated prediction block for the block of the picture based on a combination of the first and second prediction blocks.
  • the first generated prediction block is used in the construction of a candidate list.
  • the candidate list may be a merging candidate list, or alternatively the candidate list may be an AMVP list.
  • the first prediction block and the second prediction block are neighboring blocks.
  • the first prediction block and the second prediction block are spatially neighboring blocks.
  • the first prediction block and the second prediction block are temporally neighboring blocks.
  • the neighboring blocks are within the same slice, tile, LCU, LCU row, or picture.
  • the neighboring blocks are located in one or more previously coded frames.
  • the first prediction block inherits motion information from the second prediction block, and the relative position of the second prediction block with respect to the first prediction block is pre-defined.
  • the second prediction block is selected from a plurality of neighboring blocks according to a predetermined rule.
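One plausible reading of such a predetermined rule is a fixed priority scan over neighboring blocks, sketched below; the specific neighbor order is a hypothetical choice, not one stated in the disclosure:

```python
# Hypothetical fixed priority order over spatial neighbors.
NEIGHBOR_PRIORITY = ("left", "above", "above_right", "below_left", "above_left")

def propagate_motion_info(neighbor_mvs: dict):
    """Inherit the motion vector of the first available neighboring block
    in the fixed priority order; fall back to a zero MV otherwise."""
    for name in NEIGHBOR_PRIORITY:
        mv = neighbor_mvs.get(name)
        if mv is not None:
            return mv
    return (0, 0)

assert propagate_motion_info({"above": (3, -1), "above_right": (2, 0)}) == (3, -1)
```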
  • Video decoder 30 may construct the reference frame lists, List 0 and List 1, using default construction techniques based on reference pictures stored in DPB 94.
  • Inverse quantization unit 86 inverse quantizes, i.e., de-quantizes, the quantized transform coefficients provided in the bitstream and decoded by entropy decoding unit 80. The inverse quantization process may include use of a quantization parameter calculated by video encoder 20 for each video block in the video slice to determine a degree of quantization and, likewise, a degree of inverse quantization that should be applied.
  • Inverse transform processing unit 88 applies an inverse transform, e.g., an inverse DCT, an inverse integer transform, or a conceptually similar inverse transform process, to the transform coefficients in order to produce residual blocks in the pixel domain.
  • After the prediction processing unit 81 generates the predictive block for the current video block using, for example, intra or inter prediction, video decoder 30 forms a reconstructed video block by summing the residual blocks from inverse transform processing unit 88 with the corresponding predictive blocks generated by motion compensation unit 82.
  • Summer 90 represents the component or components that perform this summation operation.
  • the decoder-side motion vector derivation (DMVD) unit 83 may use template matching techniques in a FRUC mode to select a number of template matching (TM) candidates based on a temporal layer or slice, wherein the number of TM candidates selected for a given temporal layer or slice can be fixed prior to the video decoding or adaptively calculated on the fly (i.e., in real time) during the video decoding. The number of allowed TM candidates for a given temporal layer or slice may also be explicitly signaled in the slice header, the SPS, or the PPS.
  • FRUC mode is a special merge mode, with which motion information of a block is not signaled but derived at the decoder side.
  • a FRUC flag is signaled for a CU when its merge flag is true.
  • when the FRUC flag is false, a merge index is signaled and the regular merge mode is used.
  • when the FRUC flag is true, an additional FRUC mode flag is signaled to indicate which method (bilateral matching or template matching) is to be used to derive motion information for the block.
  • an initial motion vector is first derived for the whole CU based on bilateral matching or template matching.
  • the merge list of the CU is checked and the candidate which leads to the minimum matching cost is selected as the starting point. Then, a local search based on bilateral matching or template matching around the starting point is performed, and the MV that results in the minimum matching cost is taken as the MV for the whole CU. Subsequently, the motion information is further refined at the sub-block level with the derived CU motion vectors as the starting points.
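A minimal integer-pel sketch of this template matching search is given below; `cur` and `ref` are assumed to be 2-D uint8 frames. The one-sample-thick top and left templates, the unchecked array bounds, and the three rounds of ±1-pel refinement are all simplifications — real implementations use thicker templates, sub-pel interpolation, and proper border handling:

```python
import numpy as np

def template_sad(cur, ref, x, y, size, mv):
    """SAD between the current block's template (row above + column left)
    and the same template in the reference frame displaced by mv."""
    dx, dy = mv
    top = np.abs(cur[y - 1, x:x + size].astype(np.int32)
                 - ref[y - 1 + dy, x + dx:x + dx + size].astype(np.int32)).sum()
    left = np.abs(cur[y:y + size, x - 1].astype(np.int32)
                  - ref[y + dy:y + dy + size, x - 1 + dx].astype(np.int32)).sum()
    return int(top + left)

def fruc_template_match(cur, ref, x, y, size, candidates):
    """Pick the candidate MV with minimum template cost, then refine it
    with a few rounds of +/-1 pel greedy local search."""
    cost = lambda mv: template_sad(cur, ref, x, y, size, mv)
    best = min(candidates, key=cost)
    for _ in range(3):
        best = min(((best[0] + dx, best[1] + dy)
                    for dx in (-1, 0, 1) for dy in (-1, 0, 1)), key=cost)
    return best
```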
  • the DMVD 83 may adaptively calculate the number of template matching candidates based on a history of motion vectors from a previous frame. In other instances, the DMVD may use a default number of template matching candidates.
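The sketch below illustrates how such a candidate count might be chosen; both the per-temporal-layer table and the motion-vector-spread heuristic are invented placeholders, since the text above only states that the count is either fixed beforehand or computed on the fly:

```python
import numpy as np

# Hypothetical fixed table: fewer candidates at higher temporal layers.
FIXED_TM_CANDIDATES = {0: 8, 1: 6, 2: 4, 3: 2}

def num_tm_candidates(temporal_layer, prev_frame_mvs=None, adaptive=False, default=4):
    """Number of TM candidates for a slice: table lookup in fixed mode,
    else derived from the spread of the previous frame's motion vectors
    (more diverse motion -> more candidates), clamped to [2, 8]."""
    if not adaptive:
        return FIXED_TM_CANDIDATES.get(temporal_layer, default)
    if not prev_frame_mvs:
        return default
    spread = float(np.std(np.asarray(prev_frame_mvs, dtype=np.float64)))
    return int(np.clip(2 + round(spread), 2, 8))

print(num_tm_candidates(2))                                    # fixed mode -> 4
print(num_tm_candidates(2, [(0, 1), (5, -4)], adaptive=True))  # adaptive, from MV spread
```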
  • Filter unit 92 filters the reconstructed block (e.g., the output of summer 90) and stores the filtered reconstructed block in DPB 94 for use as a reference block.
  • the reference block may be used by motion compensation unit 82 as a reference block to inter-predict a block in a subsequent video frame or picture.
  • Filter unit 92 may perform any type of filtering, such as deblock filtering, SAO filtering, ALF, GALF, and/or other types of loop filters.
  • a deblock filter may, for example, apply deblocking filtering to filter block boundaries to remove blockiness artifacts from reconstructed video.
  • An SAO filter may apply offsets to reconstructed pixel values in order to improve overall coding quality. Additional loop filters (in loop or post loop) may also be used.
  • the techniques described herein may be implemented in hardware, software, firmware, or any combination thereof. Such techniques may be implemented in any of a variety of devices such as general purpose computers, wireless communication device handsets, or integrated circuit devices having multiple uses, including application in wireless communication device handsets and other devices. Any features described as modules or components may be implemented together in an integrated logic device or separately as discrete but interoperable logic devices. If implemented in software, the techniques may be realized at least in part by a computer-readable data storage medium comprising program code including instructions that, when executed, perform one or more of the methods described above. The computer-readable data storage medium may form part of a computer program product, which may include packaging materials.
  • the computer-readable medium may comprise memory or data storage media, such as random access memory (RAM) such as synchronous dynamic random access memory (SDRAM), read-only memory (ROM), non-volatile random access memory (NVRAM), electrically erasable programmable read-only memory (EEPROM), FLASH memory, magnetic or optical data storage media, and the like.
  • the techniques additionally, or alternatively, may be realized at least in part by a computer-readable communication medium that carries or communicates program code in the form of instructions or data structures and that can be accessed, read, and/or executed by a computer, such as propagated signals or waves.
  • the program code or instructions may be executed by a processor, which may include one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry.
  • a general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine.
  • a processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. Accordingly, the term "processor," as used herein, may refer to any of the foregoing structure, any combination of the foregoing structure, or any other structure or apparatus suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated software modules or hardware modules configured for encoding and decoding, or incorporated in a combined video encoder-decoder (CODEC).
  • a system includes a source device that provides encoded video data to be decoded at a later time by a destination device.
  • the source device provides the video data to destination device via a computer-readable medium.
  • the source device and the destination device may comprise any of a wide range of devices, including desktop computers, notebook (i.e., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, so-called "smart" pads, televisions, cameras, display devices, digital media players, video gaming consoles, video streaming devices, or the like.
  • the source device and the destination device may be equipped for wireless communication.
  • the destination device may receive the encoded video data to be decoded via the computer-readable medium.
  • the computer-readable medium may comprise any type of medium or device capable of moving the encoded video data from source device to destination device.
  • computer-readable medium may comprise a communication medium to enable source device to transmit encoded video data directly to destination device in real-time.
  • the encoded video data may be modulated according to a communication standard, such as a wireless communication protocol, and transmitted to destination device.
  • the communication medium may comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
  • the communication medium may form part of a packet-based network, such as a local area network, a wide-area network, or a global network such as the Internet.
  • the communication medium may include routers, switches, base stations, or any other equipment that may be useful to facilitate communication from source device to destination device.
  • encoded data may be output from output interface to a storage device.
  • encoded data may be accessed from the storage device by input interface.
  • the storage device may include any of a variety of distributed or locally accessed data storage media such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non- volatile memory, or any other suitable digital storage media for storing encoded video data.
  • the storage device may correspond to a file server or another intermediate storage device that may store the encoded video generated by source device. Destination device may access stored video data from the storage device via streaming or download.
  • the file server may be any type of server capable of storing encoded video data and transmitting that encoded video data to the destination device.
  • Example file servers include a web server (e.g., for a website), an FTP server, network attached storage (NAS) devices, or a local disk drive.
  • Destination device may access the encoded video data through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., DSL, cable modem, etc.), or a combination of both that is suitable for accessing encoded video data stored on a file server.
  • the transmission of encoded video data from the storage device may be a streaming transmission, a download transmission, or a combination thereof.
  • the techniques of this disclosure are not necessarily limited to wireless applications or settings.
  • the techniques may be applied to video coding in support of any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, Internet streaming video transmissions, such as dynamic adaptive streaming over HTTP (DASH), digital video that is encoded onto a data storage medium, decoding of digital video stored on a data storage medium, or other applications.
  • system may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
  • the source device includes a video source, a video encoder, and an output interface.
  • the destination device may include an input interface, a video decoder, and a display device.
  • the video encoder of source device may be configured to apply the techniques disclosed herein.
  • a source device and a destination device may include other components or arrangements.
  • the source device may receive video data from an external video source, such as an external camera.
  • the destination device may interface with an external display device, rather than including an integrated display device.
  • the example system above is merely one example.
  • Techniques for processing video data in parallel may be performed by any digital video encoding and/or decoding device.
  • while the techniques of this disclosure are generally performed by a video encoding device, the techniques may also be performed by a video encoder/decoder, typically referred to as a "CODEC."
  • the techniques of this disclosure may also be performed by a video preprocessor.
  • Source device and destination device are merely examples of such coding devices in which source device generates coded video data for transmission to destination device.
  • the source and destination devices may operate in a substantially symmetrical manner such that each of the devices includes video encoding and decoding components.
  • example systems may support one-way or two-way video transmission between video devices, e.g., for video streaming, video playback, video broadcasting, or video telephony.
  • the video source may include a video capture device, such as a video camera, a video archive containing previously captured video, and/or a video feed interface to receive video from a video content provider.
  • the video source may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer generated video.
  • source device and destination device may form so-called camera phones or video phones.
  • the techniques described in this disclosure may be applicable to video coding in general, and may be applied to wireless and/or wired applications.
  • the captured, pre-captured, or computer-generated video may be encoded by the video encoder.
  • the encoded video information may then be output by output interface onto the computer-readable medium.
  • the techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set).
  • Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.
  • an ordinal term (e.g., "first," "second," "third," etc.) used to modify an element (such as a structure, a component, an operation, etc.) does not by itself indicate any priority or order of the element with respect to another element, but merely distinguishes the element from another element having the same name.
  • the term “set” refers to a grouping of one or more elements
  • the term “plurality” refers to multiple elements.
  • Coupled may include “communicatively coupled,” “electrically coupled,” or “physically coupled,” and may also (or alternatively) include any combinations thereof.
  • Two devices (or components) may be coupled (e.g., communicatively coupled, electrically coupled, or physically coupled) directly or indirectly via one or more other devices, components, wires, buses, networks (e.g., a wired network, a wireless network, or a combination thereof), etc.
  • Two devices (or components) that are electrically coupled may be included in the same device or in different devices and may be connected via electronics, one or more connectors, or inductive coupling, as illustrative, non-limiting examples.
  • two devices may send and receive electrical signals (digital signals or analog signals) directly or indirectly, such as via one or more wires, buses, networks, etc.
  • directly coupled may include two devices that are coupled (e.g., communicatively coupled, electrically coupled, or physically coupled) without intervening components.
  • integrated may include “manufactured or sold devices.”
  • a device may be integrated if a user buys a package that bundles or includes the device as part of the package.
  • two devices may be coupled, but not necessarily integrated (e.g., different peripheral devices may not be integrated to a command device, but still may be “coupled”).
  • Another example is that any of the transceivers or antennas described herein may be "coupled" to a processor, but not necessarily be part of the package that includes a video device.
  • Other examples may be inferred from the context disclosed herein, including this paragraph, when using the term "integrated".
  • a wireless connection between devices may be based on various wireless technologies, such as Bluetooth, Wireless-Fidelity (Wi-Fi), or variants of Wi-Fi (e.g., Wi-Fi Direct).
  • Devices may be "wirelessly connected" based on different cellular communication systems, such as a Long-Term Evolution (LTE) system, a Code Division Multiple Access (CDMA) system, a Global System for Mobile Communications (GSM) system, a wireless local area network (WLAN) system, or some other wireless system.
  • a CDMA system may implement Wideband CDMA (WCDMA), CDMA 1X, Evolution-Data Optimized (EVDO), Time Division Synchronous CDMA (TD-SCDMA), or some other version of CDMA.
  • a "wireless connection” may also be based on other wireless technologies, such as ultrasound, infrared, pulse radio frequency electromagnetic energy, structured light, or directional of arrival techniques used in signal processing (e.g. audio signal processing or radio frequency processing).
  • A "and/or" B may mean that either "A and B," or "A or B," or both "A and B" and "A or B" are applicable or acceptable.
  • a unit can include, for example, a special purpose hardwired circuitry, software and/or firmware in conjunction with programmable circuitry, or a combination thereof.
  • computing device is used generically herein to refer to any one or all of servers, personal computers, laptop computers, tablet computers, mobile devices, cellular telephones, smartbooks, ultrabooks, palm-top computers, personal data assistants (PDA's), wireless electronic mail receivers, multimedia Internet-enabled cellular telephones, Global Positioning System (GPS) receivers, wireless gaming controllers, and similar electronic devices which include a programmable processor and circuitry for wirelessly sending and/or receiving information.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video decoding device may include a memory configured to store video data and a processor configured to receive a bitstream comprising encoded video data. The processor may be configured to select a number of template matching (TM) candidates for a temporal layer or slice during video decoding. The number of selected TM candidates is fixed before video decoding or adaptively calculated during video decoding. The processor may be configured to generate a prediction block and a residual block, based on a template matching candidate, to reconstruct the video data.
PCT/US2018/055933 2017-10-16 2018-10-15 Améliorations diverses apportées à la mise en correspondance de modèles de fruc WO2019079206A2 (fr)

Priority Applications (6)

Application Number Priority Date Filing Date Title
AU2018350913A AU2018350913A1 (en) 2017-10-16 2018-10-15 Various improvements to FRUC template matching
CN201880065805.0A CN111201794B (zh) 2017-10-16 2018-10-15 对帧速率上转换模板匹配的各种改进
BR112020007329-6A BR112020007329A2 (pt) 2017-10-16 2018-10-15 diversos aprimoramentos para correspondência de modelo de fruc
KR1020207010186A KR20200069303A (ko) 2017-10-16 2018-10-15 Fruc 템플릿 매칭에 대한 다양한 개선들
SG11202001988QA SG11202001988QA (en) 2017-10-16 2018-10-15 Various improvements to fruc template matching
EP18796349.1A EP3698545A2 (fr) 2017-10-16 2018-10-15 Améliorations diverses apportées à la mise en correspondance de modèles de fruc

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201762573115P 2017-10-16 2017-10-16
US62/573,115 2017-10-16
US16/159,458 US10986360B2 (en) 2017-10-16 2018-10-12 Various improvements to FRUC template matching
US16/159,458 2018-10-12

Publications (2)

Publication Number Publication Date
WO2019079206A2 true WO2019079206A2 (fr) 2019-04-25
WO2019079206A3 WO2019079206A3 (fr) 2019-05-31

Family

ID=66171331

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/055933 WO2019079206A2 (fr) 2017-10-16 2018-10-15 Améliorations diverses apportées à la mise en correspondance de modèles de fruc

Country Status (9)

Country Link
US (1) US10986360B2 (fr)
EP (1) EP3698545A2 (fr)
KR (1) KR20200069303A (fr)
CN (1) CN111201794B (fr)
AU (1) AU2018350913A1 (fr)
BR (1) BR112020007329A2 (fr)
SG (1) SG11202001988QA (fr)
TW (1) TW201924347A (fr)
WO (1) WO2019079206A2 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020228836A1 (fr) * 2019-05-16 2020-11-19 Beijing Bytedance Network Technology Co., Ltd. Détermination d'affinement d'informations de mouvement basée sur une sous-région

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019191717A1 (fr) * 2018-03-30 2019-10-03 Hulu, LLC Bi-prédiction à modèle affiné pour codage vidéo
US10834409B2 (en) * 2018-04-06 2020-11-10 Arris Enterprises Llc System and method of implementing multiple prediction models for local illumination compensation
US10958928B2 (en) * 2018-04-10 2021-03-23 Qualcomm Incorporated Decoder-side motion vector derivation for video coding
WO2020084474A1 (fr) 2018-10-22 2020-04-30 Beijing Bytedance Network Technology Co., Ltd. Calcul de gradients dans un flux optique bidirectionnel
WO2020084476A1 (fr) 2018-10-22 2020-04-30 Beijing Bytedance Network Technology Co., Ltd. Prédiction à base de sous-blocs
JP7231727B2 (ja) * 2018-11-05 2023-03-01 北京字節跳動網絡技術有限公司 精緻化を伴うインター予測のための補間
WO2020098647A1 (fr) 2018-11-12 2020-05-22 Beijing Bytedance Network Technology Co., Ltd. Procédés de commande de largeur de bande pour prédiction affine
CN113056914B (zh) 2018-11-20 2024-03-01 北京字节跳动网络技术有限公司 基于部分位置的差计算
CN113170097B (zh) 2018-11-20 2024-04-09 北京字节跳动网络技术有限公司 视频编解码模式的编解码和解码
US11490112B2 (en) * 2018-11-29 2022-11-01 Interdigital Vc Holdings, Inc. Motion vector predictor candidates ordering in merge list
US11470340B2 (en) 2018-12-10 2022-10-11 Tencent America LLC Simplified merge list construction for small coding blocks
US11153590B2 (en) * 2019-01-11 2021-10-19 Tencent America LLC Method and apparatus for video coding
JP2022521554A (ja) 2019-03-06 2022-04-08 北京字節跳動網絡技術有限公司 変換された片予測候補の利用
CN111698515B (zh) * 2019-03-14 2023-02-14 华为技术有限公司 帧间预测的方法及相关装置
JP2022525876A (ja) * 2019-03-17 2022-05-20 北京字節跳動網絡技術有限公司 オプティカルフローベースの予測精緻化の計算
JP7307192B2 (ja) 2019-04-02 2023-07-11 北京字節跳動網絡技術有限公司 デコーダ側の動きベクトルの導出
AU2020298425A1 (en) * 2019-06-21 2021-12-23 Panasonic Intellectual Property Corporation Of America Encoder, decoder, encoding method, and decoding method
US11272200B2 (en) * 2019-06-24 2022-03-08 Tencent America LLC Method and apparatus for video coding
US11272203B2 (en) 2019-07-23 2022-03-08 Tencent America LLC Method and apparatus for video coding
KR20230070535A (ko) * 2019-10-09 2023-05-23 베이징 다지아 인터넷 인포메이션 테크놀로지 컴퍼니 리미티드 광 흐름에 의한 예측 개선, 양방향 광 흐름 및 디코더 측 움직임 벡터 개선을 위한 방법들 및 장치들
US11671616B2 (en) 2021-03-12 2023-06-06 Lemon Inc. Motion candidate derivation
US11936899B2 (en) * 2021-03-12 2024-03-19 Lemon Inc. Methods and systems for motion candidate derivation
CN117561711A (zh) * 2021-06-18 2024-02-13 抖音视界有限公司 用于视频处理的方法、装置和介质
US20220417500A1 (en) * 2021-06-29 2022-12-29 Qualcomm Incorporated Merge candidate reordering in video coding
US20230109532A1 (en) * 2021-10-05 2023-04-06 Tencent America LLC Alternative merge mode with motion vector difference by using template-matching
US20230164322A1 (en) * 2021-11-22 2023-05-25 Tencent America LLC Constrained template matching
WO2023147262A1 (fr) * 2022-01-31 2023-08-03 Apple Inc. Codage vidéo prédictif utilisant des trames de référence virtuelles générées par projection mv directe (dmvp)

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8873626B2 (en) * 2009-07-02 2014-10-28 Qualcomm Incorporated Template matching for video coding
CN101860754B (zh) * 2009-12-16 2013-11-13 香港应用科技研究院有限公司 运动矢量编码和解码的方法和装置
KR102257542B1 (ko) * 2012-10-01 2021-05-31 지이 비디오 컴프레션, 엘엘씨 향상 레이어에서 변환 계수 블록들의 서브블록-기반 코딩을 이용한 스케일러블 비디오 코딩
JP2018050091A (ja) * 2015-02-02 2018-03-29 シャープ株式会社 画像復号装置、画像符号化装置および予測ベクトル導出装置
JP6379186B2 (ja) * 2016-02-17 2018-08-22 テレフオンアクチーボラゲット エルエム エリクソン(パブル) ビデオピクチャを符号化および復号する方法および装置
WO2018163858A1 (fr) 2017-03-10 2018-09-13 ソニー株式会社 Dispositif et procédé de traitement d'image
US20190007699A1 (en) * 2017-06-28 2019-01-03 Futurewei Technologies, Inc. Decoder Side Motion Vector Derivation in Video Coding
US10757442B2 (en) * 2017-07-05 2020-08-25 Qualcomm Incorporated Partial reconstruction based template matching for motion vector derivation
US11095895B2 (en) * 2018-02-01 2021-08-17 Intel Corporation Human visual system optimized transform coefficient shaping for video encoding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
None

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020228836A1 (fr) * 2019-05-16 2020-11-19 Beijing Bytedance Network Technology Co., Ltd. Détermination d'affinement d'informations de mouvement basée sur une sous-région
US11736698B2 (en) 2019-05-16 2023-08-22 Beijing Bytedance Network Technology Co., Ltd Sub-region based determination of motion information refinement

Also Published As

Publication number Publication date
TW201924347A (zh) 2019-06-16
AU2018350913A1 (en) 2020-04-02
US20190124350A1 (en) 2019-04-25
CN111201794B (zh) 2024-03-01
WO2019079206A3 (fr) 2019-05-31
EP3698545A2 (fr) 2020-08-26
BR112020007329A2 (pt) 2020-10-06
SG11202001988QA (en) 2020-04-29
US10986360B2 (en) 2021-04-20
KR20200069303A (ko) 2020-06-16
CN111201794A (zh) 2020-05-26

Similar Documents

Publication Publication Date Title
US10986360B2 (en) Various improvements to FRUC template matching
CN111602399B (zh) 改进的解码器侧运动矢量推导
AU2018349463B2 (en) Low-complexity design for FRUC
CN110431845B (zh) 约束通过解码器侧运动向量推导导出的运动向量信息
CN111989922B (zh) 用于对视频数据进行解码的方法、设备和装置
CN110301135B (zh) 解码视频数据的方法和装置以及计算机可读存储介质
AU2018205783B2 (en) Motion vector reconstructions for bi-directional optical flow (BIO)
CN110741639B (zh) 视频译码中的运动信息传播
CN107431820B (zh) 视频译码中运动向量推导
WO2018175756A1 (fr) Dérivation de vecteur de mouvement côté décodeur
WO2019010123A1 (fr) Mise en correspondance de modèle basée sur une reconstruction partielle pour une dérivation de vecteur de mouvement

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18796349

Country of ref document: EP

Kind code of ref document: A2

ENP Entry into the national phase

Ref document number: 2018350913

Country of ref document: AU

Date of ref document: 20181015

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018796349

Country of ref document: EP

Effective date: 20200518

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112020007329

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 112020007329

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20200413