CN111684799B - Video processing method and device

Video processing method and device

Info

Publication number
CN111684799B
CN111684799B (application number CN201980009149.7A)
Authority
CN
China
Prior art keywords
offset value
motion vector
block
target candidate
current block
Prior art date
Legal status
Active
Application number
CN201980009149.7A
Other languages
Chinese (zh)
Other versions
CN111684799A (en)
Inventor
马思伟
王苏红
郑萧桢
王苫社
Current Assignee
Peking University
SZ DJI Technology Co Ltd
Original Assignee
Peking University
SZ DJI Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Peking University and SZ DJI Technology Co Ltd
Publication of CN111684799A
Application granted
Publication of CN111684799B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction

Abstract

A video processing method and apparatus are provided. The method includes selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, the motion vector candidate list including motion vectors of a plurality of candidate blocks; determining a first offset value of the current block; when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors, determining a second offset value of the current block; shifting the first motion vector of the target candidate block according to the first offset value; and shifting the second motion vector of the target candidate block according to the second offset value. By identifying a particular candidate block and selecting an adapted offset scheme for the particular candidate block, the inter prediction mode may be optimized.

Description

Video processing method and device
Copyright declaration
The disclosure of this patent document contains material which is subject to copyright protection; the copyright is owned by the copyright owner. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent files or records.
Technical Field
The present application relates to the field of video encoding and decoding, and more particularly, to a video processing method and apparatus.
Background
The video encoding process includes an inter prediction process. There are various inter prediction modes, and some inter prediction modes construct a motion vector candidate list of a current block and determine a motion vector of the current block based on the motion vector candidate list of the current block.
In order to improve the inter prediction effect, some inter prediction modes may shift motion vectors in a motion vector candidate list of a current block to some extent in determining the motion vector of the current block.
Disclosure of Invention
The application provides a video processing method and a video processing device for optimizing an inter-frame prediction mode.
In a first aspect, a video processing method is provided, including: selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, wherein the motion vector candidate list comprises motion vectors of a plurality of candidate blocks; determining a first offset value of the current block; determining a second offset value of the current block when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors; shifting a first motion vector of the target candidate block according to the first offset value; and shifting a second motion vector of the target candidate block according to the second offset value.
In a second aspect, there is provided a video processing apparatus comprising: a memory for storing codes; and the processor is used for reading the codes in the memory to execute the following operations: selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, wherein the motion vector candidate list comprises motion vectors of a plurality of candidate blocks; determining a first offset value of the current block; determining a second offset value of the current block when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors; shifting a first motion vector of the target candidate block according to the first offset value; and shifting a second motion vector of the target candidate block according to the second offset value.
In a third aspect, there is provided a video processing apparatus comprising means for performing the steps of the method of the first aspect.
In a fourth aspect, there is provided a computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the method of the first aspect.
In a fifth aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect.
By identifying the specific candidate block and selecting an offset scheme adapted to it, the method and apparatus of the present application can optimize the inter prediction mode.
Drawings
Fig. 1 is a schematic flow chart of a construction process of a merge candidate list.
Fig. 2 is an example diagram of a scaling manner of a temporal motion vector.
Fig. 3 is an exemplary diagram of a prediction mode in the lowdelay mode.
Fig. 4 is an exemplary diagram of a prediction mode in the random access mode.
Fig. 5 is a schematic flowchart of a video processing method provided in an embodiment of the present application.
Fig. 6 is an exemplary diagram in which the two reference frames of the current frame are both located on the same side of the current frame.
Fig. 7 is an exemplary diagram of a scaling scheme provided by an embodiment of the present application.
Fig. 8 is a schematic structural diagram of a video processing apparatus provided in an embodiment of the present application.
Detailed Description
The present application can be applied to various video coding standards, such as H.264, high efficiency video coding (HEVC), versatile video coding (VVC), the audio video coding standard (AVS), AVS+, AVS2, AVS3, and the like.
The distance between two frames mentioned in the present application may refer to the difference between the two frames in playing order: the larger the distance, the larger the difference in playing order. Since the playing order may be represented by a frame number (picture order count, POC), in some embodiments the distance between two frames may be measured as the difference between the frame numbers of the two frames.
The video coding process mainly includes prediction, transform, quantization, entropy coding, and loop filtering. Prediction is an important component of mainstream video coding techniques. Prediction can be divided into intra prediction and inter prediction. Inter prediction may be implemented by means of motion compensation. The motion compensation process is illustrated below.
For example, a frame of image may first be divided into one or more coding regions. A coding region may also be referred to as a coding tree unit (CTU). A CTU may be, for example, 64×64 or 128×128 (the units are pixels and are omitted in similar descriptions below). Each CTU may be divided into square or rectangular image blocks; the image block currently being processed by the encoding side or the decoding side is hereinafter referred to as the current block. The current block mentioned in the embodiments of the present application may sometimes refer to the current coding unit (CU) and sometimes to the current prediction unit (PU), which is not limited in the embodiments of the present application.
When inter-predicting the current block, a similar block of the current block may be found from a reference frame (typically a temporally nearby reconstructed frame) and used as the prediction block of the current block. The relative displacement between the current block and the similar block is referred to as a motion vector. The process of searching the reference frame for a similar block to serve as the prediction block of the current block is motion compensation.
There are various inter prediction modes, and some inter prediction modes construct a motion vector candidate list of the current block and select a motion vector of the current block from the motion vector candidate list of the current block. Taking the merge mode as an example, the construction process of a motion vector candidate list (in the merge mode, the motion vector candidate list may also be referred to as a merge candidate list) is illustrated below.
As shown in fig. 1, the construction process of the merge candidate list includes steps S102 to S118.
In step S102, one or more spatial MVPs (motion vector predictors), i.e., spatial MVP candidates, are added to the merge candidate list.
The spatial MVP is a motion vector of a spatially neighboring block that is in the same frame as the current block. The maximum number of spatial MVPs may be set to 4.
In step S104, it is determined whether the number of candidates in the merge candidate list reaches a preset maximum value (maxNumMergeCand).
If the number of candidates in the merge candidate list reaches a preset maximum value, ending the flow of FIG. 1; if the number of candidates in the merge candidate list does not reach the preset maximum value, step S106 may be continued.
In step S106, one or more temporal MVPs (TMVPs), i.e., TMVP candidates, are added to the merge candidate list.
The TMVP added in step S106 may be determined based on the motion vector of a co-located block in a co-located frame of the current frame. Since the distance between the co-located frame and the reference frame of the co-located frame typically differs from the distance between the current frame and the reference frame of the current frame, the motion vector of the co-located block typically needs to be scaled before being added to the merge candidate list as the TMVP. The process of obtaining the TMVP is described in more detail below in conjunction with fig. 2.
In step S108, it is determined whether the number of candidates in the merge candidate list reaches a preset maximum value.
If the number of candidates in the merge candidate list reaches a preset maximum value, ending the flow of FIG. 1; if the number of candidates in the merge candidate list does not reach the preset maximum value, the step S110 is continued.
In step S110, a history-based MVP (HMVP), or HMVP candidate, is added to the merge candidate list.
HMVP may be a motion vector of other blocks in the frame where the current block is located (e.g., non-neighboring blocks of the current block).
In step S112, it is determined whether the number of candidates in the merge candidate list reaches a preset maximum value.
If the number of candidates in the merge candidate list reaches a preset maximum value, ending the flow of FIG. 1; if the number of candidates in the merge candidate list does not reach the preset maximum value, the process continues to step S114.
In step S114, a pairwise MVP is added to the merge candidate list.
The pairwise MVP may be obtained by averaging MVPs that have already been added to the merge candidate list.
In step S116, it is determined whether the number of candidates in the merge candidate list reaches a preset maximum value.
If the number of candidates in the merge candidate list reaches a preset maximum value, ending the flow of FIG. 1; if the number of candidates in the merge candidate list does not reach the preset maximum value, the process continues to step S118.
In step S118, a vector (0, 0) is added to the merge candidate list until the number of candidates in the merge candidate list reaches a preset maximum value.
It should be understood that steps S108 to S118 are optional steps. For example, after adding the spatial MVP and TMVP to the merge candidate list, the construction of the merge candidate list may be stopped. For another example, after adding the spatial MVP and TMVP to the merge candidate list, if the number of candidates in the merge candidate list has not yet reached a preset maximum value, a vector (0, 0) may be added. As another example, after adding the spatial MVP and the TMVP to the merge candidate list, if the number of candidates in the merge candidate list does not reach the preset maximum value yet, the above-described steps S110 to S112 may be performed without performing the steps S114 to S116; alternatively, if the number of candidates in the merge candidate list has not reached the preset maximum value, steps S114 to S116 are performed without performing steps S110 to S112.
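As an illustration of the flow in fig. 1, the following C++ sketch is a minimal, hypothetical rendering of steps S102 to S118. The function and parameter names are assumptions of this sketch, the candidate derivations are taken as precomputed inputs, and the duplicate pruning performed by real codecs is omitted.

```cpp
#include <cstddef>
#include <vector>

struct MotionVector { int x = 0; int y = 0; };

// Builds a merge candidate list in the order of fig. 1 (S102..S118).
// The spatial/temporal/history/pairwise candidates are passed in
// precomputed; deriving them is outside the scope of this sketch.
std::vector<MotionVector> buildMergeCandidateList(
        const std::vector<MotionVector>& spatialMvps,   // S102, at most 4
        const std::vector<MotionVector>& tmvps,         // S106
        const std::vector<MotionVector>& hmvps,         // S110
        const std::vector<MotionVector>& pairwiseMvps,  // S114
        std::size_t maxNumMergeCand) {
    std::vector<MotionVector> list;
    // Appends candidates until the list is full (checks S104/S108/S112/S116).
    auto append = [&](const std::vector<MotionVector>& candidates) {
        for (const MotionVector& mv : candidates) {
            if (list.size() >= maxNumMergeCand) return;
            list.push_back(mv);
        }
    };
    append(spatialMvps);
    append(tmvps);
    append(hmvps);
    append(pairwiseMvps);
    // S118: pad with the zero vector (0, 0) until the list is full.
    while (list.size() < maxNumMergeCand) list.push_back({0, 0});
    return list;
}
```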
The TMVP acquired in step S106 may refer to a motion vector of a candidate block obtained based on the TMVP technique. The process of TMVP acquisition is described below in conjunction with fig. 2.
In constructing the merge candidate list of the current block (cur_pu), if TMVP needs to be added to the merge candidate list, a co-located frame (col_pic) of the current block is generally determined first. The co-located frame is a temporally different frame than the current frame. The co-located frame may be, for example, the first frame in the reference list of the current block. After determining the co-located frame, a corresponding block of the current block may be determined from the co-located frame as a co-located block (col_pu) of the current block. For example, a corresponding block in the co-located frame to the current block at a C1 (lower right corner point of the current block) or C0 (center point of the current block) position may be determined as the co-located block of the current block.
The TMVP may then be determined based on the motion vector of the co-located block. As shown in fig. 2, the distance between the co-located frame (col_pic) and the reference frame (col_ref) of the co-located frame is denoted by tb, and the distance between the current frame (cur_pic) and the reference frame (cur_ref) of the current frame is denoted by td. Since tb and td are usually different, the motion vector colMV of the co-located block is usually scaled first to obtain curMV, which is then added to the merge candidate list as the TMVP. curMV can be calculated by the following formula: curMV = td × colMV / tb. For convenience of description, this scaling operation is hereinafter referred to as mapping the motion vector of the co-located block to a reference frame of the current frame.
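The scaling formula can be illustrated with a short sketch. The following C++ fragment is a simplified rendering of curMV = td × colMV / tb, with frame distances measured as POC differences as described above; the fixed-point rounding and clipping used by real codecs are deliberately omitted, and the function name is an assumption of this sketch.

```cpp
struct MotionVector { int x = 0; int y = 0; };

// Scales the motion vector of the co-located block (colMV) so that it maps
// to a reference frame of the current frame, as in fig. 2.
MotionVector scaleTmvp(const MotionVector& colMV,
                       int curPoc, int curRefPoc,    // current frame and its reference
                       int colPoc, int colRefPoc) {  // co-located frame and its reference
    const int td = curPoc - curRefPoc;  // distance: current frame to its reference
    const int tb = colPoc - colRefPoc;  // distance: co-located frame to its reference
    if (tb == 0) return colMV;          // degenerate case: nothing to scale by
    return { colMV.x * td / tb, colMV.y * td / tb };  // curMV = td * colMV / tb
}
```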
In applying the temporal motion vector prediction technique, if the current frame is a B frame (i.e., the current frame adopts the bi-directional prediction mode), it is necessary to perform 2 scaling operations as shown in fig. 2, one for mapping the motion vector of the co-located block to the forward reference frame of the current block and another for mapping the motion vector of the co-located block to the backward reference frame of the current block.
Note that the co-located block may itself use the bi-prediction mode, and may therefore have two motion vectors: one corresponding to the forward prediction mode of the co-located block (hereinafter col_mv0) and one corresponding to the backward prediction mode of the co-located block (hereinafter col_mv1). When performing the above two scaling operations, one possible implementation is to map col_mv0 to the forward reference frame and the backward reference frame of the current block, respectively. Alternatively, col_mv0 may be mapped to the forward reference frame of the current block and col_mv1 to the backward reference frame of the current block; alternatively, col_mv1 may be mapped to the forward reference frame and the backward reference frame of the current block, respectively; alternatively, col_mv1 may be mapped to the forward reference frame of the current block and col_mv0 to the backward reference frame of the current block.
The process of mapping the motion vector of the co-located block to the forward reference frame and the backward reference frame of the current frame will be described with reference to fig. 3 and 4, taking a low delay (lowdelay) mode and a random access (random access) mode as examples, respectively.
In the lowdelay mode, the frame numbers of all reference frames of the current frame are smaller than the frame number of the current frame, i.e., all reference frames of the current frame precede the current frame in playing order.
Taking fig. 3 as an example, in the lowdelay mode, the current frame (cur_pic) is POC 5, the first frame of the forward reference list is POC 4, and the first frame of the backward reference list is POC 4. In the example of fig. 3, POC 4 is simultaneously the co-located frame (col_pic), the forward reference frame (cur_ref0), and the backward reference frame (cur_ref1) of the current frame. The forward reference list of POC 4 is {POC 3, POC 2, POC 0}, and the backward reference list is {POC 3, POC 2, POC 0}. The forward reference frame of POC 4 may be any one of POC 3, POC 2, and POC 0; fig. 3 takes POC 3 as the forward reference frame of POC 4. Similarly, the backward reference frame of POC 4 may be any one of POC 3, POC 2, and POC 0; fig. 3 takes POC 2 as the backward reference frame of POC 4.
Fig. 3 (a) illustrates a scaling operation for a forward prediction mode in a lowdelay mode, resulting in mapping a motion vector (col_mv0) of a co-located block (col_pu) of a current block (cur_pu) to a forward reference frame (POC 4) of the current block, thereby obtaining a temporal motion vector prediction cur_mv0. Fig. 3 (b) illustrates a scaling operation for a backward prediction mode in a lowdelay mode, resulting in mapping a motion vector (col_mv1) of a co-located block (col_pu) of a current block (cur_pu) to a backward reference frame (POC 4) of the current block, thereby obtaining a temporal motion vector prediction cur_mv1.
In the random access mode, the frame number of a reference frame of the current frame may be larger or smaller than the frame number of the current frame, i.e., a reference frame of the current frame may be played either after or before the current frame.
Taking fig. 4 as an example, in the random access mode, the current frame is POC 27. The first frame of the forward reference list of the current frame is POC 26, and the first frame of the backward reference list is POC 28. In the example of fig. 4, POC 26 is both the co-located frame (col_pic) and the forward reference frame (cur_ref0) of the current frame; POC 28 is the backward reference frame (cur_ref1) of the current frame. The forward reference list of POC 26 is {POC 24, POC 16}, and the backward reference list is {POC 28, POC 32}. The backward reference frame of POC 26 may be POC 28 or POC 32; in the example of fig. 4, POC 32 is selected as the backward reference frame (col_ref1) of POC 26.
Fig. 4 (a) illustrates a scaling operation for a forward prediction mode in a random access mode, resulting in mapping a motion vector (col_mv1) of a co-located block (col_pu) of a current block (cur_pu) to a forward reference frame (cur_ref0) of the current block, thereby obtaining a temporal motion vector prediction cur_mv0. Fig. 4 (b) illustrates a scaling operation for the backward prediction mode in the random access mode, resulting in mapping the motion vector (col_mv1) of the co-located block (col_pu) of the current block (cur_pu) to the backward reference frame (cur_ref1) of the current block, thereby obtaining the temporal motion vector prediction cur_mv1.
In order to improve the inter prediction effect, some inter prediction modes may shift the motion vectors in the motion vector candidate list of the current block to some extent when determining the motion vector of the current block. The merge mode with motion vector difference (MMVD) is an inter prediction technique that introduces such an offset scheme on top of the conventional merge mode.
The motion vector offset scheme is illustrated below using MMVD as an example.
The MMVD technique may also be referred to as the ultimate motion vector expression (UMVE) technique. The implementation of MMVD mainly includes the following two steps.
The first step: a base motion vector (base MV) is selected from the already constructed merge candidate list. In general, the first two motion vector predictors in the merge candidate list may be selected as base motion vectors.
The second step: the base motion vector is shifted according to a certain rule to generate a new motion vector candidate, and prediction is performed using the new motion vector.
For example, assume that the base motion vector is (x, y) and the new motion vector obtained after the offset is (x', y'). The offset value used in the offset operation may be taken from a preconfigured set of offset values. For example, 8 optional offset values (offsets) may be preconfigured, and each offset value can be applied in one of the following 4 ways:
- x' = x + offset, y' = y;
- x' = x - offset, y' = y;
- x' = x, y' = y + offset;
- x' = x, y' = y - offset.
Thus, offsetting two base motion vectors yields 2 × 4 × 8 = 64 possible combinations. When the base motion vector uses the bi-prediction mode, the motion vectors of the base motion vector in both reference directions need to be shifted. If the forward reference frame and the backward reference frame of the base motion vector are the same frame, the base motion vector can be offset with the same offset value in both reference directions; if the forward reference frame and the backward reference frame of the base motion vector are different, the offset value used for the motion vector in one of the reference directions needs to be scaled, and the base motion vector is offset using the scaled offset value.
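As an illustration of the candidate generation just described, the following C++ sketch enumerates the 2 × 4 × 8 = 64 combinations. The offset values are illustrative placeholders rather than values mandated by any standard, and the scaling of the offset value for differing reference frames is omitted here.

```cpp
#include <array>
#include <vector>

struct MotionVector { int x = 0; int y = 0; };

std::vector<MotionVector> generateMmvdCandidates(
        const std::array<MotionVector, 2>& baseMvs) {
    // 8 preconfigured offset values (illustrative only).
    const std::array<int, 8> offsets = {1, 2, 4, 8, 16, 32, 64, 128};
    std::vector<MotionVector> candidates;
    candidates.reserve(64);
    for (const MotionVector& base : baseMvs) {
        for (int offset : offsets) {
            // The 4 ways of applying one offset value to (x, y).
            candidates.push_back({base.x + offset, base.y});  // x' = x + offset
            candidates.push_back({base.x - offset, base.y});  // x' = x - offset
            candidates.push_back({base.x, base.y + offset});  // y' = y + offset
            candidates.push_back({base.x, base.y - offset});  // y' = y - offset
        }
    }
    return candidates;  // 2 bases x 8 offsets x 4 directions = 64 candidates
}
```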
An embodiment of the present application is described in detail below with reference to fig. 5. It should be understood that the method of fig. 5 may be applied to the encoding side and the decoding side, which is not limited in this embodiment of the present application.
Fig. 5 is a schematic flowchart of a video processing method provided in an embodiment of the present application. The method of fig. 5 includes steps S510 to S550.
In step S510, a motion vector of the target candidate block is selected from the motion vector candidate list of the current block.
The motion vector candidate list may include motion vectors of a plurality of candidate blocks. Taking the merge mode as an example, the motion vector candidate list mentioned in step S510 may be a merge candidate list. The construction process of the merge candidate list can be seen in fig. 1.
In some embodiments, the motion vector of the target candidate block may be referred to as the Base motion vector (Base MV) of the current block.
In this embodiment of the present application, the current frame where the current block is located may be a B frame. The prediction mode of the current block may be a bi-prediction mode.
In step S520, a first offset value of the current block is determined.
In step S530, when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors, a second offset value of the current block is determined.
The target candidate block may be a candidate block determined according to TMVP technology. In other words, the candidate block determined based on the TMVP technique and the current block may be located in different frames.
The two motion vectors of the target candidate block may correspond to forward prediction and backward prediction, respectively. Alternatively, the two motion vectors of the target candidate block may both correspond to prediction in the same direction. For example, when the video is coded in the random access mode, the two motion vectors of the target candidate block may be forward prediction and backward prediction; when the video is coded in the low-delay mode, both motion vectors of the target candidate block may be forward predictions.
The reference frames of the two motion vectors of the target candidate block may be the same or different.
In step S540, the first motion vector of the target candidate block is shifted according to the first shift value.
The first offset value may include one or more selectable values. The first offset value may comprise 8 selectable values, for example. The first offset value may be a preset offset value, or may be an offset value obtained by performing other operations on the preset offset value. Other operations herein may be a scaling operation, a negating operation, etc., which are not specifically limited in the embodiments of the present application.
As an example, step S540 may include: shifting the first motion vector of the target candidate block according to a non-scaled first offset value.
It should be noted that the non-scaled first offset value may be a preset offset value, or an offset value obtained by applying an operation other than scaling to the preset offset value, for example, a negation operation. The preset offset value may include a plurality of optional values, and the negation operation may negate all of the optional values or only some of them, which is not limited in the embodiments of the present application.
In step S550, the second motion vector of the target candidate block is shifted according to the second offset value.
By identifying the specific candidate block and selecting an offset scheme adapted to it, the method of this embodiment can optimize the inter prediction mode.
The second offset value may include one or more selectable values. The second offset value may comprise 8 selectable values, for example. The second offset value may be a preset offset value, or may be an offset value obtained by performing other operations on the preset offset value. Other operations herein may be a scaling operation, a negating operation, etc., which are not specifically limited in the embodiments of the present application.
Similarly, the non-scaled second offset value may be a preset offset value, or an offset value obtained by applying an operation other than scaling to the preset offset value, for example, a negation operation. The preset offset value may include a plurality of optional values, and the negation operation may negate all of the optional values or only some of them, which is not limited in the embodiments of the present application.
The first offset value and the second offset value may be offset values independent of each other. Alternatively, the second offset value may be derived from the first offset value without scaling, i.e., obtained from the first offset value by means other than scaling. For example, the first offset value may be a preset offset value, and the second offset value may be an offset value obtained by negating some or all of the optional values of the first offset value.
The first offset value and the second offset value may be the same or different. For example, the first offset value and the second offset value may both be a preset offset value. For another example, the first offset value may be a preset offset value, and the second offset value may be the opposite number of the preset offset value; or the second offset value may be a preset offset value, and the first offset value may be the opposite number of the preset offset value. For another example, the first offset value and the second offset value may share some optional values, while their remaining optional values are opposite numbers of each other.
For example, if the two motion vectors of the target candidate block are forward prediction and backward prediction, respectively, the first offset value and the second offset value may be opposite numbers of each other.
As another example, where the two motion vectors of the target candidate block are both forward predictions, the first offset value and the second offset value may be the same.
Taking fig. 6 as an example, assume that the frame number of the current frame is current POC, the frame number of the forward reference frame of the current frame is POC 0, and the frame number of the backward reference frame of the current frame is POC 1. If (current POC - POC 0) × (current POC - POC 1) > 0, the forward reference frame and the backward reference frame of the current frame are both located before, or both located after, the current frame in playing order. In this case, in some embodiments, the first offset value and the second offset value of the current block may be set to the same offset value.
As can be seen from the description of fig. 2, if the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors, two scaling operations may be required when adding the two motion vectors of the target candidate block to the motion vector candidate list of the current block. In addition, if the reference frames corresponding to the two motion vectors of the target candidate block are different, a scaling operation may also be required on the offset value when the motion vectors of the target candidate block are offset. It follows that when the target candidate block and the current block are located in different frames, the target candidate block has two motion vectors, and the reference frames of the two motion vectors differ, up to 3 scaling operations may occur. These 3 scaling operations increase the complexity of video processing and the burden on the codec system.
In order to avoid the above-described problem caused by the 3 scaling operations, when the target candidate block and the current block are located in different frames, the target candidate block has two motion vectors, and reference frames of the two motion vectors of the target candidate block are different, both the first offset value and the second offset value may be set to non-scaled offset values. In other words, when the target candidate block and the current block are located in different frames, the target candidate block has two motion vectors, and the reference frames of the two motion vectors of the target candidate block are different, the scaling operation on the offset value can be omitted, and the motion vectors of the target candidate block are directly offset by using the non-scaled offset value, so that the video processing process is simplified, and the complexity of video processing is reduced.
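The offset selection of this embodiment can be summarized in a short sketch. The following C++ fragment is a minimal illustration under the assumptions stated above and in the claims: both offset values stay non-scaled, and the second offset value equals the first offset value when the two reference frames lie on the same side of the current frame (cf. fig. 6), and its opposite number otherwise. All names are assumptions of this sketch.

```cpp
struct MotionVector { int x = 0; int y = 0; };

// Fig. 6 check: both reference frames lie on the same side of the current
// frame exactly when (current POC - POC 0) * (current POC - POC 1) > 0.
bool referencesOnSameSide(int curPoc, int refPoc0, int refPoc1) {
    return (curPoc - refPoc0) * (curPoc - refPoc1) > 0;
}

// Derives the second offset value from the non-scaled first offset value.
int deriveSecondOffset(int firstOffset, bool sameSide) {
    return sameSide ? firstOffset : -firstOffset;  // same value or opposite number
}

// Shifts both motion vectors of the target candidate block (horizontal
// component shown; the vertical component is handled analogously).
void shiftMotionVectors(MotionVector& mv0, MotionVector& mv1, int firstOffset,
                        int curPoc, int refPoc0, int refPoc1) {
    const int secondOffset = deriveSecondOffset(
        firstOffset, referencesOnSameSide(curPoc, refPoc0, refPoc1));
    mv0.x += firstOffset;   // S540: shift the first motion vector
    mv1.x += secondOffset;  // S550: shift the second motion vector
}
```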
Optionally, in some embodiments, the method of fig. 5 may further include: determining the motion vector of the current block according to the shifted first motion vector and the shifted second motion vector. The motion vector of the current block may sometimes be referred to as the optimal motion vector of the current block. The motion vector of the current block may be calculated according to an algorithm such as rate-distortion cost, which is not limited in the embodiments of the present application.
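As a schematic illustration of this selection step, the sketch below simply picks the shifted candidate with the minimum cost; the cost function is a placeholder into which any concrete rate-distortion criterion could be plugged, and is an assumption of this sketch.

```cpp
#include <functional>
#include <limits>
#include <vector>

struct MotionVector { int x = 0; int y = 0; };

MotionVector pickBestMotionVector(
        const std::vector<MotionVector>& shiftedCandidates,
        const std::function<double(const MotionVector&)>& cost) {
    double bestCost = std::numeric_limits<double>::max();
    MotionVector best{};
    for (const MotionVector& mv : shiftedCandidates) {
        const double c = cost(mv);  // e.g., distortion + lambda * rate
        if (c < bestCost) { bestCost = c; best = mv; }
    }
    return best;
}
```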
The above description takes as an example the case where the target candidate block and the current block are located in different frames. When the target candidate block and the current block are located in the same frame, the method of fig. 5 may further include: scaling the second offset value according to the first offset value, and offsetting the second motion vector of the target candidate block using the scaled second offset value.
The following takes the spatial MVP as the target motion vector for illustration. As shown in fig. 7, the forward reference frame of the base motion vector of the current block (cur_pu) is cur ref pic 0, the backward reference frame is cur ref pic 1, the motion vector of the base motion vector with respect to the forward reference frame is MV0, and the motion vector with respect to the backward reference frame is MV1. When cur ref pic 0 and cur ref pic 1 are not the same frame, the distances between the two reference frames and the current frame may differ, and it is then unreasonable to offset MV0 and MV1 with the same offset value (represented by offset 0 and offset 1 in fig. 7). For example, if an offset value of 128 is used in both reference directions, the MV corresponding to the reference frame closer to the current frame will change too much, which does not conform to the motion of objects in natural video; the offset value of the MV therefore needs to be scaled.
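A minimal sketch of this offset scaling is given below, assuming linear scaling by the ratio of POC distances in the spirit of the TMVP formula above; this is an illustration under that assumption, not a normative rule, and the function name is hypothetical.

```cpp
// Scales the offset applied to MV1 in proportion to the frame distances,
// so that the reference frame closer to the current frame receives a
// correspondingly smaller offset (fig. 7).
int scaleOffsetForSecondDirection(int offset,
                                  int curPoc, int refPoc0, int refPoc1) {
    const int d0 = curPoc - refPoc0;  // distance to cur ref pic 0
    const int d1 = curPoc - refPoc1;  // distance to cur ref pic 1
    if (d0 == 0) return offset;       // degenerate case: keep the offset as-is
    return offset * d1 / d0;          // offset 1 = offset 0 * d1 / d0
}
```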
The video processing method according to the embodiment of the present application is described in detail above with reference to fig. 1 to 7, and the video processing apparatus according to the embodiment of the present application is described in detail below with reference to fig. 8. It should be understood that the descriptions of the method embodiments and the apparatus embodiments correspond to each other, and thus, details not described in the apparatus embodiment part may refer to the corresponding details of the method embodiment part in the foregoing.
Fig. 8 is a schematic structural diagram of a video processing apparatus provided in an embodiment of the present application. The video processing apparatus 800 of fig. 8 may be an encoder or a decoder. The video processing apparatus 800 may include a memory 810 and a processor 820.
Memory 810 may be used to store code.
Processor 820 may be used to read code in memory to perform the following operations: selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, wherein the motion vector candidate list comprises motion vectors of a plurality of candidate blocks; determining a first offset value of the current block; determining a second offset value of the current block when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors; shifting a first motion vector of the target candidate block according to the first offset value; and shifting a second motion vector of the target candidate block according to the second offset value.
Optionally, the second offset value is derived from the first offset value that is not scaled.
Optionally, the first offset value and the second offset value are the same.
Optionally, the processor 820 may be further configured to perform the following operations: and when the target candidate block and the current block are positioned in the same frame, scaling the second offset value, and shifting a second motion vector of the target candidate block by using the scaled second offset value.
Optionally, the target candidate block is determined according to TMVP technique.
Optionally, the two motion vectors of the target candidate block are forward prediction and backward prediction, respectively.
Optionally, the first offset value and the second offset value are opposite numbers to each other.
Optionally, the coding mode of the video is a random access mode.
Optionally, the two motion vectors of the target candidate block are forward predictions.
Optionally, the first offset value and the second offset value are the same.
Optionally, the coding mode of the video is a low-delay mode.
Optionally, the motion vector candidate list is a merge candidate list.
Optionally, the motion vector of the target candidate block is a base motion vector of the current block.
Optionally, the first offset value and/or the second offset value comprises a plurality of selectable values.
Optionally, the processor 820 may be further configured to perform the following operations: and determining the motion vector of the current block according to the shifted first motion vector and the second motion vector.
Optionally, the second offset value is obtained by other means than scaling the first offset value.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (DVD)), a semiconductor medium (e.g., a solid state disk (SSD)), or the like.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (30)

1. A video processing method, comprising:
selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, wherein the motion vector candidate list comprises motion vectors of a plurality of candidate blocks;
determining a first offset value of the current block;
determining a second offset value of the current block when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors;
shifting a first motion vector of the target candidate block according to the first offset value;
and shifting a second motion vector of the target candidate block according to the second offset value.
2. The method of claim 1, wherein the second offset value is derived from the first offset value that is not scaled.
3. The method of claim 1, wherein the first offset value and the second offset value are the same.
4. The method according to claim 1, wherein the method further comprises:
and when the target candidate block and the current block are positioned in the same frame, scaling the second offset value, and shifting a second motion vector of the target candidate block by using the scaled second offset value.
5. The method of claim 1, wherein the target candidate block is determined according to a temporal motion vector prediction (TMVP) technique.
6. The method of claim 1, wherein the two motion vectors of the target candidate block are forward prediction and backward prediction, respectively.
7. The method of claim 6, wherein the first offset value and the second offset value are opposite numbers to each other.
8. The method of claim 6, wherein the coding mode of the video is a random access mode.
9. The method of claim 1, wherein the two motion vectors of the target candidate block are forward predictions.
10. The method of claim 9, wherein the first offset value and the second offset value are the same.
11. The method of claim 9, wherein the coding mode of the video is a low-delay mode.
12. The method of claim 1, wherein the motion vector candidate list is a merge candidate list.
13. The method of claim 1, wherein the motion vector of the target candidate block is a base motion vector of the current block.
14. The method according to claim 1, wherein the first offset value and/or the second offset value comprises a plurality of selectable values.
15. The method according to claim 1, wherein the method further comprises:
and determining the motion vector of the current block according to the first motion vector and the second motion vector after the offset.
16. A video processing apparatus, comprising:
a memory for storing codes;
and the processor is used for reading the codes in the memory to execute the following operations:
selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, wherein the motion vector candidate list comprises motion vectors of a plurality of candidate blocks;
determining a first offset value of the current block;
determining a second offset value of the current block when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors;
shifting a first motion vector of the target candidate block according to the first offset value;
and shifting a second motion vector of the target candidate block according to the second offset value.
17. The apparatus of claim 16, wherein the second offset value is derived from the first offset value without scaling.
18. The apparatus of claim 16, wherein the first offset value and the second offset value are the same.
19. The apparatus of claim 16, wherein the processor is further configured to:
and when the target candidate block and the current block are positioned in the same frame, scaling the second offset value, and shifting a second motion vector of the target candidate block by using the scaled second offset value.
20. The apparatus of claim 16, wherein the target candidate block is determined according to a Temporal Motion Vector Prediction (TMVP) technique.
21. The apparatus of claim 16, wherein the two motion vectors of the target candidate block are forward prediction and backward prediction, respectively.
22. The apparatus of claim 21, wherein the first offset value and the second offset value are opposite numbers to each other.
23. The apparatus of claim 21, wherein the coding mode of the video is a random access mode.
24. The apparatus of claim 16, wherein the two motion vectors of the target candidate block are forward predictions.
25. The apparatus of claim 24, wherein the first offset value and the second offset value are the same.
26. The apparatus of claim 24, wherein the coding mode of the video is a low-delay mode.
27. The apparatus of claim 16, wherein the motion vector candidate list is a merge candidate list.
28. The apparatus of claim 16, wherein the motion vector of the target candidate block is a base motion vector of the current block.
29. The apparatus of claim 16, wherein the first offset value and/or the second offset value comprises a plurality of selectable values.
30. The apparatus of claim 16, wherein the processor is further configured to:
and determining the motion vector of the current block according to the first motion vector and the second motion vector after the offset.
CN201980009149.7A 2019-06-25 2019-06-25 Video processing method and device Active CN111684799B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/092751 WO2020258024A1 (en) 2019-06-25 2019-06-25 Video processing method and device

Publications (2)

Publication Number Publication Date
CN111684799A (en) 2020-09-18
CN111684799B (en) 2023-07-25

Family

ID=72451465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980009149.7A Active CN111684799B (en) 2019-06-25 2019-06-25 Video processing method and device

Country Status (2)

Country Link
CN (1) CN111684799B (en)
WO (1) WO2020258024A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112565753B * 2020-12-06 2022-08-16 Zhejiang Dahua Technology Co., Ltd. Method and apparatus for determining motion vector difference, storage medium, and electronic apparatus
CN115086678B * 2022-08-22 2022-12-27 Beijing Dajia Internet Information Technology Co., Ltd. Video encoding method and device, and video decoding method and device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104488272A * 2012-07-02 2015-04-01 Samsung Electronics Co., Ltd. Method and apparatus for predicting motion vector for coding video or decoding video

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2647705C1 * 2011-11-08 2018-03-16 KT Corporation Method of video signal decoding
CN110832865A * 2017-07-04 2020-02-21 Samsung Electronics Co., Ltd. Image encoding method and apparatus, and image decoding method and apparatus

Also Published As

Publication number Publication date
WO2020258024A1 (en) 2020-12-30
CN111684799A (en) 2020-09-18

Similar Documents

Publication Publication Date Title
US9369731B2 (en) Method and apparatus for estimating motion vector using plurality of motion vector predictors, encoder, decoder, and decoding method
JP7328337B2 (en) Video processing method and apparatus
CN112868239A (en) Collocated local illumination compensation and intra block copy codec
US20190342571A1 (en) Image predictive encoding and decoding system
CN111684799B (en) Video processing method and device
CN112602326A (en) Motion vector prediction method and device and coder-decoder
US11503283B2 (en) Dynamic image decoding device, dynamic image decoding method, dynamic image decoding program, dynamic image encoding device, dynamic image encoding method, and dynamic image encoding program
US11949874B2 (en) Image encoding/decoding method and device for performing prof, and method for transmitting bitstream
JP5880758B2 (en) Moving picture decoding apparatus, moving picture decoding method, moving picture decoding program, receiving apparatus, receiving method, and receiving program
US20220191535A1 (en) Image encoding/decoding method and apparatus for performing bi-directional prediction, and method for transmitting bitstream

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant