CN111684799A - Video processing method and device - Google Patents

Info

Publication number
CN111684799A
Authority
CN
China
Prior art keywords
offset value
motion vector
block
target candidate
current block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201980009149.7A
Other languages
Chinese (zh)
Other versions
CN111684799B (en)
Inventor
马思伟
王苏红
郑萧桢
王苫社
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
SZ DJI Technology Co Ltd
Original Assignee
Peking University
SZ DJI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University and SZ DJI Technology Co Ltd
Publication of CN111684799A
Application granted
Publication of CN111684799B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157 Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159 Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video processing method and apparatus are provided. The method includes selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, the motion vector candidate list including motion vectors of a plurality of candidate blocks; determining a first offset value for the current block; determining a second offset value for the current block when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors; offsetting a first motion vector of the target candidate block according to the first offset value; and offsetting the second motion vector of the target candidate block according to the second offset value. By identifying a specific candidate block and selecting an adaptive offset scheme for the specific candidate block, the inter prediction mode can be optimized.

Description

Video processing method and device
Copyright declaration
The disclosure of this patent document contains material that is subject to copyright protection. The copyright belongs to the copyright owner. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the official files or records of the patent and trademark office.
Technical Field
The present application relates to the field of video coding and decoding, and more particularly, to a video processing method and apparatus.
Background
The video encoding process includes an inter prediction process. There are various inter prediction modes, and some inter prediction modes construct a motion vector candidate list of the current block and determine a motion vector of the current block based on the motion vector candidate list of the current block.
In order to improve the inter prediction effect, some inter prediction modes may offset the motion vector in the motion vector candidate list of the current block to some extent in determining the motion vector of the current block.
Disclosure of Invention
The application provides a video processing method and device for optimizing an inter-frame prediction mode.
In a first aspect, a video processing method is provided, including: selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, wherein the motion vector candidate list comprises motion vectors of a plurality of candidate blocks; determining a first offset value for the current block; determining a second offset value for the current block when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors; offsetting a first motion vector of the target candidate block according to the first offset value; and offsetting a second motion vector of the target candidate block according to the second offset value.
In a second aspect, a video processing apparatus is provided, including: a memory for storing code; a processor to read code in the memory to perform the following operations: selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, wherein the motion vector candidate list comprises motion vectors of a plurality of candidate blocks; determining a first offset value for the current block; determining a second offset value for the current block when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors; offsetting a first motion vector of the target candidate block according to the first offset value; and offsetting a second motion vector of the target candidate block according to the second offset value.
In a third aspect, there is provided a video processing apparatus comprising means for performing the steps of the method of the first aspect.
In a fourth aspect, there is provided a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to perform the method of the first aspect.
In a fifth aspect, there is provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the first aspect.
The method and the device identify the specific candidate block and select the adaptive offset scheme for the specific candidate block, so that the inter-frame prediction mode can be optimized.
Drawings
Fig. 1 is a schematic flowchart of a merge candidate list building process.
Fig. 2 is an exemplary diagram of a scaling manner of a temporal motion vector.
Fig. 3 is an exemplary diagram of a prediction mode in low delay mode.
Fig. 4 is an exemplary diagram of a prediction mode in random access mode.
Fig. 5 is a schematic flow chart of a video processing method provided by an embodiment of the present application.
Fig. 6 is an exemplary diagram in which both reference frames of the current frame are located on the same side of the current frame.
Fig. 7 is an exemplary diagram of a scaling scheme provided by an embodiment of the present application.
Fig. 8 is a schematic structural diagram of a video processing apparatus according to an embodiment of the present application.
Detailed Description
The present application can be applied to various video coding standards, such as H.264, High Efficiency Video Coding (HEVC), Versatile Video Coding (VVC), the Audio Video coding Standard (AVS), AVS+, AVS2, and AVS3.
The distance between two frames mentioned in the present application may refer to the difference between the two frames in playing order: the larger the distance, the larger the difference in playing order. Since the playing order can be expressed by the frame number (picture order count, POC), in some embodiments the distance between two frames can be measured by the difference between their frame numbers.
The video coding process mainly comprises the steps of prediction, transformation, quantization, entropy coding, loop filtering and the like. Prediction is an important component of mainstream video coding techniques. Prediction can be divided into intra prediction and inter prediction. Inter prediction can be achieved by means of motion compensation. The motion compensation process is exemplified below.
For example, a frame of image may be first divided into one or more encoded regions. The coding region may also be referred to as a Coding Tree Unit (CTU). The CTU may be 64 × 64 or 128 × 128 in size (unit is a pixel, and similar description hereinafter is omitted). Each CTU may be divided into square or rectangular image blocks, and the image block currently being processed by the encoding side or the decoding side is hereinafter referred to as a current block. The current block mentioned in this embodiment of the present application may sometimes refer to a Coding Unit (CU) and may sometimes refer to a Prediction Unit (PU), which is not limited in this embodiment of the present application.
When inter prediction is performed on a current block, a block similar to the current block may be found in a reference frame (which may be a temporally nearby reconstructed frame) and used as the prediction block of the current block. The relative displacement between the current block and the similar block is called a motion vector. The process of finding a similar block in the reference frame as the prediction block of the current block is motion compensation.
The inter prediction modes may be various, and some inter prediction modes may construct a motion vector candidate list of the current block and select a motion vector of the current block from the motion vector candidate list of the current block. The following will exemplify a process of constructing a motion vector candidate list (in the merge mode, the motion vector candidate list may also be referred to as a merge candidate list).
As shown in fig. 1, the construction process of the merge candidate list includes steps S102 to S118.
In step S102, a spatial motion vector prediction (MVP), or spatial MVP candidate(s), is added to the merge candidate list.
The spatial MVP is a motion vector of a spatially neighboring block in the same frame as the current block. The maximum number of spatial MVPs may be set to 4.
In step S104, it is determined whether the number of candidates in the merge candidate list reaches a preset maximum value (maxNumMergeCand).
If the number of candidates in the merge candidate list reaches a preset maximum value, ending the process of fig. 1; if the number of candidates in the merge candidate list does not reach the preset maximum value, the process may continue to step S106.
In step S106, a temporal motion vector prediction (TMVP), or TMVP candidate(s), is added to the merge candidate list.
The TMVP added in step S106 may be determined based on a motion vector of a co-located block in a co-located frame of the current frame. Since the distance between the co-located frame and the reference frame of the co-located frame is usually different from the distance between the current frame and the reference frame of the current frame, the motion vector of the co-located block generally needs to be scaled before being added to the merge candidate list as the TMVP. The TMVP acquisition process is described in more detail below in conjunction with fig. 2.
In step S108, it is determined whether the number of candidates in the merge candidate list reaches a preset maximum value.
If the number of candidates in the merge candidate list reaches a preset maximum value, ending the process of fig. 1; if the number of candidates in the merge candidate list does not reach the preset maximum value, the process continues to step S110.
In step S110, a history-based MVP (HMVP), or HMVP candidate, is added to the merge candidate list.
The HMVP may be a motion vector of other blocks in the frame in which the current block is located (e.g., non-adjacent blocks to the current block).
In step S112, it is determined whether the number of candidates in the merge candidate list reaches a preset maximum value.
If the number of candidates in the merge candidate list reaches a preset maximum value, ending the process of fig. 1; if the number of candidates in the merge candidate list does not reach the preset maximum value, the process continues to step S114.
In step S114, a pairwise MVP is added to the merge candidate list.
The pairwise MVP may be an MVP obtained by averaging MVPs that have already been added to the merge candidate list.
In step S116, it is determined whether the number of candidates in the merge candidate list reaches a preset maximum value.
If the number of candidates in the merge candidate list reaches a preset maximum value, ending the process of fig. 1; if the number of candidates in the merge candidate list does not reach the preset maximum value, the process continues to step S118.
In step S118, a vector (0,0) is added to the merge candidate list until the number of candidates in the merge candidate list reaches a preset maximum value.
It should be understood that steps S108 to S118 are optional steps. For example, after adding the spatial MVP and TMVP to the merge candidate list, the construction of the merge candidate list may be stopped. As another example, after adding spatial MVP and TMVP to the merge candidate list, if the number of candidates in the merge candidate list has not yet reached a preset maximum, vector (0,0) may be added. For another example, after the spatial domain MVP and TMVP are added to the merge candidate list, if the number of candidates in the merge candidate list still does not reach the preset maximum value, the above steps S110 to S112 may be performed, but step S114 to step S116 are not performed; alternatively, if the number of candidates in the merge candidate list has not yet reached the preset maximum value, step S114 to step S116 are performed, but step S110 to step S112 are not performed.
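The order of steps S102 to S118 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function name build_merge_list, the default maximum of 6 candidates, and the choice to average only the first two list entries for the pairwise MVP are hypothetical, and no duplicate pruning is performed.

```python
from typing import List, Tuple

MV = Tuple[int, int]


def build_merge_list(spatial: List[MV], temporal: List[MV], history: List[MV],
                     max_num_merge_cand: int = 6) -> List[MV]:
    """Fill the merge candidate list in the order of fig. 1 (hypothetical helper)."""
    merge_list: List[MV] = []

    def add_all(cands: List[MV]) -> None:
        # Append candidates until the preset maximum is reached (S104/S108/S112).
        for mv in cands:
            if len(merge_list) >= max_num_merge_cand:
                return
            merge_list.append(mv)

    add_all(spatial[:4])   # S102: at most 4 spatial MVPs
    add_all(temporal)      # S106: TMVP (already scaled)
    add_all(history)       # S110: HMVP
    # S114: pairwise MVP, here assumed to average the first two entries.
    if 2 <= len(merge_list) < max_num_merge_cand:
        (x0, y0), (x1, y1) = merge_list[0], merge_list[1]
        merge_list.append(((x0 + x1) // 2, (y0 + y1) // 2))
    # S118: pad with the vector (0, 0) until the list is full.
    while len(merge_list) < max_num_merge_cand:
        merge_list.append((0, 0))
    return merge_list


example_list = build_merge_list([(4, 2), (2, 0)], [(6, 6)], [])
```

With two spatial candidates and one TMVP, the sketch appends one averaged candidate and then pads the remaining positions with (0, 0).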
The TMVP acquired at step S106 may refer to a motion vector of a candidate block obtained based on the TMVP technique. The following describes the acquisition process of TMVP with reference to fig. 2.
In the process of constructing the merge candidate list of the current block (cur_PU), if a TMVP needs to be added to the merge candidate list, the co-located frame (col_pic) of the current block is usually determined first. The co-located frame is a frame temporally different from the current frame. The co-located frame may be, for example, the first frame in a reference list of the current block. After the co-located frame is determined, a corresponding block of the current block may be determined from the co-located frame as the co-located block (col_PU) of the current block. For example, the block in the co-located frame that corresponds to position C1 (the lower right corner of the current block) or C0 (the center point of the current block) may be determined as the co-located block of the current block.
The TMVP may then be determined based on the motion vector of the co-located block. As shown in fig. 2, the distance between the co-located frame (col_pic) and the reference frame (col_ref) of the co-located frame is denoted by tb, and the distance between the current frame (cur_pic) and the reference frame (cur_ref) of the current frame is denoted by td. Since tb and td are usually different, the motion vector colMV of the co-located block is usually scaled first to obtain curMV, and curMV is then added to the merge candidate list as the TMVP. curMV can be calculated by the following equation: curMV = td × colMV / tb. For convenience of description, the scaling operation on the motion vector of the co-located block is hereinafter referred to as mapping the motion vector of the co-located block to a reference frame of the current frame.
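The scaling equation can be illustrated with a small Python sketch. The function name scale_mv and the use of exact fractions are assumptions made for readability; a real codec would typically use fixed-point arithmetic with rounding and clipping.

```python
from fractions import Fraction


def scale_mv(col_mv, td, tb):
    """Return curMV = td * colMV / tb, applied to each component."""
    if tb == 0:
        raise ValueError("tb must be non-zero")
    factor = Fraction(td, tb)
    return tuple(factor * c for c in col_mv)


# Hypothetical values: the co-located frame is 2 frames from its reference,
# the current frame is 1 frame from its reference.
cur_mv = scale_mv((8, -4), td=1, tb=2)
```

In this example the co-located motion vector (8, -4) is halved because the current frame is half as far from its reference frame as the co-located frame is from its own.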
When applying temporal motion vector prediction, if the current frame is a B frame (i.e. the current frame adopts bi-directional prediction mode), 2 scaling operations as shown in fig. 2 need to be performed, one for mapping the motion vector of the co-located block to the forward reference frame of the current block, and the other for mapping the motion vector of the co-located block to the backward reference frame of the current block.
It should be noted that the co-located block may be in a bi-directional prediction mode, and therefore the co-located block may have two motion vectors: one corresponding to the forward prediction mode of the co-located block (hereinafter referred to as col_MV0), and the other corresponding to the backward prediction mode of the co-located block (hereinafter referred to as col_MV1). When the above two scaling operations are performed, one possible implementation is to map col_MV0 to the forward reference frame and the backward reference frame of the current block, respectively; alternatively, col_MV0 may be mapped to the forward reference frame of the current block, and col_MV1 may be mapped to the backward reference frame of the current block; alternatively, col_MV1 may be mapped to the forward reference frame and the backward reference frame of the current block, respectively; alternatively, col_MV1 may be mapped to the forward reference frame of the current block, and col_MV0 may be mapped to the backward reference frame of the current block.
The process of mapping the motion vectors of the co-located blocks to the forward reference frame and the backward reference frame of the current frame will be described below with reference to fig. 3 and 4, respectively, by taking a low latency (low delay) mode and a random access (random access) mode as examples.
In low delay mode, the frame numbers of the reference frames of the current frame are all smaller than the frame number of the current frame; that is, all reference frames of the current frame precede the current frame in playing order.
Taking fig. 3 as an example, in low delay mode, the current frame (cur_pic) is POC 5, the first frame in the forward reference list is POC 4, and the first frame in the backward reference list is POC 4. In the example of fig. 3, POC 4 is simultaneously the co-located frame (col_pic), the forward reference frame (cur_ref0), and the backward reference frame (cur_ref1) of the current frame. The forward reference list of POC 4 is {POC 3, POC 2, POC 0}, and the backward reference list is {POC 3, POC 2, POC 0}. The forward reference frame of POC 4 may be any one of POC 3, POC 2, and POC 0; fig. 3 takes POC 3 as an example. Similarly, the backward reference frame of POC 4 may be any one of POC 3, POC 2, and POC 0; fig. 3 takes POC 2 as an example.
Fig. 3 (a) illustrates the scaling operation for the forward prediction mode in low delay mode, which maps the motion vector (col_MV0) of the co-located block (col_PU) of the current block (cur_PU) to the forward reference frame (POC 4) of the current block, resulting in the temporal motion vector prediction cur_MV0. Fig. 3 (b) illustrates the scaling operation for the backward prediction mode in low delay mode, which maps the motion vector (col_MV1) of the co-located block (col_PU) of the current block (cur_PU) to the backward reference frame (POC 4) of the current block, resulting in the temporal motion vector prediction cur_MV1.
In the random access mode, the frame number of the reference frame of the current frame may be larger than or smaller than the frame number of the current frame, that is, the reference frame of the current frame may be played after the current frame or before the current frame.
Taking fig. 4 as an example, in random access mode, the current frame is POC 27. The first frame of the forward reference list of the current frame is POC 26, and the first frame of the backward reference list is POC 28. In the example of fig. 4, POC 26 is simultaneously the co-located frame (col_pic) and the forward reference frame (cur_ref0) of the current frame; POC 28 is the backward reference frame (cur_ref1) of the current frame. The forward reference list of POC 26 is {POC 24, POC 16}, and the backward reference list is {POC 28, POC 32}. The backward reference frame of POC 26 may be POC 28 or POC 32. In the example of fig. 4, POC 32 is selected as the backward reference frame (col_ref1) of POC 26.
Fig. 4 (a) illustrates the scaling operation for the forward prediction mode in random access mode, which maps the motion vector (col_MV1) of the co-located block (col_PU) of the current block (cur_PU) to the forward reference frame (cur_ref0) of the current block, resulting in the temporal motion vector prediction cur_MV0. Fig. 4 (b) illustrates the scaling operation for the backward prediction mode in random access mode, which maps the motion vector (col_MV1) of the co-located block (col_PU) of the current block (cur_PU) to the backward reference frame (cur_ref1) of the current block, resulting in the temporal motion vector prediction cur_MV1.
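The two mappings in the examples of fig. 3 and fig. 4 can be worked through numerically. The sketch below is illustrative only: the helper map_col_mv, the co-located motion vector values, and the use of signed POC differences as distances (so that a reference frame after its frame gives a negative distance) are assumptions, not details taken from the figures.

```python
from fractions import Fraction


def map_col_mv(col_mv, cur_poc, cur_ref_poc, col_poc, col_ref_poc):
    """Scale a co-located MV to a reference frame of the current frame."""
    td = cur_poc - cur_ref_poc   # signed distance: current frame to its reference
    tb = col_poc - col_ref_poc   # signed distance: co-located frame to its reference
    return tuple(Fraction(td, tb) * c for c in col_mv)


# Hypothetical co-located motion vectors (not taken from the figures).
col_mv0, col_mv1 = (8, 4), (12, -6)

# Fig. 3 (low delay): cur_pic = POC 5, cur_ref0 = cur_ref1 = col_pic = POC 4,
# col_ref0 = POC 3, col_ref1 = POC 2.
ld_mv0 = map_col_mv(col_mv0, 5, 4, 4, 3)      # td = 1, tb = 1
ld_mv1 = map_col_mv(col_mv1, 5, 4, 4, 2)      # td = 1, tb = 2

# Fig. 4 (random access): cur_pic = POC 27, cur_ref0 = col_pic = POC 26,
# cur_ref1 = POC 28, col_ref1 = POC 32; col_MV1 is mapped in both directions.
ra_mv0 = map_col_mv(col_mv1, 27, 26, 26, 32)  # td = 1, tb = -6
ra_mv1 = map_col_mv(col_mv1, 27, 28, 26, 32)  # td = -1, tb = -6
```

Under the signed-distance assumption, the two temporal predictions in the random access example point in opposite directions, because the forward and backward reference frames lie on opposite sides of the current frame.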
In order to improve the inter prediction effect, some inter prediction modes may offset the motion vector in the motion vector candidate list of the current block to some extent in determining the motion vector of the current block. Merge mode with motion vector difference (MMVD) is an inter prediction technique that introduces an offset scheme on top of the conventional merge mode. The following takes MMVD as an example to illustrate the offset scheme for motion vectors.
The MMVD technique may also be referred to as the ultimate motion vector expression (UMVE) technique. The implementation process of MMVD mainly comprises the following two steps.
The first step: a base motion vector (base MV) is selected from the constructed merge candidate list. Typically, the first two motion vector predictions in the merge candidate list are selected as base motion vectors.
The second step: the base motion vector is offset according to a certain rule to generate new motion vector candidates, and prediction is performed using the new motion vectors.
For example, assuming that the base motion vector is (x, y) and the new motion vector obtained after the offset is (x', y'), the offset value used by the offset operation may be taken from a set of preconfigured offset values. For example, 8 selectable offset values (offsets) can be preconfigured, and each offset value can be applied in one of the following 4 ways:
- x' = x + offset, y' = y;
- x' = x - offset, y' = y;
- x' = x, y' = y + offset;
- x' = x, y' = y - offset.
Thus, offsetting the two base motion vectors yields 2 × 4 × 8 = 64 possible combinations. When a base motion vector uses the bi-directional prediction mode, its motion vectors in both reference directions need to be offset. If the forward reference frame and the backward reference frame of the base motion vector are the same frame, the same offset value can be used in both reference directions; if the forward reference frame and the backward reference frame of the base motion vector are different, the offset value for the motion vector in one of the reference directions needs to be scaled, and the scaled offset value is used to offset that motion vector.
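The count of 64 combinations can be checked by enumerating them. The following is a minimal sketch: the function name mmvd_candidates, the base motion vectors, and the eight offset values are hypothetical, since the description only states that 8 selectable offset values are preconfigured.

```python
from itertools import product


def mmvd_candidates(base_mvs, offsets):
    """Apply every offset in each of the 4 directions to every base MV."""
    # (+offset, 0), (-offset, 0), (0, +offset), (0, -offset)
    directions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    candidates = []
    for (x, y), (dx, dy), off in product(base_mvs, directions, offsets):
        candidates.append((x + dx * off, y + dy * off))
    return candidates


# Hypothetical inputs: 2 base MVs and 8 offset values in arbitrary units.
base_mvs = [(10, -3), (0, 5)]
offsets = [1, 2, 4, 8, 16, 32, 64, 128]
candidates = mmvd_candidates(base_mvs, offsets)
```

The resulting list has 2 × 4 × 8 = 64 entries; an encoder would evaluate these candidates and signal the chosen base index, direction, and offset index.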
The following describes embodiments of the present application in detail with reference to fig. 5. It should be understood that the method of fig. 5 may be applied to an encoding end and may also be applied to a decoding end, which is not limited by the embodiment of the present application.
Fig. 5 is a schematic flow chart of a video processing method provided by an embodiment of the present application. The method of fig. 5 includes steps S510 to S550.
In step S510, a motion vector of a target candidate block is selected from the motion vector candidate list of the current block.
The motion vector candidate list may include motion vectors of a plurality of candidate blocks. Taking the merge mode as an example, the motion vector candidate list mentioned in step S510 may be a merge candidate list. The construction process of the merge candidate list can be seen in fig. 1.
In some embodiments, the motion vector of the target candidate block may be referred to as a Base motion vector (Base MV) of the current block.
In the embodiment of the present application, the current frame where the current block is located may be a B frame. The prediction mode of the current block may be a bidirectional prediction mode.
In step S520, a first offset value of the current block is determined.
In step S530, when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors, a second offset value of the current block is determined.
The target candidate block may be a candidate block determined according to TMVP techniques. In other words, the candidate block determined based on the TMVP technique would be located in a different frame than the current block.
The two motion vectors of the target candidate block may be forward predicted and backward predicted, respectively. Alternatively, both motion vectors of the target candidate block may be uni-directional predicted. Taking the coding mode of the video as the random access mode as an example, the two motion vectors of the target candidate block may be forward prediction and backward prediction. Taking the encoding mode of the video as the low-latency mode as an example, both motion vectors of the target candidate block may be forward predicted.
The reference frames of the two motion vectors of the target candidate block may be the same or different.
In step S540, the first motion vector of the target candidate block is offset according to the first offset value.
The first offset value may include one or more selectable values. The first offset value may comprise, for example, 8 selectable values. The first offset value may be a preset offset value, or may be an offset value obtained after the preset offset value is subjected to other operations. The other operations may be a scaling operation, an inverting operation, and the like, which is not specifically limited in this embodiment of the application.
As an example, step S540 may include: offsetting the first motion vector of the target candidate block according to the unscaled first offset value.
It should be noted that the first offset value that is not scaled may be a preset offset value, or may be an offset value obtained by subjecting the preset offset value to other operation processing besides scaling operation, for example, an offset value obtained by subjecting the preset offset value to negation operation. It should be noted that the preset offset value may include a plurality of selectable values, where the negation operation may be to negate all of the plurality of selectable values, or to negate some of the plurality of selectable values, which is not limited in this application.
In step S550, the second motion vector of the target candidate block is offset according to the second offset value.
The embodiment of the application identifies the specific candidate block and selects the adaptive offset scheme for the specific candidate block, so that the inter-frame prediction mode can be optimized.
The second offset value may include one or more selectable values. The second offset value may comprise, for example, 8 selectable values. The second offset value may be a preset offset value, or an offset value obtained by performing other operations on the preset offset value. The other operations may be a scaling operation, an inverting operation, and the like, which is not specifically limited in this embodiment of the application.
It should be noted that the second offset value that is not scaled may be a preset offset value, or may be an offset value obtained by subjecting the preset offset value to other operation processing besides scaling operation, for example, an offset value obtained by subjecting the preset offset value to negation operation. It should be noted that the preset offset value may include a plurality of selectable values, where the negation operation may be to negate all of the plurality of selectable values, or to negate some of the plurality of selectable values, which is not limited in this application.
The first offset value and the second offset value may be offset values independent of each other. Alternatively, the second offset value may be an offset value derived from the first offset value without scaling. Alternatively, the second offset value may be derived by other means than scaling the first offset value. For example, the first offset value may be a preset offset value, and the second offset value may be an offset value obtained by partially or completely negating the first offset value.
The first offset value and the second offset value may be the same or different. For example, the first offset value and the second offset value may both be preset offset values. For another example, the first offset value may be a preset offset value, and the second offset value may be the opposite of the preset offset value. For another example, the second offset value may be a preset offset value, and the first offset value may be the opposite of the preset offset value. For another example, some of the selectable values in the first offset value and the second offset value are the same, and the others are opposites of each other.
For example, if the two motion vectors of the target candidate block are forward prediction and backward prediction, respectively, the first offset value and the second offset value may be the same.
For another example, if the two motion vectors of the target candidate block are both forward predicted, the first offset value and the second offset value may be opposites of each other.
Taking fig. 6 as an example, assuming that the frame number of the current frame is current POC, the frame number of the forward reference frame of the current frame is POC0, and the frame number of the backward reference frame of the current frame is POC1, if (current POC - POC0) × (current POC - POC1) > 0, the forward reference frame and the backward reference frame of the current frame are both before or both after the current frame in playing order. In this case, in some embodiments, the first offset value and the second offset value for the current block may be set to the same offset value.
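The same-side condition above can be written as a one-line check. The function name refs_on_same_side and the sample POC values are hypothetical and only illustrate the inequality.

```python
def refs_on_same_side(current_poc, poc0, poc1):
    """True if both reference frames are before, or both after, the current frame."""
    return (current_poc - poc0) * (current_poc - poc1) > 0


# Both references precede the current frame (as in a low delay configuration).
same_side = refs_on_same_side(5, 4, 3)
# One reference before and one after the current frame (random access).
opposite_sides = refs_on_same_side(27, 26, 28)
```

The product is positive only when the two differences have the same sign, which is exactly the case where both reference frames lie on one side of the current frame.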
As can be seen from the description of fig. 2, if the target candidate block is located in a different frame from the current block and the target candidate block has two motion vectors, two scaling operations may be required to add the two motion vectors of the target candidate block to the motion vector candidate list of the current block. In addition, if the two reference frames corresponding to the motion vector of the target candidate block are different, when the motion vector of the target candidate block is offset, a scaling operation may be further performed on the offset value. It follows that when the target candidate block and the current block are located in different frames, the target candidate block has two motion vectors, and the reference frames of the two motion vectors of the target candidate block are different, 3 scaling operations may occur. The 3 scaling operations increase the complexity of video processing and increase the burden on the codec system.
To avoid the above-mentioned problem caused by performing three scaling operations, when the target candidate block and the current block are located in different frames, the target candidate block has two motion vectors, and the reference frames of the two motion vectors of the target candidate block are different, both the first offset value and the second offset value may be set to unscaled offset values. In other words, in this case the scaling operation on the offset value can be omitted and the motion vectors of the target candidate block can be offset directly using the unscaled offset values, which simplifies the video processing procedure and reduces its complexity.
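The count of scaling operations discussed above can be tallied with a small sketch. The function and parameter names are hypothetical, and the model is deliberately simplified: it assumes one scaling per candidate motion vector when the candidate lies in a different frame, plus one more when the offset itself is scaled.

```python
def count_scaling_ops(different_frame: bool, num_mvs: int,
                      refs_differ: bool, scale_offset: bool) -> int:
    """Count scaling operations under the simplified model described above."""
    # Scaling each motion vector of a candidate taken from another frame.
    ops = num_mvs if different_frame else 0
    # One further scaling if the offset is scaled for the second motion vector.
    if num_mvs == 2 and refs_differ and scale_offset:
        ops += 1
    return ops


# Scaling the offset gives three operations; using unscaled offsets gives two.
with_offset_scaling = count_scaling_ops(True, 2, True, scale_offset=True)
without_offset_scaling = count_scaling_ops(True, 2, True, scale_offset=False)
```

Under these assumptions, dropping the offset scaling reduces the count from 3 to 2 for a bi-predicted candidate block located in a different frame.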
Optionally, in some embodiments, the method of fig. 5 may further include: determining the motion vector of the current block according to the offset first motion vector and the offset second motion vector. The motion vector of the current block is sometimes also referred to as the optimal motion vector of the current block. The motion vector of the current block may be determined according to a criterion such as rate-distortion cost, which is not limited in this embodiment of the present application.
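One common way to apply a rate-distortion criterion is to minimise a cost of the form D + λ·R over the candidate offset motion vector pairs. The sketch below assumes that form; the application does not specify the cost function, so `distortion`, `rate`, and `lam` are hypothetical inputs supplied by the caller.

```python
from typing import Callable, Iterable, Tuple

MV = Tuple[int, int]
MVPair = Tuple[MV, MV]  # (offset first motion vector, offset second motion vector)


def select_best_pair(candidates: Iterable[MVPair],
                     distortion: Callable[[MVPair], float],
                     rate: Callable[[MVPair], float],
                     lam: float) -> MVPair:
    """Return the candidate pair with the smallest cost D + lam * R."""
    return min(candidates, key=lambda pair: distortion(pair) + lam * rate(pair))
```

In an encoder the distortion would typically come from comparing the prediction with the original block, and the rate from the bits needed to signal the chosen offset; both are left abstract here.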
The above description takes as an example the case in which the target candidate block and the current block are located in different frames. When the target candidate block and the current block are located in the same frame, the method of fig. 5 may further include: scaling the second offset value according to the first offset value, and offsetting the second motion vector of the target candidate block by using the scaled second offset value.
The following description takes the case in which the target motion vector is a spatial MVP as an example. As shown in FIG. 7, the forward reference frame of the base motion vector of the current block (cur_PU) is cur ref pic 0, the backward reference frame is cur ref pic 1, the motion vector of the base motion vector with respect to the forward reference frame is MV0, and the motion vector with respect to the backward reference frame is MV1. When cur ref pic 0 and cur ref pic 1 are not the same frame, the two reference frames may be at different distances from the current frame. In this case, it is not reasonable to offset MV0 and MV1 using the same offset value (the offset values are denoted by offset 0 and offset 1 in fig. 7). For example, assuming that the offset value is 128, if an offset of 128 is applied in both reference directions, the MV corresponding to the reference frame that is closer to the current frame changes too much, which does not match the way objects move in natural video. Therefore, the offset value of the MV needs to be scaled.
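A minimal sketch of such scaling is given below, assuming simple linear scaling by the ratio of signed POC distances. The function name, the use of floating-point division with rounding, and the example distances are assumptions made for illustration; practical codecs typically use fixed-point arithmetic with clipping instead.

```python
def scale_offset(offset: int, dist_from: int, dist_to: int) -> int:
    """Linearly scale an offset defined for POC distance dist_from to dist_to.

    Distances are signed (current POC minus reference POC), so a reference on
    the opposite side of the current frame yields an offset of opposite sign.
    """
    if dist_from == 0:
        raise ValueError("dist_from must be non-zero")
    return int(round(offset * dist_to / dist_from))
```

For example, if an offset of 128 is defined for a reference frame 4 frames away, a reference frame only 1 frame away would receive an offset of 32 under this model, so the closer reference is not disturbed by the full 128.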
The video processing method according to the embodiment of the present application is described in detail above with reference to fig. 1 to 7, and the video processing apparatus according to the embodiment of the present application is described in detail below with reference to fig. 8. It should be understood that the descriptions of the method embodiments and the apparatus embodiments correspond to each other; therefore, for content that is not described in detail in the apparatus embodiments, reference may be made to the corresponding content in the foregoing method embodiments.
Fig. 8 is a schematic structural diagram of a video processing apparatus according to an embodiment of the present application. The video processing apparatus 800 of fig. 8 may be an encoder or a decoder. The video processing apparatus 800 may include a memory 810 and a processor 820.
The memory 810 may be used to store code.
The processor 820 may be used to read code in memory to perform the following operations: selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, wherein the motion vector candidate list comprises motion vectors of a plurality of candidate blocks; determining a first offset value for the current block; determining a second offset value for the current block when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors; offsetting a first motion vector of the target candidate block according to the first offset value; and offsetting a second motion vector of the target candidate block according to the second offset value.
Optionally, the second offset value is derived from the first offset value without scaling.
Optionally, the first offset value and the second offset value are the same.
Optionally, the processor 820 may be further configured to perform the following operations: when the target candidate block and the current block are located in the same frame, scaling the second offset value, and offsetting the second motion vector of the target candidate block by using the scaled second offset value.
Optionally, the target candidate block is determined according to TMVP techniques.
Optionally, the two motion vectors of the target candidate block are forward prediction and backward prediction, respectively.
Optionally, the first offset value and the second offset value are opposite numbers to each other.
Optionally, the encoding mode of the video is a random access mode.
Optionally, the two motion vectors of the target candidate block are forward prediction.
Optionally, the first offset value and the second offset value are the same.
Optionally, the encoding mode of the video is a low latency mode.
Optionally, the motion vector candidate list is a merge candidate list.
Optionally, the motion vector of the target candidate block is a base motion vector of the current block.
Optionally, the first offset value and/or the second offset value comprises a plurality of selectable values.
Optionally, the processor 820 may be further configured to perform the following operations: determining the motion vector of the current block according to the offset first motion vector and the offset second motion vector.
Optionally, the second offset value is obtained by other means than scaling the first offset value.
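The operations performed by the processor can be sketched end to end as follows. This is an illustrative sketch only: the `Candidate` data structure, the use of a POC value to identify the frame containing a block, and the optional `scale_second` callback for the same-frame case are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

MV = Tuple[int, int]  # hypothetical (horizontal, vertical) pair


@dataclass
class Candidate:
    mv0: MV                 # first motion vector
    mv1: Optional[MV]       # second motion vector, if the block has two
    frame_poc: int          # POC of the frame containing the candidate block


def add_offset(mv: MV, offset: MV) -> MV:
    return (mv[0] + offset[0], mv[1] + offset[1])


def offset_target_candidate(
    candidates: List[Candidate],
    index: int,
    cur_poc: int,
    first_offset: MV,
    second_offset: MV,
    scale_second: Optional[Callable[[MV], MV]] = None,
) -> Tuple[MV, Optional[MV]]:
    """Select the target candidate and offset its motion vector(s)."""
    target = candidates[index]                      # selection from the list
    mv0 = add_offset(target.mv0, first_offset)      # offset first motion vector
    if target.mv1 is None:
        return mv0, None
    # Same frame: optionally scale the second offset. Different frame: use it unscaled.
    if target.frame_poc == cur_poc and scale_second is not None:
        second_offset = scale_second(second_offset)
    return mv0, add_offset(target.mv1, second_offset)
```

A caller would typically try several selectable offset values with this function and keep the result that best satisfies its selection criterion.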
In the above embodiments, implementation may be realized in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a Digital Video Disk (DVD)), or a semiconductor medium (e.g., a Solid State Disk (SSD)), among others.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative. The division of the units is only a logical division, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The above description covers only specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims (32)

1. A video processing method, comprising:
selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, wherein the motion vector candidate list comprises motion vectors of a plurality of candidate blocks;
determining a first offset value for the current block;
determining a second offset value for the current block when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors;
offsetting a first motion vector of the target candidate block according to the first offset value;
and offsetting a second motion vector of the target candidate block according to the second offset value.
2. The method of claim 1, wherein the second offset value is derived from the first offset value without scaling.
3. The method of claim 1, wherein the first offset value and the second offset value are the same.
4. The method of claim 1, further comprising:
when the target candidate block and the current block are located in the same frame, scaling the second offset value, and offsetting the second motion vector of the target candidate block by using the scaled second offset value.
5. The method of claim 1, wherein the target candidate block is determined according to a Temporal Motion Vector Prediction (TMVP) technique.
6. The method of claim 1, wherein the two motion vectors of the target candidate block are forward prediction and backward prediction, respectively.
7. The method of claim 6, wherein the first offset value and the second offset value are opposite numbers of each other.
8. The method of claim 6, wherein the encoding mode of the video is a random access mode.
9. The method of claim 1, wherein the two motion vectors of the target candidate block are forward predicted.
10. The method of claim 9, wherein the first offset value and the second offset value are the same.
11. The method of claim 9, wherein the encoding mode of the video is a low latency mode.
12. The method of claim 1, wherein the motion vector candidate list is a merge candidate list.
13. The method of claim 1, wherein the motion vector of the target candidate block is a base motion vector of the current block.
14. The method of claim 1, wherein the first offset value and/or the second offset value comprises a plurality of selectable values.
15. The method of claim 1, further comprising:
determining the motion vector of the current block according to the offset first motion vector and the offset second motion vector.
16. The method of claim 1, wherein the second offset value is derived by means other than scaling the first offset value.
17. A video processing apparatus, comprising:
a memory for storing code;
a processor to read code in the memory to perform the following operations:
selecting a motion vector of a target candidate block from a motion vector candidate list of a current block, wherein the motion vector candidate list comprises motion vectors of a plurality of candidate blocks;
determining a first offset value for the current block;
determining a second offset value for the current block when the target candidate block and the current block are located in different frames and the target candidate block has two motion vectors;
offsetting a first motion vector of the target candidate block according to the first offset value;
and offsetting a second motion vector of the target candidate block according to the second offset value.
18. The apparatus of claim 17, wherein the second offset value is derived from the first offset value without scaling.
19. The apparatus of claim 17, wherein the first offset value and the second offset value are the same.
20. The apparatus of claim 17, wherein the processor is further configured to:
when the target candidate block and the current block are located in the same frame, scaling the second offset value, and offsetting the second motion vector of the target candidate block by using the scaled second offset value.
21. The apparatus of claim 17, wherein the target candidate block is determined according to a Temporal Motion Vector Prediction (TMVP) technique.
22. The apparatus of claim 17, wherein the two motion vectors of the target candidate block are forward prediction and backward prediction, respectively.
23. The apparatus of claim 22, wherein the first offset value and the second offset value are opposite numbers of each other.
24. The apparatus of claim 22, wherein the encoding mode of the video is a random access mode.
25. The apparatus of claim 17, wherein the two motion vectors of the target candidate block are forward predicted.
26. The apparatus of claim 25, wherein the first offset value and the second offset value are the same.
27. The apparatus of claim 25, wherein the encoding mode of the video is a low latency mode.
28. The apparatus of claim 17, wherein the motion vector candidate list is a merge candidate list.
29. The apparatus of claim 17, wherein the motion vector of the target candidate block is a base motion vector of the current block.
30. The apparatus of claim 17, wherein the first offset value and/or the second offset value comprises a plurality of selectable values.
31. The apparatus of claim 17, wherein the processor is further configured to:
determining the motion vector of the current block according to the offset first motion vector and the offset second motion vector.
32. The apparatus of claim 17, wherein the second offset value is derived by means other than scaling the first offset value.
CN201980009149.7A 2019-06-25 2019-06-25 Video processing method and device Active CN111684799B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/092751 WO2020258024A1 (en) 2019-06-25 2019-06-25 Video processing method and device

Publications (2)

Publication Number Publication Date
CN111684799A true CN111684799A (en) 2020-09-18
CN111684799B CN111684799B (en) 2023-07-25

Family

ID=72451465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980009149.7A Active CN111684799B (en) 2019-06-25 2019-06-25 Video processing method and device

Country Status (2)

Country Link
CN (1) CN111684799B (en)
WO (1) WO2020258024A1 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2855027C (en) * 2011-11-08 2017-01-24 Kt Corporation A technique for encoding and decoding video by interpolating a reference picture block by applying different interpolation tap filters in vertical and horizontal directions to thereference block
US10200710B2 (en) * 2012-07-02 2019-02-05 Samsung Electronics Co., Ltd. Motion vector prediction method and apparatus for encoding or decoding video
EP3609184B1 (en) * 2017-07-04 2024-06-12 Samsung Electronics Co., Ltd. Image encoding method and apparatus, and image decoding method and apparatus

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JINGYA LI,RU-LING LIAO,CHONG SOON LIM: "Non-CE4: MMVD scaling simplification", 《JVET OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11》 *
SEUNGSOO JEONG,MIN WOO PARK,CHANYUL KIM: "CE4 Ultimate motion vector expression in J0024", 《JVET OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11》 *
XU CHEN,JIANHUA ZHENG: "Non-CE4: MMVD simplification", 《JVET OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112565753A (en) * 2020-12-06 2021-03-26 浙江大华技术股份有限公司 Method and apparatus for determining motion vector difference, storage medium, and electronic apparatus
CN112565753B (en) * 2020-12-06 2022-08-16 浙江大华技术股份有限公司 Method and apparatus for determining motion vector difference, storage medium, and electronic apparatus
CN115086678A (en) * 2022-08-22 2022-09-20 北京达佳互联信息技术有限公司 Video encoding method and device, and video decoding method and device

Also Published As

Publication number Publication date
WO2020258024A1 (en) 2020-12-30
CN111684799B (en) 2023-07-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant