CN111970516B

CN111970516B - Inter-frame prediction method, video encoding method, electronic device and storage device

Info

Publication number: CN111970516B
Application number: CN202010712410.4A
Authority: CN
Inventors: 张政腾; 江东; 方瑞东; 陈瑶; 林聚财
Original assignee: Zhejiang Dahua Technology Co Ltd
Current assignee: Zhejiang Dahua Technology Co Ltd
Priority date: 2020-07-22
Filing date: 2020-07-22
Publication date: 2022-02-18
Anticipated expiration: 2040-07-22
Also published as: CN111970516A

Abstract

The application discloses an inter-frame prediction method, a video coding method, electronic equipment and a storage device, wherein the inter-frame prediction method comprises the following steps: dividing a current block into a plurality of sub-block sets; wherein each sub-block set comprises a plurality of sub-blocks; obtaining a second motion vector of the sub-block based on the first motion vector of the current block; wherein the first motion vector is obtained by using a preset prediction mode; for at least one set of sub-blocks: respectively taking the second motion vector of at least one sub-block as a starting point, and acquiring the optimal offset vector of at least one sub-block by utilizing a preset motion search mode; predicting to obtain the optimal offset vector of other sub-blocks in the sub-block set by using the optimal offset vector of at least one sub-block; based on the second motion vector of the sub-block and the best offset vector, a final motion vector of the corresponding sub-block is determined. According to the scheme, the efficiency of inter-frame prediction can be improved.

Description

Inter-frame prediction method, video encoding method, electronic device and storage device

Technical Field

The present application relates to the field of video coding technologies, and in particular, to an inter-frame prediction method, a video coding method, an electronic device, and a storage apparatus.

Background

The video coding technology aims to solve the video compression problem so as to improve the video transmission and storage efficiency. Currently, video coding techniques include: H.264/AVC (Advanced Video Coding), H.265/HEVC (High Efficiency Video Coding), H.266/VVC (Versatile Video Coding), and the like. Among these techniques, inter-frame prediction is a very critical functional module, and its prediction efficiency has a significant impact on video coding performance. Therefore, how to improve the efficiency of inter-frame prediction is an urgent issue to be studied.

Disclosure of Invention

The technical problem mainly solved by the application is to provide an inter-frame prediction method, a video coding method, an electronic device and a storage device, which can improve the efficiency of inter-frame prediction.

In order to solve the above problem, a first aspect of the present application provides an inter prediction method, including: dividing a current block into a plurality of sub-block sets; wherein each sub-block set comprises a plurality of sub-blocks; obtaining a second motion vector of the sub-block based on the first motion vector of the current block; wherein the first motion vector is obtained by using a preset prediction mode; for at least one set of sub-blocks: respectively taking the second motion vector of at least one sub-block as a starting point, and acquiring the optimal offset vector of at least one sub-block by utilizing a preset motion search mode; predicting to obtain the optimal offset vector of other sub-blocks in the sub-block set by using the optimal offset vector of at least one sub-block; based on the second motion vector of the sub-block and the best offset vector, a final motion vector of the corresponding sub-block is determined.

In order to solve the above problem, a second aspect of the present application provides a video encoding method, including: obtaining the final motion vector of each sub-block in each sub-block set of the current block; encoding the current block by using the final motion vector of each sub-block; wherein the final motion vector of the sub-block is obtained by using the inter prediction method in the first aspect.

In order to solve the above problem, a third aspect of the present application provides an electronic device, which includes a memory and a processor coupled to each other, the memory storing program instructions, and the processor being configured to execute the program instructions to implement the inter-prediction method in the first aspect or implement the video coding method in the second aspect.

In order to solve the above problem, a fourth aspect of the present application provides a storage device storing program instructions executable by a processor, the program instructions being for implementing the inter-frame prediction method in the above first aspect or implementing the video coding method in the above second aspect.

In the above solution, a current block is divided into a plurality of sub-block sets, each sub-block set includes a plurality of sub-blocks, and a second motion vector of a sub-block is obtained based on a first motion vector of the current block, and the first motion vector is obtained by using a preset prediction mode, so as to, for at least one sub-block set: respectively taking the second motion vector of at least one sub-block as a starting point, and acquiring the optimal offset vector of at least one sub-block by utilizing a preset motion search mode; predicting to obtain the optimal offset vector of other sub-blocks in the sub-block set by using the optimal offset vector of at least one sub-block; and determining the final motion vector of the corresponding sub-block based on the second motion vector and the optimal offset vector of the sub-block, and further searching the optimal offset vector of at least one sub-block for each sub-block set, wherein other sub-blocks can be obtained through prediction, so that the complexity of inter-frame prediction can be reduced, and the inter-frame prediction efficiency is improved.

Drawings

FIG. 1 is a flowchart illustrating an embodiment of an inter prediction method according to the present application;

FIG. 2 is a schematic diagram of an embodiment of partitioning a current block into sub-blocks;

FIG. 3 is a diagram illustrating an embodiment of a default motion search mode;

FIG. 4 is a diagram of another embodiment of a preset motion search mode;

FIG. 5 is a schematic diagram of an embodiment of a sub-pixel search phase;

FIG. 6 is a flowchart illustrating an embodiment of step S13 in FIG. 1;

FIG. 7 is a diagram of another embodiment of a preset motion search mode;

FIG. 8 is a schematic flow chart illustrating another embodiment of step S13 in FIG. 1;

FIG. 9 is a schematic diagram of one embodiment of a corresponding process;

FIG. 10 is a schematic diagram of another embodiment of a corresponding process;

FIG. 11 is a schematic diagram of yet another embodiment of a corresponding process;

FIG. 12 is a flowchart illustrating an embodiment of a video encoding method;

FIG. 13 is a state diagram of an embodiment of video encoding;

FIG. 14 is a block diagram of an embodiment of an inter prediction apparatus;

FIG. 15 is a block diagram of an embodiment of a video encoding apparatus;

FIG. 16 is a block diagram of an embodiment of an electronic device;

FIG. 17 is a block diagram of an embodiment of a memory device.

Detailed Description

The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.

In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present application.

The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship. Further, the term "plurality" herein means two or more than two.

Referring to fig. 1, fig. 1 is a flowchart illustrating an inter prediction method according to an embodiment of the present application. Specifically, the method may include the steps of:

step S11: dividing a current block into a plurality of sub-block sets; wherein each sub-block set comprises a plurality of sub-blocks.

In the embodiment of the present disclosure, the size of the sub-blocks may be set according to actual application requirements, for example, 8 × 8.

In one implementation scenario, the width and height of the current block are both greater than 8, and the product of width and height is not less than 128.

In an implementation scenario, in order to improve the accuracy of inter prediction, the size of the sub-block may be set to be the minimum size of inter prediction, for example, in AVS (Audio Video coding Standard, digital Audio Video coding Standard), the size of the sub-block may be set to 4 × 4, and other cases may be set according to actual situations, which is not illustrated herein.

In one implementation scenario, the number of sub-block sets may be 1,2, 3, 4, etc., and is not limited herein.

In an implementation scenario, each sub-block set may include 2,3, 4, and so on, which are not limited herein. In a specific implementation scenario, the sub-block set may be a rectangle, and each sub-block set may include the following sub-blocks: 2 × 2,3 × 3, 4 × 4, etc., without limitation.

In a specific implementation scenario, please refer to fig. 2 in combination, fig. 2 is a schematic diagram of an embodiment of dividing a current block into sub-blocks, as shown in fig. 2, the current block may be firstly divided into 4 sub-block regions, each of which may be respectively used as a sub-block set, specifically, if the width W of the current block is less than 16, the width of the sub-block region is W, and if the height H of the current block is less than 16, the height H of the sub-block region is H. After the sub-blocks are obtained by dividing, the sub-block region may be further divided into a plurality of sub-blocks, as shown in fig. 2, the sub-block region shown by the black bold line may be divided into 4 × 4 sub-blocks. In other implementation scenarios, other partitioning manners may also be adopted according to the actual application situation, and are not limited herein.

Step S12: obtaining a second motion vector of the sub-block based on the first motion vector of the current block; wherein the first motion vector is obtained by using a preset prediction mode.

In an implementation scenario, the preset prediction mode may include, but is not limited to, a Merge mode, and may be specifically set according to an actual application requirement, which is not limited herein. Specifically, a Motion Vector (MV) candidate list may be constructed for the current block, and an optimal Motion Vector may be selected from the MV candidate list as the first Motion Vector of the current block, which is not described herein again in detail.

In an implementation scenario, the second motion vector of the sub-block may be obtained by specifically combining the first motion vector of the current block and the relative position of the sub-block in the current block. With reference to fig. 2, as shown in fig. 2, if the first motion vector of the current block is (1,1), and the size of the sub-block is 4 × 4, the second motion vector of the sub-block in the first row and the first column is (1,1) + (0,0) ═ 1,1), and similarly, the second motion vector of the sub-block in the first row and the second column is (1,1) + (4,0) — (5,1), the second motion vector of the sub-block in the first row and the first column is (1,1) + (0,4) — (1,5), and the second motion vector of the sub-block in the second row and the second column is (1,1) + (4,4) — 5, 5.

In one implementation scenario, where the current block is located in a B-frame, the first motion vector may specifically comprise a forward motion vector pointing to the forward reference frame of the B-frame and a backward motion vector pointing to the backward reference frame of the B-frame, and accordingly, the second motion vector of the sub-block may specifically comprise a forward motion vector pointing to the forward reference frame of the B-frame and a backward motion vector pointing to the backward reference frame of the B-frame. Specifically, a forward motion vector in the second motion vector of the sub-block may be obtained based on a forward motion vector in the first motion vector of the current block, and a backward motion vector in the second motion vector of the sub-block may be obtained based on a backward motion vector in the first motion vector of the current block. Specifically, the forward reference frame refers to a frame whose POC (Picture Order Count) is located before the frame where the current block is located, and the backward reference frame refers to a frame whose POC is located after the frame where the current block is located.

Step S13: for at least one set of sub-blocks: respectively taking the second motion vector of at least one sub-block as a starting point, and acquiring the optimal offset vector of at least one sub-block by utilizing a preset motion search mode; and predicting to obtain the optimal offset vector of other sub-blocks in the sub-block set by using the optimal offset vector of at least one sub-block.

In the embodiment of the present disclosure, for each subblock set, the following steps are performed: respectively taking the second motion vector of at least one sub-block as a starting point, and acquiring the optimal offset vector of at least one sub-block by utilizing a preset motion search mode; predicting to obtain the optimal offset vector of other sub-blocks in the sub-block set by using the optimal offset vector of at least one sub-block; based on the second motion vector of the sub-block and the best offset vector, a final motion vector of the corresponding sub-block is determined. Therefore, for each subblock set, only one subblock can be subjected to motion search to obtain the optimal offset vector of the subblock, and other subblocks only need to be subjected to prediction to obtain the optimal offset vector of the subblock, so that the complexity of inter-frame prediction can be effectively reduced, and the efficiency of inter-frame prediction can be improved. In addition, the prediction mode is adopted for each subblock set, so that the accuracy of inter-frame prediction is improved, and the performance of subsequent video coding can be improved.

In an implementation scenario, the preset motion search mode may specifically include two stages: the method comprises an integer pixel search stage and a sub-pixel search stage, wherein the integer pixel search stage specifically includes but is not limited to: regular search, et (early termination) search. Taking the frame where the current block is located as the B frame as an example, the process of obtaining the optimal offset vector of the sub-block in the integer pixel search stage by using the conventional search and the ET search will be specifically described.

In a specific implementation scenario, please refer to fig. 3 in combination, fig. 3 is a schematic diagram of an embodiment of a preset motion search method, as shown in fig. 3, a first starting pixel point (e.g., point a1 filled with oblique lines in fig. 3) pointed by a forward motion vector in a second motion vector may be determined, a second starting pixel point (e.g., point a2 filled with oblique lines in fig. 3) pointed by a backward motion vector in the second motion vector is determined, a pixel point in a preset region (e.g., region 5 × 5 in fig. 3) centered around a first starting pixel point (e.g., point a1 in fig. 3) is used as a first candidate pixel point (e.g., black in the left image in fig. 3), and a second candidate pixel point matched with the first candidate pixel point is selected in a dot region (e.g., region 5 × 5 in fig. 3) centered around a second starting pixel point (e.g., point a2 in fig. 3), specifically, the second candidate pixel point matched with the first candidate pixel point satisfies the following conditions: the deviation distance from the second candidate pixel point to the second initial pixel point is the same as the deviation distance from the first candidate pixel point to the first initial pixel point, and the deviation direction from the second candidate pixel point to the second initial pixel point is opposite to the deviation direction from the first candidate pixel point to the first initial pixel point. For example, in fig. 3, the deviation distance from the first candidate pixel C to the first initial pixel a1 is the same as the deviation distance from the second candidate pixel D to the second initial pixel a2, and the deviation direction from the first candidate pixel C to the first initial pixel a1 is opposite to the deviation from the second candidate pixel D to the second initial pixel a2, so that the second candidate pixel D is the second candidate pixel matched with the first candidate pixel C, and so on. In addition, the preset region may also be set as another region according to an actual situation, for example, in the AVS, 21 pixel regions remaining after pixels at the upper left corner, the upper right corner, the lower left corner, and the lower right corner are removed from the 5 × 5 region shown in fig. 3 may also be used as the preset region, and other switches may be set according to an actual need, which is not illustrated herein. After obtaining the second candidate pixel point matched with the first candidate pixel point, a difference between a region where the first candidate pixel point is a vertex and has a size of a predetermined size (e.g., the same size as the sub-block) and a region where the matched second candidate pixel point is a vertex and has a size of a predetermined size (e.g., the same size as the sub-block) may be respectively used as a pixel difference corresponding to the first candidate pixel point. For example, the pixel value difference between the region (i.e., the left dotted line region in fig. 3) in fig. 3, which has the first pixel point C as the vertex and has the same size as the sub-block, and the region (i.e., the right dotted line region in fig. 3), which has the second pixel point D matched with the first pixel point C as the vertex and has the same size as the sub-block, can be used as the pixel difference corresponding to the first candidate pixel point C. Specifically, the pixel difference may be calculated using SAD (Sum of Absolute Differences). After the pixel differences corresponding to all the first pixel points are obtained through calculation, the first candidate pixel point corresponding to the minimum value of the pixel differences can be used as a final target pixel point, and the optimal offset vector of the sub-block in the whole pixel searching stage is determined based on the deviation between the final target pixel point and the first candidate pixel point. For example, when point C in fig. 3 is the final target pixel point, the deviation of point a from the first starting pixel point is (-1, -1), the optimal offset vector of the sub-block in the whole pixel search stage is (-1, -1) in the forward reference frame, and the optimal offset vectors in the backward reference frame are the same in magnitude and opposite in direction, that is, (1, 1). Other cases may be analogized, and no one example is given here.

In another specific implementation scenario, please refer to fig. 4 in combination, and fig. 4 is a schematic diagram of another embodiment of the preset motion search mode. Specifically, a first initial pixel point pointed by a forward motion vector in a forward reference frame can be determined, and a second initial pixel point pointed by a backward motion vector in a backward reference frame can be determined; taking a pixel point in a preset area with the first initial pixel point as a center as a first candidate pixel point, and selecting a second candidate pixel point matched with the first candidate pixel point in the preset area with the second initial pixel point as the center; and respectively taking the difference between the area which takes the first candidate pixel point as the peak and has the size as the preset size and the area which takes the matched second candidate pixel point as the peak and has the size as the preset size as the pixel difference corresponding to the first candidate pixel point. The above steps may specifically refer to the foregoing description, and are not repeated herein. During each round of searching, the first candidate pixel point corresponding to the minimum value of the pixel difference between P0 and the first candidate pixel point adjacent to P0 (e.g., in fig. 4, P1 above P0, P2 below P0, P3 on the right side of P0, and P4 on the left side of P0) may be used as the first target pixel point, and if the first target pixel point is the center pixel point, the whole pixel searching stage may be ended, and the first target pixel point may be used as the final target pixel point; if the first target pixel is not the center pixel, the first target pixel may be used to check other nearby first candidate pixels (e.g., P5 in fig. 4), and then the first candidate pixel corresponding to the minimum pixel difference value in P1 to P5 is used as a new center pixel to perform a new round of search. After the final target pixel point is obtained, the optimal offset vector of the sub-block in the whole pixel search stage is determined based on the deviation between the final target pixel point and the first starting pixel point, which may be referred to specifically in the foregoing description, and is not described herein again.

In another specific implementation scenario, the sub-pixel search stage may specifically be obtained by calculating a pixel difference corresponding to a first candidate pixel point adjacent to a target pixel point, please refer to fig. 5, fig. 5 is a schematic diagram of an embodiment of the sub-pixel search stage, as shown in fig. 5, a two-dimensional coordinate system is established with a first starting pixel point a1 as an origin, a horizontal right direction is a positive direction of an X axis, a vertical upward direction is a positive direction of a Y axis, and the two-dimensional coordinate system is represented by two-dimensional coordinates, then a1 is (0,0), in the whole pixel search stage, a determined final target pixel point E may be (1,1), a first candidate pixel point above and above the adjacent position thereof may be (1,2), a first candidate pixel point below and below the adjacent position thereof may be (1,0), a first candidate pixel point on the left side and a first candidate pixel point on the adjacent right side thereof may be (2, 1). In addition, for convenience of describing the pixel difference corresponding to the first candidate pixel point, SAD [ ] [ ], and brackets [ ] [ ] represent the two-dimensional coordinates corresponding to the first candidate pixel point, then the component dMvX in the X-axis direction and the component dMvY in the Y-axis direction of the optimal offset vector of the sub-block in the sub-pixel search stage can be represented as:

and when the two-dimensional coordinates of the final target pixel point are in other situations, the analogy can be performed, and the examples are not repeated.

In yet another specific implementation scenario, after obtaining the optimal offset vector in the integer-pixel search stage and the optimal offset vector in the fractional-pixel search stage, the two may be added as the final offset vector of the sub-block. In addition, the optimal offset vector comprises a forward offset vector pointing to the front reference frame and a backward offset vector pointing to the back reference frame, the forward offset vector and the backward offset vector are equal in size and opposite in direction, and in the integral pixel searching stage and the sub-pixel searching stage, only the forward offset vector or the backward offset vector can be calculated, and the other can be obtained by directly taking the inverse.

In one implementation scenario, in order to improve the prediction speed, the optimal offset vector obtained by using the preset motion search method may be directly used as the optimal offset vector of other sub-blocks in the sub-block set. Referring to fig. 2, for example, for the sub-block set shown by the black bold line, the optimal offset vector of the sub-block located in the first row and the first column may be determined by using a preset motion search method, and the optimal offset vector may be directly used as the optimal offset vector of other sub-blocks in the sub-block set shown by the black bold line in order to increase the prediction speed. Other cases may be analogized, and are not limited herein.

In another implementation scenario, in order to improve the prediction accuracy, the optimal offset vectors of other sub-blocks may also be predicted by using relative positions between the other sub-blocks in the sub-block set and the sub-block for determining the optimal offset vector, which is not described herein for the sake of detail.

Step S14: based on the second motion vector of the sub-block and the best offset vector, a final motion vector of the corresponding sub-block is determined.

In this embodiment of the present disclosure, a sum of the second motion vector of the sub-block and the optimal offset vector may be specifically used as the final motion vector of the sub-block. In a specific implementation scenario, when the frame where the current block is located is a B frame, the optimal offset vector of the sub-block may specifically include a forward offset vector pointing to a forward reference frame of the B frame and a backward offset vector pointing to a backward reference frame of the B frame, and the forward offset vector and the backward offset vector are equal in magnitude and opposite in direction, for example, the forward offset vector is (1,1), and the backward offset vector is (-1, -1), or the forward offset vector is (2,3), and the backward offset vector is (-2, -3), and the other cases may be similar, which is not exemplified herein. For convenience of description, the forward offset vector may be denoted as deltaMV, the backward offset vector may be denoted as deltaMV, the forward motion vector pointing to the forward reference frame in the second motion vector may be denoted as MV0, and the backward motion vector pointing to the backward reference frame may be denoted as MV1, and then the forward final motion vector MV0_ final pointing to the forward reference frame in the final motion vector and the backward final motion vector MV1_ final pointing to the backward reference frame in the final motion vector may be denoted as:

in one implementation scenario, when there is a sub-block set that has not undergone the above-mentioned processing in step S13, the best offset vector of the sub-block within the sub-block set may be considered to be 0.

In a specific implementation scenario, the current block may be divided into a sub-block set, that is, the current block may be directly regarded as a sub-block set, and the sub-block set includes a plurality of sub-blocks, for example, may include 4, 9, etc., thus, the second motion vector of the sub-block can be obtained based on the first motion vector of the current block, which is not described herein again, and for this unique sub-block set, that is, for the current block, the best offset vector of at least one sub-block can be obtained by using a preset motion search mode with at least one second motion vector as a starting point, and using the best offset vector of at least one sub-block to predict the best offset vectors of other sub-blocks in the unique sub-block set (i.e. the current block), thereby determining a final motion vector of the corresponding sub-block based on the second motion vector of the sub-block and the best offset vector.

Referring to fig. 6, fig. 6 is a schematic flowchart illustrating an embodiment of step S13 in fig. 1. Specifically, fig. 6 is a flowchart illustrating an embodiment of obtaining an optimal offset vector of at least one sub-block by using a predetermined motion search method with a second motion vector of the at least one sub-block as a starting point. In the disclosed embodiment, the current block is located in the B frame, and the second motion vector includes a forward motion vector pointing to a reference frame before the B frame and a backward motion vector pointing to a reference frame after the B frame. The method specifically comprises the following steps:

step S61: and determining a first initial pixel point pointed by the forward motion vector in the forward reference frame, and determining a second initial pixel point pointed by the backward motion vector in the backward reference frame.

Reference may be made to the related description in the foregoing disclosed embodiments, and details are not repeated herein.

Step S62: and taking the pixel points in the preset area with the first initial pixel point as the center as first candidate pixel points, and selecting second candidate pixel points matched with the first candidate pixel points in the preset area with the second initial pixel points as the center.

Step S63: and respectively taking the difference between the area which takes the first candidate pixel point as the peak and has the size as the preset size and the area which takes the matched second candidate pixel point as the peak and has the size as the preset size as the pixel difference corresponding to the first candidate pixel point.

Step S64: searching a final target pixel point in the first candidate pixel points; and the pixel difference corresponding to the final target pixel point is smaller than the pixel difference corresponding to any first candidate pixel point adjacent to the final target pixel point.

In the embodiment of the present disclosure, the final target pixel point needs to satisfy that the corresponding pixel difference is smaller than the pixel difference corresponding to any one of the first candidate pixel points adjacent to the final target pixel point. Specifically, the calculation manner of the pixel difference may refer to the related description in the foregoing disclosed embodiments, and is not repeated herein. In addition, the adjacency may specifically include: adjacent left side, adjacent right side, adjacent top, adjacent bottom. Therefore, in the embodiment of the present disclosure, the final target pixel point is set to have a corresponding pixel difference smaller than a pixel difference corresponding to any of the first candidate pixel points adjacent to the final target pixel point, so that the first candidate pixel point located at the corner in the preset region is not missed in the search process, and therefore, the accuracy of motion search can be improved, and the performance of subsequent video coding can be improved.

In an implementation scenario, the first initial pixel point may be specifically used as a central pixel point, so that a pixel point corresponding to the minimum value in the corresponding pixel difference among the central pixel point and the first candidate pixel points adjacent to the central pixel point is used as a first target pixel point, if the first target pixel point is the central pixel point, the first target pixel point may be directly used as a final target pixel point, otherwise, the first target pixel point may be used as a new central pixel point, and the pixel point corresponding to the minimum value in the corresponding pixel difference among the central pixel point and the first candidate pixel points adjacent to the central pixel point is used as the first target pixel point, and subsequent steps are performed after the new step.

In a specific implementation scenario, please refer to fig. 7 in combination, and fig. 7 is a schematic diagram of another embodiment of the preset motion search mode. As shown in fig. 7, each triangle represents a first candidate pixel, taking AVS as an example, the preset region is a region formed after upper-left, lower-left, upper-right, and lower-right pixels are removed from a 5 × 5 region, and the preset region may also be set according to an actual situation in other scenes, such as the 5 × 5 region shown in fig. 3, and so on, which are not illustrated here. Please refer to the left view of fig. 7, first start pixel (triangle filled with oblique lines) is taken as the center pixel, the pixel corresponding to the minimum value of the corresponding pixel difference between the center pixel and the first candidate pixel (triangle filled with grid) adjacent to the center pixel is taken as the first target pixel (the first candidate pixel indicated by arrow), please refer to the middle view of fig. 7 again, since the first target pixel is not the center pixel, it can be taken as the new center pixel (triangle filled with oblique lines), and the pixel corresponding to the minimum value of the corresponding pixel difference between the center pixel and the first candidate pixel (triangle filled with grid) adjacent to the center pixel is taken as the first target candidate pixel (the first candidate pixel indicated by arrow), please refer to the right view of fig. 7 finally, the first target pixel point is not a central pixel point, and can be used as a new central pixel point (triangle filled with oblique lines), and a pixel point corresponding to the minimum value of the pixel difference between the central pixel point and a first candidate pixel point (triangle filled with grids) adjacent to the central pixel point is used as a first target candidate pixel point (the first candidate pixel point indicated by the arrow), and the first target candidate pixel point can be used as a final target pixel point because the first target candidate pixel point satisfies that the corresponding pixel difference is smaller than the pixel difference corresponding to any first candidate pixel point adjacent to the first target pixel point. As can be seen from the above search process, the search can prevent the first candidate pixel point located at the corner in the preset region from being omitted, so that the accuracy of motion search can be improved, and the performance of subsequent video coding can be improved.

Step S65: and acquiring the optimal offset vector of the second sub-block based on the final target pixel point.

In the embodiment of the present disclosure, the optimal offset vector corresponding to the whole-pixel search stage may be determined according to the relative position of the final target pixel point in the preset region, and the optimal offset vector corresponding to the sub-pixel search stage may be obtained based on the pixel difference between the final target pixel point and the first candidate pixel point adjacent to the final target pixel point. The optimal offset vector of the second sub-block is the sum of the optimal offset vector in the integer-pixel search stage and the optimal offset vector in the sub-pixel search stage, which may specifically refer to the related description in the foregoing disclosed embodiments, and is not described herein again.

In an implementation scenario, when the final target pixel exceeds or is located at an edge position of the preset region, the processing step in the sub-pixel search stage may not be executed, and at this time, the optimal offset vector corresponding to the sub-pixel search stage is considered to be (0, 0). For example, referring to fig. 7, when the final target pixel point is located in or exceeds the top row, the bottom row, the left column, or the right column of the preset area, the processing step of the sub-pixel search stage may not be performed, and at this time, the optimal offset vector corresponding to the sub-pixel search stage is considered to be (0, 0).

In the embodiment of the present disclosure, the optimal offset vector includes a forward offset vector pointing to the front reference frame and a backward offset vector pointing to the back reference frame, and the forward offset vector and the backward offset vector are equal in magnitude and opposite in direction. Therefore, one of the forward offset vector and the backward offset vector can be obtained by calculation first, and then the other can be obtained by directly negating.

Different from the embodiment, the searching method can prevent the first candidate pixel point located at the corner in the preset area from being missed in the searching process, so that the accuracy of motion searching can be improved, and the performance of subsequent video coding can be improved.

Referring to fig. 8, fig. 8 is a schematic flowchart illustrating another embodiment of step S13 in fig. 1. Specifically, fig. 8 is a flowchart illustrating an embodiment of predicting optimal offset vectors of other sub-blocks in a sub-block set by using the optimal offset vector of at least one sub-block. Specifically, the method may include the steps of:

step S81: and selecting a corresponding processing mode based on the number of the sub-blocks of at least one sub-block and the relative position in the sub-block set.

In the embodiment of the present disclosure, the number of sub-blocks of at least one sub-block is: and in each sub-block set, acquiring the number of sub-blocks of the optimal offset vector by presetting a motion search mode. In particular, the number of sub-blocks may include any of: 1,2, 3, 4, etc., which may be specifically set according to the actual application requirement, and are not limited herein. In an implementation scenario, the number of sub-blocks of at least one sub-block may also be the same as the number of sub-blocks included in the sub-block set, that is, all sub-blocks in the sub-block set may be searched in a preset motion search manner to obtain the optimal offset vector thereof.

In an embodiment of the disclosure, the relative position of the at least one sub-block in the set of sub-blocks may include at least one of: the edge position, the center position, and the corner position of the sub-block set may be specifically set according to the actual application requirement, and are not limited herein.

In the embodiment of the present disclosure, the processing manner may include, but is not limited to: an upsampling method, an affine transformation method, and the like, which are not limited herein.

Step S82: and predicting the optimal offset vector of at least one sub-block by using a processing mode to obtain the optimal offset vectors of other sub-blocks in the sub-block set.

In an implementation scenario, when the number of sub-blocks is one, the processing manner may be determined to be a first upsampling manner, and the first upsampling manner may specifically include: and taking the optimal offset vector of the sub-block as the optimal offset vector of other sub-blocks in the sub-block set in which the sub-block is positioned.

In another implementation scenario, when the number of sub-blocks is the same as the number of sub-block sets and the number of sub-regions, and at least one sub-block is located in a different sub-region, the processing manner may be determined to be the second upsampling manner, and the sub-region may include the plurality of sub-blocks. In particular, the size of each sub-region may be different or the same. For example, when the set of sub-blocks is 16 × 16 in size and the sub-blocks are 4 × 4 in size, the sub-regions may be 8 × 8 in size, that is, the set of sub-blocks includes 4 sub-regions of 8 × 8 in size, and each sub-region includes 4 sub-blocks.

In a specific implementation scenario, please refer to fig. 9 in combination, where fig. 9 is a schematic diagram of an embodiment of a corresponding processing manner, and as shown in fig. 9, the subblock set includes 16 subblocks, which are numbered from 0 to 15 for convenience of description, the subblocks numbered 0,1, 4, and 5 are in the same sub-region, the subblocks numbered 2,3, 6, and 7 are in the same sub-region, the subblocks numbered 8, 9, 12, and 13 are in the same sub-region, and the subblocks numbered 10, 11, 14, and 15 are in the same sub-region. By presetting the motion search mode, the best offset vector deltaMV (-16, -4) of the sub-block numbered 0 in fig. 9, the best offset vector deltaMV (-4, -16) of the sub-block numbered 3, the best offset vector deltaMV (-4, -4) of the sub-block numbered 8, the best offset vector deltaMV (16, -16) of the sub-block numbered 11 can be obtained, the best offset vector deltaMV (-16, -4) for sub-blocks numbered 1, 4, 5, the best offset vector deltaMV (-4, -16) for sub-blocks numbered 2, 6, 7, the best offset vector deltaMV (-4, -4) for sub-blocks numbered 9, 12, 13, the best offset vector deltaMV (16, -16) for sub-blocks numbered 10, 14, 15 can be determined using the second upsampling method. Other scenarios may be analogized, and are not exemplified here.

In another specific implementation scenario, after determining the optimal offset vectors of the other sub-blocks in the sub-block set by using the second upsampling method, the optimal offset vectors of all sub-blocks in the sub-block set may be adjusted by performing any one of weighted averaging, maximum taking, minimum taking, median taking, mode taking, and the like on the optimal offset vectors of all sub-blocks in the sub-block set.

In another implementation scenario, the sub-block set may be a rectangle, and when the number of the sub-blocks is two or three and at least one sub-block is located at a preset corner position of the sub-block set, the processing manner may be determined to be an affine transformation manner. Specifically, the presetting of the vertex angle position may include: the position of the upper vertex angle, the position of the lower vertex angle, the position of the left vertex angle and the position of the right vertex angle. Referring to fig. 9, the sub-blocks numbered 0 and 3 are located at the top corner position, the sub-blocks numbered 12 and 15 are located at the bottom corner position, the sub-blocks numbered 0 and 12 are located at the left corner position, and the sub-blocks numbered 3 and 15 are located at the right corner position.

In a specific implementation scenario, when the number of the sub-blocks is two, the preset vertex angle position may specifically include any one of the following: go up apex angle position, lower apex angle position, left apex angle position, right apex angle position, the affine transformation mode specifically can include: and obtaining a first size of a subblock set where at least one subblock is located and a second size of the at least one subblock, and performing affine transformation on the optimal offset vector, the first size and the second size of the at least one subblock by respectively using the relative positions of other subblocks in the subblock set to obtain the optimal offset vector of the corresponding subblock. Taking the preset vertex angle position as the top vertex angle position as an example, please refer to fig. 10, fig. 10 is a schematic diagram of another embodiment of a corresponding processing manner, as shown in fig. 10, the optimal offset vector deltaMV (Δ v) of the sub-block with number 0 in the sub-block set can be determined by a preset motion search manner_0x,Δv_0y) And best offset vector deltaMV (Δ v) of sub-block number 3_3x,Δv_3y) Then the best offset vector deltaMV (Δ v) of other sub-blocks in the set of sub-blocks may be determined by the following equation_x,Δv_y) The method comprises the following steps:

in the above equation (4), w represents the width of the subblock set, and x and y represent distances in the horizontal and vertical directions when the subblock is located at (0,0) with respect to the upper left-hand coordinate of the subblock set, respectively. In addition, the sub-block for determining the optimal offset vector by the preset motion search method is not limited to the sub-block at the top corner position, but may also be the sub-block at the bottom corner position, or the sub-block at the left corner position, or the sub-block at the right corner position, which is not limited herein. When the sub-blocks are other sub-blocks, the formula (4) can be adjusted correspondingly,this is not exemplified. Specifically, when the best offset vector deltaMV of the sub-block numbered 0 is (-16, -4) and the best offset vector deltaMV of the sub-block numbered 3 is (-4, -16), the best offset vector deltaMV (Δ v) of the sub-block numbered 1 is_x,Δv_y) The following calculation can be performed by the above equation (4):

the best offset vectors for other sub-blocks in the sub-block set may be calculated by analogy, and no further example is given here.

In another specific implementation scenario, when the number of the sub-blocks is three, the preset vertex angle position may include any three of the four vertex angle positions of the sub-block set, that is, may be any three of an upper right vertex angle, a lower right vertex angle, an upper left vertex angle, and a lower left vertex angle, and the affine transformation manner may specifically include: and obtaining a first size of a subblock set where at least one subblock is located and a second size of the at least one subblock, and performing affine transformation on the optimal offset vector, the first size and the second size of the at least one subblock by respectively using the relative positions of other subblocks in the subblock set to obtain the optimal offset vector of the corresponding subblock. Taking the preset vertex angle positions including the top left vertex angle, the bottom left vertex angle, and the top right vertex angle as an example, please refer to fig. 11, fig. 11 is a schematic diagram of another embodiment of a corresponding processing manner, as shown in fig. 11, the optimal offset vector deltaMV (Δ v) of the sub-block numbered 0 in the sub-block set can be determined by a preset motion search manner_0x,Δv_0y) Best offset vector deltaMV (Δ v) of sub-block number 3_3x,Δv_3y) And best offset vector deltaMV (Δ v) of sub-block number 12_12x,Δv_12y) Then the best offset vector deltaMV (Δ v) of other sub-blocks in the set of sub-blocks may be determined by the following equation_x,Δv_y) The method comprises the following steps:

in the above equation (6), w and h represent the width and height of the subblock set, respectively, and x and y represent the distances in the horizontal and vertical directions when the subblock is (0,0) with respect to the coordinate of the upper left corner of the subblock set, respectively. In addition, the sub-block for determining the optimal offset vector by the preset motion search mode is not limited to include the upper right vertex angle, the upper left vertex angle and the lower left vertex angle, and may further include the upper right vertex angle, the upper left vertex angle and the lower right vertex angle, or may further include the upper left vertex angle, the lower left vertex angle and the lower right vertex angle, or may further include the upper right vertex angle, the lower right vertex angle and the lower left vertex angle, which is not limited herein. When the sub-blocks are other sub-blocks, the above formula (6) can be adjusted correspondingly, and is not illustrated here.

Different from the foregoing embodiment, the corresponding processing method is selected based on the number of the sub-blocks of the at least one sub-block and the relative position in the sub-block set, so that the optimal offset vector of the at least one sub-block is predicted by using the processing method to obtain the optimal offset vectors of other sub-blocks in the sub-block set, the complexity of inter-frame prediction can be reduced, and the inter-frame prediction efficiency can be improved.

Referring to fig. 12, fig. 12 is a flowchart illustrating video encoding according to an embodiment of the present application. Specifically, the method may include the steps of:

step 1210: and acquiring the final motion vector of each sub-block in each sub-block set of the current block.

In the embodiment of the present disclosure, the final motion vector of the self-block is obtained by using the steps in any of the embodiments of the inter prediction method, which may specifically refer to the relevant steps in the embodiment of the present disclosure, and will not be described herein again.

Step S1220: and encoding the current block by using the final motion vector of each sub-block.

Specifically, after the final motion vector of each sub-block is obtained, a reference region to which the final motion vector points in the reference frame may be determined, and pixel values of the reference region may be padded as pixel values of the sub-blocks. In an implementation scenario, the frame where the current block is located is a B frame, the pixel value of the reference region pointed to by the forward reference frame according to the final motion vector in the forward direction may be used as the forward prediction value, and the pixel value of the reference region pointed to by the backward reference frame according to the optimal motion vector in the backward direction may be used as the backward prediction value. In another implementation scenario, the size of the reference region may be the same as the size of the sub-block. Referring to fig. 13 in combination, fig. 13 is a state diagram of an embodiment of video coding, as shown in fig. 12, sub-block a0 points to reference region a0 in the reference frame, sub-block B0 points to reference region B0 in the reference frame, sub-block C0 points to reference region C0 in the reference frame, and sub-block D0 points to reference region D0 in the reference frame. Other scenarios may be analogized, and are not exemplified here.

According to the scheme, the final motion vector of each subblock in each subblock set of the current block is obtained, the current block is encoded by using the final motion vector of each subblock, and the final motion vector of each subblock is obtained by using the steps in any one of the embodiments of the inter-frame prediction method, so that the complexity of inter-frame prediction can be reduced, and the inter-frame prediction efficiency can be improved.

Referring to fig. 14, fig. 14 is a block diagram illustrating an embodiment of an inter prediction apparatus 1400. The inter-frame prediction apparatus 1400 comprises a sub-block dividing module 1410, a vector obtaining module 1420, and a search determining module 1430, wherein the sub-block dividing module 1410 is configured to divide the current block into a plurality of sub-block sets; wherein each sub-block set comprises a plurality of sub-blocks; the vector obtaining module 1420 is configured to obtain a second motion vector of the sub-block based on the first motion vector of the current block; wherein the first motion vector is obtained by using a preset prediction mode; search determination module 1430 is configured to, for at least one set of sub-blocks: respectively taking the second motion vector of at least one sub-block as a starting point, and acquiring the optimal offset vector of at least one sub-block by utilizing a preset motion search mode; predicting to obtain the optimal offset vector of other sub-blocks in the sub-block set by using the optimal offset vector of at least one sub-block; based on the second motion vector of the sub-block and the best offset vector, a final motion vector of the corresponding sub-block is determined.

In some disclosed embodiments, the search determining module 1430 includes a processing selecting sub-module configured to select a corresponding processing manner based on the number of sub-blocks of at least one sub-block and the relative position in the sub-block set, and the search determining module 1430 includes a vector predicting sub-module configured to predict the optimal offset vector of at least one sub-block using the processing manner to obtain the optimal offset vectors of other sub-blocks in the sub-block set.

In some disclosed embodiments, the processing selection sub-module includes a first upsampling unit configured to determine that the processing mode is the first upsampling mode when the number of the sub-blocks is one, and the processing selection sub-module includes a second upsampling unit configured to determine that the processing mode is the second upsampling mode when the number of the sub-blocks is the same as the number of the plurality of sub-areas of the sub-block set and at least one of the sub-blocks is located in a different sub-area, respectively; wherein the sub-region comprises a plurality of sub-blocks.

Different from the foregoing embodiment, the first upsampling mode or the second upsampling mode is determined to be adopted according to the number of the subblocks and the relative positions of the subblocks in the subblock set, which is beneficial to improving the prediction speed.

In some disclosed embodiments, the first upsampling unit is further specifically configured to use the best offset vector of the sub-block as the best offset vector of other sub-blocks in the sub-block set in which the sub-block is located.

Different from the foregoing embodiment, it is beneficial to improve the prediction speed by using the optimal offset vector of the sub-block as the optimal offset vector of other sub-blocks in the sub-block set where the sub-block is located.

In some disclosed embodiments, the second upsampling unit is further specifically configured to use the best offset vector of each sub-block as the best offset vector of other sub-blocks in the sub-region where the sub-block is located.

Unlike the foregoing embodiments, it is advantageous to improve the prediction speed by using the optimal offset vector of each sub-block as the optimal offset vector of other sub-blocks in the sub-area.

In some disclosed embodiments, the sub-block set is a rectangle, and the processing selection sub-module further includes an affine transformation unit configured to determine that the processing manner is an affine transformation manner when the number of the sub-blocks is two or three and at least one of the sub-blocks is located at a preset corner position of the sub-block set.

Different from the foregoing embodiment, when the number of the subblocks is two or three, and at least one subblock is located at the preset vertex angle position of the subblock set, the processing mode is determined to be an affine transformation mode, which is beneficial to reducing the complexity of inter-frame prediction and improving the prediction precision and the inter-frame prediction efficiency.

In some disclosed embodiments, the affine transformation unit is further configured to obtain a first size of a sub-block set in which the at least one sub-block is located, and a second size of the at least one sub-block; and performing affine transformation on the optimal offset vector, the first size and the second size of at least one sub-block by respectively using the relative positions of other sub-blocks in the sub-block set to obtain the optimal offset vector of the corresponding sub-block.

Different from the foregoing embodiment, by obtaining a first size of a sub-block set in which at least one sub-block is located, and a second size of the at least one sub-block; and performing affine transformation on the optimal offset vector, the first size and the second size of at least one subblock by respectively using the relative positions of other subblocks in the subblock set to obtain the optimal offset vector of the corresponding subblock, so that the complexity of inter-frame prediction can be reduced, the prediction precision can be improved, and the inter-frame prediction efficiency can be improved.

In some disclosed embodiments, when the number of sub-blocks is two, the preset corner position comprises any one of: an upper vertex angle position, a lower vertex angle position, a left vertex angle position and a right vertex angle position; and/or when the number of the sub-blocks is three, the preset vertex angle position comprises any three of the four vertex angle positions of the sub-block set.

Different from the foregoing embodiment, when the number of sub-blocks is two, the preset vertex angle position is set to include any one of the following: the positions of the upper vertex angle, the lower vertex angle, the left vertex angle and the right vertex angle can be favorable for improving the prediction robustness; when the number of the subblocks is three, the preset vertex angle positions are set to any three of the four vertex angle positions including the subblock set, so that the improvement of the prediction robustness can be facilitated.

In some disclosed embodiments, the current block is located in a B frame, the second motion vector includes a forward motion vector pointing to a reference frame before the B frame and a backward motion vector pointing to a reference frame after the B frame, the search determining module 1430 further includes an initial determining submodule for determining a first initial pixel point pointed to by the forward motion vector in the forward reference frame and determining a second initial pixel point pointed to by the backward motion vector in the backward reference frame, the search determining module 1430 further includes a candidate matching submodule for using a pixel point within a preset region centered on the first initial pixel point as a first candidate pixel point and selecting a second candidate pixel point matching the first candidate pixel point within a preset region centered on the second initial pixel point, the search determining module 1430 further includes a difference calculating submodule for respectively using the first candidate pixel point as a vertex and a region having a size as a preset size, and a difference between the regions having the vertex and the preset size as the second candidate pixel point, the difference being a pixel difference corresponding to the first candidate pixel point, the search determining module 1430 further includes a target searching submodule for searching for a final target pixel point among the first candidate pixel points; the pixel difference corresponding to the final target pixel point is smaller than the pixel difference corresponding to any one of the first candidate pixel points adjacent to the final target pixel point, and the search determining module 1430 further includes an offset determining submodule configured to obtain the optimal offset vector of the second sub-block based on the final target pixel point.

In some disclosed embodiments, the target search submodule includes a center determining unit configured to determine a first start pixel as a center pixel, the target search submodule includes a candidate target unit configured to determine a pixel corresponding to a minimum value of corresponding pixel differences among the center pixel and a first candidate pixel adjacent to the center pixel as a first target pixel, the target search submodule includes a center determining unit configured to determine whether the first target pixel is the center pixel, the target search submodule includes a first executing unit configured to determine the first target pixel as a final pixel when the first target pixel is the center pixel, the target search submodule includes a second executing unit configured to determine the first target pixel as a new center pixel when the first target pixel is not the center pixel, and combine the candidate target unit, And the central judging unit executes the step of taking the central pixel point and the pixel point corresponding to the minimum value in the corresponding pixel difference in the first candidate pixel points adjacent to the central pixel point as the first target pixel point again and the subsequent steps.

In some disclosed embodiments, the second candidate pixel point matched with the first candidate pixel point satisfies the following condition: the deviation distance from the second candidate pixel point to the second initial pixel point is the same as the deviation distance from the first candidate pixel point to the first initial pixel point, and the deviation direction from the second candidate pixel point to the second initial pixel point is opposite to the deviation direction from the first candidate pixel point to the first initial pixel point; and/or the preset size is the same as the size of the sub-blocks, and the size of the sub-blocks at least comprises 4 x 4; and/or, the size of the preset area at least comprises 5 x 5; and/or the optimal offset vector comprises a forward offset vector pointing to the front reference frame and a backward offset vector pointing to the back reference frame, wherein the forward offset vector and the backward offset vector are equal in size and opposite in direction.

Referring to fig. 15, fig. 15 is a block diagram of a video encoding apparatus 1500 according to an embodiment of the present application. The video encoding apparatus 1500 includes a vector obtaining module 1510 and a video encoding module 1520, the vector obtaining module 1510 is configured to obtain a final motion vector of each sub-block in each sub-block set of the current block; the video encoding module 1520 is configured to encode the current block using the final motion vector of each sub-block; the final motion vector of the sub-block is obtained by using the inter prediction apparatus in any of the above embodiments of the inter prediction apparatus.

According to the scheme, the final motion vector of each subblock in each subblock set of the current block is obtained, the current block is encoded by using the final motion vector of each subblock, and the final motion vector of each subblock is obtained by using the inter-frame prediction device in any one of the embodiments of the inter-frame prediction device, so that the complexity of inter-frame prediction can be reduced, and the inter-frame prediction efficiency can be improved.

Referring to fig. 16, fig. 16 is a schematic block diagram of an electronic device 1600 according to an embodiment of the disclosure. The electronic device 1600 comprises a memory 1610 and a processor 1620, which are coupled to each other, wherein the memory 1610 stores program instructions, and the processor 1620 is configured to execute the program instructions to implement the steps in any of the embodiments of the inter prediction method described above or implement the steps in any of the embodiments of the video coding method described above. Specifically, the electronic devices may include, but are not limited to: the electronic devices such as the server, the microcomputer, the tablet computer, and the mobile phone are not limited herein.

Specifically, the processor 1620 is configured to control itself and the memory 1610 to implement the steps in any of the above embodiments of inter prediction methods or to implement the steps in any of the above embodiments of video coding. Processor 1620 may also be referred to as a CPU (Central Processing Unit). Processor 1620 may be an integrated circuit chip having signal processing capabilities. The Processor 1620 may also be a general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. In addition, the processor 1620 may be commonly implemented by a plurality of integrated circuit chips.

According to the scheme, the complexity of inter-frame prediction can be reduced, and the inter-frame prediction efficiency is improved.

Referring to fig. 17, fig. 17 is a schematic diagram illustrating a memory device 1700 according to an embodiment of the present application. The storage 1700 stores program instructions 1710 that are executable by the processor, the program instructions 1710 for implementing steps in any of the above embodiments of inter prediction methods, or for implementing steps in any of the above embodiments of video encoding methods.

In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a module or a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in an electrical, mechanical or other form.

Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

Claims

1. An inter-frame prediction method, comprising:

dividing a current block into a plurality of sub-block sets; wherein each sub-block set comprises a plurality of sub-blocks;

obtaining a second motion vector of the sub-block based on the first motion vector of the current block; wherein the first motion vector is obtained by using a preset prediction mode;

for at least one of the sets of sub-blocks: respectively taking a second motion vector of at least one sub-block as a starting point, and acquiring the optimal offset vector of the at least one sub-block by utilizing a preset motion search mode; predicting to obtain the optimal offset vectors of other sub-blocks in the sub-block set by using the optimal offset vector of the at least one sub-block;

determining a final motion vector corresponding to the sub-block based on the second motion vector and the best offset vector of the sub-block.

2. The method of claim 1, wherein the predicting the best offset vector of the other sub-blocks in the set of sub-blocks by using the best offset vector of the at least one sub-block comprises:

selecting a corresponding processing mode based on the number of the sub-blocks of the at least one sub-block and the relative position in the sub-block set;

and predicting the optimal offset vector of the at least one subblock by utilizing the processing mode to obtain the optimal offset vectors of other subblocks in the subblock set.

3. The method according to claim 2, wherein selecting the corresponding processing manner based on the number of sub-blocks of the at least one sub-block and the relative position in the sub-block set comprises:

if the number of the sub-blocks is one, determining that the processing mode is a first up-sampling mode;

if the number of the sub-blocks is the same as the number of the sub-areas of the sub-block set and the at least one sub-block is located in different sub-areas, determining that the processing mode is a second up-sampling mode; wherein the sub-region includes a plurality of the sub-blocks.

4. The method of claim 3, wherein the first upsampling manner comprises:

and taking the optimal offset vector of the sub-block as the optimal offset vector of other sub-blocks in the sub-block set in which the sub-block is positioned.

5. The method of claim 3, wherein the second upsampling manner comprises:

and respectively taking the best offset vector of each sub-block as the best offset vectors of other sub-blocks in the sub-area.

6. The method of claim 2, wherein the sub-block set is rectangular, and wherein selecting the corresponding processing mode based on the number of sub-blocks of the at least one sub-block and the relative position in the sub-block set further comprises:

and if the number of the sub-blocks is two or three and the at least one sub-block is located at the preset vertex angle position of the sub-block set, determining that the processing mode is an affine transformation mode.

7. The method according to claim 6, wherein the affine transformation comprises:

acquiring a first size of a subblock set where the at least one subblock is located and a second size of the at least one subblock;

and performing affine transformation on the optimal offset vector, the first size and the second size of the at least one sub-block by respectively using the relative positions of the other sub-blocks in the sub-block set to obtain the optimal offset vector corresponding to the sub-block.

8. The method of claim 6, wherein when the number of sub-blocks is two, the preset corner position comprises any one of the following: an upper vertex angle position, a lower vertex angle position, a left vertex angle position and a right vertex angle position;

and/or when the number of the subblocks is three, the preset vertex angle position comprises any three of the four vertex angle positions of the subblock set.

9. The method of claim 1, wherein the current block is located in a B-frame, and wherein the second motion vector comprises a forward motion vector pointing to a previous reference frame of the B-frame and a backward motion vector pointing to a subsequent reference frame of the B-frame;

the obtaining the optimal offset vector of at least one sub-block by using a preset motion search mode with the second motion vector of at least one sub-block as a starting point respectively comprises:

determining a first initial pixel point pointed by the forward motion vector in the front reference frame, and determining a second initial pixel point pointed by the backward motion vector in the back reference frame;

taking the pixel points in the preset area with the first initial pixel point as the center as first candidate pixel points, and selecting second candidate pixel points matched with the first candidate pixel points in the preset area with the second initial pixel points as the center;

taking the difference between the region with the first candidate pixel point as the vertex and the size as the preset size and the region with the second candidate pixel point as the vertex and the size as the preset size as the pixel difference corresponding to the first candidate pixel point;

searching a final target pixel point in the first candidate pixel points; the pixel difference corresponding to the final target pixel point is smaller than the pixel difference corresponding to any first candidate pixel point adjacent to the final target pixel point;

and acquiring the optimal offset vector of the sub-block based on the final target pixel point.

10. The method of claim 9, wherein searching for a final target pixel among the first candidate pixels comprises:

taking the first starting pixel point as a central pixel point;

taking the pixel point corresponding to the minimum value in the corresponding pixel difference between the central pixel point and the first candidate pixel point adjacent to the central pixel point as a first target pixel point;

if the first target pixel point is the central pixel point, taking the first target pixel point as the final pixel point;

and if the first target pixel point is not the central pixel point, taking the first target pixel point as a new central pixel point, and re-executing the step of taking the central pixel point and a pixel point corresponding to the minimum value in the corresponding pixel difference in the first candidate pixel points adjacent to the central pixel point as the first target pixel point and the subsequent steps.

11. The method of claim 9, wherein the second candidate pixel matched to the first candidate pixel satisfies the following condition: the deviation distance from the second candidate pixel point to the second initial pixel point is the same as the deviation distance from the first candidate pixel point to the first initial pixel point, and the deviation direction from the second candidate pixel point to the second initial pixel point is opposite to the deviation direction from the first candidate pixel point to the first initial pixel point;

and/or the preset size is the same as the size of the sub-block, and the size of the sub-block at least comprises 4 x 4;

and/or, the size of the preset area at least comprises 5 x 5;

and/or the optimal offset vector comprises a forward offset vector pointing to the front reference frame and a backward offset vector pointing to the back reference frame, and the forward offset vector and the backward offset vector are equal in size and opposite in direction.

12. A video encoding method, comprising:

obtaining the final motion vector of each sub-block in each sub-block set of the current block;

encoding the current block using the final motion vector of each of the sub-blocks;

wherein the final motion vector of the sub-block is obtained using the inter prediction method of any one of claims 1 to 11.

13. An electronic device comprising a memory and a processor coupled to each other, the memory storing program instructions, the processor being configured to execute the program instructions to implement the inter prediction method of any one of claims 1 to 11 or to implement the video coding method of claim 12.

14. A memory device storing program instructions executable by a processor to implement the inter prediction method of any one of claims 1 to 11 or to implement the video encoding method of claim 12.