WO2022061573A1

WO2022061573A1 - Motion search method, video coding device, and computer-readable storage medium

Info

Publication number: WO2022061573A1
Application number: PCT/CN2020/117076
Authority: WO
Inventors: 王悦名; 胡光辉; 郑萧桢
Original assignee: 深圳市大疆创新科技有限公司
Priority date: 2020-09-23
Filing date: 2020-09-23
Publication date: 2022-03-31

Abstract

A motion search method, a video encoding device, and a computer-readable storage medium. The method comprises: for a current image block to be encoded, determining a search area in a reference frame of said current image block; selecting N first target search points among M search points included in the search area, N<M, and any two first target search points not being adjacent; acquiring the encoding costs between said current image block and predicted image blocks corresponding to the first target search points; and determining the predicted image block corresponding to the first target search point having the lowest encoding cost as the best matching block of said current image block. The present embodiment is conducive to reducing the number of searches and saving search resources.

Description

Motion search method, video encoding device, and computer-readable storage medium

Technical Field

The present application relates to the technical field of image coding, and in particular, to a motion search method, a video coding apparatus, and a computer-readable storage medium.

Background

Digital video compression coding is one of the key technologies in video transmission. Compressing video substantially with an efficient video compression technique effectively reduces the network bandwidth needed for transmission.

At present, in the video coding process, intra-frame prediction is usually used to eliminate the spatial redundancy of an image, and inter-frame prediction is used to eliminate temporal redundancy. Since the temporal redundancy between adjacent frames of a video source is much larger than the spatial redundancy within a frame, inter-frame prediction is extremely important in video coding. Because the images of a video sequence are strongly correlated along the time axis, the motion estimation and motion compensation techniques used in inter-frame prediction can effectively reduce temporal redundancy, and they are therefore widely used in video compression coding schemes.

Motion estimation is usually performed by taking the image block to be encoded in the current frame and finding its best matching block within the corresponding search range of one or more reference frames. In hardware encoders, a traversal search is usually used to find the best matching block within a certain search range, but a traversal search involves a large number of searches, requires excessive search resources, and has relatively low search efficiency. Other fast search algorithms, such as the diamond search algorithm and the hexagon search algorithm, converge after an indeterminate number of iterations: whether the search has converged depends on whether the result of each search satisfies the conditions, and if it does not, the next search continues. The search time therefore varies between image blocks to be encoded. A hardware encoder has fixed performance requirements and must strictly control the execution time of each step, so fast search algorithms with uncertain search time are not suitable for hardware encoders.

SUMMARY OF THE INVENTION

In view of this, one of the objectives of the present application is to provide a motion search method, a video encoding device, and a computer-readable storage medium.

In a first aspect, an embodiment of the present application provides a motion search method, including:

for a current image block to be encoded, determining a search area in the reference frame corresponding to the current image block to be encoded;

selecting N first target search points from the M search points included in the search area, where N<M and any two first target search points are not adjacent in at least one direction;

obtaining the encoding cost between the current image block to be encoded and the predicted image block corresponding to each of the first target search points; and

determining the predicted image block corresponding to the first target search point with the smallest encoding cost as the best matching block of the current image block to be encoded.

In a second aspect, embodiments of the present application provide a video encoding apparatus, including one or more processors, working individually or together, where the processors include multiple pipeline stages; and a memory for storing executable instructions;

When the processor executes the executable instructions, the steps of the method described in the first aspect are performed in one of the pipeline stages.

In a third aspect, embodiments of the present application provide a computer-readable storage medium on which computer instructions are stored, and when the instructions are executed by a processor, implement the method described in the first aspect.

The motion search method, video encoding device, and computer-readable storage medium provided by the embodiments of the present application help reduce the number of searches, save search resources, and improve search efficiency.

Description of drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the following briefly introduces the drawings that are used in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative labor.

FIG. 1 is a schematic diagram of a video communication system provided by an embodiment of the present application;

FIG. 2 is a schematic diagram of pipeline stage division and execution of a hardware encoder provided by an embodiment of the present application;

FIG. 3 is a schematic flowchart of a motion search method provided by an embodiment of the present application;

FIG. 4 is a schematic structural diagram of two adjacent image frames provided by an embodiment of the present application;

FIG. 5A is a schematic diagram of a search area provided by an embodiment of the present application;

FIGS. 5B to 5H are schematic diagrams of selected first target search points provided by an embodiment of the present application;

FIG. 6 is a schematic flowchart of another motion search method provided by an embodiment of the present application;

FIGS. 7A and 7B are schematic diagrams of selected second target search points provided by an embodiment of the present application;

FIG. 8 is a structural diagram of a video encoding apparatus provided by an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only a part of the embodiments of the present application, but not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.

The motion search method in the embodiments of the present application is applied to the compression coding of multimedia information, which includes video, static images, dynamic images, and the like. This application uses video communication as an example. FIG. 1 is a schematic diagram of a typical video communication system. As shown in FIG. 1, the sending end 100 includes a video capturing device 101, a video encoding device 102 and a sending device 103. The video capturing device 101 sends the collected video to the video encoding device 102, which compresses and encodes the image information; the result is then sent out through the sending device 103. The receiving end 200 includes a receiving device 201, a video decoding device 202 and a video display device 203. The receiving device 201 receives the compressed video data sent by the sending end 100, the video decoding device 202 decodes the received video data to recover the images, and the decoded images are displayed on the video display device 203. The motion estimation method in the embodiments of the present application is mainly applied to the video encoding device in a video communication system. Motion estimation accounts for a major share of the computing resources consumed during encoding, so fast motion estimation algorithms play an important role in improving the overall speed of video coding.

Motion estimation is a technique widely used in video coding and video processing. The basic idea of motion estimation is to divide each frame of the image sequence into many non-overlapping image blocks to be encoded, and to assume that all pixels within an image block to be encoded have the same displacement. For each image block to be encoded, the predicted image block most similar to it is then found within a certain search range of the reference frame according to a certain matching criterion and taken as the best matching block. The relative displacement between the best matching block and the current image block to be encoded is the motion vector.

When compressing video, only the motion vector and the residual data (the difference between the best matching block and the current image block to be coded) need to be saved. At the decoding end, the corresponding predicted image block is found in the adjacent decoded reference frame at the position indicated by the motion vector, and the reconstructed data is obtained by adding the predicted image block and the residual data. In inter-frame predictive coding, the scenes in adjacent frames of a moving image are correlated to some degree. The moving image can therefore be divided into several blocks, the position of each block in the adjacent frame image can be searched for, and the relative offset between the two spatial positions can be obtained. This relative offset is what is usually referred to as the motion vector, and the process of obtaining the motion vector is called motion estimation. Motion estimation removes inter-frame redundancy, which greatly reduces the number of bits in video transmission. Motion estimation is therefore an important part of a video compression processing system.
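The relationship between prediction, residual and reconstruction described above can be illustrated with a minimal Python sketch. It assumes a lossless path (no transform or quantization) and takes the residual as current block minus prediction; the function names and the toy frame are illustrative only and do not come from the application.

```python
def predicted_block(reference, top, left, mv, size):
    """Copy the size x size block that motion vector mv = (dy, dx) points at."""
    dy, dx = mv
    rows = reference[top + dy: top + dy + size]
    return [row[left + dx: left + dx + size] for row in rows]


def residual(current_block, prediction):
    """Encoder side: element-wise difference between the block and its prediction."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(current_block, prediction)]


def reconstruct(prediction, residual_block):
    """Decoder side: prediction plus residual gives back the block."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual_block)]


# Toy 6x6 reference frame and a 2x2 current block located at (top=1, left=1).
reference = [[10 * r + c for c in range(6)] for r in range(6)]
current_block = [[12, 14], [22, 25]]

prediction = predicted_block(reference, 1, 1, (0, 1), 2)  # motion vector (dy=0, dx=1)
res = residual(current_block, prediction)
restored = reconstruct(prediction, res)
```

Only the motion vector `(0, 1)` and the small residual `res` would need to be stored in this toy case; the decoder recovers the block from them and the reference frame.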

In hardware encoders, related technologies usually use a traversal search to find the best matching block within a certain search range, but a traversal search involves a large number of searches, requires excessive search resources, and has relatively low search efficiency. Other fast search algorithms, such as the diamond search algorithm and the hexagon search algorithm, converge after an indeterminate number of iterations: whether the search has converged depends on whether the result of each search satisfies the conditions, and if it does not, the next search continues. The search time therefore varies between image blocks to be encoded. A hardware encoder has fixed performance requirements and must strictly control the execution time of each step, so fast search algorithms with uncertain search time are not suitable for hardware encoders. Specifically, to improve processing speed and resource utilization, video encoding in a hardware encoder is usually divided into multiple pipeline stages, and the video encoding process is divided into multiple steps accordingly. In general, the video coding process includes prediction, transformation, quantization, inverse transformation, inverse quantization, entropy coding, and loop filtering, and the prediction step is further divided into an intra-frame prediction step and an inter-frame prediction step. Because the hardware encoder has performance requirements, the processing time of each pipeline stage is limited, which is why a fast search algorithm with uncertain search time is not suitable for the hardware encoder.

In an example, please refer to FIG. 2, which is a schematic diagram of the pipeline stage division of a hardware encoder. The pipeline is divided into 5 stages: integer pixel search, sub-pixel search, intra prediction, mode decision, and entropy coding and filtering. The integer pixel search pipeline stage and the sub-pixel search pipeline stage carry out the steps related to the inter-frame search, the intra-frame prediction pipeline stage and the mode decision pipeline stage carry out the steps related to the intra-frame search, and the pipeline stages operate in parallel. In the example shown in FIG. 2:

At time T1, the Nth image block to be encoded is processed by integer pixel search in the integer pixel search pipeline stage; N is an integer;

At time T2, the N+1 th image block to be encoded is subjected to integer pixel search processing in the integer pixel search pipeline stage, and the Nth to-be-coded image block is subjected to sub-pixel search processing in the sub-pixel search pipeline stage;

At time T3, the N+2 th image block to be encoded is subjected to integer pixel search processing in the integer pixel search pipeline stage, the N+1 th image block to be encoded is subjected to sub-pixel search processing in the sub-pixel search pipeline stage, and the Nth image block to be encoded is subjected to intra-frame prediction processing in the intra-frame prediction pipeline stage;

At time T4, the N+3 th image block to be encoded is subjected to integer pixel search processing in the integer pixel search pipeline stage, the N+2 th image block to be encoded is subjected to sub-pixel search processing in the sub-pixel search pipeline stage, the N+1 th image block to be encoded is subjected to intra-frame prediction processing in the intra-frame prediction pipeline stage, and the Nth image block to be encoded is subjected to mode decision processing in the mode decision pipeline stage;

At time T5, the N+4 th image block to be encoded is subjected to integer pixel search processing in the integer pixel search pipeline stage, the N+3 th image block to be encoded is subjected to sub-pixel search processing in the sub-pixel search pipeline stage, the N+2 th image block to be encoded is subjected to intra-frame prediction processing in the intra-frame prediction pipeline stage, the N+1 th image block to be encoded is subjected to mode decision processing in the mode decision pipeline stage, and the Nth image block to be encoded is subjected to entropy coding and filtering in the entropy coding and filtering pipeline stage.
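The T1 to T5 schedule above follows a simple rule: at each time step every block advances by one stage. The following Python sketch models that rule under the assumption that each stage takes exactly one time step; the function name `blocks_in_flight` and the zero-based block index (0 stands for the Nth block) are illustrative choices, not part of the application.

```python
STAGES = [
    "integer pixel search",
    "sub-pixel search",
    "intra prediction",
    "mode decision",
    "entropy coding and filtering",
]


def blocks_in_flight(t):
    """Return {stage: block offset} at time step t, where t=1 is T1.

    A block offset k stands for the (N+k)-th image block to be encoded.
    """
    schedule = {}
    for depth, stage in enumerate(STAGES):
        offset = (t - 1) - depth      # the block that entered the pipeline `depth` steps ago
        if offset >= 0:               # stages deeper than t-1 are still empty
            schedule[stage] = offset
    return schedule


for t in range(1, 6):
    print(f"T{t}:", blocks_in_flight(t))
```

Because every block must leave a stage when the next one arrives, a stage whose running time depends on the data (as in convergence-based searches) would stall the stages behind it, which is the constraint the text describes.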

Of course, it can be understood that the above division of pipeline stages is only for illustration, and different pipeline stages may be defined according to actual needs, which is not limited in this embodiment. The execution time of each pipeline stage is relatively fixed, so a fast search algorithm with indefinite search time is not suitable for hardware encoders.

The motion search method in the embodiments of the present application may be applied to a chip including multiple pipeline stages. The chip may be installed on a video encoding device, and the video encoding device uses the pipeline stages to perform video encoding processing, where the motion search method of the embodiments of the present application may be performed in one of the pipeline stages. Alternatively, the motion search method in the embodiments of the present application may also be executed on a software encoder, which is not limited in the embodiments of the present application.

Please refer to FIG. 3, which is a schematic flowchart of a motion search method provided by an embodiment of the present application. The method can be applied to a video encoding device and includes:

In step S101, for the current image block to be encoded, a search area is determined in the reference frame corresponding to the current image block to be encoded.

In step S102, N first target search points are selected from the M search points included in the search area; N<M; any two first target search points in at least one direction are not adjacent.

In step S103, the encoding cost between the current image block to be encoded and the predicted image blocks corresponding to each of the first target search points is acquired.

In step S104, the predicted image block corresponding to the first target search point with the smallest encoding cost is determined as the best matching block of the current image block to be encoded.
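Steps S101 to S104 can be sketched end to end in Python. This is a minimal sketch under stated assumptions, not the applicant's implementation: the cost is a plain sum of absolute differences, the search area is a square window of hypothetical `radius` around a start motion vector, and the first target search points are taken every `step` positions in both directions. All names are illustrative.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def block_at(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]


def sparse_motion_search(current, reference, top, left, size,
                         start_mv=(0, 0), radius=4, step=2):
    """Return (best motion vector, its cost) using only a subsampled search grid."""
    height, width = len(reference), len(reference[0])
    target = block_at(current, top, left, size)
    best_mv, best_cost = None, None
    # S101: the search area is the (2*radius+1)^2 window around start_mv.
    # S102: visit only every `step`-th offset, so chosen points are not adjacent.
    for dy in range(-radius, radius + 1, step):
        for dx in range(-radius, radius + 1, step):
            y = top + start_mv[0] + dy
            x = left + start_mv[1] + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would fall outside the reference frame
            cost = sad(target, block_at(reference, y, x, size))      # S103
            if best_cost is None or cost < best_cost:                # S104
                best_mv, best_cost = (start_mv[0] + dy, start_mv[1] + dx), cost
    return best_mv, best_cost


# Synthetic frames: the current frame is the reference shifted by (dy=2, dx=-2).
reference = [[(31 * y + 17 * x) % 251 for x in range(16)] for y in range(16)]
current = [[reference[y + 2][x - 2] if (y + 2 < 16 and x - 2 >= 0) else 0
            for x in range(16)] for y in range(16)]
best_mv, best_cost = sparse_motion_search(current, reference, top=6, left=6, size=4)
```

With `radius=4` and `step=2` the loop evaluates at most 25 candidates instead of the 81 that a traversal of the same 9*9 window would need, and the number of evaluations is fixed in advance.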

According to an embodiment of the present invention, any two first target search points are not adjacent in the first direction, while first target search points are adjacent in the second direction, where there is an included angle between the first direction and the second direction. The included angle may be 0 to 90 degrees.
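One way to read the adjacency condition is sketched below in Python. It assumes that two search points are "adjacent in a direction" when they are exactly one grid step apart along that direction; this interpretation, the helper name `has_adjacent_pair`, and the example pattern (every row, every other column) are assumptions made for illustration.

```python
def has_adjacent_pair(points, direction):
    """True if two chosen points are one grid step apart along direction = (d_row, d_col)."""
    chosen = set(points)
    d_row, d_col = direction
    return any((row + d_row, col + d_col) in chosen for row, col in chosen)


# Example pattern on a 9x9 grid: keep every row but only every other column.
points = [(row, col) for row in range(9) for col in range(0, 9, 2)]

horizontal = (0, 1)   # first direction
vertical = (1, 0)     # second direction, at 90 degrees to the first

adjacent_horizontally = has_adjacent_pair(points, horizontal)
adjacent_vertically = has_adjacent_pair(points, vertical)
```

In this pattern no two chosen points are horizontal neighbors, while vertically neighboring points are all kept, so the point count drops from 81 to 45.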

According to an embodiment of the present invention, an image block to be encoded refers to a prediction unit that is to undergo motion estimation. For example, the basic coding unit used in the prediction stage is the prediction unit (PU), and all prediction-related operations are performed in units of PUs: the intra prediction direction, the inter prediction motion vector difference and reference frame index, motion vector prediction, motion estimation and motion compensation are all processed per PU. High Efficiency Video Coding (HEVC) has three prediction types: Skip, Intra and Inter. The prediction type is the main factor that affects PU partitioning. For example, in Skip mode the PU size may be 8×8 to 64×64; in Intra mode the PU size may be 4×4 to 64×64; and in Inter mode the PU size may be 8×4 or 4×8 to 64×64. Since motion estimation is part of the process of encoding an image frame, and the image frame is divided into many non-overlapping image blocks to be encoded during motion estimation, the prediction unit to undergo motion estimation is referred to in this embodiment of the present application as an image block to be encoded. In practice, different video compression standards use different sizes for the image blocks to be encoded and may also name them differently. Specific settings can be made according to the actual application scenario, which is not limited here.

The reference frame in this embodiment of the present application refers to an image frame that has been previously encoded. The reference frame is a frame that is used for reference when the current frame is encoded. The reference frame may advance or lag behind the current frame in time. When the best matching block of the current image block to be encoded in the current frame is searched in the reference frame, the motion vector of the current image block to be encoded can be further obtained.

For step S101, the video encoding apparatus may determine the search area in the reference frame according to the predicted motion vector of the current image block to be encoded. Because adjacent encoded image blocks have a certain correlation with the current image block to be encoded, the predicted motion vector of the current image block to be encoded can be determined based on the motion vectors of one or more adjacent encoded image blocks, which helps improve search efficiency.

In an implementation, both the coded image blocks temporally adjacent to the current image block to be coded and the coded image blocks spatially adjacent to it have a certain correlation with the current image block to be coded. The adjacent coded image blocks may therefore include temporally adjacent coded image blocks and/or spatially adjacent coded image blocks, where "and/or" means both or either.

In one example, the video data 10 includes multiple frames of images, and each frame of image may be divided into several image blocks to be encoded. Referring to FIG. 4, image frame 11 and image frame 12 are two consecutive adjacent frames in video data 10, where image frame 11 is an encoded image and image frame 12 is the image currently being encoded. Image frame 11 includes 4 coded image blocks, namely coded image block 111, coded image block 112, coded image block 113 and coded image block 114. Image frame 12 includes the coded image block 121, the coded image block 122, the to-be-coded image block 123, and the to-be-coded image block 124. The to-be-coded image block 123 is the block currently to be encoded, and its predicted motion vector can be determined based on the motion vectors of the adjacent coded image blocks. The coded image blocks spatially adjacent to the to-be-coded image block 123 include the coded image block 121 and the coded image block 122, and the coded image blocks temporally adjacent to it include the coded image block 113. That is, the predicted motion vector of the to-be-coded image block 123 is determined according to the motion vectors of the coded image block 113, the coded image block 121 and the coded image block 122. In this embodiment, because adjacent coded blocks have a certain correlation with the current image block to be coded, determining the predicted motion vector of the current image block to be coded from the motion vectors of the adjacent coded image blocks makes the subsequently located search area more accurate and thereby improves search efficiency.

The predicted motion vector of the current image block to be coded is determined from the motion vectors of the adjacent coded image blocks as follows. When there are multiple adjacent coded image blocks, in order to further improve search efficiency, the adjacent coded image block with the highest correlation with the current image block to be coded is selected from among them, and the predicted motion vector is determined according to its motion vector. Specifically, each motion vector of each coded image block adjacent to the current image block to be coded is first used to determine a corresponding coded image block in the reference frame. For example, the video encoding device first obtains the position information of the current image block to be coded in the current frame, then determines the position in the reference frame that this position information points to, and offsets that position by each motion vector of the adjacent coded image blocks to obtain the coded image block corresponding to each motion vector. Next, the predicted motion vector of the current image block to be coded is determined by comparing the coding cost between the coded image block corresponding to each motion vector in the reference frame and the current image block to be coded. For example, from the coded image blocks determined in the reference frame, the one with the least coding cost is selected, and the motion vector corresponding to it is used as the predicted motion vector.
In this way, when the search area is subsequently determined based on the predicted motion vector of the current image block to be encoded, the search area is more accurate, so the best matching block of the current image block to be encoded can be found more quickly, which helps improve search efficiency and save search resources.

In an example, referring to FIG. 4, the adjacent coded image blocks of the current image block 123 to be coded in the current frame include the coded image block 113, the coded image block 121 and the coded image block 122, and the reference frame is image frame 11, which is an encoded image. The video encoding device determines the position information of the image block 123 to be coded in the current frame 12 and then determines the position in the reference frame (that is, image frame 11) that this position information points to. It then offsets that position by the motion vectors of the coded image block 113, the coded image block 121 and the coded image block 122 respectively, which yields the coded image blocks in the reference frame that correspond to those three motion vectors. Finally, it calculates the coding cost between each of these coded image blocks and the current image block 123 to be coded, and uses the motion vector with the least coding cost as the predicted motion vector of the image block 123 currently to be coded.
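The candidate comparison described above can be sketched in Python as follows. The sketch assumes a sum-of-absolute-differences cost and skips candidates that would point outside the reference frame; both choices, as well as the names `pick_predicted_mv` and `neighbour_mvs`, are illustrative assumptions rather than details given in the application.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def block_at(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]


def pick_predicted_mv(current, reference, top, left, size, neighbour_mvs):
    """Return the neighbour motion vector whose reference block costs least."""
    height, width = len(reference), len(reference[0])
    target = block_at(current, top, left, size)
    best = None
    for dy, dx in neighbour_mvs:
        # Offset the co-located position in the reference frame by the candidate vector.
        y, x = top + dy, left + dx
        if y < 0 or x < 0 or y + size > height or x + size > width:
            continue
        cost = sad(target, block_at(reference, y, x, size))
        if best is None or cost < best[0]:
            best = (cost, (dy, dx))
    return None if best is None else best[1]


# Synthetic frames: the current frame is the reference shifted by (dy=2, dx=-2).
reference = [[(31 * y + 17 * x) % 251 for x in range(16)] for y in range(16)]
current = [[reference[y + 2][x - 2] if (y + 2 < 16 and x - 2 >= 0) else 0
            for x in range(16)] for y in range(16)]

# Motion vectors of three hypothetical neighbours (for example blocks 113, 121 and 122).
neighbour_mvs = [(0, 0), (2, -2), (-1, 3)]
predicted_mv = pick_predicted_mv(current, reference, 6, 6, 4, neighbour_mvs)
```

The number of candidates is bounded by the number of neighbours, so this step also has a fixed worst-case cost.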

In another embodiment, the predicted motion vector of the current image block to be coded can be selected directly, according to a predetermined selection rule, from the motion vectors of the adjacent coded image blocks of the current image block to be coded in the current frame. For example, the selection rule includes selecting the motion vector of the adjacent coded image block at a specific position as the predicted motion vector of the current image block to be coded. In one example, with reference to the drawings, the coded image blocks spatially adjacent to the to-be-coded image block 123 include coded image block 121 and coded image block 122, and the coded image blocks temporally adjacent to the to-be-coded image block 123 include coded image block 113; the selection rule may be to select the coded image block that is spatially adjacent to and above the to-be-coded image block 123, that is, the coded image block 121.

In one embodiment, the encoding cost includes, but is not limited to, the rate-distortion optimization (RDO) cost. In other embodiments, the encoding cost is derived from any one of the sum of absolute differences (SAD), the sum of absolute transformed differences (SATD), the mean squared error (MSE), the sum of squared differences (SSD), and the mean absolute difference (MAE), together with the number of bits needed to encode the motion vector and the rate-distortion parameter lambda.
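The following Python sketch shows how several of the distortion measures listed above might be combined with a motion vector bit count and lambda into a single cost of the form distortion + lambda * bits. SATD is omitted for brevity. The bit estimate uses the length of a signed Exp-Golomb code for each component of the motion vector difference, which is an assumption chosen for illustration; the application does not specify how the bits are counted.

```python
def sad(a, b):
    """Sum of absolute differences."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def ssd(a, b):
    """Sum of squared differences."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def mse(a, b):
    """Mean squared error: SSD divided by the number of pixels."""
    return ssd(a, b) / (len(a) * len(a[0]))


def mean_abs_diff(a, b):
    """Mean absolute difference: SAD divided by the number of pixels."""
    return sad(a, b) / (len(a) * len(a[0]))


def signed_exp_golomb_bits(value):
    """Length in bits of the signed Exp-Golomb code for an integer (assumed bit model)."""
    code_num = 2 * abs(value) - (1 if value > 0 else 0)
    return 2 * ((code_num + 1).bit_length() - 1) + 1


def rd_cost(distortion, mv, predicted_mv, lam):
    """distortion + lambda * (bits for the motion vector difference)."""
    bits = sum(signed_exp_golomb_bits(m - p) for m, p in zip(mv, predicted_mv))
    return distortion + lam * bits


block = [[1, 2], [3, 4]]
candidate = [[1, 0], [5, 4]]
cost = rd_cost(sad(block, candidate), mv=(1, 0), predicted_mv=(0, 0), lam=2.0)
```

The lambda term penalizes candidates far from the predicted motion vector, so between two candidates with equal distortion the one that is cheaper to signal wins.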

After determining the predicted motion vector of the current image block to be encoded, the video encoding apparatus uses the position pointed to by the predicted motion vector in the reference frame of the current image block to be encoded as the search starting point, and uses a preset range that includes the search starting point as the search area. The preset range may be set according to the actual application scenario, which is not limited in this embodiment. In this embodiment, the predicted motion vector of the current image block to be coded is determined according to the motion vectors of the adjacent coded image blocks. The adjacent coded image blocks have a certain correlation with the image block to be coded; in other words, they have a certain similarity with it. For example, most of a sky image is sky area, the pixels in the sky area are similar, and two adjacent image blocks obtained by dividing the sky area are also highly similar. The search area determined according to the motion vectors of adjacent coded image blocks therefore contains the best matching block of the current image block to be coded with high probability, which helps improve search efficiency and save search resources.

In an implementation, the preset range including the search starting point that is used as the search area may be a preset range centered on the search starting point; that is, the best matching block of the current image block to be encoded is searched for around the search starting point. Because the search starting point is determined by the motion vector of a coded image block adjacent to the current image block to be encoded, and the adjacent coded image block has a certain correlation with the image block to be coded, taking the area surrounding the search starting point as the search area helps ensure that the best matching block of the current image block to be coded can be found quickly within the search area.
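A search area centered on the search starting point could be built as in the Python sketch below. The square window of hypothetical `radius`, the clipping of candidates to positions where the whole block stays inside the frame, and the function names are assumptions for illustration; the application leaves the preset range open.

```python
def search_start(block_position, predicted_mv):
    """Search starting point: the block position offset by the predicted motion vector."""
    return (block_position[0] + predicted_mv[0], block_position[1] + predicted_mv[1])


def search_area(start, radius, frame_height, frame_width, block_size):
    """Candidate top-left positions (row, col) in a square window centred on start."""
    start_row, start_col = start
    max_row = frame_height - block_size   # last row where the block still fits
    max_col = frame_width - block_size    # last column where the block still fits
    return [(row, col)
            for row in range(start_row - radius, start_row + radius + 1)
            for col in range(start_col - radius, start_col + radius + 1)
            if 0 <= row <= max_row and 0 <= col <= max_col]


start = search_start((16, 16), (2, -3))
area = search_area(start, radius=4, frame_height=64, frame_width=64, block_size=8)
corner_area = search_area((0, 0), radius=4, frame_height=64, frame_width=64, block_size=8)
```

With `radius=4` the unclipped window is a 9*9 area of 81 search points, and a start point at the frame corner keeps only the part of the window that lies inside the frame.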

Next, in step S102, because a traversal search of the search area consumes considerable search resources and has relatively low search efficiency, the video encoding apparatus in this embodiment selects N first target search points from the M search points included in the search area, where N<M and any two first target search points are not adjacent in at least one direction. Compared with a traversal search, this greatly reduces the number of searches, improves search efficiency, and saves search resources.

It can be understood that the present application does not impose any restrictions on the search order of the N first target search points, and specific settings may be made according to actual application scenarios. In one embodiment, the N first target search points are searched from left to right and from top to bottom. In another embodiment, the N first target search points are searched in two orders, row-by-row scanning and column-by-column scanning.

In an exemplary embodiment, the M search points are arranged in rows and columns. Referring to FIG. 5A, the search area is a 9*9 search area that includes 81 search points. Taking the case where one search point separates any two neighboring first target search points as an example, and referring to FIG. 5B, the N first target search points can be the search points that lie on both odd rows and odd columns of the search area (gray points in the figure). The search can follow a set search order. For example, if the search order is from left to right and from top to bottom and the selection rule is the search points on odd rows and odd columns, the search starts at the first row and first column and proceeds from left to right through the search points at the first row and third column, the first row and fifth column, the first row and seventh column, and the first row and ninth column. It then moves down to the third row and again proceeds from left to right starting at the first column, and so on, until the search point at the ninth row and ninth column has been searched, at which point the search ends.

Alternatively, referring to FIG. 5C, the N first target search points can be the search points that lie on both even rows and even columns (gray points in the figure). For example, if the search order is from left to right and from top to bottom and the selection rule is the search points on even rows and even columns, the search starts at the second row and second column and proceeds from left to right through the search points at the second row and fourth column, the second row and sixth column, and the second row and eighth column. It then moves down to the fourth row and again proceeds from left to right starting at the second column, and so on, until the search point at the eighth row and eighth column has been searched, at which point the search ends.

Alternatively, referring to FIG. 5D, the N first target search points may be the search points located on both even rows and odd columns (the gray points in the figure). For example, with a search order of left to right and top to bottom and a selection rule of even rows and odd columns, the search starts at the second row, first column and proceeds from left to right through the search points at the second row, third column; the second row, fifth column; the second row, seventh column; and the second row, ninth column. The search then moves down to the fourth row, starting at the first column and proceeding from left to right, and so on, until the search point at the eighth row, ninth column has been searched, at which point the search ends.

Alternatively, referring to FIG. 5E, the N first target search points may be the search points located on both odd rows and even columns (the gray points in the figure). For example, with a search order of left to right and top to bottom and a selection rule of odd rows and even columns, the search starts at the first row, second column and proceeds from left to right through the search points at the first row, fourth column; the first row, sixth column; and the first row, eighth column. The search then moves down to the third row, starting at the second column and proceeding from left to right, and so on, until the search point at the ninth row, eighth column has been searched, at which point the search ends.

In another example, taking the case where two search points are spaced between any two first target search points, and referring to FIG. 5F: with a search order of left to right and top to bottom and a selection rule that starts from an even row and an even column, the search starts at the second row, second column and proceeds from left to right through the search points at the second row, fifth column and the second row, eighth column. The search then moves down to the fifth row, starting at the second column and proceeding from left to right, and so on, until the search point at the eighth row, eighth column has been searched, at which point the search ends. Alternatively, the selection may start from an even row and an odd column, an odd row and an odd column, or an odd row and an even column, as set according to the actual application scenario.
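The selection rules of FIG. 5B to FIG. 5F can be summarized as a starting row, a starting column, and a number of skipped points. The following Python sketch is illustrative only; the function name `select_first_targets` and its parameters are assumptions and do not appear in the application. It lists the selected points in left-to-right, top-to-bottom order using 1-based row and column indices.

```python
def select_first_targets(rows, cols, start_row, start_col, gap):
    """Return the selected search points as 1-based (row, col) pairs in raster order.

    start_row, start_col: position of the first selected point (assumed parameters).
    gap: number of unselected search points between two selected points.
    """
    step = gap + 1
    return [(r, c)
            for r in range(start_row, rows + 1, step)
            for c in range(start_col, cols + 1, step)]


# FIG. 5B style: odd rows and odd columns of a 9*9 search area.
odd_odd = select_first_targets(9, 9, start_row=1, start_col=1, gap=1)
# FIG. 5F style: start at row 2, column 2 with two skipped points in between.
sparse = select_first_targets(9, 9, start_row=2, start_col=2, gap=2)

print(len(odd_odd), odd_odd[:5])
print(len(sparse), sparse)
```

Under these assumptions the odd-row, odd-column rule visits 25 of the 81 search points, and the two-point spacing of FIG. 5F visits 9.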

According to an embodiment of the present application, any two first target search points in a first direction are not adjacent, and any two first target search points in a second direction are adjacent. The first direction and the second direction are each a direction indicated by a line connecting two target search points; that is, each direction is indicated by the line obtained by taking the positions of two target search points, in the reference frame corresponding to the current image block to be encoded, as the start point and the end point. There is an included angle between the first direction and the second direction, which may be between 0 and 180 degrees. Referring to FIG. 5G, the first direction is the a direction and the second direction is the b direction. In the a direction (i.e., the horizontal direction), any two first target search points (the gray points in the figure) are not adjacent, whereas in the b direction (i.e., the vertical direction), any two first target search points are adjacent. The angle between the a direction and the b direction is 90 degrees; that is, the a direction and the b direction are orthogonal.

Referring to FIG. 5H, in another embodiment, the first direction is the a direction or the b direction, and the second direction is the c direction. In the a direction or the b direction, any two first target search points are not adjacent, whereas in the c direction any two first target search points are adjacent. The angle between the a direction and the c direction is 45 degrees, and the angle between the b direction and the c direction is also 45 degrees.
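The two directional patterns can be checked with a small Python sketch. The layouts below are only one reading of the text: `column_stripes` (every row, every other column) is assumed to correspond to FIG. 5G, and `checkerboard` (row plus column even) to FIG. 5H. The figures themselves are not reproduced here, and all names are hypothetical.

```python
def column_stripes(rows, cols):
    """Assumed FIG. 5G-like layout: every row, every other column (1-based)."""
    return {(r, c) for r in range(1, rows + 1) for c in range(1, cols + 1, 2)}


def checkerboard(rows, cols):
    """Assumed FIG. 5H-like layout: points whose row + column is even (1-based)."""
    return {(r, c)
            for r in range(1, rows + 1)
            for c in range(1, cols + 1)
            if (r + c) % 2 == 0}


def has_adjacent_pair(points, d_row, d_col):
    """True if some selected point has a selected neighbour one step away in (d_row, d_col)."""
    return any((r + d_row, c + d_col) in points for r, c in points)


stripes = column_stripes(9, 9)
board = checkerboard(9, 9)

# Direction a = horizontal (0, 1), b = vertical (1, 0), c = 45-degree diagonal (1, 1).
print(has_adjacent_pair(stripes, 0, 1), has_adjacent_pair(stripes, 1, 0))
print(has_adjacent_pair(board, 0, 1), has_adjacent_pair(board, 1, 0),
      has_adjacent_pair(board, 1, 1))
```

In the striped layout, selected points touch only vertically; in the checkerboard layout, they touch only along the diagonal, which matches the 45-degree relationship described for the c direction.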

In this embodiment, any two selected first target search points are not adjacent because the predicted image blocks corresponding to adjacent search points are highly correlated. If the predicted image block at one search point is not the best matching block of the current image block to be encoded, the predicted image block at an adjacent search point is also very likely not the best matching block. Searching every adjacent search point as a first target search point would therefore make the search inefficient. Selecting non-adjacent search points as the first target search points saves search resources and reduces the number of searches. In addition, because non-adjacent search points are selected, the first target search points are evenly distributed over the search area, so the search area is searched uniformly, which helps ensure that a good search result is obtained.

In one embodiment, one or more search points may be spaced between any two first target search points. Further, two vertically neighboring first target search points are located in the same column, and two horizontally neighboring first target search points are located in the same row. As an implementation, the number of search points spaced between any two first target search points may be determined based on the size of the search area and/or the number of searches ("and/or" meaning either or both). For example, the number of spaced search points may be positively correlated with the size of the search area and negatively correlated with the number of searches. When the number of searches is fixed, the larger the search area, the greater the number of search points spaced between any two first target search points; conversely, the smaller the search area, the smaller the number of spaced search points. When the search area is fixed, the greater the number of searches, the smaller the number of search points spaced between any two first target search points; conversely, the fewer the searches, the greater the number of spaced search points.
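The application does not give a formula for the spacing. The following Python sketch shows one possible heuristic, offered purely as an assumption: pick the smallest spacing whose grid of selected points fits within a search budget. The name `choose_gap` and the square search area are illustrative choices.

```python
import math


def choose_gap(side, max_searches):
    """Smallest number of skipped points that keeps the selected grid within budget.

    side: number of search points along one side of a square search area.
    max_searches: assumed upper bound on the number of first target search points.
    """
    for gap in range(1, side):
        per_axis = math.ceil(side / (gap + 1))  # selected points per row or column
        if per_axis * per_axis <= max_searches:
            return gap
    return max(side - 1, 0)


# Fixed budget, growing search area: the spacing grows.
print(choose_gap(9, 25), choose_gap(17, 25))
# Fixed search area, growing budget: the spacing shrinks.
print(choose_gap(9, 9), choose_gap(9, 25))
```

This reproduces the stated correlations: the spacing grows with the search area and shrinks as more searches are allowed.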

In an exemplary embodiment, the M search points are arranged in rows and columns. Referring to FIG. 5A, the search area is a 9*9 search area containing 81 search points. Taking the case where one search point is spaced between any two first target search points as an example, and referring to FIG. 5B, FIG. 5C, FIG. 5D and FIG. 5E, the N first target search points may be the search points of the search area located on both odd rows and odd columns, on both even rows and even columns, on both even rows and odd columns, or on both odd rows and even columns (the gray points in the figures). The first target search points are evenly distributed over the search area, so that the search area is searched uniformly and a better search result is obtained. In another example, referring to FIG. 5F, two search points are spaced between any two first target search points. It can be seen that the number of first target search points in each row is equal to that in every other row and the interval between any two first target search points in each row is equal; and/or the number of first target search points in each column is equal to that in every other column and the interval between any two first target search points in each column is equal. In this way, the search area is searched uniformly, which helps ensure a better result.

In step S103, after obtaining the first target search points, the video encoding apparatus may search them. For example, the N first target search points are searched from left to right and from top to bottom.

Because the encoding cost between two image blocks reflects the degree of difference between them, the video encoding apparatus can obtain the encoding cost between the current image block to be encoded and the predicted image block corresponding to each first target search point; the first target search point with the smallest encoding cost is the best matching point of the current image block to be encoded. In one example, referring to FIG. 5B, during the search the video encoding apparatus obtains the encoding cost between the current image block to be encoded and the predicted image block corresponding to each first target search point on odd rows and odd columns. Alternatively, referring to FIG. 5C, the video encoding apparatus obtains the encoding cost between the current image block to be encoded and the predicted image block corresponding to each first target search point on even rows and even columns. Alternatively, referring to FIG. 5D, the video encoding apparatus obtains the encoding cost between the current image block to be encoded and the predicted image block corresponding to each first target search point on even rows and odd columns. Alternatively, referring to FIG. 5E, the video encoding apparatus obtains the encoding cost between the current image block to be encoded and the predicted image block corresponding to each first target search point on odd rows and even columns.

Correspondingly, the predicted image block corresponding to the first target search point with the smallest encoding cost is determined as the best matching block of the current image block to be encoded; in other words, the predicted image block with the smallest difference is determined as the best matching block. In this embodiment, searching only the selected first target search points reduces the number of searches and saves search resources.
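The selection of the best matching block over the first target search points can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the cost is a plain sum of absolute differences, each search point is treated as the top-left pixel of its predicted block (0-based coordinates), and the names `sad`, `block_at` and `best_first_target` are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks (lists of rows)."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def block_at(frame, top, left, size):
    """Extract the size*size block whose top-left pixel is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def best_first_target(cur_block, ref_frame, points):
    """Return the search point with the smallest cost, together with that cost."""
    size = len(cur_block)
    costs = {p: sad(cur_block, block_at(ref_frame, p[0], p[1], size)) for p in points}
    best = min(points, key=costs.__getitem__)
    return best, costs[best]


# Toy 8*8 reference frame in which every pixel value is unique.
ref = [[r * 8 + c for c in range(8)] for r in range(8)]
cur = block_at(ref, 2, 4, 2)                      # block to be matched
points = [(r, c) for r in range(0, 7, 2) for c in range(0, 7, 2)]
print(best_first_target(cur, ref, points))
```

Only the 16 subsampled points are evaluated here, rather than all 49 candidate positions of a 2*2 block in the toy frame.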

The predicted image block corresponding to a first target search point has the same size as the current image block to be encoded, and may be an image block in the reference frame that includes the first target search point; that is, the first target search point is one of the pixels of the corresponding predicted image block. For example, the predicted image block may be determined with the first target search point as its starting point, or the first target search point may be some other pixel of the predicted image block, which is not limited in this embodiment of the present application.

In one example, the encoding cost includes, but is not limited to, a rate-distortion optimization (RDO) cost. In other embodiments, the encoding cost is derived from any one of the sum of absolute differences (SAD), the sum of absolute transformed differences (SATD), the mean squared error (MSE), the sum of squared differences (SSD), and the mean absolute difference (MAE), together with the number of encoded bits of the motion vector and the rate-distortion parameter lambda.
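A common way to combine a distortion measure with the motion vector bits and lambda is the Lagrangian form J = D + lambda * R. The application does not state its exact formula, so the Python sketch below is an assumption: D is taken to be SAD, and R is estimated as the length of a signed Exp-Golomb code for each component of the motion vector difference. The names `se_golomb_bits` and `rd_cost` are hypothetical.

```python
def sad(cur, pred):
    """Sum of absolute differences between two equally sized blocks (lists of rows)."""
    return sum(abs(a - b)
               for row_c, row_p in zip(cur, pred)
               for a, b in zip(row_c, row_p))


def se_golomb_bits(value):
    """Bit length of a signed Exp-Golomb code word (assumed rate model)."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return 2 * ((code_num + 1).bit_length() - 1) + 1


def rd_cost(cur, pred, mvd, lam):
    """J = D + lambda * R, with D = SAD and R = estimated motion vector bits."""
    rate = se_golomb_bits(mvd[0]) + se_golomb_bits(mvd[1])
    return sad(cur, pred) + lam * rate


cur = [[1, 2], [3, 4]]
pred = [[1, 1], [5, 4]]
print(rd_cost(cur, pred, mvd=(1, -1), lam=4))
```

Any of the other listed distortion measures could be substituted for `sad` without changing the structure of the cost.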

Further, please refer to FIG. 6 , which is a schematic flowchart of another motion search method provided by an embodiment of the present application. The method includes:

In step S201, for the current image block to be encoded, a search area is determined in the reference frame of the current image block to be encoded. Similar to step S101, details are not repeated here.

In step S202, N first target search points are selected from the M search points included in the search area; N<M; any two first target search points in at least one direction are not adjacent. Similar to step S102, details are not repeated here.

In step S203, the encoding cost between the current image block to be encoded and the predicted image blocks corresponding to each of the first target search points is acquired. Similar to step S103, details are not repeated here.

In step S204, the first target search point with the smallest encoding cost and the search points in its neighborhood are taken as second target search points, and the search point at the center position of the search area and the search points in the neighborhood of the search point at the center position are also taken as second target search points.

In step S205, the encoding cost between the current image block to be encoded and the predicted image block corresponding to each of the second target search points is acquired.

In step S206, from among the first target search point with the smallest encoding cost and the second target search points, the predicted image block with the smallest encoding cost is selected as the best matching block of the current image block to be encoded.

As mentioned above, the predicted image blocks corresponding to adjacent search points are highly correlated: if the predicted image block corresponding to the current search point is not the best matching block of the current image block to be encoded, the predicted image block corresponding to an adjacent search point is also very likely not the best matching block. The converse also holds. Among all predicted image blocks corresponding to the first target search points, the one corresponding to the first target search point with the smallest encoding cost differs least from the current image block to be encoded, so the predicted image block corresponding to a search point adjacent to that first target search point may differ even less from the current image block to be encoded.

Therefore, to find a better result, in step S204 the video encoding apparatus, having obtained the first target search point with the smallest encoding cost, takes that first target search point and the search points in its neighborhood as second target search points, and also takes the search point at the center position of the search area and the search points in the neighborhood of the search point at the center position as second target search points. Searching the neighborhood of the first target search point with the smallest encoding cost helps obtain a better search result. The search point at the center position may be the search starting point of the current image block to be encoded in the reference frame that was used when determining the search area. The search starting point is determined from the motion vectors of encoded image blocks adjacent to the current image block to be encoded, and such adjacent encoded image blocks usually have a certain similarity to the current image block to be encoded. For example, in an image of the sky, most of the image is sky area, the pixels in the sky area are similar to one another, and two adjacent image blocks divided from the sky area are also highly similar. The search point at the center position of the search area and the search points in its neighborhood are therefore also searched as second target search points in order to find a better result.

In one embodiment, the first target search point with the smallest encoding cost, the search points in its neighborhood, the search point at the center position of the search area, and the search points in the neighborhood of the search point at the center position are all taken as second target search points. For example, while the first target search point with the smallest encoding cost and the search points in its neighborhood are being searched, the search point at the center position of the search area and the search points in its neighborhood are searched at the same time. As mentioned above, the search point at the center position may be the search starting point of the current image block to be encoded in the reference frame that was used when determining the search area. This embodiment considers that the best match is found near the search starting point with high probability, so searching the search point at the center position and the search points in its neighborhood as second target search points can yield a better result. The neighborhood is an L×L area, where L is the number of search points in any row or column of the neighborhood and L≥3; in other words, the number of search points in the neighborhood is greater than or equal to 8 (for L=3, the 8 search points surrounding the point on which the neighborhood is centered).

In an implementation, the size of the neighborhood may be determined according to the size of the search area. For example, the size of the neighborhood is positively correlated with the size of the search area: the larger the search area, the larger the neighborhood, and conversely, the smaller the search area, the smaller the neighborhood, which helps ensure that a better result can be found.

In an example, referring to FIG. 7A and FIG. 7B, assume that the black point in each figure is the first target search point with the smallest encoding cost. FIG. 7A illustrates a 3×3 neighborhood: the first target search point with the smallest encoding cost and the search points in its neighborhood, that is, the 3×3 area centered on the black point, are taken as second target search points. FIG. 7B illustrates a 5×5 neighborhood, that is, the 5×5 area centered on the black point. In addition, the search point at the center position of the search area and the search points in its neighborhood may also be taken as second target search points, that is, the 3×3 area around the central search point in FIG. 7A and the 5×5 area around the central search point in FIG. 7B.
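An L×L neighborhood of this kind can be enumerated with a short Python sketch. Clipping the neighborhood at the border of the search area is an assumption made here for illustration, since the text does not describe border handling; the function name `neighborhood` and the 1-based indexing are likewise hypothetical.

```python
def neighborhood(center, size, rows, cols):
    """Search points of the size*size square centred on `center` (1-based, odd size).

    Points outside the rows*cols search area are dropped (assumed border handling).
    The centre point itself is included in the returned list.
    """
    half = size // 2
    r0, c0 = center
    return [(r, c)
            for r in range(max(1, r0 - half), min(rows, r0 + half) + 1)
            for c in range(max(1, c0 - half), min(cols, c0 + half) + 1)]


print(len(neighborhood((3, 7), 3, 9, 9)))   # 3*3 area as in FIG. 7A
print(len(neighborhood((3, 7), 5, 9, 9)))   # 5*5 area as in FIG. 7B
print(neighborhood((1, 1), 3, 9, 9))        # clipped at the top-left corner
```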

In one embodiment, as shown in FIG. 7B, the neighborhood of the first target search point with the smallest encoding cost may include first target search points that have already been searched, and so may the neighborhood of the search point at the center position. These first target search points have already been searched and found not to have the smallest encoding cost, so searching them again would waste search resources. Therefore, when determining the second target search points, the video encoding apparatus may take as second target search points the first target search point with the smallest encoding cost together with the part of its neighborhood other than first target search points, and the search point at the center position of the search area together with the part of its neighborhood other than first target search points. Because the already-searched first target search points in these neighborhoods have been determined not to be optimal, this embodiment excludes them, which reduces unnecessary searches, saves search resources, and helps improve search efficiency.
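The exclusion of already-searched points can be sketched in Python as follows. The function `second_targets` and its parameters are hypothetical, the indices are 1-based, and clipping at the search-area border is an assumption. Each anchor point (the best first target and the center point) is kept, while other first target search points inside the two neighborhoods are dropped.

```python
def second_targets(best, center, size, first_targets, rows, cols):
    """Second target search points around `best` and `center`, without re-searching.

    best: first target search point with the smallest cost.
    center: search point at the centre of the search area.
    size: side length L of the L*L neighbourhood (odd).
    """
    searched = set(first_targets)
    half = size // 2
    result = []
    for anchor in (best, center):
        r0, c0 = anchor
        for r in range(max(1, r0 - half), min(rows, r0 + half) + 1):
            for c in range(max(1, c0 - half), min(cols, c0 + half) + 1):
                point = (r, c)
                already_searched = point in searched and point != anchor
                if not already_searched and point not in result:
                    result.append(point)
    return result


first = [(r, c) for r in range(1, 10, 2) for c in range(1, 10, 2)]  # odd rows and columns
small = second_targets((3, 7), (5, 5), 3, first, 9, 9)
large = second_targets((3, 7), (5, 5), 5, first, 9, 9)
print(len(small), len(large))
```

With a 5×5 neighborhood, each of the two 25-point areas in this example contains 8 previously searched first target search points besides its anchor, so dropping them removes a noticeable share of the work.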

Next, in step S205, after obtaining the second target search points, the video encoding apparatus searches them. As mentioned above, the encoding cost between two image blocks reflects the degree of difference between them. The video encoding apparatus can therefore obtain the encoding cost between the current image block to be encoded and the predicted image block corresponding to each second target search point, and then, from among the first target search point with the smallest encoding cost and the second target search points, select the predicted image block with the smallest encoding cost as the best matching block of the current image block to be encoded; in other words, the predicted image block with the smallest difference is determined as the best matching block. In this embodiment, adding and searching the second target search points helps find a better result. Compared with a traversal (exhaustive) search, the number of searches is significantly reduced and search resources are saved, while the best matching block of the current image block to be encoded can still be obtained.
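Steps S202 to S206 can be combined into one Python sketch. It is a simplified illustration, not the claimed implementation: the cost is supplied as a callable on 1-based (row, column) search points, the first stage starts at row 1, column 1, the neighborhoods are clipped at the border, and all names are hypothetical.

```python
def two_stage_search(cost, rows, cols, gap=1, size=3):
    """Coarse search on a subsampled grid, then refinement around two anchors.

    cost: callable mapping a (row, col) search point to its encoding cost (assumed).
    Returns (best point, its cost, number of cost evaluations).
    """
    step = gap + 1
    first = [(r, c) for r in range(1, rows + 1, step) for c in range(1, cols + 1, step)]
    costs = {p: cost(p) for p in first}                 # first target search points
    best = min(first, key=costs.get)
    center = ((rows + 1) // 2, (cols + 1) // 2)
    half = size // 2
    for r0, c0 in (best, center):                       # second target search points
        for r in range(max(1, r0 - half), min(rows, r0 + half) + 1):
            for c in range(max(1, c0 - half), min(cols, c0 + half) + 1):
                if (r, c) not in costs:                 # skip points already searched
                    costs[(r, c)] = cost((r, c))
    winner = min(costs, key=costs.get)
    return winner, costs[winner], len(costs)


def toy_cost(point):
    """Toy cost with its minimum at row 4, column 6 (not a first target point)."""
    return abs(point[0] - 4) + abs(point[1] - 6)


print(two_stage_search(toy_cost, 9, 9))
```

In this toy run the refinement stage reaches a point that the coarse grid skipped, while evaluating fewer than half of the 81 search points.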

Correspondingly, referring to FIG. 8, an embodiment of the present application further provides a video encoding apparatus 30, which includes one or more processors 31 that work individually or jointly, the processors including multiple pipeline stages, and a memory 32 that stores executable instructions. When executing the executable instructions, the one or more processors 31 are configured to perform the following operations.

For the current image block to be encoded, a search area is determined in the reference frame corresponding to the current image block to be encoded.

Select N first target search points from the M search points included in the search area; N<M; any two first target search points in at least one direction are not adjacent.

The encoding cost between the currently to-be-encoded image block and the predicted image blocks corresponding to each of the first target search points is acquired.

The processor 31 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.

The memory 32 may include at least one type of storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disk, and the like. The memory 32 may be an internal storage unit of the video encoding apparatus 30, such as a hard disk or internal memory. The memory 32 may also be an external storage device of the video encoding apparatus 30, such as a plug-in hard disk, a smart media card (SMC), a Secure Digital (SD) card, or a flash card equipped on the video encoding apparatus 30. Further, the memory 32 may include both an internal storage unit of the video encoding apparatus 30 and an external storage device. The memory 32 is used to store computer programs and other programs and data required by the apparatus, and may also be used to temporarily store data that has been output or is to be output.

In one embodiment, one or more search points are spaced between any two first target search points.

In one embodiment, the number of search points spaced between any two first target search points is determined based on the size of the search area and/or the number of searches.

In one embodiment, the M search points are arranged in rows and columns.

The N first target search points are the search points of the search area located on both odd rows and odd columns, on both even rows and even columns, on both even rows and odd columns, or on both odd rows and even columns.

The number of first target search points in each row is equal to the number of first target search points in other rows, the number of first target search points in each column is equal to the number of first target search points in other columns, and the interval between any two first target search points is equal.

In one embodiment, the processor 31 is further configured to: take the first target search point with the smallest encoding cost and the search points in its neighborhood as second target search points, and take the search point at the center position of the search area and the search points in the neighborhood of the search point at the center position as second target search points; obtain the encoding cost between the current image block to be encoded and the predicted image block corresponding to each of the second target search points; and, from among the first target search point with the smallest encoding cost and the second target search points, select the predicted image block with the smallest encoding cost as the best matching block of the current image block to be encoded.

In an embodiment, when determining the second target search points, the processor is specifically configured to: take as second target search points the first target search point with the smallest encoding cost together with the part of its neighborhood other than first target search points; and take as second target search points the search point at the center position of the search area together with the part of the neighborhood of the search point at the center position other than first target search points.

In one embodiment, the number of search points within the neighborhood is greater than or equal to eight.

In one embodiment, the neighborhood is an L×L area, and L represents the number of search points in any row or column in the neighborhood, and L≥3.

In one embodiment, the size of the neighborhood is determined according to the size of the search area.

In one embodiment, the size of the neighborhood is positively correlated with the size of the search area.

In an embodiment, when determining the search area, the processor is specifically configured to: determine the search area in the reference frame according to the predicted motion vector of the current image block to be encoded; the predicted motion vector is determined based on the motion vectors of one or more adjacent coded image blocks.

In an embodiment, the adjacent coded image blocks include temporally adjacent coded image blocks and/or spatially adjacent coded image blocks.

In one embodiment, the predicted motion vector is the motion vector of the one with the least coding cost among the one or more adjacent coded image blocks.

In one embodiment, when determining the search area, the processor 31 is specifically configured to: in the reference frame of the current image block to be encoded, use the position pointed to by the predicted motion vector as the search starting point, and use a preset range that includes the search starting point as the search area.
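The determination of the search area from a predicted motion vector can be sketched in Python as follows. This is an illustration under assumptions: neighbors are given as (motion vector, cost) pairs, the preset range is a square of a given radius around the search starting point, and candidates are clamped so that every predicted block lies inside the frame. None of the names or the clamping rule come from the application.

```python
def predicted_mv(neighbors):
    """Motion vector of the adjacent coded block with the smallest cost.

    neighbors: list of ((dx, dy), cost) pairs for adjacent coded blocks (assumed input).
    """
    return min(neighbors, key=lambda item: item[1])[0]


def search_area(block_x, block_y, pmv, radius, frame_w, frame_h, block_size):
    """Search starting point and clamped bounds (left, top, right, bottom)."""
    start_x = block_x + pmv[0]
    start_y = block_y + pmv[1]
    left = max(0, start_x - radius)
    top = max(0, start_y - radius)
    right = min(frame_w - block_size, start_x + radius)    # keep blocks inside the frame
    bottom = min(frame_h - block_size, start_y + radius)
    return (start_x, start_y), (left, top, right, bottom)


neighbors = [((3, -1), 120), ((2, 0), 80), ((0, 0), 95)]
pmv = predicted_mv(neighbors)
print(pmv, search_area(16, 16, pmv, radius=4, frame_w=64, frame_h=64, block_size=8))
```

A radius of 4 gives a 9*9 search area centered on the search starting point when no clamping is needed.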

In one embodiment, the encoding cost includes at least one of: rate-distortion optimization, sum of absolute differences, sum of absolute transformed differences, mean squared error, sum of squared differences, mean absolute difference, or the number of bits of a motion vector.

Since the apparatus embodiments basically correspond to the method embodiments, reference may be made to the descriptions of the method embodiments for the related parts. The apparatus embodiments described above are only illustrative: the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium, such as a memory including instructions, executable by a processor of an electronic device to perform the above method. For example, the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.

Wherein, when the instructions in the storage medium are executed by the processor, the electronic device is enabled to execute the aforementioned motion search method.

It should be noted that, in this document, relational terms such as first and second are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply that any such actual relationship or sequence exists between these entities or operations. The terms "comprising", "including" or any other variation thereof are intended to cover non-exclusive inclusion, such that a process, method, article or apparatus comprising a list of elements includes not only those elements, but also other elements not expressly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element qualified by the phrase "comprising a..." does not preclude the presence of additional identical elements in the process, method, article or apparatus that includes the element.

The methods and apparatuses provided by the embodiments of the present application have been described in detail above, and specific examples have been used to explain the principles and implementations of the present application. Those of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims

A motion search method, comprising:

for a current image block to be encoded, determining a search area in a reference frame corresponding to the current image block to be encoded;

selecting N first target search points from M search points included in the search area, where N<M and any two first target search points in at least one direction are not adjacent;

obtaining the encoding cost between the current image block to be encoded and the predicted image block corresponding to each of the first target search points; and

determining the predicted image block corresponding to the first target search point with the smallest encoding cost as the best matching block of the current image block to be encoded.
The method according to claim 1, wherein any two first target search points in the first direction are not adjacent, and any two first target search points in the second direction are adjacent;

wherein the first direction and the second direction are each a direction indicated by a line connecting any two target search points, and there is an included angle between the first direction and the second direction.
The method according to claim 1, wherein one or more search points are spaced between any two first target search points.
The method according to claim 1, wherein the number of search points spaced between any two first target search points is determined based on the size of the search area and/or the number of searches.
The method according to claim 1, wherein the M search points are arranged in rows and columns;

the N first target search points are all search points located on both odd-numbered rows and odd-numbered columns in the search area, all search points located on both even-numbered rows and even-numbered columns, all search points located on both even-numbered rows and odd-numbered columns, or all search points located on both odd-numbered rows and even-numbered columns;

wherein the number of first target search points in each row is equal to the number of first target search points in every other row, and the interval between any two first target search points in each row is equal; and/or

the number of first target search points in each column is equal to the number of first target search points in every other column, and the interval between any two first target search points in each column is equal.
The method according to claim 1, wherein after acquiring the encoding cost between the current image block to be encoded and the predicted image blocks corresponding to each of the first target search points, the method further comprises:

taking the first target search point with the smallest encoding cost and the search points in the neighborhood of that first target search point as second target search points, and taking the search point at the center position of the search area and the search points in the neighborhood of the search point at the center position as second target search points;

obtaining the encoding cost between the current image block to be encoded and the predicted image block corresponding to each of the second target search points;

wherein determining the predicted image block corresponding to the first target search point with the smallest encoding cost as the best matching block of the current image block to be encoded comprises:

selecting, from the predicted image blocks corresponding to the first target search point with the smallest encoding cost and to the second target search points, the predicted image block with the smallest encoding cost as the best matching block of the current image block to be encoded.
The method according to claim 6, wherein taking the first target search point with the smallest encoding cost and the search points in the neighborhood of that first target search point as second target search points comprises:

taking the first target search point with the smallest encoding cost, and the search points in its neighborhood other than first target search points, as second target search points;

and wherein taking the search point at the center position of the search area and the search points in the neighborhood of the search point at the center position as second target search points comprises:

taking the search point at the center position of the search area, and the search points in its neighborhood other than first target search points, as second target search points.
The method according to claim 6, wherein the number of search points in the neighborhood is greater than or equal to 8.
The method according to claim 6, wherein the neighborhood is an area of L×L, and the L represents the number of search points in any row or column in the neighborhood, and L≥3.
The method according to claim 6, wherein the size of the neighborhood is determined according to the size of the search area.
The method according to claim 10, wherein the size of the neighborhood is positively correlated with the size of the search area.
The method according to claim 1, wherein determining a search area in the reference frame corresponding to the current image block to be encoded comprises:

determining the search area in the reference frame according to a predicted motion vector of the current image block to be encoded, the predicted motion vector being determined based on the motion vectors of one or more adjacent coded image blocks.
The method according to claim 12, wherein the adjacent coded image blocks comprise temporally adjacent coded image blocks and/or spatially adjacent coded image blocks.
The method according to claim 12, wherein the predicted motion vector is the motion vector of the one with the least coding cost among the one or more adjacent coded image blocks.
The method according to claim 12, wherein determining the search area in the reference frame according to the predicted motion vector of the current image block to be encoded comprises:

in the reference frame of the current image block to be encoded, using the position pointed to by the predicted motion vector as a search start point, and using a preset range including the search start point as the search area.
The method according to claim 1 or 14, wherein the encoding cost comprises at least one of the following: a rate-distortion optimization cost, a sum of absolute errors, a sum of absolute transformed errors, a mean squared error, a sum of squared differences, a mean absolute difference, or the number of encoded bits of the motion vector.
The method of claim 1, wherein the method is applied to a chip comprising a plurality of pipeline stages; and the method is performed in one of the pipeline stages.
A video encoding device, characterized in that it comprises one or more processors, working independently or together, the processors comprising a plurality of pipeline stages; and a memory for storing executable instructions;

When the processor executes the executable instructions, the following steps are performed in one of the pipeline stages:

for a current image block to be encoded, determining a search area in a reference frame corresponding to the current image block to be encoded;

selecting N first target search points from M search points included in the search area, where N<M and, in at least one direction, any two first target search points are not adjacent;

obtaining the encoding cost between the current image block to be encoded and the predicted image block corresponding to each of the first target search points;

determining the predicted image block corresponding to the first target search point with the smallest encoding cost as the best matching block of the current image block to be encoded.
The device according to claim 18, wherein any two first target search points in the first direction are not adjacent, and any two first target search points in the second direction are adjacent;

wherein the first direction and the second direction are each a direction indicated by a line connecting any two target search points, and there is an included angle between the first direction and the second direction.
The device according to claim 18, wherein one or more search points are spaced between any two first target search points.
The device according to claim 18, wherein the number of search points spaced between any two first target search points is determined based on the size of the search area and/or the number of searches.
The device according to claim 18, wherein the M search points are arranged in rows and columns;

the N first target search points are all search points located on both odd-numbered rows and odd-numbered columns in the search area, all search points located on both even-numbered rows and even-numbered columns, all search points located on both even-numbered rows and odd-numbered columns, or all search points located on both odd-numbered rows and even-numbered columns;

wherein the number of first target search points in each row is equal to the number of first target search points in every other row, and the interval between any two first target search points in each row is equal; and/or

the number of first target search points in each column is equal to the number of first target search points in every other column, and the interval between any two first target search points in each column is equal.
The device according to claim 18, wherein the processor is further configured to:

take the first target search point with the smallest encoding cost and the search points in the neighborhood of that first target search point as second target search points, and take the search point at the center position of the search area and the search points in the neighborhood of the search point at the center position as second target search points;

obtain the encoding cost between the current image block to be encoded and the predicted image block corresponding to each of the second target search points;

select, from the predicted image blocks corresponding to the first target search point with the smallest encoding cost and to the second target search points, the predicted image block with the smallest encoding cost as the best matching block of the current image block to be encoded.
The device according to claim 23, wherein when determining the second target search points, the processor is specifically configured to: take the first target search point with the smallest encoding cost, and the search points in its neighborhood other than first target search points, as second target search points; and take the search point at the center position of the search area, and the search points in its neighborhood other than first target search points, as second target search points.
The device according to claim 23, wherein the number of search points in the neighborhood is greater than or equal to 8.
The device according to claim 23, wherein the neighborhood is an area of L×L, and L represents the number of search points in any row or column in the neighborhood, and L≥3.
The device according to claim 23, wherein the size of the neighborhood is determined according to the size of the search area.
The device according to claim 27, wherein the size of the neighborhood is positively correlated with the size of the search area.
The device according to claim 18, wherein when determining the search area, the processor is specifically configured to: determine the search area in the reference frame according to a predicted motion vector of the current image block to be encoded, the predicted motion vector being determined based on the motion vectors of one or more adjacent coded image blocks.
The device according to claim 29, wherein the adjacent coded image blocks comprise temporally adjacent coded image blocks and/or spatially adjacent coded image blocks.
The device according to claim 29, wherein the predicted motion vector is the motion vector of the one with the least encoding cost among the one or more adjacent coded image blocks.
The device according to claim 29, wherein when determining the search area, the processor is specifically configured to: in the reference frame of the current image block to be encoded, use the position pointed to by the predicted motion vector as a search start point, and use a preset range including the search start point as the search area.
The device according to claim 18 or 31, wherein the encoding cost comprises at least one of the following: a rate-distortion optimization cost, a sum of absolute errors, a sum of absolute transformed errors, a mean squared error, a sum of squared differences, a mean absolute difference, or the number of encoded bits of the motion vector.
A computer-readable storage medium, characterized in that computer instructions are stored thereon, and when the instructions are executed by a processor, the method according to any one of claims 1 to 17 is implemented.