CN110730344A

CN110730344A - Video coding method and device and computer storage medium

Info

Publication number: CN110730344A
Application number: CN201910883606.7A
Authority: CN
Inventors: 方诚; 江东; 林聚财; 殷俊; 曾飞洋
Original assignee: Zhejiang Dahua Technology Co Ltd
Current assignee: Zhejiang Dahua Technology Co Ltd
Priority date: 2019-09-18
Filing date: 2019-09-18
Publication date: 2020-01-24
Anticipated expiration: 2039-09-18
Also published as: CN110730344B

Abstract

The application discloses a video coding method, a video coding device and a computer storage medium. The video coding method comprises: obtaining a hash value of a current block in a current video frame; matching reference blocks of a reference video frame by using the hash value of the current block to obtain integer-pixel candidate matching blocks; performing a traversal MMVD search around the current integer-pixel candidate matching block, starting from a preset corner pixel of that block, to obtain sub-pixel blocks with the same hash value as sub-pixel candidate matching blocks; and selecting the best matching block from the candidate matching blocks, which include the integer-pixel candidate matching blocks and the sub-pixel candidate matching blocks. In this way, sub-pixel matching blocks are used alongside integer-pixel matching blocks, which improves the accuracy of selecting the best matching block and thereby the coding quality of the video frame.

Description

Video coding method and device and computer storage medium

Technical Field

The present application relates to the field of image coding technologies, and in particular, to a video coding method and apparatus, and a computer storage medium.

Background

Because the data volume of video images is large, video images need to be encoded and decoded when they are exchanged. The main function of video coding is to compress video pixel data (RGB, YUV and the like) into a video code stream, reducing the data volume of the video and thereby reducing the network bandwidth required for transmission and the storage space required.

The video coding system mainly comprises video acquisition, prediction, transformation quantization and entropy coding, wherein the prediction comprises an intra-frame prediction part and an inter-frame prediction part, the intra-frame prediction part is used for compressing an image by using spatial correlation in an image frame, and the inter-frame prediction part is used for compressing the image by using temporal correlation among image frames.

One of the prediction modes in inter prediction is the hash prediction mode. In the prior art, the application of the hash prediction mode is limited and its precision is low.

Disclosure of Invention

In order to solve the above problems, the present application provides a video encoding method, an apparatus and a computer storage medium, which can use a sub-pixel matching block while using an integer-pixel matching block, and is beneficial to improving the accuracy of selecting an optimal matching block, so as to further improve the encoding quality of a video frame.

The technical scheme adopted by the application is as follows: there is provided a video encoding method, the method comprising: obtaining a hash value of a current block in a current video frame; matching reference blocks of a reference video frame by using the hash value of the current block to obtain integer-pixel candidate matching blocks; performing a traversal MMVD search around the current integer-pixel candidate matching block, starting from a preset corner pixel of that block, to obtain sub-pixel blocks with the same hash value as sub-pixel candidate matching blocks; and selecting the best matching block from the candidate matching blocks, which include the integer-pixel candidate matching blocks and the sub-pixel candidate matching blocks.

Performing the traversal MMVD search around the current integer-pixel candidate matching block, starting from a preset corner pixel of that block, comprises: taking the upper-left pixel of the current integer-pixel candidate matching block as the starting point, performing a traversal MMVD search around that candidate matching block with a search amplitude of sub-pixel precision; each time a sub-pixel point is reached, constructing a sub-pixel prediction block of the same size as the current block with that sub-pixel point as its upper-left pixel, wherein every pixel point in the sub-pixel prediction block is a sub-pixel whose value is interpolated from the adjacent integer pixel values; and performing hash comparison between the sub-pixel prediction block and the current block, and retaining the sub-pixel prediction blocks with the same hash as sub-pixel candidate matching blocks.

The method for selecting the best matching block from the candidate matching blocks including the integer pixel candidate matching block and the sub-pixel candidate matching block comprises the following steps: calculating the distance between the center point of all candidate matching blocks and the center point of the current block; and selecting the candidate matching block with the shortest distance as the best matching block.

Selecting the candidate matching block with the shortest distance as the best matching block comprises: judging whether there are at least two candidate matching blocks that share the same shortest distance; and if so, sorting them by coding order and selecting the candidate matching block that comes last in coding order as the best matching block.

The method for selecting the best matching block from the candidate matching blocks including the integer pixel candidate matching block and the sub-pixel candidate matching block comprises the following steps: calculating rdcosts of all candidate matching blocks; the candidate matching block with the lowest rdcost is selected as the best matching block.

After selecting the best matching block from the candidate matching blocks including the integer pixel candidate matching block and the sub-pixel candidate matching block, the method comprises the following steps: the MV and predictor are determined using the best match block.

Wherein determining the MV and the predicted value using the best matching block comprises: using the MV corresponding to the best matching block as an original MV, and further carrying out MMVD search to obtain a final MV; and constructing a block with the same size as the current block by taking the position pointed by the final MV as a central point as a reference block for obtaining a predicted value.

Constructing a block of the same size as the current block, with the position pointed to by the final MV as its center point, as the reference block for obtaining the predicted value comprises: judging whether the position pointed to by the final MV is a sub-pixel; and if it is a sub-pixel, interpolating from the integer pixels near the sub-pixel to obtain the value of the sub-pixel, so as to obtain the predicted value.

Wherein determining the MV and the predicted value using the best matching block comprises: selecting, from the candidate matching blocks other than the best matching block, the n candidate matching blocks with the lowest rdcost; computing a weighted average of the pixel values at corresponding positions of these n candidate matching blocks and the best matching block to obtain a weighted prediction value; calculating the rdcost of the weighted prediction value, comparing it with the rdcost of predicting directly from the best matching block, and selecting the prediction with the smaller rdcost as the final prediction value; wherein the weight of the pixel values of the best matching block is set to w, 1 > w > 0.5, the weights of the pixel values of the n candidate matching blocks sum to 1 - w, and a candidate matching block with a smaller rdcost receives a larger weight.
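The weighted blending described above can be sketched as follows. This is only an illustration under stated assumptions: the text requires that the n candidate weights sum to 1 - w and grow as rdcost shrinks, but does not give a formula, so the sketch assumes weights proportional to 1/rdcost. The function name weighted_prediction and the default w = 0.6 are hypothetical.

```python
def weighted_prediction(best, others, w=0.6):
    """Blend the best block with n other candidates.

    best:   2-D list of pixel values of the best matching block
    others: list of (rdcost, block) pairs with rdcost > 0
    w:      weight of the best block, 0.5 < w < 1
    """
    # Assumed rule: candidate weights proportional to 1/rdcost, summing to 1 - w.
    inv = [1.0 / cost for cost, _ in others]
    weights = [(1.0 - w) * v / sum(inv) for v in inv]
    out = []
    for r in range(len(best)):
        row = []
        for c in range(len(best[0])):
            acc = w * best[r][c]
            acc += sum(wt * blk[r][c] for wt, (_, blk) in zip(weights, others))
            row.append(int(acc + 0.5))  # round to the nearest integer pixel value
        out.append(row)
    return out
```

The encoder would then compute the rdcost of the blended block and keep it only if that cost is lower than the cost of predicting from the best matching block alone.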

Wherein determining the MV and the predicted value using the best matching block comprises: selecting, from the MVs of the candidate matching blocks other than the best matching block, the MVs of the n candidate matching blocks with the lowest rdcost; computing a weighted average of these n MVs and the MV of the best matching block to obtain a weighted MV; obtaining a candidate predicted value from the position pointed to by the weighted MV, calculating the rdcost of the candidate predicted value, comparing it with the rdcost of predicting directly from the best matching block, and selecting the prediction with the smaller rdcost as the final predicted value; wherein the weight of the MV of the best matching block is set to w, 1 > w > 0.5, the weights of the MVs of the n candidate matching blocks sum to 1 - w, and a candidate matching block with a smaller rdcost receives a larger weight.
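The same weighting idea applied to motion vectors can be sketched as below. As an assumption, the n candidate weights are again taken proportional to 1/rdcost, and the weighted MV is rounded to the nearest unit of the MV grid; neither detail is specified in the text. The name weighted_mv is hypothetical.

```python
def weighted_mv(best_mv, others, w=0.6):
    """Weighted average of the best MV and n other candidate MVs.

    best_mv: (x, y) of the best matching block
    others:  list of (rdcost, (x, y)) pairs with rdcost > 0
    """
    # Assumed rule: candidate weights proportional to 1/rdcost, summing to 1 - w.
    inv = [1.0 / cost for cost, _ in others]
    weights = [(1.0 - w) * v / sum(inv) for v in inv]
    x = w * best_mv[0] + sum(wt * mv[0] for wt, (_, mv) in zip(weights, others))
    y = w * best_mv[1] + sum(wt * mv[1] for wt, (_, mv) in zip(weights, others))
    return round(x), round(y)
```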

Matching reference blocks of the reference video frame by using the hash value of the current block to obtain integer-pixel candidate matching blocks comprises: performing a matching search among the already coded blocks in the frame by using the hash value of the current block, and finding integer-pixel candidate matching blocks with the same block size and the same hash value.

Another technical scheme adopted by the application is as follows: there is provided a video encoding apparatus, the video encoding apparatus comprising a processor and a memory that are interconnected, the memory being used for storing program data, and the processor being used for executing the program data to implement the method as described above.

Another technical scheme adopted by the application is as follows: there is provided a computer storage medium having stored thereon program data for implementing the method as described above when executed by a processor.

The video coding method provided by the application comprises: obtaining a hash value of a current block in a current video frame; matching reference blocks of a reference video frame by using the hash value of the current block to obtain integer-pixel candidate matching blocks; performing a traversal MMVD search around the current integer-pixel candidate matching block, starting from a preset corner pixel of that block, to obtain sub-pixel blocks with the same hash value as sub-pixel candidate matching blocks; and selecting the best matching block from the candidate matching blocks, which include the integer-pixel candidate matching blocks and the sub-pixel candidate matching blocks. In this way, when candidate matching blocks are obtained, sub-pixel matching blocks are used in addition to integer-pixel matching blocks, so the pool of candidate matching blocks to choose from is enlarged, the accuracy of selecting the best matching block is improved, and the coding quality of the video frame is further improved.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort. Wherein:

fig. 1 is a schematic flowchart of a first embodiment of a video encoding method provided in the present application;

FIG. 2 is a schematic diagram of a non-square CU block as provided herein;

FIG. 3 is a schematic diagram of a current block and a neighboring block of a spatial MV provided in the present application;

FIG. 4 is a schematic diagram of a time-domain MV provided herein;

FIG. 5 is a schematic flow chart of step 13;

FIG. 6 is a schematic diagram of an integer-pixel candidate matching block and a sub-pixel prediction block provided herein;

FIG. 7 is a schematic flow chart of step 14;

FIG. 8 is a schematic diagram of the selection of best match blocks provided herein;

fig. 9 is a flowchart illustrating a second embodiment of a video encoding method provided in the present application;

FIG. 10 is a first schematic diagram of obtaining MVs and predicted values as provided herein;

fig. 11 is a flowchart illustrating a video encoding method according to a third embodiment of the present application;

FIG. 12 is a second schematic diagram of obtaining MVs and predicted values as provided herein;

fig. 13 is a schematic flowchart of a fourth embodiment of a video encoding method provided in the present application;

FIG. 14 is a third schematic diagram of obtaining MVs and predicted values as provided herein;

fig. 15 is a schematic structural diagram of a first embodiment of a video encoding apparatus provided in the present application;

fig. 16 is a schematic structural diagram of a second embodiment of a video encoding apparatus provided in the present application;

fig. 17 is a schematic structural diagram of a computer storage medium provided in the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be further noted that, for the convenience of description, only some of the structures related to the present application are shown in the drawings, not all of the structures. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The terms "first", "second", etc. in this application are used to distinguish between different objects and not to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.

Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a first embodiment of a video encoding method provided in the present application, the method including:

step 11: and obtaining the hash value of the current block in the current video frame.

For a frame of image, it is generally divided into a plurality of CU (coding unit) blocks, and each CU block is individually coded.

There are several ways of dividing a frame image into CU blocks, and the hash value of the current CU block is obtained differently depending on the shape of the CU block and the dividing mode. Several embodiments are described below:

HASH value acquisition for square CU block:

(1) For a 4 × 4 block, the current block is divided into 4 sub-blocks of 2 × 2, the Cyclic Redundancy Check (CRC) value of each sub-block is calculated as its hash value, and the hash values of the 4 sub-blocks are combined into a total hash value, which is the hash value of the current 4 × 4 block.

(2) For an 8 × 8 block, the current block is equally divided into 4 sub-blocks of 4 × 4, the CRC value (obtained as in scheme (1) above) is calculated in units of 4 × 4 blocks as the hash value of each 4 × 4 sub-block, and the hash values of the 4 sub-blocks are combined into the hash value of the current 8 × 8 block.

(3) For a 16 × 16 block, the current block is equally divided into 16 sub-blocks of 4 × 4, and the CRC value (obtained as in scheme (2) above) is calculated in units of 4 × 4 blocks as the hash value of each 4 × 4 sub-block; the current block is also equally divided into 4 sub-blocks of 8 × 8, the hash value of each 8 × 8 sub-block is obtained from its 4 internal 4 × 4 sub-blocks, and the hash values of the 4 sub-blocks of 8 × 8 are combined into the hash value of the current 16 × 16 block.

(4) For a 64 × 64 block, the current block is equally divided into 256 sub-blocks of 4 × 4, and the CRC value (obtained as in scheme (2) above) is calculated in units of 4 × 4 blocks as the hash value of each 4 × 4 sub-block; the current block is equally divided into 64 sub-blocks of 8 × 8, the hash value of each being obtained from its 4 internal 4 × 4 sub-blocks; the current block is equally divided into 16 sub-blocks of 16 × 16, the hash value of each being obtained from its 4 internal 8 × 8 sub-blocks; the current block is equally divided into 4 sub-blocks of 32 × 32, the hash value of each being obtained from its 4 internal 16 × 16 sub-blocks; finally, the hash values of the 4 sub-blocks of 32 × 32 are combined into the hash value of the current 64 × 64 block.
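The bottom-up construction in schemes (1) to (4) can be sketched recursively. The sketch makes two assumptions that the text does not settle: zlib.crc32 stands in for the CRC (the polynomial is not specified), and four child hashes are combined by taking a CRC over their concatenated bytes (the combining rule is not specified). All function names are hypothetical.

```python
import zlib

def crc_of_pixels(block):
    """CRC-32 over the pixel bytes of a block given as a list of rows (values 0..255)."""
    return zlib.crc32(bytes(p for row in block for p in row))

def combine(hashes):
    """Assumed merge rule: CRC-32 over the child hashes, each as 4 big-endian bytes."""
    return zlib.crc32(b"".join(h.to_bytes(4, "big") for h in hashes))

def quadrants(block):
    """Split a square block into four equal sub-blocks (top-left, top-right, bottom-left, bottom-right)."""
    half = len(block) // 2
    return [[row[c:c + half] for row in block[r:r + half]]
            for r in (0, half) for c in (0, half)]

def square_hash(block):
    """Hash of an N x N block (N a power of two, N >= 4), built from its sub-block hashes."""
    if len(block) == 2:  # leaf level: a 2 x 2 sub-block of a 4 x 4 block
        return crc_of_pixels(block)
    return combine([square_hash(q) for q in quadrants(block)])
```

For a 64 × 64 block the recursion passes through the 32 × 32, 16 × 16, 8 × 8 and 4 × 4 levels in turn, matching the order in scheme (4).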

HASH value acquisition for non-square CU blocks:

First, the current non-square block is divided into several square blocks whose side length is the short side of the current block. The square blocks are then checked in order, from left to right or from top to bottom, to see whether the pixel values within each square block are all the same (completely flat). The first square block whose pixel values are not all the same is used as the square block for subsequent matching; if every square block is completely flat, the first square block is used for matching. The HASH value of the chosen block is obtained with the method for square blocks described above.

As shown in fig. 2, fig. 2 is a schematic diagram of a non-square CU block provided in the present application. For example, block A is a non-square CU block, and several consecutive square blocks are obtained with the short side of block A as the side length, i.e. block A₁, block A₂, block A₃ and block A₄.

In one case, the pixel values within each of block A₁, block A₂, block A₃ and block A₄ are all the same; block A₁ is then used as the square block for matching, and the hash value of block A₁, calculated in the manner described above for square blocks, is used as the hash value of block A.

In another case, the pixel values within block A₁ and within block A₂ are all the same, but the pixel values within block A₃ are not; block A₃ is then used as the square block for matching, and the hash value of block A₃, calculated in the manner described above for square blocks, is used as the hash value of block A.
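The selection of the representative square block can be sketched as below, assuming the block is given as a list of pixel rows and that wide blocks are scanned left to right while tall blocks are scanned top to bottom. The function name representative_square is hypothetical.

```python
def representative_square(block):
    """Pick the square block used to hash a non-square block (list of pixel rows)."""
    height, width = len(block), len(block[0])
    side = min(height, width)
    squares = []
    if width >= height:  # wide block: squares from left to right
        for x in range(0, width, side):
            squares.append([row[x:x + side] for row in block])
    else:                # tall block: squares from top to bottom
        for y in range(0, height, side):
            squares.append(block[y:y + side])
    for sq in squares:
        first = sq[0][0]
        if any(p != first for row in sq for p in row):
            return sq    # first square that is not completely flat
    return squares[0]    # every square is flat: fall back to the first one
```

The returned square would then be hashed with the square-block method to give the hash value of the whole non-square block.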

In the above process of calculating hash values, two hash values (CRC check values), hashvalue1 and hashvalue2, are actually calculated in each case using two different parameters. The lower 16 bits of hashvalue1 are calculated from the original pixel values of the current block, and the block size information is stored from the 17th bit upward. hashvalue2 is also calculated from the original pixel values of the current block; it is used to eliminate hash collisions and achieve exact matching.
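A minimal sketch of the two-value scheme follows. The text does not say which two parameters are used, so the sketch assumes two CRC-16 polynomials (0x1021 and 0x8005) and a hypothetical size encoding that packs log2(width) and log2(height) above bit 16. The names crc16, block_hashes and same_block are illustrative only.

```python
def crc16(data, poly):
    """Bitwise MSB-first CRC-16 with initial value 0 and the given polynomial."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def block_hashes(pixels, width, height):
    """Return (hashvalue1, hashvalue2) for a block given as a flat list of pixel values."""
    data = bytes(pixels)
    # Assumed size encoding: log2(width) and log2(height), 3 bits each.
    size_info = ((width.bit_length() - 1) << 3) | (height.bit_length() - 1)
    hashvalue1 = (size_info << 16) | crc16(data, 0x1021)  # low 16 bits: pixel hash
    hashvalue2 = crc16(data, 0x8005)                      # second check against collisions
    return hashvalue1, hashvalue2

def same_block(h_cur, h_ref):
    """Two blocks are treated as matching only if both hash values agree."""
    return h_cur[0] == h_ref[0] and h_cur[1] == h_ref[1]
```

Because the size sits in the upper bits of hashvalue1, blocks of different sizes can never match even when their pixel CRCs coincide.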

Optionally, before step 11, a step of determining whether a hash prediction mode can be adopted may be added, if so, the method of this embodiment is further adopted to perform hash prediction, and if not, other prediction modes may be adopted. Other prediction modes are not included in the description of the present embodiment, and are not described in detail here.

Specifically, the hash prediction mode is enabled only when the number of 4 × 4 stripe blocks in the current frame reaches a certain level. First, the current frame image is divided into 4 × 4 blocks, and the total number of blocks is denoted allNum. If the luminance pixel values within each row of a 4 × 4 block are the same, or the luminance pixel values within each column of the block are the same, the block is determined to be a stripe block; the number of stripe blocks is denoted simpleNum. The HASH prediction mode is enabled when simpleNum >= 0.3 × allNum. The value 0.3 is a user-defined coefficient; in other embodiments it may be changed to any number between 0 and 1.
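The enabling test can be sketched directly from the description. The sketch assumes the luminance plane is a 2-D list whose width and height are multiples of 4; the function names are hypothetical, while allNum and simpleNum follow the text.

```python
def is_stripe_block(blk):
    """A 4 x 4 luma block is a stripe block if every row is constant or every column is constant."""
    rows_flat = all(len(set(row)) == 1 for row in blk)
    cols_flat = all(len(set(col)) == 1 for col in zip(*blk))
    return rows_flat or cols_flat

def hash_mode_enabled(luma, ratio=0.3):
    """Enable hash prediction when simpleNum >= ratio * allNum."""
    all_num = simple_num = 0
    for y in range(0, len(luma), 4):
        for x in range(0, len(luma[0]), 4):
            blk = [row[x:x + 4] for row in luma[y:y + 4]]
            all_num += 1
            simple_num += is_stripe_block(blk)  # True counts as 1
    return simple_num >= ratio * all_num
```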

Step 12: and matching the reference block of the reference video frame by using the hash value of the current block to obtain the integer pixel candidate matching block.

Step 12 can be applied to inter-frame prediction and intra-frame prediction, and the HASH mode is used for both inter-frame prediction and intra-frame prediction, which increases the application range of the HASH mode.

In an alternative embodiment, inter-frame prediction mainly involves two aspects: constructing a Merge mode candidate list and constructing an MMVD (Merge mode with MVD, i.e. Merge mode with motion vector difference) candidate list. The MV (motion vector) is obtained by searching between the current frame and the reference frame and indicates the position of the best matching block. The MVD (motion vector difference) is the motion vector residual, i.e. the difference between the MV of the matching block and an MV in the candidate list.

(1) Constructing Merge candidate list

In this mode, a Merge candidate list is first built for the current CU block. The list holds 6 candidate MVs (each comprising a forward MV and a backward MV) of five types: spatial MVs, time-domain MVs, HMVPs (MVs of historically coded blocks), average MVs and zero MVs. Candidates are added to the list in this order of priority.

A. Spatial MV

As shown in fig. 3, fig. 3 is a schematic diagram of a current block and the neighboring blocks used for spatial MVs provided by the present application. The spatial domain provides at most 4 candidate MVs, i.e. the MV information of at most 4 of the 5 neighboring blocks in fig. 3 is used. The list is filled in the order A₁-B₁-B₀-A₀-(B₂), where B₂ is a substitute: when at least one of A₁, B₁, B₀ and A₀ is not available and the MV information of B₂ differs from that of A₁ and B₁, the MV information of B₂ is used.

B. Time domain MV

When the size of the current block is larger than 4 × 4, 8 × 4 or 4 × 8, the time-domain MV needs to be filled into the Merge candidate list. The MV information of the CU block at the corresponding position in a neighboring coded picture of the current CU block (the co-located CU block at the corresponding position in the co-located frame) is used. Unlike the spatial case, the time-domain candidate cannot use the MV information of the co-located block directly; it must be scaled according to the positional relationship of the reference frames, where the reference frame defaults to the first frame in the reference frame list. As shown in fig. 4, fig. 4 is a schematic diagram of time-domain MVs provided by the present application. The time domain provides at most one candidate MV, obtained by scaling the MV of the co-located CU block at position C₀ in fig. 4. If the co-located CU block at position C₀ is not available, the co-located CU block at position C₁ is used instead.

C、HMVP

When the Merge candidate list is not yet full, the MVs in the HMVP list are compared in sequence with the MVs of the spatial neighbors A1 and B1, and those that differ are filled into the candidate list until it is full.

D. Average MV

If the Merge candidate list is still not full, the first two MVs in the list are averaged, forward MV with forward MV and backward MV with backward MV, and the resulting average is filled into the Merge candidate list.

E. Zero MV

If the number of the candidate MVs in the current Merge candidate list is still less than 6, zero MVs are used for filling to reach the specified number.

(2) Constructing MMVD candidate lists

The MMVD candidate list comprises 2 candidate MVs (each comprising a forward MV and a backward MV): the first two MVs in the Merge candidate list are taken and filled into the MMVD list.
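The list construction in (1) and (2) can be sketched in simplified form. The sketch assumes that availability checks, temporal scaling and the block-size condition have already been applied by the caller, that the first two spatial entries are A1 and B1, and that averaging uses floor rounding; none of these details is fixed by the text. All names are hypothetical.

```python
ZERO = ((0, 0), (0, 0))  # zero forward MV and zero backward MV

def average(p, q):
    """Component-wise average of two MVs (floor rounding is an assumption)."""
    return ((p[0] + q[0]) // 2, (p[1] + q[1]) // 2)

def build_merge_list(spatial, temporal, hmvp, size=6):
    """Each candidate is (forward_mv, backward_mv), with each MV an (x, y) pair."""
    merge = list(spatial[:4])              # A. at most 4 spatial candidates
    merge += temporal[:1]                  # B. at most 1 time-domain candidate
    for cand in hmvp:                      # C. history candidates that differ from A1 and B1
        if len(merge) >= size:
            break
        if cand not in spatial[:2]:
            merge.append(cand)
    if 2 <= len(merge) < size:             # D. average of the first two candidates
        (f0, b0), (f1, b1) = merge[0], merge[1]
        merge.append((average(f0, f1), average(b0, b1)))
    while len(merge) < size:               # E. zero-MV padding
        merge.append(ZERO)
    return merge[:size]

def mmvd_candidates(merge):
    """The MMVD list takes the first two Merge candidates."""
    return merge[:2]
```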

In another optional embodiment, during intra prediction, a matching search is performed among the already coded blocks in the frame by using the hash value of the current block, and integer-pixel candidate matching blocks with the same block size and the same hash value are found.

Specifically, a matching search is performed among the already coded blocks in the frame using the HASH value of the current block. After several candidate matching blocks with the same block size and HASH value have been found, the best matching block is selected by a specific method (which may be a prior-art method or the method of the present disclosure). The direction of the position offset between the best matching block and the current block is used as the prediction direction, and the pixel values in the best matching block are used as the prediction values of the current block.

When the HASH mode is used for intra prediction as a separate mode, a separate flag is required in the syntax elements to indicate whether the HASH mode is used in the frame, and this flag is transmitted to the decoding end. The prediction direction obtained by HASH matching may be transmitted to the decoding end so that the decoding end can decode with it; alternatively it may be omitted, in which case the decoding end must perform the same matching operation as the encoding end to obtain the prediction direction.
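The intra-frame search can be sketched with a lookup table keyed by block size and hash value. As assumptions, zlib.crc32 stands in for the block hash, the frame is a 2-D list of 8-bit pixel values, and the caller supplies the top-left positions of already coded blocks. All function names are hypothetical.

```python
from collections import defaultdict
import zlib

def block_key(frame, x, y, size):
    """Key = (block size, CRC-32 of the block's pixels)."""
    data = bytes(p for row in frame[y:y + size] for p in row[x:x + size])
    return size, zlib.crc32(data)

def build_table(frame, coded_positions, size):
    """Index already coded blocks by their key."""
    table = defaultdict(list)
    for (x, y) in coded_positions:
        table[block_key(frame, x, y, size)].append((x, y))
    return table

def integer_candidates(frame, table, cur_x, cur_y, size):
    """Positions of coded blocks with the same size and hash as the current block."""
    return table.get(block_key(frame, cur_x, cur_y, size), [])
```

A real encoder would confirm each hit with a second hash or a pixel comparison before treating it as a match.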

Step 13: performing a traversal MMVD search around the current integer-pixel candidate matching block, starting from a preset corner pixel of that block, to obtain sub-pixel blocks with the same hash value as sub-pixel candidate matching blocks.

Since the HASH value is calculated from integer pixels, a block matched by HASH is necessarily an integer-pixel block, and the resulting MV is also of integer-pixel precision, which may not be accurate enough. The HASH value can therefore also be calculated on sub-pixel blocks, so that the MV is further refined to sub-pixel precision for prediction.

Referring to fig. 5, fig. 5 is a schematic flow chart of step 13, where step 13 may specifically include:

step 131: taking the upper-left pixel of the current integer-pixel candidate matching block as the starting point, performing a traversal MMVD search around that candidate matching block with a search amplitude of sub-pixel precision; each time a sub-pixel point is reached, constructing a sub-pixel prediction block of the same size as the current block with that sub-pixel point as its upper-left pixel, wherein every pixel point in the sub-pixel prediction block is a sub-pixel whose value is interpolated from the adjacent integer pixel values.

As shown in fig. 6, fig. 6 is a schematic diagram of an integer-pixel candidate matching block and a sub-pixel prediction block provided in the present application. Here A is the current integer-pixel candidate matching block, A0 is the upper-left pixel of A, B0 is a sub-pixel point found within a certain range around pixel point A0 according to the search amplitude of sub-pixel precision, and B is the sub-pixel prediction block whose upper-left pixel is the sub-pixel point B0.

Optionally, the search amplitude of sub-pixel precision may be 1/2 or 1/4 pixel, and the search may proceed in several directions around pixel point A0.

Step 132: and performing hash comparison on the sub-pixel prediction block and the current block, and reserving the sub-pixel prediction block with the same hash as a sub-pixel candidate matching block.

The hash value of the sub-pixel prediction block may be calculated in the manner described in the above embodiments, and details thereof are not repeated here.
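Steps 131 and 132 can be sketched as below. Several simplifications are assumed: bilinear interpolation replaces whatever interpolation filter an actual encoder would use, zlib.crc32 stands in for the block hash, only four directions are searched, and the reference frame is assumed large enough that no sample falls outside it. Positions are kept in quarter-pixel units; all names are hypothetical.

```python
import zlib

def sample(ref, x4, y4):
    """Bilinear sample of ref at quarter-pixel coordinates (x4, y4), rounded to an integer."""
    x0, fx = divmod(x4, 4)
    y0, fy = divmod(y4, 4)
    x1, y1 = x0 + (fx > 0), y0 + (fy > 0)
    top = ref[y0][x0] * (4 - fx) + ref[y0][x1] * fx
    bot = ref[y1][x0] * (4 - fx) + ref[y1][x1] * fx
    return (top * (4 - fy) + bot * fy + 8) // 16

def subpel_block(ref, x4, y4, size):
    """size x size prediction block whose upper-left sample sits at (x4, y4)."""
    return [[sample(ref, x4 + 4 * c, y4 + 4 * r) for c in range(size)]
            for r in range(size)]

def block_hash(block):
    return zlib.crc32(bytes(p for row in block for p in row))

def subpel_candidates(ref, cur, a0, steps=(1, 2)):
    """a0: integer (x, y) of pixel A0; steps in quarter-pixel units (1 = 1/4, 2 = 1/2)."""
    size, target = len(cur), block_hash(cur)
    found = []
    for step in steps:
        for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
            x4, y4 = 4 * a0[0] + dx, 4 * a0[1] + dy
            if block_hash(subpel_block(ref, x4, y4, size)) == target:
                found.append((x4, y4))  # keep as a sub-pixel candidate matching block
    return found
```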

Step 14: the best matching block is selected from candidate matching blocks including integer pixel candidate matching blocks and sub-pixel candidate matching blocks.

Referring to fig. 7, fig. 7 is a schematic flowchart of step 14, and step 14 may specifically include:

step 141: and calculating the distance between the center points of all the candidate matching blocks and the center point of the current block.

Step 142: and selecting the candidate matching block with the shortest distance as the best matching block.

As shown in fig. 8, fig. 8 is a schematic diagram of selecting the best matching block provided in the present application. For example, let the distance between the center of the current block and the center of matching block 1 be distance 1, the distance to the center of matching block 2 be distance 2, and the distance to the center of matching block 3 be distance 3. In one embodiment, distance 1 > distance 2 = distance 3, so matching block 2 and matching block 3 are both closest to the current block. When several matching blocks are at the same distance, they are sorted by coding order and the one that comes last in coding order is selected as the best matching block. In this example, matching block 3 comes after matching block 2 in coding order, so matching block 3 is selected as the best matching block.

As an alternative to the above embodiment, the rdcost (rate distortion cost value) of every candidate matching block may be calculated, and the candidate matching block with the lowest rdcost selected as the best matching block.
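The distance rule with its tie-break can be sketched in a few lines. The dictionary layout with 'center' and 'order' fields is a hypothetical representation chosen for the example, and squared Euclidean distance is used because it orders candidates the same way as the distance itself.

```python
def best_by_distance(cur_center, candidates):
    """candidates: list of dicts with 'center' (x, y) and 'order' (coding order index)."""
    def dist2(c):
        dx = c["center"][0] - cur_center[0]
        dy = c["center"][1] - cur_center[1]
        return dx * dx + dy * dy
    # Shortest distance first; among equal distances, the latest coding order wins.
    return min(candidates, key=lambda c: (dist2(c), -c["order"]))
```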

Different from the prior art, the video encoding method provided by this embodiment includes: obtaining a hash value of a current block in a current video frame; matching reference blocks of a reference video frame by using the hash value of the current block to obtain integer-pixel candidate matching blocks; performing a traversal MMVD search around the current integer-pixel candidate matching block, starting from a preset corner pixel of that block, to obtain sub-pixel blocks with the same hash value as sub-pixel candidate matching blocks; and selecting the best matching block from the candidate matching blocks, which include the integer-pixel candidate matching blocks and the sub-pixel candidate matching blocks. In this way, when candidate matching blocks are obtained, sub-pixel matching blocks are used in addition to integer-pixel matching blocks, so the pool of candidate matching blocks to choose from is enlarged, the accuracy of selecting the best matching block is improved, and the coding quality of the video frame is further improved.

After the best matching block is selected, the MV and the prediction value can be determined by using the best matching block, and the determination of the MV and the prediction value will be described in several embodiments below.

Referring to fig. 9, fig. 9 is a flowchart illustrating a second embodiment of a video encoding method provided in the present application, the method including:

Step 91: obtaining the hash value of the current block in the current video frame.

Step 92: matching the reference block of the reference video frame by using the hash value of the current block to obtain the integer pixel candidate matching block.

Step 93: performing a traversal MMVD search around the current integer pixel candidate matching block, taking its preset vertex angle pixel as a starting point, to obtain sub-pixel blocks with the same hash value as sub-pixel candidate matching blocks.

Step 94: selecting the best matching block from the candidate matching blocks, which include integer pixel candidate matching blocks and sub-pixel candidate matching blocks.

The above steps 91-94 are similar to the previous embodiment and will not be described again.

Step 95: taking the MV corresponding to the best matching block as the original MV and performing a further MMVD search to obtain the final MV.

Step 96: constructing a block of the same size as the current block, centered on the position pointed to by the final MV, as the reference block for obtaining the predicted value.

As shown in fig. 10, fig. 10 is a first schematic diagram of obtaining MV and predicted values provided by the present application.

Since the HASH matching block is an integer pixel block, the resulting MV also has integer pixel precision. In this embodiment, MMVD is added to the MV in the HASH mode: taking the end point of the current MV as a starting point, a further small-range search is performed in the same search pattern as MMVD, and each MMVD combination corresponds to one MVD. The final MV is then calculated as: final MV = current MV + MVD. Since the final MV may point to a sub-pixel position, the sub-pixel value must be interpolated from the integer pixels near the sub-pixel when the predicted value is obtained. Specifically, it is judged whether the position pointed to by the final MV, taken as the center point, is a sub-pixel; if so, the integer pixels near the sub-pixel are used for interpolation to obtain the sub-pixel value and hence the predicted value.

As shown in fig. 10, after the best matching block is found, the MV corresponding to that matching block is taken as the original MV and a further MMVD search is performed. Suppose the MVD has a leftward search direction and a search amplitude of 1/2 pixel. The dashed arrow in fig. 10 then represents the final MV, and the block from which the predicted value is finally obtained is the block (indicated by the dashed outline) centered on the position pointed to by the final MV.
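The calculation final MV = current MV + MVD and the sub-pixel check can be sketched as below. The direction table, the function names, and the use of bilinear interpolation are assumptions made for illustration; the application only states that nearby integer pixels are interpolated and does not specify the filter. Exact fractions are used so that a 1/2-pixel offset is represented without rounding.

```python
import math
from fractions import Fraction

# Hypothetical direction table; the text only gives a leftward search as an example.
DIRECTIONS = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}


def final_mv(original_mv, direction, amplitude):
    """final MV = original MV + MVD, where MVD = direction * amplitude (pixels)."""
    dx, dy = DIRECTIONS[direction]
    return (original_mv[0] + dx * amplitude, original_mv[1] + dy * amplitude)


def is_sub_pixel(position):
    """True if either coordinate is not an integer pixel position."""
    return any(Fraction(c).denominator != 1 for c in position)


def sample(frame, x, y):
    """Bilinear interpolation (assumed filter) from the surrounding integer
    pixels; frame is a list of rows and (x, y) must lie inside it."""
    x, y = Fraction(x), Fraction(y)
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    x1 = min(x0 + 1, len(frame[0]) - 1)
    y1 = min(y0 + 1, len(frame) - 1)
    top = (1 - fx) * frame[y0][x0] + fx * frame[y0][x1]
    bottom = (1 - fx) * frame[y1][x0] + fx * frame[y1][x1]
    return (1 - fy) * top + fy * bottom


# Leftward MVD of 1/2 pixel applied to an integer-pixel MV of (4, 2).
mv = final_mv((4, 2), "left", Fraction(1, 2))
needs_interpolation = is_sub_pixel(mv)
```

When `needs_interpolation` is true, each sample of the reference block would be read with `sample` rather than copied from an integer position.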

It is to be understood that, in the above embodiment, the pixel block search performed in step 93 and the MV search performed in step 95 both adopt the MMVD search. In other embodiments, the MMVD search may be used in step 93, while other approaches are used in step 95; alternatively, the MMVD search may be performed in step 95, and another approach may be performed in step 93.

Referring to fig. 11, fig. 11 is a flowchart illustrating a video encoding method according to a third embodiment of the present application, the method including:

Step 111: obtaining the hash value of the current block in the current video frame.

Step 112: matching the reference block of the reference video frame by using the hash value of the current block to obtain the integer pixel candidate matching block.

Step 113: performing a traversal MMVD search around the current integer pixel candidate matching block, taking its preset vertex angle pixel as a starting point, to obtain sub-pixel blocks with the same hash value as sub-pixel candidate matching blocks.

Step 114: selecting the best matching block from the candidate matching blocks, which include integer pixel candidate matching blocks and sub-pixel candidate matching blocks.

The above steps 111-114 are similar to the previous embodiments and will not be described again.

Step 115: selecting the n candidate matching blocks with the smallest rdcost from the candidate matching blocks other than the best matching block.

Here, rdcost is the rate-distortion cost.

Step 116: performing a weighted average of the pixel values at corresponding positions in the n candidate matching blocks with the smallest rdcost and in the best matching block to obtain a weighted prediction value.

Step 117: calculating the rdcost by using the weighted prediction value, comparing it with the rdcost of predicting directly from the best matching block, and selecting the prediction with the smaller rdcost as the final prediction value.

The weight of the pixel values of the best matching block is set to w, with 0.5 < w < 1; the pixel value weights of the n candidate matching blocks sum to 1 - w, and a candidate matching block with a smaller rdcost receives a larger weight.
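The text fixes only three constraints on the weights: 0.5 < w < 1 for the best matching block, a total of 1 - w for the other n blocks, and a larger share for a smaller rdcost. One possible way to satisfy them is to split 1 - w in inverse proportion to rdcost. This inverse-proportional rule and the function name are assumptions, not something the application specifies.

```python
def split_weights(w, rdcosts):
    """Distribute the remaining weight 1 - w over n candidate blocks.

    rdcosts: positive rdcost of each of the n candidate blocks.
    Assumed rule: each share is proportional to 1 / rdcost, so a block with a
    smaller rdcost receives a larger weight.
    """
    if not 0.5 < w < 1:
        raise ValueError("w must satisfy 0.5 < w < 1")
    inverse = [1.0 / cost for cost in rdcosts]
    total = sum(inverse)
    return [(1 - w) * value / total for value in inverse]


# Two candidates with rdcost 10 and 30 share the remaining weight 0.4.
weights = split_weights(0.6, [10.0, 30.0])
```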

As shown in fig. 12, fig. 12 is a second schematic diagram of obtaining MV and predicted values provided by the present application.

In this embodiment, the pixel values of candidate matching blocks other than the best matching block are also used in the weighting that yields the prediction value.

After the best matching block and the best MV are selected, the pixel values of the best matching block are not copied directly when the predicted value is obtained. Instead, the n candidate matching blocks with the smallest rdcost, excluding the best matching block, are selected, and the pixel values at corresponding positions in these blocks and in the best matching block are weighted and averaged. The weight of the pixel values of the best matching block is set to w (0.5 < w < 1), the pixel value weights of the n candidate matching blocks sum to 1 - w, and a candidate matching block with a smaller rdcost receives a larger weight. The weighting yields a weighted prediction value; its rdcost is calculated and compared with the rdcost of predicting directly from the best matching block, and the prediction with the smaller rdcost is selected as the final prediction value. The MV remains the original best MV.

As shown in fig. 12, when n is 2, matching block 2 is the best matching block with a weight of 0.8 and pixel value pix2, and matching block 1 has a weight of 0.2 and pixel value pix1. The predicted value obtained by the above weighting is then pix = 0.8 * pix2 + 0.2 * pix1, whereas the predicted value in the prior art is pix2; the final predicted value is determined by comparing the rdcosts of pix and pix2. Even if pix is selected as the final predicted value, MV2 is still selected as the best MV.
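The weighting of fig. 12 and the subsequent comparison can be sketched as follows. The sum of absolute differences (SAD) is used here purely as a stand-in for rdcost, which the application does not define; the function names and the sample pixel values are likewise hypothetical.

```python
def weighted_prediction(blocks, weights):
    """Per-pixel weighted average of equally sized blocks (lists of rows)."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [[sum(w * b[r][c] for b, w in zip(blocks, weights))
             for c in range(cols)] for r in range(rows)]


def sad(block, target):
    """Sum of absolute differences, used as a simplified cost in place of rdcost."""
    return sum(abs(p - t) for brow, trow in zip(block, target)
               for p, t in zip(brow, trow))


def choose_prediction(current, best_block, weighted_block, cost=sad):
    """Keep whichever prediction has the smaller cost against the current block."""
    if cost(weighted_block, current) < cost(best_block, current):
        return weighted_block
    return best_block


pix2 = [[100, 100], [100, 100]]   # best matching block, weight 0.8
pix1 = [[50, 150], [150, 50]]     # other candidate, weight 0.2
pix = weighted_prediction([pix2, pix1], [0.8, 0.2])   # 0.8 * pix2 + 0.2 * pix1
current = [[92, 108], [108, 92]]
final = choose_prediction(current, pix2, pix)
```

In this made-up example the weighted block is closer to the current block than pix2, so it is kept; the MV would still be the one of the best matching block.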

In terms of syntax elements, a flag needs to be transmitted to indicate whether this scheme is used. If the weighted value is finally selected as the predicted value, the index of the MVP (Motion Vector Prediction) needs to be transmitted, and the MVDs corresponding to each of the matching blocks used for weighting need to be transmitted as well.

Referring to fig. 13, fig. 13 is a flowchart illustrating a fourth embodiment of a video encoding method provided in the present application, the method including:

Step 131: obtaining the hash value of the current block in the current video frame.

Step 132: matching the reference block of the reference video frame by using the hash value of the current block to obtain the integer pixel candidate matching block.

Step 133: performing a traversal MMVD search around the current integer pixel candidate matching block, taking its preset vertex angle pixel as a starting point, to obtain sub-pixel blocks with the same hash value as sub-pixel candidate matching blocks.

Step 134: selecting the best matching block from the candidate matching blocks, which include integer pixel candidate matching blocks and sub-pixel candidate matching blocks.

The above steps 131-134 are similar to the previous embodiments and will not be described again.

Step 135: selecting, from the MVs corresponding to the candidate matching blocks other than the best matching block, the MVs corresponding to the n candidate matching blocks with the smallest rdcost.

Step 136: performing a weighted average of the MVs corresponding to the n candidate matching blocks with the smallest rdcost and the MV corresponding to the best matching block to obtain a weighted MV.

Step 137: obtaining a candidate predicted value from the position pointed to by the weighted MV, calculating the rdcost by using the candidate predicted value, comparing it with the rdcost of predicting directly from the best matching block, and selecting the prediction with the smaller rdcost as the final predicted value.

The weight of the MV corresponding to the best matching block is set to w, with 0.5 < w < 1; the MV weights of the n candidate matching blocks sum to 1 - w, and a candidate matching block with a smaller rdcost receives a larger weight.

In this embodiment, in addition to the best MV, the MVs of the n candidate matching blocks with the smallest rdcost are selected and weighted together with the best MV. The weight of the MV of the best matching block is set to w (0.5 < w < 1), the MV weights of the n candidate matching blocks sum to 1 - w, and a candidate matching block with a smaller rdcost receives a larger weight. The weighting yields a weighted MV; a predicted value is obtained from the position pointed to by the weighted MV, its rdcost is calculated and compared with the rdcost of predicting directly from the best matching block, and the prediction with the smaller rdcost is selected as the final predicted value.

As shown in fig. 14, which is a third schematic diagram of obtaining the MV and the predicted value provided in the present application, n is 2, MV1 is the best MV, the weight of MV1 is 0.7, and the weight of MV2 is 0.3. The weighted MV represented by the dashed arrow is then 0.7 * MV1 + 0.3 * MV2. The dashed box pointed to by the weighted MV indicates the area used to obtain the final prediction value.
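The weighted MV of fig. 14 is a component-wise weighted sum, which can be sketched as below. The function name and the example MV values are hypothetical; exact fractions are used so the result shows clearly when the weighted MV lands on a sub-pixel position. Any rounding to a particular MV precision is left out because the application does not describe it.

```python
from fractions import Fraction


def weighted_mv(mvs, weights):
    """Component-wise weighted sum of (x, y) motion vectors."""
    return tuple(sum(w * mv[i] for mv, w in zip(mvs, weights)) for i in range(2))


mv1 = (4, 0)    # best MV, weight 0.7
mv2 = (-2, 0)   # MV of the other candidate, weight 0.3
mv = weighted_mv([mv1, mv2], [Fraction(7, 10), Fraction(3, 10)])
# x component: 0.7 * 4 + 0.3 * (-2) = 2.2, a sub-pixel position
```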

In the above-described embodiment, after the best matching block is selected and the final prediction value is determined, the current block is encoded using the prediction value.

The beneficial effects of the present application are as follows:

1. The HASH mode is used for both inter-frame prediction and intra-frame prediction, which enlarges the application range of the HASH mode.

2. After the candidate matching blocks are obtained by HASH matching, the distance between the center point of each candidate matching block and the center point of the current block is calculated, and the candidate matching block with the shortest distance is selected as the best matching block. Compared with determining the best matching block through rdcost as in the prior art, the center-point distance calculation is simpler and simplifies the process of determining the best matching block.

3. Improvements are made to the acquisition and matching of HASH values. In the application, interpolation can be carried out on integer pixels in the current block to obtain sub-pixels, and the sub-pixels are used for carrying out CRC calculation and are used as sub-pixel HASH values for matching. By the method, the sub-pixel matching blocks can be simultaneously used as the candidate matching blocks, and compared with the mode that only the whole-pixel matching block is used as the candidate matching block, the number of the candidate matching blocks is increased, and the optimal matching block can be found more accurately.

4. The acquisition of the MV and of the predicted value is improved.

Firstly, as in the embodiment of fig. 9, the MV can be further adjusted by using the MMVD technique, and the final MV may be an integer pixel MV or a sub-pixel MV, so that the accuracy of MV acquisition is improved; furthermore, because the value of a sub-pixel is obtained by interpolating the integer pixels near the sub-pixel, the predicted value obtained in this way is more accurate;

secondly, as in the embodiment of fig. 11, with the MV selection mode unchanged, the pixel values of other candidate matching blocks are weighted together with the pixel values of the best matching block, the weighted values are used as a predicted value, and a first rdcost is calculated; the first rdcost is then compared with a second rdcost obtained by predicting directly from the best matching block, and the prediction with the smaller rdcost is selected as the final predicted value; in this way, the predicted value can be estimated in two ways and the rdcost of the final predicted value is made as small as possible, so that the accuracy of the predicted value is improved;

thirdly, as in the embodiment of fig. 13, in addition to the MV of the best matching block, the MV can be weighted and adjusted by using the information of other matching blocks; a first rdcost is calculated from the predicted value obtained at the position pointed to by the weighted MV and compared with a second rdcost calculated from the predicted value obtained at the position pointed to by the MV of the best matching block, and the prediction with the smaller rdcost is selected as the final predicted value; in this way, the predicted value can be estimated in two ways and the rdcost of the final predicted value is made as small as possible, so that the accuracy of the predicted value is improved.

Referring to fig. 15, fig. 15 is a schematic structural diagram of a first embodiment of a video encoding apparatus provided in the present application, where the video encoding apparatus 150 includes a processor 151 and a memory 152 connected to each other, where the memory 152 stores program data, and the processor 151 is configured to execute the program data to implement the following methods:

obtaining a hash value of a current block in a current video frame; matching a reference block of a reference video frame by using the hash value of the current block to obtain an integer pixel candidate matching block; performing a traversal MMVD search around the current integer pixel candidate matching block, taking its preset vertex angle pixel as a starting point, to obtain sub-pixel blocks with the same hash value as sub-pixel candidate matching blocks; and selecting the best matching block from the candidate matching blocks, which include integer pixel candidate matching blocks and sub-pixel candidate matching blocks.

Optionally, in another embodiment, the processor 151 is configured to execute the program data to implement the following method: taking the upper left corner pixel of the current integer pixel candidate matching block as a starting point, performing a traversal MMVD search around the current candidate matching block with a search amplitude of sub-pixel precision; whenever a sub-pixel point is reached, constructing a sub-pixel prediction block of the same size as the current block with that sub-pixel point as its upper left corner pixel, wherein each pixel point in the sub-pixel prediction block is a sub-pixel and the value of each sub-pixel is interpolated from the adjacent integer pixel values; and performing hash comparison between the sub-pixel prediction block and the current block, and retaining the sub-pixel prediction blocks with the same hash as sub-pixel candidate matching blocks.
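A minimal sketch of the hash comparison for one sub-pixel prediction block is given below. It assumes CRC-32 (via `zlib.crc32`) as the hash, 8-bit samples, and a single horizontal half-pixel offset computed as a rounded two-tap average. The description mentions a CRC calculation on interpolated sub-pixels, but the CRC variant, the interpolation filter, and all function names here are assumptions.

```python
import zlib


def block_hash(block):
    """CRC-32 over the block's samples in raster order (8-bit samples assumed)."""
    return zlib.crc32(bytes(p for row in block for p in row))


def half_pel_block(frame, x0, y0, size):
    """Sub-pixel prediction block whose upper left sample lies half a pixel to
    the right of integer position (x0, y0); each sample is the rounded average
    of its two horizontal integer neighbours (assumed filter)."""
    return [[(frame[y0 + r][x0 + c] + frame[y0 + r][x0 + c + 1] + 1) // 2
             for c in range(size)] for r in range(size)]


def is_sub_pixel_candidate(current, frame, x0, y0):
    """Retain the sub-pixel block only if its hash equals the current block's."""
    candidate = half_pel_block(frame, x0, y0, len(current))
    return block_hash(candidate) == block_hash(current)


frame = [[10, 20, 30],
         [40, 50, 60]]
current = [[15, 25],
           [45, 55]]
matched = is_sub_pixel_candidate(current, frame, 0, 0)
```

A full traversal would repeat this check for every direction and amplitude combination of the MMVD search and collect the blocks for which the check succeeds.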

Optionally, in another embodiment, the processor 151 is configured to execute the program data to implement the following method: calculating the distance between the center point of all candidate matching blocks and the center point of the current block; and selecting the candidate matching block with the shortest distance as the best matching block.

Optionally, in another embodiment, the processor 151 is configured to execute the program data to implement the following method: judging whether at least two candidate matching blocks with the shortest and same distance exist or not; and if at least two candidate matching blocks exist, sorting according to the coding sequence, and selecting the candidate matching block with the last coding sequence as the best matching block.

Optionally, in another embodiment, the processor 151 is configured to execute the program data to implement the following method: using the MV corresponding to the best matching block as an original MV, and further carrying out MMVD search to obtain a final MV; and constructing a block with the same size as the current block by taking the position pointed by the final MV as a central point as a reference block for obtaining a predicted value.

Optionally, in another embodiment, the processor 151 is configured to execute the program data to implement the following method: judging whether the position pointed to by the final MV, taken as the center point, is a sub-pixel; and if it is a sub-pixel, performing interpolation using the integer pixels near the sub-pixel to obtain the value of the sub-pixel, so as to obtain the predicted value.

Optionally, in another embodiment, the processor 151 is configured to execute the program data to implement the following method: selecting the n candidate matching blocks with the smallest rdcost from the candidate matching blocks other than the best matching block; performing a weighted average of the pixel values at corresponding positions in the n candidate matching blocks with the smallest rdcost and in the best matching block to obtain a weighted prediction value; and calculating the rdcost by using the weighted prediction value, comparing it with the rdcost of predicting directly from the best matching block, and selecting the prediction with the smaller rdcost as the final prediction value; wherein the weight of the pixel values of the best matching block is set to w, with 0.5 < w < 1, the pixel value weights of the n candidate matching blocks sum to 1 - w, and a candidate matching block with a smaller rdcost receives a larger weight.

Optionally, in another embodiment, the processor 151 is configured to execute the program data to implement the following method: selecting, from the MVs corresponding to the candidate matching blocks other than the best matching block, the MVs corresponding to the n candidate matching blocks with the smallest rdcost; performing a weighted average of these MVs and the MV corresponding to the best matching block to obtain a weighted MV; and obtaining a candidate predicted value from the position pointed to by the weighted MV, calculating the rdcost by using the candidate predicted value, comparing it with the rdcost of predicting directly from the best matching block, and selecting the prediction with the smaller rdcost as the final predicted value; wherein the weight of the MV corresponding to the best matching block is set to w, with 0.5 < w < 1, the MV weights of the n candidate matching blocks sum to 1 - w, and a candidate matching block with a smaller rdcost receives a larger weight.

Referring to fig. 16, fig. 16 is a schematic structural diagram of a second embodiment of the video encoding apparatus provided in the present application, where the video encoding apparatus 160 includes a hash value obtaining module 161, an integer pixel matching module 162, a sub-pixel matching module 163, and a best matching block determining module 164.

The hash value obtaining module 161 is configured to obtain a hash value of a current block in a current video frame; the integer pixel matching module 162 is configured to match a reference block of a reference video frame with the hash value of the current block to obtain an integer pixel candidate matching block; the sub-pixel matching module 163 is configured to perform a traversal MMVD search around the current integer pixel candidate matching block, taking its preset vertex angle pixel as a starting point, to obtain sub-pixel blocks with the same hash value as sub-pixel candidate matching blocks; and the best matching block determining module 164 is configured to select the best matching block from the candidate matching blocks, which include integer pixel candidate matching blocks and sub-pixel candidate matching blocks.

Referring to fig. 17, fig. 17 is a schematic structural diagram of a computer storage medium 170 provided by the present application, in which program data 171 is stored; when the program data 171 is executed by a processor, the video encoding method of the foregoing embodiments is implemented.

It is to be understood that, in the embodiments of the video encoding apparatus and the computer storage medium provided in the present application, reference may be made to the foregoing embodiments of the video encoding method for the method steps performed, and the principle and steps are similar, which are not described herein again.

In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the above-described device embodiments are merely illustrative: the division into modules or units is only a logical division, and other divisions are possible in practice; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.

In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

If the integrated unit in the above embodiments is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

The above description is only for the purpose of illustrating embodiments of the present application and is not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made according to the content of the present specification and the accompanying drawings, or which are directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims

1. A video encoding method, comprising:

obtaining a hash value of a current block in a current video frame;

matching a reference block of a reference video frame by using the hash value of the current block to obtain an integer pixel candidate matching block;

performing a traversal MMVD search around the current integer pixel candidate matching block, taking a preset vertex angle pixel of the current integer pixel candidate matching block as a starting point, to obtain sub-pixel blocks with the same hash value as sub-pixel candidate matching blocks;

selecting a best matching block from candidate matching blocks including the integer pixel candidate matching block and the sub-pixel candidate matching block.

2. The method of claim 1,

the step of performing traversal MMVD search around the current candidate matching block with a predetermined vertex angle pixel of the current integer pixel candidate matching block as a starting point includes:

taking the top left pixel of the current integer pixel candidate matching block as a starting point, performing traversal MMVD search around the current candidate matching block by using the search amplitude of sub-pixel precision, and constructing a sub-pixel prediction block with the same size as that of the current block by using the sub-pixel point as the top left pixel when traversing to a sub-pixel point, wherein each pixel point in the sub-pixel prediction block is a sub-pixel, and the sub-pixel value of each sub-pixel is interpolated by using the adjacent integer pixel values;

and performing hash comparison on the sub-pixel prediction block and the current block, and reserving the sub-pixel prediction block with the same hash as the sub-pixel candidate matching block.

3. The method of claim 1,

the selecting a best matching block from candidate matching blocks including the integer pixel candidate matching block and the sub-pixel candidate matching block comprises:

calculating the distance between the center point of all the candidate matching blocks and the center point of the current block;

and selecting the candidate matching block with the shortest distance as the best matching block.

4. The method of claim 3,

the selecting the candidate matching block with the shortest distance as the best matching block includes:

judging whether at least two candidate matching blocks with the shortest and same distance exist or not;

and if at least two candidate matching blocks exist, sorting according to the coding sequence, and selecting the candidate matching block with the last coding sequence as the best matching block.

5. The method of claim 1,

calculating rdcosts of all the candidate matching blocks;

selecting the candidate matching block with the smallest rdcost as the best matching block.

6. The method of claim 1,

after selecting the best matching block from the candidate matching blocks including the integer pixel candidate matching block and the sub-pixel candidate matching block, the method comprises the following steps:

and determining the MV and the predicted value by using the best matching block.

7. The method of claim 6,

the determining the MV and the predicted value by using the best matching block comprises:

taking the MV corresponding to the best matching block as an original MV, and further carrying out MMVD search to obtain a final MV;

and constructing a block with the same size as the current block by taking the position pointed by the final MV as a central point as a reference block for obtaining a predicted value.

8. The method of claim 7,

the constructing a block with the same size as the current block by taking the position pointed by the final MV as a central point as a reference block for obtaining a predicted value comprises the following steps:

judging whether the position pointed to by the final MV, taken as the central point, is a sub-pixel;

if it is a sub-pixel, performing interpolation using the integer pixels near the sub-pixel to obtain the value of the sub-pixel, so as to obtain the predicted value.

9. The method of claim 6,

selecting the top n candidate matching blocks with the lowest rdcost from the candidate matching blocks except the best matching block;

carrying out weighted average on pixel values of corresponding positions of the first n rdcost minimum candidate matching blocks and the best candidate matching block to obtain a weighted prediction value;

calculating the rdcost by using the weighted prediction value, comparing it with the rdcost of predicting directly from the best matching block, and selecting the prediction with the smaller rdcost as the final prediction value;

wherein the pixel value of the best matching block is weighted to w, 1> w >0.5, the pixel value of the n candidate matching blocks is weighted to 1-w in total, and the smaller the rdcost, the larger the weight of the candidate matching block.

10. The method of claim 6,

selecting the MVs corresponding to the top n rdcost minimum candidate matching blocks from the MVs corresponding to the candidate matching blocks except the best matching block;

performing weighted average on the MVs corresponding to the candidate matching block with the minimum first n rdcosts and the MVs corresponding to the best candidate matching block to obtain weighted MVs;

obtaining a candidate predicted value from the position pointed to by the weighted MV, calculating the rdcost by using the candidate predicted value, comparing it with the rdcost of predicting directly from the best matching block, and selecting the prediction with the smaller rdcost as the final predicted value;

and setting the MV weight corresponding to the best matching block as w, 1> w >0.5, wherein the MV weights corresponding to the n candidate matching blocks are 1-w in total, and the smaller the rdcost, the larger the weight of the candidate matching block is.

11. The method of claim 1,

the matching the reference block of the reference video frame by using the hash value of the current block to obtain the integer pixel candidate matching block comprises:

and performing matching search in the coded blocks in the frame by using the hash value of the current block, and searching integer pixel candidate matching blocks with consistent block sizes and consistent hash values.

12. A video encoding apparatus, characterized in that the video encoding apparatus comprises a processor and a memory connected to each other, the memory being configured to store program data, the processor being configured to execute the program data to implement the method according to any one of claims 1-11.

13. A computer storage medium, characterized in that program data are stored in the computer storage medium, which program data, when being executed by a processor, are adapted to carry out the method of any one of claims 1-11.