CN110662060B

CN110662060B - Video encoding method and apparatus, video decoding method and apparatus, and storage medium

Info

Publication number: CN110662060B
Application number: CN201910927120.9A
Authority: CN
Inventors: 高欣玮; 李蔚然; 毛煦楠; 谷沉沉
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-09-27
Filing date: 2019-09-27
Publication date: 2024-02-09
Anticipated expiration: 2039-09-27
Also published as: CN110662060A

Abstract

The invention discloses a video encoding method and apparatus, a video decoding method and apparatus, and a storage medium. The method comprises: adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block in decoding, and the highest resolution is the highest resolution in a preset resolution set; determining, from the pixel region to be matched, a target pixel region matched with a block to be decoded of the current video frame according to a target MV corresponding to the block to be decoded, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as the size of the block to be decoded; and determining, according to the pixel values of the pixel points in the target pixel region, the pixel values of the pixel points in a target decoded block obtained by decoding the block to be decoded at the highest resolution.

Description

Video encoding method and apparatus, video decoding method and apparatus, and storage medium

Technical Field

The present invention relates to the field of computers, and in particular, to a video encoding method and apparatus, a video decoding method and apparatus, and a storage medium.

Background

Currently, in the process of video coding, a video frame in a video may be divided into a plurality of blocks to perform coding processing respectively. The current block may be encoded by pre-analysis by referring to the encoded block to determine encoding parameter information of the current block before encoding the current block. For ease of processing, it is common to encode different blocks of a frame in video with a uniform resolution.

However, encoding and decoding the different blocks of a frame at a uniform resolution causes distortion when the resolution does not match the transmission bandwidth. Even if different resolutions are adopted for different blocks, the relative relationship between each block and its reference block is dynamic, so it is difficult to determine the relative position of the block to be encoded and its reference block, which affects the efficiency and effect of encoding and decoding.

Therefore, the related-art approach of encoding and decoding different blocks in one frame at different resolutions suffers from low encoding and decoding efficiency and a poor decoding effect, because it is difficult to determine the relative positions of the block to be encoded and its reference block.

Disclosure of Invention

The embodiments of the invention provide a video encoding method and apparatus, a video decoding method and apparatus, and a storage medium, which at least solve the technical problems of low encoding and decoding efficiency and poor decoding effect in the related-art approach of encoding and decoding different blocks in one frame at different resolutions.

According to an aspect of an embodiment of the present invention, there is provided a video decoding method including: adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block in decoding, and the highest resolution is the highest resolution in a preset resolution set; determining, from the pixel region to be matched, a target pixel region matched with a block to be decoded of the current video frame according to a target motion vector (MV) corresponding to the block to be decoded, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as the size of the block to be decoded; and determining, according to the pixel values of the pixel points in the target pixel region, the pixel values of the pixel points in a target decoded block obtained by decoding the block to be decoded at the highest resolution.

According to another aspect of an embodiment of the present invention, there is provided a video encoding method including: adjusting the resolution of a reconstructed block of an encoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the encoded block in encoding, and the highest resolution is the highest resolution in a preset resolution set; searching the pixel region to be matched for a target pixel region matched with a block to be encoded of the current video frame, wherein the size of the target pixel region is the same as the size of the block to be encoded; and obtaining the motion vector (MV) from the target pixel region to the block to be encoded as the target MV corresponding to the block to be encoded.

According to still another aspect of the embodiments of the present invention, there is also provided a video decoding apparatus including: a first adjusting unit, configured to adjust the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block in decoding, and the highest resolution is the highest resolution in a preset resolution set; a first determining unit, configured to determine, from the pixel region to be matched, a target pixel region matched with a block to be decoded of the current video frame according to a target motion vector (MV) corresponding to the block to be decoded, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as the size of the block to be decoded; and a second determining unit, configured to determine, according to the pixel values of the pixel points in the target pixel region, the pixel values of the pixel points in a target decoded block obtained by decoding the block to be decoded at the highest resolution.

According to still another aspect of the embodiments of the present invention, there is also provided a video encoding apparatus including: a second adjusting unit, configured to adjust the resolution of a reconstructed block of an encoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the encoded block in encoding, and the highest resolution is the highest resolution in a preset resolution set; a searching unit, configured to search the pixel region to be matched for a target pixel region matched with a block to be encoded of the current video frame, wherein the size of the target pixel region is the same as the size of the block to be encoded; and an acquiring unit, configured to obtain the motion vector (MV) from the target pixel region to the block to be encoded as the target MV corresponding to the block to be encoded.

According to a further aspect of embodiments of the present invention, there is also provided a computer readable storage medium having a computer program stored therein, wherein the computer program is configured to perform the above-described video decoding method when run.

According to a further aspect of embodiments of the present invention, there is also provided a computer readable storage medium having a computer program stored therein, wherein the computer program is configured to perform the video encoding method described above when run.

According to still another aspect of the embodiments of the present invention, there is further provided an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the video decoding method described above through the computer program.

According to still another aspect of the embodiments of the present invention, there is further provided an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the video encoding method described above through the computer program.

In the embodiments of the invention, the resolutions of the reconstructed blocks of all decoded blocks are unified. The resolution of a reconstructed block of a decoded block of the current video frame is adjusted from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block in decoding, and the highest resolution is the highest resolution in a preset resolution set. A target pixel region matched with the block to be decoded is determined from the pixel region to be matched according to a target MV corresponding to the block to be decoded of the current video frame, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as the size of the block to be decoded. The pixel values of the pixel points in the target decoded block, obtained by decoding the block to be decoded at the highest resolution, are then determined according to the pixel values of the pixel points in the target pixel region. Because the reconstructed blocks of all decoded blocks are adjusted to the highest resolution, the reference block of the block to be decoded can be determined from the motion vector at a single resolution. The reference block can therefore be located quickly and the accuracy of the decoding result is ensured, which improves encoding and decoding efficiency and the decoding effect, and solves the technical problems of low encoding and decoding efficiency and poor decoding effect in the related-art approach of encoding and decoding different blocks in one frame at different resolutions.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:

FIG. 1 is a schematic diagram of an application environment of an alternative video decoding method according to an embodiment of the present invention;

FIG. 2 is a flow chart of an alternative video decoding method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an alternative video decoding method according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of another alternative video decoding method according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of yet another alternative video decoding method according to an embodiment of the present invention;

FIG. 6 is a flow chart of an alternative video encoding method according to an embodiment of the invention;

FIG. 7 is a schematic diagram of an alternative video encoding method according to an embodiment of the invention;

FIG. 8 is a schematic diagram of an alternative video encoding method according to an embodiment of the invention;

FIG. 9 is a schematic diagram of an alternative video encoding and decoding process according to an embodiment of the present invention;

FIG. 10 is a schematic structural view of an alternative video decoding apparatus according to an embodiment of the present invention;

FIG. 11 is a schematic structural view of an alternative video encoding apparatus according to an embodiment of the present invention;

FIG. 12 is a schematic structural view of an alternative electronic device according to an embodiment of the present invention.

Detailed Description

To enable those skilled in the art to better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of protection of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Technical terms related to the embodiment of the invention include:

(1) PSNR (peak signal-to-noise ratio), which represents the ratio of the maximum possible power of a signal to the power of the corrupting noise that affects the fidelity of its representation, expressed in dB; a larger PSNR value represents less distortion;

(2) MV (motion vector), which represents the relative displacement between the current coding block and the best matching block in its reference picture.
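
As an illustration of term (1), the following is a minimal sketch of how PSNR is commonly computed from the mean squared error (MSE) between an original picture and its distorted version. The function name `psnr`, the list-of-rows pixel layout, and the default peak value of 255 (8-bit samples) are assumptions made for this example and are not taken from the embodiments.

```python
import math


def psnr(original, distorted, max_value=255):
    """Return the PSNR in dB between two equally sized 2-D pixel grids.

    max_value is the peak sample value (255 assumes 8-bit samples).
    """
    total = 0
    count = 0
    for row_a, row_b in zip(original, distorted):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf  # identical pictures: no distortion at all
    return 10 * math.log10(max_value ** 2 / mse)
```

A smaller pixel error yields a smaller MSE and therefore a larger PSNR, which matches the statement that a larger PSNR value represents less distortion.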

According to an aspect of an embodiment of the present invention, there is provided a video decoding method. Optionally, as an alternative embodiment, the video decoding method described above may be applied to, but is not limited to, the application environment shown in FIG. 1. The application environment includes a terminal 102 and a server 104, where the terminal 102 and the server 104 communicate through a network. The terminal 102 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, etc. The server 104 may be, but is not limited to, a computer processing device with strong data processing capability and a certain amount of storage space.

It should be noted that the video encoding method corresponding to the video decoding method described above may also be applied to, but is not limited to, the application environment shown in FIG. 1. After the video to be encoded is obtained, the video encoding method provided in the present application may be used through, but is not limited to, the interaction process between the terminal 102 and the server 104 shown in FIG. 1. In a scenario in which blocks are encoded at different resolutions, a pixel region matched with the current block to be encoded (for example, the most similar pixel region) is searched for within the already-encoded pixel region of the frame at the same resolution (the highest resolution), and the target MV from the matched pixel region to the block to be encoded is obtained, thereby saving transmission bandwidth while ensuring the encoding efficiency and quality of the video frame. Similarly, after the video to be decoded is obtained, the video decoding method provided in the present application may be used through, but is not limited to, the interaction process between the terminal 102 and the server 104 shown in FIG. 1. In the same block-level multi-resolution scenario, a pixel region matched with the current block to be decoded is determined according to the target MV at the same resolution, and the decoded block is then determined according to the matched pixel region, thereby improving decoding efficiency and the decoding effect.
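
The encoder-side search described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the matching criterion (sum of absolute differences, SAD), the exhaustive search, the helper names `sad` and `search_target_mv`, and the representation of the pixel region to be matched as a rectangular list of rows already at the highest resolution are all hypothetical choices for the example; the embodiments do not prescribe them.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel grids."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def search_target_mv(region, block, block_x, block_y):
    """Exhaustively search `region` for the area most similar to `block`.

    region  -- already-encoded pixels at the highest resolution (list of rows)
    block   -- the block to be encoded, at the same resolution
    block_x, block_y -- top-left corner of the block to be encoded

    Returns (mv, cost); mv = (mv_x, mv_y) points from the matched area
    to the block to be encoded.
    """
    height, width = len(block), len(block[0])
    best_cost, best_mv = None, None
    for y0 in range(len(region) - height + 1):
        for x0 in range(len(region[0]) - width + 1):
            candidate = [row[x0:x0 + width] for row in region[y0:y0 + height]]
            cost = sad(candidate, block)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (block_x - x0, block_y - y0)
    if best_mv is None:
        raise ValueError("pixel region to be matched is smaller than the block")
    return best_mv, best_cost
```

Because the candidate area always has the same size as the block to be encoded, the comparison is made pixel by pixel at a single resolution, which is the point of first bringing the encoded region to the highest resolution.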

In one embodiment, the terminal 102 may include, but is not limited to, the following components: an image processing unit 1021, a processor 1022, a storage medium 1023, a memory 1024, a network interface 1025, a display screen 1026, and an input device 1027. These components may be connected by, but are not limited to, a system bus 1028. The image processing unit 1021 provides at least the drawing capability of the display interface; the processor 1022 provides computing and control capabilities to support operation of the terminal 102; the storage medium 1023 stores an operating system 1023-2 and a video encoder and/or video decoder 1023-4. The operating system 1023-2 provides control operation instructions, and the video encoder and/or video decoder 1023-4 performs encoding/decoding operations in accordance with the control operation instructions. In addition, the memory 1024 provides an operating environment for the video encoder and/or video decoder 1023-4 in the storage medium 1023, and the network interface 1025 is used for network communication with the network interface 1043 in the server 104. The display screen 1026 displays application interfaces and the like, such as decoded video; the input device 1027 receives commands or data input by a user. For a terminal 102 with a touch screen, the display screen 1026 and the input device 1027 may be the touch screen. The internal structure of the terminal shown in FIG. 1 is merely a block diagram of the part of the structure related to the present application and does not limit the terminal to which the present application is applied; a specific terminal or server may include more or fewer components than shown in the drawing, combine some components, or have a different arrangement of components.

In one embodiment, the server 104 may include, but is not limited to, the following components: a processor 1041, a memory 1042, a network interface 1043, and a storage medium 1044. These components may be connected by, but are not limited to, a system bus 1045. The storage medium 1044 includes an operating system 1044-1, a database 1044-2, and a video encoder and/or video decoder 1044-3. The processor 1041 provides computing and control capabilities to support operation of the server 104. The memory 1042 provides an operating environment for the video encoder and/or video decoder 1044-3 in the storage medium 1044. The network interface 1043 communicates with the network interface 1025 of the external terminal 102 through a network connection. The operating system 1044-1 in the storage medium provides control operation instructions; the video encoder and/or video decoder 1044-3 performs encoding/decoding operations according to the control operation instructions; the database 1044-2 stores data. The internal structure of the server shown in FIG. 1 is merely a block diagram of the part of the structure related to the present application and does not limit the computer device to which the present application is applied; a specific computer device may have a different arrangement of components.

In one embodiment, the network may include, but is not limited to: wired network, wireless network. Wherein, the wired network may include, but is not limited to: wide area network, metropolitan area network, local area network. The above is merely an example, and is not limited in any way in the present embodiment.

Optionally, in the present embodiment, a video decoding method performed by the terminal 102 or the server 104 is provided. As shown in FIG. 2, the video decoding method includes:

Step S202, adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block in decoding, and the highest resolution is the highest resolution in a preset resolution set;

Step S204, determining, from the pixel region to be matched, a target pixel region matched with a block to be decoded of the current video frame according to a target MV corresponding to the block to be decoded, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as the size of the block to be decoded;

Step S206, determining, according to the pixel values of the pixel points in the target pixel region, the pixel values of the pixel points in a target decoded block obtained by decoding the block to be decoded at the highest resolution.
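
Steps S204 and S206 can be illustrated with the short sketch below. It assumes that the pixel region to be matched from step S202 is already available as a rectangular list of rows at the highest resolution, that block positions are given in the same coordinates, and that the decoded pixel values are obtained by directly copying the target pixel region (the simplest possible reading of step S206). The function names `locate_target_region` and `decode_block_from_region` are hypothetical.

```python
def locate_target_region(region, block_x, block_y, mv, width, height):
    """Step S204 (sketch): find the target pixel region.

    The target MV points from the target pixel region to the block to be
    decoded, so the region's top-left corner is the block position minus
    the MV. The region has the same size as the block to be decoded.
    """
    mv_x, mv_y = mv
    x0, y0 = block_x - mv_x, block_y - mv_y
    if (x0 < 0 or y0 < 0
            or y0 + height > len(region)
            or x0 + width > len(region[0])):
        raise ValueError("target pixel region lies outside the pixel region to be matched")
    return [row[x0:x0 + width] for row in region[y0:y0 + height]]


def decode_block_from_region(region, block_x, block_y, mv, width, height):
    """Step S206 (sketch): take the pixel values of the target pixel region
    as the pixel values of the target decoded block (plain copy assumed)."""
    target = locate_target_region(region, block_x, block_y, mv, width, height)
    return [list(row) for row in target]
```

For example, with a block to be decoded at position (4, 0) and a target MV of (4, 0), the target pixel region starts at position (0, 0) of the pixel region to be matched.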

It should be noted that the video decoding method shown in FIG. 2 may be used in, but is not limited to, the video decoder shown in FIG. 1. The decoding process of the current video frame is completed through the interaction of the video decoder with other components.

Optionally, in the present embodiment, the video decoding method described above may be applied to various scenarios involving video encoding and decoding, for example, video playing applications, video sharing applications, video session applications, short video transmission in instant messaging applications, and video chat in instant messaging applications. The video transmitted in these application scenarios may include, but is not limited to, long video and short video. For example, a long video may be an episode with a longer play time (for example, longer than 10 minutes) or the picture shown in a long video session; a short video may be a voice message exchanged between two or more parties, or a video with a shorter play time (for example, less than or equal to 30 seconds) shown on a sharing platform. The foregoing is merely an example, and the video decoding method provided in the present embodiment may be applied to, but is not limited to, a playback device for playing video in the foregoing application scenarios.

In this embodiment, for a current block to be decoded in a current video frame of a video to be decoded (the current video frame is the frame currently being decoded, in which some blocks may already be decoded and others remain to be decoded), the reconstructed blocks of a plurality of decoded blocks are gathered into one set and adjusted to the same resolution (the highest resolution) to obtain the pixel region to be matched. A target pixel region matched with the block to be decoded is determined according to the target MV corresponding to the block to be decoded, and the pixel values of the pixel points in the target decoded block obtained by decoding the block to be decoded are then determined according to the pixel values of the target pixel region. This solves the technical problems of low encoding and decoding efficiency and poor decoding effect in the related-art approach of encoding and decoding different blocks in one frame at different resolutions, and improves both the encoding and decoding efficiency and the decoding effect.

The video decoding method in the present embodiment is explained below with reference to FIG. 2.

In step S202, the resolution of the reconstructed block of the decoded block of the current video frame is adjusted from an initial resolution to a highest resolution, so as to obtain a pixel region to be matched with the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set.

A receiving device of the video (e.g., server 104) may receive a video bitstream transmitted by a transmitting device of the video (e.g., terminal 102) through a network. The video code stream contains data obtained by video encoding one or more video frames (video images) of the video.

After receiving the video code stream, the receiving device may decode each video frame in the video to obtain the decoded video. Video frames that are associated with one another may be decoded in the order determined by that association. Video frames that are not associated may be decoded in their order in the video, in random order, or in parallel. The specific decoding mode may be determined by convention or according to indication information in the video code stream, and is not particularly limited in this embodiment.

For each image block in the current video frame to be decoded, when the intra block copy (IBC) coding mode is used, different resolutions may be used for coding. The adopted resolution may be indicated by indication information corresponding to the current video frame (or the block to be decoded) in the video code stream, may be determined according to reference information of other image blocks or other video frames, or may be determined from both the indication information and the reference information. The specific manner of determining the resolution adopted for each image block is not particularly limited in this embodiment.

When each block to be decoded is decoded, in the case that the video code stream contains the data to be decoded corresponding to the current block to be decoded, the data to be decoded can be decoded with the resolution corresponding to the current block to be decoded, so as to obtain the current decoded block corresponding to the current block to be decoded. In the case where the video bitstream does not include the data to be decoded corresponding to the current block to be decoded, the current decoded block corresponding to the current block to be decoded may be determined according to a target pixel region (e.g., a pixel region most similar to the current block to be decoded) that matches the current block to be decoded among the decoded pixel regions.

Optionally, in the present embodiment, the target pixel region may span a plurality of decoded blocks; in that case, the reconstructed blocks of the plurality of blocks are combined into one region and adjusted to the highest resolution. For example, the resolution of the reconstructed block of the decoded block of the current video frame may be adjusted from the initial resolution to the highest resolution, resulting in a pixel region to be matched having the highest resolution. The initial resolution is the resolution adopted by the decoded block in decoding, and the initial resolutions of different decoded blocks may be the same or different. The highest resolution is the highest resolution in the preset resolution set.

It should be noted that the highest resolution may be a preset resolution, for example the original resolution of the video frame (that is, the resolution before any resolution adjustment and video encoding are performed on the encoding side), or may be the maximum of the resolutions adopted by all image blocks in the current video frame. The highest resolution may be indicated by indication information or may be determined based on the resolutions adopted by all image blocks.

Optionally, instead of the highest resolution, the resolution of the reconstructed block of the decoded block of the current video frame may be adjusted from the initial resolution to a preset resolution. The preset resolution is greater than the resolution adopted by the current block to be decoded in decoding (in the case where that resolution is less than the highest resolution), or equal to it (in the case where that resolution is equal to the highest resolution). The preset resolution may be indicated by indication information or determined according to configuration information. The configuration information may specify that the preset resolution is the resolution that is N levels higher, in the preset resolution set, than the resolution adopted by the current block to be decoded, and that the highest resolution is used if no resolution N levels higher exists, where N is a positive integer.
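
The configuration rule for the preset resolution can be sketched as a lookup clamped to the top of the preset resolution set. In this sketch, resolutions are represented as (width, height) tuples and "levels" are assumed to follow the ordering of the set by pixel count; the function name `preset_resolution` and this ordering are assumptions for illustration only.

```python
def preset_resolution(resolution_set, current, n):
    """Return the resolution n levels above `current` in `resolution_set`,
    or the highest resolution if fewer than n higher levels exist.

    Resolutions are (width, height) tuples; levels are assumed to be
    ordered by pixel count.
    """
    if n < 1:
        raise ValueError("N must be a positive integer")
    ordered = sorted(set(resolution_set), key=lambda wh: wh[0] * wh[1])
    index = ordered.index(current) + n  # raises ValueError if not in the set
    return ordered[min(index, len(ordered) - 1)]
```

When the current block already uses the highest resolution, the clamp returns the highest resolution itself, which corresponds to the "equal to" case described above.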

The reconstructed block of a decoded block is a copy of the decoded block after reconstruction: the decoded block and its reconstructed block have the same size and the same resolution, and pixels at the same position have the same pixel value.

The magnitude relation of the initial resolution and the highest resolution may include at least one of: the initial resolution is equal to the highest resolution and the initial resolution is lower than the highest resolution.

As an alternative embodiment, adjusting the resolution of the reconstructed block of the decoded block of the current video frame from the initial resolution to the highest resolution to obtain the pixel region to be matched includes: in the case where the initial resolution is lower than the highest resolution, upsampling the reconstructed block from the initial resolution to the highest resolution to obtain the pixel region to be matched whose resolution is the highest resolution.

The resolution of the reconstruction block with the initial resolution lower than the highest resolution may be up-sampled from the initial resolution to the highest resolution, so as to obtain a pixel region to be matched with the resolution of the highest resolution (or a pixel region corresponding to the reconstruction block in the pixel region to be matched).

As another alternative embodiment, in the case where the initial resolution is equal to the highest resolution, the resolution of the reconstructed block may not be adjusted, and the reconstructed block may be regarded as the pixel region to be matched (or the pixel region corresponding to the reconstructed block in the pixel region to be matched).

According to this embodiment, the reconstructed block is either upsampled or left unchanged, depending on the relation between the initial resolution and the highest resolution, to obtain the pixel region to be matched. This ensures that the reconstructed block is at the highest resolution and improves the accuracy of determining the pixel region to be matched.
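
The two cases above (upsample when the initial resolution is lower, leave unchanged when it is equal) can be sketched as follows. The embodiments do not specify an interpolation filter; nearest-neighbour replication with an integer scale factor is assumed here purely for simplicity, and the function name `to_highest_resolution` is hypothetical.

```python
def to_highest_resolution(block, factor):
    """Bring a reconstructed block to the highest resolution.

    block  -- reconstructed block as a list of pixel rows
    factor -- highest resolution divided by initial resolution, per
              dimension (integer assumed; 1 means already at the highest)
    """
    if factor < 1:
        raise ValueError("initial resolution cannot exceed the highest resolution")
    if factor == 1:
        # Initial resolution equals the highest resolution: no adjustment.
        return [list(row) for row in block]
    upsampled = []
    for row in block:
        # Repeat each pixel horizontally, then repeat the row vertically.
        wide = [pixel for pixel in row for _ in range(factor)]
        upsampled.extend(list(wide) for _ in range(factor))
    return upsampled
```

A practical codec would typically use an interpolation filter rather than pixel replication; the sketch only shows where the resolution change takes place.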

There may be one or more decoded blocks before the current block to be decoded is decoded. In the case that there is one decoded block, the resolution of the reconstructed block of that decoded block may be adjusted from the initial resolution to the highest resolution to obtain the pixel region to be matched having the highest resolution. In the case that there are a plurality of decoded blocks, the resolutions of the reconstructed blocks of the plurality of decoded blocks may each be adjusted from the respective initial resolution to the highest resolution to obtain the pixel region to be matched having the highest resolution.

As an alternative, adjusting the resolution of the reconstructed block of the decoded block of the current video frame from the initial resolution to the highest resolution to obtain the pixel region to be matched having the highest resolution includes: in the case that the decoded block includes a plurality of blocks and the plurality of blocks includes a target block having an initial resolution lower than the highest resolution, upsampling the reconstructed block of the target block from the initial resolution to the highest resolution to obtain a target decoded block having the highest resolution, wherein the pixel region to be matched includes the target decoded block.

For a target block of the plurality of decoded blocks having an initial resolution lower than the highest resolution (the target block may include one or more), the resolution of the reconstructed block of the target block may be upsampled from the initial resolution to the highest resolution to obtain a target decoded block (a first target decoded block) having a resolution of the highest resolution, and the pixel region to be matched includes the target decoded block.

As another alternative embodiment, in the case where the decoded block includes a plurality of blocks and the plurality of blocks includes other blocks (the other blocks may include one or more) having an initial resolution equal to the highest resolution, the resolution of the reconstructed block of the other blocks may not be adjusted, and the reconstructed block of the other blocks is taken as a second target decoded block having the highest resolution, and the pixel region to be matched includes the second target decoded block.

For example, as shown in fig. 3, the current video frame contains 4×4 image blocks, of which 10 are decoded blocks: 4 decoded blocks of resolution R1, 3 decoded blocks of resolution R2, and 3 decoded blocks of resolution R3, wherein R1 > R3 > R2. The reconstructed blocks of the decoded blocks of resolution R2 and R3 may be upsampled from R2 and R3, respectively, to R1, and taken as the pixel region to be matched together with the reconstructed blocks of the decoded blocks of resolution R1.

According to the embodiment, when a plurality of decoded blocks are provided, the resolution of the reconstructed block of each decoded block is adjusted according to the relation between the resolution corresponding to each decoded block and the highest resolution, so that the resolution of each reconstructed block can be ensured to be adjusted to the highest resolution, and the accuracy of determining the pixel region to be matched is improved.
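As a non-limiting sketch of the multi-block case, the following code upsamples each decoded block to the highest resolution where needed and places it at its grid position to form the pixel region to be matched. The dictionary layout, the use of `None` for positions that are not yet decoded, and nearest-neighbour interpolation are all assumptions made for illustration.

```python
def upsample_nearest(block, factor):
    """Nearest-neighbour upsampling by an integer factor (assumed method)."""
    return [[p for p in row for _ in range(factor)]
            for row in block for _ in range(factor)]


def build_region_to_match(decoded, grid_rows, grid_cols, full_side):
    """Assemble the pixel region to be matched from several decoded blocks.

    decoded: dict mapping (block_row, block_col) to a square reconstructed
             block whose side length reflects its initial resolution.
    full_side: side length of one block at the highest resolution.
    Positions of blocks that are not yet decoded stay None.
    """
    region = [[None] * (grid_cols * full_side)
              for _ in range(grid_rows * full_side)]
    for (br, bc), recon in decoded.items():
        side = len(recon)
        if side == full_side:
            block = recon  # already at the highest resolution
        else:
            block = upsample_nearest(recon, full_side // side)
        for y in range(full_side):
            for x in range(full_side):
                region[br * full_side + y][bc * full_side + x] = block[y][x]
    return region
```

With one block already at the highest resolution and a neighbouring block stored at half that resolution, the result is a single region in which both blocks occupy areas of the same size.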

After the resolution of the reconstructed block of each decoded block is adjusted from the initial resolution to the highest resolution, the adjusted reconstructed block may be used as the pixel region to be matched, or the adjusted reconstructed block may be subjected to edge filtering first, and the filtered reconstructed block may be used as the pixel region to be matched.

As an alternative, after adjusting the resolution of the reconstructed block of the decoded block of the current video frame from the initial resolution to the highest resolution, the method further comprises: in the case that the decoded block comprises a plurality of blocks, performing edge filtering on pixel points on adjacent sides of adjacent pixel blocks in the pixel region to be matched, wherein the adjacent pixel blocks are the pixel blocks in the pixel region to be matched that correspond to adjacent blocks among the plurality of blocks.

Edge filtering may be performed on one or more rows of pixels on adjacent row edges, or on one or more columns of pixels on adjacent column edges, of adjacent pixel blocks. In the edge filtering, the pixels used for filtering a pixel to be filtered may be one or more pixels adjacent to the pixel to be filtered (row-adjacent, column-adjacent, etc.), or may be the pixels within a predetermined range centered on the pixel to be filtered, where the predetermined range may be a circle whose radius is a predetermined length, or a square, rectangle, etc. whose side length is a predetermined length. The filtering method used for the edge filtering may be set as needed, and is not particularly limited in this embodiment.

For example, as shown in fig. 4, for the adjusted reconstructed blocks, edge filtering is performed on the two rows of pixels on adjacent sides (as shown by the dashed box in fig. 4): the pixel value of each of these pixels is adjusted (filtered) according to the pixel values of its corresponding pixels (that is, the pixels used for the filtering), so as to obtain the filtered reconstructed blocks. The filtered pixel blocks may be used as the pixel region to be matched.

On the decoding side, the filtering process may be performed by:

determining at least one pair of decoding blocks to be reconstructed from a current decoding video frame to be processed, wherein each pair of decoding blocks in the at least one pair of decoding blocks comprises a first decoding block with a first resolution and a second decoding block with a second resolution, and the first decoding block and the second decoding block are adjacent decoding blocks in position;

adjusting a first resolution of the first decoding block to a target resolution (e.g., a highest resolution), and adjusting a second resolution of the second decoding block to the target resolution;

determining a first edge pixel point set from a first decoding block and determining a second edge pixel point set from a second decoding block, wherein the position of the first edge pixel point set is adjacent to the position of the second edge pixel point set;

filtering the first edge pixel point set to obtain a filtered first edge pixel point set, and filtering the second edge pixel point set to obtain a filtered second edge pixel point set, wherein a first difference value between the pixel value of the ith pixel point in the filtered first edge pixel point set and the pixel value of the jth pixel point, corresponding to the ith pixel point, in the filtered second edge pixel point set is smaller than a second difference value between the pixel value of the ith pixel point in the first edge pixel point set and the pixel value of the jth pixel point in the second edge pixel point set, where i is a positive integer less than or equal to the total number of pixel points in the first edge pixel point set, and j is a positive integer less than or equal to the total number of pixel points in the second edge pixel point set.

Wherein adjusting to the target resolution comprises:

1) Adjusting the second resolution to the first resolution in the case where the target resolution is equal to the first resolution;

2) Adjusting the first resolution to the second resolution in the case where the target resolution is equal to the second resolution;

3) In the case where the target resolution is equal to the third resolution, the first resolution is adjusted to the third resolution, and the second resolution is adjusted to the third resolution, wherein the third resolution is different from the first resolution and different from the second resolution.

By adjusting the resolution of the decoding blocks and performing edge filtering on the edge pixel point sets determined in the decoding blocks, visible seams in the video can be avoided during reconstruction, the content of the video can be restored accurately, and the technical problem of video distortion caused by inconsistent resolutions is solved.

According to the embodiment, the influence of resolution adjustment on the pixel points can be reduced by carrying out edge filtering on the adjusted reconstruction blocks, and the effectiveness of the pixel points in the pixel region to be matched is ensured.
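A minimal sketch of such edge filtering is given below for two horizontally adjacent blocks that have already been adjusted to the same resolution. Only the single column on each side of the boundary is filtered, and the (3, 1)/4 weighting is an illustrative assumption; the embodiment leaves the filtering method open.

```python
def filter_vertical_edge(left, right):
    """Smooth the columns facing each other across a vertical block boundary.

    left, right: horizontally adjacent blocks (2-D lists) at the same resolution.
    The (3, 1) / 4 weights are an assumption chosen for illustration.
    Returns new blocks; filtered samples are floats.
    """
    new_left = [list(row) for row in left]
    new_right = [list(row) for row in right]
    for y, (lrow, rrow) in enumerate(zip(left, right)):
        a, b = lrow[-1], rrow[0]            # edge pixels on either side
        new_left[y][-1] = (3 * a + b) / 4   # move a towards b
        new_right[y][0] = (a + 3 * b) / 4   # move b towards a
    return new_left, new_right
```

With these weights the difference across the boundary after filtering is half the difference before filtering, so whenever the two edge pixels differ, the first difference value is smaller than the second difference value, as required by the filtering step described above.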

In step S204, a target pixel region matching the block to be decoded is determined from the pixel regions to be matched according to a target MV corresponding to the block to be decoded of the current video frame, where the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as the size of the block to be decoded.

After the pixel region to be matched is obtained, the target pixel region can be determined from the pixel region to be matched according to the target MV, that is, the motion vector from the target pixel region corresponding to the block to be decoded to the block to be decoded. The target pixel region is used for determining the target decoding block corresponding to the block to be decoded, wherein the target decoding block is the decoding block obtained by decoding the block to be decoded at the highest resolution. The target pixel region may span multiple adjusted reconstructed blocks.

The target MV may be carried in indication information corresponding to the video to be decoded, the current video frame, or the block to be decoded. The indication information may carry a plurality of parameters including at least a motion vector parameter for indicating the target MV. From the parameter values of the motion vector parameters, the target MV can be determined.

It should be noted that, the target MV is used to determine the relative positional relationship between the target pixel region and the block to be decoded, and may be a motion vector from the target pixel region to the block to be decoded, or a motion vector from the block to be decoded to the target pixel region.

In addition to the target MV, determining the target pixel region also requires a reference position for locating the target pixel region, namely the relative position of a reference point within the block to be decoded. The reference point may be a pixel position in the block to be decoded (for example, the upper-left, upper-right, lower-left, or lower-right pixel), or a position other than a pixel position, for example, the center point. The reference point position may be set as needed, and is not particularly limited in this embodiment.

As an alternative, determining a target pixel region matching the block to be decoded from among the pixel regions to be matched according to the target MV corresponding to the block to be decoded may include: determining a second pixel position matched with the first pixel position in the block to be decoded in the pixel region to be matched, wherein the motion vector from the second pixel position to the first pixel position is a target MV; and determining a target pixel area in the pixel area to be matched according to the second pixel position, wherein the relative position of the second pixel position in the target pixel area is the same as the relative position of the first pixel position in the block to be decoded.

After the target MV is obtained, the reference point position in the block to be decoded can also be determined. The reference point position in the block to be decoded may be the first pixel point position in the block to be decoded. The first pixel location may be a location of a first pixel in the block to be decoded. The first pixel point may be a specific pixel point in the block to be decoded, and the specific pixel point may be any pixel point, for example, a pixel point in an upper left corner, a pixel point in an upper right corner, a pixel point in a lower left corner, and a pixel point in a lower right corner.

The reference point position in the block to be decoded (the relative position of the first pixel point position in the block to be decoded) and the reference point position in the target pixel region (the relative position of the second pixel point position in the target pixel region) are the same. The reference point position may be defined by convention, or may be determined by the value of a reference point position parameter among the plurality of parameters in the indication information. The reference point position may be defined as needed, and is not particularly limited in this embodiment.

Either the motion vector from the second pixel position to the first pixel position is the target MV, or the motion vector from the first pixel position to the second pixel position is the target MV. Starting from the reference point position in the block to be decoded, the reference point position in the target pixel region (the second pixel position) can be obtained by applying the target MV. Since the size of the block to be decoded is known, the target pixel region can then be located in the pixel region to be matched according to the reference point position in the target pixel region.

For example, as shown in fig. 5, the second pixel position may be determined according to the target MV and the first pixel position (e.g., the first and second pixel positions are the two end points of the target MV in fig. 5), and the target pixel region may then be determined according to the size information of the block to be decoded (or other information from which the size of the block to be decoded can be determined).

According to this embodiment, the target pixel region is determined according to the target MV corresponding to the block to be decoded and the reference point position (the first pixel point position) in the block to be decoded, so that the accuracy of determining the target pixel region can be ensured, and the decoding quality can be improved.
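The positioning described above can be sketched as follows. The sketch assumes that the reference point is the upper-left pixel, that positions and motion vectors are (row, column) pairs in highest-resolution samples, and that the target MV points from the target pixel region to the block to be decoded; these conventions and the function name are illustrative assumptions.

```python
def locate_target_region(region, first_pos, target_mv, block_h, block_w):
    """Cut the target pixel region out of the pixel region to be matched.

    region:    2-D list, the pixel region to be matched.
    first_pos: (row, col) of the upper-left pixel of the block to be decoded.
    target_mv: (d_row, d_col), assumed to point from the target pixel region
               to the block to be decoded.
    """
    # Second pixel position = first pixel position minus the target MV.
    row = first_pos[0] - target_mv[0]
    col = first_pos[1] - target_mv[1]
    if (row < 0 or col < 0
            or row + block_h > len(region)
            or col + block_w > len(region[0])):
        raise ValueError("target MV points outside the pixel region to be matched")
    # The region has the same size as the block to be decoded.
    return [r[col:col + block_w] for r in region[row:row + block_h]]
```

If the opposite convention is used (the target MV points from the block to be decoded to the target pixel region), the subtraction becomes an addition; everything else is unchanged.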

In step S206, the pixel values of the pixel points in the target decoding block obtained by decoding the block to be decoded at the highest resolution are determined according to the pixel values of the pixel points in the target pixel region.

After the target pixel area is determined, the pixel value of the pixel point in the target decoding block corresponding to the block to be decoded can be determined according to the pixel value of the pixel point in the target pixel area, wherein the target decoding block is a decoding block obtained by decoding the block to be decoded according to the highest resolution.

The pixel values of the pixel points in the target decoding block corresponding to the block to be decoded may be determined in various manners, for example, the target pixel region may be directly used as the target decoding block: and taking the pixel value of the pixel point in the target pixel area as the pixel value of the corresponding pixel point in the target decoding block.

As an alternative, determining, according to the pixel values of the pixel points in the target pixel region, the pixel values of the pixel points in the target decoding block obtained by decoding the block to be decoded at the highest resolution includes: converting the pixel values of the pixel points in the target pixel region according to the target conversion parameters to obtain the pixel values of the pixel points in the target decoding block.

In addition to the target MV, parameters in the video bitstream corresponding to the block to be decoded may further include: a target conversion parameter, which may be a conversion parameter that converts a pixel value of a pixel point in a target pixel region into a pixel value of a pixel point in a target decoding block. The conversion parameters may be: the difference between the pixel values of the pixel points in the target pixel region and the pixel values of the pixel points in the target decoding block.

Since the target pixel region is a pixel region matching the block to be decoded (for example, a pixel region most similar to the block to be decoded in the pixel region to be matched), the data amount of the conversion parameters required for determining the target decoding block is smaller, so that an accurate target decoding block can be obtained at a lower cost.

According to the embodiment, the pixel values of the pixel points in the target pixel region are converted according to the target conversion parameters, so that the pixel values of the pixel points in the target decoding block are obtained, the accuracy of determining the target decoding block can be ensured, and the decoding quality is improved.
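Where the conversion parameters are per-pixel differences, the conversion can be sketched as an element-wise addition. The sign convention (difference = target decoding block minus target pixel region) and the clipping to an 8-bit range are assumptions of this sketch and are not stated in the embodiment.

```python
def apply_conversion(target_region, differences, max_value=255):
    """Derive the target decoding block from the target pixel region.

    differences: per-pixel values assumed to equal
                 (target decoding block pixel) - (target pixel region pixel).
    Results are clipped to [0, max_value]; the clipping is an assumption.
    """
    return [[min(max(p + d, 0), max_value) for p, d in zip(prow, drow)]
            for prow, drow in zip(target_region, differences)]
```

When all differences are zero, the result is identical to the target pixel region, which corresponds to using the target pixel region directly as the target decoding block.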

It should be noted that, for each block of the current video frame whose target decoding block is obtained according to a motion vector, the highest resolution may be used as the resolution adopted when the block to be decoded is decoded, or the target decoding block may be downsampled to a target resolution indicated by the indication information. For a decoding block obtained by decoding the data code stream, the resolution employed at the time of encoding may be used as the resolution employed at the time of decoding.

According to another aspect of an embodiment of the present invention, there is provided a video encoding method, as shown in fig. 6, the method including:

Step S602, adjusting the resolution of a reconstructed block of an encoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched having the highest resolution, wherein the initial resolution is the resolution adopted by the encoded block during encoding, and the highest resolution is the highest resolution in a preset resolution set;

Step S604, searching the pixel region to be matched for a target pixel region matching a block to be encoded of the current video frame, wherein the size of the target pixel region is the same as the size of the block to be encoded;

Step S606, obtaining the MV from the target pixel region to the block to be encoded as the target MV corresponding to the block to be encoded.

It should be noted that the video encoding method shown in fig. 6 may be used in, but is not limited to, the video encoder shown in fig. 1. The video encoder cooperates with other components to complete the encoding process of the current video frame.

Alternatively, in this embodiment, the video encoding method may be applied to, but is not limited to, application scenarios such as a video playing application, a video sharing application, or a video session application. The video transmitted in these application scenarios may include, but is not limited to, long video and short video. For example, the long video may be an episode with a longer play time (for example, longer than 10 minutes) or the pictures shown in a long video session, and the short video may be a voice message exchanged between two or more parties, or a video with a shorter play time (for example, less than or equal to 30 seconds) shown on a sharing platform. The above is merely an example, and the video encoding method provided in this embodiment may be applied to, but is not limited to, a transmitting apparatus for transmitting video in the above application scenarios.

In the video encoding process, in order to ensure the real-time performance and flexibility of video frame encoding, a video frame is divided into a plurality of blocks to be encoded respectively. Furthermore, before encoding the current block, it is necessary to perform encoding pre-analysis on the current block by referring to the encoded block to determine the resolution of the current block. In the video encoding process in the related art, for different blocks within the same video frame, for convenience of processing, a uniform resolution is generally used for encoding.

As shown in fig. 7, if high-resolution encoding is used for the different blocks in a video frame and the transmission bandwidth is relatively small (e.g., smaller than the bandwidth threshold Th shown in fig. 7), the peak signal-to-noise ratio PSNR1 obtained with high-resolution encoding is lower than the peak signal-to-noise ratio PSNR2 obtained with low-resolution encoding. That is, when the transmission bandwidth is small, high-resolution encoding yields a relatively small PSNR1 and relatively large distortion.

Similarly, if low-resolution encoding is used for the different blocks in a video frame and the transmission bandwidth is relatively large (e.g., greater than the bandwidth threshold Th shown in fig. 7), the peak signal-to-noise ratio PSNR3 obtained with low-resolution encoding is lower than the peak signal-to-noise ratio PSNR4 obtained with high-resolution encoding. That is, when the transmission bandwidth is large, low-resolution encoding yields a relatively small PSNR3 and relatively large distortion.
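For reference, the inverse relation between peak signal-to-noise ratio and distortion can be seen from the commonly used definition PSNR = 10·log10(MAX² / MSE). The following sketch computes it for two blocks of equal size; the 8-bit peak value of 255 and the function name are assumptions, and the embodiment does not prescribe how PSNR is measured.

```python
import math


def psnr(original, reconstructed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D lists."""
    squared_errors = [(o - r) ** 2
                      for orow, rrow in zip(original, reconstructed)
                      for o, r in zip(orow, rrow)]
    mse = sum(squared_errors) / len(squared_errors)  # mean squared error
    if mse == 0:
        return math.inf  # identical blocks: no distortion
    return 10 * math.log10(max_value ** 2 / mse)
```

A larger mean squared error (more distortion) gives a smaller PSNR, which is why a lower PSNR in fig. 7 indicates larger distortion.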

Moreover, even if different resolutions are adopted for different blocks, the related art provides no way to effectively and dynamically indicate the relative relationship between each block and its reference block, and thus the video encoding method in the related art has a problem of low encoding efficiency.

In this embodiment, for a current block to be encoded in a current video frame to be encoded in a video to be encoded, a plurality of encoded blocks are reconstructed and adjusted to have the same resolution (highest resolution), a pixel area to be matched is obtained, a target pixel area matched with the block to be encoded is searched from the pixel area to be matched, and then a target MV from the target pixel area to the block to be encoded is determined, so that the technical problems of low encoding and decoding efficiency and poor decoding effect existing in a manner that different blocks in a frame adopt different resolutions to encode and decode in the related art are solved, encoding and decoding efficiency is improved, and decoding effect is improved.

The video encoding method in the present embodiment is described below with reference to fig. 6.

In step S602, the resolution of the reconstructed block of the encoded block of the current video frame is adjusted from an initial resolution to a highest resolution, and a pixel region to be matched having the highest resolution is obtained, wherein the initial resolution is the resolution adopted by the encoded block during encoding, and the highest resolution is the highest resolution in a preset resolution set.

A video transmitting device (e.g., terminal 102) may encode a video to be transmitted to obtain an encoded video bitstream. Video frames in the video to be encoded that are associated with one another may be encoded in the order of their association. Video frames without such association may be encoded in their order in the video, in a random order, or in parallel. The specific encoding manner may be determined according to convention or the resource condition of the terminal, which is not particularly limited in this embodiment.

For each image block in the current video frame to be encoded, when the intra block copy encoding mode is used, encoding can be performed at different resolutions. The resolution adopted by the block to be encoded may be determined according to the pixel values of the pixel points in the block to be encoded, according to reference information of other image blocks or other video frames, or according to both. The specific manner of determining the resolution adopted for each image block is not particularly limited in this embodiment. The resolution adopted by each image block can be indicated by indication information corresponding to the current video frame (or the block to be encoded) in the video code stream.

When each block to be encoded is encoded, the resolution of a reconstructed block of the encoded block of the current video frame may be first adjusted from the initial resolution to the highest resolution, so as to obtain a pixel region to be matched with the highest resolution. The initial resolution is the resolution adopted by the coded block in the process of coding, and the corresponding initial resolutions of different coded blocks can be the same or different. The highest resolution is the highest resolution in the preset resolution set.

On the encoding side, the resolution adjustment of the reconstructed block of the encoded block is similar to that on the decoding side, and will not be described here.

The reconstructed block of the encoded block is a copy of the encoded block after reconstruction: the encoded block and the reconstructed block are the same in size and resolution, and the pixels at the same position have the same pixel values.

As an optional implementation manner, in a case that the initial resolution is lower than the highest resolution, adjusting the resolution of the reconstructed block of the encoded block of the current video frame from the initial resolution to the highest resolution to obtain the pixel region to be matched having the highest resolution includes: upsampling the reconstructed block from the initial resolution to the highest resolution to obtain the pixel region to be matched having the highest resolution.

According to the embodiment, the resolution of the reconstruction block is up-sampled according to the relation between the initial resolution and the highest resolution, so that the pixel region to be matched is obtained, the resolution of the reconstruction block can be guaranteed to be adjusted to the highest resolution, and the accuracy of determining the pixel region to be matched is improved.

There may be one or more encoded blocks before the current block to be encoded is encoded. In the case that there is one encoded block, the resolution of the reconstructed block of that encoded block may be adjusted from the initial resolution to the highest resolution to obtain the pixel region to be matched having the highest resolution. In the case that there are a plurality of encoded blocks, the resolutions of the reconstructed blocks of the plurality of encoded blocks may each be adjusted from the respective initial resolution to the highest resolution to obtain the pixel region to be matched having the highest resolution.

As an alternative embodiment, in a case where the encoded block includes a plurality of blocks and the plurality of blocks includes a target block having an initial resolution lower than the highest resolution, adjusting the resolution of the reconstructed block of the encoded block of the current video frame from the initial resolution to the highest resolution to obtain the pixel region to be matched having the highest resolution includes: upsampling the reconstructed block of the target block from the initial resolution to the highest resolution to obtain a target encoded block having the highest resolution, wherein the pixel region to be matched includes the target encoded block.

For a target block (which may include one or more target blocks) of the plurality of encoded blocks having an initial resolution lower than the highest resolution, the resolution of the reconstructed block of the target block may be upsampled from the initial resolution to the highest resolution to obtain a target encoded block (a first target encoded block) having a resolution of the highest resolution, and the region of pixels to be matched includes the target encoded block.

As another alternative embodiment, in the case where the encoded block includes a plurality of blocks and the plurality of blocks includes other blocks (the other blocks may include one or more) having an initial resolution equal to the highest resolution, the resolution of the reconstructed block of the other blocks may not be adjusted, and the reconstructed block of the other blocks is taken as a second target encoded block having the highest resolution, and the pixel region to be matched includes the second target encoded block.

According to the embodiment, when the number of the coded blocks is multiple, the resolution of the reconstruction block of each coded block is adjusted according to the relation between the resolution corresponding to each coded block and the highest resolution, so that the resolution of each reconstruction block can be guaranteed to be adjusted to the highest resolution, and the accuracy of determining the pixel region to be matched is improved.

In addition to adjusting the resolution of the reconstructed block of each encoded block from the initial resolution to the highest resolution, for the current block to be encoded, if the resolution of the current block to be encoded is not the highest resolution, the current block to be encoded may be upsampled, thereby adjusting the resolution of the current block to be encoded to the highest resolution.

After the resolution of the reconstructed block of each encoded block is adjusted from the initial resolution to the highest resolution, the adjusted reconstructed block may be used as the pixel region to be matched, or the adjusted reconstructed block may be subjected to edge filtering first, and the filtered reconstructed block may be used as the pixel region to be matched.

As an alternative, after the resolution of the reconstructed block of the encoded block of the current video frame is adjusted from the initial resolution to the highest resolution, in the case where the encoded block includes a plurality of blocks, edge filtering may be performed on pixel points on adjacent sides of adjacent pixel blocks in the pixel region to be matched, where the adjacent pixel blocks are pixel blocks in the pixel region to be matched that correspond to adjacent blocks in the plurality of blocks.

It should be noted that the filtering process performed on the encoding side is substantially similar to the filtering operation performed on the decoding side, and will not be described here.

In step S604, a target pixel area matching the block to be encoded of the current video frame is searched for in the pixel area to be matched, wherein the size of the target pixel area is the same as the size of the block to be encoded.

After the pixel region to be matched is obtained, a target pixel region matching the block to be encoded may be searched for in the pixel region to be matched. The manner of finding the target pixel region may include: on the premise that each coding block may select a different resolution for encoding, unifying the resolutions of the reconstructed blocks of the encoded blocks, and finding, by enumeration, the target pixel region (such as the most similar region) matching the block to be encoded within the encoded blocks (that is, within the pixel region to be matched).

For example, on the premise that each coding block may be encoded at a different resolution, the most similar region in the encoded blocks is found by enumeration, which requires calculating the pixel-wise distance between each candidate region and the current block. Each candidate region is obtained from the reconstruction of the encoded blocks and must first be brought to a uniform resolution; one way of unifying the resolution is to interpolate each block to the highest resolution.

As an alternative, searching for the target pixel area matching the block to be encoded in the pixel area to be matched includes: sequentially acquiring, with a preset step length, a plurality of candidate pixel areas of the same size as the block to be encoded from the pixel area to be matched, in row-first-then-column order or in column-first-then-row order; and determining the target pixel area from the plurality of candidate pixel areas, wherein the target pixel area is the candidate pixel area that has the highest similarity with the block to be encoded among the plurality of candidate pixel areas and whose similarity with the block to be encoded is higher than a similarity threshold.

The target pixel area may be found in the pixel area to be matched by enumeration as follows: a plurality of candidate pixel areas of the same size as the block to be encoded are sequentially acquired from the pixel area to be matched with a preset step length, in row-first-then-column order or in column-first-then-row order, and the candidate pixel area that has the highest similarity with the block to be encoded, and whose similarity is higher than a similarity threshold, is taken as the target pixel area.

It should be noted that, the similarity between the candidate pixel region and the block to be encoded may be: the ratio of the number of pixels with the same pixel value at the same position to the total number of pixels contained in the candidate pixel region. Other types of similarity are also possible. The manner of determining the similarity between the candidate pixel region and the block to be encoded may be set as required, and is not particularly limited in this embodiment.
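The similarity measure mentioned above (the proportion of co-located pixels with identical values) can be sketched as follows. The function name `similarity` and the list-of-rows representation are assumptions for illustration; other measures may equally be used.

```python
def similarity(candidate, block):
    """Fraction of positions at which candidate and block hold the same value.

    Both arguments are lists of rows of equal size.
    """
    total = sum(len(row) for row in candidate)
    same = sum(
        1
        for cand_row, block_row in zip(candidate, block)
        for c, b in zip(cand_row, block_row)
        if c == b
    )
    return same / total
```

For two 2x2 areas that differ in exactly one pixel, this measure yields 0.75.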

For example, as shown in fig. 8, the candidate pixel regions may be determined as follows: with the pixel point in the upper left corner of the pixel region to be matched as the starting point, a first candidate pixel region of the same size as the block to be encoded is obtained (its upper left pixel point is A(0, 0)); the window is then shifted one pixel point to the right to obtain a second candidate pixel region (its upper left pixel point is A(0, 1)); candidate pixel regions continue to be acquired by shifting one pixel point to the right each time until the right side of the pixel region to be matched is reached; the window is then moved down by one row and returned to the left side, and acquisition continues (the upper left pixel point is A(1, 0)); candidate pixel regions are acquired in this row-first-then-column manner until no candidate pixel region satisfying the size condition remains.

According to this embodiment, a plurality of candidate pixel areas are sequentially acquired with a preset step length, in row-first-then-column order or in column-first-then-row order, and the target pixel area is determined from the plurality of candidate pixel areas. This avoids missing any candidate pixel area and ensures that the target pixel area is determined accurately.
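The row-first-then-column enumeration can be sketched as below. This is a simplified illustration, not the embodiment's implementation: the function name `find_target_region`, the default step length and threshold, and the use of the equal-pixel ratio as the similarity are assumptions.

```python
def find_target_region(region, block, step=1, threshold=0.5):
    """Scan region row by row and return the (row, col) of the best candidate.

    region    : pixel area to be matched, list of rows at the highest resolution
    block     : block to be encoded, list of rows (same resolution)
    step      : preset step length of the sliding window
    threshold : a candidate qualifies only if its similarity exceeds this value
    Returns None if no candidate exceeds the threshold.
    """
    bh, bw = len(block), len(block[0])
    rh, rw = len(region), len(region[0])
    best_pos, best_sim = None, threshold
    for r in range(0, rh - bh + 1, step):        # move down row by row
        for c in range(0, rw - bw + 1, step):    # shift right within a row
            same = sum(
                region[r + dy][c + dx] == block[dy][dx]
                for dy in range(bh)
                for dx in range(bw)
            )
            sim = same / (bh * bw)
            if sim > best_sim:                   # strictly better than current best
                best_pos, best_sim = (r, c), sim
    return best_pos
```

When the function returns `None`, no target pixel area has been found and the block to be encoded is handled by the direct-encoding fallback instead.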

In step S606, MVs from the target pixel region to the block to be encoded are acquired, and a target MV corresponding to the block to be encoded is obtained.

After the target pixel region is determined, a motion vector from the reference point position in the target pixel region to the reference point position in the block to be encoded may be taken as a target MV corresponding to the block to be encoded.
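As a small sketch of this step, the target MV can be computed as a coordinate difference. Taking the upper left pixel point as the reference point, and expressing positions as (row, column) pairs at the highest resolution, are assumptions made here; the name `motion_vector` is hypothetical.

```python
def motion_vector(target_ref, block_ref):
    """MV pointing from the target pixel region to the block to be encoded.

    target_ref : (row, col) of the reference point in the target pixel region
    block_ref  : (row, col) of the reference point in the block to be encoded
    """
    return (block_ref[0] - target_ref[0], block_ref[1] - target_ref[1])
```

With this convention, adding the MV to the reference point of the target pixel region gives the reference point of the block to be encoded.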

Alternatively, in addition to the target MV, a difference in pixel value between a pixel point in the target pixel region and a pixel point in the block to be encoded may be determined, and the difference may be written as a target conversion parameter into the video bitstream obtained after encoding (for example, written as indication information to a specific position in the video bitstream).
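One possible reading of the target conversion parameter is a per-pixel difference, sketched below. The sign convention (block minus target region) and the function names are assumptions; the embodiment only states that the pixel-value difference is written into the video bitstream.

```python
def conversion_parameter(target_region, block):
    """Encoder side: per-pixel difference, block minus target pixel region."""
    return [
        [b - t for t, b in zip(t_row, b_row)]
        for t_row, b_row in zip(target_region, block)
    ]


def apply_conversion(target_region, diff):
    """Decoder side: add the difference back to the target pixel region."""
    return [
        [t + d for t, d in zip(t_row, d_row)]
        for t_row, d_row in zip(target_region, diff)
    ]
```

Applying `apply_conversion` to the target pixel region with the difference produced by `conversion_parameter` recovers the original block exactly in this sketch, since no quantization is modelled.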

It should be noted that if the target pixel area matched with the block to be encoded cannot be found, the block to be encoded may be directly encoded according to the target resolution (the resolution adopted when the current block to be encoded initially determines encoding), so as to obtain an encoded block (encoded data) corresponding to the block to be encoded, and the obtained encoded data is written into the video code stream.

Specifically, the description is given with reference to steps S902 to S930 in the example shown in fig. 9. In this example, the intra block copy coding mode is used in a scenario where blocks are encoded at different resolutions: the pixel region most similar to the current block to be encoded is searched for in the already-encoded pixel region of the frame, and the MV from that pixel region to the current block is calculated. The most similar pixel region may span multiple blocks, so the reconstructed blocks of those blocks are adjusted to the same resolution, and the pixel region corresponding to the current block is found across them. The resolution adjustment used in this example is as follows: each encoded block whose resolution is not the predetermined highest resolution is adjusted to the highest resolution by upsampling interpolation.

At the encoding end, steps S902 to S916 are performed: a current video frame is acquired, and a resolution decision is made for the block to be encoded, the resolution to be adopted when encoding the block being determined as the target resolution; the resolution of the block to be encoded (or of the reconstructed block of the current block to be encoded) is adjusted to the highest resolution, and the reconstructed blocks of a plurality of encoded blocks are then adjusted to the same resolution (the highest resolution) (steps S906-2 to S906-4) to obtain the pixel region to be matched. It is then judged whether a pixel region most similar to the current block to be encoded is found in the pixel region to be matched. If so, the target MV and the target conversion parameter corresponding to the block to be encoded are determined; if not, the block to be encoded is encoded at the target resolution, and a resolution identifier of the target resolution is added to the video code stream. Finally, the video code stream is output.

At the decoding end, steps S918 to S930 are performed: acquiring a video code stream; performing resolution decision on the block to be decoded, and determining target resolution corresponding to the block to be decoded; determining whether a target MV is acquired from a video code stream; if the target MV is acquired, the reconstructed blocks of the plurality of decoded blocks are adjusted to the same resolution (highest resolution) (step S924-2 to step S924-4) to obtain a pixel region to be matched; and determining a target pixel area matched with the block to be decoded according to the target MV, and determining the pixel value of the pixel point in the target decoding block obtained by decoding the block to be decoded according to the highest resolution according to the pixel value of the pixel point in the target pixel area. And if the target MV is not acquired, decoding the block to be decoded by adopting the target resolution.
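The decoder-side use of the target MV can be sketched as follows. The sketch assumes that the pixel region to be matched is indexed in the same highest-resolution (row, column) coordinates as the block to be decoded, that the upper left pixel point serves as the reference point, and that the MV points from the target pixel region to the block; the name `locate_and_copy` is hypothetical.

```python
def locate_and_copy(region, block_top_left, block_size, mv):
    """Locate the target pixel region from the MV and return its pixel values.

    region         : pixel region to be matched, list of rows
    block_top_left : (row, col) of the block to be decoded
    block_size     : (height, width) of the block to be decoded
    mv             : target MV, pointing from the target region to the block
    """
    # Step back along the MV to reach the target region's reference point.
    row = block_top_left[0] - mv[0]
    col = block_top_left[1] - mv[1]
    h, w = block_size
    if row < 0 or col < 0 or row + h > len(region) or col + w > len(region[0]):
        raise ValueError("target MV points outside the pixel region to be matched")
    # The target region has the same size as the block to be decoded.
    return [region[row + dy][col:col + w] for dy in range(h)]
```

The returned pixel values can be used directly as the pixel values of the target decoding block, or first converted according to a target conversion parameter if one is present in the video code stream.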

Each block to be decoded of each current video frame is decoded in this way to obtain the decoded video frame corresponding to each current video frame in the video code stream, until the decoding of the video is finally completed.

The foregoing is merely an example. The video encoding method and the video decoding method provided in this embodiment are applied to the process of deciding the target MV and the process of using the target MV shown in fig. 1. When the encoding end and the decoding end adopt different resolutions for different blocks to be encoded or decoded, the resolution of the reconstructed block of each encoded (or decoded) block is adjusted to the highest resolution by upsampling interpolation before each block is encoded (or decoded), so that the motion information between each block and its reference block can be determined. This relative relationship is then used directly for encoding and decoding, which improves the encoding and decoding efficiency.

It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.

According to still another aspect of the embodiment of the present invention, there is also provided a video decoding apparatus. As shown in fig. 10, the apparatus includes:

(1) A first adjusting unit 1002, configured to adjust the resolution of the reconstructed block of the decoded block of the current video frame from an initial resolution to a highest resolution, to obtain a pixel region to be matched with the highest resolution, where the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set;

(2) A first determining unit 1004, configured to determine, from a pixel area to be matched, a target pixel area matched with the block to be decoded according to a target MV corresponding to the block to be decoded of the current video frame, where the target MV is a motion vector from the target pixel area to the block to be decoded, and a size of the target pixel area is the same as a size of the block to be decoded;

(3) The second determining unit 1006 is configured to determine, according to the pixel values of the pixel points in the target pixel area, the pixel values of the pixel points in the target decoding block obtained by decoding the block to be decoded according to the highest resolution.

It should be noted that the video decoding apparatus shown in fig. 10 may be used in the video decoder shown in fig. 1, but is not limited to the above. The decoding process of the current video frame is completed through the interaction of the video decoder with other components.

Alternatively, the first adjusting unit 1002 may be used to perform the aforementioned step S202, the first determining unit 1004 may be used to perform the aforementioned step S204, and the second determining unit 1006 may be used to perform the aforementioned step S206.

In this embodiment, for a current block to be decoded in a current video frame in a video to be decoded, a plurality of decoded blocks are reconstructed and adjusted to have the same resolution (highest resolution), so as to obtain a pixel area to be matched, a target pixel area matched with the block to be decoded is determined according to a target MV corresponding to the block to be decoded, and then a pixel value of a pixel point in the target decoding block obtained by decoding the block to be decoded is determined according to a pixel value of the target pixel area, so that the technical problems of low encoding and decoding efficiency and poor decoding effect existing in the manner of encoding and decoding different blocks in a frame in the related art by adopting different resolutions can be solved, the encoding and decoding efficiency is improved, and the decoding effect is improved.

As an alternative embodiment, the first adjusting unit 1002 includes:

(1) A first upsampling module, configured to upsample the resolution of the reconstructed block from the initial resolution to the highest resolution in the case that the initial resolution is lower than the highest resolution, to obtain the pixel region to be matched, whose resolution is the highest resolution.

As an alternative embodiment, the first adjusting unit 1002 includes:

(1) A second upsampling module, configured to, in the case that the decoded block comprises a plurality of blocks and the plurality of blocks comprise a target block whose initial resolution is lower than the highest resolution, upsample the resolution of the reconstructed block of the target block from the initial resolution to the highest resolution to obtain a target decoded block with the highest resolution, wherein the pixel area to be matched comprises the target decoded block.

As an alternative embodiment, the first adjusting unit 1002 includes:

(1) A first filtering module, configured to, after the resolution of the reconstructed block of the decoded block of the current video frame is adjusted from the initial resolution to the highest resolution, and in the case that the decoded block comprises a plurality of blocks, perform edge filtering on pixel points on adjacent edges of adjacent pixel blocks in the pixel region to be matched, wherein the adjacent pixel blocks are pixel blocks in the pixel region to be matched that correspond to adjacent blocks in the plurality of blocks.

As an alternative embodiment, the first determining unit 1004 includes:

(1) The first determining module is used for determining a second pixel position matched with the first pixel position in the block to be decoded in the pixel region to be matched, wherein the motion vector from the second pixel position to the first pixel position is a target MV;

(2) A second determining module, configured to determine the target pixel area in the pixel area to be matched according to the second pixel point position, wherein the relative position of the second pixel point position in the target pixel area is the same as the relative position of the first pixel point position in the block to be decoded.

As an alternative embodiment, the second determining unit 1006 includes:

(1) A conversion module, configured to convert the pixel values of the pixel points in the target pixel area according to the target conversion parameter, to obtain the pixel values of the pixel points in the target decoding block.

According to still another aspect of the embodiment of the present invention, there is also provided a video encoding apparatus. As shown in fig. 11, the apparatus includes:

(1) A second adjusting unit 1102, configured to adjust a resolution of a reconstructed block of an encoded block of the current video frame from an initial resolution to a highest resolution, to obtain a pixel region to be matched with the highest resolution, where the initial resolution is a resolution adopted by the encoded block during encoding, and the highest resolution is a highest resolution in a preset resolution set;

(2) A searching unit 1104, configured to search a target pixel area matching with a block to be encoded of the current video frame in the pixel area to be matched, where a size of the target pixel area is the same as a size of the block to be encoded;

(3) The obtaining unit 1106 is configured to obtain an MV from the target pixel area to the block to be encoded, and obtain a target MV corresponding to the block to be encoded.

It should be noted that the video encoding apparatus shown in fig. 11 may be used in the video encoder shown in fig. 1, but is not limited to the foregoing. The encoding process of the current video frame is completed through the interaction of the video encoder with other components.

Alternatively, the second adjusting unit 1102 may be used to perform the aforementioned step S602, the searching unit 1104 may be used to perform the aforementioned step S604, and the obtaining unit 1106 may be used to perform the aforementioned step S606.

In this embodiment, for a current block to be encoded in a current video frame in a video to be encoded, a plurality of encoded blocks are reconstructed and adjusted to the same resolution (highest resolution), a pixel area to be matched is obtained, a target pixel area matched with the block to be encoded is searched from the pixel area to be matched, and then a target MV from the target pixel area to the block to be encoded is determined, so that the technical problems of low encoding and decoding efficiency and poor decoding effect existing in a manner that different blocks in a frame adopt different resolutions in related technologies are solved, encoding and decoding efficiency is improved, and decoding effect is improved.

As an alternative embodiment, the second adjusting unit 1102 includes:

(1) A third upsampling module, configured to upsample the resolution of the reconstructed block from the initial resolution to the highest resolution in the case that the initial resolution is lower than the highest resolution, to obtain the pixel region to be matched, whose resolution is the highest resolution.

As an alternative embodiment, the second adjusting unit 1102 includes:

(1) A fourth upsampling module, configured to, in the case that the encoded block comprises a plurality of blocks and the plurality of blocks comprise a target block whose initial resolution is lower than the highest resolution, upsample the resolution of the reconstructed block of the target block from the initial resolution to the highest resolution to obtain a target encoded block with the highest resolution, wherein the pixel area to be matched comprises the target encoded block.

As an alternative embodiment, the second adjusting unit 1102 includes:

(1) A second filtering module, configured to, after the resolution of the reconstructed block of the encoded block of the current video frame is adjusted from the initial resolution to the highest resolution, and in the case that the encoded block comprises a plurality of blocks, perform edge filtering on pixel points on adjacent edges of adjacent pixel blocks in the pixel region to be matched, wherein the adjacent pixel blocks are pixel blocks in the pixel region to be matched that correspond to adjacent blocks in the plurality of blocks.

As an alternative embodiment, the search unit 1104 includes:

(1) An acquisition module, configured to sequentially acquire, with a preset step length, a plurality of candidate pixel areas of the same size as the block to be encoded from the pixel area to be matched, in row-first-then-column order or in column-first-then-row order;

(2) A third determining module, configured to determine the target pixel area from the plurality of candidate pixel areas, wherein the target pixel area is the candidate pixel area that has the highest similarity with the block to be encoded among the plurality of candidate pixel areas and whose similarity with the block to be encoded is higher than a similarity threshold.

According to a further aspect of embodiments of the present invention, there is also provided a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.

Alternatively, in the present embodiment, the above-described computer-readable storage medium may be configured to store a computer program for executing the following steps:

S1, adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution, to obtain a pixel region to be matched with the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set;

S2, determining a target pixel area matching the block to be decoded from the pixel area to be matched according to a target MV corresponding to the block to be decoded of the current video frame, wherein the target MV is a motion vector from the target pixel area to the block to be decoded, and the size of the target pixel area is the same as that of the block to be decoded;

S3, determining, according to the pixel values of the pixel points in the target pixel region, the pixel values of the pixel points in the target decoding block obtained by decoding the block to be decoded at the highest resolution.

Alternatively, in the present embodiment, the above-described computer-readable storage medium may be configured to store a computer program for executing the following steps:

S1, adjusting the resolution of a reconstructed block of an encoded block of a current video frame from an initial resolution to a highest resolution, to obtain a pixel region to be matched with the highest resolution, wherein the initial resolution is the resolution adopted by the encoded block during encoding, and the highest resolution is the highest resolution in a preset resolution set;

S2, searching for a target pixel area matching a block to be encoded of the current video frame in the pixel area to be matched, wherein the size of the target pixel area is the same as that of the block to be encoded;

S3, obtaining the MV from the target pixel region to the block to be encoded, to obtain a target MV corresponding to the block to be encoded.

Alternatively, in this embodiment, it will be understood by those skilled in the art that all or part of the steps in the methods of the above embodiments may be completed by a program instructing the relevant hardware of a terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, and the like.

According to a further aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the video decoding method or the video encoding method described above, as shown in fig. 12, the electronic device comprising a memory 1202 and a processor 1204, the memory 1202 storing a computer program, the processor 1204 being arranged to perform the steps of any of the method embodiments described above by means of the computer program.

Alternatively, in this embodiment, the electronic apparatus may be located in at least one network device of a plurality of network devices of the computer network.

Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:

Alternatively, it will be understood by those skilled in the art that the structure shown in fig. 12 is only schematic, and the electronic device may also be a terminal device such as a smart phone (e.g. an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile internet device (Mobile Internet Devices, MID), or a PAD. Fig. 12 does not limit the structure of the electronic device. For example, the electronic device may also include more or fewer components (e.g., network interfaces, etc.) than shown in fig. 12, or have a different configuration from that shown in fig. 12.

The memory 1202 may be used to store software programs and modules, such as the program instructions/modules corresponding to the video decoding method and apparatus or the video encoding method and apparatus in the embodiments of the present invention. The processor 1204 runs the software programs and modules stored in the memory 1202 to perform various functional applications and data processing, that is, to implement the video decoding method or the video encoding method. The memory 1202 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1202 may further include memory located remotely from the processor 1204, which may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1202 may be used for, but is not limited to, storing information such as sample characteristics of the item and a target virtual resource account number. As an example, as shown in fig. 12, the memory 1202 may include, but is not limited to, the first adjusting unit 1002, the first determining unit 1004, and the second determining unit 1006 in the video decoding apparatus. As another example, the memory 1202 may include, but is not limited to, the second adjusting unit 1102, the searching unit 1104, and the obtaining unit 1106 in the video encoding apparatus. In addition, the memory 1202 may include, but is not limited to, other module units in the foregoing video decoding apparatus or video encoding apparatus, which are not described in detail in this example.

Optionally, the transmission device 1206 is configured to receive or transmit data via a network. Specific examples of the network described above may include wired networks and wireless networks. In one example, the transmission device 1206 comprises a network adapter (Network Interface Controller, NIC) that can be connected to other network devices and routers via a network cable to communicate with the internet or a local area network. In one example, the transmission device 1206 is a Radio Frequency (RF) module for communicating wirelessly with the internet.

In addition, the electronic device further includes: a connection bus 1208 for connecting the respective module components in the above-described electronic apparatus.

The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.

The integrated units in the above embodiments may be stored in the above-described computer-readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present invention may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, comprising several instructions for causing one or more computer devices (which may be personal computers, servers or network devices, etc.) to perform all or part of the steps of the method described in the embodiments of the present invention.

In the foregoing embodiments of the present invention, each embodiment is described with its own emphasis; for a portion that is not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described apparatus embodiments are merely exemplary. For example, the division of the units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed between components may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are also intended to fall within the scope of protection of the present invention.

Claims

1. A video decoding method, comprising:

adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution, to obtain a pixel region to be matched having the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block in decoding, and the highest resolution is the highest resolution in a preset resolution set;

determining a target pixel region matched with the block to be decoded from the pixel region to be matched according to a target motion vector MV corresponding to the block to be decoded of the current video frame, wherein the target motion vector MV is a motion vector from the block to be decoded to the target pixel region, the size of the target pixel region is the same as the size of the block to be decoded, and determining the target pixel region matched with the block to be decoded from the pixel region to be matched according to the target motion vector MV corresponding to the block to be decoded comprises: determining a second pixel position matched with a first pixel position in the block to be decoded in the pixel region to be matched, wherein a motion vector from the second pixel position to the first pixel position is the target motion vector MV; determining the target pixel area in the pixel area to be matched according to the second pixel point position, wherein the relative position of the second pixel point position in the target pixel area is the same as the relative position of the first pixel point position in the block to be decoded;

and determining, according to the pixel values of the pixel points in the target pixel region, the pixel values of the pixel points in the target decoding block obtained by decoding the block to be decoded at the highest resolution.

2. The method of claim 1, wherein adjusting the resolution of the reconstructed block of the decoded block of the current video frame from the initial resolution to the highest resolution, the obtaining the pixel region to be matched having the resolution of the highest resolution comprises:

and under the condition that the initial resolution is lower than the highest resolution, upsampling the resolution of the reconstruction block from the initial resolution to the highest resolution to obtain the pixel region to be matched with the resolution of the highest resolution.

3. The method of claim 1, wherein adjusting the resolution of the reconstructed block of the decoded block of the current video frame from the initial resolution to the highest resolution, the obtaining the pixel region to be matched having the resolution of the highest resolution comprises:

and in the case that the decoded blocks comprise a plurality of blocks and the plurality of blocks comprise the target block with the initial resolution lower than the highest resolution, upsampling the resolution of the reconstructed block of the target block from the initial resolution to the highest resolution to obtain a target decoded block with the resolution of the highest resolution, wherein the pixel region to be matched comprises the target decoded block.

4. The method of claim 1, wherein after adjusting the resolution of the reconstructed block of the decoded block of the current video frame from the initial resolution to the highest resolution, the method further comprises:

and carrying out edge filtering on pixel points on adjacent edges of adjacent pixel blocks in the pixel region to be matched under the condition that the decoded blocks comprise a plurality of blocks, wherein the adjacent pixel blocks are pixel blocks corresponding to the adjacent blocks in the plurality of blocks in the pixel region to be matched.

5. The method according to any one of claims 1 to 4, wherein determining, according to the pixel values of the pixel points in the target pixel region, the pixel values of the pixel points in the target decoded block obtained by decoding the block to be decoded at the highest resolution comprises:

converting the pixel values of the pixel points in the target pixel region according to target conversion parameters to obtain the pixel values of the pixel points in the target decoded block.

6. A video encoding method, comprising:

adjusting the resolution of a reconstructed block of an encoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched having the highest resolution, wherein the initial resolution is the resolution used when encoding the encoded block, and the highest resolution is the highest resolution in a preset resolution set;

searching the pixel region to be matched for a target pixel region that matches a block to be encoded of the current video frame, wherein the size of the target pixel region is the same as the size of the block to be encoded, and searching the pixel region to be matched for the target pixel region that matches the block to be encoded comprises: sequentially obtaining, with a preset step size and in row-first or column-first order, a plurality of candidate pixel regions of the same size as the block to be encoded from the pixel region to be matched; and determining the target pixel region from the plurality of candidate pixel regions, wherein the target pixel region is the candidate pixel region that, among the plurality of candidate pixel regions, has the highest similarity to the block to be encoded and whose similarity to the block to be encoded is higher than a similarity threshold;

and obtaining a motion vector (MV) from the target pixel region to the block to be encoded as a target MV corresponding to the block to be encoded.
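The search recited in claim 6 can be sketched as an exhaustive block-matching scan. This is a minimal sketch under stated assumptions: similarity is measured by the sum of absolute differences (SAD), where a lower SAD means a higher similarity, so the similarity threshold becomes an upper bound `max_sad`; candidates are visited in row-first order; positions are `(row, column)` pairs. The claims fix none of these choices, and all function and parameter names are hypothetical.

```python
def sad(region, top, left, block):
    """Sum of absolute differences between `block` and the candidate at (top, left)."""
    total = 0
    for dy, block_row in enumerate(block):
        for dx, pixel in enumerate(block_row):
            total += abs(region[top + dy][left + dx] - pixel)
    return total


def search_target_region(region, block, step=1, max_sad=None):
    """Return the (top, left) of the best-matching candidate, or None.

    Candidates of the same size as `block` are taken row by row with the
    given step. A candidate qualifies only if its SAD is strictly below
    `max_sad` (assumed stand-in for "similarity above the threshold").
    """
    bh, bw = len(block), len(block[0])
    rh, rw = len(region), len(region[0])
    best = None
    for top in range(0, rh - bh + 1, step):
        for left in range(0, rw - bw + 1, step):
            cost = sad(region, top, left, block)
            if best is None or cost < best[0]:
                best = (cost, top, left)
    if best is None or (max_sad is not None and best[0] >= max_sad):
        return None
    return best[1], best[2]


def motion_vector(target_pos, block_pos):
    """MV from the target pixel region to the block to be encoded, as (dy, dx)."""
    return (block_pos[0] - target_pos[0], block_pos[1] - target_pos[1])


region = [[0, 0, 0, 0],
          [0, 0, 5, 6],
          [0, 0, 7, 8],
          [0, 0, 0, 0]]
block_to_encode = [[5, 6],
                   [7, 8]]
target = search_target_region(region, block_to_encode, step=1, max_sad=4)
mv = motion_vector(target, block_pos=(2, 0))
```

When no candidate passes the threshold the search returns `None`, and an encoder would then have to fall back to some other prediction mode, which is outside the scope of this sketch.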

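7. The method of claim 6, wherein adjusting the resolution of the reconstructed block of the encoded block of the current video frame from the initial resolution to the highest resolution to obtain the pixel region to be matched having the highest resolution comprises:

in a case where the initial resolution is lower than the highest resolution, upsampling the reconstructed block from the initial resolution to the highest resolution to obtain the pixel region to be matched having the highest resolution.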

8. The method of claim 6, wherein adjusting the resolution of the reconstructed block of the encoded block of the current video frame from the initial resolution to the highest resolution to obtain the pixel region to be matched having the highest resolution comprises:

in a case where the encoded block comprises a plurality of blocks and the plurality of blocks comprises a target block whose initial resolution is lower than the highest resolution, upsampling the reconstructed block of the target block from the initial resolution to the highest resolution to obtain a target encoded block having the highest resolution, wherein the pixel region to be matched comprises the target encoded block.

9. The method of claim 6, wherein after adjusting the resolution of the reconstructed block of the encoded block of the current video frame from the initial resolution to the highest resolution, the method further comprises:

in a case where the encoded block comprises a plurality of blocks, performing edge filtering on the pixel points on the adjoining edges of adjacent pixel blocks in the pixel region to be matched, wherein the adjacent pixel blocks are the pixel blocks in the pixel region to be matched that correspond to adjacent blocks among the plurality of blocks.

10. A video decoding apparatus, comprising:

a first adjusting unit, configured to adjust the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched having the highest resolution, wherein the initial resolution is the resolution used when decoding the decoded block, and the highest resolution is the highest resolution in a preset resolution set;

a first determining unit, configured to determine, from the pixel region to be matched, a target pixel region that matches a block to be decoded of the current video frame according to a target motion vector (MV) corresponding to the block to be decoded, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, the size of the target pixel region is the same as the size of the block to be decoded, and determining the target pixel region from the pixel region to be matched according to the target MV comprises: determining, in the pixel region to be matched, a second pixel position that matches a first pixel position in the block to be decoded, wherein the motion vector from the second pixel position to the first pixel position is the target MV; and determining the target pixel region in the pixel region to be matched according to the second pixel position, wherein the relative position of the second pixel position within the target pixel region is the same as the relative position of the first pixel position within the block to be decoded;

and a second determining unit, configured to determine, according to the pixel values of the pixel points in the target pixel region, the pixel values of the pixel points in a target decoded block obtained by decoding the block to be decoded at the highest resolution.
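The position mapping performed by the first determining unit can be sketched as follows. Because the target MV points from the second pixel position to the first, the second position is the first position minus the MV. The sketch assumes the first pixel position is the top-left corner of the block to be decoded, positions and MVs are `(row, column)` pairs, and the target region's pixels are copied directly as the prediction; the function names are hypothetical and any conversion by target conversion parameters is omitted.

```python
def locate_target_region(first_pos, mv):
    """Return the second pixel position, i.e. first_pos minus the MV.

    The MV runs from the second position to the first, so
    first = second + mv and therefore second = first - mv.
    """
    return (first_pos[0] - mv[0], first_pos[1] - mv[1])


def fetch_target_region(region, first_pos, mv, block_size):
    """Copy the target pixel region whose top-left corner is the second position.

    Assumption: `first_pos` is the top-left corner of the block to be decoded,
    so the second position is the top-left corner of the target region.
    """
    top, left = locate_target_region(first_pos, mv)
    height, width = block_size
    if top < 0 or left < 0 or top + height > len(region) or left + width > len(region[0]):
        raise ValueError("target region falls outside the pixel region to be matched")
    return [row[left:left + width] for row in region[top:top + height]]


region = [[0, 0, 0, 0],
          [0, 0, 5, 6],
          [0, 0, 7, 8],
          [0, 0, 0, 0]]
# Block to be decoded sits at (2, 0); the signalled MV is (1, -2).
predicted = fetch_target_region(region, first_pos=(2, 0), mv=(1, -2), block_size=(2, 2))
```

Choosing the top-left corner as the reference pixel is only a convenience; any pixel works as long as the same relative position is used in both the block and the target region.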

11. A video encoding apparatus, comprising:

a second adjusting unit, configured to adjust the resolution of a reconstructed block of an encoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched having the highest resolution, wherein the initial resolution is the resolution used when encoding the encoded block, and the highest resolution is the highest resolution in a preset resolution set;

a searching unit, configured to search the pixel region to be matched for a target pixel region that matches a block to be encoded of the current video frame, wherein the size of the target pixel region is the same as the size of the block to be encoded, and searching the pixel region to be matched for the target pixel region that matches the block to be encoded comprises: sequentially obtaining, with a preset step size and in row-first or column-first order, a plurality of candidate pixel regions of the same size as the block to be encoded from the pixel region to be matched; and determining the target pixel region from the plurality of candidate pixel regions, wherein the target pixel region is the candidate pixel region that, among the plurality of candidate pixel regions, has the highest similarity to the block to be encoded and whose similarity to the block to be encoded is higher than a similarity threshold;

and an acquisition unit, configured to obtain a motion vector (MV) from the target pixel region to the block to be encoded as a target MV corresponding to the block to be encoded.

12. A computer-readable storage medium comprising a stored program, wherein the program, when run, performs the method of any one of claims 1 to 9.

13. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the method according to any one of claims 1 to 9 by means of the computer program.