CN110662060A

CN110662060A - Video encoding method and apparatus, video decoding method and apparatus, and storage medium

Info

Publication number: CN110662060A
Application number: CN201910927120.9A
Authority: CN
Inventors: 高欣玮; 李蔚然; 毛煦楠; 谷沉沉
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-09-27
Filing date: 2019-09-27
Publication date: 2020-01-07
Anticipated expiration: 2039-09-27
Also published as: CN110662060B

Abstract

The invention discloses a video encoding method and apparatus, a video decoding method and apparatus, and a storage medium. The method comprises: adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set; determining, from the pixel region to be matched, a target pixel region matching a block to be decoded of the current video frame according to a target MV corresponding to the block to be decoded, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as that of the block to be decoded; and determining, according to the pixel values of the pixels in the target pixel region, the pixel values of the pixels in a target decoded block obtained by decoding the block to be decoded at the highest resolution.

Description

Video encoding method and apparatus, video decoding method and apparatus, and storage medium

Technical Field

The present invention relates to the field of computers, and in particular, to a video encoding method and apparatus, a video decoding method and apparatus, and a storage medium.

Background

Currently, in the process of video encoding, a video frame in a video may be divided into a plurality of blocks to be encoded respectively. Before encoding the current block, the current block may be pre-analyzed for encoding by referring to the encoded block to determine encoding parameter information of the current block. For ease of processing, different blocks of a frame in a video are typically encoded with uniform resolution.

However, when different blocks in one frame are encoded and decoded at a uniform resolution, distortion arises from the mismatch between the resolution and the transmission bandwidth. Even if different resolutions are adopted for different blocks, the relationship between each block and its reference block is dynamic, which makes it difficult to determine the relative position of a block to be coded and its reference block, and in turn affects the efficiency and effect of coding and decoding.

Therefore, when the related art encodes and decodes different blocks in a frame at different resolutions, the difficulty of determining the relative position of a block to be coded and its reference block leads to low coding and decoding efficiency and a poor decoding effect.

Disclosure of Invention

The embodiment of the invention provides a video coding method and device, a video decoding method and device and a storage medium, which at least solve the technical problems of low coding and decoding efficiency and poor decoding effect in a mode that different blocks in one frame are coded and decoded by adopting different resolutions in the related technology.

According to an aspect of an embodiment of the present invention, there is provided a video decoding method including: adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set; determining, from the pixel region to be matched, a target pixel region matching a block to be decoded of the current video frame according to a target motion vector MV corresponding to the block to be decoded, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as that of the block to be decoded; and determining, according to the pixel values of the pixels in the target pixel region, the pixel values of the pixels in a target decoded block obtained by decoding the block to be decoded at the highest resolution.

According to another aspect of the embodiments of the present invention, there is provided a video encoding method including: adjusting the resolution of a reconstructed block of a coded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the coded block during coding, and the highest resolution is the highest resolution in a preset resolution set; searching the pixel region to be matched for a target pixel region matching a block to be coded of the current video frame, wherein the size of the target pixel region is the same as that of the block to be coded; and obtaining the motion vector MV from the target pixel region to the block to be coded as the target MV corresponding to the block to be coded.

According to still another aspect of the embodiments of the present invention, there is also provided a video decoding apparatus including: a first adjusting unit, configured to adjust the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, where the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set; a first determining unit, configured to determine, according to a target motion vector MV corresponding to a block to be decoded of the current video frame, a target pixel region matching the block to be decoded from the pixel region to be matched, where the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as the size of the block to be decoded; and a second determining unit, configured to determine, according to the pixel values of the pixels in the target pixel region, the pixel values of the pixels in a target decoded block obtained by decoding the block to be decoded at the highest resolution.

According to still another aspect of the embodiments of the present invention, there is also provided a video encoding apparatus including: a second adjusting unit, configured to adjust the resolution of a reconstructed block of a coded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, where the initial resolution is the resolution adopted by the coded block during coding, and the highest resolution is the highest resolution in a preset resolution set; a searching unit, configured to search the pixel region to be matched for a target pixel region matching a block to be coded of the current video frame, where the size of the target pixel region is the same as that of the block to be coded; and an obtaining unit, configured to obtain the motion vector MV from the target pixel region to the block to be coded as the target MV corresponding to the block to be coded.

According to still another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is configured to execute the above-mentioned video decoding method when running.

According to a further aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to perform the above-mentioned video encoding method when running.

According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the video decoding method through the computer program.

According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the video encoding method through the computer program.

In the embodiment of the invention, the resolutions of the reconstructed blocks of all decoded blocks are unified. The resolution of a reconstructed block of a decoded block of the current video frame is adjusted from the initial resolution to the highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set. A target pixel region matching a block to be decoded of the current video frame is determined from the pixel region to be matched according to a target MV corresponding to the block to be decoded, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as that of the block to be decoded. The pixel values of the pixels in a target decoded block, obtained by decoding the block to be decoded at the highest resolution, are then determined according to the pixel values of the pixels in the target pixel region. Because the reconstructed blocks of all decoded blocks are adjusted to the highest resolution, the reference block of the block to be decoded can be determined from a motion vector at a single resolution. The reference block of the block to be decoded can therefore be located quickly while the accuracy of the decoding result is ensured, which improves the coding and decoding efficiency and the decoding effect, and solves the technical problems of low coding and decoding efficiency and poor decoding effect that arise in the related art when different blocks in one frame are coded and decoded at different resolutions.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a schematic diagram of an application environment of an alternative video decoding method according to an embodiment of the present invention;

FIG. 2 is a flow chart illustrating an alternative video decoding method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an alternative video decoding method according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of an alternative video decoding method according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of yet another alternative video decoding method according to an embodiment of the present invention;

FIG. 6 is a flow chart illustrating an alternative video encoding method according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of an alternative video encoding method according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of an alternative video encoding method according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of an alternative video encoding and decoding process according to an embodiment of the present invention;

FIG. 10 is a schematic structural diagram of an alternative video decoding apparatus according to an embodiment of the present invention;

FIG. 11 is a schematic structural diagram of an alternative video encoding apparatus according to an embodiment of the present invention;

FIG. 12 is a schematic structural diagram of an alternative electronic device according to an embodiment of the invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Technical terms involved in the embodiments of the present invention include:

(1) PSNR (Peak Signal-to-Noise Ratio), which represents the ratio of the maximum possible power of a signal to the power of the corrupting noise that affects the accuracy of its representation. PSNR is expressed in dB, and a higher PSNR value represents less distortion;

(2) MV (Motion Vector), which represents the relative displacement between the current coding block and the best matching block in its reference picture.
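As an illustration of definition (1), the following minimal sketch computes PSNR with the commonly used formula PSNR = 10 · log10(MAX² / MSE). The function name `psnr`, the flat-list input, and the default peak value of 255 (8-bit samples) are assumptions made for this example and are not taken from the embodiments.

```python
import math


def psnr(original, distorted, max_value=255):
    """Return the PSNR in dB between two equal-length sequences of pixel values.

    max_value is the peak sample value (255 assumes 8-bit samples).
    """
    if len(original) != len(distorted) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two signals.
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10 * math.log10(max_value ** 2 / mse)
```

A smaller error between the two signals yields a larger return value, which matches the statement that a higher PSNR represents less distortion.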

According to an aspect of an embodiment of the present invention, there is provided a video decoding method. Optionally, as an optional implementation, the video decoding method may be applied, but not limited to, in an application environment as shown in fig. 1. The application environment includes a terminal 102 and a server 104, and the terminal 102 and the server 104 communicate with each other through a network. The terminal 102 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, and the like. The server 104 may be, but not limited to, a computer processing device with a relatively high data processing capability and a certain storage space.

The video encoding method corresponding to the video decoding method described above may also be applied, but is not limited to being applied, in the application environment shown in fig. 1. After the video to be encoded is obtained, the video encoding method provided in this application may be used, without limitation, through the interaction between the terminal 102 and the server 104 shown in fig. 1. In a scene of block-level encoding at different resolutions, a pixel region matching the current block to be encoded (e.g., the most similar pixel region) is searched for, at a single resolution (the highest resolution), in the already-encoded pixel region of the frame, and the target MV from that pixel region to the block to be encoded is obtained, so that the transmission bandwidth is saved while the encoding efficiency and quality of the video frame are ensured. In addition, after the video to be decoded is obtained, the video decoding method provided by the present application may also be adopted, without limitation, through the interaction between the terminal 102 and the server 104 shown in fig. 1. In a scene of block-level coding at different resolutions, the pixel region matching the current block to be decoded is determined according to the target MV at a single resolution, and the block to be decoded is then determined according to that matching pixel region, which improves the decoding efficiency and the decoding effect.
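The encoder-side search described above can be sketched as an exhaustive block match. The text does not specify the matching criterion, so the sum of absolute differences (SAD) used here is an assumption, as are the function name `search_target_mv` and the simplification that the already-encoded pixel region is a rectangular 2D list at the highest resolution.

```python
def search_target_mv(region, block, block_x, block_y):
    """Find the region position most similar to `block` and return (mv, sad).

    region: 2D list of already-encoded pixels at the highest resolution
            (assumed rectangular for this sketch).
    block:  2D list holding the block to be encoded.
    (block_x, block_y): top-left corner of the block in frame coordinates.
    mv is the displacement from the matched region to the block.
    """
    h, w = len(block), len(block[0])
    best = None
    for y in range(len(region) - h + 1):
        for x in range(len(region[0]) - w + 1):
            # SAD is an assumed similarity measure, not one named by the text.
            sad = sum(abs(region[y + j][x + i] - block[j][i])
                      for j in range(h) for i in range(w))
            if best is None or sad < best[0]:
                best = (sad, (block_x - x, block_y - y))
    if best is None:
        raise ValueError("region is smaller than the block")
    return best[1], best[0]
```

The returned MV points from the matched region to the block, following the direction stated in the method, so a decoder recovers the region position by subtracting the MV from the block position.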

In one embodiment, the terminal 102 may include, but is not limited to, the following components: an image processing unit 1021, a processor 1022, a storage medium 1023, a memory 1024, a network interface 1025, a display screen 1026, and an input device 1027. The aforementioned components may be connected by, but are not limited to, a system bus 1028. The image processing unit 1021 is configured to provide at least a rendering capability of a display interface; the processor 1022 is configured to provide computing and control capabilities to support operation of the terminal 102; the storage medium 1023 stores an operating system 1023-2 and a video encoder and/or a video decoder 1023-4. The operating system 1023-2 is used to provide control operation instructions, and the video encoder and/or video decoder 1023-4 is used to perform encoding/decoding operations according to the control operation instructions. In addition, the memory 1024 provides an operating environment for the video encoder and/or video decoder 1023-4 in the storage medium 1023, and the network interface 1025 is used for network communication with the network interface 1043 in the server 104. The display screen 1026 is used for displaying an application interface and the like, such as a decoded video; the input device 1027 is used for receiving commands or data input by a user. For a terminal 102 with a touch screen, the display screen 1026 and the input device 1027 may be the touch screen. The internal structure of the terminal shown in fig. 1 is a block diagram of only the part of the structure related to the present application and does not limit the terminal to which the present application is applied; a specific terminal or server may include more or fewer components than those shown in the figure, combine some components, or have a different arrangement of components.

In one embodiment, the server 104 may include, but is not limited to, the following components: a processor 1041, a memory 1042, a network interface 1043, and a storage medium 1044. The above components may be connected by, but are not limited to, a system bus 1045. The storage medium 1044 includes an operating system 1044-1, a database 1044-2, and a video encoder and/or a video decoder 1044-3. The processor 1041 is used for providing computing and control capability to support the operation of the server 104. The memory 1042 provides an environment for the operation of the video encoder and/or video decoder 1044-3 in the storage medium 1044. The network interface 1043 communicates with the network interface 1025 of the external terminal 102 via a network connection. The operating system 1044-1 in the storage medium is configured to provide control operation instructions; the video encoder and/or video decoder 1044-3 is configured to perform encoding/decoding operations according to the control operation instructions; the database 1044-2 is used to store data. The internal structure of the server shown in fig. 1 is a block diagram of only the part of the structure related to the present application and does not limit the computer device to which the present application is applied; a specific computer device may include more or fewer components than those shown in the figure, combine some components, or have a different arrangement of components.

In one embodiment, the network may include, but is not limited to: wired networks, wireless networks. The wired network may include, but is not limited to: wide area networks, metropolitan area networks, and local area networks. The above is merely an example, and this is not limited in this embodiment.

Optionally, in this embodiment, as an optional implementation manner, a video decoding method performed by the terminal 102 or the server 104 is provided. As shown in fig. 2, the video decoding method includes:

Step S202, adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set;

Step S204, determining, from the pixel region to be matched, a target pixel region matching a block to be decoded of the current video frame according to a target MV corresponding to the block to be decoded, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as that of the block to be decoded;

Step S206, determining, according to the pixel values of the pixels in the target pixel region, the pixel values of the pixels in a target decoded block obtained by decoding the block to be decoded at the highest resolution.
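Steps S204 and S206 can be sketched as follows, assuming the pixel region to be matched from step S202 is already available as a rectangular 2D list at the highest resolution. The function name `predict_block`, the (dx, dy) tuple layout of the MV, and the direct copy of pixel values are illustrative assumptions; the method only states that the decoded pixel values are determined according to those of the target pixel region.

```python
def predict_block(region, block_x, block_y, width, height, mv):
    """Locate the target pixel region via the MV and return its pixel values.

    region: 2D list (rows) of decoded pixels at the highest resolution.
    (block_x, block_y): top-left corner of the block to be decoded.
    mv: (dx, dy), the displacement from the target region to the block, so
        the target region starts at (block_x - dx, block_y - dy).
    """
    dx, dy = mv
    src_x, src_y = block_x - dx, block_y - dy
    if (src_x < 0 or src_y < 0
            or src_y + height > len(region)
            or src_x + width > len(region[0])):
        raise ValueError("target pixel region lies outside the decoded area")
    # The target region has the same size as the block to be decoded.
    return [row[src_x:src_x + width] for row in region[src_y:src_y + height]]
```

For instance, with a 6 × 4 region, a 2 × 2 block at (4, 2), and an MV of (3, 1), the target region starts at (1, 1) and its four pixel values become the decoded block.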

It should be noted that the video decoding method shown in fig. 2 can be used in, but is not limited to, the video decoder shown in fig. 1. The decoding process of the current video frame is completed through the interactive cooperation of the video decoder and other components.

Optionally, in this embodiment, the video decoding method may be applied, but is not limited to being applied, to various scenes involving video encoding and decoding technologies, for example, a video playing application, a video sharing application, a video session application, short video transmission in an instant messaging application, and video chat in an instant messaging application. The video transmitted in these application scenes may include, but is not limited to, long videos and short videos. For example, a long video may be an episode with a longer playing time (for example, longer than 10 minutes) or the picture shown in a long video session, and a short video may be a voice message exchanged between two or more parties, or a video with a shorter playing time (for example, less than or equal to 30 seconds) shown on a sharing platform. The above is merely an example, and the video decoding method provided in the present embodiment can be applied to, but is not limited to, a playing device for playing video in the above application scenes.

In this embodiment, for a current block to be decoded in a current video frame (the current video frame to be decoded, some blocks in the current video frame may have been decoded, and some blocks are to be decoded) in a video to be decoded, a plurality of decoded blocks are reconstructed and adjusted to have the same resolution (the highest resolution), a pixel region to be matched is obtained, a target pixel region matched with the block to be decoded is determined according to a target MV corresponding to the block to be decoded, and then a pixel value of a pixel point in a target decoded block obtained by decoding the block to be decoded is determined according to a pixel value of the target pixel region.

The video decoding method in the present embodiment is explained below with reference to fig. 2.

In step S202, the resolution of a reconstructed block of a decoded block of a current video frame is adjusted from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, where the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set.

A receiving device of the video (e.g., server 104) may receive, over a network, the video code stream transmitted by a transmitting device of the video (e.g., terminal 102). The video code stream contains data obtained by video coding one or more video frames (video images) of the video.

After receiving the video code stream, the receiving device may decode each video frame in the video to obtain a decoded video. Video frames that depend on one another may be decoded in the order of their dependency. Video frames that do not depend on one another may be decoded in their order in the video or in a random order, and multiple such video frames may also be decoded in parallel. The specific decoding manner may be agreed in advance or determined according to indication information in the video code stream, which is not specifically limited in this embodiment.

For each image block in the current video frame to be decoded, when the intra block copy coding mode is adopted, different resolutions may be used for coding. The adopted resolution may be indicated by indication information corresponding to the current video frame (or the block to be decoded) in the video code stream, may be determined according to reference information of other image blocks or other video frames, or may be determined according to both. The manner in which the resolution of each image block is determined is not specifically limited in this embodiment.

When each block to be decoded is decoded, in the case that the video code stream contains data to be decoded corresponding to the current block to be decoded, the data to be decoded can be decoded at the resolution corresponding to the current block to be decoded to obtain the current decoded block corresponding to the current block to be decoded. In the case that the video code stream does not contain data to be decoded corresponding to the current block to be decoded, the current decoded block may be determined according to a target pixel region in the decoded pixel region that matches the current block to be decoded (for example, the pixel region most similar to the current block to be decoded).

Optionally, in this embodiment, the target pixel region may span multiple decoded blocks, whose reconstructed blocks are adjusted to the highest resolution. For example, the resolution of the reconstructed block of a decoded block of the current video frame may be adjusted from the initial resolution to the highest resolution to obtain the pixel region to be matched, whose resolution is the highest resolution. The initial resolution is the resolution adopted by the decoded block during decoding, and the initial resolutions of different decoded blocks may be the same or different. The highest resolution is the highest resolution in the preset resolution set.

It should be noted that the highest resolution may be a preset resolution. It may be the original resolution of the video frame, that is, the resolution on the encoding side before resolution adjustment and video encoding are performed, or it may be the maximum of the resolutions used by all image blocks in the current video frame. The highest resolution may be indicated by the indication information, or may be determined according to the resolutions adopted by all image blocks.

Optionally, instead of the highest resolution, the resolution of the reconstructed block of the decoded block of the current video frame may be adjusted from the initial resolution to a preset resolution, where the preset resolution is greater than the resolution adopted by the current block to be decoded during decoding (in the case where that resolution is less than the highest resolution), or equal to it (in the case where that resolution is equal to the highest resolution). The preset resolution may be indicated by the indication information or determined according to configuration information. For example, the configuration information may specify that the preset resolution is the resolution in the preset resolution set that is N levels higher than the resolution adopted by the current block to be decoded during decoding, and that, if the set contains no resolution N levels higher, the preset resolution is the highest resolution, where N is a positive integer greater than or equal to 1.
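The "N levels higher" configuration rule can be sketched as a clamped index lookup in the sorted resolution set. The function name `preset_resolution` and the representation of a resolution as a single comparable value (for example a frame height) are assumptions made for this illustration.

```python
def preset_resolution(resolution_set, current, n):
    """Return the resolution n levels above `current`, capped at the highest.

    resolution_set: iterable of comparable resolution values (hypothetical
                    representation, e.g. frame heights).
    current: resolution adopted by the block to be decoded; must be in the set.
    n: positive integer number of levels to step up.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    levels = sorted(set(resolution_set))
    idx = levels.index(current)  # raises ValueError if current is not in the set
    # Fall back to the highest resolution when fewer than n higher levels exist.
    return levels[min(idx + n, len(levels) - 1)]
```

When the current resolution is already the highest in the set, the function returns that same resolution, which corresponds to the "equal to" case described above.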

The reconstructed block of the decoded block is a copy of the reconstructed decoded block, the size of the decoded block is the same as that of the reconstructed block, the resolution is the same, and the pixel values of the pixels at the same position are the same.

The size relationship of the initial resolution to the highest resolution may include at least one of: the initial resolution is equal to the highest resolution, and the initial resolution is lower than the highest resolution.

As an optional implementation manner, adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution includes: in the case that the initial resolution is lower than the highest resolution, upsampling the reconstructed block from the initial resolution to the highest resolution to obtain the pixel region to be matched, whose resolution is the highest resolution.

A reconstructed block whose initial resolution is lower than the highest resolution may be upsampled from the initial resolution to the highest resolution to obtain the pixel region to be matched (or the part of the pixel region to be matched that corresponds to the reconstructed block).

As another optional implementation, in the case that the initial resolution is equal to the highest resolution, the reconstructed block may be used directly as the pixel region to be matched (or as the part of the pixel region to be matched that corresponds to the reconstructed block), without adjusting its resolution.

In this embodiment, the reconstructed block is either upsampled or left unchanged according to the relationship between the initial resolution and the highest resolution, so that every reconstructed block contributing to the pixel region to be matched is at the highest resolution, which improves the accuracy of determining the pixel region to be matched.
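The two cases above (upsample when lower, leave unchanged when equal) can be sketched as one function. The text does not name an interpolation filter, so nearest-neighbor replication is an assumption, as are the function name `to_highest_resolution`, the expression of a resolution as a block width in samples, and the restriction to integer scale factors.

```python
def to_highest_resolution(block, initial_res, highest_res):
    """Bring a reconstructed block to the highest resolution.

    block: 2D list of pixel values decoded at initial_res.
    initial_res, highest_res: block widths in samples (hypothetical
    representation); highest_res must be an integer multiple of initial_res.
    """
    if initial_res == highest_res:
        return [row[:] for row in block]  # equal: no adjustment needed
    if initial_res > highest_res or highest_res % initial_res:
        raise ValueError("highest_res must be an integer multiple of initial_res")
    factor = highest_res // initial_res
    out = []
    for row in block:
        # Nearest-neighbor: repeat each pixel `factor` times horizontally...
        wide = [p for p in row for _ in range(factor)]
        # ...and each widened row `factor` times vertically.
        out.extend(list(wide) for _ in range(factor))
    return out
```

A practical codec would more likely use an interpolation filter shared by encoder and decoder; replication is used here only to keep the sketch short.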

There may be one or more decoded blocks before the current block to be decoded is decoded. In the case that there is one decoded block, the resolution of the reconstructed block of that decoded block may be adjusted from the initial resolution to the highest resolution to obtain the pixel region to be matched, whose resolution is the highest resolution. In the case that there are multiple decoded blocks, the resolution of the reconstructed block of each decoded block may be adjusted from its initial resolution to the highest resolution to obtain the pixel region to be matched, whose resolution is the highest resolution.

As an optional scheme, adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution includes: in a case where the decoded block includes a plurality of blocks and the plurality of blocks include a target block whose initial resolution is lower than the highest resolution, upsampling the reconstructed block of the target block from the initial resolution to the highest resolution to obtain a target decoded block whose resolution is the highest resolution, wherein the pixel region to be matched includes the target decoded block.

For a target block (there may be one or more target blocks) whose initial resolution is lower than the highest resolution among the plurality of decoded blocks, the reconstructed block of the target block may be upsampled from the initial resolution to the highest resolution, resulting in a target decoded block (a first target decoded block) whose resolution is the highest resolution, and the pixel region to be matched includes the target decoded block.

As another optional implementation, in a case where the decoded block includes a plurality of blocks and the plurality of blocks include other blocks (there may be one or more other blocks) whose initial resolution is equal to the highest resolution, the reconstructed block of each of the other blocks may be used as a second target decoded block whose resolution is the highest resolution, without adjusting its resolution, and the pixel region to be matched includes the second target decoded block.

For example, as shown in fig. 3, the current video frame comprises 4 × 4 image blocks, of which 10 are decoded blocks: 4 decoded blocks of resolution R1, 3 decoded blocks of resolution R2, and 3 decoded blocks of resolution R3, wherein R1 > R3 > R2. The reconstructed blocks of the decoded blocks of resolutions R2 and R3 may be upsampled from R2 and R3, respectively, to R1, and taken together with the reconstructed blocks of the decoded blocks of resolution R1 as the pixel region to be matched.
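The assembly of a pixel region to be matched from blocks of mixed resolution can be sketched as follows. This is an illustrative sketch only: it assumes square blocks, models a block's resolution by the side length of its sample array, uses nearest-neighbour upsampling, and marks positions whose blocks are not yet decoded with `None`. The function name `build_region_to_match` and the dictionary layout are hypothetical.

```python
def upsample_nearest(block, factor):
    """Nearest-neighbour upsampling by an integer factor (assumed filter)."""
    return [[p for p in row for _ in range(factor)]
            for row in block for _ in range(factor)]


def build_region_to_match(decoded, grid_rows, grid_cols, highest_side):
    """Place every decoded block on a canvas at the highest resolution.

    decoded maps (block_row, block_col) -> 2-D sample list whose side
    length reflects the resolution the block was decoded at.
    Undecoded positions stay None.
    """
    height = grid_rows * highest_side
    width = grid_cols * highest_side
    region = [[None] * width for _ in range(height)]
    for (br, bc), samples in decoded.items():
        side = len(samples)
        if highest_side % side != 0:
            raise ValueError("unsupported resolution in this sketch")
        up = upsample_nearest(samples, highest_side // side)
        for y in range(highest_side):
            for x in range(highest_side):
                region[br * highest_side + y][bc * highest_side + x] = up[y][x]
    return region
```

With a 1 × 2 grid in which the left block was decoded at the highest resolution (2 × 2 samples) and the right block at half of it (1 × 1 sample), the right block is replicated to 2 × 2 before both are placed side by side.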

By the embodiment, when a plurality of decoded blocks are provided, the resolution of the reconstructed block of each decoded block is adjusted according to the relationship between the resolution corresponding to each decoded block and the highest resolution, so that the resolution of each reconstructed block can be adjusted to the highest resolution, and the accuracy of determining the pixel region to be matched is improved.

After the resolution of the reconstruction block of each decoded block is adjusted from the initial resolution to the highest resolution, the adjusted reconstruction block may be used as a pixel region to be matched, or edge filtering processing may be performed on the adjusted reconstruction block first, and the filtered reconstruction block is used as a pixel region to be matched.

As an optional solution, after adjusting the resolution of the reconstructed block of the decoded block of the current video frame from the initial resolution to the highest resolution, the method further comprises: in a case where the decoded block comprises a plurality of blocks, performing edge filtering on pixel points on the adjacent edges of adjacent pixel blocks in the pixel region to be matched, wherein the adjacent pixel blocks are the pixel blocks in the pixel region to be matched that correspond to adjacent blocks among the plurality of blocks.

The edge filtering may be performed on one or more rows of pixels on adjacent row edges of adjacent pixel blocks (row edge adjacent) or one or more columns of pixels on adjacent column edges. When performing edge filtering, for a pixel point to be filtered, the pixel point used for filtering may be one or more pixel points adjacent to the pixel point to be filtered (adjacent in rows, adjacent in columns, etc.), or may be a pixel point within a predetermined range with the pixel point to be filtered as a center, where the predetermined range may be a circle with a radius of a predetermined length, or a square, a rectangle, or the like with a side length of a predetermined length. The filtering method used for the edge filtering may be set as needed, which is not specifically limited in this embodiment.

For example, as shown in fig. 4, for the adjusted reconstructed blocks, two rows of pixels on the adjacent sides are selected for edge filtering (as shown by the dashed box in fig. 4), and the pixel value of each pixel to be filtered is adjusted according to the pixel values of the pixels corresponding to it (for example, the pixels used for the filtering), so as to obtain the filtered reconstructed blocks. The filtered pixel blocks can be used as the pixel region to be matched.
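As a hedged illustration of such edge filtering, the sketch below smooths the pixels on both sides of a vertical boundary between two horizontally adjacent pixel blocks. The 3-tap kernel (1, 2, 1)/4 is an assumption chosen for simplicity, since the embodiment leaves the filtering method open; the function name `filter_vertical_edge` and its parameters are hypothetical. A horizontal boundary can be handled in the same way with rows and columns exchanged.

```python
def filter_vertical_edge(region, boundary, width=2):
    """Smooth `width` columns on each side of a vertical block boundary.

    region   : 2-D list of pixel values (already at a uniform resolution)
    boundary : column index of the first column of the right-hand block
    width    : number of columns filtered on each side (2 mirrors fig. 4)

    Each filtered pixel becomes (left + 2 * centre + right) / 4, computed
    from the unfiltered values. The kernel is an assumption.
    """
    out = [list(row) for row in region]
    for y, row in enumerate(region):
        start = max(1, boundary - width)
        stop = min(len(row) - 1, boundary + width)
        for x in range(start, stop):
            out[y][x] = (row[x - 1] + 2 * row[x] + row[x + 1]) / 4
    return out
```

For a row of three pixels of value 10 followed by three pixels of value 20, the step of 10 across the boundary is reduced to a step of 5 between the two pixels nearest the boundary.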

It should be noted that, on the decoding side, the filtering process may be performed by:

determining at least one pair of decoding blocks to be reconstructed from a decoded video frame to be currently processed, wherein each pair of decoding blocks in the at least one pair of decoding blocks comprises a first decoding block with a first resolution and a second decoding block with a second resolution, and the first decoding block and the second decoding block are position-adjacent decoding blocks;

adjusting a first resolution of the first decoded block to a target resolution (e.g., a highest resolution) and adjusting a second resolution of the second decoded block to the target resolution;

determining a first edge pixel point set from the first decoding block and a second edge pixel point set from the second decoding block, wherein the position of the first edge pixel point set is adjacent to the position of the second edge pixel point set;

and filtering the first edge pixel point set to obtain a filtered first edge pixel point set, and filtering the second edge pixel point set to obtain a filtered second edge pixel point set, wherein a first difference value between a pixel value of an ith pixel point in the filtered first edge pixel point set and a pixel value of a jth pixel point corresponding to the ith pixel point in the filtered second edge pixel point set is smaller than a second difference value between a pixel value of the ith pixel point in the first edge pixel point set and a pixel value of a jth pixel point in the second edge pixel point set, i is a positive integer and is less than or equal to the total number of pixel points in the first edge pixel point set, j is a positive integer and is less than or equal to the total number of pixel points in the second edge pixel point set.

Wherein adjusting to the target resolution comprises:

1) adjusting the second resolution to the first resolution in a case where the target resolution is equal to the first resolution;

2) adjusting the first resolution to the second resolution in a case where the target resolution is equal to the second resolution;

3) in the case where the target resolution is equal to the third resolution, the first resolution is adjusted to the third resolution, and the second resolution is adjusted to the third resolution, wherein the third resolution is different from the first resolution and different from the second resolution.
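The three cases above can be sketched with a single resampling helper applied to both blocks of a pair. The sketch assumes square blocks whose side lengths are integer multiples of one another, nearest-neighbour replication for upsampling, and block averaging for downsampling; none of these choices is prescribed by the embodiment, and `resample` and `adjust_pair` are hypothetical names.

```python
def resample(block, target_side):
    """Resample a square 2-D block to target_side x target_side samples."""
    side = len(block)
    if target_side == side:
        return [list(row) for row in block]
    if target_side > side:
        if target_side % side != 0:
            raise ValueError("non-integer upsampling factor")
        f = target_side // side
        # Upsampling: replicate each sample f x f times (assumed filter).
        return [[p for p in row for _ in range(f)]
                for row in block for _ in range(f)]
    if side % target_side != 0:
        raise ValueError("non-integer downsampling factor")
    f = side // target_side
    # Downsampling: average each f x f patch (assumed filter).
    return [[sum(block[y * f + dy][x * f + dx]
                 for dy in range(f) for dx in range(f)) / (f * f)
             for x in range(target_side)]
            for y in range(target_side)]


def adjust_pair(first, second, target_side):
    """Bring both blocks of an adjacent pair to the target resolution."""
    return resample(first, target_side), resample(second, target_side)
```

When the target resolution equals that of the first block, only the second block changes (case 1); when it equals that of the second block, only the first block changes (case 2); otherwise both are resampled (case 3).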

By adjusting the resolution of the decoding blocks and performing edge filtering processing on the edge pixel point set determined in the decoding blocks, obvious seams can be avoided in the video in the reconstruction process, so that the content in the video can be accurately restored, and the technical problem of video distortion caused by inconsistent resolution is solved.

Through the embodiment, the influence of resolution adjustment on the pixel points can be reduced by performing edge filtering on the adjusted reconstruction block, and the effectiveness of the pixel points in the pixel region to be matched is ensured.

In step S204, a target pixel region matched with the block to be decoded is determined from the pixel regions to be matched according to a target MV corresponding to the block to be decoded of the current video frame, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as that of the block to be decoded.

After the pixel region to be matched is obtained, a target pixel region can be determined from the pixel region to be matched according to the motion vector (target MV) from the target pixel region, which corresponds to the block to be decoded, to the block to be decoded. The target pixel region is used for determining a target decoded block corresponding to the block to be decoded, the target decoded block is the decoded block obtained by decoding the block to be decoded at the highest resolution, and the target pixel region can span a plurality of adjusted reconstructed blocks.

The target MV may be carried in indication information corresponding to a video to be decoded, a current video frame, or a block to be decoded. The indication information may carry a plurality of parameters, including at least a motion vector parameter for indicating the target MV. The target MV can be determined by the parameter values of the motion vector parameters.

It should be noted that the target MV is used to determine the relative position relationship between the target pixel region and the block to be decoded, and may be a motion vector from the target pixel region to the block to be decoded, or a motion vector from the block to be decoded to the target pixel region.

In addition to the target MV, determining the target pixel region needs to be based on a reference position for locating the target pixel region, which is a relative position of a reference point position in the block to be decoded. The reference point position in the block to be decoded may be a position of a certain pixel point in the block to be decoded (a pixel point at the top left corner, a pixel point at the top right corner, a pixel point at the bottom left corner, and a pixel point at the bottom right corner), or may be other positions besides the position of the pixel point, such as a center point. The position of the reference point may be set as needed, which is not particularly limited in this embodiment.

As an optional solution, determining, according to the target MV corresponding to the block to be decoded, a target pixel region matching the block to be decoded from the pixel region to be matched may include: determining a position of a second pixel point matched with the position of a first pixel point in a block to be decoded in a pixel area to be matched, wherein a motion vector from the position of the second pixel point to the position of the first pixel point is a target MV; and determining a target pixel area in the pixel area to be matched according to the position of a second pixel point, wherein the relative position of the second pixel point in the target pixel area is the same as the relative position of the first pixel point in the block to be decoded.

After the target MV is obtained, the position of a reference point in the block to be decoded can also be determined. The reference point location in the block to be decoded may be the first pixel point location in the block to be decoded. The first pixel point position may be a position of a first pixel point in the block to be decoded. The first pixel point may be a specific pixel point in the block to be decoded, and the specific pixel point may be any pixel point, for example, a pixel point at the upper left corner, a pixel point at the upper right corner, a pixel point at the lower left corner, and a pixel point at the lower right corner.

It should be noted that the position of the reference point in the block to be decoded (the relative position of the first pixel point position in the block to be decoded) and the position of the reference point in the target pixel region (the relative position of the second pixel point position in the target pixel region) are the same. The reference point position may be determined according to convention, or may be determined by a parameter value of a reference point position parameter among a plurality of parameters in the indication information. The reference point position may be defined as needed, which is not specifically limited in this embodiment.

The motion vector from the second pixel point position to the first pixel point position is a target MV, or the motion vector from the first pixel point position to the second pixel point position is a target MV. According to the reference point position in the block to be decoded, the reference point position (second pixel point position) in the target pixel region can be obtained according to the target MV. Since the size of the block to be decoded is constant, the target pixel region can be located from the pixel region to be matched according to the position of the reference point in the target pixel region.

For example, as shown in fig. 5, according to the target MV and the position of the first pixel point, the position of the second pixel point (e.g., two end points of the target MV in fig. 5) may be determined, and then the target pixel area may be determined according to the size information of the block to be decoded (or other information that may determine the size of the block to be decoded).
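The locating step can be sketched as below, assuming that the reference point is the top-left pixel of the block, that all coordinates are (row, column) positions at the highest resolution, and that the target MV points from the target pixel region to the block to be decoded, so that the second pixel position equals the first pixel position minus the MV. The function name `locate_target_region` is hypothetical.

```python
def locate_target_region(region, first_pos, mv, block_h, block_w):
    """Cut the target pixel region out of the pixel region to be matched.

    region    : 2-D list of pixels at the highest resolution
    first_pos : (row, col) of the top-left pixel of the block to be decoded
    mv        : (d_row, d_col), motion vector from the target pixel region
                to the block to be decoded (direction is an assumption;
                the opposite convention only flips the sign)
    """
    r = first_pos[0] - mv[0]  # second pixel position, row
    c = first_pos[1] - mv[1]  # second pixel position, column
    if r < 0 or c < 0 or r + block_h > len(region) or c + block_w > len(region[0]):
        raise ValueError("target pixel region falls outside the region to be matched")
    return [row[c:c + block_w] for row in region[r:r + block_h]]
```

Because the target pixel region has the same size as the block to be decoded, one reference point and the block size are sufficient to fix the whole region.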

By the embodiment, the target pixel area is determined according to the target MV corresponding to the block to be decoded and the reference point position (the first pixel point position) in the block to be decoded, so that the accuracy of determining the position of the target pixel point can be ensured, and the decoding quality is improved.

In step S206, the pixel values of the pixels in the target decoding block obtained by decoding the block to be decoded at the highest resolution are determined according to the pixel values of the pixels in the target pixel region.

After the target pixel region is determined, the pixel values of the pixels in the target decoding block corresponding to the block to be decoded can be determined according to the pixel values of the pixels in the target pixel region, wherein the target decoding block is a decoding block obtained by decoding the block to be decoded according to the highest resolution.

There may be various ways of determining the pixel values of the pixels in the target decoded block corresponding to the block to be decoded. For example, the target pixel region may be used directly as the target decoded block, that is, the pixel value of each pixel in the target pixel region is taken as the pixel value of the corresponding pixel in the target decoded block.

As an optional scheme, determining, according to the pixel values of the pixels in the target pixel region, the pixel values of the pixels in the target decoded block obtained by decoding the block to be decoded at the highest resolution includes: converting the pixel values of the pixels in the target pixel region according to a target conversion parameter to obtain the pixel values of the pixels in the target decoded block.

Besides the target MV, the parameters corresponding to the block to be decoded in the video code stream may further include a target conversion parameter, which may be a conversion parameter for converting the pixel values of the pixels in the target pixel region into the pixel values of the pixels in the target decoded block. The conversion parameter may be the difference between the pixel value of a pixel in the target pixel region and the pixel value of the corresponding pixel in the target decoded block.

Since the target pixel region is a pixel region that matches the block to be decoded (for example, a pixel region that is the closest to the block to be decoded in the pixel region to be matched), the amount of data of the conversion parameter required to determine the target decoding block is small, so that an accurate target decoding block can be obtained at a low cost.
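A minimal sketch of the conversion follows, assuming that the conversion parameter is a per-pixel difference defined as (target decoded pixel minus target-region pixel) and that pixel values are clipped to an 8-bit range. Both the sign convention and the clipping are assumptions made for illustration, and `apply_conversion` is a hypothetical name.

```python
def apply_conversion(target_region, differences, max_value=255):
    """Add per-pixel differences to the target pixel region.

    differences[y][x] is assumed to be
    (pixel of target decoded block) - (pixel of target pixel region).
    Results are clipped to [0, max_value] (8-bit range assumed).
    """
    return [[min(max(p + d, 0), max_value)
             for p, d in zip(pixel_row, diff_row)]
            for pixel_row, diff_row in zip(target_region, differences)]
```

When every difference is zero, the result equals the target pixel region, which corresponds to using the target pixel region directly as the target decoded block.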

According to the embodiment, the pixel values of the pixels in the target pixel region are converted according to the target conversion parameters to obtain the pixel values of the pixels in the target decoding block, so that the accuracy of determining the target decoding block can be ensured, and the decoding quality is improved.

It should be noted that, for each block of a current video frame, for a target decoding block obtained according to a motion vector, the highest resolution may be used as the resolution adopted by the block to be decoded when decoding, or the target decoding block may be downsampled to a proper resolution (the target resolution, which is indicated by the indication information); as for a decoded block to be obtained by decoding a data code stream, the resolution adopted at the time of encoding can be taken as the resolution adopted at the time of decoding.

According to another aspect of the embodiments of the present invention, there is provided a video encoding method, as shown in fig. 6, the method including:

step S602, adjusting the resolution of a reconstruction block of a coded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched, wherein the resolution is the highest resolution, the initial resolution is a resolution adopted by the coded block during coding, and the highest resolution is the highest resolution in a preset resolution set;

step S604, searching a target pixel area matched with a block to be coded of the current video frame in the pixel area to be matched, wherein the size of the target pixel area is the same as that of the block to be coded;

step S606, obtaining the MV from the target pixel area to the block to be coded, and obtaining the target MV corresponding to the block to be coded.

It should be noted that the video encoding method shown in fig. 6 can be used in, but is not limited to, the video encoder shown in fig. 1. The encoding process of the current video frame is completed through the interactive cooperation of the video encoder and other components.

Optionally, in this embodiment, the video encoding method may be applied to, but is not limited to, an application scenario such as a video playing application, a video sharing application, or a video session application. The video transmitted in such an application scenario may include, but is not limited to, long videos and short videos. For example, a long video may be an episode with a longer play time (for example, a play time longer than 10 minutes) or the picture shown in a long video session, and a short video may be a voice message exchanged between two or more parties, or a video with a shorter play time (for example, a play time less than or equal to 30 seconds) shown on a sharing platform. The above is merely an example; the video encoding method provided in this embodiment may be applied to, but is not limited to, a sending device that sends video in the above application scenarios.

In the video encoding process, in order to ensure real-time performance and flexibility of video frame encoding, a video frame is divided into a plurality of blocks to be encoded respectively. Furthermore, before encoding the current block, it is necessary to perform encoding pre-analysis on the current block by referring to the encoded block to determine the resolution of the current block. In the related art video encoding process, different blocks in the same video frame are usually encoded with uniform resolution for processing convenience.

As shown in fig. 7, if high resolution is used for encoding the different blocks in a frame of a video, then when the transmission bandwidth is relatively small (for example, smaller than the bandwidth threshold Th shown in fig. 7), the peak signal-to-noise ratio PSNR1 corresponding to high-resolution encoding of the blocks is lower than the peak signal-to-noise ratio PSNR2 corresponding to low-resolution encoding of the same blocks; that is, at a small transmission bandwidth, PSNR1 of high-resolution encoding is relatively small and the distortion is relatively large.

Similarly, if low resolution is used for encoding the different blocks in a frame of the video, then when the transmission bandwidth is relatively large (for example, larger than the bandwidth threshold Th shown in fig. 7), the peak signal-to-noise ratio PSNR3 corresponding to low-resolution encoding of the blocks is lower than the peak signal-to-noise ratio PSNR4 corresponding to high-resolution encoding of the same blocks; that is, at a large transmission bandwidth, PSNR3 of low-resolution encoding is relatively small and the distortion is relatively large.

Even if different resolutions are adopted for different blocks, there is no effective way to dynamically mark the dynamic relative relationship between each block and its reference block, and therefore, the video coding method in the related art has a problem of low coding efficiency.

In this embodiment, for a current block to be encoded in a current video frame of a video to be encoded, the reconstructed blocks of a plurality of encoded blocks are adjusted to the same resolution (the highest resolution) to obtain a pixel region to be matched, a target pixel region matched with the block to be encoded is searched for in the pixel region to be matched, and a target MV from the target pixel region to the block to be encoded is then determined.

The video encoding method in the present embodiment is explained below with reference to fig. 6.

Step S602, adjusting the resolution of a reconstructed block of a coded block of a current video frame from an initial resolution to a highest resolution, to obtain a pixel region to be matched with the highest resolution, where the initial resolution is a resolution adopted by the coded block during coding, and the highest resolution is a highest resolution in a preset resolution set.

A sending device of the video (e.g., the terminal 102) may encode the video to be sent, so as to obtain an encoded video code stream. Video frames that are related to one another in the video to be encoded may be encoded in the order of their relation. Video frames that are not related may be encoded in their order in the video, in a random order, or in parallel. The specific encoding manner may be agreed in advance or determined according to the resource status of the terminal, which is not specifically limited in this embodiment.

For each image block in the current video frame to be encoded, when an intra block copy encoding mode is adopted, different resolutions may be used for encoding. The resolution adopted by the block to be encoded may be determined according to the pixel values of the pixels in the block to be encoded, according to reference information of other image blocks or other video frames, or according to both. The manner of determining the resolution of each image block is not specifically limited in this embodiment. The resolution adopted by each image block can be indicated through indication information corresponding to a current video frame (or a block to be decoded) in a video code stream.

When each block to be coded is coded, the resolution of a reconstructed block of a coded block of a current video frame may be first adjusted from an initial resolution to a highest resolution, so as to obtain a pixel region to be matched with the highest resolution. The initial resolution is a resolution adopted by the encoded block during encoding, and the corresponding initial resolutions of different encoded blocks may be the same or different. The highest resolution is the highest resolution in the preset resolution set.

On the encoding side, the resolution adjustment method of the reconstructed block of the encoded block is similar to that on the decoding side, and is not described herein again.

The reconstructed block of the encoded block is a duplicate of the reconstructed encoded block, the size of the encoded block is the same as that of the reconstructed block, the resolution is the same, and the pixel values of the pixels at the same position are the same.

As an optional implementation, in a case where the initial resolution is lower than the highest resolution, adjusting the resolution of a reconstructed block of an encoded block of a current video frame from the initial resolution to the highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution includes: upsampling the reconstructed block from the initial resolution to the highest resolution to obtain the pixel region to be matched whose resolution is the highest resolution.

By the embodiment, the resolution of the reconstructed block is up-sampled according to the relation between the initial resolution and the highest resolution to obtain the pixel region to be matched, so that the resolution of the reconstructed block can be adjusted to the highest resolution, and the accuracy of determining the pixel region to be matched is improved.

There may be one or more encoded blocks before encoding the current block to be encoded. When there is one encoded block, the resolution of the reconstructed block of the encoded block may be adjusted from the initial resolution to the highest resolution, so as to obtain the pixel region to be matched with the highest resolution. When a plurality of coded blocks are available, the resolution of a plurality of reconstructed blocks of the plurality of coded blocks can be adjusted from the initial resolution to the highest resolution, so as to obtain a pixel region to be matched with the highest resolution.

As an optional implementation, in a case where the encoded block includes a plurality of blocks and the plurality of blocks include a target block whose initial resolution is lower than the highest resolution, adjusting the resolution of a reconstructed block of the encoded block of the current video frame from the initial resolution to the highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution includes: upsampling the reconstructed block of the target block from the initial resolution to the highest resolution to obtain a target encoded block whose resolution is the highest resolution, wherein the pixel region to be matched includes the target encoded block.

For a target block (there may be one or more target blocks) whose initial resolution is lower than the highest resolution among the plurality of encoded blocks, the reconstructed block of the target block may be upsampled from the initial resolution to the highest resolution, resulting in a target encoded block (a first target encoded block) whose resolution is the highest resolution, and the pixel region to be matched includes the target encoded block.

As another optional implementation, in a case where the encoded block includes a plurality of blocks and the plurality of blocks include other blocks (there may be one or more other blocks) whose initial resolution is equal to the highest resolution, the reconstructed block of each of the other blocks may be used as a second target encoded block whose resolution is the highest resolution, without adjusting its resolution, and the pixel region to be matched includes the second target encoded block.

According to the embodiment, when a plurality of coded blocks are provided, the resolution of the reconstruction block of each coded block is adjusted according to the relation between the resolution corresponding to each coded block and the highest resolution, so that the resolution of each reconstruction block can be adjusted to the highest resolution, and the accuracy of determining the pixel region to be matched is improved.

In addition to adjusting the resolution of the reconstructed block of each encoded block from the initial resolution to the highest resolution, for the current block to be encoded, if the resolution of the current block to be encoded is not the highest resolution, the current block to be encoded may be upsampled, so as to adjust the resolution of the current block to be encoded to the highest resolution.

After the resolution of the reconstructed block of each encoded block is adjusted from the initial resolution to the highest resolution, the adjusted reconstructed block may be used as a pixel region to be matched, or edge filtering may be performed on the adjusted reconstructed block first, and the filtered reconstructed block is used as a pixel region to be matched.

As an optional scheme, after the resolution of the reconstructed block of the encoded block of the current video frame is adjusted from the initial resolution to the highest resolution, edge filtering may be performed on pixel points on adjacent edges of adjacent pixel blocks in the pixel region to be matched under the condition that the encoded block includes a plurality of blocks, where the adjacent pixel blocks are pixel blocks in the pixel region to be matched, and the adjacent pixel blocks correspond to adjacent blocks in the plurality of blocks.

It should be noted that the filtering process performed on the encoding side is substantially similar to the filtering operation performed on the decoding side, and is not described herein again.

In step S604, a target pixel region matched with a block to be encoded of the current video frame is searched for in the pixel region to be matched, where the size of the target pixel region is the same as the size of the block to be encoded.

After the pixel region to be matched is obtained, a target pixel region matched with the block to be encoded may be searched for in it. The target pixel region may be found as follows: on the premise that each block may be encoded at a different resolution, the resolutions of the reconstructed blocks of the encoded blocks are unified, and the encoded blocks (the pixel region to be matched) are enumerated to find the target pixel region (for example, the most similar region) matched with the block to be encoded.

For example, on the premise that a different encoding resolution may be selected for each block, the encoded blocks are enumerated to find the most similar region, which requires calculating the distance between each pixel in the region and the current block. The similar region is obtained from reconstructed blocks and needs to be brought to a uniform resolution, for example by interpolating each block to the highest resolution.

As an optional scheme, searching for a target pixel region matched with a block to be encoded in a pixel region to be matched includes: using a preset step length, sequentially obtaining a plurality of candidate pixel regions with the same size as the block to be encoded from the pixel region to be matched, in row-first-then-column or column-first-then-row order; and determining a target pixel region from the candidate pixel regions, wherein the target pixel region is the candidate pixel region that has the highest similarity with the block to be encoded and whose similarity with the block to be encoded is higher than a similarity threshold.

The pixel region to be matched may be enumerated to find the target pixel region of the block to be encoded as follows: using a preset step length, a plurality of candidate pixel regions with the same size as the block to be encoded are sequentially obtained from the pixel region to be matched, in row-first-then-column or column-first-then-row order, and the candidate pixel region that has the highest similarity with the block to be encoded and whose similarity with the block to be encoded is higher than a similarity threshold is taken as the target pixel region.

It should be noted that the similarity between the candidate pixel region and the block to be coded may be: the ratio of the number of the pixel points with the same pixel value at the same position to the total number of the pixel points contained in the candidate pixel area. Other types of similarity are also possible. The method for determining the similarity between the candidate pixel region and the block to be coded may be set as needed, which is not specifically limited in this embodiment.
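
The ratio-based similarity described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `similarity` and the list-of-rows block representation are hypothetical, and other similarity measures may equally be used.

```python
def similarity(candidate, block):
    """Fraction of co-located pixel points whose pixel values are identical."""
    same_shape = len(candidate) == len(block) and all(
        len(c_row) == len(b_row) for c_row, b_row in zip(candidate, block)
    )
    if not same_shape:
        raise ValueError("candidate region and block must have the same size")
    total = sum(len(row) for row in block)
    if total == 0:
        raise ValueError("block must contain at least one pixel point")
    # Count positions where the candidate and the block agree exactly.
    same = sum(
        c == b
        for c_row, b_row in zip(candidate, block)
        for c, b in zip(c_row, b_row)
    )
    return same / total


score = similarity([[1, 2], [3, 4]], [[1, 2], [3, 5]])
```

Here three of the four co-located pixel points agree, so the similarity is 0.75.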

For example, as shown in fig. 8, the candidate pixel regions are determined as follows. Taking the pixel point at the top left corner of the pixel region to be matched as a starting point, a first candidate pixel region with the same size as the block to be encoded is obtained (its top-left pixel point is A(0, 0)); the window is then translated rightwards by one pixel point to obtain a second candidate pixel region (its top-left pixel point is A(0, 1)); candidate pixel regions are obtained in turn by translating one pixel point to the right until the right side of the pixel region to be matched is reached; the window is then translated downwards by one row and returns to the left side to continue obtaining candidate pixel regions (the top-left pixel point is A(1, 0)); candidate pixel regions are obtained sequentially in this row-first, column-second manner until no candidate pixel region meeting the size condition remains.

With the present embodiment, a plurality of candidate pixel regions are sequentially obtained in row-first-then-column or column-first-then-row order using a preset step length, and the target pixel region is determined from the candidate pixel regions, so that no candidate pixel region is missed and the accuracy of determining the target pixel region is ensured.
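
The row-first enumeration described above can be sketched in Python as follows. The function name `find_target_region`, the default step length of 1, the default threshold of 0.5, and the use of the exact-match ratio as the similarity measure are assumptions made for illustration only.

```python
def find_target_region(region, block, step=1, threshold=0.5):
    """Scan `region` row by row and return ((row, col), similarity) of the
    best candidate, or (None, None) if no candidate exceeds `threshold`.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    bh, bw = len(block), len(block[0])
    rh, rw = len(region), len(region[0])
    best_pos, best_sim = None, threshold
    # Row-first order: sweep left to right, then move down by `step` rows.
    for top in range(0, rh - bh + 1, step):
        for left in range(0, rw - bw + 1, step):
            same = sum(
                region[top + i][left + j] == block[i][j]
                for i in range(bh)
                for j in range(bw)
            )
            sim = same / (bh * bw)
            # Strictly greater: the winner must beat the threshold and every
            # earlier candidate.
            if sim > best_sim:
                best_pos, best_sim = (top, left), sim
    if best_pos is None:
        return None, None
    return best_pos, best_sim


region = [[0, 0, 0, 0],
          [0, 1, 2, 0],
          [0, 3, 4, 0]]
block = [[1, 2],
         [3, 4]]
found = find_target_region(region, block)
```

With a step length of 1 every window position is visited, which is what guarantees that no candidate is missed; a larger step length trades completeness for fewer comparisons.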

In step S606, the MV from the target pixel region to the block to be encoded is obtained as the target MV corresponding to the block to be encoded.

After the target pixel area is determined, a motion vector from a reference point position in the target pixel area to a reference point position in the block to be encoded can be used as a target MV corresponding to the block to be encoded.

Optionally, in addition to the target MV, a difference between pixel values of a pixel point in the target pixel region and a pixel point in the block to be encoded may be determined, and the difference is written into the encoded video stream as the target conversion parameter (for example, written into a specific position in the video stream as the indication information).
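
A minimal sketch of these two quantities is given below. It assumes that the top-left pixel point serves as the reference point of both the target pixel region and the block to be encoded, that positions are (row, column) pairs at the highest resolution, and that the difference is taken as block value minus region value; the names `compute_target_mv` and `compute_residual` are hypothetical.

```python
def compute_target_mv(region_ref, block_ref):
    """Motion vector pointing from the region's reference point to the block's."""
    return (block_ref[0] - region_ref[0], block_ref[1] - region_ref[1])


def compute_residual(target_region, block):
    """Per-pixel difference (block minus region), usable as a conversion parameter.

    Assumption: the sign convention is illustrative; the embodiment only
    states that a difference of pixel values is determined.
    """
    return [
        [b - t for t, b in zip(t_row, b_row)]
        for t_row, b_row in zip(target_region, block)
    ]


mv = compute_target_mv(region_ref=(1, 1), block_ref=(4, 6))
residual = compute_residual([[10, 20]], [[12, 19]])
```

In this example the target MV is (3, 5) and the difference is [[2, -1]]; when the region matches the block exactly, the difference is all zeros.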

It should be noted that, if the target pixel region matched with the block to be coded cannot be found, the block to be coded may be directly coded according to the target resolution (the resolution adopted when the current block to be coded initially determines coding), so as to obtain a coding block (coded data) corresponding to the block to be coded, and write the obtained coded data into the video code stream.

The description will be made in detail with reference to steps S902 to S930 in the example shown in fig. 9. In this example, in a scenario of block-level coding with different resolutions, when an intra block copy coding mode is adopted, the pixel region most similar to the current block to be encoded is searched for in the already-encoded pixel region of the frame, and the MV from that pixel region to the current block is calculated. The most similar pixel region may span multiple blocks; the reconstructed blocks of these blocks are adjusted to the same resolution, and the pixel region corresponding to the current block is found across them. The manner of adjusting the resolution employed in this example is: for each coded block whose resolution is not the predetermined highest resolution, the resolution is adjusted to the highest resolution by up-sampling interpolation.

At the encoding end, steps S902 to S916 are performed: obtaining a current video frame and making a resolution decision for the block to be encoded; determining the resolution adopted when the block to be encoded is encoded as the target resolution; adjusting the resolution of the block to be encoded (or of the reconstructed block of the current block to be encoded) to the highest resolution, and then adjusting the reconstructed blocks of the plurality of encoded blocks to the same resolution (the highest resolution) (step S906-2 to step S906-4) to obtain the pixel region to be matched; judging whether a pixel region most similar to the current block to be encoded is found in the pixel region to be matched; if yes, determining the target MV and the target conversion parameter corresponding to the block to be encoded; if not, encoding the block to be encoded at the target resolution, adding a resolution identifier of the target resolution to the video code stream, and outputting the video code stream.

At the decoding end, steps S918 to S930 are performed: acquiring a video code stream; performing resolution decision on a block to be decoded, and determining a target resolution corresponding to the block to be decoded; determining whether a target MV is acquired from a video code stream; if the target MV is acquired, adjusting reconstruction blocks of a plurality of decoded blocks to the same resolution (highest resolution) (step S924-2 to step S924-4) to obtain a pixel region to be matched; and determining a target pixel area matched with the block to be decoded according to the target MV, and determining pixel values of pixel points in the target decoding block obtained by decoding the block to be decoded according to the highest resolution according to the pixel values of the pixel points in the target pixel area. And if the target MV is not acquired, decoding the block to be decoded by adopting the target resolution.
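
The decoder-side use of the target MV can be sketched as follows. The sketch assumes that the top-left pixel point is the reference point, that the target MV points from the target pixel region to the block to be decoded, and that the optional conversion parameter is a per-pixel difference to be added; `decode_block` and its parameters are hypothetical names.

```python
def decode_block(region, block_pos, mv, size, residual=None):
    """Rebuild a block from the pixel region to be matched.

    region    -- already-decoded pixels at the highest resolution (list of rows)
    block_pos -- (row, col) of the block's top-left pixel point
    mv        -- target MV, from the target pixel region to the block
    size      -- (height, width) of the block to be decoded
    residual  -- optional per-pixel difference (assumed conversion parameter)
    """
    h, w = size
    # Walk the MV backwards to reach the target pixel region.
    top, left = block_pos[0] - mv[0], block_pos[1] - mv[1]
    if top < 0 or left < 0 or top + h > len(region) or left + w > len(region[0]):
        raise ValueError("target pixel region lies outside the decoded area")
    predicted = [region[top + i][left:left + w] for i in range(h)]
    if residual is None:
        return predicted
    return [
        [p + r for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(predicted, residual)
    ]


decoded_area = [[1, 2, 0, 0, 0, 0],
                [3, 4, 0, 0, 0, 0]]
copied = decode_block(decoded_area, block_pos=(0, 4), mv=(0, 4), size=(2, 2))
adjusted = decode_block(decoded_area, (0, 4), (0, 4), (2, 2),
                        residual=[[1, 0], [0, -1]])
```

Without a conversion parameter the block is a direct copy of the target pixel region; with one, each copied pixel value is offset by the signalled difference.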

Each block to be decoded of each current video frame is decoded to obtain the decoded video frame corresponding to each current video frame in the video code stream, and finally the decoded video is obtained.

The above is only an example. The video encoding method and the video decoding method provided in this embodiment apply to the decision process and the use process of the target MV shown in fig. 1. Where the encoding end and the decoding end use different resolutions for different blocks to be encoded or blocks to be decoded, the resolution of the reconstructed block of each coded block is adjusted to the highest resolution by up-sampling interpolation before each block is coded at its own resolution, so that motion information between each block and its reference block can be determined and encoding and decoding can be performed directly using this relative relationship, thereby improving encoding and decoding efficiency.

It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.

According to another aspect of the embodiment of the present invention, there is also provided a video decoding apparatus. As shown in fig. 10, the apparatus includes:

(1) a first adjusting unit 1002, configured to adjust the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution, to obtain a pixel region to be matched whose resolution is the highest resolution, where the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set;

(2) a first determining unit 1004, configured to determine, according to a target MV corresponding to a block to be decoded of a current video frame, a target pixel region matched with the block to be decoded from a pixel region to be matched, where the target MV is a motion vector from the target pixel region to the block to be decoded, and a size of the target pixel region is the same as a size of the block to be decoded;

(3) a second determining unit 1006, configured to determine, according to the pixel values of the pixel points in the target pixel region, the pixel values of the pixel points in the target decoding block obtained by decoding the block to be decoded at the highest resolution.

It should be noted that the video decoding apparatus shown in fig. 10 can be used in, but is not limited to, the video decoder shown in fig. 1. The decoding process of the current video frame is completed through the interactive cooperation of the video decoder and other components.

Alternatively, the first adjusting unit 1002 may be configured to execute the foregoing step S202, the first determining unit 1004 may be configured to execute the foregoing step S204, and the second determining unit 1006 may be configured to execute the foregoing step S206.

In this embodiment, for a current block to be decoded in a current video frame in a video to be decoded, a plurality of decoded blocks are reconstructed and adjusted to have the same resolution (highest resolution), a pixel region to be matched is obtained, a target pixel region matched with the block to be decoded is determined according to a target MV corresponding to the block to be decoded, and then a pixel value of a pixel point in the target decoded block obtained by decoding the block to be decoded is determined according to a pixel value of the target pixel region.

As an alternative embodiment, the first adjusting unit 1002 includes:

(1) a first up-sampling module, configured to, when the initial resolution is lower than the highest resolution, up-sample the resolution of the reconstructed block from the initial resolution to the highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution.

As an alternative embodiment, the first adjusting unit 1002 includes:

(1) a second up-sampling module, configured to, when the decoded block includes a plurality of blocks and the plurality of blocks include a target block whose initial resolution is lower than the highest resolution, up-sample the resolution of the reconstructed block of the target block from the initial resolution to the highest resolution to obtain a target decoded block whose resolution is the highest resolution, where the pixel region to be matched includes the target decoded block.

As an alternative embodiment, the first adjusting unit 1002 includes:

(1) a first filtering module, configured to, after the resolution of the reconstructed block of the decoded block of the current video frame is adjusted from the initial resolution to the highest resolution and when the decoded block includes a plurality of blocks, perform edge filtering on pixel points on adjacent edges of adjacent pixel blocks in the pixel region to be matched, where the adjacent pixel blocks are the pixel blocks in the pixel region to be matched that correspond to adjacent blocks among the plurality of blocks.
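
By way of illustration only, edge filtering across a vertical boundary between two adjacent pixel blocks might be sketched as below. The 3:1 weighted blend, the function name `smooth_vertical_edge`, and the restriction to one pixel column on each side of the boundary are assumptions; the embodiment does not fix a particular filter.

```python
def smooth_vertical_edge(region, boundary_col):
    """Blend the two pixel columns that meet at a vertical block boundary.

    boundary_col -- index of the first column of the right-hand pixel block.
    Assumption: a simple 3:1 weighted average with rounding stands in for
    the edge filter, which the embodiment leaves open.
    """
    if not 0 < boundary_col < len(region[0]):
        raise ValueError("boundary_col must lie strictly inside the region")
    out = [list(row) for row in region]  # leave the input untouched
    for src, dst in zip(region, out):
        a, b = src[boundary_col - 1], src[boundary_col]
        dst[boundary_col - 1] = (3 * a + b + 2) // 4
        dst[boundary_col] = (a + 3 * b + 2) // 4
    return out


filtered = smooth_vertical_edge([[0, 0, 8, 8]], boundary_col=2)
```

In this example the step from 0 to 8 at the boundary becomes 0, 2, 6, 8, which softens the discontinuity that up-sampling adjacent blocks from different initial resolutions can introduce.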

As an alternative embodiment, the first determination unit 1004 includes:

(1) a first determining module, configured to determine, in the pixel region to be matched, a second pixel point position matched with a first pixel point position in the block to be decoded, where the motion vector from the second pixel point position to the first pixel point position is the target MV;

(2) a second determining module, configured to determine the target pixel region in the pixel region to be matched according to the second pixel point position, where the relative position of the second pixel point in the target pixel region is the same as the relative position of the first pixel point in the block to be decoded.
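
The two modules above can be illustrated with the following sketch, which handles an arbitrary reference pixel point rather than only the top-left one. The name `locate_target_region`, the (row, column) coordinate convention, and the returned (top, left, bottom, right) tuple with exclusive bottom and right bounds are assumptions for illustration.

```python
def locate_target_region(first_pos, rel_offset, mv, size):
    """Locate the target pixel region from a reference pixel point and the MV.

    first_pos  -- (row, col) of the first pixel point in the block to be decoded
    rel_offset -- (row, col) offset of that point from the block's top-left corner
    mv         -- target MV, from the second pixel point to the first pixel point
    size       -- (height, width) of the block to be decoded
    """
    # Second pixel point: walk the MV backwards from the first pixel point.
    second = (first_pos[0] - mv[0], first_pos[1] - mv[1])
    # The second point sits at the same relative offset inside the region.
    top, left = second[0] - rel_offset[0], second[1] - rel_offset[1]
    return (top, left, top + size[0], left + size[1])


bounds = locate_target_region(first_pos=(5, 7), rel_offset=(1, 1),
                              mv=(3, 5), size=(2, 2))
```

Here the second pixel point is at (2, 2), so the target pixel region covers rows 1 to 2 and columns 1 to 2, the same size as the block to be decoded.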

As an alternative embodiment, the second determining unit 1006 includes:

(1) a conversion module, configured to convert the pixel values of the pixel points in the target pixel region according to the target conversion parameter to obtain the pixel values of the pixel points in the target decoding block.

According to another aspect of the embodiments of the present invention, there is also provided a video encoding apparatus. As shown in fig. 11, the apparatus includes:

(1) a second adjusting unit 1102, configured to adjust the resolution of a reconstructed block of a coded block of a current video frame from an initial resolution to a highest resolution, to obtain a pixel region to be matched whose resolution is the highest resolution, where the initial resolution is the resolution adopted by the coded block during coding, and the highest resolution is the highest resolution in a preset resolution set;

(2) a searching unit 1104, configured to search a target pixel region matched with a block to be coded of a current video frame in a pixel region to be matched, where a size of the target pixel region is the same as a size of the block to be coded;

(3) an obtaining unit 1106, configured to obtain the MV from the target pixel region to the block to be coded as the target MV corresponding to the block to be coded.

It should be noted that the video encoding apparatus shown in fig. 11 can be used in, but is not limited to, the video encoder shown in fig. 1. The encoding process of the current video frame is completed through the interactive cooperation of the video encoder and other components.

Alternatively, the second adjusting unit 1102 may be configured to perform the foregoing step S602, the searching unit 1104 may be configured to perform the foregoing step S604, and the obtaining unit 1106 may be configured to perform the foregoing step S606.

In this embodiment, for a current block to be encoded in a current video frame in a video to be encoded, the reconstructed blocks of a plurality of encoded blocks are adjusted to the same resolution (the highest resolution) to obtain a pixel region to be matched, a target pixel region matched with the block to be encoded is searched for in the pixel region to be matched, and then the target MV from the target pixel region to the block to be encoded is determined.

As an alternative embodiment, the second adjusting unit 1102 includes:

(1) a third up-sampling module, configured to, when the initial resolution is lower than the highest resolution, up-sample the resolution of the reconstructed block from the initial resolution to the highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution.

As an alternative embodiment, the second adjusting unit 1102 includes:

(1) a fourth up-sampling module, configured to, when the encoded block includes a plurality of blocks and the plurality of blocks include a target block whose initial resolution is lower than the highest resolution, up-sample the resolution of the reconstructed block of the target block from the initial resolution to the highest resolution to obtain a target encoded block whose resolution is the highest resolution, where the pixel region to be matched includes the target encoded block.

As an alternative embodiment, the second adjusting unit 1102 includes:

(1) a second filtering module, configured to, after the resolution of the reconstructed block of the coded block of the current video frame is adjusted from the initial resolution to the highest resolution and when the coded block includes a plurality of blocks, perform edge filtering on pixel points on adjacent edges of adjacent pixel blocks in the pixel region to be matched, where the adjacent pixel blocks are the pixel blocks in the pixel region to be matched that correspond to adjacent blocks among the plurality of blocks.

As an alternative embodiment, the lookup unit 1104 includes:

(1) an acquisition module, configured to sequentially acquire, using a preset step length, a plurality of candidate pixel regions with the same size as the block to be coded from the pixel region to be matched in row-first-then-column or column-first-then-row order;

(2) a third determining module, configured to determine the target pixel region from the candidate pixel regions, where the target pixel region is the candidate pixel region that has the highest similarity with the block to be coded and whose similarity with the block to be coded is higher than the similarity threshold.

According to a further aspect of an embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.

Alternatively, in the present embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing the steps of:

S1, adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set;

S2, determining a target pixel region matched with the block to be decoded from the pixel region to be matched according to a target MV corresponding to the block to be decoded of the current video frame, wherein the target MV is a motion vector from the target pixel region to the block to be decoded, and the size of the target pixel region is the same as that of the block to be decoded;

S3, determining, according to the pixel values of the pixel points in the target pixel region, the pixel values of the pixel points in the target decoding block obtained by decoding the block to be decoded at the highest resolution.

Alternatively, the computer-readable storage medium may be configured to store a computer program for executing the steps of:

S1, adjusting the resolution of a reconstructed block of a coded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the coded block during coding, and the highest resolution is the highest resolution in a preset resolution set;

S2, searching for a target pixel region matched with a block to be coded of the current video frame in the pixel region to be matched, wherein the size of the target pixel region is the same as that of the block to be coded;

S3, obtaining the MV from the target pixel region to the block to be coded as the target MV corresponding to the block to be coded.

Alternatively, in this embodiment, a person skilled in the art may understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.

According to yet another aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the video decoding method or the video encoding method, as shown in fig. 12, the electronic device includes a memory 1202 and a processor 1204, the memory 1202 stores a computer program, and the processor 1204 is configured to execute the steps in any one of the above method embodiments through the computer program.

Optionally, in this embodiment, the electronic apparatus may be located in at least one network device of a plurality of network devices of a computer network.

Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:

Alternatively, it can be understood by those skilled in the art that the structure shown in fig. 12 is only an illustration, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palm computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 12 is a diagram illustrating a structure of the electronic device. For example, the electronic device may also include more or fewer components (e.g., network interfaces, etc.) than shown in FIG. 12, or have a different configuration than shown in FIG. 12.

The memory 1202 may be used to store software programs and modules, such as program instructions/modules corresponding to the video decoding method and apparatus or the video encoding method and apparatus in the embodiments of the present invention, and the processor 1204 executes various functional applications and data processing by executing the software programs and modules stored in the memory 1202, so as to implement the video decoding method or the video encoding method. The memory 1202 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1202 can further include memory located remotely from the processor 1204, which can be connected to a terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1202 may be, but not limited to, specifically used for storing information such as sample characteristics of the item and the target virtual resource account number. As an example, as shown in fig. 12, the memory 1202 may include, but is not limited to, a first adjusting unit 1002, a first determining unit 1004, and a second determining unit 1006 in the video decoding apparatus. As another example, the memory 1202 may include, but is not limited to, the second adjusting unit 1102, the searching unit 1104, and the obtaining unit 1106 of the video encoding apparatus. In addition, the video encoding apparatus may further include, but is not limited to, other module units in the video decoding apparatus or the video encoding apparatus, which is not described in this example again.

Optionally, the transmitting device 1206 is configured to receive or transmit data via a network. Examples of the network may include a wired network and a wireless network. In one example, the transmitting device 1206 includes a network adapter (Network Interface Controller, NIC) that can be connected to a router via a network cable to communicate with the internet or a local area network. In one example, the transmitting device 1206 is a Radio Frequency (RF) module, which is used to communicate with the internet in a wireless manner.

In addition, the electronic device further includes: a connection bus 1208 for connecting the various module components in the electronic device.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention.

In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and decorations can be made without departing from the principle of the present invention, and these modifications and decorations should also be regarded as the protection scope of the present invention.

Claims

1. A video decoding method, comprising:

adjusting the resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched, wherein the resolution is the highest resolution, the initial resolution is a resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set;

determining a target pixel area matched with a block to be decoded from the pixel area to be matched according to a target motion vector MV corresponding to the block to be decoded of the current video frame, wherein the target MV is a motion vector from the target pixel area to the block to be decoded, and the size of the target pixel area is the same as that of the block to be decoded;

and determining the pixel values of the pixels in the target decoding block obtained by decoding the block to be decoded according to the highest resolution according to the pixel values of the pixels in the target pixel region.

2. The method of claim 1, wherein adjusting the resolution of the reconstructed block of the decoded blocks of the current video frame from the initial resolution to the highest resolution, and wherein obtaining the pixel region to be matched with the highest resolution comprises:

and under the condition that the initial resolution is lower than the highest resolution, upsampling the resolution of the reconstruction block from the initial resolution to the highest resolution to obtain the pixel region to be matched with the resolution as the highest resolution.

3. The method of claim 1, wherein adjusting the resolution of the reconstructed block of the decoded blocks of the current video frame from the initial resolution to the highest resolution, and wherein obtaining the pixel region to be matched with the highest resolution comprises:

and in the case that the decoded block comprises a plurality of blocks and the plurality of blocks comprise a target block of which the initial resolution is lower than the highest resolution, upsampling the resolution of a reconstructed block of the target block from the initial resolution to the highest resolution to obtain the target decoded block of which the resolution is the highest resolution, wherein the pixel region to be matched comprises the target decoded block.

4. The method of claim 1, wherein after adjusting the resolution of the reconstructed block of the decoded blocks of the current video frame from the initial resolution to the highest resolution, the method further comprises:

and under the condition that the decoded block comprises a plurality of blocks, performing edge filtering on pixel points on adjacent edges of adjacent pixel blocks in the pixel area to be matched, wherein the adjacent pixel blocks are pixel blocks corresponding to adjacent blocks in the plurality of blocks in the pixel area to be matched.

5. The method of claim 1, wherein determining the target pixel region matching the block to be decoded from the pixel region to be matched according to the target MV corresponding to the block to be decoded comprises:

determining a second pixel point position matched with a first pixel point position in the block to be decoded in the pixel area to be matched, wherein a motion vector from the second pixel point position to the first pixel point position is the target MV;

and determining the target pixel area in the pixel area to be matched according to the position of the second pixel point, wherein the relative position of the second pixel point in the target pixel area is the same as the relative position of the first pixel point in the block to be decoded.

6. The method according to any one of claims 1 to 5, wherein determining, according to the pixel values of the pixels in the target pixel region, the pixel values of the pixels in the target decoding block obtained by decoding the block to be decoded at the highest resolution comprises:

and converting the pixel values of the pixels in the target pixel region according to the target conversion parameters to obtain the pixel values of the pixels in the target decoding block.

7. A video encoding method, comprising:

adjusting the resolution of a reconstructed block of a coded block of a current video frame from an initial resolution to a highest resolution to obtain a pixel region to be matched, wherein the resolution is the highest resolution, the initial resolution is a resolution adopted by the coded block during coding, and the highest resolution is the highest resolution in a preset resolution set;

searching a target pixel area matched with a block to be coded of the current video frame in the pixel area to be matched, wherein the size of the target pixel area is the same as that of the block to be coded;

and obtaining the motion vector MV from the target pixel area to the block to be coded to obtain the target MV corresponding to the block to be coded.

8. The method of claim 7, wherein adjusting the resolution of the reconstructed block of the encoded block of the current video frame from the initial resolution to the highest resolution, and wherein obtaining the pixel region to be matched with the highest resolution comprises:

and under the condition that the initial resolution is lower than the highest resolution, upsampling the resolution of the reconstruction block from the initial resolution to the highest resolution to obtain the pixel region to be matched with the resolution as the highest resolution.

9. The method of claim 7, wherein adjusting the resolution of the reconstructed block of the coded block of the current video frame from the initial resolution to the highest resolution to obtain the pixel region to be matched whose resolution is the highest resolution comprises:

and under the condition that the coded block comprises a plurality of blocks and the plurality of blocks comprise a target block of which the initial resolution is lower than the highest resolution, upsampling the resolution of a reconstructed block of the target block from the initial resolution to the highest resolution to obtain the target coded block of which the resolution is the highest resolution, wherein the pixel area to be matched comprises the target coded block.
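As an illustration of this upsampling step, the sketch below brings only those reconstructed blocks whose initial resolution is below the highest resolution up to the highest resolution. It assumes nearest-neighbour interpolation and an integer ratio between the two resolutions; the claim fixes neither, and the names `upsample_nearest` and `to_highest_resolution` are hypothetical.

```python
def upsample_nearest(block, factor):
    """Upsample a 2-D list of pixel values by an integer factor.

    Nearest-neighbour replication is an assumption made for this sketch.
    """
    out = []
    for row in block:
        # Repeat each pixel horizontally, then repeat the widened row vertically.
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def to_highest_resolution(blocks, highest):
    """blocks is a list of (pixels, resolution) pairs.

    Resolutions are modelled as integers whose ratio to `highest` is an
    integer (an assumption). Blocks already at `highest` are left unchanged.
    """
    result = []
    for pixels, resolution in blocks:
        if resolution < highest:
            pixels = upsample_nearest(pixels, highest // resolution)
        result.append(pixels)
    return result


blocks = [([[5]], 1), ([[7, 8], [9, 10]], 2)]
matched = to_highest_resolution(blocks, highest=2)
```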

10. The method of claim 7, wherein after adjusting the resolution of the reconstructed block of the encoded block of the current video frame from the initial resolution to the highest resolution, the method further comprises:

and under the condition that the coded block comprises a plurality of blocks, performing edge filtering on pixel points on adjacent edges of adjacent pixel blocks in the pixel region to be matched, wherein the adjacent pixel blocks are pixel blocks corresponding to adjacent blocks in the plurality of blocks in the pixel region to be matched.
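The edge filtering step can be sketched as follows for two horizontally adjacent pixel blocks. The claim does not specify the filter; the 3:1 weighted average used here, the integer rounding, and the name `filter_vertical_edge` are assumptions chosen only to show pixel points on adjacent edges being smoothed.

```python
def filter_vertical_edge(region, boundary_col):
    """Smooth the two pixel columns that meet at a vertical block boundary.

    region       -- 2-D list of integer pixel values (the region to be matched)
    boundary_col -- index of the first column of the right-hand block

    Returns a filtered copy; the input region is not modified.
    """
    filtered = [list(row) for row in region]
    for row in filtered:
        left, right = row[boundary_col - 1], row[boundary_col]
        # Pull each edge pixel a quarter of the way toward its neighbour,
        # with +2 for rounding in integer arithmetic.
        row[boundary_col - 1] = (3 * left + right + 2) // 4
        row[boundary_col] = (left + 3 * right + 2) // 4
    return filtered


smoothed = filter_vertical_edge([[10, 10, 50, 50]], boundary_col=2)
```

Only the two columns touching the boundary change, so the interior of each pixel block is preserved while the step between blocks is reduced.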

11. The method according to any one of claims 7 to 10, wherein searching the pixel region to be matched for the target pixel region matching the block to be encoded comprises:

sequentially obtaining, using a preset step length, a plurality of candidate pixel areas having the same size as the block to be coded from the pixel area to be matched, in row-first-then-column order or in column-first-then-row order;

and determining the target pixel area from the candidate pixel areas, wherein the target pixel area is the candidate pixel area which has the highest similarity with the block to be coded and has the similarity higher than a similarity threshold value with the block to be coded.
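The search described in this claim can be sketched as below. Similarity is modelled through the sum of absolute differences (SAD), where a lower SAD means higher similarity and the similarity threshold becomes an upper bound `max_sad`; that metric, the row-first scan, and the names `sad` and `search_target_region` are assumptions, since the claim does not name a similarity measure.

```python
def sad(region, top, left, block):
    """Sum of absolute differences between the block and a candidate area."""
    return sum(
        abs(region[top + r][left + c] - block[r][c])
        for r in range(len(block))
        for c in range(len(block[0]))
    )


def search_target_region(region, block, step=1, max_sad=None):
    """Return (top, left) of the best candidate area, or None if none qualifies."""
    bh, bw = len(block), len(block[0])
    best = None
    for top in range(0, len(region) - bh + 1, step):          # rows first
        for left in range(0, len(region[0]) - bw + 1, step):  # then columns
            cost = sad(region, top, left, block)
            # Skip candidates whose similarity is not above the threshold.
            if max_sad is not None and cost >= max_sad:
                continue
            if best is None or cost < best[0]:
                best = (cost, top, left)
    return None if best is None else (best[1], best[2])


region = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
block = [[1, 2], [3, 4]]
found = search_target_region(region, block)
```

A larger step examines fewer candidates at the cost of possibly missing the best match. Once a candidate is found, the MV from the target pixel area to the block to be coded is the block position minus the returned position.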

12. A video decoding apparatus, comprising:

a first adjusting unit, configured to adjust a resolution of a reconstructed block of a decoded block of a current video frame from an initial resolution to a highest resolution, to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the decoded block during decoding, and the highest resolution is the highest resolution in a preset resolution set;

a first determining unit, configured to determine, according to a target motion vector MV corresponding to a block to be decoded of the current video frame, a target pixel region matching the block to be decoded from the pixel region to be matched, where the target MV is a motion vector from the target pixel region to the block to be decoded, and a size of the target pixel region is the same as a size of the block to be decoded;

and a second determining unit, configured to determine, according to the pixel values of the pixels in the target pixel region, the pixel values of the pixels in a target decoding block obtained by decoding the block to be decoded at the highest resolution.

13. A video encoding apparatus, comprising:

a second adjusting unit, configured to adjust a resolution of a reconstructed block of a coded block of a current video frame from an initial resolution to a highest resolution, to obtain a pixel region to be matched whose resolution is the highest resolution, wherein the initial resolution is the resolution adopted by the coded block during coding, and the highest resolution is the highest resolution in a preset resolution set;

a searching unit, configured to search the pixel area to be matched for a target pixel area matching a block to be coded of the current video frame, wherein the size of the target pixel area is the same as that of the block to be coded;

and an obtaining unit, configured to obtain a motion vector (MV) from the target pixel area to the block to be coded as the target MV corresponding to the block to be coded.

14. A computer-readable storage medium comprising a stored program, wherein the program, when executed, performs the method of any one of claims 1 to 11.

15. An electronic device comprising a memory and a processor, characterized in that the memory stores a computer program, and the processor is configured to execute the method of any one of claims 1 to 11 by means of the computer program.