CN110636295B

CN110636295B - Video encoding and decoding method and device, storage medium and electronic device

Info

Publication number: CN110636295B
Application number: CN201910927102.0A
Authority: CN
Inventors: 高欣玮; 谷沉沉
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-09-27
Filing date: 2019-09-27
Publication date: 2023-10-24
Anticipated expiration: 2039-09-27
Also published as: CN110636295A

Abstract

The invention discloses a video encoding and decoding method and device, a storage medium and an electronic device. The method comprises the following steps: acquiring a motion vector MV of a block to be decoded in a video frame to be decoded; adjusting the resolution of the block to be decoded to a target resolution to obtain a first reconstructed block, adjusting the resolution of a reconstructed frame of a forward reference frame of the block to be decoded to the target resolution to obtain a first reconstructed frame, and adjusting the resolution of a reconstructed frame of a backward reference frame of the block to be decoded to the target resolution to obtain a second reconstructed frame; synthesizing the first reconstructed frame and the second reconstructed frame into a virtual reference frame; and determining pixel values of the first reconstructed block according to the virtual reference frame and the motion vector MV of the first reconstructed block relative to a corresponding region in the virtual reference frame, wherein the motion vector MV of the first reconstructed block relative to the corresponding region in the virtual reference frame is equal to the motion vector MV of the block to be decoded. The method solves the technical problem that pixel values cannot be determined because video blocks have different resolutions.

Description

Video encoding and decoding method and device, storage medium and electronic device

Technical Field

The present invention relates to the field of audio/video encoding and decoding, and in particular, to a video encoding and decoding method and apparatus, a storage medium, and an electronic apparatus.

Background

With the development of digital media technology and computer technology, video is applied to various fields such as mobile communication, network monitoring, network television, etc. With the improvement of hardware performance and screen resolution, the demand of users for high-definition video is increasing.

Under limited mobile bandwidth, existing codecs usually encode and decode all video frames at the same resolution, which makes the peak signal-to-noise ratio (PSNR) relatively low at some bandwidths, thereby distorting the video frames and degrading video playing quality. In the related art, distortion of video frames can be reduced by adjusting the resolutions used for different video blocks during encoding and decoding; however, once the video blocks use different resolutions, the pixel values of the pixels in a block to be decoded cannot be determined during decoding, so decoding cannot be performed.

In view of the above problems, no effective solution has been proposed at present.

Disclosure of Invention

The embodiment of the invention provides a video encoding and decoding method and device, a storage medium and an electronic device, which at least solve the technical problem that pixel values cannot be determined due to different resolutions of video blocks.

According to an aspect of an embodiment of the present invention, there is provided a video decoding method including: acquiring a motion vector MV of a block to be decoded in a video frame to be decoded; adjusting the resolution of the block to be decoded to a target resolution to obtain a first reconstructed block, adjusting the resolution of a reconstructed frame of a forward reference frame of the block to be decoded to the target resolution to obtain a first reconstructed frame, and adjusting the resolution of a reconstructed frame of a backward reference frame of the block to be decoded to the target resolution to obtain a second reconstructed frame; synthesizing the first reconstructed frame and the second reconstructed frame into a virtual reference frame; and determining pixel values of the first reconstructed block according to the virtual reference frame and the motion vector MV of the first reconstructed block relative to a corresponding region in the virtual reference frame, wherein the motion vector MV of the first reconstructed block relative to the corresponding region in the virtual reference frame is equal to the motion vector MV of the block to be decoded, and the corresponding region is the region corresponding to the first reconstructed block in the virtual reference frame.

According to another aspect of the embodiment of the present invention, there is also provided a video encoding method including: adjusting the resolution of a block to be encoded in a video frame to be encoded to a target resolution to obtain a first reconstructed block, adjusting the resolution of a reconstructed frame of a forward reference frame of the block to be encoded to the target resolution to obtain a first reconstructed frame, and adjusting the resolution of a reconstructed frame of a backward reference frame of the block to be encoded to the target resolution to obtain a second reconstructed frame; synthesizing the first reconstructed frame and the second reconstructed frame into a virtual reference frame; and determining the motion vector MV of the first reconstructed block relative to a corresponding region in the virtual reference frame as the motion vector MV of the block to be encoded, wherein the corresponding region is the region corresponding to the first reconstructed block in the virtual reference frame.

According to another aspect of the embodiment of the present invention, there is also provided a video decoding apparatus including: a first acquisition unit, configured to acquire a motion vector MV of a block to be decoded in a video frame to be decoded; a first adjusting unit, configured to adjust the resolution of the block to be decoded to a target resolution to obtain a first reconstructed block, adjust the resolution of a reconstructed frame of a forward reference frame of the block to be decoded to the target resolution to obtain a first reconstructed frame, and adjust the resolution of a reconstructed frame of a backward reference frame of the block to be decoded to the target resolution to obtain a second reconstructed frame; a synthesizing unit, configured to synthesize the first reconstructed frame and the second reconstructed frame into a virtual reference frame; and a determining unit, configured to determine pixel values of the first reconstructed block according to the virtual reference frame and the motion vector MV of the first reconstructed block relative to a corresponding region in the virtual reference frame, wherein the motion vector MV of the first reconstructed block relative to the corresponding region in the virtual reference frame is equal to the motion vector MV of the block to be decoded, and the corresponding region is the region corresponding to the first reconstructed block in the virtual reference frame.

According to another aspect of the embodiment of the present invention, there is also provided a video encoding apparatus including: a first adjusting unit, configured to adjust the resolution of a block to be encoded in a video frame to be encoded to a target resolution to obtain a first reconstructed block, adjust the resolution of a reconstructed frame of a forward reference frame of the block to be encoded to the target resolution to obtain a first reconstructed frame, and adjust the resolution of a reconstructed frame of a backward reference frame of the block to be encoded to the target resolution to obtain a second reconstructed frame; a synthesizing unit, configured to synthesize the first reconstructed frame and the second reconstructed frame into a virtual reference frame; and a determining unit, configured to determine the motion vector MV of the first reconstructed block relative to a corresponding region in the virtual reference frame as the motion vector MV of the block to be encoded, wherein the corresponding region is the region corresponding to the first reconstructed block in the virtual reference frame.

According to yet another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is configured to perform the video encoding and decoding method described above when run.

According to still another aspect of the embodiments of the present invention, there is further provided an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the video encoding and decoding method described above through the computer program.

In the embodiment of the invention, the motion vector of a block to be decoded in a video frame to be decoded is acquired; the resolution of the block to be decoded is adjusted to the target resolution to obtain a first reconstructed block; the resolutions of the reconstructed frames of the forward reference frame and the backward reference frame of the block to be decoded are adjusted to the target resolution, and the two adjusted frames are synthesized into a virtual reference frame; and the pixel values of the first reconstructed block are determined according to the motion vector of the first reconstructed block relative to the corresponding region in the virtual reference frame. This achieves the technical effect that pixel values can be determined even when video blocks have different resolutions, and thereby solves the technical problem that pixel values cannot be determined because video blocks have different resolutions.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:

FIG. 1 is a schematic diagram of an application environment of an alternative video decoding method according to an embodiment of the present application;

FIG. 2 is a flow chart of an alternative video decoding method according to an embodiment of the present application;

FIG. 3 is a schematic diagram of an alternative video decoding method according to an embodiment of the present application;

FIG. 4 is a schematic diagram of another alternative video decoding method according to an embodiment of the present application;

FIG. 5 is a flow chart of an alternative video encoding method according to an embodiment of the application;

FIG. 6 is a schematic structural diagram of an alternative video decoding apparatus according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an alternative video encoding apparatus according to an embodiment of the present application;

FIG. 8 is a schematic diagram of an alternative electronic device according to an embodiment of the application;

FIG. 9 is a schematic structural diagram of another alternative electronic device according to an embodiment of the present application.

Detailed Description

In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

According to an aspect of the embodiments of the present application, a video decoding method is provided. As an optional implementation, the video decoding method may be applied to, but is not limited to, the application environment shown in FIG. 1. The application environment includes a terminal 102 and a server 104, where the terminal 102 and the server 104 communicate through a network. The terminal 102 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, etc. The server 104 may be, but is not limited to, a computer processing device with high data processing capability and a certain amount of storage space.

It should be noted that the video encoding method corresponding to the above video decoding method may also be applied to, but is not limited to, the application environment shown in FIG. 1. After the video to be encoded is obtained, the video encoding method provided by the present application may be, but is not limited to being, adopted: through the interaction between the terminal 102 and the server 104 shown in FIG. 1, the resolutions of the block to be encoded in the video frame to be encoded, the reconstructed frame of the forward reference frame of the block to be encoded, and the reconstructed frame of the backward reference frame of the block to be encoded are adjusted to the target resolution, and the two adjusted reconstructed frames are synthesized into a virtual reference frame, so as to determine the motion vector MV of the block to be encoded, thereby encoding the video to be encoded even when the video blocks have different resolutions. Similarly, after the video to be decoded is acquired, the video decoding method provided by the present application may be, but is not limited to being, adopted: through the interaction between the terminal 102 and the server 104 shown in FIG. 1, the reconstructed frames of the forward reference frame and the backward reference frame of the block to be decoded are adjusted to the target resolution and synthesized into a virtual reference frame, and the pixel values of the block to be decoded are determined from the motion vector of the block to be decoded and the corresponding region in the virtual reference frame, thereby decoding the video to be decoded even when the video blocks have different resolutions.

In one embodiment, the terminal 102 may include, but is not limited to, the following components: an image processing unit 1021, a processor 1022, a storage medium 1023, a memory 1024, a network interface 1025, a display screen 1026, and an input device 1027. The components described above may be connected by, but are not limited to, a system bus 1028. The image processing unit 1021 is used for providing at least the drawing capability of the display interface; the processor 1022 is configured to provide computing and control capabilities to support operation of the terminal 102; the storage medium 1023 stores an operating system 1023-2 and a video encoder and/or a video decoder 1023-4. The operating system 1023-2 is used to provide control operation instructions, and the video encoder and/or video decoder 1023-4 is used to perform encoding/decoding operations in accordance with the control operation instructions. In addition, the memory 1024 provides an operating environment for the video encoder and/or video decoder 1023-4 in the storage medium 1023, and the network interface 1025 is used for network communication with the network interface 1043 in the server 104. The display screen 1026 is used for displaying application interfaces and the like, such as decoded video; the input device 1027 is used to receive commands or data input by a user, and the like. For a terminal 102 with a touch screen, the display screen 1026 and the input device 1027 may be the touch screen. The internal structure of the terminal shown in FIG. 1 is merely a block diagram of the part of the structure related to the present application and does not constitute a limitation on the terminal to which the present application is applied; a specific terminal or server may include more or fewer components than those shown in the drawings, may combine some components, or may have a different arrangement of components.

In one embodiment, the server 104 may include, but is not limited to, the following components: a processor 1041, a memory 1042, a network interface 1043, and a storage medium 1044. The components described above may be connected by, but are not limited to, a system bus 1045. The storage medium 1044 includes an operating system 1044-1, a database 1044-2, and a video encoder and/or a video decoder 1044-3. The processor 1041 is configured to provide computing and control capabilities to support operation of the server 104. The memory 1042 provides an operating environment for the video encoder and/or video decoder 1044-3 in the storage medium 1044. The network interface 1043 communicates with the network interface 1025 of the external terminal 102 through a network connection. The operating system 1044-1 in the storage medium is used to provide control operation instructions; the video encoder and/or video decoder 1044-3 is used to perform encoding/decoding operations according to the control operation instructions; the database 1044-2 is used to store data. The internal structure of the server shown in FIG. 1 is merely a block diagram of the part of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied; a specific computer device may have a different arrangement of components.

In one embodiment, the network may include, but is not limited to, a wired network. Wherein, the wired network may include, but is not limited to: wide area network, metropolitan area network, local area network. The above is merely an example, and is not limited in any way in the present embodiment.

According to an aspect of an embodiment of the present invention, there is provided a video decoding method, as shown in FIG. 2, including:

S202, acquiring a motion vector MV of a block to be decoded in a video frame to be decoded;

S204, adjusting the resolution of the block to be decoded to a target resolution to obtain a first reconstructed block, adjusting the resolution of a reconstructed frame of a forward reference frame of the block to be decoded to the target resolution to obtain a first reconstructed frame, and adjusting the resolution of a reconstructed frame of a backward reference frame of the block to be decoded to the target resolution to obtain a second reconstructed frame;

S206, synthesizing the first reconstructed frame and the second reconstructed frame into a virtual reference frame;

S208, determining pixel values of the first reconstructed block according to the virtual reference frame and the motion vector MV of the first reconstructed block relative to a corresponding region in the virtual reference frame, wherein the motion vector MV of the first reconstructed block relative to the corresponding region in the virtual reference frame is equal to the motion vector MV of the block to be decoded, and the corresponding region is the region corresponding to the first reconstructed block in the virtual reference frame.

Here, the pixel values of the pixels in the first reconstructed block are equal to the pixel values of the pixels of the corresponding region in the virtual reference frame. It can be appreciated that the pixel values of the pixels in the block to be decoded are equal to the pixel values of the corresponding pixels in the first reconstructed block. In an embodiment of the present invention, both the forward reference frame and the backward reference frame are decoded frames.
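The flow of steps S202 to S208 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the function names, the nearest-neighbour resampling filter and the rounded pixel-wise average used to synthesize the virtual reference frame are assumptions made for illustration only, since the text above does not specify how resampling or synthesis is carried out. Frames are plain 2D lists of pixel values, and the block position, block size and motion vector are assumed to be expressed in pixels at the target resolution.

```python
def resize_nearest(pixels, out_h, out_w):
    """Nearest-neighbour resampling of a 2D list (illustrative filter choice)."""
    in_h, in_w = len(pixels), len(pixels[0])
    return [
        [pixels[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def synthesize_virtual_reference(first_frame, second_frame):
    """Combine two equally sized frames; the rounded average is an assumption."""
    return [
        [(a + b + 1) // 2 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first_frame, second_frame)
    ]


def decode_block(mv, block_pos, block_size, forward_recon, backward_recon, target_size):
    """Hypothetical helper following S202-S208.

    mv         -- (dy, dx) motion vector of the block to be decoded (S202)
    block_pos  -- (top, left) of the first reconstructed block
    block_size -- (height, width) of the first reconstructed block
    """
    target_h, target_w = target_size
    # S204: bring both reference reconstructions to the target resolution.
    first_frame = resize_nearest(forward_recon, target_h, target_w)
    second_frame = resize_nearest(backward_recon, target_h, target_w)
    # S206: synthesize the virtual reference frame.
    virtual = synthesize_virtual_reference(first_frame, second_frame)
    # S208: the MV of the first reconstructed block equals the MV of the
    # block to be decoded, so it locates the corresponding region directly.
    top = block_pos[0] + mv[0]
    left = block_pos[1] + mv[1]
    height, width = block_size
    if top < 0 or left < 0 or top + height > target_h or left + width > target_w:
        raise ValueError("motion vector points outside the virtual reference frame")
    return [row[left:left + width] for row in virtual[top:top + height]]
```

In this sketch the pixel values of the first reconstructed block are simply copied from the corresponding region, matching the statement above that the two are equal; a practical decoder would normally add further processing that is outside the scope of this passage.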

It should be noted that the video decoding method shown in FIG. 2 may be used in, but is not limited to, the video decoder shown in FIG. 1. The video decoder cooperates with the other components to complete the decoding of the video frame to be decoded.

Optionally, in this embodiment, the video decoding method may be applied to, but is not limited to, application scenarios such as a video playing application, a video sharing application, or a video session application. The video transmitted in these application scenarios may include, but is not limited to, long video and short video. For example, the long video may be an episode with a longer playing time (for example, longer than 10 minutes) or the pictures shown in a long video session, and the short video may be a voice message exchanged between two or more parties or a video with a shorter playing time (for example, less than or equal to 30 seconds) shown on a sharing platform. The foregoing is merely an example. The video decoding method provided in this embodiment may be applied to, but is not limited to, a playing device that plays video in the foregoing application scenarios: after the encoded code stream data is acquired, the pixel values of the pixels are determined by adjusting the resolution so that decoding can be performed, which avoids the failure to decode that arises when pixel values cannot be determined because the block to be decoded and the reference block have different resolutions.

When video is encoded, different video blocks in a video frame may be encoded at different resolutions, which overcomes the distortion caused by using a uniform resolution in the related art and ensures video playing quality. In this embodiment, when video decoding is performed, the reference region of the block to be decoded is the corresponding region in the virtual reference frame synthesized from the forward reference frame and the backward reference frame of the block to be decoded. It will be appreciated that what is determined here is the pixel value of each pixel in the first reconstructed block, and the pixel value of each pixel in the block to be decoded is equal to the pixel value of the corresponding pixel in the first reconstructed block. In the embodiment of the invention, the resolutions of the block to be decoded, the forward reference frame and the backward reference frame need to be adjusted during decoding. It should be noted that the adjustment may be applied to the reconstructed block of the block to be decoded, the reconstructed frame of the forward reference frame and the reconstructed frame of the backward reference frame, so that the motion vector of the block to be decoded relative to the reference block can be determined without actually changing the original block to be decoded, forward reference frame and backward reference frame; this may, of course, also be applied to the encoding process. Alternatively, the resolutions of the block to be decoded, the forward reference frame and the backward reference frame may be adjusted directly and, after the pixel values are determined, adjusted back to the resolutions they had before the adjustment.

Optionally, in this embodiment, after a video frame to be decoded in the video to be decoded is determined from the code stream received from the encoding device and before the video frame to be decoded is decoded, a reference video frame may be determined from the video frames decoded before the video frame to be decoded, and a reference block in the reference video frame may then be determined. In this embodiment, the encoding mode of the reference video frame may be determined in either of the following ways:

1) Acquiring a preset flag bit in a code stream, and determining an encoding mode adopted by a reference video frame, such as intra-frame decoding or inter-frame decoding, according to the flag bit;

2) Decoding according to a convention agreed in advance with the encoding device at the encoding end, and determining, after decoding, the encoding mode adopted by the decoded reference video frame, such as intra-frame decoding or inter-frame decoding.

For the reference region in the embodiment of the present invention, as shown in FIG. 3, the t-th frame is the current frame to be decoded, and video block A is the block to be decoded. When the t-th frame is decoded, the (t-k)-th frame may be referred to as the forward reference frame and the (t+n)-th frame as the backward reference frame, where k and n are positive integers and k may be equal to n. In the decoding process, the forward reference frame and the backward reference frame are synthesized into a virtual reference frame, and a corresponding region B of the block to be decoded is determined in the virtual reference frame, where the corresponding region is the reference region of the block to be decoded. It can be understood that, since the forward reference frame and the backward reference frame are both decoded frames, the pixel values of the pixels in the synthesized virtual reference frame are known. Because the motion vector MV of the first reconstructed block (obtained by adjusting the block to be decoded to the target resolution) relative to the reference region is equal to the motion vector MV of the block to be decoded, the one-to-one correspondence between the pixels of the first reconstructed block and the pixels in the reference region can be determined from that motion vector, and the pixel values of the first reconstructed block, that is, the pixel values of the pixels in the first reconstructed block, can thereby be determined.

Optionally, adjusting the resolution of the block to be decoded to the target resolution to obtain the first reconstructed block, adjusting the resolution of the reconstructed frame of the forward reference frame of the block to be decoded to the target resolution to obtain the first reconstructed frame, and adjusting the resolution of the reconstructed frame of the backward reference frame of the block to be decoded to the target resolution to obtain the second reconstructed frame includes: adjusting a first resolution adopted by the block to be decoded in decoding to a third resolution to obtain the first reconstructed block, wherein the target resolution is the third resolution; adjusting the resolution adopted in decoding by each video block in the reconstructed frame of the forward reference frame to the third resolution to obtain the first reconstructed frame, wherein the forward reference frame includes at least two video blocks decoded at different resolutions; and adjusting the resolution adopted in decoding by each video block in the reconstructed frame of the backward reference frame to the third resolution to obtain the second reconstructed frame, wherein the backward reference frame includes at least two video blocks decoded at different resolutions. It can be understood that, when video is encoded, different video blocks in a video frame may be encoded at different resolutions, which overcomes the distortion caused by using a uniform resolution in the related art and ensures video playing quality. The video blocks within a video frame may therefore have different resolutions, so when the forward reference frame and the backward reference frame are adjusted, the resolution of each video block in the forward reference frame and the resolution of each video block in the backward reference frame may be adjusted to the target resolution. Taking the adjustment of the forward reference frame as an example, as shown in FIG. 4, different video blocks in the forward reference frame have different resolutions, denoted R1 to R4 in FIG. 4. When the resolution adjustment is performed, the resolutions of all the video blocks need to be adjusted to the target resolution, so that the resolution of the resulting first reconstructed frame is the target resolution.
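The per-block adjustment illustrated by FIG. 4 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the method of the embodiment: the frame is assumed to be a regular grid of video blocks, each stored at its own decoded resolution, and every block is resampled to the same target block size with a nearest-neighbour filter. The names `resize_nearest` and `unify_frame_resolution` and the dictionary layout are hypothetical.

```python
def resize_nearest(pixels, out_h, out_w):
    """Nearest-neighbour resampling of a 2D list (illustrative filter choice)."""
    in_h, in_w = len(pixels), len(pixels[0])
    return [
        [pixels[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def unify_frame_resolution(blocks, grid_rows, grid_cols, block_h, block_w):
    """Resample every block of a reference frame to one target block size.

    blocks maps (grid_row, grid_col) to a 2D pixel list stored at the
    resolution that block was decoded with (R1..R4 in FIG. 4).
    Returns the assembled frame at the target resolution.
    """
    frame = [[0] * (grid_cols * block_w) for _ in range(grid_rows * block_h)]
    for (grid_row, grid_col), pixels in blocks.items():
        resized = resize_nearest(pixels, block_h, block_w)
        for r in range(block_h):
            row = frame[grid_row * block_h + r]
            row[grid_col * block_w:(grid_col + 1) * block_w] = resized[r]
    return frame
```

For example, four blocks stored as 1x1, 2x2, 4x4 and 1x1 samples can be brought to a common 2x2 block size, giving a 4x4 frame in which all blocks share the target resolution.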

Optionally, before adjusting the first resolution adopted by the block to be decoded in decoding to the third resolution, the method further includes: and acquiring a syntax element carried in the data to be decoded corresponding to the block to be decoded, wherein the syntax element is used for indicating the third resolution. In an embodiment of the present invention, the syntax element herein may be identification information, thereby indicating a third resolution required at the time of decoding. It will be understood, of course, that the third resolution may be pre-agreed, so that no syntax elements need to be carried in the bitstream, and the motion vector MV of the block to be decoded relative to the reference block is determined directly according to the pre-agreed third resolution at decoding.

In an alternative embodiment of the present invention, the syntax element may be an index flag for inter prediction adaptive resolution alignment, specifically denoted as 0, 1, 2, 3, 4, etc., where each index represents a scaling ratio of the third resolution. For example, 0 represents the highest resolution; 1 represents sampling the width and the height to 3/4 each; 2 represents sampling the width and the height to 2/3 each; 3 represents sampling the width and the height to 1/2 each; 4 represents sampling the width and the height to 1/3 each; and 5 represents sampling the width and the height to 1/4 each. It is to be understood that this is only an alternative embodiment provided by the present invention and the present invention is not limited thereto.
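The mapping from index to scaling ratio in this example can be written as a small lookup table. The sketch below is illustrative only: the names `SCALE_BY_INDEX` and `third_resolution` are hypothetical, index 0 is interpreted as a ratio of 1 relative to the highest resolution, and truncating fractional sizes toward zero is an assumption, since the text does not state a rounding rule.

```python
from fractions import Fraction

# Index carried by the syntax element -> scaling ratio applied to both
# width and height (values taken from the example in the text).
SCALE_BY_INDEX = {
    0: Fraction(1),      # highest resolution, interpreted here as no scaling
    1: Fraction(3, 4),
    2: Fraction(2, 3),
    3: Fraction(1, 2),
    4: Fraction(1, 3),
    5: Fraction(1, 4),
}


def third_resolution(index, highest_width, highest_height):
    """Return (width, height) of the third resolution for a given index."""
    if index not in SCALE_BY_INDEX:
        raise ValueError(f"unknown resolution index: {index}")
    scale = SCALE_BY_INDEX[index]
    # Truncation toward zero is an assumed rounding policy.
    return int(highest_width * scale), int(highest_height * scale)
```

Exact fractions are used instead of floating-point values so that ratios such as 2/3 and 1/3 do not introduce rounding error before the final conversion to whole pixels.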

Optionally, the third resolution is the original resolution of the block to be decoded, or the third resolution is the highest resolution in a predetermined resolution set. It will be appreciated that several resolutions, such as 720p and 1080p, may be available for a video, and these candidate resolutions constitute the resolution set referred to here. Of course, the resolution set may use, but is not limited to, existing video resolution specifications. It should be noted that the original resolution here refers to the original resolution of the video to be decoded, and it may be the same as or different from the first resolution of the block to be decoded.

Optionally, in the case that the third resolution is lower than the highest resolution in the predetermined set of resolutions, each of the three adjustments is performed in two stages.

Adjusting the first resolution adopted by the block to be decoded in decoding to the third resolution to obtain the first reconstructed block includes: up-sampling the block to be decoded from the first resolution to the highest resolution to obtain a first block to be decoded; and down-sampling the first block to be decoded from the highest resolution to the third resolution to obtain the first reconstructed block.

Adjusting the resolution adopted in decoding each video block in the reconstructed frame of the forward reference frame to the third resolution to obtain the first reconstructed frame includes: up-sampling each video block in the reconstructed frame of the forward reference frame from the resolution adopted in decoding to the highest resolution to obtain a first forward reference frame; and down-sampling each video block in the first forward reference frame from the highest resolution to the third resolution to obtain the first reconstructed frame.

Adjusting the resolution adopted in decoding each video block in the reconstructed frame of the backward reference frame to the third resolution to obtain the second reconstructed frame includes: up-sampling each video block in the reconstructed frame of the backward reference frame from the resolution adopted in decoding to the highest resolution to obtain a first backward reference frame; and down-sampling each video block in the first backward reference frame from the highest resolution to the third resolution to obtain the second reconstructed frame.

In the embodiment of the present invention, when the third resolution is lower than the highest resolution in the resolution set, up-sampling to the highest resolution may be performed first, followed by down-sampling to the third resolution.
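The two-stage adjustment (up-sample to the highest resolution, then down-sample to the third resolution) can be sketched as follows. This is an illustration under stated assumptions: nearest-neighbour resampling is chosen only for brevity, since the text does not name an interpolation filter, and the helper names and the representation of a block as a 2-D list of samples are hypothetical.

```python
def resize_nearest(block, out_w, out_h):
    """Resample a 2-D list of samples to out_w x out_h (nearest neighbour).

    Assumption: nearest-neighbour sampling stands in for whatever
    interpolation filter a real codec would use.
    """
    in_h, in_w = len(block), len(block[0])
    return [
        [block[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def to_third_resolution(block, highest_size, third_size):
    """Up-sample to the highest resolution, then down-sample to the third.

    highest_size and third_size are (width, height) of the block at the
    highest and the third resolution respectively (hypothetical parameters).
    """
    high_w, high_h = highest_size
    third_w, third_h = third_size
    upsampled = resize_nearest(block, high_w, high_h)    # first block to be decoded
    return resize_nearest(upsampled, third_w, third_h)   # first reconstructed block
```

The same helper could be applied to every video block of the forward and backward reference frames to obtain the first and second reconstructed frames.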

Optionally, in the case that the resolution adopted by the block to be decoded in decoding is the original resolution, adjusting the block to be decoded, the reconstructed frame of the forward reference frame and the reconstructed frame of the backward reference frame to the target resolution includes: adjusting the resolution adopted in decoding each video block in the reconstructed frame of the forward reference frame to the original resolution to obtain the first reconstructed frame, and adjusting the resolution adopted in decoding each video block in the reconstructed frame of the backward reference frame to the original resolution to obtain the second reconstructed frame, wherein the target resolution is the original resolution, the forward reference frame includes at least 2 video blocks decoded at different resolutions, and the backward reference frame includes at least 2 video blocks decoded at different resolutions. In the embodiment of the invention, the original resolution is the original resolution of the video; when the resolution adopted by the block to be decoded in decoding is the original resolution, the block to be decoded, the forward reference frame and the backward reference frame can all be brought to the original resolution. It will be appreciated that decoding the forward and backward reference frames yields reconstructed frames at the original resolution, so that the reconstructed frame of the forward reference frame may be determined as the first reconstructed frame and the reconstructed frame of the backward reference frame as the second reconstructed frame.

According to another aspect of an embodiment of the present invention, there is provided a video encoding method, as shown in fig. 5, the method including:

S502, adjusting the resolution of a block to be encoded in a video frame to be encoded to a target resolution to obtain a first reconstructed block, adjusting the resolution of a reconstructed frame of a forward reference frame of the block to be encoded to the target resolution to obtain a first reconstructed frame, and adjusting the resolution of a reconstructed frame of a backward reference frame of the block to be encoded to the target resolution to obtain a second reconstructed frame;

S504, synthesizing the first reconstructed frame and the second reconstructed frame into a virtual reference frame;

S506, determining the motion vector MV of the first reconstructed block relative to a corresponding region in the virtual reference frame as the motion vector MV of the block to be encoded, wherein the corresponding region is a region corresponding to the first reconstructed block in the virtual reference frame.
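The synthesis and motion-vector steps above can be sketched in a few lines, assuming the block and both reconstructed frames have already been brought to the target resolution. Two points are assumptions made purely for illustration and are not stated in the text: the virtual reference frame is formed as a rounded per-sample average of the two reconstructed frames, and the motion vector is found by a full search that minimises the sum of absolute differences (SAD). All function names are hypothetical.

```python
def synthesize(first_frame, second_frame):
    """Virtual reference frame as a rounded per-sample average (assumption)."""
    return [
        [(a + b + 1) // 2 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first_frame, second_frame)
    ]


def sad(block, frame, top, left):
    """Sum of absolute differences between block and the region at (top, left)."""
    return sum(
        abs(block[y][x] - frame[top + y][left + x])
        for y in range(len(block))
        for x in range(len(block[0]))
    )


def motion_vector(block, virtual_ref, top, left, search=2):
    """Full search around (top, left); returns (dx, dy) with the lowest SAD."""
    bh, bw = len(block), len(block[0])
    fh, fw = len(virtual_ref), len(virtual_ref[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidate regions that fall outside the frame.
            if 0 <= y <= fh - bh and 0 <= x <= fw - bw:
                cost = sad(block, virtual_ref, y, x)
                if best_cost is None or cost < best_cost:
                    best_cost, best_mv = cost, (dx, dy)
    return best_mv
```

Because every input shares the target resolution, the displacement found for the first reconstructed block can be used directly as the motion vector MV of the block to be encoded.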

It should be noted that the video encoding method shown in fig. 5 may be used in, but is not limited to, the video encoder shown in fig. 1. The video encoder interacts and cooperates with other components to complete the encoding of the video frame to be encoded.

Optionally, in this embodiment, the video encoding method may be applied to, but is not limited to, application scenarios such as a video playing application, a video sharing application, or a video session application. The video transmitted in these application scenarios may include, but is not limited to, long video and short video. For example, a long video may be an episode with a longer playing time (for example, longer than 10 minutes) or the pictures shown in a long video session, and a short video may be a voice message exchanged between two or more parties or a video with a shorter playing time (for example, less than or equal to 30 seconds) shown on a sharing platform. The foregoing is merely an example. The video encoding method provided in this embodiment may be applied to, but is not limited to, a playing device for playing video in the foregoing application scenarios: after obtaining a video to be encoded, the device adjusts a block to be encoded to a target resolution to obtain a first reconstructed block and determines the motion vector MV of the first reconstructed block relative to a corresponding region in a virtual reference frame, so as to perform encoding.

When a video is encoded, different resolutions can be adopted to encode different video blocks in a video frame, which addresses the distortion caused by adopting a uniform resolution in the related art and helps ensure video playing quality. In this embodiment, the resolution of the block to be encoded in the video frame to be encoded is adjusted to the target resolution to obtain the first reconstructed block, and the forward reference frame and the backward reference frame of the block to be encoded are both adjusted to the target resolution and synthesized into the virtual reference frame, so that the motion vector MV of the first reconstructed block relative to the corresponding region in the virtual reference frame can be determined. This motion vector MV may be used as the motion vector MV of the block to be encoded. In the embodiment of the invention, determining the motion vector of the block to be encoded relative to the reference region during encoding requires adjusting the resolutions of the block to be encoded, the forward reference frame and the backward reference frame. It should be noted that the adjustment may be applied to the reconstructed block of the block to be encoded and to the reconstructions of the forward and backward reference frames, so that the motion vector of the block to be encoded relative to the reference block can be determined without actually changing the original block to be encoded, the forward reference frame or the backward reference frame. It will be appreciated that the resolutions of the block to be encoded, the forward reference frame and the backward reference frame may alternatively be adjusted directly and then restored to their pre-adjustment resolutions after the motion vector MV is determined.

It can be appreciated that the video encoding method according to the embodiment of the present invention may be understood with reference to the video decoding method described above.

Optionally, adjusting the resolution of the block to be encoded in the video frame to be encoded to the target resolution to obtain the first reconstructed block, adjusting the resolution of the reconstructed frame of the forward reference frame of the block to be encoded to the target resolution to obtain the first reconstructed frame, and adjusting the resolution of the reconstructed frame of the backward reference frame of the block to be encoded to the target resolution to obtain the second reconstructed frame includes: adjusting the first resolution adopted by the block to be encoded in encoding to a third resolution to obtain the first reconstructed block, wherein the target resolution is the third resolution; adjusting the resolution adopted in encoding each video block in the reconstructed frame of the forward reference frame to the third resolution to obtain the first reconstructed frame, wherein the forward reference frame includes at least 2 video blocks encoded at different resolutions; and adjusting the resolution adopted in encoding each video block in the reconstructed frame of the backward reference frame to the third resolution to obtain the second reconstructed frame, wherein the backward reference frame includes at least 2 video blocks encoded at different resolutions.
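The per-block adjustment of a reference frame whose video blocks were coded at different resolutions can be illustrated as below. This is a hedged sketch: it assumes every video block covers the same picture area, so that all blocks share one size at the third resolution, it uses nearest-neighbour resampling for brevity, and the helper names are hypothetical.

```python
def resize_nearest(block, out_w, out_h):
    """Resample a 2-D list of samples to out_w x out_h (nearest neighbour)."""
    in_h, in_w = len(block), len(block[0])
    return [
        [block[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def align_reference_frame(blocks, block_size_at_third):
    """Bring every video block of a reconstructed reference frame to one size.

    blocks: list of 2-D sample lists, possibly coded at different resolutions.
    block_size_at_third: (width, height) of one block at the third resolution
    (assumption: all blocks cover equal picture areas).
    """
    width, height = block_size_at_third
    return [resize_nearest(block, width, height) for block in blocks]
```

After this step, all blocks of the frame share the third resolution, which is the precondition for synthesizing the forward and backward frames into a virtual reference frame.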

Optionally, after determining the motion vector MV of the first reconstructed block relative to the corresponding region in the virtual reference frame as the motion vector MV of the block to be encoded, the method further includes: adding a syntax element to the data to be encoded corresponding to the block to be encoded, wherein the syntax element is used for indicating the third resolution.
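How such a syntax element might travel with the block data can be sketched as follows. The one-byte prefix layout and both function names are purely hypothetical, since the text does not define a bitstream format. The reader also covers the pre-agreed case described for the decoding method, in which no syntax element is carried.

```python
def add_syntax_element(block_payload: bytes, resolution_index: int) -> bytes:
    """Prefix the block data with a one-byte resolution index (hypothetical layout)."""
    if not 0 <= resolution_index <= 255:
        raise ValueError("resolution index must fit in one byte")
    return bytes([resolution_index]) + block_payload


def read_syntax_element(block_data: bytes, pre_agreed_index=None):
    """Return (resolution_index, payload).

    If the third resolution is pre-agreed, nothing is read from the data and
    the whole input is treated as payload.
    """
    if pre_agreed_index is not None:
        return pre_agreed_index, block_data
    if not block_data:
        raise ValueError("missing syntax element")
    return block_data[0], block_data[1:]
```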

Optionally, the third resolution is the original resolution of the block to be encoded, or the third resolution is the highest resolution in a predetermined set of resolutions. It will be appreciated that a video may be available at multiple resolutions, such as 720p and 1080p, and these alternative resolutions constitute the resolution set referred to here. Existing video resolution specifications may be used in the resolution set, but the set is not limited to them. It should be noted that the original resolution here refers to the original resolution of the video to be encoded, and that the original resolution may be the same as or different from the first resolution of the block to be encoded.

Optionally, in the case that the third resolution is lower than the highest resolution in the predetermined set of resolutions, each of the three adjustments is performed in two stages.

Adjusting the first resolution adopted by the block to be encoded in encoding to the third resolution to obtain the first reconstructed block includes: up-sampling the block to be encoded from the first resolution to the highest resolution to obtain a first block to be encoded; and down-sampling the first block to be encoded from the highest resolution to the third resolution to obtain the first reconstructed block.

Adjusting the resolution adopted in encoding each video block in the reconstructed frame of the forward reference frame to the third resolution to obtain the first reconstructed frame includes: up-sampling each video block in the reconstructed frame of the forward reference frame from the resolution adopted in encoding to the highest resolution to obtain a first forward reference frame; and down-sampling each video block in the first forward reference frame from the highest resolution to the third resolution to obtain the first reconstructed frame.

Adjusting the resolution adopted in encoding each video block in the reconstructed frame of the backward reference frame to the third resolution to obtain the second reconstructed frame includes: up-sampling each video block in the reconstructed frame of the backward reference frame from the resolution adopted in encoding to the highest resolution to obtain a first backward reference frame; and down-sampling each video block in the first backward reference frame from the highest resolution to the third resolution to obtain the second reconstructed frame.

In the embodiment of the present invention, when the third resolution is lower than the highest resolution in the resolution set, up-sampling to the highest resolution may be performed first, followed by down-sampling to the third resolution.

Optionally, in the case that the resolution adopted by the block to be encoded in encoding is the original resolution, adjusting the block to be encoded, the reconstructed frame of the forward reference frame and the reconstructed frame of the backward reference frame to the target resolution includes: adjusting the resolution adopted in encoding each video block in the reconstructed frame of the forward reference frame to the original resolution to obtain the first reconstructed frame, and adjusting the resolution adopted in encoding each video block in the reconstructed frame of the backward reference frame to the original resolution to obtain the second reconstructed frame, wherein the target resolution is the original resolution, the forward reference frame includes at least 2 video blocks encoded at different resolutions, and the backward reference frame includes at least 2 video blocks encoded at different resolutions. It can be understood that, in the encoding process, different resolutions are adopted for encoding different video blocks in a video frame; when the resolution adopted by the block to be encoded in encoding is the original resolution, the reconstructed frame of the original frame corresponding to the forward reference frame can be determined as the first reconstructed frame, and the reconstructed frame of the original frame corresponding to the backward reference frame can be determined as the second reconstructed frame.

It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.

According to still another aspect of an embodiment of the present invention, there is also provided a video decoding apparatus for implementing the above video decoding method, as shown in fig. 6, the apparatus including:

a first obtaining unit 602, configured to obtain a motion vector MV of a block to be decoded in a video frame to be decoded;

a first adjusting unit 604, configured to adjust a resolution of a block to be decoded to a target resolution, obtain a first reconstructed block, adjust a resolution of a reconstructed frame of a forward reference frame of the block to be decoded to the target resolution, obtain a first reconstructed frame, and adjust a resolution of a reconstructed frame of a backward reference frame of the block to be decoded to the target resolution, obtain a second reconstructed frame;

a synthesizing unit 606, configured to synthesize the first reconstructed frame and the second reconstructed frame into a virtual reference frame;

a determining unit 608, configured to determine pixel values of the first reconstructed block according to the virtual reference frame and the motion vector MV of the first reconstructed block relative to the corresponding region in the virtual reference frame, where the motion vector MV of the first reconstructed block relative to the corresponding region in the virtual reference frame is equal to the motion vector MV of the block to be decoded, and the corresponding region is a region corresponding to the first reconstructed block in the virtual reference frame.

For specific embodiments, refer to the examples given for the video decoding method; details are not repeated here.

As an alternative, the first adjusting unit includes: the first adjusting module is used for adjusting the first resolution adopted by the block to be decoded in decoding to be a third resolution to obtain a first reconstruction block, wherein the target resolution is the third resolution; the second adjusting module is used for adjusting the resolution adopted by each video block in the reconstructed frame of the forward reference frame to be the third resolution to obtain a first reconstructed frame, wherein the forward reference frame comprises at least 2 video blocks with different resolutions during decoding; and the third adjusting module is used for adjusting the resolution adopted by each video block in the reconstructed frame of the backward reference frame to be the third resolution to obtain a second reconstructed frame, wherein the backward reference frame comprises at least 2 video blocks with different resolutions during decoding.

As an alternative, the apparatus may further include: the second obtaining unit is configured to obtain a syntax element carried in data to be decoded corresponding to the block to be decoded before adjusting the first resolution adopted by the block to be decoded in decoding to the third resolution, where the syntax element is used to indicate the third resolution.

As an alternative, the third resolution is the original resolution of the block to be decoded, or the third resolution is the highest resolution in a predetermined set of resolutions.

As an alternative, in case the third resolution is lower than the highest resolution of the predetermined set of resolutions, the first adjustment module is specifically configured to: up-sampling a first resolution adopted by a block to be decoded in decoding to the highest resolution to obtain the first block to be decoded; downsampling the resolution of the first block to be decoded from the highest resolution to a third resolution to obtain a first reconstructed block; the second adjusting module is specifically configured to: up-sampling the resolution adopted by each video block in the reconstructed frame of the forward reference frame to the highest resolution during decoding to obtain a first forward reference frame; downsampling the resolution of each video block in the first forward reference frame from a highest resolution to a third resolution to obtain a first reconstructed frame; the third adjustment module is specifically configured to: up-sampling the resolution adopted by each video block in the reconstructed frame of the backward reference frame to the highest resolution during decoding to obtain a first backward reference frame; and downsampling the resolution of each video block in the first backward reference frame from the highest resolution to a third resolution to obtain a second reconstructed frame.

As an alternative, in the case where the resolution adopted by the block to be decoded at the time of decoding is the original resolution, the first adjustment unit includes: and a fourth adjustment module, configured to adjust the resolution adopted by each video block in the reconstructed frame of the forward reference frame to the original resolution, obtain a first reconstructed frame, and adjust the resolution adopted by each video block in the reconstructed frame of the backward reference frame to the original resolution, obtain a second reconstructed frame, where the target resolution is the original resolution, the forward reference frame includes at least 2 video blocks adopting different resolutions when decoding, and the backward reference frame includes at least 2 video blocks adopting different resolutions when decoding.

According to still another aspect of an embodiment of the present invention, there is provided a video encoding apparatus, as shown in fig. 7, including:

a first adjusting unit 702, configured to adjust a resolution of a block to be encoded in a video frame to be encoded to a target resolution, obtain a first reconstructed block, adjust a resolution of a reconstructed frame of a forward reference frame of the block to be encoded to the target resolution, obtain the first reconstructed frame, and adjust a resolution of a reconstructed frame of a backward reference frame of the block to be encoded to the target resolution, obtain a second reconstructed frame;

a synthesizing unit 704, configured to synthesize the first reconstructed frame and the second reconstructed frame into a virtual reference frame;

a determining unit 706, configured to determine the motion vector MV of the first reconstructed block relative to a corresponding region in the virtual reference frame as the motion vector MV of the block to be encoded, where the corresponding region is a region corresponding to the first reconstructed block in the virtual reference frame.

For specific embodiments, refer to the examples given for the video encoding method; details are not repeated here.

As an alternative, the first adjusting unit includes: the first adjusting module is used for adjusting the first resolution adopted by the block to be encoded in encoding to a third resolution to obtain a first reconstruction block, wherein the target resolution is the third resolution; the second adjusting module is used for adjusting the resolution adopted by each video block in the reconstructed frame of the forward reference frame to be the third resolution to obtain a first reconstructed frame, wherein the forward reference frame comprises at least 2 video blocks with different resolutions during encoding; and the third adjusting module is used for adjusting the resolution adopted by each video block in the reconstructed frame of the backward reference frame to be the third resolution to obtain a second reconstructed frame, wherein the backward reference frame comprises at least 2 video blocks with different resolutions during encoding.

As an alternative, the apparatus may further include: and an adding unit, configured to add a syntax element to data to be encoded corresponding to the block to be encoded after determining a motion vector MV of the first reconstructed block relative to a corresponding region in the virtual reference frame as the motion vector MV of the block to be encoded, where the syntax element is used to indicate the third resolution.

As an alternative, the third resolution is the original resolution of the block to be encoded, or the third resolution is the highest resolution in a predetermined set of resolutions.

As an alternative, in the case that the third resolution is lower than the highest resolution in the predetermined set of resolutions, the first adjusting module is specifically configured to: up-sample the block to be encoded from the first resolution adopted in encoding to the highest resolution to obtain a first block to be encoded; and down-sample the first block to be encoded from the highest resolution to the third resolution to obtain the first reconstructed block. The second adjusting module is specifically configured to: up-sample each video block in the reconstructed frame of the forward reference frame from the resolution adopted in encoding to the highest resolution to obtain a first forward reference frame; and down-sample each video block in the first forward reference frame from the highest resolution to the third resolution to obtain the first reconstructed frame. The third adjusting module is specifically configured to: up-sample each video block in the reconstructed frame of the backward reference frame from the resolution adopted in encoding to the highest resolution to obtain a first backward reference frame; and down-sample each video block in the first backward reference frame from the highest resolution to the third resolution to obtain the second reconstructed frame.

As an alternative, in the case where the resolution employed in encoding the block to be encoded is the original resolution, the first adjustment unit includes: and a fourth adjustment module, configured to adjust the resolution adopted by each video block in the reconstructed frame of the forward reference frame to be the original resolution, obtain a first reconstructed frame, and adjust the resolution adopted by each video block in the reconstructed frame of the backward reference frame to be the original resolution, obtain a second reconstructed frame, where the target resolution is the original resolution, the forward reference frame includes at least 2 video blocks adopting different resolutions during encoding, and the backward reference frame includes at least 2 video blocks adopting different resolutions during encoding.

According to a further aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the above-described video decoding method, as shown in fig. 8, the electronic device comprising a memory 802 and a processor 804, the memory 802 storing a computer program, the processor 804 being arranged to perform the steps of any of the method embodiments described above by means of the computer program.

Optionally, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of a computer network.

Optionally, in this embodiment, the above-described processor may be configured to execute the following steps by means of the computer program:

S1, obtaining a motion vector MV of a block to be decoded in a video frame to be decoded;

S2, adjusting the resolution of the block to be decoded to a target resolution to obtain a first reconstructed block, adjusting the resolution of a reconstructed frame of a forward reference frame of the block to be decoded to the target resolution to obtain a first reconstructed frame, and adjusting the resolution of a reconstructed frame of a backward reference frame of the block to be decoded to the target resolution to obtain a second reconstructed frame;

S3, synthesizing the first reconstructed frame and the second reconstructed frame into a virtual reference frame;

S4, determining pixel values of the first reconstructed block according to the virtual reference frame and the motion vector MV of the first reconstructed block relative to the corresponding region in the virtual reference frame, wherein the motion vector MV of the first reconstructed block relative to the corresponding region in the virtual reference frame is equal to the motion vector MV of the block to be decoded, and the corresponding region is the region corresponding to the first reconstructed block in the virtual reference frame.
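The synthesis and pixel-value steps on the decoding side can be sketched as below, assuming the frames are already at the target resolution. Averaging the two reconstructed frames is an assumption made for illustration, as is the use of integer-sample motion vectors; adding a decoded residual, which a practical decoder would normally do, is omitted because the text does not describe it. Function names are hypothetical.

```python
def synthesize(first_frame, second_frame):
    """Virtual reference frame as a rounded per-sample average (assumption)."""
    return [
        [(a + b + 1) // 2 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first_frame, second_frame)
    ]


def predict_block(virtual_ref, top, left, mv, height, width):
    """Fetch the region that the motion vector points to.

    (top, left) is the position of the first reconstructed block and
    mv = (dx, dy) is the motion vector MV of the block to be decoded,
    in integer samples (assumption).
    """
    dx, dy = mv
    y0, x0 = top + dy, left + dx
    inside = (0 <= y0 <= len(virtual_ref) - height
              and 0 <= x0 <= len(virtual_ref[0]) - width)
    if not inside:
        raise ValueError("motion vector points outside the virtual reference frame")
    return [row[x0:x0 + width] for row in virtual_ref[y0:y0 + height]]
```

Because the first reconstructed block and the virtual reference frame share the target resolution, the signalled motion vector can be applied without rescaling.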

Optionally, it will be understood by those skilled in the art that the structure shown in fig. 8 is only schematic, and the electronic device may also be a terminal device such as a smart phone (e.g. an Android phone, an iOS phone, etc.), a tablet computer, a palm computer, a mobile internet device (Mobile Internet Device, MID), or a PAD. Fig. 8 does not limit the structure of the electronic device. For example, the electronic device may also include more or fewer components (e.g., network interfaces, etc.) than shown in fig. 8, or have a different configuration than shown in fig. 8.

The memory 802 may be used to store software programs and modules, such as program instructions/modules corresponding to the video decoding method and apparatus in the embodiment of the present invention. The processor 804 executes the software programs and modules stored in the memory 802, thereby performing various functional applications and data processing, that is, implementing the video decoding method described above. The memory 802 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 802 may further include memory remotely located relative to the processor 804, which may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 802 may specifically be used for, but is not limited to, storing information such as the block to be decoded. As an example, as shown in fig. 8, the memory 802 may include, but is not limited to, the first obtaining unit 602, the first adjusting unit 604, the synthesizing unit 606, and the determining unit 608 in the video decoding apparatus. It may further include, but is not limited to, other module units in the video decoding apparatus, which are not described in detail in this example.

Optionally, the transmission device 806 is used to receive or transmit data via a network. Specific examples of the network may include wired networks and wireless networks. In one example, the transmission device 806 includes a network interface controller (NIC) that can be connected to other network devices and routers via a network cable so as to communicate with the internet or a local area network. In another example, the transmission device 806 is a radio frequency (RF) module for communicating with the internet wirelessly.

In addition, the electronic device further includes: a display 808 for displaying the decoded video; and a connection bus 810 for connecting the respective module parts in the above-described electronic device.

According to a further aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the video encoding method described above, as shown in fig. 9, the electronic device comprising a memory 902 and a processor 904, the memory 902 having stored therein a computer program, the processor 904 being arranged to perform the steps of any of the method embodiments described above by means of the computer program. Optionally, in this embodiment, the processor may be configured to execute the following steps by means of the computer program:

S1, adjusting the resolution of a block to be encoded in a video frame to be encoded to a target resolution to obtain a first reconstructed block, adjusting the resolution of a reconstructed frame of a forward reference frame of the block to be encoded to the target resolution to obtain a first reconstructed frame, and adjusting the resolution of a reconstructed frame of a backward reference frame of the block to be encoded to the target resolution to obtain a second reconstructed frame;

S2, synthesizing the first reconstructed frame and the second reconstructed frame into a virtual reference frame;

S3, determining the motion vector MV of the first reconstructed block relative to a corresponding region in the virtual reference frame as the motion vector MV of the block to be encoded, wherein the corresponding region is the region corresponding to the first reconstructed block in the virtual reference frame.
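Steps S1 to S3 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the helper names (resize_nearest, synthesize, find_mv), the nearest-neighbour resampling, the rounded average used to synthesize the virtual reference frame, and the full-search SAD matching are all assumptions chosen for brevity, and the embodiment does not prescribe any of them.

```python
# Illustrative sketch of S1-S3 on 2-D lists of pixel values.
# All names and algorithmic choices here are assumptions, not the patented method.

def resize_nearest(img, out_h, out_w):
    """S1: resample an image to out_h x out_w with nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def synthesize(fwd, bwd):
    """S2: merge two equally sized frames (assumed here: rounded pixel average)."""
    return [[(a + b + 1) // 2 for a, b in zip(ra, rb)] for ra, rb in zip(fwd, bwd)]

def sad(block, frame, top, left):
    """Sum of absolute differences between block and the frame region at (top, left)."""
    return sum(abs(block[r][c] - frame[top + r][left + c])
               for r in range(len(block)) for c in range(len(block[0])))

def find_mv(block, frame, top, left, search=2):
    """S3: full search for the displacement (dy, dx) with the smallest SAD."""
    bh, bw = len(block), len(block[0])
    fh, fw = len(frame), len(frame[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if 0 <= t <= fh - bh and 0 <= l <= fw - bw:
                cost = sad(block, frame, t, l)
                if best is None or cost < best[0]:
                    best = (cost, (dy, dx))
    if best is None:
        raise ValueError("no candidate position lies inside the frame")
    return best[1]

# Toy usage: references of different sizes are brought to an assumed 8x8 target.
fwd = resize_nearest([[10 * r + c for c in range(4)] for r in range(4)], 8, 8)
bwd = resize_nearest([[3 * r + 5 * c for c in range(8)] for r in range(8)], 8, 8)
virtual = synthesize(fwd, bwd)                      # S2: virtual reference frame
block = resize_nearest([[12, 40], [77, 90]], 4, 4)  # S1: 2x2 block scaled to 4x4
mv = find_mv(block, virtual, top=2, left=2)         # S3: MV of the block
```

Because both reference frames and the block share the target resolution before matching, the displacement found in S3 is expressed in a single coordinate system, which is the point of the resolution adjustment in S1.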

Optionally, it will be understood by those skilled in the art that the structure shown in fig. 9 is only schematic. The electronic device may be a terminal device such as a smart phone (e.g. an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile internet device (Mobile Internet Device, MID), or a PAD. Fig. 9 does not limit the structure of the electronic device. For example, the electronic device may include more or fewer components (e.g., network interfaces) than shown in fig. 9, or have a configuration different from that shown in fig. 9.

The memory 902 may be used to store software programs and modules, such as the program instructions/modules corresponding to the video encoding method and apparatus in the embodiments of the present invention. The processor 904 runs the software programs and modules stored in the memory 902 to perform various functional applications and data processing, that is, to implement the video encoding method described above. The memory 902 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 902 may further include memory located remotely from the processor 904, and such remote memory may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. Specifically, the memory 902 may be used to store, without limitation, information such as the block to be encoded. As an example, as shown in fig. 9, the memory 902 may include, but is not limited to, the first adjusting unit 702, the synthesizing unit 704, and the determining unit 706 of the video encoding apparatus. It may also include, without limitation, other module units of the video encoding apparatus, which are not described in detail in this example.

Optionally, the transmission device 906 is used to receive or transmit data via a network. Specific examples of the network may include wired networks and wireless networks. In one example, the transmission device 906 includes a network adapter (Network Interface Controller, NIC) that can be connected to other network devices and routers via a network cable so as to communicate with the internet or a local area network. In another example, the transmission device 906 is a Radio Frequency (RF) module used to communicate with the internet wirelessly.

In addition, the electronic device further includes: a display 908 for displaying video before encoding; and a connection bus 910 for connecting the respective module parts in the above-described electronic device.

An embodiment of the invention also provides a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.

Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of:

Optionally, the storage medium is further arranged to store a computer program for performing the steps of:

Optionally, the storage medium is further configured to store a computer program for executing the steps included in the method in the above embodiment, which is not described in detail in this embodiment.

Optionally, in this embodiment, those skilled in the art will understand that all or part of the steps in the methods of the above embodiments may be completed by a program that instructs a terminal device to execute them. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and the like.

The serial numbers of the foregoing embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

If the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as separate products, they may be stored in the above-described computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to perform all or part of the steps of the methods described in the embodiments of the present invention.

In the foregoing embodiments of the present application, the description of each embodiment has its own emphasis. For a portion that is not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.

In the several embodiments provided by the present application, it should be understood that the disclosed client may be implemented in other manners. The apparatus embodiments described above are merely exemplary. For example, the division of the units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are also intended to fall within the scope of protection of the present invention.

Claims

1. A video decoding method, comprising:

acquiring a motion vector MV of a block to be decoded in a video frame to be decoded;

acquiring a syntax element carried in data to be decoded corresponding to the block to be decoded, wherein the syntax element is used for indicating a third resolution;

adjusting the first resolution adopted by the block to be decoded in decoding to the third resolution to obtain a first reconstruction block; adjusting the resolution adopted in decoding by each video block in the reconstructed frame of the forward reference frame of the block to be decoded to the third resolution to obtain a first reconstructed frame, wherein the forward reference frame comprises at least 2 video blocks adopting different resolutions in decoding; adjusting the resolution adopted in decoding by each video block in the reconstructed frame of the backward reference frame of the block to be decoded to the third resolution to obtain a second reconstructed frame, wherein the backward reference frame comprises at least 2 video blocks adopting different resolutions in decoding;

synthesizing the first reconstructed frame and the second reconstructed frame into a virtual reference frame;

and determining a pixel value of the first reconstruction block according to the motion vector MV of the first reconstruction block relative to a corresponding region in the virtual reference frame and the virtual reference frame, wherein the motion vector MV of the first reconstruction block relative to the corresponding region in the virtual reference frame is equal to the motion vector MV of the block to be decoded, and the corresponding region is a region corresponding to the first reconstruction block in the virtual reference frame.
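The final step of claim 1 can be pictured as ordinary motion compensation against the virtual reference frame. The following Python sketch is only an illustration under stated assumptions: the function name predict_block, the integer-pixel motion vector, the optional residual argument, and the 8-bit clipping range are introduced here and are not taken from the claim.

```python
# Illustrative sketch: derive the pixel values of a block from a virtual
# reference frame and a motion vector. Names and details are assumptions.

def predict_block(virtual_ref, top, left, height, width, mv, residual=None):
    """Copy the region of virtual_ref displaced by mv = (dy, dx) from (top, left).

    If a residual block is supplied (assumed here to be already decoded),
    it is added to the prediction and the result is clipped to 0..255.
    """
    dy, dx = mv
    t, l = top + dy, left + dx
    if not (0 <= t <= len(virtual_ref) - height
            and 0 <= l <= len(virtual_ref[0]) - width):
        raise ValueError("motion vector points outside the virtual reference frame")
    pred = [row[l:l + width] for row in virtual_ref[t:t + height]]
    if residual is None:
        return pred
    return [[max(0, min(255, p + e)) for p, e in zip(pr, er)]
            for pr, er in zip(pred, residual)]

# Toy usage: an 8x8 virtual reference frame and a 2x2 block located at (2, 3).
virtual_ref = [[r * 8 + c for c in range(8)] for r in range(8)]
pixels = predict_block(virtual_ref, top=2, left=3, height=2, width=2, mv=(1, 1))
```

The motion vector can be applied unchanged because the block and the virtual reference frame are both at the third resolution, matching the claim's condition that the MV of the first reconstruction block equals the MV of the block to be decoded.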

2. The method of claim 1, wherein the third resolution is an original resolution of the block to be decoded or the third resolution is a highest resolution of a predetermined set of resolutions.

3. The method of claim 1, wherein, in the event that the third resolution is lower than the highest resolution in the predetermined set of resolutions,

the adjusting the first resolution adopted by the block to be decoded in decoding to the third resolution to obtain the first reconstructed block includes: upsampling the first resolution adopted by the block to be decoded in decoding to the highest resolution to obtain a first block to be decoded; downsampling the resolution of the first block to be decoded from the highest resolution to the third resolution to obtain the first reconstructed block;

the adjusting the resolution adopted in decoding by each video block in the reconstructed frame of the forward reference frame to the third resolution to obtain the first reconstructed frame includes: upsampling the resolution adopted in decoding by each video block in the reconstructed frame of the forward reference frame to the highest resolution to obtain a first forward reference frame; and downsampling the resolution of each video block in the first forward reference frame from the highest resolution to the third resolution to obtain the first reconstructed frame;

the step of adjusting the resolution adopted in decoding each video block in the reconstructed frame of the backward reference frame to the third resolution to obtain the second reconstructed frame includes: up-sampling the resolution adopted by each video block in the reconstructed frame of the backward reference frame to the highest resolution during decoding to obtain a first backward reference frame; and downsampling the resolution of each video block in the first backward reference frame from the highest resolution to the third resolution to obtain the second reconstructed frame.
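The two-stage adjustment of claim 3 (upsample to the highest resolution, then downsample to the third resolution) can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation: the function names and the nearest-neighbour resampling are assumptions, and a real codec would typically use interpolation filters instead.

```python
# Illustrative sketch of the two-stage resolution adjustment.
# Function names and the nearest-neighbour filter are assumptions.

def resample_nearest(img, out_h, out_w):
    """Resample a 2-D list of pixel values to out_h x out_w."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def adjust_via_highest(img, highest, third):
    """Upsample img to highest = (h, w), then downsample to third = (h, w)."""
    if third[0] > highest[0] or third[1] > highest[1]:
        raise ValueError("third resolution must not exceed the highest resolution")
    upsampled = resample_nearest(img, highest[0], highest[1])
    return resample_nearest(upsampled, third[0], third[1])

# Toy usage: a 2x2 block is raised to 8x8 and then reduced to 4x4.
block = adjust_via_highest([[1, 2], [3, 4]], highest=(8, 8), third=(4, 4))
```

Routing every block through the same highest resolution gives blocks that were decoded at different resolutions a common intermediate grid before they are brought to the third resolution.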

4. The method according to claim 1, wherein, in case the resolution employed by the block to be decoded at the time of decoding is the original resolution,

the adjusting the first resolution adopted by the block to be decoded in decoding to the third resolution to obtain a first reconstruction block, the adjusting the resolution adopted in decoding by each video block in the reconstructed frame of the forward reference frame of the block to be decoded to the third resolution to obtain a first reconstructed frame, and the adjusting the resolution adopted in decoding by each video block in the reconstructed frame of the backward reference frame of the block to be decoded to the third resolution to obtain a second reconstructed frame include: adjusting the resolution adopted in decoding by each video block in the reconstructed frame of the forward reference frame to the original resolution to obtain the first reconstructed frame, and adjusting the resolution adopted in decoding by each video block in the reconstructed frame of the backward reference frame to the original resolution to obtain the second reconstructed frame, wherein the third resolution is the original resolution, the forward reference frame comprises at least 2 video blocks adopting different resolutions in decoding, and the backward reference frame comprises at least 2 video blocks adopting different resolutions in decoding.

5. A video encoding method, comprising:

adjusting a first resolution adopted by a block to be encoded in encoding to a third resolution to obtain a first reconstruction block; adjusting the resolution adopted in encoding by each video block in the reconstructed frame of the forward reference frame of the block to be encoded to the third resolution to obtain a first reconstructed frame, wherein the forward reference frame comprises at least 2 video blocks adopting different resolutions in encoding; adjusting the resolution adopted in encoding by each video block in the reconstructed frame of the backward reference frame of the block to be encoded to the third resolution to obtain a second reconstructed frame, wherein the backward reference frame comprises at least 2 video blocks adopting different resolutions in encoding;

determining a motion vector MV of the first reconstruction block relative to a corresponding region in the virtual reference frame as the motion vector MV of the block to be encoded, wherein the corresponding region is a region corresponding to the first reconstruction block in the virtual reference frame;

and adding a syntax element to the data to be encoded corresponding to the block to be encoded, wherein the syntax element is used for indicating the third resolution.

6. The method of claim 5, wherein the third resolution is an original resolution of the block to be encoded or the third resolution is a highest resolution of a predetermined set of resolutions.

7. The method of claim 5, wherein, in the event that the third resolution is lower than the highest resolution in the predetermined set of resolutions,

the step of adjusting the first resolution adopted by the block to be encoded to the third resolution to obtain the first reconstructed block includes: upsampling the first resolution adopted by the block to be coded in coding to the highest resolution to obtain a first block to be coded; downsampling the resolution of the first block to be coded from the highest resolution to the third resolution to obtain the first reconstructed block;

the step of adjusting the resolution adopted in encoding by each video block in the reconstructed frame of the forward reference frame to the third resolution to obtain the first reconstructed frame includes: upsampling the resolution adopted in encoding by each video block in the reconstructed frame of the forward reference frame to the highest resolution to obtain a first forward reference frame; and downsampling the resolution of each video block in the first forward reference frame from the highest resolution to the third resolution to obtain the first reconstructed frame;

the step of adjusting the resolution adopted in encoding by each video block in the reconstructed frame of the backward reference frame to the third resolution to obtain the second reconstructed frame includes: upsampling the resolution adopted in encoding by each video block in the reconstructed frame of the backward reference frame to the highest resolution to obtain a first backward reference frame; and downsampling the resolution of each video block in the first backward reference frame from the highest resolution to the third resolution to obtain the second reconstructed frame.

8. The method of claim 5, wherein, in the case where the resolution employed in encoding the block to be encoded is the original resolution,

the adjusting the first resolution adopted by the block to be encoded in encoding to the third resolution to obtain a first reconstruction block, the adjusting the resolution adopted in encoding by each video block in the reconstructed frame of the forward reference frame to the third resolution to obtain a first reconstructed frame, and the adjusting the resolution adopted in encoding by each video block in the reconstructed frame of the backward reference frame to the third resolution to obtain the second reconstructed frame include: adjusting the resolution adopted in encoding by each video block in the reconstructed frame of the forward reference frame to the original resolution to obtain the first reconstructed frame, and adjusting the resolution adopted in encoding by each video block in the reconstructed frame of the backward reference frame to the original resolution to obtain the second reconstructed frame, wherein the third resolution is the original resolution, the forward reference frame comprises at least 2 video blocks adopting different resolutions in encoding, and the backward reference frame comprises at least 2 video blocks adopting different resolutions in encoding.

9. A video decoding apparatus, comprising:

a first acquisition unit, used for acquiring a motion vector MV of a block to be decoded in a video frame to be decoded;

a second acquisition unit, used for acquiring a syntax element carried in data to be decoded corresponding to the block to be decoded, wherein the syntax element is used for indicating a third resolution;

a first adjusting unit, comprising: a first adjusting module, used for adjusting the first resolution adopted by the block to be decoded in decoding to the third resolution to obtain a first reconstruction block; a second adjusting module, used for adjusting the resolution adopted in decoding by each video block in the reconstructed frame of the forward reference frame of the block to be decoded to the third resolution to obtain a first reconstructed frame, wherein the forward reference frame comprises at least 2 video blocks adopting different resolutions in decoding; and a third adjusting module, used for adjusting the resolution adopted in decoding by each video block in the reconstructed frame of the backward reference frame of the block to be decoded to the third resolution to obtain a second reconstructed frame, wherein the backward reference frame comprises at least 2 video blocks adopting different resolutions in decoding;

a synthesizing unit, configured to synthesize the first reconstructed frame and the second reconstructed frame into a virtual reference frame;

and a determining unit, used for determining the pixel value of the first reconstruction block according to the motion vector MV of the first reconstruction block relative to the corresponding region in the virtual reference frame and the virtual reference frame, wherein the motion vector MV of the first reconstruction block relative to the corresponding region in the virtual reference frame is equal to the motion vector MV of the block to be decoded, and the corresponding region is the region corresponding to the first reconstruction block in the virtual reference frame.

10. The apparatus of claim 9, wherein the third resolution is an original resolution of the block to be decoded or the third resolution is a highest resolution of a predetermined set of resolutions.

11. The apparatus of claim 9, wherein, in the case where the third resolution is lower than the highest resolution in the predetermined set of resolutions,

the first adjusting module is used for: upsampling the first resolution adopted by the block to be decoded in decoding to the highest resolution to obtain a first block to be decoded; downsampling the resolution of the first block to be decoded from the highest resolution to the third resolution to obtain the first reconstructed block;

the second adjusting module is used for: upsampling the resolution adopted in decoding by each video block in the reconstructed frame of the forward reference frame to the highest resolution to obtain a first forward reference frame; and downsampling the resolution of each video block in the first forward reference frame from the highest resolution to the third resolution to obtain the first reconstructed frame;

the third adjusting module is used for: upsampling the resolution adopted in decoding by each video block in the reconstructed frame of the backward reference frame to the highest resolution to obtain a first backward reference frame; and downsampling the resolution of each video block in the first backward reference frame from the highest resolution to the third resolution to obtain the second reconstructed frame.

12. The apparatus of claim 9, wherein, in the case where the resolution employed by the block to be decoded at the time of decoding is the original resolution,

the first adjusting unit further includes: and a fourth adjustment module, configured to adjust a resolution adopted when each video block in a reconstructed frame of the forward reference frame is decoded to the original resolution, obtain the first reconstructed frame, and adjust a resolution adopted when each video block in a reconstructed frame of the backward reference frame is decoded to the original resolution, obtain the second reconstructed frame, where the third resolution is the original resolution, the forward reference frame includes at least 2 video blocks adopting different resolutions when decoding, and the backward reference frame includes at least 2 video blocks adopting different resolutions when decoding.

13. A video encoding apparatus, comprising:

a first adjusting unit, comprising: a first adjusting module, used for adjusting a first resolution adopted by a block to be encoded in encoding to a third resolution to obtain a first reconstruction block; a second adjusting module, used for adjusting the resolution adopted in encoding by each video block in the reconstructed frame of the forward reference frame of the block to be encoded to the third resolution to obtain a first reconstructed frame, wherein the forward reference frame comprises at least 2 video blocks adopting different resolutions in encoding; and a third adjusting module, used for adjusting the resolution adopted in encoding by each video block in the reconstructed frame of the backward reference frame of the block to be encoded to the third resolution to obtain a second reconstructed frame, wherein the backward reference frame comprises at least 2 video blocks adopting different resolutions in encoding;

a determining unit, configured to determine a motion vector MV of the first reconstructed block with respect to a corresponding region in the virtual reference frame as the motion vector MV of the block to be encoded, where the corresponding region is a region corresponding to the first reconstructed block in the virtual reference frame;

and an adding unit, used for adding a syntax element to the data to be encoded corresponding to the block to be encoded, wherein the syntax element is used for indicating the third resolution.

14. The apparatus of claim 13, wherein the third resolution is an original resolution of the block to be encoded or the third resolution is a highest resolution of a predetermined set of resolutions.

15. The apparatus of claim 13, wherein, in the case where the third resolution is lower than the highest resolution in the predetermined set of resolutions,

the first adjusting module is used for: upsampling the first resolution adopted by the block to be encoded in encoding to the highest resolution to obtain a first block to be encoded; and downsampling the resolution of the first block to be encoded from the highest resolution to the third resolution to obtain the first reconstructed block; the second adjusting module is used for: upsampling the resolution adopted in encoding by each video block in the reconstructed frame of the forward reference frame to the highest resolution to obtain a first forward reference frame; and downsampling the resolution of each video block in the first forward reference frame from the highest resolution to the third resolution to obtain the first reconstructed frame; the third adjusting module is used for: upsampling the resolution adopted in encoding by each video block in the reconstructed frame of the backward reference frame to the highest resolution to obtain a first backward reference frame; and downsampling the resolution of each video block in the first backward reference frame from the highest resolution to the third resolution to obtain the second reconstructed frame.

16. The apparatus according to claim 13, wherein the first adjustment unit further includes, in the case where the resolution employed by the block to be encoded at the time of encoding is the original resolution:

and a fourth adjustment module, configured to adjust a resolution adopted when each video block in a reconstructed frame of the forward reference frame is encoded to the original resolution, obtain the first reconstructed frame, and adjust a resolution adopted when each video block in a reconstructed frame of the backward reference frame is encoded to the original resolution, obtain the second reconstructed frame, where the third resolution is the original resolution, the forward reference frame includes at least 2 video blocks adopting different resolutions when encoding, and the backward reference frame includes at least 2 video blocks adopting different resolutions when encoding.

17. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method according to any of the claims 1 to 8 by means of the computer program.