CN110572674B - Video encoding and decoding method and device, storage medium and electronic device

Info

Publication number: CN110572674B
Application number: CN201910927041.8A
Authority: CN
Inventors: 高欣玮; 谷沉沉
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-09-27
Filing date: 2019-09-27
Publication date: 2024-03-15
Anticipated expiration: 2039-09-27
Also published as: CN110572674A

Abstract

The invention discloses a video encoding and decoding method and device, a storage medium and an electronic device. The method comprises: acquiring motion vector data MVD of a block to be decoded, carried in the to-be-decoded data corresponding to that block in a video frame to be decoded, a motion vector MV of a reference block, a first resolution adopted by the block to be decoded in decoding and a second resolution adopted by the reference block in decoding; when the first resolution and the second resolution are different, adjusting the resolution of the block to be decoded to a target resolution to obtain a first reconstruction block, and adjusting the resolution of the reference block to the target resolution to obtain a second reconstruction block; and determining the sum of the motion vector predicted value MVP of the block to be decoded and the motion vector data MVD of the block to be decoded as the motion vector MV of the first reconstruction block relative to the second reconstruction block. The method solves the technical problem that the motion vector cannot be determined when video blocks have different resolutions.

Description

Video encoding and decoding method and device, storage medium and electronic device

Technical Field

The present invention relates to the field of audio/video encoding and decoding, and in particular, to a video encoding and decoding method and apparatus, a storage medium, and an electronic apparatus.

Background

With the development of digital media technology and computer technology, video is applied to various fields such as mobile communication, network monitoring, network television, etc. With the improvement of hardware performance and screen resolution, the demand of users for high-definition video is increasing.

When mobile bandwidth is limited, existing codecs usually encode and decode every video frame at the same resolution, which makes the peak signal-to-noise ratio (PSNR) relatively low at some bandwidths, distorting the video frames and degrading video playing quality. In the related art, distortion of video frames can be reduced by adjusting the resolutions used to encode and decode different video blocks, but doing so makes it impossible to determine the motion vector MV of a block during decoding, so decoding fails.

In view of the above problems, no effective solution has been proposed at present.

Disclosure of Invention

The embodiment of the invention provides a video encoding and decoding method and device, a storage medium and an electronic device, which at least solve the technical problem that motion vectors cannot be determined due to different resolutions of video blocks.

According to an aspect of an embodiment of the present invention, there is provided a video decoding method including: acquiring motion vector data MVD of a to-be-decoded block carried in to-be-decoded data corresponding to the to-be-decoded block in a to-be-decoded video frame, motion vector MV of a reference block, a first resolution adopted by the to-be-decoded block in decoding and a second resolution adopted by the reference block in decoding, wherein the reference block is a block where a region to be referred to by the to-be-decoded block in a decoded reference frame is located, and the size of the to-be-decoded block is the same as that of the reference region; under the condition that the first resolution and the second resolution are different, adjusting the resolution of the block to be decoded to be the target resolution to obtain a first reconstruction block, and adjusting the resolution of the reference block to be the target resolution to obtain a second reconstruction block; and determining the sum of the motion vector predicted value MVP of the block to be decoded and the motion vector data MVD of the block to be decoded as the motion vector MV of the first reconstruction block relative to the second reconstruction block, wherein the motion vector predicted value MVP of the block to be decoded is equal to the motion vector MV of the reference block.

According to another aspect of the embodiment of the present invention, there is also provided a video encoding method, including: acquiring a first resolution adopted in encoding by a block to be encoded in a video frame to be encoded, a second resolution adopted in encoding by a reference block, and a motion vector MV of the reference block, wherein the reference block is the block in which the region referenced by the block to be encoded in the encoded reference frame is located, and the size of the block to be encoded is the same as that of the referenced region; when the first resolution and the second resolution are different, adjusting the resolution of the block to be encoded to a target resolution to obtain a first reconstruction block, and adjusting the resolution of the reference block to the target resolution to obtain a second reconstruction block; and determining the difference between the motion vector MV of the first reconstruction block relative to the second reconstruction block and the motion vector predicted value MVP of the block to be encoded as the motion vector data MVD of the block to be encoded, wherein the motion vector predicted value MVP of the block to be encoded is equal to the motion vector MV of the reference block.

According to another aspect of the embodiment of the present invention, there is also provided a video decoding apparatus including: the first obtaining unit is used for obtaining motion vector data MVD of a to-be-decoded block carried in to-be-decoded data corresponding to the to-be-decoded block in the to-be-decoded video frame, motion vector MV of a reference block, first resolution adopted by the to-be-decoded block in decoding and second resolution adopted by the reference block in decoding, wherein the reference block is a block where a region of the to-be-decoded block referenced in the decoded reference frame is located, and the size of the to-be-decoded block is the same as the size of the referenced region; a first adjusting unit, configured to adjust the resolution of the block to be decoded to a target resolution, obtain a first reconstructed block, and adjust the resolution of the reference block to the target resolution, obtain a second reconstructed block, if the first resolution and the second resolution are different; and a first determining unit configured to determine a sum of a motion vector predictor MVP of the block to be decoded and motion vector data MVD of the block to be decoded as a motion vector MV of the first reconstructed block relative to the second reconstructed block, where the motion vector predictor MVP of the block to be decoded is equal to the motion vector MV of the reference block.

According to another aspect of the embodiment of the present invention, there is also provided a video encoding apparatus including: a first obtaining unit, configured to obtain a first resolution adopted in encoding by a block to be encoded in a video frame to be encoded, a second resolution adopted in encoding by a reference block, and a motion vector MV of the reference block, where the reference block is the block in which the region referenced by the block to be encoded in the encoded reference frame is located, and the size of the block to be encoded is the same as the size of the referenced region; a first adjusting unit, configured to, if the first resolution and the second resolution are different, adjust the resolution of the block to be encoded to a target resolution to obtain a first reconstructed block, and adjust the resolution of the reference block to the target resolution to obtain a second reconstructed block; and a first determining unit, configured to determine the difference between the motion vector MV of the first reconstructed block relative to the second reconstructed block and the motion vector predictor MVP of the block to be encoded as the motion vector data MVD of the block to be encoded, where the motion vector predictor MVP of the block to be encoded is equal to the motion vector MV of the reference block.

According to yet another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is configured to perform the video encoding and decoding method described above when run.

According to still another aspect of the embodiments of the present invention, there is further provided an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the video encoding and decoding method described above through the computer program.

In the embodiment of the invention, the motion vector data MVD of the block to be decoded, the motion vector of the reference block, the first resolution adopted by the block to be decoded in decoding and the second resolution adopted by the reference block in decoding are acquired. When the first resolution and the second resolution are different, the resolutions of the block to be decoded and the reference block are adjusted to the target resolution, and the motion vector of the reference block is taken as the motion vector predicted value of the block to be decoded. The motion vector of the first reconstructed block, obtained by adjusting the resolution of the block to be decoded, is then determined as the sum of the motion vector predicted value of the block to be decoded and the motion vector data of the block to be decoded. This achieves the technical effect that the motion vector can be determined when video blocks have different resolutions, and thereby solves the technical problem that the motion vector MV cannot be determined because of the different resolutions of the video blocks.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:

FIG. 1 is a schematic diagram of an application environment of an alternative video decoding method according to an embodiment of the present invention;

FIG. 2 is a flow chart of an alternative video decoding method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an alternative video decoding method according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of another alternative video decoding method according to an embodiment of the present invention;

FIG. 5 is a flow chart of an alternative video encoding method according to an embodiment of the invention;

FIG. 6 is a schematic structural diagram of an alternative video decoding apparatus according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an alternative video encoding apparatus according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of an alternative electronic device according to an embodiment of the invention;

FIG. 9 is a schematic structural diagram of another alternative electronic device according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of protection of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

According to an aspect of the embodiment of the present invention, a video decoding method is provided. As an optional implementation, the video decoding method may be applied, but is not limited to being applied, in the application environment shown in FIG. 1. The application environment includes a terminal 102 and a server 104, which communicate through a network. The terminal 102 may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, etc. The server 104 may be, but is not limited to, a computer processing device with high data processing capability and a certain amount of storage space.

Note that the video encoding method corresponding to the video decoding method described above may also be applied, but is not limited to being applied, in the application environment shown in FIG. 1. After the video to be encoded is obtained, the video encoding method provided in the present application may be adopted, through the interaction between the terminal 102 and the server 104 shown in FIG. 1, as follows: the block to be encoded is adjusted to the target resolution to obtain the first reconstruction block, the reference block is adjusted to the target resolution to obtain the second reconstruction block, and the difference between the motion vector of the first reconstruction block relative to the second reconstruction block and the motion vector predicted value of the block to be encoded is determined as the motion vector data MVD of the block to be encoded. The video to be encoded can thus be encoded even when the video blocks have different resolutions, and only the motion vector data MVD of the block to be encoded, rather than its motion vector MV, needs to be added to the encoded data, which reduces transmission overhead. Here, the motion vector predicted value of the block to be encoded is equal to the motion vector of the reference block. In addition, after the video to be decoded is obtained, the video decoding method provided in the present application may be adopted, through the interaction between the terminal 102 and the server 104 shown in FIG. 1, as follows: the motion vector of the first reconstruction block, obtained by adjusting the resolution of the block to be decoded, is determined as the sum of the motion vector predicted value of the block to be decoded and the motion vector data of the block to be decoded. In this way the motion vector MV of the block to be decoded can be determined, and the video to be decoded can be decoded, even when the video blocks have different resolutions.
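By way of non-limiting illustration, the relationship between the encoding side and the decoding side described above can be sketched in Python. The function names `encode_mvd` and `decode_mv`, and the representation of a motion vector as an (x, y) tuple of integers expressed at the target resolution, are assumptions made for this sketch and are not defined by the present application.

```python
from typing import Tuple

# Assumed representation: (horizontal, vertical) displacement at the target resolution.
Vector = Tuple[int, int]


def encode_mvd(mv: Vector, mvp: Vector) -> Vector:
    """Encoding side: MVD = MV - MVP, where MVP is the reference block's MV."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def decode_mv(mvp: Vector, mvd: Vector) -> Vector:
    """Decoding side: MV = MVP + MVD."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


reference_mv = (6, -2)  # MV of the reference block, used as the MVP
true_mv = (7, -4)       # MV of the first reconstruction block relative to the second

mvd = encode_mvd(true_mv, reference_mv)   # only the MVD is added to the encoded data
recovered = decode_mv(reference_mv, mvd)  # the decoding side rebuilds the MV
```

Because both sides derive the same MVP from the reference block, the decoding side recovers the MV from the MVD alone.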

In one embodiment, the terminal 102 may include, but is not limited to, the following: an image processing unit 1021, a processor 1022, a storage medium 1023, a memory 1024, a network interface 1025, a display screen 1026, and an input device 1027. The components described above may be connected by, but are not limited to, a system bus 1028. The image processing unit 1021 is used for providing at least the drawing capability of the display interface; the processor 1022 is configured to provide computing and control capabilities to support operation of the terminal 102; the storage medium 1023 stores an operating system 1023-2 and a video encoder and/or a video decoder 1023-4. The operating system 1023-2 is used to provide control operation instructions, and the video encoder and/or video decoder 1023-4 is used to perform encoding/decoding operations in accordance with the control operation instructions. In addition, the memory 1024 provides an operating environment for the video encoder and/or video decoder 1023-4 in the storage medium 1023, and the network interface 1025 is used for network communication with the network interface 1043 in the server 104. The display screen 1026 is used for displaying application interfaces and the like, such as decoded video; the input device 1027 is used to receive commands or data input by a user, and the like. For a terminal 102 with a touch screen, the display screen 1026 and the input device 1027 may be implemented as the touch screen. The internal structure of the terminal shown in FIG. 1 is merely a block diagram of the part of the structure related to the present application and does not constitute a limitation on the terminal to which the present application is applied; a specific terminal or server may include more or fewer components than those shown in the drawings, may combine some components, or may have a different arrangement of components.

In one embodiment, the server 104 may include, but is not limited to, the following: a processor 1041, a memory 1042, a network interface 1043, and a storage medium 1044. The components described above may be connected by, but are not limited to, a system bus 1045. The storage medium 1044 includes an operating system 1044-1, a database 1044-2, and a video encoder and/or a video decoder 1044-3. The processor 1041 is configured to provide computing and control capabilities to support operation of the server 104. The memory 1042 provides an operating environment for the video encoder and/or video decoder 1044-3 in the storage medium 1044. The network interface 1043 communicates with the network interface 1025 of the external terminal 102 through a network connection. The operating system 1044-1 in the storage medium is used to provide control operation instructions; the video encoder and/or video decoder 1044-3 is used for performing encoding/decoding operations according to the control operation instructions; the database 1044-2 is used to store data. The internal structure of the server shown in FIG. 1 is merely a block diagram of the part of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied; a specific computer device may have a different arrangement of components.

In one embodiment, the network may include, but is not limited to, a wired network. The wired network may include, but is not limited to, a wide area network, a metropolitan area network, or a local area network. The above is merely an example, and the present embodiment is not limited thereto.

According to an aspect of an embodiment of the present invention, there is provided a video decoding method, as shown in FIG. 2, including:

S202, obtaining motion vector data MVD of a to-be-decoded block carried in to-be-decoded data corresponding to the to-be-decoded block in a to-be-decoded video frame, a motion vector MV of a reference block, a first resolution adopted by the to-be-decoded block in decoding and a second resolution adopted by the reference block in decoding, wherein the reference block is the block in which the region referenced by the to-be-decoded block in a decoded reference frame is located, and the size of the to-be-decoded block is the same as the size of the referenced region;

S204, under the condition that the first resolution and the second resolution are different, adjusting the resolution of the block to be decoded to the target resolution to obtain a first reconstruction block, and adjusting the resolution of the reference block to the target resolution to obtain a second reconstruction block;

S206, determining the sum of the motion vector predicted value MVP of the block to be decoded and the motion vector data MVD of the block to be decoded as the motion vector MV of the first reconstruction block relative to the second reconstruction block, wherein the motion vector predicted value MVP of the block to be decoded is equal to the motion vector MV of the reference block.

Here, the motion vector MV of the first reconstructed block with respect to the second reconstructed block may be equal to the motion vector MV of the block to be decoded.
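Steps S202 to S206 may be sketched, purely as an illustration, in Python. Everything named here is hypothetical: the `Block` class, the modelling of a block's resolution as the width and height of its sample grid, the nearest-neighbour `resample` helper standing in for whatever resampling filter an implementation would actually use, and the function `decode_motion_vector`. The sketch takes the acquired values of S202 as arguments and does not parse a code stream.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vector = Tuple[int, int]      # (horizontal, vertical) displacement
Resolution = Tuple[int, int]  # (width, height) in samples; an assumed model


@dataclass
class Block:
    samples: List[List[int]]  # rows of pixel values

    @property
    def resolution(self) -> Resolution:
        return (len(self.samples[0]), len(self.samples))


def resample(block: Block, target: Resolution) -> Block:
    """Nearest-neighbour resampling; an assumed stand-in for the real filter."""
    src_w, src_h = block.resolution
    dst_w, dst_h = target
    rows = [
        [block.samples[y * src_h // dst_h][x * src_w // dst_w] for x in range(dst_w)]
        for y in range(dst_h)
    ]
    return Block(rows)


def decode_motion_vector(current: Block, reference: Block, reference_mv: Vector,
                         mvd: Vector, target: Resolution):
    # S204: align both blocks to the target resolution only if they differ.
    if current.resolution != reference.resolution:
        first = resample(current, target)
        second = resample(reference, target)
    else:
        first, second = current, reference
    # S206: the MVP equals the reference block's MV, and MV = MVP + MVD.
    mvp = reference_mv
    mv = (mvp[0] + mvd[0], mvp[1] + mvd[1])
    return first, second, mv


current = Block([[1, 2], [3, 4]])                                             # 2x2
reference = Block([[row * 4 + col for col in range(4)] for row in range(4)])  # 4x4
first, second, mv = decode_motion_vector(current, reference, (2, 1), (-1, 3), (4, 4))
```

In this example the block to be decoded is brought from 2x2 to 4x4 samples so that it matches the reference block, after which the motion vector is obtained by a single addition.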

It should be noted that the video decoding method shown in FIG. 2 may be used in, but is not limited to, the video decoder shown in FIG. 1. The video decoder interacts with the other components to complete the decoding of the video frames to be decoded.

Optionally, in this embodiment, the video decoding method may be applied to, but is not limited to, application scenarios such as a video playing application, a video sharing application, or a video session application. The video transmitted in these application scenarios may include, but is not limited to, long video and short video. For example, a long video may be an episode with a longer play time (for example, longer than 10 minutes) or the pictures shown in a long video session, and a short video may be a voice message exchanged between two or more parties or a video with a shorter play time (for example, less than or equal to 30 seconds) shown on a sharing platform. The foregoing is merely an example. The video decoding method provided in this embodiment may be applied to, but is not limited to, a playing device that plays video in the foregoing application scenarios: after the encoded code stream data is acquired, a motion vector MV, that is, the motion vector MV of the first reconstruction block relative to the second reconstruction block, is determined for the block to be decoded in each video frame to be decoded so that the decoding operation can be performed, which avoids the situation in which the motion vector MV cannot be determined because the block to be decoded and the reference block have different resolutions.
In the embodiment of the invention, after the motion vector MV of the block to be decoded is determined, the pixel point in the reference block that corresponds to each pixel point in the block to be decoded can be determined from the reference block and the motion vector MV of the block to be decoded, and the pixel value of each pixel point in the block to be decoded can then be determined: it is equal to the pixel value of the corresponding pixel point in the reference block, and the pixel values of the pixel points of the reference block in the decoded frame are known. Because the motion vector of the first reconstruction block relative to the second reconstruction block, that is, the motion vector of the block to be decoded, is determined as the sum of the motion vector predicted value and the motion vector data of the block to be decoded, the pixel values of the pixel points in the block to be decoded can be determined from that motion vector and the decoding operation can be performed. Meanwhile, only the motion vector data MVD of the block to be decoded needs to be carried in the data to be decoded; the motion vector MV of the block to be decoded is determined from the decoded MVD and the MVP of the block to be decoded, so the MV does not need to be separately encoded in the data to be decoded, which saves transmission bandwidth and improves the flexibility of encoding and decoding.
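The way pixel values follow from the motion vector can be illustrated with a short Python sketch. It assumes an integer-pixel motion vector, a displaced region lying entirely inside the reference frame, and no residual data; the function name `predict_block` and the list-of-rows frame layout are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of pixel values; an assumed layout


def predict_block(reference: Frame, top_left: Tuple[int, int],
                  size: Tuple[int, int], mv: Tuple[int, int]) -> Frame:
    """Copy, from the decoded reference frame, the region displaced by mv.

    Each pixel of the block to be decoded takes the value of the
    corresponding pixel in the reference frame. No bounds checking is done.
    """
    x0, y0 = top_left
    width, height = size
    dx, dy = mv
    return [
        [reference[y0 + dy + y][x0 + dx + x] for x in range(width)]
        for y in range(height)
    ]


# A 6x6 reference frame whose pixel value encodes its position (10 * row + col).
reference_frame = [[10 * row + col for col in range(6)] for row in range(6)]

# A 2x2 block at column 2, row 2, displaced one pixel right and one pixel up.
block = predict_block(reference_frame, top_left=(2, 2), size=(2, 2), mv=(1, -1))
```

Here the block is filled from columns 3 to 4 of rows 1 to 2 of the reference frame, which is the region the motion vector points to.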

When the video is encoded, different video blocks in a video frame can be encoded at different resolutions, which overcomes the distortion caused by using a uniform resolution in the related art and ensures video playing quality. In this embodiment, when video decoding is performed, the motion vector data MVD of the block to be decoded, the motion vector of the reference block, the first resolution adopted by the block to be decoded in decoding, and the second resolution adopted by the reference block in decoding are obtained. When the first resolution and the second resolution are different, the resolutions of the block to be decoded and the reference block are adjusted to the target resolution, and the motion vector of the reference block is used as the motion vector predicted value of the block to be decoded, so that the motion vector of the first reconstructed block, obtained by adjusting the resolution of the block to be decoded, is determined as the sum of the motion vector predicted value of the block to be decoded and the motion vector data of the block to be decoded. The motion vector MV can thus be determined even when the video blocks have different resolutions. It is understood that the motion vector MV of the first reconstructed block relative to the second reconstructed block may be used as the motion vector MV of the block to be decoded. In the embodiment of the invention, in order to determine the motion vector of the block to be decoded relative to the reference block during decoding, the resolutions of the block to be decoded and the reference block need to be adjusted.
It should be noted that the resolution adjustment may be applied to a reconstructed block of the block to be decoded and a reconstructed block of the reference block, so that the motion vector of the block to be decoded relative to the reference block can be determined without actually changing the original block to be decoded or the original reference block. The same approach may be applied in the encoding process.

Optionally, in this embodiment, after a video frame to be decoded in the video to be decoded is determined from the code stream received from the encoding device, and before the video frame to be decoded is decoded, a reference video frame may be determined from the video frames decoded before the video frame to be decoded, and a reference block in the reference video frame may then be determined. In this embodiment, the encoding mode of the reference video frame may be determined in either of the following ways:

1) Acquiring a preset flag bit in the code stream, and determining, according to the flag bit, the encoding mode adopted by the reference video frame, such as intra-frame or inter-frame coding;

2) Decoding according to a convention agreed with the encoding device at the encoding end, and determining, after decoding, the encoding mode adopted by the decoded reference video frame, such as intra-frame or inter-frame coding.

Optionally, before the motion vector MV of the reference block is acquired, the method further comprises: determining a plurality of target video blocks, wherein each target video block comprises at least one pixel point positioned in a reference area;

acquiring the motion vector MV of the reference block comprises: determining the motion vector of a first video block located in the upper left corner of the plurality of target video blocks as the motion vector of the reference block, wherein the first video block is the same size as the reference block; or determining the motion vector of a second video block located at a corner of the plurality of target video blocks as the motion vector of the reference block, wherein the second video block is the same size as the reference block; or determining the motion vector of a third video block with the largest area among the plurality of target video blocks as the motion vector of the reference block; or determining a weighted sum of the motion vectors of the plurality of target video blocks as the motion vector of the reference block.

For the reference block in the embodiment of the present invention, as shown in FIG. 3, frame t is the video frame currently to be decoded and frame t-k is the reference frame of frame t. The block A to be decoded has a corresponding reference region in the reference frame, namely the reference region B in frame t-k, and each of the target video blocks in the reference frame has at least one pixel point located in the reference region. It is understood that frame t-k may be the frame immediately preceding the video frame in which the current block to be decoded is located, or the N-th frame preceding it, where N is a positive integer; frame t-k may also be a virtual frame synthesized from a plurality of video frames preceding the video frame in which the current block to be decoded is located. It should be understood that the above manner of determining the reference block is only an alternative embodiment provided by the present invention, and the present invention does not limit how the reference block is determined.

As shown in FIG. 4, in an embodiment of the present invention, the motion vector of a first video block a located in the upper left corner of the reference region may be determined as the motion vector of the reference block; the motion vector of a video block located at a corner of the reference region, e.g., the upper left, lower left, upper right or lower right corner, may also be determined as the motion vector of the reference block; the motion vector of the third video block b having the largest area in the reference region may also be determined as the motion vector of the reference block; and the weighted sum of the motion vectors of the plurality of target video blocks may also be determined as the motion vector of the reference block. It will be appreciated that each video frame may include a plurality of video blocks, and that the sizes of the video blocks may differ.
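Three of the manners listed above may be sketched in Python as follows. The `TargetBlock` class, its position and size fields, and the function names are hypothetical. The weights used for the weighted sum are supplied by the caller, since the description does not specify them, and "upper left corner" is interpreted here as the top-most, then left-most, target block, which is an assumption. The general corner variant is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Vector = Tuple[float, float]


@dataclass
class TargetBlock:
    x: int       # left edge in the reference frame
    y: int       # top edge in the reference frame
    width: int
    height: int
    mv: Vector   # motion vector already known for this decoded block


def top_left_mv(blocks: Sequence[TargetBlock]) -> Vector:
    """MV of the top-most, then left-most, target block (assumed reading)."""
    return min(blocks, key=lambda b: (b.y, b.x)).mv


def largest_area_mv(blocks: Sequence[TargetBlock]) -> Vector:
    """MV of the target block with the largest area."""
    return max(blocks, key=lambda b: b.width * b.height).mv


def weighted_sum_mv(blocks: Sequence[TargetBlock],
                    weights: Sequence[float]) -> Vector:
    """Weighted sum of all target-block MVs; the weights are caller-chosen."""
    if len(blocks) != len(weights):
        raise ValueError("one weight is required per target block")
    return (
        sum(w * b.mv[0] for b, w in zip(blocks, weights)),
        sum(w * b.mv[1] for b, w in zip(blocks, weights)),
    )


targets = [
    TargetBlock(x=0, y=0, width=4, height=4, mv=(4.0, 0.0)),
    TargetBlock(x=4, y=0, width=8, height=8, mv=(8.0, -4.0)),
    TargetBlock(x=0, y=4, width=4, height=4, mv=(0.0, 4.0)),
]

mv_top_left = top_left_mv(targets)
mv_largest = largest_area_mv(targets)
mv_weighted = weighted_sum_mv(targets, [0.5, 0.25, 0.25])
```

Whichever manner is used, the encoding side and the decoding side must apply the same one so that both derive the same motion vector predicted value.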

It will be appreciated that the manner of determining the motion vector MV of the reference block may be agreed in advance, i.e., the encoding side and the decoding side predefine the manner of determination, so that no identification information needs to be added to the code stream. Alternatively, the encoding side may add, to the encoded data, identification information indicating the manner of determining the motion vector MV of the reference block, so that the decoding side can determine the motion vector MV of the reference block according to the identification information.

Optionally, in the case that the first resolution and the second resolution are different, adjusting the resolution of the block to be decoded to the target resolution to obtain a first reconstructed block, and adjusting the resolution of the reference block to the target resolution to obtain a second reconstructed block, includes: adjusting the first resolution adopted by the block to be decoded in decoding to a third resolution to obtain the first reconstruction block, wherein the third resolution is different from the first resolution and the second resolution, and the target resolution is the third resolution; and adjusting the second resolution adopted by the reference block in decoding to the third resolution to obtain the second reconstruction block.

The third resolution here is the original resolution of the block to be decoded, or the highest resolution in a predetermined resolution set. It will be appreciated that multiple resolutions, such as 720p and 1080p, may be available for a video, and these alternative resolutions constitute the resolution set. Of course, the resolution set may use, but is not limited to, existing video resolution specifications. It should be noted that the original resolution here refers to the original resolution of the video to be decoded, and that the original resolution may be the same as or different from the first resolution of the block to be decoded.

Optionally, before adjusting the first resolution adopted by the block to be decoded in decoding to the third resolution to obtain the first reconstructed block, the method further includes: acquiring a syntax element carried in the data to be decoded corresponding to the block to be decoded, wherein the syntax element is used to indicate the third resolution. In an embodiment of the present invention, the syntax element may be identification information indicating the third resolution required at the time of decoding. It will be understood, of course, that the encoding side and the decoding side may instead predefine the third resolution, so that no syntax element needs to be carried in the code stream, and the motion vector MV of the block to be decoded relative to the reference block is determined directly according to the predefined third resolution during decoding.

In an alternative embodiment of the present invention, the syntax element may be an index flag for inter prediction adaptive resolution alignment, specifically denoted as 0, 1, 2, 3, 4, etc., each index representing a scaling ratio of the third resolution. For example, an index of 0 represents the highest resolution; 1 represents encoding with 3/4 of the width and height samples; 2 represents encoding with 2/3 of the width and height samples; 3 represents encoding with 1/2 of the width and height samples; 4 represents encoding with 1/3 of the width and height samples; and 5 represents encoding with 1/4 of the width and height samples. It is to be understood that this is only an alternative embodiment provided by the present invention, and the present invention is not limited thereto. It is also understood that the same index flag may be used by both the encoder and the decoder.
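The index-to-ratio mapping in this example can be written as a small lookup table. The sketch below assumes that each ratio is applied to the width and height of the highest resolution in the resolution set and that fractional results are truncated; neither point is stated in the text, and the names `SCALE_BY_INDEX` and `third_resolution` are hypothetical.

```python
from fractions import Fraction
from typing import Tuple

# Index flag -> scaling ratio applied to both width and height.
SCALE_BY_INDEX = {
    0: Fraction(1),      # highest resolution
    1: Fraction(3, 4),
    2: Fraction(2, 3),
    3: Fraction(1, 2),
    4: Fraction(1, 3),
    5: Fraction(1, 4),
}


def third_resolution(index: int, highest: Tuple[int, int]) -> Tuple[int, int]:
    """Map an index flag to a (width, height) pair.

    Assumption: the ratio is relative to the highest resolution, and any
    fractional sample count is truncated.
    """
    if index not in SCALE_BY_INDEX:
        raise ValueError(f"unsupported index flag: {index}")
    scale = SCALE_BY_INDEX[index]
    width, height = highest
    return int(width * scale), int(height * scale)


print(third_resolution(0, (1920, 1080)))  # (1920, 1080)
print(third_resolution(2, (1920, 1080)))  # (1280, 720)
print(third_resolution(5, (1920, 1080)))  # (480, 270)
```

Using exact fractions rather than floating-point ratios avoids rounding surprises for ratios such as 2/3 and 1/3.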

Optionally, in the case that the third resolution is lower than the highest resolution in the predetermined resolution set, adjusting the first resolution adopted by the block to be decoded in decoding to the third resolution to obtain the first reconstructed block includes: upsampling the block to be decoded from the first resolution to the highest resolution to obtain a third reconstructed block; and downsampling the third reconstructed block from the highest resolution to the third resolution to obtain the first reconstructed block. Adjusting the second resolution adopted by the reference block in decoding to the third resolution to obtain the second reconstructed block includes: upsampling the reference block from the second resolution to the highest resolution to obtain a fourth reconstructed block; and downsampling the fourth reconstructed block from the highest resolution to the third resolution to obtain the second reconstructed block. That is, in the embodiment of the present invention, when the third resolution is lower than the highest resolution in the resolution set, each block may first be upsampled to the highest resolution and then downsampled to the third resolution.
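The two-stage adjustment (upsample to the highest resolution, then downsample to the third resolution) can be sketched as below. The text does not specify a resampling filter, so nearest-neighbour sampling is used purely as a stand-in; `resample` and `to_third_resolution` are hypothetical names, and a block is represented as a list of pixel rows.

```python
from typing import List, Tuple

Block = List[List[int]]  # rows of pixel values


def resample(block: Block, out_w: int, out_h: int) -> Block:
    """Nearest-neighbour resampling (assumed filter, not from the source)."""
    in_h, in_w = len(block), len(block[0])
    return [
        [block[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def to_third_resolution(block: Block, highest: Tuple[int, int],
                        third: Tuple[int, int]) -> Block:
    """Upsample to the highest resolution, then downsample to the third."""
    # Third (or fourth) reconstructed block: the block at the highest resolution.
    at_highest = resample(block, highest[0], highest[1])
    # First (or second) reconstructed block: the block at the third resolution.
    return resample(at_highest, third[0], third[1])


to_decode = [[1, 2],
             [3, 4]]                # block at the first resolution (2x2 samples)
reference = [[9] * 8 for _ in range(8)]  # block at the second resolution (8x8)

first_recon = to_third_resolution(to_decode, highest=(8, 8), third=(4, 4))
second_recon = to_third_resolution(reference, highest=(8, 8), third=(4, 4))
print(first_recon)
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

After this step both reconstructed blocks have the same sample dimensions, so a motion vector between them is well defined.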

Optionally, adjusting the resolution of the block to be decoded to the target resolution to obtain a first reconstructed block, and adjusting the resolution of the reference block to the target resolution to obtain a second reconstructed block, includes: adjusting the second resolution adopted by the reference block in decoding to the first resolution to obtain the second reconstructed block, wherein the target resolution is the first resolution. It is to be understood that, in this case, the block to be decoded itself may be used as the first reconstructed block; of course, to avoid changing the original video block in the process of determining the motion vector, a reconstructed block identical to the block to be decoded may be used as the first reconstructed block instead.

Or, the first resolution adopted by the block to be decoded in decoding is adjusted to the second resolution to obtain the first reconstructed block, wherein the target resolution is the second resolution. It will be appreciated that, in this case, the reference block itself may be used as the second reconstructed block; of course, to avoid altering the original reference block in the process of determining the motion vector, a reconstructed block identical to the reference block may be used as the second reconstructed block instead.
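The three choices of target resolution described above (the third resolution, the first resolution, or the second resolution) can be summarised in one small helper. This is only an illustrative sketch: the function name `choose_target` and the mode strings are hypothetical, and the helper merely reports which blocks need resampling.

```python
from typing import Optional, Tuple

Res = Tuple[int, int]  # (width, height)


def choose_target(first: Res, second: Res, mode: str,
                  third: Optional[Res] = None) -> Tuple[Res, bool, bool]:
    """Return (target resolution, adjust block to decode?, adjust reference block?)."""
    if first == second:
        # Resolutions already match: no adjustment is needed.
        return first, False, False
    if mode == "first":
        # Target is the first resolution: only the reference block is adjusted.
        return first, False, True
    if mode == "second":
        # Target is the second resolution: only the block to be decoded is adjusted.
        return second, True, False
    if mode == "third":
        if third is None:
            raise ValueError("mode 'third' requires a third resolution")
        # Target is a third resolution: a block is adjusted if it differs from it.
        return third, third != first, third != second
    raise ValueError(f"unknown mode: {mode}")


print(choose_target((960, 540), (1920, 1080), "first"))
# ((960, 540), False, True)
print(choose_target((960, 540), (1920, 1080), "third", third=(1280, 720)))
# ((1280, 720), True, True)
```

When the target equals the first or second resolution, only one block has to be resampled, and the other block (or an identical copy of it) serves directly as its reconstructed block.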

According to another aspect of an embodiment of the present invention, there is provided a video encoding method, as shown in fig. 5, the method including:

S502, acquiring a first resolution adopted in encoding by a block to be encoded in a video frame to be encoded, a second resolution adopted in encoding by a reference block, and a motion vector MV of the reference block, wherein the reference block is the block in which a reference region, referred to by the block to be encoded in an encoded reference frame, is located, and the size of the block to be encoded is the same as the size of the reference region;

S504, in the case that the first resolution and the second resolution are different, adjusting the resolution of the block to be encoded to a target resolution to obtain a first reconstructed block, and adjusting the resolution of the reference block to the target resolution to obtain a second reconstructed block;

S506, determining the difference between the motion vector MV of the first reconstructed block relative to the second reconstructed block and the motion vector predictor MVP of the block to be encoded as the motion vector data MVD of the block to be encoded, wherein the motion vector predictor MVP of the block to be encoded is equal to the motion vector MV of the reference block.
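Steps S502 to S506 can be sketched as follows, under stated assumptions. The motion search itself is outside the scope of these steps, so it is passed in as a hypothetical callback `estimate_mv` that returns the motion vector of the first reconstructed block relative to the second reconstructed block at a given resolution; `encode_mvd` and `fake_search` are likewise illustrative names only.

```python
from typing import Callable, Tuple

Vec = Tuple[int, int]  # motion vector (dx, dy)
Res = Tuple[int, int]  # resolution (width, height)


def encode_mvd(first_res: Res, second_res: Res, target_res: Res,
               estimate_mv: Callable[[Res], Vec],
               reference_block_mv: Vec) -> Vec:
    # S504: the blocks are aligned to the target resolution only when the
    # first and second resolutions differ.
    working_res = target_res if first_res != second_res else first_res
    # MV of the first reconstructed block relative to the second one,
    # measured at the common resolution (hypothetical motion search).
    mv = estimate_mv(working_res)
    # S506: MVP equals the MV of the reference block, and MVD = MV - MVP.
    mvp = reference_block_mv
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def fake_search(res: Res) -> Vec:
    # Stand-in for a real motion search; always reports the same vector.
    return (18, -7)


mvd = encode_mvd((960, 540), (1920, 1080), (1920, 1080), fake_search, (16, -8))
print(mvd)  # (2, 1)
```

Because the motion vector is measured after both blocks have been brought to the same resolution, the subtraction in S506 compares like with like.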

It should be noted that the video encoding method shown in fig. 5 may be applied to, but is not limited to, the video encoder shown in fig. 1. The video encoder interacts with other components to complete the encoding of the video frame to be encoded.

Optionally, in this embodiment, the video encoding method may be applied to, but is not limited to, application scenarios such as a video playing application, a video sharing application, or a video session application. The video transmitted in these application scenarios may include, but is not limited to, long video and short video. For example, a long video may be an episode with a longer playing time (for example, longer than 10 minutes) or the pictures shown in a long video session, and a short video may be a voice message exchanged between two or more parties or a video with a shorter playing time (for example, less than or equal to 30 seconds) shown on a sharing platform. The foregoing is merely an example. The video encoding method provided in this embodiment may be applied to, but is not limited to, a playing device for playing video in the foregoing application scenarios: after the video to be encoded is obtained, the motion vector of the block to be encoded is determined, so that the video to be encoded can be encoded when the resolutions of the video blocks are different.

When the video is encoded, different resolutions can be adopted to encode different video blocks in a video frame, which addresses the distortion caused by adopting a uniform resolution in the related art and ensures video playing quality. In this embodiment, during video encoding, the block to be encoded is adjusted to the target resolution to obtain a first reconstructed block, the reference block is adjusted to the target resolution to obtain a second reconstructed block, and the difference between the motion vector of the first reconstructed block relative to the second reconstructed block and the motion vector predictor of the block to be encoded is determined as the motion vector data MVD of the block to be encoded. The video to be encoded can thus be encoded when the resolutions of the video blocks are different. Moreover, the motion vector MV of the block to be encoded does not need to be added to the encoded data; only the motion vector data MVD of the block to be encoded is added, which reduces the transmission overhead.

Optionally, before the motion vector MV of the reference block is acquired, the method further includes: determining a plurality of target video blocks, wherein each target video block includes at least one pixel point located in the reference region. Acquiring the motion vector MV of the reference block includes: determining the motion vector of a first video block located in the upper left corner among the plurality of target video blocks as the motion vector of the reference block, wherein the first video block is the same size as the reference block; or, determining the motion vector of a second video block located at a corner among the plurality of target video blocks as the motion vector of the reference block, wherein the second video block is the same size as the reference block; or, determining the motion vector of a third video block with the largest area among the plurality of target video blocks as the motion vector of the reference block; or, determining a weighted sum of the motion vectors of the plurality of target video blocks as the motion vector of the reference block.

For the reference block in the embodiment of the present invention, the block to be encoded has a corresponding reference region in the reference frame, and each of the plurality of target video blocks in the reference frame has at least one pixel point located in the reference region. It is understood that the reference frame here may be the frame preceding the video frame in which the current block to be encoded is located, or may be one of the preceding N frames, where N is a positive integer; the reference frame may also be a virtual frame synthesized from a plurality of video frames preceding the video frame in which the current block to be encoded is located. It should be understood that the above manner of determining the reference block is only an alternative embodiment provided by the present invention, and the present invention does not limit how the reference block is determined.

Optionally, after determining the difference between the motion vector MV of the first reconstructed block relative to the second reconstructed block and the motion vector predictor MVP of the block to be encoded as the motion vector data MVD of the block to be encoded, the method further includes: adding the motion vector data MVD of the block to be encoded to the encoded data corresponding to the block to be encoded. In the embodiment of the invention, when video is encoded, the motion vector data MVD of the block to be encoded can be added to the encoded data corresponding to the block to be encoded, so that the decoding side can use the motion vector data MVD to decode the encoded data corresponding to the block. It can be understood that, in the embodiment of the present invention, the motion vector MV of the block to be encoded does not need to be added to the encoded data; only the motion vector data MVD of the block to be encoded is added. This reduces the number of bits occupied during encoding and improves the encoding rate; that is, it reduces the transmission overhead, saves transmission bandwidth, and improves the flexibility of encoding and decoding.
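The round trip between the two sides, and the reason an MVD usually costs fewer bits than the full MV, can be illustrated as below. The text does not name an entropy code; the signed Exp-Golomb code length is used here only as an assumed cost model, and `write_block`, `read_block`, and `se_golomb_bits` are hypothetical names.

```python
from typing import Dict, Tuple

Vec = Tuple[int, int]


def se_golomb_bits(v: int) -> int:
    """Bit length of the signed Exp-Golomb code for v (assumed cost model)."""
    k = 2 * v - 1 if v > 0 else -2 * v   # map the signed value to a code number
    return 2 * ((k + 1).bit_length() - 1) + 1


def write_block(mv: Vec, mvp: Vec) -> Dict[str, Vec]:
    # Encoding side: only the MVD is added to the encoded data.
    return {"mvd": (mv[0] - mvp[0], mv[1] - mvp[1])}


def read_block(encoded: Dict[str, Vec], mvp: Vec) -> Vec:
    # Decoding side: MV = MVP + MVD, with MVP taken from the reference block.
    mvd = encoded["mvd"]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


mv, mvp = (18, -7), (16, -8)
encoded = write_block(mv, mvp)
assert read_block(encoded, mvp) == mv    # the decoder recovers the same MV

mv_bits = sum(se_golomb_bits(c) for c in mv)
mvd_bits = sum(se_golomb_bits(c) for c in encoded["mvd"])
print(encoded["mvd"], mv_bits, mvd_bits)  # (2, 1) 18 8
```

The saving depends on how close the predictor is to the actual motion vector; when the reference block moves similarly to the current block, the MVD components are small and cheap to code.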

It may be understood that, in the embodiment of the present invention, when video is encoded and the first resolution and the second resolution are different, the resolution of the block to be encoded is adjusted to the target resolution to obtain the first reconstructed block, and the resolution of the reference block is adjusted to the target resolution to obtain the second reconstructed block. For the specific adjustment manner, reference may be made to the examples in the above decoding embodiment, which are not repeated here.

It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.

According to still another aspect of an embodiment of the present invention, there is also provided a video decoding apparatus for performing the above video decoding, as shown in fig. 6, the apparatus including:

a first obtaining unit 602, configured to obtain motion vector data MVD of a block to be decoded, carried in the data to be decoded corresponding to the block to be decoded in a video frame to be decoded, a motion vector MV of a reference block, a first resolution adopted by the block to be decoded in decoding, and a second resolution adopted by the reference block in decoding, where the reference block is the block in which a reference region, referred to by the block to be decoded in a decoded reference frame, is located, and the size of the block to be decoded is the same as the size of the reference region;

a first adjusting unit 604, configured to, in the case that the first resolution and the second resolution are different, adjust the resolution of the block to be decoded to a target resolution to obtain a first reconstructed block, and adjust the resolution of the reference block to the target resolution to obtain a second reconstructed block;

a first determining unit 606, configured to determine the sum of the motion vector predictor MVP of the block to be decoded and the motion vector data MVD of the block to be decoded as the motion vector MV of the first reconstructed block relative to the second reconstructed block, where the motion vector predictor MVP of the block to be decoded is equal to the motion vector MV of the reference block.

For specific embodiments, reference may be made to the examples given for the video decoding method, which are not repeated here.

As an alternative, the apparatus may further include: a second determining unit, configured to determine a plurality of target video blocks before the motion vector MV of the reference block is acquired, where each target video block includes at least one pixel point located in the reference region;

the first obtaining unit is specifically configured to: determine the motion vector of a first video block located in the upper left corner among the plurality of target video blocks as the motion vector of the reference block, wherein the first video block is the same size as the reference block; or, determine the motion vector of a second video block located at a corner among the plurality of target video blocks as the motion vector of the reference block, wherein the second video block is the same size as the reference block; or, determine the motion vector of a third video block with the largest area among the plurality of target video blocks as the motion vector of the reference block; or, determine a weighted sum of the motion vectors of the plurality of target video blocks as the motion vector of the reference block.

As an alternative, in the case where the first resolution and the second resolution are different, the first adjustment unit includes: the first adjusting module is used for adjusting the first resolution adopted by the block to be decoded in decoding to a third resolution to obtain a first reconstruction block, wherein the third resolution is different from the first resolution and the second resolution, and the target resolution is the third resolution; and the second adjusting module is used for adjusting the second resolution adopted by the reference block in decoding to be the third resolution so as to obtain a second reconstruction block.

As an alternative, the apparatus may further include: the second obtaining unit is configured to obtain a syntax element carried in data to be decoded corresponding to the block to be decoded before the first resolution adopted by the block to be decoded in decoding is adjusted to be the third resolution to obtain the first reconstructed block, where the syntax element is used to indicate the third resolution.

As an alternative, the third resolution is the original resolution of the block to be decoded, or the third resolution is the highest resolution in a predetermined set of resolutions.

As an alternative, in the case that the third resolution is lower than the highest resolution in the predetermined resolution set, the first adjusting unit includes: a third adjusting module, configured to upsample the block to be decoded from the first resolution adopted in decoding to the highest resolution to obtain a third reconstructed block; and a fourth adjusting module, configured to downsample the third reconstructed block from the highest resolution to the third resolution to obtain the first reconstructed block. The first adjusting unit further includes: a fifth adjusting module, configured to upsample the reference block from the second resolution adopted in decoding to the highest resolution to obtain a fourth reconstructed block; and a sixth adjusting module, configured to downsample the fourth reconstructed block from the highest resolution to the third resolution to obtain the second reconstructed block.

As an alternative, the first adjusting unit includes: a seventh adjustment module, configured to adjust a second resolution adopted by the reference block during decoding to a first resolution, to obtain a second reconstruction block, where the target resolution is the first resolution; or the eighth adjusting module is configured to adjust the first resolution adopted by the block to be decoded during decoding to a second resolution, so as to obtain the first reconstructed block, where the target resolution is the second resolution.

According to still another aspect of an embodiment of the present invention, there is provided a video encoding apparatus, as shown in fig. 7, including:

a first obtaining unit 702, configured to obtain a first resolution adopted in encoding by a block to be encoded in a video frame to be encoded, a second resolution adopted in encoding by a reference block, and a motion vector MV of the reference block, where the reference block is the block in which a reference region, referred to by the block to be encoded in an encoded reference frame, is located, and the size of the block to be encoded is the same as the size of the reference region;

a first adjusting unit 704, configured to, in the case that the first resolution and the second resolution are different, adjust the resolution of the block to be encoded to a target resolution to obtain a first reconstructed block, and adjust the resolution of the reference block to the target resolution to obtain a second reconstructed block;

a first determining unit 706, configured to determine, as the motion vector data MVD of the block to be encoded, the difference between the motion vector MV of the first reconstructed block relative to the second reconstructed block and the motion vector predictor MVP of the block to be encoded, where the motion vector predictor MVP of the block to be encoded is equal to the motion vector MV of the reference block.

For specific embodiments, reference may be made to the examples given for the video encoding method, which are not repeated here.

As an alternative, the apparatus may further include: a second determining unit, configured to determine a plurality of target video blocks before the motion vector MV of the reference block is acquired, where each target video block includes at least one pixel point located in the reference region. The first obtaining unit is specifically configured to: determine the motion vector of a first video block located in the upper left corner among the plurality of target video blocks as the motion vector of the reference block, wherein the first video block is the same size as the reference block; or, determine the motion vector of a second video block located at a corner among the plurality of target video blocks as the motion vector of the reference block, wherein the second video block is the same size as the reference block; or, determine the motion vector of a third video block with the largest area among the plurality of target video blocks as the motion vector of the reference block; or, determine a weighted sum of the motion vectors of the plurality of target video blocks as the motion vector of the reference block.

As an alternative, the apparatus may further include: an adding unit, configured to add the motion vector data MVD of the block to be encoded to the encoded data corresponding to the block to be encoded, after the difference between the motion vector MV of the first reconstructed block relative to the second reconstructed block and the motion vector predictor MVP of the block to be encoded has been determined as the motion vector data MVD of the block to be encoded.

According to a further aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the above-described video decoding method. As shown in fig. 8, the electronic device comprises a memory 802 and a processor 804, the memory 802 storing a computer program, and the processor 804 being arranged to perform the steps of any of the method embodiments described above by means of the computer program.

Optionally, in this embodiment, the electronic device may be located in at least one of a plurality of network devices of a computer network.

Optionally, in this embodiment, the processor may be configured to execute the following steps by means of the computer program:

S1, acquiring motion vector data MVD of a block to be decoded, carried in the data to be decoded corresponding to the block to be decoded in a video frame to be decoded, a motion vector MV of a reference block, a first resolution adopted by the block to be decoded in decoding, and a second resolution adopted by the reference block in decoding, wherein the reference block is the block in which a reference region, referred to by the block to be decoded in a decoded reference frame, is located, and the size of the block to be decoded is the same as the size of the reference region;

S2, in the case that the first resolution and the second resolution are different, adjusting the resolution of the block to be decoded to a target resolution to obtain a first reconstructed block, and adjusting the resolution of the reference block to the target resolution to obtain a second reconstructed block;

S3, determining the sum of the motion vector predictor MVP of the block to be decoded and the motion vector data MVD of the block to be decoded as the motion vector MV of the first reconstructed block relative to the second reconstructed block, wherein the motion vector predictor MVP of the block to be decoded is equal to the motion vector MV of the reference block.

Optionally, it will be understood by those skilled in the art that the structure shown in fig. 8 is only schematic, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile internet device (Mobile Internet Device, MID), or a PAD. Fig. 8 does not limit the structure of the electronic device. For example, the electronic device may include more or fewer components (e.g., network interfaces) than shown in fig. 8, or have a configuration different from that shown in fig. 8.

The memory 802 may be used to store software programs and modules, such as the program instructions/modules corresponding to the video decoding method and apparatus in the embodiment of the present invention. The processor 804 executes the software programs and modules stored in the memory 802, thereby performing various functional applications and data processing, that is, implementing the video decoding method described above. The memory 802 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 802 may further include memory located remotely from the processor 804, and such remote memory may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. Specifically, the memory 802 may be used to store, but is not limited to storing, information such as the block to be decoded. As an example, as shown in fig. 8, the memory 802 may include, but is not limited to, the first obtaining unit 602, the first adjusting unit 604, and the first determining unit 606 of the video decoding apparatus. In addition, the memory 802 may include, but is not limited to, other module units of the video decoding apparatus, which are not described in detail in this example.

Optionally, the transmission device 806 is used to receive or transmit data via a network. Specific examples of the network may include wired networks and wireless networks. In one example, the transmission device 806 includes a network interface controller (Network Interface Controller, NIC), which can be connected to other network devices and routers via a network cable to communicate with the internet or a local area network. In another example, the transmission device 806 is a radio frequency (Radio Frequency, RF) module, which is used to communicate with the internet wirelessly.

In addition, the electronic device further includes: a display 808 for displaying the decoded video; and a connection bus 810 for connecting the respective module parts in the above-described electronic device.

According to a further aspect of the embodiments of the present invention there is also provided an electronic device for implementing the video encoding method described above, as shown in fig. 9, the electronic device comprising a memory 902 and a processor 904, the memory 902 having stored therein a computer program, the processor 904 being arranged to perform the steps of any of the method embodiments described above by means of the computer program.

Optionally, in this embodiment, the processor may be configured to execute the following steps by means of the computer program:

S1, acquiring a first resolution adopted in encoding by a block to be encoded in a video frame to be encoded, a second resolution adopted in encoding by a reference block, and a motion vector MV of the reference block, wherein the reference block is the block in which a reference region, referred to by the block to be encoded in an encoded reference frame, is located, and the size of the block to be encoded is the same as the size of the reference region;

S2, in the case that the first resolution and the second resolution are different, adjusting the resolution of the block to be encoded to a target resolution to obtain a first reconstructed block, and adjusting the resolution of the reference block to the target resolution to obtain a second reconstructed block;

S3, determining the difference between the motion vector MV of the first reconstructed block relative to the second reconstructed block and the motion vector predictor MVP of the block to be encoded as the motion vector data MVD of the block to be encoded, wherein the motion vector predictor MVP of the block to be encoded is equal to the motion vector MV of the reference block.

Optionally, it will be understood by those skilled in the art that the structure shown in fig. 9 is only schematic, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile internet device (Mobile Internet Device, MID), or a PAD. Fig. 9 does not limit the structure of the electronic device. For example, the electronic device may include more or fewer components (e.g., network interfaces) than shown in fig. 9, or have a configuration different from that shown in fig. 9.

The memory 902 may be used to store software programs and modules, such as the program instructions/modules corresponding to the video encoding method and apparatus in the embodiments of the present invention. The processor 904 executes the software programs and modules stored in the memory 902, thereby performing various functional applications and data processing, that is, implementing the video encoding method described above. The memory 902 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 902 may further include memory located remotely from the processor 904, and such remote memory may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. Specifically, the memory 902 may be used to store, but is not limited to storing, information such as the block to be encoded. As an example, as shown in fig. 9, the memory 902 may include, but is not limited to, the first obtaining unit 702, the first adjusting unit 704, and the first determining unit 706 of the video encoding apparatus. In addition, the memory 902 may include, but is not limited to, other module units of the video encoding apparatus, which are not described in detail in this example.

Optionally, the transmission device 906 is used to receive or transmit data via a network. Specific examples of the network may include wired networks and wireless networks. In one example, the transmission device 906 includes a network interface controller (Network Interface Controller, NIC), which can be connected to other network devices and routers via a network cable to communicate with the internet or a local area network. In another example, the transmission device 906 is a radio frequency (Radio Frequency, RF) module, which is used to communicate with the internet wirelessly.

In addition, the electronic device further includes: a display 908 for displaying video before encoding; and a connection bus 910 for connecting the respective module parts in the above-described electronic device.

An embodiment of the invention also provides a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.

Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of:

Optionally, the storage medium is further arranged to store a computer program for performing the steps of:

Optionally, the storage medium is further configured to store a computer program for executing the steps included in the method in the above embodiment, which is not described in detail in this embodiment.

Optionally, in this embodiment, those skilled in the art will understand that all or part of the steps in the methods of the above embodiments may be performed by a program that instructs a terminal device to execute them, where the program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and the like.

The serial numbers of the foregoing embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

The integrated units in the above embodiments may be stored in the above-described computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention.

In the foregoing embodiments of the present invention, each embodiment is described with its own emphasis; for a part that is not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The apparatus embodiments described above are merely exemplary. For example, the division into units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed between the parts may be made through some interfaces, units, or modules, and may be electrical or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are also intended to fall within the scope of the present invention.

Claims

1. A video decoding method, comprising:

acquiring motion vector data MVD of a block to be decoded, which is carried in data to be decoded and corresponds to the block to be decoded in a video frame to be decoded, motion vector MV of a reference block, a first resolution adopted by the block to be decoded in decoding and a second resolution adopted by the reference block in decoding, wherein the reference block is a block where a region to be decoded, which is referred to by the block to be decoded in the decoded reference frame, is located, and the size of the block to be decoded is the same as the size of the referred region;

acquiring a syntax element carried in the data to be decoded corresponding to the block to be decoded, wherein the syntax element is an index mark aligned with a third resolution, and the index mark is used for indicating a scaling ratio of the third resolution;

under the condition that the first resolution and the second resolution are different, the first resolution adopted by the block to be decoded in decoding is adjusted to be the third resolution, so that a first reconstruction block is obtained, wherein the third resolution is different from the first resolution and the second resolution, and the target resolution is the third resolution;

adjusting the second resolution adopted by the reference block in decoding to the third resolution to obtain a second reconstruction block;

and determining the sum of the motion vector predicted value MVP of the block to be decoded and the motion vector data MVD of the block to be decoded as the motion vector MV of the first reconstruction block relative to the second reconstruction block, wherein the motion vector predicted value MVP of the block to be decoded is equal to the motion vector MV of the reference block.
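As an illustration only, the final step of the decoding method can be sketched in Python. The `MotionVector` type, the function name, and the sample values are hypothetical and do not appear in this document; the sketch assumes integer vector components and that both blocks have already been adjusted to the third resolution.

```python
from typing import NamedTuple


class MotionVector(NamedTuple):
    """Hypothetical 2-D motion vector with integer components."""
    dx: int
    dy: int


def decode_motion_vector(reference_mv: MotionVector, mvd: MotionVector) -> MotionVector:
    """Return MV = MVP + MVD, where MVP is taken to equal the reference block's MV."""
    mvp = reference_mv  # the predictor equals the MV of the reference block
    return MotionVector(mvp.dx + mvd.dx, mvp.dy + mvd.dy)


# Sample values chosen for illustration, not taken from the document.
reference_mv = MotionVector(4, -2)
mvd = MotionVector(1, 3)
print(decode_motion_vector(reference_mv, mvd))  # MotionVector(dx=5, dy=1)
```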

2. The method of claim 1, wherein:

before the motion vector MV of the reference block is acquired, the method further comprises: determining a plurality of target video blocks, wherein each target video block comprises at least one pixel point located in the referred region;

the acquiring of the motion vector MV of the reference block comprises: determining a motion vector of a first video block located in an upper left corner of the plurality of target video blocks as the motion vector of the reference block, wherein the first video block is the same size as the reference block; or determining a motion vector of a second video block located at a corner of the plurality of target video blocks as the motion vector of the referred region, wherein the second video block is the same size as the reference block; or determining a motion vector of a third video block with the largest area among the plurality of target video blocks as the motion vector of the reference block; or determining a weighted sum of the motion vectors of the plurality of target video blocks as the motion vector of the reference block.
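Three of the alternatives above (upper-left block, largest-area block, weighted sum) can be sketched as follows. This is a minimal illustration under stated assumptions: the `TargetBlock` structure, the function names, and the caller-supplied weights are hypothetical, since the claim does not specify how the weights are chosen, and the size check against the reference block is omitted.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class TargetBlock:
    """Hypothetical target video block: top-left position, size, and motion vector."""
    x: int
    y: int
    width: int
    height: int
    mv: Tuple[float, float]


def mv_top_left(blocks: Sequence[TargetBlock]) -> Tuple[float, float]:
    """MV of the block in the upper left corner (topmost row first, then leftmost)."""
    return min(blocks, key=lambda b: (b.y, b.x)).mv


def mv_largest_area(blocks: Sequence[TargetBlock]) -> Tuple[float, float]:
    """MV of the block with the largest area."""
    return max(blocks, key=lambda b: b.width * b.height).mv


def mv_weighted_sum(blocks: Sequence[TargetBlock],
                    weights: Sequence[float]) -> Tuple[float, float]:
    """Weighted sum of the block MVs; the weights are an assumption of this sketch."""
    if len(blocks) != len(weights):
        raise ValueError("one weight per target block is required")
    dx = sum(w * b.mv[0] for b, w in zip(blocks, weights))
    dy = sum(w * b.mv[1] for b, w in zip(blocks, weights))
    return (dx, dy)


# Sample blocks chosen for illustration, not taken from the document.
blocks = [
    TargetBlock(0, 0, 8, 8, (2.0, 0.0)),
    TargetBlock(8, 0, 16, 16, (4.0, 2.0)),
    TargetBlock(0, 8, 8, 8, (0.0, -2.0)),
]
print(mv_top_left(blocks))                         # (2.0, 0.0)
print(mv_largest_area(blocks))                     # (4.0, 2.0)
print(mv_weighted_sum(blocks, [0.25, 0.5, 0.25]))  # (2.5, 0.5)
```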

3. The method of claim 1, wherein the third resolution is an original resolution of the block to be decoded or the third resolution is a highest resolution of a predetermined set of resolutions.

4. The method of claim 1, wherein, in the event that the third resolution is lower than the highest resolution in the predetermined set of resolutions,

the adjusting the first resolution adopted by the block to be decoded in decoding to a third resolution to obtain the first reconstruction block includes: upsampling the first resolution adopted by the block to be decoded in decoding to the highest resolution to obtain a third reconstruction block; downsampling the resolution of the third reconstruction block from the highest resolution to the third resolution to obtain the first reconstruction block;

the adjusting the second resolution adopted by the reference block in decoding to the third resolution to obtain the second reconstructed block includes: upsampling the second resolution employed by the reference block during decoding to the highest resolution to obtain a fourth reconstructed block; downsampling the resolution of the fourth reconstruction block from the highest resolution to the third resolution to obtain the second reconstruction block.
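The two-stage adjustment (upsample to the highest resolution, then downsample to the third resolution) can be illustrated with the sketch below. Nearest-neighbour resampling is an assumption made here for brevity; the document does not name a resampling filter, and the function names and sample sizes are hypothetical.

```python
from typing import List, Tuple

Block = List[List[int]]  # rows of pixel values


def resample_nearest(block: Block, width: int, height: int) -> Block:
    """Resize a block to width x height by nearest-neighbour sampling."""
    src_h, src_w = len(block), len(block[0])
    return [
        [block[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def adjust_via_highest(block: Block,
                       highest: Tuple[int, int],
                       third: Tuple[int, int]) -> Block:
    """Upsample to the highest resolution, then downsample to the third resolution.

    Both resolutions are given as (width, height).
    """
    upsampled = resample_nearest(block, highest[0], highest[1])
    return resample_nearest(upsampled, third[0], third[1])


# A 2x2 block taken up to 8x8 and back down to 4x4 (illustrative sizes).
small = [[1, 2], [3, 4]]
for row in adjust_via_highest(small, highest=(8, 8), third=(4, 4)):
    print(row)
```

The same routine would be applied to the block to be decoded and to the reference block, so that both end up at the third resolution before the motion vector is determined.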

5. A video encoding method, comprising:

acquiring a first resolution adopted by a block to be encoded in a video frame to be encoded during encoding, a second resolution adopted by a reference block during encoding and a motion vector MV of the reference block, wherein the reference block is a block where a region of the block to be encoded, which is referred to in the encoded reference frame, is located, and the size of the block to be encoded is the same as the size of the region to be referred to;

acquiring a syntax element carried in data to be coded corresponding to the block to be coded, wherein the syntax element is an index mark aligned with a third resolution, and the index mark is used for indicating a scaling ratio of the third resolution;

the first resolution adopted by the block to be coded in the process of coding is adjusted to be the third resolution, so that a first reconstruction block is obtained, wherein the third resolution is different from the first resolution and the second resolution, and the target resolution is the third resolution;

under the condition that the first resolution is different from the second resolution, the second resolution adopted by the reference block in encoding is adjusted to the third resolution, and a second reconstruction block is obtained;

and determining a difference value between the motion vector MV of the first reconstruction block relative to the second reconstruction block and the motion vector predicted value MVP of the block to be encoded as motion vector data MVD of the block to be encoded, wherein the motion vector predicted value MVP of the block to be encoded is equal to the motion vector MV of the reference block.
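The encoder-side difference can be sketched in the same spirit, together with a round trip showing that a decoder adding the same predictor recovers the motion vector. The function names and sample values are hypothetical; the sketch assumes integer vector components and blocks already adjusted to the third resolution.

```python
from typing import Tuple

Vec = Tuple[int, int]  # (dx, dy)


def encode_mvd(mv: Vec, reference_mv: Vec) -> Vec:
    """Return MVD = MV - MVP, where MVP is taken to equal the reference block's MV."""
    mvp = reference_mv
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def decode_mv(mvd: Vec, reference_mv: Vec) -> Vec:
    """Inverse step on the decoding side: MV = MVP + MVD."""
    mvp = reference_mv
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


# Sample values chosen for illustration, not taken from the document.
mv = (5, 1)
reference_mv = (4, -2)
mvd = encode_mvd(mv, reference_mv)
print(mvd)                           # (1, 3)
print(decode_mv(mvd, reference_mv))  # (5, 1)
```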

6. The method of claim 5, wherein:

7. The method according to claim 5, wherein after determining a difference value between a motion vector MV of the first reconstructed block with respect to the second reconstructed block and a motion vector predictor MVP of the block to be encoded as motion vector data MVD of the block to be encoded, the method further comprises:

and adding the motion vector data MVD of the block to be encoded into the encoded data corresponding to the block to be encoded.

8. A video decoding apparatus, comprising:

a first obtaining unit, configured to obtain motion vector data MVD of a block to be decoded, which is carried in data to be decoded and corresponds to the block to be decoded in a video frame to be decoded, motion vector MV of a reference block, a first resolution adopted by the block to be decoded when decoding, and a second resolution adopted by the reference block when decoding, where the reference block is a block where a region of the block to be decoded that is referred to in a decoded reference frame is located, and a size of the block to be decoded is the same as a size of the referred region;

a first adjusting unit, configured to obtain a syntax element carried in data to be decoded corresponding to the block to be decoded, where the syntax element is an index flag aligned to a third resolution, and the index flag is used to indicate a scaling of the third resolution; under the condition that the first resolution and the second resolution are different, the first resolution adopted by the block to be decoded in decoding is adjusted to be the third resolution, so that a first reconstruction block is obtained, wherein the third resolution is different from the first resolution and the second resolution, and the target resolution is the third resolution; adjusting the second resolution adopted by the reference block in decoding to the third resolution to obtain a second reconstruction block;

and a first determining unit, configured to determine a sum of a motion vector predicted value MVP of the block to be decoded and motion vector data MVD of the block to be decoded as a motion vector MV of the first reconstructed block relative to the second reconstructed block, where the motion vector predicted value MVP of the block to be decoded is equal to the motion vector MV of the reference block.

9. The apparatus of claim 8, wherein:

the apparatus further comprises: a second determining unit, configured to determine a plurality of target video blocks before the motion vector MV of the reference block is acquired, where each target video block includes at least one pixel located in the referred region;

the first obtaining unit is specifically configured to: determine a motion vector of a first video block located in an upper left corner of the plurality of target video blocks as the motion vector of the reference block, wherein the first video block is the same size as the reference block; or determine a motion vector of a second video block located at a corner of the plurality of target video blocks as the motion vector of the referred region, wherein the second video block is the same size as the reference block; or determine a motion vector of a third video block with the largest area among the plurality of target video blocks as the motion vector of the reference block; or determine a weighted sum of the motion vectors of the plurality of target video blocks as the motion vector of the reference block.

10. A video encoding apparatus, comprising:

a first obtaining unit, configured to obtain a first resolution adopted by a block to be encoded in a video frame to be encoded during encoding, a second resolution adopted by a reference block during encoding, and a motion vector MV of the reference block, where the reference block is a block where a region of the block to be encoded that is referred to in an encoded reference frame is located, and a size of the block to be encoded is the same as a size of the referred region;

a first adjusting unit, configured to obtain a syntax element carried in data to be encoded corresponding to the block to be encoded, where the syntax element is an index flag aligned to a third resolution, and the index flag is used to indicate a scaling of the third resolution; under the condition that the first resolution and the second resolution are different, the first resolution adopted by the block to be coded in coding is adjusted to be the third resolution, so that a first reconstruction block is obtained, wherein the third resolution is different from the first resolution and the second resolution, and the target resolution is the third resolution; adjusting the second resolution adopted by the reference block in encoding to the third resolution to obtain a second reconstruction block;

and a first determining unit, configured to determine, as motion vector data MVD of the block to be encoded, a difference value between a motion vector MV of the first reconstructed block relative to the second reconstructed block and a motion vector predictor MVP of the block to be encoded, where the motion vector predictor MVP of the block to be encoded is equal to the motion vector MV of the reference block.

11. A storage medium having a computer program stored therein, wherein the computer program when executed by a processor performs the method of any of claims 1 to 7.

12. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method according to any of the claims 1 to 7 by means of the computer program.