CN113329228A - Video encoding method, decoding method, device, electronic device and storage medium - Google Patents

Video encoding method, decoding method, device, electronic device and storage medium

Info

Publication number
CN113329228A
CN113329228A CN202110586627.XA
Authority
CN
China
Prior art keywords
video
data block
information
downsampling
decoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110586627.XA
Other languages
Chinese (zh)
Other versions
CN113329228B (en)
Inventor
何鸣
阮良
陈功
韩庆瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Netease Zhiqi Technology Co Ltd
Original Assignee
Hangzhou Langhe Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Langhe Technology Co Ltd filed Critical Hangzhou Langhe Technology Co Ltd
Priority to CN202110586627.XA priority Critical patent/CN113329228B/en
Publication of CN113329228A publication Critical patent/CN113329228A/en
Application granted granted Critical
Publication of CN113329228B publication Critical patent/CN113329228B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N 19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N 19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N 19/96 Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The method performs scene analysis on an acquired video frame to determine whether the video frame contains video content of a specified type, partitions the video frame into at least one data block based on that determination, decides for each data block whether to apply downsampling processing before encoding according to whether the data block contains the specified type of video content, and encodes each data block according to that decision. In this way, selected data blocks can be downsampled before encoding, which shrinks the data blocks and reduces the amount of video to be encoded, thereby effectively improving encoding efficiency.

Description

Video encoding method, decoding method, device, electronic device and storage medium
Technical Field
The present disclosure relates to the field of video encoding and compression technologies, and in particular, to a video encoding method, a video decoding method, an apparatus, an electronic device, and a storage medium.
Background
An important objective of video codec techniques is to compress video data to a lower bit rate while avoiding, or at least minimizing, loss of video quality. In the related art, temporal and spatial redundancy in video is removed by two means: prediction and quantization. However, the coding efficiency of this scheme still needs improvement.
Disclosure of Invention
The disclosed embodiments provide a video encoding method, a video decoding method, an apparatus, an electronic device, and a storage medium, so as to solve the problem of relatively low encoding efficiency in the related art.
In a first aspect, an embodiment of the present disclosure provides a video encoding method, including:
performing scene analysis on the acquired video frame to determine whether the video frame contains video content of a specified type;
partitioning the video frame to obtain at least one data block based on whether the video frame contains video content of a specified type or not, wherein the at least one data block contains a data block which does not contain the video content of the specified type;
determining whether to perform downsampling processing before encoding according to whether each data block contains the video content of the specified type;
according to the result of whether downsampling processing is performed before encoding or not, encoding processing is performed on the data block to obtain prediction information and video information of the data block;
and sending video decoding information, wherein the video decoding information at least comprises downsampling indication information, prediction information and video information of the data block, and the downsampling indication information is used for indicating whether downsampling processing is carried out on the data block.
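The first-aspect steps above amount to a per-block decision loop. The sketch below is a hypothetical illustration only: `contains_specified_content`, the 2:1 slice standing in for downsampling, and the dictionary fields standing in for the downsampling indication information, prediction information and video information are all assumptions, not the patented implementation.

```python
def encode_frame(frame_blocks, contains_specified_content):
    """frame_blocks: list of data blocks; contains_specified_content: a
    predicate standing in for the scene-analysis result for each block."""
    video_decoding_info = []
    for block in frame_blocks:
        # Downsample before encoding only when the block lacks the
        # specified type of video content.
        downsample = not contains_specified_content(block)
        payload = block[::2] if downsample else block  # placeholder 2:1 decimation
        video_decoding_info.append({
            "downsample_flag": downsample,  # downsampling indication information
            "payload": payload,             # stands in for prediction + video info
        })
    return video_decoding_info
```

A block flagged for downsampling is halved before "encoding", and the flag travels with it so a decoder knows to upsample after decoding.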
In some possible embodiments, blocking the video frame based on whether the video frame contains a specified type of video content to obtain at least one data block includes:
when the video frame does not contain the video content of the specified type, taking the video frame as a data block;
when the video frame contains the video content of the specified type, the video frame is divided into at least two data blocks based on the area where the video content of the specified type is located.
In some possible embodiments, dividing the video frame into at least two data blocks based on the area in which the specified type of video content is located includes:
and dividing the video frame into at least two data blocks in the horizontal direction based on the area where the video content of the specified type is located, so that the lengths of the data blocks are the same.
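Dividing the frame in the horizontal direction around the region holding the specified content, so that every resulting data block keeps the same length, might look like the following sketch. Representing the frame as a list of pixel rows, and the function name, are illustrative assumptions; the disclosure does not prescribe a data layout.

```python
def split_by_region(frame, region_top, region_bottom):
    """Split a frame (list of pixel rows) into horizontal bands around the
    row range [region_top, region_bottom); every band keeps the full frame
    width, so all data blocks have the same length."""
    bands = []
    if region_top > 0:
        bands.append(frame[:region_top])           # band above the region
    bands.append(frame[region_top:region_bottom])  # band holding the content
    if region_bottom < len(frame):
        bands.append(frame[region_bottom:])        # band below the region
    return bands
```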
In some possible embodiments, when the data block does not contain the specified type of video content, downsampling processing is performed before encoding, and
According to the result of determining whether to perform downsampling processing before encoding, encoding the data block to obtain the prediction information and video information of the data block, the method comprises the following steps:
performing downsampling processing on the data block, and performing downsampling processing on a coding reference area of the data block to enable the size of the coding reference area to be matched with the size of the data block, wherein the coding reference area refers to a coded area which is referred to when the data block is coded;
coding the data block after the down-sampling processing by using the coding reference area after the down-sampling processing to obtain the prediction information and the video information of the data block; and
the video decoding information further comprises sample description information of the data block.
In some possible embodiments, downsampling the encoded reference area of the data block includes:
when the coding reference area is a reference line, performing downsampling processing on the coding reference area in the horizontal direction, wherein the downsampling proportion in the horizontal direction is equal to that of the data block in the horizontal direction;
and when the coding reference area is a reference block, respectively performing downsampling processing on the coding reference area in the horizontal direction and the vertical direction, wherein the downsampling proportion in the horizontal direction is equal to that of the data block in the horizontal direction, and the downsampling proportion in the vertical direction is equal to that of the data block in the vertical direction.
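The two reference-area cases above can be sketched as follows, assuming integer sampling ratios and plain decimation (the actual filter used for downsampling is not specified in the disclosure):

```python
def downsample_line(line, ratio_h):
    # Reference line: downsample horizontally only, with the same
    # horizontal ratio as the data block itself.
    return line[::ratio_h]

def downsample_block(block, ratio_h, ratio_v):
    # Reference block: downsample in both directions, each ratio
    # matching the data block's ratio in that direction.
    return [row[::ratio_h] for row in block[::ratio_v]]
```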
In some possible embodiments, the encoding processing the data block after the downsampling processing by using the encoding reference area after the downsampling processing to obtain the prediction information and the video information of the data block includes:
dividing the data block after the downsampling processing into a plurality of coding tree units according to the given size of the coding tree unit supported by a coding end;
respectively coding each coding tree unit by using the coding reference area subjected to the down-sampling processing to obtain the prediction information and the video information of the data block; and
the video decoding information further includes coding tree unit partition information of the data block.
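Tiling the downsampled data block into coding tree units of the size supported by the encoding end can be sketched as below; the function name and the list-of-rows block representation are assumptions for illustration.

```python
def split_into_ctus(block, ctu_size):
    """Tile a data block (list of pixel rows) into coding tree units of
    the given size; CTUs on the right and bottom edges may be smaller."""
    ctus = []
    for top in range(0, len(block), ctu_size):
        for left in range(0, len(block[0]), ctu_size):
            ctus.append([row[left:left + ctu_size]
                         for row in block[top:top + ctu_size]])
    return ctus
```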
In some possible embodiments, when the block of data contains the specified type of video content, no downsampling is performed prior to encoding, and
according to the result of determining whether to perform downsampling processing before encoding, encoding the data block to obtain the prediction information and video information of the data block, the method comprises the following steps:
when the coding reference area of the data block has been subjected to downsampling processing, performing upsampling processing on the coding reference area so that the size of the coding reference area matches the size of the data block, and coding the data block by using the upsampled coding reference area to obtain the video information of the data block, wherein the coding reference area refers to a coded area that is referred to when the data block is coded;
and when the coding reference area of the data block is not subjected to downsampling processing, the coding reference area is used for coding the data block to obtain the video information of the data block.
In some possible embodiments, the upsampling process on the coded reference region includes:
when the coding reference area is a reference line, performing up-sampling processing on the coding reference area in the horizontal direction, wherein the up-sampling proportion in the horizontal direction is equal to the reciprocal of the down-sampling proportion of the coding reference area in the horizontal direction;
and when the coding reference area is a reference block, respectively performing upsampling processing on the coding reference area in the horizontal direction and the vertical direction, wherein the upsampling proportion in the horizontal direction is equal to the reciprocal of the downsampling proportion of the coding reference area in the horizontal direction, and the upsampling proportion in the vertical direction is equal to the reciprocal of the downsampling proportion of the coding reference area in the vertical direction.
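Restoring a previously downsampled reference area by the reciprocal ratio can be sketched with nearest-neighbour repetition; the interpolation method is an assumption, since the disclosure only fixes the ratio, not the filter.

```python
def upsample_line(line, down_ratio_h):
    # The upsampling ratio is the reciprocal of the earlier downsampling
    # ratio, so the line returns to its pre-downsampling length.
    return [v for v in line for _ in range(down_ratio_h)]

def upsample_block(block, down_ratio_h, down_ratio_v):
    # Upsample horizontally, then repeat each row vertically by the
    # reciprocal of the vertical downsampling ratio.
    rows = [upsample_line(r, down_ratio_h) for r in block]
    return [list(row) for row in rows for _ in range(down_ratio_v)]
```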
In some possible embodiments, the specified type of video content refers to video content in the video frame whose motion amplitude, relative to an adjacent video frame, is higher than a set amplitude, or video content in the video frame whose texture complexity is higher than a set complexity.
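One plausible reading of "motion amplitude" and "texture complexity" is mean absolute frame difference and mean absolute horizontal gradient; both measures, and the thresholds, are illustrative assumptions rather than definitions from the disclosure.

```python
def motion_amplitude(block, prev_block):
    # Mean absolute difference against the co-located block in the
    # adjacent frame: a crude motion-amplitude measure.
    diffs = [abs(a - b)
             for ra, rb in zip(block, prev_block)
             for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def texture_complexity(block):
    # Mean absolute horizontal gradient: a crude texture measure.
    grads = [abs(row[i + 1] - row[i])
             for row in block for i in range(len(row) - 1)]
    return sum(grads) / len(grads)

def is_specified_content(block, prev_block, amp_thresh, cplx_thresh):
    # A block is "specified content" if either measure exceeds its threshold.
    return (motion_amplitude(block, prev_block) > amp_thresh
            or texture_complexity(block) > cplx_thresh)
```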
In a second aspect, an embodiment of the present disclosure provides a video decoding method, including:
receiving video decoding information, wherein the video decoding information at least comprises downsampling indication information, prediction information and video information of a data block to be decoded, and the downsampling indication information is used for indicating whether downsampling processing is carried out on the corresponding data block or not;
decoding according to the downsampling indication information, the prediction information and the video information to obtain a decoding area;
according to the downsampling indication information, whether upsampling processing is carried out after decoding is determined;
and determining the corresponding data block according to the result of whether the up-sampling processing is carried out after the decoding and the decoding area.
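The second-aspect flow mirrors the encoder: decode each block, then upsample only the blocks whose downsampling indication information says they were downsampled. In this hypothetical sketch, each record's `downsample_flag` and `payload` fields, and the fixed 2:1 reciprocal upsampling, are assumptions.

```python
def decode_blocks(video_decoding_info):
    """Decode each per-block record; upsample after decoding only when the
    downsampling indication flag is set."""
    blocks = []
    for item in video_decoding_info:
        decoded = list(item["payload"])  # stands in for entropy decoding + prediction
        if item["downsample_flag"]:
            # Upsample by the reciprocal of the encoder's 2:1 downsampling.
            decoded = [v for v in decoded for _ in range(2)]
        blocks.append(decoded)
    return blocks
```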
In some possible embodiments, when the downsampling indication information indicates that the corresponding data block is downsampled, the video decoding information further includes sample description information of the corresponding data block, and
decoding according to the downsampling indication information, the prediction information and the video information to obtain a decoding area, comprising:
performing downsampling processing on a decoding reference area of the corresponding data block based on the sampling description information to enable the size of the decoding reference area to be matched with the size of the corresponding data block, wherein the decoding reference area refers to a decoded area which is referred to when the corresponding data block is decoded;
and decoding by using the decoding reference area, the prediction information and the video information after the down-sampling processing to obtain a decoding area.
In some possible embodiments, the downsampling the decoding reference area of the corresponding data block based on the sampling description information includes:
when the decoding reference area is a reference line, performing downsampling processing on the decoding reference area in the horizontal direction, wherein the downsampling proportion in the horizontal direction is equal to the downsampling proportion of the corresponding data block in the horizontal direction;
and when the decoding reference area is a reference block, respectively performing downsampling processing on the decoding reference area in the horizontal direction and the vertical direction, wherein the downsampling proportion in the horizontal direction is equal to the downsampling proportion of the corresponding data block in the horizontal direction, and the downsampling proportion in the vertical direction is equal to the downsampling proportion of the corresponding data block in the vertical direction.
In some possible embodiments, the video decoding information further includes coding tree unit partition information of the corresponding data block, and
Decoding the decoded reference region, the prediction information, and the video information after downsampling to obtain a decoded region, including:
determining a coding tree unit contained in the corresponding data block according to the coding tree unit division information;
and respectively decoding each coding tree unit by using the decoding reference area, the prediction information and the video information after the down-sampling processing to obtain the decoding area.
In some possible embodiments, when the downsampling indication information indicates that the corresponding data block has been subjected to downsampling processing, upsampling processing is performed after decoding, and
determining a corresponding data block according to a result of whether the upsampling processing is performed after decoding and the decoding area, including:
and performing upsampling processing on the decoding area based on the sampling description information of the corresponding data block to obtain the corresponding data block.
In some possible embodiments, upsampling the decoding area based on the sampling description information of the corresponding data block to obtain the corresponding data block includes:
performing upsampling processing on the decoding area in a horizontal direction, wherein an upsampling proportion in the horizontal direction is equal to the reciprocal of a downsampling proportion of a corresponding data block in the horizontal direction;
and performing upsampling processing on the decoding area in the vertical direction, wherein the upsampling proportion in the vertical direction is equal to the reciprocal of the downsampling proportion of the corresponding data block in the vertical direction.
In some possible embodiments, when the downsampling indication information indicates that the corresponding data block has not been downsampled,
decoding according to the downsampling indication information, the prediction information and the video information to obtain a decoding area, comprising:
and decoding by using the decoding reference area of the corresponding data block, the prediction information and the video information to obtain the decoding area, wherein the decoding reference area refers to a decoded area which is referred to when the corresponding data block is decoded.
In some possible embodiments, when the downsampling indication information indicates that the corresponding data block has not been subjected to downsampling processing, no upsampling processing is performed after decoding, and
determining a corresponding data block according to a result of whether the upsampling processing is performed after decoding and the decoding area, including:
determining the decoding area as a corresponding data block.
In a third aspect, an embodiment of the present disclosure provides a video encoding apparatus, including:
the analysis unit is used for carrying out scene analysis on the acquired video frames so as to determine whether the video frames contain video contents of a specified type;
a blocking unit, configured to block the video frame to obtain at least one data block based on whether the video frame includes a video content of a specified type, where the at least one data block includes a data block that does not include the video content of the specified type;
a determining unit, configured to determine whether to perform downsampling processing before encoding according to whether each data block contains the specified type of video content;
the encoding unit is used for encoding the data block according to the result of determining whether downsampling processing is carried out before encoding to obtain the prediction information and the video information of the data block;
a sending unit, configured to send video decoding information, where the video decoding information at least includes downsampling indication information of the data block, prediction information, and video information, and the downsampling indication information is used to indicate whether downsampling processing is performed on the data block.
In some possible embodiments, the blocking unit is specifically configured to:
when the video frame does not contain the video content of the specified type, taking the video frame as a data block;
when the video frame contains the video content of the specified type, the video frame is divided into at least two data blocks based on the area where the video content of the specified type is located.
In some possible embodiments, the blocking unit is specifically configured to:
and dividing the video frame into at least two data blocks in the horizontal direction based on the area where the video content of the specified type is located, so that the lengths of the data blocks are the same.
In some possible embodiments, when the data block does not contain the specified type of video content, downsampling processing is performed before encoding, and
The encoding unit is specifically configured to:
performing downsampling processing on the data block, and performing downsampling processing on a coding reference area of the data block to enable the size of the coding reference area to be matched with the size of the data block, wherein the coding reference area refers to a coded area which is referred to when the data block is coded;
coding the data block after the down-sampling processing by using the coding reference area after the down-sampling processing to obtain the prediction information and the video information of the data block; and
the video decoding information further comprises sample description information of the data block.
In some possible embodiments, the encoding unit is specifically configured to:
when the coding reference area is a reference line, performing downsampling processing on the coding reference area in the horizontal direction, wherein the downsampling proportion in the horizontal direction is equal to that of the data block in the horizontal direction;
and when the coding reference area is a reference block, respectively performing downsampling processing on the coding reference area in the horizontal direction and the vertical direction, wherein the downsampling proportion in the horizontal direction is equal to that of the data block in the horizontal direction, and the downsampling proportion in the vertical direction is equal to that of the data block in the vertical direction.
In some possible embodiments, the encoding unit is specifically configured to:
dividing the data block after the downsampling processing into a plurality of coding tree units according to the given size of the coding tree unit supported by a coding end;
respectively coding each coding tree unit by using the coding reference area subjected to the down-sampling processing to obtain the prediction information and the video information of the data block;
the video decoding information further includes coding tree unit partition information of the data block.
In some possible embodiments, when the block of data contains the specified type of video content, no downsampling is performed prior to encoding, and
the encoding unit is specifically configured to:
when the coding reference area of the data block has been subjected to downsampling processing, performing upsampling processing on the coding reference area so that the size of the coding reference area matches the size of the data block, and coding the data block by using the upsampled coding reference area to obtain the video information of the data block, wherein the coding reference area refers to a coded area that is referred to when the data block is coded;
and when the coding reference area of the data block is not subjected to downsampling processing, the coding reference area is used for coding the data block to obtain the video information of the data block.
In some possible embodiments, the encoding unit is specifically configured to:
when the coding reference area is a reference line, performing up-sampling processing on the coding reference area in the horizontal direction, wherein the up-sampling proportion in the horizontal direction is equal to the reciprocal of the down-sampling proportion of the coding reference area in the horizontal direction;
and when the coding reference area is a reference block, respectively performing upsampling processing on the coding reference area in the horizontal direction and the vertical direction, wherein the upsampling proportion in the horizontal direction is equal to the reciprocal of the downsampling proportion of the coding reference area in the horizontal direction, and the upsampling proportion in the vertical direction is equal to the reciprocal of the downsampling proportion of the coding reference area in the vertical direction.
In some possible embodiments, the specified type of video content refers to video content in the video frame whose motion amplitude, relative to an adjacent video frame, is higher than a set amplitude, or video content in the video frame whose texture complexity is higher than a set complexity.
In a fourth aspect, an embodiment of the present disclosure provides a video decoding apparatus, including:
the video decoding device comprises a receiving unit, a decoding unit and a decoding unit, wherein the receiving unit is used for receiving video decoding information, the video decoding information at least comprises downsampling indication information, prediction information and video information of a data block to be decoded, and the downsampling indication information is used for indicating whether downsampling processing is carried out on the corresponding data block or not;
a decoding unit, configured to perform decoding processing according to the downsampling indication information, the prediction information, and the video information to obtain a decoded area;
the sampling unit is used for determining whether to carry out up-sampling processing after decoding according to the down-sampling indication information;
and the determining unit is used for determining the corresponding data block according to the result of whether the upsampling processing is carried out after the decoding and the decoding area.
In some possible embodiments, when the downsampling indication information indicates that the corresponding data block is downsampled, the video decoding information further includes sample description information of the corresponding data block, and
the decoding unit is specifically configured to:
performing downsampling processing on a decoding reference area of the corresponding data block based on the sampling description information to enable the size of the decoding reference area to be matched with the size of the corresponding data block, wherein the decoding reference area refers to a decoded area which is referred to when the corresponding data block is decoded;
and decoding by using the decoding reference area, the prediction information and the video information after the down-sampling processing to obtain a decoding area.
In some possible embodiments, the decoding unit is specifically configured to:
when the decoding reference area is a reference line, performing downsampling processing on the decoding reference area in the horizontal direction, wherein the downsampling proportion in the horizontal direction is equal to the downsampling proportion of the corresponding data block in the horizontal direction;
and when the decoding reference area is a reference block, respectively performing downsampling processing on the decoding reference area in the horizontal direction and the vertical direction, wherein the downsampling proportion in the horizontal direction is equal to the downsampling proportion of the corresponding data block in the horizontal direction, and the downsampling proportion in the vertical direction is equal to the downsampling proportion of the corresponding data block in the vertical direction.
In some possible embodiments, the video decoding information further includes coding tree unit partition information of the corresponding data block, and
The decoding unit is specifically configured to:
determining a coding tree unit contained in the corresponding data block according to the coding tree unit division information;
and respectively decoding each coding tree unit by using the decoding reference area, the prediction information and the video information after the down-sampling processing to obtain the decoding area.
In some possible embodiments, when the downsampling indication information indicates that the corresponding data block has been subjected to downsampling processing, upsampling processing is performed after decoding, and
the determining unit is specifically configured to perform upsampling processing on the decoding area based on the sampling description information of the corresponding data block to obtain the corresponding data block.
In some possible embodiments, the determining unit is specifically configured to:
performing upsampling processing on the decoding area in a horizontal direction, wherein an upsampling proportion in the horizontal direction is equal to the reciprocal of a downsampling proportion of a corresponding data block in the horizontal direction;
and performing upsampling processing on the decoding area in the vertical direction, wherein the upsampling proportion in the vertical direction is equal to the reciprocal of the downsampling proportion of the corresponding data block in the vertical direction.
In some possible embodiments, when the downsampling indication information indicates that the corresponding data block has not been subjected to downsampling processing, the decoding unit is specifically configured to:
and decoding by using the decoding reference area of the corresponding data block, the prediction information and the video information to obtain the decoding area, wherein the decoding reference area refers to a decoded area which is referred to when the corresponding data block is decoded.
In some possible embodiments, when the downsampling indication information indicates that the corresponding data block has not been subjected to downsampling processing, no upsampling processing is performed after decoding, and
the determining unit is specifically configured to determine the decoding area as a corresponding data block.
In a fifth aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor, and a memory communicatively coupled to the at least one processor, wherein:
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the video encoding method or the video decoding method described above.
In a sixth aspect, an embodiment of the present disclosure provides a storage medium storing instructions that, when executed by a processor of an electronic device, enable the electronic device to perform the video encoding method or the video decoding method described above.
In the embodiments of the present disclosure, scene analysis is performed on an acquired video frame to determine whether the video frame contains video content of a specified type, the video frame is partitioned into at least one data block based on that determination, whether to perform downsampling processing before encoding is decided for each data block according to whether it contains the specified type of video content, and each data block is encoded according to that decision. In this way, selected data blocks can be downsampled before encoding, which shrinks the data blocks and reduces the amount of video to be encoded, thereby effectively improving encoding efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this disclosure, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure and not to limit the disclosure. In the drawings:
fig. 1 is a schematic diagram of a video encoding process provided in an embodiment of the present disclosure;
fig. 2 is a schematic diagram of a video decoding process provided in an embodiment of the present disclosure;
fig. 3 is a flowchart of a video encoding method according to an embodiment of the disclosure;
fig. 4 is a schematic diagram illustrating a video frame division according to an embodiment of the disclosure;
fig. 5 is a schematic diagram illustrating a division of a video frame according to another embodiment of the present disclosure;
fig. 6 is a flowchart of a method for encoding a data block according to an embodiment of the present disclosure;
fig. 7 is a flowchart of a further method for encoding a data block according to an embodiment of the present disclosure;
fig. 8 is a schematic diagram of another video encoding process provided by an embodiment of the present disclosure;
fig. 9 is a schematic diagram illustrating a division of a video frame according to another embodiment of the present disclosure;
fig. 10 is a flowchart of a video decoding method according to an embodiment of the disclosure;
fig. 11 is a flowchart of a method for decoding a data block according to an embodiment of the present disclosure;
fig. 12 is a schematic diagram of another video decoding process provided by the embodiment of the present disclosure;
fig. 13 is a schematic structural diagram of a video encoding apparatus according to an embodiment of the present disclosure;
fig. 14 is a schematic structural diagram of a video decoding apparatus according to an embodiment of the present disclosure;
fig. 15 is a schematic hardware structure diagram of an electronic device for implementing a video encoding method or a video decoding method according to an embodiment of the present disclosure.
Detailed Description
In order to solve the problem of relatively low coding efficiency in the related art, embodiments of the present disclosure provide a video coding method, a decoding method, an apparatus, an electronic device, and a storage medium.
The preferred embodiments of the present disclosure will be described below with reference to the accompanying drawings of the specification, it being understood that the preferred embodiments described herein are merely for illustrating and explaining the present disclosure, and are not intended to limit the present disclosure, and that the embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict.
To facilitate understanding of the present disclosure, the technical terms involved in the present disclosure are explained below:
a Coding Tree Unit (CTU), an independent coding unit in the coding process, may be recursively divided into Coding Units (CUs).
A block (patch): a video frame may include one or more patches, and a patch may include multiple CTUs while being smaller than or equal to the video frame in size.
Downsampling (downsample), which refers to reducing the size of an image by a certain ratio.
Upsampling (upsample), which refers to enlarging the size of an image by a certain ratio.
Scale (scale), which represents a down-sampling ratio mode.
Horizontal scale (h_scale), which represents the down-sampling ratio in the horizontal direction.
Vertical scale (v_scale), which represents the down-sampling ratio in the vertical direction.
Reference frame (ref), which represents a reference frame in the reference frame list.
Fig. 1 is a schematic diagram of a video encoding process provided by an embodiment of the present disclosure, which mainly includes prediction, transform-and-quantization, and entropy-encoding modules. The prediction module is configured to perform predictive encoding on input video information to obtain predicted video information and prediction information including a prediction flag and a motion vector; here, the reference video information for inter-frame prediction is obtained by performing inverse quantization and inverse transform on encoded frame information. The transform-and-quantization module is configured to transform and quantize the residual between the predicted video information and the input video information to obtain quantization information. The entropy-encoding module is configured to encode the prediction information, the quantization information, and header information such as the resolution and key frames, to obtain and output code stream information.
Fig. 2 is a schematic diagram of a video decoding process provided by an embodiment of the present disclosure, which mainly includes entropy-decoding, inverse-quantization-and-inverse-transform, and prediction modules. The entropy-decoding module is configured to decode the code stream information to obtain the prediction information, the quantization information, and the header information. The prediction module is configured to restore the predicted video information according to the prediction information; here, the reference video information used in inter-frame prediction comes from decoded video frames. The inverse-quantization-and-inverse-transform module is configured to recover the residual video information from the quantization information. Finally, the video information is reconstructed and output using the residual video information, the predicted video information, and the header information.
In the related art, the redundancy of video in time and space is eliminated by two means, namely prediction and quantization. But the coding efficiency of this scheme still needs to be improved.
To this end, the present disclosure provides a video encoding method, in which scene analysis is performed on an acquired video frame to determine whether the video frame includes video content of a specified type, the video frame is divided into at least one data block based on whether it includes the video content of the specified type, whether downsampling processing is to be performed before encoding is determined according to whether each data block contains the video content of the specified type, and the data block is encoded according to the result of that determination. In this way, some data blocks can be selectively downsampled before encoding to compress their size and reduce the amount of video to be encoded, so that the encoding efficiency can be effectively improved.
Fig. 3 is a flowchart of a video encoding method provided in an embodiment of the present disclosure, where the method includes the following steps.
In step S301, scene analysis is performed on the acquired video frames to determine whether the video frames contain video content of a specified type.
In specific implementation, the video content of the specified type may refer to video content whose motion amplitude relative to an adjacent video frame is higher than a set amplitude, that is, moving video content; it may also refer to video content whose texture complexity is higher than a set complexity, that is, complex video content.
Moreover, the video content of the specified type may differ across scenes; for example, it may be a person in a monitoring scene and a vehicle in a vehicle detection scene. Which video content is selected as the specified type is determined by a technician according to the actual scene, and is not described in detail here.
In addition, methods of performing scene analysis include, but are not limited to, saliency detection, optical flow detection, frame difference detection, and object detection.
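As a rough illustration of the frame-difference method listed above, motion can be flagged by thresholding pixel differences between consecutive frames; the function name and both thresholds below are illustrative assumptions, not part of the disclosed method:

```python
import numpy as np

def contains_motion(prev_frame, cur_frame, diff_threshold=25, motion_ratio=0.02):
    """Frame-difference scene analysis sketch: the frame is deemed to contain
    'specified type' (moving) content when the fraction of pixels whose
    absolute difference from the previous frame exceeds diff_threshold is
    larger than motion_ratio. Thresholds here are illustrative only."""
    diff = np.abs(cur_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return np.count_nonzero(diff > diff_threshold) / diff.size > motion_ratio
```

A production analyzer would typically combine several of the listed detectors (e.g. optical flow plus object detection) rather than rely on raw frame differences alone.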
In step S302, the video frame is blocked to obtain at least one data block based on whether the video frame contains the video content of the specified type, wherein the at least one data block contains a data block that does not contain the video content of the specified type.
In some possible embodiments, the video frame does not contain video content of a specified type, and in this case, the video frame may be treated as a data block in order to compress the video to the maximum extent.
In other possible embodiments, the video frame contains video content of the specified type. In this case, in order to compress the video content of non-specified types in the video frame to the maximum extent, the video frame may be divided into at least two data blocks based on the area where the video content of the specified type is located, where the at least two data blocks include a data block that does not contain the video content of the specified type.
For example, based on the area where the video content of the specified type is located, the video frame is divided into at least two data blocks in the horizontal direction, so that the data blocks all have the same width. In this way, the dividing direction of the data blocks is consistent with the encoding direction of the video frame, giving good compatibility and a good encoding effect.
Assuming that the area in which the specified type of video content is located is in the upper half of the video frame, the video frame may be divided into two data blocks in the horizontal direction. Referring to fig. 4, assuming that a rectangular frame indicates an area where video content of a specified type is located, at this time, a video frame may be divided into a data block 1 located in an upper half and a data block 2 located in a lower half along a lower boundary of the area where the rectangular frame is located, where the data block 1 contains video content of the specified type, and the data block 2 does not contain video content of the specified type.
Assuming that the region of the specified type of video content is located in the middle portion of the video frame, the video frame may be divided into three data blocks in the horizontal direction. Referring to fig. 5, assuming that a rectangular frame indicates an area where video content of a specified type is located, at this time, a video frame may be divided into a data block 3 located in an upper half portion, a data block 4 located in a middle portion, and a data block 5 located in a lower half portion along an upper boundary and a lower boundary of the area where the rectangular frame is located, where the data block 3 and the data block 5 do not contain video content of the specified type, and the data block 4 contains video content of the specified type.
It should be noted that fig. 4 and fig. 5 are only examples, and in a specific implementation, the area where the rectangular frame is located may also be divided into a plurality of data blocks, and a specific division rule is determined by a skilled person according to an actual situation, and is not described herein again.
In addition, the video frame may be divided into at least two data blocks in the vertical direction based on the area where the video content of the specified type is located, and the heights of the data blocks may be made the same.
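The horizontal divisions of figs. 4 and 5 can be sketched as splitting along the boundaries of the region containing the specified-type content; the function name and tuple layout below are illustrative assumptions:

```python
def split_rows(frame_height, region_top, region_bottom):
    """Divide a frame into horizontal data blocks along the upper and lower
    boundaries of the region holding the specified-type content.
    Returns (top, bottom, contains_specified_content) tuples."""
    blocks = []
    if region_top > 0:                        # data block above the region
        blocks.append((0, region_top, False))
    blocks.append((region_top, region_bottom, True))
    if region_bottom < frame_height:          # data block below the region
        blocks.append((region_bottom, frame_height, False))
    return blocks
```

With the region in the upper half this yields the two blocks of fig. 4; with the region in the middle, the three blocks of fig. 5.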
In step S303, it is determined whether or not the downsampling process is performed before encoding, according to whether or not each data block contains video content of a specified type.
In specific implementation, when the data block does not contain the video content of the specified type, downsampling processing is carried out before encoding; when the data block contains a specified type of video content, no downsampling process is performed prior to encoding. Therefore, different coding strategies are executed on different data blocks in the video frame, so that the code rate can be effectively saved at the frame level, and the coding efficiency is improved.
In step S304, the data block is encoded according to the result of determining whether to perform downsampling processing before encoding, and prediction information and video information of the data block are obtained.
The prediction information of the data block includes information such as inter-frame prediction, a reference video frame index, intra-frame prediction, and a prediction mode; the video information includes, for example, quantization information of the data block.
In the first case: the data block does not contain video content of a specified type, and downsampling processing is performed before encoding.
In this case, the video decoding information further includes sampling description information of the data block. The sampling description information includes a sampling ratio, a down-sampling mode, and the like, wherein each down-sampling mode corresponds to a down-sampling ratio, and the number of down-sampling modes may be a power of 2 in order to save code rate.
It should be noted that this presentation form of the sampling description information is only an example; any form is acceptable as long as the decoding end can learn the down-sampling ratio applied by the encoding end to the data block, and the forms are not enumerated here.
In specific implementation, when encoding a data block according to the result of determining whether to perform downsampling processing before encoding to obtain prediction information and video information of the data block, the process may be performed according to a flow shown in fig. 6, where the flow includes the following steps:
in step 601a, a downsampling process is performed on the data block.
In step 602a, a down-sampling process is performed on the coding reference area of the data block so that the size of the coding reference area matches the size of the data block.
The encoding reference area refers to an encoded area that is referred to when encoding the data block. Specifically:
when inter-frame coding is adopted, the coding reference area refers to an area matched with the data block in the coded reference frame, and at this time, downsampling processing can be performed on the reference frame firstly, and then the coding reference area can be selected from the reference frame after the downsampling processing.
When intra-frame coding is adopted, the encoding reference area refers to a region adjacent to the data block in the encoded region.
In particular, when the down-sampling processing is performed on the coding reference area of the data block:
if the coding reference area is a reference line, performing downsampling processing on the coding reference area in the horizontal direction, wherein the downsampling proportion in the horizontal direction is equal to the downsampling proportion of the data block in the horizontal direction;
and if the coding reference area is a reference block, respectively performing downsampling processing on the coding reference area in the horizontal direction and the vertical direction, wherein the downsampling ratio in the horizontal direction is equal to the downsampling ratio of the data block in the horizontal direction, and the downsampling ratio in the vertical direction is equal to the downsampling ratio of the data block in the vertical direction.
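As a rough illustration of steps 601a and 602a, nearest-neighbour decimation keeps the reference area's size matched to the downsampled data block; a real codec would use a proper resampling filter, and the integer-ratio `h_scale`/`v_scale` arguments are assumptions:

```python
import numpy as np

def downsample_reference(ref, h_scale, v_scale=1):
    """Decimate a reference line (1-D) or reference block (2-D) by the same
    ratios as the data block, so that their sizes stay matched."""
    if ref.ndim == 1:                   # reference line: horizontal only
        return ref[::h_scale]
    return ref[::v_scale, ::h_scale]    # reference block: both directions
```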
In step 603a, the data block after the down-sampling process is encoded using the encoding reference area after the down-sampling process, and prediction information and video information of the data block are obtained.
In a specific implementation, the data block after the downsampling process may be encoded by directly using the encoding reference area after the downsampling process, so as to obtain the prediction information and the video information of the data block.
Note that after the downsampling process is performed on the data block, the size of the CTUs originally divided in the data block changes, and CTUs of irregular size (i.e., a CTU size not supported by the encoding end) may be generated, thereby complicating encoding.
In order to solve this problem, in a specific implementation, the data block after the downsampling process may be divided into a plurality of CTUs according to the given size of the CTUs supported by the encoding end, the size of each CTU is made to be the given size, and then each CTU is encoded by using the encoding reference area after the downsampling process, so as to obtain the prediction information and the video information of the data block.
In this way, no CTUs of irregular size are generated, so the encoding is not complicated. In addition, in this case, the video decoding information further includes CTU partition information of the data block, and the CTU partition information is used to notify the decoding end of the CTU partition of the data block.
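Re-dividing the downsampled data block into encoder-supported CTUs amounts to laying a regular grid over the reduced block; the 64-pixel CTU size and the function name below are assumed examples:

```python
def ctu_grid(block_w, block_h, ctu_size=64):
    """Origins (x, y) of the CTUs covering a down-sampled data block, each
    CTU having the size supported by the encoding end; CTUs at the right
    and bottom edges are clipped to the block boundary by the encoder."""
    return [(x, y)
            for y in range(0, block_h, ctu_size)
            for x in range(0, block_w, ctu_size)]
```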
It should be noted that there is no strict precedence relationship between steps S601a and S602a.
In the second case: the data block contains video content of a specified type and is not downsampled prior to encoding.
In specific implementation, when encoding a data block according to a result of determining whether to perform downsampling processing before encoding to obtain prediction information and video information of the data block, there are two cases:
case 1: the coded reference area of the data block is downsampled.
At this time, the encoding may be performed according to a flow shown in fig. 7, which includes the following steps:
in step 701a, an up-sampling process is performed on the coding reference region of the data block to match the size of the coding reference region with the size of the data block.
If the coding reference area is a reference line, performing upsampling processing on the coding reference area in the horizontal direction, wherein the upsampling proportion in the horizontal direction is equal to the reciprocal of the downsampling proportion of the coding reference area in the horizontal direction;
if the coding reference area is a reference block, performing up-sampling processing on the coding reference area in a horizontal direction and a vertical direction respectively, wherein an up-sampling ratio in the horizontal direction is equal to the reciprocal of a down-sampling ratio in the horizontal direction of the coding reference area, and an up-sampling ratio in the vertical direction is equal to the reciprocal of a down-sampling ratio in the vertical direction of the coding reference area.
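Mirroring the previous case, the up-sampling of step 701a enlarges a previously down-sampled reference area by the reciprocal ratios; here nearest-neighbour repetition stands in for the interpolation filter an actual encoder would use:

```python
import numpy as np

def upsample_reference(ref, h_scale, v_scale=1):
    """Enlarge a down-sampled reference line (1-D) or block (2-D) back to
    the data block's size; the up-sampling ratio is the reciprocal of the
    earlier down-sampling ratio, i.e. sizes grow by h_scale / v_scale."""
    if ref.ndim == 1:
        return np.repeat(ref, h_scale)
    return np.repeat(np.repeat(ref, v_scale, axis=0), h_scale, axis=1)
```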
In step 702a, a data block is encoded using the upsampled coding reference region to obtain video information of the data block.
Case 2: the coded reference area of the data block is not downsampled.
At this time, the data block may be directly encoded using the encoding reference region of the data block, thereby obtaining video information of the data block.
In step S305, video decoding information is transmitted, wherein the video decoding information at least includes the downsampling indication information, the prediction information, and the video information of the data block, and the downsampling indication information is used to indicate whether the data block has been subjected to downsampling processing.
In the embodiment of the disclosure, scene analysis is performed on an acquired video frame to determine whether the video frame contains video content of a specified type, the video frame is divided into at least one data block based on whether the video frame contains the video content of the specified type, whether downsampling processing is performed before encoding is determined according to whether each data block contains the video content of the specified type, and the data block is encoded according to a result of whether downsampling processing is performed before encoding. Thus, some data blocks can be selectively downsampled before coding, the size of the data blocks can be compressed, and the amount of video to be coded can be reduced, so that the coding efficiency can be effectively improved.
Fig. 8 is a schematic diagram of another video encoding process provided by an embodiment of the present disclosure, which mainly includes scene-analysis, downsampling, prediction, transformation-and-quantization, and entropy-encoding modules. The scene-analysis module is configured to perform scene analysis on input video information to determine whether a video frame contains video content of a specified type, and to divide the video frame into normal coding patches and downsampling coding patches, where a normal coding patch contains the video content of the specified type and a downsampling coding patch does not. The prediction, transformation and quantization module then predicts, transforms and quantizes each normal coding patch to obtain its prediction and quantization information; here, the inter-frame reference video information is obtained by performing prediction, inverse quantization and inverse transformation on encoded frame information, and the intra-frame reference video information may be obtained by upsampling an encoded downsampling coding patch. The downsampling module downsamples (i.e., scales) each downsampling coding patch, and the prediction, transformation and quantization module predicts, transforms and quantizes the scaled downsampling coding patch to obtain its prediction and quantization information. Finally, the entropy-encoding module encodes the prediction and quantization information of the normal coding patches, the prediction and quantization information of the downsampling coding patches, and header information such as the resolution and key frames, to obtain and output the code stream information.
The following describes a video encoding method provided by the embodiments of the present disclosure with reference to specific embodiments.
The method comprises the following steps: the method includes performing scene analysis on the input video frame to determine whether the video frame contains video content of a specified type, and the scene analysis includes, but is not limited to, saliency detection, optical flow detection, frame difference detection, object detection, and the like.
Step two: referring to fig. 9, according to the scene analysis result of step one, the video frame is divided into patch_1, patch_2, …, patch_n in the horizontal direction. The width of each patch is equal to the pixel width of the video frame; the height of each of patch_1 to patch_n-1 is an integer multiple of the pixel height of the CTU supported by the encoding end; and the height of patch_n is equal to the pixel height of the video frame minus the sum of the pixel heights of patch_1 to patch_n-1. The height of each patch is at most the pixel height of the video frame and at least the pixel height of one CTU.
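The height constraints of step two can be sketched as follows, assuming the split boundaries from the scene analysis are snapped to CTU-row multiples (the snapping policy is an assumption; the disclosure only fixes the constraints themselves):

```python
def patch_heights(frame_h, ctu_h, boundaries):
    """Heights of patch_1..patch_n: every patch except the last is an
    integer multiple of the CTU height (at least one CTU row), and the
    last patch takes the remaining rows of the frame."""
    heights, top = [], 0
    for b in sorted(boundaries):
        snapped = max(ctu_h, round((b - top) / ctu_h) * ctu_h)
        heights.append(snapped)
        top += snapped
    heights.append(frame_h - top)  # patch_n: frame height minus the rest
    return heights
```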
Step three: patch_1, patch_2, …, patch_n obtained in step two are divided into downsampling coding patches and normal coding patches according to the rule that a patch containing the video content of the specified type is a normal coding patch and a patch not containing the video content of the specified type is a downsampling coding patch. Referring to fig. 9, patch_2 is a downsampling coding patch and the remaining patches are normal coding patches; fig. 9 also shows the size of patch_2 before and after downsampling.
Step four: referring to fig. 8, the downsampling coding patch (i.e., patch_2) obtained in step three is downsampled, with a scaling factor of h_scale in the horizontal direction and v_scale in the vertical direction. The downsampled patch_2 is then re-divided into CTUs whose size is the CTU size supported by the current encoding end, and prediction, transformation and quantization operations are performed in sequence on the re-divided CTUs to obtain the prediction and quantization information of the downsampled patch_2.
In the intra-frame prediction encoding process of patch_2, the encoding reference area (a pixel block or a pixel line) used in patch_1 is scaled by the same factor: if the encoding reference area is a pixel line, it is downsampled in the horizontal direction by h_scale; if the encoding reference area is a pixel block, it is scaled in the horizontal and vertical directions by h_scale and v_scale, respectively. In the inter-frame prediction encoding process of patch_2, downsampling may first be performed on ref_1, ref_2, …, ref_m in the reference video frame list, with a scaling factor of h_scale in the horizontal direction and v_scale in the vertical direction, and the encoding reference area is then selected from the scaled reference video frames.
Step five: the normal coding patch obtained in step three is predicted, transformed and quantized to obtain its prediction and quantization information. In specific implementation, if the patch preceding a normal coding patch is a downsampling coding patch, such as patch_3 shown in fig. 9, the encoding reference area (a pixel block or a pixel line) used by intra-frame and inter-frame references in the preceding patch is upsampled: if the encoding reference area is a pixel line, it is upsampled in the horizontal direction by a factor of 1/h_scale; if the encoding reference area is a pixel block, it is upsampled by 1/h_scale in the horizontal direction and 1/v_scale in the vertical direction. Upsampling methods include, but are not limited to, nearest-neighbor interpolation, bilinear interpolation, trilinear interpolation, and Lanczos interpolation.
Step six: the prediction and quantization information of the downsampling coding patch obtained in step four, the prediction and quantization information of the normal coding patch obtained in step five, and the header information of the video frame are input into the entropy coding module for encoding to obtain the code stream information.
The header information includes sampling indication information indicating whether the current patch has been subjected to downsampling processing; the sampling indication information of a normal coding patch may be 0 and that of a downsampling coding patch may be 1. The header information may also include the sampling description information of the current patch: for example, a plurality of scales scale_1, scale_2, …, scale_t are set, each corresponding to a sampling ratio, and t may be a power of 2 to save code rate during encoding. The header information may further include the number of CTU lines contained in the current patch; for a normal coding patch this is the actual CTU line count, and for a downsampling coding patch it is the CTU line count recalculated after downsampling.
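The per-patch header fields described above might be assembled as in the following sketch; the field names and the scale table are illustrative assumptions, since the disclosure only fixes the fields' semantics:

```python
SCALES = [1, 2, 4, 8]  # scale_1..scale_t with t = 4, a power of 2, so the
                       # scale index packs into log2(t) = 2 header bits

def build_patch_header(downsampled, scale_index, ctu_rows):
    """Per-patch header: a 1-bit sampling indication (0 = normal patch,
    1 = downsampled patch), the scale index into the scale table, and the
    CTU row count (recalculated after downsampling for a downsampled patch)."""
    return {
        "sample_flag": 1 if downsampled else 0,
        "scale_index": scale_index if downsampled else None,
        "ctu_rows": ctu_rows,
    }
```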
In the embodiment of the present disclosure, a scene analysis is performed on an input video frame, based on a scene analysis result, the video frame is divided into a plurality of patches in a horizontal direction, the plurality of patches are divided into a normal encoding patch and a downsampling encoding patch according to a rule that the normal encoding patch contains a video content of a specific type and the downsampling encoding patch does not contain the video content of the specific type, and then the normal encoding patch is encoded by using a normal encoding method, the downsampling encoding patch is downsampled first to compress the size of the downsampling encoding patch, and then the downsampling encoding patch after size compression is encoded. Thus, the amount of video to be encoded can be reduced, and the encoding efficiency can be effectively improved.
Fig. 10 is a flowchart of a video decoding method provided in an embodiment of the present disclosure, where the method includes the following steps.
In step S1001, video decoding information is received, where the video decoding information at least includes downsampling indication information of a data block to be decoded, prediction information, and video information, and the downsampling indication information is used to indicate whether downsampling processing is performed on the corresponding data block.
In step S1002, decoding processing is performed based on the downsampling indication information, the prediction information, and the video information to obtain a decoded area.
In the first case, the downsampling indication information indicates that the corresponding data block has been subjected to downsampling processing.
In this case, the video decoding information further includes sample description information of the corresponding data block.
In a specific implementation, when performing decoding processing based on the downsampling instruction information, the prediction information, and the video information to obtain a decoded area, the decoding processing may be performed according to a flow shown in fig. 11, where the flow includes the following steps:
s1101 a: and performing downsampling processing on the decoding reference area of the corresponding data block based on the sampling description information to enable the size of the decoding reference area to be matched with the size of the corresponding data block.
The decoding reference region refers to a decoded region that is referred to when decoding the corresponding data block. Specifically:
when inter-frame decoding is adopted, the decoding reference area refers to an area matched with the data block in the encoded reference frame, and at this time, downsampling processing can be performed on the reference frame firstly, and then the decoding reference area can be selected from the reference frame after the downsampling processing.
When intra-decoding is employed, the decoding reference region refers to a region adjacent to the data block in the encoded region.
In particular, when downsampling the decoding reference area of the corresponding data block:
if the decoding reference area is a reference line, down-sampling processing is performed on the decoding reference area in the horizontal direction, wherein the down-sampling proportion in the horizontal direction is equal to the down-sampling proportion of the corresponding data block in the horizontal direction;
if the decoding reference area is a reference block, down-sampling processing is performed on the decoding reference area in the horizontal direction and the vertical direction respectively, wherein the down-sampling proportion in the horizontal direction is equal to the down-sampling proportion of the corresponding data block in the horizontal direction, and the down-sampling proportion in the vertical direction is equal to the down-sampling proportion of the corresponding data block in the vertical direction.
S1102 a: and decoding by using the decoding reference area, the prediction information and the video information after the down-sampling processing to obtain a decoding area.
In a specific implementation, the decoding reference region, the prediction information, and the video information after the down-sampling process may be directly used for decoding, so as to obtain a decoded region.
In some possible embodiments, the video decoding information may further include coding tree unit partition information of the corresponding data block. In this case, the coding tree units contained in the corresponding data block may be determined according to the coding tree unit partition information, and each coding tree unit may then be decoded using the decoding reference area after the downsampling processing, the prediction information, and the video information, to obtain the decoding area.
In the second case: the downsampling indication information indicates that the corresponding data block is not downsampled.
In specific implementation, the decoding reference region, the prediction information, and the video information of the corresponding data block may be directly utilized for decoding, so as to obtain the decoded region.
In step S1003, whether to perform the upsampling process after decoding is determined based on the downsampling indication information.
In specific implementation, when the downsampling indication information indicates that the corresponding data block has been subjected to downsampling processing, upsampling processing is performed after decoding; when the downsampling indication information indicates that the corresponding data block has not been subjected to downsampling processing, upsampling processing is not performed after decoding. In this way, the decoding end can correctly recover the original video frame.
In step S1004, the corresponding data block is determined according to the result of determining whether to perform the upsampling process after decoding and the decoded area.
In the first case: the downsampling indication information indicates that the corresponding data block has been subjected to downsampling processing, so upsampling processing is performed after decoding.
In this case, the up-sampling processing may be performed on the decoding area based on the sampling description information of the corresponding data block, so as to obtain the corresponding data block.
For example, the decoding area is up-sampled in the horizontal direction, wherein the up-sampling ratio in the horizontal direction is equal to the inverse of the down-sampling ratio in the horizontal direction of the corresponding data block; the decoding area is up-sampled in a vertical direction, wherein an up-sampling ratio in the vertical direction is equal to an inverse of a down-sampling ratio in the vertical direction of the corresponding data block.
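The decode-side upsampling above can be sketched as follows. With integer downsampling factors, an upsampling proportion equal to the reciprocal of the downsampling proportion means each decoded pixel is replicated `h_scale` times horizontally and `v_scale` times vertically; pixel replication is an assumption here, since the patent does not specify an interpolation filter.

```python
# Sketch of post-decoding up-sampling: the up-sampling ratio is the inverse
# of the down-sampling ratio, so each decoded pixel is replicated
# h_scale x v_scale times (replication assumed; no filter is specified).

def upsample_block(block, h_scale, v_scale):
    out = []
    for row in block:
        wide = [p for p in row for _ in range(h_scale)]   # horizontal replication
        out.extend(list(wide) for _ in range(v_scale))    # vertical replication
    return out

decoded = [[1, 2],
           [3, 4]]
restored = upsample_block(decoded, 2, 2)   # 2x2 decoding area -> 4x4 data block
```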
In the second case: the downsampling indication information indicates that the corresponding data block has not been subjected to downsampling processing, so upsampling processing is not performed after decoding.
In this case, the decoding area can be directly determined as the corresponding data block.
In the embodiment of the disclosure, video decoding information is received, decoding processing is performed according to the downsampling indication information, the prediction information, and the video information in the video decoding information to obtain a decoding area, whether upsampling processing is performed after decoding is determined according to the downsampling indication information, and the corresponding data block is determined according to that determination and the decoding area. Since the upsampling operation is simpler than the decoding operation, selectively performing upsampling on some data blocks after decoding to recover their sizes reduces the amount of video to be decoded and effectively improves decoding efficiency.
Fig. 12 is a schematic diagram of another video decoding process provided by the embodiment of the present disclosure, which mainly includes entropy decoding, prediction, inverse quantization, inverse transformation, downsampling, and upsampling modules. The entropy decoding module is configured to decode the code stream information to obtain the prediction and quantization information of the normal coding patch and the prediction and quantization information of the downsampling coding patch. The prediction, inverse quantization, and inverse transformation modules then process the prediction and quantization information of the normal coding patch to obtain the video information of the normal coding patch; at this time, the reference video information may be obtained by downsampling the decoded video information. The prediction and quantization information of the downsampling coding patch is likewise processed by prediction, inverse quantization, and inverse transformation to obtain the video information of the downsampling coding patch (downsampled size), and upsampling is then performed to obtain the video information of the downsampling coding patch (original size). Finally, the video information is reconstructed and output by using the video information of the downsampling coding patch (original size), the video information of the normal coding patch, and the header information.
The following describes a video decoding method provided by the embodiments of the present disclosure with reference to specific embodiments.
Step one: entropy decoding is performed on the code stream information to obtain the header information, the quantization and prediction information of the downsampling coding patch, and the quantization and prediction information of the normal coding patch. The patch division information of the video frame is parsed based on the header information, the normal coding patch and the downsampling coding patch are determined, and the downsampling scale information h_scale and v_scale of the downsampling coding patch is obtained.
Step two: the quantization and prediction information of the downsampling coding patch obtained in step one is subjected to prediction, inverse quantization, and inverse transformation processing to obtain the video information of the downsampling coding patch (downsampled size). The decoding reference area (reference pixel block or reference pixel line) required for the intra prediction and inter prediction of the current downsampling coding patch needs to undergo a corresponding downsampling scaling operation. If the decoding reference area is a pixel line, a downsampling operation of h_scale times is performed in the horizontal direction; if the decoding reference area is a pixel block, a downsampling operation of h_scale times may be performed in the horizontal direction and a downsampling operation of v_scale times in the vertical direction. Downsampling scaling is also performed on ref_1, ref_2, …, ref_m in the inter reference list required by the current downsampling coding patch, with a scaling factor of h_scale in the horizontal direction and v_scale in the vertical direction. Then, the video information of the downsampling coding patch (downsampled size) can be subjected to upsampling scale conversion, with an upsampling magnification of 1/h_scale in the horizontal direction and 1/v_scale in the vertical direction, so as to obtain the video information of the downsampling coding patch (normal size).
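The scale bookkeeping in step two can be checked with a small round trip: a patch downsampled by (h_scale, v_scale) is decoded at the reduced size and then upsampled by the reciprocal factors back to its original size. Nearest-neighbour scaling and integer factors are assumptions for illustration only.

```python
# Round-trip sketch of step two's scale conversions (nearest-neighbour
# scaling and integer h_scale / v_scale are assumed).

def ds(block, hs, vs):
    """Downsampling scale conversion: drop rows/columns by the given factors."""
    return [row[::hs] for row in block[::vs]]

def us(block, hs, vs):
    """Upsampling scale conversion: the inverse operation, by replication."""
    return [[p for p in row for _ in range(hs)]
            for row in block for _ in range(vs)]

h_scale, v_scale = 2, 2
patch = [[0] * 16 for _ in range(8)]       # 16x8 down-sampling coding patch

small = ds(patch, h_scale, v_scale)        # decoded at downsampled size: 8x4
restored = us(small, h_scale, v_scale)     # upsampled back to normal size: 16x8
```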
Step three: prediction, inverse quantization, and inverse transformation processing is performed on the quantization and prediction information of the normal coding patch obtained in step one to obtain the video information of the normal coding patch. The reference information required for the intra and inter prediction of the normal coding patch comes from the decoded video information of the downsampling coding patch (normal size) and of the normal coding patch.
Step four: the video information of the downsampling coding patch (normal size) obtained in step two and the video information of the normal coding patch obtained in step three are recombined to obtain the video information.
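Step four's recombination can be sketched as stacking the decoded patches back into one frame. Horizontal-stripe patches of equal width are assumed here, matching the horizontal blocking described for the encoder.

```python
# Sketch of patch recombination, assuming horizontal-stripe patches of equal
# width that are stacked top to bottom in display order.

def reassemble(patches):
    """Concatenate the rows of each patch into a single frame."""
    frame = []
    for patch in patches:
        frame.extend(patch)
    return frame

normal_patch = [[1] * 4 for _ in range(2)]     # normal coding patch (2 rows)
restored_patch = [[2] * 4 for _ in range(3)]   # up-sampled patch (3 rows)
frame = reassemble([normal_patch, restored_patch])
```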
In the embodiment of the disclosure, entropy decoding is performed on the code stream information to determine the normal coding patch and the downsampling coding patch in the video frame; the normal coding patch is then decoded by a conventional decoding method to obtain the corresponding data block, while the downsampling coding patch is decoded to obtain a decoding area that is then upsampled to obtain the corresponding data block. Since the upsampling operation is simpler than the decoding operation, performing upsampling on some data blocks after decoding to recover their sizes reduces the amount of video to be decoded and effectively improves decoding efficiency.
The present disclosure provides a video encoding method and a video decoding method based on scale scaling, directed at the problem that it is difficult to further improve the video compression ratio and provide higher-quality video using existing video compression encoding tools and rate control methods. In the video encoding method, scene analysis is performed on a video frame to determine whether the video frame contains motion or complex video content, and the video frame is divided into at least one patch based on the scene analysis result, where a patch containing motion or complex video content is a normal coding patch and a patch not containing such content is a downsampling coding patch. The downsampling coding patch is then subjected to downsampling scale conversion before the encoding operation is performed. In the decoding method, the downsampling coding patch is decoded first, and upsampling scale conversion is then performed. The normal coding patch needs no scaling at either the encoding end or the decoding end. In this way, the number of pixels of simple and relatively still content is reduced by downsampling scaling, thereby reducing the code stream information required to encode such video content. The saved code stream information can instead be used to encode complex content or content of interest, so as to improve the video quality of those regions.
In addition, under bandwidth-limited transmission conditions, downsampling the downsampling patch reduces the number of pixels it contains, so fewer CTUs need to be encoded. This reduces both the decoding overhead at the decoding end and the code stream required for compression encoding, helps the decoding end quickly reconstruct the compressed video information, and improves the smoothness and stability of transmission, thereby providing a better video viewing experience for live or on-demand users.
Based on the same technical concept, embodiments of the present disclosure further provide a video encoding apparatus, where the principle of the video encoding apparatus to solve the problem is similar to that of the video encoding method, so that the implementation of the video encoding apparatus can refer to the implementation of the video encoding method, and repeated details are not repeated. Fig. 13 is a schematic structural diagram of a video encoding apparatus provided in the embodiment of the present disclosure, which includes an analysis unit 1301, a blocking unit 1302, a determination unit 1303, an encoding unit 1304, and a sending unit 1305.
An analysis unit 1301, configured to perform scene analysis on the obtained video frame to determine whether the video frame includes a video content of a specified type;
a blocking unit 1302, configured to block the video frame to obtain at least one data block based on whether the video frame includes a video content of a specified type, where the at least one data block includes a data block that does not include the video content of the specified type;
a determining unit 1303, configured to determine whether to perform downsampling processing before encoding according to whether each data block contains the video content of the specified type;
an encoding unit 1304, configured to perform encoding processing on the data block according to a result of determining whether downsampling processing is performed before encoding, so as to obtain prediction information and video information of the data block;
a sending unit 1305, configured to send video decoding information, where the video decoding information at least includes downsampling indication information of the data block, prediction information, and video information, and the downsampling indication information is used to indicate whether the data block is subjected to downsampling processing.
In some possible embodiments, the blocking unit 1302 is specifically configured to:
when the video frame does not contain the video content of the specified type, taking the video frame as a data block;
when the video frame contains the video content of the specified type, the video frame is divided into at least two data blocks based on the area where the video content of the specified type is located.
In some possible embodiments, the blocking unit 1302 is specifically configured to:
and dividing the video frame into at least two data blocks in the horizontal direction based on the area where the video content of the specified type is located, so that the lengths of the data blocks are the same.
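This horizontal blocking can be sketched as follows. The representation of a frame as a list of pixel rows and the specified-type region being given as row boundaries are assumptions for illustration; each resulting block spans the full frame width, so all blocks have the same length.

```python
# Sketch of horizontal blocking: split a frame into horizontal data blocks
# at the row boundaries of the specified-type region (representation assumed).

def split_horizontal(frame, boundaries):
    """Split a frame (list of rows) into horizontal data blocks.

    Every block spans the full frame width, so the blocks have equal length.
    """
    blocks, prev = [], 0
    for b in list(boundaries) + [len(frame)]:
        blocks.append(frame[prev:b])
        prev = b
    return blocks

frame = [[row] * 6 for row in range(10)]   # 10 rows, width 6
blocks = split_horizontal(frame, [4, 8])   # specified-type content in rows 4-7
```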
In some possible embodiments, when the data block does not contain the specified type of video content, a downsampling process is performed prior to encoding, an
The encoding unit 1304 is specifically configured to:
performing downsampling processing on the data block, and performing downsampling processing on a coding reference area of the data block to enable the size of the coding reference area to be matched with the size of the data block, wherein the coding reference area refers to a coded area which is referred to when the data block is coded;
coding the data block after the down-sampling processing by using the coding reference area after the down-sampling processing to obtain the prediction information and the video information of the data block; and
the video coding information further comprises sample description information for the data block.
In some possible embodiments, the encoding unit 1304 is specifically configured to:
when the coding reference area is a reference line, performing downsampling processing on the coding reference area in the horizontal direction, wherein the downsampling proportion in the horizontal direction is equal to that of the data block in the horizontal direction;
and when the coding reference area is a reference block, respectively performing downsampling processing on the coding reference area in the horizontal direction and the vertical direction, wherein the downsampling proportion in the horizontal direction is equal to that of the data block in the horizontal direction, and the downsampling proportion in the vertical direction is equal to that of the data block in the vertical direction.
In some possible embodiments, the encoding unit 1304 is specifically configured to:
dividing the data block after the downsampling processing into a plurality of coding tree units according to the given size of the coding tree unit supported by a coding end;
respectively coding each coding tree unit by using the coding reference area subjected to the down-sampling processing to obtain the prediction information and the video information of the data block;
the video decoding information further includes coding tree unit partition information of the data block.
In some possible embodiments, when the block of data contains the specified type of video content, no downsampling is performed prior to encoding, and
the encoding unit 1304 is specifically configured to:
when the coding reference area of the data block has been subjected to downsampling processing, performing upsampling processing on the coding reference area so that its size matches the size of the data block, and encoding the data block by using the upsampled coding reference area to obtain the video information of the data block, wherein the coding reference area refers to a coded area that is referred to when the data block is encoded;
and when the coding reference area of the data block is not subjected to downsampling processing, the coding reference area is used for coding the data block to obtain the video information of the data block.
In some possible embodiments, the encoding unit 1304 is specifically configured to:
when the coding reference area is a reference line, performing up-sampling processing on the coding reference area in the horizontal direction, wherein the up-sampling proportion in the horizontal direction is equal to the reciprocal of the down-sampling proportion of the coding reference area in the horizontal direction;
and when the coding reference area is a reference block, respectively performing upsampling processing on the coding reference area in the horizontal direction and the vertical direction, wherein the upsampling proportion in the horizontal direction is equal to the reciprocal of the downsampling proportion of the coding reference area in the horizontal direction, and the upsampling proportion in the vertical direction is equal to the reciprocal of the downsampling proportion of the coding reference area in the vertical direction.
In some possible embodiments, the video content of the specified type refers to video content in the video frame whose motion amplitude, compared with an adjacent video frame, is higher than a set amplitude, or video content whose texture complexity is higher than a set complexity.
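The two criteria above can be approximated with simple statistics. The concrete measures used here (mean absolute frame difference for motion amplitude, pixel variance for texture complexity) and the threshold values are assumptions for illustration; the patent does not specify the metrics.

```python
# Sketch of specified-type content detection. The metrics and thresholds
# below are illustrative assumptions, not the patent's definitions.

def motion_amplitude(cur, prev):
    """Mean absolute difference against an adjacent frame (assumed metric)."""
    n = sum(len(r) for r in cur)
    return sum(abs(a - b) for rc, rp in zip(cur, prev)
               for a, b in zip(rc, rp)) / n

def texture_complexity(frame):
    """Pixel variance as a proxy for texture complexity (assumed metric)."""
    pix = [p for row in frame for p in row]
    mean = sum(pix) / len(pix)
    return sum((p - mean) ** 2 for p in pix) / len(pix)

def contains_specified_type(cur, prev, amp_thr=10.0, cpx_thr=500.0):
    """True if the frame shows high motion or high texture complexity."""
    return (motion_amplitude(cur, prev) > amp_thr
            or texture_complexity(cur) > cpx_thr)

still = [[128] * 4 for _ in range(4)]    # flat, static content
moving = [[160] * 4 for _ in range(4)]   # shifted brightness vs. 'still'
```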
Based on the same technical concept, embodiments of the present disclosure further provide a video decoding apparatus, where the principle of the video decoding apparatus to solve the problem is similar to that of the video decoding method, so that the implementation of the video decoding apparatus can refer to the implementation of the video decoding method, and repeated details are not repeated. Fig. 14 is a schematic structural diagram of a video decoding apparatus provided in an embodiment of the present disclosure, and includes a receiving unit 1401, a decoding unit 1402, a sampling unit 1403, and a determining unit 1404.
A receiving unit 1401, configured to receive video decoding information, where the video decoding information at least includes downsampling indication information of a data block to be decoded, prediction information, and video information, and the downsampling indication information is used to indicate whether downsampling processing is performed on the corresponding data block;
a decoding unit 1402, configured to perform decoding processing according to the downsampling indication information, the prediction information, and the video information to obtain a decoded area;
a sampling unit 1403, configured to determine whether to perform upsampling processing after decoding according to the downsampling indication information;
a determining unit 1404, configured to determine a corresponding data block according to a result of determining whether to perform upsampling processing after decoding and the decoded area.
In some possible embodiments, when the downsampling indication information indicates that the corresponding data block is downsampled, the video decoding information further includes sample description information of the corresponding data block, and
the decoding unit 1402 is specifically configured to:
performing downsampling processing on a decoding reference area of the corresponding data block based on the sampling description information to enable the size of the decoding reference area to be matched with the size of the corresponding data block, wherein the decoding reference area refers to a decoded area which is referred to when the corresponding data block is decoded;
and decoding by using the decoding reference area, the prediction information and the video information after the down-sampling processing to obtain a decoding area.
In some possible embodiments, the decoding unit 1402 is specifically configured to:
when the decoding reference area is a reference line, performing downsampling processing on the decoding reference area in the horizontal direction, wherein the downsampling proportion in the horizontal direction is equal to that of the corresponding data block;
and when the decoding reference area is a reference block, respectively performing downsampling processing on the decoding reference area in the horizontal direction and the vertical direction, wherein the downsampling proportion in the horizontal direction is equal to the downsampling proportion of the corresponding data block in the horizontal direction, and the downsampling proportion in the vertical direction is equal to the downsampling proportion of the corresponding data block in the vertical direction.
In some possible embodiments, the video decoding information further includes coding tree unit partition information of the corresponding data block, an
The decoding unit 1402 is specifically configured to:
determining a coding tree unit contained in the corresponding data block according to the coding tree unit division information;
and respectively decoding each coding tree unit by using the decoding reference area, the prediction information and the video information after the down-sampling processing to obtain the decoding area.
In some possible embodiments, the downsampling indication information indicates that when the corresponding data block is subjected to downsampling processing, upsampling processing is performed after decoding, and
the determining unit 1404 is specifically configured to perform upsampling processing on the decoding area based on the sampling description information of the corresponding data block to obtain the corresponding data block.
In some possible embodiments, the determining unit 1404 is specifically configured to:
performing upsampling processing on the decoding area in a horizontal direction, wherein an upsampling proportion in the horizontal direction is equal to the reciprocal of a downsampling proportion of a corresponding data block in the horizontal direction;
and performing upsampling processing on the decoding area in the vertical direction, wherein the upsampling proportion in the vertical direction is equal to the reciprocal of the downsampling proportion of the corresponding data block in the vertical direction.
In some possible embodiments, when the downsampling indication information indicates that the corresponding data block has not been subjected to downsampling processing, the decoding unit 1402 is specifically configured to:
and decoding by using the decoding reference area of the corresponding data block, the prediction information and the video information to obtain the decoding area, wherein the decoding reference area refers to a decoded area which is referred to when the corresponding data block is decoded.
In some possible embodiments, the downsampling indication information indicates that no upsampling is performed after decoding when the corresponding data block is not subjected to downsampling, and
the determining unit 1404 is specifically configured to determine the decoding area as a corresponding data block.
The division of the modules in the embodiments of the present disclosure is schematic and represents only one logical function division; in actual implementation, there may be other division manners. In addition, the functional modules in the embodiments of the present disclosure may be integrated in one processor, may exist alone physically, or two or more modules may be integrated in one module. The modules may be coupled to each other through interfaces that are typically electrical communication interfaces, although mechanical or other forms of interface are not excluded. Thus, modules described as separate components may or may not be physically separate, and may be located in one place or distributed in different locations on the same or different devices. The integrated module can be implemented in the form of hardware or in the form of a software functional module.
Fig. 15 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure, where the electronic device includes a transceiver 1501 and a processor 1502, where the processor 1502 may be a Central Processing Unit (CPU), a microprocessor, an application specific integrated circuit, a programmable logic circuit, a large scale integrated circuit, or a digital Processing Unit. The transceiver 1501 is used for data transmission and reception between an electronic device and another device.
The electronic device may further comprise a memory 1503 for storing software instructions to be executed by the processor 1502, but may also store some other data required by the electronic device, such as identification information of the electronic device, encryption information of the electronic device, user data, etc. The Memory 1503 may be a Volatile Memory (Volatile Memory), such as a Random-Access Memory (RAM); the Memory 1503 may also be a Non-Volatile Memory (Non-Volatile Memory), such as a Read-Only Memory (ROM), a Flash Memory (Flash Memory), a Hard Disk Drive (HDD) or a Solid-State Drive (SSD), or the Memory 1503 is any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. Memory 1503 may be a combination of the above memories.
The specific connection medium between the processor 1502, the memory 1503, and the transceiver 1501 is not limited in the embodiments of the present disclosure. In fig. 15, the embodiment of the present disclosure is described by taking only an example in which the memory 1503, the processor 1502, and the transceiver 1501 are connected by the bus 1504, the bus is shown by a thick line in fig. 15, and the connection manner between other components is merely schematically described and is not limited thereto. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 15, but this is not intended to represent only one bus or type of bus.
The processor 1502 may be dedicated hardware or a processor running software, and when the processor 1502 may run software, the processor 1502 reads software instructions stored in the memory 1503 and executes a video encoding method or a video decoding method involved in the foregoing embodiments under the drive of the software instructions.
The disclosed embodiments also provide a storage medium, and when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is capable of performing the video encoding method or the video decoding method referred to in the foregoing embodiments.
In some possible embodiments, various aspects of the encoding or video decoding method provided by the present disclosure may also be implemented in the form of a program product, which includes program code therein, and when the program product is run on an electronic device, the program code is used for causing the electronic device to execute the video encoding method or the video decoding method referred to in the foregoing embodiments.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable Disk, a hard Disk, a RAM, a ROM, an Erasable Programmable Read-Only Memory (EPROM), a flash Memory, an optical fiber, a Compact Disk Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product for encoding or decoding in the embodiments of the present disclosure may be a CD-ROM and include program code, and may be run on a computing device. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, Radio Frequency (RF), etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++, or the like, as well as conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In situations involving remote computing devices, the remote computing devices may be connected to the user computing device over any kind of Network, such as a Local Area Network (LAN) or Wide Area Network (WAN), or may be connected to external computing devices (e.g., over the internet using an internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, the features and functions of two or more units described above may be embodied in one unit, in accordance with embodiments of the present disclosure. Conversely, the features and functions of one unit described above may be further divided into embodiments by a plurality of units.
Further, while the operations of the disclosed methods are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present disclosure have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the disclosure.
It will be apparent to those skilled in the art that various changes and modifications can be made in the present disclosure without departing from the spirit and scope of the disclosure. Thus, if such modifications and variations of the present disclosure fall within the scope of the claims of the present disclosure and their equivalents, the present disclosure is intended to include such modifications and variations as well.

Claims (10)

1. A video encoding method, comprising:
performing scene analysis on the acquired video frame to determine whether the video frame contains video content of a specified type;
partitioning the video frame into at least one data block based on whether the video frame contains the specified type of video content, wherein the at least one data block comprises a data block that does not contain the specified type of video content;
determining, for each data block, whether to perform downsampling processing before encoding according to whether the data block contains the specified type of video content;
encoding the data block according to the result of determining whether to perform downsampling processing before encoding, to obtain prediction information and video information of the data block; and
sending video decoding information, wherein the video decoding information comprises at least downsampling indication information, the prediction information, and the video information of the data block, the downsampling indication information being used to indicate whether downsampling processing has been performed on the data block.
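The encode-side flow of claim 1 can be sketched as follows. This is a minimal illustration under assumptions, not the patented encoder: `downsample`, `encode`, and the dictionary layout of the video decoding information are hypothetical stand-ins for a real codec back-end and bitstream syntax.

```python
def downsample(pixels, ratio=2):
    # Naive decimation by `ratio` (illustrative stand-in for a real resampling filter).
    return pixels[::ratio]

def encode(samples):
    # Hypothetical encoder back-end: returns (prediction info, coded payload).
    return ("intra", list(samples))

def encode_frame(frame_blocks):
    """Per claim 1: decide downsampling per block, encode each block, and
    assemble the video decoding information that is sent to the decoder.
    `frame_blocks` is a list of (pixels, has_special_content) pairs."""
    decoding_info = []
    for pixels, has_special_content in frame_blocks:
        # Downsample only blocks WITHOUT the specified type of video content,
        # keeping full resolution where that content lives.
        do_downsample = not has_special_content
        samples = downsample(pixels) if do_downsample else pixels
        prediction, video = encode(samples)
        decoding_info.append({
            "downsample_flag": do_downsample,  # downsampling indication information
            "prediction": prediction,
            "video": video,
        })
    return decoding_info
```

The per-block flag is what lets the decoder know, block by block, whether upsampling is needed after decoding.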
2. The method of claim 1, wherein partitioning the video frame into at least one data block based on whether the video frame contains the specified type of video content comprises:
using the video frame as a single data block when the video frame does not contain the specified type of video content; and
dividing the video frame into at least two data blocks based on the area in which the specified type of video content is located when the video frame contains the specified type of video content.
3. The method of claim 2, wherein dividing the video frame into at least two data blocks based on the area in which the specified type of video content is located comprises:
dividing the video frame into at least two data blocks in the horizontal direction based on the area in which the specified type of video content is located, such that the data blocks have the same length.
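The horizontal split of claim 3 can be illustrated as below. The stripe representation and region coordinates are hypothetical inputs chosen for the sketch, not part of the claimed method:

```python
def partition_horizontally(frame_height, frame_width, region_top, region_bottom):
    """Split a frame into horizontal stripes around the area (rows
    region_top..region_bottom) holding the specified type of content.
    Every stripe spans the full frame width, so all data blocks have
    the same length in the horizontal direction."""
    # Deduplicate and sort cut positions so degenerate regions (e.g. a
    # region starting at row 0) do not produce empty stripes.
    boundaries = sorted({0, region_top, region_bottom, frame_height})
    return [{"top": top, "bottom": bottom, "width": frame_width}
            for top, bottom in zip(boundaries, boundaries[1:])]
```

For a 1920x1080 frame with the content region in rows 200-400, this yields three full-width stripes: above, containing, and below the region.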
4. The method according to any one of claims 1-3, wherein it is determined that downsampling processing is to be performed before encoding when the data block does not contain the specified type of video content; and
encoding the data block according to the result of determining whether to perform downsampling processing before encoding, to obtain the prediction information and the video information of the data block, comprises:
performing downsampling processing on the data block, and performing downsampling processing on an encoding reference area of the data block so that the size of the encoding reference area matches the size of the data block, wherein the encoding reference area is an already-encoded area that is referenced when the data block is encoded;
encoding the downsampled data block using the downsampled encoding reference area to obtain the prediction information and the video information of the data block; and
wherein the video decoding information further comprises sampling description information of the data block.
5. The method of claim 4, wherein performing downsampling processing on the encoding reference area of the data block comprises:
when the encoding reference area is a reference line, performing downsampling processing on the encoding reference area in the horizontal direction, wherein its downsampling ratio in the horizontal direction is equal to that of the data block in the horizontal direction; and
when the encoding reference area is a reference block, performing downsampling processing on the encoding reference area in the horizontal direction and the vertical direction respectively, wherein its downsampling ratio in the horizontal direction is equal to that of the data block in the horizontal direction, and its downsampling ratio in the vertical direction is equal to that of the data block in the vertical direction.
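The two reference-area cases of claim 5 can be sketched in one helper. Plain decimation stands in for whatever resampling filter a real codec would use, and the list-of-lists block layout is an assumption of this sketch:

```python
def downsample_reference(ref, h_ratio, v_ratio=1):
    """Downsample an encoding reference area so its size matches the
    downsampled data block. A reference LINE (1-D list) is decimated
    horizontally only; a reference BLOCK (2-D list of rows) is decimated
    in both directions, each with the same ratio used for the data block
    in that direction."""
    if ref and isinstance(ref[0], list):           # 2-D case: reference block
        return [row[::h_ratio] for row in ref[::v_ratio]]
    return ref[::h_ratio]                          # 1-D case: reference line
```

Matching the ratios keeps the reference samples spatially aligned with the downsampled block being predicted.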
6. A video decoding method, comprising:
receiving video decoding information, wherein the video decoding information comprises at least downsampling indication information, prediction information, and video information of a data block to be decoded, the downsampling indication information being used to indicate whether downsampling processing has been performed on the corresponding data block;
performing decoding processing according to the downsampling indication information, the prediction information, and the video information to obtain a decoded area;
determining whether to perform upsampling processing after decoding according to the downsampling indication information; and
determining the corresponding data block according to the decoded area and the result of determining whether to perform upsampling processing after decoding.
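The decode-side flow of claim 6 mirrors the encoder sketch above. Again, `decode`, `upsample`, and the dictionary layout are hypothetical stand-ins, not the patented implementation:

```python
def decode(prediction_info, video_info):
    # Hypothetical decoder back-end: returns the reconstructed samples.
    return list(video_info)

def upsample(samples, ratio=2):
    # Naive sample repetition (illustrative stand-in for a real interpolation filter).
    return [s for s in samples for _ in range(ratio)]

def decode_blocks(video_decoding_info):
    """Per claim 6: decode each block's area, then upsample only those
    blocks whose downsampling indication flag says the encoder
    downsampled them, restoring the original resolution."""
    blocks = []
    for info in video_decoding_info:
        decoded_area = decode(info["prediction"], info["video"])
        if info["downsample_flag"]:
            decoded_area = upsample(decoded_area)
        blocks.append(decoded_area)
    return blocks
```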
7. A video encoding apparatus, comprising:
an analysis unit, configured to perform scene analysis on an acquired video frame to determine whether the video frame contains a specified type of video content;
a partitioning unit, configured to partition the video frame into at least one data block based on whether the video frame contains the specified type of video content, wherein the at least one data block comprises a data block that does not contain the specified type of video content;
a determining unit, configured to determine, for each data block, whether to perform downsampling processing before encoding according to whether the data block contains the specified type of video content;
an encoding unit, configured to encode the data block according to the result of determining whether to perform downsampling processing before encoding, to obtain prediction information and video information of the data block; and
a sending unit, configured to send video decoding information, wherein the video decoding information comprises at least downsampling indication information, the prediction information, and the video information of the data block, the downsampling indication information being used to indicate whether downsampling processing has been performed on the data block.
8. A video decoding apparatus, comprising:
a receiving unit, configured to receive video decoding information, wherein the video decoding information comprises at least downsampling indication information, prediction information, and video information of a data block to be decoded, the downsampling indication information being used to indicate whether downsampling processing has been performed on the corresponding data block;
a decoding unit, configured to perform decoding processing according to the downsampling indication information, the prediction information, and the video information to obtain a decoded area;
a sampling unit, configured to determine whether to perform upsampling processing after decoding according to the downsampling indication information; and
a determining unit, configured to determine the corresponding data block according to the decoded area and the result of determining whether to perform upsampling processing after decoding.
9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein:
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
10. A storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of any of claims 1-6.
CN202110586627.XA 2021-05-27 2021-05-27 Video encoding method, decoding method, device, electronic equipment and storage medium Active CN113329228B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110586627.XA CN113329228B (en) 2021-05-27 2021-05-27 Video encoding method, decoding method, device, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN113329228A true CN113329228A (en) 2021-08-31
CN113329228B CN113329228B (en) 2024-04-26

Family

ID=77421815

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110586627.XA Active CN113329228B (en) 2021-05-27 2021-05-27 Video encoding method, decoding method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113329228B (en)


Patent Citations (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6259738B1 (en) * 1996-10-31 2001-07-10 Kabushiki Kaisha Toshiba Video encoding apparatus and video decoding apparatus
JP2004222218A (en) * 2003-01-15 2004-08-05 Toa Corp Method for compressing and extending image
JP2005269620A (en) * 2004-02-17 2005-09-29 Toa Corp Image compressing/expanding method, image compressing apparatus, image expanding apparatus
JP2006262390A (en) * 2005-03-18 2006-09-28 Toa Corp Image compression/decompression method, image compression apparatus and image decompression apparatus
US20100080291A1 (en) * 2006-12-13 2010-04-01 Tomoyuki Yamamoto Moving picture encoding apparatus and moving picture decoding apparatus
US20100086034A1 (en) * 2008-10-06 2010-04-08 Lg Electronics Inc. method and an apparatus for processing a video signal
US20120076203A1 (en) * 2009-05-29 2012-03-29 Mitsubishi Electric Corporation Video encoding device, video decoding device, video encoding method, and video decoding method
CN101668205A (en) * 2009-09-25 2010-03-10 南京邮电大学 Self-adapting down-sampling stereo video compressed coding method based on residual error macro block
CN101765008A (en) * 2009-12-28 2010-06-30 北京工业大学 Method for encoding and decoding video as well as device and system therefor
US20120294369A1 (en) * 2010-01-22 2012-11-22 Thomson Licensing A Corporation Methods and apparatus for sampling-based super resolution video encoding and decoding
CN103782596A (en) * 2011-06-28 2014-05-07 三星电子株式会社 Prediction method and apparatus for chroma component of image using luma component of image
CN104396261A (en) * 2012-04-16 2015-03-04 三星电子株式会社 Video coding method and device using high-speed edge detection, and related video decoding method and device
US20160134874A1 (en) * 2013-07-19 2016-05-12 Huawei Technologies Co., Ltd. Method and Apparatus for Encoding and Decoding a Texture Block Using Depth Based Block Partitioning
KR20170058792A (en) * 2015-11-19 2017-05-29 삼성전자주식회사 Method of determining a motion vector in a video and apparatus thereof
US20180077426A1 (en) * 2016-09-15 2018-03-15 Qualcomm Incorporated Linear model chroma intra prediction for video coding
CN107155107A (en) * 2017-03-21 2017-09-12 腾讯科技(深圳)有限公司 Method for video coding and device, video encoding/decoding method and device
US20190253704A1 (en) * 2017-03-21 2019-08-15 Tencent Technology (Shenzhen) Company Limited Video encoding method, video decoding method, computer device and storage medium
US20180295368A1 (en) * 2017-04-07 2018-10-11 Hulu, LLC Video Compression Using Down-Sampling Patterns in Two Phases
CN109257608A (en) * 2017-07-13 2019-01-22 华为技术有限公司 Image processing method, equipment and system
CN110337810A (en) * 2018-04-02 2019-10-15 北京大学 Method for video processing and equipment
CN108965885A (en) * 2018-06-04 2018-12-07 陕西师范大学 A kind of video based on frame compression measurement is rebuild and Detection of Moving Objects online
US20190379893A1 (en) * 2018-06-08 2019-12-12 Sony Interactive Entertainment LLC Fast region of interest coding using multi-segment resampling
CN111050169A (en) * 2018-10-15 2020-04-21 华为技术有限公司 Method and device for generating quantization parameter in image coding and terminal
CN109348225A (en) * 2018-11-13 2019-02-15 建湖云飞数据科技有限公司 A kind of video frame reconstruction method
CN109788296A (en) * 2018-12-25 2019-05-21 中山大学 Interframe encode dividing elements method, apparatus and storage medium for HEVC
US20200221109A1 (en) * 2019-01-06 2020-07-09 Tencent America LLC Method and apparatus for video coding
US20210142520A1 (en) * 2019-11-12 2021-05-13 Sony Interactive Entertainment Inc. Fast region of interest coding using multi-segment temporal resampling
CN111314698A (en) * 2020-02-27 2020-06-19 浙江大华技术股份有限公司 Image coding processing method and device
CN111586410A (en) * 2020-06-02 2020-08-25 浙江大华技术股份有限公司 Video encoding method, decoding method and related devices thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG Zhijia; HE Wenge: "A Stereo Video Coding Algorithm Based on Residual Downsampling", Telecommunications for Electric Power System, no. 04 *

Also Published As

Publication number Publication date
CN113329228B (en) 2024-04-26

Similar Documents

Publication Publication Date Title
US7474794B2 (en) Image processing using probabilistic local behavior assumptions
JP5957559B2 (en) Video encoding / decoding method and apparatus using large size transform unit
CN108259900B (en) Transform coefficient coding for context adaptive binary entropy coding of video
RU2740631C1 (en) Image encoding device, image encoding method and computer-readable recording medium, as well as image decoding device, image decoding method and computer-readable recording medium
JP4895400B2 (en) Improved compression in the representation of non-frame edge blocks of image frames
CN108353175B (en) Method and apparatus for processing video signal using coefficient-induced prediction
CN103283231A (en) Compression and decompression of reference images in a video encoder
CN104919798A (en) Method and apparatus of quantization matrix coding
CN109510987B (en) Method and device for determining coding tree node division mode and coding equipment
US8249372B2 (en) Methods and devices for coding and decoding multidimensional digital signals
JP4888919B2 (en) Moving picture encoding apparatus and moving picture decoding apparatus
CN110476422B (en) Image encoding device and method, image decoding device and method, and recording medium
JPH08294119A (en) Image coder/decoder
CN116866591A (en) Image coding method and device, computer equipment and medium
CN115643406A (en) Video decoding method, video encoding device, storage medium, and storage apparatus
CN113329228B (en) Video encoding method, decoding method, device, electronic equipment and storage medium
TWI795635B (en) Image decoding device, image decoding method and program
KR102232417B1 (en) A method and an apparatus for processing a video signal
CN112954360B (en) Decoding method, decoding device, storage medium, and electronic apparatus
TW202002629A (en) Image encoding device, image encoding method and program, image decoding device, and image decoding method and program
CN112911311B (en) Encoding method, encoding device, storage medium, and electronic apparatus
KR20200004348A (en) Method and apparatus for processing video signal through target region correction
CN114422805B (en) Video coding and decoding method, device and equipment
RU2754635C9 (en) Image encoding device, image encoding method, computer readable medium with image encoding program, image decoding device, image decoding method and computer readable medium with image decoding program
US20240056575A1 (en) Deep learning-based image coding method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210918

Address after: 310052 Room 408, building 3, No. 399, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou Netease Zhiqi Technology Co.,Ltd.

Address before: 310052 Room 301, Building No. 599, Changhe Street Network Business Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: HANGZHOU LANGHE TECHNOLOGY Ltd.

GR01 Patent grant