CN114302139A - Video encoding method, video decoding method and device - Google Patents


Info

Publication number
CN114302139A
Authority
CN
China
Prior art keywords
video
video frame
coding
coding region
frame
Prior art date
Legal status
Pending
Application number
CN202111509090.3A
Other languages
Chinese (zh)
Inventor
张艺
Current Assignee
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202111509090.3A
Publication of CN114302139A
Legal status: Pending

Abstract

Embodiments of the present application provide a video encoding method, a video decoding method, and corresponding apparatuses. The encoding method includes: acquiring a video to be encoded; adjusting an adjustment intensity value of a video frame according to the motion scene of the video frame, the adjustment intensity value being used to control the magnitude of the quantization parameter of a coding region in the video frame; obtaining the quantization parameter of the coding region from the adjustment intensity value of the video frame and a reference value parameter of the coding region obtained by pre-coding the video frame; and encoding the video to be encoded according to the quantization parameter of the coding region to obtain an encoded video. Because the adjustment intensity value of the video frame is adjusted dynamically according to the motion scene, and the quantization parameter of the coding region is corrected accordingly, the code rate and volume of the coding region are optimized while conforming to the viewing requirements of the corresponding motion scene: the perceived picture quality of the encoded video is preserved, its volume is reduced, and the encoding quality is improved.

Description

Video encoding method, video decoding method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a video encoding method, a video decoding method, a video encoding apparatus, a video decoding apparatus, an electronic device, and a machine-readable medium.
Background
Video coding converts a file in an original video format into a file in another video format through compression technology, and is an essential link in producing a viewable video.
In the related art, CUTree is one of the effective coding tools in video coding: by analyzing the mutual reference process between video frames, it controls the picture quality loss of each video frame, and thereby the code rate and volume of the encoded video.
However, current schemes find it difficult to further optimize the code rate and volume of the encoded video: the encoded video stays large when a high code rate is required, and its code rate stays low when a small volume is required, so the encoding quality is reduced.
Disclosure of Invention
The embodiments of the present application provide a video encoding method and a video decoding method, aiming to solve the problem of low encoding quality in the related art.
Correspondingly, the embodiments of the present application further provide a video encoding apparatus, an electronic device, and a storage medium to ensure the implementation and application of the above methods.
In order to solve the above problem, an embodiment of the present application discloses a video encoding method, where the method includes:
acquiring a video to be coded, wherein a video frame of the video to be coded is divided into a plurality of coding areas;
adjusting an adjustment intensity value of the video frame according to the motion scene of the video frame, wherein the adjustment intensity value is used for controlling the size of a quantization parameter of a coding region in the video frame;
acquiring a quantization parameter of the coding region according to the adjustment intensity value of the video frame and a reference value parameter of the coding region obtained by pre-coding the video frame;
and coding the video to be coded according to the quantization parameter of the coding region to obtain a coded video.
The embodiment of the application discloses a video decoding method, which comprises the following steps:
acquiring a coded video, wherein a video frame of the coded video is divided into a plurality of coding areas;
decoding the encoded video to obtain a quantization parameter of the coding region, where the quantization parameter is calculated from the adjustment intensity value determined, before encoding, by the motion scene in which the video frame of the encoded video is located, and from the reference value parameter obtained by pre-coding the coding region;
and constructing a video to be played according to the quantization parameter of the coding region.
The embodiment of the application discloses a video coding device, the device comprises:
a first acquisition module, configured to acquire a video to be encoded, where a video frame of the video to be encoded is divided into a plurality of coding regions;
the adjusting module is used for adjusting an adjusting intensity value of the video frame according to the motion scene of the video frame, wherein the adjusting intensity value is used for controlling the size of a quantization parameter of a coding region in the video frame;
the calculation module is used for acquiring the quantization parameter of the coding region according to the adjusted intensity value of the video frame and the reference value parameter of the coding region obtained by pre-coding the video frame;
and the coding module is used for coding the video to be coded according to the quantization parameter of the coding region to obtain a coded video.
The embodiment of the application discloses a video decoding device, the device includes:
the second acquisition module is used for acquiring a coded video, and a plurality of coding areas are divided in a video frame of the coded video;
a decoding module, configured to decode the encoded video to obtain a quantization parameter of the coding region, where the quantization parameter is calculated from the adjustment intensity value determined, before encoding, by the motion scene in which the video frame of the encoded video is located, and from the reference value parameter obtained by pre-coding the coding region;
and the reconstruction module is used for constructing a video to be played according to the quantization parameter of the coding region.
The embodiment of the application also discloses an electronic device, which comprises: a processor; and a memory having executable code stored thereon that, when executed, causes the processor to perform a method as described in one or more of the embodiments of the application.
Embodiments of the present application also disclose one or more machine-readable media having executable code stored thereon that, when executed, cause a processor to perform a method as described in one or more of the embodiments of the present application.
Compared with the prior art, the embodiment of the application has the following advantages:
in the embodiments of the present application, by analyzing the motion scene in which a video frame of the video to be encoded is located, the adjustment intensity value of the video frame can be dynamically adjusted according to that motion scene, thereby correcting the magnitude of the quantization parameter of each coding region. During subsequent encoding, the code rate and volume of each coding region can then be adjusted according to its quantization parameter, so that the code rate and volume are optimized while conforming to the viewing requirements of the corresponding motion scene: the perceived picture quality of the encoded video is preserved, its volume is reduced, and the encoding quality is improved.
Drawings
Fig. 1 is an architecture diagram of a video encoding method according to an embodiment of the present application;
fig. 2 is an architecture diagram of a video decoding method according to an embodiment of the present application;
fig. 3 is an architecture diagram of a video encoding method in an e-commerce live broadcast scene according to an embodiment of the present application;
fig. 4 is an architecture diagram of a video encoding method in a short video scene according to an embodiment of the present application;
fig. 5 is an architecture diagram of a video encoding method in a video conference scenario according to an embodiment of the present application;
fig. 6 is an architecture diagram of a video coding method in a network video-on-demand scenario according to an embodiment of the present application;
FIG. 7 is a flowchart illustrating steps of a video encoding method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a video frame according to an embodiment of the present application;
FIG. 9 is a flow chart of steps in another embodiment of a method of video encoding in accordance with an embodiment of the present application;
FIG. 10 is a diagram illustrating a precoding process according to an embodiment of the present application;
fig. 11 is a diagram illustrating a specific example of a precoding process according to an embodiment of the present application;
FIG. 12 is a flowchart illustrating steps of a video decoding method according to an embodiment of the present application;
fig. 13 is a block diagram of a video encoding apparatus according to an embodiment of the present application;
fig. 14 is a block diagram of a video decoding apparatus according to an embodiment of the present application;
fig. 15 is a schematic structural diagram of an apparatus according to an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
To enable those skilled in the art to better understand the present application, the following description is made of the concepts related to the present application:
video to be encoded: that is, the original video file before encoding is a yuv (a color coding, y is brightness, and uv is color degree) video sequence composed of original video frame images.
And (3) encoding video: the video to be coded is processed by video coding to obtain a file, which is also called as a video bare stream file, and the video bare stream file is processed by subsequent packaging to obtain a video-audio video which can be watched.
Coding region: also called Coding Unit (CU), is an image block of a fixed size (e.g., 8 × 8) divided in advance in a video frame, and is also a processing object in the Coding process.
A motion scene: reflecting a scene with an instantaneous or continuous motion process, the motion scene in which the current video frame is located in the embodiment of the present application may affect the magnitude of the quantization parameter, and the motion scene includes a motion scene with a severely changed picture and a motion scene with a slowly changed picture.
Quantization Parameter (QP): the quantization is a process of mapping continuous values of signals into a plurality of discrete amplitudes, so that many-to-one mapping of the values of the signals is realized, the quantization can effectively reduce the value range of the signals, and a better compression effect is obtained; the larger the quantization parameter is, the smaller the coded code stream volume is, and the lower the picture quality is.
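As an informal illustration of this many-to-one mapping, the sketch below quantizes a list of transform coefficients with a step size derived from the quantization parameter. The step formula 2^((QP-4)/6) follows the common H.264/HEVC convention; the function names are illustrative only and not part of the described method.

```python
def qstep(qp):
    # In H.264/HEVC the quantization step size roughly doubles every 6 QP units.
    return 2 ** ((qp - 4) / 6)

def quantize(coeffs, qp):
    # Many-to-one mapping: nearby coefficient values collapse to one level.
    step = qstep(qp)
    return [round(c / step) for c in coeffs]

def dequantize(levels, qp):
    # Reconstruction recovers only the discrete amplitudes, not the originals.
    step = qstep(qp)
    return [lvl * step for lvl in levels]

coeffs = [10.0, 14.0, 52.7, -31.2]
fine = quantize(coeffs, 22)    # small step: more distinct levels, larger stream
coarse = quantize(coeffs, 40)  # large step: fewer levels, smaller stream
```

With the coarse step at QP 40, the two nearby coefficients 10.0 and 14.0 fall into the same level, which is exactly the volume-for-quality trade the larger quantization parameter buys.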
CUTree: in a video encoding process, a method for adjusting the quantization parameter offset of the current coding region based on the degree to which it is referenced by coding regions of other frames.
Lookahead module: when the current frame is encoded in the CUTree algorithm, this module quickly encodes several frames following the current frame in advance to obtain the reference value parameter of the current frame for subsequent reference frames.
Adjustment intensity value: the syntax element CUTree Strength, a parameter controlling the adjustment amplitude of the quantization parameter offset of the current coding region; the quantization parameter offset is superposed on the initial quantization parameter to obtain the adjusted quantization parameter. The larger the adjustment intensity value, the smaller the adjusted quantization parameter.
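The relation between the adjustment intensity value, the reference value parameter, and the adjusted quantization parameter can be sketched as below. This is a hypothetical formulation modeled on the offset logic of open-source CUTree implementations; the application itself does not fix a concrete formula, and all names here are illustrative.

```python
import math

def qp_offset(strength, intra_cost, propagate_cost):
    # The more a region is referenced later (large propagate_cost), the more
    # negative the offset, i.e. the finer it is quantized; the adjustment
    # intensity value (strength) scales the magnitude of that correction.
    return -strength * math.log2((intra_cost + propagate_cost) / intra_cost)

def adjusted_qp(base_qp, strength, intra_cost, propagate_cost):
    # The quantization parameter offset is superposed on the initial QP.
    return base_qp + qp_offset(strength, intra_cost, propagate_cost)
```

For example, a region whose propagate cost is three times its intra cost gets a 4-unit QP reduction at strength 2.0, but no reduction at strength 0: consistent with "the larger the adjustment intensity value, the smaller the adjusted quantization parameter".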
Reference value parameter (propagateCost): assuming that the coding region b of the reference video frame refers to the coding region a of the referenced video frame, the reference value parameter of the coding region a reflects the reference value provided by the coding region a to the coding region b.
Motion search: finding, on the reference frame, the matching coding region that differs least from the current coding region; it is used to identify the motion scene in which the video frame is located.
Motion Vector (MV): the vector connecting the current coding region to its matching coding region, expressing the motion direction of objects in the video frame.
Static coding region: a coding region whose motion vector approaches 0; the position of a static coding region in the current frame is close to the position of its matching coding region in the reference frame.
Pixel cost parameter (cost): reflects the pixel-level difference between the current coding region and its matching coding region; the larger the pixel cost parameter, the smaller the similarity between them.
Intra prediction mode (intra): a mode that predicts using only reconstructed pixels of the current frame.
Inter prediction mode (inter): a mode that predicts the current frame using pixels of previously encoded frames.
Intra prediction cost (intracost): reflects the total amount of information of the video frame in intra prediction mode.
Inter prediction cost (intercost): reflects the total amount of information of the video frame in inter prediction mode; the difference between the intra prediction cost and the inter prediction cost represents the amount of information the current frame inherits from its reference frame.
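The way these costs combine into a reference value parameter can be pictured with the toy functions below, in the spirit of open-source lookahead implementations; the exact arithmetic is an assumption for illustration, not the method claimed by the application.

```python
def propagate_fraction(intra_cost, inter_cost):
    # Share of the current block's information inherited from its reference:
    # if inter prediction is much cheaper than intra prediction, most of the
    # block's content was propagated from the reference frame.
    return max(0.0, 1.0 - inter_cost / intra_cost)

def propagate_to_reference(intra_cost, inter_cost, propagate_cost):
    # Reference value handed back to the referenced region: the block's own
    # information plus what was already propagated to it, scaled by the
    # inherited share.
    return (intra_cost + propagate_cost) * propagate_fraction(intra_cost, inter_cost)
```

A block whose inter cost equals or exceeds its intra cost inherits nothing, so it contributes no reference value to the region it points at.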
Preset index parameter (preset): the encoder uses the preset index parameter to configure coding grades; for example, 10 grades may be set with preset index parameters from 0 to 9, where a larger preset index parameter means slower encoding but higher coding efficiency.
H.265/HEVC: an international video coding standard.
x265: an open-source H.265/HEVC encoder.
s265: a self-developed H.265/HEVC encoder.
Referring to fig. 1, which shows an architecture diagram of a video encoding method provided in an embodiment of the present application, the architecture includes: a down-sampling module, a pre-coding module, an analysis module, a calculation module, and an encoding module.
The framework of the video coding method in the embodiment of the application can be applied to the fields of short videos, video conferences, live videos, network video on demand and the like, and specifically, when a target video needs to be output in the fields, a video to be coded is coded according to the flow of fig. 1 to obtain a coded video, and finally the coded video is packaged according to a preset format to obtain the target video in the preset format for output.
In this process, video encoding performs video compression: its mission is to reduce the volume of the video file as much as possible while ensuring that the quality perceived by the human eye is not noticeably degraded, saving bandwidth cost and improving the fluency of the viewing experience. Different fields also place different requirements on the code rate and volume of the encoded video: one field may require a high-quality video with a high code rate, while another may require a lightweight video with a small volume. Such requirements are met by adjusting related parameters in the encoding process; once the primary requirement is satisfied, the other parameter should perform as well as possible. For example, the code rate of the encoded video is raised as far as possible on the premise of keeping its volume small, or the volume is reduced as far as possible on the premise of keeping its code rate high.
In the video encoding process, a referring coding region of a reference video frame in the video to be encoded draws reference from the referenced coding region of a referenced video frame; by controlling this process, the picture quality loss, and hence the code rate (picture quality) and volume, of each video frame can be controlled. If the referenced coding region provides a large reference value to the referring coding region, the referenced coding region is given a smaller quantization parameter through the control of related parameters, so that its picture quality loss is smaller and its code rate higher; it can then provide sufficient reference value to subsequent referring coding regions and improve their quality. If the referenced coding region provides only a small reference value, its utilization value is considered low, and a lower picture quality will not affect subsequent encoding; it is then given a larger quantization parameter, so that its picture quality loss is larger and its code rate lower, reducing its volume and, in turn, the volume of the whole encoded video.
Further, by analyzing the motion scene in which a video frame picture of the video to be encoded is located, the embodiments of the present application reach a conclusion: given a coding region in a motion scene where the picture changes drastically and a coding region in a motion scene where the picture changes slowly, even if both provide the same reference value to subsequent frames, the picture distortion of the slowly changing region is more easily caught by the human eye, so its picture quality matters more; conversely, because the human eye is insensitive to rapidly changing scenes, lowering the picture quality of a coding region in a drastically changing motion scene does not affect the actual viewing experience.
In addition, image noise also affects the code rate and volume of a coding region. Under high image noise the human eye is insensitive to changes of the picture, so lowering the picture quality of a coding region in such a scene does not affect the actual viewing experience; under low image noise, changes of the picture are more easily caught by the human eye, so the picture quality of the coding region matters more.
Based on the above conclusion, the embodiments of the present application analyze the motion scene in which a video frame of the video to be encoded is located and dynamically adjust the adjustment intensity value of the video frame accordingly, thereby correcting the magnitude of the quantization parameter of each coding region. When the video frame is in a motion scene with a drastically changing picture or large image noise, the adjustment intensity value is reduced, which increases the quantization parameter of the coding regions in the frame; the encoded volume of those regions shrinks, meeting the requirement of reducing the volume of the encoded video. When the video frame is in a motion scene with a slowly changing picture and little image noise, the adjustment intensity value is increased, which decreases the quantization parameter of the coding regions; their encoded code rate rises, meeting the human eye's demand for high picture quality in scenes where changes are easily noticed.
It should be noted that, in a simplified scene, the embodiment of the present application may also set the adjustment intensity value of the video frame only by distinguishing a drastic change of the picture or a slow change of the picture, and reduce the adjustment intensity value of the video frame when it is determined that the video frame is in a motion scene in which the picture is drastically changed; and in the case of determining that the video frame is in a motion scene with slowly changing pictures, increasing the adjustment intensity value of the video frame.
Specifically, referring to fig. 1, the down-sampling module performs resolution down-sampling on the input video to be encoded; for example, a video frame is down-sampled to 1/2 of its original resolution horizontally and 1/2 vertically, producing a down-sampled image at 1/4 of the original resolution. Down-sampling effectively reduces the amount of data computation in the subsequent encoding process while only slightly affecting calculation accuracy. The down-sampled video to be encoded is then input to the pre-coding module for subsequent encoding processing.
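Halving the resolution in both directions can be pictured as averaging each 2 × 2 block of samples. This toy sketch operates on a luma plane stored as nested lists; real encoders typically use optimized down-sampling filters rather than plain averaging, so this is illustration only.

```python
def downsample_half(frame):
    # Average each 2x2 block of samples: 1/2 horizontally and 1/2 vertically,
    # leaving 1/4 of the original sample count.
    h, w = len(frame), len(frame[0])
    return [
        [
            (frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
             + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) // 4
            for x in range(w // 2)
        ]
        for y in range(h // 2)
    ]

frame = [[10, 20, 30, 40],
         [10, 20, 30, 40],
         [50, 60, 70, 80],
         [50, 60, 70, 80]]
small = downsample_half(frame)  # a 2x2 image at 1/4 the sample count
```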
The pre-coding module performs the relevant functions of the Lookahead module in the CUTree algorithm: it pre-encodes the video frames of the video to be encoded in order to obtain the reference value parameter of each coding region of a video frame for subsequent reference frames, an important input to the later calculation of the coding region's quantization parameter.
The analysis module can perform motion search on a video frame of a video to be coded to identify the proportion of a static coding region in the video frame and the image noise accumulated value of the video frame, determine a motion scene of the video frame according to the proportion of the static coding region and the image noise accumulated value of the video frame, and finally allocate the adjustment intensity value corresponding to the video frame according to the motion scene of the video frame.
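The analysis step can be pictured with the toy logic below: the proportion of static coding regions and the accumulated image noise decide the motion scene, which in turn selects the adjustment intensity value. All thresholds and strength values here are hypothetical placeholders for illustration, not values from the application.

```python
def is_static(mv, eps=1.0):
    # A coding region whose motion vector is close to zero is treated as static.
    mx, my = mv
    return abs(mx) <= eps and abs(my) <= eps

def pick_strength(motion_vectors, noise_sum,
                  static_ratio_thresh=0.6, noise_thresh=1000.0,
                  low_strength=1.0, high_strength=3.0):
    # Hypothetical rule: a frame dominated by static regions with low
    # accumulated noise is a slowly changing scene, where distortion is easily
    # noticed, so a larger adjustment intensity value (lower QP, higher rate)
    # is assigned; otherwise the strength is reduced to shrink the volume.
    static_ratio = sum(is_static(mv) for mv in motion_vectors) / len(motion_vectors)
    slow_scene = static_ratio >= static_ratio_thresh and noise_sum <= noise_thresh
    return high_strength if slow_scene else low_strength
```

Either condition failing (too few static regions, or too much noise) pushes the frame into the low-strength branch, matching the rule that drastic change or heavy noise tolerates coarser quantization.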
The calculation module can calculate the quantization parameter of each coding region of the video frame by combining the reference value parameter of the coding region of the video frame output by the pre-coding module to the subsequent reference frame and the adjustment intensity value dynamically set to the video frame according to the motion scene.
The encoding module can encode the whole video to be encoded based on the quantization parameter of each encoding region of the video frame to finally obtain the encoded video, the encoded video is also called as a video bare stream file, and the video bare stream file can obtain the video and audio which can be watched after subsequent encapsulation processing.
Further, in practical applications, the encoder defines a coding mode and a decoding mode matching the coding mode, so that after the viewing terminal obtains the packaged viewable video, the video can be unpacked according to the unpacking mode corresponding to the packaging mode, and then the unpacked video can be decoded according to the decoding mode corresponding to the coding mode, so as to obtain a decoded video stream for viewing.
For example, if the encoding rule defined by s265 is used to encode the video to be encoded to obtain an encoded video, and then the encoded video is encapsulated into the audio-visual video in MP4 format according to the MP4(MPEG-4 Part 14) encapsulation mode and is sent to the viewing terminal, the viewing terminal may decapsulate the audio-visual video in MP4 format according to the MP4 decapsulation mode, and then decode the decapsulated audio-visual video according to the decoding rule defined by s265, so as to obtain a viewable video stream for direct playing.
For decoding, referring to fig. 2, which shows an architecture diagram of a video decoding method provided in an embodiment of the present application: the encoded video is first entropy decoded (binary decoded) to obtain the quantization parameter, transform coefficients, and prediction coefficients of each coding region. The transform coefficients are the coefficients of the transform units in the coding region; during encoding and decoding, both luminance data and chrominance data are handled per transform unit, and the positions and amplitudes of non-zero coefficients are encoded to represent the transform coefficients. The prediction coefficients are the coefficients of the prediction units in the coding region and depend on the prediction mode (intra-frame/inter-frame) selected during encoding. The quantization parameter is, in the embodiments of the present application, the parameter calculated before encoding from the adjustment intensity value determined by the motion scene of the video frame and from the reference value parameter obtained by pre-coding the coding region. The video to be played can then be constructed from the quantization parameter, the transform coefficients, and the prediction coefficients.
Further, on one calculation branch, inverse quantization and inverse transform operations are performed in turn on the quantization parameter and quantized coefficients of a coding region to obtain the residual parameters of the pixel values in the region; inverse quantization and inverse transform are the inverse operations of the quantization and transform performed during encoding, and the residual parameters are the reconstructed residuals, reflecting the mean square error of the pixel values before and after encoding. On the other calculation branch, a prediction operation is performed according to the prediction mode of the prediction units in the coding region to obtain the prediction parameters of the pixel values. Finally, the residual parameters and prediction parameters of the coding region are added to obtain the reconstructed pixel values of the pixels in the region, and the video to be played is constructed from these reconstructed pixel values.
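A minimal sketch of this reconstruction path, with the inverse transform omitted for brevity, might look as follows; the H.264/HEVC-style step-size convention and the 8-bit clamping range are assumptions for illustration, not details fixed by the application.

```python
def reconstruct_region(levels, qp, predictions):
    # Inverse quantization recovers residual parameters from the decoded
    # levels (the inverse transform stage is skipped in this toy model);
    # adding the prediction parameters yields the reconstructed pixel values.
    step = 2 ** ((qp - 4) / 6)  # same step convention assumed at the encoder
    residuals = [lvl * step for lvl in levels]
    # Clamp each residual + prediction sum to the valid 8-bit pixel range.
    return [min(255, max(0, round(p + r))) for p, r in zip(predictions, residuals)]
```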
In order to better describe the scenes to which the video coding method of the embodiment of the present application is applied, several scenes to which the video coding method is specifically applied are now provided:
in an e-commerce live broadcast scenario to which a video coding method is applied, referring to fig. 3, it is shown an architecture diagram of a video coding method in an e-commerce live broadcast scenario provided in an embodiment of the present application, including: the system comprises a live broadcast client, an E-commerce live broadcast server and a viewer client. The client of the live broadcast can integrate an encoding program for implementing the video encoding method of the embodiment of the application.
Specifically, the live broadcast client may execute S11, collect an original live broadcast video stream including a live broadcast picture, encode the original live broadcast video stream according to the encoding method provided in the embodiment of the present application through an encoding program, obtain an encoded video, and encapsulate the encoded video to obtain a target live broadcast video stream. And further executing S12, sending the target live broadcast video stream to a live broadcast service end of the E-commerce, wherein the live broadcast service end of the E-commerce can further execute S13 and issue the target live broadcast video stream to a client of a viewer, and the client of the viewer can execute S14 and sequentially unpack and decode the target live broadcast video stream for viewing.
It should be noted that the encoding program implementing the video encoding method of the embodiments of the present application may instead be integrated in the e-commerce live broadcast server: the live broadcast client sends the original live video stream to the server, and both the encoding of the original live video stream into an encoded video and the encapsulation of the encoded video are then completed at the e-commerce live broadcast server.
In a short video scene to which a video coding method is applied, referring to fig. 4, an architecture diagram of a video coding method in a short video scene provided in an embodiment of the present application is shown, including: the system comprises an author client, a short video server and a viewer client. The author client may integrate an encoding program for implementing the video encoding method according to the embodiment of the present application.
Specifically, the author client may execute S21, collect an original short video stream including the material, encode the original short video stream according to the encoding method provided in the embodiment of the present application through the encoding program, obtain an encoded video, and encapsulate the encoded video, to obtain the target short video. And further executing S22, sending the target short video to the short video service end, wherein the short video service end can further execute S23, issuing the target short video to the viewer client, and the viewer client can execute S24, sequentially decapsulating and decoding the target short video for viewing.
It should be noted that, an encoding program for implementing the video encoding method according to the embodiment of the present application may also be integrated in the short video service end, and the creator client may send the original short video stream to the short video service end, so that a process of encoding the original short video stream by using the encoding program to obtain an encoded video and a process of encapsulating the encoded video may be completed at the short video service end.
In a video conference scene to which the video coding method is applied, referring to fig. 5, which shows an architecture diagram of the video coding method in a video conference scene provided in an embodiment of the present application, the architecture includes: a host client, a conference server, and a viewer client. The host client may integrate an encoding program implementing the video encoding method according to the embodiment of the present application.
Specifically, the host client may execute S31: collect an original conference video stream containing the host picture, encode it through the encoding program to obtain an encoded video, and encapsulate the encoded video to obtain a target conference video stream. The host client then executes S32: send the target conference video stream to the conference server. The conference server executes S33: issue the target conference video stream to the viewer client. The viewer client executes S34: sequentially decapsulate and decode the target conference video stream for viewing.
It should be noted that the conference server may also integrate an encoding program implementing the video encoding method of the embodiment of the present application; the host client may send the original conference video stream to the conference server, so that both the process of encoding the original conference video stream into an encoded video and the process of encapsulating the encoded video are completed at the conference server.
In a network video-on-demand scenario to which the video coding method is applied, referring to fig. 6, which shows an architecture diagram of the video coding method in a network video-on-demand scenario provided in an embodiment of the present application, the architecture includes: a network video-on-demand server and a viewer client. The network video-on-demand server may integrate an encoding program implementing the video encoding method of the embodiment of the application.
Specifically, the network video-on-demand server may execute S41: in response to a video-on-demand request from the viewer client, obtain the original video stream of the requested video, encode it through the encoding program to obtain an encoded video, and encapsulate the encoded video to obtain the on-demand video. The server then executes S42: issue the on-demand video to the viewer client. The viewer client executes S43: sequentially decapsulate and decode the on-demand video for viewing. With the encoding method of the embodiment of the application, the picture viewing quality of the encoded video stream is preserved in the network video-on-demand scenario while the volume of the encoded video stream is reduced.
In the embodiment of the application, by analyzing the motion scene in which each video frame of the video to be coded is located, the adjustment intensity value of the video frame can be dynamically adjusted according to that motion scene, thereby correcting the size of the quantization parameter of each coding region. During subsequent coding, the code rate and volume of each coding region are governed by its quantization parameter, so that code rate and volume are optimized while conforming to the viewing requirement of the corresponding motion scene: the picture viewing quality of the coded video is preserved, the volume of the coded video is reduced, and the coding quality is improved.
Referring to fig. 7, a flowchart illustrating steps of a video encoding method provided by an embodiment of the present application is shown, including:
step 101, a video to be coded is obtained, and a video frame of the video to be coded is divided into a plurality of coding areas.
In the embodiment of the present application, the video to be encoded is the original video file before encoding, i.e., a video sequence composed of original video frame images. A coding region, also referred to as a coding unit, is an image block of a fixed size divided in advance within a video frame, and is the processing object in the encoding process.
For example, referring to fig. 8, which shows a schematic diagram of a video frame provided in an embodiment of the present application, assuming a video frame with a size of 24 × 24 and a coding region size of 8 × 8, the video frame may be divided into 9 coding regions.
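The division above can be sketched as follows; the function name and tuple layout are illustrative and not part of this application:

```python
# Hypothetical sketch: partitioning a frame into fixed-size coding regions,
# matching the 24x24 frame / 8x8 region example above.

def divide_into_regions(frame_w, frame_h, region_size):
    """Return (x, y, w, h) tuples for each coding region, row by row."""
    regions = []
    for y in range(0, frame_h, region_size):
        for x in range(0, frame_w, region_size):
            regions.append((x, y,
                            min(region_size, frame_w - x),
                            min(region_size, frame_h - y)))
    return regions

regions = divide_into_regions(24, 24, 8)
print(len(regions))  # 9 coding regions, matching the example
```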
Step 102, adjusting an adjustment intensity value of the video frame according to the motion scene where the video frame is located, where the adjustment intensity value is used to control the size of a quantization parameter of a coding region in the video frame.
In the embodiment of the present application, use is made of the fact that human vision responds differently to the speed of picture change and to image noise in a video frame. Specifically, a motion search may be performed on a video frame of the video to be encoded to identify the ratio of still coding regions in the video frame and the accumulated image noise value of the video frame; the motion scene in which the video frame is located is determined from that ratio and accumulated noise value; finally, an adjustment intensity value is assigned to the video frame according to the motion scene in which it is located.
It should be noted that the motion scene in which the video frame is located may also be labeled in advance, so that the adjustment intensity value of the video frame can be adjusted according to the labeled motion scene.
Step 103, obtaining a quantization parameter of the coding region according to the adjusted intensity value of the video frame and a reference value parameter of the coding region obtained by pre-coding the video frame.
In the embodiment of the application, a video frame of the video to be coded can be pre-coded by executing the relevant functions of the Lookahead module of the cutree algorithm, obtaining the reference value parameter of each coding region of the video frame toward subsequent reference frames; the reference value parameter is an important input to the subsequent calculation of the quantization parameter of the coding region.
Furthermore, different coding rules have different slice QPs; the slice QP is the basic quantization parameter, so after a coding rule (encoder) is selected, the basic quantization parameter corresponding to that coding rule is also determined.
And step 104, coding the video to be coded according to the quantization parameter of the coding region to obtain a coded video.
In this step, during encoding, the quantization parameter controls the image quality loss of a coding region: the larger the quantization parameter, the larger the image quality loss of the coding region, the smaller its code rate, and the smaller its volume; the smaller the quantization parameter, the smaller the image quality loss of the coding region, the larger its code rate, and the larger its volume.
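As background grounded in H.264/HEVC practice (not stated in this application), the quantization step size grows exponentially with QP, roughly doubling every 6 QP units; this is why a larger quantization parameter coarsens the residual, loses more image detail, and shrinks the bitstream:

```python
# Qstep ~= 2^((QP - 4) / 6) is the approximate H.264/HEVC relation between
# the quantization parameter and the quantization step size.

def quant_step(qp):
    return 2 ** ((qp - 4) / 6.0)

# A +6 QP change doubles the step size, coarsening quantization:
print(quant_step(22), quant_step(28))  # 8.0 16.0
```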
The coded video is also called a video bare stream file, and the video bare stream file can be subjected to subsequent packaging processing to obtain a video-audio video capable of being watched.
Further, comparing the encoding process using the video encoding method provided by the embodiment of the present application with the encoding process not using it, under completely identical encoding configurations, tests on 46 1080p high-definition test files, across 3 encoding grades corresponding to preset index parameters and 2 evaluation indexes, yield the gains shown in Table 1 below:
preset index parameter          psnr-bdrate    ssim-bdrate
2 (encoding grade veryfast)     -3.59%         -1.74%
5 (encoding grade medium)       -3.86%         -1.55%
7 (encoding grade veryslow)     -2.36%         -2.00%
TABLE 1
Here, psnr-bdrate (peak signal-to-noise ratio BD-rate, the mean difference between the rate-distortion curves of the two schemes) is the code rate saved at the same psnr quality, and ssim-bdrate (structural similarity BD-rate, the mean difference between the rate-distortion curves of the two schemes) is the code rate saved at the same ssim quality. It can be seen that, with either index fixed, the encoding process using the video coding method provided by the embodiment of the application achieves a marked code rate saving, thereby improving the coding gain.
In summary, in the embodiment of the present application, by analyzing the motion scene in which each video frame of the video to be encoded is located, the adjustment intensity value of the video frame can be dynamically adjusted according to that motion scene, thereby correcting the size of the quantization parameter of each encoding region. During subsequent encoding, the code rate and volume of each encoding region are governed by its quantization parameter, so that code rate and volume are optimized while conforming to the viewing requirement of the corresponding motion scene: the picture viewing quality of the encoded video is preserved, the volume of the encoded video is reduced, and the encoding quality is improved.
Referring to fig. 9, a flow chart of steps of another video encoding method embodiment of the present application is shown.
The method comprises the following steps:
step 201, obtaining a video to be encoded, where a video frame of the video to be encoded is divided into a plurality of encoding regions.
This step may specifically refer to step 101, which is not described herein again.
Step 202, performing down-sampling processing on the video frame of the video to be encoded, and reducing the resolution of the video frame.
In the embodiment of the application, after the video to be encoded is obtained, it may be subjected to down-sampling to reduce the resolution of its video frames, thereby reducing the amount of computation in the subsequent calculation process with only a slight impact on calculation accuracy.
For example, a video frame of the video to be encoded may be down-sampled by 1/2 horizontally and 1/2 vertically, resulting in a down-sampled image 1/4 the size of the original resolution.
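A minimal sketch of such 2x down-sampling using 2x2 averaging (one common choice; the application does not fix the filter):

```python
# Halving each dimension by 2x2 average pooling yields a lookahead image
# 1/4 the original size, cutting subsequent analysis cost.

def downsample_2x(img):
    """img: list of rows of luma samples with even dimensions."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x+1] + img[y+1][x] + img[y+1][x+1]) // 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

small = downsample_2x([[10, 20], [30, 40]])
print(small)  # [[25]]
```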
It should be noted that the downsampling is an optional processing process, and in some scenes with sufficient computing resources, the downsampling processing may not be performed on the video to be encoded, but the subsequent encoding processing is directly performed on the video to be encoded.
Step 203, obtaining the intra-frame prediction cost parameter and the inter-frame prediction cost parameter of the coding region.
In the embodiment of the present application, the selected coding rule may use an intra-frame prediction mode or an inter-frame prediction mode for inter-frame reference and prediction. The intra-frame prediction mode predicts using only reconstructed pixels of the current frame; the inter-frame prediction mode predicts the current frame using pixels of a previously encoded frame. Once the mode is selected, the intra-frame prediction cost parameter and the inter-frame prediction cost parameter of each coding region can be obtained; both are important inputs to the subsequent calculation of the quantization parameter of the coding region. The intra-frame prediction cost of a coding region represents the total information amount of the current frame, and the difference between the intra-frame and inter-frame prediction costs of the coding region represents the amount of information the current frame transfers from its reference frame.
And 204, pre-coding the video frame, and acquiring a reference value parameter of a coding area of the video frame according to the intra-frame prediction cost parameter and the inter-frame prediction cost parameter.
In this step, referring to fig. 10, which shows a schematic diagram of the precoding process of the present application, when precoding the current frame, the subsequent n frames after the current frame may be selected to construct a precoding window. As shown, one coding region 11 of each frame references one coding region 11 of the adjacent previous frame. It should be noted that the reference chain of the coding region 11 shown in fig. 10 is only one implementation: a coding region 11 of one frame may also reference a coding region 11 several frames earlier, or a coding region 11 of a subsequent frame; the reference chain is not specifically limited in the embodiment of the present application.
Specifically, calculating the reference value parameter of the coding region 11 of the current frame requires to sequentially perform:
first, starting from frame n in the pre-coding window, calculating the information amount info1 obtained from frame n-1 by the coding region 11 in frame n:
info1 = intracost - intercost;
wherein intracost is an intra-frame prediction cost parameter of a coding region 11 in a frame n; intercost is an inter-frame prediction cost parameter of the coding region 11 in the frame n;
secondly, if the reference area of frame n in frame n-1 covers a plurality of coding regions in frame n-1, the value of info1 is allocated in proportion to the area each covered coding region overlaps; after traversing all coding regions of frame n, the reference value parameter of each coding region in frame n-1 toward frame n is obtained and recorded as propagateCost.
Third, the amount of information that frame n-1 obtains from frame n-2 affects not only frame n-1 but also the following frame n, so the amount of information info2 that the encoded region of frame n-1 obtains from frame n-2 is:
Info2 = (0.5 * intracost + propagateCost) * propagate_fraction;
wherein intracost is the intra-frame prediction cost parameter of the coding region 11 in frame n-1; intercost is the inter-frame prediction cost parameter of the coding region 11 in frame n-1; propagateCost is the reference value parameter of each coding region in frame n-1 toward frame n;
propagate_fraction = (intracost - intercost) / intracost;
and fourthly, similarly to the second step, info2 is allocated in proportion to overlap area, and each coding region of frame n-1 is traversed, finally obtaining the reference value parameter of each coding region of frame n-2 toward frame n-1.
And fifthly, the logic of the first through fourth steps is repeated up to the current frame, obtaining the reference value parameter of each coding region of the current frame toward frame 1.
By performing the above logic calculation on all or part of the video frames of the video to be encoded, the reference value parameters of the encoding area of the video frame can be obtained.
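The five-step propagation above can be sketched as follows, in the spirit of the mbtree/cutree lookahead; the data layout, the single-reference chain, and the omission of area-proportional splitting are simplifying assumptions, not the patent's exact implementation:

```python
# Hedged sketch: walking backwards from the newest frame in the precoding
# window, each coding region hands its obtained information to the region
# it references in the previous frame (step 1 uses intra - inter; later
# steps use (0.5 * intra + accumulated cost) * propagate_fraction).

def propagate_costs(frames):
    """frames: list (oldest first) of per-region dicts
       {'intra': cost, 'inter': cost, 'ref': referenced region index}.
       Returns per-frame lists of accumulated propagateCost."""
    n_regions = len(frames[0])
    costs = [[0.0] * n_regions for _ in frames]
    for f in range(len(frames) - 1, 0, -1):
        for i, region in enumerate(frames[f]):
            intra, inter = region['intra'], region['inter']
            if f == len(frames) - 1:
                info = intra - inter                       # first step
            else:
                fraction = (intra - inter) / intra         # propagate_fraction
                info = (0.5 * intra + costs[f][i]) * fraction
            costs[f - 1][region['ref']] += info            # allocate to reference
    return costs

frames = [
    [{'intra': 100, 'inter': 40, 'ref': 0}],  # current frame
    [{'intra': 100, 'inter': 40, 'ref': 0}],  # frame p1
    [{'intra': 100, 'inter': 40, 'ref': 0}],  # frame p2 (newest in window)
]
costs = propagate_costs(frames)
print(costs[1][0], costs[0][0])  # 60.0 66.0  ((100-40), then (50+60)*0.6)
```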
To more clearly explain the specific implementation of step 204, a specific example will now be provided for illustration:
referring to fig. 11, a specific exemplary schematic diagram of a pre-coding process of the present application is shown, assuming that for a current frame, the current frame is pre-coded by p1 frames and p2 frames in a pre-coding window, the area of each video frame is 16 × 16, the size of each coding region is 8 × 8, and a reference region 12 of a coding region 1 of a p2 frame in a p1 frame covers coding regions 1-4 of a p1 frame, and the overlapping areas of the coding regions are s1, s2, s3, and s4, respectively.
The amount of information obtained from the p1 frame by coding region 1 of the p2 frame is info_p2(1) = intracost_p2(1) - intercost_p2(1); the four coding regions in the p1 frame provide coding region 1 of the p2 frame with the following amounts of information:
Pro_p1(1)=(s1/64)×info_p2(1);
Pro_p1(2)=(s2/64)×info_p2(1);
Pro_p1(3)=(s3/64)×info_p2(1);
Pro_p1(4)=(s4/64)×info_p2(1);
further, assuming that the coding region 2 of the p2 frame refers to the coding region 2 of the p1 frame, the coding region 3 of the p2 frame refers to the coding region 3 of the p1 frame, and the coding region 4 of the p2 frame refers to the coding region 4 of the p1 frame, the amount of information provided by the coding region 2 of the p1 frame to the coding region 2 of the p2 frame is:
Pro_p1(2)’=info_p2(2)=intracost_p2(2)-intercost_p2(2);
the amount of information provided by coding region 3 of the p1 frame to coding region 3 of the p2 frame is:
Pro_p1(3)’=info_p2(3)=intracost_p2(3)-intercost_p2(3);
the amount of information provided by coding region 4 of the p1 frame to coding region 4 of the p2 frame is:
Pro_p1(4)’=info_p2(4)=intracost_p2(4)-intercost_p2(4);
then the reference value parameters propagateCost of the four coding regions in the p1 frame are:
propagatecost_p1(1) = Pro_p1(1);
propagatecost_p1(2) = Pro_p1(2) + Pro_p1(2)';
propagatecost_p1(3) = Pro_p1(3) + Pro_p1(3)';
propagatecost_p1(4) = Pro_p1(4) + Pro_p1(4)';
finally, assuming that coding regions 1-4 of the p1 frame reference coding regions 1-4 of the current frame, respectively, the reference value parameter of coding region 1 of the current frame toward coding region 1 of the p1 frame is:
propagatecost_curr(1) = (0.5 × intracost_p1(1) + propagatecost_p1(1)) × ((intracost_p1(1) - intercost_p1(1)) / intracost_p1(1));
the reference value parameter of coding region 2 of the current frame toward coding region 2 of the p1 frame is:
propagatecost_curr(2) = (0.5 × intracost_p1(2) + propagatecost_p1(2)) × ((intracost_p1(2) - intercost_p1(2)) / intracost_p1(2));
the reference value parameter of coding region 3 of the current frame toward coding region 3 of the p1 frame is:
propagatecost_curr(3) = (0.5 × intracost_p1(3) + propagatecost_p1(3)) × ((intracost_p1(3) - intercost_p1(3)) / intracost_p1(3));
the reference value parameter of coding region 4 of the current frame toward coding region 4 of the p1 frame is:
propagatecost_curr(4) = (0.5 × intracost_p1(4) + propagatecost_p1(4)) × ((intracost_p1(4) - intercost_p1(4)) / intracost_p1(4));
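The p2 -> p1 -> current-frame arithmetic above can be reproduced with made-up cost values (the example gives no concrete numbers); the four overlap areas are assumed equal here:

```python
# Hypothetical numbers for the worked example: coding region 1 of the p2
# frame overlaps four p1 regions with areas s1..s4 out of 8x8 = 64 pixels.

intracost_p2_1, intercost_p2_1 = 200, 80
s1, s2, s3, s4 = 16, 16, 16, 16          # assumed equal overlap areas

info_p2_1 = intracost_p2_1 - intercost_p2_1          # 120
pro_p1 = [(s / 64) * info_p2_1 for s in (s1, s2, s3, s4)]
print(pro_p1)  # [30.0, 30.0, 30.0, 30.0]

# Region 1 of p1 then propagates toward region 1 of the current frame:
intracost_p1_1, intercost_p1_1 = 150, 60
propagatecost_p1_1 = pro_p1[0]
propagatecost_curr_1 = ((0.5 * intracost_p1_1 + propagatecost_p1_1)
                        * (intracost_p1_1 - intercost_p1_1) / intracost_p1_1)
print(propagatecost_curr_1)  # (75 + 30) * 90 / 150 = 63.0
```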
step 205, adjusting an adjustment intensity value of the video frame according to the motion scene where the video frame is located, where the adjustment intensity value is used to control the size of a quantization parameter of a coding region in the video frame.
This step may specifically refer to step 102, which is not described herein again.
Optionally, the adjusted intensity value is in inverse proportional relation to the quantization parameter.
Specifically, in the encoding process, the adjustment intensity value is inversely related to the quantization parameter, which controls the image quality loss of a coding region: the smaller the adjustment intensity value, the larger the quantization parameter, the larger the image quality loss of the coding region, and the smaller its code rate and volume; the larger the adjustment intensity value, the smaller the quantization parameter, the smaller the image quality loss of the coding region, and the larger its code rate and volume.
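One way the inverse relation could be realized, modeled on the x264-style lookahead QP offset (an assumption for illustration, not the patent's stated formula):

```python
# Hedged sketch: qp_offset = -strength * log2((intracost + propagateCost)
# / intracost), so a larger strength pushes QP further down for regions
# that later frames reference heavily.
import math

def qp_offset(intracost, propagate_cost, strength):
    return -strength * math.log2((intracost + propagate_cost) / intracost)

base_qp = 26
for strength in (0.4, 0.5, 0.6):
    qp = base_qp + qp_offset(intracost=100, propagate_cost=300, strength=strength)
    print(round(qp, 2))  # QP drops as strength grows: 25.2, 25.0, 24.8
```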
Optionally, in an implementation manner, the step 205 includes:
sub-step 2051, in the case where it is determined that said video frame is in a motion scene with a drastic change in picture, reduces said adjusted intensity value.
Sub-step 2052, in the case where it is determined that the video frame is in a motion scene with slowly changing pictures, increases the adjusted intensity value.
In the embodiment of the present application, analyzing the motion scene in which a video frame picture of the video to be coded is located yields the following conclusion: given a coding region in a motion scene with drastically changing pictures and a coding region in a motion scene with slowly changing pictures, even if both have the same reference value for subsequent frames, picture distortion in the slowly changing scene is more easily captured by human eyes, so the picture quality of the coding region matters more there; conversely, since human eyes are insensitive to rapidly changing scenes, reducing the image quality of a coding region in a drastically changing motion scene does not affect the actual viewing experience.
Based on the above conclusion, in one implementation scheme, the adjustment intensity value of a video frame is set by distinguishing drastic picture change from slow picture change. When the video frame is determined to be in a motion scene with drastically changing pictures, its adjustment intensity value is reduced, which increases the quantization parameter of the coding regions in the video frame, reduces their encoded volume, and meets the requirement of reducing the volume of the coded video. When the video frame is determined to be in a motion scene with slowly changing pictures, its adjustment intensity value is increased, which reduces the quantization parameter of the coding regions, raises their encoded code rate, and meets the requirement of high image quality in scenes where human eyes easily capture changes.
Optionally, in another implementation manner, the step 205 includes:
sub-step 2053, in the case where it is determined that the video frame is in a motion scene with a drastic picture change or in a motion scene with a high image noise, reduces the adjusted intensity value.
Sub-step 2054, in the case where it is determined that the video frame is in a motion scene with a slowly changing picture and in a motion scene with less image noise, increasing the adjusted intensity value.
In another implementation of the embodiment of the application, in addition to analyzing the speed of picture change of the video frames, the code rate and volume of the coding regions are also influenced by the image noise of the video frame. Since human eyes are insensitive to picture changes when image noise is high, reducing the image quality of a coding region in a high-noise motion scene does not affect the actual viewing experience; conversely, picture changes are more easily captured by human eyes when image noise is low, so the picture quality of the coding region matters more in that case.
Therefore, by analyzing the motion scene in which each video frame of the video to be coded is located, the method can dynamically adjust the video frame's adjustment intensity value according to that scene and thereby correct the quantization parameter of its coding regions: when the video frame is in a motion scene with drastic picture change or high image noise, its adjustment intensity value is reduced, increasing the quantization parameter of the coding regions, reducing their encoded volume, and meeting the requirement of reducing the volume of the coded video; when the video frame is in a motion scene with slowly changing pictures and low image noise, its adjustment intensity value is increased, reducing the quantization parameter of the coding regions, raising their encoded code rate, and meeting the high image quality requirement of scenes where human eyes easily capture changes.
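A minimal sketch of the branching described above; the function name, base strength, and fixed delta are assumptions:

```python
# Hedged sketch of the scene-based strength adjustment: lower strength
# (-> higher QP, smaller output) for drastic motion or noisy frames,
# higher strength (-> lower QP, better quality) otherwise.

def adjust_strength(strength, drastic_motion, high_noise, delta=0.1):
    if drastic_motion or high_noise:
        return strength - delta
    return strength + delta

print(round(adjust_strength(0.5, drastic_motion=True,  high_noise=False), 2))  # 0.4
print(round(adjust_strength(0.5, drastic_motion=False, high_noise=False), 2))  # 0.6
```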
Optionally, the step of increasing the adjustment intensity value may be specifically implemented by summing the current adjustment intensity value and a preset adjustment threshold value to obtain an increased adjustment intensity value.
Optionally, the step of reducing the adjustment intensity value may be specifically implemented by subtracting the preset adjustment threshold from the current adjustment intensity value to obtain the reduced adjustment intensity value.
In this embodiment, a preset adjustment threshold may be set, so that when the intensity adjustment value is adjusted, the current adjustment intensity value is added to the preset adjustment threshold, or the current adjustment intensity value is subtracted from the preset adjustment threshold, so as to implement the specific adjustment.
For example, assuming the adjustment intensity value before adjustment is 0.5 and the preset adjustment threshold is 0.1, when the adjustment intensity value is increased, the adjusted value is 0.5 + 0.1 = 0.6; when the adjustment intensity value is decreased, the adjusted value is 0.5 - 0.1 = 0.4.
Optionally, the method may further include:
step a1, a still encoded region in the video frame is determined.
In the embodiment of the present application, a coding region of the current video frame may obtain useful information from a reference video frame by referencing one of its coding regions. A still coding region in the current video frame is a coding region whose motion vector approaches 0, that is, its position in the current video frame is close to the position of the referenced coding region in the reference video frame. In particular, still coding regions in a video frame may be determined by motion search techniques.
Step A2, obtaining the number ratio of the still coding areas in the video frame in all the coding areas of the video frame.
After the number of the still coding regions in the video frame is determined, the number ratio of the still coding regions in all coding regions of the video frame can be further calculated, and the number ratio is helpful for the subsequent determination of the motion scene in which the current video frame is located.
Step a3, in case the number ratio is greater than a first threshold, determining that the video frame is in a motion scene with slowly changing pictures.
Step A4, in a case where the number ratio is smaller than a second threshold, determining that the video frame is in a motion scene with drastically changing pictures; the first threshold is greater than or equal to the second threshold.
In the embodiment of the application, the comparison between the number ratio and the threshold value can be performed by presetting a first threshold value or a second threshold value, so that the motion scene where the video frame is located is determined according to the comparison result.
In a preferred embodiment, the first threshold may be set to 62% and the second threshold to 12%. The first threshold and the second threshold may also be equal, for example both 50%.
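The ratio test above can be sketched with the example thresholds (first = 62%, second = 12%); the function name and the intermediate "normal" label are assumptions:

```python
# Hedged sketch: classify the motion scene by the ratio of still coding
# regions among all coding regions of the frame.

def classify_motion(num_still, num_total, first=0.62, second=0.12):
    ratio = num_still / num_total
    if ratio > first:
        return 'slow'      # picture changes slowly
    if ratio < second:
        return 'drastic'   # picture changes drastically
    return 'normal'

print(classify_motion(8, 9))   # 'slow'    (ratio ~0.89 > 0.62)
print(classify_motion(1, 16))  # 'drastic' (ratio ~0.06 < 0.12)
```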
Optionally, the method may further include:
step a5, determining the first threshold and the second threshold according to preset index parameters adopted by the currently used encoding rule, where the preset index parameters are in direct proportional relationship with the first threshold and the second threshold, respectively.
In the embodiment of the present application, the encoder may configure the encoding levels using preset index parameters, for example, 10 encoding levels may be set, the preset index parameters range from 0 to 9, the encoding speed becomes slower as the preset index parameters become larger, and the encoding efficiency increases as the preset index parameters become larger.
Therefore, based on the relationship between the preset index parameter and the coding efficiency and speed, the first and second thresholds can be determined from the preset index parameter adopted by the currently used coding rule, with the preset index parameter in direct proportion to each of them. That is, the larger the preset index parameter (i.e., the slower the coding speed and the higher the coding efficiency), the higher the first threshold, so the stricter the condition for judging a video frame to be in a slowly changing motion scene, reducing the number of frames so judged; likewise, the higher the second threshold, the looser the condition for judging a video frame to be in a drastically changing motion scene, increasing the number of frames so judged. This effectively reduces the volume of the coded video, and vice versa.
Optionally, the method may further include:
step B1, determining still encoded regions in the video frame.
This step can be referred to as step a1 and will not be described herein.
Step B2, obtaining pixel cost parameters of the still coding regions; the pixel cost parameter reflects the similarity between a still coding region and the corresponding coding region in the reference video frame of the video frame, and is negatively correlated with that similarity.
In the embodiment of the present application, in the process of performing motion search on a coding region, in addition to its motion vector, the pixel-level difference between the current coding region and the reference coding region may also be obtained; this pixel-level difference is the pixel cost parameter of the current coding region, and it is negatively correlated with the similarity between the current coding region and the reference coding region. This step may obtain a pixel cost parameter for each still coding region.
And step B3, obtaining the average pixel cost parameter of all the static encoding areas in the video frame according to the pixel cost parameter of the static encoding area.
In this step, after the pixel cost parameter of each still encoding region is obtained, the average pixel cost parameter of the still encoding region in the video frame may be calculated according to the number of the still encoding regions and the sum of the pixel cost parameters of all the still encoding regions.
And step B4, determining that the video frame is in a motion scene with less image noise when the average pixel cost parameter is smaller than a third threshold value.
And step B5, determining that the video frame is in a motion scene with high image noise when the average pixel cost parameter is greater than a fourth threshold, where the third threshold is less than or equal to the fourth threshold.
In the embodiment of the application, the comparison between the average pixel cost parameter and the threshold value can be performed by presetting a third threshold value or a fourth threshold value, so that the motion scene where the video frame is located is determined according to the comparison result.
In a preferred embodiment, the third threshold may be set to 30 and the fourth threshold may be set to 50. The third threshold value may be equal to the fourth threshold value.
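The noise test can be sketched with the example thresholds (third = 30, fourth = 50); the cost values would come from the motion-search pixel cost parameters described above, and the names are assumptions:

```python
# Hedged sketch: classify image noise by the average pixel cost of the
# still coding regions (e.g. SAD/SATD residual against the reference).

def classify_noise(still_region_costs, third=30, fourth=50):
    avg = sum(still_region_costs) / len(still_region_costs)
    if avg < third:
        return 'low_noise'
    if avg > fourth:
        return 'high_noise'
    return 'normal'

print(classify_noise([10, 20, 15]))   # 'low_noise'  (avg 15 < 30)
print(classify_noise([60, 70, 80]))   # 'high_noise' (avg 70 > 50)
```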
Optionally, the method may further include:
step B6, determining the third threshold and the fourth threshold according to preset index parameters adopted by the currently used encoding rule, where the preset index parameters are in inverse proportional relation with the third threshold and the fourth threshold, respectively.
In the embodiment of the present application, the encoder may configure the encoding levels using preset index parameters, for example, 10 encoding levels may be set, the preset index parameters range from 0 to 9, the encoding speed becomes slower as the preset index parameters become larger, and the encoding efficiency increases as the preset index parameters become larger.
Therefore, based on the relationship between different preset index parameters and the coding efficiency and coding speed, the third threshold and the fourth threshold can be determined from the preset index parameter adopted by the currently used coding rule, with the preset index parameter inversely proportional to the third threshold and to the fourth threshold respectively. That is, the slower the coding speed and the higher the coding efficiency, the smaller the third threshold, so the condition for judging that a video frame is in a motion scene with less image noise becomes stricter and fewer video frames are judged to be in such a scene; meanwhile, the smaller the fourth threshold, the looser the condition for judging that a video frame is in a motion scene with high image noise, so more video frames are judged to be in such a scene, thereby effectively reducing the volume of the coded video; and vice versa.
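A hedged sketch of step B6 (Python; the 0-9 preset index follows the example above, but the base constants 60 and 100 are illustrative assumptions — the source does not give the proportionality constants):

```python
def noise_thresholds(preset_index, base_third=60.0, base_fourth=100.0):
    """Derive the third and fourth thresholds from the preset index (0-9).

    Both thresholds are inversely proportional to the preset index: a
    slower, more efficient preset (larger index) lowers the third threshold
    (stricter low-noise judgment) and lowers the fourth threshold (looser
    high-noise judgment), so more frames are treated as high-noise scenes
    and the coded volume shrinks.
    """
    divisor = preset_index + 1  # shift by one so preset 0 is well defined
    return base_third / divisor, base_fourth / divisor
```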
In this embodiment, it should be noted that, if the first threshold is the same as the second threshold, and the third threshold is the same as the fourth threshold, all the conditions included in the determination of the motion scene where the video frame is located and the adjustment of the adjustment intensity value are:
and reducing the adjusting intensity value under the condition that the video frame is determined to be in a motion scene with a severely changed picture or in a motion scene with high image noise.
And in the case that the video frame is determined to be in a motion scene with slowly changing pictures and in a motion scene with less image noise, increasing the adjustment intensity value.
If the first threshold differs from the second threshold and/or the third threshold differs from the fourth threshold, the judgment of the motion scene where the video frame is located and the adjustment of the adjustment intensity value may also cover cases beyond those above, such as the case where the number proportion is smaller than the first threshold but larger than the second threshold, or the case where the average pixel cost parameter is larger than the third threshold but smaller than the fourth threshold.
Optionally, the process of determining a still coding region in the video frame includes:
and step C1, acquiring a reference video frame of the video frame from the video to be coded.
In the embodiment of the present application, useful information of a current video frame of a video to be encoded can be obtained from another reference video frame, so this step can obtain a reference video frame of a video frame from the video to be encoded.
Step C2, determining a second coding region matching the first coding region of the video frame by motion search in the reference video frame.
In the embodiment of the present application, the motion search is the process of finding, on the reference frame, the matching coding region with the smallest difference from the current coding region; therefore, the second coding region determined by the motion search to match the first coding region of the video frame is the coding region whose pixels are closest to those of the first coding region.
Step C3, determining the difference between the coordinates of the second coding region and the coordinates of the first coding region as the motion vector of the first coding region relative to the second coding region.
In this embodiment of the present application, a motion vector mv of the first coding region relative to the second coding region may be obtained through motion search, where the motion vector mv is specifically (mv.x, mv.y), mv.x is a difference between an abscissa of the second coding region and an abscissa of the first coding region, and mv.y is a difference between an ordinate of the second coding region and an ordinate of the first coding region.
Step C4, determining the first coding region with the motion vector smaller than the fifth threshold as the static coding region.
In this embodiment, since the motion vector (mv.x, mv.y) is a coordinate pair, the fifth threshold may correspondingly be set as a threshold coordinate (a, b); a motion vector smaller than the fifth threshold then means that mv.x is smaller than a and mv.y is smaller than b, i.e. the motion vector approaches 0. If the motion vector of the first coding region is smaller than the fifth threshold, the first coding region is determined to be a static coding region.
In order to accurately describe the traversal of each coding region to determine the static coding regions therein, and the statistics of the number of static coding regions and of their pixel cost parameters, an example illustrates the whole implementation process:
firstly, creating three parameters cuIndex, staticCuNum and staticCuCost, and initializing them: cuIndex is the index number of the current coding region, initially 0, where 0 represents the first coding region in the coding order of the current frame; staticCuNum represents the number of static coding regions, initially 0; staticCuCost represents the sum of the pixel cost parameters of the static coding regions, initially 0.
And secondly, performing motion search on the current coding region to obtain a motion vector and a pixel cost parameter of the current coding region.
And thirdly, when the motion vector of the current coding region is smaller than the fifth threshold, determining that the current coding region is a static coding region, adding 1 to staticCuNum and adding the pixel cost parameter of the current coding region to staticCuCost; when the motion vector of the current coding region is greater than or equal to the fifth threshold, determining that the current coding region is not a static coding region. In either case, cuIndex is then increased by 1 and the analysis proceeds to the next coding region.
And fourthly, after all coding regions have been traversed, obtaining the number of static coding regions from staticCuNum, and determining the average pixel cost parameter of the video frame from the value of staticCuCost and the number of static coding regions.
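The four steps above can be sketched as follows (Python; the per-region `(mv_x, mv_y, pixel_cost)` tuples and the threshold coordinate are illustrative assumptions — a real encoder would obtain them from its motion search):

```python
FIFTH_THRESHOLD = (2, 2)  # hypothetical threshold coordinate (a, b)

def scan_static_regions(regions):
    """Traverse every coding region of a frame, counting static regions
    and accumulating their pixel cost parameters.

    `regions` is a list of (mv_x, mv_y, pixel_cost) tuples produced by
    the motion search. Returns (static_count, average_pixel_cost).
    """
    cu_index = 0          # index of the current coding region
    static_cu_num = 0     # number of static coding regions
    static_cu_cost = 0.0  # sum of pixel cost parameters of static regions
    a, b = FIFTH_THRESHOLD
    while cu_index < len(regions):
        mv_x, mv_y, pixel_cost = regions[cu_index]
        # A region is static when both motion-vector components are below
        # the fifth threshold, i.e. the motion vector approaches zero.
        if abs(mv_x) < a and abs(mv_y) < b:
            static_cu_num += 1
            static_cu_cost += pixel_cost
        cu_index += 1  # move on to the next coding region either way
    avg_cost = static_cu_cost / static_cu_num if static_cu_num else 0.0
    return static_cu_num, avg_cost
```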
Optionally, in another implementation manner, the step 205 includes:
substep 2055, basic adjustment parameters are obtained.
Sub-step 2056, adjusting the adjusted intensity value by adjusting the basic adjustment parameter according to the motion scene of the video frame, wherein the basic adjustment parameter is in inverse proportion to the adjusted intensity value.
In the embodiment of the present application, the adjustment intensity value cutreeStrength = 5.0 × (1 − qcompress), where the basic adjustment parameter is qcompress, a compression ratio parameter for controlling the amount of compression; it can be seen that the basic adjustment parameter qcompress and the adjustment intensity value cutreeStrength are inversely related.
Based on this characteristic, the adjustment intensity value can be adjusted by adjusting the basic adjustment parameter according to the motion scene of the video frame: when the video frame is determined to be in a motion scene with a drastically changing picture or in a motion scene with high image noise, the adjustment intensity value cutreeStrength is reduced by increasing the basic adjustment parameter qcompress; when the video frame is determined to be in a motion scene with a slowly changing picture and in a motion scene with less image noise, the adjustment intensity value cutreeStrength is increased by reducing the basic adjustment parameter qcompress. For example, when reducing cutreeStrength by increasing qcompress, if the original value of qcompress is 0.6, it may be increased to 0.7.
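A minimal sketch of sub-steps 2055 to 2056 (Python; the 0.1 step size and the clamping bounds are illustrative assumptions, while the relation cutreeStrength = 5.0 × (1 − qcompress) follows the text above):

```python
def adjust_strength(qcompress, scene):
    """Adjust the basic adjustment parameter qcompress according to the
    motion scene, then derive the adjustment intensity value
    cutreeStrength = 5.0 * (1 - qcompress).
    """
    if scene in ("drastic_change", "high_noise"):
        # Larger qcompress -> smaller adjustment intensity value.
        qcompress = min(qcompress + 0.1, 1.0)
    elif scene in ("slow_change", "low_noise"):
        # Smaller qcompress -> larger adjustment intensity value.
        qcompress = max(qcompress - 0.1, 0.0)
    return 5.0 * (1.0 - qcompress)
```

With qcompress initially 0.6, a high-noise scene raises it to 0.7 and lowers the strength, matching the worked example in the text.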
And step 206, acquiring basic quantization parameters matched with the current coding rule.
In the embodiment of the present application, different coding rules have different slice QPs, where a slice QP is a basic quantization parameter, and after one coding rule (encoder) is selected, the basic quantization parameter corresponding to the coding rule is also selected.
In practical application, the code stream structure of the whole coding rule can be divided into two layers: a network abstraction layer (NAL) and a video coding layer (VCL). At the NAL layer, the code stream of the coding rule may be represented as a series of NAL units, with different types of syntax elements contained in different NAL units. The portion that actually holds the image data of the original video is held in the NAL units of the VCL layer. This part of the data is called a slice in the code stream. One slice contains part or all of the data of one frame of image; in other words, one frame of video image can be encoded into one or several slices. A slice contains at least one macroblock and at most the data of the entire frame. In different coding implementations, the number of slices formed from the same frame image is not necessarily the same.
The purpose of slice design in the coding rules is mainly to prevent the spread of bit errors, because decoding is independent between different slices: the data referenced by the decoding process of a slice (e.g. in predictive coding) cannot cross the slice boundary. In the embodiment of the present application, one coding rule may correspond to one basic quantization parameter slice QP.
And step 207, obtaining the quantization parameter offset of the coding region according to the intra-frame prediction cost parameter, the inter-frame prediction cost parameter, the adjustment intensity value and the reference value parameter.
In the embodiment of the present application, the quantization parameter offset deltaQP is calculated as follows:
[Formula image omitted in the source: the quantization parameter offset deltaQP is computed from the intra-frame prediction cost parameter intracost, the inter-frame prediction cost parameter intercost, the adjustment intensity value cutreeStrength and the reference value parameter propagateCost.]
as can be seen from the above, the quantization parameter offset deltaQP may be calculated based on the intra-frame prediction cost parameter intracost, the inter-frame prediction cost parameter intercost, the adjusted intensity value cutrestrength, and the reference value parameter propagateCost. Where the quantization parameter offset deltaQP is a negative value.
And 208, correcting the basic quantization parameter through the quantization parameter offset to obtain the quantization parameter of the coding region.
Optionally, step 208 may specifically be implemented by using the difference between the base quantization parameter and the absolute value of the quantization parameter offset as the quantization parameter of the coding region.
In the embodiment of the present application, the quantization parameter QP of a coding region is equal to the basic quantization parameter slice QP minus |quantization parameter offset deltaQP|; that is, the difference between the basic quantization parameter slice QP and the absolute value of the quantization parameter offset deltaQP is used as the quantization parameter QP of the coding region (since deltaQP is negative, this is equivalent to slice QP + deltaQP).
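The correction of steps 207 to 208 reduces to a single expression (Python sketch; names are illustrative):

```python
def region_qp(slice_qp, delta_qp):
    """Correct the base quantization parameter with the (negative)
    quantization parameter offset: QP = slice QP - |deltaQP|.

    Because deltaQP is negative, the correction always lowers the QP,
    i.e. spends more bits on the coding region.
    """
    return slice_qp - abs(delta_qp)
```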
And 209, coding the video to be coded according to the quantization parameter of the coding region to obtain a coded video.
This step may specifically refer to step 104, which is not described herein again.
Optionally, the method may further include:
step 210, obtaining a mapping relation between the index parameter and the adjusted intensity value, and a query parameter of the video frame.
And step 211, determining the adjustment intensity value matched with the query parameter as the adjustment intensity value of the video frame according to the mapping relation.
Wherein the index parameter and the query parameter include at least one of: the proportion of the number of static coding regions in the video frame among all coding regions of the video frame; the average pixel cost parameter of the static coding regions in the video frame; the target code rate set for the coded video; and the preset index parameter adopted by the currently used coding rule.
In this embodiment of the present application, a more refined continuous adjustment scheme may also be provided for the adjustment of the adjustment intensity value: a mapping relation between the index parameter and the adjustment intensity value is fitted in advance; before adjustment, the query parameter of the current video frame is obtained and matched against the index parameters in the mapping relation, and the adjustment intensity value corresponding to the index parameter matched by the query parameter is determined as the adjustment intensity value of the current video frame.
The index parameter and the query parameter may be of the same type; that is, each includes at least one of: the proportion of the number of static coding regions in the video frame among all coding regions of the video frame; the average pixel cost parameter of the static coding regions in the video frame; the target code rate set for the coded video; and the preset index parameter adopted by the currently used coding rule.
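A hedged sketch of this lookup (Python; the fitted table below is invented for illustration, taking the proportion of static coding regions as the index parameter):

```python
# Hypothetical fitted mapping from an index parameter (the proportion of
# static coding regions in the frame) to an adjustment intensity value.
STRENGTH_TABLE = [
    (0.0, 1.0),   # few static regions: drastic motion, low strength
    (0.5, 2.0),
    (0.9, 3.0),   # mostly static: slow motion, high strength
]

def lookup_strength(query_ratio):
    """Return the adjustment intensity value whose index parameter best
    matches the query parameter (nearest entry in the fitted mapping)."""
    return min(STRENGTH_TABLE, key=lambda kv: abs(kv[0] - query_ratio))[1]
```

In practice the fitted mapping could be a continuous curve rather than a table; the nearest-entry match stands in for matching the query parameter against the index parameters.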
In summary, in the embodiment of the present application, the intensity value of the video frame can be dynamically adjusted according to the motion scene by analyzing the motion scene where the video frame of the video to be encoded is located, and then the size of the quantization parameter of the encoding region is corrected, and during subsequent encoding, the code rate and the volume of the encoding region can be adjusted according to the quantization parameter of the encoding region, so that the code rate and the volume are optimized on the basis of conforming to the viewing requirement of the corresponding motion scene, the picture viewing quality of the encoded video is met, the volume of the encoded video is reduced, and the encoding quality is improved.
Referring to fig. 12, which shows a flowchart of steps of a video decoding method provided in an embodiment of the present application, including:
step 301, obtaining a coded video, in which a video frame of the coded video is divided into a plurality of coding regions.
Step 302, decoding the encoded video to obtain the quantization parameter of the encoded region, where the quantization parameter is: and adjusting the strength value determined by the motion scene where the video frame of the coded video is located before coding, and calculating the parameters obtained by the reference value parameters obtained by pre-coding the coding region.
And 303, constructing a video to be played according to the quantization parameter of the coding region.
The specific implementation process of steps 301 to 303 can refer to the related description of fig. 2 and is not repeated here. In the embodiment of the application, during encoding, the code rate and the volume of each coding region are adjusted according to its optimized quantization parameter, so that in the decoded video stream the video frames in motion scenes with drastic picture changes or high image noise have a smaller volume, and the image-quality loss in these frames can hardly be perceived by human eyes, thereby saving the volume of the video to be played; video frames in motion scenes with slowly changing pictures or less image noise have a higher code rate, so that the image-quality details in these low-motion, low-noise frames are better captured by human eyes, improving the viewing quality.
In summary, in the embodiment of the present application, the intensity value of the video frame can be dynamically adjusted according to the motion scene by analyzing the motion scene where the video frame of the video to be encoded is located, and then the size of the quantization parameter of the encoding region is corrected, and during subsequent encoding, the code rate and the volume of the encoding region can be adjusted according to the quantization parameter of the encoding region, so that the code rate and the volume are optimized on the basis of conforming to the viewing requirement of the corresponding motion scene, the picture viewing quality of the encoded video is met, the volume of the encoded video is reduced, and the encoding quality is improved.
Referring to fig. 13, a block diagram of a video encoding apparatus according to an embodiment of the present application is shown, including:
a first obtaining module 401, configured to obtain a video to be encoded, where a video frame of the video to be encoded is divided into multiple encoding regions;
an adjusting module 402, configured to adjust an adjustment intensity value of the video frame according to a motion scene where the video frame is located, where the adjustment intensity value is used to control a size of a quantization parameter of a coding region in the video frame;
a calculating module 403, configured to obtain a quantization parameter of the coding region according to the adjusted intensity value of the video frame and a reference value parameter of the coding region obtained by precoding the video frame;
and the encoding module 404 is configured to encode the video to be encoded according to the quantization parameter of the encoding region to obtain an encoded video.
Optionally, the adjusting module 402 includes:
the first adjusting submodule is used for reducing the adjustment intensity value when the video frame is determined to be in a motion scene with a drastically changing picture;
and the second adjusting sub-module is used for increasing the adjusting intensity value under the condition that the video frame is determined to be in a motion scene with slowly changing pictures.
Optionally, the adjusting module 402 includes:
the third adjusting submodule is used for reducing the adjustment intensity value when the video frame is determined to be in a motion scene with a drastically changing picture or in a motion scene with high image noise;
and the fourth adjusting sub-module is used for increasing the adjusting intensity value under the condition that the video frame is determined to be in a motion scene with slowly changing pictures and in a motion scene with less image noise.
Optionally, the adjusted intensity value is in inverse proportional relation to the quantization parameter.
Optionally, the increasing the adjustment intensity value includes:
adding a preset adjustment threshold to the current adjustment intensity value to obtain an increased adjustment intensity value;
the reducing the adjustment intensity value includes:
subtracting the preset adjustment threshold from the current adjustment intensity value to obtain a reduced adjustment intensity value.
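A minimal sketch of these two operations (Python; the value of the preset adjustment threshold is an illustrative assumption):

```python
ADJUST_STEP = 0.5  # hypothetical preset adjustment threshold

def increase_strength(current):
    """Increase: add the preset adjustment threshold to the current value."""
    return current + ADJUST_STEP

def decrease_strength(current):
    """Decrease: subtract the preset adjustment threshold from the current value."""
    return current - ADJUST_STEP
```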
Optionally, the apparatus further comprises:
a first determining module for determining a still encoded region in the video frame;
the proportion module is used for acquiring the proportion of the number of static coding regions in the video frame among all the coding regions of the video frame;
the first judgment module is used for determining that the video frame is in a motion scene with a slowly changing picture when the number proportion is greater than a first threshold;
the second judgment module is used for determining that the video frame is in a motion scene with a drastically changing picture when the number proportion is smaller than a second threshold; the first threshold is greater than or equal to the second threshold.
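As a sketch of the first and second judgment modules (Python; the threshold values are illustrative assumptions):

```python
def classify_motion_scene(static_count, total_count,
                          first_threshold=0.6, second_threshold=0.3):
    """Judge the motion scene from the proportion of static coding regions.

    A proportion above the first threshold means the picture changes
    slowly; a proportion below the second threshold means the picture
    changes drastically; in between, neither judgment applies.
    """
    ratio = static_count / total_count
    if ratio > first_threshold:
        return "slow_change"
    if ratio < second_threshold:
        return "drastic_change"
    return None
```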
Optionally, the apparatus further comprises:
a second determining module for determining a still encoded region in the video frame;
the pixel cost module is used for acquiring the pixel cost parameters of the static coding regions; the pixel cost parameter is used to reflect the similarity between the static coding region and the corresponding coding region in a reference video frame of the video frame; the pixel cost parameter is negatively correlated with the similarity;
the average calculation module is used for acquiring average pixel cost parameters of all static coding areas in the video frame according to the pixel cost parameters of the static coding areas;
a third determining module, configured to determine that the video frame is in a motion scene with less image noise when the average pixel cost parameter is smaller than a third threshold;
a fourth determining module, configured to determine that the video frame is in a motion scene with relatively high image noise when the average pixel cost parameter is greater than a fourth threshold, where the third threshold is less than or equal to the fourth threshold.
Optionally, the determining a still coding region in the video frame includes:
acquiring a reference video frame of the video frame from the video to be coded;
determining a second coding region matching the first coding region of the video frame through motion search in the reference video frame;
determining a difference value of the coordinates of the second coding region and the coordinates of the first coding region as a motion vector of the first coding region relative to the second coding region;
and determining a first coding region of which the motion vector is smaller than a fifth threshold value as the static coding region.
Optionally, the apparatus further comprises:
and the first mapping module is used for determining the first threshold and the second threshold according to preset index parameters adopted by the currently used coding rule, and the preset index parameters are in direct proportion to the first threshold and the second threshold respectively.
Optionally, the apparatus further comprises:
and the second mapping module is used for determining the third threshold and the fourth threshold according to preset index parameters adopted by the currently used coding rule, wherein the preset index parameters are in inverse proportional relation with the third threshold and the fourth threshold respectively.
Optionally, the adjusting module 402 includes:
the first basic parameter submodule is used for acquiring basic adjustment parameters;
and the adjusting submodule is used for adjusting the adjusting intensity value by adjusting the basic adjusting parameter according to the motion scene of the video frame, and the basic adjusting parameter and the adjusting intensity value are in an inverse proportional relation.
Optionally, the apparatus further comprises:
a cost parameter obtaining module, configured to obtain intra-frame prediction cost parameters and inter-frame prediction cost parameters of the coding region;
the pre-coding module is used for pre-coding the video frame and acquiring a reference value parameter of a coding area of the video frame according to the intra-frame prediction cost parameter and the inter-frame prediction cost parameter;
the calculation module 403 includes:
the second basic parameter submodule is used for acquiring basic quantization parameters matched with the current coding rule;
the offset submodule is used for acquiring the quantization parameter offset of the coding region according to the intra-frame prediction cost parameter, the inter-frame prediction cost parameter, the adjustment intensity value and the reference value parameter;
and the correction submodule is used for correcting the basic quantization parameter through the quantization parameter offset to obtain the quantization parameter of the coding region.
Optionally, the modification submodule includes:
a difference unit configured to use a difference between the base quantization parameter and an absolute value of the quantization parameter offset as the quantization parameter of the coding region.
Optionally, the apparatus further comprises:
the mapping establishing module is used for acquiring the mapping relation between the index parameters and the adjusted intensity values and the query parameters of the video frames;
the query module is used for determining the adjustment intensity value matched with the query parameter as the adjustment intensity value of the video frame according to the mapping relation;
wherein the index parameter and the query parameter include at least one of: the proportion of the number of static coding regions in the video frame among all coding regions of the video frame; the average pixel cost parameter of the static coding regions in the video frame; the target code rate set for the coded video; and the preset index parameter adopted by the currently used coding rule.
Optionally, the apparatus further comprises:
and the down-sampling module is used for performing down-sampling processing on the video frame of the video to be coded to reduce the resolution of the video frame.
In summary, in the embodiment of the present application, the intensity value of the video frame can be dynamically adjusted according to the motion scene by analyzing the motion scene where the video frame of the video to be encoded is located, and then the size of the quantization parameter of the encoding region is corrected, and during subsequent encoding, the code rate and the size of the encoding region can be adjusted according to the quantization parameter of the encoding region, so that the code rate and the size are optimized on the basis of conforming to the viewing requirement of the corresponding motion scene, the viewing quality of the picture of the encoded video is met, the size of the encoded video is reduced, and the encoding quality is improved.
Referring to fig. 14, a block diagram of a video decoding apparatus according to an embodiment of the present application is shown, including:
a second obtaining module 501, configured to obtain a coded video, where a video frame of the coded video is divided into multiple coding regions;
a decoding module 502, configured to decode the encoded video to obtain a quantization parameter of the encoded region, where the quantization parameter is: adjusting the intensity value determined by the motion scene where the video frame of the coded video is located before coding, and calculating the parameters obtained by the reference value parameters obtained by pre-coding the coding region;
and a reconstructing module 503, configured to construct a video to be played according to the quantization parameter of the coding region.
In summary, in the embodiment of the present application, the intensity value of the video frame can be dynamically adjusted according to the motion scene by analyzing the motion scene where the video frame of the video to be encoded is located, and then the size of the quantization parameter of the encoding region is corrected, and during subsequent encoding, the code rate and the size of the encoding region can be adjusted according to the quantization parameter of the encoding region, so that the code rate and the size are optimized on the basis of conforming to the viewing requirement of the corresponding motion scene, the viewing quality of the picture of the encoded video is met, the size of the encoded video is reduced, and the encoding quality is improved.
The present application further provides a non-transitory, readable storage medium, where one or more modules (programs) are stored, and when the one or more modules are applied to a device, the device may execute instructions (instructions) of method steps in this application.
Embodiments of the present application provide one or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an electronic device to perform the methods as described in one or more of the above embodiments. In the embodiment of the present application, the electronic device includes various types of devices such as a terminal device and a server (cluster).
Embodiments of the present disclosure may be implemented as an apparatus, which may include electronic devices such as a terminal device, a server (cluster), etc., using any suitable hardware, firmware, software, or any combination thereof, to perform a desired configuration. Fig. 15 schematically illustrates an example apparatus 1000 that may be used to implement various embodiments described in embodiments of the present application.
For one embodiment, fig. 15 illustrates an example apparatus 1000 having one or more processors 1002, a control module (chipset) 1004 coupled to at least one of the processor(s) 1002, memory 1006 coupled to the control module 1004, non-volatile memory (NVM)/storage 1008 coupled to the control module 1004, one or more input/output devices 1010 coupled to the control module 1004, and a network interface 1012 coupled to the control module 1004.
The processor 1002 may include one or more single-core or multi-core processors, and the processor 1002 may include any combination of general-purpose or special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.). In some embodiments, the apparatus 1000 can be used as a terminal device, a server (cluster), and other devices described in this embodiment.
In some embodiments, the apparatus 1000 may include one or more computer-readable media (e.g., the memory 1006 or the NVM/storage 1008) having instructions 1014 and one or more processors 1002 that, in conjunction with the one or more computer-readable media, are configured to execute the instructions 1014 to implement modules to perform the actions described in this disclosure.
For one embodiment, control module 1004 may include any suitable interface controllers to provide any suitable interface to at least one of the processor(s) 1002 and/or any suitable device or component in communication with control module 1004.
The control module 1004 may include a memory controller module to provide an interface to the memory 1006. The memory controller module may be a hardware module, a software module, and/or a firmware module.
Memory 1006 may be used, for example, to load and store data and/or instructions 1014 for device 1000. For one embodiment, memory 1006 may comprise any suitable volatile memory, such as suitable DRAM. In some embodiments, the memory 1006 may comprise a double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, the control module 1004 may include one or more input/output controllers to provide an interface to the NVM/storage 1008 and input/output device(s) 1010.
For example, NVM/storage 1008 may be used to store data and/or instructions 1014. NVM/storage 1008 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more hard disk drives (HDDs), one or more compact disc (CD) drives, and/or one or more digital versatile disc (DVD) drives).
The NVM/storage 1008 may include storage resources that are physically part of the device on which the apparatus 1000 is installed, or it may be accessible by the device and need not be part of the device. For example, NVM/storage 1008 may be accessed over a network via input/output device(s) 1010.
Input/output device(s) 1010 may provide an interface for the apparatus 1000 to communicate with any other suitable device; the input/output devices 1010 may include communication components, audio components, sensor components, and so forth. The network interface 1012 may provide an interface for the device 1000 to communicate over one or more networks; the device 1000 may communicate wirelessly with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols, for example by accessing a wireless network based on a communication standard such as WiFi, 2G, 3G, 4G, or 5G, or a combination thereof.
For one embodiment, at least one of the processor(s) 1002 may be packaged together with logic for one or more controller(s) (e.g., memory controller module) of control module 1004. For one embodiment, at least one of the processor(s) 1002 may be packaged together with logic for one or more controller(s) of control module 1004 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 1002 may be integrated on the same die with the logic of one or more controllers of the control module 1004. For one embodiment, at least one of the processor(s) 1002 may be integrated on the same die with logic for one or more controller(s) of control module 1004 to form a system on chip (SoC).
In various embodiments, the apparatus 1000 may be, but is not limited to: a server, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet, a netbook, etc.), among other terminal devices. In various embodiments, the apparatus 1000 may have more or fewer components and/or different architectures. For example, in some embodiments, device 1000 includes one or more cameras, a keyboard, a Liquid Crystal Display (LCD) screen (including a touch screen display), a non-volatile memory port, multiple antennas, a graphics chip, an Application Specific Integrated Circuit (ASIC), and speakers.
The detection device may use a main control chip as its processor or control module; sensor data, position information, and the like may be stored in the memory or the NVM/storage; the sensor group may serve as the input/output device; and the communication interface may include the network interface.
As for the apparatus embodiments, since they are substantially similar to the method embodiments, the description is relatively brief; for relevant details, refer to the corresponding parts of the description of the method embodiments.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar among the embodiments, reference may be made to one another.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or terminal that comprises the element.
The video encoding method, video decoding method, apparatus, electronic device, and storage medium provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principles and implementations of the present application, and the descriptions of the above embodiments are only intended to help understand the method and core ideas of the present application. Meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the scope of application according to the ideas of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (15)

1. A video encoding method, comprising:
acquiring a video to be coded, wherein a video frame of the video to be coded is divided into a plurality of coding regions;
adjusting an adjustment intensity value of the video frame according to the motion scene of the video frame, wherein the adjustment intensity value is used for controlling the size of a quantization parameter of a coding region in the video frame;
acquiring a quantization parameter of the coding region according to the adjusted intensity value of the video frame and a reference value parameter of the coding region obtained by pre-coding the video frame;
and coding the video to be coded according to the quantization parameter of the coding region to obtain a coded video.
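The steps of claim 1 can be sketched in code. The following is an illustrative assumption only, not the patent's actual formula: the names `base_qp`, `strength`, and `ref_value`, and the logarithmic offset, are hypothetical choices showing one way a per-region quantization parameter (QP) could be derived from a frame-level adjustment intensity value and a reference value parameter produced by a pre-coding pass.

```python
# Hypothetical sketch of claim 1's QP derivation. The formula is an
# illustrative assumption; the patent does not specify it.
import math

def region_qp(base_qp: float, strength: float, ref_value: float) -> float:
    """Lower a region's QP in proportion to its pre-coding reference
    value, scaled by the frame-level adjustment intensity value.
    A larger strength yields a smaller QP (cf. claim 4)."""
    offset = strength * math.log2(1.0 + ref_value)
    return max(0.0, base_qp - offset)

# Regions with a higher reference value receive a lower QP
# (finer quantization), modulated by the frame's intensity value.
frame_strength = 2.0
qps = [region_qp(32.0, frame_strength, rv) for rv in (0.0, 1.0, 3.0)]
```

Under this sketch, raising the frame's adjustment intensity value lowers every region's QP, matching the inverse relation stated in claim 4.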
2. The method according to claim 1, wherein the adjusting an adjustment intensity value of the video frame according to the motion scene of the video frame comprises:
reducing the adjustment intensity value in a case where it is determined that the video frame is in a motion scene in which the picture changes rapidly; and
increasing the adjustment intensity value in a case where it is determined that the video frame is in a motion scene in which the picture changes slowly.
3. The method according to claim 1, wherein the adjusting an adjustment intensity value of the video frame according to the motion scene of the video frame comprises:
reducing the adjustment intensity value in a case where it is determined that the video frame is in a motion scene in which the picture changes rapidly or in a motion scene with high image noise; and
increasing the adjustment intensity value in a case where it is determined that the video frame is in a motion scene in which the picture changes slowly and in a motion scene with low image noise.
4. The method according to any one of claims 1-3, wherein the adjustment intensity value is inversely proportional to the quantization parameter.
5. The method of claim 2 or 3, wherein the increasing the adjustment intensity value comprises:
adding a preset adjustment threshold to the current adjustment intensity value to obtain an increased adjustment intensity value; and
the reducing the adjustment intensity value comprises:
subtracting the preset adjustment threshold from the current adjustment intensity value to obtain a reduced adjustment intensity value.
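Claims 2 and 5 together describe stepping the frame-level intensity up or down by a preset adjustment threshold depending on the detected motion scene. A minimal sketch, in which the scene labels and the default step value are illustrative assumptions:

```python
# Hypothetical sketch of claims 2 and 5: the adjustment intensity value
# is stepped by a preset adjustment threshold per detected scene.
def update_strength(strength: float, scene: str, step: float = 0.5) -> float:
    if scene == "slow":      # slowly changing picture: raise the intensity
        return strength + step
    if scene == "fast":      # rapidly changing picture: lower the intensity
        return strength - step
    return strength          # otherwise leave the intensity unchanged
```

Using a single fixed step keeps the intensity adjustment smooth across consecutive frames rather than letting it jump arbitrarily.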
6. A method according to claim 2 or 3, characterized in that the method further comprises:
determining a still coding region in the video frame;
acquiring the proportion, by number, of still coding regions among all coding regions of the video frame;
determining that the video frame is in a motion scene in which the picture changes slowly in a case where the proportion is greater than a first threshold; and
determining that the video frame is in a motion scene in which the picture changes rapidly in a case where the proportion is less than a second threshold, wherein the first threshold is greater than or equal to the second threshold.
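The scene classification of claim 6 can be sketched as follows. The specific threshold values are illustrative assumptions; the patent only requires that the first threshold be greater than or equal to the second:

```python
# Hypothetical sketch of claim 6: classify the motion scene from the
# fraction of coding regions that are still. Threshold values (0.6, 0.3)
# are illustrative assumptions.
def classify_motion(num_still: int, num_total: int,
                    first_thr: float = 0.6, second_thr: float = 0.3) -> str:
    assert first_thr >= second_thr
    ratio = num_still / num_total
    if ratio > first_thr:
        return "slow"        # picture changes slowly
    if ratio < second_thr:
        return "fast"        # picture changes rapidly
    return "intermediate"    # neither threshold is crossed
```

Keeping the two thresholds apart leaves a dead band between them, so a frame hovering near one boundary is not reclassified on every frame.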
7. The method of claim 3, further comprising:
determining a still coding region in the video frame;
acquiring a pixel cost parameter of the still coding region, wherein the pixel cost parameter reflects a similarity between the still coding region and the coding region, in a reference video frame of the video frame, that corresponds to the still coding region, and the pixel cost parameter is negatively correlated with the similarity;
acquiring an average pixel cost parameter over all still coding regions in the video frame according to the pixel cost parameters of the still coding regions;
determining that the video frame is in a motion scene with low image noise in a case where the average pixel cost parameter is less than a third threshold; and
determining that the video frame is in a motion scene with high image noise in a case where the average pixel cost parameter is greater than a fourth threshold, wherein the third threshold is less than or equal to the fourth threshold.
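Claim 7's noise estimate can be sketched the same way: average the pixel costs of the still regions and compare against two thresholds. The default threshold values here are illustrative assumptions; the patent only requires the third threshold to be less than or equal to the fourth:

```python
# Hypothetical sketch of claim 7: a higher average pixel cost among the
# still regions means lower similarity to the reference frame, which for
# regions that did not move is attributed to image noise.
def classify_noise(costs: list[float],
                   third_thr: float = 4.0, fourth_thr: float = 8.0) -> str:
    assert third_thr <= fourth_thr
    avg = sum(costs) / len(costs)
    if avg < third_thr:
        return "low-noise"
    if avg > fourth_thr:
        return "high-noise"
    return "intermediate"
```

The intuition: a truly still region should match its co-located reference region almost exactly, so any residual cost in such regions is a proxy for sensor or compression noise rather than motion.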
8. The method of claim 6 or 7, wherein the determining the still coding region in the video frame comprises:
acquiring a reference video frame of the video frame from the video to be coded;
determining a second coding region matching the first coding region of the video frame through motion search in the reference video frame;
determining a difference between the coordinates of the second coding region and the coordinates of the first coding region as a motion vector of the first coding region relative to the second coding region; and
determining a first coding region whose motion vector is smaller than a fifth threshold as the still coding region.
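Claim 8's motion search can be sketched with a brute-force block match. The block size, search range, and the sum-of-absolute-differences (SAD) criterion are illustrative assumptions; the patent does not prescribe a particular matching cost:

```python
# Hypothetical sketch of claim 8: brute-force SAD motion search; a
# region whose best motion vector is shorter than a fifth threshold is
# treated as still. Block size and search range are assumptions.
import numpy as np

def motion_vector(ref: np.ndarray, cur: np.ndarray,
                  x: int, y: int, bs: int = 8, rng: int = 4):
    """Return the (dx, dy) displacement of the best-matching bs*bs block
    in the reference frame for the block at (x, y) of the current frame."""
    block = cur[y:y + bs, x:x + bs].astype(np.int32)
    best, best_mv = None, (0, 0)
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            ry, rx = y + dy, x + dx
            if ry < 0 or rx < 0 or ry + bs > ref.shape[0] or rx + bs > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            sad = np.abs(ref[ry:ry + bs, rx:rx + bs].astype(np.int32) - block).sum()
            if best is None or sad < best:
                best, best_mv = sad, (dx, dy)
    return best_mv

def is_still(mv, fifth_thr: float = 1.0) -> bool:
    """A region is still when its motion vector magnitude is below the threshold."""
    return (mv[0] ** 2 + mv[1] ** 2) ** 0.5 < fifth_thr
```

Production encoders replace the exhaustive loop with hierarchical or diamond search, but the still-region decision is the same: compare the winning vector's magnitude against the fifth threshold.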
9. The method of claim 6, further comprising:
and determining the first threshold and the second threshold according to a preset index parameter adopted by the currently used coding rule, wherein the preset index parameter is directly proportional to each of the first threshold and the second threshold.
10. The method of claim 7, further comprising:
and determining the third threshold and the fourth threshold according to a preset index parameter adopted by the currently used coding rule, wherein the preset index parameter is inversely proportional to each of the third threshold and the fourth threshold.
11. A video decoding method, comprising:
acquiring a coded video, wherein a video frame of the coded video is divided into a plurality of coding regions;
decoding the coded video to obtain a quantization parameter of the coding region, wherein the quantization parameter is computed from an adjustment intensity value determined, before coding, according to the motion scene of the video frame of the coded video, and a reference value parameter obtained by pre-coding the coding region; and
and constructing a video to be played according to the quantization parameter of the coding region.
12. A video encoding apparatus, comprising:
a first acquisition module, configured to acquire a video to be coded, wherein a video frame of the video to be coded is divided into a plurality of coding regions;
an adjustment module, configured to adjust an adjustment intensity value of the video frame according to the motion scene of the video frame, wherein the adjustment intensity value is used to control the size of a quantization parameter of a coding region in the video frame;
a calculation module, configured to acquire the quantization parameter of the coding region according to the adjustment intensity value of the video frame and a reference value parameter of the coding region obtained by pre-coding the video frame; and
a coding module, configured to code the video to be coded according to the quantization parameter of the coding region to obtain a coded video.
13. A video decoding apparatus, comprising:
a second acquisition module, configured to acquire a coded video, wherein a video frame of the coded video is divided into a plurality of coding regions;
a decoding module, configured to decode the coded video to obtain a quantization parameter of the coding region, wherein the quantization parameter is computed from an adjustment intensity value determined, before coding, according to the motion scene of the video frame of the coded video, and a reference value parameter obtained by pre-coding the coding region; and
and the reconstruction module is used for constructing a video to be played according to the quantization parameter of the coding region.
14. An electronic device, comprising:
a processor; and
memory having stored thereon executable code which, when executed, causes the processor to perform the method of any one of claims 1 to 11.
15. One or more machine-readable media having stored thereon executable code that, when executed, causes a processor to perform the method of any one of claims 1 to 11.
CN202111509090.3A 2021-12-10 2021-12-10 Video encoding method, video decoding method and device Pending CN114302139A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111509090.3A CN114302139A (en) 2021-12-10 2021-12-10 Video encoding method, video decoding method and device

Publications (1)

Publication Number Publication Date
CN114302139A 2022-04-08

Family

ID=80968482

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111509090.3A Pending CN114302139A (en) 2021-12-10 2021-12-10 Video encoding method, video decoding method and device

Country Status (1)

Country Link
CN (1) CN114302139A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080118163A1 (en) * 2006-11-16 2008-05-22 Ching-Hua Chang Methods and apparatuses for motion detection
CN101360235A (en) * 2007-08-03 2009-02-04 上海中科计算技术研究所 Video image pre-processing method
US20120093219A1 (en) * 2010-10-13 2012-04-19 Gijesh Varghese I-Frame Size Estimation Based on Edge Strength
US20120195370A1 (en) * 2011-01-28 2012-08-02 Rodolfo Vargas Guerrero Encoding of Video Stream Based on Scene Type
CN105100813A (en) * 2014-09-09 2015-11-25 航天恒星科技有限公司 Video image preprocessing method and apparatus
CN109547786A (en) * 2017-09-22 2019-03-29 阿里巴巴集团控股有限公司 Video coding and the decoded method, apparatus of video
US20190253704A1 (en) * 2017-03-21 2019-08-15 Tencent Technology (Shenzhen) Company Limited Video encoding method, video decoding method, computer device and storage medium
WO2020097888A1 (en) * 2018-11-15 2020-05-22 深圳市欢太科技有限公司 Video processing method and apparatus, electronic device, and computer-readable storage medium
CN111726613A (en) * 2020-06-30 2020-09-29 福州大学 Video coding optimization method based on just noticeable difference
CN111757107A (en) * 2020-06-29 2020-10-09 北京百度网讯科技有限公司 Video coding method, device, equipment and medium
CN112073735A (en) * 2020-11-16 2020-12-11 北京世纪好未来教育科技有限公司 Video information processing method and device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XINYE JIANG ET AL: "A Distortion Propagation Oriented CU-tree Algorithm for x265", 《2021 INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP)》, pages 1 - 5 *

Similar Documents

Publication Publication Date Title
CN111919444B (en) Prediction method and device of chrominance block
TW201711473A (en) Processing high dynamic range and wide color gamut video data for video coding
CN111327903B (en) Method and device for predicting chroma block
US11431997B2 (en) Video decoding method and video decoder
US11070846B2 (en) Multi-layered video streaming systems and methods
US11064211B2 (en) Advanced video coding method, system, apparatus, and storage medium
JP2006519565A (en) Video encoding
JP2005515730A (en) System and method for enhancing sharpness using encoded information and local spatial characteristics
CN112868232A (en) Method and apparatus for intra prediction using interpolation filter
US8704932B2 (en) Method and system for noise reduction for 3D video content
US20220046236A1 (en) Image encoding method, decoding method, encoder, and decoder
JP2023552923A (en) Video transcoding methods, devices, electronic equipment and storage media
CN116016932A (en) Apparatus and method for deblocking filter in video coding
JP2023085337A (en) Method and apparatus of cross-component linear modeling for intra prediction, decoder, encoder, and program
CN113170100A (en) Method and apparatus for intra prediction
WO2023051156A1 (en) Video image processing method and apparatus
CN114302139A (en) Video encoding method, video decoding method and device
RU2789030C2 (en) Device and method for deblocking filter in video encoding
CN110944180B (en) Chroma block prediction method and device
WO2020140889A1 (en) Quantization and dequantization method and device
KR20210067907A (en) Video compression method and apparatus to support content background luminance-adaptive opto-electric/electro-optic transfer functions
KR20230157975A (en) Motion flow coding for deep learning-based YUV video compression
Gao et al. Hot Research Topics in Video Coding and Systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination