CN117176955A - Video encoding method, video decoding method, computer device, and storage medium

Info

Publication number: CN117176955A
Application number: CN202310937003.7A
Authority: CN (China)
Prior art keywords: target, current frame, parameter, coding, value
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 黄晓峰 (Huang Xiaofeng)
Assignee: Zhejiang Dahua Technology Co Ltd
Application filed by Zhejiang Dahua Technology Co Ltd
Abstract

The present application discloses a video encoding method, a video decoding method, a computer device, and a storage medium. The video encoding method includes: acquiring a current frame to be encoded from a target video; analyzing the picture scene of the current frame to obtain a scene analysis result; determining target encoding parameters of the current frame using the scene analysis result; and encoding the current frame based on the target encoding parameters to obtain encoded data of the current frame. With this scheme, the video encoding performance can be improved.

Description

Video encoding method, video decoding method, computer device, and storage medium
Technical Field
The present application relates to the field of video encoding and decoding technologies, and in particular, to a video encoding method, a video decoding method, a computer device, and a computer readable storage medium.
Background
Because the volume of video image data is relatively large, video images usually need to be encoded and compressed before being transmitted over a wired or wireless network to the user side for decoding and viewing.
Currently, when video images are encoded and compressed, fixed encoding parameters are generally adopted, or the encoding parameters are adjusted based on a rate-distortion optimization model. However, the human visual system, as the final receiver of the video, exhibits perceptual redundancy due to its perceptual characteristics, so these encoding approaches achieve relatively low encoding performance on video.
Disclosure of Invention
The present application mainly solves the technical problem of providing a video encoding method, a video decoding method, a computer device, and a storage medium that can improve the encoding performance of video.
To solve the above problem, a first aspect of the present application provides a video encoding method, including: acquiring a current frame to be encoded from a target video; analyzing the picture scene of the current frame to obtain a scene analysis result; determining target encoding parameters of the current frame using the scene analysis result; and encoding the current frame based on the target encoding parameters to obtain encoded data of the current frame.
To solve the above problem, a second aspect of the present application provides a video decoding method, including: receiving the encoded data of a current frame sent by an encoding end, the encoded data being obtained through the steps of the video encoding method executed by the encoding end; parsing the encoded data to obtain target decoding parameters; and decoding the encoded data based on the target decoding parameters to obtain a decoded video image.
To solve the above problem, a third aspect of the present application provides a computer device including a memory and a processor coupled to each other, the memory storing program data, the processor being configured to execute the program data to implement the steps of any of the methods described above.
To solve the above problem, a fourth aspect of the present application provides a computer-readable storage medium storing program data executable by a processor to implement the steps of any of the methods described above.
According to the above scheme, the current frame to be encoded is acquired from the target video, and its picture scene is analyzed to obtain a scene analysis result. The target encoding parameters of the current frame can then be determined using the scene analysis result, so that they are determined dynamically for different scene analysis results and better fit the picture scene of the current frame. Encoding the current frame based on these target encoding parameters yields encoded data whose image quality better matches the human visual system, reducing perceptual redundancy and thus improving the encoding performance of the video.
Drawings
To illustrate the technical solutions of the present application more clearly, the drawings required by the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and other drawings can be obtained from them by those skilled in the art without inventive effort. Wherein:
FIG. 1 is a flowchart of a video encoding method according to a first embodiment of the present application;
FIG. 2 is a flow chart of a second embodiment of the video encoding method of the present application;
FIG. 3 is a flowchart illustrating the step S23 of FIG. 2 according to an embodiment of the present application;
FIG. 4 is a flowchart of a third embodiment of the video encoding method of the present application;
FIG. 5 is a flowchart of a fourth embodiment of the video encoding method of the present application;
FIG. 6 is a flowchart of a fifth embodiment of the video encoding method of the present application;
FIG. 7 is a flow chart of an embodiment of a video decoding method according to the present application;
FIG. 8 is a schematic diagram illustrating an embodiment of an encoding end of the present application;
FIG. 9 is a schematic diagram of an embodiment of a decoding end of the present application;
FIG. 10 is a schematic diagram of an embodiment of a computer device of the present application;
FIG. 11 is a schematic diagram of a computer-readable storage medium according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and fully below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort fall within the scope of protection of the present application.
The terms "first" and "second" in the present application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The term "and/or" is herein merely an association relationship describing an associated object, meaning that there may be three relationships, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship. Further, "a plurality" herein means two or more than two. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, and may mean including any one or more elements selected from the group consisting of A, B and C.
The present application provides the following examples, and each example is specifically described below.
Referring to fig. 1, fig. 1 is a flowchart illustrating a video encoding method according to a first embodiment of the present application. The method may comprise the steps of:
s11: and acquiring the current frame to be encoded from the target video.
This embodiment can be applied to an encoding end, and the video encoding method of this embodiment can be implemented by the encoding end performing steps S11 to S14.
The target video may be a video to be encoded, and each frame image of the target video may be sequentially used as a current frame to be encoded, or frame images of a preset interval of the target video may be sequentially used as a current frame. The selection may be based on the particular application scenario, as the application is not limited in this regard.
S12: and analyzing the picture scene of the current frame to obtain a scene analysis result.
The scene of the current frame is analyzed to obtain a scene analysis result of the current frame, wherein the scene analysis comprises at least one of target analysis, image information analysis, picture information analysis, motion analysis and the like.
For example, the target analysis may perform target detection on the picture scene of the current frame to detect the presence or absence of a target, the target type, and so on. The scene analysis may be performed based on AI (Artificial Intelligence): detection may be performed using a scene analysis model whose detection types and detection parameters can be set according to user-defined selections. For example, one or more detection types such as human face, human body, or car may be selected, and detection parameters such as the detection area, target size, and quality parameters may be configured. In this way, information about the targets in the picture scene that the user cares about is acquired, which may include the target type, target coordinates, width, and height.
If the technical solution of the present application involves personal information, a product applying the technical solution of the present application clearly informs users of the personal information processing rules and obtains their autonomous consent before processing the personal information. If the technical solution involves sensitive personal information, a product applying it obtains individual consent before processing the sensitive personal information and additionally meets the requirement of "explicit consent". For example, a clear and prominent sign is set at a personal information collection device such as a camera to inform users that they are entering the personal information collection range and that personal information will be collected; if an individual voluntarily enters the collection range, this is regarded as consent to collection. Alternatively, on a device that processes personal information, with obvious signs or notices announcing the personal information processing rules, personal authorization is obtained via pop-up messages, by asking the individual to upload their personal information, and so on. The personal information processing rules may include information such as the personal information processor, the purpose of processing, the processing method, and the types of personal information processed.
In some embodiments, the image information includes information such as image texture, image brightness, and the like. Motion analysis may include detecting a relatively still scene, a moving picture, a partial motion region, etc., as the application is not limited in this regard.
In some embodiments, the current frame is converted into YUV format, so that an appropriate resolution and frame rate can be set according to the performance requirements of the scene analysis model, ensuring both the performance and the detection accuracy of the algorithm.
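As an illustration, the preparation step might look like the following minimal sketch, which assumes OpenCV is available; the function name and the 640x360 analysis resolution are illustrative assumptions, not values fixed by the present application.

    import cv2

    def prepare_for_analysis(frame_bgr, analysis_size=(640, 360)):
        # Convert a captured BGR frame to YUV and downscale it so the
        # scene analysis model runs at a controlled resolution and cost.
        yuv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YUV)
        return cv2.resize(yuv, analysis_size, interpolation=cv2.INTER_AREA)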
In some embodiments, it is determined whether a preset target is detected in the picture scene of the current frame, where the preset target may be a person, a car, an animal, an object, and so on, which the present application does not limit. In response to the preset target being detected in the picture scene of the current frame, it is determined, as the picture analysis result of the current frame, that the preset target exists; otherwise, the preset target does not exist.
In some embodiments, in response to no preset target being detected in the picture scene of the current frame, it is further determined whether the picture scene belongs to a moving picture. The frame difference method may be used to compute the sum of pixel changes between the current frame and the previous frame; when this sum is greater than a preset change threshold, the picture scene of the current frame is determined to belong to a moving picture. Otherwise, it is determined to belong to a relatively still picture.
For example, if no target is detected in the picture scene of the current frame, the frame difference method detects whether the scene is relatively still, for example a cell or courtyard with no wind, no moving objects, and brightness changes below a brightness threshold. When the picture scene is relatively still, the information of each frame can be regarded as almost identical. Therefore, when the frame difference method detects that the scene has remained still for a continuous period (for example, 1 minute), the length of the GOP in the current still state can be set to a fixed value (for example, 750), the scene change condition can be set, and the following frames can be set to a parallel encoding mode (for example, a pskip mode) and encoded in parallel, improving encoding and decoding performance. A sketch of this check follows.
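The following minimal Python sketch illustrates the frame difference check and the still-scene lock described above; the change threshold is an assumption for illustration, and the only values taken from the text are the examples it gives (1 minute, 750).

    import numpy as np

    STILL_THRESHOLD = 12.0   # assumed per-pixel mean-change threshold
    STILL_SECONDS = 60       # the continuous period from the text (1 minute)
    FIXED_GOP_LENGTH = 750   # the fixed GOP length value from the text

    def frame_change(cur_y, prev_y):
        # Mean absolute luma change, normalised by pixel count so the
        # threshold does not depend on resolution.
        diff = np.abs(cur_y.astype(np.int16) - prev_y.astype(np.int16))
        return float(np.mean(diff))

    class StillSceneDetector:
        def __init__(self, fps):
            self.required_still_frames = int(STILL_SECONDS * fps)
            self.still_run = 0

        def update(self, cur_y, prev_y):
            # Returns (is_moving, lock_fixed_gop). Once the scene has
            # stayed still for the whole period, the GOP length can be
            # fixed (e.g. to FIXED_GOP_LENGTH) and the following frames
            # switched to a parallel mode such as pskip.
            moving = frame_change(cur_y, prev_y) > STILL_THRESHOLD
            self.still_run = 0 if moving else self.still_run + 1
            return moving, self.still_run >= self.required_still_frames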
S13: and determining target coding parameters of the current frame by using the scene analysis result.
After the scene analysis result of the current frame is obtained, the target coding parameters of the current frame can be dynamically determined according to different scene analysis results.
In some embodiments, when the scene analysis result indicates that the picture scene of the current frame belongs to a moving picture and a preset target exists, reference encoding parameters can be obtained for the preset target and the moving picture, and the initial encoding parameters can be adjusted to obtain the target encoding parameters of the current frame.
In some embodiments, when the picture scene of the current frame belongs to a moving picture but no preset target exists, reference encoding parameters can be obtained for the moving picture, and the initial encoding parameters can be adjusted to obtain the target encoding parameters of the current frame.
In some embodiments, when the scene analysis result indicates that a preset target exists in the picture scene of the current frame, reference encoding parameters can be obtained for the preset target, and the initial encoding parameters can be adjusted to obtain the target encoding parameters of the current frame, regardless of whether the current picture scene is a moving or still picture.
In some embodiments, when the scene analysis result indicates that the picture scene of the current frame belongs to a relatively still picture and no preset target exists, the target encoding parameters of the current frame can be obtained for the still picture.
In some implementations, the target encoding parameter may be a QP (Quantization Parameter), which controls the compression of spatial detail in the image. The QP value determines the picture quality to some extent: the smaller the QP value, the more detail is preserved; the larger the QP value, the more detail is lost and the lower the sharpness of the video picture.
S14: and encoding the current frame based on the target encoding parameters to obtain encoded data of the current frame.
After the target coding parameters are acquired, the current frame may be coded to compress video data, so as to obtain coded data of the current frame, which may also be referred to as a video code stream. The coding mode can adopt H.264, MPEG or AVS, and the application does not limit the coding mode. The encoded data may be transmitted to a decoding side so that the decoding side may decode the video data to play or view the video data.
In this embodiment, the current frame to be encoded is acquired from the target video, and its picture scene is analyzed to obtain a scene analysis result. The target encoding parameters of the current frame can then be determined using the scene analysis result, dynamically for different scene analysis results, so that they better fit the picture scene of the current frame. Encoding the current frame based on these target encoding parameters yields the encoded data of the current frame and improves the encoding performance of the video.
In some embodiments, for step S13, when the scene analysis result is that the picture scene of the current frame belongs to a moving picture or that a preset target exists, the target encoding parameters may be determined as in the following embodiments.
Referring to fig. 2, fig. 2 is a flowchart illustrating a video encoding method according to a second embodiment of the present application. The method may comprise the steps of:
s21: and responding to the scene analysis result that the picture scene of the current frame belongs to a moving picture or a preset target exists, and acquiring target information entropy of at least one target area in the current frame. And acquiring the reference information entropy of the reference area in the current frame.
In some embodiments, in a case where the picture scene of the current frame belongs to a moving picture, the target area is a preset area of the current frame. If the moving picture has a preset target, the preset area is at least one block area of an area containing the preset target, and if the moving picture does not have the preset target, the preset area is the whole picture area or at least one block area of the current frame, which is not limited by the present application.
In the case that the picture scene of the current frame contains a preset target, the target area is the area containing the preset target, or at least one block area of the area containing the preset target.
In some embodiments, the area of the reference region is greater than or equal to that of the target region, and the reference region contains the target region; the reference region is thus defined relative to the target region. For example, when the target area is an area containing a preset target, the reference area may be the entire area of the current frame. When the target area is the entire area of the current frame, the reference area may likewise be the entire area of the current frame. When the target area is the area of the current frame containing a single preset target, the reference area may be the area containing all preset targets or the entire area of the current frame. When the target area is the area containing all preset targets, the reference area may be the entire area of the current frame. When the target area is at least one block area of the area containing a single preset target, the reference area may be the area containing that single preset target. The target area and the reference area may be set according to the specific application scenario, which the present application does not limit. One possible pairing policy is sketched below.
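To make the pairing concrete, the sketch below encodes one possible selection policy; it is only one assumption chosen from the combinations listed above, and the helper macroblocks_of is a hypothetical function supplied by the caller.

    from dataclasses import dataclass
    from typing import Callable, List, Tuple

    Rect = Tuple[int, int, int, int]  # x, y, width, height

    @dataclass
    class RegionPair:
        target: Rect     # region whose entropy drives the QP adjustment
        reference: Rect  # enclosing region it is compared against

    def region_pairs(frame_rect: Rect, targets: List[Rect],
                     macroblocks_of: Callable[[Rect], List[Rect]]) -> List[RegionPair]:
        # Frame-level pairs: each target region against the whole frame.
        pairs = [RegionPair(t, frame_rect) for t in targets]
        # Macroblock-level pairs: each macroblock of a target against
        # the region of that single target.
        for t in targets:
            pairs.extend(RegionPair(mb, t) for mb in macroblocks_of(t))
        return pairs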
Thus, the target information entropy of at least one target area of the current frame and the reference information entropy of the reference area in the current frame can be acquired.
In some embodiments, the image information of each target area may be acquired, where the image information includes at least one of image brightness and image texture. The target information entropy of each target area is then acquired using the image information of the target area, the type weight of the target area, and the area of the target area.
In some embodiments, the image information of the reference region may be acquired, where the image information includes at least one of image brightness and image texture. The reference information entropy of the reference region is then acquired using the image information of the reference region and the area of the reference region.
As examples, the image information may contain both image brightness and image texture, or only image texture. These two cases are described below as information entropy modes 1 and 2.
Information entropy mode 1: the image information contains image brightness and image texture.
The target information entropy of a single preset target can be expressed as the product of the weighted sum of the image brightness and image texture, the type weight of the target area or preset target, and the area of the target area or preset target. For example:
E=(a*Ls+b*Vs)*T*S (1)
wherein E represents the target information entropy of a single preset target, Ls represents the image brightness sum, a represents the brightness weight, Vs represents the image texture sum, b represents the texture weight, T represents the target type weight, and S represents the area of the region.
Similarly, the frame target information entropy of the current frame is the sum of all target information entropies of the current frame and can be denoted Es.
In addition, the frame information entropy of the current frame can be represented as the weighted sum of the image brightness sum and the image texture sum. For example:
Ec=a*Ls+b*Vs (2)
where Ec represents the frame information entropy of the current frame, i.e., the information entropy of all regions of the current frame picture, Ls represents the sum of the image brightness, a represents the brightness weight, Vs represents the sum of the image texture, and b represents the texture weight.
Information entropy mode 2: the image information contains only image texture.
The target information entropy of a single preset target can be represented as the product of the image texture sum, the texture weight, and the type weight. For example:
E=b*Vs*T (3)
wherein E represents the target information entropy of a single preset target, Vs represents the image texture sum, b represents the texture weight, and T represents the type weight.
Similarly, the frame target information entropy of the current frame is the sum of all target information entropies of the current frame and can be denoted Es.
In addition, the frame information entropy of the current frame can be represented as the product of the image texture sum and the texture weight. For example:
Ec=b*Vs (4)
where Ec represents the frame information entropy of the current frame, i.e. the information entropy of all regions of the current frame picture, vs represents the image texture sum, and b represents the texture weight.
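The two modes can be transcribed directly from formulas (1)-(4). In the sketch below the texture measure is a simple gradient-magnitude sum; this operator is an assumption, since the text does not fix how the image texture sum is computed.

    import numpy as np

    def luma_sum(region_y):
        # Ls: sum of luma over the region.
        return float(np.sum(region_y))

    def texture_sum(region_y):
        # Vs: gradient-magnitude sum as an assumed texture measure.
        gy, gx = np.gradient(region_y.astype(np.float64))
        return float(np.sum(np.abs(gx) + np.abs(gy)))

    def target_entropy_mode1(region_y, a, b, type_weight):
        # Formula (1): E = (a*Ls + b*Vs) * T * S, with S the region area.
        area = region_y.shape[0] * region_y.shape[1]
        return (a * luma_sum(region_y) + b * texture_sum(region_y)) * type_weight * area

    def frame_entropy_mode1(frame_y, a, b):
        # Formula (2): Ec = a*Ls + b*Vs over the whole frame.
        return a * luma_sum(frame_y) + b * texture_sum(frame_y)

    def target_entropy_mode2(region_y, b, type_weight):
        # Formula (3): E = b*Vs*T.
        return b * texture_sum(region_y) * type_weight

    def frame_entropy_mode2(frame_y, b):
        # Formula (4): Ec = b*Vs.
        return b * texture_sum(frame_y)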
Compared with information entropy mode 2, mode 1 can exploit the human eye's sensitivity to brightness to optimize the image effect under intense brightness changes or in low-illumination scenes. The target type weight can be differentiated by target type into human face, human body, vehicle body, license plate, and so on, and fine-tuned according to the quality score of the target; the edge strength and motion strength of target details can be judged through wavelet transformation, with targets having sharp detail edges and strong motion given larger weights. The target type weight, texture weight, brightness weight, and so on are selected according to the specific application scene, which the present application does not limit.
In the above manner, the target information entropy of at least one target area of the current frame and the reference information entropy of the reference area can be acquired. For example, the target information entropy E of a single preset target or the frame target information entropy Es of the current frame may be used as the target information entropy of the target area, and the frame information entropy Ec of the current frame may be used as the reference information entropy. Alternatively, the target information entropy E of a single preset target may be used as the target information entropy of the target area and the frame target information entropy Es as the reference information entropy. In the case that the moving picture has no preset target, the frame information entropy Ec of the current frame may serve as both the target information entropy and the reference information entropy. The present application is not limited in this regard.
S22: and acquiring the information entropy weight factor by utilizing the target information entropy and the reference information entropy.
A first logarithmic processing result of the sum of the target information entropies of all target areas and a second logarithmic processing result of the reference information entropy can be obtained, and the product of the first logarithmic processing result and a preset weight is divided by the second logarithmic processing result to obtain the information entropy weight factor. As an example, the process can be expressed by the following formula:
F=w1*ln(Es)/ln(Ec) (5)
wherein F represents the information entropy weight factor, ln(Es) represents the first logarithmic processing result of the sum Es of the target information entropies of the target areas, w1 represents the preset weight, and ln(Ec) represents the second logarithmic processing result of the reference information entropy Ec.
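A direct transcription of formula (5), assuming Es and Ec are large enough that both logarithms are positive:

    import math

    def entropy_weight_factor(Es, Ec, w1):
        # Formula (5): F = w1 * ln(Es) / ln(Ec), where Es is the summed
        # target-area information entropy and Ec the reference entropy.
        return w1 * math.log(Es) / math.log(Ec)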
The target area and the reference area can be adjusted according to the specific scene. A frame-level QP, macroblock-level QP, and so on can be set according to the target type weights and the number and area of the targets, making the QP adjustment more direct and pronounced, which suits scenes with requirements on target image quality. Setting the frame-level QP, macroblock-level QP, and so on corresponds to different selections of the target area and reference area, enabling quality adjustment for different areas of the current frame.
For example, the target area may be the area of a single target, with the reference area being the area of multiple targets (containing the target area, all targets, or targets of the same type) or the entire area of the current frame; or the target area may be the area of multiple targets (or all targets, or targets of the same type), with the reference area the entire area of the current frame. These choices implement different degrees of frame-level QP adjustment. Alternatively, the target area may be the area of a single target with the reference area the area of the same-type targets containing it, or the target region may be a sub-region (e.g., a macroblock) of the region of a single target with that region as the reference. These choices implement different degrees of macroblock-level QP adjustment. The frame-level and macroblock-level QPs referred to in the present application are relative notions describing QP adjustment at different granularities with different area sizes; in general, the reference area is greater than or equal to the target area, and different combinations of target areas and corresponding reference areas may be selected for different application scenarios, which the present application does not limit.
S23: and adjusting the initial coding parameters based on the information entropy weight factors to obtain target coding parameters of the current frame.
After the information entropy weight factor is obtained, the initial coding parameter QP1 can be adjusted by using the information entropy weight factor to obtain the target coding parameter of the current frame.
In some embodiments, referring to fig. 3, step S23 of the above embodiments may be further extended. Based on the information entropy weight factor, the initial encoding parameters are adjusted to obtain target encoding parameters of the current frame, and the embodiment may include the following steps:
s231: and adjusting the initial coding parameters by using the information entropy weight factors to obtain the reference coding parameters.
The initial encoding parameter QP1 may be an encoding parameter set in advance for the current frame, a preset or user-input encoding parameter, or an encoding parameter of the current frame obtained from statistics of the previous frame or historical frames, which the present application does not limit.
Adjusting the initial encoding parameter with the information entropy weight factor F to obtain the reference encoding parameter QP2 can be expressed as follows:
QP2=QP1-F (6)
In some embodiments, after the initial encoding parameter is adjusted by the information entropy weight factor to obtain the reference encoding parameter, limit processing may be applied to the reference encoding parameter, or later to the determined target encoding parameter.
In some embodiments, the initial parameter limit range of the initial encoding parameter includes at least one first limit term. The at least one first limit term includes at least one of an initial encoding parameter upper limit QP1_max, an initial encoding parameter lower limit QP1_min, and an initial encoding step size limit QP1_delta.
In some embodiments, the initial parameter limit range includes at least one first limit term that may be preset or user entered, as the application is not limited in this regard.
In some embodiments, the initial parameter limit range including the at least one first limit term may be determined based on the encoding parameters of the current frame. For example, the initial encoding parameter upper limit QP1_max is the maximum of the initial encoding parameters, and the initial encoding parameter lower limit QP1_min is their minimum. The initial encoding step size limit QP1_delta is the difference between the maximum and minimum of the initial encoding parameters, or a preset statistical value (average, maximum, minimum, sum, and so on), such as the difference between the average initial encoding parameter of the current frame and that of the previous frame. The present application is not limited in this regard.
In some embodiments, for each first limit term, the first logarithmic processing result of the sum of the target information entropies of the target areas and the second logarithmic processing result of the reference information entropy are obtained, and the product of the first logarithmic processing result and the weight corresponding to that first limit term is divided by the second logarithmic processing result to obtain the entropy weight factor corresponding to that first limit term.
For each first limit term, the corresponding entropy weight factor may be used to adjust that first limit term to obtain a corresponding second limit term, and the second limit terms together constitute the target parameter limit range. The at least one second limit term includes at least one of a target encoding parameter upper limit QP2_max, a target encoding parameter lower limit QP2_min, and a target encoding step size limit QP2_delta. The maximum value QP2_max and minimum value QP2_min are limit values for the quantization parameter QP2 of the current frame, and QP2_delta is the step size limit for adjusting the quantization parameter QP2.
In some embodiments, the target encoding parameter upper limit is the difference between the initial encoding parameter upper limit and its corresponding entropy weight factor, the target encoding parameter lower limit is the sum of the initial encoding parameter lower limit and its corresponding entropy weight factor, and the target encoding step size limit is the sum of the initial encoding step size limit and its corresponding entropy weight factor.
As an example, adjusting each first limit term with its corresponding entropy weight factor to obtain the corresponding second limit term can be expressed by the following formulas:
QP2_max=QP1_max-w2*ln(Es)/ln(Ec) (7)
QP2_min=QP1_min+w3*ln(Es)/ln(Ec) (8)
QP2_delta=QP1_delta+w4*ln(DEs)/ln(DEc) (9)
In formula (7), w2*ln(Es)/ln(Ec) is the entropy weight factor corresponding to the initial encoding parameter upper limit QP1_max, where ln(Es) represents the first logarithmic processing result of the sum of the target information entropies of the target areas, ln(Ec) represents the second logarithmic processing result of the reference information entropy, and w2 represents the weight corresponding to QP1_max.
In formula (8), w3*ln(Es)/ln(Ec) is the entropy weight factor corresponding to the initial encoding parameter lower limit QP1_min, with w3 the weight corresponding to QP1_min.
In formula (9), w4*ln(DEs)/ln(DEc) is the entropy weight factor corresponding to the initial encoding step size limit QP1_delta, where ln(DEs) is the logarithm of the target information entropy difference DEs, ln(DEc) is the logarithm of the reference information entropy difference DEc, and w4 is the weight corresponding to QP1_delta.
The target information entropy difference may be a difference between a sum of target information entropies of each target area of the current frame and a sum of target information entropies of each target area of the history frame (e.g., a previous frame). The reference information entropy difference may be a difference between the reference information entropy of the current frame and the reference information entropy of the history frame (e.g., the previous frame).
The reference encoding parameter is then limit-processed based on the target parameter limit range to obtain the limited reference encoding parameter. Alternatively, the initial encoding parameter may be limit-processed based on the target parameter limit range to obtain the limited initial encoding parameter, or the target encoding parameter may be limit-processed to obtain the limited target encoding parameter.
In this embodiment, limit processing of the reference encoding parameter based on the target parameter limit range is taken as an example. A reference encoding parameter QP2 greater than the target encoding parameter upper limit QP2_max is limited to QP2_max, and a reference encoding parameter QP2 smaller than the target encoding parameter lower limit QP2_min is limited to QP2_min. An adjustment step larger than the target encoding step size limit QP2_delta is limited to QP2_delta; that is, the step size between the initial encoding parameter and the reference encoding parameter is limited to QP2_delta. Alternatively, the step size between reference encoding parameters is limited to QP2_delta, so that the maximum and minimum values of the reference encoding parameters do not differ too much.
By setting the maximum value QP2_max and minimum value QP2_min of QP2 and limit-processing QP2, the maximum and minimum values of the reference encoding parameter QP2 are reduced in picture scenes with large type weights and many targets, and increased otherwise. In addition, under the original rate-distortion-optimized rate control strategy, values larger than the maximum QP2_max can be limited to prevent the image quality from becoming too poor, and reducing QP2_min can improve the target image quality of the target video when the code rate is insufficient.
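Putting formulas (6)-(9) and the limit step together, the adjustment might be sketched as follows; the subtraction in formula (6) follows the text's statement that larger target entropy should drive QP down and should be read as an assumption.

    import math

    def limit_factor(num, den, w):
        # Shared form of the entropy weight factors in formulas (7)-(9).
        return w * math.log(num) / math.log(den)

    def reference_qp_with_limits(qp1, F, qp1_max, qp1_min, qp1_delta,
                                 Es, Ec, DEs, DEc, w2, w3, w4):
        qp2 = qp1 - F                                       # formula (6)
        qp2_max = qp1_max - limit_factor(Es, Ec, w2)        # formula (7)
        qp2_min = qp1_min + limit_factor(Es, Ec, w3)        # formula (8)
        qp2_delta = qp1_delta + limit_factor(DEs, DEc, w4)  # formula (9)

        qp2 = min(max(qp2, qp2_min), qp2_max)               # clamp to [min, max]
        step = qp2 - qp1
        if abs(step) > qp2_delta:                           # limit the step size
            qp2 = qp1 + math.copysign(qp2_delta, step)
        return qp2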
S232: and determining a target coding parameter based on the pre-coding result of the initial coding parameter and the reference coding parameter.
In some embodiments, the target encoding parameter may be determined based on the precoding results of the initial encoding parameter and of the limited reference encoding parameter. Alternatively, limit processing may be applied to the target encoding parameter after it is acquired. The present application is not limited in this regard.
In some embodiments, a first precoding result of precoding the current frame with the initial encoding parameter QP1 may be acquired, together with a second precoding result of precoding (or encoding) the current frame with the reference encoding parameter QP2. The first and second precoding results include at least one of a bit number and a distortion.
The encoding deviation between the first and second precoding results is then acquired. In this process, the bit number deviation between their bit numbers and the distortion deviation between their distortions may each be obtained as an encoding deviation.
A parameter value of a preset encoding parameter or of the reference encoding parameter is selected as the target parameter value based on the numerical range of the encoding deviation, and the initial encoding parameter is adjusted according to the target parameter value to obtain the target encoding parameter. Different encoding parameters can be selected as target parameter values for different encoding deviations, so that different target encoding parameters can be determined, improving adaptability.
Specifically, it may be determined whether the encoding bias is greater than a first preset bias threshold, where if at least one of the bit number bias and the distortion bias is greater than the first preset bias threshold, it is determined that the encoding bias is greater than the first preset bias threshold.
In some embodiments, in response to the encoding deviation being greater than the first preset deviation threshold, which indicates a large error, the encoding parameter may need to be reset, or reset to a user-input encoding parameter: the parameter value of a preset encoding parameter is selected as the target parameter value, and the parameter value of the initial encoding parameter is adjusted to the target parameter value to obtain the target encoding parameter. The preset encoding parameter may be the initial encoding parameter or a reset user-input encoding parameter, which the present application does not limit.
In some embodiments, in response to the encoding deviation not being greater than the first preset deviation threshold, it is further determined whether the encoding deviation is greater than a second preset deviation threshold, where the second preset deviation threshold is smaller than the first.
In some embodiments, in response to the encoding deviation not being greater than the first preset deviation threshold but greater than the second preset deviation threshold, the parameter value of the reference encoding parameter is selected as the target parameter value, and the parameter value of the initial encoding parameter is adjusted to the target parameter value to obtain the target encoding parameter. This adjusts the initial encoding parameter QP1 toward the reference encoding parameter QP2, which better fits the picture scene of the current frame.
In some embodiments, in response to the encoding deviation not being greater than the first preset deviation threshold and the encoding deviation not being greater than the second preset deviation threshold, the parameter value of the reference encoding parameter is selected as the target parameter value, and the parameter value of the initial encoding parameter is adjusted to a preset statistical value between the initial encoding parameter and the target parameter value to obtain the target encoding parameter. Wherein, the preset statistical value between the initial coding parameter and the target parameter value can be expressed as:
Ps=(QP1-QP2)*c+QP1 (10)
Wherein Ps represents a preset statistical value, QP1 is an initial encoding parameter, QP2 is a reference encoding parameter or a target parameter value, c represents a product coefficient, and may be an empirical value or a preset value, which is not limited in the present application.
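The three-way decision of S232, including the blend of formula (10), can be sketched as follows; thr1 and thr2 stand for the first and second preset deviation thresholds (thr1 > thr2), and qp_reset for an optional user-input encoding parameter.

    def choose_target_qp(qp1, qp2, bits1, bits2, dist1, dist2,
                         thr1, thr2, c, qp_reset=None):
        bit_dev = abs(bits1 - bits2)
        dist_dev = abs(dist1 - dist2)
        if bit_dev > thr1 or dist_dev > thr1:
            # Deviation too large: fall back to a preset / user-input QP.
            return qp_reset if qp_reset is not None else qp1
        if bit_dev > thr2 or dist_dev > thr2:
            # Moderate deviation: adopt the reference QP directly.
            return qp2
        # Small deviation: blend via formula (10): Ps = (QP1 - QP2)*c + QP1.
        return (qp1 - qp2) * c + qp1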
In this embodiment, the information entropy difference between the information entropy of the current frame and that of the previous frame may be obtained; a large information entropy difference indicates a large QP change and a large scene change. When the scene becomes more complex, the information entropy difference is positive, and negative otherwise. Reducing the QP of macroblocks containing more targets with large target type weights improves the encoded image quality and better matches the user's subjective perception.
In addition, the QP parameters focus on the specific targets themselves, which may involve the target type weights and the targets' area, texture, and brightness. This ensures a reasonable allocation of QP over the picture content of the current frame: key targets obtain a higher code rate and a smaller QP, so that their compression loss is reduced and the picture quality of the target foreground is improved.
In some embodiments, a scene change may further be determined from the difference between the sum of the QP offsets (QP-Offset) of the current frame and that of the previous frame. If this difference is greater than a set difference, it is determined that a scene change has occurred, the current frame may be set as an I frame (i.e., a key frame), and the QP adjustment described above, i.e., adjusting the initial encoding parameters, is performed.
In some embodiments, if the difference between the target information entropy of the current frame and that of the previous frame is greater than a preset entropy value and the scene change condition is determined to be satisfied, the current frame may likewise be set as an I frame (i.e., a key frame) and the QP adjustment, i.e., adjusting the initial encoding parameters, is performed.
In some implementations, in addition to the target encoding parameters of the current frame, a target length of the GOP (Group of Pictures) may be obtained. A GOP is a group of consecutive pictures: one GOP consists of one or more I frames (key frames) and a number of P frames (forward predictive coded frames) and B frames (bidirectional predictive coded frames). An I frame in a GOP may also be called an intra-coded frame; it is an independent frame carrying all of its own information and can be decoded without reference to other frames, and the first frame in a video sequence is always an I frame. P and B frames need to be decoded with reference to other frames (e.g., I frames). Where there are multiple I frames, the first I frame of the GOP is called an IDR (Instantaneous Decoding Refresh) frame.
In some embodiments, for the above step S13, when the scene analysis result is that the picture scene of the current frame belongs to a relatively still picture and the preset target does not exist, the target encoding parameter of the current frame may be determined in the following manner.
In some embodiments, in response to the scene analysis result that the picture scene of the current frame belongs to a relatively still picture and no preset target exists, the encoding parameters of the key frame are obtained as the target encoding parameters based on the picture information of the key frame in the group of pictures where the current frame is located. In the case that the current frame is a forward predictive coded frame, it is encoded in parallel with at least one of the remaining forward predictive coded frames of the group of pictures.
As an example, in response to the scene analysis result being that the picture scene of the current frame belongs to a relative still picture and there is no preset target, a scene change condition and a target encoding parameter, such as an initial length of a GOP, an initial encoding parameter (QP parameter), an initial step size, and the like, may be set.
When the picture scene is relatively still, the information of each frame can be regarded as almost identical. Therefore, when the frame difference method detects that the scene has remained still for a continuous period (for example, 1 minute), the length of the GOP in the current still state can be set to a fixed value (for example, 750), the scene change condition can be set, and the following frames can be set to a parallel encoding mode (for example, a pskip mode) and encoded in parallel, improving encoding and decoding performance.
The scene change condition may also represent the determination condition for the first I frame (or IDR frame) of the GOP; the information entropy or the frame difference method may be used to compare the current frame with the previous frame or historical frames to determine whether a scene change occurs. An I frame or IDR frame can be forced, for example, when the scene changes from a relatively still picture to one in which a preset target exists, or to a moving picture.
In some embodiments, after the current frame in a still picture has been encoded, or the forward predictive coded frames of the GOP have been encoded in parallel, it may be determined whether a scene change has occurred. In response to detecting that a subsequent frame changes from the still picture to another scene, such as one in which a preset target exists or a moving picture, the GOP length may be changed to the length last used for that other scene as the target length, and the subsequent frame used as an IDR frame, ensuring stable image quality.
In some embodiments, key frames for a group of pictures may also be determined in the following manner.
Referring to fig. 4, fig. 4 is a flowchart illustrating a video encoding method according to a third embodiment of the present application. The method may comprise the steps of:
S31: and acquiring an information entropy change value between the target information entropy of the current frame and the target information entropy of the historical frame.
The historical frame is a frame before the current frame and may be the previous frame or multiple historical frames, which the present application does not limit.
Alternatively, the target information entropy of this embodiment may be the sum of the target information entropies of all preset targets in the current frame and in the historical frame, respectively, from which the information entropy change value is determined. The preset targets may be targets of several types or of a preset type, which the present application does not limit.
In some embodiments, a sum of the target information entropies of the target areas of the current frame may be obtained as the target information entropy of the current frame, and a sum of the target information entropies of the target areas of the history frame may be obtained as the target information entropy of the history frame.
In some embodiments, in the case that the preset target does not exist in the picture scene of the current frame or the history frame, the information entropy of all the picture areas of the current frame may be used as the target information entropy of the current frame, and the information entropy of all the picture areas of the history frame may be used as the target information entropy of the history frame.
The difference between the target information entropy of the current frame and the target information entropy of the history frame can be obtained as an information entropy change value.
S32: judging whether the information entropy change value is larger than a preset change value.
In some embodiments, if the information entropy change value is greater than the preset change value, the following step S33 is performed. Otherwise, the current frame may be determined to be an I frame, or a P or B frame (forward prediction frame), according to the GOP length, and step S31 is performed on the next frame.
S33: the current frame is used as the key frame of the picture group.
The current frame may be used as an I frame or an IDR frame of the picture group, and if it is detected that the current frame has a scene change, the current frame may be used as an IDR frame.
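A compact sketch of steps S31-S33 together with the fallback in S32; the GOP-boundary rule in the last two branches is a simplification assumed for illustration.

    def frame_type(es_current, es_history, preset_change,
                   frames_since_key, gop_length):
        if abs(es_current - es_history) > preset_change:
            return "IDR"  # entropy change exceeds the preset value: key frame
        if frames_since_key >= gop_length:
            return "I"    # regular GOP boundary
        return "P"        # forward predictive frame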
In some embodiments, the length of the group of pictures may be dynamically determined in the following manner when a scene change occurs, a preset target exists in the picture scene, or the picture scene belongs to a motion scene.
Referring to fig. 5, fig. 5 is a flowchart illustrating a video encoding method according to a fourth embodiment of the present application. The method may comprise the steps of:
s41: and acquiring an information entropy change value between the target information entropy of the current frame and the target information entropy of the historical frame.
The specific implementation process of this step may refer to the specific implementation process of step S31, which is not described in detail herein.
Alternatively, the information entropy change value may be computed as in the calculation of the reference information entropy difference DEc described above, which the present application does not limit.
S42: and obtaining the entropy change weight factor by using the information entropy change value.
The entropy change weight factor can be obtained using the third logarithmic processing result of the information entropy change value and the corresponding entropy change weight.
S43: and adjusting the initial length of the picture group by utilizing the entropy change weight factor to obtain the target length of the picture group.
In the case that the scene analysis result indicates that the picture scene of the current frame belongs to a moving picture or that a preset target exists, the target length is the length of the group of pictures to which the current frame belongs.
The initial length GOP_old of the group of pictures may be adjusted using the entropy change weight factor obtained above to obtain the target length GOP_new of the group of pictures. The process can be expressed by the following formula:
GOP_new=GOP_old-ln(Ed)*w5 (11)
where GOP_new represents the target length of the group of pictures, GOP_old represents the initial length of the group of pictures, ln(Ed)*w5 represents the entropy change weight factor, ln(Ed) represents the third logarithmic processing result, w5 represents the entropy change weight, and Ed represents the information entropy change value.
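Formula (11) can be transcribed directly, assuming Ed > 0 so the logarithm is defined; the clamping bounds are illustrative assumptions, not values from the text.

    import math

    def adapt_gop_length(gop_old, Ed, w5, min_len=25, max_len=750):
        # Formula (11): GOP_new = GOP_old - ln(Ed)*w5. When the scene grows
        # more dynamic (Ed > 1) the GOP shortens; when it calms (Ed < 1)
        # the negative logarithm lengthens it.
        gop_new = gop_old - math.log(Ed) * w5
        return int(min(max(gop_new, min_len), max_len))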
In some embodiments, after the target length of the group of pictures is obtained, the number of frame images contained in the current GOP can be determined. In this manner, the GOP length can be dynamically adjusted according to how much the picture weight occupied by targets changes between adjacent frames.
In addition, associating the GOP length with the change in information entropy optimizes the frame reference relationships in dynamic scenes. For scenes with intense motion, setting a short GOP can alleviate the smearing and mosaic artifacts caused by erroneous reference relationships; for low-motion scenes, a long GOP can be set to save code rate.
In some embodiments, after determining the target length of the group of pictures, the target length of the group of pictures may be utilized to determine the key frames of the group of pictures.
The number of frames contained in each group of pictures can be determined by the target length of the group of pictures, so that the first key frame of the group of pictures is determined according to the target length. In addition, a mandatory I frame may be set according to the target length of the GOP.
In some embodiments, when it is determined that the current frame contains a preset target or belongs to a motion scene, the current frame may be set as an I frame or IDR frame. By means of this embodiment, the target length of the group of pictures corresponding to the current frame (I frame or IDR frame) can then be determined; that is, the initial length of the GOP corresponding to the current frame can be adjusted.
Referring to fig. 6, fig. 6 is a flowchart of a video encoding method according to a fifth embodiment of the present application. The method may comprise the steps of:
S51: Acquire an image evaluation value and scene-related information of the current frame.
In some embodiments, the steps of this embodiment may be performed after the current frame has been encoded based on the target encoding parameters to obtain its encoded data.
In some embodiments, an image evaluation value and scene-related information of the current frame may be acquired. The image evaluation value of the current frame can be derived from the target areas, the brightness of the target areas, and the like. The scene-related information may include the targets of the current frame, target identifiers, target types, and so on.
Specifically, a target detection evaluation value may be acquired for each target area by detecting, with an AI detection model, the target area in which each target is located. The AI detection model may comprehensively determine a detection score according to the type of the target, the size of the target area, the definition of the target, the brightness of the target, and the like, and use this detection score as the target detection evaluation value.
The target detection factor is obtained as the product of the target detection evaluation value and the target evaluation weight, and the information weight factor is obtained as the product of the image information of the target area and the information evaluation weight. The target detection factor and the information weight factor are then superposed (summed) to obtain the region evaluation value of the target area. In this way, a region evaluation value is obtained for the target area where each of the at least one target is located.
As an example, taking the image information as image brightness and the information evaluation weight as a brightness evaluation weight, the region evaluation value of each target area may be expressed as: region evaluation value = target detection evaluation value × target evaluation weight + image brightness × brightness evaluation weight.
Then, the image evaluation value of the current frame can be obtained based on the region evaluation values of at least one target area, or based on the region evaluation values of all target areas.
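A minimal sketch of this evaluation, assuming illustrative weight values and that the frame-level value is the mean of the region values (the application leaves the aggregation unspecified):

```python
def region_evaluation(det_score: float, brightness: float,
                      target_eval_weight: float = 0.7,
                      brightness_weight: float = 0.3) -> float:
    # Region evaluation value = target detection evaluation value * target
    # evaluation weight + image brightness * brightness evaluation weight.
    return det_score * target_eval_weight + brightness * brightness_weight

def image_evaluation(regions: list[tuple[float, float]]) -> float:
    # regions: (detection score, brightness) per target area. Averaging the
    # region evaluation values is an assumption; the source only says the
    # frame-level value is obtained "based on" the region values.
    if not regions:
        return 0.0
    return sum(region_evaluation(s, b) for s, b in regions) / len(regions)
```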
S52: the image evaluation value and scene-related information are written into the encoded data of the current frame.
The image evaluation value and scene-related information of the current frame may be written into the encoded data of the current frame, for example into the code stream custom information such as SEI (Supplemental Enhancement Information), which can carry supplemental information that is not required for decoding the image itself.
The encoded data carrying the image evaluation value and scene-related information is sent to the decoding end, so that the decoding end can decode it using the image evaluation value and scene-related information to obtain the current frame. In this way, the decoding end can filter the key frames to be decoded according to the target types of interest, the image evaluation values of the frames of the target video, and so on. This optimizes the decoding end, greatly reduces decoding consumption, reduces the storage, processing, and retrieval of low-value video information, and improves decoding efficiency.
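As a sketch of how the custom information might be packed, assuming a user_data_unregistered SEI payload (a 16-byte UUID followed by opaque bytes) and a JSON body; neither the UUID nor the field layout comes from the application:

```python
import json
import uuid

# Hypothetical 16-byte UUID identifying this application's custom payload.
APP_UUID = uuid.UUID("12345678-1234-5678-1234-567812345678").bytes

def build_user_data_sei(image_eval: float, scene_info: dict) -> bytes:
    """Pack the image evaluation value and scene-related information as a
    user_data_unregistered SEI payload. The encoder is still responsible
    for wrapping these bytes in an SEI NAL unit."""
    body = json.dumps({"eval": image_eval, **scene_info}).encode("utf-8")
    return APP_UUID + body
```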
The above embodiments do not focus only on the quantization parameter (QP) value itself; instead, encoding is holistically optimized based on the information entropy factor and the detection information obtained by the AI scene analysis technology, improving the quality and performance of the encoded and decoded images.
In addition, the information entropy weight factor is obtained from the targets detected in an AI manner, and the rate-distortion optimization parameters are adjusted in real time, so that the video quality more closely matches subjective human perception and intent.
In addition, since video coding employs block division, compression relies on inter-frame prediction between I-frames and P-frames. When the rate budget is limited and the picture contains abundant sharpened detail, heavy random noise, or large changes in moving objects, the QP quantization parameters of moving areas, high-brightness areas, and areas with high texture or noise may not be adjusted, or not adjusted in time. This causes prediction errors, and the errors (large residuals) diffuse rapidly, producing mosaics, smear, and breathing effects.

In the present method, pre-encoding preprocessing is performed with an AI detection method to obtain the coordinates and extent of each target, and the irrational, random QP assignment caused by simple block division is optimized by a combined target-area and block-division method. Because the targets can be updated in real time every frame, the rapid error diffusion of the I-frame/P-frame reference mechanism is avoided. Through the adjustment algorithm of this embodiment and actual testing, the weight a (for the target area and frame-to-frame variation), the complex texture weight b, and the brightness weight c obtained through fine tuning can be fitted to a corresponding curve function, for example with Matlab, and the QP is then adjusted by this function. This ensures the timeliness and consistency of QP adjustment.
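A sketch of how such a fitted function could be applied at run time, with a hypothetical linear feature combination and quadratic coefficients standing in for the Matlab fit (the application does not disclose the fitted form or values):

```python
def qp_offset(area_frame_diff: float, texture: float, brightness: float,
              a: float = 1.0, b: float = 0.5, c: float = 0.2,
              coeffs: tuple[float, float, float] = (-0.02, 1.5, 0.0)) -> float:
    """Evaluate a fitted curve mapping weighted scene features to a QP offset.

    Combining the features linearly and fitting a quadratic are illustrative
    assumptions; the source states only that weights a, b, c are fine-tuned
    and fitted to a curve function.
    """
    x = a * area_frame_diff + b * texture + c * brightness
    k2, k1, k0 = coeffs
    return k2 * x * x + k1 * x + k0
```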
The above embodiments are illustrated below with three scenarios.
Scenario 1: the picture scene is in a relatively static state. Scenario 2: no preset target exists in the picture scene, but the picture belongs to a moving picture; that is, no target is detected, yet the image brightness or the changing picture content (e.g., noise) exceeds a threshold. Scenario 3: a preset target exists in the picture scene.
For scenario 1: the current frame m is detected to be in a relatively static state, the current detection window size w is acquired, and all frames in the subsequent detection window are in the relatively static state. Frame m is taken as an I-frame, and the following n frames are encoded in parallel and written into a buffer (e.g., a FIFO buffer). The main encoding thread cyclically reads the encoded frames from the buffer and sets their frame sequence numbers.
When the algorithm detects that frame n is in motion, the parallel encoding threads are stopped, and the encoder continues reading the target video input frames (YUV). The pipeline may then switch to scenario 2 or scenario 3.
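A minimal threading sketch of scenario 1, assuming each frame of the static window can be encoded independently; encode_fn and the one-thread-per-frame layout are illustrative, not from the application:

```python
import queue
import threading

def encode_static_window(frames, encode_fn):
    """Frames in a detected static window are encoded by parallel threads
    into a FIFO buffer; the main encoding thread then drains the buffer and
    restores display-order sequence numbers."""
    fifo: queue.Queue = queue.Queue()

    def worker(seq: int, frame) -> None:
        fifo.put((seq, encode_fn(frame)))  # encode one frame of the window

    threads = [threading.Thread(target=worker, args=(i, f))
               for i, f in enumerate(frames)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Main thread: read back the encoded frames and reassign sequence order.
    return [data for _, data in sorted(fifo.get() for _ in frames)]
```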
For scenario 2: scenes with heavy noise at night, or swaying trees. According to the target type weights, targets, and other settings configured by the user, the luminance and texture sum of the whole frame can be obtained, and the corresponding encoding parameters set, such as the target length of the GOP, the maximum QP, the minimum QP, QP_delta, and so on.
If the user does not care about the video quality of such scenes, coding performance can be improved and bandwidth saved by adjusting the target type weights.
For scenario 3: in typical scenes, users care most about key information such as license plates and faces. Therefore, the weights of face and license plate targets can be increased, while the weights of human body and vehicle body targets are reduced. The product of the target texture sum and the target type weight is computed, and the corresponding encoding parameters are set.
The macroblocks covering each target are also subject to hierarchical QP control according to the target information entropy factor of each macroblock. While still satisfying the original rate control strategy, such scenes retain more frames, and heavily weighted targets such as faces and license plates are rendered more clearly.
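Hierarchical macroblock QP control could look like the following sketch, assuming a normalized per-macroblock entropy factor and hypothetical clamping bounds:

```python
def macroblock_qp(base_qp: int, mb_entropy_factor: float,
                  strength: float = 6.0,
                  qp_min: int = 10, qp_max: int = 51) -> int:
    # Macroblocks covering heavily weighted targets (e.g., faces, license
    # plates) carry a larger entropy factor and therefore get a lower QP,
    # i.e., finer quantization. The linear mapping and the clamping bounds
    # are assumptions for illustration.
    qp = round(base_qp - strength * mb_entropy_factor)
    return min(max(qp, qp_min), qp_max)
```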
Referring to fig. 7, fig. 7 is a flowchart illustrating an embodiment of a video decoding method according to the present application. The method may comprise the steps of:
S61: Receive the encoded data of the current frame sent by the encoding end, where the encoded data is obtained by the encoding end performing the steps of the video encoding method above.
This embodiment can be applied to a decoding end connected to an encoding end. Encoded data can be transmitted between them, and the decoding end receives the encoded data of the current frame sent by the encoding end.
S62: Parse the encoded data to obtain target decoding parameters.
The encoded data can be parsed to obtain the image evaluation value and scene-related information written into the encoded data of the current frame; using these, the key regions of interest, key frames, and so on that need decoding can be selected so as to determine the target decoding parameters. When the video data is packed, the image evaluation value and scene-related information are written into the code stream custom information (SEI); at decoding time, the video data can be filtered, stored, and decoded according to this SEI information.
S63: Decode the encoded data based on the target decoding parameters to obtain a video decoded image.
The target identifier, the target type, and the image evaluation value of each target are carried in the code stream custom information (SEI) of the video data and transmitted to the decoding end. The user can then filter the key video streams to be decoded according to the target types of interest and a video average-score threshold.
In this embodiment, the decoding end may filter the key frames to be decoded according to the target types of interest, the image evaluation value of the current frame of the target video, and so on. This optimizes the decoding end, greatly reduces decoding consumption, reduces the storage, processing, and retrieval of low-value video information, and improves decoding efficiency.
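Continuing the encoder-side SEI sketch above, the decoding end's filter might look like this, under the same hedged payload assumptions (16-byte UUID prefix plus a JSON body):

```python
import json

def should_decode(sei_payload: bytes, wanted_types: set,
                  score_threshold: float) -> bool:
    """Parse the custom SEI written by the encoder and skip frames whose
    targets and evaluation score are of no interest. The payload layout
    mirrors the encoder-side sketch and is equally an assumption."""
    info = json.loads(sei_payload[16:].decode("utf-8"))  # skip the UUID prefix
    if info.get("eval", 0.0) < score_threshold:
        return False
    return bool(wanted_types.intersection(info.get("types", [])))
```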
For the above embodiments, the present application provides an encoding end connected to a decoding end, which may be used to implement any step of the video encoding method.
Referring to fig. 8, fig. 8 is a schematic structural diagram of an embodiment of the encoding end of the present application. The encoding end 70 includes an acquisition module 71, an analysis module 72, a parameter module 73, and an encoding module 74, which are interconnected.
The obtaining module 71 is configured to obtain a current frame to be encoded from a target video.
The analysis module 72 is configured to analyze a scene of a current frame to obtain a scene analysis result.
The parameter module 73 is configured to determine a target encoding parameter of the current frame using the scene analysis result.
The encoding module 74 is configured to encode the current frame based on the target encoding parameter, to obtain encoded data of the current frame.
The specific implementation process of this embodiment may refer to the specific implementation process of the foregoing embodiment, and the disclosure is not repeated herein.
For the above embodiments, the present application provides a decoding end connected to the encoding end, which may be used to implement any step of the video decoding method.
Referring to fig. 9, fig. 9 is a schematic structural diagram of a decoding end according to an embodiment of the application. The decoding end 80 includes a receiving module 81, a parsing module 82, and a decoding module 83. Wherein the receiving module 81, the parsing module 82 and the decoding module 83 are interconnected.
The receiving module 81 is configured to receive encoded data of a current frame sent by an encoding end; wherein, the coded data is obtained by the coding end executing any step of the video coding method.
The parsing module 82 is configured to parse the encoded data to obtain target decoding parameters.
The decoding module 83 is configured to decode the encoded data based on the target decoding parameters to obtain a video decoded image.
The specific implementation process of this embodiment may refer to the specific implementation process of the foregoing embodiment, and the disclosure is not repeated herein.
For the foregoing embodiments, the present application provides a computer device; please refer to fig. 10, which is a schematic structural diagram of an embodiment of the computer device of the present application. The computer device 90 comprises a memory 91 and a processor 92 coupled to each other; the memory 91 stores program data, and the processor 92 is configured to execute the program data to implement the steps of any embodiment of the video encoding method or the video decoding method.
In the present embodiment, the processor 92 may also be referred to as a CPU (Central Processing Unit). The processor 92 may be an integrated circuit chip with signal processing capabilities. The processor 92 may also be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The general-purpose processor may be a microprocessor, or the processor 92 may be any conventional processor or the like.
The methods of the above embodiments may be implemented in the form of a computer program; accordingly, the present application proposes a computer-readable storage medium. Please refer to fig. 11, which is a schematic structural diagram of an embodiment of the computer-readable storage medium of the present application. The computer-readable storage medium 100 stores program data 101 executable by a processor, and the program data 101 can be executed to implement the steps of any of the above video encoding method and video decoding method embodiments.
The computer-readable storage medium 100 of this embodiment may be a medium capable of storing the program data 101, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk; or it may be a server that stores the program data 101 and can either send the stored program data 101 to another device for execution or run it itself.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical, or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing an electronic device (which may be a personal computer, a server, a network device, etc.) or a processor to perform all or part of the steps of the methods of the embodiments of the present application.
It will be apparent to those skilled in the art that the modules or steps of the application described above may be implemented in a general purpose computing device, they may be concentrated on a single computing device, or distributed across a network of computing devices, or they may alternatively be implemented in program code executable by computing devices, such that they may be stored in a computer readable storage medium for execution by computing devices, or they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps within them may be fabricated into a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The foregoing description is only illustrative of the present application and is not intended to limit the scope of the application, and all equivalent structures or equivalent processes or direct or indirect application in other related technical fields are included in the scope of the present application.

Claims (18)

1. A method of video encoding, the method comprising:
acquiring a current frame to be coded from a target video;
analyzing the picture scene of the current frame to obtain a scene analysis result;
Determining target coding parameters of the current frame by using the scene analysis result;
and encoding the current frame based on the target encoding parameter to obtain encoded data of the current frame.
2. The method of claim 1, wherein said determining the target coding parameters of the current frame using the scene analysis results comprises:
obtaining target information entropy of at least one target area in the current frame in response to the scene analysis result being that the picture scene of the current frame belongs to a moving picture or that a preset target exists; wherein, in the case that the picture scene of the current frame belongs to the moving picture, the target area is a preset area of the current frame; in the case that the preset target exists in the picture scene of the current frame, the target area is an area containing the preset target, or the target area is at least one block area of the area containing the preset target;
obtaining a reference information entropy of a reference area in the current frame; wherein the area of the reference area is larger than or equal to that of the target area, and the reference area contains the target area;
Acquiring an information entropy weight factor by utilizing the target information entropy and the reference information entropy;
and adjusting the initial coding parameters based on the information entropy weight factors to obtain target coding parameters of the current frame.
3. The method according to claim 2, wherein said adjusting initial encoding parameters based on said entropy weight factor to obtain target encoding parameters of said current frame comprises:
adjusting the initial coding parameters by utilizing the information entropy weight factors to obtain reference coding parameters;
and determining the target coding parameter based on the precoding results of the initial coding parameter and the reference coding parameter.
4. The method of claim 3, wherein the determining the target coding parameter based on the precoding results of the initial coding parameter and the reference coding parameter comprises:
acquiring a first precoding result of precoding the current frame by using the initial coding parameters, and acquiring a second precoding result of precoding the current frame by using the reference coding parameters;
acquiring the coding deviation of the first precoding result and the second precoding result;
And selecting a preset coding parameter or a parameter value of the reference coding parameter as a target parameter value based on the coding deviation, and adjusting the initial coding parameter according to the target parameter value to obtain the target coding parameter.
5. The method according to claim 4, wherein selecting the parameter value of the preset encoding parameter or the reference encoding parameter as a target parameter value based on the encoding bias, and adjusting the initial encoding parameter according to the target parameter value to obtain the target encoding parameter, comprises:
in response to the encoding deviation being greater than a first preset deviation threshold, selecting a parameter value of the preset encoding parameter as a target parameter value, and adjusting the parameter value of the initial encoding parameter to the target parameter value to obtain the target encoding parameter; or,
responding to the coding deviation not larger than a first preset deviation threshold value and the coding deviation larger than a second preset deviation threshold value, selecting a parameter value of the reference coding parameter as a target parameter value, and adjusting the parameter value of the initial coding parameter to the target parameter value to obtain the target coding parameter; or,
And in response to the coding deviation not being greater than a first preset deviation threshold and the coding deviation not being greater than a second preset deviation threshold, selecting the parameter value of the reference coding parameter as a target parameter value, and adjusting the parameter value of the initial coding parameter to a preset statistical value between the initial coding parameter and the target parameter value to obtain the target coding parameter.
6. A method according to claim 3, characterized in that the initial parameter limit range of the initial coding parameter comprises at least one first limit term; the adjusting the initial coding parameters by utilizing the information entropy weight factor comprises:
for each first limit value item, adjusting the first limit value item by utilizing an entropy weight factor corresponding to the first limit value item to obtain a second limit value item corresponding to the first limit value item, wherein each second limit value item forms a target parameter limit value range;
performing limit processing on the reference coding parameters based on the target parameter limit range to obtain reference coding parameters after limit processing;
the determining the target coding parameter based on the precoding result of the initial coding parameter and the reference coding parameter includes:
Determining the target coding parameters based on the initial coding parameters and the pre-coding results of the reference coding parameters after the limiting values;
wherein the at least one first limit term includes at least one of an initial encoding parameter upper limit, an initial encoding parameter lower limit, and an initial encoding step size limit; the at least one second limit term includes at least one of a target encoding parameter upper limit, a target encoding parameter lower limit, and a target encoding step size limit.
7. The method of claim 6, further comprising, prior to said adjusting said first limit term with said entropy weight factor corresponding to said first limit term, obtaining a second limit term corresponding to said first limit term:
obtaining a first logarithmic processing result of the sum of target information entropies of the target areas and a second logarithmic processing result of the reference information entropy, and dividing the product of the first logarithmic processing result and the weight corresponding to the first limit value item by the second logarithmic processing result to obtain the entropy weight factor corresponding to each first limit value item;
and/or the target coding parameter upper limit value is the difference between the initial coding parameter upper limit value and the entropy weight factor corresponding to the initial coding parameter upper limit value, the target coding parameter lower limit value is the sum of the entropy weight factors corresponding to the initial coding parameter lower limit value and the initial coding parameter lower limit value, and the target coding step size limit value is the sum of the entropy weight factors corresponding to the initial coding step size limit value and the initial coding step size limit value.
8. The method of claim 2, wherein the obtaining an entropy weight factor using the target entropy and a reference entropy comprises:
acquiring a first logarithmic processing result of the sum of target information entropy of each target area and a second logarithmic processing result of the reference information entropy;
dividing the product of the first logarithmic processing result and the preset weight by the second logarithmic processing result to obtain the information entropy weight factor.
9. The method according to claim 2, wherein said obtaining the target information entropy of at least one target area of the current frame comprises:
acquiring image information of each target area, wherein the image information comprises at least one of image brightness and image texture;
acquiring target information entropy of each target area by utilizing the image information of the target area, the type weight of the target area and the area of the target area;
the obtaining the reference information entropy of the reference area of the current frame includes:
acquiring image information of the reference area, wherein the image information comprises at least one of image brightness and image texture;
And acquiring the reference information entropy of the reference region by utilizing the image information of the reference region and the region area of the reference region.
10. The method of claim 1, wherein said determining the target coding parameters of the current frame using the scene analysis results comprises:
in response to the scene analysis result being that the picture scene of the current frame belongs to a relatively still picture and that no preset target exists, acquiring the coding parameters of the key frame as the target coding parameters based on the picture information of the key frame in the group of pictures where the current frame is located;
wherein, in case the current frame is a forward predictive coded frame, the current frame is coded in parallel with at least one remaining forward predictive coded frame of the group of pictures.
11. The method according to claim 1, wherein the method further comprises:
acquiring an information entropy change value between the target information entropy of the current frame and the target information entropy of the historical frame;
judging whether the information entropy change value is larger than a preset change value or not;
and taking the current frame as a key frame of a picture group in response to the information entropy change value being greater than the preset change value.
12. The method according to claim 11, wherein after the obtaining the information entropy change value between the target information entropy of the current frame and the target information entropy of the history frame, the method comprises:
obtaining entropy change weight factors by utilizing the information entropy change values;
adjusting the initial length of the picture group by utilizing the entropy change weight factor to obtain the target length of the picture group;
wherein, in a case that the scene analysis result indicates that the picture scene of the current frame belongs to a moving picture or that a preset target exists, the target length of the group of pictures is the length of the group of pictures to which the current frame belongs.
13. The method according to claim 1, wherein after the encoding the current frame based on the target encoding parameter to obtain the encoded data of the current frame, the method comprises:
acquiring an image evaluation value and scene related information of the current frame;
writing the image evaluation value and scene related information into the coded data of the current frame;
the coded data is used for being sent to a decoding end, so that the decoding end decodes the coded data by utilizing the image evaluation value and the scene related information to obtain the current frame.
14. The method of claim 13, wherein the obtaining the image evaluation value of the current frame comprises:
for at least one target region, acquiring a target detection evaluation value of the target region in the current frame;
acquiring a target detection factor by using the target detection evaluation value and the target evaluation weight;
acquiring an information weight factor by utilizing the image information and the information evaluation weight of the target area;
obtaining a region evaluation value of the target region by using the target detection factor and the information weight factor;
and obtaining an image evaluation value of the current frame based on the region evaluation value of the at least one target region.
15. A method of video decoding, the method comprising:
receiving coded data of a current frame sent by a coding end; wherein the encoded data is obtained by the encoding end performing the steps of the method of any one of claims 1 to 14;
analyzing the encoded data to obtain target decoding parameters;
and decoding the encoded data based on the target decoding parameters to obtain a video decoded image.
16. The method of claim 15, wherein said parsing with said encoded data to obtain target decoding parameters comprises:
Analyzing the coded data to obtain an image evaluation value and scene related information written in the coded data of the current frame;
and determining the target decoding parameters by using the image evaluation values and scene related information.
17. A computer device comprising a memory and a processor coupled to each other, the memory having stored therein program data, the processor being adapted to execute the program data to implement the steps of the method of any of claims 1 to 16.
18. A computer readable storage medium, characterized in that program data executable by a processor are stored, said program data being for implementing the steps of the method according to any one of claims 1 to 16.