WO2022021422A1 - Video coding method and system, coder, and computer storage medium - Google Patents

Video coding method and system, coder, and computer storage medium

Info

Publication number
WO2022021422A1
Authority
WO
WIPO (PCT)
Prior art keywords
distortion
video
parameter
value
target
Prior art date
Application number
PCT/CN2020/106416
Other languages
French (fr)
Chinese (zh)
Inventor
元辉
周兰
李明
姜东冉
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Priority to CN202080099999.3A (CN115428451A)
Priority to PCT/CN2020/106416 (WO2022021422A1)
Publication of WO2022021422A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria

Definitions

  • the embodiments of the present application relate to the technical field of video coding and decoding, and in particular, to a video coding method, an encoder, a system, and a computer storage medium.
  • H.266/Versatile Video Coding (VVC), H.265/High Efficiency Video Coding (HEVC)
  • the existing rate-distortion optimization algorithms either guarantee only the fidelity of the reconstructed video, or guarantee its subjective quality at the cost of greatly reduced fidelity.
  • the distortion criteria adopted by the existing rate-distortion optimization algorithms are single and incomplete, so these algorithms cannot adapt well to application scenarios oriented to machine vision and human-machine vision.
  • Embodiments of the present application provide a video encoding method, encoder, system, and computer storage medium that adapt well to application scenarios oriented to machine vision and human-machine vision and that, at a given bit rate, improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity, thereby improving coding efficiency.
  • an embodiment of the present application provides a video encoding method, which is applied to an encoder, and the method includes:
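  • determining pre-parameters of the video to be encoded, and determining a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
  • determining a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • determining a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion;
  • determining a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
  • determining a target distortion value according to the first distortion value and the second distortion value;
  • using the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the video to be encoded, and encoding the video to be encoded.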
  • an embodiment of the present application provides an encoder, where the encoder includes a determining unit, a calculation unit, and an encoding unit; wherein,
  • the determining unit is configured to determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
  • the calculation unit is configured to determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • the determining unit is further configured to determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; and determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
  • the calculation unit is further configured to determine a target distortion value according to the first distortion value and the second distortion value;
  • the encoding unit is configured to use the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the video to be encoded, and to encode the video to be encoded.
  • an embodiment of the present application provides an encoder, where the encoder includes a memory and a processor; wherein,
  • the memory is configured to store a computer program executable on the processor;
  • the processor is configured to execute the method according to the first aspect when running the computer program.
  • an embodiment of the present application provides a computer storage medium, where the computer storage medium stores a computer program, and the computer program implements the method according to the first aspect when the computer program is executed by at least one processor.
  • an embodiment of the present application provides a video system, where the video system includes an encoder and a decoder; wherein,
  • the encoder is configured to: determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters; determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; determine a target distortion value according to the first distortion value and the second distortion value; and use the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the video to be encoded, encode the video to be encoded to generate a code stream, and transmit the code stream to the decoder;
  • the decoder is configured to parse the code stream to obtain decoded video.
  • Embodiments of the present application provide a video encoding method, encoder, system, and computer storage medium, in which: pre-parameters of the video to be encoded are determined, and a first Lagrangian multiplier and a second Lagrangian multiplier are determined according to the pre-parameters; a target Lagrangian multiplier is determined according to the first Lagrangian multiplier and the second Lagrangian multiplier; a first distortion value is determined according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; a second distortion value is determined according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; a target distortion value is determined according to the first distortion value and the second distortion value; and the target Lagrangian multiplier and the target distortion value are used to determine the encoding parameters of the video to be encoded and to encode the video to be encoded.
  • in this way, the first distortion metric based on the semantic distortion metric and the second distortion metric based on the numerical error metric are considered together for rate-distortion optimization in video coding, which adapts well to application scenarios oriented to machine vision and human-machine vision; at a given bit rate, it improves the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity, thereby also improving coding efficiency.
  • FIG. 1 is a schematic diagram of an RD curve provided by a related technical solution;
  • FIG. 2 is a schematic structural diagram of a system composition of an encoder according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a video encoding method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a curve of a functional relationship between a first distortion value and a code rate according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of a curve of a functional relationship between a code rate and a quantization parameter according to an embodiment of the present application
  • FIG. 6 is a schematic diagram of a curve of a functional relationship between a first distortion value and MSE provided by an embodiment of the present application;
  • FIG. 7 is a schematic diagram of the composition and structure of an encoder provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a specific hardware structure of an encoder provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a video system according to an embodiment of the present application.
  • the higher the bit rate, the better the reconstructed video quality and the smaller the distortion; however, a higher bit rate also means that the encoded file occupies more storage space. It is therefore necessary to find a balance between the distortion of the reconstructed video and the bit rate through a rate-distortion optimization algorithm, so that the compression effect is optimal.
  • rate-distortion optimization can be expressed as minimizing the distortion of the decoded and reconstructed video when the encoded file does not exceed a certain bit rate, as shown in the following formula (1).
  • D and R represent the distortion and code rate under certain coding parameters, respectively.
  • the video is encoded with the given encoding parameters, and the encoded bit rate (R) and the distortion (D) of the reconstructed video are calculated.
  • by changing the encoding parameters and repeatedly encoding the video to be encoded, multiple R-D points, each consisting of a bit rate and a distortion, can be obtained, as shown in FIG. 1.
  • the points with the least distortion for a given bit rate lie on the convex curve (i.e., the RD curve) in FIG. 1.
  • the encoder needs to determine a set of encoding parameters so that the encoded R-D point can approximate this convex curve as much as possible.
  • the constrained problem of the above formula (1) can be transformed into an unconstrained problem by the Lagrange multiplier method, as shown in the following formula (2).
  • $\lambda$ is the Lagrange multiplier and $J$ is the rate-distortion cost function.
  • the encoder can find the optimal encoding parameters by minimizing the rate-distortion cost function.
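  • the formula images for (1) and (2) are not reproduced in this text. A standard form consistent with the surrounding definitions of $D$, $R$, $\lambda$ and $J$ is sketched below; the symbol $R_c$ for the bit-rate bound is an assumed name that does not come from the original.

```latex
\begin{align}
\min \{ D \} \quad \text{s.t.} \quad R \le R_c \tag{1} \\
\min \{ J \}, \quad J = D + \lambda \cdot R \tag{2}
\end{align}
```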
  • in this way, the encoder can determine the optimal block division method, the optimal intra-frame prediction mode, and the optimal inter-frame prediction motion mode (including motion vector, reference image, prediction weight, etc.) to achieve optimal encoding performance.
  • the rate-distortion optimization here adopts the sum of squared errors (SSE) as the distortion criterion, and the corresponding reconstructed video quality can be measured by the peak signal-to-noise ratio (PSNR).
  • SSE distortion can objectively measure the fidelity of the video, and its calculation formula is shown in the following formula (3): $SSE = \sum_{x=1}^{M} \sum_{y=1}^{N} \left( f(x, y) - g(x, y) \right)^2$.
  • $M$ and $N$ represent the horizontal spatial resolution and vertical spatial resolution of the video, respectively, $f(x, y)$ represents the original pixel value at the pixel position $(x, y)$, and $g(x, y)$ represents the reconstructed pixel value at the pixel position $(x, y)$.
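  • the SSE criterion and the PSNR quality measure described above can be sketched as follows. This is a minimal illustration, not the encoder's implementation; the function names and the default peak value of 255 (8-bit samples) are assumptions.

```python
import math


def sse(original, reconstructed):
    """Sum of squared errors between two equally sized 2-D pixel arrays."""
    if len(original) != len(reconstructed):
        raise ValueError("frames must have the same number of rows")
    total = 0
    for row_f, row_g in zip(original, reconstructed):
        if len(row_f) != len(row_g):
            raise ValueError("rows must have the same length")
        for f, g in zip(row_f, row_g):
            total += (f - g) ** 2
    return total


def psnr(original, reconstructed, max_value=255):
    """PSNR in dB, derived from the mean squared error (SSE / pixel count)."""
    pixel_count = sum(len(row) for row in original)
    mse = sse(original, reconstructed) / pixel_count
    if mse == 0:
        return math.inf  # identical frames: no distortion
    return 10 * math.log10(max_value ** 2 / mse)
```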
  • rate-distortion optimization is a key technology in video coding, and it affects the performance of the encoder.
  • SSE distortion measures the fidelity of the video from an objective point of view, but it is not consistent with the perception of the human visual system; for example, in some areas with large SSE distortion, the human eye does not perceive any degradation of the reconstructed video quality.
  • the distortion criterion needs to be changed to a distortion metric that can measure the subjective quality.
  • for example, structural similarity (SSIM) distortion may be used; its calculation formula is shown in the following formula (4).
  • $C_1$ and $C_2$ are two constants that avoid instability when the denominator terms are close to 0, with $C_1 = (K_1 L)^2$, $C_2 = (K_2 L)^2$, $K_1 = 0.01$, and $K_2 = 0.03$.
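  • formula (4) itself is not reproduced in this text. The sketch below uses the commonly published single-window SSIM definition, $\frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$, with the constants given above. Treating $L$ as the dynamic range of the pixel values (255 for 8-bit samples) and using population statistics over one flat window are assumptions, as is the function name.

```python
def ssim_window(x, y, dynamic_range=255, k1=0.01, k2=0.03):
    """SSIM of two equally sized pixel windows given as flat sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("windows must be non-empty and equally sized")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Population variance and covariance (divide by n); an assumption.
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilising constants C1 = (K1 * L)^2 and C2 = (K2 * L)^2.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

  • the constants keep the ratio well defined even for a completely flat window, where both variances and the covariance are zero.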
  • the rate-distortion optimization algorithm based on SSE distortion can ensure the fidelity of the reconstructed video; however, although the SSIM distortion considering the subjective quality can guarantee the subjective quality of the reconstructed video, the fidelity performance of the video will be greatly reduced.
  • 5G fifth-generation mobile communication
  • machine-oriented applications include the Internet of Vehicles, unmanned driving, the industrial Internet, smart and safe cities, wearables, and video surveillance; machine vision content therefore has a wider range of application scenarios.
  • most videos will be consumed by machines, for example in intelligent analysis of reconstructed videos such as pedestrian detection, semantic segmentation, and target detection.
  • the distortion criterion adopted by the current rate-distortion optimization algorithm considers only fidelity distortion and not semantic distortion; it can guarantee fidelity performance but cannot guarantee the semantic accuracy of the reconstructed video, so the current rate-distortion optimization algorithm cannot adapt well to many scenarios oriented to machine vision and human-machine vision.
  • an embodiment of the present application provides a video encoding method.
  • the basic idea is: determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters; determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; determine a target distortion value according to the first distortion value and the second distortion value; and use the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the video to be encoded and to encode the video to be encoded.
  • in this way, the first distortion metric based on the semantic distortion metric and the second distortion metric based on the numerical error metric are considered together for rate-distortion optimization in video coding, which adapts well to application scenarios oriented to machine vision and human-machine vision; at a given bit rate, it improves the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity, thereby also improving coding efficiency.
  • the encoder 10 may include a transform and quantization unit 101, an intra-frame estimation unit 102, an intra-frame prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, an encoding unit 109, a decoded image buffer unit 110, etc., wherein the filtering unit 108 can implement deblocking filtering and sample adaptive offset (Sample Adaptive Offset, SAO) filtering, and the encoding unit 109 can implement header information coding and context-based adaptive binary arithmetic coding (Context-based Adaptive Binary Arithmetic Coding, CABAC).
  • a video coding block can be obtained through coding unit (Coding Unit, CU) division, and the residual pixel information obtained after intra-frame or inter-frame prediction is then transformed by the transform and quantization unit 101, which includes transforming the residual information from the pixel domain to the transform domain and quantizing the resulting transform coefficients to further reduce the bit rate.
  • the intra-frame estimation unit 102 and the intra-frame prediction unit 103 are used to perform intra prediction on the video coding block; specifically, they are used to determine the intra-frame prediction mode to be used to encode the video coding block.
  • the motion compensation unit 104 and the motion estimation unit 105 are used to perform inter-predictive encoding of the received video coding block relative to one or more blocks in one or more reference frames to provide temporal prediction information; motion estimation performed by the motion estimation unit 105 is the process of generating motion vectors, which estimate the motion of the video coding block, and the motion compensation unit 104 then performs motion compensation based on the motion vector determined by the motion estimation unit 105.
  • after determining the intra prediction mode, the intra-frame prediction unit 103 also supplies the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 also sends the calculated motion vector data to the encoding unit 109.
  • in addition, the inverse transform and inverse quantization unit 106 is used for reconstruction of the video coding block: a residual block is reconstructed in the pixel domain, blocking artifacts are removed from the reconstructed residual block by the filter control analysis unit 107 and the filtering unit 108, and the reconstructed residual block is then added to a predictive block in a frame of the decoded image buffer unit 110 to generate a reconstructed video coding block.
  • the encoding unit 109 is used for encoding various coding parameters and quantized transform coefficients; the context can be based on adjacent coding blocks, and the encoding unit can be used to encode the information indicating the determined intra prediction mode and output the code stream of the video signal.
  • the decoded image buffer unit 110 is used to store the reconstructed video coding blocks for prediction reference. As the video image coding proceeds, new reconstructed video coding blocks are continuously generated, and these reconstructed video coding blocks are all stored in the decoded image buffer unit 110.
  • the video coding method in this embodiment of the present application is mainly applied to the coding control part of the encoder 10, for example the coding unit (Coding Unit, CU) division, the intra-frame prediction unit 103, the motion compensation unit 104, and the motion estimation unit 105 shown in FIG. 2. That is to say, the video encoding method of the embodiment of the present application is mainly used to determine encoding parameters, so as to perform encoding according to the determined encoding parameters.
  • the coding parameters may include a CU division mode, and an intra-frame prediction mode or an inter-frame prediction mode determined for the CU.
  • An embodiment of the present application provides a video encoding method, and the method is applied to a video encoding device, that is, an encoder.
  • the functions implemented by the method can be implemented by the processor in the encoder calling a computer program, and of course the computer program can be stored in a memory.
  • the encoder includes at least a processor and a memory.
  • Referring to FIG. 3, it shows a schematic flowchart of a video encoding method provided by an embodiment of the present application. As shown in FIG. 3, the method may include:
  • S301: Determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
  • S302: Determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • S303: Determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; and determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
  • S304: Determine a target distortion value according to the first distortion value and the second distortion value;
  • S305: Using the target Lagrangian multiplier and the target distortion value, determine the encoding parameters of the video to be encoded, and encode the video to be encoded.
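  • steps S302 to S305 can be sketched as a search over candidate encoding parameters that minimizes a rate-distortion cost of the form $J = D + \lambda R$. This section does not state how the two multipliers or the two distortion values are combined, so the convex weighted sum used here is purely an assumption for illustration, as are the function names, the dictionary keys, and the `weight` parameter.

```python
def combine(first, second, weight):
    """Hypothetical combination rule: a convex weighted sum of two values."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * first + (1.0 - weight) * second


def select_encoding_parameters(candidates, lambda_1, lambda_2, weight=0.5):
    """Return (name, cost) of the candidate with the lowest target RD cost.

    lambda_1 and lambda_2 are the multipliers obtained in S301.
    """
    # S302: target Lagrangian multiplier from the first and second multipliers.
    target_lambda = combine(lambda_1, lambda_2, weight)
    best_name, best_cost = None, float("inf")
    for cand in candidates:
        # S303 and S304: target distortion from semantic and numerical distortion.
        target_d = combine(cand["d_semantic"], cand["d_numeric"], weight)
        # S305: rate-distortion cost J = D + lambda * R for this candidate.
        cost = target_d + target_lambda * cand["rate"]
        if cost < best_cost:
            best_name, best_cost = cand["name"], cost
    return best_name, best_cost
```

  • with a weighted sum, the target cost equals the same weighted sum of the two individual costs $D_1 + \lambda_1 R$ and $D_2 + \lambda_2 R$, which is one reason this form is convenient for a sketch.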
  • the video coding method in this embodiment of the present application may be applicable to an encoder of the H.266/VVC standard, an encoder of the H.265/HEVC standard, or even an encoder of another standard, such as an encoder for AOMedia Video 1 (AV1), the first-generation video coding standard developed by the Alliance for Open Media; the embodiment of this application does not impose any limitation.
  • the rate-distortion optimization algorithm used in the video coding method of the embodiment of the present application considers both the first distortion metric based on the semantic distortion metric and the second distortion metric based on the numerical error metric, so that the rate-distortion optimization can be a multi-distortion-criterion rate-distortion optimization algorithm for human-machine vision. That is to say, in video coding, in addition to the second Lagrangian multiplier and the second distortion value derived using the related technical solution, the embodiment of the present application also defines a semantic distortion metric for the human-machine vision application scenario of video semantic segmentation, and then derives the calculation formulas for the corresponding first Lagrangian multiplier and first distortion value.
  • the target Lagrangian multiplier can be determined according to the first Lagrangian multiplier and the second Lagrangian multiplier, and the target distortion value can be determined according to the first distortion value determined by the first distortion metric criterion and the second distortion value determined by the second distortion metric criterion; in this way, after the encoding parameters of the video to be encoded are determined according to the target Lagrangian multiplier and the target distortion value, encoding the video with these parameters can improve the semantic segmentation accuracy and the fidelity of the reconstructed video, and can also reduce the coding bit rate, thereby shortening the time required for coding, improving coding speed, and improving coding efficiency.
  • the pre-parameter of the video to be encoded may include a quantization parameter (Quantization Parameter, QP).
  • the determining the pre-parameters of the video to be encoded may include:
  • a quantization parameter of the coding unit in the video to be coded is determined; wherein, the coding unit may include at least one of the following: a picture, a slice (Slice), a sub-picture (Sub-picture), a tile (tile), and a coding block.
  • the quantization parameter may be the quantization step size of the quantizer in the encoder, or the index number value corresponding to the quantization step size of the quantizer in the encoder.
  • the determining the first Lagrangian multiplier according to the pre-parameter may include:
  • determining parameters of a first calculation model, the first calculation model representing the correspondence between the first Lagrangian multiplier and the quantization parameter;
  • the first Lagrangian multiplier is determined according to the quantization parameter and the first calculation model.
  • the determining the parameters of the first calculation model may include:
  • in the first calculation model, the first Lagrangian multiplier is set equal to a weighted value of an exponential power of the quantization parameter;
  • the first calculation model parameters include a first exponent parameter indicating the exponential power and a first weighting coefficient indicating the weighting.
  • the calculation formula of the first calculation model is as follows: $\lambda_{miou} = 2.30422 \times 10^{-8} \cdot QP^{6.3612072}$ (5)
  • Equation (5) is the first calculation model, which is used to represent the correspondence between the first Lagrangian multiplier $\lambda_{miou}$ and the quantization parameter $QP$.
  • the first calculation model parameters may include a first exponent parameter (i.e., 6.3612072 in the formula) and a first weighting coefficient (i.e., $2.30422 \times 10^{-8}$ in the formula).
  • the parameters of the first calculation model may be preset values, or may be obtained by fitting a large amount of test data from a test video, which is not limited herein.
  • the method may further include:
  • the first calculation model parameter is set to a preset value.
  • the first exponent parameter may be set to 6.3612072, and the first weighting coefficient may be set to $2.30422 \times 10^{-8}$.
  • the first calculation model can be obtained, so as to determine the first Lagrangian multiplier according to the quantization parameter.
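  • with the preset values above, the first calculation model reduces to a single power-law evaluation. A minimal sketch follows; the function and constant names are assumptions, and only the two numeric values come from the text.

```python
FIRST_WEIGHTING_COEFFICIENT = 2.30422e-8  # first weighting coefficient
FIRST_EXPONENT_PARAMETER = 6.3612072      # first exponent parameter


def first_lagrangian_multiplier(qp,
                                weight=FIRST_WEIGHTING_COEFFICIENT,
                                exponent=FIRST_EXPONENT_PARAMETER):
    """First calculation model: a weighted exponential power of QP."""
    if qp <= 0:
        raise ValueError("quantization parameter must be positive")
    return weight * qp ** exponent


if __name__ == "__main__":
    # Preset QP values mentioned elsewhere in the description.
    for qp in (22, 27, 32, 37):
        print(qp, first_lagrangian_multiplier(qp))
```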
  • the method may further include:
  • a first relationship function between the first distortion value and the bit rate of a test video is determined, and its derivative function is obtained; a second relationship function between the bit rate and the quantization parameter is determined; and the first calculation model parameters are determined according to the derivative function and the second relationship function.
  • the test video in this embodiment of the present application may be one or more test videos; for example, the test video may be multiple (e.g., 59) test video sequences from the large-scale urban scene dataset Cityscapes.
  • the first relationship function between the first distortion value and the bit rate of the test video can be determined, and the first relationship function is shown in the following formula:
  • $D_{miou}$ represents the first distortion value
  • $R$ represents the code rate
  • the value of $\lambda_{miou}$ is the negative of the slope of the tangent to the curve (so that $\lambda_{miou} > 0$), that is, the negative of the derivative function of the curve.
  • the average bit rate of the reconstructed video under different quantization parameters can be calculated from the encoded files.
  • a large amount of experimental test data can be used to fit the first relationship function between the first distortion value ($D_{miou}$) and the bit rate ($R$); the fitting curve is shown in FIG. 4.
  • the derivative function of the first relation function can be obtained by performing the derivative operation on the formula (6), and the derivative function is used to represent the corresponding relationship between the first Lagrangian multiplier and the code rate.
  • the derivative function is as follows,
  • a second relationship function between the bit rate (R) and the quantization parameter (QP) can be determined by fitting using a large amount of experimental test data.
  • the fitting curve is shown in FIG. 5 .
  • the second relation function is as follows,
  • combining Equation (7) and Equation (8), that is, substituting Equation (8) into Equation (7), gives the functional relationship between $\lambda_{miou}$ and $QP$, i.e., the relationship between the first Lagrangian multiplier and the quantization parameter.
  • this yields the first calculation model shown in formula (5); the parameters of the first calculation model are thus determined, so that the first Lagrangian multiplier can be determined according to the quantization parameter.
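  • the substitution step can be written out under the assumption that both fitted functions are power laws, which matches the power-law forms of formulas (5) and (10). Formulas (6) to (8) are not reproduced in this text, so the symbols $\alpha$, $\beta$, $c$ and $d$ below are placeholder names, and the value of $d$ is an inference from the published constants rather than a figure stated in the description.

```latex
\begin{align}
\lambda_{miou}(R) &= -\frac{\mathrm{d} D_{miou}}{\mathrm{d} R} = \alpha R^{\beta},
  & \alpha &= 0.17364347,\ \beta = -1.7553 \\
R(QP) &= c \cdot QP^{d} \\
\lambda_{miou}(QP) &= \alpha \left( c \cdot QP^{d} \right)^{\beta}
  = \alpha c^{\beta} \cdot QP^{\beta d} \\
\beta d = 6.3612072 \;&\Rightarrow\; d = \frac{6.3612072}{-1.7553} = -3.624,
  \qquad \alpha c^{\beta} = 2.30422 \times 10^{-8}
\end{align}
```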
  • the calculation formula of the first Lagrange multiplier can also be modified into other functional forms.
  • for example, the functional relationship of the above formula (8) can also be fitted in an e-exponential form; the corresponding formula (5), that is, the calculation formula of the first Lagrangian multiplier, would then also be in an e-exponential form of the QP, which is not limited here.
  • the determining the second Lagrangian multiplier according to the pre-parameter may include:
  • the second Lagrangian multiplier is determined according to a preset third calculation model; wherein, the third calculation model represents the corresponding relationship between the second Lagrangian multiplier and the quantization parameter.
  • the determining the third calculation model parameter may include:
  • the second Lagrangian multiplier is set equal to a weighted value of an exponential power of 2;
  • the third calculation model parameter includes a third exponent parameter indicating the power of the exponent and a third weighting coefficient indicating the weighting, the third exponent parameter being related to a quantization parameter.
  • Equation (9) is the third calculation model, which is used to represent the correspondence between the second Lagrangian multiplier and the quantization parameter; that is, the second Lagrangian multiplier equals $0.57 \times 2^{(QP-12)/3}$.
  • the third calculation model parameters may include a third exponent parameter (i.e., $(QP-12)/3$ in the formula) and a third weighting coefficient (i.e., 0.57 in the formula), and the value of the third exponent parameter is related to the quantization parameter (QP).
  • the parameters of the third calculation model may be a preset value; it may also be obtained by fitting according to a large amount of experimental test data of the test video, which is not limited here.
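  • the third calculation model can be evaluated directly from the quantization parameter. In this minimal sketch the function name is an assumption; the weighting coefficient 0.57 and the exponent $(QP-12)/3$ are the values given above.

```python
def second_lagrangian_multiplier(qp, weight=0.57):
    """Third calculation model: weight * 2 ** ((QP - 12) / 3)."""
    return weight * 2 ** ((qp - 12) / 3)


if __name__ == "__main__":
    # The multiplier doubles every time QP increases by 3.
    for qp in (22, 27, 32, 37):
        print(qp, second_lagrangian_multiplier(qp))
```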
  • the determining the quantization parameter of the coding unit in the to-be-coded video may include: using a rate control manner to determine the quantization parameter of the coding unit in the to-be-coded video.
  • the determining the quantization parameter of the coding unit in the to-be-coded video may include: setting the quantization parameter to a preset value.
  • the quantization parameter in the video to be encoded may be set to a preset value, such as 22, 27, 32, 37, and so on.
  • the quantization parameter can also be determined by rate control; specifically, current rate control algorithms mainly control the code stream by adjusting the quantization parameter, so the required quantization parameter can be obtained by controlling the bit rate.
  • the pre-parameters of the video to be encoded may include a quantization parameter and a target bit rate.
  • the determining the pre-parameters of the video to be encoded may include:
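  • determining a quantization parameter and a target bit rate of the coding unit in the video to be encoded; wherein the coding unit may include at least one of the following: a picture, a slice (Slice), a sub-picture (Sub-picture), a tile (Tile), and a coding block.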
  • the quantization parameter may be the quantization step size of the quantizer in the encoder, or the index number value corresponding to the quantization step size of the quantizer in the encoder.
  • the determining the first Lagrangian multiplier according to the pre-parameter may include:
  • the second calculation model representing the correspondence between the first Lagrange multiplier and the code rate
  • the first Lagrangian multiplier is determined according to the target code rate and the second calculation model.
  • the determining the target bit rate of the coding unit in the to-be-encoded video may include: determining the target bit-rate of the encoding unit in the to-be-encoded video by using a bit allocation method.
  • the target bit rate for the coding unit in the video to be encoded can be obtained by way of bit allocation.
  • the target bit rate can be dynamically adjusted according to the number of bits consumed by the coding unit in the video to be encoded, so as to ensure real-time and accurate bit allocation.
  • the determining the parameters of the second calculation model may include:
  • the first Lagrangian multiplier is set to a weighted value equal to the exponential power of the target code rate
  • the second calculation model parameter includes a second exponent parameter indicating the power of the exponent and a second weighting coefficient indicating the weighting.
  • Equation (10) is the second calculation model, which is used to represent the correspondence between the first Lagrangian multiplier and the code rate.
  • the second calculation model parameter may include a second exponent parameter (i.e., -1.7553 in the formula) and a second weighting coefficient (i.e., 0.17364347 in the formula).
  • the parameters of the second calculation model may be preset values, or may be obtained by fitting a large amount of experimental test data from test videos, which is not limited herein.
  • the method may further include:
  • the second calculation model parameter is set to a preset value.
  • the second exponent parameter may be set to -1.7553, and the second weighting coefficient may be set to 0.17364347. Once the second exponent parameter and the second weighting coefficient are determined, the second calculation model is obtained, and the first Lagrangian multiplier can be determined from the target code rate.
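  • A minimal sketch of the second calculation model, assuming Equation (10) has the form λ_miou = 0.17364347 · R^(-1.7553), which is reconstructed here from the stated weighting coefficient and exponent parameter. The function name `lambda_miou` is hypothetical, and the unit of the target code rate R is not stated in this passage, so it is left to the caller.

```python
def lambda_miou(target_rate: float,
                weight: float = 0.17364347,
                exponent: float = -1.7553) -> float:
    """First Lagrangian multiplier, assumed form: weight * R ** exponent."""
    if target_rate <= 0:
        raise ValueError("target code rate must be positive")
    return weight * target_rate ** exponent


if __name__ == "__main__":
    # Illustrative rates only; their unit is an assumption left open here.
    for rate in (0.5, 1.0, 2.0):
        print(f"R={rate}: lambda_miou={lambda_miou(rate):.6f}")
```

  • Because the exponent is negative, the multiplier decreases as the target code rate grows.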
  • the method may further include:
  • a derivative operation is performed on the first relational function to determine the second calculation model parameter.
  • the test video here may be one or more test videos; for example, it may be multiple (e.g., 59) test video sequences from the large-scale urban scene dataset (Cityscapes).
  • the first relationship function between the first distortion value and the bit rate of the test video can be determined, and the first relationship function is shown in the above formula (6).
  • performing the derivative operation on formula (6) yields the derivative function of the first relation function; this derivative function represents the correspondence between the first Lagrangian multiplier and the code rate, which is the second calculation model shown in formula (10). The parameters of the second calculation model are thereby determined, so that the first Lagrangian multiplier can be determined according to the target code rate.
  • the value of λ_miou is the negated slope of the tangent to the curve (λ_miou > 0), that is, the negative of the derivative function of the curve.
  • the average bit rate of the reconstructed video under different quantization parameters can be calculated according to the encoded files.
  • the relationship between the first distortion value (D_miou) and the bit rate (R) can be determined by fitting a large amount of experimental test data; the fitted curve, shown in Fig. 4, gives the first relation function.
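  • The fit-then-differentiate procedure can be sketched as follows. This is only an illustration: formula (6) is not reproduced in this passage, so the sketch assumes the first relation function is a power law D = a · R^b, and the sample data are synthetic. The helper names `fit_power_law` and `multiplier_model` are hypothetical.

```python
import math


def fit_power_law(rates, distortions):
    """Least-squares fit of D = a * R**b in log-log space; returns (a, b)."""
    xs = [math.log(r) for r in rates]
    ys = [math.log(d) for d in distortions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def multiplier_model(a, b):
    """lambda = -dD/dR = (-a * b) * R**(b - 1); returns (weight, exponent)."""
    return -a * b, b - 1.0


if __name__ == "__main__":
    # Synthetic (made-up) rate-distortion samples that follow D = 0.23 * R**-0.75.
    rates = [0.5, 1.0, 2.0, 4.0, 8.0]
    distortions = [0.23 * r ** -0.75 for r in rates]
    a, b = fit_power_law(rates, distortions)
    weight, exponent = multiplier_model(a, b)
    print(f"D = {a:.4f} * R^{b:.4f}  ->  lambda = {weight:.4f} * R^{exponent:.4f}")
```

  • The negation in `multiplier_model` reflects that the distortion-rate curve is decreasing, so the negated derivative is positive.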
  • the determining the second Lagrangian multiplier according to the pre-parameter may include:
  • the second Lagrangian multiplier is determined according to a preset third calculation model; wherein, the third calculation model represents the corresponding relationship between the second Lagrangian multiplier and the quantization parameter.
  • the determining the third calculation model parameter may include:
  • the second Lagrangian multiplier is set equal to a weighted value of an exponential power of 2;
  • the third calculation model parameter includes a third exponent parameter indicating the power of the exponent and a third weighting coefficient indicating the weighting, the third exponent parameter being related to a quantization parameter.
  • Equation (9) is the third calculation model, which is used to represent the correspondence between the second Lagrangian multiplier and the quantization parameter.
  • the third calculation model parameter may include a third exponent parameter (i.e., (QP-12)/3 in the formula) and a third weighting coefficient (i.e., 0.57 in the formula); the value of the third exponent parameter depends on the quantization parameter (QP).
  • the parameters of the third calculation model may be preset values, or may be obtained by fitting a large amount of experimental test data from test videos, which is not limited here.
  • the determining the quantization parameter of the coding unit in the to-be-coded video may include: using a rate control manner to determine the quantization parameter of the coding unit in the to-be-coded video.
  • the determining the quantization parameter of the coding unit in the to-be-coded video may include: setting the quantization parameter to a preset value.
  • the quantization parameter in the video to be encoded may be set to a preset value, such as 22, 27, 32, 37, and so on.
  • the quantization parameter can also be determined by rate control; specifically, current rate control algorithms mainly control the code rate by adjusting the quantization parameter, so the required quantization parameter can be obtained by controlling the code rate.
  • in addition to determining the first Lagrangian multiplier and the second Lagrangian multiplier, the distortion values also need to be determined.
  • the first distortion value is obtained based on the first distortion metric criterion
  • the second distortion value is obtained based on the second distortion metric criterion.
  • the first distortion metric criterion may be a semantic distortion metric criterion.
  • In order to improve the semantic segmentation accuracy of reconstructed video, it is first necessary to define a semantic distortion metric. Specifically, multiple quantization parameters can be selected, and VVC encoding is then performed on multiple (for example, 59) test video sequences from the large-scale urban scene dataset (Cityscapes) under the random access (RA) configuration. Video semantic segmentation is performed on the video before and after encoding, so that the accuracy of the semantic segmentation result can be calculated against the corresponding annotation data.
  • the accuracy of semantic segmentation is usually expressed by the mean Intersection over Union (mIoU), which is the average of the Intersection over Union (IoU) over all categories.
  • IoU is a detection evaluation function: it is the overlap rate between the generated prediction window and the real window, that is, the ratio of the intersection of the detection result area (Detection Result) and the ground truth area (Ground Truth) to the union of the two. This ratio is the semantic accuracy (represented by IoU).
  • the determining the first distortion value according to the first distortion metric criterion may include:
  • based on the test video, semantic segmentation is performed on the test video to determine the semantic accuracy of one or more categories;
  • distortion measurement is performed on the target semantic accuracy by using the fourth calculation model to obtain the first distortion value.
  • determining the target semantic accuracy according to the semantic accuracy of one or more categories may include:
  • a weighted sum of the semantic accuracy of the one or more categories is calculated, and the resulting weighted sum is determined as the target semantic accuracy.
  • in one specific implementation, the weight is set to 1; in this case, the average value of the semantic accuracy of the one or more categories is calculated, and the obtained average is determined as the target semantic accuracy.
  • the two sets can represent the predicted value and the real value respectively, that is, A_pred is the predicted segmentation result area and A_true is the labeled segmentation result area;
  • the semantic accuracy of each category can be represented by IoU, which is calculated as follows:
$$\mathrm{IoU} = \frac{|A_{pred} \cap A_{true}|}{|A_{pred} \cup A_{true}|}$$
  • the target semantic accuracy can be obtained by taking the average value, which can be expressed as mIoU.
  • mIoU refers to the average IoU of all categories, and its value ranges from 0 to 1; the larger the value, the higher the semantic accuracy.
  • given the IoU of n classes, mIoU is calculated as follows:
$$\mathrm{mIoU} = \frac{1}{n}\sum_{i=1}^{n}\mathrm{IoU}_i$$
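  • The IoU and mIoU definitions above can be sketched as follows, under the assumption that a segmentation result is given as a flat sequence of per-pixel class labels. The names `iou` and `mean_iou` are hypothetical, and returning 0 for a class that is absent from both maps is a convention chosen here, not something stated in the text.

```python
def iou(a_pred: set, a_true: set) -> float:
    """Intersection over Union of a predicted area and a labeled area."""
    union = a_pred | a_true
    if not union:
        return 0.0  # assumed convention for a class absent from both maps
    return len(a_pred & a_true) / len(union)


def mean_iou(pred_labels, true_labels, classes) -> float:
    """Average IoU over the given classes (equal weights)."""
    scores = []
    for c in classes:
        a_pred = {i for i, label in enumerate(pred_labels) if label == c}
        a_true = {i for i, label in enumerate(true_labels) if label == c}
        scores.append(iou(a_pred, a_true))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy six-pixel example with two classes (0 and 1).
    pred = [0, 0, 1, 1, 1, 0]
    truth = [0, 1, 1, 1, 0, 0]
    print(mean_iou(pred, truth, classes=[0, 1]))
```

  • In the toy example each class has two pixels in the intersection and four in the union, so both per-class IoU values, and therefore the mIoU, equal 0.5.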
  • the use of the fourth calculation model to perform distortion measurement on the target semantic accuracy to obtain the first distortion value includes:
  • the fourth calculation model parameter representing the correspondence between the first distortion value and the target semantic accuracy
  • the first distortion value is obtained according to the target semantic accuracy and the fourth calculation model.
  • the determining of the fourth calculation model parameter may include:
  • the first distortion value is set as a weighted value equal to the logarithm of the target semantic accuracy
  • the fourth calculation model parameter includes a base parameter indicating the logarithm and a fourth weighting coefficient parameter indicating the weighting.
  • the fourth calculation model parameter is set as a preset value.
  • a semantic distortion metric (i.e., the first distortion value, represented by D_miou) is defined, and its calculation formula is shown in formula (13).
  • the formula (13) is the fourth calculation model, which is used to represent the correspondence between the first distortion value (D miou ) and the target semantic accuracy (mIoU).
  • the fourth calculation model parameter may include a base parameter (that is, the base of ln in the formula is 10) and a fourth weighting coefficient parameter (that is, -10 in the formula), where the weighting coefficient is a preset magnification applied to the logarithm of mIoU.
  • the determination of the parameters of the fourth calculation model may be a preset value, or may be obtained by fitting according to a large amount of test data of a test video, which is not limited here.
  • the natural logarithmic function maps the finite mIoU value to an infinite range, and the multiplied coefficient amplifies the obtained value to match the distortion size in the rate-distortion optimization algorithm. In this way, when mIoU tends to 0, D miou tends to infinity; when mIoU tends to 1, D miou tends to 0.
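  • A minimal sketch of the fourth calculation model, assuming formula (13) has the form D_miou = -10 · log(mIoU). The text mentions both a natural logarithm and a base of 10, so the base is left as an explicit argument here rather than fixed; the default of e is an assumption, and the function name `semantic_distortion` is hypothetical.

```python
import math


def semantic_distortion(miou: float,
                        weight: float = -10.0,
                        base: float = math.e) -> float:
    """First distortion value as a weighted logarithm of the target mIoU."""
    if not 0.0 < miou <= 1.0:
        raise ValueError("mIoU must lie in (0, 1]")
    return weight * math.log(miou, base)


if __name__ == "__main__":
    for miou in (0.2, 0.5, 0.8, 1.0):
        print(f"mIoU={miou}: D_miou={semantic_distortion(miou):.4f}")
```

  • With either base the behavior described above holds: the distortion is 0 at mIoU = 1 and grows without bound as mIoU approaches 0.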
  • the first distortion value may also be related to the target mean square error of the coding unit in the video to be encoded.
  • the determining the first distortion value according to the first distortion metric criterion may include:
  • the fifth calculation model representing a third relationship function between the first distortion value and the mean square error
  • the target mean square error of the coding unit in the video to be encoded is determined, and the first distortion value is determined according to the target mean square error and the fifth calculation model.
  • the reconstructed video is obtained by performing video decoding and reconstruction on the encoded video.
  • video reconstruction can be performed on the encoded video under a given quantization parameter to obtain the reconstructed video under that quantization parameter; from the reconstructed video and the original video, the mean squared error (MSE) of the reconstructed video under that quantization parameter can be obtained.
  • MSE can evaluate the degree of change of the data. The smaller the value of MSE, the better the accuracy of the prediction model in describing the experimental data.
  • the determining the parameters of the fifth calculation model may include:
  • the first distortion value is set equal to the product of the target mean square error and the first parameter factor, plus the second parameter factor;
  • the fifth calculation model parameter includes the first parameter factor and the second parameter factor.
  • Equation (14) is the fifth calculation model, which is used to represent the corresponding relationship between the first distortion value and the mean square error.
  • the fifth calculation model parameter may include a first parameter factor (ie, 0.6276 in the formula) and a second parameter factor (ie, 3.48 in the formula).
  • the parameters of the fifth calculation model may be preset values, or may be obtained by fitting a large amount of experimental test data from test videos, which is not limited herein.
  • the method may further include:
  • the fifth calculation model parameter is set to a preset value.
  • the first parameter factor may be set to 0.6276, and the second parameter factor may be set to 3.48. After the first parameter factor and the second parameter factor are determined, a fifth calculation model can be obtained, so as to determine the first distortion value according to the target mean square error.
  • the method may further include:
  • the fifth calculation model parameter is determined according to the third relational function.
  • the test video here may be one or more test videos; for example, it may be multiple (e.g., 59) test video sequences from the large-scale urban scene dataset (Cityscapes).
  • the first distortion metric criterion is applied to the test video, and the average MSE of the reconstructed video under different quantization parameters is computed from the encoded files.
  • the relationship between the first distortion value (D_miou) and the MSE can be determined by fitting a large amount of experimental test data; the fitted curve, shown in FIG. 6, is linear, and it gives the fifth calculation model.
  • the determining the pre-parameter of the video to be encoded may further include: determining the target mean square error of the coding units in the video to be encoded. In this way, after the fifth calculation model is obtained, the first distortion value can be determined according to the target mean square error and the fifth calculation model shown in formula (14).
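  • A minimal sketch of the fifth calculation model, assuming formula (14) has the linear form D_miou = 0.6276 · MSE + 3.48 as described by the two parameter factors. The helper names `mse` and `distortion_from_mse` are hypothetical, and the sample values are made up for illustration.

```python
def mse(original, reconstructed) -> float:
    """Mean squared error between two equally long sample sequences."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    return total / len(original)


def distortion_from_mse(target_mse: float,
                        first_factor: float = 0.6276,
                        second_factor: float = 3.48) -> float:
    """First distortion value: first_factor * MSE + second_factor."""
    return first_factor * target_mse + second_factor


if __name__ == "__main__":
    # Made-up pixel samples of one coding unit.
    original = [100, 102, 98, 101]
    reconstructed = [101, 100, 98, 103]
    unit_mse = mse(original, reconstructed)
    print(unit_mse, distortion_from_mse(unit_mse))
```

  • This route estimates the semantic distortion from the target mean square error alone, so no semantic segmentation has to be run on the coding unit during encoding.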
  • the first distortion value may also be determined according to the difference between the first mIoU value and the second mIoU value, where the first mIoU value is the semantic segmentation result of the video before encoding and the second mIoU value is the semantic segmentation result of the encoded video; the embodiment of the present application does not make any specific limitation.
  • the second distortion metric criterion may be a numerical error criterion.
  • the determining the second distortion value according to the second distortion metric criterion may include:
  • the coding unit includes at least one of the following: a picture, a slice (Slice), a sub-picture (Sub-picture), a tile (tile), and a coding block;
  • the second distortion value is determined according to the reconstructed value and the original value of the coding unit
  • the numerical error criterion is one of the following: the Sum of Absolute Differences (SAD) criterion, the Mean Absolute Deviation (MAD) criterion, the Sum of Square Error (SSE) criterion, and the Mean Squared Error (MSE) criterion. It should be noted that the numerical error criterion is not limited to these criteria, and may also be other criteria, which are not specifically limited in the embodiments of the present application.
  • the second distortion value is represented by SSE, and its calculation formula is as follows:
$$\mathrm{SSE} = \sum_{x=1}^{M}\sum_{y=1}^{N}\bigl[f(x,y) - g(x,y)\bigr]^{2}$$
  • where M and N represent the horizontal spatial resolution and vertical spatial resolution of the video, respectively, f(x, y) represents the original pixel value at pixel position (x, y), and g(x, y) represents the reconstructed pixel value at pixel position (x, y).
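  • The SSE computation can be sketched as follows, assuming the original and reconstructed coding units are given as equally sized two-dimensional lists of pixel values. The function name `sse` and the sample blocks are hypothetical.

```python
def sse(original, reconstructed) -> float:
    """Sum of squared differences between original f and reconstructed g."""
    if len(original) != len(reconstructed):
        raise ValueError("blocks must have the same number of rows")
    total = 0
    for row_f, row_g in zip(original, reconstructed):
        if len(row_f) != len(row_g):
            raise ValueError("rows must have the same length")
        total += sum((f - g) ** 2 for f, g in zip(row_f, row_g))
    return total


if __name__ == "__main__":
    # Made-up 2x2 block: only two pixels differ, by 1 and by 2.
    f_block = [[1, 2], [3, 4]]
    g_block = [[1, 3], [5, 4]]
    print(sse(f_block, g_block))
```

  • Dividing the result by the number of pixels would give the MSE of the same block.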
  • the target Lagrangian multiplier (represented by λ) can be calculated from λ_miou and λ_SSE
  • the target distortion value (represented by D) can be calculated from D_miou and SSE.
  • the determining a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier may include:
  • the first preset parameter is used to control weight values corresponding to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • the first Lagrangian multiplier and the second Lagrangian multiplier are weighted and calculated by using the first preset parameter to obtain the target Lagrangian multiplier.
  • the first preset parameter can control the weight values corresponding to the first Lagrangian multiplier and the second Lagrangian multiplier.
  • the determining the first preset parameter may include:
  • the first preset parameter is set according to the configuration information of the encoder.
  • the method can also include:
  • the target Lagrangian multiplier is set equal to a weighted sum of the first Lagrangian multiplier and the second Lagrangian multiplier, where k is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first Lagrangian multiplier is set equal to 1-k, and the weighting coefficient of the second Lagrangian multiplier is set equal to k.
  • when the weighting coefficient of the second Lagrangian multiplier is set to k, the weighting coefficient of the first Lagrangian multiplier can be set to 1-k; in this way, the calculation formula of the target Lagrangian multiplier is as follows:
$$\lambda = (1-k)\,\lambda_{miou} + k\,\lambda_{SSE}$$
  • where λ represents the target Lagrangian multiplier, λ_miou represents the first Lagrangian multiplier, λ_SSE represents the second Lagrangian multiplier, and 1-k and k represent the weighting coefficients of the first Lagrangian multiplier and the second Lagrangian multiplier, respectively.
  • k may be a constant within the range of 0 to 1.
  • the value of k may be equal to 0.5, may also be 0.75, or may be a variable value (for example, obtained by performing a certain calculation on the current coding unit), which is not specifically limited in this embodiment of the present application.
  • a typical value of k can be equal to 0.75.
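  • The weighted combination of the two multipliers can be sketched as follows; the function name `target_lambda` is hypothetical, the default k = 0.75 is the typical value named in the text, and the multiplier values in the example are made up.

```python
def target_lambda(lambda_miou: float, lambda_sse: float, k: float = 0.75) -> float:
    """Target multiplier: (1 - k) * lambda_miou + k * lambda_sse, with 0 <= k <= 1."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    return (1.0 - k) * lambda_miou + k * lambda_sse


if __name__ == "__main__":
    # Made-up multiplier values; k sweeps from purely semantic to purely SSE.
    for k in (0.0, 0.5, 0.75, 1.0):
        print(f"k={k}: lambda={target_lambda(2.0, 4.0, k)}")
```

  • At k = 0 only the first (semantic) multiplier is used, and at k = 1 only the second (SSE) multiplier is used.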
  • the determining a target distortion value according to the first distortion value and the second distortion value includes:
  • the second preset parameter is used to control the weight value corresponding to the first distortion value and the second distortion value
  • the first distortion value and the second distortion value are weighted and calculated by using the second preset parameter to obtain the target distortion value.
  • the second preset parameter can control the weight values corresponding to the first distortion value and the second distortion value.
  • the determining the second preset parameter may include:
  • the second preset parameter is set according to the configuration information of the encoder.
  • the method can also include:
  • the target distortion value is set equal to the weighted sum of the first distortion value and the second distortion value, where m is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first distortion value is set equal to 1-m, and the weighting coefficient of the second distortion value is set equal to m.
  • when the weighting coefficient of the second distortion value is set to m, the weighting coefficient of the first distortion value can be set to 1-m; in this way, the calculation formula of the target distortion value is as follows:
$$D = (1-m)\,D_{miou} + m\,\mathrm{SSE}$$
  • where D represents the target distortion value, D_miou represents the first distortion value, SSE represents the second distortion value, and 1-m and m represent the weighting coefficients of the first distortion value and the second distortion value, respectively.
  • m may be a constant within the range of 0 to 1.
  • the value of m may be equal to 0.5, may also be 0.75, or may be a variable value (for example, obtained by performing a certain calculation on the current coding unit), which is not specifically limited in this embodiment of the present application.
  • a typical value of m can be equal to 0.75.
  • the values of the first preset parameter and the second preset parameter may be set to be the same or different.
  • when the values of the first preset parameter and the second preset parameter are the same, both can be represented by a single shared parameter.
  • in this case, the calculation formulas of the target Lagrangian multiplier and the target distortion value take the same weighted-sum form, with k and m both replaced by the shared parameter.
  • the shared parameter is a constant in the range of 0 to 1; it controls both the respective weights of the first Lagrangian multiplier and the second Lagrangian multiplier and the respective weights of the semantic distortion (i.e., the first distortion value) and the fidelity distortion (i.e., the second distortion value).
  • the shared parameter is typically set to 0.75.
  • λ_miou represents the machine-oriented quality
  • λ_SSE represents the subjective quality viewed by the human eye
  • the shared parameter adjusts the balance between the subjective quality viewed by the human eye and the machine-oriented quality. For example, if the shared parameter is equal to 1, the target distortion value reflects entirely the subjective quality viewed by the human eye; if it is equal to 0, the target distortion value reflects entirely the machine-oriented quality.
  • the value of the shared parameter can be set through the configuration information of the encoder.
  • one implementation sets the shared parameter directly according to the application requirements, such as the cases of 0 and 1 described above; another implementation sets the encoder to a working mode. For example, if the encoder is set to the human-eye working mode, it sets the shared parameter to 1; if it is set to the machine working mode, it sets the shared parameter to 0; if it is set to the human-machine hybrid mode, it determines the value adaptively, for example by pre-encoding the video to be encoded in the preprocessing stage and then estimating the value from the pre-encoding result.
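  • The mode-dependent choice of the weight can be sketched as follows. The mode names "human", "machine" and "hybrid", the `estimate` callback standing in for the pre-encoding analysis, and the fallback to 0.75 when no estimator is supplied are all assumptions made for illustration; the text does not specify how the adaptive estimate is computed.

```python
from typing import Callable, Optional


def select_weight(mode: str, estimate: Optional[Callable[[], float]] = None) -> float:
    """Map an encoder working mode to the shared weight in [0, 1]."""
    if mode == "human":
        return 1.0  # target distortion is entirely the SSE (human-eye) term
    if mode == "machine":
        return 0.0  # target distortion is entirely the semantic (machine) term
    if mode == "hybrid":
        if estimate is None:
            return 0.75  # assumed fallback: the typical value from the text
        return min(1.0, max(0.0, estimate()))  # clamp the estimate to [0, 1]
    raise ValueError(f"unknown working mode: {mode}")


def target_distortion(d_miou: float, sse: float, weight: float) -> float:
    """Target distortion: (1 - weight) * D_miou + weight * SSE."""
    return (1.0 - weight) * d_miou + weight * sse


if __name__ == "__main__":
    # Made-up distortion values for one coding unit.
    for mode in ("human", "machine", "hybrid"):
        w = select_weight(mode)
        print(mode, w, target_distortion(5.0, 100.0, w))
```

  • Keeping the estimator as a callback lets the pre-encoding analysis be swapped without touching the weighting logic.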
  • the encoding parameters of the video to be encoded can be determined according to the target Lagrangian multiplier and the target distortion value, so as to encode the video to be encoded.
  • the determining the encoding parameter of the video to be encoded by using the target Lagrangian multiplier and the target distortion value may include:
  • a minimum rate-distortion cost value is selected from the determined rate-distortion cost values, and a candidate encoding parameter corresponding to the minimum rate-distortion cost value is determined as the encoding parameter of the video to be encoded.
  • the encoding parameters include at least a parameter indicating a division manner of the to-be-encoded video and a parameter for constructing a prediction value of an encoded block in the to-be-encoded video.
  • the encoding the video to be encoded may include: writing the encoding parameter into a code stream.
  • the rate-distortion cost function can be constructed; then one or more candidate encoding parameters are used to pre-encode the video to be encoded, so as to determine the rate-distortion cost value corresponding to each candidate encoding parameter.
  • the coding parameters determined in this way are the optimal coding parameters (those with the lowest rate-distortion cost), and coding is then performed with them; in this process, the coding parameters can also be written into the code stream for transmission from the encoder to the decoder, where they are used to restore the original to-be-encoded video.
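  • The selection of the minimum-cost candidate can be sketched as follows. The sketch assumes the conventional cost form J = D + λ · R, which this passage does not spell out; `pre_encode` is a hypothetical callback returning the target distortion value and the rate for one candidate, and the candidate names and numbers are made up.

```python
from typing import Callable, Iterable, Tuple, TypeVar

P = TypeVar("P")


def choose_parameters(candidates: Iterable[P],
                      pre_encode: Callable[[P], Tuple[float, float]],
                      lam: float) -> Tuple[P, float]:
    """Return the candidate with the lowest cost J = D + lam * R, and that cost."""
    best = None
    best_cost = float("inf")
    for candidate in candidates:
        distortion, rate = pre_encode(candidate)  # trial encoding of the candidate
        cost = distortion + lam * rate
        if cost < best_cost:
            best, best_cost = candidate, cost
    if best is None:
        raise ValueError("no candidate encoding parameters given")
    return best, best_cost


if __name__ == "__main__":
    # Made-up (distortion, rate) results for two partitioning choices.
    trials = {"split": (40.0, 10.0), "no_split": (55.0, 4.0)}
    print(choose_parameters(trials, trials.__getitem__, lam=2.0))
    print(choose_parameters(trials, trials.__getitem__, lam=4.0))
```

  • In this toy example a larger multiplier penalizes rate more heavily, so the choice moves from the higher-rate "split" candidate to the lower-rate "no_split" candidate.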
  • the embodiment of the present application uses the VVC
  • the distortion criterion in the rate-distortion optimization process is modified to a weighting of the semantic distortion D_miou and the fidelity distortion SSE, as shown in formula (17) or formula (19) above; the corresponding target Lagrangian multiplier is modified to a weighting of λ_miou and λ_SSE, as shown in formula (16) or formula (18) above. In this way, the rate-distortion process of the VVC standard encoder can be optimized according to the multi-distortion-criteria rate-distortion optimization algorithm for human-machine vision, improving the semantic segmentation accuracy of reconstructed video at a given bit rate while maintaining good fidelity performance.
  • VVC Test Model (VTM)
  • the rate-distortion process in VVC is optimized according to the video encoding method of the embodiment of the present application; different QPs are then selected, and the test video is encoded by the optimized encoder to obtain the encoding bit rate. The resulting reconstructed video is semantically segmented and the segmentation accuracy is calculated.
  • the BD-rate and BD-miou of the reconstructed video compared with the VVC standard encoder can then be calculated, to measure the performance in terms of video semantic accuracy at the same bit rate.
  • the BD-miou and BD-rate of the reconstructed video of the embodiment of the present application compared with the reconstructed video of the VVC standard encoder can be calculated.
  • Table 1 shows the performance of the video coding method of the application embodiment in terms of semantic accuracy.
  • BD-miou represents the improvement in the semantic accuracy of the reconstructed video at the same bit rate: a BD-miou greater than 0 indicates that the semantic accuracy has improved, and a BD-miou less than 0 indicates that it has decreased.
  • BD-rate represents the increase in the coding rate at the same semantic accuracy: a BD-rate greater than 0 indicates that the coding rate increases, and a BD-rate less than 0 indicates that the coding rate decreases, that is, the coding efficiency is improved.
  • the PSNR and coding rate of the reconstructed video are calculated from the encoded files.
  • the BD-rate and BD-PSNR of the reconstructed video compared with the VVC standard encoder are obtained; they measure the performance of the video encoding method of the embodiment of this application relative to the VVC standard encoder in terms of video fidelity at the same bit rate.
  • BD-PSNR represents the increase in reconstructed video fidelity at the same bit rate: a BD-PSNR greater than 0 indicates that the fidelity has increased, and a BD-PSNR less than 0 indicates that it has decreased.
  • BD-rate represents the increase in the coding rate at the same fidelity: a BD-rate greater than 0 indicates that the code rate increases, and a BD-rate less than 0 indicates that the code rate decreases, that is, the coding efficiency is improved.
  • the BD-miou obtained according to the experimental results is 0.0112, indicating that the video coding method of the embodiment of the present application can improve the accuracy of semantic segmentation of reconstructed video under the same bit rate.
  • the overall semantic effect of the embodiments of the present application is better than that of the VVC standard encoder. That is to say, the embodiments of the present application can improve the semantic segmentation accuracy of the reconstructed video under the condition of the same bit rate.
  • the BD-rate obtained according to the experimental results is -24.8673, indicating that the video coding method of the embodiment of the present application can reduce the video coding bit rate under the same semantic accuracy. That is to say, the embodiments of the present application can reduce the code rate with the same semantic accuracy.
  • the BD-PSNR obtained according to the experimental results is 0.0316, indicating that the video coding method of the embodiment of the present application can improve the fidelity of the reconstructed video under the condition of the same bit rate. That is to say, the embodiments of the present application can improve the fidelity of the reconstructed video under the condition of the same bit rate.
  • the BD-rate obtained according to the experimental results is -1.0836, indicating that the video encoding method of the embodiment of the present application can reduce the video encoding bit rate under the same fidelity. That is to say, the embodiments of the present application can reduce the code rate with the same fidelity.
  • the PSNR performance of the reconstructed video is basically not degraded compared with the VVC standard encoder.
  • the subjective performance of the embodiment of the present application is better than that of VVC, which shows that the video coding method of the embodiment of the present application can ensure the fidelity of the reconstructed video while improving the semantic accuracy, and satisfies the demand for subjective viewing of the video. That is to say, the embodiments of the present application can ensure the fidelity of the video while improving the semantic effect.
  • the video encoding method of the embodiment of the present application optimizes the rate-distortion process in VVC, and does not change the video encoding and decoding process and code stream structure, and therefore does not increase the complexity of encoding and decoding.
  • the video coding method of the embodiment of the present application can also reduce the coding bit rate of the video, thereby shortening the time required for coding and improving the coding speed.
  • the embodiment of the present application defines a semantic distortion metric and derives the corresponding first Lagrangian multiplier. Preset parameters (including the first preset parameter and the second preset parameter) adjust the weights of the semantic distortion and the SSE distortion, as well as the weights of the first Lagrangian multiplier and the second Lagrangian multiplier, so as to optimize the rate-distortion process of video coding; in this way, the semantic segmentation accuracy of the reconstructed video can be improved at a given bit rate while good fidelity performance is maintained.
  • This embodiment provides a video encoding method, which is applied to an encoder.
  • the first Lagrangian multiplier and the second Lagrangian multiplier are determined according to the pre-parameters; a target Lagrangian multiplier is determined according to the first Lagrangian multiplier and the second Lagrangian multiplier; a first distortion value is determined according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; a second distortion value is determined according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; a target distortion value is determined according to the first distortion value and the second distortion value; and the target Lagrangian multiplier and the target distortion value are used to determine the encoding parameters of the to-be-encoded video and to encode the to-be-encoded video.
  • the first distortion metric criterion based on the semantic distortion metric and the second distortion metric criterion based on the numerical error metric are jointly considered for rate-distortion optimization in video coding, which adapts well to machine-vision and human-machine-vision application scenarios; at a given bit rate, it can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity performance, thereby improving coding efficiency.
  • FIG. 7 shows a schematic structural diagram of the composition of an encoder 70 provided by an embodiment of the present application.
  • the encoder 70 may include: a determination unit 701, a calculation unit 702 and an encoding unit 703; wherein,
  • a determining unit 701 configured to determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
  • a computing unit 702 configured to determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • the determining unit 701 is further configured to determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; and determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
  • the calculation unit 702 is further configured to determine a target distortion value according to the first distortion value and the second distortion value;
  • the encoding unit 703 is configured to use the target Lagrangian multiplier and the target distortion value to determine encoding parameters of the video to be encoded, and to encode the video to be encoded.
  • the pre-parameter includes a quantization parameter, and the determining unit 701 is further configured to determine a quantization parameter of a coding unit in the to-be-encoded video, wherein the coding unit includes at least one of the following: image, slice, sub-image, tile, coding block.
  • the determining unit 701 is further configured to determine a first calculation model parameter, where the first calculation model represents the corresponding relationship between the first Lagrangian multiplier and the quantization parameter; and determine the first Lagrangian multiplier according to the quantization parameter and the first calculation model.
  • the determining unit 701 is further configured to, in the first calculation model, set the first Lagrangian multiplier equal to a weighted value of the exponential power of the quantization parameter; wherein the first calculation model parameter includes a first exponential parameter indicating the exponential power and a first weighting coefficient indicating the weighting.
  • the determining unit 701 is further configured to set the parameter of the first calculation model to a preset value.
  • the determining unit 701 is further configured to, based on the test video, use the first distortion metric criterion to determine a first relationship function between the first distortion value and the bit rate of the test video, and perform a derivative operation on the first relationship function to determine a derivative function of the first relationship function; and, based on the test video, determine a second relationship function between the bit rate and the quantization parameter, and determine the first calculation model parameter according to the derivative function and the second relationship function.
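The derivation described in this paragraph can be written compactly. The following is only a sketch: the symbols f, g, D_1, R and the sign convention λ = −dD/dR are assumptions borrowed from conventional rate-distortion optimization, not notation taken from the application.

```latex
% Assumed notation: D_1 = first (semantic) distortion value, R = bit rate,
% QP = quantization parameter, \lambda_1 = first Lagrangian multiplier.
\begin{aligned}
D_1 &= f(R) && \text{(first relationship function, fitted on the test video)} \\
R &= g(\mathrm{QP}) && \text{(second relationship function, fitted on the test video)} \\
\lambda_1(\mathrm{QP}) &= -\left.\frac{\mathrm{d} f(R)}{\mathrm{d} R}\right|_{R = g(\mathrm{QP})}
&& \text{(derivative function evaluated at the rate implied by QP)}
\end{aligned}
```

Fitting the right-hand side of the last line to a parametric form in QP would then yield the first calculation model parameters.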
  • the pre-parameter includes a quantization parameter and a target code rate, and the determining unit 701 is further configured to determine a quantization parameter and a target code rate of a coding unit in the to-be-coded video, wherein the coding unit includes at least one of the following: image, slice, sub-image, tile, coding block.
  • the determining unit 701 is further configured to determine a second calculation model parameter, where the second calculation model represents the corresponding relationship between the first Lagrangian multiplier and the code rate; and determine the first Lagrangian multiplier according to the target code rate and the second calculation model.
  • the determining unit 701 is further configured to, in the second calculation model, set the first Lagrangian multiplier equal to a weighted value of the exponential power of the target code rate; wherein the second calculation model parameter includes a second exponent parameter indicating the exponential power and a second weighting coefficient indicating the weighting.
  • the determining unit 701 is further configured to set the parameter of the second calculation model to a preset value.
  • the determining unit 701 is further configured to, based on the test video, use the first distortion metric criterion to determine a first relationship function between the first distortion value and the bit rate of the test video; and perform a derivative operation on the first relationship function to determine the second calculation model parameter.
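As an illustration of how a derivative of the first relationship function could yield the second calculation model parameters, the sketch below assumes a power-law relation D = c · R^(−k) between the first distortion value and the bit rate. That functional form, the sign convention λ = −dD/dR, and all function names are assumptions made for this example, not details confirmed by the application.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(rates: Sequence[float], distortions: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of D = c * R**(-k) in log-log space; returns (c, k).

    Assumes all rates and distortions are strictly positive.
    """
    xs = [math.log(r) for r in rates]
    ys = [math.log(d) for d in distortions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def lambda_model_from_fit(c: float, k: float) -> Tuple[float, float]:
    """Differentiate D(R): lambda(R) = -dD/dR = c*k * R**(-k-1).

    Returns (weighting coefficient, exponent parameter) of the assumed model.
    """
    return c * k, -k - 1.0


def semantic_lambda(rate: float, weight: float, exponent: float) -> float:
    """Evaluate the assumed second calculation model: weight * rate**exponent."""
    return weight * rate ** exponent
```

With test-video measurements as input, `fit_power_law` followed by `lambda_model_from_fit` gives a weighting coefficient and an exponent, which `semantic_lambda` then applies to a target code rate.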
  • the determining unit 701 is further configured to determine the target bit rate of the coding unit in the to-be-coded video by using a bit allocation method.
  • the determining unit 701 is further configured to determine the second Lagrangian multiplier according to a preset third calculation model; wherein the third calculation model represents the correspondence between the second Lagrangian multiplier and the quantization parameter.
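The application does not spell out the third calculation model here. One form often quoted for HEVC-style reference encoders is λ = c · 2^((QP − 12)/3); the sketch below uses that form purely as an assumption, and the default constant `c = 0.57` is a placeholder rather than a value taken from the application.

```python
def numeric_error_lambda(qp: float, c: float = 0.57) -> float:
    """Hypothetical third calculation model: lambda_2 = c * 2 ** ((qp - 12) / 3).

    Both the functional form and the default constant are assumptions.
    """
    return c * 2.0 ** ((qp - 12.0) / 3.0)
```

Under this assumed form, the multiplier doubles for every increase of 3 in the quantization parameter.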
  • the determining unit 701 is further configured to use a rate control manner to determine the quantization parameter of the coding unit in the to-be-coded video.
  • the determining unit 701 is further configured to set the quantization parameter to a preset value.
  • the determining unit 701 is further configured to perform semantic segmentation on the test video to determine the semantic accuracy of one or more categories; and determine the target semantic accuracy according to the semantic accuracy of the one or more categories;
  • the calculation unit 702 is further configured to use a fourth calculation model to perform a distortion measurement on the target semantic accuracy to obtain the first distortion value.
  • the calculating unit 702 is further configured to calculate a weighted sum of the semantic accuracy of the one or more categories, and determine the obtained weighted sum as the target semantic accuracy.
  • the determining unit 701 is further configured to determine the fourth calculation model parameter, where the fourth calculation model represents the correspondence between the first distortion value and the target semantic accuracy; and obtain the first distortion value according to the target semantic accuracy and the fourth calculation model.
  • the determining unit 701 is further configured to, in the fourth calculation model, set the first distortion value equal to a weighted value of the logarithm of the target semantic accuracy; wherein the fourth calculation model parameters include a base parameter indicating the logarithm and a fourth weighting coefficient parameter indicating the weighting.
  • the determining unit 701 is further configured to set the fourth calculation model parameter to a preset value.
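The two steps above (a weighted sum of per-category accuracies, then a weighted logarithm) can be sketched as follows. The function names, the default weight of −1 (chosen so that higher accuracy gives lower distortion), and the natural-logarithm base are illustrative assumptions; the application only states that the weight and base are model parameters.

```python
import math
from typing import Sequence


def target_semantic_accuracy(class_accuracy: Sequence[float],
                             class_weights: Sequence[float]) -> float:
    """Weighted sum of per-category semantic accuracies."""
    if len(class_accuracy) != len(class_weights):
        raise ValueError("one weight is required per category")
    return sum(a * w for a, w in zip(class_accuracy, class_weights))


def semantic_distortion(accuracy: float, weight: float = -1.0,
                        base: float = math.e) -> float:
    """Assumed fourth calculation model: weight * log_base(accuracy).

    accuracy must lie in (0, 1]; with a negative weight the result is >= 0.
    """
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy must be in (0, 1]")
    return weight * math.log(accuracy, base)
```

For example, two categories with accuracies 0.5 and 1.0 and equal weights give a target accuracy of 0.75, which is then mapped to a distortion value by the logarithmic model.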
  • the determining unit 701 is further configured to determine a fifth calculation model parameter, where the fifth calculation model represents a third relationship function between the first distortion value and the mean square error; determine the target mean square error of the coding unit in the to-be-encoded video; and determine the first distortion value according to the target mean square error and the fifth calculation model.
  • the determining unit 701 is further configured to, in the fifth calculation model, set the first distortion value equal to the product of the target mean square error and the first parameter factor, plus the second parameter factor; wherein the fifth calculation model parameter includes the first parameter factor and the second parameter factor.
  • the determining unit 701 is further configured to set the parameter of the fifth calculation model to a preset value.
  • the determining unit 701 is further configured to, based on the test video, use the first distortion metric criterion to determine a third relationship function between the first distortion value and the mean square error of the test video; and determining the fifth calculation model parameter according to the third relational function.
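A minimal sketch of the fifth calculation model, D1 = p1 · MSE + p2, with the two parameter factors obtained by an ordinary least-squares line fit on test-video measurements. The use of least squares and the function names are assumptions; the application only says the parameters are determined from the third relationship function.

```python
from typing import Sequence, Tuple


def fit_linear_model(mse_values: Sequence[float],
                     distortion_values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of distortion = p1 * mse + p2; returns (p1, p2)."""
    n = len(mse_values)
    mean_x = sum(mse_values) / n
    mean_y = sum(distortion_values) / n
    sxx = sum((x - mean_x) ** 2 for x in mse_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(mse_values, distortion_values))
    p1 = sxy / sxx
    p2 = mean_y - p1 * mean_x
    return p1, p2


def semantic_distortion_from_mse(target_mse: float, p1: float, p2: float) -> float:
    """Evaluate the fifth calculation model for one coding unit."""
    return p1 * target_mse + p2
```

Once fitted, the model lets the encoder estimate the first distortion value from the mean square error of a coding unit without running semantic segmentation during encoding.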
  • the determining unit 701 is further configured to determine a reconstruction value of a coding unit in the video, wherein the coding unit includes at least one of the following: an image, a slice, a sub-image, a tile, and a coding block;
  • the calculation unit 702 is further configured to, based on the numerical error criterion, determine the second distortion value according to the reconstructed value and the original value of the coding unit; wherein the numerical error criterion is one of the following: the sum of absolute errors criterion, the mean absolute error criterion, the sum of squared errors criterion, the mean squared error criterion.
  • the determining unit 701 is further configured to determine a first preset parameter; wherein the first preset parameter is used to control the weight values corresponding to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • the calculation unit 702 is further configured to perform a weighted calculation on the first Lagrangian multiplier and the second Lagrangian multiplier by using the first preset parameter to obtain the target Lagrangian multiplier.
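The four numerical error criteria listed above can be computed from the original and reconstructed sample values of a coding unit as in the sketch below. The function name and the flat-sequence sample layout are assumptions made for illustration.

```python
from typing import Dict, Sequence


def numeric_error_metrics(original: Sequence[float],
                          reconstructed: Sequence[float]) -> Dict[str, float]:
    """Return SAD, MAE, SSE and MSE between two equally sized sample sequences."""
    if len(original) != len(reconstructed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [o - r for o, r in zip(original, reconstructed)]
    n = len(diffs)
    sad = float(sum(abs(d) for d in diffs))   # sum of absolute errors
    sse = float(sum(d * d for d in diffs))    # sum of squared errors
    return {"SAD": sad, "MAE": sad / n, "SSE": sse, "MSE": sse / n}
```

Any one of the returned values could serve as the second distortion value, depending on which criterion the encoder is configured to use.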
  • the encoder 70 may further include a configuration unit 704 configured to set the first preset parameter according to the configuration information of the encoder.
  • the configuration unit 704 is further configured to, when the configuration information of the encoder indicates that the first preset parameter is equal to k, set the target Lagrangian multiplier equal to the weighted sum of the first Lagrangian multiplier and the second Lagrangian multiplier, where k is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first Lagrangian multiplier is set equal to 1 - k, and the weighting coefficient of the second Lagrangian multiplier is set equal to k.
  • the value of k is equal to 0.75.
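The weighting just described amounts to a convex combination of the two multipliers. A minimal sketch, with a hypothetical function name and the default k = 0.75 taken from the text:

```python
def target_lagrange_multiplier(lambda_1: float, lambda_2: float,
                               k: float = 0.75) -> float:
    """Blend the two multipliers: (1 - k) * lambda_1 + k * lambda_2, with 0 <= k <= 1."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    return (1.0 - k) * lambda_1 + k * lambda_2
```

Setting k = 0 uses only the first (semantic) multiplier, while k = 1 uses only the second (numerical error) multiplier.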
  • the determining unit 701 is further configured to determine a second preset parameter; wherein the second preset parameter is used to control the weight value corresponding to the first distortion value and the second distortion value;
  • the calculation unit 702 is further configured to perform weighted calculation on the first distortion value and the second distortion value by using the second preset parameter to obtain the target distortion value.
  • the configuration unit 704 is further configured to set the second preset parameter according to the configuration information of the encoder.
  • the configuration unit 704 is further configured to, when the configuration information of the encoder indicates that the second preset parameter is equal to m, set the target distortion value equal to the weighted sum of the first distortion value and the second distortion value, where m is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first distortion value is set equal to 1 - m, and the weighting coefficient of the second distortion value is set equal to m.
  • the value of m is equal to 0.75.
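Written as a formula, the weighting of the two distortion values is as follows; the symbols D, D_1 and D_2 are notation assumed for this sketch.

```latex
% D_1 = first (semantic) distortion value, D_2 = second (numerical error) distortion value
D = (1 - m)\, D_1 + m\, D_2, \qquad 0 \le m \le 1
% With the value m = 0.75 given in the text:
D = 0.25\, D_1 + 0.75\, D_2
```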
  • the determining unit 701 is further configured to construct a rate-distortion cost function based on the target Lagrangian multiplier and the target distortion value; perform precoding processing on the to-be-encoded video using one or more candidate encoding parameters to determine the rate-distortion cost value corresponding to each of the one or more candidate encoding parameters; and select the minimum rate-distortion cost value from the determined rate-distortion cost values, and determine the candidate encoding parameter corresponding to the minimum rate-distortion cost value as the encoding parameter of the video to be encoded.
  • the encoding parameters include at least a parameter indicating how the video to be encoded is divided and a parameter for constructing a predicted value of a coding block in the video to be encoded.
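The selection procedure above can be sketched as follows. The cost form J = D + λ · R is the conventional Lagrangian rate-distortion cost and is an assumption here, as are the `Candidate` structure, the field names, and the idea that each candidate's distortions and rate have already been measured by precoding.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    """Hypothetical precoding result for one set of candidate encoding parameters."""
    name: str
    semantic_distortion: float   # first distortion value
    numeric_distortion: float    # second distortion value
    rate: float                  # bits produced by precoding


def rd_cost(c: Candidate, target_lambda: float, m: float = 0.75) -> float:
    """Assumed cost J = D + lambda * R, with D = (1 - m) * D1 + m * D2."""
    target_distortion = (1.0 - m) * c.semantic_distortion + m * c.numeric_distortion
    return target_distortion + target_lambda * c.rate


def select_candidate(candidates: Iterable[Candidate], target_lambda: float,
                     m: float = 0.75) -> Candidate:
    """Return the candidate with the minimum rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, target_lambda, m))
```

For instance, with a target multiplier of 7.0 and m = 0.75, a candidate with distortions (4, 30) and rate 1 has cost 30.5 and would be chosen over one with distortions (10, 20) and rate 2, whose cost is 31.5.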
  • the encoder 70 may further include a writing unit 705 configured to write the encoding parameters into the code stream.
  • a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may also be a module, or it may be non-modular.
  • each component in this embodiment may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of software function modules.
  • if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
  • the aforementioned storage medium includes: U disk, mobile hard disk, read only memory (Read Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk and other media that can store program codes.
  • an embodiment of the present application provides a computer storage medium, which is applied to the encoder 70, where the computer storage medium stores a computer program, and the computer program, when executed by at least one processor, implements the steps of the method of any one of the foregoing embodiments.
  • FIG. 8 shows a specific hardware structure example of the encoder 70 provided by the embodiment of the present application, which may include: a communication interface 801, a memory 802, and a processor 803; the components are coupled together through a bus system 804.
  • the bus system 804 is used to implement connection communication between these components.
  • the bus system 804 also includes a power bus, a control bus, and a status signal bus.
  • for clarity, the various buses are all labeled as bus system 804 in FIG. 8.
  • the communication interface 801 is used for receiving and sending signals in the process of sending and receiving information with other external network elements;
  • a memory 802 for storing computer programs that can be executed on the processor 803;
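  • the processor 803 is configured to, when running the computer program, execute:
  • determining pre-parameters of the video to be encoded, and determining a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
  • determining a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • determining a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion;
  • determining a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
  • determining a target distortion value according to the first distortion value and the second distortion value;
  • using the target Lagrangian multiplier and the target distortion value, determining the encoding parameters of the video to be encoded, and encoding the video to be encoded.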
  • the memory 802 in this embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM) or flash memory.
  • Volatile memory may be Random Access Memory (RAM), which acts as an external cache.
  • many forms of RAM are available, for example static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct Rambus RAM.
  • the processor 803 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above-mentioned method can be completed by an integrated logic circuit of hardware in the processor 803 or an instruction in the form of software.
  • the above-mentioned processor 803 can be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • the storage medium is located in the memory 802, and the processor 803 reads the information in the memory 802, and completes the steps of the above method in combination with its hardware.
  • the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof.
  • the processing unit can be implemented in one or more Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP), Digital Signal Processing Devices (DSP Device, DSPD), Programmable Logic Devices (PLD), Field-Programmable Gate Arrays (FPGA), general purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described herein, or a combination thereof.
  • the techniques described herein may be implemented through modules (eg, procedures, functions, etc.) that perform the functions described herein.
  • Software codes may be stored in memory and executed by a processor.
  • the memory can be implemented in the processor or external to the processor.
  • the processor 803 is further configured to execute the steps of the method in any one of the foregoing embodiments when running the computer program.
  • This embodiment provides an encoder, which includes a determination unit, a calculation unit, and an encoding unit. The determination unit is configured to determine pre-parameters of a video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters; the calculation unit is configured to determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; the determination unit is further configured to determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion, and determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; the calculation unit is further configured to determine a target distortion value according to the first distortion value and the second distortion value; and the encoding unit is configured to use the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the to-be-encoded video, and to encode the to-be-encoded video.
  • the first distortion metric based on the semantic distortion metric and the second distortion metric based on the numerical error metric are jointly considered for rate-distortion optimization in video coding, which adapts well to application scenarios oriented to machine vision and human-machine vision; at a given bit rate, the semantic segmentation accuracy of the reconstructed video can be improved while good fidelity performance is maintained, thereby improving coding efficiency.
  • FIG. 9 shows a schematic structural diagram of a video system provided by an embodiment of the present application.
  • the video system 90 may include an encoder 901 and a decoder 902 .
  • the encoder 901 may be the encoder 70 described in any one of the foregoing embodiments.
  • the encoder 901 is configured to determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters; determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; determine a target distortion value according to the first distortion value and the second distortion value; and use the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the video to be encoded, encode the video to be encoded to generate a code stream, and transmit the code stream to the decoder;
  • the decoder 902 is configured to parse the code stream to obtain a decoded video.
  • the decoder 902 is further configured to parse the code stream, obtain decoding parameters, and obtain the decoded video according to the decoding parameters; wherein the decoding parameters include at least a parameter indicating the division mode of the video to be decoded and a parameter for constructing the predicted values of the decoded blocks in the video to be decoded.
  • the video system 90 jointly considers the first distortion metric based on the semantic distortion metric and the second distortion metric based on the numerical error metric to perform rate-distortion optimization in video coding, which adapts well to application scenarios oriented to machine vision and human-machine vision; at a given bit rate, the semantic segmentation accuracy of the reconstructed video can be improved while good fidelity performance is maintained, thereby improving coding efficiency.
  • the pre-parameters of the video to be encoded are determined, and the first Lagrangian multiplier and the second Lagrangian multiplier are determined according to the pre-parameters; the target Lagrangian multiplier is determined according to the first Lagrangian multiplier and the second Lagrangian multiplier; a first distortion value is determined according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; a second distortion value is determined according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; a target distortion value is determined according to the first distortion value and the second distortion value; and the target Lagrangian multiplier and the target distortion value are used to determine the encoding parameters of the video to be encoded, and the video to be encoded is encoded.
  • the first distortion metric based on the semantic distortion metric and the second distortion metric based on the numerical error metric are jointly considered for rate-distortion optimization in video coding, which adapts well to application scenarios oriented to machine vision and human-machine vision; at a given bit rate, the semantic segmentation accuracy of the reconstructed video can be improved while good fidelity performance is maintained, thereby improving coding efficiency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Disclosed are a video coding method and system, a coder, and a computer storage medium. The method comprises: determining a pre-parameter of a video to be coded, and determining a first Lagrange multiplier and a second Lagrange multiplier according to the pre-parameter; determining a target Lagrange multiplier according to the first Lagrange multiplier and the second Lagrange multiplier; determining a first distortion value according to a first distortion measure criterion, the first distortion measure criterion comprising a semantic distortion measure criterion; determining a second distortion value according to a second distortion measure criterion, the second distortion measure criterion comprising a numerical error measurement criterion; determining a target distortion value according to the first distortion value and the second distortion value; and using the target Lagrange multiplier and the target distortion value to determine a coding parameter of said video, and coding said video.

Description

Video encoding method, encoder, system, and computer storage medium
Technical Field
The embodiments of the present application relate to the technical field of video coding and decoding, and in particular to a video encoding method, an encoder, a system, and a computer storage medium.
Background
At present, the International Telecommunication Union (ITU) and the International Organization for Standardization (ISO) have established the Joint Video Experts Team (JVET) to study the latest video coding standard, H.266/Versatile Video Coding (VVC). H.266/VVC improves performance by about 40% over H.265/High Efficiency Video Coding (HEVC) and is the industry-leading video compression solution.
Generally speaking, for the same video coding algorithm, a higher bit rate yields better reconstructed video quality and less distortion, but the encoded file occupies more storage space. Rate Distortion Optimization (RDO) is therefore needed to find a balance between the distortion of the reconstructed video and the bit rate, so that the compression effect is optimal.
However, in the current related art, rate-distortion optimization algorithms either guarantee only the fidelity of the reconstructed video, or guarantee its subjective quality at the cost of a significant drop in fidelity performance. In particular, in video coding for machine vision and human-machine vision, the distortion criterion used by existing rate-distortion optimization algorithms is single and not comprehensive, so these algorithms do not adapt well to application scenarios oriented to machine vision and human-machine vision.
Summary of the Invention
Embodiments of the present application provide a video encoding method, an encoder, a system, and a computer storage medium, which adapt well to application scenarios oriented to machine vision and human-machine vision and which, at a given bit rate, can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity performance, thereby improving coding efficiency.
The technical solutions of the embodiments of the present application can be implemented as follows:
In a first aspect, an embodiment of the present application provides a video encoding method, which is applied to an encoder, and the method includes:
determining pre-parameters of the video to be encoded, and determining a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
determining a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
determining a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion;
determining a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
determining a target distortion value according to the first distortion value and the second distortion value;
using the target Lagrangian multiplier and the target distortion value, determining the encoding parameters of the video to be encoded, and encoding the video to be encoded.
In a second aspect, an embodiment of the present application provides an encoder, the encoder including a determination unit, a calculation unit, and an encoding unit; wherein,
the determination unit is configured to determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
the calculation unit is configured to determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
the determination unit is further configured to determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; and determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
the calculation unit is further configured to determine a target distortion value according to the first distortion value and the second distortion value;
the encoding unit is configured to use the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the video to be encoded, and to encode the video to be encoded.
In a third aspect, an embodiment of the present application provides an encoder including a memory and a processor, wherein:
the memory is configured to store a computer program executable on the processor; and
the processor is configured to perform the method according to the first aspect when running the computer program.
In a fourth aspect, an embodiment of the present application provides a computer storage medium storing a computer program that, when executed by at least one processor, implements the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a video system including an encoder and a decoder, wherein:
the encoder is configured to: determine a pre-parameter of a video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameter; determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; determine a target distortion value according to the first distortion value and the second distortion value; and determine coding parameters of the video to be encoded by using the target Lagrangian multiplier and the target distortion value, encode the video to be encoded to generate a bitstream, and transmit the bitstream to the decoder; and
the decoder is configured to parse the bitstream to obtain a decoded video.
Embodiments of the present application provide a video encoding method, an encoder, a system, and a computer storage medium. A pre-parameter of a video to be encoded is determined, and a first Lagrangian multiplier and a second Lagrangian multiplier are determined according to the pre-parameter; a target Lagrangian multiplier is determined according to the first Lagrangian multiplier and the second Lagrangian multiplier; a first distortion value is determined according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; a second distortion value is determined according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; a target distortion value is determined according to the first distortion value and the second distortion value; and coding parameters of the video to be encoded are determined by using the target Lagrangian multiplier and the target distortion value, and the video to be encoded is encoded. In this way, rate-distortion optimization in video coding jointly considers the first distortion metric criterion, which is based on a semantic distortion metric, and the second distortion metric criterion, which is based on a numerical error metric. The method is therefore well suited to application scenarios oriented to machine vision and human-machine vision; at a given bit rate, it can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity, thereby also improving coding efficiency.
Description of drawings
FIG. 1 is a schematic diagram of an RD curve provided by a related technical solution;
FIG. 2 is a schematic diagram of the system composition of an encoder provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of a video encoding method provided by an embodiment of the present application;
FIG. 4 is a schematic curve of the functional relationship between a first distortion value and a bit rate provided by an embodiment of the present application;
FIG. 5 is a schematic curve of the functional relationship between a bit rate and a quantization parameter provided by an embodiment of the present application;
FIG. 6 is a schematic curve of the functional relationship between a first distortion value and MSE provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of the composition of an encoder provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of a specific hardware structure of an encoder provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of the composition of a video system provided by an embodiment of the present application.
Detailed description
To provide a more detailed understanding of the features and technical content of the embodiments of the present application, the implementation of the embodiments is described in detail below with reference to the accompanying drawings. The accompanying drawings are for reference and illustration only and are not intended to limit the embodiments of the present application.
With the development of the digital media era, transmitting continuous media data over networks has become the general trend. At the same time, more and more users want to use personal computers (PCs) and non-PC devices for video communication and services over the Internet and wireless networks. Such anytime, anywhere video communication and services pose a greater challenge to current video coding technology.
It should be understood that the International Telecommunication Union (ITU) and the International Organization for Standardization (ISO) established the Joint Video Experts Team (JVET) to study the next-generation video coding standard H.266/Versatile Video Coding (VVC). The technology accumulated in the industry so far has improved the performance of H.266/VVC by about 40% over H.265/HEVC, making it the most advanced video compression solution in the industry at present.
Generally speaking, for the same video coding algorithm, a higher bit rate gives better reconstructed video quality and lower distortion, but the encoded file then occupies more storage space. A rate-distortion optimization algorithm is therefore needed to find a balance between the distortion of the reconstructed video and the bit rate, so that the compression result is optimal.
It should be noted that rate-distortion optimization can be stated as minimizing the distortion of the decoded and reconstructed video on the condition that the encoded file does not exceed a given bit rate, as shown in formula (1) below.
$\min\{D\} \quad \text{s.t.} \quad R \le R_{\max} \qquad (1)$
Here, D and R denote the distortion and the bit rate, respectively, under a given set of coding parameters.
The video is encoded with given coding parameters, and the bit rate (R) after encoding and the distortion (D) of the reconstructed video are calculated. By changing the coding parameters and repeatedly encoding the video to be encoded, multiple R-D points, each consisting of a bit rate and a distortion, can be obtained, as shown in FIG. 1. In general, for a preset bit rate, the point with the least distortion lies on the convex curve (that is, the RD curve) in FIG. 1. For an input video to be encoded, the encoder needs to determine a set of coding parameters that brings the resulting R-D point as close as possible to this convex curve.
The constrained problem of formula (1) can then be converted into an unconstrained problem by the method of Lagrange multipliers, as shown in formula (2) below.
$\min\{J = D + \lambda \cdot R\} \qquad (2)$
Here, λ denotes the Lagrangian multiplier and J denotes the rate-distortion cost function. Each possible value of λ corresponds to the slope of a tangent to the RD curve, and the encoder can find the optimal coding parameters by minimizing the rate-distortion cost function.
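As an illustration of formulas (1) and (2), the following minimal sketch selects a candidate from a small set of R-D points, first under the rate constraint and then by minimizing the cost J = D + λ·R. The `RDPoint` type, the function names, and the sample values are hypothetical and are not taken from the application.

```python
from typing import NamedTuple, Sequence


class RDPoint(NamedTuple):
    name: str          # hypothetical label for one set of coding parameters
    distortion: float  # D
    rate: float        # R


def best_constrained(points: Sequence[RDPoint], r_max: float) -> RDPoint:
    """Formula (1): least distortion among candidates with R <= r_max."""
    feasible = [p for p in points if p.rate <= r_max]
    if not feasible:
        raise ValueError("no candidate satisfies the rate constraint")
    return min(feasible, key=lambda p: p.distortion)


def best_lagrangian(points: Sequence[RDPoint], lam: float) -> RDPoint:
    """Formula (2): candidate minimizing J = D + lam * R."""
    if not points:
        raise ValueError("no candidates given")
    return min(points, key=lambda p: p.distortion + lam * p.rate)


# Made-up R-D points, for illustration only.
candidates = [
    RDPoint("A", distortion=10.0, rate=100.0),
    RDPoint("B", distortion=20.0, rate=60.0),
    RDPoint("C", distortion=40.0, rate=30.0),
]
```

With these sample points, a small λ favors the low-distortion candidate and a large λ favors the low-rate candidate, which is the trade-off that sweeping λ along the RD curve expresses.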
In this way, using the rate-distortion optimization algorithm, the encoder can determine the optimal block partitioning, the optimal intra prediction mode, and the optimal inter prediction motion mode (including motion vectors, reference pictures, prediction weights, and so on) to achieve the best coding performance.
In related technical solutions (such as VVC), rate-distortion optimization uses the sum of squared errors (SSE) as the distortion criterion, and the corresponding reconstructed video quality can be measured by the peak signal-to-noise ratio (PSNR). SSE distortion measures the fidelity of the video objectively and is calculated as shown in formula (3) below.
$\mathrm{SSE} = \sum_{x=1}^{M} \sum_{y=1}^{N} \left[ f(x,y) - g(x,y) \right]^2 \qquad (3)$
Here, M and N denote the horizontal and vertical spatial resolution of the video, respectively, f(x,y) denotes the original pixel value at pixel position (x,y), and g(x,y) denotes the reconstructed pixel value at pixel position (x,y).
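A minimal sketch of the SSE computation described above, together with the usual PSNR derived from it, is given below. It assumes frames are represented as lists of rows of pixel values; the function names and the `peak` parameter are illustrative and not part of the application.

```python
import math
from typing import Sequence


def sse(original: Sequence[Sequence[float]],
        reconstructed: Sequence[Sequence[float]]) -> float:
    """Sum of squared differences between co-located pixels f(x,y) and g(x,y)."""
    if len(original) != len(reconstructed):
        raise ValueError("frames must have the same number of rows")
    total = 0.0
    for row_f, row_g in zip(original, reconstructed):
        if len(row_f) != len(row_g):
            raise ValueError("rows must have the same length")
        for f, g in zip(row_f, row_g):
            total += (f - g) ** 2
    return total


def psnr(original: Sequence[Sequence[float]],
         reconstructed: Sequence[Sequence[float]],
         peak: float = 255.0) -> float:
    """PSNR in dB from the mean squared error (SSE divided by the pixel count)."""
    pixel_count = sum(len(row) for row in original)
    if pixel_count == 0:
        raise ValueError("frames must not be empty")
    mse = sse(original, reconstructed) / pixel_count
    if mse == 0.0:
        return float("inf")  # identical frames
    return 10.0 * math.log10(peak * peak / mse)
```

The default `peak` of 255 corresponds to 8-bit samples; other bit depths would use 2^bit_depth − 1.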
Rate-distortion optimization is a key technology in video coding and affects the performance of the encoder. The rate-distortion optimization algorithms in existing video coding use SSE distortion, which measures the fidelity of the video objectively. However, SSE distortion is not consistent with the perception of the human visual system; for example, in some regions with large SSE distortion, the human eye does not notice any degradation in reconstructed video quality. When the encoder needs to guarantee the subjective quality of the reconstructed video, the distortion criterion therefore has to be replaced by a distortion metric that reflects subjective quality. For example, structural similarity (SSIM) distortion, which is consistent with human perception, can be used; it is calculated as shown in formula (4) below.
$\mathrm{SSIM}(x,y) = \dfrac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \qquad (4)$
Here, x and y denote the original image and the reconstructed image, respectively; $\mu_x$ and $\mu_y$ denote the means of the original image and the reconstructed image; $\sigma_x^2$ and $\sigma_y^2$ denote the variances of the original image and the reconstructed image; $\sigma_{xy}$ denotes the covariance of the original image and the reconstructed image; and $C_1$ and $C_2$ are two constants that avoid the instability that arises when $\mu_x^2 + \mu_y^2$ or $\sigma_x^2 + \sigma_y^2$ is close to 0. To obtain a robust quality evaluation result, one may take $C_1 = (K_1 L)^2$ and $C_2 = (K_2 L)^2$, where $L = 2^{\mathrm{bit\_depth}} - 1$ (bit_depth denotes the bit depth; for an image with a bit depth of 8, L = 255), $K_1 = 0.01$, and $K_2 = 0.03$.
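The following minimal sketch computes the standard SSIM index over a whole image using the constants given above. It is a simplification under stated assumptions: the image is passed as a flat list of samples, statistics are taken over the entire image rather than over local windows as practical SSIM implementations usually do, and population (divide-by-n) variance is used. The function name `ssim_global` is hypothetical.

```python
from typing import Sequence


def ssim_global(x: Sequence[float], y: Sequence[float],
                bit_depth: int = 8, k1: float = 0.01, k2: float = 0.03) -> float:
    """SSIM index of two equally sized images given as flat sample lists."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("images must be non-empty and of equal size")

    dynamic_range = 2 ** bit_depth - 1          # L = 255 for 8-bit samples
    c1 = (k1 * dynamic_range) ** 2              # C1 = (K1 * L)^2
    c2 = (k2 * dynamic_range) ** 2              # C2 = (K2 * L)^2

    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Because SSIM is a similarity score (1 for identical images), an encoder that needs a distortion value would typically map it to a quantity that grows as similarity drops; which mapping the application intends is not stated here.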
In the related art, a rate-distortion optimization algorithm based on SSE distortion can guarantee the fidelity of the reconstructed video. SSIM distortion, which takes subjective quality into account, can guarantee the subjective quality of the reconstructed video, but the fidelity of the video drops significantly.
Current rate-distortion optimization algorithms all target traditional application scenarios in which the reconstructed video is watched and studied by people. The arrival of the fifth-generation mobile communication (5G) era, however, has given rise to a large number of machine-oriented applications, such as the Internet of Vehicles, autonomous driving, the industrial Internet, smart and safe cities, wearables, and video surveillance, so the application scenarios for machine vision content are broader. In the 5G era and beyond, most video will be consumed by machines, for example for intelligent analysis of reconstructed video such as pedestrian detection, semantic segmentation, and object detection. In video coding oriented to machine vision and human-machine vision, however, the distortion criterion used by current rate-distortion optimization algorithms considers only fidelity distortion and not semantic distortion. Although the reconstructed video obtained after encoding has good fidelity, its semantic accuracy cannot be guaranteed, so current rate-distortion optimization algorithms are not well suited to the many scenarios oriented to machine vision and human-machine vision.
On this basis, an embodiment of the present application provides a video encoding method whose basic idea is as follows: determine a pre-parameter of a video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameter; determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; determine a target distortion value according to the first distortion value and the second distortion value; and determine coding parameters of the video to be encoded by using the target Lagrangian multiplier and the target distortion value, and encode the video to be encoded. In this way, rate-distortion optimization in video coding jointly considers the first distortion metric criterion, which is based on a semantic distortion metric, and the second distortion metric criterion, which is based on a numerical error metric. The method is therefore well suited to application scenarios oriented to machine vision and human-machine vision; at a given bit rate, it can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity, thereby also improving coding efficiency.
The embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to FIG. 2, an example block diagram of the system composition of an encoder provided by an embodiment of the present application is shown. As shown in FIG. 2, the encoder 10 may include a transform and quantization unit 101, an intra estimation unit 102, an intra prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, an encoding unit 109, a decoded picture buffer unit 110, and so on. The filtering unit 108 can implement deblocking filtering and sample adaptive offset (SAO) filtering, and the encoding unit 109 can implement header information coding and context-based adaptive binary arithmetic coding (CABAC). For the input original video signal, a video coding block is obtained through coding unit (CU) partitioning. The residual pixel information obtained after intra or inter prediction is then transformed by the transform and quantization unit 101, which transforms the residual information from the pixel domain to the transform domain and quantizes the resulting transform coefficients to further reduce the bit rate. The intra estimation unit 102 and the intra prediction unit 103 perform intra prediction on the video coding block; specifically, they determine the intra prediction mode to be used to encode the video coding block. The motion compensation unit 104 and the motion estimation unit 105 perform inter prediction coding of the received video coding block relative to one or more blocks in one or more reference frames to provide temporal prediction information. Motion estimation, performed by the motion estimation unit 105, is the process of generating motion vectors that estimate the motion of the video coding block; the motion compensation unit 104 then performs motion compensation based on the motion vectors determined by the motion estimation unit 105. After the intra prediction mode is determined, the intra prediction unit 103 also provides the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 sends the calculated motion vector data to the encoding unit 109. In addition, the inverse transform and inverse quantization unit 106 is used to reconstruct the video coding block: a residual block is reconstructed in the pixel domain, blocking artifacts are removed from the reconstructed residual block by the filter control analysis unit 107 and the filtering unit 108, and the reconstructed residual block is then added to a predictive block in a frame of the decoded picture buffer unit 110 to generate a reconstructed video coding block. The encoding unit 109 encodes the various coding parameters and the quantized transform coefficients. In the CABAC-based coding algorithm, the context may be based on neighboring coding blocks and may be used to encode information indicating the determined intra prediction mode, and the bitstream of the video signal is output. The decoded picture buffer unit 110 stores the reconstructed video coding blocks for prediction reference. As video picture coding proceeds, new reconstructed video coding blocks are continuously generated, and all of them are stored in the decoded picture buffer unit 110.
The video encoding method in the embodiments of the present application is mainly applied to the encoding control part of the encoder 10, which includes, for example, the coding unit (CU) partitioning, the intra prediction unit 103, the motion compensation unit 104, and the motion estimation unit 105 shown in FIG. 2. That is, the video encoding method of the embodiments of the present application is mainly used to determine coding parameters so that encoding is performed according to the determined coding parameters. The coding parameters may include the CU partitioning mode and the intra prediction mode or inter prediction mode determined for a CU.
On this basis, the technical solutions of the present application are described in further detail below with reference to the accompanying drawings and embodiments. Before the detailed description, it should be noted that the terms "first", "second", "third", and so on used throughout the specification serve only to distinguish different features and do not imply any priority, order, or magnitude relationship.
An embodiment of the present application provides a video encoding method applied to a video encoding device, that is, an encoder. The functions implemented by the method can be realized by a processor in the encoder calling a computer program, and the computer program can be stored in a memory. The encoder therefore includes at least a processor and a memory.
Referring to FIG. 3, a schematic flowchart of a video encoding method provided by an embodiment of the present application is shown. As shown in FIG. 3, the method may include:
S301: determining a pre-parameter of a video to be encoded, and determining a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameter;
S302: determining a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
S303: determining a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; and determining a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
S304: determining a target distortion value according to the first distortion value and the second distortion value;
S305: determining coding parameters of the video to be encoded by using the target Lagrangian multiplier and the target distortion value, and encoding the video to be encoded.
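The flow of S302, S304, and S305 can be sketched as follows. This is only an illustration under stated assumptions: this passage does not specify how the two multipliers or the two distortion values are combined, so the combination rules are passed in as callables, and the cost is assumed to take the form of formula (2) with the target values substituted. All names and sample numbers are hypothetical.

```python
from typing import Callable, Iterable, Tuple

# (label, first/semantic distortion, second/numerical distortion, rate)
Candidate = Tuple[str, float, float, float]
Combine = Callable[[float, float], float]


def select_coding_parameters(candidates: Iterable[Candidate],
                             lam_first: float,
                             lam_second: float,
                             combine_lambda: Combine,
                             combine_distortion: Combine) -> Tuple[str, float]:
    """Return the label and cost of the candidate with the lowest assumed cost."""
    lam_target = combine_lambda(lam_first, lam_second)       # S302
    best_label, best_cost = None, float("inf")
    for label, d_first, d_second, rate in candidates:
        d_target = combine_distortion(d_first, d_second)     # S304
        cost = d_target + lam_target * rate                  # S305 (assumed form)
        if cost < best_cost:
            best_label, best_cost = label, cost
    if best_label is None:
        raise ValueError("no candidates given")
    return best_label, best_cost


def mean(a: float, b: float) -> float:
    """Placeholder combination rule, NOT the rule defined by the application."""
    return 0.5 * (a + b)


# Made-up candidates, for illustration only.
modes = [("intra", 4.0, 2.0, 10.0), ("inter", 6.0, 4.0, 2.0)]
label, cost = select_coding_parameters(modes, 1.0, 0.5, mean, mean)
```

Keeping the combination rules as parameters makes it possible to swap in whatever rule an embodiment actually defines without changing the selection loop.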
It should be noted that the video encoding method of the embodiments of the present application is applicable to encoders of the H.266/VVC standard, to encoders of the H.265/HEVC standard, and also to encoders of other standards, such as encoders of the first-generation video coding standard developed by the Alliance for Open Media (Alliance for Open Media Video 1, AV-1); the embodiments of the present application impose no limitation in this respect.
It should also be noted that the rate-distortion optimization algorithm used in the video encoding method of the embodiments of the present application jointly considers the first distortion metric criterion, which is based on a semantic distortion metric, and the second distortion metric criterion, which is based on a numerical error metric, so that the rate-distortion optimization can be a multi-distortion-criterion algorithm oriented to human-machine vision. That is, in video coding, in addition to the second Lagrangian multiplier and the second distortion value derived using related technical solutions, the embodiments of the present application can define a semantic distortion metric for the human-machine vision application scenario of video semantic segmentation and derive from it the calculation formulas for the corresponding first Lagrangian multiplier and first distortion value.
In this way, the target Lagrangian multiplier can be determined from the first Lagrangian multiplier and the second Lagrangian multiplier, and the target distortion value can be determined from the first distortion value, determined according to the first distortion metric criterion, and the second distortion value, determined according to the second distortion metric criterion. After the coding parameters of the video to be encoded are determined from the target Lagrangian multiplier and the target distortion value, encoding the video with these coding parameters can improve both the semantic segmentation accuracy and the fidelity of the reconstructed video. It can also reduce the coding bit rate of the video, which shortens the time required for encoding and increases the encoding speed, thereby improving coding efficiency.
It can be understood that, in a possible implementation, the pre-parameter of the video to be encoded may include a quantization parameter (QP).
In this case, for S301, determining the pre-parameter of the video to be encoded may include:
determining a quantization parameter of a coding unit in the video to be encoded, wherein the coding unit may include at least one of the following: a picture, a slice, a sub-picture, a tile, and a coding block.
Here, the quantization parameter may be the quantization step size of the quantizer in the encoder, or the index value corresponding to the quantization step size of the quantizer in the encoder.
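The passage does not state how such an index maps to a step size. Purely as an illustration, the sketch below uses the relation commonly cited for H.264/AVC and H.265/HEVC, in which the step size equals 1 at QP = 4 and doubles for every increase of 6 in QP. The function name `qstep_from_qp` is hypothetical, and real encoders implement this mapping with integer scaling tables rather than floating-point powers.

```python
def qstep_from_qp(qp: float) -> float:
    """Approximate quantization step size for a QP index (assumed relation)."""
    # Step size is 1.0 at QP = 4 and doubles every 6 QP units.
    return 2.0 ** ((qp - 4.0) / 6.0)
```

Under this assumed relation, raising QP by 6 halves the quantization precision, which is why QP acts as the main control for the rate and distortion trade-off.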
Regarding the determination of the first Lagrangian multiplier, in some embodiments, for S301, determining the first Lagrangian multiplier according to the pre-parameter may include:
determining first calculation model parameters, wherein the first calculation model represents the correspondence between the first Lagrangian multiplier and the quantization parameter; and
determining the first Lagrangian multiplier according to the quantization parameter and the first calculation model.
It should be noted that, for the first calculation model, determining the first calculation model parameters may include:
in the first calculation model, setting the first Lagrangian multiplier equal to a weighted value of a power of the quantization parameter;
wherein the first calculation model parameters include a first exponent parameter indicating the power and a first weighting coefficient indicating the weighting.
With the first Lagrange multiplier denoted by $\lambda_{miou}$ and the quantization parameter denoted by QP, the first calculation model is given by
$$\lambda_{miou} = 2.30422 \times 10^{-8} \cdot QP^{6.3612072} \tag{5}$$
Here, Equation (5) is the first calculation model, which represents the correspondence between the first Lagrange multiplier and the quantization parameter. The first calculation model parameters may include the first exponent parameter (6.3612072 in the equation) and the first weighting coefficient ($2.30422 \times 10^{-8}$ in the equation).
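As a minimal sketch of Equation (5), assuming the preset values quoted in the text, the mapping from QP to the first Lagrange multiplier could be written as follows. The function name `lambda_miou_from_qp` and the constant names are illustrative and do not come from the application.

```python
# Preset parameters of the first calculation model, Equation (5).
FIRST_WEIGHT = 2.30422e-8    # first weighting coefficient
FIRST_EXPONENT = 6.3612072   # first exponent parameter


def lambda_miou_from_qp(qp: float) -> float:
    """Return FIRST_WEIGHT * qp ** FIRST_EXPONENT (hypothetical helper)."""
    if qp <= 0:
        raise ValueError("QP must be positive for the power-law model")
    return FIRST_WEIGHT * qp ** FIRST_EXPONENT


# QP values used as examples of preset quantization parameters in the text.
for qp in (22, 27, 32, 37):
    print(qp, lambda_miou_from_qp(qp))
```

Because the exponent is positive, the multiplier grows monotonically with QP, so coarser quantization places more weight on the rate term.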
Further, the first calculation model parameters may be preset values, or may be obtained by fitting a large amount of experimental test data from test videos; no limitation is imposed here.
Optionally, in some embodiments, the method may further include:
setting the first calculation model parameters to preset values.
Here, for the first calculation model parameters, the first exponent parameter may be set to 6.3612072 and the first weighting coefficient may be set to $2.30422 \times 10^{-8}$. Once the first exponent parameter and the first weighting coefficient are determined, the first calculation model is obtained, so that the first Lagrange multiplier can be determined from the quantization parameter.
Optionally, in some embodiments, the method may further include:
based on a test video, using the first distortion metric criterion to determine a first relationship function between the first distortion value and the bit rate of the test video, and differentiating the first relationship function to determine its derivative function; and
based on the test video, determining a second relationship function between the bit rate and the quantization parameter, and determining the first calculation model parameters according to the derivative function and the second relationship function.
It should be noted that the test video in the embodiments of the present application may be one or more test videos; for example, it may be multiple (e.g., 59) test video sequences from the large-scale urban scene dataset Cityscapes.
In this way, by applying the first distortion metric criterion to the test video, the first relationship function between the first distortion value and the bit rate of the test video can be determined. The first relationship function is
$$D_{miou} = 0.2299 \cdot R^{-0.7553} + 3.848 \tag{6}$$
where $D_{miou}$ denotes the first distortion value and R denotes the bit rate.
It should be noted that, on the RD curve shown in FIG. 1, $\lambda_{miou}$ is the magnitude of the slope of the tangent to the curve ($\lambda_{miou} > 0$); that is, it equals the negative of the derivative of the curve. Based on the test video, the average bit rate of the reconstructed video under different quantization parameters can be computed from the encoded files. The functional relationship between the first distortion value ($D_{miou}$) and the bit rate (R) can then be determined by fitting a large amount of experimental test data; the fitted curve is shown in FIG. 4.
Differentiating Equation (6) with respect to R and taking the negative of the result (so that $\lambda_{miou} > 0$) yields the derivative function of the first relationship function, which represents the correspondence between the first Lagrange multiplier and the bit rate:
$$\lambda_{miou} = 0.17364347 \cdot R^{-1.7553} \tag{7}$$
In addition, based on the test video, a second relationship function between the bit rate (R) and the quantization parameter (QP) can also be determined by fitting a large amount of experimental test data; the fitted curve is shown in FIG. 5. The second relationship function is
$$R = 8278 \cdot QP^{-3.624} \tag{8}$$
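The application does not state which fitting procedure produced Equation (8). One common option for a power law $R = a \cdot QP^{b}$ is ordinary least squares in log-log space, sketched below as an assumption. The helper `fit_power_law` is hypothetical, and the sample points are synthetic values generated from Equation (8) itself rather than measured data.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b via a straight line in log-log space."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    b = sxy / sxx                        # slope in log-log space is the exponent
    a = math.exp(mean_y - b * mean_x)    # intercept gives log(a)
    return a, b


# Synthetic (QP, R) samples generated from Equation (8), for illustration only.
qps = [22, 27, 32, 37]
rates = [8278 * qp ** -3.624 for qp in qps]
a, b = fit_power_law(qps, rates)
print(a, b)
```

With real measurements the points would scatter around the curve and the recovered parameters would only approximate a power law; the synthetic data here simply confirms that the procedure recovers the parameters it was given.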
Substituting Equation (8) into Equation (7) gives the functional relationship between $\lambda_{miou}$ and QP, that is, the correspondence between the first Lagrange multiplier and the quantization parameter, which is the first calculation model of Equation (5). The first calculation model parameters are thereby determined, so that the first Lagrange multiplier can be determined from the quantization parameter.
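The substitution can be checked numerically. Writing Equation (7) as $\lambda_{miou} = c_7 R^{e_7}$ and Equation (8) as $R = c_8 QP^{e_8}$, the composition is $\lambda_{miou} = c_7 c_8^{e_7} QP^{e_7 e_8}$. The short sketch below (variable names are illustrative) evaluates the two resulting constants, which should reproduce the weighting coefficient and exponent of Equation (5) up to rounding.

```python
# Coefficients of Equation (7): lambda_miou = C7 * R ** E7
C7, E7 = 0.17364347, -1.7553
# Coefficients of Equation (8): R = C8 * QP ** E8
C8, E8 = 8278.0, -3.624

# Substituting (8) into (7): lambda_miou = C7 * C8**E7 * QP**(E8 * E7)
weight = C7 * C8 ** E7
exponent = E8 * E7

print(weight, exponent)
```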
It should be noted that the formula for the first Lagrange multiplier may also take other functional forms. For example, the relationship in Equation (8) may instead be fitted as an exponential function with base e, in which case the corresponding Equation (5), i.e. the formula for the first Lagrange multiplier, may also be an exponential function of QP with base e. No limitation is imposed here.
Regarding the determination of the second Lagrange multiplier, in some embodiments, for S301, determining the second Lagrange multiplier according to the pre-parameter may include:
determining the second Lagrange multiplier according to a preset third calculation model, where the third calculation model represents the correspondence between the second Lagrange multiplier and the quantization parameter.
It should be noted that the third calculation model may be constructed using the sum of squared errors (SSE) distortion criterion of the prior art. In some embodiments, determining the third calculation model parameters may include:
in the third calculation model, setting the second Lagrange multiplier equal to a weighted value of a power of 2;
where the third calculation model parameters include a third exponent parameter indicating the power and a third weighting coefficient indicating the weighting, and the third exponent parameter depends on the quantization parameter.
With the second Lagrange multiplier denoted by $\lambda_{SSE}$ and the quantization parameter denoted by QP, the third calculation model is given by
$$\lambda_{SSE} = 0.57 \cdot 2^{(QP-12)/3} \tag{9}$$
Here, Equation (9) is the third calculation model, which represents the correspondence between the second Lagrange multiplier and the quantization parameter. The third calculation model parameters may include the third exponent parameter ($(QP-12)/3$ in the equation) and the third weighting coefficient (0.57 in the equation), and the value of the third exponent parameter depends on the quantization parameter (QP).
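A minimal sketch of Equation (9), assuming the preset values 0.57 and $(QP-12)/3$ given in the text; the function name `lambda_sse_from_qp` is illustrative only.

```python
def lambda_sse_from_qp(qp: float) -> float:
    """Return the second Lagrange multiplier 0.57 * 2 ** ((qp - 12) / 3)."""
    return 0.57 * 2.0 ** ((qp - 12) / 3.0)


for qp in (22, 27, 32, 37):
    print(qp, lambda_sse_from_qp(qp))
```

In this model the multiplier doubles for every increase of 3 in QP and equals the weighting coefficient 0.57 at QP = 12.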
It should also be noted that the third calculation model parameters may be preset values, or may be obtained by fitting a large amount of experimental test data from test videos; no limitation is imposed here.
Further, both the first Lagrange multiplier and the second Lagrange multiplier depend on the quantization parameter. Regarding the determination of the quantization parameter, in some embodiments, determining the quantization parameter of the coding unit in the video to be encoded may include: determining the quantization parameter of the coding unit in the video to be encoded by means of rate control.
Alternatively, in some embodiments, determining the quantization parameter of the coding unit in the video to be encoded may include: setting the quantization parameter to a preset value.
In other words, for the quantization parameter of the video to be encoded, on the one hand, the quantization parameter may be set to a preset value, such as 22, 27, 32, or 37. On the other hand, the quantization parameter may be determined by rate control. Specifically, current rate control algorithms mainly control the bitstream by adjusting the quantization parameter; thus, by controlling the bit rate, the required quantization parameter can be obtained.
In another possible implementation, the pre-parameters of the video to be encoded may include a quantization parameter and a target bit rate.
In this case, for S301, determining the pre-parameters of the video to be encoded may include:
determining a quantization parameter and a target bit rate of a coding unit in the video to be encoded, where the coding unit may include at least one of the following: a picture, a slice, a sub-picture, a tile, and a coding block.
Here, the quantization parameter may be the quantization step size of the quantizer in the encoder, or the index value corresponding to that quantization step size.
Regarding the determination of the first Lagrange multiplier, in some embodiments, for S301, determining the first Lagrange multiplier according to the pre-parameters may include:
determining second calculation model parameters, where the second calculation model represents the correspondence between the first Lagrange multiplier and the bit rate; and
determining the first Lagrange multiplier according to the target bit rate and the second calculation model.
It should be noted that determining the target bit rate of the coding unit in the video to be encoded may include: determining the target bit rate of the coding unit in the video to be encoded by means of bit allocation.
In other words, the target bit rate of the coding unit in the video to be encoded can be obtained by bit allocation. Here, the target bit rate can be adjusted dynamically according to the number of bits consumed by the coding units in the video to be encoded, so as to ensure the real-time performance and accuracy of the bit allocation.
It should also be noted that, for the second calculation model, determining the second calculation model parameters may include:
in the second calculation model, setting the first Lagrange multiplier equal to a weighted value of a power of the target bit rate;
where the second calculation model parameters include a second exponent parameter indicating the power and a second weighting coefficient indicating the weighting.
With the first Lagrange multiplier denoted by $\lambda_{miou}$ and the bit rate denoted by R, the second calculation model is given by
$$\lambda_{miou} = 0.17364347 \cdot R^{-1.7553} \tag{10}$$
Here, Equation (10) is the second calculation model, which represents the correspondence between the first Lagrange multiplier and the bit rate. The second calculation model parameters may include the second exponent parameter (-1.7553 in the equation) and the second weighting coefficient (0.17364347 in the equation).
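A minimal sketch of Equation (10), assuming the preset values quoted in the text. The function name `lambda_miou_from_rate` is illustrative, and the unit of the target bit rate is not stated in this passage, so the caller is assumed to use the same unit as the fitted data.

```python
# Preset parameters of the second calculation model, Equation (10).
SECOND_WEIGHT = 0.17364347   # second weighting coefficient
SECOND_EXPONENT = -1.7553    # second exponent parameter


def lambda_miou_from_rate(target_rate: float) -> float:
    """Return SECOND_WEIGHT * target_rate ** SECOND_EXPONENT (hypothetical helper)."""
    if target_rate <= 0:
        raise ValueError("target bit rate must be positive")
    return SECOND_WEIGHT * target_rate ** SECOND_EXPONENT
```

Since the exponent is negative, a larger target bit rate yields a smaller multiplier, so the optimization weights distortion more heavily when more bits are available.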
Further, the second calculation model parameters may be preset values, or may be obtained by fitting a large amount of experimental test data from test videos; again, no limitation is imposed here.
Optionally, in some embodiments, the method may further include:
setting the second calculation model parameters to preset values.
Here, for the second calculation model parameters, the second exponent parameter may be set to -1.7553 and the second weighting coefficient may be set to 0.17364347. Once the second exponent parameter and the second weighting coefficient are determined, the second calculation model is obtained, so that the first Lagrange multiplier can be determined from the target bit rate.
Optionally, in some embodiments, the method may further include:
based on a test video, using the first distortion metric criterion to determine a first relationship function between the first distortion value and the bit rate of the test video; and
differentiating the first relationship function to determine the second calculation model parameters.
It should be noted that the test video here may also be one or more test videos; for example, it may be multiple (e.g., 59) test video sequences from the large-scale urban scene dataset Cityscapes.
In this way, by applying the first distortion metric criterion to the test video, the first relationship function between the first distortion value and the bit rate of the test video can be determined; this first relationship function is Equation (6) above. Differentiating Equation (6) then yields the derivative function of the first relationship function, which represents the correspondence between the first Lagrange multiplier and the bit rate. This gives the second calculation model of Equation (10) and thereby determines the second calculation model parameters, so that the first Lagrange multiplier can be determined from the target bit rate.
It should be noted that, when determining the first relationship function on the basis of the RD curve shown in FIG. 1, $\lambda_{miou}$ is the magnitude of the slope of the tangent to the curve ($\lambda_{miou} > 0$); that is, it equals the negative of the derivative of the curve. Based on the test video, the average bit rate of the reconstructed video under different quantization parameters can be computed from the encoded files. The functional relationship between the first distortion value ($D_{miou}$) and the bit rate (R) can then be determined by fitting a large amount of experimental test data, as shown by the fitted curve in FIG. 4, which gives the first relationship function.
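For reference, the step from Equation (6) to Equation (10) can be written out explicitly using only the constants given in the text:

$$
\begin{aligned}
D_{miou}(R) &= 0.2299 \cdot R^{-0.7553} + 3.848 \\
\frac{dD_{miou}}{dR} &= -0.7553 \times 0.2299 \cdot R^{-0.7553-1} = -0.17364347 \cdot R^{-1.7553} \\
\lambda_{miou} &= -\frac{dD_{miou}}{dR} = 0.17364347 \cdot R^{-1.7553}
\end{aligned}
$$

The constant offset 3.848 vanishes under differentiation, and the sign change reflects that the multiplier is the negative of the slope of the RD curve.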
Regarding the determination of the second Lagrange multiplier, in some embodiments, for S301, determining the second Lagrange multiplier according to the pre-parameters may include:
determining the second Lagrange multiplier according to a preset third calculation model, where the third calculation model represents the correspondence between the second Lagrange multiplier and the quantization parameter.
It should be noted that the third calculation model may be constructed using the sum of squared errors (SSE) distortion criterion of the prior art. In some embodiments, determining the third calculation model parameters may include:
in the third calculation model, setting the second Lagrange multiplier equal to a weighted value of a power of 2;
where the third calculation model parameters include a third exponent parameter indicating the power and a third weighting coefficient indicating the weighting, and the third exponent parameter depends on the quantization parameter.
With the second Lagrange multiplier denoted by $\lambda_{SSE}$ and the quantization parameter denoted by QP, the third calculation model is given by Equation (9) above.
Here, Equation (9) is the third calculation model, which represents the correspondence between the second Lagrange multiplier and the quantization parameter. The third calculation model parameters may include the third exponent parameter ($(QP-12)/3$ in the equation) and the third weighting coefficient (0.57 in the equation), and the value of the third exponent parameter depends on the quantization parameter (QP).
It should also be noted that the third calculation model parameters may be preset values, or may be obtained by fitting a large amount of experimental test data from test videos; no limitation is imposed here.
Further, both the first Lagrange multiplier and the second Lagrange multiplier depend on the quantization parameter. Regarding the determination of the quantization parameter, in some embodiments, determining the quantization parameter of the coding unit in the video to be encoded may include: determining the quantization parameter of the coding unit in the video to be encoded by means of rate control.
Alternatively, in some embodiments, determining the quantization parameter of the coding unit in the video to be encoded may include: setting the quantization parameter to a preset value.
In other words, for the quantization parameter of the video to be encoded, on the one hand, the quantization parameter may be set to a preset value, such as 22, 27, 32, or 37. On the other hand, the quantization parameter may be determined by rate control. Specifically, current rate control algorithms mainly control the bitstream by adjusting the quantization parameter; thus, by controlling the bit rate, the required quantization parameter can be obtained.
Thus, for the rate-distortion optimization algorithm of the embodiments of the present application, the distortion values must be determined in addition to the first and second Lagrange multipliers. Here, the value obtained using the first distortion metric criterion is called the first distortion value, and the value obtained using the second distortion metric criterion is called the second distortion value.
It should be noted that the first distortion metric criterion may be a semantic distortion metric criterion. Taking an encoder conforming to the H.266/VVC standard as an example, a semantic distortion metric must first be defined in order to improve the semantic segmentation accuracy of the reconstructed video. Specifically, multiple quantization parameters may be selected, and multiple (e.g., 59) test video sequences from the large-scale urban scene dataset Cityscapes may then be VVC-encoded under the Random Access (RA) configuration. Video semantic segmentation is performed on the videos before and after encoding, so that the accuracy of the semantic segmentation results can be computed from the corresponding annotation data.
The accuracy of semantic segmentation is commonly expressed as the mean Intersection over Union (mIoU), which is the average of the Intersection over Union (IoU) over all categories. Here, IoU serves as the evaluation function: put simply, it is the overlap ratio between the predicted region and the ground-truth region, i.e. the ratio of the intersection of the detection result and the ground truth to their union. This ratio is the semantic accuracy (denoted by IoU).
In one possible implementation, for S303, determining the first distortion value according to the first distortion metric criterion may include:
based on a test video, performing semantic segmentation on the test video to determine the semantic accuracy of one or more categories;
determining a target semantic accuracy according to the semantic accuracy of the one or more categories; and
applying a fourth calculation model to the target semantic accuracy as a distortion metric to obtain the first distortion value.
Further, determining the target semantic accuracy according to the semantic accuracy of the one or more categories may include:
computing a weighted sum of the semantic accuracies of the one or more categories, and taking the resulting weighted sum as the target semantic accuracy.
It should be noted that, in one specific implementation of the weighted sum, the weights may all be set to 1; this amounts to computing the average of the semantic accuracies of the one or more categories, and the resulting average is taken as the target semantic accuracy.
For example, in a semantic segmentation scenario, two sets may represent the predicted values and the ground-truth values respectively: $A_{pred}$ is the predicted segmentation region and $A_{true}$ is the annotated segmentation region. The semantic accuracy of each category can be expressed as an IoU, computed as
$$IoU = \frac{|A_{pred} \cap A_{true}|}{|A_{pred} \cup A_{true}|} \tag{11}$$
After the IoU of each category is obtained, the target semantic accuracy, denoted by mIoU, can be obtained by averaging. Here, mIoU is the average IoU over all categories and ranges from 0 to 1; a larger value indicates higher semantic accuracy. For the IoUs of n categories, mIoU is computed as
$$mIoU = \frac{1}{n} \sum_{i=1}^{n} IoU_i \tag{12}$$
where $IoU_i$ denotes the IoU of the i-th category, i = 1, …, n, and n is the total number of categories.
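The per-category IoU and its average can be sketched as follows, treating each label map as a flat list of per-pixel category ids. The helper names and the toy label maps are illustrative. Skipping a category that is absent from both maps is an assumption made here to avoid division by zero; the text does not address that case.

```python
def iou_per_class(pred, true, class_id):
    """IoU for one class: |A_pred ∩ A_true| / |A_pred ∪ A_true| over pixel indices."""
    a_pred = {i for i, label in enumerate(pred) if label == class_id}
    a_true = {i for i, label in enumerate(true) if label == class_id}
    union = a_pred | a_true
    if not union:
        return None  # class absent from both maps (assumption: skip it)
    return len(a_pred & a_true) / len(union)


def mean_iou(pred, true, class_ids):
    """Average of the per-class IoU values over the classes that are present."""
    scores = [iou_per_class(pred, true, c) for c in class_ids]
    scores = [s for s in scores if s is not None]
    if not scores:
        raise ValueError("no class is present in either label map")
    return sum(scores) / len(scores)


# Toy flattened label maps (hypothetical data, six pixels, three classes).
pred = [0, 0, 1, 1, 2, 2]
true = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, true, class_ids=[0, 1, 2]))
```

In this toy case the per-class values are 1/3, 2/3, and 1/2, so the mean is 0.5.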
It should also be noted that, for the fourth calculation model, applying the fourth calculation model to the target semantic accuracy as a distortion metric to obtain the first distortion value includes:
determining fourth calculation model parameters, where the fourth calculation model represents the correspondence between the first distortion value and the target semantic accuracy; and
obtaining the first distortion value according to the target semantic accuracy and the fourth calculation model.
Further, determining the fourth calculation model parameters may include:
in the fourth calculation model, setting the first distortion value equal to a weighted value of the logarithm of the target semantic accuracy;
where the fourth calculation model parameters include a base parameter indicating the base of the logarithm and a fourth weighting coefficient parameter indicating the weighting.
The fourth calculation model parameters may be set to preset values.
Here, for the video semantic segmentation scenario, a semantic distortion metric (i.e. the first distortion value, denoted by $D_{miou}$) can be defined as
$$D_{miou} = -10 \cdot \ln(mIoU) \tag{13}$$
Equation (13) is the fourth calculation model, which represents the correspondence between the first distortion value ($D_{miou}$) and the target semantic accuracy (mIoU). The fourth calculation model parameters may include the base parameter (the base of the logarithm in the equation; the source text gives this as 10, although ln denotes the natural logarithm with base e) and the fourth weighting coefficient parameter (-10 in the equation). The fourth weighting coefficient parameter can also be regarded as a preset scaling factor applied to ln(mIoU).
It should also be noted that the fourth calculation model parameters may be preset values, or may be obtained by fitting a large amount of experimental test data from test videos; no limitation is imposed here.
In addition, according to Equation (13), the natural logarithm maps the bounded mIoU value onto an unbounded range, and the multiplying coefficient scales the result to match the magnitude of the distortion in the rate-distortion optimization algorithm. Thus, as mIoU approaches 0, $D_{miou}$ approaches infinity; as mIoU approaches 1, $D_{miou}$ approaches 0.
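A minimal sketch of Equation (13), using the natural logarithm as written in the equation. The function name `semantic_distortion` and its `scale` parameter are illustrative; the scale defaults to the magnitude 10 of the weighting coefficient quoted in the text.

```python
import math


def semantic_distortion(miou: float, scale: float = 10.0) -> float:
    """Return D_miou = -scale * ln(mIoU) for mIoU in (0, 1]."""
    if not 0.0 < miou <= 1.0:
        raise ValueError("mIoU must lie in (0, 1]")
    return -scale * math.log(miou)


for miou in (0.9, 0.7, 0.5, 0.1):
    print(miou, semantic_distortion(miou))
```

The mapping is monotonically decreasing, so any loss in segmentation accuracy increases the distortion, and the increase becomes steeper as mIoU approaches 0.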
In another possible implementation, the first distortion value may also be related to a target mean squared error of the coding unit in the video to be encoded. Determining the first distortion value according to the first distortion metric criterion may include:
determining fifth calculation model parameters, where the fifth calculation model represents a third relationship function between the first distortion value and the mean squared error; and
determining the target mean squared error of the coding unit in the video to be encoded, and determining the first distortion value according to the target mean squared error and the fifth calculation model.
It should be noted that the reconstructed video is obtained by decoding and reconstructing the encoded video. After the test video is encoded with a given quantization parameter to obtain the encoded video under that quantization parameter, the encoded video can be reconstructed to obtain the reconstructed video under that quantization parameter. From the reconstructed video and the original video, the mean squared error (MSE) of the reconstructed video under that quantization parameter can be obtained. Here, the mean squared error is the expected value of the squared difference between a predicted value and the true value. The MSE measures the degree of deviation in the data: the smaller the MSE, the more accurately the prediction describes the experimental data.
It should also be noted that, for the fifth calculation model, determining the fifth calculation model parameters may include:
in the fifth calculation model, setting the first distortion value equal to the product of the target mean squared error and a first parameter factor, plus a second parameter factor;
where the fifth calculation model parameters include the first parameter factor and the second parameter factor.
With the first distortion value denoted by $D_{miou}$ and the mean squared error denoted by MSE, the fifth calculation model is given by
$$D_{miou} = 0.6276 \cdot MSE + 3.48 \tag{14}$$
Here, Equation (14) is the fifth calculation model, which represents the correspondence between the first distortion value and the mean squared error. The fifth calculation model parameters may include the first parameter factor (0.6276 in the equation) and the second parameter factor (3.48 in the equation).
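A minimal sketch of the fifth calculation model, assuming the preset factors of Equation (14). The helper `mean_squared_error` operates on flat lists of sample values and, like `distortion_from_mse`, is a hypothetical name introduced for illustration; how the encoder obtains the target MSE of a coding unit is not specified in this passage.

```python
def mean_squared_error(original, reconstructed):
    """Average squared difference between original and reconstructed samples."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    return total / len(original)


def distortion_from_mse(mse: float) -> float:
    """Return D_miou = 0.6276 * MSE + 3.48, Equation (14)."""
    return 0.6276 * mse + 3.48


# Toy sample values (hypothetical data).
original = [100, 102, 98, 101]
reconstructed = [101, 100, 98, 103]
print(distortion_from_mse(mean_squared_error(original, reconstructed)))
```

Because the model is linear in MSE, the semantic distortion can be estimated from a pixel-domain error alone, without running semantic segmentation during encoding.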
Further, the fifth calculation model parameters may be preset values, or may be obtained by fitting a large amount of experimental data from test videos; no limitation is imposed here.
Optionally, in some embodiments, the method may further include:
setting the fifth calculation model parameters to preset values.
Here, for the fifth calculation model parameters, the first parameter factor may be set to 0.6276 and the second parameter factor may be set to 3.48. Once the first parameter factor and the second parameter factor are determined, the fifth calculation model is obtained, so that the first distortion value can be determined from the target mean squared error.
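As a small sketch of formula (14) with the preset parameter factors, the mapping from a target MSE to the first distortion value could look like this. The function and constant names are hypothetical; only the two numeric factors come from the text.

```python
# Preset parameter factors quoted in formula (14).
FIRST_PARAMETER_FACTOR = 0.6276
SECOND_PARAMETER_FACTOR = 3.48


def first_distortion_from_mse(target_mse,
                              first_factor=FIRST_PARAMETER_FACTOR,
                              second_factor=SECOND_PARAMETER_FACTOR):
    """Fifth calculation model: D_miou = first_factor * MSE + second_factor."""
    if target_mse < 0:
        raise ValueError("a mean squared error cannot be negative")
    return first_factor * target_mse + second_factor


d_miou = first_distortion_from_mse(10.0)  # 0.6276 * 10 + 3.48
```

Keeping the factors as parameters lets the same function serve both the preset case and the case where the factors are fitted from test data.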
Optionally, in some embodiments, the method may further include:
determining, based on test videos and using the first distortion metric criterion, a third relationship function between the first distortion value and the mean squared error of the test videos;
determining the fifth calculation model parameters according to the third relationship function.
It should be noted that the test video here may be one or more test videos; for example, it may be multiple (for example, 59) test video sequences from the large-scale urban scene dataset Cityscapes.
In this way, the first distortion metric criterion is applied to the test videos, and the average MSE of the reconstructed video under each quantization parameter is collected from the encoded files. The functional relationship between the first distortion value (D_miou) and the MSE can then be determined by fitting this large body of experimental data. The fitted curve is shown in FIG. 6; it is linear, and it yields the fifth calculation model.
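The text does not say which fitting procedure is used. Assuming an ordinary least-squares line fit, the two parameter factors could be estimated as in the sketch below. The sample points are synthetic (generated from formula (14) itself purely to exercise the code) and are not experimental data from the application.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic (MSE, D_miou) pairs lying exactly on the line of formula (14).
mse_samples = [5.0, 10.0, 20.0, 40.0]
d_miou_samples = [0.6276 * m + 3.48 for m in mse_samples]
slope, intercept = fit_line(mse_samples, d_miou_samples)
```

With real measurements the points would scatter around the line, and the fitted slope and intercept would play the roles of the first and second parameter factors.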
It should be noted that determining the pre-parameters of the video to be encoded may further include: determining the target mean squared error of a coding unit in the video to be encoded. Then, once the fifth calculation model is obtained, the first distortion value can be determined from the target mean squared error and the fifth calculation model of formula (14).
In addition, the first distortion value may be determined in other ways, for example from the difference between the semantic segmentation results of the video before and after encoding. Here, given the semantic segmentation result of the video before encoding (a first mIoU value) and that of the video after encoding (a second mIoU value), the first distortion value is determined from the difference between the first mIoU value and the second mIoU value; this is not specifically limited in the embodiments of this application.
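The mIoU-difference alternative can be illustrated with a toy computation. Everything here is an assumption for the sake of the example: the text does not specify how mIoU is computed, so the sketch uses the common per-class intersection-over-union averaged over the classes that occur, measures both segmentations against a hypothetical reference labelling, and uses made-up label maps.

```python
def mean_iou(predicted, reference, num_classes):
    """mIoU of two flat label sequences; classes absent from both are skipped."""
    if len(predicted) != len(reference):
        raise ValueError("label maps must have the same size")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, r in zip(predicted, reference) if p == c and r == c)
        union = sum(1 for p, r in zip(predicted, reference) if p == c or r == c)
        if union > 0:
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either label map")
    return sum(ious) / len(ious)


# Hypothetical 6-pixel label maps with 3 classes.
reference = [0, 0, 1, 1, 2, 2]
seg_before = [0, 0, 1, 1, 2, 2]  # segmentation of the video before encoding
seg_after = [0, 0, 1, 2, 2, 2]   # segmentation of the video after encoding

first_miou = mean_iou(seg_before, reference, 3)
second_miou = mean_iou(seg_after, reference, 3)
first_distortion = first_miou - second_miou  # drop in segmentation accuracy
```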
It should also be noted that the second distortion metric criterion may be a numerical error criterion. In some embodiments, for S303, determining the second distortion value according to the second distortion metric criterion may include:
determining a reconstructed value of a coding unit in the video, wherein the coding unit includes at least one of the following: a picture, a slice, a sub-picture, a tile, and a coding block;
determining the second distortion value from the reconstructed value and the original value of the coding unit, based on the numerical error criterion;
wherein the numerical error criterion is one of the following: the sum of absolute differences (SAD) criterion, the mean absolute deviation (MAD) criterion, the sum of squared errors (SSE) criterion, and the mean squared error (MSE) criterion. It should be noted that the numerical error criterion is not limited to these and may be another criterion; this is not specifically limited in the embodiments of this application.
It should be noted that, taking the SSE criterion as the numerical error criterion, the second distortion value is denoted SSE and is calculated as
$$\mathrm{SSE} = \sum_{x=1}^{M} \sum_{y=1}^{N} \left[ f(x,y) - g(x,y) \right]^{2} \tag{15}$$
where M and N denote the horizontal and vertical spatial resolution of the video, respectively, f(x, y) denotes the original pixel value at pixel position (x, y), and g(x, y) denotes the reconstructed pixel value at pixel position (x, y).
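The four numerical error criteria named above differ only in how the per-pixel differences f(x, y) - g(x, y) are aggregated. A short sketch, using the standard definitions and a hypothetical 2x2 block, shows them side by side; the function name and sample values are illustrative only.

```python
def numerical_errors(original, reconstructed):
    """Return SAD, MAD, SSE and MSE for two equally sized 2-D sample arrays."""
    if len(original) != len(reconstructed):
        raise ValueError("arrays must have the same number of rows")
    diffs = []
    for row_f, row_g in zip(original, reconstructed):
        if len(row_f) != len(row_g):
            raise ValueError("rows must have the same length")
        diffs.extend(f - g for f, g in zip(row_f, row_g))
    if not diffs:
        raise ValueError("arrays must not be empty")
    n = len(diffs)
    sad = sum(abs(d) for d in diffs)   # sum of absolute differences
    sse = sum(d * d for d in diffs)    # sum of squared errors
    return {"SAD": sad, "MAD": sad / n, "SSE": sse, "MSE": sse / n}


# Hypothetical original block f and reconstructed block g.
f_block = [[100, 102], [98, 101]]
g_block = [[101, 102], [96, 101]]
errors = numerical_errors(f_block, g_block)
```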
It can be understood that, once the first Lagrange multiplier (λ_miou), the second Lagrange multiplier (λ_SSE), the first distortion value (D_miou), and the second distortion value (SSE) are obtained, the target Lagrange multiplier (denoted λ) can be calculated from λ_miou and λ_SSE, and the target distortion value (denoted D) can be calculated from D_miou and SSE.
In some embodiments, for S302, determining the target Lagrange multiplier according to the first Lagrange multiplier and the second Lagrange multiplier may include:
determining a first preset parameter, wherein the first preset parameter controls the weights of the first Lagrange multiplier and the second Lagrange multiplier;
weighting the first Lagrange multiplier and the second Lagrange multiplier using the first preset parameter to obtain the target Lagrange multiplier.
It should be noted that the first preset parameter controls the weights of the first Lagrange multiplier and the second Lagrange multiplier. Specifically, in some embodiments, determining the first preset parameter may include:
setting the first preset parameter according to configuration information of the encoder.
Further, the method may also include:
when the configuration information of the encoder indicates that the first preset parameter is equal to k, setting the target Lagrange multiplier equal to the weighted sum of the first Lagrange multiplier and the second Lagrange multiplier, where k is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first Lagrange multiplier is set equal to 1 - k, and the weighting coefficient of the second Lagrange multiplier is set equal to k.
It should be noted that, with the first preset parameter denoted k, when the weighting coefficient of the second Lagrange multiplier is set to k, the weighting coefficient of the first Lagrange multiplier may be set to 1 - k; the target Lagrange multiplier is then calculated as
$$\lambda = k \cdot \lambda_{SSE} + (1-k) \cdot \lambda_{miou} \tag{16}$$
where λ denotes the target Lagrange multiplier, λ_miou denotes the first Lagrange multiplier, and λ_SSE denotes the second Lagrange multiplier; 1 - k and k are the weighting coefficients of the first and second Lagrange multipliers, respectively.
It should be noted that k may be a constant in the range 0 to 1. The value of k may be 0.5 or 0.75, or it may be variable (for example, obtained by some calculation on the current coding unit); this is not specifically limited in the embodiments of this application. A typical value of k is 0.75.
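Formula (16) can be sketched directly. The function name and the sample multiplier values are hypothetical; the default k = 0.75 is the typical value mentioned in the text.

```python
def target_lagrange_multiplier(lambda_miou, lambda_sse, k=0.75):
    """Formula (16): lambda = k * lambda_SSE + (1 - k) * lambda_miou."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in the range [0, 1]")
    return k * lambda_sse + (1.0 - k) * lambda_miou


# Hypothetical multiplier values, chosen only to make the arithmetic visible.
lam = target_lagrange_multiplier(lambda_miou=8.0, lambda_sse=40.0)  # 30 + 2
```

Because the two weights sum to 1, the target multiplier always lies between the first and second Lagrange multipliers.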
In some embodiments, for S304, determining the target distortion value according to the first distortion value and the second distortion value includes:
determining a second preset parameter, wherein the second preset parameter controls the weights of the first distortion value and the second distortion value;
weighting the first distortion value and the second distortion value using the second preset parameter to obtain the target distortion value.
It should be noted that the second preset parameter controls the weights of the first distortion value and the second distortion value. Specifically, in some embodiments, determining the second preset parameter may include:
setting the second preset parameter according to configuration information of the encoder.
Further, the method may also include:
when the configuration information of the encoder indicates that the second preset parameter is equal to m, setting the target distortion value equal to the weighted sum of the first distortion value and the second distortion value, where m is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first distortion value is set equal to 1 - m, and the weighting coefficient of the second distortion value is set equal to m.
It should be noted that, with the second preset parameter denoted m, when the weighting coefficient of the second distortion value is set to m, the weighting coefficient of the first distortion value may be set to 1 - m; the target distortion value is then calculated as
$$D = m \cdot \mathrm{SSE} + (1-m) \cdot D_{miou} \tag{17}$$
where D denotes the target distortion value, D_miou denotes the first distortion value, and SSE denotes the second distortion value; 1 - m and m are the weighting coefficients of the first and second distortion values, respectively.
It should be noted that m may be a constant in the range 0 to 1. The value of m may be 0.5 or 0.75, or it may be variable (for example, obtained by some calculation on the current coding unit); this is not specifically limited in the embodiments of this application. A typical value of m is 0.75.
It should also be noted that the first preset parameter and the second preset parameter may be set to the same value or to different values. In general they take the same value, in which case both can be denoted θ, and the target Lagrange multiplier and the target distortion value are calculated as
$$\lambda = \theta \cdot \lambda_{SSE} + (1-\theta) \cdot \lambda_{miou} \tag{18}$$
$$D = \theta \cdot \mathrm{SSE} + (1-\theta) \cdot D_{miou} \tag{19}$$
where θ is a constant in the range 0 to 1 that controls both the respective weights of the first and second Lagrange multipliers and the respective weights of the semantic distortion (the first distortion value) and the fidelity distortion (the second distortion value). Optionally, θ is usually set to 0.75.
In short, existing video coding methods are not well suited to machine-vision-oriented applications. In the embodiments of this application, λ_miou reflects machine-oriented quality, λ_SSE reflects the subjective quality perceived by the human eye, and θ allows a trade-off between the two. For example, if θ equals 1, the target distortion value reflects only the subjective quality perceived by the human eye; if θ equals 0, the target distortion value reflects only machine-oriented quality.
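The shared-parameter case of formulas (18) and (19), including the two endpoint behaviours just described, can be sketched as one helper. The function name and all numeric inputs are hypothetical values chosen for illustration.

```python
def combine_with_theta(theta, lambda_miou, lambda_sse, d_miou, sse):
    """Apply formulas (18) and (19) with one shared weight theta."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in the range [0, 1]")
    target_lambda = theta * lambda_sse + (1.0 - theta) * lambda_miou
    target_distortion = theta * sse + (1.0 - theta) * d_miou
    return target_lambda, target_distortion


# Hypothetical inputs for one coding unit.
args = dict(lambda_miou=8.0, lambda_sse=40.0, d_miou=12.0, sse=100.0)

human_only = combine_with_theta(1.0, **args)    # fidelity terms only
machine_only = combine_with_theta(0.0, **args)  # semantic terms only
mixed = combine_with_theta(0.75, **args)        # typical setting in the text
```

Using a single θ keeps the multiplier and the distortion on the same human/machine balance, which is the usual configuration according to the text.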
Here, the value of θ can be set through the configuration information of the encoder. One implementation is to set it directly according to application requirements, as in the cases of 0 and 1 described above. Another implementation is for the encoder to offer different working modes: if the encoder is set to the human-viewing mode, it sets θ to 1; if it is set to the machine mode, it sets θ to 0; if it is set to the hybrid human-machine mode, it determines θ adaptively, for example by pre-encoding the video to be encoded in a preprocessing stage and then estimating θ from the pre-encoding result.
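The mode-based selection of θ might be organized as below. This is only a sketch: the `WorkingMode` enum, the `estimate_theta` callback standing in for the pre-encoding analysis, and the clamping of the estimate to [0, 1] are assumptions, since the text does not describe how θ is estimated from the pre-encoding result.

```python
from enum import Enum


class WorkingMode(Enum):
    HUMAN = "human"      # video is viewed by people
    MACHINE = "machine"  # video is consumed by a vision task
    HYBRID = "hybrid"    # both consumers matter


def select_theta(mode, estimate_theta=None):
    """Map the encoder working mode to the weight theta."""
    if mode is WorkingMode.HUMAN:
        return 1.0
    if mode is WorkingMode.MACHINE:
        return 0.0
    if estimate_theta is None:
        raise ValueError("hybrid mode needs a theta estimator")
    # Hypothetical estimator, e.g. derived from a pre-encoding pass;
    # the result is clamped so that it stays a valid weight.
    return min(1.0, max(0.0, estimate_theta()))


theta_human = select_theta(WorkingMode.HUMAN)
theta_machine = select_theta(WorkingMode.MACHINE)
theta_hybrid = select_theta(WorkingMode.HYBRID, estimate_theta=lambda: 0.75)
```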
It should also be noted that, once the target Lagrange multiplier and the target distortion value are obtained, the coding parameters of the video to be encoded can be determined from them in order to encode the video. In some embodiments, for S305, determining the coding parameters of the video to be encoded using the target Lagrange multiplier and the target distortion value may include:
constructing a rate-distortion cost function based on the target Lagrange multiplier and the target distortion value;
pre-encoding the video to be encoded with one or more candidate coding parameters, and determining the rate-distortion cost value corresponding to each of the one or more candidate coding parameters;
selecting the minimum rate-distortion cost value from the determined rate-distortion cost values, and determining the candidate coding parameters corresponding to the minimum rate-distortion cost value as the coding parameters of the video to be encoded.
Here, the coding parameters include at least a parameter indicating how the video to be encoded is partitioned and a parameter for constructing the prediction value of a coding block in the video to be encoded.
Further, in some embodiments, encoding the video to be encoded may include: writing the coding parameters into the bitstream.
It should be pointed out that a rate-distortion cost function can be constructed from the target Lagrange multiplier and the target distortion value. The video to be encoded is then pre-encoded with one or more candidate coding parameters to determine the rate-distortion cost value corresponding to each of them. The minimum rate-distortion cost value is selected, and the candidate coding parameters corresponding to it are determined as the coding parameters of the video to be encoded; these are the optimal coding parameters (those with the lowest rate-distortion cost), and encoding proceeds with them. In this process, the coding parameters can also be written into the bitstream and transmitted from the encoder to the decoder, so that the decoder can reconstruct the video.
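The selection step can be sketched as follows. The excerpt does not spell out the form of the cost function, so the sketch assumes the conventional Lagrangian cost J = D + λ·R, with D the target distortion value, λ the target Lagrange multiplier, and R the bit cost measured during pre-encoding. The candidate names and numbers are invented for illustration.

```python
def select_coding_parameters(candidates, target_lambda):
    """Pick the candidate with the minimum cost J = D + lambda * R.

    candidates: iterable of (name, target_distortion, rate_in_bits),
    where distortion and rate would come from pre-encoding each candidate.
    """
    best = None
    for name, distortion, rate in candidates:
        cost = distortion + target_lambda * rate
        if best is None or cost < best[1]:
            best = (name, cost)
    if best is None:
        raise ValueError("at least one candidate is required")
    return best


# Hypothetical pre-encoding results for three candidate parameter sets.
candidates = [
    ("no_split_intra", 120.0, 40),
    ("quad_split_intra", 80.0, 70),
    ("no_split_inter", 95.0, 45),
]
best_name, best_cost = select_coding_parameters(candidates, target_lambda=1.0)
```

Here the costs are 160, 150 and 140, so the third candidate is selected even though it has neither the lowest distortion nor the lowest rate.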
In the embodiments of this application, taking a VVC standard encoder as an example, the aim is to improve the semantic accuracy of the reconstructed video while preserving fidelity and meeting subjective viewing needs in human-machine vision scenarios. To that end, the distortion criterion in the VVC rate-distortion optimization process is changed to a weighted combination of the semantic distortion D_miou and the fidelity distortion SSE, as in formula (17) or formula (19); correspondingly, the target Lagrange multiplier is changed to a weighted combination of λ_miou and λ_SSE, as in formula (16) or formula (18). With this rate-distortion optimization algorithm based on multiple distortion criteria for human-machine vision, the rate-distortion process of the VVC standard encoder can be optimized so that, at a given bit rate, the semantic segmentation accuracy of the reconstructed video improves while good fidelity is maintained.
For example, suppose the method is implemented on VTM 7.1, a version of the VVC reference software (VVC Test Model, VTM). Different QPs are then selected, the test video sequences from the large-scale urban scene dataset are encoded under the random access (RA) configuration, and semantic segmentation is tested on the reconstructed videos.
First, different QPs are selected and the test videos are encoded with the VVC standard encoder, which gives the coding bit rate and the PSNR of the reconstructed video; semantic segmentation is performed on the reconstructed video and the segmentation accuracy is calculated. Then the rate-distortion process in VVC is optimized according to the video coding method of the embodiments of this application, different QPs are again selected, and the test videos are encoded with the optimized encoder, which gives the coding bit rate; semantic segmentation is performed on the resulting reconstructed video and the segmentation accuracy is calculated. From the experimental results of these two cases, the BD-rate and BD-miou relative to the video reconstructed by the VVC standard encoder can be calculated, which measure the semantic accuracy of the video coding method of the embodiments of this application against the VVC standard encoder at the same bit rate.
Specifically, with QP values of 22, 27, 32, and 37, the BD-miou and BD-rate of the reconstructed video of the embodiments of this application relative to the reconstructed video of the VVC standard encoder are calculated from the experimental results to measure the semantic accuracy of the method; the results are shown in Table 1. Here, BD-miou represents the change in semantic accuracy of the reconstructed video at the same bit rate: BD-miou greater than 0 indicates improved semantic accuracy, and BD-miou less than 0 indicates reduced semantic accuracy. BD-rate represents the change in coding bit rate at the same semantic accuracy: BD-rate greater than 0 indicates an increased bit rate, and BD-rate less than 0 indicates a reduced bit rate, that is, improved coding efficiency.
In terms of video fidelity, the PSNR and coding bit rate of the reconstructed video are collected from the encoded files, and the BD-rate and BD-PSNR relative to the video reconstructed by the VVC standard encoder are obtained from the experimental results. These measure the fidelity of the video coding method of the embodiments of this application against the VVC standard encoder at the same bit rate.
Specifically, the BD-PSNR and BD-rate of the reconstructed video of the embodiments of this application relative to the reconstructed video of the VVC standard encoder are calculated from the experimental results to measure the fidelity of the method; the results are shown in Table 2. Here, BD-PSNR represents the change in fidelity of the reconstructed video at the same bit rate: BD-PSNR greater than 0 indicates increased fidelity, and BD-PSNR less than 0 indicates reduced fidelity. BD-rate represents the change in coding bit rate at the same fidelity: BD-rate greater than 0 indicates an increased bit rate, and BD-rate less than 0 indicates a reduced bit rate, that is, improved coding efficiency.
Table 1
            BD-miou    BD-rate
θ = 0.75    0.0112     -24.8673
Table 2
            BD-PSNR    BD-rate
θ = 0.75    0.0316     -1.0836
Further, according to the above experimental results, the video coding method of the embodiments of this application yields the following beneficial technical effects:
According to Table 1, the BD-miou obtained from the experiments is 0.0112, which shows that the video coding method of the embodiments of this application improves the semantic segmentation accuracy of the reconstructed video at the same bit rate. In addition, at lower bit rates, the overall semantic performance of the embodiments of this application is better than that of the VVC standard encoder.
Table 1 also shows that the BD-rate obtained from the experiments is -24.8673, which shows that the video coding method of the embodiments of this application reduces the coding bit rate at the same semantic accuracy.
According to Table 2, the BD-PSNR obtained from the experiments is 0.0316, which shows that the video coding method of the embodiments of this application improves the fidelity of the reconstructed video at the same bit rate.
Table 2 also shows that the BD-rate obtained from the experiments is -1.0836, which shows that the video coding method of the embodiments of this application reduces the coding bit rate at the same fidelity.
In addition, with the video coding method of the embodiments of this application, the PSNR of the reconstructed video shows essentially no degradation compared with the VVC standard encoder. At lower bit rates, the subjective performance of the embodiments of this application is better than that of VVC, which shows that the method preserves the fidelity of the reconstructed video while improving semantic accuracy, and so meets subjective viewing needs.
It should also be noted that the video coding method of the embodiments of this application optimizes the rate-distortion process in VVC without changing the video encoding and decoding process or the bitstream structure, and therefore does not increase codec complexity. Moreover, the method can also reduce the coding bit rate of the video, thereby shortening the time required for encoding and increasing encoding speed.
That is to say, for video semantic segmentation as a human-machine vision application scenario, the embodiments of this application define a semantic distortion metric and derive the corresponding first Lagrange multiplier. Preset parameters (the first preset parameter and the second preset parameter) adjust the weights of the semantic distortion and the SSE distortion, and the weights of the first and second Lagrange multipliers. The rate-distortion process of video coding is thereby optimized so that, at a given bit rate, the semantic segmentation accuracy of the reconstructed video improves while good fidelity is maintained.
This embodiment provides a video coding method applied to an encoder. Pre-parameters of the video to be encoded are determined, and a first Lagrange multiplier and a second Lagrange multiplier are determined according to the pre-parameters; a target Lagrange multiplier is determined according to the first Lagrange multiplier and the second Lagrange multiplier; a first distortion value is determined according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; a second distortion value is determined according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; a target distortion value is determined according to the first distortion value and the second distortion value; and the coding parameters of the video to be encoded are determined using the target Lagrange multiplier and the target distortion value, and the video to be encoded is encoded. In this way, rate-distortion optimization in video coding jointly considers the first distortion metric criterion, based on a semantic distortion metric, and the second distortion metric criterion, based on a numerical error metric. The method is therefore well suited to machine vision and human-machine vision application scenarios; at a given bit rate, it improves the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity, and thus also improves coding efficiency.
Based on the same inventive concept as the foregoing embodiments, refer to FIG. 7, which shows a schematic structural diagram of an encoder 70 provided by an embodiment of this application. As shown in FIG. 7, the encoder 70 may include a determining unit 701, a calculation unit 702, and an encoding unit 703, wherein:
the determining unit 701 is configured to determine pre-parameters of the video to be encoded and to determine a first Lagrange multiplier and a second Lagrange multiplier according to the pre-parameters;
the calculation unit 702 is configured to determine a target Lagrange multiplier according to the first Lagrange multiplier and the second Lagrange multiplier;
the determining unit 701 is further configured to determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion, and to determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
the calculation unit 702 is further configured to determine a target distortion value according to the first distortion value and the second distortion value;
the encoding unit 703 is configured to determine the coding parameters of the video to be encoded using the target Lagrange multiplier and the target distortion value, and to encode the video to be encoded.
In some embodiments, the pre-parameters include a quantization parameter, and the determining unit 701 is further configured to determine the quantization parameter of a coding unit in the video to be encoded, where the coding unit includes at least one of the following: a picture, a slice, a sub-picture, a tile, or a coding block.
In some embodiments, the determining unit 701 is further configured to determine first calculation model parameters, where the first calculation model represents the correspondence between the first Lagrangian multiplier and the quantization parameter; and to determine the first Lagrangian multiplier according to the quantization parameter and the first calculation model.
In some embodiments, the determining unit 701 is further configured to, in the first calculation model, set the first Lagrangian multiplier equal to a weighted value of an exponential power of the quantization parameter, where the first calculation model parameters include a first exponent parameter indicating the exponential power and a first weighting coefficient indicating the weighting.
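As an illustration of the first calculation model, the following is a minimal sketch that assumes the "weighted exponential power" takes the form λ₁ = a · QP^b. This reading of the wording, the function name, and the numeric parameters are all assumptions made for illustration; the text above does not fix the exact functional form or any parameter values.

```python
def first_lagrangian_multiplier(qp: float, weight: float, exponent: float) -> float:
    """Hypothetical first calculation model: lambda_1 = weight * qp ** exponent.

    `weight` plays the role of the first weighting coefficient and
    `exponent` the role of the first exponent parameter.
    """
    if qp <= 0:
        raise ValueError("this sketch assumes a positive quantization parameter")
    return weight * qp ** exponent


# Placeholder parameter values chosen only to make the example concrete.
lambda_1 = first_lagrangian_multiplier(qp=32, weight=0.5, exponent=2.0)
print(lambda_1)
```

With preset parameters the model reduces to a single evaluation per coding unit, so it adds negligible cost to the encoder's decision loop.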
In some embodiments, the determining unit 701 is further configured to set the first calculation model parameters to preset values.
In some embodiments, the determining unit 701 is further configured to: determine, based on a test video and using the first distortion metric criterion, a first relationship function between the first distortion value and the bit rate of the test video; differentiate the first relationship function to obtain its derivative function; determine, based on the test video, a second relationship function between the bit rate and the quantization parameter; and determine the first calculation model parameters according to the derivative function and the second relationship function.
In some embodiments, the pre-parameters include a quantization parameter and a target bit rate, and the determining unit 701 is further configured to determine the quantization parameter and the target bit rate of a coding unit in the video to be encoded, where the coding unit includes at least one of the following: a picture, a slice, a sub-picture, a tile, or a coding block.
In some embodiments, the determining unit 701 is further configured to determine second calculation model parameters, where the second calculation model represents the correspondence between the first Lagrangian multiplier and the bit rate; and to determine the first Lagrangian multiplier according to the target bit rate and the second calculation model.
In some embodiments, the determining unit 701 is further configured to, in the second calculation model, set the first Lagrangian multiplier equal to a weighted value of an exponential power of the target bit rate, where the second calculation model parameters include a second exponent parameter indicating the exponential power and a second weighting coefficient indicating the weighting.
In some embodiments, the determining unit 701 is further configured to set the second calculation model parameters to preset values.
In some embodiments, the determining unit 701 is further configured to determine, based on a test video and using the first distortion metric criterion, a first relationship function between the first distortion value and the bit rate of the test video; and to differentiate the first relationship function to determine the second calculation model parameters.
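The derivation of the second calculation model parameters can be sketched as follows, under the assumption that the first relationship function is a power law D₁(R) = c · R^k fitted to test-video measurements. With the usual convention λ = −dD/dR, differentiating gives λ₁ = (−c·k) · R^(k−1), so the second weighting coefficient is −c·k and the second exponent parameter is k − 1. The power-law form, the function names, and the sample data are illustrative assumptions and are not taken from the application.

```python
import math


def fit_power_law(rates, distortions):
    """Least-squares fit of D = c * R**k in log-log space; returns (c, k)."""
    xs = [math.log(r) for r in rates]
    ys = [math.log(d) for d in distortions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    c = math.exp(mean_y - k * mean_x)
    return c, k


def second_model_parameters(rates, distortions):
    """Differentiate the fitted D(R): lambda_1 = -dD/dR = (-c*k) * R**(k-1).

    Returns (weighting coefficient, exponent parameter).
    """
    c, k = fit_power_law(rates, distortions)
    return -c * k, k - 1.0


def first_lagrangian_multiplier_from_rate(target_rate, weight, exponent):
    """Hypothetical second calculation model: lambda_1 = weight * R**exponent."""
    return weight * target_rate ** exponent


# Synthetic (rate, semantic distortion) points lying on D = 5 * R**-0.5.
rates = [0.1, 0.2, 0.4, 0.8]
distortions = [5.0 * r ** -0.5 for r in rates]
weight, exponent = second_model_parameters(rates, distortions)
print(weight, exponent)  # approximately 2.5 and -1.5
```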
In some embodiments, the determining unit 701 is further configured to determine the target bit rate of the coding unit in the video to be encoded by means of bit allocation.
In some embodiments, the determining unit 701 is further configured to determine the second Lagrangian multiplier according to a preset third calculation model, where the third calculation model represents the correspondence between the second Lagrangian multiplier and the quantization parameter.
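The text does not state the third calculation model explicitly. As a hedged illustration only, the sketch below uses the exponential form λ = c · 2^((QP − 12) / 3) that is widely used in H.264/HEVC-style reference encoders to map a quantization parameter to a Lagrangian multiplier. Whether the application's preset model matches this form is an assumption, and the scale factor `c` is a placeholder.

```python
def second_lagrangian_multiplier(qp: float, scale: float = 0.85) -> float:
    """Illustrative QP-to-lambda mapping: scale * 2 ** ((qp - 12) / 3).

    The functional form is a common reference-encoder convention; `scale`
    is a placeholder and is not a value taken from the application.
    """
    return scale * 2.0 ** ((qp - 12.0) / 3.0)


# Lambda doubles for every increase of 3 in QP under this model.
for qp in (12, 15, 18):
    print(qp, second_lagrangian_multiplier(qp))
```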
In some embodiments, the determining unit 701 is further configured to determine the quantization parameter of the coding unit in the video to be encoded by means of rate control.
In some embodiments, the determining unit 701 is further configured to set the quantization parameter to a preset value.
In some embodiments, the determining unit 701 is further configured to perform semantic segmentation on a test video to determine the semantic accuracy of one or more categories, and to determine a target semantic accuracy according to the semantic accuracy of the one or more categories;
the calculation unit 702 is further configured to apply a fourth calculation model to the target semantic accuracy as a distortion measure to obtain the first distortion value.
In some embodiments, the calculation unit 702 is further configured to calculate a weighted sum of the semantic accuracies of the one or more categories, and to determine the resulting weighted sum as the target semantic accuracy.
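The weighted sum of per-category semantic accuracies can be sketched as below. The category names, accuracy values, and weights are hypothetical; the text does not say how the weights are chosen or whether they are normalized.

```python
def target_semantic_accuracy(accuracies: dict, weights: dict) -> float:
    """Weighted sum of per-category semantic accuracies.

    Both arguments map a category name to a number; every category with an
    accuracy must also have a weight.
    """
    missing = set(accuracies) - set(weights)
    if missing:
        raise KeyError(f"no weight given for categories: {sorted(missing)}")
    return sum(weights[name] * acc for name, acc in accuracies.items())


# Hypothetical per-category accuracies (for example, per-class IoU) and weights.
accuracies = {"road": 0.9, "person": 0.6}
weights = {"road": 0.5, "person": 0.5}
print(target_semantic_accuracy(accuracies, weights))
```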
In some embodiments, the determining unit 701 is further configured to determine fourth calculation model parameters, where the fourth calculation model represents the correspondence between the first distortion value and the target semantic accuracy; and to obtain the first distortion value according to the target semantic accuracy and the fourth calculation model.
In some embodiments, the determining unit 701 is further configured to, in the fourth calculation model, set the first distortion value equal to a weighted value of the logarithm of the target semantic accuracy, where the fourth calculation model parameters include a base parameter indicating the base of the logarithm and a fourth weighting coefficient parameter indicating the weighting.
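A minimal sketch of the fourth calculation model follows, assuming it has the form D₁ = w · log_b(accuracy). The negative example weight is an assumption chosen so that distortion decreases as accuracy rises toward 1; the text specifies only that the model has a base parameter and a weighting coefficient, not their values or signs.

```python
import math


def first_distortion_from_accuracy(accuracy: float, base: float, weight: float) -> float:
    """Hypothetical fourth calculation model: D1 = weight * log_base(accuracy)."""
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy is assumed to lie in (0, 1]")
    return weight * math.log(accuracy, base)


# With base 2 and weight -1, halving the accuracy adds one unit of distortion.
print(first_distortion_from_accuracy(0.5, base=2.0, weight=-1.0))
print(first_distortion_from_accuracy(1.0, base=2.0, weight=-1.0))
```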
In some embodiments, the determining unit 701 is further configured to set the fourth calculation model parameters to preset values.
In some embodiments, the determining unit 701 is further configured to determine fifth calculation model parameters, where the fifth calculation model represents a third relationship function between the first distortion value and the mean squared error; and to determine a target mean squared error of the coding unit in the video to be encoded, and determine the first distortion value according to the target mean squared error and the fifth calculation model.
In some embodiments, the determining unit 701 is further configured to, in the fifth calculation model, set the first distortion value equal to the product of the target mean squared error and a first parameter factor, plus a second parameter factor, where the fifth calculation model parameters include the first parameter factor and the second parameter factor.
In some embodiments, the determining unit 701 is further configured to set the fifth calculation model parameters to preset values.
In some embodiments, the determining unit 701 is further configured to determine, based on a test video and using the first distortion metric criterion, a third relationship function between the first distortion value and the mean squared error of the test video; and to determine the fifth calculation model parameters according to the third relationship function.
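The fifth calculation model is linear, D₁ = p₁ · MSE + p₂. The sketch below shows one way its two parameter factors might be obtained from test-video measurements, here by ordinary least squares, and then applied to a target mean squared error. The fitting method and the sample points are assumptions; the text states only that the parameters follow from the third relationship function.

```python
def fit_linear(mse_values, distortion_values):
    """Ordinary least-squares fit of D1 = p1 * MSE + p2; returns (p1, p2)."""
    n = len(mse_values)
    mean_x = sum(mse_values) / n
    mean_y = sum(distortion_values) / n
    sxx = sum((x - mean_x) ** 2 for x in mse_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(mse_values, distortion_values))
    p1 = sxy / sxx
    p2 = mean_y - p1 * mean_x
    return p1, p2


def first_distortion_from_mse(target_mse, p1, p2):
    """Fifth calculation model: D1 = p1 * MSE + p2."""
    return p1 * target_mse + p2


# Synthetic test-video points lying on D1 = 0.02 * MSE + 0.1.
mse_values = [10.0, 20.0, 30.0, 40.0]
distortion_values = [0.02 * x + 0.1 for x in mse_values]
p1, p2 = fit_linear(mse_values, distortion_values)
print(first_distortion_from_mse(25.0, p1, p2))  # approximately 0.6
```

Expressing the semantic distortion as a function of the mean squared error lets the encoder estimate it per coding unit without running semantic segmentation inside the decision loop.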
In some embodiments, the determining unit 701 is further configured to determine a reconstructed value of a coding unit in the video, where the coding unit includes at least one of the following: a picture, a slice, a sub-picture, a tile, or a coding block;
the calculation unit 702 is further configured to determine, based on the numerical error criterion, the second distortion value according to the reconstructed value and the original value of the coding unit, where the numerical error criterion is one of the following: a sum-of-absolute-errors criterion, a mean-absolute-error criterion, a sum-of-squared-errors criterion, or a mean-squared-error criterion.
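The four numerical error criteria listed above can be written out directly. The sketch treats a coding unit as a flat sequence of sample values, which is a simplification made for illustration; the sample values are hypothetical.

```python
def sum_abs_error(original, reconstructed):
    """Sum of absolute errors between original and reconstructed samples."""
    return sum(abs(o - r) for o, r in zip(original, reconstructed))


def mean_abs_error(original, reconstructed):
    """Mean absolute error."""
    return sum_abs_error(original, reconstructed) / len(original)


def sum_squared_error(original, reconstructed):
    """Sum of squared errors."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed))


def mean_squared_error(original, reconstructed):
    """Mean squared error."""
    return sum_squared_error(original, reconstructed) / len(original)


# Hypothetical samples of one coding block before and after reconstruction.
original = [10, 20, 30, 40]
reconstructed = [12, 18, 30, 44]
print(sum_abs_error(original, reconstructed))       # 8
print(mean_abs_error(original, reconstructed))      # 2.0
print(sum_squared_error(original, reconstructed))   # 24
print(mean_squared_error(original, reconstructed))  # 6.0
```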
In some embodiments, the determining unit 701 is further configured to determine a first preset parameter, where the first preset parameter is used to control the weights corresponding to the first Lagrangian multiplier and the second Lagrangian multiplier;
the calculation unit 702 is further configured to perform a weighted calculation on the first Lagrangian multiplier and the second Lagrangian multiplier using the first preset parameter to obtain the target Lagrangian multiplier.
In some embodiments, referring to FIG. 7, the encoder 70 may further include a configuration unit 704 configured to set the first preset parameter according to configuration information of the encoder.
In some embodiments, the configuration unit 704 is further configured to, when the configuration information of the encoder indicates that the first preset parameter is equal to k, set the target Lagrangian multiplier equal to the weighted sum of the first Lagrangian multiplier and the second Lagrangian multiplier, where k is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first Lagrangian multiplier is set equal to 1 - k, and the weighting coefficient of the second Lagrangian multiplier is set equal to k.
In some embodiments, the value of k is equal to 0.75.
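The weighting of the two Lagrangian multipliers described above can be sketched as follows. The weights 1 - k and k and the value k = 0.75 come from the text; the function name and the multiplier values in the example are hypothetical.

```python
def target_lagrangian_multiplier(lambda_1: float, lambda_2: float, k: float = 0.75) -> float:
    """Weighted sum (1 - k) * lambda_1 + k * lambda_2, with 0 <= k <= 1."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    return (1.0 - k) * lambda_1 + k * lambda_2


# Hypothetical multipliers: lambda_1 (semantic) = 100, lambda_2 (numerical) = 20.
print(target_lagrangian_multiplier(100.0, 20.0))          # 40.0
print(target_lagrangian_multiplier(100.0, 20.0, k=0.0))   # 100.0, semantic only
print(target_lagrangian_multiplier(100.0, 20.0, k=1.0))   # 20.0, numerical only
```

Setting k = 1 recovers a conventional encoder driven only by the numerical error criterion, while k = 0 optimizes purely for the semantic criterion.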
In some embodiments, the determining unit 701 is further configured to determine a second preset parameter, where the second preset parameter is used to control the weights corresponding to the first distortion value and the second distortion value;
the calculation unit 702 is further configured to perform a weighted calculation on the first distortion value and the second distortion value using the second preset parameter to obtain the target distortion value.
In some embodiments, the configuration unit 704 is further configured to set the second preset parameter according to the configuration information of the encoder.
In some embodiments, the configuration unit 704 is further configured to, when the configuration information of the encoder indicates that the second preset parameter is equal to m, set the target distortion value equal to the weighted sum of the first distortion value and the second distortion value, where m is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first distortion value is set equal to 1 - m, and the weighting coefficient of the second distortion value is set equal to m.
In some embodiments, the value of m is equal to 0.75.
In some embodiments, the determining unit 701 is further configured to: construct a rate-distortion cost function based on the target Lagrangian multiplier and the target distortion value; pre-encode the video to be encoded using one or more candidate encoding parameters to determine the rate-distortion cost value corresponding to each of the one or more candidate encoding parameters; and select the minimum rate-distortion cost value from the determined rate-distortion cost values, and determine the candidate encoding parameter corresponding to the minimum rate-distortion cost value as the encoding parameter of the video to be encoded.
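The selection step can be sketched as below, assuming the rate-distortion cost takes the conventional form J = D + λ · R, where D is the target distortion value (1 - m) · D₁ + m · D₂ and λ is the target Lagrangian multiplier. The cost form, the `Candidate` structure, and all numbers are illustrative assumptions; the rate and the two distortion values of each candidate are taken as already measured by pre-encoding.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Pre-encoding result for one candidate encoding parameter (hypothetical)."""
    name: str
    rate: float                  # bits spent by this candidate
    semantic_distortion: float   # first distortion value D1
    numerical_distortion: float  # second distortion value D2


def target_distortion(d1: float, d2: float, m: float = 0.75) -> float:
    """Weighted sum (1 - m) * D1 + m * D2, with 0 <= m <= 1."""
    return (1.0 - m) * d1 + m * d2


def rd_cost(c: Candidate, target_lambda: float, m: float = 0.75) -> float:
    """Assumed cost form: J = D_target + lambda_target * R."""
    d = target_distortion(c.semantic_distortion, c.numerical_distortion, m)
    return d + target_lambda * c.rate


def select_encoding_parameter(candidates, target_lambda: float, m: float = 0.75) -> Candidate:
    """Return the candidate with the minimum rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, target_lambda, m))


candidates = [
    Candidate("A", rate=100.0, semantic_distortion=8.0, numerical_distortion=40.0),
    Candidate("B", rate=60.0, semantic_distortion=12.0, numerical_distortion=60.0),
    Candidate("C", rate=30.0, semantic_distortion=40.0, numerical_distortion=80.0),
]
best = select_encoding_parameter(candidates, target_lambda=0.5)
print(best.name, rd_cost(best, 0.5))  # B 78.0
```

In this example the costs are 82 for A, 78 for B, and 85 for C, so B is selected even though it has neither the lowest rate nor the lowest distortion.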
In some embodiments, the encoding parameters include at least a parameter indicating the partitioning mode of the video to be encoded and a parameter for constructing prediction values of coding blocks in the video to be encoded.
In some embodiments, referring to FIG. 7, the encoder 70 may further include a writing unit 705 configured to write the encoding parameters into a bitstream.
It can be understood that, in the embodiments of the present application, a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may also be a module, or it may be non-modular. Moreover, the components in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or some of the steps of the method described in this embodiment. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Accordingly, an embodiment of the present application provides a computer storage medium applied to the encoder 70. The computer storage medium stores a computer program that, when executed by at least one processor, implements the steps of the method described in any one of the foregoing embodiments.
Based on the composition of the encoder 70 and the computer storage medium described above, refer to FIG. 8, which shows an example of a specific hardware structure of the encoder 70 provided by an embodiment of the present application. The encoder 70 may include a communication interface 801, a memory 802, and a processor 803, with the components coupled together through a bus system 804. It can be understood that the bus system 804 is used to implement connection and communication between these components. In addition to a data bus, the bus system 804 includes a power bus, a control bus, and a status signal bus. For clarity of illustration, however, the various buses are all labeled as the bus system 804 in FIG. 8. The communication interface 801 is used for receiving and sending signals in the process of exchanging information with other external network elements;
the memory 802 is used for storing a computer program that can run on the processor 803;
the processor 803 is used for executing the following when running the computer program:
determining pre-parameters of a video to be encoded, and determining a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
determining a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
determining a first distortion value according to a first distortion metric criterion, where the first distortion metric criterion includes a semantic distortion metric criterion;
determining a second distortion value according to a second distortion metric criterion, where the second distortion metric criterion includes a numerical error metric criterion;
determining a target distortion value according to the first distortion value and the second distortion value; and
using the target Lagrangian multiplier and the target distortion value to determine encoding parameters of the video to be encoded, and encoding the video to be encoded.
It can be understood that the memory 802 in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synchlink DRAM, SLDRAM), and direct Rambus random access memory (Direct Rambus RAM, DRRAM). The memory 802 of the systems and methods described herein is intended to include, without being limited to, these and any other suitable types of memory.
The processor 803 may be an integrated circuit chip with signal processing capability. In implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 803 or by instructions in the form of software. The processor 803 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 802; the processor 803 reads the information in the memory 802 and completes the steps of the above method in combination with its hardware.
It can be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSP Devices, DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in the present application, or a combination thereof.
For a software implementation, the techniques described herein may be implemented through modules (for example, procedures or functions) that perform the functions described herein. Software code may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
Optionally, as another embodiment, the processor 803 is further configured to execute the steps of the method described in any one of the foregoing embodiments when running the computer program.
This embodiment provides an encoder that includes a determining unit, a calculation unit, and an encoding unit. The determining unit is configured to determine pre-parameters of a video to be encoded and to determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters; the calculation unit is configured to determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; the determining unit is further configured to determine a first distortion value according to a first distortion metric criterion, where the first distortion metric criterion includes a semantic distortion metric criterion, and to determine a second distortion value according to a second distortion metric criterion, where the second distortion metric criterion includes a numerical error metric criterion; the calculation unit is further configured to determine a target distortion value according to the first distortion value and the second distortion value; and the encoding unit is configured to use the target Lagrangian multiplier and the target distortion value to determine encoding parameters of the video to be encoded and to encode the video to be encoded. In this way, rate-distortion optimization in video encoding jointly considers the first distortion metric criterion, which is based on a semantic distortion metric, and the second distortion metric criterion, which is based on a numerical error metric. The encoder is therefore well suited to application scenarios oriented to machine vision and to hybrid human-machine vision; at a given bit rate, it can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity, thereby also improving coding efficiency.
Refer to FIG. 9, which shows a schematic structural diagram of a video system provided by an embodiment of the present application. As shown in FIG. 9, the video system 90 may include an encoder 901 and a decoder 902. The encoder 901 may be the encoder 70 described in any one of the foregoing embodiments.
The encoder 901 is configured to: determine pre-parameters of a video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters; determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; determine a first distortion value according to a first distortion metric criterion, where the first distortion metric criterion includes a semantic distortion metric criterion; determine a second distortion value according to a second distortion metric criterion, where the second distortion metric criterion includes a numerical error metric criterion; determine a target distortion value according to the first distortion value and the second distortion value; and use the target Lagrangian multiplier and the target distortion value to determine encoding parameters of the video to be encoded, encode the video to be encoded to generate a bitstream, and transmit the bitstream to the decoder;
the decoder 902 is configured to parse the bitstream to obtain a decoded video.
Further, in some embodiments, the decoder 902 is further configured to parse the bitstream to obtain decoding parameters and to obtain the decoded video according to the decoding parameters, where the decoding parameters include at least a parameter indicating the partitioning mode of the video to be decoded and a parameter for constructing prediction values of decoding blocks in the video to be decoded.
In the embodiments of the present application, the video system 90 performs rate-distortion optimization in video encoding by jointly considering the first distortion metric criterion, which is based on a semantic distortion metric, and the second distortion metric criterion, which is based on a numerical error metric. The system is therefore well suited to application scenarios oriented to machine vision and to hybrid human-machine vision; at a given bit rate, it can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity, thereby also improving coding efficiency.
It should also be noted that, in the present application, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element qualified by the phrase "including a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.
The serial numbers of the above embodiments of the present application are for description only and do not represent the relative merits of the embodiments.
The methods disclosed in the several method embodiments provided in the present application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments.
The features disclosed in the several product embodiments provided in the present application may be combined arbitrarily, provided there is no conflict, to obtain new product embodiments.
The features disclosed in the several method or device embodiments provided in the present application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments or device embodiments.
The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Industrial Applicability
In the embodiments of the present application, pre-parameters of the video to be encoded are determined, and a first Lagrangian multiplier and a second Lagrangian multiplier are determined according to the pre-parameters; a target Lagrangian multiplier is determined according to the first Lagrangian multiplier and the second Lagrangian multiplier; a first distortion value is determined according to a first distortion metric criterion, where the first distortion metric criterion includes a semantic distortion metric criterion; a second distortion value is determined according to a second distortion metric criterion, where the second distortion metric criterion includes a numerical error metric criterion; a target distortion value is determined according to the first distortion value and the second distortion value; and the encoding parameters of the video to be encoded are determined using the target Lagrangian multiplier and the target distortion value, and the video to be encoded is encoded. In this way, rate-distortion optimization in video encoding jointly considers the first distortion metric criterion based on semantic distortion and the second distortion metric criterion based on numerical error. The method is therefore well suited to application scenarios oriented to machine vision and to combined human-machine vision. At a given bit rate, it can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity, thereby also improving coding efficiency.
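The flow summarized above can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes the conventional cost form J = D + λ·R (the text only says that a rate-distortion cost function is built from the target multiplier and the target distortion), and it uses the weighting recited in the claims, in which the first quantity is weighted by 1 - w and the second by w. The `Candidate` class, its field names, and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    """Hypothetical result of pre-encoding with one candidate parameter set."""
    name: str
    rate: float        # bits spent by the pre-encoding pass
    d_semantic: float  # first distortion value (semantic criterion)
    d_numeric: float   # second distortion value (numerical error criterion)


def blend(first: float, second: float, weight: float) -> float:
    """Weighted sum: (1 - weight) * first + weight * second, weight in [0, 1]."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * first + weight * second


def select_candidate(candidates: Iterable[Candidate],
                     lam_semantic: float,
                     lam_numeric: float,
                     k: float = 0.75,
                     m: float = 0.75) -> Candidate:
    """Return the candidate with the smallest rate-distortion cost."""
    # Target Lagrangian multiplier from the first and second multipliers.
    lam = blend(lam_semantic, lam_numeric, k)

    def cost(c: Candidate) -> float:
        # Target distortion from the first and second distortion values.
        target_distortion = blend(c.d_semantic, c.d_numeric, m)
        # Assumed cost form J = D + lambda * R.
        return target_distortion + lam * c.rate

    return min(candidates, key=cost)
```

With k = m = 0.75, the numerical-error terms carry three quarters of the weight, so the semantic terms adjust the decision rather than dominate it.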

Claims (41)

  1. A video encoding method, applied to an encoder, the method comprising:
    determining pre-parameters of a video to be encoded, and determining a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
    determining a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
    determining a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion comprises a semantic distortion metric criterion;
    determining a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion comprises a numerical error metric criterion;
    determining a target distortion value according to the first distortion value and the second distortion value; and
    determining encoding parameters of the video to be encoded by using the target Lagrangian multiplier and the target distortion value, and encoding the video to be encoded.
  2. The method according to claim 1, wherein the pre-parameters comprise a quantization parameter, and the determining pre-parameters of a video to be encoded comprises:
    determining a quantization parameter of a coding unit in the video to be encoded, wherein the coding unit comprises at least one of the following: a picture, a slice, a sub-picture, a tile, and a coding block.
  3. The method according to claim 2, wherein the determining a first Lagrangian multiplier according to the pre-parameters comprises:
    determining parameters of a first calculation model, the first calculation model representing a correspondence between the first Lagrangian multiplier and the quantization parameter; and
    determining the first Lagrangian multiplier according to the quantization parameter and the first calculation model.
  4. The method according to claim 3, wherein the determining parameters of a first calculation model comprises:
    in the first calculation model, setting the first Lagrangian multiplier equal to a weighted value of an exponential power of the quantization parameter;
    wherein the parameters of the first calculation model comprise a first exponent parameter indicating the exponential power and a first weighting coefficient indicating the weighting.
  5. The method according to claim 4, wherein the method further comprises:
    setting the parameters of the first calculation model to preset values.
  6. The method according to claim 4, wherein the method further comprises:
    based on a test video, determining, using the first distortion metric criterion, a first relationship function between the first distortion value and the bit rate of the test video, and differentiating the first relationship function to determine a derivative function of the first relationship function; and
    based on the test video, determining a second relationship function between the bit rate and the quantization parameter, and determining the parameters of the first calculation model according to the derivative function and the second relationship function.
  7. The method according to claim 1, wherein the pre-parameters comprise a quantization parameter and a target bit rate, and the determining pre-parameters of a video to be encoded comprises:
    determining a quantization parameter and a target bit rate of a coding unit in the video to be encoded, wherein the coding unit comprises at least one of the following: a picture, a slice, a sub-picture, a tile, and a coding block.
  8. The method according to claim 7, wherein the determining a first Lagrangian multiplier according to the pre-parameters comprises:
    determining parameters of a second calculation model, the second calculation model representing a correspondence between the first Lagrangian multiplier and the bit rate; and
    determining the first Lagrangian multiplier according to the target bit rate and the second calculation model.
  9. The method according to claim 8, wherein the determining parameters of a second calculation model comprises:
    in the second calculation model, setting the first Lagrangian multiplier equal to a weighted value of an exponential power of the target bit rate;
    wherein the parameters of the second calculation model comprise a second exponent parameter indicating the exponential power and a second weighting coefficient indicating the weighting.
  10. The method according to claim 9, wherein the method further comprises:
    setting the parameters of the second calculation model to preset values.
  11. The method according to claim 9, wherein the method further comprises:
    based on a test video, determining, using the first distortion metric criterion, a first relationship function between the first distortion value and the bit rate of the test video; and
    differentiating the first relationship function to determine the parameters of the second calculation model.
  12. The method according to claim 7, wherein the determining a target bit rate of a coding unit in the video to be encoded comprises:
    determining the target bit rate of the coding unit in the video to be encoded by means of bit allocation.
  13. The method according to claim 2 or 7, wherein the determining a second Lagrangian multiplier according to the pre-parameters comprises:
    determining the second Lagrangian multiplier according to a preset third calculation model, wherein the third calculation model represents a correspondence between the second Lagrangian multiplier and the quantization parameter.
  14. The method according to claim 2 or 7, wherein the determining a quantization parameter of a coding unit in the video to be encoded comprises:
    determining the quantization parameter of the coding unit in the video to be encoded by means of rate control.
  15. The method according to claim 2 or 7, wherein the determining a quantization parameter of a coding unit in the video to be encoded comprises:
    setting the quantization parameter to a preset value.
  16. The method according to claim 1, wherein the determining a first distortion value according to a first distortion metric criterion comprises:
    performing semantic segmentation on a test video to determine the semantic accuracy of one or more categories;
    determining a target semantic accuracy according to the semantic accuracy of the one or more categories; and
    performing distortion measurement on the target semantic accuracy by using a fourth calculation model to obtain the first distortion value.
  17. The method according to claim 16, wherein the determining a target semantic accuracy according to the semantic accuracy of the one or more categories comprises:
    calculating a weighted sum of the semantic accuracy of the one or more categories, and determining the resulting weighted sum as the target semantic accuracy.
  18. The method according to claim 16, wherein the performing distortion measurement on the target semantic accuracy by using a fourth calculation model to obtain the first distortion value comprises:
    determining parameters of the fourth calculation model, the fourth calculation model representing a correspondence between the first distortion value and the target semantic accuracy; and
    obtaining the first distortion value according to the target semantic accuracy and the fourth calculation model.
  19. The method according to claim 18, wherein the determining parameters of the fourth calculation model comprises:
    in the fourth calculation model, setting the first distortion value equal to a weighted value of the logarithm of the target semantic accuracy;
    wherein the parameters of the fourth calculation model comprise a base parameter indicating the base of the logarithm and a fourth weighting coefficient parameter indicating the weighting.
  20. The method according to claim 19, wherein the method further comprises:
    setting the parameters of the fourth calculation model to preset values.
  21. The method according to claim 1, wherein the determining a first distortion value according to a first distortion metric criterion comprises:
    determining parameters of a fifth calculation model, the fifth calculation model representing a third relationship function between the first distortion value and a mean square error; and
    determining a target mean square error of a coding unit in the video to be encoded, and determining the first distortion value according to the target mean square error and the fifth calculation model.
  22. The method according to claim 21, wherein the determining parameters of a fifth calculation model comprises:
    in the fifth calculation model, setting the first distortion value equal to the sum of a second parameter factor and the product of the target mean square error and a first parameter factor;
    wherein the parameters of the fifth calculation model comprise the first parameter factor and the second parameter factor.
  23. The method according to claim 22, wherein the method further comprises:
    setting the parameters of the fifth calculation model to preset values.
  24. The method according to claim 22, wherein the method further comprises:
    based on a test video, determining, using the first distortion metric criterion, a third relationship function between the first distortion value and the mean square error of the test video; and
    determining the parameters of the fifth calculation model according to the third relationship function.
  25. The method according to claim 1, wherein the determining a second distortion value according to a second distortion metric criterion comprises:
    determining a reconstructed value of a coding unit in the video, wherein the coding unit comprises at least one of the following: a picture, a slice, a sub-picture, a tile, and a coding block; and
    determining, based on the numerical error criterion, the second distortion value according to the reconstructed value and the original value of the coding unit;
    wherein the numerical error criterion is one of the following: a sum of absolute errors criterion, a mean absolute error criterion, a sum of squared errors criterion, and a mean square error criterion.
  26. The method according to claim 1, wherein the determining a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier comprises:
    determining a first preset parameter, wherein the first preset parameter is used to control the weight values corresponding to the first Lagrangian multiplier and the second Lagrangian multiplier; and
    performing a weighted calculation on the first Lagrangian multiplier and the second Lagrangian multiplier by using the first preset parameter to obtain the target Lagrangian multiplier.
  27. The method according to claim 26, wherein the determining a first preset parameter comprises:
    setting the first preset parameter according to configuration information of the encoder.
  28. The method according to claim 26, wherein the method further comprises:
    when the configuration information of the encoder indicates that the first preset parameter is equal to k, setting the target Lagrangian multiplier equal to a weighted sum of the first Lagrangian multiplier and the second Lagrangian multiplier, wherein k is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first Lagrangian multiplier is set equal to 1 - k, and the weighting coefficient of the second Lagrangian multiplier is set equal to k.
  29. The method according to claim 28, wherein the value of k is equal to 0.75.
  30. The method according to claim 1, wherein the determining a target distortion value according to the first distortion value and the second distortion value comprises:
    determining a second preset parameter, wherein the second preset parameter is used to control the weight values corresponding to the first distortion value and the second distortion value; and
    performing a weighted calculation on the first distortion value and the second distortion value by using the second preset parameter to obtain the target distortion value.
  31. The method according to claim 30, wherein the determining a second preset parameter comprises:
    setting the second preset parameter according to configuration information of the encoder.
  32. The method according to claim 31, wherein the method further comprises:
    when the configuration information of the encoder indicates that the second preset parameter is equal to m, setting the target distortion value equal to a weighted sum of the first distortion value and the second distortion value, wherein m is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first distortion value is set equal to 1 - m, and the weighting coefficient of the second distortion value is set equal to m.
  33. The method according to claim 32, wherein the value of m is equal to 0.75.
  34. The method according to claim 1, wherein the determining encoding parameters of the video to be encoded by using the target Lagrangian multiplier and the target distortion value comprises:
    constructing a rate-distortion cost function based on the target Lagrangian multiplier and the target distortion value;
    pre-encoding the video to be encoded with one or more candidate encoding parameters, and determining the rate-distortion cost values corresponding to the one or more candidate encoding parameters; and
    selecting the minimum rate-distortion cost value from the determined rate-distortion cost values, and determining the candidate encoding parameters corresponding to the minimum rate-distortion cost value as the encoding parameters of the video to be encoded.
  35. The method according to claim 1 or 34, wherein the encoding parameters comprise at least a parameter indicating the partitioning mode of the video to be encoded and a parameter for constructing the prediction values of coding blocks in the video to be encoded.
  36. The method according to claim 35, wherein the encoding the video to be encoded comprises:
    writing the encoding parameters into a bitstream.
  37. An encoder, comprising a determining unit, a calculation unit, and an encoding unit, wherein:
    the determining unit is configured to determine pre-parameters of a video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
    the calculation unit is configured to determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
    the determining unit is further configured to determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion comprises a semantic distortion metric criterion, and to determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion comprises a numerical error metric criterion;
    the calculation unit is further configured to determine a target distortion value according to the first distortion value and the second distortion value; and
    the encoding unit is configured to determine encoding parameters of the video to be encoded by using the target Lagrangian multiplier and the target distortion value, and to encode the video to be encoded.
  38. An encoder, comprising a memory and a processor, wherein:
    the memory is configured to store a computer program executable on the processor; and
    the processor is configured to execute the method according to any one of claims 1 to 36 when running the computer program.
  39. A computer storage medium, wherein the computer storage medium stores a computer program which, when executed by at least one processor, implements the method according to any one of claims 1 to 36.
  40. A video system, comprising an encoder and a decoder, wherein:
    the encoder is configured to: determine pre-parameters of a video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters; determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion comprises a semantic distortion metric criterion; determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion comprises a numerical error metric criterion; determine a target distortion value according to the first distortion value and the second distortion value; and determine encoding parameters of the video to be encoded by using the target Lagrangian multiplier and the target distortion value, encode the video to be encoded to generate a bitstream, and transmit the bitstream to the decoder; and
    the decoder is configured to parse the bitstream to obtain a decoded video.
  41. The system according to claim 40, wherein:
    the decoder is further configured to parse the bitstream, obtain decoding parameters, and obtain the decoded video according to the decoding parameters, wherein the decoding parameters comprise at least a parameter indicating the partitioning mode of the video to be decoded and a parameter for constructing the prediction values of decoded blocks in the video to be decoded.
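As an illustrative reading of the model forms recited in claims 4, 9, 17, 19, 22 and 25, the following Python sketch writes each one as a small function. It is a sketch under stated assumptions and carries no claim scope. In particular, "a weighted value of an exponential power" is read here as weight · x^exponent, which is only one possible interpretation. All function and parameter names are hypothetical, and the claims specify no numeric model parameters, so none are given here.

```python
import math
from typing import Sequence


def power_model(x: float, weight: float, exponent: float) -> float:
    """weight * x ** exponent; x is the quantization parameter (claim 4)
    or the target bit rate (claim 9) under the assumed reading."""
    return weight * x ** exponent


def target_accuracy(per_class: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-category semantic accuracies (claim 17)."""
    if len(per_class) != len(weights):
        raise ValueError("one weight per category is required")
    return sum(a * w for a, w in zip(per_class, weights))


def distortion_from_accuracy(accuracy: float, weight: float, base: float) -> float:
    """weight * log_base(accuracy) (claim 19); accuracy must be positive."""
    return weight * math.log(accuracy, base)


def distortion_from_mse(mse: float, factor_1: float, factor_2: float) -> float:
    """Linear model factor_1 * mse + factor_2 (claim 22)."""
    return factor_1 * mse + factor_2


def mean_squared_error(original: Sequence[float],
                       reconstructed: Sequence[float]) -> float:
    """One of the numerical error criteria listed in claim 25."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
```

Because an accuracy in (0, 1] has a non-positive logarithm, a negative weight makes the logarithmic model return a non-negative distortion that grows as accuracy falls. The sign is a choice made for this sketch, not something the claims state.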
PCT/CN2020/106416 2020-07-31 2020-07-31 Video coding method and system, coder, and computer storage medium WO2022021422A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080099999.3A CN115428451A (en) 2020-07-31 2020-07-31 Video encoding method, encoder, system, and computer storage medium
PCT/CN2020/106416 WO2022021422A1 (en) 2020-07-31 2020-07-31 Video coding method and system, coder, and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/106416 WO2022021422A1 (en) 2020-07-31 2020-07-31 Video coding method and system, coder, and computer storage medium

Publications (1)

Publication Number Publication Date
WO2022021422A1 true WO2022021422A1 (en) 2022-02-03

Family

ID=80037374

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/106416 WO2022021422A1 (en) 2020-07-31 2020-07-31 Video coding method and system, coder, and computer storage medium

Country Status (2)

Country Link
CN (1) CN115428451A (en)
WO (1) WO2022021422A1 (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102780884A (en) * 2012-07-23 2012-11-14 深圳广晟信源技术有限公司 Rate distortion optimization method
US20170188027A1 (en) * 2012-09-24 2017-06-29 Intel Corporation Histogram Segmentation Based Local Adaptive Filter for Video Encoding and Decoding
CN107205151A (en) * 2017-06-26 2017-09-26 中国科学技术大学 Coding and decoding device and method based on mixing distortion metrics criterion
CN109190752A (en) * 2018-07-27 2019-01-11 国家新闻出版广电总局广播科学研究院 The image, semantic dividing method of global characteristics and local feature based on deep learning
CN109982082A (en) * 2019-05-05 2019-07-05 山东大学深圳研究院 A kind of more distortion criterion Rate-distortion optimization methods of HEVC based on local grain characteristic

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7042943B2 (en) * 2002-11-08 2006-05-09 Apple Computer, Inc. Method and apparatus for control of rate-distortion tradeoff by mode selection in video encoders
US8879623B2 (en) * 2009-09-02 2014-11-04 Sony Computer Entertainment Inc. Picture-level rate control for video encoding a scene-change I picture
KR20140042845A (en) * 2011-06-14 2014-04-07 조우 왕 Method and system for structural similarity based rate-distortion optimization for perceptual video coding
FR3029055B1 (en) * 2014-11-24 2017-01-13 Ateme IMAGE ENCODING METHOD AND EQUIPMENT FOR IMPLEMENTING THE METHOD
CN108900838B (en) * 2018-06-08 2021-10-15 宁波大学 Rate distortion optimization method based on HDR-VDP-2 distortion criterion
WO2020107288A1 (en) * 2018-11-28 2020-06-04 Oppo广东移动通信有限公司 Video encoding optimization method and apparatus, and computer storage medium
CN110324618A (en) * 2019-07-03 2019-10-11 上海电力学院 The Optimized Coding of raising video quality based on VMAF criterion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG QUN; YUAN HUI; HUO JUNYAN; LI PENG: "A Fidelity-Assured Rate Distortion Optimization Method for Perceptual-Based Video Coding", 2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), IEEE, 22 September 2019 (2019-09-22), pages 4135 - 4139, XP033647416, DOI: 10.1109/ICIP.2019.8803496 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114786010A (en) * 2022-03-07 2022-07-22 杭州未名信科科技有限公司 Rate distortion optimization quantization method and device, storage medium and electronic equipment
CN116723330A (en) * 2023-03-28 2023-09-08 成都师范学院 Panoramic video coding method for self-adapting spherical domain distortion propagation chain length
CN116723330B (en) * 2023-03-28 2024-02-23 成都师范学院 Panoramic video coding method for self-adapting spherical domain distortion propagation chain length

Also Published As

Publication number Publication date
CN115428451A (en) 2022-12-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20947371

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20947371

Country of ref document: EP

Kind code of ref document: A1