WO2022021422A1 - Video coding method and system, encoder, and computer storage medium - Google Patents

Video coding method and system, encoder, and computer storage medium

Publication number
WO2022021422A1
Authority
WO
WIPO (PCT)
Prior art keywords
distortion
video
parameter
value
target
Prior art date
Application number
PCT/CN2020/106416
Other languages
English (en)
Chinese (zh)
Inventor
元辉
周兰
李明
姜东冉
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Priority to PCT/CN2020/106416 priority Critical patent/WO2022021422A1/fr
Priority to CN202080099999.3A priority patent/CN115428451A/zh
Publication of WO2022021422A1 publication Critical patent/WO2022021422A1/fr

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria

Definitions

  • the embodiments of the present application relate to the technical field of video coding and decoding, and in particular, to a video coding method, an encoder, a system, and a computer storage medium.
  • H.266/VVC (Versatile Video Coding); H.265/HEVC (High Efficiency Video Coding)
  • the rate-distortion optimization algorithm can either only guarantee the fidelity of the reconstructed video, or can guarantee the subjective quality of the reconstructed video, but the fidelity performance of the video will be greatly reduced.
  • the distortion criteria adopted by existing rate-distortion optimization algorithms are single and incomplete, so the existing algorithms cannot adapt well to application scenarios oriented to machine vision and human-machine vision.
  • Embodiments of the present application provide a video encoding method, encoder, system, and computer storage medium, which adapt well to application scenarios oriented to machine vision and human-machine vision and which, at a given bit rate, can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity performance, thereby improving coding efficiency.
  • an embodiment of the present application provides a video encoding method, which is applied to an encoder, and the method includes:
  • first distortion metric criterion includes a semantic distortion metric criterion
  • the second distortion metric criterion includes a numerical error metric criterion
  • the encoding parameters of the video to be encoded are determined, and the video to be encoded is encoded.
  • an embodiment of the present application provides an encoder, the encoder includes a determination unit, a calculation unit, and an encoding unit; wherein,
  • the determining unit is configured to determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
  • the computing unit configured to determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • the determining unit is further configured to determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; and to determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
  • the computing unit is further configured to determine a target distortion value according to the first distortion value and the second distortion value;
  • the encoding unit is configured to use the target Lagrangian multiplier and the target distortion value to determine the encoding parameter of the video to be encoded, and to encode the video to be encoded.
  • an embodiment of the present application provides an encoder, where the encoder includes a memory and a processor; wherein,
  • the memory for storing a computer program executable on the processor
  • the processor is configured to execute the method according to the first aspect when running the computer program.
  • an embodiment of the present application provides a computer storage medium, where the computer storage medium stores a computer program, and the computer program implements the method according to the first aspect when the computer program is executed by at least one processor.
  • an embodiment of the present application provides a video system, where the video system includes an encoder and a decoder; wherein,
  • the encoder is configured to: determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters; determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; determine a target distortion value according to the first distortion value and the second distortion value; and use the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the video to be encoded, encode the video to be encoded to generate a code stream, and transmit the code stream to the decoder;
  • the decoder is configured to parse the code stream to obtain decoded video.
  • Embodiments of the present application provide a video encoding method, encoder, system, and computer storage medium, in which: pre-parameters of the video to be encoded are determined, and a first Lagrangian multiplier and a second Lagrangian multiplier are determined according to the pre-parameters; a target Lagrangian multiplier is determined according to the first Lagrangian multiplier and the second Lagrangian multiplier; a first distortion value is determined according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; a second distortion value is determined according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; a target distortion value is determined according to the first distortion value and the second distortion value; and the target Lagrangian multiplier and the target distortion value are used to determine the encoding parameters of the video to be encoded and to encode the video to be encoded.
  • the first distortion metric based on the semantic distortion metric and the second distortion metric based on the numerical error metric are considered jointly in rate-distortion optimization for video coding, which adapts well to application scenarios oriented to machine vision and human-machine vision; at a given bit rate, this can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity performance, thereby also improving coding efficiency.
  • FIG. 1 is a schematic diagram of an RD curve provided by a related technical solution.
  • FIG. 2 is a schematic structural diagram of a system composition of an encoder according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a video encoding method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a curve of a functional relationship between a first distortion value and a code rate according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of a curve of a functional relationship between a code rate and a quantization parameter according to an embodiment of the present application
  • FIG. 6 is a schematic diagram of a curve of a functional relationship between a first distortion value and MSE provided by an embodiment of the present application;
  • FIG. 7 is a schematic diagram of the composition and structure of an encoder provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a specific hardware structure of an encoder provided by an embodiment of the present application.
  • FIG. 9 is a schematic structural diagram of a video system according to an embodiment of the present application.
  • the higher the bit rate, the better the reconstructed video quality and the smaller the distortion; however, the higher the bit rate, the larger the storage space occupied by the encoded file. It is therefore necessary to find a balance between the distortion of the reconstructed video and the bit rate through a rate-distortion optimization algorithm, so that the compression effect is optimal.
  • rate-distortion optimization can be expressed as minimizing the distortion of the decoded and reconstructed video when the encoded file does not exceed a certain bit rate, as shown in the following formula (1).
  • D and R represent the distortion and code rate under certain coding parameters, respectively.
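  • Formula (1) is not reproduced in this text. A common way to write the constrained problem it describes is sketched below; the symbol R_c for the bit rate limit is an assumption introduced here and is not taken from the original.

```latex
\min \{ D \} \quad \text{s.t.} \quad R \le R_c
```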
  • the video is encoded with the given encoding parameters, and the encoded bit rate (R) and the distortion (D) of the reconstructed video are calculated.
  • By changing the encoding parameters and repeatedly encoding the video to be encoded, multiple R-D points, each consisting of a bit rate and a distortion, can be obtained, as shown in FIG. 1.
  • the points with the least distortion lie on the convex curve (i.e., the RD curve) in FIG. 1.
  • the encoder needs to determine a set of encoding parameters so that the encoded R-D point can approximate this convex curve as much as possible.
  • the constrained problem of the above formula (1) can be transformed into an unconstrained problem by the Lagrange multiplier method, as shown in the following formula (2).
  • λ is the Lagrange multiplier and J is the rate-distortion cost function.
  • the encoder can find the optimal encoding parameters by minimizing the rate-distortion cost function.
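  • The minimization described above can be sketched as follows. Formula (2) is not reproduced in this text, so the sketch assumes the usual Lagrangian form J = D + λ·R; the `Candidate` type, the candidate names, and all numeric values are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """One hypothetical coding option with its measured distortion and rate."""
    name: str
    distortion: float  # D
    rate: float        # R, in bits


def rd_cost(distortion: float, rate: float, lam: float) -> float:
    """Rate-distortion cost, assuming the common form J = D + lambda * R."""
    return distortion + lam * rate


def best_candidate(candidates: Sequence[Candidate], lam: float) -> Candidate:
    """Return the candidate with the smallest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c.distortion, c.rate, lam))


# Made-up candidates for illustration only.
candidates = [
    Candidate("intra_dc", distortion=900.0, rate=40.0),
    Candidate("intra_angular", distortion=500.0, rate=90.0),
    Candidate("inter_merge", distortion=650.0, rate=20.0),
]

low_lambda_choice = best_candidate(candidates, lam=1.0)
high_lambda_choice = best_candidate(candidates, lam=10.0)
```

  • A small multiplier favors the low-distortion candidate, while a large multiplier favors the low-rate candidate, which is the trade-off the Lagrange multiplier controls.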
  • the encoder can determine the optimal block division method, the optimal intra-frame prediction mode, and the optimal inter-frame prediction motion mode (including motion vector, reference image, prediction weight, etc.), so as to achieve optimal encoding performance.
  • the rate-distortion optimization here adopts the sum of squared errors (SSE) as the distortion criterion, and the corresponding reconstructed video quality can be measured by the peak signal-to-noise ratio (Peak Signal to Noise Ratio, PSNR).
  • SSE distortion can objectively measure the fidelity of the video, and its calculation formula is shown in the following formula (3).
  • M and N represent the horizontal spatial resolution and the vertical spatial resolution of the video, respectively; f(x, y) represents the original pixel value at pixel position (x, y), and g(x, y) represents the reconstructed pixel value at pixel position (x, y).
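  • The SSE criterion and the PSNR measure can be sketched as follows. The sketch sums the squared differences (f(x, y) - g(x, y))^2 over all pixel positions, as the definitions above describe; the 8-bit peak value of 255 and the sample pixel arrays are assumptions made for illustration.

```python
import math
from typing import Sequence

Image = Sequence[Sequence[int]]


def sse(original: Image, reconstructed: Image) -> int:
    """Sum of squared errors between two equally sized 2-D pixel arrays."""
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of rows")
    total = 0
    for row_f, row_g in zip(original, reconstructed):
        if len(row_f) != len(row_g):
            raise ValueError("rows must have the same length")
        for f, g in zip(row_f, row_g):
            total += (f - g) ** 2
    return total


def psnr(original: Image, reconstructed: Image, max_value: int = 255) -> float:
    """PSNR in dB derived from the mean squared error (SSE / pixel count)."""
    num_pixels = sum(len(row) for row in original)
    mse = sse(original, reconstructed) / num_pixels
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up 2x2 example.
original = [[10, 20], [30, 40]]
reconstructed = [[12, 20], [29, 44]]
example_sse = sse(original, reconstructed)
example_psnr = psnr(original, reconstructed)
```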
  • rate-distortion optimization is a key technology in video coding and affects the performance of the encoder.
  • the SSE distortion can measure the fidelity of the video from an objective point of view, but it is not consistent with the perception of the human visual system; for example, in some areas with large SSE distortion, the human eye does not perceive any degradation of the reconstructed video quality.
  • the distortion criterion needs to be changed to a distortion metric that can measure the subjective quality.
  • the calculation formula is shown in the following formula (4).
  • C_1 and C_2 are two constants that avoid instability when the corresponding denominator terms are close to 0, where C_1 = (K_1 L)^2 and C_2 = (K_2 L)^2, with K_1 = 0.01 and K_2 = 0.03.
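  • Formula (4) is not reproduced in this text. The sketch below uses the widely known SSIM definition with the constants given above; computing it over a single window of pixels, using population statistics, and taking L = 255 as the dynamic range are all assumptions made for illustration.

```python
from typing import Sequence


def ssim_global(f: Sequence[float], g: Sequence[float],
                dynamic_range: float = 255.0,
                k1: float = 0.01, k2: float = 0.03) -> float:
    """SSIM of two equally long pixel sequences treated as one window."""
    n = len(f)
    if n == 0 or n != len(g):
        raise ValueError("inputs must be non-empty and of equal length")
    c1 = (k1 * dynamic_range) ** 2  # C1 = (K1 * L)^2
    c2 = (k2 * dynamic_range) ** 2  # C2 = (K2 * L)^2
    mu_f = sum(f) / n
    mu_g = sum(g) / n
    var_f = sum((x - mu_f) ** 2 for x in f) / n
    var_g = sum((y - mu_g) ** 2 for y in g) / n
    cov = sum((x - mu_f) * (y - mu_g) for x, y in zip(f, g)) / n
    numerator = (2 * mu_f * mu_g + c1) * (2 * cov + c2)
    denominator = (mu_f ** 2 + mu_g ** 2 + c1) * (var_f + var_g + c2)
    return numerator / denominator


# Made-up sample windows.
window = [10, 20, 30, 40]
reversed_window = [40, 30, 20, 10]
same_score = ssim_global(window, window)
different_score = ssim_global(window, reversed_window)
```

  • Because C_1 and C_2 are strictly positive, the denominator stays non-zero even for a completely flat window, which is the stabilizing role described above.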
  • the rate-distortion optimization algorithm based on SSE distortion can ensure the fidelity of the reconstructed video; however, although the SSIM distortion considering the subjective quality can guarantee the subjective quality of the reconstructed video, the fidelity performance of the video will be greatly reduced.
  • 5G fifth-generation mobile communication
  • machine vision content for machine-oriented applications such as the Internet of Vehicles, unmanned driving, the industrial Internet, smart and safe cities, wearables, and video surveillance has an increasingly wide range of application scenarios.
  • most videos will be used by machines, such as intelligent analysis of reconstructed videos such as pedestrian detection, semantic segmentation, and target detection.
  • the distortion criterion adopted by current rate-distortion optimization algorithms considers only fidelity distortion and not semantic distortion; the fidelity performance of the reconstructed video can be guaranteed, but its semantic accuracy cannot, so current rate-distortion optimization algorithms cannot adapt well to many scenarios oriented to machine vision and human-machine vision.
  • an embodiment of the present application provides a video encoding method.
  • the basic idea is: determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters; determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; determine a target distortion value according to the first distortion value and the second distortion value; and use the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the video to be encoded and to encode the video to be encoded.
  • the first distortion metric based on the semantic distortion metric and the second distortion metric based on the numerical error metric are considered jointly in rate-distortion optimization for video coding, which adapts well to application scenarios oriented to machine vision and human-machine vision; at a given bit rate, this can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity performance, thereby also improving coding efficiency.
  • the encoder 10 may include a transform and quantization unit 101, an intra-frame estimation unit 102, an intra-frame prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, an encoding unit 109, a decoded image buffer unit 110, etc., wherein the filtering unit 108 can implement deblocking filtering and sample adaptive offset (Sample Adaptive Offset, SAO) filtering, and the encoding unit 109 can implement header information coding and context-based adaptive binary arithmetic coding (Context-based Adaptive Binary Arithmetic Coding, CABAC).
  • for the input original video signal, a video coding block can be obtained through coding unit (Coding Unit, CU) division; the transform and quantization unit 101 then transforms the residual pixel information obtained after intra-frame or inter-frame prediction of the video coding block, including transforming the residual information from the pixel domain to the transform domain, and quantizes the resulting transform coefficients to further reduce the bit rate.
  • the intra-frame estimation unit 102 and the intra-frame prediction unit 103 are used to perform intra prediction on the video coding block; specifically, they determine the intra-frame prediction mode to be used to encode the video coding block.
  • the motion compensation unit 104 and the motion estimation unit 105 are used to perform inter-predictive encoding of the received video coding block relative to one or more blocks in one or more reference frames, so as to provide temporal prediction information. Motion estimation performed by the motion estimation unit 105 is the process of generating motion vectors, which estimate the motion of the video coding block; the motion compensation unit 104 then performs motion compensation based on the motion vectors determined by the motion estimation unit 105.
  • after the intra prediction mode is determined, the intra-frame prediction unit 103 also supplies the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 also sends the calculated motion vector data to the encoding unit 109.
  • the inverse transform and inverse quantization unit 106 is used to reconstruct the video coding block: a residual block is reconstructed in the pixel domain, blocking artifacts are removed from the reconstructed residual block by the filter control analysis unit 107 and the filtering unit 108, and the reconstructed residual block is then added to a predictive block in a frame of the decoded image buffer unit 110 to generate a reconstructed video coding block.
  • the encoding unit 109 is used to encode the various coding parameters and the quantized transform coefficients. In the CABAC-based encoding algorithm, the context content can be based on adjacent coding blocks; it can be used to encode the information indicating the determined intra prediction mode and to output the code stream of the video signal. The decoded image buffer unit 110 is used to store the reconstructed video coding blocks for prediction reference. As video image coding proceeds, new reconstructed video coding blocks are continuously generated, and they are all stored in the decoded image buffer unit 110.
  • the video coding method in this embodiment of the present application is mainly applied to the coding control part of the encoder 10, for example, the coding unit (Coding Unit, CU) division shown in FIG. 2, the intra-frame prediction unit 103, the motion compensation unit 104, the motion estimation unit 105, and other parts. That is to say, the video encoding method of the embodiment of the present application is mainly used to determine encoding parameters, so that encoding is performed according to the determined encoding parameters.
  • the coding parameters may include a CU division mode, and an intra-frame prediction mode or an inter-frame prediction mode determined for the CU.
  • An embodiment of the present application provides a video encoding method, and the method is applied to a video encoding device, that is, an encoder.
  • the functions implemented by the method can be implemented by the processor in the encoder calling a computer program, and of course the computer program can be stored in a memory.
  • the encoder includes at least a processor and a memory.
  • Referring to FIG. 3, it shows a schematic flowchart of a video encoding method provided by an embodiment of the present application. As shown in FIG. 3, the method may include:
  • S301 Determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
  • S302 Determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • S303 Determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; and determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
  • S304 Determine a target distortion value according to the first distortion value and the second distortion value
  • S305 Using the target Lagrangian multiplier and the target distortion value, determine the encoding parameter of the video to be encoded, and encode the video to be encoded.
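  • Steps S301 to S305 can be sketched end to end as follows. The two multiplier models use the constants stated elsewhere in this text for formulas (5) and (9). How the two multipliers and the two distortion values are combined is not specified in this part of the text, so the weighted sum, the weights `w_first` and `w_second`, the `Candidate` type, and all candidate values are hypothetical assumptions for illustration only.

```python
from typing import NamedTuple, Sequence


class Candidate(NamedTuple):
    """One hypothetical coding option with made-up measurements."""
    name: str
    semantic_distortion: float  # first distortion value
    numeric_distortion: float   # second distortion value (e.g. SSE)
    rate: float                 # bits


def first_lambda(qp: float) -> float:
    """First Lagrangian multiplier from QP, constants as stated for formula (5)."""
    return 2.30422e-8 * qp ** 6.3612072


def second_lambda(qp: float) -> float:
    """Second Lagrangian multiplier from QP, constants as stated for formula (9)."""
    return 0.57 * 2.0 ** ((qp - 12.0) / 3.0)


def weighted_sum(first: float, second: float,
                 w_first: float, w_second: float) -> float:
    """ASSUMPTION: combine two values by a weighted sum (not from the source)."""
    return w_first * first + w_second * second


def choose_candidate(qp: float, candidates: Sequence[Candidate],
                     w_first: float, w_second: float) -> Candidate:
    # S301: derive both multipliers from the pre-parameter (here, QP).
    lam1, lam2 = first_lambda(qp), second_lambda(qp)
    # S302: target multiplier (assumed weighted sum).
    lam = weighted_sum(lam1, lam2, w_first, w_second)

    def cost(c: Candidate) -> float:
        # S303/S304: target distortion from both distortion values (assumed weighted sum).
        d = weighted_sum(c.semantic_distortion, c.numeric_distortion,
                         w_first, w_second)
        # S305: rate-distortion cost used to select the encoding parameters.
        return d + lam * c.rate

    return min(candidates, key=cost)


# Made-up candidates for illustration only.
candidates = [
    Candidate("a", semantic_distortion=500.0, numeric_distortion=1000.0, rate=10.0),
    Candidate("b", semantic_distortion=100.0, numeric_distortion=1500.0, rate=12.0),
]
numeric_only = choose_candidate(32, candidates, w_first=0.0, w_second=1.0)
semantic_only = choose_candidate(32, candidates, w_first=1.0, w_second=0.0)
```

  • With these made-up numbers, the choice changes depending on which criterion dominates, which illustrates why the two criteria are considered jointly.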
  • the video coding method in this embodiment of the present application may be applicable to an encoder of the H.266/VVC standard, an encoder of the H.265/HEVC standard, or even an encoder of another standard, such as an encoder for the first-generation video coding standard (Alliance for Open Media Video 1, AV1) developed by the Alliance for Open Media; the embodiments of this application do not impose any limitation.
  • the rate-distortion optimization algorithm used in the video coding method of the embodiment of the present application jointly considers the first distortion metric based on the semantic distortion metric and the second distortion metric based on the numerical error metric, so that the rate-distortion optimization can be a multi-distortion-criterion rate-distortion optimization algorithm for human-machine vision. That is to say, in video coding, in addition to the second Lagrangian multiplier and the second distortion value derived by using the related technical solution, for the human-machine vision application scenario of video semantic segmentation, the embodiment of the present application can also define a semantic distortion metric and then derive the corresponding calculation formulas for the first Lagrangian multiplier and the first distortion value.
  • the target Lagrangian multiplier can be determined according to the first Lagrangian multiplier and the second Lagrangian multiplier, and the target distortion value can be determined according to the first distortion value determined by the first distortion metric criterion and the second distortion value determined by the second distortion metric criterion; in this way, after the encoding parameters of the video to be encoded are determined according to the target Lagrangian multiplier and the target distortion value, encoding the video with these parameters can improve the semantic segmentation accuracy and the fidelity of the reconstructed video, and can also reduce the coding bit rate of the video, thereby shortening the time required for coding, improving coding speed, and improving coding efficiency.
  • the pre-parameter of the video to be encoded may include a quantization parameter (Quantization Parameter, QP).
  • the determining the pre-parameters of the video to be encoded may include:
  • a quantization parameter of the coding unit in the video to be coded is determined; wherein, the coding unit may include at least one of the following: a picture, a slice (Slice), a sub-picture (Sub-picture), a tile (tile), and a coding block.
  • the quantization parameter may be the quantization step size of the quantizer in the encoder, or the index number value corresponding to the quantization step size of the quantizer in the encoder.
  • the determining the first Lagrangian multiplier according to the pre-parameter may include:
  • parameters of a first calculation model are determined, the first calculation model representing the correspondence between the first Lagrangian multiplier and the quantization parameter;
  • the first Lagrangian multiplier is determined according to the quantization parameter and the first calculation model.
  • the determining the parameters of the first calculation model may include:
  • the first Lagrangian multiplier is set equal to a weighted power of the quantization parameter;
  • the first calculation model parameter includes a first exponent parameter indicating the power and a first weighting coefficient indicating the weighting.
  • the calculation formula of the first calculation model is as follows: $\lambda_{miou} = 2.30422 \times 10^{-8} \cdot QP^{6.3612072}$ (5)
  • Formula (5) is the first calculation model, which is used to represent the correspondence between the first Lagrangian multiplier and the quantization parameter.
  • the first calculation model parameter may include a first exponent parameter (i.e., 6.3612072 in the formula) and a first weighting coefficient (i.e., $2.30422 \times 10^{-8}$ in the formula).
  • the parameters of the first calculation model may be preset values, or may be obtained by fitting a large amount of test data from test videos; this is not limited here.
  • the method may further include:
  • the first calculation model parameter is set to a preset value.
  • the first exponent parameter may be set to 6.3612072, and the first weighting coefficient may be set to $2.30422 \times 10^{-8}$.
  • the first calculation model can be obtained, so as to determine the first Lagrangian multiplier according to the quantization parameter.
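  • With the preset values above, the first calculation model can be evaluated as in the following minimal sketch. The function and constant names are hypothetical; the two constants are the preset values stated in the text, and the QP values in the usage line are only examples.

```python
FIRST_WEIGHTING_COEFFICIENT = 2.30422e-8  # preset value from the text
FIRST_EXPONENT_PARAMETER = 6.3612072      # preset value from the text


def first_lagrangian_multiplier(qp: float) -> float:
    """First Lagrangian multiplier as a weighted power of the quantization parameter."""
    if qp <= 0:
        raise ValueError("qp must be positive for this sketch")
    return FIRST_WEIGHTING_COEFFICIENT * qp ** FIRST_EXPONENT_PARAMETER


# Example evaluation at a few QP values.
multipliers = {qp: first_lagrangian_multiplier(qp) for qp in (22, 27, 32, 37)}
```

  • Because the exponent is positive, the multiplier grows with QP, so coarser quantization puts more weight on the rate term of the cost.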
  • the method may further include:
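  • a first relationship function between the first distortion value and the bit rate is determined according to test data of a test video, and a derivative function of the first relationship function is obtained; a second relationship function between the bit rate and the quantization parameter is determined, and the first calculation model parameter is determined according to the derivative function and the second relationship function.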
  • the test video in this embodiment of the present application may be one or more test videos; for example, it may be multiple (e.g., 59) test video sequences from the large-scale urban scene dataset Cityscapes.
  • the first relationship function between the first distortion value and the bit rate of the test video can be determined, and the first relationship function is shown in the following formula:
  • D_miou represents the first distortion value
  • R represents the code rate
  • the value of λ_miou is the negative of the slope of the tangent to the curve (λ_miou > 0), that is, the negative of the derivative function of the curve.
  • the average bit rate of the reconstructed video under different quantization parameters can be calculated according to the encoded files.
  • a large amount of experimental test data can be used to fit the relationship between the first distortion value (D_miou) and the bit rate (R); the fitting curve is shown in FIG. 4.
  • the derivative function of the first relation function can be obtained by performing the derivative operation on the formula (6), and the derivative function is used to represent the corresponding relationship between the first Lagrangian multiplier and the code rate.
  • the derivative function is as follows,
  • a second relationship function between the bit rate (R) and the quantization parameter (QP) can be determined by fitting using a large amount of experimental test data.
  • the fitting curve is shown in FIG. 5 .
  • the second relation function is as follows,
  • Combining formula (7) and formula (8), that is, substituting formula (8) into formula (7), yields the functional relationship between λ_miou and QP, that is, the relationship between the first Lagrangian multiplier and the quantization parameter.
  • this gives the first calculation model shown in formula (5); the parameters of the first calculation model are thus determined, so that the first Lagrangian multiplier can be determined according to the quantization parameter.
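  • The substitution step can be illustrated with a worked sketch. Formulas (6) to (8) are not reproduced in this text, so the power-law forms and the coefficients a, b, c, p, and q below are assumptions, chosen only because they lead to the weighted-power shape described for the first calculation model.

```latex
\begin{aligned}
D_{miou} &= a R^{b} + c, \quad a > 0,\ b < 0
  && \text{(assumed form of the first relationship function)} \\
\lambda_{miou} &= -\frac{\partial D_{miou}}{\partial R} = -a b\, R^{\,b-1}
  && \text{(negative derivative, so } \lambda_{miou} > 0\text{)} \\
R &= p \cdot QP^{\,q}, \quad p > 0
  && \text{(assumed form of the second relationship function)} \\
\lambda_{miou} &= -a b\, p^{\,b-1} \cdot QP^{\,q(b-1)}
  && \text{(a weighting coefficient times a power of } QP\text{)}
\end{aligned}
```

  • Under these assumptions, the weighting coefficient is $-a b\, p^{b-1}$ and the exponent parameter is $q(b-1)$, both of which would follow directly from the fitted curves.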
  • the calculation formula of the first Lagrange multiplier can also be modified into other functional forms.
  • for example, the functional relationship of formula (8) can also be fitted in an e-exponential form, in which case formula (5), the calculation formula of the first Lagrangian multiplier, can also take an e-exponential form in QP; this is not limited here.
  • the determining the second Lagrangian multiplier according to the pre-parameter may include:
  • the second Lagrangian multiplier is determined according to a preset third calculation model; wherein, the third calculation model represents the corresponding relationship between the second Lagrangian multiplier and the quantization parameter.
  • the determining the third calculation model parameter may include:
  • the second Lagrangian multiplier is set equal to a weighted power of 2;
  • the third calculation model parameter includes a third exponent parameter indicating the power and a third weighting coefficient indicating the weighting, the third exponent parameter being related to the quantization parameter.
  • Formula (9), in which the second Lagrangian multiplier equals $0.57 \times 2^{(QP-12)/3}$, is the third calculation model, which is used to represent the correspondence between the second Lagrangian multiplier and the quantization parameter.
  • the third calculation model parameter may include a third exponent parameter (i.e., (QP-12)/3 in the formula) and a third weighting coefficient (i.e., 0.57 in the formula), and the value of the third exponent parameter is related to the quantization parameter (QP).
  • the parameters of the third calculation model may be preset values, or may be obtained by fitting a large amount of experimental test data from test videos; this is not limited here.
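  • The third calculation model can be evaluated as in the following minimal sketch, using the weighting coefficient 0.57 and the exponent (QP-12)/3 stated above. The function and constant names are hypothetical.

```python
THIRD_WEIGHTING_COEFFICIENT = 0.57  # value stated in the text


def second_lagrangian_multiplier(qp: float) -> float:
    """Second Lagrangian multiplier as a weighted power of 2 with exponent (QP - 12) / 3."""
    third_exponent_parameter = (qp - 12.0) / 3.0
    return THIRD_WEIGHTING_COEFFICIENT * 2.0 ** third_exponent_parameter


# Example evaluation at a few QP values.
multipliers = {qp: second_lagrangian_multiplier(qp) for qp in (22, 27, 32, 37)}
```

  • With this form, the multiplier doubles every time QP increases by 3.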
  • the determining the quantization parameter of the coding unit in the to-be-coded video may include: using a rate control manner to determine the quantization parameter of the coding unit in the to-be-coded video.
  • the determining the quantization parameter of the coding unit in the to-be-coded video may include: setting the quantization parameter to a preset value.
  • the quantization parameter in the video to be encoded may be set to a preset value, such as 22, 27, 32, 37, and so on.
  • the quantization parameter can also be determined by rate control; specifically, current rate control algorithms mainly control the bit rate by adjusting the quantization parameter, so the required quantization parameter can also be obtained by controlling the bit rate.
  • the pre-parameters of the video to be encoded may include a quantization parameter and a target bit rate.
  • the determining the pre-parameters of the video to be encoded may include:
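  • a quantization parameter and a target bit rate of the coding unit in the video to be encoded are determined; wherein the coding unit may include at least one of the following: a picture, a slice (Slice), a sub-picture (Sub-picture), a tile (tile), and a coding block.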
  • the quantization parameter may be the quantization step size of the quantizer in the encoder, or the index number value corresponding to the quantization step size of the quantizer in the encoder.
  • the determining the first Lagrangian multiplier according to the pre-parameter may include:
  • the second calculation model representing the correspondence between the first Lagrange multiplier and the code rate
  • the first Lagrangian multiplier is determined according to the target code rate and the second calculation model.
  • the determining the target bit rate of the coding unit in the to-be-encoded video may include: determining the target bit-rate of the encoding unit in the to-be-encoded video by using a bit allocation method.
  • the target bit rate for the coding unit in the video to be encoded can be obtained by way of bit allocation.
  • the target bit rate can be dynamically adjusted according to the number of bits consumed by the coding unit in the video to be encoded, so as to ensure real-time and accurate bit allocation.
  • the determining the parameters of the second calculation model may include:
  • the first Lagrangian multiplier is set equal to a weighted value of an exponential power of the target code rate
  • the second calculation model parameter includes a second exponent parameter indicating the power of the exponent and a second weighting coefficient indicating the weighting.
  • Equation (10) is the second calculation model, which is used to represent the correspondence between the first Lagrangian multiplier and the code rate.
  • the second calculation model parameter may include a second exponent parameter (i.e., -1.7553 in the formula) and a second weighting coefficient (i.e., 0.17364347 in the formula).
  • the parameters of the second calculation model may be preset values, or may be obtained by fitting a large amount of experimental test data from the test video, which is not limited herein.
  • the method may further include:
  • the second calculation model parameter is set to a preset value.
  • the second exponent parameter may be set to -1.7553, and the second weighting coefficient may be set to 0.17364347. After the second exponent parameter and the second weighting coefficient are determined, the second calculation model can be obtained, so as to determine the first Lagrangian multiplier according to the target code rate.
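The second calculation model described above can be sketched as follows. This is a minimal illustration, not the encoder's implementation: the function name `lambda_miou` is hypothetical, and the unit of the rate argument is an assumption, since this passage does not state it.

```python
def lambda_miou(rate: float,
                weight: float = 0.17364347,
                exponent: float = -1.7553) -> float:
    """First Lagrangian multiplier as a weighted power of the target code rate.

    The defaults are the preset values quoted in the text; the unit of
    `rate` is not stated there and is left to the caller (assumption).
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return weight * rate ** exponent


if __name__ == "__main__":
    # The multiplier shrinks as the target code rate grows.
    for r in (0.05, 0.1, 0.5, 1.0):
        print(r, lambda_miou(r))
```

Because the exponent is negative, a larger target code rate yields a smaller multiplier, which is the usual behaviour of a rate-dependent Lagrangian multiplier.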
  • the method may further include:
  • a derivative operation is performed on the first relational function to determine the second calculation model parameter.
  • test video here may also be one or more test videos, for example, the test video may be multiple (eg, 59) test video sequences in a large-scale urban scene dataset (Cityscapes).
  • the first relationship function between the first distortion value and the bit rate of the test video can be determined, and the first relationship function is shown in the above formula (6).
  • performing the derivative operation on formula (6) yields the derivative function of the first relation function; this derivative function represents the correspondence between the first Lagrangian multiplier and the code rate, i.e., the second calculation model shown in formula (10). The parameters of the second calculation model are thereby also determined, so that the first Lagrangian multiplier can be determined according to the target code rate.
  • the value of λ_miou is the magnitude of the slope of the tangent to the curve (λ_miou > 0), that is, the negative of the derivative function of the curve.
  • the average bit rate of the reconstructed video under different quantization parameters can be calculated according to the encoded files.
  • a large amount of experimental test data can be used to determine, by fitting, the relationship between the first distortion value (D_miou) and the bit rate (R); the fitting curve is shown in Fig. 4, from which the first relation function is obtained.
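The step from the fitted distortion-rate curve to the second calculation model can be illustrated with a short sketch. Formula (6) is not reproduced in this passage, so the power-law form D(R) = a * R**b used here is an assumption, and the coefficients in the usage example are purely illustrative.

```python
from typing import Tuple


def derive_second_model(a: float, b: float) -> Tuple[float, float]:
    """For an assumed curve D(R) = a * R**b, return (weight, exponent)
    such that lambda(R) = -dD/dR = weight * R**exponent."""
    return (-a * b, b - 1.0)


def numeric_lambda(a: float, b: float, rate: float, h: float = 1e-5) -> float:
    """Negated central-difference slope of D(R) = a * R**b at `rate`."""
    d_plus = a * (rate + h) ** b
    d_minus = a * (rate - h) ** b
    return -(d_plus - d_minus) / (2.0 * h)


if __name__ == "__main__":
    a, b = 2.0, -0.5  # illustrative coefficients, not values from the source
    weight, exponent = derive_second_model(a, b)
    rate = 4.0
    print(weight * rate ** exponent, numeric_lambda(a, b, rate))
```

The analytic and numeric values agree, which is the sense in which the multiplier is the negated derivative of the distortion-rate curve.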
  • the determining the second Lagrangian multiplier according to the pre-parameter may include:
  • the second Lagrangian multiplier is determined according to a preset third calculation model; wherein, the third calculation model represents the corresponding relationship between the second Lagrangian multiplier and the quantization parameter.
  • the determining the third calculation model parameter may include:
  • the second Lagrangian multiplier is set equal to a weighted value of an exponential power of 2;
  • the third calculation model parameter includes a third exponent parameter indicating the power of the exponent and a third weighting coefficient indicating the weighting, the third exponent parameter being related to a quantization parameter.
  • Equation (9) is the third calculation model, which is used to represent the correspondence between the second Lagrangian multiplier and the quantization parameter.
  • the third calculation model parameter may include a third exponent parameter (i.e., (QP-12)/3 in the formula) and a third weighting coefficient (i.e., 0.57 in the formula); the value of the third exponent parameter is related to the quantization parameter (QP).
  • the parameters of the third calculation model may be a preset value; it may also be obtained by fitting according to a large amount of experimental test data of the test video, which is not limited here.
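As a minimal sketch of the third calculation model, the second Lagrangian multiplier can be computed from the quantization parameter as a weighted power of 2. The function and argument names are hypothetical; the constants 0.57, 12 and 3 are the ones quoted in the text.

```python
def lambda_sse(qp: float,
               weight: float = 0.57,
               offset: float = 12.0,
               divisor: float = 3.0) -> float:
    """Second Lagrangian multiplier: weight * 2 ** ((qp - offset) / divisor)."""
    return weight * 2.0 ** ((qp - offset) / divisor)


if __name__ == "__main__":
    # Preset quantization parameters mentioned in the text.
    for qp in (22, 27, 32, 37):
        print(qp, lambda_sse(qp))
```

With these constants the multiplier doubles every time the quantization parameter increases by 3.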
  • the determining the quantization parameter of the coding unit in the to-be-coded video may include: using a rate control manner to determine the quantization parameter of the coding unit in the to-be-coded video.
  • the determining the quantization parameter of the coding unit in the to-be-coded video may include: setting the quantization parameter to a preset value.
  • the quantization parameter in the video to be encoded may be set to a preset value, such as 22, 27, 32, 37, and so on.
  • the quantization parameter can also be determined by using the code rate control method; specifically, current rate control algorithms mainly control the code rate by adjusting the quantization parameter, so the required quantization parameter can also be obtained by controlling the code rate.
  • in addition to determining the first Lagrangian multiplier and the second Lagrangian multiplier, the distortion value also needs to be determined.
  • the first distortion value is obtained based on the first distortion metric criterion
  • the second distortion value is obtained based on the second distortion metric criterion.
  • the first distortion metric criterion may be a semantic distortion metric criterion.
  • in order to improve the semantic segmentation accuracy of the reconstructed video, it is first necessary to define a semantic distortion metric. Specifically, multiple quantization parameters can be selected, and VVC encoding is then performed on multiple (for example, 59) test video sequences in the large-scale urban scene dataset (Cityscapes) under the random access (RA) condition. Semantic segmentation is performed on the video before and after encoding, so that the accuracy of the semantic segmentation result can be calculated according to the corresponding annotation data.
  • measuring the accuracy of semantic segmentation can usually be expressed by mean Intersection over Union (mIoU), where mIoU refers to the average value of Intersection over Union (IoU) of all categories.
  • IoU is used as a detection evaluation function; put simply, it is the overlap rate between the generated prediction window and the real window, that is, the ratio of the intersection of the detection result area (Detection Result) and the ground truth area (Ground Truth) to the union of the two. This ratio is the semantic accuracy (represented by IoU).
  • the determining the first distortion value according to the first distortion metric criterion may include:
  • Based on the test video, semantically segment the test video to determine the semantic accuracy of one or more categories;
  • Distortion measurement is performed on the target semantic accuracy by using the fourth calculation model to obtain the first distortion value.
  • determining the target semantic accuracy according to the semantic accuracy of one or more categories may include:
  • a weighted sum of the semantic accuracy of the one or more categories is calculated, and the resulting weighted sum is determined as the target semantic accuracy.
  • a specific implementation is to set the weight to 1; in this case, the average value of the semantic accuracy of the one or more categories is calculated, and the obtained average is determined as the target semantic accuracy.
  • the two sets can represent the predicted value and the real value respectively, that is, A_pred is the predicted segmentation result area, and A_true is the labeled segmentation result area;
  • the semantic accuracy of each category can be represented by IoU, which is calculated as IoU = |A_pred ∩ A_true| / |A_pred ∪ A_true|.
  • the target semantic accuracy can be obtained by taking the average value, which can be expressed as mIoU.
  • mIoU refers to the average IoU of all categories, and its value ranges from 0 to 1; the larger the value, the higher the semantic accuracy.
  • given the IoU of n classes, mIoU is calculated as the average of the n per-class IoU values, i.e., mIoU = (IoU_1 + IoU_2 + ... + IoU_n) / n.
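The IoU and mIoU definitions above can be sketched on flattened label maps. This is an illustrative sketch only: the function names are hypothetical, and skipping classes that appear in neither map is an assumption, not something the text specifies.

```python
from typing import Dict, Sequence


def per_class_iou(pred: Sequence[int], true: Sequence[int],
                  classes: Sequence[int]) -> Dict[int, float]:
    """IoU per class: |A_pred & A_true| / |A_pred | A_true| over pixel indices."""
    if len(pred) != len(true):
        raise ValueError("label maps must have the same number of pixels")
    ious: Dict[int, float] = {}
    for c in classes:
        a_pred = {i for i, p in enumerate(pred) if p == c}
        a_true = {i for i, t in enumerate(true) if t == c}
        union = a_pred | a_true
        if not union:
            continue  # class absent from both maps: skipped (assumption)
        ious[c] = len(a_pred & a_true) / len(union)
    return ious


def mean_iou(ious: Dict[int, float]) -> float:
    """Average IoU over the classes that were evaluated."""
    if not ious:
        raise ValueError("no classes to average")
    return sum(ious.values()) / len(ious)


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]  # predicted class per pixel (toy data)
    true = [0, 1, 1, 1, 2, 0]  # annotated class per pixel (toy data)
    ious = per_class_iou(pred, true, classes=[0, 1, 2])
    print(ious, mean_iou(ious))
```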
  • the use of the fourth calculation model to perform distortion measurement on the target semantic accuracy to obtain the first distortion value includes:
  • the fourth calculation model parameter representing the correspondence between the first distortion value and the target semantic accuracy
  • the first distortion value is obtained according to the target semantic accuracy and the fourth calculation model.
  • the determining of the fourth calculation model parameter may include:
  • the first distortion value is set equal to a weighted value of the logarithm of the target semantic accuracy
  • the fourth calculation model parameter includes a base parameter indicating the logarithm and a fourth weighting coefficient parameter indicating the weighting.
  • the fourth calculation model parameter is set as a preset value.
  • a semantic distortion metric (i.e., the first distortion value, represented by D_miou) is defined, and its calculation formula is shown in the following formula (13).
  • formula (13) is the fourth calculation model, which is used to represent the correspondence between the first distortion value (D_miou) and the target semantic accuracy (mIoU).
  • the fourth calculation model parameter may include a base parameter (that is, the base of ln in the formula is 10) and a fourth weighting coefficient parameter (that is, -10 in the formula), the latter being a preset magnification applied to the logarithm of the target semantic accuracy (mIoU).
  • the determination of the parameters of the fourth calculation model may be a preset value, or may be obtained by fitting according to a large amount of test data of a test video, which is not limited here.
  • the natural logarithmic function maps the bounded mIoU value to an unbounded range, and the multiplied coefficient scales the obtained value to match the magnitude of the distortion in the rate-distortion optimization algorithm. In this way, when mIoU tends to 0, D_miou tends to infinity; when mIoU tends to 1, D_miou tends to 0.
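A hedged sketch of the fourth calculation model follows. The text mentions both a natural logarithm and a base of 10, so the base is left as a parameter here rather than fixed; the natural-log default and the function name are assumptions made for illustration.

```python
import math


def semantic_distortion(miou: float,
                        weight: float = -10.0,
                        base: float = math.e) -> float:
    """First distortion value as a weighted logarithm of mIoU.

    `weight` defaults to the -10 quoted in the text; `base` is a parameter
    because the source is ambiguous about it (assumption: natural log).
    """
    if not 0.0 < miou <= 1.0:
        raise ValueError("mIoU must lie in (0, 1]")
    return weight * math.log(miou, base)


if __name__ == "__main__":
    # Distortion grows without bound as mIoU approaches 0 and is 0 at mIoU = 1.
    for miou in (0.2, 0.5, 0.8, 1.0):
        print(miou, semantic_distortion(miou), semantic_distortion(miou, base=10.0))
```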
  • the first distortion value may also be related to the target mean square error of the coding unit in the video to be encoded.
  • the determining the first distortion value according to the first distortion metric criterion may include:
  • the fifth calculation model representing a third relationship function between the first distortion value and the mean square error
  • the target mean square error of the coding unit in the video to be encoded is determined, and the first distortion value is determined according to the target mean square error and the fifth calculation model.
  • the reconstructed video is obtained by performing video decoding and reconstruction on the encoded video.
  • video reconstruction can be performed on the encoded video under the quantization parameter to obtain the reconstructed video under that quantization parameter; according to the reconstructed video and the original video, the mean squared error (Mean Squared Error, MSE) of the reconstructed video under the quantization parameter can be obtained.
  • MSE can evaluate the degree of change of the data. The smaller the value of MSE, the better the accuracy of the prediction model in describing the experimental data.
  • the determining the parameters of the fifth calculation model may include:
  • the first distortion value is set equal to the product of the target mean square error and the first parameter factor, plus the second parameter factor;
  • the fifth calculation model parameter includes the first parameter factor and the second parameter factor.
  • Equation (14) is the fifth calculation model, which is used to represent the corresponding relationship between the first distortion value and the mean square error.
  • the fifth calculation model parameter may include a first parameter factor (ie, 0.6276 in the formula) and a second parameter factor (ie, 3.48 in the formula).
  • the determination of the parameter of the fifth calculation model may be a preset value, or may be obtained by fitting according to a large amount of experimental test data of the test video, which is not limited herein.
  • the method may further include:
  • the fifth calculation model parameter is set to a preset value.
  • the first parameter factor may be set to 0.6276, and the second parameter factor may be set to 3.48. After the first parameter factor and the second parameter factor are determined, a fifth calculation model can be obtained, so as to determine the first distortion value according to the target mean square error.
  • the method may further include:
  • the fifth calculation model parameter is determined according to the third relational function.
  • test video here may also be one or more test videos, for example, the test video may be multiple (eg, 59) test video sequences in a large-scale urban scene dataset (Cityscapes).
  • the first distortion metric is used for the test video, and the average MSE of the reconstructed video under different quantization parameters is counted according to the encoded files.
  • a large amount of experimental test data can be used to determine, by fitting, the relationship between the first distortion value (D_miou) and the MSE; the fitting curve is shown in FIG. 6, and because the fitting curve is linear, the fifth calculation model can be obtained.
  • the determining the pre-parameter of the video to be encoded may further include: determining the target mean square error of the coding units in the video to be encoded. In this way, after the fifth calculation model is obtained, the first distortion value can be determined according to the target mean square error and the fifth calculation model shown in formula (14).
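The fifth calculation model and the linear fit it comes from can be sketched as below. The ordinary least-squares fit is an assumption about how the fitting might be done (the text only says the curve is fitted and linear); the sample points in the usage example are synthetic, not experimental data.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def distortion_from_mse(mse: float,
                        first_factor: float = 0.6276,
                        second_factor: float = 3.48) -> float:
    """First distortion value from the target MSE (preset factors from the text)."""
    return first_factor * mse + second_factor


if __name__ == "__main__":
    mses = [5.0, 10.0, 20.0, 40.0]                   # synthetic MSE samples
    dists = [distortion_from_mse(m) for m in mses]   # synthetic distortion samples
    print(fit_line(mses, dists))
```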
  • the first distortion value may also be determined according to the difference between the first mIoU value (the semantic segmentation result of the video before encoding) and the second mIoU value (the semantic segmentation result of the encoded video); the embodiment of the present application does not make any specific limitation.
  • the second distortion metric criterion may be a numerical error criterion.
  • the determining the second distortion value according to the second distortion metric criterion may include:
  • the coding unit includes at least one of the following: a picture, a slice (Slice), a sub-picture (Sub-picture), a tile (tile), and a coding block;
  • the second distortion value is determined according to the reconstructed value and the original value of the coding unit
  • the numerical error criterion is one of the following: the Sum of Absolute Differences (SAD) criterion, the Mean Absolute Deviation (MAD) criterion, the Sum of Square Error (SSE) criterion, and the Mean Square Error (MSE) criterion. It should be noted that the numerical error criterion is not limited to these criteria, and may also be other criteria, which are not specifically limited in the embodiments of the present application.
  • the second distortion value is represented by SSE, which is calculated as SSE = Σ_x Σ_y [f(x, y) - g(x, y)]², summed over all M × N pixel positions (x, y);
  • M and N represent the horizontal spatial resolution and vertical spatial resolution of the video, respectively, f(x, y) represents the original pixel value at the pixel position (x, y), and g(x, y) represents the reconstructed pixel value at the pixel position (x, y).
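The numerical error criteria listed above can be sketched on small two-dimensional pixel arrays. The function names are hypothetical, and MAD is computed here as the mean absolute difference between original and reconstructed pixels, which is an assumption about the intended definition.

```python
from typing import List, Sequence


def _diffs(orig: Sequence[Sequence[float]],
           recon: Sequence[Sequence[float]]) -> List[float]:
    """Flatten f(x, y) - g(x, y) over all pixel positions."""
    if len(orig) != len(recon) or any(len(a) != len(b) for a, b in zip(orig, recon)):
        raise ValueError("original and reconstruction must have the same shape")
    return [f - g for row_f, row_g in zip(orig, recon) for f, g in zip(row_f, row_g)]


def sad(orig, recon) -> float:
    """Sum of absolute differences."""
    return sum(abs(d) for d in _diffs(orig, recon))


def sse(orig, recon) -> float:
    """Sum of squared errors."""
    return sum(d * d for d in _diffs(orig, recon))


def mse(orig, recon) -> float:
    """Mean squared error: SSE divided by the number of pixels."""
    return sse(orig, recon) / len(_diffs(orig, recon))


def mad(orig, recon) -> float:
    """Mean absolute difference (assumed meaning of MAD here)."""
    return sad(orig, recon) / len(_diffs(orig, recon))


if __name__ == "__main__":
    original = [[10, 20], [30, 40]]       # toy 2x2 block
    reconstructed = [[12, 18], [30, 44]]  # toy reconstruction
    print(sad(original, reconstructed), sse(original, reconstructed),
          mse(original, reconstructed), mad(original, reconstructed))
```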
  • the target Lagrangian multiplier (represented by λ) can be calculated from λ_miou and λ_SSE
  • the target distortion value (represented by D) can be calculated from D_miou and SSE.
  • the determining a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier may include:
  • the first preset parameter is used to control weight values corresponding to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • the first Lagrangian multiplier and the second Lagrangian multiplier are weighted and calculated by using the first preset parameter to obtain the target Lagrangian multiplier.
  • the first preset parameter can control the weight values corresponding to the first Lagrangian multiplier and the second Lagrangian multiplier.
  • the determining the first preset parameter may include:
  • the first preset parameter is set according to the configuration information of the encoder.
  • the method can also include:
  • the target Lagrangian multiplier is set equal to a weighted sum of the first Lagrangian multiplier and the second Lagrangian multiplier, where k is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first Lagrangian multiplier is set equal to 1 - k, and the weighting coefficient of the second Lagrangian multiplier is set equal to k.
  • when the weighting coefficient of the second Lagrangian multiplier is set to k, the weighting coefficient of the first Lagrangian multiplier can be set to 1 - k; in this way, the calculation formula of the target Lagrangian multiplier (formula (16)) is λ = (1 - k)·λ_miou + k·λ_SSE, where
  • λ represents the target Lagrangian multiplier
  • λ_miou represents the first Lagrangian multiplier
  • λ_SSE represents the second Lagrangian multiplier
  • 1 - k and k represent the weighting coefficients of the first Lagrangian multiplier and the second Lagrangian multiplier, respectively.
  • k may be a constant within the range of 0 to 1.
  • the value of k may be equal to 0.5, may also be 0.75, or may be a variable value (for example, obtained by performing a certain calculation on the current coding unit), which is not specifically limited in this embodiment of the present application.
  • a typical value of k can be equal to 0.75.
  • the determining a target distortion value according to the first distortion value and the second distortion value includes:
  • the second preset parameter is used to control the weight value corresponding to the first distortion value and the second distortion value
  • the first distortion value and the second distortion value are weighted and calculated by using the second preset parameter to obtain the target distortion value.
  • the second preset parameter can control the weight values corresponding to the first distortion value and the second distortion value.
  • the determining the second preset parameter may include:
  • the second preset parameter is set according to the configuration information of the encoder.
  • the method can also include:
  • the target distortion value is set equal to the weighted sum of the first distortion value and the second distortion value, where m is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first distortion value is set equal to 1 - m, and the weighting coefficient of the second distortion value is set equal to m.
  • when the weighting coefficient of the second distortion value is set to m, the weighting coefficient of the first distortion value can be set to 1 - m; in this way, the calculation formula of the target distortion value (formula (17)) is D = (1 - m)·D_miou + m·SSE, where
  • D represents the target distortion value
  • D_miou represents the first distortion value
  • SSE represents the second distortion value
  • 1 - m and m represent the weighting coefficients of the first distortion value and the second distortion value, respectively.
  • m may be a constant within the range of 0 to 1.
  • the value of m may be equal to 0.5, may also be 0.75, or may be a variable value (for example, obtained by performing a certain calculation on the current coding unit), which is not specifically limited in this embodiment of the present application.
  • a typical value of m can be equal to 0.75.
  • the values of the first preset parameter and the second preset parameter may be set to be the same or different.
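The two weighted sums described above share the same shape, which a small sketch makes explicit. The helper name `blend` and the sample inputs are hypothetical; the default weight of 0.75 is the typical value quoted in the text for both k and m.

```python
def blend(first: float, second: float, weight: float) -> float:
    """Return (1 - weight) * first + weight * second for a weight in [0, 1]."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * first + weight * second


def target_lambda(lambda_miou: float, lambda_sse: float, k: float = 0.75) -> float:
    """Target Lagrangian multiplier: (1 - k) * lambda_miou + k * lambda_sse."""
    return blend(lambda_miou, lambda_sse, k)


def target_distortion(d_miou: float, sse: float, m: float = 0.75) -> float:
    """Target distortion value: (1 - m) * d_miou + m * sse."""
    return blend(d_miou, sse, m)


if __name__ == "__main__":
    # Illustrative inputs only; k and m may be set to the same or different values.
    print(target_lambda(2.0, 10.0, k=0.75))
    print(target_distortion(4.0, 120.0, m=0.5))
```

Keeping k and m as separate arguments mirrors the statement that the two preset parameters may be set to the same or different values.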
  • the values of the first preset parameter and the second preset parameter are the same, for example, both can be represented by a single shared parameter.
  • the calculation formula of the target Lagrange multiplier and the target distortion value can be as follows,
  • the shared parameter is a constant in the range of 0 to 1, which can be used to control not only the respective weights of the first Lagrangian multiplier and the second Lagrangian multiplier, but also the respective weights of the semantic distortion (i.e., the first distortion value) and the fidelity distortion (i.e., the second distortion value).
  • the shared parameter is typically set to 0.75.
  • λ_miou represents the machine-oriented quality, λ_SSE represents the subjective quality viewed by the human eye, and the shared parameter adjusts the balance between the subjective quality viewed by the human eye and the machine-oriented quality. For example, if the shared parameter is equal to 1, the target distortion value is entirely the subjective quality viewed by the human eye; if it is equal to 0, the target distortion value is entirely the machine-oriented quality.
  • the value of the shared parameter can be set through the configuration information of the encoder.
  • one implementation is to set the value directly according to the application requirements, such as the cases of 0 and 1 described above; another implementation is to set the working mode of the encoder: if the encoder is set to the human-eye working mode, it sets the shared parameter to 1; if it is set to the machine working mode, it sets the shared parameter to 0; if it is set to the human-machine hybrid mode, it adaptively determines the value of the shared parameter, for example by pre-encoding the video to be encoded in the preprocessing stage and then estimating the value from the pre-encoding result.
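The mapping from encoder working mode to the shared weight can be sketched as follows. The enum, the function name and the clamping of the estimate are hypothetical; in particular, the text does not say how the hybrid-mode estimate is derived from pre-encoding, so it is passed in by the caller here.

```python
from enum import Enum
from typing import Optional


class WorkingMode(Enum):
    HUMAN = "human"      # output is viewed by the human eye
    MACHINE = "machine"  # output is consumed by a machine vision task
    HYBRID = "hybrid"    # human-machine hybrid


def select_weight(mode: WorkingMode, estimated: Optional[float] = None) -> float:
    """Return the shared weight for the given working mode.

    In HYBRID mode the caller supplies an estimate (for example from a
    pre-encoding pass); how that estimate is computed is not specified in
    the text, and clamping it to [0, 1] is an assumption.
    """
    if mode is WorkingMode.HUMAN:
        return 1.0
    if mode is WorkingMode.MACHINE:
        return 0.0
    if estimated is None:
        raise ValueError("hybrid mode needs an estimated weight")
    return min(1.0, max(0.0, estimated))


if __name__ == "__main__":
    print(select_weight(WorkingMode.HUMAN))
    print(select_weight(WorkingMode.MACHINE))
    print(select_weight(WorkingMode.HYBRID, estimated=0.75))
```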
  • the encoding parameters of the video to be encoded can be determined according to the target Lagrangian multiplier and the target distortion value, so as to encode the video to be encoded.
  • the determining the encoding parameter of the video to be encoded by using the target Lagrangian multiplier and the target distortion value may include:
  • a minimum rate-distortion cost value is selected from the determined rate-distortion cost values, and a candidate encoding parameter corresponding to the minimum rate-distortion cost value is determined as the encoding parameter of the video to be encoded.
  • the encoding parameters include at least a parameter indicating a division manner of the to-be-encoded video and a parameter for constructing a prediction value of an encoded block in the to-be-encoded video.
  • the encoding the video to be encoded may include: writing the encoding parameter into a code stream.
  • the rate-distortion cost function can be constructed; then one or more candidate encoding parameters are used to pre-encode the video to be encoded, so as to determine the rate-distortion cost value corresponding to each candidate encoding parameter.
  • the coding parameters determined at this time are the optimal coding parameters (with the lowest rate-distortion cost), and coding is then performed; in this process, the coding parameters can also be written into the code stream for transmission from the encoder to the decoder, so that the original to-be-encoded video can be restored on the decoder side.
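The selection step above can be sketched as a search over pre-encoded candidates. The cost form J = D + λ·R is the conventional Lagrangian rate-distortion cost and is an assumption here, since this passage does not spell out the cost function; the `Candidate` type, its fields and the sample numbers are hypothetical.

```python
from typing import Iterable, NamedTuple


class Candidate(NamedTuple):
    name: str      # label for a candidate encoding parameter set
    d_miou: float  # semantic distortion measured after pre-encoding
    sse: float     # fidelity distortion measured after pre-encoding
    rate: float    # bits spent by the candidate


def rd_cost(c: Candidate, lam_miou: float, lam_sse: float, w: float) -> float:
    """J = D + lambda * R with both D and lambda blended by the same weight w."""
    lam = (1.0 - w) * lam_miou + w * lam_sse
    dist = (1.0 - w) * c.d_miou + w * c.sse
    return dist + lam * c.rate


def select_best(cands: Iterable[Candidate], lam_miou: float,
                lam_sse: float, w: float) -> Candidate:
    """Pick the candidate with the minimum rate-distortion cost."""
    return min(cands, key=lambda c: rd_cost(c, lam_miou, lam_sse, w))


if __name__ == "__main__":
    candidates = [
        Candidate("A", d_miou=5.0, sse=100.0, rate=10.0),
        Candidate("B", d_miou=8.0, sse=60.0, rate=20.0),
    ]
    # With w = 1 only fidelity matters; with w = 0 only semantic distortion matters.
    print(select_best(candidates, lam_miou=0.1, lam_sse=3.0, w=1.0).name)
    print(select_best(candidates, lam_miou=0.1, lam_sse=3.0, w=0.0).name)
```

Using one weight for both the distortion and the multiplier corresponds to the case where the first and second preset parameters take the same value.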
  • in the embodiment of the present application, which uses VVC, the distortion criterion in the rate-distortion optimization process is modified to a weighted combination of the semantic distortion D_miou and the fidelity distortion SSE, as shown in the above formula (17) or formula (19); the corresponding target Lagrangian multiplier is modified to a weighted combination of λ_miou and λ_SSE, as shown in the above formula (16) or formula (18). In this way, the rate-distortion process of the VVC standard encoder can be optimized according to the multi-distortion-criterion rate-distortion optimization algorithm for human-machine vision, improving the semantic segmentation accuracy of the reconstructed video at a given bit rate while maintaining good fidelity performance.
  • VVC Test Model (VTM)
  • the rate-distortion process in VVC is optimized according to the video encoding method of the embodiment of the present application; different QPs are then selected, the test video is encoded by the optimized encoder to obtain the encoding bit rate, and the reconstructed video obtained by encoding is semantically segmented and the segmentation accuracy is calculated.
  • the BD-miou and BD-rate of the reconstructed video of the embodiment of the present application compared with the reconstructed video of the VVC standard encoder can then be calculated, which measures the performance of the video coding method of the embodiment of the present application in terms of video semantic accuracy at the same bit rate.
  • Table 1 shows the performance of the video coding method of the application embodiment in terms of semantic accuracy.
  • BD-miou represents the improvement of the semantic accuracy of the reconstructed video under the same bit rate.
  • a BD-miou greater than 0 indicates that the semantic accuracy is improved, and a BD-miou less than 0 indicates that the semantic accuracy has decreased; BD-rate represents the increase of the coding rate under the same semantic accuracy: if BD-rate is greater than 0, the coding rate increases; if BD-rate is less than 0, the coding rate decreases, that is, the coding efficiency is improved.
  • the PSNR and coding rate of the reconstructed video are calculated from the encoded files.
  • the BD-rate and BD-PSNR of the reconstructed video compared with the VVC standard encoder are obtained, which measure the performance of the video encoding method of the embodiment of the present application in terms of video fidelity at the same bit rate, compared with the VVC standard encoder.
  • BD-PSNR represents the increase of reconstructed video fidelity under the same bit rate.
  • a BD-PSNR greater than 0 indicates that the fidelity has increased, and a BD-PSNR less than 0 indicates that the fidelity has decreased; BD-rate represents the increase of the coding rate at the same fidelity: if BD-rate is greater than 0, the code rate increases; if BD-rate is less than 0, the code rate decreases, that is, the coding efficiency is improved.
  • the BD-miou obtained according to the experimental results is 0.0112, indicating that the video coding method of the embodiment of the present application can improve the accuracy of semantic segmentation of reconstructed video under the same bit rate.
  • the overall semantic effect of the embodiments of the present application is better than that of the VVC standard encoder. That is to say, the embodiments of the present application can improve the semantic segmentation accuracy of the reconstructed video under the condition of the same bit rate.
  • the BD-rate obtained according to the experimental results is -24.8673, indicating that the video coding method of the embodiment of the present application can reduce the video coding bit rate under the same semantic accuracy. That is to say, the embodiments of the present application can reduce the code rate with the same semantic accuracy.
  • the BD-PSNR obtained according to the experimental results is 0.0316, indicating that the video coding method of the embodiment of the present application can improve the fidelity of the reconstructed video under the condition of the same bit rate. That is to say, the embodiments of the present application can improve the fidelity of the reconstructed video under the condition of the same bit rate.
  • the BD-rate obtained according to the experimental results is -1.0836, indicating that the video encoding method of the embodiment of the present application can reduce the video encoding bit rate under the same fidelity. That is to say, the embodiments of the present application can reduce the code rate with the same fidelity.
  • the PSNR performance of the reconstructed video is basically not degraded compared with the VVC standard encoder.
  • the subjective performance of the embodiment of the present application is better than that of VVC, which shows that the video coding method of the embodiment of the present application can ensure the fidelity of the reconstructed video while improving the semantic accuracy, and can satisfy the demand for subjective viewing of the video. That is to say, the embodiments of the present application can ensure the fidelity of the video while improving the semantic effect.
  • the video encoding method of the embodiment of the present application optimizes the rate-distortion process in VVC, and does not change the video encoding and decoding process and code stream structure, and therefore does not increase the complexity of encoding and decoding.
  • the video coding method of the embodiment of the present application can also reduce the coding bit rate of the video, thereby shortening the time required for coding and improving the coding speed.
  • the embodiment of the present application defines a semantic distortion metric and derives the corresponding first Lagrangian multiplier; through preset parameters (including the first preset parameter and the second preset parameter), it adjusts the weights of the semantic distortion and the SSE distortion, as well as the weights of the first Lagrangian multiplier and the second Lagrangian multiplier, so as to optimize the rate-distortion process of video coding. In this way, the semantic segmentation accuracy of the reconstructed video can be improved at a given bit rate while good fidelity performance is maintained.
  • This embodiment provides a video encoding method, which is applied to an encoder.
  • the pre-parameters of the video to be encoded are determined, and the first Lagrangian multiplier and the second Lagrangian multiplier are determined according to the pre-parameters; the target Lagrangian multiplier is determined according to the first Lagrangian multiplier and the second Lagrangian multiplier; a first distortion value is determined according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; a second distortion value is determined according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; a target distortion value is determined according to the first distortion value and the second distortion value; and the target Lagrangian multiplier and the target distortion value are used to determine the encoding parameters of the to-be-encoded video, and the to-be-encoded video is encoded.
  • In this way, rate-distortion optimization in video coding jointly considers the first distortion metric, based on the semantic distortion metric, and the second distortion metric, based on the numerical error metric. This is well suited to application scenarios oriented to machine vision and human-machine vision: at a given bit rate, it can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity performance, thereby improving coding efficiency.
  • FIG. 7 shows a schematic structural diagram of the composition of an encoder 70 provided by an embodiment of the present application.
  • the encoder 70 may include: a determining unit 701, a calculation unit 702 and an encoding unit 703; wherein,
  • a determining unit 701 configured to determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
  • a calculation unit 702 configured to determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • the determining unit 701 is further configured to determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; and determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
  • the calculation unit 702 is further configured to determine a target distortion value according to the first distortion value and the second distortion value;
  • the encoding unit 703 is configured to use the target Lagrangian multiplier and the target distortion value to determine encoding parameters of the video to be encoded, and to encode the video to be encoded.
  • the pre-parameter includes a quantization parameter, and the determining unit 701 is further configured to determine a quantization parameter of a coding unit in the to-be-encoded video, wherein the coding unit includes at least one of the following: image, slice, sub-image, tile, coding block.
  • the determining unit 701 is further configured to determine a first calculation model parameter, where the first calculation model represents the corresponding relationship between the first Lagrangian multiplier and the quantization parameter; and determine the first Lagrangian multiplier according to the quantization parameter and the first calculation model.
  • the determining unit 701 is further configured to, in the first calculation model, set the first Lagrangian multiplier equal to a weighted exponential power of the quantization parameter; wherein the first calculation model parameters include a first exponential parameter indicating the exponential power and a first weighting coefficient indicating the weighting.
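The first calculation model can be sketched as follows. The text does not fix whether the "exponential power" means QP raised to an exponent or an exponent raised to QP; this sketch assumes the power-law reading, lambda_1 = weight * QP ** exponent. The function name and the parameter values in the usage line are hypothetical.

```python
def first_lagrangian_multiplier(qp: float, weight: float, exponent: float) -> float:
    """Evaluate lambda_1 = weight * qp ** exponent.

    Assumption: power-law reading of the first calculation model. The
    alternative reading, weight * exponent ** qp, is equally compatible
    with the wording of the description.
    """
    if qp <= 0:
        # A non-positive QP is rejected here only to keep the power law
        # well defined for arbitrary (possibly negative) exponents.
        raise ValueError("qp must be positive in this sketch")
    return weight * qp ** exponent


# Hypothetical model parameters, for illustration only.
lambda_1 = first_lagrangian_multiplier(qp=32, weight=0.5, exponent=2.0)
```

Under this reading, the first weighting coefficient scales the multiplier and the first exponential parameter controls how quickly it grows with the quantization parameter.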
  • the determining unit 701 is further configured to set the parameter of the first calculation model to a preset value.
  • the determining unit 701 is further configured to, based on the test video, use the first distortion metric criterion to determine a first relationship function between the first distortion value and the bit rate of the test video, and perform a derivative operation on the first relationship function to determine a derivative function of the first relationship function; and, based on the test video, determine a second relationship function between the bit rate and the quantization parameter, and determine the first calculation model parameter according to the derivative function and the second relationship function.
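A worked derivation may make this step concrete. The functional forms below are assumptions chosen for illustration (a hyperbolic distortion-rate curve and a power-law rate-QP curve, with hypothetical fitted constants C, K, u, v); the description only states that the two relationship functions are obtained from a test video.

```latex
\begin{align}
  \lambda_1 &= -\frac{\partial D_1}{\partial R}
    && \text{(slope of the distortion-rate curve)} \\
  D_1 &= f(R) = C\,R^{-K}
    && \text{(assumed first relationship function)} \\
  R &= g(QP) = u\,QP^{-v}
    && \text{(assumed second relationship function)} \\
  \lambda_1(QP) &= -f'\bigl(g(QP)\bigr)
     = C\,K\,\bigl(u\,QP^{-v}\bigr)^{-K-1}
     = C\,K\,u^{-(K+1)}\,QP^{\,v(K+1)}
\end{align}
```

Under these assumptions the first calculation model takes the form \(\lambda_1 = a \cdot QP^{b}\) with \(a = C K u^{-(K+1)}\) and \(b = v(K+1)\), so both model parameters follow from the derivative function and the second relationship function.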
  • the pre-parameter includes a quantization parameter and a target code rate, and the determining unit 701 is further configured to determine a quantization parameter and a target code rate of a coding unit in the to-be-encoded video, wherein the coding unit includes at least one of the following: image, slice, sub-image, tile, coding block.
  • the determining unit 701 is further configured to determine a second calculation model parameter, where the second calculation model represents the corresponding relationship between the first Lagrangian multiplier and the code rate; and determine the first Lagrangian multiplier according to the target code rate and the second calculation model.
  • the determining unit 701 is further configured to, in the second calculation model, set the first Lagrangian multiplier equal to a weighted exponential power of the target code rate; wherein the second calculation model parameters include a second exponential parameter indicating the exponential power and a second weighting coefficient indicating the weighting.
  • the determining unit 701 is further configured to set the parameter of the second calculation model to a preset value.
  • the determining unit 701 is further configured to, based on the test video, use the first distortion metric criterion to determine a first relationship function between the first distortion value and the bit rate of the test video; and perform a derivative operation on the first relationship function to determine the second calculation model parameter.
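The second calculation model and the derivation of its parameters can be sketched as below. The hyperbolic form D(R) = c * R ** (-k) for the first relationship function is an assumption (the description does not give the fitted form), and all function names are hypothetical.

```python
from typing import Tuple


def rate_lambda_params(c: float, k: float) -> Tuple[float, float]:
    """Derive (alpha, beta) for lambda_1 = alpha * R ** beta.

    Assumes the fitted distortion-rate curve is D(R) = c * R ** (-k), so
    lambda_1 = -dD/dR = c * k * R ** (-k - 1).
    """
    return c * k, -k - 1.0


def first_lagrangian_from_rate(target_rate: float, alpha: float, beta: float) -> float:
    """Evaluate the second calculation model at the target code rate."""
    if target_rate <= 0:
        raise ValueError("target_rate must be positive")
    return alpha * target_rate ** beta


# Hypothetical fitted curve D(R) = 4 / R, evaluated at a target rate of 2.
alpha, beta = rate_lambda_params(c=4.0, k=1.0)
lambda_1 = first_lagrangian_from_rate(2.0, alpha, beta)
```

Here alpha plays the role of the second weighting coefficient and beta the role of the second exponential parameter.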
  • the determining unit 701 is further configured to determine the target bit rate of the coding unit in the to-be-coded video by using a bit allocation method.
  • the determining unit 701 is further configured to determine the second Lagrangian multiplier according to a preset third calculation model; wherein the third calculation model represents the correspondence between the second Lagrangian multiplier and the quantization parameter.
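The description says only that the third calculation model is preset. As an illustration, the sketch below assumes the exponential QP-to-lambda form commonly used by HEVC/VVC-style reference encoders, lambda = scale * 2 ** ((QP - 12) / 3); the function name and the default scale value are hypothetical and are not taken from the source.

```python
def second_lagrangian_multiplier(qp: float, scale: float = 0.57) -> float:
    """Evaluate lambda_2 = scale * 2 ** ((qp - 12) / 3).

    Assumption: an HEVC/VVC-style preset model. In practice the scale
    depends on the encoder configuration (frame type, hierarchy level).
    """
    return scale * 2.0 ** ((qp - 12.0) / 3.0)


lambda_2 = second_lagrangian_multiplier(qp=32)
```

In this form, each increase of 3 in the quantization parameter doubles the multiplier.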
  • the determining unit 701 is further configured to use a rate control manner to determine the quantization parameter of the coding unit in the to-be-coded video.
  • the determining unit 701 is further configured to set the quantization parameter to a preset value.
  • the determining unit 701 is further configured to perform semantic segmentation on the test video and determine the semantic accuracy of one or more categories; and determine the target semantic accuracy according to the semantic accuracy of the one or more categories;
  • the calculation unit 702 is further configured to use a fourth calculation model to perform a distortion measurement on the target semantic accuracy to obtain the first distortion value.
  • the calculating unit 702 is further configured to calculate a weighted sum of the semantic accuracy of the one or more categories, and determine the obtained weighted sum as the target semantic accuracy.
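The weighted sum over per-category semantic accuracies can be sketched as follows. The function name is hypothetical, and the uniform default weights (which make the result a plain mean over categories) are an assumption; the description does not specify the weights.

```python
from typing import Optional, Sequence


def target_semantic_accuracy(
    accuracies: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Return the weighted sum of per-category semantic accuracies."""
    if not accuracies:
        raise ValueError("at least one category accuracy is required")
    if weights is None:
        # Assumption: equal weights, i.e. the mean over categories.
        weights = [1.0 / len(accuracies)] * len(accuracies)
    if len(weights) != len(accuracies):
        raise ValueError("weights and accuracies must have the same length")
    return sum(w * a for w, a in zip(weights, accuracies))


# Two hypothetical categories with per-category accuracies 0.5 and 1.0.
accuracy = target_semantic_accuracy([0.5, 1.0], [0.5, 0.5])
```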
  • the determining unit 701 is further configured to determine the fourth calculation model parameter, where the fourth calculation model represents the correspondence between the first distortion value and the target semantic accuracy; and obtain the first distortion value according to the target semantic accuracy and the fourth calculation model.
  • the determining unit 701 is further configured to, in the fourth calculation model, set the first distortion value equal to a weighted logarithm of the target semantic accuracy; wherein the fourth calculation model parameters include a base parameter indicating the base of the logarithm and a fourth weighting coefficient indicating the weighting.
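The fourth calculation model can be sketched as a weighted logarithm. The negative default weight is an assumption, chosen so that the distortion falls to zero as the accuracy approaches 1; the function name and default values are hypothetical.

```python
import math


def semantic_distortion(accuracy: float, weight: float = -1.0, base: float = math.e) -> float:
    """Evaluate D_1 = weight * log_base(accuracy).

    Assumption: accuracy lies in (0, 1] and weight is negative, so that a
    lower semantic accuracy yields a larger distortion value.
    """
    if not 0.0 < accuracy <= 1.0:
        raise ValueError("accuracy must lie in (0, 1]")
    return weight * math.log(accuracy, base)


# Hypothetical target semantic accuracy of 0.5 with a base-2 logarithm.
d1 = semantic_distortion(0.5, weight=-1.0, base=2.0)
```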
  • the determining unit 701 is further configured to set the fourth calculation model parameter to a preset value.
  • the determining unit 701 is further configured to determine a fifth calculation model parameter, where the fifth calculation model represents a third relationship function between the first distortion value and the mean square error; determine the target mean square error of the coding unit in the to-be-encoded video; and determine the first distortion value according to the target mean square error and the fifth calculation model.
  • the determining unit 701 is further configured to, in the fifth calculation model, set the first distortion value equal to the product of the target mean square error and the first parameter factor, plus the second parameter factor; wherein the fifth calculation model parameters include the first parameter factor and the second parameter factor.
  • the determining unit 701 is further configured to set the parameter of the fifth calculation model to a preset value.
  • the determining unit 701 is further configured to, based on the test video, use the first distortion metric criterion to determine a third relationship function between the first distortion value and the mean square error of the test video; and determining the fifth calculation model parameter according to the third relational function.
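One way to obtain the fifth calculation model parameters from test-video measurements is an ordinary least-squares line fit, sketched below. Least squares is an assumption (the description does not name a fitting method), and the function names and sample values are hypothetical.

```python
from typing import Sequence, Tuple


def fit_linear_model(
    mse_values: Sequence[float],
    distortion_values: Sequence[float],
) -> Tuple[float, float]:
    """Fit D_1 = p1 * MSE + p2 by ordinary least squares."""
    n = len(mse_values)
    if n < 2 or n != len(distortion_values):
        raise ValueError("need at least two paired samples")
    mean_x = sum(mse_values) / n
    mean_y = sum(distortion_values) / n
    sxx = sum((x - mean_x) ** 2 for x in mse_values)
    if sxx == 0:
        raise ValueError("MSE samples must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mse_values, distortion_values))
    p1 = sxy / sxx
    p2 = mean_y - p1 * mean_x
    return p1, p2


def semantic_distortion_from_mse(mse: float, p1: float, p2: float) -> float:
    """Evaluate the fifth calculation model for a coding unit's MSE."""
    return p1 * mse + p2


# Hypothetical (MSE, semantic distortion) pairs measured on a test video.
p1, p2 = fit_linear_model([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
d1 = semantic_distortion_from_mse(4.0, p1, p2)
```

Because the model is linear in the mean square error, the semantic distortion of a coding unit can be estimated during encoding without running semantic segmentation on every candidate.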
  • the determining unit 701 is further configured to determine a reconstruction value of a coding unit in the video, wherein the coding unit includes at least one of the following: an image, a slice, a sub-image, a tile, and a coding block;
  • the calculation unit 702 is further configured to, based on the numerical error criterion, determine the second distortion value according to the reconstructed value and the original value of the coding unit; wherein the numerical error criterion is one of the following: the sum of absolute errors criterion, the mean absolute error criterion, the sum of squared errors criterion, or the mean squared error criterion.
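The four numerical error criteria can be sketched on flat sample lists as below. The function name and the string keys ("sad", "mae", "sse", "mse") are hypothetical labels for this illustration.

```python
from typing import Sequence


def numerical_distortion(
    original: Sequence[float],
    reconstructed: Sequence[float],
    criterion: str = "sse",
) -> float:
    """Compute the second distortion value under one numerical error criterion."""
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [o - r for o, r in zip(original, reconstructed)]
    n = len(diffs)
    sad = sum(abs(d) for d in diffs)   # sum of absolute errors
    sse = sum(d * d for d in diffs)    # sum of squared errors
    table = {"sad": sad, "mae": sad / n, "sse": sse, "mse": sse / n}
    if criterion not in table:
        raise ValueError("unknown criterion: " + criterion)
    return float(table[criterion])


# Hypothetical original and reconstructed sample values of one coding block.
orig_block = [10, 20, 30, 40]
recon_block = [12, 18, 30, 44]
d2 = numerical_distortion(orig_block, recon_block, "sse")
```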
  • the determining unit 701 is further configured to determine a first preset parameter; wherein the first preset parameter is used to control the weight values corresponding to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • the calculation unit 702 is further configured to perform weighted calculation on the first Lagrangian multiplier and the second Lagrangian multiplier by using the first preset parameter to obtain the target Lagrangian multiplier.
  • the encoder 70 may further include a configuration unit 704 configured to set the first preset parameter according to the configuration information of the encoder.
  • the configuration unit 704 is further configured to, when the configuration information of the encoder indicates that the first preset parameter is equal to k, set the target Lagrangian multiplier equal to the weighted sum of the first Lagrangian multiplier and the second Lagrangian multiplier, where k is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first Lagrangian multiplier is set equal to 1-k, and the weighting coefficient of the second Lagrangian multiplier is set equal to k.
  • the value of k is equal to 0.75.
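The weighted combination of the two multipliers can be sketched directly from the stated coefficients (1-k for the first multiplier, k for the second). The function name and the input values are hypothetical.

```python
def target_lagrangian_multiplier(lambda_1: float, lambda_2: float, k: float = 0.75) -> float:
    """Return (1 - k) * lambda_1 + k * lambda_2 for k in [0, 1]."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    return (1.0 - k) * lambda_1 + k * lambda_2


# Hypothetical multiplier values; k = 0.75 is the value given in the text.
target_lambda = target_lagrangian_multiplier(lambda_1=8.0, lambda_2=4.0)
```

With k = 1 the target multiplier reduces to the second Lagrangian multiplier alone, and with k = 0 it reduces to the first (semantic) multiplier alone.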
  • the determining unit 701 is further configured to determine a second preset parameter; wherein the second preset parameter is used to control the weight value corresponding to the first distortion value and the second distortion value;
  • the calculation unit 702 is further configured to perform weighted calculation on the first distortion value and the second distortion value by using the second preset parameter to obtain the target distortion value.
  • the configuration unit 704 is further configured to set the second preset parameter according to the configuration information of the encoder.
  • the configuration unit 704 is further configured to, when the configuration information of the encoder indicates that the second preset parameter is equal to m, set the target distortion value equal to the weighted sum of the first distortion value and the second distortion value, where m is any value greater than or equal to 0 and less than or equal to 1, the weighting coefficient of the first distortion value is set equal to 1-m, and the weighting coefficient of the second distortion value is set equal to m.
  • the value of m is equal to 0.75.
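The target distortion value follows the same pattern as the target multiplier, with coefficient 1-m on the first (semantic) distortion value and m on the second (numerical) distortion value. The function name and input values are hypothetical.

```python
def target_distortion(d1: float, d2: float, m: float = 0.75) -> float:
    """Return (1 - m) * d1 + m * d2 for m in [0, 1]."""
    if not 0.0 <= m <= 1.0:
        raise ValueError("m must lie in [0, 1]")
    return (1.0 - m) * d1 + m * d2


# Hypothetical distortion values; m = 0.75 is the value given in the text.
d_target = target_distortion(d1=2.0, d2=10.0)
```

Setting m = 1 recovers a purely numerical distortion, so the conventional rate-distortion behaviour is a special case of the weighted form.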
  • the determining unit 701 is further configured to construct a rate-distortion cost function based on the target Lagrangian multiplier and the target distortion value; perform precoding processing on the to-be-encoded video using one or more candidate encoding parameters to determine the rate-distortion cost value corresponding to each candidate encoding parameter; and select the minimum rate-distortion cost value from the determined rate-distortion cost values, and determine the candidate encoding parameter corresponding to the minimum rate-distortion cost value as the encoding parameter of the to-be-encoded video.
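The selection step can be sketched as a search over candidates for the minimum cost. The cost form J = D + lambda * R is the conventional rate-distortion cost and is assumed here; `precode` is a hypothetical callback standing in for the precoding pass, and the candidate names and numbers are invented for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple, TypeVar

P = TypeVar("P")


def select_encoding_parameter(
    candidates: Iterable[P],
    precode: Callable[[P], Tuple[float, float]],
    target_lambda: float,
) -> Tuple[P, float]:
    """Return the candidate with the minimum cost J = D + lambda * R.

    `precode` returns (target distortion value, rate in bits) for a candidate.
    """
    best = None
    best_cost = float("inf")
    found = False
    for candidate in candidates:
        distortion, rate = precode(candidate)
        cost = distortion + target_lambda * rate
        if cost < best_cost:
            best, best_cost, found = candidate, cost, True
    if not found:
        raise ValueError("no candidate encoding parameters were supplied")
    return best, best_cost


# Hypothetical precoding results: candidate -> (distortion, rate).
trial: Dict[str, Tuple[float, float]] = {
    "split": (10.0, 8.0),
    "no_split": (14.0, 3.0),
}
choice, cost = select_encoding_parameter(trial, lambda c: trial[c], target_lambda=1.0)
```

With a smaller multiplier (for example 0.5 in this toy table) the lower-distortion candidate wins instead, which illustrates how the target Lagrangian multiplier steers the trade-off between rate and distortion.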
  • the encoding parameters include at least a parameter indicating how the video to be encoded is divided and a parameter for constructing a predicted value of a coding block in the video to be encoded.
  • the encoder 70 may further include a writing unit 705 configured to write the encoding parameters into the code stream.
  • a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, etc.; it may also be a module, or it may be non-modular.
  • each component in this embodiment may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of software function modules.
  • If the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium. The technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
  • the aforementioned storage medium includes: U disk, mobile hard disk, read only memory (Read Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk and other media that can store program codes.
  • an embodiment of the present application provides a computer storage medium, which is applied to the encoder 70, where the computer storage medium stores a computer program, and the computer program, when executed by at least one processor, implements the steps of the method in any one of the foregoing embodiments.
  • FIG. 8 shows a specific hardware structure example of the encoder 70 provided by the embodiment of the present application, which may include: a communication interface 801, a memory 802, and a processor 803; the components are coupled together through a bus system 804.
  • the bus system 804 is used to implement connection communication between these components.
  • the bus system 804 also includes a power bus, a control bus, and a status signal bus.
  • the various buses are labeled as bus system 804 in FIG. 8.
  • the communication interface 801 is used for receiving and sending signals in the process of sending and receiving information with other external network elements;
  • a memory 802 for storing computer programs that can be executed on the processor 803;
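  • the processor 803 is configured to, when running the computer program, execute:
  • determining pre-parameters of the video to be encoded, and determining a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters;
  • determining a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier;
  • determining a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion;
  • determining a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion;
  • determining a target distortion value according to the first distortion value and the second distortion value;
  • using the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the video to be encoded, and encoding the video to be encoded.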
  • the memory 802 in this embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or a flash memory.
  • Volatile memory may be Random Access Memory (RAM), which acts as an external cache.
  • Many forms of RAM are available, for example static RAM (Static RAM, SRAM), dynamic RAM (Dynamic RAM, DRAM), synchronous DRAM (Synchronous DRAM, SDRAM), double data rate SDRAM (Double Data Rate SDRAM, DDRSDRAM), enhanced SDRAM (Enhanced SDRAM, ESDRAM), synchronous link DRAM (Synchronous link DRAM, SLDRAM), and direct Rambus RAM (Direct Rambus RAM, DRRAM).
  • the processor 803 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above-mentioned method can be completed by an integrated logic circuit of hardware in the processor 803 or an instruction in the form of software.
  • the above-mentioned processor 803 can be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • the storage medium is located in the memory 802, and the processor 803 reads the information in the memory 802, and completes the steps of the above method in combination with its hardware.
  • the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof.
  • the processing unit can be implemented in one or more application specific integrated circuits (Application Specific Integrated Circuit, ASIC), digital signal processors (Digital Signal Processor, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described herein, or a combination thereof.
  • the techniques described herein may be implemented through modules (e.g., procedures, functions, etc.) that perform the functions described herein.
  • Software codes may be stored in memory and executed by a processor.
  • the memory can be implemented in the processor or external to the processor.
  • the processor 803 is further configured to execute the steps of the method in any one of the foregoing embodiments when running the computer program.
  • This embodiment provides an encoder, which includes a determining unit, a calculation unit, and an encoding unit; wherein the determining unit is configured to determine pre-parameters of a video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters; the calculation unit is configured to determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; the determining unit is further configured to determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion, and determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; the calculation unit is further configured to determine a target distortion value according to the first distortion value and the second distortion value; and the encoding unit is configured to use the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the to-be-encoded video, and to encode the to-be-encoded video.
  • In this way, rate-distortion optimization in video coding jointly considers the first distortion metric, based on the semantic distortion metric, and the second distortion metric, based on the numerical error metric. This is well suited to application scenarios oriented to machine vision and human-machine vision: at a given bit rate, it can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity performance, thereby also improving coding efficiency.
  • FIG. 9 shows a schematic structural diagram of a video system provided by an embodiment of the present application.
  • the video system 90 may include an encoder 901 and a decoder 902 .
  • the encoder 901 may be the encoder 70 described in any one of the foregoing embodiments.
  • the encoder 901 is configured to determine pre-parameters of the video to be encoded, and determine a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameters; determine a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; determine a first distortion value according to a first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; determine a second distortion value according to a second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; determine a target distortion value according to the first distortion value and the second distortion value; and use the target Lagrangian multiplier and the target distortion value to determine the encoding parameters of the video to be encoded, encode the video to be encoded to generate a code stream, and transmit the code stream to the decoder;
  • the decoder 902 is configured to parse the code stream to obtain a decoded video.
  • the decoder 902 is further configured to parse the code stream, obtain decoding parameters, and obtain the decoded video according to the decoding parameters; wherein the decoding parameters include at least a parameter indicating the division mode of the video to be decoded and a parameter for constructing the predicted values of the decoded blocks in the video to be decoded.
  • the video system 90 jointly considers the first distortion metric, based on the semantic distortion metric, and the second distortion metric, based on the numerical error metric, to perform rate-distortion optimization in video coding. This is well suited to application scenarios oriented to machine vision and human-machine vision: at a given bit rate, it can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity performance, thereby improving coding efficiency.
  • In the embodiments of the present application, the pre-parameters of the video to be encoded are determined, and the first Lagrangian multiplier and the second Lagrangian multiplier are determined according to the pre-parameters; the target Lagrangian multiplier is determined according to the first Lagrangian multiplier and the second Lagrangian multiplier; the first distortion value is determined according to the first distortion metric criterion, wherein the first distortion metric criterion includes a semantic distortion metric criterion; the second distortion value is determined according to the second distortion metric criterion, wherein the second distortion metric criterion includes a numerical error metric criterion; the target distortion value is determined according to the first distortion value and the second distortion value; and the target Lagrangian multiplier and the target distortion value are used to determine the encoding parameters of the video to be encoded, and the video to be encoded is encoded.
  • In this way, rate-distortion optimization in video coding jointly considers the first distortion metric, based on the semantic distortion metric, and the second distortion metric, based on the numerical error metric. This is well suited to application scenarios oriented to machine vision and human-machine vision: at a given bit rate, it can improve the semantic segmentation accuracy of the reconstructed video while maintaining good fidelity performance, thereby also improving coding efficiency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Disclosed are a video encoding method and system, an encoder, and a computer storage medium. The method comprises: determining a pre-parameter of a video to be encoded, and determining a first Lagrangian multiplier and a second Lagrangian multiplier according to the pre-parameter; determining a target Lagrangian multiplier according to the first Lagrangian multiplier and the second Lagrangian multiplier; determining a first distortion value according to a first distortion metric criterion, the first distortion metric criterion comprising a semantic distortion metric criterion; determining a second distortion value according to a second distortion metric criterion, the second distortion metric criterion comprising a numerical error metric criterion; determining a target distortion value according to the first distortion value and the second distortion value; and using the target Lagrangian multiplier and the target distortion value to determine an encoding parameter of the video, and encoding the video.
PCT/CN2020/106416 2020-07-31 2020-07-31 Procédé et système de codage vidéo, codeur et support de stockage informatique WO2022021422A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/106416 WO2022021422A1 (fr) 2020-07-31 2020-07-31 Procédé et système de codage vidéo, codeur et support de stockage informatique
CN202080099999.3A CN115428451A (zh) 2020-07-31 2020-07-31 视频编码方法、编码器、系统以及计算机存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/106416 WO2022021422A1 (fr) 2020-07-31 2020-07-31 Procédé et système de codage vidéo, codeur et support de stockage informatique

Publications (1)

Publication Number Publication Date
WO2022021422A1 true WO2022021422A1 (fr) 2022-02-03

Family

ID=80037374

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/106416 WO2022021422A1 (fr) 2020-07-31 2020-07-31 Procédé et système de codage vidéo, codeur et support de stockage informatique

Country Status (2)

Country Link
CN (1) CN115428451A (fr)
WO (1) WO2022021422A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114786010A (zh) * 2022-03-07 2022-07-22 杭州未名信科科技有限公司 率失真优化量化方法、装置、存储介质及电子设备
CN116723330A (zh) * 2023-03-28 2023-09-08 成都师范学院 一种自适应球域失真传播链长度的全景视频编码方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102780884A (zh) * 2012-07-23 2012-11-14 深圳广晟信源技术有限公司 一种率失真优化方法
US20170188027A1 (en) * 2012-09-24 2017-06-29 Intel Corporation Histogram Segmentation Based Local Adaptive Filter for Video Encoding and Decoding
CN107205151A (zh) * 2017-06-26 2017-09-26 中国科学技术大学 基于混合失真度量准则的编解码装置及方法
CN109190752A (zh) * 2018-07-27 2019-01-11 国家新闻出版广电总局广播科学研究院 基于深度学习的全局特征和局部特征的图像语义分割方法
CN109982082A (zh) * 2019-05-05 2019-07-05 山东大学深圳研究院 一种基于局部纹理特性的hevc多失真准则率失真优化方法

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7042943B2 (en) * 2002-11-08 2006-05-09 Apple Computer, Inc. Method and apparatus for control of rate-distortion tradeoff by mode selection in video encoders
US8879623B2 (en) * 2009-09-02 2014-11-04 Sony Computer Entertainment Inc. Picture-level rate control for video encoding a scene-change I picture
CA2839345A1 (fr) * 2011-06-14 2012-12-20 Zhou Wang Procede et systeme d'optimisation debit-distorsion basee sur la similarite structurale pour le codage video perceptuel
FR3029055B1 (fr) * 2014-11-24 2017-01-13 Ateme Procede d'encodage d'image et equipement pour la mise en oeuvre du procede
CN108900838B (zh) * 2018-06-08 2021-10-15 宁波大学 一种基于hdr-vdp-2失真准则的率失真优化方法
WO2020107288A1 (fr) * 2018-11-28 2020-06-04 Oppo广东移动通信有限公司 Procédé et appareil d'optimisation de codage vidéo et support d'informations d'ordinateur
CN110324618A (zh) * 2019-07-03 2019-10-11 上海电力学院 基于vmaf准则的提高视频质量的优化编码方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG QUN; YUAN HUI; HUO JUNYAN; LI PENG: "A Fidelity-Assured Rate Distortion Optimization Method for Perceptual-Based Video Coding", 2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), IEEE, 22 September 2019 (2019-09-22), pages 4135 - 4139, XP033647416, DOI: 10.1109/ICIP.2019.8803496 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114786010A (zh) * 2022-03-07 2022-07-22 杭州未名信科科技有限公司 率失真优化量化方法、装置、存储介质及电子设备
CN116723330A (zh) * 2023-03-28 2023-09-08 成都师范学院 一种自适应球域失真传播链长度的全景视频编码方法
CN116723330B (zh) * 2023-03-28 2024-02-23 成都师范学院 一种自适应球域失真传播链长度的全景视频编码方法

Also Published As

Publication number Publication date
CN115428451A (zh) 2022-12-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20947371

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20947371

Country of ref document: EP

Kind code of ref document: A1