WO2022027881A1 - Time-domain rate-distortion optimization method based on video sequence features and QP-λ correction - Google Patents

Time-domain rate-distortion optimization method based on video sequence features and QP-λ correction

Info

Publication number
WO2022027881A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
time domain
distortion
coding
video sequence
Prior art date
Application number
PCT/CN2020/132813
Other languages
English (en)
French (fr)
Inventor
朱策
秦晗
王永华
刘翼鹏
刘凯
Original Assignee
电子科技大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 电子科技大学
Priority to US17/460,360 (US11418795B2)
Publication of WO2022027881A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/19 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding using optimisation based on Lagrange multipliers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock

Definitions

  • The invention belongs to the technical field of video coding and decoding, and in particular relates to a time-domain rate-distortion optimization method based on video sequence features and QP-λ correction.
  • Rate Distortion Theory is the theoretical basis of video coding. Rate-distortion theory provides the limit value of source compression under the condition of given distortion, or the minimum distortion that can be achieved under the condition of given code rate.
  • The rate-distortion optimization problem is to minimize the distortion of video coding under a limit on bit consumption. It is essentially a constrained optimization problem, expressed as: min{D} s.t. R ≤ R_max (1.1)
  • D represents the distortion
  • R represents the encoding bit rate
  • R_max represents the maximum encoding bit rate
  • the quantization operation is the main factor that produces distortion in video coding. Improving the performance of the quantizer is of great significance to improving the coding compression efficiency.
  • Rate-distortion optimized quantization finds the quantization parameter setting that minimizes distortion under a given bit-rate constraint.
  • To determine the optimal quantization parameter of a coding block, the Lagrangian multiplier λ must first be determined. A large number of experimental results show that λ is closely related to the quantization parameter QP.
  • By default the encoder assumes that the coding units (coding blocks) to be optimized are independent of each other; that is, the rate, distortion, and parameters of each coding unit are unrelated to those of other units.
  • For example, let o_k denote the coding parameters of the k-th unit (quantization step size, coding mode, motion vector, etc.)
  • the corresponding coding bits are R(o_k)
  • the coding distortion is D(o_k)
  • J is called the rate-distortion cost function.
  • Processing each coding unit independently in this way is actually a locally optimal method, because the coding units of real video depend on one another.
  • λ_g is the negative slope at a point on the rate-distortion curve. A larger λ_g corresponds to an operating point with a lower bit rate and higher distortion, and a smaller λ_g corresponds to an operating point with a higher bit rate and lower distortion.
  • At present, the value of λ_g in AV1 (AOMedia Video 1) is mainly determined by the preset quantization parameter QP, independent of the input video sequence.
  • Figure 2 shows the default temporal layer structure of AV1. Since the default GOP size of AV1 is 16, the coding blocks of every frame except those in the highest temporal layer directly or indirectly affect frames in higher temporal layers and subsequent frames.
  • The default coding structure of AV1 is similar to the random-access video coding of HEVC: its coding order is inconsistent with the decoding order, so both forward and backward references need to be considered.
  • A forward reference is one whose reference frame has a playback order greater than the POC of the current frame; otherwise it is a backward reference.
  • POC: Picture Order Count
  • Figure 3 shows the main reference relationships in AV1.
  • rPOC: relative POC, identifying frames that appear at the same position in a GOP
  • GOP: Group of Pictures
  • At high bit rates, the coding distortion of subsequent coding units can be expressed by a formula involving the motion compensation error of the coding unit B_{i+1}.
  • F_i represents the original pixels of the coding unit B_i
  • F_{i+1} represents the original pixels of the coding unit B_{i+1}.
  • Because the earlier algorithm targeted the HEVC video coding standard, it was not adapted to the characteristics of the new-generation encoder AV1 or of the video sequence, and it did not re-correct the QP-λ relationship in AV1. It also did not adjust I-frames, whose effect on subsequent frames is very important.
  • The present invention proposes, for the new-generation encoder AV1, a temporal rate-distortion optimization method based on video sequence characteristics and QP-λ correction.
  • Building on the temporal dependencies previously established under the HEVC-RA coding structure, together with AV1 characteristics and video sequence characteristics, the method constructs a temporal distortion propagation chain, estimates the aggregate distortion of the current coding unit and the affected future coding units, and calculates the propagation factor of the coding unit in the temporal distortion propagation model.
  • The more accurate propagation factor is then used to adjust the Lagrangian multiplier, realizing temporally dependent rate-distortion optimization. At the same time, the QP-λ relationship is re-corrected and the I-frame is adjusted to obtain a better encoding effect.
  • Define the Lagrangian multiplier in the relational model as λ_new and the quantization step size as Qstep.
  • Using the encoder's built-in table that maps the quantization parameter QP to Qstep, statistics are collected on the Lagrangian multipliers of different sequences at different QPs and their corresponding Qstep.
  • From these statistics the relational model between the Lagrangian multiplier λ_new and the quantization step size Qstep is constructed; the resulting model is expressed by formulas (1.9)-(1.10);
  • λ_org is the Lagrangian multiplier in the encoder
  • SAD_i is the i-th sum of absolute differences
  • p_0 represents the pixel value of the initial frame
  • p_{10*i} represents the pixel value of the subsequent frames 10, 20, 30, ...
  • the total number of frames in the video sequence is represented by F
  • the width is represented by W
  • the height is represented by H.
  • constrainQPrange represents the maximum adjustable range of QP.
  • Formula (1.13) is the adjustment range of λ
  • λ_org represents the preset λ of AV1.
  • QP_0 represents the QP of the I-frame (0~255)
  • α represents the coefficient of formula (1.8)
  • the clip3() function limits the result of 1-0.0006*(0.8*E-20) to between 0.90 and 0.98.
  • Step S4: according to the relational model obtained in step S2, define λ_new as the Lagrangian multiplier calculated by the relational model, and calculate the difference between the encoder's Lagrangian multiplier λ_org and λ_new. Depending on the difference, the Lagrangian multiplier λ_org in the encoder is corrected using the relational model formulas (1.9)-(1.10).
  • The solution of the present invention differs from previous methods in that different coding strategies are adopted for different sequences, the QP-λ relationship in AV1 is re-corrected, and adjustments are made according to the characteristics of AV1, including adjusting ALT frames and adjusting the I-frames of sequences that meet the threshold.
  • Figure 1 is the rate-distortion curve
  • Figure 2 is the default coding structure of AV1
  • Figure 3 shows the main time domain dependencies in AV1
  • FIG. 4 is a schematic diagram of the structure of the time-domain distortion propagation chain
  • Figure 5 is the rate-distortion curve of the BasketballDrill sequence
  • To simplify the implementation of the global rate-distortion algorithm, the global Lagrangian multiplier λ_g can be modified directly in AV1 through the propagation factor κ_i. Since the subsequent coding unit has not actually been encoded when the propagation factor κ_i is derived, the distortion of the subsequent coding unit needs to be estimated.
  • At high bit rates, the coding distortion of subsequent coding units can be expressed by a formula. Since the coding unit B_{i+1} has not been coded, R_{i+1} cannot be obtained and D_{i+1} cannot be calculated with that formula, but the coding distortion of B_{i+1} under the quantization step size Q_step can still be expressed in closed form.
  • An F(θ) curve can be fitted from experiments over a large number of quantization step sizes and coding units.
  • The F(θ) curve of the previous algorithm was fitted for the HEVC encoder and is no longer applicable to AV1. The experiments are repeated on AV1 to obtain a new curve, and points on the curve are sampled to build a look-up table between F(θ) and θ, which is used to estimate the distortion of the coding block.
  • The present invention replaces the α previously set to a fixed value with an α that adapts to the video sequence.
  • the main steps of the present invention include:
  • Step 1: According to the main temporal dependencies in the AV1 default coding structure, establish a temporal propagation chain (as shown in Figure 4), find the matching blocks affected by each original coding block through forward motion search, and record the corresponding OMCP and motion vector;
  • Step 2: Define the Lagrangian multiplier in the relational model as λ_new and the quantization step size as Qstep.
  • Using the encoder's built-in table that maps QP to Qstep, statistics are collected on the Lagrangian multipliers of different sequences at different QPs and their corresponding Qstep, and the relational model between the Lagrangian multiplier λ_new and the quantization step size Qstep is constructed; the resulting model is expressed by formulas (1.9)-(1.10);
  • Step 3: Briefly classify the original video sequence using a method similar to frame differencing: calculate the sum of absolute differences between each of the subsequent frames 10, 20, 30, ... and the initial first frame, and finally obtain the pixel-level average of the accumulated sum.
  • Based on the result, different QP and λ adjustment ranges are set according to thresholds, together with the corresponding α and I-frame QP, as given by formulas (1.11)-(1.16).
  • Step 4: Before actually encoding the current frame, use the original motion compensation error and motion vector obtained in S1 to calculate the propagation factor of each 16×16 coding block of the current frame, and use the harmonic mean to obtain the average propagation factor of each Superblock. Since AV1 uses two-pass encoding by default, the built-in AV1 variable pcnt_neutral is used to identify screen content sequences, and the Lagrangian multipliers of different video sequences are adjusted accordingly, using the adjustment range obtained in S2.
  • Step 5: According to the relational model obtained in step S2, define λ_new as the Lagrangian multiplier calculated by the relational model, and calculate the difference between the encoder's Lagrangian multiplier λ_org and λ_new.
  • Depending on the difference, the Lagrangian multiplier λ_org in the encoder is corrected using the relational model formulas (1.9)-(1.10).
  • Step 6: In AV1 the frame with rPOC 16 is encoded as a special ALT frame and lies in temporal layer TL1, similar to a key frame in HEVC, so its distortion affects the distortion of many subsequent frames. Therefore, on top of the AV1 encoder's block-level Lagrangian multiplier adjustment within the ALT frame, scaling and QP-λ correction are applied to improve the encoding effect.
  • When the temporal propagation chain is built, motion search is performed with 16×16 blocks and the propagation factor of each block is calculated. AV1 partitions and encodes video sequences with a resolution of 720P or higher using independent 128×128 SuperBlocks, and sequences below 720P using 64×64 SuperBlocks. The propagation factors of all 16×16 blocks in a SuperBlock are therefore averaged to give the propagation factor of the SuperBlock, which is used to adjust the SuperBlock-level Lagrangian multiplier and QP. For I-frames, only some sequences are adjusted, according to the computed threshold.
  • The present invention uses the AV1 reference software libaom-1.0 as the experimental platform. The experimental environment follows the Common Test Conditions (CTC) specified by JVET, and the experiments are carried out only under the default coding structure of AV1. The test set consists of 20 video sequences from Classes B, C, D, E and F, and each test sequence is encoded at four QP points (32, 43, 53, 63).
  • CTC: Common Test Conditions
  • The coding results are shown in Table 1: under the AV1 default coding structure, the test sequences achieve a coding gain of 1.66% on the Y component.
  • The invention improves performance noticeably for most test sequences, especially Class E, where a bit-rate saving of 5.03% is achieved on the Y component.
  • Class E consists of video sequences with relatively fixed scenes; the frames are highly similar and the temporal dependence is strong, so the present invention achieves better results on such sequences.
  • The BasketballDrill sequence achieves a bit-rate saving of 6.21% on the Y component, also because of its relatively static background.
  • Figure 5 shows the rate-distortion curve of the BasketballDrill sequence.
  • The blue curve is the rate-distortion curve of the global rate-distortion optimization algorithm
  • The red curve is the rate-distortion curve of the original libaom-1.0. For sequences with strong temporal dependence, the algorithm improves coding efficiency noticeably.
  • The coding complexity of this temporal rate-distortion optimization algorithm under the AV1 default coding structure is reduced by 6% on average, mainly because the adaptive Lagrangian multipliers calculated by the algorithm give the coding units better prediction. Although establishing the temporal propagation chain takes some time, higher-quality prediction makes the coding residual smaller, which speeds up transform, quantization and entropy coding and reduces the overall time.
  • Table 2 The encoding time percentage of the present invention compared to libaom-1.0

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

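The invention belongs to the technical field of video encoding and decoding, and specifically relates to a time-domain rate-distortion optimization method based on video sequence features and QP-λ correction. For the new-generation encoder AV1, the invention proposes a time-domain rate-distortion optimization method based on video sequence features and QP-λ correction. Building on the temporal dependencies previously established under the HEVC-RA coding structure, together with the characteristics of AV1 and of the video sequence, it constructs a temporal distortion propagation chain, estimates the aggregate distortion of the current coding unit and of the affected future coding units, and calculates the propagation factor of each coding unit in the temporal distortion propagation model. The more accurate propagation factor is then used to adjust the Lagrangian multiplier, realizing temporally dependent rate-distortion optimization. At the same time, the QP-λ relationship is re-corrected and the I-frame is adjusted to obtain a better coding result.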

Description

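Time-domain rate-distortion optimization method based on video sequence features and QP-λ correction
Technical Field
The invention belongs to the technical field of video encoding and decoding, and specifically relates to a time-domain rate-distortion optimization method based on video sequence features and QP-λ correction.
Background
Rate-distortion theory is the theoretical basis of video coding. It gives the limit of source compression under a given distortion, or the minimum distortion achievable at a given bit rate. The rate-distortion optimization problem is to minimize the distortion of video coding under a limit on bit consumption. It is essentially a constrained optimization problem, expressed mathematically as:
min{D} s.t. R ≤ R_max          (1.1)
where D is the distortion, R is the coding bit rate, and R_max is the maximum coding bit rate.
In practice, however, constrained optimization problems are relatively hard to solve. (Sullivan, et al., 1998) proposed using the method of Lagrange multipliers to convert the constrained optimization problem into an unconstrained one. Fundamentally, the method in (Sullivan, et al., 1998) uses the Lagrangian multiplier to convert bit consumption into an equivalent distortion, which turns the constrained problem into an unconstrained one. The rate-distortion optimization method proposed in (Sullivan, et al., 1998) is also the most commonly used optimization method in video coding.
Quantization is the main source of distortion in video coding, so improving quantizer performance is important for improving compression efficiency. Rate-distortion optimized quantization finds the quantization parameter setting that minimizes distortion under a given bit-rate constraint. To determine the optimal quantization parameter of a coding block, the Lagrangian multiplier λ must first be determined. A large number of experimental results show that λ is closely related to the quantization parameter QP. As video coding has developed, however, more complex coding structures such as the Hierarchical-B structure, as well as different encoders, affect the relationship between λ and QP found in (Sullivan, et al., 1998). Adopting an adjustment scheme for the λ-QP relationship that suits each encoder, and assigning the optimal QP to each specific value of λ, improves coding efficiency.
When rate-distortion optimization is used, the encoder assumes by default that the coding units (coding blocks) to be optimized are mutually independent; that is, the rate, distortion, and parameters of each coding unit are unrelated to those of other units. For example, if the coding parameters of the k-th unit (such as quantization step size, coding mode, and motion vector) are o_k, with corresponding coding bits R(o_k) and coding distortion D(o_k), then by introducing the global Lagrangian multiplier λ_g the rate-distortion optimization problem above is converted into the unconstrained problem of the following formula, where J is called the rate-distortion cost function.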
Figure PCTCN2020132813-appb-000001
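In fact, the optimal solution of the above formula can be regarded as the parameter o_k at which the cost function J_k(λ) = D_k(o_k) + λ_g R_k(o_k) attains its minimum; the cost function represents a straight line with slope λ in the R-D plane of Figure 1. Processing each coding unit independently in this way is really only a locally optimal method, because the coding units of real video depend on one another. Differentiating the formula gives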
Figure PCTCN2020132813-appb-000002
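It follows that λ_g is the negative slope at a point on the rate-distortion curve. A larger λ_g corresponds to an operating point with a lower bit rate and higher distortion, while a smaller λ_g corresponds to an operating point with a higher bit rate and lower distortion. λ_g is the most important factor determining rate-distortion performance, so its selection is critical. At present, the value of λ_g in AV1 (AOMedia Video 1) is mainly determined by the preset quantization parameter QP and is independent of the input video sequence.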
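The Lagrangian cost described above can be illustrated with a minimal sketch. It assumes a small set of hypothetical candidate parameter choices, each with made-up distortion and rate values (none of these names or numbers come from the patent or from libaom), and simply picks the candidate that minimizes J = D + λ·R:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical coding-parameter choice o_k with its distortion and rate."""
    name: str
    distortion: float
    rate: float


def rd_cost(candidate, lam):
    """Rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate


def best_candidate(candidates, lam):
    """Return the candidate with the smallest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Illustrative values only: finer quantization means less distortion, more bits.
candidates = [
    Candidate("fine", distortion=10.0, rate=100.0),
    Candidate("medium", distortion=30.0, rate=40.0),
    Candidate("coarse", distortion=80.0, rate=10.0),
]

for lam in (0.1, 1.0, 5.0):
    print(lam, best_candidate(candidates, lam).name)
```

With these illustrative numbers, a small λ selects the low-distortion, high-rate candidate and a large λ selects the low-rate, high-distortion candidate, which is the trade-off the paragraph describes.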
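However, because intra/inter prediction introduces dependencies between coding units, applying independent rate-distortion optimization to each coding unit cannot achieve optimal coding performance. A global rate-distortion optimization method of acceptable complexity is therefore needed to further improve coding efficiency.
The paper "Source Distortion Temporal Propagation Analysis for Random-Access Hierarchical Video Coding Optimization" studies a temporal rate-distortion optimization algorithm for the RA coding structure. Based on the temporal dependencies under the RA configuration, it builds a temporal distortion propagation chain over multiple reference frames, estimates the extent of distortion propagation, calculates propagation factors, and then adjusts the global Lagrangian multiplier according to the aggregated propagation factor, realizing temporal rate-distortion optimization and addressing the global rate-distortion optimization problem.
Figure 2 shows AV1's default temporal layer structure. Because the default GOP size in AV1 is 16, the coding blocks of every frame except those in the highest temporal layer affect, directly or indirectly, frames in higher temporal layers as well as subsequent frames. (The default AV1 coding structure is similar to HEVC random-access video coding: its coding order is inconsistent with the decoding order, so both forward and backward references must be considered. A forward reference is one whose reference frame has a playback order greater than the POC (Picture Order Count) of the current frame; otherwise it is a backward reference.) Figure 3 shows the main reference relationships in AV1.
Under the default AV1 coding structure, for frames in the highest temporal layer (TL5), i.e. frames with rPOC 1, 3, 5, ..., 15 (rPOC, relative POC, identifies frames at the same position within a GOP (Group of Pictures)), the optimization problem is:
Figure PCTCN2020132813-appb-000003
For frames in temporal layer TL4, i.e. frames with rPOC 2, 6, 10, 14, the optimization problem is:
Figure PCTCN2020132813-appb-000004
For frames in temporal layer TL3, i.e. frames with rPOC 4, 12, the optimization problem is:
Figure PCTCN2020132813-appb-000005
For frames in temporal layer TL2, i.e. the frame with rPOC 8, the optimization problem is:
Figure PCTCN2020132813-appb-000006
For frames in temporal layer TL1, i.e. frames with rPOC 0, 16, the optimization problem is: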
Figure PCTCN2020132813-appb-000007
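The mapping from rPOC to temporal layer listed above can be sketched as follows. The function name and the dyadic rule it uses are illustrative assumptions that merely reproduce the listed TL1 to TL5 assignments for a GOP of 16; this is not libaom code.

```python
def temporal_layer(rpoc, gop_size=16):
    """Return the temporal layer (1 = lowest) of a frame from its relative POC.

    Assumes a dyadic hierarchy and a power-of-two GOP size: the GOP boundary
    frames are TL1, the midpoint is TL2, the quarter points are TL3, and so on.
    """
    if not 0 <= rpoc <= gop_size:
        raise ValueError("rPOC must lie within the GOP")
    if rpoc in (0, gop_size):
        return 1
    layer = 2
    step = gop_size // 2
    # Halve the step until the frame position falls on the current grid.
    while rpoc % step != 0:
        step //= 2
        layer += 1
    return layer


layers = {rpoc: temporal_layer(rpoc) for rpoc in range(17)}
print(layers)
```

Grouping the resulting dictionary by layer gives back the rPOC sets named in the text, for example 2, 6, 10, 14 for TL4.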
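At high bit rates, the coding distortion of subsequent coding units can be expressed by the formula:
Figure PCTCN2020132813-appb-000008
where the motion compensation error of coding unit B_{i+1} is:
Figure PCTCN2020132813-appb-000009
F_i denotes the original pixels of coding unit B_i,
Figure PCTCN2020132813-appb-000010
denotes the reconstructed pixels of coding unit B_i, and F_{i+1} denotes the original pixels of coding unit B_{i+1}.
Because that algorithm targeted the HEVC video coding standard, it was not adapted to the characteristics of the new-generation encoder AV1 or to video sequence features, and it did not re-correct the QP-λ relationship in AV1. It also did not adjust I-frames, even though I-frames strongly affect subsequent frames.
Summary of the Invention
To address the above problems, the invention proposes, for the new-generation encoder AV1, a time-domain rate-distortion optimization method based on video sequence features and QP-λ correction. Building on the temporal dependencies previously established under the HEVC-RA coding structure, together with the characteristics of AV1 and of the video sequence, it constructs a temporal distortion propagation chain, estimates the aggregate distortion of the current coding unit and of the affected future coding units, and calculates the propagation factor of each coding unit in the temporal distortion propagation model. The more accurate propagation factor is then used to adjust the Lagrangian multiplier, realizing temporally dependent rate-distortion optimization. At the same time, the QP-λ relationship is re-corrected and the I-frame is adjusted to obtain a better coding result.
The technical solution of the invention is:
The specific steps of the time-domain rate-distortion optimization method based on video sequence features and QP-λ correction are as follows:
S1. According to the main temporal dependencies in the default AV1 coding structure, build a temporal propagation chain (see Figure 4); use forward motion search to find the matching blocks affected by each original coding block, and record the corresponding original motion-compensated prediction error (OMCP, Original Motion Compensation Predicted error) and motion vector;
S2. Define the Lagrangian multiplier in the relational model as λ_new and the quantization step size as Qstep. Using the encoder's built-in table that maps the quantization parameter QP to Qstep, collect statistics on the Lagrangian multipliers of different sequences at different QPs and their corresponding Qstep, and construct a relational model between λ_new and Qstep, expressed by formulas (1.9)-(1.10);
λ_new = 3.667*Qstep^2 - 5.198e-07*Qstep - 0.6664         (1.9)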
Figure PCTCN2020132813-appb-000011
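where λ_org is the Lagrangian multiplier in the encoder.
Briefly classify the original video sequence using an approach similar to frame differencing: compute the sum of absolute differences between each of the subsequent frames 10, 20, 30, ... and the initial first frame, then take the pixel-level average of the accumulated sum, denoted E. Based on the result, set different QP and λ adjustment ranges by threshold, along with the corresponding α and I-frame QP:
SAD_i = ∑|p_0 - p_{10*i}|            (1.11)
Figure PCTCN2020132813-appb-000012
Figure PCTCN2020132813-appb-000013
Figure PCTCN2020132813-appb-000014
α = clip3(0.90, 0.98, 1 - 0.0006*(0.8*E - 20))            (1.15)
Figure PCTCN2020132813-appb-000015
where SAD_i is the i-th sum of absolute differences, p_0 is the pixel value of the initial frame, p_{10*i} is the pixel value of the subsequent frames 10, 20, 30, ..., F is the total number of frames in the video sequence, W is the width, and H is the height. constrainQPrange is the maximum adjustable range of QP. Formula (1.13) gives the adjustment range of λ, and λ_org is the λ preset by AV1. QP_0 is the QP of the I-frame (0~255), α is the coefficient of formula (1.8), and the clip3() function limits the result of 1 - 0.0006*(0.8*E - 20) to the range 0.90~0.98.
S3. Before actually encoding the current frame, use the original motion compensation error and motion vectors obtained in S1 to compute the propagation factor of every 16×16 coding block in the current frame, and take the harmonic mean to obtain the average propagation factor of each Superblock. Because AV1 uses two-pass encoding by default, the built-in AV1 variable pcnt_neutral is used to identify screen content sequences, and, together with the adjustment range obtained in S2, the Lagrangian multipliers of different video sequences are adjusted accordingly.
S4. Based on the relational model obtained in step S2, define λ_new as the Lagrangian multiplier computed by the relational model, and compute the difference between the encoder's Lagrangian multiplier λ_org and λ_new. Depending on the difference, correct the encoder's Lagrangian multiplier λ_org using the relational model formulas (1.9)-(1.10).
S5. In AV1 the frame with rPOC 16 is coded as a special ALT frame, and it lies in temporal layer TL1, similar to a key frame in HEVC, so its distortion affects the distortion of many subsequent frames. Therefore, on top of the AV1 encoder's block-level Lagrangian multiplier adjustment within the ALT frame, scaling and QP-λ correction are additionally applied to improve the coding result.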
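The sequence classification in S2 (formulas (1.11) and (1.15)) can be sketched as below. Only those two formulas are legible in the source; the normalization used for E here (accumulated SAD divided by the number of sampled frames times the pixels per frame) is an assumption, because formula (1.12) survives only as an image. Frames are modeled as flat lists of pixel values and all function names are hypothetical.

```python
def clip3(low, high, value):
    """Clamp value to [low, high], using the argument order of formula (1.15)."""
    return max(low, min(high, value))


def sad(frame_a, frame_b):
    """Sum of absolute pixel differences between two equally sized frames."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b))


def mean_frame_difference(frames, stride=10):
    """Average per-pixel SAD of frames 10, 20, 30, ... against frame 0.

    Assumption: E divides the accumulated SAD by (sampled frames * pixels
    per frame); the source's exact normalization is not shown.
    """
    reference = frames[0]
    sampled = frames[stride::stride]
    if not sampled:
        raise ValueError("sequence too short to sample")
    total = sum(sad(reference, frame) for frame in sampled)
    return total / (len(sampled) * len(reference))


def alpha_from_e(e):
    """Formula (1.15): alpha = clip3(0.90, 0.98, 1 - 0.0006 * (0.8 * E - 20))."""
    return clip3(0.90, 0.98, 1 - 0.0006 * (0.8 * e - 20))


# Toy sequence: 21 flat frames of 4 pixels; every pixel of frame k equals k.
frames = [[k] * 4 for k in range(21)]
e = mean_frame_difference(frames)
alpha = alpha_from_e(e)
print(e, alpha)
```

Because the unclipped expression equals 1.012 - 0.00048*E, nearly static sequences (small E) saturate at the upper bound 0.98, while sequences with large frame differences are pushed toward the lower bound 0.90.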
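The solution of the invention differs from previous methods in that it applies different coding strategies to different sequences, re-corrects the QP-λ relationship in AV1, and makes adjustments according to AV1 characteristics, including adjusting ALT frames and adjusting the I-frames of sequences that meet the threshold.
Beneficial effects of the invention:
By applying different coding strategies to different sequences, accounting for the influence of each coding block in the current frame on subsequent coding, and correcting the Lagrangian multiplier λ in the encoder, an average bit-rate saving of 1.66% on the Y component is obtained with the official AV1 source code libaom-1.0.
Brief Description of the Drawings
Figure 1 is the rate-distortion curve;
Figure 2 is the default AV1 coding structure;
Figure 3 shows the main temporal dependencies in AV1;
Figure 4 is a schematic diagram of the construction of the temporal distortion propagation chain;
Figure 5 is the rate-distortion curve of the BasketballDrill sequence;
Detailed Description
To make the purpose, technical solution, and advantages of the invention clearer, the technical solution is described in detail below with reference to the drawings and an embodiment. The embodiment uses Visual Studio 2015 as the development environment and is implemented on the AV1 reference software libaom-1.0.
To simplify the implementation of the global rate-distortion algorithm, the global Lagrangian multiplier λ_g can be modified directly in AV1 through the propagation factor κ_i. Because the subsequent coding units have not actually been encoded when κ_i is derived, their distortion must be estimated.
At high bit rates, the coding distortion of subsequent coding units can be expressed by the formula:
Figure PCTCN2020132813-appb-000016
Because coding unit B_{i+1} has not been encoded, R_{i+1} is unavailable and D_{i+1} cannot be computed with the formula. However, the coding distortion of B_{i+1} under quantization step size Q_step can be expressed as
Figure PCTCN2020132813-appb-000017
Figure PCTCN2020132813-appb-000018
where
Figure PCTCN2020132813-appb-000019
An F(θ) curve can be fitted from experiments over a large number of quantization step sizes and coding units. The F(θ) curve of the earlier algorithm was fitted for the HEVC encoder and no longer applies to AV1, so the experiments are repeated on AV1 to obtain a new curve. Points on the curve are sampled to build a lookup table between F(θ) and θ, which is then used to estimate the distortion of a coding block. In addition, the invention replaces the previously fixed α with an α that adapts to the video sequence.
The main steps of the invention are:
Step 1: According to the main temporal dependencies in the default AV1 coding structure, build a temporal propagation chain (see Figure 4); use forward motion search to find the matching blocks affected by each original coding block, and record the corresponding OMCP and motion vector;
Step 2: Define the Lagrangian multiplier in the relational model as λ_new and the quantization step size as Qstep. Using the encoder's built-in table that maps the quantization parameter QP to Qstep, collect statistics on the Lagrangian multipliers of different sequences at different QPs and their corresponding Qstep, and construct a relational model between λ_new and Qstep, expressed by formulas (1.9)-(1.10);
Step 3: Briefly classify the original video sequence using an approach similar to frame differencing: compute the sum of absolute differences between each of the subsequent frames 10, 20, 30, ... and the initial first frame, then take the pixel-level average of the accumulated sum. Based on the result, set different QP and λ adjustment ranges by threshold, along with the corresponding α and I-frame QP, as given by formulas (1.11)-(1.16).
Step 4: Before actually encoding the current frame, use the original motion compensation error and motion vectors obtained in S1 to compute the propagation factor of every 16×16 coding block in the current frame, and take the harmonic mean to obtain the average propagation factor of each Superblock. Because AV1 uses two-pass encoding by default, the built-in AV1 variable pcnt_neutral is used to identify screen content sequences, and, together with the adjustment range obtained in S2, the Lagrangian multipliers of different video sequences are adjusted accordingly.
Step 5: Based on the relational model obtained in step S2, define λ_new as the Lagrangian multiplier computed by the relational model, and compute the difference between the encoder's Lagrangian multiplier λ_org and λ_new. Depending on the difference, correct the encoder's Lagrangian multiplier λ_org using the relational model formulas (1.9)-(1.10).
Step 6: In AV1 the frame with rPOC 16 is coded as a special ALT frame, and it lies in temporal layer TL1, similar to a key frame in HEVC, so its distortion affects the distortion of many subsequent frames. Therefore, on top of the AV1 encoder's block-level Lagrangian multiplier adjustment within the ALT frame, scaling and QP-λ correction are additionally applied to improve the coding result.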
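Formula (1.9) from Step 2 can be written directly as a function. This is a sketch under stated assumptions: it reads the extracted "Qstep 2" as Qstep squared, and the helper `lambda_gap` only returns the difference between λ_org and λ_new that Step 5 refers to. The correction rule of formula (1.10) survives only as an image in the source and is not reproduced here. Both function names are hypothetical.

```python
def lambda_new(qstep):
    """Relational model (1.9): lambda_new as a quadratic in the quantization step."""
    return 3.667 * qstep ** 2 - 5.198e-07 * qstep - 0.6664


def lambda_gap(lambda_org, qstep):
    """Difference between the encoder's multiplier and the model's prediction.

    Step 5 corrects lambda_org depending on this difference; the thresholds
    and the correction itself (formula (1.10)) are not modeled here.
    """
    return lambda_org - lambda_new(qstep)


# Illustrative inputs only; real Qstep values come from the encoder's QP table.
for qstep in (1.0, 2.0, 4.0):
    print(qstep, lambda_new(qstep))
```

The quadratic term dominates for any realistic step size, so in this reading the modeled multiplier grows roughly with the square of Qstep.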
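When building the temporal propagation chain, motion search is performed with 16×16 blocks and the propagation factor of each block is computed. AV1 partitions and encodes video sequences with a resolution of 720P or higher using independent 128×128 SuperBlocks, and sequences below 720P using 64×64 SuperBlocks. The propagation factors of all 16×16 blocks within a SuperBlock are therefore averaged to give the SuperBlock's propagation factor, which is used to adjust the SuperBlock-level Lagrangian multiplier and QP. For I-frames, only some sequences are adjusted, according to the computed threshold.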
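The block-to-SuperBlock aggregation can be sketched as follows. The paragraph above says the 16×16 factors are averaged, and the step list specifies a harmonic mean, so the sketch uses the harmonic mean. Reading "720P or higher" as a frame height of at least 720 is an assumption, and the function names and the toy grid of factors are hypothetical.

```python
def superblock_size(height):
    """SuperBlock edge length; '720P or higher' is read as height >= 720 (assumption)."""
    return 128 if height >= 720 else 64


def harmonic_mean(values):
    """Harmonic mean of strictly positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs positive values")
    return len(values) / sum(1.0 / v for v in values)


def superblock_factors(block_factors, sb_size, block=16):
    """Aggregate a 2-D grid of 16x16 propagation factors per SuperBlock.

    block_factors[r][c] is the factor of the block in block-row r, block-column c.
    SuperBlocks at the right and bottom edges may cover fewer blocks.
    """
    n = sb_size // block
    rows, cols = len(block_factors), len(block_factors[0])
    result = []
    for r0 in range(0, rows, n):
        row = []
        for c0 in range(0, cols, n):
            cell = [block_factors[r][c]
                    for r in range(r0, min(r0 + n, rows))
                    for c in range(c0, min(c0 + n, cols))]
            row.append(harmonic_mean(cell))
        result.append(row)
    return result


# Toy grid: 4 block-rows by 8 block-columns, i.e. two 64x64 SuperBlocks side by side.
grid = [[1.0] * 4 + [4.0] * 4 for _ in range(4)]
print(superblock_factors(grid, superblock_size(480)))
```

A harmonic mean is pulled toward the smallest factors in a SuperBlock, so a few blocks with weak temporal influence limit the aggregated factor more than they would under an arithmetic mean.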
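The invention uses the AV1 reference software libaom-1.0 as the experimental platform. The experimental environment follows the Common Test Conditions (CTC) specified by JVET, and experiments are conducted only under the default AV1 coding structure. The test set consists of 20 video sequences from Classes B, C, D, E, and F, and each test sequence is encoded at four QP points (32, 43, 53, 63). Taking the BasketballDrill sequence as an example, the reference software is configured as: --codec=av1 -w 832 -h 480 --fps=50/1 --cpu-used=1 --threads=0 --profile=0 --drop-frame=0 --static-thresh=0 --sharpness=0 --frame-parallel=0 --tile-columns=0 --end-usage=q -v --cq-level=32 --psnr --limit=500 -o BasketballDrill_832x480_50.yuv.ivf BasketballDrill_832x480_50.yuv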
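The four QP points can be turned into four encoder invocations with a small helper. This sketch only builds argument lists and does not run anything. The flags are copied from the configuration above; the executable name `aomenc` and the per-QP output file suffix are assumptions, since the source gives neither the binary name nor distinct output names.

```python
QP_POINTS = (32, 43, 53, 63)


def build_command(qp, source="BasketballDrill_832x480_50.yuv", encoder="aomenc"):
    """Return the argument list for one encode at the given cq-level.

    'encoder' and the '.q<qp>.ivf' output suffix are hypothetical choices.
    """
    return [
        encoder, "--codec=av1", "-w", "832", "-h", "480", "--fps=50/1",
        "--cpu-used=1", "--threads=0", "--profile=0", "--drop-frame=0",
        "--static-thresh=0", "--sharpness=0", "--frame-parallel=0",
        "--tile-columns=0", "--end-usage=q", "-v", f"--cq-level={qp}",
        "--psnr", "--limit=500", "-o", f"{source}.q{qp}.ivf", source,
    ]


commands = [build_command(qp) for qp in QP_POINTS]
for command in commands:
    print(" ".join(command))
```

Keeping the command as a list rather than a single string makes it straightforward to pass to a process launcher later without shell quoting issues.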
Table 1. Test results of the invention compared with libaom-1.0
Figure PCTCN2020132813-appb-000020
Figure PCTCN2020132813-appb-000021
Figure PCTCN2020132813-appb-000022
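The coding results are shown in Table 1: under the default AV1 coding structure the test sequences achieve a coding gain of 1.66% on the Y component. The invention improves performance noticeably for most test sequences, especially Class E, where a bit-rate saving of 5.03% is achieved on the Y component. This is mainly because Class E consists of video sequences with relatively fixed scenes, so the frames are highly similar and temporal dependence is strong, and the invention works well on such sequences. In addition, the BasketballDrill sequence achieves a bit-rate saving of 6.21% on the Y component, likewise because its background is relatively static. Next, selected sequences are examined using rate-distortion curve comparisons to observe their coding gains. Figure 5 shows the rate-distortion curves of the BasketballDrill sequence, with the coding bit rate (Rate) on the horizontal axis and the peak signal-to-noise ratio (PSNR) of the reconstructed video on the vertical axis. The blue curve is the rate-distortion curve of the global rate-distortion optimization algorithm and the red curve is that of the original libaom-1.0. For sequences with strong temporal dependence, the algorithm improves coding efficiency noticeably.
Likewise, in terms of coding complexity, the temporal rate-distortion optimization algorithm reduces coding complexity by 6% on average under the default AV1 coding structure. This is mainly because the adaptive Lagrangian multipliers computed by the algorithm give the coding units better prediction. Although building the temporal propagation chain takes some time, higher-quality prediction produces smaller coding residuals, which speeds up the subsequent transform, quantization, and entropy coding, so the overall time decreases.
Table 2. Encoding time of the invention as a percentage of libaom-1.0
Sequence    Class B    Class C    Class D    Class E    Class F    ΔEncT
            95%        94%        97%        91%        92%        94%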
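As a quick consistency check on Table 2, the sketch below averages the five per-class percentages. It assumes that ΔEncT is the unweighted mean over classes, which the source does not state explicitly.

```python
# Per-class encoding time relative to libaom-1.0, taken from Table 2.
enc_time_percent = {
    "Class B": 95,
    "Class C": 94,
    "Class D": 97,
    "Class E": 91,
    "Class F": 92,
}

mean_percent = sum(enc_time_percent.values()) / len(enc_time_percent)
reduction = 100 - mean_percent

print(f"mean encoding time: {mean_percent:.1f}%")
print(f"average reduction:  {reduction:.1f}%")
```

Under that assumption the mean is 93.8%, which rounds to the 94% reported in the ΔEncT column and to the roughly 6% reduction quoted in the text.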

Claims (1)

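  1. A time-domain rate-distortion optimization method based on video sequence features and QP-λ correction, characterized by comprising the following steps:
    S1. According to the temporal dependencies in the default AV1 coding structure, build a temporal propagation chain, use forward motion search to find the matching blocks affected by each original coding block, and record the corresponding original motion compensation error and motion vector;
    S2. Define the Lagrangian multiplier as λ_new and the quantization step size as Qstep; using the encoder's built-in table that maps the quantization parameter QP to Qstep, collect statistics on the Lagrangian multipliers λ of different sequences at different QPs and their corresponding Qstep, and construct a relational model between the Lagrangian multiplier λ_new and the quantization step size Qstep, the relational model being:
    λ_new = 3.667*Qstep^2 - 5.198e-07*Qstep - 0.6664
    Figure PCTCN2020132813-appb-100001
    where λ_org is the Lagrangian multiplier in the encoder;
    classify the original video sequence using frame differencing: compute the sum of absolute differences between each of the subsequent frames 10, 20, 30, ... and the initial first frame, then take the pixel-level average of the accumulated sum, denoted E; based on the result, set different QP and λ adjustment ranges by threshold, along with the corresponding α and I-frame QP:
    SAD_i = ∑|p_0 - p_{10*i}|
    Figure PCTCN2020132813-appb-100002
    Figure PCTCN2020132813-appb-100003
    Figure PCTCN2020132813-appb-100004
    α = clip3(0.90, 0.98, 1 - 0.0006*(0.8*E - 20))
    Figure PCTCN2020132813-appb-100005
    where SAD_i is the i-th sum of absolute differences, p_0 is the pixel value of the initial frame, p_{10*i} is the pixel value of the subsequent frames 10, 20, 30, ..., F is the total number of frames in the video sequence, W is the width, H is the height, constrainQPrange is the maximum adjustable range of QP, QP_0 is the QP of the I-frame (0~255), α is a coefficient, and the clip3() function limits the result of 1 - 0.0006*(0.8*E - 20) to the range 0.90~0.98;
    S3. Before actually encoding the current frame, use the original motion compensation error and motion vectors obtained in S1 to compute the propagation factor of every 16×16 coding block in the current frame, and take the harmonic mean to obtain the average propagation factor of each Superblock; because AV1 uses two-pass encoding by default, use a built-in AV1 variable to identify screen content sequences, and, together with the adjustment range obtained in S2, adjust the Lagrangian multipliers of different video sequences accordingly;
    S4. Based on the relational model obtained in step S2, with λ_new being the Lagrangian multiplier computed by the relational model, compute the difference between the encoder's Lagrangian multiplier λ_org and λ_new, and, depending on the difference, correct the encoder's Lagrangian multiplier λ_org using the relational model formulas;
    S5. Because in AV1 the frame with rPOC 16 is coded as a special ALT frame and lies in temporal layer TL1, its distortion affects the distortion of many subsequent frames; therefore, on top of the AV1 encoder's block-level Lagrangian multiplier adjustment within the ALT frame, scaling and QP-λ correction are additionally applied to improve the coding result.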
PCT/CN2020/132813 2020-08-05 2020-11-30 Time-domain rate-distortion optimization method based on video sequence features and QP-λ correction WO2022027881A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/460,360 US11418795B2 (en) 2020-08-05 2021-08-30 Temporal domain rate distortion optimization based on video content characteristic and QP-λcorrection

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010776206.9 2020-08-05
CN202010776206.9A CN111918068B (zh) 2020-08-05 2020-08-05 Time-domain rate-distortion optimization method based on video sequence features and QP-λ correction

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/460,360 Continuation US11418795B2 (en) 2020-08-05 2021-08-30 Temporal domain rate distortion optimization based on video content characteristic and QP-λcorrection

Publications (1)

Publication Number Publication Date
WO2022027881A1 true WO2022027881A1 (zh) 2022-02-10

Family

ID=73287127

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/132813 WO2022027881A1 (zh) 2020-08-05 2020-11-30 Time-domain rate-distortion optimization method based on video sequence features and QP-λ correction

Country Status (2)

Country Link
CN (1) CN111918068B (zh)
WO (1) WO2022027881A1 (zh)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114584536A (zh) * 2022-02-22 2022-06-03 重庆大学 一种基于分区率失真建模的360度流媒体传输方法
CN114866773A (zh) * 2022-05-09 2022-08-05 西安邮电大学 消除时域抖动效应的量化参数级联方法
CN116405690A (zh) * 2023-06-01 2023-07-07 中南大学 一种快速帧级自适应拉格朗日乘子优化方法、系统及设备
WO2023155445A1 (zh) * 2022-02-21 2023-08-24 翱捷科技股份有限公司 一种基于运动检测的率失真优化方法及装置
CN116723330A (zh) * 2023-03-28 2023-09-08 成都师范学院 一种自适应球域失真传播链长度的全景视频编码方法
CN117440158A (zh) * 2023-12-20 2024-01-23 华侨大学 基于三维几何失真的miv沉浸式视频编码率失真优化方法
CN117676136A (zh) * 2023-11-16 2024-03-08 广州群接龙网络科技有限公司 一种群接龙数据处理方法及系统
WO2024082580A1 (zh) * 2022-10-18 2024-04-25 电子科技大学 一种考虑时域失真传播的低复杂度全景视频编码方法

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11418795B2 (en) 2020-08-05 2022-08-16 University Of Electronic Science And Technology Of China Temporal domain rate distortion optimization based on video content characteristic and QP-λcorrection
CN111918068B (zh) * 2020-08-05 2022-03-08 电子科技大学 基于视频序列特征和QP-λ修正的时域率失真优化方法
CN114915789B (zh) * 2022-04-13 2023-03-14 中南大学 一种帧间的拉格朗日乘子优化方法、系统、设备及介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105120282A (zh) * 2015-08-07 2015-12-02 上海交通大学 一种时域依赖的码率控制比特分配方法
CN105872545A (zh) * 2016-04-19 2016-08-17 电子科技大学 一种随机接入视频编码中层次化时域率失真优化方法
CN105872544A (zh) * 2016-04-19 2016-08-17 电子科技大学 低延迟视频编码中时域率失真优化方法
WO2018065152A1 (en) * 2016-10-05 2018-04-12 Thomson Licensing Method and apparatus for encoding a picture using rate-distortion based block splitting
CN111314703A (zh) * 2020-03-31 2020-06-19 电子科技大学 一种基于失真类型传播分析的时域率失真优化方法
CN111918068A (zh) * 2020-08-05 2020-11-10 电子科技大学 基于视频序列特征和QP-λ修正的时域率失真优化方法

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002152759A (ja) * 2000-11-10 2002-05-24 Sony Corp 画像情報変換装置および画像情報変換方法
CN1206864C (zh) * 2002-07-22 2005-06-15 中国科学院计算技术研究所 结合率失真优化的码率控制的方法及其装置
JP5061122B2 (ja) * 2006-01-09 2012-10-31 マティアス・ナロシュケ ハイブリッドなビデオ符号化における予測誤差の適応符号化
US9386317B2 (en) * 2014-09-22 2016-07-05 Sony Interactive Entertainment Inc. Adaptive picture section encoding mode decision control
CN104994382B (zh) * 2015-04-30 2017-12-19 西安电子科技大学 Perceptual rate-distortion optimization method
CN110830801B (zh) * 2018-08-13 2021-10-01 华为技术有限公司 Video coding rate control method and related apparatus

Non-Patent Citations (1)

Title
GUO, HONGWEI; ZHU, CE; LIU, YUYANG: "Overview of Rate-Distortion Optimization for Video Coding", ACTA ELECTRONICA SINICA, vol. 48, no. 5, 31 May 2020 (2020-05-31), China, pages 1018-1029, XP009533962, ISSN: 0372-2112, DOI: 10.3969/j.issn.0372-2112.2020.05.024 *

Cited By (13)

Publication number Priority date Publication date Assignee Title
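WO2023155445A1 (zh) * 2022-02-21 2023-08-24 翱捷科技股份有限公司 Rate-distortion optimization method and apparatus based on motion detection
CN114584536A (zh) * 2022-02-22 2022-06-03 重庆大学 360-degree streaming media transmission method based on partitioned rate-distortion modeling
CN114584536B (zh) * 2022-02-22 2024-03-12 重庆大学 360-degree streaming media transmission method based on partitioned rate-distortion modeling
CN114866773A (zh) * 2022-05-09 2022-08-05 西安邮电大学 Quantization parameter cascading method for eliminating the temporal jitter effect
CN114866773B (zh) * 2022-05-09 2023-02-28 西安邮电大学 Quantization parameter cascading method for eliminating the temporal jitter effect
WO2024082580A1 (zh) * 2022-10-18 2024-04-25 电子科技大学 Low-complexity panoramic video coding method considering temporal distortion propagation
CN116723330B (zh) * 2023-03-28 2024-02-23 成都师范学院 Panoramic video coding method with adaptive spherical-domain distortion propagation chain length
CN116723330A (zh) * 2023-03-28 2023-09-08 成都师范学院 Panoramic video coding method with adaptive spherical-domain distortion propagation chain length
CN116405690A (zh) * 2023-06-01 2023-07-07 中南大学 Fast frame-level adaptive Lagrange multiplier optimization method, system and device
CN116405690B (zh) * 2023-06-01 2023-09-01 中南大学 Fast frame-level adaptive Lagrange multiplier optimization method, system and device
CN117676136A (zh) * 2023-11-16 2024-03-08 广州群接龙网络科技有限公司 群接龙 data processing method and system
CN117440158A (zh) * 2023-12-20 2024-01-23 华侨大学 Rate-distortion optimization method for MIV immersive video coding based on three-dimensional geometric distortion
CN117440158B (zh) * 2023-12-20 2024-04-12 华侨大学 Rate-distortion optimization method for MIV immersive video coding based on three-dimensional geometric distortion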

Also Published As

Publication number Publication date
CN111918068B (zh) 2022-03-08
CN111918068A (zh) 2020-11-10

Similar Documents

Publication Publication Date Title
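WO2022027881A1 (zh) Temporal-domain rate-distortion optimization method based on video sequence features and QP-λ correction
WO2021196822A1 (zh) Loop filtering method based on adaptive self-guided filtering
US10715816B2 (en) Adaptive chroma downsampling and color space conversion techniques
WO2021196682A1 (zh) Temporal-domain rate-distortion optimization method based on distortion-type propagation analysis
US20080159387A1 (en) Entropy deficiency based image
US10904524B2 (en) Adaptive loop filtering for high dynamic range video
US9118918B2 (en) Method for rate-distortion optimized transform and quantization through a closed-form operation
WO2016011796A1 (zh) Adaptive inverse quantization method and apparatus in video coding
JP2006180497A (ja) Method and apparatus for generating a quantization matrix used for coding an image or an image sequence
US11418795B2 (en) Temporal domain rate distortion optimization based on video content characteristic and QP-λ correction
CN101494792A (zh) H.264/AVC intra prediction method based on edge features
WO2024082580A1 (zh) Low-complexity panoramic video coding method considering temporal distortion propagation
CN109889852B (zh) HEVC intra coding optimization method based on neighboring values
WO2018095890A1 (en) Methods and apparatuses for encoding and decoding video based on perceptual metric classification
WO2024082579A1 (zh) Zero-delay panoramic video rate control method considering temporal distortion propagation
WO2020172813A1 (zh) Rate-distortion optimization method and apparatus, and computer-readable storage medium
CN110730346A (zh) Video coding rate control method based on coding tree unit distortion optimization
WO2018076827A1 (zh) Rate estimation method for intra coding in video coding
CN107343202B (zh) Feedback-free distributed video encoding and decoding method based on additional bit rate
KR20090017724A (ko) Method and apparatus for block mode decision using prediction of bit generation possibility in video coding
TW200822757A (en) Video coding method using image data skipping
CN105812818A (zh) Elastic motion estimation method based on improved Levenberg-Marquardt optimization
CN116016927A (zh) Low-delay panoramic video coding method considering temporal correlation and entropy balance
WO2020140219A1 (zh) Intra prediction method and apparatus, and computer storage medium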
Jung et al. HEVC encoder optimization for HDR video coding based on perceptual block merging

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20948025

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 20948025

Country of ref document: EP

Kind code of ref document: A1