WO2021196682A1 - Time-domain rate-distortion optimization method based on distortion-type propagation analysis - Google Patents

Time-domain rate-distortion optimization method based on distortion-type propagation analysis

Publication number
WO2021196682A1
Authority
WO
WIPO (PCT)
Prior art keywords
distortion
coding unit
coding
frame
propagation
Prior art date
Application number
PCT/CN2020/132812
Other languages
English (en)
French (fr)
Inventor
朱策
邓玲玲
蒋妮
王秋月
丁可可
Original Assignee
电子科技大学
Priority date
2020-03-31
Filing date
2020-11-30
Publication date
2021-10-07
Application filed by 电子科技大学
Priority to US17/412,292 (US11330270B2)
Publication of WO2021196682A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/567 Motion estimation based on rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/109 Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation

Definitions

  • The invention belongs to the technical field of video encoding and decoding, and specifically relates to a time-domain rate-distortion optimization method based on distortion-type propagation analysis.
  • Rate-distortion theory is the theoretical foundation of lossy coding.
  • Rate-distortion optimization (RDO), developed from this theory, is one of the key tools for improving coding efficiency and is widely used in video coding.
  • The performance of video coding must be measured jointly by coding bits and reconstruction distortion.
  • When better video quality is wanted, the coding bits of the video increase.
  • At a low bit budget, the distortion of the video increases sharply, so coding bits and reconstruction distortion conflict with and constrain each other.
  • Rate-distortion optimization lets the encoder select an optimal set of coding parameters so that the coding distortion is minimized while the coding bits stay below the target bits. Its mathematical expression is given in formula (1.1).
  • D_i and R_i denote the distortion and the number of bits of a coding unit.
  • N is the total number of coding units.
  • R_c is the target number of bits.
  • A global Lagrangian multiplier λ_g can be introduced to convert the constrained problem into the unconstrained problem of formula (1.2), where J is called the rate-distortion cost function.
  • Figure 1 shows the classic R-D curve.
  • When a set of coding parameters is used to encode a video,
  • the distortion and bit rate under those coding parameters can be obtained.
  • For a given rate R*, an operating point with the smallest D can always be found. These points are called the actually achievable optimal operating points, and connecting them yields the actual operational R-D curve.
  • λ_g is the negative slope at a point on the rate-distortion curve.
  • A larger λ_g corresponds to an operating point with a lower bit rate and higher distortion, while a smaller λ_g corresponds to an operating point with a higher bit rate and lower distortion.
  • λ_g is the most important factor determining rate-distortion performance, so the selection of the Lagrangian multiplier λ_g is critical.
  • In VVC, the value of λ_g is determined mainly by the preset quantization parameter QP and is independent of the input video sequence.
  • The paper "Temporally dependent rate-distortion optimization for low-delay hierarchical video coding" studies a time-domain rate-distortion optimization algorithm under the LD coding structure. Based on the temporal dependency under the LD configuration, it builds a time-domain distortion propagation chain under multiple reference frames, estimates the degree of distortion propagation, computes the propagation factor, and then adjusts the global Lagrangian multiplier according to the aggregate propagation factor, thereby achieving time-domain rate-distortion optimization and addressing the global rate-distortion optimization problem.
  • Figure 2 shows how the time-domain distortion propagation chain is constructed under the LD coding structure. Because multiple reference frames are used, one coding block in a key frame may directly affect several blocks in different subsequent frames and continue to propagate indirectly to later frames. The utilization rate of each reference frame in the GOP therefore has to be obtained statistically from a large number of experiments, and the expected distortion of subsequent coding blocks is computed as a weighted average over the multiple possibly affected blocks. The affected coding blocks in subsequent frames can be determined from the temporal dependency under the LD coding structure in Figure 3 together with a forward motion search.
  • The present invention restates the rate-distortion optimization problem based on time-domain distortion propagation, using the temporal dependency under the LD coding structure and an analysis of distortion propagation in skip mode and inter mode.
  • In skip mode, the distortion of the current coding unit is determined by the distortion of the previously coded reference unit, so its influence on subsequent coding units is also determined by the previously coded unit; the skip-mode distortion should therefore be excluded when considering the influence of the current coding unit on subsequent coding units. Let [formula] and [formula] be the coding distortion when the current coding unit selects inter mode and skip mode, respectively. p_inter and p_skip are the probabilities that the current coding unit selects inter mode and skip mode, respectively, and they sum to 1. The larger the error between the current coding unit and its prediction unit, the more likely the encoder is to select inter mode; a larger quantization step size makes the encoder more likely to select skip mode. p_inter is therefore defined as: [formula]
  • λ_i is the Lagrangian multiplier of coding unit B_i under global rate-distortion performance.
  • κ_i represents the influence of coding unit B_i on the coding distortion of the subsequent video sequence and is called the propagation factor of coding unit B_i.
  • The distortion function in inter mode at high bit rates can be expressed as [formula].
  • The distortion function in skip mode can be expressed as [formula], where R_{i+1} is the bit rate, b is a constant related to the source distribution, and [formula] is the motion-compensated prediction error of B_{i+1}.
  • F_i denotes the original pixels of coding unit B_i.
  • F_{i+1} denotes the original pixels of coding unit B_{i+1}.
  • α is approximately equal to a constant.
  • The distortion of coding unit B_{i+1} can be expressed as: [formula]
  • The expected distortion of coding unit B_{i+1} in coded frame f_{i+1} affected by coding unit B_i is: [formula]
  • The expected distortion of coding unit B_{i+2} in coded frame f_{i+2} affected by B_i is: [formula]
  • [formula] denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+2} references coding unit B_{i+1}.
  • R_{i+2} is the bit rate of coding unit B_{i+2}.
  • P_{i+1,i+2} and P_{i,i+2} are the probabilities that coded frames f_{i+1} and f_i, respectively, are referenced by coded frame f_{i+2}.
  • c_{i+2} is a term independent of the coding parameter o_i of coding unit B_i.
  • The expected distortion of coding unit B_{i+3} in coded frame f_{i+3} affected by B_i is: [formula]
  • [formula] denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+3} references coding unit B_{i+2}; [formula] denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+3} references coding unit B_i.
  • R_{i+3} is the bit rate of coding unit B_{i+3}.
  • P_{i+2,i+3} and P_{i,i+3} are the probabilities that coded frames f_{i+2} and f_i, respectively, are referenced by coded frame f_{i+3}.
  • c_{i+3} is a term independent of the coding parameter o_i of coding unit B_i.
  • P_{i,i+k+1-t} is the probability that coded frame f_i is referenced by coded frame f_{i+k+1-t}.
  • P_{j,j+1} is the probability that coded frame f_j is referenced by coded frame f_{j+1}, where [formula] is independent of the coding parameter o_i of coding unit B_i.
  • P_{i+4m,i+4m+k+1-t} is the probability that coded frame f_{i+4m} is referenced by coded frame f_{i+4m+k+1-t}.
  • P_{j,j+1} is the probability that coded frame f_j is referenced by coded frame f_{j+1}, where [formula] is independent of the coding parameter o_i of coding unit B_i.
  • The aggregate distortion of the coding units affected by B_i in all subsequent coded frames, from coded frame f_{i+1} to the last coded frame f_N, is: [formula]
  • M is the total number of GOPs from coded frame f_{i+1} to the last coded frame f_N, and L denotes the terms independent of o_i.
  • The propagation factor κ_i can be used to adaptively adjust the CTU-level global Lagrangian multiplier λ_g and, in turn, the CTU-level QP, while the frame-level average propagation factor is used to adjust the frame-level QP of all B frames.
  • All subsequent coded frames need to reference the I frame.
  • VTM lowers the I-frame QP uniformly by 1, but the importance of the I frame differs from sequence to sequence, so the I frame can be encoded twice, and the coding distortion obtained from the first encoding can be used to build the distortion propagation chain.
  • The propagation factor of each 16×16 block in the I frame is computed, and the frame-level average propagation factor is used to adjust the QP of the I frame, so that the I-frame QP is adjusted according to the influence of the I frame on subsequent coded frames.
  • The adjustment value is not limited to -1.
  • The beneficial effect of the present invention is that it solves the problem that conventional methods do not apply time-domain rate-distortion optimization to I frames, so that the global rate-distortion performance of I frames is optimized.
  • Based on the analysis of distortion propagation in skip mode and inter mode, the invention restates the rate-distortion optimization problem based on time-domain distortion propagation and improves rate-distortion optimization performance under the LD coding structure.
  • Figure 1 shows the operational rate-distortion curve.
  • Figure 2 is a schematic diagram of the construction of the time-domain distortion propagation chain under the LD coding structure.
  • Figure 3 is a schematic diagram of the LD coding structure.
  • Figure 4 shows the rate-distortion curves of the Fourpeople sequence.
  • The global Lagrangian multiplier λ_g can be modified directly in VTM through the propagation factor κ_i. Because the subsequent coding units have not actually been encoded when κ_i is derived, their distortion has to be estimated.
  • At high bit rates, the coding distortion of a subsequent coding unit is most likely inter distortion.
  • Since B_{i+1} has not been encoded, R_{i+1} is unavailable and D_{i+1} cannot be computed with (1.21); with a quantization step size Q_step, however, the coding distortion of B_{i+1} can be expressed as [formula].
  • A curve F(θ) can be fitted and a look-up table built from it; θ is computed and used to look up the value of F(θ), from which the inter distortion of the coding block is estimated.
  • In the present invention, α is set to 0.94.
  • The global Lagrangian multiplier λ_g can be obtained, where N is the number of all coding units.
  • The sum of the distortion of all coding units cannot be fully obtained during encoding.
  • The distortion sum is therefore taken as a weighted sum of the distortion of the previously coded frames and the distortion of the frame just encoded, and λ_g is updated accordingly. Since [formula] is not available in an encoder that integrates the rate-distortion algorithm proposed in this section, D_i is used instead.
  • Motion search is performed on 16×16 blocks, and the propagation factor of each block is computed. Because VTM partitions and encodes each 128×128 CTU independently, the propagation factors of all 16×16 blocks within a CTU are averaged to give the propagation factor of the CTU, which is used to adjust the CTU-level Lagrangian multiplier and QP; the average propagation factor of the whole picture is used to adjust the frame-level QP.
  • The I frame uses two-pass encoding to optimize its QP.
  • The first encoding pass of the I frame is simplified: binary-tree and ternary-tree partitioning are skipped, and only quad-tree partitioning is used to split the CTU.
  • The minimum coding-unit size is set to 16×16, with no further splitting. From the distortion obtained in the first encoding pass of the I frame, the influence of the distortion of coding units in the I frame on subsequent coding units can be estimated, enabling adaptive adjustment of the I-frame QP.
  • The present invention uses the VVC reference software VTM5.0 as the experimental platform.
  • The experimental environment is configured according to the common test conditions (CTC) and the reference software configuration specified by JVET.
  • CTC: Common Test Conditions
  • JVET: Joint Video Experts Team
  • Experiments are run only under the LDB coding structure.
  • The test set consists of the 16 video sequences of Classes B, C, D and E recommended by the CTC, each encoded at four QP points (22, 27, 32, 37).
  • The results of the coding experiment are shown in Table 1.
  • The table shows that, under the LDB coding structure, the Y component achieves a coding performance gain of 2.57% over the test sequences.
  • The improvement is clear for most test sequences and is especially pronounced for Class E, where a bit-rate saving of 10.13% is achieved for the Y component.
  • Class E consists of video sequences with relatively static scenes, high similarity between frames and strong temporal dependency, for which the present invention is particularly effective.
  • Figure 4 shows the rate-distortion curves of the Fourpeople sequence.
  • The horizontal axis is the coding bit rate, and the vertical axis is the peak signal-to-noise ratio (PSNR) of the reconstructed video.
  • The curve with circular markers is the rate-distortion curve of the global rate-distortion optimization algorithm, and the curve with square markers is that of the original VTM5.0. For sequences with strong temporal dependency, the coding efficiency of the algorithm improves markedly.
  • The coding complexity of this time-domain rate-distortion optimization algorithm under the LDB coding structure increases by 15% on average, mainly because the algorithm spends time performing a motion search for each 16×16 block
  • to find the affected coding blocks and build the distortion propagation chain; in addition, two-pass encoding is used for the I frame.
  • Although the first encoding pass of the I frame is simplified, it still adds a small amount of coding complexity.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

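The invention belongs to the technical field of video encoding and decoding, and specifically relates to a time-domain rate-distortion optimization method based on distortion-type propagation analysis. Based on the temporal dependency under the LD structure and an analysis of distortion propagation in skip mode and inter mode, the invention restates the dependent rate-distortion optimization problem based on time-domain distortion propagation. By constructing a temporal propagation chain, it estimates the aggregate distortion of the current coding unit and the affected future coding units, computes the propagation factor of each coding unit in the time-domain distortion propagation model, and adjusts the Lagrangian multiplier with this more accurate propagation factor to achieve temporally dependent rate-distortion optimization. Two-pass encoding is also applied to I frames to achieve temporally dependent rate-distortion optimization of I frames.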

Description

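Time-domain rate-distortion optimization method based on distortion-type propagation analysis
Technical Field
The present invention belongs to the technical field of video encoding and decoding, and specifically relates to a time-domain rate-distortion optimization method based on distortion-type propagation analysis.
Background Art
Rate-distortion theory is the theoretical foundation of lossy coding. Rate-distortion optimization (RDO), which was developed from this theory, is one of the key tools for improving coding efficiency and is widely used in video coding.
The performance of video coding must be measured jointly by the coding bits and the reconstruction distortion. On the one hand, better video quality requires more coding bits; on the other hand, at a low bit budget the distortion of the video increases sharply. Coding bits and reconstruction distortion therefore conflict with and constrain each other. Rate-distortion optimization lets the encoder select an optimal set of coding parameters so that the coding distortion is minimized while the coding bits stay below the target bits. Its mathematical expression is given in formula (1.1):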
Figure PCTCN2020132812-appb-000001
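where D_i and R_i denote the distortion and the number of bits of a coding unit, N is the total number of coding units, and R_c is the target number of bits.
To solve this constrained rate-distortion optimization problem, a global Lagrangian multiplier λ_g can be introduced to convert the constrained problem into the unconstrained problem of formula (1.2), where J is called the rate-distortion cost function.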
Figure PCTCN2020132812-appb-000002
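The two formulas appear only as images in this record. A plausible reconstruction from the surrounding definitions, assuming the standard constrained and Lagrangian forms and using o_i (introduced later in the text) for the coding parameters, would be:

```latex
% Hedged reconstruction of (1.1) and (1.2); the exact typesetting in the
% original filing is not recoverable from this text.
\begin{align}
  \min_{\{o_i\}} \sum_{i=1}^{N} D_i
    \quad \text{s.t.} \quad \sum_{i=1}^{N} R_i \le R_c \tag{1.1} \\
  \min_{\{o_i\}} J, \qquad
    J = \sum_{i=1}^{N} D_i + \lambda_g \sum_{i=1}^{N} R_i \tag{1.2}
\end{align}
```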
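Figure 1 shows the classic R-D curve. When a video is encoded with a given set of coding parameters, the distortion and bit rate under those parameters are obtained. Plotting the (R, D) pairs for different coding parameters as points gives the actual rate-distortion operating points. For a given rate R*, an operating point with the smallest D can always be found; these points are called the actually achievable optimal operating points, and connecting them yields the actual operational R-D curve.
Under independent rate-distortion optimization, that is, when the rate-distortion performance of different coding units is mutually independent, differentiating formula (1.2) with respect to R_i gives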
Figure PCTCN2020132812-appb-000003
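It follows that λ_g is the negative slope at a point on the rate-distortion curve. A larger λ_g corresponds to an operating point with a lower bit rate and higher distortion, while a smaller λ_g corresponds to an operating point with a higher bit rate and lower distortion. λ_g is the most important factor determining rate-distortion performance, so its selection is critical. In VVC, the value of λ_g is currently determined mainly by the preset quantization parameter QP and is independent of the input video sequence.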
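The effect of λ_g on the selected operating point can be illustrated with a minimal sketch. The function name `best_operating_point` and the four (rate, distortion) pairs are hypothetical and chosen only for illustration; they are not data from the patent.

```python
def best_operating_point(points, lam):
    """Return the (rate, distortion) pair that minimizes J = D + lam * R."""
    return min(points, key=lambda p: p[1] + lam * p[0])


# Hypothetical operating points on a convex R-D curve: (rate, distortion).
points = [(100, 50.0), (200, 30.0), (400, 18.0), (800, 12.0)]

# A small multiplier favors low distortion at a high rate,
# a large multiplier favors a low rate at high distortion.
low_lambda_choice = best_operating_point(points, 0.01)
mid_lambda_choice = best_operating_point(points, 0.05)
high_lambda_choice = best_operating_point(points, 0.5)
```

With these sample points, increasing the multiplier moves the selection toward lower-rate, higher-distortion points, which matches the negative-slope interpretation in the text.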
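However, because intra and inter prediction introduce dependencies between coding units, applying independent rate-distortion optimization to each coding unit does not achieve optimal coding performance. A global rate-distortion optimization method of acceptable complexity is therefore needed to further improve coding efficiency.
The paper "Temporally dependent rate-distortion optimization for low-delay hierarchical video coding" studies a time-domain rate-distortion optimization algorithm under the LD (low-delay) coding structure. Based on the temporal dependency under the LD configuration, it builds a time-domain distortion propagation chain under multiple reference frames, estimates the degree of distortion propagation, computes the propagation factor, and then adjusts the global Lagrangian multiplier according to the aggregate propagation factor, thereby achieving time-domain rate-distortion optimization and addressing the global rate-distortion optimization problem.
Figure 2 shows how the time-domain distortion propagation chain is constructed under the LD coding structure. Because multiple reference frames are used, one coding block in a key frame may directly affect several blocks in different subsequent frames and continue to propagate indirectly to later frames. The utilization rate of each reference frame in the GOP therefore has to be obtained statistically from a large number of experiments, and the expected distortion of subsequent coding blocks is computed as a weighted average over the multiple possibly affected blocks. The affected coding blocks in subsequent frames can be determined from the temporal dependency under the LD coding structure in Figure 3 together with a forward motion search.
Under the LD coding structure, when considering the time-domain rate-distortion optimization of coding unit B_i in key frame f_i, the expected distortion of the affected coding unit B_{i+1} in coded frame f_{i+1} is: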
Figure PCTCN2020132812-appb-000004
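where P_{i,j} is the probability that coded frame f_i is referenced by coded frame f_j, and o_i is the coding parameter of B_i. Since the last three terms are independent of the coding parameter o_i of B_i, formula (1.3) can be simplified to
Figure PCTCN2020132812-appb-000005
Similarly, the expected distortion of coding unit B_{i+2} can be written as:
Figure PCTCN2020132812-appb-000006
where
Figure PCTCN2020132812-appb-000007
is independent of the coding parameter o_i of B_i. The expected distortion of the coding units affected further downstream can be obtained in a similar way.
Based on the concept of expected distortion, the rate-distortion problem of formula (1.2) can be restated as: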
Figure PCTCN2020132812-appb-000008
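Because this algorithm estimates the expected distortion of the current and subsequent coding units rather coarsely, the propagation factor cannot accurately measure how the distortion of the current coding unit affects subsequent coding distortion, and the algorithm actually causes a loss in the new-generation video coding standard VVC. In addition, the algorithm does not apply time-domain rate-distortion optimization to I frames, even though the coding performance of I frames is very important under the LD coding structure.
Summary of the Invention
To address the above problems and further improve time-domain rate-distortion optimization under the LD coding structure, the present invention restates the dependent rate-distortion optimization problem based on time-domain distortion propagation, using the temporal dependency under the LD coding structure and an analysis of distortion propagation in skip mode and inter mode. By constructing a time-domain distortion propagation chain, it estimates the aggregate distortion of the current coding unit and the affected future coding units, computes the propagation factor of each coding unit in the time-domain distortion propagation model, and then adjusts the Lagrangian multiplier with this more accurate propagation factor to achieve temporally dependent rate-distortion optimization. It also applies two-pass encoding to I frames to achieve temporally dependent rate-distortion optimization of I frames.
The technical solution adopted by the present invention is as follows:
Let the reconstruction distortion of coding unit B_i be D_i. Inter prediction includes a skip mode, in which no residual is transmitted and the inter prediction value is used directly as the reconstruction value; the remaining modes transmit a residual and are referred to as inter mode. The distortion of the current coding unit can therefore be composed of the distortion contributed by skip mode and by inter mode: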
Figure PCTCN2020132812-appb-000009
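The decomposition itself is an image in this record. Going by the surrounding text and claim 1 (mode probabilities summing to 1, partial distortions d_inter and d_skip), one plausible reading is the following; the superscript symbols for the per-mode distortions are an assumption, since the original symbols are not recoverable here:

```latex
% Hedged reading of the skip/inter decomposition, formula (1.7).
% D_i^{inter}, D_i^{skip}: distortion if B_i is coded in inter / skip mode
% (assumed notation).
\begin{equation}
  D_i = \underbrace{p_{inter}\, D_i^{inter}}_{d_{inter}}
      + \underbrace{p_{skip}\, D_i^{skip}}_{d_{skip}},
  \qquad p_{inter} + p_{skip} = 1
\end{equation}
```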
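Only the partial distortion d_inter of the current coding unit in inter mode affects subsequent coding units. In skip mode, an already coded reference unit is used as the prediction block and no prediction residual is transmitted, so the distortion of the current coding unit is determined by the distortion of the previously coded reference unit, and its influence on subsequent coding units is likewise determined by the previously coded unit. The skip-mode distortion should therefore be excluded when considering the influence of the current coding unit on subsequent coding units. Let
Figure PCTCN2020132812-appb-000010
Figure PCTCN2020132812-appb-000011
be the coding distortion when the current coding unit selects inter mode and skip mode, respectively, and let p_inter and p_skip be the probabilities that the current coding unit selects inter mode and skip mode, respectively, with p_inter + p_skip = 1. The larger the error between the current coding unit and its prediction unit, the more likely the encoder is to select inter mode; a larger quantization step size makes the encoder more likely to select skip mode. p_inter is therefore defined as:
Figure PCTCN2020132812-appb-000012
where
Figure PCTCN2020132812-appb-000013
is the original motion-compensated error of B_i obtained by motion search in the original frames, F_i and F_{i-1} are the original pixels of coding unit B_i and reference unit B_{i-1}, respectively, and Δ is the quantization step size.
When encoding B_i, taking the partial derivative of formula (1.6) with respect to R_i gives the global Lagrangian multiplier λ_g: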
Figure PCTCN2020132812-appb-000014
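Multiplying both sides of formula (1.9) by
Figure PCTCN2020132812-appb-000015
and assuming
Figure PCTCN2020132812-appb-000016
gives:
Figure PCTCN2020132812-appb-000017
λ_i is the Lagrangian multiplier of coding unit B_i under global rate-distortion performance. κ_i represents the influence of coding unit B_i on the coding distortion of the subsequent video sequence and is called the propagation factor of coding unit B_i.
The distortion function in inter mode at high bit rates can be expressed as
Figure PCTCN2020132812-appb-000018
and the distortion function in skip mode can be expressed as
Figure PCTCN2020132812-appb-000019
where R_{i+1} is the bit rate, b is a constant related to the source distribution, and
Figure PCTCN2020132812-appb-000020
is the motion-compensated prediction error of B_{i+1}.
Figure PCTCN2020132812-appb-000021
F_i denotes the original pixels of coding unit B_i,
Figure PCTCN2020132812-appb-000022
denotes the reconstructed pixels of coding unit B_i, and F_{i+1} denotes the original pixels of coding unit B_{i+1}.
Experimental observation shows that α is approximately constant, in which case the distortion of coding unit B_{i+1} can be expressed as: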
Figure PCTCN2020132812-appb-000023
Figure PCTCN2020132812-appb-000024
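denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+1} references coding unit B_i, and
Figure PCTCN2020132812-appb-000025
denotes the original motion-compensated error of coding unit B_{i+1}.
From formulas (1.4) and (1.7), the expected distortion of coding unit B_{i+1} in coded frame f_{i+1} affected by coding unit B_i is then:
Figure PCTCN2020132812-appb-000026
where
Figure PCTCN2020132812-appb-000027
depends only on the bit rate R_{i+1} of coding unit B_{i+1} and is independent of the coding parameter o_i of coding unit B_i;
Figure PCTCN2020132812-appb-000028
is also independent of the coding parameter o_i of B_i. Only the probability P_{i,i+1} that coded frame f_i is referenced by coded frame f_{i+1}, the inter-mode coding distortion of coding unit B_i
Figure PCTCN2020132812-appb-000029
and the parameter γ_{i,i+1} depend on the coding parameter o_i.
Similarly, the expected distortion of coding unit B_{i+2} in coded frame f_{i+2} affected by B_i is: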
Figure PCTCN2020132812-appb-000030
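where
Figure PCTCN2020132812-appb-000031
denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+2} references coding unit B_{i+1};
Figure PCTCN2020132812-appb-000032
denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+2} references coding unit B_i; R_{i+2} is the bit rate of coding unit B_{i+2}; and P_{i+1,i+2} and P_{i,i+2} are the probabilities that coded frames f_{i+1} and f_i, respectively, are referenced by coded frame f_{i+2}. c_{i+2} is a term independent of the coding parameter o_i of coding unit B_i.
Likewise, the expected distortion of coding unit B_{i+3} in coded frame f_{i+3} affected by B_i is:
Figure PCTCN2020132812-appb-000033
where
Figure PCTCN2020132812-appb-000034
denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+3} references coding unit B_{i+2};
Figure PCTCN2020132812-appb-000035
denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+3} references coding unit B_i; R_{i+3} is the bit rate of coding unit B_{i+3}; and P_{i+2,i+3} and P_{i,i+3} are the probabilities that coded frames f_{i+2} and f_i, respectively, are referenced by coded frame f_{i+3}. c_{i+3} is a term independent of the coding parameter o_i of coding unit B_i.
Therefore, the aggregate distortion of all coding units affected by coding unit B_i in the four coded frames of the current GOP is: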
Figure PCTCN2020132812-appb-000036
Figure PCTCN2020132812-appb-000037
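denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+k+1-t} references coding unit B_i;
Figure PCTCN2020132812-appb-000038
denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{j+1} references coding unit B_j; P_{i,i+k+1-t} is the probability that coded frame f_i is referenced by coded frame f_{i+k+1-t}; and P_{j,j+1} is the probability that coded frame f_j is referenced by coded frame f_{j+1}, where
Figure PCTCN2020132812-appb-000039
is independent of the coding parameter o_i of coding unit B_i.
Similarly, the aggregate distortion of all coding units affected by coding unit B_i in the four coded frames of the m-th GOP is: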
Figure PCTCN2020132812-appb-000040
Figure PCTCN2020132812-appb-000041
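denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+4m+k+1-t} references coding unit B_{i+4m}; P_{i+4m,i+4m+k+1-t} is the probability that coded frame f_{i+4m} is referenced by coded frame f_{i+4m+k+1-t}; and P_{j,j+1} is the probability that coded frame f_j is referenced by coded frame f_{j+1}, where
Figure PCTCN2020132812-appb-000042
is independent of the coding parameter o_i of coding unit B_i.
The aggregate distortion of the coding units affected by B_i in all subsequent coded frames, from coded frame f_{i+1} to the last coded frame f_N, is:
Figure PCTCN2020132812-appb-000043
M is the total number of GOPs from coded frame f_{i+1} to the last coded frame f_N, and L denotes the terms independent of o_i.
From formula (1.8), the inter-mode coding distortion of the current coding unit B_i
Figure PCTCN2020132812-appb-000044
is related to the actual coding distortion D_i as follows: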
Figure PCTCN2020132812-appb-000045
Figure PCTCN2020132812-appb-000046
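is the probability that coding unit B_i selects inter mode, so formula (1.19) can be expressed as:
Figure PCTCN2020132812-appb-000047
According to formula (1.10), the propagation factor κ_i is calculated as:
Figure PCTCN2020132812-appb-000048
The propagation factor κ_i can be used to adaptively adjust the CTU-level global Lagrangian multiplier λ_g and, in turn, the CTU-level QP; the frame-level average propagation factor is used to adjust the frame-level QP of all B frames.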
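How a propagation factor could drive the multiplier and QP is sketched below under two explicit assumptions, neither of which is confirmed by this record because formula (1.10) is only an image: that the multiplier takes the form λ_i = λ_g / (1 + κ_i), as in earlier temporally dependent RDO work, and that λ scales as 2^(QP/3), the relation commonly used in HEVC/VVC reference encoders. The function names are hypothetical.

```python
import math


def adjusted_lambda(lambda_g, kappa):
    """Scale the global multiplier by a propagation factor.

    ASSUMPTION: lambda_i = lambda_g / (1 + kappa_i); the patent's own
    formula (1.10) is not legible in this text.
    """
    if kappa <= -1.0:
        raise ValueError("kappa must be greater than -1")
    return lambda_g / (1.0 + kappa)


def qp_offset_from_lambda(lambda_old, lambda_new):
    """Integer QP change matching a multiplier change.

    ASSUMPTION: lambda is proportional to 2 ** (QP / 3).
    """
    return round(3.0 * math.log2(lambda_new / lambda_old))


# A block whose distortion propagates strongly (kappa = 1) gets a smaller
# multiplier and, under the assumed mapping, a lower QP.
lam_i = adjusted_lambda(100.0, 1.0)
delta_qp = qp_offset_from_lambda(100.0, lam_i)
```

Under these assumptions, a larger κ_i lowers λ_i and therefore QP, which spends more bits on blocks that are referenced heavily by later frames.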
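I frames are especially important under the LD coding structure because all subsequent coded frames need to reference the I frame. VTM currently lowers the I-frame QP uniformly by 1, but the importance of the I frame differs from sequence to sequence. The I frame can therefore be encoded twice: the coding distortion obtained from the first encoding is used to build the distortion propagation chain and compute the propagation factor of each 16×16 block in the I frame, and the frame-level average propagation factor is used to adjust the QP of the I frame. The I-frame QP is thus adjusted according to the influence of the I frame on subsequent coded frames, and the adjustment is not limited to -1.
The beneficial effects of the present invention are as follows. The invention solves the problem that conventional methods do not apply time-domain rate-distortion optimization to I frames, so that the global rate-distortion performance of I frames is optimized. Based on the temporal dependency under the LD coding structure and the analysis of distortion propagation in skip mode and inter mode, it restates the dependent rate-distortion optimization problem based on time-domain distortion propagation and improves rate-distortion optimization performance under the LD coding structure.
Brief Description of the Drawings
Figure 1 shows the operational rate-distortion curve.
Figure 2 is a schematic diagram of the construction of the time-domain distortion propagation chain under the LD coding structure.
Figure 3 is a schematic diagram of the LD coding structure.
Figure 4 shows the rate-distortion curves of the Fourpeople sequence.
Detailed Description
The present invention is described in detail below with reference to an embodiment:
To simplify the implementation of the global rate-distortion algorithm, the global Lagrangian multiplier λ_g can be modified directly in VTM through the propagation factor κ_i. Because the subsequent coding units have not actually been encoded when κ_i is derived, their distortion has to be estimated.
At high bit rates, the coding distortion of a subsequent coding unit is most likely inter distortion, in which case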
Figure PCTCN2020132812-appb-000049
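Since B_{i+1} has not been encoded, R_{i+1} is unavailable and D_{i+1} cannot be computed with (1.21). With a quantization step size Q_step, however, the coding distortion of B_{i+1} can be expressed as
Figure PCTCN2020132812-appb-000050
Figure PCTCN2020132812-appb-000051
where
Figure PCTCN2020132812-appb-000052
Based on a large number of experiments with different quantization step sizes and coding units, a curve F(θ) can be fitted and a look-up table built from it: θ is computed and used to look up the value of F(θ), from which the inter distortion of the coding block is estimated. In the present invention, α is set to 0.94.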
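A look-up table of this kind could be queried as in the sketch below. The sample grid and values are placeholders invented for illustration, not the fitted F(θ) from the patent, and the piecewise-linear interpolation with clamping at the table ends is an assumption about how intermediate values of θ might be handled.

```python
from bisect import bisect_right

# PLACEHOLDER samples of (theta, F(theta)); not values from the patent.
THETA_GRID = [0.0, 1.0, 2.0, 4.0]
F_VALUES = [0.0, 0.5, 0.8, 1.0]


def lookup_f(theta):
    """Piecewise-linear look-up of F(theta), clamped at the table ends."""
    if theta <= THETA_GRID[0]:
        return F_VALUES[0]
    if theta >= THETA_GRID[-1]:
        return F_VALUES[-1]
    # Index of the grid interval [t0, t1) that contains theta.
    k = bisect_right(THETA_GRID, theta) - 1
    t0, t1 = THETA_GRID[k], THETA_GRID[k + 1]
    f0, f1 = F_VALUES[k], F_VALUES[k + 1]
    return f0 + (f1 - f0) * (theta - t0) / (t1 - t0)
```

A table keeps the per-block cost to one search and one interpolation, which matters because the estimate is evaluated for every 16×16 block on the propagation chain.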
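According to formula (1.9), the global Lagrangian multiplier is obtained:
Figure PCTCN2020132812-appb-000053
Meanwhile, the Lagrangian multiplier of VTM is
Figure PCTCN2020132812-appb-000054
so λ_g and λ_VTM are related as follows:
Figure PCTCN2020132812-appb-000055
For all coding units:
Figure PCTCN2020132812-appb-000056
Formula (1.24) gives the global Lagrangian multiplier λ_g, where N is the number of all coding units. The sum of the distortion of all coding units cannot be fully obtained during encoding, so the distortion sum is taken as a weighted sum of the distortion of the previously coded frames and the distortion of the frame just encoded, and λ_g is updated accordingly. Since
Figure PCTCN2020132812-appb-000057
is not available in an encoder that integrates the rate-distortion algorithm proposed in this section, D_i is used instead.
When building the distortion propagation chain, motion search is performed on 16×16 blocks and the propagation factor of each block is computed. Because VTM partitions and encodes each 128×128 CTU independently, the propagation factors of all 16×16 blocks within a CTU are averaged to give the propagation factor of the CTU, which is used to adjust the CTU-level Lagrangian multiplier and QP. The average propagation factor of the whole picture is used to adjust the frame-level QP.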
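The block-to-CTU averaging described above can be sketched as follows. The function names and the nested-list layout (one propagation factor per 16×16 block, row by row) are assumptions made for illustration; a 128×128 CTU then covers 8×8 blocks, and CTUs at the picture edge may cover fewer.

```python
def ctu_propagation_factors(block_kappa, blocks_per_ctu=8):
    """Average 16x16-block factors over each CTU.

    block_kappa: non-empty 2-D list, one value per 16x16 block.
    blocks_per_ctu: 8 for a 128x128 CTU made of 16x16 blocks.
    """
    rows, cols = len(block_kappa), len(block_kappa[0])
    ctu_rows = -(-rows // blocks_per_ctu)  # ceiling division
    ctu_cols = -(-cols // blocks_per_ctu)
    result = []
    for cr in range(ctu_rows):
        row_out = []
        for cc in range(ctu_cols):
            # Collect the blocks inside this CTU, clipped at picture edges.
            values = [
                block_kappa[r][c]
                for r in range(cr * blocks_per_ctu,
                               min((cr + 1) * blocks_per_ctu, rows))
                for c in range(cc * blocks_per_ctu,
                               min((cc + 1) * blocks_per_ctu, cols))
            ]
            row_out.append(sum(values) / len(values))
        result.append(row_out)
    return result


def frame_propagation_factor(block_kappa):
    """Average of all block-level factors in the picture."""
    values = [v for row in block_kappa for v in row]
    return sum(values) / len(values)


# Two CTUs side by side: the left one has factor 1.0, the right one 3.0.
grid = [[1.0] * 8 + [3.0] * 8 for _ in range(8)]
ctu_kappa = ctu_propagation_factors(grid)
frame_kappa = frame_propagation_factor(grid)
```

The CTU-level values would feed the CTU-level multiplier and QP adjustment, and the frame-level value the frame-level QP adjustment.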
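The I frame uses two-pass encoding to optimize its QP. To reduce coding complexity, the first encoding pass of the I frame is simplified: binary-tree and ternary-tree partitioning are skipped, only quad-tree partitioning is used to split the CTU, and the minimum coding-unit size is set to 16×16 with no further splitting. From the distortion obtained in the first encoding pass, the influence of the distortion of coding units in the I frame on subsequent coding units can be estimated, enabling adaptive adjustment of the I-frame QP.
The present invention uses the VVC reference software VTM5.0 as the experimental platform. The experimental environment is configured according to the common test conditions (CTC) and reference software configuration specified by JVET. Experiments are run only under the LDB coding structure. The test set consists of the 16 video sequences of Classes B, C, D and E recommended by the CTC, and each sequence is encoded at four QP points (22, 27, 32, 37).
Table 1: Test results of the present invention compared with VTM5.0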
Figure PCTCN2020132812-appb-000058
Figure PCTCN2020132812-appb-000059
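The coding results are shown in Table 1: under the LDB coding structure, the Y component achieves a coding performance gain of 2.57% over the test sequences. The improvement is clear for most test sequences and is especially pronounced for Class E, where a bit-rate saving of 10.13% is achieved for the Y component. This is mainly because Class E consists of video sequences with relatively static scenes, high similarity between frames and strong temporal dependency, for which the present invention is particularly effective. Some sequences are then selected and their coding performance improvement examined using rate-distortion curve comparisons. Figure 4 shows the rate-distortion curves of the Fourpeople sequence; the horizontal axis is the coding bit rate and the vertical axis is the peak signal-to-noise ratio (PSNR) of the reconstructed video. The curve with circular markers is the rate-distortion curve of this global rate-distortion optimization algorithm, and the curve with square markers is that of the original VTM5.0. For sequences with strong temporal dependency, the coding efficiency of the algorithm improves markedly.
In terms of coding complexity, the coding complexity of this time-domain rate-distortion optimization algorithm under the LDB coding structure increases by 15% on average. This is mainly because the algorithm spends time performing a motion search for each 16×16 block to find the affected coding blocks and build the distortion propagation chain. In addition, two-pass encoding is used for the I frame; although the first encoding pass of the I frame is simplified, it still adds a small amount of coding complexity.
Table 2: Encoding time of the present invention as a percentage of VTM5.0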
Figure PCTCN2020132812-appb-000060
Figure PCTCN2020132812-appb-000061

Claims (1)

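  1. A time-domain rate-distortion optimization method based on distortion-type propagation analysis, characterized by comprising the following steps:
    S1. Define the reconstruction distortion D_i of coding unit B_i as: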
    Figure PCTCN2020132812-appb-100001
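    where
    Figure PCTCN2020132812-appb-100002
    Figure PCTCN2020132812-appb-100003
    are the coding distortion when the current coding unit selects inter mode and skip mode, respectively; p_inter and p_skip are the probabilities that the current coding unit selects inter mode and skip mode, respectively; d_inter is the partial distortion in inter mode; d_skip is the partial distortion in skip mode; and p_inter + p_skip = 1;
    p_inter is defined as:
    Figure PCTCN2020132812-appb-100004
    where
    Figure PCTCN2020132812-appb-100005
    is the original motion-compensated error of B_i obtained by motion search in the original frames, F_i and F_{i-1} are the original pixels of coding unit B_i and reference unit B_{i-1}, respectively, and Δ is the quantization step size;
    S2. When encoding B_i, take the partial derivative of the temporally dependent rate-distortion optimization problem of B_i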
    Figure PCTCN2020132812-appb-100006
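    with respect to R_i to obtain the global Lagrangian multiplier λ_g
    Figure PCTCN2020132812-appb-100007
    where o_i is the coding parameter of B_i and R_i is the number of bits of the coding unit;
    multiply both sides of the formula by
    Figure PCTCN2020132812-appb-100008
    and let
    Figure PCTCN2020132812-appb-100009
    to obtain:
    Figure PCTCN2020132812-appb-100010
    where λ_i is the Lagrangian multiplier of coding unit B_i under global rate-distortion performance, and κ_i represents the influence of coding unit B_i on the coding distortion of the subsequent video sequence and is defined as the propagation factor of coding unit B_i;
    S3. Establish the aggregate distortion of all coding units affected by coding unit B_i in the four coded frames of the current GOP as: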
    Figure PCTCN2020132812-appb-100011
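    where
    Figure PCTCN2020132812-appb-100012
    α is a constant,
    Figure PCTCN2020132812-appb-100013
    denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+k+1-t} references coding unit B_i,
    Figure PCTCN2020132812-appb-100014
    denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{j+1} references coding unit B_j, P_{i,i+k+1-t} is the probability that coded frame f_i is referenced by coded frame f_{i+k+1-t}, and P_{j,j+1} is the probability that coded frame f_j is referenced by coded frame f_{j+1}, where
    Figure PCTCN2020132812-appb-100015
    is independent of the coding parameter o_i of coding unit B_i, that is, c_{i+k+1} is a term independent of the coding parameter o_i of coding unit B_i;
    the aggregate distortion of all coding units affected by coding unit B_i in the four coded frames of the m-th GOP is: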
    Figure PCTCN2020132812-appb-100016
    Figure PCTCN2020132812-appb-100017
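    denote the probabilities of using inter mode and skip mode, respectively, when coding unit B_{i+4m+k+1-t} references coding unit B_{i+4m}, P_{i+4m,i+4m+k+1-t} is the probability that coded frame f_{i+4m} is referenced by coded frame f_{i+4m+k+1-t}, and P_{j,j+1} is the probability that coded frame f_j is referenced by coded frame f_{j+1}, where
    Figure PCTCN2020132812-appb-100018
    is independent of the coding parameter o_i of coding unit B_i, that is, c_{i+4m+k+1} is a term independent of the coding parameter o_i of coding unit B_i;
    the aggregate distortion of the coding units affected by B_i in all subsequent coded frames, from coded frame f_{i+1} to the last coded frame f_N, is then obtained as:
    Figure PCTCN2020132812-appb-100019
    M is the total number of GOPs from coded frame f_{i+1} to the last coded frame f_N, and L denotes the terms independent of o_i;
    S4. From the definition of p_inter in step S1, the inter distortion of the current coding unit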
    Figure PCTCN2020132812-appb-100020
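    is related to the actual distortion D_i by:
    Figure PCTCN2020132812-appb-100021
    where b is a constant related to the source distribution; let
    Figure PCTCN2020132812-appb-100022
    so that the above expression simplifies to
    Figure PCTCN2020132812-appb-100023
    from the expression for λ_i in step S2, the propagation factor κ_i is calculated as:
    Figure PCTCN2020132812-appb-100024
    the propagation factor κ_i is used to adaptively adjust the CTU-level global Lagrangian multiplier λ_g; following the above steps, the propagation factors of the blocks within a CTU are averaged to obtain the propagation factor of the CTU, which is further used to adjust the CTU-level QP, while the frame-level average propagation factor is used to adjust the frame-level QP of all B frames; for the I frame, two-pass encoding is used: the coding distortion obtained from the first encoding is used to build the distortion propagation chain, the propagation factor of each 16×16 block in the I frame is computed according to the above steps, and the frame-level average propagation factor is used to adjust the QP of the I frame, so that the I-frame QP is adjusted according to the influence of the I frame on subsequent coded frames.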
PCT/CN2020/132812 2020-03-31 2020-11-30 Time-domain rate-distortion optimization method based on distortion-type propagation analysis WO2021196682A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/412,292 US11330270B2 (en) 2020-03-31 2021-08-26 Temporal domain rate distortion optimization considering coding-mode adaptive distortion propagation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010241861.4 2020-03-31
CN202010241861.4A CN111314703B (zh) 2020-03-31 2020-03-31 Time-domain rate-distortion optimization method based on distortion-type propagation analysis

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/412,292 Continuation US11330270B2 (en) 2020-03-31 2021-08-26 Temporal domain rate distortion optimization considering coding-mode adaptive distortion propagation

Publications (1)

Publication Number Publication Date
WO2021196682A1 (zh) 2021-10-07

Family

ID=71147494

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/132812 WO2021196682A1 (zh) 2020-03-31 2020-11-30 Time-domain rate-distortion optimization method based on distortion-type propagation analysis

Country Status (3)

Country Link
US (1) US11330270B2 (zh)
CN (1) CN111314703B (zh)
WO (1) WO2021196682A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116723330A (zh) * 2023-03-28 2023-09-08 成都师范学院 Panoramic video coding method with adaptive spherical-domain distortion propagation chain length

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
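CN111314703B (zh) * 2020-03-31 2022-03-08 电子科技大学 Temporal-domain rate-distortion optimization method based on distortion-type propagation analysis
CN111918068B (zh) * 2020-08-05 2022-03-08 电子科技大学 Temporal-domain rate-distortion optimization method based on video sequence characteristics and QP-λ correction
US11418795B2 (en) 2020-08-05 2022-08-16 University Of Electronic Science And Technology Of China Temporal domain rate distortion optimization based on video content characteristic and QP-λ correction
CN113596483B (zh) * 2021-08-20 2024-03-12 红河学院 Method and system for determining parameters of a coding tree unit
CN115695801A (zh) * 2022-10-18 2023-02-03 电子科技大学 Low-complexity panoramic video coding method considering temporal distortion propagation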

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
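CN104796705A (zh) * 2015-04-22 2015-07-22 福州大学 SSIM-based rate-distortion optimization and rate control algorithm for HEVC video coding
CN105120282A (zh) * 2015-08-07 2015-12-02 上海交通大学 Temporally dependent bit allocation method for rate control
CN105872544A (zh) * 2016-04-19 2016-08-17 电子科技大学 Temporal-domain rate-distortion optimization method for low-delay video coding
CN111314703A (zh) * 2020-03-31 2020-06-19 电子科技大学 Temporal-domain rate-distortion optimization method based on distortion-type propagation analysis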

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20050061762A (ko) * 2003-12-18 2005-06-23 학교법인 대양학원 Coding mode determination method, motion estimation method, and encoding apparatus
JP4542107B2 (ja) * 2005-07-22 2010-09-08 三菱電機株式会社 Image decoding device and image decoding method
HUE039661T2 (hu) * 2009-09-10 2019-01-28 Guangdong Oppo Mobile Telecommunications Corp Ltd Speedup techniques for distortion-optimized quantization
GB2495469B (en) * 2011-09-02 2017-12-13 Skype Video coding
CN102625102B (zh) * 2011-12-22 2014-02-12 北京航空航天大学 Rate-distortion mode selection method for H.264/SVC MGS coding
CN115052157A (zh) * 2012-07-02 2022-09-13 韩国电子通信研究院 Image encoding/decoding method and non-transitory computer-readable recording medium
US20160373740A1 (en) * 2014-03-05 2016-12-22 Sony Corporation Image encoding device and method
CN110351557A (zh) * 2018-04-03 2019-10-18 朱政 Fast inter-frame prediction coding method in video coding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GAO YANBO; ZHU CE; LI SHUAI: "Hierarchical temporal dependent rate-distortion optimization for low-delay coding", 2016 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), IEEE, 22 May 2016 (2016-05-22), pages 570 - 573, XP032941613, DOI: 10.1109/ISCAS.2016.7527304 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
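CN116723330A (zh) * 2023-03-28 2023-09-08 成都师范学院 Panoramic video coding method with adaptive spherical-domain distortion propagation chain length
CN116723330B (zh) * 2023-03-28 2024-02-23 成都师范学院 Panoramic video coding method with adaptive spherical-domain distortion propagation chain length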

Also Published As

Publication number Publication date
CN111314703A (zh) 2020-06-19
CN111314703B (zh) 2022-03-08
US11330270B2 (en) 2022-05-10
US20220007031A1 (en) 2022-01-06

Similar Documents

Publication Publication Date Title
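WO2021196682A1 (zh) Temporal-domain rate-distortion optimization method based on distortion-type propagation analysis
CN111918068B (zh) Temporal-domain rate-distortion optimization method based on video sequence characteristics and QP-λ correction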
US9179147B2 (en) Soft decision and iterative video coding for MPEG and H.264
US20100238997A1 (en) Method and system for optimized video coding
US20080056354A1 (en) Transcoding Hierarchical B-Frames with Rate-Distortion Optimization in the DCT Domain
US6891889B2 (en) Signal to noise ratio optimization for video compression bit-rate control
US20070274396A1 (en) Complexity adaptive skip mode estimation for video encoding
KR20100061756A (ko) Video encoding generation method
US20060126734A1 (en) Video encoder and method for encoding a video signal
US20130235938A1 (en) Rate-distortion optimized transform and quantization system
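WO2024082580A1 (zh) Low-complexity panoramic video coding method considering temporal distortion propagation
WO2021120614A1 (zh) Secondary encoding optimization method
TW201301900A (zh) Method for decoding video encoded as a bit stream in a video decoder
WO2024082579A1 (zh) Zero-delay panoramic video rate control method considering temporal distortion propagation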
Chen et al. Intra frame rate control for versatile video coding with quadratic rate-distortion modelling
US8705618B2 (en) Method and device for coding a video image with a coding error estimation algorithm
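WO2018076827A1 (zh) Rate estimation method for intra coding in video coding
JP4994877B2 (ja) Method and system for selecting a macroblock coding mode in a video frame sequence
WO2022194103A1 (zh) Decoding method, encoding method, apparatus, device, and storage medium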
De Abreu et al. Optimal Lagrange multipliers for dependent rate allocation in video coding
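CN108989818B (zh) Image coding parameter adjustment method and device
CN105872545A (zh) Hierarchical temporal-domain rate-distortion optimization method for random-access video coding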
Ropert et al. RD spatio-temporal adaptive quantization based on temporal distortion backpropagation in HEVC
Yang et al. Rate distortion optimization of H. 264 with main profile compatibility
Ascenso et al. Low complexity intra mode selection for efficient distributed video coding

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20929451; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20929451; Country of ref document: EP; Kind code of ref document: A1)