CN107295334B

CN107295334B - Adaptive reference image selection method

Info

Publication number: CN107295334B
Application number: CN201710696790.5A
Authority: CN
Inventors: 周益民; 曾鹏; 王宏宇; 冷龙韬; 黄航
Original assignee: University of Electronic Science and Technology of China
Current assignee: University of Electronic Science and Technology of China
Priority date: 2017-08-15
Filing date: 2017-08-15
Publication date: 2019-12-03
Anticipated expiration: 2037-08-15
Also published as: CN107295334A

Abstract

The invention relates to an adaptive reference image selection method, comprising: A. obtaining the reference image set R_Set(t) of frame t of the images being coded; B. taking the image pic_k at time index k as a reference image, and computing the number of bits consumed by frame i when referencing pic_k and the reference value of pic_k; C. obtaining the reference value of every image in the sequence, and from these the reference value set of the past w consecutive images; D. computing the mean and variance of the w reference values in the set, and setting the maximum inter-frame reference distance L from the ratio of the mean and the variance; E. choosing between temporal proximity and quality for the reference images according to the sizes of the mean and the variance, and setting a new reference image set; F. according to the new reference image set, offsetting the quantization parameter of each frame according to how that frame is referenced.

Description

Adaptive Reference Image Selection Method

Technical field

The invention relates to image processing, and specifically to an adaptive reference image selection method.

Background

Video coding is essentially the process of removing spatial and temporal redundancy from a video signal through various algorithms, strategies, and tools. Spatial redundancy is usually removed by intra prediction, and temporal redundancy by inter prediction. Over time, many new techniques have been proposed and adopted in video encoders, such as entropy coding, quadtree block partitioning, multi-directional intra prediction modes, bidirectional multi-reference inter prediction modes, and decoder-side motion vector derivation. The shared goal of these methods is to spend as few bits as possible while preserving coding quality. Coding quality is usually measured by peak signal-to-noise ratio (PSNR), and the number of bits (Bits) is usually converted into a bit rate (BR) using the frame rate (FR) of the video sequence. In terms of terminology, a source video sequence consists of temporally adjacent frames; once a frame enters the codec, it is called an image throughout processing.

International video coding standards have progressed from H.264/AVC to HEVC/H.265, and the Chinese national standards from AVS1 to AVS+ and then AVS2, over nearly two decades. Work on the newest international standard, H.266, and on the national standard AVS3 has recently begun. Two decades of progress in coding technology have made coding structures increasingly standardized. The introduction of the hierarchical reference structure (HRS) in 2002 made the inter-picture reference structure increasingly stable. In today's mainstream coding configurations, both low delay (LD) and random access (RA) embody the HRS idea and are fixed according to an agreed reference structure. There are indications that the forthcoming H.266 standard may remove user-defined adjustment of inter-picture references. This suggests that most engineers and researchers accept the current way of managing reference images: HRS is mature and performs very well.

At the same time, source content and picture quality vary enormously, so applying a single a priori reference management scheme to every case is inappropriate. In a sequence with intense motion, temporal correlation is strong only between adjacent frames and drops sharply at slightly larger temporal distances. In a static scene with simple content, temporal correlation is preserved even over very long temporal distances.

During inter-picture reference, temporal redundancy is expressed as the "vector + compensation" obtained by motion estimation. Even with rate-distortion optimization, the encoder tends to select the modes with a smaller "vector + compensation" cost for the final transform and entropy coding. The quality of the reference image itself therefore plays a very important role, because the encoder tends to choose as reference the block that leaves the smallest residual after compensation. Raising coded image quality unavoidably costs more bits. In general, for a given frame, a smaller quantization parameter (QP) gives higher quality and consumes more bits, while a larger QP gives lower quality and consumes fewer bits; this monotonicity has been confirmed by information theory. Spending relatively more bits buys a higher-quality image. If that image is referenced frequently by subsequent images, its quality propagates in the time domain and the invested bits have high reference value. Conversely, if it is referenced rarely, its quality is quickly cut off in the time domain and the invested bits have low reference value.

For video sequences with rapid motion, temporal correlation is confined to immediately adjacent frames. For static or slowly changing sequences, inter-frame temporal correlation persists over long temporal distances. In the first case, overall coding performance benefits most from spending roughly the same number of bits on every frame, so that coded image quality stays uniform. In the second case, it benefits most from spending many bits on one high-quality image and using that image as a long-distance reference for subsequent frames.

In view of this, a method is needed that can decide autonomously, during encoding, which images to set as long-term references (LTR) that stay in the reference decoding buffer and receive more coding bits to guarantee their reference quality, and which images to set as short-term references (STR) that are removed from the reference decoding buffer in due course, so as to maintain high temporal correlation between images.

Summary of the invention

The invention provides an adaptive reference image selection method that decides autonomously on what basis to select images, judging and choosing between image quality and the temporal correlation between images according to the situation.

The adaptive reference image selection method of the invention comprises:

A. For a frame of the image sequence being coded, select several already coded images as references to obtain the reference image set R_Set(t) of frame t. This set contains the temporally adjacent reconstructed reference image pic_{t-1} and the three most recent consecutive forward key images;

B. Following the configuration of the reference image set R_Set(t), take the image pic_k at time index k as a reference image and compute the number of bits that frame i consumes when referencing pic_k. This number is approximately equal to the sum of the bits accumulated over all blocks for the reference image pic_k, which in turn yields the reference value of pic_k;

C. Using this calculation of the reference value, obtain the reference value of every image in the sequence, and from these the reference value set IP_Set(t) of the past w consecutive images, where t > w and t mod w = 0;

D. Compute the mean and variance of the w reference values in the set, and set the maximum inter-frame reference distance L from the ratio of the mean and the variance;

E. Compare the mean and the variance. If the mean is large and the variance small, choose the temporally nearest image as reference; if the mean is small and the variance large, choose a high-quality image rather than the temporally nearest one. Reset the reference image set of frame t to a new set according to the chosen reference images;

F. According to the new reference image set, offset the quantization parameter used to code each frame according to how that frame is referenced.

Specifically, the reference image set of frame t in step A is the one defined by formula (2) of the detailed description, where pic denotes a reference image.

Specifically, the number of bits consumed by frame i when referencing pic_k in step B is computed as follows. First compute the reference value v(·|·) produced by coding against the reference image pic_k, where |k-t| is the absolute inter-frame reference distance, |Δx_b| and |Δy_b| are the absolute horizontal and vertical components of the inter-prediction motion vector of block b of frame i, B is the total number of blocks in frame i, SAD_b is the residual sum of block b after motion compensation, and qstep_k is the quantization step size that was used when the reference image pic_k at time index k was itself coded. Then, from the resulting reference value v(·|·), compute the influence I(·) of the reference image pic_k, where n is the number of subsequent images coded in dependence on pic_k.

On this basis, the reference value set of the past w consecutive images in step C is IP_Set(t) = {I(pic_i) | i ∈ [t-w, t-1]}.

Further, the mean of the w reference values in step D is μ_t = (1/w) · Σ_{i=t-w}^{t-1} I(pic_i), and the variance is taken over the same w values about μ_t.

Further, the maximum inter-frame reference distance L is set from the ratio of this mean and variance.

Further, step E resets the reference image set of frame t to a new set built from the reference images selected in that step.

The offset is set as QP(pic_t) = QP_I + QP_offset - Q(t,4) - Q(t,L), where QP_I is the base value of the quantization parameter QP given as encoder input, QP_offset is the maximum offset of QP, and Q(·,·) is the QP offset correction term.

The adaptive reference image selection method of the invention can adaptively choose between image quality and temporal correlation according to the requirements of different images, and can derive reference image management rules from the way the image content changes, significantly improving coding performance for video sequences.

The above is described in further detail below through specific embodiments. This should not be understood as limiting the scope of the subject matter of the invention to the following examples. Any replacement or modification made on the basis of ordinary technical knowledge and customary means in the field, without departing from the technical idea of the invention, falls within the scope of the invention.

Detailed description

The number of binary bits (Bits) needed to code an image is called its information entropy (H). Specifically, an image is partitioned by suitable rules into blocks (CU/PU/TU) for coding. The block in row i and column j of frame t of the image sequence is denoted u_{t,i,j}, and the number of bits consumed in coding it, that is, its entropy, is H(u_{t,i,j}). In inter coding modes the entropy H(u_{t,i,j}) is not produced independently; it depends on the reference, as shown in formula (1).

In formula (1), the reference block is the one finally adopted, after rate-distortion optimization (RDO) has chosen the best mode, from the reference image set R_Set(t) available to frame t. τ is the POC number, in coding order, of the image that contains it, and ξ together with a second index gives the row and column of that block. The minimum entropy Encode(·) produced by coding the whole block has three parts: E_MV{·} is the entropy coding of the motion vector record, E_QT{·} is the entropy coding of the transformed and quantized residual, and E_MODE{·} is the entropy coding of the mode information. The norm ||·|| denotes the residual between block u_{t,i,j} and the reference block. The constraint in formula (1) determines the largest range from which the reference block can be chosen, so the definition and management of the reference image set R_Set(t) directly affect coding performance, that is, the information entropy H(u_{t,i,j}) produced by coding.

Within a relatively short period, the rate of change of source video content tends to be uniform. In inter prediction, that is, inter-picture reference, a frame may select several coded images as references; this is called multi-reference. Images with low distortion and high quality are more likely to be selected as references, especially for still content whose motion vectors tend to zero, whereas images with high coding distortion and low quality are unlikely to be selected. Under a multi-reference structure it is unnecessary to code all the images a frame can reference at high quality; it suffices that one high-quality image belongs to the reference set. This is called encoder reference image management and, in the context of the invention, adaptive reference image selection.

Without loss of generality, the current mainstream coding standards, such as H.264/AVC, HEVC/H.265, and AVS2, all support multi-reference image management, and the number of managed reference images is generally configured as 4. Taking the LD coding structure as an example, for any inter-coded frame at time index t, the display order POC equals the coding/decoding order DOC, and every image it can reference is contained in the reference image set R_Set(t).

On this basis, the adaptive reference image selection method of the invention comprises the following steps:

A. For a frame of the image sequence being coded, select several already coded images as references to obtain the reference image set R_Set(t) of frame t. This set contains the temporally adjacent reconstructed reference image pic_{t-1} and the three most recent consecutive forward key images.

A canonical group of pictures (GOP) usually contains 4 consecutive images with time indices {τ+1, τ+2, τ+3, τ+4}, where τ mod 4 = 0. Of these 4 frames, the last image of the GOP, with POC = τ+4, is usually called the key frame; its index is a multiple of 4. When the key frame is coded and decoded into the reference image pic_{τ+4}, more bits are spent on it in exchange for better decoded quality, to benefit the many later inter-frame references from non-key images.
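The GOP layout just described can be sketched in a few lines of Python. This is only an illustration under the text's assumption of a 4-picture GOP; the names `GOP_SIZE`, `gop_members`, and `is_key_frame` are hypothetical and do not come from the patent.

```python
GOP_SIZE = 4  # canonical GOP length used in the text


def gop_members(tau: int, gop_size: int = GOP_SIZE) -> list:
    """Return the time indices {tau+1, ..., tau+gop_size} of one GOP."""
    if tau % gop_size != 0:
        raise ValueError("tau must be a multiple of the GOP size")
    return [tau + offset for offset in range(1, gop_size + 1)]


def is_key_frame(poc: int, gop_size: int = GOP_SIZE) -> bool:
    """A picture counts as a key frame when its POC is a multiple of gop_size."""
    return poc % gop_size == 0
```

For example, `gop_members(8)` gives `[9, 10, 11, 12]`, and only POC 12 in that GOP satisfies `is_key_frame`.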

The reference image set R_Set(t) of frame t is therefore given by formula (2).

Formula (2) states that the reference image set of frame t contains its temporally adjacent reconstructed reference image pic_{t-1} and the three most recent consecutive forward key images. This accounts both for the strong temporal correlation between neighbouring frames and for the spatial-domain correlation of high-quality reference images, and it has long been a fixed arrangement. Consistently with this, the rule for assigning the quantization parameter QP to key images during coding matches the reference set defined by formula (2): the QP used for key images is smaller than that of the non-key images, to obtain higher image quality.
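A minimal sketch of such a fixed reference set is shown below. Formula (2) itself is not visible in this text, so the sketch rests on an assumption: the three key images are taken to be the most recent pictures with POC a multiple of 4 that lie strictly before pic_{t-1}. The function name `reference_set` and its parameters are hypothetical.

```python
def reference_set(t: int, gop_size: int = 4, num_key: int = 3) -> list:
    """POCs of the pictures in R_Set(t) under the stated assumption.

    The list starts with t-1 (the temporally adjacent picture) and is
    followed by up to num_key key pictures strictly earlier than t-1.
    """
    if t < 1:
        raise ValueError("frame t needs at least one earlier picture")
    refs = [t - 1]
    # Latest multiple of gop_size that is <= t-2, i.e. strictly before t-1.
    key = ((t - 2) // gop_size) * gop_size
    while key >= 0 and len(refs) < num_key + 1:
        refs.append(key)
        key -= gop_size
    return refs
```

Under this assumption `reference_set(17)` returns `[16, 12, 8, 4]`, so the farthest reference lies 13 pictures back, in line with the remark that the distance reaches 12 or more.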

Under the reference management of formula (2), the temporal distance between a reference image and the current frame can reach 12 or more, that is, a key image will be referenced 12 times by subsequent images. For video with rapid motion, content correlation between frames decays to almost nothing within 2 or 3 time steps; for slowly changing video, it remains high even over several hundred time steps. In the former case, spending a high bit rate every 4 frames to buy reference value is unnecessary and wastes bits, because inter-frame correlation is weak and references rarely hit. In the latter case a high bit rate is also spent every 4 frames, but referencing does not need 3 key images: one high-quality image suffices, and it can be kept as a reference for as long as possible, so the frequent high-bit-rate investment is again unnecessary and wastes bits. In both cases, adaptively setting the number, quality, and spacing of key images directly saves bits and improves coding efficiency.

B. Following the configuration of the reference image set R_Set(t), take the image pic_k at time index k as a reference image; it will be referenced by several consecutive subsequent frames. Compute the number of bits that frame i consumes when referencing pic_k; this number is approximately equal to the sum of the bits accumulated over all blocks for the reference image pic_k. Specifically:

Define the influence function I(·) of the reference image pic_k as the sum of the reference values v(·|·) generated by the n subsequent images that depend on pic_k, as shown in formula (3). This is a conditional statistic: only the coding of those frames that actually use pic_k as a reference is counted.

The reference value v(·|·) is given by formula (4).

The reference value v(·|·) captures the hit frequency of block-level references as well as the motion vector and temporal distance between the original image block and the reference block. All three factors are provided directly by the coding decision of every block in every frame and are easy to collect.
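The conditional sum behind the influence function can be sketched as follows. The sketch assumes the per-frame reference values v have already been computed elsewhere (formula (4) is not reproduced in this text) and are available as a log of `(frame_poc, ref_poc, value)` records; that record layout and the name `influence` are hypothetical.

```python
from collections import defaultdict


def influence(records):
    """Sum reference values per reference picture.

    records: iterable of (frame_poc, ref_poc, value), one entry for each
    coded frame that actually used ref_poc as a reference. Frames that
    did not reference a picture contribute nothing to its total.
    """
    totals = defaultdict(float)
    for _frame_poc, ref_poc, value in records:
        totals[ref_poc] += value
    return dict(totals)
```

With the records `[(5, 4, 2.0), (6, 4, 1.5), (6, 5, 0.5)]`, picture 4 accumulates 3.5 and picture 5 accumulates 0.5.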

In formula (4), |k-t| is the absolute inter-frame reference distance. |Δx_b| and |Δy_b| are the absolute horizontal and vertical components of the inter-prediction motion vector of block b of frame i, and B is the total number of blocks in frame i. SAD_b is the residual sum of block b after motion compensation. qstep_k is the quantization step size that was used when reference image k was itself coded, and can be computed from the quantization parameter QP. log₂(·) is the base-2 logarithm; the added 1 keeps the operation within bounds.
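The text only says that qstep_k can be derived from QP. One common way to do this, assuming the H.264/HEVC convention in which the step size equals 1 at QP = 4 and doubles every 6 QP units, is sketched below; the function name `qstep_from_qp` is hypothetical, and other codecs such as AVS2 use their own mapping.

```python
def qstep_from_qp(qp: float) -> float:
    """Approximate quantization step size for an H.264/HEVC-style QP.

    Assumption: Qstep = 2 ** ((QP - 4) / 6), i.e. the step doubles for
    every increase of 6 in QP and equals 1.0 at QP = 4.
    """
    return 2.0 ** ((qp - 4) / 6.0)
```

Under this convention QP 22 maps to a step of 8.0 and QP 28 to 16.0.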

This gives the number of bits consumed by frame i when referencing pic_k, which is approximately equal to the sum of the bits accumulated over all blocks for the reference image pic_k.

Combining formulas (3) and (4) then gives the reference value of the reference image pic_k at time index k, formula (5).

C. Applying formula (5) to all reference images gives the reference value I(·) of every image in the sequence. Define the maximum inter-frame reference distance L as the largest temporal distance that can be reached when one image references another. For applicability, L is generally restricted to an interval whose lower bound, 4, means that the minimum reference distance equals the configured number of reference frames, and whose upper bound is obtained from the frame rate FR (the number of frames refreshed per second) using a ceiling (round-up) operation. Without loss of generality, take the time window w to be the upper bound of this interval. Then, when the current frame number is t, the reference value set IP_Set(t) of the past w consecutive images is as shown in formula (6); in general t > w and t mod w = 0 are required.

IP_Set(t) = {I(pic_i) | i ∈ [t-w, t-1]} (6)
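The window of formula (6) and its two conditions can be sketched as below. The sketch assumes the influence values are held in a dictionary keyed by POC and that a picture never chosen as a reference has influence 0.0; both choices, and the names `window_due` and `reference_value_window`, are hypothetical.

```python
def window_due(t: int, w: int) -> bool:
    """True when the conditions t > w and t mod w == 0 both hold."""
    return t > w and t % w == 0


def reference_value_window(influence_by_poc: dict, t: int, w: int) -> list:
    """Return [I(pic_i) for i in t-w .. t-1], as in formula (6)."""
    if not window_due(t, w):
        raise ValueError("window is only evaluated when t > w and t mod w == 0")
    # Assumption: pictures with no recorded influence count as 0.0.
    return [influence_by_poc.get(i, 0.0) for i in range(t - w, t)]
```

With w = 4, the set is refreshed at t = 8, 12, 16 and so on, but not at t = 4, where t > w fails.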

D. Treat the w reference values I(pic_i) in the reference value set IP_Set(t) as samples and compute their mean μ_t and variance. The mean of IP_Set(t) is:

μ_t = (1/w) · Σ_{i=t-w}^{t-1} I(pic_i)

The variance of IP_Set(t) is computed over the same w samples about μ_t.

The maximum inter-frame reference distance L is then set from the ratio of the mean and the variance.
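The statistics of step D can be sketched with Python's standard library. The sketch assumes the population variance (division by w); the text does not show whether the patent divides by w or by w-1, and the mapping from the ratio to L is not reproduced here, so it is deliberately left out. The name `window_statistics` is hypothetical.

```python
from statistics import fmean, pvariance


def window_statistics(values):
    """Return (mean, variance) of the w reference values in one window.

    Assumption: population variance, i.e. the squared deviations from
    the mean are divided by w rather than by w - 1.
    """
    samples = list(values)
    if not samples:
        raise ValueError("the window must contain at least one value")
    mu = fmean(samples)
    var = pvariance(samples, mu)
    return mu, var
```

For the window `[2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]` this yields a mean of 5.0 and a variance of 4.0.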

E. Compare the mean μ_t and the variance. If the mean μ_t is large and the variance small, choose the temporally nearest image as reference; if the mean μ_t is small and the variance large, choose a high-quality image rather than the temporally nearest one. Reset the reference image set of frame t to a new set according to the chosen reference images.

F. According to the new reference image set, the quantization parameter used to code each frame is offset according to how that frame is referenced. Consistently with the reset reference image set, the QP value used to code each frame is offset as shown in formula (11):

QP(pic_t) = QP_I + QP_offset - Q(t,4) - Q(t,L) (11)

In formula (11), QP_I is the base QP value given as encoder input, QP_offset is the maximum offset of the frame coding QP and is usually fixed at 3, and Q(·,·) is the QP offset correction applied according to inter-frame referencing.
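Formula (11) can be sketched as a small function that takes the correction term Q as a parameter. The definition of Q(·,·) is not visible in this text, so `example_q` below is purely a hypothetical stand-in (1 when t is a multiple of the period, 0 otherwise) used to make the sketch runnable; it should not be read as the patent's formula. The names `frame_qp` and `example_q` are likewise illustrative.

```python
def frame_qp(t: int, L: int, qp_base: int, q, qp_offset: int = 3) -> int:
    """QP(pic_t) = QP_I + QP_offset - Q(t, 4) - Q(t, L), as in formula (11).

    q is a callable q(t, period) supplying the correction term Q.
    """
    return qp_base + qp_offset - q(t, 4) - q(t, L)


def example_q(t: int, period: int) -> int:
    """HYPOTHETICAL placeholder for Q: 1 on multiples of period, else 0."""
    return 1 if t % period == 0 else 0
```

With this placeholder, a base QP of 32 and L = 8, frame 5 is coded at QP 35, frame 12 at QP 34, and frame 16 at QP 33, so pictures that sit on both periods receive the lowest QP.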

Claims

1. An adaptive reference image selection method, characterized by comprising:

A. for a frame of the image sequence being coded, selecting several already coded images as references to obtain the reference image set R_Set(t) of frame t, the reference image set of frame t containing the temporally adjacent reconstructed reference image pic_{t-1} and the three most recent consecutive forward key images;

B. following the configuration of the reference image set R_Set(t), taking the image pic_k at time index k as a reference image and computing the number of bits consumed by frame i when referencing pic_k, the number of bits consumed being approximately equal to the sum of the bits accumulated over all blocks for the reference image pic_k, and thereby obtaining the reference value of the reference image pic_k at time index k;

wherein the number of bits consumed by frame i when referencing pic_k is computed by first computing the reference value v(·|·) produced by coding against the reference image pic_k, where |k-t| is the absolute inter-frame reference distance, |Δx_b| and |Δy_b| are the absolute horizontal and vertical components of the inter-prediction motion vector of block b of frame i, B is the total number of blocks in frame i, SAD_b is the residual sum of block b after motion compensation, qstep_k is the quantization step size used when the reference image pic_k at time index k was itself coded, and Bit(b) is the number of bits of block b of the reference image pic_k; and then computing, from the resulting reference value v(·|·), the influence I(·) of the reference image pic_k, where n is the number of subsequent images coded in dependence on pic_k;

C. using said calculation of the reference value, obtaining the reference value of every image in the sequence, and from these the reference value set IP_Set(t) of the past w consecutive images, where t > w and t mod w = 0;

D. computing the mean and variance of the w reference values in the reference value set, and setting the maximum inter-frame reference distance L from the ratio of the mean and the variance;

E. comparing the mean and the variance: if the mean is large and the variance small, choosing the temporally nearest image as reference; if the mean is small and the variance large, choosing a high-quality image rather than the temporally nearest image as reference; and resetting the reference image set of frame t to a new set according to the chosen reference images;

F. according to the new reference image set, offsetting the quantization parameter used to code each frame according to how that frame is referenced.

2. The adaptive reference image selection method of claim 1, characterized in that the reference image set of frame t in step A is the set defined by formula (2) of the description, where pic denotes a reference image.

3. The adaptive reference image selection method of claim 1, characterized in that the reference value set of the past w consecutive images in step C is IP_Set(t) = {I(pic_i) | i ∈ [t-w, t-1]}.

4. The adaptive reference image selection method of claim 1, characterized in that the mean of the w reference values in step D is μ_t = (1/w) · Σ_{i=t-w}^{t-1} I(pic_i), and the variance is taken over the same w values about μ_t.

5. The adaptive reference image selection method of claim 4, characterized in that the maximum inter-frame reference distance L is set from the ratio of said mean and variance.

6. The adaptive reference image selection method of claim 5, characterized in that step E resets the reference image set of frame t to a new set built from the reference images selected in that step, and the offset is set as QP(pic_t) = QP_I + QP_offset - Q(t,4) - Q(t,L), where QP_I is the base value of the quantization parameter QP given as encoder input, QP_offset is the maximum offset of the quantization parameter QP, and Q(·,·) is the QP offset correction term.