CN114666585A - A rate control method and device based on visual perception - Google Patents

A rate control method and device based on visual perception

Info

Publication number
CN114666585A
Authority
CN
China
Prior art keywords
lcu
motion
texture
rich
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210171159.4A
Other languages
Chinese (zh)
Inventor
刘鹏飞
温安君
刘国正
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ASR Microelectronics Co Ltd
Original Assignee
ASR Microelectronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ASR Microelectronics Co Ltd
Priority to CN202210171159.4A
Publication of CN114666585A
Priority to PCT/CN2022/123742 (WO2023159965A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H04N19/198 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters including smoothing of a sequence of encoding parameters, e.g. by averaging, by choice of the maximum, minimum or median value
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/40 Analysis of texture
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/004 Predictors, e.g. intraframe, interframe coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/154 Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/177 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a group of pictures [GOP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/517 Processing of motion vectors by encoding
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/557 Motion estimation characterised by stopping computation or iteration based on certain criteria, e.g. error magnitude being too large or early exit

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a rate control method based on visual perception, comprising the following steps. Step S10: calculate the gradient magnitude of the texture-rich area inside the LCU and take the result as the texture perception factor of the LCU. Step S20: calculate the product of the frame difference magnitude and the area ratio of the motion-rich area inside the LCU and take the result as the motion perception factor of the LCU. Steps S10 and S20 may be performed in either order or simultaneously. Step S30: calculate the bit allocation weight of the LCU from its texture perception factor and motion perception factor, and from that weight calculate the target number of coded bits of the LCU. The technical effect of the invention is that the target bit allocation among the LCUs of an image frame better matches the subjective perception of the human eye, which improves subjective visual quality, and that the method is suitable for hardware implementation.

Description

A rate control method and device based on visual perception

Technical field

The present invention relates to video coding technology, and in particular to a rate control method and device that are based on visual perception and suitable for hardware implementation.

Background

Video coding is a technique that compresses the redundant components of video images so that the video information is represented with as little data as possible. HEVC (High Efficiency Video Coding, also known as H.265) is a new-generation video coding standard. Compared with the previous-generation standard AVC (Advanced Video Coding, also known as H.264), HEVC reduces the coding bit rate by about 50% at the same video quality, which roughly doubles the compression performance of AVC. Because video coding algorithms are computationally intensive, it has become common practice in the industry to accelerate the encoding process in hardware using an application-specific integrated circuit (ASIC).

Video coding uses image blocks as the basic coding unit. In HEVC, the basic coding unit is the CU (Coding Unit). A CU may be an image block of 64×64, 32×32, 16×16, or 8×8 pixels. The 64×64-pixel image block is also called an LCU (Largest Coding Unit).

In practical applications, the bandwidth of the channel used to transmit compressed video is limited. If the coding bit rate of the compressed video is too high and exceeds the channel capacity, transmission becomes congested and packets may be lost. If the coding bit rate is too low, the channel bandwidth is underused and higher video quality is given up. Rate control is therefore needed to regulate the output bit rate of the video encoder so that it matches the channel capacity.

The purpose of rate control is to adjust the encoding parameters of the video encoder so that its output bit rate equals the preset target bit rate, while reducing coding distortion as much as possible to improve video quality. The rate control algorithm currently used in the HEVC reference encoder HM (HEVC Test Model) is based on proposal JCTVC-K0103. That proposal establishes a mathematical model relating the coding bit rate R to the Lagrange multiplier λ (the R-λ model) and performs rate control in two stages: target bit allocation and target bit control.

In the JCTVC-K0103 proposal, target bit allocation is performed at three levels: the GOP (Group of Pictures, a set of temporally consecutive image frames) level, the picture level, and the basic coding unit level. To reduce computational complexity, the LCU is generally chosen as the basic unit for target bit allocation at the basic coding unit level, so this level is commonly called LCU-level target bit allocation.

After target bit allocation at the GOP level and the picture level, the target number of coded bits for the current frame to be encoded is fixed. The next step is LCU-level target bit allocation, which determines the bit allocation weight of each LCU in the current frame and assigns each LCU a target number of coded bits according to its weight. In the JCTVC-K0103 proposal, LCU-level bit allocation follows the formula below.

$$T_{LCU\_curr} = \frac{T_{Pic} - Bit_{H} - Coded_{Pic}}{\sum_{\{AllNotCodedLCUs\}} \omega_{LCU}} \times \omega_{LCU\_curr}$$

where $T_{LCU\_curr}$ is the target number of coded bits allocated to the current LCU to be encoded, $T_{Pic}$ is the target number of coded bits allocated to the current video frame to be encoded (a video frame is also called an image frame), $Bit_{H}$ is the pre-estimated number of bits required for the frame header information, $Coded_{Pic}$ is the actual number of bits already spent on the coded LCUs of the current frame, $\omega_{LCU\_curr}$ is the bit allocation weight of the current LCU, $\omega_{LCU}$ denotes the bit allocation weight of an LCU in general, and $\sum_{\{AllNotCodedLCUs\}} \omega_{LCU}$ is the sum of the bit allocation weights of all uncoded LCUs in the current frame.

The formula shows that the core of LCU-level bit allocation is the bit allocation weight $\omega_{LCU}$. Once the weights of all LCUs in the current frame have been calculated, the weights of the uncoded LCUs are accumulated to obtain their sum. The share of each LCU's weight in that sum then gives the LCU's target number of coded bits, which completes the LCU-level target bit allocation.
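
As a rough illustration of the proportional allocation described above, the following Python sketch computes the target bits for one LCU. The function name `lcu_target_bits` and its parameter names are illustrative assumptions; they are not taken from JCTVC-K0103 or from the HM source code.

```python
def lcu_target_bits(t_pic, bit_h, coded_pic, w_curr, uncoded_weights):
    """Target bits for the current LCU under the proportional rule above.

    t_pic            target bits of the current frame (T_Pic)
    bit_h            estimated header bits (Bit_H)
    coded_pic        bits already spent on coded LCUs of this frame (Coded_Pic)
    w_curr           weight of the current LCU (omega_LCU_curr)
    uncoded_weights  weights of all not-yet-coded LCUs
                     (assumed here to include the current LCU)
    """
    total = sum(uncoded_weights)
    if total <= 0:
        raise ValueError("sum of uncoded LCU weights must be positive")
    remaining = t_pic - bit_h - coded_pic  # bits left for the uncoded LCUs
    return remaining * w_curr / total


# 800 bits remain and the current LCU holds 1/4 of the remaining weight.
bits = lcu_target_bits(t_pic=1000, bit_h=100, coded_pic=100,
                       w_curr=1.0, uncoded_weights=[1.0, 3.0])
```

Because the denominator covers only the uncoded LCUs, any overshoot or undershoot on earlier LCUs is automatically redistributed over the LCUs that remain.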

In the JCTVC-K0103 proposal, the bit allocation weight $\omega_{LCU}$ is calculated from the prediction error MAD (Mean Absolute Difference) of the co-located LCU (the LCU at the same position) in the previous coded frame, as follows.

$$\omega_{LCU} = MAD_{LCU}^{2}$$

where $\omega_{LCU}$ is the bit allocation weight of the LCU and $MAD_{LCU}$ is the prediction error MAD of the co-located LCU in the previous coded frame.

$$MAD_{LCU} = \frac{1}{N_{pixels}} \sum_{\{AllPixelsInLCU\}} \left| P_{org} - P_{pred} \right|$$

where $N_{pixels}$ is the number of pixels in the LCU, $\sum_{\{AllPixelsInLCU\}}$ denotes summation over all pixels in the LCU, $P_{org}$ is the luminance value of the original pixel, and $P_{pred}$ is the luminance value of the predicted pixel.
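
A minimal Python sketch of the MAD-based weight follows. The MAD computation is taken directly from the definitions above. The squaring step reflects the weight form commonly attributed to JCTVC-K0103 and should be treated as an assumption of this sketch, as should the helper names `mad` and `mad_weight`.

```python
def mad(orig, pred):
    """Mean absolute difference between original and predicted luma samples."""
    if len(orig) != len(pred) or not orig:
        raise ValueError("orig and pred must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(orig, pred)) / len(orig)


def mad_weight(orig, pred):
    """Bit allocation weight taken as MAD squared (assumed form, see lead-in)."""
    m = mad(orig, pred)
    return m * m


# Absolute differences are 2, 2, 0, 4, so MAD = 8 / 4 = 2.0.
w = mad_weight([10, 20, 30, 40], [12, 18, 30, 44])
```

In the scheme described above the samples come from the co-located LCU of the previous coded frame, so the weight is known before the current LCU is encoded.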

When viewing video, the human eye is guided by subjective perception: it attends more to areas with complex texture and rich detail and less to flat areas with little texture. It likewise attends more to areas with intense motion and rapid change and less to static areas. In other words, motion-rich and texture-rich areas of a video image attract more visual attention. Given this characteristic, and with the target bit rate of the frame fixed, it is desirable during encoding to allocate more target coded bits to the LCUs in motion-rich and texture-rich areas so as to improve perceived visual quality. To do so, the rate control algorithm must detect motion-rich and texture-rich areas and calculate the LCU-level bit allocation weights accordingly, so that more target coded bits go to the LCUs inside the areas the eye attends to.

A study of the HEVC rate control technique based on JCTVC-K0103 shows that the LCU-level bit allocation weight is calculated from the prediction error MAD of the co-located LCU in the previous coded frame. This measures only the luminance difference between pixels from a signal-processing standpoint; it does not take the subjective visual perception of the human eye into account and therefore cannot allocate more target coded bits to the areas the eye attends to. The LCU-level target bit allocation in HEVC rate control therefore needs to be improved: factors that characterize human visual perception should be selected to calculate the LCU-level bit allocation weights, so that LCUs in motion-rich and texture-rich areas of the frame receive more target coded bits and perceived visual quality improves. Such a visual perception factor must represent the degree of subjective attention the eye pays to the image content, and it must have the following properties:
(1) For areas the eye attends to, such as motion-rich and texture-rich areas, the factor takes a larger value; the greater the motion magnitude and texture complexity, the larger the value.
(2) For areas the eye does not attend to, such as areas with sparse motion and simple texture, the factor takes a smaller value; the smaller the motion magnitude and the simpler the texture, the smaller the value.

Some existing technical solutions already improve HEVC rate control with the aim of improving perceived visual quality. Most of them likewise start by finding factors that characterize human visual perception, use those factors to calculate LCU-level bit allocation weights, and allocate more target coded bits to the LCUs inside the areas the eye attends to. These existing solutions, however, generally have the following problems.

The first type of prior-art solution preprocesses the whole frame before the current frame is encoded. The preprocessing calculates the bit allocation weight of every LCU in the current frame from the selected visual perception factor, accumulates the weights to obtain their sum over the frame, and then derives the target number of coded bits of each LCU from the share of its weight in that sum. Such solutions require a preprocessing stage before encoding, because the target bits of any LCU can be calculated only after the weights of all LCUs in the frame are known. Preprocessing a whole frame takes considerable time, and the higher the resolution, the longer it takes, which introduces a large frame-level coding delay. In addition, because preprocessing and encoding are separate passes, each must read the image from memory, which consumes substantial extra bus bandwidth. Solutions of this type are therefore suitable for software encoders but not for hardware encoders.

The second type of prior-art solution exploits the correlation between video frames and uses the bit allocation weight of each LCU in the previous frame as the weight of the co-located LCU in the current frame. While each LCU of the previous frame is being encoded, its bit allocation weight is calculated from the visual perception factor at the same time, so that when the previous frame has been encoded, the weights of all its LCUs are available. Each LCU of the current frame then takes the weight of its co-located LCU in the previous frame. Such solutions need no extra preprocessing stage, have good real-time behaviour, and consume no extra bus bandwidth. However, because the weights used for the current frame are derived entirely from the image content of the previous frame, a large change between frames causes the weights to mismatch the areas to which the eye is actually sensitive, and the LCU bit allocation error becomes large.

Beyond the problems above, the algorithms that many prior-art solutions use to compute their visual perception factors are complex and computationally expensive, consume considerable computing and bandwidth resources, and are not suitable for hardware implementation.

Summary of the invention

The technical problem to be solved by the present invention is to provide an HEVC rate control method and device that are based on the subjective visual perception of the human eye and suitable for hardware implementation.

To solve the above technical problem, the present invention proposes a rate control method based on visual perception, comprising the following steps. Step S10: calculate the gradient magnitude of the texture-rich area inside the LCU and take the result as the texture perception factor of the LCU. Step S20: calculate the product of the frame difference magnitude and the area ratio of the motion-rich area inside the LCU and take the result as the motion perception factor of the LCU. Steps S10 and S20 may be performed in either order or simultaneously. Step S30: calculate the bit allocation weight of the LCU from its texture perception factor and motion perception factor, and from that weight calculate the target number of coded bits of the LCU.

Further, the texture perception factor of the LCU characterizes the visual sensitivity of the human eye to the texture of the LCU. The larger the factor, the richer the texture inside the LCU and the more attention it draws; the smaller the factor, the flatter the texture and the less attention it draws.

Further, the motion perception factor of the LCU characterizes the visual sensitivity of the human eye to the motion of the LCU. The larger the factor, the richer the motion inside the LCU and the more attention it draws; the smaller the factor, the slighter the motion and the less attention it draws.

Further, calculating the texture perception factor of the LCU in step S10 specifically includes the following steps.
Step S11: calculate the gradient magnitude $Grad_{x,y}$ of every pixel inside the LCU, and use the gradient decision threshold for texture-rich pixels, obtained from the gradient information of the previous frame, to decide whether each pixel inside the LCU is a texture-rich pixel; the area formed by the texture-rich pixels is the texture-rich area of the LCU.
Step S12: calculate the gradient magnitude $Grad_{LCU}$ of the texture-rich area inside the LCU; while calculating $Grad_{LCU}$ for the LCUs of the current frame, record the maximum value $G_{max}$.
Step S13: normalize $Grad_{LCU}$, using the maximum gradient magnitude of the texture-rich areas of the LCUs in the previous frame as the normalization reference for the current frame; the normalized value is the texture perception factor of the LCU.
Step S14: use the mean pixel gradient of the current frame to calculate the gradient decision threshold $Grad_{Thr}$ for texture-rich pixels in the next frame.

Further, in step S11, if $Grad_{x,y} > Grad_{Thr}$, the pixel is judged to be a texture-rich pixel; otherwise it is judged not to be. $Grad_{Thr}$ is the gradient decision threshold for texture-rich pixels obtained from the gradient information of the previous frame.

Further, in step S12,

$$Grad_{LCU} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} Grad_{x,y}, \quad Grad_{x,y} > Grad_{Thr}$$

where M is the horizontal width of the LCU, N is the vertical height of the LCU, and $Grad_{LCU}$ is the sum of the gradient magnitudes of all texture-rich pixels inside the LCU.
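
The thresholded sum of steps S11 and S12 can be sketched in Python as follows. This section does not specify the operator that produces the per-pixel gradient magnitudes, so the sketch takes them as a precomputed 2-D list; the name `grad_lcu` is illustrative.

```python
def grad_lcu(grad, grad_thr):
    """Sum of gradient magnitudes over texture-rich pixels.

    grad      2-D list of per-pixel gradient magnitudes for one LCU
              (N rows of M values; how they are computed is not shown here)
    grad_thr  decision threshold carried over from the previous frame
    """
    # A pixel is texture-rich only if its gradient strictly exceeds the threshold.
    return sum(g for row in grad for g in row if g > grad_thr)


lcu = [[1, 5],
       [7, 2]]
total = grad_lcu(lcu, grad_thr=4)  # only 5 and 7 exceed the threshold
```

The classification and the accumulation happen in the same pass over the pixels, so no per-pixel mask has to be stored.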

Further, in step S13, the normalized value of $Grad_{LCU}$ is calculated as

[Equation image BDA0003517687680000051: normalization of $Grad_{LCU}$ with respect to $G_{max}$]

where $G_{max}$ is the maximum of $Grad_{LCU}$ over the LCUs of the previous frame. The normalized value is an integer in [0, ZZ]: the integers 0 to ZZ represent the normalized range 0 to 1, and ZZ is the maximum normalized value.
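
One way to realize an integer normalization onto [0, ZZ] is sketched below in Python. The text fixes only the output range and the use of the previous frame's maximum as reference. The truncating integer division, the saturation at ZZ (needed because a current-frame value can exceed the previous frame's maximum), the handling of a zero reference, and the default ZZ = 255 are all assumptions of this sketch.

```python
def normalize_to_zz(value, ref_max, zz=255):
    """Map value / ref_max onto the integers 0..zz, saturating at zz.

    value    non-negative integer, e.g. Grad_LCU of the current LCU
    ref_max  non-negative integer reference, e.g. G_max of the previous frame
    zz       largest normalized value (255 is a hypothetical choice)
    """
    if ref_max <= 0:
        return 0  # assumed fallback when no usable reference exists yet
    # Integer-only arithmetic; the clamp keeps the result inside [0, zz].
    return min(zz, (value * zz) // ref_max)


factor = normalize_to_zz(50, 100)  # half of the reference maximum
```

Keeping the result as a small integer rather than a fraction avoids floating-point arithmetic, which is in line with the stated goal of hardware implementation.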

Further, in step S14, $Grad_{Thr} = Grad_{avg} + Grad_{offset}$ or $Grad_{Thr} = \alpha \times Grad_{avg}$, where $Grad_{offset}$ is the gradient threshold adjustment offset, $\alpha$ is the gradient threshold adjustment multiplier, and $Grad_{avg}$ is the mean pixel gradient of the current frame:

$$Grad_{avg} = \frac{1}{W \times H} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} Grad_{x,y}$$

where W is the horizontal width of the current image frame and H is its vertical height.
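
The threshold update of step S14 can be sketched as below. Both variants named in the text (additive offset and multiplier) are shown; the function names and the rule that exactly one of `offset` and `alpha` must be supplied are illustrative assumptions.

```python
def mean_gradient(frame_grad):
    """Mean of the per-pixel gradient magnitudes over a whole frame (Grad_avg)."""
    height = len(frame_grad)
    width = len(frame_grad[0])
    return sum(sum(row) for row in frame_grad) / (width * height)


def next_threshold(grad_avg, offset=None, alpha=None):
    """Threshold for the next frame: grad_avg + offset, or alpha * grad_avg."""
    if (offset is None) == (alpha is None):
        raise ValueError("give exactly one of offset or alpha")
    if offset is not None:
        return grad_avg + offset
    return alpha * grad_avg


avg = mean_gradient([[1, 2], [3, 6]])       # (1 + 2 + 3 + 6) / 4
thr_add = next_threshold(avg, offset=2)
thr_mul = next_threshold(avg, alpha=1.5)
```

Since the threshold applied to a frame is derived from the frame before it, the mean can be accumulated while the current frame is being encoded, without a separate pass over the image.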

Further, calculating the motion perception factor of the LCU in step S20 specifically includes the following steps.
Step S21: calculate the frame difference magnitude $Diff_{x,y}$ between every pixel inside the LCU and the corresponding pixel of the co-located LCU in the previous frame, and use the frame difference decision threshold for motion-rich pixels, obtained from the frame difference information of the previous frame, to decide whether each pixel inside the LCU is a motion-rich pixel; the area formed by the motion-rich pixels is the motion-rich area of the LCU.
Step S22: calculate the frame difference magnitude $Diff_{LCU}$ of the motion-rich area inside the LCU; while calculating $Diff_{LCU}$ for the LCUs of the current frame, record the maximum value $D_{max}$.
Step S23: calculate the area ratio $Area_{LCU}$ of the motion-rich area inside the LCU. Steps S22 and S23 may be performed in either order or simultaneously.
Step S24: calculate the normalized value of the product of the frame difference magnitude and the area ratio of the motion-rich area inside the LCU, using the maximum frame difference magnitude of the motion-rich areas of the LCUs in the previous frame as the normalization reference for the current frame; the normalized value is the motion perception factor of the LCU.
Step S25: use the mean frame difference magnitude of the current frame to calculate the frame difference decision threshold $Diff_{Thr}$ for motion-rich pixels in the next frame.

Further, in step S21, if $Diff_{x,y} > Diff_{Thr}$, the pixel is determined to be a motion-rich pixel; otherwise it is determined not to be one. $Diff_{Thr}$ is the frame difference decision threshold for motion-rich pixels obtained from the frame difference information of the previous frame.

Further, in step S22,

$$Diff_{LCU} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} Diff_{x,y}, \quad Diff_{x,y} > Diff_{Thr}$$

where $M$ is the horizontal width of the LCU, $N$ is the vertical height of the LCU, and $Diff_{LCU}$ is the sum of the frame difference amplitudes of all motion-rich pixels inside the LCU.

Further, in step S23,

$$Area_{LCU} = \frac{ZZ \times \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} Mov_{x,y}}{M \times N}$$

where $Area_{LCU}$ is an integer in $[0, ZZ]$, corresponding to an area ratio of 0% to 100%; $M$ is the horizontal width of the LCU and $N$ is the vertical height of the LCU;

$$Mov_{x,y} = \begin{cases} 1, & Diff_{x,y} > Diff_{Thr} \\ 0, & \text{otherwise} \end{cases}$$

that is, $Mov_{x,y}$ is 1 if the pixel is a motion-rich pixel and 0 otherwise.

Further, in step S24,

$$\overline{Diff_{LCU} \times Area_{LCU}} = \min\left(ZZ,\ \frac{Diff_{LCU} \times Area_{LCU}}{D_{max}}\right)$$

where $\overline{Diff_{LCU} \times Area_{LCU}}$ is an integer in $[0, ZZ]$; integers between 0 and $ZZ$ represent the normalized range 0 to 1, and $ZZ$ represents the normalized maximum. $D_{max}$ is the maximum of the $Diff_{LCU}$ values of the LCUs in the previous frame.

Further, in step S25, $Diff_{Thr} = Diff_{avg} + Diff_{offset}$ or $Diff_{Thr} = \beta \times Diff_{avg}$, where $Diff_{offset}$ is the motion threshold adjustment offset, $\beta$ is the motion threshold adjustment multiplier, and $Diff_{avg}$ is the mean frame difference amplitude of the current frame:

$$Diff_{avg} = \frac{1}{W \times H} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} Diff_{x,y}$$

where $W$ is the horizontal width of the image frame and $H$ is the vertical height of the image frame.

Further, in step S30, the bit allocation weight of the LCU is calculated from the texture perception factor $K_T$ and the motion perception factor $K_M$ of the LCU as $\omega_{LCU} = \mu_T \times K_T + \mu_M \times K_M$, where $\omega_{LCU}$ is the bit allocation weight of the LCU, an integer in $[0, ZZ]$, with integers between 0 and $ZZ$ representing 0 to 1; $\mu_T$ is the weight coefficient of the LCU texture perception factor and $\mu_M$ is the weight coefficient of the LCU motion perception factor, and the two satisfy $\mu_T + \mu_M = 1$, $0 < \mu_T < 1$, and $0 < \mu_M < 1$.

Further, in step S30, after the bit allocation weight $\omega_{LCU}$ of an LCU in the current frame is calculated, the sum of the LCU bit allocation weights collected over the previous frame is used in place of the sum of the LCU bit allocation weights of the current frame, so that the share $\overline{\omega_{LCU}}$ of the LCU's bit allocation weight within the whole image frame is calculated in real time, and the target number of coding bits of the LCU is then calculated from it:

$$\overline{\omega_{LCU}} = \frac{\omega_{LCU}}{\sum_{i} \omega_{LCU,i}^{prev}}$$

where the sum runs over all LCUs of the previous frame.

Preferably, $ZZ$ is one of 127, 255, 511, or 1023.

The present application also proposes a rate control device based on visual perception, comprising a texture perception factor calculation module, a motion perception factor calculation module, and an LCU bit allocation module. The texture perception factor calculation module calculates the gradient magnitude of the texture-rich region inside the LCU and uses the result as the texture perception factor of the LCU. The motion perception factor calculation module calculates the product of the frame difference amplitude and the area ratio of the motion-rich region inside the LCU and uses the result as the motion perception factor of the LCU. The LCU bit allocation module calculates the bit allocation weight of the LCU from its texture perception factor and motion perception factor, and from that weight calculates the target number of coding bits of the LCU.

The technical effect achieved by the present invention is that the target bit allocation of the LCUs within an image frame better matches the characteristics of subjective human visual perception, which improves subjective visual quality, and that the method is suitable for hardware implementation.

Description of the drawings

FIG. 1 is a schematic flowchart of the rate control method proposed by the present invention.

FIG. 2 is a schematic structural diagram of the rate control device proposed by the present invention.

FIG. 3 is a detailed flowchart of calculating the texture perception factor of the LCU in step S10.

FIG. 4 is a detailed flowchart of calculating the motion perception factor of the LCU in step S20.

Reference numerals in the figures: 10 is the texture perception factor calculation module, 20 is the motion perception factor calculation module, and 30 is the LCU bit allocation module.

Detailed description

Referring to FIG. 1, the rate control method proposed by the present invention includes the following steps.

Step S10: calculate the gradient magnitude of the texture-rich region inside the LCU, and use the result as the texture perception factor of the LCU. The texture perception factor characterizes how visually sensitive the human eye is to the texture features of the LCU.

Step S20: calculate the product of the frame difference amplitude and the area ratio of the motion-rich region inside the LCU, and use the result as the motion perception factor of the LCU. The motion perception factor characterizes how visually sensitive the human eye is to the motion features of the LCU. Calculating the motion perception factor from this product is one innovation of the present invention. The inventors found experimentally that if only the frame difference amplitude of the motion-rich region inside the LCU is used to characterize how motion-rich the LCU is, the measure is overly sensitive to small-area motion and is easily disturbed by sensor noise, which makes the decision inaccurate. If only the area ratio of the motion-rich region inside the LCU is used, the measure is very sensitive to the subtle changes in illumination and shadow that foreground motion causes in local image regions, which does not match the attention characteristics of the human eye. Using the product of the frame difference amplitude and the area ratio of the motion-rich region overcomes the drawbacks of either single measure and reflects more accurately the visual attention that the human eye pays to moving regions.

The order of steps S10 and S20 is not strictly limited; either may be performed first, or both may be performed simultaneously.

Step S30: calculate the bit allocation weight of the LCU from its texture perception factor and motion perception factor, and from that weight calculate the target number of coding bits of the LCU.

With the rate control method proposed by the present invention, the bit allocation of the LCUs within an image frame better matches the characteristics of subjective human visual perception, which improves subjective visual quality.

Referring to FIG. 2, the rate control device proposed by the present invention includes a texture perception factor calculation module 10, a motion perception factor calculation module 20, and an LCU bit allocation module 30, and as a whole corresponds to the rate control method shown in FIG. 1. The texture perception factor calculation module 10 calculates the gradient magnitude of the texture-rich region inside the LCU and uses the result as the texture perception factor of the LCU. The motion perception factor calculation module 20 calculates the product of the frame difference amplitude and the area ratio of the motion-rich region inside the LCU and uses the result as the motion perception factor of the LCU. The LCU bit allocation module 30 calculates the bit allocation weight of the LCU from its texture perception factor and motion perception factor, and from that weight calculates the target number of coding bits of the LCU.

Referring to FIG. 3, calculating the texture perception factor of the LCU in step S10 specifically includes the following steps.

Step S11: calculate the gradient magnitude $Grad_{x,y}$ of each pixel inside the LCU, and use the gradient decision threshold for texture-rich pixels, obtained from the gradient information of the previous frame, to decide whether each pixel inside the LCU is a texture-rich pixel. If $Grad_{x,y} > Grad_{Thr}$, the pixel is determined to be a texture-rich pixel; otherwise it is determined not to be one. $Grad_{Thr}$ is the gradient decision threshold for texture-rich pixels obtained from the gradient information of the previous frame; see step S14 below. The region formed by the texture-rich pixels inside the LCU is the texture-rich region of the LCU.

Step S12: calculate the gradient magnitude $Grad_{LCU}$ of the texture-rich region inside the LCU as follows.

$$Grad_{LCU} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} Grad_{x,y}, \quad Grad_{x,y} > Grad_{Thr}$$

Here $M$ is the horizontal width of the LCU and $N$ is the vertical height of the LCU, both in pixels. $Grad_{LCU}$ is the sum of the gradient magnitudes of all texture-rich pixels inside the LCU and reflects the gradient magnitude of the texture-rich region inside the LCU. While $Grad_{LCU}$ is calculated for each LCU in the current frame, the maximum value $G_{max}$ is recorded for use in calculating the normalized value $\overline{Grad_{LCU}}$ of $Grad_{LCU}$ for the LCUs of the next frame.
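Steps S11 and S12 can be sketched as follows. This is a minimal Python illustration rather than the hardware design: the function names `grad_lcu` and `frame_grad_stats` are hypothetical, the per-pixel gradient magnitudes are taken as given, and an LCU is represented as a list of rows.

```python
def grad_lcu(grad, grad_thr):
    """Sum the gradient magnitudes of texture-rich pixels in one LCU (step S12).

    grad is a list of rows of per-pixel gradient magnitudes. A pixel counts as
    texture-rich only if its magnitude is strictly greater than grad_thr (step S11).
    """
    return sum(g for row in grad for g in row if g > grad_thr)


def frame_grad_stats(lcus, grad_thr):
    """Return every LCU's Grad_LCU plus the frame maximum G_max.

    G_max is kept so that the next frame can use it for normalization.
    """
    sums = [grad_lcu(lcu, grad_thr) for lcu in lcus]
    return sums, max(sums, default=0)


lcus = [
    [[1, 5], [9, 3]],  # two texture-rich pixels: 5 and 9
    [[2, 2], [4, 1]],  # nothing strictly above the threshold
]
sums, g_max = frame_grad_stats(lcus, grad_thr=4)
```

With a threshold of 4, the first toy LCU contributes 5 + 9 and the second contributes nothing, because the comparison is strict.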

Step S13: normalize the gradient magnitude $Grad_{LCU}$ of the texture-rich region inside the LCU to obtain $\overline{Grad_{LCU}}$, using the maximum gradient magnitude of the texture-rich regions of the LCUs in the previous frame as the normalization reference for the current frame. The normalized value $\overline{Grad_{LCU}}$ is the texture perception factor of the LCU and is calculated as follows.

$$\overline{Grad_{LCU}} = \min\left(ZZ,\ \frac{Grad_{LCU} \times ZZ}{G_{max}}\right)$$

Here $\overline{Grad_{LCU}}$ is the normalized value of $Grad_{LCU}$ and is an integer in $[0, ZZ]$, where the square brackets denote a closed interval. $G_{max}$ is the maximum of the $Grad_{LCU}$ values of the LCUs in the previous frame. In any given application scenario, $ZZ$ is a fixed value. To simplify the normalization in hardware and avoid floating-point arithmetic as far as possible, integers between 0 and $ZZ$ represent the normalized range 0 to 1, and $ZZ$ represents the normalized maximum. To preserve calculation accuracy without greatly increasing the amount of computation, $ZZ$ is preferably 127, 255, 511, or 1023, which is convenient for hardware implementation.
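The integer normalization of step S13 can be illustrated with a short sketch. The helper name `normalize_to_zz` is hypothetical, and three details are assumptions of this sketch rather than statements of the source: floor division for rounding, clipping at `zz` when the current value exceeds the previous frame's maximum, and returning 0 when no previous maximum is available.

```python
def normalize_to_zz(value, prev_max, zz=255):
    """Map value onto the integer range [0, zz] using the previous frame's maximum."""
    if prev_max <= 0:
        # Assumption: with no usable reference (for example the first frame), report 0.
        return 0
    # Integer multiply and floor division only. The result is clipped because
    # the current frame may exceed the previous frame's maximum.
    return min(zz, (value * zz) // prev_max)


k_t_half = normalize_to_zz(50, 100)   # half of the reference
k_t_over = normalize_to_zz(200, 100)  # above the reference, clipped to zz
```

Only integer operations are used, which matches the stated goal of avoiding floating-point arithmetic in hardware.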

Hereinafter, the texture perception factor of the LCU is abbreviated as $K_T$, which takes values in $[0, ZZ]$. The larger $K_T$ is, the richer the texture inside the LCU and the more attention it draws from subjective human vision; the smaller $K_T$ is, the flatter the texture inside the LCU and the less attention it draws.

Step S14: use the mean pixel gradient of the current frame to calculate the gradient decision threshold $Grad_{Thr}$ for texture-rich pixels in the next frame, using either of the following formulas: $Grad_{Thr} = Grad_{avg} + Grad_{offset}$ or $Grad_{Thr} = \alpha \times Grad_{avg}$. Here $Grad_{offset}$ is the gradient threshold adjustment offset and $\alpha$ is the gradient threshold adjustment multiplier; both can be tuned according to how sensitive the user is to textured image regions. $Grad_{avg}$ is the mean pixel gradient of the current frame, calculated as follows.

$$Grad_{avg} = \frac{1}{W \times H} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} Grad_{x,y}$$

Here $W$ is the horizontal width of the current image frame and $H$ is its vertical height, both in pixels.
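The two threshold variants of step S14 can be compared in a small sketch. The function names are hypothetical, the frame is a toy 2×2 grid of gradient magnitudes, and the offset and multiplier values are arbitrary illustrations of the user-tunable parameters.

```python
def frame_mean(mags):
    """Mean of a W x H grid of per-pixel magnitudes, given as a list of rows."""
    total = sum(sum(row) for row in mags)
    count = sum(len(row) for row in mags)
    return total / count


def threshold_additive(avg, offset):
    """Grad_Thr = Grad_avg + Grad_offset."""
    return avg + offset


def threshold_multiplicative(avg, multiplier):
    """Grad_Thr = alpha * Grad_avg."""
    return multiplier * avg


frame_grad = [[2, 4], [6, 8]]
grad_avg = frame_mean(frame_grad)                   # mean over all four pixels
thr_add = threshold_additive(grad_avg, 3)           # offset form
thr_mul = threshold_multiplicative(grad_avg, 1.5)   # multiplier form
```

The additive form shifts the threshold by a fixed amount regardless of content, while the multiplicative form scales with the frame's overall gradient level.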

In calculating the texture perception factor $K_T$ of the LCU, the present invention has the following innovations. (1) It exploits the correlation in image content between consecutive video frames: the mean pixel gradient collected over the previous frame is combined with a user adjustment parameter to form the gradient decision threshold for texture-rich pixels in the current frame, so that the decision whether a pixel inside an LCU of the current frame is texture-rich is made adaptively. (2) It exploits the same correlation by using the maximum gradient magnitude of the texture-rich regions of the LCUs collected over the previous frame as the normalization reference for the gradient magnitude of the texture-rich region of each LCU in the current frame. This normalizes the gradient magnitudes adaptively, so that the distribution of gradient magnitudes across the LCUs of the current frame, and hence the distribution of $K_T$ values, is more reasonable.

Referring to FIG. 4, calculating the motion perception factor of the LCU in step S20 specifically includes the following steps.

Step S21: calculate the frame difference amplitude $Diff_{x,y}$ between each pixel inside the LCU and the corresponding pixel of the co-located LCU (the LCU at the same position) in the previous frame, and use the frame difference decision threshold for motion-rich pixels, obtained from the frame difference information of the previous frame, to decide whether each pixel inside the LCU is a motion-rich pixel. The frame difference amplitude is the absolute value of the difference between the luminance values of the two corresponding pixels: in the frame difference (inter-frame difference) method, the luminance values of corresponding pixels are subtracted, and the absolute value of the difference is used in subsequent operations. If $Diff_{x,y} > Diff_{Thr}$, the pixel is determined to be a motion-rich pixel; otherwise it is determined not to be one. $Diff_{Thr}$ is the frame difference decision threshold for motion-rich pixels obtained from the frame difference information of the previous frame; see step S25 below. The region formed by the motion-rich pixels inside the LCU is the motion-rich region of the LCU.
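Step S21 amounts to an absolute luminance difference followed by a strict threshold test. The sketch below assumes the co-located LCUs are given as lists of rows of luma samples; `frame_diff` and `motion_mask` are hypothetical helper names.

```python
def frame_diff(cur, prev):
    """Per-pixel frame difference amplitude: |luma_cur - luma_prev|."""
    return [[abs(c - p) for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(cur, prev)]


def motion_mask(diff, diff_thr):
    """Return 1 for each motion-rich pixel (diff strictly above diff_thr), else 0."""
    return [[1 if d > diff_thr else 0 for d in row] for row in diff]


cur = [[120, 80], [200, 10]]
prev = [[100, 90], [200, 60]]
diff = frame_diff(cur, prev)
mask = motion_mask(diff, diff_thr=10)
```

A pixel whose difference equals the threshold exactly (10 in the toy data) is not classified as motion-rich, because the comparison is strict.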

Step S22: calculate the frame difference amplitude $Diff_{LCU}$ of the motion-rich region inside the LCU as follows.

$$Diff_{LCU} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} Diff_{x,y}, \quad Diff_{x,y} > Diff_{Thr}$$

Here $M$ is the horizontal width of the LCU and $N$ is the vertical height of the LCU, both in pixels. $Diff_{LCU}$ is the sum of the frame difference amplitudes of all motion-rich pixels inside the LCU and reflects the frame difference amplitude of the motion-rich region of the LCU. While $Diff_{LCU}$ is calculated for each LCU in the current frame, the maximum value $D_{max}$ is recorded for use in calculating the normalized value of the product of $Diff_{LCU}$ and $Area_{LCU}$ for the LCUs of the next frame.

Step S23: calculate the area ratio $Area_{LCU}$ of the motion-rich region inside the LCU as follows.

$$Area_{LCU} = \frac{ZZ \times \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} Mov_{x,y}}{M \times N}$$

Here $Area_{LCU}$ takes values in $[0, ZZ]$, corresponding to an area ratio of 0% to 100%. The meaning and preferred values of $ZZ$ are the same as before and are not repeated. $M$ is the horizontal width of the LCU and $N$ is the vertical height of the LCU, both in pixels.

$$Mov_{x,y} = \begin{cases} 1, & Diff_{x,y} > Diff_{Thr} \\ 0, & \text{otherwise} \end{cases}$$

That is, $Mov_{x,y}$ is 1 if the pixel is a motion-rich pixel and 0 otherwise. The formula for $Area_{LCU}$ computes the ratio of the number of motion-rich pixels inside the LCU to the total number of pixels inside the LCU, and then normalizes that ratio. Because $Mov_{x,y} = 1$ marks a motion-rich pixel, summing $Mov_{x,y}$ gives the total number of motion-rich pixels inside the LCU.

The order of steps S22 and S23 is not strictly limited; either may be performed first, or both may be performed simultaneously.

Step S24: calculate the normalized value $\overline{Diff_{LCU} \times Area_{LCU}}$ of the product of the frame difference amplitude and the area ratio of the motion-rich region inside the LCU, using the maximum frame difference amplitude of the motion-rich regions of the LCUs in the previous frame as the normalization reference for that product in the current frame. The normalized value $\overline{Diff_{LCU} \times Area_{LCU}}$ is the motion perception factor of the LCU and is calculated as follows.

$$\overline{Diff_{LCU} \times Area_{LCU}} = \min\left(ZZ,\ \frac{Diff_{LCU} \times Area_{LCU}}{D_{max}}\right)$$

Here $\overline{Diff_{LCU} \times Area_{LCU}}$ takes values in $[0, ZZ]$. The meaning and preferred values of $ZZ$ are the same as before and are not repeated. $D_{max}$ is the maximum of the $Diff_{LCU}$ values of the LCUs in the previous frame.
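Steps S22 to S24 can be combined into one sketch that also shows why the product is used. The function name `motion_factor`, the floor division, the clipping at `zz`, and the handling of a missing reference are assumptions of this illustration; the toy LCUs are 8×8 rather than a real LCU size.

```python
def motion_factor(diff, diff_thr, d_max_prev, zz=255):
    """Return (K_M, Diff_LCU, Area_LCU) for one LCU given its frame differences."""
    rich = [d for row in diff for d in row if d > diff_thr]
    n_pixels = sum(len(row) for row in diff)
    diff_lcu = sum(rich)                     # step S22: sum over motion-rich pixels
    area_lcu = (zz * len(rich)) // n_pixels  # step S23: integer in [0, zz]
    if d_max_prev <= 0:
        return 0, diff_lcu, area_lcu         # assumption: no reference yet
    # Step S24: normalize the product by the previous frame's maximum Diff_LCU.
    k_m = min(zz, (diff_lcu * area_lcu) // d_max_prev)
    return k_m, diff_lcu, area_lcu


# One isolated large difference, as sensor noise might produce.
noise = [[0] * 8 for _ in range(8)]
noise[0][0] = 200

# Half of the LCU moving with a moderate difference.
moving = [[30] * 8 for _ in range(4)] + [[0] * 8 for _ in range(4)]

noise_result = motion_factor(noise, diff_thr=10, d_max_prev=1000)
moving_result = motion_factor(moving, diff_thr=10, d_max_prev=1000)
```

In this toy data the isolated spike has a sizeable amplitude sum but a tiny area, so its factor collapses to 0, while the genuinely moving half-LCU receives a factor of 121. This matches the motivation given for step S20.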

Hereinafter, the motion perception factor of the LCU is abbreviated as $K_M$, which takes values in $[0, ZZ]$. The larger $K_M$ is, the richer the motion inside the LCU and the more attention it draws from subjective human vision; the smaller $K_M$ is, the slighter the motion inside the LCU and the less attention it draws.

Step S25: use the mean frame difference amplitude of the current frame to calculate the frame difference decision threshold $Diff_{Thr}$ for motion-rich pixels in the next frame, using either of the following formulas: $Diff_{Thr} = Diff_{avg} + Diff_{offset}$ or $Diff_{Thr} = \beta \times Diff_{avg}$. Here $Diff_{offset}$ is the motion threshold adjustment offset and $\beta$ is the motion threshold adjustment multiplier; both can be tuned according to how sensitive the user is to moving image regions. $Diff_{avg}$ is the mean frame difference amplitude of the current frame, calculated as follows.

$$Diff_{avg} = \frac{1}{W \times H} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} Diff_{x,y}$$

Here $W$ is the horizontal width of the image frame and $H$ is the vertical height of the image frame, both in pixels.
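The one-frame lag between steps S25 and S21 can be made explicit with a small loop. This sketch uses the additive threshold form; `classify_sequence` is a hypothetical name, and `initial_thr` is an assumption, since the text above does not say how the first frame obtains its threshold.

```python
def classify_sequence(diff_frames, initial_thr, offset):
    """Count motion-rich pixels per frame using a threshold lagged by one frame."""
    thr = initial_thr
    counts = []
    for diff in diff_frames:
        pixels = [d for row in diff for d in row]
        # Step S21: classify with the threshold derived from the previous frame.
        counts.append(sum(1 for d in pixels if d > thr))
        # Step S25: this frame's mean difference sets the next frame's threshold.
        thr = sum(pixels) / len(pixels) + offset
    return counts


diff_frames = [
    [[0, 0], [0, 40]],    # mean 10, so the next threshold is 15
    [[12, 30], [5, 16]],  # classified against 15
]
counts = classify_sequence(diff_frames, initial_thr=20, offset=5)
```

Because the threshold is ready before a frame starts, each pixel can be classified as soon as it arrives, without a separate pass over the frame.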

In calculating the motion perception factor $K_M$ of the LCU, the present invention has the following innovations. (1) It exploits the correlation in image content between consecutive video frames: the mean frame difference amplitude collected over the previous frame is combined with a user adjustment parameter to form the frame difference decision threshold for motion-rich pixels in the current frame, so that the decision whether a pixel inside an LCU of the current frame is motion-rich is made adaptively. (2) It exploits the same correlation by using the maximum frame difference amplitude of the motion-rich regions of the LCUs collected over the previous frame as the normalization reference for the product of the frame difference amplitude and the area ratio of the motion-rich region of each LCU in the current frame. This normalizes the product adaptively, so that the distribution of frame difference amplitudes across the LCUs of the current frame, and hence the distribution of $K_M$ values, is more reasonable. (3) It calculates the motion perception factor of the LCU from the product of the frame difference amplitude and the area ratio of the motion-rich region, which reflects more accurately the visual attention that the human eye pays to moving regions.

In step S30, the texture perception factor $K_T$ and the motion perception factor $K_M$ of the LCU are combined, and the combined result is used as the bit allocation weight of the LCU. The weight is calculated from $K_T$ and $K_M$ as $\omega_{LCU} = \mu_T \times K_T + \mu_M \times K_M$. Here $\omega_{LCU}$ is the bit allocation weight of the LCU and takes values in $[0, ZZ]$. The meaning and preferred values of $ZZ$ are the same as before and are not repeated. $\mu_T$ is the weight coefficient of the LCU texture perception factor and $\mu_M$ is the weight coefficient of the LCU motion perception factor; the two satisfy $\mu_T + \mu_M = 1$, $0 < \mu_T < 1$, and $0 < \mu_M < 1$.
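One way to evaluate this weighted sum without floating-point arithmetic is to store the coefficients in fixed point. The sketch below assumes an 8-bit fractional representation in which `mu_t_q8 / 256` stands for $\mu_T$ and the remainder stands for $\mu_M$; this representation and the name `bit_weight` are illustrative choices, not something specified above.

```python
def bit_weight(k_t, k_m, mu_t_q8=128):
    """Blend K_T and K_M into the bit allocation weight of one LCU.

    mu_t_q8 / 256 plays the role of mu_T and (256 - mu_t_q8) / 256 the role of
    mu_M, so the two coefficients always sum to 1 and each lies strictly
    between 0 and 1.
    """
    if not 0 < mu_t_q8 < 256:
        raise ValueError("mu_t_q8 must satisfy 0 < mu_t_q8 < 256")
    # Multiply, add, and shift right by 8 bits (divide by 256, rounding down).
    return (mu_t_q8 * k_t + (256 - mu_t_q8) * k_m) >> 8


w_equal = bit_weight(200, 100)                 # mu_T = mu_M = 0.5
w_motion = bit_weight(200, 100, mu_t_q8=64)    # mu_T = 0.25, mu_M = 0.75
```

Because the coefficients sum to 1, the weight cannot exceed the larger of the two factors, so it stays within $[0, ZZ]$ whenever both inputs do.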

Because the content of consecutive video frames is correlated, the sum of the bit allocation weights of all LCUs in a frame is also correlated between adjacent frames, so the sum over the previous frame can be used to predict the sum over the current frame. Therefore, in step S30 of the present invention, after the bit allocation weight of an LCU in the current frame is calculated, the sum of the LCU bit allocation weights collected over the previous frame is used in place of the sum over the current frame. The share $\overline{\omega_{LCU}}$ of the LCU's bit allocation weight within the whole image frame is thus calculated in real time, and the target number of coding bits of the LCU is then calculated from it; this is one innovation of the present invention. $\overline{\omega_{LCU}}$ is calculated as follows.

$$\overline{\omega_{LCU}} = \frac{\omega_{LCU}}{\sum_{i} \omega_{LCU,i}^{prev}}$$

Here the sum runs over all LCUs of the previous frame.
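The streaming nature of this allocation can be sketched as follows. The function name `allocate_frame` is hypothetical, and multiplying the share by a per-frame bit budget `frame_bits` to obtain the LCU target is an assumption of this sketch, as is returning 0 when no previous-frame sum exists.

```python
def allocate_frame(weights, prev_weight_sum, frame_bits):
    """Assign target bits LCU by LCU, in coding order.

    Returns (targets, weight_sum); weight_sum is kept for the next frame.
    """
    targets = []
    weight_sum = 0
    for w in weights:
        if prev_weight_sum > 0:
            # The share is available immediately because the denominator
            # comes from the previous frame, not from the current one.
            targets.append((frame_bits * w) // prev_weight_sum)
        else:
            targets.append(0)  # assumption: no previous-frame statistics yet
        weight_sum += w
    return targets, weight_sum


# Previous-frame sum matches the current frame: the budget is used exactly.
matched = allocate_frame([100, 50, 50], prev_weight_sum=200, frame_bits=8000)

# Previous-frame sum is larger: the targets add up to less than the budget.
mismatched = allocate_frame([100, 50, 50], prev_weight_sum=250, frame_bits=8000)
```

When the previous-frame sum differs from the current one, the LCU targets no longer add up to the frame budget; in this sketch that mismatch is simply left for a higher-level allocation to absorb.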

In this way, the target coding bit allocation of each LCU in the current frame can be calculated in real time, without preprocessing all LCUs of the frame before encoding, so no frame-level coding delay is introduced. At the same time, because the bit allocation weight of each LCU is calculated from the LCU in the current frame, the LCU bit allocation is consistent with the subjective visual perception of the human eye on the current frame. In the few cases where the image content differs greatly between consecutive frames, the picture-level and GOP-level bit allocation of the HEVC rate control algorithm makes the adjustment.

The present invention proposes an HEVC rate control method that is based on subjective human visual perception and suitable for hardware implementation, with the following beneficial effects. (1) The proposed rate control method makes the bit allocation of the LCUs within an image frame better match the characteristics of subjective human perception, improving subjective visual quality. (2) In calculating the visual perception factors, the invention makes full use of the correlation in image content between consecutive video frames, so that texture-rich and motion-rich regions of an LCU are identified adaptively and the value distributions of the texture perception factor and the motion perception factor are more reasonable. (3) In calculating the motion perception factor, the invention uses the product of the frame difference amplitude and the area ratio of the motion-rich region of the LCU, which reflects more accurately the visual attention that the human eye pays to moving regions. (4) In LCU bit allocation, the invention makes full use of the correlation in image content between consecutive video frames to calculate in real time the share of each LCU's bit allocation weight within the whole image frame, so that the LCU bit allocation matches the subjective visual perception of the human eye on the current frame without adding a frame preprocessing stage. (5) The visual perception factors are calculated by a simple method with a small amount of computation and low bus bandwidth usage, which is suitable for hardware implementation. (6) The LCU texture perception factor and motion perception factor are calculated at the same time as the LCU is encoded. No preprocessing stage is needed to calculate the visual perception factors of all LCUs before the frame is encoded, so no frame-level delay is introduced and no additional bus bandwidth is consumed, which is suitable for hardware implementation.

To verify the beneficial effects of the present invention, three YUV video streams from Class E of the HEVC standard test sequences, Johnny, FourPeople and KristenAndSara, were selected for testing. All three are typical video conferencing scenes in which the region of subjective visual attention is the face. In the experiment, the encoder uses constant bit rate control mode; the coding bit rate is set to 600 kbps for Johnny and KristenAndSara and to 800 kbps for FourPeople; 120 frames are encoded; the GOP structure is IPPP, and each P frame references only the previous frame. The baseline is the rate control algorithm used in HM, the HEVC reference software, which calculates LCU bit allocation weights from MAD values. It is compared with the improved rate control algorithm proposed by the present invention, which is based on subjective visual perception. The metric is the PSNR (Peak Signal-to-Noise Ratio) of the face region in the encoded images of the three YUV video streams. PSNR is measured in decibels (dB); a higher PSNR means less image distortion and better image quality. The experimental results are shown in Table 1 below.
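As an illustration of the metric, the following Python sketch computes PSNR over a rectangular region of a frame. The source does not say how the face region is delineated or which plane is measured, so the rectangular region, the single luma plane, and 8-bit samples (`max_value=255`) are assumptions; `region_psnr` and the sample data are hypothetical.

```python
import math


def region_psnr(orig, recon, top, left, height, width, max_value=255):
    """PSNR in dB over a rectangular region of two equally sized sample planes.

    orig and recon are lists of rows of integer samples.
    """
    squared_error = 0
    for y in range(top, top + height):
        for x in range(left, left + width):
            d = orig[y][x] - recon[y][x]
            squared_error += d * d
    if squared_error == 0:
        return math.inf  # identical regions: PSNR is unbounded
    mse = squared_error / (height * width)
    return 10.0 * math.log10(max_value * max_value / mse)


# Hypothetical 3x3 planes; the "face" is the 2x2 block at the lower right.
orig = [[100, 100, 100], [100, 120, 120], [100, 120, 120]]
recon = [[100, 100, 100], [100, 118, 122], [100, 121, 120]]
face_psnr = region_psnr(orig, recon, top=1, left=1, height=2, width=2)
print(round(face_psnr, 2))  # prints 44.61
```

Restricting the error sum to the region of interest is what lets the comparison isolate face quality from the overall frame PSNR.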

[Table 1 image: Figure BDA0003517687680000113]

[Table 1 image, continued: Figure BDA0003517687680000121]

Table 1: Comparison of test results between the present invention and the existing rate control method

In these three YUV video streams, the face region is both a motion-rich region and a texture-rich region relative to the background: its motion features are highly pronounced and its texture features are fairly pronounced. As Table 1 shows, with the rate control algorithm proposed by the present invention, the PSNR of the face region improves while the overall average bit rate and average PSNR of the encoded stream change little: by 0.38 dB for Johnny, 0.38 dB for KristenAndSara, and 0.42 dB for FourPeople. This effectively enhances the visual quality subjectively perceived by the human eye.

The above are merely preferred embodiments of the present invention and are not intended to limit it. Those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims (18)

1. A rate control method based on visual perception, characterized by comprising the following steps:
Step S10: calculate the gradient magnitude of the texture-rich region inside an LCU, and use the result as the texture perception factor of the LCU;
Step S20: calculate the product of the frame difference amplitude and the area ratio of the motion-rich region inside the LCU, and use the result as the motion perception factor of the LCU;
steps S10 and S20 may be performed in either order or simultaneously;
Step S30: calculate the bit allocation weight of the LCU from its texture perception factor and motion perception factor, and from that weight calculate the target number of coding bits of the LCU.

2. The rate control method based on visual perception according to claim 1, characterized in that the texture perception factor of the LCU characterizes the visual sensitivity of the human eye to the texture features of the LCU; the larger the texture perception factor, the richer the texture inside the LCU and the more attention subjective human vision pays to it; the smaller the texture perception factor, the flatter the texture inside the LCU and the less attention subjective human vision pays to it.

3. The rate control method based on visual perception according to claim 1, characterized in that the motion perception factor of the LCU characterizes the visual sensitivity of the human eye to the motion features of the LCU; the larger the motion perception factor, the richer the motion inside the LCU and the more attention subjective human vision pays to it; the smaller the motion perception factor, the slighter the motion inside the LCU and the less attention subjective human vision pays to it.

4. The rate control method based on visual perception according to claim 1, characterized in that calculating the texture perception factor of the LCU in step S10 comprises the following steps:
Step S11: calculate the gradient magnitude $Grad_{x,y}$ of each pixel inside the LCU, and use the gradient decision threshold for texture-rich pixels, obtained from the gradient information of the previous frame, to decide whether each pixel inside the LCU is a texture-rich pixel; the region formed by the texture-rich pixels inside the LCU is the texture-rich region of the LCU;
Step S12: calculate the gradient magnitude $Grad_{LCU}$ of the texture-rich region inside the LCU; while calculating $Grad_{LCU}$ for each LCU in the current frame, record the maximum value $G_{max}$;
Step S13: normalize the gradient magnitude $Grad_{LCU}$ of the texture-rich region inside the LCU to obtain [formula image: Figure FDA0003517687670000011], using the maximum gradient magnitude of the texture-rich regions of the LCUs in the previous frame as the normalization reference for the current frame; the normalized value [formula image: Figure FDA0003517687670000012] is the texture perception factor of the LCU;
Step S14: use the mean pixel gradient of the current frame to calculate the gradient decision threshold $Grad_{Thr}$ for texture-rich pixels in the next frame.

5. The rate control method based on visual perception according to claim 4, characterized in that, in step S11, a pixel is determined to be a texture-rich pixel if $Grad_{x,y} > Grad_{Thr}$, and is otherwise determined not to be a texture-rich pixel; $Grad_{Thr}$ is the gradient decision threshold for texture-rich pixels obtained from the gradient information of the previous frame.

6. The rate control method based on visual perception according to claim 4, characterized in that, in step S12, [formula image: Figure FDA0003517687670000021], $Grad_{x,y} > Grad_{Thr}$; where M is the horizontal width of the LCU, N is the vertical height of the LCU, and $Grad_{LCU}$ is the sum of the gradient magnitudes of all texture-rich pixels inside the LCU.

7. The rate control method based on visual perception according to claim 4, characterized in that, in step S13, [formula image: Figure FDA0003517687670000022], where $G_{max}$ is the maximum $Grad_{LCU}$ over the LCUs of the previous frame; the value of [formula image: Figure FDA0003517687670000023] is an integer in [0, ZZ]; integers from 0 to ZZ represent the normalized range 0 to 1, and ZZ represents the normalized maximum value.
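
Steps S11 to S13 (claims 4 to 7) can be sketched in Python as follows. Several details are assumptions because the source gives them only as images or not at all: the per-pixel gradient operator (here, absolute differences to the right and lower neighbours), the normalization (here, scaling by ZZ over the previous frame's maximum, then clipping), and the choice ZZ = 255. The function names are hypothetical.

```python
ZZ = 255  # normalized maximum; claim 17 lists 127, 255, 511 or 1023


def pixel_gradient(block, x, y):
    """Assumed operator: |right - here| + |below - here|, with edges clamped."""
    h, w = len(block), len(block[0])
    here = block[y][x]
    right = block[y][min(x + 1, w - 1)]
    below = block[min(y + 1, h - 1)][x]
    return abs(right - here) + abs(below - here)


def grad_lcu(block, grad_thr):
    """Step S12: sum of gradient magnitudes over texture-rich pixels only."""
    total = 0
    for y in range(len(block)):
        for x in range(len(block[0])):
            g = pixel_gradient(block, x, y)
            if g > grad_thr:  # step S11 test against the previous frame's threshold
                total += g
    return total


def texture_factor(grad_sum, prev_g_max):
    """Step S13 (assumed form): scale by the previous frame's maximum, clip to ZZ."""
    if prev_g_max <= 0:
        return 0  # no reference yet, e.g. on the first frame
    return min(ZZ, grad_sum * ZZ // prev_g_max)


# Hypothetical 4x4 LCU with one vertical edge between columns 1 and 2.
block = [[10, 10, 50, 50] for _ in range(4)]
g_sum = grad_lcu(block, grad_thr=8)
print(g_sum, texture_factor(g_sum, prev_g_max=320))  # prints 160 127
```

The clip matters in this sketch because the reference maximum comes from the previous frame, so an LCU in the current frame can exceed it.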

8. The rate control method based on visual perception according to claim 4, characterized in that, in step S14, $Grad_{Thr} = Grad_{avg} + Grad_{offset}$ or $Grad_{Thr} = \alpha \times Grad_{avg}$; where $Grad_{offset}$ is the gradient threshold adjustment offset, $\alpha$ is the gradient threshold adjustment multiplier, and $Grad_{avg}$ is the mean pixel gradient of the current frame; [formula image: Figure FDA0003517687670000024] [formula image: Figure FDA0003517687670000025] where W is the horizontal width of the current image frame and H is the vertical height of the current image frame.
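
The two threshold rules of claim 8 can be illustrated with a short Python sketch. The rules themselves come from the claim; the integer division in the mean, the helper names, and the sample offset and multiplier values are assumptions chosen for illustration.

```python
def mean_gradient(frame_gradients):
    """Grad_avg: mean per-pixel gradient magnitude over a frame of H rows by W columns.

    Integer division is an assumption, chosen as a hardware-friendly simplification.
    """
    height, width = len(frame_gradients), len(frame_gradients[0])
    return sum(sum(row) for row in frame_gradients) // (width * height)


def next_threshold_offset(grad_avg, grad_offset):
    """Grad_Thr = Grad_avg + Grad_offset."""
    return grad_avg + grad_offset


def next_threshold_scaled(grad_avg, alpha):
    """Grad_Thr = alpha * Grad_avg."""
    return alpha * grad_avg


# Hypothetical per-pixel gradient magnitudes for a 2-row, 4-column frame.
frame_gradients = [[0, 4, 8, 12], [0, 4, 8, 12]]
avg = mean_gradient(frame_gradients)
print(avg, next_threshold_offset(avg, 2), next_threshold_scaled(avg, 1.5))
```

Either rule yields a threshold that tracks the content of the sequence, and it is applied to the next frame rather than the one it was measured on.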

9. The rate control method based on visual perception according to claim 1, characterized in that calculating the motion perception factor of the LCU in step S20 comprises the following steps:
Step S21: calculate the frame difference amplitude $Diff_{x,y}$ between each pixel inside the LCU and the corresponding pixel of the co-located LCU in the previous frame, and use the frame difference decision threshold for motion-rich pixels, obtained from the frame difference information of the previous frame, to decide whether each pixel inside the LCU is a motion-rich pixel; the region formed by the motion-rich pixels inside the LCU is the motion-rich region of the LCU;
Step S22: calculate the frame difference amplitude $Diff_{LCU}$ of the motion-rich region inside the LCU; while calculating $Diff_{LCU}$ for each LCU in the current frame, record the maximum value $D_{max}$;
Step S23: calculate the area ratio $Area_{LCU}$ of the motion-rich region inside the LCU;
steps S22 and S23 may be performed in either order or simultaneously;
Step S24: calculate the normalized value [formula image: Figure FDA0003517687670000026] of the product of the frame difference amplitude and the area ratio of the motion-rich region inside the LCU, using the maximum frame difference amplitude of the motion-rich regions of the LCUs in the previous frame as the normalization reference for the current frame; the normalized value [formula image: Figure FDA0003517687670000027] is the motion perception factor of the LCU;
Step S25: use the mean frame difference amplitude of the current frame to calculate the frame difference decision threshold $Diff_{Thr}$ for motion-rich pixels in the next frame.

10. The rate control method based on visual perception according to claim 9, characterized in that, in step S21, a pixel is determined to be a motion-rich pixel if $Diff_{x,y} > Diff_{Thr}$, and is otherwise determined not to be a motion-rich pixel; $Diff_{Thr}$ is the frame difference decision threshold for motion-rich pixels obtained from the frame difference information of the previous frame.

11. The rate control method based on visual perception according to claim 9, characterized in that, in step S22, [formula image: Figure FDA0003517687670000028], where M is the horizontal width of the LCU, N is the vertical height of the LCU, and $Diff_{LCU}$ is the sum of the frame difference amplitudes of all motion-rich pixels inside the LCU.

12. The rate control method based on visual perception according to claim 9, characterized in that, in step S23, [formula image: Figure FDA0003517687670000031], where the value of $Area_{LCU}$ is an integer in [0, ZZ], corresponding to an area ratio of 0% to 100%; M is the horizontal width of the LCU and N is the vertical height of the LCU; [formula image: Figure FDA0003517687670000032] [formula image: Figure FDA0003517687670000033] where $Mov_{x,y}$ is 1 if the pixel is a motion-rich pixel and 0 otherwise.

13. The rate control method based on visual perception according to claim 9, characterized in that, in step S24, [formula image: Figure FDA0003517687670000034], where the value of [formula image: Figure FDA0003517687670000035] is an integer in [0, ZZ]; integers from 0 to ZZ represent the normalized range 0 to 1, and ZZ represents the normalized maximum value; $D_{max}$ is the maximum $Diff_{LCU}$ over the LCUs of the previous frame.
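
Steps S21 to S24 (claims 9 to 13) can be sketched in Python as follows. The source gives the formulas only as images, so these points are assumptions: the frame difference amplitude is taken as the absolute sample difference, the area ratio is scaled to [0, ZZ] by integer division, and the motion perception factor is taken as $Diff_{LCU} \times Area_{LCU} / D_{max}$ clipped to ZZ. The function names and ZZ = 255 are hypothetical choices.

```python
ZZ = 255  # normalized maximum; claim 17 lists 127, 255, 511 or 1023


def motion_stats(cur, prev, diff_thr):
    """Steps S21 to S23 for one LCU and its co-located LCU in the previous frame.

    Returns (Diff_LCU, Area_LCU): the frame difference sum over motion-rich
    pixels, and the motion-rich area share as an integer in [0, ZZ].
    """
    n, m = len(cur), len(cur[0])  # N rows (height), M columns (width)
    diff_sum = 0
    moving = 0
    for y in range(n):
        for x in range(m):
            d = abs(cur[y][x] - prev[y][x])  # Diff_{x,y}, assumed absolute difference
            if d > diff_thr:  # Mov_{x,y} = 1 for this pixel
                diff_sum += d
                moving += 1
    area = moving * ZZ // (m * n)
    return diff_sum, area


def motion_factor(diff_sum, area, prev_d_max):
    """Step S24 (assumed form): normalize Diff_LCU * Area_LCU and clip to ZZ."""
    if prev_d_max <= 0:
        return 0  # no reference yet, e.g. on the first frame
    return min(ZZ, diff_sum * area // prev_d_max)


# Hypothetical 2x2 LCU: the bottom row moved, the top row is nearly static.
cur = [[10, 10], [90, 90]]
prev = [[10, 12], [50, 40]]
diff_sum, area = motion_stats(cur, prev, diff_thr=5)
print(diff_sum, area, motion_factor(diff_sum, area, prev_d_max=180))  # prints 90 127 63
```

Multiplying by the area share means a small patch of strong motion scores lower than the same motion spread across the whole LCU.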

14. The rate control method based on visual perception according to claim 9, characterized in that, in step S25, $Diff_{Thr} = Diff_{avg} + Diff_{offset}$ or $Diff_{Thr} = \beta \times Diff_{avg}$; where $Diff_{offset}$ is the motion threshold adjustment offset, $\beta$ is the motion threshold adjustment multiplier, and $Diff_{avg}$ is the mean frame difference amplitude of the current frame; [formula image: Figure FDA0003517687670000036] where W is the horizontal width of the image frame and H is the vertical height of the image frame.

15. The rate control method based on visual perception according to claim 1, characterized in that, in step S30, the bit allocation weight of the LCU is calculated from the texture perception factor $K_T$ and the motion perception factor $K_M$ of the LCU as follows: $\omega_{LCU} = \mu_T \times K_T + \mu_M \times K_M$; where $\omega_{LCU}$ is the bit allocation weight of the LCU, an integer in [0, ZZ], with integers from 0 to ZZ representing 0 to 1; $\mu_T$ is the weight coefficient of the LCU texture perception factor and $\mu_M$ is the weight coefficient of the LCU motion perception factor, and the two satisfy $\mu_T + \mu_M = 1$, $0 < \mu_T < 1$ and $0 < \mu_M < 1$.

16. The rate control method based on visual perception according to claim 1, characterized in that, in step S30, after the bit allocation weight $\omega_{LCU}$ of an LCU in the current frame is calculated, the sum of the LCU bit allocation weights collected over the previous frame is used in place of the sum of the LCU bit allocation weights of the current frame, so that the share [formula image: Figure FDA0003517687670000037] of the LCU's bit allocation weight in the whole image frame is calculated in real time, and the target number of coding bits of the LCU is then calculated; [formula image: Figure FDA0003517687670000038]
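
The weighting in claim 15 can be illustrated with a short Python sketch. The formula and the constraints on $\mu_T$ and $\mu_M$ come from the claim; truncating the result to an integer, the default $\mu_T = 0.5$, ZZ = 255, and the name `bit_allocation_weight` are assumptions made for illustration.

```python
ZZ = 255  # normalized maximum; claim 17 lists 127, 255, 511 or 1023


def bit_allocation_weight(k_t, k_m, mu_t=0.5):
    """omega_LCU = mu_T * K_T + mu_M * K_M, with mu_T + mu_M = 1.

    k_t and k_m are the texture and motion perception factors in [0, ZZ].
    """
    if not 0.0 < mu_t < 1.0:
        raise ValueError("mu_T must lie strictly between 0 and 1")
    mu_m = 1.0 - mu_t
    weight = int(mu_t * k_t + mu_m * k_m)  # truncation to an integer is an assumption
    return max(0, min(ZZ, weight))


print(bit_allocation_weight(100, 200, mu_t=0.25))  # prints 175
print(bit_allocation_weight(127, 63))  # prints 95
```

Because the two coefficients sum to 1 and both factors lie in [0, ZZ], the weight stays in [0, ZZ] without further scaling.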

17. The rate control method based on visual perception according to any one of claims 7, 12, 13 and 15, characterized in that the value of ZZ is one of 127, 255, 511 or 1023.

18. A rate control device based on visual perception, characterized by comprising a texture perception factor calculation module, a motion perception factor calculation module and an LCU bit allocation module;
the texture perception factor calculation module is configured to calculate the gradient magnitude of the texture-rich region inside an LCU and use the result as the texture perception factor of the LCU;
the motion perception factor calculation module is configured to calculate the product of the frame difference amplitude and the area ratio of the motion-rich region inside the LCU and use the result as the motion perception factor of the LCU;
the LCU bit allocation module calculates the bit allocation weight of the LCU from its texture perception factor and motion perception factor, and from that weight calculates the target number of coding bits of the LCU.
CN202210171159.4A 2022-02-23 2022-02-23 A rate control method and device based on visual perception Pending CN114666585A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210171159.4A CN114666585A (en) 2022-02-23 2022-02-23 A rate control method and device based on visual perception
PCT/CN2022/123742 WO2023159965A1 (en) 2022-02-23 2022-10-08 Visual perception-based rate control method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210171159.4A CN114666585A (en) 2022-02-23 2022-02-23 A rate control method and device based on visual perception

Publications (1)

Publication Number Publication Date
CN114666585A (en) 2022-06-24

Family

ID=82027311

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210171159.4A Pending CN114666585A (en) 2022-02-23 2022-02-23 A rate control method and device based on visual perception

Country Status (2)

Country Link
CN (1) CN114666585A (en)
WO (1) WO2023159965A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115103187A (en) * 2022-07-08 2022-09-23 翱捷科技股份有限公司 A low-delay video coding rate control method and device
WO2023159965A1 (en) * 2022-02-23 2023-08-31 翱捷科技股份有限公司 Visual perception-based rate control method and device

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117274820B (en) * 2023-11-20 2024-03-08 深圳市天成测绘技术有限公司 Map data acquisition method and system for mapping geographic information
CN118381937B (en) * 2024-06-21 2024-09-20 北京联合大学 Self-adaptive QP (program guide) adjustment method, system and device for point cloud video
CN119450052A (en) * 2025-01-07 2025-02-14 杭州觅睿科技股份有限公司 A HEVC bit rate control method based on maximum coding unit level
CN120050426A (en) * 2025-04-25 2025-05-27 深圳华强电子网集团股份有限公司 High-precision CMOS image sensor data storage optimization method, device, medium and equipment based on adaptive compression algorithm

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04302590A (en) * 1991-03-29 1992-10-26 Sony Corp Blurring detector for video data
CN101534436A (en) * 2008-03-11 2009-09-16 深圳市融合视讯科技有限公司 Allocation method of video image macro-block-level self-adaptive code-rates
CN101827267A (en) * 2010-04-20 2010-09-08 上海大学 Code rate control method based on video image segmentation technology
CN103617632A (en) * 2013-11-19 2014-03-05 浙江工业大学 Moving target detection method with adjacent frame difference method and Gaussian mixture models combined
CN104079925A (en) * 2014-07-03 2014-10-01 中国传媒大学 Ultrahigh definition video image quality objective evaluation method based on visual perception characteristic
CN104602028A (en) * 2015-01-19 2015-05-06 宁波大学 Entire frame loss error concealment method for B frame of stereoscopic video
CN105681793A (en) * 2016-01-06 2016-06-15 四川大学 Very-low delay and high-performance video coding intra-frame code rate control method based on video content complexity adaption
CN106303530A (en) * 2016-10-20 2017-01-04 北京工业大学 A kind of bit rate control method merging vision perception characteristic
CN111193931A (en) * 2018-11-14 2020-05-22 深圳市中兴微电子技术有限公司 Video data coding processing method and computer storage medium
CN111447446A (en) * 2020-05-15 2020-07-24 西北民族大学 A HEVC rate control method based on the importance analysis of human visual area
CN112291564A (en) * 2020-11-20 2021-01-29 西安邮电大学 A HEVC Intra-Frame Rate Control Method for Optimizing Monitoring Video Perceptual Quality

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114666585A (en) * 2022-02-23 2022-06-24 翱捷科技股份有限公司 A rate control method and device based on visual perception

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
费马燕等: "融合视觉感知特性的HEVC率失真优化", 《中国优秀硕士学位论文全文数据库 (信息科技辑)》, 19 October 2015 (2015-10-19) *

Also Published As

Publication number Publication date
WO2023159965A1 (en) 2023-08-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination