CN114666585A - Code rate control method and device based on visual perception - Google Patents


Info

  • Publication number: CN114666585A
  • Application number: CN202210171159.4A
  • Authority: CN (China)
  • Prior art keywords: LCU, motion, texture, rich, perception
  • Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
  • Other languages: Chinese (zh)
  • Inventors: 刘鹏飞 (Liu Pengfei), 温安君 (Wen Anjun), 刘国正 (Liu Guozheng)
  • Current and original assignee: ASR Microelectronics Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
  • Application filed by ASR Microelectronics Co Ltd
  • Priority to CN202210171159.4A
  • Publication of CN114666585A
  • Priority to PCT/CN2022/123742 (WO2023159965A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H04N19/198 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding including smoothing of a sequence of encoding parameters, e.g. by averaging, by choice of the maximum, minimum or median value
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/40 Analysis of texture
    • G06T9/00 Image coding
    • G06T9/004 Predictors, e.g. intraframe, interframe coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/154 Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/177 the unit being a group of pictures [GOP]
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H04N19/517 Processing of motion vectors by encoding
    • H04N19/52 Processing of motion vectors by encoding by predictive encoding
    • H04N19/557 Motion estimation characterised by stopping computation or iteration based on certain criteria, e.g. error magnitude being too large or early exit

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a code rate control method based on visual perception, comprising the following steps. Step S10: calculate the gradient magnitude of the texture-rich region inside an LCU and take the result as the texture perception factor of the LCU. Step S20: calculate the product of the frame-difference magnitude and the area ratio of the motion-rich region inside the LCU and take the result as the motion perception factor of the LCU. Steps S10 and S20 may be performed in either order or simultaneously. Step S30: calculate the bit allocation weight of the LCU from its texture perception factor and motion perception factor, and from this calculate the target number of coding bits of the LCU. The technical effect of the invention is that the target bit allocation of the LCUs within an image frame better matches the subjective perception characteristics of the human eye, improving subjective visual quality, and the method is suitable for hardware implementation.

Description

Code rate control method and device based on visual perception
Technical Field
The present invention relates to video coding technology, and more particularly, to a method and apparatus for rate control based on visual perception and suitable for hardware implementation.
Background
Video coding is a technique that represents video information with as little data as possible by compressing the redundant components in video pictures. HEVC (High Efficiency Video Coding, also known as H.265) is a new-generation video coding standard. Compared with the previous-generation standard AVC (Advanced Video Coding, also known as H.264), HEVC reduces the coding bit rate by about 50% at the same video coding quality, roughly doubling compression performance relative to AVC. Because video coding algorithms are computationally intensive, it is common industry practice to accelerate video encoding with an application-specific integrated circuit (ASIC) in order to increase encoding speed.
Video coding techniques use image blocks as the basic coding unit. In HEVC, the basic coding unit is the CU (Coding Unit). A CU may be an image block of 64×64, 32×32, 16×16, or 8×8 pixels. The 64×64-pixel image block is also called the LCU (Largest Coding Unit).
In practical applications, the bandwidth of the channel used to transmit compressed video is limited. If the coding bit rate of the compressed video is too high and exceeds the channel bandwidth, transmission congestion and even packet loss result. If the coding bit rate is too low, the channel bandwidth is not fully utilized and higher video quality cannot be obtained. It is therefore necessary to use rate control techniques to make the output bit rate of the video encoder match the channel bandwidth.
The purpose of rate control is to make the output bit rate of the video encoder equal to a preset target bit rate by adjusting the encoding parameters, while reducing coding distortion as much as possible to improve video coding quality. The rate control algorithm currently adopted in the HEVC reference encoder HM (HEVC Test Model) is based on the JCTVC-K0103 proposal. JCTVC-K0103 establishes a mathematical model (the R-λ model) relating the coding bit rate R to the Lagrange multiplier λ, and accomplishes rate control through two stages: target bit allocation and target bit control.
In the JCTVC-K0103 proposal, target bit allocation is performed at three levels: the Group of Pictures (GOP) level, the image frame level, and the basic coding unit level. To reduce computational complexity, the LCU is generally selected as the basic unit of target bit allocation at the basic coding unit level, so target bit allocation at this level is commonly referred to as LCU-level target bit allocation.
After target bit allocation at the GOP and frame levels, the target number of coding bits of the current frame to be coded is determined. LCU-level target bit allocation is then performed: the bit allocation weight of each LCU in the current frame is determined, and the frame's target coding bits are distributed to the LCUs in proportion to those weights. In the JCTVC-K0103 proposal, LCU-level bit allocation is performed according to the following formula.
$$T_{LCU\_curr} = \left(T_{Pic} - Bit_{H} - Coded_{Pic}\right) \times \frac{\omega_{LCU\_curr}}{\sum_{\{AllNotCodedLCUs\}} \omega_{LCU}}$$

where $T_{LCU\_curr}$ is the target number of coding bits allocated to the LCU currently to be coded; $T_{Pic}$ is the target number of bits allocated to the current video frame to be coded (a video frame is also called an image frame); $Bit_{H}$ is the estimated number of bits required for the video frame header information; $Coded_{Pic}$ is the actual number of coding bits of the LCUs already coded in the current frame; $\omega_{LCU\_curr}$ is the bit allocation weight of the LCU currently to be coded; $\omega_{LCU}$ denotes the bit allocation weight of an LCU; and $\sum_{\{AllNotCodedLCUs\}} \omega_{LCU}$ is the sum of the bit allocation weights of all uncoded LCUs in the current frame.
From the above formula it can be seen that the core of LCU-level bit allocation is the LCU bit allocation weight $\omega_{LCU}$. Once the bit allocation weights of all LCUs in the current frame have been calculated, the sum of the weights of all uncoded LCUs is obtained by accumulation. The target number of coding bits of each LCU is then calculated from the ratio of its weight to that sum, completing LCU-level target bit allocation.
In the JCTVC-K0103 proposal, the bit allocation weight $\omega_{LCU}$ of an LCU is calculated from the prediction-error MAD (Mean Absolute Difference) value of the co-located LCU in the previous coded frame, according to the following formula.

$$\omega_{LCU} = MAD_{LCU}^{2}$$

where $\omega_{LCU}$ is the bit allocation weight of the LCU and $MAD_{LCU}$ is the prediction-error MAD value of the LCU's co-located LCU in the previous coded frame.
$$MAD_{LCU} = \frac{1}{N_{pixels}} \sum_{\{AllPixelsInLCU\}} \left| P_{org} - P_{pred} \right|$$

where $N_{pixels}$ is the number of pixels in the LCU, $\sum_{\{AllPixelsInLCU\}}$ denotes accumulation over all pixels in the LCU, $P_{org}$ is the luminance value of an original pixel, and $P_{pred}$ is the predicted luminance value of the pixel.
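As a concrete illustration of the baseline scheme just described, the following Python sketch computes the prediction-error MAD of an LCU and a JCTVC-K0103-style target-bit allocation; the squared-MAD weight and all function and variable names are illustrative choices, not code from the proposal.

```python
import numpy as np

def mad_lcu(p_org, p_pred):
    """MAD of one LCU: mean absolute luma difference between original
    and predicted pixels (the formula above)."""
    return np.mean(np.abs(p_org.astype(np.int32) - p_pred.astype(np.int32)))

def allocate_lcu_bits(t_pic, bit_h, coded_pic, weights, idx):
    """Target bits for LCU `idx`: remaining frame budget times the LCU's
    share of the weights of all not-yet-coded LCUs."""
    remaining = t_pic - bit_h - coded_pic
    return remaining * weights[idx] / sum(weights[idx:])

# Toy example: four LCUs with identical content, hence equal weights.
rng = np.random.default_rng(0)
org = rng.integers(0, 256, (64, 64))
pred = np.clip(org + rng.integers(-4, 5, (64, 64)), 0, 255)
weights = [mad_lcu(org, pred) ** 2 for _ in range(4)]  # omega = MAD^2 per LCU
bits_first = allocate_lcu_bits(t_pic=8000, bit_h=200, coded_pic=0,
                               weights=weights, idx=0)
```

With equal weights, the first LCU receives one quarter of the 7800-bit remaining frame budget.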
When watching video, human eyes are guided by subjective perception: they focus more on regions of the image with complex texture and rich detail, and pay less attention to flat regions without obvious texture features. Likewise, they pay more attention to regions with strong motion and rich change, and less to regions that are still. In other words, motion-rich and texture-rich regions of a video image are more likely to attract human visual attention. Based on these visual perception characteristics, during video coding it is desirable, for a fixed target bit rate of an image frame, to allocate more target coding bits to the LCUs of motion-rich and texture-rich regions so as to improve subjectively perceived visual quality. To achieve this, the rate control algorithm must be able to detect motion-rich and texture-rich regions and calculate the LCU-level bit allocation weights on that basis, so that more target coding bits are allocated to the LCUs in the regions human eyes attend to.
Analysis of the HEVC rate control technique proposed in JCTVC-K0103 shows that the LCU-level bit allocation weight is calculated from the prediction-error MAD value of the co-located LCU of the previous coded frame. It is computed purely from the signal-processing perspective of luminance differences between pixels, without considering the subjective visual perception characteristics of human eyes, and so cannot allocate more target coding bits to the regions human eyes attend to. It is therefore necessary to improve the LCU-level target bit allocation of HEVC rate control: select a factor that characterizes human visual perception, use it to calculate the LCU-level bit allocation weights, and allocate more target coding bits to the LCUs of motion-rich and texture-rich regions, thereby improving subjectively perceived visual quality. The visual perception factor must characterize the degree of subjective attention human eyes pay to image content, and must satisfy the following properties: (1) for regions human eyes attend to, such as motion-rich and texture-rich regions, the factor takes a larger value, and the larger the motion amplitude or texture complexity, the larger the value; (2) for regions human eyes do not attend to, such as motion-sparse and texture-simple regions, the factor takes a smaller value, and the smaller the motion amplitude or the simpler the texture, the smaller the value.
at present, some technical schemes improve the HEVC rate control technology to achieve the purpose of improving the subjective perceptual visual quality of human eyes. Most of the schemes start from finding factors capable of representing human eye visual perception, bit distribution weights at an LCU level are calculated by using the visual perception factors, and more target coding bit numbers are distributed to LCUs in a human eye attention area. However, these prior art solutions have the following problems in common.
The first class of prior schemes requires preprocessing the entire frame before the current frame is encoded. The purpose of the preprocessing is to calculate the bit allocation weight of each LCU in the current frame from the selected visual perception factor, accumulate the weights of all LCUs to obtain their sum, and then calculate each LCU's target number of coding bits from the proportion its weight contributes. In such schemes, the target bits of each LCU can only be calculated after a preprocessing stage has been added before encoding and the weights of all LCUs in the whole frame have been computed. Preprocessing the whole frame consumes considerable time, and the larger the frame resolution, the longer it takes, introducing a large frame-level encoding delay. Moreover, because the preprocessing is performed separately from encoding, both stages must read the image content from memory, consuming substantial bus bandwidth. Such schemes are therefore suitable for software encoders but not for hardware encoders.
The second class of prior schemes exploits the correlation between video frames, using the bit allocation weight of each LCU in the previous frame as the weight of the co-located LCU in the current frame. In these schemes, while the previous frame is being encoded, the bit allocation weight of each LCU is calculated from the visual perception factor during its encoding; when encoding of the previous frame completes, the weights of all its LCUs are available, and each LCU in the current frame takes the weight of its co-located LCU in the previous frame. Although no extra preprocessing stage is needed, real-time performance is good, and no extra bus bandwidth is consumed, the weights of the current frame are computed entirely from the previous frame's content. When the change between frames is large, the weights no longer match the subjectively sensitive regions of the current frame, and the LCU bit allocation error is large.
In addition to the above problems, many of the prior-art algorithms for computing visual perception factors are complex and computationally heavy, consume large amounts of computing and bandwidth resources, and are unsuitable for hardware implementation.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an HEVC rate control method and apparatus that is based on subjective human visual perception and suitable for hardware implementation.
To solve this technical problem, the invention provides a code rate control method based on visual perception, comprising the following steps. Step S10: calculate the gradient magnitude of the texture-rich region inside the LCU, and take the result as the texture perception factor of the LCU. Step S20: calculate the product of the frame-difference magnitude and the area ratio of the motion-rich region inside the LCU, and take the result as the motion perception factor of the LCU. Steps S10 and S20 may be performed in either order or simultaneously. Step S30: calculate the bit allocation weight of the LCU from its texture perception factor and motion perception factor, and from this calculate the target number of coding bits of the LCU.
Further, the texture perception factor of the LCU characterizes how visually sensitive human eyes are to the LCU's texture features: the larger its value, the richer the texture inside the LCU and the more subjective visual attention it receives; the smaller its value, the flatter the texture and the less attention it receives.
Further, the motion perception factor of the LCU characterizes how visually sensitive human eyes are to the LCU's motion features: the larger its value, the richer the motion inside the LCU and the more subjective visual attention it receives; the smaller its value, the weaker the motion and the less attention it receives.
Further, calculating the texture perception factor of the LCU in step S10 specifically comprises the following steps. Step S11: calculate the gradient magnitude $Grad_{x,y}$ of each pixel in the LCU, and judge whether each pixel is texture-rich according to the texture-rich gradient decision threshold derived from the gradient information of the previous frame; the region inside the LCU composed of texture-rich pixels is the texture-rich region of the LCU. Step S12: calculate the gradient magnitude $Grad_{LCU}$ of the texture-rich region inside the LCU; while calculating $Grad_{LCU}$ for each LCU in the current frame, record their maximum value $G_{max}$. Step S13: normalize the gradient magnitude $Grad_{LCU}$ of the texture-rich region, using the maximum value from the previous frame as the normalization reference for the current frame; the normalized value is the texture perception factor of the LCU. Step S14: calculate the texture-rich gradient decision threshold $Grad_{Thr}$ for the next frame from the mean pixel gradient of the current frame.
Further, in step S11, if $Grad_{x,y} > Grad_{Thr}$ holds, the pixel is judged to be texture-rich; otherwise it is not. $Grad_{Thr}$ is the texture-rich gradient decision threshold derived from the gradient information of the previous frame.
Further, in step S12,

$$Grad_{LCU} = \sum_{\substack{(x,y)\in LCU \\ Grad_{x,y} > Grad_{Thr}}} Grad_{x,y}$$

where the LCU is M pixels in horizontal width and N pixels in vertical height, and $Grad_{LCU}$ is the sum of the gradient magnitudes of all texture-rich pixels inside the LCU.
Further, in step S13,

$$K_{T} = \min\!\left(ZZ,\ \left\lfloor \frac{Grad_{LCU}}{G_{max}} \times ZZ \right\rfloor\right)$$

where $G_{max}$ is the maximum $Grad_{LCU}$ over the LCUs of the previous frame; the normalized value $K_{T}$ (the texture perception factor) takes integer values in $[0, ZZ]$, with the integers 0 to ZZ representing the normalized range 0 to 1, and ZZ is the maximum value after normalization.
Further, in step S14, $Grad_{Thr} = Grad_{avg} + Grad_{offset}$, or $Grad_{Thr} = \alpha \times Grad_{avg}$, where $Grad_{offset}$ is the gradient-threshold adjustment offset, $\alpha$ is the gradient-threshold adjustment multiplier, and $Grad_{avg}$ is the mean pixel gradient of the current frame:

$$Grad_{avg} = \frac{1}{W \times H} \sum_{x=1}^{W} \sum_{y=1}^{H} Grad_{x,y}$$

where W is the horizontal width of the current image frame and H is its vertical height.
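Steps S11 to S14 can be sketched as follows in Python; the function names, the rounding choices, and the clipping to [0, ZZ] are one reading of the description, not code from the patent.

```python
import numpy as np

ZZ = 255  # normalization maximum; the patent suggests 127, 255, 511, or 1023

def texture_factor(grad, grad_thr, g_max):
    """Texture perception factor K_T of one LCU (steps S11-S13).
    grad: per-pixel gradient magnitudes of the LCU (M x N array).
    grad_thr: texture-rich decision threshold from the previous frame.
    g_max: maximum texture-rich gradient sum over the previous frame's LCUs."""
    rich = grad > grad_thr                     # S11: texture-rich pixel map
    grad_lcu = float(grad[rich].sum())         # S12: gradient sum of rich region
    k_t = int(min(ZZ, round(ZZ * grad_lcu / g_max)))  # S13: normalize to [0, ZZ]
    return k_t, grad_lcu

def next_grad_threshold(frame_grad, offset=0.0, alpha=None):
    """S14: threshold for the next frame from the current frame's mean
    gradient: Grad_avg + offset, or alpha * Grad_avg."""
    g_avg = float(frame_grad.mean())
    return alpha * g_avg if alpha is not None else g_avg + offset

# Toy LCU: an 8x8 block whose top-left 2x2 corner is strongly textured.
grad = np.zeros((8, 8))
grad[:2, :2] = 10.0
k_t, grad_lcu = texture_factor(grad, grad_thr=5.0, g_max=100.0)
thr = next_grad_threshold(np.full((4, 4), 4.0), offset=1.0)
```

Here the texture-rich region sums to 40; against a previous-frame maximum of 100 it normalizes to an integer K_T of about 0.4 × ZZ.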
Further, calculating the motion perception factor of the LCU in step S20 specifically comprises the following steps. Step S21: calculate the frame-difference magnitude $Diff_{x,y}$ between each pixel in the LCU and the corresponding pixel of the co-located LCU in the previous frame, and judge whether each pixel is motion-rich according to the motion-rich frame-difference decision threshold derived from the frame-difference information of the previous frame; the region inside the LCU composed of motion-rich pixels is the motion-rich region of the LCU. Step S22: calculate the frame-difference magnitude $Diff_{LCU}$ of the motion-rich region inside the LCU; while calculating $Diff_{LCU}$ for each LCU in the current frame, record their maximum value $D_{max}$. Step S23: calculate the area ratio $Area_{LCU}$ of the motion-rich region inside the LCU. Steps S22 and S23 may be performed in either order or simultaneously. Step S24: normalize the product of the frame-difference magnitude and the area ratio of the motion-rich region, using the maximum frame-difference magnitude from the previous frame as the normalization reference; the normalized value is the motion perception factor of the LCU. Step S25: calculate the motion-rich frame-difference decision threshold $Diff_{Thr}$ for the next frame from the mean frame-difference magnitude of the current frame.
Further, in step S21, if $Diff_{x,y} > Diff_{Thr}$ holds, the pixel is judged to be motion-rich; otherwise it is not. $Diff_{Thr}$ is the motion-rich frame-difference decision threshold derived from the frame-difference information of the previous frame.
Further, in step S22,

$$Diff_{LCU} = \sum_{\substack{(x,y)\in LCU \\ Diff_{x,y} > Diff_{Thr}}} Diff_{x,y}$$

where the LCU is M pixels in horizontal width and N pixels in vertical height, and $Diff_{LCU}$ is the sum of the frame-difference magnitudes of all motion-rich pixels inside the LCU.
Further, in step S23,

$$Area_{LCU} = \frac{ZZ}{M \times N} \sum_{x=1}^{M} \sum_{y=1}^{N} Mov_{x,y}$$

where $Area_{LCU}$ takes integer values in $[0, ZZ]$, corresponding to an area ratio of 0 to 100%; M is the horizontal width of the LCU and N is its vertical height; and $Mov_{x,y}$ takes the value 1 if the pixel is motion-rich and 0 otherwise.
Further, in step S24,

$$K_{M} = \min\!\left(ZZ,\ \left\lfloor \frac{Diff_{LCU} \times Area_{LCU}}{D_{max}} \right\rfloor\right)$$

where the normalized value $K_{M}$ (the motion perception factor) takes integer values in $[0, ZZ]$, with the integers 0 to ZZ representing the normalized range 0 to 1, and ZZ is the maximum value after normalization; $D_{max}$ is the maximum $Diff_{LCU}$ over the LCUs of the previous frame.
Further, in step S25, $Diff_{Thr} = Diff_{avg} + Diff_{offset}$, or $Diff_{Thr} = \beta \times Diff_{avg}$, where $Diff_{offset}$ is the motion-threshold adjustment offset, $\beta$ is the motion-threshold adjustment multiplier, and $Diff_{avg}$ is the mean frame-difference magnitude of the current frame:

$$Diff_{avg} = \frac{1}{W \times H} \sum_{x=1}^{W} \sum_{y=1}^{H} Diff_{x,y}$$

where W is the horizontal width of the image frame and H is its vertical height.
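Steps S21 to S24 admit a similar sketch; again the helper names, integer rounding, and the clip to [0, ZZ] are assumptions about details the text leaves open.

```python
import numpy as np

ZZ = 255  # normalization maximum

def motion_factor(cur, prev, diff_thr, d_max):
    """Motion perception factor K_M of one LCU (steps S21-S24).
    cur, prev: luma blocks of the LCU and its co-located LCU in the previous
    frame; d_max: maximum Diff_LCU over the previous frame's LCUs."""
    diff = np.abs(cur.astype(np.int32) - prev.astype(np.int32))   # S21
    mov = diff > diff_thr                        # motion-rich pixel map
    diff_lcu = int(diff[mov].sum())              # S22: frame-difference sum
    area_lcu = round(ZZ * mov.sum() / mov.size)  # S23: area ratio in [0, ZZ]
    k_m = int(min(ZZ, diff_lcu * area_lcu // d_max))  # S24: normalized product
    return k_m, diff_lcu, area_lcu

# Toy LCU: the top quarter of an 8x8 block moved, the rest is static.
cur = np.zeros((8, 8), dtype=np.uint8)
cur[:2, :] = 20
prev = np.zeros((8, 8), dtype=np.uint8)
k_m, diff_lcu, area_lcu = motion_factor(cur, prev, diff_thr=10, d_max=1000)
```

Note how both terms of the product contribute: sixteen moving pixels give a frame-difference sum of 320 and an area ratio of about a quarter of ZZ.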
Further, in step S30, the bit allocation weight of the LCU is calculated from its texture perception factor $K_{T}$ and motion perception factor $K_{M}$ as

$$\omega_{LCU} = \mu_{T} \times K_{T} + \mu_{M} \times K_{M}$$

where $\omega_{LCU}$ is the bit allocation weight of the LCU, taking integer values in $[0, ZZ]$ with the integers 0 to ZZ representing 0 to 1; $\mu_{T}$ is the weight coefficient of the LCU texture perception factor and $\mu_{M}$ the weight coefficient of the LCU motion perception factor, satisfying $\mu_{T} + \mu_{M} = 1$, $0 < \mu_{T} < 1$, and $0 < \mu_{M} < 1$.
Further, in step S30, after the bit allocation weight $\omega_{LCU}$ of an LCU in the current frame is calculated, the sum of the bit allocation weights of the LCUs in the current frame is replaced by the sum of the bit allocation weights of the LCUs in the previous frame, so that the proportion of the LCU's bit allocation weight within the whole image frame can be calculated in real time as

$$\frac{\omega_{LCU\_curr}}{\sum_{\{AllLCUsInPrevFrame\}} \omega_{LCU}}$$

from which the target number of coding bits of the LCU is calculated:

$$T_{LCU\_curr} = \left(T_{Pic} - Bit_{H}\right) \times \frac{\omega_{LCU\_curr}}{\sum_{\{AllLCUsInPrevFrame\}} \omega_{LCU}}$$
preferably ZZ takes the value of one of 127, 255, 511, or 1023.
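Step S30 can then be sketched as two small helpers; the use of the previous-frame weight sum as a plain scalar, and the budget term $T_{Pic} - Bit_{H}$, reflect one reading of the real-time substitution described above.

```python
def lcu_weight(k_t, k_m, mu_t=0.5):
    """omega_LCU = mu_T * K_T + mu_M * K_M, with mu_T + mu_M = 1."""
    mu_m = 1.0 - mu_t
    return mu_t * k_t + mu_m * k_m

def lcu_target_bits(t_pic, bit_h, w_cur, w_sum_prev):
    """Target bits of the current LCU: its share of the frame budget, taking
    the previous frame's weight sum as the real-time stand-in for the
    current frame's total (an assumption about step S30)."""
    return (t_pic - bit_h) * w_cur / w_sum_prev

w = lcu_weight(k_t=100, k_m=200, mu_t=0.5)          # -> 150.0
bits = lcu_target_bits(t_pic=10200, bit_h=200,
                       w_cur=w, w_sum_prev=1500.0)  # 10000 * 150 / 1500
```

Because the previous frame's sum is already known when the current LCU is reached, no whole-frame preprocessing pass is needed, which is the point of the substitution.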
The application also provides a code rate control device based on visual perception, comprising a texture perception factor calculation module, a motion perception factor calculation module, and an LCU bit allocation module. The texture perception factor calculation module calculates the gradient magnitude of the texture-rich region inside the LCU and takes the result as the texture perception factor of the LCU. The motion perception factor calculation module calculates the product of the frame-difference magnitude and the area ratio of the motion-rich region inside the LCU and takes the result as the motion perception factor of the LCU. The LCU bit allocation module calculates the bit allocation weight of the LCU from its texture perception factor and motion perception factor, and from this calculates the target number of coding bits of the LCU.
The technical effect of the invention is that the target bit allocation of the LCUs within an image frame better matches the subjective perception characteristics of the human eye, improving subjective visual quality, and the method is suitable for hardware implementation.
Drawings
Fig. 1 is a flow chart illustrating a code rate control method according to the present invention.
Fig. 2 is a schematic structural diagram of a code rate control apparatus according to the present invention.
Fig. 3 is a schematic flowchart illustrating a specific process of calculating the texture perception factor of the LCU in step S10.
Fig. 4 is a specific flowchart illustrating the calculation of the motion perception factor of the LCU in step S20.
The reference numbers in the figures illustrate: 10 is a texture perception factor calculation module, 20 is a motion perception factor calculation module, and 30 is an LCU bit allocation module.
Detailed Description
Referring to fig. 1, the method for controlling a code rate according to the present invention includes the following steps.
Step S10: and calculating the gradient amplitude of the texture rich area inside the LCU, and taking the calculation result as a texture perception factor of the LCU. The texture perception factor characterizes the degree of visual sensitivity of the human eye to the texture features of the LCU.
Step S20: calculate the product of the frame difference amplitude and the area ratio of the motion-rich region inside the LCU, and take the calculation result as the motion perception factor of the LCU. The motion perception factor characterizes the degree of visual sensitivity of the human eye to the motion characteristics of the LCU. Using this product to calculate the motion perception factor is an innovation of the present invention. The inventors found through experiments that if the motion richness of the LCU is characterized only by the frame difference amplitude of the motion-rich region inside the LCU, the measure is sensitive to small-area motion and easily affected by sensor noise, so the decision result is inaccurate. If the motion richness of the LCU is characterized only by the area ratio of the motion-rich region inside the LCU, the measure is very sensitive to slight changes of illumination and shadow in local regions of the image caused by foreground motion, which does not match the attention characteristics of the human eye. Characterizing the motion richness of the LCU by the product of the frame difference amplitude and the area ratio of the motion-rich region overcomes the drawbacks of either single measure and more accurately reflects the visual attention of the human eye to moving regions.
Step S10 and step S20 may be performed in either order, or simultaneously.
Step S30: and calculating the bit distribution weight of the LCU according to the texture perception factor and the motion perception factor of the LCU, and further calculating the target coding bit number of the LCU.
By adopting the code rate control method provided by the invention, the bit distribution of the LCU in the image frame can better accord with the characteristics of subjective perception of human eyes, and the subjective visual quality of the human eyes is improved.
Referring to fig. 2, the apparatus for controlling a bit rate according to the present invention includes a texture perceptual factor calculating module 10, a motion perceptual factor calculating module 20, and an LCU bit allocation module 30, which correspond to the method for controlling a bit rate shown in fig. 1 as a whole. The texture perception factor calculating module 10 is configured to calculate a gradient amplitude of the texture rich area inside the LCU, and use a calculation result as a texture perception factor of the LCU. The motion perception factor calculation module 20 is configured to calculate a product of a frame difference amplitude and an area ratio of the motion-rich area inside the LCU, and use the calculation result as a motion perception factor of the LCU. The LCU bit allocation module 30 calculates the bit allocation weight of the LCU according to the texture perception factor and the motion perception factor of the LCU, and further calculates the target coding bit number of the LCU.
Referring to fig. 3, the step of calculating the texture perception factor of the LCU in step S10 specifically includes the following steps.
Step S11: calculate the gradient amplitude Grad_{x,y} of each pixel in the LCU, and judge whether each pixel in the LCU is a texture-rich pixel according to the gradient decision threshold of texture-rich pixels obtained from the gradient information of the previous frame. If Grad_{x,y} > Grad_Thr is satisfied, the pixel is judged to be a texture-rich pixel; otherwise it is not. Grad_Thr is the gradient decision threshold of texture-rich pixels obtained from the gradient information of the previous frame, described in detail in the subsequent step S14. The region inside the LCU composed of texture-rich pixels is the texture-rich region inside the LCU.
Step S12: calculate the gradient amplitude Grad_LCU of the texture-rich region inside the LCU:
Grad_LCU = Σ_{x=1..M} Σ_{y=1..N} Grad_{x,y} · [Grad_{x,y} > Grad_Thr]
where M is the horizontal width of the LCU and N its vertical height, both in pixels, and [·] is 1 when the condition holds and 0 otherwise. Grad_LCU, the sum of the gradient amplitudes of all texture-rich pixels inside the LCU, reflects the gradient amplitude of the texture-rich region inside the LCU. After Grad_LCU has been calculated for each LCU in the current frame, the maximum value G_max is recorded for use in normalizing the Grad_LCU values of the LCUs of the next frame.
Step S13: normalize the gradient amplitude Grad_LCU of the texture-rich region inside the LCU, where the maximum value of the gradient amplitudes of the LCU-internal texture-rich regions of the previous frame is used as the reference for normalization in the current frame. The normalized value is the texture perception factor of the LCU:
K_T = min(ZZ, ⌊Grad_LCU × ZZ / G_max⌋)
where G_max is the maximum Grad_LCU over the LCUs of the previous frame, and K_T takes integer values in [0, ZZ] (brackets mean the endpoints are included). In each specific application scenario, ZZ is a fixed value. To facilitate the normalization operation in hardware, floating-point operations are avoided as far as possible: an integer between 0 and ZZ represents the normalization range of 0 to 1, with ZZ representing the maximum value after normalization. To keep the amount of computation low while ensuring accuracy, the value of ZZ is preferably 127, 255, 511, or 1023, which facilitates the hardware implementation.
Hereinafter, the texture perception factor of the LCU is abbreviated as K_T. K_T takes values in [0, ZZ]. The larger the value of K_T, the richer the texture inside the LCU and the more subjective visual attention the human eye pays to it; the smaller the value of K_T, the flatter the texture inside the LCU and the less attention the human eye pays to it.
Step S14: make itCalculating gradient decision threshold Grad of texture-rich pixels of the next frame by using the pixel gradient mean value of the current frameThrThe calculation formula is any one of the following. GradThr=Gradavg+GradoffsetOr GradThr=α×Gradavg. Wherein, GradoffsetAnd adjusting the deviation for the gradient threshold, wherein alpha is a multiplier for adjusting the gradient threshold, and the values of the two values can be adjusted according to the sensitivity of a user to the image texture region. GradavgIs the mean value of the pixel gradient of the current frame, and the calculation formula is as follows.
Figure BDA0003517687680000089
Wherein, W is the horizontal width of the current image frame, H is the vertical height of the current image frame, and the unit is the number of pixels.
In calculating the texture perception factor K_T of the LCU, the invention includes the following innovations. (1) Exploiting the correlation of image content between consecutive video frames, the pixel gradient mean computed over the previous frame, combined with the user adjustment parameter, serves as the gradient decision threshold of texture-rich pixels in the current frame, realizing adaptive judgment of whether a pixel inside an LCU of the current frame is a texture-rich pixel. (2) Likewise, the maximum gradient amplitude of the LCU-internal texture-rich regions computed over the previous frame serves as the normalization reference for the gradient amplitudes of the LCU-internal texture-rich regions in the current frame, realizing adaptive normalization, so that the distribution of gradient amplitudes across the LCUs of the current frame, and hence the distribution of K_T values, is more reasonable.
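The texture-perception path of steps S11 to S14 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the gradient operator (|horizontal difference| + |vertical difference|), ZZ = 255, the offset value, and all function names are assumptions.

```python
# Sketch of steps S11-S14 (texture perception factor). Assumptions: the gradient
# is |dx| + |dy| with clamped forward differences, ZZ = 255, names are illustrative.
ZZ = 255

def pixel_gradients(frame):
    """Per-pixel gradient amplitude Grad_{x,y} of a luma frame (list of rows)."""
    h, w = len(frame), len(frame[0])
    grad = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = abs(frame[y][min(x + 1, w - 1)] - frame[y][x])
            dy = abs(frame[min(y + 1, h - 1)][x] - frame[y][x])
            grad[y][x] = dx + dy
    return grad

def texture_factor(lcu_grads, grad_thr, g_max):
    """K_T of one LCU: sum the gradients of texture-rich pixels (step S12) and
    normalize to [0, ZZ] by the previous frame's maximum G_max (step S13)."""
    grad_lcu = sum(g for g in lcu_grads if g > grad_thr)
    return 0 if g_max == 0 else min(ZZ, grad_lcu * ZZ // g_max)

def next_grad_thr(grad, grad_offset=4):
    """Step S14 (offset form): Grad_Thr = Grad_avg + Grad_offset for the next frame."""
    h, w = len(grad), len(grad[0])
    return sum(map(sum, grad)) // (w * h) + grad_offset
```

Note that only integer arithmetic appears, matching the document's goal of avoiding floating point in hardware; the min() clamp covers the case where an LCU of the current frame exceeds the previous frame's maximum.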
Referring to fig. 4, the step of calculating the motion perception factor of the LCU in step S20 specifically includes the following steps.
Step S21: calculate the frame difference amplitude Diff_{x,y} between each pixel in the LCU and the corresponding pixel of the co-located LCU in the previous frame, and judge whether each pixel inside the LCU is a motion-rich pixel according to the frame difference decision threshold of motion-rich pixels obtained from the frame difference information of the previous frame. The frame difference amplitude is the absolute value of the difference between the luma values of the two corresponding pixels: in the frame (inter-frame) difference method, the luma values of corresponding pixels are subtracted and the absolute value of the difference is taken as the result for subsequent operations. If Diff_{x,y} > Diff_Thr is satisfied, the pixel is judged to be a motion-rich pixel; otherwise it is not. Diff_Thr is the frame difference decision threshold of motion-rich pixels obtained from the frame difference information of the previous frame, described in detail in the subsequent step S25. The region inside the LCU composed of motion-rich pixels is the motion-rich region inside the LCU.
Step S22: calculate the frame difference amplitude Diff_LCU of the motion-rich region inside the LCU:
Diff_LCU = Σ_{x=1..M} Σ_{y=1..N} Diff_{x,y} · [Diff_{x,y} > Diff_Thr]
where M is the horizontal width of the LCU and N its vertical height, in pixels. Diff_LCU, the sum of the frame difference amplitudes of all motion-rich pixels inside the LCU, reflects the frame difference amplitude of the motion-rich region of the LCU. After Diff_LCU has been calculated for each LCU in the current frame, the maximum value D_max is recorded for use in normalizing the product of Diff_LCU and Area_LCU for the LCUs of the next frame.
Step S23: calculate the area ratio Area_LCU of the motion-rich region inside the LCU:
Area_LCU = ⌊ZZ × Σ_{x=1..M} Σ_{y=1..N} Mov_{x,y} / (M × N)⌋
where Mov_{x,y} takes the value 1 if the pixel is a motion-rich pixel and 0 otherwise, M is the horizontal width of the LCU, and N its vertical height, in pixels. Area_LCU takes integer values in [0, ZZ], corresponding to an area ratio of 0% to 100%; the meaning and preferred values of ZZ are as before. The logical meaning of the formula is to compute the fraction of motion pixels among all pixels of the LCU and normalize it: summing Mov_{x,y} gives the total number of motion pixels inside the LCU.
Step S22 and step S23 may be performed in either order, or simultaneously.
Step S24: calculate the normalized value of the product of the frame difference amplitude and the area ratio of the motion-rich region inside the LCU, where the maximum frame difference amplitude of the LCU-internal motion-rich regions of the previous frame is used as the normalization reference for the product in the current frame. The normalized value is the motion perception factor of the LCU:
K_M = min(ZZ, ⌊Diff_LCU × Area_LCU / D_max⌋)
where D_max is the maximum Diff_LCU over the LCUs of the previous frame, and K_M takes values in [0, ZZ]. The meaning and preferred values of ZZ are as before.
Hereinafter, the motion perception factor of the LCU is abbreviated as K_M. K_M takes values in [0, ZZ]. The larger the value of K_M, the richer the motion inside the LCU and the more subjective visual attention the human eye pays to it; the smaller the value of K_M, the slighter the motion inside the LCU and the less attention the human eye pays to it.
Step S25: calculate the frame difference decision threshold Diff_Thr of motion-rich pixels for the next frame using the frame difference amplitude mean of the current frame, by either of the following formulas: Diff_Thr = Diff_avg + Diff_offset, or Diff_Thr = β × Diff_avg. Here Diff_offset is a motion threshold adjustment offset and β is a motion threshold adjustment multiplier; both can be tuned according to the user's sensitivity to moving image regions. Diff_avg is the frame difference amplitude mean of the current frame:
Diff_avg = (1 / (W × H)) × Σ_{x=1..W} Σ_{y=1..H} Diff_{x,y}
where W is the horizontal width and H the vertical height of the image frame, in pixels.
In calculating the motion perception factor K_M of the LCU, the invention includes the following innovations. (1) Exploiting the correlation of image content between consecutive video frames, the frame difference amplitude mean computed over the previous frame, combined with the user adjustment parameter, serves as the frame difference decision threshold of motion-rich pixels in the current frame, realizing adaptive judgment of whether a pixel inside an LCU of the current frame is a motion-rich pixel. (2) Likewise, the maximum frame difference amplitude of the LCU-internal motion-rich regions computed over the previous frame serves as the normalization reference for the product of the frame difference amplitude and the area ratio of the LCU-internal motion-rich regions in the current frame, realizing adaptive normalization, so that the distribution of frame difference amplitudes across the LCUs of the current frame, and hence the distribution of K_M values, is more reasonable. (3) Calculating the motion perception factor from the product of the frame difference amplitude and the area ratio of the LCU's motion-rich region more accurately reflects the visual attention of the human eye to moving regions.
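Steps S21 to S25 can be sketched analogously. This is a minimal illustration under assumptions: ZZ = 255, the normalization K_M = min(ZZ, Diff_LCU·Area_LCU/D_max) is our reading of step S24 (the original formula is rendered as an image), and the names and the β value are illustrative.

```python
# Sketch of steps S21-S25 (motion perception factor). Assumptions: ZZ = 255,
# K_M = min(ZZ, Diff_LCU * Area_LCU / D_max), names illustrative.
ZZ = 255

def motion_factor(cur, prev, diff_thr, d_max):
    """K_M of one LCU from its luma samples and the co-located samples of the
    previous frame (steps S21-S24)."""
    diffs = [abs(c - p) for c, p in zip(cur, prev)]   # frame difference amplitudes
    rich = [d for d in diffs if d > diff_thr]         # motion-rich pixels (S21)
    diff_lcu = sum(rich)                              # S22: sum of amplitudes
    area_lcu = ZZ * len(rich) // len(diffs)           # S23: area ratio in [0, ZZ]
    return 0 if d_max == 0 else min(ZZ, diff_lcu * area_lcu // d_max)

def next_diff_thr(frame_diffs, beta_num=3, beta_den=2):
    """Step S25 (multiplier form): Diff_Thr = beta * Diff_avg, beta = 1.5 here."""
    return sum(frame_diffs) * beta_num // (len(frame_diffs) * beta_den)
```

Because Area_LCU already carries the factor ZZ, dividing the product by D_max alone keeps K_M inside [0, ZZ] whenever Diff_LCU does not exceed the previous frame's maximum; the clamp handles the remaining case.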
In step S30, the texture perception factor K_T and the motion perception factor K_M of the LCU are combined, and the result is taken as the bit allocation weight of the LCU: ω_LCU = μ_T × K_T + μ_M × K_M. Here ω_LCU is the bit allocation weight of the LCU, taking values in [0, ZZ]; the meaning and preferred values of ZZ are as before. μ_T is the weight coefficient of the LCU texture perception factor and μ_M the weight coefficient of the LCU motion perception factor, satisfying μ_T + μ_M = 1, 0 < μ_T < 1, and 0 < μ M < 1.
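In hardware, the fractional coefficients μ_T and μ_M are conveniently handled in fixed point. A minimal sketch, assuming a 4-bit fraction (denominator 16) that the patent does not prescribe:

```python
# Fixed-point sketch of omega_LCU = mu_T*K_T + mu_M*K_M. The Q4 representation
# (mu_T = mu_t_q4 / 16) is an implementation assumption, not from the patent.
def bit_alloc_weight(k_t, k_m, mu_t_q4=8):
    """omega_LCU in [0, ZZ] with mu_T = mu_t_q4/16 and mu_M = 1 - mu_T."""
    assert 0 < mu_t_q4 < 16          # enforces 0 < mu_T < 1 and mu_T + mu_M = 1
    mu_m_q4 = 16 - mu_t_q4
    return (mu_t_q4 * k_t + mu_m_q4 * k_m) // 16
```

Since K_T and K_M both lie in [0, ZZ] and the coefficients sum to one, ω_LCU also lies in [0, ZZ], as the text requires.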
Since the content of consecutive video image frames is correlated, the sum of the bit allocation weights of all LCUs in a frame is correlated between adjacent frames, so the sum over the previous frame can be used to predict the sum over the current frame. Therefore, in step S30 of the present invention, after the bit allocation weight of an LCU in the current frame is calculated, the sum of the bit allocation weights counted over the previous frame is used in place of the sum over the current frame, and the proportion ρ_LCU of the LCU's bit allocation weight within the whole image frame is calculated in real time:
ρ_LCU = ω_LCU / Σ ω'_LCU
where the sum runs over all LCUs of the previous frame. The target number of coding bits of the LCU is then calculated as
T_LCU = T_frame × ρ_LCU
where T_frame is the number of bits allocated to the current frame. This belongs to the innovation of the invention.
In this way, the target coding bit allocation of each LCU in the current frame can be computed in real time, without preprocessing all LCUs of the whole frame before encoding, so no frame-level encoding delay is introduced. Meanwhile, because the bit allocation weight is calculated from the LCU of the current frame itself, the bit allocation of the LCU matches the subjective visual perception of the human eye for the current frame. If the content of a few consecutive frames differs greatly, the image-frame-level and GOP-level bit allocation algorithms of the HEVC rate control algorithm adjust for it.
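The real-time allocation described above can be sketched as follows; T_frame (the frame-level bit budget) and all names are illustrative assumptions.

```python
# Sketch of step S30's real-time allocation: the previous frame's weight sum stands
# in for the current frame's, so each LCU's target bits are known as soon as its own
# weight is computed. Names and the frame-budget handling are assumptions.
def lcu_target_bits(omega_lcu, prev_weight_sum, frame_target_bits):
    """T_LCU = T_frame * omega_LCU / (sum of the previous frame's weights)."""
    if prev_weight_sum == 0:
        return 0
    return frame_target_bits * omega_lcu // prev_weight_sum

def allocate_frame(weights, prev_weight_sum, frame_target_bits):
    """Allocate bits to every LCU of a frame; returns the targets plus this frame's
    weight sum, which becomes prev_weight_sum for the next frame."""
    targets = [lcu_target_bits(w, prev_weight_sum, frame_target_bits)
               for w in weights]
    return targets, sum(weights)
```

The accumulated weight sum of the current frame is carried forward as the divisor for the next frame, which is exactly why no preprocessing pass over the whole frame is needed.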
The invention provides an HEVC rate control method based on subjective human visual perception and suitable for hardware implementation, with the following beneficial effects. (1) The rate control method makes the bit allocation of the LCUs in an image frame better match the subjective perception characteristics of the human eye, improving subjective visual quality. (2) The calculation of the visual perception factors fully exploits the correlation of image content between consecutive video frames, so that the judgment of the texture-rich and motion-rich regions of the LCU is completed adaptively and the value distributions of the texture and motion perception factors are more reasonable. (3) In calculating the motion perception factor, using the product of the frame difference amplitude and the area ratio of the LCU's motion-rich region more accurately reflects the visual attention of the human eye to moving regions. (4) In the LCU bit allocation process, the correlation of image content between consecutive video frames enables real-time calculation of the proportion of each LCU's bit allocation weight within the whole image frame, so that the bit allocation matches the subjective visual perception of the current frame without adding an image-frame preprocessing stage. (5) The visual perception factor calculation is simple, with a small amount of computation and low bus bandwidth occupation, and is suitable for hardware implementation.
(6) In the invention, the calculation of the texture and motion perception factors of an LCU proceeds simultaneously with the coding of the LCU; no preprocessing stage is needed to compute the visual perception factors of all LCUs of a frame before the whole frame is coded, no frame-level delay is introduced, no extra bus bandwidth is consumed, and the method is suitable for hardware implementation.
To verify the beneficial effects of the invention, three YUV video streams from the HEVC standard test sequence Class E were selected for testing: Johnny, FourPeople, and KristenAndSara. All three belong to a typical video conference scene, and the region of subjective human attention is the face in each stream. In the experiment, the encoder used a constant bit rate control mode; the coding bit rate of Johnny and KristenAndSara was set to 600 kbps and that of FourPeople to 800 kbps; 120 frames were coded with an IPPP GOP structure, each P frame referring only to the previous frame. The rate control algorithm adopted in the HM, which computes LCU bit allocation weights from MAD values, served as the comparison baseline against the improved rate control algorithm based on subjective human visual perception proposed by the invention, comparing the PSNR (Peak Signal-to-Noise Ratio) of the face region in the images after coding the three YUV video streams. PSNR is measured in decibels (dB); the higher the PSNR, the smaller the image distortion and the better the image quality. The experimental results are shown in Table 1 below.
Table 1: test comparison table of the present invention and the existing code rate control method. (The table is rendered as images in the original publication; the face-region PSNR results are discussed in the following paragraph.)
In the three YUV video streams, the face region is both a motion-rich region and a texture-rich region compared with the background; that is, both its motion and texture characteristics are significant relative to the background. As can be seen from Table 1, after applying the rate control algorithm provided by the invention, with little change in the overall average bit rate and average PSNR of the coded stream, the face-region PSNR improves by 0.38 dB for Johnny, 0.38 dB for KristenAndSara, and 0.42 dB for FourPeople, effectively enhancing the subjectively perceived visual quality.
The above are merely preferred embodiments of the present invention, and are not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (18)

1. A code rate control method based on visual perception is characterized by comprising the following steps:
step S10: calculating the gradient amplitude of the texture rich area inside the LCU, and taking the calculation result as a texture perception factor of the LCU;
step S20: calculating the product of the frame difference amplitude and the area ratio of the motion rich area in the LCU, and taking the calculation result as the motion perception factor of the LCU;
the step S10 and the step S20 being performed in either order or simultaneously;
step S30: and calculating the bit distribution weight of the LCU according to the texture perception factor and the motion perception factor of the LCU, and further calculating the target coding bit number of the LCU.
2. The method of claim 1, wherein the texture perception factor of the LCU characterizes a degree of visual sensitivity of a human eye to texture features of the LCU; the larger the texture perception factor value of the LCU is, the richer the texture inside the LCU is, and the more attention is paid to the subjective vision of human eyes; the smaller the texture perception factor value of the LCU is, the flatter the texture inside the LCU is, and the less attention is paid to the subjective vision of human eyes.
3. The code rate control method based on visual perception according to claim 1, wherein the motion perception factor of the LCU characterizes a degree of visual sensitivity of human eyes to a motion characteristic of the LCU; the larger the value of the motion perception factor of the LCU is, the richer the motion inside the LCU is, and the more attention is paid to the subjective vision of human eyes; the smaller the value of the motion perception factor of the LCU is, the lighter the motion inside the LCU is, and the less attention is paid to the subjective vision of human eyes.
4. The method for controlling bitrate based on visual perception according to claim 1, wherein the step of calculating the texture perception factor of the LCU in the step S10 specifically comprises the following steps:
step S11: calculating the gradient amplitude Grad_{x,y} of each pixel in the LCU, and judging whether each pixel in the LCU is a texture-rich pixel according to the gradient decision threshold of texture-rich pixels obtained from the gradient information of the previous frame; the region formed by the texture-rich pixels in the LCU is the texture-rich region in the LCU;
step S12: calculating the gradient amplitude Grad_LCU of the texture-rich region inside the LCU; after calculating Grad_LCU for each LCU in the current frame, recording the maximum value G_max;
step S13: normalizing the gradient amplitude Grad_LCU of the texture-rich region inside the LCU, wherein the maximum value of the gradient amplitudes of the LCU-internal texture-rich regions of the previous frame is used as the reference for normalization of the gradient amplitudes of the LCU-internal texture-rich regions in the current frame; the normalized value is the texture perception factor of the LCU;
step S14: calculating the gradient decision threshold Grad_Thr of texture-rich pixels of the next frame using the pixel gradient mean of the current frame.
5. The method for controlling code rate according to claim 4, wherein in step S11, if Grad_{x,y} > Grad_Thr is satisfied, the pixel is judged to be a texture-rich pixel, and otherwise it is judged not to be a texture-rich pixel; Grad_Thr is the gradient decision threshold of texture-rich pixels derived from the gradient information of the previous frame.
6. The method for code rate control based on visual perception according to claim 4, wherein in step S12,
Grad_LCU = Σ_{x=1..M} Σ_{y=1..N} Grad_{x,y} · [Grad_{x,y} > Grad_Thr],
where M is the horizontal width of the LCU, N is the vertical height of the LCU, and Grad_LCU is the sum of the gradient amplitudes of all texture-rich pixels inside the LCU.
7. The method for rate control based on visual perception according to claim 4, wherein in step S13,
K_T = min(ZZ, ⌊Grad_LCU × ZZ / G_max⌋),
wherein G_max is the maximum value of Grad_LCU over the LCUs of the previous frame; K_T takes integer values in [0, ZZ], an integer between 0 and ZZ is used to represent the normalized range of 0 to 1, and ZZ represents the maximum value after normalization.
8. The method for controlling code rate according to claim 4, wherein in step S14, Grad_Thr = Grad_avg + Grad_offset, or Grad_Thr = α × Grad_avg; wherein Grad_offset is a gradient threshold adjustment offset, α is a gradient threshold adjustment multiplier, and Grad_avg is the pixel gradient mean of the current frame:
Grad_avg = (1 / (W × H)) × Σ_{x=1..W} Σ_{y=1..H} Grad_{x,y},
where W is the horizontal width of the current image frame and H is the vertical height of the current image frame.
9. The method for controlling bitrate based on visual perception according to claim 1, wherein the step of calculating the motion perception factor of the LCU in the step S20 specifically comprises the following steps:
step S21: calculating the frame difference amplitude Diff_{x,y} between each pixel in the LCU and the corresponding pixel of the co-located LCU in the previous frame, and judging whether each pixel in the LCU is a motion-rich pixel according to the frame difference decision threshold of motion-rich pixels obtained from the frame difference information of the previous frame; the region formed by the motion-rich pixels in the LCU is the motion-rich region in the LCU;
step S22: calculating the frame difference amplitude Diff_LCU of the motion-rich region inside the LCU; after calculating Diff_LCU for each LCU in the current frame, recording the maximum value D_max;
step S23: calculating the area ratio Area_LCU of the motion-rich region inside the LCU;
the step S22 and the step S23 being performed in either order or simultaneously;
step S24: calculating the normalized value of the product of the frame difference amplitude and the area ratio of the motion-rich region inside the LCU, wherein the maximum value of the frame difference amplitudes of the LCU-internal motion-rich regions of the previous frame is used as the normalization reference of the product of the frame difference amplitude and the area ratio of the LCU-internal motion-rich region in the current frame; the normalized value is the motion perception factor of the LCU;
step S25: calculating the frame difference decision threshold Diff_Thr of motion-rich pixels of the next frame using the frame difference amplitude mean of the current frame.
10. The method for rate control based on visual perception according to claim 9, wherein in step S21, if Diff_{x,y} > Diff_Thr is satisfied, the pixel is judged to be a motion-rich pixel, and otherwise it is judged not to be a motion-rich pixel; Diff_Thr is the frame difference decision threshold of motion-rich pixels derived from the frame difference information of the previous frame.
11. The method for code rate control based on visual perception according to claim 9, wherein in step S22,
Diff_LCU = Σ_{x=1..M} Σ_{y=1..N} Diff_{x,y} · [Diff_{x,y} > Diff_Thr],
where M is the horizontal width of the LCU, N is the vertical height of the LCU, and Diff_LCU is the sum of the frame difference amplitudes of all motion-rich pixels inside the LCU.
12. The method for code rate control based on visual perception according to claim 9, wherein in step S23,
Area_LCU = ⌊ZZ × Σ_{x=1..M} Σ_{y=1..N} Mov_{x,y} / (M × N)⌋,
wherein Area_LCU takes integer values in [0, ZZ], corresponding to an area ratio of 0% to 100%; M is the horizontal width of the LCU and N is the vertical height of the LCU; Mov_{x,y} takes the value 1 if the pixel is a motion-rich pixel and 0 otherwise.
13. The method for code rate control based on visual perception according to claim 9, wherein in step S24,
K_M = min(ZZ, ⌊Diff_LCU × Area_LCU / D_max⌋),
wherein K_M takes integer values in [0, ZZ], an integer between 0 and ZZ is used to represent the normalized range of 0 to 1, and ZZ represents the maximum value after normalization; D_max is the maximum value of Diff_LCU over the LCUs of the previous frame.
14. The method for rate control based on visual perception of claim 9, wherein in step S25, Diff_Thr = Diff_avg + Diff_offset or Diff_Thr = β × Diff_avg; wherein Diff_offset is a motion threshold adjustment offset, β is a motion threshold adjustment multiplier, and Diff_avg is the average frame difference amplitude of the current frame:

Diff_avg = ( Σ_{x=0}^{W-1} Σ_{y=0}^{H-1} Diff_{x,y} ) / (W × H)

where W is the horizontal width of the image frame and H is the vertical height of the image frame.
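The threshold update of claim 14 can be sketched as below; concrete values for the offset and multiplier are tuning choices, not given by the patent:

```python
def next_diff_thr(diff, offset=None, beta=None):
    """Claim 14: derive the next frame's motion threshold from the mean
    frame difference over the whole W x H frame, either as
    Diff_avg + Diff_offset or as beta * Diff_avg."""
    w, h = len(diff[0]), len(diff)
    diff_avg = sum(sum(row) for row in diff) / (w * h)
    return diff_avg + offset if offset is not None else beta * diff_avg

# Example on a 2x2 frame (Diff_avg = 4.5): either variant of the claim.
thr_add = next_diff_thr([[5, 1], [9, 3]], offset=4)   # 8.5
thr_mul = next_diff_thr([[5, 1], [9, 3]], beta=2.0)   # 9.0
```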
15. The method for controlling bitrate based on visual perception of claim 1, wherein in said step S30, the bit allocation weight of the LCU is calculated from the texture perception factor K_T and the motion perception factor K_M of the LCU as: ω_LCU = μ_T × K_T + μ_M × K_M; wherein ω_LCU is the bit allocation weight of the LCU, taking an integer value in [0, ZZ], the integers between 0 and ZZ representing 0 to 1; μ_T is the weight coefficient of the LCU texture perception factor and μ_M is the weight coefficient of the LCU motion perception factor, which satisfy μ_T + μ_M = 1, 0 < μ_T < 1 and 0 < μ_M < 1.
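The weighted combination in claim 15 can be sketched as follows (μ_T = 0.6 is an arbitrary illustrative split satisfying 0 < μ_T < 1, not a value from the patent):

```python
def bit_weight(k_t, k_m, mu_t=0.6, zz=255):
    """Claim 15: omega_LCU = mu_T*K_T + mu_M*K_M with mu_T + mu_M = 1.
    K_T, K_M and the result are integers in [0, ZZ] standing for 0..1."""
    mu_m = 1.0 - mu_t
    return min(round(mu_t * k_t + mu_m * k_m), zz)

# Example: texture factor 100, motion factor 50.
omega = bit_weight(100, 50)  # 0.6*100 + 0.4*50 = 80
```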
16. The method for rate control based on visual perception of claim 1, wherein in step S30, after the bit allocation weight ω_LCU of an LCU in the current frame is calculated, the sum of the bit allocation weights of the LCUs in the current frame is replaced by the sum of the bit allocation weights of the LCUs in the previous frame, so that the proportion of the LCU's bit allocation weight within the whole image frame can be calculated in real time:

P_LCU = ω_LCU / Σ ω_LCU(previous frame)

and the target number of coding bits of the LCU is further calculated from it:

Bit_LCU = Bit_frame × P_LCU

where Bit_frame is the target number of coding bits of the current frame.
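A sketch of claim 16's real-time allocation; `frame_bits` (the current frame's target bit budget) and the other names are assumptions:

```python
def lcu_target_bits(omega_lcu, prev_omega_sum, frame_bits):
    """Claim 16: divide by the PREVIOUS frame's weight sum so each LCU's
    share can be computed as soon as its own weight is known, without
    waiting for all current-frame weights."""
    return round(frame_bits * omega_lcu / prev_omega_sum)

# Example: weight 80 out of a previous-frame sum of 8000, 100 kbit budget.
bits = lcu_target_bits(80, 8000, 100_000)  # 1000
```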
17. The method for rate control based on visual perception according to any one of claims 7, 12, 13 and 15, wherein ZZ takes one of the values 127, 255, 511 or 1023.
18. A code rate control device based on visual perception, characterized by comprising a texture perception factor calculation module, a motion perception factor calculation module and an LCU bit allocation module;
the texture perception factor calculation module is used for calculating the gradient amplitude of the texture-rich region inside the LCU and taking the calculation result as the texture perception factor of the LCU;
the motion perception factor calculation module is used for calculating the product of the frame difference amplitude and the area ratio of the motion-rich region inside the LCU and taking the calculation result as the motion perception factor of the LCU;
and the LCU bit allocation module is used for calculating the bit allocation weight of the LCU according to the texture perception factor and the motion perception factor of the LCU, and further calculating the target number of coding bits of the LCU.
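The three modules of claim 18 could be wired together as below (a hypothetical sketch; all names and the μ_T = 0.6 split are assumptions, not from the patent):

```python
def allocate_frame(k_t_list, k_m_list, prev_omega_sum, frame_bits,
                   mu_t=0.6, zz=255):
    """Claim 18 pipeline: per-LCU texture/motion perception factors in,
    per-LCU target coding bit counts out."""
    mu_m = 1.0 - mu_t
    targets = []
    for k_t, k_m in zip(k_t_list, k_m_list):
        omega = min(round(mu_t * k_t + mu_m * k_m), zz)  # bit allocation weight
        targets.append(round(frame_bits * omega / prev_omega_sum))
    return targets

# Example: a single LCU with K_T = 100, K_M = 50, previous weight sum 8000.
bits = allocate_frame([100], [50], 8000, 100_000)  # [1000]
```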
CN202210171159.4A 2022-02-23 2022-02-23 Code rate control method and device based on visual perception Pending CN114666585A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210171159.4A CN114666585A (en) 2022-02-23 2022-02-23 Code rate control method and device based on visual perception
PCT/CN2022/123742 WO2023159965A1 (en) 2022-02-23 2022-10-08 Visual perception-based rate control method and device

Publications (1)

Publication Number Publication Date
CN114666585A true CN114666585A (en) 2022-06-24

Family

ID=82027311


Country Status (2)

Country Link
CN (1) CN114666585A (en)
WO (1) WO2023159965A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023159965A1 (en) * 2022-02-23 2023-08-31 翱捷科技股份有限公司 Visual perception-based rate control method and device

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117274820B (en) * 2023-11-20 2024-03-08 深圳市天成测绘技术有限公司 Map data acquisition method and system for mapping geographic information
CN118381937B (en) * 2024-06-21 2024-09-20 北京联合大学 Self-adaptive QP (program guide) adjustment method, system and device for point cloud video

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101827267B (en) * 2010-04-20 2012-07-04 上海大学 Code rate control method based on video image segmentation technology
CN105681793B (en) * 2016-01-06 2018-10-23 四川大学 Based on bit rate control method in the adaptive extremely low delay high-performance video coding frame of complexity of video content
CN112291564B (en) * 2020-11-20 2021-09-14 西安邮电大学 HEVC intra-frame code rate control method for optimizing and monitoring video perception quality
CN114666585A (en) * 2022-02-23 2022-06-24 翱捷科技股份有限公司 Code rate control method and device based on visual perception




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination