US8199814B2 - Estimation of I frame average rate quantization parameter (QP) in a group of pictures (GOP) - Google Patents
- Publication number
- US8199814B2 (application US 12/103,470)
- Authority
- US
- United States
- Prior art keywords
- intra
- rate
- bit rate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/177—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a group of pictures [GOP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/137—Motion inside a coding unit, e.g. average field, frame or block difference
- H04N19/139—Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/149—Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/18—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/91—Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
Definitions
- This invention pertains generally to video encoding, and more particularly to intra mode decisions within advanced video encoding (such as H.264/AVC or MPEG 4 Part 10) standards.
- H.264/AVC alternatively known by MPEG 4 Part 10 and several other monikers, is representative of improved data compression algorithms. This improved data compression, however, comes at the price of greatly increased computational requirements during the encoding processing phase.
- One aspect of the invention is a method of Rate-QP estimation for an I picture, comprising: (a) providing an input group of pictures (GOP); (b) selecting an input I picture within the input group of pictures; and (c) outputting, to a computer readable medium, a bit rate corrected Rate-QP, R(QP), for the input I picture.
- the outputting step may comprise: (a) calculating an intra luma (Y) Rate-QP estimate from an intra luma (Y) histogram; (b) calculating an intra chroma (C) Rate-QP estimate from an intra chroma (C) histogram; (c) offsetting the intra chroma (C) Rate-QP estimate to form an offset intra chroma (C) estimate; and (d) setting a Rate-QP for the input I picture to a sum of: (i) the intra luma (Y) Rate-QP estimate; and (ii) the offset intra chroma (C) Rate-QP estimate.
- the step of outputting the bit rate corrected Rate-QP may comprise: (a) correcting the Rate-QP of the input I picture to produce the bit rate corrected Rate-QP, R(QP).
- the method of correcting the bit rate corrected Rate-QP step may comprise: (a) partitioning a set of ordered pairs of (QP, Rate-QP) into a plurality of correction regions; (b) applying mapping functions for QP values in each of the correction regions to produce the bit rate corrected Rate-QP, R(QP).
- the plurality of correction regions may comprise: (a) a high bit rate correction region; (b) a medium bit rate correction region; and (c) a low bit rate correction region.
- Within these correction regions, one may apply a linear interpolation for QP values in the high bit rate correction region, a medium bit rate correction for QP values in the medium bit rate correction region, and a low bit rate correction for QP values in the low bit rate correction region.
- the low bit rate correction may be based on entropic or other considerations presented in this invention.
- these bit rate correction functional mappings are continuous in output values and first derivatives in a region of overlap, so as to result in smooth corrections.
- the intra luma (Y) histogram, and the intra chroma (C) histogram described above are accumulated, for every macroblock in the group of pictures, in steps comprising: (a) forming an estimate of a set of intra prediction coefficients; (b) for each macroblock, separating the set of intra prediction coefficients into an output accumulated intra luma (Y) histogram and an accumulated intra chroma (C) histogram.
- the selection of the intra mode may comprise: (a) selecting the intra mode that has a lowest Sum of Absolute Transformed Differences (SATD) among intra modes using a set of inputs $[x]$, $H_{pos}$, $V_{pos}$, $\vec{h}$, and $\vec{v}$; (b) wherein $[x]$ is a 4×4 block of pixels within the input I picture and
- $H_{pos}$ is a horizontal pixel position of the 4×4 block within the image
- $V_{pos}$ is a vertical pixel position of the 4×4 block within the image
- the process of selecting the lowest SATD intra mode step may comprise: (a) calculating a horizontal predictor $\vec{H} = (H_0, H_1, H_2, H_3)^T$, a vertical predictor $\vec{V} = (V_0, V_1, V_2, V_3)$, and a steady state (DC) predictor $D$; (b) calculating a horizontal cost precursor $C_{hs}$ and a vertical cost precursor $C_{vs}$ using the horizontal predictor $\vec{H}$, the vertical predictor $\vec{V}$, and the steady state (DC) predictor $D$; and (c) calculating a horizontal intra mode cost $C_H$, a vertical intra mode cost $C_V$, and a steady state (DC) intra mode cost $C_D$ using the horizontal cost precursor $C_{hs}$ and the vertical cost precursor $C_{vs}$.
- the method of calculating the horizontal predictor $\vec{H}$, the vertical predictor $\vec{V}$, and the steady state (DC) predictor $D$ may comprise:
- the method of calculating the horizontal cost precursor $C_{hs}$ and the vertical cost precursor $C_{vs}$ may comprise:
- the method of calculating the horizontal intra mode cost $C_H$ may comprise calculating
- the method of calculating the vertical intra mode cost $C_V$ may comprise calculating
- the lowest SATD intra mode may be selected with a lowest associated intra mode cost among the group consisting of: the horizontal intra mode cost $C_H$, the vertical intra mode cost $C_V$, and the steady state (DC) intra mode cost $C_D$.
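The mode selection described above can be illustrated with a minimal sketch. The predictor and cost-precursor formulas are not reproduced in this text, so the sketch does not use them. As a stand-in, it computes a conventional Hadamard-based SATD directly in the spatial domain for the horizontal, vertical, and DC predictions and picks the cheapest. The function names, the Hadamard matrix, and the DC rounding rule `(sum(h) + sum(v) + 4) >> 3` are assumptions chosen for illustration, not the patent's frequency-domain method.

```python
# Illustrative only: spatial-domain Hadamard SATD, not the patent's
# frequency-domain predictor / cost-precursor computation.
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]


def matmul(a, b):
    """Multiply two 4x4 matrices given as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def satd4x4(block, pred):
    """Sum of absolute Hadamard-transformed differences of a 4x4 block."""
    diff = [[block[i][j] - pred[i][j] for j in range(4)] for i in range(4)]
    t = matmul(matmul(H4, diff), transpose(H4))
    return sum(abs(c) for row in t for c in row)


def predictions(h, v):
    """Build horizontal, vertical, and DC predictions.

    h: the 4 pixels to the left of the block (top to bottom).
    v: the 4 pixels above the block (left to right).
    """
    dc = (sum(h) + sum(v) + 4) >> 3  # rounded mean of the 8 neighbours (assumed)
    return {
        "horizontal": [[h[i]] * 4 for i in range(4)],
        "vertical": [list(v) for _ in range(4)],
        "dc": [[dc] * 4 for _ in range(4)],
    }


def select_intra_mode(block, h, v):
    """Return (best_mode, costs) using the lowest SATD among the three modes."""
    costs = {mode: satd4x4(block, pred) for mode, pred in predictions(h, v).items()}
    return min(costs, key=costs.get), costs
```

For a block whose rows are constant and equal to the left neighbours, the horizontal prediction is exact, so its cost is zero and it is selected.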
- a computer readable medium comprising a programming executable capable of performing on a computer the various steps described above.
- an advanced video encoder apparatus may comprise the methods described above.
- a Rate-QP estimator apparatus for an I picture may comprise: (a) an input for a data stream comprising a group of pictures (GOP); (b) means for processing an input I picture within the input group of pictures to calculate a bit rate corrected Rate-QP, R(QP), for the input I picture; and (c) a computer readable medium output comprising the bit rate corrected Rate-QP, R(QP), for the input I picture.
- the means for processing may comprise: an executable computer program resident within a program computer readable medium.
- the means for processing step may comprise: (a) means for estimating a set of accumulated histograms of transform coefficients of the input I picture; and (b) means for estimating the bit rate corrected Rate-QP, R(QP), from the set of accumulated histograms of transform coefficients.
- FIGS. 1A and 1B are a flow chart showing how the R(QP) function is estimated from the histogram of the transform coefficients of an input picture.
- FIG. 2 is a flow chart of an execution model of an advanced video encoder comprising an encoder front end and an encoder back end.
- FIG. 3 is a flow chart of how four histograms of Discrete Cosine Transform (DCT) coefficients are generated and collected for each B picture in the R(QP) model.
- FIG. 4 is a flow chart of how four histograms of DCT coefficients are generated and collected for each P picture in the R(QP) model.
- FIG. 5 is a flow chart of how two histograms of DCT coefficients are generated and collected for each I picture in the R(QP) model.
- FIG. 6 is a flow chart of a 4 pixel normalization transform, and a 4×4 block normalized transform, both with scaling.
- FIG. 7A is a flow chart of an NDCT transform of a set of 4 pixels into a normalized NDCT transform of the 4 pixels.
- FIG. 7B is a flow chart of a normalized NDCT transform of a 4×4 block of pixels.
- FIG. 8 is a flow chart of an improved intra mode selection method.
- FIG. 9A is a matrix of the 4×1 vector $\vec{h}$ to the left of the 4×4 block and the 1×4 element vector $\vec{v}$ above the 4×4 block.
- FIG. 9B is a matrix of the left normalized transform coefficients and the top normalized transform coefficients that correspond to the left 4×1 and top 1×4 elements of FIG. 9A, which depicts the relationship between the spatial and frequency domain intra predictors for the horizontal and vertical modes.
- FIG. 10 is a flowchart that details the computation of the frequency domain predictors for the intra vertical, horizontal, and steady state (or DC) intra modes.
- FIG. 11 is a flowchart that predicts the SATD costs of the various horizontal, vertical, or DC predictions. Using these costs, the intra normalized DCT coefficients with the least SATD are output.
- FIG. 12 is a flowchart showing how the forward motion vector (MV) from the forward motion estimator (FME) is used to obtain the normalized forward predicted DCT coefficients.
- FIG. 13 is a flowchart showing how the backward motion vector from the FME is used to obtain the normalized backward predicted DCT coefficients.
- FIG. 15 is a flowchart that shows the bi-directionally predicted DCT coefficients are the average of the forward and backward predicted DCT coefficients.
- FIG. 16 is a flowchart that shows how to estimate the I picture R(QP) relationship from transform coefficient histograms.
- FIG. 17 is a flowchart that shows how to estimate the P or B picture R(QP) relationships from transform coefficient histograms.
- FIG. 18 is a graph that shows how to physically interpret the three models used in different regions of the bit rate estimation, with the ordinate being the quantization parameter (QP), and the abscissa being the rate based on the quantization parameter R(QP).
- FIG. 19 is a flow chart showing that the estimation of R(QP) relationship process has two parts. First, the number of non-zero coefficients at a given QP is estimated. Second, the number of non-zero coefficients is multiplied by 5.5 to provide an initial R(QP) estimate.
- FIG. 20 is a flow chart showing that the number of non-zero coefficients at a given QP is obtained by linear interpolation of the points on the graph that consists of the number of coefficients with value k, and the minimum value of QP that would quantize k to one.
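The two-part estimate described for FIGS. 19 and 20 can be sketched as follows. This is a minimal illustration that assumes the graph points are already available as `(QP, non-zero count)` pairs sorted by QP. How those points are built from the histogram is not shown here, and the function names and the clamping at the ends of the QP range are hypothetical choices. Only the linear interpolation and the factor of 5.5 come from the text.

```python
from bisect import bisect_right

BITS_PER_NONZERO = 5.5  # multiplier stated in the description of FIG. 19


def interpolate_nonzero(points, qp):
    """Linearly interpolate the non-zero coefficient count at a given QP.

    points: list of (qp, count) pairs sorted by increasing qp (assumed input).
    Values outside the covered QP range are clamped to the end points.
    """
    qps = [p[0] for p in points]
    if qp <= qps[0]:
        return float(points[0][1])
    if qp >= qps[-1]:
        return float(points[-1][1])
    i = bisect_right(qps, qp)
    (q0, m0), (q1, m1) = points[i - 1], points[i]
    t = (qp - q0) / (q1 - q0)
    return m0 + t * (m1 - m0)


def initial_rate_estimate(points, qp):
    """Initial R(QP) estimate in bits: 5.5 times the non-zero coefficient count."""
    return BITS_PER_NONZERO * interpolate_nonzero(points, qp)
```

With points `[(10, 1000), (20, 400), (30, 100)]`, a QP of 15 falls halfway between the first two points, giving 700 non-zero coefficients and an initial estimate of 3850 bits.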
- For illustrative purposes, the present invention is embodied in the apparatus generally shown in FIG. 1A through FIG. 23.
- It will be appreciated that the apparatus may vary as to configuration and as to details of the parts, and that the method may vary as to the specific steps and sequence, without departing from the basic concepts as disclosed herein.
- Computer means any device capable of performing the steps or methods, or producing the signals, described herein, including but not limited to: a microprocessor, a microcontroller, a video processor, a digital state machine, a field programmable gate array (FPGA), a digital signal processor, a collocated integrated memory system with microprocessor and analog or digital output device, a distributed memory system with microprocessor and analog or digital output device connected by digital or analog signal protocols.
- Computer readable medium means any source of organized information that may be processed by a computer to perform the steps described herein to result in, store, perform logical operations upon, or transmit, a flow or a signal flow, including but not limited to: random access memory (RAM), read only memory (ROM), a magnetically readable storage system; optically readable storage media such as punch cards or printed matter readable by direct methods or methods of optical character recognition; other optical storage media such as a compact disc (CD), a digital versatile disc (DVD), a rewritable CD and/or DVD; electrically readable media such as programmable read only memories (PROMs), electrically erasable programmable read only memories (EEPROMs), field programmable gate arrays (FPGAs), flash random access memory (flash RAM); and information transmitted by electromagnetic or optical methods including, but not limited to, wireless transmission, copper wires, and optical fibers.
- SATD means the Sum of Absolute Transformed Differences.
- In H.264/AVC, a series of 4×4 blocks is transformed rather than performing more processor-intensive 8×8 or 16×16 transforms.
- GOP means Group of Pictures.
- a GOP is usually about 15 frames long in an NTSC system.
- the length of a GOP can vary depending on editing needs.
- the length of a GOP represents the editing capability of an MPEG signal. If an edit occurs within a GOP, an MPEG decoder/recoder will be needed to reclose the GOP.
- a GOP is defined as a consecutive sequence of pictures with any combination of I, P, and B pictures.
- CABAC means Context-adaptive binary arithmetic coding.
- Context-adaptive variable-length coding (CAVLC) means a method for the coding of quantized transform coefficient values that is a lower-complexity alternative to CABAC. Despite having a lower complexity than CABAC, CAVLC is more elaborate and more efficient than the methods typically used to code coefficients in other prior designs.
- I, P, B frames mean the three major picture types found in typical video compression designs. They are I(ntra) (or key) pictures, P(redicted) pictures, and B(i-predictive) pictures (or B(i-directional) pictures). They are also commonly referred to as I frames, P frames, and B frames. In older reference documents, the term “bi-directional” rather than “bi-predictive” is dominant.
- Y means the luminance (or luma) signal or information present in an image. It is the black and white portion that provides brightness information for the image.
- C means the chrominance (or chroma) signal or information present in an image. It is the color portion that provides hue and saturation information for the image.
- SD means standard definition video.
- HD means high definition video.
- Two-dimensional DCT means the Discrete Cosine Transformation of a 2D spatial domain representation into a two-dimensional (2D) frequency domain representation by use of Discrete Cosine Transform coefficients. This process is typically used in MPEG and JPEG image compression.
- Quantization means the conversion of a discrete signal (a sampled continuous signal) into a digital signal by quantizating. Both of these steps (sampling and quantizing) are performed in analog-to-digital converters with the quantization level specified in bits.
- A specific example would be compact disc (CD) audio, which is sampled at 44,100 Hz and quantized with 16 bits (2 bytes), giving one of 65,536 (i.e., $2^{16}$) possible values per sample.
- “Quantizating”, in digital signal processing parlance, means the process of approximating a continuous range of values (or a very large set of possible discrete values) by a relatively-small set of discrete symbols or integer values. More specifically, a signal can be multi-dimensional and quantization need not be applied to all dimensions. Discrete signals (a common mathematical model) need not be quantized, which can be a point of confusion.
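As a small worked illustration of the definition above, the sketch below maps a continuous sample to one of the $2^{16}$ integer levels used in the CD audio example. The function names, the input range of -1.0 to 1.0, and the signed two's-complement output range are assumptions made for the example.

```python
def num_levels(bits):
    """Number of distinct values representable with the given bit depth."""
    return 1 << bits


def quantize_sample(x, bits=16):
    """Map a sample in [-1.0, 1.0] to a signed integer level (assumed layout)."""
    half = num_levels(bits) // 2
    q = int(round(x * half))
    # Clamp so that +1.0 maps to the largest representable level.
    return max(-half, min(half - 1, q))
```

With 16 bits there are 65,536 levels, so a sample of 0.5 maps to level 16384 and the extreme inputs map to -32768 and 32767.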
- A novel method for estimating the number of non-zero quantized transform coefficients $M$ as a function of the quantization parameter QP is to estimate it from the histogram of the DCT coefficients. Let $x$ be the absolute amplitude of a DCT coefficient and let the histogram $P(x)$ be the frequency of occurrence of DCT coefficients with absolute amplitude $x$ in a picture. Then the number of non-zero quantized coefficients as a function of the quantization parameter is
- $$M(QP) = \int_{Q(x,QP) \ge 1} P(x)\, dx$$
- where $Q(x,QP)$ is the quantized value of $x$ with quantization parameter QP.
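For a discrete histogram, the count of non-zero quantized coefficients becomes a sum of the histogram bins whose amplitude quantizes to at least one. The sketch below illustrates this under stated assumptions: the quantizer is a hypothetical uniform quantizer whose step size doubles every 6 QP (the common approximation `0.625 * 2**(QP/6)` for H.264), and the rounding offset of 1/3 is likewise an illustrative choice. Neither is taken from this text.

```python
import math


def qstep(qp):
    """Approximate quantizer step size; doubles every 6 QP (assumed model)."""
    return 0.625 * 2.0 ** (qp / 6.0)


def quantize(x, qp, rounding=1.0 / 3.0):
    """Quantized value Q(x, QP) of absolute amplitude x (assumed quantizer)."""
    return int(math.floor(x / qstep(qp) + rounding))


def nonzero_count(hist, qp):
    """M(QP): sum of P(x) over all amplitudes x with Q(x, QP) >= 1.

    hist: dict mapping absolute amplitude x to its frequency P(x).
    """
    return sum(count for x, count in hist.items() if quantize(x, qp) >= 1)
```

For the histogram `{0: 100, 1: 50, 2: 30, 8: 5}`, the count falls from 85 at QP 4 to 5 at QP 16 and to 0 at QP 28 under this assumed quantizer, showing how M(QP) decreases as the step size grows.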
- FIG. 1A shows that the rate estimation algorithm has two parts 100 .
- An input picture stream 102 is used in the first part to generate estimates of the histogram of the DCT coefficients 104 of the input picture 102 , which results in an output histogram of the transform coefficients 106 .
- the transform coefficient histogram 106 is used as an input to a second stage 108 , which estimates and outputs the rate R as a function of the quantization parameter QP, R(QP), 110 from the histogram.
- FIG. 1B shows that the result is then used 112 as a bit rate corrected Rate-QP, R(QP), for the input P picture in various example manners, as seen in blocks 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 138, 140, 142, and 250.
- The bit estimation algorithm is best described with the following simplified execution model of an advanced video encoder.
- An advanced video encoder 200 consists of a front end 202 and a back end 204.
- The front end 202 comprises a forward motion estimator (FME) 206 and a Picture Type Determiner (PTD) 208.
- The back end 204 comprises a Forward and Backward Motion Encoder (FBME) that also performs Mode decisions and Macroblock (MM) coding 210. The outputs of the FBME/MM 210 are coded in a Coding block 212.
- the bit estimation method presented here takes place within the encoder front end 202 . No information from the back end 204 is necessary in the bit estimation process.
- pictures are read by FME 206 where the forward motion estimation 206 is performed by using the original pictures 214 as reference pictures.
- the PTD 208 determines the picture type and group of picture structure.
- FBME/MM 210 re-computes the forward and backward motion vectors when needed based on the reconstructed pictures.
- the FBME/MM 210 additionally performs the mode decisions and macroblock coding. Based on the information from FBME/MM 210 , the Coding 212 block generates the final output bit stream 216 .
- the histogram and bit estimation for each picture 214 is performed in FME 206 .
- the method here computes three bit estimates: (1) one I picture estimate, (2) one P picture estimate, and (3) one B picture estimate. In this way, no assumption is made regarding the picture type and the GOP structure in the picture bit estimation.
- Such parallel calculations are also well suited for customized video processors or other computers that are capable of parallel pipeline calculations.
- the GOP bit estimation is performed after PTD 208 .
- At that point, the picture type and GOP structure are known. Therefore, that information is used to select the corresponding bit estimate out of the I, P, and B bit estimates of a picture 214.
- the GOP bit estimation is obtained by summing up the bit estimates of each picture in a GOP with the corresponding picture type.
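The selection and summation just described can be sketched in a few lines. The data layout (one dictionary of I, P, and B estimates per picture, plus a list of picture types decided by the PTD) and the function name are hypothetical, chosen only to illustrate the idea.

```python
def gop_bit_estimate(picture_estimates, picture_types):
    """Sum, over a GOP, the bit estimate matching each picture's decided type.

    picture_estimates: list of dicts with keys "I", "P", "B" (bits per picture).
    picture_types: list of "I", "P", or "B", one per picture, in the same order.
    """
    if len(picture_estimates) != len(picture_types):
        raise ValueError("one picture type is required per picture")
    return sum(est[ptype] for est, ptype in zip(picture_estimates, picture_types))


# Usage: three pictures, each with all three estimates precomputed.
estimates = [
    {"I": 9000, "P": 4000, "B": 2000},
    {"I": 8800, "P": 3900, "B": 1900},
    {"I": 9100, "P": 4100, "B": 2100},
]
total = gop_bit_estimate(estimates, ["I", "B", "P"])  # 9000 + 1900 + 4100
```

Because all three estimates are computed before the picture type is known, the selection step is only a lookup, which matches the statement that no assumption about GOP structure is needed during per-picture estimation.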
- the FME computes the forward motion estimation of the input picture in display order of a video sequence with N pictures.
- bit estimation is performed with one frame (two fields) delay except for the first and last frame (field pairs).
- the one frame (two fields) delay is inserted in the bit estimation within the FME so that the current input picture may be used as the backward reference picture.
- the forward motion field from FME of picture 5 is converted into backward motion field of picture 3 , and then bit estimation is performed on picture 3 .
- picture 1 is used for forward motion compensation and current input picture 5 is used for backward motion compensation.
- Table 1 shows the timing diagram of FME for encoding field pictures. Since the first field pair and the last field pair in display order cannot be encoded as B pictures, only I and P picture bit estimation is performed for the first and last field pairs. The I/P/B bit estimation starts two fields after the first field pair. Three bit estimates are then computed for each picture, one for each of the I/P/B picture types.
- Refer to FIGS. 3, 4, and 5 for the I/P/B picture bit estimation flowcharts, in which a total of ten histograms of the DCT coefficients are collected.
- an estimate of the intra prediction coefficients 302 is generated, as well as the estimate of the forward prediction coefficients 304 , and the estimate of the backward prediction coefficients 306 .
- This step is generally referred to as estimating the transform coefficients step 308 .
- From the estimate of the intra prediction coefficients 302 is output an intra prediction macroblock coefficient set 310 .
- From the estimate of the forward prediction coefficients 304 an output of the forward predicted macroblock coefficients 312 is determined.
- An adder 314 sums the forward predicted macroblock coefficients 312, the backward predicted macroblock coefficients 316, and 1.
- the output of the adder 314 is divided by two to form an estimate of the bi-directional predicted macroblock coefficients. The forward 312, backward 316, and bi-directional estimates are then input to a forward/backward/bi-directional decision using the lowest SATD 318. From the intra prediction macroblock coefficient set 310 and the output of the forward/backward/bi-directional decision 318, an intra/non-intra decision is made using the lowest SATD 320.
- the luminance and chrominance are separated (with separators 322 and 324) from the output of the intra/non-intra decision 320 to form four histograms: an accumulated intra Y histogram 328, an accumulated intra C histogram 330, an accumulated non-intra Y histogram 332, and an accumulated non-intra C histogram 334.
- FIG. 3 shows that four histograms are collected, as collect histograms 326 , for each B picture Rate-QP model.
- In FIG. 4, similar to the B picture of FIG. 3, another four histograms are collected 400 for a P picture.
- the estimate of the intra prediction coefficients 402 and estimate of the forward prediction coefficients 404 are used to generate the four histograms: an accumulated intra Y histogram 406 , an accumulated intra C histogram 408 , an accumulated non-intra Y histogram 410 , and an accumulated non-intra C histogram 412 .
- FIG. 5 is a flow chart 500 for generating the histograms for the I picture, where only two histograms are collected.
- an estimate for the intra prediction coefficients 502 is used to generate two histograms: an accumulated intra Y histogram 504 , and an accumulated intra C histogram 506 .
- the estimation of histograms for the I, P, and B models is similar.
- the estimations of the I and P picture histogram may be interpreted as simplifications of the B picture histogram estimation process.
- the first commonality among the I/P/B bit estimations in FIGS. 3-5 is that the histograms of the luminance and chrominance blocks are collected separately. This is because the quantization parameters for luminance and chrominance may be different.
- the second commonality is that the intra macroblocks and non-intra macroblocks are collected separately into separate histograms. This is because the dead zones in the intra quantizer and the non-intra quantizer are typically different.
- the third commonality is that the forward/backward/bi-directional mode decisions and intra/non-intra mode decisions are all based on SATD.
- the mode with the minimum SATD is selected to be accumulated to the associated histogram.
- the fourth commonality is that I, P, and B picture models share the same estimate of the intra DCT coefficients. Additionally, the P and B picture models share the same forward predicted DCT coefficients.
- the fifth commonality is that normalized transforms are used to obtain the estimates of the transform coefficients.
- the normalized transform is a normalized form of the transform within the advanced video coder (AVC) that has scaling properties such that each transform coefficient results in the same amplification.
- Normalized transforms are used in the histogram estimation steps described above in FIGS. 3-5 .
- In FIG. 6, a flowchart of a normalized transform is shown; it is a transform with uniform scaling so that each transform coefficient has the same amplification.
- FIG. 6 is a flow chart of the transformations 600 of both a 4 pixel vector and a 4×4 block of pixels.
- the normalized transform NDCT 4 (s) is computed by the following steps:
- Step 2 normalize the coefficients at 606 :
- $S_i = \begin{cases} 4 \times S'_i, & i \in \{0, 2\} \\ 4 \times (41449 \times S'_i) / 2^{16}, & i \in \{1, 3\} \end{cases}$, which may also be referred to as the N4 function 606, as shown in FIG. 6.
- the output of the 4 pixel 602 normalized transform is 608 .
- $[y] = \begin{bmatrix} y_{0,0} & y_{0,1} & y_{0,2} & y_{0,3} \\ y_{1,0} & y_{1,1} & y_{1,2} & y_{1,3} \\ y_{2,0} & y_{2,1} & y_{2,2} & y_{2,3} \\ y_{3,0} & y_{3,1} & y_{3,2} & y_{3,3} \end{bmatrix}$.
- the normalized transform $NDCT_{4 \times 4}([y])$ is computed by the following steps:
- Step 1 compute DCT of [y] as
- Step 2 normalize the coefficients at step N4×4 614 to produce a normalized transform 616 of the input 4×4 block 610:
- $Y_{i,j} = \begin{cases} Y'_{i,j}, & (i,j) \in \{(0,0), (0,2), (2,0), (2,2)\} \\ (26214 \times Y'_{i,j}) / 2^{16}, & (i,j) \in \{(1,1), (1,3), (3,1), (3,3)\} \\ (41449 \times Y'_{i,j}) / 2^{16}, & \text{otherwise} \end{cases}$
- In FIG. 6, there are two major steps for the 4 pixel input 602 and the 4×4 block input 610: first a transform step 618, then a scaling, or normalizing, step 620.
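The two steps (transform, then scaling) can be sketched as below. The core matrix `H` is an assumption: the page does not reproduce it, so the sketch uses the well-known 4-point forward integer transform of H.264/AVC. Floor rounding in the division by 2^16 and the separable form `H y H^T` for the 4×4 case are also assumptions.

```python
import numpy as np

# Assumed core matrix: the 4-point forward integer transform of H.264/AVC.
H = np.array([[1,  1,  1,  1],
              [2,  1, -1, -2],
              [1, -1, -1,  1],
              [1, -2,  2, -1]], dtype=np.int64)

def ndct4(s):
    """Transform a 4-pixel vector, then apply the N4 scaling."""
    sp = H @ np.asarray(s, dtype=np.int64)            # transform step: S' = [H] s
    out = 4 * sp                                      # rows 0 and 2: 4 * S'
    out[[1, 3]] = (4 * 41449 * sp[[1, 3]]) // 2**16   # rows 1 and 3 (floor assumed)
    return out

def ndct4x4(y):
    """Transform a 4x4 block, then apply the position-dependent N4x4 scaling."""
    yp = H @ np.asarray(y, dtype=np.int64) @ H.T      # assumed separable 2-D transform
    out = np.empty_like(yp)
    for i in range(4):
        for j in range(4):
            if i % 2 == 0 and j % 2 == 0:
                out[i, j] = yp[i, j]                       # even/even positions
            elif i % 2 == 1 and j % 2 == 1:
                out[i, j] = (26214 * yp[i, j]) // 2**16    # odd/odd positions
            else:
                out[i, j] = (41449 * yp[i, j]) // 2**16    # mixed positions
    return out

flat = np.full((4, 4), 128)
print(ndct4(flat[0]))        # only the DC term is non-zero
print(ndct4x4(flat)[0, 0])   # DC term of the 4x4 transform
```

Under these assumptions a flat input of value 128 yields a DC coefficient of 2048 in both transforms, so a 4-pixel neighbor vector and a 4×4 block are directly comparable in the frequency domain.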
- FIG. 7A is a flow chart of a normalized NDCT 4 transform of an input 4 pixel group into a normalized transform of the 4 pixel group.
- the X i,j will be described further later.
- FIG. 8 describes an overview of a method of determining a set of optimal intra normalized DCT coefficients 800 .
- an input 4×4 block of pixels 802 is used as an input to the 4×4 normalized DCT 804 to produce the 4×4 DCT output 806.
- This output 806 will be used subsequently as described below.
- the top 4×1 pixels 808 (the 4 top elements immediately above the input 4×4 block of pixels 802) are used as input into a NDCT 4 normalized DCT 810 to produce a vertical prediction DCT output 812.
- the left 1×4 pixels 814 (the 4 left elements immediately left of the input 4×4 block of pixels 802) are used as input into a NDCT 4 normalized DCT transform 816 to produce a horizontal prediction DCT output 818.
- NDCT 4 normalized DCT vertical 812 and horizontal 818 predictions are used to estimate the steady state, or DC prediction 822.
- the following inputs are compared 824 to determine the optimal intra mode prediction 826: 1) the 4×4 normalized DCT block transform output 806; 2) the vertical prediction normalized DCT output 812; 3) the horizontal prediction normalized DCT output 818; and 4) the DC prediction 822.
- the intra predictions are computed in frequency domain; the DC prediction is derived from the horizontal and vertical predictions. And, finally, the prediction residue with the minimal SATD is selected as the output of the intra mode selection process.
- FIGS. 9A and 9B, taken together, describe the relationship 900 between the spatial and frequency domain intra predictors for the horizontal and vertical modes.
- an initial spatial domain representation (in FIG. 9A) of a 4×4 block of pixels 902 is shown as [x] with spatial elements $x_{i,j}$, where $i, j \in \{0, 1, 2, 3\}$.
- the frequency domain representation (in FIG. 9B) of the 4×4 transformation 904 is shown as the transformed matrix [X], with elements $X_{i,j}$, where $i, j \in \{0, 1, 2, 3\}$.
- FIG. 10 details a flowchart 1000 for the computation of the frequency domain predictors for the intra vertical, horizontal, and steady state (or DC) modes.
- if $H_{pos} = 0$ (i.e., not $H_{pos} \ne 0$ at 1006) and $V_{pos} \ne 0$ (at 1012), then:
- the pixels can only take on 8 bits of information.
- next the cost is calculated 1020 .
- FIG. 11 depicts the computation of the costs of the various horizontal, vertical, or DC predictions 1100 and, using these, outputs the selected intra mode with the least SATD.
- This evaluation is first provided with the $\vec{H}$, $\vec{V}$, and D values determined above, as well as the input 4×4 pixel block [x] 1102.
- the appropriate intra mode is selected from the group of Horizontal Prediction, Vertical Prediction, and DC Prediction.
- the intra mode with the minimal cost is selected as the intra prediction mode and the corresponding DCT coefficients are replaced by the prediction error to obtain the prediction residue.
- the prediction residue associated with the minimal cost prediction selected among $C_H$, $C_V$, and $C_D$ is then output as the appropriate associated predicted residue. From this point, the selected intra prediction residue is used within the advanced video coder to compress the 4×4 block.
- the method for obtaining 1200 the forward predicted DCT coefficients is as follows.
- the method for obtaining 1300 the backward predicted DCT coefficients is as follows. This method is similar to the method used in the forward predicted DCT coefficient calculation.
- FIG. 14 depicts the relationship between forward and backward motion vectors of a specific macroblock 1400 .
- the backward motion vector 1402 is derived in the following manner.
- the motion vector (−mvx, −mvy) is assigned as the backward motion vector 1402 of the macroblock at ($\tilde{x}$, $\tilde{y}$) 1410 and the status of the backward motion vector is marked as valid.
- FIG. 15 is a flow chart 1500 showing how the bi-directionally predicted DCT coefficients are the average of the forward and the backward predicted DCT coefficients.
- $X_f(i,j)$ 1502, $X_b(i,j)$ 1504, and $X_{bi}(i,j)$ 1506, $0 \le i, j \le 3$, are respectively the forward 1502, backward 1504, and bi-directionally motion compensated 1506 DCT coefficients.
- the forward predicted DCT coefficients 1502 are selected by default in the motion mode decision 318 , as shown in FIG. 3 .
- motion mode decisions are performed for the estimation of the B picture histograms.
- the motion mode decision 318 makes a selection among the forward 304 , the backward 306 , and the bi-directionally predicted DCT coefficient 314 for further processing.
- the motion type with the minimum sum of absolute value on the 16 blocks of 4×4 luminance transform coefficients in a macroblock is selected.
- intra/non-intra decisions with SATD 320 are performed for the estimation of the B and P picture histograms.
- the mode decision makes a selection among the intra predicted and motion predicted DCT coefficients for further processing.
- the macroblock with the minimum sum of absolute transformed values of the 16 blocks of 4×4 luminance transform coefficients is selected to estimate the histograms.
- each histogram, for b bits per luma sample, is accumulated in an integer array P of size $(2^b - 1) \times 16 \times 5 + 1$ (i.e., 255 × 16 × 5 + 1 for 8 bits/sample).
- the array P is initialized to zero at the beginning of a picture.
- the intra luma (Y) histogram 504 and intra chroma (C) histogram 506 are collected and processed according to the flow chart 1600 in FIG. 16 .
- the intra luma (Y) histogram 504 and an Intra signal 1602 are input into a luma estimator for Rate-QP 1604 to output $\tilde{R}_{IY}(QP)$ for all QP.
- the intra chroma (C) histogram 506 and an Intra signal 1606 are input into a chroma estimator for Rate-QP 1608 to output $\tilde{R}_{IC}(QP)$ for all QP.
- the output from the chroma estimator for Rate-QP 1608 is then processed by QP Offset 1610 to output $\tilde{R}_{IC}(QP + QP_{offset})$ for all QP.
- the outputs from the QP Offset 1610 and the luma estimator for Rate-QP 1604 are added 1612 to output $\tilde{R}_{IY}(QP) + \tilde{R}_{IC}(QP + QP_{offset})$ for all QP and used as inputs into the bit rate correction section, starting with the Medium Bit Rate Correction block 1614.
- in the Medium Bit Rate Correction block 1614, additional information is used as input: the Picture Type and Size, and whether Context Adaptive Variable-Length Coding (CAVLC) is being used.
- the output is passed through the high bit rate correction block 1616 if the picture is found to have a high bit rate at small QP; otherwise that block is bypassed 1618. The result then passes through the low bit rate correction block 1620 when the conditions for low bit rate correction at large QP are met; otherwise that block is also bypassed 1622. This yields the rate $R_I(QP)$ relationship of an I picture.
- intra luma histograms, intra chroma histograms, non-intra luma histograms, and non-intra chroma histograms are collected from FIG. 3 for B pictures or FIG. 4 for P pictures.
- These four histograms are then input, with their respective intra or non-intra quantizations (1710, 1712, 1714, and 1716), to the process of FIG. 17 to estimate R(QP) for all QP of a P/B picture. They proceed through similar R(QP) estimation blocks 1718, with or without QP Offsets 1720, and then through bit rate corrections 1722 to produce either $R_P(QP)$ or $R_B(QP)$ 1724, depending on whether a P or a B picture is being processed.
- the Rate-QP estimate of an I picture is obtained as shown in the flowchart 1600 of FIG. 16 .
- an initial luma R(QP) estimate 1604 is obtained from the intra luma histogram 504
- an initial chroma R(QP) estimate 1608 is obtained from the intra chroma histogram 506 . Since the AVC supports chroma offset on the quantization parameter, the initial chroma R(QP) estimate is offset 1610 and added 1612 to the initial luma R(QP) estimate 1604 to form the initial R(QP) estimate of the I picture prior to bit rate correction.
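The offset-and-add step can be sketched as follows. This is only an illustration: the function name, the list representation of the rate curves, and the clamping of QP + QP_offset to the range 0..51 are assumptions not stated in the text.

```python
def combine_luma_chroma(rate_y, rate_c, qp_offset, qp_max=51):
    """Initial I-picture estimate: R_IY(QP) + R_IC(QP + QP_offset) for all QP.

    rate_y and rate_c are sequences indexed by QP = 0..qp_max.
    """
    total = []
    for qp in range(qp_max + 1):
        # Clamping the shifted chroma index is an assumption of this sketch.
        qp_c = min(max(qp + qp_offset, 0), qp_max)
        total.append(rate_y[qp] + rate_c[qp_c])
    return total

# Illustrative, monotonically decreasing rate curves (not real data).
rate_y = [100 - qp for qp in range(52)]
rate_c = [52 - qp for qp in range(52)]
combined = combine_luma_chroma(rate_y, rate_c, qp_offset=2)
print(combined[0], combined[51])
```

The combined curve is what the medium, high, and low bit rate corrections then operate on.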
- a medium bit rate correction 1614 is applied to the estimate, followed by a high bit rate correction 1616 when conditions are met, and then finally a low bit rate correction 1620 to improve the accuracy of the bit estimation if needed.
- I picture R(QP) estimation and the P/B picture R(QP) estimation have the same building blocks.
- the building blocks are:
- FIG. 18 shows a graphical interpretation of the bit rate correction process as a graph of R(QP) versus QP 1800.
- three different bit rate estimation models are used.
- a medium bit rate model is used for $QP_1 \le QP \le QP_2$ 1802.
- a linear high bit rate model is used for $0 \le QP \le QP_1$ 1804.
- a low bit rate model is used for $QP_2 \le QP \le 51$.
- FIG. 19 is a flow chart 1900 showing how the initial Rate-QP estimate $\tilde{R}(QP)$ 1902 is derived from an input histogram 1904.
- M(QP) is the number of non-zero coefficients quantized with parameter QP 1906.
- $\tilde{R}(QP)$ 1902 provides an initial rough estimate of the bit rate as a function of the quantization parameter QP.
- FIG. 20 is a flowchart 2000 that shows how the number of non-zero DCT coefficients M(QP) 2002, as a function of QP, is estimated from the histogram of the DCT coefficients 2004 with the following steps:
- an approximated condition for a quantized coefficient to be non-zero can be determined as follows:
- Let Q be the quantization parameter of an advanced video encoder. Then define $Q_M \equiv Q \bmod 6$ and $Q_E \equiv Q // 6$, where // denotes integer divide.
- the advanced video encoder quantizer is defined as
- N(r) can be interpreted as the scaling factor that normalizes the integer DCT in H.264.
- $R(QP) = \left(R_0 - \bar{R}(QP_1)\right) \times \dfrac{QP_1 - QP}{QP_1 - 0} + \bar{R}(QP_1)$ for $0 \le QP \le QP_1$, which linearly interpolates the rate between QP = 0 and $QP_1$.
- for the I picture estimate in FIG. 21, it is the sum of the chroma and the luma entropy estimates.
- Each chroma/luma entropy estimate is derived from its corresponding histogram.
- Let $\tilde{E}_0$ 2302 be the rate at QP = 0. It is estimated by the following steps:
- bit estimation at lower bit rates may be improved when certain conditions are met.
- the values of bpp_lower are listed in Tables 3-5.
- $R(QP) = \dfrac{\bar{R}(QP_2)}{2^{\frac{QP - QP_2}{51 - QP_2} \times \log_2\left(\frac{\bar{R}(QP_2)}{R_{51}}\right)}}$ for $QP_2 \le QP \le 51$.
- M is the number of macroblocks in a picture
- $N = N_Y + N_C$
- $N_Y$, $N_C$ are the numbers of luma and chroma transform coefficients in a picture.
- R min is the minimum bits per macroblock.
- the parameters R min , e, and f for CAVLC and CABAC are shown in Table 6.
- the standard deviation σ is derived from the histogram of the luma and chroma transform coefficients in an I picture, where the luma histogram is $P_Y[k]$, and the chroma histogram is $P_C[k]$.
- [1] Stéphane Mallat and Frederic Falzon, “Analysis of Low Bit Rate Image Transform Coding,” IEEE Trans on Signal Processing, vol. 46, no. 4, pp. 1027-1042, April 1998.
- [2] Zhihai He and Sanjit K. Mitra, “A unified rate-distortion analysis framework for transform coding,” IEEE Trans on Circuits and Systems for Video Technology, vol. 11, no. 12, pp. 1221-1236, December 2001.
(c) wherein $H_{pos}$ is a horizontal pixel position of the 4×4 block within the image; (d) wherein $V_{pos}$ is a vertical pixel position of the 4×4 block within the image; (e) wherein $\vec{h}$ is a vector immediately left of the 4×4 block [x], defined as $\vec{h} \equiv (x_{0,-1}, x_{1,-1}, x_{2,-1}, x_{3,-1})^T$ relative to the indexing of the elements of [x]; (f) wherein $\vec{v}$ is a vector immediately above the 4×4 block [x], defined as $\vec{v} \equiv (x_{-1,0}, x_{-1,1}, x_{-1,2}, x_{-1,3})^T$ relative to the indexing of the elements of [x]; and (g) wherein the lowest SATD intra mode is determined among a group comprising: (i) a horizontal intra mode; (ii) a vertical intra mode; and (iii) a steady state (DC) intra mode.
-
- (i) setting $\vec{H} \equiv (H_0, H_1, H_2, H_3)^T = [NDCT_4]\,\vec{h}$
- where $\vec{h} \equiv (h_0, h_1, h_2, h_3)^T \equiv (x_{0,-1}, x_{1,-1}, x_{2,-1}, x_{3,-1})^T$;
- (ii) setting $\vec{V} \equiv (V_0, V_1, V_2, V_3)^T = [NDCT_4]\,\vec{v}$
- where $\vec{v} \equiv (v_0, v_1, v_2, v_3)^T \equiv (x_{-1,0}, x_{-1,1}, x_{-1,2}, x_{-1,3})^T$;
- (iii) setting $D = (H_0 + V_0)/2$;
-
- (i) setting $\vec{H} = (2^{15} - 1, 0, 0, 0)^T$;
- (ii) setting $\vec{V} \equiv (V_0, V_1, V_2, V_3)^T = [NDCT_4]\,\vec{v}$
- where $\vec{v} \equiv (v_0, v_1, v_2, v_3)^T \equiv (x_{-1,0}, x_{-1,1}, x_{-1,2}, x_{-1,3})^T$;
- (iii) setting $D = V_0$;
-
- (i) setting $\vec{H} \equiv (H_0, H_1, H_2, H_3)^T = [NDCT_4]\,\vec{h}$
- where $\vec{h} \equiv (h_0, h_1, h_2, h_3)^T \equiv (x_{0,-1}, x_{1,-1}, x_{2,-1}, x_{3,-1})^T$;
- (ii) setting $\vec{V} = (2^{15} - 1, 0, 0, 0)^T$;
- (iii) setting $D = H_0$; and
-
- (i) setting $\vec{H} = (2^{15} - 1, 0, 0, 0)^T$;
- (ii) setting $\vec{V} = (2^{15} - 1, 0, 0, 0)^T$; and
- (iii) setting D = 128 × 16.
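The four cases above can be sketched as one function. Several points are assumptions of this sketch rather than statements of the source: the mapping of each case to a position test (a neighbor is treated as unavailable when the corresponding position is 0), the core matrix `H4` (taken to be the H.264/AVC 4-point integer transform), floor rounding, and all function names.

```python
import numpy as np

H4 = np.array([[1, 1, 1, 1], [2, 1, -1, -2],
               [1, -1, -1, 1], [1, -2, 2, -1]], dtype=np.int64)  # assumed matrix
UNAVAILABLE = np.array([2**15 - 1, 0, 0, 0], dtype=np.int64)     # sentinel vector

def ndct4(s):
    """Normalized 4-point transform: gain 4 on rows 0, 2 and 4*41449/2^16 on rows 1, 3."""
    sp = H4 @ np.asarray(s, dtype=np.int64)
    scale = np.array([4 * 65536, 4 * 41449, 4 * 65536, 4 * 41449], dtype=np.int64)
    return (scale * sp) // 65536

def predictors(h, v, h_pos, v_pos):
    """Return (H, V, D) for a block with left neighbors h and top neighbors v."""
    left_ok, top_ok = h_pos != 0, v_pos != 0      # assumed availability test
    Hv = ndct4(h) if left_ok else UNAVAILABLE.copy()
    Vv = ndct4(v) if top_ok else UNAVAILABLE.copy()
    if left_ok and top_ok:
        D = (Hv[0] + Vv[0]) // 2                  # both neighbors: average of DC terms
    elif top_ok:
        D = Vv[0]                                 # only the top row is available
    elif left_ok:
        D = Hv[0]                                 # only the left column is available
    else:
        D = 128 * 16                              # no neighbors: mid-gray default
    return Hv, Vv, int(D)

print(predictors([128] * 4, [128] * 4, h_pos=4, v_pos=4)[2])
print(predictors(None, [50] * 4, h_pos=0, v_pos=4)[2])
```

With these assumptions, mid-gray neighbors (value 128) give D = 2048, the same value as the no-neighbor default 128 × 16, so the default behaves like a mid-gray neighborhood.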
(b) calculating the horizontal cost precursor
and
(c) calculating the vertical cost precursor
where Q(x,QP) is the quantized value of x with quantization parameter QP.
$S = [S_0, S_1, S_2, S_3]^T = NDCT_4(s)$
$\vec{S}' = [S'_0, S'_1, S'_2, S'_3]^T = [H]\,\vec{s}$ at 604 where
where is referred to as the DCT4.
which may also be referred to as the
the cost of the vertical prediction is
and the cost of the DC prediction is $C_D = |D - X_{0,0}| + C_{hs} + C_{vs}$.
$\tilde{x} = ((x + mvx + 8) // 16) \times 16$
$\tilde{y} = ((y + mvy + 8) // 16) \times 16$
where // is an integer divide.
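A minimal sketch of this projection follows. It assumes that positions and motion vectors share the same pixel units and that macroblocks are keyed by their top-left corner; the function names, the dictionary representation, and the picture-bounds check are hypothetical. Python's `//` floors toward negative infinity, which may differ from the source's integer divide for negative operands.

```python
def project_backward_mv(x, y, mvx, mvy):
    """Snap the displaced macroblock position to the 16-pixel grid and negate the vector."""
    xt = ((x + mvx + 8) // 16) * 16
    yt = ((y + mvy + 8) // 16) * 16
    return (xt, yt), (-mvx, -mvy)

def backward_field(forward, width, height):
    """forward: {(x, y): (mvx, mvy)} keyed by macroblock origin.

    Returns the macroblocks that received a valid backward motion vector.
    """
    backward = {}
    for (x, y), (mvx, mvy) in forward.items():
        (xt, yt), mv = project_backward_mv(x, y, mvx, mvy)
        if 0 <= xt < width and 0 <= yt < height:   # bounds check is an assumption
            backward[(xt, yt)] = mv                # presence in the dict marks it valid
    return backward

print(project_backward_mv(32, 16, 5, -9))
print(backward_field({(32, 16): (5, -9), (0, 0): (-40, 0)}, width=64, height=32))
```

Macroblocks that no forward vector lands on simply have no entry, which corresponds to a backward vector that was never marked valid.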
$X_{bi}(i,j) = (X_f(i,j) + X_b(i,j) + 1) >> 1$.
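The rounded average, together with the motion mode decision it feeds, can be sketched as below. The sketch works on a single coefficient array rather than the 16 luma blocks of a macroblock, and the function names are hypothetical. Ties resolve to the first candidate, which here is the forward prediction.

```python
import numpy as np

def bidirectional(xf, xb):
    """Rounded average of forward and backward predicted DCT coefficients."""
    xf = np.asarray(xf, dtype=np.int64)
    xb = np.asarray(xb, dtype=np.int64)
    return (xf + xb + 1) >> 1          # arithmetic shift, floors negative sums

def motion_mode(xf, xb):
    """Pick the candidate with the smallest sum of absolute coefficients."""
    candidates = {
        "forward": np.asarray(xf, dtype=np.int64),
        "backward": np.asarray(xb, dtype=np.int64),
        "bidirectional": bidirectional(xf, xb),
    }
    costs = {name: int(np.abs(c).sum()) for name, c in candidates.items()}
    best = min(costs, key=costs.get)   # first minimum wins on ties
    return best, candidates[best]

xf = [[8, -2], [0, 0]]
xb = [[-8, 2], [0, 0]]
print(motion_mode(xf, xb)[0])
```

In this toy input the forward and backward residues cancel, so the bi-directional candidate has zero cost and is selected.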
$P[|X_{i,j}|] \leftarrow P[|X_{i,j}|] + 1$, for $0 \le i, j \le 3$.
where P is the histogram and P[i] is the frequency of the coefficients with amplitude i;
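A short sketch of this accumulation, using the array size given earlier in the description, $(2^b - 1) \times 16 \times 5 + 1$ entries for b bits per sample. The function names are hypothetical, and NumPy is used only for convenience.

```python
import numpy as np

def new_histogram(bits_per_sample=8):
    """Zero-initialized histogram, as done at the beginning of a picture."""
    return np.zeros((2**bits_per_sample - 1) * 16 * 5 + 1, dtype=np.int64)

def accumulate(P, block):
    """P[|X_ij|] <- P[|X_ij|] + 1 for every coefficient of the block."""
    magnitudes = np.abs(np.asarray(block, dtype=np.int64)).ravel()
    np.add.at(P, magnitudes, 1)        # unbuffered add, so repeated amplitudes all count

P = new_histogram()
accumulate(P, [[0, 3], [-3, 7]])
print(len(P), P[0], P[3], P[7])
```

`np.add.at` is used instead of `P[magnitudes] += 1` because the latter would count a repeated amplitude only once per call.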
and for a
|X q(i,j)|=[(|X(i,j)|A(Q M ,i,j)+f·215+Q
where f=1/3 for an intra slice and f=1/6 for a non-intra slice.
|X(i,j)|A(Q M ,i,j)+f·215+Q
which is equivalent to
where N(0) = 1, N(1) = 4/10, N(2) = 2/√10. In particular, the constant N(r) can be interpreted as the scaling factor that normalizes the integer DCT in H.264.
approximately becomes and
and
since Q=(6QE+QM).
where $2^{15}/A_0 \cong 2.5$ and f = 1/3 for intra slice and f = 1/6 for non-intra slice.
and a quantizer with f=1/3 is used, then |Xq(i,j)|>0, when approximately 6 log2(0.6|
for QP=0, . . . , 51. The correction parameters a, b, d are listed in Table 3 for standard definition (SD) sequences, Table 4 for HD progressive sequences, and Table 5 for high definition (HD) interlace sequences. Their values depend on the picture size, picture structure, picture type, and the type of the entropy encoder.
for 0≦QP≦QP1 to linearly interpolate the QP values.
$R_0 = \max[R(QP_1), E_0]$
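The high bit rate model is a straight line from $R_0$ at QP = 0 down to the medium-rate estimate at $QP_1$. A minimal sketch, in which the function name and the sample values are hypothetical and $QP_1 > 0$ is assumed:

```python
def high_rate_correction(rate_at_qp1, e0, qp1):
    """Linear model for 0 <= QP <= QP1.

    rate_at_qp1: medium-rate estimate at QP1; e0: estimated rate at QP = 0.
    Returns a list indexed by QP = 0..qp1.
    """
    r0 = max(rate_at_qp1, e0)          # R0 = max[R(QP1), E0]
    return [(r0 - rate_at_qp1) * (qp1 - qp) / qp1 + rate_at_qp1
            for qp in range(qp1 + 1)]

curve = high_rate_correction(rate_at_qp1=1000.0, e0=3000.0, qp1=10)
print(curve[0], curve[5], curve[10])
```

Taking the maximum for $R_0$ keeps the line from rising with QP when the QP = 0 estimate happens to fall below the value at $QP_1$; in that case the line is flat.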
k=int(i/2.5+1/r)
P 0 [k]←P 0 [k]+P[i]
where r is the rounding parameter. For intra histograms, r=3. For non-intra histograms, r=6.
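The re-binning of the histogram P into $P_0$ can be sketched as follows. The function name and the way the output size is derived are assumptions; floating-point arithmetic stands in for whatever fixed-point form the source uses.

```python
def rebin_to_qp0(P, r):
    """Map amplitude i to bin k = int(i/2.5 + 1/r) and accumulate P0[k] += P[i].

    r is the rounding parameter: 3 for intra histograms, 6 for non-intra.
    """
    size = int((len(P) - 1) / 2.5 + 1.0 / r) + 1   # index of the last bin, plus one
    P0 = [0] * size
    for i, count in enumerate(P):
        k = int(i / 2.5 + 1.0 / r)
        P0[k] += count
    return P0

P = [5, 1, 1, 1, 1, 1]          # illustrative counts for amplitudes 0..5
print(rebin_to_qp0(P, r=3))     # intra rounding
print(rebin_to_qp0(P, r=6))     # non-intra rounding
```

The total count is preserved; only the bin boundaries move, and the intra setting (r = 3) pushes amplitudes into higher bins slightly earlier than the non-intra setting.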
where N is the total number of coefficients of the histogram.
for QP2≦QP≦51.
where M is the number of macroblocks in a picture, $N = N_Y + N_C$, and $N_Y$ and $N_C$ are the numbers of luma and chroma transform coefficients in a picture. $R_{min}$ is the minimum bits per macroblock. The parameters $R_{min}$, e, and f for CAVLC and CABAC are shown in Table 6.
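For the low bit rate range, the description gives a geometric interpolation, $R(QP) = \bar{R}(QP_2) / 2^{\frac{QP - QP_2}{51 - QP_2} \log_2(\bar{R}(QP_2)/R_{51})}$, between the medium-rate estimate at $QP_2$ and the rate $R_{51}$ at QP = 51. A minimal sketch that takes $R_{51}$ as a given input (its own formula is not reproduced on this page); the function name and sample values are hypothetical:

```python
import math

def low_rate_correction(rate_at_qp2, r51, qp2, qp_max=51):
    """Geometric interpolation for QP2 <= QP <= qp_max; returns {QP: rate}."""
    log_ratio = math.log2(rate_at_qp2 / r51)
    return {
        qp: rate_at_qp2 / 2 ** ((qp - qp2) / (qp_max - qp2) * log_ratio)
        for qp in range(qp2, qp_max + 1)
    }

curve = low_rate_correction(rate_at_qp2=8000.0, r51=1000.0, qp2=41)
print(curve[41], round(curve[46], 1), round(curve[51], 1))
```

The curve equals the medium-rate estimate at $QP_2$ and reaches $R_{51}$ at QP = 51, so it is continuous with the medium bit rate model at the boundary.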
TABLE 1 |
Field Picture Timing For Bit Estimation In FME With N Pictures In A Sequence |
FME |
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | . . . | . . . | N − 2 | N − 1 |
|
0 | 0 | 0 | 1 | 2 | 3 | 4 | 5 | N − 4 | N − 3 | ||
I/ |
0 | 1 | N − 2 | N − 1 | ||||||||
I/P Forward Ref Pic | 0 | N − 4 | N − 3 | |||||||||
I/P/ |
2 | 3 | 4 | 5 | . . . | . . . | N − 4 | N − 3 | ||||
I/P/B |
0 | 1 | 2 | 3 | . . . | . . . | N − 5 | N − 4 | ||||
I/P/B Backward |
4 | 5 | 6 | 7 | . . . | . . . | N − 2 | N − 1 | ||||
TABLE 2 |
Frame Picture Timing Of Bit Estimation In FME With N Pictures In A Sequence |
FME |
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | . . . | . . . | N − 2 | N − 1 |
|
0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | N − 3 | N − 2 | ||
I/P Bit Estimation | 0 | N − 1 | ||||||||||
I/P Forward Ref Pic | N − 3 | |||||||||||
I/P/ |
1 | 2 | 3 | 4 | 5 | 6 | . . . | . . . | N − 3 | N − 2 | ||
I/P/B |
0 | 1 | 2 | 3 | 4 | 5 | . . . | . . . | N − 4 | N − 3 | ||
I/P/B Backward |
2 | 3 | 4 | 5 | 6 | 7 | . . . | . . . | N − 2 | N − 1 | ||
TABLE 3
Correction Parameters For SD Sequences
Entropy Coder | Pic Type | a | b | d | bpp_lower | bpp_upper |
---|---|---|---|---|---|---|
CAVLC | I | 0.94 | 35000 | 5.0000E−05 | 0.4 | 2.8 |
CAVLC | P | 0.86 | 14000 | 6.6667E−05 | 0.4 | 2.8 |
CAVLC | B | 0.9 | 5700 | 0 | 0.2 | 2 |
CABAC | I | 0.88 | 3500 | 5.0000E−05 | 0.4 | 2.8 |
CABAC | P | 0.86 | 14000 | 6.6667E−05 | 0.4 | 2.8 |
CABAC | B | 0.7 | 5700 | 0 | 0.2 | 2 |
TABLE 4
Correction Parameters For HD Progressive Sequences
Entropy Coder | Pic Type | a | b | d | bpp_lower | bpp_upper |
---|---|---|---|---|---|---|
CAVLC | I | 0.68 | 9.00E+05 | 3.3333E−06 | 0.4 | 2.8 |
CAVLC | P | 0.71 | 357000 | 1.0000E−05 | 0.4 | 2.8 |
CAVLC | B | 0.6 | 100000 | 2.0000E−06 | 0.2 | 2.0 |
CABAC | I | 0.6 | 7.00E+05 | 2.2222E−06 | 0.3 | 2.8 |
CABAC | P | 0.625 | 212500 | 6.6667E−06 | 0.3 | 2.8 |
CABAC | B | 0.6 | 100000 | 2.0000E−06 | 0.2 | 2.0 |
TABLE 5
Correction Parameters For HD Interlace Sequences
Entropy Coder | Pic Type | a | b | d | bpp_lower | bpp_upper |
---|---|---|---|---|---|---|
CAVLC | I | 0.75 | 375000 | 1.0000E−05 | 0.4 | 2.8 |
CAVLC | P | 0.67 | 142487 | 1.0000E−05 | 0.4 | 2.8 |
CAVLC | B | 0.6 | 0 | 0 | 0.2 | 2.0 |
CABAC | I | 0.678 | 287000 | 1.0000E−05 | 0.3 | 3.0 |
CABAC | P | 0.6 | 80000 | 2.0000E−05 | 0.3 | 2.8 |
CABAC | B | 0.6 | 100000 | 2.0000E−06 | 0.2 | 2.0 |
TABLE 6
Parameters For Bit Estimation At QP = 51
Entropy Coder | Rmin | e | f |
---|---|---|---|
CAVLC | 6.1 | 0.00180541 | 0.01534307 |
CABAC | 0.4 | 0.00127655 | 0.00527216 |
Claims (22)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/103,470 US8199814B2 (en) | 2008-04-15 | 2008-04-15 | Estimation of I frame average rate quantization parameter (QP) in a group of pictures (GOP) |
Publications (2)
Publication Number | Publication Date |
---|---|
US20090257488A1 US20090257488A1 (en) | 2009-10-15 |
US8199814B2 true US8199814B2 (en) | 2012-06-12 |
Family
ID=41163951
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/103,470 Expired - Fee Related US8199814B2 (en) | 2008-04-15 | 2008-04-15 | Estimation of I frame average rate quantization parameter (QP) in a group of pictures (GOP) |
Country Status (1)
Country | Link |
---|---|
US (1) | US8199814B2 (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102857768B (en) * | 2011-07-01 | 2014-12-10 | 华为技术有限公司 | Equipment and method for determining chromaticity prediction mode candidate set |
KR101542586B1 (en) * | 2011-10-19 | 2015-08-06 | 주식회사 케이티 | Method and apparatus for encoding/decoding image |
KR20130049526A (en) * | 2011-11-04 | 2013-05-14 | 오수미 | Method for generating reconstructed block |
CN103517067B (en) * | 2012-12-14 | 2017-04-19 | 深圳百科信息技术有限公司 | Initial quantitative parameter self-adaptive adjustment method and system |
CN104427320A (en) * | 2013-09-02 | 2015-03-18 | 苏州威迪斯特光电科技有限公司 | Video quality improving method based on sensitive information enhancement for video monitoring system |
US9294766B2 (en) | 2013-09-09 | 2016-03-22 | Apple Inc. | Chroma quantization in video coding |
EP3643063A1 (en) * | 2017-06-21 | 2020-04-29 | Vid Scale, Inc. | Adaptive quantization for 360-degree video coding |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5650860A (en) * | 1995-12-26 | 1997-07-22 | C-Cube Microsystems, Inc. | Adaptive quantization |
US6665442B2 (en) * | 1999-09-27 | 2003-12-16 | Mitsubishi Denki Kabushiki Kaisha | Image retrieval system and image retrieval method |
US7027510B2 (en) | 2002-03-29 | 2006-04-11 | Sony Corporation | Method of estimating backward motion vectors within a video sequence |
US20060209948A1 (en) | 2003-09-18 | 2006-09-21 | Bialkowski Jens-Guenter | Method for transcoding a data stream comprising one or more coded, digitised images |
US20060251330A1 (en) | 2003-05-20 | 2006-11-09 | Peter Toth | Hybrid video compression method |
EP1727370A1 (en) | 2005-05-25 | 2006-11-29 | Thomson Licensing | Rate-distortion based video coding mode selection foreseeing the esitmation of bit rate and distortion using a simplified transform on low activity prediction residuals |
US20070009027A1 (en) | 2005-05-27 | 2007-01-11 | Zhu Li H | Method for controlling the encoder output bit rate in a block-based video encoder, and corresponding video encoder apparatus |
US7953286B2 (en) * | 2006-08-08 | 2011-05-31 | Stmicroelectronics Asia Pacific Pte. Ltd. | Automatic contrast enhancement |
Non-Patent Citations (15)
Title |
---|
I. Richardson. Vcodex White Paper: An overview of H.264 Advanced Video Coding, Mar. 2007. |
I. Richardson. www.vcodex.com H.264 / MPEG-4 Part 10 White Paper (Intra Prediction), dated Apr. 30, 2003. |
I. Richardson. www.vcodex.com H.264 / MPEG-4 Part 10 White Paper (Overview), dated Jul. 10, 2002. |
Kim et al., Fast H.264 Intra-Prediction Mode Selection Using Joint Spatial and Transform Domain Features, J. Visual Comm. and Image Representation, vol. 17, No. 2, pp. 291-310 (2006), available online Jul. 1, 2005. |
Liang et al. MPEG-4 to H.264/AVC Transcoding, IWCMC '07, Aug. 12-16, 2007, pp. 689-693. |
Related U.S. Appl. No. 12/103,482-Office Action dated Sep. 14, 2011 (pp. 1-14), with claims (pp. 15-21). |
S. Mallat et al. Analysis of low bit rate image transform coding. IEEE Trans. on Signal Processing, vol. 46, No. 4, pp. 1027-1042 (1998). |
S. Milani et al. A rate control algorithm for the H.264 encoder. IEEE Trans. on Circuits and Systems for Video Technology, vol. 18, No. 2, Feb. 2008, pp. 257-262. |
S-C. Chang et al. A Novel Rate Predictor Based on Quantized DCT Indices and its Rate Control Mechanism (Abstract). Signal Processing: Image Communication, vol. 18, No. 6, Jul. 2003, pp. 427-441. |
Tsukuba et al., H.264 Fast Intra-Prediction Mode Decision Based on Frequency Characteristic, Proc. of the 13 European Signal Processing Conference (EUSIPCO) '05, Antalya, Turkey, Sep. 2005. |
Yu et al. A Frequency Domain Approach to Intra Mode Selection in H.264/AVC, Proc. of 13th European Signal Processing Conference (EUSIPCO) '05, Antalya, Turkey, Sep. 2005. |
Z. He et al. A unified rate-distortion analysis framework for transform coding. IEEE Trans. on Circuits and Systems for Video Technology, vol. 11, No. 12, Dec. 2001, pp. 1221-1236. |
Z. He et al. Low-Delay Rate Control for DCT Video Coding via p-Domain Source Modeling. IEEE Trans. on Circuits and Systems for Video Technology, vol. 11, No. 8, Aug. 2001, pp. 928-940. |
Z. Lei et al. Accurate Bit Allocation and Rate Control for DCT Domain Video Transcoding. Proceedings of the 2002 IEEE Canadian Conference on Electrical & Computer Engineering, Mar. 2002. |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110249725A1 (en) * | 2010-04-09 | 2011-10-13 | Sony Corporation | Optimal separable adaptive loop filter |
US8787449B2 (en) * | 2010-04-09 | 2014-07-22 | Sony Corporation | Optimal separable adaptive loop filter |
Also Published As
Publication number | Publication date |
---|---|
US20090257488A1 (en) | 2009-10-15 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: SONY ELECTRONICS INC., NEW JERSEY; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: AUYEUNG, CHEUNG; REEL/FRAME: 020880/0198; Effective date: 20080402. Owner name: SONY CORPORATION, JAPAN; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: AUYEUNG, CHEUNG; REEL/FRAME: 020880/0198; Effective date: 20080402 |
| | FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | ZAAA | Notice of allowance and fees due | Free format text: ORIGINAL CODE: NOA |
| | ZAAB | Notice of allowance mailed | Free format text: ORIGINAL CODE: MN/=. |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | FPAY | Fee payment | Year of fee payment: 4 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY; Year of fee payment: 8 |
| | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
| | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20240612 |