VIDEO RATE-BUFFER MANAGEMENT SCHEME FOR MPEG TRANSCODER
CROSS-REFERENCES TO RELATED APPLICATIONS This application claims priority from Provisional U.S. Patent Application No. 60/118,965, filed February 4, 1999, the disclosure of which is incorporated herein in its entirety by reference for all purposes.
BACKGROUND OF THE INVENTION The present invention relates generally to the encoding and decoding of multimedia data, and more particularly the invention relates to rate-buffer management in a transcoder of encoded, precompressed video data.
A transcoder is a device that receives a bitstream that is pre-compressed and pre-encoded according to one of many digital transmission techniques, and outputs a compressed bitstream of a different transmission bit-rate. A simplified block diagram of a transcoder system 10 including a transcoder 12 is shown in Fig. 1. The transcoder 12 accepts a precompressed, encoded signal of video frames on an input from a transmission channel. The transmission channel may be a satellite transmission network or a cable transmission medium, for example. The input signal is decoded by a decoder 12 and re-encoded by an encoder 14, whereupon the re-encoded signal is output at a different, usually constant, bit-rate. Using well-known techniques of adjusting a quantization level of the re-encoded signal, careful management of encoder parameters can provide a high quality signal at a desired bit-rate that is tailored for a specific output transmission channel or application.
An MPEG-2 video transcoder is a specific example of a transcoder that may employ techniques of the present invention. MPEG-2 is a conventionally accepted standard for digitally coding moving pictures, such as a video signal, for compressed transmission. The MPEG-2 video transcoder converts a pre-encoded and compressed video bitstream according to MPEG-2 video compression standards into another MPEG-2 encoded, compressed video signal for transmission at a different bit-rate. An MPEG bitstream has six layers of syntax, at which certain coding parameters are specified. There are a sequence layer (random access unit, context), Group of Pictures (GOP) layer (random access unit, video coding), picture layer (primary
coding layer), slice layer (resynchronization unit), macroblock layer (motion compensation unit), and block layer (DCT unit).
The term "signal" is applied herein to mean any picture, frame, or block. A block is an 8-row by 8-column matrix of pixels. A macroblock (MB) is four 8x8 blocks of luminance data and 2, 4 or 8 corresponding 8x8 blocks of chrominance data derived from a 16x16 section of the luminance component of the picture. A slice refers to a series of macroblocks. Blocks of source data may be encoded by frame, macroblock, or slice. The first bit-rate may be for high-capacity satellite transmission of a coded source video, and the second bit-rate may be downscaled for lower-capacity local cable transmission, ultimately to a set-top box decoder to an individual viewer. A Group of Pictures (GOP) is a set of frames which starts with an I-frame and includes a certain number of P and B frames. The number of frames in a GOP may be fixed.
Data rate for a given bitstream is directly related to buffer size and the speed with which bits are placed into and emptied from the buffer. Every transcoder employs some type of video rate-buffer management technique for preventing buffer under- and/or over-flows. In a decoder buffer under-flow situation, the decoder buffer is being emptied faster than it is being filled. Consequently, too many bits are being generated in the encoder, which will eventually overflow. To prevent decoder underflow, video rate-buffer management may provide for an increased quantization level, adjust the bit allocation, discard high frequency DCT coefficients, or repeat pictures.
In a decoder buffer over-flow situation, the decoder buffer is being filled faster than it is being emptied. In other words, too many bits are being transmitted and too few bits are being removed by the decoder such that the buffer is full. Consequently, too few bits are being generated in the encoder, which will eventually underflow. Some video rate-buffer management techniques employed to avoid this situation include decreasing the quantization level, adjusting the bit allocation, and stuffing bits.
Quantization level and bit allocation adjustments are conventionally accomplished by a rate control algorithm along with an adaptive quantizer. A transcoder system 20 is illustrated in Fig. 2 with rate control 25 and adaptive quantization 23 mechanisms. Generally, an encoded, compressed signal is first stored in a decoder buffer 22, and then decoded at a decoder 24 in blocks or groups of blocks. Rate control 25 is applied to control a data rate of bits being removed from the decoder buffer 22, based on a number and rate of bits being added to an encoder buffer 28. Adaptive quantization
adjusts a quantization level of a bitstream as it is re-encoded by the encoder 26. Rate control and adaptive quantization are generally accomplished in three steps:
1. Bit Allocation
Most encoders have an optimized, and often complicated, bit-allocation algorithm to assign the number of bits for each type of picture (I-, P-, and B-pictures). Conventional bit-allocation techniques take into account prior knowledge of video characteristics (e.g. scene changes, fades, etc.) and coding types (e.g. picture types) for a group of pictures (GOP) by estimating a complexity and allocating target bits for a given GOP.
Complexity Estimation: each picture type (I, P, and B) is assigned a relative weight X according to a global complexity measure of a Complexity Estimation technique. These weights (Xi, Xp, Xb) are reflected in a typical coded frame size of I, P, and B pictures. I pictures are assigned the largest weight since they have the greatest stability factor in an image sequence. B pictures are assigned the smallest weight since B data does not propagate into other frames through the prediction process.
Picture Target Setting: allocates target bits for a frame based on the frame type (I, P, or B) and the remaining number of frames of that same type in the GOP.
2. Rate Control
Rate control attempts to adjust bit allocation if there is significant difference between the target bits (anticipated bits) and actual encoded bits for a block of data.
3. Adaptive Quantization
Adaptive quantization is applied in the encoder along with rate-control to ensure the required video quality and to satisfy the buffer regulation. Adaptive quantization usually recomputes the macroblock quantization factor according to a comparison of the activity of a block against the normalized activity of the frame. The effect of this is to roughly assign a constant number of bits per macroblock, which results in a more perceptually uniform picture quality.
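As an illustration of the conventional comparison of macroblock activity against the frame activity, the following Python sketch uses a normalized-activity formula in the style of MPEG-2 Test Model 5. It is a minimal sketch under stated assumptions, not part of the claimed method: the function names, the spatial activity inputs, and the clipping range of 1 to 31 are illustrative choices.

```python
def normalized_activity(mb_activity, avg_frame_activity):
    """TM5-style normalized activity; the result lies between 0.5 and 2.0."""
    return (2.0 * mb_activity + avg_frame_activity) / (mb_activity + 2.0 * avg_frame_activity)


def adaptive_mquant(reference_q, mb_activity, avg_frame_activity, q_min=1, q_max=31):
    """Modulate a reference quantizer by the normalized activity of the macroblock.

    The clip range [q_min, q_max] is an assumption made for this sketch.
    """
    mquant = reference_q * normalized_activity(mb_activity, avg_frame_activity)
    return max(q_min, min(q_max, mquant))


if __name__ == "__main__":
    # A busy macroblock (activity 300 against a frame average of 100) is quantized more coarsely.
    print(adaptive_mquant(10, 300, 100))
```

A macroblock whose activity equals the frame average keeps the reference quantizer unchanged, while flat regions receive a finer quantizer and busy regions a coarser one.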
As video distribution networks grow larger and more complex, transcoders using rate-control and adaptive quantization are required to be lower-cost, simple, and yet retain a good video quality. A video rate-buffer management scheme that includes a simplified rate-control and adaptive quantization algorithm is therefore highly desirable.
SUMMARY OF THE INVENTION The present invention provides a simplified rate control algorithm for a conventional video transcoder without requiring the GOP information. This may be accomplished by maintaining picture types, re-using motion vectors, and minimizing changes to the macroblock mode, while still achieving the required video quality.
According to one embodiment, the present invention provides a method of managing a video transmission bit-rate in a transcoder. The method includes the steps of measuring a fullness of an input buffer of the transcoder, providing a bit budget for one of a plurality of video frames in an input bitstream, the bit budget being based on a quantization parameter of said video frames, measuring an actual bit-rate of said input bitstream, and comparing said actual bit-rate with said buffer fullness to predict an input buffer underflow or overflow. In response to an input buffer underflow, the bit budget is incremented for a next one of said plurality of video frames. In response to an input buffer overflow, the bit budget is decremented for a next one of said plurality of video frames.
According to another embodiment, the present invention provides a method of controlling a bit-rate of a plurality of pictures in a video transcoder, where the transcoder includes a decoder and an encoder. The method includes the steps of determining a bit budget for a current picture at an input to the decoder, measuring a buffer fullness of an encoder buffer when the encoder buffer receives a previous picture, and allocating a number of bits to the current picture based on the buffer fullness, such that the number of bits allocated to the current picture is within the bit budget.
Other features and advantages of the present invention will be understood upon reading and understanding the detailed description of the preferred embodiments below, in conjunction with reference to the drawings, in which like numerals represent like elements.
BRIEF DESCRIPTION OF THE DRAWINGS Fig. 1 schematically illustrates a conventional video transcoder system. Fig. 2 schematically illustrates a conventional video transcoder system with adaptive quantizer and rate-control mechanisms.
Fig. 3 schematically illustrates a video transcoder system including a video rate-management controller according to an embodiment of the present invention.
Fig. 4 illustrates a processor block of the video rate-management scheme according to an embodiment of the present invention.
Fig. 5 illustrates a processor block of the video rate-management scheme according to an alternative embodiment of the present invention.
DESCRIPTION OF THE SPECIFIC EMBODIMENTS The present invention provides a rate control process for efficient video rate-buffer management. According to a preferred exemplary embodiment of the invention, the rate control process is implemented in a video transcoder to control a transcoder output bitstream which complies with the requirements of the Video Buffering Verifier that are specified in the MPEG-2 video standard (ISO/IEC 13818-2).
Fig. 3 illustrates a video transcoder 30 with a video rate-management system 32 according to an embodiment of the present invention. The video rate-management system operates according to a rate management process. The rate management system 32 includes a controller 34 operatively coupled to the transcoder, and providing instructions to a Video Buffering Verifier (VBV) 36.
A Video Buffering Verifier (VBV) is a virtual decoder that is conceptually connected to the output of the encoder. Its purpose is to provide a constraint on the variability of the data rate that an encoder or editing process may produce (ISO/IEC 13818-2 Annex C). The VBV contemplates a buffer in the receiver at the receiving end of the output transmission channel (not shown), and a prediction mechanism in the encoder. The prediction mechanism may be a processor and control circuit that predicts a fullness of the buffer, i.e. buffer fullness, due to the constant fill from the constant bit-rate (CBR) stream and the variable empty from the variable bit-rate (VBR) due to the decoder bit demand.
In an embodiment of the present invention, the controller 34 prevents encoder VBV 36 buffer under- and/or over-flows. The encoder VBV buffer may be a shifted "mirror" of a decoder VBV buffer; however, for simplification only the encoder VBV will be discussed in detail. For Constant Bit-rate (CBR) applications, by use of rate-control, the bit-count per second must precisely converge to the target bit-rate with good video quality. For Variable Bit-rate (VBR) applications, the rate-control achieves the goal of maximizing the perceived quality of the decoded video sequence while maintaining the output bit-rate within permitted bounds. By employing a rate-management system of the present invention with rate control, transcoder buffer under- and over-flows are avoided without adding too much complexity to the overall operation of the transcoder system.
In accordance with the invention, for a number of pictures 1-j, the VBV buffer is characterized by the following parameters:
vbv_buffer_fullness(j): the encoder VBV buffer bit-level right before encoding of the j-th picture.
coded_pict_size(j): the bit-count of the j-th coded picture.
bits_increment(j+1): the number of bits transmitted between the j-th and (j+1)-th coded pictures.
vbv_buffer_size: the (decoder) VBV buffer size coded in the sequence header and sequence extension if present.
These parameters satisfy the recursive equation:
vbv_buffer_fullness(j+1) = vbv_buffer_fullness(j) + coded_pict_size(j) - bits_increment(j+1). (1a)
Assume the encoding time of the j-th picture is $t_{e,j}$ and the decoding time of the j-th picture is $t_{d,j}$. Then an upper bound on the VBV fullness is:
$$\mathit{vbv\_buffer\_fullness}(j) + \mathit{coded\_pict\_size}(j) < \int_{t_{e,j}}^{t_{d,j}} R(t)\,dt \qquad \text{(1b)}$$
The VBV fullness upper bound is illustrated in Fig. 4, and
$$\int_{t_{e,j}}^{t_{d,j}} R(t)\,dt < \mathit{vbv\_buffer\_fullness}(j) + \mathit{vbv\_buffer\_size} \qquad \text{(1c)}$$
where R(t) is the bit-rate function. The left side of Eq. (1c) is set to a maximum value as
$$T_{max} = (t_{d,j} - t_{e,j})\,R_{max} \qquad \text{(1d)}$$
where $t_{d,j} - t_{e,j}$ is the delay of the channel and $R_{max}$ is the maximum channel bit-rate between $t_{e,j}$ and $t_{d,j}$.
Therefore, a VBV fullness lower bound is:
$$\mathit{vbv\_buffer\_fullness}(j) > -\mathit{vbv\_buffer\_size} + T_{max} \qquad \text{(1e)}$$
The VBV fullness lower bound is illustrated in Fig. 5.
A video rate-buffer management process according to the invention can be accomplished with rate-control and adaptive quantization for efficient buffer-control. The rate-buffer management system and method according to an embodiment of the present invention checks a bitstream to verify that the amount of rate-buffer memory required in the decoder is bounded by the vbv_buffer_size. The rate-control process is guided by the rate-buffer management protocol to ensure that the bitstream satisfies the buffer regulation with good video quality.
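The recursion of Eq. (1a) and the lower bound of Eq. (1e) can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the invention: the function names are hypothetical, the number of bits transmitted between pictures is taken as constant, and T_max is assumed to equal the channel delay multiplied by the maximum channel bit-rate.

```python
def next_vbv_fullness(fullness, coded_pict_size, bits_increment):
    """Eq. (1a): fullness(j+1) = fullness(j) + coded_pict_size(j) - bits_increment(j+1)."""
    return fullness + coded_pict_size - bits_increment


def vbv_lower_bound(vbv_buffer_size, channel_delay, r_max):
    """Eq. (1e): -vbv_buffer_size + T_max, assuming T_max = channel_delay * r_max."""
    return -vbv_buffer_size + channel_delay * r_max


def trace_vbv(initial_fullness, coded_sizes, bits_per_interval, lower_bound):
    """Return the fullness before each picture and the picture indices violating Eq. (1e)."""
    fullness = initial_fullness
    history, violations = [], []
    for j, size in enumerate(coded_sizes):
        history.append(fullness)
        if fullness <= lower_bound:  # Eq. (1e) requires fullness > lower bound
            violations.append(j)
        fullness = next_vbv_fullness(fullness, size, bits_per_interval)
    return history, violations


if __name__ == "__main__":
    bound = vbv_lower_bound(1_800_000, 0.25, 4_000_000)
    print(trace_vbv(1000, [400, 300, 200], 350, bound))
```

Checking the bound once per picture mirrors the per-frame verification described in the text; a per-slice check would call the same functions at a finer granularity.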
In one step of the rate-buffer management and rate-control process, a bit-budget is determined for each picture. In one embodiment of the invention, for the MPEG-2 transcoder for example, a bit-allocation process is followed for determining the bit-budget for each picture. According to the process, and for convenience of discussion, the following terms define the encoder VBV buffer-related variables:
target_bit_rate: the VBR or CBR bit rate from a storage media to the decoder;
target_pict_size: the targeted bit-count of the current picture, often called the bit-budget for the picture;
input_bit_rate: the bit rate of the input bitstream;
input_pict_size: the bit-count of the current input (coded) picture (without picture header bits);
coded_pict_size: the actual bit-count of the current coded picture (without picture header bits);
frame_rate: the frame rate of the video sequence given in the sequence header;
max_vbv_buffer_fullness(j): assigned for the j-th picture or the j-th GOP.
According to the invention, the bit-budget for the j-th picture is allocated by a down-scaling transformation as follows:
target_pict_size(j) = input_pict_size(j) * (target_bit_rate / input_bit_rate).
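The down-scaling transformation can be illustrated with a short Python sketch. The function name simply mirrors the variable defined in the text, and the bit-rates and picture sizes in the usage line are made-up values chosen for the example.

```python
def target_pict_size(input_pict_size, target_bit_rate, input_bit_rate):
    """Bit-budget of a picture: input size scaled by the ratio of output to input bit-rate."""
    return input_pict_size * target_bit_rate / input_bit_rate


if __name__ == "__main__":
    # Example: a 6 Mbit/s input transcoded to 4 Mbit/s (illustrative values only).
    for size in (600_000, 240_000, 90_000):
        print(size, "->", target_pict_size(size, 4_000_000, 6_000_000))
```

Because the ratio is the same for every picture of a CBR stream, it can be computed once and reused, which is what keeps this allocation inexpensive.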
In an alternative embodiment, bit-allocation for the j-th picture accumulates the bit-budgets of all macroblocks (MBs):
$$\mathit{target\_pict\_size}(j) = \sum_{i=0}^{\mathit{number\_of\_MBs}-1} \mathrm{round}\!\left(\mathit{input\_mb\_size}(i)\cdot\frac{\mathit{target\_bit\_rate}}{\mathit{input\_bit\_rate}} + 0.5\right)$$
where the round(*) function performs a rounding toward zero and input_mb_size(i) denotes the bit-count of the i-th input MB. It should be understood that input_mb_size(i) * target_bit_rate / input_bit_rate is the target MB size. The bit-budget target_pict_size(j) for the j-th picture is checked against vbv_buffer_fullness(j) to prevent VBV buffer under- and over-flows. The condition on the VBV buffer underflow provides an upper limit on the bit-budget. The reason is that, at the time of decoding, the current picture should be small enough so that it is contained entirely inside the decoder buffer.
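The macroblock-accumulated bit-budget can be sketched in Python as shown below. Rounding toward zero is modeled with int(), which truncates toward zero in Python; the function names and the sample macroblock sizes are illustrative assumptions.

```python
def target_mb_size(input_mb_size, target_bit_rate, input_bit_rate):
    """Target MB size: the scaled bit-count plus 0.5, then rounded toward zero."""
    return int(input_mb_size * target_bit_rate / input_bit_rate + 0.5)


def target_pict_size_from_mbs(input_mb_sizes, target_bit_rate, input_bit_rate):
    """Picture bit-budget as the sum of the per-macroblock budgets."""
    return sum(target_mb_size(size, target_bit_rate, input_bit_rate)
               for size in input_mb_sizes)


if __name__ == "__main__":
    print(target_pict_size_from_mbs([100, 101, 55], 2_000_000, 4_000_000))
```

Adding 0.5 before truncation rounds each non-negative macroblock budget to the nearest integer, so the picture budget stays an integer bit-count.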
It is known for a transcoder that target_pict_size(j) needs to satisfy:
target_pict_size(j) < input_pict_size(j)
If the current picture size is too small, then the fullness given by Eq. (1a) might exceed max_vbv_buffer_fullness(j+1) and cause decoder buffer overflow. Thus a lower limit is placed on the current picture size. This may be achieved, for example, by limiting the bit budget, and if the actual number of bits used is still smaller than the minimum picture size, then the end of the picture may be stuffed with zeros. The lower limit is derived from Eqs. (1a) and (1e) as follows:
$$\mathit{target\_pict\_size}(j) > -\mathit{vbv\_buffer\_size} + T_{max} - \mathit{vbv\_buffer\_fullness}(j-1) + T_{min}$$
where $T_{min} = (t_{e,j} - t_{e,j-1})\,R_{min}$ and $R_{min}$ is the minimum channel bit-rate between $t_{e,j-1}$ and $t_{e,j}$. Note that $R_{min} = R_{max}$ for the CBR channel.
The inequality condition of (1e) is verified for each slice or frame to prevent encoder buffer underflow.
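The limiting and stuffing behaviour described above can be sketched in Python. This is a hedged illustration only: the lower limit is passed in as a precomputed value rather than derived inside the code, the function names are hypothetical, and the priority given to the lower limit when the two limits conflict is an assumption of the sketch.

```python
def clamp_bit_budget(target_pict_size, lower_limit, input_pict_size):
    """Keep the bit-budget between the lower limit and the input picture size.

    If the limits conflict, the lower limit wins (an assumption of this sketch).
    """
    upper_limit = input_pict_size
    return max(lower_limit, min(target_pict_size, upper_limit))


def stuffing_bits(coded_pict_size, min_pict_size):
    """Number of stuffing bits needed to bring a coded picture up to the minimum size."""
    return max(0, min_pict_size - coded_pict_size)


if __name__ == "__main__":
    print(clamp_bit_budget(500, 100, 400), stuffing_bits(90, 100))
```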
Down-scaling bit-allocation takes advantage of information provided by the input bitstream for CBR applications. For VBR applications, the down-scaling process requires an instantaneous bit-rate for each picture or every few pictures. This bit-rate, associated with max_vbv_buffer_fullness, can be provided from StatMux.
It is shown in the next section that the target picture size or the target MB size will affect the virtual buffer fullness and, as a consequence, will determine the quantization scale for the corresponding MB.
In general, the quantization scale (denoted by mquant) for the transcoded bitstream can also be generated through a scaling process. Some commonly used bit-allocation models are:
(1) $T = k/Q$ in MPEG-2 Test Model 5 (TM5) [3].
(2) $T = k_0/Q + k_1/Q^2$ in the MPEG-4 verification model [4].
Here T is the bit-budget for a picture, a slice, or an MB; k is a constant; $k_0$ and $k_1$ are constants that are generated by a pre-estimation [4]; and Q denotes the quantization scale corresponding to the picture, slice, or MB, respectively. Since
$$\frac{T_{target}}{T_{input}} = \frac{\mathit{target\_bit\_rate}}{\mathit{input\_bit\_rate}},$$
the quantization scale for the transcoded MBs can thus be estimated as follows.
For the model $T = k/Q$:
$$Q_{target} = Q_{input}\cdot\frac{\mathit{input\_bit\_rate}}{\mathit{target\_bit\_rate}}$$
For the model $T = k_0/Q + k_1/Q^2$, the quantization scale can be computed by solving the quadratic equation $B\,Q_{target}^2 - k_0\,Q_{target} - k_1 = 0$:
$$Q_{target} = \frac{k_0 + \sqrt{k_0^2 + 4\,k_1 B}}{2B}$$
where
$$B = \frac{\mathit{target\_bit\_rate}}{\mathit{input\_bit\_rate}}\left(\frac{k_0}{Q_{input}} + \frac{k_1}{Q_{input}^2}\right)$$
Here, $Q_{input}$ can be the average quantization level for this picture or slice at the input, or the quantization level of the MB at the input. The same process can also be applied to other bit-allocation models. Adjustment to the quantization scale may be accomplished according to the embodiment illustrated below. Let $Q_v$ denote the quantization scale determined by the virtual buffer fullness and $Q_{target}$ be the up-scaled quantization level given above.
Assume that $Q_{target}$ is the up-scaled quantization level for a given MB. Then, the quantization scale $Q_T$ for the MB is determined by:
If ( $\sum_{0}^{current}$ coded_mb_size > $\sum_{0}^{current}$ target_mb_size )
$Q_T = \max(Q_{target}, Q_v)$;
else
$Q_T = \min(Q_{target}, Q_v)$.
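The scaling of the quantization level and the max/min selection can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the two bit-allocation models are taken to be T = k/Q and T = k0/Q + k1/Q^2, the positive root of the resulting quadratic is used (which presumes k0 > 0 and k1 >= 0), and all function names are hypothetical.

```python
import math


def q_target_linear(q_input, input_bit_rate, target_bit_rate):
    """Model T = k/Q: up-scale the input quantization level by the bit-rate ratio."""
    return q_input * input_bit_rate / target_bit_rate


def q_target_quadratic(q_input, input_bit_rate, target_bit_rate, k0, k1):
    """Model T = k0/Q + k1/Q**2: positive root of B*Q**2 - k0*Q - k1 = 0."""
    b = (target_bit_rate / input_bit_rate) * (k0 / q_input + k1 / q_input ** 2)
    return (k0 + math.sqrt(k0 * k0 + 4.0 * k1 * b)) / (2.0 * b)


def select_mquant(q_target, q_v, coded_bits_so_far, target_bits_so_far):
    """Use the coarser scale when over budget so far, otherwise the finer one."""
    if coded_bits_so_far > target_bits_so_far:
        return max(q_target, q_v)
    return min(q_target, q_v)


if __name__ == "__main__":
    q = q_target_linear(8, 6_000_000, 4_000_000)
    print(q, select_mquant(q, 10, 5000, 4000))
```

Choosing the larger scale when the running total of coded bits exceeds the running budget pulls the bit count back down, while the smaller scale spends spare bits on quality.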
In an alternative embodiment, the quantization scale may be adjusted as follows. Assume that $Q_{target}$ is the up-scaled quantization level for a given picture or slice at the input and $Q_{input}$ is the average quantization level for this picture or slice at the input, respectively. Then,
$$Q_{targetMB} = Q_{inputMB}\cdot\frac{Q_{target}}{Q_{input}}$$
The quantization level $Q_T$ for the MB is determined by:
If ( $\sum_{0}^{current}$ coded_mb_size > $\sum_{0}^{current}$ target_mb_size )
$Q_T = \max(Q_{targetMB}, Q_v)$;
else
$Q_T = \min(Q_{targetMB}, Q_v)$.
The down-scaling process for bit-allocation is applied to the macroblock levels for their bit-budget estimation, and is described below with rate-buffer management and rate-control. According to an embodiment of the present invention, a rate management process includes five steps. In this embodiment, the down-scaling process for bit-allocation is only applied to the macroblock levels, which simplifies the bit-parser and counting process. For CBR applications, such a down-scaling process ensures that the VBV buffer never overflows for a "legal" input bitstream.
1. Initial conditions in Sequence Level
The VBV buffer is initially filled with vbv_buffer_fullness bits. For CBR applications, vbv_buffer_fullness = vbv_delay * target_bit_rate / 90000. For VBR applications, the initial vbv_buffer_fullness is often derived from the decoding time-stamp of the first picture, i.e. vbv_buffer_fullness = buffer bit level right before decoding of the first picture. For the elementary stream-only case, it is initially assumed that:
vbv_buffer_fullness = max( min( (2*bit_rate) / frame_rate, max_vbv_buffer_fullness / 5 ), K1 )
if the initial quantizer is non-linear, and:
vbv_buffer_fullness = max( min( (4*bit_rate) / frame_rate, max_vbv_buffer_fullness / 2 ), K2 )
if the initial quantizer is linear, where K1 and K2 are constants. In one embodiment, the values for the constants may be K1 = 100000 and K2 = 200000. In an alternative embodiment of the invention, in a similar manner to the MPEG-2 Test Model 5 (TM5) [3], three virtual buffers are used to measure the buffer fullness.
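The sequence-level initialization can be sketched in Python as follows. The constants K1 and K2 use the example values given in the text; the function names, the boolean flag for the quantizer type, and the sample inputs are assumptions made for illustration.

```python
K1 = 100_000  # example constant for the non-linear quantizer case
K2 = 200_000  # example constant for the linear quantizer case


def initial_fullness_cbr(vbv_delay, target_bit_rate):
    """CBR case: vbv_delay is expressed in periods of a 90 kHz clock."""
    return vbv_delay * target_bit_rate / 90000


def initial_fullness_es(bit_rate, frame_rate, max_vbv_buffer_fullness, linear_quantizer):
    """Elementary-stream-only case, selected by the initial quantizer type."""
    if linear_quantizer:
        return max(min(4 * bit_rate / frame_rate, max_vbv_buffer_fullness / 2), K2)
    return max(min(2 * bit_rate / frame_rate, max_vbv_buffer_fullness / 5), K1)


if __name__ == "__main__":
    print(initial_fullness_cbr(45000, 4_000_000))
    print(initial_fullness_es(4_000_000, 25, 1_835_008, linear_quantizer=False))
```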
2. Initial Conditions in Picture Level
The additional parameters required in the picture level are the quantization scale type q_scale_type and the average quantization level avg_Q_prev_pict of the previous picture. Two variables need to be set for the rate-control in the picture level: (1) the initial virtual buffer fullness for the picture; (2) the bit budget for this picture. Also, the bits from the picture header (and the sequence header and GOP header at the beginning of the sequence or GOP), header_bits, are extracted.
(1) The virtual buffer fullness d is set to be the virtual buffer fullness of the current picture type, i.e. case I_TYPE: d = d0i; case P_TYPE: d = d0p; case B_TYPE: d = d0b.
(2) The bit budget for this picture, denoted by target_pict_size, is allocated by a very simple transformation as follows: target_pict_size = input_pict_size * (target_bit_rate / input_bit_rate). For CBR applications, target_bit_rate / input_bit_rate is pre-computed after parsing the sequence header.
3. Update Variables in Picture Level
Two variables are updated in the picture level: the virtual buffer fullness d and the quantization type q_scale_type for this picture.
(1) d += coded_pict_size - target_pict_size, and case I_TYPE: d0i = d; case P_TYPE: d0p = d; case B_TYPE: d0b = d.
(2) The q_scale_type for this picture is determined by the following rules: If this picture is the first picture or an I-picture, keep the q_scale_type the same as that of the corresponding input picture;
Otherwise, q_scale_type is set as follows:
If ( avg_Q_prev_pict < T1 || avg_Q_prev_pict > T2 ) q_scale_type = 1;
If ( avg_Q_prev_pict > T3 && avg_Q_prev_pict < T4 ) q_scale_type = 0;
where avg_Q_prev_pict is the average mquant of the previous frame and T1 < T3 < T4 < T2. The typical values for T1, T2, T3, and T4 are: T1 = 15, T2 = 25, T3 = 18 and T4 = 22. If q_scale_type is not set, the input q_scale_type is used.
At the end of a picture, the video buffer verifier fullness vbv_buffer_fullness is updated. The minimum picture size min_pict_size is compared with the actual coded picture size coded_pict_size for the frame just coded. If a deficit exists, ones are appended to the end of that frame.
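The picture-level bookkeeping can be sketched in Python as follows. The class and function names are hypothetical, picture types are represented by the strings "I", "P" and "B", and the thresholds use the typical values quoted in the text; this is a sketch of the described rules, not a definitive implementation.

```python
T1, T2, T3, T4 = 15, 25, 18, 22  # typical threshold values quoted in the text


class VirtualBuffers:
    """Keeps one virtual buffer fullness per picture type (d0i, d0p, d0b)."""

    def __init__(self, d0i=0, d0p=0, d0b=0):
        self.d0 = {"I": d0i, "P": d0p, "B": d0b}

    def start_picture(self, pict_type):
        """Initial virtual buffer fullness d for a picture of the given type."""
        return self.d0[pict_type]

    def end_picture(self, pict_type, d, coded_pict_size, target_pict_size):
        """d += coded_pict_size - target_pict_size, stored back per picture type."""
        d += coded_pict_size - target_pict_size
        self.d0[pict_type] = d
        return d


def choose_q_scale_type(input_q_scale_type, avg_q_prev_pict, is_first_picture, is_i_picture):
    """Apply the threshold rules; fall back to the input q_scale_type when none fires."""
    if is_first_picture or is_i_picture:
        return input_q_scale_type
    if avg_q_prev_pict < T1 or avg_q_prev_pict > T2:
        return 1
    if T3 < avg_q_prev_pict < T4:
        return 0
    return input_q_scale_type


if __name__ == "__main__":
    buffers = VirtualBuffers(d0i=1000)
    d = buffers.start_picture("I")
    print(buffers.end_picture("I", d, 420_000, 400_000))
    print(choose_q_scale_type(0, 10, False, False))
```

The gaps between T1 and T3 and between T4 and T2 act as a hysteresis band in which the input q_scale_type is retained.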
4. Initial Variables in Macroblock Level
An initial quantization step-size (mquant) needs to be computed at the beginning of each picture. Such a quantization step-size is generated by an up-scaling conversion of the quantization step-size (input_mquant) of the corresponding input macroblock : mquant = input_mquant*(input_bit_rate/target_bit_rate);
5. Updated Variables in Macroblock Level
The macroblock (MB) quantization step-size, mquant, is updated by a use of a virtual buffer discrepancy. The virtual buffer discrepancy is calculated by the following formula :
Virtual buffer discrepancy = d + the cumulated bits up to the current MB of a picture - the cumulated MB-bit-budget up to the current MB of a picture.
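The virtual buffer discrepancy can be sketched in Python as follows. The sketch assumes that the per-macroblock budget is obtained by scaling the input macroblock bit-count by target_bit_rate / input_bit_rate and that the cumulated totals include the current macroblock; the function names are hypothetical, and the mapping from the discrepancy to mquant is deliberately left out because the text does not specify it.

```python
def mb_bit_budget(input_mb_bitcount, target_bit_rate, input_bit_rate):
    """Per-macroblock bit-budget obtained by down-scaling the input bit-count."""
    return input_mb_bitcount * target_bit_rate / input_bit_rate


def virtual_buffer_discrepancies(d, coded_mb_bits, input_mb_bits,
                                 target_bit_rate, input_bit_rate):
    """Discrepancy after each MB: d + cumulated coded bits - cumulated MB budgets."""
    coded_total = 0
    budget_total = 0.0
    discrepancies = []
    for coded, input_bits in zip(coded_mb_bits, input_mb_bits):
        coded_total += coded
        budget_total += mb_bit_budget(input_bits, target_bit_rate, input_bit_rate)
        discrepancies.append(d + coded_total - budget_total)
    return discrepancies


if __name__ == "__main__":
    print(virtual_buffer_discrepancies(100, [60, 40, 70], [100, 100, 100],
                                       2_000_000, 4_000_000))
```

A growing discrepancy indicates that the picture is running over its budget, which would call for a coarser mquant on the following macroblocks.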
The MB-bit-budget for each MB may also be computed by a down-scaling conversion: mb_bit_budget = input_mb_bitcount * (target_bit_rate / input_bit_rate).
Although the invention has been described with reference to specific exemplary embodiments, it will be appreciated that it is intended to cover all modifications and equivalents within the scope of the appended claims.