WO2002096120A1 - Bit rate control for video compression - Google Patents

Bit rate control for video compression

Info

Publication number
WO2002096120A1
Authority
WO
WIPO (PCT)
Prior art keywords
buffer
frame
bit rate
encoding
encoded
Prior art date
Application number
PCT/SG2001/000105
Other languages
French (fr)
Inventor
Zhengguo Li
Xiao Lin
Original Assignee
Centre For Signal Processing, Nanyang Technological University
Priority date
Filing date
Publication date
Application filed by Centre For Signal Processing, Nanyang Technological University
Priority to PCT/SG2001/000105
Publication of WO2002096120A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/149 Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/152 Data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • the present invention relates to a bit rate control for the compression of video data. It has particular, but not exclusive, application to the provision of video over a packet switched network such as the Internet.
  • Bit rate control plays an important role in the provision of multimedia over communications networks, and has been widely studied by many researchers for various standards and applications, such as storage media and real-time transmission with MPEG-1 and MPEG-2, videoconferencing with H.261 and H.263, and video object coding with MPEG-4.
  • the most influential coding parameter with regard to picture quality is the quantization parameter (QP) used for texture coding.
  • QP quantization parameter
  • This parameter can be selected for an entire frame of the video sequence or can change from macroblock to macroblock. In most implementations, it is selected on the basis of buffer fullness, so that the buffer occupancy is maintained at a given level.
  • the H.263 coding scheme allows for variable frameskip, and due to the low bit-rate conditions which may be imposed upon the encoder, it is up to the rate control algorithm to make appropriate decisions on both spatial and temporal coding parameters.
  • bit rate control algorithm must determine a suitable quantization parameter (QP) to obtain the desired bit rate.
  • MPEG-4 bit rate control also considers spatial and temporal coding parameters.
  • the encoder must also consider the significant amount of bits which are used to code shape information such that arbitrarily shaped objects can be coded.
  • each video object may be encoded at a different frame rate, it is preferable that all of the objects are encoded at the same frame rate in order to yield better video quality.
  • additional coding parameters are introduced by MPEG-4 to control the amount of bits used to specify the shape of an object. It is the responsibility of the rate control scheme to incorporate these new parameter decisions along with other parameter decisions to ensure that the video objects are effectively coded.
  • the encoded bits are placed into an encoder buffer before they are transmitted through a network to a decoder. If the actual bit rate of the encoder is greater than the available channel bandwidth, the additional bits accumulate in the encoder buffer and increase buffer delay, which is the time needed to send the buffer bits remaining from the previously encoded frames. When the number of bits in the buffer is too high, the encoder usually skips some frames to reduce the buffer delay and avoid buffer overflow. This frame-skipping, however, produces undesirable motion discontinuity in the encoded video sequence. Conversely, if the buffer level is too low, there may be periods of time in which no bits are transmitted through the channel, and hence some channel bandwidth is wasted.
  • a joint buffer control is usually used to maintain a buffer occupancy of about 50% of the buffer size after coding each frame.
  • heuristic methods are usually employed, in which the target bit rate is increased if the current buffer level is less than half of the buffer size, and the target bit rate is decreased if the current buffer level is more than half of the current buffer size.
  • Such schemes are disclosed in "Scalable Rate Control for MPEG-4 Video", H.J. Lee, T.H. Chiang and Y.Q. Zhang, IEEE Trans. Circuit Syst. Video Technol., 10:878-894, 2000, and in "MPEG-4 rate control for multiple video objects", A. Vetro, H. Sun and Y. Wang, IEEE Trans. Circuit Syst. Video Technol., 9:186-199, 1999.
  • the existing schemes have problems when used in, for example, Internet applications and the streaming of video over the Internet. Due to the connectionless nature of the current Internet protocols and the routing mechanisms involved, the instantaneous bandwidth available to a particular user can vary widely in time and cannot in practice be known in advance. The existing bit rate control schemes cannot adapt themselves quickly enough to the variations of channel bandwidth, and are not effective enough to achieve the objectives of Video over the Internet.
  • An aim of the present invention is to provide a bit rate control which provides for better video quality, especially but not exclusively in Internet applications.
  • the present invention provides a bit rate control system for the encoding of video data in which the encoded bits are placed in a buffer prior to transmission, and in which a target encoding bit rate is determined based on the fullness of the buffer, characterized in that the buffer is modelled by a fluid-flow traffic model preferably of the form:
  • $B_c(n+1) = \max\{0,\; B_c(n) + T(n) - u(n)\}$
  • $B_c(n)$ denotes the buffer level at time n
  • $T(n)$ is the actual encoding bit rate; and $u(n)$ is the channel output rate.
  • the system of the present invention is able to keep the buffer occupancy closer to its target, which is preferably set at a predefined percentage (preferably about 50%) of a safety margin used to determine whether a frame of the video sequence to be encoded should be skipped, and to adapt itself faster to the variations of the channel bandwidth, and so will skip fewer frames at a low bandwidth. This therefore provides a higher overall video quality, and is attractive for video over the Internet.
  • the target encoding bit rate is given by the equation:
  • $A$ is the channel output rate; $\gamma$ is a buffer safety margin; $B_s$ is the buffer size; $B_c(n)$ is the current buffer level; and the remaining term is an adjustable parameter lying between 0 and 1.
  • A may be equal to the number of bits available for encoding all of the inter-frames of a current group of frames being encoded divided by the number of inter-frames to be encoded in the current group of frames.
  • A may be the actual bandwidth estimated by using the packet loss information. This allows the variation of the channel bandwidth to be directly incorporated into the buffer control, and allows the system to adapt itself in time.
  • the target bit rate is preferably modified based on the remaining bits available for encoding and on the remaining frames to be encoded. It may thus be:
  • $T_r$ is the number of remaining bits available for encoding
  • $N_r$ is the number of frames remaining to be encoded
  • $H_{hdr}(n-1)$ is the amount of overhead bits used for the previous frame.
  • R is the total number of bits used to encode a frame
  • Q is the quantization parameter
  • $c_1$ and $c_2$ are first and second order coefficients
  • is an index of video coding complexity
  • $H_{hdr}$ is the amount of overhead bits used.
  • the coefficients of the Rate-Distortion model are updated based upon data from a plurality of previous frames.
  • the number of previous frames used is preferably determined by a sliding window mechanism, wherein the value of the current window size W(n) is given by:
  • $W_{max}$ is a preset constant
  • ⁇ ( ⁇ ) is the maximum absolute difference of the frame at time n.
  • Such a sliding window mechanism smoothes the impact of scene changes, and changes the window size gradually.
  • the total number of actual bits used to encode the current frame is added to the current buffer level. If the buffer is in danger of overflow, a switched frame skipping mechanism is preferably used to compute the number of skipped frames. In one frame skip control, after the current frame is encoded, the next frame to be encoded will be skipped, if:
  • $B_c(n+1)$ is the current buffer level
  • T(n) is the actual number of bits used to encode the current frame;
  • A is the channel output rate;
  • $B_s$ is the buffer size;
  • $\gamma$ is a pre-determined buffer safety margin;
  • $T(S(j,n))$ ($1 \le j \le W(n)$) denotes the total number of actual bits generated in the encoding of the previous W(n) frames.
  • a frame skipping parameter $N_{post}$ is set to skip the next $N_{post}$ frames so that the following buffer condition is satisfied:
  • $B_c(n+1) = \max\{0,\; B_c(n) + T(n) - A\,(N_{post}+1)\}$
  • $B_c(n)$ is the buffer level at time n;
  • T(n) is the actual number of bits used to encode the current frame
  • A is the channel output rate
  • the first-mentioned skipping control is preferably provided as a predictive switching control
  • the second-mentioned skipping control is preferably provided as a post-frame skipping control
  • the skipping controls are preferably switched between one another based on the following switching law: a) The predictive frame skipping control is switched to the post-skipping control if a frame is skipped; and b) The post-skipping control is switched to the predictive frame skipping control if the current frame is not skipped.
  • the present invention also extends to a method for the encoding of a video sequence in accordance with the above system features, and to computer software for implementing the above system and method features.
  • Rate-Distortion model defined above being in itself a new and advantageous model for use in bit rate control.
  • Figure 1 is a diagram of the structure of a typical network over which video streaming may be provided.
  • FIG. 2 is a functional block diagram of a video encoder scheme according to an embodiment of the present invention.
  • Fig. 1 shows a typical Internet structure over which a video sequence may need to be transmitted from a source 1 to one or more receivers 2. Due to the amount of data in a video sequence, the data must be compressed, otherwise the required transmission bit-rate would be unachievably high.
  • an encoder 3 is provided at the source 1 in order to compress the video data
  • decoders 4 are provided at the receivers 2 in order to decode the data and reconstruct the video sequence.
  • the compressed data is routed through various servers 5 and over what may be many different types of transmission channel 6.
  • MPEG video compression is often employed.
  • the current MPEG standards are MPEG-1 and MPEG-2, which are similar in basic concept, and MPEG-4 which is able to provide a low-bandwidth multimedia format that can contain a mix of media (including recorded video images and sounds and their computer-generated counterparts), and uses the concept of "Video Objects" to transmit independent images of arbitrary shape.
  • a video sequence is broken into a number of Groups of Pictures (GOP), each of which comprises a number of picture frames.
  • Each frame is broken into a series of slices, and each slice consists of a set of macroblocks comprising arrays of luminance pixels and associated chrominance pixels.
  • the macroblocks are divided into 8x8 blocks for encoding.
  • Each block undergoes a Discrete Cosine Transform (DCT) to provide an array of DCT coefficients that are then quantized to force various of the coefficients (generally higher frequency coefficients) to zero so as to reduce the amount of data to be transmitted.
  • DCT Discrete Cosine Transform
  • Quantization is carried out by multiplying the DCT coefficient array by a quantization matrix, each value in the matrix being scaled by a quantization parameter.
  • the matrix and quantization parameter can be altered on a frame-by-frame and/or block-by-block basis to alter the amount of compression.
  • the quantized coefficients then undergo further encoding to compress the
  • the frames in a GOP comprise an Intra-frame (I frame) that is spatially compressed (in accordance with the above method), and Inter-frames (P and/or B frames) that are also temporally compressed in a motion-compensated prediction manner.
  • I frame Intra-frame
  • P and/or B frames Inter-frames
  • each P frame in a sequence is predicted from the frame immediately preceding it
  • each B frame is predicted from preceding and succeeding frames.
  • MPEG-4 also includes a Video Object layer between the frame layer and macroblock layer for specifying different independent objects within a scene.
  • In order to optimise video quality over a bit-rate range, e.g. in video-streaming to a number of receivers having different bandwidth capabilities, MPEG-4 also provides a Fine Granularity Scalability (FGS) scheme in which the coding of the video data is provided by a base layer and an enhancement layer, the base layer being designed to meet the lower bound of the bit rate range and the enhancement layer meeting the upper bound of the bit-rate range.
  • FGS Fine Granularity Scalability
  • the base layer is coded as discussed above, and the enhancement layer takes the original and reconstructed DCT coefficients of the base layer, and subtracts the reconstructed coefficients from the originals to provide a residue that is then encoded and transmitted with the base layer.
  • the receivers of the data decode the base layer to provide a video signal based on the lowest bit rate range, and can improve the quality by decoding various amounts of the enhancement layer.
  • the present invention relates to a bit rate control scheme for the compression of video data, and may for example be used in encoding the base layer of an FGS scheme. It may especially be used in the FGS disclosed in the co-pending International PCT patent application filed in Singapore on 25 May 2001 and entitled "A Fine Granularity Scalability Scheme".
  • the present bit-rate control scheme consists of three layers, namely the GOP layer, the frame layer and the video object layer. The whole scheme is shown in Fig. 2.
  • the GOP layer rate control 1 is used to allocate bits to each GOP of the video sequence, each GOP being composed of one I frame and a number of P and B frames.
  • the total number of bits available for the video sequence will be:
  • R is the bit rate for the sequence.
  • the number of bits allocated to the ith GOP is:
  • each GOP has the same structure, and so the GOP Layer Rate Control will allocate each GOP the following number of bits:
  • After the GOP layer rate control at block 1, the encoder carries out a buffer initialization at block 2, conducts the Intra-coding of the I-frame at block 3, updates a Rate-Distortion model at block 4 and checks as to whether the next frame must be skipped at a skip-frame block 5 (e.g. because of possible buffer overrun).
  • Inter-coding is then performed in which the encoder 3 performs a joint buffer control at block 6, a Frame Layer Target Bit Rate calculation at block 7 and a Quantization Parameter calculation at block 8, before carrying out the Inter-coding of the P or B frame at block 9.
  • the R-D model update and Frame-skip control are again carried out at blocks 4 and 5 before conducting the encoding of the next inter-frame through block 6, etc.
  • the encoder also conducts a Target Bit Rate Allocation at block 10, and calculates a shape threshold in block 8 along with the quantization parameter calculation.
  • the part of the bit rate control in the frame layer consists of three stages: the initialization, pre-encoding and post-encoding stages.
  • the encoder carries out three main tasks with respect to the frame layer control, these being:
  • buffer fullness is set at 50% of a buffer safety margin (which will be 40% of the buffer size assuming a safety margin of 80%). Otherwise, the buffer fullness is set at the end level of the previous GOP.
  • the I-frame is quantized using an initial quantization value of $Q_0$.
  • the remaining available bits $R_0(i)$ for encoding all of the subsequent inter-frames can be calculated as:
  • $R_0(i) = TB_i - K_i + (0.5 \cdot B_s \cdot \gamma - B_c(i))$
  • $TB_i$ is the number of bits available to encode the ith group of frames
  • $K_i$ is the number of bits used to encode the ith intra-frame
  • $B_s$ is the buffer size
  • $\gamma$ is the buffer safety margin for skipping frames, having a typical value of 0.8
  • the channel output rate (the average number of bits to be drained from the
  • the pre-encoding stage includes setting a target bit rate for the encoding of the next video frame in the GOP, and setting the quantization parameter for quantization of the DCT coefficients in accordance with the target bit rate.
  • When the number of bits in the buffer is too large (e.g. is predicted to exceed a safety margin), the encoder usually skips some frames to reduce the buffer delay and avoid buffer overflow. This however produces undesirable motion discontinuities in the encoded video sequence. Conversely, if the buffer level is too low, there may be periods of time in which no bits are transmitted through the channel, and channel bandwidth is wasted.
  • a frame level control which sets the target bit rate so as to attempt to maintain a buffer occupancy after the coding of each frame of about 50% of the buffer safety margin (i.e. about 40% of the buffer size for a 0.8 safety margin).
  • T(n) is the actual encoding bit rate
  • u(n) is the channel output rate
  • the target bit rate is scaled based on the buffer size $B_s$, the current buffer level $B_c(n)$ and the channel output rate $R_0(i)/N_{pi}$, and is given by:
  • the adjustable parameter in this equation lies between 0 and 1 and has a typical value of 0.75.
  • the number of remaining bits $T_r$ allocated to the current GOP and the remaining number of frames $N_r$ of the current GOP should also be taken into account to ensure that there are available bits for the remaining frames, and so the final frame bit rate is:
  • the adjustable parameter in this equation lies between 0 and 1 and has a typical value of 0.585
  • $H_{hdr}(n-1)$ is the amount of bits used for overhead data, that is, the bits used for non-texture data, e.g. shape information, motion vector information and header information. It should be noted that the above method of using a fluid-flow model departs from the prior art use of heuristic methods for determining the target bit rate, and enables the buffer occupancy to be kept much closer to the target, so that fewer frames are skipped.
  • the present model-based method may be used in any suitable video transmission system, and is especially attractive when MPEG-4 video is transported over the Internet where variations in bandwidth occur.
  • adjustment of the joint buffer control has a delay of one step, and cannot adapt itself in time to the variations in channel bandwidth.
  • the term $R_0(i)/N_{pi}$ may be replaced by the estimated actual channel bandwidth, e.g. by using the packet loss information.
  • the variation of the channel bandwidth can be incorporated into the present joint buffer control, and the scheme can adapt itself in time.
  • the corresponding quantization parameter, Q can be computed by using a Rate-Distortion model, which takes the form of the following quadratic model:
  • R is the total number of bits used to encode a frame
  • Q is the quantization parameter
  • $c_1$ and $c_2$ are first and second order coefficients
  • is the mean absolute difference of texture computed using the motion-compensated residual for the luminance component (an index of video coding complexity)
  • $H_{hdr}$ is the amount of bits used for overhead data, that is, non-texture data, e.g. video/frame syntax, bits used for shape information, motion vector information and header information.
  • the post-encoding stage includes the processes of updating the parameters $c_1$ and $c_2$ of the Rate-Distortion model and determining whether any frame-skipping is necessary to prevent possible buffer overflow.
  • the number of frames to use is based on a sliding window mechanism, which is designed to smooth the impact that a scene change might have in the updating of the R-D model.
  • the window size is increased gradually.
  • Max_Sliding_Window is a preset constant, and may be set to e.g. 20;
  • the selected sample data points within the window W(n) are denoted as $S(j,n)$ ($1 \le j \le W(n)$).
  • the encoder collects the quantization parameter statistics $Q(j)$ and the actual bit rate statistics $T(j)$, and, using a linear regression technique, the parameters can be obtained by:
  • the switched frame skipping control is composed of two basic controllers (a predictive frame skipping controller and a post frame skipping controller) and a corresponding switching law to determine the active controller.
  • a function $T_B$ is defined by:
  • $T(S(j,n))$ ($1 \le j \le W(n)$) denotes the total number of actual bits generated in the encoding of the previous W(n) frames.
  • the next frame to be encoded will be skipped if the current buffer level plus the estimated number of bits for the next frame is larger than the sum of $T_B(n)$ and some pre-determined threshold, called the safety margin, that is if:
  • $B_c(n+1)$ is the current buffer level
  • T(n) is the actual number of bits used to encode the current frame
  • A is the channel output rate (which may be $R_0(i)/N_{pi}$ or may be replaced by the estimated actual channel bandwidth);
  • $B_s$ is the buffer size; and $\gamma$ is the pre-determined safety margin.
  • a frame skipping parameter $N_{post}$ is increased from zero until the following buffer condition is satisfied; the next $N_{post}$ frames are then skipped by the encoder:
  • a) The predictive frame skipping control is switched to the post-skipping control if a frame is skipped; and b) The post-skipping control is switched to the predictive frame skipping control if the current frame is not skipped.
  • the predictive or post frame skipping control may be used by itself.
  • the above method may also be used to control the video object layer rate control.
  • the total target bit rate (as found in the frame layer control) is allocated to each video object according to its coding complexity, size and perceptual importance.
  • the target bit rate for an object i is given by:
  • $MOT_{ij,x}(n)$ and $MOT_{ij,y}(n)$ are the absolute values of the jth motion vector component within the object i at the time n; and the remaining term is an adjustable parameter lying between 0 and 1.
  • the shape threshold values can be set dynamically based on the previous coding information.
  • the threshold for the video object i, ⁇ * is initially set to zero, if fj(n) is less than Hh d r,i(n-1 ) - 1.25 H B (i) in the previous frame, then:
  • When controlling the video object layer, the switched frame skipping control will preferably be used.
  • the present scheme can also control the macroblock layer control.
  • the method is thus scalable.
  • control scheme may be implemented in software and/or hardware in a variety of manners.
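
The post frame skipping control and the switching law described in the list above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and the stopping threshold (buffer level below the safety margin times the buffer size) is an assumption, because the right-hand side of the buffer condition is not reproduced in the text above.

```python
def post_skip_count(level, encoded_bits, channel_rate, buffer_size, margin=0.8):
    """Return the smallest N_post for which the drained buffer level
    max{0, B_c(n) + T(n) - A * (N_post + 1)} drops below margin * buffer_size.

    Assumption: margin * buffer_size stands in for the patent's buffer
    condition, whose exact threshold is not given in the text above.
    """
    if channel_rate <= 0 or buffer_size <= 0 or margin <= 0:
        raise ValueError("channel_rate, buffer_size and margin must be positive")
    n_post = 0
    # Increase N_post from zero until the (assumed) condition is satisfied.
    while max(0, level + encoded_bits - channel_rate * (n_post + 1)) >= margin * buffer_size:
        n_post += 1
    return n_post


def next_controller(active, frame_skipped):
    """Switching law: predictive -> post if a frame is skipped,
    post -> predictive if the current frame is not skipped."""
    if active == "predictive" and frame_skipped:
        return "post"
    if active == "post" and not frame_skipped:
        return "predictive"
    return active


# Buffer of 10000 bits, 9500 already queued, 2000 bits just encoded,
# channel drains 1000 bits per frame interval.
print(post_skip_count(9500, 2000, 1000, 10000))
print(next_controller("predictive", True))
```

With a positive channel rate the loop always terminates, since the drained level eventually reaches zero, which lies below any positive threshold.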

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A bit rate control scheme for video encoding is described, in which a target bit rate for a picture frame (or video object or macroblock) is determined based on a fluid-flow model of the buffer dynamics, and in which the buffer target occupancy is set to about 50% of a buffer safety margin used to determine whether a frame should be skipped or not, the margin being about 80% of the buffer size. A new Rate-Distortion Model for determining a suitable quantization parameter to give the target bit rate is also described, as is a sliding window method of determining prior data points to update the Rate-Distortion model parameters, and a switching frame-skipping control which switches between a predictive skipping control and a post frame skipping control.

Description

Bit Rate Control for Video Compression
Technical Field
The present invention relates to a bit rate control for the compression of video data. It has particular, but not exclusive, application to the provision of video over a packet switched network such as the Internet.
Background Art
Bit rate control plays an important role in the provision of multimedia over communications networks, and has been widely studied by many researchers for various standards and applications, such as storage media and real-time transmission with MPEG-1 and MPEG-2, videoconferencing with H.261 and H.263, and video object coding with MPEG-4.
For different coding standards and applications, different coding parameters are emphasised and different mechanisms are applied. For example, in MPEG-2, the most influential coding parameter with regard to picture quality is the quantization parameter (QP) used for texture coding. This parameter can be selected for an entire frame of the video sequence or can change from macroblock to macroblock. In most implementations, it is selected on the basis of buffer fullness, so that the buffer occupancy is maintained at a given level. The H.263 coding scheme allows for variable frameskip, and due to the low bit- rate conditions which may be imposed upon the encoder, it is up to the rate control algorithm to make appropriate decisions on both spatial and temporal coding parameters. If the buffer is in danger of overflow, complete frames may be disregarded at the encoder to allow bits used for the previous frame to be transmitted out of the buffer to thereby reduce the buffer level and delay. In conjunction with this frame-skipping mechanism, the bit rate control algorithm must determine a suitable quantization parameter (QP) to obtain the desired bit rate. Similarly to H.263, MPEG-4 bit rate control also considers spatial and temporal coding parameters. However, the encoder must also consider the significant amount of bits which are used to code shape information such that arbitrarily shaped objects can be coded. Also, although each video object may be encoded at a different frame rate, it is preferable that all of the objects are encoded at the same frame rate in order to yield better video quality. Further, additional coding parameters are introduced by MPEG-4 to control the amount of bits used to specify the shape of an object. It is the responsibility of the rate control scheme to incorporate these new parameter decisions along with other parameter decisions to ensure that the video objects are effectively coded.
i In real-time video communications, the encoded bits are placed into an encoder buffer before it is transmitted through a network to a decoder. If the actual bit rate of the encoder is greater than the available channel bandwidth, the additional bits accumulate in the encoder buffer and increase buffer delay, which is the time needed to send the buffer bits remaining from the previously encoded frames. When the number of bits in the buffer is too high, the encoder usually skips some frames to reduce the buffer delay and avoid buffer overflow. This frame-skipping, however, produces undesirable motion discontinuity in the encoded video sequence. Conversely, if the buffer level is too low, there may be periods of time in which no bits are transmitted through the channel, and hence some channel bandwidth is wasted.
To overcome these two problems, a joint buffer control is usually used to maintain a buffer occupancy of about 50% of the buffer size after coding each frame. In order to do this, heuristic methods are usually employed, in which the target bit rate is increased if the current buffer level is less than half of the buffer size, and the target bit rate is decreased if the current buffer level is more than half of the buffer size. Such schemes are disclosed in "Scalable Rate Control for MPEG-4 Video", H.J. Lee, T.H. Chiang and Y.Q. Zhang, IEEE Trans. Circuit Syst. Video Technol., 10:878-894, 2000, and in "MPEG-4 rate control for multiple video objects", A. Vetro, H. Sun and Y. Wang, IEEE Trans. Circuit Syst. Video Technol., 9:186-199, 1999. These schemes encode video either at a predefined fixed rate or at one of a small predefined set of fixed rates. The existing schemes have problems when used in, for example, Internet applications and the streaming of video over the Internet. Due to the connectionless nature of the current Internet protocols and the routing mechanisms involved, the instantaneous bandwidth available to a particular user can vary widely in time and cannot in practice be known in advance. The existing bit rate control schemes cannot adapt themselves quickly enough to the variations of channel bandwidth, and are not effective enough to achieve the objectives of video over the Internet.
An aim of the present invention is to provide a bit rate control which provides for better video quality, especially but not exclusively in Internet applications.
Summary of the Invention
Viewed from a first aspect, the present invention provides a bit rate control system for the encoding of video data in which the encoded bits are placed in a buffer prior to transmission, and in which a target encoding bit rate is determined based on the fullness of the buffer, characterized in that the buffer is modelled by a fluid-flow traffic model preferably of the form:
$$B_c(n+1) = \max\{0,\; B_c(n) + T(n) - u(n)\}$$
where Bc(n) denotes the buffer level at time n;
T(n) is the actual encoding bit rate; and u(n) is the channel output rate.
The system of the present invention is able to keep the buffer occupancy closer to its target, which is preferably set at a predefined percentage (preferably about 50%) of a safety margin used to determine whether a frame of the video sequence to be encoded should be skipped, and to adapt itself faster to the variations of the channel bandwidth, and so will skip fewer frames at a low bandwidth. This therefore provides a higher overall video quality, and is attractive for video over the Internet.
Preferably, the target encoding bit rate is given by the equation:
Figure imgf000005_0001
where A is the channel output rate; y is a buffer safety margin; Bs is the buffer size; Bc(n) is the current buffer level; and 0 < γ < 1 is an adjustable parameter.
"A" may be equal to the number of bits available for encoding all of the inter- frames of a current group of frames being encoded divided by the number of inter-frames to be encoded in the current group of frames. Alternatively, when for example providing video over the Internet, "A" may be the actual bandwidth estimated by using the packet loss information. This allows the variation of the channel bandwidth to be directly incorporated into the buffer control, and allows the system to adapt itself in time.
Meanwhile, the target bit rate is preferably modified based on the remaining bits available for encoding and on the remaining frames to be encoded. It may thus be:
Figure imgf000005_0002
where 0 < β < 1 is an adjustable parameter;
Tr is the number of remaining bits available for encoding; Nr is the number of frames remaining to be encoded; and Hhdr(n-1) is the amount of overhead bits used for the previous frame. After the target bit rate is determined, a rate-distortion model preferably of the following form is further applied to determine the corresponding quantization parameter:
$$R = c_2 \frac{\sigma}{Q^2} + c_1 \frac{\sigma}{Q} + H_{hdr}$$
where R is the total number of bits used to encode a frame; Q is the quantization parameter; c1 and c2 are first and second order coefficients; σ is an index of video coding complexity; and
Hhdr is the amount of overhead bits used.
Further preferably, the coefficients of the Rate-Distortion model are updated based upon data from a plurality of previous frames. The number of previous frames used is preferably determined by a sliding window mechanism, wherein the value of the current window size W(n) is given by:
$$W(n) = \min\{W(n-1) + 1,\; \varsigma(n) \cdot W_{\max}\}$$
where Wmax is a preset constant;
Figure imgf000006_0001
σ(n) is the maximum absolute difference of the frame at time n.
Such a sliding window mechanism smoothes the impact of scene changes, and changes the window size gradually.
After the current frame is encoded, the total number of actual bits used to encode the current frame is added to the current buffer level. If the buffer is in danger of overflow, a switched frame skipping mechanism is preferably used to compute the number of skipped frames. In one frame skip control, after the current frame is encoded, the next frame to be encoded will be skipped, if:
$$B_c(n+1) + T(n) - A \ge B_s \cdot \gamma + TB(n)$$
where Bc(n+1) is the current buffer level;
T(n) is the actual number of bits used to encode the current frame; A is the channel output rate; Bs is the buffer size; γ is a pre-determined buffer safety margin;
Figure imgf000007_0001
and T(S(j,n)) (1 ≤ j ≤ W(n)) denotes the total number of actual bits generated in the encoding of the previous W(n) frames.
In an alternative frame skip control, a frame skipping parameter Npost is set to skip the next Npost frames so that the following buffer condition is satisfied:
$$B_c(n+1) < y B_s$$
where
$$B_c(n+1) = \max\{0,\; B_c(n) + T(n) - A (N_{post} + 1)\}$$
Bc(n) is the buffer level at time n;
T(n) is the actual number of bits used to encode the current frame;
A is the channel output rate;
Bs is the buffer size; and y is a pre-determined buffer safety margin.
The first-mentioned skipping control is preferably provided as a predictive switching control, the second-mentioned skipping control is preferably provided as a post-frame skipping control, and the skipping controls are preferably switched between one another based on the following switching law:
a) The predictive frame skipping control is switched to the post-skipping control if a frame is skipped; and
b) The post-skipping control is switched to the predictive frame skipping control if the current frame is not skipped.
The present invention also extends to a method for the encoding of a video sequence in accordance with the above system features, and to computer software for implementing the above system and method features.
It further extends to the use of the above features independently of one another, with for example the Rate-Distortion model defined above being in itself a new and advantageous model for use in bit rate control.
Brief Description of the Drawings
The present invention will hereinafter be described in greater detail by reference to the attached drawings which show an example form of the invention. It is to be understood that the particularity of the drawings does not supersede the generality of the preceding description of the invention.
Figure 1 is a diagram of the structure of a typical network over which video streaming may be provided; and
Figure 2 is a functional block diagram of a video encoder scheme according to an embodiment of the present invention.
Detailed Description of the Invention
Fig. 1 shows a typical Internet structure over which a video sequence may need to be transmitted from a source 1 to one or more receivers 2. Due to the amount of data in a video sequence, the data must be compressed, otherwise the required transmission bit-rate would be unachievably high.
Thus, an encoder 3 is provided at the source 1 in order to compress the video data, and decoders 4 are provided at the receivers 2 in order to decode the data and reconstruct the video sequence. In between the encoder 3 and decoders 4, the compressed data is routed through various servers 5 and over what may be many different types of transmission channel 6.
Various different encoding systems have been provided for the compression of video data, and, for example, MPEG video compression is often employed. The current MPEG standards are MPEG-1 and MPEG-2, which are similar in basic concept, and MPEG-4 which is able to provide a low-bandwidth multimedia format that can contain a mix of media (including recorded video images and sounds and their computer-generated counterparts), and uses the concept of "Video Objects" to transmit independent images of arbitrary shape.
In MPEG compression, a video sequence is broken into a number of Groups of Pictures (GOP), each of which comprises a number of picture frames. Each frame is broken into a series of slices, and each slice consists of a set of macroblocks comprising arrays of luminance pixels and associated chrominance pixels. The macroblocks are divided into 8x8 blocks for encoding. Each block undergoes a Discrete Cosine Transform (DCT) to provide an array of DCT coefficients that are then quantized to force various of the coefficients (generally higher frequency coefficients) to zero so as to reduce the amount of data to be transmitted. Quantization is carried out by multiplying the DCT coefficient array by a quantization matrix, each value in the matrix being scaled by a quantization parameter. The matrix and quantization parameter can be altered on a frame-by-frame and/or block-by-block basis to alter the amount of compression. The quantized coefficients then undergo further encoding to compress the transmission data still further.
The frames in a GOP comprise an Intra-frame (I frame) that is spatially compressed (in accordance with the above method), and Inter-frames (P and/or B frames) that are also temporally compressed in a motion-compensated prediction manner. Thus, each P frame in a sequence is predicted from the frame immediately preceding it, and each B frame is predicted from preceding and succeeding frames.
MPEG-4 also includes a Video Object layer between the frame layer and macroblock layer for specifying different independent objects within a scene.
In order to optimise video quality over a bit-rate range, e.g. in video-streaming to a number of receivers having different bandwidth capabilities, MPEG-4 also provides a Fine Granularity Scalability (FGS) scheme in which the coding of the video data is provided by a base layer and an enhancement layer, the base layer being designed to meet the lower bound of the bit rate range and the enhancement layer meeting the upper bound of the bit-rate range. The base layer is coded as discussed above, and the enhancement layer takes the original and reconstructed DCT coefficients of the base layer, and subtracts the reconstructed coefficients from the originals to provide a residue that is then encoded and transmitted with the base layer. The receivers of the data decode the base layer to provide a video signal based on the lowest bit rate range, and can improve the quality by decoding various amounts of the enhancement layer.
The present invention relates to a bit rate control scheme for the compression of video data, and may for example be used in encoding the base layer of an FGS scheme. It may especially be used in the FGS disclosed in the co-pending International PCT patent application filed in Singapore on 25 May 2001 and entitled "A Fine Granularity Scalability Scheme". The present bit-rate control scheme consists of three layers, namely the GOP layer, the frame layer and the video object layer. The whole scheme is shown in Fig. 2.
The GOP layer rate control 1 is used to allocate bits to each GOP of the video sequence, each GOP being composed of one I frame and a number of P and B frames.
The total number of bits available for the video sequence will be:
TB = x R
where is the duration of the video sequence; and R is the bit rate for the sequence.
Assuming that the total number of I frames is N and that the numbers of P and B frames in the ith GOP are NP,i and NB,i, and that the frames have weightings of WI, WP and WB, then the number of bits allocated to the ith GOP is:
Figure imgf000011_0001
For the sake of the present embodiment and for simplicity, it is assumed that each GOP has the same structure, and so the GOP Layer Rate Control will allocate each GOP the following number of bits:
$$TB_i = \frac{TB}{N}$$
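By way of illustration only, the equal-structure GOP allocation described above may be sketched as follows. The function and parameter names are hypothetical and do not appear in the specification; the sketch assumes that the total budget is the product of the sequence duration and the bit rate, and that the number of GOPs equals the number N of I frames.

```python
def gop_bit_budget(duration_s: float, bit_rate_bps: float, num_gops: int) -> float:
    """Bits allocated to each GOP when all GOPs share the same structure.

    Hypothetical helper: TB = duration x R, then TB_i = TB / N.
    """
    if num_gops <= 0:
        raise ValueError("num_gops must be positive")
    total_bits = duration_s * bit_rate_bps  # total bits TB for the sequence
    return total_bits / num_gops            # per-GOP share TB_i


if __name__ == "__main__":
    # Example values chosen for illustration: 10 s at 64 kbit/s, 5 GOPs.
    print(gop_bit_budget(10.0, 64_000, 5))
```

With unequal GOP structures, the weighted allocation referred to above would be needed instead; that case is not covered by this sketch.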
After the GOP layer rate control at block 1, the encoder carries out a buffer initialization at block 2, conducts the Intra-coding of the I-frame at block 3, updates a Rate-Distortion model at block 4 and checks as to whether the next frame must be skipped at a skip-frame block 5 (e.g. because of possible buffer overrun).
Inter-coding is then performed in which the encoder 3 performs a joint buffer control at block 6, a Frame Layer Target Bit Rate calculation at block 7 and a Quantization Parameter calculation at block 8, before carrying out the Inter-coding of the P or B frame at block 9. After encoding of the frame, the R-D model update and Frame-skip control are again carried out at blocks 4 and 5 before conducting the encoding of the next inter-frame through block 6, etc.
Where the encoder scheme is used in the Video Object layer, the encoder also conducts a Target Bit Rate Allocation at block 10, and calculates a shape threshold in block 8 along with the quantization parameter calculation.
The part of the bit rate control in the frame layer consists of three stages: the initialization, pre-encoding and post-encoding stages.
(a) Initialization Stage
In the initialization stage of block 2, the encoder carries out three main tasks with respect to the frame layer control, these being:
(i) initialization of the buffer size based on latency requirements;
(ii) subtraction of the bit count of the I-frame from the bit count of the ith GOP; and
(iii) initialization of the buffer fullness: if the first GOP is being encoded, the buffer fullness is set at 50% of a buffer safety margin (which will be 40% of the buffer size assuming a safety margin of 80%); otherwise, the buffer fullness is set at the end level of the previous GOP.
The I-frame is quantized using an initial quantization value of Q0. The remaining available bits R0(i) for encoding all of the subsequent inter-frames can be calculated as:
$$R_0(i) = TB_i - K_i + (0.5 \cdot B_s \cdot y - B_c(i))$$
where TBi is the number of bits available to encode the ith group of frames; Ki is the number of bits used to encode the ith intra-frame; Bs is the buffer size; y is the buffer safety margin for skipping frames, having a typical value of 0.8; and Bc(i) is the buffer level at the start of encoding of the ith group of frames, with Bc(1) = 0.5 * Bs * y.
The channel output rate (the average number of bits to be drained from the buffer per frame encoding) is then R0(i)/NP,i.
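The initialization arithmetic above may be illustrated by the following minimal sketch. All identifiers are hypothetical; the divisor is taken here to be the number of inter-frames in the group, which is an assumption about the garbled denominator in the text.

```python
def remaining_inter_bits(tb_i: float, k_i: float, buffer_size: float,
                         safety_margin: float, buffer_level: float) -> float:
    """R0(i) = TB_i - K_i + (0.5 * Bs * margin - Bc(i))."""
    target_level = 0.5 * buffer_size * safety_margin  # target buffer occupancy
    return tb_i - k_i + (target_level - buffer_level)


def channel_output_rate(r0: float, num_inter_frames: int) -> float:
    """Average number of bits drained from the buffer per encoded frame."""
    if num_inter_frames <= 0:
        raise ValueError("num_inter_frames must be positive")
    return r0 / num_inter_frames


if __name__ == "__main__":
    bs, margin = 32_000.0, 0.8
    start_level = 0.5 * bs * margin  # first GOP starts at the target level
    r0 = remaining_inter_bits(128_000.0, 20_000.0, bs, margin, start_level)
    print(r0, channel_output_rate(r0, 9))
```

When the buffer starts a group above its target level, the correction term is negative, so fewer bits are made available to the inter-frames of that group.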
(b) Pre-encoding Stage
The pre-encoding stage includes setting a target bit rate for the encoding of the next video frame in the GOP, and setting the quantization parameter for quantization of the DCT coefficients in accordance with the target bit rate.
When the number of bits in the buffer is too large (e.g. is predicted to exceed a safety margin), the encoder usually skips some frames to reduce the buffer delay and avoid buffer overflow. This however produces undesirable motion discontinuities in the encoded video sequence. Conversely, if the buffer level is too low, there may be periods of time in which no bits are transmitted through the channel, and channel bandwidth is wasted.
In order to overcome these problems, a frame level control is adopted which sets the target bit rate so as to attempt to maintain a buffer occupancy after the coding of each frame of about 50% of the buffer safety margin (i.e. about 40% of the buffer size for a 0.8 safety margin).
It should be noted that this differs from the prior art, which sets the target buffer fullness at the middle level of the buffer. The present scheme enables a low encoder buffer delay to be maintained and the total delay to be reduced. In order to determine the target bit rate, the dynamics of the buffer are represented by a fluid-flow traffic model with Bc(n) denoting the buffer level at time n:
$$B_c(n+1) = \max\{0,\; B_c(n) + T(n) - u(n)\} \qquad (1)$$
where T(n) is the actual encoding bit rate; and u(n) is the channel output rate.
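As a minimal sketch of equation (1), the buffer recursion may be simulated as below. The function name and the sample figures are hypothetical and serve only to illustrate how the level rises when frames cost more than the channel drains, and how the level is clamped at zero.

```python
def next_buffer_level(level: float, bits_in: float, bits_out: float) -> float:
    """One step of the fluid-flow model: Bc(n+1) = max{0, Bc(n) + T(n) - u(n)}."""
    return max(0.0, level + bits_in - bits_out)


if __name__ == "__main__":
    level = 12_800.0
    channel = 12_000.0                       # u(n), held constant here
    for frame_bits in (15_000.0, 14_000.0, 9_000.0, 2_000.0):
        level = next_buffer_level(level, frame_bits, channel)
        print(level)
```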
Using equation (1) and linear system control theory (see for example Chi-Tsong Chen, "Linear System Theory and Design", Holt, Rinehart and Winston, New York, 1984), the target bit rate is scaled based on the buffer size Bs, the current buffer level Bc(n) and the channel output rate R0(i)/NP,i, and is given by:
Figure imgf000014_0001
where 0 < γ < 1 is an adjustable parameter having a typical value of 0.75.
When calculating the bit rate for the frame, the number of remaining bits Tr allocated to the current GOP and the remaining number of frames Nr of the current GOP should also be taken into account to ensure that there are available bits for the remaining frames, and so the final frame bit rate is:
Figure imgf000014_0002
where 0 < β < 1 is an adjustable parameter having a typical value of 0.585; and
Hhdr(n-1) is the amount of bits used for overhead data, that is, the bits used for non-texture data, e.g. shape information, motion vector information and header information. It should be noted that the above method of using a fluid-flow model departs from the prior art use of heuristic methods for determining the target bit rate, and enables the buffer occupancy to be kept much closer to the target, so that fewer frames are skipped.
The present model-based method may be used in any suitable video transmission system, and is especially attractive when MPEG-4 video is transported over the Internet where variations in bandwidth occur. Using the heuristic approach, adjustment of the joint buffer control has a delay of one step, and cannot adapt itself in time to the variations in channel bandwidth. However, with the present model-based method, when the channel bandwidth is time-varying, the term R0(i)/NP,i may be replaced by the estimated actual channel bandwidth, e.g. by using the packet loss information. Thus, the variation of the channel bandwidth can be incorporated into the present joint buffer control, and the scheme can adapt itself in time.
A further point to note is that the receiver synchronization of a continuous media stream must deal with delay differences and variations. Since the present frame-layer control keeps the buffer occupancy much closer to the target (50% of the safety margin (40% of the buffer size)), the playout buffer delay can be reduced, and so the total delay is further reduced.
Once the target bit rate is determined, the corresponding quantization parameter, Q, can be computed by using a Rate-Distortion model, which takes the form of the following quadratic model:
$$R = c_2 \frac{\sigma}{Q^2} + c_1 \frac{\sigma}{Q} + H_{hdr}$$
where R is the total number of bits used to encode a frame; Q is the quantization parameter; c1 and c2 are first and second order coefficients; σ is the mean absolute difference of texture computed using the motion-compensated residual for the luminance component (an index of video coding complexity); and Hhdr is the amount of bits used for overhead data, that is, non-texture data, e.g. video/frame syntax, bits used for shape information, motion vector information and header information.
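One possible way of obtaining Q from the quadratic model is to solve it for 1/Q, as sketched below. This is an illustrative sketch only: the function names, the fallback behaviour and the clamping range of 1 to 31 are assumptions and are not taken from the specification.

```python
import math


def rd_bits(q: float, sigma: float, c1: float, c2: float, header_bits: float) -> float:
    """Bits predicted by the quadratic model for quantizer q."""
    return c2 * sigma / (q * q) + c1 * sigma / q + header_bits


def quantizer_for_target(target_bits: float, header_bits: float, sigma: float,
                         c1: float, c2: float,
                         q_min: float = 1.0, q_max: float = 31.0) -> float:
    """Solve c2*sigma*x^2 + c1*sigma*x = texture_bits for x = 1/Q, then clamp Q."""
    texture_bits = target_bits - header_bits
    if texture_bits <= 0 or sigma <= 0:
        return q_max  # assumed conservative fallback: coarsest quantizer
    a, b = c2 * sigma, c1 * sigma
    disc = b * b + 4.0 * a * texture_bits
    if abs(a) < 1e-12 or disc < 0:
        if b <= 0:
            return q_max
        q = b / texture_bits  # first-order model only
    else:
        x = (-b + math.sqrt(disc)) / (2.0 * a)
        if x <= 0:
            return q_max
        q = 1.0 / x
    return min(q_max, max(q_min, q))


if __name__ == "__main__":
    q = quantizer_for_target(150.0, 30.0, 10.0, 100.0, 200.0)
    print(q, rd_bits(q, 10.0, 100.0, 200.0, 30.0))
```

A practical encoder would normally round the result to an integer quantizer and may also limit the change from the previous frame; neither step is shown here.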
(c) Post-encoding Stage
The post-encoding stage includes the processes of updating the parameters c1 and c2 of the Rate-Distortion model and determining whether any frame-skipping is necessary to prevent possible buffer overflow.
The statistics of quantization parameter value and bit rate value, taken from a number of previously encoded frames including the immediately preceding frame, are used to provide improved parameters c1 and c2 for the R-D model by using a linear regression technique.
The number of frames to use is based on a sliding window mechanism, which is designed to smooth the impact that a scene change might have in the updating of the R-D model.
If the complexity changes significantly, i.e. in high motion scenes, a smaller window with more recent data points after the change is used. Otherwise, a window with more data points is used. To ensure that the window size is not varied too rapidly, the window size is increased gradually.
Thus, the value of the current window size W(n) is given by:
$$W(n) = \min\{W(n-1) + 1,\; \varsigma(n) \cdot \text{Max\_Sliding\_Window}\}$$
where Max_Sliding_Window is a preset constant, and may be set to e.g. 20; and
Figure imgf000017_0001
The selected sample data points within the window W(n) are denoted as S(j,n) (1 ≤ j ≤ W(n)).
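The gradual growth of the window may be illustrated by the following sketch. Because the definition of the scaling factor ς(n) is given only in a figure that is not reproduced in this text, it is passed in here as a plain number; truncating the product to an integer and enforcing a minimum window of one sample are further assumptions, as are all identifiers.

```python
def next_window_size(prev_window: int, scale: float, max_window: int = 20) -> int:
    """W(n) = min{W(n-1) + 1, scale * max_window}, kept to at least one sample."""
    candidate = int(scale * max_window)  # assumed integer truncation
    return max(1, min(prev_window + 1, candidate))


def select_samples(history: list, window: int) -> list:
    """Return the most recent `window` data points (window >= 1)."""
    return history[-window:]


if __name__ == "__main__":
    w = 5
    for scale in (1.0, 1.0, 0.25, 1.0):
        w = next_window_size(w, scale)
        print(w)
```

In this sketch a small scaling factor shrinks the window at once, while recovery proceeds by one sample per frame, which matches the stated aim of not varying the window size too rapidly when it grows.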
For the selected data points, the encoder collects the quantization parameter statistics Q(j) and the actual bit rate statistics T(j), and, using a linear regression technique, the parameters can be obtained by:
C3 - -0, c2 =
Figure imgf000017_0002
Figure imgf000017_0003
Figure imgf000017_0004
Figure imgf000017_0005
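The closed-form expressions referred to above are contained in figures that are not reproduced in this text. Purely as an illustration of the regression step, and not as the formulae of the specification, a generic least-squares fit of the two coefficients may be sketched as follows. It assumes that each sample has been normalised to y = (T - Hhdr)/σ, so that the model reads y = c1/Q + c2/Q²; all identifiers are hypothetical.

```python
def fit_rd_coefficients(q_values, normalized_bits):
    """Least-squares fit of y = c1*(1/Q) + c2*(1/Q^2) without an intercept."""
    if not q_values or len(q_values) != len(normalized_bits):
        raise ValueError("need equally sized, non-empty sample lists")
    u = [1.0 / q for q in q_values]   # first-order regressor 1/Q
    v = [x * x for x in u]            # second-order regressor 1/Q^2
    suu = sum(a * a for a in u)
    suv = sum(a * b for a, b in zip(u, v))
    svv = sum(b * b for b in v)
    suy = sum(a * y for a, y in zip(u, normalized_bits))
    svy = sum(b * y for b, y in zip(v, normalized_bits))
    det = suu * svv - suv * suv       # determinant of the normal equations
    if abs(det) < 1e-12 * max(1.0, suu * svv):
        # All quantizers (nearly) equal: fall back to a first-order model.
        return suy / suu, 0.0
    c1 = (suy * svv - svy * suv) / det
    c2 = (suu * svy - suv * suy) / det
    return c1, c2


if __name__ == "__main__":
    qs = [4.0, 8.0, 16.0]
    ys = [100.0 / q + 200.0 / (q * q) for q in qs]
    print(fit_rd_coefficients(qs, ys))
```

The fallback branch covers the case where every sample in the window used the same quantizer, in which case the two regressors are linearly dependent and only a first-order coefficient can be estimated.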
After updating the R-D model, the total number of actual bits T(n) used to encode the current frame is added to the current buffer level, and a switched frame skip control is performed to prevent buffer overflow and overcome continuous frame skipping. The switched frame skipping control is composed of two basic controllers (a predictive frame skipping controller and a post frame skipping controller) and a corresponding switching law to determine the active controller. In the predictive frame skip controller, a function TB is defined by:
Figure imgf000018_0001
where T(S(j,n)) (1 ≤ j ≤ W(n)) denotes the total number of actual bits generated in the encoding of the previous W(n) frames.
The next frame to be encoded will be skipped, if the current buffer level plus the estimated number of bits for the next frame is larger than the sum of TB(n) and some pre-determined threshold, called the safety margin, that is if:
$$B_c(n+1) + T(n) - A \ge B_s \cdot \gamma + TB(n)$$
where Bc(n+1) is the current buffer level; T(n) is the actual number of bits used to encode the current frame;
A is the channel output rate (which may be R0(i)/NP,i or is replaced by the estimated actual channel bandwidth);
Bs is the buffer size; and γ is the pre-determined safety margin.
If skipping takes place, the current buffer level is reduced by the channel output rate.
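The predictive test above may be sketched as follows. The value of TB(n) is defined in a figure that is not reproduced in this text, so it is supplied here as an input; the identifiers, and the clamping of the drained buffer level at zero, are assumptions made for the purpose of illustration.

```python
def should_skip_next(buffer_level: float, bits_current: float, channel_rate: float,
                     buffer_size: float, safety_margin: float, tb: float) -> bool:
    """True if Bc(n+1) + T(n) - A >= Bs * margin + TB(n)."""
    predicted = buffer_level + bits_current - channel_rate
    return predicted >= buffer_size * safety_margin + tb


def drain_after_skip(buffer_level: float, channel_rate: float) -> float:
    """Buffer level after one skipped frame, clamped at zero (assumption)."""
    return max(0.0, buffer_level - channel_rate)


if __name__ == "__main__":
    level = 30_000.0
    if should_skip_next(level, 8_000.0, 4_000.0, 32_000.0, 0.8, 0.0):
        level = drain_after_skip(level, 4_000.0)
    print(level)
```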
In the post frame skipping controller, a frame skipping parameter Npost is increased from zero until the following buffer condition is satisfied; the next Npost frames are then skipped by the encoder:
$$B_c(n+1) < \gamma B_s$$
where
$$B_c(n+1) = \max\{0,\; B_c(n) + T(n) - A (N_{post} + 1)\}$$
The predictive frame skipping control is initially used, and the switching law is:
a) The predictive frame skipping control is switched to the post-skipping control if a frame is skipped; and
b) The post-skipping control is switched to the predictive frame skipping control if the current frame is not skipped.
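The post frame skipping computation and the switching law may be illustrated together by the following sketch. The identifiers and the string labels for the two controllers are hypothetical, and the input validation is an added assumption needed to guarantee that the loop terminates.

```python
def post_skip_count(buffer_level: float, bits_current: float, channel_rate: float,
                    buffer_size: float, safety_margin: float) -> int:
    """Smallest Npost with max{0, Bc(n) + T(n) - A*(Npost + 1)} < margin * Bs."""
    threshold = safety_margin * buffer_size
    if channel_rate <= 0 or threshold <= 0:
        raise ValueError("channel_rate and safety threshold must be positive")
    n_post = 0
    while max(0.0, buffer_level + bits_current - channel_rate * (n_post + 1)) >= threshold:
        n_post += 1  # skip one more frame and let the channel drain the buffer
    return n_post


def next_controller(active: str, frame_skipped: bool) -> str:
    """Switching law: predictive -> post on a skip, post -> predictive otherwise."""
    if active == "predictive" and frame_skipped:
        return "post"
    if active == "post" and not frame_skipped:
        return "predictive"
    return active


if __name__ == "__main__":
    print(post_skip_count(30_000.0, 8_000.0, 4_000.0, 32_000.0, 0.8))
    print(next_controller("predictive", True))
```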
Instead of using the switched frame-skipping control, the predictive or post frame skipping control may be used by itself.
Besides using the present method on the frame layer rate control, the above method may also be used to control the video object layer rate control.
In the video object rate control, the total target bit rate (as found in the frame layer control) is allocated to each video object according to its coding complexity, size and perceptual importance. Thus, for a given target bit rate, the target bit rate for an object i is given by:
fi(n) = (f(n)
Figure imgf000019_0001
τ∑ .{MOTiJX(n) + MOTijy(n))+ -τ)Pi + H"Λn 1} * ∑ ∑j (MOTljx (n) + MOTlJy (»))+ (1 -τ )∑ . Pj
where p- is the size of the video object i; ff** -i) =∑ ^ -i) ;
MOTijx(n) and MOTijy(n) are the absolute values of the jth motion vector components within the object i at time n; and τ is an adjustable parameter, 0 < τ < 1.
Also, to avoid using excessive bits for motion and shape information instead of for texture, and to balance the bit usage without imposing additional noticeable distortion, the shape threshold values can be set dynamically based on the previous coding information.
In the adaptive threshold shape control, let
W,(n-1) B (i) = ∑Hhd i(S(j,n))
W,(n -Ϊ)
Figure imgf000020_0001
The threshold for the video object i, θi, is initially set to zero. If fi(n) is less than Hhdr,i(n-1) - 1.25 HB(i) in the previous frame, then:
$$\theta_i = \min\{\theta_{\max}(i),\; \theta_i + \theta_{step}(i)\}$$
where θstep(i) > 0 and θmax(i) > 0 are predefined.
If fi(n) is greater than Hhdr,i(n-1) + 1.25 HB(i), then it is decreased by:
$$\theta_i = \max\{0,\; \theta_i - \theta_{step}(i)\}$$
Otherwise, the threshold is not changed.
When controlling the video object layer, the switched frame skipping control will preferably be used.
Besides controlling the frame layer bit rate and the video object layer bit rate, the present scheme can also be applied to macroblock layer rate control. The method is thus scalable.
It is to be understood that various alterations, additions and/or modifications may be made to the parts previously described without departing from the ambit of the invention, and that, in the light of the teachings of the present invention, the control scheme may be implemented in software and/or hardware in a variety of manners.

Claims

1. A bit rate control system for the encoding of a video sequence in which encoded data is placed in a buffer prior to transmission, and in which a target encoding bit rate is determined based on the fullness of the buffer, characterised in that the buffer is modelled on a fluid-flow traffic model.
2. The system of claim 1, wherein said fluid-flow traffic model is of the form:
$$B_c(n+1) = \max\{0,\; B_c(n) + T(n) - u(n)\}$$
where Bc(n) denotes the buffer level at time n; T(n) is the actual encoding bit rate; and u(n) is the channel output rate.
3. The system of claim 1, in which a rate-distortion model, used to compute a quantization parameter for the control system, has the form:
Figure imgf000021_0001
where R is the total number of bits used to encode a frame; Q is the quantization parameter; c1 and c2 are first and second order coefficients; σ is an index of video coding complexity; and
Hhdr is the amount of overhead bits used.
4. The bit rate control system of claim 1, 2 or 3, wherein a buffer occupancy target is set at a predefined percentage of a safety margin, said safety margin being used to determine whether a frame of the video sequence to be encoded should be skipped.
5. The bit rate control system of claim 4, wherein said buffer target occupancy is set to about 50% of said safety margin.
6. The bit rate control system of any preceding claim, wherein said target encoding bit rate is given by the equation:
Figure imgf000022_0001
where A is the channel output rate; y is a buffer safety margin;
Bs is the buffer size; Bc(n) is the current buffer level; and 0 < γ < 1 is an adjustable parameter.
7. The bit rate control system of claim 6, wherein A is equal to the number of bits available for encoding all of the inter-frames of a current group of frames being encoded divided by the number of inter-frames to be encoded in the current group of frames.
8. The system of claim 7, wherein the available bits R0(i) for encoding the inter-frames of the ith group of frames is:
$$R_0(i) = TB_i - K_i + (0.5 \cdot B_s \cdot y - B_c(i))$$
where TBi is the number of bits available to encode the ith group of frames;
Ki is the number of bits used to encode the ith intra-frame; Bs is the buffer size; y is the buffer safety margin; and Bc(i) is the buffer level at the start of encoding the ith group of frames.
9. The bit rate control system of claim 6, wherein A is the estimated actual channel bandwidth.
10. The system of any preceding claim, wherein the target bit rate is modified based on the remaining bits available for encoding and on the remaining frames to be encoded.
11. The system of claim 10, wherein the target bit rate is:
/(«) = max{β *i + (l- β)V(«)5-5- + iJM/.(W-l)} iV0 3N r„
where 0 < β < 1 is an adjustable parameter;
Tr is the number of remaining bits available for encoding; Nr is the number of frames remaining to be encoded; and Hhdr(n-1 ) is the amount of overhead bits used for the previous frame.
12. The system of any preceding claim, wherein the bit rate control uses a rate-distortion model to determine the quantization parameter for a frame to be encoded, and wherein the coefficients of said model are updated based upon data from a plurality of previous frames, the number of previous frame used being determined by a sliding window mechanism, wherein the value of the current window size W(n) is given by:
$$W(n) = \min\{W(n-1) + 1,\; \varsigma(n) \cdot W_{\max}\}$$
where Wmax is a preset constant; and
Figure imgf000023_0001
13. The system of any preceding claim, wherein after the current frame is encoded, the next frame to be encoded will be skipped, if:
$$B_c(n+1) + T(n) - A \ge B_s \cdot y + TB(n)$$
where Bc(n+1) is the current buffer level; T(n) is the actual number of bits used to encode the current frame;
A is the channel output rate; Bs is the buffer size; y is a pre-determined buffer safety margin; and
Figure imgf000024_0001
where T(S(j,n)) (1 ≤ j ≤ W(n)) denotes the total number of actual bits generated in the encoding of the previous W(n) frames.
14. The system of any of claims 1 to 13, wherein after the current frame is encoded, the total number of actual bits used to encode the current frame is added to the current buffer level, and wherein a frame skipping parameter Npost is set to skip the next Npost frames so that the following buffer condition is satisfied:
$$B_c(n+1) < y B_s$$
where
$$B_c(n+1) = \max\{0,\; B_c(n) + T(n) - A (N_{post} + 1)\}$$
where Bc(n) is the buffer level at time n;
T(n) is the actual number of bits used to encode the current frame;
A is the channel output rate;
Bs is the buffer size; and y is a pre-determined buffer safety margin.
15. The system of claims 13 and 14, wherein the skipping control of claim 13 is provided as a predictive switching control, the skipping control of claim 14 is provided as a post-frame skipping control, and the skipping controls are switched between one another based on a switching law, said switching law being:
a) The predictive frame skipping control is switched to the post-skipping control if a frame is skipped; and
b) The post-frame skipping control is switched to the predictive frame skipping control if the current frame is not skipped.
16. A method for encoding a video sequence, including the step of placing encoded data into a buffer prior to transmission, and the step of determining a target encoding bit rate based on the fullness of the buffer, characterised by the step of modelling the buffer based on a fluid-flow traffic model.
17. The method of claim 16, wherein the fluid-flow model is of the form:
$$B_c(n+1) = \max\{0,\; B_c(n) + T(n) - u(n)\}$$
where Bc(n) denotes the buffer level at time n;
T(n) is the actual encoding bit rate; and u(n) is the channel output rate.
18. The method of claim 16 or 17, including the step of determining a quantization parameter for encoding the data based on a rate-distortion equation having the form:
Figure imgf000025_0001
where R is the total number of bits used to encode a frame; Q is the quantization parameter; c1 and c2 are first and second order coefficients; σ is an index of video coding complexity; and Hhdr is the amount of overhead bits used.
19. Computer software for the encoding of a video sequence, wherein encoded data is placed in a buffer prior to its transmission, and wherein the computer software includes a component which determines a target encoding bit rate based on the fullness of the buffer, characterised in that the software includes a component for modelling the buffer based on a fluid-flow traffic model.
20. The software of claim 19, including a component for determining a quantization parameter for encoding the data based on a rate-distortion equation having the form:
$$R = c_2 \frac{\sigma}{Q^2} + c_1 \frac{\sigma}{Q} + H_{hdr}$$
where R is the total number of bits used to encode a frame;
Q is the quantization parameter; c1 and c2 are first and second order coefficients; σ is an index of video coding complexity; and Hhdr is the amount of overhead bits used.
21. A bit rate control system for the encoding of video data, wherein a rate- distortion model is used to determine a quantization parameter to use in the encoding, and characterised in that the rate-distortion model has the form:
Figure imgf000026_0001
where R is the total number of bits used to encode a frame; Q is the quantization parameter; c1 and c2 are first and second order coefficients; σ is an index of video coding complexity; and
Hhdr is the amount of overhead bits used.
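The model equation of claim 21 appears here only as an image reference. The sketch below assumes it takes the quadratic form R = c1·σ/Q + c2·σ/Q² + Hhdr, which is what the listed symbols suggest, and solves it for Q given a target bit count. The function name `solve_q`, the clamping range of 1 to 31 and the fallback to the coarsest quantizer are all assumptions of this example.

```python
import math


def solve_q(target_bits: float, header_bits: float, sigma: float,
            c1: float, c2: float,
            q_min: float = 1.0, q_max: float = 31.0) -> float:
    """Solve R = c1*sigma/Q + c2*sigma/Q**2 + Hhdr for Q (assumed model form)."""
    texture_bits = target_bits - header_bits
    if texture_bits <= 0 or sigma <= 0:
        return q_max                       # no bits left for texture: coarsest Q
    a = c2 * sigma                         # coefficient of x**2, with x = 1/Q
    b = c1 * sigma                         # coefficient of x
    if abs(a) < 1e-12:                     # model degenerates to first order
        if b <= 0:
            return q_max
        x = texture_bits / b
    else:
        disc = b * b + 4.0 * a * texture_bits
        if disc < 0:
            return q_max                   # no real solution: fall back
        x = (-b + math.sqrt(disc)) / (2.0 * a)
    if x <= 0:
        return q_max
    return min(q_max, max(q_min, 1.0 / x))
```

With σ = 10, c1 = 100, c2 = 200 and Hhdr = 50, a target of 330 bits yields Q = 5, since 100·10/5 + 200·10/25 + 50 = 330.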
22. A bit rate control system for the encoding of a video sequence in which encoded data is placed in a buffer prior to its transmission, and in which a target encoding bit rate is determined based on the fullness of the buffer, characterised in that a buffer occupancy target is set at a set percentage of a safety margin, said safety margin being used to determine whether a frame of the video sequence to be encoded should be skipped.
23. The bit rate control system of claim 22, wherein said buffer target occupancy is set to about 50% of said safety margin.
24. A bit rate control system for the encoding of a video sequence in which encoded data is placed in a buffer prior to its transmission, and in which a target encoding bit rate is determined based on the fullness of the buffer, the bit rate control using a rate-distortion model to determine a quantization parameter for a frame to be encoded, and wherein the coefficients of said model are updated based upon data from a plurality of previous frames, the number of previous frames used being determined by a sliding window mechanism, characterised in that the value of the current window size W(n) is given by:
$$W(n) = \min\{W(n-1) + 1,\ \varsigma(n) \cdot W_{max}\}$$
where Wmax is a preset constant; and
Figure imgf000027_0001
σ(n) being an index of video coding complexity.
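A minimal sketch of the sliding-window update in claim 24, reading the window rule as W(n) = min{W(n-1) + 1, ς(n)·Wmax}. The definition of ς(n) is given only as an image reference here, so the sketch takes it as an input named `scale`. Truncating to an integer and keeping at least one frame in the window are assumptions of this example.

```python
def window_size(prev_window: int, scale: float, w_max: int) -> int:
    """Grow the window by one frame, capped at scale * w_max.

    `scale` stands in for the factor written as a function of sigma(n)
    in the claim; its formula is not reproduced in this text.
    """
    cap = int(scale * w_max)               # assumption: truncate to whole frames
    return max(1, min(prev_window + 1, cap))
```

Under this reading the window grows by one frame per encoded frame while complexity is stable, and shrinks at once when `scale` drops, so that older frames stop influencing the model coefficients.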
25. A bit rate control system for the encoding of a video sequence in which encoded data is placed in a buffer prior to its transmission, and in which video data to be encoded is skipped if it is determined that buffer overflow may occur, characterised in that said skip control comprises: a) a predictive skip control, in which, after the current frame is encoded, the next frame to be encoded will be skipped, if:
$$B_c(n+1) + T(n) - A \ge B_s \gamma + \mathit{TB}(n)$$
where Bc(n+1) is the current buffer level;
T(n) is the actual number of bits used to encode the current frame; A is the channel output rate; Bs is the buffer size; γ is a pre-determined buffer safety margin; and
Figure imgf000028_0001
where T(S(j,n)) (1< j < Ws(n)) denotes the total number of actual bits generated in the encoding of the previous W(n) frames;
b) a post frame skip control in which after the current frame is encoded, the total number of actual bits used to encode the current frame is added to the current buffer level, and wherein a frame skipping parameter Npost is set to skip the next Npost frames so that the following buffer condition is satisfied:
$$B_c(n+1) < \gamma B_s$$
where
$$B_c(n+1) = \max\{0,\ B_c(n) + T(n) - A(N_{post} + 1)\}$$
where Bc(n) is the buffer level at time n; T(n) is the actual number of bits used to encode the current frame;
A is the channel output rate; Bs is the buffer size; and γ is a pre-determined buffer safety margin.
c) a switching law through which the skipping controls are switched between one another, said switching law being: a) The predictive frame skipping control is switched to the post-skipping control if a frame is skipped; and b) The post-frame skipping control is switched to the predictive frame skipping control if the current frame is not skipped.
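The post-frame skip control in part b) of claim 25 can be sketched as a search for a skip count that brings the buffer back under the safety threshold. The claim does not say that Npost must be the smallest such count; choosing the smallest is an assumption of this example, as is the function name `post_frame_skips`.

```python
def post_frame_skips(level: float, frame_bits: float, channel_rate: float,
                     buffer_size: float, gamma: float) -> int:
    """Smallest Npost with max{0, Bc(n) + T(n) - A*(Npost + 1)} < gamma * Bs."""
    if channel_rate <= 0:
        raise ValueError("channel_rate must be positive")
    threshold = gamma * buffer_size
    if threshold <= 0:
        raise ValueError("gamma * buffer_size must be positive")
    n_post = 0
    # Each skipped frame lets the channel drain one more interval of bits.
    while max(0.0, level + frame_bits - channel_rate * (n_post + 1)) >= threshold:
        n_post += 1
    return n_post
```

For instance, with Bc(n) = 8000, T(n) = 3000, A = 1000, Bs = 10000 and γ = 0.75, the buffer level falls to 7000 only after three skipped frames, so the function returns 3.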
PCT/SG2001/000105 2001-05-25 2001-05-25 Bit rate control for video compression WO2002096120A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/SG2001/000105 WO2002096120A1 (en) 2001-05-25 2001-05-25 Bit rate control for video compression

Publications (1)

Publication Number Publication Date
WO2002096120A1 true WO2002096120A1 (en) 2002-11-28

Family

ID=20428942

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2001/000105 WO2002096120A1 (en) 2001-05-25 2001-05-25 Bit rate control for video compression

Country Status (1)

Country Link
WO (1) WO2002096120A1 (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1721467A2 (en) * 2004-02-06 2006-11-15 Apple Computer, Inc. Rate and quality controller for h.264/avc video coder and scene analyzer therefor
WO2006134455A1 (en) * 2005-06-13 2006-12-21 Nokia Corporation System and method for providing one-pass rate control in encoders
WO2007006181A1 (en) * 2005-07-14 2007-01-18 Intel Corporation A rate control method and apparatus
US7773672B2 (en) 2006-05-30 2010-08-10 Freescale Semiconductor, Inc. Scalable rate control system for a video encoder
US7804897B1 (en) * 2002-12-16 2010-09-28 Apple Inc. Method for implementing an improved quantizer in a multimedia compression and encoding system
US7940843B1 (en) 2002-12-16 2011-05-10 Apple Inc. Method of implementing improved rate control for a multimedia compression and encoding system
US7978764B2 (en) 2003-06-27 2011-07-12 Nxp B.V. Method of video encoding for handheld apparatuses selecting best prediction function according to optimal rate-distortion value
US8077775B2 (en) 2006-05-12 2011-12-13 Freescale Semiconductor, Inc. System and method of adaptive rate control for a video encoder
CN102860010A (en) * 2010-05-06 2013-01-02 日本电信电话株式会社 Video encoding control method and apparatus
US8428127B2 (en) 2002-07-15 2013-04-23 Apple Inc. Method of performing rate control for a compression system
US8953673B2 (en) 2008-02-29 2015-02-10 Microsoft Corporation Scalable video coding and decoding with sample bit depth and chroma high-pass residual layers
US8964854B2 (en) 2008-03-21 2015-02-24 Microsoft Corporation Motion-compensated prediction of inter-layer residuals
US9179165B2 (en) 2010-05-07 2015-11-03 Nippon Telegraph And Telephone Corporation Video encoding control method, video encoding apparatus and video encoding program
US9319729B2 (en) 2006-01-06 2016-04-19 Microsoft Technology Licensing, Llc Resampling and picture resizing operations for multi-resolution video coding and decoding
US9571856B2 (en) 2008-08-25 2017-02-14 Microsoft Technology Licensing, Llc Conversion operations in scalable video encoding and decoding
US9883202B2 (en) 2006-10-06 2018-01-30 Nxp Usa, Inc. Scaling video processing complexity based on power savings factor
CN107683600A (en) * 2015-06-12 2018-02-09 爱立信股份有限公司 System and method for managing the delivering of ABR bit rates in response to the video buffer characteristic of client
US9942570B2 (en) 2005-11-10 2018-04-10 Nxp Usa, Inc. Resource efficient video processing via prediction error computational adjustments
CN115834975A (en) * 2022-11-17 2023-03-21 中国联合网络通信集团有限公司 Video transmission method, device, equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998037701A1 (en) * 1997-02-12 1998-08-27 Sarnoff Corporation Apparatus and method for optimizing the rate control in a coding system
EP0892555A2 (en) * 1997-07-18 1999-01-20 Mitsubishi Denki Kabushiki Kaisha Adaptive video coding method
US6229849B1 (en) * 1997-12-08 2001-05-08 Sony Corporation Coding device and method

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GUANG-LIANG LI: "An analysis of transient loss performance impact of long-range dependence in ATM traffic", IEEE ATM WORKSHOP 1997. PROCEEDINGS LISBOA, PORTUGAL 25-28 MAY 1997, NEW YORK, NY, USA,IEEE, US, 25 May 1997 (1997-05-25), pages 603 - 610, XP010247447, ISBN: 0-7803-4196-1 *
GUSTAFSSON E ET AL: "Fluid traffic modelling in simulation of a call admission control scheme for ATM networks", MODELING, ANALYSIS, AND SIMULATION OF COMPUTER AND TELECOMMUNICATION SYSTEMS, 1997. MASCOTS '97., PROCEEDINGS FIFTH INTERNATIONAL SYMPOSIUM ON HAIFA, ISRAEL 12-15 JAN. 1997, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 12 January 1997 (1997-01-12), pages 110 - 115, XP010211378, ISBN: 0-8186-7758-9 *
LEE H-J ET AL: "RATE-DISTORTION BASED OPTIMIZATION FOR ZEROTREE ENTROPY WAVELET CODING", IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, IEEE INC. NEW YORK, US, vol. 45, no. 3, August 1999 (1999-08-01), pages 650 - 660, XP000926980, ISSN: 0098-3063 *
LEE H-J ET AL: "SCALABLE RATE CONTROL FOR MPEG-4 VIDEO", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE INC. NEW YORK, US, vol. 10, no. 6, September 2000 (2000-09-01), pages 878 - 894, XP000959031, ISSN: 1051-8215 *
VETRO A ET AL: "MPEG-4 RATE CONTROL FOR MULTIPLE VIDEO OBJECTS", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE INC. NEW YORK, US, vol. 9, no. 1, February 1999 (1999-02-01), pages 186 - 199, XP000802297, ISSN: 1051-8215 *

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8428127B2 (en) 2002-07-15 2013-04-23 Apple Inc. Method of performing rate control for a compression system
US8477843B2 (en) 2002-12-16 2013-07-02 Apple Inc. Method of implementing improved rate control for a multimedia compression and encoding system
US7804897B1 (en) * 2002-12-16 2010-09-28 Apple Inc. Method for implementing an improved quantizer in a multimedia compression and encoding system
US7940843B1 (en) 2002-12-16 2011-05-10 Apple Inc. Method of implementing improved rate control for a multimedia compression and encoding system
US7978764B2 (en) 2003-06-27 2011-07-12 Nxp B.V. Method of video encoding for handheld apparatuses selecting best prediction function according to optimal rate-distortion value
EP1721467A2 (en) * 2004-02-06 2006-11-15 Apple Computer, Inc. Rate and quality controller for h.264/avc video coder and scene analyzer therefor
WO2006134455A1 (en) * 2005-06-13 2006-12-21 Nokia Corporation System and method for providing one-pass rate control in encoders
US8594179B2 (en) 2005-07-14 2013-11-26 Intel Corporation Rate control method and apparatus
WO2007006181A1 (en) * 2005-07-14 2007-01-18 Intel Corporation A rate control method and apparatus
CN101223790B (en) * 2005-07-14 2013-03-27 英特尔公司 Rate control method and apparatus
US9942570B2 (en) 2005-11-10 2018-04-10 Nxp Usa, Inc. Resource efficient video processing via prediction error computational adjustments
US9319729B2 (en) 2006-01-06 2016-04-19 Microsoft Technology Licensing, Llc Resampling and picture resizing operations for multi-resolution video coding and decoding
US8077775B2 (en) 2006-05-12 2011-12-13 Freescale Semiconductor, Inc. System and method of adaptive rate control for a video encoder
US7773672B2 (en) 2006-05-30 2010-08-10 Freescale Semiconductor, Inc. Scalable rate control system for a video encoder
US9883202B2 (en) 2006-10-06 2018-01-30 Nxp Usa, Inc. Scaling video processing complexity based on power savings factor
US8953673B2 (en) 2008-02-29 2015-02-10 Microsoft Corporation Scalable video coding and decoding with sample bit depth and chroma high-pass residual layers
US8964854B2 (en) 2008-03-21 2015-02-24 Microsoft Corporation Motion-compensated prediction of inter-layer residuals
US9571856B2 (en) 2008-08-25 2017-02-14 Microsoft Technology Licensing, Llc Conversion operations in scalable video encoding and decoding
US10250905B2 (en) 2008-08-25 2019-04-02 Microsoft Technology Licensing, Llc Conversion operations in scalable video encoding and decoding
US9179154B2 (en) 2010-05-06 2015-11-03 Nippon Telegraph And Telephone Corporation Video encoding control method and apparatus
TWI458355B (en) * 2010-05-06 2014-10-21 Nippon Telegraph & Telephone Video encoding control method and apparatus
EP2568704A4 (en) * 2010-05-06 2013-12-18 Nippon Telegraph & Telephone Video encoding control method and apparatus
EP2568704A1 (en) * 2010-05-06 2013-03-13 Nippon Telegraph And Telephone Corporation Video encoding control method and apparatus
CN102860010A (en) * 2010-05-06 2013-01-02 日本电信电话株式会社 Video encoding control method and apparatus
US9179165B2 (en) 2010-05-07 2015-11-03 Nippon Telegraph And Telephone Corporation Video encoding control method, video encoding apparatus and video encoding program
CN107683600A (en) * 2015-06-12 2018-02-09 爱立信股份有限公司 System and method for managing the delivering of ABR bit rates in response to the video buffer characteristic of client
CN115834975A (en) * 2022-11-17 2023-03-21 中国联合网络通信集团有限公司 Video transmission method, device, equipment and medium
CN115834975B (en) * 2022-11-17 2024-05-17 中国联合网络通信集团有限公司 Video transmission method, device, equipment and medium

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP