WO2000040032A1 - Adaptive buffer and quantizer regulation scheme for bandwidth scalability of video data - Google Patents

Adaptive buffer and quantizer regulation scheme for bandwidth scalability of video data

Info

Publication number
WO2000040032A1
Authority
WO
WIPO (PCT)
Prior art keywords
frames
frame
video
quantizing
encoding
Prior art date
Application number
PCT/EP1999/010223
Other languages
French (fr)
Inventor
Shing-Chi Tzou
Zhiyong W. Wang
Junwun Lee
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V.
Priority to EP99967977A (patent EP1057344A1)
Priority to JP2000591812A (patent JP2002534864A)
Publication of WO2000040032A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/65Transmission of management data between client and server
    • H04N21/658Transmission by the client directed to the server
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/115Selection of the code volume for a coding unit prior to coding
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • H04N19/126Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/152Data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/162User input
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/164Feedback from the receiver or from the transmission channel
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H04N19/197Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters including determination of the initial value of an encoding parameter
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/637Control signals issued by the client directed to the server or network components
    • H04N21/6377Control signals issued by the client directed to the server or network components directed to server

Definitions

  • This invention relates to the field of video image processing and data communications and in particular to the field of video image encoding.
  • Video image encoding techniques are well known in the art.
  • Encoding standards such as CCITT H.261, CCITT H.263, and MPEG provide methods and techniques for efficiently encoding sequences of video images. These standards exploit the temporal correlation of frames in a video sequence by using a motion-compensated prediction, and exploit the spatial correlation of the frames by using a frequency transformation, such as a Discrete Cosine Transformation (DCT).
  • DCT: Discrete Cosine Transformation
  • the resultant frequency component coefficients, the measures of energy at each frequency
  • the non-uniformly distributed coefficients are quantized, typically producing some non-zero quantized coefficients among many zero valued quantized coefficients.
  • the occurrence of many zero-valued coefficients, and of similarly valued non-zero quantized coefficients, allows for an efficient encoding using an entropy-based encoding, such as Huffman/run-length encoding.
  • the aforementioned quantizing process introduces some loss of quality, or precision, in the encoding.
  • the quantization step size determines the degree of loss of quality in the encoding process.
  • a small quantization step size introduces less round-off error, or loss of precision, than a large quantization step size.
  • the quantization step size determines the resultant size of the entropy based encoding.
  • a small quantization step size, for example, rounds fewer coefficients to a zero level than a large quantization step size, and therefore there will be fewer long runs of zero values that can be efficiently encoded.
  • a small quantization step size provides for a high quality reproduction of the original image, but at the cost of a larger sized encoding.
  • a large quantization step size provides for a smaller sized encoding, but with a resultant loss of quality in the reproduction of the original image.
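  • The trade-off described above can be illustrated with a minimal sketch of a uniform quantizer. The coefficient values and the two step sizes are made up for illustration and are not taken from the source; the names `quantize` and `dequantize` are likewise hypothetical.

```python
def quantize(coefficients, step):
    """Map each coefficient to the nearest integer multiple of `step`."""
    return [int(round(c / step)) for c in coefficients]


def dequantize(levels, step):
    """Reconstruct approximate coefficient values from quantized levels."""
    return [level * step for level in levels]


# Illustrative coefficient values only (assumed, not from the source).
COEFFICIENTS = [52.0, -13.5, 6.2, 3.1, -2.4, 1.2, 0.7, -0.3]

for step in (2.0, 16.0):
    levels = quantize(COEFFICIENTS, step)
    restored = dequantize(levels, step)
    worst_error = max(abs(c - r) for c, r in zip(COEFFICIENTS, restored))
    # A larger step yields more zeros (cheaper to encode) but a larger error.
    print(f"step={step}: levels={levels} zeros={levels.count(0)} "
          f"worst error={worst_error:.2f}")
```

  • With the small step, most coefficients survive as distinct non-zero levels; with the large step, most collapse to zero, which shortens the entropy encoding at the cost of a larger reconstruction error.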
  • variable sized encodings of an image are often communicated over a fixed bandwidth communications channel, such as, for example, a telephone line used for video teleconferencing, or a link to a web site containing video information.
  • the variable length encoded images are communicated to a buffer at the receiving site, decoded, and presented to the receiving display at a fixed image frame rate. That is, for example, in a video teleconferencing call, the sequence of images may be encoded at a rate of ten video frames per second. Because the encodings of each frame are of variable length, some frames may have an encoded length that requires more than a tenth of a second to be communicated over the fixed bandwidth communications channel, while others require less than a tenth of a second.
  • the aggregate encoded frame transmission rate should equal the video frame rate.
  • the receiving buffer size determines the degree of variability about this aggregate rate that can be tolerated without underflowing or overflowing the buffer. That is, if the receiving buffer underflows, a frame will not be available for display when the next period of the video frame rate occurs; if the receiving buffer overflows, the received encoding is lost, and the frame will not be displayable when the next period of the video frame rate occurs. In both cases, a staggering of the frame display occurs and produces a visually disturbing artifact. Techniques are common in the art for controlling the sizes of the variable sized encodings so that the receiving buffer does not overflow or underflow.
  • the quantization step size is selected to provide a preferred level of buffer fullness to support a given video frame rate without overflowing or underflowing the receive buffer. Because the receive buffer is of limited size, the quality of the encoding can become unacceptably poor, particularly when communicating via a low bandwidth communications path.
  • Another problem with the known methods and techniques is the determination of the initial quantization step size for each frame.
  • Conventional techniques use the last determined quantization step size from the prior frame as the initial quantization step size for the subsequent frame to provide, somewhat, for a consistent level of buffer fullness. Because the quantization step size is determined based upon a measure of buffer fullness, the quantization step size of a prior frame is generally a poor estimator of the appropriate quantization step size for a subsequent frame to provide consistent quality.
  • the video encodings exploit the temporal correlation of frames in a video sequence by using a motion-compensated prediction, wherein each frame is encoded as changes from the prior frame.
  • the first frame of a sequence must be encoded as an independent frame, as well as frames that are encoded to recommence a sequence after a transmission error. Because an independent encoding of a frame typically has significantly more bits than an encoding of changes relative to a prior frame, the quantization step size for the encoding of changes relative to a prior frame is an inappropriate measure for determining the quantization step size for the encoding of an independent frame. As such, either the encoding process requires additional time to adjust the quantization step size to the appropriate level, commensurate with the quality of the prior frames, or an inappropriate step size is used, resulting in varying quality levels, particularly at each independent frame transmission.
  • Consistent image quality is provided by controlling the quantization process based on a set of quantizing parameters that include, for example, an initial value and bounds for a quantizing factor that is applied to each frame of the video data. Additionally, the quantizing parameters are modifiable by a user to achieve a user-determinable balance of performance objectives, based on the user's preference for image quality or image update rate. The user-determinable balance is achieved by a suitable modification of the video frame rate and the quantizing parameters commensurate with the selected frame rate.
  • the processing system in accordance with this invention allows for alternative encodings in dependence upon the desired performance objectives. If quality images are preferred, fewer, but more detailed, images are transmitted per second; if accurate motion depiction is preferred, more, but less detailed, images are transmitted. If neither image quality nor accurate motion have priority, images of moderate detail are transmitted at a moderate image update rate.
  • the sets of parameters for effecting the desired performance objective are predefined, and include, for example, an initial quantizing factor for encoding independent frames of images after the occurrence of a communications error.
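  • One way to picture the predefined parameter sets is as a small lookup table keyed by the user's preference. The sketch below is hypothetical: the preference names, the integer scale of the quantizing factors, and every numeric value are assumptions chosen only to respect the relationships stated in the text (a lower frame rate paired with lower quantizing factors, and bounds that enclose the initial values).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantizingParameters:
    frame_rate: float   # video frames per second
    q_i: int            # initial quantizing factor for independent frames
    q_p: int            # initial quantizing factor for predicted frames
    q_min: int          # lower bound on the quantizing factor
    q_max: int          # upper bound on the quantizing factor


# All values are illustrative assumptions, not taken from the source.
PRESETS = {
    "image_quality": QuantizingParameters(5.0, q_i=8, q_p=6, q_min=2, q_max=12),
    "balanced": QuantizingParameters(10.0, q_i=14, q_p=10, q_min=6, q_max=20),
    "motion_quality": QuantizingParameters(15.0, q_i=22, q_p=16, q_min=10, q_max=31),
}


def select_parameters(preference):
    """Return the predefined parameter set for a user preference."""
    return PRESETS[preference]


print(select_parameters("image_quality"))
```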
  • FIG. 1 illustrates an example block diagram of a video processing system in accordance with this invention.
  • FIG. 2 illustrates an example block diagram of a video encoding system in accordance with this invention.
  • FIG. 1 illustrates an example block diagram of a video processing system in accordance with this invention, as would be used, for example, for videoconferencing.
  • a camera 180 provides video input 101 corresponding to an image scene 181 to a video encoding system 100.
  • the encoding system 100 converts the video input 101 into encoded frames 131 suitable for communication to a receiver 200 via a communications channel 141.
  • the communications channel 141 is represented as a communications network, such as a telephone network, although it could also be a wireless connection, a point to point connection, or combinations of varied connections between the encoding system 100 and the receiver 200.
  • the source of the video input 101 may be prerecorded data, computer generated data, and the like.
  • the encoded frames 131 may contain less information than the available information at the video input 101.
  • the performance of the video processing system is based on the degree of correspondence between the encoded frames 131 and the available video input 101.
  • image quality is used herein to be a measure of the accurate reproduction of an image
  • motion quality is used herein to be a measure of the accurate depiction of motion in a sequence of images.
  • the system is configured to provide a proper balance between image quality and motion quality, the proper balance being defined by the designers of the system.
  • the proper balance is typically established by defining an acceptable video frame rate of the encoded frames 131 given the available bandwidth of the communications channel 141 and the available buffering at the receiver 200, and then providing as much image quality as possible at that chosen frame rate.
  • the desired performance of a video processing system is often dependent upon the context within which the video processing system is used. For example, when using a videophone to call home, it may be desirable to accurately convey facial detail, whereas when using the same videophone for a business meeting, it may be more important to provide a continual update of fast moving events. It may also be important to accurately convey facial expressions at other times during the business meeting, or to accurately convey motion during a call home. Also, providing as much image quality as possible for frames is not necessarily desirable, because it is often more visually disturbing to view frames of varying quality than to view frames of consistent quality, even if that consistent quality is less than a sporadically achievable higher quality.
  • the video encoding system 100 is configured to provide consistent image quality at a chosen frame rate; and, in accordance with another aspect of this invention, the video encoding system 100 is configured to allow a user of the video processing system to control the choice of the proper balance between image quality and motion quality, based on user preferences 205.
  • FIG. 2 illustrates an example block diagram of a video encoding system 100 in accordance with this invention that provides consistent image quality and allows for a tradeoff of image quality and motion quality based upon a user's preference.
  • the video encoding system 100 encodes the video input 101 for communication to the receiver 200, and includes a transform device 110, a quantizer 120, an encoder 130, and a buffer regulator 140, as would be similar to a conventional video encoding system.
  • the video encoding system 100 also provides a source 150 of quantizing parameters 151 that affect the operation of the quantizer 120.
  • Video input 101 is transformed by the transform device 110 to produce a set of coefficients 111 that describe the image content of each frame.
  • the transform device 110 employs a variety of techniques for efficiently coding each frame as a set of coefficients 111.
  • an initial frame of the sequence of images is transformed using a Discrete Cosine Transform (DCT) to provide a set of DCT coefficients that correspond to the image of the initial frame.
  • DCT: Discrete Cosine Transform
  • the transform device 110 compares the next frame of the sequence to the first frame, and transforms the differences between the frames as a set of movements of individual blocks in the first frame (motion vectors) 112, and a set of differences between the image details of the blocks in the first and next frames (error terms).
  • the transform device 110 then provides a set of DCT coefficients 111 corresponding to the error terms.
  • Subsequent frames of the sequence are similarly transformed to motion vectors 112 and error term DCT coefficients 111.
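  • As a rough illustration of the frequency transformation mentioned above, the following sketch computes an unnormalized one-dimensional DCT-II in plain Python. It is a teaching example only: the standards named earlier apply a two-dimensional DCT to 8x8 blocks, and the function name `dct_ii` and the sample values are assumptions.

```python
import math


def dct_ii(samples):
    """Unnormalized 1-D DCT-II of a sequence of samples."""
    n = len(samples)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(samples))
        for k in range(n)
    ]


# A flat signal has all of its energy in the first (zero-frequency) coefficient.
print([round(c, 6) for c in dct_ii([1.0, 1.0, 1.0, 1.0])])

# A smooth ramp concentrates most of its energy in the low-frequency terms.
print([round(c, 3) for c in dct_ii([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0])])
```

  • Spatially correlated image data tends to behave like the smooth examples here, which is why most transformed coefficients are small and quantize to zero.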
  • images are reconstructed as a sequence of modifications to the first frame, by applying the inverse of these functions via the decoder 220.
  • the transform device 110 transforms the next frame in a sequence as an independent frame whenever such an error is detected.
  • This independent frame is independent of prior frames and contains a set of coefficients 111 that correspond directly to the image of this frame, thereby forming a first frame to a new sequence. Because this frame is independent of all prior frames, it is independent of the effects of the prior communications error, as are all subsequent frames.
  • another independent first frame transformation is effected.
  • the term inter-frame, or predicted frame, is used to identify a frame encoding that is based on one or more prior frames
  • the term intra-frame, or independent frame, is used to identify a frame encoding that contains a complete encoding of the image content of the frame, independent of any other frame.
  • the transform device 110 may effect other transformations of the video input 101, in addition to or in lieu of the example transformation presented above, using conventional or novel transformation techniques.
  • copending application "Low Bit Encoding Scheme for Video
  • the coefficients 111 are quantized, or rounded, by the quantizer 120.
  • the coefficients 111 may be very precise real numbers that result from a mathematical transformation of the image data, such as the aforementioned coefficients of a frequency transformation. Communicating each of the bits of each of the very precise real numbers would provide for a very accurate reconstruction of the image at the receiver 200, but would also require a large number of transmitted bits via the channel 141.
  • the quantizer 120 converts the coefficients 111 into quantized coefficients 121 having fewer bits.
  • the range of the coefficients 111 may be divided into four quartiles, wherein the quantized coefficient 121 of each coefficient 111 is merely an identification of the quartile corresponding to the coefficient 111.
  • the quantized coefficient 121 merely requires two bits to identify the quartile, regardless of the number of bits in the coefficient 111.
  • the quantizing factor is a measure of the quantization step size and is inversely proportional to the number of divisions, or quantization regions, of the range of the input parameter being quantized. The quantizing factor determines the resultant size of each quantized coefficient.
  • a quantizing factor of 1/4 of the range of the input requires two bits to identify the quantized region associated with each coefficient 111; a quantizing factor of 1/8 of the range of the input requires three bits, and so on.
  • the range of the coefficients may be divided into uniform or non-uniform sized quantization regions, and the association between a coefficient 111 value and a quantized coefficient 121 value may be linear or non-linear.
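  • The relationship between the quantizing factor and the number of bits per quantized coefficient can be checked with a short sketch. It assumes uniform quantization regions and a quantizing factor expressed as a fraction of the input range, as in the quartile example; the function names are hypothetical.

```python
import math


def bits_for_quantizing_factor(factor):
    """Bits needed to identify one of the 1/factor quantization regions."""
    regions = round(1 / factor)
    return math.ceil(math.log2(regions))


def quantize_to_region(value, low, high, factor):
    """Index of the uniform region of [low, high] that contains `value`."""
    regions = round(1 / factor)
    width = (high - low) / regions
    index = int((value - low) / width)
    return min(max(index, 0), regions - 1)   # keep the top edge in range


print(bits_for_quantizing_factor(1 / 4))        # quartiles
print(bits_for_quantizing_factor(1 / 8))
print(quantize_to_region(0.6, 0.0, 1.0, 1 / 4))
```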
  • the encoder 130 encodes the quantized coefficients 121, using, in a preferred embodiment, an entropy encoding that produces different sized encodings based on the information content of the quantized coefficients 121. For example, run-length encoding techniques common in the art are employed to encode multiple sequential occurrences of the same value as the number of times that the value occurs. Because each frame of the video input 101 may contain different amounts of image information, the encoded frames 131 from the encoder 130 vary in size. The independent frames, for example, will generally produce large encoded frames 131, as compared to the inter-frames that are encoded as changes to prior frames.
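  • A minimal run-length encoder shows why long runs of identical quantized values shrink the encoding. This sketch covers only the run-length step, not the Huffman stage, and the (value, count) pair representation is an assumption made for illustration.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


quantized = [3, -1, 0, 0, 0, 0, 0, 0]   # illustrative quantized coefficients
encoded = run_length_encode(quantized)
print(encoded)
print(run_length_decode(encoded) == quantized)
```

  • Eight quantized values reduce to three pairs here; a frame with fewer zero runs, such as an independent frame at a small quantizing factor, would compress less.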
  • the encoded frames 131 are communicated to the channel 141 via the buffer regulator 140.
  • the channel 141 is a fixed bit rate system, and the buffer regulator 140 provides the variable length encoded frames 131 to the channel 141 at the fixed bit rate. Because the encoded frames 131 are of differing lengths, the frames are communicated via the channel 141 at a varying frame rate.
  • the receiver 200 includes a buffer 210 that stores the encoded frames 131 that are arriving at a varying frame rate and provides these frames for processing and subsequent display as video output 201 at the same fixed frame rate as the video input 101.
  • the buffer regulator 140 is provided a measure of the size of the receiver buffer 210 and controls the amount of data that is communicated to the receiver 200 so as not to overflow or underflow this buffer 210.
  • the buffer regulator 140 controls the amount of data that is communicated to the receiver 200 by controlling the amount of data that the quantizer 120 produces, via buffer control commands 142.
  • the buffer regulator 140 controls the amount of data that the quantizer 120 produces by providing a buffer control command 142 that effects a modification to the quantizing factor based on a level of fullness of the receiver buffer 210.
  • the buffer regulator is configured to allow the quantizing factor to be within an acceptable range of values.
  • the buffer regulator 140 may specify a minimum and maximum allocated size for subsequent blocks of the current frame, from which the quantizer 120 adjusts its quantizing factor only to the degree necessary to conform.
  • the buffer regulator 140 may merely provide an increment/decrement buffer control command 142 to the quantizer 120 as required.
  • Upon receipt of an increment/decrement control command 142, the quantizer 120 increments/decrements the quantizing factor, respectively; absent an increment/decrement command 142, the quantizer maintains the prior value of the quantizing factor.
  • Other techniques for modifying the quantizer factor in dependence upon a measure of the fullness of the receive buffer 210 would be evident to one of ordinary skill in the art.
  • the quantizing factor is also dependent upon the quantizing parameters 151 from the quantizing parameter source 150.
  • the quantizing parameters 151 include an initial quantizing factor Qi, that is used as an initial value for quantizing an independent frame, and minimum Qmin and maximum Qmax parameters that control the extent of the quantizing factors. Because of the inherent differences between an independent frame and an inter-frame, separate initial quantizing factors are provided for each of the frame types, as discussed below. For clarity, the initial quantizing factor for the inter-frames, or predicted frames, is termed Qp herein.
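  • The interplay between the buffer control commands and the quantizing parameters can be sketched as a small state holder. This is an illustrative reading of the text under stated assumptions: an integer quantizing factor, a step of 1 per command, and command strings "increment" and "decrement"; the class name and all values in the usage lines are hypothetical.

```python
class QuantizingFactor:
    """Tracks the quantizing factor used while encoding a frame."""

    def __init__(self, q_i, q_p, q_min, q_max):
        self.q_i = q_i        # initial factor for independent frames (Qi)
        self.q_p = q_p        # initial factor for predicted frames (Qp)
        self.q_min = q_min    # lower bound (Qmin)
        self.q_max = q_max    # upper bound (Qmax)
        self.value = self._clamp(q_i)

    def _clamp(self, q):
        return max(self.q_min, min(self.q_max, q))

    def start_frame(self, independent):
        """Reset to the initial factor for the type of the new frame."""
        self.value = self._clamp(self.q_i if independent else self.q_p)
        return self.value

    def apply_command(self, command):
        """Apply a buffer control command; None keeps the prior value."""
        if command == "increment":
            self.value = self._clamp(self.value + 1)
        elif command == "decrement":
            self.value = self._clamp(self.value - 1)
        return self.value


q = QuantizingFactor(q_i=10, q_p=6, q_min=2, q_max=12)
print(q.start_frame(independent=True))    # starts at Qi
for _ in range(3):
    q.apply_command("increment")          # bounded above by Qmax
print(q.value)
print(q.start_frame(independent=False))   # next frame starts at Qp
```

  • Resetting at each frame start, rather than inheriting the previous frame's final value, is what decouples an independent frame's quantizing factor from that of its inter-frame predecessor.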
  • the quantizing factor determines the level of precision of the quantized encoding
  • initializing each frame to a given value provides for a more consistent image quality at the receiver 200.
  • Because the initial quantizing factors are chosen to provide reasonably sized encodings at a given frame rate, as discussed below, fewer adjustments of the quantizing factor by the buffer regulator 140 will, in general, be required. This improved efficiency is particularly apparent in the severing of an independent frame's quantizing factor from its immediate predecessor, because the immediate predecessor is generally an inter-frame, having fundamentally different characteristics.
  • the initial quantizing factor Qi, Qp is determined heuristically, experimentally, or algorithmically, based upon the bandwidth of the channel 141, the frame size and frame rate of the video input 101, and the size of the receive buffer 210.
  • the frame size and the size of the receive buffer 210 are specified by an accepted communication standard, thereby allowing the video encoding system 100 of one vendor to communicate with a receiver 200 of another vendor without fear of an underflow or overflow of receive buffer 210.
  • the bandwidth of the channel 141 is determined by the provider of the channel 141, and typically depends upon the class of service. For example, an ISDN communications link will typically have a substantially higher bandwidth than a common telephone communications link.
  • the initial quantizing factor Qi, Qp is inversely proportional to both the size of the buffer 210 and the bandwidth of the channel 141.
  • a larger bandwidth of the channel 141 allows a highly detailed (i.e. low quantizing factor) encoding to be communicated to the receive buffer 210 in a shorter period of time.
  • the size of the buffer 210 is typically directly proportional to the bandwidth of the channel 141 and the frame size of the video input 101.
  • a large buffer 210 allows for a high degree of variability among the sizes of the encoded frames 131, and thus highly detailed encodings can be communicated more often than when the buffer 210 size requires that all frames be constrained to near a consistent nominal size.
  • the initial quantizing factor Qi, Qp is directly proportional to the frame rate of the video input 101. A higher frame rate requires more frames to be communicated per second; thus, given a fixed bandwidth of channel 141, a higher frame rate requires less detail (i.e. higher quantizing factor) in each of the coded frames 131.
  • a nominal size of the encoded frame 131 can be defined that is equal to the bandwidth of channel 141 divided by the frame rate.
  • the receive buffer 210 allows some encoded frames 131 to be larger than the nominal size, and some encoded frames 131 to be correspondingly smaller, so that the average encoded frame size for full bandwidth utilization substantially equals the nominal encoded frame size. Note that if an encoded frame is substantially larger than the nominal frame size, at least some of the subsequent frames must be smaller than the nominal frame size.
  • an encoded independent frame has more information than an encoded inter-frame, and thus should be allocated a frame size that is larger than the nominal frame size.
  • the allocated frame size should not be so large that the subsequent inter-frames are constrained so as to substantially reduce their image quality.
  • the initial quantizing factor Qi is the factor that, on average, produces an encoded frame 131 that is approximately equal to twice the nominal frame size, based on the bandwidth of the channel 141 and the frame rate of the video input 101.
  • the initial quantizing factor Qp for inter-frames in general, is lower than the determined initial quantizing factor Qi for independent frames.
  • a lower quantizing factor is selected because the inter-frames typically have less information to transfer than the independent frames, and thus can support the use of a lower quantizing factor while still providing smaller encoded frames 131.
  • the initial quantizing factor Qp is a factor that, on average, produces an encoded frame 131 that is somewhat less than the nominal frame size, based on the bandwidth of the channel 141 and the frame rate of the video input 101. Note that for a given bandwidth of the channel 141, size of the receiver buffer
  • the bandwidth of the channel 141 is typically fixed, as is the size of the receiver buffer 210.
  • the frame rate of the video input 101 is adjusted so as to allow for a preferred level of image quality. That is, if the image quality is unacceptable, the frame rate of the video input 101 is reduced, to allow for the communication of larger encoded frames 131.
  • the frame rate of the video input 101 causes unacceptable motion quality
  • the frame rate of the video input is increased, thereby requiring a reduction in image quality to allow for the communication of smaller encoded frames 131 at the higher frame rate.
  • the frame rate of the video input 101 can be adjusted via a variety of means common in the art.
  • the frame rate is adjusted by communicating an appropriate video control command 102 to the source of the video input 101, such as the video camera 180 of FIG. 1, if it has an adjustable frame rate.
  • the frame rate change is effected via the use of a rate buffer in the transform device 110.
  • the rate buffer common to one of ordinary skill in the art, receives frames of image data 101 at the highest rate the video source provides, and the processes within the transform device 110 sample the image data 101 from the rate buffer at the desired frame rate for communication to the receiver 200.
  • a conventional rate buffer system also includes filters that smooth the visual anomalies that may be caused by this sampling process.
  • a preferred embodiment of this invention allows for the above modifications of frame rate and quantizing factor based on a user preference 205.
  • the modification of frame rate can be effected by a continuous user adjustment via a user control 230 in the receiver 200.
  • a continuous control allows for a continuous adjustment of image quality.
  • the user is provided a limited set of options via the user control 230, for ease of operation, and for ease of design of the aforementioned rate buffer.
  • the user options in a preferred embodiment are: higher image quality, higher motion quality, or a best tradeoff.
  • the best tradeoff corresponds to the conventional choice of an acceptable frame rate that provides an acceptable image quality given the bandwidth of the channel 141.
  • the higher image quality is effected by reducing the best tradeoff frame rate by approximately twenty percent.
  • the higher motion quality is effected by increasing the best tradeoff frame rate by approximately twenty percent.
  • the initial quantizing factors Qi, Qp are dependent upon the frame rate. Therefore, in accordance with this invention, as the frame rate is modified based on the user preference, so also are the initial quantizing factors Qi, Qp.
  • the quantizing parameters 151 of FIG. 2 include the initial quantizing factors Qi, Qp, as well as a minimum Qmin and maximum Qmax set of parameters that bound the extent of the quantizing factor as it is adjusted by the buffer regulator 140.
  • the Qmin and Qmax parameters are also modified based on the user preference for higher image quality or higher motion quality. As discussed above, the minimum Qmin and maximum Qmax parameters are provided in order to provide a consistency in the image quality.
  • once the quantizer 120 reduces the quantizing factor to Qmin in response to the buffer control commands 142, it is not reduced further. If additional bits are required to be provided to the channel 141 to prevent a receiver buffer underflow, the buffer regulator 140 inserts null bits, rather than providing more details to the encoded frame 131. Correspondingly, once the quantizer 120 increases the quantizing factor to Qmax, it is not increased further. If fewer bits must be provided to the channel 141 to avoid a receiver buffer overflow, the buffer regulator 140 reduces the frame rate, for example by not transmitting the frame, causing a subsequent momentary freeze of the image at the receiver 200, rather than an introduction of a poor quality image. Alternatively, the buffer regulator 140 may effect an explicit frame rate reduction at the transform device 110, using the frame rate modification techniques presented above.
  • the Appendix to this specification provides tables of preferred values of frame rates and quantizing parameters (Qi, Qp, Qmin, Qmax) for common frame formats used for videoconferencing and commonly used channel bit rates.
  • Table 1 provides the frame rates and quantizing parameters for a QCIF format, which has a frame size of 176 by 144 pixels, using a 20 kbps bandwidth channel.
  • the frame rate is set to five frames per second, whereas if the user selects a higher motion quality, the frame rate is set to ten frames per second.
  • the quantizing parameters are set higher than those at the lower frame rates.
  • Tables 2, 3, and 4 provide the frame rates and quantizing parameters for a CIF format, which has a frame size of 352 by 288 pixels, using a 100 kbps, 200 kbps, and 300 kbps bandwidth channel, respectively.
  • although the quantizing parameters 151 are presented as being predefined in a quantizing parameter source 150, they could be dynamically computed, using for example a machine learning or expert system approach that determines appropriate quantizing parameters based on prior user preferences and feedback.
  • the system may also automatically generate the user preferences 205, based for example on experiential data, thereby anticipating the user's desires.
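The three user options listed above can be sketched as a small lookup, given here only as an illustration. The function names, the preference labels, and the 10 frames-per-second, 20 kbps starting point are assumptions chosen for the example; only the twenty percent adjustments and the "twice the nominal frame size" target for independent frames come from the text.

```python
# Hypothetical sketch: map a user preference to a frame rate and to the
# per-frame size targets that the initial quantizing factors aim for.
FRAME_RATE_SCALE = {
    "image_quality": 0.8,   # about twenty percent fewer frames per second
    "best_tradeoff": 1.0,
    "motion_quality": 1.2,  # about twenty percent more frames per second
}

def frame_rate_for_preference(best_tradeoff_fps: float, preference: str) -> float:
    """Scale the best-tradeoff frame rate according to the user preference."""
    return best_tradeoff_fps * FRAME_RATE_SCALE[preference]

def independent_frame_target_bits(bandwidth_bps: float, fps: float) -> float:
    """Target size for an independent frame: twice the nominal frame size."""
    nominal_bits = bandwidth_bps / fps
    return 2 * nominal_bits

for preference in FRAME_RATE_SCALE:
    fps = frame_rate_for_preference(10.0, preference)
    target = independent_frame_target_bits(20_000, fps)
    print(preference, fps, target)
```

Lowering the frame rate raises the bit budget of every frame, which is why the initial quantizing factors would be lowered along with it.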

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A method and apparatus for processing and encoding video data is presented that controls a quantization process based on a selection of differing performance objectives. When the communication channel that is used to communicate the encoded video data has a limited bandwidth, a balance must be struck between performance objectives to provide an acceptable image quality at an acceptable image update rate. Consistent image quality is provided by controlling the quantization process based on a set of quantizing parameters that include, for example, an initial value and bounds for the quantizing factor that is applied to each frame of the video data. Additionally, the quantizing parameters are modifiable by a user to achieve a user-determinable balance, based on the user's preference for image quality or image update rate. The user-determinable balance is achieved by a suitable modification of the video frame rate and the quantizing parameters commensurate with the selected frame rate.

Description

ADAPTIVE BUFFER AND QUANTIZER REGULATION SCHEME FOR BANDWIDTH SCALABILITY OF VIDEO DATA
This invention relates to the field of video image processing and data communications and in particular to the field of video image encoding.
Video image encoding techniques are well known in the art. Encoding standards such as CCITT H.261, CCITT H.263, and MPEG provide methods and techniques for efficiently encoding sequences of video images. These standards exploit the temporal correlation of frames in a video sequence by using a motion-compensated prediction, and exploit the spatial correlation of the frames by using a frequency transformation, such as a Discrete Cosine Transformation (DCT). When an image is transformed using a frequency transformation, the resultant frequency component coefficients, the measures of energy at each frequency, are typically non-uniformly distributed about the frequency spectrum. According to the existing standards, the non-uniformly distributed coefficients are quantized, typically producing some non-zero quantized coefficients among many zero valued quantized coefficients. The occurrences of many zero valued coefficients, and similarly valued non-zero quantized coefficients, allow for an efficient encoding, using an entropy based encoding, such as a Huffman/run-length encoding.
The aforementioned quantizing process introduces some loss of quality, or precision, in the encoding. Consider, for example, the transformation of a very minor image detail that results in a very small frequency component in the transformation of the image. If the magnitude of that frequency component, or coefficient, is below the size of the quantization step size, the quantized coefficient corresponding to that very small transformation coefficient will be zero. When the corresponding encoded image is subsequently decoded, it will not contain the original very minor image detail, because the frequency component corresponding to this detail has been eliminated by the quantization step. In like manner, each frequency coefficient is "rounded" to the value corresponding to the quantization step that includes the coefficient. As is evident to one of ordinary skill in the art, the quantization step size determines the degree of loss of quality in the encoding process. A small quantization step size introduces less round-off error, or loss of precision, than a large quantization step size.
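The rounding effect described above can be illustrated with a minimal sketch. It assumes a uniform quantizer that rounds to the nearest level, which is a simplification (practical coders often use dead zones and per-frequency weighting), and the coefficient values and function names are invented for the example.

```python
def quantize(coefficient: float, step: float) -> int:
    """Return the index of the quantization level nearest to the coefficient."""
    return round(coefficient / step)

def dequantize(index: int, step: float) -> float:
    """Reconstruct the coefficient value represented by a level index."""
    return index * step

def max_round_off(coefficients, step: float) -> float:
    """Largest reconstruction error over a set of coefficients."""
    return max(abs(c - dequantize(quantize(c, step), step)) for c in coefficients)

# Hypothetical transform coefficients; 0.3 stands for a very minor image detail.
coefficients = [0.3, -1.2, 7.9, 15.4]
for step in (0.5, 4.0):
    indices = [quantize(c, step) for c in coefficients]
    print(step, indices, max_round_off(coefficients, step))
```

With the large step the minor detail quantizes to zero and is lost on decoding, while the small step preserves it at the cost of more distinct non-zero values to encode.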
As is also evident to one of ordinary skill in the art, the quantization step size determines the resultant size of the entropy based encoding. A small quantization step size, for example, rounds fewer coefficients to a zero level than a large quantization step size, and therefore there will be fewer long runs of zero values that can be efficiently encoded.
A small quantization step size provides for a high quality reproduction of the original image, but at the cost of a larger sized encoding. A large quantization step size provides for a smaller sized encoding, but with a resultant loss of quality in the reproduction of the original image.
The variable sized encodings of an image are often communicated over a fixed bandwidth communications channel, such as, for example, a telephone line used for video teleconferencing, or a link to a web site containing video information. In such systems, the variable length encoded images are communicated to a buffer at the receiving site, decoded, and presented to the receiving display at a fixed image frame rate. That is, for example, in a video teleconferencing call, the sequence of images may be encoded at a rate of ten video frames per second. Because the encodings of each frame are of variable length, some frames may have an encoded length that requires more than a tenth of a second to be communicated over the fixed bandwidth communications channel, while others require less than a tenth of a second. For optimal bandwidth utilization, the aggregate encoded frame transmission rate should equal the video frame rate. The receiving buffer size determines the degree of variability about this aggregate rate that can be tolerated without underflowing or overflowing the buffer. That is, if the receiving buffer underflows, a frame will not be available for display when the next period of the video frame rate occurs; if the receiving buffer overflows, the received encoding is lost, and the frame will not be displayable when the next period of the video frame rate occurs. In both cases, a staggering of the frame display occurs and produces a visually disturbing artifact. Techniques are common in the art for controlling the sizes of the variable sized encodings so that the receiving buffer does not overflow or underflow. Copending application, "Method for Seamless Splicing in a Video Encoder", USPTO file number 08/829,124, filed 3/28/97, provides an example of such buffer regulation and is incorporated by reference herein.
Additionally, U.S. patent 5,038,209, "Adaptive Buffer/Quantizer Control for Transform Video Coders", issued August 6, 1991, incorporated by reference herein, discloses a method for determining a quantization step size based on the fullness of the receiving buffer, so that the quantization step size provides an encoding that attempts to keep the receiving buffer filled to a predetermined preferred value. The quantization step size is continually adjusted to assure that neither an overflow nor an underflow of the receiving buffer occurs. Other methods of adjusting the quantization step size to regulate the encoded output to prevent buffer underflow and overflow are common in the art.
A problem with the known methods and techniques for buffer regulation via quantization is that the adjustment of the quantization step size introduces quality changes, as discussed above. In conventional encoding devices, the quantization step size is selected to provide a preferred level of buffer fullness to support a given video frame rate without overflowing or underflowing the receive buffer. Because the receive buffer is of limited size, the quality of the encoding can become unacceptably poor, particularly when communicating via a low bandwidth communications path.
Another problem with the known methods and techniques is the determination of the initial quantization step size for each frame. Conventional techniques use the last determined quantization step size from the prior frame as the initial quantization step size for the subsequent frame to provide, somewhat, for a consistent level of buffer fullness. Because the quantization step size is determined based upon a measure of buffer fullness, the quantization step size of a prior frame is generally a poor estimator of the appropriate quantization step size for a subsequent frame to provide consistent quality. Additionally, the video encodings exploit the temporal correlation of frames in a video sequence by using a motion-compensated prediction, wherein each frame is encoded as changes from the prior frame. The first frame of a sequence must be encoded as an independent frame, as well as frames that are encoded to recommence a sequence after a transmission error. Because an independent encoding of a frame typically has significantly more bits than an encoding of changes relative to a prior frame, the quantization step size for the encoding of changes relative to a prior frame is an inappropriate measure for determining the quantization step size for the encoding of an independent frame. As such, either the encoding process requires additional time to adjust the quantization step size to the appropriate level, commensurate with the quality of the prior frames, or an inappropriate step size is used, resulting in varying quality levels, particularly at each independent frame transmission. It is an object of this invention to provide a method and apparatus for data flow control that is based on a quality measure, as well as the conventional buffer-fullness measure. It is a further object of this invention to provide a method and apparatus for data flow control that provides efficient and effective error recovery. 
It is a further object of this invention to provide a method and apparatus for data flow control that requires minimal computational complexity. It is a further object of this invention to provide a method and apparatus for data flow control that is responsive to user preferences.
These objects and others are achieved by providing sets of parameters to facilitate control of the quantization task to achieve particular performance objectives, including maintaining a relatively constant image quality. Consistent image quality is provided by controlling the quantization process based on a set of quantizing parameters that include, for example, an initial value and bounds for a quantizing factor that is applied to each frame of the video data. Additionally, the quantizing parameters are modifiable by a user to achieve a user-determinable balance of performance objectives, based on the user's preference for image quality or image update rate. The user-determinable balance is achieved by a suitable modification of the video frame rate and the quantizing parameters commensurate with the selected frame rate. In certain contexts, for example, it may be more important that the individual images are accurately reproduced, while in other contexts, accurate motion depiction may be more important. The processing system in accordance with this invention allows for alternative encodings in dependence upon the desired performance objectives. If quality images are preferred, fewer, but more detailed, images are transmitted per second; if accurate motion depiction is preferred, more, but less detailed, images are transmitted. If neither image quality nor accurate motion have priority, images of moderate detail are transmitted at a moderate image update rate.
To reduce computational complexity in a preferred embodiment, the sets of parameters for effecting the desired performance objective are predefined, and include, for example, an initial quantizing factor for encoding independent frames of images after the occurrence of a communications error.
The invention is explained in further detail, and by way of example, with reference to the accompanying drawings wherein: FIG. 1 illustrates an example block diagram of a video processing system in accordance with this invention.
FIG. 2 illustrates an example block diagram of a video encoding system in accordance with this invention.
FIG. 1 illustrates an example block diagram of a video processing system in accordance with this invention, as would be used, for example, for videoconferencing. In the example of FIG. 1, a camera 180 provides video input 101 corresponding to an image scene 181 to a video encoding system 100. The encoding system 100 converts the video input 101 into encoded frames 131 suitable for communication to a receiver 200 via a communications channel 141. The communications channel 141 is represented as a communications network, such as a telephone network, although it could also be a wireless connection, a point to point connection, or combinations of varied connections between the encoding system 100 and the receiver 200. Similarly, the source of the video input 101 may be prerecorded data, computer generated data, and the like. For communications efficiency, the encoded frames 131 may contain less information than the available information at the video input 101. The performance of the video processing system is based on the degree of correspondence between the encoded frames 131 and the available video input 101. For ease of reference the term image quality is used herein to be a measure of the accurate reproduction of an image, and the term motion quality is used herein to be a measure of the accurate depiction of motion in a sequence of images. In a conventional video processing system, the system is configured to provide a proper balance between image quality and motion quality, the proper balance being defined by the designers of the system. The proper balance is typically established by defining an acceptable video frame rate of the encoded frames 131 given the available bandwidth of the communications channel 141 and the available buffering at the receiver 200, and then providing as much image quality as possible at that chosen frame rate.
As noted above, the desired performance of a video processing system is often dependent upon the context within which the video processing system is used. For example, when using a videophone to call home, it may be desirable to accurately convey facial detail, whereas when using the same videophone for a business meeting, it may be more important to provide a continual update of fast moving events. It may also be important to accurately convey facial expressions at other times during the business meeting, or to accurately convey motion during a call home. Also, providing as much image quality as possible for frames is not necessarily desirable, because it is often more visually disturbing to view frames of varying quality than to view frames of consistent quality, even if that consistent quality is less than a sporadically achievable higher quality. In accordance with one aspect of this invention, the video encoding system 100 is configured to provide consistent image quality at a chosen frame rate; and, in accordance with another aspect of this invention, the video encoding system 100 is configured to allow a user of the video processing system to control the choice of the proper balance between image quality and motion quality, based on user preferences 205. FIG. 2 illustrates an example block diagram of a video encoding system 100 in accordance with this invention that provides consistent image quality and allows for a tradeoff of image quality and motion quality based upon a user's preference. The video encoding system 100 encodes the video input 101 for communication to the receiver 200, and includes a transform device 110, a quantizer 120, an encoder 130, and a buffer regulator 140, as would be similar to a conventional video encoding system. In accordance with this invention, the video encoding system 100 also provides a source 150 of quantizing parameters 151 that affect the operation of the quantizer 120.
Video input 101, typically in the form of a sequence of image frames, is transformed by the transform device 110 to produce a set of coefficients 111 that describe the image content of each frame. As is common in the art, the transform device 110 employs a variety of techniques for efficiently coding each frame as a set of coefficients 111. In a conventional CCITT H.261, CCITT H.263, or MPEG transform device 110, an initial frame of the sequence of images is transformed using a Discrete Cosine Transform (DCT) to provide a set of DCT coefficients that correspond to the image of the initial frame. The transform device 110 compares the next frame of the sequence to the first frame, and transforms the differences between the frames as a set of movements of individual blocks in the first frame (motion vectors) 112, and a set of differences between the image details of the blocks in the first and next frames (error terms). The transform device 110 then provides a set of DCT coefficients 111 corresponding to the error terms. Subsequent frames of the sequence are similarly transformed to motion vectors 112 and error term DCT coefficients 111. At the receiver 200, images are reconstructed as a sequence of modifications to the first frame, by applying the inverse of these functions via the decoder 220. If a communications error occurs between the video encoding system 100 and the receiver 200, all subsequent frames are affected, because each frame's reconstruction is based on its predecessor(s). To minimize the effects of a communications error, the transform device 110 transforms the next frame in a sequence as an independent frame whenever such an error is detected. This independent frame is independent of prior frames and contains a set of coefficients 111 that correspond directly to the image of this frame, thereby forming a first frame to a new sequence.
Because this frame is independent of all prior frames, it is independent of the effects of the prior communications error, as are all subsequent frames. When and if another communications error is detected, another independent first frame transformation is effected. In accordance with terminology common to the field, the term inter-frame or predicted frame is used to identify a frame encoding that is based on one or more prior frames, and the term intra-frame or independent frame is used to identify a frame encoding that contains a complete encoding of the image content of the frame, independent of any other frame.
As would be evident to one of ordinary skill in the art, the transform device 110 may effect other transformations of the video input 101, in addition to or in lieu of the example transformation presented above, using conventional or novel transformation techniques. For example, copending application "Low Bit Encoding Scheme for Video Transmission", file number , filed , discloses the transformation of video images into a set of textured objects comprising the image, and is incorporated herein by reference. The transform device 110 using the techniques disclosed in this copending application could transform, for example, the input 101 to a set of coefficients that describe each textured object directly, without the use of a frequency domain transformation such as the DCT.
To optimize the transmission of the coefficients 111 corresponding to the video input 101, the coefficients 111 are quantized, or rounded, by the quantizer 120. For example, the coefficients 111 may be very precise real numbers that result from a mathematical transformation of the image data, such as the aforementioned coefficients of a frequency transformation. Communicating each of the bits of each of the very precise real numbers would provide for a very accurate reconstruction of the image at the receiver 200, but would also require a large number of transmitted bits via the channel 141. The quantizer 120 converts the coefficients 111 into quantized coefficients 121 having fewer bits. For example, the range of the coefficients 111 may be divided into four quartiles, wherein the quantized coefficient 121 of each coefficient 111 is merely an identification of the quartile corresponding to the coefficient 111. In such an embodiment, the quantized coefficient 121 merely requires two bits to identify the quartile, regardless of the number of bits in the coefficient 111. Consistent with commonly used terminology, the quantizing factor is a measure of the quantization step size and is inversely proportional to the number of divisions, or quantization regions, of the range of the input parameter being quantized. The quantizing factor determines the resultant size of each quantized coefficient. In the prior example, assuming a uniform quantization step size, the quantizing factor of 1/4 of the range of the input requires two bits to identify the quantized region associated with each coefficient 111; a quantizing factor of 1/8 the range of the input requires three bits, and so on. As is evident to one of ordinary skill in the art, the range of the coefficients may be divided into uniform or non-uniform sized quantization regions, and the association between a coefficient 111 value and a quantized coefficient 121 value may be linear or non-linear.
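The quartile example above can be worked through in a short sketch, assuming uniform quantization regions over a known input range. The function names and the sample range of 0.0 to 1.0 are illustrative assumptions, not part of the described system.

```python
def bits_per_coefficient(num_regions: int) -> int:
    """Bits needed to identify one of num_regions quantization regions."""
    return (num_regions - 1).bit_length()

def region_index(value: float, low: float, high: float, num_regions: int) -> int:
    """Index of the uniform region of [low, high] that contains value."""
    width = (high - low) / num_regions
    # Clamp so that value == high falls in the last region.
    return min(int((value - low) / width), num_regions - 1)

# A quantizing factor of 1/4 of the range gives four regions, 1/8 gives eight.
for regions in (4, 8):
    print(regions, bits_per_coefficient(regions), region_index(0.6, 0.0, 1.0, regions))
```

Halving the quantizing factor doubles the number of regions and adds one bit to every quantized coefficient, whatever the precision of the original coefficient.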
The encoder 130 encodes the quantized coefficients 121, using, in a preferred embodiment, an entropy encoding that produces different sized encodings based on the information content of the quantized coefficients 121. For example, run-length encoding techniques common in the art are employed to encode multiple sequential occurrences of the same value as the number of times that the value occurs. Because each frame of the video input 101 may contain different amounts of image information, the encoded frames 131 from the encoder 130 vary in size. The independent frames, for example, will generally produce large encoded frames 131, as compared to the inter-frames that are encoded as changes to prior frames.
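As a rough illustration of why the encoded size depends on the quantizing factor, the following sketch run-length encodes the same hypothetical coefficients after fine and coarse quantization. It shows plain run-length coding only; the Huffman stage of a practical entropy coder is omitted, and all names and values are assumptions made for the example.

```python
from itertools import groupby

def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

def quantize_all(coefficients, step):
    """Uniform nearest-level quantization of a list of coefficients."""
    return [round(c / step) for c in coefficients]

# Hypothetical coefficients, ordered from low to high frequency.
coefficients = [53.0, 9.2, 3.1, 2.4, 1.2, 0.9, 0.6, 0.4]

fine = run_length_encode(quantize_all(coefficients, 1.0))
coarse = run_length_encode(quantize_all(coefficients, 8.0))
print(len(fine), fine)
print(len(coarse), coarse)
```

The coarse quantization produces a long run of zeros that collapses into a single pair, so the same frame encodes into fewer symbols.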
The encoded frames 131 are communicated to the channel 141 via the buffer regulator 140. Typically, the channel 141 is a fixed bit rate system, and the buffer regulator 140 provides the variable length encoded frames 131 to the channel 141 at the fixed bit rate. Because the encoded frames 131 are of differing lengths, the frames are communicated via the channel 141 at a varying frame rate. The receiver 200 includes a buffer 210 that stores the encoded frames 131 that are arriving at a varying frame rate and provides these frames for processing and subsequent display as video output 201 at the same fixed frame rate as the video input 101.
The buffer regulator 140 is provided a measure of the size of the receiver buffer 210 and controls the amount of data that is communicated to the receiver 200 so as not to overflow or underflow this buffer 210. The buffer regulator 140 controls the amount of data that is communicated to the receiver 200 by controlling the amount of data that the quantizer 120 produces, via buffer control commands 142.
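The overflow and underflow conditions that the buffer regulator guards against can be sketched with a simplified model of the receive buffer. The model is an assumption made for illustration: the channel delivers a fixed number of bits per frame period, the sender always has data to send, the decoder removes one frame per period after a start-up delay, and a late frame is still counted as consumed. The function name and all sizes are hypothetical.

```python
def simulate_receive_buffer(frame_sizes, bits_per_period, capacity, startup_periods=1):
    """Return (frame_index, event) pairs for each overflow or underflow."""
    events = []
    total_bits = sum(frame_sizes)
    consumed = 0  # bits already removed from the buffer by the decoder
    for index, size in enumerate(frame_sizes):
        period = index + startup_periods
        arrived = min(period * bits_per_period, total_bits)
        if arrived - consumed > capacity:
            events.append((index, "overflow"))   # more buffered than fits
        if arrived < consumed + size:
            events.append((index, "underflow"))  # frame not fully received
        consumed += size
    return events

print(simulate_receive_buffer([2000] * 5, 2000, 4000))
print(simulate_receive_buffer([6000, 1000, 1000], 2000, 4000))
print(simulate_receive_buffer([500, 500, 500, 500, 6000], 2000, 4000))
```

Frames at the per-period size cause no events, an oversized first frame arrives late, and a run of undersized frames lets the channel run ahead of the decoder until the buffer is exceeded.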
The buffer regulator 140 controls the amount of data that the quantizer 120 produces by providing a buffer control command 142 that effects a modification to the quantizing factor based on a level of fullness of the receiver buffer 210. To avoid unnecessary variations of the quantizing factor, the buffer regulator is configured to allow the quantizing factor to be within an acceptable range of values. For example, the buffer regulator 140 may specify a minimum and maximum allocated size for subsequent blocks of the current frame, from which the quantizer 120 adjusts its quantizing factor only to the degree necessary to conform. Alternatively, because the quantizer 120 can only approximate the effect that a particular quantizing factor will have on the size of the encoded frame 131 from the encoder 130, the buffer regulator 140 may merely provide an increment/decrement buffer control command 142 to the quantizer 120 as required. Upon receipt of an increment/decrement control command 142, the quantizer 120 increments/decrements the quantizing factor, respectively; absent an increment/decrement command 142, the quantizer maintains the prior value of the quantizing factor. Other techniques for modifying the quantizing factor in dependence upon a measure of the fullness of the receive buffer 210 would be evident to one of ordinary skill in the art.
In accordance with this invention, the quantizing factor is also dependent upon the quantizing parameters 151 from the quantizing parameter source 150. In a preferred embodiment, the quantizing parameters 151 include an initial quantizing factor Qi, that is used as an initial value for quantizing an independent frame, and minimum Qmin and maximum Qmax parameters that control the extent of the quantizing factors. Because of the inherent differences between an independent frame and an inter-frame, separate initial quantizing factors are provided for each of the frame types, as discussed below. For clarity, the initial quantizing factor for the inter-frames, or predicted frames, is termed Qp herein.
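One possible reading of this interaction between the quantizing parameters and the increment/decrement commands is sketched below. The class names, the unit step per command, and the parameter values are hypothetical and are not taken from the Appendix tables; the sketch only shows a per-frame reset to Qi or Qp followed by bounded adjustment.

```python
from dataclasses import dataclass

@dataclass
class QuantizingParameters:
    q_i: int    # initial factor for independent frames
    q_p: int    # initial factor for inter-frames (predicted frames)
    q_min: int  # lower bound on the quantizing factor
    q_max: int  # upper bound on the quantizing factor

class Quantizer:
    def __init__(self, params: QuantizingParameters):
        self.params = params
        self.factor = params.q_i

    def start_frame(self, independent: bool) -> int:
        """Reset the factor at each frame, ignoring the previous frame's value."""
        self.factor = self.params.q_i if independent else self.params.q_p
        return self.factor

    def apply_command(self, command: int) -> int:
        """Apply +1 (increment), -1 (decrement) or 0 (hold), within bounds."""
        adjusted = self.factor + command
        self.factor = max(self.params.q_min, min(self.params.q_max, adjusted))
        return self.factor

# Illustrative values only.
quantizer = Quantizer(QuantizingParameters(q_i=12, q_p=8, q_min=6, q_max=14))
print(quantizer.start_frame(independent=True))
print([quantizer.apply_command(+1) for _ in range(3)])
print(quantizer.start_frame(independent=False))
print([quantizer.apply_command(-1) for _ in range(3)])
```

Because each frame starts from Qi or Qp, the adjustments made while encoding one frame do not carry over into the next.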
By providing the initial quantizing factors Qi, Qp, the dependence of the quantizing factor of a frame on the quantizing factor of its immediate predecessor is severed. In this manner, because the quantizing factor determines the level of precision of the quantized encoding, initializing each frame to a given value provides for a more consistent image quality at the receiver 200. And, because the initial quantizing factors are chosen to provide reasonably sized encodings at a given frame rate, as discussed below, fewer adjustments of the quantizing factor by the buffer regulator 140 will, in general, be required. This improved efficiency is particularly apparent in the severing of an independent frame's quantizing factor from its immediate predecessor, because the immediate predecessor is generally an inter-frame, having fundamentally different characteristics. Correspondingly, because an independent frame is transmitted upon detection of a transmission error, this improved efficiency for independent frame quantization substantially enhances the ability of the video encoding system to rapidly recover from transmission errors, and substantially enhances the consistency of the image quality.
The initial quantizing factors Qi, Qp are determined heuristically, experimentally, or algorithmically, based upon the bandwidth of the channel 141, the frame size and frame rate of the video input 101, and the size of the receive buffer 210. Generally, the frame size and the size of the receive buffer 210 are specified by an accepted communication standard, thereby allowing the video encoding system 100 of one vendor to communicate with a receiver 200 of another vendor without fear of an underflow or overflow of the receive buffer 210. The bandwidth of the channel 141 is determined by the provider of the channel 141, and typically depends upon the class of service. For example, an ISDN communications link will typically have a substantially higher bandwidth than a common telephone communications link.
In general, the initial quantizing factors Qi, Qp are inversely proportional to both the size of the buffer 210 and the bandwidth of the channel 141. A larger bandwidth of the channel 141 allows a highly detailed (i.e. low quantizing factor) encoding to be communicated to the receive buffer 210 in a shorter period of time. The size of the buffer 210 is typically directly proportional to the bandwidth of the channel 141 and the frame size of the video input 101. At a given channel data rate, a large buffer 210 allows for a high degree of variability among the sizes of the encoded frames 131, and thus highly detailed encodings can be communicated more often than when the buffer 210 size requires that all frames be constrained to near a consistent nominal size. Conversely, the initial quantizing factors Qi, Qp are directly proportional to the frame rate of the video input 101. A higher frame rate requires more frames to be communicated per second; thus, given a fixed bandwidth of channel 141, a higher frame rate requires less detail (i.e. higher quantizing factor) in each of the coded frames 131.
Given a particular bandwidth of channel 141, a nominal size of the encoded frame 131 can be defined that is equal to the bandwidth of channel 141 divided by the frame rate. As noted above, the receive buffer 210 allows some encoded frames 131 to be larger than the nominal size, and some encoded frames 131 to be correspondingly smaller, so that the average encoded frame size for full bandwidth utilization substantially equals the nominal encoded frame size. Note that if an encoded frame is substantially larger than the nominal frame size, at least some of the subsequent frames must be smaller than the nominal frame size. As discussed above, an encoded independent frame has more information than an encoded inter-frame, and thus should be allocated a frame size that is larger than the nominal frame size. The allocated frame size, however, should not be so large that the subsequent inter-frames are constrained so as to substantially reduce their image quality. Based on experimental data, and commonly accepted sizes of the receive buffer 210, it has been found that allocating twice the nominal frame size to an independent frame is effective, in that it allows for the additional information transfer associated with the independent frame, yet also allows the subsequent inter-frames to contain an acceptable amount of image detail. Thus, in accordance with one aspect of this invention, the initial quantizing factor Qi is the factor that, on average, produces an encoded frame 131 that is approximately equal to twice the nominal frame size, based on the bandwidth of the channel 141 and the frame rate of the video input 101.
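The frame-size arithmetic above can be worked through in a short sketch. The nominal size (bandwidth divided by frame rate) and the factor of two for independent frames come from the text; the function names and the `frames_per_independent` parameter (a hypothetical count of frames over which one independent frame is amortized) are assumptions made only to show how much smaller than nominal the inter-frames must then be on average.

```python
def nominal_frame_bits(bandwidth_bps: float, frame_rate_fps: float) -> float:
    """Nominal encoded frame size: channel bandwidth divided by frame rate."""
    return bandwidth_bps / frame_rate_fps


def frame_budgets(bandwidth_bps: float, frame_rate_fps: float,
                  frames_per_independent: int,
                  independent_scale: float = 2.0) -> tuple[float, float]:
    """Return (independent_frame_bits, mean_inter_frame_bits).

    frames_per_independent is a hypothetical window of frames containing
    exactly one independent frame; the text does not fix such a window.
    """
    if frames_per_independent < 2:
        raise ValueError("window must contain at least one inter-frame")
    nominal = nominal_frame_bits(bandwidth_bps, frame_rate_fps)
    independent = independent_scale * nominal
    # The whole window must still average out to the nominal size.
    total = frames_per_independent * nominal
    mean_inter = (total - independent) / (frames_per_independent - 1)
    return independent, mean_inter
```

With a 20 kbps channel at five frames per second, the nominal frame is 4000 bits and an independent frame is allocated 8000 bits; over an assumed window of ten frames, the nine inter-frames must then average about 3556 bits each, somewhat less than nominal.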
In accordance with a further aspect of this invention, the initial quantizing factor Qp for inter-frames, in general, is lower than the determined initial quantizing factor Qi for independent frames. A lower quantizing factor is selected because the inter-frames typically have less information to transfer than the independent frames, and thus can support the use of a lower quantizing factor while still providing smaller encoded frames 131. Because the encoded inter-frames 131 must compensate for the larger-than-nominal encoded independent frames, the initial quantizing factor Qp is a factor that, on average, produces an encoded frame 131 that is somewhat less than the nominal frame size, based on the bandwidth of the channel 141 and the frame rate of the video input 101.
Note that for a given bandwidth of the channel 141, size of the receiver buffer 210, and frame rate of the video input 101, the adjusted quantizing factor that is produced as the buffer regulator 140 provides buffer control commands 142 to prevent buffer overflow or underflow may be insufficient to provide an acceptable degree of image quality at the receiver 200. As noted above, the bandwidth of the channel 141 is typically fixed, as is the size of the receiver buffer 210. In accordance with a further aspect of this invention, the frame rate of the video input 101 is adjusted so as to allow for a preferred level of image quality. That is, if the image quality is unacceptable, the frame rate of the video input 101 is reduced, to allow for the communication of larger encoded frames 131. Conversely, also in accordance with this invention, if the frame rate of the video input 101 causes unacceptable motion quality, the frame rate of the video input is increased, thereby requiring a reduction in image quality to allow for the communication of smaller encoded frames 131 at the higher frame rate.
The frame rate of the video input 101 can be adjusted via a variety of means common in the art. In a straightforward embodiment, the frame rate is adjusted by communicating an appropriate video control command 102 to the source of the video input 101, such as the video camera 180 of FIG. 1, if it has an adjustable frame rate. If the video source cannot be controlled directly, the frame rate change is effected via the use of a rate buffer in the transform device 110. The rate buffer, common to one of ordinary skill in the art, receives frames of image data 101 at the highest rate the video source provides, and the processes within the transform device 110 sample the image data 101 from the rate buffer at the desired frame rate for communication to the receiver 200. A conventional rate buffer system also includes filters that smooth the visual anomalies that may be caused by this sampling process.
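The sampling role of the rate buffer can be illustrated with a small sketch that picks which source frames to forward at a lower target rate. This is an assumption-laden simplification: the function name `sample_frame_indices` is hypothetical, frames are selected by nearest-earlier index, and the smoothing filters mentioned above are omitted.

```python
def sample_frame_indices(num_source_frames: int, source_fps: float,
                         target_fps: float) -> list[int]:
    """Return indices of the source frames forwarded at target_fps.

    Frames arrive at source_fps (the highest rate the source provides);
    output frame k is taken from source index floor(k * source_fps / target_fps).
    """
    if target_fps <= 0 or target_fps > source_fps:
        raise ValueError("target_fps must be in (0, source_fps]")
    indices = []
    k = 0
    while True:
        index = int(k * source_fps / target_fps)
        if index >= num_source_frames:
            break
        indices.append(index)
        k += 1
    return indices
```

For a 30 frames-per-second source sampled at 10 frames per second, every third frame is forwarded; when the target rate equals the source rate, every frame passes through.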
Because image quality and motion quality are typically subjective measures, a preferred embodiment of this invention allows for the above modifications of frame rate and quantizing factor based on a user preference 205. The modification of frame rate can be effected by a continuous user adjustment via a user control 230 in the receiver 200. A continuous control allows for a continuous adjustment of image quality. In a preferred embodiment, however, the user is provided a limited set of options via the user control 230, for ease of operation, and for ease of design of the aforementioned rate buffer. The user options in a preferred embodiment are: higher image quality, higher motion quality, or a best tradeoff. The best tradeoff corresponds to the conventional choice of an acceptable frame rate that provides an acceptable image quality given the bandwidth of the channel 141. The higher image quality is effected by reducing the best tradeoff frame rate by approximately twenty percent. The higher motion quality is effected by increasing the best tradeoff frame rate by approximately twenty percent.
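The three user options can be mapped to a target frame rate as in the sketch below. The twenty percent adjustment is taken from the text, where it is stated only as approximate; the `Preference` enumeration and `target_frame_rate` function are hypothetical names introduced for illustration.

```python
from enum import Enum


class Preference(Enum):
    """Hypothetical representation of the user preference 205."""
    IMAGE_QUALITY = "higher image quality"
    MOTION_QUALITY = "higher motion quality"
    BEST_TRADEOFF = "best tradeoff"


def target_frame_rate(best_tradeoff_fps: float, preference: Preference,
                      adjustment: float = 0.20) -> float:
    """Scale the best-tradeoff frame rate according to the user preference."""
    if preference is Preference.IMAGE_QUALITY:
        # Fewer frames per second leave more bits for each encoded frame.
        return best_tradeoff_fps * (1.0 - adjustment)
    if preference is Preference.MOTION_QUALITY:
        # More frames per second, each encoded with less detail.
        return best_tradeoff_fps * (1.0 + adjustment)
    return best_tradeoff_fps
```

Because the initial quantizing factors depend on the frame rate, a caller would be expected to reselect the quantizing parameters whenever this function changes the rate.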
As discussed above, the initial quantizing factors Qi, Qp are dependent upon the frame rate. Therefore, in accordance with this invention, as the frame rate is modified based on the user preference, so also are the initial quantizing factors Qi, Qp. As noted above, the quantizing parameters 151 of FIG. 2 include the initial quantizing factors Qi, Qp, as well as a minimum Qmin and maximum Qmax set of parameters that bound the extent of the quantizing factor as it is adjusted by the buffer regulator 140. In accordance with this invention, the Qmin and Qmax parameters are also modified based on the user preference for higher image quality or higher motion quality. As discussed above, the minimum Qmin and maximum Qmax parameters are provided in order to provide a consistency in the image quality. Changes in image quality are often more visually disturbing than an overall lack of image quality. Once the quantizer 120 reduces the quantizing factor to Qmin in response to the buffer control commands 142, it is not reduced further. If additional bits are required to be provided to the channel 141 to prevent a receiver buffer underflow, the buffer regulator 140 inserts null bits, rather than providing more details to the encoded frame 131. Correspondingly, once the quantizer 120 increases the quantizing factor to Qmax, it is not increased further. If fewer bits must be provided to the channel 141 to avoid a receiver buffer overflow, the buffer regulator 140 reduces the frame rate, for example by not transmitting the frame, causing a subsequent momentary freeze of the image at the receiver 200, rather than an introduction of a poor quality image. Alternatively, the buffer regulator 140 may effect an explicit frame rate reduction at the transform 110, using the frame rate modification techniques presented above.
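The bounding behaviour at Qmin and Qmax can be sketched as follows. The fallbacks (null-bit insertion at Qmin, frame skipping at Qmax) follow the description above; the `QuantizingParameters` class, the `bounded_adjust` function, and the string labels for the fallbacks are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QuantizingParameters:
    """Hypothetical container for the Qmin and Qmax bounds."""
    q_min: int
    q_max: int


def bounded_adjust(q: int, delta: int,
                   params: QuantizingParameters) -> tuple[int, Optional[str]]:
    """Apply a requested change to the quantizing factor within [q_min, q_max].

    Returns (new_q, fallback). fallback is None when the request fits,
    "stuff_null_bits" when more bits are needed but q is already at q_min,
    and "skip_frame" when fewer bits are needed but q is already at q_max.
    """
    requested = q + delta
    if requested < params.q_min:
        # Cannot add detail: pad the channel instead to avoid underflow.
        return params.q_min, "stuff_null_bits"
    if requested > params.q_max:
        # Cannot coarsen further: drop the frame instead to avoid overflow.
        return params.q_max, "skip_frame"
    return requested, None
```

Holding the quantizing factor inside the bounds and handling the excess by other means is what keeps image quality consistent from frame to frame.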
For completeness, the Appendix to this specification provides tables of preferred values of frame rates and quantizing parameters (Qi, Qp, Qmin, Qmax) for common frame formats used for videoconferencing and commonly used channel bit rates. Table 1 provides the frame rates and quantizing parameters for a QCIF format, which has a frame size of 176 by 144 pixels, using a 20 kbps bandwidth channel. As can be seen, if the user prefers higher image quality, the frame rate is set to five frames per second, whereas if the user selects a higher motion quality, the frame rate is set to ten frames per second. Correspondingly, at the higher frame rate, the quantizing parameters are set higher than those at the lower frame rates. Tables 2, 3, and 4 provide the frame rates and quantizing parameters for a CIF format, which has a frame size of 352 by 288 pixels, using a 100 kbps, 200 kbps, and 300 kbps bandwidth channel, respectively.
The foregoing merely illustrates the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are thus within its spirit and scope. For example, although the information is presented herein in the context of a video processing system, it would be evident to one of ordinary skill in the art that the principles of this invention are applicable to the processing of other data forms that employ a quantization scheme to encode the data. In like manner, although the quantizing parameters 151 are presented as being predefined in a quantizing parameter source 150, they could be dynamically computed, using for example a machine learning or expert system approach that determines appropriate quantizing parameters based on prior user preferences and feedback. The system may also automatically generate the user preferences 205, based for example on experiential data, thereby anticipating the user's desires. These and other optimization techniques would be evident to one of ordinary skill in the art, in view of the principles and techniques presented in this disclosure. As would also be evident to one of ordinary skill in the art, the invention disclosed herein may be embodied in hardware, software, or a combination of both.
APPENDIX
Table 1. QCIF (176 x 144) at 20k bps
[Table 1 appears as image imgf000016_0001 in the original publication.]
Table 2. CIF (352 x 288) at 100k bps
[Table 2 appears as image imgf000016_0002 in the original publication.]
Table 3. CIF (352 x 288) at 200k bps
[Table 3 appears as image imgf000017_0001 in the original publication.]
Table 4. CIF (352 x 288) at 300k bps or higher
[Table 4 appears as image imgf000017_0002 in the original publication.]

Claims

1. A processing system comprising: an encoding system (100) that is configured to produce an encoding of data (131) for communication to a receiving buffer (210) via a communications channel (141), the encoding system (100) comprising a quantizer (120) that is configured to quantize data (101) to produce the encoding of data (131) comprising quantized values of a set of quantization levels, the quantization levels being determined by a quantizing factor that is based on a fullness measure of the receiving buffer (210), wherein: the quantizing factor is also dependent upon quantizing parameters (151) that are selected based on a user's preference (205).
2. The processing system of claim 1, wherein the user's preference (205) is based on at least one of: data quality and data rate.
3. The processing system of claim 2, wherein the data (101) includes frames of video data, and the data rate is a video frame rate.
4. The processing system of claim 1, wherein the encoding of data (131) includes an intra-frame encoding and an inter-frame encoding, and the quantizing parameters (151) include an initial value upon which the intra-frame encoding is based.
5. The processing system of claim 4, wherein the encoding system (100) produces the intra-frame encoding when an error occurs on the communications channel (141).
6. The processing system of claim 1, wherein the communications channel (141) includes at least one of: an Internet connection, a cable connection, and a wireless connection.
7. The processing system of claim 1, further including a video camera (180) configured to provide the data (101) to the encoding system (100) to facilitate the use of the processing system for videoconferencing.
8. The processing system of claim 7, wherein the encoding system (100) is configured to provide the encoding of data (131) in conformance with at least one of: a CCITT H.261 standard, a CCITT H.263 standard, and an MPEG standard.
9. A video encoding system (100) comprising: a transform device (110) that is configured to transform a sequence of video frames (101) into a sequence of frames of data coefficients (111), each frame of data coefficients (111) of the sequence of frames of data coefficients (111) corresponding to each video frame of the sequence of video frames (101), a quantizer (120), operably coupled to the transform device (110), that is configured to quantize the frames of data coefficients (111) into corresponding frames of quantized coefficients (121), each quantized coefficient of the frames of quantized coefficients (121) having a quantized value that is based on a quantizing factor, a variable length encoder (130), operably coupled to the quantizer (120), that encodes the frames of quantized coefficients (121) into a sequence of encoded frames (131) for communication to a receiving buffer (210) via a communications channel (141), and a buffer regulator (140), operably coupled to the variable length encoder (130) and the quantizer (120), that provides buffer control commands (142) to the quantizer (120) to effect a modification of the quantizing factor based on a fullness measure of the receiving buffer (210), wherein the quantizer (120) is further configured to control the modification of the quantizing factor based on a user preference (205).
10. The video encoding system (100) of claim 9, wherein the sequence of frames of data coefficients (111) have an associated frame rate, and the transform device (110) is further configured to modify the frame rate in dependence upon the user preference (205).
11. The video encoding system (100) of claim 10, wherein the user preference (205) is based on at least one of an image quality and a motion quality.
12. The video encoding system (100) of claim 9, wherein the sequence of encoded frames (131) conform to at least one of: a CCITT H.261 standard, a CCITT H.263 standard, and an MPEG standard.
13. The video encoding system (100) of claim 10, wherein the user preference (205) includes at least one of: an initial quantizing value, a minimum quantizing value, a maximum quantizing value, and a target frame rate.
14. A video encoding system (100) comprising: a transform device (110) that is configured to transform a sequence of video frames (101) into a sequence of frames of data coefficients (111), each frame of data coefficients (111) corresponding to each video frame of the sequence of video frames (101), each frame of data coefficients (111) being one of a first frame type and a second frame type, a quantizer (120), operably coupled to the transform device (110), that is configured to quantize the frames of data coefficients (111) into corresponding frames of quantized coefficients (121), each quantized coefficient of the frames of quantized coefficients (121) having a quantized value that is based on a quantizing factor, a variable length encoder (130), operably coupled to the quantizer (120), that encodes the frames of quantized coefficients (121) into a sequence of encoded frames (131) for communication to a receiving buffer (210) via a communications channel (141), and a buffer regulator (140), operably coupled to the variable length encoder (130) and the quantizer (120), that provides buffer control commands (142) to the quantizer (120) to effect a modification of the quantizing factor based on a fullness measure of the receiving buffer (210), wherein the quantizer (120) is further configured to modify the quantizing factor to a first predetermined value at a start of each frame of data coefficients (111) of the first frame type.
15. The video encoding system (100) of claim 14, wherein the quantizer (120) is further configured to modify the quantizing factor to a second predetermined value at a start of each frame of data coefficients (111) of the second frame type.
16. The video encoding system (100) of claim 14, wherein the quantizer (120) is further configured to limit the modification of the quantizing factor based upon one or more quantizing parameters (151).
17. The video encoding system (100) of claim 16, wherein at least one of the first predetermined value and the one or more quantizing parameters (151) are based on a user's preference (205).
18. A video receiver (200) comprising: a decoder (220) that is configured to transform a sequence of encoded frames (131) from an encoding system (100) into a sequence of video frames (201), said encoded frames (131) comprising an encoding of quantized coefficients (121) that is based upon a quantizing factor, and a user control (230) that is configured to communicate a user preference (205) to the encoding system (100) to effect a modification to the quantizing factor.
19. The video receiver (200) of claim 18, further comprising: a buffer (210) that is configured to receive the sequence of encoded frames (131), wherein the buffer (210) has a buffer size, and the quantizing factor is further dependent upon the buffer size.
20. A method for controlling an encoding of video frames of data (101), comprising the steps of: transforming the video frames of data (101) into frames of data coefficients (111), the frames of data coefficients (111) having an associated frame rate, quantizing the frames of data coefficients (111) into frames of quantized coefficients (121) based on a quantizing factor, encoding the frames of quantized coefficients (121) into frames of variable length encodings (131), and modifying the quantizing factor based on a size parameter associated with each of the frames of variable length encodings (131), and also based on a set of quantizing parameters (151).
21. The method of claim 20, further including the step of: identifying each of the frames of data coefficients (111) as being one of: a first frame type and a second frame type, and initializing each of the frames of the first frame type to a first predetermined initial quantizing value that is included in the set of quantizing parameters (151).
22. The method of claim 20, further including the step of: modifying the frame rate to facilitate modifying the quantizing factor based on the set of quantizing parameters (151).
23. The method of claim 20, further including the steps of: receiving a user preference (205), and selecting the set of quantizing parameters (151) based on the user preference (205).
24. A method for enabling control of an encoding of video frames by an encoding system (100), said encoding comprising quantized coefficients (121) that are based on a quantizing factor, said method comprising the steps of: enabling a decoding of the encoding to produce corresponding decoded video frames, and enabling a communication of a user preference (205) to the encoding system (100) to effect a modification of the quantizing factor.
25. The method of claim 24, further comprising the step of enabling a display of the decoded frames upon which display the user preference (205) is based.
26. The method of claim 24, wherein the encoding is further based on a frame rate, and the method further comprises the step of enabling a modification of the frame rate based on the user preference (205).
PCT/EP1999/010223 1998-12-23 1999-12-15 Adaptive buffer and quantizer regulation scheme for bandwidth scalability of video data WO2000040032A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP99967977A EP1057344A1 (en) 1998-12-23 1999-12-15 Adaptive buffer and quantizer regulation scheme for bandwidth scalability of video data
JP2000591812A JP2002534864A (en) 1998-12-23 1999-12-15 Adaptive buffer and quantization adjustment scheme for bandwidth scalability of video data

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US21983298A 1998-12-23 1998-12-23
US09/219,832 1998-12-23

Publications (1)

Publication Number Publication Date
WO2000040032A1 (en) 2000-07-06

Family

ID=22820965

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1999/010223 WO2000040032A1 (en) 1998-12-23 1999-12-15 Adaptive buffer and quantizer regulation scheme for bandwidth scalability of video data

Country Status (3)

Country Link
EP (1) EP1057344A1 (en)
JP (1) JP2002534864A (en)
WO (1) WO2000040032A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021182512A1 (en) * 2020-03-11 2021-09-16 日本電気株式会社 Communication control system and communication control method

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0316489A (en) * 1989-06-14 1991-01-24 Hitachi Ltd Picture coding system
EP0424060A2 (en) * 1989-10-14 1991-04-24 Sony Corporation Method of coding video signals and transmission system thereof
US5038209A (en) * 1990-09-27 1991-08-06 At&T Bell Laboratories Adaptive buffer/quantizer control for transform video coders
EP0514663A2 (en) * 1991-05-24 1992-11-25 International Business Machines Corporation An apparatus and method for motion video encoding employing an adaptive quantizer
EP0541302A2 (en) * 1991-11-08 1993-05-12 AT&T Corp. Improved video signal quantization for an MPEG like coding environment
EP0540961A2 (en) * 1991-11-08 1993-05-12 International Business Machines Corporation A motion video compression system with adaptive bit allocation and quantization
EP0620686A2 (en) * 1993-04-15 1994-10-19 Samsung Electronics Co., Ltd. Fuzzy-controlled coding method and apparatus therefor
US5684714A (en) * 1995-05-08 1997-11-04 Kabushiki Kaisha Toshiba Method and system for a user to manually alter the quality of a previously encoded video sequence
EP0828393A1 (en) * 1996-09-06 1998-03-11 THOMSON multimedia Quantization process and device for video encoding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PATENT ABSTRACTS OF JAPAN vol. 015, no. 135 (E - 1052) 4 April 1991 (1991-04-04) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8335164B2 (en) * 2005-11-02 2012-12-18 Thomson Licensing Method for determining a route in a wireless mesh network using a metric based on radio and traffic load
US8537714B2 (en) 2005-11-02 2013-09-17 Thomson Licensing Method for determining a route in a wireless mesh network using a metric based on radio and traffic load
EP2285110A1 (en) 2009-07-24 2011-02-16 Alcatel Lucent Joint encoder and buffer regulation for statistical multiplexing of multimedia contents
CN111988556A (en) * 2020-08-28 2020-11-24 深圳市融讯视通科技有限公司 Dynamic audio and video coding transmission method, system, device and storage medium
CN111988556B (en) * 2020-08-28 2022-04-26 深圳市融讯视通科技有限公司 Dynamic audio and video coding transmission method, system, device and storage medium

Also Published As

Publication number Publication date
JP2002534864A (en) 2002-10-15
EP1057344A1 (en) 2000-12-06

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

WWE Wipo information: entry into national phase

Ref document number: 1999967977

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWP Wipo information: published in national office

Ref document number: 1999967977

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1999967977

Country of ref document: EP