WO2000040031A1 - Method and device for encoding a video signal - Google Patents

Method and device for encoding a video signal

Info

Publication number
WO2000040031A1
Authority
WO
WIPO (PCT)
Prior art keywords
picture
pictures
bits
coding
sub
Prior art date
Application number
PCT/EP1999/010199
Other languages
French (fr)
Inventor
Françoise Groliere
Eric Barrau
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V.
Priority to EP99966980A (EP1057343A1)
Priority to JP2000591811A (JP2002534863A)
Priority to KR1020007009588A (KR20010041441A)
Publication of WO2000040031A1

Classifications

    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/46Conversion to or from run-length codes, i.e. by representing the number of consecutive digits, or groups of digits, of the same kind by a code word and a digit indicative of that kind
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/105Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/115Selection of the code volume for a coding unit prior to coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • H04N19/137Motion inside a coding unit, e.g. average field, frame or block difference
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/149Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/152Data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/174Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • the present invention relates to a method of encoding the successive pictures of a video signal, comprising the steps of subdividing each successive picture into a plurality of sub-pictures, transforming each sub-picture into coefficients, quantizing said coefficients with an applied step size, coding said quantized coefficients, and controlling the step size in conformity with a target value for the number of bits for encoding each successive picture.
  • the invention also relates to a corresponding device. This invention is particularly adapted to real-time communication at low bit rate according to the so-called H.263 recommendation.
  • a video encoder according to said standard is for instance described in the book "Motion estimation algorithms for video compression", B. Furht et al., Kluwer Academic Publishers, 1997, chapter 2, pp. 30-35.
  • An H.263 video encoder such as illustrated in Fig.1 is based on a motion-compensated prediction from a previous image to the current one, followed by an orthogonal transformation (such as DCT), which reduces the spatial redundancy of the pictures by decorrelating the picture elements and concentrating the energy into a few low-order coefficients, a quantization, and an encoding operation for encoding the prediction error thus transformed and quantized.
  • At least the first picture is a reference one, encoded without temporal prediction (i.e. according to an "intra mode", or I mode), and from time to time one further picture in every n pictures may be also coded according to said intra mode.
  • the other type of picture, the P picture, is temporally predicted from earlier pictures.
  • a coding branch including a discrete cosine transform circuit 12, a quantization circuit 13 and a variable-length encoder 14 (DCT, Q, VLC respectively) processes these pictures (the circuit 12 receives in fact the difference between the input pictures and predicted ones available at the output of a subtracter 25) and sends the obtained, coded variable bit rate bitstream to a buffer 15, the output of which is the output constant bit rate bitstream of the H.263 video encoder. Said output bitstream is also sent to a bitrate control circuit 30 for buffer regulation.
  • a prediction branch is provided and comprises in series an inverse quantization circuit 21 (Q^-1), an inverse DCT circuit 22 (DCT^-1), an adder 23 (delivering the reconstructed previous picture RPP), a temporal prediction circuit 24 (delivering a predicted picture PP, sent to the subtracter 25, and motion vectors MV, sent to the variable-length encoder 14, the prediction being based on a block-matching search carried out between the current picture CP, available at the output of a picture skipping circuit 26, and the reconstructed one RPP, available at the output of the adder 23), and the subtracter 25.
  • the output of the temporal prediction circuit 24 is also sent back towards the other input of the adder 23 in view of the reconstruction of the previous picture RPP used for the temporal prediction.
  • the H.263 standard defines a hierarchical bitstream syntax with four layers in said hierarchy: picture level, group of blocks level (GOB), macroblock level (MB), and block level (8 x 8 picture elements, or pixels), the block being the elementary unit over which the DCT operates.
  • a macroblock includes four luminance blocks (covering a 16 x 16 area in a picture) and two chrominance blocks.
  • the motion estimation and compensation implemented on the reconstructed previous picture RPP in the circuit 24 of the prediction branch operate on macroblocks.
  • a feedback connection 31 between the buffer 15 and the quantization circuit 13 makes it possible to obtain a finer or a coarser quantization.
  • the coarseness of the quantization is defined by a quantization parameter for the first three layers (picture, GOB, macroblock) and a fixed quantization matrix which sets the relative coarseness of quantization for each DCT coefficient.
  • the picture skipping circuit 26 provided at the input of the encoder may also be used as a possible way to reduce the bit rate (while keeping an acceptable picture quality).
  • the number of skipped pictures is variable and depends on the output buffer fullness, and the feedback connection 31 provided for buffer regulation is therefore related not only to quantization step size variations but also to picture skipping (and also to an intra/inter selection, which is controlled by a circuit 41 actuating or not a first switch 42 and a second switch 43).
  • This feedback connection makes it possible to give guidelines to the encoder, for which the problem of bit rate control can be formulated as follows: given a predetermined bit rate and an input picture, which encoder setting should be chosen? One possibility is a constant picture rate approach, in which pictures are grabbed periodically and the quality is adapted to the complexity of each successive picture (in order to maintain the targeted bit rate) and is therefore highly variable from one picture to the next. The other is a constant quality approach, in which pictures are processed with a fixed quantization step, but only when the encoder has finished processing the previous picture, i.e. at a highly variable picture rate adapted to the complexity of the pictures.
  • the invention relates to a method such as described in the preamble of the description and in which said controlling step comprises a pre-analysis sub-step, based on an estimation of the number of bits respectively used for coding motion information between previous and current pictures and for coding said coefficients, and a decision sub-step, provided for adjusting the quantizing step size and the rate of the successive pictures.
  • Fig.1 shows a basic video compression scheme according to the H.263 standard
  • Fig.2 shows the general structure of an encoder according to the invention
  • Fig.3 shows how the scheme of Fig.2 works.
  • the starting point of the feedback control carried out thanks to the output buffer 15 is to fit the number of bits which will be used to code each successive current picture to the number of bits available on the transmission channel.
  • bits are used for coding information at the picture level, GOB level, macroblock level and block level, but the most expensive parts in terms of the number of bits generated are the macroblock and block levels, which concern motion estimation and DCT coefficient coding and depend entirely on the complexity of the current picture.
  • the quality of the communication depends on a variable, quantizer-dependent part of the generated bits.
  • a pre-analysis of the current image is provided to roughly predict how many bits will be generated for a given encoder setting.
  • Said pre-analysis, described hereunder in a more detailed manner, makes it possible to predict the number of bits generated for each possible quantization step of the encoder (pre-analysis step). It is followed by a decision step in which, after said number of bits has been compared to the desired one, a setting for the encoder is found. If the corresponding quantization step is in accordance with a previously set quality range, the picture is coded with it (that is, transmission is not authorized when the quality is too poor, i.e. when the quantizer step size is too large).
  • the worst authorized quantization step is first chosen and a decreased picture rate is computed (and chosen thanks to the picture skipping circuit 26), in order to meet the bandwidth requirements. Then the computed setting is used for the encoding process of the picture.
  • the bit rate control allowed by the feedback connection for buffer regulation checks if a discrepancy between predicted and desired numbers of bits has appeared during said encoding operation, and, if necessary, the setting of the encoder is modified by modifying the quantization step (the only authorized changes are plus or minus one) between two consecutive GOBs.
  • the generated bits can be split into two parts, a first one corresponding to the headers and a second one corresponding to the real content of the current picture.
  • the computation of the first part is easy, but that of the second one is more complex.
  • two kinds of information are indeed needed : (a) the information of motion between the current picture and the previous one, and (b) the chrominance and luminance variations between a current macroblock and the corresponding one in the previous picture.
  • a first pre-analysis sub-step, related to the motion information, is based on an approximate prediction of the number of bits needed to code the motion vectors. More precisely, it has been found that an empirical law linking the mean motion of the complete picture and the number of bits needed to code all the motion data could be established. This law can be expressed in the form of the following equation (1):
  • N = number of macroblocks having a non-null motion vector
  • the motion estimation of the whole picture is done.
  • the SAD of each macroblock being known, the predicted number of bits is then computed for each quantization step, in order to determine which quantization step gives the prediction closest to the targeted number of bits. If the computed quantization step is too high in terms of minimal quality, the quantization step is set to the maximum allowed.
  • coder 203 designates the association of all the elements of Fig.1, except the buffer 15 and the bitrate control circuit 30.
  • Fig.2 illustrates at what level the above-described pre-analysis acts in the coding chain and Fig.3 shows how the scheme of Fig.2 works.
  • the first step 31 is provided for computing the authorized target number of bits T for the input picture, taking into account the bandwidth BW, the fullness of the output buffer FOB and the target frame rate TFR according to a relation (3) of the following type:
  • the second step comprises a computing operation 321, provided for computing the predicted number of bits P_n needed for coding the current picture with the smallest quantizer (or quantization) step which appears to be compatible with the bandwidth, the quality and the frame rate.
  • These numbers T_b and P_n are then compared (test operation 322): if P_n is greater than T_b (output Y), the coding step of Fig.2 will be done with said quantization step, while, if P_n is smaller than T_b (output N), 1 is added to the quantization step, the operation 321 is repeated, and the test operation 322 is repeated with the modified value of P_n.
  • a quality test 33 is carried out : if the quantizer is under a predetermined quality threshold (Q n - ⁇ ⁇ Q max ), the frame rate FR is equal to the target frame rate (connection 331), while, if it is not the case, a new smaller frame rate is computed (connection 332) according to the predicted number of bits for the minimal authorized quality. In both situations, a coding step 341 is then carried out for coding each group of blocks (GOB).

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Algebra (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

In video communication, the quality and delay of the transmission depend on the bit rate control strategy, which fits the number of bits generated for a given setting of the coder to the available bandwidth. In order to give appropriate guidelines to the coder, a pre-analysis step, based on approximate predictions (by means of an empirical law for the motion information and the use of statistics for the content of the picture), is carried out to predict this generated number of bits, and is followed by a decision step provided for adjusting the quantization step size and the rate of the successive pictures.

Description

Method and device for encoding a video signal.
The present invention relates to a method of encoding the successive pictures of a video signal, comprising the steps of subdividing each successive picture into a plurality of sub-pictures, transforming each sub-picture into coefficients, quantizing said coefficients with an applied step size, coding said quantized coefficients, and controlling the step size in conformity with a target value for the number of bits for encoding each successive picture. The invention also relates to a corresponding device. This invention is particularly adapted to real-time communication at low bit rate according to the so-called H.263 recommendation.
In low bit rate applications (videophony, videoconferencing), a system for image compression such as proposed in the H.263 standard is recommended. A video encoder according to said standard is for instance described in the book "Motion estimation algorithms for video compression", B. Furht et al., Kluwer Academic Publishers, 1997, chapter 2, pp. 30-35. An H.263 video encoder such as illustrated in Fig.1 is based on a motion-compensated prediction from a previous image to the current one, followed by an orthogonal transformation (such as DCT), which reduces the spatial redundancy of the pictures by decorrelating the picture elements and concentrating the energy into a few low-order coefficients, a quantization, and an encoding operation for encoding the prediction error thus transformed and quantized. At least the first picture is a reference one, encoded without temporal prediction (i.e. according to an "intra mode", or I mode), and from time to time one further picture in every n pictures may also be coded according to said intra mode. The other type of picture, the P picture, is temporally predicted from earlier pictures.
As shown in Fig.1, the successive pictures P regularly arrive at the input of the encoding device (input video). A coding branch including a discrete cosine transform circuit 12, a quantization circuit 13 and a variable-length encoder 14 (DCT, Q, VLC respectively) processes these pictures (the circuit 12 in fact receives the difference between the input pictures and the predicted ones, available at the output of a subtracter 25) and sends the resulting coded variable bit rate bitstream to a buffer 15, the output of which is the constant bit rate output bitstream of the H.263 video encoder. Said output bitstream is also sent to a bitrate control circuit 30 for buffer regulation. A prediction branch is provided and comprises in series an inverse quantization circuit 21 (Q^-1), an inverse DCT circuit 22 (DCT^-1), an adder 23 (delivering the reconstructed previous picture RPP), a temporal prediction circuit 24 (delivering a predicted picture PP, sent to the subtracter 25, and motion vectors MV, sent to the variable-length encoder 14, the prediction being based on a block-matching search carried out between the current picture CP, available at the output of a picture skipping circuit 26, and the reconstructed one RPP, available at the output of the adder 23), and the subtracter 25. The output of the temporal prediction circuit 24 is also sent back towards the other input of the adder 23 in view of the reconstruction of the previous picture RPP used for the temporal prediction.
The H.263 standard defines a hierarchical bitstream syntax with four layers in said hierarchy: picture level, group of blocks level (GOB), macroblock level (MB), and block level (8 x 8 picture elements, or pixels), the block being the elementary unit over which the DCT operates. A macroblock includes four luminance blocks (covering a 16 x 16 area in a picture) and two chrominance blocks.
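The layer sizes described above can be made concrete with a short sketch. It is illustrative only: the QCIF picture size (176 x 144) used in the example and the function names are assumptions introduced here, not values taken from the patent text.

```python
# Illustrative sketch of the macroblock/block structure described above.
MB_SIZE = 16     # a macroblock covers a 16 x 16 luminance area
BLOCK_SIZE = 8   # a block is 8 x 8 pixels, the unit over which the DCT operates


def macroblock_grid(width, height):
    """Return (columns, rows) of macroblocks for a picture whose
    dimensions are multiples of 16 (hypothetical helper)."""
    if width % MB_SIZE or height % MB_SIZE:
        raise ValueError("picture dimensions must be multiples of 16")
    return width // MB_SIZE, height // MB_SIZE


def blocks_per_macroblock():
    """Four luminance blocks plus two chrominance blocks."""
    luminance = (MB_SIZE // BLOCK_SIZE) ** 2  # 2 x 2 = 4 luminance blocks
    chrominance = 2                           # two chrominance blocks
    return luminance + chrominance


if __name__ == "__main__":
    cols, rows = macroblock_grid(176, 144)    # QCIF, used only as an example
    print(cols * rows, blocks_per_macroblock())
```

For the example picture size this gives an 11 x 9 grid of macroblocks, each carrying six 8 x 8 blocks for the DCT.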
The motion estimation and compensation implemented on the reconstructed previous picture RPP in the circuit 24 of the prediction branch operate on macroblocks. A feedback connection 31 between the buffer 15 and the quantization circuit 13 makes it possible to obtain a finer or a coarser quantization. The coarseness of the quantization is defined by a quantization parameter for the first three layers (picture, GOB, macroblock) and a fixed quantization matrix which sets the relative coarseness of quantization for each DCT coefficient. The picture skipping circuit 26 provided at the input of the encoder may also be used as a possible way to reduce the bit rate (while keeping an acceptable picture quality). The number of skipped pictures is variable and depends on the output buffer fullness, and the feedback connection 31 provided for buffer regulation is therefore related not only to quantization step size variations but also to picture skipping (and also to an intra/inter selection, which is controlled by a circuit 41 actuating or not a first switch 42 and a second switch 43).
This feedback connection makes it possible to give guidelines to the encoder, for which the problem of bit rate control can be formulated as follows: given a predetermined bit rate and an input picture, which encoder setting should be chosen? One possibility is a constant picture rate approach, in which pictures are grabbed periodically and the quality is adapted to the complexity of each successive picture (in order to maintain the targeted bit rate) and is therefore highly variable from one picture to the next. The other is a constant quality approach, in which pictures are processed with a fixed quantization step, but only when the encoder has finished processing the previous picture, i.e. at a highly variable picture rate adapted to the complexity of the pictures.
It is therefore an object of the invention to propose an improved encoding method in which a trade-off between picture rate and quality is sought, while also taking into account the fact that the delay between the input pictures and the displayed ones has to be well controlled (as constant as possible) in order to render any scene as regularly as possible.
To this end the invention relates to a method such as described in the preamble of the description and in which said controlling step comprises a pre-analysis sub-step, based on an estimation of the number of bits respectively used for coding motion information between previous and current pictures and for coding said coefficients, and a decision sub-step, provided for adjusting the quantizing step size and the rate of the successive pictures.
The particular aspects of the invention will now be explained with reference to the embodiment described hereinafter and considered in connection with the accompanying drawings, in which:
Fig.1 shows a basic video compression scheme according to the H.263 standard;
Fig.2 shows the general structure of an encoder according to the invention;
Fig.3 shows how the scheme of Fig.2 works.
In a coding chain such as shown in Fig.1, the starting point of the feedback control carried out by means of the output buffer 15 is to fit the number of bits which will be used to code each successive current picture to the number of bits available on the transmission channel. When studying the bit generation during the coding operation of each picture, it appears that the largest share of the bits to be transmitted is related to the content of the picture. In fact, bits are used for coding information at the picture level, GOB level, macroblock level and block level, but the most expensive parts in terms of the number of bits generated are the macroblock and block levels, which concern motion estimation and DCT coefficient coding and depend entirely on the complexity of the current picture. In such a coding chain, the quality of the communication depends on a variable, quantizer-dependent part of the generated bits. According to the invention, a pre-analysis of the current image is provided to roughly predict how many bits will be generated for a given encoder setting. Said pre-analysis, described hereunder in a more detailed manner, makes it possible to predict the number of bits generated for each possible quantization step of the encoder (pre-analysis step). It is followed by a decision step in which, after said number of bits has been compared to the desired one, a setting for the encoder is found. If the corresponding quantization step is in accordance with a previously set quality range, the picture is coded with it (that is, transmission is not authorized when the quality is too poor, i.e. when the quantizer step size is too large). Otherwise the worst authorized quantization step is first chosen and a decreased picture rate is computed (and applied by means of the picture skipping circuit 26), in order to meet the bandwidth requirements. Then the computed setting is used for the encoding process of the picture.
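The decision step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name, the dictionary of per-step predictions, and in particular the proportional rule used to lower the picture rate are hypothetical, since the text only says that a decreased picture rate is computed.

```python
def decide_setting(predicted_bits, target_bits, q_max_allowed, target_frame_rate):
    """Pick a quantization step and a picture rate (hypothetical helper).

    predicted_bits    -- dict: quantization step -> predicted bits for the picture
    target_bits       -- number of bits desired for this picture
    q_max_allowed     -- coarsest step still inside the quality range
                         (must be a key of predicted_bits)
    target_frame_rate -- picture rate used when quality is acceptable
    """
    # Step whose predicted bit count is closest to the desired one.
    q = min(predicted_bits, key=lambda step: abs(predicted_bits[step] - target_bits))

    if q <= q_max_allowed:
        # Quality is within the authorized range: keep the target picture rate.
        return q, target_frame_rate

    # Quality would be too poor: use the worst authorized step and
    # lower the picture rate instead.
    q = q_max_allowed
    # ASSUMPTION: scale the rate so that bits per second stay constant.
    frame_rate = target_frame_rate * target_bits / predicted_bits[q]
    return q, min(frame_rate, target_frame_rate)


if __name__ == "__main__":
    # Toy prediction table: fewer bits for coarser steps (made-up numbers).
    table = {q: 40000 // q for q in range(1, 32)}
    print(decide_setting(table, 4000, 8, 10.0))
    print(decide_setting(table, 4000, 12, 10.0))
```

Keeping the two outcomes in one return value reflects the trade-off in the text: the quantization step is adjusted first, and picture skipping is used only when the quality floor would otherwise be violated.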
After the encoding operation of each GOB, the bit rate control allowed by the feedback connection for buffer regulation checks if a discrepancy between predicted and desired numbers of bits has appeared during said encoding operation, and, if necessary, the setting of the encoder is modified by modifying the quantization step (the only authorized changes are plus or minus one) between two consecutive GOBs.
The generated bits can be split into two parts, a first one corresponding to the headers and a second one corresponding to the real content of the current picture. The computation of the first part is easy, but that of the second one is more complex.
In order to transmit the content of a picture, two kinds of information are indeed needed : (a) the information of motion between the current picture and the previous one, and (b) the chrominance and luminance variations between a current macroblock and the corresponding one in the previous picture.
A first pre-analysis sub-step, related to the motion information, is based on an approximate prediction of the number of bits needed to code the motion vectors. More precisely, it has been found that an empirical law linking the mean motion of the complete picture and the number of bits needed to code all the motion data could be established. This law can be expressed in the form of the following equation (1):
PNB = 4 . N . log(1000 . Mean_mv / N) (1)
where:
N = number of macroblocks having a non-null motion vector;
Mean_mv = (sum of motion vectors) / (number of macroblocks per picture);
PNB = estimated number of bits used to code the motion information of a picture (related to the mean motion of the whole picture).
This law has the advantage of being simple and fast.
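Equation (1) can be illustrated with a minimal Python sketch. Several points are assumptions made only for illustration, since the text does not settle them: the logarithm is taken in base 10, the "sum of motion vectors" is read as the sum of |dx| + |dy| over all macroblocks, and the result is clamped at zero. The function name `predict_motion_bits` is hypothetical.

```python
import math


def predict_motion_bits(motion_vectors, log=math.log10):
    """Rough estimate of the bits needed to code a picture's motion data,
    following equation (1).

    motion_vectors: one (dx, dy) pair per macroblock of the picture.
    log: logarithm function; base 10 is an assumption, the text only says "log".
    """
    total_mbs = len(motion_vectors)
    # N: number of macroblocks having a non-null motion vector.
    n = sum(1 for dx, dy in motion_vectors if dx != 0 or dy != 0)
    if n == 0:
        return 0.0  # no motion information to code
    # Assumption: the "sum of motion vectors" is the sum of |dx| + |dy|.
    mean_mv = sum(abs(dx) + abs(dy) for dx, dy in motion_vectors) / total_mbs
    # Clamp at zero: the logarithm is negative when 1000 * mean_mv / n < 1.
    return max(0.0, 4 * n * log(1000 * mean_mv / n))
```

With ten macroblocks that all carry the vector (1, 0), N = 10 and Mean_mv = 1, so the estimate is 4 x 10 x log10(100) = 80 bits under these assumptions.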
A second pre-analysis sub-step, related to the number of bits needed to code the DCT coefficients, is based on a prediction done at the macroblock level. For each quantization step, statistics have been gathered on the number of bits used versus the sum of absolute differences (SAD) between the luminance data of the current and reference macroblocks. The SAD is the correlation measure between the original macroblock in the current picture and the displaced macroblock in the previous reconstructed picture, according to the relation (2):
$$ SAD_N(x, y) = \sum_{i=1}^{N} \sum_{j=1}^{N} \left| \mathrm{original}(i, j) - \mathrm{previous}(i - x, j - y) \right| \qquad (2) $$
with x,y = displacement coordinates and N = size of the block, generally 8 or 16. It appears that the number of bits does not vary linearly with the SAD.
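The SAD of relation (2) can be sketched in a few lines of Python. The picture layout (lists of rows of luminance samples), the argument names and the bounds check are assumptions of this sketch, and x is taken as the horizontal and y as the vertical displacement.

```python
def sad(current, previous, top, left, x, y, n=16):
    """Sum of absolute differences between an n x n block of the current
    picture and the block displaced by (x, y) in the previous picture.

    current, previous: pictures as lists of rows of luminance samples.
    (top, left): row and column of the block's first sample in `current`.
    """
    height, width = len(previous), len(previous[0])
    # Refuse displacements that would read outside the previous picture
    # (Python would otherwise silently wrap negative indices).
    if not (0 <= top - y and top - y + n <= height
            and 0 <= left - x and left - x + n <= width):
        raise ValueError("displaced block falls outside the previous picture")
    total = 0
    for j in range(n):          # rows of the block
        for i in range(n):      # columns of the block
            cur = current[top + j][left + i]
            ref = previous[top + j - y][left + i - x]
            total += abs(cur - ref)
    return total
```

A block that has simply moved by (x, y) between the two pictures gives a SAD of 0 for that displacement, which is why a small SAD indicates a good motion match and little residual information to code.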
This could be due to the coding mode of the DCT coefficients in the H.263 standard, which uses variable length coding for the most frequent coefficients and fixed length coding for the others: this could explain why, for each quantization step, three different areas are observed, one for small SAD values, one for medium ones and one for large ones.
This leads to different predictions depending on the SAD area: a compromise has been made with three regions that work over the total range of the quantization step (1 to 31). Third-order polynomial approximation laws have been computed for each quantization step for SAD < 500, 500 < SAD < 1000 and 1000 < SAD < 1500, a fixed value being chosen for SAD > 1500. Said compromise between accuracy and computing complexity seems to be rather good.
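The piecewise prediction can be sketched as follows for one quantization step. The polynomial coefficients and the fixed value are not given in the text, so they are passed in as parameters. Assigning the boundary values 500, 1000 and 1500 to the upper region is likewise an assumption, as is the name `predict_mb_bits`.

```python
def predict_mb_bits(sad, cubic_coeffs, fixed_bits):
    """Predicted number of DCT bits for one macroblock at one quantization step.

    cubic_coeffs: three (a3, a2, a1, a0) tuples, one per SAD region
                  [0, 500), [500, 1000) and [1000, 1500).
    fixed_bits:   value used for SAD of 1500 and above.
    """
    if sad >= 1500:
        return fixed_bits
    region = int(sad // 500)            # 0, 1 or 2
    a3, a2, a1, a0 = cubic_coeffs[region]
    # Third-order polynomial evaluated with Horner's scheme.
    return ((a3 * sad + a2) * sad + a1) * sad + a0
```

An encoder following this scheme would hold one such set of coefficients per quantization step (1 to 31) and sum the per-macroblock predictions to obtain the predicted number of bits for the picture.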
In practice, at the very beginning of the coding, the motion estimation of the whole picture is done. The SAD of each macroblock being known, the predicted number of bits is then computed for each quantization step, in order to determine which quantization step gives the prediction closest to the targeted number of bits. If the computed quantization step is too high in terms of minimal quality, the quantization step is set to the maximum allowed.
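The selection described in this paragraph can be sketched as a small search over the per-step predictions. The dictionary layout, the tie-break toward the finer step and the name `choose_quant_step` are assumptions for illustration only.

```python
def choose_quant_step(predicted_bits, target_bits, q_max_allowed):
    """Pick the quantization step whose prediction is closest to the target.

    predicted_bits: dict mapping each quantization step (1..31) to the
                    predicted number of bits for the whole picture.
    q_max_allowed:  coarsest step still acceptable in terms of quality.
    """
    # Closest prediction wins; on a tie the finer (smaller) step is kept,
    # which is an assumption of this sketch.
    best_q = min(predicted_bits,
                 key=lambda q: (abs(predicted_bits[q] - target_bits), q))
    # A step that is too coarse for the minimal quality is clamped.
    return min(best_q, q_max_allowed)
```

When the clamp is applied, the picture will need more bits than targeted, which is the case that the text handles by lowering the picture rate.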
Knowing the number of bits sent on the line, the time to grab the next picture is then computed. In the encoder according to the invention, shown in Fig.2, these pre-analysis sub-steps, referenced with the single reference 201, are followed by a decision sub-step 202.
The reference "coder 203" designates the association of all the elements of Fig.1, except the buffer 15 and the bitrate control circuit 30.
With respect to the basic scheme of Fig.1, Fig.2 illustrates at what level the above-described pre-analysis acts in the coding chain, and Fig.3 shows how the scheme of Fig.2 (i.e. the bit rate control according to the invention) works. Said regulation is here implemented by carrying out a set of software instructions controlling computation steps, test steps, or similar steps, as now described. The first step 31 is provided for computing the authorized target number of bits Tb for the input picture, taking into account the bandwidth BW, the fullness of the output buffer FOB and the target frame rate TFR according to a relation (3) of the following type:
$$ T_b = \frac{\text{bandwidth} - \text{buffer fullness}}{\text{target frame rate}} \qquad (3) $$
The second step comprises a computing operation 321, provided for computing the predicted number of bits Pn needed for coding the current picture with the smallest quantizer (or quantization) step that appears to be compatible with the bandwidth, the quality and the frame rate. The numbers Tb and Pn are then compared (test operation 322): if Pn is greater than Tb (output Y), the coding step of Fig.2 will be done with said quantization step, while, if Pn is smaller than Tb (output N), 1 is added to the quantization step, the operation 321 is repeated, and the test operation 322 is repeated with the modified value of Pn.
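Steps 31, 321 and 322 can be sketched as a target computation followed by a search loop. The sketch makes two assumptions. First, the per-picture budget is taken as (bandwidth - buffer fullness) divided by the target frame rate. Second, the loop raises the quantization step for as long as the prediction exceeds the target, since a coarser step is what reduces the bit count; the Y/N labelling of test 322 in Fig.3 may follow a different convention. All names are hypothetical.

```python
def target_bits_per_picture(bandwidth, buffer_fullness, target_frame_rate):
    """Step 31: authorized number of bits Tb for the incoming picture."""
    return (bandwidth - buffer_fullness) / target_frame_rate


def search_quant_step(predict_bits, tb, q_min=1, q_max=31):
    """Operations 321/322: start from the finest step and add 1 until the
    predicted number of bits fits the target (or the range is exhausted).

    predict_bits: callable giving the predicted picture bits Pn for a step.
    """
    q = q_min
    while q < q_max and predict_bits(q) > tb:
        q += 1          # coarser quantization, fewer bits
    return q
```

For example, with a 64000 bit/s channel, 4000 bits waiting in the buffer and a target of 10 pictures per second, the budget is 6000 bits, and the loop stops at the first step whose prediction does not exceed that budget.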
When Pn is greater than Tb, a quality test 33 is carried out: if the quantizer is under a predetermined quality threshold (Qn-1 < Qmax), the frame rate FR is equal to the target frame rate (connection 331); otherwise, a new, smaller frame rate is computed (connection 332) according to the predicted number of bits for the minimal authorized quality. In both situations, a coding step 341 is then carried out for coding each group of blocks (GOB).
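The quality test 33 and its two outcomes can be sketched as follows. The text does not give the formula for the reduced frame rate. This sketch assumes it is obtained by dividing the available bits (bandwidth minus buffer fullness) by the bits predicted at the coarsest authorized step, which is only one plausible reading. The function and parameter names are hypothetical.

```python
def decide_quant_and_frame_rate(q, q_max_quality, bits_at_q_max,
                                bandwidth, buffer_fullness, target_frame_rate):
    """Quality test 33: return the (quantization step, frame rate) to use.

    q:              step found by the search against the bit target.
    q_max_quality:  coarsest step still authorized (Qmax).
    bits_at_q_max:  predicted picture bits when coding with q_max_quality.
    """
    if q < q_max_quality:
        # Connection 331: quality is acceptable, keep the target frame rate.
        return q, target_frame_rate
    # Connection 332: code at the worst authorized quality and lower the
    # frame rate so that each coded picture can afford bits_at_q_max.
    # Assumed formula, not stated in the text.
    reduced = (bandwidth - buffer_fullness) / bits_at_q_max
    return q_max_quality, min(reduced, target_frame_rate)
```

With 60000 bits available per second and 12000 bits predicted at the coarsest authorized step, this reading gives 5 pictures per second instead of a target of 10.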
The last step is provided for checking, for each new GOB (test 342, NEW GOB ?), whether the coding prediction is in accordance (test PRED OK ?) with the actual coding (sub-step 343). If so, the coding continues with the same quantizer (connection 344); if not, 1 is added to or subtracted from the computed quantizer (sub-step 345, NEW QUANT = +/-1) in order to reduce the drift.
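The per-GOB correction can be sketched as a comparison between the bits predicted and the bits actually produced so far, followed by a change of at most one step. The text does not say how the PRED OK test is evaluated, so the relative tolerance used here is a hypothetical parameter, as are the function and argument names.

```python
def adjust_quant_after_gob(q, predicted_so_far, actual_so_far,
                           tolerance=0.1, q_min=1, q_max=31):
    """Sub-steps 343 to 345: keep or nudge the quantization step after a GOB.

    tolerance: relative drift accepted before the step is changed
               (assumed value, not taken from the text).
    """
    if actual_so_far > predicted_so_far * (1 + tolerance):
        return min(q + 1, q_max)    # too many bits spent: quantize coarser
    if actual_so_far < predicted_so_far * (1 - tolerance):
        return max(q - 1, q_min)    # bits to spare: quantize finer
    return q                        # prediction holds: same quantizer
```

Limiting the change to plus or minus one between consecutive GOBs keeps the visual quality from jumping inside a picture while still pulling the bit count back toward the prediction.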

Claims

1. A method of coding the successive pictures of a video signal, comprising the steps of:
- subdividing each successive picture into a plurality of sub-pictures;
- transforming each sub-picture into coefficients;
- quantizing said coefficients with an applied step size;
- coding said quantized coefficients;
- controlling the step size in conformity with a target value for the number of bits for encoding each successive picture, said controlling step comprising a pre-analysis sub-step, based on an estimation of the number of bits respectively used for coding motion information between previous and current pictures and for coding said coefficients, and a decision sub-step, provided for adjusting the quantizing step size and the rate of the successive pictures.
2. A device for encoding the successive pictures of a video signal, comprising:
- means for dividing each successive picture into a plurality of sub-pictures;
- an encoder for encoding successively said sub-pictures, or groups of sub-pictures, said encoder including a picture transformer for transforming each sub-picture into coefficients and a quantizer for quantizing the coefficients with an applied step size;
- control means for controlling the quantization step size in conformity with a target value for the number of bits for encoding the applied picture; said device also comprising a pre-analysis stage, provided for estimating the number of bits respectively used for coding motion information between previous and current pictures and for coding said coefficients, and a decision stage, provided for adjusting the quantization step size and the rate of successive pictures.
PCT/EP1999/010199 1998-12-29 1999-12-17 Method and device for encoding a video signal WO2000040031A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP99966980A EP1057343A1 (en) 1998-12-29 1999-12-17 Method and device for encoding a video signal
JP2000591811A JP2002534863A (en) 1998-12-29 1999-12-17 Video signal encoding method and apparatus
KR1020007009588A KR20010041441A (en) 1998-12-29 1999-12-17 Method and device for encoding a video signal

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP98403322 1998-12-29
EP98403322.5 1998-12-29

Publications (1)

Publication Number Publication Date
WO2000040031A1 (en) 2000-07-06

Family

ID=8235609

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1999/010199 WO2000040031A1 (en) 1998-12-29 1999-12-17 Method and device for encoding a video signal

Country Status (4)

Country Link
EP (1) EP1057343A1 (en)
JP (1) JP2002534863A (en)
KR (1) KR20010041441A (en)
WO (1) WO2000040031A1 (en)

Also Published As

Publication number Publication date
EP1057343A1 (en) 2000-12-06
JP2002534863A (en) 2002-10-15
KR20010041441A (en) 2001-05-25

Legal Events

Date Code Title Description
AK Designated states (Kind code of ref document: A1; Designated state(s): JP KR)
AL Designated countries for regional patents (Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE)
WWE Wipo information: entry into national phase (Ref document number: 1999966980; Country of ref document: EP)
WWE Wipo information: entry into national phase (Ref document number: 1020007009588; Country of ref document: KR)
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWP Wipo information: published in national office (Ref document number: 1999966980; Country of ref document: EP)
WWP Wipo information: published in national office (Ref document number: 1020007009588; Country of ref document: KR)
WWW Wipo information: withdrawn in national office (Ref document number: 1999966980; Country of ref document: EP)
WWR Wipo information: refused in national office (Ref document number: 1020007009588; Country of ref document: KR)