EP1057343A1 - Method and device for encoding a video signal - Google Patents
Method and device for encoding a video signal
- Publication number
- EP1057343A1 EP99966980A
- Authority
- EP
- European Patent Office
- Prior art keywords
- picture
- pictures
- bits
- coding
- sub
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/587—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/46—Conversion to or from run-length codes, i.e. by representing the number of consecutive digits, or groups of digits, of the same kind by a code word and a digit indicative of that kind
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/115—Selection of the code volume for a coding unit prior to coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/137—Motion inside a coding unit, e.g. average field, frame or block difference
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/149—Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/152—Data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/174—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Definitions
- N: number of macroblocks having a non-null motion vector
- the motion estimation of the whole picture is done.
- the SAD of each macroblock being known, the predicted number of bits is then computed for each quantization step, in order to determine which quantization step gives the prediction closest to the targeted number of bits. If the computed quantization step is too high in terms of minimal quality, the quantization step is set to the maximum allowed.
- coder 203 designates the association of all the elements of Fig.1, except the buffer 15 and the bitrate control circuit 30.
- Fig.2 illustrates at what level the above-described pre-analysis acts in the coding chain and Fig.3 shows how the scheme of Fig.2 works.
- the first step 31 is provided for computing the authorized target number of bits T for the input picture, taking into account the bandwidth BW, the fullness of the output buffer FOB and the target frame rate TFR according to a relation (3) of the following type:
- the second step comprises a computing operation 321, provided for computing the predicted number of bits Pn needed to code the current picture with the smallest quantizer (or quantization) step which appears to be compatible with the bandwidth, the quality and the frame rate.
- These numbers Tb and Pn are then compared (test operation 322): if Pn is greater than Tb (output Y), the coding step of Fig.2 will be done with said quantization step, while, if Pn is smaller than Tb (output N), 1 is added to the quantization step, the operation 321 is repeated, and the test operation 322 is repeated with the modified value of Pn.
- a quality test 33 is carried out: if the quantizer is under a predetermined quality threshold (a test of the quantizer Qn against Qmax), the frame rate FR is equal to the target frame rate (connection 331), while, if it is not the case, a new, smaller frame rate is computed (connection 332) according to the predicted number of bits for the minimal authorized quality. In both situations, a coding step 341 is then carried out for coding each group of blocks (GOB).
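The pre-analysis described above relies on the SAD of each macroblock. A minimal sketch of how such a per-macroblock measure could be gathered is given below. It assumes that SAD denotes the usual sum of absolute differences, and, for brevity, it compares co-located 16 x 16 macroblocks without motion compensation. The function names and the list-of-rows picture layout are hypothetical and are not taken from the patent.

```python
def macroblock_sad(current, reference, top, left, size=16):
    """Sum of absolute differences over one size x size macroblock.

    Assumption: pictures are lists of rows of integer luminance samples.
    """
    total = 0
    for y in range(top, top + size):
        for x in range(left, left + size):
            total += abs(current[y][x] - reference[y][x])
    return total


def picture_sads(current, reference, size=16):
    """SAD of every complete macroblock, in raster order."""
    rows, cols = len(current), len(current[0])
    return [
        macroblock_sad(current, reference, top, left, size)
        for top in range(0, rows - size + 1, size)
        for left in range(0, cols - size + 1, size)
    ]


# Tiny 16 x 32 example: the left macroblock is unchanged,
# the right one differs by 3 on every sample.
current = [[10] * 32 for _ in range(16)]
reference = [[10] * 16 + [7] * 16 for _ in range(16)]
print(picture_sads(current, reference))
```

A real encoder would compute the SAD against the motion-compensated macroblock found by the block-matching search rather than the co-located one.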
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Algebra (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
In video communication, the quality and delay of the transmission depend on the bit rate control strategy, which fits the number of bits generated for a given setting of the coder to the available bandwidth. In order to give appropriate guidelines to the coder, a pre-analysis step, based on approximate predictions (by means of an empirical law for the motion information and the use of statistics for the content of the picture), is carried out to predict this generated number of bits, and is followed by a decision step provided for adjusting the quantization step size and the rate of the successive pictures.
Description
Method and device for encoding a video signal.
The present invention relates to a method of encoding the successive pictures of a video signal, comprising the steps of subdividing each successive picture into a plurality of sub-pictures, transforming each sub-picture into coefficients, quantizing said coefficients with an applied step size, coding said quantized coefficients, and controlling the step size in conformity with a target value for the number of bits for encoding each successive picture. The invention also relates to a corresponding device. This invention is particularly adapted to real-time communication at low bit rate according to the so-called H.263 recommendation.
In low bit rate applications (videophony, videoconferencing), a system for image compression such as proposed in the H.263 standard is recommended. A video encoder according to said standard is for instance described in the book "Motion estimation algorithms for video compression", B. Furht et al., Kluwer Academic Publishers, 1997, chapter 2, pp. 30-35. An H.263 video encoder such as illustrated in Fig.1 is based on a motion-compensated prediction from a previous image to the current one, followed by an orthogonal transformation (such as DCT), which reduces the spatial redundancy of the pictures by decorrelating the picture elements and concentrating the energy into a few low order coefficients, a quantization, and an encoding operation for encoding the prediction error thus transformed and quantized. At least the first picture is a reference one, encoded without temporal prediction (i.e. according to an "intra mode", or I mode), and from time to time one further picture in every n pictures may also be coded according to said intra mode. The other pictures, called P pictures, are temporally predicted from earlier pictures.
As shown in Fig.1, the successive pictures P arrive regularly at the input of the encoding device (input video). A coding branch including a discrete cosine transform circuit 12, a quantization circuit 13 and a variable-length encoder 14 (DCT, Q and VLC respectively) processes these pictures (the circuit 12 in fact receives the difference between the input pictures and the predicted ones, available at the output of a subtracter 25) and sends the resulting coded, variable bit rate bitstream to a buffer 15, the output of which is the constant bit rate output bitstream of the H.263 video encoder. Said output bitstream is also sent to a bitrate control circuit 30 for buffer regulation.
A prediction branch is provided and comprises in series an inverse quantization circuit 21 (Q⁻¹), an inverse DCT circuit 22 (DCT⁻¹), an adder 23 (delivering the reconstructed previous picture RPP), a temporal prediction circuit 24 (delivering a predicted picture PP, sent to the subtracter 25, and motion vectors MV, sent to the variable-length encoder 14, the prediction being based on a block-matching search carried out between the current picture CP, available at the output of a picture skipping circuit 26, and the reconstructed one RPP, available at the output of the adder 23), and the subtracter 25. The output of the temporal prediction circuit 24 is also sent back to the other input of the adder 23 for the reconstruction of the previous picture RPP used for the temporal prediction.
The H.263 standard defines a hierarchical bitstream syntax with four layers: picture level, group of blocks (GOB) level, macroblock (MB) level, and block level (8 x 8 picture elements, or pixels), the block being the elementary unit on which the DCT operates. A macroblock includes four luminance blocks (covering a 16 x 16 area in a picture) and two chrominance blocks. The motion estimation and compensation implemented on the reconstructed previous picture RPP in the circuit 24 of the prediction branch operate on macroblocks.
A feedback connection 31 between the buffer 15 and the quantization circuit 13 makes it possible to obtain a finer or a coarser quantization. The coarseness of the quantization is defined by a quantization parameter for the first three layers (blocks, macroblocks, GOBs) and a fixed quantization matrix which sets the relative coarseness of quantization for each DCT coefficient. The picture skipping circuit 26 provided at the input of the encoder may also be used as a possible way to reduce the bit rate (while keeping an acceptable picture quality). The number of skipped pictures is variable and depends on the output buffer fullness, and the feedback connection 31 provided for buffer regulation is therefore related not only to quantization step size variations but also to picture skipping (and also to an intra/inter selection, which is controlled by a circuit 41 that actuates, or not, a first switch 42 and a second switch 43).
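The effect of a finer or coarser quantization on the amount of data to be coded can be illustrated with a short sketch. It is a deliberately simplified uniform quantizer, not the exact H.263 quantization rule; the coefficient values and the function names are hypothetical and chosen only for illustration.

```python
def quantize(coeffs, step):
    """Simplified uniform quantization of transform coefficients.

    int() truncates toward zero, so small coefficients collapse to level 0.
    """
    return [int(c / step) for c in coeffs]


def dequantize(levels, step):
    """Inverse operation, as performed in the prediction branch (Q^-1)."""
    return [level * step for level in levels]


# Hypothetical coefficients of one block, largest (low order) first.
coeffs = [312.0, -47.5, 18.2, -6.1, 3.4, -1.2, 0.8, 0.3]

for step in (4, 16):
    levels = quantize(coeffs, step)
    nonzero = sum(1 for level in levels if level != 0)
    print(f"step={step}: levels={levels}, nonzero={nonzero}")
```

With the coarser step, fewer non-zero levels remain for the variable-length encoder, which lowers the bit count at the price of a larger reconstruction error after `dequantize`.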
This feedback connection makes it possible to give guidelines to the encoder, for which the problem of bit rate control can be formulated as follows: given a predetermined bit rate and an input picture, which encoder setting has to be chosen? One possibility is a constant picture rate approach, according to which the pictures are grabbed periodically and the quality is adapted to the complexity of each successive picture (in order to maintain the targeted bit rate) and is therefore highly variable from one picture to the next. The other is a constant quality approach, according to which the pictures are processed with a fixed quantization step, but only when the encoder has finished processing the previous picture, i.e. at a highly variable picture rate adapted to the complexities of the pictures.
It is therefore an object of the invention to propose an improved encoding method in which a trade-off between picture rate and quality is sought, while also taking into account the fact that the delay between the input pictures and the displayed ones has to be well controlled (as constant as possible) in order to render any scene as regularly as possible.
To this end, the invention relates to a method such as described in the preamble of the description and in which said controlling step comprises a pre-analysis sub-step, based on an estimation of the number of bits respectively used for coding motion information between previous and current pictures and for coding said coefficients, and a decision sub-step, provided for adjusting the quantizing step size and the rate of the successive pictures.
The particular aspects of the invention will now be explained with reference to the embodiment described hereinafter and considered in connection with the accompanying drawings, in which:
Fig.1 shows a basic video compression scheme according to the H.263 standard;
Fig.2 shows the general structure of an encoder according to the invention;
Fig.3 shows how the scheme of Fig.2 works.
In a coding chain such as shown in Fig.1, the starting point of the feedback control carried out by means of the output buffer 15 is to fit the number of bits which will be used to code each successive current picture to the number of bits available on the transmission channel. When studying the bit generation during the coding of each picture, it appears that the largest share of the bits to be transmitted is related to the content of the picture. In fact, bits are used for coding information at the picture level, GOB level, macroblock level and block level, but the most expensive parts in terms of the number of bits generated are the macroblock and block levels, which concern motion estimation and DCT coefficient coding and depend entirely on the complexity of the current picture. In such a coding chain, the quality of the communication depends on a variable, quantizer-dependent part of the generated bits.
According to the invention, a pre-analysis of the current image is provided in order to roughly predict how many bits will be generated for a given encoder setting. Said pre-analysis, described hereunder in more detail, makes it possible to predict the number of bits generated for each possible quantization step of the encoder (pre-analysis step). It is followed by a decision step in which, after said number of bits has been compared with the desired one, a setting for the encoder is found. If the corresponding quantization step lies within a previously set quality range, the picture is coded with it (this means that transmission is not authorized when the quality is too poor, which corresponds to too large a quantizer step size). Otherwise the worst authorized quantization step is first chosen and a decreased picture rate is computed (and applied by means of the picture skipping circuit 26), in order to meet the bandwidth requirements. The computed setting is then used for the encoding of the picture. After the encoding of each GOB, the bit rate control enabled by the feedback connection for buffer regulation checks whether a discrepancy between the predicted and desired numbers of bits has appeared during said encoding and, if necessary, the setting of the encoder is modified by changing the quantization step between two consecutive GOBs (the only authorized changes are plus or minus one).
The generated bits can be split into two parts, a first one corresponding to the headers and a second one corresponding to the real content of the current picture. The computation of the first part is easy, but that of the second one is more complex.
In order to transmit the content of a picture, two kinds of information are needed: (a) the motion information between the current picture and the previous one, and (b) the chrominance and luminance variations between a current macroblock and the corresponding one in the previous picture.
A first pre-analysis sub-step, related to the motion information, is based on an approximate prediction of the number of bits needed to code the motion vectors. More precisely, it has been found that an empirical law linking the mean motion of the complete picture and the number of bits needed to code all the motion data could be established. This law can be expressed in the form of the following equation (1):
$$PNB = 4 \cdot N \cdot \log\!\left(\frac{1000 \cdot Mean\_mv}{N}\right) \qquad (1)$$
where:
N = number of macroblocks having a non-null motion vector;
Mean_mv = (sum of motion vectors)/(number of macroblocks per picture);
PNB = estimated number of bits used to code the motion information of a picture (related to the mean motion of the whole picture).
This law has the advantage of being simple and fast to evaluate.
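As an illustration only, equation (1) can be sketched in Python as follows. Several points are assumptions not fixed by the text: the base of the logarithm (base 10 is used here as a default), the reading of "sum of motion vectors" as a sum of vector magnitudes, the clamping of negative results to zero, and the function name `estimate_motion_bits` itself.

```python
import math

def estimate_motion_bits(motion_vectors, log=math.log10):
    """Estimate PNB from equation (1) for one picture.

    motion_vectors: one (dx, dy) pair per macroblock of the picture.
    log: logarithm to apply; its base is NOT given in the text (assumption).
    """
    total_macroblocks = len(motion_vectors)
    # Assumption: "sum of motion vectors" is read as a sum of magnitudes.
    magnitudes = [math.hypot(dx, dy) for dx, dy in motion_vectors]
    # N = number of macroblocks having a non-null motion vector.
    n = sum(1 for m in magnitudes if m > 0)
    if n == 0:
        return 0.0  # no motion, so no motion bits are predicted
    # Mean_mv = (sum of motion vectors) / (number of macroblocks per picture).
    mean_mv = sum(magnitudes) / total_macroblocks
    # Assumption: a negative estimate (very small motion) is clamped to zero.
    return max(0.0, 4.0 * n * log(1000.0 * mean_mv / n))
```

With ten macroblocks of which one has the vector (3, 4), N = 1 and Mean_mv = 0.5, so the estimate is 4 x log(500) under the chosen base.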
A second pre-analysis sub-step, related to the number of bits needed to code the DCT coefficients, is based on a prediction done at the macroblock level. For each quantization step, statistics have been gathered on the number of bits used versus the sum of absolute differences (SAD) between the luminance data of the current and reference macroblocks. The SAD is the correlation measure between the original macroblock in the current picture and the displaced macroblock in the previous reconstructed picture, according to the relation (2):
$$SAD_N(x,y) = \sum_{i=1}^{N} \sum_{j=1}^{N} \left| current(i,j) - previous(i-x,\, j-y) \right| \qquad (2)$$
with x, y = displacement coordinates and N = size of the block, generally 8 or 16. It appears that the number of bits does not vary linearly with the SAD.
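The SAD of relation (2) can be sketched as below. This is a minimal illustration, assuming that pictures are stored as lists of rows of luminance samples, that i indexes rows and j indexes columns, and that the block position is passed as `top` and `left`; these names and conventions are not taken from the text.

```python
def sad(current, previous, top, left, x, y, n=16):
    """Sum of absolute differences for an n x n block, after relation (2).

    current, previous: 2-D lists of luminance samples (lists of rows).
    top, left: position of the block in the current picture.
    x, y: displacement; the reference sample is previous(i - x, j - y).
    """
    rows, cols = len(previous), len(previous[0])
    # Guard: Python would silently wrap negative indices, so reject
    # displacements that push the block outside the previous picture.
    if not (0 <= top - x and top - x + n <= rows
            and 0 <= left - y and left - y + n <= cols):
        raise ValueError("displaced block falls outside the previous picture")
    total = 0
    for i in range(n):
        for j in range(n):
            cur = current[top + i][left + j]
            ref = previous[top + i - x][left + j - y]
            total += abs(cur - ref)
    return total
```

A real encoder would obtain this value as a by-product of motion estimation rather than recomputing it.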
This could be due to the coding mode of the DCT coefficients in the H.263 standard, which is a variable-length coding mode for the more frequent coefficients and a fixed-length coding mode for the others: this could explain why, for each quantization step, three different areas are observed, one for small SAD, one for medium SAD and one for large SAD.
This leads to different predictions being made depending on the SAD area: a compromise has been made for three regions which work with the total range of the quantization step (1 to 31). Three order polynomial approximation laws have been computed for each quantization step for SAD < 500, 500 < SAD < 1000 and 1000 < SAD < 1500, a fixed value being chosen for SAD > 1500. Said compromise between accuracy and computing complexity seems to be rather good.
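The piecewise prediction described above might be organized as in the following sketch. The text gives no coefficients, so `EXAMPLE_LAW` holds hypothetical placeholder values for a single quantization step; reading "three order polynomial" as a cubic, and assigning a SAD that falls exactly on a bound to the upper region, are likewise assumptions.

```python
SAD_BOUNDS = (500, 1000, 1500)  # region limits quoted in the text

# Hypothetical law for ONE quantization step: three coefficient tuples
# (c0, c1, c2, c3), one per SAD region, plus the fixed value used above 1500.
EXAMPLE_LAW = (
    ((0.0, 0.5, 0.0, 0.0),
     (125.0, 0.25, 0.0, 0.0),
     (250.0, 0.125, 0.0, 0.0)),
    440.0,
)

def predict_macroblock_bits(sad_value, law):
    """Predict the DCT bits of one macroblock for one quantization step."""
    polys, fixed_bits = law
    for bound, (c0, c1, c2, c3) in zip(SAD_BOUNDS, polys):
        if sad_value < bound:
            # Horner evaluation of c0 + c1*s + c2*s**2 + c3*s**3.
            return ((c3 * sad_value + c2) * sad_value + c1) * sad_value + c0
    return fixed_bits

def predict_picture_bits(sads, law):
    """Sum the macroblock predictions over a whole picture."""
    return sum(predict_macroblock_bits(s, law) for s in sads)
```

In a full predictor there would be one such law per quantization step from 1 to 31, fitted to measured statistics.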
In practice, the motion estimation of the whole picture is done at the very beginning of the coding. The SAD of each macroblock being known, the predicted number of bits is then computed for each quantization step, in order to determine which quantization step gives the prediction closest to the targeted number of bits. If the computed quantization step is too high in terms of minimal quality, the quantization step is set to the maximum allowed.
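The selection just described can be sketched as follows, assuming the per-step predictions are already available in a dictionary. The function name, the argument names and the returned flag are illustrative choices, not part of the described encoder.

```python
def choose_quantizer(predicted_bits, target_bits, q_max_allowed):
    """Pick the step whose predicted bit count is closest to the target.

    predicted_bits: dict mapping each quantization step (1..31) to the
                    number of bits predicted for the picture at that step.
    q_max_allowed: coarsest step still giving acceptable quality.
    Returns (step, clamped); clamped is True when the quality limit was hit.
    """
    best = min(predicted_bits,
               key=lambda q: abs(predicted_bits[q] - target_bits))
    if best > q_max_allowed:
        # Quality would be too poor: fall back to the maximum allowed step.
        return q_max_allowed, True
    return best, False
```

When the flag is set, the picture cannot meet the bit target at an acceptable quality, which is the case where the picture rate has to be reduced.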
Knowing the number of bits sent on the line, the time to grab the next picture is then computed.
In the encoder according to the invention, shown in Fig.2, these pre-analysis sub-steps, referenced with the single reference 201, are followed by a decision sub-step 202.
The reference "coder 203" designates the association of all the elements of Fig.1, except the buffer 15 and the bitrate control circuit 30.
With respect to the basic scheme of Fig.1, Fig.2 illustrates at what level the above-described pre-analysis acts in the coding chain, and Fig.3 shows how the scheme of Fig.2 (i.e. the bit rate control according to the invention) works. Said regulation is here implemented by carrying out a set of software instructions controlling computation steps, test steps, or similar steps, as now described.
The first step 31 is provided for computing the authorized target number of bits Tb for the input picture, taking into account the bandwidth BW, the fullness of the output buffer FOB and the target frame rate TFR, according to a relation (3) of the following type:
$$T_b = \frac{\text{bandwidth} - \text{buffer fullness}}{\text{target frame rate}} \qquad (3)$$
The second step comprises a computing operation 321, provided for computing the predicted number of bits Pn needed for coding the current picture with the smallest quantizer (or quantization) step which appears to be compatible with the bandwidth, the quality and the frame rate. These numbers Tb and Pn are then compared (test operation 322): if Pn is greater than Tb (output Y), the coding step of Fig.2 will be done with said quantization step, while, if Pn is smaller than Tb (output N), 1 is added to the quantization step, the operation 321 is repeated, and the test operation 322 is repeated with the modified value of Pn.
When Pn is greater than Tb, a quality test 33 is carried out: if the quantizer is under a predetermined quality threshold (Qn-1 < Qmax), the frame rate FR is equal to the target frame rate (connection 331), while, if it is not the case, a new smaller frame rate is computed (connection 332) according to the predicted number of bits for the minimal authorized quality. In both situations, a coding step 341 is then carried out for coding each group of blocks (GOB).
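The target computation of step 31 and the frame rate decision of test 33 could look like the sketch below. Two points are assumptions: relation (3) is taken to be (bandwidth - buffer fullness) / target frame rate, and the reduced frame rate is taken to be the channel bandwidth divided by the bits predicted at the coarsest authorized quantizer, since the text does not spell out that computation.

```python
def target_bits(bandwidth, buffer_fullness, target_frame_rate):
    """Target number of bits Tb for one picture (assumed form of relation 3)."""
    return (bandwidth - buffer_fullness) / target_frame_rate

def decide_frame_rate(quantizer, q_max, bits_at_q_max, bandwidth,
                      target_frame_rate):
    """Quality test 33: keep the target frame rate or compute a smaller one.

    bits_at_q_max: bits predicted for the picture at the coarsest
                   authorized quantizer q_max.
    """
    if quantizer < q_max:
        return target_frame_rate  # connection 331: quality is acceptable
    # Connection 332 (assumed formula): spend the channel rate on pictures
    # coded at q_max, never exceeding the target frame rate.
    return min(target_frame_rate, bandwidth / bits_at_q_max)
```

For instance, with a 64000 bit/s channel and 12800 bits predicted per picture at the coarsest authorized step, this sketch lowers a 10 pictures/s target to 5 pictures/s.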
The last step is provided for checking, for each new GOB (test 342 NEW GOB ?), whether the coding prediction is in accordance (test PRED OK ?) with the actual coding (sub-step 343). If yes, the coding continues with the same quantizer (connection 344); if not, 1 is added to or subtracted from the computed quantizer (sub-step 345, NEW QUANT = +/-1) in order to reduce the drift.
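A minimal sketch of this GOB-level correction follows. The text does not state the criterion behind the PRED OK test, so the relative tolerance used here is a hypothetical parameter, as are the function and argument names; only the restriction to changes of plus or minus one comes from the description.

```python
def adjust_quantizer(quantizer, bits_so_far, predicted_so_far,
                     tolerance=0.1, q_min=1, q_max=31):
    """Return the quantizer for the next GOB, changed by at most one step.

    bits_so_far: bits actually produced for the GOBs coded so far.
    predicted_so_far: bits that the pre-analysis predicted for those GOBs.
    tolerance: hypothetical relative drift accepted as "prediction OK".
    """
    if predicted_so_far <= 0:
        return quantizer
    drift = (bits_so_far - predicted_so_far) / predicted_so_far
    if drift > tolerance:
        return min(quantizer + 1, q_max)  # too many bits: coarser step
    if drift < -tolerance:
        return max(quantizer - 1, q_min)  # too few bits: finer step
    return quantizer  # prediction holds: keep the same quantizer
```

Limiting the change to one step per GOB keeps the picture quality from varying abruptly between neighbouring GOBs.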
Claims
1. A method of coding the successive pictures of a video signal, comprising the steps of:
- subdividing each successive picture into a plurality of sub-pictures;
- transforming each sub-picture into coefficients;
- quantizing said coefficients with an applied step size;
- coding said quantized coefficients;
- controlling the step size in conformity with a target value for the number of bits for encoding each successive picture, said controlling step comprising a pre-analysis sub-step, based on an estimation of the number of bits respectively used for coding motion information between previous and current pictures and for coding said coefficients, and a decision sub-step, provided for adjusting the quantizing step size and the rate of the successive pictures.
2. A device for encoding the successive pictures of a video signal, comprising:
- means for dividing each successive picture into a plurality of sub-pictures;
- an encoder for encoding successively said sub-pictures, or groups of sub-pictures, said encoder including a picture transformer for transforming each sub-picture into coefficients and a quantizer for quantizing the coefficients with an applied step size;
- control means for controlling the quantization step size in conformity with a target value for the number of bits for encoding the applied picture; said device also comprising a pre-analysis stage, provided for estimating the number of bits respectively used for coding motion information between previous and current pictures and for coding said coefficients, and a decision stage, provided for adjusting the quantization step size and the rate of successive pictures.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP99966980A EP1057343A1 (en) | 1998-12-29 | 1999-12-17 | Method and device for encoding a video signal |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP98403322 | 1998-12-29 | ||
EP99966980A EP1057343A1 (en) | 1998-12-29 | 1999-12-17 | Method and device for encoding a video signal |
PCT/EP1999/010199 WO2000040031A1 (en) | 1998-12-29 | 1999-12-17 | Method and device for encoding a video signal |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1057343A1 (en) | 2000-12-06 |
Family
ID=8235609
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP99966980A Withdrawn EP1057343A1 (en) | 1998-12-29 | 1999-12-17 | Method and device for encoding a video signal |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1057343A1 (en) |
JP (1) | JP2002534863A (en) |
KR (1) | KR20010041441A (en) |
WO (1) | WO2000040031A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4190157B2 (en) * | 2001-02-26 | 2008-12-03 | 三洋電機株式会社 | Image data transmitting apparatus and image data receiving apparatus |
KR100594056B1 (en) | 2003-09-01 | 2006-07-03 | 삼성전자주식회사 | H.263/MPEG Video Encoder for Effective Bits Rate Control and Its Control Method |
CN106791860B (en) * | 2016-12-28 | 2019-07-30 | 重庆邮电大学 | A kind of adaptive video coding control system and method |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH05167998A (en) * | 1991-12-16 | 1993-07-02 | Nippon Telegr & Teleph Corp <Ntt> | Image-encoding controlling method |
EP1357758A3 (en) * | 1995-08-02 | 2004-10-27 | Matsushita Electric Industrial Co., Ltd. | Video coding device and video transmission system using the same, quantization control method and average throughput calculation method used therein |
GB9519923D0 (en) * | 1995-09-29 | 1995-11-29 | Philips Electronics Nv | Motion estimation for predictive image coding |
JP4532607B2 (en) * | 1995-10-26 | 2010-08-25 | メディアテック インコーポレイション | Apparatus and method for selecting a coding mode in a block-based coding system |
US6002802A (en) * | 1995-10-27 | 1999-12-14 | Kabushiki Kaisha Toshiba | Video encoding and decoding apparatus |
FR2753330B1 (en) * | 1996-09-06 | 1998-11-27 | Thomson Multimedia Sa | QUANTIFICATION METHOD FOR VIDEO CODING |
-
1999
- 1999-12-17 EP EP99966980A patent/EP1057343A1/en not_active Withdrawn
- 1999-12-17 KR KR1020007009588A patent/KR20010041441A/en not_active Application Discontinuation
- 1999-12-17 WO PCT/EP1999/010199 patent/WO2000040031A1/en not_active Application Discontinuation
- 1999-12-17 JP JP2000591811A patent/JP2002534863A/en not_active Withdrawn
Non-Patent Citations (1)
Title |
---|
See references of WO0040031A1 * |
Also Published As
Publication number | Publication date |
---|---|
JP2002534863A (en) | 2002-10-15 |
WO2000040031A1 (en) | 2000-07-06 |
KR20010041441A (en) | 2001-05-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5870146A (en) | Device and method for digital video transcoding | |
KR100471956B1 (en) | Moving picture encoding apparatus and method | |
US5293229A (en) | Apparatus and method for processing groups of fields in a video data compression system | |
JP4127914B2 (en) | Adaptive video signal encoding apparatus | |
US5461422A (en) | Quantizer with automatic pre-threshold | |
JP4109113B2 (en) | Switching between bitstreams in video transmission | |
CN100463523C (en) | Video encoding methods and systems with frame-layer rate control | |
US5532746A (en) | Bit allocation method for controlling transmission rate of video encoder | |
WO2000046996A1 (en) | Frame-level rate control for plug-in video codecs | |
AU2272395A (en) | A method for determining whether to intra code a video block | |
KR100601615B1 (en) | Apparatus for compressing video according to network bandwidth | |
EP1838108A1 (en) | Processing video data at a target rate | |
US6501800B1 (en) | Variable bit-rate encoding device | |
JP3173369B2 (en) | Image compression coding device | |
US6480544B1 (en) | Encoding apparatus and encoding method | |
EP0639924B1 (en) | Coding mode control device for digital video signal coding system | |
EP1057343A1 (en) | Method and device for encoding a video signal | |
Yi et al. | Rate control using enhanced frame complexity measure for H. 264 video | |
JP3779066B2 (en) | Video encoding device | |
KR100239867B1 (en) | Method of compressing solid moving picture for controlling degradation of image quality in case of applying motion estimation and time difference estimation | |
JP2900927B2 (en) | Encoding method and encoding device | |
KR20010104058A (en) | Adaptive quantizer according to DCT mode in MPEG2 encoder | |
JP4035747B2 (en) | Encoding apparatus and encoding method | |
KR100778473B1 (en) | Bit rate control method | |
JPH09107293A (en) | Method and device for controlling code volume |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): DE FR GB |
| 17P | Request for examination filed | Effective date: 20010108 |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN |
| 18W | Application withdrawn | Effective date: 20061221 |