WO2007014850A2 - Method and device for determining quantization parameters in an image - Google Patents

Method and device for determining quantization parameters in an image

Info

Publication number
WO2007014850A2
Authority
WO
WIPO (PCT)
Prior art keywords
pixels
quantization parameter
groups
bits
image
Prior art date
Application number
PCT/EP2006/064393
Other languages
French (fr)
Other versions
WO2007014850A3 (en)
Inventor
Olivier Le Meur
Dominique Thoreau
Philippe Guillotel
Original Assignee
Thomson Licensing
Priority date
Filing date
Publication date
Application filed by Thomson Licensing filed Critical Thomson Licensing
Publication of WO2007014850A2 publication Critical patent/WO2007014850A2/en
Publication of WO2007014850A3 publication Critical patent/WO2007014850A3/en


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
        • H04N19/102 characterised by the element, parameter or selection affected or controlled by the adaptive coding
            • H04N19/124 Quantisation
                • H04N19/126 Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
        • H04N19/134 characterised by the element, parameter or criterion affecting or controlling the adaptive coding
            • H04N19/146 Data rate or code amount at the encoder output
                • H04N19/147 according to rate distortion criteria
                • H04N19/15 by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
                • H04N19/152 by measuring the fullness of the transmission buffer
        • H04N19/169 characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
            • H04N19/17 the unit being an image region, e.g. an object
                • H04N19/172 the region being a picture, frame or field
                • H04N19/176 the region being a block, e.g. a macroblock
        • H04N19/189 characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
            • H04N19/19 using optimisation based on Lagrange multipliers
            • H04N19/192 the adaptation method, adaptation tool or adaptation type being iterative or recursive
                • H04N19/194 involving only two passes

Definitions

  • the invention relates to a device and a method for determining quantization parameters for each group of pixels in an image, by using information about the perceptual interest of each of the groups of pixels. These quantization parameters are subsequently used for coding the image with a number of bits Csetpoint.
  • the quantization parameter used for coding each image may vary in a given image from one block of pixels to another, for example from one macroblock (block of 16 pixels by 16 pixels) to another.
  • the regions of interest in the image can be coded with a reconstruction quality greater than that of the regions of non-interest by allocating them more bits, i.e. by associating a lower quantization parameter with them.
  • the background is considered to be a region of non-interest whereas the speaker's head and shoulders are considered to be regions of interest.
  • Non-homogeneous allocation of the number of bits thus makes it possible to improve overall the perceived reconstruction quality by allocating more bits to the blocks of pixels belonging to the speaker's face or shoulders than to those belonging to the background. It is therefore necessary to identify the regions of interest in the image and allocate them a greater number of bits for coding them. Identification of the regions of interest may be based for example on modelling the pre-attentive visual attention ("computational modelling of bottom-up visual selective attention"). Such modelling is described in an article by O. Le Meur et al. entitled “Performance assessment of a visual attention system entirely based on a human vision modeling" published in the ICIP conference proceedings of October 2004, and in European Patent Application EP 1 544 792 published in June 2005.
  • the invention provides a method for determining a quantization parameter for each group of pixels in an image, making it possible to guarantee a minimum reconstruction quality over all the regions of the image, and particularly over the regions of non-interest.
  • the invention relates to a method for determining a quantization parameter for each group of pixels in an image, the quantization parameters being used for coding the image with a first number of bits (Csetpoint) corresponding to the number of bits necessary for coding the image with a setpoint quantization parameter (qsetpoint).
  • the method comprises the following steps:
  • the difference in bits (Csetpoint - Cmin) between the first and second numbers of bits is reallocated to the groups of pixels proportionally to their perceptual interest, i.e. the number of bits reallocated and the value of the perceptual interest vary in the same direction.
  • the perceptual interest of a group of pixels is characterized by a salience value calculated for this group of pixels.
  • the step of calculating the preliminary quantization parameter (qi max) is preceded by a step of associating a set of points with each of the groups of pixels, each point comprising a quantization parameter value, a number of bits necessary for coding the group of pixels with the quantization parameter and an associated distortion value.
  • the step of calculating the preliminary quantization parameters (qi max) comprises the following steps: a. For each of the groups of pixels (MBi), calculating a distortion value (dv(i,qinit)) corresponding to the coding of the group of pixels with an initial quantization parameter (qinit), which is greater than the setpoint quantization parameter (qsetpoint); b. For the image, calculating a current variance value (σv²) of the distortion corresponding to the coding of the groups of pixels in the image with the initial quantization parameter (qinit); c. Identifying a first set of groups of pixels (ES1) corresponding to the N groups of pixels having the smallest distortion values and a second set of groups of pixels (ES2) corresponding to the N groups of pixels having the largest distortion values, N being a predetermined integer; d. Decreasing the quantization parameters associated with the groups of pixels of the first set (ES1) by a value n and increasing the quantization parameters associated with the groups of pixels of the second set (ES2) by a value n, the quantization parameters associated with each of the other groups of pixels remaining unchanged, n being a predetermined integer; e. For the image, recalculating a new variance value (σv²) of the distortion corresponding to the coding of the groups of pixels in the image with the quantization parameters resulting from step d, the current variance value becoming a preceding variance value and the new variance value becoming the current variance value; and f. Returning to step c if the absolute value of the difference between the current variance value and the preceding variance value is greater than a threshold (ε), otherwise, for each of the groups of pixels, assigning the value of the quantization parameter resulting from step d to the preliminary quantization parameter (qi max) of this group.
  • the integer N is the integer part of the product M times K, where K is the number of groups of pixels in the image and M is a number lying between 0 and 1.
  • the step of calculating the final quantization parameters (qi*) comprises the following steps: a. Calculating a parameter λ(i,0), referred to as the initial rate-distortion parameter, for each of the groups of pixels (MBi), where:
  • - qi max is the preliminary quantization parameter associated with the pixel group (MBi) of index i;
  • - D(i,A) is a perceptual distortion value corresponding to the coding of the group of pixels MBi with the quantization parameter A; and
  • - R(i,A) is the number of bits necessary for coding the group of pixels of index i with the quantization parameter A.
  • Returning to step b if the difference in bits (Csetpoint - Cmin) is positive, otherwise, for each of the groups of pixels, assigning the value of the quantization parameter resulting from step c to the final quantization parameter (qi*) of this group.
  • the perceptual distortion D(i,qi) associated with a group of pixels of index i, coded with the quantization parameter qi, is derived from a conventional distortion value dv(i,qi) according to one of the following formulae: D(i,qi) = dv(i,qi)*s(i), or D(i,qi) = dv(i,qi)*s^p(i).
  • the invention also relates to a device for determining a quantization parameter for each group of pixels in an image, the quantization parameters being used for coding the image with a first number of bits (Csetpoint) corresponding to the number of bits necessary for coding the image with a setpoint quantization parameter (qsetpoint).
  • the device comprises the following means:
  • the invention relates to a computer program product which comprises program code instructions for carrying out the steps of the method when the program is run on a computer.
  • FIG. 2 illustrates a rate-distortion curve associated with a group of pixels of index i
  • FIG. 3 represents two histograms of reconstruction quality at two different iterations of the method according to the invention.
  • FIG. 4 illustrates a device according to the invention.
  • the invention relates to a method for determining quantization parameters in an image, which uses information about the content of this image, more precisely information such as for example salience values characterizing the perceptual interest of the regions or groups of pixels (for example a block or macroblock) in the image.
  • An image comprises pixels, with each of which at least one luminance value is associated.
  • the method may be applied to a single image or a sequence of a plurality of images.
  • Each image is divided into K groups of pixels MBi, i ∈ [1,K].
  • Each group of pixels may be a macroblock or, more generally, a block of I pixels by J pixels. Groups of pixels with any shape may equally be envisaged.
  • the invention requires knowledge of the reconstruction quality of an image or an MBi.
  • a plurality of reconstruction quality metrics may be used in order to estimate the distortion between a source image (or respectively a source MBi) and the corresponding reconstructed image (or respectively the corresponding reconstructed MBi), i.e. the source image coded with a given quantization parameter then decoded (or respectively the coded then decoded MBi).
  • the "sum of square errors" (SSE) is defined for an image (or respectively for an MBi) as the sum over this image (or respectively over this MBi) of the squared differences between the luminance value associated with the pixel in the original image and the luminance value associated with the pixel having the same coordinates in the reconstructed image.
  • the "mean square error" (MSE) is defined as being equal to the SSE divided by the number of samples used (i.e. the number of pixels).
  • the invention also uses perceptual distortions, i.e. ones which take into account information about the content of the image and more precisely the perceptual interest of the MBi in the image.
  • a perceptual distortion for an MBi referenced D(i,qi) may be derived from a conventional distortion referenced dv(i,qi) by the following formulae: D(i,qi) = dv(i,qi)*s(i) or D(i,qi) = dv(i,qi)*s^p(i).
  • this value s(i) is a salience value.
  • a salience map is calculated for an image.
  • a salience map is a two-dimensional topographical representation of the degree of salience of each pixel in the image. This map is normalized for example between 0 and 1, although it may also be normalized between 0 and 255.
  • the salience map therefore provides a salience value S(x,y) per pixel (where (x,y) are the coordinates of a pixel in the image) which characterizes the perceptual interest of this image.
  • the mean value of the salience values S(x,y) associated with each of the pixels of an MBi is calculated.
  • the median value may also be used instead of the mean value in order to represent the MBi.
  • a salience map associated with a given image may be obtained by the method comprising the following steps:
  • each sub-band may be considered as the neuronal image corresponding to a population of visual cells attuned to an interval of spatial frequencies and a particular orientation;
  • the method is divided into the 2 steps referenced 10 and 20.
  • the modules represented in Figure 1 are functional units, which may or may not correspond to physically distinguishable units. For example, these modules or some of them may be grouped into a single component or constitute functionalities of the same software. Conversely, some modules may optionally be composed of separate physical entities.
  • the object of the method is to achieve a satisfactory compromise between the perceived reconstruction quality of the regions of interest in relation to the regions of non-interest in the image so as to improve the overall perceived reconstruction quality, without introducing other defects such as, for example, spatio-temporal defects in the case of a sequence of images.
  • steps 10 and 20 of the method consist in distributing a setpoint number of bits Csetpoint between the groups of pixels MBi in an image, referred to as the current image, as a function of their perceptual interest, for example characterized by a salience value s(i), and optionally by using rate-distortion curves associated with each MBi. More precisely, they consist in associating a final quantization parameter qi* with each group of pixels MBi in the current image.
  • steps 10 and 20 may be applied successively to all the images in the sequence.
  • the number Csetpoint is an input parameter of the method, and corresponds to the number of bits allocated to the current image for coding it.
  • This number may, for example, be provided by the user of the method as a function of the application. In the case of a sequence of images, this number of bits Csetpoint may also be determined by a conventional rate control method such as that defined in document ISO/IEC JTC1/SC29/WG11, Test Model 5, 1993. This number may vary in particular as a function of the type of current image (for example intra image, predicted image). Specifically, a larger number of bits is necessary for coding an intra type image (i.e. an image in a sequence of images which is coded without reference to the other images in the sequence) than for a coded image of the predicted type (i.e. an image in a sequence of images which is coded with reference to another image in the sequence).
  • the setpoint number of bits Csetpoint corresponds to the number of bits necessary for coding the current image with a unique quantization parameter qsetpoint or with a different parameter for each MBi.
  • a single parameter qsetpoint will be used here for describing the invention.
  • the value of qsetpoint referenced in Figure 2 may be provided directly by the rate control method indicated above, or it may be determined on the basis of the value of Csetpoint and a rate-distortion curve associated with the current image, such as the one represented in Figure 2 for an MBi.
  • generating a rate-distortion curve consists in associating a distortion value and a coding cost (i.e. a number of bits) with each quantization parameter in a given interval (for example [0-31] for MPEG-2 and [0-51] for MPEG-4 AVC) specified, for example, by the coding standard.
  • Such a curve may be associated with an image or with an MBi.
  • a rate-distortion curve may be provided by external means or alternatively generated as follows for an MBi. The same technique can be used for generating the rate-distortion curve associated with an image.
  • One technique for calculating the points 30 of the rate-distortion curve consists in coding each MBi with a plurality of quantization parameters (for example 1, 2, ..., qi, qi+1, qi+2, ...) and in decoding it in order to generate a set of points 30.
  • a quantization parameter qi, a coding cost R(i,qi) and a conventional distortion value dv(i,qi) correspond to each point 30.
  • the coding cost R(i,qi) represents the number of bits necessary for coding an MBi by using the quantization parameter qi.
  • the value dv(i,qi) is obtained by coding the MBi with the quantization parameter qi, decoding it and calculating the conventional distortion between this reconstructed MBi and the source MBi.
  • in order to avoid too many coding operations, it is possible to code each MBi with a reduced number of quantization parameters, for example one out of every two (i.e. 2, 4, ..., qi, qi+2, qi+4, ...).
  • the total curve as illustrated in Figure 2 is then interpolated between the calculated points 30, for example by using a cubic interpolation or by spline curves.
  • the images are generally modelled by a Gaussian model whose various parameters (i.e. mean, variance) are estimated directly on the basis of the current image or the images in the sequence.
  • they may be stored in correspondence tables ("look-up tables"), one per group of pixels MBi, which associate a conventional distortion value dv(i,qi) and a number of bits R(i,qi) with each quantization parameter qi.
  • the input data may be provided to the method of the invention in the form of data files.
  • step 10 consists in calculating a preliminary quantization parameter qi max for each MBi so as to minimize the variation in reconstruction quality around an average reconstruction quality. It is carried out in four sub-steps: one initialization sub-step and three sub-steps applied iteratively until a first termination criterion is met.
  • a starting setpoint Cinit is determined on the basis of Csetpoint and other parameters such as, for example, the resolution of the images in the sequence and/or meta-data and/or the spatio-temporal activity of the images.
  • a quantization parameter qinit is derived from the value of Cinit and from the rate-distortion curve associated with the current image.
  • the value Cinit corresponds to the number of bits used for coding the current image with the quantization parameter qinit.
  • the value of Cinit, which is less than the value of Csetpoint, may also be set empirically to half the value of Csetpoint.
  • the initialization sub-step consists in calculating the conventional (i.e. non-perceptual) distortion dv(i,qinit) associated with this MBi coded with the quantization parameter qinit.
  • the mean value d̄v of the conventional distortion, as well as its variance σv², are calculated over the current image in question according to the following formulae: d̄v = (1/K) Σ_{i=1..K} dv(i,qinit) and σv² = (1/K) Σ_{i=1..K} (dv(i,qinit) - d̄v)².
  • the values d̄v and σv² can thus be calculated directly on the basis of the current source image and the current reconstructed image.
  • the second sub-step consists in identifying a first set of groups of pixels corresponding to the N groups of pixels MBi having the smallest conventional distortion values, said first set being referenced ES1 in Figure 3, and a second set of groups of pixels corresponding to the N groups of pixels MBi having the greatest conventional distortion values, said second set being referenced ES2 in Figure 3.
  • the third sub-step consists in decreasing the quantization parameters associated with the groups of pixels MBi of the first set ES1 by a value n in order to increase their reconstruction quality, and in increasing the quantization parameters associated with the groups of pixels MBi of the second set ES2 by a value n in order to decrease their reconstruction quality, n being a predetermined integer. A value of n equal to 1 seems well suited. The other MBi keep the same quantization parameter.
  • the last sub-step consists in recalculating the mean value of the conventional distortion d̄v of the current image, as well as its variance σv².
  • if the absolute value of the difference between the variance value calculated at the preceding iteration and the current value is less than a threshold ε, the distribution of bits is terminated. Otherwise, the method returns to the first sub-step in order to continue the distribution.
  • this step 10 makes it possible to have a reconstruction quality which is less than the setpoint reconstruction quality but is more homogeneous.
  • Figure 3 represents two histograms of reconstruction quality at two different iterations of step 10.
  • the setpoint reconstruction quality is the reconstruction quality calculated between the current source image and the current reconstructed image, i.e. the current source image coded with the quantization parameter qsetpoint then decoded.
  • the overall quality over an image is maximal when the local quality is identical and the overall quality drops greatly when the quality drops locally.
  • This step makes it possible for a preliminary quantization parameter qi max, which corresponds to the last quantization parameter calculated, to be associated with each MBi in the current image.
  • a new rate Cmin is calculated, which takes into account the preliminary quantization parameters associated with each MBi.
  • Step 20 consists in calculating a final quantization parameter qi* for each MBi by reallocating the remaining bits ΔC, i.e. the difference in bits between Csetpoint and Cmin, as a function in particular of the perceptual interest of the MBi, a greater number of bits being reallocated to the MBi whose perceptual interest is highest.
  • the reallocation of bits is carried out according to three sub-steps: one initialization sub-step and two sub-steps applied iteratively until a second termination criterion is met.
  • the first sub-step consists in calculating an initial rate-distortion parameter λ(i,0) for each MBi in the following way on the basis of the rate-distortion curves calculated previously and the salience maps:
  • λ(i,k) represents the slope of the rate-perceptual distortion curve at a given point on this curve as calculated at iteration k.
  • D(i) = dv(i) * s(i)
  • let QP(i,k) be the quantization parameter associated with MBi at iteration k.
  • m is equal to 1.
  • the third sub-step consists in recalculating the rate-distortion parameter associated with the MBi0 whose quantization parameter has just been modified, in the following way: λ(i0, k+1) = [D(i0, QP(i0,k)) - D(i0, QP(i0,k+1))] / [R(i0, QP(i0,k+1)) - R(i0, QP(i0,k))]
  • This step 20 makes it possible for a final quantization parameter qi*, which corresponds to the last quantization parameter calculated, to be associated with each MBi in the current image.
  • the present invention also relates to a device, referenced 40 in Figure 4, which implements the method previously described. Only the essential elements of the device are represented in Figure 4.
  • the device 40 comprises in particular a random-access memory 42 (RAM or similar component), a read-only memory 43 (hard disk or similar component), a processing unit 44 such as a microprocessor or a similar component, an input/output interface 45 and a man-machine interface 46. These elements are connected together by an address and data bus 41.
  • the read-only memory 43 contains in particular the algorithms carrying out steps 10 and 20 of the method according to the invention.
  • the processing unit 44 may also contain the algorithms for obtaining the input parameters of the method such as, for example, a rate control algorithm, an algorithm for generating the salience maps as well as an algorithm for coding/decoding the images.
  • the processing unit 44 loads and executes the instructions of these algorithms.
  • the random-access memory 42 comprises in particular the operating programs of the processing unit 44, which are loaded upon start-up of the apparatus, as well as the images to be processed.
  • the purpose of the input/output interface 45 is to receive the input signal (i.e. the images to be processed and the input parameters of the method).
  • the man-machine interface 46 of the device allows the user to interrupt the processing.
  • the results of the determination of the quantization parameters in each image are stored in random-access memory then transferred to read-only memory in order to be archived with a view to subsequent processing operations, for example coding the images with these quantization parameters.
  • the man-machine interface 46 comprises in particular a control panel and a display screen.
  • the invention is not limited to the exemplary embodiments mentioned above.
  • the person skilled in the art may apply any variant in the embodiments explained and combine them in order to benefit from their various advantages.
  • perceptual distortion metrics other than those described previously may be used.
  • other methods may be used in order to determine the rate-distortion curves associated with each of the groups of pixels MBi.
  • instead of determining the values Csetpoint and Cinit directly, for example by using a rate control method, it is moreover possible to directly use quantization parameters qsetpoint and qinit proposed for example by a user as a function of the application.
  • the values Csetpoint and Cinit then correspond to the number of bits used for coding the current image, respectively with qsetpoint and qinit. According to the invention, furthermore, it is not necessary to construct rate-distortion curves.
  • a group of pixels MBi may be coded with a given quantization parameter whenever it is necessary to know the number of bits needed for coding this MBi with the given quantization parameter and the associated distortion.
  • the input data of the method according to the invention, i.e. the setpoint rate Csetpoint, optionally qsetpoint, the salience maps and optionally the rate-distortion curves, may be provided by methods other than those described previously.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to a method for determining a quantization parameter for each group of pixels in an image. It comprises the following steps: - Calculating (10) a preliminary quantization parameter (qi max) for each of the groups of pixels so as to minimize the variation in reconstruction quality between the groups when the preliminary quantization parameters are used for coding the image with a second number of bits (Cmin), which is less than the first number of bits (Csetpoint); and - Calculating (20) a final quantization parameter (qi*), which is less than or equal to the preliminary quantization parameter (qi max), for each of the groups of pixels by reallocating the difference in bits between the first and second numbers of bits to the groups of pixels as a function of their content and their perceptual interest.

Description

METHOD AND DEVICE FOR DETERMINING QUANTIZATION PARAMETERS IN AN IMAGE
1. Field of the Invention
The invention relates to a device and a method for determining quantization parameters for each group of pixels in an image, by using information about the perceptual interest of each of the groups of pixels. These quantization parameters are subsequently used for coding the image with a number of bits Csetpoint.
2. Prior Art
Selective compression makes it possible to locally vary the rate in an image by distributing a number of bits Csetpoint non-homogeneously in the image in order to improve the quality of the reconstructed image or reconstruction quality, i.e. of the coded then decoded image. In the case of the MPEG-2 and MPEG-4 video coding standards, for example, the quantization parameter used for coding each image may vary in a given image from one block of pixels to another, for example from one macroblock (block of 16 pixels by 16 pixels) to another. In this way, the regions of interest in the image can be coded with a reconstruction quality greater than that of the regions of non-interest by allocating them more bits, i.e. by associating a lower quantization parameter with them. In the case of a video conference application, the background is considered to be a region of non-interest whereas the speaker's head and shoulders are considered to be regions of interest. Non-homogeneous allocation of the number of bits thus makes it possible to improve overall the perceived reconstruction quality by allocating more bits to the blocks of pixels belonging to the speaker's face or shoulders than to those belonging to the background. It is therefore necessary to identify the regions of interest in the image and allocate them a greater number of bits for coding them. Identification of the regions of interest may be based for example on modelling pre-attentive visual attention ("computational modelling of bottom-up visual selective attention"). Such modelling is described in an article by O. Le Meur et al. entitled "Performance assessment of a visual attention system entirely based on a human vision modeling" published in the ICIP conference proceedings of October 2004, and in European Patent Application EP 1 544 792 published in June 2005.
Once the regions of interest are identified, as indicated previously, a greater number of bits is allocated to them for coding them in order to improve their reconstruction quality. The conventional solutions make it possible to allocate bits by locally adapting the quantization parameter as a function of the perceptual interest of the regions. However, they introduce numerous spatio-temporal visual defects, particularly in the regions of non-interest, for example in the background. These visual defects are problematic because they attract the observer's eye and therefore create fixation points in the image, which reduce the perceived reconstruction quality.
3. Summary of the Invention
It is an object of the invention to overcome some or all of these drawbacks. To this end the invention provides a method for determining a quantization parameter for each group of pixels in an image, making it possible to guarantee a minimum reconstruction quality over all the regions of the image, and particularly over the regions of non-interest. The invention relates to a method for determining a quantization parameter for each group of pixels in an image, the quantization parameters being used for coding the image with a first number of bits (Csetpoint) corresponding to the number of bits necessary for coding the image with a setpoint quantization parameter (qsetpoint). The method comprises the following steps:
- Calculating a preliminary quantization parameter (qi max) for each of the groups of pixels (MBi) so as to minimize the variation in reconstruction quality between the groups when the preliminary quantization parameters are used for coding the image with a second number of bits (Cmin) which is less than the first number of bits (Csetpoint); and
- Calculating a final quantization parameter (qi*), which is less than or equal to the preliminary quantization parameter (qi max), for each of the groups of pixels by reallocating the difference in bits (Csetpoint - Cmin) between the first and second numbers of bits to the groups of pixels as a function of their content and their perceptual interest.
According to one particular characteristic, the difference in bits (Csetpoint - Cmin) between the first and second numbers of bits is reallocated to the groups of pixels proportionally to their perceptual interest, i.e. the number of bits reallocated and the value of the perceptual interest vary in the same direction.
Preferably, the perceptual interest of a group of pixels is characterized by a salience value calculated for this group of pixels.
According to one variant, the step of calculating the preliminary quantization parameter (qi max) is preceded by a step of associating a set of points with each of the groups of pixels, each point comprising a quantization parameter value, a number of bits necessary for coding the group of pixels with the quantization parameter and an associated distortion value.
Advantageously, the step of calculating the preliminary quantization parameters (qi max) comprises the following steps: a. For each of the groups of pixels (MBi), calculating a distortion value (dv(i,qinit)) corresponding to the coding of the group of pixels with an initial quantization parameter (qinit), which is greater than the setpoint quantization parameter (qsetpoint); b. For the image, calculating a current variance value (σv²) of the distortion corresponding to the coding of the groups of pixels in the image with the initial quantization parameter (qinit); c. Identifying a first set of groups of pixels (ES1) corresponding to the N groups of pixels having the smallest distortion values and a second set of groups of pixels (ES2) corresponding to the N groups of pixels having the largest distortion values, N being a predetermined integer; d. Decreasing the quantization parameters associated with the groups of pixels of the first set (ES1) by a value n and increasing the quantization parameters associated with the groups of pixels of the second set (ES2) by a value n, the quantization parameters associated with each of the groups of pixels other than those belonging to the first and second sets remaining unchanged, n being a predetermined integer; e. For the image, recalculating a new variance value (σv²) of the distortion corresponding to the coding of the groups of pixels in the image with the quantization parameters resulting from step d, the current variance value becoming a preceding variance value and the new variance value becoming the current variance value; and f. Returning to step c if the absolute value of the difference between the current variance value and the preceding variance value is greater than a threshold (ε), otherwise, for each of the groups of pixels, assigning the value of the quantization parameter resulting from step d to the preliminary quantization parameter (qi max) of this group.
Preferably, the integer N is the integer part of the product M times K, where K is the number of groups of pixels in the image and M is a number lying between 0 and 1.
Advantageously, the step of calculating the final quantization parameters (qi*) comprises the following steps: a. Calculating a parameter λ(i,0), referred to as the initial rate-distortion parameter, for each of the groups of pixels (MBi) according to the following formula:

λ(i,0) = [D(i, qi max) - D(i, qi max - 1)] / [R(i, qi max - 1) - R(i, qi max)]

where: - qi max is the preliminary quantization parameter associated with the pixel group (MBi) of index i; - D(i,A) is a perceptual distortion value corresponding to the coding of the group of pixels MBi with the quantization parameter A; and
- R(i,A) is the number of bits necessary for coding the group of pixels of index i with the quantization parameter A. b. Determining the maximum value of the rate-distortion parameters associated with each of the groups of pixels; c. Decreasing the quantization parameter associated with the group of pixels of index i0 having the maximum rate-distortion parameter, referred to as the identified group, by a value m, the quantization parameters associated with each of the groups of pixels other than the identified group remaining unchanged, m being a predetermined integer; d. Calculating the difference between the number of bits necessary for coding the identified group with the quantization parameter of the identified group as calculated in step c and the number of bits necessary for coding the identified group with the quantization parameter of the group identified before step c, this difference being referred to as the number of supplementary bits; e. Subtracting the number of supplementary bits from the difference in bits (Csetpoint - Cmin); f. For the identified group, recalculating the rate-distortion value according to the following formula:

λ(i0, k+1) = [D(i0, QP(i0,k)) - D(i0, QP(i0,k+1))] / [R(i0, QP(i0,k+1)) - R(i0, QP(i0,k))]

where: - D(i0,A) is the perceptual distortion value corresponding to the coding of the identified group with the quantization parameter A; - R(i0,A) is the number of bits necessary for coding the identified group with the quantization parameter A; and
- QP(i0,k) is the parameter associated with the identified group at the preceding iteration k and QP(i0,k+1) is the quantization parameter as calculated at the iteration k+1. g. Returning to step b if the difference in bits (Csetpoint - Cmin) is positive, otherwise, for each of the groups of pixels, assigning the value of the quantization parameter resulting from step c to the final quantization parameter (qi*) of this group.
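The loop in steps b to g lends itself to a compact sketch. The one below is illustrative only: R[i][qp] and D[i][qp] stand for the per-group rate and perceptual-distortion look-up tables introduced earlier, assumed precomputed and strictly monotone over the QP range visited, and bounds checks on the QP interval are omitted for brevity.

```python
def reallocate_bits(q_max, R, D, budget, m=1):
    """Greedy reallocation of the spare bits (steps b to g above).

    q_max[i] is the preliminary quantization parameter of the group of
    index i; R[i][qp] and D[i][qp] are its rate and perceptual-distortion
    tables; budget is the difference in bits Csetpoint - Cmin.
    Returns the final quantization parameters qi*.
    """
    qp = dict(q_max)  # current QP per group, initialized to the qi max values

    def slope(i):
        # rate-distortion parameter: perceptual distortion removed per
        # extra bit spent when the QP of group i is lowered by m
        lower = qp[i] - m
        return (D[i][qp[i]] - D[i][lower]) / (R[i][lower] - R[i][qp[i]])

    while budget > 0:                               # step g
        i0 = max(qp, key=slope)                     # step b: maximum parameter
        extra = R[i0][qp[i0] - m] - R[i0][qp[i0]]   # step d: supplementary bits
        qp[i0] -= m                                 # step c: refine only that group
        budget -= extra                             # step e
    return qp
```

For brevity the slope is recomputed for every group at each pass, whereas the text recomputes it only for the identified group (step f); the two are equivalent, the latter merely cheaper.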
Advantageously, the perceptual distortion D(i,qi) associated with a group of pixels of index i, coded with the quantization parameter qi, is derived from a conventional distortion value dv(i,qi) according to one of the following formulae:
- D(i,qi) = dv(i,qi)*s(i); or
- D(i,qi) = dv(i,qi)*s^p(i), where - s(i) represents a value characterizing the perceptual interest of the group of pixels of index i;
- p is a positive integer; and
- * is the multiplication operator.
The invention also relates to a device for determining a quantization parameter for each group of pixels in an image, the quantization parameters being used for coding the image with a first number of bits (Csetpoint) corresponding to the number of bits necessary for coding the image with a setpoint quantization parameter (qsetpoint). The device comprises the following means:
- Means for calculating a preliminary quantization parameter (qi max) for each of the groups of pixels (MBi) so as to minimize the variation in reconstruction quality between the groups when the preliminary quantization parameters are used for coding the image with a second number of bits (Cmin), which is less than the first number of bits (Csetpoint); and
- Means for calculating a final quantization parameter (qi*), which is less than or equal to the preliminary quantization parameter (qi max), for each of the groups of pixels by reallocating the difference in bits (Csetpoint - Cmin) between the first and second numbers of bits to the groups of pixels as a function of their content and their perceptual interest.
Lastly, the invention relates to a computer program product which comprises program code instructions for carrying out the steps of the method when the program is run on a computer.
4. Lists of the Figures
The invention will be more clearly understood and illustrated by means of entirely non-limiting examples of advantageous embodiments and implementations with reference to the appended figures, in which: - Figure 1 illustrates a method for quantization parameter determination according to the invention;
- Figure 2 illustrates a rate-distortion curve associated with a group of pixels of index i;
- Figure 3 represents two histograms of reconstruction quality at two different iterations of the method according to the invention; and
- Figure 4 illustrates a device according to the invention.
5. Detailed Description of the Invention
The invention relates to a method for determining quantization parameters in an image, which uses information about the content of this image, more precisely information such as for example salience values characterizing the perceptual interest of the regions or groups of pixels (for example a block or macroblock) in the image. An image comprises pixels, with each of which at least one luminance value is associated. The method may be applied to a single image or a sequence of a plurality of images. Each image is divided into K groups of pixels MBi, i ∈ [1,K]. Each group of pixels may be a macroblock or, more generally, a block of I pixels by J pixels. Groups of pixels with any shape may equally be envisaged.

The invention requires knowledge of the reconstruction quality of an image or an MBi. To this end, a plurality of reconstruction quality metrics may be used in order to estimate the distortion between a source image (or respectively a source MBi) and the corresponding reconstructed image (or respectively the corresponding reconstructed MBi), i.e. the source image coded with a given quantization parameter then decoded (or respectively the coded then decoded MBi). Among the techniques used for calculating a conventional distortion, the "sum of square errors" (SSE) is defined for an image (or respectively for an MBi) as the sum over this image (or respectively over this MBi) of the squared differences between the luminance value associated with the pixel in the original image and the luminance value associated with the pixel having the same coordinates in the reconstructed image. Another calculation technique (MSE, "mean square error") is defined as being equal to the SSE divided by the number of samples used (i.e. the number of pixels).

The invention also uses perceptual distortions, i.e. ones which take into account information about the content of the image and more precisely the perceptual interest of the MBi in the image. These various perceptual distortions can be derived from the conventional distortions according to various formulae. For instance, a perceptual distortion for an MBi referenced D(i,qi) may be derived by the following formulae from a conventional distortion referenced dv(i,qi):
D(i,qi) = dv(i,qi)*s(i) or D(i,qi) = dv(i,qi)*s^p(i), where: - s(i) represents a value lying between 0 and 1 characterizing the perceptual interest of the MBi;
- p is a positive integer (for example p=2);
- qi is the quantization parameter used for coding the image; and
- * is the multiplication operator.
These functions are examples, and other functions may be used. For an MBi with a value s(i) equal to 1, the conventional distortion dv(i) over this group of pixels is conserved. For an MBi having a low value s(i), the conventional distortion is greatly reduced. Advantageously, this value s(i) is a salience value. In this case, a salience map is calculated for an image. A salience map is a two-dimensional topographical representation of the degree of salience of each pixel in the image. This map is normalized for example between 0 and 1, although it may also be normalized between 0 and 255. The salience map therefore provides a salience value S(x,y) per pixel (where (x,y) are the coordinates of a pixel in the image) which characterizes the perceptual interest of this image. The higher the value of S(x,y) is, the more pertinent the pixel of coordinates (x,y) is from a perceptual point of view. In order to obtain a salience value s(i) per MBi, for example, the mean value of the salience values S(x,y) associated with each of the pixels of an MBi is calculated. The median value may also be used instead of the mean value in order to represent the MBi. A salience map associated with a given image may be obtained by the method comprising the following steps:
- projection of the image in a psycho-visual colour space according to the luminance component in the case of a monochromatic image, and according to the luminance component and according to each of the chrominance components in the case of a colour image; it will be assumed below that the image being processed is a colour image;
- perceptual decomposition of the projected components (one luminance component and two chrominance components) into sub-bands in the frequency domain according to a human eye visibility threshold; the sub-bands are obtained by dividing up the frequency domain according to the radial spatial frequency and the orientation (angular selectivity); each sub-band may be considered as the neuronal image corresponding to a population of visual cells attuned to an interval of spatial frequencies and a particular orientation;
- extraction of salient elements of the sub-bands relating to the luminance component and relating to each of the chrominance components, i.e. the most important information in the sub-bands; - improvement of the contours of the salient elements in each sub-band relating to the luminance component and relating to each of the chrominance components;
- calculation of a salience map for the luminance on the basis of the improved contours of the salient elements of each sub-band relating to the luminance component;
- calculation of a salience map for each of the chrominance components on the basis of the improved contours of the salient elements of each sub-band relating to the chrominance components; and
- generation of a final salience map on the basis of the luminance and chrominance salience maps. This method is described in European Patent Application EP 1 544 792 published in June 2005. The article by O. Le Meur et al. entitled "Performance assessment of a visual attention system entirely based on a human vision modeling" and published in the ICIP conference proceedings of October 2004 also gives details of the salience model. Other methods may be used for characterizing the perceptual interest of an MBi.
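As a minimal, self-contained sketch of the quantities just defined (the SSE and MSE metrics, the per-macroblock salience s(i) pooled from the salience map, and the perceptual distortion D(i,qi)); array shapes and function names here are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def sse(source: np.ndarray, recon: np.ndarray) -> float:
    """Sum of square errors over two luminance arrays of identical shape."""
    diff = source.astype(np.float64) - recon.astype(np.float64)
    return float(np.sum(diff * diff))

def mse(source: np.ndarray, recon: np.ndarray) -> float:
    """Mean square error: SSE divided by the number of pixels."""
    return sse(source, recon) / source.size

def block_salience(salience_map: np.ndarray, y: int, x: int, size: int = 16,
                   use_median: bool = False) -> float:
    """Pool the per-pixel saliences S(x,y) of one macroblock into s(i).

    salience_map is assumed normalized to [0, 1]; (y, x) is the top-left
    corner of the block and size its side length (16 for a macroblock).
    """
    block = salience_map[y:y + size, x:x + size]
    return float(np.median(block) if use_median else np.mean(block))

def perceptual_distortion(dv: float, s: float, p: int = 1) -> float:
    """D(i,qi) = dv(i,qi) * s(i)**p; p = 1 gives the first formula above."""
    return dv * s ** p
```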
According to a preferred embodiment, which is illustrated by Figure 1, the method is divided into the two steps referenced 10 and 20. The modules represented in Figure 1 are functional units, which may or may not correspond to physically distinguishable units. For example, these modules or some of them may be grouped into a single component or constitute functionalities of the same software. Conversely, some modules may optionally be composed of separate physical entities.

The object of the method is to achieve a satisfactory compromise between the perceived reconstruction quality of the regions of interest in relation to the regions of non-interest in the image so as to improve the overall perceived reconstruction quality, without introducing other defects such as, for example, spatio-temporal defects in the case of a sequence of images. To this end, steps 10 and 20 of the method consist in distributing a setpoint number of bits Csetpoint between the groups of pixels MBi in an image, referred to as the current image, as a function of their perceptual interest, for example characterized by a salience value s(i), and optionally by using rate-distortion curves associated with each MBi. More precisely, they consist in associating a final quantization parameter qi* with each group of pixels MBi in the current image. In the case of an image sequence, steps 10 and 20 may be applied successively to all the images in the sequence.

The number Csetpoint is an input parameter of the method, and corresponds to the number of bits allocated to the current image for coding it. This number may, for example, be provided by the user of the method as a function of the application. In the case of a sequence of images, this number of bits Csetpoint may also be determined by a conventional rate control method such as that defined in document ISO/IEC JTC1/SC29/WG11, Test Model 5, 1993. This number may vary in particular as a function of the type of current image (for example intra image, predicted image). Specifically, a larger number of bits is necessary for coding an intra type image (i.e. an image in a sequence of images which is coded without reference to the other images in the sequence) than for a coded image of the predicted type (i.e. an image in a sequence of images which is coded with reference to another image in the sequence).

The setpoint number of bits Csetpoint corresponds to the number of bits necessary for coding the current image with a unique quantization parameter qsetpoint or with a different parameter for each MBi. For the sake of clarity, a single parameter qsetpoint will be used here for describing the invention. The value of qsetpoint referenced in Figure 2 may be provided directly by the rate control method indicated above, or it may be determined on the basis of the value of Csetpoint and a rate-distortion curve associated with the current image, such as the one represented in Figure 2 for an MBi. The generation of a rate-distortion curve consists in associating a distortion value and a coding cost (i.e. a number of bits) with each quantization parameter in a given interval (for example [0-31] for MPEG-2 and [0-51] for MPEG-4 AVC) specified, for example, by the coding standard. Such a curve may be associated with an image or with an MBi. A rate-distortion curve may be provided by external means or alternatively generated as follows for an MBi. The same technique can be used for generating the rate-distortion curve associated with an image.
One technique for calculating the points 30 of the rate-distortion curve consists in coding each MBi with a plurality of quantization parameters (for example 1, 2, ..., qi, qi+1, qi+2, ...) and in decoding it in order to generate a set of points 30. A quantization parameter qi, a coding cost R(i,qi) and a conventional distortion value dv(i,qi) correspond to each point 30. The coding cost R(i,qi) represents the number of bits necessary for coding an MBi by using the quantization parameter qi. As indicated previously, the value dv(i,qi) is obtained by coding the MBi with the quantization parameter qi, decoding it and calculating the conventional distortion between this reconstructed MBi and the source MBi. In order to avoid too many coding operations, it is possible to code each MBi with a reduced number of quantization parameters, for example one out of every two (i.e. 2, 4, ..., qi, qi+2, qi+4, ...). The total curve as illustrated in Figure 2 is then interpolated between the calculated points 30, for example by using a cubic interpolation or by spline curves. It is also feasible to construct only a portion of the curve around the quantization parameter qsetpoint. Such a method for constructing this curve consists in using the statistical properties of the images. Specifically, the images are generally modelled by a Gaussian model whose various parameters (i.e. mean, variance) are estimated directly on the basis of the current image or the images in the sequence. Whatever the way in which the data have been obtained, they may be stored in correspondence tables ("look-up tables"), one per group of pixels MBi, which associate a conventional distortion value dv(i,qi) and a number of bits R(i,qi) with each quantization parameter qi.
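A sketch of such a correspondence table and of the fill-in between the coded points follows. Here code_mb is a hypothetical stand-in for a real codec round trip (nothing of the sort is named in the text), and np.interp performs a piecewise-linear fill-in where the text suggests cubic or spline interpolation:

```python
import numpy as np

def build_rd_table(mb, code_mb, qps):
    """Build the per-MB correspondence table qp -> (R(i,qp), dv(i,qp)).

    mb is the source macroblock as a numpy array; code_mb(mb, qp) is a
    hypothetical codec round trip returning (bits, reconstructed_mb);
    qps is the reduced set of QPs actually coded (e.g. 2, 4, 6, ...).
    """
    table = {}
    for qp in qps:
        bits, recon = code_mb(mb, qp)
        diff = mb.astype(np.float64) - recon.astype(np.float64)
        table[qp] = (bits, float(np.sum(diff * diff)))  # (rate, SSE distortion)
    return table

def fill_in_rates(table, qp_full):
    """Interpolate the rate over the full QP interval from the coded samples."""
    qps = sorted(table)
    rates = [table[q][0] for q in qps]
    return dict(zip(qp_full, np.interp(qp_full, qps, rates)))
```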
The input data may be provided to the method of the invention in the form of data files.
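For instance, once such data are available, the determination of qsetpoint from Csetpoint and an image-level rate-distortion curve mentioned above can be sketched as follows. This is a minimal sketch assuming the curve is provided as a mapping from each quantization parameter to the coding cost of the whole image; the function name and the toy curve are illustrative only and not part of the invention.

```python
# Hypothetical helper: pick the finest QP whose coding cost fits the budget.
def derive_setpoint_qp(rate_curve: dict, c_setpoint: int) -> int:
    """rate_curve maps QP -> bits; a smaller QP means finer coding, more bits."""
    for qp in sorted(rate_curve):          # ascending QP = descending cost
        if rate_curve[qp] <= c_setpoint:
            return qp                      # finest QP within the budget
    return max(rate_curve)                 # even the coarsest QP overshoots

# Toy rate curve over an MPEG-2-style QP interval [0-31]
toy_curve = {q: 200_000 // (q + 1) for q in range(32)}
print(derive_setpoint_qp(toy_curve, 25_000))   # -> 7
```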
Referring again to Figure 1, step 10 consists in calculating a preliminary quantization parameter q̂i for each MBi so as to minimize the variation in reconstruction quality around an average reconstruction quality. It is carried out in four sub-steps: one initialization sub-step and three sub-steps applied iteratively until a first termination criterion is met. To this end, an initial quantization parameter qinit is determined, which is uniform over all of the image and greater than the setpoint quantization parameter qsetpoint. For example, qinit is equal to qsetpoint + T (with, for example, T=3). As a variant, a starting setpoint Cinit is determined on the basis of Csetpoint and other parameters such as, for example, the resolution of the images in the sequence and/or meta-data and/or the spatio-temporal activity of the images. A quantization parameter qinit is then derived from the value of Cinit and from the rate-distortion curve associated with the current image. The value Cinit corresponds to the number of bits used for coding the current image with the quantization parameter qinit. The value of Cinit, which is less than the value of Csetpoint, may also be set empirically to half the value of Csetpoint. For each MBi in the current image, the initialization sub-step consists in calculating the conventional (i.e. non-perceptual) distortion dv(i,qinit) associated with this MBi coded with the quantization parameter qinit. The mean value d̄v of the conventional distortion, as well as its variance σv², are calculated over the current image in question according to the following formulae:
d̄v = (1/K) Σi dv(i,qinit)   and   σv² = (1/K) Σi (dv(i,qinit) - d̄v)²,

where K is the number of groups of pixels MBi in the current image.
The values d̄v and σv² can thus be calculated directly on the basis of the current source image and the current reconstructed image. The second sub-step consists in identifying a first set of groups of pixels corresponding to the N groups of pixels MBi having the greatest conventional distortion values, said first set being referenced ES1 in Figure 3, and a second set of groups of pixels corresponding to the N groups of pixels MBi having the smallest conventional distortion values, said second set being referenced ES2 in Figure 3. N is defined for example by the formula N = E[M*K], where E[.] is the integer part function, * is the multiplication operator and M is a number lying between 0 and 1. A value of M=0.1 seems well suited. The third sub-step consists in decreasing the quantization parameters associated with the groups of pixels MBi of the first set ES1 by a value n in order to increase their reconstruction quality, and in increasing the quantization parameters associated with the groups of pixels MBi of the second set ES2 by a value n in order to decrease their reconstruction quality, n being a predetermined integer. A value of n equal to 1 seems well suited. The other MBi keep the same quantization parameter. The last sub-step consists in recalculating the mean value d̄v of the conventional distortion of the current image, as well as its variance σv². If the absolute value of the difference between the variance value calculated at the preceding iteration and the current value is less than a threshold ε (for example, ε = 10⁻⁶), the distribution of bits is terminated. Otherwise, the method returns to the second sub-step (identification of the two sets) in order to continue the distribution.
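A minimal sketch of this step 10 is given below, assuming the look-up tables described above are available as dv[i][qp] (conventional distortion of MBi coded at qp). Function and parameter names are illustrative, and the clamping of the quantization parameters to [qp_min, qp_max] is an added practical assumption.

```python
import numpy as np

def homogenise_quality(dv, q_setpoint, K, M=0.1, n=1, T=3, eps=1e-6,
                       qp_min=0, qp_max=31):
    """Step 10: return one preliminary quantization parameter per macroblock."""
    qp = np.full(K, min(q_setpoint + T, qp_max))    # q_init = q_setpoint + T
    N = int(M * K)                                  # N = E[M * K]
    prev_var = None
    while True:
        d = np.array([dv[i][qp[i]] for i in range(K)])
        var = d.var()                               # sigma_v^2 over the image
        if prev_var is not None and abs(var - prev_var) < eps:
            return qp                               # preliminary QPs, one per MBi
        prev_var = var
        order = np.argsort(d)                       # ascending distortion
        qp[order[-N:]] = np.maximum(qp[order[-N:]] - n, qp_min)  # ES1: finer coding
        qp[order[:N]] = np.minimum(qp[order[:N]] + n, qp_max)    # ES2: coarser coding
```

C_min then follows directly from the rate tables as sum(R[i][qp[i]] for i in range(K)).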
For the current image, this step 10 makes it possible to obtain a reconstruction quality which is lower than the setpoint reconstruction quality but more homogeneous. Figure 3 represents two histograms of reconstruction quality at two different iterations of step 10. At the second iteration, the reconstruction quality of the macroblocks belonging to the first set ES1 has increased and the reconstruction quality of the macroblocks belonging to the second set ES2 has decreased, so as to approach the average reconstruction quality. The setpoint reconstruction quality is the reconstruction quality calculated between the current source image and the current reconstructed image, i.e. the current source image coded with the quantization parameter qsetpoint then decoded. Specifically, the overall quality of an image is maximal when the local quality is uniform, and it drops sharply when the quality drops locally. This step makes it possible for a preliminary quantization parameter q̂i, which corresponds to the last quantization parameter calculated, to be associated with each MBi in the current image. A new rate Cmin is calculated, which takes into account the preliminary quantization parameters associated with each of the MBi:

Cmin = Σi R(i, q̂i), the sum being taken over the K groups of pixels MBi.
Step 20 consists in calculating a final quantization parameter q* for each MBi by reallocating the remaining bits ΔC, i.e. the difference in bits between Csetpoint and Cmin, as a function in particular of the perceptual interest of the MBi, a greater number of bits being reallocated to the MBi whose perceptual interest is highest. The reallocation of bits is carried out according to three sub-steps: one initialization sub-step and two sub-steps applied iteratively until a second termination criterion is met.
The first sub-step, referred to as initialization, consists in calculating an initial rate-distortion parameter λ(i,0) for each MBi in the following way, on the basis of the rate-distortion curves calculated previously and the salience maps:
λ(i,0) = [D(i, q̂i) - D(i, q̂i - 1)] / [R(i, q̂i - 1) - R(i, q̂i)],

where λ(i,k) represents the slope of the rate-perceptual distortion curve at a given point on this curve as calculated at iteration k. The rate-perceptual distortion curve is derived directly from the rate-distortion curve provided to the method as input and from one of the formulae adopted for calculating a perceptual distortion (for example D(i) = dv(i) * s(i)). The higher the parameter λ(i,k), the more strongly the distortion decreases for a small extra cost in bits. Let QP(i,k) be the quantization parameter associated with MBi at iteration k. During an iteration k, the second sub-step consists in determining the maximum value λmax(k) among all the parameters λ(i,k) calculated: λmax(k) = maxi λ(i,k). A quantization parameter reduced by an integer value m in relation to the preceding iteration is associated with the group of pixels MBi0 of index i0 corresponding to λmax(k), i.e. QP(i0, k+1) = QP(i0, k) - m.
Preferably, m is equal to 1. The other MBi, i ≠ i0, keep their quantization parameter, i.e. QP(i, k+1) = QP(i, k). Furthermore, the number of bits to be reallocated is updated in the following way: ΔC = ΔC - (R(i0, QP(i0, k+1)) - R(i0, QP(i0, k))).
During iteration k, the third sub-step consists in recalculating the rate-distortion parameter associated with the MBi0 whose quantization parameter has just been modified, in the following way:

λ(i0, k+1) = [D(i0, QP(i0, k)) - D(i0, QP(i0, k+1))] / [R(i0, QP(i0, k+1)) - R(i0, QP(i0, k))].

The rate-distortion parameters associated with the other MBi, i ≠ i0, remain unchanged, i.e. λ(i, k+1) = λ(i, k). So long as ΔC is positive, the method returns to the second sub-step. This step 20 makes it possible for a final quantization parameter q*, which corresponds to the last quantization parameter calculated, to be associated with each MBi in the current image.
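The greedy reallocation of step 20 may be sketched as follows, reusing the hypothetical look-up tables R[i][qp] and dv[i][qp] and a salience value s[i] per macroblock, and adopting D(i,q) = dv(i,q) * s(i), one of the perceptual distortion formulae proposed above. The heap-based bookkeeping is an implementation convenience, not part of the described method.

```python
import heapq

def reallocate_bits(R, dv, s, qp, dC, m=1, qp_min=0):
    """Step 20: spend the remaining dC bits on the MBs with the steepest lambda."""
    def D(i, q):
        return dv[i][q] * s[i]                      # perceptual distortion

    def slope(i):                                   # lambda(i, k)
        q = qp[i]
        if q - m < qp_min:
            return None                             # cannot be refined further
        return (D(i, q) - D(i, q - m)) / (R[i][q - m] - R[i][q])

    # heapq is a min-heap, so store -lambda in order to pop the maximum first.
    heap = [(-lam, i) for i in range(len(qp)) if (lam := slope(i)) is not None]
    heapq.heapify(heap)
    while dC > 0 and heap:
        _, i0 = heapq.heappop(heap)                 # index of lambda_max(k)
        dC -= R[i0][qp[i0] - m] - R[i0][qp[i0]]     # supplementary bits spent
        qp[i0] -= m                                 # QP(i0, k+1) = QP(i0, k) - m
        if (lam := slope(i0)) is not None:          # recompute lambda(i0, k+1)
            heapq.heappush(heap, (-lam, i0))
    return qp                                       # final QPs q*
```

Since only the modified macroblock's slope changes at each iteration, the heap keeps the selection of λmax(k) at O(log K) per iteration instead of a full linear scan over the K macroblocks.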
The present invention also relates to a device, referenced 40 in Figure 4, which implements the method previously described. Only the essential elements of the device are represented in Figure 4. The device 40 comprises in particular a random-access memory 42 (RAM or similar component), a read-only memory 43 (hard disk or similar component), a processing unit 44 such as a microprocessor or a similar component, an input/output interface 45 and a man-machine interface 46. These elements are connected together by an address and data bus 41. The read-only memory 43 contains in particular the algorithms carrying out steps 10 and 20 of the method according to the invention. It may also contain the algorithms for obtaining the input parameters of the method such as, for example, a rate control algorithm, an algorithm for generating the salience maps as well as an algorithm for coding/decoding the images. Upon start-up, the processing unit 44 loads and executes the instructions of these algorithms. The random-access memory 42 comprises in particular the operating programs of the processing unit 44, which are loaded upon start-up of the apparatus, as well as the images to be processed. The purpose of the input/output interface 45 is to receive the input signal (i.e. the sequence of source images and optionally the input parameters such as the setpoint number of bits Csetpoint, the associated quantization parameter qsetpoint, the salience maps, the rate-distortion curves) and to deliver the quantization parameters determined according to steps 10 and 20 of the method of the invention. The man-machine interface 46 of the device allows the user to interrupt the processing. The results of the determination of the quantization parameters in each image are stored in random-access memory then transferred to read-only memory in order to be archived with a view to subsequent processing operations, for example coding the images with these quantization parameters. The man-machine interface 46 comprises in particular a control panel and a display screen.
Of course, the invention is not limited to the exemplary embodiments mentioned above. In particular, the person skilled in the art may apply any variant to the embodiments explained and combine them in order to benefit from their various advantages. For example, perceptual distortion metrics other than those described previously may be used. Likewise, other methods may be used in order to determine the rate-distortion curves associated with each of the groups of pixels MBi. Instead of determining the values Csetpoint and Cinit directly, for example by using a rate control method, it is moreover possible to use directly quantization parameters qsetpoint and qinit proposed, for example, by a user as a function of the application. The values Csetpoint and Cinit then correspond to the number of bits used for coding the current image with qsetpoint and qinit respectively. According to the invention, furthermore, it is not necessary to construct the rate-distortion curves beforehand. In fact, a group of pixels MBi may be coded with a given quantization parameter whenever it is necessary to know the number of bits required for coding this MBi with the given quantization parameter and the associated distortion. The input data of the method according to the invention, i.e. the setpoint rate Csetpoint, optionally qsetpoint, the salience maps and optionally the rate-distortion curves, may be provided by methods other than those described previously.

Claims
1. Method for determining a quantization parameter for each group of pixels in an image, the said quantization parameters being used for coding the said image with a first number of bits (Csetpoint) corresponding to the number of bits necessary for coding the said image with a setpoint quantization parameter (qsetpoint), characterized in that it comprises the following steps: - Calculating (10) a preliminary quantization parameter (q̂i) for each of the said groups of pixels (MBi) so as to minimize the variation in reconstruction quality between the said groups when the said preliminary quantization parameters are used for coding the said image with a second number of bits (Cmin), which is less than the first number of bits (Csetpoint); and
- Calculating (20) a final quantization parameter (q*), which is less than or equal to the preliminary quantization parameter (q̂i), for each of the said groups of pixels by reallocating the difference in bits (Csetpoint - Cmin) between the first and second numbers of bits to the said groups of pixels as a function of their content and their perceptual interest.
2. Method according to Claim 1, characterized in that the said difference in bits (Csetpoint - Cmin) between the first and second numbers of bits is reallocated to the said groups of pixels proportionally to their perceptual interest.
3. Method according to one of Claims 1 and 2, characterized in that the perceptual interest of a group of pixels is characterized by a salience value calculated for this group of pixels.
4. Method according to one of Claims 1 to 3, characterized in that the step of calculating the said preliminary quantization parameter (q̂i) is preceded by a step of associating a set of points (30) with each of the said groups of pixels, each point comprising a quantization parameter value, a number of bits necessary for coding the said group of pixels with the said quantization parameter and an associated distortion value.
5. Method according to one of Claims 1 to 4, characterized in that the step of calculating the said preliminary quantization parameters (q̂i) comprises the following steps:
a. For each of the said groups of pixels (MBi), calculating a distortion value (dv(i,qinit)) corresponding to the coding of the said group of pixels with an initial quantization parameter (qinit), which is greater than the setpoint quantization parameter (qsetpoint);
b. For the said image, calculating a current variance value (σv²) of the distortion corresponding to the coding of the said groups of pixels in the said image with the said initial quantization parameter (qinit);
c. Identifying a first set of groups of pixels (ES1) corresponding to the N groups of pixels having the largest distortion values and a second set of groups of pixels (ES2) corresponding to the N groups of pixels having the smallest distortion values, N being a predetermined integer;
d. Decreasing the quantization parameters associated with the said groups of pixels of the said first set (ES1) by a value n and increasing the quantization parameters associated with the said groups of pixels of the said second set (ES2) by a value n, the quantization parameters associated with each of the groups of pixels other than those belonging to the said first and second sets remaining unchanged, n being a predetermined integer;
e. For the said image, recalculating a new variance value (σv²) of the distortion corresponding to the coding of the said groups of pixels in the said image with the quantization parameters resulting from step d, the current variance value becoming a preceding variance value and the new variance value becoming the current variance value; and
f. Returning to step c if the absolute value of the difference between the current variance value and the preceding variance value is greater than a threshold (ε); otherwise, for each of the said groups of pixels, assigning the value of the quantization parameter resulting from step d to the preliminary quantization parameter (q̂i) of this group.
6. Method according to Claim 5, characterized in that the integer N is the integer part of the product M times K, where K is the number of groups of pixels in the said image and M is a number lying between 0 and 1.
7. Method according to Claim 6, characterized in that M=0.1, n=1 and ε=10⁻⁶.
8. Method according to one of Claims 1 to 7, characterized in that the step of calculating the said final quantization parameters (q*) comprises the following steps:
a. Calculating a parameter λ(i,0), referred to as the initial rate-distortion parameter, for each of the said groups of pixels (MBi) according to the following formula:
λ(i,0) = [D(i, q̂i) - D(i, q̂i - 1)] / [R(i, q̂i - 1) - R(i, q̂i)]
where: - q̂i is the preliminary quantization parameter associated with the pixel group (MBi) of index i;
- D(i,A) is a perceptual distortion value corresponding to the coding of the said group of pixels MBi with the quantization parameter A; and
- R(i,A) is the number of bits necessary for coding the said group of pixels of index i with the quantization parameter A.
b. Determining the maximum value of the said rate-distortion parameters associated with each of the said groups of pixels;
c. Decreasing the quantization parameter associated with the group of pixels of index i0 having the said maximum rate-distortion parameter, referred to as the identified group, by a value m, the quantization parameters associated with each of the groups of pixels other than the identified group remaining unchanged, m being a predetermined integer;
d. Calculating the difference between the number of bits necessary for coding the said identified group with the quantization parameter of the identified group as calculated in step c and the number of bits necessary for coding the identified group with the quantization parameter of the group identified before step c, this difference being referred to as the number of supplementary bits;
e. Subtracting the said number of supplementary bits from the said difference in bits (Csetpoint - Cmin);
f. For the identified group, recalculating the said rate-distortion parameter according to the following formula:
λ(i0, k+1) = [D(i0, QP(i0, k)) - D(i0, QP(i0, k+1))] / [R(i0, QP(i0, k+1)) - R(i0, QP(i0, k))]
where: - D(i0,A) is the perceptual distortion value corresponding to the coding of the said identified group with the quantization parameter A;
- R(i0,A) is the number of bits necessary for coding the said identified group with the quantization parameter A; and
- QP(i0,k) is the quantization parameter associated with the said identified group at iteration k and QP(i0,k+1) is the quantization parameter as calculated at iteration k+1.
g. Returning to step b if the said difference in bits (Csetpoint - Cmin) is positive; otherwise, for each of the said groups of pixels, assigning the value of the quantization parameter resulting from step c to the final quantization parameter (q*) of this group.
9. Method according to Claim 8, characterized in that the perceptual distortion D(i,qi) associated with a group of pixels of index i, coded with the quantization parameter qi, is derived from a conventional distortion value dv(i,qi) according to one of the following formulae:
- D(i,qi) = dv(i,qi) *s(i); or
- D(i,qi) = dv(i,qi) * s^p(i),
where: - s(i) represents a value characterizing the perceptual interest of the said group of pixels of index i; - p is a positive integer; and
- * is the multiplication operator.
10. Method according to Claim 9, characterized in that m=1 and p=2.
11. Device for determining a quantization parameter for each group of pixels in an image, the said quantization parameters being used for coding the said image with a first number of bits (Csetpoint) corresponding to the number of bits necessary for coding the said image with a setpoint quantization parameter (qsetpoint), characterized in that it comprises the following means:
- Means for calculating (40) a preliminary quantization parameter (q̂i) for each of the said groups of pixels (MBi) so as to minimize the variation in reconstruction quality between the said groups when the said preliminary quantization parameters are used for coding the said image with a second number of bits (Cmin), which is less than the first number of bits (Csetpoint); and
- Means for calculating (40) a final quantization parameter (q*), which is less than or equal to the preliminary quantization parameter (q̂i), for each of the said groups of pixels by reallocating the difference in bits (Csetpoint - Cmin) between the first and second numbers of bits to the said groups of pixels as a function of their content and their perceptual interest.
12. Computer program product, characterized in that it comprises program code instructions for carrying out the steps of the method according to one of Claims 1 to 10, when the said program is run on a computer.
PCT/EP2006/064393 2005-07-28 2006-07-19 Method and device for determining quantization parameters in an image WO2007014850A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR0552345A FR2889381A1 (en) 2005-07-28 2005-07-28 Quantization parameter determining method for coding image in video conference application, involves calculating quantization parameter for each group of pixels in image to minimize variation in reconstruction quality between groups
FR0552345 2005-07-28

Publications (2)

Publication Number Publication Date
WO2007014850A2 true WO2007014850A2 (en) 2007-02-08
WO2007014850A3 WO2007014850A3 (en) 2007-04-12

Family

ID=36177783

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2006/064393 WO2007014850A2 (en) 2005-07-28 2006-07-19 Method and device for determining quantization parameters in an image

Country Status (2)

Country Link
FR (1) FR2889381A1 (en)
WO (1) WO2007014850A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014521272A (en) * 2011-07-19 2014-08-25 トムソン ライセンシング Method and apparatus for reframing and encoding video signals

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2963190B1 (en) 2010-07-23 2013-04-26 Canon Kk METHOD AND DEVICE FOR ENCODING AN IMAGE SEQUENCE

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997037322A1 (en) * 1996-03-29 1997-10-09 Sarnoff Corporation Apparatus and method for optimizing encoding and performing automated steerable image compression using a perceptual metric
US5754236A (en) * 1995-05-29 1998-05-19 Samsung Electronics Co., Ltd. Variable bit rate coding using the BFOS algorithm
US5819004A (en) * 1995-05-08 1998-10-06 Kabushiki Kaisha Toshiba Method and system for a user to manually alter the quality of previously encoded video frames
WO2003084240A1 (en) * 2002-03-28 2003-10-09 Koninklijke Philips Electronics N.V. Image coding using quantizer scale selection
US20040184535A1 (en) * 1997-03-14 2004-09-23 Microsoft Corporation Motion video signal encoder and encoding method
US6834080B1 (en) * 2000-09-05 2004-12-21 Kabushiki Kaisha Toshiba Video encoding method and video encoding apparatus



Also Published As

Publication number Publication date
WO2007014850A3 (en) 2007-04-12
FR2889381A1 (en) 2007-02-02


Legal Events

Date Code Title Description
NENP Non-entry into the national phase in: Ref country code: DE
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 06792522; Country of ref document: EP; Kind code of ref document: A2)
122 Ep: pct application non-entry in european phase (Ref document number: 06792522; Country of ref document: EP; Kind code of ref document: A2)