WO2008145560A1 - Method for selecting a coding data and coding device implementing said method - Google Patents


Publication number
WO2008145560A1
WO2008145560A1 PCT/EP2008/056149 EP2008056149W WO2008145560A1 WO 2008145560 A1 WO2008145560 A1 WO 2008145560A1 EP 2008056149 W EP2008056149 W EP 2008056149W WO 2008145560 A1 WO2008145560 A1 WO 2008145560A1
Authority
WO
WIPO (PCT)
Prior art keywords
coding
picture
subset
data
coding data
Prior art date
Application number
PCT/EP2008/056149
Other languages
French (fr)
Inventor
Julien Haddad
Olivier Le Meur
Philippe Guillotel
Original Assignee
Thomson Licensing
Priority date
Filing date
Publication date
Application filed by Thomson Licensing
Publication of WO2008145560A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/109 Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/154 Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/19 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding using optimisation based on Lagrange multipliers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • the invention relates to the general domain of video coding.
  • the invention relates, more particularly, to a method for selecting a coding data from a predefined set of coding data, said coding data being associated with a picture portion with a view to its subsequent coding. It also relates to a coding device of a sequence of pictures suited to implement said selection method.
  • Video coders are known that are suitable to code pictures in INTRA mode, i.e. independently from the other pictures of the sequence and pictures in INTER mode, i.e. by temporal prediction from other pictures of the sequence, called reference pictures.
  • a picture divided into blocks of picture data e.g. luminance data
  • each block is coded in INTRA mode if the picture is of the INTRA type and in INTRA mode or INTER mode if the picture is of the INTER type.
  • the most recent video coding standards e.g. MPEG-4 AVC, define several coding modes of the INTRA type and several coding modes of the INTER type.
  • Figure 1 shows different INTER coding modes as defined in the document ISO/IEC 14496-10:2005 relating to the MPEG-4 AVC standard.
  • Such video coders are suited to select, for a current block of index i, a coding mode mode_i from a set E of K coding modes mk. They are also suitable to generate, for this current block, a prediction block according to the selected coding mode mode_i. The video coder is suitable to subtract the prediction block from the current block and to code, in the form of a stream of binary data, the residual data thus generated.
  • the coding mode mode_i is selected from the set E by means of a predefined criterion. This criterion is, for example, a bitrate/distortion type criterion.
  • the video coder calculates, for the block of index i and for each of the modes mk of the set E, a value $J_i(m_k) = D_i(m_k) + \lambda \cdot R_i(m_k)$, where $R_i(m_k)$ is the coding cost of the block of index i coded according to the mode mk and $D_i(m_k)$ is the distortion associated with the block of index i coded according to the mode mk then reconstructed.
  • the video coder selects from the set E the coding mode mode_i of the block of index i such that $mode_i = \arg\min_{m_k \in E} J_i(m_k)$.
  • this coding data is the coding mode for example. It can also be a transform type, a reference picture number, etc.
  • the purpose of the invention is to compensate for at least one disadvantage of the prior art.
  • the invention relates to a method for selecting a coding data from a predefined set of coding data, said coding data being associated with a picture portion with a view to its subsequent coding.
  • the method comprises the following steps:
  • the coding data subset is determined for the picture portion according to a predetermined value representative of the perceptual interest of the picture portion, called perceptual interest value.
  • perceptual interest value a predetermined value representative of the perceptual interest of the picture portion.
  • the coding data is a coding mode.
  • the picture portion is a picture data block.
  • the predetermined value is a saliency value associated with the picture portion.
  • the subset is equal to the set if the perceptual interest value is greater than a predetermined threshold. If the perceptual interest value of the block is less than or equal to the predetermined threshold, the subset comprises the p coding modes of the set for which the selection probability is the highest, this probability having been determined beforehand for each coding mode of the set. According to a characteristic of the invention, the subset is equal to a first subset if the perceptual interest value is greater than a predefined threshold and is equal to a second subset different from the first subset if the perceptual interest value is less than the predefined threshold.
  • the first subset is equal to the set and the second subset comprises the p coding modes of the set for which the selection probability is the highest, this probability having been determined beforehand for each coding mode of the set.
  • the invention also relates to a coding device of a sequence of pictures, each picture being divided into picture data portions.
  • the device comprises:
  • - selection means suitable to select, for each picture data portion, at least one coding data
  • the selection means comprise:
  • - means to determine, for each picture data portion, a subset of the set of coding data according to a predetermined value representative of the perceptual interest of the picture data portion, and - means to select the at least one coding data from the determined subset.
  • FIG. 1 shows different INTER coding modes according to the MPEG-4 AVC standard
  • FIG. 2 shows a selection method of a coding mode according to the invention
  • FIG. 3 illustrates a video coding device according to the invention
  • FIG. 4 illustrates a video coding device according to a variant of the invention.
  • the invention described within the framework of the MPEG-4 AVC standard can be extended to any type of standard in which the selection of a coding data must be carried out.
  • the invention described within the framework of the selection of a coding mode can be extended to the general case of the selection of a coding data within a set of predefined coding data.
  • the invention can be applied to the case of the selection of the number of reference pictures used to code a current picture of the INTER type. Likewise, it can be extended to the selection of a particular transform type.
  • the invention relates to a method for selecting, for each portion Bi of a current picture divided into N picture portions, a coding data within a predefined set E comprising K coding data.
  • the coding data is a coding mode.
  • each picture portion Bi is a picture data block. In the rest of the description, Bi is called a block.
  • the index i of the block Bi is initialised to zero.
  • a subset SEi of the set E is determined for the block Bi according to a predetermined value Si associated with the block Bi, this value being representative of the perceptual interest of the block Bi.
  • the subset SEi is equal to the set E if the value Si is greater than a predefined threshold T; otherwise, the subset SEi comprises the p most probable modes mk of the set E, with p an integer belonging to [1; K].
  • the K modes of the set E are ordered according to their selection probability, which was calculated beforehand from coding statistics on a representative number of sequences.
  • the p modes mk for which the selection probability is the highest then form the subset SEi if Si < T.
  • the p most probable modes of the set E can be determined by an analysis of the direction of the contours in the block Bi. If the contours in the block Bi are mostly oriented in the vertical direction, then the p modes closest to the vertical direction are the most probable and form the subset SEi, i.e. the vertical INTRA mode, the vertical INTRA mode to the right and the vertical INTRA mode to the left.
  • the invention is not limited by the manner in which the p most probable modes of the set E are determined.
  • the subset SEi is equal to the set E if the value Si is greater than the predefined threshold T; otherwise, the subset SEi comprises p modes mk of the set E, said p modes being selected according to the sub-block sizes that are associated with them. For example, if the current picture to which the block Bi belongs is a picture of the INTER type and the set E comprises the coding modes shown in figure 1, then if Si is less than or equal to T, the subset SEi comprises the coding modes associated with the greatest sub-block sizes, for example INTER16x16, INTER16x8 and INTER8x16. In this case, the other coding modes associated with the smaller sub-block sizes, i.e. INTER8x8, INTER8x4, INTER4x8 and INTER4x4, do not belong to the subset SEi.
  • several thresholds can be defined. For example, if the value Si is greater than a first predefined threshold T1, then the subset SEi is equal to the set E; if the value Si is less than T1 and greater than a second predefined threshold T2, then the subset SEi comprises the p most probable modes mk of the set E; and if the value Si is less than T2, then the subset SEi comprises the q most probable modes mk of the set E, with q an integer less than or equal to p.
  • the value Si is determined beforehand for the block Bi according to a method known from the prior art. Such a value Si is, for example, obtained by applying the method described in the patent application EP03293216.2 (published under the number 1544792). This method is suitable to generate a saliency map for the current picture.
  • This saliency map is a topographical representation of the degree of saliency of each pixel of the current picture. This map is standardised for example between 0 and 1 but can also be between 0 and 255.
  • the saliency map thus provides a saliency value S(x,y) per pixel (where (x,y) are the co-ordinates of a pixel of the picture), which characterizes the perceptual interest of this pixel.
  • the saliency map is generated by applying the following steps:
  • each subband can be considered as the neuronal image corresponding to a population of visual cells aligned on a spatial frequency interval and a particular orientation, - extraction of the salient elements of the subbands relating to the luminance component and relating to each of the chrominance components, i.e. the most important information of the subbands.
  • the coding mode mode_i associated with the block Bi is selected from the subset SEi according to a criterion, for example of the bitrate-distortion type.
  • the block Bi is a block of which the value Si representative of the perceptual interest of the block is less than T
  • the selection method calculates, for each of the modes mk of the subset SEi, the value $J_i(m_k) = D_i(m_k) + \lambda \cdot R_i(m_k)$.
  • the method selects from the subset SEi the coding mode mode_i of the block such that $mode_i = \arg\min_{m_k \in SE_i} J_i(m_k)$.
  • the selection of the coding mode mode_i requires less calculation.
  • the reconstruction quality can be slightly reduced for blocks with a low perceptual interest, i.e. such that Si < T, because not all the coding modes are tested for these blocks.
  • this degradation does not disturb the human eye as it is produced in the zones of the picture of the least interest for the human eye.
  • the computation resources thus saved on the blocks of which the perceptual interest is low can be advantageously used to code the zones of high perceptual interest and for increasing the reconstruction quality.
  • the human eye is less sensitive to the degradation in the zones of which the perceptual interest is low than to degradations in the zones of which the perceptual interest is greater.
  • the i index is incremented by 1.
  • i is compared with N. If i is greater than or equal to N, then the selection of the coding modes for the current picture is terminated (20); otherwise the method continues to step 12 with the next block.
  • the invention relates to a coding device 30 and 40. Only the essential elements of the invention are shown in these figures. The elements that are well known by those skilled in the art of video coders are not shown, e.g. motion estimation module, motion compensation module, etc.
  • the modules shown are functional units that may or may not correspond to physically distinguishable units. For example, these modules or some of them can be grouped together in a single component, or constitute functions of the same software. On the contrary, some modules may be composed of separate physical entities.
  • the coding device 30 comprises a first input 300, a second input 302, an output 310, a selection module 304, a coding module 306 and a memory 308.
  • the first input 300 is suitable to receive saliency values Si and the second input 302 is suitable to receive the picture data of the block Bi.
  • the selection module 304 is suitable to select, for each block Bi received from the second input 302, a coding mode mode_i according to the saliency value Si received from the first input 300.
  • the selection module 304 is suited to implement the selection method of the invention.
  • it comprises a unit 3040 suitable to determine, for the block Bi, a subset SEi of the set E according to the perceptual interest value Si of said block Bi, in accordance with step 12 of the method, and a unit 3042, connected to the unit 3040, suitable to select from the subset SEi, in accordance with step 14 of the method, the coding mode mode_i finally retained to code the block Bi subsequently.
  • the unit 3042 is suitable to calculate, for example, the bitrate-distortion type function $J_i(m_k)$ and to carry out the selection of mode_i from the calculated values.
  • the coding module 306 is suitable to code in binary form the picture data of the block Bi transmitted by the second input 302 according to the coding mode mode_i transmitted by the selection module 304, and possibly according to picture data previously coded and reconstructed by said coding module 306 and stored in a memory 308, e.g. picture data belonging to a previously coded picture (temporal prediction) or to a previously coded block of the same picture (spatial prediction).
  • the coding module 306 is linked to the output 310 of the coding device.
  • the output 310 is suitable to transmit, e.g. to a decoding device or to a broadcast network, a bitstream F representative of the picture data received on the second input 302 and coded by the coding module.
  • a variant of the coding device 30 is shown in figure 4.
  • the shared elements of the two coding devices are identified by the same numerical references.
  • the coding device 40 comprises a single input 302 suitable to receive the picture data of the blocks Bi. It further comprises a module 400 suitable to calculate, for each block Bi, a perceptual interest value Si.
  • This value Si is for example calculated according to the method described above for the selection method. In this variant, the perceptual interest values Si are calculated directly by the coding device 40 from the picture data received on the input 302.
  • the invention is not limited to the embodiment examples mentioned above.
  • the person skilled in the art may apply any variant to the stated embodiments and combine them to benefit from their various advantages.
  • the invention described for coding data of this type of coding mode can be extended to the selection of any other type of coding data, notably a number of reference pictures, a type of transform, a size of search window for motion estimation, etc.
  • in MPEG-4 AVC, it is indeed possible to select the reference picture used for the prediction of a picture data block in a set of 5 reference pictures.
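
To illustrate how a module such as the module 400 could derive one perceptual interest value Si per block, the following Python sketch reduces a per-pixel saliency map S(x,y) to one value per block, using the mean (or, optionally, the median) of the pixel values as the description suggests. It is only a sketch under stated assumptions: the saliency map is assumed to be already available as a 2D list, and the function name `block_saliency`, the `block_size` parameter and the toy values are hypothetical and do not come from the patent.

```python
from statistics import mean, median


def block_saliency(saliency_map, block_size=16, reducer=mean):
    """Return a 2D list holding one perceptual interest value per block.

    saliency_map: 2D list of per-pixel saliency values S(x, y) (assumed given).
    block_size:   side of the square blocks, in pixels (hypothetical parameter).
    reducer:      statistics.mean by default, statistics.median as an alternative.
    """
    height = len(saliency_map)
    width = len(saliency_map[0])
    values = []
    for top in range(0, height, block_size):
        row = []
        for left in range(0, width, block_size):
            # Gather the saliency values of all pixels inside the block,
            # clipping the block at the picture borders.
            pixels = [
                saliency_map[y][x]
                for y in range(top, min(top + block_size, height))
                for x in range(left, min(left + block_size, width))
            ]
            row.append(reducer(pixels))
        values.append(row)
    return values


# Toy 4x4 saliency map normalised between 0 and 1, split into 2x2 blocks.
toy_map = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.2],
]
block_values = block_saliency(toy_map, block_size=2)
```

Passing `reducer=median` gives the median-based variant; which of the two better represents a block is left open by the description.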

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention relates to a method for selecting a coding data from a predefined set (E) of coding data, said coding data being associated with a picture portion (Bi) with a view to its subsequent coding. The method comprises the following steps: determine (12) a subset (SEi) of the set (E) of coding data, and select (14) at least one coding data from the determined subset (SEi). According to an essential characteristic of the invention, the coding data subset (SEi) is determined (12) for the picture portion (Bi) according to a predetermined value (Si) representative of the perceptual interest of the picture portion (Bi), called perceptual interest value.

Description

METHOD FOR SELECTING A CODING DATA AND CODING DEVICE IMPLEMENTING SAID METHOD
1. Scope of the invention
The invention relates to the general domain of video coding.
The invention relates, more particularly, to a method for selecting a coding data from a predefined set of coding data, said coding data being associated with a picture portion with a view to its subsequent coding. It also relates to a coding device of a sequence of pictures suited to implement said selection method.
2. Prior art
Video coders are known that are suitable to code pictures in INTRA mode, i.e. independently from the other pictures of the sequence, and pictures in INTER mode, i.e. by temporal prediction from other pictures of the sequence, called reference pictures. In a picture divided into blocks of picture data (e.g. luminance data), each block is coded in INTRA mode if the picture is of the INTRA type and in INTRA mode or INTER mode if the picture is of the INTER type. The most recent video coding standards, e.g. MPEG-4 AVC, define several coding modes of the INTRA type and several coding modes of the INTER type. Figure 1 shows different INTER coding modes as defined in the document ISO/IEC 14496-10:2005 relating to the MPEG-4 AVC standard. Such video coders are suited to select, for a current block of index i, a coding mode mode_i from a set E of K coding modes mk. They are also suitable to generate, for this current block, a prediction block according to the selected coding mode mode_i. The video coder is suitable to subtract the prediction block from the current block and to code, in the form of a stream of binary data, the residual data thus generated. Generally, the coding mode mode_i is selected from the set E by means of a predefined criterion. This criterion is, for example, a bitrate/distortion type criterion. In this case, the video coder calculates, for the block of index i and for each of the modes mk of the set E, a value $J_i(m_k) = D_i(m_k) + \lambda \cdot R_i(m_k)$, where $R_i(m_k)$ is the coding cost of the block of index i coded according to the mode mk and $D_i(m_k)$ is the distortion associated with the block of index i coded according to the mode mk then reconstructed. The video coder then selects from the set E the coding mode mode_i of the block of index i such that $mode_i = \arg\min_{m_k \in E} J_i(m_k)$.
Adding new coding modes to the set E, as is the case with the MPEG-4 AVC standard with respect to the MPEG-2 standard, enables the picture data of the block of index i to be predicted more finely and thus enables the reconstruction quality of said block to be increased for a given coding cost, i.e. a given number of bits. Moreover, this enables the coding cost of said block to be reduced for a given reconstruction quality. However, the greater the number K of coding modes mk in the set E, the greater the selection time of the coding mode mode_i associated with the block of index i, as the number of values $J_i(m_k)$ to calculate is greater. More generally, it is often necessary to select a coding data from a predefined set according to a given criterion before carrying out the coding itself of the block of index i. The more elements this set comprises, the greater the selection time of the coding data. This is notably problematic for the production of a real-time coding device. As previously illustrated, this coding data is, for example, the coding mode. It can also be a transform type, a reference picture number, etc.
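
The bitrate/distortion selection described above can be sketched as follows in Python. This is a minimal illustration, not the coder's actual implementation: it assumes that the distortion and coding cost of each mode have already been measured and are passed in as dictionaries, whereas a real coder has to code and reconstruct the block in every mode to obtain them, which is precisely why the selection time grows with K. The function name `select_mode`, the parameter `lagrange_multiplier` (standing for the weighting factor in the cost) and all numeric values are hypothetical.

```python
def select_mode(modes, distortion, rate, lagrange_multiplier):
    """Return (best_mode, best_cost) minimising J(m) = D(m) + lambda * R(m).

    modes:      iterable of candidate coding modes (the set E).
    distortion: dict mapping each mode to its distortion D(m) (assumed given).
    rate:       dict mapping each mode to its coding cost R(m) in bits.
    """
    best_mode, best_cost = None, float("inf")
    for m in modes:
        # One cost evaluation per candidate mode: K evaluations for the set E.
        cost = distortion[m] + lagrange_multiplier * rate[m]
        if cost < best_cost:
            best_mode, best_cost = m, cost
    return best_mode, best_cost


# Illustrative, made-up measurements for four INTER modes of one block.
E = ["INTER16x16", "INTER16x8", "INTER8x16", "INTER8x8"]
D = {"INTER16x16": 100.0, "INTER16x8": 80.0, "INTER8x16": 85.0, "INTER8x8": 60.0}
R = {"INTER16x16": 10, "INTER16x8": 18, "INTER8x16": 17, "INTER8x8": 40}

chosen_mode, chosen_cost = select_mode(E, D, R, lagrange_multiplier=2.0)
```

With these made-up figures the costs are 120, 116, 119 and 140, so the search retains INTER16x8.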
3. Summary of the invention
The purpose of the invention is to compensate for at least one disadvantage of the prior art.
The invention relates to a method for selecting a coding data from a predefined set of coding data, said coding data being associated with a picture portion with a view to its subsequent coding. The method comprises the following steps:
- determine a subset of the coding data set, and
- select at least one coding data from the determined subset.
According to an essential characteristic of the invention, the coding data subset is determined for the picture portion according to a predetermined value representative of the perceptual interest of the picture portion, called perceptual interest value. Advantageously, by pre-selecting coding data, the invention reduces selection time of the coding data finally selected. Further, this pre-selection being carried out as a function of perceptual interest data, the reconstruction quality of the sequence is not degraded.
According to a characteristic of the invention, the coding data is a coding mode. According to another characteristic of the invention, the picture portion is a picture data block.
Advantageously, the predetermined value is a saliency value associated with the picture portion.
According to a characteristic of the invention, the subset is equal to the set if the perceptual interest value is greater than a predetermined threshold. If the perceptual interest value of the block is less than or equal to the predetermined threshold, the subset comprises the p coding modes of the set for which the selection probability is the highest, this probability having been determined beforehand for each coding mode of the set. According to a characteristic of the invention, the subset is equal to a first subset if the perceptual interest value is greater than a predefined threshold and is equal to a second subset different from the first subset if the perceptual interest value is less than the predefined threshold.
According to a particular characteristic of the invention, the first subset is equal to the set and the second subset comprises the p coding modes of the set for which the selection probability is the highest, this probability having been determined beforehand for each coding mode of the set.
The invention also relates to a coding device of a sequence of pictures, each picture being divided into picture data portions. The device comprises:
- selection means suitable to select, for each picture data portion, at least one coding data, and
- coding means suitable to code each of the picture data portions according to the coding data selected.
According to an essential characteristic of the invention, the selection means comprise:
- means to determine, for each picture data portion, a subset of the set of coding data according to a predetermined value representative of the perceptual interest of the picture data portion, and
- means to select the at least one coding data from the determined subset.
4. List of figures
The invention will be better understood and illustrated by means of embodiments and implementations, by no means limiting, with reference to the figures attached in the appendix, wherein:
- figure 1 shows different INTER coding modes according to the MPEG-4 AVC standard,
- figure 2 shows a selection method of a coding mode according to the invention,
- figure 3 illustrates a video coding device according to the invention, and
- figure 4 illustrates a video coding device according to a variant of the invention.
5. Detailed description of the invention
The invention described within the framework of the MPEG-4 AVC standard can be extended to any type of standard in which the selection of a coding data must be carried out. The invention described within the framework of the selection of a coding mode can be extended to the general case of the selection of a coding data within a set of predefined coding data. For example, the invention can be applied to the case of the selection of the number of reference pictures used to code a current picture of the INTER type. Likewise, it can be extended to the selection of a particular transform type.
With reference to figure 2, the invention relates to a method for selecting, for each portion Bi of a current picture divided into N picture portions, a coding data within a predefined set E comprising K coding data. According to a particular embodiment, the coding data is a coding mode. According to a particular characteristic of the invention, each picture portion Bi is a picture data block. In the rest of the description, Bi is called a block. In step 10, the index i of the block Bi is initialised to zero.
In step 12, a subset SEi of the set E is determined for the block Bi according to a predetermined value Si associated with the block Bi, this value being representative of the perceptual interest of the block Bi. In a particular embodiment, the subset SEi is equal to the set E if the value Si is greater than a predefined threshold T; otherwise, the subset SEi comprises the p most probable modes mk of the set E, with p an integer belonging to [1; K]. In order to determine the p most probable modes of the set E, the K modes of the set E are ordered according to their selection probability, which was calculated beforehand from coding statistics on a representative number of sequences. The p modes mk for which the selection probability is the highest then form the subset SEi if Si < T. In the particular case of the INTRA modes defined by the MPEG-4 AVC standard in section 8.3 of the document ISO/IEC 14496-10 (version 3), the p most probable modes of the set E can be determined by an analysis of the direction of the contours in the block Bi. If the contours in the block Bi are mostly oriented in the vertical direction, then the p modes closest to the vertical direction are the most probable and form the subset SEi, i.e. the vertical INTRA mode, the vertical INTRA mode to the right and the vertical INTRA mode to the left. Obviously, the invention is not limited by the manner in which the p most probable modes of the set E are determined.
According to a first variant, the subset SEi is equal to the set E if the value Si is greater than the predefined threshold T; otherwise, the subset SEi comprises p modes mk of the set E, said p modes being selected according to the sub-block sizes that are associated with them. For example, if the current picture to which the block Bi belongs is a picture of the INTER type and the set E comprises the coding modes shown in figure 1, then if Si is less than or equal to T, the subset SEi comprises the coding modes associated with the greatest sub-block sizes, for example INTER16x16, INTER16x8 and INTER8x16. In this case, the other coding modes associated with the smaller sub-block sizes, i.e. INTER8x8, INTER8x4, INTER4x8 and INTER4x4, do not belong to the subset SEi. According to a second variant, the subset SEi is equal to the set E if the value Si is greater than the predefined threshold T; otherwise, the subset SEi comprises p modes mk of the set E, said p modes being the ones that require the least calculation.
According to another variant, several thresholds can be defined. For example, if the value Si is greater than a first predefined threshold T1, then the subset SEi is equal to the set E; if the value Si is less than T1 and greater than a second predefined threshold T2, then the subset SEi comprises the p most probable modes mk of the set E; and if the value Si is less than T2, then the subset SEi comprises the q most probable modes mk of the set E, with q an integer less than or equal to p.
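
Step 12 in its two-threshold variant can be sketched as below in Python. The sketch assumes that the modes of the set E are supplied already ordered by decreasing selection probability, that T2 is lower than T1, and that a value exactly equal to a threshold falls into the smaller subset; the text does not specify the equality cases, so this last point is an assumption. The function name `determine_subset`, the ordering of the modes in `RANKED_E` and the threshold values are hypothetical.

```python
def determine_subset(saliency, ranked_modes, t1, t2, p, q):
    """Return the subset of candidate modes for a block of given saliency.

    ranked_modes: the set E ordered by decreasing selection probability
                  (assumed to have been established beforehand).
    t1, t2:       thresholds with t2 < t1.
    p, q:         subset sizes with 1 <= q <= p <= K.
    """
    if not t2 < t1:
        raise ValueError("expected t2 < t1")
    if not 1 <= q <= p <= len(ranked_modes):
        raise ValueError("expected 1 <= q <= p <= K")
    if saliency > t1:
        return list(ranked_modes)        # high interest: test the whole set E
    if saliency > t2:
        return list(ranked_modes[:p])    # medium interest: p most probable modes
    return list(ranked_modes[:q])        # low interest: q most probable modes


# Hypothetical probability ordering of the INTER modes of figure 1.
RANKED_E = ["INTER16x16", "INTER16x8", "INTER8x16", "INTER8x8",
            "INTER8x4", "INTER4x8", "INTER4x4"]

high = determine_subset(0.9, RANKED_E, t1=0.6, t2=0.3, p=3, q=1)
medium = determine_subset(0.5, RANKED_E, t1=0.6, t2=0.3, p=3, q=1)
low = determine_subset(0.1, RANKED_E, t1=0.6, t2=0.3, p=3, q=1)
```

The single-threshold embodiment is recovered by setting q equal to p, in which case every block at or below T1 receives the same p modes.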
The value Si is determined beforehand for the block Bi according to a method known from the prior art. Such a value Si is, for example, obtained by applying the method described in the patent application EP03293216.2 (published under the number 1544792). This method is suitable to generate a saliency map for the current picture. This saliency map is a topographical representation of the degree of saliency of each pixel of the current picture. This map is normalised, for example, between 0 and 1 but can also be between 0 and 255. The saliency map thus provides a saliency value S(x,y) per pixel (where (x,y) are the co-ordinates of a pixel of the picture), which characterizes the perceptual interest of this pixel. The higher the value of S(x,y), the more the pixel of co-ordinates (x,y) is relevant from a perceptual viewpoint. In order to obtain a saliency value Si per block Bi, the mean value of the saliency values S(x,y) associated with each of the pixels of Bi is calculated, for example. The median value can also be used instead of the mean value to represent the block Bi. According to this document, the saliency map is generated by applying the following steps:
- projection of the picture into a psycho-visual colour space according to the luminance component in the case of a monochrome picture and according to the luminance component and according to each one of its chrominance components in the case of a coloured picture; in the rest, it will be considered that the picture processed is a coloured picture,
- perceptual decomposition of the projected components into subbands (one luminance component and two chrominance components) in the frequency domain according to a visibility threshold of the human eye; the subbands are obtained by sharing the frequency domain according to the radial spatial frequency and the orientation (angular selectivity); each subband can be considered as the neuronal image corresponding to a population of visual cells aligned on a spatial frequency interval and a particular orientation,
- extraction of the salient elements of the subbands relating to the luminance component and relating to each of the chrominance components, i.e. the most important information of the subbands,
- improvement of the contours of the salient elements in each subband relating to the luminance component and relating to each of the chrominance components,
- calculation of a saliency map for the luminance from the improved contours of the salient elements of each subband relating to the luminance component,
- calculation of a saliency map for each of the chrominance components from the improved contours of the salient elements of each subband relating to the chrominance components, and
- generation of a final saliency map from the luminance and chrominance saliency maps.
In step 14, the coding mode mode_i associated with the block B_i is selected from the subset SE_i according to a criterion, for example of the bitrate-distortion type. Advantageously, if the block B_i is a block whose value S_i, representative of the perceptual interest of the block, is less than T, only the modes of the subset SE_i are tested. In this case, the selection method calculates, for each of the modes m_k of the subset SE_i, the value $J_i(m_k) = D_i(m_k) + \lambda \cdot R_i(m_k)$. The method then selects from the subset SE_i the coding mode mode_i of the block such that $mode_i = \arg\min_{m_k \in SE_i} J_i(m_k)$. The selection of the coding mode mode_i thus requires less calculation. The reconstruction quality can be slightly reduced for blocks with a low perceptual interest, i.e. such that S_i < T, because not all the coding modes are tested for these blocks. However, this degradation does not disturb the human eye as it is produced in the zones of the picture of the least interest for the human eye. Moreover, the computation resources thus saved on the blocks whose perceptual interest is low can advantageously be used to code the zones of high perceptual interest and to increase the reconstruction quality. Indeed, the human eye is less sensitive to degradations in the zones whose perceptual interest is low than to degradations in the zones whose perceptual interest is greater.
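The selection of step 14 amounts to minimizing a cost of the form J = D + λ·R over the modes of the subset only. The following Python sketch illustrates this under stated assumptions: the function name `select_mode` and the `distortion` and `rate` tables are hypothetical, and in a real coder the distortion and bitrate of each mode would be measured by actually coding the block.

```python
# Hypothetical sketch of step 14: bitrate-distortion selection restricted
# to the modes of the subset.

def select_mode(subset, distortion, rate, lam):
    """Return the mode of `subset` that minimizes J = D + lam * R."""
    if not subset:
        raise ValueError("the subset must contain at least one mode")
    return min(subset, key=lambda m: distortion[m] + lam * rate[m])


# Illustrative values only; they are not taken from the description.
distortion = {"INTER16x16": 120.0, "INTER16x8": 100.0, "INTER8x8": 60.0}
rate = {"INTER16x16": 10.0, "INTER16x8": 18.0, "INTER8x8": 40.0}

best_full = select_mode(list(distortion), distortion, rate, lam=1.0)
best_reduced = select_mode(["INTER16x16", "INTER16x8"], distortion, rate, lam=1.0)
```

With these illustrative values the full set yields INTER8x8, whereas the reduced subset yields INTER16x8: fewer costs are evaluated, at the price of a possibly less optimal mode for the block.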
At step 16, the i index is incremented by 1.
At step 18, i is compared with N. If i is greater than or equal to N then the selection of the coding modes for the current picture is terminated 20, otherwise the method continues to step 12 with the next block.
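The loop formed by steps 12 to 18 can be summarized by the following Python sketch. It is a simplified illustration: the single-threshold rule is the one of the first variant, the index i is assumed to start at 0, and the callable `cost(i, mode)`, which stands for the bitrate-distortion cost of coding block i with a given mode, is hypothetical.

```python
# Hypothetical sketch of the per-picture loop (steps 12 to 18).

def select_modes_for_picture(block_saliencies, all_modes, reduced_modes,
                             threshold, cost):
    """Return the selected coding mode of each of the N blocks of a picture."""
    selected = []
    n = len(block_saliencies)
    i = 0
    while i < n:  # step 18: stop as soon as i >= N
        # Step 12: determine the subset from the saliency value of block i.
        if block_saliencies[i] > threshold:
            subset = all_modes
        else:
            subset = reduced_modes
        # Step 14: retain the mode of the subset with the lowest cost.
        selected.append(min(subset, key=lambda m: cost(i, m)))
        i += 1  # step 16
    return selected
```

Only the modes of the subset are passed to `cost`, so the number of cost evaluations per block decreases for the blocks of low perceptual interest.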
With reference to figures 3 and 4, the invention relates to a coding device 30 and 40. Only the essential elements of the invention are shown in these figures. The elements that are well known to those skilled in the art of video coders are not shown, e.g. motion estimation module, motion compensation module, etc. In figures 3 and 4, the modules shown are functional units that may or may not correspond to physically distinguishable units. For example, these modules or some of them can be grouped together in a single component, or constitute functions of the same software. Conversely, some modules may be composed of separate physical entities.
With reference to figure 3, the coding device 30 comprises a first input 300, a second input 302, an output 310, a selection module 304, a coding module 306 and a memory 308. The first input 300 is suitable to receive saliency values S_i and the second input 302 is suitable to receive the picture data of the blocks B_i. The selection module 304 is suitable to select, for each block B_i received from the second input 302, a coding mode mode_i according to the saliency value S_i received from the first input 300. The selection module 304 is suitable to implement the selection method of the invention. For this purpose, it comprises a unit 3040 suitable to determine, for the block B_i, a subset SE_i of the set E according to the perceptual interest value S_i of said block B_i, in accordance with step 12 of the method, and a unit 3042, connected to the unit 3040, suitable to select from the subset SE_i, in accordance with step 14 of the method, the coding mode mode_i finally retained to code the block B_i subsequently. The unit 3042 is suitable to calculate, for example, the bitrate-distortion type function J_i(m_k) and to select mode_i from the calculated values. The coding module 306 is suitable to code in binary form the picture data of the block B_i transmitted by the second input 302, according to the coding mode mode_i transmitted by the selection module 304 and possibly according to picture data previously coded and reconstructed by said coding module 306 and stored in the memory 308, e.g. picture data belonging to a previously coded picture (temporal prediction) or to a previously coded block of the same picture (spatial prediction). The coding module 306 is linked to the output 310 of the coding device. The output 310 is suitable to transmit, e.g. to a decoding device or to a broadcast network, a bitstream F representative of the picture data received on the second input 302 and coded by the coding module 306. A variant of the coding device 30 is shown in figure 4.
The shared elements of the two coding devices are identified by the same numerical references. The coding device 40 comprises a single input 302 suitable to receive the picture data of the blocks B_i. It further comprises a module 400 suitable to calculate, for each block B_i, a perceptual interest value S_i. This value S_i is for example calculated according to the method described above for the selection method. In this variant, the perceptual interest values S_i are calculated directly by the coding device 40 from the picture data received on the input 302.
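The last stage of such a calculation, i.e. the reduction of the per-pixel saliency values S(x,y) to one value per block by the mean or the median, can be sketched as follows. This Python example is illustrative only: it assumes that the saliency map is already available as a list of rows, it does not reproduce the saliency map generation of EP 1544792, and the function name `block_saliency` and its parameters are hypothetical.

```python
# Hypothetical sketch: one perceptual interest value per block, obtained
# from a saliency map that has already been computed.
from statistics import mean, median


def block_saliency(saliency_map, top, left, size=16, use_median=False):
    """Aggregate the saliency values S(x, y) of one square block.

    `saliency_map[y][x]` is the saliency of the pixel at column x, row y;
    (`left`, `top`) is the top-left pixel of the block.
    """
    values = [
        saliency_map[y][x]
        for y in range(top, top + size)
        for x in range(left, left + size)
    ]
    return median(values) if use_median else mean(values)
```

The median is less affected than the mean by a few isolated, highly salient pixels inside an otherwise uniform block, which may justify preferring it in some cases.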
Naturally, the invention is not limited to the embodiment examples mentioned above. In particular, the person skilled in the art may apply any variant to the stated embodiments and combine them to benefit from their various advantages. Notably, the invention, described for coding data of the coding mode type, can be extended to the selection of any other type of coding data, notably a number of reference pictures, a type of transform, a size of search window for motion estimation, etc. In MPEG4 AVC, it is indeed possible to select the reference picture used for the prediction of a picture data block from a set of 5 reference pictures. According to the invention, it is possible to reduce the number of reference pictures to test for certain blocks of the picture, i.e. the blocks whose perceptual interest value S_i is less than or equal to T. Likewise, in the FRExt extension (standing for "Fidelity Range Extension") of MPEG4 AVC, it is possible to transform each block of a picture using a 4x4 type transform or else an 8x8 type transform. According to the invention, it is possible to limit this choice for the blocks whose perceptual interest value S_i is less than or equal to T. Moreover, the invention is limited neither by the type of saliency map generated nor by the selection function, which can be a function other than the function J described above.
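The extension to other types of coding data can be illustrated by the following Python sketch, which restricts the reference pictures and the transform types to be tested. It is an illustration under stated assumptions: the set of 5 reference pictures and the two transform sizes come from the text, whereas the particular reduced choices (a single reference picture and the 8x8 transform) and all names are hypothetical.

```python
# Hypothetical sketch: restricting other coding data (reference pictures,
# transform type) for blocks of low perceptual interest.
from itertools import product

FULL = {"reference_picture": [0, 1, 2, 3, 4], "transform": ["4x4", "8x8"]}
# Arbitrary reduced choices, made only for the purpose of the example.
REDUCED = {"reference_picture": [0], "transform": ["8x8"]}


def candidates(saliency, threshold):
    """Return the (reference picture, transform) pairs to test for a block."""
    choices = FULL if saliency > threshold else REDUCED
    return list(product(choices["reference_picture"], choices["transform"]))
```

With these choices, a block of high perceptual interest is tested against 10 combinations and a block of low perceptual interest against a single one.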

Claims

1. Method of selection of a coding data from a predefined set of coding data (E), said coding data being associated with a picture portion with a view to its subsequent coding, said method comprising the following steps:
- determine (12) a subset (SE_i) of said set (E) of coding data,
- select (14) at least one coding data from said subset (SE_i), said method being characterized in that said subset (SE_i) is determined (12) for said picture portion (B_i) according to a predetermined value (S_i) representative of the perceptual interest of said picture portion (B_i), called perceptual interest value.
2. Method according to claim 1, wherein at least one coding data is a coding mode.
3. Method according to claim 1 or 2, wherein said picture portion is a picture data block.
4. Method according to one of claims 1 to 3, wherein said predetermined value is a saliency value associated with said picture portion.
5. Method according to one of claims 1 to 4, wherein said subset (SE_i) is equal to a first subset if said perceptual interest value (S_i) is greater than a predefined threshold (T) and is equal to a second subset different from the first subset if said perceptual interest value (S_i) is less than or equal to said predefined threshold (T).
6. Method according to claim 5, wherein said first subset is equal to said set (E).
7. Method according to claim 5 or 6, wherein said second subset comprises the p coding modes of said set (E) for which the selection probability is the highest, this probability having been determined beforehand for each coding mode of the set (E).
8. Coding device of a sequence of pictures, each picture being divided into portions of picture blocks (B_i), said device comprising:
- selection means (304) suitable to select, for each picture data portion (B_i), at least one coding data, and
- coding means (306) suitable to code each of the picture data portions (B_i) according to the coding data selected, said selection means (304) being characterized in that they comprise:
- means (3040) to determine, for each picture data portion (B_i), a subset (SE_i) of said set (E) of coding data according to a predetermined value (S_i) representative of the perceptual interest of said picture data portion (B_i), and
- means (3042) to select said at least one coding data from said subset (SE_i).
PCT/EP2008/056149 2007-05-29 2008-05-20 Method for selecting a coding data and coding device implementing said method WO2008145560A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR07/55301 2007-05-29
FR0755301A FR2916931A1 (en) 2007-05-29 2007-05-29 METHOD OF SELECTING ENCODING DATA AND ENCODING DEVICE IMPLEMENTING SAID METHOD

Publications (1)

Publication Number Publication Date
WO2008145560A1 (en) 2008-12-04

Family

ID=39133781

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2008/056149 WO2008145560A1 (en) 2007-05-29 2008-05-20 Method for selecting a coding data and coding device implementing said method

Country Status (2)

Country Link
FR (1) FR2916931A1 (en)
WO (1) WO2008145560A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10264280B2 (en) 2011-06-09 2019-04-16 Qualcomm Incorporated Enhanced intra-prediction mode signaling for video coding using neighboring mode
US11700384B2 (en) 2011-07-17 2023-07-11 Qualcomm Incorporated Signaling picture size in video coding

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1544792A1 (en) * 2003-12-18 2005-06-22 Thomson Licensing S.A. Device and method for creating a saliency map of an image
US20060193385A1 (en) * 2003-06-25 2006-08-31 Peng Yin Fast mode-decision encoding for interframes
WO2006107280A1 (en) * 2005-04-08 2006-10-12 Agency For Science, Technology And Research Method for encoding a picture, computer program product and encoder
US20070036215A1 (en) * 2003-03-03 2007-02-15 Feng Pan Fast mode decision algorithm for intra prediction for advanced video coding

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JEYUN LEE ET AL: "Fast mode decision for H.264", MULTIMEDIA AND EXPO, 2004. ICME '04. 2004 IEEE INTERNATIONAL CONFERENCE ON TAIPEI, TAIWAN JUNE 27-30, 2004, PISCATAWAY, NJ, USA,IEEE, vol. 2, 27 June 2004 (2004-06-27), pages 1131 - 1134, XP010771023, ISBN: 0-7803-8603-5 *
KO C C ET AL: "Fast Intermode Decision in H.264/AVC Video Coding", July 2005, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, PAGE(S) 953-958, ISSN: 1051-8215, XP011135320 *
QUQING CHEN ET AL: "Attention-based adaptive intra refresh for error-prone video transmission", IEEE COMMUNICATIONS MAGAZINE, IEEE SERVICE CENTER, PISCATAWAY, US, vol. 44, no. 1, January 2007 (2007-01-01), pages 52 - 60, XP011156148, ISSN: 0163-6804 *

Also Published As

Publication number Publication date
FR2916931A1 (en) 2008-12-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08759765

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08759765

Country of ref document: EP

Kind code of ref document: A1