EP2441267A1 - High dynamic range texture and video coding - Google Patents

High dynamic range texture and video coding

Info

Publication number
EP2441267A1
EP2441267A1 EP08875717A EP08875717A EP2441267A1 EP 2441267 A1 EP2441267 A1 EP 2441267A1 EP 08875717 A EP08875717 A EP 08875717A EP 08875717 A EP08875717 A EP 08875717A EP 2441267 A1 EP2441267 A1 EP 2441267A1
Authority
EP
European Patent Office
Prior art keywords
data
luminance
component
texture
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP08875717A
Other languages
English (en)
French (fr)
Inventor
Emanuele Salvucci
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Trellis Management Co Ltd
Original Assignee
Trellis Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Trellis Management Co Ltd
Publication of EP2441267A1
Legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/184 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • the present invention relates to digital image processing and, more particularly, to encoding and decoding low dynamic range (LDR), high dynamic range (HDR), and High Bit-depth images, including video compression.
  • LDR: low dynamic range
  • HDR: high dynamic range
  • High Dynamic Range Imagery (HDRI) refers to processes, techniques, and digital image systems capable of reproducing real-world lighting and color data with high accuracy.
  • HDRI initially was introduced to overcome dynamic range limitations in digital images, such as described in the scientific publication "Overcoming Gamut and Dynamic Range Limitations in Digital Images" (hereinafter "Ward98_2"), Proceedings of the Sixth Color Imaging Conference, November 1998, as well as in the scientific publication "High Dynamic Range Imagery" (hereinafter "Ward01"), Proceedings of the Ninth Color Imaging Conference, November 2001.
  • Ward01: "High Dynamic Range Imagery"
  • HDR pixel data commonly is represented using 96 bits per pixel: 32-bit single-precision IEEE floating-point data for each RGB component. Standard digital images usually employ 24 bits per pixel: 8 bits for each RGB component.
  • an HDR image has four times as much data as an LDR image. Therefore, due to the relatively large amount of data for an HDR image, there is a need to substantially compress HDR image data.
  • the RGBE image format represents the required dynamic range using 32 bits/pixel instead of 96 bits/pixel, providing a compression ratio of 3:1.
  • Typical image and video compression algorithms such as JPEG and MPEG, achieve much higher compression ratios, thus producing files hundreds of times smaller than the original source data.
  • the RGBE format explicates a relevant compression problem by introducing the Exponent on a per-pixel basis, since even small errors that may be introduced by common compression methods, such as but not limited to JPEG and MPEG, generate exponentially higher levels of artifacts in the recovered image.
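  • The sensitivity of the shared exponent can be illustrated with a minimal sketch of the widely used RGBE packing (three 8-bit mantissas plus one 8-bit exponent). The function names `float_to_rgbe` and `rgbe_to_float` are illustrative assumptions and are not taken from the patent; the sketch only shows that a one-step error in the exponent byte doubles the decoded value, while a one-step error in a mantissa byte changes it by a fraction of a percent.

```python
import math

def float_to_rgbe(r, g, b):
    """Pack linear RGB floats into (R, G, B, E) bytes with a shared exponent."""
    v = max(r, g, b)
    if v < 1e-32:
        return (0, 0, 0, 0)
    # v = mantissa * 2**exponent with 0.5 <= mantissa < 1
    mantissa, exponent = math.frexp(v)
    scale = mantissa * 256.0 / v
    return (int(r * scale), int(g * scale), int(b * scale), exponent + 128)

def rgbe_to_float(rgbe):
    """Unpack (R, G, B, E) bytes back into linear RGB floats."""
    r, g, b, e = rgbe
    if e == 0:
        return (0.0, 0.0, 0.0)
    f = math.ldexp(1.0, e - (128 + 8))
    return (r * f, g * f, b * f)

packed = float_to_rgbe(12.0, 6.0, 3.0)
exact = rgbe_to_float(packed)

# Simulate a small codec error of one step in a mantissa byte...
mantissa_error = rgbe_to_float((packed[0] + 1, packed[1], packed[2], packed[3]))
# ...and the same one-step error in the shared exponent byte.
exponent_error = rgbe_to_float((packed[0], packed[1], packed[2], packed[3] + 1))

print(exact[0], mantissa_error[0], exponent_error[0])
```

    In this example the mantissa error moves the red value from 12.0 to 12.0625, whereas the exponent error moves it to 24.0, which is why lossy codecs applied directly to RGBE data tend to produce severe artifacts.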
  • European Patent EP 0991020 owned by the Eastman Kodak
  • Kodak Patent describes a method to encode the extended color gamut of sRGB images by using residual images obtained by computing
  • the scientific publication “Perception-motivated High Dynamic Range
  • Ward05: JPEG-HDR: A Backwards-Compatible, High Dynamic Range Extension to JPEG
  • MPEG Video Compression Proc. of SIGGRAPH '06 (Special issue of ACM Transactions on Graphics), 25 (3), pp. 713-723, 2006 (hereinafter "Mantiuk06"), discloses a method for compressing HDR video similar to that disclosed in Ward04 and Ward05.
  • Consumer digital displays today are only capable of displaying 8 bits for each RGB color component (i.e. 24-bit image data) and, considering luma/chroma separation, are only capable of displaying 8 bits for luma data that contain the high-frequency details of an image.
  • With high definition (HD) resolutions and bigger display screens (up to 100 inches), the limited precision of standard 24-bit digital images reveals several artifacts on such displays, even when uncompressed.
  • With common compression solutions like the JPEG or MPEG plethora of algorithms, these artifacts become even more visible to the human eye, due to the reduced precision, in number of bits, of the resulting decompressed output image.
  • In order to fully exploit the potential of HD resolutions, a display should be capable of showing 1920 shades, i.e. the maximum number of pixels in the longest pixel row at the highest HD resolution (1080p). With 8-bit data it is only possible to display 256 shades, and moreover with MPEG compression this figure is usually greatly reduced, producing visible banding artifacts when using HD resolutions even on a 26-inch display.
  • MPEG - and the latest H.264 standard - also introduces block-artifacts due to the nature of the algorithms which encode and store image data in blocks of 4x4 pixels up to 16x16 pixels, which are independent from each other, i.e. the output pixel luma value of a block has no correlation with an adjacent pixel of a different block. This usually leads to a high luminance contrast between edges of the surrounding blocks.
  • In order to avoid block-artifacts, a number of algorithms commonly known as "deblocking filters" have been introduced. Such algorithms, though, are by nature very expensive in terms of the computing power required, which is linearly related to the dimensions of the image to be filtered. With HD resolutions, the number of pixels to be processed with a deblocking filter is up to 4.6 times the number of pixels being processed in a standard PAL or NTSC frame, thus a deblocking algorithm requires much more computing power in order to be applied to HD resolutions.
  • the processes/techniques disclosed herein provide low dynamic range texture encoding and compression for 3D applications including, but not limited to, video-games, console-games, interactive software and rendering software.
  • all, or part of mapped textures of the same size within a scene are rendered into a new single texture maintaining the original texture tiling.
  • the new texture is separated into its chrominance and luminance components and the chrominance part of the texture (called cumulative chroma texture or CC-texture) is mapped back.
  • the original textures are taken in groups of three or four (or more), and converted into their respective luminance parts only, and each is stored as a single color channel of a single new texture (called cumulative luma texture or CL-texture) for each group.
  • a new texture (called index texture or ID-texture) is created, with each color channel representing the index to each of the original three or four (or more) textures.
  • a pixel shader program or equivalent can be utilized to read an index from the ID-texture and reconstruct the original texture by recombining the CC-texture with the specified luminance part stored in a channel of one of the CL-textures generated.
  • low dynamic range texture encoding and compression are provided for 3D applications including, but not limited to, video-games, console-games, interactive software, and rendering software. All or part of the UV-mapped textures of the same size and resolution within a scene are rendered into a new single texture of any desired size, following a dedicated second UV set and maintaining the relative texture tiling. The new texture is separated into its chrominance and luminance components, and the chrominance part of the texture (cumulative chroma texture or CC-texture) is mapped back using the second UV set. The CC-texture represents the chrominance part of the region originally mapped with many different textures.
  • the original textures are taken by groups of three or four (or more) and converted into their respective luminance parts only and each is stored as a single RGB or RGBA channel respectively of a single new texture (cumulative luma texture or CL-texture) for each group, so that any single CL-texture represents three or four (or more) original textures.
  • the areas originally covered by each of the three or four (or more) textures in the texture groups are color-coded using a very small RGB or RGBA texture (ID-texture), with each color channel representing the index to each of the original three or four (or more) textures.
  • ID-texture: very small RGB or RGBA texture
  • a pixel shader program or equivalent is able to read an index from the ID-texture and reconstruct the original texture locally by recombining the CC-texture with the specified luminance part stored in a channel of one of the CL-textures generated.
  • variable dynamic range texture encoding and compression is provided by carrying out some of the above-summarized processes.
  • Each CL-texture channel representing luminance values is linearly scaled by a factor smaller than one in a manner that minimizes the error derived from the loss of data caused by scaling.
  • the values are clamped in the range between 0 and 1.0, with 1.0 representing the maximum decimal number according to the number of bits used for each channel.
  • the visible error introduced can be distributed on the CC-texture and thus reduced or eliminated by using an error-correction technique.
  • multi-material texture encoding is provided by carrying out some of the above-summarized processes. Instead of storing groups of three or four (or more) luminance textures, one of the channels in the new texture is reserved for the luma component for texture color reconstruction. The mask-texture is not needed and the remaining two or three (or more) channels may be used to store additional material information including, but not limited to, specularity, reflectivity and transparency.
  • variable dynamic range image sequence encoding is provided by carrying out some of the above-summarized processes.
  • An ID-texture, or an explicit numeric index is cycled for the entire image sequence duration so that at each frame it always represents either index number one (red), number two (green) or number three (blue), or a mix of such indexes.
  • Each frame of the CL-texture in the image sequence represents three frames in an image sequence that is being played back.
  • high dynamic range texture, and image sequence encoding is provided by carrying out some of the above-summarized processes.
  • the luminance part of each original HDR texture in the scene is encoded using a finite non-linear sum, and storing each term in each channel of a new texture or frame.
  • high dynamic range texture, and image sequence encoding is provided by carrying out some of the above-summarized processes.
  • a LDR luminance texture is created by clamping values of the HDR luminance texture within the range of 0.0 to 1.0.
  • a fractional luminance texture (FL-texture) is created by dividing the LDR luminance texture by the HDR luminance texture.
  • low and high dynamic range image sequence encoding is provided by carrying out some of the above-summarized processes.
  • the input HDR frame's RGB values are clamped within the range of 0.0 to 1.0, generating an LDR version of the HDR frame.
  • An RGB fractional color frame (FC-texture) is generated by dividing the LDR by the HDR frame, channel by channel respectively.
  • High Bit-depth image, and video encoding is provided.
  • a 96 bit/pixel (high dynamic range - HDR, or High Bit-depth floating point) image or video is clamped at 24 bit/pixel and compressed using any common image or video compression algorithm and the result is stored as a common image or video sequence.
  • the compressed image or video is then decompressed and a quantization of the luminance values is performed thereby reducing the possible pixel values to a range between 2 and 256, i.e. to between 1 and 8 bits.
  • a linear interpolation of values between 0.0 and 1.0 (or 0 and 255) is then performed between each pixel value of the quantized image by comparing it with the original pixel of uncompressed HDR image.
  • the linear interpolation values restart from 0.0.
  • an inversion step may be applied in order to create a smoother interpolation throughout the entire image.
  • the result of the interpolation is stored in an 8 bit/pixel secondary image, or by employing a frame that is double in height or in width, that may be further compressed using any common image or video compression algorithm such as JPEG or MPEG.
  • the present encoding method can also be applied for each RGB component, instead of luma alone.
  • the present invention provides several advantages over existing systems/processes.
  • the present invention advantageously allows any JPEG-like, MPEG-like, or other image, texture and/or video compression system/technique to lossy compress LDR and HDR and High Bit-depth image and texture data with a selectable level of introduced error.
  • the present invention advantageously enables HDR images and video to be filtered using existing LDR filters including, but not limited to, de-blocking, de-ringing, and film effect, without generating perceivable artifacts.
  • the present invention as compared to existing encoding/compression systems/techniques, advantageously requires relatively little computational power and advantageously allows current hardware/compression systems to be exploited without the need for additional coding stages, as well as special, ad-hoc filtering.
  • the present invention further advantageously enables 3D applications to merge together a number of 3D objects that are able to use the same texture, thus leading to polygon batch optimization, and to apply further texture compression systems, including but not limited to DXTC and 3Dc, to the generated CL-textures.
  • the present invention further advantageously enables 16-bit luma or
  • Figure 1 is a functional block diagram in accordance with a first method of encoding and decoding of the present invention
  • Figure 2 is a functional block diagram in accordance with a second encoding method of the present invention.
  • Figure 3 is a functional block diagram in accordance with a second decoding method of the present invention.
  • Figure 4 is a functional block diagram in accordance with a third encoding method of the present invention.
  • Figure 5 is a functional block diagram in accordance with a third decoding method of the present invention.
  • Figure 6 is a functional block diagram in accordance with a fourth encoding method of the present invention.
  • Figure 7 is a functional block diagram in accordance with a fourth decoding method of the present invention.
  • Figure 8 is a functional block diagram in accordance with a fifth encoding method of the present invention.
  • Figure 9 is a functional block diagram in accordance with a fifth decoding method of the present invention.
  • Figures 10 and 11 are functional block diagrams respectively illustrating methods of encoding and decoding image data in accordance with a sixth embodiment of the present invention.
  • FIGURES 1 - 11 of the drawings in which like numbers designate like parts.
  • NTSC long used television standards
  • high frequency details mainly represented by the luminance component of an image
  • the techniques described herein take this into consideration.
  • the present invention solves in several ways the linear representation of a series of exponential numeric values, i.e., the pixels of an HDR texture or image or image sequence, as described herein.
  • FRAME.rgb = CHROMA.rgb * FC.rgb * f
  • Each of the figures represents functional block diagrams. Each block within the figures may represent one or more discrete steps/processes that are carried out by one or more devices/systems. Conversely, one device/system may carry out the functions/processes represented by multiple blocks. In various figures, dotted blocks represent optional steps/functionality or possible outputs (after an optional step/functionality is applied). Shaded blocks represent direct encoding outputs.
  • the herein described processes are particularly well suited for 3D applications.
  • the present invention enables the encoding of a variable number of textures so that fewer textures are effectively used, thus using less memory, by exploiting the higher details contained in the luminance component of any texture. It is, however, understood that the herein described processes may be used outside the 3D field(s).
  • a first encoding/decoding method in accordance with a first embodiment of the present invention is described below, in which a 3D scene or object is provided on which multiple 24-bit textures have been applied using "UV texture mapping.”
  • the first encoding/decoding method described herein is schematically illustrated in Figure 1 of the drawings.
  • 3D vertex as a unique UV value, that is, that maps every vertex, and wherein there is no overlap between any UV value.
  • texture baking a cumulative rendering of all the applied textures is performed.
  • the texture baking process writes new pixels into a newly generated texture following the mapping specified by the second UV-set 100, but reading all the originally mapped textures 101 by following the original UV-set 102, thus writing pixels according to the initial mapping so that any tiling or stretching of the original textures is explicit into the newly generated texture.
  • the generated texture (called cumulative rendering texture or "CR-texture” ) can be of any size and resolution.
  • An 8-bit grayscale luminance version and an RGB chrominance version (called Cumulative Chroma texture or "CC-texture" 103) of the CR-texture are created using any known method.
  • the CieXYZ color-space is taken as a reference in order to generate the chrominance texture, as shown in “Formula 002" below, and the luminance texture is the average of the sum of the RGB channels, as shown in “Formula 001" below.
  • the CR-texture is separated into chroma and luma components.
  • Chroma (the Cumulative Chroma or CC-texture 103) is obtained by dividing the CR-texture by the sum of the CR-texture's RGB components, and the luma is the average of the CR-texture's RGB components, and is used to obtain the Cumulative Chroma texture 103.
  • the luma texture, and thus the chroma, may also be obtained by weighting the RGB components, as represented in formulas 001 and 002.
  • CieXYZ color space: CHROMA.X = R / (R + G + B), with the remaining chroma components obtained likewise by dividing G and B by the same sum.
  • the CC-texture 103 effectively represents the chrominance component of all the textures originally applied to the scene.
  • the chroma values for the CC-texture 103 also can be computed directly during the texture baking process.
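  • The chroma and luma separation described above can be sketched per pixel as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, pixel values are floats, and the handling of a pure black pixel (where the sum of the RGB components is zero) is an assumption, since the text does not specify it. Luma is the average of the RGB components, chroma is each component divided by their sum, and multiplying the two back together with a factor of 3 recovers the original color.

```python
def split_luma_chroma(rgb):
    """Split an RGB pixel into a scalar luma and an RGB chroma triple."""
    r, g, b = rgb
    total = r + g + b
    if total == 0.0:
        # Assumption: chroma is undefined for black, so store zeros.
        return 0.0, (0.0, 0.0, 0.0)
    luma = total / 3.0                                # average of the RGB components
    chroma = (r / total, g / total, b / total)        # each component over the sum
    return luma, chroma

def recombine(luma, chroma):
    """Multiply luma and chroma back together; the factor 3 undoes the averaging."""
    return tuple(luma * c * 3.0 for c in chroma)

luma, chroma = split_luma_chroma((0.5, 0.25, 0.25))
print(luma, chroma, recombine(luma, chroma))
```

    Because the chroma components always sum to one, the chroma image carries no intensity information, which is consistent with storing it at a lower resolution than the luminance.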
  • a color-coded index texture (Cumulative index texture or "ID-texture"
  • the ID-texture 104 is an RGB texture of any desired size and resolution, wherein each color channel fully or partially represents the index to any of the three corresponding original textures in a group. For example, in the first group of 3 textures, a pixel value of (255, 0, 0) in the ID-texture 104 indicates that texture n.1 is fully visible, thus the current pixel fully refers to texture n.1.
  • each 8-bit luminance component of each original texture in a group may be stored into each RGB channel of a new texture (Cumulative luminance texture or "CL-texture" 105) of the same size and resolution as the ones in the current group.
  • CL-texture: Cumulative luminance texture
  • a single RGB 24-bit CL-texture 105 contains luminance, high frequency data from three 24-bit original textures.
  • the CC-texture 103 can be much smaller in size and resolution than the original applied mapping, while the memory required by the ID-texture 104 can be often neglected as the size and resolution needed for it is usually several times less than the original texture size.
  • All generated textures optionally may be further compressed using any suitable compression algorithm including, but not limited to, DXTC, which is widely used in 3D applications.
  • the CC-texture 103 further may be more useful when the 3D application employs "lightmaps" 106 throughout a scene. Since lightmaps 106 contain light information (e.g., color and intensity data), they can be pre-multiplied with the CC-texture 103, as shown in Formula 003 below. The latter operation does not affect the reconstruction process involved in the present method.
  • PREM_LIGHTMAP.rgb = LIGHTMAP.rgb * CC-TEXTURE.rgb
  • the encoding method of the present invention optionally may employ 3D textures in hardware. Since a 3D texture can store up to 6 RGB or RGBA 2D textures, a 3D CL-texture 105 can store 18 luminance textures for the RGB type and 24 for the RGBA type.
  • a pixel shader program or other suitable software program may be used.
  • the decoding program gathers the information needed, i.e., the corresponding CC-texture 103 (Term A), ID-texture 104 (Term B) and CL-texture 105 (Term C) value, for the current pixel.
  • the current pixel color is recovered by selecting the appropriate channel, i.e., the appropriate luminance part of the appropriate original texture, in CL-texture 105 using the index provided in the ID-texture 104.
  • the "dot product" 107 between all channels of the ID-texture 104 and all channels of the CL-texture 105 produces the desired result, as represented in Formula 004 below.
  • the ID-texture 104 also can represent blended RGB values for the current pixel, thus allowing smooth or hard blending between the 3 textures represented in CL-texture 105. For example, if the value for the current pixel in the ID-texture 104 equals (127,127,127) in a 24-bit texture, the final reconstructed luminance is the average of all the luminance textures stored in the CL-texture 105.
  • Formula 006 summarizes the decoding process.
  • ORIGINAL_COLOR.rgb = (R1 * R2 + G1 * G2 + B1 * B2) * CC-texture.rgb * 3
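  • The per-pixel decoding summarized by Formula 006 can be sketched outside a shader as follows. This is a sketch, not the patent's shader code: the function names are hypothetical, and it assumes that the first operand of the dot product is the ID-texture sample and the second is the CL-texture sample, both normalized from 8-bit values to the range 0.0 to 1.0.

```python
def normalize(pixel_8bit):
    """Convert an 8-bit RGB sample to floats in the range 0.0 to 1.0."""
    return tuple(v / 255.0 for v in pixel_8bit)

def dot3(a, b):
    """Three-component dot product, as a pixel shader would compute it."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def decode_pixel(id_rgb, cl_rgb, cc_rgb):
    """Reconstruct a color from ID-texture, CL-texture and CC-texture samples."""
    luma = dot3(id_rgb, cl_rgb)                       # selects or blends luminance channels
    return tuple(luma * c * 3.0 for c in cc_rgb)      # multiply back with the chroma

cl_sample = (0.2, 0.5, 0.9)        # luminance of three original textures
cc_sample = (0.5, 0.25, 0.25)      # shared chroma for this pixel

# Index (255, 0, 0): the pixel refers fully to the first texture.
print(decode_pixel(normalize((255, 0, 0)), cl_sample, cc_sample))
# Fractional weights blend the first two luminance channels.
print(decode_pixel((0.5, 0.5, 0.0), cl_sample, cc_sample))
```

    Using a dot product instead of a branch keeps the selection and the blending case in a single arithmetic expression, which suits pixel shader hardware.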
  • a total of 24 grayscale textures may be stored together with a color-coded ID cubemap and a small mask texture representing the color-coded index to the cube face axis, i.e. ±X, Y, Z (6 cube faces * 4 RGBA channels).
  • a higher luminance range may be encoded into the CL-texture 105 by scaling each RGB channel 108 in the CL-texture 105 (called scaled luminance LDR or SLLDR-texture) by a factor < 1.0, as shown in Formula 007.
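  • The effect of scaling by a factor smaller than 1.0 before clamping can be sketched as follows. This is a minimal sketch under assumptions: the function names and the 8-bit rounding are illustrative, and the decoder is assumed to scale back by the inverse of the same factor. With a factor f, luminance values up to 1/f survive the clamp, at the cost of a coarser quantization step of 1/(255 * f).

```python
def clamp(x, lo=0.0, hi=1.0):
    """Clamp x to the closed range [lo, hi]."""
    return max(lo, min(hi, x))

def encode_scaled_luma(luma, f, bits=8):
    """Scale luminance by f (< 1.0), clamp to [0, 1] and quantize to an integer code."""
    levels = (1 << bits) - 1
    return round(clamp(luma * f) * levels)

def decode_scaled_luma(code, f, bits=8):
    """Invert the quantization and scale back by 1 / f."""
    levels = (1 << bits) - 1
    return (code / levels) / f

f = 0.25                                   # hypothetical scaling factor
for luma in (0.5, 3.0, 5.0):
    code = encode_scaled_luma(luma, f)
    print(luma, code, decode_scaled_luma(code, f))
```

    With f = 0.25 the value 3.0 round-trips to within one quantization step, while 5.0 exceeds the representable maximum of 1/f = 4.0 and is clamped.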
  • the present invention provides for additional error correction (to minimize visible artifacts) by calculating the chrominance texture (CC-texture 103) by using luminance information of the SLLDR-texture (discussed above), rather than the HDR CR-texture, as represented in Formula 008b below.
  • the extra luminance data used to correct the error are partially stored in the chrominance texture.
  • SCE is the Scaled Chrominance Error image
  • SLLDR is the Scaled Luminance LDR version of the HDR luminance image
  • f is an appropriate scaling factor
  • the chrominance texture is obtained by using information from the SLLDR-texture, instead of using the HDR luminance information, and scaled by the scaling factor.
  • different compatible textures may be encoded into the remaining texture channels of the CL-texture 105. Since one of the channels is reserved for luminance information from any one original texture, the remaining channels may be employed to store other 8-bit texture data including, but not limited to, specularity, reflectivity, and transparency data. In such case, the ID-texture 104 is no longer necessary since the CL-texture 105 does not contain three different luminance textures in this variation, but rather different features within the same material set. The different features are pre-known or pre-designated and thus can be accessed directly by addressing the respective channel in the CL-texture 105.
  • Second Encoding and Decoding Methods of the Present Invention
  • each frame of an LDR or HDR image sequence 200 initially is separated into its chrominance and luminance components (Chroma and Luma Separation 201), employing Formulas 001 and 002 provided above, to produce a chroma image sequence 202 and a luma image sequence 203.
  • Groups of three (3) frames from the luminance sequence 203 are stored in a single 24-bit RGB frame (Cumulative Luminance or "CL-frame" 204) preferably in the following order: luminance frame 1 in the red channel; luminance frame 2 in the green channel; and luminance frame 3 in the blue channel.
  • Both the chrominance frame sequence (CC-frame sequence 202) and the cumulative luminance frame sequence 205 (sequence of CL-frames 204) are compressed separately using any known, suitable compression algorithm/technique including, but not limited to, MPEG, to produce compressed streams of data (Compressed Chroma 206; and Compressed Luma 207).
  • the resultant luminance stream includes one-third the number of original frames.
  • the CC-frame sequence 202 may be sub-sampled prior to being compressed.
  • the luminance frame sequence may be scaled by a suitable factor 208 in the manner discussed above.
  • Figure 3 functionally shows the process of decoding the above-described compressed streams of data.
  • three frames of the chrominance stream 206 and one frame of the luminance stream 207 are decompressed (i.e., decoded) in accordance with the previously selected compression algorithm/technique (e.g., MPEG decoding), employing the known format (i.e., known order as described above) of the coded cumulative luminance stream 207.
  • Each luminance frame is extracted from the decoded cumulative luminance sequence 205 (CL-frame) utilizing Formula 004 107 described above.
  • the ID-texture 104 is an ID-vector 300 of three floating point or integer values.
  • Each color frame 301 is decoded by multiplying back the chroma and luma components in accordance with Formula 005 discussed above.
  • the CL-frame 205 values may be re-scaled 109 (i.e., scaled back by the inverse of the factor utilized) (see Formula 008b above).
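  • The packing of three luminance frames into one CL-frame, and their extraction with an ID-vector, can be sketched as follows. This is a sketch under assumptions: frames are modeled as flat lists of floats, the function names are hypothetical, and the ID-vector is taken to be a one-hot vector cycling through red, green and blue, with no compression, scaling or chroma handling included.

```python
def pack_cl_frames(luma_frames):
    """Store each group of three luminance frames in the R, G, B channels of one CL-frame."""
    if len(luma_frames) % 3 != 0:
        raise ValueError("number of frames must be a multiple of three")
    cl_frames = []
    for i in range(0, len(luma_frames), 3):
        first, second, third = luma_frames[i:i + 3]
        cl_frames.append(list(zip(first, second, third)))   # one (r, g, b) tuple per pixel
    return cl_frames

def id_vector(frame_index):
    """One-hot ID-vector: index 1 (red), 2 (green) or 3 (blue), cycling every three frames."""
    vec = [0.0, 0.0, 0.0]
    vec[frame_index % 3] = 1.0
    return tuple(vec)

def unpack_luma(cl_frames, frame_index):
    """Recover one luminance frame with a per-pixel dot product against the ID-vector."""
    cl = cl_frames[frame_index // 3]
    w = id_vector(frame_index)
    return [w[0] * p[0] + w[1] * p[1] + w[2] * p[2] for p in cl]

frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0], [0.0, 0.5]]
cl = pack_cl_frames(frames)
print(len(cl), unpack_luma(cl, 4))
```

    Six input luminance frames yield two CL-frames, matching the statement that the luminance stream holds one-third the number of original frames.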
  • each frame of an HDR image sequence 400 initially is separated into its chrominance and luminance components (Chroma and Luma Separation), employing Formulas 001 and 002 provided above, to produce a chroma image sequence 202 and an HDR luma image sequence 401.
  • each frame 402 of the HDR luma image sequence is separated into, preferably, three (3) different levels of luminance and stored into a new RGB 24-bit frame (called herein, High Dynamic Range Cumulative Luminance frame or "HDRCL-frame” 403) in accordance with Formula 009 provided below (also employing the "pseudo-code” clamp function provided above).
  • Formula 009 (F_009 in Figure 4) represents the preferred embodiment by providing three (3) discrete steps, with three different scaling factors 404. In a variation, a different number of steps/scaling factors may be employed.
  • the scaling factors 404 may be stored for each frame or for the entire sequence. If scaling factor f3 is small enough, the clamping operation is not needed for HDRCL.b.
  • Each step optionally may be gamma corrected and/or scaled and error-corrected 405 in the manner described above.
  • γ1, γ2, and γ3 in Formula 009 above are the gamma correction factors, if applied.
  • the principle behind Formula 009 is to define discrete and linearly compressed luminance ranges by subtracting and thus eliminating clamped and scaled values in the range 0 ≤ v ≤ 1.0 from the current HDR luminance level.
  • a gamma correction function may be applied before storing the current luminance level into an 8-bit channel of the HDRCL-frame. Since each clamped frame is subtracted from the current luminance level, a smaller factor can be employed in the next luminance level as the remaining values are mostly > 1.0.
  • the HDRCL-frame sequence 403, as well as the chrominance sequence 202 are compressed using any known compression technique, such as MPEG, to produce a compressed HDR Cumulative luminance stream 406 and a compressed chrominance stream 206, respectively, or into a larger movie.
  • the three factors 404 used to scale the luminance ranges, as well as the three gamma values, may be stored in a header of the preferred file format.
  • the chrominance frame sequence 202 may be sub-sampled.
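The cumulative-luminance split described above can be sketched for a single pixel as follows. Formula 009 itself is not reproduced in this text, so this is only a plausible reading of the prose: each level stores the clamped, scaled residual, and the part it represents is subtracted before the next level. The function names, the multiply-by-factor convention and the sample factors are assumptions made for illustration.

```python
def clamp(x, lo=0.0, hi=1.0):
    """Clamp x to the closed range [lo, hi]."""
    return max(lo, min(hi, x))


def encode_hdrcl_pixel(luma, factors, gammas=(1.0, 1.0, 1.0)):
    """Split one HDR luminance value into three 8-bit channel codes.

    Assumption: level i stores clamp(residual * f_i); the luminance that
    level can represent (level / f_i) is removed before the next level.
    """
    residual = luma
    codes = []
    for f, g in zip(factors, gammas):
        level = clamp(residual * f)           # linearly compressed range
        residual -= level / f                 # eliminate what this level holds
        corrected = level ** (1.0 / g)        # optional gamma correction
        codes.append(round(corrected * 255))  # quantize to an 8-bit channel
    return tuple(codes)


# Hypothetical scaling factors f1, f2, f3 (not taken from the source).
print(encode_hdrcl_pixel(5.0, (1.0, 0.25, 0.0625)))
```

Under this reading, a luminance of 5.0 saturates the first two channels and leaves the third empty, and the value is recovered as the sum of level / f over the three channels.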
  • Figure 5 functionally shows the process of decoding the above-described compressed streams of data.
  • the compressed luminance stream 406 and compressed chrominance stream 206 are decompressed (i.e., decoded) in accordance with the compression algorithm/technique used to compress the data (e.g., MPEG decoding).
  • Each resultant HDR luminance frame 402 is decoded from the HDRCL-frame 403 by applying the inverse gamma correction function 500 (power function) to each channel of the HDRCL-frame 403 and calculating the scalar dot product between the three (3) luminance scaling factors 404 and the three (3) channels in the HDRCL-frame 403, as represented in Formula 010 shown below.
  • HDRCL.r = HDRCL.r ^ γ1
  • HDRCL.g = HDRCL.g ^ γ2
  • HDRCL.b = HDRCL.b ^ γ3
  • HDRL = dot(F.xyz, HDRCL.rgb)
  • HDRL = (F.x * HDRCL.r + F.y * HDRCL.g + F.z * HDRCL.b)
  • γ and F are floating point scalar vectors.
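A minimal sketch of the luminance decoding of Formula 010 for one pixel is given below. It assumes the three HDRCL channels arrive as 8-bit codes normalised by 255, and the values chosen for F are purely illustrative; neither detail is stated in the source.

```python
def decode_hdrcl_pixel(codes, factors_F, gammas=(1.0, 1.0, 1.0)):
    """Recover HDR luminance from three 8-bit HDRCL channel codes."""
    luma = 0.0
    for code, F, g in zip(codes, factors_F, gammas):
        channel = (code / 255.0) ** g  # inverse gamma correction (power function)
        luma += F * channel            # one term of dot(F.xyz, HDRCL.rgb)
    return luma


# Hypothetical factors: two saturated channels and one empty channel.
print(decode_hdrcl_pixel((255, 255, 0), (1.0, 4.0, 16.0)))
```

With gamma left at 1.0 the power function is a no-op, so the result is simply the weighted sum of the three channels.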
  • the chrominance frame 202 is multiplied back with the recovered HDR luminance frame 402, as set forth in Formula 011 below.
  • HDR.rgb = dot(F.xyz, HDRCL.rgb) * CHROMA.rgb * 3
Fourth Encoding and Decoding Methods of the Present Invention
  • each frame of an HDR image sequence 400 initially is separated into its chrominance and luminance components (Chroma and Luma Separation), employing Formulas 001 and 002 provided above, to produce a chroma image sequence 202 and an HDR luma image sequence 401.
  • Each frame of the HDR luma image sequence 401 is clamped in the range 0 ≤ x ≤ 1.0, thus obtaining the LDR version of the HDR luminance frame (LLDR-frame 600).
  • a gamma correction function and a scaling factor 405, as described above, optionally may be applied to each LLDR-frame 600.
  • the LLDR-frame 600 is divided by the HDR luminance frame 401 to obtain an 8-bit reciprocal fractional luminance frame (FL-frame 601), that is, a reciprocal representation of all the values > 1.0 in the HDR luminance frame 401.
  • a gamma correction function and a scaling factor 405 optionally may be applied to each FL-frame.
  • the LLDR-frame sequence 600, FL-frame sequence 601 and chrominance frame sequence 202 are compressed using any known compression system, such as but not limited to MPEG, either to produce three (3) separate streams or into a single larger movie.
  • the chrominance frames 202 and FL-frames 601 may be sub-sampled.
  • when the LLDR-frame 600 is divided by the HDR luminance frame 401, the resulting FL-frame 601 usually contains large areas of white pixels (1.0, 1.0, 1.0) or (255, 255, 255) wherever the original pixel values are in the range 0 ≤ x ≤ 1.0.
  • the FL-frame 601 represents a relatively small amount of overhead after it is compressed using a video compression algorithm, since large areas of uniform pixel values in a frame generally are well optimized by most video compression techniques.
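The encoding steps of this fourth method can be sketched for a single luminance value as follows. Gamma correction and scaling are omitted, and the guard that maps zero luminance to a white FL value is an assumption added to avoid division by zero; the function names are illustrative only.

```python
def clamp(x, lo=0.0, hi=1.0):
    """Clamp x to the closed range [lo, hi]."""
    return max(lo, min(hi, x))


def encode_luma_pixel(hdr_luma):
    """Return the (LLDR, FL) 8-bit codes for one HDR luminance value."""
    lldr = clamp(hdr_luma)
    # FL is the ratio LLDR / HDR: exactly 1.0 wherever nothing was clamped.
    fl = lldr / hdr_luma if hdr_luma > 0.0 else 1.0  # zero guard is an assumption
    return round(lldr * 255), round(fl * 255)


print(encode_luma_pixel(0.2))  # value inside the visible range: FL stays white
print(encode_luma_pixel(4.0))  # value above 1.0: FL stores the reciprocal 0.25
```

Pixels inside the visible range produce a white FL value, which is why the FL-frame tends to consist of large uniform areas.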
  • Figure 7 functionally shows the process of decoding the above-described compressed frame sequences.
  • Each of the frame sequences is decompressed (e.g., MPEG decoding) to produce chrominance image 202, fractional luminance 601 and LDR luminance sequences 600. If applicable, the LLDR-frame 600 or the FL-frame 601 is re-scaled and/or inverse gamma corrected 500.
  • the HDR luminance component 401 is recovered by applying the reciprocal fractional function to the FL-frame 601 and multiplying it back with the LLDR-frame 600.
  • the chrominance 202 and HDR luminance frame 401 are multiplied back to obtain the original HDR frame 400, as set forth in Formula 013 below.
  • HDR.rgb = (CHROMA.rgb * LLDR * (1.0 / (FL ^ γ * (1.0/f)))) * 3
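Formula 013 can be sketched for one pixel as shown below. The inputs are assumed to be normalised floating-point values with FL greater than zero, and the sample chrominance (an equal split across the three channels) is hypothetical.

```python
def decode_hdr_pixel(chroma_rgb, lldr, fl, gamma=1.0, f=1.0):
    """HDR.rgb = CHROMA.rgb * LLDR * (1 / (FL^gamma * (1/f))) * 3."""
    # Undo the optional gamma and scaling on FL, then take its reciprocal.
    hdr_luma = lldr * (1.0 / ((fl ** gamma) * (1.0 / f)))
    return tuple(c * hdr_luma * 3.0 for c in chroma_rgb)


# A clamped pixel (LLDR = 1.0) whose FL value of 0.25 restores a luminance of 4.0.
print(decode_hdr_pixel((1 / 3, 1 / 3, 1 / 3), lldr=1.0, fl=0.25))
```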
  • an FL-frame is directly generated for each HDR color channel, as described herein.
  • an HDR RGB color frame 400 is clamped in the visible range of 0 ≤ x ≤ 1.0 to produce a color LDR frame (also "CLDR-frame" 800). That is, the value of each color channel is clamped in the range of 0 ≤ x ≤ 1.0.
  • the resulting CLDR-frame 800 sequence is compressed using any appropriate compression algorithm/technique (e.g., MPEG).
  • each RGB component of the CLDR-frame 800 is divided by each respective RGB component of the original HDR color frame 400 to produce a 24-bit RGB reciprocal fractional color representation ("FC-frame" 801).
  • Gamma correction and a scaling factor 405 optionally may be applied to each FC-frame 801.
  • Formula 014 below represents the above-described processes.
  • CLDR.rgb = CLAMP(HDR.rgb, 0, 1.0)
  • FC.rgb = (CLDR.rgb / HDR.rgb * f) ^ (1/γ)
  • each RGB channel in the FC-frame 801 contains large areas of uniform white pixels (1.0, 1.0, 1.0) or (255, 255, 255), but in the fifth embodiment each color channel also represents the reciprocal fractional RGB color proportion to the original HDR color frame 400, thus including chrominance and residual chrominance values.
  • the FC-frame 801 sequence is compressed using any known compression technique (e.g., MPEG).
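A per-pixel sketch of Formula 014 follows. It works on normalised floating-point values and leaves out the final 8-bit quantization; the guard for a zero HDR channel is an assumption, since the source does not say how that case is handled.

```python
def clamp(x, lo=0.0, hi=1.0):
    """Clamp x to the closed range [lo, hi]."""
    return max(lo, min(hi, x))


def encode_color_pixel(hdr_rgb, f=1.0, gamma=1.0):
    """Formula 014 for one pixel; returns (CLDR.rgb, FC.rgb) as floats."""
    cldr = tuple(clamp(c) for c in hdr_rgb)
    fc = tuple(
        ((l / h) * f) ** (1.0 / gamma) if h > 0.0 else 1.0  # zero guard is an assumption
        for l, h in zip(cldr, hdr_rgb)
    )
    return cldr, fc


# Hypothetical pixel: red and blue exceed 1.0, green does not.
print(encode_color_pixel((2.0, 0.5, 4.0)))
```

Only the channels that were clamped receive an FC value below 1.0, which is how the FC-frame keeps per-channel color information that the CLDR-frame loses.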
  • Figure 9 functionally shows the process of decoding the above-described compressed frame sequences.
  • the frame sequences are de-compressed (e.g., MPEG decoding) and, if applicable, the recovered FC-frames 801 are re-scaled and/or inverse gamma corrected 500.
  • the original HDR frame 400 is recovered by multiplying the LDR color frame 800 with the reciprocal (multiplicative inverse) of the FC-frame 801, as shown in Formula 015 below.
  • HDR.rgb = CLDR.rgb * (1.0 / (FC.rgb ^ γ * (1.0/f)))
  • the FC-frame 801 restores chrominance features in the CLDR-frame 800 which were contained in the original HDR color frame 400.
  • when the original HDR color frame 400 is clamped in the range 0 ≤ x ≤ 1.0, each pixel value that is greater than 1.0 is essentially lost in the CLDR-frame 800, and so is any RGB value difference (compared to the original HDR frame 400).
  • the clamped value in the "visible" (8-bit) range is h'(1.0,1.0,1.0), which corresponds to (255,255,255), which represents white.
  • the present invention stores, along with the h' white pixel, the reciprocal fractional representation of the original HDR pixel (i.e. by applying Formula 014), which is the value f(0.4, 0.66, 0.83).
  • Formula 014 is applied, or a simplified version of this formula is applied.
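The single-pixel example above can be traced numerically. The HDR pixel (2.5, 1.5, 1.2) used here is a hypothetical value, chosen only because its reciprocals agree with the f value quoted above; the sketch uses the simplified form of Formula 014 (scaling factor and gamma both equal to 1) and Formula 015 for the inverse.

```python
def decode_color_pixel(cldr_rgb, fc_rgb, f=1.0, gamma=1.0):
    """Formula 015: HDR.rgb = CLDR.rgb * (1 / (FC.rgb^gamma * (1/f)))."""
    return tuple(
        l * (1.0 / ((c ** gamma) * (1.0 / f)))
        for l, c in zip(cldr_rgb, fc_rgb)
    )


hdr = (2.5, 1.5, 1.2)                         # hypothetical HDR pixel
cldr = tuple(min(1.0, c) for c in hdr)        # clamps to white (1.0, 1.0, 1.0)
fc = tuple(l / h for l, h in zip(cldr, hdr))  # simplified Formula 014
restored = decode_color_pixel(cldr, fc)

print(cldr, fc, restored)
```

The clamped pixel alone is plain white; multiplying it by the reciprocal of the stored fractional value returns the original channel differences.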
  • a further embodiment of the present invention, which enables 48 bit/pixel precision in RGB or 16 bit/pixel precision in the Luma component of an image or video sequence even if said images or video are compressed using common image and video compression algorithms such as JPEG or MPEG, is shown in Figures 10 and 11.
  • a HDR 96 bit/pixel image or video (HDR-Input 1000) and a JPEG or MPEG compressed version of that image or video (COMP-Image 1001) are given as inputs.
  • the luma component of COMP-Image 1001 is then quantized 1002 between 1 and 8 bits, where the number of bits used for the quantization is decided beforehand or stored in the file header. This divides the image into more or less large areas of equally spaced luminance values, whatever the number of bits chosen for quantization.
  • Formula 016: x = abs((input - inf)/(sup - inf)), where: "x" is the resulting interpolated value;
  • inf is the current quantization value so that inf ≤ input
  • LOOP-Image 1004 is to be compressed using JPEG or MPEG algorithms.
  • LOOP-Image 1004 will contain high-contrast edges of pixels at the beginning of each new quantization step, where pixels in the LOOP-Image 1004 corresponding to the end of a quantization step will be white, i.e. (1.0, 1.0, 1.0), while adjacent pixels corresponding to the beginning of the next quantization step will be black, i.e. (0, 0, 0).
  • the LOOP-Image 1004 would not constitute an ideal source input, since it would be prone to artifacts once decompressed.
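The interpolation of Formula 016 can be sketched as follows. The definition of sup is not preserved in this text, so the sketch assumes it is the next quantization value above inf; it also assumes that input is the luma of the HDR input and that the step is chosen from the luma of the compressed image over equally spaced steps. The parameter q_steps (the number of quantization steps) is a hypothetical name.

```python
def loop_value(input_luma, comp_luma, q_steps):
    """Formula 016: x = abs((input - inf) / (sup - inf)) for one pixel."""
    # Index of the quantization step that the compressed luma falls into.
    step = min(int(comp_luma * q_steps), q_steps - 1)
    inf = step / q_steps        # current quantization value
    sup = (step + 1) / q_steps  # assumed: the next quantization value
    return abs((input_luma - inf) / (sup - inf))


# With 4 steps, a luma of 0.30 sits one fifth of the way through the step [0.25, 0.5].
print(loop_value(0.30, 0.30, 4))
```

The value ramps from 0 at the start of a step to 1 at its end and then drops back to 0, which is the source of the high-contrast edges described above.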
  • a "loop switch" 1005 is performed to make the interpolation across different quantization steps contiguous.
  • fract() is a function that returns the fractional part of a floating point number, i.e. (x - floor(x)), where floor(x) computes the largest integer value not greater than x.
  • cstep is the current quantization step value so that cstep ≤ input;
  • loop is the current interpolated value obtained in Formula 016; and
  • the resulting image (LOOPW-Image 1006) will contain only continuous shades in the range [0,255], where pixels at the end of an even quantization step are white, i.e. (1.0, 1.0, 1.0) and so are adjacent pixels at the beginning of the subsequent quantization step.
  • the LOOP-Image 1004 and LOOPW-Image 1006 now effectively represent luminance values at a much smaller quantization step once decoded, using only 8-bit precision.
  • each pixel in the LOOP-Image 1004 or LOOPW-Image 1006 effectively represents a step equal to 1/(256*Q), where Q is equal to the number of quantization steps, i.e. number of bits/pixel, calculated for COMP-Image 1001.
  • the LOOPW-Image 1006 can now be compressed using JPEG or MPEG.
  • the decoding step, shown in Figure 11, requires both the COMP-Image 1001 and the LOOP-Image 1004 or LOOPW-Image 1006 as inputs.
  • the COMP-Image 1001 is quantized 1002 using the same number of steps as in the encoding stage. For each quantization step of COMP-Image 1001, a corresponding pixel from the LOOPW-Image 1006 is decompressed and its value is read back. The quantized value of the current pixel of the COMP-Image 1001 is then incremented by adding the LOOPW-Image 1006 value to it, multiplied by the reciprocal of the total number of quantization steps in COMP-Image 1001, i.e. 1/Q, in accordance with Formula 018:
  • loop is the current interpolated value obtained in Formula 016.
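The decoding step described above can be sketched for one pixel. Formula 018 is not reproduced in this text, so the code follows the prose only: the compressed luma is re-quantized and the interpolated value, scaled by 1/Q, is added back. The sketch takes a plain interpolated value as input (a loop-switched value would first have to be converted back), and q_steps is a hypothetical name for Q.

```python
def decode_luma(comp_luma, loop, q_steps):
    """Rebuild a high-precision luma value from coarse steps plus interpolation."""
    step = min(int(comp_luma * q_steps), q_steps - 1)
    quantized = step / q_steps                  # value of the current step
    return quantized + loop * (1.0 / q_steps)   # add the scaled interpolated value


# Small compression noise in the coarse image (0.31 instead of 0.30) does not
# change the result as long as the pixel stays within the same step.
print(decode_luma(0.31, 0.2, 4))
```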
  • the LDR frame (or CLDR-frame 800) is a clamped RGB version of the HDR input image 400; thus all the pixels whose value is equal to white (1.0, 1.0, 1.0) represent a mask (CLDR-mask) for the reciprocal fractional color representation (FC-frame 801). In fact, only the corresponding pixels in the FC-frame 801 have a value that is < (1.0, 1.0, 1.0).
  • the CLDR-mask is obtained by applying a threshold function to the CLDR-frame 800; it is then multiplied by the FC-frame 801 to obtain a Masked FC-frame (or MFC-frame). After this operation all the white (1.0, 1.0, 1.0) pixels in the MFC-frame are now black (0, 0, 0).
  • the CLDR-mask is then inverted and multiplied by the LOOP-Image 1004 or LOOPW-Image 1006, obtaining black pixels in those areas where the FC-frame 801 stores significant data (Masked LOOPW-Image or MLOOPW-Image).
  • EFC-frame Enhanced FC-frame
  • the EFC-frame can now be compressed using JPEG or MPEG algorithms.
  • the same threshold function is applied to the decoded CLDR-frame 800 and the appropriate decoding method is applied, according to the value of each pixel in the CLDR-mask. If the current pixel in CLDR-mask is white (1.0, 1.0, 1.0) then the decoding method of Figure 9 is applied, otherwise, if the pixel is black (0, 0, 0) the decoding method in Figure 11 is applied.
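The masking steps above can be sketched per pixel. The exact threshold function and the way the two masked images are merged are not preserved in this text, so the sketch assumes a threshold of 1.0 on every channel and assumes the merge is a simple addition; the interpolated value is represented as an RGB triple for convenience.

```python
def cldr_mask(cldr_rgb, threshold=1.0):
    """Return 1.0 where the clamped pixel is white, otherwise 0.0."""
    return 1.0 if all(c >= threshold for c in cldr_rgb) else 0.0


def enhanced_fc_pixel(cldr_rgb, fc_rgb, loop_rgb):
    """Combine the masked FC pixel and the masked interpolated pixel."""
    m = cldr_mask(cldr_rgb)
    mfc = tuple(m * c for c in fc_rgb)               # Masked FC-frame pixel
    mloop = tuple((1.0 - m) * c for c in loop_rgb)   # masked interpolated pixel
    return tuple(a + b for a, b in zip(mfc, mloop))  # assumed merge by addition


# White clamped pixel: the fractional color data is kept.
print(enhanced_fc_pixel((1.0, 1.0, 1.0), (0.4, 0.5, 0.8), (0.2, 0.2, 0.2)))
# Non-white clamped pixel: the interpolated data is kept.
print(enhanced_fc_pixel((0.3, 1.0, 0.2), (1.0, 1.0, 1.0), (0.2, 0.2, 0.2)))
```

A decoder can recompute the same mask from the decoded clamped frame and use it to choose, pixel by pixel, which of the two decoding methods to apply.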
  • the above-discussed sixth embodiment also advantageously allows a movie intended for both the DVD and HD-DVD or Blu-ray Disc formats to be encoded just once, instead of running a common encoding system at least twice in order to support higher density media.
  • because the LOOP-Image 1004 sequence or LOOPW-Image 1006 sequence separately enhances the COMP-Image sequence 1001, the latter can be encoded using MPEG or other algorithms in order to fit a standard DVD size, while the LOOP-Image 1004 sequence or LOOPW-Image 1006 sequence can be set to fit the remaining capacity of an HD-DVD or Blu-ray Disc once the COMP-Image sequence 1001 has been included.
  • the luminance component image is not subjected to a tone-mapping operation, but rather to a clamping function (i.e. the most common and simplest Tone Mapping Operator possible), thus not requiring the extra precision offered by HDR data.
  • the error introduced by optionally scaling the luminance image is not perceptually significant for a number of values of the scaling factor. For example, a scaling factor of 0.5 when applied to the luminance component image, results in an average numeric error of 0.002:1.0, whereas a scaling factor of 0.25 results in an average error of 0.006:1.0.
  • sub-sampling pertains to reducing the original image size or its resolution by 0.5, 0.25, 0.125, etc., or other appropriate step.
  • the present invention does not encompass any size restriction with respect to sub-sampling.
  • sub-sampling may be applied in each of the embodiments described above.
  • in the first embodiment, the CC-texture 103 may be sub-sampled, with the ID-texture 104 generated at any size and resolution; in the second embodiment, the chrominance frame sequence 202 may be sub-sampled; in the third embodiment, the chrominance frame sequence 202 may be sub-sampled; in the fourth embodiment, the chrominance 202 and FL-frame 601 sequences may be sub-sampled; and in the fifth embodiment, the FC-frame 801 sequence may be sub-sampled.
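As an illustration of sub-sampling by a factor of 0.5, the sketch below halves the width and height of a single-channel frame by averaging each 2x2 block. The averaging filter is an assumption; the source does not prescribe how the reduced frame is computed.

```python
def subsample_half(frame):
    """Halve width and height of a 2D list of floats by averaging 2x2 blocks."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [
        [
            (frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
             + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


frame = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.5, 0.5, 1.0, 1.0],
    [0.5, 0.5, 1.0, 1.0],
]
print(subsample_half(frame))
```

Applying the function repeatedly would give the 0.25 and 0.125 reductions mentioned above.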
  • the present invention may be applied in various manners.
  • the present invention may be employed in, but is not limited to, the following: (1) real-time or pre-computed (software) 3D applications; (2) static or dynamic (animated) 2D image applications; (3) hardware/physical implementations (electronic, mechanical, optical, chemical etc.); and (4) hardware/software display devices applications and engineering.
  • the present invention entails at least the following advantageous features, as compared to many prior art systems/processes/techniques: fewer operations during encoding and decoding; smaller output file size (i.e., greater compression ratio); fewer errors introduced; and easier engineering (i.e., less complex system).
  • the present invention does not rely on a specific type of compression algorithm/technique, such as MPEG, but rather employs any of a variety of compression algorithms/techniques and thus is completely compatible with existing compression algorithms.
  • the present invention provides for at least the following advantageous aspects/features: backward compatibility with existing compression techniques; backward compatibility with existing video editing procedures; output data that is as compression-friendly as possible; output data that is editable in real-time; and other features mentioned above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
EP08875717A 2008-10-17 2008-10-17 Textur- und videocodierung mit hohem dynamischem bereich Withdrawn EP2441267A1 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/IB2008/002781 WO2010043922A1 (en) 2008-10-17 2008-10-17 High dynamic range texture and video coding

Publications (1)

Publication Number Publication Date
EP2441267A1 true EP2441267A1 (de) 2012-04-18

Family

ID=40720048

Family Applications (1)

Application Number Title Priority Date Filing Date
EP08875717A Withdrawn EP2441267A1 (de) 2008-10-17 2008-10-17 Textur- und videocodierung mit hohem dynamischem bereich

Country Status (2)

Country Link
EP (1) EP2441267A1 (de)
WO (1) WO2010043922A1 (de)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI479898B (zh) 2010-08-25 2015-04-01 Dolby Lab Licensing Corp 擴展影像動態範圍
ES2694806T3 (es) 2011-03-02 2018-12-27 Dolby Laboratories Licensing Corporation Operador de correlación de tonos multiescala local
WO2012142471A1 (en) 2011-04-14 2012-10-18 Dolby Laboratories Licensing Corporation Multiple color channel multiple regression predictor
CN103493490B (zh) 2011-04-25 2016-12-28 杜比实验室特许公司 非线性视觉动态范围残留量化器
WO2013138148A1 (en) 2012-03-13 2013-09-19 Dolby Laboratories Licensing Corporation Lighting system and method for image and object enhancement
JP6079549B2 (ja) * 2013-10-15 2017-02-15 ソニー株式会社 画像処理装置、画像処理方法
US11202049B2 (en) * 2019-03-15 2021-12-14 Comcast Cable Communications, Llc Methods and systems for managing content items

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1331604A1 (de) * 2002-01-22 2003-07-30 Deutsche Thomson-Brandt Gmbh Verfahren und Vorrichtung zum Speicherzugriff von Block-Kodierern/Dekodierern
US8237865B2 (en) * 2006-12-18 2012-08-07 Emanuele Salvucci Multi-compatible low and high dynamic range and high bit-depth texture and video encoding system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2010043922A1 *

Also Published As

Publication number Publication date
WO2010043922A1 (en) 2010-04-22

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20111021

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20120918