WO2010024782A1 - A method and system for displaying an HDR image on a LDR display device - Google Patents

A method and system for displaying an HDR image on a LDR display device

Info

Publication number
WO2010024782A1
WO2010024782A1 (application PCT/SG2009/000299)
Authority
WO
WIPO (PCT)
Prior art keywords
tone
user
image
sub
hdr image
Prior art date
Application number
PCT/SG2009/000299
Other languages
French (fr)
Other versions
WO2010024782A8 (en)
Inventor
Susanto Rahardja
Farzam Farbiz
Corey Mason Manders
Zhiyong Huang
Suat Ling Jamie Ng
Ee Ping Ong
Zhengguo Li
Jinghong Zheng
Wei Yao
Original Assignee
Agency For Science, Technology And Research
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agency For Science, Technology And Research filed Critical Agency For Science, Technology And Research
Publication of WO2010024782A1 publication Critical patent/WO2010024782A1/en
Publication of WO2010024782A8 publication Critical patent/WO2010024782A8/en

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/10 Intensity circuits
    • G06T5/92
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20172 Image enhancement details
    • G06T2207/20208 High dynamic range [HDR] image processing
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2320/00 Control of display operating conditions
    • G09G2320/02 Improving the quality of display appearance
    • G09G2320/0261 Improving the quality of display appearance in the context of movement of objects on the screen or movement of the observer relative to the screen
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2320/00 Control of display operating conditions
    • G09G2320/02 Improving the quality of display appearance
    • G09G2320/0271 Adjustment of the gradation levels within the range of the gradation scale, e.g. by redistribution or clipping
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2320/00 Control of display operating conditions
    • G09G2320/06 Adjustment of display parameters
    • G09G2320/0613 The adjustment depending on the type of the information to be displayed
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2320/00 Control of display operating conditions
    • G09G2320/06 Adjustment of display parameters
    • G09G2320/0673 Adjustment of display parameters for control of gamma adjustment, e.g. selecting another gamma curve
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2320/00 Control of display operating conditions
    • G09G2320/06 Adjustment of display parameters
    • G09G2320/0686 Adjustment of display parameters with two or more screen areas displaying information with different brightness or colours
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2340/00 Aspects of display data processing
    • G09G2340/14 Solving problems related to the presentation of information to be displayed
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2354/00 Aspects of interface with display user
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2360/00 Aspects of the architecture of display systems
    • G09G2360/16 Calculation or use of calculated indices related to luminance levels in display data

Definitions

  • the present invention relates to a method and system for displaying an HDR image on a LDR display device.
  • HDR: high dynamic range
  • An option to address the problem of displaying HDR content is to compress the range of the HDR image using a local or global tone-mapping operator so that the HDR image may be shown appropriately on LDR display devices.
  • Tone-mapping operators are also referred to as dynamic range reduction operators or tone-mapping functions.
  • these methods can be grouped according to the type of operators (either global operators or local (spatially varying) operators) they use.
  • a few examples of proposed methods using the global operators and the local operators are listed as follows.
  • [Reinhard and Devlin 2005] also proposes a method in which a tone-mapping operator is constructed based on photoreceptor response by modeling the response of cones in the eye to produce a static LDR image.
  • [Reinhard and Devlin 2005] considers the effect of cones and rods in HDR to LDR tone-mapping but does not fully consider the effect of photopigment depletion.
  • the present invention aims to provide a new and useful method and system for displaying HDR content on a LDR display device.
  • the present invention seeks to do so in such a way that the resulting image looks as realistic as possible while showing all the image content.
  • the present invention proposes finding the position on a LDR screen at which a user is looking, and dynamically adjusting a HDR image displayed on the screen by appropriately varying a tone-mapping function according to where the user is looking (that is, the gaze position of the user).
  • contrast resources may be allocated to the region where the user's gaze is centered, to make more visible the detail that is present in that region of the HDR image.
  • the present invention in some forms proposes using eye tracking technology to determine the user's distance and gaze direction to estimate the portion of the image projected on the most sensitive region of the user's retina (the macula).
  • the present invention has some similarity to known techniques of level-of-detail (LOD) and eye tracking [Murphy and Duchowski 2001].
  • LOD: level-of-detail
  • Eye tracking [Murphy and Duchowski 2001].
  • the present invention in some forms further proposes designing the tone-mapping function to model the human visual system, given the characteristics and requirements of the human gaze-aware system.
  • the human visual system uses several methods to interactively adapt to the immense range of light intensities in our day-to-day lives, continually changing to effectively perceive the information we are looking at. Much of this ability of the human visual system has been shown to be based on range compression [Jarna et al. 1993], while maintaining adequate detail in the perceived image. For a display system to be able to offload part of this range reduction, it is preferable that it adequately models and mimics the human visual system.
  • range compression [Jarna et al. 1993]
  • a brief description of the physiology of the human eye is presented below. Often, the eye is modeled as a camera, and much of the design of any camera is inspired by the eye. The pupil is equivalent to the aperture of a camera.
  • the eye adjusts the pupil to globally regulate the light entering the eye and, consequently, reaching the retina.
  • the diameter of the pupil may range from 2 mm up to 5 - 8 mm depending on the age of the person (usually up to 5 mm for elderly people and up to 8 mm for young adults). This change in pupil diameter may be modeled as a global change in illumination, and is presented in greater detail in [Stark and Sherman 1957; Clarke et al. 2003].
  • photopigment molecules consist of two parts, a chromophore (retinal), and a protein background (opsin).
  • the complex of the chromophore and the opsin is termed rhodopsin.
  • When a photon of the correct wavelength impinges upon a particular chromophore, the chromophore instantly changes shape in a process called photoisomerization. This is similar to the photoelectric effect that is exploited in CMOS and CCD sensors [Boyle and Smith 1970].
  • Rhodopsin absorbs incoming light, giving the photoreceptor cells a characteristic purple (termed visual purple) colour. Photoisomerization causes bleaching of the purple, turning it yellow and allowing light to pass deeper to the photopigments below. Bleached photopigments are regenerated continually by the eye, but as the intensity of the light impinging on the photoreceptor increases, the regeneration is unable to match the depletion. As an example, below 100 cd/m², photopigment replenishing can stay ahead of bleaching. Above 100 cd/m², bleaching becomes a critical factor for the local adaptation of a scene. For example, at 5000 cd/m², the amount of unbleached photopigment is about 50%.
  • At still higher intensities, the amount of unbleached photopigment falls to about 0.2%. This phenomenon shows that the eye is in fact performing "local" compression. Furthermore, the logarithmic depletion of the photopigments imposes a secondary compression element on top of the ever-present logarithmic range compression found in all perceptual sensing in the human body.
  • Global adaptation occurs by means of the adjustment of the pupil. Just as a photographer adjusts the aperture diameter of a camera to regulate the amount of light that film or sensor arrays are exposed to, the human eye adjusts the pupil to similarly regulate the amount of incoming light.
  • Local adaptation occurs in the photopigment molecules which are produced in the human eye allowing the transduction of light into a physiological response. These photopigments are continually "bleached" and therefore must be regenerated or re-assembled to re-enable their function. This physiological mechanism contributes to a local adaptation as opposed to a global adaptation by means of pupil dilation.
  • the physiology of the human eye serves as a basis as to how a dynamic display which reacts to changes in a similar way as that of a human eye can be created.
  • While the macular pigment is certainly an important component in vision, it is not appropriate to model this component for creating a dynamic display. Rather, the component that is more likely to be of primary consideration is the performance of the photoreceptors.
  • the pupil contributes very little to the overall adaptation of the human visual system (less …
  • the present invention proposes using a first step of performing adaptation (i.e. long term adaptation) in which the image is linearly scaled in intensity such that the mean quantimetric value [Mann 2001] maps to the midpoint of the output range before gamma correction is applied.
  • This step corresponds to the linear scaling during pupil expansion or contraction.
  • Modeling the response of the photoreceptors is significantly more complex as compared to modeling the pupillary reflex of the eye, partly because there is less information known about the process. Additionally, at this first stage of processing in the human visual system, the processing chain is remarkably intricate.
  • the present invention proposes using well-known physical concepts, for example pupil diameter and adjustment times, for the amount of light entering the eye. In certain embodiments, the present invention further proposes using the following points 1 to 6 to model opsin transductance and the methods in which it adapts to differing light intensities.
  • the eye has various global adaptation mechanisms such as pupil size, general photoreceptor response.
  • the present invention in some forms proposes incorporating a global adaptation mechanism.
  • the cone receptors that are located in the macula provide higher visual acuity than rods.
  • the maximal cone concentration is in the very center of the macula (the fovea).
  • the present invention in some forms proposes a window-based approach addressing locality.
  • Cone receptors act as a network, where a cluster of cones contribute to the output of the ganglion cells and receptors closer to the dendritic tree based in the ganglion cells contribute more to the perception of the incoming light than those at the edges.
  • the present invention in some forms proposes a cluster-based approach to model the light response.
  • Photoreceptor bleaching is largely a local phenomenon acting in tandem with global adaptation (reacting to environmental lighting). Transductance is changed corresponding to photons impinging on particular photopigments causing photoisomerization.
  • the present invention in some forms proposes a method which may be used to combine global adaptation with opsin bleaching/photoreceptor response.
  • the present invention in some forms proposes employing a time adaptation process in tone-mapping.
  • the fovea has a much higher concentration of cone receptors, and thus contributes more to the overall perceived image.
  • the maximal cone concentration is in the very center of the fovea. It should also be noted that 50% of the fibers in the optic nerve are used for transmitting information from the fovea, while the remaining fibers carry information from the rest of the retina.
  • the present invention in some forms proposes a tone-mapping function according to where the user is looking.
  • Fig. 2 includes a histogram 202 of the numbers of pixels in the region corresponding to the ROM area for each possible value of luminance (0 to 1.0).
  • the linear mapping function is shown as the dashed line 204 and the adjusted mapping function is shown as a solid line 206.
  • a relatively high proportion of the pixels in the ROM area have a luminance centered in a narrow sub-range of luminance values (this sub-range is 0.35 to 0.65).
  • the present invention in some forms proposes adjusting the tone-mapping function which transfers the HDR content to the LDR content according to this observation, to expand the dynamic range of the ROM area.
  • the present invention in some forms further proposes implementing the display system at real time speeds and varying the tone-mapping function in real time by performing much of the tone-mapping in an accelerated manner using a balance of pre-computation and real time GPU-based rendering.
  • a first aspect of the present invention provides a method for displaying an HDR image on a LDR display device to a user, the method comprising the steps of repeatedly: (1a) estimating a gaze position of the user by tracking at least one eye of the user, the gaze position of the user being a position on a screen of the LDR device at which the user is looking; (1b) deriving an output image from the HDR image based on the estimated gaze position of the user; and (1c) displaying the output image on the LDR display device.
  • the invention may alternatively be expressed as a computer system for performing such a method.
  • This computer system may be integrated with a device for capturing HDR images.
  • the computer system performs the method by running a display program with a shader program.
  • the invention may also be expressed as a computer program product, such as one recorded on a tangible computer medium, containing program instructions operable by a computer system to perform the steps of the method.
  • Fig. 1 illustrates the physiology of a human eye
  • Fig. 2 illustrates a histogram of pixel luminance values in a region corresponding to a ROM area, a linear tone-mapping function and an adjusted tone-mapping function
  • Fig. 3 illustrates a flow diagram of a method 300 which displays an HDR image on an LDR display device according to an embodiment of the present invention
  • Fig. 4 illustrates a cluster-based tone-mapping function derived in a first example of sub-step 314b of step 314 of method 300 when the number of clusters N is 4;
  • Fig. 5 illustrates slopes derived for each cluster center in a second example of sub-step 314b of step 314 of method 300 when the number of clusters N is 4;
  • Fig. 6 illustrates a piecewise linear function and a cluster-based tone-mapping function derived from the slopes of Fig. 5;
  • Figs. 7(a) and (b) illustrate results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and
  • Figs. 7(c) and (d) illustrate results after applying a global and local tone- mapping algorithm on the input image respectively;
  • Figs. 8(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 8(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • Figs. 9(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 9(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • FIGs. 10(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and
  • Figs. 10(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • Figs. 11 (a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 11(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • Figs. 12(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 12(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • Figs. 13(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 13(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • Figs. 14(a) and (b) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 14(c) and (d) illustrate results after applying a local and global tone-mapping algorithm on the input image respectively.
  • Figs. 15(a) - (d) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) as the user walks nearer the image;
  • Figs. 16(a) - (c) illustrate results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user looks at different portions of the image;
  • Figs. 17(a) - (c) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user looks at different portions of the image;
  • Fig. 18 illustrates results of a user study performed using method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
  • Fig. 19 illustrates results of a first user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
  • Fig. 20 illustrates further results of the first user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
  • Fig. 21 illustrates a real world HDR reference image used in a second user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency); and
  • Fig. 22(a) illustrates an original HDR image
  • Figs. 22(b) - (c) illustrate results of method 300 when the user is looking at different portions of the image after replacing the cluster-based tone-mapping function with a global tone-mapping operator.
  • Fig. 3 shows a method 300 which is an embodiment of the present invention, and which displays an HDR image on a LDR display device.
  • the user distance to the display and information from eye tracking are used to produce an appropriately tone mapped dynamic LDR image, able to react to a user's change in gaze.
  • the input to the method 300 is an HDR image.
  • In step 302, a long term adaptation component is computed, and in step 304 the global scale of the input HDR image is adjusted to form the image marked as "A".
  • In step 306, the distance between the user and the screen of the display device is determined, whereas in step 308 an X-Y position of the user's gaze on the screen of the display device is determined.
  • In step 310, a region of observation by the macula (ROM) area is estimated using the distance between the user and the screen of the display device (determined in step 306) and the X-Y position of the user's gaze on the screen of the display device (determined in step 308).
  • In step 312, illumination data for each of the ROM area pixels is determined, and this is used to calculate a cluster-based tone-mapping function in step 314. Further in step 314, cluster-based tone-mapping is performed on the input HDR image using the cluster-based tone-mapping function, to form the image shown as "B".
  • In step 316, a bleaching factor is calculated. This is used together with the globally scaled image A from step 304 and the cluster-based tone mapped image B from step 314 to perform illumination scaling on the image to derive an illumination scaled image C. The values in the illumination scaled image are then converted from luminance values to RGB values. This gives the "output image" which is displayed using the LDR display device.
  • Steps 302 and 304 correspond to the long term adaptation of photoreceptors and pupil adjustment in a human eye.
  • a long term adaptation component S_g is computed and is used to adjust the global scale of the image. This is performed by calculating the illumination value (also referred to as the HDR luminance) for each pixel in the HDR image in step 302 and using the HDR luminance of each pixel to calculate the long term adaptation component S_g.
  • the long term adaptation component S_g is in turn used to adjust the global scale of the image accordingly in step 304.
  • the long term adaptation component S_g is defined in Equation (1), i.e. S_g = 0.5 / Ī, where Ī denotes the average HDR luminance of the image.
  • Equation (1) calculates an average illumination value for the pixels in the HDR image (the average HDR luminance) using the illumination values calculated for the pixels in the HDR image in step 302, and maps the average HDR luminance of the HDR image to approximately the middle of the display range of the display device.
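  • As an illustration only (not the patent's literal Equation (1)), the following Python sketch computes the long term adaptation component S_g as just described, assuming the display midpoint is 0.5 and the HDR image is given as an array of linear luminance values:

```python
import numpy as np

def global_scale(hdr_lum: np.ndarray, display_mid: float = 0.5):
    """Steps 302-304: map the average HDR luminance to the middle of the
    display range (a reading of Equation (1); the exact form is assumed)."""
    s_g = display_mid / hdr_lum.mean()   # long term adaptation component S_g
    return s_g, hdr_lum * s_g            # globally scaled image "A"
```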
  • the ROM area may be referred to as an area on a screen of the display device at which the user is looking. In one example, it is the center of the viewer's gaze on the screen and refers to the area of the screen that falls into the macular region inside the retina of the viewer's eyes.
  • the ROM area on the screen is estimated. This is performed by determining the user's distance from the screen (also referred to as the view distance) in step 306 to determine the position (X, Y, Z) as mentioned above, calculating the X-Y position of the user's gaze on the screen in step 308 to determine the point (u, v) (also referred to as the view point) and subsequently estimating the ROM area on the screen in step 310 using the position (X, Y, Z) from step 306 and the point (u, v) from step 308.
  • Steps 306 and 308 are performed using eye tracking technology.
  • the ROM area on the screen is estimated as a square window centered at the point (u, v) with sides of length l_ROM (in pixels), where l_ROM is defined according to Equation (2).
  • the ROM area is estimated as a square to simplify the calculations required, and θ_macula is set as 18°.
  • the ROM area may be estimated as any other shape, for example an oval, and θ_macula may be set to a different value.
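  • Since the body of Equation (2) is not reproduced here, the following sketch gives one plausible geometric reading: the ROM window side subtends θ_macula = 18° at viewing distance Z, converted to pixels with an assumed resolution factor (pixels_per_metre is a hypothetical parameter, not from the patent):

```python
import numpy as np

def rom_window_pixels(view_dist_m: float, pixels_per_metre: float,
                      theta_macula_deg: float = 18.0) -> int:
    """Side length l_ROM (in pixels) of the square ROM window centered at
    the gaze point (u, v); this formula is an assumption, not Equation (2)."""
    half_angle = np.radians(theta_macula_deg / 2.0)
    side_m = 2.0 * view_dist_m * np.tan(half_angle)  # screen extent subtended
    return int(round(side_m * pixels_per_metre))
```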
  • the lowest level of information available is the individual image pixels corresponding to the pixels (or photosites) present in the camera's sensor.
  • pixels with comparable intensities in the ROM area are clustered in accordance with [Sterling 1999] in step 312 and a tone-mapping function is obtained from this clustering and is applied on the HDR image in step 314.
  • step 312 the illumination data of each of the ROM area pixels is determined whereas in step 314, tone-mapping is performed on the entire image using a tone-mapping function that is determined based on the illumination data of each of the ROM area pixels.
  • Step 314 comprises sub-steps 314a and 314b.
  • sub-step 314a clustering is performed on the ROM area pixels whereas in sub-step 314b, a cluster-based tone-mapping function is obtained based on the clustering of the ROM area pixels and is applied on the ROM area.
  • the ROM area pixels are first grouped into N classes based on the illumination data of each ROM area pixel as calculated in step 312. This is performed using a K-means clustering algorithm proposed in [Kanungo et al. 2002; Kanungo et al. 2004]. After performing the K-means clustering algorithm on the ROM area pixels, a set of N cluster centers (denoted as [C_0 ... C_{N-1}]) is obtained. Hence, the tone-mapping operator is reduced to an N-dimensional vector via the clustering.
  • a K-means clustering algorithm is used to cluster the ROM area pixels and N is set to 4.
  • the ROM area pixels may be clustered using other clustering algorithms and N may be set to any other value.
  • a value of N higher than 4 is not preferred as this results in a higher computational cost, and experiments on different test images have shown that increasing the number of clusters N beyond 4 does not make a noticeable difference in the output. This could be because the neighbouring pixels for most of the image belong to fewer than 4 different intensity regions.
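  • A minimal 1-D K-means sketch for this clustering step is given below; the cited [Kanungo et al.] algorithm is more sophisticated, so this plain Lloyd-style iteration only illustrates how the N = 4 cluster centers C_0 ... C_{N-1} could be obtained from the ROM pixel luminances:

```python
import numpy as np

def cluster_centers(rom_lum: np.ndarray, n_clusters: int = 4,
                    n_iter: int = 20) -> np.ndarray:
    """1-D K-means over ROM pixel luminances; returns sorted centers."""
    x = rom_lum.ravel()
    # spread the initial centers across the luminance distribution
    centers = np.quantile(x, np.linspace(0.1, 0.9, n_clusters))
    for _ in range(n_iter):
        labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean()
    return np.sort(centers)
```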
  • a tone-mapping function which is a piecewise linear function, is derived based on the clustering of pixels in the ROM area and is then applied on the entire image to obtain a tone mapped image.
  • the cluster centers are used to impose knots and are used as control points in the tone-mapping function.
  • A first and a second example of sub-step 314b are presented as follows.
  • In the first example of sub-step 314b, the slope S_i of the piecewise linear tone-mapping function is derived according to Equation (4).
  • N may be set as 4 in sub-step 314b.
  • each piecewise linear region of the tone-mapping function as shown in Fig. 4 is allocated a particular gain (slope).
  • the cluster centers are then mapped to equidistant intensities in the range of the tone-mapping function in the process of obtaining the tone-mapped image.
  • N is preferably set as a value greater than 3 in the first example of sub-step 314b.
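  • The first example of sub-step 314b (cluster centers mapped to equidistant output intensities, with a piecewise linear curve between them) could be sketched as follows; the placement of the two end knots at 0 and at the maximum luminance is an assumption:

```python
import numpy as np

def tone_map_equidistant(lum: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Map each cluster center C_i to an equidistant intensity in [0, 1]
    and interpolate linearly in between (first example of sub-step 314b)."""
    c = np.sort(np.asarray(centers, float))
    knots_in = np.concatenate(([0.0], c, [max(float(lum.max()), c[-1] + 1e-6)]))
    knots_out = np.linspace(0.0, 1.0, len(c) + 2)  # equidistant targets
    return np.interp(lum, knots_in, knots_out)
```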
  • pixel clusters are best viewed if each cluster center is mapped to the middle of the LDR range [0,1] with a slope inversely proportional to its luminance, according to Weber's law of just noticeable difference (JND) [Wyszecki and Stiles 2000].
  • JND: just noticeable difference
  • a desired slope S_i^d is calculated for each cluster center C_i as 0.5/C_i, as it is not possible to map all cluster centers to the middle of the display range simultaneously. This creates a linear segment with a gradient equal to the slope S_i^d for each cluster, as shown in Fig. 5.
  • the linear segments are then required to be joined to produce a monotonically non-decreasing tone-mapping function which is a piecewise linear tone- mapping function.
  • If the piecewise linear tone-mapping function is formed by matching each segment of the function to the desired slope S_i^d, the function will likely extend past 1, as shown by the solid line in Fig. 6.
  • a weighted normalization procedure based on the cluster populations P_i (i.e. the number of pixels in cluster i) is performed on the linear segments prior to joining them, to derive a piecewise linear tone-mapping function S_i defined according to Equation (5).
  • In Equation (5), B_i is the cluster boundary which falls between clusters C_{i-1} and C_i.
  • the dotted line in Fig. 6 gives an example of this piecewise linear tone-mapping function S_i, which is also referred to as the cluster-based tone-mapping function.
  • Equation (5) is advantageous as clusters which are heavily populated will not be significantly compressed whereas clusters with low populations will receive greater compression. Furthermore, Equation (5) is not completely dependent on histogram population and accordingly does not suffer from the extreme contrast enhancement which is sometimes present in the result of histogram equalization.
  • a piecewise linear tone-mapping function is used due to its flexibility [Mantiuk et al.].
  • the slope in dynamic regions where clusters of intensities exist can be increased whereas this adjustment can be compensated for by decreasing the slope in low population regions.
  • alternatively, cubic B-splines can be used to model the tone-mapping function, thereby providing C² continuity.
  • the piecewise linear method is preferable as it can be easily implemented as a GPU shader program, able to easily achieve real-time performance.
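  • The exact form of Equation (5) is not reproduced above, but the following sketch implements the stated behaviour of the second example of sub-step 314b: desired slopes S_i^d = 0.5/C_i per Weber's law, then a population-weighted normalisation so that heavily populated clusters are compressed least while the curve stays monotone and ends at 1. The particular weighting used here is an assumption, not the patent's formula:

```python
import numpy as np

def cluster_tone_curve(lum, centers, populations):
    """Second example of sub-step 314b (in the spirit of Equation (5))."""
    c = np.sort(np.asarray(centers, float))
    # cluster boundaries B_i between C_{i-1} and C_i (midpoints assumed)
    bounds = np.concatenate(([0.0], (c[:-1] + c[1:]) / 2.0,
                             [max(1.0, float(np.max(lum)))]))
    widths = np.diff(bounds)
    rises = (0.5 / np.maximum(c, 1e-6)) * widths  # rise at desired slope S_i^d
    w = np.asarray(populations, float)
    rises *= w / w.sum()                          # favour dense clusters
    rises /= rises.sum()                          # total output range = 1
    knots_out = np.concatenate(([0.0], np.cumsum(rises)))
    return np.interp(lum, bounds, knots_out)      # monotone piecewise linear
```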
  • the key illumination adaptation (and corresponding contrast optimization) is done by the photoreceptors.
  • the adaptation is known to be accomplished by a mechanism of bleaching in the photopigment.
  • The photoreceptor response is modeled by Equation (6), which is often termed the Naka-Rushton equation. This equation is a function of the unbleached photopigment at a particular instance and follows from the experiments described in [Naka and Rushton 1966].
  • σ is the value that causes half-maximal response and is a semi-saturation constant [Pattanaik et al. 2000].
  • the value n is a sensitivity control which is similar to gamma for video, film and typical displays.
  • R(M) = R_max · M^n / (M^n + σ^n)    (6)
  • a bleaching factor β is calculated in step 316 using Equation (6), in order to blend the photopigment phenomena with the adaptation adjusted image.
  • β is set as R(M) of Equation (6), and R(M) is calculated according to a first and a second example of step 316, as described below.
  • First example of step 316
  • R(M) is calculated using the variables obtained from Equations (7) - (9).
  • The value M in Equation (6) (denoted as M_ROM in Equation (7)) is calculated according to Equation (7), in which T_ROM denotes the total number of pixels in the ROM area. Note that calculating M_ROM according to Equation (7) is consistent with using the average radiometric value of the user's ROM area.
  • the variable σ is calculated according to Equation (8), which is based on the method described in [Pattanaik et al. 2000] in which findings of [Hunt 1995] were used to aid in modeling the response of the photoreceptors. Equation (8) is derived when Equation (6) is specifically applied to the cones, and A_cone is the fixed adaptation illumination amount, which corresponds directly to the global illumination average value.
  • Hunt's bleaching parameter R_max is calculated according to Equation (9), whereby A_cone is defined in the same way as in Equation (8).
  • M_ROM = (1 / T_ROM) · Σ_{(i,j) ∈ ROM area} I(i,j)    (7)
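  • A sketch of the bleaching factor computation (Equations (6) and (7)) follows; σ and R_max are left as plain parameters because the bodies of Equations (8)-(11) are not reproduced here, and the sensitivity n = 0.73 (a value commonly used for cones in the literature) is an assumption:

```python
import numpy as np

def bleaching_factor(rom_lum: np.ndarray, sigma: float,
                     r_max: float = 1.0, n: float = 0.73) -> float:
    """beta = R(M_ROM) via the Naka-Rushton equation (Equation (6)),
    with M_ROM the mean ROM luminance (Equation (7))."""
    m = rom_lum.mean()                   # M_ROM
    return r_max * m**n / (m**n + sigma**n)
```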
  • Second example of step 316
  • R(M) is calculated using the variables obtained from Equations (7), (10) and (11).
  • In the second example, σ is calculated according to Equation (10), which is also based on the method described in [Pattanaik et al. 2000] in which findings of [Hunt 1995] were used to aid in modeling the response of the photoreceptors, whereas Hunt's bleaching parameter R_max is calculated according to Equation (11), whereby k is the scaling factor necessary to convert normalized pixel values ∈ [0,1] to luminance values in cd/m² and is equal to the maximum luminance of the scene in cd/m².
  • k is determined empirically. Equations (10) and (11) are derived from Equations (8) and (9).
  • An illumination scaled image is then derived by performing a weighted sum of the globally scaled image from step 304 and the tone-mapped image from step 314 using the bleaching factor β.
  • the bleaching factor β calculated in step 316 is used to linearly weight the (photopigment) enhanced processing of the image (the output image after performing cluster-based tone-mapping in step 314) with the globally scaled adaptation image (the output image after performing global scaling in step 304).
  • a first example of performing the illumination scaling is done according to Equation (12), in which I_out(u',v') is the intensity at the spatial location (u',v') of the illumination scaled image when the user's gaze is centered at the point (u,v) on the screen of the display device, β is the bleaching factor obtained in step 316, I(u',v') is the HDR luminance of the pixel at spatial location (u',v'), L(u',v') is the LDR image (the tone-mapped result obtained from the first example of sub-step 314b) at spatial location (u',v'), and I(u',v')·S_g is the globally scaled image (obtained from step 304) at spatial location (u',v').
  • L(u',v') is defined according to Equation (13).
  • S_i is the tone-mapping function obtained in Equation (5), whereas the definitions of the remaining symbols are the same as those in Equation (5).
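  • The blend of Equation (12) can be sketched as below; which term carries β versus (1 - β) is an assumption, since the text only states that β linearly weights the cluster tone-mapped image against the globally scaled image:

```python
import numpy as np

def illumination_scale(hdr_lum, tone_mapped, s_g, beta):
    """One reading of Equation (12): beta-weighted sum of the tone-mapped
    image L(u', v') and the globally scaled image I(u', v') * S_g."""
    return beta * tone_mapped + (1.0 - beta) * hdr_lum * s_g
```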
  • A second example of deriving the illumination scaled image I_out(u',v') is performed according to Equation (14).
  • In Equation (14), D is the distance between the viewpoint and the center of the user's gaze, as calculated in Equation (3).
  • the illumination values in the illumination scaled image are then converted to RGB color values to obtain an output image of method 300.
  • the conversion of the illumination values in the illumination scaled image to RGB color values is performed by relative scaling according to Equation (15).
  • the conversion of the illumination values in the illumination scaled image to RGB color values is performed using the method of [Schlick 1994; Mantiuk et al. 2008] according to Equation (15a).
  • gamma is set as 0.45 in Equation (15a).
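  • A hedged sketch of the colour restoration step follows, using the per-channel ratio approach associated with [Schlick 1994]; the exact forms of Equations (15) and (15a) are not reproduced in the text, so this is an illustration only:

```python
import numpy as np

def restore_color(rgb, lum_in, lum_out, s: float = 0.45):
    """C_out = (C_in / L_in)^s * L_out per channel, with s playing the
    role of the gamma value 0.45 mentioned for Equation (15a)."""
    ratio = rgb / np.maximum(lum_in[..., None], 1e-6)
    return np.clip(ratio**s * lum_out[..., None], 0.0, 1.0)
```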
  • A first example of simulating eye adaptation latency is performed according to Equation (16). Given each newly computed value v_i(t) (which may be, for example, a cluster center), each parameter p_i(t) at time t is updated using an exponential smoothing technique according to Equation (16), where w controls the exponential delay factor.
  • In a second example, P(t) is calculated according to Equation (17), where t is in seconds and τ is in milliseconds.
  • P(t) is the vector of parameters used for the current tone-mapping and P_ss(ROM_new) are the steady state parameters for the new ROM area the user is focused on.
  • the vector P(t_0) is the set of parameters corresponding to the moment t_0 when the user's ROM area is changed.
  • the value of τ changes when the user changes the gaze position from a light region to a dark region, or conversely from a dark region to a light region. This reflects the adaptation of the human visual system in each scenario.
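  • Both latency examples can be sketched as follows; Equation (16) is standard exponential smoothing as described, while the decay form used for Equation (17) is an assumption consistent with the stated quantities P(t_0), P_ss(ROM_new), t and τ:

```python
import numpy as np

def smooth_params(p_prev, v_new, w: float):
    """First example (Equation (16)): p_i(t) = (1-w)*p_i(t-1) + w*v_i(t)."""
    return (1.0 - w) * np.asarray(p_prev) + w * np.asarray(v_new)

def decay_params(p_t0, p_ss, t: float, t0: float, tau_ms: float):
    """Second example (Equation (17), assumed form): exponential decay from
    the parameters at t_0 toward the steady state of the new ROM area."""
    a = np.exp(-(t - t0) / (tau_ms / 1000.0))  # t in seconds, tau in ms
    return np.asarray(p_ss) + (np.asarray(p_t0) - np.asarray(p_ss)) * a
```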
  • the window for the ROM area used for a typical image has a radius of 50 pixels.
  • the input image is divided into 50 x 50 window blocks where each of the 50 x 50 window blocks is referred to as a ROM window block whereas the center of each ROM window block is referred to as a grid position.
  • a set of N cluster centers on the image are pre-set as pixels separated at regular pixel intervals both horizontally and vertically.
  • N is set as four and the pixel interval ranged from 10 to 50 pixels. As the interval decreases, the consistency between the intervals increases, with the consequence of increasing the pre-computation time.
  • the bleaching factor β is also pre-computed for each ROM window block.
  • the display program is written in OpenGL, employing a vertex shader. Given the HDR photoquantities in RGB space, each spatial location in the HDR image is mapped to a vertex2D in the OpenGL program. The color of each vertex is then set using color3f(r,g,b) to the raw photometric value of the HDR image. The mapping presented in Equations (12), (13) and (15) is implemented as a vertex shader which is able to set the four cluster centers in addition to a β parameter.
  • the eye tracking system sends updated gaze positions to the OpenGL program over UDP at a rate of 60 Hz. Since the cluster centers and the bleaching factor are pre-computed, steps 312 - 316 of method 300 are replaced by the following steps.
  • the OpenGL (CPU) program locates four sets of N cluster centers belonging to the ROM window blocks with grid positions (the centers of the 50 x 50 ROM window blocks) closest to the user's current gaze position. It then linearly interpolates these four sets of N preset cluster centers given the distance from the user's current gaze position to each of the four grid positions, to obtain N interpolated preset cluster centers.
  • the bleaching factor β is also interpolated in the same manner.
  • a delay factor according to Equation (16) or (17) is then applied to the interpolated preset cluster centers and the interpolated bleaching factor β to obtain updated interpolated cluster centers and an updated interpolated bleaching factor β.
  • the updated interpolated cluster centers are then set as the elements of a uniform float4 vector and are input into the OpenGL vertex shader.
  • the updated interpolated bleaching factor β is input into the OpenGL vertex shader as well.
  • Each vertex then computes its new display value given the N updated interpolated preset cluster centers and the updated interpolated β parameter.
  • Equation (12) blends the tone-mapping based on the clustering algorithm with a linear tone curve in a per-pixel manner using the β parameter.
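  • The CPU-side interpolation of the pre-computed parameters might look like the sketch below; inverse-distance weighting is an assumption, as the text says only that the four nearest grid positions are linearly interpolated by distance:

```python
import numpy as np

def interp_params(gaze_xy, grid_xy, grid_params):
    """Blend the four pre-computed parameter sets (N cluster centers plus
    beta) of the grid positions nearest the current gaze position."""
    d = np.linalg.norm(np.asarray(grid_xy, float)
                       - np.asarray(gaze_xy, float), axis=1)
    w = 1.0 / np.maximum(d, 1e-6)          # nearer grid points weigh more
    w /= w.sum()
    return (w[:, None] * np.asarray(grid_params, float)).sum(axis=0)
```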
  • the image is discretized into a grid of 50 x 50 pixel segments, where each of these grid segments is referred to as a single ROM block or a pre-computed ROM area. This is the smallest ROM area which may be represented.
  • the parameters for the tone-mapping function (i.e. the N cluster centers) are pre-computed for each single ROM block.
  • the parameters of all possible contiguous 2 x 2, 3 x 3, ..., n x n ROM blocks across the entire image are also pre-computed whereby n is the larger dimension of the width or height divided by the discretization size.
  • For blocks at the image boundary, the "missing rows/columns" of the block are discarded and the pre-computation of the parameters is done only on the available image information.
  • the bleaching factor β is also pre-computed.
  • In the second example, the OpenGL program employs a pixel shader. Given the HDR photoquantities in RGB space, each spatial location in the HDR image is mapped to a …
  • the eye tracking system sends updated gaze positions to the program over User Datagram Protocol (UDP) at a rate of 60Hz. Since the cluster centers and the bleaching factor are pre-computed, steps 312 - 316 of method 300 are replaced by the following steps.
  • the program in the second example finds the pre-computed ROM area which is most applicable to the user's gaze and distance. This most applicable pre-computed ROM area is the pre-computed ROM area which overlaps the most with the estimated ROM area of the user.
  • the delay factor according to Equations (16) and (17) is then applied to the corresponding pre-computed tone-mapping parameters of the best-fit ROM area (i.e. the N cluster centers) and to the pre-computed bleaching factor β.
  • These updated pre-computed tone-mapping parameters and the updated bleaching factor β are then sent to the Graphics Processing Unit (GPU).
  • the GPU shader program in the embodiments of the present invention computes the tone-mapping function based on the updated pre-computed tone-mapping parameters and applies it to each pixel of the HDR image in parallel to obtain a tone-mapped image. An output image is then derived using the tone-mapped image and the updated bleaching factor β.
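  • Selecting the best-fit pre-computed ROM area in the second example amounts to a maximum-overlap search over the pre-computed block pyramid; a simple rectangle-overlap sketch follows (representing areas as (x0, y0, x1, y1) tuples is an assumption):

```python
def best_precomputed_rom(est_rom, blocks):
    """Return the pre-computed ROM area overlapping the estimated ROM most."""
    def overlap(a, b):
        w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
        h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
        return w * h
    return max(blocks, key=lambda blk: overlap(est_rom, blk))
```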
  • the system in the embodiments of the present invention adaptively tunes the dynamic range resources available in the display in correspondence to where the user is looking.
  • Some LDR snapshots of the HDR viewing system in the embodiments of the present invention are presented with the user looking at differing spatial locations. These images are compared to the static results of two state-of-the-art tone-mapping algorithms, one global method [Reinhard and Devlin 2005] and one local method [Fattal et al. 2002].
  • Figs. 7 - 13 show results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) as compared to the local and global mapping algorithms of [Fattal et al. 2002] and [Reinhard and Devlin 2005] respectively.
  • Figs. 7 - 13, parts (a) and (b), show the results of method 300 when a user is looking at different portions of the image.
  • Figs. 7(c) - 13(c) show the results when a global tone-mapping function (Reinhard) is applied, whereas Figs. 7(d) - 13(d) show the results when a local tone-mapping function (Fattal) is applied.
  • Results from method 300 (using the second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) are shown in Fig. 14.
  • Fig. 14(a) shows the image obtained from method 300 when the user is looking at one portion of the image, whereas Fig. 14(b) shows the image obtained from method 300 when the user is looking at the window.
  • Fig. 14(c) shows the image obtained from applying the local tone-mapping algorithm [Fattal et al. 2002] on the input image, whereas Fig. 14(d) shows the image obtained from applying the global tone-mapping algorithm [Reinhard and Devlin 2005].
  • Figs. 15(a) - (d) illustrate the output image from method 300 (using the second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the ROM area encompasses the entire image, 512 x 512 pixels, 128 x 128 pixels, and 32 x 32 pixels respectively. These are the results for the case where the user walks toward the screen. As shown in Fig. 15, the number of pixels in the ROM area decreases (from Fig. 15(a) to 15(d)) as the user gets closer. Consequently, as shown in Fig. 15, the contrast of the specific region the user is looking at increases as the user's distance decreases.
  • Figs. 16 and 17 also show the contrast and details shown in different regions of the output images obtained from method 300 (using respectively the first and second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user is looking at different regions of each of the images.
  • testing the system in the embodiments of the present invention is a somewhat problematic issue, as there is no similar existing system.
  • the point of the embodiments of the present invention is to adapt the widespread current commercial technology, in other words the LDR display devices, to display HDR content in a natural way.
  • the usability study was conducted with 23 subjects.
  • the subjects were pulled from diverse backgrounds such as media artists, photo hobbyists, media designers, visual communication designers, and market researchers.
  • the subjects also included research scientists and engineers in HCI, computer graphics, virtual reality, and image processing. About one quarter of the subjects consider themselves proficient in photography, or are photography enthusiasts.
  • the 3 images were labeled Image 1, Image 2 and Image 3 respectively.
  • the image used was a photo taken at a popular tourist location in Singapore, known as The Esplanade. Multiple exposures were taken at known exposure times in RAW mode, and used to create HDR images. The best LDR shot was chosen to represent Image 1.
  • To generate Image 2, we used the program qpfstmo [qpfstmo 2007] (based on pfstmo [Mantiuk and Krawczyk 2007]), which implements several tone-mapping algorithms, including [Reinhard and Devlin 2005], to produce the tone-mapped version of the image. The parameters were tuned by an image processing engineer to achieve the best possible result using this software.
  • test results revealed that subjects generally based their preference on image features such as brightness, contrast, vividness of the colors, realism and the amount of details.
  • More information is shown in the pie chart in Fig. 18, which shows the percentage of subjects choosing Image 1, 2 or 3 as the image giving them the greatest sense of realism.
  • Subjects preferred the interactive and dynamic feature of Image 3 (when asked on a Likert scale of 1 - 5, 78% agreed, of which 48% strongly agreed on this point). Subjects also liked Image 3 as it can enhance the area of interest. It was further recommended that more depth information may be included (for example, by using a 3D image instead) to improve the realism of the image.
  • method 300 did give the perception of expanding the dynamic range of the display.
  • subjects' opinions were sought as to whether the bright areas had more detail.
  • the result was that 64% of subjects agreed (of which 50% strongly agreed) that they perceived more detail in the bright area whereas 43% felt that even the dark area was also perceived to have more detail.
  • method 300 did produce an output image which gives the perception of increasing the dynamic range of the display.
  • method 300 achieved a greater sense of realism as compared to existing HDR to LDR tone-mapping techniques. Also, the study showed that the dynamic range of the display was increased perceptually, in particular in the brighter regions of the HDR image by method 300.
  • User study results using method 300 (with second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency)
  • the first study compares the result obtained from method 300 against the result of static tone-mapping algorithms on 10 reference images.
  • the second study involves a real-world reference scene constructed by the inventors.
  • the first study involved 20 gender-balanced and unbiased subjects between the ages of 16 - 35. Subjects were tested to ensure that they had normal color vision and were shown 10 different HDR images processed by the following three algorithms:
  • [C] method 300, the gaze-adaptive HDR display method according to the embodiments of the present invention.
  • the HDR images were selected from the HDR DVD provided by the text [Reinhard et al. 2005].
  • the independent variables in this study are the 3 algorithms (first level) and the 10 images (second level).
  • the dependent variable is the subject's preferential response.
  • the experiment was set up in a dark room (4 lux ambient luminance) with light provided only by the Barco Galaxy 9 HC+ projector used to display the images.
  • the projector has a maximum output of 9,000 ANSI lumens and a contrast ratio of 1700:1, with a resolution of 1280 x 1024.
  • the images were projected on a large display (2m x 1.5m) and a chair was set up 1.5m from the screen for immersive viewing.
  • a Seeing Machines FaceLab Version 4 was used for eye tracking.
  • an HP xw9400 workstation with a dual core AMD 3.20 GHz Opteron, 16 GB RAM and NVidia Quadro FX 4600 graphics card was used. Subjects were given the impression that all three algorithms were developed by the inventors and were asked to rank the algorithms based on their preferences and perceived realism for each of the resulting 10 images.
  • Fig. 20 illustrates the subjects' perceived realism of the results for the same 10 reference images.
  • the differences for images 3, 7, 9 and 10 were found to be insignificant at p < 0.05, although the difference for image 3 was found to be significant at p < 0.06.
  • the DR values of the 10 HDR test images used in this first user study are shown in Table 1.
  • This first user study revealed that if the dynamic range (DR) of the image was relatively low (DR less than 3, computed as log10(max/min) as shown in Table 1), users preferred the result of global tone-mapping. However, as the dynamic range of the image increased, users preferred the results of method 300. Artifacts produced by the global tone-mapping were rather obvious on the large display for these images.
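  • The DR measure used in Table 1 is straightforward to compute; for example:

```python
import numpy as np

def dynamic_range(hdr_lum: np.ndarray) -> float:
    """DR = log10(max / min) over the positive luminances of the image."""
    lum = hdr_lum[hdr_lum > 0]
    return float(np.log10(lum.max() / lum.min()))
```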
  • a Pearson correlation analysis indicates a large correlation [Cohen 1988] between the respondents' preferences and perceived realism for algorithms versus DR values, as can be seen in Table 2, which shows the correlation coefficient (r) between the respondents' preferences and the perceived realism versus image DR values.
  • the second user study involved 10 gender-balanced and unbiased subjects between the ages 16 - 45 with normal color vision.
  • Scene luminance was measured with a PR670 colorimeter. The parameters of the three algorithms were tuned using this data to set the global luminance appropriately (exposure time for Algorithm A, key value for Algorithm B, and S_g for Algorithm C).
  • this second user study required the subjects to focus on three regions of the scene to evaluate which of the three algorithms best preserved the details and realism.
  • the three regions are: (i) the brightly illuminated poster in the top section, (ii) the moderately illuminated bunch of flowers in the middle section, and (iii) the dimly illuminated objects under the table in the bottom section of the scene.
  • the user studies confirmed that both the realism and the details of the scene were best maintained by method 300 and hence the users found the resulting display of the HDR images more natural. These user studies also showed that the dynamic range of the display is perceptually increased, in particular in the bright regions of the HDR images.
  • the first and second user studies were conducted in a dark room to alleviate most of the ambient light issues.
  • ambient light variations may be included into the system in the embodiments of the present invention for the system to be used in different environments.
  • In [Mantiuk et al. 2008], ambient light when displaying images was considered in detail.
  • Implementing the work in [Mantiuk et al. 2008] may entail measuring the ambient light before displaying and modifying the final output to the user.
  • the work of [Mantiuk et al. 2008] also considers some aspects of the human visual system and may be incorporated into the embodiments of the present invention.
  • Fig. 22 shows the results when the global tone-mapping operator presented in [Reinhard et al. 2002] is used in place of the cluster-based tone-mapping operator.
  • Fig. 22(a) shows the original input HDR image
  • Figs. 22(b) and (c) show the output image of method 300, with the cluster-based tone-mapping function replaced by the global tone-mapping operator, when the user is looking at the center of the image and at the plant in the right portion of the image respectively.
  • the parameters of the global tone-mapping operator are calculated based on the ROM area alone and not the entire image. The global tone-mapping operator is then applied to the entire image.
  • a tone-mapping function that is based on a model of the human visual system and changes in real time, approximating the adaptation of this system, is derived. This is advantageous because of the following reasons.
  • the embodiments of the present invention can amplify the adaptive mechanisms of the human eye to compensate for the restrictions imposed by the display and also to enhance the viewing experience for the user.
  • the embodiments of the present invention can effectively offload some of the range compression and compensation that is done by the human visual system onto the display system thus perceptually increasing the dynamic range.
  • This coupled with transitional latency also taken from the adaptation process of the human visual system, allows the creation of a display capable of dynamically showing HDR content in a manner which is natural to the user.
  • the adaptive display system is able to display HDR images on an LDR device based on the user's gaze and distance (using eye tracking technology). This is advantageous because of the following reasons.
  • this allows the content of the display to be adapted according to where the user is looking, so that a particular area is presented in a natural way, but with appropriate contrast to represent the details in that particular area.
  • the system in the embodiments of the present invention can ease the amount of compression needed by reducing the spatial area that the tone-mapping function must operate on.
  • the display resources which are being optimized, such as contrast, intensity, etc. are largely only considered at the point where the user is looking. In regions where the user's gaze is not focused, resources are not critical and thus may be partially or fully discarded. For example, if a region is saturated or too dark, but the user is not looking in that area, there is no reason to apply resources to that area.
  • the latency in the response of the human eye to changes in viewing a scene is considered and a latency factor is included in the change of the parameters when the user's viewpoint changes. This is advantageous because rapid discrete changes which are not natural to the user can be avoided. Instead, in the embodiments of the present invention, the device displaying the scene makes rapid but gradual adjustments, continually adapting to what the user is viewing, hence making the adaptive display look natural.
  • the embodiments of the present invention are implemented in real time by tracking the gaze of a user viewing an HDR image in real time and by computing an appropriate tone-mapping function for the HDR image at real-time rates, using both pre-computation for a particular image and real time GPU-based rendering.
  • the pre-computation for the tone-mapping function is done on the ROM-subdivided images. After determining the pre-computed ROM areas, the latency computation etc., may be used without modification to achieve satisfying results.
  • This real time implementation of method 300 is achieved by using a shader program on the GPU to perform all necessary real time operations and is advantageous because it can improve the computational time of the locally adaptive operator in the embodiments of the present invention. Since the real time approach taken in the embodiments of the present invention feeds a set of parameters to a shader program and a latency factor is included in the change of the parameters when the user's viewpoint changes, it is also possible to continually smooth the transition from one contrast enhancement to another by creating a gradual delay in each of the parameters. Further advantages of the embodiments of the present invention are as follows.
  • the embodiments of the present invention can create a dynamic display system which adapts interactively to the user's view by creating different LDR images. According to the studies done (one comparing the display of an HDR image to the actual HDR scene), the embodiments of the present invention were successful in creating the illusion of viewing a HDR scene when in reality, the image is displayed on a LDR display.
  • the embodiments of the present invention can enhance the effectiveness of other global tone-mapping operators. Any global operator may be applied and its effectiveness can be improved using the display system according to the embodiments of the present invention. Global tone-mapping operators and some local tone-mapping operators can hence benefit from this system. However, even though most tone-mapping operators can be used with this system, it may be difficult to apply a small percentage of local tone-mapping operators.
  • the interactive gaze-aware component in the embodiments of the present invention may be expanded to virtual and telepresence applications or any other application which benefits from adding a dynamic element to a static scene. Since telepresence is usually considered as not only "the feeling of being there" but also "the total response to being in a place and being able to interact with the environment" [Riva et al. 2003], the system in the embodiments of the present invention can be effectively used for such an application. Because embodiments of the present invention simulate the inherent interaction of the human eye in response to light, they will therefore be able to provide a natural, interactive experience in such applications.
  • the embodiments of the present invention may also be used for gaze-selected active refocusing of images.
  • an image may be retroactively refocused (or appear to have its focal plane changed). Intentional blur is commonly used by photographers to draw attention to a subject.
  • with an active refocusing technique, it would be possible to show a refocused scene with an interactive component.
  • content in the same focal plane would be in focus, while objects closer or farther away would be appropriately out of focus.
  • the embodiments of the present invention can also help to represent the properties of the human eye more appropriately.
  • HDR display devices: appropriate for showing HDR images, but not common as a display medium.
  • Embodiments of the present invention: interactively display reasonable LDR images. Although the method according to the embodiments of the present invention is tuned to only a single user's view, it is a truly effective method of extracting dynamic content from a static image.
REFERENCES
RIVA, G., ET AL. 2003. Presence 2010: The emergence of ambient intelligence. In G. Riva, F. Davide, and W.A. IJsselsteijn (Eds.), Being There: Concepts, effects and measurement of user presence in synthetic environments. IOS Press, 60-81.
SCHLICK, C. 1994. Quantization techniques for the visualization of high dynamic range pictures. Photorealistic Rendering Techniques, Proc. 5th Eurographics Rendering Workshop, 7-20.

Abstract

A method for displaying an HDR image on a LDR display device to a user. The method comprises the steps of repeatedly: (1a) estimating a gaze position of the user by tracking at least one eye of the user, the gaze position of the user being a position on a screen of the LDR device at which the user is looking; (1b) deriving an output image from the HDR image based on the estimated gaze position of the user; and (1c) displaying the output image on the LDR display device.

Description

A Method and System for Displaying an HDR Image on a LDR Display Device
Field of the invention
The present invention relates to a method and system for displaying an HDR image on a LDR display device.
Background of the Invention
Although the dynamic range of commercial displays is gradually increasing, there are currently only a small number of displays capable of showing relatively large dynamic ranges [Seetzen et al 2004, Bimber and Iwai 2008].
Unfortunately, many of these displays are fairly costly and have not become commercially available. Even if high dynamic range (HDR) display technology becomes prevalent, the majority of today's display devices only show content in a low dynamic range (LDR).
An option to address the problem of displaying HDR content is to compress the range of the HDR image using a local or global tone-mapping operator so that the HDR image may be shown appropriately on LDR display devices. [Reinhard and Devlin 2005] provides a survey of currently available methods of displaying HDR content using tone-mapping operators (also referred to as dynamic range reduction operators or tone-mapping functions). Generally, these methods can be grouped according to the type of operators (either global operators or local (spatially varying) operators) they use. A few examples of proposed methods using the global operators and the local operators are listed as follows.
• Global operators: [Larson et al. 1997; Tumblin and Rushmeier 1993; Larson et al. 1994; Drago et al. 2003; Reinhard and Devlin 2005; Reinhard et al. 2002; Schlick 1994; Ferwerda et al. 1996; Tumblin et al. 1999; Pattanaik et al. 2000] • Local (spatially varying) operators: [Mantiuk et al. 2005; Fattal et al. 2002; Tumblin and Turk 1999; Li et al. 2005; Ashikhmin 2002; Durand and Dorsey 2002; Pattanaik et al. 1998].
There are tradeoffs when considering these two types of operators. As a general observation, global operators tend to perform faster without introducing a large amount of artifacts but the use of global operators causes a loss of local image details and hence, a lack of realism. Local operators, on the other hand, are better at preserving local contrast and image details. However, local operators often introduce visual artifacts into the LDR results hence, also resulting in a lack of realism.
[Reinhard and Devlin 2005] also proposes a method in which a tone-mapping operator is constructed based on photoreceptor response by modeling the response of cones in the eye to produce a static LDR image. However, [Reinhard and Devlin 2005] considers the effect of cones and rods in HDR to LDR tone-mapping but does not fully consider the effect of photopigment depletion.
As mentioned in [Mantiuk et al. 2008], the recent trend in displaying HDR images is towards interactive techniques. Examples of these are [Farbman et al. 2008] and [Lischinski et al. 2006]. However, these are strictly image editing techniques and do not interact with the user in real time. There are also interactive techniques which show local previews of areas as the user moves a pointer into a specific area. For example, both [Photomatrix 2003] and the Bright-side HDR display webpage [Dolby 2006] use a method whereby the user moves a mouse pointer over sections of an image and in the section where the pointer is, the luminance of the section is adjusted to an appropriate level. However, in such methods, propagating the change to the entire image produces rapid changes that are very unnatural. Furthermore, such methods are not very effective in displaying various local regions of a given HDR image without resulting in overwhelming global changes. [Toyama 2009] describes another interactive method which utilizes a GUI to aid the user in deciding how to map the HDR content to LDR content. The interaction (for example, selection of the region of interest, selection of parameters, application of the parameters and running of the tone-mapping function) between the user and the system in this method is done on a GUI system and a final "tone-mapped" image is obtained. However, the final "tone-mapped" image obtained from this method is not dynamically changing.
Work related to HDR videos has also been presented, for example in [Kang et al. 2003]. However, such work uses dynamic changes in video to increase the overall dynamic range of the video and not a single static HDR image to produce a dynamic version of the image.
Summary of the invention
The present invention aims to provide a new and useful method and system for displaying HDR content on a LDR display device. The present invention seeks to do so in such a way that the resulting image looks as realistic as possible while showing all the image content.
In general terms, the present invention proposes finding the position on a LDR screen at which a user is looking, and dynamically adjusting a HDR image displayed on the screen by appropriately varying a tone-mapping function according to where the user is looking (that is, the gaze position of the user).
For example, in regions which the user is not looking at, little of the "contrast resources" (that is, available dynamic range) of the display device may be applied. These regions correspond to areas of the retina far from the fovea, where the proportion of cones in the retina is lower. The fovea (as shown in Fig. 1 ) is located in the center of the macula region of the retina and is responsible for sharp central vision [Hunt 1995]. On the other hand, contrast resources may be allocated to the region where the user's gaze is centered, to make more visible the detail that is present in that region of the HDR image.
To vary the tone-mapping function according to where the user is looking, the present invention in some forms proposes using eye tracking technology to determine the user's distance and gaze direction to estimate the portion of the image projected on the most sensitive region of the user's retina (the macula).
The present invention has some similarity to known techniques of level-of-detail (LOD) and eye tracking [Murphy and Duchowski 2001]. In [Murphy and
Duchowski 2001], rendering of a scene is done dynamically by rendering greater detail at the point where the user's gaze is focused and this concept follows the phenomenon of foveation. The method in [Murphy and Duchowski
2001] changes the level of detail, whereas the present invention proposes changing the luminance which can in turn increase tonal detail perceived by a user. Similar work using eye tracking based texture mapping in an openGL environment is presented in [Nikolov et al. 2004].
The present invention in some forms further proposes designing the tone-mapping function to model the human visual system, given the characteristics and requirements of the human gaze-aware visual system.
The human visual system uses several methods to interactively adapt to the incredible range of light intensities in our day to day lives, continually changing to effectively perceive the information we are looking at. Much of this ability of the human visual system has been shown to be based on range compression [Jayant et al. 1993], while maintaining adequate detail in the perceived image. For a display system to be able to offload part of this range reduction, it is preferable that it adequately models and mimics the human visual system.

A brief description of the physiology of the human eye is presented below. Often, the eye is modeled as a camera, and much of the design of any camera is inspired by the eye. The pupil is equivalent to the aperture of a camera. Just as a photographer adjusts the aperture of a camera to allow more or less light to reach a film or a photosensor, the eye adjusts the pupil to globally adjust the light entering the eye, and consequently the retina. The diameter of the pupil may range from 2 mm to 5 - 8 mm depending on the age of the person (usually 5 mm for the elderly and 8 mm for young adults). This change in pupil diameter may be modeled as a global change in illumination, and is presented in greater detail in [Stark and Sherman 1957; Clarke et al. 2003].
Multiple studies in the human visual system [Dowling 1987; Haig 1941] show that the majority of light adaptation occurs in the retina. In the retina, there are two types of photoreceptors: approximately 100 million rods, appropriate for dim light and night vision; and 6 million cones for perceiving daylight scenes, color, and high contrast vision. Interestingly, the structure of each type of receptor is very similar. Of particular interest are the photopigment molecules of the cones which are embedded in each photoreceptor (there may be up to 10,000 of these cones). More details about the specific structure of the photoreceptors can be found in [Paupoo et al. 2000].
Previous medical research has shown that photopigment molecules consist of two parts, a chromophore (retinal), and a protein background (opsin). The complex of the chromophore and the opsin are termed rhodopsin. When a photon of the correct wavelength impinges upon a particular chromophore, it instantly changes shape in a process called photoisomerization. This is similar to CMOS and CCD sensors and the photoelectric effect that is exploited [Boyle and Smith 1970].
Rhodopsin absorbs incoming light, giving the rhodopsin cells a characteristic purple (termed visual purple) colour. Photoisomerization causes bleaching of the purple, turning it yellow and allowing light to pass deeper into the photopigments below. Bleached photopigments are regenerated continually by the eye, but as the intensity of the light impinging on the photoreceptor increases, the regeneration is unable to match the depletion. As an example, below 100 cd/m2, photopigment replenishing can stay ahead of bleaching. After 100 cd/m2, bleaching becomes a critical factor for the local adaptation of a scene. For example, at 5,000 cd/m2, the amount of unbleached photopigment is about 50%. At 1,000,000 cd/m2, the amount of unbleached photopigment is at about 0.2%. This phenomenon shows that the eye is in fact doing "local" compression. Furthermore, the logarithmic depletion of the photopigments imposes a secondary compression element to the ever present logarithmic range compression which is present in all perceptual sensing in the human body.
Further results from studies on photoreceptors show that there is integration in neighboring cells due to the "wiring" of the photoreceptors to the retina's ganglion cells [Sterling 1999]. As mentioned in [Pattanaik et al. 2000; Reinhard and Devlin 2005], the cones in the retina are laterally connected to each other through electrical synapses. This signal is brought to the next layer of the vision system through glutamatergic synapses to bipolar cells. These bipolar cells are also laterally connected through electrical synapses and make forward connections to glutamatergic synapses onto the dendrites of ganglion cells [Dowling 1987]. This results in the ganglion cells receiving information from several bipolar cells which each acquire information from several cones. Furthermore, a single cone contributes to several ganglion cells. The work of Sterling [Sterling 1999] also states that "quantitative anatomical studies have revealed that a ganglion receives many synapses from a bipolar cell aligned with the center of its dendritic tree, but few synapses from the edge of its dendritic tree. Most of the synapses formed by these bipolar cells are onto the neighboring ganglion cells with which they are more closely aligned".

When viewing a HDR scene, the eye is continually adapting. As a person looks at different content in a HDR scene, his or her human visual system continually adjusts the manner in which light is perceived. There is a fair amount of local and global adaptation that is done in parallel in response to the changes in the display. Global adaptation occurs by means of the adjustment of the pupil. Just as a photographer adjusts the aperture diameter of a camera to regulate the amount of light that film or sensor arrays are exposed to, the human eye adjusts the pupil to similarly regulate the amount of incoming light. Local adaptation occurs in the photopigment molecules which are produced in the human eye, allowing the transduction of light into a physiological response. These photopigments are continually "bleached" and therefore must be regenerated or re-assembled to re-enable their function. This physiological mechanism contributes to a local adaptation, as opposed to the global adaptation by means of pupil dilation.
Given the physiology of the human eye, a faithful model compressing an HDR scene is simply unrealistic to implement. Furthermore, as suggested in [Reinhard and Devlin 2005], this would be adding unnecessary complexity to the model. However, the physiology of the human eye serves as a basis for how a dynamic display which reacts to changes in a similar way as a human eye can be created. For example, though the macular pigment is certainly an important component in vision, it is not appropriate to model this component for creating a dynamic display. Rather, the component that is more likely to be of primary consideration is the performance of the photoreceptors. As stated in [Pattanaik et al. 2000] and multiple studies of human vision, the pupil contributes very little to the overall adaptation of the human visual system.
In one aspect, the present invention proposes using a first step of performing adaptation (i.e. long term adaptation) in which the image is linearly scaled in intensity such that the mean quantimetric value [Mann 2001] maps to the midpoint of the output range before gamma correction is applied. This step corresponds to the linear scaling during pupil expansion or contraction.
Modeling the response of the photoreceptors is significantly more complex as compared to modeling the pupillary reflex of the eye, partly because there is less information known about the process. Additionally, at this first stage of processing in the human visual system, the processing chain is remarkably intricate. In terms of pupillary reflex, the present invention proposes using well-known physical concepts, for example pupil diameter and adjustment times, for the amount of light entering the eye. In certain embodiments, the present invention further proposes using the following points 1 to 6 to model opsin transductance and the methods in which it adapts to differing light intensities.
1. The eye has various global adaptation mechanisms such as pupil size and general photoreceptor response. Hence, the present invention in some forms proposes incorporating a global adaptation mechanism.
2. The cone receptors that are located in the macula provide higher visual acuity than rods. The maximal cone concentration is in the very center of the macula (the fovea). Hence, the present invention in some forms proposes a window-based approach addressing locality.
3. Cone receptors act as a network, where a cluster of cones contribute to the output of the ganglion cells and receptors closer to the dendritic tree based in the ganglion cells contribute more to the perception of the incoming light than those at the edges. Hence, the present invention in some forms proposes a cluster-based approach to model the light response.
4. Photoreceptor bleaching is largely a local phenomenon acting in tandem with global adaptation (reacting to environmental lighting). Transductance is changed corresponding to photons impinging on particular photopigments causing photoisomerization. Hence, the present invention in some forms proposes a method which may be used to combine global adaptation with opsin bleaching/photoreceptor response.
5. A time factor is involved. When looking at bright objects over time, opsin bleaching allows greater detail to be observed, essentially compressing the dynamic range more than the other physical compression systems allow.
Hence, the present invention in some forms proposes employing a time adaptation process in tone-mapping.
6. The fovea has a much higher concentration of cone receptors, and thus contributes more to the overall perceived image. The maximal cone concentration is in the very center of the fovea. It should also be noted that 50% of the fibers in the optic nerve are used for transmitting information from the fovea, while the remaining fibers carry information from the rest of the retina. Hence, as described above, the present invention in some forms proposes a tone-mapping function according to where the user is looking.
Considering this final point, in [Pattanaik et al 2000], it is stated that "most retinal cells vary their response only within a range of intensities that is very narrow if compared against the entire range of vision. Adaptation processes dynamically adjust this narrow response function to conform better to the available light". Based on this observation, the present invention proposes in some forms increasing the contrast in the areas corresponding to the higher visual acuity in the fovea, as shown graphically in Fig. 2. Fig. 2 includes a histogram 202 of the number of pixels in the region corresponding to the ROM area for each possible value of luminescence (0 to 1.0). The linear mapping function is shown as the dashed line 204 and the adjusted mapping function is shown as a solid line 206. As one can see from the histogram 202 in Fig. 2, a relatively high proportion of the pixels in the ROM area have a luminance centered in a narrow sub-range of luminescence values (this sub-range is 0.35 to 0.65). The present invention in some forms proposes adjusting the tone-mapping function which transfers the HDR content to the LDR content according to this observation, to expand the dynamic range of the ROM area.
The present invention in some forms further proposes implementing the display system at real time speeds and varying the tone-mapping function in real time by performing much of the tone-mapping in an accelerated manner using a balance of pre-computation and real time GPU-based rendering.
More specifically, a first aspect of the present invention provides a method for displaying an HDR image on a LDR display device to a user, the method comprising the steps of repeatedly: (1a) estimating a gaze position of the user by tracking at least one eye of the user, the gaze position of the user being a position on a screen of the LDR device at which the user is looking; (1b) deriving an output image from the HDR image based on the estimated gaze position of the user; and (1c) displaying the output image on the LDR display device.
The invention may alternatively be expressed as a computer system for performing such a method. This computer system may be integrated with a device for capturing HDR images. Preferably, the computer system performs the method by running a display program with a shader program. The invention may also be expressed as a computer program product, such as one recorded on a tangible computer medium, containing program instructions operable by a computer system to perform the steps of the method.
Brief Description of the Figures
An embodiment of the invention will now be illustrated, for the sake of example only, with reference to the following drawings, in which:
Fig. 1 illustrates the physiology of a human eye;
Fig. 2 illustrates a histogram of pixel luminescence values in a region corresponding to a ROM area, a linear tone-mapping function and an adjusted tone-mapping function;
Fig. 3 illustrates a flow diagram of a method 300 which displays an HDR image on an LDR display device according to an embodiment of the present invention;
Fig. 4 illustrates a cluster-based tone-mapping function derived in a first example of sub-step 314b of step 314 of method 300 when the number of clusters N is 4;
Fig. 5 illustrates slopes derived for each cluster center in a second example of sub-step 314b of step 314 of method 300 when the number of clusters N is 4;
Fig. 6 illustrates a piecewise linear function and a cluster-based tone-mapping function derived from the slopes of Fig. 5;
Figs. 7(a) and (b) illustrate results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 7(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 8(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 8(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 9(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 9(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 10(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 10(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 11 (a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 11(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 12(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 12(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 13(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 13(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 14(a) and (b) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 14(c) and (d) illustrate results after applying a local and global tone-mapping algorithm on the input image respectively.
Figs. 15(a) - (d) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) as the user walks nearer the image;
Figs. 16(a) - (c) illustrate results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user looks at different portions of the image;
Figs. 17(a) - (c) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user looks at different portions of the image;
Fig. 18 illustrates results of a user study performed using method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
Fig. 19 illustrates results of a first user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
Fig. 20 illustrates further results of the first user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
Fig. 21 illustrates a real world HDR reference image used in a second user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency); and
Fig. 22(a) illustrates an original HDR image and Figs. 22(b) - (c) illustrate results of method 300 when the user is looking at different portions of the image after replacing the cluster-based tone-mapping function with a global tone- mapping operator.
Detailed Description of the Embodiments
Referring to Fig. 3, the steps are illustrated of a method 300 which is an embodiment of the present invention, and which displays an HDR image on a LDR display device. In method 300, the user's distance to the display and information from eye tracking are used to produce an appropriately tone-mapped dynamic LDR image, able to react to a change in the user's gaze.
The input to the method 300 is an HDR image. In step 302, a long term adaptation component is computed, and in step 304 the global scale of the input HDR image is adjusted to form the image marked as "A". In step 306, the distance between the user and the screen of the display device is determined whereas in step 308, an X-Y position of the user's gaze on the screen of the display device is determined. In step 310, a region of observation by the macula (ROM) area is estimated using the distance between the user and the screen of the display device (determined in step 306) and the X-Y position of the user's gaze on the screen of the display device (determined in step 308). In step 312, illumination data for each of the ROM area pixels is determined, and this is used to calculate a cluster-based tone-mapping function in step 314. Further in step 314, cluster-based tone-mapping is performed on the input HDR image using the cluster-based tone-mapping function, to form the image shown as "B". In step 316, a bleaching factor is calculated. This is used together with the globally scaled image A from step 304 and the cluster-based tone mapped image B from step 314 to perform illumination scaling on the image to derive an illumination scaled image C. The values in the illumination scaled image are then converted from luminance values to RGB values. This gives the "output image" which is displayed using the LDR display device.
1. Adjusting global scale of image (Steps 302 and 304)
Steps 302 and 304 correspond to the long term adaptation of photoreceptors and pupil adjustment in a human eye.
In steps 302 and 304, a long term adaptation component S_g is computed and is used to adjust the global scale of the image. This is performed by calculating the illumination value (also referred to as the HDR luminance) for each pixel in the HDR image in step 302 and using the HDR luminance of each pixel to calculate the long term adaptation component S_g. The long term adaptation component S_g is in turn used to adjust the global scale of the image accordingly in step 304.
The long term adaptation component S_g is defined in Equation (1), where I(i,j) is the HDR luminance of a given pixel at spatial location (i,j) of the input image, and W and H are the corresponding width and height of the input image in pixels. In essence, Equation (1) calculates an average illumination value for the pixels in the HDR image (the average HDR luminance) using the illumination values calculated in step 302, and maps the average HDR luminance of the HDR image to approximately the middle of the display range of the display device.

S_g = (0.5 · W · H) / (Σ_{i,j} I(i,j))    (1)
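By way of illustration only, a minimal numpy sketch of steps 302 and 304 is given below. The function name, the use of a normalized 2-D luminance array and the interface are assumptions for this sketch, not part of method 300.

    import numpy as np

    def global_scale(hdr_luminance):
        """Long term adaptation (steps 302/304): scale the image so that the
        mean HDR luminance maps to the middle (0.5) of the output range."""
        # Equation (1) reduces to 0.5 divided by the mean luminance.
        s_g = 0.5 / hdr_luminance.mean()
        # Image "A": globally scaled luminance (gamma correction comes later).
        return s_g, hdr_luminance * s_g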
2. Estimating Region of Observation by the Macula (ROM) area (Steps 306, 308 and 310)
The ROM area may be referred to as an area on a screen of the display device at which the user is looking. In one example, it is the center of the viewer's gaze on the screen and refers to the area of the screen that falls into the macular region inside the retina of the viewer's eyes.
The greatest concentration of photoreceptors in the human eye is in a very small region of the retina termed the fovea. This region, the narrow center of our vision, is the most sensitive, whereas the larger region encompassing the fovea, called the macula, contains the cone cells. Given the optics of the human visual system, the visual angle projected on the macula of the user's eye is approximately α_macula = 18°. If a user is standing at a position (X,Y,Z) with respect to the screen center and is looking at an area centered at a point (u,v) on the screen of the display device, the ROM area on the screen of the display device will generally be an oval shape.
In steps 306, 308 and 310, the ROM area on the screen is estimated. This is performed by determining the user's distance from the screen (also referred to as the view distance) in step 306 to determine the position (X,Y,Z) as mentioned above, calculating the X-Y position of the user's gaze on the screen in step 308 to determine the point (u,v) (also referred to as the view point) and subsequently, estimating the ROM area on the screen in step 310 using the position (X,Y,Z) from step 306 and the point (u,v) from step 308.
Steps 306 and 308 are performed using eye tracking technology. In step 310, the ROM area on the screen is estimated as a square window centered at the point (u,v) with sides of length l_ROM (in pixels), where l_ROM is defined according to Equation (2). In Equation (2), L_h is the height of the screen in meters and L_ROM = D · α_macula is the size of the ROM area in meters. α_macula is the visual angle projected on the macula and is set to 18°, whereas D is the distance between the center of the user's gaze on the screen (point (u,v) in pixels) and the user's position (X,Y,Z) in real space, and is defined according to Equation (3), whereby L_w and L_h are the width and height of the screen measured in meters, and W and H are the corresponding width and height of the image in pixels (the terms (u/W − 1/2)·L_w and (v/H − 1/2)·L_h convert the gaze point from pixels to meters relative to the screen center).

l_ROM = (L_ROM / L_h) · H    (2)

D = √( (X − (u/W − 1/2)·L_w)² + (Y − (v/H − 1/2)·L_h)² + Z² )    (3)
In step 310, the ROM area is estimated as a square to simplify the required calculations and α_macula is set as 18°. Alternatively, the ROM area may be estimated as any other shape, for example an oval, and α_macula may be set to a different value.
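For illustration, a minimal sketch of the ROM window estimate of Equations (2) and (3) is given below. The center-relative pixel-to-meter convention, the use of α_macula in radians and all argument names are assumptions of this sketch.

    import numpy as np

    def rom_side_pixels(gaze_uv, user_xyz, screen_wh_m, image_wh_px,
                        alpha_macula=np.deg2rad(18.0)):
        """Estimate the side length l_ROM (in pixels) of the square ROM window
        (step 310), per Equations (2) and (3)."""
        u, v = gaze_uv                  # gaze position on screen, in pixels
        W, H = image_wh_px              # image width and height, in pixels
        L_w, L_h = screen_wh_m          # screen width and height, in meters
        X, Y, Z = user_xyz              # user position relative to screen center
        gx = (u / W - 0.5) * L_w        # gaze point in meters from screen center
        gy = (v / H - 0.5) * L_h
        D = np.sqrt((X - gx) ** 2 + (Y - gy) ** 2 + Z ** 2)   # Equation (3)
        L_rom = D * alpha_macula        # ROM area size in meters
        return int(round(L_rom / L_h * H))                    # Equation (2)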
3. Performing cluster-based tone-mapping
In method 300, the lowest level of information available is the individual image pixels corresponding to the pixels (or photosites) present in the camera's sensor. To mimic the clustering of a signal as it is provided to the brain from the ganglion cells, pixels with comparable intensities in the ROM area are clustered in accordance with [Sterling 1999] in step 312 and a tone-mapping function is obtained from this clustering and is applied on the HDR image in step 314.
More specifically, in step 312, the illumination data of each of the ROM area pixels is determined whereas in step 314, tone-mapping is performed on the entire image using a tone-mapping function that is determined based on the illumination data of each of the ROM area pixels.
Step 314 comprises sub-steps 314a and 314b. In sub-step 314a, clustering is performed on the ROM area pixels whereas in sub-step 314b, a cluster-based tone-mapping function is obtained based on the clustering of the ROM area pixels and is applied on the ROM area.
Sub-step 314a
In sub-step 314a, the ROM area pixels are first grouped into N classes based on the illumination data of each ROM area pixel as calculated in step 312. This is performed using a K-means clustering algorithm proposed in [Kanungo et al. 2002; Kanungo et al. 2004]. After performing the K-means clustering algorithm on the ROM area pixels, a set of N cluster centers (denoted as [C_0, ..., C_{N−1}]) is obtained. Hence, the tone-mapping operator is reduced to an N-dimensional vector via the clustering.
In one realisation of sub-step 314a, a K-means clustering algorithm is used to cluster the ROM area pixels and N is set to 4. Alternatively, the ROM area pixels may be clustered using other clustering algorithms and N may be set to any other value. However, a value of N higher than 4 is not preferred as this results in a higher computational cost, and experiments on different test images have shown that increasing the number of clusters N beyond 4 does not make a noticeable difference in the output. This could be because the neighbouring pixels for most of the image belong to fewer than 4 different intensity regions.
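As an illustrative sketch of sub-step 314a only, a simple Lloyd-style K-means over the one-dimensional ROM luminances could look as follows; it stands in for the algorithm of [Kanungo et al. 2002; Kanungo et al. 2004], and the function name and interface are assumptions.

    import numpy as np

    def rom_cluster_centers(rom_luminance, n_clusters=4, n_iters=20, seed=0):
        """Sub-step 314a (sketch): K-means on the 1-D luminances of the ROM
        area pixels. Returns sorted cluster centers and their populations."""
        values = np.asarray(rom_luminance, dtype=float).ravel()
        rng = np.random.default_rng(seed)
        centers = rng.choice(values, size=n_clusters, replace=False)
        for _ in range(n_iters):
            # Assign each pixel to its nearest center, then recompute means.
            labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
            for c in range(n_clusters):
                if np.any(labels == c):
                    centers[c] = values[labels == c].mean()
        order = np.argsort(centers)
        populations = np.array([(labels == c).sum() for c in order])
        return centers[order], populations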
Sub-step 314b

In sub-step 314b, a tone-mapping function, which is a piecewise linear function, is derived based on the clustering of pixels in the ROM area and is then applied to the entire image to obtain a tone-mapped image. The cluster centers are used to impose knots and are used as control points in the tone-mapping function. First and second examples of sub-step 314b are presented as follows.
First example of sub-step 314b
In a first example of sub-step 314b, the slope S_i of the piecewise linear tone-mapping function is derived according to Equation (4). In Equation (4), C_i are the cluster centers where i = {1,...,N}, C_0 = 0, C_{N+1} = 1, and O_i = 1/N + O_{i−1} where i = {2,...,N}, O_0 = 0, O_1 = 1/N, and O_{N+1} = 1.

S_i = (O_{i+1} − O_i) / (C_{i+1} − C_i),  i = {0,1,...,N}    (4)

Although this adjustment is chosen in such a way as to allow the pixels in the ROM area to be expressed in the dynamic range of the display, the change is applied to the entire image. N may be set as 4 in sub-step 314b. The cluster-based tone-mapping function S_i when N = 4 is shown in Fig. 4. Using Equation (4), each piecewise linear region of the tone-mapping function as shown in Fig. 4 is allocated a particular gain (slope). The cluster centers are then mapped to equidistant intensities in the range of the tone-mapping function in the process of obtaining the tone-mapped image.
Considering Fig. 4, when N > 3, the cluster centers are closer to the areas where the distribution of intensities is greater. As a result, the contrast in these regions will be enhanced, and thus N is preferably set as a value greater than 3 in the first example of sub-step 314b.
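A minimal sketch of this first example is given below, assuming normalized luminances in [0,1] and sorted cluster centers; np.interp evaluates exactly the piecewise linear curve through the knots of Equation (4). Names and interface are assumptions.

    import numpy as np

    def tonemap_equidistant(image, centers):
        """First example of sub-step 314b (sketch): sorted cluster centers are
        used as knots C_1..C_N mapped to the equidistant outputs O_i of
        Equation (4); the curve is then applied to the whole image."""
        centers = np.sort(np.asarray(centers, dtype=float))
        N = len(centers)
        C = np.concatenate(([0.0], centers, [1.0]))                 # knots C_0..C_{N+1}
        O = np.concatenate(([0.0], np.arange(1, N + 1) / N, [1.0]))  # outputs O_0..O_{N+1}
        return np.interp(image, C, O)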
Second example of sub-step 314b

Generally, pixel clusters will be best viewed if each cluster center is mapped to the middle of the LDR range [0,1] and with a slope inversely proportional to its luminance, according to Weber's law of just noticeable difference (JND) [Wyszecki and Stiles 2000]. Since it is not possible to map all cluster centers to the middle of the display range simultaneously, in a second example of sub-step 314b, a desired slope S_i^d is calculated for each cluster center C_i as 0.5/C_i. This creates a linear segment with a gradient equal to the slope S_i^d for each cluster, as shown in Fig. 5.
The linear segments are then joined to produce a monotonically non-decreasing, piecewise linear tone-mapping function. However, if the piecewise linear tone-mapping function is formed by matching each segment of the function to the desired slope S_i^d, the function will likely extend past 1, as shown by the solid line in Fig. 6. Hence, in the second example of sub-step 314b, a weighted normalization procedure based on the cluster populations P_i (i.e. the number of pixels in cluster i) is performed on the linear segments prior to joining them, to derive a piecewise linear tone-mapping function S_i defined according to Equation (5). In Equation (5), B_i is the cluster boundary which falls between clusters C_{i−1} and C_i. The dotted line in Fig. 6 gives an example of this piecewise linear tone-mapping function S_i, which is also referred to as the cluster-based tone-mapping function.
The use of Equation (5) to obtain the piecewise linear tone-mapping function in the second example of sub-step 314b is advantageous as clusters which are heavily populated will not be significantly compressed whereas clusters with low populations will receive greater compression. Furthermore, Equation (5) is not completely dependent on histogram population and accordingly does not suffer from the extreme contrast enhancement which is sometimes present in the result of histogram equalization.
In both the first and second examples of sub-step 314b, a piecewise linear tone-mapping function is used due to its flexibility [Mantiuk et al]. Using a piecewise linear function, the slope in dynamic regions where clusters of intensities exist can be increased, whereas this adjustment can be compensated for by decreasing the slope in low population regions. Alternatively, cubic B-splines can be used to model the tone-mapping function, therefore providing C2 continuity. However, the piecewise linear method is preferable as it can be easily implemented as a GPU shader program, able to easily achieve real-time performance.
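Because the precise weighted normalization of Equation (5) depends on the cluster populations, the sketch below implements one plausible scheme with the properties described above (monotonic, ends at 1, and heavily populated clusters are compressed less); the specific blending rule is an assumption, not the claimed Equation (5).

    import numpy as np

    def tonemap_population_weighted(image, centers, populations):
        """Second example of sub-step 314b (sketch): desired Weber slopes
        S_i^d = 0.5/C_i, joined into a monotonic piecewise linear curve via an
        assumed population-weighted normalization. centers and populations are
        assumed sorted consistently (e.g. from rom_cluster_centers above)."""
        centers = np.asarray(centers, dtype=float)
        pops = np.asarray(populations, dtype=float)
        # Cluster boundaries B_i: midpoints between consecutive centers, padded.
        B = np.concatenate(([0.0], 0.5 * (centers[:-1] + centers[1:]), [1.0]))
        widths = np.diff(B)
        desired = 0.5 / np.maximum(centers, 1e-6)   # desired slope per cluster
        share = pops / pops.sum()
        extent = np.sum(desired * widths)           # range if slopes were unscaled
        # Assumption: high-population clusters keep more of their desired slope.
        blended = desired * (share + (1.0 - share) / extent)
        slopes = blended / np.sum(blended * widths)  # exact normalization to 1
        outputs = np.concatenate(([0.0], np.cumsum(slopes * widths)))
        return np.interp(image, B, outputs)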
4. Calculating the bleaching factor (Step 316)
In the human visual system, the key illumination adaptation (and corresponding contrast optimization) is done by the photoreceptors. The adaptation is known to be accomplished by a mechanism of bleaching in the photopigment.
The cluster-based tone-mapping performed in step 314 corresponds to the bleaching mechanism in the photopigment, whereas the amount by which the bleaching takes place in the human visual system is governed by Equation (6), which is often termed the Naka-Rushton equation. This equation is a function of the unbleached photopigment at a particular instant and follows from the experiments described in [Naka and Rushton 1966]. In Equation (6), M is the value that causes the half-maximal response and σ is a semi-saturation constant [Pattanaik et al. 2000]. The value n is a sensitivity control which is similar to gamma for video, film and typical displays.
R(M) = R_max · M^n / (M^n + σ^n)    (6)
A bleaching factor ζ is calculated in step 316 using Equation (6) in order to blend the photopigment phenomena with the adaptation adjusted image. In step 316, ζ is set as R(M) of Equation (6), and R(M) is calculated according to a first and second example of step 316 as described below.
First example of step 316
In a first example of step 316, R(M) is calculated using the variables obtained using Equations (7) - (9).
The variable M in Equation (6) (denoted as M_ROM in Equation (7)) is calculated according to Equation (7), in which T_ROM denotes the total number of pixels in the ROM area. Note that calculating M_ROM according to Equation (7) is consistent with using the average radiometric value of the user's ROM area. The variable σ is calculated according to Equation (8), which is based on the method described in [Pattanaik et al. 2000] in which findings of [Hunt 1995] were used to aid in modeling the response of the photoreceptors. Equation (8) is derived when Equation (6) is specifically applied to the cones, and A_cone is the fixed adaptation illumination amount, which corresponds directly to the global illumination average value. Hunt's bleaching parameter R_max is calculated according to Equation (9), whereby A_cone is defined in the same way as in Equation (8).

M_ROM = (1 / T_ROM) · Σ_{(i,j)∈ROM} I(i,j)    (7)

σ = 12.9223 · (k^4 · A_cone + 0.171 · (1 − k^4)^2 · A_cone^{1/3}), where k = 1 / (5·A_cone + 1)    (8)

R_max = (2 × 10^6) / (2 × 10^6 + A_cone)    (9)
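A sketch of this first example of step 316 follows; the interface and the choice n = 1 are assumptions of the sketch.

    import numpy as np

    def bleaching_factor(rom_luminance, a_cone, n=1.0):
        """First example of step 316 (sketch): Naka-Rushton response of
        Equation (6), with M_ROM from Equation (7), sigma from Equation (8)
        and Hunt's R_max from Equation (9)."""
        m_rom = rom_luminance.mean()                          # Equation (7)
        k = 1.0 / (5.0 * a_cone + 1.0)                        # Hunt's k
        sigma = 12.9223 * (k ** 4 * a_cone +
                           0.171 * (1.0 - k ** 4) ** 2 * a_cone ** (1.0 / 3.0))  # Eq. (8)
        r_max = 2.0e6 / (2.0e6 + a_cone)                      # Equation (9)
        return r_max * m_rom ** n / (m_rom ** n + sigma ** n)  # Equation (6)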
Second example of step 316
In a second example of step 316, R(M) is calculated using the variables obtained using Equations (7), (10) and (11).

Similar to the first example of step 316, the variable M in Equation (6) (denoted as M_ROM in Equation (7)) is calculated according to Equation (7). However, the variable σ is calculated according to Equation (10), which is also based on the method described in [Pattanaik et al. 2000] in which findings of [Hunt 1995] were used to aid in modeling the response of the photoreceptors, whereas Hunt's bleaching parameter R_max is calculated according to Equation (11), whereby k is the scaling factor necessary to convert normalized pixel values ∈ [0,1] to luminance values in cd/m2 and is equal to the maximum luminance of the scene in cd/m2. In cases where no knowledge of the original illumination of the HDR scene is present, k is determined empirically. Equations (10) and (11) are derived from Equations (8) and (9) by substituting the luminance-scaled adaptation value k/(2·S_g) for A_cone.

σ = 12.9223 · (k'^4 · (k/(2·S_g)) + 0.171 · (1 − k'^4)^2 · (k/(2·S_g))^{1/3}), where k' = 1 / (5·k/(2·S_g) + 1)    (10)

R_max = (4 × 10^6 · S_g) / (4 × 10^6 · S_g + k)    (11)
5. Deriving an illumination scaled image
An illumination scaled image is then derived by performing a weighted sum of the globally scaled image from step 304 and the tone-mapped image from step 314, using the bleaching factor ζ. As shown in Fig. 3, the bleaching factor ζ calculated in step 316 is used to linearly weight the (photopigment) enhanced processing of the image (the output image after performing cluster-based tone-mapping in step 314) with the globally scaled adaptation image (the output image after performing global scaling in step 304). First and second examples of deriving the illumination scaled image are described below.
First example of deriving an illumination scaled image
A first example of performing the illumination scaling is done according to Equation (12), where I_out(u',v') is a given intensity at the spatial location (u',v') on the illumination scaled image when the user's gaze is centered at the point (u,v) on the screen of the display device. ζ is the bleaching factor obtained in step 316, I(u',v') is the HDR luminance of a given pixel at spatial location (u',v'), L(u',v') is the LDR image (the tone-mapped result obtained from the first example of sub-step 314b) at spatial location (u',v'), and I(u',v')·S_g is the globally scaled image (obtained from step 304) at spatial location (u',v'). L(u',v') is defined according to Equation (13). Furthermore, in Equations (12) and (13), S_i is the tone-mapping function obtained in Equation (4), whereas the definitions of C_i and O_i remain the same as those in Equation (4).

I_out(u',v') = (1 − ζ) · (I(u',v') · S_g) + ζ · L(u',v')    (12)

L(u',v') = (I(u',v') − C_i) · S_i + O_i  for C_i ≤ I(u',v') < C_{i+1},  i = 0,1,...,N    (13)
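As a sketch (argument names are assumptions; the tonemapped array corresponds to L(u',v') of Equation (13), e.g. as produced by the np.interp sketches above):

    import numpy as np

    def illumination_scale(hdr_luminance, s_g, tonemapped, zeta):
        """Equation (12): per-pixel blend of the globally scaled image
        I(u',v')*S_g (image A) and the cluster-based tone-mapped image
        L(u',v') (image B) using the bleaching factor zeta."""
        return (1.0 - zeta) * hdr_luminance * s_g + zeta * tonemapped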
Second example of deriving the illumination scaled image
A second example of deriving the illumination scaled image I_out(u',v') is performed according to Equation (14). In Equation (14), D is the distance between the viewpoint and the center of the user's gaze as calculated in Equation (3), and TMO_ROM(u,v,D)(u',v') is the output of the tone-mapping function (as obtained from the second example of sub-step 314b and as shown in Fig. 6) at spatial location (u',v'), based on the ROM area centered at (u,v).

I_out(u',v') = (1 − ζ) · (I(u',v') · S_g) + ζ · TMO_ROM(u,v,D)(u',v')    (14)
6. Converting illumination values in the illumination scaled image to RGB color values.
The illumination values in the illumination scaled image are then converted to RGB color values to obtain an output image of method 300.
In a first example, the conversion of the illumination values in the illumination scaled image to RGB color values is performed by relative scaling according to Equation (15), where r, g and b are the RGB values of the HDR image at spatial location (u',v').

r_out = (I_out(u',v') / I(u',v')) · r,   g_out = (I_out(u',v') / I(u',v')) · g,   b_out = (I_out(u',v') / I(u',v')) · b    (15)
In a second example, the conversion of the illumination values in the illumination scaled image to RGB color values is performed using the method of [Schlick 1994; Mantiuk et al. 2008] according to Equation (15a). In one example, gamma is set as 0.45 in Equation (15a).

r_out = (r / I(u',v'))^gamma · I_out(u',v'),   g_out = (g / I(u',v'))^gamma · I_out(u',v'),   b_out = (b / I(u',v'))^gamma · I_out(u',v')    (15a)
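A sketch covering both colour reconstructions is given below; the ratio-power form used for Equation (15a) is assumed from the cited method of [Schlick 1994], and the interface is illustrative.

    import numpy as np

    def to_rgb(rgb_hdr, hdr_luminance, i_out, gamma=0.45):
        """Colour reconstruction sketch: relative scaling per Equation (15),
        and the Schlick-style ratio-power form assumed for Equation (15a)."""
        ratio = rgb_hdr / hdr_luminance[..., None]        # per-channel colour ratio
        by_scaling = ratio * i_out[..., None]             # Equation (15)
        by_schlick = (ratio ** gamma) * i_out[..., None]  # Equation (15a), assumed form
        return by_scaling, by_schlick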
7. Simulation of Eye Adaptation Latency
Given variations in the direction of the user's gaze, spontaneous changes to the tone-mapping function tend to result in unnatural changes. One likely reason for this is that the human visual system simply does not adapt instantaneously. Hence, it is preferable to simulate the latency of the change in the human visual system to add naturalness and believability to the display system in the embodiments of the present invention.
First example of simulating eye adaptation latency

A first example of simulating eye adaptation latency is performed according to Equation (16). Given each newly computed value v_i(t) (which may be, for example, a cluster center), each parameter p_i(t) at time t is updated using an exponential smoothing technique according to Equation (16), where w controls the exponential delay factor.

p_i(t) = (w · p_i(t − 1) + v_i(t)) / (w + 1)    (16)

Using Equation (16) to simulate the eye adaptation latency, if the user remains focused on an area such that the cluster centers of the area remain unchanged, the parameters will converge to the actual cluster centers. The smaller w is, the faster the convergence will be. For example, setting w = 0 will result in no delay factor at all. Given that the data is being updated at 30 frames per second, experimental results show that w = 20 provides natural and dynamic content and hence is the preferred choice in this first example of simulating eye adaptation latency. Alternatively, w may be set to a different value. Closely approximating the results in [Pattanaik et al. 2000; Haig 1941] warrants increasing w to approximately 200. Though this does in fact produce results which are undoubtedly closer to what is physiologically occurring in the eye, the slow change results in the display being less dramatic. In such cases, users generally did not perceive the change and felt that they were looking at a static image. Hence increasing w to approximately 200 is not preferred.
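Equation (16) is a one-line per-parameter update; as a sketch (names assumed):

    def smooth_parameter(p_prev, v_new, w=20.0):
        """Equation (16): exponentially smoothed update of a tone-mapping
        parameter; w = 0 disables the delay, and w = 20 at 30 fps is the
        preferred setting described in the text."""
        return (w * p_prev + v_new) / (w + 1.0)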
Second example of simulating eye adaptation latency
An example of a method to simulate the time course of eye adaptation to a change in luminance is outlined in [Baker 1949; Haig 1941]. In a second example of simulating eye adaptation latency, a simplified eye adaptation model [Kopf et al. 2007; Ledda et al. 2004; Banterle et al. 2008; Krawczyk et al. 2005] based on the work presented in [Hateren 2006] is adopted to simulate the transition of the system parameters P(t) (such as the cluster centers C_i, the bleaching factor ζ, the cluster-based tone-mapping function S_i, etc.) over the time the user's ROM area changes. The time at which the user's ROM area changes from one ROM area to another is denoted as time t = t_0. P(t) is calculated according to Equation (17), where t is in seconds and τ is in milliseconds. In Equation (17), P(t) is the vector of parameters used for the current tone-mapping and P_ss(ROM_new) are the steady state parameters for the new ROM area the user is focused on. The vector P(t_0) is the set of parameters corresponding to the moment t_0 when the user's ROM area is changed. According to Equation (17), the value of τ changes when the user changes the gaze position from a light region to a dark region, or conversely from a dark region to a light region. This reflects the adaptation in the human visual system in each scenario.

P(t) = P_ss(ROM_new) + (P(t_0) − P_ss(ROM_new)) · e^{−(t − t_0)/τ},
where τ = 200 ms if M_ROM(new) > M_ROM(t_0), and τ = 400 ms if M_ROM(new) < M_ROM(t_0)    (17)
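As a sketch of Equation (17) (argument layout assumed):

    import numpy as np

    def adapt_parameters(t, t0, p_t0, p_ss, to_brighter):
        """Equation (17): exponential transition of the parameter vector P(t)
        after the ROM area changes at t0. tau is 200 ms when moving to a
        brighter ROM area and 400 ms when moving to a darker one."""
        tau = 0.2 if to_brighter else 0.4          # tau in seconds
        return p_ss + (p_t0 - p_ss) * np.exp(-(t - t0) / tau)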
8. Real time Implementation of method 300

It is difficult to implement the K-means clustering algorithm directly in real time. However, this is not necessary, as parameters such as the cluster centers, tone-mapping function and bleaching factor (in other words, the parameters obtained from steps 312 - 316) may be pre-computed in method 300. First and second examples of the real time implementation of method 300 are described below.
First example of real time implementation of method 300
The window for the ROM area used for a typical image has a radius of 50 pixels. In a first example, the input image is divided into 50 x 50 window blocks, where each of the 50 x 50 window blocks is referred to as a ROM window block and the center of each ROM window block is referred to as a grid position. In each ROM window block, a set of N cluster centers on the image are pre-set as pixels separated at regular pixel intervals both horizontally and vertically. In this first example, N is set as four and the pixel interval ranged from 10 to 50 pixels. As the interval decreases, the consistency between the intervals increases, with the consequence of increasing the pre-computation time. The bleaching factor ζ is also pre-computed for each ROM window block. In the first example, the display program is written in OpenGL, employing a vertex shader. Given the HDR photoquantities in RGB space, each spatial location in the HDR image is mapped to a vertex2D in the OpenGL program. The color of each vertex is then set using color3f(r,g,b) to the raw photometric value of the HDR image. The mapping presented in Equations (12), (13) and (15) is implemented as a vertex shader which is able to set the four cluster centers in addition to a ζ parameter.
In the first example, the eye tracking system sends updated gaze positions to the OpenGL program over UDP at a rate of 60Hz. Since the cluster centers and the bleaching factor are pre-computed, steps 312 - 316 of method 300 are replaced by the following steps. The OpenGL (CPU) program locates the four sets of N cluster centers belonging to the ROM window blocks with grid positions (the centers of the 50 x 50 ROM window blocks) which the user's current gaze position is closest to. It then linearly interpolates these four sets of N preset cluster centers, given the distance from the user's current gaze position to each of the four grid positions, to obtain N interpolated preset cluster centers. The bleaching factor ζ is interpolated in the same manner. A delay factor according to Equation (16) or (17) is then applied to the interpolated preset cluster centers and the interpolated bleaching factor ζ to obtain updated interpolated cluster centers and an updated interpolated bleaching factor ζ. The updated interpolated cluster centers are then set as the elements of a uniform float4 vector and are input into the OpenGL vertex shader. The updated interpolated bleaching factor ζ is input into the OpenGL vertex shader as well. Each vertex then computes its new display value given the N updated interpolated preset cluster centers and the updated interpolated ζ parameter. This is performed by deriving a tone-mapping function from the N updated interpolated preset cluster centers, applying the derived tone-mapping function to the HDR image to obtain a tone-mapped HDR image and deriving an output image (i.e. the new display value of each vertex) using the tone-mapped HDR image according to Equations (12) and (13). Equation (12) blends the tone-mapping based on the clustering algorithm with a linear tone curve in a per-pixel manner using the ζ parameter.
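A CPU-side sketch of this lookup-and-interpolate step is given below; the inverse-distance weighting and the dictionary layout of the pre-computed data are assumptions of this sketch, not the claimed implementation.

    import numpy as np

    def interpolate_parameters(gaze_uv, grid_positions, grid_centers, grid_zeta):
        """First real time example (sketch): pick the four grid positions
        nearest the gaze and interpolate their pre-computed cluster centers
        and bleaching factors before Equation (16) smoothing."""
        u, v = gaze_uv
        dist = {g: np.hypot(u - g[0], v - g[1]) for g in grid_positions}
        nearest = sorted(grid_positions, key=lambda g: dist[g])[:4]
        w = np.array([1.0 / (dist[g] + 1e-6) for g in nearest])
        w /= w.sum()
        centers = sum(wi * np.asarray(grid_centers[g]) for wi, g in zip(w, nearest))
        zeta = sum(wi * grid_zeta[g] for wi, g in zip(w, nearest))
        # Both results are smoothed with Equation (16) and sent to the shader.
        return centers, zeta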
Second example of real time implementation of Method 300
In a second example, the image is discretized into a grid of 50 x 50 pixel segments, where each of the 50 x 50 grid segments is referred to as a single ROM block or a pre-computed ROM area. This is the smallest ROM area which may be represented. The parameters for the tone-mapping function (i.e. the N cluster centers) are pre-computed for each of these ROM blocks. This is performed by clustering each of the ROM blocks into a plurality of clusters and deriving the centers of the clusters. The parameters of all possible contiguous 2 x 2, 3 x 3, ..., n x n ROM blocks across the entire image are also pre-computed, whereby n is the larger dimension of the width or height divided by the discretization size. When the size of a block begins to exceed the smaller dimension of the width or height divided by the discretization size, the "missing rows/columns" of the block are discarded and the pre-computation of the parameters is done only on the available image information. The bleaching factor ζ is also pre-computed.
In the second example, the OpenGL program employs a pixel shader. Given the HDR photoquantities in RGB space, each spatial location in the HDR image is mapped to a glVertex2D in the OpenGL program. The color of each vertex is then set to the raw photoquantimetric value of the HDR image.
In the second example, the eye tracking system sends updated gaze positions to the program over User Datagram Protocol (UDP) at a rate of 60Hz. Since the cluster centers and the bleaching factor are pre-computed, steps 312 - 316 of method 300 are replaced by the following steps. The program in the second example finds the pre-computed ROM area which is most applicable to the user's gaze and distance. This most applicable pre-computed ROM area is the pre-computed ROM area which overlaps the most with the estimated ROM area of the user. The delay factor according to Equations (16) and (17) is then applied to the corresponding pre-computed tone-mapping parameters of the best fit ROM area (i.e. the pre-computed N cluster centers for the best fit ROM area) and the pre-computed bleaching factor ζ to obtain updated pre-computed tone-mapping parameters and an updated bleaching factor ζ . These updated pre-computed tone-mapping parameters and the updated bleaching factor ζ are then sent to the Graphics Processing Unit (GPU). Finally, the GPU shader program in the embodiments of the present invention computes the tone-mapping function based on the updated pre-computed tone-mapping parameters and applies it to each pixel of the HDR image in parallel to obtain a tone-mapped image. An output image is then derived using the tone-mapped image and the updated bleaching factor ζ .
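A sketch of the best-fit selection follows; the rectangle representation of ROM areas and the (rect, params) pairing are assumptions of this sketch.

    def best_fit_rom_block(rom_rect, precomputed):
        """Second real time example (sketch): select the pre-computed ROM area
        that overlaps the user's estimated ROM area the most. Rectangles are
        (x0, y0, x1, y1) tuples; precomputed is a list of (rect, params)."""
        def overlap(a, b):
            w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
            h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
            return w * h
        # params holds the pre-computed N cluster centers and bleaching factor.
        return max(precomputed, key=lambda item: overlap(rom_rect, item[0]))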
9. Range reduced image results
The system in the embodiments of the present invention adaptively tunes the dynamic range resources available in the display in correspondence to where the user is looking. Some LDR snapshots of the HDR viewing system in the embodiments of the present invention are presented with the user looking at differing spatial locations. These images are compared to the static results of two state-of-the-art tone-mapping algorithms, one global method [Reinhard and Devlin 2005] and one local method [Fattal et al. 2002].
Figs. 7 - 13 show results of method 300 (using the first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) as compared to the local and global mapping algorithms of [Fattal et al. 2002] and [Reinhard and Devlin 2005] respectively. Figs. 7(a) and (b) - 13(a) and (b) show the results of method 300 when a user is looking at different portions of the image. Figs. 7(c) - 13(c) show the results when the global tone-mapping function (Reinhard) is applied, whereas Figs. 7(d) - 13(d) show the results when the local tone-mapping function (Fattal) is applied.
Results from method 300 (using the second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) are shown in Fig. 14. Fig. 14(a) shows the image obtained from method 300 when the user is looking at the sign below the window and Fig. 14(b) shows the image obtained from method 300 when the user is looking at the window. Fig. 14(c) shows the image obtained from applying the local tone-mapping algorithm [Fattal et al. 2002] on the input image whereas Fig. 14(d) shows the image obtained from applying the global tone-mapping [Reinhard and Devlin 2005] on the input image.
As shown in Figs. 7 - 14, in each of the scenes shown, there is adequate contrast in the regions where the user's gaze is focused. The problems which occur when attempting to display HDR content on an LDR display (for example extreme brightness or darkness) are largely confined to regions where the observer's gaze is not focused. In contrast, in the case of the existing HDR to LDR tone-mapping algorithms (for example the local and global mapping algorithms of [Fattal et al. 2002] and [Reinhard and Devlin 2005] respectively), unwanted effects are typically distributed throughout the entire image and, no matter where the user's gaze is focused, the image is unsatisfying and unrealistic.
Figs. 15(a) - (d) illustrate the output image from method 300 (using the second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the ROM area encompasses the entire image, 512 x 512 pixels, 128 x 128 pixels, and 32 x 32 pixels respectively. These are the results in the case where the user walks toward the screen. As shown in Fig. 15, the number of pixels in the ROM area decreases (from Fig. 15(a) to 15(d)) as the user gets closer. Consequently, the contrast of the specific region the user is looking at increases as the user's distance decreases.
Figs. 16 and 17 also show the contrast and details in different regions of the output images obtained from method 300 (using respectively the first and second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user is looking at different regions of each of the images.
10. User Study Results
User study results using method 300 (with first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency)
Testing the system in the embodiments of the present invention is somewhat problematic, as there is no similar existing system. Although it may be possible to test the embodiments of the present invention against the HDR display made by Brightside/Dolby (in such a case, users would likely find the Brightside/Dolby HDR display more natural, so that display would probably fare much better in such a study), the point of the embodiments of the present invention is to adapt widespread current commercial technology, in other words LDR display devices, to display HDR content in a natural way. For this reason, a usability study was performed using a single LDR image with only scaling and gamma correction applied (similar to what is obtained from a single LDR image at a given exposure time using a typical camera) and a single tone-mapped HDR image produced by a state-of-the-art tone-mapping algorithm. Such a study, although not ideal, can give some indication of the benefit of method 300.
The usability study was conducted with 23 subjects drawn from diverse backgrounds such as media artists, photo hobbyists, media designers, visual communication designers, and market researchers. The subjects also included research scientists and engineers in HCI, computer graphics, virtual reality, and image processing. About one quarter of the subjects considered themselves proficient in photography or photography enthusiasts.
Subjects were shown the 3 images listed below and were asked questions regarding their preferences and user experiences.
1. a regular LDR image
2. a tone-mapped image (based on [Reinhard and Devlin 2005])
3. the image output obtained from method 300
The 3 images were labeled Image 1, Image 2 and Image 3 respectively. The image used was a photo taken at a popular tourist location in Singapore known as The Esplanade. Multiple exposures were taken at known exposure times in RAW mode and used to create HDR images. The best LDR shot was chosen to represent Image 1. To generate Image 2, the program qtpfsgui [Qtpfsgui 2007] (based on pfstmo [Mantiuk and Krawczyk 2007]), which implements several tone-mapping algorithms including [Reinhard and Devlin 2005], was used to produce the tone-mapped version of the image. The parameters were tuned by an image processing engineer to achieve the best possible result using this software.
The test results revealed that subjects generally based their preference on image features such as brightness, contrast, vividness of the colors, realism and the amount of detail. When the 3 images were compared, 63% of the subjects felt that looking at Image 3 gave them the greatest sense of realism (i.e. the strongest virtual presence). More information is shown in the pie chart in Fig. 18, which shows the percentage of subjects choosing Image 1, 2 or 3 as the image giving them the greatest sense of realism. Some subjects who had prior experience with available HDR tone-mapping systems further commented that tone-mapped images are generally unnatural and do not look realistic.
Subjects preferred the interactive and dynamic nature of Image 3 (when asked on a Likert scale of 1 - 5, 78% agreed, of which 48% strongly agreed on this point). Subjects also liked Image 3 because it can enhance the area of interest. It was further recommended that more depth information be included (for example, by using a 3D image instead) to improve the realism of the image.
One subject claimed "this on-the-fly generation of burn and dodge effect also guides a 3rd party observer as to where to view".
To test whether method 300 actually gives the perception of expanding the dynamic range of the display, subjects' opinions were sought as to whether the bright areas had more detail. The result was that 64% of subjects agreed (of which 50% strongly agreed) that they perceived more detail in the bright areas, whereas 43% felt that even the dark areas were perceived to have more detail. This affirms the modeling of the human eye in the embodiments of the present invention, in that the cones' sensitivity decreases in low intensity regions. Given the outcome of the user study with this set of subjects, method 300 did produce an output image which gives the perception of increasing the dynamic range of the display.
The usability study affirmed that method 300 achieved a greater sense of realism as compared to existing HDR to LDR tone-mapping techniques. The study also showed that the dynamic range of the display was increased perceptually by method 300, in particular in the brighter regions of the HDR image.

User study results using method 300 (with second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency)
Two user studies were conducted using method 300. The first study compares the results obtained from method 300 against the results of static tone-mapping algorithms on 10 reference images. The second study involves a real-world reference scene constructed by the inventors.
First user study using reference images:
The first study involved 20 gender-balanced and unbiased subjects between the ages of 16 and 35. Subjects were tested to ensure that they had normal color vision and were shown 10 different HDR images processed by the following three algorithms:
[A] a linear mapping with gamma correction of 2.2 applied thereafter,
[B] a global tone-mapping algorithm, and
[C] method 300 (the gaze-adaptive HDR display method according to the embodiments of the present invention).
The HDR images were selected from the HDR DVD provided by the text [Reinhard et al. 2005]. For the tone-mapping algorithm, [Reinhard et al. 2002], which (as stated by [Cadik et al. 2006]) preserves the most details and yet maintains photorealism, was chosen. The independent variables in this study are the 3 algorithms (first level) and the 10 images (second level). The dependent variable is the subject's preferential response.
The experiment was set up in a dark room (4 lux ambient luminance) with light provided only by the Barco Galaxy 9 HC+ projector used to display the images. The projector has a maximum output of 9,000 ANSI lumens, a contrast ratio of 1700:1, and a resolution of 1280 x 1024. The images were projected on a large display (2m x 1.5m) and a chair was set up 1.5m from the screen for immersive viewing. A Seeing Machines FaceLab Version 4 was used for eye tracking. To generate the images in real time, an HP xw9400 workstation with a dual core AMD 3.20 GHz Opteron, 16 GB RAM and an NVidia Quadro FX 4600 graphics card was used. Subjects were given the impression that all three algorithms were developed by the inventors and were asked to rank the algorithms based on their preferences and perceived realism for each of the resulting 10 images.
The order of images was randomized and subjects were allowed to take as much time as needed to view each image, and to review any images if they desired. The results revealed that there was a significant difference between the three algorithms, as computed by a two-way repeated measures analysis of variance (ANOVA), F(2,38) = 31.021, p < 0.001.
Given that ANOVA is like a t-test that can be applied to multiple data sets simultaneously, our method of analysis resulted in 10 F statistics; only the minimum F statistic is presented in what follows. As shown in Fig. 19, subjects significantly preferred Algorithm C for 7 of the 10 images, min F(2,38) = 7.389, p < 0.01, and Algorithm B for 2 of the 10 images, min F(2,38) = 6.589, p < 0.01. There was no significant difference for image 10 at p < 0.05.
Fig. 20 illustrates the subjects' perceived realism of the results for the same 10 reference images. Significant perceived realism was achieved by Algorithm C (method 300) for 4 of the 10 images, min F(2,38) = 3.633, p < 0.05, and by Algorithm B for 2 of the 10 images, min F(2,38) = 6.617, p < 0.01. The differences for images 3, 7, 9 and 10 were found to be insignificant at p < 0.05, although the difference for image 3 was significant at p < 0.06.
The DR values of the 10 HDR test images used in this first user study are shown in Table 1. This first user study revealed that if the dynamic range (DR) of the image was relatively low (DR less than 3, computed as log10(max/min) as shown in Table 1), users preferred the result of global tone-mapping. However, as the dynamic range of the image increased, users preferred the results of method 300. Artifacts produced by the global tone-mapping were rather obvious on the large display for these images. A Pearson correlation analysis indicates a large correlation [Cohen 1988] between the respondents' preferences and perceived realism for the algorithms versus the DR values, as can be seen in Table 2, which shows the correlation coefficient (r) between the respondents' preferences and the perceived realism versus image DR values. A short sketch of the DR computation follows.
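For reference, the DR measure used in Table 1 can be computed as in this short sketch; the exclusion of zero-valued pixels is an assumption made here to keep the logarithm finite, and the input is assumed to contain at least one positive luminance.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Dynamic range of an image as reported in Table 1: DR = log10(max/min),
// computed over the (positive) luminances of the HDR image.
double dynamicRange(const std::vector<double>& luminance) {
    double lo = std::numeric_limits<double>::infinity(), hi = 0.0;
    for (double v : luminance) {
        if (v <= 0.0) continue;      // skip zeros so log10 stays finite
        lo = std::min(lo, v);
        hi = std::max(hi, v);
    }
    return std::log10(hi / lo);
}
```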
On a Likert scale of 1 to 5, the majority of subjects (75%, μ = 3.9, σ = 0.91) significantly agreed that the bright areas for Algorithm C had more detail (t(19) = 4.413, p < 0.001), while 40% (μ = 3.25, σ = 1.16; not significant at p < 0.05) of the subjects felt that the dark areas in Algorithm C had more detail. This indicates that the perceived dynamic range of the display was expanded, particularly in the bright areas, when method 300 was used.
Image                DR      Image                  DR
(1) Alhambra7        3.2     (6) GGMusicians        4.0
(2) BoyScoutFalls    7.2     (7) River_BoatB        3.8
(3) BridgeStudio2    5.5     (8) SanRafaelCreek     2.8
(4) ChurchWindow1    4.3     (9) SpanishSunset1     3.1
(5) ConservatoryE    2.8     (10) TmJ               3.1

Table 1
[Table 2, which lists the correlation coefficients (r) described above, appears only as an image in the source document and is not reproduced here.]

Second user study using real-world scene:
The second user study involved 10 gender-balanced and unbiased subjects between the ages of 16 and 45 with normal color vision. A real-world scene was set up to enable subjects to compare the results of the algorithms with a reference scene, as shown in Fig. 21, which shows a real-world HDR reference (DR = 6.4) in two exposures.
An HDR image of the reference scene was taken using a multi-exposure method similar to [Debevec and Malik 1997] and processed by the same three algorithms, A, B and C, used above. As in [Akyüz et al. 2007], the scene luminance was measured using an 18% grey card and a Photo Research Colorimeter PR670. The parameters of the three algorithms were tuned using this data to set the global luminance appropriately (the exposure time for Algorithm A, the key value for Algorithm B, and Sg for Algorithm C).
Without loss of generality, this second user study required the subjects to focus on three regions of the scene to evaluate which of the three algorithms best preserved the details and realism. The three regions are: (i) the brightly illuminated poster in the top section, (ii) the moderately illuminated bunch of flowers in the middle section, and (iii) the dimly illuminated objects under the table in the bottom section of the scene. In each region, the local contrast A is computed using Equation (18), where m and n are the width and height of the image region as calculated from Equation (2) (for this study, m = n = 300).
A = (1/(m·n)) Σ_{i=1..m} Σ_{j=1..n} [MAX(i,j) − MIN(i,j)] / [MAX(i,j) + MIN(i,j)]

MAX(i,j) = max_{k,l ∈ [−1,1]} I(i+k, j+l)        (18)

MIN(i,j) = min_{k,l ∈ [−1,1]} I(i+k, j+l)

The local contrast results for the three regions are shown in Table 3 (a sketch of this computation follows Table 3). Note that for Algorithm C, the computation was done when the system was at a steady state and the user's gaze was focused at the center of the region. Table 3 indicates that the gaze-adaptive system in the embodiments of the present invention was best able to preserve the contrast of the three regions considered.
                     poster    flower    under table
Algorithm A          0.0450    0.0830    0.0635
Algorithm B          0.0479    0.0571    0.0644
Algorithm C          0.1092    0.0955    0.0678
Original HDR image   0.1513    0.1066    0.0851

Table 3
Subjects were asked to provide a score between 0 and 10 for details and realism for each of the algorithms in each region. The results are given in Table 4, which shows the average user scores and significance (p) for each algorithm in terms of details and realism.
                 Details                    Realism
Region           A      B      C      p    A      B      C      p
poster           3.1    5.2    7.7    0.001    3.4    5.5    7.9    0.001
flower           6.2    6.1    6.2    ns       6.1    4.8    7.5    0.01
under table      4.1    6.0    7.9    0.001    5.3    6.4    6.8    ns

Table 4
When compared to the real scene, Algorithm C scored highest in terms of details and realism for all regions. Using a two-way repeated measures ANOVA, significant differences were found for the details of the poster, F(2,16) = 34.174, p < 0.001, the realism of the poster, F(2,16) = 36.118, p < 0.001, the realism of the flowers, F(2,16) = 8.902, p < 0.001, and the details under the table, F(2,16) = 35.548, p < 0.001. It was observed that all of the algorithms scored well in terms of details for the moderately lit middle section.
The user studies confirmed that both the realism and the details of the scene were best maintained by method 300 and hence the users found the resulting display of the HDR images more natural. These user studies also showed that the dynamic range of the display is perceptually increased, in particular in the bright regions of the HDR images.
The first and second user studies were conducted in a dark room to alleviate most ambient light issues. However, ambient light variations may be incorporated into the system in the embodiments of the present invention so that the system can be used in different environments. In [Mantiuk et al. 2008], ambient light when displaying images was considered in detail. Implementing the work in [Mantiuk et al. 2008] may entail measuring the ambient light before displaying and modifying the final output to the user. The work of [Mantiuk et al. 2008] also considers some aspects of the human visual system and may be incorporated into the embodiments of the present invention.
Even though the tone-mapping function used in method 300 is specifically designed to model a human eye, other tone-mapping operators may be used in method 300 as well. In particular, Fig. 22 shows the results when the global tone-mapping operator presented in [Reinhard et al. 2002] is used in place of the cluster-based tone-mapping operator. Fig. 22(a) shows the original input HDR image, while Figs. 22(b) and (c) show the output image of method 300, with the cluster-based tone-mapping function replaced by the global tone-mapping operator, when the user is looking at the center of the image and at the plant in the right portion of the image respectively. In this case, the parameters of the global tone-mapping operator are calculated based on the ROM area alone and not the entire image. The global tone-mapping operator is then applied to the entire image. A sketch of this ROM-parameterized global operator follows.
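To make the substitution concrete, here is a hedged C++ sketch of the photographic operator of [Reinhard et al. 2002] with its parameters estimated from the ROM area only and then applied to every pixel of the image. The function name, the default key value a = 0.18, and the choice of the ROM-area maximum as the white point are illustrative assumptions rather than details taken from the original.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Global operator of [Reinhard et al. 2002], parameterized on the ROM
// area (romLum) and applied to the full image (imgLum). Both inputs are
// assumed non-empty luminance buffers.
std::vector<float> tonemapReinhardOnROM(const std::vector<float>& imgLum,
                                        const std::vector<float>& romLum,
                                        float a = 0.18f) {
    // Log-average luminance of the ROM area (eps avoids log(0)).
    const float eps = 1e-6f;
    double logSum = 0.0;
    for (float v : romLum) logSum += std::log(eps + v);
    float Lavg = std::exp(static_cast<float>(logSum / romLum.size()));

    // White point taken from the ROM area as well (an assumed choice).
    float Lwhite = 0.f;
    for (float v : romLum) Lwhite = std::max(Lwhite, v);
    float Lw2 = (a / Lavg) * Lwhite;
    Lw2 *= Lw2;

    std::vector<float> out(imgLum.size());
    for (size_t i = 0; i < imgLum.size(); ++i) {
        float L = (a / Lavg) * imgLum[i];          // scaled luminance
        out[i] = L * (1.f + L / Lw2) / (1.f + L);  // Reinhard '02 curve
    }
    return out;
}
```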
In the embodiments of the present invention, a tone-mapping function is derived that is based on a model of the human visual system and changes in real time, approximating the adaptation of this system. This is advantageous because of the following reasons.
The embodiments of the present invention can amplify the adaptive mechanisms of the human eye to compensate for the restrictions imposed by the display and also to enhance the viewing experience for the user.
Furthermore, the embodiments of the present invention can effectively offload some of the range compression and compensation that is done by the human visual system onto the display system thus perceptually increasing the dynamic range. This, coupled with transitional latency also taken from the adaptation process of the human visual system, allows the creation of a display capable of dynamically showing HDR content in a manner which is natural to the user.
In the embodiments of the present invention, the adaptive display system is able to display HDR images on an LDR device based on the user's gaze and distance (using eye tracking technology). This is advantageous because of the following reasons.
Essentially, this allows the content of the display to be adapted according to where the user is looking, so that a particular area is presented in a natural way, but with appropriate contrast to represent the details in that area. Hence, the inherent resource problem is considerably simplified. The system in the embodiments of the present invention can ease the amount of compression needed by reducing the spatial area on which the tone-mapping function must operate. The display resources which are being optimized, such as contrast, intensity, etc., are largely only considered at the point where the user is looking. In regions where the user's gaze is not focused, resources are not critical and thus may be partially or fully discarded. For example, if a region is saturated or too dark, but the user is not looking in that area, there is no reason to apply resources to that area.
In the embodiments of the present invention, the latency in the response of the human eye to changes in viewing a scene is considered, and a latency factor is included in the change of the parameters when the user's viewpoint changes. This is advantageous because rapid discrete changes which are not natural to the user can be avoided. Instead, in the embodiments of the present invention, the device displaying the scene makes rapid but gradual adjustments, continually adapting to what the user is viewing, hence making the adaptive display look natural. A stand-in sketch of such a gradual adjustment is given below.
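The actual adaptation dynamics are given by Equations (16) and (17) in the original, which are not reproduced here; the following exponential-smoothing sketch is only a stand-in illustrating the idea of a per-frame gradual update, with the time constant tau as an assumed parameter.

```cpp
#include <cmath>

// Stand-in for the delay factor of Equations (16)/(17): each tone-mapping
// parameter (cluster center or bleaching factor) moves only part of the
// way toward its new target every frame. 'dt' is the frame time and 'tau'
// an assumed adaptation time constant, both in seconds.
float smoothTowards(float current, float target, float dt, float tau) {
    float alpha = 1.0f - std::exp(-dt / tau);  // fraction covered this frame
    return current + alpha * (target - current);
}
```

Each of the N cluster centers and the bleaching factor ζ would be updated this way once per frame before being handed to the shader, so that transitions between gaze positions remain smooth.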
The embodiments of the present invention are implemented in real time by tracking the gaze of a user viewing an HDR image in real time and by computing an appropriate tone-mapping function for the HDR image at real-time rates, using both pre-computations for the particular image and real time GPU-based rendering. In the second example of implementing method 300 in real time, the pre-computation for the tone-mapping function is done on the ROM-subdivided images. After determining the pre-computed ROM areas, the latency computation, etc., may be used without modification to achieve satisfying results. This real time implementation of method 300 is achieved by using a shader program on the GPU to perform all necessary real time operations and is advantageous because it improves the computational time of the locally adaptive operator in the embodiments of the present invention. Since the real time approach taken in the embodiments of the present invention feeds a set of parameters to a shader program, and a latency factor is included in the change of the parameters when the user's viewpoint changes, it is also possible to continually smooth the transition from one contrast enhancement to another by creating a gradual delay in each of the parameters. Further advantages of the embodiments of the present invention are as follows.
As opposed to creating a single static image, the embodiments of the present invention create a dynamic display system which adapts interactively to the user's view by creating different LDR images. According to the studies done (one comparing the display of an HDR image to the actual HDR scene), the embodiments of the present invention were successful in creating the illusion of viewing an HDR scene when in reality the image is displayed on a LDR display.
The embodiments of the present invention can enhance the effectiveness of other global tone-mapping operators. Any global operator may be applied and its effectiveness can be improved using the display system according to the embodiments of the present invention. Global tone-mapping operators and some local tone-mapping operators can hence benefit from this system. However, even though most tone-mapping operators can be used with this system, it may be difficult to apply a small percentage of local tone-mapping operators.
Aside from being effective as an HDR display device, the interactive gaze-aware component in the embodiments of the present invention may be extended to virtual and telepresence applications or any other application which benefits from adding a dynamic element to a static scene. Since telepresence is usually considered as not only "the feeling of being there" but also "the total response to being in a place and being able to interact with the environment" [Riva et al. 2003], the system in the embodiments of the present invention can be effectively used for such an application. Because the embodiments of the present invention simulate the inherent interaction of the human eye in response to light, they will therefore be able to provide a "better feeling of being there" for both tele- and virtual presence.
The embodiments of the present invention may also be used for gaze-selected active refocusing of images. In works such as [Moreno-Noguer et al. 2007], an image may be retroactively refocused (or appear to have its focal plane changed). Intentional blur is commonly used by photographers to draw attention to a subject. Using an active refocusing technique, it would be possible to show a refocused scene with an interactive component. Thus, wherever the user's gaze was focused, content in the same focal plane would be in focus, while objects closer or farther away would be appropriately out of focus. The embodiments of the present invention can also help to represent the properties of the human eye more appropriately.
The following compares the embodiments of the present invention described above with the prior art.
• Local tone-mapping: Can produce high contrast images, but artifacts are commonly created and the resulting images may look unnatural.
• Global tone-mapping: May operate very quickly, but suffers from obvious contrast problems.
• HDR display devices: Appropriate for showing HDR images, but not common as a display medium.
• Embodiments of the present invention: Interactively displays reasonable LDR images. Although the method according to the embodiments of the present invention is tuned to only a single user's view, it is a truly effective method of extracting dynamic content from a static image.

REFERENCES
1. AKYUZ, A. O., FLEMING, R., RIECKE, B. E., REINHARD, E., AND BULTHOFF, H. H. 2007. Do HDR displays support LDR content?: a psychophysical evaluation. ACM Trans. Graph. 26, 3, 38.
2. ASHIKHMIN, M. 2002. A tone mapping algorithm for high contrast images. Rendering Techniques, 145-156.
3. BAKER, H. D. 1949. The course of foveal light adaptation measured by the threshold intensity increment. Journal of Optometry 39, 172-179.
4. BANTERLE, F., DEBATTISTA, K., LEDDA, P., AND CHALMERS, A. 2008. A GPU-friendly method for high dynamic range texture compression using inverse tone mapping. Proc. Graph. Int. '08, 41-48.
5. BIMBER, O., AND IWAI, D. 2008. Superimposing dynamic range. ACM Trans. Graph. 27, 5, 1-8.
6. BOYLE, W., AND SMITH, G. 1970. Charge coupled semiconductor devices. Bell System Technical Journal 49 (April), 587-593.
7. CADIK, M., WIMMER, M., NEUMANN, L., AND ARTUSI, A. 2006. Image attributes and quality for evaluation of tone mapping operators. Proc. Pac. Graph., 35-44.
8. CALKINS, D., TSUKAMOTO, Y., AND STERLING, P. 1998. Microcircuitry and mosaic of a blue/yellow ganglion cell in the primate retina. Journal of Neuroscience 18, 3373-3385.
9. CLARKE, R., ZHANG, H., AND GAMLIN, P. 2003. Characteristics of the pupillary light reflex in the alert rhesus monkey. Journal of Neurophysiology 89, 3179-3189.
10. COHEN, J. 1988. Statistical Power Analysis for the Behavioral Sciences, second ed. Psychology Press.
11. DEBEVEC, P. E., AND MALIK, J. 1997. Recovering high dynamic range radiance maps from photographs. In ACM SIGGRAPH '97, 369-378.
12. DOLBY, 2006. http://www.dolby.com/promo/hdr/exp_flash.html.
13. DOWLING, J. 1987. The Retina: An Approachable Part of The Brain. Harvard University Press.
14. DRAGO, F., MYSZKOWSKI, K., ANNEN, T., AND CHIBA, N. 2003. Adaptive logarithmic mapping for displaying high contrast scenes. Computer Graphics Forum 22, 419-426.
15. DURAND, F., AND DORSEY, J. 2002. Fast bilateral filtering for the display of high-dynamic-range images. ACM Transactions on Graphics 21, 3, 257-266.
16. FACELAB, 2008. http://www.seeingmachines.com/facelab.htm.
17. FARBMAN, Z., FATTAL, R., LISCHINSKI, D., AND SZELISKI, R. 2008. Edge-preserving decompositions for multi-scale tone and detail manipulation. ACM SIGGRAPH '08, 1-10.
18. FATTAL, R., LISCHINSKI, D., AND WERMAN, M. 2002. Gradient domain high dynamic range compression. ACM Trans. Graph. 21, 3, 249-256.
19. FERWERDA, J. A., PATTANAIK, S., SHIRLEY, P., AND GREENBERG, D. P. 1996. A model of visual adaptation for realistic image synthesis. ACM SIGGRAPH, 249-258.
20. HAIG, C. 1941. The course of rod dark adaptation as influenced by the intensity and duration of pre-adaptation to light. Journal of General Physiology 24, 735-751.
21. HATEREN, J. H. V. 2006. Encoding of high dynamic range video with a model of human cones. ACM Trans. Graph. 25, 4, 1380-1399.
22. HUNT, R. W. G. 1995. The Reproduction of Color. Fountain Press, England.
23. JAYANT, N., JOHNSTON, J., AND SAFRANEK, R. 1993. Signal compression based on models of human perception. Proceedings of the IEEE, 1385-1422.
24. KANG, S. B., UYTTENDAELE, M., WINDER, S., AND SZELISKI, R. 2003. High dynamic range video. In ACM SIGGRAPH 2003 Papers, ACM, New York, NY, USA, 319-325.
25. KANUNGO, T., MOUNT, D. M., NETANYAHU, N., PIATKO, C., SILVERMAN, R., AND WU, A. Y. 2002. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans. Pattern Analysis and Mach. Intel. 24, 881-892.
26. KANUNGO, T., MOUNT, D. M., NETANYAHU, N., PIATKO, C., SILVERMAN, R., AND WU, A. Y. 2004. A local search approximation algorithm for k-means clustering. Computational Geometry: Theory and Applications 28, 89-112.
27. KOPF, J., UYTTENDAELE, M., DEUSSEN, O., AND COHEN, M. F. 2007. Capturing and viewing gigapixel images. ACM Trans. Graph. 26, 3, 93.
28. KRAWCZYK, G., MYSZKOWSKI, K., AND SEIDEL, H.-P. 2005. Perceptual effects in real-time tone mapping. In ACM SCCG '05, 195-202.
29. LARSON, G. W., RUSHMEIER, H., AND PIATKO, C. 1997. A visibility matching tone reproduction operator for high dynamic range scenes. IEEE Transactions on Visualization and Computer Graphics 3, 4, 291-306.
30. LARSON, G. W., RUSHMEIER, H., AND PIATKO, C. 1994. A contrast-based scale factor for luminance display. Graphics Gems IV, 415-421.
31. LEDDA, P., SANTOS, L. P., AND CHALMERS, A. 2004. A local model of eye adaptation for high dynamic range images. In ACM AFRIGRAPH '04, 151-160.
32. LI, Y., SHARAN, L., AND ADELSON, E. H. 2005. Compressing and companding high dynamic range images with subband architectures. ACM SIGGRAPH 24, 3, 836-844.
33. LISCHINSKI, D., FARBMAN, Z., UYTTENDAELE, M., AND SZELISKI, R. 2006. Interactive local adjustment of tonal values. ACM SIGGRAPH, 646-653.
34. MANN, S. 2001. Intelligent Image Processing. John Wiley and Sons. ISBN: 0-471-40637-6.
35. MANTIUK, R., DALY, S., AND KEROFSKY, L. 2008. Display adaptive tone mapping. ACM Trans. Graph. 27, 3, 1-10.
36. MANTIUK, R., AND KRAWCZYK, G., 2007. http://sourceforge.net/projects/pfstools/.
37. MANTIUK, R., MYSZKOWSKI, K., AND SEIDEL, H. P. 2005. A perceptual framework for contrast processing of high dynamic range images. Proceedings of the Second Symposium on Applied Perception in Graphics and Visualization, 87-94.
38. MORENO-NOGUER, F., BELHUMEUR, P. N., AND NAYAR, S. K. 2007. Active refocusing of images and videos. ACM Trans. Graph. 26, 3, 67.
39. MURPHY, H., AND DUCHOWSKI, A. T. 2001. Gaze-contingent level of detail rendering. Eurographics.
40. MURPHY, H., AND DUCHOWSKI, A. T. 2007. Hybrid image/model-based gaze-contingent rendering. In ACM APGV '07, 107-114.
41. NAKA, K., AND RUSHTON, W. 1966. S-potentials from luminosity units in the retina of fish (Cyprinidae). The Journal of Physiology 185, 587-599.
42. NIKOLOV, S. G., NEWMAN, T. D., BULL, D. R., CANAGARAJAH, N. C., JONES, M. G., AND GILCHRIST, I. D. 2004. Gaze-contingent display using texture mapping and OpenGL: system and applications. In ETRA '04: Proceedings of the 2004 Symposium on Eye Tracking Research & Applications, ACM, New York, NY, USA, 11-18.
43. PATTANAIK, S. N., FERWERDA, J. A., AND FAIRCHILD, M. D. 1998. A multiscale model of adaptation and spatial vision for realistic image display. ACM SIGGRAPH, 287-298.
44. PATTANAIK, S. N., TUMBLIN, J., YEE, H., AND GREENBERG, D. P. 2000. Time-dependent visual adaptation for fast realistic image display. ACM SIGGRAPH '00, 47-54.
45. PAUPOO, A., FRIEDBERG, C., AND LAMB, T. 2000. Human cone photoreceptor responses measured by the electroretinogram a-wave during and after exposure to intense illumination. Journal of Physiology 529, 2, 469-482.
46. PHOTOMATIX, 2003. http://www.hdrsoft.com/.
47. QTPFSGUI, 2007. http://qtpfsgui.sourceforge.net/.
48. REINHARD, E., AND DEVLIN, K. 2005. Dynamic range reduction inspired by photoreceptor physiology. IEEE Trans. VCG 11, 1, 12-24.
49. REINHARD, E., STARK, M., SHIRLEY, P., AND FERWERDA, J. 2002. Photographic tone reproduction for digital images. ACM Trans. Graph. 21, 3, 45-52.
50. REINHARD, E., WARD, G., PATTANAIK, S., AND DEBEVEC, P. 2005. High Dynamic Range Imaging: Acquisition, Display, and Image-Based Lighting. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
51. RIVA, G., LORETI, P., LUNGHI, M., VATALARO, F., AND DAVIDE, F. 2003. Presence 2010: The emergence of ambient intelligence. In W. A. IJsselsteijn (Ed.), Being There: Concepts, Effects and Measurement of User Presence in Synthetic Environments, IOS Press, 60-81.
52. SCHLICK, C. 1994. Quantization techniques for the visualization of high dynamic range pictures. Photorealistic Rendering Techniques, Proc. 5th Eurographics Rendering Workshop, 7-20.
53. SEETZEN, H., WARD, G., WHITEHEAD, L., AND HEIDRICH, W. 2004. High dynamic range display system. ACM SIGGRAPH '04 Emerging Technologies.
54. STARK, L., AND SHERMAN, P. 1957. A servoanalytic study of consensual pupil reflex to light. Journal of Neurophysiology 20, 17-25.
55. STERLING, P. 1999. Deciphering the retina's wiring diagram. Nature Neuroscience 2, 10 (October), 851-853.
56. TUMBLIN, J., AND RUSHMEIER, H. E. 1993. Tone reproduction for realistic images. IEEE Computer Graphics and Applications 13, 6, 42-48.
57. TUMBLIN, J., AND TURK, G. 1999. LCIS: A boundary hierarchy for detail-preserving contrast reduction. ACM SIGGRAPH, 83-90.
58. TUMBLIN, J., HODGINS, J. K., AND GUENTER, B. K. 1999. Two methods for display of high contrast images. ACM Transactions on Graphics 18, 1, 56-94.
59. WYSZECKI, G., AND STILES, W. 2000. Color Science: Concepts and Methods, Quantitative Data and Formulae, 2nd Ed. Wiley, New York, New York.

Claims
1. A method for displaying an HDR image on a LDR display device to a user, the method comprising the steps of repeatedly:
(1a) estimating a gaze position of the user by tracking at least one eye of the user, the gaze position of the user being a position on a screen of the LDR device at which the user is looking;
(1b) deriving an output image from the HDR image based on the estimated gaze position of the user; and
(1c) displaying the output image on the LDR display device.
2. A method according to claim 1, wherein the estimated gaze position of the user is an estimated ROM area which is an area on the screen of the LDR device at which the user is looking and step (1b) further comprises the sub-steps of:
(2i) deriving a tone-mapping function based on the estimated ROM area of the user;
(2ii) applying the derived tone-mapping function to the HDR image to obtain a tone-mapped HDR image; and
(2iii) deriving the output image using the tone-mapped HDR image.
3. A method according to claim 2, wherein step (1a) further comprises the sub-steps of:
(3i) determining a distance between the user and the screen of the LDR display device;
(3ii) determining a point on the screen of the LDR display device at which the user is looking; and
(3iii) estimating the ROM area based on a view distance and a view point, the view distance being the distance between the user and the screen of the LDR display device and the view point being the point on the screen of the LDR display device at which the user is looking.
4. A method according to claim 3, wherein a user position is calculated from the view distance and the ROM area is estimated in step (3iii) as a square centered at the view point, each side of the square being of a length calculated by multiplying the distance between the view point and the user position with an angle representing a visual angle projected on a macula of the user's eye.

5. A method according to claim 4, wherein the angle representing the visual angle projected on the macula of the user's eye is 18°.
6. A method according to any of claims 2 - 5, wherein the estimated ROM area comprises a plurality of pixels and step (2i) further comprises the sub-steps of:
(6i) clustering the plurality of pixels to obtain a plurality of clusters wherein each cluster comprises a sub-set of the plurality of pixels and a center of each cluster is referred to as a cluster center; and
(6ii) deriving the tone-mapping function based on the cluster centers.
7. A method according to claim 6, wherein the number of the plurality of clusters in sub-step (6i) is four.
8. A method according to claim 6 or 7, wherein the plurality of pixels is clustered in step (6i) using a K-means clustering algorithm.
9. A method according to any of claims 6 - 8, wherein sub-step (6ii) further comprises the sub-step of deriving the tone-mapping function based on the cluster centers such that the plurality of pixels in the estimated ROM area are expressed in a dynamic range of the LDR display device.
10. A method according to any of claims 6 - 8, wherein sub-step (6ii) further comprises the sub-steps of:
(10i) calculating a slope for each cluster center of the plurality of clusters according to the equation Si = 0.5/Ci, where Ci is the cluster center for the cluster i and Si is the calculated slope for the cluster center Ci;
(10ii) creating a linear segment for each cluster center, the linear segment having a gradient equal to the calculated slope for the cluster center; and
(10iii) deriving the tone-mapping function as a function formed by joining the linear segments created for the cluster centers.

11. A method according to claim 10, wherein sub-step (10iii) further comprises the sub-step of performing a weighted normalization procedure on the linear segments created for the cluster centers prior to joining the linear segments, the weighted normalization procedure being based on the number of pixels in each cluster of the plurality of clusters.
12. A method according to any of claims 2 - 11, wherein the tone-mapping function derived in sub-step (2i) is a piece-wise linear function or a cubic B-splines function.
13. A method according to any of claims 2 - 12, wherein sub-step (2ii) further comprises the sub-step of mapping cluster centers to equidistant intensities in the range of the tone-mapping function to obtain the tone-mapped HDR image.
14. A method according to any of claims 2 - 13, wherein the HDR image comprises a plurality of pixels and the sub-step (2iii) further comprises the sub-steps of:
(14i) calculating an illumination value for each pixel in the HDR image;
(14ii) calculating an average illumination value for the plurality of pixels in the HDR image using the illumination values calculated for the pixels in the HDR image;
(14iii) mapping the average illumination value for the plurality of pixels in the HDR image to a middle of a display range of the LDR display device to obtain a globally scaled HDR image; and
(14iv) deriving the output image based on the globally scaled HDR image and the tone-mapped HDR image.
15. A method according to claim 14, wherein sub-step (14iv) further comprises the sub-steps of:
(15i) calculating a bleaching factor according to a Naka-Rushton equation;
(15ii) performing a weighted sum of the globally scaled HDR image and the tone-mapped HDR image using the bleaching factor to obtain an illumination scaled image; and
(15iii) deriving the output image based on the illumination scaled image.
16. A method according to claim 15, wherein the illumination scaled image comprises a plurality of pixels, each pixel having an illumination value, and sub-step (15iii) further comprises the sub-step of converting the illumination values of the pixels in the illumination scaled image to RGB color values to derive the output image.
17. A method according to claim 1, wherein the following steps (17a) - (17b) are performed prior to step (1a):
(17a) dividing the HDR image into a plurality of ROM window blocks;
(17b) pre-defining a set of cluster centers in each ROM window block, the set of cluster centers comprising a plurality of pixels separated at regular pixel intervals both horizontally and vertically; and wherein step (1b) further comprises the sub-steps of:
(17i) locating a plurality of grid positions to which the user's gaze position is the closest, each grid position being a center of a corresponding ROM window block;
(17ii) locating the pre-defined set of cluster centers in the corresponding ROM window block for each grid position;
(17iii) interpolating the located pre-defined sets of cluster centers to obtain a plurality of interpolated preset cluster centers according to the distance between the plurality of grid positions and the user's gaze position;
(17iv) deriving a tone-mapping function based on the plurality of interpolated preset cluster centers;
(17v) applying the derived tone-mapping function to the HDR image to obtain a tone-mapped HDR image; and (17vi) deriving the output image using the tone-mapped HDR image.
18. A method according to claim 1, wherein the estimated gaze position of the user is an estimated ROM area which is an area on the screen of the LDR device at which the user is looking and wherein the following steps (18a) - (18b) are performed prior to step (1a):
(18a) dividing the HDR image into a plurality of pre-computed ROM areas, each pre-computed ROM area comprising a plurality of pixels; and
(18b) clustering the plurality of pixels in each pre-computed ROM area into a plurality of clusters; and wherein step (1b) further comprises the sub-steps of:
(18i) locating a pre-computed ROM area which overlaps the most with the estimated ROM area of the user;
(18ii) deriving a tone-mapping function for the located pre-computed ROM area based on pre-computed cluster centers in the located pre-computed ROM area, each pre-computed cluster center being a center of a cluster in the located pre-computed ROM area;
(18iii) applying the derived tone-mapping function to the HDR image to obtain a tone-mapped HDR image; and
(18iv) deriving the output image using the tone-mapped HDR image.
19. A method according to claim 17, further comprising the step of applying a delay factor to the plurality of interpolated preset cluster centers prior to sub-step (17iv).
20. A method according to claim 17, further comprising a step of pre-computing a bleaching factor prior to step (1a) and the sub-step (17vi) is performed by applying the bleaching factor with the tone-mapped HDR image to derive the output image.
21. A method according to claim 20, further comprising the step of applying a delay factor to the bleaching factor prior to sub-step (17vi).
22. A method according to claim 18, further comprising the step of applying a delay factor to the pre-computed cluster centers in the located pre-computed ROM area prior to sub-step (18ii).
23. A method according to claim 18, further comprising the step of pre- computing a bleaching factor prior to step (1a) and the sub-step (18iv) is performed by applying the bleaching factor with the tone-mapped HDR image to derive the output image.
24. A method according to claim 23, further comprising the step of applying a delay factor to the bleaching factor prior to sub-step (18iv).
25. A computer system having a processor and a LDR display device, the computer system being arranged to perform a method according to any of the preceding claims.
26. A computer system according to claim 25, wherein the processor is arranged to run a display program employing a shader program.
27. A computer program product, readable by a computer and containing instructions operable by a processor of a computer system having an LDR display device to cause the computer system to perform a method according to any of claims 1 to 24.