WO2010024782A1 - A method and system for displaying an HDR image on a LDR display device - Google Patents

A method and system for displaying an HDR image on a LDR display device

Info

Publication number
WO2010024782A1
WO2010024782A1 (application PCT/SG2009/000299)
Authority
WO
WIPO (PCT)
Prior art keywords
tone
user
image
sub
hdr image
Prior art date
Application number
PCT/SG2009/000299
Other languages
French (fr)
Other versions
WO2010024782A8 (en)
Inventor
Susanto Rahardja
Farzam Farbiz
Corey Mason Manders
Zhiyong Huang
Suat Ling Jamie Ng
Ee Ping Ong
Zhengguo Li
Jinghong Zheng
Wei Yao
Original Assignee
Agency For Science, Technology And Research
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agency For Science, Technology And Research filed Critical Agency For Science, Technology And Research
Publication of WO2010024782A1 publication Critical patent/WO2010024782A1/en
Publication of WO2010024782A8 publication Critical patent/WO2010024782A8/en

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/10 Intensity circuits
    • G06T5/92
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20172 Image enhancement details
    • G06T2207/20208 High dynamic range [HDR] image processing
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2320/00 Control of display operating conditions
    • G09G2320/02 Improving the quality of display appearance
    • G09G2320/0261 Improving the quality of display appearance in the context of movement of objects on the screen or movement of the observer relative to the screen
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2320/00 Control of display operating conditions
    • G09G2320/02 Improving the quality of display appearance
    • G09G2320/0271 Adjustment of the gradation levels within the range of the gradation scale, e.g. by redistribution or clipping
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2320/00 Control of display operating conditions
    • G09G2320/06 Adjustment of display parameters
    • G09G2320/0613 The adjustment depending on the type of the information to be displayed
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2320/00 Control of display operating conditions
    • G09G2320/06 Adjustment of display parameters
    • G09G2320/0673 Adjustment of display parameters for control of gamma adjustment, e.g. selecting another gamma curve
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2320/00 Control of display operating conditions
    • G09G2320/06 Adjustment of display parameters
    • G09G2320/0686 Adjustment of display parameters with two or more screen areas displaying information with different brightness or colours
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2340/00 Aspects of display data processing
    • G09G2340/14 Solving problems related to the presentation of information to be displayed
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2354/00 Aspects of interface with display user
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2360/00 Aspects of the architecture of display systems
    • G09G2360/16 Calculation or use of calculated indices related to luminance levels in display data

Definitions

  • the present invention relates to a method and system for displaying an HDR image on a LDR display device.
  • HDR: high dynamic range
  • An option to address the problem of displaying HDR content is to compress the range of the HDR image using a local or global tone-mapping operator so that the HDR image may be shown appropriately on LDR display devices.
  • Tone-mapping operators are also referred to as dynamic range reduction operators or tone-mapping functions.
  • these methods can be grouped according to the type of operators (either global operators or local (spatially varying) operators) they use.
  • a few examples of proposed methods using the global operators and the local operators are listed as follows.
  • [Reinhard and Devlin 2005] also proposes a method in which a tone-mapping operator is constructed based on photoreceptor response by modeling the response of cones in the eye to produce a static LDR image.
  • [Reinhard and Devlin 2005] considers the effect of cones and rods in HDR to LDR tone-mapping but does not fully consider the effect of photopigment depletion.
  • the present invention aims to provide a new and useful method and system for displaying HDR content on a LDR display device.
  • the present invention seeks to do so in such a way that the resulting image looks as realistic as possible while showing all the image content.
  • the present invention proposes finding the position on a LDR screen at which a user is looking, and dynamically adjusting a HDR image displayed on the screen by appropriately varying a tone-mapping function according to where the user is looking (that is, the gaze position of the user).
  • contrast resources may be allocated to the region where the user's gaze is centered, to make more visible the detail that is present in that region of the HDR image.
  • the present invention in some forms proposes using eye tracking technology to determine the user's distance and gaze direction to estimate the portion of the image projected on the most sensitive region of the user's retina (the macula).
  • the present invention has some similarity to known techniques of level-of-detail (LOD) and eye tracking [Murphy and Duchowski 2001].
  • LOD: level-of-detail
  • Eye tracking [Murphy and Duchowski 2001].
  • the present invention in some forms further proposes designing the tone-mapping function to model the human visual system, given the characteristics and requirements of the human gaze-aware system.
  • the human visual system uses several methods to interactively adapt to the immense range of light intensities in our day-to-day lives, continually changing to effectively perceive the information we are looking at. Much of this ability of the human visual system has been shown to be based on range compression [Jarna et al. 1993], while maintaining adequate detail in the perceived image. For a display system to be able to offload part of this range reduction, it is preferable that it adequately models and mimics the human visual system.
  • range compression [Jarna et al. 1993]
  • a brief description of the physiology of the human eye is presented below. Often, the eye is modeled as a camera, and much of the design of any camera is inspired by the eye. The pupil is equivalent to the aperture of a camera.
  • the eye adjusts the pupil to globally regulate the light entering the eye and, consequently, reaching the retina.
  • the diameter of the pupil may range from 2 mm up to 5 - 8 mm depending on the age of the person (usually up to 5 mm for elderly people and up to 8 mm for young adults). This change in pupil diameter may be modeled as a global change in illumination, and is presented in greater detail in [Stark and Sherman 1957; Clarke et al. 2003].
  • photopigment molecules consist of two parts, a chromophore (retinal), and a protein background (opsin).
  • the complex of the chromophore and the opsin is termed rhodopsin.
  • When a photon of the correct wavelength impinges upon a particular chromophore, the chromophore instantly changes shape in a process called photoisomerization. This is similar to the photoelectric effect that is exploited in CMOS and CCD sensors [Boyle and Smith 1970].
  • Rhodopsin absorbs incoming light, giving the photoreceptor cells a characteristic purple (termed visual purple) colour. Photoisomerization causes bleaching of the purple, turning it yellow and allowing light to pass deeper to the photopigments below. Bleached photopigments are regenerated continually by the eye, but as the intensity of the light impinging on the photoreceptor increases, the regeneration is unable to match the depletion. As an example, below 100 cd/m², photopigment replenishing can stay ahead of bleaching. Above 100 cd/m², bleaching becomes a critical factor for the local adaptation of a scene. For example, at 5000 cd/m², the amount of unbleached photopigment is about 50%.
  • At still higher intensities, the amount of unbleached photopigment falls to about 0.2%. This phenomenon shows that the eye is in fact performing "local" compression. Furthermore, the logarithmic depletion of the photopigments imposes a secondary compression element on top of the ever-present logarithmic range compression found in all perceptual sensing in the human body.
  • Global adaptation occurs by means of the adjustment of the pupil. Just as a photographer adjusts the aperture diameter of a camera to regulate the amount of light that film or sensor arrays are exposed to, the human eye adjusts the pupil to similarly regulate the amount of incoming light.
  • Local adaptation occurs in the photopigment molecules which are produced in the human eye allowing the transduction of light into a physiological response. These photopigments are continually "bleached" and therefore must be regenerated or re-assembled to re-enable their function. This physiological mechanism contributes to a local adaptation as opposed to a global adaptation by means of pupil dilation.
  • the physiology of the human eye serves as a basis as to how a dynamic display which reacts to changes in a similar way as that of a human eye can be created.
  • While the macular pigment is certainly an important component in vision, it is not appropriate to model this component for creating a dynamic display. Rather, the component that is more likely to be of primary consideration is the performance of the photoreceptors.
  • the pupil contributes very little to the overall adaptation of the human visual system (less …
  • the present invention proposes using a first step of performing adaptation (i.e. long term adaptation) in which the image is linearly scaled in intensity such that the mean quantimetric value [Mann 2001] maps to the midpoint of the output range before gamma correction is applied.
  • This step corresponds to the linear scaling during pupil expansion or contraction.
  • Modeling the response of the photoreceptors is significantly more complex as compared to modeling the pupillary reflex of the eye, partly because there is less information known about the process. Additionally, at this first stage of processing in the human visual system, the processing chain is remarkably intricate.
  • the present invention proposes using well-known physical concepts, for example pupil diameter and adjustment times, for the amount of light entering the eye. In certain embodiments, the present invention further proposes using the following points 1 to 6 to model opsin transductance and the methods in which it adapts to differing light intensities.
  • the eye has various global adaptation mechanisms such as pupil size, general photoreceptor response.
  • the present invention in some forms proposes incorporating a global adaptation mechanism.
  • the cone receptors that are located in the macula provide higher visual acuity than rods.
  • the maximal cone concentration is in the very center of the macula (the fovea).
  • the present invention in some forms proposes a window-based approach addressing locality.
  • Cone receptors act as a network, where a cluster of cones contribute to the output of the ganglion cells and receptors closer to the dendritic tree based in the ganglion cells contribute more to the perception of the incoming light than those at the edges.
  • the present invention in some forms proposes a cluster-based approach to model the light response.
  • Photoreceptor bleaching is largely a local phenomenon acting in tandem with global adaptation (reacting to environmental lighting). Transductance is changed corresponding to photons impinging on particular photopigments causing photoisomerization.
  • the present invention in some forms proposes a method which may be used to combine global adaptation with opsin bleaching/photoreceptor response.
  • the present invention in some forms proposes employing a time adaptation process in tone-mapping.
  • the fovea has a much higher concentration of cone receptors, and thus contributes more to the overall perceived image.
  • the maximal cone concentration is in the very center of the fovea. It should also be noted that 50% of the fibers in the optic nerve are used for transmitting information from the fovea, while the remaining fibers carry information from the rest of the retina.
  • the present invention in some forms proposes a tone-mapping function according to where the user is looking.
  • Fig. 2 includes a histogram 202 of the numbers of pixels in the region corresponding to the ROM area for each possible value of luminance (0 to 1.0).
  • the linear mapping function is shown as the dashed line 204 and the adjusted mapping function is shown as a solid line 206.
  • a relatively high proportion of the pixels in the ROM area have a luminance centered in a narrow sub-range of luminance values (this sub-range is 0.35 to 0.65).
  • the present invention in some forms proposes adjusting the tone-mapping function which transfers the HDR content to the LDR content according to this observation, to expand the dynamic range of the ROM area.
  • the present invention in some forms further proposes implementing the display system at real time speeds and varying the tone-mapping function in real time by performing much of the tone-mapping in an accelerated manner using a balance of pre-computation and real time GPU-based rendering.
  • a first aspect of the present invention provides a method for displaying an HDR image on a LDR display device to a user, the method comprising the steps of repeatedly: (1a) estimating a gaze position of the user by tracking at least one eye of the user, the gaze position of the user being a position on a screen of the LDR device at which the user is looking; (1b) deriving an output image from the HDR image based on the estimated gaze position of the user; and (1c) displaying the output image on the LDR display device.
  • the invention may alternatively be expressed as a computer system for performing such a method.
  • This computer system may be integrated with a device for capturing HDR images.
  • the computer system performs the method by running a display program with a shader program.
  • the invention may also be expressed as a computer program product, such as one recorded on a tangible computer medium, containing program instructions operable by a computer system to perform the steps of the method.
  • Fig. 1 illustrates the physiology of a human eye
  • Fig. 2 illustrates a histogram of pixel luminance values in a region corresponding to a ROM area, a linear tone-mapping function and an adjusted tone-mapping function
  • Fig. 3 illustrates a flow diagram of a method 300 which displays an HDR image on an LDR display device according to an embodiment of the present invention
  • Fig. 4 illustrates a cluster-based tone-mapping function derived in a first example of sub-step 314b of step 314 of method 300 when the number of clusters N is 4;
  • Fig. 5 illustrates slopes derived for each cluster center in a second example of sub-step 314b of step 314 of method 300 when the number of clusters N is 4;
  • Fig. 6 illustrates a piecewise linear function and a cluster-based tone-mapping function derived from the slopes of Fig. 5;
  • Figs. 7(a) and (b) illustrate results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and
  • Figs. 7(c) and (d) illustrate results after applying a global and local tone- mapping algorithm on the input image respectively;
  • Figs. 8(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 8(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • Figs. 9(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 9(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • FIGs. 10(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and
  • Figs. 10(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • Figs. 11 (a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 11(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • Figs. 12(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 12(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • Figs. 13(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 13(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
  • Figs. 14(a) and (b) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 14(c) and (d) illustrate results after applying a local and global tone-mapping algorithm on the input image respectively.
  • Figs. 15(a) - (d) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) as the user walks nearer the image;
  • Figs. 16(a) - (c) illustrate results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user looks at different portions of the image;
  • Figs. 17(a) - (c) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user looks at different portions of the image;
  • Fig. 18 illustrates results of a user study performed using method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
  • Fig. 19 illustrates results of a first user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
  • Fig. 20 illustrates further results of the first user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
  • Fig. 21 illustrates a real world HDR reference image used in a second user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency); and
  • Fig. 22(a) illustrates an original HDR image
  • Figs. 22(b) - (c) illustrate results of method 300 when the user is looking at different portions of the image after replacing the cluster-based tone-mapping function with a global tone-mapping operator.
  • Fig. 3 shows a method 300 which is an embodiment of the present invention, and which displays an HDR image on a LDR display device.
  • the user distance to the display and information from eye tracking are used to produce an appropriately tone mapped dynamic LDR image, able to react to a user's change in gaze.
  • the input to the method 300 is an HDR image.
  • In step 302, a long term adaptation component is computed, and in step 304 the global scale of the input HDR image is adjusted to form the image marked as "A".
  • In step 306, the distance between the user and the screen of the display device is determined, whereas in step 308 an X-Y position of the user's gaze on the screen of the display device is determined.
  • In step 310, a region of observation by the macula (ROM) area is estimated using the distance between the user and the screen of the display device (determined in step 306) and the X-Y position of the user's gaze on the screen of the display device (determined in step 308).
  • In step 312, illumination data for each of the ROM area pixels is determined, and this is used to calculate a cluster-based tone-mapping function in step 314. Further in step 314, cluster-based tone-mapping is performed on the input HDR image using the cluster-based tone-mapping function, to form the image shown as "B".
  • In step 316, a bleaching factor is calculated. This is used together with the globally scaled image A from step 304 and the cluster-based tone mapped image B from step 314 to perform illumination scaling on the image to derive an illumination scaled image C. The values in the illumination scaled image are then converted from luminance values to RGB values. This gives the "output image" which is displayed using the LDR display device.
  • Steps 302 and 304 correspond to the long term adaptation of photoreceptors and pupil adjustment in a human eye.
  • a long term adaptation component S_g is computed and is used to adjust the global scale of the image. This is performed by calculating the illumination value (also referred to as the HDR luminance) for each pixel in the HDR image in step 302 and using the HDR luminance of each pixel to calculate the long term adaptation component S_g.
  • the long term adaptation component S_g is in turn used to adjust the global scale of the image accordingly in step 304.
  • the long term adaptation component S_g is defined in Equation (1), i.e. S_g = 0.5 / Ī, where Ī denotes the average HDR luminance of the image.
  • Equation (1) calculates an average illumination value for the pixels in the HDR image (the average HDR luminance) using the illumination values calculated for the pixels in the HDR image in step 302, and maps the average HDR luminance of the HDR image to approximately the middle of the display range of the display device.
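  • As an illustration only (not the patent's literal Equation (1)), the following Python sketch computes the long term adaptation component S_g as just described, assuming the display midpoint is 0.5 and the HDR image is given as an array of linear luminance values:

```python
import numpy as np

def global_scale(hdr_lum: np.ndarray, display_mid: float = 0.5):
    """Steps 302-304: map the average HDR luminance to the middle of the
    display range (a reading of Equation (1); the exact form is assumed)."""
    s_g = display_mid / hdr_lum.mean()   # long term adaptation component S_g
    return s_g, hdr_lum * s_g            # globally scaled image "A"
```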
  • the ROM area may be referred to as an area on a screen of the display device at which the user is looking. In one example, it is the center of the viewer's gaze on the screen and refers to the area of the screen that falls into the macular region inside the retina of the viewer's eyes.
  • the ROM area on the screen is estimated. This is performed by determining the user's distance from the screen (also referred to as the view distance) in step 306 to determine the position (X, Y, Z) as mentioned above, calculating the X-Y position of the user's gaze on the screen in step 308 to determine the point (u, v) (also referred to as the view point) and subsequently estimating the ROM area on the screen in step 310 using the position (X, Y, Z) from step 306 and the point (u, v) from step 308.
  • Steps 306 and 308 are performed using eye tracking technology.
  • the ROM area on the screen is estimated as a square window centered at the point (u, v) with sides of length l_ROM (in pixels), where l_ROM is defined according to Equation (2).
  • the ROM area is estimated as a square to simplify the calculations required, and θ_macula is set as 18°.
  • the ROM area may be estimated as any other shape, for example an oval, and θ_macula may be set to a different value.
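  • Since the body of Equation (2) is not reproduced here, the following sketch gives one plausible geometric reading: the ROM window side subtends θ_macula = 18° at viewing distance Z, converted to pixels with an assumed resolution factor (pixels_per_metre is a hypothetical parameter, not from the patent):

```python
import numpy as np

def rom_window_pixels(view_dist_m: float, pixels_per_metre: float,
                      theta_macula_deg: float = 18.0) -> int:
    """Side length l_ROM (in pixels) of the square ROM window centered at
    the gaze point (u, v); this formula is an assumption, not Equation (2)."""
    half_angle = np.radians(theta_macula_deg / 2.0)
    side_m = 2.0 * view_dist_m * np.tan(half_angle)  # screen extent subtended
    return int(round(side_m * pixels_per_metre))
```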
  • the lowest level of information available is the individual image pixels corresponding to the pixels (or photosites) present in the camera's sensor.
  • pixels with comparable intensities in the ROM area are clustered in accordance with [Sterling 1999] in step 312 and a tone-mapping function is obtained from this clustering and is applied on the HDR image in step 314.
  • step 312 the illumination data of each of the ROM area pixels is determined whereas in step 314, tone-mapping is performed on the entire image using a tone-mapping function that is determined based on the illumination data of each of the ROM area pixels.
  • Step 314 comprises sub-steps 314a and 314b.
  • sub-step 314a clustering is performed on the ROM area pixels whereas in sub-step 314b, a cluster-based tone-mapping function is obtained based on the clustering of the ROM area pixels and is applied on the ROM area.
  • the ROM area pixels are first grouped into N classes based on the illumination data of each ROM area pixel as calculated in step 312. This is performed using a K-means clustering algorithm proposed in [Kanungo et al. 2002; Kanungo et al. 2004]. After performing the K-means clustering algorithm on the ROM area pixels, a set of N cluster centers (denoted as [C_0 ... C_{N-1}]) is obtained. Hence, the tone-mapping operator is reduced to an N-dimensional vector via the clustering.
  • a K-means clustering algorithm is used to cluster the ROM area pixels and N is set to 4.
  • the ROM area pixels may be clustered using other clustering algorithms and N may be set to any other value.
  • a value of N higher than 4 is not preferred as this results in a higher computational cost, and experiments on different test images have shown that increasing the number of clusters N beyond 4 does not make a noticeable difference in the output. This could be because the neighbouring pixels for most of the image belong to fewer than 4 different intensity regions.
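  • A minimal 1-D K-means sketch for this clustering step is given below; the cited [Kanungo et al.] algorithm is more sophisticated, so this plain Lloyd-style iteration only illustrates how the N = 4 cluster centers C_0 ... C_{N-1} could be obtained from the ROM pixel luminances:

```python
import numpy as np

def cluster_centers(rom_lum: np.ndarray, n_clusters: int = 4,
                    n_iter: int = 20) -> np.ndarray:
    """1-D K-means over ROM pixel luminances; returns sorted centers."""
    x = rom_lum.ravel()
    # spread the initial centers across the luminance distribution
    centers = np.quantile(x, np.linspace(0.1, 0.9, n_clusters))
    for _ in range(n_iter):
        labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean()
    return np.sort(centers)
```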
  • a tone-mapping function which is a piecewise linear function, is derived based on the clustering of pixels in the ROM area and is then applied on the entire image to obtain a tone mapped image.
  • the cluster centers are used to impose knots and are used as control points in the tone-mapping function.
  • A first and a second example of sub-step 314b are presented as follows.
  • In the first example of sub-step 314b, the slope S_i of the piecewise linear tone-mapping function is derived according to Equation (4).
  • N may be set as 4 in sub-step 314b.
  • each piecewise linear region of the tone-mapping function as shown in Fig. 4 is allocated a particular gain (slope).
  • the cluster centers are then mapped to equidistant intensities in the range of the tone-mapping function in the process of obtaining the tone-mapped image.
  • N is preferably set as a value greater than 3 in the first example of sub-step 314b.
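  • The first example of sub-step 314b (cluster centers mapped to equidistant output intensities, with a piecewise linear curve between them) could be sketched as follows; the placement of the two end knots at 0 and at the maximum luminance is an assumption:

```python
import numpy as np

def tone_map_equidistant(lum: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Map each cluster center C_i to an equidistant intensity in [0, 1]
    and interpolate linearly in between (first example of sub-step 314b)."""
    c = np.sort(np.asarray(centers, float))
    knots_in = np.concatenate(([0.0], c, [max(float(lum.max()), c[-1] + 1e-6)]))
    knots_out = np.linspace(0.0, 1.0, len(c) + 2)  # equidistant targets
    return np.interp(lum, knots_in, knots_out)
```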
  • pixel clusters are best viewed if each cluster center is mapped to the middle of the LDR range [0,1] with a slope inversely proportional to its luminance, according to Weber's law of just noticeable difference (JND) [Wyszecki and Stiles 2000].
  • JND: just noticeable difference
  • a desired slope S_i^d is calculated for each cluster center C_i as 0.5/C_i, as it is not possible to map all cluster centers to the middle of the display range simultaneously. This creates a linear segment with a gradient equal to the slope S_i^d for each cluster, as shown in Fig. 5.
  • the linear segments are then required to be joined to produce a monotonically non-decreasing tone-mapping function which is a piecewise linear tone- mapping function.
  • If the piecewise linear tone-mapping function is formed by matching each segment of the function to the desired slope S_i^d, the function will likely extend past 1, as shown by the solid line in Fig. 6.
  • a weighted normalization procedure based on the cluster populations P_i (i.e. the number of pixels in cluster i) is performed on the linear segments prior to joining them, to derive a piecewise linear tone-mapping function S_i defined according to Equation (5).
  • In Equation (5), B_i is the cluster boundary which falls between clusters C_{i-1} and C_i.
  • the dotted line in Fig. 6 gives an example of this piecewise linear tone-mapping function S_i, which is also referred to as the cluster-based tone-mapping function.
  • Equation (5) is advantageous as clusters which are heavily populated will not be significantly compressed whereas clusters with low populations will receive greater compression. Furthermore, Equation (5) is not completely dependent on histogram population and accordingly does not suffer from the extreme contrast enhancement which is sometimes present in the result of histogram equalization.
  • a piecewise linear tone-mapping function is used due to its flexibility [Mantiuk et al.].
  • the slope in dynamic regions where clusters of intensities exist can be increased whereas this adjustment can be compensated for by decreasing the slope in low population regions.
  • alternatively, cubic B-splines can be used to model the tone-mapping function, thereby providing C² continuity.
  • the piecewise linear method is preferable as it can be easily implemented as a GPU shader program, able to easily achieve real-time performance.
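  • The exact form of Equation (5) is not reproduced above, but the following sketch implements the stated behaviour of the second example of sub-step 314b: desired slopes S_i^d = 0.5/C_i per Weber's law, then a population-weighted normalisation so that heavily populated clusters are compressed least while the curve stays monotone and ends at 1. The particular weighting used here is an assumption, not the patent's formula:

```python
import numpy as np

def cluster_tone_curve(lum, centers, populations):
    """Second example of sub-step 314b (in the spirit of Equation (5))."""
    c = np.sort(np.asarray(centers, float))
    # cluster boundaries B_i between C_{i-1} and C_i (midpoints assumed)
    bounds = np.concatenate(([0.0], (c[:-1] + c[1:]) / 2.0,
                             [max(1.0, float(np.max(lum)))]))
    widths = np.diff(bounds)
    rises = (0.5 / np.maximum(c, 1e-6)) * widths  # rise at desired slope S_i^d
    w = np.asarray(populations, float)
    rises *= w / w.sum()                          # favour dense clusters
    rises /= rises.sum()                          # total output range = 1
    knots_out = np.concatenate(([0.0], np.cumsum(rises)))
    return np.interp(lum, bounds, knots_out)      # monotone piecewise linear
```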
  • the key illumination adaptation (and corresponding contrast optimization) is done by the photoreceptors.
  • the adaptation is known to be accomplished by a mechanism of bleaching in the photopigment.
  • The photoreceptor response is modeled by Equation (6), which is often termed the Naka-Rushton equation. This equation is a function of the unbleached photopigment at a particular instance and follows from the experiments described in [Naka and Rushton 1966].
  • σ is the value that causes half-maximal response and is a semi-saturation constant [Pattanaik et al. 2000].
  • the value n is a sensitivity control which is similar to gamma for video, film and typical displays.
  • R(M) = R_max · M^n / (M^n + σ^n)    (6)
  • a bleaching factor β is calculated in step 316 using Equation (6), in order to blend the photopigment phenomena with the adaptation adjusted image.
  • β is set as R(M) of Equation (6), and R(M) is calculated according to a first and a second example of step 316, as described below.
  • First example of step 316
  • R(M) is calculated using the variables obtained from Equations (7) - (9).
  • The value M in Equation (6) (denoted as M_ROM in Equation (7)) is calculated according to Equation (7), in which T_ROM denotes the total number of pixels in the ROM area. Note that calculating M_ROM according to Equation (7) is consistent with using the average radiometric value of the user's ROM area.
  • the variable σ is calculated according to Equation (8), which is based on the method described in [Pattanaik et al. 2000] in which findings of [Hunt 1995] were used to aid in modeling the response of the photoreceptors. Equation (8) is derived when Equation (6) is specifically applied to the cones, and A_cone is the fixed adaptation illumination amount, which corresponds directly to the global illumination average value.
  • Hunt's bleaching parameter R_max is calculated according to Equation (9), whereby A_cone is defined in the same way as in Equation (8).
  • M_ROM = (1 / T_ROM) · Σ_{(i,j) ∈ ROM area} I(i,j)    (7)
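  • A sketch of the bleaching factor computation (Equations (6) and (7)) follows; σ and R_max are left as plain parameters because the bodies of Equations (8)-(11) are not reproduced here, and the sensitivity n = 0.73 (a value commonly used for cones in the literature) is an assumption:

```python
import numpy as np

def bleaching_factor(rom_lum: np.ndarray, sigma: float,
                     r_max: float = 1.0, n: float = 0.73) -> float:
    """beta = R(M_ROM) via the Naka-Rushton equation (Equation (6)),
    with M_ROM the mean ROM luminance (Equation (7))."""
    m = rom_lum.mean()                   # M_ROM
    return r_max * m**n / (m**n + sigma**n)
```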
  • Second example of step 316
  • R(M) is calculated using the variables obtained from Equations (7), (10) and (11).
  • In the second example, σ is calculated according to Equation (10), which is also based on the method described in [Pattanaik et al. 2000] in which findings of [Hunt 1995] were used to aid in modeling the response of the photoreceptors, whereas Hunt's bleaching parameter R_max is calculated according to Equation (11), whereby k is the scaling factor necessary to convert normalized pixel values ∈ [0,1] to luminance values in cd/m² and is equal to the maximum luminance of the scene in cd/m².
  • k is determined empirically. Equations (10) and (11) are derived from Equations (8) and (9).
  • An illumination scaled image is then derived by performing a weighted sum of the globally scaled image from step 304 and the tone-mapped image from step 314 using the bleaching factor β.
  • the bleaching factor β calculated in step 316 is used to linearly weight the (photopigment) enhanced processing of the image (the output image after performing cluster-based tone-mapping in step 314) with the globally scaled adaptation image (the output image after performing global scaling in step 304).
  • a first example of performing the illumination scaling is done according to Equation (12), in which I_out(u',v') is the intensity at the spatial location (u',v') of the illumination scaled image when the user's gaze is centered at the point (u,v) on the screen of the display device, β is the bleaching factor obtained in step 316, I(u',v') is the HDR luminance of the pixel at spatial location (u',v'), L(u',v') is the LDR image (the tone-mapped result obtained from the first example of sub-step 314b) at spatial location (u',v'), and I(u',v')·S_g is the globally scaled image (obtained from step 304) at spatial location (u',v').
  • L(u',v') is defined according to Equation (13).
  • S_i is the tone-mapping function obtained in Equation (5), whereas the definitions of the remaining symbols are the same as those in Equation (5).
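  • The blend of Equation (12) can be sketched as below; which term carries β versus (1 - β) is an assumption, since the text only states that β linearly weights the cluster tone-mapped image against the globally scaled image:

```python
import numpy as np

def illumination_scale(hdr_lum, tone_mapped, s_g, beta):
    """One reading of Equation (12): beta-weighted sum of the tone-mapped
    image L(u', v') and the globally scaled image I(u', v') * S_g."""
    return beta * tone_mapped + (1.0 - beta) * hdr_lum * s_g
```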
  • A second example of deriving the illumination scaled image I_out(u',v') is performed according to Equation (14).
  • In Equation (14), D is the distance between the viewpoint and the center of the user's gaze, as calculated in Equation (3).
  • the illumination values in the illumination scaled image are then converted to RGB color values to obtain an output image of method 300.
  • the conversion of the illumination values in the illumination scaled image to RGB color values is performed by relative scaling according to Equation (15).
  • the conversion of the illumination values in the illumination scaled image to RGB color values is performed using the method of [Schlick 1994; Mantiuk et al. 2008] according to Equation (15a).
  • gamma is set as 0.45 in Equation (15a).
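  • A hedged sketch of the colour restoration step follows, using the per-channel ratio approach associated with [Schlick 1994]; the exact forms of Equations (15) and (15a) are not reproduced in the text, so this is an illustration only:

```python
import numpy as np

def restore_color(rgb, lum_in, lum_out, s: float = 0.45):
    """C_out = (C_in / L_in)^s * L_out per channel, with s playing the
    role of the gamma value 0.45 mentioned for Equation (15a)."""
    ratio = rgb / np.maximum(lum_in[..., None], 1e-6)
    return np.clip(ratio**s * lum_out[..., None], 0.0, 1.0)
```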
  • A first example of simulating eye adaptation latency is performed according to Equation (16). Given each newly computed value v_i(t) (which may be, for example, a cluster center), each parameter p_i(t) at time t is updated using an exponential smoothing technique according to Equation (16), where w controls the exponential delay factor.
  • In a second example, P(t) is calculated according to Equation (17), where t is in seconds and τ is in milliseconds.
  • P(t) is the vector of parameters used for the current tone-mapping and P_ss(ROM_new) are the steady state parameters for the new ROM area the user is focused on.
  • the vector P(t_0) is the set of parameters corresponding to the moment t_0 when the user's ROM area is changed.
  • the value of τ changes when the user changes the gaze position from a light region to a dark region, or conversely from a dark region to a light region. This reflects the adaptation of the human visual system in each scenario.
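  • Both latency examples can be sketched as follows; Equation (16) is standard exponential smoothing as described, while the decay form used for Equation (17) is an assumption consistent with the stated quantities P(t_0), P_ss(ROM_new), t and τ:

```python
import numpy as np

def smooth_params(p_prev, v_new, w: float):
    """First example (Equation (16)): p_i(t) = (1-w)*p_i(t-1) + w*v_i(t)."""
    return (1.0 - w) * np.asarray(p_prev) + w * np.asarray(v_new)

def decay_params(p_t0, p_ss, t: float, t0: float, tau_ms: float):
    """Second example (Equation (17), assumed form): exponential decay from
    the parameters at t_0 toward the steady state of the new ROM area."""
    a = np.exp(-(t - t0) / (tau_ms / 1000.0))  # t in seconds, tau in ms
    return np.asarray(p_ss) + (np.asarray(p_t0) - np.asarray(p_ss)) * a
```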
  • the window for the ROM area used for a typical image has a radius of 50 pixels.
  • the input image is divided into 50 x 50 window blocks where each of the 50 x 50 window blocks is referred to as a ROM window block whereas the center of each ROM window block is referred to as a grid position.
  • a set of N cluster centers on the image are pre-set as pixels separated at regular pixel intervals both horizontally and vertically.
  • N is set as four and the pixel interval ranged from 10 to 50 pixels. As the interval decreases, the consistency between the intervals increases, with the consequence of increasing the pre-computation time.
  • the bleaching factor β is also pre-computed for each ROM window block.
  • the display program is written in OpenGL, employing a vertex shader. Given the HDR photoquantities in RGB space, each spatial location in the HDR image is mapped to a vertex2D in the OpenGL program. The color of each vertex is then set using color3f(r,g,b) to the raw photometric value of the HDR image. The mapping presented in Equations (12), (13) and (15) is implemented as a vertex shader which is able to set the four cluster centers in addition to a β parameter.
  • the eye tracking system sends updated gaze positions to the OpenGL program over UDP at a rate of 60 Hz. Since the cluster centers and the bleaching factor are pre-computed, steps 312 - 316 of method 300 are replaced by the following steps.
  • the OpenGL (CPU) program locates four sets of N cluster centers belonging to the ROM window blocks with grid positions (the centers of the 50 x 50 ROM window blocks) closest to the user's current gaze position. It then linearly interpolates these four sets of N preset cluster centers given the distance from the user's current gaze position to each of the four grid positions, to obtain N interpolated preset cluster centers.
  • the bleaching factor β is also interpolated in the same manner.
  • a delay factor according to Equation (16) or (17) is then applied to the interpolated preset cluster centers and the interpolated bleaching factor β to obtain updated interpolated cluster centers and an updated interpolated bleaching factor β.
  • the updated interpolated cluster centers are then set as the elements of a uniform float4 vector and are input into the OpenGL vertex shader.
  • the updated interpolated bleaching factor β is input into the OpenGL vertex shader as well.
  • Each vertex then computes its new display value given the N updated interpolated preset cluster centers and the updated interpolated β parameter.
  • Equation (12) blends the tone-mapping based on the clustering algorithm with a linear tone curve in a per-pixel manner using the β parameter.
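  • The CPU-side interpolation of the pre-computed parameters might look like the sketch below; inverse-distance weighting is an assumption, as the text says only that the four nearest grid positions are linearly interpolated by distance:

```python
import numpy as np

def interp_params(gaze_xy, grid_xy, grid_params):
    """Blend the four pre-computed parameter sets (N cluster centers plus
    beta) of the grid positions nearest the current gaze position."""
    d = np.linalg.norm(np.asarray(grid_xy, float)
                       - np.asarray(gaze_xy, float), axis=1)
    w = 1.0 / np.maximum(d, 1e-6)          # nearer grid points weigh more
    w /= w.sum()
    return (w[:, None] * np.asarray(grid_params, float)).sum(axis=0)
```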
  • the image is discretized into a grid of 50 x 50 pixel segments, where each of these grid segments is referred to as a single ROM block or a pre-computed ROM area. This is the smallest ROM area which may be represented.
  • the parameters for the tone-mapping function (i.e. the N cluster centers) are pre-computed for each single ROM block.
  • the parameters of all possible contiguous 2 x 2, 3 x 3, ..., n x n ROM blocks across the entire image are also pre-computed whereby n is the larger dimension of the width or height divided by the discretization size.
  • For blocks at the image boundary, the "missing rows/columns" of the block are discarded and the pre-computation of the parameters is done only on the available image information.
  • the bleaching factor β is also pre-computed.
  • In the second example, the OpenGL program employs a pixel shader. Given the HDR photoquantities in RGB space, each spatial location in the HDR image is mapped to a …
  • the eye tracking system sends updated gaze positions to the program over User Datagram Protocol (UDP) at a rate of 60Hz. Since the cluster centers and the bleaching factor are pre-computed, steps 312 - 316 of method 300 are replaced by the following steps.
  • the program in the second example finds the pre-computed ROM area which is most applicable to the user's gaze and distance. This most applicable pre-computed ROM area is the pre-computed ROM area which overlaps the most with the estimated ROM area of the user.
  • the delay factor according to Equations (16) and (17) is then applied to the corresponding pre-computed tone-mapping parameters of the best-fit ROM area (i.e. the N cluster centers) and to the pre-computed bleaching factor β.
  • These updated pre-computed tone-mapping parameters and the updated bleaching factor β are then sent to the Graphics Processing Unit (GPU).
  • the GPU shader program in the embodiments of the present invention computes the tone-mapping function based on the updated pre-computed tone-mapping parameters and applies it to each pixel of the HDR image in parallel to obtain a tone-mapped image. An output image is then derived using the tone-mapped image and the updated bleaching factor β.
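  • Selecting the best-fit pre-computed ROM area in the second example amounts to a maximum-overlap search over the pre-computed block pyramid; a simple rectangle-overlap sketch follows (representing areas as (x0, y0, x1, y1) tuples is an assumption):

```python
def best_precomputed_rom(est_rom, blocks):
    """Return the pre-computed ROM area overlapping the estimated ROM most."""
    def overlap(a, b):
        w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
        h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
        return w * h
    return max(blocks, key=lambda blk: overlap(est_rom, blk))
```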
  • the system in the embodiments of the present invention adaptively tunes the dynamic range resources available in the display in correspondence to where the user is looking.
  • Some LDR snapshots of the HDR viewing system in the embodiments of the present invention are presented with the user looking at differing spatial locations. These images are compared to the static results of two state-of-the-art tone-mapping algorithms, one global method [Reinhard and Devlin 2005] and one local method [Fattal et al. 2002].
  • Figs. 7 - 13 show results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) as compared to the local and global mapping algorithms of [Fattal et al. 2002] and [Reinhard and Devlin 2005] respectively.
  • Figs. 7 - 13, parts (a) and (b), show the results of method 300 when a user is looking at different portions of the image.
  • Figs. 7(c) - 13(c) show the results when a global tone-mapping function (Reinhard) is applied, whereas Figs. 7(d) - 13(d) show the results when a local tone-mapping function (Fattal) is applied.
  • Results from method 300 (using the second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) are shown in Fig. 14.
  • Fig. 14(a) shows the image obtained from method 300 when the user is looking at one portion of the image, whereas Fig. 14(b) shows the image obtained from method 300 when the user is looking at the window.
  • Fig. 14(c) shows the image obtained from applying the local tone-mapping algorithm [Fattal et al. 2002] on the input image, whereas Fig. 14(d) shows the image obtained from applying the global tone-mapping algorithm [Reinhard and Devlin 2005].
  • Figs. 15(a) - (d) illustrate the output image from method 300 (using the second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the ROM area encompasses the entire image, 512 x 512 pixels, 128 x 128 pixels, and 32 x 32 pixels respectively. These are the results for the case where the user walks toward the screen. As shown in Fig. 15, the number of pixels in the ROM area decreases (from Fig. 15(a) to 15(d)) as the user gets closer. Consequently, as shown in Fig. 15, the contrast of the specific region the user is looking at increases as the user's distance decreases.
  • Figs. 16 and 17 also show the contrast and details shown in different regions of the output images obtained from method 300 (using respectively the first and second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user is looking at different regions of each of the images.
  • testing the system in the embodiments of the present invention is a somewhat problematic issue, as there is no similar existing system.
  • the point of the embodiments of the present invention is to adapt the widespread current commercial technology, in other words the LDR display devices, to display HDR content in a natural way.
  • the usability study was conducted with 23 subjects.
  • the subjects were pulled from diverse backgrounds such as media artists, photo hobbyists, media designers, visual communication designers, and market researchers.
  • the subjects also included research scientists and engineers in HCI, computer graphics, virtual reality, and image processing. About one quarter of the subjects consider themselves proficient in photography, or are photography enthusiasts.
  • the 3 images were labeled Image 1, Image 2 and Image 3 respectively.
  • the image used was a photo taken at a popular tourist location in Singapore, known as The Esplanade. Multiple exposures were taken at known exposure times in RAW mode, and used to create HDR images. The best LDR shot was chosen to represent Image 1.
  • To generate Image 2, we used the program qpfstmo [qpfstmo 2007] (based on pfstmo [Mantiuk and Krawczyk 2007]), which implements several tone-mapping algorithms, including [Reinhard and Devlin 2005], to produce the tone-mapped version of the image. The parameters were tuned by an image processing engineer to achieve the best possible result using this software.
  • test results revealed that subjects generally based their preference on image features such as brightness, contrast, vividness of the colors, realism and the amount of details.
  • More information is shown in the pie chart in Fig. 18, which shows the percentage of subjects choosing Image 1, 2 or 3 as the image giving them the greatest sense of realism.
  • Subjects preferred the interactive and dynamic feature of Image 3 (when asked on a Likert scale of 1 - 5, 78% agreed, of which 48% strongly agreed on this point). Subjects also liked Image 3 as it can enhance the area of interest. It was further recommended that more depth information may be included (for example, by using a 3D image instead) to improve the realism of the image.
  • method 300 did give the perception of expanding the dynamic range of the display.
  • subjects' opinions were sought as to whether the bright areas had more detail.
  • the result was that 64% of subjects agreed (of which 50% strongly agreed) that they perceived more detail in the bright area whereas 43% felt that even the dark area was also perceived to have more detail.
  • method 300 did produce an output image which gives the perception of increasing the dynamic range of the display.
  • method 300 achieved a greater sense of realism as compared to existing HDR to LDR tone-mapping techniques. Also, the study showed that the dynamic range of the display was increased perceptually, in particular in the brighter regions of the HDR image by method 300.
  • User study results using method 300 (with second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency)
  • the first study compares the result obtained from method 300 against the result of static tone-mapping algorithms on 10 reference images.
  • the second study involves a real-world reference scene constructed by the inventors.
  • the first study involved 20 gender-balanced and unbiased subjects between the ages of 16 - 35. Subjects were tested to ensure that they had normal color vision and were shown 10 different HDR images processed by the following three algorithms:
  • [C] method 300, the gaze-adaptive HDR display method according to the embodiments of the present invention.
  • the HDR images were selected from the HDR DVD provided by the text [Reinhard et al. 2005].
  • the independent variables in this study are the 3 algorithms (first level) and the 10 images (second level).
  • the dependent variable is the subject's preferential response.
  • the experiment was set up in a dark room (4 lux ambient luminance) with light provided only by the Barco Galaxy 9 HC+ projector used to display the images.
  • the projector has a maximum output of 9,000 ANSI lumens and a contrast ratio of 1700:1, with a resolution of 1280 x 1024.
  • the images were projected on a large display (2m x 1.5m) and a chair was set up 1.5m from the screen for immersive viewing.
  • a Seeing Machines FaceLab Version 4 was used for eye tracking.
  • an HP xw9400 workstation with a dual core AMD 3.20 GHz Opteron, 16 GB RAM and NVidia Quadro FX 4600 graphics card was used. Subjects were given the impression that all three algorithms were developed by the inventors and were asked to rank the algorithms based on their preferences and perceived realism for each of the resulting 10 images.
  • Fig. 20 illustrates the subjects' perceived realism of the results for the same 10 reference images.
  • the differences for images 3, 7, 9 and 10 were found to be insignificant at p < 0.05, although the difference for image 3 was found to be significant at p < 0.06.
  • the DR values of the 10 HDR test images used in this first user study are shown in Table 1.
  • This first user study revealed that if the dynamic range (DR) of the image was relatively low (DR less than 3, computed as log10(max/min) as shown in Table 1), users preferred the result of global tone-mapping. However, as the dynamic range of the image increased, users preferred the results of method 300. Artifacts produced by the global tone-mapping were rather obvious on the large display for these images.
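  • The DR measure used in Table 1 is straightforward to compute; for example:

```python
import numpy as np

def dynamic_range(hdr_lum: np.ndarray) -> float:
    """DR = log10(max / min) over the positive luminances of the image."""
    lum = hdr_lum[hdr_lum > 0]
    return float(np.log10(lum.max() / lum.min()))
```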
  • a Pearson correlation analysis indicates a large correlation [Cohen 1988] between the respondents' preferences and perceived realism for algorithms versus DR values, as can be seen in Table 2, which shows the correlation coefficient (r) between the respondents' preferences and the perceived realism versus image DR values.
  • the second user study involved 10 gender-balanced and unbiased subjects between the ages 16 - 45 with normal color vision.
  • Scene luminance was measured with a PR670 colorimeter. The parameters of the three algorithms were tuned using this data to set the global luminance appropriately (exposure time for Algorithm A, key value for Algorithm B, and S_g for Algorithm C).
  • this second user study required the subjects to focus on three regions of the scene to evaluate which of the three algorithms best preserved the details and realism.
  • the three regions are: (i) the brightly illuminated poster in the top section, (ii) the moderately illuminated bunch of flowers in the middle section, and (iii) the dimly illuminated objects under the table in the bottom section of the scene.
  • the user studies confirmed that both the realism and the details of the scene were best maintained by method 300 and hence the users found the resulting display of the HDR images more natural. These user studies also showed that the dynamic range of the display is perceptually increased, in particular in the bright regions of the HDR images.
  • the first and second user studies were conducted in a dark room to alleviate most of the ambient light issues.
  • ambient light variations may be included into the system in the embodiments of the present invention for the system to be used in different environments.
  • In [Mantiuk et al. 2008], ambient light when displaying images was considered in detail.
  • Implementing the work in [Mantiuk et al. 2008] may entail measuring the ambient light before displaying and modifying the final output to the user.
  • the work of [Mantiuk et al. 2008] also considers some aspects of the human visual system and may be incorporated into the embodiments of the present invention.
  • Fig. 22 shows the results when the global tone-mapping operator presented in [Reinhard et al. 2002] is used in place of the cluster-based tone-mapping operator.
  • Fig. 22(a) shows the original input HDR image
  • Figs. 22(b) and (c) show the output image of method 300, with the cluster-based tone-mapping function replaced by the global tone-mapping operator, when the user is looking at the center of the image and at the plant in the right portion of the image respectively.
  • the parameters of the global tone-mapping operator are calculated based on the ROM area alone and not the entire image. The global tone-mapping operator is then applied to the entire image.
  • a tone-mapping function that is based on a model of the human visual system and changes in real time, approximating the adaptation of this system, is derived. This is advantageous because of the following reasons.
  • the embodiments of the present invention can amplify the adaptive mechanisms of the human eye to compensate for the restrictions imposed by the display and also to enhance the viewing experience for the user.
  • the embodiments of the present invention can effectively offload some of the range compression and compensation that is done by the human visual system onto the display system thus perceptually increasing the dynamic range.
  • This coupled with transitional latency also taken from the adaptation process of the human visual system, allows the creation of a display capable of dynamically showing HDR content in a manner which is natural to the user.
  • the adaptive display system is able to display HDR images on an LDR device based on the user's gaze and distance (using eye tracking technology). This is advantageous because of the following reasons.
  • this allows the content of the display to be adapted according to where the user is looking, so that a particular area is presented in a natural way, but with appropriate contrast to represent the details in that particular area.
  • the system in the embodiments of the present invention can ease the amount of compression needed by reducing the spatial area that the tone-mapping function must operate on.
  • the display resources which are being optimized, such as contrast, intensity, etc. are largely only considered at the point where the user is looking. In regions where the user's gaze is not focused, resources are not critical and thus may be partially or fully discarded. For example, if a region is saturated or too dark, but the user is not looking in that area, there is no reason to apply resources to that area.
  • the latency in the response of the human eye to changes in viewing a scene is considered and a latency factor is included in the change of the parameters when the user's viewpoint changes. This is advantageous because rapid discrete changes which are not natural to the user can be avoided. Instead, in the embodiments of the present invention, the device displaying the scene makes rapid but gradual adjustments, continually adapting to what the user is viewing, hence making the adaptive display look natural.
  • the embodiments of the present invention are implemented in real time by tracking the gaze of a user viewing an HDR image in real time and by computing an appropriate tone-mapping function for the HDR image at real-time rates, using both pre-computation for a particular image and real time GPU-based rendering.
  • the pre-computation for the tone-mapping function is done on the ROM-subdivided images. After determining the pre-computed ROM areas, the latency computation etc., may be used without modification to achieve satisfying results.
  • This real time implementation of method 300 is achieved by using a shader program on the GPU to perform all necessary real time operations and is advantageous because it can improve the computational time of the locally adaptive operator in the embodiments of the present invention. Since the real time approach taken in the embodiments of the present invention feeds a set of parameters to a shader program and a latency factor is included in the change of the parameters when the user's viewpoint changes, it is also possible to continually smooth the transition from one contrast enhancement to another by creating a gradual delay in each of the parameters. Further advantages of the embodiments of the present invention are as follows.
  • the embodiments of the present invention can create a dynamic display system which adapts interactively to the user's view by creating different LDR images. According to the studies done (one comparing the display of an HDR image to the actual HDR scene), the embodiments of the present invention were successful in creating the illusion of viewing a HDR scene when in reality, the image is displayed on a LDR display.
  • the embodiments of the present invention can enhance the effectiveness of other global tone-mapping operators. Any global operator may be applied and its effectiveness can be improved using the display system according to the embodiments of the present invention. Global tone-mapping operators and some local tone-mapping operators can hence benefit from this system. However, even though most tone-mapping operators can be used with this system, it may be difficult to apply a small percentage of local tone-mapping operators.
  • the interactive gaze-aware component in the embodiments of the present invention may be expanded to virtual and telepresence applications or any other application which benefits from adding a dynamic element to a static scene. Since telepresence is usually considered as not only "the feeling of being there" but also "the total response to being in a place and being able to interact with the environment" [Riva et al. 2003], the system in the embodiments of the present invention can be effectively used for such an application. Because embodiments of the present invention simulate the inherent interaction of the human eye in response to light, they will therefore be able to provide a natural, interactive experience in such applications.
  • the embodiments of the present invention may also be used for gaze-selected active refocusing of images.
  • an image may be retroactively refocused (or appear to have its focal plane changed). Intentional blur is commonly used by photographers to draw attention to a subject.
  • with an active refocusing technique, it would be possible to show a refocused scene with an interactive component.
  • content in the same focal plane would be in focus, while objects closer or farther away would be appropriately out of focus.
  • the embodiments of the present invention can also help to represent the properties of the human eye more appropriately.
  • HDR display devices: appropriate for showing HDR images, but not common as a display medium.
  • Embodiments of the present invention: interactively display reasonable LDR images. Although the method according to the embodiments of the present invention is tuned to only a single user's view, it is a truly effective method of extracting dynamic content from a static image.
REFERENCES
RIVA, G., ET AL. 2003. Presence 2010: The emergence of ambient intelligence. In G. Riva, F. Davide, and W.A. IJsselsteijn (Eds.), Being There: Concepts, effects and measurement of user presence in synthetic environments. IOS Press, 60-81.
SCHLICK, C. 1994. Quantization techniques for the visualization of high dynamic range pictures. Photorealistic Rendering Techniques, Proc. 5th Eurographics Rendering Workshop, 7-20.

Abstract

A method for displaying an HDR image on a LDR display device to a user. The method comprises the steps of repeatedly: (1a) estimating a gaze position of the user by tracking at least one eye of the user, the gaze position of the user being a position on a screen of the LDR device at which the user is looking; (1b) deriving an output image from the HDR image based on the estimated gaze position of the user; and (1c) displaying the output image on the LDR display device.

Description

A Method and System for Displaying an HDR Image on a LDR Display Device
Field of the invention
The present invention relates to a method and system for displaying an HDR image on a LDR display device.
Background of the Invention
Although the dynamic range of commercial displays is gradually increasing, there are currently only a small number of displays capable of showing relatively large dynamic ranges [Seetzen et al 2004, Bimber and Iwai 2008].
Unfortunately, many of these displays are fairly costly and have not become commercially available. Even if high dynamic range (HDR) display technology becomes prevalent, the majority of today's display devices only show content in a low dynamic range (LDR).
An option to address the problem of displaying HDR content is to compress the range of the HDR image using a local or global tone-mapping operator so that the HDR image may be shown appropriately on LDR display devices. [Reinhard and Devlin 2005] provides a survey of currently available methods of displaying HDR content using tone-mapping operators (also referred to as dynamic range reduction operators or tone-mapping functions). Generally, these methods can be grouped according to the type of operators (either global operators or local (spatially varying) operators) they use. A few examples of proposed methods using the global operators and the local operators are listed as follows.
• Global operators: [Larson et al. 1997; Tumblin and Rushmeier 1993; Larson et al. 1994; Drago et al. 2003; Reinhard and Devlin 2005; Reinhard et al. 2002; Schlick 1994; Ferwerda et al. 1996; Tumblin et al. 1999; Pattanaik et al. 2000] • Local (spatially varying) operators: [Mantiuk et al. 2005; Fattal et al. 2002; Tumblin and Turk 1999; Li et al. 2005; Ashikhmin 2002; Durand and Dorsey 2002; Pattanaik et al. 1998].
There are tradeoffs when considering these two types of operators. As a general observation, global operators tend to perform faster without introducing a large amount of artifacts but the use of global operators causes a loss of local image details and hence, a lack of realism. Local operators, on the other hand, are better at preserving local contrast and image details. However, local operators often introduce visual artifacts into the LDR results hence, also resulting in a lack of realism.
[Reinhard and Devlin 2005] also proposes a method in which a tone-mapping operator is constructed based on photoreceptor response by modeling the response of cones in the eye to produce a static LDR image. However, [Reinhard and Devlin 2005] considers the effect of cones and rods in HDR to LDR tone-mapping but does not fully consider the effect of photopigment depletion.
As mentioned in [Mantiuk et al. 2008], the recent trend in displaying HDR images is towards interactive techniques. Examples of these are [Farbman et al. 2008] and [Lischinski et al. 2006]. However, these are strictly image editing techniques and do not interact with the user in real time. There are also interactive techniques which show local previews of areas as the user moves a pointer into a specific area. For example, both [Photomatrix 2003] and the Bright-side HDR display webpage [Dolby 2006] use a method whereby the user moves a mouse pointer over sections of an image and in the section where the pointer is, the luminance of the section is adjusted to an appropriate level. However, in such methods, propagating the change to the entire image produces rapid changes that are very unnatural. Furthermore, such methods are not very effective in displaying various local regions of a given HDR image without resulting in overwhelming global changes. [Toyama 2009] describes another interactive method which utilizes a GUI to aid the user in deciding how to map the HDR content to LDR content. The interaction (for example, selection of the region of interest, selection of parameters, application of the parameters and running of the tone-mapping function) between the user and the system in this method is done on a GUI system and a final "tone-mapped" image is obtained. However, the final "tone-mapped" image obtained from this method is not dynamically changing.
Work related to HDR videos has also been presented, for example in [Kang et al. 2003]. However, such work uses dynamic changes in video to increase the overall dynamic range of the video and not a single static HDR image to produce a dynamic version of the image.
Summary of the invention
The present invention aims to provide a new and useful method and system for displaying HDR content on a LDR display device. The present invention seeks to do so in such a way that the resulting image looks as realistic as possible while showing all the image content.
In general terms, the present invention proposes finding the position on a LDR screen at which a user is looking, and dynamically adjusting a HDR image displayed on the screen by appropriately varying a tone-mapping function according to where the user is looking (that is, the gaze position of the user).
For example, in regions which the user is not looking at, little of the "contrast resources" (that is, available dynamic range) of the display device may be applied. These regions correspond to areas of the retina far from the fovea, where the proportion of cones in the retina is lower. The fovea (as shown in Fig. 1 ) is located in the center of the macula region of the retina and is responsible for sharp central vision [Hunt 1995]. On the other hand, contrast resources may be allocated to the region where the user's gaze is centered, to make more visible the detail that is present in that region of the HDR image.
To vary the tone-mapping function according to where the user is looking, the present invention in some forms proposes using eye tracking technology to determine the user's distance and gaze direction to estimate the portion of the image projected on the most sensitive region of the user's retina (the macula).
The present invention has some similarity to known techniques of level-of-detail (LOD) and eye tracking [Murphy and Duchowski 2001]. In [Murphy and
Duchowski 2001], rendering of a scene is done dynamically by rendering greater detail at the point where the user's gaze is focused and this concept follows the phenomenon of foveation. The method in [Murphy and Duchowski
2001] changes the level of detail, whereas the present invention proposes changing the luminance which can in turn increase tonal detail perceived by a user. Similar work using eye tracking based texture mapping in an openGL environment is presented in [Nikolov et al. 2004].
The present invention in some forms further proposes designing the tone-mapping function to model the human visual system, given the characteristics and requirements of the human gaze-aware visual system.
The human visual system uses several methods to interactively adapt to the incredible range of light intensities in our day to day lives, continually changing to effectively perceive the information we are looking at. Much of this ability of the human visual system has been shown to be based on range compression [Jayant et al. 1993], while maintaining adequate detail in the perceived image. For a display system to be able to offload part of this range reduction, it is preferable that it adequately models and mimics the human visual system.

A brief description of the physiology of the human eye is presented below. Often, the eye is modeled as a camera, and much of the design of any camera is inspired by the eye. The pupil is equivalent to the aperture of a camera. Just as a photographer adjusts the aperture of a camera to allow more or less light to reach a film or a photosensor, the eye adjusts the pupil to globally adjust the light entering the eye, and consequently the retina. The diameter of the pupil may range from 2 mm to 5 - 8 mm depending on the age of the person (usually 5 mm for the elderly and 8 mm for young adults). This change in pupil diameter may be modeled as a global change in illumination, and is presented in greater detail in [Stark and Sherman 1957; Clarke et al. 2003].
Multiple studies in the human visual system [Dowling 1987; Haig 1941] show that the majority of light adaptation occurs in the retina. In the retina, there are two types of photoreceptors: approximately 100 million rods, appropriate for dim light and night vision; and 6 million cones for perceiving daylight scenes, color, and high contrast vision. Interestingly, the structure of each type of receptor is very similar. Of particular interest are the photopigment molecules of the cones which are embedded in each photoreceptor (there may be up to 10,000 of these cones). More details about the specific structure of the photoreceptors can be found in [Paupoo et al. 2000].
Previous medical research has shown that photopigment molecules consist of two parts, a chromophore (retinal), and a protein background (opsin). The complex of the chromophore and the opsin are termed rhodopsin. When a photon of the correct wavelength impinges upon a particular chromophore, it instantly changes shape in a process called photoisomerization. This is similar to CMOS and CCD sensors and the photoelectric effect that is exploited [Boyle and Smith 1970].
Rhodopsin absorbs incoming light, giving the rhodopsin cells a characteristic purple (termed visual purple) colour. Photoisomerization causes bleaching of the purple, turning it yellow and allowing light to pass deeper into the photopigments below. Bleached photopigments are regenerated continually by the eye, but as the intensity of the light impinging on the photoreceptor increases, the regeneration is unable to match the depletion. As an example, below 100 cd/m2, photopigment replenishing can stay ahead of bleaching. After 100 cd/m2, bleaching becomes a critical factor for the local adaptation of a scene. For example, at 5,000 cd/m2, the amount of unbleached photopigment is about 50%. At 1,000,000 cd/m2, the amount of unbleached photopigment is at about 0.2%. This phenomenon shows that the eye is in fact doing "local" compression. Furthermore, the logarithmic depletion of the photopigments imposes a secondary compression element to the ever present logarithmic range compression which is present in all perceptual sensing in the human body.
Further results from studies on photoreceptors show that there is integration in neighboring cells due to the "wiring" of the photoreceptors to the retina's ganglion cells [Sterling 1999]. As mentioned in [Pattanaik et al. 2000; Reinhard and Devlin 2005], the cones in the retina are laterally connected to each other through electrical synapses. This signal is brought to the next layer of the vision system through glutamatergic synapses to bipolar cells. These bipolar cells are also laterally connected through electrical synapses and make forward connections to glutamatergic synapses onto the dendrites of ganglion cells [Dowling 1987]. This results in the ganglion cells receiving information from several bipolar cells which each acquire information from several cones. Furthermore, a single cone contributes to several ganglion cells. The work of Sterling [Sterling 1999] also states that "quantitative anatomical studies have revealed that a ganglion receives many synapses from a bipolar cell aligned with the center of its dendritic tree, but few synapses from the edge of its dendritic tree. Most of the synapses formed by these bipolar cells are onto the neighboring ganglion cells with which they are more closely aligned".

When viewing a HDR scene, the eye is continually adapting. As a person looks at different content in a HDR scene, his or her human visual system continually adjusts the manner in which light is perceived. There is a fair amount of local and global adaptation that is done in parallel in response to the changes in the display. Global adaptation occurs by means of the adjustment of the pupil. Just as a photographer adjusts the aperture diameter of a camera to regulate the amount of light that film or sensor arrays are exposed to, the human eye adjusts the pupil to similarly regulate the amount of incoming light. Local adaptation occurs in the photopigment molecules which are produced in the human eye, allowing the transduction of light into a physiological response. These photopigments are continually "bleached" and therefore must be regenerated or re-assembled to re-enable their function. This physiological mechanism contributes to a local adaptation, as opposed to the global adaptation by means of pupil dilation.
Given the physiology of the human eye, a faithful model compressing an HDR scene is simply unrealistic to implement. Furthermore, as suggested in [Reinhard and Devlin 2005], this would be adding unnecessary complexity to the model. However, the physiology of the human eye serves as a basis for how a dynamic display which reacts to changes in a similar way as a human eye can be created. For example, though the macular pigment is certainly an important component in vision, it is not appropriate to model this component for creating a dynamic display. Rather, the component that is more likely to be of primary consideration is the performance of the photoreceptors. As stated in [Pattanaik et al. 2000] and multiple studies of human vision, the pupil contributes very little to the overall adaptation of the human visual system.
In one aspect, the present invention proposes using a first step of performing adaptation (i.e. long term adaptation) in which the image is linearly scaled in intensity such that the mean quantimetric value [Mann 2001] maps to the midpoint of the output range before gamma correction is applied. This step corresponds to the linear scaling during pupil expansion or contraction.
Modeling the response of the photoreceptors is significantly more complex as compared to modeling the pupillary reflex of the eye, partly because there is less information known about the process. Additionally, at this first stage of processing in the human visual system, the processing chain is remarkably intricate. In terms of pupillary reflex, the present invention proposes using well-known physical concepts, for example pupil diameter and adjustment times, for the amount of light entering the eye. In certain embodiments, the present invention further proposes using the following points 1 to 6 to model opsin transductance and the methods in which it adapts to differing light intensities.
1. The eye has various global adaptation mechanisms such as pupil size and general photoreceptor response. Hence, the present invention in some forms proposes incorporating a global adaptation mechanism.
2. The cone receptors that are located in the macula provide higher visual acuity than rods. The maximal cone concentration is in the very center of the macula (the fovea). Hence, the present invention in some forms proposes a window-based approach addressing locality.
3. Cone receptors act as a network, where a cluster of cones contribute to the output of the ganglion cells and receptors closer to the dendritic tree based in the ganglion cells contribute more to the perception of the incoming light than those at the edges. Hence, the present invention in some forms proposes a cluster-based approach to model the light response.
4. Photoreceptor bleaching is largely a local phenomenon acting in tandem with global adaptation (reacting to environmental lighting). Transductance is changed corresponding to photons impinging on particular photopigments causing photoisomerization. Hence, the present invention in some forms proposes a method which may be used to combine global adaptation with opsin bleaching/photoreceptor response.
5. A time factor is involved. When looking at bright objects over time, opsin bleaching allows greater detail to be observed, essentially compressing the dynamic range more than the other physical compression systems allow.
Hence, the present invention in some forms proposes employing a time adaptation process in tone-mapping.
6. The fovea has a much higher concentration of cone receptors, and thus contributes more to the overall perceived image. The maximal cone concentration is in the very center of the fovea. It should also be noted that 50% of the fibers in the optic nerve are used for transmitting information from the fovea, while the remaining fibers carry information from the rest of the retina. Hence, as described above, the present invention in some forms proposes a tone-mapping function according to where the user is looking.
Considering this final point, in [Pattanaik et al 2000], it is stated that "most retinal cells vary their response only within a range of intensities that is very narrow if compared against the entire range of vision. Adaptation processes dynamically adjust this narrow response function to conform better to the available light". Based on this observation, the present invention proposes in some forms increasing the contrast in the areas corresponding to the higher visual acuity in the fovea, as shown graphically in Fig. 2. Fig. 2 includes a histogram 202 of the number of pixels in the region corresponding to the ROM area for each possible value of luminescence (0 to 1.0). The linear mapping function is shown as the dashed line 204 and the adjusted mapping function is shown as a solid line 206. As one can see from the histogram 202 in Fig. 2, a relatively high proportion of the pixels in the ROM area have a luminance centered in a narrow sub-range of luminescence values (this sub-range is 0.35 to 0.65). The present invention in some forms proposes adjusting the tone-mapping function which transfers the HDR content to the LDR content according to this observation, to expand the dynamic range of the ROM area.
The present invention in some forms further proposes implementing the display system at real time speeds and varying the tone-mapping function in real time by performing much of the tone-mapping in an accelerated manner using a balance of pre-computation and real time GPU-based rendering.
More specifically, a first aspect of the present invention provides a method for displaying an HDR image on a LDR display device to a user, the method comprising the steps of repeatedly: (1a) estimating a gaze position of the user by tracking at least one eye of the user, the gaze position of the user being a position on a screen of the LDR device at which the user is looking; (1b) deriving an output image from the HDR image based on the estimated gaze position of the user; and (1c) displaying the output image on the LDR display device.
The invention may alternatively be expressed as a computer system for performing such a method. This computer system may be integrated with a device for capturing HDR images. Preferably, the computer system performs the method by running a display program with a shader program. The invention may also be expressed as a computer program product, such as one recorded on a tangible computer medium, containing program instructions operable by a computer system to perform the steps of the method.
Brief Description of the Figures
An embodiment of the invention will now be illustrated, for the sake of example only, with reference to the following drawings, in which:
Fig. 1 illustrates the physiology of a human eye;
Fig. 2 illustrates a histogram of pixel luminescence values in a region corresponding to a ROM area, a linear tone-mapping function and an adjusted tone-mapping function;
Fig. 3 illustrates a flow diagram of a method 300 which displays an HDR image on an LDR display device according to an embodiment of the present invention;
Fig. 4 illustrates a cluster-based tone-mapping function derived in a first example of sub-step 314b of step 314 of method 300 when the number of clusters N is 4;
Fig. 5 illustrates slopes derived for each cluster center in a second example of sub-step 314b of step 314 of method 300 when the number of clusters N is 4;
Fig. 6 illustrates a piecewise linear function and a cluster-based tone-mapping function derived from the slopes of Fig. 5;
Figs. 7(a) and (b) illustrate results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 7(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 8(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 8(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 9(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 9(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 10(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 10(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 11 (a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 11(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 12(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 12(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 13(a) and (b) illustrate further results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 13(c) and (d) illustrate results after applying a global and local tone-mapping algorithm on the input image respectively;
Figs. 14(a) and (b) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when a user is looking at different portions of the image and Figs. 14(c) and (d) illustrate results after applying a local and global tone-mapping algorithm on the input image respectively.
Figs. 15(a) - (d) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) as the user walks nearer the image;
Figs. 16(a) - (c) illustrate results of method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user looks at different portions of the image;
Figs. 17(a) - (c) illustrate results of method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user looks at different portions of the image;
Fig. 18 illustrates results of a user study performed using method 300 (using first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
Fig. 19 illustrates results of a first user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
Fig. 20 illustrates further results of the first user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency);
Fig. 21 illustrates a real world HDR reference image used in a second user study performed using method 300 (using second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency); and
Fig. 22(a) illustrates an original HDR image and Figs. 22(b) - (c) illustrate results of method 300 when the user is looking at different portions of the image after replacing the cluster-based tone-mapping function with a global tone- mapping operator.
Detailed Description of the Embodiments
Referring to Fig. 3, the steps are illustrated of a method 300 which is an embodiment of the present invention, and which displays an HDR image on a LDR display device. In method 300, the user's distance to the display and information from eye tracking are used to produce an appropriately tone-mapped dynamic LDR image, able to react to a change in the user's gaze.
The input to the method 300 is an HDR image. In step 302, a long term adaptation component is computed, and in step 304 the global scale of the input HDR image is adjusted to form the image marked as "A". In step 306, the distance between the user and the screen of the display device is determined whereas in step 308, an X-Y position of the user's gaze on the screen of the display device is determined. In step 310, a region of observation by the macula (ROM) area is estimated using the distance between the user and the screen of the display device (determined in step 306) and the X-Y position of the user's gaze on the screen of the display device (determined in step 308). In step 312, illumination data for each of the ROM area pixels is determined, and this is used to calculate a cluster-based tone-mapping function in step 314. Further in step 314, cluster-based tone-mapping is performed on the input HDR image using the cluster-based tone-mapping function, to form the image shown as "B". In step 316, a bleaching factor is calculated. This is used together with the globally scaled image A from step 304 and the cluster-based tone mapped image B from step 314 to perform illumination scaling on the image to derive an illumination scaled image C. The values in the illumination scaled image are then converted from luminance values to RGB values. This gives the "output image" which is displayed using the LDR display device.
1. Adjusting global scale of image (Steps 302 and 304)
Steps 302 and 304 correspond to the long term adaptation of photoreceptors and pupil adjustment in a human eye.
In steps 302 and 304, a long term adaptation component S_g is computed and is used to adjust the global scale of the image. This is performed by calculating the illumination value (also referred to as the HDR luminance) for each pixel in the HDR image in step 302 and using the HDR luminance of each pixel to calculate the long term adaptation component S_g. The long term adaptation component S_g is in turn used to adjust the global scale of the image accordingly in step 304.
The long term adaptation component S_g is defined in Equation (1), where I(i,j) is the HDR luminance of a given pixel at spatial location (i,j) of the input image, and W and H are the corresponding width and height of the input image in pixels. In essence, Equation (1) calculates an average illumination value for the pixels in the HDR image (the average HDR luminance) using the illumination values calculated in step 302, and maps the average HDR luminance of the HDR image to approximately the middle of the display range of the display device.

S_g = (0.5 · W · H) / (Σ_{i,j} I(i,j))    (1)
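By way of illustration only, a minimal numpy sketch of steps 302 and 304 is given below. The function name, the use of a normalized 2-D luminance array and the interface are assumptions for this sketch, not part of method 300.

    import numpy as np

    def global_scale(hdr_luminance):
        """Long term adaptation (steps 302/304): scale the image so that the
        mean HDR luminance maps to the middle (0.5) of the output range."""
        # Equation (1) reduces to 0.5 divided by the mean luminance.
        s_g = 0.5 / hdr_luminance.mean()
        # Image "A": globally scaled luminance (gamma correction comes later).
        return s_g, hdr_luminance * s_g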
2. Estimating Region of Observation by the Macula (ROM) area (Steps 306, 308 and 310)
The ROM area may be referred to as an area on a screen of the display device at which the user is looking. In one example, it is the center of the viewer's gaze on the screen and refers to the area of the screen that falls into the macular region inside the retina of the viewer's eyes.
The greatest concentration of photoreceptors in the human eye is in a very small region of the retina termed the fovea. This region, the narrow center of our vision, is the most sensitive, whereas the larger region encompassing the fovea, called the macula, contains the cone cells. Given the optics of the human visual system, the visual angle projected on the macula of the user's eye is approximately α_macula = 18°. If a user is standing at a position (X,Y,Z) with respect to the screen center and is looking at an area centered at a point (u,v) on the screen of the display device, the ROM area on the screen of the display device will generally be an oval shape.
In steps 306, 308 and 310, the ROM area on the screen is estimated. This is performed by determining the user's distance from the screen (also referred to as the view distance) in step 306 to determine the position (X,Y,Z) as mentioned above, calculating the X-Y position of the user's gaze on the screen in step 308 to determine the point (u,v) (also referred to as the view point) and subsequently, estimating the ROM area on the screen in step 310 using the position (X,Y,Z) from step 306 and the point (u,v) from step 308.
Steps 306 and 308 are performed using eye tracking technology. In step 310, the ROM area on the screen is estimated as a square window centered at the point (u,v) with sides of length l_ROM (in pixels), where l_ROM is defined according to Equation (2). In Equation (2), L_h is the height of the screen in meters and L_ROM = D · α_macula is the size of the ROM area in meters. α_macula is the visual angle projected on the macula and is set to 18°, whereas D is the distance between the center of the user's gaze on the screen (point (u,v) in pixels) and the user's position (X,Y,Z) in real space, and is defined according to Equation (3), whereby L_w and L_h are the width and height of the screen measured in meters, and W and H are the corresponding width and height of the image in pixels (the terms (u/W − 1/2)·L_w and (v/H − 1/2)·L_h convert the gaze point from pixels to meters relative to the screen center).

l_ROM = (L_ROM / L_h) · H    (2)

D = √( (X − (u/W − 1/2)·L_w)² + (Y − (v/H − 1/2)·L_h)² + Z² )    (3)
In step 310, the ROM area is estimated as a square to simplify the required calculations and α_macula is set as 18°. Alternatively, the ROM area may be estimated as any other shape, for example an oval, and α_macula may be set to a different value.
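For illustration, a minimal sketch of the ROM window estimate of Equations (2) and (3) is given below. The center-relative pixel-to-meter convention, the use of α_macula in radians and all argument names are assumptions of this sketch.

    import numpy as np

    def rom_side_pixels(gaze_uv, user_xyz, screen_wh_m, image_wh_px,
                        alpha_macula=np.deg2rad(18.0)):
        """Estimate the side length l_ROM (in pixels) of the square ROM window
        (step 310), per Equations (2) and (3)."""
        u, v = gaze_uv                  # gaze position on screen, in pixels
        W, H = image_wh_px              # image width and height, in pixels
        L_w, L_h = screen_wh_m          # screen width and height, in meters
        X, Y, Z = user_xyz              # user position relative to screen center
        gx = (u / W - 0.5) * L_w        # gaze point in meters from screen center
        gy = (v / H - 0.5) * L_h
        D = np.sqrt((X - gx) ** 2 + (Y - gy) ** 2 + Z ** 2)   # Equation (3)
        L_rom = D * alpha_macula        # ROM area size in meters
        return int(round(L_rom / L_h * H))                    # Equation (2)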
3. Performing cluster-based tone-mapping
In method 300, the lowest level of information available is the individual image pixels corresponding to the pixels (or photosites) present in the camera's sensor. To mimic the clustering of a signal as it is provided to the brain from the ganglion cells, pixels with comparable intensities in the ROM area are clustered in accordance with [Sterling 1999] in step 312 and a tone-mapping function is obtained from this clustering and is applied on the HDR image in step 314.
More specifically, in step 312, the illumination data of each of the ROM area pixels is determined whereas in step 314, tone-mapping is performed on the entire image using a tone-mapping function that is determined based on the illumination data of each of the ROM area pixels.
Step 314 comprises sub-steps 314a and 314b. In sub-step 314a, clustering is performed on the ROM area pixels whereas in sub-step 314b, a cluster-based tone-mapping function is obtained based on the clustering of the ROM area pixels and is applied on the ROM area.
Sub-step 314a
In sub-step 314a, the ROM area pixels are first grouped into N classes based on the illumination data of each ROM area pixel as calculated in step 312. This is performed using a K-means clustering algorithm proposed in [Kanungo et al. 2002; Kanungo et al. 2004]. After performing the K-means clustering algorithm on the ROM area pixels, a set of N cluster centers (denoted as [C_0, ..., C_{N−1}]) is obtained. Hence, the tone-mapping operator is reduced to an N-dimensional vector via the clustering.
In one realisation of sub-step 314a, a K-means clustering algorithm is used to cluster the ROM area pixels and N is set to 4. Alternatively, the ROM area pixels may be clustered using other clustering algorithms and N may be set to any other value. However, a value of N higher than 4 is not preferred as this results in a higher computational cost, and experiments on different test images have shown that increasing the number of clusters N beyond 4 does not make a noticeable difference in the output. This could be because the neighbouring pixels for most of the image belong to fewer than 4 different intensity regions.
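As an illustrative sketch of sub-step 314a only, a simple Lloyd-style K-means over the one-dimensional ROM luminances could look as follows; it stands in for the algorithm of [Kanungo et al. 2002; Kanungo et al. 2004], and the function name and interface are assumptions.

    import numpy as np

    def rom_cluster_centers(rom_luminance, n_clusters=4, n_iters=20, seed=0):
        """Sub-step 314a (sketch): K-means on the 1-D luminances of the ROM
        area pixels. Returns sorted cluster centers and their populations."""
        values = np.asarray(rom_luminance, dtype=float).ravel()
        rng = np.random.default_rng(seed)
        centers = rng.choice(values, size=n_clusters, replace=False)
        for _ in range(n_iters):
            # Assign each pixel to its nearest center, then recompute means.
            labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
            for c in range(n_clusters):
                if np.any(labels == c):
                    centers[c] = values[labels == c].mean()
        order = np.argsort(centers)
        populations = np.array([(labels == c).sum() for c in order])
        return centers[order], populations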
Sub-step 314b

In sub-step 314b, a tone-mapping function, which is a piecewise linear function, is derived based on the clustering of pixels in the ROM area and is then applied to the entire image to obtain a tone-mapped image. The cluster centers are used to impose knots and are used as control points in the tone-mapping function. First and second examples of sub-step 314b are presented as follows.
First example of sub-step 314b
In a first example of sub-step 314b, the slope S_i of the piecewise linear tone-mapping function is derived according to Equation (4). In Equation (4), C_i are the cluster centers where i = {1,...,N}, C_0 = 0, C_{N+1} = 1, and O_i = 1/N + O_{i−1} where i = {2,...,N}, O_0 = 0, O_1 = 1/N, and O_{N+1} = 1.

S_i = (O_{i+1} − O_i) / (C_{i+1} − C_i),  i = {0,1,...,N}    (4)

Although this adjustment is chosen in such a way as to allow the pixels in the ROM area to be expressed in the dynamic range of the display, the change is applied to the entire image. N may be set as 4 in sub-step 314b. The cluster-based tone-mapping function S_i when N = 4 is shown in Fig. 4. Using Equation (4), each piecewise linear region of the tone-mapping function as shown in Fig. 4 is allocated a particular gain (slope). The cluster centers are then mapped to equidistant intensities in the range of the tone-mapping function in the process of obtaining the tone-mapped image.
Considering Fig. 4, when N > 3, the cluster centers are closer to the areas where the distribution of intensities is greater. As a result, the contrast in these regions will be enhanced, and thus N is preferably set as a value greater than 3 in the first example of sub-step 314b.
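A minimal sketch of this first example is given below, assuming normalized luminances in [0,1] and sorted cluster centers; np.interp evaluates exactly the piecewise linear curve through the knots of Equation (4). Names and interface are assumptions.

    import numpy as np

    def tonemap_equidistant(image, centers):
        """First example of sub-step 314b (sketch): sorted cluster centers are
        used as knots C_1..C_N mapped to the equidistant outputs O_i of
        Equation (4); the curve is then applied to the whole image."""
        centers = np.sort(np.asarray(centers, dtype=float))
        N = len(centers)
        C = np.concatenate(([0.0], centers, [1.0]))                 # knots C_0..C_{N+1}
        O = np.concatenate(([0.0], np.arange(1, N + 1) / N, [1.0]))  # outputs O_0..O_{N+1}
        return np.interp(image, C, O)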
Second example of sub-step 314b

Generally, pixel clusters will be best viewed if each cluster center is mapped to the middle of the LDR range [0,1] and with a slope inversely proportional to its luminance, according to Weber's law of just noticeable difference (JND) [Wyszecki and Stiles 2000]. Since it is not possible to map all cluster centers to the middle of the display range simultaneously, in a second example of sub-step 314b, a desired slope S_i^d is calculated for each cluster center C_i as 0.5/C_i. This creates a linear segment with a gradient equal to the slope S_i^d for each cluster, as shown in Fig. 5.
The linear segments are then joined to produce a monotonically non-decreasing, piecewise linear tone-mapping function. However, if the piecewise linear tone-mapping function is formed by matching each segment of the function to the desired slope S_i^d, the function will likely extend past 1, as shown by the solid line in Fig. 6. Hence, in the second example of sub-step 314b, a weighted normalization procedure based on the cluster populations P_i (i.e. the number of pixels in cluster i) is performed on the linear segments prior to joining them, to derive a piecewise linear tone-mapping function S_i defined according to Equation (5). In Equation (5), B_i is the cluster boundary which falls between clusters C_{i−1} and C_i. The dotted line in Fig. 6 gives an example of this piecewise linear tone-mapping function S_i, which is also referred to as the cluster-based tone-mapping function.
The use of Equation (5) to obtain the piecewise linear tone-mapping function in the second example of sub-step 314b is advantageous as clusters which are heavily populated will not be significantly compressed whereas clusters with low populations will receive greater compression. Furthermore, Equation (5) is not completely dependent on histogram population and accordingly does not suffer from the extreme contrast enhancement which is sometimes present in the result of histogram equalization.
In both the first and second examples of sub-step 314b, a piecewise linear tone-mapping function is used due to its flexibility [Mantiuk et al]. Using a piecewise linear function, the slope in dynamic regions where clusters of intensities exist can be increased, whereas this adjustment can be compensated for by decreasing the slope in low population regions. Alternatively, cubic B-splines can be used to model the tone-mapping function, therefore providing C2 continuity. However, the piecewise linear method is preferable as it can be easily implemented as a GPU shader program, able to easily achieve real-time performance.
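Because the precise weighted normalization of Equation (5) depends on the cluster populations, the sketch below implements one plausible scheme with the properties described above (monotonic, ends at 1, and heavily populated clusters are compressed less); the specific blending rule is an assumption, not the claimed Equation (5).

    import numpy as np

    def tonemap_population_weighted(image, centers, populations):
        """Second example of sub-step 314b (sketch): desired Weber slopes
        S_i^d = 0.5/C_i, joined into a monotonic piecewise linear curve via an
        assumed population-weighted normalization. centers and populations are
        assumed sorted consistently (e.g. from rom_cluster_centers above)."""
        centers = np.asarray(centers, dtype=float)
        pops = np.asarray(populations, dtype=float)
        # Cluster boundaries B_i: midpoints between consecutive centers, padded.
        B = np.concatenate(([0.0], 0.5 * (centers[:-1] + centers[1:]), [1.0]))
        widths = np.diff(B)
        desired = 0.5 / np.maximum(centers, 1e-6)   # desired slope per cluster
        share = pops / pops.sum()
        extent = np.sum(desired * widths)           # range if slopes were unscaled
        # Assumption: high-population clusters keep more of their desired slope.
        blended = desired * (share + (1.0 - share) / extent)
        slopes = blended / np.sum(blended * widths)  # exact normalization to 1
        outputs = np.concatenate(([0.0], np.cumsum(slopes * widths)))
        return np.interp(image, B, outputs)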
4. Calculating the bleaching factor (Step 316)
In the human visual system, the key illumination adaptation (and corresponding contrast optimization) is done by the photoreceptors. The adaptation is known to be accomplished by a mechanism of bleaching in the photopigment.
The cluster-based tone-mapping performed in step 314 corresponds to the bleaching mechanism in the photopigment, whereas the amount by which the bleaching takes place in the human visual system is governed by Equation (6), which is often termed the Naka-Rushton equation. This equation is a function of the unbleached photopigment at a particular instant and follows from the experiments described in [Naka and Rushton 1966]. In Equation (6), M is the value that causes the half-maximal response and σ is a semi-saturation constant [Pattanaik et al. 2000]. The value n is a sensitivity control which is similar to gamma for video, film and typical displays.
R(M) = R_max · M^n / (M^n + σ^n)    (6)
A bleaching factor ζ is calculated in step 316 using Equation (6) in order to blend the photopigment phenomena with the adaptation adjusted image. In step 316, ζ is set as R(M) of Equation (6), and R(M) is calculated according to a first and second example of step 316 as described below.
First example of step 316
In a first example of step 316, R(M) is calculated using the variables obtained using Equations (7) - (9).
The variable M in Equation (6) (denoted as M_ROM in Equation (7)) is calculated according to Equation (7), in which T_ROM denotes the total number of pixels in the ROM area. Note that calculating M_ROM according to Equation (7) is consistent with using the average radiometric value of the user's ROM area. The variable σ is calculated according to Equation (8), which is based on the method described in [Pattanaik et al. 2000] in which findings of [Hunt 1995] were used to aid in modeling the response of the photoreceptors. Equation (8) is derived when Equation (6) is specifically applied to the cones, and A_cone is the fixed adaptation illumination amount, which corresponds directly to the global illumination average value. Hunt's bleaching parameter R_max is calculated according to Equation (9), whereby A_cone is defined in the same way as in Equation (8).

M_ROM = (1 / T_ROM) · Σ_{(i,j)∈ROM} I(i,j)    (7)

σ = 12.9223 · (k^4 · A_cone + 0.171 · (1 − k^4)^2 · A_cone^{1/3}), where k = 1 / (5·A_cone + 1)    (8)

R_max = (2 × 10^6) / (2 × 10^6 + A_cone)    (9)
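A sketch of this first example of step 316 follows; the interface and the choice n = 1 are assumptions of the sketch.

    import numpy as np

    def bleaching_factor(rom_luminance, a_cone, n=1.0):
        """First example of step 316 (sketch): Naka-Rushton response of
        Equation (6), with M_ROM from Equation (7), sigma from Equation (8)
        and Hunt's R_max from Equation (9)."""
        m_rom = rom_luminance.mean()                          # Equation (7)
        k = 1.0 / (5.0 * a_cone + 1.0)                        # Hunt's k
        sigma = 12.9223 * (k ** 4 * a_cone +
                           0.171 * (1.0 - k ** 4) ** 2 * a_cone ** (1.0 / 3.0))  # Eq. (8)
        r_max = 2.0e6 / (2.0e6 + a_cone)                      # Equation (9)
        return r_max * m_rom ** n / (m_rom ** n + sigma ** n)  # Equation (6)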
Second example of step 316
In a second example of step 316, R(M) is calculated using the variables obtained using Equations (7), (10) and (11).

Similar to the first example of step 316, the variable M in Equation (6) (denoted as M_ROM in Equation (7)) is calculated according to Equation (7). However, the variable σ is calculated according to Equation (10), which is also based on the method described in [Pattanaik et al. 2000] in which findings of [Hunt 1995] were used to aid in modeling the response of the photoreceptors, whereas Hunt's bleaching parameter R_max is calculated according to Equation (11), whereby k is the scaling factor necessary to convert normalized pixel values ∈ [0,1] to luminance values in cd/m2 and is equal to the maximum luminance of the scene in cd/m2. In cases where no knowledge of the original illumination of the HDR scene is present, k is determined empirically. Equations (10) and (11) are derived from Equations (8) and (9) by substituting the luminance-scaled adaptation value k/(2·S_g) for A_cone.

σ = 12.9223 · (k'^4 · (k/(2·S_g)) + 0.171 · (1 − k'^4)^2 · (k/(2·S_g))^{1/3}), where k' = 1 / (5·k/(2·S_g) + 1)    (10)

R_max = (4 × 10^6 · S_g) / (4 × 10^6 · S_g + k)    (11)
5. Deriving an illumination scaled image
An illumination scaled image is then derived by performing a weighted sum of the globally scaled image from step 304 and the tone-mapped image from step 314, using the bleaching factor ζ. As shown in Fig. 3, the bleaching factor ζ calculated in step 316 is used to linearly weight the (photopigment) enhanced processing of the image (the output image after performing cluster-based tone-mapping in step 314) with the globally scaled adaptation image (the output image after performing global scaling in step 304). First and second examples of deriving the illumination scaled image are described below.
First example of deriving an illumination scaled image
A first example of performing the illumination scaling is done according to Equation (12), where I_out(u',v') is a given intensity at the spatial location (u',v') on the illumination scaled image when the user's gaze is centered at the point (u,v) on the screen of the display device. ζ is the bleaching factor obtained in step 316, I(u',v') is the HDR luminance of a given pixel at spatial location (u',v'), L(u',v') is the LDR image (the tone-mapped result obtained from the first example of sub-step 314b) at spatial location (u',v'), and I(u',v')·S_g is the globally scaled image (obtained from step 304) at spatial location (u',v'). L(u',v') is defined according to Equation (13). Furthermore, in Equations (12) and (13), S_i is the tone-mapping function obtained in Equation (4), whereas the definitions of C_i and O_i remain the same as those in Equation (4).

I_out(u',v') = (1 − ζ) · (I(u',v') · S_g) + ζ · L(u',v')    (12)

L(u',v') = (I(u',v') − C_i) · S_i + O_i  for C_i ≤ I(u',v') < C_{i+1},  i = 0,1,...,N    (13)
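As a sketch (argument names are assumptions; the tonemapped array corresponds to L(u',v') of Equation (13), e.g. as produced by the np.interp sketches above):

    import numpy as np

    def illumination_scale(hdr_luminance, s_g, tonemapped, zeta):
        """Equation (12): per-pixel blend of the globally scaled image
        I(u',v')*S_g (image A) and the cluster-based tone-mapped image
        L(u',v') (image B) using the bleaching factor zeta."""
        return (1.0 - zeta) * hdr_luminance * s_g + zeta * tonemapped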
Second example of deriving the illumination scaled image
A second example of deriving the illumination scaled image I_out(u',v') is performed according to Equation (14). In Equation (14), D is the distance between the viewpoint and the center of the user's gaze as calculated in Equation (3), and TMO_ROM(u,v,D)(u',v') is the output of the tone-mapping function (as obtained from the second example of sub-step 314b and as shown in Fig. 6) at spatial location (u',v'), based on the ROM area centered at (u,v).

I_out(u',v') = (1 − ζ) · (I(u',v') · S_g) + ζ · TMO_ROM(u,v,D)(u',v')    (14)
6. Converting illumination values in the illumination scaled image to RGB color values.
The illumination values in the illumination scaled image are then converted to RGB color values to obtain an output image of method 300.
In a first example, the conversion of the illumination values in the illumination scaled image to RGB color values is performed by relative scaling according to Equation (15), where r, g and b are the RGB values of the HDR image at spatial location (u',v').

r_out = (I_out(u',v') / I(u',v')) · r,   g_out = (I_out(u',v') / I(u',v')) · g,   b_out = (I_out(u',v') / I(u',v')) · b    (15)
In a second example, the conversion of the illumination values in the illumination scaled image to RGB color values is performed using the method of [Schlick 1994; Mantiuk et al. 2008] according to Equation (15a). In one example, gamma is set as 0.45 in Equation (15a).

r_out = (r / I(u',v'))^gamma · I_out(u',v'),   g_out = (g / I(u',v'))^gamma · I_out(u',v'),   b_out = (b / I(u',v'))^gamma · I_out(u',v')    (15a)
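A sketch covering both colour reconstructions is given below; the ratio-power form used for Equation (15a) is assumed from the cited method of [Schlick 1994], and the interface is illustrative.

    import numpy as np

    def to_rgb(rgb_hdr, hdr_luminance, i_out, gamma=0.45):
        """Colour reconstruction sketch: relative scaling per Equation (15),
        and the Schlick-style ratio-power form assumed for Equation (15a)."""
        ratio = rgb_hdr / hdr_luminance[..., None]        # per-channel colour ratio
        by_scaling = ratio * i_out[..., None]             # Equation (15)
        by_schlick = (ratio ** gamma) * i_out[..., None]  # Equation (15a), assumed form
        return by_scaling, by_schlick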
7. Simulation of Eye Adaptation Latency
Given variations in the direction of the user's gaze, spontaneous changes to the tone-mapping function tend to result in unnatural changes. One likely reason for this is that the human visual system simply does not adapt instantaneously. Hence, it is preferable to simulate the latency of the change in the human visual system to add naturalness and believability to the display system in the embodiments of the present invention.
First example of simulating eye adaptation latency

A first example of simulating eye adaptation latency is performed according to Equation (16). Given each newly computed value v_i(t) (which may be, for example, a cluster center), each parameter p_i(t) at time t is updated using an exponential smoothing technique according to Equation (16), where w controls the exponential delay factor.

p_i(t) = (w · p_i(t − 1) + v_i(t)) / (w + 1)    (16)

Using Equation (16) to simulate the eye adaptation latency, if the user remains focused on an area such that the cluster centers of the area remain unchanged, the parameters will converge to the actual cluster centers. The smaller w is, the faster the convergence will be. For example, setting w = 0 will result in no delay factor at all. Given that the data is being updated at 30 frames per second, experimental results show that w = 20 provides natural and dynamic content and hence is the preferred choice in this first example of simulating eye adaptation latency. Alternatively, w may be set to a different value. Closely approximating the results in [Pattanaik et al. 2000; Haig 1941] warrants increasing w to approximately 200. Though this does in fact produce results which are undoubtedly closer to what is physiologically occurring in the eye, the slow change results in the display being less dramatic. In such cases, users generally did not perceive the change and felt that they were looking at a static image. Hence increasing w to approximately 200 is not preferred.
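Equation (16) is a one-line per-parameter update; as a sketch (names assumed):

    def smooth_parameter(p_prev, v_new, w=20.0):
        """Equation (16): exponentially smoothed update of a tone-mapping
        parameter; w = 0 disables the delay, and w = 20 at 30 fps is the
        preferred setting described in the text."""
        return (w * p_prev + v_new) / (w + 1.0)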
Second example of simulating eye adaptation latency
An example of a method to simulate the time course of eye adaptation to a change in luminance is outlined in [Baker 1949; Haig 1941]. In a second example of simulating eye adaptation latency, a simplified eye adaptation model [Kopf et al. 2007; Ledda et al. 2004; Banterle et al. 2008; Krawczyk et al. 2005] based on the work presented in [Hateren 2006] is adopted to simulate the transition of the system parameters P(t) (such as the cluster centers C_i, the bleaching factor ζ, the cluster-based tone-mapping function S_i, etc.) over the time the user's ROM area changes. The time at which the user's ROM area changes from one ROM area to another is denoted as time t = t_0. P(t) is calculated according to Equation (17), where t is in seconds and τ is in milliseconds. In Equation (17), P(t) is the vector of parameters used for the current tone-mapping and P_ss(ROM_new) are the steady state parameters for the new ROM area the user is focused on. The vector P(t_0) is the set of parameters corresponding to the moment t_0 when the user's ROM area is changed. According to Equation (17), the value of τ changes when the user changes the gaze position from a light region to a dark region, or conversely from a dark region to a light region. This reflects the adaptation in the human visual system in each scenario.

P(t) = P_ss(ROM_new) + (P(t_0) − P_ss(ROM_new)) · e^{−(t − t_0)/τ},
where τ = 200 ms if M_ROM(new) > M_ROM(t_0), and τ = 400 ms if M_ROM(new) < M_ROM(t_0)    (17)
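As a sketch of Equation (17) (argument layout assumed):

    import numpy as np

    def adapt_parameters(t, t0, p_t0, p_ss, to_brighter):
        """Equation (17): exponential transition of the parameter vector P(t)
        after the ROM area changes at t0. tau is 200 ms when moving to a
        brighter ROM area and 400 ms when moving to a darker one."""
        tau = 0.2 if to_brighter else 0.4          # tau in seconds
        return p_ss + (p_t0 - p_ss) * np.exp(-(t - t0) / tau)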
8. Real time Implementation of method 300

It is difficult to implement the K-means clustering algorithm directly in real time. However, this is not necessary, as parameters such as the cluster centers, tone-mapping function and bleaching factor (in other words, the parameters obtained from steps 312 - 316) may be pre-computed in method 300. First and second examples of the real time implementation of method 300 are described below.
First example of real time implementation of method 300
The window for the ROM area used for a typical image has a radius of 50 pixels. In a first example, the input image is divided into 50 x 50 window blocks, where each of the 50 x 50 window blocks is referred to as a ROM window block and the center of each ROM window block is referred to as a grid position. In each ROM window block, a set of N cluster centers on the image are pre-set as pixels separated at regular pixel intervals both horizontally and vertically. In this first example, N is set as four and the pixel interval ranged from 10 to 50 pixels. As the interval decreases, the consistency between the intervals increases, with the consequence of increasing the pre-computation time. The bleaching factor ζ is also pre-computed for each ROM window block. In the first example, the display program is written in OpenGL, employing a vertex shader. Given the HDR photoquantities in RGB space, each spatial location in the HDR image is mapped to a vertex2D in the OpenGL program. The color of each vertex is then set using color3f(r,g,b) to the raw photometric value of the HDR image. The mapping presented in Equations (12), (13) and (15) is implemented as a vertex shader which is able to set the four cluster centers in addition to a ζ parameter.
In the first example, the eye tracking system sends updated gaze positions to the OpenGL program over UDP at a rate of 60Hz. Since the cluster centers and the bleaching factor are pre-computed, steps 312 - 316 of method 300 are replaced by the following steps. The OpenGL (CPU) program locates the four sets of N cluster centers belonging to the ROM window blocks with grid positions (the centers of the 50 x 50 ROM window blocks) which the user's current gaze position is closest to. It then linearly interpolates these four sets of N preset cluster centers, given the distance from the user's current gaze position to each of the four grid positions, to obtain N interpolated preset cluster centers. The bleaching factor ζ is interpolated in the same manner. A delay factor according to Equation (16) or (17) is then applied to the interpolated preset cluster centers and the interpolated bleaching factor ζ to obtain updated interpolated cluster centers and an updated interpolated bleaching factor ζ. The updated interpolated cluster centers are then set as the elements of a uniform float4 vector and are input into the OpenGL vertex shader. The updated interpolated bleaching factor ζ is input into the OpenGL vertex shader as well. Each vertex then computes its new display value given the N updated interpolated preset cluster centers and the updated interpolated ζ parameter. This is performed by deriving a tone-mapping function from the N updated interpolated preset cluster centers, applying the derived tone-mapping function to the HDR image to obtain a tone-mapped HDR image and deriving an output image (i.e. the new display value of each vertex) using the tone-mapped HDR image according to Equations (12) and (13). Equation (12) blends the tone-mapping based on the clustering algorithm with a linear tone curve in a per-pixel manner using the ζ parameter.
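A CPU-side sketch of this lookup-and-interpolate step is given below; the inverse-distance weighting and the dictionary layout of the pre-computed data are assumptions of this sketch, not the claimed implementation.

    import numpy as np

    def interpolate_parameters(gaze_uv, grid_positions, grid_centers, grid_zeta):
        """First real time example (sketch): pick the four grid positions
        nearest the gaze and interpolate their pre-computed cluster centers
        and bleaching factors before Equation (16) smoothing."""
        u, v = gaze_uv
        dist = {g: np.hypot(u - g[0], v - g[1]) for g in grid_positions}
        nearest = sorted(grid_positions, key=lambda g: dist[g])[:4]
        w = np.array([1.0 / (dist[g] + 1e-6) for g in nearest])
        w /= w.sum()
        centers = sum(wi * np.asarray(grid_centers[g]) for wi, g in zip(w, nearest))
        zeta = sum(wi * grid_zeta[g] for wi, g in zip(w, nearest))
        # Both results are smoothed with Equation (16) and sent to the shader.
        return centers, zeta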
Second example of real time implementation of Method 300
In a second example, the image is discretized into a grid of 50 x 50 pixel segments, where each of the 50 x 50 grid segments is referred to as a single ROM block or a pre-computed ROM area. This is the smallest ROM area which may be represented. The parameters for the tone-mapping function (i.e. the N cluster centers) are pre-computed for each of these ROM blocks. This is performed by clustering each of the ROM blocks into a plurality of clusters and deriving the centers of the clusters. The parameters of all possible contiguous 2 x 2, 3 x 3, ..., n x n ROM blocks across the entire image are also pre-computed, whereby n is the larger dimension of the width or height divided by the discretization size. When the size of a block begins to exceed the smaller dimension of the width or height divided by the discretization size, the "missing rows/columns" of the block are discarded and the pre-computation of the parameters is done only on the available image information. The bleaching factor ζ is also pre-computed.
In the second example, the OpenGL program employs a pixel shader. Given the HDR photoquantities in RGB space, each spatial location in the HDR image is mapped to a glVertex2D in the OpenGL program. The color of each vertex is then set to the raw photoquantimetric value of the HDR image.
In the second example, the eye tracking system sends updated gaze positions to the program over User Datagram Protocol (UDP) at a rate of 60Hz. Since the cluster centers and the bleaching factor are pre-computed, steps 312 - 316 of method 300 are replaced by the following steps. The program in the second example finds the pre-computed ROM area which is most applicable to the user's gaze and distance. This most applicable pre-computed ROM area is the pre-computed ROM area which overlaps the most with the estimated ROM area of the user. The delay factor according to Equations (16) and (17) is then applied to the corresponding pre-computed tone-mapping parameters of the best fit ROM area (i.e. the pre-computed N cluster centers for the best fit ROM area) and the pre-computed bleaching factor ζ to obtain updated pre-computed tone-mapping parameters and an updated bleaching factor ζ . These updated pre-computed tone-mapping parameters and the updated bleaching factor ζ are then sent to the Graphics Processing Unit (GPU). Finally, the GPU shader program in the embodiments of the present invention computes the tone-mapping function based on the updated pre-computed tone-mapping parameters and applies it to each pixel of the HDR image in parallel to obtain a tone-mapped image. An output image is then derived using the tone-mapped image and the updated bleaching factor ζ .
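A sketch of the best-fit selection follows; the rectangle representation of ROM areas and the (rect, params) pairing are assumptions of this sketch.

    def best_fit_rom_block(rom_rect, precomputed):
        """Second real time example (sketch): select the pre-computed ROM area
        that overlaps the user's estimated ROM area the most. Rectangles are
        (x0, y0, x1, y1) tuples; precomputed is a list of (rect, params)."""
        def overlap(a, b):
            w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
            h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
            return w * h
        # params holds the pre-computed N cluster centers and bleaching factor.
        return max(precomputed, key=lambda item: overlap(rom_rect, item[0]))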
9. Range reduced image results
The system in the embodiments of the present invention adaptively tunes the dynamic range resources available in the display in correspondence to where the user is looking. Some LDR snapshots of the HDR viewing system in the embodiments of the present invention are presented with the user looking at differing spatial locations. These images are compared to the static results of two state-of-the-art tone-mapping algorithms, one global method [Reinhard and Devlin 2005] and one local method [Fattal et al. 2002].
Figs. 7 - 13 show results of method 300 (using the first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) as compared to the local and global mapping algorithms of [Fattal et al. 2002] and [Reinhard and Devlin 2005] respectively. Figs. 7(a) and (b) - 13(a) and (b) show the results of method 300 when a user is looking at different portions of the image. Figs. 7(c) - 13(c) show the results when the global tone-mapping function (Reinhard) is applied, whereas Figs. 7(d) - 13(d) show the results when the local tone-mapping function (Fattal) is applied.
Results from method 300 (using the second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) are shown in Fig. 14. Fig. 14(a) shows the image obtained from method 300 when the user is looking at the sign below the window and Fig. 14(b) shows the image obtained from method 300 when the user is looking at the window. Fig. 14(c) shows the image obtained from applying the local tone-mapping algorithm [Fattal et al. 2002] on the input image whereas Fig. 14(d) shows the image obtained from applying the global tone-mapping [Reinhard and Devlin 2005] on the input image.
As shown in Figs. 7 - 14, in each of the scenes shown, there is adequate contrast in the regions where the user's gaze is focused. The problems which occur when attempting to display HDR content on an LDR display (for example extreme brightness or darkness) are largely confined to regions where the observer's gaze is not focused. In contrast, in the case of the existing HDR to LDR tone-mapping algorithms (for example the local and global mapping algorithms of [Fattal et al. 2002] and [Reinhard and Devlin 2005] respectively), unwanted effects are typically distributed throughout the entire image and, no matter where the user's gaze is focused, the image is unsatisfying and unrealistic.
Figs. 15(a) - (d) illustrate the output image from method 300 (using the second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the ROM area encompasses the entire image, 512 x 512 pixels, 128 x 128 pixels, and 32 x 32 pixels respectively. These are the results in the case where the user walks toward the screen. As shown in Fig. 15, the number of pixels in the ROM area decreases (from Fig. 15(a) to 15(d)) as the user gets closer. Consequently, the contrast of the specific region the user is looking at increases as the user's distance decreases.
Figs. 16 and 17 also show the contrast and details in different regions of the output images obtained from method 300 (using respectively the first and second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency) when the user is looking at different regions of each of the images.
10. User Study Results
User study results using method 300 (with first examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency)
Testing the system in the embodiments of the present invention is somewhat problematic, as there is no similar existing system. Although it may be possible to test the embodiments of the present invention against the HDR display made by Brightside/Dolby (in such a case, users would likely find the Brightside/Dolby HDR display more natural, so that display would probably fare much better in such a study), the point of the embodiments of the present invention is to adapt widespread current commercial technology, in other words LDR display devices, to display HDR content in a natural way. For this reason, a usability study was performed using a single LDR image with only scaling and gamma correction applied (similar to what is obtained from a single LDR image at a given exposure time using a typical camera) and a single tone-mapped HDR image produced by a state-of-the-art tone-mapping algorithm. Such a study, although not ideal, can give some indication of the benefit of method 300.
The usability study was conducted with 23 subjects drawn from diverse backgrounds such as media artists, photo hobbyists, media designers, visual communication designers, and market researchers. The subjects also included research scientists and engineers in HCI, computer graphics, virtual reality, and image processing. About one quarter of the subjects considered themselves proficient in photography or photography enthusiasts.
Subjects were shown the 3 images listed below and were asked questions regarding their preferences and user experiences.
1. a regular LDR image
2. a tone-mapped image (based on [Reinhard and Devlin 2005])
3. the image output obtained from method 300
The 3 images were labeled Image 1, Image 2 and Image 3 respectively. The image used was a photo taken at a popular tourist location in Singapore known as The Esplanade. Multiple exposures were taken at known exposure times in RAW mode and used to create HDR images. The best LDR shot was chosen to represent Image 1. To generate Image 2, the program qtpfsgui [Qtpfsgui 2007] (based on pfstmo [Mantiuk and Krawczyk 2007]), which implements several tone-mapping algorithms including [Reinhard and Devlin 2005], was used to produce the tone-mapped version of the image. The parameters were tuned by an image processing engineer to achieve the best possible result using this software.
The test results revealed that subjects generally based their preference on image features such as brightness, contrast, vividness of the colors, realism and the amount of detail. When the 3 images were compared, 63% of the subjects felt that looking at Image 3 gave them the greatest sense of realism (i.e. the strongest virtual presence). More information is shown in the pie chart in Fig. 18, which shows the percentage of subjects choosing Image 1, 2 or 3 as the image giving them the greatest sense of realism. Some subjects who had prior experience with available HDR tone-mapping systems further commented that tone-mapped images are generally unnatural and do not look realistic.
Subjects preferred the interactive and dynamic nature of Image 3 (when asked on a Likert scale of 1 - 5, 78% agreed, of which 48% strongly agreed on this point). Subjects also liked Image 3 because it can enhance the area of interest. It was further recommended that more depth information be included (for example, by using a 3D image instead) to improve the realism of the image.
One subject claimed "this on-the-fly generation of burn and dodge effect also guides a 3rd party observer as to where to view".
To test whether method 300 actually gives the perception of expanding the dynamic range of the display, subjects' opinions were sought as to whether the bright areas had more detail. The result was that 64% of subjects agreed (of which 50% strongly agreed) that they perceived more detail in the bright areas, whereas 43% felt that even the dark areas were perceived to have more detail. This affirms the modeling of the human eye in the embodiments of the present invention, in that the cones' sensitivity decreases in low intensity regions. Given the outcome of the user study with this set of subjects, method 300 did produce an output image which gives the perception of increasing the dynamic range of the display.
The usability study affirmed that method 300 achieved a greater sense of realism as compared to existing HDR to LDR tone-mapping techniques. The study also showed that the dynamic range of the display was increased perceptually by method 300, in particular in the brighter regions of the HDR image.

User study results using method 300 (with second examples of sub-step 314b, step 316, deriving an illumination scaled image and simulating eye adaptation latency)
Two user studies were conducted using method 300. The first study compares the results obtained from method 300 against the results of static tone-mapping algorithms on 10 reference images. The second study involves a real-world reference scene constructed by the inventors.
First user study using reference images:
The first study involved 20 gender-balanced and unbiased subjects between the ages of 16 and 35. Subjects were tested to ensure that they had normal color vision and were shown 10 different HDR images processed by the following three algorithms:
[A] a linear mapping with gamma correction of 2.2 applied thereafter,
[B] a global tone-mapping algorithm, and
[C] method 300 (the gaze-adaptive HDR display method according to the embodiments of the present invention).
The HDR images were selected from the HDR DVD provided by the text [Reinhard et al. 2005]. For the tone-mapping algorithm, [Reinhard et al. 2002], which (as stated by [Cadik et al. 2006]) preserves the most details and yet maintains photorealism, was chosen. The independent variables in this study are the 3 algorithms (first level) and the 10 images (second level). The dependent variable is the subject's preferential response.
The experiment was set up in a dark room (4 lux ambient luminance) with light provided only by the Barco Galaxy 9 HC+ projector used to display the images. The projector has a maximum output of 9,000 ANSI lumens, a contrast ratio of 1700:1, and a resolution of 1280 x 1024. The images were projected on a large display (2m x 1.5m) and a chair was set up 1.5m from the screen for immersive viewing. A Seeing Machines FaceLab Version 4 was used for eye tracking. To generate the images in real time, an HP xw9400 workstation with a dual core AMD 3.20 GHz Opteron, 16 GB RAM and an NVidia Quadro FX 4600 graphics card was used. Subjects were given the impression that all three algorithms were developed by the inventors and were asked to rank the algorithms based on their preferences and perceived realism for each of the resulting 10 images.
The order of images was randomized and subjects were allowed to take as much time as needed to view each image, and to review any images if they desired. The results revealed that there was a significant difference between the three algorithms, as computed by a two-way repeated measures analysis of variance (ANOVA), F(2,38) = 31.021, p < 0.001.
Given that ANOVA is like a t-test that can be applied to multiple data sets simultaneously, our method of analysis resulted in 10 F statistics; only the minimum F statistic is presented in what follows. As shown in Fig. 19, subjects significantly preferred Algorithm C for 7 of the 10 images, min F(2,38) = 7.389, p < 0.01, and Algorithm B for 2 of the 10 images, min F(2,38) = 6.589, p < 0.01. There was no significant difference for image 10 at p < 0.05.
Fig. 20 illustrates the subjects' perceived realism of the results for the same 10 reference images. Significant perceived realism was achieved by Algorithm C (method 300) for 4 of the 10 images, min F(2,38) = 3.633, p < 0.05, and by Algorithm B for 2 of the 10 images, min F(2,38) = 6.617, p < 0.01. The differences for images 3, 7, 9 and 10 were found to be insignificant at p < 0.05, although the difference for image 3 was significant at p < 0.06.
The DR values of the 10 HDR test images used in this first user study are shown in Table 1. This first user study revealed that if the dynamic range (DR) of the image was relatively low (DR less than 3, computed as log10(max/min) as shown in Table 1), users preferred the result of global tone-mapping. However, as the dynamic range of the image increased, users preferred the results of method 300. Artifacts produced by the global tone-mapping were rather obvious on the large display for these images. A Pearson correlation analysis indicates a large correlation [Cohen 1988] between the respondents' preferences and perceived realism for the algorithms versus the DR values, as can be seen in Table 2, which shows the correlation coefficient (r) between the respondents' preferences and the perceived realism versus image DR values. A short sketch of the DR computation follows.
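For reference, the DR measure used in Table 1 can be computed as in this short sketch; the exclusion of zero-valued pixels is an assumption made here to keep the logarithm finite, and the input is assumed to contain at least one positive luminance.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// Dynamic range of an image as reported in Table 1: DR = log10(max/min),
// computed over the (positive) luminances of the HDR image.
double dynamicRange(const std::vector<double>& luminance) {
    double lo = std::numeric_limits<double>::infinity(), hi = 0.0;
    for (double v : luminance) {
        if (v <= 0.0) continue;      // skip zeros so log10 stays finite
        lo = std::min(lo, v);
        hi = std::max(hi, v);
    }
    return std::log10(hi / lo);
}
```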
On a Likert scale of 1 to 5, the majority of subjects (75%, μ = 3.9, σ = 0.91) significantly agreed that the bright areas for Algorithm C had more detail (t(19) = 4.413, p < 0.001), while 40% (μ = 3.25, σ = 1.16; not significant at p < 0.05) of the subjects felt that the dark areas in Algorithm C had more detail. This indicates that the perceived dynamic range of the display was expanded, particularly in the bright areas, when method 300 was used.
Image                DR      Image                  DR
(1) Alhambra7        3.2     (6) GGMusicians        4.0
(2) BoyScoutFalls    7.2     (7) River_BoatB        3.8
(3) BridgeStudio2    5.5     (8) SanRafaelCreek     2.8
(4) ChurchWindow1    4.3     (9) SpanishSunset1     3.1
(5) ConservatoryE    2.8     (10) TmJ               3.1

Table 1
[Table 2, which lists the correlation coefficients (r) described above, appears only as an image in the source document and is not reproduced here.]

Second user study using real-world scene:
The second user study involved 10 gender-balanced and unbiased subjects between the ages of 16 and 45 with normal color vision. A real-world scene was set up to enable subjects to compare the results of the algorithms with a reference scene, as shown in Fig. 21, which shows a real-world HDR reference (DR = 6.4) in two exposures.
An HDR image of the reference scene was taken using a multi-exposure method similar to [Debevec and Malik 1997] and processed by the same three algorithms, A, B and C, used above. As in [Akyüz et al. 2007], the scene luminance was measured using an 18% grey card and a Photo Research Colorimeter PR670. The parameters of the three algorithms were tuned using this data to set the global luminance appropriately (the exposure time for Algorithm A, the key value for Algorithm B, and Sg for Algorithm C).
Without loss of generality, this second user study required the subjects to focus on three regions of the scene to evaluate which of the three algorithms best preserved the details and realism. The three regions are: (i) the brightly illuminated poster in the top section, (ii) the moderately illuminated bunch of flowers in the middle section, and (iii) the dimly illuminated objects under the table in the bottom section of the scene. In each region, the local contrast A is computed using Equation (18), where m and n are the width and height of the image region as calculated from Equation (2) (for this study, m = n = 300).
A = (1/(m·n)) Σ_{i=1..m} Σ_{j=1..n} [MAX(i,j) − MIN(i,j)] / [MAX(i,j) + MIN(i,j)]

MAX(i,j) = max_{k,l ∈ [−1,1]} I(i+k, j+l)        (18)

MIN(i,j) = min_{k,l ∈ [−1,1]} I(i+k, j+l)

The local contrast results for the three regions are shown in Table 3 (a sketch of this computation follows Table 3). Note that for Algorithm C, the computation was done when the system was at a steady state and the user's gaze was focused at the center of the region. Table 3 indicates that the gaze-adaptive system in the embodiments of the present invention was best able to preserve the contrast of the three regions considered.
                     poster    flower    under table
Algorithm A          0.0450    0.0830    0.0635
Algorithm B          0.0479    0.0571    0.0644
Algorithm C          0.1092    0.0955    0.0678
Original HDR image   0.1513    0.1066    0.0851

Table 3
Subjects were asked to provide a score between 0 and 10 for details and realism for each of the algorithms in each region. The results are given in Table 4, which shows the average user scores and significance (p) for each algorithm in terms of details and realism.
                 Details                    Realism
Region           A      B      C      p    A      B      C      p
poster           3.1    5.2    7.7    0.001    3.4    5.5    7.9    0.001
flower           6.2    6.1    6.2    ns       6.1    4.8    7.5    0.01
under table      4.1    6.0    7.9    0.001    5.3    6.4    6.8    ns

Table 4
When compared to the real scene, Algorithm C scored highest in terms of details and realism for all regions. Using a two-way repeated measures ANOVA, significant differences were found for the details of the poster, F(2,16) = 34.174, p < 0.001, the realism of the poster, F(2,16) = 36.118, p < 0.001, the realism of the flowers, F(2,16) = 8.902, p < 0.001, and the details under the table, F(2,16) = 35.548, p < 0.001. It was observed that all of the algorithms scored well in terms of details for the moderately lit middle section.
The user studies confirmed that both the realism and the details of the scene were best maintained by method 300 and hence the users found the resulting display of the HDR images more natural. These user studies also showed that the dynamic range of the display is perceptually increased, in particular in the bright regions of the HDR images.
The first and second user studies were conducted in a dark room to alleviate most ambient light issues. However, ambient light variations may be incorporated into the system in the embodiments of the present invention so that the system can be used in different environments. In [Mantiuk et al. 2008], ambient light when displaying images was considered in detail. Implementing the work in [Mantiuk et al. 2008] may entail measuring the ambient light before displaying and modifying the final output to the user. The work of [Mantiuk et al. 2008] also considers some aspects of the human visual system and may be incorporated into the embodiments of the present invention.
Even though the tone-mapping function used in method 300 is specifically designed to model a human eye, other tone-mapping operators may be used in method 300 as well. In particular, Fig. 22 shows the results when the global tone-mapping operator presented in [Reinhard et al. 2002] is used in place of the cluster-based tone-mapping operator. Fig. 22(a) shows the original input HDR image, while Figs. 22(b) and (c) show the output image of method 300, with the cluster-based tone-mapping function replaced by the global tone-mapping operator, when the user is looking at the center of the image and at the plant in the right portion of the image respectively. In this case, the parameters of the global tone-mapping operator are calculated based on the ROM area alone and not the entire image. The global tone-mapping operator is then applied to the entire image. A sketch of this ROM-parameterized global operator follows.
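To make the substitution concrete, here is a hedged C++ sketch of the photographic operator of [Reinhard et al. 2002] with its parameters estimated from the ROM area only and then applied to every pixel of the image. The function name, the default key value a = 0.18, and the choice of the ROM-area maximum as the white point are illustrative assumptions rather than details taken from the original.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Global operator of [Reinhard et al. 2002], parameterized on the ROM
// area (romLum) and applied to the full image (imgLum). Both inputs are
// assumed non-empty luminance buffers.
std::vector<float> tonemapReinhardOnROM(const std::vector<float>& imgLum,
                                        const std::vector<float>& romLum,
                                        float a = 0.18f) {
    // Log-average luminance of the ROM area (eps avoids log(0)).
    const float eps = 1e-6f;
    double logSum = 0.0;
    for (float v : romLum) logSum += std::log(eps + v);
    float Lavg = std::exp(static_cast<float>(logSum / romLum.size()));

    // White point taken from the ROM area as well (an assumed choice).
    float Lwhite = 0.f;
    for (float v : romLum) Lwhite = std::max(Lwhite, v);
    float Lw2 = (a / Lavg) * Lwhite;
    Lw2 *= Lw2;

    std::vector<float> out(imgLum.size());
    for (size_t i = 0; i < imgLum.size(); ++i) {
        float L = (a / Lavg) * imgLum[i];          // scaled luminance
        out[i] = L * (1.f + L / Lw2) / (1.f + L);  // Reinhard '02 curve
    }
    return out;
}
```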
In the embodiments of the present invention, a tone-mapping function is derived that is based on a model of the human visual system and changes in real time, approximating the adaptation of this system. This is advantageous because of the following reasons.
The embodiments of the present invention can amplify the adaptive mechanisms of the human eye to compensate for the restrictions imposed by the display and also to enhance the viewing experience for the user.
Furthermore, the embodiments of the present invention can effectively offload some of the range compression and compensation that is done by the human visual system onto the display system thus perceptually increasing the dynamic range. This, coupled with transitional latency also taken from the adaptation process of the human visual system, allows the creation of a display capable of dynamically showing HDR content in a manner which is natural to the user.
In the embodiments of the present invention, the adaptive display system is able to display HDR images on an LDR device based on the user's gaze and distance (using eye tracking technology). This is advantageous because of the following reasons.
Essentially, this allows the content of the display to be adapted according to where the user is looking, so that a particular area is presented in a natural way, but with appropriate contrast to represent the details in that area. Hence, the inherent resource problem is considerably simplified. The system in the embodiments of the present invention can ease the amount of compression needed by reducing the spatial area on which the tone-mapping function must operate. The display resources which are being optimized, such as contrast, intensity, etc., are largely only considered at the point where the user is looking. In regions where the user's gaze is not focused, resources are not critical and thus may be partially or fully discarded. For example, if a region is saturated or too dark, but the user is not looking in that area, there is no reason to apply resources to that area.
In the embodiments of the present invention, the latency in the response of the human eye to changes in viewing a scene is considered, and a latency factor is included in the change of the parameters when the user's viewpoint changes. This is advantageous because rapid discrete changes which are not natural to the user can be avoided. Instead, in the embodiments of the present invention, the device displaying the scene makes rapid but gradual adjustments, continually adapting to what the user is viewing, hence making the adaptive display look natural. A stand-in sketch of such a gradual adjustment is given below.
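The actual adaptation dynamics are given by Equations (16) and (17) in the original, which are not reproduced here; the following exponential-smoothing sketch is only a stand-in illustrating the idea of a per-frame gradual update, with the time constant tau as an assumed parameter.

```cpp
#include <cmath>

// Stand-in for the delay factor of Equations (16)/(17): each tone-mapping
// parameter (cluster center or bleaching factor) moves only part of the
// way toward its new target every frame. 'dt' is the frame time and 'tau'
// an assumed adaptation time constant, both in seconds.
float smoothTowards(float current, float target, float dt, float tau) {
    float alpha = 1.0f - std::exp(-dt / tau);  // fraction covered this frame
    return current + alpha * (target - current);
}
```

Each of the N cluster centers and the bleaching factor ζ would be updated this way once per frame before being handed to the shader, so that transitions between gaze positions remain smooth.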
The embodiments of the present invention are implemented in real time by tracking the gaze of a user viewing an HDR image in real time and by computing an appropriate tone-mapping function for the HDR image at real-time rates, using both pre-computations for the particular image and real time GPU-based rendering. In the second example of implementing method 300 in real time, the pre-computation for the tone-mapping function is done on the ROM-subdivided images. After determining the pre-computed ROM areas, the latency computation, etc., may be used without modification to achieve satisfying results. This real time implementation of method 300 is achieved by using a shader program on the GPU to perform all necessary real time operations and is advantageous because it improves the computational time of the locally adaptive operator in the embodiments of the present invention. Since the real time approach taken in the embodiments of the present invention feeds a set of parameters to a shader program, and a latency factor is included in the change of the parameters when the user's viewpoint changes, it is also possible to continually smooth the transition from one contrast enhancement to another by creating a gradual delay in each of the parameters. Further advantages of the embodiments of the present invention are as follows.
As opposed to creating a single static image, the embodiments of the present invention create a dynamic display system which adapts interactively to the user's view by creating different LDR images. According to the studies done (one comparing the display of an HDR image to the actual HDR scene), the embodiments of the present invention were successful in creating the illusion of viewing an HDR scene when in reality the image is displayed on a LDR display.
The embodiments of the present invention can enhance the effectiveness of other global tone-mapping operators. Any global operator may be applied and its effectiveness can be improved using the display system according to the embodiments of the present invention. Global tone-mapping operators and some local tone-mapping operators can hence benefit from this system. However, even though most tone-mapping operators can be used with this system, it may be difficult to apply a small percentage of local tone-mapping operators.
Aside from being effective as an HDR display device, the interactive gaze-aware component in the embodiments of the present invention may be extended to virtual and telepresence applications or any other application which benefits from adding a dynamic element to a static scene. Since telepresence is usually considered as not only "the feeling of being there" but also "the total response to being in a place and being able to interact with the environment" [Riva et al. 2003], the system in the embodiments of the present invention can be effectively used for such an application. Because the embodiments of the present invention simulate the inherent interaction of the human eye in response to light, they will therefore be able to provide a "better feeling of being there" for both tele- and virtual presence.
The embodiments of the present invention may also be used for gaze-selected active refocusing of images. In works such as [Moreno-Noguer et al. 2007], an image may be retroactively refocused (or appear to have its focal plane changed). Intentional blur is commonly used by photographers to draw attention to a subject. Using an active refocusing technique, it would be possible to show a refocused scene with an interactive component. Thus, wherever the user's gaze was focused, content in the same focal plane would be in focus, while objects closer or farther away would be appropriately out of focus. The embodiments of the present invention can also help to represent the properties of the human eye more appropriately.
The following compares the embodiments of the present invention described above with the prior art.
• Local tone-mapping: Can produce high contrast images, but artifacts are commonly created and the resulting images may look unnatural.
• Global tone-mapping: May operate very quickly, but suffers from obvious contrast problems.
• HDR display devices: Appropriate for showing HDR images, but not common as a display medium.
• Embodiments of the present invention: Interactively displays reasonable LDR images. Although the method according to the embodiments of the present invention is tuned to only a single user's view, it is a truly effective method of extracting dynamic content from a static image.

REFERENCES
1. AKYUZ, A. O., FLEMING, R., RIECKE, B. E., REINHARD, E., AND BULTHOFF, H. H. 2007. Do HDR displays support LDR content?: a psychophysical evaluation. ACM Trans. Graph. 26, 3, 38.
2. ASHIKHMIN, M. 2002. A tone mapping algorithm for high contrast images. Rendering Techniques, 145-156.
3. BAKER, H. D. 1949. The course of foveal light adaptation measured by the threshold intensity increment. Journal of Optometry 39, 172-179.
4. BANTERLE, F., DEBATTISTA, K., LEDDA, P., AND CHALMERS, A. 2008. A GPU-friendly method for high dynamic range texture compression using inverse tone mapping. Proc. Graph. Int. '08, 41-48.
5. BIMBER, O., AND IWAI, D. 2008. Superimposing dynamic range. ACM Trans. Graph. 27, 5, 1-8.
6. BOYLE, W., AND SMITH, G. 1970. Charge coupled semiconductor devices. Bell System Technical Journal 49 (April), 587-593.
7. CADIK, M., WIMMER, M., NEUMANN, L., AND ARTUSI, A. 2006. Image attributes and quality for evaluation of tone mapping operators. Proc. Pac. Graph., 35-44.
8. CALKINS, D., TSUKAMOTO, Y., AND STERLING, P. 1998. Microcircuitry and mosaic of a blue/yellow ganglion cell in the primate retina. Journal of Neuroscience 18, 3373-3385.
9. CLARKE, R., ZHANG, H., AND GAMLIN, P. 2003. Characteristics of the pupillary light reflex in the alert rhesus monkey. Journal of Neurophysiology 89, 3179-3189.
10. COHEN, J. 1988. Statistical Power Analysis for the Behavioral Sciences, second ed. Psychology Press.
11. DEBEVEC, P. E., AND MALIK, J. 1997. Recovering high dynamic range radiance maps from photographs. In ACM SIGGRAPH '97, 369-378.
12. DOLBY, 2006. http://www.dolby.com/promo/hdr/exp_flash.html.
13. DOWLING, J. 1987. The Retina: An Approachable Part of The Brain. Harvard University Press.
14. DRAGO, F., MYSZKOWSKI, K., ANNEN, T., AND CHIBA, N. 2003. Adaptive logarithmic mapping for displaying high contrast scenes. Computer Graphics Forum 22, 419-426.
15. DURAND, F., AND DORSEY, J. 2002. Fast bilateral filtering for the display of high-dynamic-range images. ACM Transactions on Graphics 21, 3, 257-266.
16. FACELAB, 2008. http://www.seeingmachines.com/facelab.htm.
17. FARBMAN, Z., FATTAL, R., LISCHINSKI, D., AND SZELISKI, R. 2008. Edge-preserving decompositions for multi-scale tone and detail manipulation. ACM SIGGRAPH '08, 1-10.
18. FATTAL, R., LISCHINSKI, D., AND WERMAN, M. 2002. Gradient domain high dynamic range compression. ACM Trans. Graph. 21, 3, 249-256.
19. FERWERDA, J. A., PATTANAIK, S., SHIRLEY, P., AND GREENBERG, D. P. 1996. A model of visual adaptation for realistic image synthesis. ACM SIGGRAPH, 249-258.
20. HAIG, C. 1941. The course of rod dark adaptation as influenced by the intensity and duration of pre-adaptation to light. Journal of General Physiology 24, 735-751.
21. HATEREN, J. H. V. 2006. Encoding of high dynamic range video with a model of human cones. ACM Trans. Graph. 25, 4, 1380-1399.
22. HUNT, R. W. G. 1995. The Reproduction of Color. Fountain Press, England.
23. JAYANT, N., JOHNSTON, J., AND SAFRANEK, R. 1993. Signal compression based on models of human perception. Proceedings of the IEEE, 1385-1422.
24. KANG, S. B., UYTTENDAELE, M., WINDER, S., AND SZELISKI, R. 2003. High dynamic range video. In ACM SIGGRAPH 2003 Papers, ACM, New York, NY, USA, 319-325.
25. KANUNGO, T., MOUNT, D. M., NETANYAHU, N., PIATKO, C., SILVERMAN, R., AND WU, A. Y. 2002. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans. Pattern Analysis and Mach. Intel. 24, 881-892.
26. KANUNGO, T., MOUNT, D. M., NETANYAHU, N., PIATKO, C., SILVERMAN, R., AND WU, A. Y. 2004. A local search approximation algorithm for k-means clustering. Computational Geometry: Theory and Applications 28, 89-112.
27. KOPF, J., UYTTENDAELE, M., DEUSSEN, O., AND COHEN, M. F. 2007. Capturing and viewing gigapixel images. ACM Trans. Graph. 26, 3, 93.
28. KRAWCZYK, G., MYSZKOWSKI, K., AND SEIDEL, H.-P. 2005. Perceptual effects in real-time tone mapping. In ACM SCCG '05, 195-202.
29. LARSON, G. W., RUSHMEIER, H., AND PIATKO, C. 1997. A visibility matching tone reproduction operator for high dynamic range scenes. IEEE Transactions on Visualization and Computer Graphics 3, 4, 291-306.
30. LARSON, G. W., RUSHMEIER, H., AND PIATKO, C. 1994. A contrast-based scale factor for luminance display. Graphics Gems IV, 415-421.
31. LEDDA, P., SANTOS, L. P., AND CHALMERS, A. 2004. A local model of eye adaptation for high dynamic range images. In ACM AFRIGRAPH '04, 151-160.
32. LI, Y., SHARAN, L., AND ADELSON, E. H. 2005. Compressing and companding high dynamic range images with subband architectures. ACM SIGGRAPH 24, 3, 836-844.
33. LISCHINSKI, D., FARBMAN, Z., UYTTENDAELE, M., AND SZELISKI, R. 2006. Interactive local adjustment of tonal values. ACM SIGGRAPH, 646-653.
34. MANN, S. 2001. Intelligent Image Processing. John Wiley and Sons. ISBN: 0-471-40637-6.
35. MANTIUK, R., DALY, S., AND KEROFSKY, L. 2008. Display adaptive tone mapping. ACM Trans. Graph. 27, 3, 1-10.
36. MANTIUK, R., AND KRAWCZYK, G., 2007. http://sourceforge.net/projects/pfstools/.
37. MANTIUK, R., MYSZKOWSKI, K., AND SEIDEL, H. P. 2005. A perceptual framework for contrast processing of high dynamic range images. Proceedings of the Second Symposium on Applied Perception in Graphics and Visualization, 87-94.
38. MORENO-NOGUER, F., BELHUMEUR, P. N., AND NAYAR, S. K. 2007. Active refocusing of images and videos. ACM Trans. Graph. 26, 3, 67.
39. MURPHY, H., AND DUCHOWSKI, A. T. 2001. Gaze-contingent level of detail rendering. Eurographics.
40. MURPHY, H., AND DUCHOWSKI, A. T. 2007. Hybrid image/model-based gaze-contingent rendering. In ACM APGV '07, 107-114.
41. NAKA, K., AND RUSHTON, W. 1966. S-potentials from luminosity units in the retina of fish (Cyprinidae). The Journal of Physiology 185, 587-599.
42. NIKOLOV, S. G., NEWMAN, T. D., BULL, D. R., CANAGARAJAH, N. C., JONES, M. G., AND GILCHRIST, I. D. 2004. Gaze-contingent display using texture mapping and OpenGL: system and applications. In ETRA '04: Proceedings of the 2004 Symposium on Eye Tracking Research & Applications, ACM, New York, NY, USA, 11-18.
43. PATTANAIK, S. N., FERWERDA, J. A., AND FAIRCHILD, M. D. 1998. A multiscale model of adaptation and spatial vision for realistic image display. ACM SIGGRAPH, 287-298.
44. PATTANAIK, S. N., TUMBLIN, J., YEE, H., AND GREENBERG, D. P. 2000. Time-dependent visual adaptation for fast realistic image display. ACM SIGGRAPH '00, 47-54.
45. PAUPOO, A., FRIEDBERG, C., AND LAMB, T. 2000. Human cone photoreceptor responses measured by the electroretinogram a-wave during and after exposure to intense illumination. Journal of Physiology 529, 2, 469-482.
46. PHOTOMATIX, 2003. http://www.hdrsoft.com/.
47. QTPFSGUI, 2007. http://qtpfsgui.sourceforge.net/.
48. REINHARD, E., AND DEVLIN, K. 2005. Dynamic range reduction inspired by photoreceptor physiology. IEEE Trans. VCG 11, 1, 12-24.
49. REINHARD, E., STARK, M., SHIRLEY, P., AND FERWERDA, J. 2002. Photographic tone reproduction for digital images. ACM Trans. Graph. 21, 3, 45-52.
50. REINHARD, E., WARD, G., PATTANAIK, S., AND DEBEVEC, P. 2005. High Dynamic Range Imaging: Acquisition, Display, and Image-Based Lighting. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
51. RIVA, G., LORETI, P., LUNGHI, M., VATALARO, F., AND DAVIDE, F. 2003. Presence 2010: The emergence of ambient intelligence. In W. A. IJsselsteijn (Ed.), Being There: Concepts, Effects and Measurement of User Presence in Synthetic Environments, IOS Press, 60-81.
52. SCHLICK, C. 1994. Quantization techniques for the visualization of high dynamic range pictures. Photorealistic Rendering Techniques, Proc. 5th Eurographics Rendering Workshop, 7-20.
53. SEETZEN, H., WARD, G., WHITEHEAD, L., AND HEIDRICH, W. 2004. High dynamic range display system. ACM SIGGRAPH '04 Emerging Technologies.
54. STARK, L., AND SHERMAN, P. 1957. A servoanalytic study of consensual pupil reflex to light. Journal of Neurophysiology 20, 17-25.
55. STERLING, P. 1999. Deciphering the retina's wiring diagram. Nature Neuroscience 2, 10 (October), 851-853.
56. TUMBLIN, J., AND RUSHMEIER, H. E. 1993. Tone reproduction for realistic images. IEEE Computer Graphics and Applications 13, 6, 42-48.
57. TUMBLIN, J., AND TURK, G. 1999. LCIS: A boundary hierarchy for detail-preserving contrast reduction. ACM SIGGRAPH, 83-90.
58. TUMBLIN, J., HODGINS, J. K., AND GUENTER, B. K. 1999. Two methods for display of high contrast images. ACM Transactions on Graphics 18, 1, 56-94.
59. WYSZECKI, G., AND STILES, W. 2000. Color Science: Concepts and Methods, Quantitative Data and Formulae, 2nd Ed. Wiley, New York, New York.

Claims
1. A method for displaying an HDR image on a LDR display device to a user, the method comprising the steps of repeatedly:
(1a) estimating a gaze position of the user by tracking at least one eye of the user, the gaze position of the user being a position on a screen of the LDR device at which the user is looking;
(1b) deriving an output image from the HDR image based on the estimated gaze position of the user; and
(1c) displaying the output image on the LDR display device.
2. A method according to claim 1, wherein the estimated gaze position of the user is an estimated ROM area which is an area on the screen of the LDR device at which the user is looking and step (1b) further comprises the sub-steps of:
(2i) deriving a tone-mapping function based on the estimated ROM area of the user;
(2ii) applying the derived tone-mapping function to the HDR image to obtain a tone-mapped HDR image; and
(2iii) deriving the output image using the tone-mapped HDR image.
3. A method according to claim 2, wherein step (1a) further comprises the sub-steps of:
(3i) determining a distance between the user and the screen of the LDR display device;
(3ii) determining a point on the screen of the LDR display device at which the user is looking; and
(3iii) estimating the ROM area based on a view distance and a view point, the view distance being the distance between the user and the screen of the LDR display device and the view point being the point on the screen of the LDR display device at which the user is looking.
4. A method according to claim 3, wherein a user position is calculated from the view distance and the ROM area is estimated in step (3iii) as a square centered at the view point, each side of the square being of a length calculated by multiplying the distance between the view point and the user position with an angle representing a visual angle projected on a macula of the user's eye.

5. A method according to claim 4, wherein the angle representing the visual angle projected on the macula of the user's eye is 18°.
6. A method according to any of claims 2 - 5, wherein the estimated ROM area comprises a plurality of pixels and step (2i) further comprises the sub-steps of:
(6i) clustering the plurality of pixels to obtain a plurality of clusters wherein each cluster comprises a sub-set of the plurality of pixels and a center of each cluster is referred to as a cluster center; and
(6ii) deriving the tone-mapping function based on the cluster centers.
7. A method according to claim 6, wherein the number of the plurality of clusters in sub-step (6i) is four.
8. A method according to claim 6 or 7, wherein the plurality of pixels is clustered in step (6i) using a K-means clustering algorithm.
9. A method according to any of claims 6 - 8, wherein sub-step (6ii) further comprises the sub-step of deriving the tone-mapping function based on the cluster centers such that the plurality of pixels in the estimated ROM area are expressed in a dynamic range of the LDR display device.
10. A method according to any of claims 6 - 8, wherein sub-step (6ii) further comprises the sub-steps of:
(10i) calculating a slope for each cluster center of the plurality of clusters according to the equation Si = 0.5/Ci, where Ci is the cluster center for the cluster i and Si is the calculated slope for the cluster center Ci;
(10ii) creating a linear segment for each cluster center, the linear segment having a gradient equal to the calculated slope for the cluster center; and
(10iii) deriving the tone-mapping function as a function formed by joining the linear segments created for the cluster centers.

11. A method according to claim 10, wherein sub-step (10iii) further comprises the sub-step of performing a weighted normalization procedure on the linear segments created for the cluster centers prior to joining the linear segments, the weighted normalization procedure being based on the number of pixels in each cluster of the plurality of clusters.
12. A method according to any of claims 2 - 11, wherein the tone-mapping function derived in sub-step (2i) is a piece-wise linear function or a cubic B-splines function.
13. A method according to any of claims 2 - 12, wherein sub-step (2ii) further comprises the sub-step of mapping cluster centers to equidistant intensities in the range of the tone-mapping function to obtain the tone-mapped HDR image.
14. A method according to any of claims 2 - 13, wherein the HDR image comprises a plurality of pixels and the sub-step (2iii) further comprises the sub-steps of:
(14i) calculating an illumination value for each pixel in the HDR image;
(14ii) calculating an average illumination value for the plurality of pixels in the HDR image using the illumination values calculated for the pixels in the HDR image;
(14iii) mapping the average illumination value for the plurality of pixels in the HDR image to a middle of a display range of the LDR display device to obtain a globally scaled HDR image; and
(14iv) deriving the output image based on the globally scaled HDR image and the tone-mapped HDR image.
15. A method according to claim 14, wherein sub-step (14iv) further comprises the sub-steps of:
(15i) calculating a bleaching factor according to a Naka-Rushton equation;
(15ii) performing a weighted sum of the globally scaled HDR image and the tone-mapped HDR image using the bleaching factor to obtain an illumination scaled image; and
(15iii) deriving the output image based on the illumination scaled image.
16. A method according to claim 15, wherein the illumination scaled image comprises a plurality of pixels, each pixel having an illumination value, and sub-step (15iii) further comprises the sub-step of converting the illumination values of the pixels in the illumination scaled image to RGB color values to derive the output image.
17. A method according to claim 1, wherein the following steps (17a) - (17b) are performed prior to step (1a):
(17a) dividing the HDR image into a plurality of ROM window blocks;
(17b) pre-defining a set of cluster centers in each ROM window block, the set of cluster centers comprising a plurality of pixels separated at regular pixel intervals both horizontally and vertically; and wherein step (1b) further comprises the sub-steps of:
(17i) locating a plurality of grid positions to which the user's gaze position is the closest, each grid position being a center of a corresponding ROM window block;
(17ii) locating the pre-defined set of cluster centers in the corresponding ROM window block for each grid position;
(17iii) interpolating the located pre-defined sets of cluster centers to obtain a plurality of interpolated preset cluster centers according to the distance between the plurality of grid positions and the user's gaze position;
(17iv) deriving a tone-mapping function based on the plurality of interpolated preset cluster centers;
(17v) applying the derived tone-mapping function to the HDR image to obtain a tone-mapped HDR image; and (17vi) deriving the output image using the tone-mapped HDR image.
18. A method according to claim 1, wherein the estimated gaze position of the user is an estimated ROM area which is an area on the screen of the LDR device at which the user is looking and wherein the following steps (18a) - (18b) are performed prior to step (1a):
(18a) dividing the HDR image into a plurality of pre-computed ROM areas, each pre-computed ROM area comprising a plurality of pixels; and
(18b) clustering the plurality of pixels in each pre-computed ROM area into a plurality of clusters; and wherein step (1b) further comprises the sub-steps of:
(18i) locating a pre-computed ROM area which overlaps the most with the estimated ROM area of the user;
(18ii) deriving a tone-mapping function for the located pre-computed ROM area based on pre-computed cluster centers in the located pre-computed ROM area, each pre-computed cluster center being a center of a cluster in the located pre-computed ROM area;
(18iii) applying the derived tone-mapping function to the HDR image to obtain a tone-mapped HDR image; and
(18iv) deriving the output image using the tone-mapped HDR image.
19. A method according to claim 17, further comprising the step of applying a delay factor to the plurality of interpolated preset cluster centers prior to sub-step (17iv).
20. A method according to claim 17, further comprising a step of pre-computing a bleaching factor prior to step (1a) and the sub-step (17vi) is performed by applying the bleaching factor with the tone-mapped HDR image to derive the output image.
21. A method according to claim 20, further comprising the step of applying a delay factor to the bleaching factor prior to sub-step (17vi).
22. A method according to claim 18, further comprising the step of applying a delay factor to the pre-computed cluster centers in the located pre-computed ROM area prior to sub-step (18ii).
23. A method according to claim 18, further comprising the step of pre- computing a bleaching factor prior to step (1a) and the sub-step (18iv) is performed by applying the bleaching factor with the tone-mapped HDR image to derive the output image.
24. A method according to claim 23, further comprising the step of applying a delay factor to the bleaching factor prior to sub-step (18iv).
25. A computer system having a processor and a LDR display device, the computer system being arranged to perform a method according to any of the preceding claims.
26. A computer system according to claim 25, wherein the processor is arranged to run a display program employing a shader program.
27. A computer program product, readable by a computer and containing instructions operable by a processor of a computer system having an LDR display device to cause the computer system to perform a method according to any of claims 1 to 24.