CN117043812A - Viewer adaptive status based brightness adjustment - Google Patents

Viewer adaptive status based brightness adjustment

Info

Publication number
CN117043812A
CN117043812A
Authority
CN
China
Prior art keywords
pupil size
image frame
current
value
target
Prior art date
Legal status
Pending
Application number
CN202280023547.6A
Other languages
Chinese (zh)
Inventor
J. A. Pytlarz
J. W. Zuena
P. J. A. Klittmark
Current Assignee
Dolby Laboratories Licensing Corp
Original Assignee
Dolby Laboratories Licensing Corp
Priority date
Filing date
Publication date
Application filed by Dolby Laboratories Licensing Corp filed Critical Dolby Laboratories Licensing Corp
Priority claimed from PCT/US2022/018547 external-priority patent/WO2022203826A1/en
Publication of CN117043812A publication Critical patent/CN117043812A/en

Landscapes

  • Controls And Circuits For Display Device (AREA)

Abstract

A video transmission system for brightness adjustment based on viewer adaptation status includes a processor configured to receive a source image comprising a current image frame, with metadata corresponding to an average luminance value of the current image frame, and an upcoming image frame, with metadata corresponding to an average luminance value of the upcoming image frame. The processor is configured to determine an ambient light level value based on the ambient light level, determine an incident light level value based on the ambient light level value and the average luminance values, determine a difference between a current pupil size and a target pupil size, and generate an output image by modifying the source image based on a brightness adjustment factor that is a function of the difference between the current pupil size and the target pupil size.

Description

Viewer adaptive status based brightness adjustment
1. Cross-reference to related applications
The present application claims priority from European patent application No. 21163880.4, filed on 22 March 2021, and U.S. provisional application No. 63/164,165, filed on 22 March 2021, each of which is incorporated herein by reference in its entirety.
2. Technical field
The present application relates generally to systems and methods for adjusting light intensity based on the adaptation status of a viewer.
3. Background art
As used herein, the term "Dynamic Range (DR)" may relate to the capability of the Human Visual System (HVS) to perceive a range of intensity (e.g., luminance, brightness) in an image, e.g., from darkest grays (blacks) to brightest whites (highlights). In this sense, DR relates to a 'scene-referred' intensity. DR may also relate to the ability of a display device to adequately or approximately render an intensity range of a particular breadth. In this sense, DR relates to a 'display-referred' intensity. Unless a particular sense is explicitly specified to have particular significance at any point in the description herein, it should be inferred that the term may be used in either sense, e.g., interchangeably.
As used herein, the term "High Dynamic Range (HDR)" relates to DR broadness of about 14 to 15 orders of magnitude across the Human Visual System (HVS). Indeed, DR of a broad breadth in the range of intensities that humans can simultaneously perceive may be slightly truncated relative to HDR. As used herein, the term "Enhanced Dynamic Range (EDR) or Visual Dynamic Range (VDR)" may be related to such DR either alone or interchangeably: the DR may be perceived within a scene or image by the Human Visual System (HVS) including eye movement, allowing for some light adaptation variation across the scene or image.
In practice, an image comprises one or more color components (e.g., luma Y and chroma Cb and Cr), wherein each color component is represented by a precision of n bits per pixel (e.g., n = 8). Using linear luminance coding, images where n ≤ 8 (e.g., color 24-bit JPEG images) are considered images of standard dynamic range, while images where n > 8 may be considered images of enhanced dynamic range. EDR and HDR images may also be stored and distributed using high-precision (e.g., 16-bit) floating-point formats, such as the OpenEXR file format developed by Industrial Light & Magic.
As used herein, the term "metadata" relates to any auxiliary information that is transmitted as part of the encoded bitstream and that assists the decoder in rendering the decoded image. Such metadata may include, but is not limited to, color space or gamut information, reference display parameters, and auxiliary signal parameters as described herein.
Most consumer desktop displays currently support a luminance of 200 to 300 cd/m² (or "nits"). Most consumer HDTVs range from 300 to 500 nits, with new models reaching 1,000 nits (cd/m²). Such conventional displays thus typify a lower dynamic range (LDR), also referred to as a standard dynamic range (SDR), in relation to HDR or EDR. As the availability of HDR content grows due to advances in both capture equipment (e.g., cameras) and HDR displays (e.g., the PRM-4200 professional reference monitor from Dolby Laboratories), HDR content may be color graded and displayed on HDR displays that support higher dynamic ranges (e.g., from 1,000 nits to 5,000 nits or more). With the increasing luminance capability of HDR displays, viewers experience more dramatic changes between dark and bright luminance, which may cause discomfort.
In addition, High Dynamic Range (HDR) content authoring is now becoming commonplace, as such techniques provide more realistic and lifelike images than earlier formats. However, many display systems, including hundreds of millions of consumer television displays, are not capable of rendering HDR images. Furthermore, because HDR displays range widely in capability (e.g., from 1,000 nits to 5,000 nits or more), HDR content optimized on one HDR display may not be suitable for direct playback on another HDR display. One approach to serving the entire marketplace is to create multiple versions of new video content, for example, one version using HDR images and another using SDR (standard dynamic range) images. However, this requires content authors to create their video content in a variety of formats, and may require consumers to know which format to purchase for their particular display.
Disclosure of Invention
HDR technology makes content brighter than was previously possible. Jumps in brightness from dark to light, and vice versa, within content can be an uncomfortable experience for the viewer. Such brightness jumps may occur at image transitions, such as channel changes or advertisement insertions, as well as at transitions created for artistic effect. Accordingly, techniques have been developed to reduce such discomfort while maintaining the viewing experience intended by the content author. The techniques may further take characteristics of the output device into account while maintaining the intended viewing experience.
Various aspects of the present disclosure relate to devices, systems, and methods for adjusting light intensity based on a status of a viewer.
In one exemplary aspect of the present disclosure, a video transmission system for brightness adjustment based on viewer adaptation status is provided. The video transmission system includes a processor for performing post-production editing of video data. The processor is configured to receive a source image, the source image comprising a current image frame including metadata corresponding to an average luminance value of the current image frame, and an upcoming image frame including metadata corresponding to an average luminance value of the upcoming image frame. The processor is configured to: determine an ambient light level value based on the ambient light level for the current image frame and the upcoming image frame; and determine an incident light intensity value based on the ambient light level value and the average luminance value for the current image frame and the upcoming image frame. The processor is further configured to determine a current pupil size and a target pupil size using a model that estimates pupil size from incident light intensity, wherein the target pupil size is determined based on the incident light intensity value of the upcoming image frame, and wherein the current pupil size is determined based on the incident light intensity values of the current image frame and one or more previous image frames. The processor is further configured to determine a difference between the current pupil size and the target pupil size, and to generate an output image by including, in the source image, metadata indicating an expected pupil size change between the current image frame and the upcoming image frame, wherein the metadata indicating the expected pupil size change is determined from the difference between the current pupil size and the target pupil size.
In another exemplary aspect of the present disclosure, there is provided a method for brightness adjustment based on viewer-adapted status, the method comprising: receiving a source image, the source image comprising a current image frame, the current image frame comprising metadata corresponding to an average luminance value of the current image frame, and the source image comprising an upcoming image frame, the upcoming image frame comprising metadata corresponding to an average luminance value of the upcoming image frame; determining an ambient light level value based on ambient light level for the current image frame and the upcoming image frame; determining an incident light intensity value based on the ambient light intensity value and the average light intensity value for the current image frame and the upcoming image frame; determining a current pupil size and a target pupil size using a model that estimates pupil size from incident light intensity, wherein the target pupil size is determined based on the incident light intensity value of the upcoming image frame, and wherein the current pupil size is determined based on the incident light intensity value of the current image frame and one or more previous image frames; determining a difference between the current pupil size and the target pupil size; and generating an output image by including metadata in the source image indicating an expected pupil size change between the current image frame and the upcoming image frame, wherein the metadata indicating an expected pupil size change is determined from the difference between the current pupil size and the target pupil size.
In another exemplary aspect of the present disclosure, a video transmission system for brightness adjustment based on viewer adaptation status is provided. The transmission system includes a processor for decoding a received encoded bitstream. The processor is configured to receive an input image, the input image comprising a current image frame, an upcoming image frame, and metadata indicating an expected pupil size change between the current image frame and the upcoming image frame. The processor is further configured to: determine a target luminance value for the current image frame and the upcoming image frame; determine an ambient light level value based on the ambient light level; and determine an incident light intensity value based on the ambient light level value and the target luminance value for the current image frame and the upcoming image frame. The processor is further configured to: select a tone mapping curve based on characteristics of a device configured to provide the image; and determine a current pupil size and a target pupil size using a model that estimates pupil size from incident light intensity, wherein the target pupil size is determined based on the incident light intensity of the upcoming image frame, and wherein the current pupil size is determined based on the incident light intensities of the current image frame and one or more previous image frames. The processor is further configured to: determine a difference between the current pupil size and the target pupil size; alter the tone mapping curve based on the expected pupil size change and the difference between the current pupil size and the target pupil size; and apply the modified tone mapping curve to the input image to generate an output image.
In another exemplary aspect of the present disclosure, a method for brightness adjustment based on viewer-adapted status is provided. The method comprises the following steps: receiving an input image, the input image comprising a current image frame, an upcoming image frame, and metadata indicating an expected pupil size change between the current image frame and the upcoming image frame; and determining a target luminance value for the current image frame and the upcoming image frame. The method further comprises: determining an ambient light level value based on the ambient light level; determining an incident light intensity value based on the ambient light intensity value and the target light intensity value for the current image frame and the upcoming image frame; and selecting a tone mapping curve based on characteristics of a device configured to provide the image. The method further comprises: determining a current pupil size and a target pupil size using a model that estimates pupil size from incident light intensities, wherein the target pupil size is determined based on incident light intensities of upcoming image frames, and wherein the current pupil size is determined based on incident light intensities of a current image frame and one or more previous image frames; altering the tone mapping curve based on the expected pupil size change and a difference between the current pupil size and the target pupil size; and applying the modified tone mapping curve to the input image to generate an output image.
In another exemplary aspect of the present disclosure, a non-transitory computer readable medium storing instructions that, when executed by a processor of a video transmission system, cause the video transmission system to perform the method of the present disclosure is provided.
In this manner, various aspects of the present disclosure provide for the display of images having both high dynamic range and high resolution, and effect improvements in at least the technical fields of image projection, holography, signal processing, and the like.
Drawings
These and other more detailed and specific features of the various embodiments are more fully disclosed in the following description, with reference to the accompanying drawings, in which:
FIG. 1 depicts an example process of a video transmission pipeline.
Fig. 2 depicts an example process for brightness adjustment based on viewer adaptation status.
FIG. 3 depicts an example two-dimensional display environment.
Fig. 4 depicts an example one-dimensional cross-section of the display environment of fig. 3.
Fig. 5 depicts an example model of a steady-state pupil.
Fig. 6 depicts an example model of the pupil's response to a change in the intensity of light experienced.
FIG. 7 depicts an example process for brightness adjustment based on received metadata.
FIG. 8 depicts an example chart for determining a slider value.
Fig. 9A to 9B depict example charts illustrating user preference settings.
Detailed Description
The present disclosure and aspects thereof may be embodied in various forms including: hardware, devices or circuits controlled by computer implemented methods, computer program products, computer systems and networks, user interfaces and application programming interfaces; and hardware implemented methods, signal processing circuits, memory arrays, application Specific Integrated Circuits (ASICs), field Programmable Gate Arrays (FPGAs), and the like. The foregoing is intended merely to give a general idea of various aspects of the present disclosure and is not intended to limit the scope of the present disclosure in any way.
In the following description, numerous details are set forth, such as optical device configurations, timings, operations, etc., to provide an understanding of one or more aspects of the present disclosure. It will be apparent to one skilled in the art that these specific details are merely exemplary and are not intended to limit the scope of the application.
Further, while the present disclosure focuses primarily on examples of using various circuits in a digital projection system, it should be understood that these are merely examples. It should further be appreciated that the disclosed systems and methods may be used in any device in which projection of light is desired; such as cinema projection systems, consumer-grade projection systems and other commercial projection systems, heads-up displays, virtual reality displays, and the like. The disclosed systems and methods may be implemented in additional display devices, such as with OLED displays, LCD displays, quantum dot displays, and the like.
Video encoding and decoding of HDR signals
FIG. 1 depicts an example process of a video transmission pipeline (100) showing various stages from video capture to video content display. A sequence of video frames (102) is captured or generated using an image generation block (105). The video frames (102) may be captured digitally (e.g., by a digital camera) or generated by a computer (e.g., using computer animation) to provide video data (107). Alternatively, the video frames (102) may be captured on film by a film camera, and the film is converted to a digital format to provide the video data (107). In a production phase (110), the video data (107) is edited to provide a video production stream (112).
The video data of the production stream (112) is then provided to a processor (or one or more processors, such as a central processing unit (CPU)) at block (115) for post-production editing. The post-production editing of block (115) may include adjusting or modifying colors or brightness in particular regions of the image to enhance image quality or achieve a particular appearance of the image in accordance with the video creator's authoring intent. This is sometimes called "color timing" or "color grading." The methods described herein may be performed by the processor at block (115). Other editing (e.g., scene selection and sequencing, image cropping, addition of computer-generated visual effects, etc.) may be performed at block (115) to yield a final version (117) of the production for distribution. During post-production editing (115), video images are viewed on a reference display (125).
Following post-production (115), the video data of the final production (117) may be delivered to an encoding block (120) for downstream delivery to decoding and playback devices such as televisions, set-top boxes, movie theaters, and the like. In some embodiments, the encoding block (120) may include audio and video encoders, such as those defined by ATSC, DVB, DVD, Blu-ray, and other delivery formats, to generate the encoded bitstream (122). In a receiver, the encoded bitstream (122) is decoded by a decoding unit (130) to generate a decoded signal (132) representing an identical or close approximation of the signal (117). The receiver may be attached to a target display (140), which may have completely different characteristics than the reference display (125). In that case, a display management block (135) may be configured to map the dynamic range of the decoded signal (132) to the characteristics of the target display (140) by generating a display-mapped signal (137). Additional methods described herein may be performed by the decoding unit (130) or the display management block (135). The decoding unit (130) and the display management block (135) may each include their own processor or may be integrated into a single processing unit.
Brightness adaptation
As described above, jumps in brightness may create an uncomfortable viewing experience for a person viewing video content. Accordingly, the systems and methods provided herein maintain the authoring intent of the content creator based on the adaptation state of the viewer. The adaptation state of the viewer may be, for example, the speed at which the viewer's pupil reacts to brightness changes, as described in more detail below. Maintaining authoring intent in this manner is accomplished by modeling the adaptation state of the content creator and the adaptation state of the viewer at any point in time while viewing a series of constantly changing frames. Specifically, the model estimates the change in the viewer's pupil diameter based on the light output of the device on which the video content is provided. Additional information, such as ambient light, screen reflections, and chromatic adaptation, may further be taken into account.
Method of authoring content
To maintain the authoring intent of the content creator, the authoring experience is measured and converted into metadata during content authoring. Fig. 2 provides a method (200) for adjusting the brightness of content based on a reference adaptation state. The method (200) includes receiving a source image, such as the video data (107), at step (205). The source image may include a current image frame (e.g., the frame of the video data (107) currently being viewed). The source image may also include metadata associated with the current image frame. For example, an image "Smid" value, representing a luminance level proportional to the perceived brightness of the source image, may be stored in L1 metadata describing the average image brightness of the source image. In some embodiments, the Smid value is a measure of the average (e.g., arithmetic mean, median, geometric mean) luminance provided by the source image when the source image is displayed on the reference display (125). The Smid value may be estimated as the average of the maximum color component (RGB) values, encoded with the Perceptual Quantizer (PQ), within the source image. In some other embodiments, the Smid value may represent an average or median value of a selected region (e.g., a face). Where L1 metadata is not available, Smid may be calculated, or a value (such as 0.36) may be assumed.
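For illustration, a minimal Python sketch of the Smid estimate described above, assuming PQ-encoded pixel data (the function name, array layout, and fallback constant are assumptions, not part of the source):

import numpy as np

def estimate_smid(rgb_pq):
    # Mean of the per-pixel maximum PQ-encoded color component,
    # per the estimate described above. rgb_pq: H x W x 3 array in [0, 1].
    return float(np.mean(np.max(rgb_pq, axis=2)))

DEFAULT_SMID = 0.36  # assumed when no L1 metadata is available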
The source image may further include upcoming (e.g., future) image frames. The upcoming image frame may be the frame immediately following the current image frame or an image frame a few frames after the current image frame. Thus, the production phase (110) may receive the current image frame, the upcoming image frame, and/or both.
The method (200) includes determining an ambient light level value based on the ambient light level at step (210). For example, the ambient light level is determined for the area surrounding the reference display (125). This may be accomplished using an ambient light sensor that detects the luminance value, the color of the light, and the like. The ambient light level value may also be determined by communicating with smart devices, such as smart light bulbs, that are capable of reporting the color of the light they provide to an external device. The time of day may be considered when determining the ambient light level value, as the time of day may be associated with typical lighting conditions (e.g., brighter during the day, darker during the night). When the ambient light level value is unknown and cannot be determined, a default value of 5 cd/m² may be used.
The method (200) includes calculating an incident light intensity at step (215). The incident light intensity provides an estimate of the light falling on the viewer's eye and may be based on the ambient light level value from step (210) and the average luminance value from step (205). For example, fig. 3 shows a display (300) (e.g., the reference display (125)) within an ambient environment (305). The display (300) has a first luminance value, such as the average luminance value from step (205). The ambient environment (305) has a second luminance value, such as the ambient light level value from step (210). The size of the display (300) relative to the ambient environment (305) depends on the screen size of the display (300) and the viewing distance (e.g., how far the display (300) is from the viewer's eyes). The cross-section line (310) is discussed below in conjunction with fig. 4.
The incident light intensity may be calculated using a cosine cube function. For example, fig. 4 provides a graph (400) illustrating a cosine cube falloff function (415). The graph (400) is a one-dimensional cross-section along the line (310) of fig. 3. The graph (400) includes an average luminance value (405) corresponding to the luminance of the display (300) and an ambient light level value (410) corresponding to the ambient environment (305). In some embodiments, the cosine cube falloff function (415) is multiplied by the average luminance value (405) and the ambient light level value (410) to scale the cosine cube falloff function (415) for a given scene. The Y coordinate provides the luminance value and the X coordinate provides the visibility value. In some embodiments, the cosine cube falloff function (415) is scaled such that the area under the curve is 1 over a maximum extent of 45 degrees. While the cosine cube falloff function (415) provides an estimate of the experienced light intensity based on gaze location, other embodiments based on the visual impact of the ambient light level may be envisaged. For example, the same method may be adapted for two-dimensional and three-dimensional scenes.
As a specific example of the operation of step (215), pseudo code for calculating the incident light intensity using the average luminance value (405), the ambient light level value (410), and the cosine cube falloff function (415) is outlined below.
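A minimal Python sketch of one plausible reading of this computation, assuming a one-dimensional cross-section and an illustrative display half-angle (the discretization and default values are assumptions; display_Y, surround_Y, and incidentLum match the variable names discussed below):

import numpy as np

def incident_luminance(display_Y, surround_Y, display_half_angle_deg=15.0):
    # Estimate incidentLum, the luminance incident on the viewer's eye,
    # by weighting the display and surround luminances with a cosine-cubed
    # falloff about the gaze direction, normalized so the area under the
    # curve is 1 over a +/-45 degree extent.
    theta_deg = np.linspace(-45.0, 45.0, 1801)     # 1-D cross-section
    theta = np.radians(theta_deg)
    w = np.cos(theta) ** 3                         # cosine cube falloff
    w /= np.trapz(w, theta)                        # scale to unit area
    L = np.where(np.abs(theta_deg) <= display_half_angle_deg,
                 display_Y, surround_Y)            # screen vs. surround
    return float(np.trapz(w * L, theta))           # incidentLum, in cd/m^2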
The incident light intensity may be determined for both the current image frame and the upcoming image frame. In some embodiments, while the ambient light level value (410) remains unchanged, the average luminance value (405) may vary based on the image provided on the display (300). The difference between the incident light intensity of the current image frame and the incident light intensity of the upcoming image frame may then be determined, as discussed further below. In the pseudo code above, incidentLum provides an estimate of the intensity of incident light falling on the viewer's eye, which may affect or alter the pupil diameter of the eye (e.g., a spatially weighted corneal flux, in cd/m²), display_Y represents the average luminance value, and surround_Y represents the ambient light level value.
Returning to fig. 2, the method (200) includes determining a difference between the current pupil size and the target pupil size at step (220). The current pupil size is a function of the incident light intensity and relates to how the pupil reacts at the instant the upcoming image frame is displayed (i.e., when the switch occurs between the current image frame and the upcoming image frame). In some embodiments, the current pupil size is the result of continuous filtering, representing the adaptation of the pupil over time (e.g., from frame to frame). In this way, the current pupil size may be a function of all previous frames, and the manner in which the pupil size varies from one frame to the next may also be referred to as the adaptation of the pupil size. In some embodiments, a sensor, such as an eye-tracking sensor, may be used to determine the current pupil size. The sensor observes the pupil and provides a measurement of the pupil size. The target pupil size relates to the instantaneous desired pupil diameter for the upcoming image frame. In this way, the target pupil size may not take previous frames into account; it represents the pupil diameter that would be reached by gazing at the upcoming image frame for a long period of time. The difference between the current pupil size and the target pupil size may be defined as the delta pupillary response, as shown in equation 1:
ΔPR = d_currentPupil − d_targetPupil    (Equation 1)

where:

ΔPR = delta pupillary response
d_currentPupil = diameter of the current pupil size
d_targetPupil = diameter of the target pupil size
Fig. 5 provides a steady-state pupil model for estimating and/or predicting pupil diameter as a function of incident light intensity. The steady-state pupil model presented in fig. 5 is adapted directly from the paper "A Unified Formula for Light-Adapted Pupil Size" by A. B. Watson and J. I. Yellott, which is incorporated herein by reference in its entirety. Alternative models and methods may be utilized to determine the steady-state pupil size. In the model of fig. 5, the constant rate of pupil constriction is 3 mm/s and the constant rate of pupil dilation is 0.5 mm/s. However, other constant rates of constriction and dilation may be utilized. In one embodiment, the color of the image may also affect the rate of constriction. Pupil diameter may further vary based on the field area, the number of eyes, and age. The pupil will approach a steady-state pupil size based on the incident light intensity, constricting or dilating depending on whether the target pupil size is smaller or larger than the current pupil size. To determine how the pupil changes in response to the incident light intensity, the speeds of both constriction and dilation need to be evaluated at the sampling rate of the image frames (mathematically, the inverse of the image frame rate). As one specific example, pseudo code for determining the difference (e.g., the delta pupillary response) between the current pupil size and the target pupil size, as in step (220), is outlined below. In this example, the exact method of estimating the steady-state pupil size is not critical.
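A minimal Python sketch of this step, assuming the Stanley-Davies steady-state formula (one of the models unified by Watson and Yellott) and a fixed adapting field area; the function names and default values are illustrative:

def steady_state_pupil_mm(luminance, field_area_deg2=900.0):
    # Steady-state (target) pupil diameter in mm for a given incident
    # luminance in cd/m^2; the fixed field area is an assumption.
    x = (luminance * field_area_deg2 / 846.0) ** 0.41
    return 7.75 - 5.75 * (x / (x + 2.0))

def update_pupil(current_mm, incident_lum, dt):
    # Advance the current pupil diameter over one sampling interval dt
    # (normally 1 / frame rate) toward the steady-state target, limited
    # to 3 mm/s constriction and 0.5 mm/s dilation; return the new
    # diameter and the delta pupillary response of equation 1.
    target_mm = steady_state_pupil_mm(incident_lum)
    if target_mm < current_mm:
        current_mm = max(target_mm, current_mm - 3.0 * dt)   # constrict
    else:
        current_mm = min(target_mm, current_mm + 0.5 * dt)   # dilate
    return current_mm, current_mm - target_mm

In this sketch, the returned delta pupillary response is positive while the pupil is still constricting toward a smaller target and negative while it is still dilating, matching the sign convention used for CE values below.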
In the pseudo code above, currentPupil is the pupil diameter given the frame to be displayed and the duration of that frame.
Returning to fig. 2, the method (200) includes generating an output image at step (225). The output image may be generated by modifying the source image (e.g., the video data (107)) based on a determined brightness adjustment factor. The brightness adjustment factor may be based on, for example, the delta pupillary response. To minimize the discomfort experienced by a viewer of the video data (107), it is desirable to establish a relationship between the delta pupillary response (ΔPR) and the experienced discomfort; equation 2 models the perceived discomfort as an exponential function of ΔPR.
While equation 2 provides an exponential function, other functions, such as a cubic roll-off function, may be used to determine the perceived discomfort. The perceived discomfort is taken into account when generating the output image at step (225). As a specific example, pseudo code for converting the perceived discomfort into an "authoring experience" (CE) value, a value indicative of the pupil size change, is outlined below. In particular, CE values describe how an observer reacts to changes in light intensity level over the duration of the presented content.
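A minimal Python sketch of such a conversion, assuming an exponential discomfort weighting in place of equation 2 (the gain k and the function name are illustrative):

import math

def authoring_experience(delta_pr, k=1.0):
    # Convert a delta pupillary response into a signed CE value:
    # positive when the pupil is constricting (delta_pr > 0), negative
    # when it is dilating (delta_pr < 0), and zero when the pupil
    # diameter is constant.
    if delta_pr == 0.0:
        return 0.0
    discomfort = math.expm1(k * abs(delta_pr))   # exponential in |dPR|
    return math.copysign(discomfort, delta_pr)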
In the example provided, the function used to determine the brightness of the output image is chosen such that the CE value is negative when the pupil dilates, positive when the pupil constricts, and zero when the pupil diameter is constant (e.g., unchanged). Fig. 6 provides an example of CE values determined over a number of image frames. Fig. 6 contains several instances of changes between bright and dark luminance values. For example, the pupil dilates during the drops (600) and (602), indicating changes to dimmer luminance values at (600) and (602). The pupil constricts at the rises (604) and (606), indicating changes to brighter luminance values at (604) and (606). A CE value is determined for each frame included in the video content and converted into metadata included in the encoded bitstream (122).
Method for implementing authoring experience metadata
When decoding the encoded bitstream (122), the decoding unit (130) processes CE values included in the metadata and may adjust the decoded signal (132) accordingly. Fig. 7 provides a method (700) for decoding an encoded bitstream (122). The method (700) includes receiving an input image, such as data included in an encoded bitstream (122), at step (705). The method (700) includes determining an ambient light level value based on the ambient light level at step (710). For example, the ambient light level of the area surrounding the target display (140) is determined, as described above with respect to step (210).
At step (715), the method (700) selects a tone mapping curve based on characteristics of the target display (140). For example, a "Tmid" value is calculated that provides an estimate of the average luminance (e.g., a target luminance value) of an image described by the encoded bitstream (122) when the image is displayed on the target display (140). A method for determining tone curves for display mapping of High Dynamic Range (HDR) images is described in U.S. Patent No. 10,600,166, "Tone Curve Mapping for High Dynamic Range Images," by J. Pytlarz and R. Atkins, which is incorporated herein by reference in its entirety. The tone mapping curve may be further adjusted using a function to make the input image brighter (resulting in higher CE values) or darker (resulting in lower CE values). As a specific example, pseudo code for calculating the Tmid value is outlined below.
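A minimal Python sketch of such a Tmid computation, assuming the tone curve maps PQ-encoded source luminance to PQ-encoded target luminance and that the brightness "slider" is an additive offset (both assumptions):

def compute_tmid(smid_pq, tone_curve, slider=0.0):
    # Estimate Tmid, the average luminance of the image as shown on the
    # target display, by passing the PQ-encoded source average (Smid)
    # through the display-mapping tone curve; slider > 0 brightens and
    # slider < 0 darkens. Clamping to the PQ range [0, 1] is assumed.
    return min(1.0, max(0.0, tone_curve(smid_pq) + slider))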
The predicted CE value for a given target display (140) may be determined by using a dynamic look-up table that relates CE values to input tone curve parameters. This may be used to ensure that the selected tone mapping curve is appropriate for the target display (140). Once the tone mapping curve is determined, it may be used to determine the predicted CE value. For example, for each frame included in the encoded bitstream (122), the Tmid value, the incident light intensity, and the viewer's current and target pupil sizes are determined using the tone mapping curve values, as described above with respect to the method (200). These values are used to calculate the predicted CE value for each frame. The predicted CE value may then be compared to the received CE value to determine how to adjust the brightness of each image to achieve the desired output. As a specific example, pseudo code for determining the intersection between the predicted CE values and the actually provided CE value, as received in the metadata of the encoded bitstream (122), is outlined below.
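A minimal Python sketch of locating this intersection, assuming predict_ce(slider) evaluates the Tmid and pupil computations above for the tone curve produced by a given slider value:

def find_slider(reference_ce, predict_ce):
    # Scan candidate slider values and return the one whose predicted CE
    # value is closest to the reference CE carried in the metadata,
    # i.e., the intersection plotted in fig. 8.
    sliders = [s / 100.0 for s in range(-100, 101)]   # -1.00 .. 1.00
    return min(sliders, key=lambda s: abs(predict_ce(s) - reference_ce))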
Fig. 8 illustrates a graph providing the intersection between the predicted CE values and the provided CE value (e.g., the reference CE value). In the example of fig. 8, the provided CE value (indicated by the dashed line) is −1. The solid line provides the calculated CE value, as determined from the Tmid value calculated by the given device, at each slider value. The intersection provides the slider value that the target display (140) must use to output the desired CE value. In the example of fig. 8, the slider value is about −0.15. The brightness of the image will therefore be reduced to output the desired CE value.
Returning to FIG. 7, at step (720), the method (700) displays the output image using the slider value determined at step (715). Thus, the output image has an authoring experience corresponding to the light intensity indicated by the CE value included in the metadata of the encoded bitstream (122). In other words, the output image is brightened or darkened to reach the brightness indicated by the CE value. The method (700) is performed for each frame provided in the video data such that the output of all frames has an authoring experience value intended by the content creator.
Example embodiment
By using one tone mapping curve per device, the authoring experience desired by the content creator of the video data can be achieved regardless of the capabilities of the user device, thereby narrowing the viewing differences between devices such as home theaters and mobile phones. The methods described herein may also be used to reduce discomfort caused by changes in luminance in a variety of other situations. One such embodiment includes reducing viewer discomfort during a fast-forward function. When fast forward is initiated, frames are skipped through and displayed quickly. A strobing effect may occur when the frames change from dark to bright and vice versa. To address this issue, the sampling rate at which the pupil calculations are made at steps (220) and (715) may be increased based on the rate of the fast forward. For example, in a video, a person leaves a cave and steps into bright sunlight, resulting in a luminance that goes from dark to bright. This may last 10 seconds. However, if fast forward is initiated such that the scene lasts 5 seconds, the pupil adaptation rate changes significantly. In this case, the mapping of CE values to tone mapping curves can be adjusted by reducing the adaptation duration by 50% to account for this time variation. The speed of constriction and dilation will increase proportionally based on the fast-forward speed.
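As a sketch of this adjustment, assuming the pupil model is evaluated once per content frame (with update_pupil as outlined earlier):

def fast_forward_dt(frame_rate_hz, speed):
    # Wall-clock adaptation time attributed to each content frame during
    # fast forward: at speed = 2.0, a 10-second scene is traversed in
    # 5 seconds, halving the adaptation duration as described above.
    return 1.0 / (frame_rate_hz * speed)

# Pass this shortened interval to update_pupil in place of the normal
# 1 / frame_rate sampling interval.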
Another embodiment in which the methods disclosed herein may be implemented is a zoom function. When zoom is activated, the focus turns to a portion of the image, which may significantly change the average brightness of the image. The user may move between portions of the image itself, switching between dark and bright areas of interest, resulting in discomfort. Knowledge of the luminance properties of the magnified region allows for dimming or brightening to more closely achieve the desired authoring experience.
The methods described herein may also be used for volumetric experiences, such as virtual reality games. The viewer adapts to some degree to the computer-generated landscape in a virtual reality experience. The abrupt appearance of bright objects or reflections may be noticeable to the viewer as the viewer moves and looks around. Similar adjustments may be made to the image or object being viewed to limit the viewing discomfort caused by the change in luminance. In some embodiments, multiple users may be in the same virtual reality experience at the same time. Each user may see different objects and thus have a different experience, which may be advantageous to a certain player. The luminance may be adjusted to further balance each player's experience and level the playing field.
Advertisements may be inserted into the video that a user is watching, which may greatly change the average luminance value displayed, regardless of the type of device. In this case, the CE values can be used to achieve a smooth and comfortable transition. For example, during a dark scene in a movie, an advertisement with bright luminance values is provided. Using the CE values of both the movie and the advertisement, the provided image frames compensate for the abrupt change by decreasing the brightness of the advertisement and slowly increasing the brightness over time while the advertisement plays, and then fading back down to match the CE value of the movie.
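A minimal Python sketch of such a transition, assuming a linear ramp between the movie's CE level and the advertisement's CE level (the ramp shape and the 2-second default are illustrative):

def ad_ce_target(t, ad_duration, movie_ce, ad_ce, ramp_s=2.0):
    # Blend the target CE during an inserted advertisement: start at the
    # movie's CE level, ramp toward the ad's CE level, and fade back
    # down near the end of the ad.
    up = min(1.0, t / ramp_s)
    down = min(1.0, (ad_duration - t) / ramp_s)
    w = max(0.0, min(up, down))
    return movie_ce + w * (ad_ce - movie_ce)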
Viewers of video content may wish to set brightness preferences, such as limiting certain levels of discomfort specific to the viewer and/or the device. For example, very bright content may be limited to remain below the authoring experience value indicated by the CE metadata. In addition, brightness jumps may be limited. For example, fig. 9A provides an example of luminance adjustment that limits the discomfort experienced while preserving the intended luminance variation: the overall intended viewing experience is maintained, but the extreme luminance values are limited to minimize changes in the pupillary response. In other embodiments, a hard cutoff may be used instead of a general scalar. For example, fig. 9B provides an example of brightness adjustment employing a strict threshold. User preferences may be further partitioned based on constriction and dilation information. For example, only constriction may have a set threshold, or only dilation may have a set threshold. Constriction and dilation may also have distinct thresholds that differ from each other.
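A minimal Python sketch of such preference limits, assuming a smooth compression for the scalar case of fig. 9A and a strict clamp for fig. 9B (the limit values and curve shapes are illustrative):

import math

def limit_ce(ce, constrict_limit=2.0, dilate_limit=2.0, hard=False):
    # Apply viewer brightness preferences to a CE value. hard=True uses
    # a strict threshold (fig. 9B); hard=False compresses extremes
    # smoothly while preserving the overall variation (fig. 9A).
    # Constriction (ce > 0) and dilation (ce < 0) may have distinct limits.
    limit = constrict_limit if ce >= 0 else dilate_limit
    if hard:
        return math.copysign(min(abs(ce), limit), ce)
    return math.copysign(limit * math.tanh(abs(ce) / limit), ce)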
In some implementations, the methods described herein may be used to provide a viewing experience similar to the way the content was originally captured. For example, a user may capture an image with a camera. The image is processed such that it includes CE metadata indicating the luminance values of the original scene. Thus, when the picture is viewed on other devices, the image is presented in a manner similar to the way it was originally captured.
The video transmission system and method described above may provide brightness adjustment based on viewer-adapted status. Systems, methods, and devices according to the present disclosure may employ any one or more of the following configurations.
(1) A video transmission system for brightness adjustment based on viewer adaptation status, the video transmission system comprising a processor for performing post-production editing of video data, the processor configured to: receiving a source image, the source image comprising a current image frame, the current image frame comprising metadata corresponding to an average luminance value of the current image frame, and the source image comprising an upcoming image frame, the upcoming image frame comprising metadata corresponding to an average luminance value of the upcoming image frame; determining an ambient light level value based on the ambient light level; determining an incident light intensity value based on the ambient light intensity value and the average light intensity value for the current image frame and the upcoming image frame; determining a difference between the current pupil size and the target pupil size; wherein the target pupil size is determined based on an incident light intensity value of the upcoming image frame, and wherein the current pupil size is determined based on an incident light intensity value of the current image frame and one or more previous image frames; and generating an output image by modifying the source image based on a brightness adjustment factor that is a function of a difference between the current pupil size and the target pupil size.
(2) The video transmission system of (1), wherein determining the ambient light level value comprises at least one of: the ambient light level value is received from one or more ambient light sensors, the ambient light level value is received from one or more intelligent lighting devices, or the ambient light level value is determined based on a time of day.
(3) The video transmission system of any one of (1) to (2), wherein determining the incident light intensity value comprises applying a cosine cube function to the average light intensity value and the ambient light intensity value to obtain an average adaptation state.
(4) The video transmission system of (3), wherein the current pupil size and the target pupil size are adjusted based on the average adaptation state.
(5) The video transmission system of (3), wherein the cosine cube function is scaled such that the integral of the cosine cube function over a 45 ° range is 1.
(6) The video transmission system according to any one of (1) to (5), wherein the output image includes metadata corresponding to an average luminance value of the source image.
(7) The video transmission system according to any one of (1) to (6), wherein the luminance adjustment factor is negative when a difference between the current pupil size and the target pupil size is negative, and wherein the luminance adjustment factor is positive when a difference between the current pupil size and the target pupil size is positive.
(8) The video transmission system of any one of (1) to (7), wherein the output image includes metadata indicating a desired pupil size change between the current image frame and the upcoming image frame given an infinite adaptation time.
(9) The video transmission system of any one of (1) to (8), wherein the brightness adjustment factor is based on an estimated discomfort value, the estimated discomfort value being based on the difference between the current pupil size and the target pupil size.
(10) The video transmission system according to any one of (1) to (9), wherein the incident light intensity value is an estimate of light on a pupil of a viewer of the video data.
(11) A method of brightness adjustment based on viewer adaptation status, the method comprising: receiving a source image, the source image comprising a current image frame, the current image frame comprising metadata corresponding to an average luminance value of the current image frame, and the source image comprising an upcoming image frame, the upcoming image frame comprising metadata corresponding to an average luminance value of the upcoming image frame; determining an ambient light level value based on the ambient light level; determining an incident light intensity value based on the ambient light intensity value and the average light intensity value for the current image frame and the upcoming image frame; determining a difference between the current pupil size and the target pupil size; wherein the target pupil size is determined based on an incident light intensity value of the upcoming image frame, and wherein the current pupil size is determined based on an incident light intensity value of the current image frame and one or more previous image frames; and generating an output image by modifying the source image based on a brightness adjustment factor that is a function of a difference between the current pupil size and the target pupil size.
(12) The method of (11), wherein determining the ambient light level value comprises at least one of: the ambient light level value is received from one or more ambient light sensors, the ambient light level value is received from one or more intelligent lighting devices, or the ambient light level value is determined based on a time of day.
(13) The method of any one of (11) to (12), wherein determining the incident light intensity value comprises applying a cosine cube function to the average light intensity value and the ambient light intensity value to obtain an average adaptation state.
(14) The method of (13), wherein the current pupil size and the target pupil size are adjusted based on the average adaptation state.
(15) The method of (13), wherein the cosine cube function is scaled such that the integral of the cosine cube function over a 45 ° range is 1.
(16) The method of any one of (11) to (15), wherein the output image includes metadata corresponding to an average luminance value of the source image.
(17) The method of any one of (11) to (16), wherein the brightness adjustment factor is negative when the difference between the current pupil size and the target pupil size is negative, and wherein the brightness adjustment factor is positive when the difference between the current pupil size and the target pupil size is positive.
(18) The method of any of (11) to (17), wherein the output image includes metadata indicating a desired pupil size change between the current image frame and the upcoming image frame given an infinite adaptation time.
(19) The method of any one of (11) to (18), wherein the brightness adjustment factor is based on an estimated discomfort value, the estimated discomfort value being based on the difference between the current pupil size and the target pupil size.
(20) A non-transitory computer-readable medium storing instructions that, when executed by an electronic processor, cause the electronic processor to perform operations comprising the method of any one of (11) to (19).
(21) A video transmission system for brightness adjustment based on viewer adaptation status, the transmission system comprising a processor for decoding a received encoded bitstream, the processor configured to: receiving an input image, the input image comprising a current image frame, an upcoming image frame, and metadata corresponding to an expected pupil size change; determining a target luminance value for the current image frame and the upcoming image frame; determining an ambient light level value based on the ambient light level; determining an incident light intensity value based on the ambient light intensity value and the target light intensity value for the current image frame and the upcoming image frame; selecting a tone mapping curve based on characteristics of a device configured to provide the image; determining a difference between a current pupil size and a target pupil size, wherein the target pupil size is determined based on an incident light intensity of the upcoming image frame, and wherein the current pupil size is determined based on incident light intensity of the current image frame and one or more previous image frames; altering the tone mapping curve based on the expected pupil size change and a difference between the current pupil size and the target pupil size; and applying the modified tone mapping curve to the input image to generate an output image.
(22) The video transmission system of (21), wherein the modified tone mapping curve is negative when the difference between the current pupil size and the target pupil size is negative, and wherein the modified tone mapping curve is positive when the difference between the current pupil size and the target pupil size is positive.
(23) The video transmission system of any one of (21) to (22), wherein determining the tone mapping curve includes comparing the target luminance value of the current image frame to a look-up table to obtain an input tone curve parameter.
(24) The video transmission system of (23), wherein determining the tone mapping curve includes determining an intersection between the target luminance value of the current image frame and the input tone curve parameter.
(25) The video transmission system of any one of (21) to (24), wherein the processor is further configured to receive a minimum value and a maximum value of the modified tone mapping curve via user input.
(26) The video transmission system of any one of (21) to (25), wherein determining the ambient light level value comprises at least one of: the ambient light level value is received from one or more ambient light sensors, the ambient light level value is received from one or more intelligent lighting devices, or the ambient light level value is determined based on a time of day.
(27) The video transmission system of any one of (21) to (26), wherein determining a difference between the current pupil size and the target pupil size further comprises adjusting a sampling rate when a start fast forward event is detected.
(28) The video transmission system of any one of (21) to (27), wherein applying the modified tone mapping curve to the input image increases a target luminance value of the input image.
(29) The video transmission system of any one of (21) to (28), wherein applying the modified tone mapping curve to the input image reduces a target luminance value of the input image.
(30) The video transmission system of any one of (21) to (29), wherein the maximum and minimum values of the tone mapping curve are adjusted based on user preference settings.
(31) A method of brightness adjustment based on viewer adaptation status, the method comprising: receiving an input image, the input image comprising a current image frame, an upcoming image frame, and metadata corresponding to an expected pupil size change; determining a target luminance value for the current image frame and the upcoming image frame; determining an ambient light level value based on the ambient light level; determining an incident light intensity value based on the ambient light intensity value and the target light intensity value for the current image frame and the upcoming image frame; selecting a tone mapping curve based on characteristics of a device configured to provide the image; determining a difference between a current pupil size and a target pupil size, wherein the target pupil size is determined based on an incident light intensity of an upcoming image frame, and wherein the current pupil size is determined based on incident light intensity of a current image frame and one or more previous image frames; altering the tone mapping curve based on the expected pupil size change and a difference between the current pupil size and the target pupil size; and applying the modified tone mapping curve to the input image to generate an output image.
(32) The method of (31), wherein the modified tone mapping curve is negative when the difference between the current pupil size and the target pupil size is negative, and wherein the modified tone mapping curve is positive when the difference between the current pupil size and the target pupil size is positive.
(33) The method of any one of (31) to (32), wherein determining the tone mapping curve comprises comparing the target luminance value of the current image frame to a look-up table to obtain an input tone curve parameter.
(34) The method of (33), wherein determining the tone mapping curve includes determining an intersection between the target luminance value of the current image frame and the input tone curve parameter.
(35) The method of any one of (31) to (34), further comprising receiving a minimum value and a maximum value of the modified tone mapping curve through user input.
(36) The method of any one of (31) to (35), wherein determining the ambient light level value comprises at least one of: the ambient light level value is received from one or more ambient light sensors, the ambient light level value is received from one or more intelligent lighting devices, or the ambient light level value is determined based on a time of day.
(37) The method of any one of (31) to (36), wherein determining the difference between the current pupil size and the target pupil size further comprises adjusting the sampling rate upon detection of a start fast forward event.
(38) The method of any one of (31) to (37), wherein applying the modified tone mapping curve to the input image increases a target luminance value of the input image.
(39) The method of any one of (31) to (38), wherein a maximum value and a minimum value of the tone mapping curve are adjusted based on user preference settings.
(40) A non-transitory computer-readable medium storing instructions that, when executed by an electronic processor, cause the electronic processor to perform operations comprising the method of any one of (31) to (39).
With respect to the processes, systems, methods, heuristics, etc. described herein, it should be understood that, while the steps of such processes, etc. have been described as occurring in a particular ordered sequence, such processes may be practiced with the described steps performed in an order different than that described herein. It is further understood that certain steps may be performed concurrently, other steps may be added, or certain steps described herein may be omitted. In other words, the process descriptions herein are provided for the purpose of illustrating certain embodiments and should in no way be construed as limiting the claims.
Accordingly, it is to be understood that the above description is intended to be illustrative, and not restrictive. Many embodiments and applications other than the examples provided will be apparent from a reading of the above description. The scope should be determined not with reference to the above description, but should instead be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. It is anticipated and intended that the technology discussed herein will evolve in the future, and that the disclosed systems and methods will be incorporated into such future embodiments. In summary, it should be understood that the application is capable of modification and variation.
All terms used in the claims are intended to be given their broadest reasonable constructions and their ordinary meanings as understood by those skilled in the technologies described herein, unless an explicit indication to the contrary is made herein. In particular, the use of singular articles such as "a," "the," "said," and the like should be understood to recite one or more of the indicated elements unless a claim recites an explicit limitation to the contrary.
The Abstract of the disclosure is provided to allow the reader to quickly ascertain the nature of the technical disclosure. This Abstract is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing detailed description, it can be seen that various features are grouped together in various embodiments for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments incorporate more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the detailed description, with each claim standing on its own as a separately claimed subject matter.

Claims (17)

1. A video transmission system for brightness adjustment based on viewer adaptation status, the transmission system comprising:
a processor for performing post-production editing of video data, the processor configured to:
receiving a source image, the source image comprising a current image frame, the current image frame comprising metadata corresponding to an average luminance value of the current image frame, and the source image comprising an upcoming image frame, the upcoming image frame comprising metadata corresponding to an average luminance value of the upcoming image frame;
determining an ambient light level value based on the ambient light level;
determining an incident light intensity value based on the ambient light intensity value and the average light intensity value for the current image frame and the upcoming image frame;
determining a current pupil size and a target pupil size using a model that estimates pupil size from incident light intensity, wherein the target pupil size is determined based on the incident light intensity value of the upcoming image frame, and wherein the current pupil size is determined based on the incident light intensity value of the current image frame and one or more previous image frames;
determining a difference between the current pupil size and the target pupil size; and
generating an output image by including, in the source image, metadata indicating an expected pupil size change between the current image frame and the upcoming image frame, wherein the metadata indicating the expected pupil size change is determined from the difference between the current pupil size and the target pupil size.
2. The video transmission system of claim 1, wherein determining the ambient light level value comprises at least one of: receiving the ambient light level value from one or more ambient light sensors, receiving the ambient light level value from one or more intelligent lighting devices, or determining the ambient light level value based on a time of day.
3. The video transmission system of any one of claims 1 to 2, wherein determining the incident light intensity value comprises applying a cosine cube function to the average light intensity value and the ambient light intensity value to obtain an average adaptation state.
4. The video transmission system of claim 3, wherein the current pupil size and the target pupil size are adjusted based on the average adaptation state.
5. The video transmission system of claim 3, wherein the cosine cube function is scaled such that the integral of the cosine cube function over 45° is 1.
6. The video transmission system of any one of claims 1 to 5, wherein a brightness adjustment factor is based on an estimated inadequacy value, the estimated inadequacy value being based on the difference between the current pupil size and the target pupil size.
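Claims 3 and 5 pin down one concrete detail: frame and ambient luminance are combined through a cosine-cube function normalized so that its integral over 45° equals 1. The sketch below works out that normalization; the geometry (how the angle splits between screen and surround) is not defined in the claims, so the `screen_half_angle_deg` split is purely an illustrative assumption.

```python
import math

# Closed form: the integral of cos^3(theta) is sin(theta) - sin^3(theta)/3.
def cos3_integral(theta_rad: float) -> float:
    return math.sin(theta_rad) - math.sin(theta_rad) ** 3 / 3.0

# Scale so the weight integrates to 1 over 45 degrees (claim 5); SCALE ~ 1.697.
SCALE = 1.0 / cos3_integral(math.radians(45.0))

def cos3_weight(theta_rad: float) -> float:
    """Scaled cosine-cube weight per claims 3 and 5."""
    return SCALE * math.cos(theta_rad) ** 3

def average_adaptation(avg_frame_nits: float, ambient_nits: float,
                       screen_half_angle_deg: float = 20.0) -> float:
    """Blend frame and ambient luminance into an average adaptation state.

    Assumption: the screen occupies the central `screen_half_angle_deg`
    of the 45-degree field and the ambient surround fills the remainder.
    """
    screen_frac = SCALE * cos3_integral(math.radians(screen_half_angle_deg))
    return screen_frac * avg_frame_nits + (1.0 - screen_frac) * ambient_nits
```

Per claim 4, the current and target pupil sizes would then be adjusted using the value returned by `average_adaptation` rather than the raw frame luminance.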
7. A video transmission system for brightness adjustment based on viewer adaptation status, the transmission system comprising:
a processor for decoding a received encoded bitstream, the processor configured to:
receiving an input image, the input image comprising a current image frame and an upcoming image frame, and metadata indicating an expected pupil size change between the current image frame and the upcoming image frame;
determining a target luminance value for the current image frame and the upcoming image frame;
determining an ambient light level value based on the ambient light level;
determining an incident light intensity value based on the ambient light intensity value and the target light intensity value for the current image frame and the upcoming image frame;
selecting a tone mapping curve based on characteristics of a device configured to provide the image;
determining a current pupil size and a target pupil size using a model that estimates pupil size from incident light intensity, wherein the target pupil size is determined based on the incident light intensity of the upcoming image frame, and wherein the current pupil size is determined based on the incident light intensity of the current image frame and one or more previous image frames;
determining a difference between the current pupil size and the target pupil size;
altering the tone mapping curve based on the expected pupil size change and the difference between the current pupil size and the target pupil size; and
applying the modified tone mapping curve to the input image to generate an output image.
8. The video transmission system of claim 7, wherein the modification to the tone mapping curve is negative when the difference between the current pupil size and the target pupil size is negative, and positive when the difference between the current pupil size and the target pupil size is positive.
9. The video transmission system of any one of claims 7 to 8, wherein determining the tone mapping curve includes comparing the target luminance value of the current image frame to a look-up table to obtain an input tone curve parameter.
10. The video transmission system of claim 9, wherein determining the tone mapping curve includes determining an intersection between the target luminance value of the current image frame and the input tone curve parameter.
11. The video transmission system of any of claims 7 to 10, wherein the processor is further configured to receive a minimum value and a maximum value of the modified tone mapping curve via user input.
12. The video transmission system of any one of claims 7 to 11, wherein determining the ambient light level value comprises at least one of: receiving the ambient light level value from one or more ambient light sensors, receiving the ambient light level value from one or more intelligent lighting devices, or determining the ambient light level value based on a time of day.
13. The video transmission system of any one of claims 7 to 12, wherein determining the difference between the current pupil size and the target pupil size further comprises adjusting a sampling rate when the start of a fast-forward event is detected.
14. The video transmission system of any of claims 7 to 13, wherein the maximum and minimum values of the tone mapping curve are adjusted based on user preference settings.
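Claims 9 and 10 describe deriving input tone curve parameters by comparing the current frame's target luminance against a look-up table and finding the intersection. One way to read that is sketched below; the table contents and the (slope, offset) parameterization are invented for illustration, since the claims specify only that a LUT maps target luminance to tone curve parameters.

```python
import bisect

# Placeholder LUT: target luminance (nits) -> (slope, offset) tone curve
# parameters. Values are illustrative, not from the source.
LUT_NITS = [100.0, 300.0, 600.0, 1000.0]
LUT_PARAMS = [(1.00, 0.00), (0.90, 0.02), (0.80, 0.05), (0.70, 0.10)]

def tone_curve_params(target_nits: float) -> tuple[float, float]:
    """Find where the target luminance falls in the LUT and linearly
    interpolate the neighbouring parameter pairs (the 'intersection')."""
    i = bisect.bisect_left(LUT_NITS, target_nits)
    if i == 0:
        return LUT_PARAMS[0]
    if i == len(LUT_NITS):
        return LUT_PARAMS[-1]
    t = (target_nits - LUT_NITS[i - 1]) / (LUT_NITS[i] - LUT_NITS[i - 1])
    (s0, o0), (s1, o1) = LUT_PARAMS[i - 1], LUT_PARAMS[i]
    return (s0 + t * (s1 - s0), o0 + t * (o1 - o0))
```

The interpolated parameters would then seed the tone mapping curve before the pupil-difference modification of claim 7 is applied, with the curve's extremes clamped by the user preference settings of claims 11 and 14.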
15. A method of brightness adjustment based on viewer adaptation status, the method comprising:
receiving a source image, the source image comprising a current image frame, the current image frame comprising metadata corresponding to an average luminance value of the current image frame, and the source image comprising an upcoming image frame, the upcoming image frame comprising metadata corresponding to an average luminance value of the upcoming image frame;
determining an ambient light level value based on the ambient light level;
determining an incident light intensity value based on the ambient light intensity value and the average light intensity value for the current image frame and the upcoming image frame;
determining a current pupil size and a target pupil size using a model that estimates pupil size from incident light intensity, wherein the target pupil size is determined based on the incident light intensity value of the upcoming image frame, and wherein the current pupil size is determined based on the incident light intensity value of the current image frame and one or more previous image frames;
determining a difference between the current pupil size and the target pupil size; and
generating an output image by including, in the source image, metadata indicating an expected pupil size change between the current image frame and the upcoming image frame, wherein the metadata indicating the expected pupil size change is determined from the difference between the current pupil size and the target pupil size.
16. A method of brightness adjustment based on viewer adaptation status, the method comprising:
receiving an input image, the input image comprising a current image frame, an upcoming image frame, and metadata indicating an expected pupil size change between the current image frame and the upcoming image frame;
determining a target luminance value for the current image frame and the upcoming image frame;
determining an ambient light level value based on the ambient light level;
determining an incident light intensity value based on the ambient light intensity value and the target light intensity value for the current image frame and the upcoming image frame;
selecting a tone mapping curve based on characteristics of a device configured to provide the image;
determining a current pupil size and a target pupil size using a model that estimates pupil size from incident light intensity, wherein the target pupil size is determined based on the incident light intensity of the upcoming image frame, and wherein the current pupil size is determined based on the incident light intensity of the current image frame and one or more previous image frames;
altering the tone mapping curve based on the expected pupil size change and a difference between the current pupil size and the target pupil size; and
applying the modified tone mapping curve to the input image to generate an output image.
17. A non-transitory computer-readable medium storing instructions that, when executed by an electronic processor, cause the electronic processor to perform operations comprising the method of claim 15 or claim 16.
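On the post-production side (claims 1 and 15), the same pupil model runs at the encoder, and its output is written into the stream as metadata rather than applied to pixels. A minimal sketch follows, again using the Moon-Spencer formula as a stand-in; the `Frame` container and the `expected_pupil_change_mm` field name are placeholders, since the claims do not define a bitstream syntax.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Frame:
    avg_nits: float                      # average luminance value carried as metadata
    metadata: dict = field(default_factory=dict)

def pupil_diameter_mm(incident_nits: float) -> float:
    # Moon-Spencer stand-in, as in the earlier sketch.
    return 4.9 - 3.0 * math.tanh(0.4 * math.log10(max(incident_nits, 1e-4)))

def annotate_expected_pupil_change(previous: list[Frame], current: Frame,
                                   upcoming: Frame, ambient_nits: float) -> None:
    """Embed the expected pupil size change between the current and
    upcoming frames as metadata in the source image (claims 1 and 15)."""
    # Current adaptation reflects the current frame and previous frames.
    history = [f.avg_nits for f in previous] + [current.avg_nits]
    current_incident = sum(history) / len(history) + ambient_nits
    target_incident = upcoming.avg_nits + ambient_nits
    diff = pupil_diameter_mm(current_incident) - pupil_diameter_mm(target_incident)
    current.metadata["expected_pupil_change_mm"] = diff
```

A downstream decoder per claim 7 would then read this field instead of (or in addition to) re-deriving the pupil difference itself.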