US11388408B2 - Interpolation of reshaping functions - Google Patents
- Publication number
- US11388408B2 (application US17/299,743)
- Authority
- US
- United States
- Prior art keywords
- reshaping
- function
- computed
- functions
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/182—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/1887—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a variable length codeword
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/80—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/98—Adaptive-dynamic-range coding [ADRC]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
Definitions
- the present invention relates generally to images. More particularly, an embodiment of the present invention relates to generating a new reshaping function for HDR imaging by interpolating existing reshaping functions.
- DR dynamic range
- HVS human visual system
- DR may relate to a capability of the human visual system (HVS) to perceive a range of intensity (e.g., luminance, luma) in an image, e.g., from darkest grays (blacks) to brightest whites (highlights).
- DR relates to a ‘scene-referred’ intensity.
- DR may also relate to the ability of a display device to adequately or approximately render an intensity range of a particular breadth.
- DR relates to a ‘display-referred’ intensity.
- Unless a particular sense is explicitly specified to have particular significance at any point in the description herein, it should be inferred that the term may be used in either sense, e.g. interchangeably.
- the term high dynamic range relates to a DR breadth that spans the 14-15 orders of magnitude of the human visual system (HVS).
- the terms visual dynamic range (VDR) or enhanced dynamic range (EDR) may individually or interchangeably relate to the DR that is perceivable within a scene or image by a human visual system (HVS) that includes eye movements, allowing for some light adaptation changes across the scene or image.
- VDR may relate to a DR that spans 5 to 6 orders of magnitude.
- VDR or EDR nonetheless represents a wide DR breadth and may also be referred to as HDR.
- n ≤ 8 (e.g., color 24-bit JPEG images)
- HDR images may also be stored and distributed using high-precision (e.g., 16-bit) floating-point formats, such as the OpenEXR file format developed by Industrial Light and Magic.
- LDR lower dynamic range
- SDR standard dynamic range
- HDR content may be color graded and displayed on HDR displays that support higher dynamic ranges (e.g., from 1,000 nits to 5,000 nits or more).
- OETF non-linear opto-electronic function
- EOTF electro-optical transfer function
- non-linear functions include the traditional “gamma” curve, documented in ITU-R Rec. BT.709 and BT.2020, the “PQ” (perceptual quantization) curve described in SMPTE ST 2084, and the “Hybrid Log-gamma” or “HLG” curve described in Rec. ITU-R BT.2100.
- the term “reshaping” or “remapping” denotes a process of sample-to-sample or codeword-to-codeword mapping of a digital image from its original bit depth and original codewords distribution or representation (e.g., gamma or PQ or HLG, and the like) to an image of the same or different bit depth and a different codewords distribution or representation. Reshaping allows for improved compressibility or improved image quality at a fixed bit rate. For example, without limitation, forward reshaping may be applied to 10-bit or 12-bit PQ-coded HDR video to improve coding efficiency in a 10-bit video coding architecture. In a receiver, after decompressing the received signal (which may or may not be reshaped), the receiver may apply an inverse (or backward) reshaping function to restore the signal to its original codeword distribution and/or to achieve a higher dynamic range.
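- As a minimal sketch of the codeword-to-codeword mapping described above, a reshaping function can be stored as a look-up table indexed by the input codeword. The power-law curve, the function names, and the bit depths below are illustrative assumptions only; they are not the reshaping functions of this description.

```python
def build_forward_lut(in_bits, out_bits, exponent=0.5):
    """Build a hypothetical forward reshaping LUT (assumed power-law curve).

    Maps every codeword of an in_bits representation to a codeword of an
    out_bits representation. The curve itself is a placeholder.
    """
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return [round(out_max * (v / in_max) ** exponent) for v in range(in_max + 1)]


def apply_lut(lut, pixels):
    """Reshape a sequence of pixel codewords, one table lookup per sample."""
    return [lut[p] for p in pixels]


# Example: reshape three 10-bit codewords into an 8-bit representation.
lut = build_forward_lut(10, 8)
sdr_pixels = apply_lut(lut, [0, 256, 1023])
```

- A backward reshaping function can be stored the same way, indexed by the reshaped codeword.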
- Reshaping can be static or dynamic. In static reshaping, a single reshaping function is generated and used for a single stream or across multiple streams. In dynamic reshaping, the reshaping function may be customized based on the input video stream characteristics, which can change at the stream level, the scene level, or even at the frame level. Dynamic reshaping is preferable; however, certain devices may not have enough computational power to support it. As appreciated by the inventors here, improved techniques for efficient image reshaping when displaying video content, especially HDR content, are desired.
- FIG. 1A depicts an example single-layer encoder for HDR data using a reshaping function
- FIG. 1B depicts an example HDR decoder corresponding to the encoder of FIG. 1A ;
- FIG. 2 depicts an example process for building a set of basis reshaping functions and applying reshaping function interpolation according to an embodiment of this invention
- FIG. 3A depicts an example process for asymmetric reshaping in an encoder according to an embodiment of this invention.
- FIG. 3B depicts an example process for the interpolation of reshaping functions in a decoder according to an embodiment of this invention.
- Image reshaping techniques for the efficient coding of images are described herein.
- given a new reshaping parameter $r$, where $r^{(l)} < r < r^{(l+1)}$, a new reshaping function may be generated by interpolating reshaping function parameters from the given set.
- Example embodiments described herein relate to image reshaping.
- a processor accesses a first set of basis reshaping functions, wherein a basis reshaping function maps pixel codewords from a first codeword representation to a second codeword representation and each reshaping function is characterized by a reshaping-index parameter identifying the reshaping function.
- the processor receives an input image in the first codeword representation and a desired reshaping parameter, identifies within the first set of basis reshaping functions a first basis reshaping function with a first reshaping-index parameter lower than the desired reshaping parameter and a second basis reshaping function with a second reshaping-index parameter higher than the desired reshaping parameter, generates a reshaping function by interpolating the first basis reshaping function and the second basis reshaping function using the desired reshaping parameter, applies the reshaping function to the input image to generate a reshaped image in the second codeword representation, and codes the reshaped image to generate a coded reshaped image.
- the desired reshaping parameter is different from any reshaping-index parameters of the basis reshaping functions of the first set.
- the basis reshaping functions are basis functions for interpolation.
- the basis reshaping function may be pre-computed.
- the reshaping function is generated by interpolation of the first and second reshaping function by using a desired reshaping parameter which has a value different from any reshaping-index parameters in the first set.
- the desired reshaping parameter has a value between a value of the first reshaping-index parameter and a value of the second reshaping index parameter.
- the interpolated reshaping function is identified by, i.e., corresponds to, the desired reshaping parameter value, which is not present in the set of reshaping functions.
- the reshaping-index parameter and the desired reshaping parameter of the respective reshaping functions comprise a device setting of a device for capturing or displaying the input image or the reshaped image.
- the device setting comprises one of: a luminance, a maximum luminance, an exposure time, a picture mode, or a flash mode of the device.
- a processor receives a coded reshaped image in a first codeword representation and a desired reshaping parameter, it decodes the coded reshaped image to generate a first decoded image in the first codeword representation, it accesses a set of basis reshaping functions, wherein a reshaping function maps pixel codewords from the first codeword representation to a second codeword representation and each reshaping function is characterized by a reshaping-index parameter identifying the reshaping function.
- the desired reshaping parameter is different from any reshaping-index parameters of the pre-computed reshaping functions of the first set.
- the processor identifies within the set of basis reshaping functions a first basis reshaping function with a first reshaping-index parameter lower than the reshaping parameter and a second basis reshaping function with a second reshaping-index parameter higher than the desired reshaping parameter, it generates an output reshaping function based on the first basis reshaping function and the second basis reshaping function, and applies the output reshaping function to the first decoded image to generate an output image in the second codeword representation.
- FIG. 1A and FIG. 1B illustrate an example single-layer backward-compatible codec framework using image reshaping. More specifically, FIG. 1A illustrates an example encoder-side codec architecture, which may be implemented with one or more computing processors in an upstream video encoder. FIG. 1B illustrates an example decoder-side codec architecture, which may also be implemented with one or more computing processors in one or more downstream video decoders.
- corresponding SDR content ( 134 ) (also to be referred as base-layer (BL) or reshaped content) is encoded and transmitted in a single layer of a coded video signal ( 144 ) by an upstream encoding device that implements the encoder-side codec architecture.
- the SDR content is received and decoded, in the single layer of the video signal, by a downstream decoding device that implements the decoder-side codec architecture.
- Backward reshaping metadata ( 152 ) is also encoded and transmitted in the video signal with the SDR content so that HDR display devices can reconstruct HDR content based on the SDR content and the backward reshaping metadata.
- the backward compatible SDR images are generated using a forward reshaping mapping ( 132 ).
- “backward-compatible SDR images” may refer to SDR images that are specifically optimized or color graded for SDR displays.
- a compression block 142 (e.g., an encoder implemented according to any known video coding algorithm, like AVC, HEVC, AV1, and the like) compresses/encodes the SDR images ( 134 ) in a single layer 144 of a video signal.
- the forward reshaping function in 132 is generated using a forward reshaping function generator 130 based on the reference HDR images ( 120 ). Given the forward reshaping function, forward reshaping mapping ( 132 ) is applied to the HDR images ( 120 ) to generate reshaped SDR base layer 134 . In addition, a backward reshaping function generator 150 may generate a backward reshaping function which may be transmitted to a decoder as metadata 152 .
- backward reshaping metadata representing/specifying the optimal backward reshaping functions may include, but are not necessarily limited to only, any of: inverse tone mapping function, inverse luma mapping functions, inverse chroma mapping functions, lookup tables (LUTs), polynomials, inverse display management coefficients/parameters, etc.
- luma backward reshaping functions and chroma backward reshaping functions may be derived/optimized jointly or separately, and may be derived using a variety of techniques as described in the '375 application.
- the backward reshaping metadata ( 152 ), as generated by the backward reshaping function generator ( 150 ) based on the SDR images ( 134 ) and the target HDR images ( 120 ), may be multiplexed as part of the video signal 144 , for example, as supplemental enhancement information (SEI) messaging.
- backward reshaping metadata ( 152 ) is carried in the video signal as a part of overall image metadata, which is separately carried in the video signal from the single layer in which the SDR images are encoded in the video signal.
- the backward reshaping metadata ( 152 ) may be encoded in a component stream in the coded bitstream, which component stream may or may not be separate from the single layer (of the coded bitstream) in which the SDR images ( 134 ) are encoded.
- the backward reshaping metadata ( 152 ) can be generated or pre-generated on the encoder side to take advantage of powerful computing resources and offline encoding flows (including but not limited to content adaptive multiple passes, look ahead operations, inverse luma mapping, inverse chroma mapping, CDF-based histogram approximation and/or transfer, etc.) available on the encoder side.
- the encoder-side architecture of FIG. 1A can be used to avoid directly encoding the target HDR images ( 120 ) into coded/compressed HDR images in the video signal; instead, the backward reshaping metadata ( 152 ) in the video signal can be used to enable downstream decoding devices to backward reshape the SDR images ( 134 ) (which are encoded in the video signal) into reconstructed images that are identical to or closely/optimally approximate the reference HDR images ( 120 ).
- the video signal encoded with the SDR images in the single layer ( 144 ) and the backward reshaping metadata ( 152 ) as a part of the overall image metadata are received as input on the decoder side of the codec framework.
- a decompression block 154 decompresses/decodes compressed video data in the single layer ( 144 ) of the video signal into the decoded SDR images ( 156 ).
- Decompression 154 typically corresponds to the inverse of compression 142 .
- the decoded SDR images ( 156 ) may be the same as the SDR images ( 134 ), subject to quantization errors in the compression block ( 142 ) and in the decompression block ( 154 ), which may have been optimized for SDR display devices.
- the decoded SDR images ( 156 ) may be outputted in an output SDR video signal (e.g., over an HDMI interface, over a video link, etc.) to be rendered on an SDR display device.
- a backward reshaping block 158 extracts the backward reshaping metadata ( 152 ) from the input video signal, constructs the optimal backward reshaping functions based on the backward reshaping metadata ( 152 ), and performs backward reshaping operations on the decoded SDR images ( 156 ) based on the optimal backward reshaping functions to generate the backward reshaped images ( 160 ) (or reconstructed HDR images).
- the backward reshaped images represent production-quality or near-production-quality HDR images that are identical to or closely/optimally approximating the reference HDR images ( 120 ).
- the backward reshaped images ( 160 ) may be outputted in an output HDR video signal (e.g., over an HDMI interface, over a video link, etc.) to be rendered on an HDR display device.
- display management operations specific to the HDR display device may be performed on the backward reshaped images ( 160 ) as a part of HDR image rendering operations that render the backward reshaped images ( 160 ) on the HDR display device.
- v denote a parameter or variable (e.g., an image, a pixel value, or other domain characteristic) in the HDR domain
- s denote parameters or values in the reshaped (e.g., SDR) domain
- r denote parameters or values in the reconstructed HDR domain.
- Those symbols may also be used as superscript or subscript.
- the maximum brightness may be denoted in the original HDR domain as $I_{\max}^{v}$, in the reshaped domain (e.g., SDR) as $I_{\max}^{s}$, and in the reconstructed domain as $I_{\max}^{r}$.
- Similar notation may be used for other attributes defining a domain, such as: color space, color gamut, the EOTF being used, and the like.
- the term normalized pixel value denotes a pixel value in [0, 1).
- $s_{t,i}^{ch}$ denotes the un-normalized value of the corresponding pixel in the SDR reference image
- $\hat{s}_{t,i}^{ch}$ denotes the un-normalized value of the corresponding pixel in the SDR reshaped image
- $r_{t,i}^{ch}$ denotes the un-normalized value of pixel $i$ of channel $ch$ in frame $t$ of the HDR reconstructed image.
- the goal of the FR function is to minimize the difference between $s_{t,i}^{ch}$ and $\hat{s}_{t,i}^{ch}$ pixel values.
- BR backward reshaping
- a chroma reshaping function may be described in terms of multivariate multi-regression (MMR) polynomials and their coefficients (Ref. [2], Ref.
- the polynomial parameters and coefficients for FR and BR may be communicated to a downstream decoder using metadata (e.g., 152 ).
- In fully-adaptive reshaping, given a pair of HDR and SDR images representing the same scene, one may design optimum forward and backward reshaping functions so that the reconstructed HDR image after inverse reshaping is as close as possible to the original HDR image. While fully-adaptive reshaping is preferred, in many applications (e.g., mobile phones, hand-held cameras, and the like) it is impractical due to lack of computational resources.
- a set of pre-computed reshaping functions, each one corresponding to specific device settings (e.g., flash mode, exposure time, picture mode, and the like), denoted as $r^{(l)}$.
- the reshaping-index parameter $r^{(l)}$ may also be called an “identification tag” of the corresponding pre-computed reshaping function.
- This identification tag may be a device setting, like a maximum luminance, e.g. 100 nits, 200 nits etc., an exposure time, e.g. 5 ms, 10 ms, 1000 ms, etc., an ISO number, e.g.
- reshaping functions e.g. being corresponding curves.
- an interpolation between a pre-computed reshaping function with ISO number 100 and a pre-computed reshaping function with ISO number 200 can be performed.
- the first pre-computed reshaping function is for maximum luminance of 100 nits
- the second one is for 200 nits
- the third one is for 500 nits
- the fourth one is for 1000 nits
- the fifth one is for maximum luminance of 4000 nits.
- 100 nits, 200 nits, 500 nits, 1000 nits and 4000 nits are the “identification tags” for those pre-computed reshaping functions.
- the “identification tag” is maximum luminance.
- the identification tag may be an exposure time as well.
- each pre-computed reshaping function has a different identification tag.
- These identification tags can be device settings e.g. maximum luminance, exposure time, ISO number, picture mode, etc. However, the identification tags are not limited to be any of the device settings listed above but can be any other device setting.
- a semi-adaptive reshaping method is proposed, where given a setting $r$ not in the set (e.g., $r^{(l)} < r < r^{(l+1)}$), a new reshaping function is generated by interpolating parameters among the set of pre-stored reshaping functions.
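- The first step of such a method, locating the two pre-stored functions whose identification tags bracket the requested setting, can be sketched as follows. The function name `find_bracket` and the choice to reject settings outside the stored range are assumptions made for illustration; the tag values reuse the maximum-luminance example given earlier.

```python
from bisect import bisect_right


def find_bracket(tags, r):
    """Return indices (lo, hi) of the stored tags that bracket setting r.

    tags must be sorted in increasing order and hold at least two values.
    Settings outside [tags[0], tags[-1]] are rejected, since handling them
    would require extrapolation (an assumption of this sketch).
    """
    if not tags[0] <= r <= tags[-1]:
        raise ValueError("setting lies outside the range of stored tags")
    hi = min(bisect_right(tags, r), len(tags) - 1)
    return hi - 1, hi


# Maximum-luminance tags (in nits) of five pre-computed functions.
tags = [100, 200, 500, 1000, 4000]
lo, hi = find_bracket(tags, 700)  # the 500-nit and 1000-nit functions
```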
- a semi-adaptive reshaping process may include the following steps:
- Table 1 describes in pseudocode the creation of a reshaping function for a set of images.
- the key steps include: a) the collection of statistics (histograms) for the images in the database, b) the generation of a cumulative density function (CDF) for the set, c) using CDF matching (Ref. [3]) to generate a reshaping function and d) smoothing the reshaping function.
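- Steps a) to c) can be illustrated with a simplified CDF-matching sketch that works on bin indices. It is not the pseudocode of Table 1: the helper names, the equal-width binning, and the linear interpolation between neighboring destination bins are assumptions made for this example.

```python
def histogram(values, num_bins, max_value):
    """Count codewords per bin over the range [0, max_value]."""
    hist = [0] * num_bins
    for v in values:
        hist[min(v * num_bins // (max_value + 1), num_bins - 1)] += 1
    return hist


def cdf(hist):
    """Normalized cumulative sum of a histogram."""
    total, acc, out = sum(hist), 0, []
    for h in hist:
        acc += h
        out.append(acc / total)
    return out


def cdf_match(cdf_src, cdf_dst):
    """For each source bin, find the (fractional) destination bin with equal CDF."""
    lut, k, n = [], 0, len(cdf_dst)
    for c in cdf_src:
        # Advance to the first destination bin whose CDF reaches c.
        while k < n - 1 and cdf_dst[k] < c:
            k += 1
        if k == 0:
            lut.append(0.0)
        else:
            # Linear interpolation between destination bins k-1 and k.
            frac = (c - cdf_dst[k - 1]) / (cdf_dst[k] - cdf_dst[k - 1])
            lut.append((k - 1) + frac)
    return lut


# Toy data: identical distributions yield the identity mapping.
sdr_cdf = cdf(histogram([0, 1, 2, 3], 4, 3))
hdr_cdf = cdf(histogram([0, 1, 2, 3], 4, 3))
backward_lut = cdf_match(sdr_cdf, hdr_cdf)
```

- Swapping the roles of the two CDFs gives a mapping in the opposite direction; step d), smoothing, is applied to the resulting table afterwards.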
- While Table 1 may refer to images in each pair as SDR and HDR images, the same methodology can be applied when generating reshaping functions using any type of different signal representation formats.
- the images may differ in the EOTF function (e.g., gamma versus PQ), bit-depth (e.g., 8-bit vs 10-bit), color gamut, color format (e.g., 4:4:4 vs 4:2:0), color space, and the like.
- each dynamic range is subdivided into bins.
- $M_S$ can be equal to $N_S$
- (e.g., $W = 8$ for 8-bit data, 32 for 10-bit data)
- $W_b^{-} = \text{clip3}(b - W_b, 0, M_S - 1)$
- $W_b^{+} = \text{clip3}(b + W_b, 0, M_S - 1)$
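- The clipped window bounds above suggest a sliding-window smoother over the look-up table. The sketch below assumes a plain moving average inside the window and a single window size for all bins; both are assumptions, since the averaging formula itself is not reproduced here.

```python
def clip3(x, lo, hi):
    """Clamp x to the inclusive range [lo, hi]."""
    return max(lo, min(x, hi))


def smooth_lut(lut, half_window):
    """Average each entry over a window clipped to the valid bin range.

    half_window plays the role of the window parameter; using the same
    value for every bin is an assumption of this sketch.
    """
    last = len(lut) - 1
    out = []
    for b in range(len(lut)):
        w_lo = clip3(b - half_window, 0, last)
        w_hi = clip3(b + half_window, 0, last)
        out.append(sum(lut[w_lo:w_hi + 1]) / (w_hi - w_lo + 1))
    return out


smoothed = smooth_lut([0, 0, 3, 0, 0], 1)
```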
- the above process can be performed for all databases to generate a set of backward reshaping functions $\{T_b^{(l)}\}$.
- the CDF-matching step (STEP 5) can be simply explained as follows.
- STEP 5 can also be easily modified to generate a forward reshaping function instead.
- to generate a forward reshaping function instead, for each HDR value ($x^v$) one computes the corresponding HDR CDF value (say, $c$), and then tries to identify, via simple linear interpolation from existing SDR CDF values, the SDR value ($x^s$) for which $c^{s,(l)} = c$, thus mapping $x^v$ values to $x^s$ values.
- given a set of $L_0$ pre-computed reshaping functions, it may be desired to select out of this set a smaller set of $L$ ($L < L_0$) representative basis reshaping functions so that one can generate the rest using a function interpolation method.
- when the reshaping functions are ordered in a monotonically increasing manner with index $l$, the first and the last functions should be part of the basis functions, to avoid the need for extrapolation.
- BR denotes a basis function selected among the original BR functions. Note: for notation purposes, in the remainder of this description, given a set of $L_0$ BR functions, a set of $L$ BR functions is selected. Given those basis BR functions, interpolated BR functions are generated.
- L is given (based on known memory requirements) or the value of L can be adjusted to meet a certain quality criterion.
- Table 2 describes in pseudocode the case where $L$ and the $L$ basis functions are derived to meet a given interpolation-error threshold.
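- One way to picture such a selection is a greedy loop: start from the first and last functions, then keep promoting the function that is worst approximated by interpolation until every remaining function is within the threshold. This is an illustrative strategy under stated assumptions, not the procedure of Table 2; the maximum absolute difference as error measure and all names are hypothetical.

```python
def interpolate(lut_lo, lut_hi, alpha):
    """Blend two LUTs entry by entry with weight alpha on the lower one."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(lut_lo, lut_hi)]


def max_abs_error(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


def select_basis(tags, luts, threshold):
    """Greedily pick basis indices so all other LUTs interpolate within threshold."""
    basis = [0, len(luts) - 1]  # end points avoid any extrapolation
    while True:
        worst_err, worst_idx = 0.0, None
        for i in range(len(luts)):
            if i in basis:
                continue
            lo = max(j for j in basis if j < i)
            hi = min(j for j in basis if j > i)
            alpha = (tags[hi] - tags[i]) / (tags[hi] - tags[lo])
            err = max_abs_error(interpolate(luts[lo], luts[hi], alpha), luts[i])
            if err > worst_err:
                worst_err, worst_idx = err, i
        if worst_idx is None or worst_err <= threshold:
            return sorted(basis)
        basis.append(worst_idx)


# Five one-entry toy LUTs; the fourth breaks the linear trend.
chosen = select_basis([0, 1, 2, 3, 4], [[0], [1], [2], [10], [4]], 0.5)
```

- In this toy case the outlier and its neighbor are promoted to basis functions, while the second function can still be regenerated by interpolation.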
- the interpolation error for just luma, or for both luma and chroma.
- the error may be computed as:
- each of the reshaping functions may be represented using a piece-wise polynomial.
- Each such piece-wise polynomial is represented by pivot points representing the start- and end-point of each polynomial segment and the polynomial coefficients for each segment.
- such functions can be interpolated only if their segments are aligned and all have the same set of pivot points. There are multiple ways to generate a common set of pivot points and some of these techniques are discussed in this section.
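- Once two piece-wise polynomials share the same pivot points, blending them reduces to blending their coefficients segment by segment, because a polynomial is linear in its coefficients. The sketch below assumes coefficients stored in increasing order of power and a blending weight named `alpha` in [0, 1]; these conventions and the function names are assumptions for illustration.

```python
from bisect import bisect_right


def eval_piecewise(pivots, coeffs, x):
    """Evaluate a piece-wise polynomial; coeffs[m] covers [pivots[m], pivots[m+1])."""
    m = min(max(bisect_right(pivots, x) - 1, 0), len(coeffs) - 1)
    return sum(c * x ** k for k, c in enumerate(coeffs[m]))


def interpolate_coeffs(coeffs_lo, coeffs_hi, alpha):
    """Blend two coefficient sets that are defined over identical pivots."""
    return [
        [alpha * a + (1 - alpha) * b for a, b in zip(seg_lo, seg_hi)]
        for seg_lo, seg_hi in zip(coeffs_lo, coeffs_hi)
    ]


pivots = [0.0, 0.5, 1.0]
f_lo = [[0.0, 1.0], [0.0, 1.0]]  # y = x on both segments
f_hi = [[0.0, 2.0], [0.5, 1.0]]  # y = 2x, then y = 0.5 + x
f_mid = interpolate_coeffs(f_lo, f_hi, 0.5)
y = eval_piecewise(pivots, f_mid, 0.25)
```

- Evaluating the blended coefficients gives the same value as blending the two function outputs, which is why aligned segments are required.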
- the optimization function can be formulated as a problem to minimize the overall fitting errors for all L LUTs and all codewords.
- Let $s_b^Y$ denote the normalized input value between 0 and 1.
- the interval between the $m$-th and $(m+1)$-th pivot points, closed on the left and open on the right, denotes the $m$-th polynomial segment.
- the optimization problem may be formulated as minimizing the maximal predicted error between the predicted value and the reference value ($T_b^{(l)}$):
- a joint-optimization problem may be defined as: solve for $J_3$ given a non-trivial solution for the polynomial coefficient values given by either $J_1$ or $J_2$.
- pivot points may be bounded by the standard specifications (e.g., SMPTE 274M) for the lower bound and upper bound of “legal” or “broadcast safe” signal values, since values below the lower bound and above the upper bound will be clipped.
- for example, for 8-bit signals the broadcast-safe area is [16, 235] instead of [0, 255].
- Let $$\alpha = \frac{r^{(l+1)} - r}{r^{(l+1)} - r^{(l)}}, \qquad (21)$$ denote a linear interpolation factor.
- LUT look-up table
- interpolated reshaping function $BR_t^{(s \to r),Y}(\cdot)$ will be a function ‘located between’ basis reshaping function $BR_t^{(s \to r^{(l)}),Y}(\cdot)$ and basis reshaping function $BR_t^{(s \to r^{(l+1)}),Y}(\cdot)$, e.g. with no common values shared between them.
- the basis reshaping functions may be uniquely identified by the respective reshaping-index parameter and as such being functions of the same type but providing different mapped values (mapped pixel codewords).
- the interpolated reshaping function may be uniquely associated to a desired reshaping parameter having a value between the reshaping-index parameters of the two basis reshaping functions used for interpolation. Therefore, the interpolated reshaping function may provide mapped pixel codewords (in the second codeword representation) as a function of the input pixel codewords (in the first codeword representation) which are still different from mapped pixel codewords provided by the two basis reshaping functions used for interpolation.
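- For luma functions stored as look-up tables, the linear interpolation factor of equation (21) can be applied entry by entry. The sketch below assumes the weight multiplies the lower-tagged basis function and its complement multiplies the higher-tagged one; function names and the toy tables are hypothetical.

```python
def interpolation_factor(r, r_lo, r_hi):
    """Linear interpolation factor: 1 at r == r_lo, 0 at r == r_hi."""
    return (r_hi - r) / (r_hi - r_lo)


def interpolate_lut(lut_lo, lut_hi, alpha):
    """Entry-wise blend of two basis LUTs of equal length."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(lut_lo, lut_hi)]


# Toy basis LUTs tagged 100 nits and 200 nits; the target is 150 nits.
alpha = interpolation_factor(150, 100, 200)
lut_150 = interpolate_lut([0, 100], [0, 200], alpha)
```

- Each interpolated entry lies between the corresponding entries of the two basis tables, matching the ‘located between’ description above.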
- chroma reshaping may be performed using a set of MMR polynomials.
- Let $s_{t,i}^{y,d}$ denote the $i$-th pixel value of the down-sampled Y component
- Let $P_C$ denote the number of sample points (or pixels).
- Let $$\bar{s}_{t,i}^{T} = \left[\, 1 \;\; s_{t,i}^{y,d} \;\; s_{t,i}^{c0} \;\; s_{t,i}^{c1} \;\; \ldots \;\; s_{t,i}^{y,d} s_{t,i}^{c0} \;\; \ldots \;\; \left(s_{t,i}^{y,d} s_{t,i}^{c0}\right)^{2} \;\; \ldots \;\; \left(s_{t,i}^{c0}\right)^{2} \left(s_{t,i}^{c1}\right)^{2} \;\; \ldots \,\right]$$ denote the support of dependency vector for an MMR model. Then for
- $$\hat{r}_{t,i}^{(l)} = \begin{bmatrix} \hat{r}_{t,i}^{(l),c0} \\ \hat{r}_{t,i}^{(l),c1} \end{bmatrix},$$ or, in a matrix representation, given
- $$R_t = \alpha R_t^{(l)} + (1 - \alpha) R_t^{(l+1)} \qquad (34)$$
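- Because an MMR prediction is linear in its coefficients, blending two coefficient sets and then predicting gives the same result as blending the two predictions. The sketch below assumes a first-order MMR vector with cross products (eight terms) and made-up coefficient values; the exact set of terms used in practice may differ.

```python
def mmr_vector(y, c0, c1):
    """First-order MMR support vector with cross terms (assumed form)."""
    return [1.0, y, c0, c1, y * c0, y * c1, c0 * c1, y * c0 * c1]


def mmr_predict(coeffs, y, c0, c1):
    """Predict the two chroma values; coeffs holds one row per chroma channel."""
    s = mmr_vector(y, c0, c1)
    return [sum(m * v for m, v in zip(row, s)) for row in coeffs]


def blend(a, b, alpha):
    """Blend two coefficient matrices with weight alpha on the first."""
    return [
        [alpha * x + (1 - alpha) * z for x, z in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


# Hypothetical coefficient sets of two basis chroma reshaping functions.
coeffs_lo = [[0.1] * 8, [0.2] * 8]
coeffs_hi = [[0.3] * 8, [0.0] * 8]
alpha = 0.25

pred_from_blend = mmr_predict(blend(coeffs_lo, coeffs_hi, alpha), 0.5, 0.4, 0.6)
p_lo = mmr_predict(coeffs_lo, 0.5, 0.4, 0.6)
p_hi = mmr_predict(coeffs_hi, 0.5, 0.4, 0.6)
blend_of_preds = [alpha * a + (1 - alpha) * b for a, b in zip(p_lo, p_hi)]
```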
- the decoder applies a backward reshaping function to reconstruct a close approximation of the original HDR image.
- This can be referred to as “symmetric reshaping,” because the input color and bit-depth domain is the same as the output color and bit-depth domain. This case may not be applicable to certain applications, such as mobile picture capture and communications.
- FIG. 3A depicts an example of asymmetric reshaping in an encoder according to an embodiment based on semi-adaptive reshaping discussed earlier.
- the forward reshaping stage may include: a set of basis forward reshaping functions ( 305 ), a function interpolation unit ( 310 ), which can generate a new forward reshaping function ( 312 ) by interpolating from two basis forward reshaping functions, and a forward reshaping unit ( 315 ) which will apply the generated forward function ( 312 ) to generate the reshaped signal ( 317 ), e.g., an SDR signal.
- an encoder could generate the parameters of the reverse or backward reshaping function (e.g., 150 ) (e.g., see Ref. [5]), which could be transmitted to the decoder as was shown in FIG. 1 .
- the encoder may include a separate backward reshaping stage which may include: a set of basis backward reshaping functions ( 320 ), and a second function interpolation unit ( 325 ), which can generate a new backward reshaping function ( 327 ) by interpolating from two basis backward reshaping functions.
- the parameters of the backward reshaping function may be communicated as metadata.
- the forward and backward parameters for function interpolation may include such variables as: exposure time, ISO, maximum luminance, and the like.
- the input HDR signal ( 302 ) may be raw HDR data or color-graded HDR data.
- one of the applications of embodiments of this invention is video encoding for mobile systems to simplify the computation requirements.
- the forward path handles the camera raw to SDR mapping
- the backward path handles the SDR to HDR mapping.
- the parameters to both paths can incorporate a variety of camera parameters, such as exposure time, ISO, and the like.
- Decoding embodiments may also incorporate semi-adaptive reshaping as follows: For example, in an embodiment, as shown in FIG. 3A , an encoder may transmit directly to a decoder the metadata defining explicitly the parameters of the interpolated backward reshaping function ( 327 ). Then, decoding follows the decoding process depicted in FIG. 1B .
- function interpolation may also be performed at the decoder site.
- an encoder may communicate to a decoder: the number of basis backward reshaping functions and the reshaping parameters (for luma and chroma) for each such function, plus an identification “tag” (or reshaping-index parameter) for each basis function. This allows the decoder to build a database of basis backward-reshaping functions (305-d).
- the encoder may simply send the data (329) required for the decoder to generate the interpolated reshaping function, which may include: the identification tags of the bracketing reshaping functions and the interpolation factor (α), or just the target reshaping parameter required by the decoder to generate this information on its own.
- the decoder uses a function-interpolation block (310-d) to generate the appropriate backward reshaping function (312-d), which can be used in the backward-reshaping block (315-d) to generate the reshaped HDR signal (330).
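The decoder-side flow described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each basis backward reshaping function is carried as a look-up table (LUT) and that interpolation is a linear blend of two LUTs. The names `BasisFunction`, `build_database`, `bracket_from_target`, `interpolate`, and `backward_reshape` are hypothetical and do not come from the source.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class BasisFunction:
    tag: int                # identification tag sent by the encoder
    index_param: float      # reshaping-index parameter (e.g. an ISO value; assumed)
    lut: Tuple[float, ...]  # assumed representation: SDR codeword -> HDR value


def build_database(functions: Sequence[BasisFunction]) -> Dict[int, BasisFunction]:
    """Store the basis functions received from the encoder, keyed by tag."""
    return {f.tag: f for f in functions}


def bracket_from_target(db: Dict[int, BasisFunction], r: float) -> Tuple[int, int, float]:
    """Derive the bracketing tags and the interpolation factor from a target
    reshaping parameter r, for the case where only r is transmitted.
    Assumes the index parameters in the database are distinct."""
    ordered = sorted(db.values(), key=lambda f: f.index_param)
    if len(ordered) < 2:
        raise ValueError("need at least two basis functions")
    params = [f.index_param for f in ordered]
    if not params[0] <= r <= params[-1]:
        raise ValueError("target parameter outside the range of the basis functions")
    hi = min(bisect_right(params, r), len(ordered) - 1)
    lo = hi - 1
    alpha = (params[hi] - r) / (params[hi] - params[lo])
    return ordered[lo].tag, ordered[hi].tag, alpha


def interpolate(db: Dict[int, BasisFunction], tag_lo: int, tag_hi: int,
                alpha: float) -> List[float]:
    """Blend two basis LUTs; alpha weights the lower-index function (assumed)."""
    f_lo, f_hi = db[tag_lo], db[tag_hi]
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(f_lo.lut, f_hi.lut)]


def backward_reshape(sdr: Sequence[int], lut: Sequence[float]) -> List[float]:
    """Map each SDR codeword through the interpolated backward reshaping LUT."""
    return [lut[c] for c in sdr]
```

When the encoder sends the bracketing tags and the interpolation factor directly, `interpolate` can be called as is; when it sends only the target parameter, `bracket_from_target` recovers the same three values on the decoder side.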
- Embodiments of the present invention may be implemented with a computer system, systems configured in electronic circuitry and components, an integrated circuit (IC) device such as a microcontroller, a field programmable gate array (FPGA), or another configurable or programmable logic device (PLD), a discrete time or digital signal processor (DSP), an application specific IC (ASIC), and/or apparatus that includes one or more of such systems, devices or components.
- the computer and/or IC may perform, control or execute instructions relating to the interpolation of reshaping functions, such as those described herein.
- the computer and/or IC may compute any of a variety of parameters or values that relate to the interpolation of reshaping functions as described herein.
- the image and video dynamic range extension embodiments may be implemented in hardware, software, firmware, and various combinations thereof.
- Certain implementations of the invention comprise computer processors which execute software instructions which cause the processors to perform a method of the invention.
- processors in a display, an encoder, a set top box, a transcoder or the like may implement methods for the interpolation of reshaping functions as described above by executing software instructions in a program memory accessible to the processors.
- the invention may also be provided in the form of a program product.
- the program product may comprise any non-transitory and tangible medium which carries a set of computer-readable signals comprising instructions which, when executed by a data processor, cause the data processor to execute a method of the invention.
- Program products according to the invention may be in any of a wide variety of non-transitory and tangible forms.
- the program product may comprise, for example, physical media such as magnetic data storage media including floppy diskettes and hard disk drives; optical data storage media including CD ROMs and DVDs; electronic data storage media including ROMs and flash RAM; or the like.
- the computer-readable signals on the program product may optionally be compressed or encrypted.
- where a component (e.g., a software module, processor, assembly, device, circuit, etc.) is referred to above, reference to that component should be interpreted as including as equivalents of that component any component which performs the function of the described component (e.g., that is functionally equivalent), including components which are not structurally equivalent to the disclosed structure which performs the function in the illustrated example embodiments of the invention.
- enumerated example embodiments (EEEs):
- EEE1 In an apparatus comprising one or more processors, a method for generating a reshaping function, the method comprising:
- a basis reshaping function maps pixel codewords from a first codeword representation to a second codeword representation and each reshaping function is characterized by a reshaping-index parameter;
- EEE2 The method of EEE1, further comprising:
- EEE3 The method of EEE 1 or EEE 2, further comprising:
- a basis reshaping function maps pixel codewords from the second codeword representation to a third codeword representation
- EEE4 The method of any of EEEs 1-3, wherein generating the output forward reshaping function comprises:
- the method of EEE 4, wherein computing the interpolating factor comprises computing:
- $\alpha = \dfrac{r^{(l+1)} - r}{r^{(l+1)} - r^{(l)}}$, wherein $\alpha$ denotes the interpolating factor, $r$ denotes the input reshaping parameter, $r^{(l)}$ denotes the first reshaping-index parameter, $r^{(l+1)}$ denotes the second reshaping-index parameter, and $r^{(l)} \le r \le r^{(l+1)}$.
- $\alpha$ denotes the interpolation factor
- $a_{mk}^{(r)}$, $a_{mk}^{(l)}$, and $a_{mk}^{(l+1)}$ denote respectively the polynomial coefficients for the m-th segment in the output forward reshaping function, the first basis reshaping function, and the second basis reshaping function.
- generating the first set of basis reshaping functions comprises:
- EEE9 The method of EEE 8 wherein the first signal representation form comprises a high-dynamic range representation and the second signal representation form comprises a standard dynamic range representation.
- EEE10 The method of any of EEEs 1-9, wherein two or more functions in the first set of basis reshaping functions are represented as multi-segment polynomials and corresponding segments for these two or more functions have same starting and ending pivot points.
- EEE11 In an apparatus comprising one or more processors, a method to decode a coded image, the method comprising:
- a reshaping function maps pixel codewords from the first codeword representation to a second codeword representation and each reshaping function is characterized by a reshaping-index parameter;
- EEE12 The method of EEE 11, wherein generating the output reshaping function comprises:
- EEE13 The method of EEE 12, wherein computing the interpolating factor comprises computing:
- $\alpha = \dfrac{r^{(l+1)} - r}{r^{(l+1)} - r^{(l)}}$, wherein $\alpha$ denotes the interpolation factor, $r$ denotes the input reshaping parameter, $r^{(l)}$ denotes the first reshaping-index parameter, $r^{(l+1)}$ denotes the second reshaping-index parameter, and $r^{(l)} \le r \le r^{(l+1)}$.
- EEE14 A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for executing with one or more processors a method in accordance with any of the EEEs 1-13.
- EEE15 An apparatus comprising a processor and configured to perform any of the methods recited in EEEs 1-13.
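The interpolating factor and the per-segment polynomial coefficients named in the EEEs above can be illustrated with a short sketch. The blending rule used here, in which each output coefficient is the interpolating factor times the first basis coefficient plus one minus that factor times the second, is an assumption: this section defines the coefficient symbols but does not reproduce the formula. The function names are hypothetical, and the sketch relies on both basis functions sharing the same pivot points, as EEE10 describes.

```python
from bisect import bisect_right
from typing import List, Sequence


def interpolation_factor(r: float, r_lo: float, r_hi: float) -> float:
    """Interpolating factor: 1 at r == r_lo, 0 at r == r_hi."""
    if not r_lo < r_hi:
        raise ValueError("reshaping-index parameters must satisfy r_lo < r_hi")
    if not r_lo <= r <= r_hi:
        raise ValueError("input reshaping parameter must lie within [r_lo, r_hi]")
    return (r_hi - r) / (r_hi - r_lo)


def interpolate_coefficients(a_lo: Sequence[Sequence[float]],
                             a_hi: Sequence[Sequence[float]],
                             alpha: float) -> List[List[float]]:
    """Blend coefficients segment by segment (assumed linear rule).
    a[m][k] is the k-th order coefficient of the m-th segment."""
    if len(a_lo) != len(a_hi):
        raise ValueError("basis functions must have the same number of segments")
    return [[alpha * x + (1.0 - alpha) * y for x, y in zip(seg_lo, seg_hi)]
            for seg_lo, seg_hi in zip(a_lo, a_hi)]


def evaluate(pivots: Sequence[float], coeffs: Sequence[Sequence[float]],
             x: float) -> float:
    """Evaluate a multi-segment polynomial; segment m covers
    pivots[m] <= x < pivots[m + 1], with out-of-range x clamped to an end segment."""
    m = bisect_right(pivots, x) - 1
    m = max(0, min(m, len(coeffs) - 1))
    return sum(a * x ** k for k, a in enumerate(coeffs[m]))
```

Because the two basis functions share pivot points, blending their coefficients gives the same value at every input as blending their outputs, so the interpolated function can be signalled with the same segment structure as the basis functions.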
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/299,743 US11388408B2 (en) | 2018-12-03 | 2019-11-27 | Interpolation of reshaping functions |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201862774393P | 2018-12-03 | 2018-12-03 | |
EP18209740 | 2018-12-03 | ||
EP18209740.2 | 2018-12-03 | ||
PCT/US2019/063796 WO2020117603A1 (en) | 2018-12-03 | 2019-11-27 | Interpolation of reshaping functions |
US17/299,743 US11388408B2 (en) | 2018-12-03 | 2019-11-27 | Interpolation of reshaping functions |
Publications (2)
Publication Number | Publication Date |
---|---|
US20220046245A1 (en) | 2022-02-10 |
US11388408B2 (en) | 2022-07-12 |
Family
ID=68887445
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/299,743 Active US11388408B2 (en) | 2018-12-03 | 2019-11-27 | Interpolation of reshaping functions |
Country Status (5)
Country | Link |
---|---|
US (1) | US11388408B2 (zh) |
EP (1) | EP3891995A1 (zh) |
JP (1) | JP7094451B2 (zh) |
CN (1) | CN113170205B (zh) |
WO (1) | WO2020117603A1 (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220150481A1 (en) * | 2019-03-07 | 2022-05-12 | Lg Electronics Inc. | Video or image coding based on luma mapping with chroma scaling |
DE112019007869T5 (de) * | 2019-11-01 | 2022-09-15 | LG Electronics Inc. | Signal processing device and image generating device including the same |
US20240095893A1 (en) | 2021-01-27 | 2024-03-21 | Dolby Laboratories Licensing Corporation | Image enhancement via global and local reshaping |
US11475549B1 (en) * | 2021-06-04 | 2022-10-18 | Nvidia Corporation | High dynamic range image generation from tone mapped standard dynamic range images |
WO2023215108A1 (en) | 2022-05-05 | 2023-11-09 | Dolby Laboratories Licensing Corporation | Stereoscopic high dynamic range video |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8811490B2 (en) | 2011-04-14 | 2014-08-19 | Dolby Laboratories Licensing Corporation | Multiple color channel multiple regression predictor |
WO2017165494A2 (en) | 2016-03-23 | 2017-09-28 | Dolby Laboratories Licensing Corporation | Encoding and decoding reversible production-quality single-layer video signals |
US20180098094A1 (en) | 2016-10-05 | 2018-04-05 | Dolby Laboratories Licensing Corporation | Inverse luma/chroma mappings with histogram transfer and approximation |
US10032262B2 (en) | 2016-02-02 | 2018-07-24 | Dolby Laboratories Licensing Corporation | Block-based content-adaptive reshaping for high dynamic range images |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9070361B2 (en) * | 2011-06-10 | 2015-06-30 | Google Technology Holdings LLC | Method and apparatus for encoding a wideband speech signal utilizing downmixing of a highband component |
CN102684701B (zh) * | 2012-04-27 | 2014-07-09 | 苏州上声电子有限公司 | 基于编码转换的数字扬声器驱动方法和装置 |
WO2014105385A1 (en) * | 2012-12-27 | 2014-07-03 | The Regents Of The University Of California | Anamorphic stretch image compression |
CN104581589B (zh) * | 2014-12-31 | 2018-01-02 | 苏州上声电子有限公司 | 基于三态编码的通道状态选取方法和装置 |
WO2017053432A1 (en) * | 2015-09-21 | 2017-03-30 | Vid Scale, Inc. | Inverse reshaping for high dynamic range video coding |
2019
- 2019-11-27 EP EP19821231.8A patent/EP3891995A1/en active Pending
- 2019-11-27 WO PCT/US2019/063796 patent/WO2020117603A1/en active Search and Examination
- 2019-11-27 JP JP2021531632A patent/JP7094451B2/ja active Active
- 2019-11-27 CN CN201980079807.XA patent/CN113170205B/zh active Active
- 2019-11-27 US US17/299,743 patent/US11388408B2/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8811490B2 (en) | 2011-04-14 | 2014-08-19 | Dolby Laboratories Licensing Corporation | Multiple color channel multiple regression predictor |
US10032262B2 (en) | 2016-02-02 | 2018-07-24 | Dolby Laboratories Licensing Corporation | Block-based content-adaptive reshaping for high dynamic range images |
WO2017165494A2 (en) | 2016-03-23 | 2017-09-28 | Dolby Laboratories Licensing Corporation | Encoding and decoding reversible production-quality single-layer video signals |
US20180098094A1 (en) | 2016-10-05 | 2018-04-05 | Dolby Laboratories Licensing Corporation | Inverse luma/chroma mappings with histogram transfer and approximation |
Non-Patent Citations (6)
Title |
---|
ITU-R BT. 2100 "Image Parameter Values for High Dynamic Range Television for Use in Production and International Programme Exchange" ITU, Jul. 2016. |
ITU-R BT.2020-2 "Parameter Values for Ultra-High Definition Television Systems for Production and International Programme Exchange" Oct. 2015. |
ITU-R BT.709-6 "Parameter Values for the HDTV Standards for Production and International Programme Exchange" Jun. 2015, pp. 1-19. |
Minoo, A. K. et al "Description of the Reshaper Parameters Derivation Process in ETM Reference Software" JCT-VC Meeting, Feb. 2016, San Diego. |
Qing, S. et al "Hardware-Efficient Debanding and Visual Enhancement Filter for Inverse Tone Mapped High Dynamic Range Images and Videos" 2016 IEEE International Conference on Image Processing, Sep. 25, 2016, pp. 3299-3303. |
SMPTE 2084:2014 "High Dynamic Range Electro-Optical Transfer Function of Mastering Reference Displays" Aug. 16, 2014. |
Also Published As
Publication number | Publication date |
---|---|
CN113170205B (zh) | 2023-11-10 |
JP7094451B2 (ja) | 2022-07-01 |
JP2022511829A (ja) | 2022-02-01 |
WO2020117603A1 (en) | 2020-06-11 |
EP3891995A1 (en) | 2021-10-13 |
CN113170205A (zh) | 2021-07-23 |
US20220046245A1 (en) | 2022-02-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109416832B (zh) | Efficient histogram-based luma look matching | |
CN112106357B (zh) | Method and apparatus for encoding and decoding image data | |
US11388408B2 (en) | Interpolation of reshaping functions | |
US10701375B2 (en) | Encoding and decoding reversible production-quality single-layer video signals | |
CN105744277B (zh) | Layer decomposition in hierarchical VDR coding | |
US10972759B2 (en) | Color appearance preservation in video codecs | |
US11341624B2 (en) | Reducing banding artifacts in HDR imaging via adaptive SDR-to-HDR reshaping functions | |
US20230171436A1 (en) | Adjustable trade-off between quality and computation complexity in video codecs | |
JP7543577B2 (ja) | Optimization of chained reshaping functions | |
US20230300381A1 (en) | Reshaping functions for hdr imaging with continuity and reversibility constraints | |
US20230368344A1 (en) | Color transformation for hdr video with a coding-efficiency constraint | |
US20230254494A1 (en) | Image prediction for hdr imaging in open-loop codecs | |
US20240095893A1 (en) | Image enhancement via global and local reshaping |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| AS | Assignment | Owner name: DOLBY LABORATORIES LICENSING CORPORATION, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: KADU, HARSHAD; SONG, QING; SU, GUAN-MING; SIGNING DATES FROM 20181212 TO 20181213; REEL/FRAME: 056542/0596 |
| STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |