CN110572662B - Combined scalability processing for multi-layer video coding - Google Patents
- Publication number
- CN110572662B (application number CN201910865427.0A)
- Authority
- CN
- China
- Prior art keywords
- bit depth
- component
- bit
- luma
- chroma
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/187—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scalable video layer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/34—Scalability techniques involving progressive bit-plane based encoding of the enhancement layer, e.g. fine granular scalability [FGS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Controls And Circuits For Display Device (AREA)
- Color Image Communication Systems (AREA)
- Image Processing (AREA)
- Facsimile Image Signal Circuits (AREA)
Abstract
A video coding system may perform interlayer processing by simultaneously performing inverse tone mapping and gamut conversion scalability processes on a base layer of a video signal. The video coding system may then perform upsampling at the processed base layer. The processed base layer may be used to encode the enhancement layer. The bit depth may be considered for the gamut conversion module. The chroma and/or luma bit depth may be aligned with respective larger or smaller bit depth values of chroma and/or luma.
Description
The present application is a divisional application of Chinese patent application 201480055145.X, entitled "Combined scalability processing for multi-layer video coding", filed on October 7, 2014.
Cross reference
The present application claims the benefit of U.S. provisional patent application No. 61/887,782, filed on October 7, 2013, and U.S. provisional patent application No. 62/045,495, filed in September 2014, the contents of which are incorporated herein by reference in their entirety.
Background
As digital display technology evolves, display resolutions continue to increase. For example, High Definition (HD) digital video streams, which until recently embodied the best commercially available display resolution, are being surpassed by Ultra High Definition (UHD) displays (e.g., 4K displays, 8K displays, etc.).
Video coding is often used to compress digital video signals, for example, to reduce consumed memory space and/or reduce transmission bandwidth consumption associated with such signals. Scalable Video Coding (SVC) has been shown to improve the quality of experience of video applications running on devices with different capabilities over heterogeneous networks. Scalable video coding may consume less resources (e.g., communication network bandwidth, storage, etc.) than non-scalable video coding techniques.
Known SVC video coding implementations (e.g., implementations using spatial scalability) have proven effective for the coding of HD video signals, but exhibit drawbacks when processing digital video signals that extend beyond HD resolution (e.g., UHD video signals).
Disclosure of Invention
A video coding system may perform inter-layer processing. The video coding system may perform both inverse (reverse) tone mapping and gamut conversion scalability processes at the video signal layer of the video signal. The video coding system may perform upsampling at the video signal layer. For example, the upsampling process may be performed after a combined inverse tone mapping and gamut conversion scalability process. Coding as used herein includes encoding and/or decoding.
For example, the combined processing module may be used to perform the inverse tone mapping and gamut conversion scalability processes simultaneously at a lower layer, such as the base layer. The combined processing module may take as input the sample bit depth of the input luma component and the sample bit depth of the input chroma component, and may calculate the sample bit depth of the output luma component and the sample bit depth of the output chroma component based on these inputs. The output of the combined processing module (e.g., video comprising an output luma component and an output chroma component) and/or an indication of the output (e.g., one or more parameters indicating the sample bit depths of the output luma and chroma components) may be sent to an upsampling processing module for upsampling. The processed base layer may be used to encode the enhancement layer. The processed base layer may be used to predict the enhancement layer.
The video coding system may perform color conversion from a first color space to a second color space. For example, color component values, such as a chrominance component and/or a luminance component, may be obtained for a pixel. The color component values may be represented with different bit depths. The bit depths may be aligned, and the color component values may be converted from the first color space to the second color space using a cross-color component model. The alignment may be based on the input chroma bit depth, the input luma bit depth, the minimum input bit depth, and/or the maximum input bit depth. The bit depths may be aligned with the larger of the bit depth values and/or with the smaller of the bit depth values. When performing color mapping for a chrominance component of the video signal, the bit depth of the luminance component of the video signal may be aligned with the bit depth of the chrominance component. When performing color mapping for a luminance component of the video signal, the bit depth of the chrominance component of the video signal may be aligned with the bit depth of the luminance component.
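By way of non-normative illustration, a minimal Python sketch of the bit depth alignment described above may be as follows (the function name align_bit_depths and the shifting rule shown are assumptions for illustration, not part of the disclosed syntax):

def align_bit_depths(luma_samples, chroma_samples,
                     luma_bit_depth, chroma_bit_depth,
                     align_to="max"):
    # Align luma/chroma sample bit depths by shifting.
    # align_to="max" left-shifts the lower-depth component up to the larger
    # bit depth; align_to="min" right-shifts the higher-depth component down
    # to the smaller bit depth.
    target = (max(luma_bit_depth, chroma_bit_depth) if align_to == "max"
              else min(luma_bit_depth, chroma_bit_depth))

    def shift(samples, depth):
        if depth < target:                      # scale up
            return [s << (target - depth) for s in samples]
        if depth > target:                      # scale down
            return [s >> (depth - target) for s in samples]
        return list(samples)

    return shift(luma_samples, luma_bit_depth), \
           shift(chroma_samples, chroma_bit_depth), target

# Example: 10-bit luma, 8-bit chroma, aligned to the larger bit depth (10).
y, c, depth = align_bit_depths([512, 700], [128, 200], 10, 8, align_to="max")
print(y, c, depth)   # [512, 700] [512, 800] 10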
Drawings
Fig. 1 depicts an example multi-layer scalable video coding system.
Fig. 2 depicts an example of temporal and inter-layer prediction for stereoscopic video coding.
Fig. 3 is a table of example scalability types that may be performed in video coding.
Fig. 4 is a table of Ultra High Definition Television (UHDTV) and High Definition Television (HDTV) specifications.
Fig. 5 depicts a comparison of the color space of Ultra High Definition Television (UHDTV) and High Definition Television (HDTV).
Fig. 6 is a table describing an example of bitstream layers that may support HD to UHD scalability.
Fig. 7 is a table describing another example of bitstream layers that may support HD to UHD scalability.
Fig. 8 is a simplified block diagram depicting an example two-layer scalable video encoder that may be configured to perform HD to UHD scalability.
Fig. 9 is a simplified block diagram depicting an example two-layer scalable video decoder that may be configured to perform HD to UHD scalability.
Fig. 10 depicts an example of interlayer processing using multiple processing modules.
Fig. 11 is a syntax table describing an example of signaling the interlayer process and/or the selection and processing of the interlayer processing module.
FIG. 12 depicts a table of example values that may be used with the example syntax table of FIG. 11.
Fig. 13 depicts an example of interlayer processing using a combined inverse tone mapping and upsampling processing module.
Fig. 14 depicts an example of interlayer processing using a combined inverse tone mapping and gamut conversion processing module.
Fig. 15 is a syntax table describing the combined gamut conversion and inverse tone mapping process.
FIG. 16 is a table of example values that may be used with the example syntax table of FIG. 11.
FIG. 17A depicts a system diagram of an example communication system in which one or more disclosed embodiments may be implemented.
Fig. 17B depicts a system diagram of an example wireless transmit/receive unit (WTRU) that may be used in the communication system shown in fig. 17A.
Fig. 17C depicts a system diagram of an example radio access network and an example core network that may be used in the communication system shown in fig. 17A.
Fig. 17D depicts a system diagram of an example radio access network and an example core network that may be used in the communication system shown in fig. 17A.
Fig. 17E depicts a system diagram of an example radio access network and an example core network that may be used in the communication system shown in fig. 17A.
Detailed Description
Fig. 1 is a simplified block diagram depicting an example block-based hybrid Scalable Video Coding (SVC) system. The spatial and/or temporal signal resolution represented by layer 1 (e.g., the base layer) may be generated by downsampling the input video signal. In a subsequent encoding stage, the setting of a quantizer such as Q1 may determine the quality level of the base information. One or more subsequent higher layers may be encoded and/or decoded using the base layer reconstruction Y1, which may represent an approximation of the higher layer resolution level. The upsampling unit may perform upsampling of the base layer reconstructed signal to the layer 2 resolution. Downsampling and/or upsampling may be performed across multiple layers (e.g., for N layers: layers 1, 2, ..., N). The downsampling and/or upsampling rates may differ depending on, for example, the dimension of scalability between the two layers.
In the example scalable video coding system of Fig. 1, for a given higher layer n (e.g., 2 ≤ n ≤ N, with N being the total number of layers), a differential signal may be generated by subtracting an upsampled lower layer signal (e.g., the layer n-1 signal) from the current layer n signal. The differential signal may be encoded. If the respective video signals represented by two layers, n1 and n2, have the same spatial resolution, the corresponding downsampling and/or upsampling operations may be bypassed. A given layer n (e.g., 1 ≤ n ≤ N) or multiple layers may be decoded without using decoding information from higher layers.
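By way of illustration, a minimal Python sketch of generating the differential signal for a higher layer may be as follows (nearest-neighbor upsampling stands in for the actual upsampling filter; all names and values are illustrative):

def upsample_nearest(picture, factor=2):
    # Stand-in nearest-neighbor upsampling; a real codec would use an
    # interpolation filter such as the one discussed later in the text.
    out = []
    for row in picture:
        wide = [v for v in row for _ in range(factor)]
        for _ in range(factor):
            out.append(list(wide))
    return out

def differential_signal(layer_n, upsampled_lower):
    # Subtract the upsampled lower layer signal from the current layer signal.
    return [[a - b for a, b in zip(ra, rb)]
            for ra, rb in zip(layer_n, upsampled_lower)]

# Toy 2x2 lower layer (layer n-1) and 4x4 current layer (layer n).
base = [[100, 110], [120, 130]]
enh  = [[101, 102, 111, 112],
        [103, 104, 113, 114],
        [121, 122, 131, 132],
        [123, 124, 133, 134]]
residual = differential_signal(enh, upsample_nearest(base))
print(residual)   # small differences in the range 1..4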
Relying on coding of residual signals (e.g., the differential signals between two layers) for layers other than the base layer, for example using the example SVC system of Fig. 1, may cause visual artifacts. Such visual artifacts may be due to, for example, quantization and/or normalization of the residual signal to limit its dynamic range, and/or quantization performed during encoding of the residual. One or more higher layer encoders may employ motion estimation and/or motion compensated prediction as the respective encoding modes. Motion estimation and/or motion compensation in the residual signal may differ from conventional motion estimation and may be prone to visual artifacts. To reduce (e.g., minimize) the occurrence of visual artifacts, more sophisticated residual quantization may be implemented together with, for example, a joint quantization process, which may include quantization and/or normalization of the residual signal to limit its dynamic range and quantization performed during encoding of the residual.
Scalable video coding may enable transmission and decoding of partial bitstreams. This enables SVC to provide video services at lower temporal and/or spatial resolutions or with reduced fidelity, while maintaining a relatively high reconstruction quality (e.g., given the respective rates of the partial bitstreams). SVC may be implemented with single loop decoding, whereby an SVC decoder may establish one motion compensation loop at the layer being decoded and may not establish motion compensation loops at one or more other lower layers. For example, the bitstream may include two layers, including a first layer (e.g., layer 1), which may be a base layer, and a second layer (e.g., layer 2), which may be an enhancement layer. When such an SVC decoder reconstructs layer 2 video, decoded picture buffering and motion compensated prediction may be limited to layer 2. In such an implementation of SVC, the respective reference pictures from lower layers may not be fully reconstructed, which reduces computational complexity and/or memory consumption at the decoder.
Single loop decoding may be obtained by constrained inter-layer texture prediction, where spatial texture prediction from a lower layer may be allowed for a current block in a given layer if a block of the respective lower layer is coded in an intra mode. When the lower layer block is encoded in the intra mode, it may be reconstructed without motion compensation operations and/or decoded image buffering.
SVC may employ one or more additional inter-layer prediction techniques, such as motion vector prediction from one or more lower layers, residual prediction, mode prediction, and the like. This may improve the rate-distortion efficiency of the enhancement layer. SVC implementations utilizing single loop decoding may exhibit reduced computational complexity and/or reduced memory consumption at the decoder, but increased implementation complexity due to, for example, reliance on block-level inter-layer prediction. To compensate for the performance penalty incurred by imposing the single loop decoding constraint, the design and computational complexity of the encoder may be increased to obtain the desired performance. Coding of interlaced content may not be supported by SVC.
Multiview Video Coding (MVC) may provide view scalability. In an example of view scalability, a base layer bitstream may be decoded to reconstruct a conventional two-dimensional (2D) video, and one or more additional enhancement layers may be decoded to reconstruct other view representations of the same video signal. When such views are combined together and displayed by a three-dimensional (3D) display, 3D video with suitable depth perception may be produced.
Fig. 2 depicts an example prediction structure for encoding stereoscopic video having a left view (e.g., layer 1) and a right view (e.g., layer 2) using MVC. The left view video may be encoded in an I-B-P prediction structure and the right view video may be encoded in a P-B prediction structure. As shown in fig. 2, in the right view, a first picture collocated with a first I picture in the left view is encoded as a P picture, and a subsequent picture in the right view is encoded as a B picture, the B picture having a first prediction from a temporal reference in the right view and a second prediction from an inter-layer reference in the left view. MVC may not support the single loop decoding feature. For example, as shown in fig. 2, decoding right view (layer 2) video may be conditioned on the availability of all images in the left view (layer 1), each layer (e.g., view) having a respective compensation loop. Implementations of MVC may include high-level syntax changes and may not include block-level changes. This eases the implementation of MVC. MVC may be implemented, for example, by configuring reference pictures at the slice (slice) and/or picture level. MVC may support encoding more than two views by, for example, expanding the example in fig. 2 to perform inter-layer prediction between multiple views.
MPEG Frame Compatible (MFC) video coding may provide a scalable extension to 3D video coding. For example, MFC may provide a scalable extension to frame compatible base layer video (e.g., two views packed into the same frame) and may provide one or more enhancement layers to restore full resolution views. Stereoscopic 3D video may have two views, including left and right views. Stereoscopic 3D content may be delivered by packing and/or multiplexing two views into one frame and by compressing and transmitting the packed video. On the receiver side, after decoding, the frames are unpacked and displayed as two views. This multiplexing of views may be performed in the time domain or in the spatial domain. When performed in the spatial domain, to maintain the same image size, the two views may be spatially downsampled (e.g., by a factor of 2) and packaged in one or more arrangements (arrangements). For example, a side-by-side arrangement may place the downsampled left view in the left half of the image and the downsampled right view in the right half of the image. Other arrangements may include up and down, row by row, checkerboard, etc. For example, an arrangement for implementing frame compatible 3D video may be delivered by one or more frame packing arrangement SEI messages. This arrangement can achieve 3D delivery with minimal bandwidth consumption increase.
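A minimal Python sketch of side-by-side frame-compatible packing may be as follows (simple horizontal decimation stands in for a proper downsampling filter; the function names are illustrative):

def pack_side_by_side(left_view, right_view):
    # Horizontally downsample each view by a factor of 2 (plain decimation
    # here) and place the left view in the left half of the packed frame and
    # the right view in the right half.
    def downsample_horiz(picture):
        return [row[::2] for row in picture]

    packed = []
    for l_row, r_row in zip(downsample_horiz(left_view),
                            downsample_horiz(right_view)):
        packed.append(l_row + r_row)
    return packed

# Two toy 2x4 views become one 2x4 packed frame.
left  = [[1, 2, 3, 4], [5, 6, 7, 8]]
right = [[11, 12, 13, 14], [15, 16, 17, 18]]
print(pack_side_by_side(left, right))
# [[1, 3, 11, 13], [5, 7, 15, 17]]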
Fig. 3 is a table of example scalability types that may be performed in video coding. One or more of the example scalability types may be implemented as inter-layer prediction processing modes. This may improve the compression efficiency of video coding systems, such as scalable extended video coding systems according to high efficiency video coding (SHVC). Bit depth scalability, color gamut scalability, and/or chroma format scalability may be associated with Base Layer (BL) and Enhancement Layer (EL) video formats. For bit depth scalability, for example, BL video may be in 8 bits, while EL video may be higher than 8 bits. For color gamut scalability, for example, BL video may be color graded according to the BT.709 color gamut, while EL video may be color graded according to the BT.2020 color gamut. For chroma format scalability, for example, BL video may be in YUV 4:2:0 format, while EL video may be in YUV 4:2:2 or YUV 4:4:4 format.
Fig. 4 is a table of exemplary Ultra High Definition Television (UHDTV) and High Definition Television (HDTV) specifications. As shown in fig. 4, the UHDTV video format (e.g., as defined in ITU-R bt.2020) may support higher spatial resolution (e.g., 4Kx2K (3840 x 2160) and 8Kx4K (7680 x 4320) resolutions), higher frame rate (e.g., 120 Hz), higher sampling bit depth (e.g., 10 bits or 12 bits), and wider color gamut than the HDTV video format (e.g., as defined in ITU-R bt.709).
Fig. 5 depicts a comparison of the definition of the respective HDTV color gamut and UHDTV color gamut in terms of CIE colors. As shown, the amount of color covered by the UHDTV gamut is much wider than the amount of color covered by the HDTV gamut.
A video coding system, such as a scalable extended video coding system according to high efficiency video coding (SHVC), may include one or more devices configured to perform video coding. Devices configured to perform video encoding (e.g., to encode and/or decode video signals) may be referred to as video encoding devices. Such video encoding devices may include devices with video capabilities (e.g., televisions), digital media players, DVD players, blu-ray ™ players, networked media playback devices, desktop computers, portable personal computers, tablet devices, mobile phones, video conferencing systems, hardware and/or software based video encoding systems, and so forth. Such video encoding devices may include wireless communication network elements such as wireless transmit/receive units (WTRUs), base stations, gateways, or other network elements.
The video coding system may be configured to support both UHDTV display formats and HDTV display formats. For example, one or more video bitstreams may be encoded in a layered manner using, for example, two layers (with a base layer representing an HDTV video signal used by an HDTV display and an enhancement layer representing an UHDTV video signal used by a UHDTV display). As shown in fig. 4, the differences between the technical specifications of HDTV format and UHDTV format can be extended beyond spatial and temporal resolution differences, including sample bit depth and color gamut differences, for example. Video coding systems configured to support UHDTV may include support for spatial scalability, temporal scalability, bit Depth Scalability (BDS), and Color Gamut Scalability (CGS). Video coding systems may be configured to support multiple scalability types (e.g., spatial, temporal, bit depth, and color gamut scalability) simultaneously.
The video coding system may be configured to support multiple scalability types using, for example, a scalable bitstream comprising more than two layers. Such video coding may be configured such that one video parameter is enhanced per enhancement layer. For example, fig. 6 depicts an example bitstream layer configuration that may be used to upgrade an HD video signal to a UHD video signal. As shown, the example bitstream may have four layers, including a base layer (layer 0) and three enhancement layers (layer 1, layer 2, and layer 3, respectively). The base layer (layer 0) may comprise, for example, a 1080p60 HD video signal. In the first enhancement layer (e.g., layer 1), the spatial resolution may be upgraded to, for example, 4Kx2K (3840x2160). At the second enhancement layer (e.g., layer 2), the sample bit depth may be upgraded, for example, from 8 bits to 10 bits. In the third enhancement layer (e.g., layer 3), the color gamut may be upgraded, for example, from BT.709 to BT.2020. It should be understood that the bitstream layer processing order depicted in fig. 6 is an example processing order, and that other bitstream layer processing orders may be implemented. The described example bitstream layer configuration does not include increasing the frame rate of the video signal; however, temporal scalability may be implemented, for example, to upgrade the frame rate to, for example, 120 fps in one or more layers. An enhancement layer may enhance more than one video parameter.
The video coding system may be configured to perform multi-loop decoding. In multi-loop decoding, one or more dependent layers (e.g., all dependent layers) of the current enhancement layer may need to be fully decoded in order to decode the current enhancement layer. A Decoded Picture Buffer (DPB) may be created in one or more of the dependent layers (e.g., each of the dependent layers). As the number of layers increases, decoding complexity (e.g., computational complexity and/or memory consumption) may increase. The number of layers used to support a desired video format may be limited according to the increased decoding complexity. For example, for HD to UHD scalability, a scalable video bitstream with two layers may be implemented (e.g., the example bitstream layer configuration depicted in fig. 7).
Fig. 8 is a simplified block diagram depicting an example encoder (e.g., SHVC encoder). The described example encoder may be used to generate a two-layer HD to UHD scalable bitstream (e.g., as described in fig. 7). As shown in fig. 8, the Base Layer (BL) video input 830 may be an HD video signal and the Enhancement Layer (EL) video input 802 may be a UHD video signal. The HD video signal 830 and the UHD video signal 802 may correspond to each other in terms of, for example, one or more of the following: one or more downsampling parameters (e.g., spatial scalability); one or more color grading parameters (e.g., color gamut scalability) or one or more tone mapping parameters (e.g., bit depth scalability) 828.
BL encoder 818 may include, for example, a High Efficiency Video Coding (HEVC) video encoder or an H.264/AVC video encoder. BL encoder 818 may be configured to generate BL bitstream 832 using one or more BL reconstructed pictures (e.g., stored in BL DPB 820) for prediction. The EL encoder 804 may comprise, for example, an HEVC encoder. The EL encoder 804 may include one or more high level syntax modifications, for example, to support inter-layer prediction by adding inter-layer reference pictures to the EL DPB. EL encoder 804 may be configured to generate EL bitstream 808 using one or more EL reconstructed images (e.g., stored in EL DPB 806) for prediction.
At an inter-layer processing (ILP) unit 822, one or more of the BL DPBs 820 may be processed using one or more image level inter-layer processing techniques, including one or more of upsampling (e.g., for spatial scalability), gamut conversion (e.g., for gamut scalability), or inverse tone mapping (e.g., for bit depth scalability), to reconstruct BL images. One or more of the processed reconstructed BL images may be used as a reference image for EL encoding. The interlayer processing may be performed based on the enhanced video information 814 received from the EL encoder 804 and/or the base layer information 816 received from the BL encoder 818.
At 826, the EL bitstream 808, the BL bitstream 832, and parameters used in the interlayer processing (such as ILP information 824) may be multiplexed together into the scalable bitstream 812. For example, scalable bit stream 812 may include an SHVC bit stream.
Fig. 9 is a simplified block diagram depicting an example decoder (e.g., SHVC decoder) corresponding to the example encoder depicted in fig. 8. For example, the described example decoder may be used to decode a two-layer HD to UHD bitstream (e.g., as described in fig. 7).
As shown in fig. 9, a demultiplexing module 912 may receive the scalable bit stream 902 and may demultiplex the scalable bit stream 902 to generate ILP information 914, EL bit stream 904, and BL bit stream 918. The scalable bit stream 902 may comprise an SHVC bit stream. The EL bitstream 904 may be decoded by an EL decoder 906. For example, EL decoder 906 may comprise an HEVC video decoder. EL decoder 906 may be configured to generate a UHD video signal 910 using one or more EL reconstructed images (e.g., stored in EL DPB 908) for prediction. The BL bitstream 918 may be decoded by a BL decoder 920. BL decoder 920 may include, for example, an HEVC video decoder or an h.264/AVC video decoder. BL decoder 920 may be configured to generate HD video signal 924 using one or more BL reconstructed images (e.g., stored in BL DPB 922) for prediction. A reconstructed video signal, such as UHD video signal 910 and/or HD video signal 924, may be used to drive a display device.
At the ILP unit 916, one or more of the reconstructed BL pictures in the BL DPB 922 may be processed using one or more picture level inter-layer processing techniques. The image-level interlayer processing techniques may include one or more of the following: upsampling (e.g., for spatial scalability), gamut conversion (e.g., for gamut scalability), and inverse tone mapping (e.g., for bit depth scalability). The one or more processed reconstructed BL images may be used as reference images for EL decoding. The interlayer processing may be performed based on parameters used in the interlayer processing, such as ILP information 914. The prediction information may include a prediction block size, one or more motion vectors (e.g., which may indicate a motion direction and a motion amount), and/or one or more reference indices (e.g., which may indicate from which reference picture the prediction signal was obtained). This can improve the EL decoding efficiency.
The video coding system may perform a combined inter-layer scalability process. The video coding system may use a plurality of inter-layer processing modules in performing inter-layer prediction. One or more interlayer processing modules may be combined. The video coding system may perform the inter-layer processing according to a cascade configuration of inter-layer processing modules. The combined inter-layer scalability process and/or the corresponding module parameters may be signaled.
An example video encoding process includes performing interlayer processing on a base layer of a video signal. The first portion of the interlayer processing may be performed using a combined processing module that performs both the first and second scalability processes. An example video encoding process may include applying a processed base layer to an enhancement layer of a video signal. The first portion of the first interlayer processing may include inverse tone mapping processing and gamut conversion processing. The second part of the interlayer processing may be performed using an upsampling processing module.
The video coding system may be configured to perform the inter-layer processing steps in a particular order by having one or more of the inter-layer processing modules perform in a particular order. The interlayer processing module may be responsible for performing specific interlayer processes. One or more interlayer processes may be combined into one or more corresponding interlayer processing modules, whereby the interlayer processing modules may perform more than one interlayer process at the same time. These module configurations may be associated with respective measures of implementation complexity, computational complexity, and/or scalable coding performance. The interlayer processing module may be responsible for performing a plurality of interlayer processes.
The video coding system may be configured to perform inter-layer processing according to the combined scalability. For example, the combined scalability may be implemented in an ILP unit of a video encoder (e.g., ILP unit 822 described in fig. 8) and/or an ILP unit of a video decoder (e.g., ILP unit 916 described in fig. 9). Multiple processing modules may be used to implement the combined scalability.
In an example configuration for combining scalability processing, each processing module may be configured to perform processing associated with a single scalability type. Fig. 10 depicts an example inter-layer video encoding process using multiple processing modules configured to perform video encoding in a cascaded manner. As shown, each processing module is configured to perform a particular scalability type of processing. An example inter-layer video encoding process may be used, for example, to perform HD to UHD scalable encoding. The processing module may be configured to perform a plurality of scalability types of processing.
As shown in fig. 10, the inverse tone mapping module 1020 may convert the 8-bit video 1010 into 10-bit video 1030. The gamut conversion module 1040 may convert the BT.709 video 1030 to BT.2020 video 1050. The upsampling module 1060 may be used to convert the 1920x1080 spatial resolution video 1050 into 3840x2160 spatial resolution video 1070. In combination, these processing modules may implement the processing of the ILP units described in figs. 8 and 9. It should be appreciated that the processing order depicted in fig. 10 (i.e., the order of the processing modules: inverse tone mapping, followed by gamut conversion, followed by upsampling) is an example processing order, and other processing orders may be implemented. For example, the order of the processing modules in the ILP units may be interchanged.
One or more inter-layer processing modules (e.g., each inter-layer processing module) may be configured for per-sample operation. For example, the inverse tone mapping module 1020 may be applied to each sample in the video image to convert 8-bit video to 10-bit video. A per-sample operation may be performed by the gamut conversion module 1040. After the upsampling module 1060 is applied, the number of samples in the video image may be increased (e.g., significantly; in the case of a 2x spatial ratio, the number of samples after upsampling is four times the original).
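A minimal Python sketch of the cascaded per-sample processing of fig. 10 may be as follows (the inverse tone mapping is a plain bit shift and the gamut conversion is an identity placeholder; both are assumptions for illustration only):

def inverse_tone_map(sample, delta_bits=2):
    # Toy inverse tone mapping: a plain left shift from 8-bit to 10-bit.
    # A real module might use a piecewise linear or polynomial model.
    return sample << delta_bits

def gamut_convert(sample):
    # Identity placeholder for BT.709 -> BT.2020 conversion; a real module
    # might apply a 3D LUT or a cross-component linear model.
    return sample

def upsample_2x(picture):
    # Sample-and-hold 2x spatial upsampling placeholder.
    out = []
    for row in picture:
        wide = [v for v in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out

def cascade(bl_picture):
    # Apply the modules in the order of fig. 10: inverse tone mapping, then
    # color gamut conversion, then upsampling. Each of the first two stages
    # touches every sample once; upsampling multiplies the sample count by 4.
    mapped = [[inverse_tone_map(s) for s in row] for row in bl_picture]
    converted = [[gamut_convert(s) for s in row] for row in mapped]
    return upsample_2x(converted)

print(cascade([[64, 128], [32, 255]]))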
In an example implementation of the combined scalability processing, the ILP unit may be configured such that the processing by the upsampling module 1060 may be performed at the end of the interlayer processing (e.g., as shown in fig. 10).
A scalable video coding system may be implemented using multiple layers. The availability, selection, and/or application of the various processes of the cascaded inter-layer processing flow may differ for one or more layers (e.g., for each layer). For example, for one or more layers, processing may be limited to the color gamut conversion process and the upsampling process. For example, the inverse tone mapping process may be omitted. For each layer, the selection of the respective scalability conversion processes and/or the processing order (e.g., as depicted in fig. 10) may be signaled (e.g., according to the example syntax table depicted in fig. 11). This information may be encapsulated, for example, in a Video Parameter Set (VPS) and/or a Sequence Parameter Set (SPS) of the layer. The application of one or more processes by the decoder may be limited by an indication as to whether each individual process is available and/or selected for processing. This may be indicated, for example, by process availability and/or process selection information. In an embodiment, the sequence of processes in the inter-layer processing (i.e., the processing order) may be predefined. In an embodiment, the sequence of processes in the inter-layer processing may be signaled in the bitstream.
A process index corresponding to one or more applicable processes may be specified. A process index may correspond to a process or a combination of processes, and may indicate the respective processes. For example, fig. 12 depicts an example syntax table defining indices that may be used for the process_index field depicted in fig. 11. The encoder may send one or more indices to signal the selection and/or order of the processing modules according to, for example, a cascade of processes as depicted in fig. 10. The selection may be arbitrary. The decoder may receive and decode this signaling and, in response to the signaling, apply the selected processes in the prescribed order when performing inter-layer processing (e.g., using the ILP unit).
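A minimal Python sketch of how a decoder might apply signaled process indices may be as follows (the index-to-module mapping and the module implementations are illustrative assumptions, not the actual table of fig. 12):

# Hypothetical index-to-module mapping in the spirit of fig. 12
# (the index values and names here are illustrative assumptions).
PROCESS_TABLE = {
    0: "inverse_tone_mapping",
    1: "colour_gamut_conversion",
    2: "upsampling",
}

def apply_interlayer_processing(bl_picture, process_indices, modules):
    # Apply the signaled processes to the reconstructed BL picture in the
    # signaled order, as an ILP unit might.
    picture = bl_picture
    for idx in process_indices:
        picture = modules[PROCESS_TABLE[idx]](picture)
    return picture

# Example: the encoder signaled gamut conversion followed by upsampling and
# omitted inverse tone mapping (e.g., spatial-plus-gamut scalability only).
modules = {
    "inverse_tone_mapping": lambda pic: [[s << 2 for s in row] for row in pic],
    "colour_gamut_conversion": lambda pic: pic,                      # identity placeholder
    "upsampling": lambda pic: [row for row in pic for _ in (0, 1)],  # toy row doubling
}
print(apply_interlayer_processing([[10, 20]], [1, 2], modules))   # [[10, 20], [10, 20]]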
One or more additional parameters may be included in the signaling and/or the bitstream to specify the respective module definitions. For example, the signaling may define how each of the processing modules is to be applied. The one or more additional parameters may specify separate and/or combined module definitions. For example, such parameters may be signaled as part of the ILP information.
In an example upsampling process, the signaling may define, for example, the form, shape, size, or coefficients of an upsampling filter applied by the upsampling module. For example, the signaling may specify a separable (separable) 2D filter or a non-separable 2D filter. The signaling may specify a plurality of filters. For example, such a filter may be defined for upsampled luminance image components and/or chrominance image components. The filters may be defined separately or together. When combined with the inverse tone mapping process, the signaling may reflect differences between the respective input and/or output bit depths.
In an example gamut conversion process, for example, signaling may define one or more of the following: color conversion devices (e.g., 3D look-up tables (3D-LUTs)), piecewise linear models, cross-component linear models, linear gain and/or offset models, and the like. For the selected model, one or more of the format, size, coefficients, or other defined parameters may be signaled. When combined with the inverse tone mapping process, the signaling may reflect differences between the respective input and/or output bit depths.
In an example reverse mapping process, the signaling may define, for example, an input bit depth and/or an output bit depth. Multiple input and/or output bit depths may be signaled. For example, respective definitions of input and output bit depths for a luminance image component and for one or more chrominance image components may be signaled. The signaling may specify and/or define parameters (such as piecewise linear models, polynomial models, etc.) for the inverse tone mapping device.
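A minimal Python sketch of a piecewise linear inverse tone mapping from an 8-bit input to a 10-bit output may be as follows (the pivot, slope, and offset values are illustrative assumptions, not signaled parameters):

def piecewise_linear_inverse_tone_map(sample, pivots, slopes, offsets):
    # Piecewise linear inverse tone mapping from a lower input bit depth to
    # a higher output bit depth. pivots are the segment boundaries on the
    # input range; slopes/offsets define each segment.
    for i in range(len(pivots) - 1):
        if pivots[i] <= sample < pivots[i + 1]:
            return slopes[i] * sample + offsets[i]
    return slopes[-1] * sample + offsets[-1]

# Two-segment example mapping 8-bit input to 10-bit output.
pivots  = [0, 128, 256]
slopes  = [3, 5]
offsets = [0, -256]
print([piecewise_linear_inverse_tone_map(s, pivots, slopes, offsets)
       for s in (0, 64, 128, 255)])   # [0, 192, 384, 1019]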
The example syntax table of fig. 12 provides an example of a palette (palette) of available interlayer processing modules signaled by an encoder (e.g., the scalable video encoder of fig. 8). One or more process index values may be signaled. The process index value may correspond to one or more inter-layer processing modules (e.g., scalability for other modes). A decoder (e.g., the scalable video decoder of fig. 9) may receive one or more process indexes via signaling from an encoder and may apply one or more inter-layer processing modules that may correspond to the received process indexes.
For example, the spatial resampling process may support aspect ratio scalability. The index corresponding to the spatial resampling process may be added to the table of fig. 12. In an example, the chroma resampling process may support chroma format scalability. The index corresponding to the chroma resampling process may be added to the table of fig. 12. The syntax defined by the tables in fig. 11 and 12 may support any number of interlayer processing modules.
In an example implementation of the combined scalability processing, the order of application of the multiple inter-layer processing modules may be predetermined (e.g., agreed upon and fixed between the encoder and the decoder). In this case, the signaling of the table of fig. 11 may omit a definition of the processing order, and the decoder may apply the fixed order to the one or more selected and/or signaled processes.
In an example implementation of the combined scalability processing, the selection and/or order of applications for multiple inter-layer processing modules may change (e.g., over time). In such an implementation, signaling specifying one or more of the following may be communicated and/or updated (e.g., at the image level) with one or more scalability layers: the selection of the interlayer processing modules, the order in which the interlayer processing modules are applied, and the individual module definitions (e.g., parameters defining each of the modules). The interlayer processing can be changed from one image to the next using, for example, signaling defined in the tables of fig. 11 and 12. For example, the definition of the 3D-LUT associated with the gamut conversion module may change over time (e.g., to reflect differences in the color regulation applied by the content provider).
In an example implementation of the combined scalability processing according to fig. 10, the inter-layer processing functions may be implemented separately and may be cascaded together. For example, the inter-layer processing functions may be cascaded in an arbitrary order. Depending on the implementation (e.g., pipelined and parallel designs), repeatedly accessing sample values (e.g., each sample value) may incur high resource consumption (e.g., in terms of memory access). Video encoding and/or processing may use fixed point operations. For example, a three-dimensional look-up table (3D LUT) process may be used for gamut conversion.
The processing modules may be combined into a single processing module, whereby the scalability processing is completed at once. In an example implementation of the combined scalability processing, the processing modules depicted in fig. 10 may be combined into a single processing module. In such an all-in-one implementation, each pixel in the input is accessed and processed once (or twice, in the case where separate upsampling is performed) to generate one or more corresponding pixels in the output.
Linear processing may be sufficient for some processing modules, whereas nonlinear processing may be more effective for other processing modules (e.g., in terms of improving EL coding performance). For example, upsampling using a linear filter is effective, whereas for gamut conversion a nonlinear model (e.g., a 3D LUT) may be more effective than a linear model. The inverse tone mapping module may be linear or nonlinear, depending on the type of tone mapping used when generating the video content. Combining nonlinear and linear processing may not be trivial, and the combined module may be nonlinear in nature.
Some processing modules are used more broadly than others. For example, spatial scalability may be used in applications such as video conferencing, where the sample bit depth and color gamut of the input video may remain the same (e.g., 8 bits per sample and bt.709 color gamut). For applications limited to spatial scalability, the inter-layer processing may include an upsampling processing module. In such applications, the upsampling processing module may be kept separate from one or more other processing modules in the ILP unit. When processing may be performed solely by the upsampling processing module, one or more other processing modules (e.g., inverse tone mapping processing modules and/or gamut conversion processing modules) may be bypassed.
One or more functions in the inter-layer processing unit may be aligned with one or more other portions of the video codec. For example, depending on the implementation of SHVC, the upsampling filters for the ½- and ¼-pixel positions may be kept the same as the interpolation filters at the corresponding phases used for motion compensated prediction in HEVC.
FIG. 13 depicts an example implementation of a combined scalability process. One or more processing modules may be combined. As shown, the upsampling processing module may be combined with the inverse tone mapping processing module and the gamut conversion processing module may remain separate (e.g., ordered before the combined upsampling and inverse tone mapping processing module).
As shown in fig. 13, the gamut conversion module 1320 may convert the bt.709 video 1310 to the bt.2020 video 1330. The combined inverse tone mapping and upsampling module 1340 may convert the 8-bit bt.2020 video 1330 having a spatial resolution of 1920x1080 to a 10-bit video 1350 having a spatial resolution of 3840x 2160.
The one or more up-sampling filters may reduce the number of right shifts (shifts) after filtering. For purposes of description in an example implementation of SHVC, the following equations may represent steps in upsampling (e.g., vertical filtering).
intLumaSample = ( fL[ yPhase, 0 ] * tempArray[ 0 ] +
                  fL[ yPhase, 1 ] * tempArray[ 1 ] +
                  fL[ yPhase, 2 ] * tempArray[ 2 ] +
                  fL[ yPhase, 3 ] * tempArray[ 3 ] +
                  fL[ yPhase, 4 ] * tempArray[ 4 ] +
                  fL[ yPhase, 5 ] * tempArray[ 5 ] +
                  fL[ yPhase, 6 ] * tempArray[ 6 ] +
                  fL[ yPhase, 7 ] * tempArray[ 7 ] + ( 1 << 11 ) ) >> 12
The filtering step may be modified to reduce the number of right shifts depending on, for example, the value of delta_bit_depth, which may represent the difference in sample bit depth between the BL and the EL:
intLumaSample = ( fL[ yPhase, 0 ] * tempArray[ 0 ] +
                  fL[ yPhase, 1 ] * tempArray[ 1 ] +
                  fL[ yPhase, 2 ] * tempArray[ 2 ] +
                  fL[ yPhase, 3 ] * tempArray[ 3 ] +
                  fL[ yPhase, 4 ] * tempArray[ 4 ] +
                  fL[ yPhase, 5 ] * tempArray[ 5 ] +
                  fL[ yPhase, 6 ] * tempArray[ 6 ] +
                  fL[ yPhase, 7 ] * tempArray[ 7 ] + ( 1 << ( 11 - delta_bit_depth ) ) ) >> ( 12 - delta_bit_depth )
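A minimal Python sketch of this vertical filtering step, including the delta_bit_depth adjustment, may be as follows (the 8-tap coefficients and the input values are illustrative, not the SHVC filter taps):

def vertical_filter_sample(temp_array, coeffs, delta_bit_depth=0):
    # temp_array holds 8 horizontally filtered intermediate samples; coeffs
    # holds the 8 taps fL[ yPhase, 0..7 ] for one phase. With
    # delta_bit_depth > 0, the rounding offset and right shift are reduced so
    # that the output is scaled up to the EL sample bit depth in the same step.
    acc = sum(c * t for c, t in zip(coeffs, temp_array))
    return (acc + (1 << (11 - delta_bit_depth))) >> (12 - delta_bit_depth)

# Illustrative 8-tap phase whose coefficients sum to 64 (6-bit filter gain).
coeffs = [-1, 4, -11, 52, 26, -8, 3, -1]
temp = [100 * 64] * 8   # constant sample of 100, pre-scaled by the horizontal filter gain
print(vertical_filter_sample(temp, coeffs, delta_bit_depth=0))   # 100 (8-bit output)
print(vertical_filter_sample(temp, coeffs, delta_bit_depth=2))   # 400 (scaled to 10-bit)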
In an embodiment, non-linear tone mapping may be used to generate BL and EL video content. The combined upsampling and inverse tone mapping process may be implemented using a non-linear model, such as a polynomial model, piecewise linear model, or the like. This can improve coding efficiency.
A video encoding device, such as the video encoding system described in fig. 1, the video encoder described in fig. 8, and/or the video decoder described in fig. 9, may encode video signals. The video encoding apparatus may perform the first interlayer processing at a lower layer of the video signal using a combined processing module that simultaneously performs the inverse tone mapping and the gamut conversion scalability processes. The video encoding device may perform the second interlayer processing on the video signal layer using the upsampling processing module.
FIG. 14 depicts an example implementation of a combined scalability process with at least one combined processing module. As shown, the inverse tone mapping processing module may be combined with the gamut conversion processing module, and the upsampling processing module may remain separate. The combined inverse tone mapping and gamut conversion process may be applied before the upsampling process.
As shown in fig. 14, the combined inverse tone mapping and gamut conversion module 1420 may convert the 8-bit bt.709 video 1410 to a 10-bit bt.2020 video 1430. The spatial resolution of the video may remain the same (such as a spatial resolution of 1920x 1080). The combined inverse tone mapping and gamut conversion module 1420 may calculate the sample bit depth of the output luma component and the sample bit depth of the output chroma component based on the sample bit depth of the input luma component and the sample bit depth of the input chroma component. The combined inverse tone mapping and gamut conversion module 1420 may calculate and/or determine an output sampling bit depth based on signaling (e.g., parameters received within the video bitstream). The combined inverse tone mapping and gamut conversion module 1420 may send the results to an upsampling processing module 1440. For example, an indication of the sample bit depth of the output luma component and an indication of the sample bit depth of the output chroma component may be sent to the upsampling processing module 1440. The combined inverse tone mapping and gamut conversion module 1420 may send the video containing the output (e.g., converted) luminance component and the output chrominance component to the upsampling processing module 1440. The upsampling processing module 1440 may receive and convert the 10 bit bt.2020 video 1430 with a spatial resolution of 1920x1080 to the 10 bit bt.2020 video 1450 with a spatial resolution of 3840x 2160. The upsampling process may be performed based on the sample bit depth of the output luma component and the sample bit depth of the output chroma component received from the combined inverse tone mapping and gamut conversion module 1420.
Inverse tone mapping and gamut conversion may be more effective using a nonlinear model. For example, a 3D LUT may be used for color gamut conversion. Using an enhanced 3D LUT (e.g., one with 8-bit input and 10-bit output) in a combined inverse tone mapping and gamut conversion module, such as in the example implementation depicted in fig. 14, may be as effective as using separate nonlinear models in separate processing modules (e.g., according to the example implementation depicted in fig. 10).
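A minimal Python sketch of a 3D LUT based conversion with 8-bit input and 10-bit output using trilinear interpolation may be as follows (the LUT size and the identity-style vertex values are illustrative assumptions; a real LUT would hold encoder-estimated, color-mapped and inverse-tone-mapped values):

LUT_SIZE = 9                      # 9x9x9 vertices over the 8-bit input cube (illustrative)
STEP = 256.0 / (LUT_SIZE - 1)

def lut_value(i, j, k):
    # Value stored at LUT vertex (i, j, k): here an identity mapping rescaled
    # from 8-bit to 10-bit.
    scale = lambda idx: min(1023, round(min(idx * STEP, 255.0) * 1023 / 255))
    return (scale(i), scale(j), scale(k))

def lut_lookup(y, cb, cr):
    # Trilinear interpolation of an 8-bit (Y, Cb, Cr) triplet through the LUT,
    # producing a 10-bit triplet.
    iy, icb, icr = (min(int(v / STEP), LUT_SIZE - 2) for v in (y, cb, cr))
    fy, fcb, fcr = y / STEP - iy, cb / STEP - icb, cr / STEP - icr
    out = [0.0, 0.0, 0.0]
    for dy, wy in ((0, 1 - fy), (1, fy)):
        for dc, wc in ((0, 1 - fcb), (1, fcb)):
            for dr, wr in ((0, 1 - fcr), (1, fcr)):
                vertex = lut_value(iy + dy, icb + dc, icr + dr)
                weight = wy * wc * wr
                for n in range(3):
                    out[n] += weight * vertex[n]
    return tuple(round(v) for v in out)

print(lut_lookup(128, 64, 200))   # (514, 257, 802) for this identity-style LUT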
Test sequences were run for the example implementations of combined scalability processing depicted in figs. 13 and 14. For the example implementation according to fig. 13, a 3D LUT technique with 8-bit input and 8-bit output was used in the gamut conversion processing module, and a single technique was used in the combined inverse tone mapping and upsampling processing module. For the example implementation according to fig. 14, an enhanced 3D LUT technique with 8-bit input and 10-bit output was used in the combined gamut conversion and inverse tone mapping processing module, and the upsampling processing module was consistent with the implementation of SHVC.
For the test sequences of the two example implementations of combined scalability processing, the model parameters in the 3D LUT were estimated using a Least Squares (LS) technique, using the BL and EL videos (downsampled if the resolutions differ) as training sequences. Simulation results show that both example implementations enhance scalable coding efficiency, with the example implementation of fig. 14 slightly outperforming the example implementation of fig. 13. The higher coding efficiency may be due to the inherent nonlinearity of the inverse tone mapping process, which the enhanced 3D LUT is able to take into account. For example, the estimation and/or training of one or more model parameters for an enhanced 3D LUT that may be used in a combined processing module (e.g., the combined inverse tone mapping and gamut conversion processing module of the example implementation depicted in fig. 14) may be based on training content that reflects the input bit depth and/or the output bit depth of the inverse tone mapping process performed by the combined processing module.
The inverse tone mapping process and the gamut conversion process may be combined using polynomials of different orders, component-independent linear models, cross-component linear models, and/or piecewise linear models. For example, the encoder may derive the model parameters based on the source content of one layer and the target content of another layer using a least-squares training technique so as to achieve a reduced (e.g., minimal) matching error.
The combined inter-layer scalability process and/or the corresponding model parameters may be signaled. For example, a combined scalability process may be signaled in the bitstream, where a syntax element may indicate which combined scalability process (e.g., the one depicted in fig. 13 or the one depicted in fig. 14) is to be used. The syntax element may be signaled at the sequence level, e.g., as part of the VPS and/or as part of the SPS. The syntax element may be signaled at the picture level, e.g., in a slice segment header, as part of a Picture Parameter Set (PPS), and/or as part of an Adaptive Parameter Set (APS). The encoder may select a combined scalability process based on, for example, the video input, and may indicate the selected combined scalability process to the decoder.
In an example implementation of the combined inter-layer scalability process, the combined scalability process may be predefined. For example, the combined scalability process described in fig. 14 may be selected. The encoder and decoder may reuse a particular combined scalability process without additional signaling.
The gamut conversion techniques that may be used in the combined inter-layer scalability process may include one or more of the following: gain and offset, cross-component linear, piecewise linear, and 3D LUT. The example syntax table depicted in fig. 15 describes an example of signaling the combined scalability process and the parameters of the combined gamut conversion and inverse tone mapping process. The example syntax described in fig. 15 may be used in an example according to the example implementation depicted in fig. 14.
As shown in fig. 15, the input and output bit depth values may be included as parameters of the color mapping process. The color mapping process may base the processing on a parameter indicating the sample bit depth of the input luma component of the color mapping process. For example, the sample bit depth of the input luma component may be signaled as a delta (change) relative to 8. As shown in fig. 15, the parameter indicating the sample bit depth of the input luma component may be referred to as bit_depth_input_luma_minus8, for example. Other parameter names may be used.
The color mapping process may base the processing on a parameter indicating the sample bit depth of the input chroma component of the color mapping process. For example, the sample bit depth of the input chroma component may be signaled as a delta (change); for instance, the input chroma bit depth may be signaled as a delta relative to the input luma bit depth. As shown in fig. 15, the parameter indicating the sample bit depth of the input chroma component may be referred to as bit_depth_input_chroma_delta, for example. Other parameter names may be used. Signaling a delta may reduce signaling overhead, because the syntax element to be coded typically has a small value. Other bit depth signaling techniques may be used.
The color mapping process may base the processing on a parameter indicating the sample bit depth of the output luma component of the color mapping process. For example, the sample bit depth of the output luma component may be signaled as a delta (change), for instance relative to 8 or relative to the input luma bit depth. As shown in fig. 15, this output parameter may be referred to as bit_depth_output_luma_delta, for example. Other parameter names may be used.
The color mapping process may base the processing on a parameter indicating the sample bit depth of the output chroma component of the color mapping process. For example, the sample bit depth of the output chroma component may be signaled as a delta (change), for instance relative to 8 or relative to the input chroma bit depth. As shown in fig. 15, this output parameter may be referred to as bit_depth_output_chroma_delta, for example. Other parameter names may be used.
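A minimal C sketch of how a decoder might derive the four bit depth values from the delta-coded parameters described above. The structure names are hypothetical, and the choice of reference for the two output deltas (the corresponding input bit depth) is one of the interpretations named in the text, not a normative derivation.

    /* Delta-coded bit depth parameters, following the example of fig. 15. */
    typedef struct {
        int bit_depth_input_luma_minus8;
        int bit_depth_input_chroma_delta;   /* relative to the input luma bit depth */
        int bit_depth_output_luma_delta;    /* assumed relative to the input luma bit depth */
        int bit_depth_output_chroma_delta;  /* assumed relative to the input chroma bit depth */
    } ColorMappingBitDepthSyntax;

    typedef struct {
        int input_luma, input_chroma;
        int output_luma, output_chroma;
    } ColorMappingBitDepths;

    static ColorMappingBitDepths derive_bit_depths(const ColorMappingBitDepthSyntax *s)
    {
        ColorMappingBitDepths d;
        d.input_luma    = s->bit_depth_input_luma_minus8 + 8;                /* cf. eq. (1a) below */
        d.input_chroma  = d.input_luma   + s->bit_depth_input_chroma_delta;  /* cf. eq. (1b) below */
        d.output_luma   = d.input_luma   + s->bit_depth_output_luma_delta;   /* assumed reference  */
        d.output_chroma = d.input_chroma + s->bit_depth_output_chroma_delta; /* assumed reference  */
        return d;
    }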
A syntax element cgs_method may be included in the example syntax table of fig. 15 to indicate the CGS technique used. Examples of cgs_method include gain and/or offset, cross-component linear, piecewise linear, 3D LUT, a customized CGS method, and the like. After cgs_method is transmitted, one or more corresponding model parameters may be signaled (e.g., according to the respective CGS technique). The example syntax table depicted in fig. 15 may be included in sequence-level signaling (such as in one or more of the VPS, SPS, and PPS). The example syntax table depicted in fig. 15 may be included in picture-level signaling (such as in one or more of the slice header and the APS).
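For illustration, a small C sketch of how a cgs_method indicator could select which model parameters to parse. The enumerator names and numeric values are hypothetical; the text only lists gain and offset, cross-component linear, piecewise linear, 3D LUT, and customized methods as possible techniques.

    /* Hypothetical enumeration of CGS techniques selectable via cgs_method. */
    typedef enum {
        CGS_GAIN_OFFSET = 0,
        CGS_CROSS_COMPONENT_LINEAR,
        CGS_PIECEWISE_LINEAR,
        CGS_3D_LUT,
        CGS_CUSTOM
    } CgsMethod;

    /* After cgs_method is parsed, the corresponding model parameters follow. */
    static void parse_cgs_model_parameters(CgsMethod m)
    {
        switch (m) {
        case CGS_GAIN_OFFSET:            /* parse per-component gain and offset   */ break;
        case CGS_CROSS_COMPONENT_LINEAR: /* parse linear model coefficients       */ break;
        case CGS_PIECEWISE_LINEAR:       /* parse pivot points and segment slopes */ break;
        case CGS_3D_LUT:                 /* parse octant LUT entries              */ break;
        case CGS_CUSTOM:                 /* parse user-defined parameters         */ break;
        }
    }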
The luma and/or chroma bit depths may be derived from syntax elements in the VPS or SPS. For example, in the example two-layer scalable video encoder and decoder of figs. 8 and 9, respectively, the BL bit depth (e.g., equivalent to the input bit depth of the combined processing module) and the EL bit depth (e.g., equivalent to the output bit depth of the combined processing module) may be obtained from syntax elements such as bit_depth_luma/chroma_minus8 in the SPS. The gamut conversion parameters may be sent in the PPS. Parameter sets (e.g., VPS, SPS, and PPS) may be parsed and decoded independently. Signaling the bit depth values using the example syntax table depicted in fig. 15 may simplify parsing and decoding of the model parameters for the combined gamut conversion and inverse tone mapping process.
The example signaling depicted in fig. 15 may be defined as the process of a first combined inter-layer processing module, e.g., corresponding to the combined inverse tone mapping and gamut conversion processing module depicted in the example implementation of fig. 14. The first combined inter-layer processing module may be used in a fixed configuration of the form of the example implementation depicted in fig. 14. The first combined inter-layer processing module may also be one of the inter-layer processing modules available in a cascade processing configuration (e.g., as depicted in fig. 10). The selection and application of the first combined inter-layer processing module may be signaled, for example, using the example syntax tables of figs. 11 and 12. An appropriate process_index may be added to the example syntax table of fig. 12 to indicate application of the first combined inter-layer processing module.
A second combined inter-layer processing module (e.g., the combined inverse tone mapping and upsampling processing module depicted in the example implementation of fig. 13) may be defined. A suitable process definition for this combined inter-layer processing module may be specified. For example, the process definition may define one or more of the form, size, shape, or coefficients of an upsampling filter for spatial scalability upsampling, and may further define one or more input bit depths and/or one or more output bit depths. An appropriate process_index may be added to the example syntax table of fig. 12 to indicate application of the second combined inter-layer processing module. The second combined inter-layer processing module may be another inter-layer processing module available in a cascade processing configuration (e.g., as depicted in fig. 10).
Any number of combined inter-layer processing modules may be defined and/or incorporated into the cascading framework depicted in fig. 10, for example, using the signaling frameworks defined in the example syntax tables in fig. 11 and 12.
FIG. 16 depicts an example syntax table that illustrates the definition of a process index. As shown, the process index may correspond to a combined inverse tone mapping and gamut conversion. For example, process_index=3 corresponds to the first combined interlayer processing module. As shown, the process index may correspond to a combined inverse tone mapping and upsampling. For example, process_index=4 may correspond to the second combined interlayer processing module.
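A minimal sketch of how the process_index values described above might be enumerated. The assignments 3 and 4 follow the example of fig. 16; the remaining values are assumptions standing in for the individual cascade processes.

    /* Assumed process_index assignments; 3 and 4 follow the fig. 16 example. */
    typedef enum {
        PROC_UPSAMPLING            = 0,  /* assumption */
        PROC_GAMUT_CONVERSION      = 1,  /* assumption */
        PROC_INVERSE_TONE_MAPPING  = 2,  /* assumption */
        PROC_COMBINED_ITM_CGC      = 3,  /* combined inverse tone mapping + gamut conversion */
        PROC_COMBINED_ITM_UPSAMPLE = 4   /* combined inverse tone mapping + upsampling */
    } InterLayerProcessIndex;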
Bit depth may be considered in the gamut conversion module. The gamut conversion process may convert signals from one color space to another color space. Cross-color-component relationships may be applied in the gamut conversion function. For example, in 3D LUT-based gamut conversion, such as the gamut conversion process adopted in the final version of the scalable extension of HEVC, the 3D color space may be divided into multiple octants. Within one or more octants, a cross-color-component linear model may be applied, for example as follows:
outputSampleX = ((LutX[yIdx][uIdx][vIdx][0] * inputSampleY + LutX[yIdx][uIdx][vIdx][1] * inputSampleU + LutX[yIdx][uIdx][vIdx][2] * inputSampleV + nMappingOffset) >> nMappingShift) + LutX[yIdx][uIdx][vIdx][3]    (1)
The parameter outputSampleX may indicate the output sample value of color component X (e.g., X may be Y, U, or V) after gamut conversion. The parameter LutX[yIdx][uIdx][vIdx][i] may indicate the i-th LUT parameter of the octant specified by (yIdx, uIdx, vIdx) for color component X, where 0 <= i <= 3. The parameters nMappingShift and nMappingOffset may control the precision of the fixed-point operations during gamut conversion, and the parameters inputSampleY, inputSampleU, and inputSampleV may include the input values of the color components Y, U, and V prior to gamut conversion.
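The following C sketch applies the per-octant cross-component linear model of equation (1) to one output sample. The function name and the flat parameter layout are illustrative; an actual SHVC implementation may organize the 3D LUT differently.

    /* Apply the cross-component linear model of equation (1) for one sample of
     * output component X. lutX points to the 4 LUT parameters of the octant
     * selected by (yIdx, uIdx, vIdx), i.e. LutX[yIdx][uIdx][vIdx][0..3]. */
    static int apply_octant_model(const int lutX[4],
                                  int inputSampleY, int inputSampleU, int inputSampleV,
                                  int nMappingShift, int nMappingOffset)
    {
        long long acc = (long long)lutX[0] * inputSampleY
                      + (long long)lutX[1] * inputSampleU
                      + (long long)lutX[2] * inputSampleV
                      + nMappingOffset;
        return (int)(acc >> nMappingShift) + lutX[3];
    }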
In an embodiment, the respective bit depth values of the luma and chroma samples may be different. These bit depth values may be specified, for example, by bit_depth_input_luma_minus8 and bit_depth_input_chroma_delta in fig. 15, or by parameter sets such as the VPS, SPS, and/or PPS. The input luma bit depth may be denoted InputLumaBitDepth and the input chroma bit depth may be denoted InputChromaBitDepth. The input luma bit depth and the input chroma bit depth may be derived, for example, based on the signaling described in fig. 15, according to the following:
InputLumaBitDepth = bit_depth_input_luma_minus8 + 8    (1a)
InputChromaBitDepth = InputLumaBitDepth + bit_depth_input_chroma_delta    (1b)
Video standards such as H.264/AVC and HEVC allow the luma and chroma components to have different bit depths. When a cross-color-component model is used, for example when the cross-color-component linear model of equation (1) is applied, the bit depths of the color components may be aligned. According to an example gamut conversion process, before a cross-color-component model such as equation (1) is applied, the luma and/or chroma sample bit depths may be aligned to the larger of the two bit depth values, denoted MaxBitDepth = max(InputLumaBitDepth, InputChromaBitDepth). For example, DeltaMaxLumaBitDepth and DeltaMaxChromaBitDepth may be defined as follows:
DeltaMaxLumaBitDepth = MaxBitDepth - InputLumaBitDepth
DeltaMaxChromaBitDepth = MaxBitDepth - InputChromaBitDepth
the cross color component linear model may be applied as follows:
outputSampleX = ((LutX[yIdx][uIdx][vIdx][0] * (inputSampleY << DeltaMaxLumaBitDepth) + LutX[yIdx][uIdx][vIdx][1] * (inputSampleU << DeltaMaxChromaBitDepth) + LutX[yIdx][uIdx][vIdx][2] * (inputSampleV << DeltaMaxChromaBitDepth) + nMappingOffset) >> nMappingShift) + LutX[yIdx][uIdx][vIdx][3]    (2)
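A C sketch of the alignment toward the larger of the two input bit depths, as in equation (2). It reuses the hypothetical apply_octant_model helper from the earlier sketch; left-shifting the input samples by the delta values performs the alignment.

    /* Align luma/chroma samples to MaxBitDepth before the linear model (eq. (2)). */
    static int apply_octant_model_max_aligned(const int lutX[4],
                                              int inY, int inU, int inV,
                                              int inputLumaBitDepth, int inputChromaBitDepth,
                                              int nMappingShift, int nMappingOffset)
    {
        int maxBitDepth = inputLumaBitDepth > inputChromaBitDepth
                            ? inputLumaBitDepth : inputChromaBitDepth;
        int deltaMaxLuma   = maxBitDepth - inputLumaBitDepth;
        int deltaMaxChroma = maxBitDepth - inputChromaBitDepth;
        return apply_octant_model(lutX,
                                  inY << deltaMaxLuma,
                                  inU << deltaMaxChroma,
                                  inV << deltaMaxChroma,
                                  nMappingShift, nMappingOffset);
    }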
Alternatively, during the gamut conversion process, the luma and/or chroma bit depths may be aligned to the smaller of the two bit depth values, denoted MinBitDepth = min(InputLumaBitDepth, InputChromaBitDepth). For example, DeltaMinLumaBitDepth and DeltaMinChromaBitDepth may be defined as follows:
DeltaMinLumaBitDepth = InputLumaBitDepth - MinBitDepth
DeltaMinChromaBitDepth = InputChromaBitDepth - MinBitDepth
the cross color component linear model may be applied as follows:
outputSampleX = ((LutX[yIdx][uIdx][vIdx][0] * (inputSampleY >> DeltaMinLumaBitDepth) + LutX[yIdx][uIdx][vIdx][1] * (inputSampleU >> DeltaMinChromaBitDepth) + LutX[yIdx][uIdx][vIdx][2] * (inputSampleV >> DeltaMinChromaBitDepth) + nMappingOffset) >> nMappingShift) + LutX[yIdx][uIdx][vIdx][3]    (3)
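The corresponding sketch for alignment toward the smaller bit depth, as in equation (3); only the direction of the sample shifts changes relative to the previous sketch.

    /* Align luma/chroma samples to MinBitDepth before the linear model (eq. (3)). */
    static int apply_octant_model_min_aligned(const int lutX[4],
                                              int inY, int inU, int inV,
                                              int inputLumaBitDepth, int inputChromaBitDepth,
                                              int nMappingShift, int nMappingOffset)
    {
        int minBitDepth = inputLumaBitDepth < inputChromaBitDepth
                            ? inputLumaBitDepth : inputChromaBitDepth;
        int deltaMinLuma   = inputLumaBitDepth - minBitDepth;
        int deltaMinChroma = inputChromaBitDepth - minBitDepth;
        return apply_octant_model(lutX,
                                  inY >> deltaMinLuma,
                                  inU >> deltaMinChroma,
                                  inV >> deltaMinChroma,
                                  nMappingShift, nMappingOffset);
    }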
The cross-color-component linear model may be applied such that the complexity of one or more multiplication operations in the color mapping is reduced. For example, with the alignment of equation (3), the bit width of the second multiplication operation may be smaller. This may reduce the complexity of an implementation using, for example, an ASIC design.
It should be appreciated that the example process described above that accounts for possible differences between luminance and chrominance bit depths is not limited to implementation in a 3D LUT-based color gamut conversion function, and that the example process described above may be implemented in any color gamut conversion and/or tone mapping function that uses a cross color component model.
The respective values of nMappingShift and/or nMappingOffset may control the precision of the fixed point operation during gamut conversion. For example, the values of nMappingShift and nMappingOffset may be calculated as follows:
nMappingShift = 10 + InputBitDepthX - OutputBitDepthX    (4)
nMappingOffset = 1 << (nMappingShift - 1)    (5)
where InputBitDepthX and OutputBitDepthX may comprise the input and output bit depths, respectively, of color component X (e.g., X may be Y, U, or V) of the color conversion process.
The respective values of InputBitDepthX for luma and chroma may be derived, for example, using equations (1a) and (1b). The respective values of OutputBitDepthX for luma and chroma may be derived, for example, using the following equations:
OutputLumaBitDepth = bit_depth_output_luma_minus8 + 8    (6)
OutputChromaBitDepth = OutputLumaBitDepth + bit_depth_output_chroma_delta    (7)
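A C sketch of the precision parameters of equations (4)-(5) and the output bit depth derivation of equations (6)-(7). It assumes, as in equation (7), that the output chroma bit depth is coded as a delta relative to the output luma bit depth; the function names are illustrative.

    /* Derive nMappingShift / nMappingOffset for color component X (eqs. (4)-(5)). */
    static void derive_mapping_precision(int inputBitDepthX, int outputBitDepthX,
                                         int *nMappingShift, int *nMappingOffset)
    {
        *nMappingShift  = 10 + inputBitDepthX - outputBitDepthX;  /* eq. (4) */
        *nMappingOffset = 1 << (*nMappingShift - 1);              /* eq. (5) */
    }

    /* Derive the output bit depths (eqs. (6)-(7)); the input bit depths follow
     * eqs. (1a)-(1b). */
    static void derive_output_bit_depths(int bit_depth_output_luma_minus8,
                                         int bit_depth_output_chroma_delta,
                                         int *outputLumaBitDepth, int *outputChromaBitDepth)
    {
        *outputLumaBitDepth   = bit_depth_output_luma_minus8 + 8;                    /* eq. (6) */
        *outputChromaBitDepth = *outputLumaBitDepth + bit_depth_output_chroma_delta; /* eq. (7) */
    }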
In one embodiment, the output bit depth of a color component during color conversion is greater than or equal to the input bit depth of that color component. The color conversion process may be performed from a lower quality in the BL to a higher quality in the EL, so the value (InputBitDepthX - OutputBitDepthX) may be negative. As the difference between the input and output bit depths increases, the value of nMappingShift becomes smaller, which may correspondingly reduce the precision of the fixed-point calculation.
When the bit depth delta between input and output for the luma component, (InputBitDepthY - OutputBitDepthY), differs from the bit depth delta between input and output for the chroma components, (InputBitDepthC - OutputBitDepthC), techniques may be used to calculate nMappingShift and/or nMappingOffset for luma and/or chroma. For example, nMappingShift may be calculated using (InputBitDepthY - OutputBitDepthY) and applied to one or both of luma and chroma. Alternatively, nMappingShift may be calculated using (InputBitDepthC - OutputBitDepthC) and applied to one or both of luma and chroma. In another example, nMappingShift and/or nMappingOffset may be calculated using the following formulas:
nMappingShift = 10 + min(InputBitDepthY - OutputBitDepthY, InputBitDepthC - OutputBitDepthC)    (8)
nMappingOffset = 1 << (nMappingShift - 1)    (9)
These values may be applied to one or both of the luminance and chrominance components in the color conversion process. For example, these values may be used for nMappingShift and nMappingOffset in equation (2) and/or equation (3) (such as for each color component X in { Y, U, V }).
The process described above may preserve a higher amount of precision. For example, it may enable high (e.g., maximum) fixed-point accuracy for the gamut conversion process.
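A short C sketch of the shared precision parameters of equations (8)-(9), which select nMappingShift and nMappingOffset from the smaller of the luma and chroma input/output bit depth differences and apply them to all color components.

    /* Shared nMappingShift / nMappingOffset per eqs. (8)-(9), applied to Y, U and V. */
    static void derive_shared_mapping_precision(int inputBitDepthY, int outputBitDepthY,
                                                int inputBitDepthC, int outputBitDepthC,
                                                int *nMappingShift, int *nMappingOffset)
    {
        int dY = inputBitDepthY - outputBitDepthY;
        int dC = inputBitDepthC - outputBitDepthC;
        *nMappingShift  = 10 + (dY < dC ? dY : dC);   /* eq. (8) */
        *nMappingOffset = 1 << (*nMappingShift - 1);  /* eq. (9) */
    }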
The video coding techniques described herein (such as those employing a combined scalability process) may be implemented in conjunction with video transmission in a wireless communication system, such as the example wireless communication system 100 and its components described in figs. 17A-17E.
Fig. 17A is an illustration of an example communication system 100 in which one or more disclosed embodiments may be implemented. Communication system 100 may be a multiple access system that provides content, such as voice, data, video, messages, broadcasts, etc., to a plurality of wireless users. Communication system 100 may enable multiple wireless users to access such content through the sharing of system resources, including wireless bandwidth. For example, communication system 100 may employ one or more channel access methods, such as Code Division Multiple Access (CDMA), time Division Multiple Access (TDMA), frequency Division Multiple Access (FDMA), orthogonal FDMA (OFDMA), single carrier FDMA (SC-FDMA), and so forth.
As shown in fig. 17A, the communication system 100 may include at least one wireless transmit/receive unit (WTRU), such as a plurality of WTRUs, e.g., WTRUs 102a,102b,102c, and 102d, a Radio Access Network (RAN) 104, a core network 106, a Public Switched Telephone Network (PSTN) 108, the internet 110, and other networks 112, although it is understood that the disclosed embodiments contemplate any number of WTRUs, base stations, networks, and/or network elements. Each of the WTRUs 102a,102b,102c,102d may be any type of device configured to operate and/or communicate in wireless communications. As examples, the WTRUs 102a,102b,102c,102d may be configured to transmit and/or receive wireless signals and may include User Equipment (UE), mobile stations, fixed or mobile subscriber units, pagers, cellular telephones, personal Digital Assistants (PDAs), smartphones, portable computers, netbooks, personal computers, wireless sensors, consumer electronics, and the like.
Communication system 100 may also include a base station 114a and a base station 114b. Each of the base stations 114a, 114b may be any type of device configured to wirelessly interface with at least one of the WTRUs 102a, 102b, 102c, 102d to facilitate access to one or more communication networks (e.g., the core network 106, the internet 110, and/or the networks 112). For example, the base stations 114a, 114b may be a Base Transceiver Station (BTS), a Node B, an eNode B, a Home Node B, a Home eNode B, a site controller, an Access Point (AP), a wireless router, and the like. Although the base stations 114a, 114b are each depicted as a single element, it should be understood that the base stations 114a, 114b may include any number of interconnected base stations and/or network elements.
Base station 114a may be part of the RAN 104, and the RAN 104 may also include other base stations and/or network elements (not shown), such as a base station controller (BSC), a Radio Network Controller (RNC), relay nodes, and the like. Base station 114a and/or base station 114b may be configured to transmit and/or receive wireless signals within a particular geographic region, which may be referred to as a cell (not shown). The cell may further be divided into cell sectors. For example, the cell associated with base station 114a may be divided into three sectors. Thus, in one embodiment, the base station 114a may include three transceivers, one for each sector of the cell. In another embodiment, the base station 114a may use multiple-input multiple-output (MIMO) technology and thus may use multiple transceivers for each sector of the cell.
The base stations 114a, 114b may communicate with one or more of the WTRUs 102a, 102b, 102c, 102d over an air interface 116, which air interface 116 may be any suitable wireless communication link (e.g., radio Frequency (RF), microwave, infrared (IR), ultraviolet (UV), visible, etc.). The air interface 116 may be established using any suitable Radio Access Technology (RAT).
More specifically, as previously described, communication system 100 may be a multiple access system and may employ one or more channel access schemes (such as CDMA, TDMA, FDMA, OFDMA, SC-FDMA, and the like). For example, the base station 114a in the RAN 104 and the WTRUs 102a, 102b, 102c may implement a radio technology such as Universal Mobile Telecommunications System (UMTS) Terrestrial Radio Access (UTRA), which may use Wideband CDMA (WCDMA) to establish the air interface 116. WCDMA may include, for example, High Speed Packet Access (HSPA) and/or Evolved HSPA (HSPA+). HSPA may include High Speed Downlink Packet Access (HSDPA) and/or High Speed Uplink Packet Access (HSUPA).
In another embodiment, the base station 114a and the WTRUs 102a,102b,102c may implement a radio technology such as evolved UMTS terrestrial radio access (E-UTRA), which may use Long Term Evolution (LTE) and/or LTE-advanced (LTE-a) to establish the air interface 116.
In other embodiments, the base station 114a and the WTRUs 102a,102b,102c may implement radio technologies such as IEEE 802.16 (i.e., worldwide Interoperability for Microwave Access (WiMAX)), CDMA2000 1x, CDMA2000 EV-DO, temporary standard 2000 (IS-2000), temporary standard 95 (IS-95), temporary standard 856 (IS-856), global system for mobile communications (GSM), enhanced data rates for GSM evolution (EDGE), GSM EDGE (GERAN), and the like.
For example, the base station 114B in fig. 17A may be a wireless router, home node B, home enodeb, or access point, and may use any suitable RAT for facilitating wireless connections in local areas such as a company, home, vehicle, campus, etc. In one embodiment, the base station 114b and the WTRUs 102c,102d may implement a radio technology such as IEEE 802.11 to establish a Wireless Local Area Network (WLAN). In another embodiment, the base station 114b and the WTRUs 102c,102d may implement a radio technology such as IEEE 802.15 to establish a Wireless Personal Area Network (WPAN). In yet another embodiment, the base station 114b and the WTRUs 102c,102d may use a cellular-based RAT (e.g., WCDMA, CDMA2000, GSM, LTE, LTE-a, etc.) to establish a pico cell (picocell) or femto cell (femtocell). As shown in fig. 17A, the base station 114b may have a direct connection to the internet 110. Thus, the base station 114b does not have to access the internet 110 via the core network 106.
The RAN 104 may communicate with a core network 106, which may be any type of network configured to provide voice, data, application, and/or voice over internet protocol (VoIP) services to one or more of the WTRUs 102a, 102b, 102c,102 d. For example, core network 106 may provide call control, billing services, mobile location-based services, prepaid calling, internetworking, video distribution, etc., and/or perform advanced security functions such as user authentication. Although not shown in fig. 17A, it should be appreciated that the RAN 104 and/or the core network 106 may communicate directly or indirectly with other RANs, which may use the same RAT as the RAN 104 or a different RAT. For example, in addition to being connected to a RAN 104 that may employ an E-UTRA radio technology, the core network 106 may also communicate with another RAN (not shown) that uses a GSM radio technology.
The core network 106 may also serve as a gateway for the WTRUs 102a, 102b, 102c, 102d to access the PSTN 108, the internet 110, and/or other networks 112. PSTN 108 may include circuit-switched telephone networks that provide Plain Old Telephone Services (POTS). The internet 110 may include a global system of interconnected computer networks and devices that use common communication protocols such as Transmission Control Protocol (TCP), user Datagram Protocol (UDP), and Internet Protocol (IP) in the TCP/IP internet protocol suite. The network 112 may include a wireless or wired communication network owned and/or operated by other service providers. For example, network 112 may include another core network connected to one or more RANs, which may use the same RAT as RAN 104 or a different RAT.
Some or all of the WTRUs 102a, 102b, 102c, 102d in the communication system 100 may include multi-mode capabilities, i.e., the WTRUs 102a, 102b, 102c, 102d may include multiple transceivers for communicating with different wireless networks over multiple communication links. For example, the WTRU 102c shown in fig. 17A may be configured to communicate with a base station 114a using a cellular-based radio technology and with a base station 114b using an IEEE 802 radio technology.
Fig. 17B is a system block diagram of an example WTRU 102. As shown in fig. 17B, the WTRU 102 may include a processor 118, a transceiver 120, a transmit/receive element 122, a speaker/microphone 124, a keypad 126, a display/touchpad 128, non-removable memory 130, removable memory 132, a power source 134, a Global Positioning System (GPS) chipset 136, and other peripherals 138. It should be appreciated that the WTRU 102 may include any subset of the foregoing elements while remaining consistent with an embodiment.
The processor 118 may be a general purpose processor, a special purpose processor, a conventional processor, a Digital Signal Processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) circuit, any other type of Integrated Circuit (IC), a state machine, and the like. The processor 118 may perform signal coding, data processing, power control, input/output processing, and/or any other function that enables the WTRU 102 to operate in a wireless environment. The processor 118 may be coupled to the transceiver 120, and the transceiver 120 may be coupled to the transmit/receive element 122. Although the processor 118 and the transceiver 120 are depicted in fig. 17B as separate components, it should be understood that the processor 118 and the transceiver 120 may be integrated together in an electronic package or chip.
The transmit/receive element 122 may be configured to transmit signals to a base station (e.g., base station 114 a) or receive signals from a base station (e.g., base station 114 a) over the air interface 116. For example, in one embodiment, the transmit/receive element 122 may be an antenna configured to transmit and/or receive RF signals. In another embodiment, the transmit/receive element 122 may be an emitter/detector configured to transmit and/or receive, for example, IR, UV, or visible light signals. In yet another embodiment, the transmit/receive element 122 may be configured to transmit and receive both RF signals and optical signals. It should be appreciated that the transmit/receive element 122 may be configured to transmit and/or receive any combination of wireless signals.
Further, although the transmit/receive element 122 is depicted as a single element in fig. 17B, the WTRU 102 may include any number of transmit/receive elements 122. More specifically, the WTRU 102 may use MIMO technology. Thus, in one embodiment, the WTRU 102 may include two or more transmit/receive elements 122 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interface 116.
The transceiver 120 may be configured to modulate signals to be transmitted by the transmit/receive element 122 and to demodulate signals received by the transmit/receive element 122. As described above, the WTRU 102 may have multi-mode capabilities. Thus, the transceiver 120 may include multiple transceivers for enabling the WTRU 102 to communicate via multiple RATs (such as UTRA and IEEE 802.11).
The processor 118 of the WTRU 102 may be coupled to, and may receive user input data from, the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128 (e.g., a Liquid Crystal Display (LCD) unit or an Organic Light Emitting Diode (OLED) display unit). The processor 118 may also output data to the speaker/microphone 124, the keypad 126, and/or the display/touchpad 128. Further, the processor 118 may access information from, and store data in, any type of suitable memory, such as the non-removable memory 130 and/or the removable memory 132. The non-removable memory 130 may include Random Access Memory (RAM), Read Only Memory (ROM), a hard disk, or any other type of memory storage device. The removable memory 132 may include a Subscriber Identity Module (SIM) card, a memory stick, a Secure Digital (SD) memory card, and the like. In other embodiments, the processor 118 may access data from, and store data in, memory that is not physically located on the WTRU 102, such as on a server or a home computer (not shown).
The processor 118 may receive power from the power source 134 and may be configured to distribute and/or control power to other components in the WTRU 102. The power source 134 may be any device suitable for powering the WTRU 102. For example, the power source 134 may include one or more dry cells (nickel cadmium (NiCd), nickel zinc (NiZn), nickel hydrogen (NiMH), lithium ion (Li-ion), etc.), solar cells, fuel cells, and the like.
The processor 118 may also be coupled to a GPS chipset 136, which GPS chipset 136 may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 102. In addition to, or in lieu of, the information from the GPS chipset 136, the WTRU 102 may receive location information from base stations (e.g., base stations 114a, 114 b) over the air interface 116 and/or determine its location based on the timing of signals received from two or more neighboring base stations. It should be appreciated that the WTRU 102 may obtain location information by any suitable location determination method while remaining consistent with an embodiment.
The processor 118 may also be coupled to other peripherals 138, which may include one or more software and/or hardware modules that provide additional features, functionality, and/or wired or wireless connectivity. For example, the peripherals 138 may include an accelerometer, an electronic compass (e-compass), a satellite transceiver, a digital camera (for photographs or video), a Universal Serial Bus (USB) port, a vibration device, a television transceiver, a hands-free headset, a Bluetooth module, a Frequency Modulation (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.
Fig. 17C is a system diagram of an embodiment of the communication system 100 including a RAN 104a and a core network 106a, the RAN 104a and the core network 106a including example embodiments of the RAN 104 and the core network 106, respectively. As described above, the RAN 104 (e.g., the RAN 104a) may communicate with the WTRUs 102a, 102b, 102c over the air interface 116 using UTRA radio technology. The RAN 104a may also communicate with the core network 106a. As shown in fig. 17C, the RAN 104a may include Node Bs 140a, 140b, 140c, wherein the Node Bs 140a, 140b, 140c may each include one or more transceivers that communicate with the WTRUs 102a, 102b, 102c over the air interface 116. Each of the Node Bs 140a, 140b, 140c may be associated with a particular cell (not shown) within range of the RAN 104a. The RAN 104a may also include RNCs 142a, 142b. It should be appreciated that the RAN 104a may include any number of Node Bs and RNCs while still remaining consistent with an embodiment.
As shown in fig. 17C, the Node Bs 140a, 140b may communicate with the RNC 142a. In addition, the Node B 140c may communicate with the RNC 142b. The Node Bs 140a, 140b, 140c may communicate with the corresponding RNCs 142a, 142b over Iub interfaces. The RNCs 142a, 142b may communicate with each other over an Iur interface. Each of the RNCs 142a, 142b may be configured to control the corresponding Node Bs 140a, 140b, 140c connected thereto. Furthermore, each of the RNCs 142a, 142b may be configured to implement or support other functions (such as outer loop power control, load control, admission control, packet scheduling, handover control, macro diversity, security functions, data encryption, etc.), respectively.
The core network 106a shown in fig. 17C may include a Media Gateway (MGW) 144, a Mobile Switching Center (MSC) 146, a Serving GPRS Support Node (SGSN) 148, and/or a Gateway GPRS Support Node (GGSN) 150. Although each of the above elements is described as being part of the core network 106a, it should be understood that any of these elements may be owned and/or operated by an entity other than the core network operator.
The RNC 142a in the RAN 104a may be connected to the MSC 146 in the core network 106a through an IuCS interface. The MSC 146 may be connected to the MGW 144. The MSC 146 and the MGW 144 may provide the WTRUs 102a, 102b,102c with access to a circuit switched network (e.g., the PSTN 108) to facilitate communications between the WTRUs 102a, 102b,102c and conventional landline communication devices.
The RNC 142a in the RAN 104a may also be connected to the SGSN 148 in the core network 106a through an IuPS interface. SGSN 148 may be coupled to GGSN 150. The SGSN 148 and GGSN 150 may provide the WTRUs 102a, 102b,102c with access to a packet-switched network (e.g., the internet 110) to facilitate communications between the WTRUs 102a, 102b,102c and IP-enabled devices.
As described above, the core network 106a may also be connected to other networks 112, wherein the other networks 112 may include other wired or wireless networks owned and/or operated by other service providers.
Fig. 17D is a system diagram of an embodiment of communication system 100 including RAN 104b and core network 106b, the RAN 104b and core network 106b including example embodiments of RAN 104 and core network 106, respectively. As described above, RAN 104 (such as RAN 104 b) may communicate with WTRUs 102a, 102b, and 102c over air interface 116 using an E-UTRA radio technology. RAN 104b may also communicate with core network 106 b.
The RAN 104b may include eNode Bs 170a, 170b, 170c, though it should be understood that the RAN 104b may include any number of eNode Bs while still remaining consistent with an embodiment. The eNode Bs 170a, 170b, 170c may each contain one or more transceivers that communicate with the WTRUs 102a, 102b, 102c over the air interface 116. In one embodiment, the eNode Bs 170a, 170b, 170c may use MIMO technology. Thus, for example, the eNode B 170a may use multiple antennas to transmit wireless signals to the WTRU 102a and to receive wireless signals from the WTRU 102a.
Each of the eNode Bs 170a, 170b, 170c may be associated with a particular cell (not shown) and may be configured to handle user scheduling, radio resource management decisions, handover decisions, and the like in the uplink and/or downlink. As shown in fig. 17D, the eNode Bs 170a, 170b, 170c may communicate with each other over an X2 interface.
The core network 106b shown in fig. 17D may include a Mobility Management Entity (MME) 172, a serving gateway 174, and a Packet Data Network (PDN) gateway 176. Although each of the foregoing elements is described as part of the core network 106b, it should be understood that any of these elements may be owned and/or operated by an entity other than the core network operator.
The MME 172 may be connected to each of the eNode Bs 170a, 170b, 170c in the RAN 104b through an S1 interface and may serve as a control node. For example, the MME 172 may be responsible for authenticating users of the WTRUs 102a, 102b, 102c, bearer activation/deactivation, selecting a particular serving gateway during initial attach of the WTRUs 102a, 102b, 102c, and the like. The MME 172 may also provide control plane functionality for handovers between the RAN 104b and RANs (not shown) that employ other radio technologies, such as GSM or WCDMA.
The serving gateway 174 may be connected to each of the eNode Bs 170a, 170b, 170c in the RAN 104b through the S1 interface. The serving gateway 174 may generally route and forward user data packets to the WTRUs 102a, 102b, 102c or route and forward user data packets from the WTRUs 102a, 102b, 102c. The serving gateway 174 may also perform other functions, such as anchoring user planes during inter-eNode B handovers, triggering paging when downlink data is available for the WTRUs 102a, 102b, 102c, managing and storing contexts of the WTRUs 102a, 102b, 102c, and the like.
The serving gateway 174 may also be connected to a PDN gateway 176, which gateway 176 may provide the WTRUs 102a, 102b, 102c with access to a packet-switched network (e.g., the internet 110) to facilitate communication between the WTRUs 102a, 102b, 102c and IP-enabled devices.
The core network 106b may facilitate communications with other networks. For example, the core network 106b may provide the WTRUs 102a, 102b, 102c with access to a circuit-switched network (e.g., the PSTN 108) to facilitate communications between the WTRUs 102a, 102b, 102c and legacy landline communication devices. For example, the core network 106b may include, or may communicate with: an IP gateway (e.g., an IP Multimedia Subsystem (IMS) server) that interfaces between the core network 106b and the PSTN 108. In addition, the core network 106b may provide access to the network 112 by the WTRUs 102a, 102b, 102c, which network 112 may include other wired or wireless networks owned and/or operated by other service providers.
Fig. 17E is a system diagram of an embodiment of communication system 100 including RAN 104c and core network 106c, with RAN 104c and core network 106c including example embodiments of RAN 104 and core network 106, respectively. RAN 104, e.g., RAN 104c, may be an Access Service Network (ASN) that communicates with WTRUs 102a, 102b, 102c over an air interface 116 using IEEE 802.16 radio technology. As described herein, the communication links between the WTRUs 102a, 102b, 102c, the RAN 104c, and the different functional entities in the core network 106c may be defined as reference points.
As shown in fig. 17E, RAN 104c may include base stations 180a, 180b, 180c and ASN gateway 182, it being understood that RAN 104c may include any number of base stations and ASN gateways while still remaining consistent with an embodiment. Each of the base stations 180a, 180b, 180c may be associated with a particular cell (not shown) in the RAN 104c, respectively, and may include one or more transceivers that communicate with the WTRUs 102a, 102b, 102c over the air interface 116, respectively. In one embodiment, the base stations 180a, 180b, 180c may use MIMO technology. Thus, for example, the base station 180a may use multiple antennas to transmit wireless signals to the WTRU 102a and receive wireless information from the WTRU 102 a. The base stations 180a, 180b, 180c may also provide mobility management functions (such as handoff triggers, tunnel establishment, radio resource management, traffic classification, quality of service (QoS) policy enforcement, etc.). The ASN gateway 182 may act as a traffic aggregation point and may be responsible for paging, caching of user profiles, and routing to the core network 106c, among others.
The air interface 116 between the WTRUs 102a, 102b, 102c and the RAN 104c may be defined as an R1 reference point that implements the IEEE 802.16 specification. In addition, each of the WTRUs 102a, 102b, 102c may establish a logical interface (not shown) with the core network 106 c. The logical interface between the WTRUs 102a, 102b, 102c and the core network 106c may be defined as an R2 reference point, which may be used for authentication, authorization, IP host configuration management, and/or mobility management.
The communication link between each of the base stations 180a, 180b, 180c may be defined as an R8 reference point, the R8 reference point including protocols for facilitating data transmission and handover of WTRUs between the base stations. The communication link between the base stations 180a, 180b, 180c and the ASN gateway 182 may be defined as an R6 reference point. The R6 reference point may include protocols for facilitating mobility management based on mobility events associated with each WTRU 102a, 102b, 102 c.
For example, as shown in fig. 17E, RAN 104c may be connected to core network 106c. The communication link between RAN 104c and core network 106c may be defined as an R3 reference point, with the R3 reference point including protocols for facilitating data transmission and mobility management capabilities. The core network 106c may include a mobile IP home agent (MIP-HA) 184, an authentication, authorization, accounting (AAA) server 186, and a gateway 188. Although each of the above elements are described as being part of the core network 106c, it should be understood that any of these elements may be owned and/or operated by an entity other than the core network operator.
The MIP-HA 184 may be responsible for IP address management and may enable the WTRUs 102a, 102b, 102c to roam between different ASNs and/or different core networks. The MIP-HA 184 may provide the WTRUs 102a, 102b, 102c with access to a packet-switched network (e.g., the internet 110) to facilitate communications between the WTRUs 102a, 102b, 102c and IP-enabled devices. AAA server 186 may be responsible for user authentication and support for user services. Gateway 188 may facilitate interworking with other networks. For example, the gateway 188 may provide the WTRUs 102a, 102b, 102c with access to a circuit-switched network (e.g., the PSTN 108) to facilitate communications between the WTRUs 102a, 102b, 102c and conventional landline communication devices. In addition, the gateway 188 may provide the WTRUs 102a, 102b, 102c with access to the network 112, which network 112 may include other wired or wireless networks owned and/or operated by other service providers.
Although not shown in fig. 17E, it should be understood that RAN 104c may be connected to other ASNs and core network 106c may be connected to other core networks. The communication link between the RAN 104c and the other ASNs may be defined as an R4 reference point, which may include a protocol for coordinating mobility of the WTRUs 102a, 102b, 102c between the RAN 104c and the other ASNs. The communication link between the core network 106c and the other core networks may be defined as an R5 reference point, which may include protocols for facilitating interworking between the local core network and the visited core network.
Although the features and elements of the present invention are described above in particular combinations, one of ordinary skill in the art will appreciate that each feature or element can be used alone or in combination with any other feature or element. Furthermore, the methods described above may be implemented in a computer program, software, and/or firmware executed by a computer or processor, where the computer program, software, or firmware is embodied in a computer-readable storage medium for execution by a computer or processor. Examples of computer-readable storage media include, but are not limited to, electronic signals (transmitted over wired and/or wireless connections) and computer-readable storage media. Examples of computer readable storage media include, but are not limited to, read Only Memory (ROM), random Access Memory (RAM), registers, buffer memory, semiconductor memory devices, magnetic media (e.g., an internal hard disk or a removable disk), magneto-optical media, and optical media such as CD-ROM disks and Digital Versatile Disks (DVDs). A processor associated with the software may be used to implement a radio frequency transceiver for use in a WTRU, terminal, base station, RNC, or any host computer. Features and/or elements described herein in accordance with one or more example embodiments may be used in combination with features and/or elements described herein in accordance with one or more other example embodiments.
Claims (24)
1. A method of decoding a video signal, the method comprising:
receiving a video signal;
performing bit depth alignment on a luma component of an image from the video signal and a chroma component of the image such that the luma component and the chroma component have the same aligned bit depth;
performing color mapping on the bit-depth aligned luma and chroma components of the image; and
the image is decoded based on the color mapped luma component and chroma components.
2. The method of claim 1, wherein performing bit depth alignment further comprises:
the first bit depth of the luma component is aligned to the second bit depth of the chroma component.
3. The method of claim 1, wherein performing bit depth alignment further comprises:
the first bit depth of the chroma component is aligned to the second bit depth of the luma component.
4. The method of claim 1, wherein performing bit depth alignment further comprises:
determining a maximum color component bit depth based on the first bit depth of the luma component and the second bit depth of the chroma component; and
The first bit depth of the luma component and the second bit depth of the chroma component are aligned to the maximum color component bit depth.
5. The method of claim 4, wherein the maximum color component bit depth is the greater of: the first bit depth of the luma component and the second bit depth of the chroma component.
6. The method of claim 1, wherein performing bit depth alignment further comprises:
determining a minimum color component bit depth based on the first bit depth of the luma component and the second bit depth of the chroma component; and
the first bit depth of the luma component and the second bit depth of the chroma component are aligned to the minimum color component bit depth.
7. The method of claim 6, wherein the minimum color component bit depth is the smaller of: the first bit depth of the luma component and the second bit depth of the chroma component.
8. The method of claim 1, wherein the video signal comprises a Base Layer (BL) and an Enhancement Layer (EL), the image is a BL image from the BL, and performing color mapping on the bit-depth aligned luma and chroma components of the BL image comprises:
A color map is applied to the aligned luma and chroma components to generate an inter-layer reference picture that is used to predict at least one EL picture from the EL of the video signal.
9. The method of claim 1, wherein performing bit depth alignment further comprises:
determining a luma delta bit depth based on a first bit depth of the luma component and a second bit depth of the chroma component; and
shifting a luma sample value of the luma component by the determined luma delta bit depth.
10. The method of claim 1, wherein performing bit depth alignment further comprises:
determining a bit depth difference between a first bit depth of the luma component and a second bit depth of the chroma component; and
a sample value of the one of the luma component and the chroma component of the image having the lower bit depth is upscaled to match the bit depth of the other of the luma component and the chroma component having the higher bit depth.
11. A video decoding apparatus, the apparatus comprising:
a processor configured to:
receiving a video signal;
performing bit depth alignment on a luma component of an image from the video signal and a chroma component of the image such that the luma component and the chroma component have the same aligned bit depth;
performing color mapping on the bit-depth aligned luma and chroma components of the image; and
the image is decoded based on the color mapped luma component and chroma components.
12. The video decoding device of claim 11, wherein the processor is configured to perform bit depth alignment by:
the first bit depth of the luma component is aligned to the second bit depth of the chroma component.
13. The video decoding device of claim 11, wherein the processor is configured to perform bit depth alignment by:
the first bit depth of the chroma component is aligned to the second bit depth of the luma component.
14. The video decoding device of claim 11, wherein the processor is configured to perform bit depth alignment by:
determining a maximum color component bit depth based on the first bit depth of the luma component and the second bit depth of the chroma component; and
The first bit depth of the luma component and the second bit depth of the chroma component are aligned to the maximum color component bit depth.
15. The video decoding apparatus of claim 14, wherein the maximum color component bit depth is the greater of: the first bit depth of the luma component and the second bit depth of the chroma component.
16. The video decoding device of claim 11, wherein the processor is configured to perform bit depth alignment by:
determining a minimum color component bit depth based on the first bit depth of the luma component and the second bit depth of the chroma component; and
the first bit depth of the luma component and the second bit depth of the chroma component are aligned to the minimum color component bit depth.
17. The video decoding apparatus of claim 16, wherein the minimum color component bit depth is the smaller of: the first bit depth of the luma component and the second bit depth of the chroma component.
18. The video decoding apparatus of claim 11, wherein the video signal comprises a Base Layer (BL) and an Enhancement Layer (EL), the image is a BL image from the BL, and the processor is configured to perform color mapping on the bit-depth aligned luma component and chroma component of the BL image by:
A color map is applied to the aligned luma and chroma components to generate an inter-layer reference picture that is used to predict at least one EL picture from the EL of the video signal.
19. The video decoding device of claim 11, wherein the processor is configured to perform bit depth alignment by:
determining a luma delta bit depth based on a first bit depth of the luma component and a second bit depth of the chroma component; and
shifting a luma sample value of the luma component by the determined luma delta bit depth.
20. The video decoding device of claim 11, wherein the processor is configured to perform bit depth alignment by:
determining a bit depth difference between a first bit depth of the luma component and a second bit depth of the chroma component; and
a sample value of the one of the luma component and the chroma component of the image having the lower bit depth is upscaled to match the bit depth of the other of the luma component and the chroma component having the higher bit depth.
21. A video encoding apparatus, comprising:
A processor configured to:
receiving an image from a video signal;
performing bit depth alignment on a luma component of the image and a chroma component of the image such that the luma component and the chroma component have the same aligned bit depth;
performing color mapping on the bit-depth aligned luma and chroma components of the image; and
the image is encoded based on the color mapped luma component and chroma components.
22. The video encoding device of claim 21, wherein the processor is further configured to:
determining a maximum color component bit depth based on the first bit depth of the luma component and the second bit depth of the chroma component; and
the first bit depth of the luma component and the second bit depth of the chroma component are aligned with the maximum color component bit depth.
23. The video encoding device of claim 22, wherein the processor is further configured to:
determining a minimum color component bit depth based on the first bit depth of the luma component and the second bit depth of the chroma component; and
The first bit depth of the luma component and the second bit depth of the chroma component are aligned to the minimum color component bit depth.
24. A video encoding method, comprising:
receiving an image from a video signal;
performing bit depth alignment on a luma component of the image and a chroma component of the image such that the luma component and the chroma component have the same aligned bit depth;
performing color mapping on the bit-depth aligned luma and chroma components of the image; and
the image is encoded based on the color mapped luma component and chroma components.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910865427.0A CN110572662B (en) | 2013-10-07 | 2014-10-07 | Combined scalability processing for multi-layer video coding |
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361887782P | 2013-10-07 | 2013-10-07 | |
US61/887,782 | 2013-10-07 | ||
US201462045495P | 2014-09-03 | 2014-09-03 | |
US62/045,495 | 2014-09-03 | ||
CN201480055145.XA CN105874793B (en) | 2013-10-07 | 2014-10-07 | The method and apparatus that combination gradability for multi-layer video coding is handled |
CN201910865427.0A CN110572662B (en) | 2013-10-07 | 2014-10-07 | Combined scalability processing for multi-layer video coding |
PCT/US2014/059560 WO2015054307A2 (en) | 2013-10-07 | 2014-10-07 | Combined scalability processing for multi-layer video coding |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201480055145.XA Division CN105874793B (en) | 2013-10-07 | 2014-10-07 | The method and apparatus that combination gradability for multi-layer video coding is handled |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110572662A CN110572662A (en) | 2019-12-13 |
CN110572662B true CN110572662B (en) | 2024-03-08 |
Family
ID=51794970
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201480055145.XA Active CN105874793B (en) | 2013-10-07 | 2014-10-07 | The method and apparatus that combination gradability for multi-layer video coding is handled |
CN201910865427.0A Active CN110572662B (en) | 2013-10-07 | 2014-10-07 | Combined scalability processing for multi-layer video coding |
CN202410184144.0A Pending CN118301359A (en) | 2013-10-07 | 2014-10-07 | Combined scalability processing for multi-layer video coding |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201480055145.XA Active CN105874793B (en) | 2013-10-07 | 2014-10-07 | The method and apparatus that combination gradability for multi-layer video coding is handled |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410184144.0A Pending CN118301359A (en) | 2013-10-07 | 2014-10-07 | Combined scalability processing for multi-layer video coding |
Country Status (9)
Country | Link |
---|---|
US (2) | US10063886B2 (en) |
EP (2) | EP3055997B1 (en) |
JP (5) | JP2017501599A (en) |
KR (2) | KR102027027B1 (en) |
CN (3) | CN105874793B (en) |
HK (1) | HK1222965A1 (en) |
RU (1) | RU2658812C2 (en) |
TW (1) | TWI652937B (en) |
WO (1) | WO2015054307A2 (en) |
Families Citing this family (52)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102027027B1 (en) * | 2013-10-07 | 2019-09-30 | 브이아이디 스케일, 인크. | Combined scalability processing for multi-layer video coding |
US9948916B2 (en) | 2013-10-14 | 2018-04-17 | Qualcomm Incorporated | Three-dimensional lookup table based color gamut scalability in multi-layer video coding |
US10531105B2 (en) | 2013-12-17 | 2020-01-07 | Qualcomm Incorporated | Signaling partition information for 3D lookup table for color gamut scalability in multi-layer video coding |
US9756337B2 (en) | 2013-12-17 | 2017-09-05 | Qualcomm Incorporated | Signaling color values for 3D lookup table for color gamut scalability in multi-layer video coding |
US10368097B2 (en) * | 2014-01-07 | 2019-07-30 | Nokia Technologies Oy | Apparatus, a method and a computer program product for coding and decoding chroma components of texture pictures for sample prediction of depth pictures |
CA2943216A1 (en) | 2014-03-19 | 2015-09-24 | Arris Enterprises Llc | Scalable coding of video sequences using tone mapping and different color gamuts |
US10448029B2 (en) * | 2014-04-17 | 2019-10-15 | Qualcomm Incorporated | Signaling bit depth values for 3D color prediction for color gamut scalability |
BR112017004886A2 (en) * | 2014-09-12 | 2017-12-05 | Vid Scale Inc | video coding device and video coding method |
US10135577B2 (en) * | 2015-03-02 | 2018-11-20 | Lg Electronics Inc. | Scalable service in a wireless communication system |
JP6565330B2 (en) * | 2015-05-26 | 2019-08-28 | 日本電気株式会社 | Video processing apparatus, video processing method, and video processing program |
EP3107300A1 (en) * | 2015-06-15 | 2016-12-21 | Thomson Licensing | Method and device for encoding both a high-dynamic range frame and an imposed low-dynamic range frame |
WO2017011636A1 (en) | 2015-07-16 | 2017-01-19 | Dolby Laboratories Licensing Corporation | Signal reshaping and coding for hdr and wide color gamut signals |
US10349067B2 (en) * | 2016-02-17 | 2019-07-09 | Qualcomm Incorporated | Handling of end of bitstream NAL units in L-HEVC file format and improvements to HEVC and L-HEVC tile tracks |
US10440401B2 (en) | 2016-04-07 | 2019-10-08 | Dolby Laboratories Licensing Corporation | Backward-compatible HDR codecs with temporal scalability |
US11102495B2 (en) * | 2016-05-17 | 2021-08-24 | Qualcomm Incorporated | Methods and systems for generating and processing content color volume messages for video |
GB2553556B (en) * | 2016-09-08 | 2022-06-29 | V Nova Int Ltd | Data processing apparatuses, methods, computer programs and computer-readable media |
JP6915483B2 (en) * | 2017-09-27 | 2021-08-04 | 富士フイルムビジネスイノベーション株式会社 | Image processing equipment, image processing systems and programs |
US11265579B2 (en) * | 2018-08-01 | 2022-03-01 | Comcast Cable Communications, Llc | Systems, methods, and apparatuses for video processing |
TWI814890B (en) * | 2018-08-17 | 2023-09-11 | Beijing Bytedance Network Technology Co., Ltd. | Simplified cross component prediction |
WO2020053806A1 (en) | 2018-09-12 | 2020-03-19 | Beijing Bytedance Network Technology Co., Ltd. | Size dependent down-sampling in cross component linear model |
EP3861728A4 (en) | 2018-11-06 | 2022-04-06 | Beijing Bytedance Network Technology Co., Ltd. | Complexity reduction in parameter derivation for intra prediction |
CN113170122B (en) | 2018-12-01 | 2023-06-27 | 北京字节跳动网络技术有限公司 | Parameter derivation for intra prediction |
CA3121671C (en) | 2018-12-07 | 2024-06-18 | Beijing Bytedance Network Technology Co., Ltd. | Context-based intra prediction |
CN111491168A (en) * | 2019-01-29 | 2020-08-04 | Huawei Software Technologies Co., Ltd. | Video coding and decoding method, decoder, encoder and related equipment |
CN113366833A (en) | 2019-02-01 | 2021-09-07 | 北京字节跳动网络技术有限公司 | Limitation of loop shaping |
CN113366841B (en) | 2019-02-01 | 2024-09-20 | 北京字节跳动网络技术有限公司 | Luminance-dependent chroma residual scaling configured for video coding |
AU2020226565C1 (en) * | 2019-02-22 | 2024-01-11 | Beijing Bytedance Network Technology Co., Ltd. | Neighbouring sample selection for intra prediction |
CA3128769C (en) | 2019-02-24 | 2023-01-24 | Beijing Bytedance Network Technology Co., Ltd. | Parameter derivation for intra prediction |
WO2020175893A1 (en) * | 2019-02-28 | 2020-09-03 | LG Electronics Inc. | APS signaling-based video or image coding |
WO2020177702A1 (en) | 2019-03-04 | 2020-09-10 | Beijing Bytedance Network Technology Co., Ltd. | Two-level signaling of filtering information in video processing |
WO2020185879A1 (en) | 2019-03-11 | 2020-09-17 | Dolby Laboratories Licensing Corporation | Video coding using reference picture resampling supporting region of interest |
PL4064706T3 (en) | 2019-03-11 | 2023-08-21 | Dolby Laboratories Licensing Corporation | Signalling of information related to shutter angle |
WO2020184928A1 (en) * | 2019-03-11 | 2020-09-17 | LG Electronics Inc. | Luma mapping- and chroma scaling-based video or image coding |
CN113574889B (en) | 2019-03-14 | 2024-01-12 | 北京字节跳动网络技术有限公司 | Signaling and syntax of loop shaping information |
KR20210139272A (en) | 2019-03-23 | 2021-11-22 | Beijing Bytedance Network Technology Co., Ltd. | Restrictions on Adaptive Loop Filtering Parameter Sets |
CN117880494A (en) | 2019-03-24 | 2024-04-12 | 北京字节跳动网络技术有限公司 | Conditions for parameter derivation for intra prediction |
KR20210141683A (en) | 2019-03-25 | 2021-11-23 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Prediction methods of image elements, encoders, decoders and computer storage media |
US20200322656A1 (en) * | 2019-04-02 | 2020-10-08 | Nbcuniversal Media, Llc | Systems and methods for fast channel changing |
CA3208670A1 (en) * | 2019-06-25 | 2020-12-30 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Image encoding method, image decoding method, encoder, decoder and storage medium |
CN114827612B (en) * | 2019-06-25 | 2023-04-11 | Peking University | Video image encoding and decoding method, apparatus and medium |
EP4011079A1 (en) | 2019-08-06 | 2022-06-15 | Dolby Laboratories Licensing Corporation | Canvas size scalable video coding |
US11272187B2 (en) * | 2019-08-13 | 2022-03-08 | Tencent America LLC | Method and apparatus for video coding |
WO2021030788A1 (en) | 2019-08-15 | 2021-02-18 | Bytedance Inc. | Entropy coding for palette escape symbol |
JP7494289B2 (en) | 2019-08-15 | 2024-06-03 | バイトダンス インコーポレイテッド | Palette modes with different partition structures |
US11172237B2 (en) * | 2019-09-11 | 2021-11-09 | Dolby Laboratories Licensing Corporation | Inter-layer dynamic range scalability for HDR video |
CN114424545B (en) * | 2019-09-19 | 2024-07-16 | 字节跳动有限公司 | Quantization parameter derivation for palette modes |
US11877011B2 (en) | 2020-09-17 | 2024-01-16 | Lemon Inc. | Picture dimension indication in decoder configuration record |
US12058310B2 (en) | 2021-02-26 | 2024-08-06 | Lemon Inc. | Methods of coding images/videos with alpha channels |
US20220279185A1 (en) * | 2021-02-26 | 2022-09-01 | Lemon Inc. | Methods of coding images/videos with alpha channels |
CN115080547A (en) | 2021-03-15 | 2022-09-20 | EMC IP Holding Company LLC | Method, electronic device and computer program product for data processing |
US12041248B2 (en) * | 2021-08-02 | 2024-07-16 | Mediatek Singapore Pte. Ltd. | Color component processing in down-sample video coding |
KR102674361B1 (en) | 2022-01-03 | 2024-06-13 | LG Electronics Inc. | Display device |
Family Cites Families (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7035460B2 (en) * | 2002-05-31 | 2006-04-25 | Eastman Kodak Company | Method for constructing an extended color gamut digital image from a limited color gamut digital image |
US20050259729A1 (en) * | 2004-05-21 | 2005-11-24 | Shijun Sun | Video coding with quality scalability |
CN101888559B (en) | 2006-11-09 | 2013-02-13 | LG Electronics Inc. | Method and apparatus for decoding/encoding a video signal |
CN102084653B (en) * | 2007-06-29 | 2013-05-08 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Scalable video coding supporting pixel value refinement scalability |
EP2051527A1 (en) * | 2007-10-15 | 2009-04-22 | Thomson Licensing | Enhancement layer residual prediction for bit depth scalability using hierarchical LUTs |
KR20100086478A (en) * | 2007-10-19 | 2010-07-30 | 톰슨 라이센싱 | Combined spatial and bit-depth scalability |
US8953673B2 (en) * | 2008-02-29 | 2015-02-10 | Microsoft Corporation | Scalable video coding and decoding with sample bit depth and chroma high-pass residual layers |
HUE024173T2 (en) * | 2008-04-16 | 2016-05-30 | Ge Video Compression Llc | Bit-depth scalability |
WO2012122423A1 (en) * | 2011-03-10 | 2012-09-13 | Dolby Laboratories Licensing Corporation | Pre-processing for bitdepth and color format scalable video coding |
US9571856B2 (en) * | 2008-08-25 | 2017-02-14 | Microsoft Technology Licensing, Llc | Conversion operations in scalable video encoding and decoding |
EP2406959B1 (en) | 2009-03-13 | 2015-01-14 | Dolby Laboratories Licensing Corporation | Layered compression of high dynamic range, visual dynamic range, and wide color gamut video |
CN103283227A (en) * | 2010-10-27 | 2013-09-04 | Vid Scale, Inc. | Systems and methods for adaptive video coding |
KR101756442B1 (en) * | 2010-11-29 | 2017-07-11 | SK Telecom Co., Ltd. | Video Encoding/Decoding Method and Apparatus for Minimizing Redundancy of Intra Prediction Mode |
CA2807545C (en) * | 2011-02-22 | 2018-04-10 | Panasonic Corporation | Image coding method, image decoding method, image coding apparatus, image decoding apparatus, and image coding and decoding apparatus |
UA109312C2 (en) * | 2011-03-04 | 2015-08-10 | | Pulse-code modulation with quantization for coding video information |
US20130321675A1 (en) * | 2012-05-31 | 2013-12-05 | Apple Inc. | Raw scaler with chromatic aberration correction |
WO2014084564A1 (en) * | 2012-11-27 | 2014-06-05 | LG Electronics Inc. | Signal transceiving apparatus and signal transceiving method |
US9800884B2 (en) * | 2013-03-15 | 2017-10-24 | Qualcomm Incorporated | Device and method for scalable coding of video information |
US10230950B2 (en) * | 2013-05-30 | 2019-03-12 | Intel Corporation | Bit-rate control for video coding using object-of-interest data |
US10075735B2 (en) * | 2013-07-14 | 2018-09-11 | Sharp Kabushiki Kaisha | Video parameter set signaling |
KR102027027B1 (en) * | 2013-10-07 | 2019-09-30 | Vid Scale, Inc. | Combined scalability processing for multi-layer video coding |
CA2943216A1 (en) * | 2014-03-19 | 2015-09-24 | Arris Enterprises Llc | Scalable coding of video sequences using tone mapping and different color gamuts |
BR112017004886A2 (en) * | 2014-09-12 | 2017-12-05 | Vid Scale Inc | video coding device and video coding method |
2014
- 2014-10-07 KR KR1020167011361A patent/KR102027027B1/en active IP Right Grant
- 2014-10-07 CN CN201480055145.XA patent/CN105874793B/en active Active
- 2014-10-07 EP EP14789458.8A patent/EP3055997B1/en active Active
- 2014-10-07 EP EP19208604.9A patent/EP3629582A1/en active Pending
- 2014-10-07 WO PCT/US2014/059560 patent/WO2015054307A2/en active Application Filing
- 2014-10-07 CN CN201910865427.0A patent/CN110572662B/en active Active
- 2014-10-07 JP JP2016521268A patent/JP2017501599A/en active Pending
- 2014-10-07 KR KR1020197027671A patent/KR102127603B1/en active IP Right Grant
- 2014-10-07 TW TW103134905A patent/TWI652937B/en active
- 2014-10-07 RU RU2016117907A patent/RU2658812C2/en active
- 2014-10-07 US US14/508,865 patent/US10063886B2/en active Active
- 2014-10-07 CN CN202410184144.0A patent/CN118301359A/en active Pending
2016
- 2016-09-22 HK HK16111112.2A patent/HK1222965A1/en unknown
2018
- 2018-02-26 JP JP2018032293A patent/JP6564086B2/en active Active
- 2018-07-26 US US16/045,999 patent/US10986370B2/en active Active
2019
- 2019-07-25 JP JP2019136698A patent/JP7012048B2/en active Active
2022
- 2022-01-17 JP JP2022005155A patent/JP7448572B2/en active Active
2023
- 2023-11-27 JP JP2023199799A patent/JP2024020538A/en not_active Withdrawn
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004064629A (en) * | 2002-07-31 | 2004-02-26 | Canon Inc | Image processing apparatus and its method |
WO2006047448A2 (en) * | 2004-10-21 | 2006-05-04 | Sony Electronics, Inc. | Supporting fidelity range extensions in advanced video codec file format |
WO2008049445A1 (en) * | 2006-10-25 | 2008-05-02 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Quality scalable coding |
CN101589625A (en) * | 2006-10-25 | 2009-11-25 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Quality scalable coding |
CN101796841A (en) * | 2007-06-27 | 2010-08-04 | Thomson Licensing | Method and apparatus for encoding and/or decoding video data using enhancement layer residual prediction for bit depth scalability |
CN101577828A (en) * | 2008-04-16 | 2009-11-11 | Intel Corp. | Tone mapping for bit-depth scalable video codec |
Non-Patent Citations (1)
Title |
---|
AhG14: On bit-depth scalability support; Elena Alshina et al.; Joint Collaborative Team on Video Coding (JCT-VC); 2013-07-25; pp. 1-3 *
Also Published As
Publication number | Publication date |
---|---|
JP7448572B2 (en) | 2024-03-12 |
EP3055997A2 (en) | 2016-08-17 |
CN105874793A (en) | 2016-08-17 |
US10986370B2 (en) | 2021-04-20 |
US20190014348A1 (en) | 2019-01-10 |
JP6564086B2 (en) | 2019-08-21 |
US10063886B2 (en) | 2018-08-28 |
HK1222965A1 (en) | 2017-07-14 |
JP2022064913A (en) | 2022-04-26 |
JP2018129814A (en) | 2018-08-16 |
CN105874793B (en) | 2019-10-11 |
JP2024020538A (en) | 2024-02-14 |
TWI652937B (en) | 2019-03-01 |
JP2020010341A (en) | 2020-01-16 |
JP7012048B2 (en) | 2022-01-27 |
CN118301359A (en) | 2024-07-05 |
RU2016117907A (en) | 2017-11-13 |
KR20190110644A (en) | 2019-09-30 |
US20150098510A1 (en) | 2015-04-09 |
KR102127603B1 (en) | 2020-06-26 |
EP3055997B1 (en) | 2019-12-04 |
RU2658812C2 (en) | 2018-06-22 |
CN110572662A (en) | 2019-12-13 |
EP3629582A1 (en) | 2020-04-01 |
WO2015054307A2 (en) | 2015-04-16 |
KR102027027B1 (en) | 2019-09-30 |
KR20160064204A (en) | 2016-06-07 |
WO2015054307A3 (en) | 2015-08-20 |
JP2017501599A (en) | 2017-01-12 |
TW201536032A (en) | 2015-09-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7448572B2 (en) | Combined scalability processing for multi-layer video coding | |
CN108141507B (en) | Color correction using look-up tables | |
US20190356925A1 (en) | Systems and methods for providing 3d look-up table coding for color gamut scalability | |
KR101786414B1 (en) | Color gamut scalable video coding device and method for the phase alignment of luma and chroma using interpolation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 2024-08-21
Address after: Delaware, USA
Patentee after: InterDigital VC Holdings, Inc.
Country or region after: U.S.A.
Address before: Delaware, USA
Patentee before: VID SCALE, Inc.
Country or region before: U.S.A. |