WO2011005624A1 - Encoding and decoding architectures for format compatible 3d video delivery - Google Patents

Encoding and decoding architectures for format compatible 3d video delivery

Info

Publication number
WO2011005624A1
WO2011005624A1 PCT/US2010/040545 US2010040545W WO2011005624A1 WO 2011005624 A1 WO2011005624 A1 WO 2011005624A1 US 2010040545 W US2010040545 W US 2010040545W WO 2011005624 A1 WO2011005624 A1 WO 2011005624A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
enhancement layer
base layer
encoder
layer video
Prior art date
Application number
PCT/US2010/040545
Other languages
French (fr)
Inventor
Alexandros Tourapis
Peshala V. Pahalawatta
Athanasios Leontaris
Kevin J. Stec
Walter J. Husak
Original Assignee
Dolby Laboratories Licensing Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby Laboratories Licensing Corporation
Priority to US13/376,707 (patent US9774882B2)
Publication of WO2011005624A1
Priority to US15/675,058 (patent US10038916B2)
Priority to US16/011,557 (patent US10798412B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/187 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scalable video layer
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/33 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/573 Motion compensation with multiple frame prediction using two or more reference frames in a given prediction direction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression

Definitions

  • This disclosure relates to image processing and video compression. More particularly, embodiments of the present disclosure relate to encoding and decoding systems and methods for 3D video delivery, such as 2D compatible and frame compatible 3D video delivery.
  • the ideal solutions should be those that can be implemented with minimal or no alteration to existing playback devices such as set-top boxes, DVD, and Blu-ray disc players, as well as existing 3D capable displays, such as digital light processing (DLP) displays by Samsung and Mitsubishi, some Plasma displays, and polarized based and frame sequential LCD displays.
  • DLP digital light processing
  • One possible method for the delivery of 3D content is the consideration of creating, coding, and delivering 3D video content by multiplexing the two views into a single frame configuration using a variety of filtering, sampling, and arrangement methods.
  • Sampling could, for example, be horizontal, vertical, or quincunx, while an offset, e.g. a sampling offset, could also be considered between the two views allowing better exploitation of redundancies that may exist between the two views.
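As an illustration of multiplexing two views into a single frame, the following minimal sketch shows quincunx sampling with a checkerboard arrangement, where the right view is sampled with a one-sample offset relative to the left view. The function name `checkerboard_interleave`, the list-of-rows frame representation, and the absence of any filtering prior to sampling are assumptions made for this example and are not taken from the disclosure.

```python
from typing import List

Frame = List[List[int]]  # rows of samples (hypothetical representation)


def checkerboard_interleave(left: Frame, right: Frame) -> Frame:
    """Multiplex two equally sized views into one frame.

    Positions where (row + column) is even take the left view sample;
    the remaining positions take the right view sample, i.e. the right
    view is quincunx sampled with an offset of one sample.
    """
    packed = []
    for y, (l_row, r_row) in enumerate(zip(left, right)):
        packed.append([
            l_row[x] if (x + y) % 2 == 0 else r_row[x]
            for x in range(len(l_row))
        ])
    return packed


left_view = [[1, 1, 1, 1], [1, 1, 1, 1]]
right_view = [[2, 2, 2, 2], [2, 2, 2, 2]]
packed_frame = checkerboard_interleave(left_view, right_view)
```

The packed frame has the same dimensions as each input view, which is what allows it to pass through an existing single-view codec and delivery chain.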
  • MVC Multiview Video Coding
  • An example of the MVC based 3D video delivery architecture is shown in FIGURE 9. Redundancies are exploited using only translational motion compensation based methods, while the system relies on "intelligent" reference buffer management for performing prediction compared to the original design of MPEG-4 AVC, i.e. on the order in which references from the base or enhancement layers are added in the enhancement layer buffer and considered for prediction.
  • FIGURE 1 shows a checkerboard interleaved arrangement for the delivery of stereoscopic material.
  • FIGURE 2 shows a horizontal sampling/column interleaved arrangement for the delivery of stereoscopic material.
  • FIGURE 3 shows a vertical sampling/row interleaved arrangement for the delivery of stereoscopic material.
  • FIGURE 4 shows a horizontal sampling/side by side arrangement for the delivery of stereoscopic material.
  • FIGURE 5 shows a vertical sampling/over-under arrangement for the delivery of stereoscopic material.
  • FIGURE 6 shows a quincunx sampling/side by side arrangement for the delivery of stereoscopic material.
  • FIGURE 7 shows a frame compatible 3D video delivery architecture.
  • FIGURE 8 shows a simulcast 3D video delivery architecture.
  • FIGURE 9 shows an MVC based 3D video delivery architecture.
  • FIGURE 10 shows an example of 3D capture.
  • FIGURE 11 shows pre-processing stages located between a base layer and an enhancement layer, and between a first enhancement layer and a second enhancement layer of a frame compatible 3D architecture.
  • FIGURE 12 shows pre-processing stages located between the base layer and the enhancement layer of the video encoder, and the base layer and the enhancement layer of the video decoder of a 2D compatible 3D architecture, in accordance with the present disclosure.
  • FIGURE 13 shows a more detailed diagram of the pre-processing stage of FIGURE
  • FIGURE 14 shows a more detailed diagram of the pre-processing stage of FIGURE
  • FIGURE 15 shows a more detailed diagram of the pre-processing stage of FIGURE
  • FIGURE 16 shows an example of pre-processing technique for a horizontal sampling and side by side packing arrangement.
  • FIGURE 17 and FIGURE 18 show examples of pre-processing stages according to the present disclosure.
  • Embodiments of the present disclosure relate to image processing and video compression.
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video encoding system comprising: a base layer, comprising a base layer video encoder; at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video encoder; and at least one pre-processing module, i) to pre-process the output of the base layer video encoder and input the pre-processed output into the enhancement layer video encoder and/or ii) to pre-process the output of one enhancement layer video encoder of one enhancement layer and input the pre-processed output into another enhancement layer video encoder of another enhancement layer.
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video decoding system comprising: a base layer, comprising a base layer video decoder; at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video decoder; and at least one pre-processing module, i) to pre-process the output of the base layer video decoder and input the pre-processed output into the enhancement layer video decoder and/or ii) to pre-process the output of one enhancement layer video decoder of one enhancement layer and input the pre-processed output into another enhancement layer video decoder of another enhancement layer.
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video system comprising: a base layer, comprising a base layer video encoder and a base layer video decoder; at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video encoder and an enhancement layer video decoder; at least one encoder pre-processing module, i) to pre-process the output of the base layer video encoder and input the pre-processed output into the enhancement layer video encoder and/or ii) to pre-process the output of one enhancement layer video encoder of one enhancement layer and input the pre-processed output into another enhancement layer video encoder of another enhancement layer; and at least one decoder pre-processing module, i) to pre-process the output of the base layer video decoder and input the pre-processed output into the enhancement layer video decoder and/or ii) to pre-process the output of one enhancement layer video decoder of one enhancement layer and input
  • a frame compatible three-dimensional (3D) video encoding system comprising: a base layer, comprising a base layer video encoder and a base layer multiplexer, the base layer multiplexer receiving an input indicative of a plurality of views and forming a multiplexed output connected with the base layer video encoder; and at least one enhancement layer, associated with the base layer, the at least one enhancement layer comprising an enhancement layer video encoder and an enhancement layer multiplexer, the enhancement layer multiplexer receiving an input indicative of the plurality of views and forming a multiplexed output connected with the enhancement layer video encoder, wherein the base layer video encoder is directly connected with the enhancement layer video encoder.
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video encoding method comprising: base layer video encoding a plurality of images or frames; enhancement layer video encoding the plurality of images or frames; pre-processing base layer video encoded images or frames; and
  • adopting the pre-processed base layer video encoded images or frames for the enhancement layer video encoding.
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video decoding method comprising: base layer video decoding a plurality of images or frames; pre-processing base layer video decoded images or frames; adopting the pre-processed base layer video decoded images or frames for enhancement layer video decoding; and enhancement layer video decoding the plurality of images or frames;
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video method comprising: base layer video encoding a plurality of images or frames; enhancement layer video encoding the plurality of images or frames; pre-processing base layer video encoded images or frames; adopting the pre-processed base layer video encoded images or frames for the enhancement layer video encoding; base layer video decoding a plurality of images or frames; pre-processing base layer video decoded images or frames; adopting the pre-processed base layer video decoded images or frames for enhancement layer video decoding; and enhancement layer video decoding the plurality of images or frames;
  • an encoder for encoding a video signal according to the method of the fifth embodiment is provided.
  • an apparatus for encoding a video signal according to the method of the fifth embodiment is provided.
  • a system for encoding a video signal according to the method of the fifth embodiment is provided.
  • a decoder for decoding a video signal according to the method of the sixth embodiment is provided.
  • an apparatus for decoding a video signal according to the method of the sixth embodiment is provided.
  • a system for decoding a video signal according to the method of the sixth embodiment is provided.
  • a computer-readable medium containing a set of instructions that causes a computer to perform the method or methods recited above is provided.
  • Embodiments of the present disclosure will show techniques that enable frame compatible 3D video systems to achieve full resolution 3D delivery, without any of the drawbacks of the 2D compatible 3D delivery methods (e.g., MVC). Furthermore, decoder complexity, in terms of hardware cost, memory, and operations required will also be considered. Furthermore, improvements over the existing 2D compatible 3D delivery methods are also shown.
  • the content may also have differences in focus or illumination because of the camera characteristics, which again make prediction less accurate.
  • the MVC specification only accounts for 2D compatible 3D video coding systems and has no provision for frame compatible arrangements such as those shown in FIGURE 7 of the present application.
  • a pre-processing stage is introduced between the base and enhancement layer encoders and decoders in accordance with an embodiment of the present disclosure to process or refine the first encoded view for prediction before encoding the second view.
  • data from the base layer are pre-processed and altered using some additional parameters that have been signaled in the bitstream.
  • the pictures thus generated can be available for prediction, if desired.
  • Such process can be used globally or regionally and is not limited to a block-based process.
  • FIGURE 12 where a 3D pre-processor (1210) is shown on the encoding side between base layer video encoder (1220) and enhancement layer video encoder (1230), and a 3D-pre-processor (1240) is shown on the decoding side between base layer video decoder (1250) and enhancement layer video decoder (1260).
  • the role of this pre-processing stage is to process and adjust the characteristics of the base layer video to better match those of the enhancement layer video. This can be done, for example, by considering pre-processing mechanisms such as filtering (e.g., a sharpening or a low pass filter) or even other more sophisticated methods such as global/region motion compensation/texture mapping.
  • a set of parameters could be derived for the entire video, scene, or image. However, multiple parameters could also be used within an image. Parameters, in this scenario, could be assigned for different regions of an image. The number, shape, and size of the regions could be fixed or could also be adaptive. Adaptive regions could be derived given pre-analysis of the content (e.g., a segmentation method), and/or could be user-specified, in which case the characteristics of the regions (e.g., shape and size) can be signaled within the bitstream.
  • a system may signal that each frame is split in NxM rectangular regions, or could signal explicitly the shape of each region using a map description. Determination and signaling of such information could follow the description presented in U.S. Provisional Application 61/170,995 filed on April 20, 2009, for "Directed Interpolation and Data Post-Processing", which is incorporated herein by reference in its entirety.
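The rectangular split mentioned above can be sketched as follows. This is only an illustrative sketch: the function name `split_into_regions`, the interpretation of N as the number of region rows and M as the number of region columns, and the per-region `filter_id` parameter are assumptions made for the example, not the signaling syntax of the disclosure or of the referenced provisional application.

```python
from typing import Dict, List, Tuple

# (top, left, height, width) of a rectangular region, in samples
Region = Tuple[int, int, int, int]


def split_into_regions(height: int, width: int,
                       n_rows: int, m_cols: int) -> List[Region]:
    """Split a frame into n_rows x m_cols rectangles that tile it exactly."""
    regions = []
    for i in range(n_rows):
        top = i * height // n_rows
        bottom = (i + 1) * height // n_rows
        for j in range(m_cols):
            left = j * width // m_cols
            right = (j + 1) * width // m_cols
            regions.append((top, left, bottom - top, right - left))
    return regions


regions = split_into_regions(1080, 1920, n_rows=2, m_cols=3)

# One hypothetical parameter set per region; a real system would derive
# and signal these per region within the bitstream.
region_params: Dict[Region, Dict[str, int]] = {
    region: {"filter_id": 0} for region in regions
}
```

Signaling only N and M is cheaper than an explicit region map, at the cost of not following object boundaries in the content.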
  • an encoder can evaluate all or a subset of possible pre-processing methods that could be used by the system, by comparing the output of each method with the predicted signal (enhancement layer).
  • the method resulting in the best performance, e.g. best in terms of complexity, quality, resulting coding efficiency, among others, or a combination of all of these parameters using methods such as Lagrangian optimization, can be selected at the encoder.
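One way such a selection could look is sketched below using a Lagrangian cost J = D + λR, where D is the distortion between the pre-processed base layer and the enhancement layer signal and R is the signaling cost of the method. The candidate methods (`identity`, a 3-tap `low_pass`), the sum of squared errors as distortion measure, and the bit costs are all assumptions chosen for illustration; the disclosure does not prescribe them.

```python
from math import inf
from typing import Callable, Dict, List, Sequence

Method = Callable[[Sequence[float]], List[float]]


def identity(samples: Sequence[float]) -> List[float]:
    """Pass the base layer samples through unchanged."""
    return list(samples)


def low_pass(samples: Sequence[float]) -> List[float]:
    """3-tap [1, 2, 1] / 4 low pass filter with edge replication."""
    n = len(samples)
    return [
        (samples[max(i - 1, 0)] + 2 * samples[i] + samples[min(i + 1, n - 1)]) / 4
        for i in range(n)
    ]


def sse(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared errors between two equally long signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_method(base: Sequence[float], target: Sequence[float],
                  methods: Dict[str, Method], rate_bits: Dict[str, float],
                  lam: float) -> str:
    """Return the name of the method minimizing J = D + lam * R."""
    best_name, best_cost = "", inf
    for name, method in methods.items():
        cost = sse(method(base), target) + lam * rate_bits[name]
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name


base_row = [0, 8, 0, 8, 0, 8]
enhancement_row = low_pass(base_row)  # toy target that low_pass matches exactly
methods = {"identity": identity, "low_pass": low_pass}
rate_bits = {"identity": 1, "low_pass": 2}  # hypothetical signaling costs

choice_low_lambda = select_method(base_row, enhancement_row, methods, rate_bits, lam=1.0)
choice_high_lambda = select_method(base_row, enhancement_row, methods, rate_bits, lam=100.0)
```

With a small λ the better matching filter wins; with a large λ the cheaper-to-signal method wins, which is the trade-off the Lagrangian formulation is meant to expose.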
  • FIGURE 17 shows a pre-processing system with N filter consideration/signaling. Multiple filters can be selected for a single region by selecting the M best filters that provide the best desired performance, which can be defined as quality, cost, enhancement layer coding performance, etc.
  • FIGURE 18 shows pre-processing with multiparameter consideration.
  • the new processed images are also added in the reference buffer of the enhancement layer, as shown in FIGURE 13, where the output of pre-processor (1310) is connected to the reference buffer (1320) of the enhancement layer.
  • the reference buffer (1320) may already include other references such as previously encoded and decoded pictures from the enhancement layer or even pictures generated from processing previously encoded and decoded base layer pictures.
  • one or more new processed reference pictures can be generated and added in the enhancement layer buffer (1330) as additional reference pictures. All of these references could be considered for prediction using motion compensation methods and mechanisms such as the reference index concept/signaling that is available within codecs such as MPEG-4 AVC (Advanced Video Coding). For example, assuming that a base layer picture has been processed to generate two different reference picture instances, ref_b0 and ref_b1, and also ref_e, which corresponds to the previously encoded enhancement layer picture, is available as a reference, one can assign reference indices (ref_idx) 0, 1, and 2 to these pictures respectively.
  • ref_idx 0 is signaled in the bitstream.
  • ref_idx 1 or 2 are signaled for MBs selecting ref_b1 and ref_e respectively.
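The reference index mechanism can be illustrated with a minimal sketch in which each block picks, among ref_b0, ref_b1 and ref_e, the reference with the lowest matching cost, and the position of that reference in the list is the ref_idx to be signaled. The sum of absolute differences (SAD) criterion, the use of co-located blocks without a motion search, and the flat list representation of a block are assumptions made to keep the example short.

```python
from typing import List, Sequence

Block = Sequence[int]


def sad(a: Block, b: Block) -> int:
    """Sum of absolute differences between two blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def choose_ref_idx(current: Block, reference_list: List[Block]) -> int:
    """Return the index of the best matching reference block.

    The list order defines ref_idx: position 0 is ref_b0, position 1 is
    ref_b1, position 2 is ref_e in this example.
    """
    costs = [sad(current, ref) for ref in reference_list]
    return costs.index(min(costs))


ref_b0 = [0, 0, 0, 0]   # first processed base layer reference
ref_b1 = [5, 5, 4, 5]   # second processed base layer reference
ref_e = [9, 9, 9, 9]    # previously coded enhancement layer picture
reference_list = [ref_b0, ref_b1, ref_e]

current_block = [5, 5, 5, 5]
ref_idx = choose_ref_idx(current_block, reference_list)
```

Because lower indices are typically cheaper to signal, the order of `reference_list` matters in practice, which is why reference ordering is treated as a design decision in its own right.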
  • Memory management operations take into consideration which references are removed or added in the reference buffer for prediction, while reference ordering takes into consideration the order of how references are considered for motion compensation, which itself affects the number of bits that will be used when signaling that reference.
  • Default memory management and reference ordering operations could be considered based on the system's expectation of which is likely to be the least useful (for memory management) or most correlated reference (for reference ordering). As an example, a first-in first-out (FIFO) approach could be considered for memory management, while also both base and enhancement layer pictures corresponding to the same time instance are removed at the same time.
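A FIFO policy of this kind can be sketched as below, with base and enhancement layer pictures of the same time instance stored as one entry so that they are evicted together. The class name `ReferenceBuffer`, the capacity expressed in time instances, and the string placeholders for pictures are assumptions for illustration only.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class ReferenceBuffer:
    """FIFO reference buffer holding paired base/enhancement pictures."""

    def __init__(self, max_time_instances: int) -> None:
        self.max_time_instances = max_time_instances
        # Each entry groups the pictures of one time instance.
        self.entries: Deque[Tuple[int, Dict[str, str]]] = deque()

    def add(self, time: int, base_pic: str, enh_pic: str) -> None:
        """Insert the pictures of a new time instance, evicting the oldest."""
        self.entries.append((time, {"base": base_pic, "enh": enh_pic}))
        while len(self.entries) > self.max_time_instances:
            # Base and enhancement pictures of the oldest time instance
            # leave the buffer in the same operation.
            self.entries.popleft()

    def times(self) -> List[int]:
        """Time instances currently available for prediction."""
        return [time for time, _ in self.entries]


buffer = ReferenceBuffer(max_time_instances=2)
for t in range(3):
    buffer.add(t, base_pic=f"base_{t}", enh_pic=f"enh_{t}")
```

Adaptive memory management would replace the unconditional `popleft` with an explicit, signaled choice of which entry to drop.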
  • base layer information from previous pictures need not be retained after it was used, therefore saving memory.
  • Alternative or additional memory management techniques can include adaptive memory management control.
  • the default order can also be affected, for example, by the order these references are specified in the bitstream.
  • subsampling could include horizontal, vertical, or quincunx among others, and multiplexing could include side by side, over-under, line or column interleaved, and checkerboard among others.
  • Reference can be made, for example, to the embodiment of FIGURE 11, where a base layer 3D multiplexer (1110) connected with a base layer video encoder (1120) and an enhancement layer 3D multiplexer (1130) connected with an enhancement layer video encoder (1140) are shown.
  • subsampling can be performed using basic pixel decimation (1111), (1112), (1131), (1132), i.e. without necessarily the consideration of any filtering, where the base layer corresponds to one set of pixels in the image and the enhancement layer corresponds to another set without filtering.
  • the left view samples in the base layer correspond to the even samples in the original left view frame
  • the right view samples in the base layer correspond to the odd samples in the original right view frame
  • the left and right view samples in the enhancement layer correspond to the remaining, i.e. odd and even samples, in their original frames respectively.
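The sample assignment described above can be sketched as follows, assuming horizontal decimation and a side by side arrangement in both layers (the text itself only fixes which samples go to which layer). The function name `split_layers` and the list-of-rows frame representation are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of samples


def split_layers(left: Frame, right: Frame) -> Tuple[Frame, Frame]:
    """Decimate two views into base and enhancement layer frames.

    Base layer: even columns of the left view, odd columns of the right view.
    Enhancement layer: the remaining samples, i.e. odd columns of the left
    view and even columns of the right view. No filtering is applied.
    """
    base = [l_row[0::2] + r_row[1::2] for l_row, r_row in zip(left, right)]
    enhancement = [l_row[1::2] + r_row[0::2] for l_row, r_row in zip(left, right)]
    return base, enhancement


left_view = [[0, 1, 2, 3]]
right_view = [[10, 11, 12, 13]]
base_layer, enhancement_layer = split_layers(left_view, right_view)
```

Every original sample ends up in exactly one of the two layers, so decoding both layers without loss would allow the full resolution views to be recovered.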
  • a preprocessing stage (1150) is introduced that processes the base layer information, before utilizing this information as a potential prediction for the enhancement layer.
  • a further embodiment of the present disclosure provides for a frame compatible 3D architecture similar to the one shown in FIGURE 11, with frame compatible signals but without 3D pre-processors (or with 3D-processors operating in a "pass-through mode") and with the presence of data multiplexing at the input (1110), (1130) and data remultiplexing at the output (1170), (1180).
  • for the characteristics of the sampling and arrangement methods used by the content, can be utilized to process the base layer.
  • processing could include separable or non-separable interpolation filters, edge adaptive interpolation techniques, filters based on wavelet, bandlet, or ridgelet methods, and inpainting among others.
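As one simple instance of such processing, the sketch below predicts the odd samples of a row (carried by the enhancement layer) from the even samples (carried by the base layer) using a 2-tap averaging interpolation filter. The filter choice, the rounding, and the edge replication are assumptions for illustration; the text lists considerably more sophisticated alternatives.

```python
from typing import List, Sequence


def predict_odd_from_even(even: Sequence[int]) -> List[int]:
    """Predict the sample to the right of each even sample.

    Each odd position lies between two even positions, so it is estimated
    as their rounded average. The last even sample is replicated at the
    right edge, where no further neighbor exists.
    """
    n = len(even)
    prediction = []
    for i in range(n):
        right_neighbor = even[min(i + 1, n - 1)]
        prediction.append((even[i] + right_neighbor + 1) // 2)
    return prediction


# Original row 0, 2, 4, 6, 8, 10: the base layer holds the even positions.
base_even_samples = [0, 4, 8]
predicted_odd_samples = predict_odd_from_even(base_even_samples)
```

In this toy row the true odd samples are 2, 6 and 10, so the prediction is exact except at the replicated edge, where a residual would still have to be coded.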
  • the enhancement layer encoder (1140) can consider the processed images from the base layer for prediction, but only if desired. For example, the user may select to predict the entire enhancement layer from a previously decoded enhancement layer picture, or if multiple pre-processed base layer pictures are available, the encoder can select only one of them (e.g. in view of a rate distortion criterion) or any combination of two reference pictures, assuming the presence of a bi-predictive coding. The same can also occur at the region level.
  • the entire or part of the top half of a base layer processed image was used to predict the current enhancement layer picture, but instead for the bottom part the encoder selected to use again a previous enhancement layer picture.
  • Additional, block e.g. for MPEG-2 or MPEG-4 AVC like codecs
  • other local motion compensation methods e.g. a motion compensated method utilized by a future codec
  • a different prediction e.g. temporal
  • prediction samples could also be combined together in a bi-predictive or even a multi-hypothesis motion compensated framework again at the block or region level, resulting in further improved prediction.
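Combining prediction samples can be sketched as a weighted average of several hypotheses, for instance one from a processed base layer reference and one from a previous enhancement layer picture. The function name `combine_predictions` and the floating point weighting are assumptions for this example; actual codecs use integer arithmetic with defined rounding.

```python
from typing import List, Optional, Sequence


def combine_predictions(hypotheses: Sequence[Sequence[float]],
                        weights: Optional[Sequence[float]] = None) -> List[float]:
    """Weighted average of equally sized prediction blocks.

    With two hypotheses and equal weights this corresponds to plain
    bi-prediction; more hypotheses give a multi-hypothesis prediction.
    """
    if weights is None:
        weights = [1.0] * len(hypotheses)
    total = sum(weights)
    return [
        sum(w * h[i] for w, h in zip(weights, hypotheses)) / total
        for i in range(len(hypotheses[0]))
    ]


from_base_layer = [10, 20]
from_enhancement_layer = [14, 24]
bi_prediction = combine_predictions([from_base_layer, from_enhancement_layer])
weighted_prediction = combine_predictions(
    [from_base_layer, from_enhancement_layer], weights=[3, 1]
)
```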
  • interpolated e.g., using the MPEG-4 AVC/H.264 interpolation filters
  • reference re-ordering and weighted prediction when used for prediction.
  • FIGURE 14 and FIGURE 15 show in detail the pre-processing module (1410) on the encoder side and the pre-processing module (1510) on the decoder side.
  • the design and selection of the pre-processing method can be part of an encoder and can be based on user input or other criteria such as cost, complexity, and coding/quality performance.
  • An example of such process is shown in FIGURE 16. After a prediction reference from the base layer is created, as stated above, this reference (1610) and all other references (1620) (e.g. previously coded pictures from the enhancement layer or past or differently processed prediction references from the base layer) are considered within a motion compensated architecture to refine the prediction (1630) of the enhancement layer at a lower level, e.g. block or region.
  • While the process according to the present disclosure is similar to how MPEG-4 AVC/H.264 and its MVC and SVC extensions also perform prediction, better references are used herein in view of the presence of a pre-processing stage.
  • the residual for the enhancement layer can be computed, transformed, quantized and encoded, with any additional overhead such as motion information, using methods similar to those used in the MPEG-4 AVC codec.
  • This residual can be dequantized, inversed transformed and then added back to the prediction to reconstruct the enhancement layer signal.
  • optional in-loop filtering (as shown in FIGURE 14 and FIGURE 15), such as deblocking, that applies only on the enhancement layer could be used to reduce artifacts, such as blockiness.
  • the enhancement layer in this scenario is in a similar packing arrangement as that of the base layer.
  • the base and enhancement layer data would need to be re-multiplexed together so as to generate two separate, full resolution, left and right images. Re-multiplexing could be done by using simple interleaving of the base and enhancement layers. As shown in FIGURE 11, re-multiplexing of the base and enhancement layer data occurs through multiplexers (1170) and (1180).
  • the base layer information is also filtered prior to combining it, e.g. replacing half of the samples or averaging half of the samples, with the samples from the enhancement layer. Reference can be made to filters G_L^B2 and G_L^E of FIGURE 11, where G_L^B2 can be averaged with G_L^E in such alternative embodiment.
  • generation of the base layer video could occur through the use of filtering (e.g., to reduce aliasing) prior to decimation.
  • a single layer approach may not be able to generate a true full resolution image.
  • Such single layer can, however, help reconstruction of some of the lost frequencies or accurate reconstruction of half of the resolution of the original signal.
  • an additional, 3rd layer can be introduced that tries to correct for any errors introduced by the prior filtering in the base layer.
  • the enhancement layer may be of higher quality and could provide additional information that could be also utilized for the prediction of this layer. Therefore, in the present disclosure, apart from prediction references coming from the base layer and previously reconstructed references from this third (second enhancement) layer, references generated using pre-processing of the second (first enhancement) layer, or references using pre-processing while considering both base and first enhancement layer could be used. Therefore, embodiments of the present disclosure can be provided where there is more than one pre-processing stage on the encoding side and more than one pre-processing stage on the decoding side.
  • the prediction reference could be generated using edge adaptive interpolation of the enhancement layer while the edge adaptive decisions could be based also on the edges and samples of the base layer. Weighted averaging of an interpolated enhancement layer and the original or filtered base layer could generate a different prediction. Other mechanisms to generate a prediction picture for this enhancement layer could also be used, as discussed above, also including methods employing wavelet interpolation, inpainting and others.
  • delivery of 3D content is extended using frame compatible methods, i.e. Checkerboard video delivery, side by side, over-under, etc, to support full resolution through the introduction of additional enhancement layers.
  • additional enhancement layers can provide apart from additional resolution
  • MVC Multiview Video Coding
  • This advantage can result in improvements in coding efficiency, while having similar, and in some cases reduced complexity compared to these technologies.
  • some embodiments can be based on the MPEG-4 AVC/H.264 video coding standard, the techniques presented in the present disclosure are codec agnostic and other video coding standards and codecs such as MPEG-2 and VC-1 can be applied to them.
  • stereoscopic (3D) format video encoders and decoders that can be applied, by way of example and not of limitation, to Blu-ray video discs, broadcast and download/on demand systems, satellite systems, IPTV systems, and mobile devices that support 3D video.
  • the methods and systems described in the present disclosure may be implemented in hardware, software, firmware or combination thereof.
  • Features described as blocks, modules or components may be implemented together (e.g., in a logic device such as an integrated logic device) or separately (e.g., as separate connected logic devices).
  • the software portion of the methods of the present disclosure may comprise a computer-readable medium which comprises instructions that, when executed, perform, at least in part, the described methods.
  • the computer-readable medium may comprise, for example, a random access memory
  • RAM random access memory
  • ROM read-only memory
  • DSP digital signal processor
  • ASIC application specific integrated circuit
  • FPGA field programmable logic array
  • An embodiment of the present invention may relate to one or more of the example embodiments, enumerated below.
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video encoding system comprising:
  • a base layer comprising a base layer video encoder
  • the enhancement layer comprising an enhancement layer video encoder
  • At least one pre-processing module i) to pre-process the output of the base layer video encoder and input the pre-processed output into the enhancement layer video encoder and/or ii) to pre-process the output of one enhancement layer video encoder of one enhancement layer and input the pre-processed output into another enhancement layer video encoder of another enhancement layer.
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video decoding system comprising:
  • a base layer comprising a base layer video decoder
  • the enhancement layer comprising an enhancement layer video decoder
  • At least one pre-processing module i) to pre-process the output of the base layer video decoder and input the pre-processed output into the enhancement layer video decoder and/or ii) to pre-process the output of one enhancement layer video decoder of one enhancement layer and input the pre-processed output into another enhancement layer video decoder of another enhancement layer.
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video system comprising:
  • a base layer comprising a base layer video encoder and a base layer video decoder; at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video encoder and an enhancement layer video decoder; at least one encoder pre-processing module, i) to pre-process the output of the base layer video encoder and input the pre-processed output into the enhancement layer video encoder and/or ii) to pre-process the output of one enhancement layer video encoder of one enhancement layer and input the pre-processed output into another enhancement layer video encoder of another enhancement layer; and
  • At least one decoder pre-processing module i) to pre-process the output of the base layer video decoder and input the pre-processed output into the enhancement layer video decoder and/or ii) to pre-process the output of one enhancement layer video decoder of one enhancement layer and input the pre-processed output into another enhancement layer video decoder of another enhancement layer.
  • pre-processing module comprises at least one of filtering, global motion compensation, region motion compensation, and texture mapping techniques.
  • preprocessing module comprises parameters to be used with the at least one of the filtering, global motion compensation, region motion compensation, and texture mapping techniques.
  • the base layer video encoder comprises a base layer encoding reference buffer
  • the enhancement layer video encoder comprises an enhancement layer encoding reference buffer
  • a pre-processing module of the at least one pre-processing module is connected between the base layer encoding reference buffer and the enhancement layer encoding reference buffer.
  • the base layer video decoder comprises a base layer decoding reference buffer
  • the enhancement layer video decoder comprises an enhancement layer decoding reference buffer
  • the pre-processing module is connected between the base layer decoding reference buffer and the enhancement layer decoding reference buffer.
  • 30. The 3D video system according to Enumerated Example Embodiment 29, wherein memory management techniques or reference ordering operations are applied to the enhancement layer encoding reference buffer and the enhancement layer decoding reference buffer.
  • Example Embodiment 32 The 3D video system according to Enumerated Example Embodiment 30, wherein the memory management techniques comprise at least one of a first frame-in-first frame-out (FIFO) approach, discarding previous frames approach, and adaptive memory management control.
  • FIFO first frame-in-first frame-out
  • a multiplexer to multiplex the two subsampled views.
  • Example Embodiment 36 The 3D video system of Enumerated Example Embodiment 36, wherein subsampler subsamples according to a technique selected from horizontal, vertical, or quincunx subsampling.
  • interpolation technique is selected from separable interpolation filters, edge adaptive interpolation techniques, wavelet-based filters, bandlet-based filters, ridgelet-based filters, and inpainting.
  • the techniques comprise signaling of parameters to allow generation of multiple potential references for prediction.
  • Example Embodiment 48 The 3D video system according to Enumerated Example Embodiment 47, wherein the at least one additional multiplexer are two additional multiplexers, a first additional multiplexer being a left view multiplexer and a second additional multiplexer being a right view multiplexer.
  • the 3D video system according to Enumerated Example Embodiment 47 further comprising a base layer filtering stage and an enhancement layer filtering stage to respectively filter the base layer decoded data and the enhancement layer decoded data before being multiplexed in the at least one additional multiplexer.
  • 52. The 3D video system of Enumerated Example Embodiment 51, wherein the techniques comprise signaling of parameters to allow generation of multiple potential references for prediction.
  • the enhancement layer further comprises an enhancement layer video decoder
  • the additional enhancement layer further comprises an additional enhancement layer video decoder
  • the 3D video system further comprising a further pre-processing module between the enhancement layer video decoder and the additional enhancement layer video decoder.
  • Example Embodiment 56 The 3D video system according to Enumerated Example Embodiment 56, wherein the additional enhancement layer further comprises an additional enhancement layer demultiplexer connected with the additional enhancement layer video decoder.
  • the 3D video system according to any one of Enumerated Example Embodiments 53 to 57, further comprising an enhancement layer video decoder, an additional enhancement layer video decoder, and at least one additional multiplexer to multiplex enhancement layer decoded data with additional enhancement layer decoded data of a same view.
  • Example Embodiment 58 The 3D video system according to Enumerated Example Embodiment 58, wherein the at least one additional multiplexer are two additional multiplexers, a first additional multiplexer being a left view multiplexer and a second additional multiplexer being a right view multiplexer.
  • a frame compatible three-dimensional (3D) video system comprising:
  • a base layer comprising a base layer video encoder and a base layer multiplexer, the base layer multiplexer receiving an input indicative of a plurality of views and forming a multiplexed output connected with the base layer video encoder;
  • the at least one enhancement layer associated with the base layer, the at least one enhancement layer comprising an enhancement layer video encoder and an enhancement layer multiplexer, the enhancement layer multiplexer receiving an input indicative of the plurality of views and forming a multiplexed output connected with the enhancement layer video encoder,
  • base layer video encoder is directly connected with the enhancement layer video encoder.
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video encoding method comprising:
  • Enhancement layer video encoding the plurality of images or frames
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video decoding method comprising:
  • adopting the pre-processed base layer video decoded images or frames for enhancement layer video decoding
  • enhancement layer video decoding the plurality of images or frames.
  • a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video method comprising:
  • Enhancement layer video encoding the plurality of images or frames
  • enhancement layer video decoding the plurality of images or frames.
  • 71. A computer-readable medium containing a set of instructions that causes a computer to perform the method recited in one or more of Enumerated Example Embodiments 62-64.

Abstract

Encoding and decoding architectures for 3D video delivery are described, such as 2D compatible 3D video delivery and frame compatible 3D video delivery. The architectures include pre-processing stages to pre-process the output of a base layer video encoder and/or decoder and input the pre-processed output into an enhancement layer video encoder and/or decoder of one or more enhancement layers. Multiplexing methods of how to combine the base and enhancement layer videos are also described.

Description

ENCODING AND DECODING ARCHITECTURES FOR FORMAT COMPATIBLE
3D VIDEO DELIVERY
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] The present disclosure claims priority to United States Patent Provisional Application No. 61/223,027, filed 4 July 2009, hereby incorporated by reference in its entirety.
FIELD
[0002] This disclosure relates to image processing and video compression. More particularly, embodiments of the present disclosure relate to encoding and decoding systems and methods for 3D video delivery, such as 2D compatible and frame compatible 3D video delivery.
BACKGROUND
[0003] The provision of a stereoscopic (3D) user experience is a long held goal of both content providers and display manufacturers. Recently, the urgency of providing a stereoscopic experience to home users has increased with the production and tentative release of multiple 3D movies or other 3D material (e.g., concerts or documentaries).
[0004] To ensure rapid adoption among consumers, the ideal solutions should be those that can be implemented with minimal or no alteration to existing playback devices such as set-top boxes, DVD, and Blu-ray disc players, as well as existing 3D capable displays, such as digital light processing (DLP) displays by Samsung and Mitsubishi, some Plasma displays, and polarized based and frame sequential LCD displays.
[0005] One possible method for the delivery of 3D content that has these properties is the consideration of creating, coding, and delivering 3D video content by multiplexing the two views into a single frame configuration using a variety of filtering, sampling, and arrangement methods. Sampling could, for example, be horizontal, vertical, or quincunx, while an offset, e.g. a sampling offset, could also be considered between the two views allowing better exploitation of redundancies that may exist between the two views.
[0006] Similarly, arrangements could include side by side, over-under, line-interleaved, and checkerboard packing among others, as shown in FIGURES 1-6. Unfortunately, these methods do not provision for the delivery of full resolution stereoscopic material, which can impact quality and experience, and essentially can be an issue for many applications.
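The packing arrangements described above can be illustrated with a minimal sketch. It is not taken from the disclosure: the function names are hypothetical, each view is modeled as a list of rows of samples, and no anti-alias filtering is applied before decimation. The side by side variant keeps even columns of the left view and odd columns of the right view, which is one possible sampling offset between the views.

```python
def pack_side_by_side(left, right):
    """Horizontally decimate each view by 2 and place the halves side by side.

    Assumption: the left view keeps even columns and the right view keeps odd
    columns (a one-sample sampling offset); no pre-filtering is performed.
    """
    return [l_row[0::2] + r_row[1::2] for l_row, r_row in zip(left, right)]


def pack_checkerboard(left, right):
    """Quincunx-sample both views and interleave them as a checkerboard."""
    return [
        [l if (x + y) % 2 == 0 else r
         for x, (l, r) in enumerate(zip(l_row, r_row))]
        for y, (l_row, r_row) in enumerate(zip(left, right))
    ]


# Tiny 2x4 views whose samples are labelled by view, row and column.
left = [["L%d%d" % (y, x) for x in range(4)] for y in range(2)]
right = [["R%d%d" % (y, x) for x in range(4)] for y in range(2)]

sbs = pack_side_by_side(left, right)
cb = pack_checkerboard(left, right)
print(sbs[0])  # ['L00', 'L02', 'R01', 'R03']
print(cb[1])   # ['R10', 'L11', 'R12', 'L13']
```

Either packed frame has the same size as a single view, which is why each view retains only half of its original samples.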
[0007] The desire for full resolution has led to some systems that utilize two separate and independent bitstreams, each one representing a different view, like the simulcast 3D video delivery architecture shown in FIGURE 8. Unfortunately, this method has not been adopted, owing to its complexity, its bandwidth requirements (redundancies between the two views are not exploited), and the fact that it is not backwards compatible with legacy devices and can have considerable implications for the entire delivery system.
[0008] An extension of this method, that tries to exploit some of the redundancies that may exist between the two views, was proposed and adopted as a profile of the Multiview Video Coding (MVC) extension of the MPEG-4 AVC/H.264 video coding standard, i.e. the Stereo High profile, that provisions for the encoding and delivery of stereoscopic material. An example of the MVC based 3D video delivery architecture is shown in FIGURE 9. Redundancies are exploited using only translational motion compensation based methods, while, compared to the original design of MPEG-4 AVC, the system relies on "intelligent" reference buffer management for performing prediction, i.e. the order in which references from the base or enhancement layers are added in the enhancement layer buffer and considered for prediction. Unfortunately, even though coding efficiency was somewhat improved (i.e., 20-30% over simulcast), complexity issues, incompatibility with legacy devices (only 2D support is provided), and the modest performance benefits of such a method still make it a somewhat unattractive solution.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate one or more embodiments of the present disclosure and, together with the description of example embodiments, serve to explain the principles and implementations of the disclosure.
[0010] FIGURE 1 shows a checkerboard interleaved arrangement for the delivery of stereoscopic material.
[0011] FIGURE 2 shows a horizontal sampling/column interleaved arrangement for the delivery of stereoscopic material.
[0012] FIGURE 3 shows a vertical sampling/row interleaved arrangement for the delivery of stereoscopic material.
[0013] FIGURE 4 shows a horizontal sampling/side by side arrangement for the delivery of stereoscopic material.
[0014] FIGURE 5 shows a vertical sampling/over-under arrangement for the delivery of stereoscopic material.
[0015] FIGURE 6 shows a quincunx sampling/side by side arrangement for the delivery of stereoscopic material.
[0016] FIGURE 7 shows a frame compatible 3D video delivery architecture.
[0017] FIGURE 8 shows a simulcast 3D video delivery architecture.
[0018] FIGURE 9 shows an MVC based 3D video delivery architecture.
[0019] FIGURE 10 shows an example of 3D capture.
[0020] FIGURE 11 shows pre-processing stages located between a base layer and an enhancement layer, and between a first enhancement layer and a second enhancement layer of a frame compatible 3D architecture.
[0021] FIGURE 12 shows pre-processing stages located between the base layer and the enhancement layer of the video encoder, and the base layer and the enhancement layer of the video decoder of a 2D compatible 3D architecture, in accordance with the present disclosure.
[0022] FIGURE 13 shows a more detailed diagram of the pre-processing stage of FIGURE 12 on the encoder side.
[0023] FIGURE 14 shows a more detailed diagram of the pre-processing stage of FIGURE 11 on the encoder side.
[0024] FIGURE 15 shows a more detailed diagram of the pre-processing stage of FIGURE 11 on the decoder side.
[0025] FIGURE 16 shows an example of pre-processing technique for a horizontal sampling and side by side packing arrangement.
[0026] FIGURE 17 and FIGURE 18 show examples of pre-processing stages according to the present disclosure.
DESCRIPTION OF EXAMPLE EMBODIMENTS
Overview
[0027] Embodiments of the present disclosure relate to image processing and video compression.
[0028] According to a first embodiment, a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video encoding system is provided, comprising: a base layer, comprising a base layer video encoder; at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video encoder; and at least one pre-processing module, i) to pre-process the output of the base layer video encoder and input the pre-processed output into the enhancement layer video encoder and/or ii) to pre-process the output of one enhancement layer video encoder of one enhancement layer and input the pre-processed output into another enhancement layer video encoder of another enhancement layer.
[0029] According to a second embodiment, a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video decoding system is provided, comprising: a base layer, comprising a base layer video decoder; at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video decoder; and at least one pre-processing module, i) to pre-process the output of the base layer video decoder and input the pre-processed output into the enhancement layer video decoder and/or ii) to pre-process the output of one enhancement layer video decoder of one enhancement layer and input the pre-processed output into another enhancement layer video decoder of another enhancement layer.
[0030] According to a third embodiment, a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video system is provided, comprising: a base layer, comprising a base layer video encoder and a base layer video decoder; at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video encoder and an enhancement layer video decoder; at least one encoder pre-processing module, i) to pre-process the output of the base layer video encoder and input the pre-processed output into the enhancement layer video encoder and/or ii) to pre-process the output of one enhancement layer video encoder of one enhancement layer and input the pre-processed output into another enhancement layer video encoder of another enhancement layer; and at least one decoder pre-processing module, i) to pre-process the output of the base layer video decoder and input the pre-processed output into the enhancement layer video decoder and/or ii) to pre-process the output of one enhancement layer video decoder of one enhancement layer and input the pre-processed output into another enhancement layer video decoder of another enhancement layer.
[0031] According to a fourth embodiment, a frame compatible three-dimensional (3D) video encoding system is provided, comprising: a base layer, comprising a base layer video encoder and a base layer multiplexer, the base layer multiplexer receiving an input indicative of a plurality of views and forming a multiplexed output connected with the base layer video encoder; and at least one enhancement layer, associated with the base layer, the at least one enhancement layer comprising an enhancement layer video encoder and an enhancement layer multiplexer, the enhancement layer multiplexer receiving an input indicative of the plurality of views and forming a multiplexed output connected with the enhancement layer video encoder, wherein the base layer video encoder is directly connected with the enhancement layer video encoder.
[0032] According to a fifth embodiment, a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video encoding method is provided, comprising: base layer video encoding a plurality of images or frames; enhancement layer video encoding the plurality of images or frames; pre-processing base layer video encoded images or frames; and adopting the pre-processed base layer video encoded images or frames for the enhancement layer video encoding.
[0033] According to a sixth embodiment, a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video decoding method is provided, comprising: base layer video decoding a plurality of images or frames; pre-processing base layer video decoded images or frames; adopting the pre-processed base layer video decoded images or frames for enhancement layer video decoding; and enhancement layer video decoding the plurality of images or frames.
[0034] According to a seventh embodiment, a two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video method is provided, comprising: base layer video encoding a plurality of images or frames; enhancement layer video encoding the plurality of images or frames; pre-processing base layer video encoded images or frames; adopting the pre-processed base layer video encoded images or frames for the enhancement layer video encoding; base layer video decoding a plurality of images or frames; pre-processing base layer video decoded images or frames; adopting the pre-processed base layer video decoded images or frames for enhancement layer video decoding; and enhancement layer video decoding the plurality of images or frames.
[0035] According to an eighth embodiment, an encoder for encoding a video signal according to the method of the fifth embodiment is provided.
[0036] According to a ninth embodiment, an apparatus for encoding a video signal according to the method of the fifth embodiment is provided.
[0037] According to a tenth embodiment, a system for encoding a video signal according to the method of the fifth embodiment is provided.
[0038] According to an eleventh embodiment, a decoder for decoding a video signal according to the method of the sixth embodiment is provided.
[0039] According to a twelfth embodiment, an apparatus for decoding a video signal according to the method of the sixth embodiment is provided.
[0040] According to a thirteenth embodiment, a system for decoding a video signal according to the method of the sixth embodiment is provided.
[0041] According to a fourteenth embodiment, a computer-readable medium containing a set of instructions that causes a computer to perform the method or methods recited above is provided.
[0042] Embodiments of the present disclosure will show techniques that enable frame compatible 3D video systems to achieve full resolution 3D delivery, without any of the drawbacks of the 2D compatible 3D delivery methods (e.g., MVC). Furthermore, decoder complexity, in terms of hardware cost, memory, and operations required will also be considered. Furthermore, improvements over the existing 2D compatible 3D delivery methods are also shown.
1. 2D Compatible 3D Delivery
[0043] Applicants have observed that the MVC extension of the MPEG-4 AVC/H.264 standard constrains prediction between the base and enhancement layers (see FIGURE 9) to only utilize translational block based methods, which also include the optional consideration of illumination compensation methods, i.e. weighted prediction.
[0044] This can severely affect coding performance since correlation between the two layers is not fully exploited. In general, especially for the scenario of stereo, i.e. left and right view, coding, the two stereo views are characterized more by an affine/geometric "motion" relationship due to the placement of the two cameras used to capture or generate (e.g., in the scenario of a computer generated 3D video sequence) the 3D content, which cannot be captured well using translational (vertical and horizontal only) motion compensation mechanisms. This is also true for the multiview case, where more than two views for a scene are available. Reference is made to the example shown in FIGURE 10.
[0045] The content may also have differences in focus or illumination because of the camera characteristics, which again make prediction less accurate. Furthermore, the MVC specification only accounts for 2D compatible 3D video coding systems and has no provision for frame compatible arrangements such as those shown in FIGURE 7 of the present application.
[0046] To provide a solution to the first problem, i.e. inaccurate prediction from the base to the enhancement layer, a pre-processing stage is introduced between the base and enhancement layer encoders and decoders in accordance with an embodiment of the present disclosure to process or refine the first encoded view for prediction before encoding the second view. In particular, in accordance with such embodiment, data from the base layer are pre-processed and altered using some additional parameters that have been signaled in the bitstream. The pictures thus generated can be available for prediction, if desired. Such process can be used globally or regionally and is not limited to a block-based process.
[0047] Reference can be made, for example to FIGURE 12, where a 3D pre-processor (1210) is shown on the encoding side between base layer video encoder (1220) and enhancement layer video encoder (1230), and a 3D-pre-processor (1240) is shown on the decoding side between base layer video decoder (1250) and enhancement layer video decoder (1260).
[0048] The role of this pre-processing stage is to process and adjust the characteristics of the base layer video to better match those of the enhancement layer video. This can be done, for example, by considering pre-processing mechanisms such as filtering (e.g., a sharpening or a low pass filter) or even other more sophisticated methods such as global/region motion compensation/texture mapping.
[0049] These methods require the derivation of parameters appropriate for each of them, such as i) the filters, ii) the filter coefficients/length that should be used, and/or iii) the global motion compensation correction parameters that should be applied to the image to generate the new prediction.
[0050] A set of parameters could be derived for the entire video, scene, or image. However, multiple parameters could also be used within an image. Parameters, in this scenario, could be assigned for different regions of an image. The number, shape, and size of the regions could be fixed or could also be adaptive. Adaptive regions could be derived given pre-analysis of the content (e.g., a segmentation method), and/or could be user-specified, in which case the characteristics of the regions (e.g., shape and size) can be signaled within the bitstream.
[0051] As an example, a system may signal that each frame is split in NxM rectangular regions, or could signal explicitly the shape of each region using a map description. Determination and signaling of such information could follow the description presented in U.S. Provisional Application 61/170,995 filed on April 20, 2009, for "Directed Interpolation and Data Post-Processing", which is incorporated herein by reference in its entirety.
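As a minimal sketch of the rectangular region split mentioned above, the following assumes that "NxM" means N columns by M rows and that regions are numbered in raster order; neither convention is stated in the disclosure. The function names and the per-region filter labels are hypothetical and are not actual bitstream syntax.

```python
def split_into_regions(width, height, n_cols, m_rows):
    """Return (x0, y0, x1, y1) rectangles (end-exclusive) in raster order."""
    xs = [width * i // n_cols for i in range(n_cols + 1)]
    ys = [height * j // m_rows for j in range(m_rows + 1)]
    return [(xs[i], ys[j], xs[i + 1], ys[j + 1])
            for j in range(m_rows) for i in range(n_cols)]


def region_index(x, y, regions):
    """Find which region a sample position belongs to."""
    for idx, (x0, y0, x1, y1) in enumerate(regions):
        if x0 <= x < x1 and y0 <= y < y1:
            return idx
    raise ValueError("sample lies outside the frame")


# Hypothetical example: a 1920x1080 frame split into 4x3 regions, each
# carrying its own (illustrative) pre-processing filter label.
regions = split_into_regions(1920, 1080, n_cols=4, m_rows=3)
region_filters = ["lowpass" if i % 2 == 0 else "sharpen"
                  for i in range(len(regions))]

idx = region_index(500, 400, regions)
print(idx, regions[idx], region_filters[idx])
```

With a fixed grid, only the two grid dimensions and one parameter set per region need to be conveyed; an explicit map description would be needed only for arbitrarily shaped regions.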
[0052] According to one embodiment, an encoder can evaluate all or a subset of possible preprocessing methods that could be used by the system, by comparing the output of each method compared to the predicted signal (enhancement layer). The method resulting in best performance, e.g. best in terms of complexity, quality, resulting coding efficiency, among others, or a combination of all of these parameters using methods such as Lagrangian optimization, can be selected at the encoder. Reference can be made, for example to Figures 3 to 5 of the above mentioned US Provisional 61/170,995, incorporated herein by reference in its entirety.
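The encoder-side selection can be sketched as a Lagrangian cost comparison, J = D + lambda * R, where D is the distortion between the pre-processed base layer and the enhancement layer signal and R is the cost of signaling the chosen method. This is only an illustration under stated assumptions: the three candidate filters, their tap values, the signaling costs in bits, and the use of a single row of samples are all hypothetical and do not come from the disclosure.

```python
def filter_row(row, taps):
    """Apply a 3-tap FIR filter to one row, replicating the edge samples."""
    padded = [row[0]] + list(row) + [row[-1]]
    return [sum(t * padded[i + k] for k, t in enumerate(taps))
            for i in range(len(row))]


def sse(a, b):
    """Sum of squared errors, used here as the distortion measure D."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical candidates: name -> (filter taps, signaling cost R in bits).
CANDIDATES = {
    "identity": ((0.0, 1.0, 0.0), 1),
    "lowpass": ((0.25, 0.5, 0.25), 3),
    "sharpen": ((-0.25, 1.5, -0.25), 3),
}


def select_preprocessing(base_row, enh_row, lam):
    """Return the (name, cost) pair minimizing J = D + lam * R."""
    best_name, best_cost = None, float("inf")
    for name, (taps, bits) in CANDIDATES.items():
        cost = sse(filter_row(base_row, taps), enh_row) + lam * bits
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


base = [10, 10, 50, 10, 10]
enh = filter_row(base, (0.25, 0.5, 0.25))  # enhancement row resembles a blurred base row
choice, cost = select_preprocessing(base, enh, lam=1.0)
print(choice, cost)  # lowpass 3.0
```

Raising lambda weights the signaling cost more heavily, so a cheaper method with higher distortion (here the identity filter) can win even when a better-matching filter exists.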
[0053] According to another embodiment, multiple parameters that correspond to the same region in the image could also be signaled to generate multiple different potential predictions for the enhancement layer. Reference can be made, for example, to FIGURE 17 and FIGURE 18 of the present disclosure. FIGURE 17 shows a pre-processing system with N filter consideration/signaling. Multiple filters can be selected for a single region by selecting the M best filters that provide the best desired performance, which can be defined as quality, cost, enhancement layer coding performance, etc. FIGURE 18 shows pre-processing with multiparameter consideration.
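The selection of the M best of N candidate filters for a region can be sketched as a simple ranking, assuming each filter has already been assigned a scalar cost (lower is better). The helper name `best_m_filters` and the cost values are hypothetical.

```python
def best_m_filters(cost_per_filter, m):
    """Return the names of the m lowest-cost filters, best first."""
    ranked = sorted(cost_per_filter, key=cost_per_filter.get)
    return ranked[:m]


# Hypothetical costs of N = 4 candidate filters for one region.
costs = {"f0": 4.0, "f1": 2.5, "f2": 3.0, "f3": 9.0}
selected = best_m_filters(costs, 2)
```

Each selected filter would then produce one potential prediction for the enhancement layer.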
[0054] Similarly to MVC, where the base layer is added in the reference buffer of the enhancement layer for prediction purposes (see FIGURE 9), the new processed images, e.g., after filtering or global motion compensation correction, are also added in the reference buffer of the enhancement layer, as shown in FIGURE 13, where the output of pre-processor (1310) is connected to the reference buffer (1320) of the enhancement layer. According to some embodiments of the present disclosure, the reference buffer (1320) may already include other references such as previously encoded and decoded pictures from the enhancement layer or even pictures generated from processing previously encoded and decoded base layer pictures.
[0055] As noted above, for every previously decoded base layer picture, one or more new processed reference pictures can be generated and added in the enhancement layer buffer (1330) as additional reference pictures. All of these references could be considered for prediction using motion compensation methods and mechanisms such as the reference index concept/signaling that is available within codecs such as MPEG-4 AVC (Advanced Video Coding). For example, assuming that a base layer picture has been processed to generate two different reference picture instances, ref_b0 and ref_b1, and that ref_e, which corresponds to the previously encoded enhancement layer picture, is also available as a reference, one can assign reference indices (ref_idx) 0, 1, and 2 to these pictures respectively. If a macroblock (MB) in the current enhancement layer picture selects ref_b0, then ref_idx = 0 is signaled in the bitstream. Similarly, ref_idx 1 or 2 is signaled for MBs selecting ref_b1 and ref_e respectively.
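The reference index mechanism can be sketched as follows. This is a simplified illustration and not the MPEG-4 AVC procedure: it assumes co-located blocks (no motion search), one-dimensional sample lists, and a sum of absolute differences (SAD) criterion, all of which are assumptions made for this example.

```python
def sad(block, candidate):
    """Sum of absolute differences between two equally sized sample lists."""
    return sum(abs(a - b) for a, b in zip(block, candidate))


def choose_ref_idx(block, references):
    """Return the index of the reference with the smallest SAD.

    references is ordered by ref_idx, e.g. two processed base layer
    references followed by a previous enhancement layer reference.
    Ties are resolved in favor of the lower ref_idx.
    """
    costs = [sad(block, ref) for ref in references]
    return min(range(len(references)), key=costs.__getitem__)


# Hypothetical co-located samples from the three references.
current_block = [10, 12, 14, 16]
references = [
    [10, 12, 15, 16],  # ref_idx 0: first processed base layer reference
    [9, 9, 9, 9],      # ref_idx 1: second processed base layer reference
    [10, 12, 14, 16],  # ref_idx 2: previous enhancement layer picture
]
signaled_ref_idx = choose_ref_idx(current_block, references)
```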
[0056] The availability of such processed reference pictures in the enhancement layer buffer involves the consideration of i) appropriate memory management and ii) reference ordering operations in both the encoder and the decoder as is also done in MPEG-4 AVC and its SVC and MVC extensions.
[0057] Memory management operations take into consideration which references are removed or added in the reference buffer for prediction, while reference ordering takes into consideration the order of how references are considered for motion compensation, which itself affects the number of bits that will be used when signaling that reference.
[0058] Default memory management and reference ordering operations could be considered based on the system's expectation of which reference is likely to be the least useful (for memory management) or the most correlated (for reference ordering). As an example, a first-in first-out (FIFO) approach could be considered for memory management, in which both base and enhancement layer pictures corresponding to the same time instance are removed at the same time. On the other hand, base layer information from previous pictures need not be retained after it has been used, therefore saving memory. Alternative or additional memory management techniques can include adaptive memory management control.
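A minimal sketch of the FIFO behavior described above is shown below, assuming that pictures are grouped by time instance and that the buffer capacity is counted in time instances. The class name `ReferenceBuffer` and its methods are hypothetical and do not correspond to any standardized interface.

```python
from collections import deque


class ReferenceBuffer:
    """FIFO reference buffer that evicts whole time instances.

    Base and enhancement layer pictures that share a time instance are
    stored together, so they are also removed together.
    """

    def __init__(self, max_time_instances):
        self.max_time_instances = max_time_instances
        self._times = deque()   # time instances, oldest first
        self._pictures = {}     # time -> {layer name: picture}

    def add(self, time, layer, picture):
        if time not in self._pictures:
            self._times.append(time)
            self._pictures[time] = {}
        self._pictures[time][layer] = picture
        # First-in first-out eviction of the oldest time instance(s).
        while len(self._times) > self.max_time_instances:
            oldest = self._times.popleft()
            del self._pictures[oldest]

    def get(self, time, layer):
        """Return the stored picture, or None if it is no longer buffered."""
        return self._pictures.get(time, {}).get(layer)

    def times(self):
        return list(self._times)


# Hypothetical usage with string placeholders instead of real pictures.
buf = ReferenceBuffer(max_time_instances=2)
for t in range(3):
    buf.add(t, "base", f"B{t}")
    buf.add(t, "enh", f"E{t}")
```

Adaptive memory management control would replace the fixed eviction rule with explicitly signaled removal commands.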
[0059] Similarly, for default ordering, the base layer reference that corresponds to the same time as the current enhancement layer picture to be encoded could be placed at the beginning of the reference list for prediction, while the rest of the references can be ordered according to temporal distance, coding order, and/or layer relationships. For example, and assuming a single processed reference from the base layer, a default reference order can be as follows:
a) place the processed base layer reference, if available, as the first reference in the list (ref_idx = 0);
b) proceed with alternating order and add enhancement layer and previously processed base layer reference pictures in the reference buffer according to their temporal distance. If two pictures have the same temporal distance, then determine the order according to the direction of reference (past or future compared to the current picture). If the picture/slice type allows one list, then past pictures take precedence over future pictures, while if the picture/slice type allows two lists, then for the first list past pictures take precedence over future pictures, and for the second list future pictures take precedence over past pictures.
When multiple references from the base layer are available, the default order can also be affected, for example, by the order in which these references are specified in the bitstream.
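The default ordering rules above can be sketched as a sort with a composite key. The representation of a reference as a `(layer, time)` pair, the function name `default_order`, and the choice to place an enhancement layer picture before a base layer picture of the same time instance are assumptions made for this example.

```python
def default_order(refs, current_time, list_index=0):
    """Order references for one prediction list.

    refs is a list of (layer, time) pairs with layer in {"base", "enh"}.
    The processed base layer reference of the current time goes first;
    the rest are sorted by temporal distance, then by direction.
    """
    current_base = ("base", current_time)
    first = [r for r in refs if r == current_base]
    rest = [r for r in refs if r != current_base]

    def key(ref):
        layer, time = ref
        distance = abs(time - current_time)
        is_future = time > current_time
        # The first list (index 0) prefers past pictures, the second list
        # (index 1) prefers future pictures.
        direction_rank = is_future if list_index == 0 else not is_future
        layer_rank = 0 if layer == "enh" else 1  # assumption for this sketch
        return (distance, direction_rank, layer_rank)

    return first + sorted(rest, key=key)


# Hypothetical buffer content while encoding the picture at time 4.
refs = [("enh", 3), ("base", 3), ("enh", 5), ("base", 5), ("base", 4), ("enh", 2)]
list0 = default_order(refs, current_time=4, list_index=0)
list1 = default_order(refs, current_time=4, list_index=1)
```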
[0060] The above rules could be specified by the system. The person skilled in the art will also understand that such operations can apply to multiple reference lists, such as in the case of the two prediction lists available in B slices of MPEG-4 AVC/H.264. Explicit memory management and reference ordering operations could also be utilized, which allow further flexibility to be added to the system, since the system can select a different mechanism for handling references for an instance, given reasons such as coding performance and error resiliency among others. In particular, alternatively or in addition to a default ordering, users may wish to specify their own ordering mechanism and use reordering instructions that are signaled in the bitstream, similarly to what is available already in MPEG-4 AVC, that specify exactly how each reference is placed in each reference list.
2. Frame Compatible 3D Delivery
[0061] The above approach can be extended to frame compatible 3D delivery, generally shown in FIGURE 7 of the present application. In this scenario, instead of having a base layer that corresponds to a single view, the base layer now corresponds to two views that have been previously subsampled using a variety of methods and multiplexed using a variety of arrangements. As mentioned earlier, subsampling could include horizontal, vertical, or quincunx among others, and multiplexing could include side by side, over-under, line or column interleaved, and checkerboard among others.
[0062] Reference can be made, for example, to the embodiment of FIGURE 11, where a base layer 3D multiplexer (1110) connected with a base layer video encoder (1120) and an enhancement layer 3D multiplexer (1130) connected with an enhancement layer video encoder (1140) are shown.
[0063] In this scenario, instead of information being missing for one of the two views, what is essentially missing is resolution and/or high frequency information for both views. Therefore, what such a system should provide is the ability, if desired, to add the missing information back to the signal.
[0064] In the simplest embodiment of such a system, subsampling can be performed using basic pixel decimation (1111), (1112), (1131), (1132), i.e. without necessarily considering any filtering, where the base layer corresponds to one set of pixels in the image and the enhancement layer corresponds to another set, without filtering.
[0065] For example, for the horizontal sampling + side by side arrangement, the left view samples in the base layer correspond to the even samples in the original left view frame, the right view samples in the base layer correspond to the odd samples in the original right view frame, while the left and right view samples in the enhancement layer correspond to the remaining, i.e. odd and even samples, in their original frames respectively.
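The horizontal sampling and side by side arrangement just described can be sketched as follows, assuming frames are represented as lists of rows with an even number of samples per row and that sample positions are counted from zero. The function name `pack_side_by_side` is hypothetical.

```python
def pack_side_by_side(left, right):
    """Decimate two views horizontally and pack them side by side.

    Base layer row:        even left samples | odd right samples.
    Enhancement layer row: odd left samples  | even right samples.
    No filtering is applied before decimation.
    """
    base, enh = [], []
    for l_row, r_row in zip(left, right):
        base.append(l_row[0::2] + r_row[1::2])
        enh.append(l_row[1::2] + r_row[0::2])
    return base, enh


# Hypothetical one-row views, four samples wide.
left_view = [[0, 1, 2, 3]]
right_view = [[10, 11, 12, 13]]
base_layer, enh_layer = pack_side_by_side(left_view, right_view)
```

Each packed frame has the same dimensions as a single original view, which is what keeps the base layer frame compatible.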
[0066] In this scenario, very high correlation exists between the base and enhancement layers which cannot be exploited as efficiently using the prediction methods provided by MVC.
[0067] Similarly to what was previously done for the 2D compatible system embodiments, a preprocessing stage (1150) is introduced that processes the base layer information before utilizing this information as a potential prediction for the enhancement layer.
[0068] A further embodiment of the present disclosure provides for a frame compatible 3D architecture similar to the one shown in FIGURE 11, with frame compatible signals but without 3D pre-processors (or with 3D pre-processors operating in a "pass-through mode") and with the presence of data multiplexing at the input (1110), (1130) and data remultiplexing at the output (1170), (1180).
[0069] More specifically, apart from filtering and global motion compensation correction that were discussed in the previous section, fixed or adaptive interpolation techniques that account for the characteristics of the sampling and arrangement methods used by the content can be utilized to process the base layer.
[0070] For example, processing could include separable or non-separable interpolation filters, edge adaptive interpolation techniques, filters based on wavelet, bandlet, or ridgelet methods, and inpainting among others.
[0071] Other methods that try to enhance resolution or can help with predicting missing frequency information could also be used. Methods that consider information from both views, such as copying the data from the base layer right view to predict the enhancement layer left view, can also be used. Similarly to what was discussed above, these methods could again be applied at the sequence, scene, image, and/or region level, while multiple such parameters could be signaled to allow the generation of multiple potential references for prediction. Regions, as in the case of the 2D compatible system, can be predefined or signaled within a bitstream.
[0072] It should be noted that it is not necessary for the enhancement layer to utilize the entire or even any part of a prediction/reference picture. In other words, the enhancement layer encoder (1140) can consider the processed images from the base layer for prediction, but only if desired. For example, the user may select to predict the entire enhancement layer from a previously decoded enhancement layer picture, or if multiple pre-processed base layer pictures are available, the encoder can select only one of them (e.g. in view of a rate distortion criterion) or any combination of two reference pictures, assuming the presence of a bi-predictive coding. The same can also occur at the region level.
[0073] For example, all or part of the top half of a processed base layer image could be used to predict the current enhancement layer picture, while for the bottom part the encoder could instead select a previous enhancement layer picture. Additionally, block-based (e.g. for MPEG-2 or MPEG-4 AVC like codecs) or other local motion compensation methods (e.g. a motion compensated method utilized by a future codec) could be used as part of the enhancement layer codec, which may determine that a different prediction, e.g. temporal, may provide better performance.
[0074] However, such prediction samples could also be combined together in a bi-predictive or even a multi-hypothesis motion compensated framework, again at the block or region level, resulting in further improved prediction.
[0075] It should be apparent, similarly to how references are processed in MVC, that each reference in the systems and methods according to the present disclosure could be further interpolated (e.g., using the MPEG-4 AVC/H.264 interpolation filters) and utilized with reference re-ordering and weighted prediction when used for prediction.
[0076] FIGURE 14 and FIGURE 15 show in detail the pre-processing module (1410) on the encoder side and the pre-processing module (1510) on the decoder side.
[0077] The design and selection of the pre-processing method can be part of an encoder and can be based on user input or other criteria such as cost, complexity, and coding/quality performance.
[0078] An example of such a process is shown in FIGURE 16. After a prediction reference from the base layer is created, as stated above, this reference (1610) and all other references (1620) (e.g. previously coded pictures from the enhancement layer or past or differently processed prediction references from the base layer) are considered within a motion compensated architecture to refine the prediction (1630) of the enhancement layer at a lower level, e.g. block or region.
[0079] While the process according to the present disclosure is similar to how MPEG-4 AVC/H.264 and its MVC and SVC extensions also perform prediction, better references are used herein in view of the presence of a pre-processing stage. After such prediction is performed, the residual for the enhancement layer can be computed, transformed, quantized and encoded, with any additional overhead such as motion information, using methods similar to those used in the MPEG-4 AVC codec.
[0080] Other methods or future codecs can also be utilized to encode such information. This residual can be dequantized, inverse transformed and then added back to the prediction to reconstruct the enhancement layer signal.
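The residual path described above can be sketched in a heavily simplified form. The transform and entropy coding stages are deliberately omitted, and a uniform scalar quantizer with a fixed step is assumed; neither simplification reflects the actual MPEG-4 AVC design, and all function names are hypothetical.

```python
def quantize(value, step):
    """Uniform scalar quantizer, rounding half away from zero."""
    magnitude = (abs(value) + step // 2) // step
    return magnitude if value >= 0 else -magnitude


def encode_residual(original, prediction, step):
    """Quantized residual levels (what would be entropy coded)."""
    return [quantize(o - p, step) for o, p in zip(original, prediction)]


def reconstruct(levels, prediction, step):
    """Dequantize the levels and add them back to the prediction."""
    return [p + q * step for q, p in zip(levels, prediction)]


# Hypothetical enhancement layer samples and their prediction.
original = [100, 104, 97, 90]
prediction = [98, 100, 99, 95]
levels = encode_residual(original, prediction, step=3)
reconstructed = reconstruct(levels, prediction, step=3)
```

Because the decoder runs the same `reconstruct` step on the same prediction, the encoder and decoder reference buffers stay in sync; a better prediction leaves a smaller residual to code.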
[0081] According to a further embodiment of the present disclosure, optional in-loop filtering (as shown in FIGURE 14 and FIGURE 15), such as deblocking, that applies only on the enhancement layer could be used to reduce artifacts, such as blockiness. It should be noted that the enhancement layer in this scenario is in a similar packing arrangement as that of the base layer. For display purposes, the base and enhancement layer data would need to be re-multiplexed together so as to generate two separate, full resolution, left and right images. Re-multiplexing could be done by using simple interleaving of the base and enhancement layers. As shown in FIGURE 11, re-multiplexing of the base and enhancement layer data occurs through multiplexers (1170) and (1180).
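Re-multiplexing by simple interleaving can be sketched as below for the horizontal sampling and side by side arrangement, assuming the base layer holds the even left and odd right samples and the enhancement layer holds the remaining ones, with positions counted from zero. The function name `remultiplex` and the list-of-rows frame representation are assumptions for this example.

```python
def remultiplex(base, enh):
    """Interleave side by side base and enhancement frames into two views.

    Each input row is laid out as [left half | right half]. Returns the
    full resolution (left, right) views.
    """
    left, right = [], []
    for b_row, e_row in zip(base, enh):
        half = len(b_row) // 2
        l_row = [0] * (2 * half)
        r_row = [0] * (2 * half)
        l_row[0::2] = b_row[:half]   # even left samples come from the base layer
        l_row[1::2] = e_row[:half]   # odd left samples come from the enhancement layer
        r_row[1::2] = b_row[half:]   # odd right samples come from the base layer
        r_row[0::2] = e_row[half:]   # even right samples come from the enhancement layer
        left.append(l_row)
        right.append(r_row)
    return left, right


# Hypothetical decoded one-row layers.
left_full, right_full = remultiplex([[0, 2, 11, 13]], [[1, 3, 10, 12]])
```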
[0082] In an alternative embodiment, the base layer information is also filtered prior to combining it, e.g. replacing half of the samples or averaging half of the samples, with the samples from the enhancement layer. Reference can be made to filters GL B2 and GL E of FIGURE 11, where GL B2 can be averaged with GL E in such alternative embodiment.
[0083] In a different embodiment, generation of the base layer video could occur through the use of filtering (e.g., to reduce aliasing) prior to decimation. In this scenario, and excluding compression impact, a single layer approach may not be able to generate a true full resolution image. Such single layer can, however, help reconstruction of some of the lost frequencies or accurate reconstruction of half of the resolution of the original signal.
[0084] To alleviate this problem, an additional, third layer can be introduced that tries to correct for any errors introduced by the prior filtering in the base layer. Reference can be made to layer (1160) of FIGURE 11. Similar methods could be used for predicting the signal in this new layer from data in both the base and enhancement layers. The person skilled in the art will understand that data from the base layer could be good enough predictors for this new layer without processing or with very little processing.
[0085] However, it is possible that the enhancement layer may be of higher quality and could provide additional information that could be also utilized for the prediction of this layer. Therefore, in the present disclosure, apart from prediction references coming from the base layer and previously reconstructed references from this third (second enhancement) layer, references generated using pre-processing of the second (first enhancement) layer, or references using pre-processing while considering both base and first enhancement layer could be used. Therefore, embodiments of the present disclosure can be provided where there is more than one pre-processing stage on the encoding side and more than one pre-processing stage on the decoding side.
[0086] In an example, the prediction reference could be generated using edge adaptive interpolation of the enhancement layer while the edge adaptive decisions could be based also on the edges and samples of the base layer. Weighted averaging of an interpolated enhancement layer and the original or filtered base layer could generate a different prediction. Other mechanisms to generate a prediction picture for this enhancement layer could also be used, as discussed above, also including methods employing wavelet interpolation, inpainting and others.
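The weighted averaging mentioned above can be sketched with integer arithmetic, assuming the enhancement layer has already been interpolated to the resolution of the base layer samples it is combined with. The weight is expressed as a fraction `w_num / w_den`; the function name and the rounding rule are assumptions for this example.

```python
def weighted_prediction(interp_enh, base, w_num, w_den):
    """Blend two equally sized sample lists with rounding.

    Each output sample is (w_num * enh + (w_den - w_num) * base) / w_den,
    rounded to the nearest integer.
    """
    half = w_den // 2
    return [
        (w_num * e + (w_den - w_num) * b + half) // w_den
        for e, b in zip(interp_enh, base)
    ]


# Hypothetical interpolated enhancement layer and base layer samples.
interp_enh = [100, 50]
base = [80, 70]
equal_blend = weighted_prediction(interp_enh, base, 1, 2)
enh_heavy_blend = weighted_prediction(interp_enh, base, 3, 4)
```

Different weights yield different prediction pictures, each of which could be offered to the third layer as a separate reference.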
[0087] Therefore, according to the teachings of the present disclosure, delivery of 3D content using frame compatible methods, e.g. checkerboard, side by side, or over-under video delivery, is extended to support full resolution through the introduction of additional enhancement layers. These additional enhancement layers can provide, apart from additional resolution and/or quality, additional functionalities such as improved streaming and complexity scalability.
[0088] The teachings provided in the present disclosure can also be seen as extensions of existing scalable video coding technologies such as the Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions of the MPEG-4 AVC standard, however, with the consideration of improved methods for predicting from one layer to the next.
[0089] This advantage can result in improvements in coding efficiency, while having similar, and in some cases reduced, complexity compared to these technologies. Although some embodiments can be based on the MPEG-4 AVC/H.264 video coding standard, the techniques presented in the present disclosure are codec agnostic and can be applied to other video coding standards and codecs such as MPEG-2 and VC-1.
[0090] Possible applications of the teachings of the present disclosure are stereoscopic (3D) format video encoders and decoders that can be applied, by way of example and not of limitation, to Blu-ray video discs, broadcast and download/on demand systems, satellite systems, IPTV systems, and mobile devices that support 3D video.
[0091] The methods and systems described in the present disclosure may be implemented in hardware, software, firmware or a combination thereof. Features described as blocks, modules or components may be implemented together (e.g., in a logic device such as an integrated logic device) or separately (e.g., as separate connected logic devices). The software portion of the methods of the present disclosure may comprise a computer-readable medium which comprises instructions that, when executed, perform, at least in part, the described methods. The computer-readable medium may comprise, for example, a random access memory (RAM) and/or a read-only memory (ROM). The instructions may be executed by a processor (e.g., a digital signal processor (DSP), an application specific integrated circuit (ASIC), or a field programmable gate array (FPGA)).
[0092] An embodiment of the present invention may relate to one or more of the example embodiments, enumerated below.
1. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video encoding system, comprising:
a base layer, comprising a base layer video encoder;
at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video encoder; and
at least one pre-processing module, i) to pre-process the output of the base layer video encoder and input the pre-processed output into the enhancement layer video encoder and/or ii) to pre-process the output of one enhancement layer video encoder of one enhancement layer and input the pre-processed output into another enhancement layer video encoder of another enhancement layer.
2. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video decoding system, comprising:
a base layer, comprising a base layer video decoder;
at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video decoder; and
at least one pre-processing module, i) to pre-process the output of the base layer video decoder and input the pre-processed output into the enhancement layer video decoder and/or ii) to pre-process the output of one enhancement layer video decoder of one enhancement layer and input the pre-processed output into another enhancement layer video decoder of another enhancement layer.
3. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video system, comprising:
a base layer, comprising a base layer video encoder and a base layer video decoder;
at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video encoder and an enhancement layer video decoder;
at least one encoder pre-processing module, i) to pre-process the output of the base layer video encoder and input the pre-processed output into the enhancement layer video encoder and/or ii) to pre-process the output of one enhancement layer video encoder of one enhancement layer and input the pre-processed output into another enhancement layer video encoder of another enhancement layer; and
at least one decoder pre-processing module, i) to pre-process the output of the base layer video decoder and input the pre-processed output into the enhancement layer video decoder and/or ii) to pre-process the output of one enhancement layer video decoder of one enhancement layer and input the pre-processed output into another enhancement layer video decoder of another enhancement layer.
4. The 3D video system of Enumerated Example Embodiment 1 or 3, wherein the pre-processed output of the pre-processing module further depends on video information coming from a plurality of views.
5. The 3D video system of Enumerated Example Embodiment 4, wherein the plurality of views are a first view and a second view.
6. The 3D video system of Enumerated Example Embodiment 4, wherein the plurality of views are more than two views.
7. The 3D video system according to any one of the previous Enumerated Example Embodiments, wherein the pre-processing module comprises at least one of filtering, global motion compensation, region motion compensation, and texture mapping techniques.
8. The 3D video system according to Enumerated Example Embodiment 7, wherein the preprocessing module comprises parameters to be used with the at least one of the filtering, global motion compensation, region motion compensation, and texture mapping techniques.
9. The 3D video system according to Enumerated Example Embodiment 8, wherein the parameters are a set of parameters for an entire video, scene, image or region.
10. The 3D video system according to Enumerated Example Embodiment 8, wherein the parameters are multiple parameters, differently assigned for different regions of a video, scene or image.
11. The 3D video system according to Enumerated Example Embodiment 10, wherein the number, shape and size of the regions are fixed.
12. The 3D video system according to Enumerated Example Embodiment 10, wherein the number, shape and size of the regions are adaptive.
13. The 3D video system according to Enumerated Example Embodiment 10, wherein the regions are adaptive regions.
14. The 3D video system according to Enumerated Example Embodiment 13, wherein the adaptive regions are obtained through pre-analysis of video content.
15. The 3D video system according to Enumerated Example Embodiment 14, wherein the pre-analysis of the video content is a segmentation method.
16. The 3D video system according to Enumerated Example Embodiment 13, wherein features of the adaptive regions are user-specified.
17. The 3D video system according to Enumerated Example Embodiment 16, wherein the user-specified features of the adaptive regions are signaled within an image bitstream.
18. The 3D video system according to Enumerated Example Embodiment 16 or 17, wherein the user-specified features of the adaptive regions comprise at least one of region shape and region size.
19. The 3D video system according to Enumerated Example Embodiment 18, wherein the user-specified features of the adaptive regions comprise signaling that each frame is split in a plurality of rectangular regions.
20. The 3D video system according to Enumerated Example Embodiment 18, wherein the user-specified features of the adaptive regions comprise signaling region shape of each region using a map description.
21. The 3D video system according to any one of the previous Enumerated Example Embodiments, further comprising a pre-processing encoder connected with the preprocessing module, the pre-processing encoder adapted to select one of a plurality of preprocessing techniques to be used by the pre-processing module.
22. The 3D video system of Enumerated Example Embodiment 21, further comprising a preprocessing decoder connected at the output of the pre-processing module.
23. The 3D video system according to Enumerated Example Embodiment 21 or 22, wherein selection among the plurality of pre-processing techniques is based on one or more parameters selected to satisfy one or more among complexity, quality, and resulting coding efficiency criteria.
24. The 3D video system according to Enumerated Example Embodiment 21 or 22, wherein selection among the plurality of pre-processing techniques is performed through a Lagrangian optimization method.
25. The 3D video system according to any one of Enumerated Example Embodiments 1 to 20, further comprising a pre-processing encoder connected with the pre-processing module, the pre-processing encoder adapted to select one of a plurality of predictions generated through multiple parameters.
26. The 3D video system according to Enumerated Example Embodiment 25, further comprising a pre-processing decoder connected with the output of the pre-processing module.
27. The 3D video system according to any one of the previous Enumerated Example Embodiments, wherein
the base layer video encoder comprises a base layer encoding reference buffer,
the enhancement layer video encoder comprises an enhancement layer encoding reference buffer, and
a pre-processing module of the at least one pre-processing module is connected between the base layer encoding reference buffer and the enhancement layer encoding reference buffer.
28. The 3D video system according to any one of the previous Enumerated Example Embodiments, wherein
the base layer video decoder comprises a base layer decoding reference buffer,
the enhancement layer video decoder comprises an enhancement layer decoding reference buffer, and
the pre-processing module is connected between the base layer decoding reference buffer and the enhancement layer decoding reference buffer.
29. The 3D video system according to Enumerated Example Embodiments 27 or 28, wherein the enhancement layer encoding reference buffer and the enhancement layer decoding reference buffer comprise one or more processed reference pictures.
30. The 3D video system according to Enumerated Example Embodiment 29, wherein memory management techniques or reference ordering operations are applied to the enhancement layer encoding reference buffer and the enhancement layer decoding reference buffer.
31. The 3D video system according to any one of Enumerated Example Embodiments 28 to 30, wherein in-loop filtering is applied to the enhancement layer encoding reference buffer and the enhancement layer decoding reference buffer.
32. The 3D video system according to Enumerated Example Embodiment 30, wherein the memory management techniques comprise at least one of a first frame-in-first frame-out (FIFO) approach, discarding previous frames approach, and adaptive memory management control.
33. The 3D video system according to Enumerated Example Embodiment 30, wherein the reference ordering operations comprise placing at a beginning of a reference list a base layer reference corresponding to the same time as a current enhancement layer picture to be encoded, and ordering the remaining references according to an ordering parameter.
34. The 3D video system according to Enumerated Example Embodiment 33, wherein the ordering parameter is selected from temporal distance, coding order, layer relationships, and/or coding prediction efficiency criteria.
35. The 3D video system of any one of the previous Enumerated Example Embodiments, wherein the 3D video system is a 2D compatible video system, wherein the base layer corresponds to a single view.
36. The 3D video system of any one of the previous Enumerated Example Embodiments, wherein the 3D video system is a frame compatible video system, wherein the base layer and the enhancement layer correspond to two views, each of the base layer encoder and enhancement layer encoder comprising
a subsampler to subsample the two views, and
a multiplexer to multiplex the two subsampled views.
37. The 3D video system of Enumerated Example Embodiment 36, wherein the subsampler subsamples according to a technique selected from horizontal, vertical, or quincunx subsampling.
38. The 3D video system according to Enumerated Example Embodiment 36 or 37, wherein the multiplexer multiplexes according to a technique selected from a side by side, over-under, line interleaved, column interleaved, and checkerboard technique.
39. The 3D video system according to any one of Enumerated Example Embodiments 36 to 38, wherein the pre-processing module is based on a fixed or adaptive interpolation technique.
40. The 3D video system according to Enumerated Example Embodiment 39, wherein the interpolation technique is selected from separable interpolation filters, edge adaptive interpolation techniques, wavelet-based filters, bandlet-based filters, ridgelet-based filters, and inpainting.
41. The 3D video system according to any one of Enumerated Example Embodiments 36 to 38, wherein the pre-processing module is based on resolution enhancement techniques, missing frequency information predicting techniques or techniques considering information from both views.
42. The 3D video system according to Enumerated Example Embodiment 41, wherein the techniques considering information from both views comprise copying data from a base layer first view to predict an enhancement layer second view.
43. The 3D video system according to Enumerated Example Embodiment 41 or 42, wherein the techniques are applied at a level selected from a sequence level, scene level, image level, and region level.
44. The 3D video system according to Enumerated Example Embodiment 43, wherein the regions of the region level are predefined regions or regions signaled within a bitstream.
45. The 3D video system according to any one of Enumerated Example Embodiments 41 to 44, wherein the techniques comprise signaling of parameters to allow generation of multiple potential references for prediction.
46. The 3D video system according to any one of Enumerated Example Embodiments 36 to 45, further comprising a base layer decoder, an enhancement layer decoder, a base layer demultiplexer connected with the base layer decoder, and an enhancement layer demultiplexer connected with the enhancement layer decoder.
47. The 3D video system according to any one of Enumerated Example Embodiments 36 to 46, further comprising a base layer decoder and an enhancement layer decoder, and at least one additional multiplexer to multiplex base layer decoded data with enhancement layer decoded data of a same view.
48. The 3D video system according to Enumerated Example Embodiment 47, wherein the at least one additional multiplexer are two additional multiplexers, a first additional multiplexer being a left view multiplexer and a second additional multiplexer being a right view multiplexer.
49. The 3D video system according to Enumerated Example Embodiment 47, further comprising a base layer filtering stage and an enhancement layer filtering stage to respectively filter the base layer decoded data and the enhancement layer decoded data before being multiplexed in the at least one additional multiplexer.
50. The 3D video system of any one of the previous Enumerated Example Embodiments, wherein the pre-processed output of the pre-processing module further depends on previously coded pictures from the enhancement layer or past or differently processed prediction references from the base layer.
51. The 3D video system of any one of the previous Enumerated Example Embodiments, wherein the pre-processing module is based on resolution enhancement techniques, missing frequency information predicting techniques or techniques considering information from a plurality of views.
52. The 3D video system of Enumerated Example Embodiment 51, wherein the techniques comprise signaling of parameters to allow generation of multiple potential references for prediction.
53. The 3D video system according to any one of the previous Enumerated Example Embodiments, further comprising an additional enhancement layer comprising an additional enhancement layer video encoder and an additional enhancement layer pre-processor connected with the additional enhancement layer video encoder.
54. The 3D video system according to Enumerated Example Embodiment 53, wherein the additional enhancement layer pre-processor is connected with the enhancement layer video encoder.
55. The 3D video system according to Enumerated Example Embodiment 54, wherein the additional enhancement layer comprises an additional enhancement layer video encoder connected with the additional enhancement layer pre-processor.
56. The 3D video system according to Enumerated Example Embodiment 55, wherein
the enhancement layer further comprises an enhancement layer video decoder, the additional enhancement layer further comprises an additional enhancement layer video decoder,
the 3D video system further comprising a further pre-processing module between the enhancement layer video decoder and the additional enhancement layer video decoder.
57. The 3D video system according to Enumerated Example Embodiment 56, wherein the additional enhancement layer further comprises an additional enhancement layer demultiplexer connected with the additional enhancement layer video decoder.
58. The 3D video system according to any one of Enumerated Example Embodiments 53 to 57, further comprising an enhancement layer video decoder, an additional enhancement layer video decoder, and at least one additional multiplexer to multiplex enhancement layer decoded data with additional enhancement layer decoded data of a same view.
59. The 3D video system according to Enumerated Example Embodiment 58, wherein the at least one additional multiplexer are two additional multiplexers, a first additional multiplexer being a left view multiplexer and a second additional multiplexer being a right view multiplexer.
60. A frame compatible three-dimensional (3D) video system, comprising:
a base layer, comprising a base layer video encoder and a base layer multiplexer, the base layer multiplexer receiving an input indicative of a plurality of views and forming a multiplexed output connected with the base layer video encoder; and
at least one enhancement layer, associated with the base layer, the at least one enhancement layer comprising an enhancement layer video encoder and an enhancement layer multiplexer, the enhancement layer multiplexer receiving an input indicative of the plurality of views and forming a multiplexed output connected with the enhancement layer video encoder,
wherein the base layer video encoder is directly connected with the enhancement layer video encoder.
61. The frame compatible 3D video system of Enumerated Example Embodiment 60, wherein direct connection between the base layer video encoder and the enhancement layer video encoder is obtained through a pre-processor operating in a pass-through mode.
62. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video encoding method, comprising:
base layer video encoding a plurality of images or frames;
enhancement layer video encoding the plurality of images or frames;
pre-processing base layer video encoded images or frames; and
adopting the pre-processed base layer video encoded images or frames for the enhancement layer video encoding.
63. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video decoding method, comprising:
base layer video decoding a plurality of images or frames;
pre-processing base layer video decoded images or frames;
adopting the pre-processed base layer video decoded images or frames for enhancement layer video decoding; and
enhancement layer video decoding the plurality of images or frames.
64. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video method, comprising:
base layer video encoding a plurality of images or frames;
enhancement layer video encoding the plurality of images or frames;
pre-processing base layer video encoded images or frames;
adopting the pre-processed base layer video encoded images or frames for the enhancement layer video encoding;
base layer video decoding a plurality of images or frames;
pre-processing base layer video decoded images or frames;
adopting the pre-processed base layer video decoded images or frames for enhancement layer video decoding; and
enhancement layer video decoding the plurality of images or frames.
65. An encoder for encoding a video signal according to the method recited in Enumerated Example Embodiment 62.
66. An apparatus for encoding a video signal according to the method recited in Enumerated Example Embodiment 62.
67. A system for encoding a video signal according to the method recited in Enumerated Example Embodiment 62.
68. A decoder for decoding a video signal according to the method recited in Enumerated Example Embodiment 63.
69. An apparatus for decoding a video signal according to the method recited in Enumerated Example Embodiment 63.
70. A system for decoding a video signal according to the method recited in Enumerated Example Embodiment 63.
71. A computer-readable medium containing a set of instructions that causes a computer to perform the method recited in one or more of Enumerated Example Embodiments 62-64.
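The method of Enumerated Example Embodiments 62 to 64 can be illustrated with a minimal Python sketch for the 2D compatible case, in which the base layer carries one view and the enhancement layer carries the other. Everything below is a hypothetical stand-in and is not taken from the disclosure: the quantizer plays the role of the base layer codec, the 3-tap filter is one arbitrary choice of pre-processing, and the enhancement layer simply codes a lossless residual against the pre-processed reference. The sketch also assumes that the encoder pre-processes its reconstructed base layer picture, so that the encoder and the decoder form the same prediction reference.

```python
from typing import Callable, List, Tuple

Picture = List[int]  # one row of samples stands in for a whole picture
PreProcess = Callable[[Picture], Picture]


def base_layer_encode(picture: Picture, step: int = 4) -> List[int]:
    """Toy base layer encoder (hypothetical): lossy uniform quantization."""
    return [sample // step for sample in picture]


def base_layer_decode(levels: List[int], step: int = 4) -> Picture:
    """Toy base layer decoder: inverse quantization."""
    return [level * step for level in levels]


def smooth(picture: Picture) -> Picture:
    """Example pre-processing: [1, 2, 1] / 4 filter with edge clamping."""
    last = len(picture) - 1
    return [
        (picture[max(i - 1, 0)] + 2 * picture[i] + picture[min(i + 1, last)] + 2) // 4
        for i in range(len(picture))
    ]


def enhancement_layer_encode(picture: Picture, reference: Picture) -> List[int]:
    """Toy enhancement layer encoder: residual against the reference."""
    return [s - r for s, r in zip(picture, reference)]


def enhancement_layer_decode(residual: List[int], reference: Picture) -> Picture:
    """Toy enhancement layer decoder: add the residual back to the reference."""
    return [d + r for d, r in zip(residual, reference)]


def encode(base_view: Picture, enh_view: Picture,
           pre_process: PreProcess = smooth) -> Tuple[List[int], List[int]]:
    base_bits = base_layer_encode(base_view)
    # Assumption: the reference is built from the reconstructed base layer
    # picture, so that it matches what the decoder will have available.
    reference = pre_process(base_layer_decode(base_bits))
    enh_bits = enhancement_layer_encode(enh_view, reference)
    return base_bits, enh_bits


def decode(base_bits: List[int], enh_bits: List[int],
           pre_process: PreProcess = smooth) -> Tuple[Picture, Picture]:
    base_picture = base_layer_decode(base_bits)
    # Same pre-processing as in the encoder, applied to the decoded base layer.
    reference = pre_process(base_picture)
    enh_picture = enhancement_layer_decode(enh_bits, reference)
    return base_picture, enh_picture
```

A decoder that only understands the base layer can stop after `base_layer_decode` and still display one view, while a decoder that handles both layers recovers the second view from the residual and the pre-processed reference.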
[0093] The examples set forth above are provided to give those of ordinary skill in the art a complete disclosure and description of how to make and use the embodiments of the encoding and decoding architectures for format compatible 3D video delivery of the disclosure, and are not intended to limit the scope of what the inventors regard as their disclosure. Modifications of the above-described modes for carrying out the disclosure may be used by persons of skill in the video art, and are intended to be within the scope of the following claims. All patents and publications mentioned in the specification may be indicative of the levels of skill of those skilled in the art to which the disclosure pertains. All references cited in this disclosure are incorporated by reference to the same extent as if each reference had been incorporated by reference in its entirety individually.
[0094] It is to be understood that the disclosure is not limited to particular methods or systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms "a," "an," and "the" include plural referents unless the content clearly dictates otherwise. The term "plurality" includes two or more referents unless the content clearly dictates otherwise. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure pertains.
[0095] A number of embodiments of the disclosure have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the present disclosure. Accordingly, other embodiments are within the scope of the following claims.

Claims

WHAT IS CLAIMED IS:
1. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video encoding system, comprising:
a base layer, comprising a base layer video encoder;
at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video encoder; and
at least one pre-processing module, i) to pre-process the output of the base layer video encoder and input the pre-processed output into the enhancement layer video encoder and/or ii) to pre-process the output of one enhancement layer video encoder of one enhancement layer and input the pre-processed output into another enhancement layer video encoder of another enhancement layer.
2. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video decoding system, comprising:
a base layer, comprising a base layer video decoder;
at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video decoder; and
at least one pre-processing module, i) to pre-process the output of the base layer video decoder and input the pre-processed output into the enhancement layer video decoder and/or ii) to pre-process the output of one enhancement layer video decoder of one enhancement layer and input the pre-processed output into another enhancement layer video decoder of another enhancement layer.
3. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video system, comprising:
a base layer, comprising a base layer video encoder and a base layer video decoder;
at least one enhancement layer, associated with the base layer, the enhancement layer comprising an enhancement layer video encoder and an enhancement layer video decoder;
at least one encoder pre-processing module, i) to pre-process the output of the base layer video encoder and input the pre-processed output into the enhancement layer video encoder and/or ii) to pre-process the output of one enhancement layer video encoder of one enhancement layer and input the pre-processed output into another enhancement layer video encoder of another enhancement layer; and
at least one decoder pre-processing module, i) to pre-process the output of the base layer video decoder and input the pre-processed output into the enhancement layer video decoder and/or ii) to pre-process the output of one enhancement layer video decoder of one enhancement layer and input the pre-processed output into another enhancement layer video decoder of another enhancement layer.
4. The 3D video system of claim 1 or 3, wherein the pre-processed output of the pre-processing module further depends on video information coming from a plurality of views.
5. The 3D video system according to any one of the previous claims, wherein the pre-processing module comprises at least one of filtering, global motion compensation, region motion compensation, and texture mapping techniques.
6. The 3D video system according to claim 5, wherein the pre-processing module comprises parameters to be used with the at least one of the filtering, global motion compensation, region motion compensation, and texture mapping techniques.
7. The 3D video system according to claim 6, wherein the parameters are multiple parameters, differently assigned for different regions of a video, scene or image.
8. The 3D video system according to claim 7, wherein the regions are adaptive regions.
9. The 3D video system according to claim 8, wherein the adaptive regions are obtained through pre-analysis of video content.
10. The 3D video system according to claim 8, wherein features of the adaptive regions are user-specified.
11. The 3D video system according to any one of the previous claims, further comprising a pre-processing encoder connected with the pre-processing module, the pre-processing encoder adapted to select one of a plurality of pre-processing techniques to be used by the pre-processing module.
12. The 3D video system of claim 11, further comprising a pre-processing decoder connected at the output of the pre-processing module.
13. The 3D video system according to any one of claims 1 to 12, further comprising a pre-processing encoder connected with the pre-processing module, the pre-processing encoder adapted to select one of a plurality of predictions generated through multiple parameters.
14. The 3D video system according to claim 13, further comprising a pre-processing decoder connected with the output of the pre-processing module.
15. The 3D video system according to any one of the previous claims, wherein
the base layer video encoder comprises a base layer encoding reference buffer,
the enhancement layer video encoder comprises an enhancement layer encoding reference buffer, and
a pre-processing module of the at least one pre-processing module is connected between the base layer encoding reference buffer and the enhancement layer encoding reference buffer.
16. The 3D video system according to any one of the previous claims, wherein
the base layer video decoder comprises a base layer decoding reference buffer,
the enhancement layer video decoder comprises an enhancement layer decoding reference buffer, and
the pre-processing module is connected between the base layer decoding reference buffer and the enhancement layer decoding reference buffer.
17. The 3D video system of any one of the previous claims, wherein the 3D video system is a 2D compatible video system, wherein the base layer corresponds to a single view.
18. The 3D video system of any one of the previous claims, wherein the 3D video system is a frame compatible video system, wherein the base layer and the enhancement layer correspond to two views, each of the base layer encoder and enhancement layer encoder comprising
a subsampler to subsample the two views, and
a multiplexer to multiplex the two subsampled views.
19. The 3D video system according to claim 18, further comprising a base layer decoder and an enhancement layer decoder, and at least one additional multiplexer to multiplex base layer decoded data with enhancement layer decoded data of a same view.
20. The 3D video system according to claim 19, wherein the at least one additional multiplexer are two additional multiplexers, a first additional multiplexer being a left view multiplexer and a second additional multiplexer being a right view multiplexer.
21. The 3D video system according to claim 19, further comprising a base layer filtering stage and an enhancement layer filtering stage to respectively filter the base layer decoded data and the enhancement layer decoded data before being multiplexed in the at least one additional multiplexer.
22. The 3D video system of any one of the previous claims, wherein the pre-processed output of the pre-processing module further depends on previously coded pictures from the enhancement layer or past or differently processed prediction references from the base layer.
23. The 3D video system of any one of the previous claims, wherein the pre-processing module is based on resolution enhancement techniques, missing frequency information predicting techniques or techniques considering information from a plurality of views.
24. The 3D video system according to any one of the previous claims, further comprising an additional enhancement layer comprising an additional enhancement layer video encoder and an additional enhancement layer pre-processor connected with the additional enhancement layer video encoder.
25. A frame compatible three-dimensional (3D) video system, comprising:
a base layer, comprising a base layer video encoder and a base layer multiplexer, the base layer multiplexer receiving an input indicative of a plurality of views and forming a multiplexed output connected with the base layer video encoder; and
at least one enhancement layer, associated with the base layer, the at least one enhancement layer comprising an enhancement layer video encoder and an enhancement layer multiplexer, the enhancement layer multiplexer receiving an input indicative of the plurality of views and forming a multiplexed output connected with the enhancement layer video encoder,
wherein the base layer video encoder is directly connected with the enhancement layer video encoder.
26. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video encoding method, comprising:
base layer video encoding a plurality of images or frames;
enhancement layer video encoding the plurality of images or frames;
pre-processing base layer video encoded images or frames; and
adopting the pre-processed base layer video encoded images or frames for the enhancement layer video encoding.
27. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video decoding method, comprising:
base layer video decoding a plurality of images or frames;
pre-processing base layer video decoded images or frames;
adopting the pre-processed base layer video decoded images or frames for enhancement layer video decoding; and
enhancement layer video decoding the plurality of images or frames.
28. A two-dimensional (2D) compatible or frame compatible three-dimensional (3D) video method, comprising:
base layer video encoding a plurality of images or frames;
enhancement layer video encoding the plurality of images or frames;
pre-processing base layer video encoded images or frames;
adopting the pre-processed base layer video encoded images or frames for the enhancement layer video encoding;
base layer video decoding a plurality of images or frames;
pre-processing base layer video decoded images or frames;
adopting the pre-processed base layer video decoded images or frames for enhancement layer video decoding; and
enhancement layer video decoding the plurality of images or frames.
29. An encoder for encoding a video signal according to the method recited in claim 26.
30. An apparatus for encoding a video signal according to the method recited in claim 26.
31. A system for encoding a video signal according to the method recited in claim 26.
32. A decoder for decoding a video signal according to the method recited in claim 27.
33. An apparatus for decoding a video signal according to the method recited in claim 27.
34. A system for decoding a video signal according to the method recited in claim 27.
35. A computer-readable medium containing a set of instructions that causes a computer to perform the method recited in one or more of claims 26-28.
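The frame compatible arrangement of claims 18 to 20 can be sketched in Python as follows. The sketch is illustrative only and rests on assumptions that the claims do not state: each view is reduced to a single row of samples with an even width, the base layer keeps the even columns of both views, the enhancement layer keeps the complementary odd columns, and the two subsampled views are packed side by side. All function names are hypothetical, and video coding of the packed rows is omitted.

```python
from typing import List, Tuple

Row = List[int]  # one row of samples stands in for a whole picture


def subsample(row: Row, phase: int) -> Row:
    """Keep every second sample, starting at `phase` (0 = even, 1 = odd)."""
    return row[phase::2]


def multiplex_side_by_side(left: Row, right: Row, phase: int) -> Row:
    """Pack the subsampled left and right views into one frame compatible row."""
    return subsample(left, phase) + subsample(right, phase)


def demultiplex(packed: Row) -> Tuple[Row, Row]:
    """Split a side-by-side row back into its left and right halves."""
    half = len(packed) // 2
    return packed[:half], packed[half:]


def view_multiplexer(even: Row, odd: Row) -> Row:
    """Interleave base layer (even) and enhancement layer (odd) samples of one view."""
    full: Row = []
    for e, o in zip(even, odd):
        full.extend((e, o))
    return full


def reconstruct(base_row: Row, enh_row: Row) -> Tuple[Row, Row]:
    """Rebuild both full resolution views from the two decoded layers."""
    base_left, base_right = demultiplex(base_row)
    enh_left, enh_right = demultiplex(enh_row)
    # One multiplexer per view, as in the left view and right view
    # multiplexers of claim 20.
    return (view_multiplexer(base_left, enh_left),
            view_multiplexer(base_right, enh_right))
```

Under these assumptions the base layer row alone is a half resolution side-by-side picture that an existing frame compatible display chain can show, and combining it with the enhancement layer row restores every sample of both views.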
PCT/US2010/040545 2009-07-04 2010-06-30 Encoding and decoding architectures for format compatible 3d video delivery WO2011005624A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US13/376,707 US9774882B2 (en) 2009-07-04 2010-06-30 Encoding and decoding architectures for format compatible 3D video delivery
US15/675,058 US10038916B2 (en) 2009-07-04 2017-08-11 Encoding and decoding architectures for format compatible 3D video delivery
US16/011,557 US10798412B2 (en) 2009-07-04 2018-06-18 Encoding and decoding architectures for format compatible 3D video delivery

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US22302709P 2009-07-04 2009-07-04
US61/223,027 2009-07-04

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US13/376,707 A-371-Of-International US9774882B2 (en) 2009-07-04 2010-06-30 Encoding and decoding architectures for format compatible 3D video delivery
US15/675,058 Continuation US10038916B2 (en) 2009-07-04 2017-08-11 Encoding and decoding architectures for format compatible 3D video delivery

Publications (1)

Publication Number Publication Date
WO2011005624A1 WO2011005624A1 (en) 2011-01-13

Family

ID=42634918

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2010/040545 WO2011005624A1 (en) 2009-07-04 2010-06-30 Encoding and decoding architectures for format compatible 3d video delivery

Country Status (2)

Country Link
US (3) US9774882B2 (en)
WO (1) WO2011005624A1 (en)

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011094047A1 (en) * 2010-02-01 2011-08-04 Dolby Laboratories Licensing Corporation Filtering for image and video enhancement using asymmetric samples
WO2012006299A1 (en) 2010-07-08 2012-01-12 Dolby Laboratories Licensing Corporation Systems and methods for multi-layered image and video delivery using reference processing signals
WO2012012582A1 (en) 2010-07-21 2012-01-26 Dolby Laboratories Licensing Corporation Reference processing using advanced motion models for video coding
WO2012020358A1 (en) * 2010-08-09 2012-02-16 Koninklijke Philips Electronics N.V. Encoder, decoder, bit-stream, method of encoding, method of decoding an image pair corresponding with two views of a multi-view signal
WO2012044487A1 (en) * 2010-10-01 2012-04-05 Dolby Laboratories Licensing Corporation Optimized filter selection for reference picture processing
WO2012125228A1 (en) * 2011-03-14 2012-09-20 Qualcomm Incorporated Post-filtering in full resolution frame-compatible stereoscopic video coding
WO2012178008A1 (en) * 2011-06-22 2012-12-27 General Instrument Corporation Construction of combined list using temporal distance
WO2013040170A1 (en) * 2011-09-16 2013-03-21 Dolby Laboratories Licensing Corporation Frame-compatible full resolution stereoscopic 3d compression and decompression
WO2013049179A1 (en) * 2011-09-29 2013-04-04 Dolby Laboratories Licensing Corporation Dual-layer frame-compatible full-resolution stereoscopic 3d video delivery
WO2013049383A1 (en) * 2011-09-29 2013-04-04 Dolby Laboratories Licensing Corporation Frame-compatible full resolution stereoscopic 3d video delivery with symmetric picture resolution and quality
WO2013009716A3 (en) * 2011-07-08 2013-04-18 Dolby Laboratories Licensing Corporation Hybrid encoding and decoding methods for single and multiple layered video coding systems
EP2632162A1 (en) * 2012-02-27 2013-08-28 Thomson Licensing Method and device for encoding an HDR video image, method and device for decoding an HDR video image
WO2013138127A1 (en) 2012-03-12 2013-09-19 Dolby Laboratories Licensing Corporation 3d visual dynamic range coding
EP2667610A2 (en) 2012-05-24 2013-11-27 Dolby Laboratories Licensing Corporation Multi-layer backwards-compatible video delivery for enhanced dynamic range and enhanced resolution formats
WO2013188552A2 (en) 2012-06-14 2013-12-19 Dolby Laboratories Licensing Corporation Depth map delivery formats for stereoscopic and auto-stereoscopic displays
US20140071231A1 (en) * 2012-09-11 2014-03-13 The Directv Group, Inc. System and method for distributing high-quality 3d video in a 2d format
US8676041B2 (en) 2009-07-04 2014-03-18 Dolby Laboratories Licensing Corporation Support of full resolution graphics, menus, and subtitles in frame compatible 3D delivery
WO2014052292A1 (en) * 2012-09-27 2014-04-03 Dolby Laboratories Licensing Corporation Inter-layer reference picture processing for coding standard scalability
US9078008B2 (en) 2009-04-20 2015-07-07 Dolby Laboratories Licensing Corporation Adaptive inter-layer interpolation filters for multi-layered video delivery
US9369712B2 (en) 2010-01-14 2016-06-14 Dolby Laboratories Licensing Corporation Buffered adaptive filters
US9392280B1 (en) 2011-04-07 2016-07-12 Google Inc. Apparatus and method for using an alternate reference frame to decode a video frame
US9426459B2 (en) 2012-04-23 2016-08-23 Google Inc. Managing multi-reference picture buffers and identifiers to facilitate video data coding
US9609341B1 (en) 2012-04-23 2017-03-28 Google Inc. Video data encoding and decoding using reference picture lists
US9756331B1 (en) 2013-06-17 2017-09-05 Google Inc. Advance coded reference prediction
US11375240B2 (en) 2008-09-11 2022-06-28 Google Llc Video coding using constructed reference frames

Families Citing this family (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9538176B2 (en) 2008-08-08 2017-01-03 Dolby Laboratories Licensing Corporation Pre-processing for bitdepth and color format scalable video coding
WO2011005624A1 (en) 2009-07-04 2011-01-13 Dolby Laboratories Licensing Corporation Encoding and decoding architectures for format compatible 3d video delivery
US8665968B2 (en) * 2009-09-30 2014-03-04 Broadcom Corporation Method and system for 3D video coding using SVC spatial scalability
US9014276B2 (en) * 2009-12-04 2015-04-21 Broadcom Corporation Method and system for 3D video coding using SVC temporal and spatial scalabilities
US8908758B2 (en) 2010-01-06 2014-12-09 Dolby Laboratories Licensing Corporation High performance rate control for multi-layered video coding applications
EP2779655B1 (en) 2010-01-06 2019-05-22 Dolby Laboratories Licensing Corporation Complexity-adaptive scalable decoding and streaming for multi-layered video systems
US8938011B2 (en) 2010-01-27 2015-01-20 Dolby Laboratories Licensing Corporation Methods and systems for reference processing in image and video codecs
EP2559238B1 (en) * 2010-04-13 2015-06-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Adaptive image filtering method and apparatus
US8483271B2 (en) * 2010-05-06 2013-07-09 Broadcom Corporation Method and system for 3D video pre-processing and post-processing
CN105812828B (en) 2010-07-21 2018-09-18 杜比实验室特许公司 Coding/decoding method for the transmission of multilayer frame compatible video
US8810565B2 (en) * 2010-08-27 2014-08-19 Broadcom Corporation Method and system for utilizing depth information as an enhancement layer
US20120062698A1 (en) * 2010-09-08 2012-03-15 Electronics And Telecommunications Research Institute Apparatus and method for transmitting/receiving data in communication system
JP5740885B2 (en) * 2010-09-21 2015-07-01 セイコーエプソン株式会社 Display device and display method
US10841573B2 (en) * 2011-02-08 2020-11-17 Sun Patent Trust Methods and apparatuses for encoding and decoding video using multiple reference pictures
WO2013033596A1 (en) 2011-08-31 2013-03-07 Dolby Laboratories Licensing Corporation Multiview and bitdepth scalable video delivery
CN107241606B (en) 2011-12-17 2020-02-21 杜比实验室特许公司 Decoding system, method and apparatus, and computer readable medium
US9049445B2 (en) 2012-01-04 2015-06-02 Dolby Laboratories Licensing Corporation Dual-layer backwards-compatible progressive video delivery
US9756353B2 (en) 2012-01-09 2017-09-05 Dolby Laboratories Licensing Corporation Hybrid reference picture reconstruction method for single and multiple layered video coding systems
US9451252B2 (en) 2012-01-14 2016-09-20 Qualcomm Incorporated Coding parameter sets and NAL unit headers for video coding
EP2642755B1 (en) 2012-03-20 2018-01-03 Dolby Laboratories Licensing Corporation Complexity scalable multilayer video coding
US9998726B2 (en) * 2012-06-20 2018-06-12 Nokia Technologies Oy Apparatus, a method and a computer program for video coding and decoding
US20140002598A1 (en) * 2012-06-29 2014-01-02 Electronics And Telecommunications Research Institute Transport system and client system for hybrid 3d content service
WO2014053517A1 (en) * 2012-10-01 2014-04-10 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Scalable video coding using derivation of subblock subdivision for prediction from base layer
US9565437B2 (en) 2013-04-08 2017-02-07 Qualcomm Incorporated Parameter set designs for video coding extensions
KR102318257B1 (en) 2014-02-25 2021-10-28 한국전자통신연구원 Apparatus for multiplexing signals using layered division multiplexing and method using the same
CA3024609C (en) 2014-05-09 2020-04-07 Electronics And Telecommunications Research Institute Signal multiplexing apparatus using layered division multiplexing and signal multiplexing method
WO2015170819A1 (en) * 2014-05-09 2015-11-12 한국전자통신연구원 Signal multiplexing apparatus using layered division multiplexing and signal multiplexing method
KR102214028B1 (en) * 2014-09-22 2021-02-09 삼성전자주식회사 Application processor including reconfigurable scaler and device including the same
EP3002946A1 (en) * 2014-10-03 2016-04-06 Thomson Licensing Video encoding and decoding methods for a video comprising base layer images and enhancement layer images, corresponding computer programs and video encoder and decoders
KR102362788B1 (en) 2015-01-08 2022-02-15 한국전자통신연구원 Apparatus for generating broadcasting signal frame using layered division multiplexing and method using the same
WO2016111567A1 (en) * 2015-01-08 2016-07-14 한국전자통신연구원 Broadcasting signal frame generation apparatus and method using layered divisional multiplexing
CN107431562B (en) 2015-04-06 2021-06-18 Lg电子株式会社 Apparatus and method for transmitting and receiving broadcast signal
US10681382B1 (en) * 2016-12-20 2020-06-09 Amazon Technologies, Inc. Enhanced encoding and decoding of video reference frames
US10424082B2 (en) 2017-04-24 2019-09-24 Intel Corporation Mixed reality coding with overlays
FR3072850B1 (en) * 2017-10-19 2021-06-04 Tdf CODING AND DECODING METHODS OF A DATA FLOW REPRESENTATIVE OF AN OMNIDIRECTIONAL VIDEO
US20190246114A1 (en) 2018-02-02 2019-08-08 Apple Inc. Techniques of multi-hypothesis motion compensation
US11924440B2 (en) 2018-02-05 2024-03-05 Apple Inc. Techniques of multi-hypothesis motion compensation

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1720358A2 (en) * 2005-04-11 2006-11-08 Sharp Kabushiki Kaisha Method and apparatus for adaptive up-sampling for spatially scalable coding
WO2007047736A2 (en) * 2005-10-19 2007-04-26 Thomson Licensing Multi-view video coding using scalable video coding
WO2008010932A2 (en) * 2006-07-20 2008-01-24 Thomson Licensing Method and apparatus for signaling view scalability in multi-view video coding

Family Cites Families (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NO174610C (en) 1988-02-23 1994-06-01 Philips Nv Device for space / timed sub-sampling of digital video signals, representing a sequence of line jumps or sequential images, system for transmission of high resolution television images including such device
US5260773A (en) * 1991-10-04 1993-11-09 Matsushita Electric Corporation Of America Color alternating 3-dimensional TV system
US5621660A (en) 1995-04-18 1997-04-15 Sun Microsystems, Inc. Software-based encoder for a software-implemented end-to-end scalable video delivery system
US6173013B1 (en) 1996-11-08 2001-01-09 Sony Corporation Method and apparatus for encoding enhancement and base layer image signals using a predicted image signal
IL127274A (en) 1997-04-01 2006-06-11 Sony Corp Picture coding device, picture coding method,picture decoding device, picture decoding method, and providing medium
EP1294196A3 (en) 2001-09-04 2004-10-27 Interuniversitair Microelektronica Centrum Vzw Method and apparatus for subband encoding and decoding
US6983079B2 (en) * 2001-09-20 2006-01-03 Seiko Epson Corporation Reducing blocking and ringing artifacts in low-bit-rate coding
KR100454194B1 (en) 2001-12-28 2004-10-26 한국전자통신연구원 Stereoscopic Video Encoder and Decoder Supporting Multi-Display Mode and Method Thereof
AU2003237279A1 (en) 2002-05-29 2003-12-19 Pixonics, Inc. Classifying image areas of a video signal
JP4154569B2 (en) * 2002-07-10 2008-09-24 日本電気株式会社 Image compression / decompression device
EP1455534A1 (en) 2003-03-03 2004-09-08 Thomson Licensing S.A. Scalable encoding and decoding of interlaced digital video data
FR2852773A1 (en) 2003-03-20 2004-09-24 France Telecom Video image sequence coding method, involves applying wavelet coding on different images obtained by comparison between moving image and estimated image corresponding to moving image
US7489342B2 (en) 2004-12-17 2009-02-10 Mitsubishi Electric Research Laboratories, Inc. Method and system for managing reference pictures in multiview videos
EP1695555A1 (en) 2003-12-08 2006-08-30 Koninklijke Philips Electronics N.V. Spatial scalable compression scheme with a dead zone
KR100987775B1 (en) * 2004-01-20 2010-10-13 삼성전자주식회사 3 Dimensional coding method of video
US7953152B1 (en) 2004-06-28 2011-05-31 Google Inc. Video compression and encoding method
US20060023782A1 (en) 2004-07-27 2006-02-02 Microsoft Corporation System and method for off-line multi-view video compression
DE102004059978B4 (en) 2004-10-15 2006-09-07 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for generating a coded video sequence and decoding a coded video sequence using interlayer residue prediction, and a computer program and computer readable medium
KR100664929B1 (en) 2004-10-21 2007-01-04 삼성전자주식회사 Method and apparatus for effectively compressing motion vectors in video coder based on multi-layer
KR100732961B1 (en) 2005-04-01 2007-06-27 경희대학교 산학협력단 Multiview scalable image encoding, decoding method and its apparatus
US20070110155A1 (en) 2005-11-15 2007-05-17 Sung Chih-Ta S Method and apparatus of high efficiency image and video compression and display
JP2007174634A (en) 2005-11-28 2007-07-05 Victor Co Of Japan Ltd Layered coding and decoding methods, apparatuses, and programs
KR100977101B1 (en) 2005-11-30 2010-08-23 가부시끼가이샤 도시바 Image encoding/image decoding method and image encoding/image decoding apparatus
US8023569B2 (en) 2005-12-15 2011-09-20 Sharp Laboratories Of America, Inc. Methods and systems for block-based residual upsampling
GB0600141D0 (en) 2006-01-05 2006-02-15 British Broadcasting Corp Scalable coding of video signals
US7956930B2 (en) * 2006-01-06 2011-06-07 Microsoft Corporation Resampling and picture resizing operations for multi-resolution video coding and decoding
US20070160134A1 (en) 2006-01-10 2007-07-12 Segall Christopher A Methods and Systems for Filter Characterization
US7881552B1 (en) 2006-05-16 2011-02-01 Adobe Systems Incorporated Anti-flicker filter
US8363724B2 (en) 2006-07-11 2013-01-29 Thomson Licensing Methods and apparatus using virtual reference pictures
US20080095235A1 (en) * 2006-10-20 2008-04-24 Motorola, Inc. Method and apparatus for intra-frame spatial scalable video coding
ES2858578T3 (en) 2007-04-12 2021-09-30 Dolby Int Ab Tiled organization in video encoding and decoding
EP2143278B1 (en) 2007-04-25 2017-03-22 Thomson Licensing Inter-view prediction with downsampled reference pictures
US8487982B2 (en) 2007-06-07 2013-07-16 Reald Inc. Stereoplexing for film and video applications
US8373744B2 (en) 2007-06-07 2013-02-12 Reald Inc. Stereoplexing for video and film applications
WO2009011492A1 (en) 2007-07-13 2009-01-22 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding stereoscopic image format including both information of base view image and information of additional view image
US8848787B2 (en) 2007-10-15 2014-09-30 Qualcomm Incorporated Enhancement layer coding for scalable video coding
US8126054B2 (en) 2008-01-09 2012-02-28 Motorola Mobility, Inc. Method and apparatus for highly scalable intraframe video coding
WO2010011557A2 (en) 2008-07-20 2010-01-28 Dolby Laboratories Licensing Corporation Encoder optimization of stereoscopic video delivery systems
US8213503B2 (en) 2008-09-05 2012-07-03 Microsoft Corporation Skip modes for inter-layer residual video coding and decoding
US20100260268A1 (en) * 2009-04-13 2010-10-14 Reald Inc. Encoding, decoding, and distributing enhanced resolution stereoscopic video
EP2422522A1 (en) 2009-04-20 2012-02-29 Dolby Laboratories Licensing Corporation Directed interpolation and data post-processing
JP5416271B2 (en) 2009-04-20 2014-02-12 ドルビー ラボラトリーズ ライセンシング コーポレイション Adaptive interpolation filter for multi-layer video delivery
EP2422521B1 (en) 2009-04-20 2013-07-03 Dolby Laboratories Licensing Corporation Filter selection for video pre-processing in video applications
WO2011005624A1 (en) 2009-07-04 2011-01-13 Dolby Laboratories Licensing Corporation Encoding and decoding architectures for format compatible 3d video delivery
WO2011005625A1 (en) 2009-07-04 2011-01-13 Dolby Laboratories Licensing Corporation Support of full resolution graphics, menus, and subtitles in frame compatible 3d delivery
EP2524504A1 (en) 2010-01-14 2012-11-21 Dolby Laboratories Licensing Corporation Buffered adaptive filters
CN102742269B (en) 2010-02-01 2016-08-03 杜比实验室特许公司 Method for processing samples of an image or image sequence, and method for post-processing a decoded image
US20120075436A1 (en) 2010-09-24 2012-03-29 Qualcomm Incorporated Coding stereo video data

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1720358A2 (en) * 2005-04-11 2006-11-08 Sharp Kabushiki Kaisha Method and apparatus for adaptive up-sampling for spatially scalable coding
WO2007047736A2 (en) * 2005-10-19 2007-04-26 Thomson Licensing Multi-view video coding using scalable video coding
WO2008010932A2 (en) * 2006-07-20 2008-01-24 Thomson Licensing Method and apparatus for signaling view scalability in multi-view video coding

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SCHWARZ H ET AL: "SVC overview", ITU STUDY GROUP 16 - VIDEO CODING EXPERTS GROUP -ISO/IEC MPEG & ITU-T VCEG(ISO/IEC JTC1/SC29/WG11 AND ITU-T SG16 Q6), XX, XX, no. JVT-U145, 20 October 2006 (2006-10-20), XP030006791 *
TOURAPIS A M ET AL: "Format Extensions to the Spatially Interleaved Pictures SEI message", 30. JVT MEETING; 29-1-2009 - 2-2-2009; GENEVA, ; (JOINT VIDEO TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ),, 31 January 2009 (2009-01-31), XP030007457 *

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11375240B2 (en) 2008-09-11 2022-06-28 Google Llc Video coding using constructed reference frames
US9078008B2 (en) 2009-04-20 2015-07-07 Dolby Laboratories Licensing Corporation Adaptive inter-layer interpolation filters for multi-layered video delivery
US8676041B2 (en) 2009-07-04 2014-03-18 Dolby Laboratories Licensing Corporation Support of full resolution graphics, menus, and subtitles in frame compatible 3D delivery
US9369712B2 (en) 2010-01-14 2016-06-14 Dolby Laboratories Licensing Corporation Buffered adaptive filters
US9503757B2 (en) 2010-02-01 2016-11-22 Dolby Laboratories Licensing Corporation Filtering for image and video enhancement using asymmetric samples
WO2011094047A1 (en) * 2010-02-01 2011-08-04 Dolby Laboratories Licensing Corporation Filtering for image and video enhancement using asymmetric samples
US10531120B2 (en) 2010-07-08 2020-01-07 Dolby Laboratories Licensing Corporation Systems and methods for multi-layered image and video delivery using reference processing signals
WO2012006299A1 (en) 2010-07-08 2012-01-12 Dolby Laboratories Licensing Corporation Systems and methods for multi-layered image and video delivery using reference processing signals
US9467689B2 (en) 2010-07-08 2016-10-11 Dolby Laboratories Licensing Corporation Systems and methods for multi-layered image and video delivery using reference processing signals
WO2012012582A1 (en) 2010-07-21 2012-01-26 Dolby Laboratories Licensing Corporation Reference processing using advanced motion models for video coding
EP2675163A1 (en) 2010-07-21 2013-12-18 Dolby Laboratories Licensing Corporation Reference processing using advanced motion models for video coding
WO2012020358A1 (en) * 2010-08-09 2012-02-16 Koninklijke Philips Electronics N.V. Encoder, decoder, bit-stream, method of encoding, method of decoding an image pair corresponding with two views of a multi-view signal
US9344702B2 (en) 2010-08-09 2016-05-17 Koninklijke Philips N.V. Encoder, decoder, bit-stream, method of encoding, method of decoding an image pair corresponding with two views of a multi-view signal
WO2012044487A1 (en) * 2010-10-01 2012-04-05 Dolby Laboratories Licensing Corporation Optimized filter selection for reference picture processing
WO2012125228A1 (en) * 2011-03-14 2012-09-20 Qualcomm Incorporated Post-filtering in full resolution frame-compatible stereoscopic video coding
US9392280B1 (en) 2011-04-07 2016-07-12 Google Inc. Apparatus and method for using an alternate reference frame to decode a video frame
WO2012178008A1 (en) * 2011-06-22 2012-12-27 General Instrument Corporation Construction of combined list using temporal distance
WO2013009716A3 (en) * 2011-07-08 2013-04-18 Dolby Laboratories Licensing Corporation Hybrid encoding and decoding methods for single and multiple layered video coding systems
US8902976B2 (en) 2011-07-08 2014-12-02 Dolby Laboratories Licensing Corporation Hybrid encoding and decoding methods for single and multiple layered video coding systems
CN103814572A (en) * 2011-09-16 2014-05-21 杜比实验室特许公司 Frame-compatible full resolution stereoscopic 3D compression and decompression
US9473788B2 (en) 2011-09-16 2016-10-18 Dolby Laboratories Licensing Corporation Frame-compatible full resolution stereoscopic 3D compression and decompression
WO2013040170A1 (en) * 2011-09-16 2013-03-21 Dolby Laboratories Licensing Corporation Frame-compatible full resolution stereoscopic 3d compression and decompression
CN103814572B (en) * 2011-09-16 2017-02-22 杜比实验室特许公司 Frame-compatible full resolution stereoscopic 3D compression and decompression
WO2013049179A1 (en) * 2011-09-29 2013-04-04 Dolby Laboratories Licensing Corporation Dual-layer frame-compatible full-resolution stereoscopic 3d video delivery
US10097820B2 (en) 2011-09-29 2018-10-09 Dolby Laboratories Licensing Corporation Frame-compatible full-resolution stereoscopic 3D video delivery with symmetric picture resolution and quality
WO2013049383A1 (en) * 2011-09-29 2013-04-04 Dolby Laboratories Licensing Corporation Frame-compatible full resolution stereoscopic 3d video delivery with symmetric picture resolution and quality
WO2013127753A1 (en) * 2012-02-27 2013-09-06 Thomson Licensing Method and device for encoding an hdr video image, method and device for decoding an hdr video image
EP2632162A1 (en) * 2012-02-27 2013-08-28 Thomson Licensing Method and device for encoding an HDR video image, method and device for decoding an HDR video image
US9973779B2 (en) 2012-03-12 2018-05-15 Dolby Laboratories Licensing Corporation 3D visual dynamic range coding
WO2013138127A1 (en) 2012-03-12 2013-09-19 Dolby Laboratories Licensing Corporation 3d visual dynamic range coding
US9426459B2 (en) 2012-04-23 2016-08-23 Google Inc. Managing multi-reference picture buffers and identifiers to facilitate video data coding
US9609341B1 (en) 2012-04-23 2017-03-28 Google Inc. Video data encoding and decoding using reference picture lists
EP2667610A2 (en) 2012-05-24 2013-11-27 Dolby Laboratories Licensing Corporation Multi-layer backwards-compatible video delivery for enhanced dynamic range and enhanced resolution formats
WO2013188552A2 (en) 2012-06-14 2013-12-19 Dolby Laboratories Licensing Corporation Depth map delivery formats for stereoscopic and auto-stereoscopic displays
EP3399755A1 (en) 2012-06-14 2018-11-07 Dolby Laboratories Licensing Corp. Depth map delivery formats for stereoscopic and auto-stereoscopic displays
US9743064B2 (en) * 2012-09-11 2017-08-22 The Directv Group, Inc. System and method for distributing high-quality 3D video in a 2D format
US20140071231A1 (en) * 2012-09-11 2014-03-13 The Directv Group, Inc. System and method for distributing high-quality 3d video in a 2d format
EP3255890A2 (en) 2012-09-27 2017-12-13 Dolby Laboratories Licensing Corp. Inter-layer reference picture processing for coding-standard scalability
KR101806101B1 (en) 2012-09-27 2017-12-07 돌비 레버러토리즈 라이쎈싱 코오포레이션 Inter-layer reference picture processing for coding standard scalability
WO2014052292A1 (en) * 2012-09-27 2014-04-03 Dolby Laboratories Licensing Corporation Inter-layer reference picture processing for coding standard scalability
EP3748969A1 (en) 2012-09-27 2020-12-09 Dolby Laboratories Licensing Corporation Inter-layer reference picture processing for coding standard scalability
US9756331B1 (en) 2013-06-17 2017-09-05 Google Inc. Advance coded reference prediction

Also Published As

Publication number Publication date
US20170339429A1 (en) 2017-11-23
US20120092452A1 (en) 2012-04-19
US10798412B2 (en) 2020-10-06
US9774882B2 (en) 2017-09-26
US20180310023A1 (en) 2018-10-25
US10038916B2 (en) 2018-07-31

Similar Documents

Publication Publication Date Title
US10798412B2 (en) Encoding and decoding architectures for format compatible 3D video delivery
US10531120B2 (en) Systems and methods for multi-layered image and video delivery using reference processing signals
US10484678B2 (en) Method and apparatus of adaptive intra prediction for inter-layer and inter-view coding
EP2752000B1 (en) Multiview and bitdepth scalable video delivery
US9906804B2 (en) Reference layer sample position derivation for scalable video coding
CN107241606B (en) Decoding system, method and apparatus, and computer readable medium
DK2512136T3 (en) Tiling in video coding and decoding
US9800884B2 (en) Device and method for scalable coding of video information
EP2974313B1 (en) Inter-layer motion vector scaling for scalable video coding
KR20150043547A (en) Coding stereo video data
EP2813079A1 (en) Method and apparatus of bi-directional prediction for scalable video coding
WO2014053085A1 (en) Method and apparatus of motion information management in video coding
AU2013305370B2 (en) Method and apparatus of interlayer texture prediction
CN115668930A (en) Image encoding/decoding method and apparatus for determining sub-layer based on whether reference is made between layers, and method of transmitting bitstream

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 10730323; Country of ref document: EP; Kind code of ref document: A1)
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101)
WWE WIPO information: entry into national phase (Ref document number: 13376707; Country of ref document: US)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 10730323; Country of ref document: EP; Kind code of ref document: A1)