WO2012046239A2 - Multiview 3d compression format and algorithms - Google Patents

Multiview 3d compression format and algorithms

Info

Publication number
WO2012046239A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
view
elements
generating
decoded
Application number
PCT/IL2011/000792
Other languages
French (fr)
Other versions
WO2012046239A3 (en)
Inventor
Alain Fogel
Marc Pollefeys
Original Assignee
Nomad3D Sas
Application filed by Nomad3D Sas
Priority to US13/824,395 (published as US20130250056A1)
Publication of WO2012046239A2
Publication of WO2012046239A3


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/161Encoding, multiplexing or demultiplexing different image signal components
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • the present embodiment generally relates to the field of computer vision and graphics, and in particular, it concerns a system and format for three-dimensional (3D) encoding and rendering of multiple views.
  • Stereo 3D is produced by displaying two views, one for each eye, the LEFT and RIGHT views.
  • the main markets for S3D are similar to those for 2D.
  • The current algorithmic technology that is recommended by the MPEG forum is an extension of H.264/MPEG4 and is called H.264/MPEG4-MVC.
  • This algorithm has the advantage of keeping the bandwidth requirement to a reasonable level.
  • the power consumption on a mobile device for a typical broadcast video is multiplied by a factor close to 2.
  • Another algorithmic technology, called 3D Full-Resolution, has been developed by Dolby Laboratories, Inc.; it includes a layer on top of H.264.
  • This layer is an enhancement for Side by Side 3D HALF HD, enabling 3D FULL HD on set top boxes (STBs).
  • Multiview 3D is the next step of S3D.
  • With a multiview 3D TV, the viewer will be able to see multiple, different views of a scene. For example, different views of a football match can be seen, by the viewer moving himself and/or selecting the 3D view that the viewer wants to see, just like going around a hologram or turning a hologram in your hand. The same use case applies to tablets, with the viewer selecting a desired view by tilting the tablet.
  • Multiview 3D technology with no glasses exists already for specific markets (e.g. advertisement).
  • the available screen sets are impressive but very expensive as compared to the cost of a non-multiview screen. As such, currently available screen sets do not fit the consumer TV market yet.
  • the LCD sets are typically based on lenticular displays and typically exhibit 8 to 10 views.
  • the resolution of each view is equal to the screen resolution divided by the number of views. Projections for this market are that this limitation on the resolution of each view will be resolved in the future, and that each view will have full HD resolution.
  • The computing power and power consumption for visualizing, for example, 8 views coded with H.264-MVC is the power required for a single 2D view multiplied by the number of views (in this case 8).
  • an 8 view multiview 3D TV set consumes about 8 times as much power as a single view 3D TV set.
  • the power consumption requirements of multiview 3D TV are a challenge for decoding chips and for energy saving regulations.
  • a method for encoding data including the steps of: receiving a first set of data; receiving a second set of data; generating a first view, a second view, and associated generating-vectors; wherein the first and second views are generated by combining the first and second sets of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data.
  • the first view is the first set of data.
  • the first view includes elements that are common to the first and second sets of data.
  • the second view only contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
  • the second view contains additional information, the additional information other than information only associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
  • the second view includes elements of the first and second sets of data that are only in the second set of data.
  • the first set of data is a first two-dimensional (2D) image of a scene from a first viewing angle
  • the second set of data is a second 2D image of the scene from a second viewing angle
  • the data is in H.264 format. In another optional embodiment, the data is in MPEG4 format.
  • the method of claim 1 further includes the step of:
  • a method for decoding data including the steps of: receiving a first view and a second view, the first and second views containing information associated with elements of a first set of data and a second set of data such that the first view contains information associated with elements of the first set of data, and the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data; receiving generating-vectors associated with the first and second views, the generating-vectors indicating operations to be performed on elements of the first and second views to generate the first and second sets of data; and generating, using the first view, the second view, and the generating-vectors, at least the first set of data.
  • the method includes the step of generating, using the first view, the second view, and the generating-vectors, the second set of data.
  • a system for encoding data including: a data-receiving module configured to receive at least a first set of data and a second set of data; and a processing system containing one or more processors, the processing system being configured to generate a first view, a second view, and associated generating-vectors, wherein the first and second views are generated by combining the first and second sets of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data.
  • the system includes a storage module configured to store the first view, the second view, and the associated generating-vectors in association with each other.
  • a system for decoding data including: a data-receiving module configured to receive at least: a first view and a second view, the first and second views containing information associated with elements of a first set of data and a second set of data such that the first view contains information associated with elements of the first set of data, and the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data; and generating-vectors associated with the first and second views, the generating-vectors indicating operations to be performed on elements of the first and second views to generate the first and second sets of data; and a processing system containing one or more processors, the processing system being configured to generate, using the first view, the second view, and the generating-vectors, at least the first set of data.
  • a method for encoding data including the steps of: generating a first fused data set including a first view, a second view, and a first set of associated generating-vectors, wherein the first and second views are generated by combining a first set of data and a second set of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the first set of associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data; generating a decoded second view using the first fused data set, the decoded second view substantially the same as the second set of data; and generating a third view, and a second set of associated generating-vectors.
  • the steps of generating a decoded second view and generating a third view are repeated to generate a higher-level fused data set, the higher-level fused data set including a higher-level decoded view from a lower-level fused data set.
  • the method further includes the step of: storing the first fused data set, the third view, and the second set of associated generating-vectors in association with each other.
  • a method for decoding data including the steps of: receiving a first fused data set including a first view, a second view, and a first set of associated generating-vectors, the first and second views containing information associated with elements of a first set of data and a second set of data.
  • the first view contains information associated with elements of the first set of data
  • the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data
  • the first set of associated generating-vectors indicating operations to be performed on elements of the first and second views to render the first and second sets of data.
  • the step of generating a decoded third view is repeated to generate a higher-level decoded view using a higher-level fused data set, the higher-level fused data set including a decoded view from a lower-level fused data set.
  • the method includes the step of: generating a decoded first view using the first fused data set, the decoded first view substantially the same as the first set of data.
  • a system for encoding data including: a data-receiving module configured to receive at least a first set of data, a second set of data, and a third set of data; and a processing system containing one or more processors, the processing system being configured to: generate a first fused data set including a first view, a second view, and a first set of associated generating-vectors wherein the first and second views are generated by combining a first set of data and a second set of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the first set of associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data; generate a decoded second view using the first fused data set, the decoded second view substantially the same as the second
  • the system includes a storage module configured to store the first fused data set, the third view, and the second set of associated generating-vectors in association with each other.
  • a system for decoding data including: a data-receiving module configured to receive at least a first fused data set including a first view, a second view, and a first set of associated generating-vectors, the first and second views containing information associated with elements of a first set of data and a second set of data such that the first view contains information associated with elements of the first set of data, and the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the first set of associated generating-vectors indicating operations to be performed on elements of the first and second views to render the first and second sets of data; and a processing system containing one or more processors, the processing system being configured to: generate at least a decoded second view using the first fused data set, the decoded second view substantially the same as the second set of data; and generate a decoded third view using a second fused data set, the second fused data set including the decoded second view.
  • a computer-readable storage medium having embedded thereon computer-readable code for encoding data
  • the computer-readable code including program code for: receiving a first set of data; receiving a second set of data; generating a first view, a second view, and associated generating-vectors; wherein the first and second views are generated by combining the first and second sets of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data.
  • a computer-readable storage medium having embedded thereon computer-readable code for decoding data
  • the computer-readable code including program code for: receiving a first view and a second view, the first and second views containing information associated with elements of a first set of data and a second set of data such that the first view contains information associated with elements of the first set of data, and the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data; receiving generating-vectors associated with the first and second views, the generating-vectors indicating operations to be performed on elements of the first and second views to generate the first and second set of data; and generating, using the first view, the second view, and the generating-vectors, at least the first set of data.
  • a computer-readable storage medium having embedded thereon computer-readable code for encoding data
  • the computer-readable code including program code for: generating a first fused data set including a first view, a second view, and a first set of associated generating-vectors wherein the first and second views are generated by combining a first set of data and a second set of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the first set of associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data; generating a decoded second view using the first fused data set, the decoded second view substantially the same as the second set of data; and generating a third view, and a second set of associated generating-vectors wherein the third view is generated by combining the decoded second view and a third set of data.
  • a computer-readable storage medium having embedded thereon computer-readable code for decoding data
  • the computer-readable code including program code for: receiving a first fused data set including a first view, a second view, and a first set of associated generating-vectors, the first and second views containing information associated with elements of a first set of data and a second set of data such that the first view contains information associated with elements of the first set of data, and the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the first set of associated generating-vectors indicating operations to be performed on elements of the first and second views to render the first and second sets of data; generating at least a decoded second view using the first fused data set, the decoded second view substantially the same as the second set of data; and generating a decoded third view using a second fused data set, the second fused data set including the decoded second view, a third view, and a second set of associated generating-vectors.
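Taken together, the encoding and decoding claims above describe a compact interface: an encoder consumes two data sets and emits two views plus generating-vectors, and a decoder reverses the process. The following is a minimal, hypothetical Python sketch of that interface; the type names, the (generating number, operation code) vector layout, and the stub bodies are assumptions for illustration, not definitions from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple

Pixels = List[int]                   # one flattened view; real data is 2D imagery
GeneratingVector = Tuple[int, str]   # (generating number GN, generating operation code GOC)

@dataclass
class EncodedLRO:
    first_view: Pixels                          # e.g. the original left image
    second_view: Pixels                         # occlusions plus optional padding
    generating_vectors: List[GeneratingVector]

def encode(first: Pixels, second: Pixels) -> EncodedLRO:
    """Combine two data sets into two views plus generating-vectors."""
    raise NotImplementedError  # the fusion itself is done by stereo matching

def decode(enc: EncodedLRO) -> Tuple[Pixels, Pixels]:
    """Recover both data sets by applying the generating-vectors to the views."""
    raise NotImplementedError
```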
  • FIGURE 1 is a diagram of a fused 2D view.
  • FIGURE 2 is a diagram of exemplary GOC operations for the LRO format.
  • FIGURE 3A is a diagram of an LRO fused view format.
  • FIGURE 3B is a diagram of an RLO format.
  • FIGURE 4 is a flow diagram of processing for an MLE CODEC encoder based on the LRO format.
  • FIGURE 5 is a flow diagram of processing for an MLE CODEC decoder based on the LRO format.
  • FIGURE 6 is a flow diagram of processing for an MLE CODEC encoder based on the RLO format.
  • FIGURE 7 is a flow diagram of processing for an MLE CODEC decoder based on the RLO format.
  • FIGURE 8 is a flow diagram of processing for an MLE CODEC encoder based on a combination of the LRO and RLO formats.
  • FIGURE 9 is a flow diagram of processing for an MLE CODEC decoder based on a combination of the LRO and RLO formats.
  • FIGURE 10 is a diagram of a system for LRO and MULTIVIEW encoding and decoding.
  • the present invention is a system and method for encoding 3D content with reduced power consumption, in particular reduced decoder power consumption, as compared to conventional techniques.
  • An innovative implementation of 3D+F includes a format that contains one of the original views, in contrast to the previously taught 3D+F format, which is generated from the original views but does not contain either of them.
  • This innovative format is referred to in the context of this document as the "LRO format".
  • the LRO format can provide compatibility with two-dimensional (2D) applications.
  • the LRO format has been shown to provide images that are at least the quality of images provided by conventional encoding techniques using equivalent bandwidth, with the ability in some cases to provide higher quality images than conventional encoding techniques.
  • a significant feature of the LRO format is that encoding and decoding images requires less power than conventional encoding and decoding formats.
  • the LRO format facilitates encoding of multiple images using an innovative multiview low energy (MLE) CODEC.
  • the innovative MLE CODEC reduces the power consumption requirements of encoding and decoding 3D content, as compared to conventional techniques.
  • a significant feature of the MLE CODEC is that a decoded view from a lower processing level is used for one of the components of the LRO format for at least one higher processing level.
  • some components of the LRO level for a higher view can be derived from processing of lower views, and not all the components of the higher view need to be transmitted as part of the data for the MLE CODEC.
  • WIPO application PCT/IB2010/051311 (attorney file 4221/4) teaches a method and system for minimizing power consumption for encoding data and three-dimensional rendering.
  • This method, called 3D+F, makes use of a special format and consists of two main components: a fused view portion and a generating-vectors portion.
  • the 3D+F format taught in PCT/IB2010/051311 is referred to as the "original 3D+F format", versus the innovative format of the current invention, which is generally referred to as the "LRO format".
  • Referring to FIGURE 1, a diagram of a fused 2D view, a fused view 120 is obtained by correlating a left view 100 and a right view 110 of a scene, similar to the way the human brain derives one image from two images. The fused view is also known as a single Cyclopean view. While each of the left and right views (images) contains information only about the respective view, the fused view includes all the information necessary to render the left and right views efficiently.
  • “view” and “image” are generally used interchangeably.
  • the term "scene” generally refers to what is being viewed. A scene can include one or more objects or a place that is being viewed.
  • a scene is viewed from a location, referred to as a viewing angle.
  • In the case of stereovision, two views, each from a different viewing angle, are used. Humans perceive stereovision using one view captured by each eye.
  • two image capture devices for example video cameras, at different locations provide images from two different viewing angles for stereovision.
  • Left view 100 of a scene, in this case a single object, includes the front of the object from the left viewing angle 106 and the left side of the object 102.
  • Right view 110 includes the front of the object from the right viewing angle 116 and the right side of the object 114.
  • the fused view 120 includes information for the left side of the object 122, information for the right side of the object 124, and information for the front of the object 126. Note that while the information for the fused view left side of the object 122 may include only left view information 102, and the information for the fused view right side of the object 124 may include only right view information 114, the information for the front of the object 126 includes information from both left 106 and right 116 front views.
  • features of a fused view include:
  • a fused view can be generated without occluded elements.
  • the term element generally refers to a significant minimum feature of an image. Commonly, an element will be a pixel, but depending on the application and/or image content, an element can be a polygon or area. The term pixel is often used in this document for clarity and ease of explanation. Every pixel in a left or right view can be rendered by copying a corresponding pixel (sometimes copying more than once) from a fused view to the correct location in a left or right view.
  • the processing algorithms necessary to generate the fused view work similarly to how the human brain processes images, therefore eliminating issues such as light and shadowing of pixels.
  • the fused view of the 3D+F format does not contain any occluded pixels.
  • every pixel in the fused view is in the left, right, or both the left and right original images. There are no (occluded) pixels in the fused view that are not in either the left or the right original images.
  • a significant feature of the 3D+F format is the ability of a fused view to be constructed without the fused view containing occluded pixels. This feature should not be confused with occluded pixels in the original images, which are pixels that are visible in a first original image, but not a second original image. In this case, the pixels that are visible only in the first original image are occluded for the second original image. The pixels that are occluded for the second original image are included in the fused view, and when the fused view is decoded, these occluded pixels are used to re-generate the first original image.
  • references to pixels that are visible in one image and in another image refer to corresponding pixels as understood in the stereo literature. Due to the realities of 3D imaging technology such as stereo 3D (S3D), including, but not limited to sampling, and noise, corresponding pixels are normally not exactly the same, but depending on the application, sufficiently similar to be used as the same pixel for processing purposes.
  • One type of fused view includes more pixels than either of the original left or right views. This is the case described in reference to FIGURE 1. In this case, all the occluded pixels in the left or right views are integrated into the fused view. In this case, if the fused view were to be viewed by a user, the view is a distorted 2D view of the content.
  • Another type of fused view has approximately the same amount of information as either the original left or right view.
  • This fused view can be generated by mixing (interpolating or filtering) a portion of the occluded pixels in the left or right views with the visible pixels in both views. In this case, if the fused view were to be viewed by a user, the view will show a normal 2D view of the content. This normal (viewable) 2D fused view in the original 3D+F
  • this fused view is not similar to either the original left or original right views in the sense that besides pixels that are in both original views, the fused view includes pixels that are only in the right original view and pixels that are only in the left original view.
  • 3D+F can use either of the above-described types of fused views, or another type of fused
  • the encoding algorithm should preferably be designed to optimize the quality of the rendered views.
  • the choice of which portion of the occluded pixels to be mixed with the visible pixels in the two views and the choice of mixing operation can be done in a process of analysis by synthesis. For example, using a process in which the pixels and operations are optimally selected as a function of the rendered image quality that is continuously monitored.
  • Algorithms for performing fusion are known in the art, and are typically done using algorithms of stereo matching. Based on this description one skilled in the art will be able to choose the appropriate fusion algorithm for a specific application and modify the fusion algorithm as necessary to generate the associated generating-vectors for 3D+F.
  • a second component of the 3D+F format is a generating-vectors portion, also referred to as generic-vectors.
  • the generating-vectors portion includes a multitude of generating- vectors, more simply referred to as the generating-vectors.
  • Two types of generating-vectors are left generating-vectors and right generating-vectors used to generate a left view and right view, respectively.
  • a first element of a generating-vector is a run-length number that is referred to as a generating number (GN).
  • the generating number is used to indicate how many times an operation (defined below) on a pixel in a fused view should be repeated when generating a left or right view.
  • An operation is specified by a generating operation code, as described below.
  • a second element of a generating-vector is a generating operation code (GOC), also simply called “generating operators” or “operations”.
  • a generating operation code indicates what type of operation (for example, a function, or an algorithm) should be performed on the associated pixel(s). Operations can vary depending on the application. In a preferred implementation, at least the following operations are available (a code sketch of applying these operations follows the list):
  • Copy: copy a pixel from a fused view to the view being generated (left or right). If GN is equal to n, the pixel is copied n times.
  • Occlude: occlude a pixel; that is, do not generate a pixel in the view being generated. If GN is equal to n, do not generate n pixels, meaning that n pixels from the fused view are occluded in the view being generated.
  • Filter: the pixels are copied and then smoothed with the surrounding pixels. This operation could be used in order to improve the imaging quality, although the quality achieved without filtering is generally acceptable.
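As referenced above, the following is a minimal sketch of applying (GN, GOC) generating-vectors to a fused view to produce a left or right view. All names are illustrative; reading Copy as acting on a run of n successive fused-view pixels is an assumption about the run-length semantics, and the averaging filter is only a placeholder, since the patent leaves the filter unspecified.

```python
def apply_generating_vectors(fused, vectors):
    """Generate one view (left or right) from a flat list of fused-view pixels.

    vectors: list of (gn, goc) pairs, with goc in {"copy", "occlude", "filter"}.
    """
    out = []
    i = 0  # read position in the fused view
    for gn, goc in vectors:
        run = fused[i:i + gn]
        if goc == "copy":        # copy gn pixels into the generated view
            out.extend(run)
        elif goc == "occlude":   # skip gn fused-view pixels entirely
            pass
        elif goc == "filter":    # copy gn pixels, then smooth them
            out.extend(smooth(run))
        else:
            raise ValueError(f"unknown GOC: {goc}")
        i += gn
    return out

def smooth(run):
    # placeholder 3-tap average; the patent does not specify the filter
    if len(run) < 3:
        return list(run)
    inner = [(run[k - 1] + run[k] + run[k + 1]) // 3 for k in range(1, len(run) - 1)]
    return [run[0]] + inner + [run[-1]]
```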
  • 3D+F includes an innovative format that includes one of the original views, in contrast to the previously taught 3D+F format that is generated from the original views but does not contain either of the original two views.
  • this innovative format is referred to as LRO (left view, right occlusions) or as RLO (right view, left occlusions).
  • The LRO/RLO format is generally referred to as just the LRO format.
  • references to either the LRO or RLO apply to both formats, except where a specific construction is being described.
  • While the 3D+F format does contain the original views, in the sense that the original views can be re-generated from the 3D+F format, this should not be confused with the innovative LRO format described below.
  • the left view can be the original left view
  • the right view includes the elements occluded from the left view, in other words, the elements of the original right view that are not visible in the original left view.
  • Elements common to both the original left and right views are included in the left view of the LRO fused view. Note that in the above description, the elements included in the right view are not limited to occluded elements; the right view can also include padding information. The padding information can also be pixels that are in common with the left view.
  • Referring to FIGURE 2, a diagram of exemplary GOC operations for the LRO format, the GOC operations of the generating-vectors can be simply graphically represented as indicated by the following labels:
  • this generating-vector may be used to insert padding pixels (into the right view of the LRO fused view) that are not used in the view being generated, but are included in the fused view to increase the quality of the rendered views.
  • Padding can also be used to enable more efficient processing (such as compression) of the fused view. Padding data is added to the fused view making the fused view larger, but enabling a compression algorithm to compress the larger fused view into a smaller amount of data to be transmitted, as compared to compressing a relatively smaller fused view into a relatively larger amount of data to be transmitted.
  • the fused view can be arbitrarily generated.
  • the pixel positions on the LRO fused view can be changed as long as these pixels can be retrieved for generating the left and right views.
  • a non-limiting example of associating the generating-vectors (GVs) with the corresponding pixels on which the GVs need to act can be seen in the embodiment of FIGURE 2.
  • The B, L, and R GVs form a frame. In this frame, the GVs are located at positions such that, when pixels are sequentially retrieved from the fused view and the corresponding GVs are read, the pixels retrieved are either skipped (O GV), copied to both the left and right images (B), copied only into the left image (L), or copied only into the right image (R).
  • the value of the GV points to the operation on the corresponding pixel.
  • the map can be compressed using run length coding or other types of efficient entropy coding.
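To make "run length coding" of the GV map concrete, here is an illustrative sketch, assuming the map is a string of per-pixel labels (B, L, R, O as in FIGURE 2); the encoding is generic RLE, not a scheme specified by the patent.

```python
def rle_encode(gv_map):
    """Compress a GV map such as "BBBBLLRRO" into (count, symbol) runs."""
    runs = []
    prev, count = None, 0
    for sym in gv_map:
        if sym == prev:
            count += 1
        else:
            if prev is not None:
                runs.append((count, prev))
            prev, count = sym, 1
    if prev is not None:
        runs.append((count, prev))
    return runs

def rle_decode(runs):
    """Inverse of rle_encode."""
    return "".join(sym * count for count, sym in runs)

# rle_encode("BBBBLLRRO") -> [(4, 'B'), (2, 'L'), (2, 'R'), (1, 'O')]
```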
  • Different fused views can be generated by varying the padding in the fused views. As the padding is not used to generate the decoded original left and right views, different fused views can generate similar decoded original left and right views. Generating alternative fused views may have system benefits, including greater compressibility, or improved quality after transmission (such as by H.264 compression).
  • A key feature of a method of generating an LRO format is arranging the pixel positions of the fused view to optimize subsequent processing, typically image quality and compression.
  • an LRO fused view 300 includes a left view 302 and right occlusions view 304.
  • the left view 302 is built from the L (left) and B (both) pixels corresponding to the L generating-vectors and the B generating- vectors, respectively.
  • the L generating-vectors and B generating-vectors use the L pixels and B pixels, respectively, to generate (re-generate) the original left view.
  • the right occlusions view 304 is built from the R (right) pixels corresponding to the R generating-vectors, and optionally from padding pixels built from the O pixels (refer back to FIGURE 2) corresponding to the O generating-vectors.
  • the padding pixels can be pixels common to the right and left original images.
  • the R generating-vectors and B generating-vectors use the R pixels and B pixels, respectively, to generate (re-generate) the original right view.
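The split described above (left view from L and B pixels, right occlusions view from R pixels plus optional O padding) can be sketched as follows. This is a hypothetical illustration over flat pixel lists; real implementations operate on 2D scanlines, and the per-pixel label map corresponds to the GV frame of FIGURE 2.

```python
def build_lro(fused_pixels, labels):
    """Split a fused view into the two parts of the LRO format.

    labels: one of "L", "B", "R", "O" per fused-view pixel.
    """
    left_view = []         # L and B pixels: re-generates the original left view
    right_occlusions = []  # R pixels plus optional O padding pixels
    for pix, label in zip(fused_pixels, labels):
        if label in ("L", "B"):
            left_view.append(pix)
        elif label == "R":
            right_occlusions.append(pix)
        elif label == "O":
            right_occlusions.append(pix)  # optional padding
    return left_view, right_occlusions
```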
  • the LRO format is a method of storing data, the first step of the method being similar to the above-described method for generating an LRO fused data format for a first and second data set.
  • In the general method for storing data, the first data set can be any two-dimensional (2D) image, and the second data set is a second 2D image.
  • a method for encoding LRO format data includes the steps of receiving a first two- dimensional (2D) image of a scene from a first viewing angle and a second 2D image of the scene from a second viewing angle.
  • a first view, a second view, and associated generating- vectors are generated using the first and second 2D images.
  • the first and second views are generated by combining the first and second 2D images.
  • the first view contains information associated with elements of the first 2D image.
  • the second view contains information associated with elements of the second 2D image other than elements of the second 2D image that are in common with corresponding elements of the first 2D image.
  • the second view may also include other elements, such as padding.
  • the associated generating-vectors indicate operations to be performed on elements of the first and second views to recover the first and second 2D images.
  • the first view is substantially identical to the first 2D image.
  • the first view includes elements that are common to the first and second sets of data.
  • the second view only contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
  • the second view contains additional information.
  • the additional information is information other than information only associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
  • the additional information is padding, which can be implemented as elements common to the first and second sets of data.
  • a method for decoding LRO format data includes the step of providing a first view and a second view. Similar to the description above in reference to encoding LRO format data, the first and second views contain information associated with elements of a first 2D image and a second 2D image.
  • the first view contains information associated with elements of the first 2D image.
  • the second view contains information associated with elements of the second 2D image other than elements of the second 2D image that are in common with corresponding elements of the first 2D image. As described above, the second view may also include other elements, such as padding.
  • The associated generating-vectors indicate operations to be performed on elements of the first and second views to render the first and second 2D images. Using the first view, the second view, and the associated generating-vectors, at least the first 2D image is rendered.
  • The first view is the first 2D image, so rendering the first 2D image can be done by simply extracting the first view.
  • Using the first view, the second view, and the associated generating-vectors, the second 2D image can be rendered.
  • Generating-vectors are also generated, and can be included in the left view, right occlusions view, or preferably in a separate file.
  • LRO fused view refers to both (or in general, all) of the generated files.
  • the two views can be separately encoded using different settings in the H.264/MPEG4 compression scheme.
  • the resulting LRO fused view achieves a compression with H.264/MPEG4 that is at least as good as the compression of the previously taught 3D+F format of FIGURE 1.
  • the good compression ratio results in part from the right side view being compact. Also, since the right side view consists of right occlusions that have a reasonable degree of coherence from frame to frame, the right view tends to have good H.264/MPEG4 compression.
  • Compared to the original 3D+F format previously taught in reference to FIGURE 1, the LRO format achieves improved H.264/MPEG4 compression. The original 3D+F format requires padding pixels in order to preserve inter-line pixel coherence, and therefore is not as compact as the innovative LRO format described in reference to FIGURE 3A.
  • the LRO format facilitates arranging the pixel positions of the fused view to optimize subsequent processing.
  • the pixels of the right occlusions view 304 are the pixels that are occluded from the left view (the elements of the original right image that are not visible in the left image and optional padding).
  • the quality of an image decoded from the right occlusions view may be increased by padding the right occlusions view.
  • the decoded image can be monitored and the quality of the decoded image used for feedback to the fused view generator, modifying how padding is applied in the generation of the fused views.
  • padding can be applied to increase the compression ratio of the fused view, in particular for the right occlusions view.
  • When the right occlusions view is padded, the compression algorithm being used processes the data of the right occlusions view similarly to processing of an original image.
  • the compression ratio of the padded right view can be higher than the compression ratio of an un-padded right occlusions view.
  • padding can optionally be used in the original 3D+F format, but this padding affects the quality of the rendered view, and increases the size of the fused view by a relatively larger amount.
  • When padding is used in the original 3D+F format, a larger amount of data is added to the fused view than the relatively smaller amount of data added when padding is added to the fused view of the LRO format.
  • padding affects the compression ratio of the data of the fused view, with a relatively smaller amount of data added only to the right occlusions view ("RO", the part built from the right (R) pixels and optional padding).
  • This can be seen by comparing FIGURE 3A and FIGURE 1: the left view 302 is comparable in size to the front of the object 126, and the right occlusions view 304 is relatively smaller than the left view 302 to begin with, so the padding added to the LRO format right occlusions view 304 is less than the occlusions added to the front of the object 126.
  • Referring to FIGURE 3B, a diagram of an RLO format, a right view-left occlusions (RLO) format can be derived from the fused view.
  • the RLO format is similar to the LRO format.
  • An RLO fused view 310 includes a right view 312 and left occlusions view 314.
  • the right view 312 is built from the R (right) and B (both) pixels corresponding to the R generating-vectors and the B generating-vectors, respectively.
  • the R generating-vectors and B generating-vectors use the R pixels and B pixels, respectively, to generate (re-generate) the original right view.
  • the left occlusions view 314 is built from the L (left) pixels corresponding to the L generating-vectors and optionally from the O pixels, similar to the description of FIGURE 2.
  • the L generating-vectors and B generating-vectors use the L pixels and B pixels, respectively, to generate (re-generate) the original left view.
  • This innovative multiview 3D CODEC reduces the power consumption requirements of encoding and decoding 3D content, as compared to conventional techniques.
  • This innovative multiview 3D CODEC is referred to in the context of this document as a multiview low energy CODEC, or simply MLE CODEC.
  • a feature of the MLE CODEC is a much lower power requirement as compared to conventional multiview CODECs.
  • Generating (also referred to in the industry as synthesizing) decoded views during encoding is a significant feature of the current invention, and has been shown to be less power consuming than implementations which synthesize views at the decoder/receiver stage.
  • multiview 3D is herein described using the LRO (RLO) format. It is foreseen that based on the above description of the LRO format, and the below-description of multiview 3D, modifications to the LRO format for specific applications are possible, and any format that supports the innovative characteristics of the LRO format to facilitate multiview 3D can be used for multiview 3D.
  • modifications to the LRO format include changing the structure of the frames (refer again to FIGURE 3), while keeping the meaningful information substantially intact.
  • Referring to FIGURE 4, a flow diagram of processing for an MLE CODEC encoder based on the LRO format, a non-limiting example of 5 views is encoded.
  • the MLE CODEC encoder is also referred to as simply the MLE encoder. Based on this description, one skilled in the art will be able to extend this encoding method to an arbitrary number of views, including more views and fewer views. Below, further embodiments are described based on the RLO format, and on both LRO and RLO formats. Five original views, original view 1 (401), original view 2 (402), original view 3 (403), original view 4 (404), and original view 5 (405), are encoded to produce a smaller amount of data, as compared to the amount of data in the original views.
  • Original view 1 (401) and original view 2 (402) are used to generate LRO format for views 1 and 2, LRO12 (412), similar to the above described method for S3D with original view 1 (401) used as the left view and original view 2 (402) used as the right view.
  • LRO12 contains a left view L12 [the part built from the left (L) and both (B) pixels], a right occlusions view RO12 [the part built from the right (R) pixels, and optionally from the occlusion (O) pixels, for example padding], and generating-vectors GV12 (generating-vectors to regenerate the original left and right views).
  • After generating LRO12, LRO12 is decoded to generate decoded view 2 (402D). While theoretically decoded view 2 (402D) can be the same as original view 2 (402), depending on the application and encoding parameters chosen, decoded view 2 (402D) and original view 2 (402) may be more or less similar. In other words, the quality of decoded view 2 (402D) as compared to original view 2 (402) may be substantially the same, or lower.
  • In the context of this document, decoded view 2 (402D) and original view 2 (402) are substantially the same, meaning that for a given application the differences between the two views are below a given threshold.
  • Decoded view 2 (402D) and original view 3 (403) are used to generate LRO format LRO23 (423), similar to the above described method for LRO12.
  • LRO23 contains a left view L23, a right occlusions view RO23, and generating-vectors GV23.
  • a significant feature of this encoding method is that decoded view 2 (402D) is used for left view L23.
  • L23, RO23, and GV23 will all be required by the MLE CODEC decoder.
  • However, since decoded view 2 (402D) is used for left view L23, left view L23 does not have to be part of the produced data for LRO23.
  • Fused view LRO12 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 2 (402D), which can be used as the L23 part of LRO23.
  • LRO23 does not fully contribute to the bit rate and bandwidth required for transmission, as L23 does not need to be transmitted. This contributes significantly to the bandwidth savings.
  • The view that is transmitted contains only the right occlusion and optional padding (RO) pixels, which is generally a smaller amount of data than the view not transmitted (L23).
  • Decoded view 3 (403D) and original view 4 (404) are used to generate LRO format LRO34 (434), similar to the above described method for LRO12.
  • LRO34 contains a left view L34, a right occlusions view RO34, and generating-vectors GV34. Similar to the description in reference to left view L23, decoded view 3 (403D) is used for left view L34.
  • L34, RO34, and GV34 will all be required by the MLE CODEC decoder. However, since decoded view 3 (403D) is used for left view L34, left view L34 does not have to be part of the produced data for LRO34.
  • Data for fused view LRO23 is already available from transmitted data, and can be used by the MLE CODEC decoder to produce decoded view 3 (403D), which can be used as the L34 part of LRO34.
  • LRO34 does not fully contribute to the bit rate and bandwidth required for transmission, as L34 does not need to be transmitted.
  • The view that is transmitted contains only the right occlusion and optional padding (RO) pixels, which is generally a smaller amount of data than the view not transmitted (L34).
  • Decoded view 4 (404D) is substantially the same as original view 4 (404).
  • Decoded view 4 (404D) and original view 5 (405) are used to generate LRO format LRO45 (445), similar to the above described method for LRO12.
  • LRO45 contains a left view L45, a right occlusions view RO45, and generating-vectors GV45.
  • Decoded view 4 (404D) is used for left view L45.
  • L45, RO45, and GV45 will all be required by the MLE CODEC decoder.
  • Since decoded view 4 (404D) is used for left view L45, left view L45 does not have to be part of the produced data for LRO45.
  • Fused view LRO34 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 4 (404D), which can be used as the L45 part of LRO45.
  • LRO45 does not fully contribute to the bit rate and bandwidth required for transmission, as L45 does not need to be transmitted.
  • the original data for the current example includes 5 original views.
  • the data produced by the encoder includes only one original view [original view 1 (401)], left view L12, with four right views RO12, RO23, RO34, and RO45, and correspondingly only four sets of generating-vectors GV12, GV23, GV34, and GV45. It will be obvious to one skilled in the art that the views can be combined in an arbitrary order, with different combinations requiring potentially different amounts of processing power, producing different resulting compression ratios, and different quality decoded images.
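The chain just described can be summarized in a short sketch. The two helpers are hypothetical stand-ins for the LRO fusion and rendering steps described above: encode_lro(left, right) is assumed to return (left_view, right_occlusions, generating_vectors), and decode_lro(left_view, right_occlusions, generating_vectors) to return the two decoded views. Only the parts the decoder cannot re-derive are kept in the output.

```python
def mle_encode(views, encode_lro, decode_lro):
    """Encode original views [v1, v2, ..., vN] into an MLE stream."""
    l12, ro12, gv12 = encode_lro(views[0], views[1])  # full first level
    stream = {"L12": l12, "RO": [ro12], "GV": [gv12]}
    # decode view 2 exactly as the receiver will, then chain upward
    _, decoded = decode_lro(l12, ro12, gv12)
    for nxt in views[2:]:
        # the decoded lower view serves as the left part (e.g. L23),
        # so it is used during encoding but never stored or transmitted
        _, ro, gv = encode_lro(decoded, nxt)
        stream["RO"].append(ro)
        stream["GV"].append(gv)
        _, decoded = decode_lro(decoded, ro, gv)
    return stream
```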
  • the multiview (fused data) format is a method of storing data, the first step of the method being similar to the above-described method for generating an LRO fused data format for a first and second data set.
  • the general method for storing data can be used for encoding data, in this case 2D images.
  • Generating data in the multiview format can be done by an MLE CODEC encoder.
  • a first fused data set includes a first view, a second view, and a first set of associated generating-vectors.
  • the first and second views are generated by combining a first set of data and a second set of data.
  • the first view contains information associated with elements of the first set of data.
  • the first view contains only information associated with elements of the first set of data.
  • the first view is the first set of data. Note that this first view is not exclusive, in that the first view does not exclude information that is also associated with elements of the second set of data.
  • the second view contains information associated with elements of the second set of data, preferably other than elements of the second set of data that are in common with corresponding elements of the first set of data, except for optional padding.
  • the second view contains information associated with elements of the second set of data that are not in common with corresponding elements of the first set of data, except for optional padding.
  • the first set of associated generating-vectors indicates operations to be performed on the elements of the first and second views to recover the first and second sets of data.
  • each set of views has an associated set of generating-vectors.
  • associated generating-vectors generally refers to the generating-vectors associated with the two views of the LRO fused data format for which the vectors are used to generate the original (or decoded) two images.
  • the next step in storing data in the multiview format is generating a decoded second view using the first fused data set.
  • Decoding can be done using the technique described above for decoding the LRO fused data format.
  • the decoded second view is substantially the same as the second set of data.
  • the next step can be thought of as generating a second fused data set.
  • the second fused data set includes the decoded second view, a third view, and a second set of associated generating-vectors. Practically, generating a formal second fused data set is not necessary.
  • the decoded second view has already been generated, and the decoded second view does not need to be stored nor transmitted in the multiview format.
  • a third view and a second set of associated generating-vectors need to be generated and retained (stored or transmitted). The third view is generated using the decoded second view and a third set of data.
  • a significant feature of the MLE CODEC encoder and storing data in the multiview format is that the decoded second view is used as one of the views in the fused data set.
  • the decoded second view is similar to the previously described first view, in that the decoded second view is not exclusive, that is, the decoded second view does not exclude information that is also associated with elements of the third set of data.
  • the third view is generated by combining the decoded second view and a third set of data, such that the third view contains information associated with elements of the third set of data other than elements of the third set of data that are in common with corresponding elements of the decoded second view, except for optional padding. Similar to the second view in the first fused data set, the third view contains information associated with elements of the third set of data that are not in common with corresponding elements of the decoded second view.
  • the second set of associated generating-vectors indicates operations to be performed on the elements of the decoded second view and the third view to recover the second and third sets of data.
  • the method is repeated similarly to the step of generating the second fused data set.
  • higher-level refers to a subsequently encoded or decoded fused data set
  • lower-level refers to a previously encoded (or decoded) fused data set.
  • the lowest level image encoded (or decoded) is referred to as level 1
  • the next image encoded (or decoded) is level 2, and so forth.
  • encoding a third image uses the lower-level second image (decoded second image) to generate a third-level fused data set.
  • Decoding a fourth image uses the previous, lower-level fused data set (a third-level fused data set) to generate the decoded fourth-level image.
  • the above-described method for generating multiview fused data can be repeated to generate a higher-level fused data set, the higher-level fused data set including a higher-level decoded view from a lower-level fused data set.
  • Based on the above description, one skilled in the art will be able to expand the currently described method for multiview MLE encoding to an arbitrary number of sets of data.
  • portions of the fused data sets are stored.
  • a significant feature of the multiview format, and corresponding MLE CODEC, is that only the portions of the fused data format that are needed for decoding need to be retained.
  • Retaining includes, but is not limited to, storing the retained data in a non-volatile memory or in temporary storage.
  • Temporary storage includes data that is generated for transmission, even if the data will not be retained by the generating system after transmission of the data.
  • the entire first fused data set is retained, including the first view, second view, and first set of generating-vectors.
  • For the second and additional data sets, as one of the views (for example, a left view) can be generated by the previous level's decoding, only the other view (for example, a right view) and another set of generating-vectors need to be retained.
  • Temporary storage during encoding includes portions of the fused data format that are not retained.
  • decoded views are generated for use in generating the next level fused data set.
  • the decoded views are not necessary for decoding the multiple views from the stored and/or transmitted data.
  • storage of additional data may be desired.
  • One example is storing additional data during encoding or decoding to improve processing.
  • Another example is storing one or more decoded views during testing, or periodically during operation to verify the operation, processing, and/or quality of the CODEC.
  • Referring to FIGURE 5, a flow diagram of processing for an MLE CODEC decoder based on the LRO format, the non-limiting example of FIGURE 4 is continued.
  • the MLE CODEC decoder is also referred to as simply the MLE decoder. Based on this description, one skilled in the art will be able to extend this decoding method to an arbitrary number of views, including more views and fewer views.
  • The LRO12 fused data format (412) includes transmitted data L12, RO12, and GV12.
  • LRO12 is decoded to generate decoded view 1 (501D) and decoded view 2 (502D). This decoding is similar to the decoding described above in reference to the LRO format for S3D.
  • decoded view ⁇ N> is substantially the same as original view ⁇ N>.
  • decoded view 1 can be extracted from LRO12 as the L12 part.
  • An alternative drawing could represent item 501D as being extracted from LRO12 via arrow 500, and item 501D not being striped. For consistency, the striped notation for all decoded views is maintained in the figures.
  • LRO23 (523) is decoded to generate decoded view 3 (503D).
  • the format for LRO23 contains a left view L23, a right occlusions view RO23, and generating-vectors GV23.
  • a significant feature of this decoding method is that decoded view 2 (502D) is used for left view L23. Since decoded view 2 (502D) is used for left view L23, the MLE CODEC decoder does not have to receive L23 as part of the data transmission. As described above, since L23 is not needed by the decoder, L23 is not produced or transmitted as part of LRO23. RO23 and GV23 are transmitted as part of the multiview transmission to the decoder. RO23 and GV23 are used with the generated decoded view 2 (502D), which is L23, to form LRO23. LRO23 is then decoded to generate decoded view 3 (503D).
  • After generating decoded view 3 (503D), the method repeats, using decoded view 3 (503D) as left view L34 of LRO34 (534). Data received by the MLE CODEC decoder for right occlusions view RO34 and generating-vectors GV34 completes the LRO34 fused data format. LRO34 is then decoded to generate decoded view 4 (504D).
  • the method repeats, using decoded view 4 (504D) as left view L45 of LR045 (545).
  • Data received by the MLE CODEC decoder for right occlusions view R045 and generating-vectors GV45 completes the LR045 fused data format.
  • LR045 is then decoded to generate decoded view 5 (505D).
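As a hedged illustration only, the chained decoding of FIGURE 5 could be organized as below. The helper decode_fused is a hypothetical placeholder for the LRO fused-view decoding described above for S3D; it is not defined by this sketch.

    def decode_fused(left_view, occlusions_view, generating_vectors):
        # Placeholder for the LRO fused-view decode of the S3D case:
        # returns (decoded_left, decoded_right). Not specified here.
        raise NotImplementedError

    def mle_decode_lro(l12, transmitted_levels):
        """transmitted_levels: [(RO12, GV12), (RO23, GV23), ...] in order.
        Returns decoded views 1..N for N = len(transmitted_levels) + 1."""
        views = []
        left = l12  # L12 is transmitted in full at the first level only
        for ro, gv in transmitted_levels:
            dec_left, dec_right = decode_fused(left, ro, gv)
            if not views:
                views.append(dec_left)   # decoded view 1
            views.append(dec_right)      # decoded view n+1
            left = dec_right             # reused as the next level's left view
        return views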
  • The MLE CODEC decoder is a subset of the MLE CODEC encoder. As such, savings can be achieved in hardware and/or software implementation by a careful reuse of modules, specifically by implementing the encoder to allow for dual-use as a decoder.
  • The first step of a method for decoding the multiview (fused data) format is similar to the above-described method for decoding an LRO fused data format.
  • In the general method for decoding, the sets of data are two-dimensional (2D) images. Other types of data can also be used for decoding; for clarity, the data sets are referred to as 2D images.
  • Decoding data in the multiview format can be done by an MLE CODEC decoder.
  • A first fused data set includes a first view, a second view, and a first set of associated generating-vectors.
  • The first and second views contain information associated with elements of a first 2D image and a second 2D image, and the first set of associated generating-vectors indicates operations to be performed on elements of the first and second views to render the first and second 2D images.
  • At least a decoded second view is rendered from the first fused data set.
  • The decoded second view is substantially the same as the second 2D image.
  • Optionally, a decoded first view is also rendered using the first fused data set. The decoded first view is substantially the same as the first 2D image.
  • A third view and a second set of associated generating-vectors are provided, which in combination with the decoded second view are used to render a decoded third view.
  • The decoded second view, third view, and second set of associated generating-vectors are portions of a second fused data set. As the decoded second view has been rendered from the first fused data set, the decoded second view does not need to be received as part of the second fused data set.
  • The third view contains information associated with elements of a third 2D image other than elements of the third 2D image that are in common with corresponding elements of the second 2D image, except for optional padding pixels.
  • The second set of associated generating-vectors indicates operations to be performed on elements of the decoded second view and the third view to render the decoded third view.
  • The decoded third view is substantially the same as the third 2D image.
  • The above-described method for decoding multiview fused data can be repeated to decode a higher-level fused data set, the higher-level fused data set including a higher-level decoded view from a lower-level fused data set.
  • This provides MLE decoding for an arbitrary number of sets of data.
  • Referring to FIGURE 6, a flow diagram of processing for an MLE CODEC encoder based on the RLO format, a non-limiting example of 5 views is encoded.
  • The current example of an RLO encoder is similar to the above-described non-limiting example of an LRO encoder.
  • Five original views, original view 1 (601), original view 2 (602), original view 3 (603), original view 4 (604), and original view 5 (605), are encoded to produce a smaller amount of data, as compared to the amount of data in the original views.
  • Original view 1 (601) and original view 2 (602) are used to generate RLO format for views 1 and 2, RLO12 (612), similar to the above-described method for S3D, with original view 1 (601) used as the left view and original view 2 (602) used as the right view.
  • RLO12 contains a right view R12 [the part built from the right (R) and both (B) pixels], a left occlusions view LO12 [the part built from the left (L) pixels], and generating-vectors RGV12 (generating-vectors to regenerate the original left and right views).
  • References to the generating-vectors of the LRO format are of the form <GVnn>, while generating-vectors of the RLO format are of the form <RGVnn>.
  • R12, LO12, and RGV12 will all be required by the MLE CODEC decoder, so are part of the produced data, and fully contribute to the bit rate and bandwidth required for transmission.
  • RLO12 is decoded to generate decoded view 2 (602D). While theoretically decoded view 2 (602D) can be the same as original view 2 (602), depending on the application and encoding parameters chosen, decoded view 2 (602D) and original view 2 (602) may be more or less similar. In other words, the quality of decoded view 2 (602D) as compared to original view 2 (602) may be substantially the same, or a lower quality. In general, the two views, decoded view 2 (602D) and original view 2 (602), are substantially the same, meaning that for a given application the differences between the two views are below a given threshold.
  • Decoded view 2 (602D) and original view 3 (603) are used to generate RLO format RLO23 (623), similar to the above-described method for RLO12.
  • RLO23 contains a right view R23, a left occlusions view LO23, and generating-vectors RGV23.
  • A significant feature of the method of this encoding is that decoded view 2 (602D) is used for right view R23.
  • R23, LO23, and RGV23 will all be required by the MLE CODEC decoder.
  • However, since decoded view 2 (602D) is used for right view R23, right view R23 does not have to be part of the produced data for RLO23.
  • Fused view RLO12 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 2 (602D), which can be used as the R23 part of RLO23.
  • RLO23 does not fully contribute to the bit rate and bandwidth required for transmission, as R23 does not need to be transmitted. This contributes significantly to the bandwidth savings.
  • The view that is transmitted contains only the left occlusion (LO) pixels and optional padding pixels, which is generally a smaller amount of data than the view not transmitted (R23).
  • Decoded view 3 (603D) is substantially the same as original view 3 (603).
  • Decoded view 3 (603D) and original view 4 (604) are used to generate RLO format RLO34 (634), similar to the above-described method for RLO12.
  • RLO34 contains a right view R34, a left occlusions view LO34, and generating-vectors RGV34. Similar to the description in reference to right view R23, decoded view 3 (603D) is used for right view R34.
  • R34, LO34, and RGV34 will all be required by the MLE CODEC decoder.
  • However, since decoded view 3 (603D) is used for right view R34, right view R34 does not have to be part of the produced data for RLO34.
  • Data for fused view RLO23 is already available, and can be used by the MLE CODEC decoder to produce decoded view 3 (603D), which can be used as the R34 part of RLO34.
  • From RLO34, only the LO34 and RGV34 parts need to be transmitted.
  • RLO34 does not fully contribute to the bit rate and bandwidth required for transmission, as R34 does not need to be transmitted. The view that is transmitted contains only the left occlusion (LO) pixels and optional padding pixels, which is generally a smaller amount of data than the view not transmitted (R34).
  • Decoded view 4 (604D) is substantially the same as original view 4 (604).
  • Decoded view 4 (604D) and original view 5 (605) are used to generate RLO format RLO45 (645), similar to the above-described method for RLO12. RLO45 contains a right view R45, a left occlusions view LO45, and generating-vectors RGV45.
  • Decoded view 4 (604D) is used for right view R45.
  • R45, LO45, and RGV45 will all be required by the MLE CODEC decoder.
  • However, since decoded view 4 (604D) is used for right view R45, right view R45 does not have to be part of the produced data for RLO45. Fused view RLO34 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 4 (604D), which can be used as the R45 part of RLO45.
  • From RLO45, only the LO45 and RGV45 parts need to be transmitted.
  • RLO45 does not fully contribute to the bit rate and bandwidth required for transmission, as R45 does not need to be transmitted. The full encoding chain is sketched below.
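The full RLO encoding chain can be pictured as follows. The helpers encode_rlo and decode_rlo are hypothetical placeholders for the fused-view encode and decode steps described above; the orientation of each pair follows the walkthrough, with the previous level's decoded view reused as the right view.

    def encode_rlo(left_view, right_view):
        # Placeholder: fuse the pair into (R, LO, RGV). Not specified here.
        raise NotImplementedError

    def decode_rlo(r, lo, rgv):
        # Placeholder: recover (decoded_left, decoded_right) from the fused set.
        raise NotImplementedError

    def mle_encode_rlo(originals):
        """originals: [view 1, view 2, ..., view N], N >= 2. Returns the produced
        data: the first level in full, then only (LO, RGV) pairs."""
        r, lo, rgv = encode_rlo(originals[0], originals[1])   # RLO12
        produced = [(r, lo, rgv)]            # all of level 1 is transmitted
        decoded = decode_rlo(r, lo, rgv)[1]  # decoded view 2
        for original in originals[2:]:
            # The previous decoded view serves as the right view R<nn>,
            # so R<nn> is not added to the produced data.
            r, lo, rgv = encode_rlo(original, decoded)
            produced.append((lo, rgv))
            decoded = decode_rlo(r, lo, rgv)[0]  # next decoded view
        return produced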
  • The original data for the current example includes 5 original views.
  • The data produced by the encoder includes only one original view [original view 1 (601)], as right view R12, four left occlusions views LO12, LO23, LO34, and LO45, and correspondingly only four sets of generating-vectors RGV12, RGV23, RGV34, and RGV45. An illustrative size comparison follows.
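The bandwidth implication can be illustrated with made-up relative sizes; the numbers below are assumptions for illustration, not measurements. One full view, four occlusions views, and four generating-vector sets are transmitted instead of five full views.

    FULL_VIEW = 100.0   # assumed relative cost of one full view
    OCCLUSIONS = 15.0   # assumed relative cost of one occlusions view
    GVECS = 5.0         # assumed relative cost of one generating-vector set

    views = 5
    brute_force = views * FULL_VIEW                        # 500.0
    mle = FULL_VIEW + (views - 1) * (OCCLUSIONS + GVECS)   # 180.0
    # Under these assumptions, MLE transmits 36% of the brute-force data.
    print(f"MLE transmits {mle / brute_force:.0%} of the brute-force data")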
  • Referring to FIGURE 7, a flow diagram of processing for an MLE CODEC decoder based on the RLO format, the non-limiting example of FIGURE 6 is continued.
  • The current example of an RLO decoder is similar to the above-described non-limiting example of an LRO decoder.
  • The RLO12 fused data format (612) includes transmitted data R12, LO12, and RGV12.
  • RLO12 is decoded to generate decoded view 1 (701D) and decoded view 2 (702D). This decoding is similar to the decoding described above in reference to the LRO format for S3D. As described above, in general decoded view <N> is substantially the same as original view <N>.
  • Decoded view 1 (701D) can be extracted from RLO12 as the R12 part. An alternative drawing could represent item 701D as being extracted from RLO12 via arrow 700, and item 701D not being striped. For consistency, the striped notation for all decoded views is maintained in the figures.
  • RLO23 (723) is decoded to generate decoded view 3 (703D).
  • The format for RLO23 contains a right view R23, a left occlusions view LO23, and generating-vectors RGV23.
  • A significant feature of the method of this decoding is that decoded view 2 (702D) is used for right view R23. Since decoded view 2 (702D) is used for right view R23, the MLE CODEC decoder does not have to receive R23 as part of the data transmission. As described above, since R23 is not needed by the decoder, R23 is not produced or transmitted as part of RLO23. LO23 and RGV23 are transmitted as part of the multiview transmission to the decoder. LO23 and RGV23 are used with the generated decoded view 2 (702D), which is R23, to form RLO23. RLO23 is then decoded to generate decoded view 3 (703D).
  • After generating decoded view 3 (703D), the method repeats, using decoded view 3 (703D) as right view R34 of RLO34 (734). Data received by the MLE CODEC decoder for left occlusions view LO34 and generating-vectors RGV34 completes the RLO34 fused data format. RLO34 is then decoded to generate decoded view 4 (704D).
  • After generating decoded view 4 (704D), the method repeats, using decoded view 4 (704D) as right view R45 of RLO45 (745).
  • Data received by the MLE CODEC decoder for left occlusions view LO45 and generating-vectors RGV45 completes the RLO45 fused data format.
  • RLO45 is then decoded to generate decoded view 5 (705D).
  • The MLE CODEC decoder is a subset of the MLE CODEC encoder. As such, savings can be achieved in hardware and/or software implementation by a careful reuse of modules, specifically by implementing the encoder to allow for dual-use as a decoder.
  • MLE CODEC Encoder Based on a Combination of the LRO and RLO Formats
  • The decoded views are normally substantially the same as the original images, but typically not exactly the same as the original images.
  • The errors in the decoded views affect the quality of subsequent decoded views.
  • For MLE encoders based on only either the LRO format or the RLO format, there is typically a decrease in the quality of the decoded images as the CODEC progresses from lower to higher levels. A greater number of processing levels may produce a greater decrease in the quality of the decoded image for higher levels, as compared to the original image.
  • Variations of the MLE CODEC can be specified using a combination or mixture of LRO and RLO formats. Using a combination of formats can reduce the number of processing levels required to encode a given number of original images, thereby increasing the quality of the decoded images.
  • Referring to FIGURE 8, a flow diagram of processing for an MLE CODEC encoder based on a combination of the LRO and RLO formats, a non-limiting example of 5 views is encoded.
  • The LRO format CODEC described in reference to FIGURE 4 has four processing levels (to generate LRO12, LRO23, LRO34, and LRO45).
  • The combination CODEC described below in reference to FIGURE 8 has only two processing levels on the LRO pipeline branch (to generate LRO34 and LRO45) and three processing levels (to generate LRO34, RLO32, and RLO21) on the branch for the RLO pipeline.
  • The root of both the LRO and RLO branches is the same (LRO34), and only has to be generated once. Due to fewer processing levels in a branch when using a combination of formats, the resulting quality of decoded images can be better than the quality of decoded images when using a single format.
  • LRO and RLO branches of processing levels can be implemented in parallel, in serial, or in a combination of processing orders; a sketch of the branch layout follows. Depending on the application, more than one root and more than two branches can also be used. In the description below, the LRO branch is described first.
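For illustration, the FIGURE 8 layout can be summarized as a small plan structure; the dictionary layout below is an assumption of this sketch, not of the format.

    COMBINATION_PLAN = {
        "root": "LRO34",                   # encodes views 3 and 4; generated once
        "lro_branch": ["LRO45"],           # view 5 from decoded view 4
        "rlo_branch": ["RLO32", "RLO21"],  # views 2 and 1 from decoded views 3 and 2
    }

    def max_branch_depth(plan):
        """Processing levels on the deepest branch, counting the shared root."""
        return 1 + max(len(plan["lro_branch"]), len(plan["rlo_branch"]))

    # Three levels on the deepest branch, versus four for the pure LRO chain.
    assert max_branch_depth(COMBINATION_PLAN) == 3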
  • Original view 3 (403) and original view 4 (404) are used to generate LRO format for views 3 and 4, LRO34 (834), similar to the above-described method for S3D, with original view 3 (403) used as the left view and original view 4 (404) used as the right view.
  • LRO34 contains a left view L34 [the part built from the left (L) and both (B) pixels], a right occlusions view RO34 [the part built from the right (R) pixels and optional padding pixels], and generating-vectors GV34 (generating-vectors to regenerate the original left and right views). Note that in the preferred embodiment described above in reference to the LRO format, all of the pixels for original view 3 (403) are in the left view (L34) and only pixels for original view 4 (404) are in the right occlusions view (RO34).
  • L34, RO34, and GV34 will all be required by the MLE CODEC decoder, so are part of the produced data, and fully contribute to the bit rate and bandwidth required for transmission.
  • Original view 3 (403) and original view 4 (404) do not need to be transmitted.
  • In the figures, items that do not need to be transmitted are striped, while items to be transmitted are not filled in.
  • LRO34 is decoded to generate decoded view 4 (804D) and decoded view 3 (803D). While theoretically decoded view 4 (804D) and decoded view 3 (803D) can be the same as original view 4 (404) and original view 3 (403), respectively, depending on the application and encoding parameters chosen, the decoded and respective original views may be more or less similar.
  • Decoded view 4 (804D) and original view 5 (405) are used to generate LRO format LRO45 (845), similar to the above-described method for LRO34.
  • LRO45 contains a left view L45, a right occlusions view RO45, and generating-vectors GV45.
  • A significant feature of the method of this encoding is that decoded view 4 (804D) is used for left view L45.
  • L45, RO45, and GV45 will all be required by the MLE CODEC decoder.
  • However, since decoded view 4 (804D) is used for left view L45, left view L45 does not have to be part of the produced data for LRO45.
  • Fused view LRO34 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 4 (804D), which can be used as the L45 part of LRO45.
  • LRO45 does not fully contribute to the bit rate and bandwidth required for transmission, as L45 does not need to be transmitted. This contributes significantly to the bandwidth savings.
  • The view that is transmitted (RO45) contains only the right occluded (RO) pixels and optional padding pixels, which is generally a smaller amount of data than the view not transmitted (L45).
  • Decoded view 3 (803D) and original view 2 (402) are used to generate RLO format RLO32 (823), similar to the above-described method for RLO12.
  • RLO32 contains a right view R32, a left occlusions view LO32, and generating-vectors RGV32.
  • Decoded view 3 (803D) is used for right view R32.
  • R32, LO32, and RGV32 will all be required by the MLE CODEC decoder.
  • However, since decoded view 3 (803D) is used for right view R32, right view R32 does not have to be part of the produced data for RLO32.
  • Fused view LRO34 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 3 (803D), which can be used as the R32 part of RLO32.
  • From RLO32, only the LO32 and RGV32 parts need to be transmitted.
  • RLO32 does not fully contribute to the bit rate and bandwidth required for transmission, as R32 does not need to be transmitted. This contributes significantly to the bandwidth savings.
  • Decoded view 2 (802D) is substantially the same as original view 2 (402).
  • Decoded view 2 (802D) and original view 1 (401) are used to generate RLO format RLO21 (821), similar to the above-described method for RLO32.
  • RLO21 contains a right view R21, a left occlusions view LO21, and generating-vectors RGV21. Similar to the description in reference to right view R32, decoded view 2 (802D) is used for right view R21. R21, LO21, and RGV21 will all be required by the MLE CODEC decoder. However, since decoded view 2 (802D) is used for right view R21, right view R21 does not have to be part of the produced data for RLO21. Data for fused view RLO32 is already available, and can be used by the MLE CODEC decoder to produce decoded view 2 (802D), which can be used as the R21 part of RLO21. Hence, from RLO21, only the LO21 and RGV21 parts need to be transmitted. RLO21 does not fully contribute to the bit rate and bandwidth required for transmission, as R21 does not need to be transmitted.
  • MLE CODEC Decoder Based on a Combination of the LRO and RLO Formats
  • Referring to FIGURE 9, a flow diagram of processing for an MLE CODEC decoder based on a combination of the LRO and RLO formats, the non-limiting example of FIGURE 8 is continued.
  • The MLE CODEC decoder does not have to decode all branches. Only branches necessary for providing the desired images need to be decoded. In a non-limiting example, if a user wants to see what is happening in the direction of original image 3 (403), 4 (404), and 5 (405), only the left branch (LRO-encoded data) needs to be decoded to provide the desired images, as in the sketch below.
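A hedged sketch of this branch selection follows; the mapping is hand-written for the FIGURE 8/9 layout and is illustrative only.

    def needed_levels(wanted_views):
        """Map requested view numbers (1..5) to the fused data sets to decode."""
        levels = {"LRO34"}            # the root yields decoded views 3 and 4
        if 5 in wanted_views:
            levels.add("LRO45")       # needs decoded view 4
        if 1 in wanted_views or 2 in wanted_views:
            levels.add("RLO32")       # yields decoded view 2
        if 1 in wanted_views:
            levels.add("RLO21")       # needs decoded view 2 first
        return levels

    # Views 3, 4, and 5 require only the root and the LRO branch.
    assert needed_levels({3, 4, 5}) == {"LRO34", "LRO45"}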
  • The LRO34 fused data format (834) includes transmitted data L34, RO34, and GV34.
  • LRO34 is decoded to generate decoded view 4 (904D) and decoded view 3 (903D). This decoding is similar to the decoding described above in reference to the LRO format for S3D.
  • LRO45 (945) is decoded to generate decoded view 5 (905D).
  • The format for LRO45 contains a left view L45, a right occlusions view RO45, and generating-vectors GV45.
  • Decoded view 4 (904D) is used for left view L45. Since decoded view 4 (904D) is used for left view L45, the MLE CODEC decoder does not have to receive L45 as part of the data transmission. As described above, since L45 is not needed by the decoder, L45 is not produced or transmitted as part of LRO45.
  • RO45 and GV45 are transmitted as part of the multiview transmission to the decoder.
  • RO45 and GV45 are used with the generated decoded view 4 (904D), which is L45, to form LRO45.
  • LRO45 is then decoded to generate decoded view 5 (905D).
  • Decoded view 3 (903D) is used for right view R32. Since decoded view 3 (903D) is used for right view R32, the MLE CODEC decoder does not have to receive R32 as part of the data transmission.
  • LO32 and RGV32 are transmitted as part of the multiview transmission to the decoder.
  • LO32 and RGV32 are used with the generated decoded view 3 (903D), which is R32, to form RLO32.
  • RLO32 is then decoded to generate decoded view 2 (902D).
  • After generating decoded view 2 (902D), the method repeats, using decoded view 2 (902D) as right view R21 of RLO21 (921). Data received by the MLE CODEC decoder for left occlusions view LO21 and generating-vectors RGV21 completes the RLO21 fused data format. RLO21 is then decoded to generate decoded view 1 (901D).
  • Referring to FIGURE 10, a high-level block diagram of a system 1000 of the present embodiment includes a processor 1002, a transceiver module 1010, and optional memory devices: a RAM 1004, a boot ROM 1006, and a nonvolatile memory 1008, all communicating via a common bus 1012.
  • System 1000 includes a variety of processing modules, depending on the specific encoding and/or decoding required by the application.
  • The components of system 1000 are deployed in a host 1020.
  • Transceiver module 1010 can be configured to receive and/or send data for encoding and/or decoding. When the transceiver module is used to receive data, the transceiver module functions as a data-receiving module.
  • Received data for LRO encoding can include a first set of data (original view 1, 401) and a second set of data (original view 2, 402).
  • Received data for LRO decoding can include a first view (L12), a second view (RO12), and generating-vectors (GV12) associated with the first and second views.
  • Received data for multiview encoding can include a first set of data (original view 1, 401), a second set of data (original view 2, 402), and a third set of data (original view 3, 403).
  • Received data for multiview decoding can include a first fused data set (LRO12), a third view (RO23), and a second set of associated generating-vectors (GV23).
  • Data from LRO and multiview encoding and decoding can be transmitted via transceiver module 1010, stored in volatile memory, such as RAM 1004, and/or stored in nonvolatile memory 1008.
  • RAM 1004 and nonvolatile memory 1008 can be configured as a storage module for data.
  • Stored or transmitted data from LRO encoding includes the LRO fused view of FIGURE 3A, the RLO fused view of FIGURE 3B, and the LRO12 fused view (412) of FIGURE 4 (which includes first view L12, second view RO12, and associated generating-vectors GV12).
  • Data from LRO decoding includes a first set of data (decoded view 1, 501D) and a second set of data (decoded view 2, 502D).
  • Data from multiview encoding includes a first fused data set (LRO12, which is 412), a third view (RO23), and a second set of associated generating-vectors (GV23).
  • Data from multiview decoding includes a decoded first view (501D), a decoded second view (502D), and a decoded third view (503D).
  • Data can be received or transmitted as two sets of data in a single file, as two or more files, or in other configurations as appropriate to the application.
  • Nonvolatile memory 1008 is an example of a computer-readable storage medium bearing computer-readable code for implementing the data encoding and decoding methodologies described in the current document.
  • Other examples of such computer-readable storage media include read-only memories such as CDs bearing such code.
  • the computer-readable code can include program code for one or more of the following: encoding data in the LRO format, decoding data from the LRO format, encoding data in the multiview format, and decoding data from the multiview format.
  • Arrows between data generally represent processing modules, or processing, which can be implemented on a processing system that includes one or more processors, such as processor 1002. For clarity in the diagrams, not all of the arrows are labeled.
  • The following is an exemplary description and mapping of some of the LRO and multiview CODEC processing modules:
  • Arrow 490 represents a processing module, typically implemented as processing on a processor, such as processor 1002.
  • Arrow 490, also referred to in this description as processing module 490, is an encoding process, which generates a fused view (LRO12/412) from two sets of data (original view 1, 401, and original view 2, 402).
  • The fused view encoding process, arrow 490, is similar for LRO encoding or as a step in multiview encoding.
  • Arrow 492, also referred to in this description as processing module 492, is a decoding process, which generates one or more decoded views, such as decoded view 2 (402D), from a fused view (LRO12/412).
  • The fused view decoding process, arrow 492, is similar for LRO decoding or as a step in multiview decoding.
  • Arrow 592, also referred to in this description as processing module 592, is a decoding process, which generates one or more decoded views, such as decoded view 1 (501D) and decoded view 2 (502D), from a fused view (LRO12/412).
  • The fused view decoding process, arrow 592, is similar for LRO decoding or as a step in multiview decoding.
  • Decoding processes arrow 492 and arrow 592 are similar, and the same processing module can be used for each. As such, the MLE CODEC decoder processing (arrow 492) is a subset of the processing needed for the MLE CODEC encoder processing (including arrow 490 and arrow 492).
  • Savings can be achieved in hardware, firmware, and/or software implementation by a careful reuse of modules, specifically by implementing the encoder to allow the decoding processing that is part of the encoder portion of the CODEC to be used for the decoder portion of the CODEC.
  • Arrow 594, also referred to in this description as processing module 594, is a decoding process, which generates one or more decoded views, such as decoded view 3 (503D). While processing module 594 can also generate decoded view 2 (502D), this is not necessary, as decoded view 2 (502D) has already been generated as part of the previous level's processing.
  • The decoding process of arrow 594 is similar to arrow 592 and arrow 492, and is preferably implemented as the same processing module, as in the sketch below.
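The module reuse described above can be pictured as follows. The function names are assumptions of this sketch, and the bodies of the encode and decode modules are intentionally left unspecified.

    def decode_module(fused_set):
        # Shared decode processing (arrows 492, 592, and 594):
        # fused view in, decoded view(s) out. Not specified here.
        raise NotImplementedError

    def encode_module(view_a, view_b):
        # Encode processing (arrow 490): two views in, fused view out.
        raise NotImplementedError

    def encoder_level(view_a, view_b):
        """One MLE encoder level: fuse, then decode with the shared module."""
        fused = encode_module(view_a, view_b)
        decoded = decode_module(fused)   # the encoder reuses the decoder module
        return fused, decoded

    def decoder_level(fused):
        """One MLE decoder level: the decoder is a subset of the encoder."""
        return decode_module(fused)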
  • Encoded input data includes encoding in H.264, MPEG4, or any other format, as applicable for the application.
  • Processing includes LRO encoding, LRO decoding, multiview encoding, and multiview decoding.
  • The output data, including sets of data and decoded views, can be encoded in H.264, MPEG4, or any other format, as applicable for the application.
  • Modules can be implemented in software, but can also be implemented in hardware and firmware, on a single processor or distributed processors, at one or more locations.
  • the above-described module functions can be combined and implemented as fewer modules or separated into sub-functions and implemented as a larger number of modules. Based on the above description, one skilled in the art will be able to design an implementation for a specific application.

Abstract

An LRO format encoder provides images for three-dimensional (3D) viewing that are at least the quality of images provided by conventional encoding techniques using equivalent bandwidth, with encoding and decoding requiring less power than conventional formats. The LRO format facilitates encoding of multiple images using an innovative multiview low energy (MLE) CODEC that reduces the power consumption requirements of encoding and decoding 3D content, as compared to conventional techniques. A significant feature of the MLE CODEC is that a decoded view from a lower processing level is used for one of the components of the LRO format for at least one higher processing level. Thus, some components of the LRO level for a higher view can be derived from processing of lower views, and not all the components of the higher view need to be transmitted as part of the data for the MLE CODEC.

Description

MULTIVIEW 3D COMPRESSION FORMAT AND ALGORITHMS
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of provisional patent application (PPA) Serial Number 61/390,291 filed October 6, 2010 (attorney file 4221/6), and PPA Serial Number 61/509,581 filed July 20, 2011 (attorney file 4221/8) by Alain Fogel, which are incorporated by reference.
FIELD OF THE INVENTION
The present embodiment generally relates to the field of computer vision and graphics, and in particular, it concerns a system and format for three-dimensional (3D) encoding and rendering of multiple views.
BACKGROUND OF THE INVENTION
3D Imaging Technology can be described as the next revolution in modern camera and video technology. Stereo 3D (S3D) is produced by displaying two views, one for each eye, the LEFT and RIGHT views. The main markets for S3D are similar to those for 2D:
television (TV), cinema, mobile, and cameras. S3D is already being marketed in TV sets, as many LCD TV sets are provided with S3D capability using 3D glasses. 3D Smartphones and tablets are potentially huge markets, due to the phenomenal growth and development of the mobile market. Several 3D handsets and tablets are currently available (e.g., Hitachi, Samsung, Sharp, HTC, and LG currently have products on the market). In all these markets, 2D and 3D digital video content must be stored and delivered via limited bandwidth communication channels or devices. Hence encoding/decoding by CODECs is required to compress the content and satisfy bandwidth limitations. 2D CODECs such as H.264 have significant computing requirements resulting in significant power consumption. While much more critical for battery-powered devices such as mobile devices, power consumption is also a consideration in TV sets and set top boxes (STBs) because of regulatory energy constraints. For S3D CODECs, the problem is much worse: in a brute force approach, the power consumption and the bandwidth are doubled. Hence, any implementation of S3D requires sophisticated algorithms that minimize bandwidth and power consumption. These two requirements usually have opposite effects, since the more sophisticated the algorithms are to limit bandwidth requirement, the more complex is the implementing software/hardware, and the higher the power consumption to implement the algorithm.
The current algorithmic technology that is recommended by the MPEG forum is an extension of H.264/MPEG4 and is called H.264/MPEG4-MVC. This algorithm has the advantage of keeping the bandwidth requirement to a reasonable level. However, the power consumption on a mobile device for a typical broadcast video is multiplied by a factor close to 2. Another algorithm technology has been developed by Dolby Laboratories, Inc. called 3D Full-Resolution that includes a layer on top of H.264. This layer is an enhancement for Side by Side 3D HALF HD, enabling 3D FULL HD on set top boxes (STBs). Similar to H.264/MPEG4-MVC, the power consumption when using this additional layer is high, as compared to two-dimensional (2D) viewing.
Multiview 3D is the next step of S3D. With a multiview 3D TV, the viewer will be able to see multiple, different views of a scene. For example, different views of a football match can be seen by the viewer moving himself and/or selecting the 3D view that the viewer wants to see, just like going around a hologram or turning a hologram in your hand. The same use case applies to tablets with the viewer selecting a desired view by tilting the tablet. Multiview 3D technology with no glasses exists already for specific markets (e.g. advertisement). Currently, the available screen sets are impressive but very expensive as compared to the cost of a non-multiview screen. As such, currently available screen sets do not fit the consumer TV market yet. The LCD sets are typically based on lenticular displays and typically exhibit 8 to 10 views. Today, the resolution of each view is equal to the screen resolution divided by the number of views. Projections for this market are that this limitation on the resolution of each view will be resolved in the future, and that each view will have full HD resolution. In a full resolution multiview 3D TV set or tablet, the computing power and the power consumption for visualizing, for example, 8 views coded with H.264-MVC are the power required by a single view multiplied by the number of views (in this case 8), as compared to a 2D view. In other words, an 8 view multiview 3D TV set consumes about 8 times as much power as a single view 3D TV set. The power consumption requirements of multiview 3D TV are a challenge for decoding chips and for energy saving regulations.
There is therefore a need for methods and systems for encoding 3D content with the purpose of reducing the power consumption, in particular at the decoder, as compared to conventional techniques.
SUMMARY
According to the teachings of the present embodiment there is provided a method for encoding data including the steps of: receiving a first set of data; receiving a second set of data; generating a first view, a second view, and associated generating-vectors; wherein the first and second views are generated by combining the first and second sets of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data.
In an optional embodiment, the first view is the first set of data. In another optional embodiment, the first view includes elements that are common to the first and second sets of data. In another optional embodiment, the second view only contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data. In another optional embodiment, the second view contains additional information, the additional information other than information only associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data. In another optional embodiment, the second view includes elements of the first and second sets of data that are only in the second set of data.
In an optional embodiment, the first set of data is a first two-dimensional (2D) image of a scene from a first viewing angle, and the second set of data is a second 2D image of the scene from a second viewing angle.
In an optional embodiment, the data is in H.264 format. In another optional embodiment, the data is in MPEG4 format.
In an optional embodiment, the method further includes the step of: storing the first view, the second view, and the associated generating-vectors in association with each other.
According to the teachings of the present embodiment there is provided a method for decoding data including the steps of: receiving a first view and a second view, the first and second views containing information associated with elements of a first set of data and a second set of data such that the first view contains information associated with elements of the first set of data, and the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data; receiving generating-vectors associated with the first and second views, the generating-vectors indicating operations to be performed on elements of the first and second views to generate the first and second set of data; and generating, using the first view, the second view, and the generating-vectors, at least the first set of data.
In an optional embodiment, the method includes the step of generating, using the first view, the second view, and the generating-vectors, the second set of data.
According to the teachings of the present embodiment there is provided a system for encoding data including: a data-receiving module configured to receive at least a first set of data and a second set of data; and a processing system containing one or more processors, the processing system being configured to generate a first view, a second view, and associated generating-vectors, wherein the first and second views are generated by combining the first and second sets of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data.
In an optional embodiment, the system includes a storage module configured to store the first view, the second view, and the associated generating-vectors in association with each other.
According to the teachings of the present embodiment there is provided a system for decoding data including: a data-receiving module configured to receive at least: a first view and a second view, the first and second views containing information associated with elements of a first set of data and a second set of data such that the first view contains information associated with elements of the first set of data, and the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data; and generating-vectors associated with the first and second views, the generating-vectors indicating operations to be performed on elements of the first and second views to generate the first and second set of data; and a processing system containing one or more processors, the processing system being configured to generate, using the first view, the second view, and the generating-vectors, at least the first set of data.
According to the teachings of the present embodiment there is provided a method for encoding data including the steps of: generating a first fused data set including a first view, a second view, and a first set of associated generating-vectors wherein the first and second views are generated by combining a first set of data and a second set of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the first set of associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data; generating a decoded second view using the first fused data set, the decoded second view substantially the same as the second set of data; and generating a third view, and a second set of associated generating-vectors wherein the third view is generated by combining the decoded second view and a third set of data, such that the third view contains information associated with elements of the third set of data other than elements of the third set of data that are in common with corresponding elements of the decoded second view, and the second set of associated generating-vectors indicate operations to be performed on the elements of the decoded second view and third views to recover the second and third sets of data.
In an optional embodiment, the steps of generating a decoded second view and generating a third view are repeated to generate a higher-level fused data set, the higher-level fused data set including a higher-level decoded view from a lower-level fused data set.
In an optional embodiment, the method further includes the step of: storing the first fused data set, the third view, and the second set of associated generating-vectors in association with each other.
According to the teachings of the present embodiment there is provided a method for decoding data including the steps of: receiving a first fused data set including a first view, a second view, and a first set of associated generating-vectors, the first and second views containing information associated with elements of a first set of data and a second set of data such that the first view contains information associated with elements of the first set of data, and the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the first set of associated generating-vectors indicating operations to be performed on elements of the first and second views to render the first and second set of data; generating at least a decoded second view using the first fused data set, the decoded second view substantially the same as the second set of data; and generating a decoded third view using a second fused data set, the second fused data set including the decoded second view, a third view and a second set of associated generating-vectors, wherein the third view contains information associated with elements of a third set of data other than elements of the third set of data that are in common with corresponding elements of the second set of data, the second set of associated generating-vectors indicating operations to be performed on elements of the decoded second view and the third view to render the decoded third view, the decoded third view substantially the same as the third set of data.
In an optional embodiment, the step of generating a decoded third view is repeated to generate a higher-level decoded view using a higher-level fused data set, the higher-level fused data set including a decoded view from a lower-level fused data set.
In an optional embodiment, the method includes the step of: generating a decoded first view using the first fused data set, the decoded first view substantially the same as the first set of data.
According to the teachings of the present embodiment there is provided a system for encoding data including: a data-receiving module configured to receive at least a first set of data, a second set of data, and a third set of data; and a processing system containing one or more processors, the processing system being configured to: generate a first fused data set including a first view, a second view, and a first set of associated generating-vectors wherein the first and second views are generated by combining a first set of data and a second set of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the first set of associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data; generate a decoded second view using the first fused data set, the decoded second view substantially the same as the second set of data; and generate a third view, and a second set of associated generating-vectors wherein the third view is generated by combining the decoded second view and a third set of data, such that the third view contains information associated with elements of the third set of data other than elements of the third set of data that are in common with corresponding elements of the decoded second view, and the second set of associated generating-vectors indicate operations to be performed on the elements of the decoded second view and third views to recover the second and third sets of data.
In an optional embodiment, the system includes a storage module configured to store the first fused data set, the third view, and the second set of associated generating-vectors in association with each other.
According to the teachings of the present embodiment there is provided a system for decoding data including: a data-receiving module configured to receive at least a first fused data set including a first view, a second view, and a first set of associated generating-vectors, the first and second views containing information associated with elements of a first set of data and a second set of data such that the first view contains information associated with elements of the first set of data, and the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the first set of associated generating-vectors indicating operations to be performed on elements of the first and second views to render the first and second set of data; and a processing system containing one or more processors, the processing system being configured to: generate at least a decoded second view using the first fused data set, the decoded second view substantially the same as the second set of data; and generate a decoded third view using a second fused data set, the second fused data set including the decoded second view, a third view and a second set of associated generating-vectors, wherein the third view contains information associated with elements of a third set of data other than elements of the third set of data that are in common with corresponding elements of the second set of data, the second set of associated generating-vectors indicating operations to be performed on elements of the decoded second view and the third view to render the decoded third view, the decoded third view substantially the same as the third set of data.
According to the teachings of the present embodiment there is provided a computer- readable storage medium having embedded thereon computer-readable code for encoding data the computer-readable code including program code for: receiving a first set of data; receiving a second set of data; generating a first view, a second view, and associated generating-vectors; wherein the first and second views are generated by combining the first and second sets of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data.
According to the teachings of the present embodiment there is provided a computer- readable storage medium having embedded thereon computer-readable code for decoding data the computer-readable code including program code for: receiving a first view and a second view, the first and second views containing information associated with elements of a first set of data and a second set of data such that the first view contains information associated with elements of the first set of data, and the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data; receiving generating-vectors associated with the first and second views, the generating-vectors indicating operations to be performed on elements of the first and second views to generate the first and second set of data; and generating, using the first view, the second view, and the generating-vectors, at least the first set of data.
According to the teachings of the present embodiment there is provided a computer-readable storage medium having embedded thereon computer-readable code for encoding data the computer-readable code including program code for: generating a first fused data set including a first view, a second view, and a first set of associated generating-vectors wherein the first and second views are generated by combining a first set of data and a second set of data, such that the first view contains information associated with elements of the first set of data, the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the first set of associated generating-vectors indicate operations to be performed on the elements of the first and second views to recover the first and second sets of data; generating a decoded second view using the first fused data set, the decoded second view substantially the same as the second set of data; and generating a third view, and a second set of associated generating-vectors wherein the third view is generated by combining the decoded second view and a third set of data, such that the third view contains information associated with elements of the third set of data other than elements of the third set of data that are in common with corresponding elements of the decoded second view, and the second set of associated generating-vectors indicate operations to be performed on the elements of the decoded second view and third views to recover the second and third sets of data.
According to the teachings of the present embodiment there is provided a computer-readable storage medium having embedded thereon computer-readable code for decoding data the computer-readable code including program code for: receiving a first fused data set including a first view, a second view, and a first set of associated generating-vectors, the first and second views containing information associated with elements of a first set of data and a second set of data such that the first view contains information associated with elements of the first set of data, and the second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and the first set of associated generating-vectors indicating operations to be performed on elements of the first and second views to render the first and second set of data; generating at least a decoded second view using the first fused data set, the decoded second view substantially the same as the second set of data; and generating a decoded third view using a second fused data set, the second fused data set including the decoded second view, a third view and a second set of associated generating-vectors, wherein the third view contains information associated with elements of a third set of data other than elements of the third set of data that are in common with corresponding elements of the second set of data, the second set of associated generating-vectors indicating operations to be performed on elements of the decoded second view and the third view (RO23) to render the decoded third view, the decoded third view substantially the same as the third set of data.
BRIEF DESCRIPTION OF FIGURES
The embodiment is herein described, by way of example only, with reference to the accompanying drawings, wherein:
FIGURE 1 is a diagram of a fused 2D view.
FIGURE 2 is a diagram of exemplary GOC operations for the LRO format.
FIGURE 3A is a diagram of an LRO fused view format.
FIGURE 3B is a diagram of an RLO format.
FIGURE 4 is a flow diagram of processing for an MLE CODEC encoder based on the LRO format.
FIGURE 5 is a flow diagram of processing for an MLE CODEC decoder based on the LRO format.
FIGURE 6 is a flow diagram of processing for an MLE CODEC encoder based on the RLO format.
FIGURE 7 is a flow diagram of processing for an MLE CODEC decoder based on the RLO format.
FIGURE 8 is a flow diagram of processing for an MLE CODEC encoder based on a combination of the LRO and RLO formats.
FIGURE 9 is a flow diagram of processing for an MLE CODEC decoder based on a combination of the LRO and RLO formats.
FIGURE 10 is a diagram of a system for LRO and MULTIVIEW encoding and decoding.
DETAILED DESCRIPTION
The principles and operation of the method and system according to a present embodiment may be better understood with reference to the drawings and the accompanying description. The present invention is a system and method for encoding 3D content with reduced power consumption, in particular reduced decoder power consumption, as compared to conventional techniques.
An innovative implementation of 3D+F includes an innovative format that includes one of the original views, in contrast to the previously taught 3D+F format that is generated from the original views but does not contain either of the original two views. This innovative format is referred to in the context of this document as the "LRO format". In addition to providing images for three-dimensional (3D) viewing, the LRO format can provide compatibility with two-dimensional (2D) applications. The LRO format has been shown to provide images that are at least the quality of images provided by conventional encoding techniques using equivalent bandwidth, with the ability in some cases to provide higher quality images than conventional encoding techniques. A significant feature of the LRO format is that encoding and decoding images requires less power than conventional encoding and decoding formats. The LRO format facilitates encoding of multiple images using an innovative multiview low energy (MLE) CODEC.
The innovative MLE CODEC reduces the power consumption requirements of encoding and decoding 3D content, as compared to conventional techniques. A significant feature of the MLE CODEC is that a decoded view from a lower processing level is used for one of the components of the LRO format for at least one higher processing level. Thus, some components of the LRO level for a higher view can be derived from processing of lower views, and not all the components of the higher view need to be transmitted as part of the data for the MLE CODEC.
WIPO application PCT/IB2010/051311 (attorney file 4221/4) teaches a method and system for minimizing power consumption for encoding data and three-dimensional rendering. This method, called 3D+F, makes use of a special format and consists of two main components, a fused view portion and a generating-vectors portion. For clarity in this document, the 3D+F format taught in PCT/IB2010/051311 is referred to as the "original 3D+F format" versus the innovative format of the current invention which is generally referred to as the "LRO format".
Referring to FIGURE 1, a diagram of a fused 2D view, a fused view 120 is obtained by correlating a left view 100 and a right view 110 of a scene to derive a fused view, also known as a single Cyclopean view, 120, similar to the way the human brain derives one image from two images. While each of a left and right view (image) contains information only about the respective view, the fused view includes all the information necessary to render efficiently left and right views. In the context of this document, the terms "view" and "image" are generally used interchangeably. In the context of this document, the term "scene" generally refers to what is being viewed. A scene can include one or more objects or a place that is being viewed. A scene is viewed from a location, referred to as a viewing angle. In the case of stereovision, two views, each from different viewing angles are used. Humans perceive stereovision using one view captured by each eye. Technologically, two image capture devices, for example video cameras, at different locations provide images from two different viewing angles for stereovision.
In a non-limiting example, left view 100 of a scene, in this case a single object, includes the front of the object from the left viewing angle 106 and the left side of the object 102. Right view 110 includes the front of the object from the right viewing angle 116 and the right side of the object 114. The fused view 120 includes information for the left side of the object 122, information for the right side of the object 124, and information for the front of the object 126. Note that while the information for the fused view left side of the object 122 may include only left view information 102, and the information for the fused view right side of the object 124 may include only right view information 114, the information for the front of the object 126 includes information from both left 106 and right 116 front views.
In particular, features of a fused view include:
  • A fused view can be generated without occluded elements. In the context of this document, the term element generally refers to a significant minimum feature of an image. Commonly, an element will be a pixel, but depending on the application and/or image content, an element can be a polygon or area. The term pixel is often used in this document for clarity and ease of explanation. Every pixel in a left or right view can be rendered by copying a corresponding pixel (sometimes copying more than once) from a fused view to the correct location in a left or right view.
  • The processing algorithms necessary to generate the fused view work similarly to how the human brain processes images, therefore eliminating issues such as light and shadowing of pixels.
Preferably, the fused view of the 3D+F format does not contain any occluded pixels. In other words, every pixel in the fused view is in the left, right, or both the left and right original images. There are no (occluded) pixels in the fused view that are not in either the left or the right original images. A significant feature of the 3D+F format is the ability of a fused view to be constructed without the fused view containing occluded pixels. This feature should not be confused with occluded pixels in the original images, which are pixels that are visible in a first original image, but not a second original image. In this case, the pixels that are visible only in the first original image are occluded for the second original image. The pixels that are occluded for the second original image are included in the fused view, and when the fused view is decoded, these occluded pixels are used to re-generate the first original image.
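As a toy illustration of this property, sets of pixel identifiers can stand in for images; the identifiers and sets below are purely illustrative.

    # Toy model: each view is a set of pixel identifiers. The fused view is the
    # union of left-only (L), both (B), and right-only (R) pixels, so it contains
    # no pixel that is absent from both originals.
    left = {"L1", "L2", "B1", "B2", "B3"}     # pixels visible in the left view
    right = {"B1", "B2", "B3", "R1", "R2"}    # pixels visible in the right view

    both = left & right                        # B pixels
    left_only = left - right                   # occluded for the right view
    right_only = right - left                  # occluded for the left view

    fused = left_only | both | right_only
    assert fused == left | right               # nothing outside the originals
    assert left <= fused and right <= fused    # both views are recoverable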
One ordinarily skilled in the art will understand that references to pixels that are visible in one image and in another image refer to corresponding pixels as understood in the stereo literature. Due to the realities of 3D imaging technology such as stereo 3D (S3D), including, but not limited to, sampling and noise, corresponding pixels are normally not exactly the same, but, depending on the application, sufficiently similar to be used as the same pixel for processing purposes.
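For illustration, such a tolerance can be expressed as a simple threshold test. The following Python sketch is an assumption for illustration only; the function name and tolerance value are not taken from this disclosure, and a real system would tune the comparison per application.

```python
# Hypothetical similarity test for corresponding pixels; the tolerance is an
# arbitrary illustrative value, not specified by this disclosure.
def sufficiently_similar(p, q, tol=8):
    # Treat two 8-bit intensity values as the "same" pixel for processing
    # purposes when they differ by no more than tol (absorbing sampling/noise).
    return abs(p - q) <= tol
```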
The type of fused view generated in the original 3D+F format depends on the application. One type of fused view includes more pixels than either of the original left or right views. This is the case described in reference to FIGURE 1. In this case, all the occluded pixels in the left or right views are integrated into the fused view. If this type of fused view were to be viewed by a user, the view would be a distorted 2D view of the content.
Another type of fused view has approximately the same amount of information as either the original left or right views. This fused view can be generated by mixing (interpolating or filtering) a portion of the occluded pixels in the left or right views with the visible pixels in both views. In this case, if the fused view were to be viewed by a user, the view would show a normal 2D view of the content. This normal (viewable) 2D fused view in the original 3D+F format has been nonlinearly warped so that the fused view appears as a normal 2D view.
However, this fused view is not similar to either the original left or original right views in the sense that, besides pixels that are in both original views, the fused view includes pixels that are only in the right original view and pixels that are only in the left original view. Note that 3D+F can use either of the above-described types of fused views, or another type of fused view, depending on the application. The encoding algorithm should preferably be designed to optimize the quality of the rendered views. The choice of which portion of the occluded pixels to mix with the visible pixels in the two views, and the choice of mixing operation, can be made in a process of analysis by synthesis, for example, a process in which the pixels and operations are optimally selected as a function of the rendered image quality that is continuously monitored.
Algorithms for performing fusion are known in the art and typically use stereo matching. Based on this description, one skilled in the art will be able to choose the appropriate fusion algorithm for a specific application and modify the fusion algorithm as necessary to generate the associated generating-vectors for 3D+F.
A second component of the 3D+F format is a generating-vectors portion, also referred to as generic-vectors. The generating-vectors portion includes a multitude of generating-vectors, more simply referred to as the generating-vectors. Two types of generating-vectors are left generating-vectors and right generating-vectors, used to generate a left view and a right view, respectively.
A first element of a generating-vector is a run-length number that is referred to as a generating number (GN). The generating number is used to indicate how many times an operation (defined below) on a pixel in a fused view should be repeated when generating a left or right view. An operation is specified by a generating operation code, as described below.
A second element of a generating-vector is a generating operation code (GOC), also simply called "generating operators" or "operations". A generating operation code indicates what type of operation (for example, a function, or an algorithm) should be performed on the associated pixel(s). Operations can vary depending on the application. In a preferred implementation, at least the following operations are available:
• Copy: copy a pixel from a fused view to the view being generated (left or right). If GN is equal to n, the pixel is copied n times.
• Occlude: occlude a pixel. For example, do not generate a pixel in the view being generated. If GN is equal to n, do not generate n pixels, meaning that n pixels from the fused view are occluded in the view being generated.
• Go to next line: current line is completed; start to generate a new line.
• Go to next frame: current frame is completed; start to generate a new frame.

A non-limiting example of an additional, optional operation is Copy-and-Filter: the pixels are copied and then smoothed with the surrounding pixels. This operation could be used to improve the imaging quality, although the quality achieved without filtering is generally acceptable.
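For illustration, a minimal Python sketch of how the generating numbers and operation codes described above could drive generation of one view from a fused view follows. All names are assumptions; the sketch follows the literal reading above, in which a Copy with GN equal to n copies a single fused pixel n times, and an Occlude with GN equal to n skips n fused pixels.

```python
# Minimal sketch of generating one view from a fused view using (GN, GOC)
# generating-vectors; names and the data layout are illustrative assumptions.
from enum import Enum

class Goc(Enum):
    COPY = 0        # copy the current fused pixel GN times
    OCCLUDE = 1     # do not generate: GN fused pixels are occluded in this view
    NEXT_LINE = 2   # current line is completed; start a new line
    NEXT_FRAME = 3  # current frame is completed

def generate_view(fused, vectors):
    """fused: 2D list of pixels (rows); vectors: iterable of (gn, goc) pairs."""
    frame, line, row, col = [], [], 0, 0
    for gn, goc in vectors:
        if goc is Goc.COPY:
            line.extend([fused[row][col]] * gn)  # one fused pixel, copied gn times
            col += 1
        elif goc is Goc.OCCLUDE:
            col += gn                            # gn fused pixels are skipped
        elif goc is Goc.NEXT_LINE:
            frame.append(line)
            line, row, col = [], row + 1, 0
        elif goc is Goc.NEXT_FRAME:
            if line:
                frame.append(line)
            break
    return frame
```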
DETAILED DESCRIPTION - FIRST EMBODIMENT - FIGURES 2 TO 3
LRO Fused Data Format
An innovative implementation of 3D+F includes an innovative format that contains one of the original views, in contrast to the previously taught 3D+F format that is generated from the original views but does not contain either of the original two views. In the context of this document, this innovative format is referred to as LRO (left view, right occlusions) or as RLO (right view, left occlusions). For simplicity and clarity in the description, the
LRO/RLO format is generally referred to as just the LRO format. One skilled in the art will understand that references to either the LRO or RLO apply to both formats, except where a specific construction is being described. Although the previously taught 3D+F format does contain the original views, in the sense that the original views can be re-generated from the 3D+F format, this should not be confused with the innovative LRO format described below
that contains an original view as the original view per se. In other words, the original view does not need to be generated, but can be extracted from the LRO/RLO format. As described above, a viewable 2D fused view in the original 3D+F format has been nonlinearly warped so that the fused view appears as a normal 2D view. However, this viewable fused view is not similar to either the original left or original right views. In contrast, in the LRO format (as
opposed to the RLO format), the left view can be the original left view, and the right view includes the elements occluded from the left view, in other words, the elements of the original right view that are not visible in the original left view. Elements common to both the original left and right views are included in the left view of the LRO fused view. Note that in the above description of the right view, the elements included in the right view are not
> exclusive. In other words, in addition to the occlusions, the right view can also include
padding information. This padding information can also be pixels that are in common with the left view.
Referring to FIGURE 2, a diagram of exemplary GOC operations for the LRO format, the GOC operations of the generating-vectors can be simply graphically represented as indicated by the following labels:
• B: copy pixel from fused view to both the left and right views.
• L: copy pixel from fused view to left view.
• R: copy pixel from fused view to right view.
• O: occlude pixel (skip): this generating-vector may be used to insert padding pixels (into the right view of the LRO fused view) that are not used in the view being generated, but are included in the fused view to increase the quality of the rendered views. Padding can also be used to enable more efficient processing (such as compression) of the fused view. Padding data is added to the fused view making the fused view larger, but enabling a compression algorithm to compress the larger fused view into a smaller amount of data to be transmitted, as compared to compressing a relatively smaller fused view into a relatively larger amount of data to be transmitted. As long as the generating-vectors are able to point to the correct pixels on which the generating-vectors need to act in the fused view to generate the correct pixels on the right and left views, the fused view can be arbitrarily generated. In other words, the pixel positions on the LRO fused view can be changed as long as these pixels can be retrieved for generating the left and right views.
A non-limiting example of associating the generating-vectors (GVs) with the corresponding pixels on which the GVs need to act can be seen in the embodiment of FIGURE 2. The B, L, and R GVs form a frame. In this frame, the GVs are located at positions such that, when sequentially retrieving pixels from the fused view and reading the corresponding GVs, the pixels retrieved are either skipped (O GV), copied to both left and right images (B), copied only to the left image (L), or copied only to the right image (R). The value of the GV indicates the operation on the corresponding pixel.
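A minimal sketch of this sequential replay, assuming the GV map is stored as one label per fused pixel, might look as follows; the function name is an assumption for illustration.

```python
# Replay a per-pixel B/L/R/O map against a sequential stream of fused pixels
# to rebuild the left and right views (illustrative sketch only).
def replay_gv_map(fused_pixels, gv_map):
    left, right = [], []
    for pixel, gv in zip(fused_pixels, gv_map):
        if gv == 'B':        # copy to both the left and right views
            left.append(pixel)
            right.append(pixel)
        elif gv == 'L':      # copy to the left view only
            left.append(pixel)
        elif gv == 'R':      # copy to the right view only
            right.append(pixel)
        # 'O': padding pixel, skipped in both generated views
    return left, right
```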
Continuing the current example using FIGURE 2, two (2) bits are necessary to represent the four values B, R, L, and O. In a case where padding is not used, only the two values B and R are required, and two values can be represented by only one bit. Once a map of generating-vectors is created per frame, the map can be compressed using run-length coding or other types of efficient entropy coding.
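As one possible realization of such run-length coding, assuming the per-frame map is a string of B/L/R/O labels (a sketch only; a deployed system could use any efficient entropy coder):

```python
# Run-length coding of a per-frame generating-vector map (illustrative).
from itertools import groupby

def rle_encode(gv_map):
    # "BBBLLRROO" -> [("B", 3), ("L", 2), ("R", 2), ("O", 2)]
    return [(label, len(list(run))) for label, run in groupby(gv_map)]

def rle_decode(runs):
    return ''.join(label * count for label, count in runs)
```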
In one non-limiting example, different fused views can be generated by varying the padding in the fused views. As the padding is not used to generate the decoded original left and right views, different fused views can generate similar decoded original left and right views. Generating alternative fused views may have system benefits, including greater compressibility, or improved quality after transmission (such as by H.264 compression).
A key feature of a method of generating an LRO format is arranging the pixel positions of the fused view to optimize subsequent processing, typically image quality and compression encoding, while maintaining the association between pixels of the LRO fused view and corresponding generating-vectors. In other words, the pixel positions of the LRO fused view can be changed for maximum benefit of the specific application for which the LRO fused view is being used.
In a non-limiting example application of wireless transmission, the LRO fused view
needs to be compressed by a compression algorithm such as H.264/MPEG4. Maximum benefit for this application includes using an LRO fused view that yields a very good compression rate by the compression algorithm being used, for example by H.264/MPEG4.

Referring to FIGURE 3A, a diagram of an LRO fused view format, an LRO fused view 300 includes a left view 302 and right occlusions view 304. The left view 302 is built from the L (left) and B (both) pixels corresponding to the L generating-vectors and the B generating-vectors, respectively. The L generating-vectors and B generating-vectors use the L pixels and B pixels, respectively, to generate (re-generate) the original left view. The right occlusions view 304 is built from the R (right) pixels corresponding to the R generating-vectors, and optionally of padding pixels built from the O pixels (refer back to FIGURE 2) corresponding to the O generating-vectors. Note that the padding pixels can be pixels common to the right and left original images. The R generating-vectors and B generating-vectors use the R pixels and B pixels, respectively, to generate (re-generate) the original right view.
In general, the LRO format is a method of storing data, the first step of the method being similar to the above-described method for generating an LRO fused data format for a first and second data set. In a case where the first data set is a first two-dimensional (2D) image, and the second data set is a second 2D image, the general method for storing data can be used for encoding data, in this case 2D images.
A method for encoding LRO format data includes the steps of receiving a first two-dimensional (2D) image of a scene from a first viewing angle and a second 2D image of the scene from a second viewing angle. A first view, a second view, and associated generating-vectors are generated using the first and second 2D images. The first view, second view, and associated generating-vectors can be stored in association with each other, temporarily or permanently, and/or transmitted. The first and second views are generated by combining the first and second 2D images. The first view contains information associated with elements of the first 2D image. The second view contains information associated with elements of the second 2D image other than elements of the second 2D image that are in common with corresponding elements of the first 2D image. As described above, the second view may also include other elements, such as padding. The associated generating-vectors indicate operations to be performed on elements of the first and second views to recover the first and second 2D images. Preferably, the first view is substantially identical to the first 2D image.
The first view includes elements that are common to the first and second sets of data.
In one implementation, the second view only contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data. In another implementation, the second view contains additional information, beyond the elements of the second set of data that are not in common with corresponding elements of the first set of data. In other words, the additional information is padding, which can be implemented as elements from the first set of data, or as other elements that are identified by the generating-vectors as not used to generate views.
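For concreteness, one possible container for an encoded LRO data set is sketched below in Python. The field names and flat pixel lists are assumptions for illustration, not a normative layout of the format.

```python
# Hypothetical container for one encoded LRO data set (illustrative only).
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class LroData:
    first_view: List[int]        # substantially the first 2D image
    second_view: List[int]       # second-image elements not in the first view,
                                 # plus optional padding elements
    generating_vectors: List[Tuple[str, int]]  # run-length coded B/L/R/O map
```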
In general, a method for decoding LRO format data includes the step of providing a first view and a second view. Similar to the description above in reference to encoding LRO format data, the first and second views contain information associated with elements of a first
2D image and a second 2D image. The first view contains information associated with
elements of the first 2D image. The second view contains information associated with elements of the second 2D image other than elements of the second 2D image that are in common with corresponding elements of the first 2D image. As described above, the second view may also include other elements, such as padding. Generating-vectors associated with
the first and second views are provided. The associated generating-vectors indicate
operations to be performed on elements of the first and second views to render the first and second 2D images. Using the first view, the second view, and the associated generating-vectors, at least the first 2D image is rendered. Preferably, the first view is the first 2D image. Thus, rendering the first 2D image can be done by simply extracting the first view, which is the first 2D image, from the encoded data. Using the first view, the second view, and the generating-vectors, the second 2D image can be rendered.
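Under the hypothetical container sketched earlier, decoding could proceed as follows: the first image is extracted directly, and the second image is rebuilt by replaying the generating-vectors against the two pixel streams. Names and stream layout are assumptions.

```python
# Sketch of LRO decoding: extract the first image, regenerate the second.
# `data` is assumed to have the fields of the LroData container sketched above.
def decode_lro(data):
    first_image = list(data.first_view)  # extracted as-is, no generation needed
    fv = iter(data.first_view)           # stream of B and L pixels
    ov = iter(data.second_view)          # stream of R and O (padding) pixels
    second_image = []
    for label, count in data.generating_vectors:
        for _ in range(count):
            if label == 'B':
                second_image.append(next(fv))   # shared pixel, reused
            elif label == 'L':
                next(fv)                        # first-view-only pixel, skipped
            elif label == 'R':
                second_image.append(next(ov))   # occlusion pixel for second image
            else:                               # 'O': padding, discarded
                next(ov)
    return first_image, second_image
```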
Note that instead of generating an LRO fused view as a single file, two or more files (for example, two separate views) can be generated, one file for the left view ("L", the part built from the left (L) and both (B) pixels), another file for the right occlusions view ("RO",
the part built from the right (R) pixels and optionally from the "O" pixels, for example for padding). Generating-vectors (GVs) are also generated, and can be included in the left view, right occlusions view, or preferably in a separate file. In this case, the term LRO fused view refers to both (or in general, all) of the generated files. The two views can be separately encoded using different settings in the H.264/MPEG4 compression scheme. Whether one
view or multiple views are used for the data of the fused view, the resulting LRO fused view achieves a compression with H.264/MPEG4 that is at least as good as the compression of the previously taught 3D+F format of FIGURE 1. The good compression ratio results in part from the right side view being compact. Also, since the right side view consists of right occlusions that have a reasonable degree of coherence from frame to frame, the right view tends to have good H.264/MPEG4 compression. In contrast, in order for the original 3D+F format previously taught in reference to FIGURE 1 to achieve improved H.264/MPEG4 compression, the original 3D+F format requires padding pixels in order to preserve inter-line pixel coherence, and therefore is not as compact as the innovative LRO format described in reference to FIGURE 3A.
As described above, the LRO format facilitates arranging the pixel positions of the fused view to optimize subsequent processing. In a typical case, the pixels of the left view
302 are chosen such that the left view is the original left image, and the pixels of the right occlusions view 304 are the pixels that are occluded from the left view (the elements of the original right image that are not visible in the left image and optional padding). As the right occlusions view is missing elements from the original right image, the quality of an image decoded from the right occlusions view may be increased by padding the right occlusions view. In this case, the decoded image can be monitored and the quality of the decoded image used for feedback to the fused view generator, modifying how padding is applied in the generation of the fused views. Separately or in combination with applying padding for increased image quality, padding can be applied to increase the compression ratio of the fused view, in particular for the right occlusions view. The right occlusions view is padded sufficiently so that the compression algorithm being used (for example H.264 for wireless transmission of the data) processes the data of the right occlusions view similarly to processing of an original image. Thus, the compression ratio of the padded right view can be higher than the compression ratio of an un-padded right occlusions view.
Regarding the use of padding, note that while both the original 3D+F format and the LRO format can use padding in the generated fused view, padding is not required in either format. In addition, how padding is used and the effect on the size of the fused view are different between the two formats. As described above, padding can optionally be used in the original 3D+F format, but this padding affects the quality of the rendered view, and increases the size of the fused view by a relatively larger amount. In other words, when padding is used in the original 3D+F format, a larger amount of data is added to the fused view than the relatively smaller amount of data added when padding is added to the fused view of the LRO format. When padding is optionally used in the LRO format, padding affects the compression ratio of the data of the fused view, with a relatively smaller amount of data added only to the right occlusions view ("RO", the part built from the right (R) pixels and optional padding). As shown in FIGURE 3A and FIGURE 1 by the relative size of the left view 302, which is comparable to the front of the object 126, and right occlusions view 304, the right occlusions view 304 is relatively smaller than the left view 302 to begin with, so the added padding to the LRO format right occlusions view 304 is less than the occlusions added to the front of the object 126.
Referring to FIGURE 3B, a diagram of an RLO format, a right view-left occlusions (RLO) format can be derived from the fused view. The RLO format is similar to the LRO
format, with the right view corresponding to the B and R pixels of the fused view while the left occlusions part is built from the L pixels and optionally from the O pixels, similar to the description of FIGURE 2. An RLO fused view 310 includes a right view 312 and left occlusions view 314. The right view 312 is built from the R (right) and B (both) pixels corresponding to the R generating-vectors and the B generating-vectors, respectively. The R generating-vectors and B generating-vectors use the R pixels and B pixels, respectively, to generate (re-generate) the original right view. The left occlusions view 314 is built from the L (left) pixels corresponding to the L generating-vectors and optionally from the O pixels, similar to the description of FIGURE 2. The L generating-vectors and B generating-vectors use the L pixels and B pixels, respectively, to generate (re-generate) the original left view.
DETAILED DESCRIPTION - SECOND EMBODIMENT - FIGURES 4 TO 10
While the above-described embodiment for an LRO format is useful, an additional method can be used in conjunction or independently, for handling multiple three-dimensional (3D) views, known as multiview 3D. As described above, the power consumption
requirements of multiview 3D are a challenge for decoding chips and for energy saving
regulations. The innovative method and system of a multiview 3D CODEC
(encoder/decoder) reduces the power consumption requirements of encoding and decoding 3D content, as compared to conventional techniques. This innovative multiview 3D CODEC is referred to in the context of this document as a multiview low energy CODEC, or simply
MLE CODEC. A feature of the MLE CODEC is a much lower power requirement as
compared to H.264-MVC. In particular, generating (also referred to in the industry as synthesizing) decoded views during encoding is a significant feature of the current invention, and has been shown to be less power consuming than implementations which synthesize views at the decoder/receiver stage.
For clarity, multiview 3D is herein described using the LRO (RLO) format. It is foreseen that, based on the above description of the LRO format and the description of multiview 3D below, modifications to the LRO format for specific applications are possible, and any format that supports the innovative characteristics of the LRO format to facilitate multiview 3D can be used for multiview 3D. Non-limiting examples of modifications to the LRO format include changing the structure of the frames (refer again to FIGURE 3), while keeping the meaningful information substantially intact.
MLE Encoder Based on the LRO Format
Referring to FIGURE 4, a flow diagram of processing for an MLE CODEC encoder based on the LRO format, a non-limiting example of 5 views is encoded. In the context of this document, the MLE CODEC encoder is also referred to as simply the MLE encoder. Based on this description, one skilled in the art will be able to extend this encoding method to an arbitrary number of views, including more views and fewer views. Below, further embodiments are described based on the RLO format, and on both LRO and RLO formats. 5 original views, original view 1 (401), original view 2 (402), original view 3 (403), original view 4 (404), and original view 5 (405), are encoded to produce a smaller amount of data, as compared to the amount of data in the original views. Original view 1 (401) and original view 2 (402) are used to generate LRO format for views 1 and 2, LRO12 (412), similar to the above described method for S3D with original view 1 (401) used as the left view and original view 2 (402) used as the right view. LRO12 contains a left view L12 [the part built from the left (L) and both (B) pixels], a right occlusions view RO12 [the part built from the right (R) pixels, and optionally from the occlusion (O) pixels, for example such as padding], and generating-vectors GV12 (generating-vectors to regenerate the original left and right views). Note that in the preferred embodiment described above in reference to the LRO format, as all of the pixels for the original view 1 (401) are in the left view (L12) and only pixels for original view 2 (402) are in the right occlusions view (RO12), an alternative notation could be used in which L12 is L1 and RO12 is R2. However, for consistency and compatibility, the L12 and RO12 notation is maintained in this document. L12, RO12, and GV12 will all be required by the MLE CODEC decoder, so are part of the produced data, and fully contribute to the bit rate and bandwidth required for
transmission. The original views (in this case, original view 1 (401) and original view 2 (402)) do not need to be transmitted. For clarity in the figures, items that do not need to be transmitted are striped, while items to be transmitted are not filled-in.
After generating LRO12, LRO12 is decoded to generate decoded view 2 (402D). While theoretically decoded view 2 (402D) can be the same as original view 2 (402), depending on the application and encoding parameters chosen, decoded view 2 (402D) and original view 2 (402) may be more or less similar. In other words, the quality of decoded view 2 (402D) as compared to original view 2 (402) may be substantially the same, or lower. In general, the two views, decoded view 2 (402D) and original view 2 (402), are substantially the same, meaning that for a given application the differences between the two views are below a given threshold.
Decoded view 2 (402D) and original view 3 (403) are used to generate LRO format LRO23 (423), similar to the above described method for LRO12. LRO23 contains a left view L23, a right occlusions view RO23, and generating-vectors GV23. A significant feature of the method of this encoding is that decoded view 2 (402D) is used for left view L23. L23, RO23, and GV23 will all be required by the MLE CODEC decoder. However, since decoded view 2 (402D) is used for left view L23, left view L23 does not have to be part of the produced data for LRO23. Fused view LRO12 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 2 (402D), which can be used as the L23 part of LRO23. Hence, from LRO23, only the RO23 and GV23 parts need to be transmitted. LRO23 does not fully contribute to the bit rate and bandwidth required for transmission, as L23 does not need to be transmitted. This contributes significantly to the bandwidth savings.

Note that in addition to only needing to transmit one of the two views (only RO23), since the non-transmitted view (L23) contains the left (L) and both (B) pixels, the view that is transmitted (RO23) contains only the right occlusion and optional padding (RO) pixels, which is generally a smaller amount of data than the non-transmitted view (L23).
After generating LRO23, the method repeats, decoding LRO23 to generate decoded view 3 (403D). Decoded view 3 (403D) is substantially the same as original view 3 (403).
Decoded view 3 (403D) and original view 4 (404) are used to generate LRO format LRO34 (434), similar to the above described method for LRO12. LRO34 contains a left view L34, a right occlusions view RO34, and generating-vectors GV34. Similar to the description in reference to left view L23, decoded view 3 (403D) is used for left view L34. L34, RO34, and GV34 will all be required by the MLE CODEC decoder. However, since decoded view 3 (403D) is used for left view L34, left view L34 does not have to be part of the produced data for LRO34. Data for fused view LRO23 is already available from transmitted data, and can be used by the MLE CODEC decoder to produce decoded view 3 (403D), which can be used as the L34 part of LRO34. Hence, from LRO34, only the RO34 and GV34 parts need to be transmitted. LRO34 does not fully contribute to the bit rate and bandwidth required for transmission, as L34 does not need to be transmitted.

As noted above in reference to LRO23, in addition to only needing to transmit one of the two views (only RO34), since the non-transmitted view (L34) contains the left (L) and both (B) pixels, the view that is transmitted (RO34) contains only the right occlusion and optional padding (RO) pixels, which is generally a smaller amount of data than the non-transmitted view (L34).
After generating LRO34, the method repeats as already described, decoding LRO34 to generate decoded view 4 (404D). Decoded view 4 (404D) is substantially the same as original view 4 (404). Decoded view 4 (404D) and original view 5 (405) are used to generate LRO format LRO45 (445), similar to the above described method for LRO12. LRO45 contains a left view L45, a right occlusions view RO45, and generating-vectors GV45. Similar to the description in reference to left view L23, decoded view 4 (404D) is used for left view L45. L45, RO45, and GV45 will all be required by the MLE CODEC decoder. However, since decoded view 4 (404D) is used for left view L45, left view L45 does not have to be part of the produced data for LRO45. Fused view LRO34 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 4 (404D), which can be used as the L45 part of LRO45. Hence, from LRO45, only the RO45 and GV45 parts need to be transmitted. LRO45 does not fully contribute to the bit rate and bandwidth required for transmission, as L45 does not need to be transmitted.
The original data for the current example includes 5 original views. The data produced by the encoder includes only one original view [original view 1 (401)], as left view L12, with four right occlusions views RO12, RO23, RO34, and RO45, and correspondingly only four sets of generating-vectors GV12, GV23, GV34, and GV45. It will be obvious to one skilled in the art that the views can be combined in an arbitrary order, with different combinations potentially requiring different amounts of processing power, producing different resulting compression ratios, and yielding different quality decoded images.
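The chain just described can be summarized in the following Python sketch. The pair-level encoder and decoder are passed in as callables with assumed signatures, since their internals were sketched earlier; all names are illustrative, not part of this disclosure.

```python
# Sketch of the MLE encoder chain of FIGURE 4 (names are assumptions).
# encode_pair(left, right) -> (L, RO, GV); decode_pair(L, RO, GV) -> (dl, dr).
def mle_encode(views, encode_pair, decode_pair):
    payload = []
    left = views[0]                       # original view 1 seeds the chain
    for k in range(1, len(views)):
        L, RO, GV = encode_pair(left, views[k])
        if k == 1:
            payload.append((L, RO, GV))   # first level: transmit all three parts
        else:
            payload.append((RO, GV))      # L part is re-derived at the decoder
        _, left = decode_pair(L, RO, GV)  # decoded view k seeds the next level
    return payload
```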
In general, the multiview (fused data) format is a method of storing data, the first step of the method being similar to the above-described method for generating an LRO fused data format for a first and second data set. In a case where the first data set is a first two-dimensional (2D) image, and the second data set is a second 2D image, the general method for storing data can be used for encoding data, in this case 2D images. Generating data in the multiview format can be done by an MLE CODEC encoder.
A first fused data set includes a first view, a second view, and a first set of associated generating-vectors. The first and second views are generated by combining a first set of data and a second set of data. The first view contains information associated with elements of the first set of data. In the preferred implementation of the LRO fused data format, the first view contains only information associated with elements of the first set of data. Most preferably, the first view is the first set of data. Note that this first view is not exclusive, in that the first view does not exclude information that is also associated with elements of the second set of data. The second view contains information associated with elements of the second set of data, preferably other than elements of the second set of data that are in common with corresponding elements of the first set of data, except for optional padding. In other words, the second view contains information associated with elements of the second set of data that are not in common with corresponding elements of the first set of data, except for optional padding. The first set of associated generating-vectors indicates operations to be performed on the elements of the first and second views to recover the first and second sets of data. In the above description of the LRO format, there is one set of generating-vectors for one set of views. When generating multiple fused views, each set of views has an associated set of generating-vectors. In the context of this document, the term "associated generating-vectors" generally refers to the generating-vectors associated with the two views of the LRO fused data format for which the vectors are used to generate the original (or decoded) two images.
The next step in storing data in the multiview format is generating a decoded second view using the first fused data set. Decoding can be done using the technique described above for decoding the LRO fused data format. The decoded second view is substantially the same as the second set of data.
For clarity, the next step can be thought of as generating a second fused data set. The second fused data set includes the decoded second view, a third view, and a second set of associated generating-vectors. Practically, generating a formal second fused data set is not necessary. The decoded second view has already been generated, and the decoded second view does not need to be stored or transmitted in the multiview format. A third view and a second set of associated generating-vectors need to be generated and retained (stored or transmitted). The third view is generated using the decoded second view and a third set of data. A significant feature of the MLE CODEC encoder and storing data in the multiview format is that the decoded second view is used as one of the views in the fused data set. The decoded second view is similar to the previously described first view, in that the decoded second view is not exclusive, that is, the decoded second view does not exclude information that is also associated with elements of the third set of data. The third view is generated by combining the decoded second view and a third set of data, such that the third view contains information associated with elements of the third set of data other than elements of the third set of data that are in common with corresponding elements of the decoded second view, except for optional padding. Similar to the second view in the first fused data set, the third view contains information associated with elements of the third set of data that are not in common with corresponding elements of the decoded second view, except for optional padding. The second set of associated generating-vectors indicates operations to be performed on the elements of the decoded second view and the third view to recover the second and third sets of data.
The above description is for three sets of data. If more than three sets of data are to be
encoded, the method is repeated similar to the step of generating the second fused data set. In the context of this document, when referring to more than one fused data set, the terminology of "higher-level" and "lower-level" is used. In general, the term higher-level refers to a subsequently encoded or decoded fused data set, while the term lower-level refers to a previously encoded (or decoded) fused data set. In a case where the original images are numbered 1, 2 ... N for reference during encoding (or as generated when decoding), the lowest-level image encoded (decoded) is referred to as level 1, the next image encoded (or decoded) is 2, and so forth. For example, encoding a third image uses the lower-level second image (decoded second image) to generate a third-level fused data set. Decoding a fourth image uses the previous, lower-level (third-level) fused data set to generate the decoded fourth-level image.
In general, the above-described method for generating multiview fused data can be repeated to generate a higher-level fused data set, the higher-level fused data set including a higher-level decoded view from a lower-level fused data set. Based on the above description, one skilled in the art will be able to expand the currently described method for multiview MLE encoding for an arbitrary number of sets of data. As a related example, refer back to the above description of the MLE CODEC for a description using five original images.
During or after generation of all or portions of the fused data sets, portions of the fused data sets are stored. A significant feature of the multiview format, and the corresponding MLE CODEC, is that only the portions of the fused data format that are needed for decoding need to be retained. Retaining includes, but is not limited to, storing the retained data in a non-volatile memory, or in temporary storage. Temporary storage includes data that is generated for transmission, even if the data will not be retained by the generating system after transmission of the data. In general, the entire first fused data set is retained, including the first view, second view, and first set of generating-vectors. For the second and additional data sets, as one of the views (for example a left view) can be generated by the previous level's decoding, only the other view (for example a right view) and another set of generating-vectors need to be retained.
Temporary storage during encoding includes portions of the fused data format that are not retained. In particular, after generation of a first fused data set, decoded views are generated for use in generating the next level fused data set. The decoded views are not necessary for decoding the multiple views from the stored and/or transmitted data.
Depending on the specific application, storage of additional data may be desired. One example is storing additional data during encoding or decoding to improve processing.
Another example is storing one or more decoded views during testing, or periodically during operation to verify the operation, processing, and/or quality of the CODEC.
MLE Decoder Based on the LRO Format
Referring to FIGURE 5, a flow diagram of processing for an MLE CODEC decoder based on the LRO format, the non-limiting example of FIGURE 4 is continued. In the context of this document, the MLE CODEC decoder is also referred to as simply the MLE decoder. Based on this description, one skilled in the art will be able to extend this decoding method to an arbitrary number of views, including more views and fewer views.
LRO12 fused data format (412) includes transmitted data L12, RO12, and GV12. LRO12 is decoded to generate decoded view 1 (501D) and decoded view 2 (502D). This decoding is similar to the decoding described above in reference to the LRO format for S3D. As described above, in general decoded view <N> is substantially the same as original view <N>.
Note that decoded view 1 (501D) can be extracted from LRO12 as the L12 part. An alternative drawing could represent item 501D as being extracted from LRO12 via arrow 500, and item 501D not being striped. For consistency, the striped notation for all decoded views is maintained in the figures.
LRO23 (523) is decoded to generate decoded view 3 (503D). The format for LRO23 contains a left view L23, a right occlusions view RO23, and generating-vectors GV23. A significant feature of the method of this decoding is that decoded view 2 (502D) is used for left view L23. Since decoded view 2 (502D) is used for left view L23, the MLE CODEC decoder does not have to receive L23 as part of the data transmission. As described above, since L23 is not needed by the decoder, L23 is not produced or transmitted as part of LRO23. RO23 and GV23 are transmitted as part of the multiview transmission to the decoder. RO23 and GV23 are used with the generated decoded view 2 (502D), which is L23, to form LRO23. LRO23 is then decoded to generate decoded view 3 (503D).
After generating decoded view 3 (503D), the method repeats, using decoded view 3 (503D) as left view L34 of LRO34 (534). Data received by the MLE CODEC decoder for right occlusions view RO34 and generating-vectors GV34 completes the LRO34 fused data format. LRO34 is then decoded to generate decoded view 4 (504D).
Similarly, after generating decoded view 4 (504D), the method repeats, using decoded view 4 (504D) as left view L45 of LRO45 (545). Data received by the MLE CODEC decoder for right occlusions view RO45 and generating-vectors GV45 completes the LRO45 fused data format. LRO45 is then decoded to generate decoded view 5 (505D).
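The corresponding decoder chain, consuming the payload produced by the encoder sketch above (names again assumed for illustration):

```python
# Sketch of the MLE decoder chain of FIGURE 5 (names are assumptions).
# decode_pair(L, RO, GV) -> (decoded_left, decoded_right).
def mle_decode(payload, decode_pair):
    L, RO, GV = payload[0]                # first level arrives in full
    d1, d2 = decode_pair(L, RO, GV)
    views = [d1, d2]
    for RO, GV in payload[1:]:
        left = views[-1]                  # previous decoded view is the L part
        _, nxt = decode_pair(left, RO, GV)
        views.append(nxt)
    return views
```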
Note that the MLE CODEC decoder is a subset of the MLE CODEC encoder. As such, savings can be achieved in hardware and/or software implementation by a careful reuse of modules, specifically by implementing the encoder to allow for dual-use as a decoder. In general, the first step of a method for decoding the multiview (fused data) format is similar to the above-described method for decoding an LRO fused data format. In a case where the encoded data are two-dimensional (2D) images, the general method for decoding
data can be used for decoding 2D images. To avoid confusion between the similar terms "fused data sets" and "data sets" in the description below, data sets are referred to as 2D images. Decoding data in the multiview format can be done by an MLE CODEC decoder.
A first fused data set includes a first view, a second view, and a first set of associated generating-vectors. The first and second views contain information associated with elements
of a first 2D image and a second 2D image such that the first view contains information
associated with elements of the first 2D image, and the second view contains information associated with elements of the second 2D image other than elements of the second 2D image that are in common with corresponding elements of the first 2D image, except for optional padding pixels. The first set of associated generating-vectors indicates operations to be performed on elements of the first and second views to render the first and second 2D
images.
At least a decoded second view is rendered using the first fused data set. The decoded second view is substantially the same as the second 2D image. Typically, a decoded first view is also rendered using the first fused data set. The decoded first view is substantially the same as the first 2D image.
A third view and a second set of associated generating-vectors are provided, which in combination with the decoded second view are used to render a decoded third view. The decoded second view, third view, and second set of associated generating-vectors are portions of a second fused data set. As the decoded second view has been rendered from the previously provided first fused data set, only the third view and a second set of associated generating-vectors need to be provided to generate the decoded third view. The third view contains information associated with elements of a third 2D image other than elements of the third 2D image that are in common with corresponding elements of the second 2D image, except for optional padding pixels. The second set of associated generating-vectors indicates operations to be performed on elements of the decoded second view and the third view to render the decoded third view. The decoded third view is substantially the same as the third 2D image. In general, the above-described method for decoding multiview fused data can be repeated to decode a higher-level fused data set, the higher-level fused data set including a higher-level decoded view from a lower-level fused data set. Based on the above description, one skilled in the art will be able to expand the currently described method for multiview MLE decoding for an arbitrary number of sets of data. As a related example, refer back to the above description of the MLE CODEC for a description using five original images.
MLE Encoder Based on the RLO Format
Referring to FIGURE 6, a flow diagram of processing for an MLE CODEC encoder based on the RLO format, a non-limiting example of 5 views is encoded. The current
example of an RLO encoder is similar to the above-described non-limiting example of an LRO encoder. 5 original views, original view 1 (601), original view 2 (602), original view 3 (603), original view 4 (604), and original view 5 (605), are encoded to produce a smaller amount of data, as compared to the amount of data in the original views. Original view 1 (601) and original view 2 (602) are used to generate RLO format for views 1 and 2, RLO12 (612), similar to the above described method for S3D with original view 1 (601) used as the left view and original view 2 (602) used as the right view. RLO12 contains a right view R12 [the part built from the right (R) and both (B) pixels], a left occlusions view LO12 [the part built from the left (L) pixels], and generating-vectors RGV12 (generating-vectors to regenerate the original left and right views). For clarity, references to the generating-vectors of the LRO format are of the form <GVnn>, while generating-vectors of the RLO format are of the form <RGVnn>. Note that similar to the preferred embodiment described above in reference to the LRO format, as all of the pixels for the original view 2 (602) are in the right view (R12) and only pixels for original view 1 (601) are in the left occlusions view (LO12), except for optional padding, an alternative notation could be used in which R12 is R2 and LO12 is L1. However, for consistency and compatibility, the R12 and LO12 notation is maintained in this document.
R12, LO12, and RGV12 will all be required by the MLE CODEC decoder, so are part of the produced data, and fully contribute to the bit rate and bandwidth required for transmission. The original views (in this case, original view 1 (601) and original view 2 (602)) do not need to be transmitted. For clarity in the figures, items that do not need to be transmitted are striped, while items to be transmitted are not filled-in. After generating RLO12, RLO12 is decoded to generate decoded view 2 (602D). While theoretically decoded view 2 (602D) can be the same as original view 2 (602), depending on the application and encoding parameters chosen, decoded view 2 (602D) and original view 2 (602) may be more or less similar. In other words, the quality of decoded view 2 (602D) as compared to original view 2 (602) may be substantially the same, or lower. In general, the two views, decoded view 2 (602D) and original view 2 (602), are substantially the same, meaning that for a given application the differences between the two views are below a given threshold.
Decoded view 2 (602D) and original view 3 (603) are used to generate RLO format RLO23 (623), similar to the above described method for RLO12. RLO23 contains a right view R23, a left occlusions view LO23, and generating-vectors RGV23. A significant feature of the method of this encoding is that decoded view 2 (602D) is used for right view R23. R23, LO23, and RGV23 will all be required by the MLE CODEC decoder. However, since decoded view 2 (602D) is used for right view R23, right view R23 does not have to be part of the produced data for RLO23. Fused view RLO12 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 2 (602D), which can be used as the R23 part of RLO23. Hence, from RLO23, only the LO23 and RGV23 parts need to be transmitted. RLO23 does not fully contribute to the bit rate and bandwidth required for transmission, as R23 does not need to be transmitted. This contributes significantly to the bandwidth savings.
Note that in addition to only needing to transmit one of the two views (only LO23), since the non-transmitted view (R23) contains the right (R) and both (B) pixels, the view that is transmitted (LO23) contains only the left occlusion (LO) pixels and optional padding pixels, which is generally a smaller amount of data than the non-transmitted view (R23).
After generating RLO23, the method repeats, decoding RLO23 to generate decoded view 3 (603D). Decoded view 3 (603D) is substantially the same as original view 3 (603). Decoded view 3 (603D) and original view 4 (604) are used to generate RLO format RLO34 (634), similar to the above described method for RLO12. RLO34 contains a right view R34, a left occlusions view LO34, and generating-vectors RGV34. Similar to the description in reference to right view R23, decoded view 3 (603D) is used for right view R34. R34, LO34, and RGV34 will all be required by the MLE CODEC decoder. However, since decoded view 3 (603D) is used for right view R34, right view R34 does not have to be part of the produced data for RLO34. Data for fused view RLO23 is already available, and can be used by the MLE CODEC decoder to produce decoded view 3 (603D), which can be used as the R34 part of RLO34. Hence, from RLO34, only the LO34 and RGV34 parts need to be transmitted. RLO34 does not fully contribute to the bit rate and bandwidth required for transmission, as R34 does not need to be transmitted.
As noted above in reference to RLO23, in addition to only needing to transmit one of the two views (only LO34), since the non-transmitted view (R34) contains the right (R) and both (B) pixels, the view that is transmitted (LO34) contains only the left occlusion (LO) pixels and optional padding pixels, which is generally a smaller amount of data than the non-transmitted view (R34).
After generating RLO34, the method repeats as already described, decoding RLO34 to generate decoded view 4 (604D). Decoded view 4 (604D) is substantially the same as original view 4 (604). Decoded view 4 (604D) and original view 5 (605) are used to generate RLO format RLO45 (645), similar to the above described method for RLO12. RLO45 contains a right view R45, a left occlusions view LO45, and generating-vectors RGV45. Similar to the description in reference to right view R23, decoded view 4 (604D) is used for right view R45. R45, LO45, and RGV45 will all be required by the MLE CODEC decoder. However, since decoded view 4 (604D) is used for right view R45, right view R45 does not have to be part of the produced data for RLO45. Fused view RLO34 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 4 (604D), which can be used as the R45 part of RLO45. Hence, from RLO45, only the LO45 and RGV45 parts need to be transmitted. RLO45 does not fully contribute to the bit rate and bandwidth required for transmission, as R45 does not need to be transmitted.
The original data for the current example includes 5 original views. The data produced by the encoder includes only one original view [original view 2 (602)], as right view R12, with four left occlusions views LO12, LO23, LO34, and LO45, and correspondingly only four sets of generating-vectors RGV12, RGV23, RGV34, and RGV45.
MLE Decoder Based on the RLO Format
Referring to FIGURE 7, a flow diagram of processing for an MLE CODEC decoder based on the RLO format, the non-limiting example of FIGURE 6 is continued. The current example of an RLO decoder is similar to the above-described non-limiting example of an LRO decoder.
RLO12 fused data format (612) includes transmitted data R12, LO12, and RGV12. RLO12 is decoded to generate decoded view 1 (701D) and decoded view 2 (702D). This decoding is similar to the decoding described above in reference to the LRO format for S3D. As described above, in general decoded view <N> is substantially the same as original view <N>.
Note that decoded view 2 (702D) can be extracted from RLO12 as the R12 part. An alternative drawing could represent item 702D as being extracted from RLO12 via arrow 700, and item 702D not being striped. For consistency, the striped notation for all decoded views is maintained in the figures.
RLO23 (723) is decoded to generate decoded view 3 (703D). The format for RLO23 contains a right view R23, a left occlusions view LO23, and generating-vectors RGV23. A significant feature of the method of this decoding is that decoded view 2 (702D) is used for right view R23. Since decoded view 2 (702D) is used for right view R23, the MLE CODEC decoder does not have to receive R23 as part of the data transmission. As described above, since R23 is not needed by the decoder, R23 is not produced or transmitted as part of RLO23. LO23 and RGV23 are transmitted as part of the multiview transmission to the decoder. LO23 and RGV23 are used with the generated decoded view 2 (702D), which is R23, to form RLO23. RLO23 is then decoded to generate decoded view 3 (703D).
After generating decoded view 3 (703D), the method repeats, using decoded view 3 (703D) as right view R34 of RLO34 (734). Data received by the MLE CODEC decoder for left occlusions view LO34 and generating-vectors RGV34 completes the RLO34 fused data format. RLO34 is then decoded to generate decoded view 4 (704D).
Similarly, after generating decoded view 4 (704D), the method repeats, using decoded view 4 (704D) as right view R45 of RLO45 (745). Data received by the MLE CODEC decoder for left occlusions view LO45 and generating-vectors RGV45 completes the RLO45 fused data format. RLO45 is then decoded to generate decoded view 5 (705D).
Note that the MLE CODEC decoder is a subset of the MLE CODEC encoder. As such, savings can be achieved in hardware and/or software implementation by a careful reuse of modules, specifically by implementing the encoder to allow for dual-use as a decoder.

MLE CODEC Encoder Based on a Combination of the LRO and RLO Formats
Due to processing realities, including errors and/or artifacts appearing in a decoded view, the decoded views are normally substantially the same as the original images, but typically not exactly the same. The errors in the decoded views affect the quality of subsequent decoded views. In the above-described MLE encoders based on only either the LRO format or the RLO format, there is typically a decrease in the quality of the decoded images as the CODEC progresses from lower to higher levels. A greater number of processing levels may result in a greater decrease in the quality of the decoded image for higher levels, as compared to the original image.
Variations of the MLE CODEC can be specified using a combination or mixture of LRO and RLO formats. Using a combination of formats can reduce the number of processing levels required to encode a given number of original images, thereby increasing the quality of the decoded images. Referring to FIGURE 8, a flow diagram of processing for an MLE CODEC encoder based on a combination of the LRO and RLO formats, a non-limiting example of 5 views is encoded. While the LRO format CODEC described in reference to FIGURE 4 has four processing levels (to generate LRO12, LRO23, LRO34, LRO45), the combination CODEC described below in reference to FIGURE 8 has only two processing levels on the LRO pipeline branch (to generate LRO34 and LRO45) and three processing levels (to generate LRO34, RLO32, and RLO21) on the branch for the RLO pipeline. Note that in this case, the root of both the LRO and RLO branches is the same (LRO34), and only has to be generated once. Due to fewer processing levels in a branch when using a combination of formats, the resulting quality of decoded images can be better than the quality of decoded images when using a single format.
One skilled in the art will realize that processing of the LRO and RLO branches of processing levels can be implemented in parallel, in series, or in a combination of processing orders. Depending on the application, more than one root and more than two branches can also be used. In the description below, the LRO branch is described first.
The current example of a combination MLE CODEC using an LRO and RLO encoder is similar to the above-described non-limiting examples of LRO and RLO encoders. Based on this description, one skilled in the art will be able to extend this encoding method to an arbitrary number of views, including more views and fewer views, starting with a view appropriate for a specific application. 5 original views, original view 1 (401), original view 2 (402), original view 3 (403), original view 4 (404), and original view 5 (405), are encoded to produce a smaller amount of data, as compared to the amount of data in the original views. Original view 3 (403) and original view 4 (404) are used to generate LRO format for views 3 and 4, LRO34 (834), similar to the above described method for S3D with original view 3 (403) used as the left view and original view 4 (404) used as the right view. LRO34 contains a left view L34 [the part built from the left (L) and both (B) pixels], a right occlusions view RO34 [the part built from the right (R) pixels and optional padding pixels], and generating-vectors GV34 (generating-vectors to regenerate the original left and right views). Note that in the preferred embodiment described above in reference to the LRO format, all of the pixels for original view 3 (403) are in the left view (L34) and only pixels for original view 4 (404) are in the right occlusions view (RO34).
L34, RO34, and GV34 will all be required by the MLE CODEC decoder, so are part of the produced data, and fully contribute to the bit rate and bandwidth required for transmission. The original views (in this case, original view 3 (403) and original view 4 (404)) do not need to be transmitted. For clarity in the figures, items that do not need to be transmitted are striped, while items to be transmitted are not filled-in.
After generating LRO34, LRO34 is decoded to generate decoded view 4 (804D) and decoded view 3 (803D). While theoretically decoded view 4 (804D) and decoded view 3 (803D) can be the same as original view 4 (404) and original view 3 (403), respectively, depending on the application and encoding parameters chosen, the decoded and respective original views may be more or less similar.
Decoded view 4 (804D) and original view 5 (405) are used to generate LRO format LR045 (845), similar to the above described method for LR034. LR045 contains a left view L45, a right occlusions view R045, and generating-vectors GV45. A significant feature of the method of this encoding is that decoded view 4 (804D) is used for left view L45. L45, R045, and GV45 will all be required by the MLE CODEC decoder. However, since decoded view 4 (804D) is used for left view L45, left view L45 does not have to be part of the produced data for LR045. Fused view LR034 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 4 (804D) which can be used as the L45 part of LR045. Hence, from LR045, only the R045 and GV45 parts need to be transmitted. LR045 does not fully contribute to the bit rate and bandwidth required for transmission, as L45 does not need to be transmitted. This contributes significantly to the bandwidth savings.
Note that, in addition to only one of the two views needing to be transmitted (only R045), since the non-transmitted view (L45) contains the left (L) and both (B) pixels, the view that is transmitted (R045) contains only the right occluded (RO) pixels and optional padding pixels, which is generally a smaller amount of data than the non-transmitted view (L45).
Decoded view 3 (803D) and original view 2 (402) are used to generate RLO format RL032 (823), similar to the above-described method for RL012. RL032 contains a right view R32, a left occlusions view L032, and generating-vectors RGV32. A significant feature of the method of this encoding is that decoded view 3 (803D) is used for right view R32. R32, L032, and RGV32 will all be required by the MLE CODEC decoder. However, since decoded view 3 (803D) is used for right view R32, right view R32 does not have to be part of the produced data for RL032. Fused view LR034 is already transmitted, and can be used by the MLE CODEC decoder to produce decoded view 3 (803D), which can be used as the R32 part of RL032. Hence, from RL032, only the L032 and RGV32 parts need to be transmitted. RL032 does not fully contribute to the bit rate and bandwidth required for transmission, as R32 does not need to be transmitted. This contributes significantly to the bandwidth savings.
After generating RL032 (823), the method repeats, decoding RL032 to generate decoded view 2 (802D). Decoded view 2 (802D) is substantially the same as original view 2 (402). Decoded view 2 (802D) and original view 1 (401) are used to generate RLO format RL021 (821), similar to the above-described method for RL032. RL021 contains a right view R21, a left occlusions view L021, and generating-vectors RGV21. Similar to the description in reference to right view R32, decoded view 2 (802D) is used for right view R21. R21, L021, and RGV21 will all be required by the MLE CODEC decoder. However, since decoded view 2 (802D) is used for right view R21, right view R21 does not have to be part of the produced data for RL021. Data for fused view RL032 is already available, and can be used by the MLE CODEC decoder to produce decoded view 2 (802D), which can be used as the R21 part of RL021. Hence, from RL021, only the L021 and RGV21 parts need to be transmitted. RL021 does not fully contribute to the bit rate and bandwidth required for transmission, as R21 does not need to be transmitted.
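Putting the two branches together, the combination encoder of FIGURE 8 can be outlined as follows. This is a non-limiting sketch only, assuming that `fuse(a, b)` returns a dict with "full_view", "occlusions", and "vectors" parts and that `unfuse` recovers the two views; neither stands for the actual routines of the format.

```python
def encode_combination(v1, v2, v3, v4, v5, fuse, unfuse):
    stream = []

    # Shared root of both branches: generated once, fully transmitted.
    lr034 = fuse(v3, v4)
    stream.append(lr034)
    d3, d4 = unfuse(lr034)                    # decoded views 3 and 4

    # LRO branch: decoded view 4 serves as L45, so only the occlusions
    # (R045) and generating-vectors (GV45) of LR045 are transmitted.
    lr045 = fuse(d4, v5)
    stream.append({k: lr045[k] for k in ("occlusions", "vectors")})

    # RLO branch, first level: decoded view 3 serves as R32.
    rl032 = fuse(d3, v2)
    stream.append({k: rl032[k] for k in ("occlusions", "vectors")})
    _, d2 = unfuse(rl032)                     # decoded view 2

    # RLO branch, second level: decoded view 2 serves as R21.
    rl021 = fuse(d2, v1)
    stream.append({k: rl021[k] for k in ("occlusions", "vectors")})
    return stream                             # transmitted parts, four levels
```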
MLE CODEC Decoder Based on a Combination of the LRO and RLO Formats
Referring to FIGURE 9, a flow diagram of processing for an MLE CODEC decoder based on a combination of the LRO and RLO formats, the non-limiting example of FIGURE 8 is continued. When decoding the combination format, the MLE CODEC decoder does not have to decode all branches. Only the branches necessary for providing the desired images need to be decoded. In a non-limiting example, if a user wants to see what is happening in the direction of original images 3 (403), 4 (404), and 5 (405), only the left branch (LRO encoded data) needs to be decoded to provide the desired images.
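A sketch of this selective decoding, continuing the dict convention of the encoder sketch above (again an assumption, not the actual interface): only the branch covering the requested views is unfused.

```python
def decode_for_views(stream, wanted, unfuse):
    lr034, lr045, rl032, rl021 = stream       # as built by encode_combination
    d3, d4 = unfuse(lr034)                    # the shared root is always decoded
    out = {3: d3, 4: d4}

    if 5 in wanted:                           # LRO branch only
        _, out[5] = unfuse({**lr045, "full_view": d4})
    if 1 in wanted or 2 in wanted:            # RLO branch only
        _, out[2] = unfuse({**rl032, "full_view": d3})
        if 1 in wanted:
            _, out[1] = unfuse({**rl021, "full_view": out[2]})
    return {v: out[v] for v in wanted}
```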
LR034 fused data format (834) includes transmitted data L34, R034, and GV34. LR034 is decoded to generate decoded view 4 (904D) and decoded view 3 (903D). This decoding is similar to the decoding described above in reference to the LRO format for S3D. As described above, in general decoded view <N> is substantially the same as original view <N>.
LR045 (945) is decoded to generate decoded view 5 (905D). The format for LR045 contains a left view L45, a right occlusions view R045, and generating-vectors GV45. A significant feature of the method of this decoding is that decoded view 4 (904D) is used for left view L45. Since decoded view 4 (904D) is used for left view L45, the MLE CODEC decoder does not have to receive L45 as part of the data transmission. As described above, since L45 is not needed by the decoder, L45 is not produced or transmitted as part of LR045. R045 and GV45 are transmitted as part of the multiview transmission to the decoder. R045 and GV45 are used with the generated decoded view 4 (904D), which serves as L45, to form LR045. LR045 is then decoded to generate decoded view 5 (905D).
For the right branch (RLO) pipeline of the combination MLE CODEC decoder, LR034 fused data format (834) has already been decoded to generate decoded view 3 (903D). Additional data for the left occlusions view L032 and generating-vectors RGV32 is received with the combination format data, along with decoded view 3 (903D) as the R32 portion of RL032 (932). A significant feature of the method of this decoding is that decoded view 3 (903D) is used for right view R32. Since decoded view 3 (903D) is used for right view R32, the MLE CODEC decoder does not have to receive R32 as part of the data transmission. As described above, since R32 is not needed by the decoder, R32 is not produced or transmitted as part of RL032. L032 and RGV32 are transmitted as part of the multiview transmission to the decoder. L032 and RGV32 are used with the generated decoded view 3 (903D), which serves as R32, to form RL032. RL032 is then decoded to generate decoded view 2 (902D).
After generating decoded view 2 (902D), the method repeats, using decoded view 2 (902D) as right view R21 of RL021 (921). Data received by the MLE CODEC decoder for left occlusions view L021 and generating-vectors RGV21 completes the RL021 fused data format. RL021 is then decoded to generate decoded view 1 (901D).
Referring to FIGURE 10, a diagram of a system for LRO and MULTIVIEW encoding, this system can also be used for LRO and MULTIVIEW decoding. System 1000 includes a variety of processing modules, depending on the specific encoding and/or decoding required by the application. The high-level block diagram of system 1000 of the present embodiment includes a processor 1002, a transceiver module 1010, and optional memory devices: a RAM 1004, a boot ROM 1006, and a nonvolatile memory 1008, all communicating via a common bus 1012. Typically, the components of system 1000 are deployed in a host 1020.
Transceiver module 1010 can be configured to receive and/or send data for encoding and/or decoding. When the transceiver module is used to receive data, the transceiver module functions as a data-receiving module.
For clarity, a limited number of elements from the accompanying figures are referenced in the current description. Based on this description, one skilled in the art will realize which other elements are being referred to, and will be able to extend this description for implementation with a specific application. Referring back to FIGURE 3 and FIGURE 4, received data for LRO encoding can include a first set of data (original view 1, 401) and a second set of data (original view 2, 402). Referring back to FIGURE 5, received data for LRO decoding can include a first view (L12), a second view (R012), and generating-vectors (GV12) associated with the first and second views. Referring back to FIGURE 4, in the case where the system 1000 is configured for a multiview CODEC, received data for encoding can include a first set of data (original view 1, 401), a second set of data (original view 2, 402), and a third set of data (original view 3, 403). Referring back to FIGURE 5, received data for multiview decoding can include a first fused data set (LR012), a third view (R023), and a second set of associated generating-vectors (GV23).
The results of LRO and multiview encoding and decoding can be transmitted via transceiver module 1010, stored in volatile memory, such as RAM 1004, and/or stored in nonvolatile memory 1008. RAM 1004 and nonvolatile memory 1008 can be configured as a storage module for data. Stored or transmitted data from LRO encoding includes the LRO fused view of FIGURE 3A, the RLO fused view of FIGURE 3B, and the (LR012) fused view 412 of FIGURE 4 (which includes first view L12, second view R012, and associated generating-vectors GV12). Data from LRO decoding includes a first set of data (decoded view 1, 501D) and a second set of data (decoded view 2, 502D). Referring again to FIGURE 4, data from multiview encoding includes the first fused data set (LR012, that is, 412), the third view (R023), and the second set of associated generating-vectors (GV23). Referring again to FIGURE 5, data from multiview decoding includes a decoded first view (501D), a decoded second view (502D), and a decoded third view (503D). Obviously, data can be received or transmitted as two sets of data in a single file, as two or more files, or in other configurations as appropriate to the application.
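As a minimal, non-limiting illustration of storing the produced parts in association with each other, one could serialize all levels to a single file. The use of pickle and the file layout below are purely illustrative assumptions:

```python
import pickle

def store_stream(path, stream):
    # One file holding the transmitted parts of every level; two files,
    # one file per level, or any other grouping would serve equally well.
    with open(path, "wb") as f:
        pickle.dump(stream, f)

def load_stream(path):
    with open(path, "rb") as f:
        return pickle.load(f)
```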
Nonvolatile memory 1008 is an example of a computer-readable storage medium bearing computer-readable code for implementing the data encoding and decoding methodologies described in the current document. Other examples of such computer-readable storage media include read-only memories such as CDs bearing such code.
The computer-readable code can include program code for one or more of the following: encoding data in the LRO format, decoding data from the LRO format, encoding data in the multiview format, and decoding data from the multiview format.
In FIGURES 4 to 9, arrows between data generally represent processing modules, or processing which can be implemented on a processing system that includes one or more processors, such as processor 1002. For clarity in the diagrams, not all of the arrows are labeled. The following is an exemplary description and mapping of some of the LRO and multiview CODEC processing modules: In FIGURE 4, arrow 490 represents a processing module, typically implemented as processing on a processor, such as processor 1002. Arrow 490, also referred to in this description as processing module 490, is an encoding process, which generates a fused view (LR012/412) from two sets of data (original view 1, 401 and original view 2, 402). As described above, the fused view encoding process, arrow 490, is similar for LRO encoding or as a step in multiview encoding. Arrow 492, also referred to in this description as processing module 492, is a decoding process, which generates one or more decoded views, such as decoded view 2 (402D), from a fused view (LR012/412). As described above, the fused view decoding process, arrow 492, is similar for LRO decoding or as a step in multiview decoding.
In FIGURE 5, arrow 592, also referred to in this description as processing module 592, is a decoding process, which generates one or more decoded views, such as decoded view 1 (501D) and decoded view 2 (502D), from a fused view (LR012/412). As described above, the fused view decoding process, arrow 592, is similar for LRO decoding or as a step in multiview decoding.
As will be obvious to one skilled in the art, the decoding processes of arrow 492 and arrow 592 are similar, and the same processing module can be used for each. As such, the MLE CODEC decoder processing (arrow 492) is a subset of the processing needed for the MLE CODEC encoder processing (including arrow 490 and arrow 492). Savings can therefore be achieved in hardware, firmware, and/or software implementation by careful re-use of modules, specifically by implementing the encoder to allow the decoding processing that is part of the encoder portion of the CODEC to also be used for the decoding processing needed for the decoder portion of the CODEC.
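A sketch of this dual use, under the same placeholder conventions as the sketches above: the encoder owns the very decode routine the decoder half uses, so the fused-view decoding logic exists only once.

```python
class MleCodec:
    def __init__(self, fuse, unfuse):
        self._fuse = fuse        # encoding step (arrow 490)
        self._unfuse = unfuse    # decoding step, shared (arrows 492/592/594)

    def decode(self, fused, reference=None):
        # The decoder half; also called from encode() below.
        if reference is not None:
            fused = {**fused, "full_view": reference}
        return self._unfuse(fused)

    def encode(self, reference_decoded, original):
        fused = self._fuse(reference_decoded, original)
        # Re-use the decoder path to produce the reference for the next
        # level; no separate decode implementation is needed in the encoder.
        _, decoded = self.decode(fused)
        return fused, decoded
```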
To continue the current example for further clarity, refer again to FIGURE 5. Arrow 594, also referred to in this description as processing module 594, is a decoding process, which generates one or more decoded views, such as decoded view 3 (503D). While processing module 594 can also generate decoded view 2 (502D), this is not necessary, as decoded view 2 (502D) has already been generated as part of the previous level's processing. The decoding process of arrow 594 is similar to arrow 592 and arrow 492, and is preferably implemented as the same processing module.
In general, in cases where the input data, including sets of data, original images, and fused data sets, is encoded, the encoding can be removed prior to processing. Encoded input data includes data encoded in H.264, MPEG4, or any other format, as applicable for the application. Processing includes LRO encoding, LRO decoding, multiview encoding, and multiview decoding. The output data, including sets of data and decoded views, can be encoded in H.264, MPEG4, or any other format, as applicable for the application.
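Schematically, such a transcoding wrapper could look like the following; `outer_decode` and `outer_encode` stand for whatever H.264/MPEG4 codec bindings the application uses, and are assumptions here rather than references to a specific library:

```python
def transcode(wrapped_inputs, outer_decode, process, outer_encode):
    # Strip the outer encoding (e.g. H.264 or MPEG4), run the LRO or
    # multiview processing on the raw views, then re-wrap the results.
    raw_views = [outer_decode(b) for b in wrapped_inputs]
    results = process(raw_views)
    return [outer_encode(r) for r in results]
```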
Note that a variety of implementations for modules and processing are possible, depending on the application. Modules can be implemented in software, but can also be implemented in hardware and firmware, on a single processor or distributed processors, at one or more locations. The above-described module functions can be combined and implemented as fewer modules, or separated into sub-functions and implemented as a larger number of modules. Based on the above description, one skilled in the art will be able to design an implementation for a specific application.
The use of simplified calculations to assist in the description of this embodiment should not detract from the utility and basic advantages of the invention.
It should be noted that the above-described examples, numbers used, and exemplary calculations are to assist in the description of this embodiment. Inadvertent typographical and mathematical errors should not detract from the utility and basic advantages of the invention.
It will be appreciated that the above descriptions are intended only to serve as examples, and that many other embodiments are possible within the scope of the present invention as defined in the appended claims.

Claims

WHAT IS CLAIMED IS:
1. A method for encoding data comprising the steps of:
(a) receiving a first set of data;
(b) receiving a second set of data; and
(c) generating a first view, a second view, and associated generating-vectors;
wherein said first and second views are generated by combining the first and second sets of data, such that said first view contains information associated with elements of the first set of data, said second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and said associated generating-vectors indicate operations to be performed on the elements of said first and second views to recover the first and second sets of data.
2. The method of claim 1 wherein said first view is the first set of data.
3. The method of claim 1 wherein said first view includes elements that are common to the first and second sets of data.
4. The method of claim 1 wherein said second view only contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
5. The method of claim 1 wherein said second view contains additional information, said additional information other than information only associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
6. The method of claim 1 wherein said second view includes elements of the first and second sets of data that are only in the second set of data.
7. The method of claim 1 wherein said first set of data is a first two-dimensional (2D) image of a scene from a first viewing angle, and said second set of data is a second 2D image of the scene from a second viewing angle.
8. The method of claim 1 wherein the data is in H.264 format.
9. The method of claim 1 wherein the data is in MPEG4 format.
10. The method of claim 1 further comprising the step of:
(d) storing said first view, said second view, and said associated generating-vectors in association with each other.
11. A method for decoding data comprising the steps of:
(a) receiving a first view and a second view, said first and second views containing information associated with elements of a first set of data and a second set of data such that said first view contains information associated with elements of said first set of data, and said second view contains information associated with elements of said second set of data other than elements of said second set of data that are in common with corresponding elements of said first set of data;
(b) receiving generating-vectors associated with said first and second views, said generating-vectors indicating operations to be performed on elements of said first and second views to generate said first and second sets of data; and
(c) generating, using said first view, said second view, and said
generating-vectors, at least said first set of data.
12. The method of claim 11 wherein said first view is said first set of data.
13. The method of claim 11 wherein said first view includes elements that are common to the first and second sets of data.
14. The method of claim 11 wherein said second view only contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
15. The method of claim 11 wherein said second view contains additional information, said additional information other than information only associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
16. The method of claim 11 comprising the step of:
(d) generating, using said first view, said second view, and said
generating-vectors, said second set of data.
17. The method of claim 11 wherein said first set of data is a first two-dimensional (2D) image of a scene from a first viewing angle, and said second set of data is a second 2D image of the scene from a second viewing angle.
18. A system for encoding data comprising:
(a) a data-receiving module configured to receive at least a first set of data and a second set of data; and
(b) a processing system containing one or more processors, said
processing system being configured to generate a first view, a second view, and associated generating-vectors,
wherein said first and second views are generated by combining the first and second sets of data, such that said first view contains information associated with elements of the first set of data, said second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and said associated generating-vectors indicate operations to be performed on the elements of said first and second views to recover the first and second sets of data.
19. The system of claim 18 including:
(c) a storage module configured to store said first view, said second view, and said associated generating-vectors in association with each other.
20. A system for decoding data comprising:
(a) a data-receiving module configured to receive at least:
(i) a first view and a second view, said first and second views containing information associated with elements of a first set of data and a second set of data such that said first view contains information associated with elements of said first set of data, and said second view contains information associated with elements of said second set of data other than elements of said second set of data that are in common with corresponding elements of said first set of data; and
(ii) generating-vectors associated with said first and second views, said generating-vectors indicating operations to be performed on elements of said first and second views to generate said first and second sets of data; and
(b) a processing system containing one or more processors, said
processing system being configured to generate, using said first view, said second view, and said generating-vectors, at least said first set of data.
21. A method for encoding data comprising the steps of:
(a) generating a first fused data set including a first view, a second view, and a first set of associated generating-vectors wherein said first and second views are generated by combining a first set of data and a second set of data, such that said first view contains information associated with elements of the first set of data, said second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and said first set of associated generating-vectors indicate operations to be performed on the elements of said first and second views to recover the first and second sets of data;
(b) generating a decoded second view using said first fused data set, said decoded second view substantially the same as the second set of data; and
(c) generating a third view, and a second set of associated generating-vectors wherein said third view is generated by combining said decoded second view and a third set of data, such that said third view contains information associated with elements of the third set of data other than elements of the third set of data that are in common with corresponding elements of said decoded second view, and said second set of associated generating-vectors indicate operations to be performed on the elements of said decoded second view and said third view to recover the second and third sets of data.
22. The method of claim 21 wherein said first view is the first set of data.
23. The method of claim 21 wherein said first view includes elements that are common to the first and second sets of data.
24. The method of claim 21 wherein said second view only contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
25. The method of claim 21 wherein said second view contains additional information, said additional information other than information only associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
26. The method of claim 21 wherein steps (b) to (c) are repeated to generate a higher-level fused data set, said higher-level fused data set including a higher-level decoded view from a lower-level fused data set.
27. The method of claim 21 wherein said first set of data is a first two-dimensional (2D) image of a scene from a first viewing angle, and said second set of data is a second 2D image of the scene from a second viewing angle.
28. The method of claim 21 further comprising the step of:
(d) storing said first fused data set, said third view, and said second set of associated generating-vectors in association with each other.
29. A method for decoding data comprising the steps of:
(a) receiving a first fused data set including a first view, a second view, and a first set of associated generating-vectors, said first and second views containing information associated with elements of a first set of data and a second set of data such that said first view contains information associated with elements of the first set of data, and said second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and said first set of associated generating-vectors indicating operations to be performed on elements of said first and second views to render the first and second sets of data;
(b) generating at least a decoded second view using said first fused data set, said decoded second view substantially the same as the second set of data; and
(c) generating a decoded third view using a second fused data set, said second fused data set including said decoded second view, a third view and a second set of associated generating-vectors, wherein said third view contains information associated with elements of a third set of data other than elements of the third set of data that are in common with corresponding elements of the second set of data, said second set of associated generating-vectors indicating operations to be performed on elements of said decoded second view and said third view to render said decoded third view, said decoded third view substantially the same as said third set of data.
30. The method of claim 29 wherein said first view is said first set of data.
31. The method of claim 29 wherein said first view includes elements that are common to the first and second sets of data.
32. The method of claim 29 wherein said second view only contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
33. The method of claim 29 wherein said second view contains additional information, said additional information other than information only associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data.
34. The method of claim 29 wherein step (c) of generating is repeated to generate a higher-level decoded view using a higher-level fused data set, said higher- level fused data set including a decoded view from a lower-level fused data set.
35. The method of claim 29 further comprising the step of:
(d) generating a decoded first view using said first fused data set, said decoded first view substantially the same as the first set of data.
36. A system for encoding data comprising:
(a) a data-receiving module configured to receive at least a first set of data, a second set of data, and a third set of data; and
(b) a processing system containing one or more processors, said
processing system being configured to:
(i) generate a first fused data set including a first view, a second view, and a first set of associated generating-vectors wherein said first and second views are generated by combining a first set of data and a second set of data, such that said first view contains information associated with elements of the first set of data, said second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and said first set of associated generating-vectors indicate operations to be performed on the elements of said first and second views to recover the first and second sets of data;
(ii) generate a decoded second view using said first fused data set, said decoded second view substantially the same as the second set of data; and
(iii) generate a third view, and a second set of associated generating-vectors wherein said third view is generated by combining said decoded second view and a third set of data, such that said third view contains information associated with elements of the third set of data other than elements of the third set of data that are in common with corresponding elements of said decoded second view, and said second set of associated generating-vectors indicate operations to be performed on the elements of said decoded second view and said third view to recover the second and third sets of data.
37. The system of claim 36 including:
(c) a storage module configured to store said first fused data set, said third view, and said second set of associated generating-vectors in association with each other.
38. A system for decoding data comprising:
(a) a data-receiving module configured to receive at least a first fused data set including a first view, a second view, and a first set of associated generating-vectors, said first and second views containing information associated with elements of a first set of data and a second set of data such that said first view contains information associated with elements of the first set of data, and said second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and said first set of associated generating-vectors indicating operations to be performed on elements of said first and second views to render the first and second sets of data; and
(b) a processing system containing one or more processors, said
processing system being configured to:
(i) generate at least a decoded second view using said first fused data set, said decoded second view substantially the same as the second set of data; and
(ii) generate a decoded third view using a second fused data set, said second fused data set including said decoded second view, a third view and a second set of associated generating-vectors, wherein said third view contains information associated with elements of a third set of data other than elements of the third set of data that are in common with corresponding elements of the second set of data, said second set of associated generating-vectors indicating operations to be performed on elements of said decoded second view and said third view to render said decoded third view, said decoded third view substantially the same as said third set of data.
39. A computer-readable storage medium having embedded thereon computer-readable code for encoding data, the computer-readable code comprising program code for:
(a) receiving a first set of data;
(b) receiving a second set of data; and
(c) generating a first view, a second view, and associated generating-vectors;
wherein said first and second views are generated by combining the first and second sets of data, such that said first view contains information associated with elements of the first set of data, said second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and said associated generating-vectors indicate operations to be performed on the elements of said first and second views to recover the first and second sets of data.
40. A computer-readable storage medium having embedded thereon computer-readable code for decoding data, the computer-readable code comprising program code for:
(a) receiving a first view and a second view, said first and second views containing information associated with elements of a first set of data and a second set of data such that said first view contains information associated with elements of said first set of data, and said second view contains information associated with elements of said second set of data other than elements of said second set of data that are in common with corresponding elements of said first set of data;
(b) receiving generating-vectors associated with said first and second views, said generating-vectors indicating operations to be performed on elements of said first and second views to generate said first and second sets of data; and
(c) generating, using said first view, said second view, and said
generating-vectors, at least said first set of data.
41. A computer-readable storage medium having embedded thereon computer-readable code for encoding data, the computer-readable code comprising program code for:
(a) generating a first fused data set including a first view, a second view, and a first set of associated generating-vectors wherein said first and second views are generated by combining a first set of data and a second set of data, such that said first view contains information associated with elements of the first set of data, said second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and said first set of associated generating-vectors indicate operations to be performed on the elements of said first and second views to recover the first and second sets of data;
(b) generating a decoded second view using said first fused data set, said decoded second view substantially the same as the second set of data; and
(c) generating a third view, and a second set of associated generating-vectors wherein said third view is generated by combining said decoded second view and a third set of data, such that said third view contains information associated with elements of the third set of data other than elements of the third set of data that are in common with corresponding elements of said decoded second view, and said second set of associated generating-vectors indicate operations to be performed on the elements of said decoded second view and said third view to recover the second and third sets of data.
42. A computer-readable storage medium having embedded thereon computer-readable code for decoding data, the computer-readable code comprising program code for:
(a) receiving a first fused data set including a first view, a second view, and a first set of associated generating-vectors, said first and second views containing information associated with elements of a first set of data and a second set of data such that said first view contains information associated with elements of the first set of data, and said second view contains information associated with elements of the second set of data other than elements of the second set of data that are in common with corresponding elements of the first set of data, and said first set of associated generating-vectors indicating operations to be performed on elements of said first and second views to render the first and second sets of data;
(b) generating at least a decoded second view using said first fused data set, said decoded second view substantially the same as the second set of data; and
(c) generating a decoded third view using a second fused data set, said second fused data set including said decoded second view, a third view and a second set of associated generating-vectors, wherein said third view contains information associated with elements of a third set of data other than elements of the third set of data that are in common with corresponding elements of the second set of data, said second set of associated generating-vectors indicating operations to be performed on elements of said decoded second view and said third view to render said decoded third view, said decoded third view substantially the same as said third set of data.
PCT/IL2011/000792 2010-10-06 2011-10-06 Multiview 3d compression format and algorithms WO2012046239A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/824,395 US20130250056A1 (en) 2010-10-06 2011-10-06 Multiview 3d compression format and algorithms

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US39029110P 2010-10-06 2010-10-06
US61/390,291 2010-10-06
US201161509581P 2011-07-20 2011-07-20
US61/509,581 2011-07-20

Publications (2)

Publication Number Publication Date
WO2012046239A2 true WO2012046239A2 (en) 2012-04-12
WO2012046239A3 WO2012046239A3 (en) 2013-04-11

Family

ID=45928168

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL2011/000792 WO2012046239A2 (en) 2010-10-06 2011-10-06 Multiview 3d compression format and algorithms

Country Status (2)

Country Link
US (1) US20130250056A1 (en)
WO (1) WO2012046239A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9797225B2 (en) 2013-11-27 2017-10-24 Saudi Arabian Oil Company Data compression of hydrocarbon reservoir simulation grids

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117061605B (en) * 2023-10-11 2024-02-06 杭州宇谷科技股份有限公司 Intelligent lithium battery active information pushing method and device based on end cloud cooperation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070177673A1 (en) * 2006-01-12 2007-08-02 Lg Electronics Inc. Processing multiview video
US20080130984A1 (en) * 2006-12-01 2008-06-05 Samsung Electronics Co. Ltd. Apparatus and Method for Compressing Three-Dimensional Stereoscopic Images
US7444664B2 (en) * 2004-07-27 2008-10-28 Microsoft Corp. Multi-view video format
US20090262206A1 (en) * 2008-04-16 2009-10-22 Johnson Controls Technology Company Systems and methods for providing immersive displays of video camera information from a plurality of cameras
US20100091881A1 (en) * 2006-12-21 2010-04-15 Purvin Bibhas Pandit Methods and apparatus for improved signaling using high level syntax for multi-view video coding and decoding

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5850352A (en) * 1995-03-31 1998-12-15 The Regents Of The University Of California Immersive video, including video hypermosaicing to generate from multiple video views of a scene a three-dimensional video mosaic from which diverse virtual video scene images are synthesized, including panoramic, scene interactive and stereoscopic images
US20110001792A1 (en) * 2008-03-04 2011-01-06 Purvin Bibhas Pandit Virtual reference view
KR20110007928A (en) * 2009-07-17 2011-01-25 삼성전자주식회사 Method and apparatus for encoding/decoding multi-view picture

Also Published As

Publication number Publication date
US20130250056A1 (en) 2013-09-26
WO2012046239A3 (en) 2013-04-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11830288

Country of ref document: EP

Kind code of ref document: A2

WWE Wipo information: entry into national phase

Ref document number: 13824395

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 22/07/13)

122 Ep: pct application non-entry in european phase

Ref document number: 11830288

Country of ref document: EP

Kind code of ref document: A2