WO2016003340A1 - Encoding and decoding of light fields - Google Patents

Encoding and decoding of light fields Download PDF

Info

Publication number
WO2016003340A1
Authority
WO
WIPO (PCT)
Prior art keywords
plf
scene
parameters
bitstream
model
Prior art date
Application number
PCT/SE2014/050851
Other languages
English (en)
Inventor
Julien Michot
Original Assignee
Telefonaktiebolaget L M Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget L M Ericsson (Publ) filed Critical Telefonaktiebolaget L M Ericsson (Publ)
Priority to PCT/SE2014/050851 priority Critical patent/WO2016003340A1/fr
Publication of WO2016003340A1 publication Critical patent/WO2016003340A1/fr

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/005 Statistical coding, e.g. Huffman, run length coding

Definitions

  • Embodiments presented herein relate to encoding and decoding of light fields, and particularly to methods, an electronic device, computer programs, and a computer program product for encoding and decoding of light fields.
  • 3D (three-dimensional) video and 3D TV is gaining momentum and is considered one possible logical step in consumer electronics, mobile devices, computers and cinemas.
  • the additional dimension on top of common two-dimensional (2D) video offers multiple different directions for displaying the content and improves the potential for interaction between viewers and the content.
  • 3D is usually related to stereoscopic experiences, where each one of the user's eyes is provided with a unique image of a scene. Such unique images may be provided as a stereoscopic image pair.
  • the unique images are then fused by the human brain to create a depth impression (i.e. an imagined 3D view). For example, by presenting a light field using technology that maps each sample to the appropriate ray in physical space, one obtains an auto-stereoscopic visual effect akin to viewing the original scene.
  • Digital technologies for enabling this include placing an array of lenslets over a high-resolution display screen, or projecting the imagery onto an array of lenslets using an array of video projectors. If the latter is combined with an array of video cameras, one can capture and display a time-varying light field.
  • This essentially constitutes a 3D television system. More generally, one way to add the depth dimension to video is by means of so-called stereoscopic video.
  • In stereoscopic video, the left and the right eyes of the viewer are shown slightly different views (i.e., images). This is achieved by using anaglyph, shutter or polarized glasses that allow showing different images to the left and the right eyes of the viewer, in this way creating a perception of depth. The perceived depth of a point in the image is thereby determined by its displacement between the left and the right views.
  • Fig. 1 schematically illustrates a rendering unit 12 where slightly different images 14a, 14b, 14c from locations 12a, 12b, 12c on the display of the rendering unit 12 are projected towards a viewer, as represented by the eyes 11a, 11b, in front of the rendering unit 12. Therefore, if the viewer is located in a proper position in front of the display, the viewer's left and right eyes see slightly different views of the same scene, which makes it possible to create the perception of depth.
  • an auto-stereoscopic display presents a number of views of the scene, typically 7-28.
  • the number of views may increase to 20-50.
  • One issue that may arise when using such auto-stereoscopic displays is the transmission, or storage, of the views, as the views may amount to a high bit rate.
  • the left and the right views may be coded independently or jointly.
  • Another way to obtain one view from the other view is by using view synthesis.
  • the issue may be overcome by transmitting a low number (e.g. 1 to 3) of key views and generating the other views by a so-called view synthesis process from the transmitted key views, possibly using additional information such as depth maps.
  • These synthesized views can be located between the key views (interpolated) or outside the range covered by key views (extrapolated).
  • a depth map may be regarded as a simple grayscale image, wherein each pixel indicates the distance between the corresponding pixel from a video object and the image plane of the capturing camera.
  • Disparity may be regarded as the apparent shift of a pixel which is a consequence of the viewer moving from one viewpoint to another.
  • Depth and disparity are mathematically related and can be interchangeably used.
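As a concrete numerical illustration (an addition to the original text), for a rectified, parallel stereo setup with focal length f (in pixels) and baseline B, the usual pinhole relation between depth Z and disparity d is Z = f·B/d:

```python
def disparity_to_depth(d_pixels: float, focal_px: float, baseline_m: float) -> float:
    """Depth (metres) from disparity (pixels) for a rectified, parallel stereo pair.

    Uses the standard pinhole relation Z = f * B / d; zero disparity corresponds
    to a point at infinity.
    """
    if d_pixels <= 0:
        return float("inf")
    return focal_px * baseline_m / d_pixels


# Example: f = 1000 px, B = 6.5 cm, d = 20 px  ->  Z = 3.25 m
print(disparity_to_depth(20, 1000, 0.065))
```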
  • One property of depth/disparity maps is that they contain large smooth surfaces of constant gray levels. This makes them comparatively easy to compress using currently available video coding technology.
  • a so-called 3D point cloud may be reconstructed from the depth map if the 3D camera parameters (such as the intrinsic calibration matrix K for a pinhole camera model, containing the focal lengths, principal point, etc.) are known.
  • the depth map may be measured by specialized cameras, e.g., structured-light or time-of-flight (ToF) cameras, where the depth is derived from the deformation of a projected pattern or from the round-trip time of emitted light, respectively.
  • DIBR depth image based rendering
  • the depth map may be represented by a grayscale image having the same resolution as the view (video frame).
  • each pixel of the depth map represents the distance from the camera to the object for the corresponding pixel in the image/ video frame.
  • DIBR generally consists of creating a dense 3D point cloud by back-projection of the depth map and projecting the 3D point cloud to another viewpoint.
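A minimal sketch of the DIBR idea just described, assuming a pinhole camera with intrinsic matrix K and a relative pose (R, t) for the target viewpoint; the helper names, the dense-depth input and the nearest-point z-buffer are illustrative assumptions, not the embodiments' actual implementation:

```python
import numpy as np

def backproject(depth, K):
    """Back-project a dense depth map (H x W, metric depth) into a 3D point
    cloud in the camera frame, using pinhole intrinsics K (3 x 3)."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x (H*W)
    rays = np.linalg.inv(K) @ pixels
    return (rays * depth.reshape(1, -1)).T                                # (H*W) x 3

def project(points, K, R, t, H, W):
    """Project 3D points into another viewpoint; the nearest point wins (z-buffer)."""
    cam = R @ points.T + t.reshape(3, 1)                                  # 3 x N
    z = np.maximum(cam[2], 1e-9)
    u = np.round(K[0, 0] * cam[0] / z + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * cam[1] / z + K[1, 2]).astype(int)
    zbuf = np.full((H, W), np.inf)
    ok = (cam[2] > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    for ui, vi, zi in zip(u[ok], v[ok], cam[2][ok]):
        zbuf[vi, ui] = min(zbuf[vi, ui], zi)   # keep the nearest surface per pixel
    return zbuf
```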
  • some parameters need to be signalled for the device or program module that performs the view synthesis.
  • Examples of such parameters are z_near and z_far, which represent the closest and the farthest depth values, respectively, in the depth maps for the frame under consideration. These values are needed in order to map the quantized depth map samples to the real depth values that they represent.
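One commonly used conversion from an 8-bit depth-map sample v to metric depth, parameterised by z_near and z_far, is the inverse-depth mapping sketched below (this particular formula is an assumption for illustration; the exact mapping depends on the depth format that is actually signalled):

```python
def sample_to_depth(v: int, z_near: float, z_far: float, bitdepth: int = 8) -> float:
    """Map a quantized depth-map sample to metric depth.

    Convention assumed here: the maximum sample value maps to z_near (closest),
    zero maps to z_far (farthest), and quantization is linear in inverse depth.
    """
    v_max = (1 << bitdepth) - 1
    inv_z = (v / v_max) * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return 1.0 / inv_z


print(sample_to_depth(255, 0.5, 10.0))  # -> 0.5 (closest)
print(sample_to_depth(0, 0.5, 10.0))    # -> 10.0 (farthest)
```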
  • Another set of parameters that is needed for the view synthesis are camera parameters.
  • Camera parameters for the 3D video are usually split into two parts.
  • the first part relates to internal camera parameters (or intrinsic parameters) and generally represents the optical characteristics of the camera for the image captured, such as the focal length, the coordinates of the image's principal point and the lens distortions.
  • the second part relates to external camera parameters (or extrinsic parameters) and generally represents the camera position and the direction of the optical axis of the camera, either in the chosen real-world coordinates or relative to the other cameras and the objects in the scene.
  • both the internal and the external camera parameters may be required in the view synthesis process based on usage of the depth information (such as DIBR).
  • LDV layered depth video
  • Having one color per 3D pixel or volume pixel (voxel) in stereoscopic video is often not enough, since some content has a varying color depending on from where the viewer is viewing it. This is typically due to specular lights, or reflective or transparent content. Motion parallax gives the viewer the ability to perceive the 3D structure of static content, but also allows the viewer to see how reflective/transparent the content is. Being able to replicate this on a screen may thus improve the user experience.
  • Compressing or encoding 3D video with properties as outlined above may be challenging, since the input data structure (camera array) contains a lot of redundancy due to the fact that the cameras record the same content but at slightly different viewpoints, while most of the content does not have a varying color depending on the angle from which it is observed.
  • MV-HEVC multi view high efficiency video coding
  • 3D-HEVC three-dimensional high efficiency video coding
  • Both MV-HEVC and 3D-HEVC, as specified by ISO/IEC JTC 1/SC 29/WG 11, yield very low compression efficiency for wide angular resolution (a high number of cameras). These standards also have a very large overhead for motion/disparity vector coding. These standards further offer a slow encoding process.
  • the so-called "Layer-Based Representation for Image Based Rendering and Compression" was developed by Dragotti, P. L. et al. at Imperial College London. The image content is split into several zones having the same depth value using image segmentation. Several light field layers for each zone are encoded separately. This approach only works when the content is easy to segment and contains few elements, such as a toy example. In reality, real videos are much more complex and the number of zones will increase, leading to a sub-optimal compression ratio. Coding the contours is also challenging (demanding in terms of bits).
  • JP3D (also known as JPEG 2000 3D) is only applicable for images; a straight-forward extension to video would yield a quite low compression ratio. JP3D was not directly developed for light field compression.
  • An object of embodiments herein is to provide efficient encoding of a light field into a bitstream and efficient decoding of a bitstream into a panoramic light field.
  • a method for encoding a light field (LF) into a bitstream comprising receiving an LF of a scene and parameters relating to the LF.
  • the parameters describe a three-dimensional (3D) model of the scene, parameters for rendering at least one view of the scene from at least one panoramic light field (PLF), and a projection method for generating a PLF space from the images and the 3D model.
  • the method comprises encoding the at least one PLF and the parameters into the bitstream by sampling the sequence of PLF spaces into a sequence of PLFs and applying compression to remove redundancy in the sequence of PLFs.
  • an electronic device for encoding an LF into a bitstream.
  • the electronic device comprises a processing unit.
  • the processing unit is configured to receive an LF of a scene and parameters relating to the LF.
  • the parameters describe a 3D model of the scene, parameters for rendering at least one view of the scene from at least one panoramic light field (PLF), and a projection method for generating a PLF space from the images and the 3D model.
  • the processing unit is configured to encode the at least one PLF and the parameters into the bitstream by sampling the sequence of PLF spaces into a sequence of PLFs and applying compression to remove redundancy in the sequence of PLFs.
  • a computer program for encoding an LF into a bitstream, the computer program comprising computer program code which, when run on a processing unit, causes the processing unit to perform a method according to the first aspect.
  • a computer program product comprising a computer program according to the third aspect and a computer readable means on which the computer program is stored.
  • a method for decoding an encoded bitstream into a PLF is performed by an electronic device.
  • the method comprises receiving an encoded bitstream representing at least one PLF of a scene and parameters relating to the at least one PLF.
  • the parameters describe a panoramic 3D model of the scene, parameters for rendering at least one view of the scene from the at least one PLF, a back-projection method for generating the images from the at least one PLF, and samplings of a PLF space.
  • the method comprises decoding the encoded bitstream into the at least one PLF by reconstructing the at least one PLF by applying decompression to the bitstream based on the parameters.
  • this enables efficient decoding of a bitstream into a panoramic light field.
  • an electronic device for decoding an encoded bitstream into a PLF.
  • the electronic device comprises a processing unit.
  • the processing unit is configured to receive an encoded bitstream representing at least one PLF of a scene and parameters relating to the at least one PLF.
  • the parameters describe a panoramic 3D model of the scene, parameters for rendering at least one view of the scene from the at least one PLF, a back-projection method for generating the images from the at least one PLF, and samplings of a PLF space.
  • the processing unit is configured to decode the encoded bitstream into the at least one PLF by reconstructing the at least one PLF by applying decompression to the bitstream based on the parameters.
  • According to a seventh aspect there is presented a computer program for decoding an encoded bitstream into a PLF, the computer program comprising computer program code which, when run on a processing unit, causes the processing unit to perform a method according to the fifth aspect.
  • a computer program product comprising a computer program according to the seventh aspect and a computer readable means on which the computer program is stored.
  • the disclosed encoding and decoding provides efficient encoding and decoding of light fields and panoramic light fields, respectively.
  • the disclosed encoding and decoding is scalable in the number of cameras. Increasing the number of input cameras will increase the number of bits, but at a lower pace.
  • the disclosed encoding and decoding provides angular scalability (light field), thus creating high fidelity images where motion parallax is available.
  • a 2D or 3D only screen may just drop the angular layers and still be able to display the content.
  • a network node may determine to drop transmission of the angular layers.
  • the disclosed encoding and decoding may handle any input camera setup (lines, planar grid, circular grid, etc.), even non-ordered cameras.
  • the disclosed encoding and decoding require only a few modifications of existing standards such as MV-HEVC and 3D-HEVC, and could even be compatible with some setups (such as a line and/or planar grid).
  • the disclosed encoding and decoding have a competitive compression efficiency compared to existing light field coding schemes.
  • the disclosed encoding and decoding may utilize angular coding that supports transparency/ reflections (i.e., not only specular light), especially when a dense representation is kept and transmitted.
  • the disclosed encoding and decoding allows the movie maker to select where to compress more (by giving the movie maker free control of the projection model).
  • any feature of the first, second, third, fourth, fifth, sixth, seventh and eighth aspects may be applied to any other aspect, wherever appropriate.
  • any advantage of the first aspect may equally apply to the second, third, fourth, fifth, sixth, seventh, and/or eighth aspect, respectively, and vice versa.
  • Fig. 1 is a schematic diagram illustrating a rendering unit according to the prior art
  • Fig. 2 is a schematic diagram illustrating an image communications system according to an embodiment
  • Fig. 3a is a schematic diagram showing functional units of an electronic device according to an embodiment
  • Figs. 3b and 3c are schematic diagrams showing functional modules of an electronic device according to an embodiment
  • Fig. 4 shows one example of a computer program product comprising computer readable means according to an embodiment
  • Fig. 5 schematically illustrates parts of an image communications system according to an embodiment
  • Fig. 6 schematically illustrates parts of an image communications system according to an embodiment
  • Fig. 7 schematically illustrates angular coordinates for cameras according to an embodiment
  • FIG. 8 schematically illustrates representation in an angular space according to an embodiment
  • Fig. 9 schematically illustrates slicing according to an embodiment
  • Fig. 10 schematically illustrates angular space sampling according to an embodiment
  • Fig. 11 schematically illustrates a 1D encoding order according to an embodiment
  • Fig. 12 schematically illustrates a 2D encoding order according to an embodiment
  • Figs. 13, 14, 15 and 16 are flowcharts of methods according to embodiments.
  • Embodiments disclosed herein relate to encoding a light field (LF) into a bitstream.
  • an electronic device, a method performed by the electronic device, and a computer program comprising code, for example in the form of a computer program product, that, when run on a processing unit of the electronic device, causes the processing unit to perform the method.
  • Embodiments disclosed herein further relate to decoding an encoded bitstream into a panoramic light field (PLF).
  • PLF panoramic light field
  • an electronic device, a method performed by the electronic device, and a computer program comprising code, for example in the form of a computer program product, that, when run on a processing unit of the electronic device, causes the processing unit to perform the method.
  • Fig. 2 schematically illustrates an image communications system 20 according to an embodiment.
  • the image communications system 20 comprises an M-by-N camera array 21.
  • the camera array 21 comprises M-by-N cameras, one of which is identified by reference numeral 21a, configured to capture (or record) images of a scene 22.
  • the scene is schematically, and for illustrative purposes, represented by a single object (a circle).
  • the scene may comprise a variety of objects of possibly different shapes and with possibly different distances to the cameras 21a.
  • Image data captured by the cameras 21a represents a light field of the scene 22 and is transmitted to an electronic device 30, 30a acting as an encoder.
  • the electronic device 30, 30a encodes the light field into a bitstream.
  • the encoded bitstream is communicated over a symbolic communications channel 23.
  • the symbolic communications channel 23 may be implemented as a storage medium or as a transmission medium between two electronic devices. Hence the symbolic communications channel 23 may be regarded as a delayed or real-time communications channel.
  • the encoded bitstream is received by an electronic device 30, 30b acting as a decoder. Hence, when the symbolic communications channel 23 is implemented as a storage medium the electronic device 30, 30a and the electronic device 30, 30b may be one and the same electronic device 30, 30a, 30b.
  • the electronic device 30, 30b decodes the received bitstream into a panoramic light field.
  • the panoramic light field may be provided to a rendering unit 12 for displaying the panoramic light field (as in Fig. 1).
  • FIG. 3a schematically illustrates, in terms of a number of functional units, the components of an electronic device 30 according to an embodiment.
  • a processing unit 31 is provided using any combination of one or more of a suitable central processing unit (CPU), multiprocessor, microcontroller, digital signal processor (DSP), application specific integrated circuit (ASIC), field programmable gate arrays (FPGA) etc., capable of executing software instructions stored in a computer program product 41a, 41b (as in Fig. 4), e.g. in the form of a storage medium 33.
  • CPU central processing unit
  • DSP digital signal processor
  • ASIC application specific integrated circuit
  • FPGA field programmable gate arrays
  • the storage medium 33 may also comprise persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory.
  • the electronic device 30 may further comprise a communications interface 32 for communications with, for example, another electronic device 30, an external storage medium, a camera array 21, and a rendering unit 12.
  • the communications interface 32 may comprise one or more transmitters and receivers, comprising analogue and digital components and a suitable number of ports and interfaces for communications.
  • the processing unit 31 controls the general operation of the electronic device 30 e.g. by sending data and control signals to the communications interface 32 and the storage medium 33, by receiving data and reports from the communications interface 32, and by retrieving data and instructions from the storage medium 33.
  • Fig. 3b schematically illustrates, in terms of a number of functional modules, the components of an electronic device 30 acting as an encoder according to an embodiment.
  • the electronic device 30, 30a of Fig. 3b comprises a number of functional modules: a send and/or receive module 31a, and an encode module 31b.
  • the electronic device 30, 30a of Fig. 3b may further comprise a number of optional functional modules, such as any of a determine module 31c, a reduce module 31d, a generate module 31e, a project module 31f, a detect module 31g, and a sample module 31h.
  • each functional module 31a-h may be implemented in hardware or in software.
  • one or more or all functional modules 31a-h may be implemented by the processing unit 31, possibly in cooperation with functional units 32 and/or 33.
  • the processing unit 31 may thus be arranged to fetch, from the storage medium 33, instructions as provided by a functional module 31a-h and to execute these instructions, thereby performing any steps as will be disclosed hereinafter.
  • Fig. 3c schematically illustrates, in terms of a number of functional modules, the components of an electronic device 30b acting as a decoder according to an embodiment.
  • each functional module 31j-n may be implemented in hardware or in software.
  • one or more or all functional modules 31j-n may be implemented by the processing unit 31, possibly in cooperation with functional units 32 and/or 33.
  • the processing unit 31 may thus be arranged to fetch, from the storage medium 33, instructions as provided by a functional module 31j-n and to execute these instructions, thereby performing any steps as will be disclosed hereinafter.
  • the electronic device 30, 30a and the electronic device 30, 30b may be one and the same electronic device.
  • the functional modules 31a-h of the electronic device 30, 30a and the functional modules 31j-n may be combined.
  • only one send and/or receive module may be used instead of the separate send and/or receive modules 31a, 31j
  • only one generate module may be used instead of the separate generate modules 31e, 31l
  • only one detect module may be used instead of the separate detect modules 31g, 31m
  • only one determine module may be used instead of the separate determine modules 31c, 31n.
  • the electronic device 30, 30a, 30b may be provided as a standalone device or as a part of a further device.
  • the electronic device 30, 30a, 30b may be provided as an integral part of the further device. That is, the components of the electronic device 30, 30a, 30b may be integrated with other components of the further device; some components of the further device and the electronic device 30, 30a, 30b may be shared.
  • the further device as such comprises a processing unit
  • this processing unit may be arranged to perform the actions of the processing unit 31 of the electronic device 30, 30a, 30b.
  • the electronic device 30, 30a, 30b may be provided as a separate unit in the further device.
  • the further device may be a digital versatile disc (DVD) player, a Blu-ray Disc player, a desktop computer, a laptop computer, a tablet computer, a portable wireless device, a mobile phone, a mobile station, a handset, a wireless local loop phone, or a user equipment (UE).
  • DVD digital versatile disc
  • UE user equipment
  • Fig. 4 shows one example of a computer program product 41a, 41b
  • a computer program 42a can be stored, which computer program 42a can cause the processing unit 31 and thereto operatively coupled entities and devices, such as the communications interface 32 and the storage medium 33, to execute methods for encoding a light field (LF) into a bitstream according to embodiments described herein.
  • a computer program 42b can be stored, which computer program 42b can cause the processing unit 31 and thereto operatively coupled entities and devices, such as the communications interface 32 and the storage medium 33, to execute methods for decoding an encoded bitstream into a panoramic light field (PLF) according to embodiments described herein.
  • PLF panoramic light field
  • the computer program 42b and/or computer program product 41b may thus provide means for performing any steps as herein disclosed.
  • the computer program product 41a, 41b is illustrated as an optical disc, such as a CD (compact disc) or a DVD (digital versatile disc) or a Blu-Ray disc.
  • the computer program product 41a, 41b could also be embodied as a memory, such as a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or an electrically erasable programmable read-only memory (EEPROM) and more particularly as a non-volatile storage medium of a device in an external memory such as a USB (Universal Serial Bus) memory or a Flash memory, such as a compact Flash memory.
  • RAM random access memory
  • ROM read-only memory
  • EPROM erasable programmable read-only memory
  • EEPROM electrically erasable programmable read-only memory
  • While the computer programs 42a, 42b are here schematically shown as a track on the depicted optical disc, the computer programs 42a, 42b can be stored in any way which is suitable for the computer program product 41a, 41b.
  • Figs. 13 and 14 are flow charts illustrating embodiments of methods for encoding a light field (LF) into a bitstream as performed by an electronic device 30, 30a. The methods are advantageously provided as computer programs 42a.
  • Figs. 15 and 16 are flow charts illustrating embodiments of methods for decoding an encoded bitstream into a panoramic light field (PLF) as performed by an electronic device 30, 30b. The methods are advantageously provided as computer programs 42b.
  • Reference is now made to Fig. 13, illustrating a method for encoding a light field (LF) into a bitstream as performed by an electronic device 30, 30a according to an embodiment.
  • the electronic device 30, 30a is configured to, in a step S102, receive an LF of a scene and parameters relating to the LF.
  • the parameters describe a three-dimensional (3D) model of the scene, parameters for rendering at least one view of the scene from at least one panoramic light field (PLF), and a projection method for generating a PLF space from the images and the 3D model.
  • the processing unit 31 may be configured to perform step S102 by executing functionality of the functional module 31a.
  • the electronic device 30, 30a is configured to, in a step S118, encode the at least one PLF and the parameters into the bitstream.
  • the encoding is performed by sampling the sequence of PLF spaces into a sequence of PLFs and by applying compression to remove redundancy in the sequence of PLFs.
  • the processing unit 31 may be configured to perform step S118 by executing functionality of the functional module 31b.
  • the computer program 42a and/or computer program product 41a may thus provide means for this step. Reference is now made to Fig. 14, illustrating methods for encoding a light field (LF) into a bitstream as performed by an electronic device 30, 30a according to further embodiments.
  • the encoding may be used to convert initial LF input data representing multiview videos to a panoramic light field 3D/4D video.
  • the LF may represent images defining multiple views comprising pixels of the scene, where the images have been captured by an N-by-M array of cameras, and where at least one of M and N is larger than 1, and the 3D model is used together with these images.
  • an LF may thus be acquired from a set of cameras arranged on a grid (such as on a plane or circular grid) or on a line.
  • the first case is related to scenarios where the array of cameras is one-dimensional (1D), i.e., where one of M and N is equal to 1
  • the second case is related to scenarios where the array of cameras is two- dimensional (2D), i.e., where both M and N are larger than 1.
  • Preprocessing, which may be performed after the receiving in step S102 but prior to the encoding in step S118, may comprise at least some of the following steps (which may be performed for each of the video frames, i.e., images, captured by the cameras): 1. Selection of a panoramic projection model.
  • each pixel of the panoramic view of the scene has only one 3D point, for example by projecting the depth representation to a panoramic view, and back-projecting each 3D point and the depth representation of the panoramic view.
  • At least some of the embodiments disclosed herein are based on embedding the 3D structure into one, global 2D depth map (or 3D mesh) that can later be used to recreate as many views of the captured scene as possible.
  • a projection model that will merge all the views into one global view with no overlapping at all, i.e., defining a bijective function, may therefore be used.
  • One approach is to define a 3D surface and project the 3D content onto it. This may be visualized as a virtual surface being a large sparse sensor of a virtual camera, covering all the input cameras in the N-by-M array of cameras, and where each pixel would have a quite different 3D location and projection angle. Physically, it would be similar to having one large sparse light-capturing sensor that covers all the cameras instead of having one small light-capturing sensor per camera. A parametric surface shape and location with a projection equation may therefore be defined. Since 2D videos may later be encoded, the surface is defined as being a distorted 2D rectangle.
  • Fig. 5 schematically illustrates a top view of the image communications system 20, where a surface 51 has been schematically illustrated.
  • cameras 21a are placed along an x-axis and configured to capture a scene 22.
  • the surface 51 is placed between the cameras 21a and the scene 22. Projections perpendicular from the surface 51 are schematically illustrated at reference number 52 on the side of the surface 51 facing the scene 22.
  • one objective may thus be to find a good surface distortion and localization such that all the pixels of the input images are projected onto the surface (considering a sufficiently high surface sampling resolution), and all the pixels occluded in the input images are projected onto the surface.
  • This may be regarded as a hard problem but it may be solved using spline/radial basis functions within an optimization framework, or the problem itself may be simplified by relaxing some constraints.
  • the problem may be simplified by assuming a per-pixel (or per-column for Case A) pinhole camera.
  • the generation of the panoramic image and depth map (both of size Wp-by-Hp pixels) is performed such that for each column c (from 1 to Wp) the projection matrix Pc may be defined such that:
  • FIG. 6 schematically illustrates one example of a camera array (for Case A) comprising cameras 21a and interpolated cameras 21b.
  • a per-pixel (or per-column) projection of all the points of the point cloud onto the panoramic image and associated depth map may be performed, and only the necessary points, appearing in the associated column, may be kept.
  • other types of surfaces and surface projections (shape, location) may also be used.
  • a semi-rectangular cylinder covering the foreground content may be used, even if the cameras are aligned on a plane.
  • the sampling rate may be adapted to the surface curvature so that high-curved areas are more densely sampled than low-curved areas.
  • the electronic device 30, 30a may be configured to, in an optional step S104, determine a point cloud from the images and the parameters describing the 3D model and the camera parameters.
  • the processing unit 31 may be configured to perform step S104 by executing functionality of the functional module 31c.
  • the computer program 42a and/or computer program product 41a may thus provide means for this step.
  • the electronic device 30, 30a may further be configured to back-project all (or part of) the images using the associated depth maps. This will generate L point clouds, denoted PCx (where x goes from 1 to L).
  • a 3D mesh can be converted to, or expressed as, a 3D point cloud by sampling the faces of the mesh.
  • Merging and simplification of the point cloud:
  • the electronic device 30, 30a may be configured to, in an optional step S106, reduce the number of points in the point cloud such that each pixel in the PLF has only one point in the point cloud.
  • the processing unit 31 may be configured to perform step S106 by executing functionality of the functional module 31d.
  • the computer program 42a and/ or computer program product 41a may thus provide means for this step.
  • the electronic device 30, 30a may be configured to, in an optional step S108, generate a panoramic 3D model from the projection model by projecting the point cloud.
  • the 3D model may be a depth map.
  • the processing unit 31 may be configured to perform step S108 by executing functionality of the functional module 31e.
  • the computer program 42a and/or computer program product 41a may thus provide means for this step.
  • the electronic device 30, 30a may be configured to, in an optional step S110, project the point cloud to each input image of the cameras, and to generate the at least one PLF using the projection model by storing the different colors in a PLF space.
  • the processing unit 31 may be configured to perform step S110 by executing functionality of the functional module 31f.
  • the computer program 42a and/ or computer program product 41a may thus provide means for this step.
  • Enabling each pixel in the panoramic depth (PPD) to only have one 3D point, may be implemented in different ways.
  • One way is to project all the points to the panoramic depth map using the projection model. Then, the panoramic depth map may be back-projected, using the inverse of the projection model, in order for a single point cloud to be obtained.
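A minimal sketch of that reduction, assuming a generic panoramic projection proj(x, y, z) -> (column, row, depth) and its inverse back_proj; the z-buffer keeps the nearest 3D point per panoramic pixel (all names are illustrative, not an API defined by the embodiments):

```python
import numpy as np

def reduce_point_cloud(points, proj, back_proj, Wp, Hp):
    """Keep at most one 3D point per pixel of the panoramic depth map (PPD).

    points:    (N, 3) array, the merged point cloud
    proj:      (x, y, z) -> (col, row, depth) under the panoramic projection model
    back_proj: (col, row, depth) -> (x, y, z), the inverse of proj
    """
    ppd = np.full((Hp, Wp), np.inf)
    for x, y, z in points:
        c, r, d = proj(x, y, z)
        if 0 <= c < Wp and 0 <= r < Hp and d < ppd[r, c]:
            ppd[r, c] = d                      # nearest point wins
    reduced = [back_proj(c, r, ppd[r, c])
               for r in range(Hp) for c in range(Wp)
               if np.isfinite(ppd[r, c])]
    return np.asarray(reduced), ppd
```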
  • the point cloud may be projected to the panoramic depth as described above with reference to how the panoramic projection model may be selected.
  • the panoramic light field space may be defined as being the 3D or 4D space with the following parameters:
  • a first axis, denoted X, corresponds to the x axis of the panoramic image PPI (i.e. the sampled projection surface, 51), thus ranging from 1 to Wp, where Wp is the resolution of the LF in the x-direction.
  • a second axis, denoted Y, corresponds to the y axis of the panoramic image PPI, thus ranging from 1 to Hp, where Hp is the resolution of the LF in the y-direction.
  • Ch denotes the horizontal camera index on the input camera array, ranging from 1 to N.
  • Figs. 7 and 8 provide a representation of theta and omega.
  • Ci-1, Ci, and Ci+1 are three different camera indexes of cameras 21a placed along the x-axis.
  • a 3D point P from a point cloud can be defined by its 3D coordinates (X, Y, Z) but also by an angular space theta, omega and length (which can be the depth Z).
  • This PLF space contains only color information since the 3D structure is encoded in the panoramic 3D model and projection model.
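One way to picture this PLF space is as a 4D colour array indexed by the panoramic pixel (x, y) and two camera (or angular) indices; the array layout, sizes and fill loop below are illustrative assumptions, not the data structure prescribed by the embodiments:

```python
import numpy as np

Wp, Hp = 512, 256          # panoramic resolution (illustrative)
N, M = 5, 5                # horizontal / vertical camera index ranges (illustrative)

# Colour (RGB) per panoramic pixel and per camera sample; NaN marks holes,
# i.e. combinations that no input camera actually observed.
plf_space = np.full((Hp, Wp, M, N, 3), np.nan, dtype=np.float32)

def fill_plf_space(plf_space, point_cloud, cameras, to_panorama, observe):
    """For each 3D point and each input camera that sees it, store the observed
    colour at the point's panoramic pixel and at that camera's (cv, ch) index."""
    for p in point_cloud:
        x, y = to_panorama(p)                        # panoramic pixel of the point
        for cv in range(M):
            for ch in range(N):
                colour = observe(p, cameras[cv][ch])  # None if occluded / out of view
                if colour is not None:
                    plf_space[y, x, cv, ch] = colour
    return plf_space
```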
  • holes in the 3D/4D light field space
  • occlusion, i.e., where foreground content covers background content
  • background content may in such cases not be projected onto all the views.
  • the pixels around the edges of the panoramic image may only appear in a few viewpoints since the viewpoint makes the pixels shift to the left/right. Thus some pixels may be shifted outside the input image size, as shown in the non-hatched area of Fig. 10a (see below).
  • Such occlusions may be filled by determining values for the missing pixels.
  • the electronic device 30, 30a may be configured to, in an optional step S112, detect an entry in the PLF space representing a missing pixel; and, in an optional step S114, determine a value for the entry. Determining the value for the entry causes the hole to be filled. Filling the holes in the PLF space may be accomplished by employing a simple linear interpolation/extrapolation. The process of hole filling is also denoted inpainting.
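A minimal sketch of such hole filling by linear interpolation along one angular axis of the PLF space; NaN marks missing entries, and np.interp also extrapolates by holding the first/last valid value (a real implementation might instead interpolate spatially or use a more elaborate inpainting method):

```python
import numpy as np

def fill_holes_1d(samples):
    """Linearly interpolate NaN entries in a 1D sequence of colour samples,
    e.g. one panoramic pixel traversed along the theta axis."""
    samples = np.asarray(samples, dtype=float)
    idx = np.arange(len(samples))
    valid = ~np.isnan(samples)
    if not valid.any():
        return samples                         # nothing to interpolate from
    return np.interp(idx, idx[valid], samples[valid])


print(fill_holes_1d([10.0, np.nan, np.nan, 40.0, np.nan]))
# -> [10. 20. 30. 40. 40.]
```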
  • the processing unit 31 may be configured to perform step S112 by executing functionality of the functional module 31g.
  • the computer program 42a and/or computer program product 41a may thus provide means for this step.
  • the processing unit 31 may be configured to perform step S114 by executing functionality of the functional module 31c.
  • the computer program 42a and/or computer program product 41a may thus provide means for this step.
  • the electronic device 30, 30a may be configured to, in an optional step S116, sample the PLF space along two-dimensional planes so as to slice the PLF space into slices.
  • the processing unit 31 may be configured to perform step S116 by executing functionality of the functional module 31h.
  • the computer program 42a and/or computer program product 41a may thus provide means for this step.
  • Each slice represents a two-dimensional image. All slices have a color variation but a common panoramic 3D structure: the panoramic 3D model.
  • only a part of the PLF space (and hence not the complete 3D/4D space) is encoded. Slicing the space may then be regarded as sampling the space with a (possibly planar) 2D/3D path. One slice corresponds to one PLF. The rendering (see below) may also be achieved by slicing this space.
  • In step S118 there may be different ways to perform the actual encoding (i.e., how to sample the sequence of PLF spaces into a sequence of PLFs and how to apply compression to remove redundancy in the sequence of PLFs). Further details of the encoding in step S118 will now be disclosed.
  • At least one PLF may be encoded as a sequence of 3D video frames (or 4D video frame, or a set of 2D video frames with a panoramic 3D mesh or depth map) having dependent layers.
  • the layers may be encoded one by one in a predetermined order.
  • Encoding of layer k+1 may be dependent on encoding of layer k, for K layers and where 0 ≤ k < K-1.
  • Encoding the at least one PLF may comprise encoding a quantized pixel difference between layer k+1 and layer k.
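A sketch of what coding a quantized pixel difference between dependent layers can look like; the uniform quantizer and the reconstruction loop are illustrative (an actual codec would additionally predict, transform and entropy-code these residuals):

```python
import numpy as np

def encode_layers(layers, q_step=4):
    """Encode layer 0 as-is and every further layer as a quantized difference
    from the previously reconstructed layer (avoids drift at the decoder)."""
    coded = [layers[0].astype(np.int16)]
    recon = layers[0].astype(np.int16)
    for k in range(1, len(layers)):
        diff = layers[k].astype(np.int16) - recon
        q = np.round(diff / q_step).astype(np.int16)   # quantized residual
        coded.append(q)
        recon = recon + q * q_step                     # decoder-side reconstruction
    return coded

def decode_layers(coded, q_step=4):
    recon = [coded[0]]
    for q in coded[1:]:
        recon.append(recon[-1] + q * q_step)
    return recon
```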
  • the 3D model may be represented by a depth map or a 3D mesh of the scene.
  • the electronic device 30, 30a may be configured to, in an optional step S120, encode positions of the cameras relative to the scene.
  • the processing unit 31 may be configured to perform step S120 by executing functionality of the functional module 31b.
  • the computer program 42a and/ or computer program product 41a may thus provide means for this step. Since a PLF space and its associated panoramic 3D mesh or depth map (PPD) have been generated, the PLF space may be sliced, as in step S116 above, thus generating several 2D images that have the same 3D structure but with a color variation.
  • PPD panoramic 3D mesh or depth map
  • the first PLF slice (2D image/video) to be encoded is denoted PPI
  • PPI The first PLF slice (2D image/video) to be encoded
  • the PPI may be encoded using known 2D video coding techniques such as the ones used in HEVC.
  • HEVC or other 2D video codecs
  • known 2D video coding techniques may be adapted in order to handle the video resolution increase.
  • the panoramic image resolution may have to be increased, thus leading to large image sizes, typically more than 4K (i.e., larger than 3840-by-2160 pixels).
  • One possibility is to increase the block sizes.
  • the depth map PPD can be stored using only one component (typically the luminance component) and then any existing coding techniques for standard 2D videos may be applied.
  • the depth map is generally easier to encode than the PPI, and more specific coding techniques, such as the ones used in 3D-HEVC, may be used.
  • the motion vectors estimated when coding PPI can be re-used for encoding the depth map.
  • This depth map may thus correspond to a new dependent layer, say layer 1, when using MV-HEVC or an equivalent encoding technique.
  • a 3D mesh may be encoded instead of the depth map.
  • the herein disclosed encoding is able to generate both 2D and 3D videos since it contains both the colors and their associated depth (or position for the 3D mesh) of most of the necessary pixels.
  • a decoder will need to know what projection model was used when creating the PLF, and hence the projection model information may be encoded in the bitstream as well. This can be achieved in various ways. Examples include, but are not limited to, equations, surface mesh, matrix representation, a UV-mapping etc. Also information defining the camera locations may be encoded in the bitstream.
  • the angular space typically has one dimension (Case A) or two dimensions (Case B).
  • case A the angular space will have only one dimension
  • case B the angular space will have two dimensions.
  • the angular space may be encoded using dependent layers.
  • the angular space is sampled with a regular slicing of the PLF space.
  • By slicing is meant sampling (with interpolation) of the PLF space in order to generate rectangular images that may then be encoded.
  • Figs. 10a and 10b provide illustrations of the angular space sampling of PPI and LF1 for Case A, i.e., where the angular space is 1D.
  • PPI corresponds to the pixel colors for which theta is 0. This is referred to as layer 0 (denoted LF0).
  • LF1 represents one slice of the angular space theta. Theta may be replaced by the camera index and the space may be sliced in the camera dimension.
  • LF1 corresponds to the pixel colors of all pixels where the angle is equal to, say, 15 degrees; LF2 corresponds to the pixel colors of all pixels where the angle is equal to, say, -15 degrees; LF3 corresponds to the pixel colors of all pixels where the angle is equal to, say, 30 degrees; LF4 corresponds to the pixel colors of all pixels where the angle is equal to, say, -30 degrees; etc.
  • The slices LFx, where x denotes the index of the slice, are images of the same size as the PPI and, for most of the content, will have the same or nearly the same pixel colors.
  • the hatched area in Fig. 10a corresponds to the valid space where the pixels will have colors. This space may be discrete since the input data is a discrete number of cameras, and hence an interpolation technique (bilinear, bi-cubic, etc.) may be used to obtain the color value if the slicing lands in between two theta values. When there is no color (outside the hatched area), the color of the last valid theta at the same pixel location (x, y) may be used. Or an extrapolation technique may be used to obtain the color. In Fig. 10a the y-axis has been omitted to ease the illustration. The x-axis corresponds to the x-axis of the PPI (but may also be regarded as the real X-axis of the 3D space coordinate).
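A sketch of sampling one slice at an arbitrary angle from the discrete set of captured angles, blending the two nearest valid samples and holding the border value outside the captured (hatched) range; the names and the purely linear blend are assumptions made for illustration:

```python
import numpy as np

def slice_at_angle(images, thetas, theta):
    """images: list of (H, W, 3) arrays captured at the angles in `thetas`
    (sorted ascending). Returns one (H, W, 3) slice sampled at angle `theta`."""
    thetas = np.asarray(thetas, dtype=float)
    if theta <= thetas[0]:
        return images[0].copy()                  # hold the last valid border value
    if theta >= thetas[-1]:
        return images[-1].copy()
    i = int(np.searchsorted(thetas, theta)) - 1  # left neighbour
    w = (theta - thetas[i]) / (thetas[i + 1] - thetas[i])
    return (1.0 - w) * images[i] + w * images[i + 1]
```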
  • Case B, i.e., where the angular space is 2D
  • the 2D angular space may be regularly sliced using the same approach as for Case A.
  • the herein disclosed encoding is also applicable for other types of slicing.
  • Information relating to how the slicing was performed may thus also be encoded in the bitstream.
  • the slicing may be planar or parametric.
  • the different light field layers may be expressed as a small variation of the first light field layer.
  • the same mechanisms as used in the multiview or 3D extensions of HEVC (MV-HEVC, 3D-HEVC) may be used to encode the extra layers as dependent layers.
  • One way to accomplish this is to encode the most extreme layers (highest positive/negative theta/omega values, denoted LFmax) first (after encoding the PPI and the depth map) in order to get a scalability on the angular space reconstruction accuracy.
  • LFmax the layers with the highest positive/negative theta/omega values
  • only one extra LF layer may be needed to get a first approximation of the angular-based color changes in Case A since a simple linear interpolation may create a color variation in the motion parallax.
  • More complex interpolation, such as cubic interpolation (using LF2, LF0 and LF1), may be used. The same applies for the other LF layers as well.
  • a 2D interpolation such as bi-linear, bi-cubic, etc.
  • the layers may therefore be very cheap (i.e., require very few bits) to encode.
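A sketch of that kind of prediction: an intermediate layer at angle theta is approximated linearly from layer 0 and the extreme layer LFmax, so that only a small residual would remain to be coded (the linear weight is an assumption; cubic or 2D bi-linear/bi-cubic variants work analogously):

```python
def predict_layer(lf0, lf_max, theta, theta_max):
    """Linear prediction of the layer at `theta` from layer 0 and the extreme layer."""
    w = theta / theta_max
    return (1.0 - w) * lf0 + w * lf_max

def layer_residual(true_layer, lf0, lf_max, theta, theta_max):
    """Residual that would remain to be quantized and entropy-coded for this layer."""
    return true_layer - predict_layer(lf0, lf_max, theta, theta_max)
```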
  • motion vectors estimated in layer 0 (PPI) can be re-used for the other layers, thus being transmitted only once.
  • blocks may not be used.
  • the quantized difference between the predicted layer and the true layer is encoded. While there is no motion, some image content (such as dense details) may be more difficult to encode, thus requiring denser sampling in the angular space and/or requiring a further predictor.
  • DWT discrete Wavelet transform
  • one surface projection model may be used for the background image content and one surface projection model may be used for the foreground image content.
  • Reference is now made to Fig. 15, illustrating a method for decoding an encoded bitstream into a panoramic light field (PLF) as performed by an electronic device 30, 30b according to an embodiment.
  • the decoding process may comprise the opposite steps of the above disclosed encoding process.
  • the decoding may involve performing the inverse operations, in reverse order, of the operations performed during the encoding.
  • the decoder may decode the PLF, then reconstruct the PLF space, then fill any holes in the PLF space, then generate one or more 2D image(s) to be shown (or rendered) on a display (since a LF is by itself usually not shown).
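A skeleton of that decoding order; every helper passed in here (decode_plf, reconstruct_plf_space, fill_holes, render_view) is a hypothetical placeholder for the corresponding steps, not an API defined by the embodiments:

```python
def decode_and_render(bitstream, viewer_pose,
                      decode_plf, reconstruct_plf_space, fill_holes, render_view):
    """Decoder-side pipeline: bitstream -> PLF(s) -> PLF space -> rendered 2D view."""
    plfs, params = decode_plf(bitstream)                  # S204/S206: PLFs + header parameters
    plf_space = reconstruct_plf_space(plfs, params)       # S208: rebuild the 3D/4D PLF space
    plf_space = fill_holes(plf_space)                     # S210/S212: inpaint missing entries
    return render_view(plf_space, params, viewer_pose)    # S214/S216: image(s) for the display
```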
  • the electronic device 30, 30b is configured to, in a step S202, receive an encoded bitstream representing at least one PLF of a scene and parameters relating to the at least one PLF.
  • the parameters describe a panoramic three-dimensional (3D) model of the scene, parameters for rendering at least one view of the scene from the at least one PLF, a back-projection method for generating the images from the at least one PLF, and samplings of a PLF space.
  • the processing unit 31 may be configured to perform step S202 by executing functionality of the functional module 31j.
  • the computer program 42b and/or computer program product 41b may thus provide means for this step.
  • the electronic device 30, 30b is configured to, in a step S204, decode the encoded bitstream into the at least one PLF by reconstructing the at least one PLF by applying decompression to the bitstream based on the parameters.
  • the processing unit 31 may be configured to perform step S204 by executing functionality of the functional module 31k.
  • the computer program 42b and/ or computer program product 41b may thus provide means for this step.
  • Reference is now made to Fig. 16, illustrating methods for decoding an encoded bitstream into a panoramic light field (PLF) as performed by an electronic device 30, 30b according to further embodiments.
  • the bitstream may comprise a header.
  • the header may represent the parameters received in step S202.
  • the electronic device 30, 30b may then be configured to, in an optional step S206, decode the header, thereby extracting the panoramic 3D model of the scene, the input camera parameters relating to cameras having captured images of the scene, the back-projection method being used for generating the images from the PLF, and the at least one PLF.
  • back-projection is the inverse application of the projection performed during the pre-processing.
  • the processing unit 31 may be configured to perform step S206 by executing functionality of the functional module 31k.
  • the computer program 42b and/ or computer program product 41b may thus provide means for this step.
  • the decoding may involve extracting the video headers, which notably include the projection model in order to re-create the point cloud, and the possible angular space slicing parameters that define how the layers (for such embodiments) were defined and how to reconstruct the angular space defined above.
  • the electronic device 30, 30b may be configured to, in an optional step S208, generate a sequence of PLF spaces from the bitstream and the decoded at least one PLF representing the PLF spaces.
  • the processing unit 31 may be configured to perform step S208 by executing functionality of the functional module 31l.
  • the computer program 42b and/ or computer program product 41b may thus provide means for this step.
  • any holes in the PLF spaces may be filled using, for instance, bi-linear interpolation. Such holes may be generated during the slicing resulting from the sampling in step S116.
  • the electronic device 30, 30b may be configured to, in an optional step S210, detect an entry in at least one PLF space of the sequence of PLF spaces representing a missing pixel; and, in an optional step S212, determine a value for said entry.
  • the processing unit 31 may be configured to perform step S210 by executing functionality of the functional module 31m.
  • the computer program 42b and/or computer program product 41b may thus provide means for this step.
  • the processing unit 31 may be configured to perform step S212 by executing functionality of the functional module 31n.
  • the computer program 42b and/or computer program product 41b may thus provide means for this step.
  • the electronic device 30, 30b may be configured to, in an optional step S214, generate a point cloud by back-projecting the panoramic 3D model from the at least one PLF from said 3D video frames using the back-projection model.
  • the processing unit 31 may be configured to perform step S214 by executing functionality of the functional module 31l.
  • the computer program 42b and/or computer program product 41b may thus provide means for this step.
  • the electronic device 30, 30b maybe configured to, in an optional step S216, generate images from said point cloud and PLF space based on the parameters describing the panoramic 3D model, camera parameters, the PLF space comprising pixel colors of the scene, by projecting the point cloud with colors coming from the PLF space.
  • the generation in step S216 may be implemented as a read-out from a 3D/4D matrix.
  • the processing unit 31 may be configured to perform step S216 by executing functionality of the functional module 31l.
  • the computer program 42b and/ or computer program product 41b may thus provide means for this step.
  • the bitstream may comprise information relating to positions of the cameras relative to the scene.
  • the electronic device 30, 30b may then be configured to, in an optional step S218, decode the positions of the cameras; and, in an optional step S220, generate the images from the PLF space based on the positions. Colors in the images may then depend on these positions.
  • the processing unit 31 may be configured to perform step S218 by executing functionality of the functional module 31k.
  • the computer program 42b and/or computer program product 41b may thus provide means for this step.
  • the processing unit 31 may be configured to perform step S220 by executing functionality of the functional module 31l.
  • the computer program 42b and/ or computer program product 41b may thus provide means for this step.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The disclosure relates to encoding of a light field (LF) into a bitstream. An LF of a scene and parameters relating to the LF are received. The parameters describe a three-dimensional (3D) model of the scene, parameters for rendering at least one view of the scene from at least one panoramic light field (PLF), and a projection method for generating a PLF space from the images and the 3D model. At least one PLF and the parameters are encoded into the bitstream by the sequence of PLF spaces being sampled into a sequence of PLFs and compression being applied to remove redundancy in the sequence of PLFs. The disclosure also relates to the decoding of this encoded bitstream into a sequence of PLFs.
PCT/SE2014/050851 2014-07-03 2014-07-03 Encoding and decoding of light fields WO2016003340A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/SE2014/050851 WO2016003340A1 (fr) 2014-07-03 2014-07-03 Encoding and decoding of light fields

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/SE2014/050851 WO2016003340A1 (fr) 2014-07-03 2014-07-03 Encoding and decoding of light fields

Publications (1)

Publication Number Publication Date
WO2016003340A1 true WO2016003340A1 (fr) 2016-01-07

Family

ID=51263465

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2014/050851 WO2016003340A1 (fr) 2014-07-03 2014-07-03 Encoding and decoding of light fields

Country Status (1)

Country Link
WO (1) WO2016003340A1 (fr)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107301648A (zh) * 2017-06-09 2017-10-27 Dalian University of Technology Redundant point cloud removal method based on overlapping-region boundary angles
EP3310052A1 (fr) * 2016-10-12 2018-04-18 Thomson Licensing Method, apparatus and stream for immersive video format
CN108965859A (zh) * 2018-07-09 2018-12-07 Goertek Technology Co., Ltd. Projection mode identification method, video playback method, apparatus and electronic device
EP3515066A1 (fr) * 2018-01-19 2019-07-24 Thomson Licensing Method and apparatus for encoding and decoding three-dimensional scenes in and from a data stream
CN110546686A (zh) * 2017-05-05 2019-12-06 Qualcomm Incorporated Systems and methods for generating a structured-light depth map with a non-uniform codeword pattern
WO2021062530A1 (fr) * 2019-10-01 2021-04-08 Blackberry Limited Angular mode syntax for tree-based point cloud coding
US11109066B2 (en) 2017-08-15 2021-08-31 Nokia Technologies Oy Encoding and decoding of volumetric video
RU2767771C1 (ru) * 2018-04-11 2022-03-21 InterDigital VC Holdings, Inc. Method and equipment for encoding/decoding a point cloud representing a three-dimensional object
US11405643B2 (en) 2017-08-15 2022-08-02 Nokia Technologies Oy Sequential encoding and decoding of volumetric video
US11790562B2 (en) 2018-01-19 2023-10-17 Interdigital Vc Holdings, Inc. Method and apparatus for encoding and decoding three-dimensional scenes in and from a data stream

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CLEMENS BIRKLBAUER ET AL: "Panorama light-field imaging", ACM SIGGRAPH 2012 POSTERS - ASSOCIATION FOR COMPUTING MACHINERY-DIGITAL LIBRARY, ACM, NEW YORK, NY, 5 August 2012 (2012-08-05), pages 1, XP058008195, ISBN: 978-1-4503-1682-8, DOI: 10.1145/2342896.2342971 *
DRAGOTTI, P. L., LAYER-BASED REPRESENTATION FOR IMAGE BASED RENDERING AND COMPRESSION

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3310052A1 (fr) * 2016-10-12 2018-04-18 Thomson Licensing Method, apparatus and stream for immersive video format
CN110546686A (zh) * 2017-05-05 2019-12-06 Qualcomm Incorporated Systems and methods for generating a structured-light depth map with a non-uniform codeword pattern
CN107301648B (zh) * 2017-06-09 2020-04-07 Dalian University of Technology Redundant point cloud removal method based on overlapping-region boundary angles
CN107301648A (zh) * 2017-06-09 2017-10-27 Dalian University of Technology Redundant point cloud removal method based on overlapping-region boundary angles
US11405643B2 (en) 2017-08-15 2022-08-02 Nokia Technologies Oy Sequential encoding and decoding of volumetric video
US11109066B2 (en) 2017-08-15 2021-08-31 Nokia Technologies Oy Encoding and decoding of volumetric video
WO2019143486A1 (fr) * 2018-01-19 2019-07-25 Interdigital Vc Holdings, Inc. Method and apparatus for encoding and decoding three-dimensional scenes in and from a data stream
CN111742548A (zh) * 2018-01-19 2020-10-02 InterDigital VC Holdings Method and apparatus for encoding three-dimensional scenes into, and decoding them from, a data stream
EP3515066A1 (fr) * 2018-01-19 2019-07-24 Thomson Licensing Method and apparatus for encoding and decoding three-dimensional scenes in and from a data stream
US11375235B2 (en) 2018-01-19 2022-06-28 Interdigital Vc Holdings, Inc. Method and apparatus for encoding and decoding three-dimensional scenes in and from a data stream
CN111742548B (zh) * 2018-01-19 2023-06-27 InterDigital VC Holdings Method and apparatus for encoding three-dimensional scenes into, and decoding them from, a data stream
US11790562B2 (en) 2018-01-19 2023-10-17 Interdigital Vc Holdings, Inc. Method and apparatus for encoding and decoding three-dimensional scenes in and from a data stream
RU2767771C1 (ru) * 2018-04-11 2022-03-21 InterDigital VC Holdings, Inc. Method and equipment for encoding/decoding a point cloud representing a three-dimensional object
US11856222B2 (en) 2018-04-11 2023-12-26 Interdigital Vc Holdings, Inc. Method and apparatus for encoding/decoding a point cloud representing a 3D object
CN108965859A (zh) * 2018-07-09 2018-12-07 Goertek Technology Co., Ltd. Projection mode identification method, video playback method, apparatus and electronic device
WO2021062530A1 (fr) * 2019-10-01 2021-04-08 Blackberry Limited Angular mode syntax for tree-based point cloud coding

Similar Documents

Publication Publication Date Title
US11599968B2 (en) Apparatus, a method and a computer program for volumetric video
WO2016003340A1 (fr) Codage et décodage de champs lumineux
US11405643B2 (en) Sequential encoding and decoding of volumetric video
US11202086B2 (en) Apparatus, a method and a computer program for volumetric video
EP3751857A1 (fr) Procédé, appareil et produit programme informatique de codage et décodage de vidéos volumétriques
EP2005757B1 (fr) Codage efficace de vues multiples
WO2019135024A1 (fr) Appareil, procédé et programme informatique pour vidéo volumétrique
JP2019534606A (ja) ライトフィールドデータを使用して場面を表す点群を再構築するための方法および装置
WO2018098054A1 (fr) Codec uv centré sur un décodeur pour la diffusion en continu de vidéo à point de vue libre
US20150215600A1 (en) Methods and arrangements for supporting view synthesis
EP3562159A1 (fr) Procédé, appareil et flux pour format vidéo volumétrique
Graziosi et al. Depth assisted compression of full parallax light fields
US20150304640A1 (en) Managing 3D Edge Effects On Autostereoscopic Displays
Morvan et al. System architecture for free-viewpoint video and 3D-TV
WO2020006035A1 (fr) Accès aléatoire dans des images de champ lumineux à parallaxe totale codées
JP7344988B2 (ja) ボリュメトリック映像の符号化および復号化のための方法、装置、およびコンピュータプログラム製品
WO2019115867A1 (fr) Appareil, procédé, et programme d'ordinateur pour vidéo volumétrique
WO2019122504A1 (fr) Procédé de codage et de décodage de données vidéo volumétriques
KR102505130B1 (ko) 명시야 컨텐츠를 표현하는 신호를 인코딩하기 위한 방법 및 디바이스
EP3698332A1 (fr) Appareil, procédé, et programme d'ordinateur pour vidéo volumétrique
WO2011094164A1 (fr) Systèmes d'optimisation d'image utilisant des informations de zone
EP3709659A1 (fr) Procédé et appareil de codage et de décodage de vidéo volumétrique
JP4937161B2 (ja) 距離情報符号化方法,復号方法,符号化装置,復号装置,符号化プログラム,復号プログラムおよびコンピュータ読み取り可能な記録媒体
WO2019185983A1 (fr) Procédé, appareil et produit-programme d'ordinateur destinés au codage et au décodage de vidéo volumétrique numérique
US20220345681A1 (en) Method and apparatus for encoding, transmitting and decoding volumetric video

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14747162

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14747162

Country of ref document: EP

Kind code of ref document: A1