WO2018053591A1 - Base anchored models and inference for the compression and upsampling of video and multiview imagery - Google Patents

Base anchored models and inference for the compression and upsampling of video and multiview imagery

Info

Publication number
WO2018053591A1
Authority
WO
WIPO (PCT)
Prior art keywords
base
displacement
frame
mesh
gop
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/AU2017/051030
Other languages
English (en)
French (fr)
Inventor
David Scott Taubman
Dominic Patric RUEFENACHT
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NewSouth Innovations Pty Ltd
Original Assignee
NewSouth Innovations Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2016903815A external-priority patent/AU2016903815A0/en
Application filed by NewSouth Innovations Pty Ltd filed Critical NewSouth Innovations Pty Ltd
Priority to US16/335,297 priority Critical patent/US11122281B2/en
Priority to EP17851993.0A priority patent/EP3516872A4/en
Priority to JP2019536625A priority patent/JP7279939B2/ja
Priority to AU2017331736A priority patent/AU2017331736B2/en
Publication of WO2018053591A1 publication Critical patent/WO2018053591A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/31Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/001Model-based coding, e.g. wire frame
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/004Predictors, e.g. intraframe, interframe coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/177Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a group of pictures [GOP]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/537Motion estimation other than block-based
    • H04N19/54Motion estimation other than block-based using feature points or meshes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • the present invention relates to an apparatus and a method for coding a video signal, and particularly, but not exclusively, to a method and an apparatus for implementing a representation of displacement information (i.e., a model) between video frames.
  • The term "frame" refers to frames of a video sequence, as well as to views in a multi-view setting.
  • Embodiments of the invention are not concerned with the generation of such models, but with how they can be used to "infer" frames in between already decoded frames.
  • motion compensated temporal lifting transforms [4], also known as motion compensated temporal filtering, or just MCTF.
  • the present invention provides a method of representing displacement information between the frames of a video and/or multiview sequence, comprising the steps of assigning a plurality of the frames to a Group of Pictures (GOP), providing a base displacement model for each GOP, the base displacement model describing a displacement field that carries each location in a designated base frame of the GOP to a corresponding location in each other frame of the GOP, and inferring other displacement relationships between the frames of the GOP from the base displacement model.
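  • For illustration only, the following Python sketch (not part of the patent) shows the inference principle in its simplest form: because every displacement field is anchored at the base frame, the displacement carrying frame j to frame k follows by differencing base-anchored fields. All names are hypothetical, and a real system would additionally resample the result onto frame j's grid and resolve occlusions.

        import numpy as np

        def infer_displacement(u0j, u0k):
            # u0j, u0k: (H, W, 2) base-anchored displacement fields sampled on
            # the base-frame grid, carrying each base-frame location s into
            # frames j and k respectively.
            H, W, _ = u0j.shape
            ys, xs = np.mgrid[0:H, 0:W]
            s = np.stack([xs, ys], axis=-1).astype(np.float64)
            loc_in_j = s + u0j      # where each base-frame location lands in frame j
            u_j_to_k = u0k - u0j    # inferred displacement from frame j to frame k
            return loc_in_j, u_j_to_k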
  • a video signal may be a multi-view video signal.
  • a GOP may consist of frames from multiple views at the same time instance and/or frames from a view taken at different time instances.
  • a video signal may be a single dimensional video sequence.
  • a GOP need not be one-dimensional, as is traditionally the case for single-view video compression, or in the case of a multi-view arrangement where all views are arranged in a 1D array.
  • 2D groups of pictures are the most appropriate construct for multi-view imagery associated with a 2D array of cameras, while 3D GOPs are the most appropriate construct when the cameras in such an array each capture a video sequence.
  • the term "displacement” covers a number of parameters associated with the images, including motion, depth and disparity (particularly for multi-view imagery and video), location information, and other parameters .
  • this represents a new way to describe, compress and infer displacements, in which all displacement information for a group of pictures (GOP) is derived from a base model, whose displacement representations are anchored at the GOP's base frame.
  • one piecewise smooth 2D displacement field is encoded for each frame, but all of the displacement fields associated with a GOP are anchored at its base frame. Collectively, we identify these displacement fields as the base model.
  • One advantage of anchoring all descriptions for the GOP at the same frame is that it facilitates various compact descriptions of the multitude of displacement fields. By anchoring all displacements at the base frame, a single description of boundary discontinuities can be applied to all displacement fields, these boundary discontinuities being critical to the description of piecewise continuous models in general.
  • energy compacting transforms are readily applied directly to the collection of displacement fields.
  • parametric models may be employed to express the base model using a reduced set of displacement parameters.
  • parametric representations can be based on physical attributes such as velocity and acceleration.
  • the apparent displacement between frames of the GOP may be related to geometric properties, notably scene depth, so that depth or reciprocal depth provides the natural basis for parametric descriptions.
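  • As a hedged sketch of such a depth-parametrized base model (assuming a rectified camera array; the function and its parameters are illustrative, not taken from the patent), the disparity anchored at the base view is linear in reciprocal depth:

        def base_model_from_depth(inv_depth, baselines, focal):
            # inv_depth: (H, W) reciprocal-depth map anchored at the base view.
            # For a rectified array, the horizontal disparity carrying base-view
            # locations to the view at camera offset b is focal * b / Z, i.e.
            # proportional to reciprocal depth; one field per non-base view.
            return [focal * b * inv_depth for b in baselines]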
  • the base-anchored framework can support high quality temporal motion inference, which is computationally efficient and requires as few as half the coded motion fields used in conventional codecs, where the common tool of bi-directional prediction assigns two motion fields to each target frame.
  • the base-anchored approach advantageously provides more geometrically consistent and meaningful displacement information.
  • the availability of geometrically consistent displacement information improves visual perception, and facilitates the efficient deployment of highly scalable video and multi-view compression systems based on displacement-compensated lifting, where the feedback state machine used in traditional codecs is replaced by purely feed-forward transforms.
  • the present invention provides a method for coding displacement fields within a video sequence, wherein the video frames are assigned to Groups of Pictures known as GOPs, a base displacement model is coded for each GOP, describing the displacement that carries each location in a designated base frame of the GOP to a corresponding location in each other frame of the GOP, and other displacement relationships between the frames of the GOP are inferred from the base displacement model, in accordance with the method of the first aspect of the invention.
  • the present invention provides a method for displacement compensated prediction of certain image frames from other frames, wherein the frames are assigned to groups of pictures (GOPs), a base displacement model is provided for each GOP, describing the displacement that carries each location in a designated base frame of the GOP to a corresponding location in each other frame of the GOP, this base displacement model being used to infer displacement relationships between the frames of the GOP, and the inferred displacement field at a prediction target frame being used to predict that frame from one or more other frames in the GOP.
  • the present invention provides a coding apparatus arranged to implement a method for representing displacement information in accordance with the first aspect of the invention.
  • the present invention provides a coding apparatus arranged to implement a method for coding displacement fields in accordance with the second aspect of the invention.
  • the present invention provides a coding apparatus arranged to implement a method for displacement compensated prediction in accordance with the third aspect of the invention.
  • the present invention provides a decoding apparatus arranged to decode a signal coded by an apparatus in accordance with the fourth aspect of the invention or the fifth aspect of the invention or the sixth aspect of the invention.
  • the present invention provides a computer program, comprising instructions for controlling a computer to implement a method in accordance with the first aspect, second aspect or third aspect of the invention.
  • the present invention provides a non-volatile computer readable medium, providing a computer program in accordance with the eighth aspect of the invention.
  • the present invention provides a data signal, comprising a computer program in accordance with the eighth aspect of the invention.
  • Figure 1 is an illustration of base-anchored displacement in the case of a 1D Group of Pictures (GOP), in accordance with an embodiment.
  • Figure 2 is an illustration of a number of representative frames of an image/video sequence, illustrating principles of the base anchoring, in accordance with an embodiment.
  • Figure 3 is an illustration of a displacement backfilling strategy, in accordance with an embodiment.
  • Figure 4 is an illustration of a double mapping resolving procedure, in accordance with an embodiment of the invention.
  • Figure 5 is an illustration of the extension of the base anchoring to higher-dimensional GOPs, in accordance with an embodiment.
  • Figure 6 is an overview of an encoder that employs the base model and inference scheme in accordance with an embodiment of the invention.
  • Figure 7 is an overview of a decoder that employs the base model and inference scheme in accordance with an embodiment of the invention.

Detailed description of embodiments
  • Figure 2 illustrates some of the key ideas behind the base-anchored framework.
  • displacement information for a GOP is described within its base frame, denoted here as f_0.
  • the displacement field is piecewise smooth, being expected to exhibit discontinuities around object boundaries.
  • One way to describe such displacement fields is through a triangular mesh that is allowed to tear at certain locations - its breaks.
  • Methods for representing and encoding such displacement models exist. For example, in the case of single-view video compression, [19] generalizes the mesh to a wavelet-based motion model based on affine interpolation, which is coupled with an efficient and highly scalable method for coding "arc breakpoints" [20], which adapt the wavelet basis functions in the vicinity of displacement discontinuities.
  • Figure 2 shows only a coarse regular triangular mesh, with 3 illustrative breakpoints.
  • nodes of the mesh carry one displacement vector for each of the N−1 non-base frames in the N-frame GOP, denoted u_{0→j}.
  • the node displacement vectors serve to continuously warp the mesh from the base frame to each other frame in the GOP.
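  • The continuous warp implied by the node vectors can be illustrated with barycentric (affine) interpolation inside one triangle; a minimal sketch with hypothetical names, not the patent's coding scheme:

        import numpy as np

        def displacement_in_triangle(p, tri, disp):
            # tri:  (3, 2) triangle vertices in the base frame.
            # disp: (3, 2) displacement vectors carried by those nodes.
            # p:    (2,)   query location inside the triangle.
            a, b, c = np.asarray(tri, dtype=np.float64)
            m = np.column_stack([b - a, c - a])          # 2x2 edge matrix
            w1, w2 = np.linalg.solve(m, np.asarray(p, dtype=np.float64) - a)
            w0 = 1.0 - w1 - w2                           # barycentric weights
            d = np.asarray(disp, dtype=np.float64)
            return w0 * d[0] + w1 * d[1] + w2 * d[2]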
  • mesh nodes outside the original frame can be assigned the displacement of their adjacent node(s) in the frame.
  • these nodes have to be mapped according to their displacement vector, effectively extrapolating the displacement at frame boundaries rather than creating a linear ramp, as would result from assigning a zero displacement vector.
  • While these elementary extension methods are sufficient, more physically meaningful extension mechanisms will be apparent to those skilled in the art.
  • information from base meshes from adjacent GOPs can be used in such regions; one way to achieve this is explained in Section 4.1.2, which describes a general method for "augmenting" the base mesh in disoccluded regions.
  • Where the base model incorporates scene depth (or reciprocal depth) information, the ambiguity created by double mappings can be resolved immediately by identifying the visible location as the one with smallest depth.
  • Where explicit depth information is either not available or not sufficient to describe the displacement relationships between the base frame and all other frames of the GOP, more advanced techniques can be used to resolve double mappings - see Section 4.2 for a description of how discontinuities in the displacement field can be used to identify local foreground objects.
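  • The depth-based rule reduces to a z-buffer style test; a minimal sketch (illustrative only):

        def resolve_double_mapping_by_depth(candidates):
            # candidates: list of (base_frame_location, depth) pairs that all
            # map to the same target-frame location; with depth in the base
            # model, the visible candidate is simply the one nearest the camera.
            return min(candidates, key=lambda c: c[1])[0]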
  • each breakpoint effectively introduces two new mesh nodes, illustrated by green and orange dots in the figure, whose locations coincide with the break in the base frame and whose displacement vectors are obtained by replicating or extrapolating the displacement vector from each end of the arc that is broken. Mapping these new break-induced nodes into each non-base frame, using their respective displacement vectors, can open up "holes" in the mesh, corresponding to regions in the non-base frame that are not visible from the base frame. These so-called disoccluded regions are illustrated by pink shading in the figure.
  • Break-induced discontinuities in the displacement field also provide a rich source of double mappings (not shown in the figure), corresponding to areas of occlusion; as one side of a foreground object disoccludes background content, the other side typically produces double mappings.
  • an embodiment employs a novel backfilling methodology.
  • Triangles formed by breakpoint-induced nodes in the base frame are necessarily stretching (their area increases significantly when they are mapped). Roughly half of these stretching triangles, which are expected to form around discontinuities (i.e., object boundaries) in the displacement field, are "disoccluding"; the remaining stretching triangles are "folding", indicating regions of double mappings. Disoccluding triangles are identified as triangles with a positive determinant, whereas folding triangles are characterized by a negative determinant.
  • Folding triangles map to regions where at least two other triangles (one from the local foreground and one from the local background object) also map, and hence are discarded.
  • Disoccluding triangles can map to regions where no other triangle maps to; these triangles need to be handled separately, which is described in the following. Holes created by disocclusion are first filled by adding so-called "break-induced mesh elements" that link the breakpoint-induced nodes in the base frame. These mesh elements have zero size in the base frame, but expand to fill disoccluded regions in the non-base frames, as illustrated by the dashed red lines in Figure 2.
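  • The determinant test described above can be sketched as follows (the stretch threshold is an illustrative assumption; signed triangle areas change sign exactly when the affine map's determinant is negative):

        import numpy as np

        def classify_mapped_triangle(tri_base, tri_mapped, stretch=4.0):
            def area(t):
                t = np.asarray(t, dtype=np.float64)
                return 0.5 * np.cross(t[1] - t[0], t[2] - t[0])  # signed area
            a0, a1 = area(tri_base), area(tri_mapped)
            if a0 == 0.0:                    # zero-size (break-induced) element
                return "disoccluding" if a1 != 0.0 else "degenerate"
            if a1 / a0 < 0.0:                # negative determinant -> fold
                return "folding"
            if abs(a1) > stretch * abs(a0):  # strong expansion -> disocclusion
                return "disoccluding"
            return "regular"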
  • disoccluded regions in a non-base frame always involve substantial expansion, where the displacement found within a small region in the base frame expands that region to a much larger one in the non-base frame.
  • The use of break-induced nodes avoids any ambiguity in the identification of disocclusions, because the break-induced nodes from each side of a discontinuity in the displacement field are co-located in the base frame, leading to mesh elements with zero area that exhibit infinite expansion ratios wherever disocclusions arise in the non-base frame.
  • The inclusion of such ∞ (infinity) elements in the mesh is sufficient to cover all regions of disocclusion in every non-base frame. This means that each frame in a GOP is certain to be covered by mesh elements from the base frame that are mapped in accordance with the associated displacements.
  • While ∞ elements do ensure that a reverse displacement field exists everywhere, pointing back to the base frame from any non-base frame, they do not lead to physically meaningful reverse displacement values. This is because half the break-induced nodes (e.g., full black circles (dots)) associated with an ∞ element move with the background, while the other half (e.g., black dots in circles) move with the foreground. Displacement within a disoccluded region, however, should be associated entirely with the (local) background.
  • the backfilling scheme starts by assigning new displacements to disoccluded regions in the last frame of the GOP, which can be identified as the "back-filled" frame for the purpose of this description.
  • each disoccluded region in the back-filled frame starts out being covered by ∞ elements that have zero size in the base frame.
  • The first, more generic, method extrapolates the local background information.
  • The second method, called base-mesh augmentation, leverages displacement information that is provided at the back-filled frame by other means to augment the current base mesh; this method is of particular interest when the back-filled frame coincides with the base frame of another GOP.
  • each mapped ⁇ element that is visible in the back-filled frame is first replicated to produce what will become a back-fill element. Consequently, the mesh nodes that delineate each back-fill element include at least two break-induced nodes, being co-located in the base frame. For each pair of break-induced nodes, one belongs to the foreground side of the break and one belongs to the background side of the break. Distinguishing between these is very important to the back-filling procedure.
  • Each break-induced node that is identified as belonging to the foreground is also replicated, and the replica is associated with the relevant back-fill element(s) in place of the original break-induced node that is associated only with ∞ element(s).
  • the replicated nodes are identified as back-fill nodes, and are illustrated in Figure 3b as purple dots; these can also be viewed as "free nodes", since the displacement vector initially assigned to these nodes from the base model can be freely changed to improve consistency with the uncovered background, whose displacement vectors should be modeled by the back-fill elements.
  • the mapped locations of back-fill nodes in the back-filled frame must agree with the mapped locations of the break-induced nodes from which they were spawned, but the backfilling strategy assigns new displacement vectors to these nodes, effectively changing their locations within all other frames, including the base frame.
  • the ⁇ elements that span disocclusions in the back-filled frame are remapped into back-fill elements (triangles g-h in Figure 3b) , whose appearance is identical to the corresponding ⁇ elements in the back-filled frame, but not in the other frames .
  • Displacement vectors are assigned to back-fill nodes based on an extrapolation procedure that averages the displacement vectors found on the other node(s) of the back-fill mesh elements - these are the non-free nodes from each pair of break-induced nodes that defined the original ∞ elements.
  • the displacement vectors for the free nodes are found by assigning a weighted average of all the "fixed" nodes (i.e., the ones carrying local background displacement information) in the grid of the base mesh, obtained via a splatting procedure. This creates a lookup table of displacement values that can efficiently be implemented on a computer graphics card.
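  • A Gaussian distance weighting can stand in for the splatting-based lookup table; a sketch under that assumption (sigma and all names are illustrative):

        import numpy as np

        def backfill_node_displacement(free_pos, fixed_pos, fixed_disp, sigma=8.0):
            # free_pos:  (2,)   base-frame location of a back-fill (free) node.
            # fixed_pos: (K, 2) nodes carrying local background displacement;
            # fixed_disp:(K, 2) their displacement vectors.
            d2 = np.sum((np.asarray(fixed_pos) - np.asarray(free_pos)) ** 2, axis=1)
            w = np.exp(-d2 / (2.0 * sigma ** 2))
            w /= w.sum()                     # normalized distance weights
            return (w[:, None] * np.asarray(fixed_disp)).sum(axis=0)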
  • the original 2D base mesh is converted to a layered mesh through inter-frame reasoning alone, without the need for any additional coded displacement information or other side information.
  • This section describes a way of augmenting the current base mesh with information from another mesh - this can be either a base mesh of another GOP, or another mesh that has been coded; we refer to this mesh as the augmenting mesh.
  • The augmenting elements are all triangles of the other base frame that cover regions mapped by ∞ (infinity) triangles from the current base mesh, i.e., the set of all disoccluded regions.
  • the main appeal of base mesh augmentation is that it is able to handle disoccluded regions where new objects are appearing.
  • the main issue that arises with base mesh augmentation is that, since the augmenting mesh elements are only applied in regions where the current base mesh has no valid values, inconsistencies can be created at the (hard) transition boundary from the current base mesh to the backfilled augmenting mesh elements.
  • the augmenting mesh elements can potentially be large enough to span unrelated disocclusion regions, leading to ambiguities in the back-filling procedure.
  • the back-fill mesh elements are all collected within the base frame as part of an augmented base model, where they can be interpreted as an inferred local background layer, which guarantees consistent displacement assignments in disoccluded regions across multiple interpolated frames.
  • The procedure is referred to as back-filling because the assignment of displacement vectors to backfill nodes, which leads to new underlying mesh elements in the base frame, is driven initially from the last frame of the GOP - the back-filled frame. This is the frame that is furthest from the base frame, where regions of disocclusion are likely to be largest. While not strictly necessary, it is desirable to arrange for the inter-frame transform to provide intra-coded texture information at the base frames.
  • the remapped mesh elements produced by backfilling correspond to content that can be predicted from the last frame of the GOP (the next base frame) but not from the GOP's own base frame.
  • the new back-fill elements are defined by mesh nodes that arise from pairs of original break-induced nodes, whose foreground/background assignment is either already known (from preceding back-filling steps) or needs to be newly determined, as explained above. Once determined, the background node within each break-induced pair is replicated to form a new back-fill node that is associated with the relevant back-fill element. These new back-fill nodes are free to be assigned new displacement vectors, using the same extrapolation procedure described in Section 4.1.1, which results in the back-fill mesh elements constituting a new local background layer within the base frame.
  • the base model is progressively augmented with backfill nodes and back-fill mesh elements, so that the base model eventually describes the relationship between all frames of the GOP in a complete and geometrically consistent manner.
  • The addition of back-fill elements means that double mappings become increasingly likely, as mesh elements from the augmented base model are mapped to new intermediate frames. All such double mappings can be resolved using the methods briefly introduced already. It is helpful, however, to assign a layer-ID to each back-fill element, which identifies the back-filled frame in which the backfill element was discovered.
  • Original elements of the base mesh are assigned layer-ID 0.
  • Elements introduced when back-filling the last frame of the GOP (e.g., f_N) are assigned layer-ID 1.
  • Elements introduced when backfilling the first intermediate frame (e.g., f_{N/2}) are assigned layer-ID 2, and so forth.
  • the ⁇ elements are conceptually assigned a layer-id of ⁇ . This way, when a frame location is mapped by different mesh elements, the double mapping can be resolved in favour of the element with smaller ID.
  • each node in the (augmented) base model is assigned a unique ID and each mesh element is also assigned a unique ID.
  • When mapping base model elements to another frame, that frame is assigned an ID-map, with one ID for every pixel location.
  • the ID-map is populated progressively as each mesh element is mapped, transferring the element's ID to all empty locations in the ID-map that are covered by the mapped element. Double mappings are discovered immediately when a mesh element covers a location in the mapped frame's ID-map that is not empty.
  • the existing ID is used to immediately discover the mesh element that maps to the same location as the one currently being considered, and the double mapping resolution techniques are applied, as described already.
  • any locations in the ID-map that identify ∞ elements are the ones for which back-filling is required.
  • Simple pixel and reference counting techniques can be used to identify all ∞ elements that remain visible, each of which is replicated to produce a back-fill element and back-fill nodes that are remapped.
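  • The ID-map mechanics can be sketched as follows (rasterization of mapped triangles into pixel lists is assumed to happen elsewhere; names are illustrative):

        import numpy as np

        def paint_id_map(height, width, mapped_elements):
            # mapped_elements: iterable of (element_id, layer_id, pixels), with
            # `pixels` the (y, x) locations covered by the mapped element.
            # Double mappings resolve in favour of the smaller layer-ID, so
            # infinity elements (layer_id = float('inf')) lose to everything;
            # pixels still identifying them afterwards require back-filling.
            id_map = np.full((height, width), -1, dtype=np.int64)  # -1 = empty
            layer = np.full((height, width), np.inf)
            for elem_id, layer_id, pixels in mapped_elements:
                for y, x in pixels:
                    if id_map[y, x] < 0 or layer_id < layer[y, x]:
                        id_map[y, x] = elem_id
                        layer[y, x] = layer_id
            return id_map, layer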
  • an embodiment employs a recursive back-filling strategy.
  • the first back-filled frame associated with the GOP based at frame f_0 is f_N.
  • the next back-filled frame is f_{N/2}. After this, frames f_{N/4} and f_{3N/4} are back-filled. The process continues in this way, following a breadth-first dyadic tree scan of the GOP.
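  • The visiting order itself is easy to generate; a sketch for N a power of two:

        def backfill_order(N):
            # Breadth-first dyadic scan of an N-frame GOP: f_N first, then
            # f_{N/2}, then f_{N/4} and f_{3N/4}, and so on down to step 1.
            order, step = [N], N
            while step > 1:
                step //= 2
                order.extend(range(step, N, 2 * step))
            return order

        # backfill_order(8) -> [8, 4, 2, 6, 1, 3, 5, 7]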
  • the foreground displacement model should be the one that maps discontinuities in the base mesh for frame f_0 to discontinuities in the next base mesh, associated with frame f_N.
  • the last frame of each GOP is also the first frame of the next GOP, so that the next GOP's base displacement model M_N, which is anchored in frame f_N, can be compared with the displacement found in the base displacement model M_0 of the current GOP.
  • a discontinuity (or break) in the base displacement model M_0 can be mapped to frame f_N using the displacement vector found on either side of the break; the displacement vector which maps the discontinuity to a region of similar divergence or convergence in model M_N is the one that is more likely to correspond to the foreground displacement vector.
  • each pair of co-located break-induced nodes in the base mesh maps to a line segment in the back-filled frame that spans the disoccluded region. This phenomenon corresponds to divergence in the base model M_0. From each such pair of break-induced nodes, the free node is identified as the one whose location in frame f_N exhibits a divergence value in the next base model M_N that is most similar to the divergence in M_0.
  • the mesh nodes or regions that belong to the foreground are determined as follows (see Figure 4).
  • the "origin" of any detected double mapping s k in non-base frame f k is found by searching the line segment connecting the corresponding source locations r and r ⁇ in the base-frame, looking for the location at which the displacement field folds.
  • We refer to this line segment as the "fold search path". Folding is associated with convergence in the base displacement field, so the fold location is identified as the point along the search path at which the displacement convergence value (negative divergence) is largest. This location usually corresponds to a break in the base displacement field.
  • the fold location is mapped to frame f_N using the displacement vectors on each side of the fold (e.g., at a distance of 1 pixel in each direction along the fold search path), and the divergence in the next base displacement model M_N is compared with that in the base displacement model M_0 to discover which displacement vector belongs to the foreground.
  • the foreground side of the fold is the one whose displacement vector carries it to a location of similar convergence (negative divergence) in the next base frame.
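  • The fold search reduces to locating the most negative divergence along a segment; a minimal sketch with nearest-pixel sampling (sample count and names are illustrative):

        import numpy as np

        def divergence(u):
            # u: (H, W, 2) displacement field with components ordered (dx, dy).
            return np.gradient(u[..., 0], axis=1) + np.gradient(u[..., 1], axis=0)

        def find_fold(u, r1, r2, samples=64):
            # Search the fold search path between base-frame locations
            # r1 = (x, y) and r2 for the point of largest convergence
            # (most negative divergence).
            div = divergence(u)
            ts = np.linspace(0.0, 1.0, samples)[:, None]
            pts = (1.0 - ts) * np.asarray(r1, float) + ts * np.asarray(r2, float)
            vals = [div[int(round(y)), int(round(x))] for x, y in pts]
            return pts[int(np.argmin(vals))]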
  • Figure 4 is an illustration of a double mapping resolving procedure that uses divergence of the displacement field to identify the foreground object.
  • T 0 ⁇ t defines the affine mapping from frame f 0
  • i"o and i"g map to the same location m in f t .
  • The prediction at each location m in the target frame can then be computed as a weighted combination of all reference frames in which the location is visible, weighted by the distance of the target frame to the respective reference frame; here d(v) denotes the distance measure used to form these weights.
  • For locations that are deemed not visible in any reference frame, a number of methods can be applied. In the formulation above, we resort to simple weighted prediction for such locations. Another, preferred, way is to employ inpainting strategies to fill in all locations that are not visible in any reference frame, which generally leads to more plausible interpolations in those regions.
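  • A vectorized sketch of this weighted, visibility-gated prediction (with the simple weighted-prediction fallback for locations visible nowhere; the array layouts are assumptions):

        import numpy as np

        def predict_target(refs, vis, dists):
            # refs:  (N, H, W) motion-compensated reference frames.
            # vis:   (N, H, W) boolean visibility of each location per reference.
            # dists: length-N distances between target and reference frames.
            w = 1.0 / np.asarray(dists, dtype=np.float64)[:, None, None]
            w_full = np.broadcast_to(w, refs.shape)
            w_vis = np.where(vis, w_full, 0.0)
            denom = w_vis.sum(axis=0)
            visible = (w_vis * refs).sum(axis=0) / np.maximum(denom, 1e-12)
            fallback = (w_full * refs).sum(axis=0) / w_full.sum(axis=0)
            return np.where(denom > 0.0, visible, fallback)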
  • Figure 5 is an example tiling of a higher-dimensional group of pictures (GOP).
  • Each GOP has one base frame in its upper left corner. Adjacent GOPs overlap by one frame in each direction, and the frames that would be base frames for additional overlapping GOPs that do not exist are shown in light grey text - these are so-called "disenfranchised base frames".
  • Figure 5 shows a GOP tiling scheme that represents the most natural extension of the 1D GOP structure presented before.
  • the base frame for a GOP is in its upper left corner, and adjacent GOPs overlap by one frame, horizontally and vertically.
  • the displacement-compensated inter-frame transform involves prediction only between frames that are found in the same GOP. This is why we require the GOPs to overlap. This way, common frames found at the intersection between adjacent GOPs can be used to predict frames found within each of those GOPs.
  • frame f_{2M,0} has no GOP of its own, unless the tiling continues to a third row of GOPs.
  • These disenfranchised base frames need carry no coded displacement information. However, it could be beneficial to selectively encode displacement information within a disenfranchised base frame so that it can be used to improve the quality of back-filled geometry for the GOPs that include it.
  • the back-filling algorithm is identical for 1D and 2D GOPs, but in the 2D case there is no obvious order in which frames should be visited to generate back-fill mesh elements.
  • the back-filling order determines the order in which the base model for a GOP is augmented, which ultimately affects the inferred displacement values that are generated for any given frame of the GOP. It is important that the back-filling order is well-defined, since the displacement-compensated inter-frame transform generally depends upon it.
  • the motion-compensated prediction of a target frame f_t can exhibit visible artefacts around object boundaries, where the displacement field is likely to be discontinuous.
  • the target frame f_t is predicted from N reference frames f_r, r ∈ {1, …, N}; we note here that, depending on the transform structure, the base frame itself might be a target frame.
  • the upsampled frame interpolated using the occlusion-aware frame interpolation method can have problems; the sudden transition from uni- to multi-directional prediction can lead to artificial boundaries in places where the illumination changes between the two reference frames.
  • the method we propose therefore consists of limiting the frequency content at each location of f_t to that of the motion-compensated reference frames.
  • this is achieved in the wavelet domain.
  • We use f̂_t to denote the 2D wavelet decomposition (interleaved) of frame f_t, and f̂_t[k] to access a specific wavelet coefficient k, where k collects information about level, subband, and spatial position in the transform.
  • T[k] = max( f̂_{r_1}[k] · v_{r_1→t}[k], …, f̂_{r_N}[k] · v_{r_N→t}[k] ), where v_{r→t}[k] weights coefficient k of reference frame r by its visibility in the target frame. That is, T[k] represents the largest (visible) wavelet coefficient of the wavelet decompositions of the motion-compensated reference frames.
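  • A simplified sketch of this coefficient-limiting step, using the PyWavelets package and omitting the visibility weighting (the wavelet, level and all names are illustrative assumptions):

        import numpy as np
        import pywt  # PyWavelets

        def limit_frequency_content(target, refs, wavelet="haar", level=3):
            # Bound each wavelet coefficient of the predicted target frame by
            # the largest coefficient magnitude found at the same position in
            # the reference decompositions.
            t = pywt.wavedec2(target, wavelet, level=level)
            rs = [pywt.wavedec2(r, wavelet, level=level) for r in refs]

            def clip(band, ref_bands):
                bound = np.max(np.abs(np.stack(ref_bands)), axis=0)
                return np.clip(band, -bound, bound)

            out = [clip(t[0], [r[0] for r in rs])]          # approximation band
            for lev in range(1, len(t)):                    # detail subbands
                out.append(tuple(clip(t[lev][b], [r[lev][b] for r in rs])
                                 for b in range(3)))
            return pywt.waverec2(out, wavelet)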
  • synthesizing optical blur uses the divergence of the mapped (and inverted) displacement field M_{t→b} as an indication of displacing object boundaries.
  • a low-pass filter is applied to all pixels where the absolute value of the divergence of the displacement field is larger than a certain threshold τ; the displacement-compensated non-base frame with optical blur synthesis, denoted f_t^OBS, is then obtained as f_t^OBS[m] = Σ_n h[n] · f_t[m - n] at pixels where |div M_{t→b}[m]| > τ, and f_t^OBS[m] = f_t[m] elsewhere.
  • h[m] is the kernel of a two-dimensional low-pass filter.
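  • A minimal sketch of the blur synthesis (a box filter stands in for h[m]; tau and the kernel size are illustrative assumptions):

        import numpy as np
        from scipy.ndimage import uniform_filter

        def synthesize_optical_blur(frame, disp, tau=0.5, size=5):
            # Low-pass filter only the pixels where |divergence| of the mapped
            # displacement field exceeds tau; pass the rest through unchanged.
            div = (np.gradient(disp[..., 0], axis=1) +
                   np.gradient(disp[..., 1], axis=0))
            blurred = uniform_filter(np.asarray(frame, dtype=np.float64), size=size)
            return np.where(np.abs(div) > tau, blurred, frame)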
  • Figure 6 shows an overview of an encoder that employs the proposed base model.
  • Input to the encoder scheme is either a video sequence, multi-view imagery, or a multi-view video sequence.
  • the base model is
  • Figure 7 shows an overview of a decoder that employs the proposed base model. Firstly, the subbands and base model are decoded 110. Then, the subbands are subjected to an inverse spatial transform 111. Lastly, the decoded sequence is obtained by reversing the inter-frame transform 112.
  • Encoders and decoders implementing the methods and processes described above may be implemented using hardware, software, or a combination of hardware, software and firmware. Software and firmware may be provided on computer readable mediums, or transmitted as a data signal, or in other ways.
  • An element of the base-anchored approach in accordance with embodiments described above is the displacement backfilling procedure, whereby local background displacement layers are added to the base model whenever disocclusion holes are observed during displacement inference. These "background layers" guarantee the assignment of geometrically consistent displacement information in regions of disocclusion, which is highly important for visual perception.
  • Another element is a robust method of identifying local foreground/background relationships around displacing objects in cases where such information cannot be deduced from the displacement information, as is for example the case when the base model does not carry explicit depth information.
  • the base anchoring approach facilitates the deployment of compression systems that are highly scalable across space (i.e., multi-view) and/or time (i.e., video), enabling a seamless upsampling of arbitrary frame-rates across both dimensions.
  • a compelling feature of the base-anchored approach is that not all the frames that are used to estimate the base displacement model must be coded. That is, one might use all frames that were recorded to estimate a high-quality displacement model; however, only a fraction of these frames is coded, and all "in-between" frames are purely interpolated using the described geometrically consistent frame interpolation procedure. This is in contrast to existing video codecs, in which displacement information must be coded for every frame that is predicted.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)
PCT/AU2017/051030 2016-09-21 2017-09-21 Base anchored models and inference for the compression and upsampling of video and multiview imagery Ceased WO2018053591A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US16/335,297 US11122281B2 (en) 2016-09-21 2017-09-21 Base anchored models and inference for the compression and upsampling of video and multiview imagery
EP17851993.0A EP3516872A4 (en) Base anchored models and inference for video and multi-view imaging compression and oversampling
JP2019536625A JP7279939B2 (ja) 2016-09-21 2017-09-21 Base anchored models and inference for the compression and upsampling of video and multiview imagery
AU2017331736A AU2017331736B2 (en) 2016-09-21 2017-09-21 Base anchored models and inference for the compression and upsampling of video and multiview imagery

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
AU2016903815A AU2016903815A0 (en) 2016-09-21 Base Anchored Motion for Video Compression and Temporal Interpolation
AU2016903815 2016-09-21
AU2017902670A AU2017902670A0 (en) 2017-07-07 Base Anchored Models for Video Compression and Frame Upsampling
AU2017902670 2017-07-07

Publications (1)

Publication Number Publication Date
WO2018053591A1 true WO2018053591A1 (en) 2018-03-29

Family

ID=61689250

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2017/051030 Ceased WO2018053591A1 (en) 2016-09-21 2017-09-21 Base anchored models and inference for the compression and upsampling of video and multiview imagery

Country Status (5)

Country Link
US (1) US11122281B2 (en)
EP (1) EP3516872A4 (en)
JP (1) JP7279939B2 (ja)
AU (1) AU2017331736B2 (en)
WO (1) WO2018053591A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114357757A (zh) * 2021-12-29 2022-04-15 Chinese Academy of Meteorological Sciences Meteorological data assimilation method, apparatus, device, readable storage medium and product
WO2024014195A1 (ja) * 2022-07-09 2024-01-18 KDDI Corporation Mesh decoding device, mesh encoding device, mesh decoding method, and program
US12401823B2 (en) 2022-06-17 2025-08-26 Tencent America LLC Vertex prediction based on decoded neighbors

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200081367A (ko) * 2017-11-09 2020-07-07 Sony Corporation Image processing device and image processing method
KR102272569B1 (ko) * 2020-05-26 2021-07-05 Korea Advanced Institute of Science and Technology (KAIST) Method and system for progressive fast recompression of wavelet-based deformed huge mesh data
CN114938461B (zh) * 2022-04-01 2024-11-01 Wangsu Science &amp; Technology Co., Ltd. Video processing method, apparatus, device and readable storage medium
CN117974814A (zh) * 2022-10-26 2024-05-03 Honor Device Co., Ltd. Method, apparatus and storage medium for image processing
WO2024148064A1 (en) * 2023-01-03 2024-07-11 Bytedance Inc. Data arrangement for dynamic mesh coding
US20240355002A1 (en) * 2023-04-24 2024-10-24 Tencent America LLC Signaling zero coefficients for displacement coding
US20240357164A1 (en) * 2023-04-24 2024-10-24 Tencent America LLC Displacement coding in mesh compression
CN119254914A (zh) * 2023-07-03 2025-01-03 Honor Device Co., Ltd. Image processing method, apparatus and electronic device
WO2025147042A1 (ko) * 2024-01-02 2025-07-10 Industry-University Cooperation Foundation Hanyang University Method for encoding and decoding improved video-based dynamic mesh data
CN117974817B (zh) * 2024-04-02 2024-06-21 Jiangsu Dinuoni Information Technology Co., Ltd. Efficient compression method and system for 3D model texture data based on image coding

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0782343A2 (en) * 1995-12-27 1997-07-02 Matsushita Electric Industrial Co., Ltd. Video coding method
WO2004028166A1 (en) * 2002-09-20 2004-04-01 Unisearch Limited Method of signalling motion information for efficient scalable video compression
US20060114995A1 (en) * 2004-12-01 2006-06-01 Joshua Robey Method and system for high speed video encoding using parallel encoders
US20100316126A1 (en) * 2009-06-12 2010-12-16 Microsoft Corporation Motion based dynamic resolution multiple bit rate video encoding

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2856548A1 (fr) * 2003-06-18 2004-12-24 France Telecom Method for representing a sequence of images by 3D models, and corresponding signal and devices
EP1790169A1 (fr) * 2004-09-15 2007-05-30 France Télécom Method for motion estimation using deformable meshes
KR100678958B1 (ko) * 2005-07-29 2007-02-06 Samsung Electronics Co., Ltd. Deblock filtering method considering intra-BL mode, and multi-layer video encoder/decoder using the method
WO2008012822A2 (en) * 2006-07-26 2008-01-31 Human Monitoring Ltd Image stabilizer
US9819946B2 (en) * 2012-04-05 2017-11-14 Newsouth Innovations Pty Limited Method and apparatus for coding of spatial data
US9749642B2 (en) * 2014-01-08 2017-08-29 Microsoft Technology Licensing, Llc Selection of motion vector precision

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0782343A2 (en) * 1995-12-27 1997-07-02 Matsushita Electric Industrial Co., Ltd. Video coding method
WO2004028166A1 (en) * 2002-09-20 2004-04-01 Unisearch Limited Method of signalling motion information for efficient scalable video compression
US20060114995A1 (en) * 2004-12-01 2006-06-01 Joshua Robey Method and system for high speed video encoding using parallel encoders
US20100316126A1 (en) * 2009-06-12 2010-12-16 Microsoft Corporation Motion based dynamic resolution multiple bit rate video encoding

Non-Patent Citations (30)

* Cited by examiner, † Cited by third party
Title
A. Golbelkar, J. Woods, "Motion-compensated temporal filtering and motion vector coding using biorthogonal filters", IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 4, April 2007, pages 417-428, XP011179773
A. Secker, D. Taubman, "Lifting-based invertible motion adaptive transform (LIMAT) framework for highly scalable video compression", IEEE Transactions on Image Processing, vol. 12, no. 12, December 2003, pages 1530-1542
A. T. Naman, D. Taubman, "Flexible synthesis of video frames based on motion hints", IEEE Transactions on Image Processing, vol. 23, no. 9, September 2014, pages 3802-3815, XP011554420, DOI: 10.1109/TIP.2014.2332763
A. Zheng, Y. Yuan, H. Zhang, H. Yang, P. Wan, O. Au, "Motion vector fields based video coding", IEEE International Conference on Image Processing, September 2015, pages 2095-2099, XP032826839, DOI: 10.1109/ICIP.2015.7351170
B.-D. Choi, J.-W. Han, C.-S. Kim, S.-J. Ko, "Motion-compensated frame interpolation using bilateral motion estimation and adaptive overlapped block motion compensation", IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 4, April 2007, pages 407-416, XP011179771
C.-L. Chang, X. Zhu, P. Ramanathan, B. Girod, "Light field compression using disparity-compensated lifting and shape adaptation", IEEE Transactions on Image Processing, vol. 15, no. 4, April 2006, pages 793-806, XP055231515, DOI: 10.1109/TIP.2005.863954
D. Kim, H. Lim, H. Park, "Iterative true motion estimation for motion-compensated frame interpolation", IEEE Transactions on Circuits and Systems for Video Technology, vol. 23, no. 3, March 2013, pages 445-454, XP011496049, DOI: 10.1109/TCSVT.2012.2207271
D. Rüfenacht, R. Mathew, D. Taubman, "A novel motion field anchoring paradigm for highly scalable wavelet-based video coding", IEEE Transactions on Image Processing, vol. 25, no. 1, January 2016, pages 39-52, XP011590586, DOI: 10.1109/TIP.2015.2496332
D. Rüfenacht, R. Mathew, D. Taubman, "Bidirectional, occlusion-aware temporal frame interpolation in a highly scalable video setting", Picture Coding Symposium (PCS), May 2015, pages 5-9
D. Sun, J. Wulff, E. Sudderth, H. Pfister, M. Black, "A fully connected layered model of foreground and background flow", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pages 2451-2458, XP032492834, DOI: 10.1109/CVPR.2013.317
D. Taubman, "High performance scalable image compression with EBCOT", IEEE Transactions on Image Processing, vol. 9, no. 7, pages 1151-1170
G. Ottaviano, P. Kohli, "Compressible motion fields", Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2013, pages 2251-2258, XP032492945, DOI: 10.1109/CVPR.2013.292
H. G. Lalgudi, M. W. Marcellin, A. Bilgin, H. Oh, M. S. Nadar, "View compensated compression of volume rendered images for remote visualization", IEEE Transactions on Image Processing, vol. 18, no. 7, July 2009, pages 1501-1511
I. Daribo, D. Florencio, G. Cheung, "Arbitrarily shaped sub-block motion prediction in texture map compression using depth information", Picture Coding Symposium (PCS), May 2012, pages 121-124, XP032449843, DOI: 10.1109/PCS.2012.6213301
J. Revaud, P. Weinzaepfel, Z. Harchaoui, C. Schmid, "EpicFlow: edge-preserving interpolation of correspondences for optical flow", Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015
J.-U. Garbas, B. Pesquet-Popescu, A. Kaup, "Methods and tools for wavelet-based scalable multiview video coding", IEEE Transactions on Circuits and Systems for Video Technology, vol. 21, no. 2, February 2011, pages 113-126, XP011348871, DOI: 10.1109/TCSVT.2011.2105552
M. Flierl, B. Girod, "Video coding with motion-compensated lifted wavelet transforms", Signal Processing: Image Communication, vol. 19, July 2004
Mathew, Reji et al., "Optimization of Optical Flow for Scalable Coding", Picture Coding Symposium (PCS), IEEE, 31 May 2015 - 3 June 2015, pages 70-74, XP033184710, DOI: 10.1109/PCS.2015.7170049 *
N. Mehrseresht, D. Taubman, "An efficient content-adaptive motion-compensated 3-D DWT with enhanced spatial and temporal scalability", IEEE Transactions on Image Processing, vol. 15, no. 3, March 2006, pages 1397-1412
R. Mathew, D. Taubman, "Scalable modeling of motion and boundary geometry with quad-tree node merging", IEEE Transactions on Circuits and Systems for Video Technology, vol. 21, no. 2, February 2011, pages 178-192
R. Mathew, D. Taubman, P. Zanuttigh, "Scalable coding of depth maps with R-D optimized embedding", IEEE Transactions on Image Processing, vol. 22, no. 5, May 2013, pages 1982-1995, XP011497799, DOI: 10.1109/TIP.2013.2240007
R. Mathew, S. Young, D. Taubman, "Optimization of optical flow for scalable coding", Picture Coding Symposium (PCS), May 2015, pages 70-74
R. Szeliski, H.-Y. Shum, "Motion estimation with quadtree splines", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 12, December 1996, pages 1199-1210
Rüfenacht, D. et al., "A Novel Motion Field Anchoring Paradigm for Highly Scalable Wavelet-Based Video Coding", IEEE Transactions on Image Processing, vol. 25, no. 1, 30 October 2015, pages 39-52, XP011590586, DOI: 10.1109/TIP.2015.2496332 *
Rüfenacht, D. et al., "Bidirectional Hierarchical Anchoring of Motion Fields for Scalable Video Coding", 2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP), 22 September 2014, XP032684405, DOI: 10.1109/MMSP.2014.6958816 *
S. Milani, G. Calvagno, "Segmentation-based motion compensation for enhanced video coding", IEEE International Conference on Image Processing, September 2011, pages 1685-1688
S. Young, D. Taubman, "Rate-distortion optimized optical flow estimation", IEEE International Conference on Image Processing, September 2015, pages 1677-1681, XP032826708, DOI: 10.1109/ICIP.2015.7351086
S.-G. Jeong, C. Lee, C.-S. Kim, "Motion-compensated frame interpolation based on multihypothesis motion estimation and texture optimization", IEEE Transactions on Image Processing, vol. 22, no. 11, November 2013, pages 4495-4509, XP011527287, DOI: 10.1109/TIP.2013.2274731
See also references of EP3516872A4
Y. Andreopoulos, A. Munteanu, J. Barbarien, M. van der Schaar, J. Cornelis, P. Schelkens, "In-band motion compensated temporal filtering", Signal Processing: Image Communication, vol. 19, no. 7, July 2004, pages 653-673

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114357757A (zh) * 2021-12-29 2022-04-15 Chinese Academy of Meteorological Sciences Meteorological data assimilation method, apparatus, device, readable storage medium and product
US12401823B2 (en) 2022-06-17 2025-08-26 Tencent America LLC Vertex prediction based on decoded neighbors
WO2024014195A1 (ja) * 2022-07-09 2024-01-18 KDDI Corporation Mesh decoding device, mesh encoding device, mesh decoding method, and program
JP2024008743A (ja) 2022-07-09 2024-01-19 KDDI Corporation Mesh decoding device, mesh encoding device, mesh decoding method, and program
JP7701898B2 (ja) 2022-07-09 2025-07-02 KDDI Corporation Mesh decoding device, mesh encoding device, mesh decoding method, and program

Also Published As

Publication number Publication date
AU2017331736B2 (en) 2022-10-27
US20200021824A1 (en) 2020-01-16
JP2019530386A (ja) 2019-10-17
EP3516872A1 (en) 2019-07-31
JP7279939B2 (ja) 2023-05-23
AU2017331736A1 (en) 2019-05-16
EP3516872A4 (en) 2020-04-15
US11122281B2 (en) 2021-09-14

Similar Documents

Publication Publication Date Title
US11122281B2 (en) Base anchored models and inference for the compression and upsampling of video and multiview imagery
US12457311B2 (en) Efficient multi-view coding using depth-map estimate for a dependent view
Merkle et al. The effects of multiview depth video compression on multiview rendering
JP5575908B2 (ja) 2dビデオデータの3dビデオデータへの変換のための深度マップ生成技法
CN100576934C (zh) 基于深度和遮挡信息的虚拟视点合成方法
US8351685B2 (en) Device and method for estimating depth map, and method for generating intermediate image and method for encoding multi-view video using the same
US9998761B2 (en) Apparatus for coding a bit stream representing a three-dimensional video
EP2428045B1 (en) Method for reconstructing depth image and decoder for reconstructing depth image
WO2009091563A1 (en) Depth-image-based rendering
JP6154643B2 (ja) 動画像符号化装置、動画像符号化装置のデプスイントラ予測方法およびプログラム、ならびに動画像復号装置、動画像復号装置のデプスイントラ予測方法およびプログラム
EP2061005A2 (en) Device and method for estimating depth map, and method for generating intermediate image and method for encoding multi-view video using the same
EP3373584B1 (en) Content adaptive and art directable scalable video coding
Rüefenacht et al. Base-anchored model for highly scalable and accessible compression of multiview imagery
Rüfenacht et al. A novel motion field anchoring paradigm for highly scalable wavelet-based video coding
Muller et al. Compressing time-varying visual content
Morvan et al. Multiview depth-image compression using an extended H. 264 encoder
Maugey et al. Multiview image coding using graph-based approach
Kamolrat et al. Adaptive motion-estimation-mode selection for depth video coding
Ruefenacht et al. wg1m78051-REQ-JPEG Pleno-UNSW HDCA Proposal Description (Updated)
Woo et al. Object-oriented hybrid segmentation using stereo images
Garcia et al. Depth-map super-resolution for asymmetric stereo images
Ozkalayci et al. Multi-view video coding via dense depth estimation
Li Discontinuity-Aware Base-Mesh Modeling of Depth for Scalable Multiview Image Synthesis and Compression
HK40012753A (en) Apparatus and method for encoding a multi-view signal into a multi-view data stream
HK40030833A (en) Efficient multi-view coding using depth-map estimate for a dependent view

Legal Events

Date Code Title Description
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17851993

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019536625

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2017851993

Country of ref document: EP

Effective date: 20190423

ENP Entry into the national phase

Ref document number: 2017331736

Country of ref document: AU

Date of ref document: 20170921

Kind code of ref document: A