EP1790169A1 - Verfahren zur bewegungsschätzung mithilfe von verformbaren netzen - Google Patents

Verfahren zur bewegungsschätzung mithilfe von verformbaren netzen

Info

Publication number
EP1790169A1
Authority
EP
European Patent Office
Prior art keywords
mesh
nodes
meshes
sub
discontinuity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
EP05805589A
Other languages
English (en)
French (fr)
Inventor
Nathalie Cammas
Stéphane PATEUX
Nathalie Laurent-Chatenet
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orange SA
Original Assignee
France Telecom SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by France Telecom SA filed Critical France Telecom SA
Publication of EP1790169A1 publication Critical patent/EP1790169A1/de
Ceased legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/215Motion-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/53Multi-resolution motion estimation; Hierarchical motion estimation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/537Motion estimation other than block-based
    • H04N19/54Motion estimation other than block-based using feature points or meshes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/553Motion estimation dealing with occlusions

Definitions

  • the present invention relates to digital processing of moving images, and more particularly to motion estimation techniques between successive images of a sequence.
  • the motion is represented by means of a set of values defined on the nodes of a mesh positioned on an image.
  • An interpolation technique is used to deduce, from the values stored at the nodes of this mesh, a motion vector at any point of the image. Typically, a Lagrange-type interpolation can be used, that is to say the motion vector assigned to a point of the image is an affine function of the vectors calculated for the neighboring nodes.
  • the meshes can also be used to decorrelate the motion and texture information of a video sequence in order to perform an analysis-synthesis coding scheme.
  • Deformable meshes define a continuous representation of a motion field, while the actual motion of a video sequence is usually of a discontinuous nature. Thus, when different planes and objects overlap in a scene, occlusion and disocclusion areas appear, generating discontinuity lines.
  • The challenge is to eliminate this visual degradation, and the resulting limitation in terms of analysis, by determining the discontinuity zones.
  • a post-processing technique can be implemented to solve this problem.
  • One of these techniques proceeds by a posteriori correction, and consists of applying the motion vectors as calculated, detecting those in default, and then correcting their value.
  • Another of these techniques proceeds iteratively, adding a portion of the expected displacement to the nodes at each iteration so that there is no reversal, and continuing the iterations until the process converges.
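The iterative technique just described can be sketched as follows (a minimal illustration, not the patent's exact procedure: a single node is pulled toward its target, and at each iteration only the portion of the remaining displacement that causes no triangle reversal is applied):

```python
import numpy as np

def signed_area2(a, b, c):
    # twice the signed area of triangle (a, b, c)
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])

def constrained_move(p, target, opposite_edges, n_iter=20):
    """Move node p toward target, adding at each iteration only the fraction
    of the remaining displacement that keeps every incident triangle
    positively oriented. opposite_edges: fixed edges (a, b) of the incident
    triangles (a, b, p)."""
    p = np.asarray(p, dtype=float)
    target = np.asarray(target, dtype=float)
    for _ in range(n_iter):
        step = target - p          # remaining displacement toward the target
        frac = 1.0
        while frac > 1e-6:
            cand = p + frac * step
            if all(signed_area2(a, b, cand) > 0 for a, b in opposite_edges):
                p = cand           # this portion causes no reversal
                break
            frac *= 0.5            # otherwise try a smaller portion
    return p

# node at (1, 1) pulled below the fixed edge (0,0)-(2,0): it converges
# toward the edge without ever crossing it (no reversal)
p = constrained_move((1.0, 1.0), (1.0, -1.0), [((0.0, 0.0), (2.0, 0.0))])
```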
  • The post-processing techniques act once the motion estimation has been performed. The result is therefore suboptimal, because the motion vectors are corrected independently of their contribution to the minimization of the prediction error.
  • An improvement consists in optimizing the motion field by taking into account non-inversion constraints during the optimization process. To do this, the motion estimation is adapted by adding to the quadratic prediction error an augmented Lagrangian that corrects the deformation of the meshes as their area approaches zero. This last technique makes it possible to determine the optimal solution, but only provided that it represents a continuous field. However, the nature of a video sequence is most often discontinuous.
  • Another technique consists in identifying the discontinuity zones in order to restore them, by monitoring the appearance or disappearance of objects.
  • A first estimation of movement is made between two successive instants t1 and t2, without preventing mesh reversals.
  • the discontinuity zones are detected.
  • The process then consists in making a new motion estimate between t1 and t2, excluding from the optimization criterion the contributions of the faulty zones containing at least one reversal, in order to minimize the prediction error between the two images considered.
  • This reoptimization makes it possible to determine the optimal motion vectors for the continuous zone (admitting a bijection between t 1 and t 2 ) and thus to avoid the disturbance of the values of the motion vectors obtained in the previous optimization, generated by the zones of discontinuities.
  • the faulty areas are subject to a frequency or spatial approximation in image compression, and they are excluded from the video object tracking optimization method.
  • The various known techniques strive to make a discontinuous motion field continuous, by imposing in the discontinuous areas a movement calculated from the continuous areas. This results in a false movement and a poor temporal prediction of the texture in the discontinuous zones, and therefore an additional coding cost.
  • An object of the invention is to estimate the motion of a video sequence more accurately, in particular in the presence of such discontinuities.
  • the invention thus proposes a motion estimation method in a sequence of animated digital images, comprising the following steps:
  • estimating a first displacement field in a group of images including the reference image, by assigning to each point of an image a displacement value calculated from values assigned to the nodes delimiting the mesh of the first mesh to which that point belongs;
  • detecting at least one discontinuity zone, each discontinuity zone including at least one mesh fulfilling a mesh deformation criterion in the group of images;
  • defining a second mesh to be applied to the reference image, comprising a regular part composed of meshes of the first mesh that do not belong to any discontinuity zone and, for at least one detected discontinuity zone, at least two sub-meshes overlapping in a region including the break line determined in said discontinuity zone. Each of the two sub-meshes includes respective meshes delimited by nodes comprising nodes shared with the regular part, located at the edge of the discontinuity zone, and additional nodes not belonging to the regular part, the break line being located between the respective nodes of the two sub-meshes shared with the regular part. The method further comprises estimating a second displacement field in the group of images, assigning to each point in a detected discontinuity zone a displacement value calculated from values assigned to the nodes delimiting a selected mesh of the second mesh to which said point belongs, the selected mesh depending on the position of said point with respect to the break line determined in said discontinuity zone.
  • The method performs a global optimization to determine motion. No a priori constraints are imposed on the criterion to be optimized, and the discontinuity zones frequently present in animated images need not be excluded from the calculation.
  • The motion estimation performed can therefore be optimal, even in the discontinuity zones, provided that the break lines are reliably identified.
  • The estimated movement can then be used by a video coder. In this context, it allows a good prediction of the images of the sequence even in the discontinuity areas of the mesh, and reduces the coding cost of the video sequence. Parameters representing the estimated motion are then transmitted to a decoder, or stored in memory for later decoding.
  • The motion estimation method is compatible with the use of hierarchical meshes, with which the displacement field estimates are made from the coarsest hierarchical level (1) to the finest hierarchical level (nivFin) of the meshes.
  • the discontinuity zone is preferably detected as a connected set of meshes of the finest hierarchical level fulfilling the mesh deformation criterion. It is then defined at the higher hierarchical levels as being composed of at least one mesh including at least one mesh respectively of the finest hierarchical level fulfilling the criterion of mesh deformation.
  • The two sub-meshes of the second mesh are generated starting at level nivFin, and the meshes of the higher levels are then generated during a progressive ascent of the hierarchy.
  • Raising from a hierarchical level n to the immediately higher hierarchical level n-1 includes the following steps for each of the sub-meshes and for 1 < n ≤ nivFin: /a/ integrating each mesh of said sub-mesh previously defined at level n into a new mesh of said sub-mesh generated at level n-1;
  • /b/ setting n1 = n;
  • /c/ if said new mesh of level n-1 cannot be completed with meshes of said sub-mesh already generated at level n1, generating at level n1 at least one new mesh of said sub-mesh to complete said new mesh of level n-1; and /d/ if n1 < nivFin, increasing n1 by one unit and repeating from step /c/.
  • respective depth values are assigned to the nodes of the regular part and to the additional nodes of each sub-mesh of the second mesh.
  • the value assigned to the additional nodes of a sub-mesh generated for a detected discontinuity zone is a function of the position of said sub-mesh with respect to the breaking line determined in said zone.
  • The step of estimating the second displacement field comprises, for each point of an image belonging both to a mesh of the regular part of the second mesh and to at least one mesh of a sub-mesh, calculating for each mesh including this point a weighted sum of the depth values respectively assigned to the nodes delimiting said mesh, and selecting, for the assignment of a displacement value to said point, the mesh for which the calculated weighted sum is maximum.
  • Other aspects of the invention relate to a motion estimation device in an animated digital image sequence, comprising means adapted to the implementation of a method as defined above, as well as to a computer program to be installed in a moving image processing apparatus, including instructions for implementing the steps of a motion estimation method as defined above during execution of the program by a computing unit of said apparatus.
  • The invention also proposes a video encoder, comprising motion estimation means in a sequence of animated digital images and means for constructing an output stream including motion parameters produced by the motion estimation means, wherein the motion estimation means are arranged to operate according to a method as defined above.
  • Still another aspect of the invention relates to a signal representative of a sequence of animated digital images, comprising a representation of motion parameters obtained by executing a method as defined above, as well as to a recording medium on which such a signal is recorded.
  • Motion parameters include, for a group of images including a reference image:
  • First motion parameters indicating, in a first mesh to be applied to the reference image, meshes of which at least one discontinuity zone is composed in the group of images;
  • Second motion parameters for positioning at least one break line in each discontinuity zone.
  • the motion parameters represented in the signal can be supplemented by parameters indicating depth values respectively assigned to the nodes of the regular part and to the additional nodes of each sub-mesh of the second mesh.
  • The invention also applies on the motion decoding side, realized in a video decoder or other moving image processing apparatus.
  • The invention thus proposes a motion decoding method in a sequence of animated digital images, using image meshes comprising meshes delimited by nodes. This method comprises the following steps:
  • a motion decoding device in a sequence of animated digital images, comprising means adapted to the implementation of a motion decoding method as defined above, as well as a computer program to be installed in a moving image processing apparatus, comprising instructions for implementing the steps of a motion decoding method as defined above during execution of the program by a computing unit of said apparatus.
  • the invention also proposes a video decoder, comprising motion decoding means and synthesis means for constructing a sequence of animated digital images by taking into account a displacement field generated by the motion decoding means, which are arranged to operate according to a motion decoding method as defined above.
  • FIG. 1 is a diagram illustrating the hierarchical mesh of an image
  • FIG. 2 is a diagram illustrating the phenomenon of mesh reversal
  • FIG. 3 is a flowchart of a motion estimation method according to the invention.
  • FIGS. 4 to 7 are diagrams illustrating a remeshing process used in one embodiment of the invention
  • FIG. 8 is a diagram illustrating the definition of a discontinuity zone in higher levels of a hierarchical mesh once it has been determined at the finest level
  • FIGS. 9a-d, 10a-d, 11a-d and 12a-c are diagrams illustrating the generation of the mesh in higher levels of a hierarchical mesh in one embodiment of the invention.
  • FIGS. 13 and 14 are simplified block diagrams of a video encoder and a video decoder according to the invention.
  • The values I(x, y, t) associated with the pixels are typically luminance values.
  • The calculation is performed on an estimation support Ω. It consists in determining the displacement field D(x, y, t) which minimizes a functional E(t) of the form:
  • E(t) = Σ_{(x,y)∈Ω} ρ( I(x − d_x, y − d_y, t−1), I(x, y, t) )    (1)
  • where (d_x, d_y) = D(x, y, t) and ρ is a distance measure, typically the quadratic error.
  • The displacement vector at any point of the image is interpolated from the nodal values:
  • D(x, y, t) = Σ_i w_i(x, y, t) · D(x_i[t], y_i[t], t)    (2)
  • where the weights w_i(x, y, t) represent barycentric coordinates of the point (x, y) expressed with respect to the positions of the nodes i in the image at time t.
  • A convenient mesh is the triangular mesh, in which each point (x, y) is considered to belong to a triangle whose vertices are nodes i, j, k of respective coordinates (x_i[t], y_i[t]), (x_j[t], y_j[t]), (x_k[t], y_k[t]). The quantity
  • Δ_ijk[t] = x_j[t]·y_k[t] − x_k[t]·y_j[t] + x_k[t]·y_i[t] − x_i[t]·y_k[t] + x_i[t]·y_j[t] − x_j[t]·y_i[t]
  • is a vector product associated with the triangle at time t (twice its signed area).
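For illustration, the barycentric weights over a triangle and the resulting interpolation of a motion vector can be sketched as follows (an illustrative sketch; the function names and the sample triangle are assumptions, not part of the patent text):

```python
def barycentric_weights(p, tri):
    """Barycentric coordinates of point p inside triangle tri = (i, j, k),
    computed as ratios of signed-area vector products."""
    (xi, yi), (xj, yj), (xk, yk) = tri
    delta = (xj - xi) * (yk - yi) - (xk - xi) * (yj - yi)  # vector product of the triangle
    wi = ((xj - p[0]) * (yk - p[1]) - (xk - p[0]) * (yj - p[1])) / delta
    wj = ((xk - p[0]) * (yi - p[1]) - (xi - p[0]) * (yk - p[1])) / delta
    return wi, wj, 1.0 - wi - wj

def interpolate_motion(p, tri, nodal_vectors):
    # affine interpolation: D(p) = sum_i w_i(p) * D_i
    w = barycentric_weights(p, tri)
    return tuple(sum(wi * v[c] for wi, v in zip(w, nodal_vectors)) for c in (0, 1))

tri = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
D_nodes = [(1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
d = interpolate_motion((1.0, 1.0), tri, D_nodes)  # centroid: average of the three vectors
```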
  • the calculation is conducted on a group of consecutive images of the sequence, typically of the order of a dozen images.
  • The displacement vectors D(x_i[1], y_i[1], 1) are estimated by minimizing the functional E(1), for example by applying a Gauss-Seidel gradient descent method or the like.
  • The node positions are then updated as (x_i[1], y_i[1]) = (x_i[0], y_i[0]) + D(x_i[1], y_i[1], 1).
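As a toy illustration of this minimization, the sketch below recovers a single displacement vector over a small support by finite-difference gradient descent with a quadratic ρ (the synthetic image, the support, and the step size are assumptions; the finite-difference scheme stands in for the Gauss-Seidel method of the text):

```python
import numpy as np

def image(x, y, t):
    # synthetic scene: a Gaussian blob translating with velocity (1.5, -0.8)
    cx, cy = 8.0 + 1.5 * t, 8.0 - 0.8 * t
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / 10.0)

# estimation support: a small grid of sample points
xs, ys = np.meshgrid(np.linspace(5, 11, 7), np.linspace(5, 11, 7))

def E(d):
    # quadratic rho: sum of squared prediction errors, as in functional (1)
    return np.sum((image(xs - d[0], ys - d[1], 0) - image(xs, ys, 1)) ** 2)

# gradient descent on the displacement, gradient by forward differences
d = np.zeros(2)
eps, step = 1e-4, 0.05
for _ in range(800):
    g = np.array([(E(d + eps * np.eye(2)[i]) - E(d)) / eps for i in range(2)])
    d -= step * g
# d now approximates the true displacement (1.5, -0.8)
```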
  • the estimation of the movement is advantageously carried out using a hierarchical mesh, which, in a manner known per se, ensures a better convergence of the system.
  • a certain fineness of mesh is necessary to represent faithfully the movement within the image. But in case of strong movement, the previous minimization technique may not converge if it is applied directly to a fine mesh. In addition, the use of very fine meshes can cause instability of the system due to too many parameters.
  • Figure 1 shows an example of hierarchical mesh.
  • the hierarchical representation consists of several levels of representation.
  • the lowest level 30 (level 0 in the figure) has a coarse field, with only three nodes to define the mesh. Going towards the finer levels 32, 33, 35, the field becomes denser and the number of nodes of the mesh grows.
  • the quality of movement varies with the levels, the low level 30 representing the dominant movement of the scene, and the fine levels refining the dominant movement to represent the local movements.
  • the number of levels of the hierarchical mesh is an adjustable parameter of the estimation phase, it can vary according to the sequence to be estimated.
  • In the hierarchical mesh motion estimation technique, several levels of hierarchical mesh are generated on the images. The movement is first estimated on the coarsest level 30; the next level is then processed by starting the gradient descent from displacement values at the nodes deduced from those estimated at the previous level: the nodes common to the two levels receive initial displacement vectors equal to those just estimated, and the nodes added at the finer level receive initial displacement vectors calculated by spatial interpolation. At the end of the iterations, the displacement vectors estimated at the finest level are quantized to be transmitted to the decoder.
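The level-to-level initialization just described can be sketched as follows (an illustrative sketch; the midpoint-subdivision scheme and the node labels are assumptions):

```python
def refine_level(node_pos, node_disp, edges):
    """Initialize a finer hierarchical level: nodes common to both levels
    keep the displacement vectors just estimated; each new node inserted at
    an edge midpoint gets a displacement obtained by spatial interpolation
    (here, the average of the edge's two endpoints)."""
    pos = dict(node_pos)    # common nodes: positions kept as-is
    disp = dict(node_disp)  # common nodes: estimated vectors kept as-is
    for i, j in edges:
        mid = ("mid", i, j)
        pos[mid] = tuple((a + b) / 2.0 for a, b in zip(node_pos[i], node_pos[j]))
        disp[mid] = tuple((a + b) / 2.0 for a, b in zip(node_disp[i], node_disp[j]))
    return pos, disp

coarse_pos = {0: (0.0, 0.0), 1: (2.0, 0.0), 2: (0.0, 2.0)}
coarse_disp = {0: (1.0, 0.0), 1: (0.0, 0.0), 2: (0.0, 2.0)}
pos, disp = refine_level(coarse_pos, coarse_disp, [(0, 1), (1, 2), (0, 2)])
```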
  • the hierarchical mesh motion estimation technique is combinable with a multi-resolution estimation technique, in which we work on a pyramid of filtered and decimated images constructed from the original images. Motion estimation in a hierarchical mesh level is then performed based on sampled images at a suitable resolution level.
  • A general problem of mesh-based motion estimation techniques is that of mesh reversals. This problem is illustrated in FIG. 2, where the mesh of an image is shown at two successive instants with, on the left part of the figure, an example of displacement vectors estimated between these two instants at the nodes i, j, k forming the vertices of an elementary triangle of the mesh.
  • the reversal of this triangle results from the fact that the node k crosses the line passing through the nodes i and j.
  • The reversal of a triangle i, j, k corresponds to a change of sign of the vector product Δ_ijk[t].
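A reversal test can thus be sketched as a sign check on the triangle's vector product, i.e. twice its signed area (illustrative code; the names are assumptions):

```python
def delta(tri):
    # vector product of the triangle: twice its signed area
    (xi, yi), (xj, yj), (xk, yk) = tri
    return (xj - xi) * (yk - yi) - (xk - xi) * (yj - yi)

def is_reversed(tri_before, tri_after):
    # a mesh reversal flips the orientation, i.e. the sign of the product
    return delta(tri_before) * delta(tri_after) < 0

before = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0)]
after = [(0.0, 0.0), (2.0, 0.0), (1.0, -1.0)]  # node k crossed the edge (i, j)
```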
  • Such artifacts greatly disrupt motion estimation. They are usually due to relative movements of objects in different shots of the filmed scene.
  • The illustration in Figure 2 is very simplified because only one triangle turns over (passing through a triangle of zero area). In practice, the reversals most often occur over discontinuity areas having a certain extent in the image.
  • With a hierarchical mesh, mesh reversals naturally have a greater probability of occurring in the fine levels than in the coarse levels.
  • The invention uses a tracking of the discontinuity zones and of the break lines that they contain.
  • A remeshing of the image is done in the discontinuity zones, using multiple sub-meshes anchored to the initial mesh on both sides of the break lines.
  • The multiple sub-meshes generated in a discontinuity zone extend beyond the break line, around which they overlap each other. They may even overflow outside the discontinuity zone.
  • In an interpolation formula such as (2), reference is made to the nodes of one of the sub-meshes, selected depending on the position of the point with respect to the break line or lines.
  • the sub-meshes make it possible to account for different planes present in the sequence of images, their use depending on the objects which appear or disappear in the scene.
  • the invention makes it possible to manage the zones of discontinuity of movement without putting them in default or rejecting them at the time of the coding.
  • the principle is to cut the mesh locally where the discontinuity is created, and transform the mesh into a so-called "non-manifold" mesh.
  • a non-manifold mesh is a mesh whose edges can be shared by more than two meshes. It allows to estimate the motion in the video sequence and to model a discontinuous motion field.
  • One advantage is that it is thus possible to take into account the discontinuity zones at the time of coding in the same way as the continuous zones.
  • Figure 3 shows a flowchart of a motion estimation method according to the invention.
  • the first step 9 consists in defining the initial mesh on an image of a video sequence to be encoded. Then in step 10, we make a first estimation of the motion field in a group of T consecutive images. This estimation is performed conventionally using a preferably hierarchical mesh, for example according to the method explained above. During this calculation, some triangular meshes can turn over or deform too strongly.
  • the method comprises a step 11 of detecting the discontinuity zones in the initial mesh.
  • the discontinuity zones each consist of a connected set of degenerate meshes defined at the finest hierarchical level.
  • The detection can be generalized to include in a discontinuity zone a triangular mesh i, j, k whose area (equal to half of the absolute value of the vector product Δ_ijk[t]) becomes close to zero, that is to say less than a predefined threshold, for at least one instant t.
  • the detection of degenerate triangles, to be included in a discontinuity zone may more generally comprise a study of the deformation of the triangles between the image at time 0 and the image at time T. If the deformation of a mesh exceeds a certain threshold, this mesh is considered degenerate.
  • A connected set of degenerate meshes forms a discontinuity zone. This is the area where a discontinuity appeared in the movement. It is defined at the finest hierarchical level, and the triangular meshes that compose it (or the nodes that border it) are part of the parameters that will be transmitted to the decoder.
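Grouping degenerate meshes into connected discontinuity zones can be sketched as follows (illustrative code; the deformation test is assumed to have been applied already, and edge-sharing is used as the connectivity relation):

```python
from collections import defaultdict

def discontinuity_zones(triangles, degenerate):
    """Group degenerate triangles (given by index) into connected zones;
    two triangles are connected when they share an edge."""
    # map each undirected edge to the degenerate triangles that use it
    edge_tris = defaultdict(list)
    for t in degenerate:
        i, j, k = triangles[t]
        for e in ((i, j), (j, k), (i, k)):
            edge_tris[tuple(sorted(e))].append(t)
    # union-find over the degenerate triangles
    parent = {t: t for t in degenerate}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for tris in edge_tris.values():
        for t in tris[1:]:
            parent[find(t)] = find(tris[0])
    zones = defaultdict(set)
    for t in degenerate:
        zones[find(t)].add(t)
    return list(zones.values())

tris = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (5, 6, 7)]
zones = discontinuity_zones(tris, {0, 1, 3})  # triangles 0 and 1 share edge (1, 2)
```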
  • the contour of the discontinuity zone can also be represented by splines.
  • If no discontinuity zone is detected in step 11 (test 12), the motion estimation method ends at step 20, where the motion parameters are output to be quantized and transmitted to the video decoder.
  • these parameters are those obtained in step 10, to which is added an indicator signaling that no discontinuity zone has been detected (continuous movement).
  • a break line determination is first made in each detected discontinuity area (step 13).
  • A break line is positioned on the edge of an object that caused a discontinuity in the area. In the following, the case of a single break line per discontinuity zone is considered; the process generalizes to several break lines within the same zone.
  • the outline of the object is oriented to define an inner region (foreground region) and an outer region (background region).
  • Several methods, known per se, are applicable to find this contour in step 13. If segmentation masks of the images of the sequence are already available, the contour is extracted from these masks. However, for most sequences, segmentation masks are not available.
  • The image can be pre-segmented by a "mean shift" technique such as that described by Dorin Comaniciu and Peter Meer in "Mean Shift: A Robust Approach Toward Feature Space Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 5, May 2002, pp. 603-619. Then, a succession of morphological dilations and erosions makes it possible to eliminate the small segmented regions. The outline of the object is finally extracted from the segmented images.
  • the salient point detection technique can also be applied in step 13.
  • The salient points are essentially positioned on the contours of the objects. Since the list of salient points does not define a complete contour, a contour refinement step based on a chaining of these points must be added.
  • The salient points of an image Im correspond to the pixels of Im belonging to high-frequency regions. To detect them, wavelet theory can be used.
  • the wavelet transform is a multi-resolution representation of the image that allows it to be expressed at different resolutions 1/2, 1/4, etc.
  • A tree of wavelet coefficients is built, based on the so-called "Zerotree" approach, known in the field of image coding. It makes it possible to set up a saliency map of size 2^(k+r) × 2^(l+r) reflecting the importance of each wavelet coefficient at resolution 2^r.
  • a coefficient having a high saliency corresponds to a region of Im having high frequencies.
  • A wavelet coefficient of large modulus at resolution 2^r corresponds to a contour of the image A_(r+1)·Im in a particular direction (horizontal, vertical or diagonal).
  • Each wavelet coefficient at resolution 2^r corresponds to a spatial area of size 2^r × 2^r in the image Im; from the constructed saliency map, the most representative pixel of this area is chosen among these 2^r × 2^r pixels of Im.
  • In step 13, the break lines are determined in each of the images of the group of images. The positions of these lines will be part of the movement parameters to be communicated to the decoder.
  • In step 14 of the method, the discontinuity areas that were detected in step 11 are subject to a non-manifold remeshing. This remeshing is done first at the finest hierarchical level.
  • Figure 4 shows an example of discontinuity zone Z, here composed of eight adjacent meshes of the initial triangular mesh. This mesh consists of equilateral triangles when it is defined on the first image of the group. Figure 4 also shows a break line L which was determined in zone Z at step 13.
  • the new mesh adopted in step 14 includes a regular part composed of the triangles of the initial mesh which do not belong to any zone of discontinuity.
  • For each zone of discontinuity Z containing a break line L, two sub-meshes are generated, attached to the regular part along the edges of the zone of discontinuity Z.
  • Each of these two sub-meshes is assigned to one side of the line L, and it includes the nodes of the initial mesh located on this side along the edges of the zone of discontinuity Z.
  • the triangles in broken lines in FIGS. 5 and 6 thus represent, respectively, two sub-meshes that can be generated in the discontinuity zone Z of FIG. 4.
  • The nodes of the initial mesh denoted a, b, c, d, e, f in Fig. 4 belong to the "left" sub-mesh (i.e., attached to the initial mesh on the left side of the break line L, the left and right sides being defined relative to the orientation determined for the break line L) shown in FIG. 5, and the nodes of the initial mesh denoted a, f, g, h, i, j in FIG. 4 belong to the "right" sub-mesh shown in FIG. 6.
  • Some of the nodes of the initial mesh that border the zone of discontinuity are common to the two sub-meshes (here the nodes a and f).
  • The left sub-mesh has eight new nodes a′ to h′ and sixteen new triangles formed by these new nodes together with the border nodes a to f (for example (a, a′, h′), (a, b, a′), (b, b′, a′), and so on around the zone).
  • The right sub-mesh likewise has eight new nodes a″ to h″ and sixteen new triangles formed by these new nodes together with the border nodes a, f, g, h, i, j (for example (a, h″, a″), (j, a, a″), (i, j, a″), and so on).
  • the additional nodes generated in the new sub-meshes may have positions in the first image merged with those of nodes of the initial mesh. They have been shown offset in Figures 5 and 6 to facilitate the reading of the drawing.
  • The nodes of the initial mesh shared with the regular part are border nodes, which can move only with the initial mesh.
  • These border nodes can be of three types:
  • - left border nodes serving as a basis for the only left sub-mesh; these are the nodes b, c, d and e in figures 4-6;
  • When the break line L passes through a triangle having at least one border node as a vertex, this node is identified as a left or right border node depending on its position relative to the oriented line.
  • The nodes on the edge traversed by the line L can be identified as left and right border nodes, and the third node as a shared border node (as in Figures 4-6).
  • Another possibility is to extend the break line by extrapolation until it encounters an edge of the triangle, and to identify the nodes on that edge as shared border nodes and the third node as a left or right border node according to its position relative to the oriented line.
  • The new meshes extend beyond the zone of discontinuity Z, as shown in FIGS. 5 and 6. Meshes of the regular part and meshes of the sub-meshes then overlap.
  • The adaptation to non-manifold meshes used here is done by assigning to the new nodes of each sub-mesh a depth value z, positive or negative, associated with this sub-mesh.
  • a value z> 0 generally corresponds to an object in the foreground, and a value z ⁇ 0 to an object in the background.
  • The sign of z is given by the orientation of the break line L.
  • The object in the foreground, whose contour corresponds to the break line, is positioned relative to the orientation of the line L (for example to the right of the line when moving in the direction of its orientation).
  • the hatched portion in FIG. 7 belongs to the object whose break line L constitutes the contour.
  • This value of z at the nodes makes it possible to calculate a value of z at each point of a mesh by an interpolation technique (affine, for example).
  • When a point belongs to several overlapping meshes, a value z is calculated at this point for these different meshes, and these values are compared in order to retain the mesh giving the largest value of z. This helps to favor objects in the foreground relative to the background.
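This selection rule can be sketched as follows (illustrative names; the affine interpolation of z reuses the same barycentric scheme as the motion interpolation):

```python
def interp_z(p, tri, z_nodes):
    # affine (barycentric) interpolation of the nodal depth values at point p
    (xi, yi), (xj, yj), (xk, yk) = tri
    delta = (xj - xi) * (yk - yi) - (xk - xi) * (yj - yi)
    wi = ((xj - p[0]) * (yk - p[1]) - (xk - p[0]) * (yj - p[1])) / delta
    wj = ((xk - p[0]) * (yi - p[1]) - (xi - p[0]) * (yk - p[1])) / delta
    wk = 1.0 - wi - wj
    return wi * z_nodes[0] + wj * z_nodes[1] + wk * z_nodes[2]

def select_mesh(p, candidates):
    """candidates: list of (mesh_id, tri, z_nodes) whose mesh contains p.
    Keep the mesh with the largest interpolated z (the foreground wins)."""
    return max(candidates, key=lambda c: interp_z(p, c[1], c[2]))[0]

tri = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
chosen = select_mesh((1.0, 1.0), [("background", tri, (-1.0, -1.0, -1.0)),
                                  ("foreground", tri, (1.0, 1.0, 1.0))])
```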
  • When several break lines appear in a discontinuity area, there are more than two planes in the corresponding portion of the image sequence.
  • the detection of the break lines makes it possible to position the different planes, which are assigned differentiated z values.
  • the preceding method then makes it possible to select the relevant mesh for the reconstruction of each point of the image.
  • the z values at the nodes are positioned so as to best reconstruct the image for which the meshes were introduced.
  • the positioning can be done using an Iterated Conditional Modes (ICM) algorithm, seeking to minimize the mean squared error between the initial image and the reconstructed image.
  • the z values determined for the corresponding sub-meshes are part of the motion parameters to be transmitted to the decoder.
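The ICM positioning of the node z-values mentioned above can be sketched as a coordinate-wise descent: each node is visited in turn and its z is set to the candidate value that minimizes a global error, until no node changes. This is a hedged illustration with an abstract `error` function standing in for the mean squared error between the initial and reconstructed images; all names are assumptions.

```python
# Illustrative Iterated Conditional Modes (ICM) sketch for node z-values.
def icm(z_values, candidates, error, max_iters=20):
    """Greedily update each node's z to the candidate minimising error(z)."""
    z = list(z_values)
    for _ in range(max_iters):
        changed = False
        for i in range(len(z)):
            best = min(candidates, key=lambda c: error(z[:i] + [c] + z[i + 1:]))
            if best != z[i]:
                z[i] = best
                changed = True
        if not changed:
            break  # local minimum reached: no single-node change improves the error
    return z
```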
  • the discontinuity represented by a break line L is propagated up through the higher levels until it disappears at some level.
  • the discontinuity zone defined at that level is remeshed, taking into account the remeshing of the lower level so as to preserve the hierarchy of the mesh.
  • the propagation of the discontinuity up the hierarchy comprises two steps: determining the discontinuity zone for each level, and determining the constraints imposed on the nodes at the edge of the zone.
  • let nivFin be the finest level of the mesh at which the remeshing was originally performed. If, for a level n less than or equal to nivFin, the discontinuity zone has been detected, the discontinuity zone at level n-1 is defined by the set of parent meshes of the meshes of the occlusion zone of level n, as shown in Figure 8.
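The zone propagation just described reduces to taking the set of parents at each step; repeating it climbs the hierarchy from nivFin upward. A hedged sketch, where the mesh ids and the `parents_by_level` mapping are illustrative assumptions:

```python
# Illustrative sketch: the discontinuity zone at level n-1 is the set of parent
# meshes of the zone meshes at level n.
def propagate_zone(zone_finest, parents_by_level):
    """parents_by_level[k] maps each mesh id of one level to its parent at the
    level above. Returns the zones from the finest level (nivFin) to the coarsest."""
    zones = [set(zone_finest)]
    for parent_of in parents_by_level:
        zones.append({m for m in (parent_of[mesh] for mesh in zones[-1])})
    return zones
```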
  • the constraint on the border nodes shared by the two sub-meshes is propagated up the hierarchy, for example according to the following algorithm, applied to a node m constituting a shared border node at level n.
  • the creation of a new mesh for a hierarchy level n-1 lower than a level n that has already been remeshed can be of three types:
  • node A' is created at level n-1.
  • the new mesh A'BC of level n-1 takes as child meshes the meshes A'E'D', E'CF, D'FB and E'D'F of level n.
  • This case is similar to case 1/, except that node C becomes a shared border node.
  • the border nodes are C and B, for example.
  • node A' is created.
  • the new mesh A'BC of level n-1 has as child meshes the meshes A'E'D', E'CF, D'FB and E'D'F of level n, of which the mesh A'E'D' was added during the remeshing at level n.
  • C and A are border nodes, and node B' is created.
  • the new mesh AB'C of level n-1 has as child meshes AED, ECF', EF'D and DF'B' at level n. / Figures 11a-d: the break line L does not completely cross the mesh ABC of level n-1.
  • at level n it is artificially extended, either to the node C opposite the edge EF through which it enters the mesh, which brings us back to the case of Figures 10a-d, or up to an edge (EF in Figure 11a) opposite the incoming edge.
  • E and F are then shared border nodes at level n.
  • the break line L is extended towards a node or an edge (a case similar to the one just seen for the finer level n).
  • the contour has been extended to the node C.
  • C and B are border nodes, and node A' is created.
  • the mesh A'BC of level n-1 has as child meshes A'ED', ED'F, EFC and D'FB at level n. The mesh A'ED' belongs to level n even though it was not generated during the remeshing at level n, but it is needed at the upper level n-1.
  • the border nodes are C and A, and node B' is created.
  • the mesh ACB' of level n-1 has as child meshes AED, ECF, EDF and DFB' at level n. Note that in this case, the mesh ECF of level n is shared by the meshes A'BC and ACB'.
  • Figures 12a-c show the case of the disappearance of the break line.
  • the break line has been extended to nodes B and C, which became shared border nodes.
  • the remeshing introduced the nodes E' and D' for the right side, and the node F' for the upper side.
  • the contour is entirely contained in the mesh ABC.
  • node A' is introduced to form the mesh A'BC.
  • no node is introduced, the remeshing producing the initial mesh ABC.
  • the mesh A'BC is constrained to move with the initial mesh ABC, so that the points A and A' at level n-1 are virtually identical.
  • A' exists and is defined by its barycentric coordinates in ABC at level n-1.
  • the remeshing process is compatible with a geometric multigrid approach, sometimes used for motion estimation, in order to obtain a weighting between nodes of successive hierarchical levels that takes into account the deformation of the lower mesh.
  • the weighting of the nodes can be done as follows: (i) if the fine node is a direct child of a coarse node, the weight is 1; (ii) if the fine node is derived from several coarse nodes, the weight is the average of the barycentric weights of the fine node relative to the coarse nodes.
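The two weighting rules above can be sketched as follows. This is a hedged illustration under assumed data structures (a `direct_parent` map and a table of per-coarse-node barycentric weights); the patent does not specify such an interface.

```python
# Illustrative sketch of the fine-to-coarse node weighting:
# rule (i): direct child of a coarse node -> weight 1;
# rule (ii): otherwise, average of the fine node's barycentric weights
# relative to that coarse node (one weight per coarse mesh containing it).
def coarse_weight(fine_node, coarse_node, direct_parent, bary_weights):
    if direct_parent.get(fine_node) == coarse_node:
        return 1.0
    ws = bary_weights.get(fine_node, {}).get(coarse_node, [])
    return sum(ws) / len(ws) if ws else 0.0
```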
  • the motion is reestimated on the group of images in step 15.
  • This re-estimation can be performed as in step 10, for example using formulas (1) to (6) above, with a precaution for the pixels likely to be reconstructed by several triangles of the new mesh. This ambiguity exists because some triangular meshes of the new mesh overlap.
  • a visibility mask is defined at each time t.
  • this mask corresponds to the hatched portion in FIG. 7. It consists of the points which, at time t, are inside the discontinuity zone (that is to say, do not belong to any mesh of the initial mesh retained in the new mesh) and are, for example, to the right of the oriented break line L determined for this time t.
  • the points inside the discontinuity zone are likely to be reconstructed either by a triangle of the right sub-mesh or by a triangle of the left sub-mesh.
  • the triangle i, j, k retained for the application of formulas (3)-(5) at such a point is that of the right sub-mesh if the point belongs to the mask, and that of the left sub-mesh otherwise.
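The mask membership test and the resulting triangle choice can be sketched as follows. This is a hedged illustration: the sign convention (negative cross product = right of the oriented line, y-up coordinates) and all names are assumptions, not from the patent.

```python
# Illustrative sketch: visibility-mask test and sub-mesh triangle selection.
def in_visibility_mask(q, zone_contains, line_p0, line_p1):
    """q is in the mask if it lies inside the discontinuity zone and to the
    right of the oriented break line (negative 2-D cross product, y-up)."""
    cross = (line_p1[0] - line_p0[0]) * (q[1] - line_p0[1]) \
          - (line_p1[1] - line_p0[1]) * (q[0] - line_p0[0])
    return zone_contains(q) and cross < 0

def triangle_for_point(q, zone_contains, line_p0, line_p1, right_tri, left_tri):
    """Pick the reconstructing triangle: right sub-mesh inside the mask, left otherwise."""
    return right_tri if in_visibility_mask(q, zone_contains, line_p0, line_p1) else left_tri
```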
  • the gradient descent can be started by initializing the displacement vectors at the retained nodes of the initial mesh to the values that were obtained during the first estimation 10.
  • the minimization of the functional (1) does not provide a displacement vector for such a node.
  • the displacement vector is then regenerated by interpolating those obtained for the neighbouring nodes of the same sub-mesh.
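The regeneration by interpolation from neighbours can be sketched as a simple average over the node's neighbours within its sub-mesh. A hedged illustration; a plain average is one possible interpolation, and the adjacency/vector data structures are assumptions:

```python
# Illustrative sketch: regenerate the displacement vector of a node introduced
# by the remeshing (no vector available from the minimisation of functional (1))
# by averaging the known vectors of its neighbours in the same sub-mesh.
def interpolate_displacement(node, neighbours_of, displacement):
    vecs = [displacement[n] for n in neighbours_of[node] if n in displacement]
    if not vecs:
        return (0.0, 0.0)  # no neighbour has a known vector yet
    dx = sum(v[0] for v in vecs) / len(vecs)
    dy = sum(v[1] for v in vecs) / len(vecs)
    return (dx, dy)
```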
  • the motion parameters delivered in step 20, when the image group includes at least one discontinuity zone, comprise:
  • A simplified block diagram of an encoder embodying the invention is presented in FIG. 13. Such an encoder carries out, on the one hand, a motion estimation on a digital image sequence of a video stream (module 36) and, on the other hand, a texture coding (module 37) which can be implemented according to various techniques known in the field of video coding.
  • the module 36 operates according to the method described with reference to FIG. 3.
  • the motion parameters (a)-(d) that it delivers are coded by the module 38 before being inserted into the digital output stream of the encoder by the module 39, together with the texture coding information.
  • a signal carrying this output stream can be transmitted or broadcast over a communication channel. It can also be recorded on a recording medium such as an optical disc, a magnetic tape, etc.
  • a video decoder compatible with such an encoder receives an input stream similar to the output stream of the encoder, and separates the motion parameters and the texture information in this stream (module 40).
  • Modules 41 and 42 respectively process this information to decode motion and texture in the successive image groups of the encoded video sequence.
  • the decoded movement and texture are processed by a synthesis module 43 to reconstruct the video images.
  • the operation of the motion decoding module 41 is as follows. Groups of images are first identified in the sequence, as in a conventional decoder. From the initial mesh fixed by convention, the module 41 locates the discontinuity zones according to the information (a) above. It then places the break lines in these discontinuity zones according to their location in the first image of the group (b). The module 41 then regenerates the non-manifold mesh by remeshing the discontinuity zones in accordance with step 14 previously described with reference to FIGS. 3 to 12. The quantized displacement vectors assigned to the nodes of the non-manifold mesh are indicated in the coded stream.
  • the module 41 identifies the triangular mesh used to synthesize the displacement vector of each point according to the same process as that used by the encoder in step 15 described above, from the position of the point with respect to the break line (b) (if this point is in a discontinuity zone) and from the z-values indicative of depth (c).
  • the encoder according to FIG. 13, or the decoder according to FIG. 14, can be produced in the form of a specific electronic circuit. However, it will often be implemented as software.
  • the steps of the methods described above are then controlled by instructions of a program executed by a processor of the video encoding or decoding apparatus.
  • this apparatus may for example be a computer, a video camera, a workstation of a television relay, a recording apparatus, etc.
  • the decoding apparatus may for example be a computer, a recording medium reader, a television signal receiver, an image display, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)
EP05805589A 2004-09-15 2005-09-06 Verfahren zur bewegungsschätzung mithilfe von verformbaren netzen Ceased EP1790169A1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR0409778 2004-09-15
PCT/FR2005/002216 WO2006030103A1 (fr) 2004-09-15 2005-09-06 Procede d'estimation de mouvement a l'aide de maillages deformables

Publications (1)

Publication Number Publication Date
EP1790169A1 true EP1790169A1 (de) 2007-05-30

Family

ID=34950345

Family Applications (1)

Application Number Title Priority Date Filing Date
EP05805589A Ceased EP1790169A1 (de) 2004-09-15 2005-09-06 Verfahren zur bewegungsschätzung mithilfe von verformbaren netzen

Country Status (5)

Country Link
US (1) US8761250B2 (de)
EP (1) EP1790169A1 (de)
JP (1) JP4870081B2 (de)
CN (1) CN101036390B (de)
WO (1) WO2006030103A1 (de)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8400472B2 (en) * 2009-02-25 2013-03-19 Technion Research & Development Foundation Limited Method and system of geometric deformation
US8295364B2 (en) * 2009-04-02 2012-10-23 Sony Corporation System and method of video data encoding with minimum baseband data transmission
JP2010258739A (ja) * 2009-04-24 2010-11-11 Sony Corp 画像処理装置および方法、並びにプログラム
US8922558B2 (en) * 2009-09-25 2014-12-30 Landmark Graphics Corporation Drawing graphical objects in a 3D subsurface environment
EP2582134A1 (de) * 2011-10-12 2013-04-17 Thomson Licensing Bestimmung von Auffälligkeitswerten ("saliency") prädiktionskodierter Videoströme
JP7279939B2 (ja) * 2016-09-21 2023-05-23 カカドゥ アール アンド ディー ピーティーワイ リミテッド ビデオ及びマルチビュー・イマジェリーの圧縮及びアップサンプリングのためのベース固定モデル及び推論
CN107621403B (zh) * 2017-10-24 2019-10-08 岭东核电有限公司 一种获取地下混凝土受硫酸盐侵蚀后真实力学性能和本构的方法
KR102167071B1 (ko) * 2019-11-29 2020-10-16 한길씨앤씨 주식회사 서술자 매칭쌍 분석 기술을 적용한 이미지 인식 시스템 및 방법

Family Cites Families (9)

Publication number Priority date Publication date Assignee Title
JPH06339136A (ja) * 1993-03-31 1994-12-06 Toshiba Corp 動画像処理方法および装置
JP2798120B2 (ja) * 1995-08-04 1998-09-17 日本電気株式会社 動き補償フレーム間予測方法及び動き補償フレーム間予測装置
JP3700230B2 (ja) * 1996-01-12 2005-09-28 株式会社日立製作所 動画像符号化における動き補償方法
US5999651A (en) * 1997-06-06 1999-12-07 Matsushita Electric Industrial Co., Ltd. Apparatus and method for tracking deformable objects
FR2783123B1 (fr) 1998-09-04 2000-11-24 France Telecom Procede d'estimation du mouvement entre deux images
JP3414683B2 (ja) * 1999-11-16 2003-06-09 株式会社国際電気通信基礎技術研究所 対象物の表面動き測定方法および装置、ならびに当該方法を実現するようコンピュータを動作させるためのプログラムを記憶したコンピュータ読取可能な記録媒体
FR2802377B1 (fr) * 1999-12-09 2002-03-08 France Telecom Procede d'estimation de mouvement entre deux images avec gestion des retournements de mailles et procede de codage correspondant
FR2820255A1 (fr) * 2001-01-26 2002-08-02 France Telecom Procedes de codage et de decodage d'images, dispositifs, systemes, signaux et applications correspondants
US7221366B2 (en) * 2004-08-03 2007-05-22 Microsoft Corporation Real-time rendering system and process for interactive viewpoint video

Non-Patent Citations (1)

Title
See references of WO2006030103A1 *

Also Published As

Publication number Publication date
CN101036390B (zh) 2010-06-16
US8761250B2 (en) 2014-06-24
CN101036390A (zh) 2007-09-12
US20070291845A1 (en) 2007-12-20
WO2006030103A1 (fr) 2006-03-23
JP4870081B2 (ja) 2012-02-08
JP2008514073A (ja) 2008-05-01
WO2006030103A8 (fr) 2007-05-03

Similar Documents

Publication Publication Date Title
Zhang et al. Multi-scale single image dehazing using perceptual pyramid deep network
EP1604529B1 (de) Verfahren und einrichtungen zur codierung und decodierung einer bildsequenz mittels bewegungs-/texturzerlegung und wavelet-codierung
Alsaiari et al. Image denoising using a generative adversarial network
EP1790169A1 (de) Verfahren zur bewegungsschätzung mithilfe von verformbaren netzen
US20110069152A1 (en) 2D to 3D video conversion
US20030081836A1 (en) Automatic object extraction
EP1299859A1 (de) Bewegungsschätzer für die kodierung und die dekodierung von bildersequenzen
TW202037169A (zh) 基於視訊的點雲壓縮的區塊分段的方法及裝置
Jantet Layered depth images for multi-view coding
FR2813485A1 (fr) Procede de construction d'au moins une image interpolee entre deux images d'une sequence animee, procedes de codage et de decodage, signal et support de donnees correspondant
EP0961227B1 (de) Verfahren zum Detektieren der relativen Tiefe zweier Objekte in einer Szene ausgehend von zwei Aufnahmen aus verschiedenen Blickrichtungen
FR2742901A1 (fr) Procede de correction d'estimation de mouvement dans des images a structures periodiques
CN111681192B (zh) 一种基于残差图像条件生成对抗网络的比特深度增强方法
WO2001043446A1 (fr) Procede d'estimation de mouvement entre deux images avec gestion des retournements de mailles et procede de codage correspondant
Panda et al. An efficient image interpolation using edge-error based sharpening
EP1654882A2 (de) Verfahren zum repräsentieren einer bildsequenz durch verwendung von 3d-modellen und entsprechende einrichtungen und signal
EP2737452B1 (de) Verfahren zur kodierung eines bildes nach einer grössenänderung mittels pixelentfernung und bildübertragungsverfahren zwischen einem sender und einem empfänger
CN117853340B (zh) 基于单向卷积网络和降质建模的遥感视频超分辨率重建方法
Corrigan et al. Pathological motion detection for robust missing data treatment
Rane et al. Image Denoising using Adaptive Patch Clustering with Suboptimal Wiener Filter in PCA Domain
Kokaram et al. Twenty years of Frame Interpolation for Retiming in the Movies
Balure Guidance-based improved depth upsampling with better initial estimate
Laccetti et al. P-LSR: a parallel algorithm for line scratch restoration
WO2023180844A1 (en) Mesh zippering
EP3991401A1 (de) Verfahren und vorrichtung zur verarbeitung von daten von mehrfachansichtsvideo

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20070306

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

17Q First examination report despatched

Effective date: 20070814

DAX Request for extension of the european patent (deleted)
RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: ORANGE

REG Reference to a national code

Ref country code: DE

Ref legal event code: R003

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED

18R Application refused

Effective date: 20160926