CN102034248B - Motion segmentation and three-dimensional (3D) expression method for single view image sequence - Google Patents
- Publication number
- CN102034248B CN102034248B CN2010106168493A CN201010616849A CN102034248B CN 102034248 B CN102034248 B CN 102034248B CN 2010106168493 A CN2010106168493 A CN 2010106168493A CN 201010616849 A CN201010616849 A CN 201010616849A CN 102034248 B CN102034248 B CN 102034248B
- Authority
- CN
- China
- Prior art keywords
- depth
- image sequence
- motion
- segmentation
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a motion segmentation and three-dimensional (3D) expression method for a single view image sequence. The method comprises the following steps: acquiring the single view image sequence; estimating the motion information and depth information of the main moving target by a gradient descent method and evolving the segmentation curve by a level set method; verifying the estimated depth information and correcting unreliable depth information; minimizing the energy function with the corrected depth information and the obtained segmentation curve to re-estimate the motion and depth information of the main moving target; and then fixing the segmentation curve, minimizing the energy function again, verifying and correcting the reliability of the depth information, and minimizing the energy function once more, so as to obtain the depth and motion information of every moving target. Motion segmentation and 3D expression are performed simultaneously, and the number of targets in the image sequence need not be known in advance. The method has wide applicability.
Description
Technical field
The present invention relates to motion segmentation, 3D motion information estimation and depth estimation for a monocular image sequence, and more particularly to a method that adopts the idea of main-motion segmentation and a region-competition level set method, performing motion segmentation, 3D motion estimation and depth estimation of a monocular image sequence based on a spatio-temporal model that incorporates 3D motion information.
Background technology
Society is entering the information age, and computer vision is being applied ever more widely in every field. Motion detection, motion estimation and depth estimation for monocular image sequences are important branches of computer vision, and have been hot topics of both theoretical and applied research in recent years. Their application fields are broad, including machine vision, monitoring of public facilities, medical diagnosis, surveillance of stations and traffic scenes, and monitoring of hotels, buildings and shopping malls.
Motion segmentation, motion estimation and depth estimation of a monocular image sequence mean distinguishing and recognizing, over a monocular image sequence, regions or targets with different motion characteristics according to their real motion in three-dimensional space, and recovering the depth information and motion parameters of the moving targets from the image sequence.
Traditional motion segmentation methods include background subtraction with thresholding, difference-image thresholding, optical flow methods and level-set-based methods. Background subtraction is simple to compute, but adapts poorly to changing scenes and is prone to segmentation errors. Difference-image thresholding generally cannot extract all of the relevant pixels, so holes easily appear inside moving objects. Optical flow methods are computationally complex and cannot achieve real-time performance.
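The difference-image thresholding just mentioned can be illustrated with a minimal NumPy sketch (illustrative only, not part of the patented method):

```python
import numpy as np

def difference_threshold(frame_prev, frame_curr, thresh=25):
    """Classic difference-image thresholding: mark pixels whose grey-level
    change between consecutive frames exceeds a threshold.  As noted above,
    the resulting mask often contains holes inside moving objects."""
    diff = np.abs(frame_curr.astype(np.int32) - frame_prev.astype(np.int32))
    return diff > thresh  # boolean motion mask

# toy example: a bright square moves one pixel to the right between frames
a = np.zeros((8, 8), dtype=np.uint8)
b = np.zeros((8, 8), dtype=np.uint8)
a[2:5, 2:5] = 200
b[2:5, 3:6] = 200
mask = difference_threshold(a, b)  # only the leading/trailing edges fire
```

Note that the interior of the square, where both frames are equally bright, is not detected — exactly the "hole" problem described above.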
Osher and Sethian first proposed the time-dependent level set (Level Set) description of a moving surface. This method avoids explicit handling of topological changes and is numerically stable. Caselles et al. proposed an edge-based level set method, but such methods are very sensitive to the initial position of the curve, making good segmentation results hard to achieve.
The 3D expression of a moving target means recovering its 3D structure and motion parameters from an image sequence. Current 3D expression of moving targets can be divided into sparse 3D expression and dense 3D expression. Sparse 3D expression is computed from a sparse point set in the image sequence; it is limited by the choice of that point set and cannot achieve stable results. Dense 3D expression is computed from the whole image sequence; however, most dense 3D expression work on monocular image sequences addresses the case of a moving camera in a static environment.
In 2006, Hicham combined motion segmentation and 3D expression for the first time, but Hicham's method can only segment a fixed number of moving targets.
Content of the invention
The object of the present invention is to overcome the shortcomings of the prior art by providing a motion segmentation and 3D expression method for monocular image sequences.
The object of the present invention is achieved through the following technical solution: a motion segmentation and 3D expression method for a monocular image sequence, comprising the following steps:
(1) Acquire a monocular image sequence: pictures of a scene containing moving targets are shot with a video camera; the camera may be moving or stationary, and the number of moving targets may be one or more;
(2) Initialization: a surface is initialized over the collected image sequence, so that each frame carries a curve dividing the picture into two parts; the depth Z over the whole image sequence region is initialized to a constant; the movement velocities T and ω are also initialized to constants;
(3) Estimate the movement velocity by the gradient descent method;
(4) Estimate the depth by the gradient descent method;
(5) Evolve the surface by the level set method;
(6) Judge the convergence condition: if it is not satisfied, jump back to step (3) and continue; if it is satisfied, verify whether the obtained depth information is reliable and correct unreliable depth information;
(7) If converged, judge by formula (11) whether the obtained depth information is reliable, and correct unreliable depth information;
(8) Re-estimate the depth and motion information: using the depth information obtained in step (7) and the motion information and segmentation surface from the converged steps (3)-(5), perform the initialization of step (2), then repeat steps (3) and (4) until convergence, obtaining the final motion and depth information;
(9) Apply steps (2) to (8) in turn to the segmentation curve of each moving target, obtaining the movement velocity and depth information of each moving target.
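The steps above can be sketched as a driver loop. All estimator names below are hypothetical stand-ins for the gradient-descent and level-set updates detailed in the embodiment; the trivial stub implementations exist only so the skeleton runs:

```python
# trivial stand-in updates so the skeleton executes; real versions apply the
# gradient-descent / level-set formulas of the embodiment (steps 3-5, 7)
def update_velocity(state, frames): return state
def update_depth(state, frames): return state
def evolve_surface(state, frames): return state
def check_convergence(state): return True
def correct_unreliable_depth(state): return state

def segment_and_express(frames, max_outer=2):
    """Sketch of the main-motion pipeline (steps 2-8): initialize constants,
    alternate velocity / depth / surface updates until convergence, then
    verify and correct the depth and re-run the estimation."""
    # step 2: constant depth, zero velocities, no surface yet
    state = {"Z": 1.0, "T": (0.0, 0.0, 0.0), "w": (0.0, 0.0, 0.0), "surface": None}
    for _ in range(max_outer):
        converged = False
        while not converged:
            state = update_velocity(state, frames)   # step 3
            state = update_depth(state, frames)      # step 4
            state = evolve_surface(state, frames)    # step 5
            converged = check_convergence(state)     # step 6
        state = correct_unreliable_depth(state)      # step 7; step 8 re-runs the loop
    return state  # motion and depth of the current (main) target

# step 9 would call segment_and_express once per moving target's curve
```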
The beneficial effects of the invention are as follows:
1. The present invention performs motion segmentation and 3D expression simultaneously.
2. The present invention does not need to know the number of targets in the image sequence in advance, and therefore has wide applicability.
3. The present invention adopts a narrow band in the level set evolution of the curve: only the points whose distance to the zero level set is within two cells (distances -2, -1, 1, 2) are updated. This preserves the correctness of the motion segmentation while greatly reducing the amount of computation, improving the efficiency of motion segmentation and 3D expression.
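Point 3 can be sketched as follows, assuming the level-set function is stored as a signed-distance grid (a sketch; the band width follows the description above):

```python
import numpy as np

def narrow_band(phi, width=2):
    """Select grid points whose signed distance to the zero level set is
    within `width` cells; level-set updates are then restricted to this
    band, preserving accuracy while cutting computation."""
    return np.abs(phi) <= width

phi = np.arange(-5, 6, dtype=float)   # 1-D signed distances -5 .. 5
band = narrow_band(phi)               # True only near the zero crossing
```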
Brief description of the drawings
Fig. 1 shows the coordinate system on which the projection of spatial 3D information onto the two-dimensional image plane is based in the present invention;
Fig. 2 is the flow chart of the energy function minimization method in the present invention;
Fig. 3 is the flow chart of the method for verifying depth reliability and correcting unreliable depth in the present invention.
Embodiment
The present invention therefore adopts a spatio-temporal processing model that treats 3D motion segmentation and dense 3D expression as a single problem. The model uses the idea of main-motion segmentation together with a region-competition level set method; through the pinhole camera model, 3D motion information is mapped to two-dimensional motion in the image, and motion segmentation and estimation are carried out directly with the 3D information. Thanks to the main-motion idea, the background is treated as the main moving target during segmentation, so the background and all targets can be separated in one pass and the motion and depth information of the background estimated; the targets are then processed one by one for motion and depth estimation, without knowing the number of moving targets in advance.
Assume I is an image sequence defined on a spatio-temporal region D = Ω × T, where Ω is an open subset of R² and T is the time interval. Let Ix, Iy and It be the grey-level differences of the image sequence in the horizontal, vertical and temporal directions, and let the optical flow at image point (x, y) be (u, v). The optical flow constraint equation is:

Ix·u + Iy·v + It = 0   (1)
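The gradients and the residual of the optical flow constraint equation (1) can be computed with simple finite differences (a sketch; the forward-difference scheme is an assumption):

```python
import numpy as np

def flow_residual(frame0, frame1, u, v):
    """Residual E = Ix*u + Iy*v + It of the optical flow constraint (1),
    using forward differences for the spatial and temporal gradients."""
    I = frame0.astype(float)
    Ix = np.zeros_like(I); Ix[:, :-1] = I[:, 1:] - I[:, :-1]   # horizontal difference
    Iy = np.zeros_like(I); Iy[:-1, :] = I[1:, :] - I[:-1, :]   # vertical difference
    It = frame1.astype(float) - I                              # temporal difference
    return Ix * u + Iy * v + It

# a uniform intensity ramp shifted right by one pixel satisfies (1) with (u, v) = (1, 0)
frame0 = np.tile(np.arange(5.0), (4, 1))
frame1 = frame0 - 1.0          # I(x, t+1) = I(x-1, t) for a ramp
r = flow_residual(frame0, frame1, 1.0, 0.0)
```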
Assume the scene contains a moving rigid body; all points of the same rigid body share the same translational and rotational velocities, denoted T = (t1, t2, t3) and ω = (ω1, ω2, ω3). A point P(X, Y, Z) in space is projected to the point (x, y) in the camera plane; by the projection relation shown in Fig. 1, x = f·X/Z and y = f·Y/Z, where f denotes the focal length of the camera.
The motion of the point (x, y) in the camera plane, written in vector form, is a function of the depth and the six motion variables:

u = (−f·t1 + x·t3)/Z + (x·y/f)·ω1 − (f + x²/f)·ω2 + y·ω3
v = (−f·t2 + y·t3)/Z + (f + y²/f)·ω1 − (x·y/f)·ω2 − x·ω3   (2)
Here the optical flow velocity (u, v) and the depth Z are functions of the image coordinates and of time.
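Equation (2) — the rigid-motion field of a pinhole camera — can be evaluated per pixel. The signs below follow the common Longuet-Higgins/Prazdny convention, which is an assumption and may differ from the patent's exact convention:

```python
def motion_field(x, y, f, T, w, Z):
    """Image velocity (u, v) at pixel (x, y) induced by rigid motion with
    translation T = (t1, t2, t3) and rotation w = (w1, w2, w3) of a point
    at depth Z, for a pinhole camera with focal length f (equation (2))."""
    t1, t2, t3 = T
    w1, w2, w3 = w
    u = (-f * t1 + x * t3) / Z + (x * y / f) * w1 - (f + x * x / f) * w2 + y * w3
    v = (-f * t2 + y * t3) / Z + (f + y * y / f) * w1 - (x * y / f) * w2 - x * w3
    return u, v

# pure translation along the optical axis yields a radial expansion field
u, v = motion_field(10.0, 5.0, 100.0, (0.0, 0.0, 1.0), (0.0, 0.0, 0.0), Z=50.0)
```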
Let Ωb denote the background region in the image and Ωo its complement, the target region. Moving-target segmentation and 3D expression can then be converted into the minimization of the following energy function, denoted E(S, θ) with θ(x, y) = (T(x, y), ω(x, y), Z):
The first and second terms measure how well the motion segmentation and the 3D expression of the background match the observed data. The third term constrains the depth and ensures its smoothness. The fourth term constrains the segmentation surface and ensures its smoothness.
Euler equations are used to minimize the energy function and solve for the segmentation surface and the 3D expression of the background; motion segmentation and the depth and motion estimation of the background region are carried out simultaneously. Using the energy function, the background and all moving targets can be separated and the depth and motion information of the background obtained; the evolved surface is then fixed and, using formulas (1), (2) and (3), the depth and motion information of each target can be estimated.
As shown in Fig. 2 the motion segmentation and 3D expressions of this method monocular image sequence, comprise the following steps:
1. Acquire a monocular image sequence
Pictures of a scene containing moving targets are shot with a video camera; the camera may be moving or stationary, and the number of moving targets may be one or more.
2. Initialization: a surface is initialized over the collected image sequence, so that each frame carries a curve dividing the picture into two parts (the curve may enclose a complete target or only part of one). The depth Z over the whole image sequence region is initialized to a constant, and the movement velocities T and ω are also initialized to constants.
3. Estimate the movement velocity by the gradient descent method
Assume the surface S and the depth Z are constant, where S denotes the segmentation surface. Differentiating the energy function E(S, θ) with respect to T and ω and applying gradient descent yields the iterative update formula for the velocity.
4. Estimate the depth by the gradient descent method
Assume the motion parameters and the surface are constant. Differentiating the energy function with respect to the depth Z and applying gradient descent, where Ix, Iy, It are the grey-level differences of the image sequence in the horizontal, vertical and temporal directions and Z denotes the depth, yields the iterative update formula for the depth.
5. Evolve the surface by the level set method
Assume the motion parameters and the depth are constant. Differentiating the energy function with respect to the surface and converting to level set form gives the partial differential equation of surface evolution and the corresponding iterative equation for the surface.
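One explicit iteration of such a level-set evolution can be sketched as below. The patent's speed function combines the data term g(E0) and the curvature penalty, which are not reproduced in this text, so the speed is left as a caller-supplied quantity (an assumption):

```python
import numpy as np

def evolve_step(phi, speed, dt=0.1):
    """One explicit level-set update phi <- phi - dt * speed * |grad phi|.
    `speed` stands in for the patent's combined data/curvature term."""
    gy, gx = np.gradient(phi)                       # central differences
    return phi - dt * speed * np.sqrt(gx**2 + gy**2)

phi = np.tile(np.arange(5.0), (5, 1))   # signed-distance-like ramp, |grad| = 1
phi_next = evolve_step(phi, speed=1.0)  # front moves uniformly by dt
```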
6. Judge the convergence condition: if it is not satisfied, jump back to step 3 and continue; if it is satisfied, verify whether the obtained depth information is reliable and correct unreliable depth information.
If the movement velocities obtained in three consecutive passes of step 3 differ by less than 10% of their mean, the depths obtained in three consecutive passes of step 4 differ by less than 10% of their mean, and the surface positions obtained in three consecutive passes of step 5 differ by less than 10% of their mean, the convergence condition is considered met; otherwise it is not.
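The convergence test for a single scalar quantity can be sketched as follows (an interpretation of the 10% criterion above, applied per quantity):

```python
def converged(values, tol=0.10):
    """Step-6 test for one quantity: the three most recent estimates must
    pairwise differ (consecutively) by less than `tol` (10%) of their mean."""
    if len(values) < 3:
        return False
    a, b, c = values[-3:]
    mean = (a + b + c) / 3.0
    if mean == 0:
        return abs(b - a) == 0 and abs(c - b) == 0
    return abs(b - a) < tol * abs(mean) and abs(c - b) < tol * abs(mean)
```

The full condition of step 6 requires this to hold simultaneously for the velocity, the depth and the surface position.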
7. If converged, judge by formula (11) whether the obtained depth information is reliable, and correct unreliable depth information, as shown in Fig. 3.
For the depth information obtained by formula (11), judge whether the corresponding energy function value exceeds a threshold, and mark the depth points whose energy exceeds it:
(14)
where the former is the index set of the depth points whose energy exceeds the threshold, and the latter denotes the energy function value corresponding to the point (i, j).
For the depth information obtained by formula (11), compare the depth of each point with the depths of its eight neighbours; if the difference exceeds the threshold for 4 or more of them, the point is marked as unreliable. That is,
(15)
where the former is the index set of unreliable depth points and the latter is the number of the eight neighbours whose depth difference from point (i, j) exceeds the threshold.
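The eight-neighbour test of formula (15) can be sketched as follows (a sketch; wrap-around at the image border via `np.roll` is an implementation assumption):

```python
import numpy as np

def unreliable_mask(Z, thresh):
    """Formula (15): a depth point is unreliable when its absolute depth
    difference to 4 or more of its eight neighbours exceeds `thresh`."""
    count = np.zeros(Z.shape, dtype=int)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            shifted = np.roll(np.roll(Z, dy, axis=0), dx, axis=1)
            count += (np.abs(Z - shifted) > thresh).astype(int)
    return count >= 4

Z = np.ones((5, 5))
Z[2, 2] = 10.0                       # a single outlier depth point
mask = unreliable_mask(Z, thresh=1.0)
```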
Unreliable depth points are corrected using neighbourhood information and the corresponding grey-level information; that is, the depth is corrected so that the following energy function is minimized,
where m, n give the size of the neighbourhood window used when correcting the depth value.
Three weights represent, respectively, the influence of spatial information, grey-level information and energy information on the depth; they are defined below. A further weight distinguishes the reliable and unreliable depth points in the neighbourhood, defined as follows:
(20)
where the excluded points are those whose energy function is judged by formula (14) to exceed the threshold, or that are judged unreliable by formula (15).
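The correction step can be sketched as a weighted average over reliable neighbours. The spatial and grey-level weights below are assumed Gaussian forms; the patent's exact weight definitions (including the energy weight) are not reproduced in this text:

```python
import numpy as np

def correct_depth(Z, gray, bad, win=1, sigma_s=1.0, sigma_g=10.0):
    """Replace each unreliable depth (True in `bad`) by a weighted average
    of the reliable neighbours in a (2*win+1)^2 window, weighting by
    spatial distance and grey-level similarity (assumed Gaussian forms)."""
    out = Z.copy()
    h, w = Z.shape
    for i, j in zip(*np.nonzero(bad)):
        num = den = 0.0
        for m in range(max(0, i - win), min(h, i + win + 1)):
            for n in range(max(0, j - win), min(w, j + win + 1)):
                if bad[m, n]:          # skip other unreliable points (formula 20)
                    continue
                ws = np.exp(-((m - i) ** 2 + (n - j) ** 2) / (2 * sigma_s ** 2))
                wg = np.exp(-((gray[m, n] - gray[i, j]) ** 2) / (2 * sigma_g ** 2))
                num += ws * wg * Z[m, n]
                den += ws * wg
        if den > 0:
            out[i, j] = num / den
    return out

Z = np.ones((5, 5)); Z[2, 2] = 10.0              # outlier depth
gray = np.full((5, 5), 128.0)                    # uniform grey level
bad = np.zeros((5, 5), dtype=bool); bad[2, 2] = True
fixed = correct_depth(Z, gray, bad)              # outlier pulled to neighbours
```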
8. Re-estimate the depth and motion information
Using the depth information obtained in step 7 and the motion information and segmentation surface from the converged steps 3-5, perform the initialization of step 2 and repeat steps 3 and 4 until convergence, obtaining the final motion and depth information.
9. Apply steps (2) to (8) in turn to the segmentation curve of each moving target, obtaining the movement velocity and depth information of each moving target.
Claims (5)
1. A motion segmentation and 3D expression method for a monocular image sequence, characterised in that it comprises the following steps:
(1) acquire a monocular image sequence: pictures of a scene containing moving targets are shot with a video camera; the camera may be moving or stationary, and the number of moving targets may be one or more;
(2) initialization: a surface is initialized over the collected image sequence, so that each frame carries a curve dividing the picture into two parts; the depth Z over the whole image sequence region is initialized to a constant; the movement velocities T and ω are also initialized to constants;
(3) estimate the movement velocity by the gradient descent method;
(4) estimate the depth by the gradient descent method;
(5) evolve the surface by the level set method;
(6) judge the convergence condition: if it is not satisfied, jump back to step (3) and continue; if it is satisfied, verify whether the obtained depth information is reliable and correct unreliable depth information;
(7) if converged, judge by the formula whether the obtained depth information is reliable, and correct unreliable depth information;
in the formula, a1, a2, a4 are real constants adjusting the weight of each term in the energy function; kr denotes curvature; E0 = (Ix·u + Iy·v + It)²; and g(E0) is a monotonically decreasing function on [0, +∞);
(8) re-estimate the depth and motion information: using the depth information obtained in step (7) and the motion information and segmentation surface from the converged steps (3)-(5), perform the initialization of step (2), then repeat steps (3) and (4) until convergence, obtaining the final motion and depth information;
(9) apply steps (2) to (8) in turn to the segmentation curve of each moving target, obtaining the movement velocity and depth information of each moving target.
2. The motion segmentation and 3D expression method for a monocular image sequence according to claim 1, characterised in that step (3) is specifically: assume the surface S and the depth Z are constant; the energy function E(S, θ) is differentiated with respect to T and ω and gradient descent is applied, obtaining:
where T = (t1, t2, t3) and ω = (ω1, ω2, ω3) denote the translational and rotational velocities respectively,
E(S, θ) denotes the energy function,
S denotes the segmentation surface,
θ(x, y) = (T(x, y), ω(x, y), Z).
3. The motion segmentation and 3D expression method for a monocular image sequence according to claim 1, characterised in that step (4) is specifically: assume the motion parameters and the surface are constant; the energy function is differentiated with respect to the depth and gradient descent is applied:
E1 = Ix·u + Iy·v + It,
where Ix, Iy, It are the grey-level differences of the image sequence in the horizontal, vertical and temporal directions;
g(E0) is a monotonically decreasing function on [0, +∞);
E0 = (Ix·u + Iy·v + It)²;
f denotes the focal length;
Z denotes the depth;
a1, a2, a3 are real constants adjusting the weight of each term in the energy function.
4. The motion segmentation and 3D expression method for a monocular image sequence according to claim 1, characterised in that step (5) is specifically: assume the motion parameters and the depth are constant; the energy function is differentiated with respect to the surface:
where a1, a2, a4 are real constants adjusting the weight of each term in the energy function;
kr denotes curvature;
E0 = (Ix·u + Iy·v + It)²;
g(E0) is a monotonically decreasing function on [0, +∞);
converting to level set form yields the partial differential equation of surface evolution.
5. The motion segmentation and 3D expression method for a monocular image sequence according to claim 1, characterised in that in step (6): if the movement velocities obtained in three consecutive passes of step (3) differ by less than 10% of their mean, the depths obtained in three consecutive passes of step (4) differ by less than 10% of their mean, and the surface positions obtained in three consecutive passes of step (5) differ by less than 10% of their mean, the convergence condition is considered met; otherwise it is not.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010106168493A CN102034248B (en) | 2010-12-31 | 2010-12-31 | Motion segmentation and three-dimensional (3D) expression method for single view image sequence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102034248A CN102034248A (en) | 2011-04-27 |
CN102034248B true CN102034248B (en) | 2012-08-22 |
Family
ID=43887101
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2010106168493A Expired - Fee Related CN102034248B (en) | 2010-12-31 | 2010-12-31 | Motion segmentation and three-dimensional (3D) expression method for single view image sequence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102034248B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102521846B (en) * | 2011-12-21 | 2013-12-04 | 浙江大学 | Time-space domain motion segmentation and motion estimation method based on three-dimensional video |
CN102542578A (en) * | 2011-12-23 | 2012-07-04 | 浙江大学 | Time-space domain motion segmentation and motion evaluation method based on three-dimensional (3D) videos |
CN103237228B (en) * | 2013-04-28 | 2015-08-12 | 清华大学 | The segmentation method for space-time consistency of binocular tri-dimensional video |
CN106157307B (en) | 2016-06-27 | 2018-09-11 | 浙江工商大学 | A kind of monocular image depth estimation method based on multiple dimensioned CNN and continuous CRF |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1150848A (en) * | 1994-08-30 | 1997-05-28 | Thomson Broadband Systems | Synthesis image generating process |
CN101692284A (en) * | 2009-07-24 | 2010-04-07 | Xidian University | Three-dimensional human body motion tracking method based on quantum immune clone algorithm |
CN101826228A (en) * | 2010-05-14 | 2010-09-08 | University of Shanghai for Science and Technology | Detection method of bus passenger moving objects based on background estimation |
Also Published As
Publication number | Publication date |
---|---|
CN102034248A (en) | 2011-04-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 2012-08-22 | Termination date: 2012-12-31 |