CN102034248B - Motion segmentation and three-dimensional (3D) expression method for single view image sequence - Google Patents
- Publication number
- CN102034248B CN102034248B CN2010106168493A CN201010616849A CN102034248B CN 102034248 B CN102034248 B CN 102034248B CN 2010106168493 A CN2010106168493 A CN 2010106168493A CN 201010616849 A CN201010616849 A CN 201010616849A CN 102034248 B CN102034248 B CN 102034248B
- Authority
- CN
- China
- Prior art keywords
- depth
- image sequence
- motion
- segmentation
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a motion segmentation and three-dimensional (3D) expression method for a single view image sequence. The method comprises the following steps: acquiring the single view image sequence; estimating the motion information and depth information of the main moving target by a gradient descent method and evolving the segmentation curve by a level set method; verifying the estimated depth information and correcting unreliable depth information; minimizing the energy function with the corrected depth information and the obtained segmentation curve to re-estimate the motion and depth information of the main moving target; and then fixing the segmentation curve, minimizing the energy function again, verifying and correcting the reliability of the depth information, and minimizing the energy function once more, so as to obtain the depth and motion information of every moving target. Motion segmentation and 3D expression are performed simultaneously, and the number of targets in the image sequence need not be known in advance. The method has wide applicability.
Description
Technical field
The present invention relates to motion segmentation, 3D motion information estimation and depth estimation for a monocular image sequence, and more particularly to a method that adopts the idea of main-motion segmentation and a region-competition level set method, performing motion segmentation, 3D motion estimation and depth estimation of a monocular image sequence based on a spatio-temporal model that incorporates 3D motion information.
Background technology
Society is entering the information age, and computer vision is being applied ever more widely in every field. Motion detection, motion estimation and depth estimation for monocular image sequences are important branches of computer vision, and have been hot topics of both theoretical and applied research in recent years. Their application fields are broad, including machine vision, monitoring of public facilities, medical diagnosis, surveillance of stations and traffic scenes, and monitoring of hotels, buildings and shopping malls.
Motion segmentation, motion estimation and depth estimation of a monocular image sequence mean distinguishing and recognizing, over a monocular image sequence, regions or targets with different motion characteristics according to their real motion in three-dimensional space, and recovering the depth information and motion parameters of the moving targets from the image sequence.
Traditional motion segmentation methods include background subtraction with thresholding, difference-image thresholding, optical flow methods and level-set-based methods. Background subtraction is simple to compute, but adapts poorly to changing scenes and is prone to segmentation errors. Difference-image thresholding generally cannot extract all of the relevant pixels, so holes easily appear inside moving objects. Optical flow methods are computationally complex and cannot achieve real-time performance.
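The difference-image thresholding just mentioned can be illustrated with a minimal NumPy sketch (illustrative only, not part of the patented method):

```python
import numpy as np

def difference_threshold(frame_prev, frame_curr, thresh=25):
    """Classic difference-image thresholding: mark pixels whose grey-level
    change between consecutive frames exceeds a threshold.  As noted above,
    the resulting mask often contains holes inside moving objects."""
    diff = np.abs(frame_curr.astype(np.int32) - frame_prev.astype(np.int32))
    return diff > thresh  # boolean motion mask

# toy example: a bright square moves one pixel to the right between frames
a = np.zeros((8, 8), dtype=np.uint8)
b = np.zeros((8, 8), dtype=np.uint8)
a[2:5, 2:5] = 200
b[2:5, 3:6] = 200
mask = difference_threshold(a, b)  # only the leading/trailing edges fire
```

Note that the interior of the square, where both frames are equally bright, is not detected — exactly the "hole" problem described above.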
Osher and Sethian first proposed the time-dependent level set (Level Set) description of a moving surface. This method avoids explicit handling of topological changes and is numerically stable. Caselles et al. proposed an edge-based level set method, but such methods are very sensitive to the initial position of the curve, making good segmentation results hard to achieve.
The 3D expression of a moving target means recovering its 3D structure and motion parameters from an image sequence. Current 3D expression of moving targets can be divided into sparse 3D expression and dense 3D expression. Sparse 3D expression is computed from a sparse point set in the image sequence; it is limited by the choice of that point set and cannot achieve stable results. Dense 3D expression is computed from the whole image sequence; however, most dense 3D expression work on monocular image sequences addresses the case of a moving camera in a static environment.
In 2006, Hicham combined motion segmentation and 3D expression for the first time, but Hicham's method can only segment a fixed number of moving targets.
Content of the invention
The object of the present invention is to overcome the shortcomings of the prior art by providing a motion segmentation and 3D expression method for monocular image sequences.
The object of the present invention is achieved through the following technical solution: a motion segmentation and 3D expression method for a monocular image sequence, comprising the following steps:
(1) Acquire a monocular image sequence: pictures of a scene containing moving targets are shot with a video camera; the camera may be moving or stationary, and the number of moving targets may be one or more;
(2) Initialization: a surface is initialized over the collected image sequence, so that each frame carries a curve dividing the picture into two parts; the depth Z over the whole image sequence region is initialized to a constant; the movement velocities T and ω are also initialized to constants;
(3) Estimate the movement velocity by the gradient descent method;
(4) Estimate the depth by the gradient descent method;
(5) Evolve the surface by the level set method;
(6) Judge the convergence condition: if it is not satisfied, jump back to step (3) and continue; if it is satisfied, verify whether the obtained depth information is reliable and correct unreliable depth information;
(7) If converged, judge by formula (11) whether the obtained depth information is reliable, and correct unreliable depth information;
(8) Re-estimate the depth and motion information: using the depth information obtained in step (7) and the motion information and segmentation surface from the converged steps (3)-(5), perform the initialization of step (2), then repeat steps (3) and (4) until convergence, obtaining the final motion and depth information;
(9) Apply steps (2) to (8) in turn to the segmentation curve of each moving target, obtaining the movement velocity and depth information of each moving target.
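The steps above can be sketched as a driver loop. All estimator names below are hypothetical stand-ins for the gradient-descent and level-set updates detailed in the embodiment; the trivial stub implementations exist only so the skeleton runs:

```python
# trivial stand-in updates so the skeleton executes; real versions apply the
# gradient-descent / level-set formulas of the embodiment (steps 3-5, 7)
def update_velocity(state, frames): return state
def update_depth(state, frames): return state
def evolve_surface(state, frames): return state
def check_convergence(state): return True
def correct_unreliable_depth(state): return state

def segment_and_express(frames, max_outer=2):
    """Sketch of the main-motion pipeline (steps 2-8): initialize constants,
    alternate velocity / depth / surface updates until convergence, then
    verify and correct the depth and re-run the estimation."""
    # step 2: constant depth, zero velocities, no surface yet
    state = {"Z": 1.0, "T": (0.0, 0.0, 0.0), "w": (0.0, 0.0, 0.0), "surface": None}
    for _ in range(max_outer):
        converged = False
        while not converged:
            state = update_velocity(state, frames)   # step 3
            state = update_depth(state, frames)      # step 4
            state = evolve_surface(state, frames)    # step 5
            converged = check_convergence(state)     # step 6
        state = correct_unreliable_depth(state)      # step 7; step 8 re-runs the loop
    return state  # motion and depth of the current (main) target

# step 9 would call segment_and_express once per moving target's curve
```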
The beneficial effects of the invention are as follows:
1. The present invention performs motion segmentation and 3D expression simultaneously.
2. The present invention does not need to know the number of targets in the image sequence in advance, and therefore has wide applicability.
3. The present invention adopts a narrow band in the level set evolution of the curve: only the points whose distance to the zero level set is within two cells (distances -2, -1, 1, 2) are updated. This preserves the correctness of the motion segmentation while greatly reducing the amount of computation, improving the efficiency of motion segmentation and 3D expression.
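Point 3 can be sketched as follows, assuming the level-set function is stored as a signed-distance grid (a sketch; the band width follows the description above):

```python
import numpy as np

def narrow_band(phi, width=2):
    """Select grid points whose signed distance to the zero level set is
    within `width` cells; level-set updates are then restricted to this
    band, preserving accuracy while cutting computation."""
    return np.abs(phi) <= width

phi = np.arange(-5, 6, dtype=float)   # 1-D signed distances -5 .. 5
band = narrow_band(phi)               # True only near the zero crossing
```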
Brief description of the drawings
Fig. 1 shows the coordinate system on which the projection of spatial 3D information onto the two-dimensional image plane is based in the present invention;
Fig. 2 is the flow chart of the energy function minimization method in the present invention;
Fig. 3 is the flow chart of the method for verifying depth reliability and correcting unreliable depth in the present invention.
Embodiment
The present invention therefore adopts a spatio-temporal processing model that treats 3D motion segmentation and dense 3D expression as a single problem. The model uses the idea of main-motion segmentation together with a region-competition level set method; through the pinhole camera model, 3D motion information is mapped to two-dimensional motion in the image, and motion segmentation and estimation are carried out directly with the 3D information. Thanks to the main-motion idea, the background is treated as the main moving target during segmentation, so the background and all targets can be separated in one pass and the motion and depth information of the background estimated; the targets are then processed one by one for motion and depth estimation, without knowing the number of moving targets in advance.
Assume I is an image sequence defined on a spatio-temporal region D = Ω × T, where Ω is an open subset of R² and T is the time interval. Let Ix, Iy and It be the grey-level differences of the image sequence in the horizontal, vertical and temporal directions, and let the optical flow at image point (x, y) be (u, v). The optical flow constraint equation is:

Ix·u + Iy·v + It = 0   (1)
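The gradients and the residual of the optical flow constraint equation (1) can be computed with simple finite differences (a sketch; the forward-difference scheme is an assumption):

```python
import numpy as np

def flow_residual(frame0, frame1, u, v):
    """Residual E = Ix*u + Iy*v + It of the optical flow constraint (1),
    using forward differences for the spatial and temporal gradients."""
    I = frame0.astype(float)
    Ix = np.zeros_like(I); Ix[:, :-1] = I[:, 1:] - I[:, :-1]   # horizontal difference
    Iy = np.zeros_like(I); Iy[:-1, :] = I[1:, :] - I[:-1, :]   # vertical difference
    It = frame1.astype(float) - I                              # temporal difference
    return Ix * u + Iy * v + It

# a uniform intensity ramp shifted right by one pixel satisfies (1) with (u, v) = (1, 0)
frame0 = np.tile(np.arange(5.0), (4, 1))
frame1 = frame0 - 1.0          # I(x, t+1) = I(x-1, t) for a ramp
r = flow_residual(frame0, frame1, 1.0, 0.0)
```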
Assume the scene contains a moving rigid body; all points of the same rigid body share the same translational and rotational velocities, denoted T = (t1, t2, t3) and ω = (ω1, ω2, ω3). A point P(X, Y, Z) in space is projected to the point (x, y) in the camera plane; by the projection relation shown in Fig. 1, x = f·X/Z and y = f·Y/Z, where f denotes the focal length of the camera.
The motion of the point (x, y) in the camera plane, written in vector form, is a function of the depth and the six motion variables:

u = (−f·t1 + x·t3)/Z + (x·y/f)·ω1 − (f + x²/f)·ω2 + y·ω3
v = (−f·t2 + y·t3)/Z + (f + y²/f)·ω1 − (x·y/f)·ω2 − x·ω3   (2)
Here the optical flow velocity (u, v) and the depth Z are functions of the image coordinates and of time.
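Equation (2) — the rigid-motion field of a pinhole camera — can be evaluated per pixel. The signs below follow the common Longuet-Higgins/Prazdny convention, which is an assumption and may differ from the patent's exact convention:

```python
def motion_field(x, y, f, T, w, Z):
    """Image velocity (u, v) at pixel (x, y) induced by rigid motion with
    translation T = (t1, t2, t3) and rotation w = (w1, w2, w3) of a point
    at depth Z, for a pinhole camera with focal length f (equation (2))."""
    t1, t2, t3 = T
    w1, w2, w3 = w
    u = (-f * t1 + x * t3) / Z + (x * y / f) * w1 - (f + x * x / f) * w2 + y * w3
    v = (-f * t2 + y * t3) / Z + (f + y * y / f) * w1 - (x * y / f) * w2 - x * w3
    return u, v

# pure translation along the optical axis yields a radial expansion field
u, v = motion_field(10.0, 5.0, 100.0, (0.0, 0.0, 1.0), (0.0, 0.0, 0.0), Z=50.0)
```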
Let Ωb denote the background region in the image and Ωo its complement, the target region. Moving-target segmentation and 3D expression can then be converted into the minimization of the following energy function, denoted E(S, θ) with θ(x, y) = (T(x, y), ω(x, y), Z):
The first and second terms measure how well the motion segmentation and the 3D expression of the background match the observed data. The third term constrains the depth and ensures its smoothness. The fourth term constrains the segmentation surface and ensures its smoothness.
Euler equations are used to minimize the energy function and solve for the segmentation surface and the 3D expression of the background; motion segmentation and the depth and motion estimation of the background region are carried out simultaneously. Using the energy function, the background and all moving targets can be separated and the depth and motion information of the background obtained; the evolved surface is then fixed and, using formulas (1), (2) and (3), the depth and motion information of each target can be estimated.
As shown in Fig. 2 the motion segmentation and 3D expressions of this method monocular image sequence, comprise the following steps:
1. Acquire a monocular image sequence
Pictures of a scene containing moving targets are shot with a video camera; the camera may be moving or stationary, and the number of moving targets may be one or more.
2. Initialization: a surface is initialized over the collected image sequence, so that each frame carries a curve dividing the picture into two parts (the curve may enclose a complete target or only part of one). The depth Z over the whole image sequence region is initialized to a constant, and the movement velocities T and ω are also initialized to constants.
3. Estimate the movement velocity by the gradient descent method
Assume the surface S and the depth Z are constant, where S denotes the segmentation surface. Differentiating the energy function E(S, θ) with respect to T and ω and applying gradient descent yields the iterative update formula for the velocity.
4. Estimate the depth by the gradient descent method
Assume the motion parameters and the surface are constant. Differentiating the energy function with respect to the depth Z and applying gradient descent, where Ix, Iy, It are the grey-level differences of the image sequence in the horizontal, vertical and temporal directions and Z denotes the depth, yields the iterative update formula for the depth.
5. Evolve the surface by the level set method
Assume the motion parameters and the depth are constant. Differentiating the energy function with respect to the surface and converting to level set form gives the partial differential equation of surface evolution and the corresponding iterative equation for the surface.
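One explicit iteration of such a level-set evolution can be sketched as below. The patent's speed function combines the data term g(E0) and the curvature penalty, which are not reproduced in this text, so the speed is left as a caller-supplied quantity (an assumption):

```python
import numpy as np

def evolve_step(phi, speed, dt=0.1):
    """One explicit level-set update phi <- phi - dt * speed * |grad phi|.
    `speed` stands in for the patent's combined data/curvature term."""
    gy, gx = np.gradient(phi)                       # central differences
    return phi - dt * speed * np.sqrt(gx**2 + gy**2)

phi = np.tile(np.arange(5.0), (5, 1))   # signed-distance-like ramp, |grad| = 1
phi_next = evolve_step(phi, speed=1.0)  # front moves uniformly by dt
```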
6. Judge the convergence condition: if it is not satisfied, jump back to step 3 and continue; if it is satisfied, verify whether the obtained depth information is reliable and correct unreliable depth information.
If the movement velocities obtained in three consecutive passes of step 3 differ by less than 10% of their mean, the depths obtained in three consecutive passes of step 4 differ by less than 10% of their mean, and the surface positions obtained in three consecutive passes of step 5 differ by less than 10% of their mean, the convergence condition is considered met; otherwise it is not.
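The convergence test for a single scalar quantity can be sketched as follows (an interpretation of the 10% criterion above, applied per quantity):

```python
def converged(values, tol=0.10):
    """Step-6 test for one quantity: the three most recent estimates must
    pairwise differ (consecutively) by less than `tol` (10%) of their mean."""
    if len(values) < 3:
        return False
    a, b, c = values[-3:]
    mean = (a + b + c) / 3.0
    if mean == 0:
        return abs(b - a) == 0 and abs(c - b) == 0
    return abs(b - a) < tol * abs(mean) and abs(c - b) < tol * abs(mean)
```

The full condition of step 6 requires this to hold simultaneously for the velocity, the depth and the surface position.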
7. If converged, judge by formula (11) whether the obtained depth information is reliable, and correct unreliable depth information, as shown in Fig. 3.
For the depth information obtained by formula (11), judge whether the corresponding energy function value exceeds a threshold, and mark the depth points whose energy exceeds it:
(14)
where the former is the index set of the depth points whose energy exceeds the threshold, and the latter denotes the energy function value corresponding to the point (i, j).
For the depth information obtained by formula (11), compare the depth of each point with the depths of its eight neighbours; if the difference exceeds the threshold for 4 or more of them, the point is marked as unreliable. That is,
(15)
where the former is the index set of unreliable depth points and the latter is the number of the eight neighbours whose depth difference from point (i, j) exceeds the threshold.
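The eight-neighbour test of formula (15) can be sketched as follows (a sketch; wrap-around at the image border via `np.roll` is an implementation assumption):

```python
import numpy as np

def unreliable_mask(Z, thresh):
    """Formula (15): a depth point is unreliable when its absolute depth
    difference to 4 or more of its eight neighbours exceeds `thresh`."""
    count = np.zeros(Z.shape, dtype=int)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            shifted = np.roll(np.roll(Z, dy, axis=0), dx, axis=1)
            count += (np.abs(Z - shifted) > thresh).astype(int)
    return count >= 4

Z = np.ones((5, 5))
Z[2, 2] = 10.0                       # a single outlier depth point
mask = unreliable_mask(Z, thresh=1.0)
```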
Unreliable depth points are corrected using neighbourhood information and the corresponding grey-level information; that is, the depth is corrected so that the following energy function is minimized,
where m, n give the size of the neighbourhood window used when correcting the depth value.
Three weights represent, respectively, the influence of spatial information, grey-level information and energy information on the depth; they are defined below. A further weight distinguishes the reliable and unreliable depth points in the neighbourhood, defined as follows:
(20)
where the excluded points are those whose energy function is judged by formula (14) to exceed the threshold, or that are judged unreliable by formula (15).
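The correction step can be sketched as a weighted average over reliable neighbours. The spatial and grey-level weights below are assumed Gaussian forms; the patent's exact weight definitions (including the energy weight) are not reproduced in this text:

```python
import numpy as np

def correct_depth(Z, gray, bad, win=1, sigma_s=1.0, sigma_g=10.0):
    """Replace each unreliable depth (True in `bad`) by a weighted average
    of the reliable neighbours in a (2*win+1)^2 window, weighting by
    spatial distance and grey-level similarity (assumed Gaussian forms)."""
    out = Z.copy()
    h, w = Z.shape
    for i, j in zip(*np.nonzero(bad)):
        num = den = 0.0
        for m in range(max(0, i - win), min(h, i + win + 1)):
            for n in range(max(0, j - win), min(w, j + win + 1)):
                if bad[m, n]:          # skip other unreliable points (formula 20)
                    continue
                ws = np.exp(-((m - i) ** 2 + (n - j) ** 2) / (2 * sigma_s ** 2))
                wg = np.exp(-((gray[m, n] - gray[i, j]) ** 2) / (2 * sigma_g ** 2))
                num += ws * wg * Z[m, n]
                den += ws * wg
        if den > 0:
            out[i, j] = num / den
    return out

Z = np.ones((5, 5)); Z[2, 2] = 10.0              # outlier depth
gray = np.full((5, 5), 128.0)                    # uniform grey level
bad = np.zeros((5, 5), dtype=bool); bad[2, 2] = True
fixed = correct_depth(Z, gray, bad)              # outlier pulled to neighbours
```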
8. Re-estimate the depth and motion information
Using the depth information obtained in step 7 and the motion information and segmentation surface from the converged steps 3-5, perform the initialization of step 2 and repeat steps 3 and 4 until convergence, obtaining the final motion and depth information.
9. Apply steps (2) to (8) in turn to the segmentation curve of each moving target, obtaining the movement velocity and depth information of each moving target.
Claims (5)
1. A motion segmentation and 3D expression method for a monocular image sequence, characterised in that it comprises the following steps:
(1) acquire a monocular image sequence: pictures of a scene containing moving targets are shot with a video camera; the camera may be moving or stationary, and the number of moving targets may be one or more;
(2) initialization: a surface is initialized over the collected image sequence, so that each frame carries a curve dividing the picture into two parts; the depth Z over the whole image sequence region is initialized to a constant; the movement velocities T and ω are also initialized to constants;
(3) estimate the movement velocity by the gradient descent method;
(4) estimate the depth by the gradient descent method;
(5) evolve the surface by the level set method;
(6) judge the convergence condition: if it is not satisfied, jump back to step (3) and continue; if it is satisfied, verify whether the obtained depth information is reliable and correct unreliable depth information;
(7) if converged, judge by the formula whether the obtained depth information is reliable, and correct unreliable depth information;
in the formula, a1, a2, a4 are real constants adjusting the weight of each term in the energy function; kr denotes curvature; E0 = (Ix·u + Iy·v + It)²; and g(E0) is a monotonically decreasing function on [0, +∞);
(8) re-estimate the depth and motion information: using the depth information obtained in step (7) and the motion information and segmentation surface from the converged steps (3)-(5), perform the initialization of step (2), then repeat steps (3) and (4) until convergence, obtaining the final motion and depth information;
(9) apply steps (2) to (8) in turn to the segmentation curve of each moving target, obtaining the movement velocity and depth information of each moving target.
2. The motion segmentation and 3D expression method for a monocular image sequence according to claim 1, characterised in that step (3) is specifically: assume the surface S and the depth Z are constant; the energy function E(S, θ) is differentiated with respect to T and ω and gradient descent is applied, obtaining:
where T = (t1, t2, t3) and ω = (ω1, ω2, ω3) denote the translational and rotational velocities respectively,
E(S, θ) denotes the energy function,
S denotes the segmentation surface,
θ(x, y) = (T(x, y), ω(x, y), Z).
3. The motion segmentation and 3D expression method for a monocular image sequence according to claim 1, characterised in that step (4) is specifically: assume the motion parameters and the surface are constant; the energy function is differentiated with respect to the depth and gradient descent is applied:
E1 = Ix·u + Iy·v + It,
where Ix, Iy, It are the grey-level differences of the image sequence in the horizontal, vertical and temporal directions;
g(E0) is a monotonically decreasing function on [0, +∞);
E0 = (Ix·u + Iy·v + It)²;
f denotes the focal length;
Z denotes the depth;
a1, a2, a3 are real constants adjusting the weight of each term in the energy function.
4. The motion segmentation and 3D expression method for a monocular image sequence according to claim 1, characterised in that step (5) is specifically: assume the motion parameters and the depth are constant; the energy function is differentiated with respect to the surface:
where a1, a2, a4 are real constants adjusting the weight of each term in the energy function;
kr denotes curvature;
E0 = (Ix·u + Iy·v + It)²;
g(E0) is a monotonically decreasing function on [0, +∞);
converting to level set form yields the partial differential equation of surface evolution.
5. The motion segmentation and 3D expression method for a monocular image sequence according to claim 1, characterised in that in step (6): if the movement velocities obtained in three consecutive passes of step (3) differ by less than 10% of their mean, the depths obtained in three consecutive passes of step (4) differ by less than 10% of their mean, and the surface positions obtained in three consecutive passes of step (5) differ by less than 10% of their mean, the convergence condition is considered met; otherwise it is not.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010106168493A CN102034248B (en) | 2010-12-31 | 2010-12-31 | Motion segmentation and three-dimensional (3D) expression method for single view image sequence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102034248A CN102034248A (en) | 2011-04-27 |
CN102034248B true CN102034248B (en) | 2012-08-22 |
Family
ID=43887101
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2010106168493A Expired - Fee Related CN102034248B (en) | 2010-12-31 | 2010-12-31 | Motion segmentation and three-dimensional (3D) expression method for single view image sequence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102034248B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102521846B (en) * | 2011-12-21 | 2013-12-04 | 浙江大学 | Time-space domain motion segmentation and motion estimation method based on three-dimensional video |
CN102542578A (en) * | 2011-12-23 | 2012-07-04 | 浙江大学 | Time-space domain motion segmentation and motion evaluation method based on three-dimensional (3D) videos |
CN103237228B (en) * | 2013-04-28 | 2015-08-12 | 清华大学 | The segmentation method for space-time consistency of binocular tri-dimensional video |
CN106157307B (en) | 2016-06-27 | 2018-09-11 | 浙江工商大学 | A kind of monocular image depth estimation method based on multiple dimensioned CNN and continuous CRF |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1150848A (en) * | 1994-08-30 | 1997-05-28 | Thomson Broadband Systems | Synthesis image generating process |
CN101692284A (en) * | 2009-07-24 | 2010-04-07 | Xidian University | Three-dimensional human body motion tracking method based on quantum immune clone algorithm |
CN101826228A (en) * | 2010-05-14 | 2010-09-08 | University of Shanghai for Science and Technology | Detection method of bus passenger moving objects based on background estimation |
Also Published As
Publication number | Publication date |
---|---|
CN102034248A (en) | 2011-04-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C17 | Cessation of patent right | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 2012-08-22 | Termination date: 2012-12-31 |