CN108171725A

CN108171725A - Method for target detection in a dynamic environment based on normal flow information

Info

Publication number: CN108171725A
Application number: CN201711421030.XA
Authority: CN
Inventors: 袁丁; 于亚龙; 胡晓辉; 张弘; 李伟鹏
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2017-12-25
Filing date: 2017-12-25
Publication date: 2018-06-15

Abstract

The present invention provides a method for target detection in a dynamic environment based on normal flow information. It comprises three main steps. Step 1: estimate the FOE (focus of expansion) by a voting method. Step 2: generate an initial label field. Step 3: generate an optimal label field, thereby detecting the target. The main advantage of the invention is that the algorithm has low sensitivity to noise and places no requirements on the image texture or image features of the captured scene, so it applies to a wide range of scenes and has broad application prospects.

Description

Method for target detection in a dynamic environment based on normal flow information

Technical field

The present invention relates to a method for target detection in a dynamic environment based on normal flow information. The method has low sensitivity to noise and places no requirements on the image texture or image features of the captured scene, so it applies to a wide range of scenes. It belongs to the field of computer vision.

Background art

Target detection in a dynamic environment means detecting the independently moving targets of the captured scene in each frame by analyzing the motion information contained in a sequence of consecutive images taken by a moving camera at different moments. Because the camera moves continuously, the acquired image sequence contains two kinds of motion information: the motion of the camera and the motion of the independently moving targets in the scene. The two are mixed together, which makes detecting moving targets in each frame more complex and difficult. In practice, however, image sequences acquired by moving devices are ubiquitous, while this technology is still at the development stage, so it can be expected to find wider application in the near future. At present, target detection in dynamic environments is used in intelligent video surveillance, content-based video retrieval, visual navigation of mobile robots, autonomous driving, and other fields. It therefore has significant academic research value and practical application value.

So far, most existing moving-target detection algorithms apply only when the camera is stationary, that is, to detecting moving targets in static scenes. Only a small number of algorithms handle a moving camera, that is, target detection in a dynamic environment. These algorithms fall mainly into three classes: the extended frame-difference method, the extended background-subtraction method, and the extended optical-flow method. The extended frame-difference method uses feature matching to estimate the camera motion, so its range of application is severely limited: it generally suits only scenes with rich texture and distinct features. The extended background-subtraction method uses image registration to build a panoramic background model image; because registration is easily affected by parallax, this method has not been widely adopted. The extended optical-flow method is computationally complex and easily affected by noise, and has likewise not been widely applied.

Summary of the invention

Technical problem solved by the invention: to overcome the deficiencies of the prior art by providing a method for target detection in a dynamic environment based on normal flow information. The method is built on the computation of normal flow information, has low sensitivity to noise, places no requirements on the image texture or image features of the captured scene, and applies to a wide range of scenes.

The invention aims to detect moving targets in a dynamic environment. Its input is the normal flow field computed from two adjacent frames, and its output is the detected moving targets. The specific technical solution is described below.

Step 1: Select the 1st and 2nd frames of the video and compute the normal flow vector field of the 2nd frame; at the same time, segment the 2nd frame to obtain its color segmentation map;

Step 2: Using the normal flow vector field and the color segmentation map obtained in Step 1, estimate the camera's pure translational motion information, the FOE, by the voting method;

Step 3: Once the FOE has been obtained, combine it with the normal flow vector field of the 2nd frame to divide the regions of the color segmentation map into background regions and foreground regions (the foreground regions being the moving-target regions) by testing each region against the translational half-plane constraint, thereby generating the initial label field of the 2nd frame;

Step 4: Optimize the initial label field of the 2nd frame to generate its optimal label field: model the optimal-labeling problem with Markov random field theory to obtain a target energy function, then solve this function with the graph cut algorithm, so that the moving targets are detected in the 2nd frame;

Step 5: Segment the 3rd frame to obtain its color segmentation map, and use the optimal label field of the 2nd frame together with this map to generate the initial label field of the 3rd frame;

Step 6: Optimize the initial label field of the 3rd frame to generate its optimal label field, so that the moving targets are detected in the 3rd frame;

Step 7: Repeat Steps 5 and 6 to detect the targets in the remaining frames, until the moving targets in all frames have been detected.

Compared with the prior art, the invention has the following beneficial effect: when estimating the camera motion information, it uses only the direction information of the normal flow vectors and obtains the FOE by the voting method. This places no restriction on the captured scene, is insensitive to noise, and estimates the camera motion accurately. The method of the invention can therefore be applied to a wider range of scenes.

Description of the drawings

Fig. 1 is a schematic diagram of the half-plane constraint;

Fig. 2 is the flow chart of the detection method of the invention;

Fig. 3 shows experimental results on a real image sequence, in which: (a) is the original image; (b) is the Mean-Shift-based color segmentation map; (c) is the initial label field; (d) is the optimal label field; (e) is the ground truth.

Detailed description of the embodiments

To aid understanding of the technical solution of the invention, specific embodiments are described further below.

The proposed method was implemented in MATLAB 2013a under Windows 7. It uses the camera motion information estimated from the normal flow field of two adjacent frames, together with the color segmentation map of the current frame, to obtain an initial label field. It then constructs a suitable target energy function with Markov random field theory and optimizes the initial label field with the graph cut algorithm. The result is the optimal label field of the current frame, from which the independently moving targets are detected.

The invention is a novel method for target detection in a dynamic environment based on normal flow information. Its flow is shown in Fig. 2, and it comprises the following steps:

1. Estimating the FOE by the voting method

In studying target detection in a dynamic environment, the invention considers the case in which the camera undergoes pure translation. The following describes how the normal flow field is obtained and how the camera's translational motion parameters are estimated from it.

The normal flow vector field of an image can be computed directly from the gray matrices of two adjacent frames. Let n be the unit gradient direction vector at an arbitrary pixel p = (i, j) of the current frame:

$$\mathbf{n} = \frac{1}{\sqrt{E_x^2 + E_y^2}} \begin{bmatrix} E_x \\ E_y \end{bmatrix} \tag{1}$$

where [E_x E_y]^T is the gray gradient at pixel p. By the definition of the normal flow vector, the normal flow vector Vⁿ at pixel p can be expressed as:

$$V^{n} = -\frac{E_t}{\sqrt{E_x^2 + E_y^2}}\,\mathbf{n} \tag{2}$$

where E_t is the change in gray value at pixel p between the two adjacent frames. E_x, E_y and E_t are computed by formula (3),

where I₁ and I₂ are the gray matrices of the two adjacent frames. For a real image sequence, the normal flow vector field of each frame can therefore be computed with formulas (2) and (3).
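As an illustration of the computation just described, the following Python sketch derives a normal flow field from two gray frames. It is a minimal sketch under stated assumptions, not the implementation of the invention: the derivative estimates (NumPy central differences on the frame average for E_x and E_y, a plain frame difference for E_t) are stand-ins for formula (3), whose exact form may differ, and the names `normal_flow` and `eps` are hypothetical.

```python
import numpy as np

def normal_flow(I1, I2, eps=1e-6):
    """Normal flow of the second frame from two gray frames (2-D arrays).

    Returns (Vx, Vy, valid); `valid` marks pixels with a usable gradient.
    """
    I1 = np.asarray(I1, dtype=float)
    I2 = np.asarray(I2, dtype=float)
    # Assumed derivative estimates, standing in for formula (3).
    Ey, Ex = np.gradient(0.5 * (I1 + I2))  # axis 0 is y (rows), axis 1 is x (columns)
    Et = I2 - I1
    mag2 = Ex ** 2 + Ey ** 2               # squared gradient magnitude
    valid = mag2 > eps
    # V^n = -(E_t / |grad E|) * n = -(E_t / |grad E|^2) * [E_x, E_y]
    scale = np.zeros_like(mag2)
    scale[valid] = -Et[valid] / mag2[valid]
    return scale * Ex, scale * Ey, valid

# A horizontal intensity ramp that moves one pixel to the right between frames.
ramp = np.tile(np.arange(8.0), (6, 1))
Vx, Vy, valid = normal_flow(ramp, ramp - 1.0)
```

Pixels whose gradient magnitude is close to zero carry no normal flow information, so the sketch masks them out through `valid` instead of dividing by a vanishing magnitude.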

After the normal flow vector field of the 2nd frame has been computed, the invention uses the half-plane constraint that normal flow vectors obey under pure translation to estimate the camera's translational motion information, that is, the FOE. The basic principle of estimating the FOE with this constraint is shown in Fig. 1.

In Fig. 1, the normal flow vectors at three pixels p₁, p₂ and p₃ are Vⁿ(p₁), Vⁿ(p₂) and Vⁿ(p₃), and l(p₁), l(p₂) and l(p₃) are the straight lines that pass through p₁, p₂ and p₃ respectively and are perpendicular to the normal flow vector at each pixel. For pixel p₁, the half-plane constraint on normal flow under pure translation implies that the FOE, which characterizes the camera's translation, lies in the half-plane bounded by l(p₁) that does not contain Vⁿ(p₁). Likewise, for p₂ the FOE lies in the half-plane that does not contain Vⁿ(p₂), and for p₃ in the half-plane that does not contain Vⁿ(p₃). If the camera undergoes pure translation and the scene is static, the position of the FOE in the image must satisfy the half-plane constraint of the normal flow vector at every pixel simultaneously. In Fig. 1, therefore, to satisfy the constraints at p₁, p₂ and p₃ together, the FOE must lie in the triangular region enclosed by l(p₁), l(p₂) and l(p₃). The possible region of the FOE estimated from three pixels is thus much smaller than that estimated from a single pixel. As the number of pixels used increases, the possible region obtained from the half-plane constraints shrinks steadily, and once the number of pixels exceeds a certain value the estimated FOE reaches pixel-level precision.
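The geometric argument above can be restated algebraically. This is a restatement under an assumed notation: q denotes a candidate FOE position in the image plane, a symbol introduced here for illustration only. The line l(p) is the set of points q for which the inner product below is zero, and the half-plane that does not contain the normal flow vector is the side on which it is negative:

```latex
\[
  (q - p_k)\cdot V^{n}(p_k) \le 0, \qquad k = 1, 2, 3,
\]
\[
  \mathrm{FOE} \in \bigcap_{k}\,\bigl\{\, q : (q - p_k)\cdot V^{n}(p_k) \le 0 \,\bigr\}.
\]
```

Each additional pixel adds one more half-plane to the intersection, which is why the admissible region can only shrink as more pixels are used.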

Based on this principle, the invention proposes estimating the camera's translational motion information with a voting strategy, referred to as the voting method. The FOE is estimated as follows:

(1) Create a two-dimensional accumulator array and initialize all of its elements to zero. The array has the same size as the spatial resolution of the preprocessed image sequence. Its first and second dimensions cover the possible values of the FOE along the y-axis and the x-axis, respectively.

(2) For a pixel p in the 2nd frame, use the half-plane constraint of the normal flow vector Vⁿ at that pixel to determine the possible region of the FOE in the image plane.

(3) Locate the positions in the accumulator array that correspond to this region and add 1 to each of those elements.

(4) Perform steps (2) and (3) for every other pixel in the current frame, until all pixels in the image have been processed.

(5) Find the maximum element of the accumulator array; its column and row indices are the coordinates of the FOE in the image.
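The five voting steps above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the invention: the function name `vote_foe` and its argument layout are hypothetical, candidate FOE positions are restricted to pixel centers, and the half-plane test is written as (q - p)·Vⁿ(p) ≤ 0 for a candidate position q.

```python
import numpy as np

def vote_foe(points, flows, shape):
    """Estimate the FOE by half-plane voting.

    points: iterable of (x, y) pixel coordinates
    flows:  iterable of (vx, vy) normal flow vectors at those pixels
    shape:  (H, W) of the image
    Returns ((x, y) of the first maximum, accumulator array).
    """
    H, W = shape
    acc = np.zeros((H, W), dtype=np.int64)        # step (1): rows are y, columns are x
    ys, xs = np.mgrid[0:H, 0:W]                   # candidate FOE positions
    for (px, py), (vx, vy) in zip(points, flows): # steps (2) and (4)
        # Candidates on the side of l(p) that does not contain the flow vector.
        region = (xs - px) * vx + (ys - py) * vy <= 0
        acc += region                             # step (3): one vote per admissible cell
    row, col = np.unravel_index(np.argmax(acc), acc.shape)  # step (5)
    return (int(col), int(row)), acc

# Synthetic check: purely radial flow expanding from a known FOE at x=5, y=4.
H, W = 9, 11
foe = (5, 4)
pts = [(x, y) for y in range(H) for x in range(W)]
flows = [(x - foe[0], y - foe[1]) for x, y in pts]
estimate, acc = vote_foe(pts, flows, (H, W))
```

With few pixels or nearly parallel flow vectors, several cells can tie for the maximum; `np.argmax` then returns the first one, so a practical version might average the tied cells instead.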

After these five steps, the camera's pure translational motion information, that is, the FOE, is estimated accurately. Because the voting method uses only the direction of the normal flow vectors, the estimate has low sensitivity to noise. However, when the half-plane constraint is used to estimate the camera's pure translation, the interference caused by independently moving targets in the scene must be excluded. The invention excludes it as follows.

First, the 2nd frame is processed with an image segmentation method based on the Mean-Shift algorithm to obtain its color segmentation map. In this map, pixels with similar feature-space information are grouped together to form several distinct connected regions, and each connected region is assigned a different RGB color value. Second, a different label value is assigned to each region according to its RGB color value in the segmentation map, forming a region label map. Third, the area of the region corresponding to each label value is computed by counting its pixels. Finally, because regions with larger area are more likely to belong to the background, the normal flow vectors at the pixels of these regions are selected to estimate the camera's pure translational motion information, that is, the FOE. At the same time, the label values of these regions are updated to "0", where "0" denotes background.

These steps yield the camera's pure translational motion information. In other words, when the voting method is applied, only the normal flow vectors at the pixels of the selected regions are used, and the label value of each selected region is updated to "0", meaning that the region is regarded as background.
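The region selection described above can be sketched as follows, assuming the segmentation is already available as an integer region label map with positive label values. The text does not state how large a region must be to count as "larger", so the area threshold `min_fraction` is a hypothetical parameter, as is the function name `background_candidates`.

```python
import numpy as np

def background_candidates(labels, min_fraction=0.2):
    """Pick large regions as likely background.

    labels: 2-D array of positive integer region labels
    min_fraction: assumed area threshold, as a fraction of the image size
    Returns (mask of pixels used for voting, label map with those regions set to 0).
    """
    labels = np.asarray(labels)
    areas = np.bincount(labels.ravel())          # pixel count per label value
    large = np.flatnonzero(areas >= min_fraction * labels.size)
    mask = np.isin(labels, large)                # pixels whose normal flow will vote
    updated = labels.copy()
    updated[mask] = 0                            # "0" marks background
    return mask, updated

labels = np.array([[1, 1, 1, 1, 2],
                   [1, 1, 1, 1, 2],
                   [1, 1, 1, 1, 2],
                   [3, 3, 3, 2, 2]])
mask, updated = background_candidates(labels, min_fraction=0.5)
```

Only the pixels selected by `mask` would then contribute normal flow vectors to the voting step.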

2. Generating the initial label field

Once the camera's pure translational motion information (the FOE) has been estimated by the voting method, the regions of the color segmentation map of the 2nd frame can be divided, with the help of that frame's normal flow vector field, into two groups: background regions and foreground regions (the moving-target regions). In the 2nd frame, the normal flow vectors at all pixels of the background satisfy the translational half-plane constraint, whereas in a moving-target region the target's own motion causes the normal flow vectors at most pixels to violate it. This fact is used to classify each region of the color segmentation map.

For any region R of the color segmentation map, first test whether the normal flow vector at each pixel of the region satisfies the translational half-plane constraint, and count the number of pixels that satisfy it, Sup(R), and the number that do not, Vio(R). Second, compute the proportion Ratio(R) of pixels in the region that satisfy the constraint:

$$\mathrm{Ratio}(R) = \frac{\mathrm{Sup}(R)}{\mathrm{Sup}(R) + \mathrm{Vio}(R)} \tag{4}$$

Finally, determine the attribute of the region from the computed Ratio(R) and a preset threshold ε > 0. In the invention, the empirical value ε = 0.8 is used, based on extensive experiments. If Ratio(R) ≥ ε, the region belongs to the background and its label value is updated to "0"; otherwise, if Ratio(R) < ε, the region belongs to a moving target and its label value is updated to "1".

Repeating these three steps for every region of the color segmentation map divides the whole image into two classes, background and moving target, and thereby generates the initial label field (Initial Labeling Field) of the 2nd frame.
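The region classification above can be sketched in Python as follows. It is a minimal sketch under stated assumptions: the function names `satisfies_half_plane` and `initial_label_field` are hypothetical, the per-pixel test is written as (FOE - p)·Vⁿ(p) ≤ 0, and the threshold defaults to the empirical value ε = 0.8 given in the text.

```python
import numpy as np

def satisfies_half_plane(Vx, Vy, foe):
    """Per-pixel test of the translational half-plane constraint for FOE (x, y)."""
    H, W = Vx.shape
    ys, xs = np.mgrid[0:H, 0:W]
    return (foe[0] - xs) * Vx + (foe[1] - ys) * Vy <= 0

def initial_label_field(labels, satisfied, eps=0.8):
    """Label each region 0 (background) or 1 (moving target) from Ratio(R)."""
    labels = np.asarray(labels)
    satisfied = np.asarray(satisfied, dtype=bool)
    field = np.zeros(labels.shape, dtype=np.uint8)
    for r in np.unique(labels):
        region = labels == r
        sup = np.count_nonzero(satisfied[region])   # Sup(R)
        vio = np.count_nonzero(region) - sup        # Vio(R)
        ratio = sup / (sup + vio)                   # Ratio(R)
        field[region] = 0 if ratio >= eps else 1
    return field

# Region 1: 9 of 10 pixels satisfy the constraint; region 2: 1 of 5.
labels = np.array([[1] * 10 + [2] * 5])
satisfied = np.array([[True] * 9 + [False] + [True] + [False] * 4])
field = initial_label_field(labels, satisfied)
```

Classifying whole regions instead of single pixels means a few noisy normal flow vectors inside a background region do not change its label.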

3. Generating the optimal label field

The initial label field obtained in the previous step cannot segment the independently moving targets from the 2nd frame accurately. The invention therefore uses prior knowledge contained in the image sequence to optimize the initial label field and obtain the optimal label field (Optimal Labeling Field) of the 2nd frame, so that the independently moving targets in the scene are segmented accurately. In the optimal label field, "0" denotes a background region and "1" denotes a moving-target region.

Markov random field (MRF) theory provides a sound solution to this optimization problem, because a Markov random field model expresses the prior knowledge contained in an image well. Modeling the problem with MRF theory treats it as a Bayesian labeling problem: find the maximum a posteriori (MAP) estimate of the true label field (True Labeling). In the true label field, "1" denotes an independently moving target in the scene and "0" denotes the background. By the theorem on the equivalence of the joint distribution of a Markov random field and the Gibbs distribution (the Hammersley-Clifford theorem), solving this problem is converted into minimizing a target energy function, that is, finding the label configuration for which the target energy function attains its minimum. The graph cut algorithm is currently a popular method for solving such label combinatorial optimization problems. The invention therefore first models the optimization problem with MRF theory to obtain a target energy function and then solves it with the graph cut algorithm (Graph Cuts). The process of optimizing the initial label field into the optimal label field under the MAP-MRF framework is described in detail below.

Given the label set L = {0, 1}, the observation field D = {d₁, d₂, ..., d_N} of the 2nd frame, and the normal flow vector field V = {V₁ⁿ, V₂ⁿ, ..., V_Nⁿ} of the 2nd frame, where d_i is the RGB color value at the i-th pixel of the 2nd frame (a three-dimensional vector), V_iⁿ is the normal flow vector at the i-th pixel (a two-dimensional vector), and N is the total number of pixels in the 2nd frame, the goal of the invention is to obtain the optimal label field f* = {f₁*, f₂*, ..., f_N*} of the 2nd frame, where f_i* is the optimal label value assigned to the i-th pixel and takes the value 0 or 1.

Under the MAP-MRF framework, the optimal label field f* is obtained by solving:

$$f^{*} = \arg\min_{f \in F} U(f, V) \tag{5}$$

where U(f, V) is the target energy function and F is the set of all possible label fields f of the 2nd frame. Solving formula (5) therefore means finding, within the set F, the label field that minimizes U(f, V), namely f*. The target energy function U(f, V) is defined as:

$$U(f, V) = U_1(f) + U_2(f \mid V) \tag{6}$$

where U₁(f) is the prior clique energy function of the image, the sum of all clique energies in the image, and U₂(f|V) is the likelihood energy function, the sum of the energies required to assign a label value to each pixel of the image. U₁(f) and U₂(f|V) are defined by formulas (7) and (8), respectively.

In formula (7), U₁(f_i, f_j) is the feature-discontinuity cost of two adjacent pixels (the i-th and j-th pixels), that is, the clique energy, and C_i is the set of eight-neighborhood pixels of the i-th pixel. In formula (8), U₂(f_i|V_iⁿ) is the energy required to assign label value f_i to the i-th pixel. In these formulas, α and β are zero-mean Gaussian white noise. U₁(f_i, f_j) and U₂(f_i|V_iⁿ) are defined by formulas (9) and (10), respectively.

In formula (9), dist(d_i, d_j) is the Euclidean distance between d_i and d_j, τ is a preset distance threshold, and 0 < α₁ < α₂. In formula (10), Cor_i is the coordinate of the i-th pixel, V_iⁿ is the normal flow vector at the i-th pixel, and μ is the normal flow noise factor. δ(i) is defined by formula (11),

where num is the number of pixels in the eight-neighborhood of the i-th pixel that satisfy the translational half-plane constraint.

To obtain the globally optimal solution of formula (5), the invention solves the label combinatorial optimization problem with the graph cut method. An undirected graph G = (V, E) is defined, where V is the set of vertices (N nodes representing the image pixels, a terminal s representing the background, and a terminal t representing the moving target) and E is the set of edges. There are two types of edge: edges between nodes, whose weights are set to U₁(f_i, f_j), and edges between a node and a terminal, whose weights are set to U₂(f_i|V_iⁿ). Finding the minimum-cost cut of this graph divides its nodes into two sets, S and T, where every node in S is connected to terminal s and every node in T is connected to terminal t.

Completing the steps above yields the optimal label field f* of the 2nd frame.
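To illustrate how a minimum cut produces a binary labeling, the following Python sketch solves a tiny two-label problem with a plain augmenting-path max-flow. It is an illustration under stated assumptions, not the energy of the invention: the per-pixel costs in `unary` and the constant smoothness weights in `pairwise` are hypothetical stand-ins for U₂ and U₁, the function names are invented for this example, and a practical system would use an optimized graph cut library.

```python
from collections import defaultdict, deque

def _reachable(cap, source):
    """Breadth-first search over edges that still have residual capacity."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, c in cap[u].items():
            if c > 0 and v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def min_cut_labels(unary, pairwise):
    """unary[i] = (cost of label 0, cost of label 1); pairwise = [(i, j, weight)].

    Returns one label per node: 0 if the node stays on the side of terminal s
    (background), 1 if it ends on the side of terminal t (moving target).
    """
    n = len(unary)
    s, t = n, n + 1
    cap = defaultdict(lambda: defaultdict(int))
    for i, (cost0, cost1) in enumerate(unary):
        cap[s][i] += cost1      # this edge is cut when node i takes label 1
        cap[i][t] += cost0      # this edge is cut when node i takes label 0
    for i, j, w in pairwise:    # cut when the two neighbours take different labels
        cap[i][j] += w
        cap[j][i] += w
    while True:
        parent = _reachable(cap, s)
        if t not in parent:     # no augmenting path left: parent holds the set S
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow
    return [0 if i in parent else 1 for i in range(n)]

# Four pixels in a row: the first two favour background, the last two the target.
labels = min_cut_labels([(1, 9), (1, 9), (9, 1), (9, 1)],
                        [(0, 1, 2), (1, 2, 2), (2, 3, 2)])
```

For two labels and non-negative smoothness weights of this kind, the minimum cut gives a global minimum of the corresponding energy, which is the property the text relies on when it speaks of a globally optimal solution.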

4. Detecting independently moving targets in a video

The preceding sections described in detail the key steps of the method for target detection in a dynamic environment based on the normal flow field. The specific steps for extracting the independently moving targets from a video with the proposed algorithm are described below; the flow chart is shown in Fig. 2.

The independently moving targets in a video are extracted as follows:

(1) Select the 1st and 2nd frames of the video and compute the normal flow vector field of the 2nd frame with formulas (2) and (3); at the same time, segment the 2nd frame with the image segmentation method based on the Mean-Shift algorithm to obtain its color segmentation map.

(2) Estimate the camera's pure translational motion information, that is, the FOE, by the voting method.

(3) Generate the initial label field of the 2nd frame.

(4) Optimize this initial label field to generate the optimal label field of the 2nd frame, thereby detecting the moving targets.

(5) Use the optimal label field of the 2nd frame together with the color segmentation map of the 3rd frame to generate the initial label field of the 3rd frame, as follows. First, according to the RGB color values in the color segmentation map of the 3rd frame, assign a different label value (a positive number greater than 0) to each region, generating a region label map. Second, in this region label map, set to "0" the label values at the pixels that correspond to "0" in the optimal label field of the 2nd frame, generating a pseudo initial label field. Third, for each non-zero label value in the pseudo initial label field, compute the ratio Ratio(label_j), j = 1, 2, ..., m, of its pixel count to the pixel count of the same label region in the original region label map, where label_j is the j-th non-zero label value and m is the number of non-zero label values. Finally, if Ratio(label_j) ≥ τ, update the j-th non-zero label value to "1"; otherwise, update it to "0". This process generates the initial label field of the 3rd frame.
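The label propagation in step (5) can be sketched in Python as follows. This is a minimal sketch: the function name `propagate_label_field` is hypothetical, the text gives no numeric value for τ in this step so the default `tau=0.5` is an assumption, and the update follows a literal reading in which only the pixels that still carry label_j in the pseudo initial label field are set to "1".

```python
import numpy as np

def propagate_label_field(prev_optimal, region_labels, tau=0.5):
    """Build the initial label field of the next frame.

    prev_optimal:  optimal label field of the previous frame (0 or 1 per pixel)
    region_labels: region label map of the next frame (positive integers)
    tau:           assumed ratio threshold
    """
    prev_optimal = np.asarray(prev_optimal)
    region_labels = np.asarray(region_labels)
    # Pseudo initial label field: zero out pixels that were background before.
    pseudo = region_labels.copy()
    pseudo[prev_optimal == 0] = 0
    field = np.zeros(region_labels.shape, dtype=np.uint8)
    for lab in np.unique(pseudo):
        if lab == 0:
            continue
        kept = np.count_nonzero(pseudo == lab)           # pixels still labeled lab
        total = np.count_nonzero(region_labels == lab)   # size of the original region
        if kept / total >= tau:                          # Ratio(label_j) >= tau
            field[pseudo == lab] = 1
    return field

region_labels = np.array([[1, 1, 1, 1, 2, 2, 2, 2]])
prev_optimal = np.array([[1, 1, 1, 0, 0, 0, 0, 1]])
field = propagate_label_field(prev_optimal, region_labels)
```

Because the previous optimal label field seeds the next frame, no new FOE estimate is needed for frames after the 2nd in this scheme.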

(6) Optimize this initial label field to generate the optimal label field of the 3rd frame, thereby detecting the moving targets.

(7) Repeat steps (5) and (6) to detect the moving targets in the subsequent frames.

The effectiveness and accuracy of the invention were verified by simulation experiments and on image sequences captured in real scenes, with good results. The experimental results are shown in Fig. 3, which gives the moving-target detection result for the 57th frame of a real image sequence: (a) is the original image, (b) is the Mean-Shift-based color segmentation map, (c) is the initial label field, (d) is the optimal label field, (e) is the ground-truth moving-target template, and (f) is the detection result of the DECOLOR algorithm. The main advantage of the invention is that the algorithm has low sensitivity to noise and places no requirements on the image texture or image features of the captured scene, so it applies to a wide range of scenes. The experimental results show that the invention detects targets in a dynamic environment effectively.

The above embodiments are provided only to describe the invention and are not intended to limit its scope. The scope of the invention is defined by the appended claims. All equivalent substitutions and modifications made without departing from the spirit and principles of the invention fall within its scope.

Claims

1. A method for target detection in a dynamic environment based on normal flow information, characterized by comprising the following steps:

Step 1: Select the 1st and 2nd frames of the video and compute the normal flow vector field of the 2nd frame; at the same time, segment the 2nd frame to obtain its color segmentation map;

Step 2: Using the normal flow vector field and the color segmentation map obtained in Step 1, estimate the camera's pure translational motion information, the FOE, by the voting method;

Step 3: Once the FOE has been obtained, combine it with the normal flow vector field of the 2nd frame to divide the regions of the color segmentation map into background regions and foreground regions (the foreground regions being the moving-target regions) by testing each region against the translational half-plane constraint, thereby generating the initial label field of the 2nd frame;

Step 4: Optimize the initial label field of the 2nd frame to generate its optimal label field: model the optimal-labeling problem with Markov random field theory to obtain a target energy function, then solve this function with the graph cut algorithm, so that the moving targets are detected in the 2nd frame;

Step 5: Segment the 3rd frame to obtain its color segmentation map, and use the optimal label field of the 2nd frame together with this map to generate the initial label field of the 3rd frame;

Step 6: Optimize the initial label field of the 3rd frame to generate its optimal label field, so that the moving targets are detected in the 3rd frame;

Step 7: Repeat Steps 5 and 6 to detect the targets in the remaining frames, until the moving targets in all frames have been detected.

2. The method for target detection in a dynamic environment based on normal flow information according to claim 1, characterized in that: in Step 1, the color segmentation map is obtained with an image segmentation method based on the Mean-Shift algorithm.

3. The method for target detection in a dynamic environment based on normal flow information according to claim 1, characterized in that: in Step 3, the initial label field of the 2nd frame is generated as follows:

(1) For any region R of the color segmentation map of the 2nd frame, test and count the number of pixels in R that satisfy the translational half-plane constraint, Sup(R), and the number that do not, Vio(R), and compute the proportion Ratio(R) of pixels in the region that satisfy the constraint:

$$\mathrm{Ratio}(R) = \frac{\mathrm{Sup}(R)}{\mathrm{Sup}(R) + \mathrm{Vio}(R)} \tag{1}$$

(2) Determine the attribute of region R from the computed Ratio(R) and a preset threshold ε > 0, the empirical value ε = 0.8 being taken on the basis of extensive experiments: if Ratio(R) ≥ ε, region R belongs to the background and its label value is updated to "0"; otherwise, region R belongs to a moving target and its label value is updated to "1";

Repeating the above steps for every region of the color segmentation map divides the whole image into two classes, background regions and foreground regions, thereby generating the initial label field (Initial Labeling Field) of the 2nd frame.

4. The method for target detection in a dynamic environment based on normal flow information according to claim 3, characterized in that: ε = 0.8.

5. object detection method under the dynamic environment according to claim 1 based on normal direction stream information, it is characterised in that：Institute It states in step 4, the corresponding optimal Label Field of the 2nd frame image of generation is specific as follows：

(1) Given the label set $L = \{0, 1\}$, the observation field $D = \{d_1, d_2, \ldots, d_N\}$ of the 2nd frame image, and the normal flow vector field $V = \{V_1^n, V_2^n, \ldots, V_N^n\}$ corresponding to the 2nd frame image, where $d_i$ denotes the RGB color value at the $i$-th pixel of the 2nd frame image and is a three-dimensional vector, $V_i^n$ denotes the normal flow vector at the $i$-th pixel and is a two-dimensional vector, and $N$ denotes the total number of pixels in the 2nd frame image, the goal is to obtain the optimal label field $f^* = \{f_1^*, f_2^*, \ldots, f_N^*\}$ of the 2nd frame image, where $f_i^*$ denotes the optimal label value assigned to the $i$-th pixel and is either 0 or 1;

(2) Under the MAP-MRF framework, the optimal label field $f^*$ is obtained by solving the following formula:

$$f^* = \arg\min_{f \in F} U(f, V) \qquad (2)$$

where $U(f, V)$ denotes the objective energy function and $F$ denotes the set of all possible label fields $f$ for the 2nd frame image. Solving formula (2) means finding, in the label field set $F$, the label field that minimizes the objective energy function $U(f, V)$, namely $f^*$. The objective energy function $U(f, V)$ is defined as follows:

$$U(f, V) = U_1(f) + U_2(f \mid V) \qquad (3)$$

where $U_1(f)$ denotes the prior clique energy function of the image, i.e., the sum of all clique energies in the image, and $U_2(f \mid V)$ denotes the likelihood energy function, i.e., the sum of the energies required to assign a label value to each pixel in the image. $U_1(f)$ and $U_2(f \mid V)$ are defined as follows:

$U_1(f_i, f_j)$ denotes the discontinuity cost between the features of two adjacent pixels (the $i$-th and $j$-th pixels), i.e., the clique energy; $C_i$ denotes the set of eight-neighborhood pixels of the $i$-th pixel; $U_2(f_i \mid V_i^n)$ denotes the energy required to assign the label value $f_i$ to the $i$-th pixel; $\alpha$ and $\beta$ are zero-mean Gaussian white noise. $U_1(f_i, f_j)$ and $U_2(f_i \mid V_i^n)$ are defined as follows:

where $\mathrm{dist}(d_i, d_j)$ denotes the Euclidean distance between $d_i$ and $d_j$; $\tau$ is a preset distance threshold; $0 < \alpha_1 < \alpha_2$; $\mathrm{Cor}_i$ denotes the coordinates of the $i$-th pixel; $V_i^n$ denotes the normal flow vector at the $i$-th pixel; and $\delta(i)$ is defined as follows:

where num denotes the number of pixels in the eight-neighborhood pixel set of the $i$-th pixel that satisfy the translational half-plane constraint;

(3) To obtain the globally optimal solution, $f^*$ is solved using the graph cut method. Define an undirected graph $G = (V, E)$, where $V$ denotes the set of vertices of the graph, comprising $N$ nodes representing the image pixels, one terminal $s$ representing the background, and one terminal $t$ representing the moving target; $E$ denotes the set of edges of the graph, which are of two types: edges connecting two nodes, whose weights are set to $U_1(f_i, f_j)$, and edges connecting a node to a terminal, whose weights are set to $U_2(f_i \mid V_i^n)$. Finding the minimum-cost cut in the undirected graph partitions the nodes of the grid graph into two sets $S$ and $T$, where all nodes in $S$ are connected to the background terminal $s$ and all nodes in $T$ are connected to the target terminal $t$. Completing the above steps yields the optimal label field of the 2nd frame image, so that the moving target is detected in the 2nd frame image.
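The graph construction and minimum cut in this claim can be illustrated on a toy problem. The sketch below rests on stated assumptions rather than the patent's actual energies: the pairwise term is simplified to a fixed non-negative penalty paid only when two neighboring labels differ, the unary costs are supplied directly as numbers, and the function names `min_cut_labels` and `energy` are hypothetical. The cut is found with a plain Edmonds-Karp max-flow, chosen here only for brevity.

```python
from collections import deque

def energy(labels, unary, pairwise):
    """Total energy: per-pixel label costs plus penalties on differing neighbors."""
    e = sum(unary[i][f] for i, f in enumerate(labels))
    e += sum(w for (i, j), w in pairwise.items() if labels[i] != labels[j])
    return e

def min_cut_labels(unary, pairwise):
    """Minimize `energy` over binary labelings with an s-t minimum cut.

    unary:    list of (cost_if_label_0, cost_if_label_1), one pair per pixel.
    pairwise: dict {(i, j): w_ij} with w_ij >= 0 (simplified smoothness term).
    Returns (labels, cut_value); 0 = background (S side), 1 = target (T side).
    """
    n = len(unary)
    s, t = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i, (c0, c1) in enumerate(unary):
        cap[s][i] += c1  # this edge is cut when pixel i ends on the T side
        cap[i][t] += c0  # this edge is cut when pixel i stays on the S side
    for (i, j), w in pairwise.items():
        cap[i][j] += w
        cap[j][i] += w

    flow = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in range(n + 2):
                if v not in parent and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break  # no path left: `parent` now holds all nodes reachable from s
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

    labels = [0 if i in parent else 1 for i in range(n)]
    return labels, flow

# Toy 4-pixel chain with made-up costs.
unary = [(1, 5), (4, 2), (3, 3), (6, 1)]
pairwise = {(0, 1): 2, (1, 2): 1, (2, 3): 2}
labels, cut_value = min_cut_labels(unary, pairwise)
```

For energies of this simplified form, the value of the minimum cut equals the minimum energy, so `energy(labels, unary, pairwise)` and `cut_value` should agree. On real images a dedicated max-flow library would be preferable to this dense-matrix sketch.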

6. The object detection method in a dynamic environment based on normal flow information according to claim 1, characterized in that: in step 5, the initial label field corresponding to the 3rd frame image is generated as follows:

(1) According to the RGB color values in the color segmentation result map of the 3rd frame image, assign different label values to regions of different colors, thereby generating a region label map;

(2) In the region label map, set to "0" the label value at every pixel whose corresponding value in the optimal label field of the 2nd frame image is "0" (in the optimal label field, background pixels are labeled "0" and target pixels are labeled "1"), thereby generating a pseudo initial label field;

(3) For each non-zero label value in the pseudo initial label field, compute the ratio $\mathrm{Ratio}(\mathrm{label}_j)$, $j = 1, 2, \ldots, m$, of the number of pixels carrying that label value to the number of pixels in the region with the same label in the original region label map, where $\mathrm{label}_j$ denotes the $j$-th non-zero label value and $m$ denotes the number of non-zero label values. If $\mathrm{Ratio}(\mathrm{label}_j) \geq \tau$, where $\tau$ is a threshold, the $j$-th non-zero label value is updated to "1"; otherwise, it is updated to "0". This generates the initial label field corresponding to the 3rd frame image.
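The three sub-steps of this claim can be sketched as follows. This is an illustrative reading only: the function name `propagate_initial_field` is hypothetical, region labels are assumed to be non-zero integers so that "0" can mark removed pixels, and the default `tau=0.5` is a placeholder because the claim does not state a value for $\tau$.

```python
from collections import Counter

def propagate_initial_field(region_map, prev_optimal, tau=0.5):
    """Build the frame-3 initial label field from the frame-2 optimal field.

    region_map:   2D list of non-zero region labels for frame 3.
    prev_optimal: 2D list of 0/1 labels, the optimal label field of frame 2.
    tau:          ratio threshold (placeholder value, not from the source).
    """
    # Pseudo initial field: zero out pixels that were background in frame 2.
    pseudo = [[r if p == 1 else 0 for r, p in zip(r_row, p_row)]
              for r_row, p_row in zip(region_map, prev_optimal)]

    # Pixel counts per label before and after zeroing.
    total = Counter(r for row in region_map for r in row)
    kept = Counter(r for row in pseudo for r in row if r != 0)

    # A surviving label becomes target (1) if enough of its region survived.
    is_target = {r: kept[r] / total[r] >= tau for r in kept}

    return [[1 if is_target.get(r, False) else 0 for r in row] for row in pseudo]
```

Pixels already zeroed in the pseudo initial field stay "0"; only pixels whose region label survived with a sufficient ratio are set to "1".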