CN101916449A - Method for establishing background model based on motion information during image processing - Google Patents

Method for establishing background model based on motion information during image processing Download PDF

Info

Publication number
CN101916449A
CN 201010259541 CN201010259541A CN101916449A
Authority
CN
China
Prior art keywords
pixel
background model
frame
image
energy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201010259541
Other languages
Chinese (zh)
Inventor
张鸣
孙兵
李科
刘允才
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN 201010259541 priority Critical patent/CN101916449A/en
Publication of CN101916449A publication Critical patent/CN101916449A/en
Pending legal-status Critical Current

Abstract

The invention relates to a method for establishing a background model based on motion information in image processing, belonging to the technical field of image processing. The method comprises the following steps: (1) estimating the motion field between adjacent frames: for each pair of adjacent frames of the input video, the invariance of pixel grey level before and after motion is used to estimate the motion field of every pixel of the whole image, and pixels whose speed is higher than a certain threshold are taken as foreground points, yielding an initial foreground segmentation; and (2) estimating the background model from missing data: each frame obtained in step (1) can be viewed as a sample point with part of its data missing from the original background space; the original background space is reconstructed from this series of samples with missing data in the video, and the background model of each frame is then obtained by re-projection.

Description

Method for establishing a background model based on motion information in image processing
Technical field
The present invention relates to a modeling method in the technical field of image processing, and specifically to a method for establishing a background model based on motion information in image processing.
Background technology
With the wide use of video surveillance, the processing of video data has received great attention, for example research on pedestrian detection, tracking, object modeling and pose estimation. Among these techniques, background subtraction is widely applied to extract the foreground part of a video, which usually contains the objects of interest, such as people, vehicles and other articles. In traditional background subtraction, the background model usually has to be given in advance, for example by capturing a video of the background alone and then building the model from the statistical features of each pixel's colour, namely its mean and variance. Such methods have two problems: first, an extra step is needed to build the background model; second, the model is static, i.e. it fails once the background environment changes. Being able to build the background model directly from consecutive video frames that contain both moving foreground and background would therefore be a significant improvement over the traditional methods.
Among prior-art documents, a commonly used background modeling method is the Gaussian mixture model approach proposed by Stauffer and Grimson in "Adaptive background mixture models for real-time tracking", Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1999. Their method can adapt to slow illumination changes and to multi-modal colour distributions caused, for example, by swaying branches or shadows. However, because it does not use the motion information of the foreground, the method suffers from smearing: slowly moving objects leave historical motion traces in the resulting foreground segmentation. A method that can use the motion information of the foreground to build the background model is therefore needed in this field.
Summary of the invention
The objective of the invention is to address the deficiencies of the prior art by proposing a method for establishing a background model based on motion information in image processing. The invention needs no extra background-acquisition step: motion information is used to obtain a preliminary estimate of the foreground part of the image, the incomplete backgrounds that remain in multiple frames after removing the foreground are used to rebuild a background model for each frame, and background subtraction finally yields an accurate foreground segmentation. The method has wide practical application in object and pedestrian tracking, motion analysis and related areas.
The present invention is achieved by the following technical solutions:
For each pair of adjacent frames of the input video, the invention estimates the motion field (optical flow) of every pixel by means of a conditional random field. Pixels whose speed is greater than a certain threshold are then taken as foreground points, so that an initial foreground segmentation is obtained for each frame of the input sequence. If the background is regarded as a space that varies in a low dimension, each initially segmented image can be viewed as a sample point with missing data, the missing part being exactly the region occluded by the foreground. From this series of samples with missing data, the method reconstructs the low-dimensional background space using principal component analysis under missing data. Re-projecting the partially missing background of each frame into this low-dimensional space recovers the background of the current frame.
The invention comprises the following steps:
1. Estimation of the motion field between adjacent frames: for each pair of adjacent frames of the input video, the invariance of pixel grey level before and after motion is used to estimate the motion field of every pixel in the whole image. Pixels whose speed is greater than a certain threshold are taken as foreground points, yielding an initial foreground segmentation.
2. Building the background model from missing data: each frame obtained in step 1 can be regarded as a sample point with part of its data missing from the original background space. The original background space is reconstructed from this series of samples with missing data in the video, and the background model of each frame is then obtained by re-projection.
Compared with the prior art, the proposed method needs no extra background-acquisition step, can adapt to linear changes of the background, has real practical value for extracting the foreground by background subtraction, and outperforms the original methods.
Description of drawings
Fig. 1 shows two adjacent frames of the input video.
Fig. 2 shows the magnitude of the motion field obtained from the adjacent frames in Fig. 1.
Fig. 3 shows the segmentation result obtained by marking points with non-zero speed as foreground points, according to the magnitude of the motion field.
Fig. 4 shows the background model obtained from a series of incomplete background images.
Embodiment
Embodiments of the invention are described in detail below with reference to the drawings. The following embodiments are implemented on the premise of the technical solution of the invention and give detailed implementations and procedures, but the scope of protection of the invention is not limited to them.
Embodiment
1. Estimation of the motion field between adjacent frames. Fig. 1 shows two frames of the input video in the embodiment. The invariance of pixel grey level means that, for the object point corresponding to a pixel, the grey level of the point can be assumed to remain unchanged when it moves from the first frame to the second frame, because the intervals in time and space are both very short. Under this condition, a conditional random field can be used to solve for the motion vector corresponding to each pixel.
The motion field of the k-th frame is denoted v_k. Assume that the motion vector v_k(x) at pixel x is related only to the motion vectors v_k(y) of its neighbouring pixels y and to the two adjacent frames g_{k-1}, g_k, that is:
p[v_k(x) | v_k(y), g_{k-1}, g_k, y ≠ x] = p[v_k(x) | v_k(y), g_{k-1}, g_k, y ∈ N(x)]   (1)
where N(x) denotes the set of pixels neighbouring x.
Under this premise, solving for the maximum a posteriori probability can be converted into an energy-minimization problem. The energy of the system is defined as:
E(v_k | g_k) = Σ_x E_data[v_k(x) | g_{k-1}, g_k] + Σ_{x,y∈N} E_smooth[v_k(x), v_k(y) | g_{k-1}, g_k]   (2)
The two energy terms are defined as follows. E_data denotes the energy of assigning the current pixel the motion vector v_k(x) and measures the grey-level invariance of the object pixel under that motion. When a point moves from position x_{k-1} by v_k(x) to position x_k, the smaller the grey-level change, the smaller this energy term, and vice versa. The term is defined as follows:
E_data[v_k(x) | g_{k-1}, g_k] = λ · min{|g_{k-1}[x - v_k(x)] - g_k(x)|, α}   (3)
A truncation at α is added here so that the energy no longer increases once the grey-level change exceeds α; this handles occlusion between adjacent frames. λ is a scaling factor that adjusts the magnitude of the energy.
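For illustration only, formula (3) can be evaluated per pixel as in the following NumPy sketch; the nearest-neighbour lookup of g_{k-1}[x - v_k(x)] and the default values of λ and α are assumptions made for the example, not values fixed by the embodiment.

```python
import numpy as np

def e_data(g_prev, g_cur, v, lam=1.0, alpha=30.0):
    """Truncated grey-constancy cost of formula (3), per pixel.

    g_prev, g_cur : (H, W) greyscale frames g_{k-1}, g_k.
    v             : (H, W, 2) candidate motion field v_k, as (dy, dx) per pixel.
    Returns an (H, W) array: lam * min(|g_{k-1}[x - v_k(x)] - g_k(x)|, alpha).
    """
    g_prev = np.asarray(g_prev, dtype=float)
    g_cur = np.asarray(g_cur, dtype=float)
    H, W = g_cur.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Nearest-neighbour lookup of the source position x - v_k(x) in the previous frame.
    src_y = np.clip(np.rint(ys - v[..., 0]).astype(int), 0, H - 1)
    src_x = np.clip(np.rint(xs - v[..., 1]).astype(int), 0, W - 1)
    diff = np.abs(g_prev[src_y, src_x] - g_cur)
    return lam * np.minimum(diff, alpha)   # truncation at alpha handles occlusions
```

Summed over all pixels, the returned cost map gives the first term of formula (2) for the candidate field v.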
The other energy term, E_smooth, denotes the energy when the motion vectors of the neighbouring pixels x, y are v_k(x) and v_k(y) respectively; it measures the motion continuity between neighbouring pixels and is defined as follows:
E_smooth[v_k(x), v_k(y) | g_{k-1}, g_k] = d(g_k, x, y) · min[|v_k(x) - v_k(y)|, β]   (4)
Here β is a truncation threshold similar to α in formula (3). d(g_k, x, y) is a function describing the feature change of the image g_k at pixels x and y: its value decreases when the image changes sharply at x, y, and increases when the change is gentle. The reason for this treatment is that motion continuity between neighbouring pixels does not hold at object edges, and the image change at object edges is generally large, so the energy term is reduced there. The following function is chosen:
d(g_k, x, y) = 1 if Δ(x) < 0.1 and Δ(y) < 0.1, and 0.05 otherwise   (5)
where Δ(x) and Δ(y) are the Laplacians of the image at x and y.
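A corresponding sketch of formulas (4) and (5) is given below. The 4-neighbour Laplacian filter, the assumption that the frame is scaled to [0, 1] so that the 0.1 threshold applies, and the use of the L1 norm for |v_k(x) - v_k(y)| are choices made for the illustration only.

```python
import numpy as np
from scipy.ndimage import laplace

def d_weight(g, x, y, threshold=0.1):
    """Edge-aware weight of formula (5) for the neighbouring pixel pair (x, y).

    g is assumed to be a greyscale frame scaled to [0, 1]; x and y are (row, col)
    tuples. Returns 1 where the image is flat at both pixels, 0.05 otherwise.
    """
    lap = np.abs(laplace(g))     # |Laplacian| as the measure of local feature change
    return 1.0 if (lap[x] < threshold and lap[y] < threshold) else 0.05

def e_smooth(g, x, y, v_x, v_y, beta=3.0):
    """Truncated smoothness cost of formula (4) for neighbouring pixels x and y.

    v_x, v_y are the candidate motion vectors at the two pixels; the vector
    difference is measured here with the L1 norm, one possible reading of |.|.
    """
    diff = np.abs(np.asarray(v_x, dtype=float) - np.asarray(v_y, dtype=float)).sum()
    return d_weight(g, x, y) * min(diff, beta)
```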
Finally, the value of the motion field between adjacent frames that minimizes the energy defined above is computed by the belief propagation algorithm; this value is the maximum a posteriori estimate of the motion field. For the two adjacent frames given in Fig. 1 of the embodiment, the magnitude of the estimated motion field is shown in Fig. 2. Brighter points indicate higher motion speed; the swinging arm of the person is visibly the fastest-moving part.
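The embodiment minimizes this energy with belief propagation. As a much simpler stand-in that shows how the data and smoothness terms of formulas (3)-(5) interact, the following sketch uses iterated conditional modes (ICM) over a small set of integer displacements; it is a didactic substitute for the belief propagation solver, and the candidate range, iteration count and parameter defaults are assumptions.

```python
import numpy as np
from scipy.ndimage import laplace

def estimate_motion_field_icm(g_prev, g_cur, radius=2, lam=1.0, alpha=30.0,
                              beta=3.0, n_iters=5):
    """Toy motion-field estimator: ICM over integer displacements in [-radius, radius]^2.

    A didactic stand-in for the belief-propagation solver of the embodiment.
    Returns an (H, W, 2) array of (dy, dx) displacements.
    """
    g_prev = np.asarray(g_prev, dtype=float)
    g_cur = np.asarray(g_cur, dtype=float)
    H, W = g_cur.shape
    labels = [(dy, dx) for dy in range(-radius, radius + 1)
                       for dx in range(-radius, radius + 1)]
    vecs = np.array(labels, dtype=float)            # (L, 2) candidate displacements

    # Data cost of every candidate displacement at every pixel (formula (3)).
    ys, xs = np.mgrid[0:H, 0:W]
    data = np.empty((H, W, len(labels)))
    for l, (dy, dx) in enumerate(labels):
        sy = np.clip(ys - dy, 0, H - 1)
        sx = np.clip(xs - dx, 0, W - 1)
        data[..., l] = lam * np.minimum(np.abs(g_prev[sy, sx] - g_cur), alpha)

    # Edge-aware weight of formula (5), evaluated per pixel.
    w = np.where(np.abs(laplace(g_cur)) < 0.1, 1.0, 0.05)
    label_idx = data.argmin(axis=-1)                # initialise with the best data term

    for _ in range(n_iters):
        for y in range(H):
            for x in range(W):
                cost = data[y, x].copy()
                # Smoothness against the current labels of the 4 neighbours (formula (4)).
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < H and 0 <= nx < W:
                        pair_w = min(w[y, x], w[ny, nx])
                        diff = np.abs(vecs - vecs[label_idx[ny, nx]]).sum(axis=1)
                        cost += pair_w * np.minimum(diff, beta)
                label_idx[y, x] = cost.argmin()

    return vecs[label_idx]                          # (H, W, 2) array of (dy, dx)
```

ICM only converges to a local minimum of the energy, so it is used here purely to keep the example short; it is not the solver of the embodiment.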
On this motion field, points whose speed is greater than zero (i.e. all moving points) are set as foreground points; the resulting segmentation is shown in Fig. 3. Although this preliminary result is not an accurate segmentation of the foreground object, it effectively removes the moving parts, and the remaining part can be regarded as an incomplete background.
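Given the estimated motion field, the preliminary segmentation and the corresponding incomplete background can be formed as sketched below; marking occluded pixels with NaN is merely one convenient convention for the missing-data step of section 2, not something prescribed by the embodiment.

```python
import numpy as np

def incomplete_background(frame, motion_field, speed_threshold=0.0):
    """Mask out moving pixels, leaving an incomplete background image.

    frame        : (H, W) greyscale frame.
    motion_field : (H, W, 2) per-pixel motion vectors (dy, dx).
    Pixels whose speed exceeds speed_threshold are treated as foreground and
    replaced by NaN, so that only the visible background remains.
    """
    speed = np.linalg.norm(motion_field, axis=-1)
    background = np.asarray(frame, dtype=float).copy()
    background[speed > speed_threshold] = np.nan
    return background
```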
2. Building the background model from missing data
Principal component analysis (PCA) is a common method of image analysis. For one frame, its pixel grey values can be serialized into one large vector. The main use of PCA is dimensionality reduction: an n-dimensional vector x can be projected to a k-dimensional vector y by a matrix W:
y = W^T · x   (6)
where W is an n × k matrix. The W for which the k-dimensional projection y best approximates x gives the principal components.
For background modeling, although the number of pixels is enormous, there is strong correlation between pixel grey levels; that is, the background varies within a very low-dimensional space. For an image M of size m × n, each row is regarded as an n-dimensional vector. Therefore, when projected onto the principal components, each value Y_i is an m-dimensional vector:
Y = M · W   (7)
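For complete data, the projection of formulas (6) and (7) is ordinary principal component analysis. The following minimal sketch, in which the frames are stacked as the rows of M and the mean row M_0 is subtracted before projecting (matching the use of the image mean M_0 in formula (8)), is one way to compute it; the number of components d is an assumption for the example.

```python
import numpy as np

def pca_project(M, d=5):
    """Project the rows of M onto the first d principal components.

    M : (m, n) matrix whose m rows are serialized frames (n pixels each).
    Returns the mean row M0, the components W (n x d), and the scores Y = (M - M0) W.
    """
    M = np.asarray(M, dtype=float)
    M0 = M.mean(axis=0)
    # Columns of W are the d leading right singular vectors of the centred data.
    _, _, Vt = np.linalg.svd(M - M0, full_matrices=False)
    W = Vt[:d].T                     # (n, d), one principal direction per column
    Y = (M - M0) @ W                 # (m, d): each column Y_i is an m-dimensional vector
    return M0, W, Y
```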
For the series of incomplete background images obtained in step 1, this principal component analysis problem can be solved by the expectation-maximization algorithm; the result for the embodiment is shown in Fig. 4. The resulting image mean M_0 and principal components C_i (i = 1, 2, ..., d) can then be used for background modeling. With P_i denoting the projection of the original image onto the i-th principal component, the background can be rebuilt as:
M_0 + Σ_{i=1}^{d} P_i · C_i   (8)
which is the background model obtained.
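When foreground pixels are missing, the same problem can be solved, as the description states, with an expectation-maximization style iteration. The sketch below alternates between filling the missing entries with the current low-rank reconstruction and re-fitting the mean and principal components; it is a simplified illustration of missing-data PCA rather than the exact algorithm of the embodiment, it assumes every pixel is visible in at least one frame, and the rank d and iteration count are arbitrary defaults.

```python
import numpy as np

def background_models_from_incomplete(frames_with_nan, d=5, n_iters=20):
    """Rebuild per-frame backgrounds from incomplete frames via EM-style PCA.

    frames_with_nan : (m, n) matrix; each row is a serialized frame with NaN
                      where the preliminary foreground occluded the background.
    Returns an (m, n) matrix whose rows are the reconstructed backgrounds
    M0 + sum_i P_i * C_i of formula (8).
    """
    M = np.array(frames_with_nan, dtype=float)
    missing = np.isnan(M)
    # Initialisation: fill missing entries with the per-pixel mean over the frames
    # where that pixel is visible (assumes each pixel is visible at least once).
    col_mean = np.nanmean(M, axis=0)
    M_filled = np.where(missing, col_mean, M)

    for _ in range(n_iters):
        # M-step: fit the mean and d principal components to the current completion.
        M0 = M_filled.mean(axis=0)
        _, _, Vt = np.linalg.svd(M_filled - M0, full_matrices=False)
        C = Vt[:d]                               # (d, n) principal components C_i
        P = (M_filled - M0) @ C.T                # (m, d) projections P_i
        recon = M0 + P @ C                       # formula (8) for every frame
        # E-step: replace only the missing (occluded) entries with the reconstruction.
        M_filled = np.where(missing, recon, M)

    return recon
```

Each row of the returned matrix, reshaped back to the image size, is the background model of the corresponding frame.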

Claims (5)

1. A method for establishing a background model based on motion information in image processing, characterized in that it comprises the following steps:
Step 1, estimation of the motion field between adjacent frames: for each pair of adjacent frames of the input video, the invariance of pixel grey level before and after motion is used to estimate the motion field of every pixel of the whole image;
pixels whose speed is greater than a certain threshold are taken as foreground points, yielding an initial foreground segmentation;
Step 2, establishing the background model from missing data: each frame obtained in step 1 can be regarded as a sample point with part of its data missing from the original background space;
the original background space is reconstructed from this series of samples with missing data in the video;
the background model of each frame is then obtained by re-projection.
2. The method for establishing a background model based on motion information in image processing according to claim 1, characterized in that the invariance of pixel grey level before and after motion means that, for the object point corresponding to a pixel, the grey level of the point is assumed to remain unchanged when it moves from the first frame to the second frame, because the intervals in time and space are both very short; under this condition, a conditional random field is used to solve for the motion vector corresponding to each pixel.
3. The method for establishing a background model based on motion information in image processing according to claim 1, characterized in that the motion vector is obtained by the following method:
the motion field of the k-th frame is denoted v_k; the motion vector v_k(x) at pixel x is assumed to be related only to the motion vectors v_k(y) of its neighbouring pixels y and to the two adjacent frames g_{k-1}, g_k, that is:
p[v_k(x) | v_k(y), g_{k-1}, g_k, y ≠ x] = p[v_k(x) | v_k(y), g_{k-1}, g_k, y ∈ N(x)]
where N(x) denotes the set of pixels neighbouring x;
under this premise, solving for the maximum a posteriori probability can be converted into an energy-minimization problem:
the energy of the system is defined as:
E(v_k | g_k) = Σ_x E_data[v_k(x) | g_{k-1}, g_k] + Σ_{x,y∈N} E_smooth[v_k(x), v_k(y) | g_{k-1}, g_k]
where the two energy terms are defined as follows: E_data denotes the energy of assigning the current pixel the motion vector v_k(x) and measures the grey-level invariance of the object pixel under that motion;
when a point moves from position x_{k-1} by v_k(x) to position x_k, the smaller the grey-level change, the smaller this energy term, and vice versa.
4. The method for establishing a background model based on motion information in image processing according to claim 1, characterized in that the energy term is defined as follows:
E_data[v_k(x) | g_{k-1}, g_k] = λ · min{|g_{k-1}[x - v_k(x)] - g_k(x)|, α}   (3)
a truncation at α is added here so that the energy no longer increases once the grey-level change exceeds α, which handles occlusion between adjacent frames; λ is a scaling factor that adjusts the magnitude of the energy;
the other energy term, E_smooth, denotes the energy when the motion vectors of the neighbouring pixels x, y are v_k(x) and v_k(y) respectively, measures the motion continuity between neighbouring pixels, and is defined as follows:
E_smooth[v_k(x), v_k(y) | g_{k-1}, g_k] = d(g_k, x, y) · min[|v_k(x) - v_k(y)|, β]
here β is a truncation threshold similar to α in formula (3); d(g_k, x, y) is a function describing the feature change of the image g_k at pixels x and y: its value decreases when the image changes sharply at x, y, and increases when the change is gentle; the energy term is therefore reduced at such places, and the following function is chosen:
d(g_k, x, y) = 1 if Δ(x) < 0.1 and Δ(y) < 0.1, and 0.05 otherwise
where Δ(x) and Δ(y) are the Laplacians of the image at x and y;
finally, the value of the motion field between adjacent frames that minimizes the energy defined above is computed by the belief propagation method.
5. The method for establishing a background model based on motion information in image processing according to claim 1, characterized in that step 2 specifically means: for one frame, its pixel grey values can be serialized into one large vector; an n-dimensional vector x is projected to a k-dimensional vector y by a matrix W:
y = W^T · x
where W is an n × k matrix; the W for which the k-dimensional projection y best approximates x gives the principal components;
correlation exists between pixel grey levels; for an image M of size m × n, each row is regarded as an n-dimensional vector, so that when projected onto the principal components each value Y_i is an m-dimensional vector:
Y = M · W
for the series of incomplete background images obtained in step 1, these principal components can be solved for; the resulting image mean M_0 and principal components C_i (i = 1, 2, ..., d) can be used for background modeling; with P_i denoting the projection of the original image onto the i-th principal component, the background can be rebuilt as:
M_0 + Σ_{i=1}^{d} P_i · C_i
which is the background model obtained.
CN 201010259541 2010-08-21 2010-08-21 Method for establishing background model based on motion information during image processing Pending CN101916449A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010259541 CN101916449A (en) 2010-08-21 2010-08-21 Method for establishing background model based on motion information during image processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010259541 CN101916449A (en) 2010-08-21 2010-08-21 Method for establishing background model based on motion information during image processing

Publications (1)

Publication Number Publication Date
CN101916449A true CN101916449A (en) 2010-12-15

Family

ID=43323953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010259541 Pending CN101916449A (en) 2010-08-21 2010-08-21 Method for establishing background model based on motion information during image processing

Country Status (1)

Country Link
CN (1) CN101916449A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629383A (en) * 2012-02-28 2012-08-08 湖南大学 Motion object detection method based on random strategy
CN102833464A (en) * 2012-07-24 2012-12-19 常州泰宇信息科技有限公司 Method for structurally reconstructing background for intelligent video monitoring
CN104135612A (en) * 2014-07-11 2014-11-05 深圳市中兴移动通信有限公司 A shooting method and a shooting device with an adjustable location of a shot object
CN104680124A (en) * 2013-11-28 2015-06-03 现代摩比斯株式会社 Device And Method For Detecting Pedestrains
WO2015117464A1 (en) * 2014-08-20 2015-08-13 中兴通讯股份有限公司 Device and method for processing video image
CN107968917A (en) * 2017-12-05 2018-04-27 广东欧珀移动通信有限公司 Image processing method and device, computer equipment, computer-readable recording medium
CN111080674A (en) * 2019-12-18 2020-04-28 上海无线电设备研究所 Multi-target ISAR key point extraction method based on Gaussian mixture model
CN113111842A (en) * 2021-04-26 2021-07-13 浙江商汤科技开发有限公司 Action recognition method, device, equipment and computer readable storage medium
CN114240788A (en) * 2021-12-21 2022-03-25 西南石油大学 Robustness and self-adaptability background restoration method for complex scene

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1471694A (en) * 2001-06-27 2004-01-28 ���ṫ˾ Image processing apparatus and method, and image pickup apparatus
US6766037B1 (en) * 1998-10-02 2004-07-20 Canon Kabushiki Kaisha Segmenting moving objects and determining their motion
US20050104964A1 (en) * 2001-10-22 2005-05-19 Bovyrin Alexandr V. Method and apparatus for background segmentation based on motion localization

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6766037B1 (en) * 1998-10-02 2004-07-20 Canon Kabushiki Kaisha Segmenting moving objects and determining their motion
CN1471694A (en) * 2001-06-27 2004-01-28 ���ṫ˾ Image processing apparatus and method, and image pickup apparatus
US20050104964A1 (en) * 2001-10-22 2005-05-19 Bovyrin Alexandr V. Method and apparatus for background segmentation based on motion localization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ming Zhang et al., "Motion-based background subtraction", Optical Engineering, vol. 8, no. 12, 2009-12-31, pages 127004-1 to 127004-5; relevant to claims 1-5; category 2 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102629383A (en) * 2012-02-28 2012-08-08 湖南大学 Motion object detection method based on random strategy
CN102833464A (en) * 2012-07-24 2012-12-19 常州泰宇信息科技有限公司 Method for structurally reconstructing background for intelligent video monitoring
CN102833464B (en) * 2012-07-24 2015-06-17 常州展华机器人有限公司 Method for structurally reconstructing background for intelligent video monitoring
CN104680124A (en) * 2013-11-28 2015-06-03 现代摩比斯株式会社 Device And Method For Detecting Pedestrains
CN104135612A (en) * 2014-07-11 2014-11-05 深圳市中兴移动通信有限公司 A shooting method and a shooting device with an adjustable location of a shot object
WO2015117464A1 (en) * 2014-08-20 2015-08-13 中兴通讯股份有限公司 Device and method for processing video image
CN107968917A (en) * 2017-12-05 2018-04-27 广东欧珀移动通信有限公司 Image processing method and device, computer equipment, computer-readable recording medium
CN107968917B (en) * 2017-12-05 2019-10-18 Oppo广东移动通信有限公司 Image processing method and device, computer equipment, computer readable storage medium
CN111080674A (en) * 2019-12-18 2020-04-28 上海无线电设备研究所 Multi-target ISAR key point extraction method based on Gaussian mixture model
CN111080674B (en) * 2019-12-18 2023-11-14 上海无线电设备研究所 Multi-target ISAR key point extraction method based on Gaussian mixture model
CN113111842A (en) * 2021-04-26 2021-07-13 浙江商汤科技开发有限公司 Action recognition method, device, equipment and computer readable storage medium
CN114240788A (en) * 2021-12-21 2022-03-25 西南石油大学 Robustness and self-adaptability background restoration method for complex scene
CN114240788B (en) * 2021-12-21 2023-09-08 西南石油大学 Complex scene-oriented robustness and adaptive background restoration method

Similar Documents

Publication Publication Date Title
CN101916449A (en) Method for establishing background model based on motion information during image processing
Han et al. Density-based multifeature background subtraction with support vector machine
CN106846359A (en) Moving target method for quick based on video sequence
Yi et al. Moving object detection based on running average background and temporal difference
Tavakkoli et al. Non-parametric statistical background modeling for efficient foreground region detection
WO2022027931A1 (en) Video image-based foreground detection method for vehicle in motion
CN105261037A (en) Moving object detection method capable of automatically adapting to complex scenes
CN105427626A (en) Vehicle flow statistics method based on video analysis
CN103077539A (en) Moving object tracking method under complicated background and sheltering condition
CN103530893A (en) Foreground detection method in camera shake scene based on background subtraction and motion information
CN101493944A (en) Moving target detecting and tracking method and system
CN102156995A (en) Video movement foreground dividing method in moving camera
CN105205791A (en) Gaussian-mixture-model-based video raindrop removing method and system
Yoshinaga et al. Background model based on intensity change similarity among pixels
Mayr et al. Self-supervised learning of the drivable area for autonomous vehicles
CN113763427A (en) Multi-target tracking method based on coarse-fine shielding processing
Roy et al. A comprehensive survey on computer vision based approaches for moving object detection
Devi et al. A survey on different background subtraction method for moving object detection
CN105139358A (en) Video raindrop removing method and system based on combination of morphology and fuzzy C clustering
Hsieh et al. Grid-based template matching for people counting
Chen et al. Real-time robust hand tracking based on camshift and motion velocity
CN114821441A (en) Deep learning-based airport scene moving target identification method combined with ADS-B information
Valiere et al. Robust vehicle counting with severe shadows and occlusions
Nguyen et al. M2gan: A multi-stage self-attention network for image rain removal on autonomous vehicles
Dan et al. A multi-object motion-tracking method for video surveillance

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20101215