CN101645171A - Background modeling method (method of segmenting video moving objects) based on space-time video blocks and online subspace learning
- Publication number
- CN101645171A (application CN200910177528A)
- Authority
- CN
- China
- Prior art keywords
- video
- background
- space
- background modeling
- moving object
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
Abstract
The invention relates to the field of video, in particular to video content analysis and object detection. The purpose of the invention is to solve the problem that moving object segmentation in video surveillance applications is easily affected by illumination changes, such as abrupt changes of sunlight in the daytime or automobile headlights at night, for which traditional methods generate a great number of false alarms. Two key technologies are employed to this end. The first is to take the space-time video block as the basic processing unit, so that spatial appearance information and temporal motion information are used simultaneously for background modeling and foreground detection and segmentation. The second is to capture the background model effectively by means of an online subspace learning method. The method can be used in any video content processing and analysis system that requires background modeling and foreground detection, such as a video surveillance system.
Description
Technical field
The present invention relates to the field of video, and specifically to video content analysis and object detection. The objective of the invention is to address the problem that moving target segmentation in video surveillance applications is highly sensitive to illumination variation, such as sudden changes of sunlight in the daytime or car headlights at night, for which classic methods produce a large number of false alarms. The present invention addresses this problem well.
Background technology
Background modeling for video captured by a camera with a fixed viewing angle is a technique that builds a mathematical model of the static background in a consecutive image sequence by means of mathematical models and algorithms. Using this background model, the image regions of moving targets in the video sequence can be segmented from the background automatically. This technique can be used in many applications such as intelligent video analysis, video coding, and human-computer interaction.
By assuming that the background is completely static, the simplest background modeling algorithms distinguish moving targets from the background by computing differences between two or more frames: regions where the inter-frame difference is small are background, and regions where it is large are foreground. This approach has the advantage of simplicity, and the experimental results are good under ideal conditions. In practice, however, surveillance scenes often contain complex backgrounds, and for many scenes an ideal, pure background image cannot be obtained at all. Adding in the noise generated by the camera imaging process and the variations introduced by automatic gain control and white balance, this class of moving target detection methods often performs poorly and is difficult to put into practice.
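The frame-difference scheme described above can be sketched in a few lines (an illustrative toy, not part of the invention; the threshold value of 25 is an arbitrary choice):

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Label pixels whose absolute inter-frame difference exceeds a
    threshold as foreground (moving target); the rest are background."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold  # True = foreground candidate

# toy example: a small bright square appears between two frames
prev = np.zeros((8, 8), dtype=np.uint8)
curr = np.zeros((8, 8), dtype=np.uint8)
curr[2:4, 2:4] = 200
mask = frame_difference_mask(prev, curr)
```

As the surrounding text notes, this breaks down as soon as the "background" pixels themselves change (noise, gain control, illumination), since every changed pixel is flagged.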
The current mainstream background modeling methods are the Gaussian mixture model (GMM) method and its variants. The multi-Gaussian background modeling method uses pixel color features of the video sequence. It mainly considers the statistical distribution of background and foreground pixel values, describing this distribution with several Gaussian models that together constitute the background model; a decision rule is then applied to classify each current pixel as a background pixel or a moving target pixel, i.e., to perform foreground segmentation. After background and foreground have been separated, the statistical distribution parameters of the pixel values are updated according to certain rules: distributions corresponding to the foreground are usually updated slowly, while distributions corresponding to the background are updated at the normal rate. The paper "Adaptive background mixture models for real-time tracking", published at the 1998 Conference on Computer Vision and Pattern Recognition, is the typical representative of this method. The method effectively overcomes the sensitivity of frame-difference methods to noise, and can absorb motion that is subjectively unimportant, such as swaying vegetation, into the background. At the same time, methods of this class do not require an ideal background image in advance and are therefore practical. However, because in many situations a Gaussian distribution describes the gray-level distribution of non-moving-target pixels inaccurately, this method mis-segments under illumination variation, shadows, and similar conditions. Moreover, since the method models the background as a set of independent per-pixel processes, it cannot handle global changes in the video scene.
Recently, methods that use neighborhood information around each pixel to improve the background modeling result have become a trend, including spatial neighborhood information (e.g., block-based methods such as the Local Binary Pattern (LBP) histogram feature) and temporal neighborhood information (e.g., optical-flow-based methods). Background modeling methods based on local neighborhood information follow the same idea as the pixel-color-based modeling methods described above, except that the color feature is replaced by a local neighborhood statistic. The local neighborhood statistic takes into account the information in a small neighborhood of the current pixel and is little affected by random brightness variation at that pixel, making it more robust than color information; however, for smooth regions in the image (such as walls), regions with heavy noise, and drastic illumination variation, it is prone to misjudgment and produces many misses and false alarms.
Intuitively, spatial neighborhood information and temporal neighborhood information complement each other. For example, in a dimly lit or low-contrast environment, the motion of the foreground provides the main information for our visual perception; when there is large illumination variation in the scene, the appearance of the foreground provides the main visual perception information. Some researchers have also used spatio-temporal information for background modeling, but the background model they establish is based on the responses of spatio-temporal differential filters centered at each pixel of the surveillance video sequence. Moreover, the main problem they address is modeling scenes that contain consistently moving background, such as leaves swaying in the wind, and their methods do not work well under drastic illumination variation in the scene.
In light of this situation, to model the background of a complex surveillance scene accurately and distinguish moving target pixels successfully, the spatio-temporal neighborhood information around each pixel must be fully taken into account, and real-time performance and generality must be ensured in the algorithm design.
Summary of the invention
The primary purpose of the present invention is to solve the background modeling problem for outdoor scenes at night. In a night outdoor scene, factors such as dim illumination, low signal-to-noise ratio, low contrast, and drastic illumination variation all make background modeling difficult. The invention provides a novel background modeling method based on space-time video blocks. Unlike background modeling methods based on spatial blocks or on optical flow, this method uses spatial appearance information and temporal motion information simultaneously to improve modeling performance. In the new method, the space-time video block is the basic processing unit. Based on space-time video blocks, the background model is learned by an online subspace learning method. Based on the learned background model, subsequent space-time video blocks are classified as background blocks or foreground blocks; at the same time, the background model is updated with the background blocks according to the classification result. The method of the invention is not limited to night outdoor scenes and is equally applicable to other scenes, such as daytime scenes; it is a universally applicable background modeling method.
Problem formalization. For an input video stream, as each new frame I_n (n = 1, 2, ...) of size W x H arrives, we partition it into N = (W x H)/(h x h) image blocks {P_{i,n}}_{i=1}^{N}, where i is the index of the image block and h is the width and height of an image block (we assume W and H are divisible by h). For each image block P_{i,n}, we combine it with the corresponding blocks of the preceding t - 1 frames (e.g., t = 5) to form a video block of size h x h x t (as shown in Figure 2). As the video stream is read in, we thus obtain a set of video block sequences {B_i}_{i=1}^{N}, with B_i = {B_{i,1}, B_{i,2}, ..., B_{i,n}, ...}.
For the i-th video block sequence, there are mainly two kinds of variation: illumination variation and foreground occlusion. Video blocks that undergo only illumination variation (including normal background video blocks without significant change) are background video blocks; they lie in a low-dimensional subspace S, which can be learned by an online subspace learning method (e.g., the CCIPCA algorithm). Based on this low-dimensional background subspace, we can classify blocks as background or foreground, and use the background blocks to update the background subspace in real time. For convenience of representation and computation, each 3-dimensional video block B_{i,n} is converted into an easy-to-handle vector representation, namely a D-dimensional vector x_{i,n} with D = h x h x t (with the mean removed). Since we establish and maintain a background subspace for every video block sequence in the same way, for simplicity of notation we drop the subscript index i in the algorithm description below, so a video block sequence can be written as B = {x_1, x_2, ..., x_n, ...}, where x_n is the n-th video block vector.
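The partitioning and vectorization just formalized can be sketched as follows (an illustrative sketch; the function name and the use of NumPy are our own choices, not part of the filing):

```python
import numpy as np

def extract_st_block_vectors(frames, h):
    """Partition a buffer of the last t frames (a t x H x W array) into
    non-overlapping h x h spatial blocks, stack each across time into an
    h x h x t space-time video block, and return mean-removed
    D-dimensional vectors x_{i,n} with D = h * h * t."""
    t, H, W = frames.shape
    assert H % h == 0 and W % h == 0, "W and H must be divisible by h"
    vecs = []
    for y in range(0, H, h):
        for x in range(0, W, h):
            block = frames[:, y:y + h, x:x + h]   # shape (t, h, h)
            v = block.reshape(-1).astype(np.float64)
            vecs.append(v - v.mean())             # remove the mean
    return np.array(vecs)                          # shape (N, D), N = (W*H)/(h*h)

frames = np.random.rand(5, 16, 16)   # t = 5 frames of a 16 x 16 video
X = extract_st_block_vectors(frames, h=4)
```

With t = 5 and h = 4 this yields N = 16 vectors of dimension D = 80, matching the block sizes used in the figures.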
The algorithm of the present invention runs at about 15 frames per second on a standard PC (Intel Pentium 2.8 GHz with 1 GB RAM). With some small improvement strategies, such as updating only part of the background model in turn with each new frame, the frame rate can be raised above 40 frames per second with very little effect on the experimental results.
The invention provides a background modeling method for video sequences, comprising the following steps:
1. Model initialization (optional)
2. Background model matching and moving object detection
3. Background model update
Description of drawings
Fig. 1: the elementary units used by background modeling methods at three levels.
Fig. 2: subspace distribution analysis of space-time video blocks. (a) and (b) show the eigenvalue distribution curves of the background subspace and the foreground subspace; it is easy to see that background video blocks (including video blocks with illumination variation) indeed lie in a low-dimensional subspace (3-4 dimensions), while video blocks occluded by foreground are distributed in a high-dimensional space. (c), (d), (e) and (f) show the reconstruction error curves of video blocks on different subspaces and the corresponding ROC curves. The curves in (c) use a background subspace trained on pure background video blocks, while the curves in (d) use a background subspace trained on video blocks containing only illumination variation. Red represents normal background video blocks, green represents video blocks with illumination variation, and blue represents video blocks occluded by foreground targets. It can be seen that normal background video blocks and video blocks with illumination variation lie in the same subspace, while video blocks with foreground occlusion lie in a different subspace and can be well distinguished.
Fig. 3: flowchart of the algorithm.
Fig. 4: comparison of results without temporal information (4 x 4 x 1, second column) and with it (4 x 4 x 5, third column): it is easy to see that the results at the spatial-block level contain serious misses, because the video scene has low contrast and the spatial blocks lack the temporal motion information present in the space-time blocks.
Fig. 5: comparison with two commonly used classical methods: GMM performs worst when the illumination variation in the scene is large. Although LBP can overcome the influence of illumination variation to a certain extent, when the illumination variation is very drastic it still produces large-area headlight false alarms on the road surface. The proposed method performs well even under drastic headlight illumination variation.
Fig. 6: comparison on another extreme illumination scene: in the results of the LBP method (second column), many misses and false alarms coexist, while our method (third column) not only tolerates drastic illumination variation but also remains sensitive to changes produced by foreground motion when the scene contrast is low.
Embodiment
The specific implementation of the present invention is as follows:
1. Model initialization:
For a video block sequence B = {x_1, x_2, ..., x_n, ...}, we first run a traditional batch principal component analysis (PCA) algorithm on a number of leading video blocks (e.g., the first 200) to obtain a low-dimensional subspace (e.g., d = 8 dimensions). This serves as the initial subspace for the online subspace learning updates, after which normal background maintenance and foreground detection proceed. This initialization is optional: if computational resources do not permit, for example in embedded systems with little memory, the initialization can be skipped and the model update and detection can start directly. Without initialization, the background modeling result is poor for a short initial period while the model is being learned.
2. Background model matching and moving object detection:
For a new video block x_n, the distance L between it and the estimated background subspace model S can be computed by the following iteration:
x_{k+1,n} = x_{k,n} - (x_{k,n} . q_{k,n-1}) q_{k,n-1},  (k = 1, 2, ..., d)  (1)
L(x_n, S) = ||x_{d+1,n}||,  (2)
where x_{1,n} = x_n and q_{k,n-1} is the k-th unit principal vector of the background subspace S. If the distance L is less than a threshold T, x_n is a background video block and can be used to update the background model S; otherwise it is judged to be a video block occluded by foreground and is not used to update the model. Note that the distance L defined in Equation (2) is simply the residual of x_n after representation by the background subspace, but it cannot be computed by the traditional reconstruction-residual formula, because the eigenvectors {q_{k,n-1}}_{k=1}^{d} estimated by the online iteration are usually not strictly orthogonal to one another.
The threshold T is adjusted adaptively according to the scene brightness; the denominators 10 and 80 appearing in the threshold formula can be fine-tuned according to the scene to improve the result. This threshold setting embodies the basic consideration that if the new video block is "brighter" than the average video block, the threshold should be correspondingly higher, and when the new block is "darker" than the average, the threshold should be correspondingly lower.
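Equations (1)-(2) can be sketched directly. The adaptive-threshold function below is only an illustrative stand-in: the exact formula is not reproduced here, and only the denominators 10 and 80 and the brighter/darker behavior are taken from the text.

```python
import numpy as np

def subspace_distance(x, Q):
    """Iterative residual of Eqs. (1)-(2): successively deflate x along
    each (unit) principal vector q_k. This stays valid even when the
    online-estimated basis is not strictly orthogonal, unlike the usual
    one-shot reconstruction residual."""
    r = x.astype(np.float64).copy()
    for q in Q:                       # Q: (d, D), rows are unit vectors
        r = r - np.dot(r, q) * q      # x_{k+1,n} = x_{k,n} - (x_{k,n}.q_k) q_k
    return np.linalg.norm(r)          # L(x_n, S) = ||x_{d+1,n}||

def adaptive_threshold(block_brightness, avg_brightness, base_T=10.0):
    """Hypothetical brightness-adaptive threshold (illustration only):
    blocks brighter than the average get a higher threshold, darker
    blocks a lower one, with 80 as the brightness scaling denominator."""
    return base_T * (1.0 + (block_brightness - avg_brightness) / 80.0)
```

For an orthonormal basis the iterative residual coincides with the classical reconstruction residual; the difference only appears once the online estimates drift from orthogonality.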
3. Background model update:
Given a video block sequence B = {x_1, x_2, ..., x_n, ...}, we estimate the first d principal eigenvectors by iterating the following two equations (the CCIPCA update):
v_{k,n} = (1 - alpha) v_{k,n-1} + alpha (x_{k,n} . v_{k,n-1} / ||v_{k,n-1}||) x_{k,n}  (5)
x_{k+1,n} = x_{k,n} - (x_{k,n} . v_{k,n} / ||v_{k,n}||) v_{k,n} / ||v_{k,n}||  (6)
Here (., .) denotes the inner product, v_{k,n} is the k-th principal component (1 <= k <= d) after the update with the n-th video block, x_{1,n} = x_n, x_{k+1,n} is the residual video block after x_n has been projected onto the first k estimated principal components, and alpha is the learning rate, e.g., alpha = 0.005. The authors of the CCIPCA algorithm prove that for the above iterative update, as n -> infinity, v_{k,n} -> +/- lambda_k q_k, where lambda_k is the eigenvalue corresponding to the k-th principal component of the covariance matrix of B and q_k is the corresponding unit eigenvector. In addition, the mean mu_n of the subspace is updated by:
mu_n = (1 - alpha) mu_{n-1} + alpha x_n.  (7)
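The update step can be sketched as follows (a sketch based on the published CCIPCA algorithm of Weng et al.; the cold-start seeding of an empty component with the current residual is our own simplification, and the function name is hypothetical):

```python
import numpy as np

def ccipca_update(x, V, mu, alpha=0.005):
    """One CCIPCA step: incrementally update the subspace mean (Eq. (7))
    and the d (non-unit) principal vectors v_k from one new block vector.
    Each v_k converges to +/- lambda_k * q_k; the residual is deflated
    between components as in Eq. (6)."""
    mu = (1 - alpha) * mu + alpha * x              # Eq. (7): mean update
    r = x.astype(np.float64) - mu                  # mean-removed sample
    for k in range(len(V)):
        nrm = np.linalg.norm(V[k])
        if nrm < 1e-12:
            V[k] = r.copy()                        # cold start: seed with data
        else:
            # Eq. (5): amnesic average pulling v_k toward the data direction
            V[k] = (1 - alpha) * V[k] + alpha * np.dot(r, V[k] / nrm) * r
        nrm = np.linalg.norm(V[k])
        if nrm > 1e-12:
            u = V[k] / nrm
            r = r - np.dot(r, u) * u               # Eq. (6): deflate residual
    return V, mu

# toy run: data dominated by the first coordinate axis
np.random.seed(0)
V, mu = np.zeros((2, 3)), np.zeros(3)
for _ in range(2000):
    a, b = np.random.randn(2)
    V, mu = ccipca_update(np.array([a, 0.05 * b, 0.0]), V, mu, alpha=0.01)
```

After enough samples, V[0] should point (up to sign) along the dominant data direction, consistent with the convergence statement above.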
Claims (4)
1. A background modeling (or moving object segmentation) method for surveillance video, whose flow is shown in Figure 3, comprising the steps of:
a) model initialization;
b) background model matching and moving object detection;
c) background model update.
2. The video background modeling method of claim 1, characterized in that space-time video blocks are used as the elementary processing units.
3. The video background modeling method of claim 1, characterized in that a subspace learning method is used to find the low-dimensional background subspace within the high-dimensional space of video blocks, which is then used for foreground target detection and segmentation.
4. The video background modeling method of claim 1, characterized in that an adaptive threshold setting strategy related to the appearance of different scene regions is used for target detection and segmentation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200910177528A CN101645171A (en) | 2009-09-15 | 2009-09-15 | Background modeling method (method of segmenting video moving object) based on space-time video block and online sub-space learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101645171A (en) | 2010-02-10
Family
ID=41657048
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101996410A (en) * | 2010-12-07 | 2011-03-30 | 北京交通大学 | Method and system of detecting moving object under dynamic background |
CN102542571A (en) * | 2010-12-17 | 2012-07-04 | 中国移动通信集团广东有限公司 | Moving target detecting method and device |
CN102956032A (en) * | 2011-08-22 | 2013-03-06 | 天津市亚安科技股份有限公司 | Target template updating method |
CN103034997A (en) * | 2012-11-30 | 2013-04-10 | 杭州易尊数字科技有限公司 | Foreground detection method for separation of foreground and background of surveillance video |
CN103136742A (en) * | 2011-11-28 | 2013-06-05 | 财团法人工业技术研究院 | Foreground detection device and method |
CN104660954A (en) * | 2013-11-18 | 2015-05-27 | 深圳中兴力维技术有限公司 | Method and device for improving image brightness based on background modeling under low-illuminance scene |
CN105631405A (en) * | 2015-12-17 | 2016-06-01 | 谢寒 | Multistage blocking-based intelligent traffic video recognition background modeling method |
CN110140147A (en) * | 2016-11-14 | 2019-08-16 | 谷歌有限责任公司 | Video frame synthesis with deep learning |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080112642A1 (en) * | 2006-11-14 | 2008-05-15 | Microsoft Corporation | Video Completion By Motion Field Transfer |
CN101216943A (en) * | 2008-01-16 | 2008-07-09 | 湖北莲花山计算机视觉和信息科学研究院 | A method for video moving object subdivision |
Non-Patent Citations (2)
Title |
---|
JUYANG WENG等: "Candid Covariance-free Incremental Principal Component Analysis", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 * |
XIAO Chunxia et al.: "Video Completion Based on Spatio-Temporal Global Optimization", 《Journal of Computer-Aided Design & Computer Graphics》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20100210 |