CN103390278B - A video abnormal behavior detection system - Google Patents

A video abnormal behavior detection system

Info

Publication number
CN103390278B
CN103390278B (application CN201310311800.0A)
Authority
CN
China
Prior art keywords
model
video
probability
pixel
moment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310311800.0A
Other languages
Chinese (zh)
Other versions
CN103390278A (en)
Inventor
郭立
刘鹏
王成彰
于昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN201310311800.0A priority Critical patent/CN103390278B/en
Publication of CN103390278A publication Critical patent/CN103390278A/en
Application granted granted Critical
Publication of CN103390278B publication Critical patent/CN103390278B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

A video abnormal behavior detection system comprises: a trajectory extraction module, which uses a Gaussian mixture model to extract the moving-object silhouette and obtain the trajectory sequence; a region division module, which, for different requirements, divides the video background into different regions either manually or by an algorithm and manually labels these background regions; a conditional random field modeling module, which combines the divided regions with the corresponding trajectories to construct feature vectors, uses them to train a CRF model and estimates its parameters; and a detection module for the video under test, which obtains the feature vector of the test sequence by the same procedure, performs inference with the previously estimated parameters, computes the probability of belonging to each abnormal behavior class, and takes the abnormal behavior with the highest probability as the classification. The invention provides good practicality and a high classification accuracy.

Description

A video abnormal behavior detection system
Technical field
The present invention relates to the field of video trajectory anomaly analysis and detection, and more particularly to a video trajectory abnormality detection system.
Background technology
Human behavior analysis has wide applications and potential economic value in security monitoring, advanced human-computer interaction, video conferencing, behavior-based video retrieval, medical diagnosis and other fields, and is a research hotspot in computer vision; the intelligent video surveillance system (Intelligent Video Surveillance System, IVSS) is one of its most important applications.
With the development of information technology and the needs of public safety, the demand for intelligent monitoring has grown sharply in recent years. Traditional passive surveillance systems rely on manual observation and analysis of massive monitoring data, which leads to high labor cost, low recognition rates and high miss rates, and searching large volumes of video afterwards for material that may serve as evidence is also extremely time-consuming. This cannot meet the video surveillance requirements of security-sensitive departments such as public security, banking and transportation.
Abnormal behavior recognition is the main task of an intelligent monitoring system, whose principal requirements are real-time operation and robustness. Current research mainly concentrates on recognizing a limited set of simple rule-based behaviors or on abnormal behavior detection in special scenes.
There are two common approaches to abnormal behavior detection. The first is based on the dissimilarity between abnormal and normal behavior, and is further subdivided into two sub-methods according to whether a behavior model is built:
(1) Methods that do not build a behavior model. The observed behavior patterns are first clustered and the small clusters are labeled as abnormal; at detection time, the likelihood between the behavior in the scene and the normal behaviors in the database is computed, and the behavior is judged abnormal if the likelihood deviation exceeds a threshold.
(2) Methods that first build a database of normal behaviors; any behavior that cannot be represented by the data in the database is labeled as abnormal.
These methods are mainly used to analyze single-person behavior and need a large amount of prior knowledge to build the model, and the resulting models have shortcomings in scene adaptability and real-time performance.
The second approach is based on behavior modeling.
First, image features are extracted from the video sequence; the features are usually obtained by detecting and tracking moving objects and computing their trajectories, speeds and shape descriptors. Then a "normal" behavior model is built from these features, either manually or by supervised learning; behavior modeling usually uses hidden Markov models (HMM), maximum-entropy hidden Markov models (MEMM) or other graphical models.
These models quantize the image features into a series of discrete states and model how the states change over time. To detect abnormal behavior, the video is matched against a set of normal models, and the segments that fit no model are regarded as abnormal. Model-based methods are quite effective in scenes where "normal behavior" can be clearly defined and constrained, but in typical real life, defining and determining "abnormal" behavior is more difficult than modeling "normal" behavior.
Although both approaches can build accurate behavior models for a fixed scene, they require a large number of manually labeled behavior sequences to obtain enough training samples, which wastes considerable human resources.
Summary of the invention
The technical problem solved by the present invention: overcoming the deficiencies of the prior art by providing a video abnormal behavior detection system that effectively detects global, long-duration behavior, has good applicability and achieves a high classification accuracy.
The technical solution of the present invention: a video abnormal behavior detection system, characterized by comprising:
A trajectory extraction module, for building the trajectory sequences of targets in the training and test videos. Foreground and background are extracted from the video; the foreground is denoised, shadows are removed, a bounding box is constructed, and the bounding-box centroid is used as the trajectory point.
A region division module: the background is divided into regions, either manually or automatically by an algorithm, according to different requirements. Combining the region division with the trajectory coordinates yields the feature sequences of the abnormal behaviors.
A conditional random field modeling module: a conditional random field (CRF) is used to detect anomalies in the video. Trajectory anomalies appear mainly as particular spatial patterns on a particular background, such as loitering or staying at some position, moving in the reverse direction, or crossing a boundary. The abnormal behavior sequences constructed above are used for conditional random field model training and parameter estimation; the CRF parameters can be estimated by iterative scaling.
A detection module for the video under test: after modeling is complete, the test sequence is processed in the same way to obtain its feature sequence; using the previously built models, the conditional probability of belonging to each abnormal behavior class is computed, the class with the highest probability is taken as the label, and the presence of abnormal behavior is thereby determined.
The trajectory extraction module is implemented as follows:
The idea of the GMM (Gaussian mixture model) is that each pixel of an image can be represented by a weighted sum of M Gaussian distributions; the pixel distribution is
p(X_t) = \sum_{k=1}^{M} w_{k,t}\,\eta(X_t, \mu_{k,t}, \Sigma_{k,t})
where M is the number of Gaussian components, X_t is the red, green and blue color components of the pixel at time t, w_{k,t} is the weight of the k-th Gaussian at time t (1 \le k \le M), \mu_{k,t} and \Sigma_{k,t} are the mean and covariance matrix of the k-th Gaussian in the mixture at time t, and \eta is the Gaussian density function
\eta(X_t, \mu_{k,t}, \Sigma_{k,t}) = \frac{1}{(2\pi)^{n/2} |\Sigma_{k,t}|^{1/2}} \exp\left( -\frac{1}{2} (X_t - \mu_{k,t})^{T} \Sigma_{k,t}^{-1} (X_t - \mu_{k,t}) \right)
(1) Assume that the color channels are independently distributed, which simplifies the covariance to \Sigma_{k,t} = \sigma_{k,t}^{2} I, and initialize the Gaussian mixture model;
(2) At time t, match each pixel X_t of the video against all Gaussian components. If the distance between the pixel value X_t and the mean of the k-th Gaussian g_k is less than a threshold, the pixel X_t matches this Gaussian; the matched Gaussian then updates its parameters according to the following formulas, which increase its weight:
w_{k,t} = (1-\alpha)\, w_{k,t-1} + \alpha
\mu_{k,t} = (1-\rho)\, \mu_{k,t-1} + \rho X_t
\sigma_{k,t}^{2} = (1-\rho)\, \sigma_{k,t-1}^{2} + \rho (X_t - \mu_{k,t})^{2}
\rho = \alpha / w_{k,t}
(3) If no Gaussian matches, the Gaussian with the smallest weight is replaced by a new distribution and the remaining Gaussians are updated according to
w_{k,t} = (1-\alpha)\, w_{k,t-1}
(4) Finally the Gaussians are sorted by the priority w_{k,t}/\sigma_{k,t}; a larger value means a smaller variance and a higher probability of occurrence. After sorting, the first C distributions are chosen as the background model and the rest as the foreground model, where C satisfies
C = \arg\min_{c} \left( \sum_{k=1}^{c} w_k > T \right)
Here T is a weight threshold, which can be understood as the proportion of the picture occupied by the background. If the threshold is set too large, complex environments (such as a slowly moving water surface or branches swaying in the wind) increase the amount of computation; if it is set too small, the Gaussian mixture may degenerate into a single Gaussian.
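To make the per-pixel update concrete, the following is a minimal Python/NumPy sketch of steps (1)-(4) for a scalar (grayscale) pixel value. The re-initialization values for an unmatched pixel and the reading rho = alpha / w_{k,t} are assumptions not fixed by the text above; the function and variable names are illustrative.

```python
import numpy as np

def gmm_update_pixel(x, w, mu, var, alpha=0.001, T=0.7):
    """One update of an M-component per-pixel Gaussian mixture (steps (1)-(4) above),
    written for a scalar pixel value x. Returns (w, mu, var, is_background)."""
    d = np.abs(x - mu)
    matched = np.where(d < 2.5 * np.sqrt(var))[0]        # match test against each component

    if matched.size > 0:
        # best-matching component by priority w / sigma
        k = int(matched[np.argmax(w[matched] / np.sqrt(var[matched]))])
        rho = alpha / max(w[k], 1e-6)                    # assumed reading of rho
        w = (1 - alpha) * w
        w[k] += alpha                                    # matched component gains weight
        mu[k] = (1 - rho) * mu[k] + rho * x
        var[k] = (1 - rho) * var[k] + rho * (x - mu[k]) ** 2
    else:
        k = int(np.argmin(w))                            # replace the lowest-weight component
        others = np.arange(w.shape[0]) != k
        w[others] = (1 - alpha) * w[others]              # w_{k,t} = (1 - alpha) w_{k,t-1}
        mu[k], var[k], w[k] = float(x), 15.0 ** 2, 0.05  # illustrative new distribution
    w = w / w.sum()

    # sort by priority and keep the first C components whose weights just exceed T
    order = np.argsort(-(w / np.sqrt(var)))
    C = int(np.searchsorted(np.cumsum(w[order]), T)) + 1
    is_background = matched.size > 0 and k in order[:C]
    return w, mu, var, is_background
```

In practice this update runs independently for every pixel of every frame; the resulting foreground mask is what the later morphological clean-up operates on.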
The region division module divides the background, and a GMM can be used for the division. The idea of the GMM here is to use M Gaussian distributions to divide the image into several parts, the pixels of each part obeying a Gaussian distribution with mean \mu_k and variance \sigma_k^{2}. The main task is therefore to obtain the number M of Gaussian components and the model parameters (the weights, means and variances of the mixture). If a sample follows a known family of probability distributions whose parameters have to be estimated, the results of many observations are used to infer the most probable parameter values. This is the basic idea of maximum-likelihood estimation: the parameter value that makes the observed sample most probable is taken as the estimate. The EM algorithm can be used to estimate the GMM parameters.
To combine the background with the trajectories into the feature vectors needed to train the conditional random field model, the background is divided into several regions and each region is labeled. The division can be done by the algorithm (GMM) or manually, and is implemented as follows:
(1) Initialization:
Let the background image observations be the vectors x_i, i = 1, ..., n, and initialize
\theta^{(0)} = (w_1^{(0)}, \ldots, w_M^{(0)}, \mu_1^{(0)}, \ldots, \mu_M^{(0)}, \sigma_1^{2(0)}, \ldots, \sigma_M^{2(0)})
(2) E-step (estimation)
For each pixel, the probability that it belongs to the k-th Gaussian is
p(k \mid x_i, \theta^{old}) = \alpha_{ik} = \frac{w_k\, \eta_k(x_i; \mu_k, \sigma_k)}{\sum_{j=1}^{M} w_j\, \eta_j(x_i; \mu_j, \sigma_j)}, \quad 1 \le i \le n,\ 1 \le k \le M
(3) M-step (maximization)
New parameter values are obtained by maximizing the likelihood function.
First the weights are updated:
w_k = \frac{1}{n} \sum_{i=1}^{n} \alpha_{ik}
then the means:
\mu_k = \frac{\sum_{i=1}^{n} \alpha_{ik}\, x_i}{\sum_{i=1}^{n} \alpha_{ik}}
and the variances:
\sigma_k^{2} = \frac{\sum_{i=1}^{n} \alpha_{ik} (x_i - \mu_k)^{2}}{\sum_{i=1}^{n} \alpha_{ik}}
(4) Steps (2) and (3) are repeated until convergence.
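As an illustration of steps (1)-(4), the following is a minimal EM sketch for a one-dimensional (grayscale) mixture; the initialization strategy and the helper name are assumptions, not part of the patent.

```python
import numpy as np

def em_gmm_segment(pixels, M=4, iters=50, eps=1e-8):
    """EM for a 1-D Gaussian mixture over pixel intensities (steps (1)-(4) above)."""
    n = pixels.shape[0]
    # (1) initialization: equal weights, means spread over the data range
    w = np.full(M, 1.0 / M)
    mu = np.linspace(pixels.min(), pixels.max(), M)
    var = np.full(M, pixels.var() + eps)

    for _ in range(iters):
        # (2) E-step: responsibility alpha_ik of component k for pixel i
        d = pixels[:, None] - mu[None, :]
        dens = np.exp(-0.5 * d**2 / var) / np.sqrt(2 * np.pi * var)
        alpha = w * dens
        alpha /= alpha.sum(axis=1, keepdims=True) + eps

        # (3) M-step: re-estimate weight, mean, variance of each component
        Nk = alpha.sum(axis=0)
        w = Nk / n
        mu = (alpha * pixels[:, None]).sum(axis=0) / (Nk + eps)
        var = (alpha * (pixels[:, None] - mu)**2).sum(axis=0) / (Nk + eps) + eps

    # (4) after convergence, label each pixel with its most probable component
    d = pixels[:, None] - mu[None, :]
    dens = np.exp(-0.5 * d**2 / var) / np.sqrt(2 * np.pi * var)
    labels = (w * dens).argmax(axis=1)
    return labels, (w, mu, var)

# usage on a flattened grayscale background image bg of shape (H, W):
# labels, params = em_gmm_segment(bg.reshape(-1).astype(float), M=4)
# region_map = labels.reshape(bg.shape)
```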
The conditional random field modeling module builds the conditional random field model from the trajectories and the region division. The feature vectors of the training videos are used to build the CRF model and estimate its parameters; iterative scaling is applied to the feature vectors of videos of the same abnormal-behavior class to estimate the CRF parameters. The whole implementation is as follows:
When training the model, the following abnormal behaviors are defined: boundary crossing; loitering; staying; moving in reverse. All other behaviors are regarded as normal. The target trajectories and the regions produced by background segmentation, obtained with the methods described above, are combined into T_i = (p_t, q_t, p_{t-1}, q_{t-1}, t, subarea_k, state), where p_t, q_t are the coordinates at time t, p_{t-1}, q_{t-1} are the coordinates at time t-1, subarea_k is the label of the region the target occupies, and state marks whether the target has passed through the current region before. The training videos belonging to the same abnormal-behavior class are grouped together and their feature vectors are obtained separately (for example, the feature vectors of all "moving in reverse" videos).
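A small sketch of how the tuple T_i could be assembled from a trajectory and a region map; the helper name, the 0/1 encoding of state and the indexing convention region_map[q, p] are illustrative assumptions.

```python
def build_feature_sequence(track, region_map):
    """Combine a trajectory with the region partition into the tuples
    T_t = (p_t, q_t, p_{t-1}, q_{t-1}, t, subarea_k, state) described above.
    track is a list of (p, q) centroids; region_map[q, p] is the region index."""
    visited = set()
    features = []
    for t in range(1, len(track)):
        p, q = track[t]
        p_prev, q_prev = track[t - 1]
        subarea = int(region_map[q, p])          # region occupied by the target at time t
        state = 1 if subarea in visited else 0   # has this region been entered before?
        visited.add(subarea)
        features.append((p, q, p_prev, q_prev, t, subarea, state))
    return features
```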
After all feature vectors have been obtained, the parameters \lambda = (\lambda_1, \lambda_2, \ldots, \lambda_s, \ldots, \lambda_m) have to be estimated from the training data.
(1) In practice, the feature functions and the potential function are constructed first. Here, from the previously obtained feature vector T_i = (p_t, q_t, p_{t-1}, q_{t-1}, t, subarea_k, state) and the background region subarea_k occupied by the target at time t, a feature function is obtained, where x_t denotes the observed features (the time t and the background region subarea_k are selected here), and y_{t-1} and y_t are the manual labels at times t-1 and t.
From the coordinates p_t, q_t, p_{t-1}, q_{t-1} at times t and t-1, the direction of motion of the target can be obtained and another feature function constructed.
The other feature functions are constructed analogously, and finally the potential function is formed as the exponential of a linear combination of the feature functions, with coefficients \lambda_a, a = 1, 2, 3, 4, initialized to \lambda_a = 1 (a = 1, 2, 3, 4).
The sample expectation \tilde{E}_a = \sum_{x,y} f_a(x_t, y_{t-1}, y_t) is computed.
(2) Compute the normalization factor Z(x) = \sum_{y=1}^{5} \exp\left( \sum_a \sum_{t=1}^{n} \lambda_a f_a(x_t, y_{t-1}, y_t) \right).
(3) Following the article published by Hanna M. in 2004, compute the conditional distribution
p^{(k)}(y_t \mid x_t) = \frac{1}{Z(x)} \exp\left\{ \sum_a \sum_{t=1}^{n} \lambda_a f_a(x_t, y_{t-1}, y_t) \right\}
and use the current \lambda_a to compute the model expectation
E_a^{(k)} = \sum_{x_t, y_t} p^{(k)}(y_t \mid x_t)\, f_a(x_t, y_{t-1}, y_t)
(4) Update the parameter values \lambda_a from the sample expectation, the model expectation and the previous values; the constant c is taken as 4 in this work.
(5) Repeat steps (2) to (4) until \lambda_a converges.
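The following sketch illustrates the iterative-scaling loop of steps (1)-(5) under simplifying assumptions: it normalizes locally per position rather than with the global Z(x) above, uses the standard generalized iterative scaling update lambda_a <- lambda_a + (1/c) ln(E~_a / E_a), and takes the feature functions as a user-supplied list; all names are hypothetical.

```python
import numpy as np

def train_crf_gis(x_seq, y_seq, feats, n_labels=5, c=4.0, iters=100):
    """Generalized iterative scaling for a linear-chain CRF.
    feats is a list of feature functions f_a(x_t, y_prev, y_t) -> 0/1."""
    A, n = len(feats), len(x_seq)
    lam = np.ones(A)                                   # initialize lambda_a = 1

    # empirical (sample) expectation E~_a from the labelled training sequence
    E_emp = np.array([sum(f(x_seq[t], y_seq[t - 1], y_seq[t]) for t in range(1, n))
                      for f in feats])

    for _ in range(iters):
        # model expectation E_a under the current parameters (per-position normalization)
        E_mod = np.zeros(A)
        for t in range(1, n):
            scores = np.array([[np.exp(sum(lam[a] * feats[a](x_seq[t], i, j)
                                           for a in range(A)))
                                for j in range(n_labels)] for i in range(n_labels)])
            probs = scores / scores.sum()
            for a in range(A):
                for i in range(n_labels):
                    for j in range(n_labels):
                        E_mod[a] += probs[i, j] * feats[a](x_seq[t], i, j)

        lam += (1.0 / c) * np.log((E_emp + 1e-12) / (E_mod + 1e-12))  # GIS update
    return lam
```

One such parameter vector is trained per abnormal-behavior class, matching the grouping of training videos described above.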
The detection module for the video under test determines which abnormal behavior class the video belongs to. From the feature vector of the video under test and the previously constructed conditional random field model, the probability of belonging to each abnormal behavior class is computed and the class with the highest probability is taken, thereby performing abnormal behavior detection on the video trajectory. The implementation is as follows:
First define \delta_t(i), the maximum probability that node t is labeled i given the observation sequence x_1 x_2 \ldots x_t up to time t, where i is the behavior label index (normal, boundary crossing, loitering, staying and moving in reverse are numbered 1 to 5 respectively). A back-pointer array W_t(i) is also defined to store the best label preceding label i at time t in the recursion. From \delta_t(i) and W_t(i) at time t, \delta_{t+1}(i) at time t+1 is obtained recursively:
(1) Initialization: \delta_1(i) = p(y_1 = i \mid x_1), 1 \le i \le 5, where the quantities involved can be found in the model construction step above: \lambda_a are the parameters estimated when building the model, f_a(x_1, y_0, y_1) are the feature functions obtained from the feature vector of the test video, and i is the behavior label index (normal, boundary crossing, loitering, staying and moving in reverse are numbered 1 to 5 respectively).
(2) Recursively compute the locally optimal solution:
\delta_t(j) = \max_{1 \le i \le 5} \left[ \delta_{t-1}(i)\, p(y_t = j, y_{t-1} = i \mid x) \right] p(y_t = j \mid x_t), \quad 1 \le i \le 5,\ 2 \le t \le n,
where \delta_t(j) represents the probability that, at time t and given the observed data, the behavior label is j (j is the behavior class index, numbered from 1 to 5), and p(y_t = j, y_{t-1} = i \mid x) represents the probability that the behavior label transfers to j at time t given that the label at time t-1 is i.
(3) Update the corresponding element of the back-pointer array:
W_t(j) = \arg\max_{1 \le i \le 5} \left[ \delta_{t-1}(i)\, p(y_t = j, y_{t-1} = i \mid x) \right], \quad 1 \le i \le 5,\ 2 \le t \le n,
i.e. the label i that maximizes the product inside the brackets is stored in W_t(j).
(4) Compute
p^{*} = \max_{1 \le i \le 5} \delta_n(i), \qquad y_n^{*} = \arg\max_{1 \le i \le 5} \delta_n(i),
i.e. y_n^{*} is the label i that maximizes \delta_n(i).
(5) According to the values stored in the back-pointer array, trace back the label of maximum probability at each time t:
y_t^{*} = W_{t+1}(y_{t+1}^{*}), \quad t = n-1, n-2, \ldots, 1.
Following the back-pointer array from the end, the states y_t^{*} of all times are obtained.
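A compact sketch of the recursion and back-pointer array described in steps (1)-(5), assuming the node and transition probabilities have already been computed from the CRF; labels are 0-indexed here, whereas the text numbers them 1 to 5.

```python
import numpy as np

def viterbi_decode(node_prob, trans_prob):
    """Viterbi with a back-pointer array W, as in steps (1)-(5) above.
    node_prob[t, j]      ~ p(y_t = j | x_t)
    trans_prob[t, i, j]  ~ p(y_t = j, y_{t-1} = i | x), defined for t >= 1."""
    n, L = node_prob.shape
    delta = np.zeros((n, L))
    W = np.zeros((n, L), dtype=int)                    # back-pointer array

    delta[0] = node_prob[0]                            # (1) initialization
    for t in range(1, n):                              # (2) recursion
        for j in range(L):
            cand = delta[t - 1] * trans_prob[t, :, j]
            W[t, j] = int(np.argmax(cand))             # (3) store the best predecessor
            delta[t, j] = cand[W[t, j]] * node_prob[t, j]

    path = np.zeros(n, dtype=int)                      # (4) best final label
    path[-1] = int(np.argmax(delta[-1]))
    for t in range(n - 2, -1, -1):                     # (5) backtrack through W
        path[t] = W[t + 1, path[t + 1]]
    return path, float(delta[-1].max())
```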
The advantages of the invention are:
(1) The video abnormal-event detection method based on the conditional random field model in the present invention can detect several specific classes of unexpected abnormal events; its anomaly analysis algorithm has good applicability and a high classification accuracy.
(2) The scheme proposed by the present invention requires very little change to existing systems, does not affect system compatibility, and is simple and efficient to implement.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the invention in conjunction with the accompanying drawings, in which:
Fig. 1 is the flow diagram of video abnormal-event detection according to the present invention;
Fig. 2 shows the implementation process of the abnormal behavior detection modules according to the present invention;
Fig. 3 is the flow chart of foreground extraction with the GMM model according to the present invention;
Fig. 4 is a schematic diagram of trajectory-sequence extraction according to the present invention, where (a) is trajectory sequence 1 and (b) is trajectory sequence 2;
Fig. 5 is a schematic diagram of background region division according to the present invention, where (a) is the original background and (b) is the region division obtained with the GMM model;
Fig. 6 is the first-order conditional random field model diagram according to the present invention;
Fig. 7 is a first example of abnormal behavior detection results according to the present invention, where (a) is normal, (b) is normal and (c) is staying;
Fig. 8 is a second example of abnormal behavior detection results according to the present invention, where (a) is loitering, (b) is boundary crossing and (c) is boundary crossing;
Fig. 9 is an example of the device detection results according to the present invention, where (a) selects the video, (b) extracts the target, (c) obtains the trajectory, (d) selects the number of background segmentation regions, (e) obtains the background segmentation result and (f) performs detection.
Embodiment
The present invention is described in detail below; examples of the embodiments are shown in the drawings, in which the same or similar reference numbers denote the same or similar elements or elements with the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary, are only used to explain the present invention, and are not to be construed as limiting the present invention.
To achieve the object of the present invention, the invention discloses a video abnormal-trajectory detection method based on a conditional random field and background region division. Referring to Fig. 1, the whole method comprises the following steps:
(1) Abnormal behavior detection consists of a training part and a testing part; both first use the trajectory extraction module to extract the target trajectory. There are many methods for extracting the target; the present invention mainly adopts the Gaussian mixture model (GMM), which updates the background in time and fully takes slight changes of the target into account.
Because of noise, shadows and so on, further processing is carried out after the target is extracted. Morphological processing is performed first to remove isolated noise points from the target image; connected components are computed and those with small areas are filtered out; shadow processing follows (a minimal clean-up sketch is given after this list). Finally the target bounding box is obtained and its center is computed as the trajectory point.
(2) After the trajectories have been obtained, the region division module is used to segment the background into regions.
(3) The conditional random field modeling module uses the previously obtained trajectories and the segmented background to construct the feature vectors, builds the conditional random field model and estimates its parameters.
(4) The detection part of the module for the video under test constructs feature vectors by the same method as for the training videos and performs model inference. During detection, the probability that the feature vector belongs to each abnormal behavior class is computed, and the abnormal behavior with the highest probability is taken as the classification.
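As referenced in step (1), a minimal clean-up sketch using OpenCV; the kernel size and minimum area are illustrative assumptions, and shadow removal is not shown.

```python
import cv2
import numpy as np

def clean_foreground(fg_mask, min_area=200):
    """Morphological clean-up of a GMM foreground mask: remove isolated noise,
    then drop connected components whose area is below min_area."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    opened = cv2.morphologyEx(fg_mask, cv2.MORPH_OPEN, kernel)
    num, labels, stats, _ = cv2.connectedComponentsWithStats(opened)
    cleaned = np.zeros_like(opened)
    for i in range(1, num):                            # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            cleaned[labels == i] = 255
    return cleaned
```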
In the overall module flow of the present invention, referring to Fig. 2, the trajectory sequence is first obtained by the trajectory extraction module; the background is then passed through the region division module to obtain the divided regions; the model is constructed by the conditional random field modeling module; and finally the abnormal behavior class is detected by the detection module for the video under test. The specific implementation of each module is as follows:
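Putting the modules together, the flow of Fig. 2 can be sketched as below; extract_foreground_gmm is a hypothetical frame-level wrapper, the other helpers are the per-module sketches given elsewhere in this description, and the uniform node probabilities are a simplification.

```python
import numpy as np

def detect_abnormal_behaviour(frames, region_map, lam, feats, n_labels=5):
    """End-to-end sketch of the flow in Fig. 2, built on hypothetical helpers."""
    # 1) trajectory extraction module: foreground -> cleaned mask -> centroid per frame
    track = []
    for frame in frames:
        fg = extract_foreground_gmm(frame)               # hypothetical GMM wrapper
        _, centroid = bounding_box_centroid(clean_foreground(fg))
        if centroid is not None:
            track.append((int(centroid[0]), int(centroid[1])))

    # 2) region division output (region_map) combined with the track -> 3) CRF features
    features = build_feature_sequence(track, region_map)

    # 4) detection: score label transitions with the trained weights, then decode
    n = len(features)
    node_prob = np.full((n, n_labels), 1.0 / n_labels)   # simplification: uniform node term
    trans_prob = np.zeros((n, n_labels, n_labels))
    for t in range(1, n):
        s = np.array([[np.exp(sum(lam[a] * feats[a](features[t], i, j)
                                  for a in range(len(feats))))
                       for j in range(n_labels)] for i in range(n_labels)])
        trans_prob[t] = s / s.sum()
    path, _ = viterbi_decode(node_prob, trans_prob)
    return path                                          # one behaviour label per frame
```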
1. Trajectory extraction module
The GMM model is used to extract the foreground and background in the training video, as shown in Fig. 3.
GMM is used to separate foreground and background mainly for the following reason: in a scene observed over a long time, the background is present most of the time; even a moving object of uniform color produces more color changes than the background, and in general objects have different colors. When an object is added to the scene, the old background model is replaced by a new one after a period of adaptation; when the object is removed, the original background model still exists, so the background model can be recovered quickly.
The idea of the GMM (Gaussian mixture) model is that the color shown by each pixel is represented by M states, with M usually 3-5, and each state is approximated by a Gaussian distribution. The color of the pixel is represented by a random variable X, and the pixel value of the video image obtained at each time t is a sample of X; the pixel distribution is
p(X_t) = \sum_{k=1}^{M} w_{k,t}\,\eta(X_t, \mu_{k,t}, \Sigma_{k,t})
where X_t is the red, green and blue color components of the pixel at time t, w_{k,t} is the weight of the k-th Gaussian at time t (1 \le k \le M), \mu_{k,t} and \Sigma_{k,t} are the mean and covariance matrix of the k-th Gaussian in the mixture at time t, and \eta is the Gaussian density function
\eta(X_t, \mu_{k,t}, \Sigma_{k,t}) = \frac{1}{(2\pi)^{n/2} |\Sigma_{k,t}|^{1/2}} \exp\left( -\frac{1}{2} (X_t - \mu_{k,t})^{T} \Sigma_{k,t}^{-1} (X_t - \mu_{k,t}) \right)
(1) Assume that the color channels are independently distributed; after the Gaussian mixture model has been initialized, sort the Gaussians by the priority w_{k,t}/\sigma_{k,t}.
(2) At time t, match each pixel X_t of the video against all Gaussian components. If the distance between the pixel value X_t and the mean of the k-th Gaussian g_k is less than a threshold (2.5 standard deviations), this Gaussian is defined as matching the pixel X_t, and its parameters are updated according to
w_{k,t} = (1-\alpha)\, w_{k,t-1} + \alpha
\mu_{k,t} = (1-\rho)\, \mu_{k,t-1} + \rho X_t
\sigma_{k,t}^{2} = (1-\rho)\, \sigma_{k,t-1}^{2} + \rho (X_t - \mu_{k,t})^{2}
\rho = \alpha / w_{k,t}
In these formulas \alpha is the learning rate, which reflects how fast the Gaussian parameters are updated; it is a small number close to zero, with initial value 0.001. w_{k,t} is the weight of the k-th Gaussian at time t (1 \le k \le M), \mu_{k,t} and \Sigma_{k,t} are the mean and covariance matrix of the k-th Gaussian in the GMM at time t, the covariance is simplified to \Sigma_{k,t} = \sigma_{k,t}^{2} I, and \sigma_{k,t} is the standard deviation of the k-th Gaussian;
(3) If no Gaussian matches, the Gaussian g_l with the lowest priority is re-initialized. Whether a match occurs or not, the weights, means and variances are all updated according to the corresponding rules.
After sorting, the first C distributions are chosen as the background model and the rest as the foreground model.
(4) After the target has been extracted with the GMM model, its coordinates (x, y) \in {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)} are obtained. The extreme points of the horizontal and vertical coordinates are then found: x_min = min(x_1, x_2, ..., x_n), x_max = max(x_1, x_2, ..., x_n), y_min = min(y_1, y_2, ..., y_n), y_max = max(y_1, y_2, ..., y_n). The lines x = x_min, x = x_max, y = y_min and y = y_max are constructed through these extreme points, which gives the rectangular bounding box (x, y) \in [x_min, x_max] \times [y_min, y_max] surrounding the target. From the bounding box, its centroid ((x_min + x_max)/2, (y_min + y_max)/2) is obtained and used as the trajectory point of the target. Combining all trajectory points of the video in temporal order yields the trajectory sequence, as shown in Fig. 4.
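A direct transcription of this step; the helper name is illustrative.

```python
import numpy as np

def bounding_box_centroid(fg_mask):
    """Axis-aligned bounding box of the foreground pixels and its centroid,
    used as the trajectory point of the current frame."""
    ys, xs = np.nonzero(fg_mask)
    if xs.size == 0:
        return None, None
    x_min, x_max = int(xs.min()), int(xs.max())
    y_min, y_max = int(ys.min()), int(ys.max())
    box = (x_min, y_min, x_max, y_max)
    centroid = ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)
    return box, centroid

# Collecting the centroid of every frame in temporal order yields the track sequence.
```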
2. Region division module
To better reveal the meaning of a trajectory in the current scene and to reduce computational complexity, the present invention divides the background extracted by the GMM into L regions, each labeled subarea_k (k = 1, 2, ..., L), as shown in Fig. 5.
The background can be divided manually or by an algorithm; Fig. 5 uses the GMM algorithm to divide the regions.
The idea behind the GMM division is that, assuming M Gaussian components in total, the image can be divided into M regions, the pixels of the k-th region obeying a Gaussian distribution with mean \mu_k and variance \sigma_k^{2}. The model parameters are \theta = (w_k, \mu_k, \sigma_k^{2}) (the weights, means and variances of the Gaussian components); \theta can be obtained by maximum-likelihood estimation and solved with the EM algorithm.
Steps of the EM algorithm:
1. Initialization: set \theta^{(0)};
2. Estimation (E-step):
Compute the posterior probability for weight w_k:
\alpha_{ik} = \frac{w_k\, \eta_k(x_i; \mu_k, \sigma_k)}{\sum_{j=1}^{M} w_j\, \eta_j(x_i; \mu_j, \sigma_j)}, \quad 1 \le i \le n,\ 1 \le k \le M
where x_i is the i-th pixel of the video frame, w_k is the weight of the k-th Gaussian, and \mu_k and \sigma_k are the mean and standard deviation of the k-th Gaussian;
3. Maximization (M-step):
Update the weights, means and variances.
The E-step and M-step are repeated until convergence; finally, the Gaussian to which each pixel belongs is computed from the parameters \theta and the image is partitioned accordingly.
3. Conditional random field modeling module
The chain-structured undirected graphical model of the conditional random field is shown in Fig. 6. In the present invention Y represents the abnormal behavior class and X represents the observed feature values. When training the model, as described above, the following abnormal behaviors are defined: boundary crossing; loitering; staying; moving in reverse. All other behaviors are regarded as normal. The target trajectories and the regions produced by background segmentation, obtained with the methods described above, are combined into T_i = (p_t, q_t, p_{t-1}, q_{t-1}, t, subarea_k, state), where p_t, q_t are the coordinates at time t, p_{t-1}, q_{t-1} are the coordinates at time t-1, subarea_k is the label of the region the target occupies, and state marks whether the target has passed through the current region before. The training videos belonging to the same abnormal-behavior class are grouped together and their feature vectors are obtained separately (for example, the feature vectors of all "moving in reverse" videos).
After all feature vectors have been obtained, the parameters \lambda = (\lambda_1, \lambda_2, \ldots, \lambda_s, \ldots, \lambda_m) have to be estimated from the training data.
(1) In practice, the feature functions and the potential function are constructed first. In the invention, from the previously obtained feature vector T_i = (p_t, q_t, p_{t-1}, q_{t-1}, t, subarea_k, state) and the background region subarea_k occupied by the target at time t, a feature function is obtained, where x_t denotes the observed features (the time t and the background region subarea_k are selected here), and y_{t-1} and y_t are the manual labels at times t-1 and t.
From the coordinates p_t, q_t, p_{t-1}, q_{t-1} at times t and t-1, the direction of motion of the target can be obtained and a feature function constructed.
From the region subarea_k currently occupied by the target and the state flag, a feature function is constructed that determines whether the target passes through a region repeatedly.
From the region subarea_k currently occupied by the target and the time t, a feature function is constructed that records the time of entering a subarea_k.
Finally the potential function is constructed as the exponential of a linear combination of the feature functions, with coefficients \lambda_a, a = 1, 2, 3, 4, initialized to \lambda_a = 1 (a = 1, 2, 3, 4).
The sample expectation \tilde{E}_a = \sum_{x,y} f_a(x_t, y_{t-1}, y_t) is computed.
(2) Compute the normalization factor Z(x) = \sum_{y=1}^{5} \exp\left( \sum_a \sum_{t=1}^{n} \lambda_a f_a(x_t, y_{t-1}, y_t) \right).
(3) Following the article published by Hanna M. in 2004, compute the conditional distribution
p^{(k)}(y_t \mid x_t) = \frac{1}{Z(x)} \exp\left\{ \sum_a \sum_{t=1}^{n} \lambda_a f_a(x_t, y_{t-1}, y_t) \right\}
and use the current \lambda_a to compute the model expectation
E_a^{(k)} = \sum_{x_t, y_t} p^{(k)}(y_t \mid x_t)\, f_a(x_t, y_{t-1}, y_t)
(4) Update the parameter values \lambda_a; the constant c is taken as 4 in this work. Repeat steps (2) to (4) until \lambda_a converges.
4. Detection module for the video under test
The CRF model that has been built is used with an inference algorithm to compute the labels of the test data. The test data first undergo the same preprocessing to obtain their feature vectors; then the parameters \lambda = (\lambda_1, \lambda_2, \ldots, \lambda_a) estimated for the CRF models just built (normal, boundary crossing, loitering, staying, moving in reverse) are used to compute the probability that the test video belongs to each abnormal behavior class, and the maximum is taken as the basis for classification. Fig. 7 shows detected normal behaviors and Fig. 8 detected abnormal behaviors. For a simple chain-structured undirected graph, the Viterbi algorithm can be used as a reference. Viterbi finds locally optimal solutions and finally composes them into the complete solution; the steps are as follows:
First define \delta_t(i), the maximum probability that node t is labeled i given the observation sequence x_1 x_2 \ldots x_t up to time t, where i is the behavior label index (normal, boundary crossing, loitering, staying and moving in reverse are numbered 1 to 5 respectively). A back-pointer array W_t(i) is also defined to store the best label preceding label i at time t in the recursion. From \delta_t(i) and W_t(i) at time t, \delta_{t+1}(i) at time t+1 is obtained recursively:
(1) Initialization: \delta_1(i) = p(y_1 = i \mid x_1), 1 \le i \le 5, where the quantities involved can be found in the model construction step above: \lambda_a are the parameters estimated when building the model, f_a(x_1, y_0, y_1) are the feature functions obtained from the feature vector of the test video, and i is the behavior label index (normal, boundary crossing, loitering, staying and moving in reverse are numbered 1 to 5 respectively).
(2) Recursively compute the locally optimal solution:
\delta_t(j) = \max_{1 \le i \le 5} \left[ \delta_{t-1}(i)\, p(y_t = j, y_{t-1} = i \mid x) \right] p(y_t = j \mid x_t), \quad 1 \le i \le 5,\ 2 \le t \le n,
where \delta_t(j) represents the probability that, at time t and given the observed data, the behavior label is j (j is the behavior class index, numbered from 1 to 5), and p(y_t = j, y_{t-1} = i \mid x) represents the probability that the behavior label transfers to j at time t given that the label at time t-1 is i.
To compute p(y_t = j, y_{t-1} = i \mid x), let the nodes be X_t (t = 1, ..., n), where a node can be understood as the feature information of a video frame; each node has 5 possible labels y_t \in {1, ..., 5}, representing the five behaviors normal, boundary crossing, loitering, moving in reverse and staying. Two additional nodes are added before the first and after the last node, defined as the 'start node' and the 'end node'.
For the label classification of the whole node sequence, p(y \mid x) is equivalent to the probability of choosing one path from the first node to the last node.
For each node a 5 x 5 matrix M_t(x) is defined, with elements
m(y_{t-1} = i, y_t = j \mid x_t) = \exp\left( \sum_a \lambda_a f_a(x_t, y_{t-1}, y_t) \right)
(i and j are behavior labels from 1 to 5, and f_a(x_t, y_{t-1}, y_t) are the corresponding feature-function values). With the elements of this matrix the marginal distribution p(y_t = j, y_{t-1} = i \mid x_t) can be computed:
p(y_t = j, y_{t-1} = i \mid x_t) = \frac{\alpha_{t-1}(y_{t-1} = i \mid x_t)\, m(y_{t-1} = i, y_t = j \mid x_t)\, \beta_t(y_t = j \mid x_t)}{Z(x)}
where \alpha_{t-1}(y_{t-1} \mid x_t) is a forward vector; for the start node \alpha_0(y_0 \mid x_1) = 1, and the forward vectors are produced iteratively: \alpha_t(y_t \mid x_{t+1}) = \alpha_{t-1}(y_{t-1} \mid x_t)\, M_t(x), where M_t(x) is the matrix constructed above for each node. Similarly, the backward vectors satisfy \beta_t(y_t \mid x_t) = M_{t+1}(x)\, \beta_{t+1}(y_{t+1} \mid x_{t+1}), with \beta_{n+1}(y_{n+1} \mid x_{n+1}) = 1.
Afterwards, to compute \delta_t(j) for, say, t = 8 and j = 2, the maximum of the products in the following brackets is computed:
\max[\delta_7(1)\, p(y_8 = 2, y_7 = 1 \mid x_8),\ \delta_7(2)\, p(y_8 = 2, y_7 = 2 \mid x_8),\ \ldots,\ \delta_7(5)\, p(y_8 = 2, y_7 = 5 \mid x_8)].
The largest product in the brackets is chosen and multiplied by p(y_t = j \mid x_t) to obtain \delta_t(j) (a NumPy sketch of this forward-backward computation is given after the step list below).
(3) Update the corresponding element of the back-pointer array:
W_t(j) = \arg\max_{1 \le i \le 5} \left[ \delta_{t-1}(i)\, p(y_t = j, y_{t-1} = i \mid x) \right], \quad 1 \le i \le 5,\ 2 \le t \le n,
i.e. the label i that maximizes the product inside the brackets is stored in W_t(j).
(4) Let p^{*} be the maximum probability of the complete label sequence and y_n^{*} the last state in that sequence; they are computed after \delta_n(1), \delta_n(2), \ldots, \delta_n(5) have been obtained:
p^{*} = \max_{1 \le i \le 5} \delta_n(i), \qquad y_n^{*} = \arg\max_{1 \le i \le 5} \delta_n(i).
(5) According to the values stored in the back-pointer array, trace back the label of maximum probability at each time t:
y_t^{*} = W_{t+1}(y_{t+1}^{*}), \quad t = n-1, n-2, \ldots, 1.
Following the back-pointer array from the end, the states y_t^{*} of all times are obtained.
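As referenced in step (2) above, a NumPy sketch of the forward-backward computation of the pairwise marginals; the all-ones start vector and the helper name are assumptions, and the matrices M_t are taken as already evaluated from the feature functions.

```python
import numpy as np

def pairwise_marginals(M):
    """Pairwise marginals p(y_t = j, y_{t-1} = i | x) from the per-node matrices
    M[t-1][i, j] = exp(sum_a lambda_a f_a(x_t, y_{t-1}=i, y_t=j)), t = 1..n."""
    n, L, _ = M.shape
    alpha = np.ones((n + 1, L))                  # alpha_0 = 1 at the start node
    beta = np.ones((n + 1, L))                   # beta_{n+1} = 1 at the end node
    for t in range(1, n + 1):
        alpha[t] = alpha[t - 1] @ M[t - 1]       # forward recursion
    for t in range(n - 1, 0, -1):
        beta[t] = M[t] @ beta[t + 1]             # backward recursion
    Z = alpha[n].sum()                           # normalization factor Z(x)

    marg = np.zeros((n, L, L))                   # marg[t-1][i, j] = p(y_{t-1}=i, y_t=j | x)
    for t in range(1, n + 1):
        marg[t - 1] = (alpha[t - 1][:, None] * M[t - 1] * beta[t][None, :]) / Z
    return marg
```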
The experimental data used in the present invention are an independently captured database and the 3DPES database. The experiments cover the following behaviors: normal (Normal); loitering (Wander); boundary crossing (Cross); staying (Stay); and moving in reverse (Reverse). Each abnormal behavior has 25-30 training videos, each lasting 8-30 seconds. In total, 30% of the data are used for training and 70% for testing. An HMM model is also used for detection as a comparison; for the same video data, the detection rates of the two models are compared in Table 1 and Table 2.
Table 1: HMM detection results
Table 2: CRF model detection results
Fig. 9 shows the experimental prototype, which includes the modules described above and their operation. Experiments show that, on the databases used by the present invention, the detection rates for the several defined abnormal behaviors can reach more than 90%.
Parts of the present invention that are not described in detail belong to techniques well known in the art.

Claims (4)

1. A video abnormal behavior detection system, characterized by comprising: a trajectory extraction module, a region division module, a conditional random field modeling module, and a detection module for the video under test; wherein:
The trajectory extraction module first detects the trajectory of the target to be detected in the video with a GMM, i.e. a Gaussian mixture model, and sends it to the conditional random field modeling module;
The region division module divides the background into regions, either manually or automatically by an algorithm according to different requirements, labels the divided regions, and sends them to the conditional random field modeling module;
The conditional random field modeling module combines the divided regions with the trajectory coordinates to obtain the feature vectors of the abnormal behaviors, constructs feature functions using the feature vectors belonging to one class of abnormal behavior, carries out conditional random field model training and parameter estimation to obtain the weight coefficients of the feature functions in the conditional random field, and sends them to the detection module for the video under test;
The detection module for the video under test extracts the trajectory coordinates of the test sequence, combines them with the divided regions to obtain the feature vector of the test sequence, performs inference with the parameters estimated by the conditional random field modeling module, computes the probability of belonging to each abnormal behavior class, and takes the abnormal behavior with the highest probability as the classification;
The conditional random field modeling module is implemented as follows:
After all feature vectors have been obtained, the parameters have to be estimated from the training data:
\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_s, \ldots, \lambda_m)
(1) The feature functions and the potential function are constructed first. From the feature vector T_i = (p_t, q_t, p_{t-1}, q_{t-1}, t, subarea_k, state) obtained from the background segmentation and region division, and the background region subarea_k occupied by the target at time t, a feature function is obtained,
where x_t denotes the observed features (the time t and the background region subarea_k are selected here), and y_{t-1} and y_t are the manual labels at times t-1 and t; from the coordinates p_t, q_t, p_{t-1}, q_{t-1} at times t and t-1, the direction of motion of the target is obtained and a feature function constructed;
The other feature functions are constructed analogously, and finally the potential function is formed as the exponential of a linear combination of the feature functions, with coefficients \lambda_a, a = 1, 2, 3, 4, initialized to \lambda_a = 1, a = 1, 2, 3, 4; the sample expectation is computed, f_a(x_t, y_{t-1}, y_t) being the feature functions constructed in step (1);
This step defines the feature functions and computes the sample expectation;
(2) The normalization factor is computed, where n is the number of nodes of the video, taken as the number of video frames; the normalization factor is obtained from the feature functions constructed in step (1);
(3) The conditional distribution is computed,
and the current \lambda_a are used to compute the model expectation;
the model expectation is obtained from the normalization factor computed in the previous step and the previously constructed feature functions;
(4) The parameter values are updated, with c taken as 4;
the parameter value \lambda_a is updated from the sample expectation and model expectation obtained in steps (2) and (3) and the parameter value of the previous iteration;
(5) Steps (2) to (4) are repeated until \lambda_a converges.
2. The video abnormal behavior detection system according to claim 1, characterized in that the trajectory extraction module is implemented as follows:
(1) Assume that the color channels are independently distributed, which simplifies the covariance, and initialize the Gaussian mixture model: the mean and variance of each Gaussian are initialized to zero, and the weight of each Gaussian is initialized to 1/M, M being the number of Gaussians;
(2) At time t, match each pixel X_t of the video against all Gaussian components. If the distance between the value of pixel X_t and the mean of the k-th Gaussian g_k is less than a threshold, the pixel X_t matches this Gaussian and the parameters of the matched Gaussian are updated according to the following formula, which increases its weight. Given the weights, means and variances of the Gaussians initialized in step (1) and the RGB three-channel value of each pixel of a frame, with the pixel matching an existing component, the output is the weights, means and variances of the Gaussians after matching at time 2; by analogy, given the Gaussian parameters at time t-1, the updated weights, means and variances of the Gaussians at time t are obtained;
In the formula \alpha is the learning rate, which reflects how fast the Gaussian parameters are updated, with initial value 0.001; w_{k,t} is the weight of the k-th Gaussian at time t, 1 \le k \le M; \mu_{k,t} and \Sigma_{k,t} are the mean and covariance matrix of the k-th Gaussian in the GMM at time t, the covariance is simplified, and \sigma_{k,t} is the standard deviation of the k-th Gaussian;
(3) If no Gaussian matches, the Gaussian with the smallest weight is replaced by a new distribution and the remaining Gaussians are updated according to the following formula. Given the weights, means and variances of the Gaussians initialized in step (1) and the RGB three-channel value of each pixel of a frame, with the pixel not matching any existing component, the output is the weights, means and variances of the Gaussians at time 2; by analogy, given the Gaussian parameters at time t-1, the updated weights, means and variances of the Gaussians at time t are obtained,
w_{k,t} = (1-\alpha)\, w_{k,t-1}     (2)
(4) Finally the Gaussians are sorted by the priority w_{k,t}/\sigma_{k,t}, where a larger value means a smaller variance and a higher probability of occurrence; after sorting, the first C distributions are chosen as the background model and the rest as the foreground model, where C satisfies:
Here T is a weight threshold in the range 0.65 to 0.75. Finally it is judged whether each pixel belongs to the background model; if it does not, it belongs to the foreground model. After the pixels belonging to the foreground model have been determined, the pixels belonging to the foreground target in each frame can be identified, and all foreground pixels together give the foreground target. Given the weights, means and variances of the Gaussians at time k obtained in the previous step, the output of this step is the set of pixels that satisfy the foreground condition, i.e. the foreground target;
After the foreground target has been obtained, the centroid of the foreground bounding box is used as the trajectory point, as follows: the extreme points of the horizontal and vertical coordinates of the foreground target are found, lines are constructed through these extreme points to obtain the rectangular bounding box surrounding the target, and the centroid of the bounding box is computed, which gives the trajectory of the target to be detected.
3. The video abnormal behavior detection system according to claim 1, characterized in that the region division module is implemented as follows:
(1) Initialization
The initial parameters of each Gaussian are set:
they represent the weights, means and variances of the 1st to M-th Gaussians respectively, the zero in the superscript parentheses denoting the value after the 0-th update; at initialization the weights of all Gaussians are set to 1/M, M being the number of Gaussians, and the means and variances are initialized to zero;
(2) The E-step (estimation) is carried out
For each pixel, the probability that it belongs to the k-th Gaussian is:
x_i is the i-th pixel of the video frame, w_k is the weight of the k-th Gaussian, and \mu_k and \sigma_k are the mean and standard deviation of the k-th Gaussian; from the weight, mean and variance of the k-th Gaussian obtained at initialization or in step (3), and the RGB three-channel value of the i-th pixel obtained from the image, the output is the probability that the i-th pixel belongs to the k-th Gaussian;
(3) The M-step
New parameter values are obtained by maximizing the likelihood function; first the weights are updated:
then the means:
and then the variances:
From the probabilities that the pixels belong to the different Gaussians obtained in step (2), new weights, means and variances are produced by iteration;
(4) Steps (2) and (3) are repeated until the updated likelihood function converges; after convergence, the Gaussian to which each pixel belongs is computed, all pixels of the video frame are classified according to the Gaussian index, and the divided image is obtained.
4. The video abnormal behavior detection system according to claim 1, characterized in that the detection module for the video under test is implemented as follows:
(1) Initialization: \delta_1(i) = p(y_1 = i \mid x_1), 1 \le i \le 5, where \lambda_a are the parameters estimated when building the model, f_a(x_1, y_0, y_1) are the feature functions obtained from the feature vector of the test video, and i is the behavior label index, i.e. normal, boundary crossing, loitering, staying and moving in reverse, numbered 1 to 5 respectively; \delta_1(i), 1 \le i \le 5, is initialized as \delta_1(i) = 1;
(2) The locally optimal solution is computed recursively: from the \delta_1(i) initialized in step (1), \delta_2(i) is computed, i.e. the probability of the node label at time 2 is obtained; from the conditional probability that the behavior label is i at time t-1 given the observations, and the conditional probability that the behavior label is j at time t given that it was i at time t-1, the probability \delta_t(j) that the node label at time t is j is obtained, and so on until the probabilities \delta_n(i), 1 \le i \le 5, of the last time step are obtained;
where \delta_t(j) represents the probability that, at time t and given the observed data, the behavior label is j, j being the behavior class index numbered from 1 to 5;
p(y_t = j, y_{t-1} = i \mid x) represents the probability that the behavior label transfers to j at time t given that the label at time t-1 is i;
(3) The corresponding element of the back-pointer array is updated: 1 \le i \le 5, 2 \le t \le n; from the probability \delta_{t-1}(i) that the behavior label is i at time t-1, computed in step (2), the element of the back-pointer array at time t is updated; the label that maximizes the product term in the brackets is stored in W_t(j);
(4) p^{*}, the maximum of \delta_n(i), is computed; the input is the probability of each node belonging to the different abnormal behaviors, and the output is the label of maximum probability;
the returned value is the label i that maximizes \delta_n(i);
from the \delta_n(i) obtained in step (2), the label i that maximizes it is obtained;
(5) According to the values stored in the back-pointer array, the label of maximum probability at time t is traced back:
from the back-pointer array obtained in step (3) and the result of step (4), the trace-back starts from the end until the state labels of all times are obtained.
CN201310311800.0A 2013-07-23 2013-07-23 A video abnormal behavior detection system Expired - Fee Related CN103390278B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310311800.0A CN103390278B (en) 2013-07-23 2013-07-23 A video abnormal behavior detection system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310311800.0A CN103390278B (en) 2013-07-23 2013-07-23 A video abnormal behavior detection system

Publications (2)

Publication Number Publication Date
CN103390278A CN103390278A (en) 2013-11-13
CN103390278B true CN103390278B (en) 2016-03-09

Family

ID=49534537

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310311800.0A Expired - Fee Related CN103390278B (en) 2013-07-23 2013-07-23 A video abnormal behavior detection system

Country Status (1)

Country Link
CN (1) CN103390278B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103631917B (en) * 2013-11-28 2017-01-11 中国科学院软件研究所 Emergency event detecting method based on mobile object data stream
CN104318244A (en) * 2014-10-16 2015-01-28 深圳锐取信息技术股份有限公司 Behavior detection method and behavior detection device based on teaching video
CN105718857B (en) * 2016-01-13 2019-06-04 兴唐通信科技有限公司 A kind of human body anomaly detection method and system
CN106446820B (en) * 2016-09-19 2019-05-14 清华大学 Background characteristics point recognition methods and device in dynamic video editor
CN107067649B (en) * 2017-05-23 2019-08-13 重庆邮电大学 A kind of typical behaviour real-time identification method based on wireless wearable aware platform
CN107335220B (en) * 2017-06-06 2021-01-26 广州华多网络科技有限公司 Negative user identification method and device and server
CN107451595A (en) * 2017-08-04 2017-12-08 河海大学 Infrared image salient region detection method based on hybrid algorithm
CN107832716B (en) * 2017-11-15 2020-05-12 中国科学技术大学 Anomaly detection method based on active and passive Gaussian online learning
EP3493102B1 (en) * 2017-11-30 2020-04-29 Axis AB A method and system for tracking a plurality of objects in a sequence of images
CN108805002B (en) * 2018-04-11 2022-03-01 杭州电子科技大学 Monitoring video abnormal event detection method based on deep learning and dynamic clustering
CN109472484B (en) * 2018-11-01 2021-08-03 凌云光技术股份有限公司 Production process abnormity recording method based on flow chart
CN109784175A (en) * 2018-12-14 2019-05-21 深圳壹账通智能科技有限公司 Abnormal behaviour people recognition methods, equipment and storage medium based on micro- Expression Recognition
CN114067314B (en) * 2022-01-17 2022-04-26 泗水县锦川花生食品有限公司 Neural network-based peanut mildew identification method and system
CN117333929B (en) * 2023-12-01 2024-02-09 贵州省公路建设养护集团有限公司 Method and system for identifying abnormal personnel under road construction based on deep learning
CN117911930B (en) * 2024-03-15 2024-06-04 释普信息科技(上海)有限公司 Data security early warning method and device based on intelligent video monitoring

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000261712A (en) * 1999-03-05 2000-09-22 Matsushita Electric Ind Co Ltd Device for correcting image movement
US7567704B2 (en) * 2005-11-30 2009-07-28 Honeywell International Inc. Method and apparatus for identifying physical features in video
CN102831442A (en) * 2011-06-13 2012-12-19 索尼公司 Abnormal behavior detection method and equipment and method and equipment for generating abnormal behavior detection equipment
CN102930250A (en) * 2012-10-23 2013-02-13 西安理工大学 Motion recognition method for multi-scale conditional random field model
CN102938070A (en) * 2012-09-11 2013-02-20 广西工学院 Behavior recognition method based on action subspace and weight behavior recognition model

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000261712A (en) * 1999-03-05 2000-09-22 Matsushita Electric Ind Co Ltd Device for correcting image movement
US7567704B2 (en) * 2005-11-30 2009-07-28 Honeywell International Inc. Method and apparatus for identifying physical features in video
CN102831442A (en) * 2011-06-13 2012-12-19 索尼公司 Abnormal behavior detection method and equipment and method and equipment for generating abnormal behavior detection equipment
CN102938070A (en) * 2012-09-11 2013-02-20 广西工学院 Behavior recognition method based on action subspace and weight behavior recognition model
CN102930250A (en) * 2012-10-23 2013-02-13 西安理工大学 Motion recognition method for multi-scale conditional random field model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Description of abnormal targets in multi-view scenes; Zhao Long; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2013-03-15; pp. 40-64, Section 4.6; pp. 109-114, Sections 8.4.1-8.4.2; Fig. 4.13 *

Also Published As

Publication number Publication date
CN103390278A (en) 2013-11-13

Similar Documents

Publication Publication Date Title
CN103390278B (en) A video abnormal behavior detection system
CN105654139B (en) A kind of real-time online multi-object tracking method using time dynamic apparent model
CN110111340A (en) The Weakly supervised example dividing method cut based on multichannel
CN108664924A (en) A kind of multi-tag object identification method based on convolutional neural networks
CN103854027A (en) Crowd behavior identification method
CN103984948B (en) A kind of soft double-deck age estimation method based on facial image fusion feature
CN109902564B (en) Abnormal event detection method based on structural similarity sparse self-coding network
CN105260738A (en) Method and system for detecting change of high-resolution remote sensing image based on active learning
CN103473539A (en) Gait recognition method and device
CN110653824B (en) Method for characterizing and generalizing discrete trajectory of robot based on probability model
CN105389550A (en) Remote sensing target detection method based on sparse guidance and significant drive
CN111832615A (en) Sample expansion method and system based on foreground and background feature fusion
CN106446922A (en) Crowd abnormal behavior analysis method
Xu et al. Grad-CAM guided channel-spatial attention module for fine-grained visual classification
CN104834916A (en) Multi-face detecting and tracking method
CN110728694A (en) Long-term visual target tracking method based on continuous learning
CN105046714A (en) Unsupervised image segmentation method based on super pixels and target discovering mechanism
CN103106394A (en) Human body action recognition method in video surveillance
CN104063880A (en) PSO based multi-cell position outline synchronous accurate tracking system
CN112381248A (en) Power distribution network fault diagnosis method based on deep feature clustering and LSTM
Sun et al. Modeling and recognizing human trajectories with beta process hidden Markov models
CN110334584A (en) A kind of gesture identification method based on the full convolutional network in region
CN112784921A (en) Task attention guided small sample image complementary learning classification algorithm
CN110348492A (en) A kind of correlation filtering method for tracking target based on contextual information and multiple features fusion
CN106056627A (en) Robustness object tracking method based on local identification sparse representation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160309

Termination date: 20210723

CF01 Termination of patent right due to non-payment of annual fee