CN105046195B - Human body behavior recognition method based on an asymmetric generalized Gaussian model - Google Patents

Human body behavior recognition method based on an asymmetric generalized Gaussian model Download PDF

Info

Publication number
CN105046195B
CN105046195B · CN201510313321.1A
Authority
CN
China
Prior art keywords
video
formula
time
space
gradient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510313321.1A
Other languages
Chinese (zh)
Other versions
CN105046195A (en)
Inventor
李俊峰 (Li Junfeng)
方建良 (Fang Jianliang)
张飞燕 (Zhang Feiyan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN201510313321.1A priority Critical patent/CN105046195B/en
Publication of CN105046195A publication Critical patent/CN105046195A/en
Application granted granted Critical
Publication of CN105046195B publication Critical patent/CN105046195B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training

Abstract

The invention discloses a human body behavior recognition method based on an asymmetric generalized Gaussian model. A training video is first preprocessed and its spatio-temporal interest points are detected; video blocks are then extracted around each interest point, and their optical-flow and gradient information is computed. Histograms are drawn from the resulting optical-flow and gradient data and fitted with an asymmetric generalized Gaussian distribution (AGGD); the AGGD parameters of the optical-flow and gradient information serve as features and form the feature matrix of the training video. The same processing is applied to a test video to obtain its feature matrix. Finally, the Mahalanobis distance between the feature matrices of the training and test videos is computed, and the behavior in the test video is identified by the nearest-neighbour rule. The method of the present invention substantially improves the accuracy of video behavior recognition.

Description

Human body behavior recognition method based on an asymmetric generalized Gaussian model
Technical field
The present invention relates to a human body behavior recognition method. It belongs to the fields of computer vision and machine learning, and is specifically a human behavior recognition algorithm.
Background technology
In recent years, research on intelligent video surveillance technology has attracted growing attention and application. As a basic processing step of such systems, behavior recognition is a very active research direction and an important topic in computer vision.
Current research methods can be divided into two classes: behavior recognition based on global features and on local features. Global features generally describe the entire detected human region with information such as edges, optical flow, and silhouette contours, and are sensitive to noise, viewpoint changes, and partial occlusion. For example, one approach adaptively updates the background with a Gaussian mixture model, labels the moving-foreground regions extracted from the video sequence, computes optical flow in those regions with the Lucas-Kanade method, and describes behavior with an amplitude-weighted orientation histogram. Another uses a dual-background model for adaptive background updating, computes Lucas-Kanade optical flow in the minimal bounding rectangles of the extracted foreground, and recognizes behavior with the unit weighted optical-flow energy of the moving target. Some scholars first extract the optical flow of a video, reduce its dimensionality with an empirical covariance matrix to obtain a covariance descriptor, map it to a vector space by the matrix logarithm, and perform behavior recognition with the resulting log-covariance descriptor. Another work extracts a histogram-of-optical-flow feature to describe motion, which requires neither human segmentation nor background subtraction. A behavior recognition method based on bags of sampled 3D-joint features from depth images describes human behavior through 3D joints, characterizing human posture, extracted from range image sequences. Others detect abnormal behavior through motion-pattern analysis: optical flow computed over the video defines trajectories that generate a motion model, the trajectories are hierarchically clustered with spatio-temporal information to learn statistical motion patterns, and this statistical model is finally used for anomaly detection.
Local features describe blocks or points of interest on the human body. They require no precise localization or tracking of the body and are insensitive to partial occlusion and viewpoint changes, so they are used comparatively often in behavior recognition. Examples include extracting quantization parameters and motion vectors from compressed video sequences as features; describing video behavior with 3D-HOG and optical-flow features; extracting 3D-SIFT features from the video sequence; combining HOG and HOF features to describe spatio-temporal cuboids extracted from the video; extracting bags of spatio-temporal words from the video and classifying behavior with a labeled latent Dirichlet allocation model; and a fast dense-trajectory method that extracts dense trajectory features from regions of interest in video frames and performs recognition with a temporal pyramid to adapt to the different speeds of actions. One method removes the image background by preprocessing and then computes more accurate optical-flow features after detecting image interest points with the Harris corner detector. Another first detects the position and direction of motion with optical flow, localizes the most salient motion in the frame by random sample consensus, then locates a small rectangle around the human motion from the mean difference and standard deviation of the horizontal and vertical positions of interest points in the optical-flow field; this rectangle is divided into several blocks, optical flow is computed frame by frame at the interest points and assembled into a matrix, the matrices of the same behavior are summed and averaged to represent that behavior, and a simple classifier finally performs recognition.
How to obtain features from an image sequence that effectively express human motion is the key to human behavior recognition. Optical flow is a relatively good spatio-temporal feature and a motion feature commonly used in action recognition. The methods above either mark the moving-foreground regions after extracting the video foreground and then compute optical flow on them, or partition the regular motion region after detecting the entire human body and then compute optical flow. For the various human behaviors, the optical flow of body parts with little motion is negligible, yet the methods above compute optical flow over the entire human region, which both increases computation and reduces recognition accuracy. As for spatio-temporal features, existing descriptors apply PCA dimensionality reduction and then build a bag-of-words codebook, i.e., the training data are sampled and clustered to generate a "dictionary"; this prevents the training samples from being fully used. Moreover, to guarantee a certain average recognition rate, the sample data volume remains too high even after dimensionality reduction, and clustering is slow. In addition, the feature data of different directions may be somewhat similar, and clustering all directions together reduces the descriptiveness of direction-specific features for behavior.
Invention content
The technical problem to be solved by the present invention is to provide a human body behavior recognition method based on an asymmetric generalized Gaussian model.
To solve the above technical problem, the present invention provides a human body behavior recognition method based on an asymmetric generalized Gaussian model, realized with a training video library and a test video, comprising the following steps. Step 1: detect interest points in the given training video library and in the test video. Step 2: extract video blocks centered on the interest points. Step 3: compute the video-block information of the training and test videos separately, obtaining the gradient feature data in the three directions X, Y, Z and the two optical-flow components u, v. Step 4: draw the three gradient direction histograms and the two optical-flow direction histograms from these data. Step 5: fit each histogram with an asymmetric generalized Gaussian model. Step 6: extract the parameters of the asymmetric generalized Gaussian model as features, forming the feature matrix of each behavior in the training video library and the feature matrix of the test video. Step 7: compute the Mahalanobis distance between the test-video feature matrix and each behavior feature matrix of the training videos. Step 8: perform behavior recognition by the nearest-neighbour rule.
As an improvement of the human body behavior recognition method based on the asymmetric generalized Gaussian model of the present invention, the interest point detection is as follows. The video is regarded as an image sequence f(x, y, t) composed of multiple frames. A linear scale-space representation L of f is defined by convolving the image sequence f with a Gaussian kernel with independent spatial variance σ_l² and temporal variance τ_l²:

L(x, y, t; σ_l², τ_l²) = f(x, y, t) * g(x, y, t; σ_l², τ_l²)  (9)

The spatio-temporal Gaussian window g(x, y, t; σ_l², τ_l²) is defined as:

g(x, y, t; σ_l², τ_l²) = (1 / √((2π)³ σ_l⁴ τ_l²)) · exp(−(x² + y²)/(2σ_l²) − t²/(2τ_l²))  (10)

where σ_l is the spatial scale variable, τ_l is the temporal scale variable, and t is the time dimension. The response function R is defined as:

R(x, y, t) = (I * g * h_ev)² + (I * g * h_od)²  (11)

where * is the convolution operator, I is the video image, g is the two-dimensional Gaussian smoothing kernel, and h_ev and h_od are a quadrature pair of one-dimensional Gabor filters applied along the temporal dimension; h_ev and h_od are defined as:

h_ev(t; τ, ω) = −cos(2πtω) · exp(−t²/τ²)  (12)

h_od(t; τ, ω) = −sin(2πtω) · exp(−t²/τ²)  (13)

In formulas (12) and (13), ω = 4/τ; σ and τ are the detection scales in the spatial and temporal domains, taken as σ = 2 and τ = 3, and the Gaussian smoothing scale is 2. The neighborhood of each maximum of the response function R contains local human motion information in I(x, y, t).
As a further improvement of the human body behavior recognition method based on the asymmetric generalized Gaussian model of the present invention, the gradient feature data and the two optical-flow components u, v are obtained as follows. Extraction of spatio-temporal feature points: after interest point detection is performed on the image sequence, spatio-temporal interest points are obtained; a spatio-temporal cuboid is defined centered on each interest point, and the pixels of this cuboid are extracted to construct the spatio-temporal features. Let the spatio-temporal cuboid be I(x, y, t); then its gradients along the X, Y, and Z axes, G_x, G_y, G_z, are defined as:

G_x(x, y, t) = L(x+1, y, t) − L(x−1, y, t),  (14)

G_y(x, y, t) = L(x, y+1, t) − L(x, y−1, t),  (15)

G_z(x, y, t) = L(x, y, t+1) − L(x, y, t−1),  (16)

where L is the Gaussian-smoothed image sequence obtained from the convolution filtering above;
Extraction of optical-flow features: optical flow is computed with the Lucas-Kanade method. At time t, let the pixel (x, y) be at position 1 with gray value I(x, y, t); at time (t+Δt) the pixel has moved to position 2, its position change being (x+Δx, y+Δy), with new gray value I(x+Δx, y+Δy, t+Δt). Under the brightness constancy assumption,

I(x, y, t) = I(x+Δx, y+Δy, t+Δt)  (17)

Let u and v be the components of the optical-flow vector of pixel (x, y) along the x and y directions, i.e. u = Δx/Δt and v = Δy/Δt. The Taylor expansion of formula (17) is:

I(x+Δx, y+Δy, t+Δt) = I(x, y, t) + (∂I/∂x)Δx + (∂I/∂y)Δy + (∂I/∂t)Δt + ε  (18)

After the higher-order terms ε beyond second order are ignored, it follows that:

(∂I/∂x)Δx + (∂I/∂y)Δy + (∂I/∂t)Δt = 0  (19)

Since Δt → 0, dividing by Δt gives:

I_x u + I_y v + I_t = 0  (20)

In formula (20), I_x, I_y, I_t are the partial derivatives of the pixel (x, y) along the three directions x, y, t. This can be expressed in the vector form:

∇I · U = −I_t  (21)

In formula (21), ∇I = (I_x, I_y)ᵀ is the gradient direction and U = (u, v)ᵀ denotes the optical flow. Assuming that the optical flow within a window of specified size is constant, the optical-flow constraint equations within the window can be solved to obtain the flow (u, v) of a feature window of size x × x, i.e.:

Σᵢ I_xi² · u + Σᵢ I_xi I_yi · v = −Σᵢ I_xi I_ti
Σᵢ I_xi I_yi · u + Σᵢ I_yi² · v = −Σᵢ I_yi I_ti  (22)

In formula (22), i indexes the pixels in the feature window, i = 1, …, (x × x); I_x and I_y are the spatial gradients of the image and I_t is the temporal gradient. Solving formula (22) gives:

U = (AᵀA)⁻¹ Aᵀ b,  A = [∇I₁, …, ∇Iₙ]ᵀ,  b = −(I_t1, …, I_tn)ᵀ  (23)
As a further improvement of the human body behavior recognition method based on the asymmetric generalized Gaussian model of the present invention, the parameter-feature extraction based on the asymmetric generalized Gaussian model is as follows. The expression of the asymmetric generalized Gaussian model is:

f(x; α, β_l, β_r, u) = (α / ((β_l + β_r)Γ(1/α))) · exp(−((u − x)/β_l)^α),  x < u
f(x; α, β_l, β_r, u) = (α / ((β_l + β_r)Γ(1/α))) · exp(−((x − u)/β_r)^α),  x ≥ u  (24)

In formula (24), β_l = σ_l √(Γ(1/α)/Γ(3/α)) and β_r = σ_r √(Γ(1/α)/Γ(3/α)), where Γ(·) is the gamma function, whose expression is:

Γ(x) = ∫₀^∞ t^(x−1) e^(−t) dt,  x > 0  (25)

After the asymmetric generalized Gaussian model is fitted to the feature data, its five parameters (α, β_l, β_r, v, u) are extracted and used as the feature.
The present invention first preprocesses the training video and detects its spatio-temporal interest points, then extracts video blocks centered on the interest points and computes their optical-flow and gradient information. Histograms are drawn from the resulting optical-flow and gradient information and fitted with the asymmetric generalized Gaussian distribution (AGGD), and the AGGD parameters of the optical-flow and gradient information form the feature matrix of the training video as features. The same processing is applied to the test video to obtain its feature matrix. Finally, the Mahalanobis distance between the feature matrices of the training and test videos is computed, and the behavior in the test video is identified by the nearest-neighbour rule. The method of the present invention substantially improves the accuracy of video behavior recognition.
Description of the drawings
The specific implementation mode of the present invention is described in further detail below in conjunction with the accompanying drawings.
Fig. 1 is the training-video processing flow of the present invention;
Fig. 2 is the behavior recognition flow;
Fig. 3 illustrates corners;
Fig. 4 is a schematic diagram of the elliptic function;
Fig. 5 relates eigenvalues to corners;
Fig. 6 shows the extraction of the AGGD parameters of the gradient features;
Fig. 7 shows behavior recognition with the AGGD parameter features of the gradient features;
Fig. 8 shows the recognition rates of the gradient-feature AGGD parameter features on the Weizmann database;
Fig. 9 shows the recognition rates of the gradient-feature AGGD parameter features on the KTH database;
Fig. 10 shows the extraction of the optical-flow AGGD parameter features;
Fig. 11 shows behavior recognition with the optical-flow AGGD parameter features;
Fig. 12 shows the recognition rates of the optical-flow AGGD parameter features on the Weizmann database;
Fig. 13 shows the recognition rates of the optical-flow AGGD parameter features on the KTH database;
Fig. 14 shows the extraction of the fused gradient and optical-flow parameter features;
Fig. 15 shows behavior recognition with the fused gradient and optical-flow AGGD parameter features;
Fig. 16 shows the recognition rates of the fused gradient and optical-flow AGGD parameter features on the Weizmann database;
Fig. 17 shows the recognition rates of the fused gradient and optical-flow AGGD parameter features on the KTH database.
Specific implementation mode
Embodiment 1. Figs. 1-17 illustrate a human body behavior recognition method based on an asymmetric generalized Gaussian model, comprising the following steps.
Step 1: extract gradient and optical-flow feature data from the training video library, forming a feature data set for each feature direction (here the feature directions are the 3 directions of the gradient features and the 2 components of the optical-flow features).
Step 2: describe the feature data of each feature direction with a histogram.
Step 3: fit the histograms with the asymmetric generalized Gaussian model (AGGD) and take the AGGD parameters as features, forming the parameter-feature matrix of each behavior.
Step 4: for the test video, likewise extract the gradient and optical-flow features, forming the feature data set of each feature direction.
Step 5: describe the feature data of step 4 with histograms.
Step 6: likewise fit the histograms with the AGGD and take the AGGD parameters as features, forming the parameter-feature matrix of the test video.
Step 7: compute the Mahalanobis distances between the feature matrix of the test video and the feature matrices of each behavior in the training video library.
Step 8: judge the behavior of the test video by the nearest-neighbour rule.
Steps 1-8 above consist mainly of interest point detection, feature point extraction and description, parameter-feature extraction based on the asymmetric generalized Gaussian model (AGGD), and behavior recognition based on the AGGD parameters; a minimal end-to-end sketch follows.
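For illustration only, the overall pipeline can be sketched in Python as follows. Here detect_interest_points, extract_cuboids, and feature_channels are hypothetical helper names standing in for the steps detailed in the sections below, fit_aggd is sketched in section 3, and the 64-bin histogram is an illustrative choice rather than a value taken from this patent.

import numpy as np

def video_feature_vector(video):
    """Steps 1-6: concatenated AGGD parameters of the gradient/flow histograms."""
    points = detect_interest_points(video)                  # step 1 (section 1)
    cuboids = extract_cuboids(video, points)                # step 2 (section 2)
    params = []
    for channel in feature_channels(cuboids):               # Gx, Gy, Gz, u, v (step 3)
        hist, edges = np.histogram(channel, bins=64, density=True)  # step 4
        params.extend(fit_aggd(hist, edges))                # steps 5-6 (section 3)
    return np.asarray(params)

def recognize(test_video, train_videos, train_labels):
    """Steps 7-8: Mahalanobis distance to each training video, nearest neighbour."""
    train = np.vstack([video_feature_vector(v) for v in train_videos])
    vi = np.linalg.pinv(np.cov(train.T))        # shared inverse covariance estimate
    x = video_feature_vector(test_video)
    d = [np.sqrt((x - t) @ vi @ (x - t)) for t in train]
    return train_labels[int(np.argmin(d))]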
1, point of interest detection is as follows:
In the present invention, in order to effectively detect the spatio-temporal interest points of the image sequence I(x, y, t), the following method is used:
We first define an image corner as the intersection of two edges, or equivalently as a point whose neighborhood contains two dominant directions, like the corner of a road or of a house. The neighborhood of a corner is typically a stable, information-rich region of the image with properties such as affine invariance, scale invariance, and rotation invariance. Human vision recognizes corners through a local region or small window, as shown in Fig. 3. If moving this specific small window in any direction produces a large gray-level change inside the window, a corner is present in the window (Fig. 3). If moving the window in any direction produces no gray-level change inside the window, there is no corner in the window (Fig. 3). If moving the window in one direction produces a large gray-level change while moving it in another direction produces none, the window may contain a straight edge (Fig. 3).
According to the auto-correlation function, the self-similarity of image I(x, y) at point (x, y) after a translation (Δx, Δy) can be expressed as:

c(x, y; Δx, Δy) = Σ_{(u,v)∈W(x,y)} ω(u, v) · [I(u, v) − I(u+Δx, v+Δy)]²  (1)

In formula (1), ω(u, v) is a weighting function, which may be a constant or a Gaussian weighting function, and W(x, y) is the window centered on the point (x, y).
By Taylor expansion, a first-order approximation of image I(x, y) after the translation (Δx, Δy) at point (x, y) gives:

I(u+Δx, v+Δy) ≈ I(u, v) + I_x(u, v)Δx + I_y(u, v)Δy  (2)

In formula (2), I_x and I_y are the partial derivatives of I(x, y).
Formula (1) can then be approximated as:

c(x, y; Δx, Δy) ≈ (Δx, Δy) M(x, y) (Δx, Δy)ᵀ  (3)

In formula (3), M(x, y) = Σ_{(u,v)∈W} ω(u, v) [I_x², I_x I_y; I_x I_y, I_y²]; that is, the auto-correlation function of image I(x, y) after the translation (Δx, Δy) at point (x, y) can be approximated by a quadratic function.
The quadratic function can essentially be regarded as an elliptic function, as shown in Fig. 4. The ellipticity and size of the ellipse are determined by the eigenvalues λ₁, λ₂ of M(x, y), and its orientation by the eigenvectors of M(x, y); the ellipse equation is:

(Δx, Δy) M(x, y) (Δx, Δy)ᵀ = 1  (4)

Corners, edges (straight lines), and flat regions in the windowed image can be judged from the magnitudes of the eigenvalues of the quadratic function, as shown in Fig. 5. When λ₁ ≪ λ₂ or λ₁ ≫ λ₂, i.e., the auto-correlation function is large only along one direction and small along the others, the window contains a straight edge. When λ₁ ≈ λ₂ and both are small, i.e., the auto-correlation function is small in every direction, the window contains a flat region. When λ₁ ≈ λ₂ and both are large, i.e., the auto-correlation function is large in every direction, the window contains a corner.
In fact, discriminating corners does not require computing the eigenvalues explicitly; it suffices to define a corner response function and judge corners from its value. The response function R is defined as:

R = det M − α (trace M)²  (5)

With M(x, y) of formula (3) abbreviated as M = [A C; C B], det M = AB − C² and trace M = A + B in formula (5) are the determinant and trace of M(x, y), and α is an empirical constant, generally taken as 0.04-0.06.
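For illustration, a minimal sketch of the corner response of formula (5); the Sobel derivatives and the Gaussian window scale are illustrative assumptions, not values prescribed by this patent.

import cv2
import numpy as np

def harris_response(gray, sigma=1.0, alpha=0.05):
    """Corner response R = det(M) - alpha * trace(M)^2 of formula (5)."""
    gray = gray.astype(np.float32)
    Ix = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)   # partial derivative I_x
    Iy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)   # partial derivative I_y
    # Entries A, B, C of M(x, y), accumulated under a Gaussian window w(u, v)
    A = cv2.GaussianBlur(Ix * Ix, (0, 0), sigma)
    B = cv2.GaussianBlur(Iy * Iy, (0, 0), sigma)
    C = cv2.GaussianBlur(Ix * Iy, (0, 0), sigma)
    return (A * B - C * C) - alpha * (A + B) ** 2     # corners: large positive R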
Extending the corner definition above yields Harris interest point detection, whose idea is to find positions in an image f_sp that vary significantly in all directions. Harris interest point detection can then be described as follows: define an image f_sp, and let L_sp be the result of linear Gaussian filtering of f_sp:

L_sp(x, y; σ_sp²) = g_sp(x, y; σ_sp²) * f_sp(x, y)  (6)

In formula (6), g_sp is the Gaussian kernel with which the image f_sp is convolved,

g_sp(x, y; σ_sp²) = (1/(2πσ_sp²)) · exp(−(x² + y²)/(2σ_sp²)),

with σ_sp² its scale factor.
Given an observation of formula (6), interest points are found with a second-moment matrix at integration scale σ_i² under a Gaussian window, computed from derivatives at a local scale σ_d²:

μ_sp(x, y; σ_i², σ_d²) = g_sp(·; σ_i²) * [ (L_x^sp)², L_x^sp L_y^sp ; L_x^sp L_y^sp, (L_y^sp)² ](x, y; σ_d²)  (7)

In formula (7), * is the convolution symbol, and L_x^sp and L_y^sp are the Gaussian derivatives along x and y at scale σ_d².
A second-moment descriptor can be regarded as the covariance matrix of the two-dimensional distribution of image orientations in a neighborhood. The eigenvalues λ₁ and λ₂ (λ₁ ≤ λ₂) of the matrix μ_sp therefore describe the variation of f_sp along the two image directions, and an interest point exists where both λ₁ and λ₂ are large. Based on this, Harris and Stephens proposed computing the maxima of a corner detection function, expressed as:

H_sp = det(μ_sp) − k × trace²(μ_sp) = λ₁λ₂ − k(λ₁ + λ₂)²  (8)

At positions where interest points exist, the eigenvalue ratio α = λ₂/λ₁ is large. From formula (8), for H_sp to attain a positive maximum, the eigenvalue ratio α must satisfy k ≤ α/(1+α)². If k = 0.25 is chosen, then α = 1 and λ₁ = λ₂ at the positive maximum of H, and the interest points have ideal isotropy.
Since this patent detects interest points in a video (an image sequence), the video is regarded as an image sequence f(x, y, t) composed of multiple frames. A linear scale-space representation L of f is defined by convolving the image sequence f with a Gaussian kernel with independent spatial variance σ_l² and temporal variance τ_l²:

L(x, y, t; σ_l², τ_l²) = f(x, y, t) * g(x, y, t; σ_l², τ_l²)  (9)

The spatio-temporal Gaussian window g(x, y, t; σ_l², τ_l²) is defined as:

g(x, y, t; σ_l², τ_l²) = (1 / √((2π)³ σ_l⁴ τ_l²)) · exp(−(x² + y²)/(2σ_l²) − t²/(2τ_l²))  (10)

where σ_l is the spatial scale variable, τ_l is the temporal scale variable, and t is the time dimension.
The interest point detection used in this patent follows the image interest point method above along the spatial dimensions, and along the temporal dimension uses the Gabor filters proposed by Dollar. The response function R is then defined as:

R(x, y, t) = (I * g * h_ev)² + (I * g * h_od)²  (11)

where * is the convolution operator, I is the video image, g is the two-dimensional Gaussian smoothing kernel, and h_ev and h_od are a quadrature pair of one-dimensional Gabor filters applied along the temporal dimension, defined as:

h_ev(t; τ, ω) = −cos(2πtω) · exp(−t²/τ²)  (12)

h_od(t; τ, ω) = −sin(2πtω) · exp(−t²/τ²)  (13)

In formulas (12) and (13), ω = 4/τ; σ and τ are the detection scales in the spatial and temporal domains, taken in the present invention as σ = 2 and τ = 3, and the Gaussian smoothing scale is 2.
The neighborhood of each maximum of the response function R contains the local human motion information in I(x, y, t).
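A minimal sketch of this detector under stated assumptions: the video is a grayscale array of shape (T, H, W), the temporal filter support is truncated at ±2τ, and the maximum-filter size and threshold are illustrative choices.

import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def response_function(video, sigma=2.0, tau=3.0):
    """R = (I*g*h_ev)^2 + (I*g*h_od)^2 of formula (11)."""
    I = gaussian_filter(video.astype(np.float64), sigma=(0, sigma, sigma))  # spatial g
    omega = 4.0 / tau
    t = np.arange(-2 * int(round(tau)), 2 * int(round(tau)) + 1)
    h_ev = -np.cos(2 * np.pi * t * omega) * np.exp(-t**2 / tau**2)  # formula (12)
    h_od = -np.sin(2 * np.pi * t * omega) * np.exp(-t**2 / tau**2)  # formula (13)
    conv_t = lambda h: np.apply_along_axis(np.convolve, 0, I, h, mode='same')
    return conv_t(h_ev) ** 2 + conv_t(h_od) ** 2

def interest_points(R, size=5, thresh=1e-4):
    """Local maxima of R above a threshold, returned as (t, y, x) triples."""
    peaks = (R == maximum_filter(R, size=size)) & (R > thresh)
    return np.argwhere(peaks)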
2. Feature extraction and description:
2.1 Extraction of spatio-temporal features:
After interest point detection is performed on the image sequence, a series of spatio-temporal interest points is obtained, but these points alone cannot effectively describe human behavior. The present invention defines a spatio-temporal cuboid centered on each spatio-temporal interest point and extracts the pixels of this cuboid to construct the spatio-temporal features; the side length of the cuboid is six times the scale at which the point was detected. This cuboid contains most of the pixels that contribute to the maxima of the response function.
Methods for describing a spatio-temporal cuboid include flattening its values into a vector, pixel-normalized description, and histogram description. Because the image brightness near an interest point changes drastically during human motion, and changes differently for different motion behaviors, the brightness variation near interest points can be used to describe the interest points of different human behaviors. The brightness variation near the interest points of different behaviors is reflected by the gradients of the cuboid brightness along the X, Y, and Z (i.e. time) axes, and this patent extracts these gradients as features for human behavior recognition.
Let the spatio-temporal cuboid be I(x, y, t); then its gradients along the X, Y, and Z axes, G_x, G_y, G_z, are defined as:

G_x(x, y, t) = L(x+1, y, t) − L(x−1, y, t),  (14)

G_y(x, y, t) = L(x, y+1, t) − L(x, y−1, t),  (15)

G_z(x, y, t) = L(x, y, t+1) − L(x, y, t−1),  (16)

where L is the Gaussian-smoothed image sequence obtained from the convolution filtering above.
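Formulas (14)-(16) translate directly into the following sketch for a smoothed cuboid indexed as (t, y, x); leaving the boundary voxels at zero is an assumption of this illustration.

import numpy as np

def cuboid_gradients(L):
    """G_x, G_y, G_z of formulas (14)-(16) for a cuboid L indexed as (t, y, x)."""
    Gx = np.zeros_like(L); Gy = np.zeros_like(L); Gz = np.zeros_like(L)
    Gx[:, :, 1:-1] = L[:, :, 2:] - L[:, :, :-2]   # along x (formula 14)
    Gy[:, 1:-1, :] = L[:, 2:, :] - L[:, :-2, :]   # along y (formula 15)
    Gz[1:-1, :, :] = L[2:, :, :] - L[:-2, :, :]   # along t (formula 16)
    return Gx, Gy, Gz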
2.2 Extraction of optical-flow features:
The optical-flow field is a vector field describing how an image sequence changes over time; it contains the instantaneous motion velocity of each pixel and is a relatively good spatio-temporal feature. However, optical flow is expensive to compute; to reduce computation by computing the flow only on the extracted video blocks, this patent uses the Lucas-Kanade method.
Principle of optical-flow computation:
At time t, let the pixel (x, y) be at position 1 with gray value I(x, y, t); at time (t+Δt) the pixel has moved to position 2, its position change being (x+Δx, y+Δy), with new gray value I(x+Δx, y+Δy, t+Δt). Under the brightness constancy assumption,

I(x, y, t) = I(x+Δx, y+Δy, t+Δt)  (17)

Let u and v be the components of the optical-flow vector of pixel (x, y) along the x and y directions, i.e. u = Δx/Δt and v = Δy/Δt. The Taylor expansion of formula (17) is:

I(x+Δx, y+Δy, t+Δt) = I(x, y, t) + (∂I/∂x)Δx + (∂I/∂y)Δy + (∂I/∂t)Δt + ε  (18)

After the higher-order terms ε beyond second order are ignored, it follows that:

(∂I/∂x)Δx + (∂I/∂y)Δy + (∂I/∂t)Δt = 0  (19)

Since Δt → 0, dividing by Δt gives:

I_x u + I_y v + I_t = 0  (20)

In formula (20), I_x, I_y, I_t are the partial derivatives of the pixel (x, y) along the three directions x, y, t. This can be expressed in the vector form:

∇I · U = −I_t  (21)

In formula (21), ∇I = (I_x, I_y)ᵀ is the gradient direction and U = (u, v)ᵀ denotes the optical flow.
Lucas-Kanade optical flow: this patent uses the Lucas-Kanade [25] method to compute optical flow. Assuming that the flow within a window of specified size is constant, the optical-flow constraint equations within the window can be solved to obtain the flow (u, v) of a feature window of size x × x, i.e.:

Σᵢ I_xi² · u + Σᵢ I_xi I_yi · v = −Σᵢ I_xi I_ti
Σᵢ I_xi I_yi · u + Σᵢ I_yi² · v = −Σᵢ I_yi I_ti  (22)

In formula (22), i indexes the pixels in the feature window, i = 1, …, (x × x); I_x and I_y are the spatial gradients of the image and I_t is the temporal gradient. Solving formula (22) gives the least-squares solution:

U = (AᵀA)⁻¹ Aᵀ b,  A = [∇I₁, …, ∇Iₙ]ᵀ,  b = −(I_t1, …, I_tn)ᵀ  (23)
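A minimal sketch of formulas (22)-(23) for a single window, assuming two consecutive grayscale frames as NumPy arrays; the central-difference derivatives and the default window size are illustrative choices (in practice an existing routine such as OpenCV's cv2.calcOpticalFlowPyrLK could be used instead).

import numpy as np

def lucas_kanade_window(I1, I2, x, y, win=7):
    """Solve the window system of formula (22) for the flow (u, v) at (x, y)."""
    I1 = I1.astype(np.float64); I2 = I2.astype(np.float64)
    Ix = (np.roll(I1, -1, axis=1) - np.roll(I1, 1, axis=1)) / 2.0  # spatial gradients
    Iy = (np.roll(I1, -1, axis=0) - np.roll(I1, 1, axis=0)) / 2.0
    It = I2 - I1                                                   # temporal gradient
    r = win // 2
    sl = np.s_[y - r:y + r + 1, x - r:x + r + 1]
    A = np.stack([Ix[sl].ravel(), Iy[sl].ravel()], axis=1)  # rows are (I_xi, I_yi)
    b = -It[sl].ravel()
    uv, *_ = np.linalg.lstsq(A, b, rcond=None)              # formula (23)
    return uv  # (u, v)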
3. Parameter-feature extraction based on the asymmetric generalized Gaussian model (AGGD):
Although the fitted curves of the feature data are close to a Gaussian distribution, they are not strictly symmetric; based on this characteristic, this patent chooses the asymmetric generalized Gaussian distribution (AGGD) to fit the two classes of feature data.
The expression of the AGGD is:

f(x; α, β_l, β_r, u) = (α / ((β_l + β_r)Γ(1/α))) · exp(−((u − x)/β_l)^α),  x < u
f(x; α, β_l, β_r, u) = (α / ((β_l + β_r)Γ(1/α))) · exp(−((x − u)/β_r)^α),  x ≥ u  (24)

In formula (24), β_l = σ_l √(Γ(1/α)/Γ(3/α)) and β_r = σ_r √(Γ(1/α)/Γ(3/α)), where Γ(·) is the gamma function, whose expression is:

Γ(x) = ∫₀^∞ t^(x−1) e^(−t) dt,  x > 0  (25)

After the AGGD model is fitted to the feature data, its five parameters (α, β_l, β_r, v, u) are extracted.
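A possible fitting sketch by least squares on the histogram, assuming the density of formula (24); since the text does not spell out the fifth parameter, v is taken here as the asymmetry ratio β_l/β_r, which is an assumption of this illustration.

import numpy as np
from scipy.optimize import curve_fit
from scipy.special import gamma

def aggd_pdf(x, alpha, beta_l, beta_r, u):
    """Asymmetric generalized Gaussian density of formula (24), mode u."""
    c = alpha / ((beta_l + beta_r) * gamma(1.0 / alpha))
    z = np.where(x < u, (u - x) / beta_l, (x - u) / beta_r)  # left/right branch
    return c * np.exp(-z ** alpha)

def fit_aggd(hist, edges):
    """Fit the AGGD to a normalized histogram; return (alpha, beta_l, beta_r, v, u)."""
    centers = 0.5 * (edges[:-1] + edges[1:])
    s = np.std(centers) + 1e-6
    p0 = (2.0, s, s, centers[np.argmax(hist)])          # rough initial guess
    (alpha, beta_l, beta_r, u), _ = curve_fit(aggd_pdf, centers, hist,
                                              p0=p0, maxfev=20000)
    v = beta_l / beta_r   # assumed asymmetry ratio (see lead-in above)
    return alpha, beta_l, beta_r, v, u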
4. Behavior recognition based on the AGGD parameters:
4.1 Behavior recognition based on the AGGD parameters of the gradient features:
4.1.1 Human behavior recognition algorithm:
Based on the property that the gradient-feature AGGD parameters of most behaviors can be distinguished spatially from those of other behaviors, this patent chooses the five parameters (α, β_l, β_r, v, u) of the asymmetric generalized Gaussian distribution model of the gradient features as the feature; the feature definitions are given in Table 4.1.
Table 4.1 Feature definitions
According to the motion characteristics of the different behaviors, this patent extracts a corresponding number of spatio-temporal cuboids after interest point detection on each preprocessed behavior video, and extracts the features of Table 4.1 after describing the cuboids by their gradients (the flow is shown in Fig. 6); the Mahalanobis distance between the feature matrices of the training-set and test-set behavior videos is then computed, and finally behavior recognition is performed with a nearest-neighbour classifier, as shown in Fig. 7.
The nearest-neighbour classifier uses each sample of the training data as the discrimination rule: it finds the training sample closest to the sample to be classified and classifies on that basis. Suppose the training set has N samples {x₁, x₂, …, x_N} divided into y classes; then the distance d(x, xᵢ) from a sample x to be classified to a training sample xᵢ is:

d(x, xᵢ) = ‖x − xᵢ‖  (26)

If d(x, x_k) satisfies d(x, x_k) = min_{1≤i≤N} d(x, xᵢ) and x_k belongs to class ω_j, then x ∈ ω_j.  (27)
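For illustration, the decision rule can be sketched as follows, with the Euclidean norm of formula (26) replaced by the Mahalanobis distance actually used in the recognition step; estimating a shared inverse covariance from the training features is an assumption of this illustration.

import numpy as np

def nearest_neighbour(x, train_feats, train_labels):
    """Assign x the label of its nearest training sample, formulas (26)-(27),
    with Mahalanobis distance in place of the Euclidean norm."""
    vi = np.linalg.pinv(np.cov(train_feats.T))    # inverse covariance (assumed shared)
    diffs = train_feats - x
    d = np.sqrt(np.einsum('ij,jk,ik->i', diffs, vi, diffs))
    k = int(np.argmin(d))                         # formula (27): nearest sample
    return train_labels[k]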
4.1.2 Weizmann database experiments:
It can be observed from Fig. 8 that the recognition accuracies of the jump and side behaviors are relatively low compared with the other behaviors, at 0.82 and 0.844 respectively. The jump behavior is most often mistaken for the side behavior, with a misclassification rate as high as 0.072; the main reason is that the leg movements of the jump and side behaviors are similar, and the side behavior contains jump-like movements, so the gradient-feature values of the two behaviors are close and cannot be distinguished accurately. The side behavior is mainly mistaken for the walk behavior, with a misclassification rate of 0.048, because side is a sideways walk while walk is a forward walk; the footwork of the two behaviors is similar, differing only slightly in amplitude and speed, so their gradient-feature values share a certain similarity and the misclassification probability is relatively high.
4.1.3 KTH database experiments:
From Fig. 9, the misclassification rates of the handclap and handwave behaviors are relatively high compared with the other behaviors. The handclap behavior is mainly mistaken for the box and handwave behaviors, with misclassification rates of 0.056 and 0.048 respectively; the handwave behavior is mainly mistaken for the box behavior, with a misclassification rate of 0.072. The main reason is that part of the hand movements of these three behaviors are similar: all contain an arm extended forward, differing only in speed and amplitude, so their gradient-feature values share a certain similarity, which causes misclassification.
4.2 Human behavior recognition with the AGGD parameter features of the optical-flow features:
4.2.1 Human behavior recognition algorithm:
Similarly to the above, this patent fits the optical-flow feature data with the AGGD model and extracts its five parameters (α, β_l, β_r, v, u) as the feature; the feature definitions are given in Table 4.2.
Table 4.2 AGGD parameter-feature definitions of the optical-flow features
Consistent with the gradient-parameter recognition experiments above, a corresponding number of spatio-temporal cuboids is extracted after interest point detection on each preprocessed behavior video according to the motion characteristics of the different behaviors, and the features of Table 4.2 are extracted after describing the cuboids by optical flow (as shown in Fig. 10); behavior recognition is then performed by computing the Mahalanobis distance between the feature matrices of the training-set and test-set behavior videos and applying the nearest-neighbour classifier, as shown in Fig. 11.
4.2.2 Weizmann database experiments:
It can be observed from Fig. 12 that the recognition rates of the side and wave1 behaviors are relatively low compared with the other behaviors, at 0.856 and 0.76 respectively. The side behavior is mistaken for the jump behavior with a probability as high as 0.06, which may be related to the similarity of part of the footwork of the side and jump behaviors, making part of the optical-flow feature values of the two behaviors too close to distinguish accurately. The wave1 behavior is mainly mistaken for the wave2 and jack behaviors, with misclassification rates of 0.096 and 0.084 respectively, because all three behaviors contain arm-swinging movements that differ only in amplitude and speed, so the optical-flow feature values of the three behaviors share a certain similarity, which causes misclassification.
4.2.3 KTH database experiments:
Comparing the recognition accuracies of the behaviors in Fig. 13, the misclassification rates of the jog and handwave behaviors are relatively high. The jog behavior is mainly mistaken for run, with a misclassification rate of 0.096; the reason may be that both jog and run contain running movements differing only in speed (jog is slow, run is fast), so the optical-flow feature values of the two behaviors are too similar to distinguish accurately. The handwave behavior is mistaken for walk with a probability as high as 0.048, because part of the hand movements of the two behaviors are similar, differing only in action and amplitude.
4.3 Human behavior recognition with the AGGD parameter features of the fused gradient and optical-flow features:
4.3.1 Human behavior recognition algorithm:
From the recognition rates of the AGGD parameter features of the two feature types on the two databases above, the Weizmann database shows small differences between the recognition rates of the different features, whereas on the KTH database the recognition rate of the optical-flow AGGD parameter features is considerably higher than that of the gradient AGGD parameter features. Based on this, this patent proposes to concatenate the AGGD parameter features of the two feature types in order into one fused AGGD parameter feature for behavior recognition. The definitions of the fused gradient and optical-flow AGGD parameter features are given in Table 4.3.
Table 4.3 Fused gradient and optical-flow AGGD parameter-feature definitions
Consistent with the recognition method above, the features shown in Table 4.3 are extracted from the behavior videos of the training and test sets (as shown in Fig. 14); behavior recognition is then performed by computing the Mahalanobis distance between the feature matrices of the training-set and test-set behavior videos and applying the nearest-neighbour classifier, as shown in Fig. 15.
4.3.2 Weizmann database experiments:
As can be seen from Fig. 16, the lowest recognition rates belong to the jump and wave1 behaviors, at 0.856 and 0.844 respectively. The jump behavior is mistaken for the walk behavior with a probability as high as 0.084, because part of the footwork of the jump and walk behaviors is similar, making the AGGD parameter differences of the fused feature data too small to distinguish accurately. The wave1 behavior is mainly mistaken for the wave2 and jack behaviors, with misclassification rates of 0.084 and 0.072 respectively, because these three behaviors contain arm-swinging movements differing only in amplitude and speed; the AGGD parameter values of the fused feature data of wave1 and of wave2 and jack therefore share a certain similarity, which causes misclassification.
4.3.3 KTH database experiments:
Using the same experimental grouping method as above, the average behavior recognition rate on the KTH database is 95.2%. Fig. 17 is the confusion matrix of the recognition rates on the KTH database; the recognition rate of the jog behavior is relatively low compared with the other behaviors, being mistaken for the run behavior with a probability as high as 0.088. The main reason may be that jog and run are both running behaviors, jog slow and run fast, differing only in speed, so the fused feature parameter values of the two behaviors are too similar to distinguish accurately.
4.3.4 Recognition rates of the different AGGD parameter features on the two databases:
The table below gives the recognition rates of the three kinds of AGGD parameter features on the Weizmann and KTH databases. It can be observed that on the Weizmann database, the recognition rate of the optical-flow AGGD parameter features is the lowest at 90.16%, and the recognition rate of the fused AGGD parameter features is the highest at 93.16%; on the KTH database, the recognition rate of the gradient AGGD parameter features is the lowest at 88.40%, and the recognition rate of the fused AGGD parameter features is the highest at 95.20%.
From the recognition rates of the different parameters on the two databases, the recognition rate of the fused optical-flow and gradient parameters is higher than the recognition rates based on the gradient or optical-flow parameters alone.
Table 4.4 Recognition rates of the different AGGD parameter features on the two databases
Finally, it should be noted that the above is only a specific embodiment of the present invention. Obviously, the present invention is not limited to the above embodiment, and many variations are possible. All variations that a person skilled in the art can derive or infer directly from the disclosure of the present invention shall be considered within the protection scope of the present invention.

Claims (3)

1. A human body behavior recognition method based on an asymmetric generalized Gaussian model, realized with a training video library and a test video, characterized by comprising the following steps:
Step 1 carries out point of interest detection respectively for given training video library and test video;
Step 2 extracts video block centered on point of interest;
Point of interest detection is as follows:
The video is regarded as an image sequence f(x, y, t) composed of multiple frames;
a linear scale-space representation L of f is defined by convolving the image sequence f with a Gaussian kernel with independent spatial variance σ_l² and temporal variance τ_l²:

L(x, y, t; σ_l², τ_l²) = f(x, y, t) * g(x, y, t; σ_l², τ_l²)  (9)

the spatio-temporal Gaussian window g(x, y, t; σ_l², τ_l²) is defined as:

g(x, y, t; σ_l², τ_l²) = (1 / √((2π)³ σ_l⁴ τ_l²)) · exp(−(x² + y²)/(2σ_l²) − t²/(2τ_l²))  (10)

where σ_l is the spatial scale variable, τ_l is the temporal scale variable, and t is the time dimension;
the response function R is defined as:

R(x, y, t) = (I * g * h_ev)² + (I * g * h_od)²  (11)

where * is the convolution operator, I is the video image, g is the two-dimensional Gaussian smoothing kernel, and h_ev and h_od are a quadrature pair of one-dimensional Gabor filters applied along the temporal dimension;
h_ev and h_od are defined as:

h_ev(t; τ, ω) = −cos(2πtω) · exp(−t²/τ²)  (12)

h_od(t; τ, ω) = −sin(2πtω) · exp(−t²/τ²)  (13)

in formulas (12) and (13), ω = 4/τ; σ and τ are the detection scales in the spatial and temporal domains, taken as σ = 2 and τ = 3, and the Gaussian smoothing scale is 2;
the neighborhood of each maximum of the response function R contains the local human motion information in I(x, y, t);
Step 3: compute the video-block information of the training video and the test video separately, obtaining the gradient feature data in the three directions X, Y, Z and the two optical-flow components u, v;
Step 4: draw the three gradient direction histograms and the two optical-flow direction histograms from the above data;
Step 5: fit each histogram with the asymmetric generalized Gaussian model;
Step 6: extract the parameters of the asymmetric generalized Gaussian model as features, forming the feature matrix of each behavior in the training video library and the feature matrix of the test video;
Step 7: compute the Mahalanobis distance between the test-video feature matrix and each behavior feature matrix of the training videos;
Step 8: perform behavior recognition according to the nearest-neighbour rule.
2. The human body behavior recognition method based on the asymmetric generalized Gaussian model according to claim 1, characterized in that in step 3, the computation of the gradient feature data and the two optical-flow components u, v comprises the following:
extraction of spatio-temporal feature points:
after interest point detection is performed on the image sequence, spatio-temporal interest points are obtained; a spatio-temporal cuboid is defined centered on each spatio-temporal interest point, and the pixels of this cuboid are extracted to construct the spatio-temporal features;
let the spatio-temporal cuboid be I(x, y, t); then its gradients along the X, Y, and Z axes, G_x, G_y, G_z, are defined as:

G_x(x, y, t) = L(x+1, y, t) − L(x−1, y, t),  (14)

G_y(x, y, t) = L(x, y+1, t) − L(x, y−1, t),  (15)

G_z(x, y, t) = L(x, y, t+1) − L(x, y, t−1),  (16)

where L is the Gaussian-smoothed image sequence obtained from the convolution filtering above;
extraction of optical-flow features:
optical flow is computed with the Lucas-Kanade method:
at time t, the pixel (x, y) is at position 1 with gray value I(x, y, t); at time (t+Δt) the pixel has moved to position 2, its position change being (x+Δx, y+Δy), with new gray value I(x+Δx, y+Δy, t+Δt); under the brightness constancy assumption,

I(x, y, t) = I(x+Δx, y+Δy, t+Δt)  (17)

let u and v be the components of the optical-flow vector of pixel (x, y) along the x and y directions, i.e. u = Δx/Δt and v = Δy/Δt; the Taylor expansion of formula (17) is:

I(x+Δx, y+Δy, t+Δt) = I(x, y, t) + (∂I/∂x)Δx + (∂I/∂y)Δy + (∂I/∂t)Δt + ε  (18)

after the higher-order terms ε beyond second order are ignored, it follows that:

(∂I/∂x)Δx + (∂I/∂y)Δy + (∂I/∂t)Δt = 0  (19)

since Δt → 0, dividing by Δt gives:

I_x u + I_y v + I_t = 0  (20)

in formula (20), I_x, I_y, I_t are the partial derivatives of the pixel (x, y) along the three directions x, y, t; this can be expressed in the vector form:

∇I · U = −I_t  (21)

in formula (21), ∇I = (I_x, I_y)ᵀ is the gradient direction and U = (u, v)ᵀ denotes the optical flow;
assuming that the optical flow within a window of specified size is constant, the optical-flow constraint equations within the window can be solved to obtain the flow (u, v) of a feature window of size x × x, i.e.:

Σᵢ I_xi² · u + Σᵢ I_xi I_yi · v = −Σᵢ I_xi I_ti
Σᵢ I_xi I_yi · u + Σᵢ I_yi² · v = −Σᵢ I_yi I_ti  (22)

in formula (22), i indexes the pixels in the feature window, i = 1, …, (x × x); I_x and I_y are the spatial gradients of the image and I_t is the temporal gradient;
solving formula (22) gives:

U = (AᵀA)⁻¹ Aᵀ b,  A = [∇I₁, …, ∇Iₙ]ᵀ,  b = −(I_t1, …, I_tn)ᵀ  (23)
3. The human body behavior recognition method based on the asymmetric generalized Gaussian model according to claim 2, characterized in that in step 6, the parameter-feature extraction based on the asymmetric generalized Gaussian model is as follows:
the expression of the asymmetric generalized Gaussian model is:

f(x; α, β_l, β_r, u) = (α / ((β_l + β_r)Γ(1/α))) · exp(−((u − x)/β_l)^α),  x < u
f(x; α, β_l, β_r, u) = (α / ((β_l + β_r)Γ(1/α))) · exp(−((x − u)/β_r)^α),  x ≥ u  (24)

in formula (24), β_l = σ_l √(Γ(1/α)/Γ(3/α)) and β_r = σ_r √(Γ(1/α)/Γ(3/α)), where Γ(·) is the gamma function, whose expression is:

Γ(x) = ∫₀^∞ t^(x−1) e^(−t) dt,  x > 0  (25)

after the asymmetric generalized Gaussian model is fitted to the feature data, its five parameters (α, β_l, β_r, v, u) are extracted and used as the feature.
CN201510313321.1A 2015-06-09 2015-06-09 Human body behavior recognition method based on asymmetric generalized Gaussian model Expired - Fee Related CN105046195B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510313321.1A CN105046195B (en) 2015-06-09 2015-06-09 Human body behavior recognition method based on asymmetric generalized Gaussian model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510313321.1A CN105046195B (en) 2015-06-09 2015-06-09 Human body behavior recognition method based on asymmetric generalized Gaussian model

Publications (2)

Publication Number Publication Date
CN105046195A CN105046195A (en) 2015-11-11
CN105046195B true CN105046195B (en) 2018-11-02

Family

ID=54452724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510313321.1A Expired - Fee Related CN105046195B (en) 2015-06-09 2015-06-09 Human body behavior recognition method based on asymmetric generalized Gaussian model

Country Status (1)

Country Link
CN (1) CN105046195B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105956606B (en) * 2016-04-22 2019-09-10 中山大学 A kind of pedestrian's identification method again based on asymmetry transformation
CN106250870B (en) * 2016-08-16 2019-10-08 电子科技大学 A kind of pedestrian's recognition methods again of joint part and global similarity measurement study
CN106485253B (en) * 2016-09-14 2019-05-14 同济大学 A kind of pedestrian of maximum particle size structured descriptor discrimination method again
CN106778595B (en) * 2016-12-12 2020-04-07 河北工业大学 Method for detecting abnormal behaviors in crowd based on Gaussian mixture model
CN106815600B (en) * 2016-12-27 2019-07-30 浙江工业大学 Depth co-ordinative construction and structural chemistry learning method for human behavior identification
CN107403182A (en) * 2017-05-26 2017-11-28 深圳大学 The detection method and device of space-time interest points based on 3D SIFT frameworks
CN108241849B (en) * 2017-08-28 2021-09-07 北方工业大学 Human body interaction action recognition method based on video
CN108230375B (en) * 2017-12-27 2022-03-22 南京理工大学 Registration method of visible light image and SAR image based on structural similarity rapid robustness
CN110135352B (en) * 2019-05-16 2023-05-12 南京砺剑光电技术研究院有限公司 Tactical action evaluation method based on deep learning
CN110837770B (en) * 2019-08-30 2022-11-04 深圳大学 Video behavior self-adaptive segmentation method and device based on multiple Gaussian models
CN111507275B (en) * 2020-04-20 2023-10-10 北京理工大学 Video data time sequence information extraction method and device based on deep learning
CN112926514A (en) * 2021-03-26 2021-06-08 哈尔滨工业大学(威海) Multi-target detection and tracking method, system, storage medium and application
CN114299602A (en) * 2021-11-09 2022-04-08 北京九州安华信息安全技术有限公司 Micro-amplitude motion image processing method
CN114399539A (en) * 2022-01-14 2022-04-26 合肥英睿系统技术有限公司 Method, apparatus and storage medium for detecting moving object
CN114614797B (en) * 2022-05-12 2022-09-30 之江实验室 Adaptive filtering method and system based on generalized maximum asymmetric correlation entropy criterion

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719216A (en) * 2009-12-21 2010-06-02 西安电子科技大学 Movement human abnormal behavior identification method based on template matching
CN104036287A (en) * 2014-05-16 2014-09-10 同济大学 Human movement significant trajectory-based video classification method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Human behavior recognition based on weighted optical-flow velocity components; Zhang Feiyan et al.; Journal of Zhejiang Sci-Tech University; 2015-01-10; Vol. 33, No. 1; pp. 115-123 *
Human behavior recognition based on direction-weighted local spatio-temporal features; Li Junfeng et al.; Journal of Image and Graphics; 2015-03-16; Vol. 20, No. 3; pp. 320-331 *

Also Published As

Publication number Publication date
CN105046195A (en) 2015-11-11

Similar Documents

Publication Publication Date Title
CN105046195B (en) Human body behavior recognition method based on asymmetric generalized Gaussian model
Jia et al. Visual tracking via adaptive structural local sparse appearance model
CN108549846B (en) Pedestrian detection and statistics method combining motion characteristics and head-shoulder structure
Chen et al. Constructing adaptive complex cells for robust visual tracking
CN107944431B (en) An intelligent recognition method based on motion change
CN106709936A (en) Single-target tracking method based on a convolutional neural network
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN106228565B (en) An X-ray image based weld defect detection method for oil pipelines
CN103605986A (en) Human motion recognition method based on local features
CN104978561A (en) Video motion behavior recognition method fusing gradient and optical-flow features
Naghavi et al. Integrated real-time object detection for self-driving vehicles
CN104915658B (en) An emotion component analysis method and system based on emotion distributed learning
CN105138983B (en) Pedestrian detection method based on weighted block models and selective search
Zhang et al. Human pose estimation and tracking via parsing a tree structure based human model
CN109255289A (en) A cross-aging face recognition method based on a unified-formulation generative model
Monir et al. Rotation and scale invariant posture recognition using Microsoft Kinect skeletal tracking feature
CN106529583A (en) Indoor scene recognition method based on the bag-of-visual-words model
CN105809713A (en) Object tracking method that enhances feature selection with an online Fisher discrimination mechanism
CN103020614A (en) Human movement recognition method based on spatio-temporal interest point detection
CN110599463A (en) Tongue image detection and positioning algorithm based on a lightweight cascaded neural network
CN106611158A (en) Method and equipment for obtaining human body 3D characteristic information
Song et al. Feature extraction and target recognition of moving image sequences
CN112270286A (en) Shadow-interference-resistant monochrome video target tracking method
Hao et al. Recognition of basketball players' action detection based on visual image and Harris corner extraction algorithm
CN103593661A (en) Human body action recognition method based on a ranking method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20181102

Termination date: 20190609

CF01 Termination of patent right due to non-payment of annual fee