CN103679215A - Video monitoring method based on group behavior analysis driven by visual big data - Google Patents

Video monitoring method based on group behavior analysis driven by visual big data

Info

Publication number
CN103679215A
CN103679215A (application CN201310746795.6A)
Authority
CN
China
Prior art keywords
behavior
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310746795.6A
Other languages
Chinese (zh)
Other versions
CN103679215B (en)
Inventor
黄凯奇
康运锋
曹黎俊
张旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN201310746795.6A
Publication of CN103679215A
Application granted
Publication of CN103679215B
Expired - Fee Related

Landscapes

  • Image Analysis (AREA)

Abstract

A computer-implemented video monitoring method comprises the steps of: receiving video data captured by a camera; establishing a group behavior model from the received video data; estimating the parameters of the group behavior model to obtain the various crowd behaviors present in a scene; using the obtained model to derive behavior feature sets for the different crowds; and converting the obtained feature sets and using the converted feature sets to obtain a statistical people count for each crowd behavior. The camera angle setting has general applicability, so the method can be used for counting people at open entrances and exits; its computational cost is small and meets real-time video processing requirements.

Description

Video monitoring method based on group behavior analysis driven by visual big data
Technical Field
The invention relates to a video monitoring method, and in particular to a video monitoring method based on a visual-big-data-driven group behavior analysis technique.
Background
Most conventional monitoring systems require dedicated personnel to judge the monitored video manually. This consumes considerable manpower, and a person who watches the same feed for a long time may overlook abnormalities, with negative consequences. An intelligent video monitoring system can identify different objects and, when an abnormal condition appears in the monitored picture, raise an alarm and provide useful information in the fastest and best way, effectively assisting monitoring personnel in obtaining accurate information and handling emergencies while minimizing false alarms and missed reports.
In the related art, video monitoring methods fall into two categories according to how crowd behavior is detected. The first category comprises multi-person behavior recognition methods based on motion tracking, which are challenged by the number of people in the crowd: when the crowd is large, occlusion is severe and individuals cannot be tracked, so these methods apply only to simple scenes with few people. The second category comprises crowd behavior recognition methods based on feature learning or behavior model construction, used mainly to detect abnormal crowd behaviors such as gathering, scattering, running, and fighting. These methods better suit scenes with many people: a model is built from extracted features and its parameters are obtained by machine learning, which improves the detection rate. However, one model cannot describe all behaviors, so a different model is required for each particular behavior, and the scarcity of training samples still makes it challenging to obtain optimal model parameters.
Disclosure of Invention
The invention aims to provide a video monitoring method which can detect and identify the behaviors of people and count the number of people with different behaviors.
In order to achieve the above object, a video monitoring method may include the steps of:
1) receiving video data captured by a camera;
2) establishing a group behavior model according to the received video data;
3) estimating parameters of the group behavior model to obtain various group behaviors in a scene;
4) obtaining behavior feature sets of different crowds by using the obtained group behavior model;
5) converting the resulting behavior feature sets and using the converted feature sets to obtain a statistical people count for each crowd behavior.
The technical scheme of the invention has the following advantages: 1) the mathematical model is simple, has few parameters, and is easy to train; 2) the method works in crowded environments and can compute the cumulative number of people exhibiting a specific behavior; 3) the camera angle setting has general applicability, so the method can count people at open entrances and exits; 4) the computation cost is small and meets real-time video processing requirements.
Drawings
FIG. 1 shows a flow diagram of a video surveillance method according to an embodiment of the invention;
FIG. 2 illustrates a word-document model structure according to an embodiment of the present invention;
FIG. 3 illustrates an example live scenario according to an embodiment of the present invention;
FIG. 4 illustrates a set of different population behavior features in a live scene in accordance with an embodiment of the invention;
FIG. 5 shows a schematic view of a geometric correction according to an embodiment of the invention;
FIG. 6 illustrates an example of on-site people counting in a park according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
It should be noted that in the drawings or description, the same drawing reference numerals are used for similar or identical parts. Implementations not depicted or described in the drawings are of a form known to those of ordinary skill in the art. Additionally, while exemplifications of parameters including particular values may be provided herein, it is to be understood that the parameters need not be exactly equal to the respective values, but may be approximated to the respective values within acceptable error margins or design constraints. In addition, directional terms such as "upper", "lower", "front", "rear", "left", "right", and the like, referred to in the following embodiments, are directions only referring to the drawings. Accordingly, the directional terminology is used for purposes of illustration and is in no way limiting.
According to the technical scheme, first, given the complexity of the crowd in a scene, a group behavior model is used to mine the various behaviors in the scene; then, for each of the K detected crowd behaviors, a behavior feature set is acquired; next, each behavior feature set is converted into, for example, a 5-dimensional feature vector to reduce the feature dimension, and a 5×G-dimensional feature vector is obtained by associating a time parameter; finally, an artificial neural network is trained with the obtained 5×G-dimensional feature vectors to count the cumulative number of people for each crowd behavior. The overall flow chart of the embodiment of the invention is shown in FIG. 1. A detailed description of embodiments of the invention follows.
Step 1: video data acquired by the camera is received and may be preprocessed, for example de-noised.
Step 2: a group behavior model is established based on the received video data.
Due to the complexity of crowd behavior, several different crowd behaviors often coexist in one scene, and it is difficult to describe them all with a single model. Therefore, the feature set of each behavior is obtained through a group behavior model, and crowd analysis is performed on these behavior feature sets. The group behavior model may be a word-document model: low-level features serve as words and video segments serve as documents, so that the crowd behaviors in the video (the hidden topics) are mined and the feature set of each crowd behavior (a set of low-level features) is obtained.
The low-level model features adopted by the embodiment of the invention are local motion information. For example, motion pixels may be obtained by the frame-difference method, the velocity vector of each motion pixel may then be computed with an optical flow method (Horn B K P, Schunck B G. Determining optical flow. Artificial Intelligence, 1981, 17(1-3): 185-203), and the features of each motion pixel, i.e. its position and motion velocity, obtained. Here each moving pixel is taken as a word $w_i$. A video segment may comprise M frames of images, i.e. M documents, each of which is represented by a set of words, i.e. the document $W = \{w_i, i = 1, \ldots, N\}$, where $w_i = \{x_i, y_i, u_i, v_i\}$, N is the number of motion pixels in the video frame, x is the horizontal position of a pixel, y its vertical position, u its velocity in the horizontal direction, and v its velocity in the vertical direction. Of course, those skilled in the art may employ other techniques known in the field of motion estimation to represent the document W.
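As an illustration only, this word-extraction step might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: OpenCV's Farneback dense optical flow stands in for the cited Horn-Schunck method, and the two thresholds are illustrative values.

```python
import cv2
import numpy as np

def extract_words(prev_gray, gray, diff_thresh=15, flow_thresh=0.5):
    """Extract motion-pixel 'words' w_i = (x, y, u, v) from two consecutive
    grayscale frames: frame differencing selects candidate motion pixels,
    dense optical flow supplies their velocity components (u, v).
    Farneback flow is a stand-in for the Horn-Schunck method cited above;
    diff_thresh and flow_thresh are illustrative assumptions."""
    # Motion mask from the frame-difference method
    motion_mask = cv2.absdiff(gray, prev_gray) > diff_thresh

    # Dense optical flow field: flow[y, x] = (u, v)
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)

    ys, xs = np.nonzero(motion_mask)
    u, v = flow[ys, xs, 0], flow[ys, xs, 1]

    # Keep only pixels that actually move; each row is one word (x, y, u, v)
    moving = np.hypot(u, v) > flow_thresh
    return np.stack([xs[moving], ys[moving], u[moving], v[moving]], axis=1)
```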
FIG. 2 illustrates the word-document model structure used by embodiments of the present invention, where $\alpha$ characterizes the relative strengths of the hidden topics in the document set, $\beta$ gives the probability distributions of all hidden topics, and the random variable $\pi_j$ characterizes document layer j: the magnitudes of the components of $\pi_j$ give the proportion of each hidden topic in the target document. In the word layer, $z_{ji}$ is the hidden topic assigned by target document j to word i, and $x_{ji}$ is the word-vector representation of the target document. Assume there are K behavior topics; each topic is a multinomial distribution over words, and $\alpha$ parameterizes a Dirichlet distribution over the corpus. For each document j, $\pi_j$ is drawn from the Dirichlet distribution $\mathrm{Dir}(\pi_j \mid \alpha)$. For each word i in document j, the topic $z_{ji}$ is drawn from the multinomial distribution given by $p(z_{ji} = k) = \pi_{jk}$, and the word $x_{ji}$ is drawn from the multinomial distribution with parameter $\beta_{z_{ji}}$. Here $\pi_j$ and $z_{ji}$ are hidden variables, while $\alpha$ and $\beta$ are the parameters to be optimized. Given $\alpha$ and $\beta$, the joint probability distribution of the random variable $\pi_j$, the topics $z_j = \{z_{ji}\}$, and the words $x_j = \{x_{ji}\}$ is shown in equation (1):
$$p(x_j, z_j, \pi_j \mid \alpha, \beta) = p(\pi_j \mid \alpha) \prod_{i=1}^{N} p(z_{ji} \mid \pi_j)\, p(x_{ji} \mid z_{ji}, \beta) = \frac{\Gamma\!\left(\sum_{k=1}^{K} \alpha_k\right)}{\prod_{k=1}^{K} \Gamma(\alpha_k)}\, \pi_{j1}^{\alpha_1 - 1} \cdots \pi_{jK}^{\alpha_K - 1} \prod_{i=1}^{N} \pi_{j z_{ji}}\, \beta_{z_{ji} x_{ji}} \qquad (1)$$
Therefore, the core problem in constructing the word-document model is inferring the distributions of the hidden variables, i.e. obtaining the hidden-topic composition $(\pi, z)$ of the target document. However, because the posterior distribution $p(z_j, \pi_j \mid x_j, \alpha, \beta)$ cannot be computed directly, it is approximated using the variational distribution of equation (2):
$$q(z_j, \pi_j \mid \gamma_j, \phi_j) = q(\pi_j \mid \gamma_j) \prod_{i=1}^{N} q(z_{ji} \mid \phi_{ji}) \qquad (2)$$
where $\gamma_j$ is the parameter of the Dirichlet distribution $q(\pi_j \mid \gamma_j)$ and $\phi_{ji}$ are the parameters of the multinomial distributions $q(z_{ji} \mid \phi_{ji})$. The optimal $(\gamma_j, \phi_j)$ can be computed by maximizing a lower bound on $\log p(x_j \mid \alpha, \beta)$.
Step 3: the parameters of the group behavior model are estimated to obtain the various crowd behaviors in the scene.
The optimal parameters $(\alpha^*, \beta^*)$ are obtained by maximizing $\log p(x_j \mid \alpha, \beta)$ over the document set, as shown in equation (3):
$$(\alpha^*, \beta^*) = \arg\max_{(\alpha, \beta)} \sum_{j=1}^{M} \log p(x_j \mid \alpha, \beta) \qquad (3)$$
Also due to p (x)j| α, β) is not straightforward to compute, the parameters (α, β) can be estimated by a variational maximum likelihood estimation EM method: in E-step, for each document j, find the optimal variation parameterThe above equation (2) is approximated using the variation distribution of the optimum variation parameter obtained by E-step, and the optimum parameter (α) is obtained by two-step loop calculation*,β*)。
As an example, FIG. 3 shows one frame of the received video data; mining this scene with the group behavior model of the present invention yields, for example, four hidden topics (crowd behaviors): up, down, left, and right.
Step 4: behavior feature sets of the different crowds are obtained using the obtained group behavior model.
Each frame of the video contains different crowd behaviors; with the group behavior model parameters obtained in step 3, the feature set of each crowd behavior can be obtained through the word-document model, as shown in equation (4):
$$\begin{cases} f_{k^*} = \{x_{k^* i} \mid i = 1, \ldots, F\} \\ k^* = \arg\max_{k \in \{1, \ldots, K\}} p(x_i, z_{k,i} \mid \alpha, \beta) \end{cases} \qquad (4)$$
where $f_{k^*}$ is the feature set of the k-th behavior, F is the number of features in the feature set of the k-th behavior, and $x_{ki}$ is the feature of the word that is the i-th pixel of the k-th behavior.
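Continuing the sketch above, equation (4) amounts to assigning every word to the topic with the largest posterior weight and grouping the words by that assignment; the helper below is hypothetical and builds on the `lda` model and `counts` matrix from the previous sketch.

```python
import numpy as np

def behavior_feature_sets(lda, counts, words_per_doc, ids_per_doc):
    """For each frame (document), assign every word to its most probable
    behavior topic k* and collect per-behavior feature sets f_k, as in
    equation (4)."""
    # beta: topic-word probabilities, normalized per topic
    beta = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
    pi = lda.transform(counts)                  # per-document topic weights
    sets = []
    for j, (words, ids) in enumerate(zip(words_per_doc, ids_per_doc)):
        post = pi[j][:, None] * beta[:, ids]    # p(z=k | word), up to a constant
        k_star = post.argmax(axis=0)            # arg max over the K topics
        sets.append({k: words[k_star == k] for k in range(beta.shape[0])})
    return sets
```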
FIG. 4 shows the crowd behaviors in a scene, where the different behaviors are represented by optical-flow feature points (only some of the feature points are drawn in the image). Three crowd behaviors are present: the feature points in rectangular area 1 indicate upward movement, those in rectangular area 2 leftward movement, and those in rectangular area 3 downward movement.
Step 5: the obtained behavior feature sets are converted, and a statistical people count is obtained for each behavior using the converted feature sets.
The group behavior model yields the different crowd behaviors and the feature set of each behavior. Although a behavior feature set can itself describe the number of people exhibiting the behavior, its feature dimension is high, parameter training takes long, and the cumulative count cannot be obtained directly. Therefore, according to the method of the present invention, the behavior feature set of each frame is converted into a 5-dimensional feature vector, reducing the feature dimension. A time parameter is also attached: for each behavior feature set obtained with equation (4), a feature vector $NF = \{AS_G, SV_G, DV_G, DD_G, NP_G\}$ of dimension 5×G is formed, where the time parameter G is the number of frames over which the cumulative number of people of a specific behavior is counted. Specifically, the 5×G-dimensional feature vector may be obtained as follows (a consolidated code sketch of the five statistics appears after item (5) below):
(1) Average speed vector $AS_G$: $AS_G = \{AS_g, g = 1, \ldots, G\}$, where $AS_g$ is the average speed of the g-th frame image, obtained as shown in equation (5):
$$AS_g = \frac{1}{F} \sum_{i=1}^{F} \sqrt{v_{gi}^2 + u_{gi}^2} \qquad (5)$$
where $u_{gi}$ and $v_{gi}$ denote the x- and y-direction velocity components of the i-th feature in the g-th frame image.
(2) Speed variance vector $SV_G$: $SV_G = \{SV_g, g = 1, \ldots, G\}$, where $SV_g$ is the speed variance of the g-th frame image, which measures the complexity of the optical-flow speeds in each frame; $SV_g$ is obtained as shown in equation (6):
$$SV_g = \frac{1}{F} \sum_{i=1}^{F} \left( \sqrt{v_{gi}^2 + u_{gi}^2} - AS_g \right)^2 \qquad (6)$$
(3) Direction variance vector $DV_G$: $DV_G = \{DV_g, g = 1, \ldots, G\}$, where $DV_g$ is the direction variance of the g-th frame image, which measures the complexity of the optical-flow directions; $DV_g$ is obtained as shown in equation (7):
$$DV_g = \frac{1}{8} \sum_{i=1}^{8} \left( ND_{gi} - \overline{ND}_g \right)^2 \qquad (7)$$
The range 0°-360° is divided into 8 intervals, and the direction of each optical-flow feature in a behavior feature set is voted into its angular interval, yielding a direction histogram for the behavior. $ND_{gi}$ is the count of the i-th interval of the direction histogram, and $\overline{ND}_g$ is the mean of $\{ND_{gi}, i = 1, \ldots, 8\}$.
(4) Direction divergence vector $DD_G$: $DD_G = \{DD_g, g = 1, \ldots, G\}$, where $DD_g$ is the direction divergence of the g-th frame image, obtained as shown in equation (8):
$$\begin{cases} DD_g = \sum_{i=1}^{8} ND_{gi} \times \left| RD_g(i) \right| \\ RD_g(i) = \operatorname{mod}(i - MD_g, 8) - 8 \times \left[ \operatorname{mod}(i - MD_g, 8) \geq 4 \right] \end{cases} \qquad (8)$$
where $MD_g$ is the index of the dominant direction bin, $MD_g = \arg\max_i ND_{gi}$, $i = 1, \ldots, 8$.
(5) Total-pixel-count vector $NP_G$. Since the depth of field of a monitored scene is generally large, the projection of the scene onto the image plane suffers a pronounced perspective effect (the same object looks large near the camera and small far from it), so the contributions of different pixels on the image plane must be weighted. The ground is assumed planar and people perpendicular to the ground. As shown in FIG. 5, let the vanishing point $P_v$ have coordinates $(x_v, y_v)$ and the reference line be $y_r = H/2$; then the contribution factor of any pixel I(x, y) on the image plane is obtained as shown in equation (9):
$$S_C(x, y) = \left( \frac{y_r - y_v}{y - y_v} \right)^2 \qquad (9)$$
The weighted pixel count of the behavior in the g-th frame is then $NP_g = \frac{1}{F} \sum_{i=1}^{F} S_C(x_{gi}, y_{gi})$, and the total-pixel-count vector of the behavior is $NP_G = \{NP_g, g = 1, \ldots, G\}$.
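As a consolidated illustration of equations (5)-(9), the five per-frame statistics can be computed together as below. The sketch assumes the word arrays (x, y, u, v) produced earlier; y_v is the vanishing-point ordinate and y_r the reference line of FIG. 5, and the 1/F normalization of NP follows claim 10.

```python
import numpy as np

def frame_features(words, y_v, y_r):
    """Five per-frame statistics of one behavior feature set:
    AS (eq. 5), SV (eq. 6), DV (eq. 7), DD (eq. 8), NP (eq. 9 + claim 10).
    `words` holds rows (x, y, u, v)."""
    x, y, u, v = words.T
    speed = np.hypot(u, v)
    AS = speed.mean()                                # average speed, eq. (5)
    SV = ((speed - AS) ** 2).mean()                  # speed variance, eq. (6)

    # 8-bin direction histogram over 0..360 degrees
    ang = np.degrees(np.arctan2(v, u)) % 360.0
    ND, _ = np.histogram(ang, bins=8, range=(0.0, 360.0))
    DV = ((ND - ND.mean()) ** 2).mean()              # direction variance, eq. (7)

    MD = ND.argmax()                                 # index of the dominant bin
    i = np.arange(8)
    RD = np.mod(i - MD, 8) - 8 * (np.mod(i - MD, 8) >= 4)
    DD = (ND * np.abs(RD)).sum()                     # direction divergence, eq. (8)

    # Perspective-corrected pixel contributions, eq. (9), averaged per claim 10
    # (assumes all pixels lie below the vanishing point, y != y_v)
    SC = ((y_r - y_v) / (y - y_v)) ** 2
    NP = SC.mean()
    return np.array([AS, SV, DV, DD, NP])

def feature_vector(per_frame_sets, k, y_v, y_r):
    """Stack the statistics of behavior k over G frames into the 5*G vector NF."""
    return np.concatenate([frame_features(s[k], y_v, y_r) for s in per_frame_sets])
```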
After the 5×G-dimensional feature vectors are obtained, the numbers of people for the two behaviors of interest, entering and exiting, are manually annotated to train an artificial neural network model, and the trained network is used to count people entering and exiting; the counts may be obtained with well-known neural network methods. In the experiment, the total number of people in the park is obtained from the difference between the counts of people entering and exiting at the gate. FIG. 6(a) shows the people count of the entering/exiting behavior groups in one frame of the live video. The numbers of people entering and exiting from the start of counting up to the current moment are shown in red in the upper-right corner of the image: In: 157, Out: 39. Only some optical-flow feature points are drawn: feature points in elliptical area 1 represent exiting, feature points in elliptical area 2 represent entering, the arrows give the motion direction of the feature points, and the black frame is the people-counting area. FIG. 6(b) shows the change in the park's population (sampled every 2 minutes); the average accuracy of the park people counts is 92.35%.
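The patent specifies the counter only as an artificial neural network; as one plausible realization (an assumption, not the disclosed architecture), a small multilayer-perceptron regressor can map each 5×G vector to the manually annotated cumulative count:

```python
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def train_counter(NF_train, y_train):
    """Train a small MLP to regress the cumulative person count for one
    behavior class (e.g. 'in' or 'out') from 5*G-dimensional NF vectors.
    The hidden-layer size and solver settings are illustrative assumptions."""
    model = make_pipeline(StandardScaler(),
                          MLPRegressor(hidden_layer_sizes=(32,),
                                       max_iter=2000, random_state=0))
    model.fit(NF_train, y_train)
    return model

# Usage: train one counter per behavior, then take the difference of the
# 'in' and 'out' predictions for the park experiment described above.
# in_model, out_model = train_counter(NF_in, y_in), train_counter(NF_out, y_out)
```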
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only examples of the present invention and are not intended to limit the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A computer-implemented video surveillance method, comprising the steps of:
a) receiving video data captured by a camera;
b) establishing a group behavior model according to the received video data;
c) estimating parameters of the group behavior model to obtain various group behaviors in a scene;
d) obtaining behavior feature sets of different crowds by using the obtained group behavior model;
e) and converting the obtained behavior feature set, and obtaining a statistical population value for each group behavior by using the converted behavior feature set.
2. The method of claim 1, wherein step b) comprises: building a word-document model in which each moving pixel is treated as a word $w_i$ and the M frames of images of a video correspond to M documents, each document represented by the word set $W = \{w_i, i = 1, \ldots, N\}$, where $w_i = \{x_i, y_i, u_i, v_i\}$, N is the number of pixels in the video frame, x is the horizontal position of a pixel, y its vertical position, u its velocity in the horizontal direction, and v its velocity in the vertical direction.
3. The method of claim 1, wherein step c) comprises: estimating the parameters of the group behavior model using a maximum-likelihood expectation-maximization (EM) method.
4. The method of claim 2, wherein step c) comprises: detecting the behaviors present in the scene using the group behavior model and obtaining the feature set of each behavior according to the following formula:
$$\begin{cases} f_{k^*} = \{x_{k^* i} \mid i = 1, \ldots, F\} \\ k^* = \arg\max_{k \in \{1, \ldots, K\}} p(x_i, z_{k,i} \mid \alpha, \beta) \end{cases}$$
where $\alpha$ characterizes the relative strengths of the hidden topics in the document set, $\beta$ gives the probability distributions of all hidden topics, there are K behaviors in total, $f_{k^*}$ is the feature set of the k-th behavior, and F is the number of features in the feature set of the k-th behavior.
5. The method of claim 4, wherein step d) comprises: converting the obtained behavior feature set into a feature vector $NF = \{AS_G, SV_G, DV_G, DD_G, NP_G\}$ of dimension 5×G and training an artificial neural network to count people, where $AS_G$ is the average speed vector, $SV_G$ the speed variance vector, $DV_G$ the direction variance vector, $DD_G$ the direction divergence vector, and $NP_G$ the total-pixel-count vector.
6. The method of claim 5, wherein the average speed vector $AS_G$ is calculated as:
$AS_G = \{AS_g, g = 1, \ldots, G\}$
where $AS_g$ is the average speed of the g-th frame image, $AS_g = \frac{1}{F} \sum_{i=1}^{F} \sqrt{v_{gi}^2 + u_{gi}^2}$, and $u_{gi}$ and $v_{gi}$ denote the x- and y-direction velocity components of the i-th feature in the g-th frame image.
7. The method of claim 5, wherein the speed variance vector $SV_G$ is calculated as:
$SV_G = \{SV_g, g = 1, \ldots, G\}$
where $SV_g$ is the speed variance of the g-th frame image, $SV_g = \frac{1}{F} \sum_{i=1}^{F} \left( \sqrt{v_{gi}^2 + u_{gi}^2} - AS_g \right)^2$, and $u_{gi}$ and $v_{gi}$ denote the x- and y-direction velocity components of the i-th feature in the g-th frame image.
8. The method of claim 5, wherein the direction variance vector $DV_G$ is calculated as:
$DV_G = \{DV_g, g = 1, \ldots, G\}$
where $DV_g$ is the direction variance of the g-th frame image, $DV_g = \frac{1}{8} \sum_{i=1}^{8} \left( ND_{gi} - \overline{ND}_g \right)^2$, $ND_{gi}$ is the count of the i-th interval of the direction histogram, $\overline{ND}_g$ is the mean of $\{ND_{gi}, i = 1, \ldots, 8\}$, and $u_{gi}$ and $v_{gi}$ denote the x- and y-direction velocity components of the i-th feature in the g-th frame image.
9. The method of claim 5, wherein the direction divergence vector $DD_G$ is calculated as:
$DD_G = \{DD_g, g = 1, \ldots, G\}$
where $DD_g$ is the direction divergence of the g-th frame image,
$$\begin{cases} DD_g = \sum_{i=1}^{8} ND_{gi} \times \left| RD_g(i) \right| \\ RD_g(i) = \operatorname{mod}(i - MD_g, 8) - 8 \times \left[ \operatorname{mod}(i - MD_g, 8) \geq 4 \right] \end{cases}$$
$MD_g$ is the index of the dominant direction bin, $MD_g = \arg\max_i ND_{gi}$, $i = 1, \ldots, 8$, and $ND_{gi}$ is the count of the i-th interval of the direction histogram.
10. The method of claim 5, wherein the total-pixel-count vector $NP_G$ is calculated as:
$NP_G = \{NP_g, g = 1, \ldots, G\}$
where $NP_g$ is the perspective-weighted pixel count of the g-th frame image, $NP_g = \frac{1}{F} \sum_{i=1}^{F} S_C(x_{gi}, y_{gi})$.
CN201310746795.6A 2013-12-30 2013-12-30 Video monitoring method based on group behavior analysis driven by visual big data Expired - Fee Related CN103679215B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310746795.6A CN103679215B (en) 2013-12-30 2013-12-30 Video monitoring method based on group behavior analysis driven by visual big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310746795.6A CN103679215B (en) 2013-12-30 2013-12-30 Video monitoring method based on group behavior analysis driven by visual big data

Publications (2)

Publication Number Publication Date
CN103679215A true CN103679215A (en) 2014-03-26
CN103679215B CN103679215B (en) 2017-03-01

Family

ID=50316703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310746795.6A Expired - Fee Related CN103679215B (en) 2013-12-30 2013-12-30 Video monitoring method based on group behavior analysis driven by visual big data

Country Status (1)

Country Link
CN (1) CN103679215B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104320617A (en) * 2014-10-20 2015-01-28 中国科学院自动化研究所 All-weather video monitoring method based on deep learning
CN105096344A (en) * 2015-08-18 2015-11-25 上海交通大学 A group behavior identification method and system based on CD motion features
CN105100683A (en) * 2014-05-04 2015-11-25 深圳市贝尔信智能系统有限公司 Video-based passenger flow statistics method, device and system
CN108573497A (en) * 2017-03-10 2018-09-25 北京日立北工大信息系统有限公司 Passenger flow statistic device and method
US10127597B2 (en) 2015-11-13 2018-11-13 International Business Machines Corporation System and method for identifying true customer on website and providing enhanced website experience
CN109063549A (en) * 2018-06-19 2018-12-21 中国科学院自动化研究所 High-resolution based on deep neural network is taken photo by plane video moving object detection method
CN110874878A (en) * 2018-08-09 2020-03-10 深圳云天励飞技术有限公司 Pedestrian analysis method, device, terminal and storage medium
CN112084925A (en) * 2020-09-03 2020-12-15 厦门利德集团有限公司 Intelligent electric power safety monitoring method and system
CN113012386A (en) * 2020-12-25 2021-06-22 贵州北斗空间信息技术有限公司 Security alarm multi-level linkage rapid pushing method
CN115856980A (en) * 2022-11-21 2023-03-28 中铁科学技术开发有限公司 Marshalling station operator monitoring method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101751553A (en) * 2008-12-03 2010-06-23 中国科学院自动化研究所 Method for analyzing and predicting large-scale crowd density
US20110243450A1 (en) * 2010-04-01 2011-10-06 Microsoft Corporation Material recognition from an image
CN102385705A (en) * 2010-09-02 2012-03-21 大猩猩科技股份有限公司 Abnormal behavior detection system and method by utilizing automatic multi-feature clustering method
CN102708573A (en) * 2012-02-28 2012-10-03 西安电子科技大学 Group movement mode detection method under complex scenes
US8406498B2 (en) * 1999-01-25 2013-03-26 Amnis Corporation Blood and cell analysis using an imaging flow cytometer
CN103258193A (en) * 2013-05-21 2013-08-21 西南科技大学 Group abnormal behavior identification method based on KOD energy feature

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8406498B2 (en) * 1999-01-25 2013-03-26 Amnis Corporation Blood and cell analysis using an imaging flow cytometer
CN101751553A (en) * 2008-12-03 2010-06-23 中国科学院自动化研究所 Method for analyzing and predicting large-scale crowd density
US20110243450A1 (en) * 2010-04-01 2011-10-06 Microsoft Corporation Material recognition from an image
CN102385705A (en) * 2010-09-02 2012-03-21 大猩猩科技股份有限公司 Abnormal behavior detection system and method by utilizing automatic multi-feature clustering method
CN102708573A (en) * 2012-02-28 2012-10-03 西安电子科技大学 Group movement mode detection method under complex scenes
CN103258193A (en) * 2013-05-21 2013-08-21 西南科技大学 Group abnormal behavior identification method based on KOD energy feature

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
茅耀斌 (Mao Yaobin): "Research on Group Motion Analysis in Video Surveillance", China Masters' Theses Full-text Database, Information Science and Technology Series *
邹友辉 (Zou Youhui): "Video Anomaly Event Detection Based on Statistical Graphical Models", China Masters' Theses Full-text Database, Information Science and Technology Series *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105100683A (en) * 2014-05-04 2015-11-25 深圳市贝尔信智能系统有限公司 Video-based passenger flow statistics method, device and system
CN104320617A (en) * 2014-10-20 2015-01-28 中国科学院自动化研究所 All-weather video monitoring method based on deep learning
CN104320617B (en) * 2014-10-20 2017-09-01 中国科学院自动化研究所 A kind of round-the-clock video frequency monitoring method based on deep learning
CN105096344A (en) * 2015-08-18 2015-11-25 上海交通大学 A group behavior identification method and system based on CD motion features
CN105096344B (en) * 2015-08-18 2018-05-04 上海交通大学 Group behavior recognition methods and system based on CD motion features
US10127597B2 (en) 2015-11-13 2018-11-13 International Business Machines Corporation System and method for identifying true customer on website and providing enhanced website experience
CN108573497A (en) * 2017-03-10 2018-09-25 北京日立北工大信息系统有限公司 Passenger flow statistic device and method
CN108573497B (en) * 2017-03-10 2020-08-21 北京日立北工大信息系统有限公司 Passenger flow statistical device and method
CN109063549A (en) * 2018-06-19 2018-12-21 中国科学院自动化研究所 High-resolution based on deep neural network is taken photo by plane video moving object detection method
CN109063549B (en) * 2018-06-19 2020-10-16 中国科学院自动化研究所 High-resolution aerial video moving target detection method based on deep neural network
CN110874878A (en) * 2018-08-09 2020-03-10 深圳云天励飞技术有限公司 Pedestrian analysis method, device, terminal and storage medium
CN112084925A (en) * 2020-09-03 2020-12-15 厦门利德集团有限公司 Intelligent electric power safety monitoring method and system
CN113012386A (en) * 2020-12-25 2021-06-22 贵州北斗空间信息技术有限公司 Security alarm multi-level linkage rapid pushing method
CN115856980A (en) * 2022-11-21 2023-03-28 中铁科学技术开发有限公司 Marshalling station operator monitoring method and system

Also Published As

Publication number Publication date
CN103679215B (en) 2017-03-01

Similar Documents

Publication Publication Date Title
CN103679215B (en) Video monitoring method based on group behavior analysis driven by visual big data
CN109819208B (en) Intensive population security monitoring management method based on artificial intelligence dynamic monitoring
CN106407946B (en) Cross-line counting method, deep neural network training method, device and electronic equipment
US8582816B2 (en) Method and apparatus for video analytics based object counting
CN104123544B (en) Anomaly detection method and system based on video analysis
US10963674B2 (en) Unsupervised learning of object recognition methods and systems
CN101464944B (en) Crowd density analysis method based on statistical characteristics
CN101577812B (en) Method and system for post monitoring
Mukherjee et al. A novel framework for automatic passenger counting
CN104320617B (en) A kind of round-the-clock video frequency monitoring method based on deep learning
CN105303191A (en) Method and apparatus for counting pedestrians in foresight monitoring scene
CN103810473B (en) A kind of target identification method of human object based on HMM
Cao et al. Abnormal crowd motion analysis
CN102156880A (en) Method for detecting abnormal crowd behavior based on improved social force model
CN109583373B (en) Pedestrian re-identification implementation method
CN110633643A (en) Abnormal behavior detection method and system for smart community
CN109117774B (en) Multi-view video anomaly detection method based on sparse coding
CN104820995A (en) Large public place-oriented people stream density monitoring and early warning method
CN107483894A (en) Judge to realize the high ferro station video monitoring system of passenger transportation management based on scene
CN113362374A (en) High-altitude parabolic detection method and system based on target tracking network
CN110020618A (en) Crowd abnormal behaviour monitoring method usable at multiple shooting angles
CN110084201A (en) A kind of human motion recognition method of convolutional neural networks based on specific objective tracking under monitoring scene
Lijun et al. Video-based crowd density estimation and prediction system for wide-area surveillance
CN102169538B (en) Background modeling method based on pixel confidence
CN115294519A (en) Abnormal event detection and early warning method based on lightweight network

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170301