CN111027440A - Crowd abnormal behavior detection device and method based on neural network - Google Patents
Crowd abnormal behavior detection device and method based on neural network
- Publication number
- CN111027440A CN201911221923.9A CN201911221923A
- Authority
- CN
- China
- Prior art keywords
- neural network
- convolution
- network
- layer
- layers
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Probability & Statistics with Applications (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a crowd abnormal behavior detection device and detection method based on a neural network. The method comprises two steps: first, constructing and training a dual-stream residual neural network; second, processing the video image to be detected and detecting it with the trained network. The advantage of simple optical flow acquisition is exploited by using an optical-flow-based classifier of local features for preliminary screening, cascaded with a classifier built from features extracted by a neural-network-based autoencoder for fine judgment. A 70-layer deep residual neural network model, constructed on the idea of residual networks, is used as the detection model, so that feature learning is more accurate and both detection speed and detection accuracy are improved.
Description
[ technical field ]
The invention relates to the technical field of computer vision, in particular to a crowd abnormal behavior detection method based on a neural network.
[ background of the invention ]
In modern cities, surveillance cameras are deployed throughout public areas. These closed-circuit television systems usually require a person to be present to watch the monitored scenes so that the responsible staff can be alerted when an abnormal event occurs; however, manually inspecting the recorded footage is time-consuming and labor-intensive. In the narrow sense, crowd abnormal event detection refers to judging particular behaviors in particular scenes; in the broad sense, it refers to detecting events whose probability of occurrence is much lower than that of normal events, generally below 5% to 10%.
At present, crowd anomaly detection at home and abroad mainly follows two kinds of methods: object-recognition-based methods (Cen Yigang, Wang Wen, et al. Salient optical flow histogram dictionary representation for crowd abnormal event detection [J]. Signal Processing, 2017, 33(03): 330-) and global-feature-based methods. Object-based methods preserve local features well but are relatively time-consuming, whereas global methods show the opposite behavior.
Mainstream global methods are mainly based on the following two kinds of feature extraction:
1) Using easily obtained low-dimensional features, most commonly optical flow features (Colque R V H M, Caetano C, de Andrade M T L, et al. Histograms of optical flow orientation and magnitude and entropy to detect anomalous events in videos [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2017, 27(3): 673-682), and performing local or global feature extraction to build classifiers. For example, Wang and Snoussi (Wang T, Snoussi H. Detection of abnormal visual events via global optical flow orientation histogram [J]. IEEE Transactions on Information Forensics and Security, 9(6): 988-) built more global scales on the basis of the optical flow histogram, and Cong et al. studied video anomaly search via spatio-temporal motion context (Cong Y, Yuan J, Tang Y. Video anomaly search in crowded scenes via spatio-temporal motion context. IEEE Trans. Inf. Forensics Security, 2013, 8(10): 1590-). This kind of method has the advantage of simple feature acquisition, but its optical flow accuracy is limited and global optical flow computation consumes a large amount of time.
2) Constructing complex high-dimensional features, such as energy flow features (Nam Y. Crowd flux analysis and abnormal event detection in unstructured and structured scenes [J]. Multimedia Tools and Applications, 2014, 72(3): 3001-). The invention uses the advantage of simple optical flow acquisition: an optical-flow-based classifier of local features performs preliminary screening, cascaded with a classifier built from features extracted by a neural-network-based autoencoder that performs the fine judgment.
In terms of patents, the Chinese patent of Wu Yuanchun and Yang Yuan (A crowd abnormal event detection method, electronic device and storage medium [P]. CN108288021A, 2018-07-17) computes an adjacency matrix of optical flow points from the horizontal and vertical coordinate relationships between them to detect abnormal events; its accuracy is high, but the time cost of optical flow computation remains high. The Chinese patent of Xuan Zuxing, Guo Yanfei, Wang Hai and Sun Xin (A crowd abnormal event detection method based on mixed tracking and a generalized linear model [P]. CN108280408A, 2018-07-13) proposes a detection method based on a generalized linear model; the linear model keeps the accuracy of high-dimensional features while offering some speed advantage, but the hand-crafted feature model is prone to over-fitting and under-fitting and has low robustness.
[ summary of the invention ]
The invention provides a crowd abnormal behavior detection method based on a neural network. The method exploits the advantage of simple optical flow acquisition by using an optical-flow-based classifier of local features for preliminary screening, cascaded with a classifier built from features extracted by a neural-network-based autoencoder for fine judgment.
The technical scheme adopted by the invention is as follows:
a crowd abnormal behavior detection method based on a neural network is characterized by comprising the following steps:
1) constructing and training a dual-stream residual neural network;
2) processing the video image to be detected and then detecting with the trained neural network.
Further, the step 1) includes the following sub-steps:
1-1) dividing a standard crowd-movement data set into two parts, a training data set and a testing data set, each of which contains both normal and abnormal behavior samples;
1-2) performing optical flow processing on the training samples and test samples respectively; after the optical flow image sequence is obtained, resizing the optical flow image sequence and the RGB image sequence to 320 × 240, and expanding the data set;
1-3) randomly shuffling the training samples, inputting them into the dual-stream residual neural network, and training the network;
1-4) evaluating the trained neural network on the test data set; if the expected accuracy is reached, the network can be used to detect crowd abnormal behavior in the video to be detected, otherwise training is repeated.
Further, the step 2) includes the following sub-steps:
2-1) denoising the video image sequence to be detected, using a Wiener filter as the denoising algorithm;
2-2) performing optical flow processing on the video to be detected, and resizing the optical flow image sequence and the RGB image sequence to 320 × 240 after the optical flow image sequence is obtained;
2-3) inputting the video images into the trained dual-stream residual neural network, which directly gives the detection result.
Further, if the detection result indicates abnormal behavior, a warning mark is issued; if no abnormal behavior is detected, the procedure returns to the video sequence to be detected for re-detection.
A crowd abnormal behavior detection device based on a neural network is characterized by comprising:
the device comprises a neural network module and a to-be-detected video processing module.
Further, the main structure of the neural network is as follows (a minimal sketch of one bottleneck sub-network with a shortcut connection is given after this list):
1) the input layer converts the input pictures into a database file, unifies the images to 320 × 240 pixels, and randomly shuffles the data images;
2) the first part of the hidden layers is a composite layer consisting of a convolutional layer "Conv1" and a pooling layer "Pool". "Conv1" consists of 64 convolution kernels of size 7 × 7 with a convolution stride of 2; after the 64 kernels are convolved separately, their output matrices are superposed and averaged to give the final convolution result. The "Pool" layer performs overlapping max-pooling on the convolution result with a pooling window of size 2 × 2;
3) the second part is a composite convolution network "Conv2" made up of three sub-convolution networks, each with 3 layers; the convolution kernels number 128, 128 and 512 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a shortcut, and the three sub-convolution networks form a 12-layer residual convolution network in total;
4) the third part is a convolution network "Conv3" made up of three sub-convolution networks, each with 3 layers; the convolution kernels number 64, 64 and 128 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a shortcut, and the third part contains 12 convolutional layers forming a composite convolution network;
5) the fourth part is a composite convolution network "Conv4" comprising twenty-three sub-convolution networks, each with 3 layers; the convolution kernels number 256, 512 and 1024 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a residual-network shortcut, and the fourth part has 69 convolutional layers in total;
6) the fifth part is the last part of the residual network, a composite convolution network "Conv5" consisting of three sub-convolution networks, each with 3 layers; the convolution kernels number 512, 1024 and 2048 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a residual-network shortcut, and the fifth part has 9 convolutional layers in total;
7) the last part of the neural network is the output part; the output layer is composed of an expansion (flatten) layer, 4 fully connected layers and a softmax classification layer. The expansion layer flattens the fused features into a one-dimensional vector, the fully connected layers have output sizes of 1024, 512, 256 and 64 respectively, and softmax normalizes the outputs of the fully connected layers.
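The bottleneck sub-networks described above can be illustrated with the following PyTorch sketch, given only as an example under assumptions: the class name, the BatchNorm/ReLU placement and the 1 × 1 projection shortcut are not specified by the patent, and the channel numbers follow the "Conv2" figures above.

```python
# Illustrative sketch (not the patent's reference implementation) of one
# 3-layer bottleneck sub-network with a shortcut connection, using the
# "Conv2" kernel counts above (128 @ 1x1, 128 @ 3x3, 512 @ 1x1).
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    def __init__(self, in_channels=512, mid_channels=128, out_channels=512):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, mid_channels, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_channels)
        self.conv2 = nn.Conv2d(mid_channels, mid_channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(mid_channels)
        self.conv3 = nn.Conv2d(mid_channels, out_channels, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        # 1x1 projection so the shortcut matches the output width when needed
        self.shortcut = (nn.Identity() if in_channels == out_channels
                         else nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False))

    def forward(self, x):
        identity = self.shortcut(x)            # residual-network shortcut
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return self.relu(out + identity)
```

Stacking several such blocks and connecting them with shortcuts gives one of the composite convolution parts (Conv2 to Conv5).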
Further, the video processing module is responsible for preprocessing the video to be detected; the main process, sketched after this list, is as follows:
1) image noise reduction (Wiener filtering);
2) optical flow extraction to generate optical flow images;
3) RGB image extraction.
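A minimal sketch of this preprocessing pipeline is given below. The library choices (SciPy's Wiener filter, OpenCV's Farneback optical flow) and the function name `preprocess_clip` are illustrative assumptions; the patent does not prescribe a particular implementation.

```python
# Illustrative preprocessing sketch: Wiener denoising, dense optical flow,
# and resizing both sequences to 320 x 240. Library choices are assumptions.
import cv2
import numpy as np
from scipy.signal import wiener

def preprocess_clip(frames):
    """frames: list of BGR uint8 images taken from the video to be detected."""
    rgb_seq, flow_seq = [], []
    prev_gray = None
    for frame in frames:
        # 1) image noise reduction with a Wiener filter, applied per channel
        denoised = np.stack(
            [wiener(frame[:, :, c].astype(np.float64)) for c in range(3)], axis=2
        )
        denoised = np.clip(denoised, 0, 255).astype(np.uint8)
        # 3) RGB image, resized to 320 x 240
        rgb_seq.append(cv2.resize(denoised, (320, 240)))
        # 2) dense optical flow between consecutive frames (Farneback method)
        gray = cv2.cvtColor(denoised, cv2.COLOR_BGR2GRAY)
        if prev_gray is not None:
            flow = cv2.calcOpticalFlowFarneback(
                prev_gray, gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
            flow_seq.append(cv2.resize(flow, (320, 240)))
        prev_gray = gray
    return rgb_seq, flow_seq
```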
The invention has the beneficial effects that:
the present invention builds a system by using deep learning based ideas. The neural network is used for automatically carrying out high-dimensional feature learning on various behaviors of the sports crowd; traditional artificial feature extraction is limited by human visual fields, too complicated features are extremely poor in robustness, and too simple features cannot represent features effectively enough. Therefore, the accuracy and speed of detection are greatly limited, and the extraction method can fully and effectively represent the crowd characteristics.
The invention uses the 70 layers of deep residual error neural network models as the detection models, so that the learning of the characteristics is more accurate, and the method uses the thought of the residual error network to construct the 70 layers of deep residual error neural network models as the detection models, so that the method has higher efficiency in detection speed and detection accuracy.
The invention adopts a double-flow neural network structure and simultaneously extracts two kinds of information respectively. The information related to the time and the space sequence is fully utilized, the features obtained by learning from the two kinds of information are fused to be used as the final detection index, and the information loss is not easy to generate.
[ description of the drawings ]
FIG. 1 shows an example of the data augmentation used by the present invention;
FIG. 2 is a flow chart of the training of the neural network of the present invention;
FIG. 3 is a flow chart of the present invention for processing a video image to be detected and detecting using a neural network;
fig. 4 shows a specific structure of the neural network of the present invention.
[ detailed description of the embodiments ]
The present invention is further illustrated in detail by the following examples and the accompanying drawings.
The invention provides a crowd abnormal behavior detection method based on a neural network. The method exploits the advantage of simple optical flow acquisition by using an optical-flow-based classifier of local features for preliminary screening, cascaded with a classifier built from features extracted by a neural-network-based autoencoder for fine judgment. With reference to the drawings, the construction of the neural network and the concrete method steps are as follows:
Construction of the neural network (a sketch of the dual-stream output head is given after this list):
1) the input layer converts the input pictures into a database file, unifies the images to 320 × 240 pixels, and randomly shuffles the data images;
2) the first part of the hidden layers is a composite layer consisting of a convolutional layer "Conv1" and a pooling layer "Pool". "Conv1" consists of 64 convolution kernels of size 7 × 7 with a convolution stride of 2; after the 64 kernels are convolved separately, their output matrices are superposed and averaged to give the final convolution result. The "Pool" layer performs overlapping max-pooling on the convolution result with a pooling window of size 2 × 2;
3) the second part is a composite convolution network "Conv2" made up of three sub-convolution networks, each with 3 layers; the convolution kernels number 128, 128 and 512 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a shortcut, and the three sub-convolution networks form a 12-layer residual convolution network in total;
4) the third part is a convolution network "Conv3" made up of three sub-convolution networks, each with 3 layers; the convolution kernels number 64, 64 and 128 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a shortcut, and the third part contains 12 convolutional layers forming a composite convolution network;
5) the fourth part is a composite convolution network "Conv4" comprising twenty-three sub-convolution networks, each with 3 layers; the convolution kernels number 256, 512 and 1024 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a residual-network shortcut, and the fourth part has 69 convolutional layers in total;
6) the fifth part is the last part of the residual network, a composite convolution network "Conv5" consisting of three sub-convolution networks, each with 3 layers; the convolution kernels number 512, 1024 and 2048 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a residual-network shortcut, and the fifth part has 9 convolutional layers in total;
7) the last part of the neural network is the output part; the output layer is composed of an expansion (flatten) layer, 4 fully connected layers and a softmax classification layer. The expansion layer flattens the fused features into a one-dimensional vector, the fully connected layers have output sizes of 1024, 512, 256 and 64 respectively, and softmax normalizes the outputs of the fully connected layers.
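The output part above, together with the fusion of the two streams, can be sketched as follows. The flattened feature size, the class count (normal/abnormal) and the final 2-way layer before softmax are assumptions added for illustration; the patent itself only fixes the 1024/512/256/64 fully connected sizes and the softmax normalization.

```python
# Illustrative sketch of the output part of the dual-stream network: flatten
# the RGB-stream and flow-stream feature maps, fuse them, then apply the
# 1024/512/256/64 fully connected layers and softmax. The fused feature size
# and the final 2-way classification layer are assumptions.
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    def __init__(self, fused_dim=4096, num_classes=2):
        super().__init__()
        self.flatten = nn.Flatten()            # "expansion layer": feature map -> 1-D vector
        self.fc = nn.Sequential(
            nn.Linear(fused_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, num_classes),        # assumed normal / abnormal output
        )

    def forward(self, rgb_feat, flow_feat):
        # fuse the features learned from the two kinds of information
        fused = torch.cat([self.flatten(rgb_feat), self.flatten(flow_feat)], dim=1)
        return torch.softmax(self.fc(fused), dim=1)   # softmax normalization
```

During training, a cross-entropy loss is usually applied to the scores before the softmax; the softmax output is shown here because the patent describes it as part of the output layer.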
The method comprises the following specific steps:
1. Data set production
Take the UMN database as an example. It is an open database established by the University of Minnesota as a standard benchmark for crowd abnormal event detection algorithms; the videos it contains were recorded by students of the university in cooperation with their teachers. The database contains 11 video segments in total; each segment contains a part with normal crowd motion and a part with abnormal motion, starting with normal behavior and ending with abnormal behavior.
The data set is divided into normal and abnormal samples, which are then split into two parts: training samples and testing samples, each part containing both normal and abnormal behavior. Both parts are reserved for the subsequent network training.
2. Data set expansion
For a neural network applied to the task of detecting and identifying abnormal behavior, the training result is closely related to the quantity and scale of the data sets: the larger the data set used during training, the less likely the trained network is to over-fit or under-fit. Because the training set of abnormal behavior is limited, the existing data is expanded by a data augmentation method.
Before the training data set is input into the network, each frame is rotated, scaled and translated. The rotation angle is limited to [-30°, +30°]; the translation amplitude of each image is controlled within [-10%, +10%], with the amplitude and the horizontal/vertical direction chosen randomly by the program; the scaling range is likewise controlled to [-10%, +10%], and when shrinking a picture leaves a blank region, the blank is filled with the nearest pixel of the original image. Keeping each kind of change within a limited range prevents the newly generated image from differing too much from the original, which would make network training unpredictable. The ranges set in this embodiment include zero, and a zero-valued transformation means that no transformation is applied; the expanded data set is 4 times the size of the original.
Fig. 1 shows examples of the original image after +30° and -20° rotation, 10% reduction and 10% enlargement, and 10% translation to the right and 10% translation downward, respectively.
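The augmentation described above can be sketched with torchvision's RandomAffine, used here purely as an illustrative assumption (the patent does not name an implementation, and RandomAffine's default border filling differs from the nearest-pixel filling described above):

```python
# Illustrative data-gain (augmentation) sketch: rotation in [-30, +30] degrees,
# horizontal/vertical translation up to 10%, and scaling in [-10%, +10%].
# torchvision is an assumed implementation choice.
from torchvision import transforms

augment = transforms.RandomAffine(
    degrees=30,            # rotation angle drawn from [-30, +30]
    translate=(0.1, 0.1),  # horizontal and vertical shift of at most 10%
    scale=(0.9, 1.1),      # zoom factor drawn from [-10%, +10%]
)

# Applying the transform to every frame (e.g. three extra variants per frame)
# would expand the data set to roughly 4 times its original size.
# augmented_frame = augment(original_pil_frame)
```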
3. Neural network setup
Most parameters of the neural network adjust themselves to the learning content, but before training the hyper-parameters of the network need to be set so that learning can proceed smoothly. The setting of each hyper-parameter is explained in detail below, and they are collected in a sketch after the list:
(1) The batch_size of the training set is set to 2. batch_size is the number of samples used in a single training batch. Because the training set is relatively small, setting batch_size to a relatively low value makes fuller use of the limited information in the training set and clearly improves training effectiveness, at the expense of longer training time. By common practice, batch_size is usually chosen as a power of 2 to match CPU and GPU processing.
(2) The number of epochs is set to 50. The epoch count is the total number of training passes; each epoch is a new round of training over the training set building on the previous rounds. Training of the neural network is considered complete when the accuracy and the loss of the network converge to specific values over several epochs.
(3) The momentum of the network is set to 0.9. A common method for optimizing parameters during network training is gradient descent. During descent, the initial state of the network influences whether it converges to the optimal solution: when convergence goes well the network quickly reaches the global optimum, and when it goes wrong the network can collapse into a local optimum. Momentum borrows the relationship between potential and kinetic energy in physics to guide the direction of descent: the larger the gradient, the smaller the adjustment; the smaller the gradient, the larger the adjustment, so the network is more likely to escape local optima. The momentum value must balance escaping local optima against not amplifying the oscillation; after several tests, training worked best with the momentum set to 0.9.
(4) The initial learning rate is set to 0.001. The learning rate is the amplitude of the network's parameter updates; in gradient descent it is the step length of each descent step. A larger learning rate converges faster but more easily causes gradient explosion and oscillation near the optimum; a smaller learning rate gives more accurate results but the learning result over-fits more easily. Therefore the learning rate in this embodiment changes dynamically with the epochs, increasing gradually from the initial value of 0.001 at each epoch.
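The hyper-parameters above can be collected into a PyTorch training set-up as sketched below; the model, the data set object and the exact per-epoch learning-rate increase (a factor of 1.05 here) are assumptions, since the patent only states that the rate grows gradually from 0.001.

```python
# Illustrative training set-up using the hyper-parameters listed above.
# `model` and `train_set` are placeholders; the 1.05 growth factor of the
# learning-rate schedule is an assumption.
import torch
from torch.utils.data import DataLoader

BATCH_SIZE = 2      # (1) small batches to exploit the limited training set
EPOCHS = 50         # (2) total number of training passes
MOMENTUM = 0.9      # (3) momentum for gradient descent
BASE_LR = 0.001     # (4) initial learning rate

def make_training(model, train_set):
    loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)   # random shuffling
    optimizer = torch.optim.SGD(model.parameters(), lr=BASE_LR, momentum=MOMENTUM)
    # grow the learning rate slightly after each epoch
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=1.05)
    return loader, optimizer, scheduler
```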
4. Training of the neural network
The preprocessed video set is input into the neural network for training. When training reaches the expected accuracy, training of the neural network is finished; if the expected accuracy is not reached, training is repeated.
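A minimal sketch of this train-then-verify loop is given below; the cross-entropy loss, the 0.9 accuracy threshold and the assumption that the model returns raw class scores are illustrative choices not fixed by the patent.

```python
# Illustrative training/evaluation loop: train for the configured epochs,
# test on the test set, and repeat training if the expected accuracy is not
# reached. Loss function and threshold are assumptions.
import torch
import torch.nn as nn

def evaluate(model, test_loader, device="cpu"):
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for rgb, flow, label in test_loader:
            pred = model(rgb.to(device), flow.to(device)).argmax(dim=1)
            correct += (pred == label.to(device)).sum().item()
            total += label.numel()
    return correct / total

def train_until_accurate(model, loader, test_loader, optimizer, scheduler,
                         epochs=50, expected_accuracy=0.9, device="cpu"):
    criterion = nn.CrossEntropyLoss()   # assumes the model returns raw class scores
    while True:
        model.train()
        for _ in range(epochs):
            for rgb, flow, label in loader:
                optimizer.zero_grad()
                loss = criterion(model(rgb.to(device), flow.to(device)), label.to(device))
                loss.backward()
                optimizer.step()
            scheduler.step()
        if evaluate(model, test_loader, device) >= expected_accuracy:
            return model    # ready to detect crowd abnormal behaviour
        # otherwise the loop continues and the network is re-trained
```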
5. Pre-processing and detection of the video to be detected
After training is finished, a test video (the video to be detected) is input into the trained neural network for detection. When abnormal behavior appears in the frames, the system issues a warning mark; if no abnormal behavior appears, the system continues detecting.
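A sketch of this detection stage is shown below; it reuses the hypothetical `preprocess_clip` helper from the earlier preprocessing sketch, and the label convention (abnormal = 1) is an assumption.

```python
# Illustrative detection stage: preprocess the clip, run the trained
# dual-stream network, and issue a warning mark when the abnormal class is
# predicted for any frame. ABNORMAL = 1 is an assumed label convention.
import numpy as np
import torch

ABNORMAL = 1

def detect(model, frames, device="cpu"):
    rgb_seq, flow_seq = preprocess_clip(frames)        # see the preprocessing sketch
    rgb_seq = rgb_seq[1:]                              # align RGB frames with the flow fields
    rgb = torch.from_numpy(np.stack(rgb_seq)).float().permute(0, 3, 1, 2)
    flow = torch.from_numpy(np.stack(flow_seq)).float().permute(0, 3, 1, 2)
    model.eval()
    with torch.no_grad():
        pred = model(rgb.to(device), flow.to(device)).argmax(dim=1)
    if (pred == ABNORMAL).any():
        print("WARNING: abnormal crowd behaviour detected")   # warning mark
    else:
        print("No abnormal behaviour found; continue with the next segment")
```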
The above-mentioned embodiments are merely preferred embodiments of the present invention and should not be construed as limiting its scope; equivalent variations made according to the shape, structure and principle of the invention also fall within the scope of the invention.
Claims (7)
1. A crowd abnormal behavior detection method based on a neural network is characterized by comprising the following steps:
1) constructing and training a dual-stream residual neural network;
2) processing the video image to be detected and then detecting with the trained neural network.
2. The method for detecting the abnormal behavior of the crowd based on the neural network as claimed in claim 1, wherein the step 1) comprises the following substeps:
1-1) dividing a standard crowd-movement data set into two parts, a training data set and a testing data set, each of which contains both normal and abnormal behavior samples;
1-2) performing optical flow processing on the training samples and test samples respectively; after the optical flow image sequence is obtained, resizing the optical flow image sequence and the RGB image sequence to 320 × 240, and expanding the data set;
1-3) randomly shuffling the training samples, inputting them into the dual-stream residual neural network, and training the network;
1-4) evaluating the trained neural network on the test data set; if the expected accuracy is reached, the network can be used to detect crowd abnormal behavior in the video to be detected, otherwise training is repeated.
3. The method for detecting the abnormal behavior of the crowd based on the neural network as claimed in claim 1, wherein the step 2) comprises the following substeps:
2-1) denoising the video image sequence to be detected, using a Wiener filter as the denoising algorithm;
2-2) performing optical flow processing on the video to be detected, and resizing the optical flow image sequence and the RGB image sequence to 320 × 240 after the optical flow image sequence is obtained;
2-3) inputting the video images into the trained dual-stream residual neural network, which directly gives the detection result.
4. The method as claimed in claim 3, wherein a warning mark is issued if the detection result indicates abnormal behavior, and the procedure returns to the video sequence to be detected for re-detection if no abnormal behavior is detected.
5. A crowd abnormal behavior detection device based on a neural network is characterized by comprising:
the device comprises a neural network module and a to-be-detected video processing module.
6. The device according to claim 5, wherein the main structure of the neural network is as follows:
1) the input layer converts the input pictures into a database file, unifies the images to 320 × 240 pixels, and randomly shuffles the data images;
2) the first part of the hidden layers is a composite layer consisting of a convolutional layer "Conv1" and a pooling layer "Pool". "Conv1" consists of 64 convolution kernels of size 7 × 7 with a convolution stride of 2; after the 64 kernels are convolved separately, their output matrices are superposed and averaged to give the final convolution result. The "Pool" layer performs overlapping max-pooling on the convolution result with a pooling window of size 2 × 2;
3) the second part is a composite convolution network "Conv2" made up of three sub-convolution networks, each with 3 layers; the convolution kernels number 128, 128 and 512 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a shortcut, and the three sub-convolution networks form a 12-layer residual convolution network in total;
4) the third part is a convolution network "Conv3" made up of three sub-convolution networks, each with 3 layers; the convolution kernels number 64, 64 and 128 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a shortcut, and the third part contains 12 convolutional layers forming a composite convolution network;
5) the fourth part is a composite convolution network "Conv4" comprising twenty-three sub-convolution networks, each with 3 layers; the convolution kernels number 256, 512 and 1024 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a residual-network shortcut, and the fourth part has 69 convolutional layers in total;
6) the fifth part is the last part of the residual network, a composite convolution network "Conv5" consisting of three sub-convolution networks, each with 3 layers; the convolution kernels number 512, 1024 and 2048 with sizes 1 × 1, 3 × 3 and 1 × 1 respectively. Each sub-part is connected by a residual-network shortcut, and the fifth part has 9 convolutional layers in total;
7) the last part of the neural network is the output part; the output layer is composed of an expansion (flatten) layer, 4 fully connected layers and a softmax classification layer. The expansion layer flattens all the fused features into a one-dimensional vector, the fully connected layers have output sizes of 1024, 512, 256 and 64 respectively, and softmax normalizes the outputs of the fully connected layers.
7. The device according to claim 5, wherein the video processing module is responsible for preprocessing the video to be detected, and the main process is as follows:
1) image noise reduction (Wiener filtering);
2) optical flow extraction to generate optical flow images;
3) RGB image extraction.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911221923.9A CN111027440B (en) | 2019-12-03 | 2019-12-03 | Crowd abnormal behavior detection device and detection method based on neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911221923.9A CN111027440B (en) | 2019-12-03 | 2019-12-03 | Crowd abnormal behavior detection device and detection method based on neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111027440A true CN111027440A (en) | 2020-04-17 |
CN111027440B CN111027440B (en) | 2023-05-30 |
Family
ID=70207908
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911221923.9A Active CN111027440B (en) | 2019-12-03 | 2019-12-03 | Crowd abnormal behavior detection device and detection method based on neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111027440B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598179A (en) * | 2020-05-21 | 2020-08-28 | 国网电力科学研究院有限公司 | Power monitoring system user abnormal behavior analysis method, storage medium and equipment |
CN112613359A (en) * | 2020-12-09 | 2021-04-06 | 苏州玖合智能科技有限公司 | Method for constructing neural network for detecting abnormal behaviors of people |
CN113221817A (en) * | 2021-05-27 | 2021-08-06 | 江苏奥易克斯汽车电子科技股份有限公司 | Abnormal behavior detection method, device and equipment |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109359519A (en) * | 2018-09-04 | 2019-02-19 | 杭州电子科技大学 | A kind of video anomaly detection method based on deep learning |
CN109635790A (en) * | 2019-01-28 | 2019-04-16 | 杭州电子科技大学 | A kind of pedestrian's abnormal behaviour recognition methods based on 3D convolution |
CN109670446A (en) * | 2018-12-20 | 2019-04-23 | 泉州装备制造研究所 | Anomaly detection method based on linear dynamic system and depth network |
CN109934042A (en) * | 2017-12-15 | 2019-06-25 | 吉林大学 | Adaptive video object behavior trajectory analysis method based on convolutional neural networks |
CN110188637A (en) * | 2019-05-17 | 2019-08-30 | 西安电子科技大学 | A kind of Activity recognition technical method based on deep learning |
CN110210555A (en) * | 2019-05-29 | 2019-09-06 | 西南交通大学 | Rail fish scale hurt detection method based on deep learning |
CN110503063A (en) * | 2019-08-28 | 2019-11-26 | 东北大学秦皇岛分校 | Fall detection method based on hourglass convolution autocoding neural network |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109934042A (en) * | 2017-12-15 | 2019-06-25 | 吉林大学 | Adaptive video object behavior trajectory analysis method based on convolutional neural networks |
CN109359519A (en) * | 2018-09-04 | 2019-02-19 | 杭州电子科技大学 | A kind of video anomaly detection method based on deep learning |
CN109670446A (en) * | 2018-12-20 | 2019-04-23 | 泉州装备制造研究所 | Anomaly detection method based on linear dynamic system and depth network |
CN109635790A (en) * | 2019-01-28 | 2019-04-16 | 杭州电子科技大学 | A kind of pedestrian's abnormal behaviour recognition methods based on 3D convolution |
CN110188637A (en) * | 2019-05-17 | 2019-08-30 | 西安电子科技大学 | A kind of Activity recognition technical method based on deep learning |
CN110210555A (en) * | 2019-05-29 | 2019-09-06 | 西南交通大学 | Rail fish scale hurt detection method based on deep learning |
CN110503063A (en) * | 2019-08-28 | 2019-11-26 | 东北大学秦皇岛分校 | Fall detection method based on hourglass convolution autocoding neural network |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598179A (en) * | 2020-05-21 | 2020-08-28 | 国网电力科学研究院有限公司 | Power monitoring system user abnormal behavior analysis method, storage medium and equipment |
CN111598179B (en) * | 2020-05-21 | 2022-10-04 | 国网电力科学研究院有限公司 | Power monitoring system user abnormal behavior analysis method, storage medium and equipment |
CN112613359A (en) * | 2020-12-09 | 2021-04-06 | 苏州玖合智能科技有限公司 | Method for constructing neural network for detecting abnormal behaviors of people |
CN112613359B (en) * | 2020-12-09 | 2024-02-02 | 苏州玖合智能科技有限公司 | Construction method of neural network for detecting abnormal behaviors of personnel |
CN113221817A (en) * | 2021-05-27 | 2021-08-06 | 江苏奥易克斯汽车电子科技股份有限公司 | Abnormal behavior detection method, device and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN111027440B (en) | 2023-05-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113052210A (en) | Fast low-illumination target detection method based on convolutional neural network | |
US20210034840A1 (en) | Method for Recognzing Face from Monitoring Video Data | |
CN113591795A (en) | Lightweight face detection method and system based on mixed attention feature pyramid structure | |
CN103530638B (en) | Method for pedestrian matching under multi-cam | |
CN113591968A (en) | Infrared weak and small target detection method based on asymmetric attention feature fusion | |
CN112183468A (en) | Pedestrian re-identification method based on multi-attention combined multi-level features | |
CN111027440A (en) | Crowd abnormal behavior detection device and method based on neural network | |
CN109886159B (en) | Face detection method under non-limited condition | |
CN110532959B (en) | Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network | |
CN109919223B (en) | Target detection method and device based on deep neural network | |
CN113780132A (en) | Lane line detection method based on convolutional neural network | |
CN112308087B (en) | Integrated imaging identification method based on dynamic vision sensor | |
CN109377499A (en) | A kind of Pixel-level method for segmenting objects and device | |
CN114155474A (en) | Damage identification technology based on video semantic segmentation algorithm | |
Zhu et al. | Towards automatic wild animal detection in low quality camera-trap images using two-channeled perceiving residual pyramid networks | |
CN111339950B (en) | Remote sensing image target detection method | |
CN115661459A (en) | 2D mean teacher model using difference information | |
Sun et al. | UAV image detection algorithm based on improved YOLOv5 | |
CN111881803B (en) | Face recognition method based on improved YOLOv3 | |
CN112232236B (en) | Pedestrian flow monitoring method, system, computer equipment and storage medium | |
CN113177956A (en) | Semantic segmentation method for unmanned aerial vehicle remote sensing image | |
CN113052139A (en) | Deep learning double-flow network-based climbing behavior detection method and system | |
Ren et al. | Research on Safety Helmet Detection for Construction Site | |
CN108711147A (en) | A kind of conspicuousness fusion detection algorithm based on convolutional neural networks | |
CN112418229A (en) | Unmanned ship marine scene image real-time segmentation method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |