CN111160117A - Abnormal behavior detection method based on multi-example learning modeling - Google Patents
Abnormal behavior detection method based on multi-example learning modeling
- Publication number
- CN111160117A (application number CN201911262679.0A)
- Authority
- CN
- China
- Prior art keywords
- video
- abnormal
- score
- time
- constructing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an abnormal behavior detection method based on multi-example learning modeling, which comprises the following steps: step 1, labeling the original surveillance video based on a multi-example learning method, wherein the labeling target is a video sequence and each video segment is an example; step 2, extracting spatio-temporal sequence features; step 3, calculating the abnormal score of each video segment; step 4, constructing a multi-example highest abnormal score solving function; step 5, constructing a ranking loss function; step 6, constructing an objective function; and step 7, connecting the trained video deep abnormal-score ranking model to a real-time video stream, calculating the abnormal score of the real-time video through the model, and judging whether the video is abnormal. The method detects video anomalies with a multi-example deep learning approach, so the anomaly types do not need to be subdivided, the abnormal video frames do not need to be precisely located, and the sample-labeling workload before model training is greatly reduced.
Description
Technical Field
The invention belongs to the technical field of video abnormal behavior detection, and particularly relates to an abnormal behavior detection method based on multi-example learning modeling.
Background
Traditional monitoring systems realize the safety management of public places mainly through manual monitoring, which lacks real-time performance and initiative. In many cases video monitoring does not actually play a supervisory role, because unattended systems serve only as video backup. Meanwhile, with the wide deployment of monitoring cameras and the continuous development of video monitoring technology and information science, automatic monitoring of abnormal behaviors in fields such as video surveillance, human-computer interaction, and video search has gradually become a technical area with broad application prospects. In recent years, researchers have proposed various abnormal behavior detection methods, such as the pyramid optical flow method, the 3D-SIFT descriptor, and multi-attribute fusion. These methods train a model by processing every frame of the video, which requires a large number of training samples and heavy labeling work; because the types of abnormal behavior are numerous while samples of each type are few, the detection accuracy of the resulting models is difficult to bring to an ideal level.
Disclosure of Invention
To overcome the above problems, the invention provides an abnormal behavior detection method based on multi-example learning modeling, in which the anomaly types do not need to be subdivided, the abnormal video frames do not need to be precisely located, and the sample-labeling workload before model training is greatly reduced. The technical scheme comprises the following steps.
an abnormal behavior detection method based on multi-example learning modeling comprises the following steps:
step 1, marking an original monitoring video based on a multi-example learning method, wherein a marking target is a video sequence, and a video segment is an example;
step 2, extracting space-time sequence characteristics;
step 3, calculating the abnormal score s_i of each video segment;
Step 4, constructing a multi-example highest abnormal score solving function f;
step 5, constructing a ranking loss function l(V_m, V_n), which is used to correct the multi-example highest abnormal score solving function f so that, during model training, an abnormal video obtains a higher score than a normal video; wherein V_m and V_n respectively represent an abnormal video m and a normal video n;
step 6, constructing an objective function:
L(W) = l(V_m, V_n) + ||W||
wherein W is the model weight;
and step 7, connecting the trained video deep abnormal-score ranking model to a real-time video stream, calculating the abnormal score of the real-time video through the model, and judging whether the video is abnormal.
Preferably, in step 2, the spatio-temporal sequence features are extracted as follows: in the labeled video data set, a video V_j is cut into n video segments, where each video segment v_i comprises 16 consecutive, non-overlapping frames; each video segment is fed into a C3D convolutional neural network and, after 8 3D convolutions and 5 3D pooling operations, enters a fully connected layer to obtain the spatio-temporal feature vector x_i of segment v_i; the spatio-temporal feature vectors of all segments of video V_j are concatenated in temporal order to obtain the spatio-temporal feature matrix X_j of video V_j.
Preferably, in step 3, from the spatio-temporal feature matrix X_j = (x_1, x_2, …, x_n) of video V_j obtained in step 2, the spatio-temporal feature vector x_i of any video segment is input into three fully connected layers to obtain the abnormal score s_i of that segment; the abnormal scores of all segments of video V_j are then S_j = (s_1, s_2, …, s_n). The abnormal score s_i is calculated as:

s_i = φ(x_i; t, b)

where t is the weight set (t_1, t_2, t_3) of the three fully connected layers, b is the bias set (b_1, b_2, b_3) of the three fully connected layers, and φ is the three-layer fully connected neural network.
Preferably, in step 4, the multi-example highest abnormal score solving function f takes, for each video, the example with the highest abnormal score among its segments, i.e.

f(V_j) = max_{1 ≤ i ≤ n} s_i

where z represents the number of videos in the given video data set and each video V_j has a corresponding label.
Preferably, in step 5: in multi-example learning, a video labeled positive is guaranteed to contain at least one abnormal video segment, so the ranking loss function l(V_m, V_n) is constructed from the example with the highest abnormal score in each video, where V_m and V_n respectively represent an abnormal video m and a normal video n; the loss contains a temporal smoothing term and a sparsity term, where λ1 = 0.00008 is the coefficient of the temporal smoothing term and λ2 = 0.00008 is the coefficient of the sparsity term.
Preferably, in step 7, a plurality of abnormal videos in a plurality of scenes are used as positive samples based on the weak labeling mode of the video.
Advantageous effects
Video anomalies are detected with a multi-example deep learning method; the anomaly types do not need to be subdivided, the abnormal video frames do not need to be precisely located, and the sample-labeling workload before model training is greatly reduced.
Detailed Description
An abnormal behavior detection method based on multi-example learning modeling comprises the following steps:
Step 1, labeling the original surveillance video based on a multi-example learning method, wherein the labeling target is a video sequence and each video segment is an example. When a video sequence is labeled negative, the labels of all sample data in the sequence are negative, i.e., it is a normal video; when a video sequence is labeled positive, at least one sample in the sequence is positive, i.e., the video is marked as containing an anomaly.
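The bag-labeling rule of step 1 amounts to the following (a minimal sketch; `bag_label` is a hypothetical helper name, not from the patent):

```python
def bag_label(segment_labels):
    # Multi-example rule: a video (bag) is positive if any of its
    # segments (examples) is abnormal; a negative bag guarantees
    # that every segment is normal.
    return int(any(segment_labels))

normal_video = [0, 0, 0, 0]    # labeled negative: all segments normal
abnormal_video = [0, 0, 1, 0]  # labeled positive: at least one abnormal segment
```

Note that a positive label says only that some segment is abnormal, not which one; this is exactly the weak supervision the method exploits.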
Step 2, extracting spatio-temporal sequence features. In the given video data set, a video V_j is cut into n video segments, i.e., V_j = (v_1, v_2, …, v_n), where each video segment v_i comprises 16 consecutive, non-overlapping frames. Each video segment is fed into a C3D convolutional neural network and, after 8 3D convolutions and 5 3D pooling operations, enters a fully connected layer to obtain the spatio-temporal feature vector x_i of segment v_i. The spatio-temporal feature vectors of all segments of video V_j are concatenated in temporal order to obtain the spatio-temporal feature matrix X_j of video V_j.
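The segmentation and feature-matrix construction of step 2 can be sketched as follows. The C3D network itself is replaced by a stand-in extractor; the 4096-dimensional output is an assumption based on C3D's usual fc6 feature size, and all function names are hypothetical:

```python
import numpy as np

def split_into_segments(frames, seg_len=16):
    # Cut a video of shape (T, H, W, C) into non-overlapping
    # seg_len-frame segments v_1..v_n; leftover frames are dropped.
    n = frames.shape[0] // seg_len
    return [frames[i * seg_len:(i + 1) * seg_len] for i in range(n)]

def c3d_features(segment, dim=4096):
    # Stand-in for the C3D network (8 3D convolutions and 5 3D
    # poolings followed by a fully connected layer). A real system
    # would run pretrained C3D; here we return a deterministic
    # placeholder vector so the pipeline shapes can be checked.
    rng = np.random.default_rng(int(abs(segment.sum())) % (2 ** 32))
    return rng.standard_normal(dim)

def feature_matrix(frames, seg_len=16, dim=4096):
    # Concatenate the per-segment feature vectors x_i in temporal
    # order to form the video's spatio-temporal feature matrix X_j.
    segments = split_into_segments(frames, seg_len)
    return np.stack([c3d_features(s, dim) for s in segments])

frames = np.zeros((80, 112, 112, 3), dtype=np.float32)  # 80-frame dummy video
X_j = feature_matrix(frames)  # shape (5, 4096): n = 80 // 16 segments
```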
Step 3, calculating the abnormal score s_i of each video segment. From the spatio-temporal feature matrix X_j = (x_1, x_2, …, x_n) of video V_j obtained in step 2, the spatio-temporal feature vector x_i of any video segment is input into three fully connected layers to obtain the abnormal score s_i of that segment; the abnormal scores of all segments of video V_j are then S_j = (s_1, s_2, …, s_n). The abnormal score s_i is calculated as:

s_i = φ(x_i; t, b)

where t is the weight set (t_1, t_2, t_3) of the three fully connected layers, b is the bias set (b_1, b_2, b_3) of the three fully connected layers, and φ is the three-layer fully connected neural network.
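A three-layer fully connected scoring network of this shape can be sketched as below; the ReLU/sigmoid activations and the 4096 → 512 → 32 → 1 layer sizes are assumptions, since the patent only specifies three fully connected layers with weights t and biases b:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def anomaly_score(x_i, t, b):
    # Three fully connected layers phi mapping a segment feature
    # vector x_i to an abnormal score s_i in [0, 1].
    # t = (t1, t2, t3) are the layer weights, b = (b1, b2, b3) the biases.
    h1 = relu(t[0] @ x_i + b[0])
    h2 = relu(t[1] @ h1 + b[1])
    return float(sigmoid(t[2] @ h2 + b[2])[0])

# Hypothetical layer sizes 4096 -> 512 -> 32 -> 1:
rng = np.random.default_rng(0)
t = (0.01 * rng.standard_normal((512, 4096)),
     0.01 * rng.standard_normal((32, 512)),
     0.01 * rng.standard_normal((1, 32)))
b = (np.zeros(512), np.zeros(32), np.zeros(1))
s_i = anomaly_score(rng.standard_normal(4096), t, b)  # a score in [0, 1]
```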
Step 4: in multi-example learning, a video labeled positive is guaranteed to contain at least one abnormal video segment, so the multi-example highest abnormal score solving function f is constructed by taking the example with the highest abnormal score in each video:

f(V_j) = max_{1 ≤ i ≤ n} s_i

where z represents the number of videos in the given video data set and each video V_j has a corresponding label.
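Per video, the highest-score rule of step 4 reduces to a maximum over segment scores (a sketch; how f aggregates across the z labeled videos is not recoverable from the text):

```python
def bag_max_score(segment_scores):
    # f for a single video: the bag score is the highest abnormal
    # score among its segments (the most anomalous example).
    return max(segment_scores)

S_abnormal = [0.10, 0.85, 0.20]  # abnormal video: one segment scores high
S_normal = [0.10, 0.15, 0.12]    # normal video: all segments score low
```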
In step 5, the abnormal score of an abnormal video should be higher than that of a normal video, so a ranking loss function is constructed to help abnormal videos obtain higher scores than normal videos during model training. In real life, an anomaly generally occupies only a very short time; taking the video as a multi-example object, the abnormal score should vary smoothly between examples, and temporal smoothness is enforced by minimizing the deviation of abnormal scores between adjacent examples. The ranking loss function l(V_m, V_n) so constructed is used to correct the multi-example highest abnormal score solving function f of step 4, helping abnormal videos obtain higher scores than normal videos during training. Here V_m and V_n respectively represent an abnormal video m and a normal video n; the loss contains a temporal smoothing term and a sparsity term, where λ1 = 0.00008 is the coefficient of the temporal smoothing term and λ2 = 0.00008 is the coefficient of the sparsity term.
Step 6, in order to ensure that the model obtained by constructing the network training can enable abnormal video segments in the positive sample to be predicted to obtain high scores, the invention constructs an objective function:
L(W) = l(V_m, V_n) + ||W||
w is a model weight set and comprises weights t of the convolutional neural network to be trained and deviations b of the convolutional neural network to be trained.
Step 7: by extracting the spatio-temporal sequence feature representation of videos and using video-level weak labeling, multiple kinds of abnormal videos in multiple scenes (including physical altercations, fires, explosions, thefts, vandalism of public property, abandoned objects, and dangerous driving) are used as positive samples, and normal videos in multiple scenes are used as negative samples; a video deep abnormal-score ranking model is trained with the multi-example learning method and connected to a real-time video stream, and the abnormal score of the real-time video is computed by the model to judge whether the video is abnormal.
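Step 7's online use of the trained model can be sketched as a sliding 16-frame scoring loop. The 0.5 decision threshold and all names here are assumptions; `feature_fn` and `score_fn` stand for the trained C3D extractor and scoring network, replaced below by trivial stand-ins:

```python
import numpy as np

def detect_stream(frames, feature_fn, score_fn, seg_len=16, threshold=0.5):
    # Score each complete seg_len-frame segment of the incoming
    # stream with the trained model and flag the video as abnormal
    # when any segment's score exceeds the threshold.
    scores = []
    for start in range(0, len(frames) - seg_len + 1, seg_len):
        x = feature_fn(frames[start:start + seg_len])
        scores.append(float(score_fn(x)))
    is_abnormal = bool(scores) and max(scores) > threshold
    return is_abnormal, scores

# Dummy stand-ins: mean "brightness" as the feature, identity as the score.
calm = np.full(32, 0.1)
event = np.concatenate([np.full(16, 0.1), np.full(16, 0.9)])
flag_calm, _ = detect_stream(calm, np.mean, float)
flag_event, _ = detect_stream(event, np.mean, float)
```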
It is understood that the above description is not intended to limit the present application, and the present application is not limited to the above examples, and those skilled in the art can make variations, modifications, additions and substitutions within the spirit and scope of the present application.
Claims (6)
1. An abnormal behavior detection method based on multi-example learning modeling is characterized by comprising the following steps:
step 1, marking an original monitoring video based on a multi-example learning method, wherein a marking target is a video sequence, and a video segment is an example;
step 2, extracting space-time sequence characteristics;
step 3, calculating the abnormal score s_i of each video segment;
Step 4, constructing a multi-example highest abnormal score solving function f;
step 5, constructing a ranking loss function l(V_m, V_n), which is used to correct the multi-example highest abnormal score solving function f of step 4 so that, during model training, an abnormal video obtains a higher score than a normal video; wherein V_m and V_n respectively represent an abnormal video m and a normal video n;
step 6, constructing an objective function:
L(W) = l(V_m, V_n) + ||W||
wherein W is the model weight;
and step 7, connecting the trained video deep abnormal-score ranking model to a real-time video stream, calculating the abnormal score of the real-time video through the model, and judging whether the video is abnormal.
2. The method according to claim 1, wherein in step 2 the spatio-temporal sequence features are extracted as follows: in the labeled video data set, a video V_j is cut into n video segments, where each video segment v_i comprises 16 consecutive, non-overlapping frames; each video segment is fed into a C3D convolutional neural network and, after 8 3D convolutions and 5 3D pooling operations, enters a fully connected layer to obtain the spatio-temporal feature vector x_i of segment v_i; the spatio-temporal feature vectors of all segments of video V_j are concatenated in temporal order to obtain the spatio-temporal feature matrix X_j of video V_j.
3. The method for detecting abnormal behaviors based on multi-example learning modeling as claimed in claim 2, wherein in step 3, from the spatio-temporal feature matrix X_j = (x_1, x_2, …, x_n) of video V_j obtained in step 2, the spatio-temporal feature vector x_i of any video segment is input into three fully connected layers to obtain the abnormal score s_i of that segment; the abnormal scores of all segments of video V_j are then S_j = (s_1, s_2, …, s_n); the abnormal score s_i is calculated as s_i = φ(x_i; t, b), where t is the weight set (t_1, t_2, t_3) of the three fully connected layers, b is the bias set (b_1, b_2, b_3) of the three fully connected layers, and φ is the three-layer fully connected neural network.
4. The method for detecting abnormal behavior based on multi-example learning modeling as claimed in claim 3, wherein in step 4 the multi-example highest abnormal score solving function f takes, for each video, the example with the highest abnormal score among its segments.
5. The method according to claim 4, wherein in step 5, in multi-example learning a video labeled positive is guaranteed to contain at least one abnormal video segment, and the ranking loss function l(V_m, V_n) is constructed by taking the example with the highest abnormal score in each video.
6. The method for detecting abnormal behavior based on multi-instance learning modeling as claimed in claim 1, wherein in step 7, multiple abnormal videos in multiple scenes are used as positive samples based on the weak labeling mode of the videos.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911262679.0A CN111160117A (en) | 2019-12-11 | 2019-12-11 | Abnormal behavior detection method based on multi-example learning modeling |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911262679.0A CN111160117A (en) | 2019-12-11 | 2019-12-11 | Abnormal behavior detection method based on multi-example learning modeling |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111160117A true CN111160117A (en) | 2020-05-15 |
Family
ID=70556796
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911262679.0A Pending CN111160117A (en) | 2019-12-11 | 2019-12-11 | Abnormal behavior detection method based on multi-example learning modeling |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111160117A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111885349A (en) * | 2020-06-08 | 2020-11-03 | 北京市基础设施投资有限公司(原北京地铁集团有限责任公司) | Pipe rack abnormity detection system and method |
CN113011322A (en) * | 2021-03-17 | 2021-06-22 | 南京工业大学 | Detection model training method and detection method for specific abnormal behaviors of monitoring video |
CN113037783A (en) * | 2021-05-24 | 2021-06-25 | 中南大学 | Abnormal behavior detection method and system |
CN113312968A (en) * | 2021-04-23 | 2021-08-27 | 上海海事大学 | Real anomaly detection method in surveillance video |
CN116485041A (en) * | 2023-06-14 | 2023-07-25 | 天津生联智慧科技发展有限公司 | Abnormality detection method and device for gas data |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105930792A (en) * | 2016-04-19 | 2016-09-07 | 武汉大学 | Human action classification method based on video local feature dictionary |
US20170132528A1 (en) * | 2015-11-06 | 2017-05-11 | Microsoft Technology Licensing, Llc | Joint model training |
CN106980826A (en) * | 2017-03-16 | 2017-07-25 | 天津大学 | A kind of action identification method based on neutral net |
CN108846852A (en) * | 2018-04-11 | 2018-11-20 | 杭州电子科技大学 | Monitor video accident detection method based on more examples and time series |
CN109271876A (en) * | 2018-08-24 | 2019-01-25 | 南京理工大学 | Video actions detection method based on temporal evolution modeling and multi-instance learning |
CN110084151A (en) * | 2019-04-10 | 2019-08-02 | 东南大学 | Video abnormal behaviour method of discrimination based on non-local network's deep learning |
CN110263728A (en) * | 2019-06-24 | 2019-09-20 | 南京邮电大学 | Anomaly detection method based on improved pseudo- three-dimensional residual error neural network |
CN110378233A (en) * | 2019-06-20 | 2019-10-25 | 上海交通大学 | A kind of double branch's method for detecting abnormality based on crowd behaviour priori knowledge |
CN110502988A (en) * | 2019-07-15 | 2019-11-26 | 武汉大学 | Group positioning and anomaly detection method in video |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170132528A1 (en) * | 2015-11-06 | 2017-05-11 | Microsoft Technology Licensing, Llc | Joint model training |
CN105930792A (en) * | 2016-04-19 | 2016-09-07 | 武汉大学 | Human action classification method based on video local feature dictionary |
CN106980826A (en) * | 2017-03-16 | 2017-07-25 | 天津大学 | A kind of action identification method based on neutral net |
CN108846852A (en) * | 2018-04-11 | 2018-11-20 | 杭州电子科技大学 | Monitor video accident detection method based on more examples and time series |
CN109271876A (en) * | 2018-08-24 | 2019-01-25 | 南京理工大学 | Video actions detection method based on temporal evolution modeling and multi-instance learning |
CN110084151A (en) * | 2019-04-10 | 2019-08-02 | 东南大学 | Video abnormal behaviour method of discrimination based on non-local network's deep learning |
CN110378233A (en) * | 2019-06-20 | 2019-10-25 | 上海交通大学 | A kind of double branch's method for detecting abnormality based on crowd behaviour priori knowledge |
CN110263728A (en) * | 2019-06-24 | 2019-09-20 | 南京邮电大学 | Anomaly detection method based on improved pseudo- three-dimensional residual error neural network |
CN110502988A (en) * | 2019-07-15 | 2019-11-26 | 武汉大学 | Group positioning and anomaly detection method in video |
Non-Patent Citations (2)
Title |
---|
胡正平 (Hu Zhengping); 张乐 (Zhang Le); 尹艳华 (Yin Yanhua): "Sparse representation video anomaly detection algorithm based on AP clustering of spatio-temporal deep features" *
胡正平 (Hu Zhengping); 张乐 (Zhang Le); 尹艳华 (Yin Yanhua): "Sparse representation video anomaly detection algorithm based on AP clustering of spatio-temporal deep features", Journal of Signal Processing, no. 03 *
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111885349A (en) * | 2020-06-08 | 2020-11-03 | 北京市基础设施投资有限公司(原北京地铁集团有限责任公司) | Pipe rack abnormity detection system and method |
CN111885349B (en) * | 2020-06-08 | 2023-05-09 | 北京市基础设施投资有限公司 | Pipe gallery abnormality detection system and method |
CN113011322A (en) * | 2021-03-17 | 2021-06-22 | 南京工业大学 | Detection model training method and detection method for specific abnormal behaviors of monitoring video |
CN113011322B (en) * | 2021-03-17 | 2023-09-05 | 贵州安防工程技术研究中心有限公司 | Detection model training method and detection method for monitoring specific abnormal behavior of video |
CN113312968A (en) * | 2021-04-23 | 2021-08-27 | 上海海事大学 | Real anomaly detection method in surveillance video |
CN113312968B (en) * | 2021-04-23 | 2024-03-12 | 上海海事大学 | Real abnormality detection method in monitoring video |
CN113037783A (en) * | 2021-05-24 | 2021-06-25 | 中南大学 | Abnormal behavior detection method and system |
CN116485041A (en) * | 2023-06-14 | 2023-07-25 | 天津生联智慧科技发展有限公司 | Abnormality detection method and device for gas data |
CN116485041B (en) * | 2023-06-14 | 2023-09-01 | 天津生联智慧科技发展有限公司 | Abnormality detection method and device for gas data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111160117A (en) | Abnormal behavior detection method based on multi-example learning modeling | |
Hu et al. | Real-time video fire smoke detection by utilizing spatial-temporal ConvNet features | |
CN110298404B (en) | Target tracking method based on triple twin Hash network learning | |
CN102521340A (en) | Method for analyzing TV video based on role | |
CN110675421B (en) | Depth image collaborative segmentation method based on few labeling frames | |
CN113129284B (en) | Appearance detection method based on 5G cloud edge cooperation and implementation system | |
CN113221710A (en) | Neural network-based drainage pipeline defect identification method, device, equipment and medium | |
CN108038515A (en) | Unsupervised multi-target detection tracking and its storage device and camera device | |
CN111368634A (en) | Human head detection method, system and storage medium based on neural network | |
CN113723558A (en) | Remote sensing image small sample ship detection method based on attention mechanism | |
CN103279581A (en) | Method for performing video retrieval by compact video theme descriptors | |
CN113469938A (en) | Pipe gallery video analysis method and system based on embedded front-end processing server | |
CN113052073A (en) | Meta learning-based few-sample behavior identification method | |
CN116977859A (en) | Weak supervision target detection method based on multi-scale image cutting and instance difficulty | |
CN111275025A (en) | Parking space detection method based on deep learning | |
Ben-Ahmed et al. | Eurecom@ mediaeval 2017: Media genre inference for predicting media interestingnes | |
Geng et al. | Shelf Product Detection Based on Deep Neural Network | |
CN115240647A (en) | Sound event detection method and device, electronic equipment and storage medium | |
CN114581769A (en) | Method for identifying houses under construction based on unsupervised clustering | |
CN111401519B (en) | Deep neural network unsupervised learning method based on similarity distance in object and between objects | |
CN113658216A (en) | Remote sensing target tracking method based on multi-stage self-adaptive KCF and electronic equipment | |
Caselles-Dupré et al. | Are standard object segmentation models sufficient for learning affordance segmentation? | |
Vasudevan et al. | ETH-CVL@ MediaEval 2016: Textual-Visual Embeddings and Video2GIF for Video Interestingness. | |
Gao et al. | Intelligent appearance quality detection of air conditioner external unit and dataset construction | |
Chen et al. | Recognition and localization of freshwater fish heads and tails based on lightweight neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||