CN111291699B - Substation personnel behavior identification method based on surveillance-video temporal action localization and anomaly detection

Substation personnel behavior identification method based on surveillance-video temporal action localization and anomaly detection

Info

Publication number
CN111291699B
CN111291699B
Authority
CN
China
Prior art keywords
video
behavior
abnormal
time sequence
segment
Prior art date
Legal status
Active
Application number
CN202010103140.7A
Other languages
Chinese (zh)
Other versions
CN111291699A (en)
Inventor
聂礼强
战新刚
郑晓云
姚一杨
徐万龙
尉寅玮
Current Assignee
Shandong University
State Grid Zhejiang Electric Power Co Ltd
Quzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Zhiyang Innovation Technology Co Ltd
Original Assignee
Shandong University
State Grid Zhejiang Electric Power Co Ltd
Quzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Zhiyang Innovation Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shandong University, State Grid Zhejiang Electric Power Co Ltd, Quzhou Power Supply Co of State Grid Zhejiang Electric Power Co Ltd, and Zhiyang Innovation Technology Co Ltd
Priority to CN202010103140.7A
Publication of CN111291699A
Application granted
Publication of CN111291699B
Legal status: Active

Classifications

    • G06V 20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V 20/42: Higher-level, semantic clustering, classification or understanding of video scenes, of sport video content
    • G06F 18/23213: Non-hierarchical clustering techniques using statistics or function optimisation, with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06N 3/08: Neural network learning methods
    • G06Q 50/06: ICT specially adapted for energy or water supply
    • G06Q 50/265: Government or public services; personal security, identity or safety
    • G06V 20/49: Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G06V 20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 40/20: Recognition of human movements or behaviour in image or video data, e.g. gesture recognition
    • Y04S 10/50: Systems or methods supporting power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Primary Health Care (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Business, Economics & Management (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Evolutionary Biology (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Probability & Statistics with Applications (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Computer Security & Cryptography (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)

Abstract

A substation personnel behavior identification method based on surveillance-video temporal action localization and anomaly detection. Prior knowledge is used to autonomously collect, process, and construct a surveillance-video dataset of abnormal substation personnel behavior, introducing a new video dataset for substation abnormal-behavior detection. A transfer-learning-based video action detection model captures temporal information, enabling accurate temporal localization of actions in surveillance video: in an untrimmed video, it finds when a worker's action starts and ends and classifies the action. The video clips of specific personnel behaviors produced by action detection are then examined with a video anomaly-detection technique trained under weak supervision with multiple-instance learning; the resulting model judges whether a clip contains abnormal behavior and accurately detects both the abnormal behavior and the temporal position at which it occurs, improving the utilization value of substation video surveillance and the accuracy of anomaly detection.

Description

Substation personnel behavior identification method based on surveillance-video temporal action localization and anomaly detection
Technical Field
The invention discloses a substation personnel behavior identification method based on surveillance-video temporal action localization and anomaly detection, belonging to the technical field of intelligent power-grid management.
Background
In today's power systems, the operation and maintenance of power transmission and transformation are particularly important, bearing directly on the normal running of the power system and on industrial and residential electricity use. During operation, the failure of certain equipment can bring the power system down, and mistakes by the personnel involved can likewise cause system problems. In many power-transformation settings, safety accidents caused by abnormal operations of workers occur frequently; such accidents endanger the operators and seriously disrupt the order of social production and daily life. Security management and monitoring of the power-transformation working environment are therefore receiving ever more attention.
Video security monitoring enables real-time observation and centralized management, and is an important means of safeguarding the lives of substation workers and the normal operation of transmission and transformation equipment. It offers high coverage and good stability, and can record the power-transformation work scene around the clock and from all angles. Although video surveillance technology is well developed and widely deployed in power-transformation scenes, shortcomings remain. Ordinary video monitoring merely records the work scene: it can only shoot and store footage, so subsequent judgment and handling require dedicated staff watching around the clock, which wastes human resources. Moreover, a substation monitored all day produces a large volume of data with much useless information, and recognition that relies solely on shifts of monitoring personnel watching with the naked eye is extremely inefficient. Research into video-surveillance-based detection of abnormal personnel behavior in power-transformation scenes is therefore highly necessary.
Chinese patent document CN110084151A discloses a video abnormal-behavior discrimination method based on non-local deep networks, belonging to the fields of computer vision, intelligence, and multimedia signal processing. The method builds its training set with the idea of multiple-instance learning, defining and labeling positive and negative bags and instances over the video data. Features of a video sample are extracted with a non-local network: an I3D network with residual structure serves as the convolutional filter for extracting spatio-temporal information, and non-local blocks fuse long-range dependency information to meet the temporal and spatial requirements of video feature extraction. Once the features are obtained, a regression task is set up and the model is trained by weakly supervised learning. That invention can discriminate unlabeled classes and suits anomaly-detection tasks in which positive (abnormal) samples are rare and intra-class diversity is high.
Patent document CN110084151A uses non-local deep networks to judge abnormal behavior in video. By contrast, the present invention uses an improved C3D feature-extraction network to extract features from substation surveillance frame sequences; constructs a temporal-proposal extraction network to extract, from long surveillance video, candidate temporal segments that may contain abnormal behavior of substation personnel; constructs a behavior classification network to classify the extracted personnel-behavior video segments; and constructs an abnormal-behavior detection network to examine the candidate temporal segments produced by the temporal behavior classification network.
The method of CN110084151A cannot be applied to the long surveillance videos of a power-transformation scene, whereas the present invention builds a multi-network fusion model on a 3D feature-extraction backbone and determines abnormal behavior in long surveillance video by performing temporal-proposal extraction, temporal behavior classification of video segments, and abnormal-behavior detection.
CN110084151A obtains the instances needed for multiple-instance learning by cutting a video evenly into 8 segments. The present invention instead designs a frame-clustering algorithm based on spatio-temporal continuity to segment videos: each training video is divided into 32 segments that each contain a single complete action, and these segments serve as the instances in MIL, achieving accurate video-segment cutting and content evaluation.
In summary, the prior art still has many technical deficiencies and is difficult to apply to substation scenes for identifying specific worker behaviors.
Disclosure of Invention
Aiming at the deficiencies of the prior art, the invention discloses a substation personnel behavior identification method based on surveillance-video temporal action localization and anomaly detection.
The technical problems to be solved by the invention are as follows:
(1) Since no open-source substation surveillance video data exist on the current network, such data must be collected autonomously, and a new substation abnormal-behavior detection video dataset must be constructed by temporally classifying personnel actions and labeling abnormal behaviors. Given that abnormal behaviors in a substation occur at low frequency and are hard to collect and annotate, how to construct a suitable training dataset for detecting abnormal behavior of substation personnel is an important problem to be solved.
(2) Because substation abnormal events are infrequent, surveillance video of abnormal personnel behavior is hard to collect, and the self-built behavior-detection dataset has a limited number of samples. Under insufficient data, how to construct an efficient 3D convolutional feature-extraction network, so that the model can fully mine the features of video frame sequences from a small amount of substation surveillance video, is a key research problem.
(3) Temporal action detection must combine frame images with temporal information, yet personnel actions in a substation surveillance scene span long durations and have blurry temporal boundaries. How to achieve high-quality temporal-segment cutting and accurate action classification is another important problem.
(4) In abnormal-behavior detection, accurately annotating the temporal position of every abnormal behavior in a video is time-consuming; the rarity of abnormal events means positive samples are far fewer than negative samples in training; and in a substation surveillance scene both normal and abnormal events are complex and diverse, with high intra-class diversity. How to overcome these difficulties and achieve accurate detection of abnormal behavior of substation personnel is a key problem for the invention.
Summary of the invention:
The invention aims to automatically identify abnormal behavior of substation personnel from substation surveillance video, using 3D-convolution-based temporal action localization and anomaly detection.
Aiming at the characteristics of abnormal personnel behavior under substation video surveillance, the invention uses prior knowledge to autonomously collect, process, and construct a surveillance-video dataset of abnormal substation personnel behavior, introducing a new video dataset for substation abnormal-behavior detection. The invention also captures temporal information from the surveillance footage with a transfer-learning-based video action detection model, enabling accurate temporal localization of actions in surveillance video: in an untrimmed video it finds when a worker's action starts and ends, and classifies the action. Meanwhile, for the personnel-behavior video clips produced by action detection, the invention applies a video anomaly-detection technique trained under weak supervision with Multiple Instance Learning (MIL); the resulting model judges whether a clip contains abnormal behavior, accurately detecting both the abnormal behavior and the temporal position at which it occurs, and improving the utilization value of substation video surveillance and the accuracy of anomaly detection.
The technical solution of the invention is as follows:
A substation personnel behavior identification method based on surveillance-video temporal action localization and anomaly detection, characterized by comprising the following steps:
S1: use prior knowledge to autonomously collect, process, and construct a surveillance-video dataset of abnormal substation personnel behavior;
S2: construct a 3D convolutional feature-extraction network: perform feature extraction on the input untrimmed long substation surveillance video and extract the feature information of the surveillance video sequence;
S3: construct a temporal-proposal extraction network to extract candidate temporal segments that may contain abnormal behavior of substation personnel;
S4: construct a temporal behavior classification network to classify and regress the extracted personnel-behavior video segments;
S5: construct an abnormal-behavior detection network to detect abnormal behavior in the candidate temporal segments produced by the temporal behavior classification network of step S4;
S6: train the structure formed by S2-S4 end-to-end and jointly, using transfer learning, and use the trained model to process long surveillance videos so as to cut out coarsely classified behavior-category video clips;
S7: design a frame-clustering algorithm based on spatio-temporal continuity to segment the videos, and use the multiple-instance-learning-based abnormal-behavior detection network to evaluate the content of each video clip, identifying whether abnormal behavior exists in the clip and determining its category and precise position.
Preferably, the dataset of step S1 is constructed as follows:
S11: use video surveillance equipment installed in the substation to collect personnel-behavior surveillance videos under different shooting angles and background environments;
S12: annotate typical behavior categories temporally to construct a video dataset for the temporal action detection task, which performs temporal localization and action-category classification of personnel actions in long substation surveillance video; typical behaviors include, but are not limited to, moving about, operating instruments, and reading monitoring data;
S13: label the videos, constructing an abnormal-behavior identification video dataset by marking whether each video segment contains abnormal behavior; the abnormal-behavior identification task detects abnormal behavior, by category, in the video clips produced by the temporal action detection module; abnormal behaviors include, but are not limited to, illegal operations, running indoors, and falling down.
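By way of illustration only, the records produced by S12 and S13 might be stored as sketched below; every field name and value here is a hypothetical example, since the patent does not prescribe an annotation format:

```python
# Illustrative annotation records for the two labeling tasks (S12/S13).
# All file names, field names, and values are hypothetical assumptions.

temporal_action_record = {
    "video": "substation_cam03_20200218.mp4",   # hypothetical file name
    "actions": [
        # (start_sec, end_sec, category) triples for the temporal task
        {"start": 12.4, "end": 31.0, "category": "instrument_operation"},
        {"start": 55.2, "end": 63.8, "category": "monitoring_data"},
    ],
}

anomaly_record = {
    "segment": "substation_cam03_20200218_seg07.mp4",
    "is_abnormal": True,            # video-level (weak) label only
    "note": "illegal_operation",    # optional category, e.g. running indoors, falling
}
```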
Preferably, the 3D convolutional feature-extraction network of step S2 is constructed as follows:
S21: improve the C3D (3D Convolution) feature-extraction network by replacing ordinary convolutions with depthwise-separable convolutions; this greatly reduces computation and model size while preserving accuracy (a minimal sketch follows below);
S22: use the improved C3D network to extract features from the substation surveillance frame sequences: an input frame sequence of dimension $3 \times L \times H \times W$ is processed by the feature-extraction network, composed of stacked depthwise-separable 3D convolutional layers, to obtain a feature map $C_{conv5b}$ of dimension $512 \times \frac{L}{8} \times \frac{H}{16} \times \frac{W}{16}$.
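A minimal PyTorch sketch of the depthwise-separable 3D convolution substitution described in S21; the channel counts and the test input shape are illustrative assumptions, as the patent does not publish the exact layer configuration:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv3d(nn.Module):
    """One depthwise-separable 3D convolution block: a per-channel
    (depthwise) 3x3x3 convolution followed by a 1x1x1 pointwise
    convolution that mixes channels. Compared with a dense Conv3d,
    this cuts parameters and FLOPs roughly by the kernel volume."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv3d(in_ch, in_ch, kernel_size=3,
                                   padding=1, groups=in_ch)
        self.pointwise = nn.Conv3d(in_ch, out_ch, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.pointwise(self.depthwise(x)))

# Sanity check on a C3D-style input: (batch, 3, frames, H, W)
x = torch.randn(1, 3, 16, 112, 112)
y = DepthwiseSeparableConv3d(3, 64)(x)
print(y.shape)  # torch.Size([1, 64, 16, 112, 112])
```

The saving is roughly the kernel volume: a dense 3x3x3 Conv3d from 64 to 128 channels needs 64*128*27 weights, while the separable pair needs only 64*27 + 64*128.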
Preferably, the temporal-proposal extraction network of step S3 is constructed as follows:
S31: first take the feature map $C_{conv5b}$ obtained in S2 as input for candidate-segment generation; assume candidate temporal segments are uniformly distributed over the $\frac{L}{8}$ temporal positions, with K candidate segments of different lengths generated at each temporal position, giving $\frac{L}{8} \cdot K$ candidate temporal segments in total;
S32: extend the temporal receptive field with a $3 \times 3 \times 3$ 3D convolutional filter, then down-sample the spatial dimensions from $\frac{H}{16} \times \frac{W}{16}$ to $1 \times 1$ with a 3D max-pooling filter of size $1 \times \frac{H}{16} \times \frac{W}{16}$, obtaining a temporal-position feature map $C_{tl}$ of dimension $512 \times \frac{L}{8} \times 1 \times 1$; the 512-dimensional feature vector at each temporal position is used to predict the relative offsets $\{\delta c_i, \delta l_i\}$ of the center position $c_i$ and length $l_i$ of each anchor, $i \in \{1, \dots, K\}$ (see the decoding sketch below);
S33: add two $1 \times 1 \times 1$ convolutions on the feature map $C_{tl}$ to predict confidence scores for each candidate segment being background or containing behavior activity.
Preferably, the temporal behavior classification network of step S4 is constructed as follows:
S41: apply Non-Maximum Suppression (NMS) with a threshold of 0.6 to the candidate temporal segments from step S3 to obtain higher-quality Region-of-Interest (RoI) candidate segments (a sketch follows below);
S42: map each RoI onto the feature map $C_{conv5b}$ from step S2 and obtain 512x1x4x4 output features through a 3D RoI pooling operation;
S43: feed the output features first into one large fully connected layer for feature synthesis, then into two fully connected layers for classification and regression respectively: a classification layer classifies the substation personnel behavior category, and a regression layer adjusts the start and end times of the behavior segment.
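A minimal numpy sketch of the temporal (1-D) non-maximum suppression of S41 at the stated 0.6 threshold:

```python
import numpy as np

def temporal_nms(segments: np.ndarray, scores: np.ndarray,
                 iou_thresh: float = 0.6) -> list:
    """Greedy NMS on 1-D temporal segments.
    segments: (N, 2) array of [start, end]; scores: (N,).
    Returns indices of the kept segments, best first."""
    order = scores.argsort()[::-1]          # descending by score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # temporal IoU between the kept segment and the rest
        inter = (np.minimum(segments[i, 1], segments[order[1:], 1])
                 - np.maximum(segments[i, 0], segments[order[1:], 0])).clip(0)
        union = ((segments[i, 1] - segments[i, 0])
                 + (segments[order[1:], 1] - segments[order[1:], 0]) - inter)
        order = order[1:][inter / union <= iou_thresh]
    return keep
```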
Preferably, the abnormal-behavior detection network of step S5 is constructed as follows:
S51: take the specific video clips cut out by the temporal action detection part; picture size and frame rate can be adjusted to the application scene, for example a size of 360x480 with the frame rate fixed at 32 fps;
S52: divide each video clip into unit clips with a fixed length of 1 frame and cluster them with a K-means frame-clustering algorithm based on spatio-temporal continuity, each cluster representing one complete action; the video is finally divided into 32 groups of segments, each containing a single complete action;
S53: extract features of the video segments with the 3D convolutional feature-extraction network built in step S2, take the $C_{conv5b}$ features of every 16 frames, and add a fully connected layer to obtain 4096-dimensional features;
S54: feed the extracted features into a Multi-Layer Perceptron (MLP) of 3 consecutive fully connected layers that scores each segment; the score of the segment with the largest anomaly score in the video is taken as the video's anomaly score, giving the final predicted anomaly value.
According to a preferred embodiment of the invention, in step S54 the first fully connected layer of the MLP has 512 neurons and is activated with a Rectified Linear Unit (ReLU); the second has 32 neurons and the third has 1 neuron, activated with a Sigmoid function.
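A direct PyTorch sketch of the scoring MLP as specified: widths 512, 32, 1 with ReLU after the first layer and Sigmoid at the output; whether the 32-unit layer carries its own activation, or whether dropout is used, is not stated in the text, so both are omitted here:

```python
import torch.nn as nn

class AnomalyScorer(nn.Module):
    """3-layer MLP from S54: 4096-d C3D segment feature -> anomaly
    score in [0, 1]. Layer widths (512, 32, 1) and activations
    (ReLU, Sigmoid) follow the text; everything else is assumed."""
    def __init__(self, in_dim: int = 4096):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(inplace=True),
            nn.Linear(512, 32),
            nn.Linear(32, 1), nn.Sigmoid(),
        )

    def forward(self, feats):               # feats: (num_segments, 4096)
        return self.net(feats).squeeze(-1)  # (num_segments,) scores
```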
Preferably, the training and operation of the temporal action detection model in step S6 proceeds as follows:
S61: in transfer learning, first jointly train the network modules corresponding to steps S2-S4 on the THUMOS2014 dataset; then, fixing the parameters of the first four layers of the feature-extraction network of step S2, train on the dataset constructed in step S1 to obtain the parameters of the remaining network structure; this design is adopted because the substation personnel-behavior video dataset obtained in S1 is not large, and transfer learning improves the detection accuracy and generalization of the model; THUMOS2014 is an open-source dataset for action recognition and temporal action detection;
S62: train on the RoIs obtained in step S4 with a 1:3 ratio of positive to negative samples; specifically, RoIs whose IoU with the ground truth exceeds 0.5 are positive samples, and RoIs below 0.5 are negative samples;
S63: for the temporal-proposal extraction network of step S3 and the temporal behavior classification network of step S4, optimize the classification and regression tasks simultaneously (a sketch follows at the end of this section). The classification task $L_{cls}$ uses a softmax loss and the regression task $L_{reg}$ a smooth-L1 loss:

$$L(\{\alpha_i\},\{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(\alpha_i, \alpha_i^*) + \lambda \frac{1}{N_{reg}} \sum_i \alpha_i^* L_{reg}(t_i, t_i^*)$$

where $N_{cls}$ and $N_{reg}$ represent the number of samples selected in one training pass and the number of candidate temporal segments used for regression, $\lambda$ is a loss-balance parameter set to 1, $i$ indexes candidate temporal segments within a batch, $\alpha_i$ is the predicted likelihood that a candidate segment contains human behavior, $\alpha_i^*$ is the ground truth, $t_i = \{\delta c_i, \delta l_i\}$ is the relative offset of the predicted temporal segment to the candidate segment, and $t_i^* = \{\delta c_i^*, \delta l_i^*\}$ is the coordinate transformation between the ground truth and the candidate segment, computed as:

$$\delta c_i = (c_i^* - c_i)/l_i, \qquad \delta l_i = \log(l_i^*/l_i)$$

In the temporal-proposal extraction sub-network of step S3, $L_{cls}$ predicts whether a candidate temporal segment contains personnel behavior, regardless of the specific behavior class, and $L_{reg}$ optimizes the relative displacement between the candidate segment and the ground truth; in the temporal behavior classification sub-network of step S4, $L_{cls}$ predicts the specific personnel-behavior category of the RoI, and $L_{reg}$ optimizes the relative displacement between the RoI and the ground truth; the four losses of the two sub-networks are optimized jointly;
S64: using the temporal action detection network modules built in steps S2-S4 with the model parameters obtained from training in S61-S63, process the long substation surveillance video so as to cut out the coarsely classified behavior-category video clips.
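A sketch of the joint objective of S63 under the formula above, assuming the standard reading that the smooth-L1 term is evaluated on positive samples only (the $\alpha_i^*$ gate):

```python
import torch
import torch.nn.functional as F

def joint_loss(cls_logits, cls_labels, reg_pred, reg_target, lam=1.0):
    """cls_logits: (N_cls, num_classes); cls_labels: (N_cls,) int labels;
    reg_pred / reg_target: (N_reg, 2) offsets {dc, dl}, gathered from
    positive samples only. lam is the balance weight, 1 per the text."""
    l_cls = F.cross_entropy(cls_logits, cls_labels)  # softmax loss, mean over N_cls
    l_reg = F.smooth_l1_loss(reg_pred, reg_target)   # smooth L1, mean over N_reg
    return l_cls + lam * l_reg

def encode_offsets(gt_center, gt_length, anchor_center, anchor_length):
    """Coordinate transform from S63: dc = (c* - c)/l, dl = log(l*/l)."""
    return ((gt_center - anchor_center) / anchor_length,
            torch.log(gt_length / anchor_length))
```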
Preferably, the training and operation of the abnormal-behavior detection model in step S7 proceeds as follows:
S71: use the K-means frame-clustering algorithm based on spatio-temporal continuity to divide each video clip into 32 groups of segments, each containing a single complete action (see the first sketch after this section): first split the clip into a dataset of unit frames and randomly select 32 video frames from it as centroids; compute the Euclidean similarity distance between each frame and the centroids immediately before and after it in temporal position, and assign the frame to the set of the nearer centroid; once all frames are grouped, recompute the centroid of each group, and repeat until every newly computed centroid lies within a temporal distance of 8 frames of the original one;
S72: with the video-segmentation algorithm of step S71, each training video is divided into 32 segments containing a single complete action; the segments are the instances in MIL and each video is a bag: during training, 10 positive bags (abnormal-behavior videos) and 10 negative bags (normal-behavior videos) are randomly selected as a mini-batch;
S73: extract the spatio-temporal features of each instance segment with the 3D convolutional feature-extraction network built in step S2 and apply a fully connected operation to obtain 4096-dimensional features, used as the feature maps required by the subsequent multiple-instance learning;
S74: score each segment with the MLP built in step S54, then take the segment with the largest anomaly score in a positive bag as a potential abnormal sample and the segment with the largest anomaly score in a negative bag as a non-abnormal sample, and train the MLP's parameters on these two samples with the objective (see the second sketch after this section):

$$\max_{i \in \beta_a} f(v_a^i) > \max_{i \in \beta_n} f(v_n^i)$$

where $\beta_a$ denotes a positive bag and $v_a$ an abnormal sample, $\beta_n$ denotes a negative bag and $v_n$ a non-abnormal sample, and $f$ is the model's prediction function;
a hinge loss is adopted to enlarge the score gap between positive and negative instances, the training effect being that the model outputs high scores for abnormal samples and low scores for non-abnormal ones; the hinge loss is:

$$l(\beta_a, \beta_n) = \max\Bigl(0,\ 1 - \max_{i \in \beta_a} f(v_a^i) + \max_{i \in \beta_n} f(v_n^i)\Bigr)$$

In a real substation scene, abnormal behavior usually occupies only a very short span of time, i.e. the proportion of positive samples (abnormal behaviors) within a positive bag is very low, so the scores within a positive bag should be sparse and a sparsity constraint is added; meanwhile, considering the temporal structure of the video, since the segments are continuous the anomaly scores of adjacent segments should also vary smoothly, so a temporal-smoothness constraint is added and the loss becomes:

$$l(\beta_a, \beta_n) = \max\Bigl(0,\ 1 - \max_{i \in \beta_a} f(v_a^i) + \max_{i \in \beta_n} f(v_n^i)\Bigr) + \lambda_1 \sum_i \bigl(f(v_a^i) - f(v_a^{i+1})\bigr)^2 + \lambda_2 \sum_i f(v_a^i)$$

To prevent overfitting, an $\ell_2$ regularizer is finally added, giving the final loss:

$$L(w) = l(\beta_a, \beta_n) + \|w\|_F$$

where $w$ denotes the model weights;
S75: using the abnormal-behavior detection network built in step S5 with the model parameters obtained from training in S72-S74, examine the behavior-category video clips obtained in step S64 to identify whether a specific behavior clip contains abnormal behavior, together with the category and precise temporal position of the abnormality.
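A minimal numpy sketch of the temporally constrained K-means of S71 referenced above; which per-frame feature the Euclidean distance is computed on (raw pixels or embeddings) is an assumption, since the patent leaves it open:

```python
import numpy as np

def spatiotemporal_kmeans(frames: np.ndarray, k: int = 32,
                          stop_dist: int = 8, max_iter: int = 100):
    """Temporally constrained K-means over the frames of one clip.
    frames: (N, D) per-frame feature vectors in temporal order (what is
    clustered, raw pixels or embeddings, is an assumption).
    Each frame is compared only against the two centroids that bracket
    it in time and joins the nearer one (Euclidean), so every cluster
    stays temporally contiguous. Iteration stops once every centroid
    moves by fewer than `stop_dist` frames. Returns (N,) segment ids."""
    n = len(frames)
    centroid_idx = np.sort(np.random.choice(n, size=k, replace=False))
    labels = np.zeros(n, dtype=int)
    for _ in range(max_iter):
        for t in range(n):
            j = int(np.searchsorted(centroid_idx, t))   # right-bracket centroid
            if j == 0:
                labels[t] = 0
            elif j >= k:
                labels[t] = k - 1
            else:
                d_left = np.linalg.norm(frames[t] - frames[centroid_idx[j - 1]])
                d_right = np.linalg.norm(frames[t] - frames[centroid_idx[j]])
                labels[t] = j - 1 if d_left <= d_right else j
        # new centroid of each group: its temporally central frame
        new_idx = np.array([int(np.median(np.where(labels == g)[0]))
                            if np.any(labels == g) else centroid_idx[g]
                            for g in range(k)])
        if np.all(np.abs(new_idx - centroid_idx) < stop_dist):
            break
        centroid_idx = np.sort(new_idx)
    return labels
```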
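And a PyTorch sketch of the MIL ranking loss of S74 with the smoothness and sparsity terms of the final formula; the weights lambda1 and lambda2 are illustrative assumptions, as the patent does not give their values:

```python
import torch

def mil_ranking_loss(scores_pos: torch.Tensor, scores_neg: torch.Tensor,
                     lam1: float = 8e-5, lam2: float = 8e-5) -> torch.Tensor:
    """scores_pos / scores_neg: (32,) per-segment anomaly scores for one
    positive (abnormal) and one negative (normal) bag. Hinge ranking on
    the max-scoring segments, plus temporal smoothness and sparsity on
    the positive bag, following l(beta_a, beta_n) above."""
    hinge = torch.clamp(1.0 - scores_pos.max() + scores_neg.max(), min=0.0)
    smooth = ((scores_pos[1:] - scores_pos[:-1]) ** 2).sum()  # smoothness term
    sparse = scores_pos.sum()                                 # sparsity term
    return hinge + lam1 * smooth + lam2 * sparse

# Per S72, a mini-batch pairs 10 abnormal with 10 normal bags; the total
# objective also adds the Frobenius norm of the scorer weights (L(w) above).
```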
The beneficial effects of the invention are as follows:
Aiming at the characteristics of abnormal personnel behavior under substation video surveillance, the method uses prior knowledge to autonomously collect, process, and construct a surveillance-video dataset of abnormal substation personnel behavior, filling the gap in video data for the field of substation abnormal-behavior detection. The transfer-learning-based 3D convolutional feature-extraction network lets the model fully mine the features of video frame sequences even though substation personnel-behavior surveillance data are limited, improving the algorithm's operating efficiency and strengthening the model's accuracy and generalization. The temporal action detection module is trained end-to-end and jointly, and by fully fusing the feature-extraction, temporal-detection, and action-classification network structures it achieves high-quality temporal-segment cutting and accurate action classification. Meanwhile, the anomaly-detection network is trained with weakly supervised multiple-instance learning; the resulting model accurately judges whether a video clip contains abnormal behavior while precisely detecting the abnormal-behavior category and the temporal position at which it occurs, improving the utilization value of substation video surveillance and achieving efficient, high-quality detection of abnormal behavior of substation personnel.
Drawings
FIG. 1 is an overall flow diagram of the present invention;
FIG. 2 is a schematic diagram of the abnormal behavior recognition result of the present invention;
fig. 3 is a schematic diagram of the normal behavior recognition result of the present invention.
Detailed Description
The invention is described in detail below with reference to an embodiment and the accompanying drawings, but is not limited thereto.
Embodiment
As shown in fig. 1.
A substation personnel behavior identification method based on surveillance-video temporal action localization and anomaly detection comprises the following steps:
S1: use prior knowledge to autonomously collect, process, and construct a surveillance-video dataset of abnormal substation personnel behavior;
S2: construct a 3D convolutional feature-extraction network: perform feature extraction on the input untrimmed long substation surveillance video and extract the feature information of the surveillance video sequence;
S3: construct a temporal-proposal extraction network to extract candidate temporal segments that may contain abnormal behavior of substation personnel;
S4: construct a temporal behavior classification network to classify and regress the extracted personnel-behavior video segments;
S5: construct an abnormal-behavior detection network to detect abnormal behavior in the candidate temporal segments produced by the temporal behavior classification network of step S4;
S6: train the structure formed by S2-S4 end-to-end and jointly, using transfer learning, and use the trained model to process long surveillance videos so as to cut out coarsely classified behavior-category video clips;
S7: design a frame-clustering algorithm based on spatio-temporal continuity to segment the videos, and use the multiple-instance-learning-based abnormal-behavior detection network to evaluate the content of each video clip, identifying whether abnormal behavior exists in the clip and determining its category and precise position.
The dataset of step S1 is constructed as follows:
S11: use video surveillance equipment installed in the substation to collect personnel-behavior surveillance videos under different shooting angles and background environments;
S12: annotate typical behavior categories temporally to construct a video dataset for the temporal action detection task, which performs temporal localization and action-category classification of personnel actions in long substation surveillance video; typical behaviors include, but are not limited to, moving about, operating instruments, and reading monitoring data;
S13: label the videos, constructing an abnormal-behavior identification video dataset by marking whether each video segment contains abnormal behavior; the abnormal-behavior identification task detects abnormal behavior, by category, in the video clips produced by the temporal action detection module; abnormal behaviors include, but are not limited to, illegal operations, running indoors, and falling down.
The 3D convolutional feature-extraction network of step S2 is constructed as follows:
S21: improve the C3D (3D Convolution) feature-extraction network by replacing ordinary convolutions with depthwise-separable convolutions; this greatly reduces computation and model size while preserving accuracy;
S22: use the improved C3D network to extract features from the substation surveillance frame sequences: an input frame sequence of dimension $3 \times L \times H \times W$ is processed by the feature-extraction network, composed of stacked depthwise-separable 3D convolutional layers, to obtain a feature map $C_{conv5b}$ of dimension $512 \times \frac{L}{8} \times \frac{H}{16} \times \frac{W}{16}$.
The temporal-proposal extraction network of step S3 is constructed as follows:
S31: first take the feature map $C_{conv5b}$ obtained in S2 as input for candidate-segment generation; assume candidate temporal segments are uniformly distributed over the $\frac{L}{8}$ temporal positions, with K candidate segments of different lengths generated at each temporal position, giving $\frac{L}{8} \cdot K$ candidate temporal segments in total;
S32: extend the temporal receptive field with a $3 \times 3 \times 3$ 3D convolutional filter, then down-sample the spatial dimensions from $\frac{H}{16} \times \frac{W}{16}$ to $1 \times 1$ with a 3D max-pooling filter of size $1 \times \frac{H}{16} \times \frac{W}{16}$, obtaining a temporal-position feature map $C_{tl}$ of dimension $512 \times \frac{L}{8} \times 1 \times 1$; the 512-dimensional feature vector at each temporal position is used to predict the relative offsets $\{\delta c_i, \delta l_i\}$ of the center position $c_i$ and length $l_i$ of each anchor, $i \in \{1, \dots, K\}$;
S33: add two $1 \times 1 \times 1$ convolutions on the feature map $C_{tl}$ to predict confidence scores for each candidate segment being background or containing behavior activity.
The temporal behavior classification network of step S4 is constructed as follows:
S41: apply Non-Maximum Suppression (NMS) with a threshold of 0.6 to the candidate temporal segments from step S3 to obtain higher-quality Region-of-Interest (RoI) candidate segments;
S42: map each RoI onto the feature map $C_{conv5b}$ from step S2 and obtain 512x1x4x4 output features through a 3D RoI pooling operation;
S43: feed the output features first into one large fully connected layer for feature synthesis, then into two fully connected layers for classification and regression respectively: a classification layer classifies the substation personnel behavior category, and a regression layer adjusts the start and end times of the behavior segment.
The abnormal-behavior detection network of step S5 is constructed as follows:
S51: take the specific video clips cut out by the temporal action detection part; picture size and frame rate can be adjusted to the application scene, for example a size of 360x480 with the frame rate fixed at 32 fps;
S52: divide each video clip into unit clips with a fixed length of 1 frame and cluster them with a K-means frame-clustering algorithm based on spatio-temporal continuity, each cluster representing one complete action; the video is finally divided into 32 groups of segments, each containing a single complete action;
S53: extract features of the video segments with the 3D convolutional feature-extraction network built in step S2, take the $C_{conv5b}$ features of every 16 frames, and add a fully connected layer to obtain 4096-dimensional features;
S54: feed the extracted features into a Multi-Layer Perceptron (MLP) of 3 consecutive fully connected layers that scores each segment; the score of the segment with the largest anomaly score in the video is taken as the video's anomaly score, giving the final predicted anomaly value.
In step S54, the first fully connected layer of the MLP has 512 neurons and is activated with a Rectified Linear Unit (ReLU); the second has 32 neurons and the third has 1 neuron, activated with a Sigmoid function.
The training and operation of the temporal action detection model in step S6 proceeds as follows:
S61: in transfer learning, first jointly train the network modules corresponding to steps S2-S4 on the THUMOS2014 dataset; then, fixing the parameters of the first four layers of the feature-extraction network of step S2, train on the dataset constructed in step S1 to obtain the parameters of the remaining network structure; this design is adopted because the substation personnel-behavior video dataset obtained in S1 is not large, and transfer learning improves the detection accuracy and generalization of the model; THUMOS2014 is an open-source dataset for action recognition and temporal action detection;
S62: train on the RoIs obtained in step S4 with a 1:3 ratio of positive to negative samples; specifically, RoIs whose IoU with the ground truth exceeds 0.5 are positive samples, and RoIs below 0.5 are negative samples;
S63: for the temporal-proposal extraction network of step S3 and the temporal behavior classification network of step S4, optimize the classification and regression tasks simultaneously. The classification task $L_{cls}$ uses a softmax loss and the regression task $L_{reg}$ a smooth-L1 loss:

$$L(\{\alpha_i\},\{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(\alpha_i, \alpha_i^*) + \lambda \frac{1}{N_{reg}} \sum_i \alpha_i^* L_{reg}(t_i, t_i^*)$$

where $N_{cls}$ and $N_{reg}$ represent the number of samples selected in one training pass and the number of candidate temporal segments used for regression, $\lambda$ is a loss-balance parameter set to 1, $i$ indexes candidate temporal segments within a batch, $\alpha_i$ is the predicted likelihood that a candidate segment contains human behavior, $\alpha_i^*$ is the ground truth, $t_i = \{\delta c_i, \delta l_i\}$ is the relative offset of the predicted temporal segment to the candidate segment, and $t_i^* = \{\delta c_i^*, \delta l_i^*\}$ is the coordinate transformation between the ground truth and the candidate segment, computed as:

$$\delta c_i = (c_i^* - c_i)/l_i, \qquad \delta l_i = \log(l_i^*/l_i)$$

In the temporal-proposal extraction sub-network of step S3, $L_{cls}$ predicts whether a candidate temporal segment contains personnel behavior, regardless of the specific behavior class, and $L_{reg}$ optimizes the relative displacement between the candidate segment and the ground truth; in the temporal behavior classification sub-network of step S4, $L_{cls}$ predicts the specific personnel-behavior category of the RoI, and $L_{reg}$ optimizes the relative displacement between the RoI and the ground truth; the four losses of the two sub-networks are optimized jointly;
S64: using the temporal action detection network modules built in steps S2-S4 with the model parameters obtained from training in S61-S63, process the long substation surveillance video so as to cut out the coarsely classified behavior-category video clips.
The training and operation of the abnormal-behavior detection model in step S7 proceeds as follows:
S71: use the K-means frame-clustering algorithm based on spatio-temporal continuity to divide each video clip into 32 groups of segments, each containing a single complete action: first split the clip into a dataset of unit frames and randomly select 32 video frames from it as centroids; compute the Euclidean similarity distance between each frame and the centroids immediately before and after it in temporal position, and assign the frame to the set of the nearer centroid; once all frames are grouped, recompute the centroid of each group, and repeat until every newly computed centroid lies within a temporal distance of 8 frames of the original one;
S72: with the video-segmentation algorithm of step S71, each training video is divided into 32 segments containing a single complete action; the segments are the instances in MIL and each video is a bag: during training, 10 positive bags (abnormal-behavior videos) and 10 negative bags (normal-behavior videos) are randomly selected as a mini-batch;
S73: extract the spatio-temporal features of each instance segment with the 3D convolutional feature-extraction network built in step S2 and apply a fully connected operation to obtain 4096-dimensional features, used as the feature maps required by the subsequent multiple-instance learning;
S74: score each segment with the MLP built in step S54, then take the segment with the largest anomaly score in a positive bag as a potential abnormal sample and the segment with the largest anomaly score in a negative bag as a non-abnormal sample, and train the MLP's parameters on these two samples with the objective:

$$\max_{i \in \beta_a} f(v_a^i) > \max_{i \in \beta_n} f(v_n^i)$$

where $\beta_a$ denotes a positive bag and $v_a$ an abnormal sample, $\beta_n$ denotes a negative bag and $v_n$ a non-abnormal sample, and $f$ is the model's prediction function;
a hinge loss is adopted to enlarge the score gap between positive and negative instances, the training effect being that the model outputs high scores for abnormal samples and low scores for non-abnormal ones; the hinge loss is:

$$l(\beta_a, \beta_n) = \max\Bigl(0,\ 1 - \max_{i \in \beta_a} f(v_a^i) + \max_{i \in \beta_n} f(v_n^i)\Bigr)$$

In a real substation scene, abnormal behavior usually occupies only a very short span of time, i.e. the proportion of positive samples (abnormal behaviors) within a positive bag is very low, so the scores within a positive bag should be sparse and a sparsity constraint is added; meanwhile, considering the temporal structure of the video, since the segments are continuous the anomaly scores of adjacent segments should also vary smoothly, so a temporal-smoothness constraint is added and the loss becomes:

$$l(\beta_a, \beta_n) = \max\Bigl(0,\ 1 - \max_{i \in \beta_a} f(v_a^i) + \max_{i \in \beta_n} f(v_n^i)\Bigr) + \lambda_1 \sum_i \bigl(f(v_a^i) - f(v_a^{i+1})\bigr)^2 + \lambda_2 \sum_i f(v_a^i)$$

To prevent overfitting, an $\ell_2$ regularizer is finally added, giving the final loss:

$$L(w) = l(\beta_a, \beta_n) + \|w\|_F$$

where $w$ denotes the model weights;
S75: using the abnormal-behavior detection network built in step S5 with the model parameters obtained from training in S72-S74, examine the behavior-category video clips obtained in step S64 to identify whether a specific behavior clip contains abnormal behavior, together with the category and precise temporal position of the abnormality.
Application example 1
The method of the embodiment is applied to identify whether personnel in a power-transformation scene wear safety helmets, as shown in FIG. 2.
The long video is processed by temporal action detection to obtain action-category video clips, and the personnel action is classified as monitoring data; the clips are then checked by abnormal-behavior detection, and the personnel behavior is judged abnormal, the abnormality being failure to wear a safety helmet.
Application example 2
The method of the embodiment is applied to identify whether personnel in a power-transformation scene wear safety helmets, as shown in FIG. 3.
The long video is processed by temporal action detection to obtain action-category video clips, and the personnel action is classified as monitoring data; the clips are then checked by abnormal-behavior detection, and the personnel behavior is judged normal, the normal behavior being wearing a safety helmet.

Claims (3)

1. A transformer substation personnel behavior identification method based on monitoring video time sequence action positioning and abnormity detection is characterized by comprising the following steps:
s1: the method comprises the steps of automatically acquiring, processing and constructing a monitoring video data set for abnormal behavior of personnel in the transformer substation by using priori knowledge;
s2: constructing a 3D convolution feature extraction network: performing feature extraction on the input un-segmented substation monitoring long video, and extracting feature information of a monitoring video sequence;
s3: constructing a time sequence candidate area extraction network: the method comprises the steps of extracting candidate time sequence segments which may have abnormal behaviors of substation personnel;
s4: constructing a time sequence behavior classification network: classifying and regressing the extracted substation personnel behavior video bands;
s5: constructing an abnormal behavior detection network: detecting abnormal behaviors of the behavior category video clips obtained by the time sequence behavior classification network in the step S4;
s6: performing end-to-end joint training on a structure consisting of S2-S4 by adopting transfer learning, and processing and monitoring long videos by using a model obtained by training so as to intercept simply classified behavior category video clips;
s7: designing a frame clustering algorithm based on space-time continuity to segment a video, and evaluating the content in a video clip by adopting an abnormal behavior detection network based on multi-instance learning, so as to identify whether abnormal behaviors exist in the video clip and determine the type and the accurate position of the abnormal behaviors;
the method for constructing the time sequence candidate area extraction network in step S3 includes:
s31: obtained at S2C conv5b Taking the feature map as an input to generate a candidate time sequence;
s32: the time-sequential field is extended by a 3X3 3D convolution filter, and then the size is
Figure 42491DEST_PATH_IMAGE001
Down-sampling it down to the spatial dimension
Figure 710232DEST_PATH_IMAGE002
Obtaining a characteristic diagram of the obtained time position characteristics
Figure 113532DEST_PATH_IMAGE003
(ii) a A512-dimensional feature vector at each timing position is used to predict the relative offset of the center positionδc i ,δl i Great and the length of each Anchorc i ,l i },i∈{l,...,K };
S33: in the feature diagramC tl Adding two convolutions of 1x1x1, and predicting the confidence score of the candidate time sequence segment as the background or the existing behavior activity;
the method for constructing the time-series behavior classification network in step S4 includes:
s41: performing non-maximum suppression operation on the candidate time sequence segment obtained in the step S3 by using 0.6 as a threshold value to obtain a candidate time sequence segment;
s42: mapping the RoI to that obtained in step S2C conv5b In the profile, the output of 512x1x4x4 is obtained by 3D RoI poolingCharacteristic;
s43: the output characteristics are firstly sent to a large full-connection layer for characteristic synthesis, and then the two full-connection layers are respectively classified and regressed: classifying the behavior categories of the substation personnel through a classification layer, and adjusting the starting time and the ending time of the behavior segments through a regression layer;
the method for constructing the abnormal behavior detection network in the step S5 includes:
s51: adjusting the size and frame rate of a specific video segment picture intercepted by the time sequence action detection part;
s52: dividing each video clip into a group of unit video clips with fixed length of 1 frame, clustering the unit video clips by a K-means frame clustering algorithm based on space-time continuity, wherein each clustering result represents a complete action;
s53: extracting the characteristics of the video clips by using the 3D convolution characteristic extraction network constructed in the step S2, and extracting each 16 framesC conv5b Adding a full connection layer to obtain 4096-dimensional characteristics;
s54: inputting the extracted features into a multilayer perceptron consisting of 3 continuous full-connected layers to score each segment, taking the score of the segment with the maximum abnormal score in the video as the abnormal score, and obtaining the final predicted abnormal value;
The method for constructing the data set in step S1 includes:
S11: acquiring surveillance videos of substation personnel behaviors under different shooting angles and background environments;
S12: performing temporal behavior annotation on the typical behavior categories to construct a video data set for the temporal action detection task, where the temporal action detection task performs temporal localization and action-category classification of personnel actions in substation surveillance long videos;
S13: annotating the videos by marking whether each video segment contains abnormal behavior, so as to construct an abnormal behavior recognition video data set, where the abnormal behavior recognition task detects abnormal behaviors in the classified video segments produced by the segmentation of the temporal action detection module;
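As a purely hypothetical illustration of the two label types produced in S12 and S13 (every file name, field name, and action name below is invented for the example, not taken from the patent):

```python
# S12: temporal action annotation for one long surveillance video.
temporal_action_label = {
    "video": "substation_cam03.mp4",  # hypothetical file
    "segments": [
        {"action": "operating_switch", "start_sec": 42.0, "end_sec": 57.5},
        {"action": "climbing_ladder", "start_sec": 101.0, "end_sec": 118.0},
    ],
}

# S13: weak, video-level anomaly label for one intercepted segment,
# as used by the multiple-instance learning of the later steps.
anomaly_label = {
    "clip": "substation_cam03_seg07.mp4",
    "is_abnormal": True,
}
```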
The method for constructing the 3D convolution feature extraction network in step S2 includes:
S21: improving the C3D feature extraction network by replacing the standard convolution operations with depthwise separable convolutions;
S22: extracting features from the substation surveillance video frame sequence with the improved C3D feature extraction network: an input frame sequence of size 3×L×H×W is processed by the feature extraction network composed of multiple depthwise separable 3D convolution layers to obtain the feature map C_conv5b of size 512×(L/8)×(H/16)×(W/16);
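Assuming "depthwise separable convolution" in S21 means the usual depthwise-plus-pointwise factorization carried over to 3D, a minimal sketch of the replacement block is:

```python
import torch.nn as nn

class DepthwiseSeparableConv3d(nn.Module):
    """Drop-in replacement for a standard 3D convolution (step S21): a
    per-channel (groups=in_channels) spatio-temporal convolution followed by
    a 1x1x1 pointwise convolution that mixes channels."""

    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.depthwise = nn.Conv3d(in_channels, in_channels, kernel_size,
                                   stride=stride, padding=padding,
                                   groups=in_channels)
        self.pointwise = nn.Conv3d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):  # x: (N, C_in, L, H, W)
        return self.pointwise(self.depthwise(x))

# E.g., the first C3D layer could become DepthwiseSeparableConv3d(3, 64).
```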
The training and running process of the temporal action detection model in step S6 specifically includes:
S61: in the transfer learning stage, first jointly training the network modules of steps S2-S4 on the THUMOS 2014 data set; then, with the parameters of the first four layers of the feature extraction network of step S2 held fixed, training on the data set constructed in step S1 to obtain the parameters of the remaining network structure (a minimal sketch of this freezing scheme follows step S64);
S62: sampling the RoIs obtained in step S4 for training with a positive-to-negative sample ratio of 1:3;
S63: for the temporal candidate region extraction network of step S3 and the temporal behavior classification network of step S4, optimizing the classification task and the regression task simultaneously: the classification task L_cls uses the softmax loss and the regression task L_reg uses the smooth L1 loss:

$$L = \frac{1}{N_{cls}}\sum_{i} L_{cls}(\alpha_i, \alpha_i^{*}) + \lambda\,\frac{1}{N_{reg}}\sum_{i} \alpha_i^{*}\, L_{reg}(t_i, t_i^{*})$$

where N_cls and N_reg denote the number of samples selected for one training pass and the number of candidate temporal segments used for regression, λ is a loss-balancing parameter, i is the index of a candidate temporal segment within a batch, α_i is the predicted probability that candidate temporal segment i contains human behavior, α_i^* is its ground truth label, t_i = {δĉ_i, δl̂_i} is the predicted relative offset of the temporal segment to the candidate segment, and t_i^* = {δc_i, δl_i} is the coordinate transformation between the ground truth and the candidate temporal segment, computed as:

$$\delta c_i = \frac{c_i^{*} - c_i}{l_i}, \qquad \delta l_i = \log\!\left(\frac{l_i^{*}}{l_i}\right)$$

where c_i and l_i are the center position and length of candidate segment i, and c_i^*, l_i^* those of its matched ground-truth segment;
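A sketch of the S63 coordinate transform and joint loss as reconstructed above; the mean reductions of F.cross_entropy and F.smooth_l1_loss stand in for the 1/N_cls and 1/N_reg factors, which is a simplification:

```python
import torch
import torch.nn.functional as F

def encode_offsets(c, l, c_star, l_star):
    """S63 coordinate transform: offsets {delta_c, delta_l} of a ground-truth
    segment (center c*, length l*) relative to a candidate segment (c, l)."""
    delta_c = (c_star - c) / l
    delta_l = torch.log(l_star / l)
    return delta_c, delta_l

def joint_loss(cls_logits, labels, reg_pred, reg_target, lam=1.0):
    """Softmax (cross-entropy) classification loss plus smooth-L1 regression
    loss, the regression term counted only for positive samples."""
    l_cls = F.cross_entropy(cls_logits, labels)
    pos = labels > 0  # alpha_i* = 1 samples
    l_reg = F.smooth_l1_loss(reg_pred[pos], reg_target[pos]) if pos.any() else 0.0
    return l_cls + lam * l_reg
```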
S64: based on the temporal action detection network modules constructed in steps S2-S4, processing the substation surveillance long videos with the model parameters obtained from the training of S61-S63, so as to intercept the coarsely classified behavior-category video segments.
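A minimal sketch of the S61 freezing scheme referenced above; the five-block stand-in backbone and its layer names are illustrative, not the patent's identifiers:

```python
import torch
import torch.nn as nn

# Stand-in for the improved C3D feature extractor of step S2.
backbone = nn.Sequential()
for name, (cin, cout) in zip(["conv1", "conv2", "conv3", "conv4", "conv5"],
                             [(3, 64), (64, 128), (128, 256), (256, 512), (512, 512)]):
    backbone.add_module(name, nn.Conv3d(cin, cout, kernel_size=3, padding=1))

# After joint pre-training on THUMOS 2014, fix the first four layers...
for name in ["conv1", "conv2", "conv3", "conv4"]:
    for p in getattr(backbone, name).parameters():
        p.requires_grad = False

# ...and fine-tune only the remaining parameters on the step-S1 data set.
trainable = [p for p in backbone.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-3, momentum=0.9)
```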
2. The substation personnel behavior identification method based on monitoring video time sequence action positioning and abnormity detection according to claim 1, wherein in step S53 the first fully connected layer of the MLP has 512 neurons activated by the rectified linear unit (ReLU); the second fully connected layer has 32 neurons, and the third fully connected layer has 1 neuron activated by the Sigmoid function.
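The claim-2 MLP written out as a sketch; the layer widths and the ReLU/Sigmoid activations come from the claim, while the absence of an activation after the 32-neuron layer is an assumption:

```python
import torch.nn as nn

# 4096-d segment feature -> scalar anomaly score in [0, 1].
scoring_mlp = nn.Sequential(
    nn.Linear(4096, 512),
    nn.ReLU(),
    nn.Linear(512, 32),  # no activation stated in the claim for this layer
    nn.Linear(32, 1),
    nn.Sigmoid(),
)
```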
3. The substation personnel behavior identification method based on monitoring video time sequence action positioning and abnormity detection according to claim 1, wherein the process of training and running the abnormal behavior detection network in step S7 includes:
S71: dividing each video into 32 groups of video segments, each containing a single complete action, using the K-means frame clustering algorithm based on spatio-temporal continuity;
S72: with the video segmentation algorithm of step S71, each training video is divided into 32 segments containing a single complete action; the segments are the instances in multiple-instance learning (MIL) and each video is a bag; during training, 10 positive bags and 10 negative bags are randomly selected as a mini-batch;
S73: extracting the spatio-temporal features of each instance segment with the 3D convolution feature extraction network constructed in step S2, followed by a fully connected operation to obtain 4096-dimensional features as the feature maps required by the subsequent multiple-instance learning;
S74: scoring each segment with the MLP constructed in step S54, then selecting the segment with the largest abnormal score in a positive bag as a potential abnormal sample and the segment with the largest abnormal score in a negative bag as a non-abnormal sample, and training the parameters of the MLP with these two samples, with the objective:
$$\max_{i \in \beta_a} f(v_a^{i}) \;>\; \max_{i \in \beta_n} f(v_n^{i})$$

where β_a denotes a positive bag and v_a its segments (abnormal samples), β_n denotes a negative bag and v_n its segments (non-abnormal samples), and f is the model's prediction function;
a Hinge-loss function is adopted to enlarge the score gap between positive and negative bags, so that after training the model outputs high scores for abnormal samples and low scores for non-abnormal samples; the Hinge loss is:

$$l(\beta_a, \beta_n) = \max\!\left(0,\; 1 - \max_{i \in \beta_a} f(v_a^{i}) + \max_{i \in \beta_n} f(v_n^{i})\right)$$
adding a temporal smoothness constraint, the loss function becomes:

$$l(\beta_a, \beta_n) = \max\!\left(0,\; 1 - \max_{i \in \beta_a} f(v_a^{i}) + \max_{i \in \beta_n} f(v_n^{i})\right) + \lambda_1 \sum_{i=1}^{n-1}\left(f(v_a^{i}) - f(v_a^{i+1})\right)^{2}$$

where the added term penalizes abrupt score changes between temporally adjacent segments of a positive bag;
to prevent model overfitting, an ℓ2 regularization term is finally added, giving the final loss function:

$$L(w) = l(\beta_a, \beta_n) + \|w\|_F$$

where w denotes the model weights;
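A sketch of the full S74 objective as reconstructed above: a hinge ranking between the top-scoring segments of one positive and one negative bag, plus the temporal smoothness term and the weight penalty; both λ values are assumptions, as the claims give none:

```python
import torch

def mil_ranking_loss(scores_pos, scores_neg, weights=None,
                     lambda_smooth=8e-5, lambda_l2=1e-3):
    """scores_pos / scores_neg: (32,) anomaly scores of the segments of one
    positive / one negative bag; weights: iterable of model parameter tensors
    for the ||w||_F term."""
    # Hinge ranking between the top-scoring segments of the two bags.
    hinge = torch.clamp(1.0 - scores_pos.max() + scores_neg.max(), min=0.0)
    # Temporal smoothness over adjacent segments of the positive bag.
    smooth = ((scores_pos[1:] - scores_pos[:-1]) ** 2).sum()
    loss = hinge + lambda_smooth * smooth
    if weights is not None:
        loss = loss + lambda_l2 * sum(w.norm() for w in weights)
    return loss
```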
S75: based on the abnormal behavior detection network constructed in step S5, detecting the behavior-category video segments obtained in step S64 with the model parameters obtained from the training of steps S72-S74, so as to identify whether a specific behavior segment contains abnormal behavior, together with the category and the precise temporal position of that abnormal behavior.
CN202010103140.7A 2020-02-19 2020-02-19 Substation personnel behavior identification method based on monitoring video time sequence action positioning and abnormity detection Active CN111291699B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010103140.7A CN111291699B (en) 2020-02-19 2020-02-19 Substation personnel behavior identification method based on monitoring video time sequence action positioning and abnormity detection

Publications (2)

Publication Number Publication Date
CN111291699A CN111291699A (en) 2020-06-16
CN111291699B true CN111291699B (en) 2022-06-03

Family

ID=71024617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010103140.7A Active CN111291699B (en) 2020-02-19 2020-02-19 Substation personnel behavior identification method based on monitoring video time sequence action positioning and abnormity detection

Country Status (1)

Country Link
CN (1) CN111291699B (en)

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111985333B (en) * 2020-07-20 2023-01-17 中国科学院信息工程研究所 Behavior detection method based on graph structure information interaction enhancement and electronic device
CN111626273B (en) * 2020-07-29 2020-12-22 成都睿沿科技有限公司 Fall behavior recognition system and method based on atomic action time sequence characteristics
CN111914778B (en) * 2020-08-07 2023-12-26 重庆大学 Video behavior positioning method based on weak supervision learning
CN111652201B (en) * 2020-08-10 2020-10-27 中国人民解放军国防科技大学 Video data abnormity identification method and device based on depth video event completion
CN111709411B (en) * 2020-08-20 2020-11-10 深兰人工智能芯片研究院(江苏)有限公司 Video anomaly detection method and device based on semi-supervised learning
CN112307885A (en) * 2020-08-21 2021-02-02 北京沃东天骏信息技术有限公司 Model construction and training method and device, and time sequence action positioning method and device
CN112487913A (en) * 2020-11-24 2021-03-12 北京市地铁运营有限公司运营四分公司 Labeling method and device based on neural network and electronic equipment
CN112434615A (en) * 2020-11-26 2021-03-02 天津大学 Time sequence action detection method based on Tensorflow deep learning framework
CN112487967A (en) * 2020-11-30 2021-03-12 电子科技大学 Scenic spot painting behavior identification method based on three-dimensional convolution network
CN112737121A (en) * 2020-12-28 2021-04-30 内蒙古电力(集团)有限责任公司包头供电局 Intelligent video monitoring, analyzing, controlling and managing system for power grid
CN113297972B (en) * 2021-05-25 2022-03-22 国网湖北省电力有限公司检修公司 Transformer substation equipment defect intelligent analysis method based on data fusion deep learning
CN113159003A (en) * 2021-05-27 2021-07-23 中国银行股份有限公司 Bank branch abnormity monitoring method and device
CN113392770A (en) * 2021-06-16 2021-09-14 国网浙江省电力有限公司电力科学研究院 Typical violation behavior detection method and system for transformer substation operating personnel
CN113421236B (en) * 2021-06-17 2024-02-09 同济大学 Deep learning-based prediction method for apparent development condition of water leakage of building wall surface
CN113516058B (en) * 2021-06-18 2024-05-24 北京工业大学 Live video group abnormal activity detection method and device, electronic equipment and medium
CN113627386A (en) * 2021-08-30 2021-11-09 山东新一代信息产业技术研究院有限公司 Visual video abnormity detection method
CN114092851A (en) * 2021-10-12 2022-02-25 甘肃欧美亚信息科技有限公司 Monitoring video abnormal event detection method based on time sequence action detection
CN113992894A (en) * 2021-10-27 2022-01-28 甘肃风尚电子科技信息有限公司 Abnormal event identification system based on monitoring video time sequence action positioning and abnormal detection
CN114283492B (en) * 2021-10-28 2024-04-26 平安银行股份有限公司 Staff behavior-based work saturation analysis method, device, equipment and medium
CN114120180B (en) * 2021-11-12 2023-07-21 北京百度网讯科技有限公司 Time sequence nomination generation method, device, equipment and medium
CN114565968A (en) * 2021-11-29 2022-05-31 杭州好学童科技有限公司 Learning environment action and behavior identification method based on learning table
CN116453204B (en) * 2022-01-05 2024-08-13 腾讯科技(深圳)有限公司 Action recognition method and device, storage medium and electronic equipment
CN114429676B (en) * 2022-01-27 2023-07-25 山东纬横数据科技有限公司 Personnel identity and behavior recognition system for disinfection supply room of medical institution
CN114612868A (en) * 2022-02-25 2022-06-10 广东创亿源智能科技有限公司 Training method, training device and detection method of vehicle track detection model
CN114676739B (en) * 2022-05-30 2022-08-19 南京邮电大学 Method for detecting and identifying time sequence action of wireless signal based on fast-RCNN
CN115080748B (en) * 2022-08-16 2022-11-11 之江实验室 Weak supervision text classification method and device based on learning with noise label
CN115424347A (en) * 2022-09-02 2022-12-02 重庆邮电大学 Intelligent identification method for worker work content of barber shop
CN115690658B (en) * 2022-11-04 2023-08-08 四川大学 Priori knowledge-fused semi-supervised video abnormal behavior detection method
CN116313018B (en) * 2023-05-18 2023-09-15 北京大学第三医院(北京大学第三临床医学院) Emergency system and method for skiing field and near-field hospital
CN117710832A (en) * 2024-01-04 2024-03-15 广州智寻科技有限公司 Intelligent identification method for power grid satellite, unmanned aerial vehicle and video monitoring image

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10834436B2 (en) * 2015-05-27 2020-11-10 Arris Enterprises Llc Video classification using user behavior from a network digital video recorder
CN107506740B (en) * 2017-09-04 2020-03-17 北京航空航天大学 Human body behavior identification method based on three-dimensional convolutional neural network and transfer learning model
CN108399380A (en) * 2018-02-12 2018-08-14 北京工业大学 A kind of video actions detection method based on Three dimensional convolution and Faster RCNN
CN108734095B (en) * 2018-04-10 2022-05-20 南京航空航天大学 Motion detection method based on 3D convolutional neural network
CN110084151B (en) * 2019-04-10 2023-02-28 东南大学 Video abnormal behavior discrimination method based on non-local network deep learning
CN110263728B (en) * 2019-06-24 2022-08-19 南京邮电大学 Abnormal behavior detection method based on improved pseudo-three-dimensional residual error neural network



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant