CN114694080A - Detection method, system and device for monitoring violent behavior and readable storage medium - Google Patents

Publication number
CN114694080A
Authority
CN
China
Prior art keywords
convolution
violent behavior
dimensional
network
monitoring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210415750.XA
Other languages
Chinese (zh)
Inventor
徐映千
何欣楠
唐雪萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-04-20
Filing date
2022-04-20
Publication date
2022-07-01
Application filed by Hohai University (HHU)
Priority to CN202210415750.XA
Publication of CN114694080A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a detection method, system and device for monitoring violent behavior and a readable storage medium, belonging to the technical field of computer vision. The method comprises the following steps: step 1, constructing a violent behavior video data set; step 2, constructing a three-dimensional convolutional neural network and extracting features from the violent behavior video data; step 3, classifying the feature data by using a multilayer perceptron. The method, system, device and readable storage medium replace time-consuming and labor-intensive manual detection: a dense connection network combined with 3D convolution extracts features from the violent behavior video data, and a multilayer perceptron classifies the features extracted by the network. The 2D convolution kernels in the dense connection network are replaced with 3D convolution kernels so that the convolutional neural network can extract video features. Because violent behavior videos are classified by the algorithm and the video captured by the camera is analyzed by computer, labor is saved and violent events can be prevented.

Description

Method, system and device for detecting monitoring violent behavior and readable storage medium
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a detection method, a system and a device for monitoring violent behaviors and a readable storage medium.
Background
With the development of neural network algorithms and the improvement of computer performance, neural network algorithms have been widely applied in many fields. Surveillance cameras now reach every corner of the city, which deters aggressive and provocative behavior and helps maintain public safety. However, the volume of data produced by existing surveillance cameras is so large that detecting violent behavior quickly and in time cannot be accomplished manually, so a method needs to be developed to solve this problem.
Disclosure of Invention
The invention aims to provide a method for detecting violent behavior in surveillance camera video, so as to solve the problem of low efficiency in detecting violent behavior in surveillance video.
In order to achieve this purpose, the invention provides the following technical solution: a detection method for monitoring violent behaviors, comprising the following steps:
step 1, constructing a violent behavior video data set;
step 2, constructing a three-dimensional convolutional neural network, and extracting violent behavior video data features;
step 3, classifying the feature data by using a multilayer perceptron.
Preferably, the three-dimensional convolutional neural network is constructed by combining dense connection with 3D convolution.
Preferably, the construction of the dense connection and the 3D convolution comprises the following steps:
step 21, connecting each convolution layer in the network to construct a dense connection network; modifying the convolution kernels of the dense connection network by replacing all 2D convolution kernels in the dense connection network with 3D convolution kernels;
step 22, training the neural network on the data set constructed in step 1, so that the three-dimensional convolutional neural network can extract image features.
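Dense connection means that every layer receives the feature maps of all earlier layers in the same module. The following is a minimal sketch in plain Python that uses scalars as stand-ins for feature maps. The function name `dense_block` and the toy layer are hypothetical; the sketch only illustrates the connectivity pattern, not the 3D convolutions themselves.

```python
def dense_block(initial_features, layers):
    """Run a densely connected block.

    Each layer reads the concatenation of all earlier feature maps and
    appends its own output, so later layers see every earlier result.
    """
    features = list(initial_features)
    for layer in layers:
        new_maps = layer(features)   # the layer sees every earlier map
        features.extend(new_maps)    # concatenate along the "channel" axis
    return features


def toy_layer(maps):
    # Hypothetical stand-in for a convolution: one output "map" that sums its inputs.
    return [sum(maps)]


result = dense_block([1.0, 2.0], [toy_layer, toy_layer, toy_layer])
print(result)  # [1.0, 2.0, 3.0, 6.0, 12.0]
```

The output grows by one entry per layer, and each new entry depends on all earlier ones, which is the property the dense connection network relies on to fuse features from different depths.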
The method for replacing all 2D convolution kernels in the dense connection network with 3D convolution kernels is as follows: the 2D convolution kernel is generalized to three dimensions by using a 3D convolution kernel, and the value $v_{ij}^{xyz}$ at coordinate (x, y, z) in the j-th Feature Map of the i-th 3D convolution layer is given by Formula 1:
$$v_{ij}^{xyz} = g\left(b_{ij} + \sum_{m}\sum_{p=0}^{P_i-1}\sum_{q=0}^{Q_i-1}\sum_{r=0}^{R_i-1} \omega_{ijm}^{pqr}\, v_{(i-1)m}^{(x+p)(y+q)(z+r)}\right) \quad (1)$$
where x, y, z are the input sampling points; $\omega_{ijm}^{pqr}$ is the value at coordinate (p, q, r) of the current 3D convolution kernel connected to the m-th Feature Map of the previous layer; g(·) is the activation function; $b_{ij}$ is the bias; m runs over the set of Feature Map indices of layer (i-1); $P_i$ and $Q_i$ are the length and width of the convolution kernel; $R_i$ is the size of the convolution kernel in the temporal direction; p, q and r are the sampling points obtained from the input sampling points x, y and z according to the definition of convolution; and ω is the weight of the sampling points.
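The 3D convolution described above can be read as a nested sum: for every Feature Map of the previous layer, multiply each kernel weight by the input value at the shifted position, add the bias, and apply the activation function. The sketch below computes a single output value in plain Python. The function name `conv3d_value`, the nested-list layout and the default `tanh` activation are assumptions made for illustration; the source does not specify which activation g(·) is used.

```python
import math


def conv3d_value(prev_maps, kernels, bias, x, y, z, activation=math.tanh):
    """Value at (x, y, z) of one output Feature Map of a 3D convolution layer.

    prev_maps[m][x][y][z] is the m-th Feature Map of layer i-1.
    kernels[m][p][q][r] holds the kernel weights connected to that map.
    """
    total = bias
    for fmap, kernel in zip(prev_maps, kernels):          # sum over m
        for p, plane in enumerate(kernel):                # kernel length
            for q, row in enumerate(plane):               # kernel width
                for r, weight in enumerate(row):          # temporal extent
                    total += weight * fmap[x + p][y + q][z + r]
    return activation(total)


# Toy input: one 3 x 3 x 3 map of ones and one 2 x 2 x 2 kernel of 0.5.
ones = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
kernel = [[[0.5] * 2 for _ in range(2)] for _ in range(2)]
value = conv3d_value([ones], [kernel], bias=0.0, x=0, y=0, z=0,
                     activation=lambda v: v)  # identity, to show the raw sum
print(value)  # 4.0
```

With the identity activation the result is simply the eight kernel weights times the input value, which makes the sum easy to check by hand.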
Preferably, constructing the violent behavior video data set comprises collecting online data sets, retrieving surveillance videos, and clipping the violent behavior segments out of the collected videos.
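Clipping violent segments out of collected footage can be pictured as cutting each video into fixed-length clips and labelling each clip. The sketch below assumes, none of which is specified in the source, that violent segments are annotated as half-open frame ranges, that clips are 16 frames long, and that a clip counts as violent if it overlaps any annotated segment. `cut_clips` is a hypothetical helper.

```python
def cut_clips(num_frames, violent_segments, clip_len=16):
    """Split a video into non-overlapping clips with a binary label.

    violent_segments is a list of (start, end) frame ranges, end exclusive.
    Returns (start, end, label) tuples; label 1 means the clip overlaps
    at least one violent segment, 0 means it does not.
    """
    clips = []
    for start in range(0, num_frames - clip_len + 1, clip_len):
        end = start + clip_len
        overlaps = any(start < seg_end and seg_start < end
                       for seg_start, seg_end in violent_segments)
        clips.append((start, end, int(overlaps)))
    return clips


print(cut_clips(48, [(20, 30)]))
# [(0, 16, 0), (16, 32, 1), (32, 48, 0)]
```

Trailing frames that do not fill a whole clip are dropped in this sketch; padding them instead would be an equally reasonable choice.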
Preferably, in step 3, the multilayer perceptron is trained on the output of the trained network so that it can perform classification, and the classification performed by the multilayer perceptron is binary classification.
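As an illustration of binary classification with a multilayer perceptron, the sketch below runs the forward pass of a tiny perceptron on a feature vector. The single hidden layer, the ReLU and sigmoid activations, the 0.5 threshold and the hand-set weights are all assumptions chosen for the example; the source does not describe the perceptron's internal structure.

```python
import math


def mlp_predict(features, hidden_weights, hidden_biases,
                out_weights, out_bias, threshold=0.5):
    """Forward pass of a one-hidden-layer perceptron for binary classification.

    Returns (probability of the positive class, predicted label as bool).
    """
    # Hidden layer: weighted sum followed by ReLU.
    hidden = [max(0.0, sum(w * f for w, f in zip(row, features)) + b)
              for row, b in zip(hidden_weights, hidden_biases)]
    # Output layer: weighted sum followed by a sigmoid.
    logit = sum(w * h for w, h in zip(out_weights, hidden)) + out_bias
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob, prob >= threshold


# Hypothetical weights, for illustration only.
W_HIDDEN = [[1.0, -1.0], [-1.0, 1.0]]
B_HIDDEN = [0.0, 0.0]
W_OUT = [2.0, -2.0]
B_OUT = 0.0

prob, is_violent = mlp_predict([1.0, 0.0], W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
print(round(prob, 3), is_violent)
```

In practice the weights would be learned from the features produced by the trained network rather than set by hand.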
Preferably, the 3D convolution network includes, in order:
dense connection layer 1, formed of 6 groups of 1 × 1 × 1 three-dimensional convolution kernels and 3 × 3 × 3 three-dimensional convolution kernels;
conversion layer 1, formed of a 1 × 1 × 1 three-dimensional convolution kernel and 3 × 3 × 3 three-dimensional average pooling;
dense connection layer 2, formed of 12 groups of 1 × 1 × 1 three-dimensional convolution kernels and 3 × 3 × 3 three-dimensional convolution kernels;
conversion layer 2, formed of a 1 × 1 × 1 three-dimensional convolution kernel and 3 × 3 × 3 three-dimensional average pooling;
dense connection layer 3, formed of 24 groups of 1 × 1 × 1 three-dimensional convolution kernels and 3 × 3 × 3 three-dimensional convolution kernels;
conversion layer 3, formed of 1 × 7 × 7 three-dimensional global maximum pooling;
a fully connected layer.
After conversion layer 3, the activation function is as shown in Formula 2, where n is the dimension of the input data, $e_i$ is the input value of dimension i, and $S_i$ is the output probability of dimension i:
$$S_i = \frac{\exp(e_i)}{\sum_{j=1}^{n} \exp(e_j)} \quad (2)$$
The output of the activation function is sent to the multilayer perceptron for classification.
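One way to read the layer list above is as a small configuration table. The sketch below is a plain-Python illustration, not the patented implementation. It assumes that the counts 6, 12 and 24 mean that many repetitions of a (1 × 1 × 1, 3 × 3 × 3) convolution pair, as in a standard densely connected network; the names `ARCHITECTURE` and `count_conv_layers` are hypothetical.

```python
DENSE_PAIR = ["conv 1x1x1", "conv 3x3x3"]

# (layer name, specification); "repeats" defaults to 1 when absent.
ARCHITECTURE = [
    ("dense_1", {"repeats": 6, "ops": DENSE_PAIR}),
    ("conversion_1", {"ops": ["conv 1x1x1", "avgpool 3x3x3"]}),
    ("dense_2", {"repeats": 12, "ops": DENSE_PAIR}),
    ("conversion_2", {"ops": ["conv 1x1x1", "avgpool 3x3x3"]}),
    ("dense_3", {"repeats": 24, "ops": DENSE_PAIR}),
    ("conversion_3", {"ops": ["global maxpool 1x7x7"]}),
    ("fully_connected", {"ops": ["mlp"]}),
]


def count_conv_layers(architecture):
    """Count 3D convolution operations, honouring per-layer repeat counts."""
    total = 0
    for _name, spec in architecture:
        convs = sum(1 for op in spec["ops"] if op.startswith("conv"))
        total += spec.get("repeats", 1) * convs
    return total


print(count_conv_layers(ARCHITECTURE))  # 86
```

Under this reading the three dense layers contribute 12, 24 and 48 convolutions and the first two conversion layers one each, for 86 in total.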
The invention also provides a system for detecting the monitoring violent behavior, which comprises:
the violent behavior video data set construction module is used for constructing a violent behavior video data set;
the building module of the three-dimensional convolutional neural network is used for building the three-dimensional convolutional neural network;
the data feature extraction module is used for extracting violent behavior video data features;
the data classification module is used for classifying the feature data by using a multilayer perceptron;
the dense connection and 3D convolution construction module is used for connecting each convolution layer in the network to construct a dense connection network; modifying the convolution kernels of the dense connection network by replacing all 2D convolution kernels in the dense connection network with 3D convolution kernels; and training the neural network on the Hockey Fight data set and the Violent-Flows data set to extract image features.
The invention also provides a detection device for monitoring violent behaviors, which comprises:
a memory for storing non-transitory computer readable instructions; and
a processor for executing the computer readable instructions such that the computer readable instructions, when executed by the processor, implement the method for monitoring violent behavior.
The present invention further provides a computer-readable storage medium for storing non-transitory computer-readable instructions which, when executed by a computer, cause the computer to perform the method for monitoring violent behavior.
The invention has the technical effects and advantages that: the method, the system, the device and the readable storage medium for monitoring the violent behavior replace a time-consuming and labor-consuming manual detection method, adopt a dense connection network combined with 3D convolution to extract the characteristics of violent behavior video data, and use a multilayer perceptron algorithm to classify the characteristics in the violent behavior video data extracted by the network;
replacing a 2D convolution kernel in the dense connection network by the 3D convolution kernel, so that the convolution neural network has the function of extracting video features;
the violent behavior videos are classified through the algorithm, and the video content shot by the camera is analyzed through the computer, so that the labor can be saved, the labor cost can be reduced, and the violent events can be prevented.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of the construction of the dense connection and 3D convolution of the present invention;
FIG. 3 is a block diagram of the framework flow of the present invention;
FIG. 4 is a diagram of a dense connection layer structure according to the present invention;
FIG. 5 is a schematic diagram of the 3D convolution according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a detection method for monitoring violent behaviors, which comprises the following steps of:
step 1, constructing a violent behavior video data set: in this embodiment, surveillance videos of violent behavior are collected from various sources, such as online data sets and actual surveillance footage, and the violent behavior segments of the collected videos are clipped out;
step 2, constructing an improved CNN (convolutional neural network) by means of dense connection and 3D convolution, and performing feature extraction on the violent behavior video data;
as shown in FIG. 2, extracting the features of the violent behavior video data specifically comprises the following steps:
step 21, constructing a dense connection network as the feature-extraction CNN and modifying its network structure; as shown in FIG. 4, the feature maps within each dense module are connected so that high-level and low-level features are fused, which allows the network model to better extract both the high-level and low-level semantic features of the video;
as shown in FIG. 5, for the 3D convolution, the value at coordinate (x, y, z) in the j-th Feature Map of the i-th 3D convolution layer is calculated as shown in formula 2:
[formula image not reproduced in this text]
where x is the input sampling point, g(·) is the activation function, $b_{ij}$ is the bias, m is the set of Feature Map indices of layer (i-1), $P_i$ and $Q_i$ are the length and width of the convolution kernel, $p_n$ is the sampling point at position n relative to the first sampling point $p_0$, $\Delta p_n$ is the offset, and ω is the sampling point weight;
step 22, training the modified dense connection network on the data set constructed in step 1, so that it can extract image features;
step 3, training the multilayer perceptron on the output of the trained network so that it can perform classification;
in this embodiment, the 3D-CNN is based on a dense connection network and includes, in order:
dense connection layer 1, formed of 6 groups of 1 × 1 × 1 three-dimensional convolution kernels and 3 × 3 × 3 three-dimensional convolution kernels;
conversion layer 1, formed of a 1 × 1 × 1 three-dimensional convolution kernel and 3 × 3 × 3 three-dimensional average pooling;
dense connection layer 2, formed of 12 groups of 1 × 1 × 1 three-dimensional convolution kernels and 3 × 3 × 3 three-dimensional convolution kernels;
conversion layer 2, formed of a 1 × 1 × 1 three-dimensional convolution kernel and 3 × 3 × 3 three-dimensional average pooling;
dense connection layer 3, formed of 24 groups of 1 × 1 × 1 three-dimensional convolution kernels and 3 × 3 × 3 three-dimensional convolution kernels;
conversion layer 3, formed of 1 × 7 × 7 three-dimensional global maximum pooling;
a fully connected layer, that is, the multilayer perceptron.
After conversion layer 3, the activation function is applied:
[formula image not reproduced in this text]
and its output is sent to the multilayer perceptron for classification.
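Assuming the activation function applied after conversion layer 3 is the standard softmax (the Disclosure describes it as mapping n input values to output probabilities), it can be sketched in plain Python as follows. Subtracting the maximum before exponentiating is a common numerical-stability choice and is an assumption here, not something the source states; it does not change the result.

```python
import math


def softmax(values):
    """Map a list of real values to probabilities that sum to 1."""
    shift = max(values)                        # guards against overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


print(softmax([0.0, 0.0]))  # [0.5, 0.5]
```

For a two-class (violent or non-violent) output, the larger of the two probabilities indicates the predicted class.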
The invention also provides a system for detecting the monitoring violent behavior, which comprises:
the violent behavior video data set construction module is used for constructing a violent behavior video data set;
the building module of the three-dimensional convolutional neural network is used for building the three-dimensional convolutional neural network;
the data feature extraction module is used for extracting violent behavior video data features;
the data classification module is used for classifying the feature data by using a multilayer perceptron;
the dense connection and 3D convolution construction module is used for connecting each convolution layer in the network to construct a dense connection network; modifying the convolution kernels of the dense connection network by replacing all 2D convolution kernels in the dense connection network with 3D convolution kernels; and training the neural network on the Hockey Fight data set and the Violent-Flows data set to extract image features.
The invention also provides a detection device for monitoring violent behaviors, which comprises:
a memory for storing non-transitory computer readable instructions; and
a processor for executing the computer readable instructions such that the computer readable instructions, when executed by the processor, implement the method for monitoring violent behavior.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments or portions thereof without departing from the spirit and scope of the invention.

Claims (10)

1. A detection method for monitoring violent behaviors, characterized by comprising the following steps:
step 1, constructing a violent behavior video data set;
step 2, constructing a three-dimensional convolutional neural network, and extracting violent behavior video data features;
step 3, classifying the feature data by using a multilayer perceptron.
2. The detection method for monitoring violent behaviors according to claim 1, characterized in that: the three-dimensional convolutional neural network is constructed by combining dense connection with 3D convolution.
3. The detection method for monitoring violent behaviors according to claim 1, characterized in that: the construction of the dense connection and the 3D convolution comprises the following steps:
step 21, connecting each convolution layer in the network to construct a dense connection network; modifying the convolution kernels of the dense connection network by replacing all 2D convolution kernels in the dense connection network with 3D convolution kernels;
step 22, training the neural network on the data set constructed in step 1, so that the three-dimensional convolutional neural network can extract image features.
4. The detection method for monitoring violent behaviors according to claim 1, characterized in that: the method for replacing all 2D convolution kernels in the dense connection network with 3D convolution kernels is as follows: the 2D convolution kernel is generalized to three dimensions by using a 3D convolution kernel, and the value $v_{ij}^{xyz}$ at coordinate (x, y, z) in the j-th Feature Map of the i-th 3D convolution layer is given by Formula 1:
$$v_{ij}^{xyz} = g\left(b_{ij} + \sum_{m}\sum_{p=0}^{P_i-1}\sum_{q=0}^{Q_i-1}\sum_{r=0}^{R_i-1} \omega_{ijm}^{pqr}\, v_{(i-1)m}^{(x+p)(y+q)(z+r)}\right) \quad (1)$$
where x, y, z are the input sampling points; $\omega_{ijm}^{pqr}$ is the value at coordinate (p, q, r) of the current 3D convolution kernel connected to the m-th Feature Map of the previous layer; g(·) is the activation function; $b_{ij}$ is the bias; m runs over the set of Feature Map indices of layer (i-1); $P_i$ and $Q_i$ are the length and width of the convolution kernel; $R_i$ is the size of the convolution kernel in the temporal direction; p, q and r are the sampling points obtained from the input sampling points x, y and z according to the definition of convolution; and ω is the weight of the sampling points.
5. The detection method for monitoring violent behaviors according to claim 1, characterized in that: constructing the violent behavior video data set comprises collecting online data sets, retrieving surveillance videos, and clipping the violent behavior segments out of the collected videos.
6. The detection method for monitoring violent behaviors according to claim 1, characterized in that: in step 3, the multilayer perceptron is trained on the output of the trained network so that it can perform classification, and the classification performed by the multilayer perceptron is binary classification.
7. The detection method for monitoring violent behaviors according to claim 1, characterized in that: the 3D convolution network includes, in order:
dense connection layer 1, formed of 6 groups of 1 × 1 × 1 three-dimensional convolution kernels and 3 × 3 × 3 three-dimensional convolution kernels;
conversion layer 1, formed of a 1 × 1 × 1 three-dimensional convolution kernel and 3 × 3 × 3 three-dimensional average pooling;
dense connection layer 2, formed of 12 groups of 1 × 1 × 1 three-dimensional convolution kernels and 3 × 3 × 3 three-dimensional convolution kernels;
conversion layer 2, formed of a 1 × 1 × 1 three-dimensional convolution kernel and 3 × 3 × 3 three-dimensional average pooling;
dense connection layer 3, formed of 24 groups of 1 × 1 × 1 three-dimensional convolution kernels and 3 × 3 × 3 three-dimensional convolution kernels;
conversion layer 3, formed of 1 × 7 × 7 three-dimensional global maximum pooling;
a fully connected layer;
wherein, after conversion layer 3, the activation function is as shown in Formula 2, where n is the dimension of the input data, $e_i$ is the input value of dimension i, and $S_i$ is the output probability of dimension i:
$$S_i = \frac{\exp(e_i)}{\sum_{j=1}^{n} \exp(e_j)} \quad (2)$$
and the output of the activation function is sent to the multilayer perceptron for classification.
8. A surveillance violent behavior detection system comprising:
the violent behavior video data set construction module is used for constructing a violent behavior video data set;
the building module of the three-dimensional convolutional neural network is used for building the three-dimensional convolutional neural network;
the data feature extraction module is used for extracting violent behavior video data features;
the data classification module is used for classifying the feature data by using a multilayer perceptron;
the dense connection and 3D convolution constructing module is used for connecting each convolution layer in the network to construct a dense connection network; modifying the convolution kernel of the dense connection network, and replacing all 2D convolution kernels in the dense connection network with the convolution kernel of the 3D convolution; and training the neural network through the constructed data set, so that the three-dimensional convolution neural network can extract picture characteristics.
9. A surveillance violence detection apparatus comprising:
a memory for storing non-transitory computer readable instructions; and
a processor for executing the computer readable instructions such that the computer readable instructions, when executed by the processor, implement the method of monitoring violent behavior detection according to any one of claims 1 to 7.
10. A computer-readable storage medium storing non-transitory computer-readable instructions which, when executed by a computer, cause the computer to perform the method of monitoring violent behavior detection of any of claims 1 to 7.
CN202210415750.XA 2022-04-20 2022-04-20 Detection method, system and device for monitoring violent behavior and readable storage medium Pending CN114694080A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210415750.XA CN114694080A (en) 2022-04-20 2022-04-20 Detection method, system and device for monitoring violent behavior and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210415750.XA CN114694080A (en) 2022-04-20 2022-04-20 Detection method, system and device for monitoring violent behavior and readable storage medium

Publications (1)

Publication Number Publication Date
CN114694080A true CN114694080A (en) 2022-07-01

Family

ID=82143848

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210415750.XA Pending CN114694080A (en) 2022-04-20 2022-04-20 Detection method, system and device for monitoring violent behavior and readable storage medium

Country Status (1)

Country Link
CN (1) CN114694080A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115049969A (en) * 2022-08-15 2022-09-13 山东百盟信息技术有限公司 Poor video detection method for improving YOLOv3 and BiConvLSTM

Similar Documents

Publication Publication Date Title
CN112001339B (en) Pedestrian social distance real-time monitoring method based on YOLO v4
Dong et al. A lightweight vehicles detection network model based on YOLOv5
CN113065558A (en) Lightweight small target detection method combined with attention mechanism
CN111046821B (en) Video behavior recognition method and system and electronic equipment
CN113052029A (en) Abnormal behavior supervision method and device based on action recognition and storage medium
CN110991444B (en) License plate recognition method and device for complex scene
CN113033454B (en) Method for detecting building change in urban video shooting
CN111832484A (en) Loop detection method based on convolution perception hash algorithm
CN110210433B (en) Container number detection and identification method based on deep learning
CN114694185B (en) Cross-modal target re-identification method, device, equipment and medium
CN111753682A (en) Hoisting area dynamic monitoring method based on target detection algorithm
CN109446897B (en) Scene recognition method and device based on image context information
CN116071668A (en) Unmanned aerial vehicle aerial image target detection method based on multi-scale feature fusion
CN111104855B (en) Workflow identification method based on time sequence behavior detection
CN114694080A (en) Detection method, system and device for monitoring violent behavior and readable storage medium
CN109359530B (en) Intelligent video monitoring method and device
Ouyang et al. Aerial target detection based on the improved YOLOv3 algorithm
Sun et al. UAV image detection algorithm based on improved YOLOv5
CN117496384A (en) Unmanned aerial vehicle image object detection method
CN112613496A (en) Pedestrian re-identification method and device, electronic equipment and storage medium
CN112418229A (en) Unmanned ship marine scene image real-time segmentation method based on deep learning
CN111832351A (en) Event detection method and device and computer equipment
CN112487911B (en) Real-time pedestrian detection method and device based on improvement yolov under intelligent monitoring environment
CN115100546A (en) Mobile-based small target defect identification method and system for power equipment
CN109815911B (en) Video moving object detection system, method and terminal based on depth fusion network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination