CN114266934A - Dangerous action detection method based on cloud storage data - Google Patents

Dangerous action detection method based on cloud storage data

Info

Publication number
CN114266934A
CN114266934A
Authority
CN
China
Prior art keywords
pictures
training
data
cloud storage
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111510459.2A
Other languages
Chinese (zh)
Inventor
马海峰
张继
薛庆水
王俊华
时雪磊
薛震
王晨阳
周雨卫
崔墨香
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Institute of Technology
Original Assignee
Shanghai Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Institute of Technology
Priority to CN202111510459.2A
Publication of CN114266934A
Legal status: Pending

Abstract

The invention discloses a dangerous action detection method based on cloud storage data, which comprises the following steps: step 1, download a training data set from a cloud storage server; step 2, train the model on the data set; step 3, after the model is trained, capture a picture with a camera in the vehicle-mounted device and/or at a specific location, and have the model recognize the action in the picture. The method can impose specific restrictions on people's behavior in certain scenes, such as no smoking at a gas station, no phone calls while driving, and no photography in a museum. Through a behavior detection algorithm based on video monitoring, such behavior can be monitored, discovered, and warned about at the appropriate time, which greatly reduces the manpower, material, and financial resources these places would otherwise require. The method therefore has good application scenarios and clear social value.

Description

Dangerous action detection method based on cloud storage data
Technical Field
The invention relates to the field of computer vision technology and deep learning, in particular to a dangerous action detection method based on cloud storage data.
Background
Irregular or even illegal behaviors frequently occur around us, such as smoking at a gas station, making phone calls while driving, or taking prohibited photographs in a museum. Some of these activities pose potential threats to the personal safety of the person involved or of others, and may even create serious risks of property loss. In such specific scenarios it is therefore necessary to restrict human behavior, yet manual supervision and management are limited by manpower and time. With the development of computer technology, these behaviors can be detected by models trained with deep learning and similar techniques. In recent years deep learning has been applied to many research directions, and this invention attempts to use it as a new and effective way to address these problems.
1. Deep convolutional neural network training method
(1) With a deep learning approach, researchers do not need to hand-design a feature extraction method; a deep convolutional neural network model is built first;
(2) the preprocessed and normalized data set is fed into the deep convolutional neural network model;
(3) the network is trained iteratively through forward propagation and back propagation until the recognition error converges to a small range.
Through this iterative training procedure, the deep convolutional neural network automatically learns hidden features that capture the essence of images of dangerous or illegal driver behavior, and these features can then be used to recognize and classify new input images. A sketch of this train-until-convergence loop is given below.
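As an illustration only, the following sketch expresses the iterate-until-convergence idea as a custom training loop in TensorFlow/Keras (the framework named later in the description); the model, data set, optimizer, and convergence threshold are placeholder assumptions rather than the patented configuration.

```python
import tensorflow as tf

# Illustrative sketch of iterative training with explicit forward and backward passes.
# `model` and `train_ds` are assumed to exist elsewhere: any Keras model and a
# tf.data.Dataset yielding (image, label) batches.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam()

def train_until_converged(model, train_ds, epochs=50, target_loss=0.05):
    for epoch in range(epochs):
        epoch_loss = tf.keras.metrics.Mean()
        for images, labels in train_ds:
            with tf.GradientTape() as tape:
                preds = model(images, training=True)            # forward propagation
                loss = loss_fn(labels, preds)
            grads = tape.gradient(loss, model.trainable_variables)  # back propagation
            optimizer.apply_gradients(zip(grads, model.trainable_variables))
            epoch_loss.update_state(loss)
        if epoch_loss.result() < target_loss:                    # error converged to a small range
            break
    return model
```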
2. Existing abnormal behavior detection technology based on artificial intelligence
(1) Smoking and calling recognition based on classification with traditional image algorithms
After a detection algorithm finds the face, a larger region around the face is cropped and used directly for action analysis. A common traditional machine-learning pipeline is: detect the face with the AdaBoost face detection algorithm (AdaBoost, short for Adaptive Boosting, is a machine learning method widely used for fast face detection); crop a larger region around the detected face to build positive and negative samples of phone-call behavior, and train on this sample library with an SVM (Support Vector Machine, a traditional machine learning algorithm); in the prediction stage, load the trained SVM model, feed in the cropped region to be examined, and output the predicted probability that the behavior is smoking or calling. The advantage of this approach is its speed; its drawback is that, with traditional machine-learning classification, the limited learning capacity of the algorithm keeps the final accuracy low. A sketch of this pipeline is given below.
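For illustration only, here is a minimal sketch of such a traditional pipeline using OpenCV's AdaBoost-trained Haar cascade for face detection and scikit-learn's SVM for classification. Neither library is named in the original, and the cascade file, crop ratio, and flattened-pixel features are assumptions.

```python
import cv2
from sklearn.svm import SVC

# AdaBoost-based face detector (OpenCV ships Haar cascades trained with AdaBoost).
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def expanded_face_crop(image, scale=1.8, size=(64, 64)):
    """Detect a face and cut a larger region around it for action analysis."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    cx, cy = x + w // 2, y + h // 2
    half = int(max(w, h) * scale / 2)
    crop = image[max(cy - half, 0):cy + half, max(cx - half, 0):cx + half]
    return cv2.resize(crop, size)

# Train an SVM on flattened crops (positive = smoking/calling, negative = normal).
# X_train (list of crops) and y_train (0/1 labels) are assumed to be prepared elsewhere.
# clf = SVC(kernel="rbf", probability=True)
# clf.fit([c.flatten() for c in X_train], y_train)
# prob = clf.predict_proba(expanded_face_crop(test_image).flatten().reshape(1, -1))
```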
(2) Calling recognition based on classification with a deep learning algorithm
Given the limited learning capacity of the traditional SVM method noted above, that algorithm often falls short in complex real-world scenarios, so the classifier is replaced: the SVM gives way to a CNN (convolutional neural network). When the network is deep enough, a CNN extracts features from the data far better and can learn phone-call behavior under the varied and complex lighting conditions of real driving, fitting the data more effectively. The advantage of this approach is that deep learning can fit large amounts of complex data; its ability to learn data features is stronger, and in a big-data setting its predictions outperform an SVM. The drawback is that, because a large face-based region is classified, much irrelevant background information is introduced, and the model is prone to inexplicable false recognitions when it encounters backgrounds it was not trained on. A minimal sketch of this replacement follows.
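A minimal sketch of that swap, keeping the same enlarged face crop but classifying it with a small Keras CNN instead of an SVM; the layer sizes and the 64x64 crop are illustrative assumptions.

```python
import tensorflow as tf

# Same cropped region as before, but classified by a CNN instead of an SVM.
# Layer sizes and the 64x64 input are illustrative assumptions.
cnn = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # calling vs. not calling
])
cnn.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# cnn.fit(crop_dataset, epochs=10)  # crop_dataset assumed to hold (crop, label) batches
```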
From an algorithmic point of view, the older methods most commonly used to recognize violations and dangerous driving behaviors are low-level, logically simple machine-learning classifiers such as K-nearest neighbors, SVM support vector machines, and decision trees. These classifiers are simple, easy to implement, and work reasonably well in recognition tasks with few categories, such as handwritten digit recognition. Their disadvantage is that the feature extractor and the classifier must be designed separately: effective features have to be constructed by hand for each usage scenario, which demands substantial domain knowledge and experience before good results can be achieved in image processing. Consequently, in complex and changing scenes such as dangerous driving behavior recognition, these simple machine-learning classifiers suffer from low recognition rate, low accuracy, and poor robustness.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a dangerous action detection method based on cloud storage data, which addresses the low accuracy and high false detection rate that existing dangerous action detection methods suffer when they are disturbed by external factors.
In order to achieve the above purpose, the technical solution for solving the technical problem is as follows:
a dangerous action detection method based on cloud storage data comprises the following steps:
step 1: downloading a training data set from a cloud storage server;
step 2: training the model based on the dataset;
step 3: after the model is trained, a picture is taken by a camera device in the vehicle-mounted device and/or at a specific location, and the model recognizes the action in the picture.
Further, in step 1, the downloaded training data set includes smoking image data, calling image data, and normal data, and each picture contains at least one person.
Further, in step 2, the training uses a deep learning framework to obtain the paths of all pictures in the data set folder, then checks whether each picture's format is complete, removes unsuitable pictures, constructs the corresponding labels, preprocesses the pictures, and reads all the pictures in.
Further, in step 2, the data preprocessing and training process specifically includes the following steps:
obtaining the paths of all pictures in the training set folder with pathlib, then checking that each picture is a complete jpg file, removing unsuitable pictures, and constructing the corresponding labels;
preprocessing the pictures with a custom function load_preprocess_image, reading all pictures with tf.data.Dataset.from_tensor_slices, zipping them with the corresponding labels, and splitting off 10% of the pictures as a test set;
calling the Keras applications ResNet model with pre-trained ImageNet weights, modifying the network output so that the 1000-class output becomes a 4-class output, defining an optimizer and a loss function, and training for 50 epochs on a TITAN XP;
in the residual bottleneck block, the input is a 256-dimensional feature; the 256 channels are reduced to 64 with a 1x1 convolution, passed through a 3x3 convolution with 64 filters, and finally restored to 256 channels by another 1x1 convolution, so the total number of parameters is 1x1x256x64 + 3x3x64x64 + 1x1x64x256 = 69632, as worked out below.
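For completeness, the parameter count of this bottleneck block works out as follows (convolution weights only, biases omitted as in the figures usually accompanying ResNet):

```latex
\begin{align*}
\underbrace{1\times1\times256\times64}_{1\times1\ \text{reduction}}
+ \underbrace{3\times3\times64\times64}_{3\times3\ \text{convolution}}
+ \underbrace{1\times1\times64\times256}_{1\times1\ \text{restoration}}
&= 16384 + 36864 + 16384 \\
&= 69632 .
\end{align*}
```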
Further, in step 3, if the result of the action recognition belongs to a preset dangerous action, a prompt message or an alarm is issued, wherein the preset dangerous action comprises at least one of the following states of the person: smoking, making a phone call, smoking while making a phone call, and neither smoking nor making a phone call.
Further, in step 3, taking a picture with a camera device in the vehicle-mounted device and/or at a specific location and identifying the action of the person includes:
detecting at least one target region of the person in at least one captured image;
cropping, from at least one frame of the photo, a target image corresponding to the detected target region;
performing action recognition on the person according to the target image;
wherein the target region comprises at least one of: a local face region, an action interaction object, and a limb region.
Owing to the above technical scheme, the invention has the following advantages and positive effects compared with the prior art:
Based on a large amount of data, the method designs a computer-vision algorithm to recognize smoking and phone-call behavior. It addresses the limited image clarity and the widely varying pixel sizes of persons in the data set, and it continually improves recognition accuracy for smoking and phone-call behavior by trying different image algorithms and data augmentation, so that dangerous behavior can be detected accurately. The manpower, material, and financial resources required for dangerous behavior detection in these specific places are greatly reduced, and the method has broad application scenarios and good market prospects.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:
fig. 1 is a schematic flow chart of a dangerous action detection method based on cloud storage data according to the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
1. Operating environment of the method
The method uses the TensorFlow deep learning framework, version 2.0.0, with Python 3.6 and CUDA 10.0. The dependencies used in the method are tensorflow, pandas, numpy, matplotlib, os, and pathlib. A simple environment check is sketched below.
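A minimal sanity check of this environment (the package list is the one above; the exact checks are an illustrative assumption, not part of the invention):

```python
import tensorflow as tf
import pandas, numpy, matplotlib  # plus the standard-library os and pathlib

print("TensorFlow:", tf.__version__)                 # expected 2.0.0
print("GPU available:", tf.test.is_gpu_available())  # TF 2.0-era API; requires CUDA 10.0
```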
2. Method execution step
As shown in fig. 1, the present embodiment discloses a method for detecting a dangerous action based on cloud storage data, which includes the following steps:
step 1: downloading a training data set from a cloud storage server;
step 2: training the model based on the dataset;
step 3: after the model is trained, a picture is taken by a camera device in the vehicle-mounted device and/or at a specific location, and the model recognizes the action in the picture.
Further, in step 1, the downloaded training data set includes smoking image data, calling image data, and normal data, and each picture contains at least one person.
Further, in step 2, the training uses a deep learning framework to obtain the paths of all pictures in the data set folder, then checks whether each picture's format is complete, removes unsuitable pictures, constructs the corresponding labels, preprocesses the pictures, and reads all the pictures in.
In this embodiment, several data augmentation operations are introduced in the data preprocessing stage: all training images are rotated by 180 degrees and added back to the training set, and rotation, translation, scaling, and color saturation and contrast adjustments are applied, so that limited data yields value equivalent to a larger data set without actually adding data (a sketch follows). In the model training stage, the backbone network is a ResNet (Residual Neural Network); by passing the input directly through to the output, it preserves the integrity of the information, and the network only needs to learn the difference between input and output, which simplifies the learning objective and reduces its difficulty. On this basis, the method adjusts the original network structure: a relu activation is added after the original 1000-class output, and a flatten layer converts the 1000-class output into the 4-class output required by the method.
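A sketch of this augmentation stage with tf.image is given below; the jitter ranges and the way the rotated copy is concatenated back into the training set are illustrative assumptions rather than the patented settings.

```python
import tensorflow as tf

def augment(image, label):
    """Illustrative photometric jitter: random saturation, contrast, and horizontal flip."""
    image = tf.image.random_saturation(image, 0.8, 1.2)
    image = tf.image.random_contrast(image, 0.8, 1.2)
    image = tf.image.random_flip_left_right(image)
    return image, label

# train_ds is assumed to be a tf.data.Dataset of (image, label) pairs.
# The 180-degree-rotated copy is appended to the original set, then random jitter is
# applied on the fly, so limited data yields more effective training samples.
# rotated_ds = train_ds.map(lambda img, lab: (tf.image.rot90(img, k=2), lab))
# augmented_ds = train_ds.concatenate(rotated_ds).map(augment)
```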
Specifically, in step 2, the data preprocessing and training process includes the following steps (a condensed code sketch is given after these steps):
obtaining the paths of all pictures in the training set folder with pathlib, then checking that each picture is a complete jpg file, removing unsuitable pictures (only a few dozen), and constructing the corresponding labels;
preprocessing the pictures with a custom function load_preprocess_image, reading all pictures with tf.data.Dataset.from_tensor_slices, zipping them with the corresponding labels, and splitting off 10% of the pictures as a test set;
calling the Keras applications ResNet model with pre-trained ImageNet weights, modifying the network output so that the 1000-class output becomes a 4-class output, defining an optimizer and a loss function, and training for 50 epochs on a TITAN XP;
in the residual bottleneck block, the input is a 256-dimensional feature; the 256 channels are reduced to 64 with a 1x1 convolution, passed through a 3x3 convolution with 64 filters, and finally restored to 256 channels by another 1x1 convolution, so the total number of parameters is 1x1x256x64 + 3x3x64x64 + 1x1x64x256 = 69632. As a result, the trained model improves accuracy over traditional algorithms and is far better in speed and computational cost.
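A condensed sketch of this preprocessing and transfer-learning pipeline follows. It uses the functions named in the text (pathlib, a custom load_preprocess_image, tf.data.Dataset.from_tensor_slices, zip, and a Keras ResNet with ImageNet weights), but the folder layout, 224x224 input size, Adam optimizer, and the exact way the 1000-class head is replaced are assumptions, not confirmed details of the invention.

```python
import pathlib
import random
import tensorflow as tf

data_root = pathlib.Path("train_data")                  # assumed layout: train_data/<class>/<img>.jpg
all_paths = [str(p) for p in data_root.glob("*/*.jpg")] # keep only complete .jpg files
random.shuffle(all_paths)

class_names = sorted(d.name for d in data_root.iterdir() if d.is_dir())
label_index = {name: i for i, name in enumerate(class_names)}   # 4 classes assumed
all_labels = [label_index[pathlib.Path(p).parent.name] for p in all_paths]

def load_preprocess_image(path):
    img = tf.io.read_file(path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [224, 224])              # input size is an assumption
    return img / 255.0

path_ds = tf.data.Dataset.from_tensor_slices(all_paths).map(load_preprocess_image)
label_ds = tf.data.Dataset.from_tensor_slices(all_labels)
ds = tf.data.Dataset.zip((path_ds, label_ds))

test_count = len(all_paths) // 10                       # 10% held out as the test set
test_ds = ds.take(test_count).batch(32)
train_ds = ds.skip(test_count).shuffle(1000).batch(32)

# ResNet50 backbone with ImageNet weights; the 1000-class head is replaced by a
# flatten layer, a relu-activated dense layer, and a 4-class softmax output.
base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                      input_shape=(224, 224, 3))
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=test_ds, epochs=50)   # 50 training rounds
```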
In the model deployment stage, the method uses a remote host to receive pictures from surveillance cameras and vehicle-mounted camera equipment, run recognition, and return the result; this is simple, fast, and efficient. The accuracy on the local test set reaches 99%; the test set data are preprocessed in the same way and fed into the model. A sketch of such a deployment service is given below.
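The deployment step can be sketched as a small HTTP service on the remote host. Flask is not mentioned in the original and is used here only as a convenient assumed web framework; the endpoint name, saved-model path, and class-name order are likewise assumptions.

```python
import io

import numpy as np
import tensorflow as tf
from flask import Flask, request, jsonify
from PIL import Image

app = Flask(__name__)
model = tf.keras.models.load_model("danger_action_model.h5")            # assumed saved-model path
CLASS_NAMES = ["normal", "smoking", "calling", "smoking_and_calling"]   # assumed class order

@app.route("/predict", methods=["POST"])
def predict():
    # The camera or vehicle-mounted device POSTs a JPEG; preprocess it the same way
    # as the training data, run the model, and return the recognized action.
    img = Image.open(io.BytesIO(request.files["image"].read())).convert("RGB")
    x = np.array(img.resize((224, 224)), dtype=np.float32) / 255.0
    probs = model.predict(x[None, ...])[0]
    return jsonify({"action": CLASS_NAMES[int(np.argmax(probs))],
                    "confidence": float(np.max(probs))})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```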
Further, in step 3, if the result of the action recognition belongs to a preset dangerous action, a prompt message or an alarm is issued, wherein the preset dangerous action comprises at least one of the following states of the person: smoking, making a phone call, smoking while making a phone call, and neither smoking nor making a phone call.
Further, in step 3, taking a picture with a camera device in the vehicle-mounted device and/or at a specific location and identifying the action of the person includes:
detecting at least one target region of the person in at least one captured image;
cropping, from at least one frame of the photo, a target image corresponding to the detected target region;
performing action recognition on the person according to the target image;
wherein the target region comprises at least one of: a local face region, an action interaction object, and a limb region. A sketch of the cropping step follows.
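A sketch of this crop-then-recognize step; the detector interface and padding ratio are placeholder assumptions (any detector that returns bounding boxes for the local face region, the action interaction object, or the limb region would fit).

```python
def crop_target_regions(frame, detections, pad=0.2):
    """Cut target images (face region, interaction object, limbs) out of one frame.

    `frame` is assumed to be an H x W x C array; `detections` is assumed to be a
    list of (x1, y1, x2, y2) boxes produced by any object/region detector.
    """
    h, w = frame.shape[:2]
    crops = []
    for x1, y1, x2, y2 in detections:
        dx, dy = int((x2 - x1) * pad), int((y2 - y1) * pad)   # small margin around the box
        crop = frame[max(y1 - dy, 0):min(y2 + dy, h), max(x1 - dx, 0):min(x2 + dx, w)]
        crops.append(crop)
    return crops

# Action recognition on each cropped target image (model and preprocessing as above):
# for crop in crop_target_regions(photo, detector(photo)):
#     probs = model.predict(preprocess(crop)[None, ...])
```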
In the dangerous action detection method based on cloud storage data of the invention, a convolutional neural network processes the input frames of the video stream captured by the camera, and abnormal behavior is judged through image feature extraction, feature fusion, target classification, target localization, and related processes. Through a series of purpose-built convolutional neural network modules, the computational cost of the model is reduced, inference is accelerated to meet real-time requirements, and detection accuracy for small target objects is improved. In addition, several model-training techniques are applied to improve the robustness of the model. Because existing models suffer from insufficient recognition accuracy caused by insufficient data sets, the method introduces a large amount of data stored in the cloud to solve this problem. Comparison experiments show that the proposed algorithm achieves better detection results on this data set and on several public data sets.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. A dangerous action detection method based on cloud storage data is characterized by comprising the following steps:
step 1: downloading a training data set from a cloud storage server;
step 2: training the model based on the dataset;
step 3: after the model is trained, a picture is taken by a camera device in the vehicle-mounted device and/or at a specific location, and the model recognizes the action in the picture.
2. The method according to claim 1, wherein in step 1, the downloaded training data set comprises smoking image data, calling image data and normal data, and each picture contains at least one person.
3. The method for detecting dangerous actions based on cloud storage data according to claim 1, wherein in step 2, the training uses a deep learning framework to obtain the paths of all pictures in the data set folder, then checks whether each picture's format is complete, removes unsuitable pictures, constructs the corresponding labels, preprocesses the pictures, and reads all the pictures in.
4. The method according to claim 3, wherein in the step 2, the data preprocessing and training process specifically includes the following steps:
obtaining the paths of all pictures in the training set folder with pathlib, then checking that each picture is a complete jpg file, removing unsuitable pictures, and constructing the corresponding labels;
preprocessing the pictures with a custom function load_preprocess_image, reading all pictures with tf.data.Dataset.from_tensor_slices, zipping them with the corresponding labels, and splitting off 10% of the pictures as a test set;
calling the Keras applications ResNet model with pre-trained ImageNet weights, modifying the network output so that the 1000-class output becomes a 4-class output, defining an optimizer and a loss function, and training for 50 epochs on a TITAN XP;
in the residual bottleneck block, the input is a 256-dimensional feature; the 256 channels are reduced to 64 with a 1x1 convolution, passed through a 3x3 convolution with 64 filters, and finally restored to 256 channels by another 1x1 convolution, so the total number of parameters is 1x1x256x64 + 3x3x64x64 + 1x1x64x256 = 69632.
5. The method for detecting dangerous actions based on cloud storage data according to claim 1, wherein in step 3, if the result of the action recognition belongs to a predetermined dangerous action, a prompt message or an alarm is issued, wherein the predetermined dangerous action comprises at least one of the following states of the person: smoking, making a phone call, smoking while making a phone call, and neither smoking nor making a phone call.
6. The method for detecting dangerous actions based on cloud storage data according to claim 5, wherein in step 3, taking a picture with the camera device in the vehicle-mounted device and/or at a specific location and identifying the action of the person comprises:
detecting at least one target region of the person in at least one captured image;
cropping, from at least one frame of the photo, a target image corresponding to the detected target region;
performing action recognition on the person according to the target image;
wherein the target region comprises at least one of: a local face region, an action interaction object, and a limb region.
CN202111510459.2A 2021-12-10 2021-12-10 Dangerous action detection method based on cloud storage data Pending CN114266934A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111510459.2A CN114266934A (en) 2021-12-10 2021-12-10 Dangerous action detection method based on cloud storage data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111510459.2A CN114266934A (en) 2021-12-10 2021-12-10 Dangerous action detection method based on cloud storage data

Publications (1)

Publication Number Publication Date
CN114266934A (en) 2022-04-01

Family

ID=80827003

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111510459.2A Pending CN114266934A (en) 2021-12-10 2021-12-10 Dangerous action detection method based on cloud storage data

Country Status (1)

Country Link
CN (1) CN114266934A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106611169A (en) * 2016-12-31 2017-05-03 中国科学技术大学 Dangerous driving behavior real-time detection method based on deep learning
CN110532878A (en) * 2019-07-26 2019-12-03 中山大学 A kind of driving behavior recognition methods based on lightweight convolutional neural networks
CN110969130A (en) * 2019-12-03 2020-04-07 厦门瑞为信息技术有限公司 Driver dangerous action identification method and system based on YOLOV3
CN112434646A (en) * 2020-12-08 2021-03-02 浙江大学 Finished tea quality identification method based on transfer learning and computer vision technology
CN112966589A (en) * 2021-03-03 2021-06-15 中润油联天下网络科技有限公司 Behavior identification method in dangerous area
KR20210128491A (en) * 2020-06-29 2021-10-26 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. Hazardous driving behavior identification method, device, electronic equipment and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106611169A (en) * 2016-12-31 2017-05-03 中国科学技术大学 Dangerous driving behavior real-time detection method based on deep learning
CN110532878A (en) * 2019-07-26 2019-12-03 中山大学 A kind of driving behavior recognition methods based on lightweight convolutional neural networks
CN110969130A (en) * 2019-12-03 2020-04-07 厦门瑞为信息技术有限公司 Driver dangerous action identification method and system based on YOLOV3
KR20210128491A (en) * 2020-06-29 2021-10-26 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. Hazardous driving behavior identification method, device, electronic equipment and storage medium
CN112434646A (en) * 2020-12-08 2021-03-02 浙江大学 Finished tea quality identification method based on transfer learning and computer vision technology
CN112966589A (en) * 2021-03-03 2021-06-15 中润油联天下网络科技有限公司 Behavior identification method in dangerous area

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SUN Tieqiang; HE Ying: "Detection related to assisted driving based on Faster R-CNN", Journal of North China University of Science and Technology (Natural Science Edition), no. 04 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination