CN114140750A - Filling station safety helmet wearing real-time detection method based on YOLOv4-Tiny - Google Patents
- Publication number: CN114140750A (application CN202111495511.1A)
- Authority: CN (China)
- Prior art keywords: tiny, yolov4, network, training, pictures
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/045—Combinations of networks
- G06N3/048—Activation functions
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a real-time method for detecting safety-helmet wearing at filling stations based on YOLOv4-Tiny, comprising the following steps: S1, perform frame extraction on the original surveillance video of the gas station to obtain a number of monitoring pictures, preprocess them into training pictures, select a given proportion of the training pictures to generate adversarial samples, and take the remaining training pictures as original samples; S2, generate a first training set from the adversarial samples and the original samples; S3, train the improved YOLOv4-Tiny model on the first training set to obtain the corresponding trained weights; S4, connect to the gas station's real-time surveillance feed, decompose it into frames, feed each frame in real time to the trained improved YOLOv4-Tiny model to obtain the helmet-wearing classification results for the persons in the frame, and overlay the classification results onto the frame to obtain a classified picture; S5, synthesize the classified pictures back into video in real time and output the video in real time. The method achieves high detection accuracy and strong real-time performance, and is suited to the hardware equipment available inside gas stations.
Description
Technical Field
The invention belongs to the technical field of intelligent recognition of gas-station safety images, and particularly relates to a real-time method for detecting safety-helmet wearing at gas stations based on YOLOv4-Tiny.
Background
As industrial systems grow more complex and artificial-intelligence technology continues to develop, many industrial problems can be solved by applying AI to improve work efficiency. Alongside these technical advances, worker safety has become a major concern. The safety helmet is a key piece of equipment for protecting the personal safety of workers: worn in actual industrial production, it effectively protects workers and greatly reduces production-safety accidents.
Most existing helmet-detection methods are based on deep learning: a classical target-detection algorithm is repurposed for helmet detection simply by swapping in a helmet training set, for example the two-stage R-CNN series or the single-stage YOLO series. In practice, however, because the detection environment is harsh and helmets are small targets, existing detection methods demand favourable conditions and deliver low detection precision. Moreover, hardware equipment in gas stations is updated slowly and has limited computing power, so improving detection accuracy must be balanced against the complexity of the detection model to fit the hardware available at the station.
It is therefore necessary to improve existing detection models so that they both fit the hardware available in gas stations and achieve higher detection accuracy.
Disclosure of Invention
To address the problems in the prior art, the invention provides a real-time method for detecting safety-helmet wearing at gas stations based on YOLOv4-Tiny, which fits the hardware available in gas stations and achieves high detection accuracy.
The invention adopts the following technical scheme: a real-time filling-station helmet-wearing detection method based on YOLOv4-Tiny, comprising the following steps: S1, perform frame extraction on the original surveillance video of the gas station to obtain a number of monitoring pictures, preprocess them into training pictures, select a given proportion of the training pictures to generate adversarial samples, and take the remaining training pictures as original samples;
S2, generate a first training set from the adversarial samples and the original samples;
S3, train the improved YOLOv4-Tiny model on the first training set to obtain the corresponding trained weights;
S4, connect to the gas station's real-time surveillance feed, decompose it into frames, feed each frame in real time to the trained improved YOLOv4-Tiny model to obtain the helmet-wearing classification results for the persons in the frame, and overlay the classification results onto the frame to obtain a classified picture;
S5, synthesize the classified pictures back into video in real time and output the video in real time;
the improved YOLOv4-Tiny model comprises a feature-extraction backbone module, a multi-scale feature-fusion module and a classification-prediction module connected in sequence, wherein the backbone is a CSPdarknet53_tiny network;
the improved YOLOv4-Tiny model further includes an attention-mechanism module inserted into the residual network of the Resblock_body modules of the CSPdarknet53_tiny network.
As a preferred scheme, the attention-mechanism module is an SENet network. SENet is a channel-attention network: the input feature map first undergoes global average pooling, then passes through two fully connected layers, and finally a Sigmoid activation outputs per-channel weights, which are multiplied with the input feature map to produce the output.
Preferably, step S1 includes the steps of:
S1.1, select surveillance video of the gas station over a period of time and perform frame extraction to obtain monitoring pictures containing persons both wearing and not wearing safety helmets;
S1.2, mark the target positions in the monitoring pictures and form labels to obtain training pictures;
S1.3, select a given proportion of the training pictures and generate adversarial samples using a targeted adversarial objectness gradient (TOG) attack method;
S1.4, shuffle the adversarial samples together with the remaining training pictures and apply data-enhancement operations to form the first training set.
Preferably, step S1.3 includes the steps of:
S1.3.1, train to obtain a trained original YOLOv4-Tiny network;
S1.3.2, input the selected proportion of training pictures into the trained original YOLOv4-Tiny network and run a forward pass to obtain the confidence loss;
S1.3.3, back-propagate the confidence loss with the network parameters frozen, so that only the pixels of the picture can be modified;
S1.3.4, after each iteration, input the modified picture into the trained original YOLOv4-Tiny network; when the target position can no longer be detected correctly, stop iterating, and the picture has become an adversarial sample.
Preferably, in step S1.3.3, the back-propagation steps in the direction that increases the confidence loss, i.e. gradient ascent on the pixels.
Preferably, step S1.3.1 includes the following steps:
a. take all the training pictures as a second training set;
b. set the training labels to person, 50%hat and 100%hat, used respectively to detect no helmet worn, helmet worn incorrectly and helmet worn correctly;
c. input the second training set into the original YOLOv4-Tiny model, compute the loss and back-propagate to obtain the corresponding network weights, yielding the trained original YOLOv4-Tiny network.
Preferably, in step c, the loss is calculated by the following formula:

$$
\begin{aligned}
Loss ={}& \lambda_{coord}\sum_{i=0}^{S^{2}-1}\sum_{j=0}^{B-1}\mathbb{1}_{ij}^{obj}\Big[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\Big] \\
&+\lambda_{coord}\sum_{i=0}^{S^{2}-1}\sum_{j=0}^{B-1}\mathbb{1}_{ij}^{obj}\Big[\big(\sqrt{w_i}-\sqrt{\hat{w}_i}\big)^2+\big(\sqrt{h_i}-\sqrt{\hat{h}_i}\big)^2\Big] \\
&+\sum_{i=0}^{S^{2}-1}\sum_{j=0}^{B-1}\mathbb{1}_{ij}^{obj}\big(C_i-\hat{C}_i\big)^2
+\lambda_{noobj}\sum_{i=0}^{S^{2}-1}\sum_{j=0}^{B-1}\mathbb{1}_{ij}^{noobj}\big(C_i-\hat{C}_i\big)^2 \\
&+\lambda_{cls}\sum_{i=0}^{S^{2}-1}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\big(p_i(c)-\hat{p}_i(c)\big)^2
\end{aligned}
$$

where $\lambda_{coord}$ is the weight of the coordinate loss and $\lambda_{cls}$ the weight of the grid prediction class loss ($\lambda_{noobj}$ down-weights the confidence loss of boxes containing no object); $S^2$ is the number of grids the picture is divided into; $B$ is the number of prediction boxes per grid; $\mathbb{1}_{ij}^{obj}$ is 1 if the $j$-th prediction box of the $i$-th grid is the responsible prediction box and 0 otherwise; $\mathbb{1}_{ij}^{noobj}$ is 1 if it is not the responsible prediction box and 0 otherwise; $(x_i, y_i)$ are the centre coordinates of the real annotation of the object the $i$-th grid is responsible for and $(\hat{x}_i, \hat{y}_i)$ the centre coordinates of its prediction box; $(w_i, h_i)$ are the width and height of the real annotation and $(\hat{w}_i, \hat{h}_i)$ those of the prediction box; $C_i$ and $\hat{C}_i$ are the real and predicted classification (confidence) results of the object the $i$-th grid is responsible for; $p_i(c)$ and $\hat{p}_i(c)$ are the real and predicted probabilities that the object belongs to class $c$; and $classes$ is the set of class labels.
Preferably, the CSPdarknet53_tiny network includes a first DarknetConv2D_BN_Leaky module, a second DarknetConv2D_BN_Leaky module, a first Resblock_body module, a second Resblock_body module, a third Resblock_body module and a third DarknetConv2D_BN_Leaky module, connected in sequence.
As a preferred scheme, the multi-scale feature-fusion module comprises a first convolution layer, a second convolution layer, a first up-sampling layer, a first concatenation layer, a third convolution layer, a second up-sampling layer and a second concatenation layer, connected in sequence from bottom to top;
the classification-prediction module comprises a first, a second and a third Yolo Head classification network, arranged in sequence from bottom to top;
the first feature map, output by the third DarknetConv2D_BN_Leaky module, is fed on one path through the first convolution layer to the first Yolo Head classification network. On a second path it passes through the second convolution layer and the first up-sampling layer and is concatenated, in the first concatenation layer, with the second feature map output by the second Resblock_body module to give the first fused feature map. The first fused feature map is fed on one path to the second Yolo Head classification network; on another path it passes through the third convolution layer and the second up-sampling layer and is concatenated, in the second concatenation layer, with the third feature map output by the first Resblock_body module to give the second fused feature map, which is fed to the third Yolo Head classification network.
Preferably, each DarknetConv2D_BN_Leaky module consists of a Conv2D convolution, batch normalization (BN) and a Leaky ReLU activation function.
The invention has the following beneficial effects:
Detection is performed with a YOLOv4-Tiny model, a lightweight version of the YOLOv4 model, so that it fits the hardware available in gas stations and improves detection speed.
The original YOLOv4-Tiny model detects at only two scales, which is not accurate enough for safety helmets, so the method improves the model by adding a third scale for finer-grained detection, improving detection accuracy. An attention-mechanism module is also inserted into the residual network of the Resblock_body modules of the CSPdarknet53_tiny network, further improving accuracy, while both the CSPdarknet53_tiny network and the attention network remain simple enough for the hardware available in gas stations.
Based on the original YOLOv4-Tiny network, adversarial samples are generated with a targeted adversarial objectness gradient attack method, and the improved YOLOv4-Tiny model is trained on both the adversarial and the original samples to obtain the final detection model, improving the robustness and generalization of the detection model.
The surveillance video is decomposed into frames, each frame is detected and classified to obtain classified pictures, and the classified pictures are finally re-synthesized into output video, achieving real-time classification of personal helmet wearing and therefore better detection.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flow chart of a method for detecting the wearing of a safety helmet of a gas station based on YOLOv4-Tiny in real time according to the invention;
FIG. 2 is a schematic structural diagram of a YOLOv4-Tiny improved model;
fig. 3 is a schematic diagram of the structure of the SENet network.
Detailed Description
The following description of the embodiments of the present invention is provided by way of specific examples, and other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure herein. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
Referring to fig. 1, this embodiment provides a real-time filling-station helmet-wearing detection method based on YOLOv4-Tiny, comprising the steps:
s1, performing frame extraction processing on an original monitoring video of a gas station, extracting a plurality of monitoring pictures, preprocessing the monitoring pictures to obtain training pictures, selecting training pictures with corresponding proportions to generate confrontation samples, and taking the rest training pictures as original samples, wherein the proportion of 1/3 is selected in the embodiment and can be specifically set according to actual conditions;
s2, generating a first training set based on the confrontation sample and the original sample;
s3, training the YOLOv4-Tiny improved model by adopting a first training set to obtain corresponding training weight;
s4, accessing a gas station for real-time monitoring, performing frame decomposition, inputting the decomposition picture of each frame into a trained YOLOv4-Tiny improved model in real time to obtain a personal safety helmet wearing classification result in the decomposition picture, and overlapping the classification result into the decomposition picture to obtain a classification picture;
s5, synthesizing the classified pictures into videos in real time, and outputting the videos in real time;
the YOLOv4-Tiny improved model comprises a feature extraction backbone network module, a multi-scale feature fusion module and a classification prediction module which are sequentially connected, wherein the feature extraction backbone network module is a CSPdarknet53_ Tiny network;
the YOLOv4-Tiny improvement model also includes an attention mechanism module inserted into the residual network of the Resblock _ body module of the CSPdarknet53_ Tiny network.
In the invention, first, detection is performed with a YOLOv4-Tiny model, a lightweight version of the YOLOv4 model, so that it fits the hardware available in gas stations and improves detection speed.
Second, the existing YOLOv4-Tiny model has only two scales, which is not accurate enough for safety helmets, so the model is improved: the feature-extraction backbone adopts a CSPdarknet53_tiny network, and a third scale is added for finer-grained detection, improving detection accuracy. An attention-mechanism module is also inserted into the residual network of the Resblock_body modules of the CSPdarknet53_tiny network, further improving accuracy, while both the CSPdarknet53_tiny network and the attention network remain simple enough for the hardware available in gas stations.
Third, the surveillance video is decomposed into frames, each frame is detected and classified to obtain classified pictures, and the classified pictures are finally re-synthesized into output video, achieving real-time classification of personal helmet wearing and therefore better detection.
In addition, the improved YOLOv4-Tiny model is trained on both the adversarial samples and the original samples to obtain the final detection model, improving the robustness and generalization of the detection model.
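The per-frame pipeline of steps S4 and S5 can be sketched as below. This is a minimal dependency-free illustration, not the patented implementation: `detect_helmets` is a hypothetical stub standing in for the trained model, and a real system would decode and re-encode video and draw the overlays with a library such as OpenCV.

```python
import numpy as np

def detect_helmets(frame):
    """Stub standing in for the trained improved YOLOv4-Tiny model.
    Returns a list of (box, label) results; here it is hard-coded."""
    return [((10, 10, 60, 80), "100%hat")]

def overlay(frame, detections):
    """Burn the classification results into the frame. A real system
    would draw boxes and text with OpenCV; here we just mark the top
    and bottom edges of each box to keep the logic dependency-free."""
    out = frame.copy()
    for (x1, y1, x2, y2), _label in detections:
        out[y1, x1:x2] = 255      # top edge of the box
        out[y2 - 1, x1:x2] = 255  # bottom edge of the box
    return out

def process_stream(frames):
    """Steps S4/S5: decompose -> detect -> overlay -> re-assemble."""
    classified = [overlay(f, detect_helmets(f)) for f in frames]
    return classified  # in a real system these are re-encoded to video

frames = [np.zeros((416, 416), dtype=np.uint8) for _ in range(3)]
result = process_stream(frames)
```

The original frames are left untouched (`overlay` copies), so the monitoring feed and the classified output can be kept side by side.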
Specifically, the method comprises the following steps:
in step S1, the method includes the steps of:
s1.1, selecting a monitoring video of a gas station in a time period, performing frame extraction processing, randomly selecting a plurality of pictures including pictures of workers with safety helmets and pictures without safety helmets, and uniformly cutting the pictures into 416 x 416 sizes to obtain monitoring pictures including pictures of workers wearing safety helmets and pictures of workers not wearing safety helmets;
s1.2, manually marking a target position in a monitored picture by adopting labelme software, forming a label, and simultaneously generating corresponding xml, json and png format files to obtain a training picture;
s1.3, selecting training pictures in corresponding proportions, and generating a confrontation sample by adopting a target confrontation object gradient attack method;
and S1.4, performing a scrambling operation on the confrontation sample and the rest of the training pictures, and performing a data enhancement operation to form a first training set, wherein the data enhancement operation comprises random cutting, turning, zooming and the like.
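A minimal sketch of the frame extraction and preprocessing of steps S1.1 and S1.4, assuming the video is already decoded into arrays; the 25-frame step, the centre crop and the flip-only augmentation are illustrative choices of ours, not specified by the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_frames(video, step=25):
    """Keep every `step`-th frame (~1 fps for a 25 fps feed);
    `video` is modelled as a list of H x W x 3 arrays."""
    return video[::step]

def center_crop_416(img):
    """Uniformly cut pictures to 416 x 416 as in step S1.1
    (a centre crop; random crops are equally plausible)."""
    h, w = img.shape[:2]
    top, left = (h - 416) // 2, (w - 416) // 2
    return img[top:top + 416, left:left + 416]

def augment(img):
    """Data enhancement from step S1.4: random horizontal flip only;
    random crop/scale are omitted to keep the sketch short."""
    if rng.random() < 0.5:
        img = img[:, ::-1]
    return img

video = [np.zeros((480, 640, 3), dtype=np.uint8)] * 100
train = [augment(center_crop_416(f)) for f in extract_frames(video)]
```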
Step S1.3 includes the steps of:
S1.3.1, train to obtain a trained original YOLOv4-Tiny network;
S1.3.2, input the selected 1/3 of the training pictures into the trained original YOLOv4-Tiny network and run a forward pass to obtain the confidence loss;
S1.3.3, back-propagate the confidence loss with the network parameters frozen so that only the pixels of the picture can be modified; unlike conventional training, which minimizes the loss, the back-propagation here maximizes the confidence loss, which drives the grid regions containing targets to be classified as background and thus yields adversarial samples with an attack effect;
S1.3.4, after each iteration, input the modified picture into the trained original YOLOv4-Tiny network; when the target position can no longer be detected correctly, stop iterating, and the picture has become an adversarial sample.
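The attack loop of steps S1.3.2 to S1.3.4 can be illustrated on a toy differentiable "detector". This is only a sketch of the core idea, gradient steps on the pixels with frozen parameters: the linear objectness score, learning rate and threshold are invented here, and the real method back-propagates through the full YOLOv4-Tiny confidence loss.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Frozen "detector": a toy linear objectness score standing in for the
# trained original YOLOv4-Tiny network. w and b stay fixed throughout.
rng = np.random.default_rng(1)
w = rng.normal(size=64)
b = 0.0

def objectness(x):
    return sigmoid(w @ x + b)

def make_adversarial(x, lr=0.5, thresh=0.5, max_iter=200):
    """Steps S1.3.2-S1.3.4: forward pass, then modify only the pixels
    (parameters frozen) to push the detection confidence down, i.e.
    maximize the confidence loss; stop once the target is no longer
    detected."""
    x = x.copy()
    for _ in range(max_iter):
        s = objectness(x)
        if s < thresh:  # detection suppressed -> adversarial sample
            return x
        # gradient of s w.r.t. the pixels is w * s * (1 - s);
        # stepping against it lowers the objectness score
        x -= lr * w * s * (1.0 - s)
    return x

x0 = 0.05 * w              # a "picture" the detector fires on
adv = make_adversarial(x0)
```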
Step S1.3.1 includes the following steps:
a. take all the training pictures as a second training set;
b. set the training labels to person, 50%hat and 100%hat, used respectively to detect no helmet worn, helmet worn incorrectly and helmet worn correctly;
c. input the second training set into the original YOLOv4-Tiny model, compute the loss and back-propagate to obtain the corresponding network weights, yielding the trained original YOLOv4-Tiny network.
In step c, the formula for calculating the loss is as follows:

$$
\begin{aligned}
Loss ={}& \lambda_{coord}\sum_{i=0}^{S^{2}-1}\sum_{j=0}^{B-1}\mathbb{1}_{ij}^{obj}\Big[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\Big] \\
&+\lambda_{coord}\sum_{i=0}^{S^{2}-1}\sum_{j=0}^{B-1}\mathbb{1}_{ij}^{obj}\Big[\big(\sqrt{w_i}-\sqrt{\hat{w}_i}\big)^2+\big(\sqrt{h_i}-\sqrt{\hat{h}_i}\big)^2\Big] \\
&+\sum_{i=0}^{S^{2}-1}\sum_{j=0}^{B-1}\mathbb{1}_{ij}^{obj}\big(C_i-\hat{C}_i\big)^2
+\lambda_{noobj}\sum_{i=0}^{S^{2}-1}\sum_{j=0}^{B-1}\mathbb{1}_{ij}^{noobj}\big(C_i-\hat{C}_i\big)^2 \\
&+\lambda_{cls}\sum_{i=0}^{S^{2}-1}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\big(p_i(c)-\hat{p}_i(c)\big)^2
\end{aligned}
$$

where $\lambda_{coord}$ is the weight of the coordinate loss and $\lambda_{cls}$ the weight of the grid prediction class loss ($\lambda_{noobj}$ down-weights the confidence loss of boxes containing no object); $S^2$ is the number of grids the picture is divided into; $B$ is the number of prediction boxes per grid; $\mathbb{1}_{ij}^{obj}$ is 1 if the $j$-th prediction box of the $i$-th grid is the responsible prediction box and 0 otherwise; $\mathbb{1}_{ij}^{noobj}$ is 1 if it is not the responsible prediction box and 0 otherwise; $(x_i, y_i)$ are the centre coordinates of the real annotation of the object the $i$-th grid is responsible for and $(\hat{x}_i, \hat{y}_i)$ the centre coordinates of its prediction box; $(w_i, h_i)$ are the width and height of the real annotation and $(\hat{w}_i, \hat{h}_i)$ those of the prediction box; $C_i$ and $\hat{C}_i$ are the real and predicted classification (confidence) results of the object the $i$-th grid is responsible for; $p_i(c)$ and $\hat{p}_i(c)$ are the real and predicted probabilities that the object belongs to class $c$; and $classes$ is the set of class labels.
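As a sanity check, here is a toy numerical instance of the standard YOLO-style loss of step c, with invented values for a 2 x 2 grid, one box per cell and two classes; the class-loss weight is taken as 1 and the usual values 5 and 0.5 are assumed for the coordinate and no-object weights.

```python
import numpy as np

# 2 x 2 grid (S^2 = 4), B = 1 box per cell, one cell responsible
# for a single object. All values below are illustrative only.
lam_coord, lam_noobj = 5.0, 0.5
obj = np.array([1.0, 0.0, 0.0, 0.0])   # indicator 1_ij^obj per cell
noobj = 1.0 - obj                      # indicator 1_ij^noobj

xy_true, xy_pred = np.array([[0.5, 0.5]]), np.array([[0.4, 0.6]])
wh_true, wh_pred = np.array([[0.2, 0.3]]), np.array([[0.25, 0.25]])
C_true = obj                           # confidence targets per cell
C_pred = np.array([0.9, 0.1, 0.2, 0.0])
p_true, p_pred = np.array([[0.0, 1.0]]), np.array([[0.2, 0.8]])

coord = lam_coord * np.sum((xy_true - xy_pred) ** 2)
size = lam_coord * np.sum((np.sqrt(wh_true) - np.sqrt(wh_pred)) ** 2)
conf = np.sum(obj * (C_true - C_pred) ** 2) \
     + lam_noobj * np.sum(noobj * (C_true - C_pred) ** 2)
cls = np.sum((p_true - p_pred) ** 2)   # only the responsible cell
loss = coord + size + conf + cls
```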
Thus the method uses the TOG attack to generate adversarial samples tailored to the YOLOv4-Tiny model, so that adversarial training works better, i.e. the robustness and generalization of the model are improved to a greater extent.
With reference to fig. 2 and 3, the YOLOv4-Tiny improved model and the detection process thereof are described in more detail as follows:
the CSPdark net53_ tiny network outputs three feature maps with different sizes, the sizes are respectively: 52, 26, 13, and the three different feature maps are input into the multi-scale feature fusion module for feature fusion. The multi-scale feature fusion module fuses the three feature maps to obtain three fused feature maps with the sizes of 52 × 52, 26 × 26 and 13 × 13. And the classification prediction module predicts the input feature maps with different scales and outputs a final detection result.
The attention-mechanism module is an SENet network. SENet is a channel-attention network: the input feature map first undergoes global average pooling, then passes through two fully connected layers, and finally a Sigmoid activation outputs per-channel weights, which are multiplied with the input feature map to produce the output.
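A minimal NumPy sketch of the squeeze-and-excitation computation just described; the weight shapes (reduction ratio 4) and the ReLU between the two fully connected layers are the usual SENet choices, assumed here rather than taken from the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(x, w1, b1, w2, b2):
    """SENet channel attention: global average pool -> FC (+ReLU) ->
    FC -> Sigmoid -> channel-wise rescaling. x has shape (C, H, W)."""
    squeeze = x.mean(axis=(1, 2))                # global average pooling, (C,)
    hidden = np.maximum(0.0, w1 @ squeeze + b1)  # first FC + ReLU
    weight = sigmoid(w2 @ hidden + b2)           # second FC + Sigmoid, (C,)
    return x * weight[:, None, None]             # excitation: rescale channels

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 13, 13))                 # 8 channels, 13 x 13 map
w1, b1 = rng.normal(size=(2, 8)), np.zeros(2)    # reduction ratio 4
w2, b2 = rng.normal(size=(8, 2)), np.zeros(8)
y = se_block(x, w1, b1, w2, b2)
```

Because the Sigmoid weights lie in (0, 1), the block can only attenuate channels, never amplify them.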
The CSPdarknet53_tiny network comprises a first DarknetConv2D_BN_Leaky module, a second DarknetConv2D_BN_Leaky module, a first Resblock_body module, a second Resblock_body module, a third Resblock_body module and a third DarknetConv2D_BN_Leaky module, connected in sequence. Each DarknetConv2D_BN_Leaky module consists of a Conv2D convolution, batch normalization (BN) and a Leaky ReLU activation function. A Resblock_body module passes the input feature map through DarknetConv2D_BN_Leaky modules for residual computation, then concatenates the residual output with the module's input to form its output.
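The DarknetConv2D_BN_Leaky building block can be sketched in NumPy as follows; the naive loop convolution and the inference-time batch normalization with stored running statistics are for illustration only, not a performant implementation.

```python
import numpy as np

def conv2d(x, kernels):
    """Naive valid convolution; x: (Cin, H, W), kernels: (Cout, Cin, k, k)."""
    cout, cin, k, _ = kernels.shape
    h, w = x.shape[1] - k + 1, x.shape[2] - k + 1
    out = np.zeros((cout, h, w))
    for o in range(cout):
        for i in range(h):
            for j in range(w):
                out[o, i, j] = np.sum(kernels[o] * x[:, i:i + k, j:j + k])
    return out

def batch_norm(x, gamma, beta, mean, var, eps=1e-5):
    """Inference-time BN using stored per-channel running statistics."""
    return gamma[:, None, None] * (x - mean[:, None, None]) / \
           np.sqrt(var[:, None, None] + eps) + beta[:, None, None]

def leaky_relu(x, slope=0.1):
    return np.where(x > 0, x, slope * x)

def darknet_conv_bn_leaky(x, kernels, gamma, beta, mean, var):
    """Conv2D -> BN -> Leaky ReLU, the building block named above."""
    return leaky_relu(batch_norm(conv2d(x, kernels), gamma, beta, mean, var))

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8, 8))
k = rng.normal(size=(4, 3, 3, 3))
y = darknet_conv_bn_leaky(x, k, np.ones(4), np.zeros(4), np.zeros(4), np.ones(4))
```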
The multi-scale feature-fusion module comprises a first convolution layer, a second convolution layer, a first up-sampling layer, a first concatenation layer, a third convolution layer, a second up-sampling layer and a second concatenation layer, connected in sequence from bottom to top;
the classification-prediction module comprises a first, a second and a third Yolo Head classification network, arranged in sequence from bottom to top;
the first feature map, output by the third DarknetConv2D_BN_Leaky module, is fed on one path through the first convolution layer to the first Yolo Head classification network. On a second path it passes through the second convolution layer and the first up-sampling layer and is concatenated, in the first concatenation layer, with the second feature map output by the second Resblock_body module to give the first fused feature map. The first fused feature map is fed on one path to the second Yolo Head classification network; on another path it passes through the third convolution layer and the second up-sampling layer and is concatenated, in the second concatenation layer, with the third feature map output by the first Resblock_body module to give the second fused feature map, which is fed to the third Yolo Head classification network.
The first feature map is 13 x 13, the second feature map 26 x 26 and the third feature map 52 x 52; the first fused feature map is 26 x 26 and the second fused feature map is 52 x 52.
The Yolo Head classification networks have the same structure as the Yolo Head of the existing YOLOv4-Tiny algorithm and are not described further here.
The improved YOLOv4-Tiny model detects on fused feature layers at three scales, 13 x 13, 26 x 26 and 52 x 52; compared with the original model's two scales, this greatly improves the detector's performance on tiny objects. The existing YOLOv4-Tiny has only two scales, and its ability to detect fine details and tiny objects cannot meet current production requirements, so more scales must take part in the model's inference. Taking 13 x 13 as an example, the Yolo Head prediction proceeds as follows: the input picture is divided into 13 x 13 cells, and if an object's centre falls in a cell, that cell is responsible for predicting the object. Each cell generates three anchor boxes, so a total of 13 x 13 x 3 = 507 anchor boxes are generated for prediction. When an object's confidence exceeds the threshold, the three anchor boxes are retained and non-maximum suppression screens out the best anchor box as the object's final prediction box. Three-scale prediction can predict more objects than two-scale prediction, and the fused feature layers of different scales suit objects of different sizes.
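The anchor-count arithmetic and the non-maximum-suppression step described above can be sketched as follows; this is a generic greedy NMS with assumed confidence and IoU thresholds, not the patent's exact screening procedure.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, conf_thresh=0.5, iou_thresh=0.45):
    """Greedy NMS: keep the best-scoring box, suppress overlaps, repeat."""
    order = [i for i in np.argsort(scores)[::-1] if scores[i] > conf_thresh]
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

# 13 x 13 grid with 3 anchors per cell -> 507 candidate boxes per scale
assert 13 * 13 * 3 == 507

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)  # the two overlapping boxes collapse to one
```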
The above-mentioned embodiments are merely illustrative of the preferred embodiments of the present invention, and do not limit the scope of the present invention, and various modifications and improvements of the technical solutions of the present invention by those skilled in the art should fall within the protection scope of the present invention without departing from the design spirit of the present invention.
Claims (10)
1. A real-time filling-station helmet-wearing detection method based on YOLOv4-Tiny, characterized by comprising the following steps:
s1, performing frame extraction on the original surveillance video of the gas station to obtain a number of monitoring pictures, preprocessing them into training pictures, selecting a given proportion of the training pictures to generate adversarial samples, and taking the remaining training pictures as original samples;
s2, generating a first training set based on the adversarial samples and the original samples;
s3, training the improved YOLOv4-Tiny model with the first training set to obtain the corresponding trained weights;
s4, accessing the gas station's real-time surveillance, decomposing it into frames, inputting each frame in real time into the trained improved YOLOv4-Tiny model to obtain the helmet-wearing classification results for the persons in the frame, and overlaying the classification results onto the frame to obtain a classified picture;
s5, synthesizing the classified pictures into video in real time and outputting the video in real time;
wherein the improved YOLOv4-Tiny model comprises a feature-extraction backbone module, a multi-scale feature-fusion module and a classification-prediction module connected in sequence, the backbone being a CSPdarknet53_tiny network;
the improved YOLOv4-Tiny model further comprising an attention-mechanism module inserted into the residual network of the Resblock_body modules of the CSPdarknet53_tiny network.
2. The method as claimed in claim 1, wherein the attention mechanism module is a SENet network, a channel-attention network in which the input feature map first undergoes global average pooling, then passes through two fully connected layers, and finally a Sigmoid activation function outputs a per-channel weight, which is multiplied by the input feature map to obtain the output.
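A minimal NumPy sketch of the channel-attention computation described in claim 2; the channel count, reduction ratio and random weights below are illustrative assumptions, not values from the patent:

```python
import numpy as np

def se_block(x, w1, w2):
    """SE-style channel attention. x: feature map (C, H, W);
    w1: (C, C//r) first FC layer; w2: (C//r, C) second FC layer."""
    s = x.mean(axis=(1, 2))               # squeeze: global average pooling -> (C,)
    z = np.maximum(s @ w1, 0.0)           # first fully connected layer + ReLU
    w = 1.0 / (1.0 + np.exp(-(z @ w2)))   # second fully connected layer + Sigmoid -> per-channel weight
    return x * w[:, None, None]           # excitation: rescale each input channel

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 13, 13))            # hypothetical 8-channel feature map
out = se_block(x, rng.standard_normal((8, 2)),  # reduction ratio r = 4 assumed
               rng.standard_normal((2, 8)))
print(out.shape)  # (8, 13, 13)
```

Because the Sigmoid output lies in (0, 1), each channel of the input is attenuated in proportion to its learned importance while the spatial layout is untouched.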
3. The method for detecting the wearing of the safety helmet of the gasoline station based on YOLOv4-Tiny as claimed in claim 1, wherein the step S1 comprises the steps of:
S1.1, selecting the monitoring video of the gas station over a period of time and performing frame extraction to obtain monitoring pictures containing persons wearing safety helmets and persons not wearing safety helmets;
S1.2, marking the target positions in the monitoring pictures and forming labels to obtain the training pictures;
S1.3, selecting a corresponding proportion of the training pictures and generating adversarial samples by a targeted adversarial gradient attack method;
S1.4, shuffling the adversarial samples together with the remaining training pictures, and performing data enhancement operations to form the first training set.
4. The method for detecting the wearing condition of the gas station safety helmet based on YOLOv4-Tiny as claimed in claim 3, wherein the step S1.3 comprises the steps of:
S1.3.1, training to obtain a trained original YOLOv4-Tiny network;
S1.3.2, inputting the selected proportion of training pictures into the trained original YOLOv4-Tiny network and performing forward propagation to obtain the confidence loss;
S1.3.3, back-propagating the confidence loss, wherein the network parameters are frozen and remain unchanged during back propagation, and only the pixels of the picture are modified;
S1.3.4, after each iteration, inputting the modified picture into the trained original YOLOv4-Tiny network; if the target position can no longer be correctly detected, stopping the iteration, whereupon the picture has become an adversarial sample.
5. The method for detecting wearing of safety caps of gas stations based on YOLOv4-Tiny as claimed in claim 4, wherein in the step S1.3.3, the back propagation proceeds in the direction that increases the confidence loss.
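The attack loop of steps S1.3.2-S1.3.4 can be illustrated on a toy, fully differentiable stand-in "detector" (a single frozen logistic unit, not the patented network); the step size, iteration cap, confidence threshold and the signed-gradient update are all illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
w = rng.standard_normal(64) * 0.5   # frozen "detector" weights (never updated)
img = rng.standard_normal(64)       # picture pixels, the only values we modify

for _ in range(200):
    conf = sigmoid(w @ img)         # S1.3.2: forward pass -> detection confidence
    if conf < 0.5:                  # S1.3.4: target no longer detected -> stop iterating
        break
    grad = -(1.0 - conf) * w        # gradient of the confidence loss -log(conf) w.r.t. pixels
    img += 0.05 * np.sign(grad)     # S1.3.3: signed step in the loss-increasing direction

print(float(sigmoid(w @ img)) < 0.5)  # True: the picture is now an adversarial sample
```

The key property mirrored here is that the weights `w` stay frozen throughout while gradient ascent on the confidence loss edits only the "pixels", exactly as claims 4 and 5 require.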
6. The method for detecting the wearing condition of the safety helmet of the gasoline station based on YOLOv4-Tiny as claimed in claim 4, wherein the step S1.3.1 comprises the following steps:
a. taking all the training pictures as a second training set;
b. changing the training labels to person, 50% hat and 100% hat, which are respectively used for detecting that no safety helmet is worn, that the safety helmet is worn incorrectly, and that the safety helmet is worn correctly;
c. inputting the second training set into the original YOLOv4-Tiny model to calculate the loss, and performing back propagation to obtain the corresponding network weights, thereby obtaining the trained original YOLOv4-Tiny network.
7. The method for detecting the wearing condition of the gas station safety helmet based on YOLOv4-Tiny as claimed in claim 6, wherein in the step c, the loss is calculated by the formula:

$$
\begin{aligned}
Loss ={} & \lambda_{coord}\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[(x_{i}-\hat{x}_{i})^{2}+(y_{i}-\hat{y}_{i})^{2}\right] \\
& +\lambda_{coord}\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_{i}}-\sqrt{\hat{w}_{i}}\right)^{2}+\left(\sqrt{h_{i}}-\sqrt{\hat{h}_{i}}\right)^{2}\right] \\
& +\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left(C_{i}-\hat{C}_{i}\right)^{2}+\lambda_{noobj}\sum_{i=0}^{S^{2}}\sum_{j=0}^{B}\mathbb{1}_{ij}^{noobj}\left(C_{i}-\hat{C}_{i}\right)^{2} \\
& +\sum_{i=0}^{S^{2}}\mathbb{1}_{i}^{obj}\sum_{c\in classes}\left(p_{i}(c)-\hat{p}_{i}(c)\right)^{2}
\end{aligned}
$$

wherein \(\lambda_{coord}\) represents the weight of the coordinate loss; \(\lambda_{noobj}\) represents the weight of the prediction loss of grids not responsible for an object; \(S^{2}\) represents the number of grids into which the picture is divided; \(B\) represents the number of prediction boxes contained in each grid; \(\mathbb{1}_{ij}^{obj}\) indicates whether the \(j\)-th prediction box of the \(i\)-th grid is the responsible prediction box, taking the value 1 if so and 0 otherwise; \(\mathbb{1}_{ij}^{noobj}\) indicates whether the \(j\)-th prediction box of the \(i\)-th grid is not the responsible prediction box, taking the value 1 if so and 0 otherwise; \(x_{i}\), \(y_{i}\) respectively represent the coordinates of the truly labelled centre point of the target object for which the \(i\)-th grid is responsible; \(\hat{x}_{i}\), \(\hat{y}_{i}\) respectively represent the coordinates of the centre point of the prediction box of that target object; \(w_{i}\), \(h_{i}\) respectively represent the truly labelled length and width of that target object; \(\hat{w}_{i}\), \(\hat{h}_{i}\) respectively represent the length and width of its prediction box; \(C_{i}\) represents the true classification (confidence) result of the target object for which the \(i\)-th grid is responsible; \(\hat{C}_{i}\) represents the corresponding predicted classification (confidence) result; \(p_{i}(c)\) represents the true classification probability that the object for which the \(i\)-th grid is responsible belongs to the \(c\)-th class; \(\hat{p}_{i}(c)\) represents the corresponding predicted classification probability; and \(classes\) represents the set of class numbers.
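A NumPy sketch of the claim-7 loss (a YOLOv1-style sum-squared loss) on toy tensors; S, B, the class count and all tensor values below are illustrative, and as a simplification the classification term is applied per responsible box rather than per grid cell:

```python
import numpy as np

def yolo_loss(pred, true, obj_mask, lam_coord=5.0, lam_noobj=0.5):
    """pred, true: (S*S, B, 5 + C) laid out as [x, y, w, h, conf, p(c)...];
    obj_mask: (S*S, B), 1 for the responsible prediction box, else 0."""
    noobj_mask = 1.0 - obj_mask
    xy = ((pred[..., 0:2] - true[..., 0:2]) ** 2).sum(-1)                    # centre-point term
    wh = ((np.sqrt(pred[..., 2:4]) - np.sqrt(true[..., 2:4])) ** 2).sum(-1)  # sqrt width/height term
    conf = (pred[..., 4] - true[..., 4]) ** 2                                # confidence term
    cls = ((pred[..., 5:] - true[..., 5:]) ** 2).sum(-1)                     # class-probability term
    return (lam_coord * (obj_mask * (xy + wh)).sum()
            + (obj_mask * conf).sum()
            + lam_noobj * (noobj_mask * conf).sum()
            + (obj_mask * cls).sum())

rng = np.random.default_rng(0)
target = rng.uniform(0.1, 1.0, (4, 2, 8))   # S*S = 4 grids, B = 2 boxes, C = 3 classes
mask = np.zeros((4, 2))
mask[0, 0] = 1.0                            # one responsible prediction box
print(yolo_loss(target, target, mask))      # 0.0 when the prediction equals the target
```

The square roots on width and height damp the penalty for size errors on large boxes relative to small ones, which is why they appear in the second term of the formula.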
8. The method of claim 1, wherein the CSPdarknet53_Tiny network comprises a first DarknetConv2D_BN_Leaky module, a second DarknetConv2D_BN_Leaky module, a first Resblock_body module, a second Resblock_body module, a third Resblock_body module and a third DarknetConv2D_BN_Leaky module, which are sequentially connected.
9. The method for detecting wearing of a gas station helmet based on YOLOv4-Tiny according to claim 8, wherein the multi-scale feature fusion module comprises a first convolution layer, a second convolution layer, a first up-sampling layer, a first splicing layer, a third convolution layer, a second up-sampling layer and a second splicing layer which are sequentially connected from bottom to top;
the classification prediction module comprises a first Yolo Head classification network, a second Yolo Head classification network and a third Yolo Head classification network which are sequentially connected from bottom to top;
the first feature map output by the third DarknetConv2D_BN_Leaky module is, on the one hand, input to the first Yolo Head classification network through the first convolution layer and, on the other hand, passed through the second convolution layer and the first up-sampling layer and spliced in the first splicing layer with the second feature map output by the second Resblock_body module to obtain a first fused feature map; the first fused feature map is, on the one hand, input to the second Yolo Head classification network and, on the other hand, passed through the third convolution layer and the second up-sampling layer and spliced in the second splicing layer with the third feature map output by the first Resblock_body module to obtain a second fused feature map; and the second fused feature map is input to the third Yolo Head classification network.
10. The method for detecting wearing of safety caps of gas stations based on YOLOv4-Tiny as claimed in claim 8, wherein each DarknetConv2D_BN_Leaky module is composed of a Conv2D convolution, batch normalization (BN) and a Leaky ReLU activation function.
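A minimal NumPy sketch of one DarknetConv2D_BN_Leaky block as described in claim 10, using a 1x1 convolution for brevity; the random weights, BN epsilon and 0.1 negative slope are illustrative assumptions:

```python
import numpy as np

def darknet_conv_bn_leaky(x, w, eps=1e-5, slope=0.1):
    """x: (N, C_in, H, W); w: (C_out, C_in), acting as a 1x1 Conv2D kernel."""
    y = np.einsum('oc,nchw->nohw', w, x)          # Conv2D (1x1 case for brevity)
    mean = y.mean(axis=(0, 2, 3), keepdims=True)  # BN statistics over batch and space
    var = y.var(axis=(0, 2, 3), keepdims=True)
    y = (y - mean) / np.sqrt(var + eps)           # batch normalization per output channel
    return np.where(y > 0.0, y, slope * y)        # Leaky ReLU activation

rng = np.random.default_rng(0)
out = darknet_conv_bn_leaky(rng.standard_normal((2, 3, 8, 8)),
                            rng.standard_normal((16, 3)))
print(out.shape)  # (2, 16, 8, 8)
```

Unlike a plain ReLU, the Leaky ReLU keeps a small negative slope, so the normalized pre-activations below zero survive with reduced magnitude instead of being zeroed out.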
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111495511.1A CN114140750A (en) | 2021-12-09 | 2021-12-09 | Filling station safety helmet wearing real-time detection method based on YOLOv4-Tiny |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111495511.1A CN114140750A (en) | 2021-12-09 | 2021-12-09 | Filling station safety helmet wearing real-time detection method based on YOLOv4-Tiny |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114140750A true CN114140750A (en) | 2022-03-04 |
Family
ID=80385385
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111495511.1A Pending CN114140750A (en) | 2021-12-09 | 2021-12-09 | Filling station safety helmet wearing real-time detection method based on YOLOv4-Tiny |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114140750A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115409818A (en) * | 2022-09-05 | 2022-11-29 | 江苏济远医疗科技有限公司 | Enhanced training method applied to endoscope image target detection model |
CN115409818B (en) * | 2022-09-05 | 2023-10-27 | 江苏济远医疗科技有限公司 | Enhanced training method applied to endoscope image target detection model |
CN116229522A (en) * | 2023-05-10 | 2023-06-06 | 广东电网有限责任公司湛江供电局 | Substation operator safety protection equipment detection method and system |
CN116977919A (en) * | 2023-06-21 | 2023-10-31 | 北京卓视智通科技有限责任公司 | Method and system for identifying dressing specification, storage medium and electronic equipment |
CN116977919B (en) * | 2023-06-21 | 2024-01-26 | 北京卓视智通科技有限责任公司 | Method and system for identifying dressing specification, storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108564097B (en) | Multi-scale target detection method based on deep convolutional neural network | |
CN114140750A (en) | Filling station safety helmet wearing real-time detection method based on YOLOv4-Tiny | |
CN111723786B (en) | Method and device for detecting wearing of safety helmet based on single model prediction | |
CN112434672B (en) | Marine human body target detection method based on improved YOLOv3 | |
CN111079739B (en) | Multi-scale attention feature detection method | |
CN112733749A (en) | Real-time pedestrian detection method integrating attention mechanism | |
CN113011319A (en) | Multi-scale fire target identification method and system | |
CN110378222A (en) | A kind of vibration damper on power transmission line target detection and defect identification method and device | |
CN112149591B (en) | SSD-AEFF automatic bridge detection method and system for SAR image | |
CN113569667B (en) | Inland ship target identification method and system based on lightweight neural network model | |
CN113052834A (en) | Pipeline defect detection method based on convolution neural network multi-scale features | |
CN114565891A (en) | Smoke and fire monitoring method and system based on graph generation technology | |
Park et al. | Advanced wildfire detection using generative adversarial network-based augmented datasets and weakly supervised object localization | |
CN112861646A (en) | Cascade detection method for oil unloading worker safety helmet in complex environment small target recognition scene | |
CN115512387A (en) | Construction site safety helmet wearing detection method based on improved YOLOV5 model | |
CN116543346A (en) | Deep learning-based transmission line video mountain fire detection method | |
CN116385958A (en) | Edge intelligent detection method for power grid inspection and monitoring | |
Yandouzi et al. | Investigation of combining deep learning object recognition with drones for forest fire detection and monitoring | |
CN113487610B (en) | Herpes image recognition method and device, computer equipment and storage medium | |
CN116310850B (en) | Remote sensing image target detection method based on improved RetinaNet | |
CN113936299A (en) | Method for detecting dangerous area in construction site | |
CN117218545A (en) | LBP feature and improved Yolov 5-based radar image detection method | |
CN112364864A (en) | License plate recognition method and device, electronic equipment and storage medium | |
CN115131826B (en) | Article detection and identification method, and network model training method and device | |
CN116740516A (en) | Target detection method and system based on multi-scale fusion feature extraction |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||