CN116630668A - Method for identifying wearing abnormality of safety helmet in quick lightweight manner - Google Patents

Method for identifying wearing abnormality of safety helmet in quick lightweight manner

Info

Publication number
CN116630668A
Authority
CN
China
Prior art keywords
module
improved
data
output
safety helmet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310419032.4A
Other languages
Chinese (zh)
Inventor
张文钦
龙银格
刘颖
徐鹤
李鹏
王汝传
季一木
刘尚东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202310419032.4A priority Critical patent/CN116630668A/en
Publication of CN116630668A publication Critical patent/CN116630668A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V10/763Non-hierarchical techniques, e.g. based on statistics of modelling distributions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of target detection in computer vision and discloses a fast, lightweight method for identifying abnormal safety-helmet wearing. The method introduces a CA (Coordinate Attention) mechanism into the YOLOv5 model, combines it with ConvNeXt components, and improves the k-means algorithm that automatically computes anchor boxes. After training, the optimized YOLOv5_CC model performs anomaly-identification analysis on surveillance-video slices uploaded by edge devices and detects whether the persons in the video are wearing safety helmets. The method improves detection accuracy while remaining lightweight, speeds up inference, and achieves a better training effect.

Description

Method for identifying wearing abnormality of safety helmet in quick lightweight manner
Technical Field
The invention belongs to the technical field of target detection in computer vision, and particularly relates to a fast, lightweight method for identifying abnormal safety-helmet wearing.
Background
Wearing a safety helmet on construction sites, in substation rooms, and in other places with potential safety hazards is an important measure for protecting workers. Helmet-wearing detection is therefore closely tied to worker safety, and how to detect quickly and accurately whether workers are wearing helmets is a current research hotspot.
YOLO (You Only Look Once: Unified, Real-Time Object Detection) is a target detection system based on a single neural network, proposed by Joseph Redmon, Ali Farhadi, et al. in 2015. The concept of YOLO v0 stems from extending the basic CNN idea from classification to detection: the network output should include bounding-box coordinates so that classification can be performed across the image. YOLO v1 appeared as an extension of YOLO v0, which could only perform single-target, single-class, and small-target detection. However, YOLO v1's predicted boxes were not accurate enough and it suffered from significant missed detections, which led to YOLO v2. With the basic idea of YOLO established, YOLO v3's improvement was to add multi-scale detection, and YOLO v4 added further features on that basis, improving test accuracy. YOLO v5 is a reworking of YOLO v3: it adds positive samples through a neighbourhood-based positive-anchor matching strategy, provides models of different complexity through flexibly configurable parameters, improves overall performance through several built-in hyperparameter-optimization strategies, and improves small-object detection through Mosaic data enhancement.
In the prior art there are already many YOLO v5-based helmet-wearing abnormality identification methods. However, for video from cameras in real substation rooms and similar sites, detection accuracy is low because of external interference such as lighting and viewing angle, and public data for such sites are scarce, so models overfit easily and fail to achieve the desired monitoring effect in practice. Moreover, most methods use construction-site helmet data and neglect helmet recognition against other backgrounds. For example, patent application CN115512387A discloses a construction-site helmet-wearing detection method based on an improved YOLOv5 model, but it considers only data from the construction-site background and the reliability of detection, while ignoring the weight and speed of the model.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a fast, lightweight method for identifying abnormal safety-helmet wearing. An improved training model is built on the classical YOLOv5m model; it achieves good recognition in dark environments and on blurred images, improves detection accuracy while remaining lightweight, and speeds up inference.
The invention relates to a fast, lightweight method for identifying abnormal safety-helmet wearing, which comprises the following steps:
step 1: acquire a helmet-wearing picture dataset and divide it into a training set, a validation set, and a test set; process the training-set data to generate the corresponding label files;
step 2: build the improved YOLOv5_CC target detection model, input the training-set data into it for training, preliminarily evaluate the model's capability on the validation set, and adjust the model's hyperparameters in time;
step 3: test the improved YOLOv5_CC target detection model obtained in step 2 on the test set to measure its detection performance and confirm the model's feasibility;
step 4: combine the trained YOLOv5_CC target detection model with the corresponding application scenario, perform anomaly-identification analysis on the picture or video information uploaded by edge devices, and detect whether the persons in the video are wearing safety helmets.
Further, in step 1, the positive sample for target recognition is determined to be the helmet and the negative sample the person; a helmet picture dataset is acquired in the station rooms of the target site and processed with the data-annotation tool labelImg to generate dataset A in YOLO format; a large number of public helmet datasets B from other environments are collected from the web; data expansion is performed with the generative adversarial network CycleGAN, which transforms dataset B into a data distribution B' similar in style to dataset A; finally B' is added to dataset A to produce the final dataset. A label-conversion sketch is given below.
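Since labelImg saves annotations as Pascal-VOC XML by default, converting them into YOLO-format txt labels is a routine step. A minimal sketch, assuming the two-class mapping above; the file layout and helper names are illustrative, not from the patent:

```python
# Hypothetical VOC-XML -> YOLO-txt converter; class ids are assumptions.
import xml.etree.ElementTree as ET
from pathlib import Path

CLASSES = {"helmet": 0, "person": 1}   # positive / negative samples

def voc_to_yolo(xml_path: Path, out_dir: Path) -> None:
    root = ET.parse(xml_path).getroot()
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    lines = []
    for obj in root.iter("object"):
        cls = CLASSES.get(obj.findtext("name"))
        if cls is None:
            continue                    # skip classes outside the task
        box = obj.find("bndbox")
        x1, y1 = float(box.findtext("xmin")), float(box.findtext("ymin"))
        x2, y2 = float(box.findtext("xmax")), float(box.findtext("ymax"))
        # YOLO format: class cx cy w h, all normalised to [0, 1]
        lines.append(f"{cls} {(x1 + x2) / 2 / img_w:.6f} "
                     f"{(y1 + y2) / 2 / img_h:.6f} "
                     f"{(x2 - x1) / img_w:.6f} {(y2 - y1) / img_h:.6f}")
    (out_dir / (xml_path.stem + ".txt")).write_text("\n".join(lines))
```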
Further, the improved YOLOv5_CC target detection model includes: an input module, an improved Backbone network module, an improved Neck module, and an output prediction module;
the input module obtains optimal prediction anchor boxes with the optimized k-means algorithm and then outputs normalized image data to the improved Backbone network module;
the improved Backbone network module performs feature extraction on the image data processed by the input module, lets the Backbone obtain feature maps in both the horizontal and vertical directions through its added CA attention mechanism, and realizes adaptive-size output through SPPF;
the improved Neck module upsamples and fuses the multi-layer features output by the improved Backbone network module, then feeds the resulting data to the output prediction module;
the output prediction module applies the SIoU total loss function to the output of the Neck module to complete the abnormality recognition.
Further, the input module takes 640×640 images and performs image preprocessing, mainly Mosaic data enhancement, which extends the original two-image stitching to four images; the images are randomly scaled, randomly cropped, and randomly arranged to obtain richer data. The optimized k-means clustering algorithm is then applied to the training-set data to obtain the anchor-box sizes best suited to the dataset, and the normalized image data are output to the next module. A stitching sketch is given below.
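A minimal sketch of the four-image Mosaic stitching described above. Label remapping is omitted for brevity; the grey fill value 114 and the jitter range of the stitching centre are common YOLOv5 conventions assumed here, not fixed by the patent:

```python
# Hypothetical 4-image Mosaic; a real pipeline must also remap the boxes.
import random
import numpy as np

def mosaic4(images, out_size=640):
    """images: list of four HxWx3 uint8 arrays -> one out_size mosaic."""
    canvas = np.full((out_size, out_size, 3), 114, dtype=np.uint8)
    # random stitching centre, kept away from the borders
    cx = random.randint(out_size // 4, 3 * out_size // 4)
    cy = random.randint(out_size // 4, 3 * out_size // 4)
    regions = [(slice(0, cy), slice(0, cx)),                 # top-left
               (slice(0, cy), slice(cx, out_size)),          # top-right
               (slice(cy, out_size), slice(0, cx)),          # bottom-left
               (slice(cy, out_size), slice(cx, out_size))]   # bottom-right
    for img, (ys, xs) in zip(images, regions):
        th, tw = ys.stop - ys.start, xs.stop - xs.start      # target size
        ih, iw = img.shape[:2]
        # random crop of the source image to fill its quadrant
        y0 = random.randint(0, max(ih - th, 0))
        x0 = random.randint(0, max(iw - tw, 0))
        patch = img[y0:y0 + min(th, ih), x0:x0 + min(tw, iw)]
        canvas[ys.start:ys.start + patch.shape[0],
               xs.start:xs.start + patch.shape[1]] = patch
    return canvas
```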
Further, the optimized k-means algorithm clusters the widths and heights of all target boxes in the training set, producing 9 width-height combinations, i.e., 9 anchor boxes. The specific steps are as follows (a clustering sketch follows step d):
step a: randomly select k samples from all data samples of the training set in step 1 as centre points;
step b: calculate the distance between each sample and the k centre points;
step c: assign each sample to its nearest centre point; all samples assigned to the same centre point form one category, completing the initial clustering;
step d: after clustering, each training sample finds the activation node in the clustering output layer that best matches it, and the parameters of that node are updated by stochastic gradient descent; once several activation nodes have been obtained, they form a winning neighbourhood, all nodes within the winning neighbourhood are delineated and their weights adjusted, and clustering is performed again, yielding the anchor-box sizes best suited to the dataset.
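A minimal sketch of this two-stage clustering: plain k-means over box widths and heights, followed by an SOM-style pass that pulls the winning node and its neighbourhood towards each sample. The neighbourhood radius and learning-rate schedule are not fixed by the patent and are assumptions chosen for illustration:

```python
# Hypothetical anchor clustering; hyperparameters below are assumptions.
import numpy as np

def kmeans_anchors(wh, k=9, iters=30):
    """wh: (N, 2) array of box widths/heights from the training set."""
    centres = wh[np.random.choice(len(wh), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each box to its nearest centre (Euclidean in (w, h) space)
        d = np.linalg.norm(wh[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centres[j] = wh[labels == j].mean(axis=0)
    return centres

def som_refine(wh, centres, epochs=5, lr0=0.5, radius0=2.0):
    """SOM-style refinement: update the winner and its neighbourhood."""
    order = np.argsort(centres[:, 0] * centres[:, 1])  # 1-D grid by area
    rank = np.empty(len(order), dtype=int)
    rank[order] = np.arange(len(order))
    for e in range(epochs):
        lr = lr0 * (1 - e / epochs)                    # decaying rate
        radius = radius0 * (1 - e / epochs) + 1e-6     # shrinking field
        for x in np.random.permutation(wh):
            win = np.linalg.norm(centres - x, axis=1).argmin()
            # Gaussian influence over the winning neighbourhood
            infl = np.exp(-((rank - rank[win]) ** 2) / (2 * radius ** 2))
            centres += lr * infl[:, None] * (x - centres)
    return centres[np.argsort(centres.prod(axis=1))].round().astype(int)

# anchors = som_refine(wh, kmeans_anchors(wh))  # 9 anchor (w, h) pairs
```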
Further, the improved Backbone network module comprises, connected in sequence, one CBS layer, four alternations of CBS and C3 structures, a CA attention mechanism, and an SPPF structure;
the CBS structure consists of a convolution layer, a BN layer, and an activation layer, and performs feature extraction, normalization, and nonlinear processing on the image data processed by the input module;
the C3 module learns residual features; it is divided into two branches, one of which uses three standard convolution layers and a stack of Bottleneck modules while the other passes through only one basic convolution module, and finally the two branches are concatenated;
the CA attention mechanism captures long-range dependencies along one spatial direction while preserving precise position information along the other; it obtains feature maps for the horizontal and vertical directions, concatenates them, reduces the dimension with convolution, BatchNorm, and an activation function, and finally, after another activation function, fuses the result with the original features for output (a sketch is given below);
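A minimal PyTorch sketch of this CA block, following the published Coordinate Attention design (Hou et al., 2021); the reduction ratio and the Hardswish activation are common defaults assumed here:

```python
# Hypothetical Coordinate Attention block; reduction ratio is an assumption.
import torch
import torch.nn as nn

class CoordAtt(nn.Module):
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        # direction-aware pooling: one descriptor per row and per column
        x_h = x.mean(dim=3, keepdim=True)                      # N,C,H,1
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # N,C,W,1
        # joint reduction, then split back into the two directions
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = y.split([h, w], dim=2)
        a_h = self.conv_h(y_h).sigmoid()                       # N,C,H,1
        a_w = self.conv_w(y_w.permute(0, 1, 3, 2)).sigmoid()   # N,C,1,W
        return x * a_h * a_w  # fuse attention maps with the input features
```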
the SPPF module is a spatial pyramid pooling module; the input passes serially through several 5×5 MaxPool layers to realize adaptive-size output (a sketch is given below).
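A matching sketch of SPPF, in which three serial 5×5 max-pools reproduce the receptive fields of the parallel 5/9/13 pools of classic SPP at lower cost; the channel halving is the usual YOLOv5 convention, assumed here:

```python
# Hypothetical SPPF module; channel widths follow common YOLOv5 practice.
import torch
import torch.nn as nn

class SPPF(nn.Module):
    def __init__(self, c_in, c_out, k=5):
        super().__init__()
        c_mid = c_in // 2
        self.cv1 = nn.Conv2d(c_in, c_mid, kernel_size=1)
        self.cv2 = nn.Conv2d(c_mid * 4, c_out, kernel_size=1)
        self.pool = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        y1 = self.pool(x)    # 5x5 receptive field
        y2 = self.pool(y1)   # ~9x9
        y3 = self.pool(y2)   # ~13x13
        return self.cv2(torch.cat([x, y1, y2, y3], dim=1))
```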
Further, the improved Neck module consists of a CBS module and an improved CNeB module, an output result of the improved back Backbone network module is input to the Neck module, the output result is subjected to feature fusion with CBS on one hand to obtain a feature fusion result B1, and on the other hand, the output result is subjected to concat fusion with the output feature of the third C3 layer in the back module through the CBS module and up-sampling to obtain an intermediate feature fusion result A; then, the output result of the middle feature fusion result A after the improved CNeB and CBS module is subjected to feature fusion with the CBS module to obtain a feature fusion result B2, and on the other hand, the up-sampling operation is performed again to double the length and the width of the feature map, and the up-sampling result is subjected to concat fusion with the output feature of the improved Backbone network module after the second time passing through the C3 layer to obtain a feature fusion result B3; b1, B2 and B3 are combined with CBS to be sent to an output prediction layer through an improved CNeB module;
the improved CNeB structure first processes the upper-layer input with a grouped convolution using a large kernel, normalizes it with LayerNorm, passes it once through a GELU activation layer, and applies a drop_path regularization operation after scaling; this improves the network structure as a whole, simplifies the model, increases speed, and improves generalization and accuracy (a sketch of the block is given below).
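A minimal sketch of such a ConvNeXt-style CNeB block; kernel size 7, the 4× pointwise expansion, and the layer-scale initial value follow ConvNeXt defaults and are assumptions here:

```python
# Hypothetical CNeB block; expansion and kernel size are assumptions.
import torch
import torch.nn as nn

class CNeB(nn.Module):
    def __init__(self, dim, drop_path=0.0, layer_scale=1e-6):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, 7, padding=3, groups=dim)  # grouped
        self.norm = nn.LayerNorm(dim)
        self.pwconv1 = nn.Linear(dim, 4 * dim)   # pointwise expansion
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(4 * dim, dim)
        self.gamma = nn.Parameter(layer_scale * torch.ones(dim))  # scaling
        self.drop_prob = drop_path

    def forward(self, x):
        shortcut = x
        x = self.dwconv(x).permute(0, 2, 3, 1)   # N,C,H,W -> N,H,W,C
        x = self.pwconv2(self.act(self.pwconv1(self.norm(x))))
        x = (self.gamma * x).permute(0, 3, 1, 2)
        if self.training and self.drop_prob > 0:  # drop_path regularization
            keep = 1.0 - self.drop_prob
            mask = x.new_empty(x.shape[0], 1, 1, 1).bernoulli_(keep) / keep
            x = x * mask
        return shortcut + x                       # residual connection
```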
Further, the SIoU loss function consists of four cost functions: Angle cost, Distance cost, Shape cost, and IoU cost. The total loss function is

L = W_box · L_box + W_cls · L_cls

where W_box and W_cls are the box and classification loss weights, L_cls is the Focal loss, and the regression loss is

L_box = 1 − IoU + (Δ + Ω) / 2

where Δ is the Distance cost and Ω is the Shape cost. A sketch of this regression loss is given below.
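A minimal sketch of the SIoU regression loss under these definitions; boxes are in (x1, y1, x2, y2) form, and θ = 4 follows the original SIoU paper as an assumption:

```python
# Hypothetical SIoU box loss; theta and eps values are assumptions.
import torch

def siou_loss(pred, target, theta=4.0, eps=1e-7):
    """pred, target: (..., 4) boxes as (x1, y1, x2, y2)."""
    # IoU cost
    lt = torch.max(pred[..., :2], target[..., :2])
    rb = torch.min(pred[..., 2:], target[..., 2:])
    inter = (rb - lt).clamp(min=0).prod(dim=-1)
    area_p = (pred[..., 2:] - pred[..., :2]).clamp(min=0).prod(dim=-1)
    area_t = (target[..., 2:] - target[..., :2]).clamp(min=0).prod(dim=-1)
    iou = inter / (area_p + area_t - inter + eps)

    # centres and enclosing-box width/height
    cp = (pred[..., :2] + pred[..., 2:]) / 2
    ct = (target[..., :2] + target[..., 2:]) / 2
    cw = torch.max(pred[..., 2], target[..., 2]) - torch.min(pred[..., 0], target[..., 0])
    ch = torch.max(pred[..., 3], target[..., 3]) - torch.min(pred[..., 1], target[..., 1])

    # Angle cost: maximal at 45 degrees, zero at 0 or 90 degrees
    sigma = torch.linalg.norm(ct - cp, dim=-1) + eps
    sin_a = ((ct[..., 1] - cp[..., 1]).abs() / sigma).clamp(max=1 - eps)
    angle = torch.sin(2 * torch.arcsin(sin_a))

    # Distance cost, sharpened by the angle term (gamma = 2 - angle)
    gamma = 2 - angle
    rho_x = ((ct[..., 0] - cp[..., 0]) / (cw + eps)) ** 2
    rho_y = ((ct[..., 1] - cp[..., 1]) / (ch + eps)) ** 2
    dist = (1 - torch.exp(-gamma * rho_x)) + (1 - torch.exp(-gamma * rho_y))

    # Shape cost: penalizes width/height mismatch
    wp, hp = pred[..., 2] - pred[..., 0], pred[..., 3] - pred[..., 1]
    wt, ht = target[..., 2] - target[..., 0], target[..., 3] - target[..., 1]
    ww = (wp - wt).abs() / (torch.max(wp, wt) + eps)
    wh = (hp - ht).abs() / (torch.max(hp, ht) + eps)
    shape = (1 - torch.exp(-ww)) ** theta + (1 - torch.exp(-wh)) ** theta

    return 1 - iou + (dist + shape) / 2
```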
The beneficial effects of the invention are as follows:
1) By improving the k-means algorithm that automatically computes anchor boxes in the classical YOLOv5 model, delineating all nodes in the winning neighbourhood, adjusting node weights, and then re-clustering, the method overcomes two shortcomings of the original algorithm, namely susceptibility to noise and convergence to local rather than global optima, and obtains more accurate prediction anchors; since anchor computation is a prior step, the extra computational cost is entirely negligible compared with training on a million-scale dataset;
2) The method introduces a CA attention mechanism into the backbone network, which considers not only channel information but also direction-aware position information, and detects objects more accurately in experiments;
3) The method introduces the CNeB module from ConvNeXt into the model and combines it with the CBS module in the Neck, improving the network structure as a whole, simplifying the model, increasing speed, and improving generalization and accuracy;
4) At the output prediction end, the method replaces the CIoU loss of the original YOLOv5 model with the SIoU loss. The traditional target detection loss CIoU relies on aggregating bounding-box regression metrics such as the distance, overlap area, and aspect ratio of the predicted and ground-truth boxes; SIoU additionally considers the vector angle of the required regression and redefines the penalty terms, attending not only to the overlapping region but also to the non-overlapping regions. It therefore reflects the overlap between predicted and ground-truth boxes better, substantially improves training and inference of the target detection algorithm, converges faster during training, and effectively improves both training speed and model accuracy.
Drawings
FIG. 1 is a diagram of a model structure used in the present invention;
FIG. 2 is a flow chart of the method of the present invention;
FIG. 3 is a schematic diagram of the structure of CNeB;
FIG. 4 is a schematic diagram of the CA attention mechanism embedded in the model;
FIG. 5 is a schematic diagram of the composition structure of CBS in the model;
fig. 6 is a schematic diagram of the composition structure of C3 in the model.
Detailed Description
In order that the invention may be more readily understood, a more particular description of the invention will be rendered by reference to specific embodiments that are illustrated in the appended drawings.
The invention relates to a fast, lightweight method for identifying abnormal safety-helmet wearing which, as shown in fig. 2, comprises the following steps:
step 1: acquire a helmet-wearing picture dataset and divide it into a training set, a validation set, and a test set; process the training-set data to generate the corresponding label files;
step 2: build the improved YOLOv5_CC target detection model, input the training-set data into it for training, preliminarily evaluate the model's capability on the validation set, and adjust the model's hyperparameters in time;
step 3: test the improved YOLOv5_CC target detection model obtained in step 2 on the test set to measure its detection performance and confirm the model's feasibility;
step 4: combine the trained YOLOv5_CC target detection model with the corresponding application scenario, perform anomaly-identification analysis on the picture or video information uploaded by edge devices, and detect whether the persons in the video are wearing safety helmets.
Step 1: acquire the helmet dataset and process it to generate the label files. To address the class imbalance of the dataset, a mixed-scene data-enhancement method is designed, which balances the classes and improves the model's accuracy in complex construction environments:
in this embodiment, step 1 specifically includes:
the positive sample for target recognition is determined to be the helmet and the negative sample the person; a helmet picture set is acquired in the station rooms of the target site and processed with the data-annotation tool labelImg to generate dataset A in YOLO format; a large amount of public helmet data B from other environments is collected from the web; data expansion is performed with the generative adversarial network CycleGAN, which transforms B into a data distribution B' similar in style to A; finally B' is added to A to generate the final dataset.
In step 2, the improved YOLOv5_CC target detection model, shown in fig. 1, includes an input module, an improved Backbone network module, an improved Neck module, and an output prediction module;
the input module inputs 640 x 640 images, mainly adopts Moasaic data enhancement, improves the original two images to four images for splicing, randomly zooms, randomly cuts and randomly arranges the images, enhances the diversity of the data, clusters a training set through an optimized kmeans algorithm to obtain the predicted anchor frame size which is most suitable for the data set, and inputs the processed data into a Backbone module after a series of data processing.
The optimized k-means algorithm clusters the widths and heights of all target boxes in the training set, producing 9 width-height combinations, i.e., 9 anchor boxes, as sketched above. The specific steps are as follows:
step a: randomly select k samples from all data samples of the training set in step 1 as centre points;
step b: calculate the distance between each sample and the k centre points;
step c: assign each sample to its nearest centre point; all samples assigned to the same centre point form one category, completing the initial clustering;
step d: after clustering, each training sample finds the activation node in the clustering output layer that best matches it, and the parameters of that node are updated by stochastic gradient descent; once several activation nodes have been obtained, they form a winning neighbourhood, all nodes within the winning neighbourhood are delineated and their weights adjusted, and clustering is performed again.
By step c the original k-means algorithm has completed clustering, but the actual training effect then differs too much from expectations; the optimized k-means algorithm adopted in this embodiment uses this second layer of clustering to remedy the original algorithm's two shortcomings of being easily affected by noise and of iterating to a local rather than a global optimum.
The improved Backbone network module comprises, connected in sequence, one CBS layer, four alternations of CBS and C3 structures, a CA attention mechanism, and an SPPF structure;
as shown in fig. 5, the CBS structure consists of a convolution layer, a BN layer, and an activation layer, and performs feature extraction, normalization, and nonlinear processing on the image data processed by the input module; the C3 module learns residual features and, as shown in fig. 6, is divided into two branches, one of which uses three standard convolution layers and a stack of Bottleneck modules while the other passes through only one basic convolution module, and finally the two branches are concatenated;
as shown in fig. 4, the CA attention mechanism captures long-range dependencies along one spatial direction while preserving precise position information along the other; it obtains feature maps for the horizontal and vertical directions, concatenates them, reduces the dimension with convolution, BatchNorm, and an activation function, and finally, after another activation function, fuses the result with the original features for output, so that objects are detected more accurately; the SPPF module is a spatial pyramid pooling module in which the input passes serially through several 5×5 MaxPool layers to realize adaptive-size output.
The improved Neck module consists of CBS modules and improved CNeB modules. The output of the improved Backbone network module is fed to the Neck, where, on one branch, it is feature-fused with a CBS module to give fusion result B1 and, on the other branch, it passes through a CBS module and upsampling and is concat-fused with the output of the third C3 layer in the Backbone to give the intermediate fusion result A; the result of passing A through the improved CNeB and CBS modules is then, on one branch, feature-fused with a CBS module to give fusion result B2 and, on the other branch, upsampled again to double the length and width of the feature map and concat-fused with the Backbone output taken after the second C3 layer to give fusion result B3; B1, B2, and B3 are each sent, through an improved CNeB module combined with CBS, to the output prediction layer. The improved CNeB structure, shown in fig. 3 and sketched above, first processes the upper-layer input with a grouped convolution using a large kernel, normalizes it with LayerNorm, passes it once through a GELU activation layer, and applies a drop_path regularization operation after scaling, improving the network structure as a whole, simplifying the model, increasing speed, and improving generalization and accuracy.
The main body of the output prediction module is three detection heads, which apply the SIoU total loss function and perform target detection on feature maps of different scales using grid-based anchors; a decode sketch is given below.
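A minimal sketch of how one detection head decodes its feature map into image-space boxes through the cell position, the stride, and the anchors assigned to that scale; this is the standard YOLOv5 grid/anchor transform, assumed here for illustration:

```python
# Hypothetical per-head decode; names and shapes are illustrative.
import torch

def decode_head(raw, anchors, stride):
    """raw: (N, A, H, W, 5 + n_cls) logits; anchors: (A, 2) float pixels."""
    n, a, h, w, _ = raw.shape
    gy, gx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((gx, gy), dim=-1).view(1, 1, h, w, 2).float()
    p = raw.sigmoid()
    xy = (p[..., 0:2] * 2 - 0.5 + grid) * stride               # box centre
    wh = (p[..., 2:4] * 2) ** 2 * anchors.view(1, a, 1, 1, 2)  # box size
    return torch.cat((xy, wh, p[..., 4:]), dim=-1)  # boxes + obj/cls scores
```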
specific training result data of the YOLOv5 original model and the different models after improvement are shown in the following table,
from the data in the table, the improved model has six layers less than the YOLOv5m model in layer number, is lighter than the original model, but has improved accuracy.
The fast, lightweight helmet-wearing abnormality identification method achieves high-precision real-time detection while remaining fast and lightweight. The algorithm of this embodiment can be stored in encoded form on a computer-readable storage medium and implemented as a computer program: the basic parameter information required for the computation is input through computer hardware, and the computed result is output.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows.
The present invention is described with reference to flowchart illustrations of methods, apparatus, and computer program products according to embodiments of the invention. It will be understood that each flow of the flowchart, and combinations of flows in the flowchart, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows.
The present invention is not limited to the preferred embodiments above; under the teaching of this patent, anyone may derive other high-precision abnormality detection methods based on helmet recognition, and all equivalent changes and modifications made according to the claims of this application fall within the scope of this patent.

Claims (8)

1. A fast, lightweight method for identifying abnormal safety-helmet wearing, characterized by comprising the following steps:
step 1: acquire a helmet-wearing picture dataset and divide it into a training set, a validation set, and a test set; process the training-set data to generate the corresponding label files;
step 2: build the improved YOLOv5_CC target detection model, input the training-set data into it for training, preliminarily evaluate the model's capability on the validation set, and adjust the model's hyperparameters in time;
step 3: test the improved YOLOv5_CC target detection model obtained in step 2 on the test set to measure its detection performance and confirm the model's feasibility;
step 4: combine the trained YOLOv5_CC target detection model with the corresponding application scenario, perform anomaly-identification analysis on the picture or video information uploaded by edge devices, and detect whether the persons in the video are wearing safety helmets.
2. The method for identifying abnormal safety-helmet wearing according to claim 1, wherein in step 1 the positive sample for target recognition is determined to be the helmet and the negative sample the person; a helmet picture dataset is acquired in the station rooms of the target site and processed with the data-annotation tool labelImg to generate dataset A in YOLO format; a large number of public helmet datasets B from other environments are collected from the web; data expansion is performed with the generative adversarial network CycleGAN, which transforms dataset B into a data distribution B' similar in style to dataset A; finally B' is added to dataset A to produce the final dataset.
3. The method for identifying abnormal safety-helmet wearing according to claim 1, wherein the improved YOLOv5_CC target detection model comprises: an input module, an improved Backbone network module, an improved Neck module, and an output prediction module;
the input module obtains optimal prediction anchor boxes with the optimized k-means algorithm and then outputs normalized image data to the improved Backbone network module;
the improved Backbone network module performs feature extraction on the image data processed by the input module, lets the Backbone obtain feature maps in both the horizontal and vertical directions through its added CA attention mechanism, and realizes adaptive-size output through SPPF;
the improved Neck module upsamples and fuses the multi-layer features output by the improved Backbone network module, then feeds the resulting data to the output prediction module;
the output prediction module applies the SIoU loss function to the output of the Neck module to complete the abnormality recognition.
4. The method for identifying abnormal safety-helmet wearing according to claim 3, wherein the input module takes 640×640 images and performs image preprocessing using Mosaic data enhancement, which extends the original two-image stitching to four images; the images are randomly scaled, randomly cropped, and randomly arranged to obtain richer data; the optimized k-means clustering algorithm is then applied to the training-set data to obtain the anchor-box sizes best suited to the dataset, and the normalized image data are output to the next module.
5. The method for identifying abnormal safety-helmet wearing according to claim 3, wherein the optimized k-means algorithm clusters the widths and heights of all target boxes in the training set, producing 9 width-height combinations, i.e., 9 anchor boxes, with the following specific steps:
step a: randomly select k samples from all data samples of the training set in step 1 as centre points;
step b: calculate the distance between each sample and the k centre points;
step c: assign each sample to its nearest centre point; all samples assigned to the same centre point form one category, completing the initial clustering;
step d: after clustering, each training sample finds the activation node in the clustering output layer that best matches it, and the parameters of that node are updated by stochastic gradient descent; once several activation nodes have been obtained, they form a winning neighbourhood, all nodes within the winning neighbourhood are delineated and their weights adjusted, and clustering is performed again, yielding the anchor-box sizes best suited to the dataset.
6. The method for identifying abnormal safety-helmet wearing according to claim 3, wherein the improved Backbone network module comprises, connected in sequence, one CBS layer, four alternations of CBS and C3 structures, a CA attention mechanism, and an SPPF structure;
the CBS structure consists of a convolution layer, a BN layer, and an activation layer, and performs feature extraction, normalization, and nonlinear processing on the image data processed by the input module;
the C3 module learns residual features; it is divided into two branches, one of which uses three standard convolution layers and a stack of Bottleneck modules while the other passes through only one basic convolution module, and finally the two branches are concatenated;
the CA attention mechanism captures long-range dependencies along one spatial direction while preserving precise position information along the other; it obtains feature maps for the horizontal and vertical directions, concatenates them, reduces the dimension with convolution, BatchNorm, and an activation function, and finally, after another activation function, fuses the result with the original features for output;
the SPPF module is a spatial pyramid pooling module in which the input passes serially through several 5×5 MaxPool layers to realize adaptive-size output.
7. The method for identifying abnormal safety-helmet wearing according to claim 3, wherein the improved Neck module consists of CBS modules and improved CNeB modules; the output of the improved Backbone network module is fed to the Neck, where, on one branch, it is feature-fused with a CBS module to give fusion result B1 and, on the other branch, it passes through a CBS module and upsampling and is concat-fused with the output of the third C3 layer in the Backbone to give the intermediate fusion result A; the result of passing A through the improved CNeB and CBS modules is then, on one branch, feature-fused with a CBS module to give fusion result B2 and, on the other branch, upsampled again to double the length and width of the feature map and concat-fused with the Backbone output taken after the second C3 layer to give fusion result B3; B1, B2, and B3 are each sent, through an improved CNeB module combined with CBS, to the output prediction layer;
the improved CNeB structure first processes the upper-layer input with a grouped convolution using a large kernel, normalizes it with LayerNorm, passes it once through a GELU activation layer, and applies a drop_path regularization operation after scaling.
8. The method for identifying abnormal safety-helmet wearing according to claim 3, wherein the SIoU loss function consists of four cost functions: Angle cost, Distance cost, Shape cost, and IoU cost, and the total loss function is

L = W_box · L_box + W_cls · L_cls

where W_box and W_cls are the box and classification loss weights, L_cls is the Focal loss, and the regression loss is L_box = 1 − IoU + (Δ + Ω) / 2, where Δ is the Distance cost and Ω is the Shape cost.
CN202310419032.4A 2023-04-19 2023-04-19 Method for identifying wearing abnormality of safety helmet in quick lightweight manner Pending CN116630668A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310419032.4A CN116630668A (en) 2023-04-19 2023-04-19 Method for identifying wearing abnormality of safety helmet in quick lightweight manner

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310419032.4A CN116630668A (en) 2023-04-19 2023-04-19 Method for identifying wearing abnormality of safety helmet in quick lightweight manner

Publications (1)

Publication Number Publication Date
CN116630668A true CN116630668A (en) 2023-08-22

Family

ID=87620211

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310419032.4A Pending CN116630668A (en) 2023-04-19 2023-04-19 Method for identifying wearing abnormality of safety helmet in quick lightweight manner

Country Status (1)

Country Link
CN (1) CN116630668A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116958883A (en) * 2023-09-15 2023-10-27 四川泓宝润业工程技术有限公司 Safety helmet detection method, system, storage medium and electronic equipment
CN116958883B (en) * 2023-09-15 2023-12-29 四川泓宝润业工程技术有限公司 Safety helmet detection method, system, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination