CN115731577A - Wearing detection method for safety appliance of electric power constructor with improved YOLOv5

Info

Publication number
CN115731577A
Authority
CN
China
Prior art keywords: wearing, safety, yolov5, detection, improved
Prior art date
Legal status
Pending
Application number
CN202211505628.8A
Other languages
Chinese (zh)
Inventor
吴浩
石柱
金钟杨
Current Assignee
Sichuan University of Science and Engineering
Original Assignee
Sichuan University of Science and Engineering
Priority date
Filing date
Publication date
Application filed by Sichuan University of Science and Engineering filed Critical Sichuan University of Science and Engineering
Priority to CN202211505628.8A
Publication of CN115731577A
Pending legal status

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a method, based on an improved YOLOv5, for detecting the wearing of safety equipment by electric power construction workers, which comprises the following steps: annotating wearing images from the electric power operation site to obtain a data set; constructing a safety appliance wearing detection model based on the improved YOLOv5; training the safety appliance wearing detection model with the data set; and inputting an electric power operation site wearing image into the trained safety appliance wearing detection model to obtain a wearing detection result. The invention can effectively improve the detection precision of small targets and occluded targets, and meets the real-time and accuracy requirements of wearing detection at the electric power operation site.

Description

Wearing detection method for safety appliance of electric power constructor with improved YOLOv5
Technical Field
The invention belongs to the technical field of wearing detection, and particularly relates to a method for detecting the wearing of safety equipment by electric power construction workers based on an improved YOLOv5.
Background
In recent years, casualty accidents caused by workers failing to wear safety equipment as required have occurred frequently on construction sites. With a new generation of power technology revolution in full swing, and in the face of complex and changeable electric power construction sites, improving the intelligent safety management of substations, and in particular discovering violations by operating personnel in time, is of great significance for reducing the probability of safety accidents and safeguarding lives and property.
To raise the level of engineering safety supervision, intelligent supervision is gradually replacing traditional manual inspection, and many universities and research institutions in China have studied this problem in depth, proposing real-time monitoring techniques for the wearing of safety protective articles on construction sites. In recent years, target detection based on deep learning has become the mainstream research direction, and current target detection algorithms fall into two major categories. The first is two-stage detection algorithms such as R-CNN, Fast R-CNN, and Faster R-CNN; for example, one group of researchers detected safety helmets by improving a Faster R-CNN network fused with multiple components, aiming at problems such as partial occlusion and varying helmet sizes. However, Faster R-CNN is slow and cannot meet the requirement of real-time detection. The second is single-stage target detection algorithms, represented by SSD, YOLOv2, RetinaNet, YOLOv3, and YOLOv4. To safeguard construction safety, Liu Yunbao et al. judged the color and wearing condition of constructors' safety helmets by detecting the distribution of pixels in video and warned workers not wearing helmets, but that network is strongly affected by illumination and the video shooting angle. Redmon et al. proposed the faster YOLO model in 2016, which, while not the most accurate, is by comparison better suited to real-time detection.
Xiao Zhuang et al. proposed a wearing detection algorithm, YOLOv3-WH, by improving the YOLOv3 network; compared with YOLOv3 it raises FPS by 64% and mAP by 6.5%, improving both detection speed and accuracy, but when targets appear against complex backgrounds the algorithm struggles to extract features effectively. Moreover, most existing algorithms detect only safety helmets, whereas an electric power construction site also involves safety belts, gloves, safety clothing, and other protective gear, which are equally important for the safety of workers.
Disclosure of Invention
To solve the above technical problems, the invention provides a method for detecting the wearing of safety equipment by electric power construction workers based on an improved YOLOv5, which can effectively improve the detection precision of small targets and occluded targets and meet the real-time and accuracy requirements of wearing detection at the electric power operation site.
To achieve the above object, the present invention provides a method for detecting the wearing of safety equipment by electric power construction workers based on an improved YOLOv5, comprising:
annotating wearing images from the electric power operation site to obtain a data set;
constructing a safety appliance wearing detection model based on the improved YOLOv5;
training the safety appliance wearing detection model by using the data set;
and inputting the power operation site wearing image into the trained safety appliance wearing detection model to obtain a wearing detection result.
Optionally, annotating the electric power operation site wearing images comprises: marking out target regions and categories in the images.
Optionally, the categories of the electric power operation site wearing images include: gloves worn correctly, gloves not worn, safety belts worn correctly, safety belts not worn, work clothes worn correctly, work clothes not worn, safety helmets worn correctly, and safety helmets not worn.
Optionally, improving YOLOv5 comprises: improving the SPP sub-module in the backbone network module of YOLOv5, and improving the Neck module.
Optionally, improving the SPP sub-module comprises:
adding a pooling kernel of size 3 × 3 to the pooling layer of the SPP sub-module to form a plurality of pooling kernels of different sizes;
adding a 1 × 1 convolution after the Concat layer connected to the pooling layer to integrate the number of channels;
replacing the max-pooling operations of the pooling layer with dilated convolutions;
and embedding an MHSA self-attention mechanism at the tail of the SPP sub-module.
Optionally, the dilated convolution adds holes to a standard convolution to form a dilated convolution kernel;
the calculation formula of the size of the cavity convolution kernel is as follows:
k * =(k-1)×r+1
where k is the size of the convolution kernel before dilation, k * The size of the convolution kernel after dilation, r, is the dilation rate.
Optionally, improving the Neck module comprises:
introducing a cross-scale feature bridging operation into the PANet structure of the Neck module, and introducing the lightweight upsampling operator CARAFE for upsampling;
wherein introducing the cross-scale feature bridging operation comprises: introducing upper-, middle- and lower-layer scale feature bridging into the PANet structure, realizing feature compensation through tensor concatenation while introducing only a small amount of complexity; and introducing the lightweight upsampling operator CARAFE comprises: fusing the corresponding feature map of the Neck module with the feature map produced by the lightweight upsampling operator CARAFE, then obtaining the next output feature map through a 3 × 3 convolution.
Optionally, training the safety appliance wearing detection model comprises: setting a loss function for the safety appliance wearing detection model;
the loss function is:
L_CIoU = 1 - IoU + ρ²(b, b_gt)/c² + αv
v = (4/π²)(arctan(w_gt/h_gt) - arctan(w/h))²
α = v/((1 - IoU) + v)
wherein L_CIoU is the loss function of the safety appliance wearing detection model, ρ is the Euclidean distance between the two center points, b is the center point of the prediction box, b_gt is the center point of the ground-truth box, v measures the consistency of the aspect ratios, α is a weight parameter, w and h are the width and height of the prediction box, w_gt and h_gt are the width and height of the ground-truth box, c is the diagonal length of the minimum enclosing rectangle of the prediction box and the ground-truth box, and IoU is the intersection of the prediction box and the ground-truth box divided by their union.
Compared with the prior art, the invention has the following advantages and technical effects:
according to the method, a safety appliance wearing detection model is constructed by improving YOLOv5, and an electric power operation field wearing image is input into the safety appliance wearing detection model to obtain a wearing detection result; the invention can effectively improve the detection precision of the small target and the shielded target and meet the requirements of real-time performance and accuracy of wearing detection in the electric power operation field.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:
FIG. 1 is a schematic flow chart of a method for detecting wearing of a safety appliance for power construction personnel according to an embodiment of the invention;
FIG. 2 is a schematic diagram of the original YOLOv5 algorithm network structure according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the main module structure of the original YOLOv5 algorithm according to the embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a new SPP module after improvement of YOLOv5 according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a multi-scale spatial feature pyramid module according to an embodiment of the present disclosure;
FIG. 6 is a diagram illustrating a comparison between the structure of PANet and the structure of BiFPN according to the embodiment of the present invention;
FIG. 7 is a schematic diagram of a Neck network based on cross-scale feature bridging and CARAFE according to an embodiment of the present invention;
FIG. 8 is a diagram illustrating an original inspection picture according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of the Faster RCNN detection results according to the embodiment of the present invention;
FIG. 10 is a schematic diagram of the YOLOv4 detection results according to the embodiment of the present invention;
FIG. 11 is a schematic diagram of the YOLOv5 detection results according to the embodiment of the present invention;
fig. 12 is a schematic view of a wearing detection result of the electric power construction worker safety equipment of the improved YOLOv5 according to the embodiment of the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than here.
Examples
As shown in fig. 1, the present embodiment provides a method for detecting the wearing of safety equipment by electric power construction workers based on an improved YOLOv5, including:
annotating wearing images from the electric power operation site to obtain a data set;
constructing a safety appliance wearing detection model based on the improved YOLOv5;
training the safety appliance wearing detection model by using the data set;
and inputting the power operation site wearing image into the trained safety appliance wearing detection model to obtain a wearing detection result.
Further, annotating the electric power operation site wearing images includes: marking out the target regions and categories in the images using Make Sense (an online annotation tool).
Further, the category contents of the power work site wearing image include: correctly worn gloves, not worn gloves, correctly worn safety belts, not worn safety belts, correctly worn work clothes, not worn work clothes, correctly worn safety helmets and not worn safety helmets.
Further, improving YOLOv5 comprises: improving the SPP sub-module in the backbone network module of YOLOv5, and improving the Neck module.
Further, improving the SPP sub-module includes:
adding a pooling kernel of size 3 × 3 to the pooling layer of the SPP sub-module to form four pooling kernels of different sizes, 3 × 3, 5 × 5, 9 × 9, and 13 × 13;
adding a 1 × 1 convolution after the Concat layer connected to the pooling layer to integrate the number of channels;
replacing the max-pooling operations of the pooling layer with dilated convolutions;
and embedding an MHSA self-attention mechanism at the tail part of the SPP submodule.
Further, improving the Neck module comprises:
introducing a cross-scale feature bridging operation into the PANet structure of the Neck module, and introducing the lightweight upsampling operator CARAFE for upsampling;
wherein introducing the cross-scale feature bridging operation comprises: introducing upper-, middle- and lower-layer scale feature bridging into the PANet structure, realizing feature compensation through tensor concatenation while introducing only a small amount of complexity; and introducing the lightweight upsampling operator CARAFE comprises: fusing the corresponding feature map of the Neck module with the feature map produced by the lightweight upsampling operator CARAFE, then obtaining the next output feature map through a 3 × 3 convolution.
Aiming at problems such as the low detection precision of small targets and occluded targets in the power field, the present embodiment provides a new wearing detection method, SBC_YOLOv5, based on an improved YOLOv5. The method first improves the spatial pyramid pooling layer using dilated convolution, multi-scale pooling, and an MHSA self-attention module to obtain a richer receptive field; second, it improves the Neck network with a feature bridging operation and the CARAFE operator to strengthen the network's extraction and compensation of semantic features; finally, it optimizes the network's loss function with CIoU to improve the regression ability of the model. The results show that the SBC_YOLOv5 wearing detection method established in this embodiment achieves an average precision (mAP) of 82.3%, a recall of 81.5%, and a detection speed of 44 FPS; its mAP exceeds those of YOLOv5, YOLOv4, and Faster RCNN by 1.5%, 10.27%, and 25.21% respectively. It can effectively improve the detection precision of small targets and occluded targets and meets the real-time and accuracy requirements of wearing detection at the electric power operation site.
YOLOv5 is a single-stage detection algorithm; compared with YOLOv4 it offers greater flexibility, speed, and precision, with a faster runtime and a simpler structure. This embodiment therefore adopts an improved YOLOv5m network to detect protective gear worn in the power field. The original YOLOv5 algorithm is described in detail below.
The YOLOv5m model consists of four parts: the input, the backbone network (Backbone), the neck (Neck), and the output. The backbone network extracts feature maps of different sizes from the image through repeated convolution and pooling operations, and mainly comprises the Conv, C3, and SPP modules. Conv is the basic convolution unit of the YOLOv5 network, applying two-dimensional convolution, regularization, and activation to the input. C3 is composed of several classical residual Bottleneck modules. The SPP module is a spatial pyramid pooling layer: it applies three pooling operations of different sizes to the input feature map and joins the outputs with a depth-preserving Concat. The Neck is the feature fusion part, composed of a feature pyramid network (FPN) and a path aggregation network (PAN); it processes the extracted features of different sizes into a common size and fuses them to generate feature maps at three scales. The FPN structure passes the category features of large, high-level targets down to the lower levels, while the PAN structure passes the position features of low-level large targets, together with the category and position features of small targets, up to the higher levels. The two are complementary, overcoming each other's limitations, so that high-level and low-level features are fused, thereby strengthening the feature extraction ability of the model.
The PAN must upsample high-level features before passing them downward; the upsampling method in the YOLOv5 backbone network is nearest-neighbor interpolation, which is computationally cheap but of low precision. During target detection, YOLOv5 uses a weighted NMS (Non-Maximum Suppression) operation to screen the many anchor boxes. The original YOLOv5 network structure and its main modules are shown in fig. 2 and fig. 3.
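As a concrete illustration of the screening step mentioned above, the following is a minimal sketch of plain (hard) NMS in Python; the weighted variant used by YOLOv5 additionally merges overlapping boxes by score instead of discarding them outright. The box coordinates, scores, and threshold here are illustrative, not from the patent.

```python
def iou(a, b):
    """Intersection over union of two corner-format boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, thresh=0.45):
    """Greedy hard NMS: keep the highest-scoring box, drop boxes that
    overlap it above `thresh`, and repeat on the survivors."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2] - the second box overlaps the first
```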
The steps of the method for detecting the wearing of safety equipment by electric power construction workers based on the improved YOLOv5 in this embodiment are described in detail below:
1. Improved YOLOv5 safety appliance wearing detection method
1.1. Multi-scale spatial feature pyramid fusing dilated convolution and MHSA
In its backbone network, YOLOv5 uses an SPP spatial pyramid pooling module based on the SPPNet structure: it applies max-pooling with kernel sizes of 5 × 5, 9 × 9, and 13 × 13 to the feature map and splices the pooled results with the feature map through a Concat operation.
Considering the presence of small targets such as gloves and safety helmets, and to reduce the gap between receptive fields and obtain more comprehensive receptive field information, this embodiment adds a 3 × 3 branch to capture relatively small receptive field information, operating with four pooling kernels of different sizes: 3 × 3, 5 × 5, 9 × 9, and 13 × 13. Finally, a 1 × 1 convolution integrates the number of channels after the Concat operation, yielding feature information at different scales and making the expressive ability of the feature map more prominent. The new SPP module structure is shown in FIG. 4.
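A minimal PyTorch sketch of this four-branch pooling layout, assuming the standard YOLOv5-style SPP wiring; class and variable names are illustrative, and the dilated-convolution replacement and MHSA tail described next are deliberately omitted.

```python
import torch
import torch.nn as nn

class MultiScaleSPP(nn.Module):
    """Sketch of the modified SPP: four parallel max-pool branches
    (a 3x3 branch added to the original 5/9/13) concatenated with the
    input, then a 1x1 conv to integrate the channels."""
    def __init__(self, channels: int):
        super().__init__()
        # stride 1 + padding k//2 keeps the spatial size for odd kernels
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
            for k in (3, 5, 9, 13)
        )
        # 1x1 conv integrates channels after Concat (input + 4 branches)
        self.fuse = nn.Conv2d(channels * 5, channels, kernel_size=1)

    def forward(self, x):
        out = torch.cat([x] + [p(x) for p in self.pools], dim=1)
        return self.fuse(out)

m = MultiScaleSPP(32)
y = m(torch.randn(1, 32, 20, 20))
print(tuple(y.shape))  # (1, 32, 20, 20)
```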
Although the SPP spatial pyramid pooling module obtains rich receptive field information, the max-pooling operation on the feature map easily loses the position information of some targets. This embodiment therefore replaces max-pooling with dilated convolution, keeping the receptive field unchanged while preserving the completeness of the target position information. The essence of dilated convolution is to add holes to a standard convolution. The size of the dilated convolution kernel is calculated as:
k* = (k - 1) × r + 1    (1)
where k is the size of the convolution kernel before dilation, here 3, and r is the dilation rate. This embodiment replaces the max-pooling operations with 3 × 3 dilated convolutions with dilation rates r of 1, 2, 4, and 6 respectively, which reproduces the same receptive fields as in the original SPP module while avoiding the loss of target position information. In addition, to improve the network's handling of global information, an MHSA self-attention mechanism is embedded at the tail of the SPP module, complementing the information extraction of the convolutional neural network. The structure of the multi-scale spatial feature pyramid module fusing dilated convolution and MHSA is shown in fig. 5.
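Formula (1) can be checked directly: a 3 × 3 kernel at the four dilation rates reproduces the 3, 5, 9, and 13 receptive fields of the pooling kernels.

```python
def dilated_kernel_size(k: int, r: int) -> int:
    """Effective kernel size after dilation, per formula (1): k* = (k-1)*r + 1."""
    return (k - 1) * r + 1

# A 3x3 kernel at dilation rates 1, 2, 4, 6 matches the four pooling
# kernel sizes (3, 5, 9, 13) of the improved SPP module.
sizes = [dilated_kernel_size(3, r) for r in (1, 2, 4, 6)]
print(sizes)  # [3, 5, 9, 13]
```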
1.2. Neck network based on cross-scale feature bridging and CARAFE
The YOLOv5 network uses a PANet structure for feature fusion in the Neck part; PANet strengthens the entire feature hierarchy with low-level localization signals through a bottom-up path, greatly shortening the information path between low-level and top-level features. In 2020, the BiFPN multi-scale feature fusion network was first applied in the EfficientDet target detection network proposed by the Google Brain team; compared with PANet, BiFPN removes some nodes and adds skip connections to form a fusion module. FIG. 6 shows the structures of PANet and BiFPN.
The Neck part fuses semantic features at three different scales to recognize unworn-equipment targets of different sizes. To strengthen its feature fusion ability, this embodiment introduces a cross-scale feature bridging operation on the basis of PANet, following the structure of BiFPN; through tensor concatenation it improves feature compensation while adding only a small amount of complexity, which helps raise the precision of detecting small targets and occluded targets.
Introducing the cross-scale feature bridging operation improves the accuracy of the algorithm but also greatly increases the number of parameters, so the lightweight upsampling operator CARAFE is introduced into the Neck network for upsampling. Through content-aware feature reassembly, it improves the network's acquisition of the image receptive field, effectively raising accuracy while limiting the parameter growth caused by the bridging operation. In the CARAFE implementation, the corresponding feature map in the Neck network is fused with the feature map produced by the CARAFE upsampling operator, and a 3 × 3 convolution then yields the next output feature map, in preparation for the subsequent classification and prediction stages of the model.
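The fusion step just described can be sketched in PyTorch as follows; nearest-neighbour upsampling stands in for the CARAFE operator here, and all class names and channel counts are illustrative, not from the patent.

```python
import torch
import torch.nn as nn

class NeckFuse(nn.Module):
    """Sketch of the Neck fusion step: an upsampled top-down feature map
    is concatenated with the corresponding lateral Neck feature map, and
    a 3x3 conv produces the next output map. nn.Upsample is a stand-in
    for the content-aware CARAFE operator."""
    def __init__(self, top_ch: int, lateral_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="nearest")  # CARAFE stand-in
        self.conv = nn.Conv2d(top_ch + lateral_ch, out_ch,
                              kernel_size=3, padding=1)

    def forward(self, top, lateral):
        return self.conv(torch.cat([self.up(top), lateral], dim=1))

fuse = NeckFuse(64, 32, 32)
out = fuse(torch.randn(1, 64, 10, 10), torch.randn(1, 32, 20, 20))
print(tuple(out.shape))  # (1, 32, 20, 20)
```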
The Neck network based on cross-scale feature bridging and CARAFE is shown in fig. 7.
1.3. Loss function optimization
The YOLOv5 loss function, given in formula (2), consists of three parts: the target confidence loss l_obj, the classification loss l_cls, and the position loss l_box between the target box and the prediction box:
Loss = l_obj + l_cls + l_box    (2)
The original YOLOv5 model adopts GIoU_loss as the bounding-box regression loss. GIoU is an extension of IoU, but when prediction boxes of the same size lie entirely inside the target box, their GIoU values are identical regardless of position, so GIoU_loss cannot distinguish them.
CIoU takes the aspect ratio, the overlap area, and the distance between center points into account; compared with GIoU it selects a more suitable position for the target box and localizes the target better. This embodiment therefore replaces the regression loss GIoU_loss with CIoU_loss, calculated as in formulas (3), (4), and (5):
L_CIoU = 1 - IoU + ρ²(b, b_gt)/c² + αv    (3)
v = (4/π²)(arctan(w_gt/h_gt) - arctan(w/h))²    (4)
α = v/((1 - IoU) + v)    (5)
where L_CIoU is the loss function of the safety appliance wearing detection model, ρ is the Euclidean distance between the two center points, b is the center point of the prediction box, b_gt is the center point of the ground-truth box, v measures the consistency of the aspect ratios, α is a weight parameter, w and h are the width and height of the prediction box, w_gt and h_gt are the width and height of the ground-truth box, c is the diagonal length of the minimum enclosing rectangle of the prediction box and the ground-truth box, and IoU is the intersection of the prediction box and the ground-truth box divided by their union.
CIoU_loss overcomes the phenomenon in the original loss function whereby prediction boxes completely overlapping the target box in different regions yield identical loss values, making the position of the regression box more accurate and effectively improving the detection performance of the model.
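A plain-Python sketch of formulas (3)-(5) on corner-format boxes (x1, y1, x2, y2); the function name and the guard against division by zero for coincident boxes are this sketch's additions, not from the patent.

```python
import math

def ciou_loss(box, gt):
    """CIoU loss per formulas (3)-(5); boxes are (x1, y1, x2, y2)."""
    # IoU: intersection divided by union
    ix1, iy1 = max(box[0], gt[0]), max(box[1], gt[1])
    ix2, iy2 = min(box[2], gt[2]), min(box[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_b = (box[2] - box[0]) * (box[3] - box[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_b + area_g - inter)
    # squared centre distance rho^2 and enclosing-rectangle diagonal c^2
    bx, by = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    gx, gy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (bx - gx) ** 2 + (by - gy) ** 2
    cx1, cy1 = min(box[0], gt[0]), min(box[1], gt[1])
    cx2, cy2 = max(box[2], gt[2]), max(box[3], gt[3])
    c2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    # aspect-ratio consistency v (4) and its weight alpha (5)
    w, h = box[2] - box[0], box[3] - box[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    denom = (1 - iou) + v
    alpha = v / denom if denom > 0 else 0.0  # guard: coincident boxes
    return 1 - iou + rho2 / c2 + alpha * v

# identical boxes: IoU = 1 and every penalty term vanishes
print(ciou_loss((0, 0, 10, 10), (0, 0, 10, 10)))  # 0.0
```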
2. Results of model training and experiments
2.1. Model training
2.1.1. Experimental Environment and data set preparation
In deep-learning target detection, the data set is an indispensable part of the experiment. Most electric power operation sites are located in the suburbs, and because working hours vary and lighting changes greatly, the collected images differ widely in quality. No existing open-source data set covers standard power-field scenarios, so none meets the detection requirements of an actual production environment. To solve this problem, this embodiment collected 5015 images for the power operation site wearing detection data set by web crawling. Eight label categories are defined for the data set (see Table 1), and the target regions and categories of all images were annotated with Make Sense (an online annotation tool).
TABLE 1
[Table 1: the eight annotation label categories: gloves, safety belts, work clothes, and safety helmets, each labeled as correctly worn or not worn]
2.1.2. Network training and experiment environment
To achieve the best model performance, during training the number of iterations is set to 200, the initial learning rate to 0.01, the learning-rate decay weight to 0.0005 to prevent over-fitting, the learning-rate momentum to 0.937, and the training batch size to 32 to make full use of the GPU. The configuration and environment of the server used in the experiments are given in Table 2.
TABLE 2
[Table 2: configuration and software environment of the experimental server]
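The training settings above, gathered into one place as a configuration dictionary; the key names follow YOLOv5's hyperparameter-file conventions where they exist (`lr0`, `momentum`, `weight_decay`), while `epochs` and `batch_size` are command-line options in YOLOv5 rather than hyp-file keys.

```python
# Training configuration from this section (values from the patent text).
train_cfg = {
    "epochs": 200,           # number of training iterations
    "lr0": 0.01,             # initial learning rate
    "weight_decay": 0.0005,  # decay weight, limits over-fitting
    "momentum": 0.937,       # learning-rate momentum
    "batch_size": 32,        # sized to fully utilise the GPU
}
print(train_cfg["epochs"], train_cfg["batch_size"])  # 200 32
```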
2.2. Results of the experiment
2.2.1. Evaluation index
The model is evaluated with the average precision (AP) and mean average precision (mAP), common evaluation indices in the field of target detection.
Precision P and recall R intuitively measure the model's false-detection and missed-detection rates, and are calculated as in formulas (6) and (7):
P = TP / (TP + FP)    (6)
R = TP / (TP + FN)    (7)
where TP is the number of true positives, FP the number of false positives, and FN the number of false negatives.
mAP@0.5 is obtained by averaging the APs of all classes and reflects how the model's precision varies with recall; the higher the mAP@0.5, the more easily the model maintains high precision at high recall. It is calculated as in formulas (8) and (9):
AP = ∫₀¹ P(R) dR    (8)
mAP = (1/N) Σ AP_i    (9)
where N is the number of classes and AP_i is the average precision of class i.
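Formulas (6), (7), and (9) reduce to a few lines of Python; the TP/FP/FN counts and per-class AP values below are illustrative numbers, not results from the paper.

```python
def precision(tp: int, fp: int) -> float:
    """Formula (6): fraction of detections that are correct."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Formula (7): fraction of ground-truth objects that are found."""
    return tp / (tp + fn)

def mean_ap(per_class_ap):
    """Formula (9): mAP is the mean of the per-class AP values (each AP
    being the area under that class's P-R curve, formula (8))."""
    return sum(per_class_ap) / len(per_class_ap)

print(precision(80, 20))  # 0.8
print(recall(80, 20))     # 0.8
print(round(mean_ap([0.9, 0.7, 0.8]), 3))
```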
2.2.2. ablation experiment
To analyze the influence of each improvement on the wearing detection results for power-field operating personnel, an ablation experiment was designed and the detection effect of each algorithm variant evaluated. The experimental settings and detection results are shown in Table 3, where "√" indicates the corresponding method is used in the network model and "×" that it is not.
TABLE 3
[Table 3: ablation experiment settings and detection results]
As Table 3 shows, after algorithm 1 improves the spatial feature pyramid, the average detection precision rises by 0.8 percentage points, indicating that the multi-scale spatial feature pyramid based on dilated convolution and MHSA effectively improves the network's acquisition of receptive field information. Algorithm 2 introduces the Neck network based on cross-scale feature bridging and CARAFE, raising the mAP by 1.2 percentage points, which shows that this improvement strengthens the network's feature extraction and compensation abilities. Algorithm 3 optimizes the network's loss function with CIoU, improving detection precision by 0.5%, confirming that the optimization improves the model's regression ability. Finally, combining all the improvements and training yields SBC_YOLOv5; compared with the YOLOv5 algorithm before improvement, its mAP rises by 1.5% to 82.3% at a detection speed of 44 FPS, indicating that the improved YOLOv5 wearing detection method established in this embodiment is effective and raises the detection precision of safety protective equipment.
2.2.3. Comparison experiment of different detection algorithms
This embodiment uses the improved YOLOv5 algorithm to detect the wearing of safety equipment by electric power construction workers. To demonstrate the superiority of the improved algorithm, the same data set was used under the same configuration and compared with the current mainstream target detection models Faster RCNN, YOLOv4, and YOLOv5; the per-class mAP values (%) of the comparison experiments are shown in Table 4.
TABLE 4
[Table 4: per-class mAP comparison (%) of the different detection algorithms — reproduced as images in the original document]
According to the experimental results in Table 4, the algorithm of this embodiment effectively improves the detection precision for safety gear worn by power construction personnel. Its mAP for detecting workers' improper wearing is 82.3%, far higher than the 57.09% mAP of Faster RCNN. Compared with YOLOv4 and the original YOLOv5, the algorithm of this embodiment also improves the mAP of every detected category. The algorithm therefore performs well in terms of wearing detection accuracy at power operation sites and can meet the accuracy requirements of safety gear wearing detection in complex power operation environments.
2.2.4. Analysis of detection results
In addition, to show the differences among the algorithms more intuitively, partial test results are selected and shown in Figs. 8 to 12, where Fig. 8 is the original picture, Fig. 9 the Faster RCNN detection, Fig. 10 the YOLOv4 detection, Fig. 11 the YOLOv5 detection, and Fig. 12 the improved-YOLOv5 safety gear wearing detection for power construction personnel. As seen in Fig. 10, YOLOv4 achieves a good detection effect and detects most safety protection gear, but misses some small targets such as gloves. Faster RCNN performs poorly on test pictures of different scenes, with serious missed and false detections: as shown in Fig. 9, small targets such as gloves are missed, a black automobile is falsely detected as a safety garment and a safety belt, and repeated judgments occur in occluded areas. The detection performance of YOLOv5 is second only to that of this embodiment; it detects most worn objects, but produces a few false detections in complex scenes, e.g. in Fig. 11 a white automobile is falsely detected as a helmet. The method of this embodiment accurately detects all small and occluded targets in Fig. 8, showing that the proposed wearing detection method better fuses high-level and low-level semantic feature information and thereby effectively improves the detection precision for small and occluded targets. Comparison of these detection results shows that the improved YOLOv5 network model achieves a good detection effect for safety gear wearing in complex power operation environments.
This embodiment provides SBC_YOLOv5, an improved-YOLOv5 method for detecting the wearing of safety gear by power construction personnel, addressing the difficulty and high miss rate of existing target detection techniques on small and occluded targets in complex power operation scenes. On the basis of the YOLOv5 network, a multi-scale spatial feature pyramid fusing dilated convolution and MHSA and a Neck network fusing cross-scale feature bridging and CARAFE are introduced, and the loss function is optimized with CIoU, effectively improving the detection accuracy of the network. Experimental results show that SBC_YOLOv5 achieves an mAP of 82.3% at a detection speed of 44 fps, effectively improving the detection precision of small and occluded targets, basically meeting the accuracy and real-time requirements of wearing detection in current complex power operation scenes, and exhibiting good generalization capability.
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. A method for detecting the wearing of safety gear by electric power construction personnel based on improved YOLOv5, characterized in that it comprises:
labeling wearing images of the power operation site to obtain a dataset;
constructing a safety gear wearing detection model based on improved YOLOv5;
training the safety gear wearing detection model with the dataset; and
inputting a power operation site wearing image into the trained safety gear wearing detection model to obtain a wearing detection result.
2. The improved YOLOv5 power constructor safety gear wearing detection method of claim 1, wherein labeling the power operation site wearing image comprises: marking target areas and their categories in the power operation site wearing image.
3. The improved YOLOv5 power constructor safety gear wearing detection method according to claim 2, characterized in that the categories in the power operation site wearing image include: gloves worn properly, gloves not worn, safety belt worn properly, safety belt not worn, work clothes worn properly, work clothes not worn, safety helmet worn properly, and safety helmet not worn.
4. The improved YOLOv5 power constructor safety gear wearing detection method of claim 1, characterized in that improving YOLOv5 comprises: improving the SPP sub-module in the YOLOv5 backbone network module and improving the Neck module.
5. The improved YOLOv5 power constructor safety gear wearing detection method of claim 4, wherein improving the SPP sub-module comprises:
adding a 3×3 pooling kernel to the pooling layer of the SPP sub-module so as to form multiple pooling kernels of different sizes;
adding a 1×1 convolution after the Concat layer connected to the pooling layer to integrate the number of channels;
replacing the max-pooling operations of the pooling layer with dilated convolutions; and
embedding an MHSA self-attention mechanism at the tail of the SPP sub-module.
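The improved SPP structure of claim 5 can be illustrated with a rough PyTorch sketch. This is an interpretive sketch, not the patent's implementation: the class name, channel sizes and dilation rates are assumptions, and the MHSA block embedded at the tail is omitted for brevity.

```python
import torch
import torch.nn as nn

class DilatedSPP(nn.Module):
    """Sketch of an SPP variant in which parallel dilated 3x3 convolutions
    replace the max-pooling branches, and a 1x1 convolution integrates the
    channel count after the Concat. Illustrative only; the MHSA tail from
    claim 5 is not included here."""

    def __init__(self, c_in, c_out, rates=(1, 2, 4, 6)):
        super().__init__()
        self.branches = nn.ModuleList([
            # padding = rate keeps the spatial size constant for a 3x3 kernel
            nn.Conv2d(c_in, c_in, kernel_size=3, padding=r, dilation=r)
            for r in rates
        ])
        # 1x1 conv after the Concat integrates channels down to c_out
        self.fuse = nn.Conv2d(c_in * (len(rates) + 1), c_out, kernel_size=1)

    def forward(self, x):
        # identity branch + one branch per dilation rate, concatenated on channels
        feats = [x] + [branch(x) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))
```

With this layout, each branch sees a progressively larger receptive field while the feature-map resolution is preserved, which is the stated goal of replacing max pooling with dilated convolution.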
6. The improved YOLOv5 power constructor safety gear wearing detection method of claim 5, characterized in that the dilated convolution is formed by inserting holes into a standard convolution to obtain a dilated (atrous) convolution kernel;
the size of the dilated convolution kernel is calculated as:
k* = (k − 1) × r + 1
where k is the size of the convolution kernel before dilation, k* is the size of the convolution kernel after dilation, and r is the dilation rate.
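The kernel-size formula of claim 6 can be checked with a one-line helper (the function name is illustrative): for example, a 3×3 kernel with dilation rate 2 covers the receptive field of a 5×5 kernel.

```python
def dilated_kernel_size(k, r):
    """Effective size k* = (k - 1) * r + 1 of a k x k convolution kernel
    with dilation rate r (r = 1 recovers the standard kernel size)."""
    return (k - 1) * r + 1

print(dilated_kernel_size(3, 2))  # → 5
```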
7. The improved YOLOv5 power constructor safety gear wearing detection method of claim 4, wherein improving the Neck module comprises:
introducing a cross-scale feature bridging operation into the PANet structure of the Neck module, and introducing the lightweight upsampling operator CARAFE for upsampling;
wherein introducing the cross-scale feature bridging operation comprises: introducing upper-, middle- and lower-layer scale feature bridging into the PANet structure, and realizing feature compensation through tensor concatenation while adding only a small amount of complexity; and introducing the lightweight upsampling operator CARAFE comprises: fusing the feature map of the Neck module with the feature map upsampled by CARAFE, and then obtaining the output feature map of the next stage through a 3×3 convolution.
8. The improved YOLOv5 power constructor safety gear wearing detection method of claim 1, characterized in that training the safety gear wearing detection model comprises: setting a loss function of the safety gear wearing detection model;
the loss function is:
L_CIoU = 1 − IoU + ρ²(b, b^gt)/c² + αv
v = (4/π²) × (arctan(w^gt/h^gt) − arctan(w/h))²
α = v / ((1 − IoU) + v)
where L_CIoU is the loss function of the safety gear wearing detection model, ρ is the Euclidean distance between the two center points, b is the center point of the prediction box, b^gt is the center point of the ground-truth box, v measures the consistency of the aspect ratios, α is a weight parameter, w is the width of the prediction box, h is the height of the prediction box, w^gt is the width of the ground-truth box, h^gt is the height of the ground-truth box, c is the diagonal distance of the minimum enclosing rectangle of the prediction box and the ground-truth box, and IoU equals the intersection of the prediction box and the ground-truth box divided by their union.
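Following the three formulas of claim 8, the CIoU loss for a pair of axis-aligned boxes can be sketched in a few lines. The (x1, y1, x2, y2) coordinate convention and the function name are assumptions made for this illustration:

```python
import math

def ciou_loss(pred, gt):
    """CIoU loss for single axis-aligned boxes given as (x1, y1, x2, y2):
    L = 1 - IoU + rho^2 / c^2 + alpha * v, per the formulas of claim 8."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # IoU: intersection area over union area
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    area_p = (px2 - px1) * (py2 - py1)
    area_g = (gx2 - gx1) * (gy2 - gy1)
    iou = inter / (area_p + area_g - inter)

    # rho^2: squared Euclidean distance between the box centers
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2

    # c^2: squared diagonal of the minimum enclosing rectangle
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2

    # v: aspect-ratio consistency term; alpha: its trade-off weight
    w, h = px2 - px1, py2 - py1
    wg, hg = gx2 - gx1, gy2 - gy1
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0

    return 1 - iou + rho2 / c2 + alpha * v
```

For identical boxes every term vanishes and the loss is 0; shifting the prediction away from the ground truth increases both the IoU term and the center-distance penalty, which is the regression behavior the CIoU optimization in claim 8 is meant to improve.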
Publications (1)

Publication Number Publication Date
CN115731577A 2023-03-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination