CN117726991A - High-altitude hanging basket safety belt detection method and terminal - Google Patents

High-altitude hanging basket safety belt detection method and terminal

Info

Publication number
CN117726991A
CN117726991A (application CN202410171627.7A)
Authority
CN
China
Prior art keywords
safety belt
detection
image data
frame
yolov8
Prior art date
Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number
CN202410171627.7A
Other languages
Chinese (zh)
Other versions
CN117726991B (en)
Inventor
黄宗荣
林大甲
郑敏忠
江世松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinqianmao Technology Co ltd
Original Assignee
Jinqianmao Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinqianmao Technology Co ltd
Priority to CN202410171627.7A
Publication of CN117726991A
Application granted
Publication of CN117726991B
Legal status: Active


Classifications

    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a high-altitude hanging basket safety belt detection method and terminal. Image data of the construction-site scene is collected in advance and detection targets are framed with labeling boxes to build a scene database, which is then trained with an optimized YOLOv8 neural network to obtain a learned model. Images captured by several cameras inside the hanging basket are processed by the learned model to obtain detection targets, and the coordinates of those targets are cross-checked to decide whether any worker is not wearing a safety belt; if so, an alarm is raised so that the safety problem in the construction environment can be resolved in time. Because the operation of the high-altitude hanging basket is monitored with the optimized detection model, detection accuracy is high, monitoring runs uninterrupted without manual labor, a series of safety problems is avoided, and operating costs remain low.

Description

High-altitude hanging basket safety belt detection method and terminal
Technical Field
The invention relates to the technical field of automatic detection of image processing, in particular to a method and a terminal for detecting a safety belt of a high-altitude hanging basket.
Background
With the rapid development of the construction industry, demand for high-altitude operations on construction sites keeps growing. High-altitude work areas, however, usually contain many unstable factors and are prone to danger, so it is very important to detect, in real time and efficiently, whether high-altitude workers are wearing safety belts, to eliminate potential safety hazards promptly, and to ensure construction safety.
At present, construction work areas commonly rely on one of two approaches:
One is to supervise high-altitude workers through manual patrols by safety supervisors, relying on on-site visual inspection by supervisory or management personnel who check, regularly or irregularly, whether workers have fastened their safety belts correctly. This method, however, is subject to human factors (e.g., fatigue or negligence) and can produce inaccurate or missed inspections.
The other is detection through smart wearable equipment: some safety belts or safety vests carry sensors and wireless communication modules that monitor the wearer's position and posture in real time and send the data to a remote monitoring center. If the system detects that a wearer is working at height without wearing a safety harness, it raises an alarm. This approach provides real-time, personalized monitoring and alerting, but requires additional hardware investment and maintenance costs.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to provide a high-altitude hanging basket safety belt detection method and terminal that automatically detect whether high-altitude workers are wearing their safety belts, without adding hardware cost, while improving detection accuracy.
In order to solve the technical problems, the invention adopts the following technical scheme:
a high-altitude hanging basket safety belt detection method comprises the following steps:
s1, acquiring image data of a scene of a construction site, and selecting a detection target in the image data by using a labeling frame to obtain a scene database;
s2, training the scene database by using an optimized YOLOv8 detection model to obtain a safety belt detection model;
the optimized YOLOv8 detection model is as follows: introducing a fusion multi-head self-attention mechanism into a feature extraction module of the YOLOv8 detection model, introducing Soft-NMS into the YOLOv8 detection model, and optimizing a loss function of the YOLOv8 detection model through self-adaptive weight;
s3, obtaining images to be detected which are shot by a plurality of cameras in the hanging basket;
s4, detecting the image to be detected by using the safety belt detection model, and rechecking a detection result by using the relative position of the detection target;
and S5, judging whether the situation that the safety belt is not worn exists in the image to be detected according to the rechecking result, and if so, giving an alarm.
In order to solve the technical problems, the invention adopts another technical scheme that:
the overhead basket safety belt detection terminal comprises a memory, a processor and a computer program which is stored in the memory and can run on the processor, wherein the processor realizes the steps of the overhead basket safety belt detection method when executing the computer program.
The invention has the beneficial effects that: image data of the construction-site scene is collected in advance and detection targets are framed to build a scene database, which is then trained with an optimized YOLOv8 neural network to obtain a learned model; images captured by several cameras inside the hanging basket are processed by the learned model to obtain detection targets, and the coordinates of those targets are cross-checked to decide whether any worker is not wearing a safety belt; if so, an alarm is raised so that the safety problem in the construction environment can be resolved in time. Because the operation of the high-altitude hanging basket is monitored with the optimized detection model, detection accuracy is high, monitoring runs uninterrupted without manual labor, a series of safety problems is avoided, and operating costs remain low.
Drawings
FIG. 1 is a flow chart of a method for detecting a safety belt of a high-altitude basket according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a detection terminal of a safety belt of a high-altitude basket according to an embodiment of the present invention;
description of the reference numerals:
1. a high-altitude hanging basket safety belt detection terminal; 2. a memory; 3. a processor.
Detailed Description
In order to describe the technical contents, the achieved objects and effects of the present invention in detail, the following description will be made with reference to the embodiments in conjunction with the accompanying drawings.
Referring to fig. 1, an embodiment of the present invention provides a method for detecting a safety belt of a high altitude basket, including the steps of:
s1, acquiring image data of a scene of a construction site, and selecting a detection target in the image data by using a labeling frame to obtain a scene database;
s2, training the scene database by using an optimized YOLOv8 detection model to obtain a safety belt detection model;
the optimized YOLOv8 detection model is as follows: introducing a fusion multi-head self-attention mechanism into a feature extraction module of the YOLOv8 detection model, introducing Soft-NMS into the YOLOv8 detection model, and optimizing a loss function of the YOLOv8 detection model through self-adaptive weight;
s3, obtaining images to be detected which are shot by a plurality of cameras in the hanging basket;
s4, detecting the image to be detected by using the safety belt detection model, and rechecking a detection result by using the relative position of the detection target;
and S5, judging whether the situation that the safety belt is not worn exists in the image to be detected according to the rechecking result, and if so, giving an alarm.
From the above description, the beneficial effects of the invention are as follows: image data of the construction-site scene is collected in advance and detection targets are framed to build a scene database, which is then trained with an optimized YOLOv8 neural network to obtain a learned model; images captured by several cameras inside the hanging basket are processed by the learned model to obtain detection targets, and the coordinates of those targets are cross-checked to decide whether any worker is not wearing a safety belt; if so, an alarm is raised so that the safety problem in the construction environment can be resolved in time. Because the operation of the high-altitude hanging basket is monitored with the optimized detection model, detection accuracy is high, monitoring runs uninterrupted without manual labor, a series of safety problems is avoided, and operating costs remain low.
Further, the step S1 includes the steps of:
s11, setting a plurality of monitoring points for a construction site scene, randomly sampling the plurality of monitoring points, and collecting image data in actual construction;
s12, carrying out frame selection on the image data through an annotation frame by utilizing an image annotation tool, and selecting detection targets in the image data in a frame manner to obtain a scene database;
the detection targets are divided into: a human body, a half body wearing the safety belt and a half body not wearing the safety belt;
s13, enhancing the image data in the scene database to expand the content of the image data.
From the above description, image data from actual construction is randomly collected and annotated, and general enhancement processing is then applied to the scene database, further expanding the image data and preventing the model from overfitting.
Further, the step S2 includes the steps of:
s21, splicing a plurality of pieces of image data, and adding the spliced image data into a scene database for training;
s22, selecting YOLOv8 to train the scene database, and introducing a fused multi-head self-attention mechanism into a model feature extraction module C2 f;
s23, introducing soft-NMS into a YOLOv8 detection model;
s24, improving the loss function of the YOLOv8 detection model, and introducing an adaptive weight to optimize the loss function.
From the above description: introducing a fused multi-head self-attention mechanism into the C2f module strengthens the model's ability to capture global information; introducing Soft-NMS addresses the frequent overlap of personnel inside the hanging basket; and introducing the adaptive-weight-optimized loss function addresses the imbalance between the relatively sparse difficult samples and the far more numerous easy samples.
Further, the step S21 includes the steps of:
s211, determining the grid layout of the target spliced image;
s212, filling the image data in the grid layout in a counterclockwise direction to obtain spliced image data;
s213, filling the labeling frame corresponding to the image data in the grid layout in a counterclockwise direction to form a labeling information matrix.
From the description, several pictures are stitched together according to a fixed rule to form a new picture, which is added to the scene database for training; this enhances image diversity, enriches background information, and increases the number of targets per training batch.
Further, the step S22 includes the steps of:
s221, outputting characteristics of x by each bottleneck in a C2f module, wherein the format of x is H multiplied by W multiplied by d, and the x represents the height, the width and the channel dimension of a characteristic matrix respectively;
s222, adopting 3 convolutions of 1 multiplied by 1 to project the input feature images into a query vector q, a key vector k and a value vector v respectively;
s223, initializing two parameter vectors Rh and Rw used for learning, representing position codes of different positions of the height and width of the two-dimensional feature map, and then adding through a broadcasting mechanism to obtain a parameter r;
s224, performing matrix multiplication on q and r to obtain qrT, representing the association information between the pixel content and the position, and performing matrix multiplication on q and k to obtain qkT, representing the association information between the pixel content and the content;
s225, performing matrix addition on qrT and qkT, and performing softmax normalized index processing on the obtained matrix, wherein the processed matrix is in a format of HW×HW;
s226, performing matrix multiplication on the output matrix obtained in the step S225 and the weight vector v to obtain a matrix F, wherein the size of the matrix F is H multiplied by W multiplied by d.
From the above description, introducing the fused multi-head self-attention mechanism into the model's C2f module strengthens its capture of global information; because self-attention models the input features globally, every output position receives global information, extracting correlations between pixel features more effectively than ordinary convolution.
Further, the step S23 includes the steps of:
s231, in the detection process, ordering all the candidate frames in a descending order according to the confidence level of each detected target frame;
s232, selecting a candidate frame with the highest confidence coefficient, adding the candidate frame to a final output result list, and calculating the intersection ratio between the rest candidate frames and the selected frames;
s233, attenuating the confidence coefficient of the rest candidate frames according to the calculated intersection ratio and a preset attenuation function, and discarding the candidate frames if the attenuated confidence coefficient is lower than a certain threshold; otherwise, it is reserved and added to the final output result list, and the attenuation formula is as follows:
where S represents the confidence level, i is the sequence number of the rest of the frames except for the highest scoring frame "A", b i Representing to be treatedσ represents a gaussian function parameter;
s234, repeating the steps S232 and S233 until all candidate frames are processed;
s235, obtaining a final output result list which comprises the filtered target frames.
From the above description: because workers operate in the cramped space of the hanging basket, heavy overlap between them is common and traditional NMS easily makes mistakes. Soft-NMS is therefore introduced: frames with a large intersection-over-union are not discarded outright; instead the attenuation formula lowers their score, so frames with high overlap can still serve as correct detection frames in subsequent computation. This raises the recall rate, lowers the missed-detection rate, and improves the detection effect.
Further, the step S24 includes the steps of:
s241, respectively defining a class and a regression frame loss function by using the YOLOv8 standard loss function, and calculating a loss value;
s242, calculating the intersection ratio by using the target frame after the last screening and the labeling frame of the corresponding image data, and calculating an average intersection ratio m, wherein when m is smaller than 0.2, the value is 0.2;
s243, calculating a weight value:
wherein x represents the label information of the labeling frame (the numeric value of the label class), μ represents the input average intersection-over-union, and ε represents a bias value;
s244, multiplying the calculated weight value and the loss value to obtain a final loss value.
As can be seen from the above description: during data collection, clear and easily distinguished samples are plentiful while difficult samples are relatively few. The loss function is therefore improved: easy and difficult samples are distinguished by the intersection-over-union between the prediction frame and the existing labeling frame, and an adaptive weight is introduced to optimize the loss function, solving the imbalance between easy and difficult samples.
Further, the step S4 includes the steps of:
s41, acquiring human body region coordinate information of a detection result, and acquiring body coordinate information of an unworn safety belt or body coordinate information of a wearing safety belt of the detection result;
s42, assuming that the coordinate information of the human body area is A, and the coordinate information of the half body of the unworn safety belt or the coordinate information of the half body of the unworn safety belt is B, calculating the intersection ratio IOU:
further, the step S5 includes the steps of:
s51, if the intersection ratio of the human body region coordinate information and the half body coordinate information of the unworn safety belt is higher than a threshold value, judging that the constructor has the unworn safety belt, and executing a step S52, otherwise, detecting that the detection is invalid;
if the intersection ratio of the coordinate information of the human body region and the coordinate information of the half body of the wearing safety belt is higher than a threshold value, judging that the constructor wears the safety belt, otherwise, detecting that the detection is invalid;
s52, sending alarm information to the platform according to the detection result.
From the above description, once a person not wearing a safety belt is detected, an alarm is pushed to the designated platform to prompt the relevant personnel to handle it, eliminating the potential safety problem.
Referring to fig. 2, another embodiment of the present invention provides a high-altitude basket safety belt detection terminal, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, wherein the processor implements the steps of the high-altitude basket safety belt detection method when executing the computer program.
The high-altitude hanging basket safety belt detection method and terminal of the invention are suited to automatically detecting whether high-altitude workers are wearing their safety belts, without additional hardware cost, while improving detection accuracy. They are described through the following specific embodiments:
example 1
Referring to fig. 1, a method for detecting a safety belt of a high altitude basket includes the steps of:
s1, acquiring image data of a scene of a construction site, and selecting a detection target in the image data by using a labeling frame to obtain a scene database.
S11, setting a plurality of monitoring points for a construction site scene, randomly sampling the monitoring points, and collecting image data in actual construction.
Specifically, a plurality of monitoring points can be set up on the construction site and randomly sampled with high-definition cameras, collecting footage of workers during actual construction.
S12, carrying out frame selection on the image data through an annotation frame by utilizing an image annotation tool, and selecting detection targets in the image data in a frame mode to obtain a scene database. Specifically, the detection targets are divided into: the body, the half body wearing the safety belt and the half body not wearing the safety belt.
In this embodiment, an image annotation tool may be used to frame the detection targets to be recognized in the image data, saving the frame coordinates and category information to a file; the image data can also be screened at the same time.
S13, enhancing the image data in the scene database to expand the content of the image data.
Specifically, general enhancement processing, including rotation, cropping, filtering and similar methods, is applied to the image data, further expanding it and preventing the model from overfitting.
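As a rough illustration of this general enhancement step, the sketch below applies a rotation, a center crop, and a simple box-filter blur to an image array with NumPy. The exact transforms and parameters used by the patent are not specified, so these choices (90° rotation, 80% crop, 3×3 blur) are assumptions.

```python
import numpy as np

def augment(img: np.ndarray) -> list:
    """Return simple augmented variants of an H x W x C image array:
    a 90-degree rotation, a center crop, and a 3x3 box-filter blur.
    (Illustrative parameters; the patent does not fix them.)"""
    out = []
    # Rotation: 90 degrees counterclockwise in the image plane.
    out.append(np.rot90(img, k=1, axes=(0, 1)))
    # Cropping: keep the central ~80% of height and width.
    h, w = img.shape[:2]
    dh, dw = h // 10, w // 10
    out.append(img[dh:h - dh, dw:w - dw])
    # Filtering: 3x3 box blur via padded neighborhood averaging.
    padded = np.pad(img.astype(np.float32), ((1, 1), (1, 1), (0, 0)), mode="edge")
    blurred = sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    out.append(blurred.astype(img.dtype))
    return out
```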
S2, training on the scene database with the optimized YOLOv8 detection model to obtain the safety belt detection model. In this embodiment, the optimized YOLOv8 detection model introduces a fused multi-head self-attention mechanism into the feature extraction module of the YOLOv8 detection model, introduces Soft-NMS (Soft Non-Maximum Suppression) into the YOLOv8 detection model, and optimizes the loss function of the YOLOv8 detection model through adaptive weights.
And S21, splicing the plurality of pieces of image data, and adding the spliced image data into a scene database for training.
Specifically, in order to enhance the diversity of images, enrich background information, and increase the number of targets in batch processing, multiple pictures can be spliced together according to a certain rule to form a new picture, and the new picture is added into a scene database for training.
S211, determining the grid layout of the target spliced image.
Specifically, the goal of the safety belt detection model is to accurately determine whether a human body is wearing a safety belt. Excessive image stitching supplies a large number of targets without providing more features and instead reduces the overall performance of the model, so the grid layout in this embodiment is preferably 2×3 or 3×3.
S212, filling the image data in the grid layout in a counterclockwise direction to obtain spliced image data.
Specifically, after the grid layout is determined, the original images are filled into the grid in a counterclockwise direction: a point near the upper-left corner of the image's labeling frame (offset by 100 pixels in both the horizontal and vertical coordinates) is aligned with the upper-left corner of the grid cell, the original image is pasted into the cell, and the portion of the image extending beyond the cell is cut off.
S213, filling the labeling frame corresponding to the image data in the grid layout in a counterclockwise direction to form a labeling information matrix.
Specifically, the coordinate information of the corresponding labeling frame is converted in the same way to form a labeling information matrix.
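The grid stitching of steps S211 to S213 might be sketched as follows. The patent does not spell out the exact counterclockwise traversal or cell dimensions, so the perimeter-first ordering, the 240×320 cells, and the box format [x1, y1, x2, y2, class] here are all assumptions.

```python
import numpy as np

def stitch_grid(images, boxes_per_image, rows=2, cols=3, cell_h=240, cell_w=320):
    """Paste up to rows*cols images into a grid canvas (S212) and shift each
    image's (x1, y1, x2, y2, cls) boxes into canvas coordinates (S213).
    The fill order is one plausible counterclockwise traversal of the grid
    perimeter, with any interior cells appended at the end."""
    canvas = np.zeros((rows * cell_h, cols * cell_w, 3), dtype=np.uint8)
    order = ([(r, 0) for r in range(rows)]                       # left column, top->bottom
             + [(rows - 1, c) for c in range(1, cols)]           # bottom row, left->right
             + [(r, cols - 1) for r in range(rows - 2, -1, -1)]  # right column, bottom->top
             + [(0, c) for c in range(cols - 2, 0, -1)])         # top row, right->left
    order += [(r, c) for r in range(rows) for c in range(cols) if (r, c) not in order]
    all_boxes = []
    for (r, c), img, boxes in zip(order, images, boxes_per_image):
        h, w = min(img.shape[0], cell_h), min(img.shape[1], cell_w)
        y0, x0 = r * cell_h, c * cell_w
        canvas[y0:y0 + h, x0:x0 + w] = img[:h, :w]  # overflow beyond the cell is cut off
        for (x1, y1, x2, y2, cls) in boxes:
            all_boxes.append((min(x1, w) + x0, min(y1, h) + y0,
                              min(x2, w) + x0, min(y2, h) + y0, cls))
    return canvas, all_boxes
```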
S22, selecting YOLOv8 to train on the scene database and, in order to strengthen the model's capture of global information, introducing a fused multi-head self-attention mechanism into the model feature extraction module C2f.
Specifically, the YOLOv8 network structure contains a feature extraction module C2f with three bottleneck structures inside; the 3×3 convolution inside each bottleneck is replaced according to the following steps S221 to S226, i.e. the multi-head self-attention mechanism is introduced, forming a new bottleneck structure.
S221, each bottleneck in the C2f module outputs a feature x of format H×W×d, where H, W and d are the height, width and channel dimension of the feature matrix.
S222, the input feature map is projected with three 1×1 convolutions into a query vector q, a key vector k and a value vector v respectively.
S223, two learnable parameter vectors Rh and Rw are initialized, representing the position codes along the height and width of the two-dimensional feature map, and added through a broadcasting mechanism to obtain the position parameter r.
S224, q is matrix-multiplied with rᵀ to obtain qrᵀ, the association between pixel content and position, and q is matrix-multiplied with kᵀ to obtain qkᵀ, the association between pixel content and content.
S225, the matrices qrᵀ and qkᵀ are added, and softmax normalization is applied to the result, giving a processed matrix of format HW×HW.
S226, the output matrix of step S225 is matrix-multiplied with the value vector v to obtain a matrix F of size H×W×d; the feature output after each bottleneck then undergoes feature fusion.
Specifically, the HW×HW attention matrix is multiplied with the value v: after a reshape, the v matrix has size HW×d, the product has size HW×d, and a further reshape recovers the three-dimensional matrix H×W×d.
Because the self-attention mechanism models the input features globally, every output position receives global information, extracting correlations between pixel features more effectively than ordinary convolution.
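A single-head NumPy sketch of steps S221 to S226 follows. Since a 1×1 convolution is equivalent to applying a d×d matrix at every spatial position, the three convolutions are represented here by plain projection matrices Wq, Wk, Wv; all weight values are placeholders, and the multi-head and fusion details are omitted.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax (S225 normalization)."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_2d(x, Wq, Wk, Wv, Rh, Rw):
    """Single-head sketch of S221-S226 for one H x W x d feature map x.
    Wq/Wk/Wv (d x d) stand in for the three 1x1 convolutions; Rh (H x 1 x d)
    and Rw (1 x W x d) are the learnable height/width position codes."""
    H, W, d = x.shape
    q = x.reshape(-1, d) @ Wq        # HW x d query (S222)
    k = x.reshape(-1, d) @ Wk        # HW x d key
    v = x.reshape(-1, d) @ Wv        # HW x d value
    r = (Rh + Rw).reshape(-1, d)     # broadcast add -> HW x d position code (S223)
    logits = q @ r.T + q @ k.T       # content-position + content-content (S224-S225)
    attn = softmax(logits, axis=-1)  # HW x HW attention matrix
    return (attn @ v).reshape(H, W, d)  # matrix F, H x W x d (S226)
```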
S23, workers operate inside the hanging basket where space is cramped, so heavy overlap frequently occurs and traditional NMS easily makes mistakes; Soft-NMS is therefore introduced to further improve the detection effect.
S231, in the detection process, all the candidate frames are ordered in a descending order according to the confidence level of each detected target frame.
S232, selecting a candidate frame with the highest confidence coefficient, adding the candidate frame to a final output result list, and calculating the intersection ratio between the rest candidate frames and the selected frames.
S233, attenuating the confidence of the remaining candidate frames with a preset attenuation function of the computed intersection-over-union; if the attenuated confidence falls below a threshold, the candidate frame is discarded; otherwise it is retained and added to the final output result list. The attenuation formula is:
s_i = s_i · e^(−IoU(A, b_i)² / σ)
where s_i is the confidence, i indexes the remaining frames other than the highest-scoring frame A, b_i is the candidate frame being processed, and σ is the Gaussian function parameter.
S234, repeating the steps S232 and S233 until all the candidate frames are processed.
S235, obtaining a final output result list which comprises the filtered target frames.
Thus, although the NMS used in the prior art effectively filters duplicate frames, it easily suppresses the lower-scoring frame between two objects and reduces recall: whenever the intersection-over-union of two frames exceeds the IOU threshold, the NMS algorithm sets the frame's score directly to 0, which amounts to discarding it, and may cause that frame to be missed (in the hanging basket of this application scene, the narrow space makes workers' bodies overlap heavily, so a hard threshold easily causes missed detections). The Soft-NMS algorithm does not directly discard frames whose IOU exceeds the threshold; it lowers their score with the attenuation formula, so even a frame with high overlap may still serve as a correct detection frame in subsequent computation, raising the recall rate and reducing missed detections.
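A minimal Gaussian Soft-NMS sketch along the lines of steps S231 to S235; the decay constant sigma and the score threshold are illustrative values, not taken from the patent.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (S231-S235): rather than dropping overlapping
    boxes outright, decay their confidence by s_i *= exp(-IoU(A, b_i)^2 / sigma)
    and discard only those that fall below score_thresh."""
    boxes = [list(b) for b in boxes]
    scores = list(scores)
    keep = []
    while boxes:
        best = max(range(len(scores)), key=scores.__getitem__)   # S232: highest confidence
        A, sA = boxes.pop(best), scores.pop(best)
        keep.append((A, sA))
        for i in range(len(boxes)):                              # S233: Gaussian decay
            scores[i] *= np.exp(-iou(A, boxes[i]) ** 2 / sigma)
        boxes = [b for b, s in zip(boxes, scores) if s >= score_thresh]
        scores = [s for s in scores if s >= score_thresh]        # S234: repeat until empty
    return keep                                                  # S235: filtered frames
```

With two heavily overlapping boxes, the lower-scoring one survives with a decayed score instead of being discarded, which is exactly the recall benefit the text describes.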
S24, because clear, easily distinguished samples are plentiful while difficult samples are relatively few during data collection, the loss function is improved: easy and difficult samples are distinguished by the intersection-over-union between the prediction frame and the labeling frame, and an adaptive weight is introduced to optimize the loss function.
S241, using the loss function of the YOLOv8 standard to define the loss functions of the category and the regression frame respectively, and calculating the loss value.
S242, the intersection-over-union between the finally screened target frames and the labeling frames of the corresponding image data is computed, and the average intersection-over-union m is calculated; when m is smaller than 0.2, it is set to 0.2 to avoid extreme cases.
The labeling frame consists of the coordinates of the four corner points of the detection target obtained when the training data was annotated.
S243, calculating a weight value by using the following weight formula:
In the formula, x represents the label information of the labeling frame (the numeric value of the label class), μ represents the input average intersection-over-union, and ε represents a bias value, chosen as 0.1.
S244, multiplying the calculated weight value and the loss value to obtain a final loss value.
Because difficult samples are relatively few, the weight formula mainly tries to give difficult samples higher weight, increasing the weight, and thus the relative loss value, of misclassified samples near the threshold.
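The weight formula itself did not survive extraction, so the skeleton below only shows the surrounding logic of steps S242 to S244: clamping the mean IoU at 0.2 and scaling the standard loss by the computed weight. The actual weighting function is left as a caller-supplied placeholder.

```python
def adaptive_loss(base_loss, ious, weight_fn, floor=0.2):
    """S242-S244 skeleton: clamp the mean IoU at `floor`, feed it to a
    weighting function, and scale the standard YOLOv8 loss value.
    `weight_fn` stands in for the patent's (unrecovered) weight formula."""
    m = sum(ious) / len(ious) if ious else floor
    m = max(m, floor)                 # S242: "when m is smaller than 0.2, the value is 0.2"
    return weight_fn(m) * base_loss   # S243-S244: final loss = weight * loss
```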
S3, obtaining images to be detected which are shot by a plurality of cameras in the hanging basket.
And S4, detecting the image to be detected by using the safety belt detection model, and rechecking a detection result by using the relative position of the detection target.
S41, acquiring the human body region coordinate information of the detection result, and acquiring the half-body coordinate information of the unworn safety belt or the half-body coordinate information of the worn safety belt from the detection result.
S42, assuming that the human body region coordinate information is A, and the half-body coordinate information of the unworn safety belt or the half-body coordinate information of the worn safety belt is B, calculating the intersection ratio IOU = |A ∩ B| / |A ∪ B|.
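The recheck in S41-S42 can be sketched as follows; the box format (x1, y1, x2, y2), the function names, and the 0.8 threshold taken from this embodiment are the only assumptions:

```python
def box_iou(a, b):
    """Intersection ratio IOU = |A ∩ B| / |A ∪ B| for boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def recheck(person_box, half_body_box, wearing, thresh=0.8):
    """S51: a belt/no-belt detection counts only if the half-body box
    overlaps the human body region above the threshold; otherwise invalid."""
    if box_iou(person_box, half_body_box) > thresh:
        return "belt" if wearing else "no_belt"
    return "invalid"
```

A half-body box detected far from any person box thus cannot trigger an alarm on its own, which is the point of the relative-position recheck.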
and S5, judging whether the situation that the safety belt is not worn exists in the image to be detected according to the rechecking result, and if so, giving an alarm.
S51, if the intersection ratio of the human body region coordinate information and the half-body coordinate information of the unworn safety belt is higher than a threshold value, it is judged that the constructor has not worn the safety belt; otherwise the detection is judged invalid.
If the intersection ratio of the human body region coordinate information and the half-body coordinate information of the worn safety belt is higher than the threshold value, it is judged that the constructor has worn the safety belt; otherwise the detection is judged invalid.
In this embodiment, the threshold is set to 0.8.
S52, alarm information is sent to the platform according to the detection result, providing a timely reminder so as to reduce potential safety hazards.
In this way, after a person not wearing a safety belt is detected, an alarm can be pushed to the designated platform to urge the relevant personnel to handle it, eliminating potential safety problems.
Example two
Referring to fig. 2, a high-altitude hanging basket safety belt detection terminal 1 includes a memory 2, a processor 3, and a computer program stored in the memory 2 and executable on the processor 3; the processor 3 implements the steps of the high-altitude hanging basket safety belt detection method of the first embodiment when executing the computer program.
In summary, according to the high-altitude hanging basket safety belt detection method and terminal provided by the invention, image data of construction site scenes are collected and labeled in advance, and the optimized YOLOv8 neural network is trained to obtain a learned model; the images to be detected captured by the plurality of cameras in the hanging basket are acquired and processed with the learned model to obtain detection targets, and the coordinates of the detection targets are used for rechecking, so as to finally judge whether an unworn safety belt exists and give an alarm prompt, addressing construction-environment safety problems in time. In this way, the operating condition of the high-altitude hanging basket is monitored with the optimized detection model: the detection precision is high, monitoring can run uninterruptedly without manual labor, a series of safety problems are avoided, and the operating cost is low.
The foregoing description is only illustrative of the present invention and is not intended to limit the scope of the invention, and all equivalent changes made by the specification and drawings of the present invention, or direct or indirect application in the relevant art, are included in the scope of the present invention.

Claims (10)

1. The method for detecting the safety belt of the high-altitude hanging basket is characterized by comprising the following steps of:
s1, acquiring image data of a scene of a construction site, and selecting a detection target in the image data by using a labeling frame to obtain a scene database;
s2, training the scene database by using an optimized YOLOv8 detection model to obtain a safety belt detection model;
the optimized YOLOv8 detection model is as follows: introducing a fusion multi-head self-attention mechanism into a feature extraction module of the YOLOv8 detection model, introducing Soft-NMS into the YOLOv8 detection model, and optimizing a loss function of the YOLOv8 detection model through self-adaptive weight;
s3, obtaining images to be detected which are shot by a plurality of cameras in the hanging basket;
s4, detecting the image to be detected by using the safety belt detection model, and rechecking a detection result by using the relative position of the detection target;
and S5, judging whether the situation that the safety belt is not worn exists in the image to be detected according to the rechecking result, and if so, giving an alarm.
2. The method for detecting the safety belt of the overhead basket according to claim 1, wherein the step S1 comprises the following steps:
s11, setting a plurality of monitoring points for a construction site scene, randomly sampling the plurality of monitoring points, and collecting image data in actual construction;
s12, carrying out frame selection on the image data through an annotation frame by utilizing an image annotation tool, and selecting detection targets in the image data in a frame manner to obtain a scene database;
the detection targets are divided into: a human body, a half body wearing the safety belt and a half body not wearing the safety belt;
s13, enhancing the image data in the scene database to expand the content of the image data.
3. The method for detecting the safety belt of the overhead basket according to claim 1, wherein the step S2 comprises the following steps:
s21, splicing a plurality of pieces of image data, and adding the spliced image data into a scene database for training;
s22, selecting YOLOv8 to train the scene database, and introducing a fused multi-head self-attention mechanism into the model feature extraction module C2f;
s23, introducing soft-NMS into a YOLOv8 detection model;
s24, improving the loss function of the YOLOv8 detection model, and introducing an adaptive weight to optimize the loss function.
4. A method of detecting a high altitude basket harness as claimed in claim 3 wherein S21 comprises the steps of:
s211, determining the grid layout of the target spliced image;
s212, filling the image data in the grid layout in a counterclockwise direction to obtain spliced image data;
s213, filling the labeling frame corresponding to the image data in the grid layout in a counterclockwise direction to form a labeling information matrix.
5. A method of detecting a high altitude basket harness as claimed in claim 3 wherein S22 comprises the steps of:
s221, each bottleneck in the C2f module outputs a feature x, wherein x has the format H×W×d, representing respectively the height, width and channel dimension of the feature matrix;
s222, adopting three 1×1 convolutions to project the input feature map into a query vector q, a key vector k and a value vector v respectively;
s223, initializing two parameter vectors Rh and Rw used for learning, representing position codes of different positions of the height and width of the two-dimensional feature map, and then adding through a broadcasting mechanism to obtain a parameter r;
s224, performing matrix multiplication on q and r to obtain qrᵀ, representing the association information between pixel content and position, and performing matrix multiplication on q and k to obtain qkᵀ, representing the association information between pixel content and content;
s225, performing matrix addition on qrᵀ and qkᵀ, and applying softmax normalization to the resulting matrix, the processed matrix having the format HW×HW;
s226, performing matrix multiplication on the output matrix obtained in step S225 and the value vector v to obtain a matrix F, the size of the matrix F being H×W×d.
6. A method of detecting a high altitude basket harness as claimed in claim 3 wherein S23 comprises the steps of:
s231, in the detection process, ordering all the candidate frames in a descending order according to the confidence level of each detected target frame;
s232, selecting the candidate frame with the highest confidence, adding it to the final output result list, and calculating the intersection ratio between each remaining candidate frame and the selected frame;
s233, attenuating the confidence of the remaining candidate frames according to the calculated intersection ratio and a preset attenuation function, and discarding a candidate frame if its attenuated confidence is lower than a certain threshold; otherwise it is retained and added to the final output result list, the attenuation formula being:
sᵢ = sᵢ · exp(−IoU(A, bᵢ)² / σ)
where sᵢ represents the confidence, i is the sequence number of the remaining frames other than the highest-scoring frame A, bᵢ represents the candidate frame to be processed, and σ represents the Gaussian function parameter;
s234, repeating the steps S232 and S233 until all candidate frames are processed;
s235, obtaining a final output result list which comprises the filtered target frames.
7. A method of detecting a high altitude basket harness as claimed in claim 3 wherein S24 comprises the steps of:
s241, respectively defining a class and a regression frame loss function by using the YOLOv8 standard loss function, and calculating a loss value;
s242, calculating the intersection ratio between the target frames after the final screening and the labeling frames of the corresponding image data, and calculating the average intersection ratio m, wherein when m is smaller than 0.2, m is taken as 0.2;
s243, calculating a weight value:
wherein x represents the label information of the labeling frame (the numerical value of the label class), mu represents the input average intersection ratio value, and epsilon represents a bias value;
s244, multiplying the calculated weight value and the loss value to obtain a final loss value.
8. The method for detecting the safety belt of the overhead basket according to claim 1, wherein the step S4 comprises the following steps:
s41, acquiring human body region coordinate information of a detection result, and acquiring body coordinate information of an unworn safety belt or body coordinate information of a wearing safety belt of the detection result;
s42, assuming that the human body region coordinate information is A, and the half-body coordinate information of the unworn safety belt or the half-body coordinate information of the worn safety belt is B, calculating the intersection ratio IOU = |A ∩ B| / |A ∪ B|.
9. the method for detecting the safety belt of the overhead basket according to claim 8, wherein the step S5 comprises the following steps:
s51, if the intersection ratio of the human body region coordinate information and the half body coordinate information of the unworn safety belt is higher than a threshold value, judging that the constructor has the unworn safety belt, and executing a step S52, otherwise, detecting that the detection is invalid;
if the intersection ratio of the coordinate information of the human body region and the coordinate information of the half body of the wearing safety belt is higher than a threshold value, judging that the constructor wears the safety belt, otherwise, detecting that the detection is invalid;
s52, sending alarm information to the platform according to the detection result.
10. An overhead basket safety belt detection terminal comprising a memory, a processor and a computer program stored on the memory and operable on the processor, wherein the processor, when executing the computer program, carries out the steps of a method of overhead basket safety belt detection as claimed in any one of claims 1 to 9.
CN202410171627.7A 2024-02-07 2024-02-07 High-altitude hanging basket safety belt detection method and terminal Active CN117726991B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410171627.7A CN117726991B (en) 2024-02-07 2024-02-07 High-altitude hanging basket safety belt detection method and terminal


Publications (2)

Publication Number Publication Date
CN117726991A true CN117726991A (en) 2024-03-19
CN117726991B CN117726991B (en) 2024-05-24

Family

ID=90203789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410171627.7A Active CN117726991B (en) 2024-02-07 2024-02-07 High-altitude hanging basket safety belt detection method and terminal

Country Status (1)

Country Link
CN (1) CN117726991B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117994862A (en) * 2024-04-01 2024-05-07 金钱猫科技股份有限公司 Construction site fence crossing behavior detection method and device

Citations (9)

Publication number Priority date Publication date Assignee Title
CN114067365A (en) * 2021-11-23 2022-02-18 广东工业大学 Safety helmet wearing detection method and system based on central attention centripetal network
CN114694001A (en) * 2022-02-15 2022-07-01 北京深睿博联科技有限责任公司 Target detection method and device based on multi-modal image fusion
CN114973122A (en) * 2022-04-29 2022-08-30 韶关学院 Helmet wearing detection method based on improved YOLOv5
KR20220138620A (en) * 2021-04-06 2022-10-13 고려대학교 세종산학협력단 Method and Apparatus for Object Detection Using Model Ensemble
CN115240117A (en) * 2022-07-30 2022-10-25 福州大学 Helmet wearing detection method in construction site construction scene
WO2023015799A1 (en) * 2021-08-10 2023-02-16 中国科学院深圳先进技术研究院 Multimodal fusion obstacle detection method and apparatus based on artificial intelligence blindness guiding
CN116486155A (en) * 2023-04-25 2023-07-25 长春理工大学 Target detection method based on transducer and cascade characteristics
CN117392611A (en) * 2023-11-08 2024-01-12 广东工业大学 Site safety monitoring method, system, equipment and storage medium
CN117474863A (en) * 2023-10-31 2024-01-30 苏州鸿鹄骐骥电子科技有限公司 Chip surface defect detection method for compressed multi-head self-attention neural network




Similar Documents

Publication Publication Date Title
CN106960195B (en) Crowd counting method and device based on deep learning
CN117726991B (en) High-altitude hanging basket safety belt detection method and terminal
CN111062429A (en) Chef cap and mask wearing detection method based on deep learning
CN110414400B (en) Automatic detection method and system for wearing of safety helmet on construction site
CN108038424B (en) Visual automatic detection method suitable for high-altitude operation
CN110807353A (en) Transformer substation foreign matter identification method, device and system based on deep learning
CN109506628A (en) Object distance measuring method under a kind of truck environment based on deep learning
CN111091110B (en) Reflection vest wearing recognition method based on artificial intelligence
CN111222478A (en) Construction site safety protection detection method and system
CN114937232B (en) Wearing detection method, system and equipment for medical waste treatment personnel protective appliance
CN111062303A (en) Image processing method, system and computer storage medium
CN116152863B (en) Personnel information identification method and device, electronic equipment and storage medium
CN113642474A (en) Hazardous area personnel monitoring method based on YOLOV5
CN112685812A (en) Dynamic supervision method, device, equipment and storage medium
CN112819796A (en) Tobacco shred foreign matter identification method and equipment
CN114049325A (en) Construction method and application of lightweight face mask wearing detection model
CN116259002A (en) Human body dangerous behavior analysis method based on video
CN115423995A (en) Lightweight curtain wall crack target detection method and system and safety early warning system
CN116052082A (en) Power distribution station room anomaly detection method and device based on deep learning algorithm
CN116092115A (en) Real-time lightweight construction personnel safety dressing detection method
CN114662208A (en) Construction visualization system and method based on Bim technology
CN116311081B (en) Medical laboratory monitoring image analysis method and system based on image recognition
CN117423157A (en) Mine abnormal video action understanding method combining migration learning and regional invasion
CN108363967A (en) A kind of categorizing system of remote sensing images scene
CN112529836A (en) High-voltage line defect detection method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant