CN116229570B - Aloft work personnel behavior situation identification method based on machine vision - Google Patents

Aloft work personnel behavior situation identification method based on machine vision

Info

Publication number
CN116229570B
CN116229570B (application CN202310147997.2A)
Authority
CN
China
Prior art keywords
safety belt
unit
frame
deep learning
safety
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310147997.2A
Other languages
Chinese (zh)
Other versions
CN116229570A (en)
Inventor
陈明举
兰中孝
熊兴中
宋竑森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University of Science and Engineering
Original Assignee
Sichuan University of Science and Engineering
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University of Science and Engineering filed Critical Sichuan University of Science and Engineering
Priority to CN202310147997.2A priority Critical patent/CN116229570B/en
Publication of CN116229570A publication Critical patent/CN116229570A/en
Application granted granted Critical
Publication of CN116229570B publication Critical patent/CN116229570B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Social Psychology (AREA)
  • Medical Informatics (AREA)
  • Psychiatry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Emergency Lowering Means (AREA)

Abstract

The invention discloses a machine vision-based method for identifying the behavior situation of aerial working personnel, which comprises the following steps: S1, identifying the operator, safety helmet, safety belt and safety belt hook in a real-time picture; S2, judging whether the safety helmet and the safety belt are worn correctly; if so, entering step S3, otherwise carrying out early warning output; S3, judging whether the hanging mode of the safety belt is correct; if so, entering step S4, otherwise carrying out early warning output; S4, judging whether the state of the safety belt hook is normal; if so, judging that the behavior situation of the aerial working personnel is normal, otherwise carrying out early warning output. On the basis of deep-learning target detection, the invention integrates scene recognition, the intersection-over-union ratio, logical judgment functions, colorimetric-space target extraction and morphological processing to recognize illegal operation behaviors of power operation in specific scenes and carry out corresponding early warning output, thereby effectively improving the convenience of supervising aerial work.

Description

Aloft work personnel behavior situation identification method based on machine vision
Technical Field
The invention relates to the field of safety construction management, in particular to a machine vision-based high-altitude operation personnel behavior situation identification method.
Background
Power grid operations often take place in dangerous environments such as high altitude and high voltage, which threaten the safety of power workers, so the industry has formulated a series of regulations to standardize the operating behavior of constructors. In practice, however, constructors frequently work without wearing a helmet, without fastening a safety belt or without using the hook for the sake of convenience. To ensure the safety of constructors to a certain extent, construction units usually dispatch patrol personnel to inspect the site. But aerial work takes place high above the ground, where ground personnel can hardly supervise aerial personnel with the naked eye, and patrolling back and forth also wastes human resources; the difficulty of supervising aerial work safety has therefore become a hidden danger of the industry.
Disclosure of Invention
Aiming at the above defects in the prior art, the invention provides a machine vision-based method for identifying the behavior situation of aerial working personnel, which solves the problem that the safety supervision of aerial work is difficult.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
the invention provides a machine vision-based high-altitude operator behavior situation identification method, which comprises the following steps:
s1, acquiring a real-time picture, and identifying an operator, a safety helmet, a safety belt and a safety belt hook in the real-time picture through a deep learning network;
s2, judging whether the safety helmet and the safety belt are worn correctly, if so, entering a step S3; otherwise, early warning output is carried out;
s3, judging whether the hanging mode of the safety belt is correct, if so, entering a step S4; otherwise, early warning output is carried out;
s4, judging whether the state of the safety belt hook is normal, if so, judging that the behavior situation of the overhead working personnel is normal; otherwise, early warning output is carried out.
Further, the method for constructing the deep learning network in step S1 includes:
replacing an SPPF module of the yolov5 network with an SPPFCSPC module, replacing a coupling head of the yolov5 network with a decoupling head, and optimizing by adopting a SIOU loss function; the SPPFCSPC module comprises a first convolution unit and a second convolution unit which are connected in parallel; the second convolution unit is sequentially connected with a third convolution unit and a fourth convolution unit, and the output end of the fourth convolution unit is respectively connected with the first maximum pooling unit and the first fusion unit; the output end of the first maximum pooling unit is respectively connected with the input ends of the second maximum pooling unit and the first fusion unit; the output end of the second maximum pooling unit is respectively connected with the input ends of the third maximum pooling unit and the first fusion unit; the output end of the third maximum pooling unit is connected with the input end of the first fusion unit; the output end of the first fusion unit is sequentially connected with a fifth convolution unit and a sixth convolution unit; the output end of the sixth convolution unit and the output end of the first convolution unit are respectively connected with the second fusion unit; the output end of the second fusion unit is connected with a seventh convolution unit; the input ends of the first convolution unit and the second convolution unit are jointly used as the input end of the SPPFCSPC module; the output end of the seventh convolution unit is the output end of the SPPFCSPC module.
Further, the specific method for optimizing by using the SIOU loss function comprises the following steps:
a1, taking an image containing a safety helmet, a safety belt and/or a safety belt hook as a training sample, and acquiring a real frame of the training sample; inputting training samples into a deep learning network to obtain a prediction frame;
A2, according to the formula:

$$\Lambda = 1 - 2\sin^2\left(\arcsin\frac{C_h}{\sigma} - \frac{\pi}{4}\right)$$

acquiring the angle loss Λ of the deep learning network; wherein $C_h=\max(b_{cy}^{gt},b_{cy})-\min(b_{cy}^{gt},b_{cy})$ is the height difference between the real-frame center point and the predicted-frame center point; π is the circumference ratio; $\sigma=\sqrt{(b_{cx}^{gt}-b_{cx})^2+(b_{cy}^{gt}-b_{cy})^2}$ is the distance between the center points of the real frame and the predicted frame; $(b_{cx}^{gt},b_{cy}^{gt})$ are the center coordinates of the real frame; $(b_{cx},b_{cy})$ are the center coordinates of the predicted frame; max(·) takes the maximum value; min(·) takes the minimum value;

A3, according to the formula:

$$\Delta = \sum_{t=x,y}\left(1 - e^{-\gamma\rho_t}\right),\qquad \rho_x=\left(\frac{b_{cx}^{gt}-b_{cx}}{X_w}\right)^2,\quad \rho_y=\left(\frac{b_{cy}^{gt}-b_{cy}}{X_h}\right)^2$$

obtaining the distance loss Δ of the deep learning network; where γ = 2 − Λ; ρ_x and ρ_y are intermediate parameters, ρ_t ∈ {ρ_x, ρ_y}; X_w and X_h are respectively the width and the height of the minimum circumscribed rectangle of the real frame and the prediction frame;

A4, according to the formula:

$$\Omega = \sum_{t=w,h}\left(1 - e^{-W_t}\right)^{\theta},\qquad W_w=\frac{|w-w^{gt}|}{\max(w,w^{gt})},\quad W_h=\frac{|h-h^{gt}|}{\max(h,h^{gt})}$$

acquiring the shape loss Ω of the deep learning network; wherein e is the natural constant; w, h, w^gt and h^gt are respectively the width and the height of the prediction frame and of the real frame; θ is a constant; W_w and W_h are intermediate parameters, W_t ∈ {W_w, W_h};

A5, according to the formula:

$$Loss_{SIOU} = 1 - IOU + \frac{\Delta + \Omega}{2}$$

obtaining the comprehensive loss value Loss_SIOU of the deep learning network; IOU is the ratio of the intersection to the union of the prediction frame and the real frame;

A6, optimizing the parameters of the deep learning network according to the comprehensive loss value Loss_SIOU, completing the optimization of the deep learning network.
Further, the specific method for judging whether the safety helmet and the safety belt are worn correctly in the step S2 is as follows:
acquiring the region R(x_p) of the operator, the region R(x_h) of the safety helmet and the region R(x_s) of the safety belt, and according to the formula:

$$IOU_X = \frac{R(x_p)\cap R(x_X)}{R(x_p)\cup R(x_X)},\qquad X\in\{h,s\}$$

obtaining the discrimination value IOU_X of whether the safety helmet and the safety belt are worn correctly; wherein ∩ represents the intersection and ∪ represents the union;
judging whether IOU_X is larger than the corresponding threshold value; if so, judging that the safety helmet and the safety belt are worn correctly; otherwise, carrying out early warning output.
Further, the specific method for judging whether the hanging mode of the safety belt is correct in the step S3 includes the following sub-steps:
s3-1, acquiring an identification frame containing a safety belt, and independently extracting pictures in the identification frame;
s3-2, performing binarization processing on the extracted picture to obtain a binary image;
S3-3, calculating the center coordinates (X_0, Y_0) of the safety belt region and the coordinates (X_e, Y_e) of the safety belt pixel farthest from the center coordinates;
S3-4, according to the formula:
T=|y1-y2|/4
acquiring the judgment threshold T of the hanging mode of the safety belt; wherein y1 and y2 are respectively the ordinates of the upper and lower boundaries of the safety belt identification frame;
S3-5, if Y_0 − Y_e < −T holds, judging that the safety belt is hung high and used low, namely safe; if Y_0 − Y_e > T holds, judging that the safety belt is hung low and used high, namely carrying out early warning output.
Further, the specific method of step S3-2 comprises the following sub-steps:
S3-2-1, obtaining the center r_i and the radius R_0 of the sphere in the RGB color space in which the safety belt color lies, and the RGB components a_i of each pixel of the extracted picture; wherein i = 1, 2, 3;
S3-2-2, according to the formula:

$$b_i = \begin{cases} 0, & \sum_{j=1}^{3}(a_j - r_j)^2 > R_0^2 \\ a_i, & \text{otherwise} \end{cases}$$

performing color slicing on the extracted picture to obtain the RGB components b_i after color slicing;
S3-2-3, binarizing the image formed by the RGB components b_i, representing white by 1 and black by 0, to obtain the binary image after color slicing;
S3-2-4, removing the burrs and holes of the binary image after color slicing through the morphological opening operation to obtain the binary image.
Further, the specific method for judging whether the state of the safety belt hook is normal in the step S4 is as follows:
constructing a safety belt hook data set from self-captured images to perform transfer learning on the yolov5 network, and adopting the yolov5 network after transfer learning to identify the safety belt hook state; the safety belt hook state comprises 5 categories: normal, hung on a thin wire, hung obliquely, buckle not closed, and buckle tied with a line;
when the state of the safety belt hook is hung on a thin wire, hung obliquely, buckle not closed or buckle tied with a line, early warning output is carried out.
The beneficial effects of the invention are as follows: on the basis of deep-learning target detection, the invention integrates scene recognition, the intersection-over-union ratio, logical judgment functions, colorimetric-space target extraction and morphological processing to recognize illegal operation behaviors of power operation in specific scenes and carry out corresponding early warning output, thereby effectively improving the convenience of supervising aerial work.
Drawings
FIG. 1 is a schematic flow chart of the method;
FIG. 2 is a schematic diagram of a deep learning network;
FIG. 3 is a schematic view of SPPFCSPC structure;
FIG. 4 is a schematic diagram of a decoupling head configuration;
FIG. 5 is a schematic diagram of recognition results of a deep learning network according to an embodiment;
FIG. 6 is a belt identification chart in an embodiment;
FIG. 7 is a schematic diagram of a color slice binarization process according to an embodiment;
FIG. 8 is a schematic diagram of the morphological open operation according to the embodiment;
FIG. 9 is a diagram showing the result of identifying the coordinates of the belt region according to the embodiment;
FIG. 10 is a schematic diagram of early warning output in an embodiment;
fig. 11 is a schematic diagram of the recognition result being safe in the embodiment.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding of the present invention by those skilled in the art. It should be understood, however, that the invention is not limited to the scope of these embodiments; for those skilled in the art, all inventions making use of the inventive concept fall within the protection scope defined by the appended claims.
As shown in fig. 1, the method for identifying the behavior situation of the aerial working personnel based on machine vision comprises the following steps:
s1, acquiring a real-time picture, and identifying an operator, a safety helmet, a safety belt and a safety belt hook in the real-time picture through a deep learning network;
s2, judging whether the safety helmet and the safety belt are worn correctly, if so, entering a step S3; otherwise, early warning output is carried out;
s3, judging whether the hanging mode of the safety belt is correct, if so, entering a step S4; otherwise, early warning output is carried out;
s4, judging whether the state of the safety belt hook is normal, if so, judging that the behavior situation of the overhead working personnel is normal; otherwise, early warning output is carried out.
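For illustration, the decision cascade of steps S1 to S4 can be sketched as the following minimal Python fragment. The detector and the three check functions are hypothetical stand-ins injected as callables (their concrete forms follow in the text below); this is a reading aid under those assumptions, not part of the patented method itself.

```python
def assess_worker(frame, detect, wear_ok, hanging_ok, hook_ok, warn=print):
    """Minimal sketch of the S1-S4 cascade; all callables are hypothetical
    stand-ins (detect = the improved yolov5 of S1, the *_ok predicates =
    the checks of S2-S4)."""
    detections = detect(frame)                # S1: person, helmet, belt, hook
    if not wear_ok(detections):               # S2: wearing check (IoU based)
        warn("helmet or safety belt not worn correctly")
        return False
    if not hanging_ok(detections, frame):     # S3: hung high and used low?
        warn("safety belt hung low and used high")
        return False
    if not hook_ok(detections, frame):        # S4: hook state classification
        warn("abnormal safety belt hook state")
        return False
    return True                               # behavior situation normal
```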
The method for constructing the deep learning network in the step S1 comprises the following steps: replacing an SPPF module of the yolov5 network with an SPPFCSPC module, replacing a coupling head of the yolov5 network with a decoupling head, and optimizing by adopting a SIOU loss function; the SPPFCSPC module comprises a first convolution unit and a second convolution unit which are connected in parallel; the second convolution unit is sequentially connected with a third convolution unit and a fourth convolution unit, and the output end of the fourth convolution unit is respectively connected with the first maximum pooling unit and the first fusion unit; the output end of the first maximum pooling unit is respectively connected with the input ends of the second maximum pooling unit and the first fusion unit; the output end of the second maximum pooling unit is respectively connected with the input ends of the third maximum pooling unit and the first fusion unit; the output end of the third maximum pooling unit is connected with the input end of the first fusion unit; the output end of the first fusion unit is sequentially connected with a fifth convolution unit and a sixth convolution unit; the output end of the sixth convolution unit and the output end of the first convolution unit are respectively connected with the second fusion unit; the output end of the second fusion unit is connected with a seventh convolution unit; the input ends of the first convolution unit and the second convolution unit are jointly used as the input end of the SPPFCSPC module; the output end of the seventh convolution unit is the output end of the SPPFCSPC module.
The specific method for optimizing by adopting the SIOU loss function comprises the following steps:
a1, taking an image containing a safety helmet, a safety belt and/or a safety belt hook as a training sample, and acquiring a real frame of the training sample; inputting training samples into a deep learning network to obtain a prediction frame;
A2, according to the formula:

$$\Lambda = 1 - 2\sin^2\left(\arcsin\frac{C_h}{\sigma} - \frac{\pi}{4}\right)$$

acquiring the angle loss Λ of the deep learning network; wherein $C_h=\max(b_{cy}^{gt},b_{cy})-\min(b_{cy}^{gt},b_{cy})$ is the height difference between the real-frame center point and the predicted-frame center point; π is the circumference ratio; $\sigma=\sqrt{(b_{cx}^{gt}-b_{cx})^2+(b_{cy}^{gt}-b_{cy})^2}$ is the distance between the center points of the real frame and the predicted frame; $(b_{cx}^{gt},b_{cy}^{gt})$ are the center coordinates of the real frame; $(b_{cx},b_{cy})$ are the center coordinates of the predicted frame; max(·) takes the maximum value; min(·) takes the minimum value;

A3, according to the formula:

$$\Delta = \sum_{t=x,y}\left(1 - e^{-\gamma\rho_t}\right),\qquad \rho_x=\left(\frac{b_{cx}^{gt}-b_{cx}}{X_w}\right)^2,\quad \rho_y=\left(\frac{b_{cy}^{gt}-b_{cy}}{X_h}\right)^2$$

obtaining the distance loss Δ of the deep learning network; where γ = 2 − Λ; ρ_x and ρ_y are intermediate parameters, ρ_t ∈ {ρ_x, ρ_y}; X_w and X_h are respectively the width and the height of the minimum circumscribed rectangle of the real frame and the prediction frame;

A4, according to the formula:

$$\Omega = \sum_{t=w,h}\left(1 - e^{-W_t}\right)^{\theta},\qquad W_w=\frac{|w-w^{gt}|}{\max(w,w^{gt})},\quad W_h=\frac{|h-h^{gt}|}{\max(h,h^{gt})}$$

acquiring the shape loss Ω of the deep learning network; wherein e is the natural constant; w, h, w^gt and h^gt are respectively the width and the height of the prediction frame and of the real frame; θ is a constant that controls the degree of attention paid to the shape loss; to avoid paying excessive attention to the shape loss and thereby restricting the movement of the prediction frame, θ = 2 is taken; W_w and W_h are intermediate parameters, W_t ∈ {W_w, W_h};

A5, according to the formula:

$$Loss_{SIOU} = 1 - IOU + \frac{\Delta + \Omega}{2}$$

obtaining the comprehensive loss value Loss_SIOU of the deep learning network; IOU is the ratio of the intersection to the union of the prediction frame and the real frame;

A6, optimizing the parameters of the deep learning network according to the comprehensive loss value Loss_SIOU, completing the optimization of the deep learning network.
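Steps A1 to A6 condense into a single scalar loss. The sketch below follows the published SIoU formulation on which the reconstructed formulas above are based; the box layout (x1, y1, x2, y2) is an assumption, since the text does not fix a tensor format.

```python
import math

def siou_loss(pred, gt, theta=2.0, eps=1e-7):
    """Sketch of steps A1-A6; boxes are (x1, y1, x2, y2) tuples (assumed)."""
    # Widths, heights and center points of the predicted and real frames
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    w_gt, h_gt = gt[2] - gt[0], gt[3] - gt[1]
    cx, cy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_gt, cy_gt = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2

    # Plain IoU of the two frames
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    iou = inter / (w * h + w_gt * h_gt - inter + eps)

    # A2: angle loss; C_h is the height gap between centers, sigma the center distance
    c_h = max(cy_gt, cy) - min(cy_gt, cy)
    sigma = math.hypot(cx_gt - cx, cy_gt - cy) + eps
    lam = 1 - 2 * math.sin(math.asin(min(c_h / sigma, 1.0)) - math.pi / 4) ** 2

    # A3: distance loss over the minimum circumscribed rectangle (X_w, X_h)
    x_w = max(pred[2], gt[2]) - min(pred[0], gt[0]) + eps
    x_h = max(pred[3], gt[3]) - min(pred[1], gt[1]) + eps
    gamma = 2 - lam
    rho_x = ((cx_gt - cx) / x_w) ** 2
    rho_y = ((cy_gt - cy) / x_h) ** 2
    delta = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # A4: shape loss with theta = 2 as stated in the description
    w_w = abs(w - w_gt) / max(w, w_gt)
    w_h = abs(h - h_gt) / max(h, h_gt)
    omega = (1 - math.exp(-w_w)) ** theta + (1 - math.exp(-w_h)) ** theta

    # A5: comprehensive loss value
    return 1 - iou + (delta + omega) / 2
```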
The specific method for judging whether the safety helmet and the safety belt are worn correctly in step S2 is as follows: acquiring the region R(x_p) of the operator, the region R(x_h) of the safety helmet and the region R(x_s) of the safety belt, and according to the formula:

$$IOU_X = \frac{R(x_p)\cap R(x_X)}{R(x_p)\cup R(x_X)},\qquad X\in\{h,s\}$$

obtaining the discrimination value IOU_X of whether the safety helmet and the safety belt are worn correctly; wherein ∩ represents the intersection and ∪ represents the union;
judging whether IOU_X is larger than the corresponding threshold value; if so, judging that the safety helmet and the safety belt are worn correctly; otherwise, carrying out early warning output. If IOU_X = 0, this indicates that an object is not being worn at all (e.g., it has been put aside).
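A minimal sketch of this S2 check, under the reconstruction above, computes the intersection-over-union of the worker box and an equipment box. The box format and the per-item thresholds are assumptions; the text only gives the intersection/union formula.

```python
def wear_discrimination(person_box, item_box, eps=1e-7):
    """IoU between the worker region R(x_p) and an equipment region R(x_X);
    boxes are (x1, y1, x2, y2). A value of 0 means the item is detected
    but not on the person (e.g., put aside)."""
    iw = max(0.0, min(person_box[2], item_box[2]) - max(person_box[0], item_box[0]))
    ih = max(0.0, min(person_box[3], item_box[3]) - max(person_box[1], item_box[1]))
    inter = iw * ih
    area_p = (person_box[2] - person_box[0]) * (person_box[3] - person_box[1])
    area_x = (item_box[2] - item_box[0]) * (item_box[3] - item_box[1])
    return inter / (area_p + area_x - inter + eps)
```

The thresholds against which IOU_X is compared are not given in the text and would be tuned on annotated wearing images.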
The specific method for judging whether the hanging mode of the safety belt is correct in the step S3 comprises the following substeps:
s3-1, acquiring an identification frame containing a safety belt, and independently extracting pictures in the identification frame;
s3-2, performing binarization processing on the extracted picture to obtain a binary image;
S3-3, calculating the center coordinates (X_0, Y_0) of the safety belt region and the coordinates (X_e, Y_e) of the safety belt pixel farthest from the center coordinates;
S3-4, according to the formula:
T=|y1-y2|/4
acquiring the judgment threshold T of the hanging mode of the safety belt; wherein y1 and y2 are respectively the ordinates of the upper and lower boundaries of the safety belt identification frame;
S3-5, if Y_0 − Y_e < −T holds, judging that the safety belt is hung high and used low, namely safe; if Y_0 − Y_e > T holds, judging that the safety belt is hung low and used high, namely carrying out early warning output.
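Steps S3-3 to S3-5 can be sketched as follows on the belt binary image. Image coordinates grow downward, so Y_0 − Y_e < −T means the farthest belt pixel hangs well below the region center (hung high, used low). The handling of the indeterminate middle band |Y_0 − Y_e| ≤ T is not specified in the text and is an assumption here.

```python
import numpy as np

def check_hanging_mode(binary, y1, y2):
    """Sketch of S3-3..S3-5. `binary` is the 0/1 image of the extracted
    belt region; y1, y2 are the ordinates of the upper and lower edges
    of the belt identification frame."""
    ys, xs = np.nonzero(binary)              # coordinates of belt pixels
    if len(xs) == 0:
        return "no belt pixels found"
    x0, y0 = xs.mean(), ys.mean()            # center (X0, Y0) of the belt region
    d2 = (xs - x0) ** 2 + (ys - y0) ** 2     # squared distance to the center
    k = int(np.argmax(d2))
    ye = ys[k]                               # farthest belt pixel ordinate Ye

    t = abs(y1 - y2) / 4                     # judgment threshold T
    if y0 - ye < -t:
        return "hung high, used low: safe"
    if y0 - ye > t:
        return "hung low, used high: early warning"
    return "indeterminate"
```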
The specific method of the step S3-2 comprises the following substeps:
S3-2-1, obtaining the center r_i and the radius R_0 of the sphere in the RGB color space in which the safety belt color lies, and the RGB components a_i of each pixel of the extracted picture; wherein i = 1, 2, 3;
S3-2-2, according to the formula:

$$b_i = \begin{cases} 0, & \sum_{j=1}^{3}(a_j - r_j)^2 > R_0^2 \\ a_i, & \text{otherwise} \end{cases}$$

performing color slicing on the extracted picture to obtain the RGB components b_i after color slicing;
S3-2-3, binarizing the image formed by the RGB components b_i, representing white by 1 and black by 0, to obtain the binary image after color slicing;
S3-2-4, removing the burrs and holes of the binary image after color slicing through the morphological opening operation to obtain the binary image.
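The color slicing plus opening of steps S3-2-1 to S3-2-4 admits a compact OpenCV sketch. The sphere center, radius and the 5×5 opening kernel are assumptions; the text fixes only the slicing rule and the use of the morphological opening operation.

```python
import cv2
import numpy as np

def belt_binary_image(crop, center_rgb, radius):
    """Sketch of S3-2-1..S3-2-4 on the extracted belt picture `crop` (BGR).
    `center_rgb` (r_i) and `radius` (R_0) describe the sphere in RGB space
    containing the belt color; concrete values are assumptions."""
    rgb = cv2.cvtColor(crop, cv2.COLOR_BGR2RGB).astype(np.float32)
    # S3-2-2: pixels outside the sphere around the belt color are sliced away
    dist2 = ((rgb - np.asarray(center_rgb, np.float32)) ** 2).sum(axis=2)
    inside = dist2 <= float(radius) ** 2
    # S3-2-3: white (1) for belt-colored pixels, black (0) elsewhere
    binary = inside.astype(np.uint8)
    # S3-2-4: morphological opening removes burrs and small holes
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
```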
The specific method for judging whether the state of the safety belt hook is normal in the step S4 is as follows:
constructing a safety belt hook data set from self-captured images to perform transfer learning on the yolov5 network, and adopting the yolov5 network after transfer learning to identify the safety belt hook state; the safety belt hook state comprises 5 categories: normal, hung on a thin wire, hung obliquely, buckle not closed, and buckle tied with a line;
when the state of the safety belt hook is hung on a thin wire, hung obliquely, buckle not closed or buckle tied with a line, early warning output is carried out.
In a specific implementation process, the deep learning network of the method takes yolov5 as the base network and improves its backbone, detection head structure and loss function, so as to accurately identify targets such as persons, safety belts, safety belt hooks and safety helmets in electric power operation scenes. In the feature extraction part of the backbone, SPPFCSPC is designed to replace the SPPF of the original network, which enlarges the receptive field of the backbone, improves the model's ability to extract important deep features, and reduces the feature loss incurred during feature processing. Secondly, a decoupled head replaces the coupled head structure of the original network: the decoupling operation computes the confidence and the regression frame separately, alleviating the negative influence of the conflict between the classification and regression tasks, thereby improving detection precision and accelerating network convergence. Finally, the SIOU (Scylla Intersection Over Union) loss function is introduced, which further considers the vector angle between the real frame and the predicted frame, optimizing the loss function and accelerating model convergence. The optimized yolov5 (deep learning network) is shown in fig. 2, wherein Backbone: the backbone network; Neck: the neck network; Decoupled Head: the decoupled detection head; CBS: a convolution module that applies convolution, batch normalization and an activation function to the input feature map; C3_x: a backbone residual feature learning module comprising 3 standard convolution layers and x residual modules; SPPFCSPC: the fast spatial pyramid pooling module; Upsample: an up-sampling module; Concat: a feature fusion module that stacks two feature layers of the same size and adds their channel numbers; C3_1_F: a neck feature learning module.
The SPPF module applies pooling layers repeatedly, which loses a large amount of target detail information, so that small individual targets such as the hook are treated as background and ultimately go undetected. The method therefore modifies the last SPPF module of the YOLOv5s Backbone into SPPFCSPC, whose structure is shown in fig. 3. The SPPFCSPC module takes a 512×20×20 feature map as input, passes it serially through three maximum pooling windows of size 5×5 together with the associated convolution operations, and finally fuses the result with the data of the parallel 1×1 convolution branch, yielding a feature map of size 512×20×20. Through maximum pooling, SPPFCSPC effectively enlarges the receptive field, making the algorithm suitable for images of different resolutions, and obtains multi-scale target information while keeping the feature map size unchanged. In fig. 3, Maxpool2d: the maximum pooling module, where k is the window size (k = 5, i.e. a 5×5 window), s is the window stride and p is the edge padding value; Conv: the convolution module, where k is the convolution kernel size, s is the stride and p is the edge padding value; Concat: the fusion module, which stacks two feature layers of the same size and adds their channel numbers.
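Wiring the unit-by-unit description of the SPPFCSPC module (first to seventh convolution units, three serial maximum pooling units, two fusion units) gives the following PyTorch sketch. The hidden channel width is an assumption; the text fixes the topology and the 512×20×20 input/output sizes but not the intermediate widths.

```python
import torch
import torch.nn as nn

class CBS(nn.Module):
    """Conv + BatchNorm + SiLU, matching the CBS unit of fig. 2."""
    def __init__(self, c_in, c_out, k=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, 1, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class SPPFCSPC(nn.Module):
    """Sketch of the SPPFCSPC module wired as units 1-7 in the description;
    hidden width c_h is an assumption."""
    def __init__(self, c_in, c_out, c_h=None, k=5):
        super().__init__()
        c_h = c_h or c_out
        self.cv1 = CBS(c_in, c_h, 1)        # first convolution unit (parallel branch)
        self.cv2 = CBS(c_in, c_h, 1)        # second convolution unit
        self.cv3 = CBS(c_h, c_h, 3)         # third convolution unit
        self.cv4 = CBS(c_h, c_h, 1)         # fourth convolution unit
        self.mp = nn.MaxPool2d(k, stride=1, padding=k // 2)  # 5x5 pooling window
        self.cv5 = CBS(4 * c_h, c_h, 1)     # fifth convolution unit (after fusion 1)
        self.cv6 = CBS(c_h, c_h, 3)         # sixth convolution unit
        self.cv7 = CBS(2 * c_h, c_out, 1)   # seventh convolution unit (after fusion 2)

    def forward(self, x):
        y = self.cv4(self.cv3(self.cv2(x)))
        m1 = self.mp(y)                     # first maximum pooling unit
        m2 = self.mp(m1)                    # second maximum pooling unit
        m3 = self.mp(m2)                    # third maximum pooling unit
        y = self.cv6(self.cv5(torch.cat([y, m1, m2, m3], dim=1)))   # first fusion
        return self.cv7(torch.cat([y, self.cv1(x)], dim=1))         # second fusion

# e.g. SPPFCSPC(512, 512) keeps a 512x20x20 input at the same spatial size
```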
The structure of the decoupled head (Decoupled Head) is shown in fig. 4. For an input feature layer, the decoupled head first reduces its dimension with a 1×1 convolution, then applies two 3×3 convolutions in each of the classification and regression branches, with the parallel channels carrying out the object classification and target-frame coordinate regression tasks respectively. This processing yields three outputs: Cls, the category corresponding to the target frame; Reg, the position information of the target frame; and Obj, whether each feature point contains an object. The three output values are fused to obtain the final prediction information. By computing the confidence and the regression frame separately, the decoupling operation alleviates the negative influence of the conflict between the classification and regression tasks at the cost of a slight increase in computational complexity, thereby improving network detection precision and accelerating network convergence.
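The head structure just described maps to the following sketch for one feature level; the class count and hidden width are assumptions, since the text specifies only the 1×1 reduction, the two 3×3 convolutions per branch, and the Cls/Reg/Obj outputs.

```python
import torch
import torch.nn as nn

def cbs(c_in, c_out, k):
    """Conv + BatchNorm + SiLU block (the CBS unit of fig. 2)."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, 1, k // 2, bias=False),
        nn.BatchNorm2d(c_out),
        nn.SiLU(),
    )

class DecoupledHead(nn.Module):
    """Sketch of the decoupled head of fig. 4 for one feature level."""
    def __init__(self, c_in, num_classes, c_mid=256):
        super().__init__()
        self.stem = cbs(c_in, c_mid, 1)                 # 1x1 dimension reduction
        self.cls_branch = nn.Sequential(cbs(c_mid, c_mid, 3), cbs(c_mid, c_mid, 3))
        self.reg_branch = nn.Sequential(cbs(c_mid, c_mid, 3), cbs(c_mid, c_mid, 3))
        self.cls_pred = nn.Conv2d(c_mid, num_classes, 1)  # Cls: target category
        self.reg_pred = nn.Conv2d(c_mid, 4, 1)            # Reg: frame coordinates
        self.obj_pred = nn.Conv2d(c_mid, 1, 1)            # Obj: objectness

    def forward(self, x):
        x = self.stem(x)
        c = self.cls_branch(x)                          # classification branch
        r = self.reg_branch(x)                          # regression branch
        # fuse Reg, Obj and Cls along the channel dimension as the prediction
        return torch.cat([self.reg_pred(r), self.obj_pred(r), self.cls_pred(c)], dim=1)
```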
The above targets are identified by the optimized yolov5, and the regions of the identified targets are recorded; the identification results are shown in fig. 5. The optimized yolov5 adds an identification frame for each identified target and gives the corresponding discrimination value.
In one embodiment of the present invention, the picture in an identification frame containing a safety belt is extracted, giving the picture shown in fig. 6. Color-slice binarization of this picture gives the picture shown in fig. 7, and the further morphological opening operation gives the binary image shown in fig. 8; the identification result of the safety belt region coordinates in the binary image is shown in fig. 9. If the hanging mode of the safety belt is hung low and used high, as shown in fig. 10, early warning output is carried out. If the safety belt is hung high and used low, as shown in fig. 11, it is judged to be safe.
Aiming at the problem of an insufficient amount of hook data, the identification of the hook state performs transfer learning on the basis of the yolov5 network, and the SCUT-hook data set is constructed from self-captured images. The recognition network adopts the trained yolov5 for transfer learning: all convolution layers of the pre-trained network are frozen, and only the top-layer parameters are trained and fine-tuned with the collected and annotated wearing-equipment images. After the parameters are adjusted, the network can be used to identify the safety belt hook.
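A minimal sketch of this freeze-and-fine-tune setup follows. How many top blocks remain trainable and the optimizer settings are assumptions; the text says only that the pre-trained convolution layers are frozen and the top-layer parameters are fine-tuned.

```python
import torch
from torch import nn

def freeze_for_transfer(model: nn.Module, num_top_blocks: int = 2):
    """Freeze the pretrained layers of a yolov5-style model and return an
    optimizer over the remaining top blocks; counts and hyperparameters
    are assumptions."""
    blocks = list(model.children())
    for block in blocks[:-num_top_blocks]:       # frozen pretrained layers
        for p in block.parameters():
            p.requires_grad = False
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.SGD(trainable, lr=1e-3, momentum=0.937)
```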
In summary, on the basis of deep-learning target detection, the invention integrates scene recognition, the intersection-over-union ratio, logical judgment functions, colorimetric-space target extraction and morphological processing to recognize illegal operation behaviors of power operation in specific scenes and carry out corresponding early warning output, effectively improving the convenience of supervising aerial work.

Claims (5)

1. The method for identifying the behavior situation of the aerial working personnel based on the machine vision is characterized by comprising the following steps of:
s1, acquiring a real-time picture, and identifying an operator, a safety helmet, a safety belt and a safety belt hook in the real-time picture through a deep learning network;
s2, judging whether the safety helmet and the safety belt are worn correctly, if so, entering a step S3; otherwise, early warning output is carried out;
s3, judging whether the hanging mode of the safety belt is correct, if so, entering a step S4; otherwise, early warning output is carried out;
s4, judging whether the state of the safety belt hook is normal, if so, judging that the behavior situation of the overhead working personnel is normal; otherwise, early warning output is carried out;
the specific method for judging whether the safety helmet and the safety belt are worn correctly in the step S2 is as follows:
acquiring the region R(x_p) of the operator, the region R(x_h) of the safety helmet and the region R(x_s) of the safety belt, and according to the formula:

$$IOU_X = \frac{R(x_p)\cap R(x_X)}{R(x_p)\cup R(x_X)},\qquad X\in\{h,s\}$$

obtaining the discrimination value IOU_X of whether the safety helmet and the safety belt are worn correctly; wherein ∩ represents the intersection and ∪ represents the union;
judging whether IOU_X is larger than the corresponding threshold value; if so, judging that the safety helmet and the safety belt are worn correctly; otherwise, carrying out early warning output;
the method for constructing the deep learning network in the step S1 comprises the following steps:
replacing an SPPF module of the yolov5 network with an SPPFCSPC module, replacing a coupling head of the yolov5 network with a decoupling head, and optimizing by adopting a SIOU loss function; the SPPFCSPC module comprises a first convolution unit and a second convolution unit which are connected in parallel; the second convolution unit is sequentially connected with a third convolution unit and a fourth convolution unit, and the output end of the fourth convolution unit is respectively connected with the first maximum pooling unit and the first fusion unit; the output end of the first maximum pooling unit is respectively connected with the input ends of the second maximum pooling unit and the first fusion unit; the output end of the second maximum pooling unit is respectively connected with the input ends of the third maximum pooling unit and the first fusion unit; the output end of the third maximum pooling unit is connected with the input end of the first fusion unit; the output end of the first fusion unit is sequentially connected with a fifth convolution unit and a sixth convolution unit; the output end of the sixth convolution unit and the output end of the first convolution unit are respectively connected with the second fusion unit; the output end of the second fusion unit is connected with a seventh convolution unit; the input ends of the first convolution unit and the second convolution unit are jointly used as the input end of the SPPFCSPC module; the output end of the seventh convolution unit is the output end of the SPPFCSPC module.
2. The machine vision-based aerial work personnel behavior situation recognition method according to claim 1, wherein the specific method for optimizing by using the SIOU loss function comprises the following steps:
a1, taking an image containing a safety helmet, a safety belt and/or a safety belt hook as a training sample, and acquiring a real frame of the training sample; inputting training samples into a deep learning network to obtain a prediction frame;
A2, according to the formula:

$$\Lambda = 1 - 2\sin^2\left(\arcsin\frac{C_h}{\sigma} - \frac{\pi}{4}\right)$$

acquiring the angle loss Λ of the deep learning network; wherein $C_h=\max(b_{cy}^{gt},b_{cy})-\min(b_{cy}^{gt},b_{cy})$ is the height difference between the real-frame center point and the predicted-frame center point; π is the circumference ratio; $\sigma=\sqrt{(b_{cx}^{gt}-b_{cx})^2+(b_{cy}^{gt}-b_{cy})^2}$ is the distance between the center points of the real frame and the predicted frame; $(b_{cx}^{gt},b_{cy}^{gt})$ are the center coordinates of the real frame; $(b_{cx},b_{cy})$ are the center coordinates of the predicted frame; max(·) takes the maximum value; min(·) takes the minimum value;

A3, according to the formula:

$$\Delta = \sum_{t=x,y}\left(1 - e^{-\gamma\rho_t}\right),\qquad \rho_x=\left(\frac{b_{cx}^{gt}-b_{cx}}{X_w}\right)^2,\quad \rho_y=\left(\frac{b_{cy}^{gt}-b_{cy}}{X_h}\right)^2$$

obtaining the distance loss Δ of the deep learning network; where γ = 2 − Λ; ρ_x and ρ_y are intermediate parameters, ρ_t ∈ {ρ_x, ρ_y}; X_w and X_h are respectively the width and the height of the minimum circumscribed rectangle of the real frame and the prediction frame;

A4, according to the formula:

$$\Omega = \sum_{t=w,h}\left(1 - e^{-W_t}\right)^{\theta},\qquad W_w=\frac{|w-w^{gt}|}{\max(w,w^{gt})},\quad W_h=\frac{|h-h^{gt}|}{\max(h,h^{gt})}$$

acquiring the shape loss Ω of the deep learning network; wherein e is the natural constant; w, h, w^gt and h^gt are respectively the width and the height of the prediction frame and of the real frame; θ is a constant; W_w and W_h are intermediate parameters, W_t ∈ {W_w, W_h};

A5, according to the formula:

$$Loss_{SIOU} = 1 - IOU + \frac{\Delta + \Omega}{2}$$

obtaining the comprehensive loss value Loss_SIOU of the deep learning network; IOU is the ratio of the intersection to the union of the prediction frame and the real frame;

A6, optimizing the parameters of the deep learning network according to the comprehensive loss value Loss_SIOU, completing the optimization of the deep learning network.
3. The machine vision-based high-altitude operation personnel behavior situation recognition method according to claim 1, wherein the specific method for judging whether the hanging mode of the safety belt is correct in the step S3 comprises the following sub-steps:
s3-1, acquiring an identification frame containing a safety belt, and independently extracting pictures in the identification frame;
s3-2, performing binarization processing on the extracted picture to obtain a binary image;
S3-3, calculating the center coordinates (X_0, Y_0) of the safety belt region and the coordinates (X_e, Y_e) of the safety belt pixel farthest from the center coordinates;
S3-4, according to the formula:
T=|y1-y2|/4
acquiring the judgment threshold T of the hanging mode of the safety belt; wherein y1 and y2 are respectively the ordinates of the upper and lower boundaries of the safety belt identification frame;
S3-5, if Y_0 − Y_e < −T holds, judging that the safety belt is hung high and used low, namely safe; if Y_0 − Y_e > T holds, judging that the safety belt is hung low and used high, namely carrying out early warning output.
4. The machine vision-based aerial working personnel behavior situation recognition method of claim 3, wherein the specific method of step S3-2 comprises the following sub-steps:
S3-2-1, obtaining the center r_i and the radius R_0 of the sphere in the RGB color space in which the safety belt color lies, and the RGB components a_i of each pixel of the extracted picture; wherein i = 1, 2, 3;
S3-2-2, according to the formula:

$$b_i = \begin{cases} 0, & \sum_{j=1}^{3}(a_j - r_j)^2 > R_0^2 \\ a_i, & \text{otherwise} \end{cases}$$

performing color slicing on the extracted picture to obtain the RGB components b_i after color slicing;
S3-2-3, binarizing the image formed by the RGB components b_i, representing white by 1 and black by 0, to obtain the binary image after color slicing;
S3-2-4, removing the burrs and holes of the binary image after color slicing through the morphological opening operation to obtain the binary image.
5. The machine vision-based high-altitude operation personnel behavior situation recognition method according to claim 1, wherein the specific method for judging whether the state of the safety belt hook is normal in the step S4 is as follows:
constructing a safety belt hook data set from self-captured images to perform transfer learning on the yolov5 network, and adopting the yolov5 network after transfer learning to identify the safety belt hook state; the safety belt hook state comprises 5 categories: normal, hung on a thin wire, hung obliquely, buckle not closed, and buckle tied with a line;
when the state of the safety belt hook is hung on a thin wire, hung obliquely, buckle not closed or buckle tied with a line, early warning output is carried out.
CN202310147997.2A 2023-02-21 2023-02-21 Aloft work personnel behavior situation identification method based on machine vision Active CN116229570B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310147997.2A CN116229570B (en) 2023-02-21 2023-02-21 Aloft work personnel behavior situation identification method based on machine vision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310147997.2A CN116229570B (en) 2023-02-21 2023-02-21 Aloft work personnel behavior situation identification method based on machine vision

Publications (2)

Publication Number Publication Date
CN116229570A (en) 2023-06-06
CN116229570B (en) 2024-01-23

Family

ID=86576416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310147997.2A Active CN116229570B (en) 2023-02-21 2023-02-21 Aloft work personnel behavior situation identification method based on machine vision

Country Status (1)

Country Link
CN (1) CN116229570B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117235650B (en) * 2023-11-13 2024-02-13 国网浙江省电力有限公司温州供电公司 Method, device, equipment and medium for detecting high-altitude operation state

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160440A (en) * 2019-12-24 2020-05-15 广东省智能制造研究所 Helmet wearing detection method and device based on deep learning
CN111931573A (en) * 2020-07-07 2020-11-13 南京南瑞信息通信科技有限公司 Helmet detection and early warning method based on YOLO evolution deep learning model
CN112016502A (en) * 2020-09-04 2020-12-01 平安国际智慧城市科技股份有限公司 Safety belt detection method and device, computer equipment and storage medium
KR20210006722A (en) * 2019-07-09 2021-01-19 주식회사 케이티 Apparatus, method and computer program for determining whether safety equipment is worn
CN112784821A (en) * 2021-03-06 2021-05-11 深圳市安比智慧科技有限公司 Building site behavior safety detection and identification method and system based on YOLOv5
CN113553977A (en) * 2021-07-30 2021-10-26 国电汉川发电有限公司 Improved YOLO V5-based safety helmet detection method and system
CN113688865A (en) * 2021-07-19 2021-11-23 青岛科技大学 Helmet wearing identification detection based on improved Yolo v3
CN113822383A (en) * 2021-11-23 2021-12-21 北京中超伟业信息安全技术股份有限公司 Unmanned aerial vehicle detection method and system based on multi-domain attention mechanism
CN115565232A (en) * 2022-10-24 2023-01-03 广东电网有限责任公司广州供电局 Power distribution room switch cabinet face part identification method based on improved YOLOv5 algorithm
CN115690675A (en) * 2022-10-12 2023-02-03 大连海洋大学 ESB-YOLO model cultured fish shoal detection method based on channel non-dimensionality reduction attention mechanism and improved YOLOv5

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20210006722A (en) * 2019-07-09 2021-01-19 주식회사 케이티 Apparatus, method and computer program for determining whether safety equipment is worn
CN111160440A (en) * 2019-12-24 2020-05-15 广东省智能制造研究所 Helmet wearing detection method and device based on deep learning
CN111931573A (en) * 2020-07-07 2020-11-13 南京南瑞信息通信科技有限公司 Helmet detection and early warning method based on YOLO evolution deep learning model
CN112016502A (en) * 2020-09-04 2020-12-01 平安国际智慧城市科技股份有限公司 Safety belt detection method and device, computer equipment and storage medium
CN112784821A (en) * 2021-03-06 2021-05-11 深圳市安比智慧科技有限公司 Building site behavior safety detection and identification method and system based on YOLOv5
CN113688865A (en) * 2021-07-19 2021-11-23 青岛科技大学 Helmet wearing identification detection based on improved Yolo v3
CN113553977A (en) * 2021-07-30 2021-10-26 国电汉川发电有限公司 Improved YOLO V5-based safety helmet detection method and system
CN113822383A (en) * 2021-11-23 2021-12-21 北京中超伟业信息安全技术股份有限公司 Unmanned aerial vehicle detection method and system based on multi-domain attention mechanism
CN115690675A (en) * 2022-10-12 2023-02-03 大连海洋大学 ESB-YOLO model cultured fish shoal detection method based on channel non-dimensionality reduction attention mechanism and improved YOLOv5
CN115565232A (en) * 2022-10-24 2023-01-03 广东电网有限责任公司广州供电局 Power distribution room switch cabinet face part identification method based on improved YOLOv5 algorithm

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Safety Helmet Detection Based on YOLOv5; Fangbo Zhou et al.; 2021 IEEE International Conference on Power Electronics, Computer Applications (ICPECA); pp. 6-11 *
基于YOLO的安全帽检测方法 (Safety helmet detection method based on YOLO); Lin Jun et al.; 计算机系统应用 (Computer Systems & Applications); pp. 174-179 *

Also Published As

Publication number Publication date
CN116229570A (en) 2023-06-06

Similar Documents

Publication Publication Date Title
CN113553977B (en) Improved YOLO V5-based safety helmet detection method and system
CN109684959B (en) Video gesture recognition method and device based on skin color detection and deep learning
CN111145165A (en) Rubber seal ring surface defect detection method based on machine vision
CN113553979B (en) Safety clothing detection method and system based on improved YOLO V5
CN112115775B (en) Smoke sucking behavior detection method based on computer vision under monitoring scene
CN112183482A (en) Dangerous driving behavior recognition method, device and system and readable storage medium
CN1932847A (en) Method for detecting colour image human face under complex background
CN116229570B (en) Aloft work personnel behavior situation identification method based on machine vision
CN110929593A (en) Real-time significance pedestrian detection method based on detail distinguishing and distinguishing
CN106909884B (en) Hand region detection method and device based on layered structure and deformable part model
CN111553214B (en) Method and system for detecting smoking behavior of driver
CN110057820B (en) Method, system and storage medium for on-line detection of chlorine-hydrogen ratio of hydrogen chloride synthesis furnace
CN112733914B (en) Underwater target visual identification classification method based on support vector machine
CN114331986A (en) Dam crack identification and measurement method based on unmanned aerial vehicle vision
CN108564034A (en) The detection method of operating handset behavior in a kind of driver drives vehicle
CN113158850A (en) Ship driver fatigue detection method and system based on deep learning
CN112101260A (en) Method, device, equipment and storage medium for identifying safety belt of operator
CN116665011A (en) Coal flow foreign matter identification method for coal mine belt conveyor based on machine vision
Reddy et al. A Deep Learning Model for Traffic Sign Detection and Recognition using Convolution Neural Network
Wali et al. Shape matching and color segmentation based traffic sign detection system
CN114241542A (en) Face recognition method based on image stitching
CN113128476A (en) Low-power consumption real-time helmet detection method based on computer vision target detection
CN117456358A (en) Method for detecting plant diseases and insect pests based on YOLOv5 neural network
CN114092572A (en) Clothing color analysis method, system, storage medium and computer equipment
CN113052234A (en) Jade classification method based on image features and deep learning technology

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant