CN111985387A - Helmet wearing early warning method and system based on deep learning - Google Patents

Helmet wearing early warning method and system based on deep learning

Info

Publication number
CN111985387A
CN111985387A
Authority
CN
China
Prior art keywords
data
image
early warning
safety helmet
frames
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010824956.9A
Other languages
Chinese (zh)
Inventor
唐标
李博
沈映泉
于辉
李婷
黄绪勇
朱梦梦
秦雄鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of Yunnan Power Grid Co Ltd
Original Assignee
Electric Power Research Institute of Yunnan Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of Yunnan Power Grid Co Ltd filed Critical Electric Power Research Institute of Yunnan Power Grid Co Ltd
Priority to CN202010824956.9A priority Critical patent/CN111985387A/en
Publication of CN111985387A publication Critical patent/CN111985387A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/49 - Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G - PHYSICS
    • G08 - SIGNALLING
    • G08B - SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00 - Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/18 - Status alarms
    • G08B21/182 - Level alarms, e.g. alarms responsive to variables exceeding a threshold
    • G - PHYSICS
    • G08 - SIGNALLING
    • G08B - SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00 - Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/18 - Status alarms
    • G08B21/24 - Reminder alarms, e.g. anti-loss alarms

Abstract

The application provides a deep-learning-based safety helmet wearing early warning method and system. The method first acquires video data and extracts any n frames of images from it; it then determines feature region information in each image based on a pre-constructed helmet wearing state confirmation model, the feature region information comprising a position and a label, where the label is either safety helmet or head. It next determines the number of frames m, among the n frames, in which the label at the same position is head, calculates the ratio of m to n, and judges whether the ratio exceeds a threshold; if so, early warning information is sent out. Because the detected images are extracted at random, false alarms caused by single-frame detection errors are largely avoided. In addition, the head and the safety helmet are detected directly, without first segmenting a human body image, which reduces the complexity of the detection steps and improves detection timeliness. The method and system thus address the untimely and inaccurate detection of existing approaches.

Description

Helmet wearing early warning method and system based on deep learning
Technical Field
The application relates to the technical field of building supervision, in particular to a safety helmet wearing early warning method and system based on deep learning.
Background
A safety helmet protects a construction worker's head from injury by falling objects. Therefore, to ensure worker safety, it is necessary to monitor whether construction workers are wearing their helmets.
In the prior art, a cloud camera is usually installed on the work site to monitor the field situation; the transmitted video is checked by a model, and once a person without a safety helmet is detected, the system sends out early warning information. However, this scheme first identifies the position of the human body, then segments the human body region, and only then identifies whether the head is wearing a helmet. Because recognition is performed twice, the processing steps are complicated, which delays feedback on whether workers are wearing helmets; and because the detection is instantaneous and one-shot, false alarms occur easily and monitoring accuracy is low. A false alarm brings confusion to the work site, affects work progress and efficiency, and can even affect the economic benefit of a project.
Therefore, the problems of untimely detection and low accuracy in existing approaches need to be solved.
Disclosure of Invention
The application provides a deep-learning-based safety helmet wearing early warning method and system that address the untimely and inaccurate detection of existing approaches.
In a first aspect, the application provides a safety helmet wearing early warning method based on deep learning, which comprises the steps of
Acquiring video data, wherein the duration of the video data is less than or equal to a preset acquisition duration;
extracting any n frames of images in the video data;
determining characteristic region information in the image based on a pre-constructed safety helmet wearing state confirmation model, wherein the characteristic region information comprises a position and a label, and the label comprises a safety helmet and a head;
determining the number of frames m, among the n frames of images, in which the label corresponding to the same position is the head, calculating the ratio of the frame number m to the frame number n, judging whether the ratio exceeds a threshold value, and if so, sending out early warning information;
and repeatedly executing the step of acquiring the video data at intervals of preset execution time.
Optionally, the method further comprises:
storing the image of which the label is the head, together with the shooting time of the image.
Optionally, the method for constructing the helmet wearing state confirmation model includes:
acquiring sample data;
preprocessing the sample data to obtain preprocessed data;
training a convolutional neural network through the preprocessed data to generate a safety helmet wearing state confirmation model;
wherein the sample data is an image of a worker wearing a helmet and a worker not wearing a helmet.
Optionally, the method for preprocessing the sample data to obtain preprocessed data includes:
reading data of the sample data to obtain read data;
performing data enhancement on the read data to obtain enhanced data;
carrying out data annotation on the enhanced data to obtain annotated data;
performing data cleaning on the marked data to obtain cleaning data;
and performing data sorting on the cleaning data to obtain preprocessing data.
Optionally, the read data comprises the sample data and two-dimensional coordinates of all pixels of the sample data; the method for enhancing the read data to obtain the enhanced data comprises the following steps:
inputting the two-dimensional coordinates of all pixels in the sample data to an affine matrix M to obtain enhanced coordinates of all the pixels in the sample data;
obtaining the enhancement data according to the enhancement coordinates;
wherein the affine matrix M has the formula:
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad M = \begin{bmatrix} a_{00} & a_{01} & b_{00} \\ a_{10} & a_{11} & b_{10} \\ 0 & 0 & 1 \end{bmatrix}$$
where $x$ is the abscissa of the pixel, $y$ is the ordinate of the pixel, $u$ is the abscissa of the pixel after data enhancement, $v$ is the ordinate of the pixel after data enhancement, and $a_{00}$, $a_{01}$, $a_{10}$, $a_{11}$, $b_{00}$ and $b_{10}$ are the enhancement parameters.
Optionally, the method for determining the enhancement parameter includes:
acquiring and reading an experimental image to obtain read information;
carrying out transformation operation on the experimental image to obtain transformation information;
determining the enhancement parameters according to the reading information and the transformation information;
wherein the transformation operations include scaling, rotation, shear, and translation.
Optionally, the method for performing data annotation on the enhanced data to obtain annotated data includes:
determining the locations of the safety helmet and the head in the enhanced data;
and respectively marking the position of the safety helmet or the head to obtain marked data.
Optionally, the method for performing data cleaning on the labeled data to obtain cleaned data includes:
and removing, from the annotated data, the data without feature region information and the data whose definition does not meet the preset qualified standard, to obtain the cleaned data.
Optionally, the convolutional neural network comprises a Yolov4 network structure.
In a second aspect, the present application provides a headgear wearing warning system comprising: the system comprises a video acquisition module and a server, wherein the server comprises an image extraction module, a state judgment module, a result confirmation module and a pre-warning broadcast module which are sequentially connected; the video acquisition module is connected with the image extraction module;
the video acquisition module is used for acquiring the video data of a construction site and transmitting the video data to the image extraction module;
the image extraction module is used for extracting the image from the video data and transmitting the image to the state judgment module;
the state judgment module is used for confirming a characteristic region in the image, determining the characteristic region information and transmitting the characteristic region information to the result confirmation module;
the result confirmation module is used for judging whether constructors who do not wear safety helmets exist in the images according to the characteristic region information, obtaining a judgment result and transmitting the judgment result to the early warning and broadcasting module;
and the early warning broadcasting module is used for sending early warning information according to the judgment result.
According to the technical scheme, the method first acquires video data, then extracts any n frames of images from it, and then determines the positions and labels of feature regions in the images based on a pre-constructed helmet wearing state confirmation model, where the labels comprise safety helmet and head. It then determines the number of frames m, among the n frames, in which the label at the same position is head, calculates the ratio of m to n, and judges whether the ratio exceeds a threshold; if it does, early warning information is sent out. The application also provides a safety helmet wearing early warning system that can monitor whether constructors on a construction site wear safety helmets and issue warnings. Because the detected images are extracted at random, and because the decision compares the fraction of frames showing an unhelmeted head against a preset ratio rather than relying on a single frame, the possibility of false alarms caused by detection errors is markedly reduced. In addition, the head and the safety helmet in the image are detected directly, without segmenting the human body image, which reduces the complexity of the detection steps, improves detection timeliness and efficiency, saves resource cost, and also improves the safety factor and operating efficiency of the construction site. In conclusion, the method and system solve the problems of untimely detection and low accuracy in existing approaches.
Drawings
In order to explain the technical solution of the present application more clearly, the drawings needed in the embodiments are briefly described below; those skilled in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic flow chart of a safety helmet wearing warning method based on deep learning in an embodiment of the present application;
fig. 2 is a schematic structural diagram of a safety helmet wearing warning system in an embodiment of the present application;
FIG. 3 is a schematic flow chart illustrating a method for constructing a helmet wearing state confirmation model according to an embodiment of the present disclosure;
FIG. 4 is a schematic flow chart illustrating another method for constructing a helmet wearing state confirmation model according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a Yolov4 network structure in an embodiment of the present application;
fig. 6 is a schematic structural diagram of a backbone network structure in the embodiment of the present application.
Wherein: 1-video acquisition module; 2-server; 21-image extraction module; 22-state judgment module; 23-result confirmation module; 24-early warning broadcast module.
Detailed Description
Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described below do not represent all embodiments consistent with the present application; they are merely examples of systems and methods consistent with certain aspects of the application, as recited in the claims.
Referring to fig. 1, fig. 1 is a flowchart of a helmet wearing warning method based on deep learning in an embodiment of the present application. As can be seen from fig. 1, the method for early warning of wearing of a safety helmet based on deep learning in the present application includes:
s110: and acquiring video data, wherein the duration of the video data is less than or equal to the preset acquisition duration.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a safety helmet wearing warning system in an embodiment of the present application. Early warning system is worn to safety helmet includes video acquisition module 1 and server 2, and server 2 is including the image extraction module 21, state judgment module 22, result confirmation module 23 and the early warning of connecting gradually reports module 24.
The video acquisition module 1 captures video data of a construction worker at a construction site, and transmits the video data to the image extraction module 21. Specifically, the video capture module 1 is a camera or other device capable of capturing video data. The installation positions and the number of the cameras can be determined according to the area of a construction site, the environment of the construction site and the number of constructors, and the method is not particularly limited. In addition, the value range of the preset acquisition duration is [ time including at least 2 frames, 1 second ], wherein the time of each frame is determined according to the definition of the camera, and in some embodiments, the preset acquisition duration is 1 second.
S120: and extracting any n frames of images in the video data.
The image extraction module 21 receives the transmitted video data, extracts an arbitrary n-frame image from the video data, and transmits the n-frame image to the state judgment module 22. Specifically, the value range of the image frame number n is [2 frames, total frame number in the preset acquisition duration ], wherein the total frame number in the preset acquisition duration is determined according to the definition of the camera. In some embodiments, the number of image frames n is 5.
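As an illustration only (not part of the claimed method), the random sampling of n frames might be sketched in Python with OpenCV as follows; the function name and the default n = 5 are assumptions made for the example:

```python
import random

import cv2


def sample_frames(video_path, n=5):
    """Randomly sample n frames from a short clip (illustrative sketch)."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    if total < n:
        raise ValueError("clip contains fewer than n frames")
    frames = []
    for idx in sorted(random.sample(range(total), n)):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # seek to the chosen frame
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames
```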
S130: determining characteristic region information in the image based on a pre-constructed safety helmet wearing state confirmation model, wherein the characteristic region information comprises a position and a label, and the label comprises a safety helmet and a head.
The state judgment module 22 can confirm the feature regions in the image and can determine feature region information including the number, positions and labels of the feature regions, and transmit the feature region information to the result confirmation module 23. Specifically, the state judgment module 22 is installed with a pre-constructed helmet wearing state confirmation model capable of recognizing target features in the image, the target features including "helmet" representing a helmet on the head of a constructor wearing the helmet and "head" representing the head of a constructor not wearing the helmet.
The characteristic region is a region defined for the target characteristic by the helmet wearing state confirmation model. In some embodiments, the shape of the feature region is a square, and the position of the feature region refers to the coordinates of the four vertices of the square. The label of the characteristic area is determined according to the target characteristic, when the target characteristic is 'safety helmet', the label of the characteristic area is correspondingly determined as the safety helmet, and when the target characteristic is 'head', the label of the characteristic area is correspondingly determined as the head.
S140: determining the number m of frames of the image of which the label corresponding to the same position in the n frames of the image is the head, calculating the ratio of the number m of frames to the number n of frames, judging whether the ratio exceeds a threshold value, and if so, sending out early warning information.
The result confirmation module 23 receives the feature area information, determines whether there is a constructor who does not wear a helmet in the image according to the feature area information, obtains a determination result, and transmits the determination result to the early warning broadcast module 24. Specifically, the result confirmation module 23 may determine, according to different positions, the number m of frames of the image with the tag at the position being the head in the n frames of images, calculate a ratio of the number m of frames to the number n of frames, determine whether the ratio exceeds a threshold, and if the ratio exceeds the threshold, indicate that at least one constructor in the image does not wear the safety helmet. Since the preset acquisition time is short, the constructor basically does not move in the time, and therefore, for any feature region, the position of the feature region changes very little between n frames of images, and the position of the feature region can be approximately considered to be the same in the n frames of images. In addition, the threshold value may range from [0.5, 1], and in some embodiments, the threshold value may be 0.8.
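For illustration, a minimal sketch of this frame-ratio decision is given below, assuming per-frame detections are already available as (box, label) pairs; the helper name, the 20-pixel quantization grid used to match "the same position" across frames, and the default threshold are assumptions of the sketch, not limitations of the method:

```python
def should_warn(detections_per_frame, n, threshold=0.8):
    """detections_per_frame: n lists of (box, label) pairs, where label is
    'head' (no helmet) or 'helmet' and box is (x1, y1, x2, y2)."""
    head_counts = {}
    for frame_dets in detections_per_frame:
        for box, label in frame_dets:
            if label != "head":
                continue
            # boxes barely move within the short clip, so quantizing the
            # coordinates lets the "same position" match across frames
            key = tuple(round(c / 20) for c in box)
            head_counts[key] = head_counts.get(key, 0) + 1
    # warn if any position is labelled 'head' in more than threshold of frames
    return any(m / n > threshold for m in head_counts.values())
```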
After receiving the judgment result, the early warning broadcast module 24 sends out early warning information on the construction site according to the judgment result. In some embodiments, the early warning broadcast module 24 may also transmit the warning information to the relevant workers in the form of a client program or a short message, so that workers not wearing safety helmets are reminded while good construction order is maintained on site.
S150: and repeatedly executing the step of acquiring the video data at intervals of preset execution time.
The video acquisition module 1 repeatedly executes step S110 at intervals of the preset execution time. Specifically, in order to secure the safety of the construction worker and to increase the safety awareness of the construction worker, it is necessary to continuously and stably monitor whether the construction worker wears the helmet on the construction site, so that S110 to S140 need to be repeatedly performed.
The preset execution time is set according to a requirement, and the application is not particularly limited.
In some embodiments, the deep-learning-based helmet wearing early warning method further includes storing each image whose feature region label is the head, together with the shooting time of the image. The purpose is to support subsequent review and accountability and to verify the accuracy of detection.
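As a hedged illustration of this storage step, the following sketch saves a violation frame with a timestamped file name; the directory name is arbitrary, and the save time stands in for the shooting time, which in practice would come from the video metadata:

```python
import os
from datetime import datetime

import cv2


def archive_violation(image, out_dir="violations"):
    """Save a frame whose feature region is labelled 'head' for later
    review and accountability (illustrative sketch)."""
    os.makedirs(out_dir, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    cv2.imwrite(os.path.join(out_dir, f"no-helmet-{stamp}.jpg"), image)
```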
In some embodiments, referring to fig. 3, fig. 3 is a schematic flowchart of a method for constructing a helmet wearing state confirmation model in an embodiment of the present application, where the method for constructing the helmet wearing state confirmation model includes:
s210: and acquiring sample data.
The sample data is an image of a worker wearing a helmet and a worker not wearing a helmet. Specifically, the sample data may be images of a worker wearing a helmet and a worker not wearing a helmet at different construction sites or industrial scenes.
S220: and preprocessing the sample data to obtain preprocessed data.
After sample data is acquired, the sample data needs to be preprocessed. In particular, the purpose of the pre-processing is to process the sample data into data that can train the convolutional neural network.
S230: and training a convolutional neural network through the preprocessed data to generate a safety helmet wearing state confirmation model.
The convolutional neural network is trained with the preprocessed data; training stops when the loss function gradually converges, and the optimal parameter combination at that moment is saved.
Specifically, the convolutional neural network only needs a small amount of manual work in the training process, and the convolutional neural network can effectively learn corresponding target features from a large amount of sample data, so that a complex target feature extraction process is avoided, and the model construction efficiency is improved.
In some embodiments, the preprocessing in the method for establishing the safety helmet wearing state confirmation model can more accurately simulate the situation that a camera collects construction site images, more accurately restore the shooting condition of the camera, and enhance the accuracy of detection. Referring to fig. 4, fig. 4 is a schematic flow chart of another method for constructing a helmet wearing state confirmation model in the embodiment of the present application, where the method includes:
s310: and acquiring sample data.
The sample data is an image of a worker wearing a helmet and a worker not wearing a helmet. Specifically, the sample data may be images of a worker wearing a helmet and a worker not wearing a helmet at different construction sites or industrial scenes.
S320: and reading the data of the sample data to obtain read data.
In the preprocessing process, data reading is firstly carried out on sample data to obtain read data. Specifically, reading each image in the sample data through an algorithm, and traversing the pixel position and the image information of each image to obtain read data. The read data includes sample data and two-dimensional coordinates of all pixels of the sample data.
S330: and performing data enhancement on the read data to obtain enhanced data.
After the read data is obtained, data enhancement needs to be performed on it to obtain enhanced data. Specifically, data enhancement applies transformations such as scaling, rotation, shear, and translation so that the images simulate the shooting elevation angles of the camera.
In some embodiments, the method of data enhancement is affine transformation. Specifically, the two-dimensional coordinates of all pixels in the sample data are input to the affine matrix M, and the enhanced coordinates of all pixels in the sample data are obtained. And obtaining enhanced data according to the enhanced coordinates, wherein the enhanced data is sample data which is subjected to scaling, rotation, shearing, translation and other transformations.
The formula of the affine matrix M is shown below, where $x$ is the abscissa of the pixel, $y$ is the ordinate of the pixel, $u$ is the abscissa of the pixel after data enhancement, $v$ is the ordinate of the pixel after data enhancement, and $a_{00}$, $a_{01}$, $a_{10}$, $a_{11}$, $b_{00}$ and $b_{10}$ are the enhancement parameters:
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad M = \begin{bmatrix} a_{00} & a_{01} & b_{00} \\ a_{10} & a_{11} & b_{10} \\ 0 & 0 & 1 \end{bmatrix}$$
The formula derivation process of the affine matrix M is as follows:
1) Combine the matrix A, which controls operations such as scaling, rotation, and shear, with the matrix B, which controls the translation operation, to obtain the matrix $M_0$, where A, B, and $M_0$ are:
$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix}, \qquad B = \begin{bmatrix} b_{00} \\ b_{10} \end{bmatrix}, \qquad M_0 = \begin{bmatrix} A & B \end{bmatrix} = \begin{bmatrix} a_{00} & a_{01} & b_{00} \\ a_{10} & a_{11} & b_{10} \end{bmatrix}$$
2) Expand $M_0$ by one dimension to obtain $M_1$. The purpose of the expansion is to keep the coordinate vector at its original size after the matrix multiplication; the matrix $M_1$ is:
$$M_1 = \begin{bmatrix} a_{00} & a_{01} & b_{00} \\ a_{10} & a_{11} & b_{10} \\ 0 & 0 & 1 \end{bmatrix}$$
Therefore, the mathematical expression of the affine transformation from the two-dimensional coordinates $(x, y)$ to $(u, v)$ is:
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = M_1 \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad \text{i.e.}\quad u = a_{00}x + a_{01}y + b_{00}, \quad v = a_{10}x + a_{11}y + b_{10}$$
In practice, the affine matrix M can be created automatically from several corresponding points in an image extracted from the video data and in its data-enhanced counterpart, for example using the getAffineTransform function in OpenCV.
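For illustration, a short sketch of building the 2x3 matrix [A | B] from three point pairs and applying it with OpenCV follows; the point coordinates are arbitrary example values:

```python
import cv2
import numpy as np

img = cv2.imread("sample.jpg")
h, w = img.shape[:2]

# three matching points in the source image and the desired augmented image
src = np.float32([[0, 0], [w - 1, 0], [0, h - 1]])
dst = np.float32([[10, 20], [w - 30, 5], [25, h - 40]])

M0 = cv2.getAffineTransform(src, dst)   # the 2x3 matrix [A | B]
aug = cv2.warpAffine(img, M0, (w, h))   # applies (u, v) = M0 @ (x, y, 1)
```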
In some embodiments, the method of determining enhancement parameters in an affine matrix M comprises:
1) and acquiring and reading the experimental image to obtain read information.
In order to determine the enhancement parameters, it is necessary to select an image, then arbitrarily select four points on the image, mark the four points and record the original coordinates of the four points. Wherein the read information is the original coordinates of the four points.
2) And carrying out transformation operation on the experimental image to obtain transformation information.
And carrying out manual transformation operation on the experimental image, wherein the transformation operation comprises zooming, rotating, shearing, translating and the like. And finding the positions of the four points according to the marks, and determining the transformation coordinates of the four points after transformation operation. Wherein the transformation information is transformation coordinates of four points.
3) And determining an enhancement parameter according to the reading information and the transformation information.
And calculating an enhancement parameter according to the original coordinates and the transformed coordinates of the four points.
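With the four original/transformed point pairs, the six enhancement parameters can be recovered by least squares; a NumPy sketch under that assumption (the helper name is illustrative):

```python
import numpy as np


def estimate_affine(orig_pts, trans_pts):
    """Solve for a00, a01, b00, a10, a11, b10 from k >= 3 point pairs."""
    orig = np.asarray(orig_pts, dtype=np.float64)    # shape (k, 2)
    trans = np.asarray(trans_pts, dtype=np.float64)  # shape (k, 2)
    k = len(orig)
    # each pair yields two equations:
    #   u = a00*x + a01*y + b00
    #   v = a10*x + a11*y + b10
    A = np.zeros((2 * k, 6))
    A[0::2, 0:2] = orig
    A[0::2, 2] = 1.0
    A[1::2, 3:5] = orig
    A[1::2, 5] = 1.0
    b = trans.reshape(-1)
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params  # [a00, a01, b00, a10, a11, b10]
```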
S340: and carrying out data annotation on the enhanced data to obtain annotated data.
After the enhanced data is obtained, data annotation needs to be performed on the enhanced data to obtain annotated data. Specifically, the target features in the enhanced data are identified and feature areas are divided, and the positions and labels of the feature areas are confirmed to obtain labeled data.
S350: and carrying out data cleaning on the marked data to obtain cleaning data.
After the annotated data is obtained, it needs to be cleaned. Specifically, data without feature region information and data whose definition does not meet the preset qualified standard are removed to obtain the cleaned data. Annotated images containing neither helmets nor heads contribute nothing to building the model and need to be removed, and annotated images of low definition have low practical value and likewise need to be removed. In addition, the preset standard for eliminating data can be determined according to actual requirements; the application is not particularly limited here.
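The patent does not fix a concrete sharpness criterion; as one common proxy (an assumption of this sketch), the variance of the Laplacian can serve as the definition test:

```python
import cv2


def keep_sample(image, boxes, blur_threshold=100.0):
    """Keep a labelled sample only if it has at least one feature region
    and clears the sharpness threshold (Laplacian variance is a stand-in
    for the patent's unspecified definition standard)."""
    if not boxes:  # no helmet/head annotation: useless for training
        return False
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
    return sharpness >= blur_threshold
```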
S360: and performing data sorting on the cleaning data to obtain preprocessing data.
After the cleaned data is obtained, it needs to be organized to obtain the preprocessed data. Specifically, the cleaned data is sorted by rules such as the storage path of the annotated data, the position of the feature region, and the label, so that it can be used to train the convolutional neural network.
S370: and training a convolutional neural network through the preprocessed data to generate a safety helmet wearing state confirmation model.
The convolutional neural network is trained with the preprocessed data; training stops when the loss function gradually converges, and the optimal parameter combination at that moment is saved.
In some embodiments, the convolutional neural network comprises a Yolov4 network structure. Referring to fig. 5, fig. 5 is a schematic structural diagram of a Yolov4 network structure in an embodiment of the present application.
Specifically, the Yolov4 network architecture includes a backbone network, a neck network, and a head network; the backbone network is used for extracting target features from the sorted data; the neck network is used for fusing the arrangement data containing the target characteristics to obtain fusion data; the head network is used for carrying out convolution training according to the fusion data and obtaining a safety helmet wearing state confirmation model.
The backbone network part selects CSPDarknet53, which is better suited to detection networks, speeding up the network and optimizing parallel computation. A CSP (Cross Stage Partial connection) block comprises computation layers and a transition layer: it divides the input data into two parts, passes only one part through the computation layers, and lets the other part skip the computation and enter the transition layer directly, reducing computational complexity. Referring to fig. 6, fig. 6 is a schematic structural diagram of the backbone network in the embodiment of the present application; CSPDarknet53 applies this connection structure to Darknet53.
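A minimal PyTorch sketch of such a partial cross-stage split, with illustrative layer sizes that are not taken from the patent:

```python
import torch
import torch.nn as nn


class CSPBlock(nn.Module):
    """Half the channels pass through the computation layers; the other
    half skip straight to the transition (concatenation) layer."""

    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.compute = nn.Sequential(
            nn.Conv2d(half, half, 3, padding=1, bias=False),
            nn.BatchNorm2d(half),
            nn.LeakyReLU(0.1),
        )
        self.transition = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        a, b = x.chunk(2, dim=1)  # partial split across the stage
        return self.transition(torch.cat([self.compute(a), b], dim=1))
```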
The neck network part selects SPP (Spatial Pyramid Pooling) and PAN (Path Aggregation Network) to enhance the features and enlarge the receptive field, so that the network can better detect targets of different sizes in the image. The SPP strategy replaces the pooling layer after the last convolution layer with a spatial pyramid pooling layer: the feature maps are pooled over regions at several scales, the pooled results are concatenated, and inputs of different sizes are thereby converted into a fixed-length representation that the subsequent fully connected layers can analyze more easily. This processing allows the model to accept input images of different sizes.
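Yolov4 in fact uses the detection variant of SPP, which concatenates stride-1 max-pools of several kernel sizes over the same feature map; a sketch of that variant follows (the 5/9/13 kernels are the commonly used choice, assumed here):

```python
import torch
import torch.nn as nn


class SPP(nn.Module):
    """Pool the feature map at several scales with stride 1, then
    concatenate, enlarging the receptive field without changing H x W."""

    def __init__(self, kernels=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in kernels
        )

    def forward(self, x):
        return torch.cat([x] + [p(x) for p in self.pools], dim=1)
```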
The PAN algorithm is a path aggregation network. Low-level features contain a large amount of edge and localization information, but the path between the highest-level features and the low-level features is long, so this information is easily lost. The strategy therefore shortens the information path: it enhances the feature pyramid with accurate low-level localization signals through bottom-up path augmentation, and uses adaptive feature pooling to recover the information linking each region to every feature level.
The head network part applies two data enhancement methods, Mosaic and SAT (Self-Adversarial Training). Mosaic data enhancement splices four images into one, which effectively increases the number of images trained in a single pass and makes the model easy to train on a single GPU. SAT has two stages: in the first stage, the neural network performs an adversarial attack on itself, altering the original image to increase the training difficulty of the sample; in the second stage, the neural network is trained to detect objects on the modified image.
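A compact sketch of the Mosaic stitching step; the canvas size and the equal four-way split are simplifications, and the remapping of bounding boxes onto the stitched image is omitted:

```python
import cv2
import numpy as np


def mosaic(images, out_size=608):
    """Stitch four images into one training image (Mosaic augmentation)."""
    assert len(images) == 4
    half = out_size // 2
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    slots = [(0, 0), (0, half), (half, 0), (half, half)]
    for img, (y, x) in zip(images, slots):
        canvas[y:y + half, x:x + half] = cv2.resize(img, (half, half))
    return canvas
```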
In the training process, this embodiment adopts the strategy of freezing the model's backbone parameters and training only the neck and head parameters, to shorten convergence time; when the decline of the loss function starts to slow, the backbone parameters are unfrozen and added to the training. Training stops when the loss function gradually converges, the current optimal parameters are saved, and the corresponding detection results are then output.
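A hedged PyTorch sketch of this freeze-then-unfreeze schedule; `model.backbone` is an assumed attribute name, not something specified in the patent:

```python
def set_backbone_frozen(model, frozen=True):
    """Freeze (or unfreeze) the backbone so that only the neck and head
    receive gradient updates while the flag is set."""
    for p in model.backbone.parameters():
        p.requires_grad = not frozen


# stage 1: train with a frozen backbone until the loss starts to plateau
# set_backbone_frozen(model, True)
# stage 2: unfreeze the backbone and continue until the loss converges
# set_backbone_frozen(model, False)
```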
The embodiment of the application combines a deep learning neural network model with monitoring technology to provide an intelligent and accurate solution that improves the safety factor and operating efficiency of an industrial site. The Yolov4 network structure pursues both accuracy and speed while lowering hardware requirements, so the model can run accurately on a device with only a single GPU, greatly reducing training cost and maintaining stable recognition and detection performance even under limited hardware conditions.
According to the technical scheme, the deep-learning-based safety helmet wearing early warning method first acquires video data, then extracts any n frames of images from it, and then determines the position and label of each feature region in the images based on a pre-constructed helmet wearing state confirmation model, where the label comprises safety helmet and head. It then determines the number of frames m, among the n frames, in which the label at the same position is head, calculates the ratio of m to n, and judges whether the ratio exceeds a threshold; if it does, early warning information is sent out. The embodiment of the application also provides a safety helmet wearing early warning system that can monitor whether constructors on a construction site wear safety helmets and issue warnings. Because the detected images are extracted at random, and because the decision compares the fraction of frames showing an unhelmeted head against a preset ratio rather than relying on a single frame, the possibility of false alarms caused by detection errors is markedly reduced. In addition, the head and the safety helmet are detected directly, without segmenting human body images, which reduces the complexity of the detection steps, improves detection timeliness and efficiency, saves resource cost, and improves the safety factor and operating efficiency of the construction site. In conclusion, the method and system solve the problems of untimely detection and low accuracy in existing approaches.
The embodiments of the present application have been described in detail above, but they are only preferred embodiments and should not be construed as limiting the scope of protection of the present application. All equivalent changes and modifications made within the scope of the present application shall fall within its scope of protection.

Claims (10)

1. A safety helmet wearing early warning method based on deep learning is characterized by comprising the following steps:
acquiring video data, wherein the duration of the video data is less than or equal to a preset acquisition duration;
extracting any n frames of images in the video data;
determining characteristic region information in the image based on a pre-constructed safety helmet wearing state confirmation model, wherein the characteristic region information comprises a position and a label, and the label comprises a safety helmet and a head;
determining the number of frames m, among the n frames of images, in which the label corresponding to the same position is the head, calculating the ratio of the frame number m to the frame number n, judging whether the ratio exceeds a threshold value, and if so, sending out early warning information;
and repeatedly executing the step of acquiring the video data at intervals of preset execution time.
2. The deep learning based helmet wearing early warning method according to claim 1, further comprising:
storing the image of which the label is the head, together with the shooting time of the image.
3. The method for early warning of wearing of safety helmet based on deep learning of claim 1, wherein the method for constructing the model for confirming wearing state of safety helmet comprises:
acquiring sample data;
preprocessing the sample data to obtain preprocessed data;
training a convolutional neural network through the preprocessed data to generate a safety helmet wearing state confirmation model;
wherein the sample data is an image of a worker wearing a helmet and a worker not wearing a helmet.
4. The method for early warning of wearing of safety helmet based on deep learning of claim 3, wherein the method for preprocessing the sample data to obtain preprocessed data comprises:
reading data of the sample data to obtain read data;
performing data enhancement on the read data to obtain enhanced data;
carrying out data annotation on the enhanced data to obtain annotated data;
performing data cleaning on the marked data to obtain cleaning data;
and performing data sorting on the cleaning data to obtain preprocessing data.
5. The deep learning based helmet wearing early warning method according to claim 4, wherein the read data comprises the sample data and two-dimensional coordinates of all pixels of the sample data; the method for enhancing the read data to obtain the enhanced data comprises the following steps:
inputting the two-dimensional coordinates of all pixels in the sample data to an affine matrix M to obtain enhanced coordinates of all the pixels in the sample data;
obtaining the enhancement data according to the enhancement coordinates;
wherein the affine matrix M has the formula:
$$\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad M = \begin{bmatrix} a_{00} & a_{01} & b_{00} \\ a_{10} & a_{11} & b_{10} \\ 0 & 0 & 1 \end{bmatrix}$$
where $x$ is the abscissa of the pixel, $y$ is the ordinate of the pixel, $u$ is the abscissa of the pixel after data enhancement, $v$ is the ordinate of the pixel after data enhancement, and $a_{00}$, $a_{01}$, $a_{10}$, $a_{11}$, $b_{00}$ and $b_{10}$ are the enhancement parameters.
6. The deep learning based helmet wearing early warning method according to claim 5, wherein the method for determining the enhancement parameters comprises the following steps:
acquiring and reading an experimental image to obtain read information;
carrying out transformation operation on the experimental image to obtain transformation information;
determining the enhancement parameters according to the reading information and the transformation information;
wherein the transformation operations include scaling, rotation, shear, and translation.
7. The safety helmet wearing early warning method based on deep learning of claim 5, wherein the method for performing data labeling on the enhanced data to obtain labeled data comprises the following steps:
determining the locations of the safety helmet and the head in the enhanced data;
and respectively marking the position of the safety helmet or the head to obtain marked data.
8. The safety helmet wearing early warning method based on deep learning of claim 7, wherein the method for performing data cleaning on the labeled data to obtain the cleaning data comprises the following steps:
and removing, from the annotated data, the data without feature region information and the data whose definition does not meet the preset qualified standard, to obtain the cleaned data.
9. The deep learning-based helmet wearing early warning method according to claim 8, wherein the convolutional neural network comprises a Yolov4 network structure.
10. A headgear wear warning system, comprising: the system comprises a video acquisition module and a server, wherein the server comprises an image extraction module, a state judgment module, a result confirmation module and a pre-warning broadcast module which are sequentially connected; the video acquisition module is connected with the image extraction module;
the video acquisition module is used for acquiring the video data of a construction site and transmitting the video data to the image extraction module;
the image extraction module is used for extracting the image from the video data and transmitting the image to the state judgment module;
the state judgment module is used for confirming a characteristic region in the image, determining the characteristic region information and transmitting the characteristic region information to the result confirmation module;
the result confirmation module is used for judging whether constructors who do not wear safety helmets exist in the images according to the characteristic region information, obtaining a judgment result and transmitting the judgment result to the early warning and broadcasting module;
and the early warning broadcasting module is used for sending early warning information according to the judgment result.
CN202010824956.9A 2020-08-17 2020-08-17 Helmet wearing early warning method and system based on deep learning Pending CN111985387A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010824956.9A CN111985387A (en) 2020-08-17 2020-08-17 Helmet wearing early warning method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010824956.9A CN111985387A (en) 2020-08-17 2020-08-17 Helmet wearing early warning method and system based on deep learning

Publications (1)

Publication Number Publication Date
CN111985387A true CN111985387A (en) 2020-11-24

Family

ID=73434510

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010824956.9A Pending CN111985387A (en) 2020-08-17 2020-08-17 Helmet wearing early warning method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN111985387A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112651344A (en) * 2020-12-29 2021-04-13 哈尔滨理工大学 Motorcycle helmet wearing detection method based on YOLOv4
CN112907660A (en) * 2021-01-08 2021-06-04 浙江大学 Underwater laser target detector for small sample
CN112907660B (en) * 2021-01-08 2022-10-04 浙江大学 Underwater laser target detector for small sample
WO2023053519A1 (en) * 2021-09-28 2023-04-06 オムロン株式会社 Setting device, setting method, and setting program
CN114301180A (en) * 2021-12-31 2022-04-08 南方电网大数据服务有限公司 Power distribution room equipment switch component state monitoring method and device based on deep learning

Similar Documents

Publication Publication Date Title
CN111985387A (en) Helmet wearing early warning method and system based on deep learning
CN110852219B (en) Multi-pedestrian cross-camera online tracking system
CN110200598B (en) Poultry detection system and detection method for abnormal physical signs in large farm
CN103824070B (en) A kind of rapid pedestrian detection method based on computer vision
CN111862296B (en) Three-dimensional reconstruction method, three-dimensional reconstruction device, three-dimensional reconstruction system, model training method and storage medium
Bedruz et al. Real-time vehicle detection and tracking using a mean-shift based blob analysis and tracking approach
CN108960076B (en) Ear recognition and tracking method based on convolutional neural network
CN104966304A (en) Kalman filtering and nonparametric background model-based multi-target detection tracking method
CN110096945B (en) Indoor monitoring video key frame real-time extraction method based on machine learning
CN114140745A (en) Method, system, device and medium for detecting personnel attributes of construction site
CN114463788A (en) Fall detection method, system, computer equipment and storage medium
CN115205581A (en) Fishing detection method, fishing detection device and computer readable storage medium
US11544926B2 (en) Image processing apparatus, method of processing image, and storage medium
CN114155472A (en) Method, device and equipment for detecting abnormal state of factory scene empty face protection equipment
CN107862314B (en) Code spraying identification method and device
CN114764895A (en) Abnormal behavior detection device and method
CN109146916B (en) Moving object tracking method and device
CN111382705A (en) Reverse behavior detection method and device, electronic equipment and readable storage medium
CN116259002A (en) Human body dangerous behavior analysis method based on video
CN115953744A (en) Vehicle identification tracking method based on deep learning
CN113420839B (en) Semi-automatic labeling method and segmentation positioning system for stacking planar target objects
CN115131826A (en) Article detection and identification method, and network model training method and device
CN112101134B (en) Object detection method and device, electronic equipment and storage medium
CN115035160A (en) Target tracking method, device, equipment and medium based on visual following
CN115995093A (en) Safety helmet wearing identification method based on improved YOLOv5

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination