CN115588165A - Tunnel worker safety helmet detection and face recognition method - Google Patents


Info

Publication number
CN115588165A
Authority
CN
China
Prior art keywords
face
safety helmet
distance
tunnel
picture
Prior art date
Legal status: Pending
Application number
CN202211318528.4A
Other languages
Chinese (zh)
Inventor
周茂
岳杨
胡立锦
何文彬
邹飞
颜嘉
李育骏
李智
余留洋
廖柯嘉
罗洪平
唐贤伦
Current Assignee
State Grid Chongqing Electric Power Co Construction Branch
Original Assignee
State Grid Chongqing Electric Power Co Construction Branch
Priority date
Filing date
Publication date
Application filed by State Grid Chongqing Electric Power Co Construction Branch
Priority: CN202211318528.4A
Publication: CN115588165A
Legal status: Pending

Classifications

    • G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects (scenes; context or environment of the image)
    • G06V 10/761 Proximity, similarity or dissimilarity measures in feature spaces
    • G06V 10/764 Recognition using classification, e.g. of video objects
    • G06V 10/82 Recognition using neural networks
    • G06V 40/168 Feature extraction; face representation (human faces)
    • G06V 2201/07 Indexing scheme: target detection

Abstract

The invention relates to a tunnel worker safety helmet detection and face recognition method, belonging to the technical field of image target detection and face recognition. The method first generates a labelled, two-class data set from collected images of persons wearing and not wearing safety helmets and applies data enhancement, with image equalization used to offset the adverse effects of the harsh lighting conditions in tunnels. A Tiny-YOLOv3 classification convolutional neural network and a FaceNet neural network fused with a distance threshold are then trained; each detected face picture is passed through the FaceNet network to obtain a feature vector characterizing the face, and a combined distance, fusing Euclidean distance and cosine similarity, is calculated between this vector and the faces in the database, so that both the direction of the feature vectors and their absolute numerical difference are taken into account. Abnormal detection and recognition results trigger an alarm, enabling effective identification of tunnel workers, early warning when a safety helmet is not worn, and denial of access to unauthorized personnel.

Description

Tunnel worker safety helmet detection and face recognition method
Technical Field
The invention belongs to the technical field of image target detection and face recognition, and relates to a method for detecting a safety helmet of a tunnel worker and recognizing a face.
Background
Convolutional Neural Networks (CNNs), a deep learning model architecture, have become the most effective method in image processing and computer vision. Weight sharing and local receptive fields reduce the number of weights and thus the computational complexity of the model, while the translation invariance of image features gives CNNs strong feature extraction capability and high stability.
Currently, a great deal of research applies convolutional neural networks to target detection and face recognition. Since R. Girshick et al. proposed the region-proposal-based R-CNN deep learning model in 2014, a family of classical target detection algorithms such as Faster R-CNN, SSD and YOLO has emerged, along with face recognition models such as FaceNet and ArcFace. The general trend in these models is to keep increasing the number of network layers for better feature extraction capability, and to enlarge the image scale to cover a wider range of features. However, more complex models bring problems such as difficult network convergence, rapid parameter growth and slow computation; most significantly, as model complexity increases, the models become hard to deploy on resource-limited devices.
Meanwhile, on tunnel construction sites, correctly worn safety helmets effectively prevent and reduce injuries to workers during operation. Related studies show that a large proportion of safety accidents are caused by not wearing a safety helmet. At present there are two approaches to supervising helmet wearing on construction sites: one relies mainly on manual supervision, which is inefficient and labour-intensive; the other builds a detection network with deep learning, but existing helmet-wearing detection models suffer from low accuracy, slow inference, and an inability to meet precision and real-time requirements when deployed on edge computing devices. A lightweight end-to-end detection network is therefore a good choice for helmet detection. In addition, compared with conventional settings such as the construction industry, helmet detection in tunnels further suffers from low accuracy on small targets and interference from dim lighting.
In face recognition models, after the face image has undergone multiple convolution operations, the final output vector contains the richest spatial and semantic information. To identify a worker, the Euclidean distance or cosine similarity between feature vectors representing faces is generally computed to decide whether recognition succeeds, but neither measure alone accounts for both the direction of the vectors and their absolute difference. Moreover, typical face recognition networks represent an identity with a 128-dimensional vector; for large amounts of data, a higher-dimensional representation helps distinguish finer differences between feature vectors.
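The complementary behaviour of the two measures discussed above can be seen in a small numeric sketch (illustrative values only, not from the patent): cosine distance ignores a large magnitude gap between two co-directional vectors, while Euclidean distance exposes it, and the reverse holds for a purely directional gap.

```python
import numpy as np

def euclidean(a, b):
    # Absolute (magnitude-sensitive) difference between two vectors.
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    # Directional difference only; insensitive to vector magnitude.
    return float(1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 0.0])
b = np.array([10.0, 0.0])  # same direction as a, ten times the magnitude
c = np.array([0.0, 1.0])   # same magnitude as a, orthogonal direction

print(cosine_distance(a, b))  # 0.0: cosine misses the magnitude gap
print(euclidean(a, b))        # 9.0: Euclidean exposes it
print(cosine_distance(a, c))  # 1.0: cosine exposes the direction gap
```

This asymmetry is why the method fuses both measures into a single acceptance threshold.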
Disclosure of Invention
In view of the above, the invention aims to provide a tunnel worker helmet detection and face recognition method based on Tiny-YOLOv3 and a FaceNet fused with a distance threshold. The trained models detect helmets in real time and quickly extract the face; after face alignment, the characterization vector of the face is computed and compared with the database to quickly recognize the worker's identity and whether a helmet is worn.
In order to achieve the purpose, the invention provides the following technical scheme:
a tunnel worker safety helmet detection and face recognition method comprises the following steps:
s1: acquiring face images with safety helmets of tunnel workers in different postures, aligning the faces to be used as a data set for face recognition, recording the data set as a recog _ dataset, and establishing a face database of each worker;
s2: acquiring a plurality of face images with or without a safety helmet, labeling and enhancing data of the faces with or without the safety helmet as a training set of a detection network, and recording the training set as a detec _ dataset;
s3: training a classified convolutional neural network model Tiny-YOLOv3 for detecting whether a worker wears a safety helmet or not by using the detec _ dataset enhanced by the data in the step S2; training a convolutional neural network faceNet for face recognition by using the data of the recog _ dataset in the step S1;
s4: detecting the input image by using the trained Tiny-YOLOv3 in the step S3 to obtain the rectangular coordinates of the head image of the input image with the safety helmet, and intercepting the local picture of the detected part from the input image according to the corresponding coordinates;
s5: carrying out key point detection on the local picture in the step S4 by using dlib, aligning the local picture to an FFHQ data set sample by adopting affine transformation, and equalizing the aligned picture to obtain a human face picture to be identified;
s6: the face picture to be recognized in the step S5 is transmitted into a faceNet network to obtain a 512-dimensional feature vector representing the face;
s7: calculating cosine similarity, euclidean distance and Manhattan distance between the feature vector in the step S6 and the face in the database, and judging whether the combined result of the Euclidean distance and the cosine similarity meets a threshold value for successful recognition or not so as to judge whether the combined result is a worker or not;
s8: and performing identity authentication according to the result in the step S7, and warning the person who does not wear the safety helmet, so as to realize the identification of the tunnel staff and the early warning of the safety helmet.
Further, after the face images of tunnel workers wearing safety helmets in different postures are acquired in step S1, key point detection is performed with dlib, the local pictures are aligned to the FFHQ data set samples by affine transformation, and the aligned pictures are equalized to form the face recognition database, specifically:
S11: traverse the FFHQ data set samples with dlib to obtain the 5 key point coordinates of FFHQ, namely the corners of both eyes and the philtrum, and use them as the desired template coordinates;
S12: perform dlib 5-key-point detection on each picture with a safety helmet, and transform the picture's key points to the template coordinates through affine transformation to achieve face alignment;
S13: meanwhile, considering the unbalanced lighting in tunnels, use equalization to weaken the influence of light.
Further, in step S2 the faces are labelled with ImageLabel into the two classes of wearing and not wearing a safety helmet, and the labelled pictures are rotated, flipped and blurred to enrich the training set.
Further, step S3, training the classification convolutional neural network model Tiny-YOLOv3 for detecting whether a worker wears a safety helmet using the detec_dataset from step S2 and training the face recognition convolutional neural network FaceNet using the recog_dataset from step S1, specifically comprises:
S31: input the data-enhanced labelled images into the Tiny-YOLOv3 target detection model for supervised training: with minimization of the loss function between prediction and label as the optimization target, update the weights of the target detection model with the Adam optimization algorithm. The loss function consists of a localization loss and a classification loss, where the localization loss uses the Distance-IoU (DIoU, Distance Intersection over Union):
DIoU = IoU - ρ²(b, b^gt) / c²
where IoU denotes the ratio of the intersection to the union of the prediction box B and the ground-truth box B^gt; DIoU extends IoU by taking the distance between box centers into account; b and b^gt are the center points of the prediction box and the ground-truth box respectively; ρ denotes the Euclidean distance between the two center points; and c denotes the diagonal length of the smallest enclosing region that contains both boxes;
the classification loss function is binary cross-entropy BCE (binary cross entropy):
BCE = -(1/N) Σ_{i=1}^{N} [ y_i log p(y_i) + (1 - y_i) log(1 - p(y_i)) ]
where y_i is the label of the i-th sample in a training batch, p(y_i) is the predicted probability that the output belongs to label y_i, and N is the total number of samples in the batch;
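A plain-NumPy sketch of the two loss components in step S31; the box format (x1, y1, x2, y2) is an assumption, and real training code would of course compute these inside the framework's autograd graph:

```python
import numpy as np

def diou(box_a, box_b):
    """DIoU = IoU - rho^2(centers) / c^2, boxes given as (x1, y1, x2, y2)."""
    # Intersection area
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    iou = inter / (area_a + area_b - inter)
    # Squared distance between box centers (rho^2)
    ca = ((box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2)
    cb = ((box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2)
    rho2 = (ca[0] - cb[0]) ** 2 + (ca[1] - cb[1]) ** 2
    # Squared diagonal of the smallest enclosing box (c^2)
    ex1, ey1 = min(box_a[0], box_b[0]), min(box_a[1], box_b[1])
    ex2, ey2 = max(box_a[2], box_b[2]), max(box_a[3], box_b[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return iou - rho2 / c2

def bce(y, p, eps=1e-12):
    """Mean binary cross-entropy over a batch of labels y and probabilities p."""
    y, p = np.asarray(y, float), np.asarray(p, float)
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))
```

For identical boxes DIoU equals 1; as the centers drift apart the penalty term pulls it below the plain IoU, which is what gives the localization loss a useful gradient even for non-overlapping boxes.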
S32: input the face-aligned images with different occlusions into the FaceNet model for supervised training: with the triplet loss function as the objective, aiming to make the intra-class distance smaller and the inter-class distance larger, update the model weights with the Adam optimization algorithm; the loss function is:
L = Σ_{i=1}^{N} max( ‖f(x_i^a) - f(x_i^p)‖² - ‖f(x_i^a) - f(x_i^n)‖² + α , 0 )
where x_i^a is the anchor training sample of the i-th triplet in a batch, x_i^p is a sample of the same class as x_i^a, x_i^n is a sample of a different class, ‖f(a) - f(b)‖ denotes the Euclidean distance between the embeddings of a and b, and α is the margin threshold: loss and gradient are generated only when the intra-class distance plus the margin exceeds the inter-class distance.
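The triplet loss of step S32 reduces to a few lines of NumPy over batches of embedding vectors; `alpha` plays the role of the margin α above:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Sum over a batch of max(0, ||a-p||^2 - ||a-n||^2 + alpha).
    anchor, positive, negative: (N, D) arrays of embeddings."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # squared intra-class distances
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # squared inter-class distances
    return float(np.sum(np.maximum(0.0, d_pos - d_neg + alpha)))

# An "easy" triplet (negative far away) contributes no loss or gradient;
# a "hard" triplet (negative nearly as close as the positive) does.
easy = triplet_loss(np.array([[0.0, 0.0]]), np.array([[0.1, 0.0]]),
                    np.array([[10.0, 0.0]]))
hard = triplet_loss(np.array([[0.0, 0.0]]), np.array([[0.1, 0.0]]),
                    np.array([[0.2, 0.0]]))
```

This hinge behaviour is why FaceNet-style training mines hard or semi-hard triplets: easy triplets are silent under the max(·, 0).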
Further, in step S4 the actually captured picture is passed into the Tiny-YOLOv3 detection network, and the detected face is cropped out and then aligned by affine transformation.
Further, step S5 specifically comprises the following steps:
S51: traverse the FFHQ data set samples with dlib to obtain the 5 key point coordinates of FFHQ, namely the corners of both eyes and the philtrum, and use them as the desired template coordinates;
S52: perform dlib 5-key-point detection on the cropped picture from step S4, and transform its key points to the template coordinates through affine transformation to achieve face alignment;
S53: meanwhile, considering the unbalanced lighting in tunnels, use equalization to weaken the influence of light.
Further, in step S7 the aligned face image is passed into the recognition network to obtain a 512-dimensional vector characterizing the face, and the distances between it and all faces in the database are calculated as:
L_Euclidean(x, x̃) = ‖x - x̃‖₂, L_cosine(x, x̃) = 1 - (x · x̃) / (‖x‖ ‖x̃‖), x* = argmin_{x̃} d(x, x̃)
where L_Euclidean and L_cosine denote the Euclidean distance and the cosine distance respectively, x is the input 512-dimensional vector and x̃ a 512-dimensional feature vector in the database, x* is the database vector with the smallest distance from the given input x, and d(a, b) denotes the distance between a and b.
Further, step S8 specifically comprises: if the calculated distance meets the set threshold, the two are considered the same person and the identity is confirmed; otherwise authentication fails. Meanwhile, whether the person is wearing a safety helmet is obtained from the detection result; if the person is not wearing a helmet or identity authentication is abnormal, an alarm is raised, ensuring personnel safety and barring the entry of unauthorized persons.
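The alarm logic of step S8 reduces to a small decision function; the message strings below are illustrative, not from the patent:

```python
def evaluate(identity, helmet_worn):
    """Decide the alarm action from the recognition result (identity is None
    when no database match met the distance threshold) and the helmet
    detection result."""
    if identity is None:
        return "ALARM: unauthorized person, deny entry"
    if not helmet_worn:
        return f"WARNING: {identity} is not wearing a safety helmet"
    return f"OK: {identity} authenticated, helmet on"
```

In deployment this function would run once per detection frame, feeding whatever alerting channel (siren, gate control, dashboard) the site uses.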
The invention has the beneficial effects that: the method uses dlib to detect key points and affine transformation to align the local pictures to the FFHQ data set samples, so that the pictures have a uniform format. The aligned pictures are equalized to obtain the face pictures to be recognized, which effectively offsets the adverse effects of the harsh lighting conditions in tunnels. The face picture is passed into the FaceNet network to obtain a 512-dimensional feature vector characterizing the face; this higher representation dimensionality allows finer discrimination between feature vectors, so faces can be reliably recognized even among a larger number of people. The combined result of the Euclidean distance and cosine similarity between the feature vector and the faces in the database is calculated and judged against the threshold for successful recognition; this result characterizes both the direction of the feature vectors and their absolute numerical difference. Meanwhile, abnormal detection and recognition results trigger an alarm, enabling effective identification of tunnel workers, early warning when a safety helmet is not worn, and denial of access to unrelated personnel.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of the tunnel worker helmet detection and face recognition method based on Tiny-YOLOv3 and a FaceNet fused with a distance threshold according to the preferred embodiment of the invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
Wherein the showings are for the purpose of illustrating the invention only and not for the purpose of limiting the same, and in which there is shown by way of illustration only and not in the drawings in which there is no intention to limit the invention thereto; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by the terms "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not intended to indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and therefore the terms describing the positional relationship in the drawings are only used for illustrative purposes and are not to be construed as limiting the present invention, and the specific meaning of the terms described above will be understood by those skilled in the art according to the specific circumstances.
Referring to fig. 1, the tunnel worker helmet detection and face recognition method based on Tiny-YOLOv3 and a FaceNet fused with a distance threshold according to this embodiment comprises the following steps:
step 1: the method comprises the steps of obtaining face images of safety helmets with different postures of workers, using dlib to detect key points, adopting affine transformation to align local pictures to FFHQ data set samples, equalizing the aligned pictures to be used as a data base for face recognition, and establishing a recog _ dataset data set for face recognition, wherein the recog _ dataset specifically comprises the following steps: using dlib to traverse the FFHQ dataset samples to obtain 5 key point coordinates of FFHQ, i.e. key points in both eyes and people, as the desired template coordinates. And carrying out dlib 5 key point detection on the picture with the wearable safety helmet, and transforming the picture key points to the template coordinates through affine transformation to realize face alignment. Meanwhile, the problem of light imbalance of the tunnel is considered, and the influence of light is weakened by using equalization.
And 2, step: acquiring a plurality of face images with or without safety helmets, and performing label labeling and data enhancement on the faces with or without safety helmets as a detec _ dataset training set of a detection network;
Step 3: train the classification convolutional neural network model Tiny-YOLOv3 for detecting whether a worker wears a safety helmet using the data-enhanced detec_dataset from step 2, and train the face recognition convolutional neural network FaceNet using the recog_dataset from step 1, specifically:
the method comprises the steps of using 6 depth separable convolution and maximum pooling structures as a backbone network, using FPN to realize multi-scale detection, constructing a classification convolution neural network model for a large target and a medium target after a feature map of each scale, using positioning loss, confidence loss and classification loss as optimization targets until the model converges, storing the weight of the model with the best classification accuracy of a test set, and using an Adam optimization algorithm to update the weight of the model. The positioning loss takes DIou as a loss function, and the classification loss takes binary cross entropy as the loss function.
DIoU = IoU - ρ²(b, b^gt) / c²
where IoU denotes the ratio of the intersection to the union of the prediction box B and the ground-truth box B^gt; DIoU extends IoU by taking the distance between box centers into account; b and b^gt are the center points of the prediction box and the ground-truth box respectively; ρ denotes the Euclidean distance between the two center points; and c denotes the diagonal length of the smallest enclosing region that contains both boxes.
BCE = -(1/N) Σ_{i=1}^{N} [ y_i log p(y_i) + (1 - y_i) log(1 - p(y_i)) ]
where y_i is the binary label 0 or 1 of the i-th sample in a training batch, p(y_i) is the predicted probability that the output belongs to label y_i, and N is the number of samples in the batch.
To balance speed and accuracy, the FaceNet backbone adopts MobileNet to extract features and map images into a Euclidean space, where spatial distance relates directly to picture similarity: different images of the same person lie close together, while images of different persons lie far apart, so the embedding can be used for face verification, recognition and clustering. The network is trained with the triplet loss function, aiming to make the intra-class distance smaller and the inter-class distance larger.
L = Σ_{i=1}^{N} max( ‖f(x_i^a) - f(x_i^p)‖² - ‖f(x_i^a) - f(x_i^n)‖² + α , 0 )
where x_i^a is the anchor training sample of the i-th triplet in a batch, x_i^p is a sample of the same class as x_i^a, x_i^n is a sample of a different class, ‖f(a) - f(b)‖ denotes the Euclidean distance between the embeddings of a and b, and α is the margin threshold: loss and gradient are generated only when the intra-class distance plus the margin exceeds the inter-class distance.
Step 4: use the Tiny-YOLOv3 trained in step 3 to obtain the detected head images wearing safety helmets, and crop the detected regions from the corresponding images;
Step 5: perform key point detection with dlib, align the local pictures to the FFHQ data set samples by affine transformation, and equalize the aligned pictures to obtain the face pictures to be recognized, specifically:
and detecting a pre-training model according to the official shape _ predictor _5_face with angles of eyes and key points in the human according to bz 2. The 5-point coordinates are obtained by detecting the FFHQ data in advance. And aligning the detected face to FFHQ data through affine transformation. The affine transformation is a linear transformation from two-dimensional coordinates to two-dimensional coordinates, and maintains "straightness" and "parallelism" of a two-dimensional figure. The three non-collinear pairs of corresponding points define a unique affine transformation. Affine transformations can be achieved by a complex series of atomic transformations, including translation, scaling, flipping, rotation, and shearing. Equalization is used as a method for enhancing Image Contrast (Image Contrast), and the main idea is to change the histogram distribution of one Image into an approximately uniform distribution, thereby enhancing the Contrast of the Image.
Step 6: pass the face picture to be recognized from step 5 into the FaceNet network to obtain a 512-dimensional feature vector characterizing the face;
Step 7: judge whether the combined result of the Euclidean distance and cosine similarity between the feature vector from step 6 and the faces in the database meets the threshold for successful recognition. Specifically, the similarity of two vectors is judged by calculating both the cosine of the angle between the template vector in the database and the vector of the detected image, and the Euclidean distance between them; when the combined distance falls within the set threshold, the two are considered the same person.
The reason is that cosine similarity distinguishes differences in vector direction but is insensitive to absolute values, whereas Euclidean distance reflects the absolute difference of individual numerical features. Combining the two lets a single threshold balance the direction and the absolute difference of the vectors, and the [-1, 1] value range of the cosine does not inflate the absolute difference. The calculation formula is:
L_Euclidean(x, x̃) = ‖x - x̃‖₂, L_cosine(x, x̃) = 1 - (x · x̃) / (‖x‖ ‖x̃‖), x* = argmin_{x̃} d(x, x̃)
where L_Euclidean and L_cosine denote the Euclidean and cosine distances respectively, x is the input 512-dimensional vector and x̃ a 512-dimensional feature vector in the database, x* is the database vector with the smallest distance from the given input x, and d(x, x̃) denotes the distance between x and x̃.
Step 8: perform identity authentication according to the result of step 7 and raise an alarm if no safety helmet is worn, realizing tunnel worker identification and helmet-wearing early warning.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims (8)

1. A tunnel worker safety helmet detection and face recognition method is characterized in that: the method comprises the following steps:
s1: acquiring face images with safety helmets of tunnel workers in different postures, aligning the faces to be used as a data set for face recognition, recording the data set as a recog _ dataset, and establishing a face database of each worker;
s2: acquiring a plurality of face images with or without a safety helmet, labeling and enhancing data of the faces with or without the safety helmet as a training set of a detection network, and recording the training set as a detec _ dataset;
s3: training a classified convolutional neural network model Tiny-YOLOv3 for detecting whether a worker wears a safety helmet or not by using the detec _ dataset enhanced by the data in the step S2; training a convolutional neural network faceNet for face recognition by using the data of the recog _ dataset in the step S1;
s4: detecting the input image by using the trained Tiny-YOLOv3 in the step S3 to obtain the rectangular coordinates of the head image of the input image with the safety helmet, and intercepting the local picture of the detected part from the input image according to the corresponding coordinates;
s5: carrying out key point detection on the local picture in the step S4 by using dlib, aligning the local picture to an FFHQ data set sample by adopting affine transformation, and equalizing the aligned picture to obtain a human face picture to be identified;
s6: the face picture to be recognized in the step S5 is transmitted into a faceNet network to obtain a 512-dimensional feature vector representing the face;
s7: calculating the cosine similarity, euclidean distance and manhattan distance between the feature vector in the step S6 and each face in the database, and judging whether the combined result of the euclidean distance and the cosine similarity meets the threshold value for successful recognition, so as to determine whether the person is a registered worker;
s8: and performing identity authentication according to the result in the step S7, and warning the user who does not wear the safety helmet, so as to realize the identification of the tunnel staff and the early warning of wearing the safety helmet.
2. The method of claim 1, wherein the method comprises the steps of: after the face images with the safety helmet of different postures of tunnel workers are obtained in the step S1, carrying out key point detection by using dlib, aligning local pictures to FFHQ data set samples by adopting affine transformation, and balancing the aligned pictures to serve as a database for face recognition, wherein the method specifically comprises the following steps of:
s11: traversing the FFHQ data set samples by using dlib to obtain the coordinates of the 5 key points of FFHQ, namely the key points of the two eyes and the philtrum, which are used as the expected template coordinates;
s12: carrying out dlib 5 key point detection on the picture with the safety helmet, and transforming the key points of the picture to template coordinates through affine transformation to realize face alignment;
s13: meanwhile, in view of the uneven lighting in the tunnel, equalization is used to weaken the influence of the light.
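The equalization mentioned in S13 is commonly realized as histogram equalization. A minimal pure-Python sketch, assuming an 8-bit grayscale image given as a list of rows (a real pipeline would typically use OpenCV's `cv2.equalizeHist`):

```python
def equalize(image, levels=256):
    """Histogram equalization for a grayscale image given as a list of
    rows of integer pixel values in [0, levels - 1]."""
    flat = [p for row in image for p in row]
    n = len(flat)
    # histogram and cumulative distribution function
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)

    def remap(p):
        # classic equalization mapping; identity when image is constant
        if n == cdf_min:
            return p
        return round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1))

    return [[remap(p) for p in row] for row in image]
```

For example, a low-contrast image `[[50, 50], [100, 100]]` is stretched to `[[0, 0], [255, 255]]`, spreading the intensities over the full range.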
3. The tunnel worker safety helmet detection and face recognition method according to claim 1, characterized in that: in the step S2, the faces are labeled by adopting ImageLabel and divided into two categories of wearing and not wearing safety helmets, and the labeled pictures are rotated, flipped and blurred to enrich the training set.
4. The method of claim 1, wherein the method comprises the steps of: the step S3: training a classified convolutional neural network model Tiny-YOLOv3 for detecting whether a worker wears a safety helmet or not by using the detec _ dataset data in the step S2; using the recog _ dataset data set data in the step S1 to train a convolutional neural network FaceNet for face recognition, specifically:
s31: inputting the data-enhanced annotation image into a target detection Tiny-YOLOv3 model for supervision training: updating the weight of the target detection model by using an Adam optimization algorithm by taking a loss function between the minimized predicted value and the label as an optimization target; the loss function is composed of a positioning loss function and a classification loss function, wherein the positioning loss function adopts an intersection ratio DIoU:
DIoU = IoU − ρ²(b, b_gt) / c²,  IoU = |B ∩ B_gt| / |B ∪ B_gt|
where IoU denotes the ratio of the intersection to the union of the predicted box and the real box, DIoU denotes the IoU additionally considering the distance between the centers of the predicted box and the real box, b and b_gt respectively denote the center points of the predicted box and the real box, B and B_gt respectively denote the predicted box and the real box, ρ denotes the Euclidean distance between the two center points, and c denotes the diagonal length of the smallest enclosing region that can simultaneously contain the predicted box and the real box;
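A minimal sketch of the DIoU term used in the localization loss (the corresponding loss would be 1 − DIoU); representing boxes as `(x1, y1, x2, y2)` tuples is an assumption for illustration:

```python
def diou(pred, gt):
    """DIoU for two axis-aligned boxes (x1, y1, x2, y2): IoU minus the
    squared center distance over the squared diagonal of the smallest
    enclosing box."""
    # intersection and union areas
    ix = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    iy = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = ix * iy
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter)
    # squared distance rho^2 between the two box centers
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_g, cy_g = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2
    # squared diagonal c^2 of the smallest enclosing box
    ex = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ey = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = ex ** 2 + ey ** 2
    return iou - rho2 / c2
```

For identical boxes DIoU equals 1; as the centers drift apart the penalty term ρ²/c² grows even when the IoU stays the same, which is what makes DIoU a better localization signal than plain IoU.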
the classification loss function is binary cross entropy BCE:
BCE = −(1/N) · Σ_{i=1..N} [ y_i · log p(y_i) + (1 − y_i) · log(1 − p(y_i)) ]
wherein y is the label category, p(y) is the probability that the output belongs to the label y, i denotes the i-th sample in a training batch, and N denotes the total number of samples contained in a training batch;
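The binary cross entropy above can be computed directly; a minimal sketch taking predicted positive-class probabilities and 0/1 labels:

```python
import math

def bce(probs, labels):
    """Mean binary cross entropy over a batch: probs are predicted
    probabilities of the positive class, labels are 0 or 1."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(labels)
```

For instance, a maximally uncertain prediction `bce([0.5], [1])` yields log 2 ≈ 0.693, while confident correct predictions drive the loss toward 0.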
s32: inputting the different shielded face images subjected to face alignment into a FaceNet model for supervision training: taking a triple loss function as a target, aiming at enabling the intra-class distance to be smaller and the inter-class distance to be larger, updating the weight of a target detection model by using an Adam optimization algorithm, wherein the loss function is as follows:
L = Σ_{i=1..N} [ ||f(x_i^a) − f(x_i^p)||_2² − ||f(x_i^a) − f(x_i^n)||_2² + α ]_+
wherein x_i^a denotes the i-th training sample (anchor) in one batch, x_i^p denotes a sample of the same class as x_i^a, x_i^n denotes a sample of a different class from x_i^a, ||f(a) − f(b)||_2 denotes the Euclidean distance between the embeddings of a and b, α is a threshold (margin), and a loss and gradient are produced only when the intra-class distance plus α exceeds the inter-class distance.
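The triplet loss described above can be sketched as follows, using squared Euclidean distances between precomputed embeddings (the function names and the margin default are illustrative assumptions):

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchors, positives, negatives, alpha=0.2):
    """FaceNet-style triplet loss: a triplet contributes only when the
    anchor-positive distance plus the margin alpha exceeds the
    anchor-negative distance."""
    loss = 0.0
    for a, p, n in zip(anchors, positives, negatives):
        loss += max(0.0, sq_dist(a, p) - sq_dist(a, n) + alpha)
    return loss
```

An "easy" triplet whose negative is already far from the anchor contributes zero loss, so in practice FaceNet mines hard or semi-hard triplets within each batch to keep the gradient informative.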
5. The method of claim 1, wherein the method comprises the steps of: in step S4, the actually shot picture is transmitted into a Tiny-YOLOv3 detection network, and the detected face is intercepted after affine transformation.
6. The method for detecting a safety helmet of a tunnel worker and recognizing a human face according to claim 1, wherein the method comprises the following steps: in step S5, the method specifically includes the following steps:
s51: traversing the FFHQ data set samples by using dlib to obtain the coordinates of the 5 key points of FFHQ, namely the key points of the two eyes and the philtrum, which are used as the expected template coordinates;
s52: carrying out dlib 5 key point detection on the local picture in the step S4, and transforming the picture key points to template coordinates through affine transformation to realize face alignment;
s53: meanwhile, in view of the uneven lighting in the tunnel, equalization is used to weaken the influence of the light.
7. The method for detecting a safety helmet of a tunnel worker and recognizing a human face according to claim 1, wherein the method comprises the following steps: in step S7, the aligned face image is transmitted to the recognition network to obtain a 512-dimensional vector representing the face, and the distances between the face and all faces in the database are calculated, and the person in the database with the smallest distance is selected as the identity of the detected face, and the calculation method is as follows:
L_Euclidean(x, y_i) = ||x − y_i||_2,  L_cosine(x, y_i) = 1 − (x · y_i) / (||x||_2 · ||y_i||_2),
i* = argmin_i ||x − y_i||_2
wherein L_Euclidean and L_cosine respectively denote the Euclidean distance and the cosine distance, x and y_i respectively denote the input 512-dimensional vector and a 512-dimensional feature vector in the database, argmin_i ||x − y_i||_2 denotes, for a given input x, finding the database feature vector y_i with the smallest Euclidean distance to x, and ||x − y_i||_2 denotes the Euclidean distance between x and y_i.
8. The method of claim 1, wherein the method comprises the steps of: step S8 specifically includes: if the calculated distance meets the set threshold value, the same person is considered, namely identity confirmation is carried out, otherwise, authentication fails; meanwhile, whether the safety helmet is worn by a person or not can be obtained according to the detection result, and if the safety helmet is not worn and the identity authentication is abnormal, an alarm is given, so that the safety of the person and the entrance of irrelevant people are ensured.
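The decision rule of claim 8 (identity confirmed only below a distance threshold, alarm on authentication failure or a missing helmet) can be expressed compactly; the function name and the default threshold are illustrative assumptions:

```python
def check_worker(match_distance, helmet_worn, threshold=1.1):
    """Combine identity matching and helmet detection as in step S8.
    match_distance is the best embedding distance (None if no match),
    helmet_worn is the detector's helmet flag.
    Returns (authenticated, alarm)."""
    authenticated = match_distance is not None and match_distance <= threshold
    # alarm whenever authentication fails or the helmet is missing
    alarm = (not authenticated) or (not helmet_worn)
    return authenticated, alarm
```

For example, a registered worker at distance 0.5 wearing a helmet passes silently, while the same worker without a helmet is authenticated but still triggers the alarm.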
CN202211318528.4A 2022-10-26 2022-10-26 Tunnel worker safety helmet detection and face recognition method Pending CN115588165A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211318528.4A CN115588165A (en) 2022-10-26 2022-10-26 Tunnel worker safety helmet detection and face recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211318528.4A CN115588165A (en) 2022-10-26 2022-10-26 Tunnel worker safety helmet detection and face recognition method

Publications (1)

Publication Number Publication Date
CN115588165A true CN115588165A (en) 2023-01-10

Family

ID=84782082

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211318528.4A Pending CN115588165A (en) 2022-10-26 2022-10-26 Tunnel worker safety helmet detection and face recognition method

Country Status (1)

Country Link
CN (1) CN115588165A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116092199A (en) * 2023-04-11 2023-05-09 山东易视智能科技有限公司 Employee working state identification method and identification system
CN116092199B (en) * 2023-04-11 2023-07-14 山东易视智能科技有限公司 Employee working state identification method and identification system
CN116895030A (en) * 2023-09-11 2023-10-17 西华大学 Insulator detection method based on target detection algorithm and attention mechanism
CN116895030B (en) * 2023-09-11 2023-11-17 西华大学 Insulator detection method based on target detection algorithm and attention mechanism

Similar Documents

Publication Publication Date Title
CN115588165A (en) Tunnel worker safety helmet detection and face recognition method
CN101833646B (en) In vivo iris detection method
CN105354902B (en) A kind of security management method and system based on recognition of face
US9064145B2 (en) Identity recognition based on multiple feature fusion for an eye image
CN101558431B (en) Face authentication device
CN102521565B (en) Garment identification method and system for low-resolution video
CN111488804A (en) Labor insurance product wearing condition detection and identity identification method based on deep learning
CN108171184A Pedestrian re-identification method based on Siamese networks
CN106355138A (en) Face recognition method based on deep learning and key features extraction
US11594074B2 (en) Continuously evolving and interactive Disguised Face Identification (DFI) with facial key points using ScatterNet Hybrid Deep Learning (SHDL) network
CN108197587A (en) A kind of method that multi-modal recognition of face is carried out by face depth prediction
CN106156688A (en) A kind of dynamic human face recognition methods and system
CN111460962A (en) Mask face recognition method and system
CN101236599A (en) Human face recognition detection device based on multi- video camera information integration
CN103049736A (en) Face identification method based on maximum stable extremum area
CN108629336A (en) Face value calculating method based on human face characteristic point identification
CN108537143B (en) A kind of face identification method and system based on key area aspect ratio pair
CN106599785A (en) Method and device for building human body 3D feature identity information database
CN110135327A (en) A kind of driving behavior recognition methods based on multi-region feature learning model
CN107220598A (en) Iris Texture Classification based on deep learning feature and Fisher Vector encoding models
CN109344909A (en) A kind of personal identification method based on multichannel convolutive neural network
CN112435414A (en) Security monitoring system based on face recognition and monitoring method thereof
Menezes et al. Automatic attendance management system based on deep one-shot learning
CN105184236A (en) Robot-based face identification system
CN110135470A (en) A kind of vehicle characteristics emerging system based on multi-modal vehicle feature recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination