CN111209822A - Face detection method of thermal infrared image - Google Patents
- Publication number
- CN111209822A (application CN201911394420.1A)
- Authority
- CN
- China
- Prior art keywords
- frame
- thermal infrared
- prediction
- neural network
- convolutional neural
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a face detection method for thermal infrared images, comprising the following steps: (1) acquire the positive samples and negative samples of the training set and a test set, and frame a face frame on each thermal infrared image of the positive samples as a calibration frame; (2) acquire training labels; (3) build a convolutional neural network, input the training set and training labels into it for training, and optimize it with a loss function to obtain the required trained model of the convolutional neural network; (4) input a thermal infrared image from the test set and obtain a face detection frame through the convolutional neural network. The invention trains the convolutional neural network on thermal infrared images until it meets the requirements, and can automatically detect faces in thermal infrared images, accurately framing the face region and reducing the detection error rate.
Description
Technical Field
The invention belongs to the technical field of biological feature recognition, and particularly relates to a face detection method.
Background
Face detection obtains the specific positions of all faces in a picture. Each position is usually represented by a rectangular frame: the object inside the rectangle is the face, and everything outside it is background.
Visible-light face detection is widely applied in customs, stations, attendance checking, automatic driving, suspect tracking and other fields. However, visible-light face detection cannot work without an external light source and cannot detect a face wearing a mask. Visible light also cannot be used for liveness detection, since it cannot judge whether the imaged subject is a real person; the method is therefore easily deceived by photographs and made-up faces, making the face detection results inaccurate and of limited use.
A thermal infrared image is formed by thermal radiation imaging, which is based on differences in the infrared radiation of objects: an infrared thermal imager converts the naturally emitted infrared radiation distribution on an object's surface into a visible image. Because different objects, or different parts of the same object, usually have different thermal radiation characteristics (e.g. temperature and emissivity), objects in a thermal infrared image are distinguished by these radiation differences. A thermal infrared image can therefore easily solve the liveness-detection problem: the face is hot compared with other objects and appears white in the grayscale image, and because the capillary distribution differs across the facial organs, their thermal radiation differs and the facial features can be seen.
Active near-infrared face recognition is currently on the rise, but the technique requires an active light source and is limited to distances of 50-100 cm. Moreover, the active light source produces obvious reflections on glasses, reducing eye-localization accuracy, and the light source degrades and attenuates after long use. At present there is no domestic face detection method for thermal infrared images.
Disclosure of Invention
In view of the above defects or improvement requirements of the prior art, the present invention provides a face detection method for thermal infrared images that can clearly frame the face position in a thermal infrared image without any light source, so as to meet the detection requirements for thermal infrared images.
In order to achieve the above object, according to an aspect of the present invention, there is provided a method for detecting a human face using thermal infrared images, comprising the steps of:
(1) take N thermal infrared images as positive samples and L thermal infrared images showing no face as negative samples to form a training set, obtain M thermal infrared images as a test set, and frame a face frame on each thermal infrared image of the positive samples as a calibration frame; the mark of each thermal infrared image in the positive samples is 1, and the mark of each thermal infrared image in the negative samples is 0;
(2) scale down the centre-point coordinates and the width and height of the calibration frame of each thermal infrared image in proportion, and store the scaled centre-point coordinates, the scaled width and height, and the mark of the thermal infrared image together in a separate txt file, giving N txt files in total;

in addition, store the path of each thermal infrared image in the training set, together with the marks of all the thermal infrared images in the negative samples, in another txt file;

in this way, N + 1 txt files are obtained as training labels;
(3) building a convolutional neural network, inputting a training set and a training label into the convolutional neural network together for training, and optimizing the convolutional neural network by using a loss function so as to obtain a required training model of the convolutional neural network;
(4) input a thermal infrared image from the test set, and obtain a face detection frame through the convolutional neural network.
Preferably, in step (1) the thermal infrared images are collected by a thermal infrared imager under the following conditions: each person's face is recorded on video by the medium-wave thermal infrared imager at several distances and for several set durations, the videos are cut at a set frame interval, a set number of photos is selected, and N thermal infrared images are then selected for the training set.
Preferably, the training labels generated in step (2) are specifically as follows:
(2.1) storing the relative coordinates of the center point of the calibration frame:
wherein (x1, y1) and (x2, y2) are the coordinates of two diagonal corners of the calibration frame, which together determine the frame; x1 and x2 are width coordinates and y1 and y2 are height coordinates in the x-y image coordinate system, with x1 > x2 and y1 > y2;

centre_x = (x1 + x2)/(2w) is the width coordinate and centre_y = (y1 + y2)/(2h) the height coordinate of the centre point of the calibration frame in the x-y image coordinate system, where w is the width of the thermal infrared image containing the calibration frame and h is its height;
(2.2) store the size of the calibration frame relative to the thermal infrared image containing it:

frame_x = (x1 - x2)/w is the relative width and frame_y = (y1 - y2)/h the relative height of the calibration frame;

the above centre_x, centre_y, frame_x and frame_y are stored together with the mark of the corresponding positive-sample thermal infrared image in one txt file; the marks and centre_x, centre_y, frame_x, frame_y of different positive-sample thermal infrared images are stored in different txt files.
Preferably, the convolutional neural network adopts a Darknet framework and a Yolo network, the Darknet framework is used for performing convolution, maximum pooling and normalization operations on the input thermal infrared image so as to obtain the weight of the convolutional neural network, and the Yolo network is used for processing the weight of the convolutional neural network so as to perform face determination and position regression.
Preferably, the size relationship between the calibration box and the prediction box constructed by the convolutional neural network is as follows:
a_x = d_x + Δ(m_x)

a_y = d_y + Δ(m_y)
where a_x and a_y are the width and height coordinates of the centre of the calibration frame in the u-v image coordinate system, a_w and a_h are the width and height of the calibration frame, Δ(m_x) and Δ(m_y) are the offsets in the width and height directions from the centre of the calibration frame to the centre of the prediction box, d_x and d_y are the width and height coordinates of the centre of the prediction box, p_w and p_h are the width and height of the prediction box, m_w and m_h are the width and height scaling ratios of the prediction box, and the Δ function is a sigmoid function.
Preferably, six prediction boxes are constructed by the convolutional neural network and divided between two scales; sorted by height from largest to smallest they are prediction boxes I, II, III, IV, V and VI, with prediction boxes I, III and V allocated to the first scale and prediction boxes II, IV and VI allocated to the second scale.
Preferably, in step (3), the loss function is optimized for the convolutional neural network specifically as follows:
where loss is the total loss; S² is the number of grid cells of the convolutional neural network and B is the number of prediction boxes per cell; 1_ij^obj indicates whether the j-th prediction box of the i-th grid cell is responsible for a target, taking the value 1 when responsible and 0 when not; 1_ij^noobj indicates that the j-th prediction box of the i-th grid cell is not responsible for a target, taking the value 1 when there is no target and 0 when there is; λ_coord = 5 and λ_noobj = 0.5; x_i and y_i are the width and height coordinates of the centre point of the i-th prediction box, and x̂_i and ŷ_i those of the i-th calibration frame; w_i and h_i are the width and height of the i-th prediction box, and ŵ_i and ĥ_i those of the i-th calibration frame; c_i is the confidence of the i-th prediction box (1 if selected, 0 if not) and ĉ_i the confidence of the i-th calibration frame (1 if selected, 0 if not); p_i is the classification probability of a face in the i-th prediction box and p̂_i that in the i-th calibration frame; c is the face/no-face class and classes is the set of the face and no-face classes;
After the loss is obtained, the parameters are updated with a stochastic gradient descent algorithm: the convolutional neural network continually selects the optimal parameters for the current target, updating its parameters according to the loss, and updating stops once the network reaches the required index.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
1) The invention inputs thermal infrared images into a convolutional neural network for training until the network meets the requirements, and can automatically detect faces in thermal infrared images, accurately framing the face region and reducing the face detection error rate.
2) The invention performs face detection with thermal infrared technology and can clearly frame the face position in a thermal infrared image without any light source, meeting the detection requirements for thermal infrared images.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic flow chart of the present invention for obtaining training labels;
FIG. 3 is a flow chart of the loss calculation of the convolutional neural network in the present invention;
FIG. 4 is a thermal infrared image to be detected;
FIG. 5 is a schematic illustration of the thermal infrared image of FIG. 4 after detection;
FIG. 6 is a schematic diagram of three prediction boxes in a first scale;
FIG. 7 is a schematic diagram of three prediction boxes at a second scale;
fig. 8 is a schematic diagram of detection of two faces.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Referring to the attached drawings, the method for detecting the human face of the thermal infrared image comprises the following steps:
(1) Take N thermal infrared images as positive samples and L thermal infrared images showing no face as negative samples to form a training set, and obtain M thermal infrared images as a test set.
A sufficient number of thermal infrared images must be collected to guarantee enough experimental data. Specifically, a medium-wave thermal infrared imager (model TAURUS-110kM, IRCAM, Germany) can be used, with the following test conditions: faces are recorded at distances of 2 m, 3 m and 5 m from the camera, and a video of set duration is recorded for each person; after each video is cut at a set frame interval, a set number of photos is selected. For example, 200 people can be filmed and one frame captured every 50 frames; the data cover different poses, different scene backgrounds and scenes with external light sources, and this large volume of experiments guarantees the accuracy of the face detection model in subsequent use. The frames extracted from the videos are then screened and images that do not meet the training requirement are removed; useless data are filtered out of the training data to prevent the computer from learning them and corrupting the real parameters in deep learning, e.g. blurred images, which easily appear during pose changes, are generally removed. In this way 140,000 thermal infrared images can be obtained as the training set and M = 60,000 as the test set; the training set contains N = 35,000 thermal infrared images as positive samples and L = 105,000 as negative samples. The thermal infrared images in the positive samples show faces, on which face frames can be selected, while the images in the negative samples show no face, e.g. only devices, clothing, walls, etc.
Then a face frame is framed on each thermal infrared image of the positive samples as a calibration frame; the mark of each thermal infrared image in the positive samples is 1, and the mark of each thermal infrared image in the negative samples is 0;
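As a minimal illustration of the frame-sampling rule described above (one photo kept every 50 frames of each recorded video), the index selection can be sketched as follows; reading the actual video stream would require a video library such as OpenCV, which is omitted here, and the function name is hypothetical:

```python
def sample_frame_indices(total_frames, step=50):
    """Indices of the frames kept as candidate training photos.

    One frame is retained every `step` frames, matching the 50-frame
    interception interval described in the text.
    """
    return list(range(0, total_frames, step))

# e.g. a 250-frame video yields candidate frames 0, 50, 100, 150 and 200
```

The retained frames would then be screened by hand to drop blurred or otherwise unusable images, as described above.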
(2) Scale down the centre-point coordinates and the width and height of the calibration frame of each thermal infrared image in proportion, and store the scaled centre-point coordinates, the scaled width and height, and the mark of the thermal infrared image together in a separate txt file, giving N txt files in total;

in addition, store the path of each thermal infrared image in the training set, together with the marks of all the thermal infrared images in the negative samples, in another txt file;

in this way, a total of N + 1 txt files are obtained as training labels, as follows:
(2.1) storing the relative coordinates of the center point of the calibration frame:
wherein (x1, y1) and (x2, y2) are the coordinates of two diagonal corners of the calibration frame, which together determine the frame; x1 and x2 are width coordinates and y1 and y2 are height coordinates in the x-y image coordinate system, with x1 > x2 and y1 > y2;

centre_x = (x1 + x2)/(2w) is the width coordinate and centre_y = (y1 + y2)/(2h) the height coordinate of the centre point of the calibration frame in the x-y image coordinate system, where w is the width of the thermal infrared image containing the calibration frame and h is its height;
(2.2) store the size of the calibration frame relative to the thermal infrared image containing it:

frame_x = (x1 - x2)/w is the relative width and frame_y = (y1 - y2)/h the relative height of the calibration frame;

the above centre_x, centre_y, frame_x and frame_y are stored together with the mark of the corresponding positive-sample thermal infrared image in one txt file; the marks and centre_x, centre_y, frame_x, frame_y of different positive-sample thermal infrared images are stored in different txt files.
The invention only needs to store the relative coordinates of the centre point of the calibration frame and the relative size of the calibration frame, which saves the acquisition time of a large number of parameters.
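The label computation of step (2) can be sketched as below. This is a hedged reconstruction: the function names and the exact txt layout (make_label, space-separated values) are illustrative, but the relative-coordinate arithmetic follows the definitions in the text (x1 > x2, y1 > y2; w, h the image width and height):

```python
def make_label(x1, y1, x2, y2, w, h, mark=1):
    """Return (centre_x, centre_y, frame_x, frame_y, mark) for one calibration frame."""
    centre_x = (x1 + x2) / (2.0 * w)  # relative width coordinate of the centre point
    centre_y = (y1 + y2) / (2.0 * h)  # relative height coordinate of the centre point
    frame_x = (x1 - x2) / w           # frame width relative to the image
    frame_y = (y1 - y2) / h           # frame height relative to the image
    return centre_x, centre_y, frame_x, frame_y, mark

def write_label(path, values):
    # one separate txt file per positive-sample image, as in step (2)
    with open(path, "w") as f:
        f.write(" ".join(str(v) for v in values) + "\n")
```

For example, a 200x100 frame with corners (300, 200) and (100, 100) in a 400x300 image yields relative centre (0.5, 0.5) and relative size (0.5, 1/3).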
(3) Building a convolutional neural network, inputting a training set and a training label into the convolutional neural network together for training, and optimizing the convolutional neural network by using a loss function so as to obtain a required training model of the convolutional neural network;
the convolutional neural network adopts a Darknet framework, and the Darknet framework is used for performing convolution, maximum pooling and normalization operations on an input thermal infrared image so as to obtain the weight of the convolutional neural network, specifically, the Darknet framework trains a 53-layer network and provides a 106-layer fully-convolutional bottom layer framework. In the forward propagation process, the size of the tensor is transformed by changing the step size of the convolution kernel, such as stride (2, 2), which is equivalent to reducing the side length of the image by half (i.e. reducing the area to 1/4). In the network, 5 times of reduction is needed, 1/2 which reduces the characteristic diagram to the original input size5I.e., 1/32. The input is 416x416 and the output is 13x13(416/32 ═ 13). The backpone would narrow the output profile to 1/32 at the input.
The convolutional neural network also adopts a Yolo network, which processes the weights of the convolutional neural network to perform face judgment and position regression. By designing a Fast Anchor (fast prediction-box) algorithm, six prediction boxes are constructed and divided between two scales; sorted by height from largest to smallest they are prediction boxes I, II, III, IV, V and VI, with prediction boxes I, III and V allocated to the first scale and prediction boxes II, IV and VI allocated to the second scale.
The size relationship between the calibration box and the prediction box constructed by the convolutional neural network is as follows:
a_x = d_x + Δ(m_x)

a_y = d_y + Δ(m_y)
where a_x and a_y are the width and height coordinates of the centre of the calibration frame in the u-v image coordinate system, a_w and a_h are the width and height of the calibration frame, Δ(m_x) and Δ(m_y) are the offsets in the width and height directions from the centre of the calibration frame to the centre of the prediction box, d_x and d_y are the width and height coordinates of the centre of the prediction box, p_w and p_h are the width and height of the prediction box, and m_w and m_h are the width and height scaling ratios of the prediction box. The Δ function is a sigmoid, which scales the predicted quantities into the range 0-1 and so speeds convergence. When detecting whether a face is present, the aspect ratio is close to 1:1, so prediction boxes with a large aspect-ratio difference do not appear.
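A hedged sketch of this decode step follows. The centre equations are exactly those above; for the width and height the text defines m_w and m_h as scaling ratios but does not state the formula, so the standard YOLO exponential scaling a_w = p_w·e^(m_w) is assumed here:

```python
import math

def sigmoid(z):
    # the Δ function: squashes the predicted offset into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def decode_box(d_x, d_y, p_w, p_h, m_x, m_y, m_w, m_h):
    a_x = d_x + sigmoid(m_x)   # centre coordinate, width direction
    a_y = d_y + sigmoid(m_y)   # centre coordinate, height direction
    a_w = p_w * math.exp(m_w)  # assumed exponential scaling of the width
    a_h = p_h * math.exp(m_h)  # assumed exponential scaling of the height
    return a_x, a_y, a_w, a_h

# with all raw predictions at 0, the box sits half a cell from (d_x, d_y)
# and keeps the prior size (p_w, p_h)
```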
The loss function is optimized for the convolutional neural network as follows:
in the above formula, the total variance is adopted for the loss functions of w and h, and the binary cross entropy is used for the loss function of the confidence coefficient. The first row of the expression is the total square error and is used as the loss function of the position prediction, the second row of the expression uses the root total variance as the loss function of the height and the width, the third row and the fourth row of the expression uses the binary cross entropy as the loss function of the confidence coefficient, and the fifth row of the expression uses SSE as the loss function of the category probability.
where loss is the total loss; S² is the number of grid cells of the convolutional neural network and B is the number of prediction boxes per cell; 1_ij^obj indicates whether the j-th prediction box of the i-th grid cell is responsible for a target, taking the value 1 when responsible and 0 when not; 1_ij^noobj indicates that the j-th prediction box of the i-th grid cell is not responsible for a target, taking the value 1 when there is no target and 0 when there is; λ_coord = 5 and λ_noobj = 0.5; x_i and y_i are the width and height coordinates of the centre point of the i-th prediction box, and x̂_i and ŷ_i those of the i-th calibration frame; w_i and h_i are the width and height of the i-th prediction box, and ŵ_i and ĥ_i those of the i-th calibration frame; c_i is the confidence of the i-th prediction box (1 if selected, 0 if not) and ĉ_i the confidence of the i-th calibration frame (1 if selected, 0 if not); p_i is the classification probability of a face in the i-th prediction box and p̂_i that in the i-th calibration frame; c is the face/no-face class and classes is the set of the face and no-face classes;
After the loss is obtained, the parameters are updated with a stochastic gradient descent algorithm: the convolutional neural network continually selects the optimal parameters for the current target and updates its parameters according to the loss so that its output matches the training labels; updating stops once the network reaches the required index.
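As a simplified single-box illustration of the loss terms described above (sum of squared errors for position, square-rooted size errors, binary cross-entropy for confidence, with λ_coord = 5 and λ_noobj = 0.5), one could write the following sketch; it is not the full grid-summed loss, omits the class-probability term, and the function name is hypothetical:

```python
import math

LAMBDA_COORD = 5.0
LAMBDA_NOOBJ = 0.5

def box_loss(pred, truth, responsible=True):
    """Loss contribution of a single prediction box.

    pred / truth are (x, y, w, h, confidence) tuples; `responsible` plays
    the role of the 1_ij^obj indicator. Confidence c is assumed strictly
    inside (0, 1) so the logarithms are defined.
    """
    x, y, w, h, c = pred
    xt, yt, wt, ht, ct = truth
    bce = -(ct * math.log(c) + (1.0 - ct) * math.log(1.0 - c))  # confidence BCE
    if not responsible:
        # non-responsible boxes contribute only a down-weighted confidence term
        return LAMBDA_NOOBJ * bce
    coord = LAMBDA_COORD * ((x - xt) ** 2 + (y - yt) ** 2)
    size = LAMBDA_COORD * ((math.sqrt(w) - math.sqrt(wt)) ** 2
                           + (math.sqrt(h) - math.sqrt(ht)) ** 2)
    return coord + size + bce
```

For a box that matches the calibration frame exactly except for a confidence of 0.9 against a target of 1.0, only the cross-entropy term -ln(0.9) remains.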
(4) Input the thermal infrared image to be detected to obtain the face detection result. The invention processes a single image in 0.024 s, with high precision and an accuracy above 98.6%.
In addition, the coordinates mentioned in the invention are coordinates in a u-v image coordinate system; the width of a thermal infrared image or of a frame is its horizontal side length, and the height is its vertical side length.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (7)
1. A face detection method of a thermal infrared image is characterized by comprising the following steps:
(1) take N thermal infrared images as positive samples and L thermal infrared images showing no face as negative samples to form a training set, obtain M thermal infrared images as a test set, and frame a face frame on each thermal infrared image of the positive samples as a calibration frame; the mark of each thermal infrared image in the positive samples is 1, and the mark of each thermal infrared image in the negative samples is 0;
(2) scale down the centre-point coordinates and the width and height of the calibration frame of each thermal infrared image in proportion, and store the scaled centre-point coordinates, the scaled width and height, and the mark of the thermal infrared image together in a separate txt file, giving N txt files in total;

in addition, store the path of each thermal infrared image in the training set, together with the marks of all the thermal infrared images in the negative samples, in another txt file;

in this way, N + 1 txt files are obtained as training labels;
(3) building a convolutional neural network, inputting a training set and a training label into the convolutional neural network together for training, and optimizing the convolutional neural network by using a loss function so as to obtain a required training model of the convolutional neural network;
(4) input a thermal infrared image from the test set, and obtain a face detection frame through the convolutional neural network.
2. The method for detecting a human face in a thermal infrared image according to claim 1, wherein in step (1) the thermal infrared images are collected by a thermal infrared imager under the following conditions: each person's face is recorded on video by the medium-wave thermal infrared imager at several distances and for several set durations, the videos are cut at a set frame interval, a set number of photos is selected, and the training set and the test set are then obtained.
3. The method for detecting the human face with the thermal infrared image according to claim 1, wherein the training label generated in the step (2) is specifically as follows:
(2.1) storing the relative coordinates of the center point of the calibration frame:
wherein (x1, y1) and (x2, y2) are the coordinates of two diagonal corners of the calibration frame, which together determine the frame; x1 and x2 are width coordinates and y1 and y2 are height coordinates in the x-y image coordinate system, with x1 > x2 and y1 > y2;

centre_x = (x1 + x2)/(2w) is the width coordinate and centre_y = (y1 + y2)/(2h) the height coordinate of the centre point of the calibration frame in the x-y image coordinate system, where w is the width of the thermal infrared image containing the calibration frame and h is its height;
(2.2) store the size of the calibration frame relative to the thermal infrared image containing it:

frame_x = (x1 - x2)/w is the relative width and frame_y = (y1 - y2)/h the relative height of the calibration frame;

the above centre_x, centre_y, frame_x and frame_y are stored together with the mark of the corresponding positive-sample thermal infrared image in one txt file; the marks and centre_x, centre_y, frame_x, frame_y of different positive-sample thermal infrared images are stored in different txt files.
4. The method according to claim 1, wherein the convolutional neural network employs a Darknet framework and a Yolo network, the Darknet framework is used for performing convolution, max pooling and normalization on the input thermal infrared image to obtain weights of the convolutional neural network, and the Yolo network is used for processing the weights of the convolutional neural network to perform face determination and position regression.
5. The face detection method for thermal infrared images according to claim 1, wherein the size relationship between the calibration frame and the prediction frame constructed by the convolutional neural network is as follows:
a_x = d_x + Δ(m_x)
a_y = d_y + Δ(m_y)
wherein a_x and a_y respectively denote the width and height coordinates of the center of the calibration frame in the u-v image coordinate system, a_w and a_h denote the width and height of the calibration frame, Δ(m_x) and Δ(m_y) respectively denote the offsets in the width and height directions from the center of the prediction frame to the center of the calibration frame, d_x and d_y respectively denote the width and height coordinates of the center of the prediction frame, p_w and p_h respectively denote the width and height of the prediction frame, m_w and m_h respectively denote the width and height scaling ratios of the prediction frame, and the Δ function is the sigmoid function.
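The claim-5 decoding can be sketched directly. The two center equations follow the claim; the width/height relation is not given explicitly there, so the exponential scaling a_w = p_w·e^(m_w) below is an assumption borrowed from the standard YOLO convention, consistent with m_w, m_h being called scaling ratios.

```python
import math

# Sketch of the claim-5 relation between prediction frame (d, p, m) and
# calibration frame (a). Centre offsets pass through a sigmoid (the Δ
# function); the exponential width/height scaling is an assumed convention.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def decode_box(d_x, d_y, p_w, p_h, m_x, m_y, m_w, m_h):
    a_x = d_x + sigmoid(m_x)     # center width coordinate of the calibration frame
    a_y = d_y + sigmoid(m_y)     # center height coordinate
    a_w = p_w * math.exp(m_w)    # assumed exponential scaling of prediction width
    a_h = p_h * math.exp(m_h)    # assumed exponential scaling of prediction height
    return a_x, a_y, a_w, a_h
```

With zero offsets and zero scaling logits, the decoded center lands half a unit from the prediction center and the size is unchanged: decode_box(1, 2, 3, 4, 0, 0, 0, 0) gives (1.5, 2.5, 3.0, 4.0).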
6. The method according to claim 5, wherein six prediction frames are constructed by the convolutional neural network and divided between two scales; sorted by height from large to small, the six prediction frames are prediction frame I, prediction frame II, prediction frame III, prediction frame IV, prediction frame V and prediction frame VI, wherein the first scale is assigned prediction frame I, prediction frame III and prediction frame V, and the second scale is assigned prediction frame II, prediction frame IV and prediction frame VI.
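Claim 6's anchor layout can be sketched as a small helper. The alternating assignment (odd-numbered frames to the first scale, even-numbered to the second) is assumed from the claim's numbering pattern; the function name and anchor representation are illustrative.

```python
# Hedged sketch of claim 6: sort six prediction frames by height (large to
# small) and assign them alternately to two detection scales.
def assign_scales(anchors):
    """anchors: list of (width, height) tuples; returns (scale1, scale2)."""
    ordered = sorted(anchors, key=lambda a: a[1], reverse=True)
    scale1 = ordered[0::2]  # frames I, III, V (assumed assignment)
    scale2 = ordered[1::2]  # frames II, IV, VI
    return scale1, scale2
```

In YOLO-style detectors the larger anchors typically serve the coarser feature map, which is one plausible reading of the two-scale split here.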
7. The face detection method for thermal infrared images according to claim 1, wherein in step (3) the loss function used to optimize the convolutional neural network is as follows:
loss = λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(x_i − x̂_i)² + (y_i − ŷ_i)²]
     + λ_coord Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} [(√w_i − √ŵ_i)² + (√h_i − √ĥ_i)²]
     + Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{obj} (c_i − ĉ_i)²
     + λ_noobj Σ_{i=0}^{S²} Σ_{j=0}^{B} 1_{ij}^{noobj} (c_i − ĉ_i)²
     + Σ_{i=0}^{S²} 1_i^{obj} Σ_{c∈classes} (p_i(c) − p̂_i(c))²
wherein loss denotes the loss, S² denotes the number of grids of the convolutional neural network, and B denotes the number of prediction boxes per grid cell; 1_{ij}^{obj} indicates whether the j-th anchor box of the i-th grid is responsible for a target, taking the value 1 when responsible and 0 when not; 1_{ij}^{noobj} indicates that the j-th prediction box of the i-th grid is not responsible for any target, taking the value 1 when no target is present and 0 when a target is present; λ_coord = 5 and λ_noobj = 0.5; x_i and y_i respectively denote the width and height coordinates of the center point of the i-th prediction box, and x̂_i and ŷ_i respectively denote the width and height coordinates of the center point of the i-th calibration frame; w_i and h_i respectively denote the width and height of the i-th prediction box, and ŵ_i and ĥ_i respectively denote the width and height of the i-th calibration frame; c_i denotes the confidence of the i-th prediction box, taking the value 1 for a selected prediction box and 0 for an unselected one; ĉ_i denotes the confidence of the i-th calibration frame, taking the value 1 for a selected calibration frame and 0 for an unselected one; p_i denotes the classification probability of a face in the i-th prediction box, p̂_i denotes the classification probability of a face in the i-th calibration frame, c denotes the face/no-face class, and classes denotes the set of the face and no-face classes;
and after the loss is obtained, the parameters of the convolutional neural network are updated with a stochastic gradient descent algorithm according to the loss result, so that the convolutional neural network continually moves toward the optimal parameters for the current target, and updating stops once the convolutional neural network reaches the required performance index.
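The per-box contribution to the claim-7 loss can be sketched in a simplified scalar form. This is a hedged illustration, not the patent's code: it evaluates one prediction/calibration pair, with the λ values taken from the claim (5 and 0.5); the full loss sums this over all S² grids and B boxes per grid, and the dict keys are assumed names.

```python
import math

LAMBDA_COORD = 5.0   # λ_coord from claim 7
LAMBDA_NOOBJ = 0.5   # λ_noobj from claim 7

def box_loss(pred, truth, responsible):
    """pred/truth: dicts with keys x, y, w, h, c (confidence), p (face prob)."""
    if responsible:
        # coordinate term: squared error on the center, weighted by λ_coord
        coord = LAMBDA_COORD * ((pred["x"] - truth["x"]) ** 2 +
                                (pred["y"] - truth["y"]) ** 2)
        # size term: squared error on square-rooted width/height
        size = LAMBDA_COORD * ((math.sqrt(pred["w"]) - math.sqrt(truth["w"])) ** 2 +
                               (math.sqrt(pred["h"]) - math.sqrt(truth["h"])) ** 2)
        conf = (pred["c"] - truth["c"]) ** 2      # confidence term
        cls = (pred["p"] - truth["p"]) ** 2       # face/no-face classification term
        return coord + size + conf + cls
    # box not responsible for any target: only the down-weighted confidence term
    return LAMBDA_NOOBJ * (pred["c"] - truth["c"]) ** 2
```

A perfectly matched responsible box contributes zero loss; a non-responsible box contributes only its confidence error scaled by 0.5, which implements the claim's suppression of background boxes.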
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911394420.1A CN111209822A (en) | 2019-12-30 | 2019-12-30 | Face detection method of thermal infrared image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111209822A true CN111209822A (en) | 2020-05-29 |
Family
ID=70786541
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911394420.1A Pending CN111209822A (en) | 2019-12-30 | 2019-12-30 | Face detection method of thermal infrared image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111209822A (en) |
- 2019-12-30 CN CN201911394420.1A patent/CN111209822A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108038474A (en) * | 2017-12-28 | 2018-05-15 | Shenzhen Intellifusion Technologies Co., Ltd. | Face detection method, convolutional neural network parameter training method, device and medium |
CN108764057A (en) * | 2018-05-03 | 2018-11-06 | Wuhan Guide Sensmart Tech Co., Ltd. | Far-infrared face detection method and system based on deep learning |
CN109902556A (en) * | 2019-01-14 | 2019-06-18 | Ping An Technology (Shenzhen) Co., Ltd. | Pedestrian detection method, system, computer device and computer-readable storage medium |
CN110399905A (en) * | 2019-07-03 | 2019-11-01 | Changzhou University | Method for detecting and describing safety helmet wearing conditions in construction scenes |
Non-Patent Citations (1)
Title |
---|
CAI Chengtao et al.: "Marine Buoy Target Detection Technology" (《海洋浮标目标探测技术》), Harbin Engineering University Press, pages: 51-53 *
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111985374A (en) * | 2020-08-12 | 2020-11-24 | 汉王科技股份有限公司 | Face positioning method and device, electronic equipment and storage medium |
CN111985374B (en) * | 2020-08-12 | 2022-11-15 | 汉王科技股份有限公司 | Face positioning method and device, electronic equipment and storage medium |
CN112199993A (en) * | 2020-09-01 | 2021-01-08 | 广西大学 | Method for identifying transformer substation insulator infrared image detection model in any direction based on artificial intelligence |
CN112199993B (en) * | 2020-09-01 | 2022-08-09 | 广西大学 | Method for identifying transformer substation insulator infrared image detection model in any direction based on artificial intelligence |
CN112115838A (en) * | 2020-09-11 | 2020-12-22 | 南京华图信息技术有限公司 | Thermal infrared image spectrum fusion human face classification method |
CN112115838B (en) * | 2020-09-11 | 2024-04-05 | 南京华图信息技术有限公司 | Face classification method based on thermal infrared image spectrum fusion |
CN112232208A (en) * | 2020-10-16 | 2021-01-15 | 蓝普金睛(北京)科技有限公司 | Infrared human face temperature measurement system and method thereof |
CN112529947A (en) * | 2020-12-07 | 2021-03-19 | 北京市商汤科技开发有限公司 | Calibration method and device, electronic equipment and storage medium |
CN112926478A (en) * | 2021-03-08 | 2021-06-08 | 新疆爱华盈通信息技术有限公司 | Gender identification method, system, electronic device and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111209822A (en) | Face detection method of thermal infrared image | |
CN108549873B (en) | Three-dimensional face recognition method and three-dimensional face recognition system | |
CN113705478B (en) | Mangrove single wood target detection method based on improved YOLOv5 | |
CN111882579A (en) | Large infusion foreign matter detection method, system, medium and equipment based on deep learning and target tracking | |
JP4559437B2 (en) | Sky detection in digital color images | |
CN111476827B (en) | Target tracking method, system, electronic device and storage medium | |
KR102521386B1 (en) | Dimension measuring device, dimension measuring method, and semiconductor manufacturing system | |
US20070154088A1 (en) | Robust Perceptual Color Identification | |
CN114241548A (en) | Small target detection algorithm based on improved YOLOv5 | |
CN109253722A (en) | Merge monocular range-measurement system, method, equipment and the storage medium of semantic segmentation | |
CN111429448B (en) | Biological fluorescent target counting method based on weak segmentation information | |
CN104240264A (en) | Height detection method and device for moving object | |
CN108428220A (en) | Satellite sequence remote sensing image sea island reef region automatic geometric correction method | |
US8094971B2 (en) | Method and system for automatically determining the orientation of a digital image | |
CN109191434A (en) | Image detecting system and detection method in a kind of cell differentiation | |
CN110232387A (en) | A kind of heterologous image matching method based on KAZE-HOG algorithm | |
CN113435282B (en) | Unmanned aerial vehicle image ear recognition method based on deep learning | |
CN111914761A (en) | Thermal infrared face recognition method and system | |
CN116448019B (en) | Intelligent detection device and method for quality flatness of building energy-saving engineering | |
CN109190458A (en) | A kind of person of low position's head inspecting method based on deep learning | |
CN111860587A (en) | Method for detecting small target of picture | |
CN111862040B (en) | Portrait picture quality evaluation method, device, equipment and storage medium | |
CN108154513A (en) | Cell based on two photon imaging data detects automatically and dividing method | |
Li et al. | An automatic plant leaf stoma detection method based on YOLOv5 | |
CN112183287A (en) | People counting method of mobile robot under complex background |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||