CN109815814B - Face detection method based on convolutional neural network - Google Patents

Face detection method based on convolutional neural network

Publication number
CN109815814B
Authority
CN
China
Prior art keywords
image
neural network
convolutional neural
loss function
face
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811572322.8A
Other languages
Chinese (zh)
Other versions
CN109815814A (en)
Inventor
刘高华
王萌
苏寒松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-12-21
Filing date
2018-12-21
Publication date
2023-01-24
Application filed by Tianjin University
Priority to CN201811572322.8A
Publication of CN109815814A
Application granted
Publication of CN109815814B

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The invention discloses a face detection method based on a convolutional neural network, which comprises the following steps: step (1), establishing a database; step (2), preprocessing the images in the database; step (3), training the constructed deep learning network; and step (4), testing the training result. The method achieves high detection accuracy for faces that are occluded, at different angles, or in profile, and for small and blurry faces in the picture; the network structure is simple, there are few iteration parameters, and the training time is short.

Description

Face detection method based on convolutional neural network
Technical Field
The invention belongs to the field of computer vision and artificial intelligence, and particularly relates to a face detection method based on a convolutional neural network.
Background
Face detection is the process of determining the position and size of faces in an image. It is an important component of computer vision and a key preprocessing step for face recognition; because detection accuracy largely determines recognition accuracy and strongly affects subsequent work, research on face detection has great significance and practical value.
Face detection is widely used in everyday life, for example in identity authentication and security, in media and entertainment, in face-related electronic products such as mobile phones and digital cameras, and in image retrieval. Face detection methods can be roughly divided into conventional methods (including template-matching-based methods, distance-based methods, and the like) and deep-learning-based methods.
In recent years, deep learning has been continuously refined and developed and is widely applied to classification and regression tasks. Deep-learning-based face detection has developed as well, but current methods, taking the most commonly used MTCNN method as an example, are neither fast enough nor accurate enough; in particular, they have difficulty detecting faces in images or video that are occluded, at different angles, in profile, or small in the picture. Because face detection is a preprocessing step for face recognition, its accuracy also greatly affects the accuracy of subsequent recognition, so solving these problems is very important.
Disclosure of Invention
Based on the prior art, the invention provides a face detection method based on a convolutional neural network, directed in particular at detecting faces that are in profile, affected by illumination, or very small in the picture.
The face detection method based on a convolutional neural network provided by the invention comprises the following steps:
Step 1, establishing a database to obtain image data, preprocessing the image data, and constructing a convolutional neural network;
Step 2, performing four iterations of processing on the preprocessed data through an image feature analysis module in the convolutional neural network to generate image feature parameters;
Step 3, processing the image feature parameters through a fully connected layer in the convolutional neural network to generate a one-dimensional image vector;
Step 4, classifying and regressing the one-dimensional image vector through a classification layer in the convolutional neural network to obtain the position coordinates of the face image.
The processing of the preprocessed data by the image feature analysis module in step 2 comprises the following steps:
Step 2.1, the convolution layer of the image feature analysis module extracts image features by convolving its weights with the preprocessed data;
Step 2.2, the activation function layer of the image feature analysis module applies the ReLU function to the image features as a nonlinear operation to obtain nonlinear feature map parameters;
Step 2.3, the max pooling layer of the image feature analysis module reduces the nonlinear feature map parameters.
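Steps 2.1 to 2.3 can be illustrated with a minimal pure-Python sketch of one feature-analysis stage. The kernel values, the stride of 1, the absence of padding, the 2×2 pooling window, and all function names are assumptions chosen for illustration; the patent does not specify them.

```python
def conv2d(image, kernel, bias=0.0):
    """Stride-1 convolution without padding (cross-correlation, as in most CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw)) + bias
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise y = max(0, x)."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def feature_stage(image, kernel, bias=0.0):
    """One stage of step 2: convolution, then ReLU, then max pooling."""
    return max_pool(relu(conv2d(image, kernel, bias)))
```

Applying `feature_stage` four times in sequence, each time with its own kernel, would correspond to the four iterations of step 2 under these assumptions.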
The classification and regression performed on the one-dimensional image vector by the classification layer in step 4 comprises the following steps:
Step 4.1, the weights are iteratively updated by stochastic gradient descent so that the loss function is continuously adjusted, and the hyperparameters used during training are continuously adjusted to obtain the best training result, the hyperparameters comprising: number of iterations, batch size, maximum number of iterations, and learning rate;
Step 4.2, the loss function selected for classification combines the center loss function with the softmax loss function, expressed as:
$$L = L_S + \lambda L_c = -\sum_{i=1}^{m} \log \frac{e^{W_{y_i}^{T} x_i + b_{y_i}}}{\sum_{j=1}^{n} e^{W_j^{T} x_i + b_j}} + \frac{\lambda}{2} \sum_{i=1}^{m} \lVert x_i - c_{y_i} \rVert_2^2$$
where $L_S$ is the softmax loss function, $L_c$ is the center loss function, and $\lambda$ is the coefficient weighting the two, here taken as $\lambda = 0.1$; $Wx + b$ is the output of the fully connected layer, the term inside the log represents the probability that $x_i$ belongs to category $y_i$, and $c$ represents the feature center of the category;
Step 4.3, the loss function adopted for regression is the Euclidean distance loss function, expressed as:
$$L = \lVert \hat{y}_i - y_i \rVert_2^2, \quad y_i \in \mathbb{R}^4$$
where $\hat{y}_i$ is the output predicted by the network and $y_i$ is the annotated ground-truth label, namely the coordinates of the 68 face key points;
Step 4.4, the coordinates of the 68 face key points output under the optimal weights are compared with the labeled face key-point coordinates and faces in the database, and the accuracy of the convolutional neural network for face detection is calculated.
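The classification loss of step 4.2 can be sketched as follows, assuming the commonly used form in which the center loss is half the squared distance between a feature and the center of its class. The function names, the fixed class centers, and the toy inputs are hypothetical; in practice the centers would be updated during training.

```python
import math


def softmax_loss(logits, label):
    """Negative log of the softmax probability assigned to the true class."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def joint_loss(features, logits_batch, labels, centers, lam=0.1):
    """L = L_S + lam * L_c, summed over the samples of one batch."""
    l_s = sum(softmax_loss(z, y) for z, y in zip(logits_batch, labels))
    l_c = sum(center_loss(x, centers[y]) for x, y in zip(features, labels))
    return l_s + lam * l_c
```

With `lam=0.1`, matching the value of λ stated in the text, the center term acts as a mild pull of each feature toward its class center while the softmax term still dominates the classification.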
Advantageous effects
Compared with the prior art, the face detection method based on a convolutional neural network achieves higher detection accuracy for faces that are occluded, at different angles, or in profile, and for small and blurry faces in the picture; the network structure is simple, there are fewer iteration parameters, and the training time is shorter.
Drawings
FIG. 1 is a flowchart of the face detection method based on a convolutional neural network;
FIG. 2 shows the connection structure of the convolutional neural network used in the face detection method provided by the present invention, which comprises four convolution layers, four ReLU activation function layers, four max pooling layers, and two fully connected layers, the last fully connected layer being a softmax classification layer.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings:
FIG. 1 is a flowchart of the face detection method based on a convolutional neural network.
The face detection method based on a convolutional neural network comprises the following steps:
Step 1 (110), establishing a database to obtain image data, preprocessing the image data, and constructing a convolutional neural network;
In this step, a database is established to obtain image data. The pictures in the database meet the following requirements: each picture contains at least one face; there is no requirement on the position of the face, and faces that are off-center and far away are preferred; the backgrounds are complex and diverse, covering various indoor and outdoor scenes; the location of each face is marked with a rectangular box, and 68 key points covering the eyebrows, eyes, nose, mouth, and face contour are marked. There is no requirement on image sharpness. The database contains 6000 annotated images containing faces.
Also in this step, the images in the database are preprocessed. A spatial pyramid pooling operation is first applied to the images, so that several images of different resolutions and scales are obtained from each image and a fixed-size feature vector can be extracted from the multi-scale features. All pictures generated in this way are then randomly mirrored, including up-down and left-right mirroring. Of the processed images, 4/5 are used as the training database and 1/5 as the test database.
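The random mirroring and the 4/5 to 1/5 split described above can be sketched as follows. Images are represented as nested lists of pixel values; the 0.5 mirroring probability, the shuffling before the split, and all function names are assumptions, since the text does not state them.

```python
import random


def mirror_left_right(image):
    """Flip an image (a list of pixel rows) horizontally."""
    return [row[::-1] for row in image]


def mirror_up_down(image):
    """Flip an image vertically."""
    return image[::-1]


def random_mirror(image, rng):
    """Apply each mirror independently with probability 0.5 (an assumed policy)."""
    if rng.random() < 0.5:
        image = mirror_left_right(image)
    if rng.random() < 0.5:
        image = mirror_up_down(image)
    return image


def split_train_test(samples, rng):
    """Shuffle a copy of the samples, then use 4/5 for training and 1/5 for testing."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]
```

For the 6000-image database this split yields 4800 training and 1200 test images. In a real pipeline the rectangle and key-point annotations would have to be mirrored together with the pixels.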
Step 2 (210), performing four iterations of processing on the preprocessed data through an image feature analysis module in the convolutional neural network to generate image feature parameters;
Step 2.1, the convolution layer of the image feature analysis module extracts image features by convolving its weights with the preprocessed data;
Step 2.2, the activation function layer of the image feature analysis module applies the ReLU function to the image features as a nonlinear operation to obtain nonlinear feature map parameters;
Step 2.3, the max pooling layer of the image feature analysis module reduces the nonlinear feature map parameters.
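To see how four repetitions of convolution, activation, and pooling shrink the feature map, the following sketch tracks only the spatial size. The 3×3 kernel, stride 1, zero padding, 2×2 pooling window, and the 96-pixel input are hypothetical values chosen for illustration, not taken from the patent.

```python
def conv_output_size(size, kernel=3, stride=1, padding=0):
    """Side length of a feature map after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, window=2):
    """Side length after non-overlapping max pooling."""
    return size // window


def sizes_after_stages(size, stages=4):
    """Side length after each conv -> ReLU -> pool stage (ReLU keeps the size)."""
    history = [size]
    for _ in range(stages):
        size = pool_output_size(conv_output_size(size))
        history.append(size)
    return history


print(sizes_after_stages(96))  # [96, 47, 22, 10, 4]
```

Under these assumptions each stage roughly halves the side length, which is how the max pooling layers reduce the number of values passed to the fully connected layer.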
A preprocessed test database image is sent into the trained neural network. After features are extracted, the test image passes through the trained weight matrix and the classifier, which output classification and regression results. The classification result is expressed as probabilities: if the probability of face is greater than the probability of non-face, the region is judged to be a face and is marked with a rectangular frame. The regression result marks the 68 key points of the face in the picture and returns their coordinates.
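The test-time decision rule can be sketched as follows. The two-element logit layout (non-face score first, face score second) and the interleaved x, y ordering of the 136 regression outputs are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_face(logits):
    """logits = (non-face score, face score); face wins if its probability is larger."""
    p_non_face, p_face = softmax(logits)
    return p_face > p_non_face


def to_keypoints(flat):
    """Group a flat regression output of 136 numbers into 68 (x, y) pairs."""
    if len(flat) != 136:
        raise ValueError("expected 136 values for 68 key points")
    return [(flat[i], flat[i + 1]) for i in range(0, 136, 2)]
```

Because there are only two classes, comparing the two probabilities is equivalent to checking whether the face probability exceeds 0.5.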
Step 3 (310), processing the image feature parameters through a fully connected layer in the convolutional neural network to generate a one-dimensional image vector.
Step 4, classifying and regressing the one-dimensional image vector through a classification layer in the convolutional neural network to obtain the position coordinates of the face image. The classification and regression performed by the classification layer in step 4 comprises the following steps:
Step 4.1, the weights are iteratively updated by stochastic gradient descent so that the loss function is continuously adjusted, and the hyperparameters used during training are continuously adjusted to obtain the best training result, the hyperparameters comprising: number of iterations, batch size, maximum number of iterations, and learning rate;
Step 4.2, the loss function selected for classification combines the center loss function with the softmax loss function, expressed as:
$$L = L_S + \lambda L_c = -\sum_{i=1}^{m} \log \frac{e^{W_{y_i}^{T} x_i + b_{y_i}}}{\sum_{j=1}^{n} e^{W_j^{T} x_i + b_j}} + \frac{\lambda}{2} \sum_{i=1}^{m} \lVert x_i - c_{y_i} \rVert_2^2$$
where $L_S$ is the softmax loss function, $L_c$ is the center loss function, and $\lambda$ is the coefficient weighting the two, here taken as $\lambda = 0.1$; $Wx + b$ is the output of the fully connected layer, the term inside the log represents the probability that $x_i$ belongs to category $y_i$, and $c$ represents the feature center of the category;
Step 4.3, the loss function adopted for regression is the Euclidean distance loss function, expressed as:
$$L = \lVert \hat{y}_i - y_i \rVert_2^2, \quad y_i \in \mathbb{R}^4$$
where $\hat{y}_i$ is the output predicted by the network and $y_i$ is the annotated ground-truth label, namely the coordinates of the 68 face key points;
Step 4.4, the coordinates of the 68 face key points output under the optimal weights are compared with the labeled face key-point coordinates and faces in the database, and the accuracy of the convolutional neural network for face detection is calculated.
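Steps 4.1 and 4.3 can be illustrated together by a minimal sketch of stochastic gradient descent minimizing a squared Euclidean loss. A single linear layer stands in for the network, and the learning rate, iteration limit, batch size of 1, and seed are hypothetical hyperparameter values rather than those of the patent.

```python
import random


def euclidean_loss(pred, target):
    """Squared Euclidean distance between a prediction and its label."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def predict(weights, bias, x):
    """Linear layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sgd_step(weights, bias, x, target, lr):
    """One in-place gradient descent update computed from a single sample."""
    pred = predict(weights, bias, x)
    for k, (p, t) in enumerate(zip(pred, target)):
        grad_out = 2.0 * (p - t)  # derivative of the loss w.r.t. output k
        for j, v in enumerate(x):
            weights[k][j] -= lr * grad_out * v
        bias[k] -= lr * grad_out
    return euclidean_loss(pred, target)  # loss measured before the update


def train(samples, n_in, n_out, lr=0.05, max_iterations=2000, seed=0):
    """Run SGD with batch size 1 for a fixed maximum number of iterations."""
    rng = random.Random(seed)
    weights = [[0.0] * n_in for _ in range(n_out)]
    bias = [0.0] * n_out
    for _ in range(max_iterations):
        x, target = rng.choice(samples)  # draw one random sample per step
        sgd_step(weights, bias, x, target, lr)
    return weights, bias
```

The learning rate and the maximum number of iterations are exactly the kind of hyperparameters that step 4.1 says are adjusted between training runs.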
The training task of the invention is divided into two parts: classification and regression. Classification treats face detection as a binary face/non-face classification problem; regression is the process by which the trained neural network returns the coordinates of the face frame and of the 68 face key points, thereby achieving face detection. The weights in the network are iteratively updated to reduce the loss function until the optimal weights are obtained; the recognition results output under the optimal weights are then compared with the labeled face key-point coordinates and faces in the database to calculate the accuracy of the convolutional neural network for face detection.
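The text does not say how the comparison with the labels is turned into an accuracy figure. One possible sketch, under the assumption that a test image counts as correctly detected when the mean distance between predicted and labeled key points is below a pixel threshold, is shown here; the threshold value and function names are hypothetical.

```python
import math


def mean_point_error(pred_points, true_points):
    """Average Euclidean distance between corresponding key points."""
    dists = [math.hypot(p[0] - t[0], p[1] - t[1])
             for p, t in zip(pred_points, true_points)]
    return sum(dists) / len(dists)


def detection_accuracy(predictions, labels, threshold=5.0):
    """Fraction of test images whose mean key-point error is below the threshold."""
    hits = sum(1 for p, t in zip(predictions, labels)
               if mean_point_error(p, t) < threshold)
    return hits / len(labels)
```

Other criteria, such as overlap between predicted and labeled face rectangles, would fit the description equally well.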
As shown in FIG. 2, the convolutional neural network used in the face detection method provided by the present invention comprises four convolution layers, four ReLU activation function layers, four max pooling layers, and two fully connected layers, the last fully connected layer being a softmax classification layer. The convolution layer extracts image features by convolving its weights with its input; the activation function layer increases the nonlinear capacity of the network, where the ReLU function is y = max(0, x); the max pooling layer reduces the output size and the number of parameters; the fully connected layer maps the extracted features into a one-dimensional vector; and the classification layer classifies the features extracted by the network into face and non-face and regresses the coordinates of the 68 face key points. The whole training process is as follows: first, the parameters of the convolution layers and fully connected layers are randomly initialized; after the images in the established database are sent into the network, face features are obtained through the four convolution, activation, and pooling stages; a fixed-size feature vector is then obtained through the fully connected layer; and finally the coordinates of the face position are obtained through the classification layer.
During network training, data are propagated forward through the network and the errors computed by the loss function are propagated backward, so that the parameters of the convolution layers and fully connected layers are continuously optimized; through continued training and fine-tuning of the parameters, a good training result is finally obtained.
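The role of the fully connected layer, mapping the pooled feature maps to a one-dimensional vector, can be sketched as follows. The feature-map values, weights, and biases are illustrative placeholders, not trained parameters.

```python
def flatten(feature_maps):
    """Concatenate every channel, row and column into one flat list."""
    return [v for fmap in feature_maps for row in fmap for v in row]


def fully_connected(vector, weights, bias):
    """Each output is the dot product of the input with one weight row, plus a bias."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


# Two hypothetical 2x2 feature maps become a vector of length 8.
maps = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
vector = flatten(maps)
weights = [[1.0] * 8, [0.0] * 7 + [1.0]]  # illustrative weights, not trained values
bias = [0.5, -1.0]
output = fully_connected(vector, weights, bias)
print(output)  # [36.5, 7.0]
```

The output length is fixed by the number of weight rows, which is why the layer yields a vector of fixed size for the classification layer.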
Training is performed on the training database to obtain the optimal parameters. Throughout training, the loss function represents the error between the actual labels and the predictions; training iterates to minimize it, and the optimal parameters are those obtained when the loss function reaches its minimum. The parameters to be trained include the convolution kernels and biases of the convolution layers and the neuron parameters of the fully connected layers. Data are propagated forward, the errors computed by the loss function are propagated backward, and gradient descent searches for the global optimum over the iterations, yielding the optimal parameters. After training, the optimal parameters are substituted into the network, which then has the ability to detect faces.
The accuracy of the neural network for face detection can then be obtained through testing.
The above description is only intended to illustrate the preferred embodiments of the present invention and is not to be construed as limiting it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within its scope of protection.

Claims (2)

1. A face detection method based on a convolutional neural network is characterized by comprising the following steps:
Step 1, establishing a database to obtain image data, preprocessing the image data, and constructing a convolutional neural network;
Step 2, performing four iterations of processing on the preprocessed data through an image feature analysis module in the convolutional neural network to generate image feature parameters;
Step 3, processing the image feature parameters through a fully connected layer in the convolutional neural network to generate a one-dimensional image vector;
Step 4, classifying and regressing the one-dimensional image vector through a classification layer in the convolutional neural network to obtain the position coordinates of the face image; wherein the classification and regression performed on the one-dimensional image vector by the classification layer in step 4 comprises the following steps:
Step 4.1, the weights are iteratively updated by stochastic gradient descent so that the loss function is continuously adjusted, and the hyperparameters used during training are continuously adjusted to obtain the best training result, the hyperparameters comprising: number of iterations, batch size, maximum number of iterations, and learning rate;
Step 4.2, the loss function selected for classification combines the center loss function with the softmax loss function, expressed as:
$$L = L_S + \lambda L_c = -\sum_{i=1}^{m} \log \frac{e^{W_{y_i}^{T} x_i + b_{y_i}}}{\sum_{j=1}^{n} e^{W_j^{T} x_i + b_j}} + \frac{\lambda}{2} \sum_{i=1}^{m} \lVert x_i - c_{y_i} \rVert_2^2$$
where $L_S$ is the softmax loss function, $L_c$ is the center loss function, and $\lambda$ is the coefficient weighting the two, here taken as $\lambda = 0.1$; $Wx + b$ is the output of the fully connected layer, the term inside the log represents the probability that $x_i$ belongs to category $y_i$, and $c$ represents the feature center of the category;
Step 4.3, the loss function adopted for regression is the Euclidean distance loss function, expressed as:
$$L = \lVert \hat{y}_i - y_i \rVert_2^2, \quad y_i \in \mathbb{R}^4$$
where $\hat{y}_i$ is the output predicted by the network and $y_i$ is the annotated ground-truth label, namely the coordinates of the 68 face key points;
Step 4.4, the coordinates of the 68 face key points output under the optimal weights are compared with the labeled face key-point coordinates and faces in the database, and the accuracy of the convolutional neural network for face detection is calculated.
2. The face detection method based on a convolutional neural network as claimed in claim 1, wherein the processing of the preprocessed data by the image feature analysis module in step 2 comprises the following steps:
Step 2.1, the convolution layer of the image feature analysis module extracts image features by convolving its weights with the preprocessed data;
Step 2.2, the activation function layer of the image feature analysis module applies the ReLU function to the image features as a nonlinear operation to obtain nonlinear feature map parameters;
Step 2.3, the max pooling layer of the image feature analysis module reduces the nonlinear feature map parameters.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811572322.8A CN109815814B (en) 2018-12-21 2018-12-21 Face detection method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811572322.8A CN109815814B (en) 2018-12-21 2018-12-21 Face detection method based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN109815814A (en) 2019-05-28
CN109815814B (en) 2023-01-24

Family

ID=66602244

Family Applications (1)

Application Number Priority Date Filing Date Title
CN201811572322.8A Active CN109815814B (en) 2018-12-21 2018-12-21 Face detection method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN109815814B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222764B (en) * 2019-06-10 2021-06-18 中南民族大学 Method, system, device and storage medium for detecting occluded target
CN111177469A (en) * 2019-12-20 2020-05-19 国久大数据有限公司 Face retrieval method and face retrieval device
CN111523452B (en) * 2020-04-22 2023-08-25 北京百度网讯科技有限公司 Method and device for detecting human body position in image
CN111612785B (en) * 2020-06-03 2024-02-02 浙江大华技术股份有限公司 Face picture quality assessment method, device and storage medium
CN112084551A (en) * 2020-07-03 2020-12-15 邱宇 Building facade identification and generation method based on confrontation generation network
CN112052772A (en) * 2020-08-31 2020-12-08 福建捷宇电脑科技有限公司 Face shielding detection algorithm
CN112733589B (en) * 2020-10-29 2023-01-03 广西科技大学 Infrared image pedestrian detection method based on deep learning

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106599830A (en) * 2016-12-09 2017-04-26 中国科学院自动化研究所 Method and apparatus for positioning face key points
CN106874883A (en) * 2017-02-27 2017-06-20 中国石油大学(华东) A kind of real-time face detection method and system based on deep learning
CN107292267A (en) * 2017-06-21 2017-10-24 北京市威富安防科技有限公司 Photo fraud convolutional neural networks training method and human face in-vivo detection method
CN107358223A (en) * 2017-08-16 2017-11-17 上海荷福人工智能科技(集团)有限公司 A kind of Face datection and face alignment method based on yolo
CN107729819A (en) * 2017-09-22 2018-02-23 华中科技大学 A kind of face mask method based on sparse full convolutional neural networks
CN107871134A (en) * 2016-09-23 2018-04-03 北京眼神科技有限公司 A kind of method for detecting human face and device
CN107871106A (en) * 2016-09-26 2018-04-03 北京眼神科技有限公司 Face detection method and device
CN108073917A (en) * 2018-01-24 2018-05-25 燕山大学 A kind of face identification method based on convolutional neural networks
CN108427921A (en) * 2018-02-28 2018-08-21 辽宁科技大学 A kind of face identification method based on convolutional neural networks

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9141196B2 (en) * 2012-04-16 2015-09-22 Qualcomm Incorporated Robust and efficient learning object tracker
WO2017070858A1 (en) * 2015-10-28 2017-05-04 Beijing Sensetime Technology Development Co., Ltd A method and a system for face recognition
US10032067B2 (en) * 2016-05-28 2018-07-24 Samsung Electronics Co., Ltd. System and method for a unified architecture multi-task deep learning machine for object recognition
KR20180057096A (en) * 2016-11-21 2018-05-30 삼성전자주식회사 Device and method to perform recognizing and training face expression
CN107808129B (en) * 2017-10-17 2021-04-16 南京理工大学 Face multi-feature point positioning method based on single convolutional neural network
CN108304788B (en) * 2018-01-18 2022-06-14 陕西炬云信息科技有限公司 Face recognition method based on deep neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
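Face recognition combining feature matching fusion with an improved convolutional neural network; 李佳妮 (Li Jiani), 张宝华 (Zhang Baohua); Laser & Optoelectronics Progress (《激光与光电子学进展》); 2018-05-30; full text *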

Also Published As

Publication number Publication date
CN109815814A (en) 2019-05-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant