CN113378675A - Face recognition method for simultaneous detection and feature extraction - Google Patents

Face recognition method for simultaneous detection and feature extraction

Info

Publication number
CN113378675A
Authority
CN
China
Prior art keywords: face, feature extraction, detection, branch, face detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110603538.1A
Other languages
Chinese (zh)
Inventor
茅耀斌
沈庆强
项文波
陈婷
吴敏杰
张伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology filed Critical Nanjing University of Science and Technology
Priority to CN202110603538.1A
Publication of CN113378675A
Legal status: Withdrawn

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods


Abstract

The invention discloses a face recognition method that performs detection and feature extraction simultaneously. The method comprises: constructing a face detection data set and a face feature extraction data set to form a multi-task face detection and recognition data set; constructing a backbone network, a face detection branch and a face feature extraction branch, and training a face detection and recognition model based on a deep neural network, wherein the backbone network extracts deep features from the image and provides regression and classification information to the subsequent detection and feature extraction branches, the face detection branch estimates a heat map, the target center offset and the bounding box size, and the face feature extraction branch extracts the features of each face to generate a feature vector; and inputting the image to be recognized into the trained face detection and recognition model, completing face detection and feature extraction, and thereby determining the person's identity information. The invention improves the speed of face detection and recognition and reduces the dependence of the feature extraction stage on the performance of the face detection stage.

Description

Face recognition method for simultaneous detection and feature extraction
Technical Field
The invention relates to the fields of image processing and deep learning, and in particular to a face recognition method and system that perform detection and feature extraction simultaneously.
Background
With the rapid development of the internet and information technology, security requirements in many areas of daily life and production are steadily increasing, and identity verification must be performed accurately and quickly. Compared with other verification modalities, face recognition is natural, intuitive and contactless, and better matches human cognitive habits. Because of these advantages, face recognition technology has been widely applied in many areas of production and daily life, such as access control systems and face-scan payment.
Most face recognition pipelines at the present stage follow a "detect first, then extract features" paradigm. For example, document [1] constructs image-partition features in a face space, performs recognition and detection, switches and selects local features, gathers statistics on them, and carries out targeted computation and comparison on the data. Document [2] applies a trained face detection CNN to the current frame, places on the current frame a face box with the same size and position as the face box of the previous frame, and enlarges it by a certain factor to obtain the face region.
[1] Yuan Peijiang, Song Bo, Shi Zhen, Li Jianmin. A model switching algorithm based on face recognition [P]. Beijing: CN111860454A, 2020-10-30.
[2] Zhou Jun, Wang Yang. Surveillance video image face detection and tracking method, device, medium and equipment [P]. Beijing: CN112825116A, 2021-05-21.
Disclosure of Invention
The invention aims to provide a face recognition method and a face recognition system for simultaneous detection and feature extraction.
The technical solution for realizing the purpose of the invention is as follows: a face recognition method for simultaneous detection and feature extraction comprises the following steps:
step 1, data modeling and preparation
Constructing a face detection data set and a face feature extraction data set to form a multitask face detection identification data set;
step 2, deep neural network model training
Constructing a backbone network, a face detection branch and a face feature extraction branch, and training a face detection and recognition model based on a deep neural network, wherein the backbone network extracts deep features from the image and provides regression and classification information to the subsequent detection and feature extraction branches, the face detection branch estimates a heat map, the target center offset and the bounding box size, and the face feature extraction branch extracts the features of each face to generate a feature vector;
step 3, model inference application
Inputting the image to be recognized into the trained face detection and recognition model, completing face detection and feature extraction, and thereby determining the person's identity information.
Further, in step 1, data modeling and preparation are specifically performed by:
step 1.1, constructing a face detection data set
Marking a face area in the image in a rectangular frame form, and recording the position of the central point and the width and height of the rectangular frame;
step 1.2, constructing a human face feature extraction data set
An identity identifier field is included in the label; the same identity always uses the same identifier, and different identities use different identifiers;
step 1.3, constructing a multitask face detection recognition data set
Integrating the constructed face detection data set and the face feature extraction data set, so that the label content comprises an identity identifier field, the coordinates of the rectangular box center, and the width and height of the rectangular box; the label file is given the same name as the original image.
Further, in step 2, deep neural network model training is specifically performed by:
(1) backbone network
The backbone network can adopt ResNet with transposed convolutions, DLA (Deep Layer Aggregation), Hourglass, MobileNetV2 or a high-resolution network (HRNet);
(2) face detection branch
The detection branch adopts an anchor-free design, realized as three parallel heads behind the backbone network: a heat map head, a center point offset head and a bounding box head, wherein the heat map head comprises an input layer, a dynamic convolution layer, a first fully connected layer, a second fully connected layer and an output layer;
(a) Heat map head

The detection branch's heat map head estimates the positions of object centers from a heat-map representation. For each object in the image, its bounding box $b^i = (x_1^i, y_1^i, x_2^i, y_2^i)$ gives the center point $(c_x^i, c_y^i)$, where $c_x^i = \frac{x_1^i + x_2^i}{2}$ and $c_y^i = \frac{y_1^i + y_2^i}{2}$; dividing by the stride gives its position on the feature map, $(\tilde{c}_x^i, \tilde{c}_y^i) = \left(\left\lfloor \frac{c_x^i}{4} \right\rfloor, \left\lfloor \frac{c_y^i}{4} \right\rfloor\right)$. The feature-map response $M_{xy}$ at image location $(x, y)$ can be expressed as:

$$M_{xy} = \sum_{i=1}^{N} \exp\left(-\frac{(x - \tilde{c}_x^i)^2 + (y - \tilde{c}_y^i)^2}{2\sigma_c^2}\right) \tag{1}$$

where $N$ represents the number of face boxes in the image and $\sigma_c$ represents the standard deviation. The heat map head loss function $L_{heat}$ adopts the focal loss, as shown in equation (2):

$$L_{heat} = -\frac{1}{N} \sum_{xy} \begin{cases} (1 - \hat{M}_{xy})^{\alpha} \log \hat{M}_{xy}, & M_{xy} = 1 \\ (1 - M_{xy})^{\beta} \hat{M}_{xy}^{\alpha} \log(1 - \hat{M}_{xy}), & \text{otherwise} \end{cases} \tag{2}$$

where $\hat{M}$ represents the heat map estimated by the model, and $\alpha$ and $\beta$ represent the preset focal-loss hyper-parameters;
(b) Center point offset head

The center point offset head aims to localize the face position more accurately; precise alignment between the feature extraction branch and the object centers is crucial to performance. Let the output center point offset be $O \in \mathbb{R}^{W \times H \times 2}$; for each bounding box $b^i$, the offset of its center point is $o^i = \left(\frac{c_x^i}{4} - \left\lfloor \frac{c_x^i}{4} \right\rfloor,\ \frac{c_y^i}{4} - \left\lfloor \frac{c_y^i}{4} \right\rfloor\right)$. The center point offset head loss function $L_{center}$ adopts the $\ell_1$ norm, as shown in equation (3):

$$L_{center} = \sum_{i=1}^{N} \left| o^i - \hat{o}^i \right| \tag{3}$$

where $\hat{o}^i$ represents the center point offset estimated by the model;
(c) Bounding box head

The bounding box head estimates the height and width of the face bounding box at each anchor position. It is not directly related to the feature extraction branch, but its localization accuracy affects the evaluation of face detection performance. The output box size is $S \in \mathbb{R}^{W \times H \times 2}$; for each bounding box $b^i$, its size is $s^i = (x_2^i - x_1^i,\ y_2^i - y_1^i)$. The bounding box head loss function $L_{box}$ adopts the $\ell_1$ norm, as shown in equation (4):

$$L_{box} = \sum_{i=1}^{N} \left| s^i - \hat{s}^i \right| \tag{4}$$

where $\hat{s}^i$ represents the bounding box size estimated by the model;
(3) Face feature extraction branch

The face feature extraction branch learns the feature extraction task through a classification task: all objects in the training set with the same identity label are treated as one class. For each bounding box $b^i$ in the image, its center $(\tilde{c}_x^i, \tilde{c}_y^i)$ on the heat map is obtained, and a discriminative feature vector $E_{\tilde{c}_x^i, \tilde{c}_y^i}$ is extracted at that position. The loss function $L_{id}$ of the face feature extraction branch is:

$$L_{id} = -\sum_{i=1}^{N} \sum_{k=1}^{K} L^i(k) \log\big(p(k)\big) \tag{5}$$

where $K$ is the number of identity classes, $p(k)$ is the predicted class distribution obtained from the feature vector, and $L^i(k)$ is the one-hot ground-truth identity label;
Combining the heat map head loss, the center point offset head loss and the bounding box head loss of the face detection branch with the face feature extraction branch loss, the detection and feature extraction tasks are balanced by an uncertainty loss. The detection branch loss and the overall network loss are expressed as (6) and (7), respectively:

$$L_{det} = L_{heat} + L_{center} + L_{box} \tag{6}$$

$$L = \frac{1}{2}\left(\frac{1}{e^{\omega_1}} L_{det} + \frac{1}{e^{\omega_2}} L_{id} + \omega_1 + \omega_2\right) \tag{7}$$

where $\omega_1$ and $\omega_2$ are learnable parameters that balance the detection and feature extraction tasks.
Further, in step 2, the deep neural network model training adopts a crop-and-mix augmentation strategy, a multi-scale image strategy, random left-right flipping and random rotation.
Further, the specific method of step 3, model inference application, is as follows:
step 3.1, inputting the image to be recognized into the trained face recognition model for face detection and feature extraction;
step 3.2, post-processing the face detection result of step 3.1: setting a confidence threshold for the face bounding boxes, screening out invalid candidate boxes, and applying non-maximum suppression to filter overlapping bounding boxes;
step 3.3, post-processing the face feature extraction result of step 3.1: comparing it with the feature data stored in a database (e.g., by Euclidean or cosine distance) to obtain the identity information corresponding to the features;
step 3.4, matching the face detection results of step 3.2 with the face recognition results of step 3.3 to obtain the person's identity information.
A face recognition system for simultaneous detection and feature extraction, which performs face recognition, with detection and feature extraction carried out simultaneously, based on the above face recognition method.
A computer device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, face recognition with simultaneous detection and feature extraction is performed based on the above face recognition method.
A computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, face recognition with simultaneous detection and feature extraction is performed based on the above face recognition method.
Compared with the prior art, the invention has the following remarkable advantages:
(1) The invention abandons the two-stage "detect, then extract features" paradigm of existing face detection and recognition methods and instead integrates the face detection task and the feature extraction task into one deep network, so that both tasks are completed in a single forward pass; this improves recognition speed and accuracy while reducing network complexity;
(2) The method removes the fixed input size required by previous face feature extraction stages: multi-scale training is added to the network training process, and the fully convolutional model accepts inputs of any size, reducing the influence of image scaling on the feature extraction stage;
(3) The invention uses a deep network to extract face features while detecting faces, avoiding the face alignment step that previous methods require before feature extraction; this gives the method a degree of generality and improves fault tolerance.
(4) Since data sets labeled for both face detection and feature extraction are currently scarce, the face recognition training set CASIA-WebFace is adapted to produce a data set for the dual tasks of face detection and feature extraction.
Drawings
Fig. 1 is a flow chart of a face intelligent recognition method for simultaneously performing detection and feature extraction.
FIG. 2 is a diagram of a neural network architecture.
Fig. 3 is a schematic diagram of a residual backbone network.
Fig. 4 is a schematic diagram of a labeling format of a multitask face detection recognition data set.
FIG. 5 is a schematic diagram of the model's one-pass detection results, where (a) shows results on the LFW public data set and (b) shows results on the CASIA-FaceV5 public data set.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
With reference to fig. 1, the face recognition method with simultaneous detection and feature extraction uses a single deep neural network to detect and localize multiple faces in a single image and output their feature vectors in one forward pass, and then performs face recognition.
Step 1, data modeling and preparation
The method comprises the following steps of constructing a face detection data set and a face feature extraction data set to form a multitask face detection identification data set, and specifically comprises the following steps:
step 1.1, collecting face images: the collected images include, but are not limited to, face images from various viewing angles and with various expressions;
step 1.2, labeling a face detection data set: marking a face area in the image in a rectangular frame form, and recording the position of the central point and the width and height of the rectangular frame;
step 1.3, labeling a face recognition data set: an identity identifier field is included in the label; the same identity always uses the same identifier, and different identities use different identifiers;
step 1.4, labeling a face detection and recognition data set: the data sets labeled in steps 1.2 and 1.3 are integrated; the labeling format is shown in fig. 4, where one image corresponds to one text file and the annotation of each face in the image is recorded in the text file line by line;
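For illustration, a minimal sketch of reading such an annotation file; the whitespace-separated field order below (identity, center x, center y, width, height) is a hypothetical layout, since fig. 4 itself is not reproduced here:

```python
# Hypothetical label layout (field order assumed, not taken from fig. 4):
#   identity_id  center_x  center_y  width  height     (one line per face)
# e.g. "000045.txt", paired with image "000045.jpg", might contain:
#   1023 0.48 0.52 0.21 0.30
#   877  0.12 0.40 0.18 0.25

def parse_annotation(path):
    """Parse one label file into a list of (identity, cx, cy, w, h) tuples."""
    faces = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) != 5:
                continue  # skip blank or malformed lines
            faces.append((int(fields[0]), *map(float, fields[1:])))
    return faces
```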
step 1.5, according to a preset ratio $n_1 : n_2 : n_3$, randomly sampling the data set to construct a training set, a validation set and a test set, where $n_1 + n_2 + n_3 = 1$;
Exemplarily, $n_1 : n_2 : n_3 = 8 : 1 : 1$.
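A minimal sketch of the random split under the 8:1:1 example ratio; the function name and seed handling are illustrative:

```python
import random

def split_dataset(samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Randomly split samples into train/val/test by a preset ratio n1:n2:n3."""
    assert abs(sum(ratios) - 1.0) < 1e-9
    samples = list(samples)
    random.Random(seed).shuffle(samples)  # fixed seed for a reproducible split
    n = len(samples)
    n_train, n_val = int(n * ratios[0]), int(n * ratios[1])
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])
```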
Step 2, deep neural network model training
Constructing a main network for extracting features, constructing a face detection branch, constructing a face feature extraction branch, and training a face detection recognition model, which specifically comprises the following steps:
step 2.1, constructing the backbone network, specifically: using the residual module of ResNet as the basic feature extraction module, combined with dynamic convolution modules and transposed convolution modules, to build the feature extraction backbone.
Specifically, taking the residual backbone of fig. 3 as an example, the face detection and recognition backbone comprises: residual network modules whose output feature maps are 1/4, 1/8, 1/16 and 1/32 of the input size, followed by dynamic convolution + transposed convolution modules whose output feature maps return to 1/16, 1/8 and, finally, 1/4 of the input size.
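The patent does not spell out the exact layer configuration, so the following PyTorch sketch only mirrors the described resolution schedule (1/4 → 1/8 → 1/16 → 1/32 down, then back up to 1/4); a plain 3×3 convolution stands in for the dynamic convolution module, whose internals are not given, and all channel widths are assumptions:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic ResNet-style residual block; stride 2 halves the feature map."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, 1, 1, bias=False),
            nn.BatchNorm2d(out_ch))
        self.skip = (nn.Identity() if stride == 1 and in_ch == out_ch
                     else nn.Conv2d(in_ch, out_ch, 1, stride, bias=False))
    def forward(self, x):
        return torch.relu(self.body(x) + self.skip(x))

class UpBlock(nn.Module):
    """Stand-in for the 'dynamic convolution + transposed convolution' module:
    a plain 3x3 conv followed by a stride-2 transposed conv (doubles H and W)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, 1, 1, bias=False),  # dynamic-conv stand-in
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(out_ch, out_ch, 4, 2, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
    def forward(self, x):
        return self.block(x)

class Backbone(nn.Module):
    """Downsample to 1/32 with residual blocks, then upsample back to 1/4."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(                       # -> 1/4
            nn.Conv2d(3, 64, 7, 2, 3, bias=False), nn.BatchNorm2d(64),
            nn.ReLU(inplace=True), nn.MaxPool2d(3, 2, 1))
        self.down = nn.Sequential(
            ResidualBlock(64, 64),                       # 1/4
            ResidualBlock(64, 128, stride=2),            # 1/8
            ResidualBlock(128, 256, stride=2),           # 1/16
            ResidualBlock(256, 512, stride=2))           # 1/32
        self.up = nn.Sequential(
            UpBlock(512, 256),                           # 1/16
            UpBlock(256, 128),                           # 1/8
            UpBlock(128, 64))                            # 1/4
    def forward(self, x):
        return self.up(self.down(self.stem(x)))          # B x 64 x H/4 x W/4
```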
step 2.2, constructing the detection branch: face detection is treated as a center-based bounding box regression task on a high-resolution image. Three parallel regression heads are attached behind the backbone network to estimate the heat map, the face center offset and the bounding box size, respectively. Each regression head is realized by applying a 3 × 3 convolution to the output feature map of the backbone network, followed by a 1 × 1 convolution layer that produces the final target.
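A sketch of the three parallel regression heads as described (a 3 × 3 convolution on the backbone output followed by a 1 × 1 convolution); the 64 input channels and 256 intermediate channels are assumptions:

```python
import torch.nn as nn

def make_head(in_ch, out_ch, mid_ch=256):
    """One regression head: 3x3 conv on the backbone feature map,
    then a 1x1 conv producing the final target."""
    return nn.Sequential(
        nn.Conv2d(in_ch, mid_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(mid_ch, out_ch, 1))

# Three parallel heads attached behind the backbone:
heatmap_head = make_head(64, 1)   # 1 x H x W heat map
offset_head  = make_head(64, 2)   # center offset, O in R^{W x H x 2}
size_head    = make_head(64, 2)   # box size,      S in R^{W x H x 2}
```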
(1) Heat map head

Here the positions of object centers are estimated with a heat-map representation of size 1 × H × W, in which the response decays exponentially with the distance from the object center. For each object in the image, its bounding box $b^i = (x_1^i, y_1^i, x_2^i, y_2^i)$ gives the center point $(c_x^i, c_y^i)$, with $c_x^i = \frac{x_1^i + x_2^i}{2}$ and $c_y^i = \frac{y_1^i + y_2^i}{2}$; dividing by the stride gives its position on the feature map, $(\tilde{c}_x^i, \tilde{c}_y^i) = \left(\left\lfloor \frac{c_x^i}{4} \right\rfloor, \left\lfloor \frac{c_y^i}{4} \right\rfloor\right)$. The feature-map response at image location $(x, y)$ may be expressed as:

$$M_{xy} = \sum_{i=1}^{N} \exp\left(-\frac{(x - \tilde{c}_x^i)^2 + (y - \tilde{c}_y^i)^2}{2\sigma_c^2}\right) \tag{1}$$

where $N$ represents the number of face boxes in the image and $\sigma_c$ represents the standard deviation. The loss adopts the focal loss, as shown in equation (2):

$$L_{heat} = -\frac{1}{N} \sum_{xy} \begin{cases} (1 - \hat{M}_{xy})^{\alpha} \log \hat{M}_{xy}, & M_{xy} = 1 \\ (1 - M_{xy})^{\beta} \hat{M}_{xy}^{\alpha} \log(1 - \hat{M}_{xy}), & \text{otherwise} \end{cases} \tag{2}$$

where $\hat{M}$ is the heat map predicted by the model, and $\alpha$ and $\beta$ are the preset focal-loss hyper-parameters.
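A sketch of equations (1) and (2) for a single image; α = 2 and β = 4 are assumed CenterNet-style defaults, since the patent leaves the hyper-parameters preset but unspecified:

```python
import torch

def heatmap_target(centers, H, W, sigma):
    """Render the ground-truth heat map of eq. (1): a Gaussian bump of std
    sigma at each face center (feature-map coordinates), summed over faces
    and clamped to 1 so the positives of eq. (2) are well-defined."""
    ys = torch.arange(H).view(H, 1).float()
    xs = torch.arange(W).view(1, W).float()
    M = torch.zeros(H, W)
    for cx, cy in centers:
        M += torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    return M.clamp(max=1.0)

def focal_loss(M_hat, M, alpha=2.0, beta=4.0, eps=1e-6):
    """Penalty-reduced focal loss of eq. (2); alpha/beta values assumed."""
    pos = M.eq(1.0).float()
    neg = 1.0 - pos
    n = pos.sum().clamp(min=1.0)
    loss_pos = pos * (1 - M_hat) ** alpha * torch.log(M_hat + eps)
    loss_neg = neg * (1 - M) ** beta * M_hat ** alpha * torch.log(1 - M_hat + eps)
    return -(loss_pos + loss_neg).sum() / n
```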
(2) Center point offset head

The center point offset head is responsible for localizing objects more accurately. Suppose the output face center displacement is $O \in \mathbb{R}^{W \times H \times 2}$; for each bounding box $b^i$, its offset can be computed as $o^i = \left(\frac{c_x^i}{4} - \left\lfloor \frac{c_x^i}{4} \right\rfloor,\ \frac{c_y^i}{4} - \left\lfloor \frac{c_y^i}{4} \right\rfloor\right)$. The loss adopts the $\ell_1$ norm, as shown in equation (3):

$$L_{center} = \sum_{i=1}^{N} \left| o^i - \hat{o}^i \right| \tag{3}$$
(3) Bounding box head

The bounding box head is responsible for estimating the height and width of the face bounding box at each anchor position. It has no direct relation to the feature extraction branch, but its localization accuracy affects the evaluation of face detection performance. Suppose the output box size is $S \in \mathbb{R}^{W \times H \times 2}$; for each bounding box $b^i$, its size can be computed as $s^i = (x_2^i - x_1^i,\ y_2^i - y_1^i)$. The loss adopts the $\ell_1$ norm, as shown in equation (4):

$$L_{box} = \sum_{i=1}^{N} \left| s^i - \hat{s}^i \right| \tag{4}$$
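A common sketch of the ℓ1 losses (3) and (4) for a single image: both heads output a 2-channel map that is supervised only at the ground-truth centers; names are illustrative:

```python
import torch
import torch.nn.functional as F

def l1_head_loss(pred_map, targets, centers):
    """l1 loss of eqs. (3)/(4) for one image: pred_map is the 2 x H x W output
    of the offset or size head, sampled only at the ground-truth centers."""
    loss = pred_map.new_zeros(())
    for (cx, cy), t in zip(centers, targets):
        pred = pred_map[:, cy, cx]              # prediction at the object center
        loss = loss + F.l1_loss(pred, t, reduction="sum")
    return loss

# Offset targets per eq. (3): o_i = (c/4 - floor(c/4));
# size targets per eq. (4):   s_i = (x2 - x1, y2 - y1).
```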
step 2.3, constructing the feature extraction branch, whose goal is to generate features that can distinguish different faces. Ideally, the distance between features of different faces should be greater than the distance between features of the same face. To achieve this, the invention applies a convolutional layer with multiple kernels on top of the backbone features to extract an identity embedding for each location.
The feature extraction branch learns the feature extraction task through a classification task. All objects in the training set with the same identity are treated as one class. For each bounding box $b^i$ in the image, its center $(\tilde{c}_x^i, \tilde{c}_y^i)$ on the heat map is obtained, and a discriminative feature vector $E_{\tilde{c}_x^i, \tilde{c}_y^i}$ is extracted at that position. The loss function of the feature extraction branch is:

$$L_{id} = -\sum_{i=1}^{N} \sum_{k=1}^{K} L^i(k) \log\big(p(k)\big) \tag{5}$$

where $K$ is the number of identity classes, $p(k)$ is the predicted class distribution obtained from the feature vector, and $L^i(k)$ is the one-hot ground-truth identity label.
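A sketch of the feature extraction branch under eq. (5): a convolutional layer produces a per-location embedding, and a K-way identity classifier supplies the training signal; the embedding width of 128 and the identity count are placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IdentityHead(nn.Module):
    """Identity-embedding branch: a conv layer over the backbone features gives
    a feature vector per location; a classifier over K identities drives eq. (5)."""
    def __init__(self, in_ch=64, emb_dim=128, num_ids=10000):
        super().__init__()
        self.embed = nn.Conv2d(in_ch, emb_dim, 3, padding=1)
        self.classifier = nn.Linear(emb_dim, num_ids)

    def forward(self, feat, centers, id_labels):
        emb_map = self.embed(feat)                   # B x D x H x W
        loss = feat.new_zeros(())                    # summed as in eq. (5)
        for (cx, cy), label in zip(centers, id_labels):
            vec = emb_map[0, :, cy, cx]              # feature at the object center
            logits = self.classifier(vec)
            loss = loss + F.cross_entropy(logits.unsqueeze(0),
                                          torch.tensor([label]))
        return loss
```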
step 2.4, training of face recognition model
Step 2.4.1, preprocessing the training data
Crop-and-mix augmentation strategy:

The original image is randomly cropped, or two images are blended. The cropping technique (commonly known as Cutout) randomly cuts a region out of the image and fills it with zeros, leaving the data label unchanged. The mixing technique (Mixup) blends two random images from the training set in a given proportion and weights their labels by the same mixing ratio. The crop-and-mix technique (CutMix) cuts a region out of the image but, instead of zero-filling it, fills it with the pixel values of the corresponding region of another training image. The processed image is used as the network input.
Multi-scale image training strategy: images are enlarged and reduced to obtain multi-scale inputs, ensuring the multi-scale invariance of the network model.
Finally, random left-right flipping and random rotation are applied to increase sample diversity.
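A sketch of these augmentations under stated assumptions; the scale set and region bounds are illustrative, and the corresponding transformation of box and center labels is omitted:

```python
import random
import torch

def cutmix(img_a, img_b):
    """Crop-and-mix: cut a random region out of img_a and fill it with the
    corresponding pixels of img_b (rather than zeros). Returns the mixed
    image and the area fraction taken from img_b, used to weight the labels."""
    _, H, W = img_a.shape
    w, h = random.randint(W // 8, W // 2), random.randint(H // 8, H // 2)
    x, y = random.randint(0, W - w), random.randint(0, H - h)
    out = img_a.clone()
    out[:, y:y + h, x:x + w] = img_b[:, y:y + h, x:x + w]
    return out, (w * h) / (W * H)

def random_scale(img, scales=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Multi-scale training: resize the input by a randomly chosen factor."""
    s = random.choice(scales)
    return torch.nn.functional.interpolate(
        img.unsqueeze(0), scale_factor=s, mode="bilinear",
        align_corners=False).squeeze(0)

def random_flip(img, p=0.5):
    """Random left-right flip (box/center labels must be flipped to match)."""
    return torch.flip(img, dims=[-1]) if random.random() < p else img
```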
step 2.4.2, deep neural network model training is carried out by using the processed data set
The training process mainly consists of setting the loss functions: the detection branch heat map head loss, the detection branch bounding box and center offset losses, and the feature extraction branch loss are combined, and the detection and feature extraction tasks are balanced by an uncertainty loss. The detection branch loss and the overall network loss are expressed as (6) and (7), respectively:

$$L_{det} = L_{heat} + L_{center} + L_{box} \tag{6}$$

$$L = \frac{1}{2}\left(\frac{1}{e^{\omega_1}} L_{det} + \frac{1}{e^{\omega_2}} L_{id} + \omega_1 + \omega_2\right) \tag{7}$$

where $\omega_1$ and $\omega_2$ are learnable parameters that balance the detection and feature extraction tasks.
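A sketch of the overall loss in the reconstructed form of equation (7), with ω1 and ω2 as learnable parameters; the zero initialization is an assumption:

```python
import torch
import torch.nn as nn

class UncertaintyLoss(nn.Module):
    """Overall loss of eq. (7): w1, w2 are learnable parameters that balance
    the detection and feature extraction tasks via uncertainty weighting."""
    def __init__(self):
        super().__init__()
        self.w1 = nn.Parameter(torch.zeros(()))  # balances the detection task
        self.w2 = nn.Parameter(torch.zeros(()))  # balances the identity task

    def forward(self, l_det, l_id):
        return 0.5 * (torch.exp(-self.w1) * l_det +
                      torch.exp(-self.w2) * l_id +
                      self.w1 + self.w2)
```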
The data set is divided into a training set, a validation set and a test set. With the overall loss of equation (7) as the objective function, the weight of each task is set, an optimization method such as Adam or SGD is selected, and the number of training epochs, the initial learning rate and the decay rate are set. Training ends when the training error reaches the expected value, yielding the parameters of the convolutional neural network model.
Step 3, model inference application
Inputting the image to be recognized into the trained face detection and recognition model, completing face detection and feature extraction, and thereby determining the person's identity information, specifically as follows:
step 3.1, inputting the image to be recognized into the trained face recognition model for face detection and feature extraction;
step 3.2, post-processing the face detection result of step 3.1: setting a confidence threshold for the face bounding boxes, screening out invalid candidate boxes, and applying non-maximum suppression to filter overlapping bounding boxes; taking LFW and CASIA-FaceV5 as examples, the detection results are shown in FIG. 5.
step 3.3, post-processing the face feature extraction result of step 3.1: comparing it with the feature data stored in a database to obtain the identity information corresponding to the features;
step 3.4, matching the face detection results of step 3.2 with the face recognition results of step 3.3 to obtain the person's identity information.
The invention also provides an intelligent face recognition system that performs detection and feature extraction simultaneously.
A computer device comprises a memory, a processor and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, intelligent face recognition with simultaneous detection and feature extraction is performed based on the above face recognition method.
A computer-readable storage medium stores a computer program which, when executed by a processor, performs face recognition with simultaneous detection and feature extraction based on the above intelligent face recognition method.
In summary, unlike the previous "detect, then extract features" paradigm, the invention directly uses a single deep neural network to detect multiple faces in an image and extract their features in one forward pass. The working principle is as follows: a deep neural network is constructed as a backbone network followed by a detection branch and a feature extraction branch; the strong feature extraction capability of the backbone is used to extract image features; the detection branch behind the backbone detects the faces in the image; the feature extraction branch behind the backbone synchronously extracts features of the detected faces; and the features are compared against the results stored in a database to compute the face attributes. The invention has the following characteristics: 1. face detection and feature extraction are integrated into one deep neural network, improving the speed of face detection and recognition; 2. the feature extraction stage no longer depends on the output of the face detection stage, reducing its dependence on the detection stage's performance; 3. the feature extraction stage does not require a fixed face image size, avoiding the loss of feature accuracy caused by image scaling; 4. the feature extraction stage no longer requires face alignment, reducing the number of face recognition steps.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within its scope of protection. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (8)

1. A face recognition method for simultaneous detection and feature extraction is characterized by comprising the following steps:
step 1, data modeling and preparation
Constructing a face detection data set and a face feature extraction data set to form a multitask face detection identification data set;
step 2, deep neural network model training
Constructing a backbone network, a face detection branch and a face feature extraction branch, and training a face detection and recognition model based on a deep neural network, wherein the backbone network extracts deep features from the image and provides regression and classification information to the subsequent detection and feature extraction branches, the face detection branch estimates a heat map, the target center offset and the bounding box size, and the face feature extraction branch extracts the features of each face to generate a feature vector;
step 3, model inference application
Inputting the image to be recognized into the trained face detection and recognition model, completing face detection and feature extraction, and thereby determining the person's identity information.
2. The face recognition method for simultaneous detection and feature extraction according to claim 1, wherein in step 1, data modeling and preparation are specifically performed by:
step 1.1, constructing a face detection data set
Marking a face area in the image in a rectangular frame form, and recording the position of the central point and the width and height of the rectangular frame;
step 1.2, constructing a human face feature extraction data set
An identity identifier field is included in the label; the same identity always uses the same identifier, and different identities use different identifiers;
step 1.3, constructing a multitask face detection recognition data set
Integrating the constructed face detection data set and the face feature extraction data set, so that the label content comprises an identity identifier field, the coordinates of the rectangular box center, and the width and height of the rectangular box; the label file is given the same name as the original image.
3. The method for face recognition with simultaneous detection and feature extraction according to claim 1, wherein in step 2, deep neural network model training is specifically performed by:
(1) backbone network
The backbone network adopts ResNet with transposed convolutions, DLA, Hourglass, MobileNetV2 or a high-resolution network;
(2) face detection branch
The detection branch adopts an anchor-free design, realized as three parallel heads behind the backbone network: a heat map head, a center point offset head and a bounding box head, wherein the heat map head comprises an input layer, a dynamic convolution layer, a first fully connected layer, a second fully connected layer and an output layer;
(a) Heat map head

The detection branch's heat map head estimates the positions of object centers from a heat-map representation: for each object in the image, the bounding box $b^i = (x_1^i, y_1^i, x_2^i, y_2^i)$ gives the center point $(c_x^i, c_y^i)$, where $c_x^i = \frac{x_1^i + x_2^i}{2}$ and $c_y^i = \frac{y_1^i + y_2^i}{2}$; dividing by the stride gives its position on the feature map, $(\tilde{c}_x^i, \tilde{c}_y^i) = \left(\left\lfloor \frac{c_x^i}{4} \right\rfloor, \left\lfloor \frac{c_y^i}{4} \right\rfloor\right)$; the feature-map response $M_{xy}$ at image location $(x, y)$ is expressed as:

$$M_{xy} = \sum_{i=1}^{N} \exp\left(-\frac{(x - \tilde{c}_x^i)^2 + (y - \tilde{c}_y^i)^2}{2\sigma_c^2}\right) \tag{1}$$

where $N$ represents the number of face boxes in the image and $\sigma_c$ represents the standard deviation; the heat map head loss function $L_{heat}$ adopts the focal loss, as shown in equation (2):

$$L_{heat} = -\frac{1}{N} \sum_{xy} \begin{cases} (1 - \hat{M}_{xy})^{\alpha} \log \hat{M}_{xy}, & M_{xy} = 1 \\ (1 - M_{xy})^{\beta} \hat{M}_{xy}^{\alpha} \log(1 - \hat{M}_{xy}), & \text{otherwise} \end{cases} \tag{2}$$

where $\hat{M}$ represents the heat map estimated by the model, and $\alpha$ and $\beta$ represent the preset focal-loss hyper-parameters;
(b) Center point offset head

The center point offset head aims to localize the face position more accurately; let the output center point offset be $O \in \mathbb{R}^{W \times H \times 2}$; for each bounding box $b^i$, the offset of its center point is $o^i = \left(\frac{c_x^i}{4} - \left\lfloor \frac{c_x^i}{4} \right\rfloor,\ \frac{c_y^i}{4} - \left\lfloor \frac{c_y^i}{4} \right\rfloor\right)$; the center point offset head loss function $L_{center}$ adopts the $\ell_1$ norm, as shown in equation (3):

$$L_{center} = \sum_{i=1}^{N} \left| o^i - \hat{o}^i \right| \tag{3}$$

where $\hat{o}^i$ represents the center point offset estimated by the model;
(c) Bounding box head

The bounding box head estimates the height and width of the face bounding box at each anchor position; it is not directly related to the feature extraction branch, but its localization accuracy affects the evaluation of face detection performance; the output box size is $S \in \mathbb{R}^{W \times H \times 2}$; for each bounding box $b^i$, its size is $s^i = (x_2^i - x_1^i,\ y_2^i - y_1^i)$; the bounding box head loss function $L_{box}$ adopts the $\ell_1$ norm, as shown in equation (4):

$$L_{box} = \sum_{i=1}^{N} \left| s^i - \hat{s}^i \right| \tag{4}$$

where $\hat{s}^i$ represents the bounding box size estimated by the model;
(3) Face feature extraction branch

The face feature extraction branch learns the feature extraction task through a classification task, all objects in the training set with the same identity label being treated as one class; for each bounding box $b^i$ in the image, its center $(\tilde{c}_x^i, \tilde{c}_y^i)$ on the heat map is obtained, and a discriminative feature vector $E_{\tilde{c}_x^i, \tilde{c}_y^i}$ is extracted at that position; the loss function $L_{id}$ of the face feature extraction branch is:

$$L_{id} = -\sum_{i=1}^{N} \sum_{k=1}^{K} L^i(k) \log\big(p(k)\big) \tag{5}$$

where $K$ is the number of identity classes, $p(k)$ is the predicted class distribution obtained from the feature vector, and $L^i(k)$ is the one-hot ground-truth identity label;
Combining the heat map head loss, the center point offset head loss and the bounding box head loss of the face detection branch with the face feature extraction branch loss, the detection and feature extraction tasks are balanced by an uncertainty loss; the detection branch loss and the overall network loss are expressed as (6) and (7), respectively:

$$L_{det} = L_{heat} + L_{center} + L_{box} \tag{6}$$

$$L = \frac{1}{2}\left(\frac{1}{e^{\omega_1}} L_{det} + \frac{1}{e^{\omega_2}} L_{id} + \omega_1 + \omega_2\right) \tag{7}$$

where $\omega_1$ and $\omega_2$ are learnable parameters that balance the detection and feature extraction tasks.
4. The face recognition method for simultaneous detection and feature extraction according to claim 1, wherein in step 2 the deep neural network model training employs a crop-and-mix strategy, a multi-scale image strategy, random left-right flipping and random rotation.
5. The face recognition method for simultaneous detection and feature extraction according to claim 1, wherein in step 3 the model inference application proceeds as follows:
step 3.1, inputting the image to be recognized into the trained face recognition model for face detection and feature extraction;
step 3.2, post-processing the face detection result of step 3.1: setting a confidence threshold for the face bounding boxes, screening out invalid candidate boxes, and applying non-maximum suppression to filter overlapping bounding boxes;
step 3.3, post-processing the face feature extraction result of step 3.1: comparing it with the feature data stored in a database to obtain the identity information corresponding to the features;
step 3.4, matching the face detection results of step 3.2 with the face recognition results of step 3.3 to obtain the person's identity information.
6. A face recognition system for simultaneous detection and feature extraction, characterized in that it performs face recognition, with detection and feature extraction carried out simultaneously, based on the face recognition method of any one of claims 1-5.
7. A computer device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that when the processor executes the computer program, face recognition with simultaneous detection and feature extraction is performed based on the face recognition method of any one of claims 1-5.
8. A computer-readable storage medium on which a computer program is stored, characterized in that when the computer program is executed by a processor, face recognition with simultaneous detection and feature extraction is performed based on the face recognition method of any one of claims 1-5.
CN202110603538.1A (filed 2021-05-31, priority 2021-05-31): Face recognition method for simultaneous detection and feature extraction. Status: Withdrawn. Published as CN113378675A (en).

Priority Applications (1)

CN202110603538.1A (priority date 2021-05-31, filing date 2021-05-31): Face recognition method for simultaneous detection and feature extraction

Publications (1)

CN113378675A, published 2021-09-10

Family ID: 77575092

Family Applications (1): CN202110603538.1A (filed 2021-05-31), published as CN113378675A (en)

Country Status (1): CN 113378675 A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114168684A (en) * 2021-12-10 2022-03-11 南威软件股份有限公司 Face modeling warehousing service implementation method and device based on asynchronous mechanism
CN114168684B (en) * 2021-12-10 2023-08-08 清华大学 Face modeling warehouse-in service implementation method and device based on asynchronous mechanism
CN114462355A (en) * 2022-01-17 2022-05-10 网易有道信息技术(北京)有限公司 Question acquisition method and device, electronic equipment and storage medium
CN114387553A (en) * 2022-01-18 2022-04-22 桂林电子科技大学 Video face recognition method based on frame structure perception aggregation
CN114387553B (en) * 2022-01-18 2024-03-22 桂林电子科技大学 Video face recognition method based on frame structure perception aggregation
CN117173461A (en) * 2023-08-29 2023-12-05 湖北盛林生物工程有限公司 Multi-visual task filling container defect detection method, system and medium

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
WW01: Invention patent application withdrawn after publication (application publication date: 20210910)