CN113239727A - Person detection and identification method - Google Patents

Person detection and identification method

Info

Publication number
CN113239727A
Authority
CN
China
Prior art keywords
face
image
face recognition
training
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110375567.7A
Other languages
Chinese (zh)
Inventor
李扬曦
缪亚男
王佩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center filed Critical National Computer Network and Information Security Management Center
Priority to CN202110375567.7A priority Critical patent/CN113239727A/en
Publication of CN113239727A publication Critical patent/CN113239727A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Human Computer Interaction (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a person detection and identification method, which relates to the technical field of face recognition and comprises the following steps: extracting video frames from an input video to obtain original face images; performing face detection on the original face images with a BlazeFace network structure to obtain target images; locating facial key points in the target images with Dlib, aligning the faces, and cropping the face regions of the target images to generate training images; performing face recognition training on the training images with ResNet50 + ArcFace Loss to obtain a trained face recognition network; and analyzing the face images to be recognized with the trained face recognition network to obtain recognition results. The method performs face detection and face recognition rapidly, improving detection speed while ensuring detection accuracy.

Description

Person detection and identification method
Technical Field
The invention relates to the technical field of face recognition, and in particular to a person detection and identification method.
Background
A wide variety of promotional videos now circulate on the Internet, and some people publish false statements online; if such videos spread into the domestic Internet, they can cause harmful effects. Existing schemes for detecting and identifying persons have the following shortcomings. First, Faster R-CNN is a two-stage general object detection network that offers high accuracy but relatively low speed compared with one-stage detectors such as YOLO, while face detection in video places high demands on speed. Second, with the development of face recognition technology, better metric learning methods have emerged that can improve face recognition recall. Therefore, how to provide a method for fast face detection and face recognition is a technical problem urgently to be solved by those skilled in the art.
Disclosure of Invention
In view of this, the present invention provides a person detection and identification method to solve the problems described in the background and to improve detection speed while ensuring detection accuracy.
In order to achieve the purpose, the invention adopts the following technical scheme: a person detection and identification method comprises the following steps:
performing video frame extraction on an input video to obtain an original face image;
carrying out face detection on the original face image by using a Blazeface network structure to obtain a target image;
positioning key points of the human face of the target image by using Dlib, aligning the human face, and cutting the human face area of the target image to generate a training image;
carrying out face recognition training on the training image by using ResNet50+ ArcFace Loss to obtain a trained face recognition network;
and analyzing the face image to be recognized by using the trained face recognition network to obtain a recognition result.
Preferably, one frame of the input video is extracted as a key frame at regular intervals by using FFmpeg software.
Preferably, the BlazeFace network structure is improved on the basis of MobileNet + SSD in terms of the convolution kernel size and the anchor mechanism.
Preferably, the specific steps of the face recognition training are as follows:
inputting the training image into ResNet50 to extract features;
calculating the difference between the predicted label and the real label by using the ArcFace Loss to complete the training stage of face recognition, wherein the calculation formula is as follows:

$$L_1 = -\frac{1}{m}\sum_{i=1}^{m}\log\frac{e^{W_{y_i}^{T}x_i+b_{y_i}}}{\sum_{j=1}^{n}e^{W_{j}^{T}x_i+b_{j}}}$$

wherein L1 is the loss function, xi is the feature extracted by ResNet50 for the i-th sample, yi is its true class label, W is the weight value of the fully connected layer, b is the bias value of the fully connected layer, e is the base of the natural logarithm, m is the number of samples, and n is the number of classes.
Preferably, at least one image of each person is taken, features are extracted by the trained face recognition network, and the extracted features are stored in a database to obtain the base features.
Preferably, the method further comprises a face recognition test, and the face recognition test specifically comprises the following steps:
inputting a test image, extracting a face region, inputting the face region into the trained face recognition network, and extracting features to obtain test features;
and calculating the Euclidean distance between the test features and the base features; if the Euclidean distance is smaller than a specified threshold, the test image is determined to contain the target person.
Compared with the prior art, the technical scheme has the advantage that the person detection and identification method can assist in recognizing a target person, guaranteeing detection accuracy while improving detection speed, so that the target person is detected quickly and accurately.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only embodiments of the present invention; for those skilled in the art, other drawings can be obtained from the provided drawings without creative effort.
FIG. 1 is a schematic diagram of the structure of the present invention;
fig. 2 is a diagram of an improved network architecture according to the present invention.
FIG. 3(a) is a diagram of a prior art anchor mechanism;
FIG. 3(b) is a diagram of the improved anchor mechanism of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention discloses a person detection and identification method, which comprises the following steps as shown in figure 1:
performing video frame extraction on an input video to obtain an original face image;
carrying out face detection on an original face image by using a Blazeface network structure to obtain a target image;
carrying out face key point positioning on a target image by using Dlib, carrying out face alignment, and cutting a face area of the target image to generate a training image;
carrying out face recognition training on the training image by using ResNet50+ ArcFace Loss to obtain a trained face recognition network;
and analyzing the face image to be recognized by using the trained face recognition network to obtain a recognition result.
Further, FFmpeg software is used to extract video frames from the input video; to improve efficiency, one frame is extracted every two seconds as a key frame.
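The frame-extraction step can be sketched from Python. This is an illustrative sketch: the file names and JPEG output pattern are assumptions; only the FFmpeg `fps` filter itself encodes the one-frame-per-two-seconds rule from the text.

```python
import subprocess

def build_ffmpeg_cmd(video_path, out_pattern, seconds_per_frame=2):
    """Build an FFmpeg command line that samples one frame every
    `seconds_per_frame` seconds via the fps filter (fps=1/2 means
    one frame per two seconds)."""
    return [
        "ffmpeg", "-i", video_path,
        "-vf", "fps=1/{}".format(seconds_per_frame),
        out_pattern,  # e.g. frame_%04d.jpg -> frame_0001.jpg, ...
    ]

cmd = build_ffmpeg_cmd("input.mp4", "frame_%04d.jpg")
# subprocess.run(cmd, check=True)  # run only where FFmpeg is installed
```

The command itself is built separately from its execution so the sampling rate can be tested without FFmpeg present.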
Further, the face detection adopts a BlazeFace network structure, and the network is improved based on MobileNet + SSD.
It should be noted that SSD is a one-stage detection network and MobileNet is an optimization of the backbone for network acceleration; the BlazeFace network structure improves speed as much as possible while ensuring accuracy, making two improvements on the MobileNet + SSD basis:
1. The network structure is modified to replace the 3 x 3 convolutions with 5 x 5 convolutions; as shown in FIG. 2, the 5 x 5 convolution kernel enlarges the receptive field.
2. The anchor mechanism is improved: the 2 anchors per pixel at each of the 8 x 8, 4 x 4 and 2 x 2 resolutions are replaced with 6 anchors per pixel at the 8 x 8 resolution, as shown in FIG. 3(a) and FIG. 3(b), which improves detection speed.
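The two improvements above can be illustrated with simple arithmetic. This is an illustrative sketch: the two-layer stride-1 stacks are hypothetical, while the grid sizes and anchors-per-cell follow the text.

```python
def receptive_field(layers):
    """Receptive field of a stack of (kernel_size, stride) conv layers,
    via the standard recurrence r <- r + (k - 1) * j, j <- j * s,
    where j is the cumulative stride (jump)."""
    r, j = 1, 1
    for k, s in layers:
        r += (k - 1) * j
        j *= s
    return r

def anchor_count(scheme):
    """Total anchors over a list of (grid_size, anchors_per_cell)."""
    return sum(g * g * a for g, a in scheme)

# Improvement 1: two stacked 5x5 convs see a 9x9 window vs 5x5 for 3x3 convs.
rf_small = receptive_field([(3, 1), (3, 1)])  # 5
rf_large = receptive_field([(5, 1), (5, 1)])  # 9

# Improvement 2: anchors at a single 8x8 resolution instead of three maps.
baseline = anchor_count([(8, 2), (4, 2), (2, 2)])  # 168 anchors, 3 maps
improved = anchor_count([(8, 6)])                  # 384 anchors, 1 map
```

Note that the single-resolution scheme actually has more anchors; in the BlazeFace paper's account, the speedup comes from dropping the extra low-resolution branches and keeping all anchor computation at one feature-map resolution, which schedules better on mobile GPUs.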
Further, in the process of performing face detection on the input video, the detected face can jitter noticeably between frames, so a tie-resolution (blending) strategy is used instead of NMS: the regression parameters of a bounding box are estimated as a weighted average over the overlapping predictions. The improved network structure can reach sub-millisecond speed on mobile devices while retaining high accuracy.
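A minimal sketch of this weighted-average strategy, assuming axis-aligned boxes as (x1, y1, x2, y2) tuples with positive confidence scores; the overlap threshold of 0.3 and the example boxes are illustrative.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def blend_boxes(boxes, scores, thresh=0.3):
    """Instead of keeping only the top-scoring box (as NMS does), average
    the coordinates of all overlapping predictions, weighted by score."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    used, out = set(), []
    for i in order:
        if i in used:
            continue
        group = [j for j in order
                 if j not in used and iou(boxes[i], boxes[j]) >= thresh]
        used.update(group)
        w = sum(scores[j] for j in group)  # positive by assumption
        out.append(tuple(sum(scores[j] * boxes[j][k] for j in group) / w
                         for k in range(4)))
    return out

boxes = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0),
         (100.0, 100.0, 110.0, 110.0)]
scores = [0.9, 0.8, 0.7]
merged = blend_boxes(boxes, scores)  # two boxes: one blended, one lone
```

Because each output coordinate is a score-weighted mean rather than a hard selection, small per-frame detection noise is smoothed out, which is what reduces the visible jitter.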
Further, after face detection is completed, the face region is cropped. Because the faces in a video may have different orientations, Dlib is used to locate the facial key points; the faces are then aligned and tilted faces are straightened, which improves the accuracy of the face recognition stage.
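In practice this step would use Dlib's landmark predictor (`dlib.shape_predictor` with a 68-point model) to find the eye positions. The sketch below shows only the alignment geometry, with made-up eye coordinates: the rotation angle that levels the eye line, and the corresponding 2x3 affine matrix in the same convention as OpenCV's `cv2.getRotationMatrix2D`.

```python
import math

def alignment_angle(left_eye, right_eye):
    """Rotation angle (degrees) that makes the eye line horizontal."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

def rotation_matrix(center, angle_deg):
    """2x3 affine matrix rotating by `angle_deg` about `center`
    (same layout as cv2.getRotationMatrix2D with scale = 1)."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    cx, cy = center
    return [
        [cos_a, sin_a, (1 - cos_a) * cx - sin_a * cy],
        [-sin_a, cos_a, sin_a * cx + (1 - cos_a) * cy],
    ]

# Hypothetical eyes tilted by 45 degrees; applying the matrix (e.g. via
# cv2.warpAffine) would rotate the crop upright before recognition.
angle = alignment_angle((30.0, 30.0), (70.0, 70.0))  # 45.0
```

Aligning before feature extraction means the recognition network never has to learn rotation invariance itself, which is why this step improves accuracy.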
Further, the specific steps of the face recognition training are as follows:
inputting the training image into ResNet50 to extract features;
calculating the difference between the predicted label and the real label by using the ArcFace Loss to complete the training stage of face recognition, wherein the calculation formula is as follows:

$$L_1 = -\frac{1}{m}\sum_{i=1}^{m}\log\frac{e^{W_{y_i}^{T}x_i+b_{y_i}}}{\sum_{j=1}^{n}e^{W_{j}^{T}x_i+b_{j}}}$$

wherein L1 is the loss function, xi is the feature extracted by ResNet50 for the i-th sample, yi is its true class label, W is the weight value of the fully connected layer, b is the bias value of the fully connected layer, e is the base of the natural logarithm, m is the number of samples, and n is the number of classes.
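The formula above is the plain softmax cross-entropy, the loss the ArcFace paper denotes L1; the full ArcFace loss additionally l2-normalizes the features and weights, drops the bias, and adds an additive angular margin to the target-class angle. A NumPy sketch of the formula as written, with illustrative shapes:

```python
import numpy as np

def softmax_loss(X, y, W, b):
    """L1 = -(1/m) * sum_i log( e^{W_{y_i}^T x_i + b_{y_i}}
                                / sum_j e^{W_j^T x_i + b_j} )."""
    logits = X @ W + b                                   # (m, n_classes)
    logits = logits - logits.max(axis=1, keepdims=True)  # stability shift
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(X.shape[0]), y].mean()

# Uninformative (all-zero) weights give the chance-level loss log(n_classes).
X = np.ones((3, 4))                                      # 3 samples, 4-d feats
loss = softmax_loss(X, np.array([0, 1, 0]), np.zeros((4, 2)), np.zeros(2))
```

The log-sum-exp shift keeps the exponentials from overflowing; it cancels in the ratio, so the computed loss matches the formula exactly.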
Further, after the face recognition network is trained, the base features need to be generated. For each target person to be recognized, at least one image is taken, features are extracted using the trained face recognition network, and the features are stored in a database to obtain the base features.
Further, the method also comprises a face recognition test, and the specific steps are as follows:
when a test image is input, extracting a face region, inputting the face region into a trained face recognition network, and extracting features to obtain test features;
and calculating the Euclidean distance between the test features and the base features; if the Euclidean distance is smaller than a specified threshold, the test image is determined to contain the target person.
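The matching rule can be sketched as follows; the gallery names, feature values, and the threshold of 1.0 are all illustrative (in practice the threshold would be tuned on validation data).

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def identify(test_feat, gallery, threshold=1.0):
    """Return the name of the closest gallery entry if its distance is
    below `threshold`, otherwise None (unknown person)."""
    best_name, best_d = None, float("inf")
    for name, feat in gallery.items():
        d = euclidean(test_feat, feat)
        if d < best_d:
            best_name, best_d = name, d
    return best_name if best_d < threshold else None

# Hypothetical 3-d base features (real embeddings would be e.g. 512-d).
gallery = {"alice": [0.1, 0.2, 0.3], "bob": [0.9, 0.8, 0.7]}
match = identify([0.12, 0.18, 0.31], gallery)  # "alice"
```

Returning None when no gallery entry is close enough is what lets the threshold reject unknown faces rather than forcing a nearest-neighbor match.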
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. A person detection and identification method is characterized by comprising the following steps:
performing video frame extraction on an input video to obtain an original face image;
carrying out face detection on the original face image by using a Blazeface network structure to obtain a target image;
positioning key points of the human face of the target image by using Dlib, aligning the human face, and cutting the human face area of the target image to generate a training image;
carrying out face recognition training on the training image by using ResNet50+ ArcFace Loss to obtain a trained face recognition network;
and analyzing the face image to be recognized by using the trained face recognition network to obtain a recognition result.
2. The method as claimed in claim 1, wherein the FFmpeg software is used to extract a frame of image of the input video as the key frame at regular intervals.
3. The method of claim 1, wherein the BlazeFace network structure is improved on the basis of MobileNet + SSD in terms of the convolution kernel size and the anchor mechanism.
4. The method for detecting and identifying a person as claimed in claim 1, wherein the steps of the face recognition training are as follows:
inputting the training image into ResNet50 to extract features;
calculating the difference between the predicted label and the real label by using the ArcFace Loss to finish the training stage of face recognition, wherein the calculation formula is as follows:

$$L_1 = -\frac{1}{m}\sum_{i=1}^{m}\log\frac{e^{W_{y_i}^{T}x_i+b_{y_i}}}{\sum_{j=1}^{n}e^{W_{j}^{T}x_i+b_{j}}}$$

wherein L1 is the loss function, xi is the feature extracted by ResNet50 for the i-th sample, yi is its true class label, W is the weight value of the fully connected layer, b is the bias value of the fully connected layer, e is the base of the natural logarithm, m is the number of samples, and n is the number of classes.
5. The method of claim 1, wherein at least one image of each person is taken, features are extracted by the trained face recognition network, and the extracted features are stored in a database to obtain the base features.
6. The method of claim 5, further comprising a face recognition test, wherein the face recognition test comprises the following specific steps:
inputting a test image, extracting a face region, inputting the face region into the trained face recognition network, and extracting features to obtain test features;
and calculating the Euclidean distance between the test features and the base features; if the Euclidean distance is smaller than a specified threshold, the test image is determined to contain the target person.
CN202110375567.7A 2021-04-03 2021-04-03 Person detection and identification method Pending CN113239727A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110375567.7A CN113239727A (en) 2021-04-03 2021-04-03 Person detection and identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110375567.7A CN113239727A (en) 2021-04-03 2021-04-03 Person detection and identification method

Publications (1)

Publication Number Publication Date
CN113239727A true CN113239727A (en) 2021-08-10

Family

ID=77131254

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110375567.7A Pending CN113239727A (en) 2021-04-03 2021-04-03 Person detection and identification method

Country Status (1)

Country Link
CN (1) CN113239727A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203395A (en) * 2016-07-26 2016-12-07 厦门大学 Face character recognition methods based on the study of the multitask degree of depth
CN108875602A (en) * 2018-05-31 2018-11-23 珠海亿智电子科技有限公司 Monitor the face identification method based on deep learning under environment
CN111178228A (en) * 2019-12-26 2020-05-19 中云智慧(北京)科技有限公司 Face recognition method based on deep learning
CN112070058A (en) * 2020-09-18 2020-12-11 深延科技(北京)有限公司 Face and face composite emotional expression recognition method and system
CN112488064A (en) * 2020-12-18 2021-03-12 平安科技(深圳)有限公司 Face tracking method, system, terminal and storage medium

Non-Patent Citations (2)

Title
JIANKANG DENG等: "ArcFace: Additive Angular Margin Loss for Deep Face Recognition", 《ARXIV》 *
VALENTIN BAZAREVSKY等: "BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs", 《ARXIV》 *

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN116110100A (en) * 2023-01-14 2023-05-12 深圳市大数据研究院 Face recognition method, device, computer equipment and storage medium
CN116110100B (en) * 2023-01-14 2023-11-14 深圳市大数据研究院 Face recognition method, device, computer equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Li Yangxi

Inventor after: Miao Yanan

Inventor after: Wang Pei

Inventor after: Liu Kedong

Inventor after: Peng Chengwei

Inventor after: Hu Yanlin

Inventor before: Li Yangxi

Inventor before: Miao Yanan

Inventor before: Wang Pei