CN112149557B - Person identity tracking method and system based on face recognition


Info

Publication number
CN112149557B
CN112149557B (application CN202011000236.7A)
Authority
CN
China
Prior art keywords
face
identity
tracking
person
frame
Prior art date
Legal status
Active
Application number
CN202011000236.7A
Other languages
Chinese (zh)
Other versions
CN112149557A (en)
Inventor
柯逍
林炳辉
陈宇杰
Current Assignee
Fuzhou University
Original Assignee
Fuzhou University
Priority date
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202011000236.7A
Publication of CN112149557A
Application granted
Publication of CN112149557B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/167Detection; Localisation; Normalisation using comparisons between temporally consecutive images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/48Matching video sequences
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation


Abstract

The invention relates to a person identity tracking method and system based on face recognition, comprising the following steps: training a neural network on a face data set; acquiring face pictures of the persons whose identities are to be recognized, and constructing a face identity library; detecting the face position in each frame of an input video using a trained yolov3 face detection model; extracting features from each detected face with the trained neural network and comparing them with the face features in the library to determine identity, then initializing the face targets to be tracked; and tracking the identity of the person corresponding to each face. The invention can confirm the id of the person for each tracked target.

Description

Person identity tracking method and system based on face recognition
Technical Field
The invention relates to the technical field of machine vision, in particular to a person identity tracking method and system based on face recognition.
Background
In recent years, with social progress and the continuous development of science and technology, face recognition has remained a popular research field, studied in depth by many experts at home and abroad. As the entry point and foundation of face recognition, techniques such as face detection, alignment, and tracking have developed alongside it. Face recognition is widely applied in practical scenarios such as intelligent monitoring, video conferencing, and access control systems; however, because real scenes have complex backgrounds that vary with illumination, occlusion, and changes in human posture, face recognition in video from real monitoring systems still poses certain challenges.
Meanwhile, object tracking algorithms have developed rapidly in recent years and are widely applied in monitoring scenes, where the demands of intelligent security are high. However, most current tracking algorithms track people only at the pedestrian level, and the id of the target often switches during tracking.
Disclosure of Invention
In view of the above, the present invention is directed to a method and a system for tracking person identity based on face recognition, in which the id of the person can be confirmed for each tracked target.
The invention is realized by adopting the following scheme: a person identity tracking method based on face recognition specifically comprises the following steps:
training a neural network by adopting a face data set; acquiring face pictures of the persons whose identities are to be recognized, and constructing a face identity library to be recognized;
detecting the face position of each frame of image by using a trained yolov3 face detection model according to an input video frame;
extracting features of the detected face by using a trained neural network, comparing the features with face features in a face identity library to be recognized to determine identity, and initializing a face target to be tracked;
and tracking the identity of the person corresponding to the face.
Further, the training of the neural network by using the face data set specifically includes:
collecting a public face data set to obtain pictures of related persons and corresponding person names;
taking the size of the face images in the face data set to be 112 × 112 and using resnet as the backbone network, the loss function is set as follows:

L = −(1/m) · Σ_{i=1}^{m} log( e^{s·cos(θ_{yi} + t)} / ( e^{s·cos(θ_{yi} + t)} + Σ_{j=1, j≠yi}^{n} e^{s·cos θ_j} ) )

where m is the number of samples, i indexes the ith sample, n is the number of classes, j indexes the jth class, e^{s·cos(θ_{yi} + t)} is the score of the class to which the ith sample belongs, y_i is the class to which the ith sample belongs, s is a normalization parameter, i.e., a scaling factor, cos θ_{yi} is the cosine of the angle between the weight W_{yi} and the feature vector x_i, where the weights W_j and feature vectors x_i have been normalized to 1, and t is an introduced hyperparameter used to limit the included angle between different classes.
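As a concrete reading of this loss (the equation appears only as an image in the original filing, so the additive-angular-margin form above is a reconstruction from the variable definitions), a minimal PyTorch sketch with illustrative values s = 64 and t = 0.5 might be:

```python
import torch
import torch.nn.functional as F

def margin_loss(x, W, labels, s=64.0, t=0.5):
    """Sketch of the described loss. x: (m, d) features, W: (n, d) class
    weights, labels: (m,) ground-truth classes. The defaults for s and t
    are illustrative; the filing fixes neither value.
    """
    x = F.normalize(x, dim=1)                 # ||x_i|| = 1
    W = F.normalize(W, dim=1)                 # ||W_j|| = 1
    cosine = x @ W.t()                        # (m, n) entries cos(theta_j)
    theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
    # add the margin t only to the angle of the ground-truth class
    one_hot = F.one_hot(labels, num_classes=W.size(0)).bool()
    logits = torch.where(one_hot, torch.cos(theta + t), cosine)
    return F.cross_entropy(s * logits, labels)  # softmax over scaled scores
```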
Further, the constructing of the face identity library to be recognized specifically includes: selecting a face image of each target person to be tracked, using the person's name as the file name, and placing the face images under a designated folder as the image library of persons to be tracked; the library contains k persons with corresponding names name_1, name_2, ..., name_k.
Further, the detecting the face position of each frame of image by using the trained yolov3 face detection model according to the input video frame specifically comprises:
selecting an image of a first frame of a video stream;
calling a pre-trained yolov3 face detection model, which resizes the input picture to 448 × 448 and divides it evenly into a 7 × 7 = 49 grid, each cell being 64 × 64;

for each grid cell, 2 bounding boxes are predicted, each with five basic parameters (x, y, w, h, confidence), where (x, y) is the center coordinate of the bounding box, (w, h) its width and height, and confidence its confidence score;

from the 7 × 7 × 2 = 98 bounding boxes predicted in the previous step, discarding those whose confidence is below the preset threshold of 0.7, then removing redundant windows with non-maximum suppression; the remaining bounding boxes serve as face detection frames, giving the positions of the faces in the image.
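A minimal sketch of this post-processing step, assuming corner-format boxes (the center-format (x, y, w, h) boxes above would be converted first) and an illustrative NMS overlap threshold of 0.5, since the filing fixes only the 0.7 confidence threshold:

```python
import numpy as np

def postprocess(boxes, scores, conf_thresh=0.7, iou_thresh=0.5):
    """Confidence filtering followed by non-maximum suppression.

    boxes: (N, 4) array of (x1, y1, x2, y2); scores: (N,) confidences.
    iou_thresh is an assumed value, not given in the text.
    """
    keep = scores >= conf_thresh
    boxes, scores = boxes[keep], scores[keep]
    order = scores.argsort()[::-1]            # highest confidence first
    selected = []
    while order.size > 0:
        i = order[0]
        selected.append(i)
        # IoU of the top box against all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter)
        order = order[1:][iou <= iou_thresh]  # drop overlapping windows
    return boxes[selected], scores[selected]
```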
Further, the step of extracting features of the detected face by using a trained neural network, and comparing the features with the face features in the face identity library to be recognized to determine the identity specifically comprises the following steps:
cropping the image at the face position, aligning the face by a similarity transformation, resizing the crop to 112 × 112, and feeding it into the trained neural network to obtain a feature vector a;

feeding the k pictures in the face identity library to be recognized into the trained neural network to obtain k output feature vectors b_1, b_2, ..., b_k, where k is the number of faces in the library;

computing the cosine similarity between feature vector a and each of b_1, b_2, ..., b_k; the b_i with the highest cosine similarity, provided it exceeds the set threshold of 0.8, identifies the face matched to feature a; otherwise the face corresponding to feature a is set as a stranger.
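A minimal sketch of this matching rule, assuming the library features b_1, ..., b_k are stacked as rows of a matrix:

```python
import numpy as np

def identify(a, gallery, names, threshold=0.8):
    """Match feature a against library vectors b_1..b_k by cosine similarity.

    gallery: (k, d) matrix of library features; names: list of k names.
    Returns the matched name, or "stranger" if nothing exceeds 0.8.
    """
    a = a / np.linalg.norm(a)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ a                      # cosine similarity to each b_i
    best = int(np.argmax(sims))
    return names[best] if sims[best] > threshold else "stranger"
```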
Further, the tracking the person identity corresponding to the face specifically includes:
representing the target state of each tracked face as:

m′ = (u, v, s, r, u̇, v̇, ṡ, ṙ)ᵀ

where m′ is the tracked face target state, u and v are the center coordinates of the tracked face region, s is the aspect ratio of the face frame, r is the height of the face frame, and u̇, v̇, ṡ, ṙ are the respective velocities of (u, v, s, r) in the image coordinate space;
allocating a tracker to each face detection frame to be tracked and setting a counter; the counter is incremented at each Kalman filter prediction, and as soon as a tracker is matched with a yolov3 face detection result, its counter is reset to 0; if a tracker cannot be matched with any yolov3 face detection result within a preset period of time, namely 30 frames, its track is deleted from the track list;
and feeding the track boxes in the track list into the trained neural network in real time to detect the id of the face.
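A minimal sketch of the tracker bookkeeping just described, with the Kalman predict/update mathematics elided (the names Track, misses, and prune are illustrative, not from the filing):

```python
class Track:
    """One tracked face. Any constant-velocity Kalman filter over the
    state (u, v, s, r, du, dv, ds, dr) can supply predict/update; only
    the counter logic named in the text is shown here.
    """
    MAX_AGE = 30  # frames allowed without a matching yolov3 detection

    def __init__(self, track_id, state):
        self.id = track_id
        self.state = state      # (u, v, s, r, du, dv, ds, dr)
        self.misses = 0         # the counter from the description

    def predict(self):
        self.misses += 1        # incremented at each Kalman prediction

    def update(self, detection):
        self.state = detection  # matched a detection: refresh state ...
        self.misses = 0         # ... and reset the counter to 0

def prune(tracks):
    """Delete tracks that went 30 frames without a detection match."""
    return [t for t in tracks if t.misses <= Track.MAX_AGE]
```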
Further, the matching of the tracking result and the detection result is realized by adopting the following method:
linear weighting of three metric approaches is used as the final metric value:
d(i1, j1) = α·d⁽¹⁾(i1, j1) + β·d⁽²⁾(i1, j1) + (1 − α − β)·d⁽³⁾(i1, j1);

where d⁽¹⁾(i1, j1) is the position metric between tracking result c_i1 and detection result d_j1, d⁽²⁾(i1, j1) is the appearance metric between them, d⁽³⁾(i1, j1) is the velocity metric between them, and α and β are weighting coefficients;

if d(i1, j1) is less than the set threshold of 0.3, tracking result c_i1 and detection result d_j1 are judged to match.
Further, the velocity metric between tracking result c_i1 and detection result d_j1 is calculated using the following equation:

d⁽³⁾(i1, j1) = ‖d_j1 − c_i1‖ / f

where ‖d_j1 − c_i1‖ is the distance between tracking result c_i1 and detection result d_j1, and f is the number of frames between them.
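A minimal sketch of the fused metric and the match test, with illustrative weighting coefficients α = β = 0.4, since the filing leaves α and β unspecified:

```python
import numpy as np

def velocity_metric(c, d, f):
    """d3: distance between track c and detection d divided by the f
    frames separating them, approximating speed of motion."""
    return float(np.linalg.norm(np.asarray(d) - np.asarray(c))) / f

def fused_distance(d_pos, d_app, d_vel, alpha=0.4, beta=0.4):
    """Linear weighting of the three metrics, as in the formula above."""
    return alpha * d_pos + beta * d_app + (1 - alpha - beta) * d_vel

def is_match(d_pos, d_app, d_vel, threshold=0.3):
    """A track/detection pair matches when the fused value is below 0.3."""
    return fused_distance(d_pos, d_app, d_vel) < threshold
```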
The invention also provides a person identity tracking system based on face recognition, comprising a processor, a memory and computer program instructions stored on the memory and capable of being executed by the processor, wherein when the computer program instructions are executed by the processor, the steps of the method are realized.
The present invention also provides a computer readable storage medium having stored thereon computer program instructions executable by a processor, the computer program instructions, when executed by the processor, performing the method steps as described above.
Compared with the prior art, the invention has the following beneficial effects: the invention can confirm the id, namely the name, of the person for each tracked target, and if the id of a tracked target changes during tracking, the person's identity can be reconfirmed by face recognition. Meanwhile, the idea of object tracking is used to predict the motion trajectory of the face, avoiding the tracking-box lag that frame-by-frame face recognition would incur.
Drawings
FIG. 1 is a schematic flow chart of a method according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure herein. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in fig. 1, the embodiment provides a person identity tracking method based on face recognition, which specifically includes the following steps:
training a neural network by adopting a face data set; acquiring face pictures of the persons whose identities are to be recognized, and constructing a face identity library to be recognized;
detecting the face position of each frame of image by using a trained yolov3 face detection model according to an input video frame;
extracting features of the detected face by using a trained neural network, comparing the features with face features in a face identity library to be recognized to determine identity, and initializing a face target to be tracked;
and tracking the identity of the person corresponding to the face.
In this embodiment, the training of the neural network by using the face data set specifically includes:
collecting a public face data set to obtain pictures of related persons and corresponding person names;
the face images in the face data set are resized to 112 × 112 and resnet is used as the backbone network; the total batch size during training is set to 512; the learning rate starts at 0.1 and is reduced by one order of magnitude at 100,000, 140,000, and 160,000 iterations, with 200,000 iterations in total; momentum is 0.9 and weight decay is 5e-4. The loss function is set as follows:

L = −(1/m) · Σ_{i=1}^{m} log( e^{s·cos(θ_{yi} + t)} / ( e^{s·cos(θ_{yi} + t)} + Σ_{j=1, j≠yi}^{n} e^{s·cos θ_j} ) )

where m is the number of samples, i indexes the ith sample, n is the number of classes, j indexes the jth class, e^{s·cos(θ_{yi} + t)} is the score of the class to which the ith sample belongs, y_i is the class to which the ith sample belongs, s is a normalization parameter, i.e., a scaling factor, cos θ_{yi} is the cosine of the angle between the weight W_{yi} and the feature vector x_i, where the weights W_j and feature vectors x_i have been normalized to 1, and t is an introduced hyperparameter used to limit the included angle between different classes.
In this embodiment, the constructing of the face identity library to be recognized specifically includes: selecting a face image of each target person to be tracked, using the person's name as the file name, and placing the face images under a designated folder as the image library of persons to be tracked; the library contains k persons with corresponding names name_1, name_2, ..., name_k.
In this embodiment, the detecting, according to the input video frame, the face position of each frame of image using the trained yolov3 face detection model specifically includes:
selecting an image of a first frame of a video stream;
calling a pre-trained yolov3 face detection model, which resizes the input picture to 448 × 448 and divides it evenly into a 7 × 7 = 49 grid, each cell being 64 × 64;

for each grid cell, 2 bounding boxes are predicted, each with five basic parameters (x, y, w, h, confidence), where (x, y) is the center coordinate of the bounding box, (w, h) its width and height, and confidence its confidence score;

from the 7 × 7 × 2 = 98 bounding boxes predicted in the previous step, discarding those whose confidence is below the preset threshold of 0.7, then removing redundant windows with non-maximum suppression; the remaining bounding boxes serve as face detection frames, i.e., the positions of the faces in the image are obtained.
In this embodiment, the extracting features of the detected face by using the trained neural network, and comparing the extracted features with the face features in the face identity library to be recognized to determine the identity specifically includes:
cropping the image at the face position, aligning the face by a similarity transformation, resizing the crop to 112 × 112, and feeding it into the trained neural network to obtain a feature vector a;

feeding the k pictures in the face identity library to be recognized into the trained neural network to obtain k output feature vectors b_1, b_2, ..., b_k, where k is the number of faces in the library;

computing the cosine similarity between feature vector a and each of b_1, b_2, ..., b_k according to the following formula:

cos(a, b_i) = (a · b_i) / (‖a‖ · ‖b_i‖)

the largest of the k similarities identifies the matched face; if the similarity of feature vector a to every feature vector b does not exceed the threshold, the face corresponding to a is a stranger, i.e., not in the library.
In this embodiment, the tracking the person identity corresponding to the face specifically includes:
representing the target state of each tracked face as:

m′ = (u, v, s, r, u̇, v̇, ṡ, ṙ)ᵀ

where m′ is the tracked face target state, u and v are the center coordinates of the tracked face region, s is the aspect ratio of the face frame, r is the height of the face frame, and u̇, v̇, ṡ, ṙ are the respective velocities of (u, v, s, r) in the image coordinate space;

allocating a tracker to each face detection frame to be tracked and setting a counter; the counter is incremented at each Kalman filter prediction, and as soon as a tracker is matched with a yolov3 face detection result, its counter is reset to 0; if a tracker cannot be matched with any yolov3 face detection result within a preset period of time, namely 30 frames, its track is deleted from the track list;

and feeding the track boxes in the track list into the trained neural network in real time to detect the id of the face.
In this embodiment, the matching between the tracking result and the detection result is implemented by the following method:
linear weighting of three metric approaches is used as the final metric value:
d(i1, j1) = α·d⁽¹⁾(i1, j1) + β·d⁽²⁾(i1, j1) + (1 − α − β)·d⁽³⁾(i1, j1);

where d⁽¹⁾(i1, j1) is the position metric between tracking result c_i1 and detection result d_j1, d⁽²⁾(i1, j1) is the appearance metric between them, d⁽³⁾(i1, j1) is the velocity metric between them, and α and β are weighting coefficients;

if d(i1, j1) is less than the set threshold of 0.3, tracking result c_i1 and detection result d_j1 are judged to match; the faces in all video frames are tracked by this method.
The matching of tracking frames uses a position factor and an appearance factor; the Mahalanobis distance is used for the position metric:

d⁽¹⁾(i1, j1) = (d_j1 − c_i1)ᵀ S⁻¹ (d_j1 − c_i1)

the Mahalanobis distance is computed between the object detection frame d_j1 and the object tracking frame c_i1, where S is a covariance matrix and i1, j1 are indices;
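A minimal sketch of this position metric, assuming the tracking frame is summarized by a predicted state mean and its covariance S:

```python
import numpy as np

def position_metric(detection, track_mean, track_cov):
    """d1: squared Mahalanobis distance between a detection vector and
    the track's predicted state distribution (mean, covariance S)."""
    diff = np.asarray(detection) - np.asarray(track_mean)
    return float(diff @ np.linalg.inv(track_cov) @ diff)
```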
For the appearance metric, a corresponding 128-dimensional feature vector r_j1 is computed for each detection box d_j1 through a CNN, and a list is constructed for each tracked target storing the feature vectors of the last 100 frames successfully associated with that target. The appearance metric is then the minimum cosine distance between the tracker's last 100 successfully associated feature vectors and the feature vector of the current frame's detection result:

d⁽²⁾(i1, j1) = min{ 1 − r_j1ᵀ r_k1⁽ⁱ¹⁾ | r_k1⁽ⁱ¹⁾ ∈ R_i1 }

where i1, j1, k1 are indices and R_i1 denotes the set of feature vectors.
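A minimal sketch of the appearance metric and the bounded per-track feature list, assuming the CNN features are already L2-normalized:

```python
import numpy as np
from collections import deque

GALLERY_LEN = 100  # keep the last 100 successfully associated features

def appearance_metric(track_features, r_det):
    """d2: smallest cosine distance between the detection's 128-d feature
    r_det and the track's stored feature vectors (all unit-norm)."""
    r_det = r_det / np.linalg.norm(r_det)
    return min(1.0 - float(np.dot(r, r_det)) for r in track_features)

# per-track storage as a bounded deque, appended on each successful match:
# features = deque(maxlen=GALLERY_LEN); features.append(r_128d)
```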
The velocity metric between tracking result c_i1 and detection result d_j1 is calculated using the following equation:

d⁽³⁾(i1, j1) = ‖d_j1 − c_i1‖ / f

where ‖d_j1 − c_i1‖ is the distance between tracking result c_i1 and detection result d_j1, and f is the number of frames between them. Dividing the distance by f represents the moving speed and direction of the detected object, which better resolves the problem of tracking-id switching when similar-looking people meet.
The present embodiment also provides a person identification tracking system based on face recognition, comprising a processor, a memory, and computer program instructions stored on the memory and capable of being executed by the processor, wherein when the computer program instructions are executed by the processor, the steps of the method as described above are implemented.
The present embodiments also provide a computer readable storage medium having stored thereon computer program instructions executable by a processor, the computer program instructions, when executed by the processor, performing the method steps as described above.
This embodiment focuses on computer-vision recognition and tracking of faces in monitoring scenes, using yolov3 as the face detector to improve detection efficiency. Face recognition is combined with tracking: recognition determines the person's identity during tracking, and that identity is used to restore the id, reducing frequent id switches of the target; the added velocity metric strengthens the constraint on tracking matching, which is of innovative significance. The method of this embodiment has high accuracy and good timeliness, and has practical application value for face recognition and tracking.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is directed to preferred embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. However, any simple modification, equivalent change and modification of the above embodiments according to the technical essence of the present invention are within the protection scope of the technical solution of the present invention.

Claims (6)

1. A person identity tracking method based on face recognition is characterized by comprising the following steps:
training a neural network by adopting a face data set; acquiring face pictures of the persons whose identities are to be recognized, and constructing a face identity library to be recognized;
detecting the face position of each frame of image by using a trained yolov3 face detection model according to an input video frame;
extracting features of the detected face by using a trained neural network, comparing the features with face features in a face identity library to be recognized to determine identity, and initializing a face target to be tracked;
tracking the identity of a person corresponding to the face;
the neural network trained by adopting the face data set specifically comprises the following steps:
collecting a public face data set to obtain pictures of related persons and corresponding person names;
taking the size of the face images in the face data set to be 112 × 112 and using resnet as the backbone network, the loss function is set as follows:

L = −(1/m) · Σ_{i=1}^{m} log( e^{s·cos(θ_{yi} + t)} / ( e^{s·cos(θ_{yi} + t)} + Σ_{j=1, j≠yi}^{n} e^{s·cos θ_j} ) )

where m is the number of samples, i indexes the ith sample, n is the number of classes, j indexes the jth class, e^{s·cos(θ_{yi} + t)} is the score of the class to which the ith sample belongs, y_i is the class to which the ith sample belongs, s is a normalization parameter, i.e., a scaling factor, cos θ_{yi} is the cosine of the angle between the weight W_{yi} and the feature vector x_i, where the weights W_j and feature vectors x_i have been normalized to 1, and t is an introduced hyperparameter used to limit the included angle between different classes;
the tracking of the person identity corresponding to the face specifically comprises the following steps:
representing the target state of each tracked face as:

m′ = (u, v, s′, r, u̇, v̇, ṡ′, ṙ)ᵀ

where m′ is the tracked face target state, u and v are the center coordinates of the tracked face region, s′ is the aspect ratio of the face frame, r is the height of the face frame, and u̇, v̇, ṡ′, ṙ are the respective velocities of (u, v, s′, r) in the image coordinate space;
allocating a tracker for each face detection frame to be tracked, setting a counter, increasing the counter during Kalman filtering prediction, and once the face detection results of one face detection frame tracker and yolov3 can be matched, resetting the counter corresponding to the face detection frame tracker to be 0; if a face detection frame tracker fails to match the face detection result of yolov3 within a preset period of time, deleting the track of the face detection frame tracker from the track list;
transmitting the track frame in the track list into a trained neural network in real time to detect the id of the face;
the matching of the tracking result and the detection result is realized by adopting the following method:
linear weighting of three metric approaches is used as the final metric value:
d(i1, j1) = α·d⁽¹⁾(i1, j1) + β·d⁽²⁾(i1, j1) + (1 − α − β)·d⁽³⁾(i1, j1);

where d⁽¹⁾(i1, j1) is the position metric between tracking result c_i1 and detection result d_j1, d⁽²⁾(i1, j1) is the appearance metric between them, d⁽³⁾(i1, j1) is the velocity metric between them, and α and β are weighting coefficients;

if d(i1, j1) is smaller than the set threshold, tracking result c_i1 and detection result d_j1 are judged to match;

the velocity metric between tracking result c_i1 and detection result d_j1 is calculated using the following equation:

d⁽³⁾(i1, j1) = ‖d_j1 − c_i1‖ / f

where ‖d_j1 − c_i1‖ is the distance between tracking result c_i1 and detection result d_j1, and f is the number of frames between them.
2. The person identity tracking method based on face recognition according to claim 1, wherein the constructing of the face identity library to be recognized specifically comprises: selecting a face image of each target person to be tracked, using the person's name as the file name, and placing the face images under a designated folder as the image library of persons to be tracked; the library contains k persons with corresponding names name_1, name_2, ..., name_k.
3. The person identity tracking method based on face recognition as claimed in claim 1, wherein the detecting the face position of each frame of image using the trained yolov3 face detection model according to the input video frame specifically comprises:
selecting an image of a first frame of a video stream;
calling a pre-trained yolov3 face detection model, which resizes the input picture to 448 × 448 and divides it evenly into a 7 × 7 = 49 grid, each cell being 64 × 64;

for each grid cell, 2 bounding boxes are predicted, each with five basic parameters (x, y, w, h, confidence), where (x, y) is the center coordinate of the bounding box, (w, h) its width and height, and confidence its confidence score;

from the 7 × 7 × 2 = 98 bounding boxes predicted in the previous step, discarding those whose confidence is below a preset threshold, then removing redundant windows with non-maximum suppression; the remaining bounding boxes serve as face detection frames, i.e., the positions of the faces in the image are obtained.
4. The person identity tracking method based on face recognition according to claim 1, wherein the step of extracting features of the detected face by using a trained neural network and comparing the extracted features with the face features in the face identity library to be recognized to determine the identity specifically comprises the following steps:
cropping the image at the face position, aligning the face by a similarity transformation, resizing the crop to 112 × 112, and feeding it into the trained neural network to obtain a feature vector a;

feeding the k pictures in the face identity library to be recognized into the trained neural network to obtain k output feature vectors b_1, b_2, ..., b_k, where k is the number of faces in the library;

computing the cosine similarity between feature vector a and each of b_1, b_2, ..., b_k; the b_i with the highest cosine similarity, provided it exceeds the set threshold, identifies the face matched to feature a; otherwise the face corresponding to feature a is set as a stranger.
5. A person identity tracking system based on face recognition, comprising a processor, a memory and computer program instructions stored on the memory and executable by the processor, which when executed by the processor, implement the method steps of any of claims 1-4.
6. A computer-readable storage medium, having stored thereon computer program instructions executable by a processor, for performing, when the processor executes the computer program instructions, the method steps according to any one of claims 1-4.
CN202011000236.7A 2020-09-22 2020-09-22 Person identity tracking method and system based on face recognition Active CN112149557B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011000236.7A CN112149557B (en) 2020-09-22 2020-09-22 Person identity tracking method and system based on face recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011000236.7A CN112149557B (en) 2020-09-22 2020-09-22 Person identity tracking method and system based on face recognition

Publications (2)

Publication Number Publication Date
CN112149557A CN112149557A (en) 2020-12-29
CN112149557B true CN112149557B (en) 2022-08-09

Family

ID=73892695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011000236.7A Active CN112149557B (en) 2020-09-22 2020-09-22 Person identity tracking method and system based on face recognition

Country Status (1)

Country Link
CN (1) CN112149557B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668483B (en) * 2020-12-30 2022-06-10 福州大学 Single-target person tracking method integrating pedestrian re-identification and face detection
CN113705510A (en) * 2021-09-02 2021-11-26 广州市奥威亚电子科技有限公司 Target identification tracking method, device, equipment and storage medium
CN113723375B (en) * 2021-11-02 2022-03-04 杭州魔点科技有限公司 Double-frame face tracking method and system based on feature extraction
CN115206322A (en) * 2022-09-15 2022-10-18 广东海新智能厨房股份有限公司 Intelligent cabinet based on automatic induction and intelligent cabinet control method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1855118A (en) * 2005-04-28 2006-11-01 中国科学院自动化研究所 Method for discriminating face at sunshine based on image ratio
WO2018133666A1 (en) * 2017-01-17 2018-07-26 腾讯科技(深圳)有限公司 Method and apparatus for tracking video target
CN109829436A (en) * 2019-02-02 2019-05-31 福州大学 Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network
CN109919977A (en) * 2019-02-26 2019-06-21 鹍骐科技(北京)股份有限公司 A kind of video motion personage tracking and personal identification method based on temporal characteristics

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1855118A (en) * 2005-04-28 2006-11-01 中国科学院自动化研究所 Method for discriminating face at sunshine based on image ratio
WO2018133666A1 (en) * 2017-01-17 2018-07-26 腾讯科技(深圳)有限公司 Method and apparatus for tracking video target
CN109829436A (en) * 2019-02-02 2019-05-31 福州大学 Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network
CN109919977A (en) * 2019-02-26 2019-06-21 鹍骐科技(北京)股份有限公司 A kind of video motion personage tracking and personal identification method based on temporal characteristics

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YOLOv3 as a Deep Face Detector; Filiz Gurkan et al.; 2019 11th International Conference on Electrical and Electronics Engineering (ELECO); 2020-02-13 *
Face recognition and tracking system of a camera robot based on YOLOv3 and ResNet50; Chen Kai et al.; Computer and Modernization, No. 04; 2020-04-15 *
An efficient real-time M:N-mode face recognition method based on deep learning; Zheng Kaifa et al.; Proceedings of the 2019 Electric Power Industry Informatization Annual Conference; 2019-09-30 *

Also Published As

Publication number Publication date
CN112149557A (en) 2020-12-29

Similar Documents

Publication Publication Date Title
CN109829436B (en) Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network
CN112149557B (en) Person identity tracking method and system based on face recognition
CN110147743B (en) Real-time online pedestrian analysis and counting system and method under complex scene
CN111860282A (en) Subway section passenger flow volume statistics and pedestrian retrograde motion detection method and system
CN107909027B (en) Rapid human body target detection method with shielding treatment
Azhar et al. People tracking system using DeepSORT
CN110969087B (en) Gait recognition method and system
US20120014562A1 (en) Efficient method for tracking people
Hassan et al. A review on human actions recognition using vision based techniques
US20070291984A1 (en) Robust object tracking system
CN108960047B (en) Face duplication removing method in video monitoring based on depth secondary tree
CN112989889A (en) Gait recognition method based on posture guidance
Serpush et al. Complex human action recognition in live videos using hybrid FR-DL method
Elsayed et al. Abnormal Action detection in video surveillance
CN116342645A (en) Multi-target tracking method for natatorium scene
Xu et al. A novel multi-view face detection method based on improved real adaboost algorithm
Bing et al. Research of face detection based on adaboost and asm
Hashem et al. Human gait identification system based on transfer learning
Ildarabadi et al. Improvement Tracking Dynamic Programming using Replication Function for Continuous Sign Language Recognition
CN117011335B (en) Multi-target tracking method and system based on self-adaptive double decoders
Raskin et al. Tracking and classifying of human motions with gaussian process annealed particle filter
Radulescu et al. Model of human actions recognition based on 2D Kernel
Zheng et al. Object detection and tracking using Bayes-constrained particle swarm optimization
Shah et al. RESTAURANT SYSTEM TO CALCULATE WAITING TIME AND AGE, GENDER INSIGHTS
Abdellaoui et al. Robust Object Tracker in Video via Discriminative Model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant