CN109829436B - Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network - Google Patents

Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network

Info

Publication number
CN109829436B
CN109829436B CN201910106309.1A CN201910106309A
Authority
CN
China
Prior art keywords
face
frame
feature
target
tracking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910106309.1A
Other languages
Chinese (zh)
Other versions
CN109829436A (en
Inventor
柯逍
郑毅腾
朱敏琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University filed Critical Fuzhou University
Priority to CN201910106309.1A priority Critical patent/CN109829436B/en
Publication of CN109829436A publication Critical patent/CN109829436A/en
Priority to PCT/CN2019/124966 priority patent/WO2020155873A1/en
Application granted granted Critical
Publication of CN109829436B publication Critical patent/CN109829436B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/11 - Region-based segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/50 - Depth or shape recovery

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a multi-face tracking method based on depth appearance characteristics and a self-adaptive aggregation network. A face recognition data set is first used to train the self-adaptive aggregation network. A face detection method based on a convolutional neural network then locates the faces in the initial video frame, the face targets to be tracked are initialized, and their face features are extracted. Next, a Kalman filter predicts the position of each tracked face target in the next frame, the faces in that frame are located again, and features are extracted for the detected faces. Finally, the self-adaptive aggregation network aggregates the face feature set in the tracking track of each tracked face target, dynamically generating a face depth apparent feature that fuses multi-frame information; the predicted positions and the fused features are matched, by similarity calculation, against the face positions and features detected in the current frame, and the tracking state is updated. The invention can improve the performance of face tracking.

Description

Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network
Technical Field
The invention relates to the field of pattern recognition and computer vision, in particular to a multi-face tracking method based on depth appearance characteristics and a self-adaptive aggregation network.
Background
In recent years, with social progress and the continuous development of science and technology, video face recognition has gradually become a popular research field and has attracted the interest of many experts and scholars at home and abroad. As the entrance to and basis of video face recognition, face detection and tracking technology has developed rapidly and is widely applied in fields such as intelligent surveillance, virtual-reality perception interfaces and video conferencing.
To analyze a face, it must first be captured, which is achieved by face detection and face tracking technology; only when a face target is accurately located and tracked in a video can it be analyzed in more detail, for example for face recognition or pose estimation. Target tracking is undoubtedly one of the most important technologies in intelligent security, and face tracking is a specific application of it: a tracking algorithm processes a moving face in a video sequence and keeps the face region locked to complete the tracking. The technology therefore has good application prospects in scenarios such as intelligent security and video surveillance.
Face tracking plays an important role in video surveillance, but in real scenes large changes in face pose and the overlap and occlusion between tracked targets currently make practical application difficult.
Disclosure of Invention
In view of this, the present invention provides a multi-face tracking method based on a deep appearance feature and an adaptive aggregation network, which can improve the face tracking performance.
The invention is realized by the following scheme: a multi-face tracking method based on depth appearance characteristics and a self-adaptive aggregation network, comprising the following steps:
step S1: training a self-adaptive aggregation network with a face recognition data set;
step S2: acquiring the face positions in the initial input video frame with a convolutional neural network, initializing the face targets to be tracked, and extracting and storing their face features;
step S3: predicting the position of each face target in the next frame with a Kalman filter, locating the faces in the next frame again, and extracting features for the detected faces;
step S4: using the self-adaptive aggregation network trained in step S1 to aggregate the face feature set in the tracking track of each tracked face target, dynamically generating a face depth apparent feature that fuses multi-frame information, combining the predicted position with the fused feature, performing similarity calculation and matching against the face positions and features detected in the current frame, and updating the tracking state.
Further, step S1 specifically includes the following steps:
step S11: collecting public face recognition data sets to obtain pictures of the relevant persons and their names;
step S12: integrating the pictures of persons common to the multiple data sets with a fusion strategy, performing face detection and facial key point localization with a pre-trained MTCNN model, applying a similarity transformation for face alignment, and subtracting the per-channel mean of the training set from all images in the training set, thereby completing the data preprocessing and training the self-adaptive aggregation network.
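The preprocessing in step S12 combines key-point-based face alignment by a similarity transformation with per-channel mean subtraction. The following Python sketch illustrates one possible implementation; the 112x112 five-point template, the function names and the use of scikit-image are assumptions made for illustration and are not prescribed by the description above.
```python
# Sketch of the step S12 preprocessing: align a detected face to a canonical
# five-point template with a similarity transform, then subtract per-channel means.
import numpy as np
from skimage import transform as sk_tf

# Assumed canonical positions (x, y) of left eye, right eye, nose, left and right
# mouth corner in a 112x112 crop (a commonly used template, not from the patent).
TEMPLATE_112 = np.array([
    [38.3, 51.7], [73.5, 51.5], [56.0, 71.7], [41.5, 92.4], [70.7, 92.2]
], dtype=np.float64)

def align_face(image, landmarks, out_size=(112, 112)):
    """Warp `image` so that its five detected landmarks match the template."""
    tform = sk_tf.SimilarityTransform()
    tform.estimate(np.asarray(landmarks, dtype=np.float64), TEMPLATE_112)
    aligned = sk_tf.warp(image, tform.inverse, output_shape=out_size,
                         preserve_range=True)
    return aligned.astype(np.float32)

def subtract_channel_mean(images):
    """Subtract the per-channel mean computed over the training set."""
    images = np.asarray(images, dtype=np.float32)        # (N, H, W, 3)
    channel_mean = images.mean(axis=(0, 1, 2), keepdims=True)
    return images - channel_mean, channel_mean.squeeze()
```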
Furthermore, the self-adaptive aggregation network is formed by connecting a depth feature extraction module and a self-adaptive feature aggregation module in series; it accepts one or more face images of the same person as input and outputs an aggregated feature. The depth feature extraction module adopts a 34-layer ResNet as the backbone network, and the self-adaptive feature aggregation module comprises a feature aggregation layer. Let B denote the number of input samples and {z_t}, t = 1, 2, ..., B, the corresponding feature vectors; the feature aggregation layer is computed as follows:
v_t = σ(q^T z_t)
o_t = v_t / ∑_t v_t
a = ∑_t o_t z_t
where q is a vector of weights over the components of the feature vectors z_t and is a learnable parameter, trained by back-propagation and gradient descent with the face recognition signal as the supervisory signal; v_t, the output of the sigmoid function σ(·), is a score for each feature vector z_t lying between 0 and 1; o_t is the L1-normalized score, so that ∑_t o_t = 1; and a is the feature vector obtained by aggregating the B feature vectors.
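The feature aggregation layer above can be written in a few lines of PyTorch. This is a minimal sketch assuming 512-dimensional features and a single learnable weight vector q; the small epsilon in the denominator is an implementation detail, not part of the description.
```python
# Minimal sketch of the feature aggregation layer: a learnable vector q scores
# each feature z_t through a sigmoid, the scores are L1-normalized so they sum
# to 1, and the aggregated feature is the weighted sum of the inputs.
import torch
import torch.nn as nn

class FeatureAggregation(nn.Module):
    def __init__(self, feat_dim: int = 512):
        super().__init__()
        # q: one weight per feature component, learned with the face
        # recognition loss via back-propagation.
        self.q = nn.Parameter(torch.zeros(feat_dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (B, feat_dim) feature vectors of the same person
        v = torch.sigmoid(z @ self.q)            # (B,), scores in (0, 1)
        o = v / (v.sum() + 1e-8)                 # L1 normalization, sum_t o_t = 1
        a = (o.unsqueeze(1) * z).sum(dim=0)      # (feat_dim,) aggregated feature
        return a
```
During training, the aggregated feature a is fed to the face recognition loss, so q is learned by back-propagation and gradient descent exactly as described above.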
Further, step S2 specifically includes the following steps:
step S21: let i denote the index of the i-th frame of the input video, with i = 1 initially; the pre-trained MTCNN model is used to simultaneously detect the positions D_i of all faces and the positions C_i of their corresponding facial key points, where D_i = {D_i^j}, j = 1, 2, ..., J_i, j is the index of the j-th detected face and J_i is the number of faces detected in the i-th frame; D_i^j = (x, y, w, h) denotes the position of the j-th face in the i-th frame, x and y being the coordinates of the upper-left corner of the face region and w and h its width and height; C_i = {C_i^j}, j = 1, 2, ..., J_i, where C_i^j = (c_1, c_2, c_3, c_4, c_5) denotes the key points of the j-th face in the i-th frame, and c_1, c_2, c_3, c_4, c_5 are the coordinates of the left eye, right eye, nose, left mouth corner and right mouth corner of the face, respectively;
step S22: assign to each face position D_i^j and its facial key-point coordinates C_i^j a unique identity ID_k, k = 1, 2, ..., K_i, where k is the index of the k-th tracking target and K_i is the number of tracked targets in frame i, and initialize the corresponding tracker T_k = {ID_k, P_k, L_k, E_k, A_k}, in which ID_k is the unique identity of the k-th tracked target, P_k the face position coordinates assigned to the k-th target, L_k the facial key-point coordinates of the k-th target, E_k the list of face features of the k-th target, and A_k the life cycle of the k-th target; initialize K_i = J_i, P_k = D_i^j, L_k = C_i^j and A_k = 1;
step S23: for the face position P_k in each tracker T_k, crop the corresponding face image from the frame and, using the corresponding facial key-point positions L_k, apply a similarity transformation for face alignment to obtain an aligned face image;
step S24: input the aligned face image into the self-adaptive aggregation network to obtain the corresponding face depth apparent feature, and add this feature to the feature list E_k of the tracker T_k.
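A compact sketch of the tracker state and of the initialization in steps S21 to S24 is given below. The detect, align and embed callables stand in for the MTCNN detector, the similarity-transform alignment and the self-adaptive aggregation network's feature extractor; their names and signatures are assumptions for illustration.
```python
# Sketch of tracker initialization for the first frame (steps S21-S24).
from dataclasses import dataclass, field
from typing import Callable, List
import numpy as np

@dataclass
class Tracker:
    track_id: int                 # ID_k, unique identity
    position: np.ndarray          # P_k = (x, y, w, h)
    landmarks: np.ndarray         # L_k, five facial key points, shape (5, 2)
    features: List[np.ndarray] = field(default_factory=list)   # E_k
    age: int = 1                  # A_k, life cycle

def initialize_trackers(frame: np.ndarray,
                        detect: Callable, align: Callable, embed: Callable):
    """Create one tracker per face detected in the first frame (i = 1)."""
    boxes, landmarks = detect(frame)            # D_1 and C_1 from the detector
    trackers = []
    for k, (box, pts) in enumerate(zip(boxes, landmarks), start=1):
        x, y, w, h = [int(round(v)) for v in box]
        crop = frame[y:y + h, x:x + w]                       # crop by P_k
        local_pts = np.asarray(pts, dtype=np.float64) - [x, y]
        aligned = align(crop, local_pts)        # similarity-transform alignment
        feature = embed(aligned)                # face depth apparent feature
        trackers.append(Tracker(track_id=k,
                                position=np.asarray(box, dtype=np.float64),
                                landmarks=np.asarray(pts, dtype=np.float64),
                                features=[feature]))
    return trackers
```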
Further, step S3 specifically includes the following steps:
step S31: the state of each tracked face target is represented as m = (u, v, s, r, u̇, v̇, ṡ, ṙ), where m denotes the tracked face target state, u and v are the center coordinates of the tracked face region, s is the area of the face box, r is the aspect ratio of the face box, and u̇, v̇, ṡ, ṙ are the respective velocities of (u, v, s, r) in the image coordinate space;
step S32: convert the face position P_k of each tracker T_k from its (x, y, w, h) form into the state form m_k^i, where m_k^i denotes the converted face position of the k-th tracking target in the i-th frame;
step S33: take m_k^i as the direct observation of the k-th tracking target in the i-th frame obtained by face detection, and use a Kalman filter based on a linear constant-velocity motion model to predict the state m̂_k^{i+1} of the k-th tracking target in the (i+1)-th frame;
step S34: in the (i+1)-th frame, perform face detection and facial key point localization again with the MTCNN model to obtain the face positions D_{i+1} and the facial key points C_{i+1};
step S35: for each face position D_{i+1}^j, use its facial key points C_{i+1}^j to complete face alignment with a similarity transformation, then input the aligned face into the self-adaptive aggregation network to extract features, obtaining the feature set F_{i+1}, where F_{i+1} denotes the features of all faces in the (i+1)-th frame.
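Steps S31 to S33 follow the usual constant-velocity Kalman formulation over the state (u, v, s, r) and its velocities. The sketch below shows the box conversions and one possible filter setup using the filterpy library; the 8-dimensional state layout and leaving the noise matrices at their defaults are assumptions for illustration, not values given in the description.
```python
# Sketch of state conversion and constant-velocity prediction (steps S31-S33).
import numpy as np
from filterpy.kalman import KalmanFilter

def box_to_observation(box):
    """(x, y, w, h) -> (u, v, s, r): center, area and aspect ratio."""
    x, y, w, h = box
    return np.array([x + w / 2.0, y + h / 2.0, w * h, w / float(h)])

def observation_to_box(z):
    """(u, v, s, r) -> (x, y, w, h)."""
    u, v, s, r = z
    w = np.sqrt(s * r)
    h = s / w
    return np.array([u - w / 2.0, v - h / 2.0, w, h])

def make_constant_velocity_filter(z0):
    """8-dimensional state (u, v, s, r) plus velocities, dt = 1 frame."""
    kf = KalmanFilter(dim_x=8, dim_z=4)
    kf.F = np.eye(8)
    kf.F[:4, 4:] = np.eye(4)                      # position += velocity per frame
    kf.H = np.hstack([np.eye(4), np.zeros((4, 4))])
    kf.x[:4] = np.asarray(z0, dtype=float).reshape(4, 1)
    return kf

# Usage sketch for one tracker:
#   kf = make_constant_velocity_filter(box_to_observation(tracker.position))
#   kf.predict()                                  # predicted state for frame i + 1
#   predicted_box = observation_to_box(kf.x[:4].ravel())
#   ... after a successful match: kf.update(box_to_observation(matched_box))
```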
Further, step S4 specifically includes the following steps:
step S41: for each face tracker T_k, input the set E_k of all features along its historical motion track into the self-adaptive aggregation network to obtain an aggregated feature f_k, where f_k denotes the feature output after all feature vectors in the historical motion track of the k-th tracking target have been fused;
step S42: convert the position state m̂_k^{i+1} of the k-th target in the next frame, as predicted by the Kalman filter in the i-th frame, back into the (x, y, w, h) form, denoted P̂_k^{i+1};
step S43: combining P̂_k^{i+1} and the aggregated feature f_k of target k with the face positions D_{i+1} and their feature set F_{i+1} obtained by face detection in the (i+1)-th frame, compute the following association matrix:
G = [g_{jk}], j = 1, 2, ..., J_{i+1}, k = 1, 2, ..., K_i
where J_{i+1} is the number of faces detected in the (i+1)-th frame and K_i is the number of tracked targets in the i-th frame; each entry g_{jk} combines the degree of overlap between the j-th face detection box D_{i+1}^j in the (i+1)-th frame and the position P̂_k^{i+1} of the k-th target predicted by the Kalman filter for the (i+1)-th frame with the cosine similarity between the j-th face feature F_{i+1}^j in the (i+1)-th frame and the aggregated feature f_k of the k-th target in the i-th frame, the hyper-parameter λ balancing the weights of the two metrics;
step S44: using the association matrix G as the cost matrix, compute the matching result with the Hungarian algorithm, associating the face detections D_{i+1}^j in the (i+1)-th frame with the tracking targets;
step S45: map the indices in the matching result to the entries of the association matrix G, and remove from the matching result every entry g_{jk} smaller than T_similarity, where T_similarity is a preset hyper-parameter giving the minimum similarity required for a successful match;
step S46: in the matching result, if the detection box D_{i+1}^j has been successfully associated with the k-th tracking target, update the corresponding tracker T_k, namely its position state P_k = D_{i+1}^j, its facial key-point positions L_k = C_{i+1}^j and its life cycle A_k = A_k + 1, and add the corresponding face feature F_{i+1}^j to the feature list E_k; if the detection box D_{i+1}^j fails to be associated, create a new tracker for it;
step S47: for each tracker T_k, if its life cycle A_k > T_age, delete the tracker, where T_age is a preset hyper-parameter giving the maximum time a tracked target may survive.
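Steps S43 to S45 reduce to building a detection-track score matrix and solving an assignment problem. The sketch below uses scipy's Hungarian solver; the particular weighted form lam * IoU + (1 - lam) * cosine and the default values of lam and T_similarity are assumptions, since the description only states that the hyper-parameter balances the two metrics.
```python
# Sketch of the association step: score matrix, Hungarian matching, thresholding.
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """Overlap of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    return inter / (aw * ah + bw * bh - inter + 1e-8)

def cosine(f_a, f_b):
    return float(np.dot(f_a, f_b) /
                 (np.linalg.norm(f_a) * np.linalg.norm(f_b) + 1e-8))

def associate(det_boxes, det_feats, pred_boxes, track_feats,
              lam=0.5, t_similarity=0.3):
    """Return a list of (detection index j, track index k) matches."""
    G = np.zeros((len(det_boxes), len(pred_boxes)))
    for j, (db, df) in enumerate(zip(det_boxes, det_feats)):
        for k, (pb, tf) in enumerate(zip(pred_boxes, track_feats)):
            G[j, k] = lam * iou(db, pb) + (1.0 - lam) * cosine(df, tf)
    rows, cols = linear_sum_assignment(-G)       # Hungarian algorithm, maximize G
    return [(j, k) for j, k in zip(rows, cols) if G[j, k] >= t_similarity]
```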
Compared with the prior art, the invention has the following beneficial effects:
1. The multi-face tracking method based on depth appearance characteristics and a self-adaptive aggregation network can effectively track faces in video, improves face tracking accuracy and reduces the number of identity switches.
2. The invention can track faces in video online while maintaining the tracking quality.
3. The invention provides a way to exploit face depth apparent features, improving face tracking performance by combining spatial position information with the depth features.
4. To address the difficulty, during face tracking, of effectively using all the features in the same target's tracking track and of effectively comparing multiple feature sets, the invention provides a self-adaptive aggregation network whose feature aggregation module adaptively learns the importance of each feature in a feature set and fuses them effectively, thereby improving the face tracking result.
Drawings
FIG. 1 is a schematic flow chart of an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well unless the context clearly indicates otherwise, and it should be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components and/or combinations thereof.
As shown in fig. 1, the present embodiment provides a multi-face tracking method based on depth appearance features and an adaptive aggregation network, which specifically includes the following steps:
step S1: training a self-adaptive aggregation network with a face recognition data set;
step S2: acquiring the face positions in the initial input video frame with a face detection method based on a convolutional neural network, initializing the face targets to be tracked, and extracting and storing their face features;
step S3: predicting the position of each face target in the next frame with a Kalman filter, locating the faces in the next frame again with the face detection method, and extracting features for the detected faces;
step S4: using the self-adaptive aggregation network trained in step S1 to aggregate the face feature set in the tracking track of each tracked face target, dynamically generating a face depth apparent feature that fuses multi-frame information, combining the predicted position with the fused feature, performing similarity calculation and matching against the face positions and features detected in the current frame, and updating the tracking state.
In this embodiment, step S1 specifically includes the following steps:
step S11: collecting public face recognition data sets to obtain pictures of the relevant persons and their names;
step S12: integrating the pictures of persons common to the multiple data sets with a fusion strategy, performing face detection and facial key point localization with the pre-trained MTCNN model, applying a similarity transformation for face alignment, and subtracting the per-channel mean of the training set from all images in the training set, thereby completing the data preprocessing and training the self-adaptive aggregation network.
In this embodiment, the adaptive aggregation network is formed by connecting a depth feature extraction module and an adaptive feature aggregation module in series; it accepts one or more face images of the same person as input and outputs an aggregated feature. The depth feature extraction module adopts a 34-layer ResNet as the backbone network, and the adaptive feature aggregation module comprises a feature aggregation layer. Let B denote the number of input samples and {z_t}, t = 1, 2, ..., B, the corresponding feature vectors; the feature aggregation layer is computed as follows:
v_t = σ(q^T z_t)
o_t = v_t / ∑_t v_t
a = ∑_t o_t z_t
where q is a vector of weights over the components of the feature vectors z_t and is a learnable parameter, trained by back-propagation and gradient descent with the face recognition signal as the supervisory signal; v_t, the output of the sigmoid function σ(·), is a score for each feature vector z_t lying between 0 and 1; o_t is the L1-normalized score, so that ∑_t o_t = 1; and a is the feature vector obtained by aggregating the B feature vectors.
In this embodiment, step S2 specifically includes the following steps:
step S21: let i denote the index of the i-th frame of the input video, with i = 1 initially; the pre-trained MTCNN model is used to simultaneously detect the positions D_i of all faces and the positions C_i of their corresponding facial key points, where D_i = {D_i^j}, j = 1, 2, ..., J_i, j is the index of the j-th detected face and J_i is the number of faces detected in the i-th frame; D_i^j = (x, y, w, h) denotes the position of the j-th face in the i-th frame, x and y being the coordinates of the upper-left corner of the face region and w and h its width and height; C_i = {C_i^j}, j = 1, 2, ..., J_i, where C_i^j = (c_1, c_2, c_3, c_4, c_5) denotes the key points of the j-th face in the i-th frame, and c_1, c_2, c_3, c_4, c_5 are the coordinates of the left eye, right eye, nose, left mouth corner and right mouth corner of the face, respectively;
step S22: assign to each face position D_i^j and its facial key-point coordinates C_i^j a unique identity ID_k, k = 1, 2, ..., K_i, where k is the index of the k-th tracking target and K_i is the number of tracked targets in frame i, and initialize the corresponding tracker T_k = {ID_k, P_k, L_k, E_k, A_k}, in which ID_k is the unique identity of the k-th tracked target, P_k the face position coordinates assigned to the k-th target, L_k the facial key-point coordinates of the k-th target, E_k the list of face features of the k-th target, and A_k the life cycle of the k-th target; initialize K_i = J_i, P_k = D_i^j, L_k = C_i^j and A_k = 1;
step S23: for the face position P_k in each tracker T_k, crop the corresponding face image from the frame and, using the corresponding facial key-point positions L_k, apply a similarity transformation for face alignment to obtain an aligned face image;
step S24: input the aligned face image into the self-adaptive aggregation network to obtain the corresponding face depth apparent feature, and add this feature to the feature list E_k of the tracker T_k.
In this embodiment, step S3 specifically includes the following steps:
step S31: the state of each tracked face target is represented as m = (u, v, s, r, u̇, v̇, ṡ, ṙ), where m denotes the tracked face target state, u and v are the center coordinates of the tracked face region, s is the area of the face box, r is the aspect ratio of the face box, and u̇, v̇, ṡ, ṙ are the respective velocities of (u, v, s, r) in the image coordinate space;
step S32: convert the face position P_k of each tracker T_k from its (x, y, w, h) form into the state form m_k^i, where m_k^i denotes the converted face position of the k-th tracking target in the i-th frame;
step S33: take m_k^i as the direct observation of the k-th tracking target in the i-th frame obtained by face detection, and use a Kalman filter based on a linear constant-velocity motion model to predict the state m̂_k^{i+1} of the k-th tracking target in the (i+1)-th frame;
step S34: in the (i+1)-th frame, perform face detection and facial key point localization again with the MTCNN model to obtain the face positions D_{i+1} and the facial key points C_{i+1};
step S35: for each face position D_{i+1}^j, use its facial key points C_{i+1}^j to complete face alignment with a similarity transformation, then input the aligned face into the self-adaptive aggregation network to extract features, obtaining the feature set F_{i+1}, where F_{i+1} denotes the features of all faces in the (i+1)-th frame.
In this embodiment, step S4 specifically includes the following steps:
step S41: for each face tracker T_k, input the set E_k of all features along its historical motion track into the self-adaptive aggregation network to obtain an aggregated feature f_k, where f_k denotes the feature output after all feature vectors in the historical motion track of the k-th tracking target have been fused;
step S42: convert the position state m̂_k^{i+1} of the k-th target in the next frame, as predicted by the Kalman filter in the i-th frame, back into the (x, y, w, h) form, denoted P̂_k^{i+1};
step S43: combining P̂_k^{i+1} and the aggregated feature f_k of target k with the face positions D_{i+1} and their feature set F_{i+1} obtained by face detection in the (i+1)-th frame, compute the following association matrix:
G = [g_{jk}], j = 1, 2, ..., J_{i+1}, k = 1, 2, ..., K_i
where J_{i+1} is the number of faces detected in the (i+1)-th frame and K_i is the number of tracked targets in the i-th frame; each entry g_{jk} combines the degree of overlap between the j-th face detection box D_{i+1}^j in the (i+1)-th frame and the position P̂_k^{i+1} of the k-th target predicted by the Kalman filter for the (i+1)-th frame with the cosine similarity between the j-th face feature F_{i+1}^j in the (i+1)-th frame and the aggregated feature f_k of the k-th target in the i-th frame, the hyper-parameter λ balancing the weights of the two metrics;
step S44: using the association matrix G as the cost matrix, compute the matching result with the Hungarian algorithm, associating the face detections D_{i+1}^j in the (i+1)-th frame with the tracking targets;
step S45: map the indices in the matching result to the entries of the association matrix G, and remove from the matching result every entry g_{jk} smaller than T_similarity, where T_similarity is a preset hyper-parameter giving the minimum similarity required for a successful match;
step S46: in the matching result, if the detection box D_{i+1}^j has been successfully associated with the k-th tracking target, update the corresponding tracker T_k, namely its position state P_k = D_{i+1}^j, its facial key-point positions L_k = C_{i+1}^j and its life cycle A_k = A_k + 1, and add the corresponding face feature F_{i+1}^j to the feature list E_k; if the detection box D_{i+1}^j fails to be associated, create a new tracker for it;
step S47: for each tracker T_k, if its life cycle A_k > T_age, delete the tracker, where T_age is a preset hyper-parameter giving the maximum time a tracked target may survive.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is directed to preferred embodiments of the present invention; other and further embodiments may be devised without departing from its basic scope, which is determined by the claims that follow. Any simple modification, equivalent change or variation made to the above embodiments in accordance with the technical essence of the present invention remains within the scope of protection of the technical solution of the present invention.

Claims (2)

1. A multi-face tracking method based on depth appearance characteristics and an adaptive aggregation network, characterized in that the method comprises the following steps:
step S1: training a self-adaptive aggregation network with a face recognition data set;
step S2: acquiring the face positions in the initial input video frame with a convolutional neural network, initializing the face targets to be tracked, and extracting and storing their face features;
step S3: predicting the position of each face target in the next frame with a Kalman filter, locating the faces in the next frame again, and extracting features for the detected faces;
step S4: using the self-adaptive aggregation network trained in step S1 to aggregate the face feature set in the tracking track of each tracked face target, dynamically generating a face depth apparent feature that fuses multi-frame information, combining the predicted position with the fused feature, performing similarity calculation and matching against the face positions and features detected in the current frame, and updating the tracking state;
step S1 specifically includes the following steps:
step S11: collecting public face recognition data sets to obtain pictures of the relevant persons and their names;
step S12: integrating the pictures of persons common to the multiple data sets with a fusion strategy, performing face detection and facial key point localization with a pre-trained MTCNN model, applying a similarity transformation for face alignment, and subtracting the per-channel mean of the training set from all images in the training set, thereby completing the data preprocessing and training the self-adaptive aggregation network;
step S2 specifically includes the following steps:
step S21: let i denote the index of the i-th frame of the input video, with i = 1 initially; the pre-trained MTCNN model is used to simultaneously detect the positions D_i of all faces and the positions C_i of their corresponding facial key points, where D_i = {D_i^j}, j = 1, 2, ..., J_i, j is the index of the j-th detected face and J_i is the number of faces detected in the i-th frame; D_i^j = (x, y, w, h) denotes the position of the j-th face in the i-th frame, x and y being the coordinates of the upper-left corner of the face region and w and h its width and height; C_i = {C_i^j}, j = 1, 2, ..., J_i, where C_i^j = (c_1, c_2, c_3, c_4, c_5) denotes the key points of the j-th face in the i-th frame, and c_1, c_2, c_3, c_4, c_5 are the coordinates of the left eye, right eye, nose, left mouth corner and right mouth corner of the face, respectively;
step S22: assign to each face position D_i^j and its facial key-point coordinates C_i^j a unique identity ID_k, k = 1, 2, ..., K_i, where k is the index of the k-th tracking target and K_i is the number of tracked targets in frame i, and initialize the corresponding tracker T_k = {ID_k, P_k, L_k, E_k, A_k}, in which ID_k is the unique identity of the k-th tracked target, P_k the face position coordinates assigned to the k-th target, L_k the facial key-point coordinates of the k-th target, E_k the list of face features of the k-th target, and A_k the life cycle of the k-th target; initialize K_i = J_i, P_k = D_i^j, L_k = C_i^j and A_k = 1;
step S23: for the face position P_k in each tracker T_k, crop the corresponding face image from the frame and, using the corresponding facial key-point positions L_k, apply a similarity transformation for face alignment to obtain an aligned face image;
step S24: input the aligned face image into the self-adaptive aggregation network to obtain the corresponding face depth apparent feature, and add this feature to the feature list E_k of the tracker T_k;
step S3 specifically includes the following steps:
step S31: the state of each tracked face target is represented as m = (u, v, s, r, u̇, v̇, ṡ, ṙ), where m denotes the tracked face target state, u and v are the center coordinates of the tracked face region, s is the area of the face box, r is the aspect ratio of the face box, and u̇, v̇, ṡ, ṙ are the respective velocities of (u, v, s, r) in the image coordinate space;
step S32: convert the face position P_k of each tracker T_k from its (x, y, w, h) form into the state form m_k^i, where m_k^i denotes the converted face position of the k-th tracking target in the i-th frame;
step S33: take m_k^i as the direct observation of the k-th tracking target in the i-th frame obtained by face detection, and use a Kalman filter based on a linear constant-velocity motion model to predict the state m̂_k^{i+1} of the k-th tracking target in the (i+1)-th frame;
step S34: in the (i+1)-th frame, perform face detection and facial key point localization again with the MTCNN model to obtain the face positions D_{i+1} and the facial key points C_{i+1};
step S35: for each face position D_{i+1}^j, use its facial key points C_{i+1}^j to complete face alignment with a similarity transformation, then input the aligned face into the self-adaptive aggregation network to extract features, obtaining the feature set F_{i+1}, where F_{i+1} denotes the features of all faces in the (i+1)-th frame;
step S4 specifically includes the following steps:
step S41: for each face tracker T_k, input the set E_k of all features along its historical motion track into the self-adaptive aggregation network to obtain an aggregated feature f_k, where f_k denotes the feature output after all feature vectors in the historical motion track of the k-th tracking target have been fused;
step S42: convert the position state m̂_k^{i+1} of the k-th target in the next frame, as predicted by the Kalman filter in the i-th frame, back into the (x, y, w, h) form, denoted P̂_k^{i+1};
step S43: combining P̂_k^{i+1} and the aggregated feature f_k of target k with the face positions D_{i+1} and their feature set F_{i+1} obtained by face detection in the (i+1)-th frame, compute the following association matrix:
G = [g_{jk}], j = 1, 2, ..., J_{i+1}, k = 1, 2, ..., K_i
where J_{i+1} is the number of faces detected in the (i+1)-th frame and K_i is the number of tracked targets in the i-th frame; each entry g_{jk} combines the degree of overlap between the j-th face detection box D_{i+1}^j in the (i+1)-th frame and the position P̂_k^{i+1} of the k-th target predicted by the Kalman filter for the (i+1)-th frame with the cosine similarity between the j-th face feature F_{i+1}^j in the (i+1)-th frame and the aggregated feature f_k of the k-th target in the i-th frame, the hyper-parameter λ balancing the weights of the two metrics;
step S44: using the association matrix G as the cost matrix, compute the matching result with the Hungarian algorithm, associating the face detections D_{i+1}^j in the (i+1)-th frame with the tracking targets;
step S45: map the indices in the matching result to the entries of the association matrix G, and remove from the matching result every entry g_{jk} smaller than T_similarity, where T_similarity is a preset hyper-parameter giving the minimum similarity required for a successful match;
step S46: in the matching result, if the detection box D_{i+1}^j has been successfully associated with the k-th tracking target, update the corresponding tracker T_k, namely its position state P_k = D_{i+1}^j, its facial key-point positions L_k = C_{i+1}^j and its life cycle A_k = A_k + 1, and add the corresponding face feature F_{i+1}^j to the feature list E_k; if the detection box D_{i+1}^j fails to be associated, create a new tracker for it;
step S47: for each tracker T_k, if its life cycle A_k > T_age, delete the tracker, where T_age is a preset hyper-parameter giving the maximum time a tracked target may survive.
2. The method for tracking multiple faces based on the depth appearance characteristics and the adaptive aggregation network as claimed in claim 1, wherein: the self-adaptive aggregation network is formed by connecting a depth feature extraction module and a self-adaptive feature aggregation module in series; it accepts one or more face images of the same person as input and outputs an aggregated feature, the depth feature extraction module adopts a 34-layer ResNet as the backbone network, and the self-adaptive feature aggregation module comprises a feature aggregation layer; let B denote the number of input samples and {z_t}, t = 1, 2, ..., B, the corresponding feature vectors; the feature aggregation layer is computed as follows:
v_t = σ(q^T z_t)
o_t = v_t / ∑_t v_t
a = ∑_t o_t z_t
where q is a vector of weights over the components of the feature vectors z_t and is a learnable parameter, trained by back-propagation and gradient descent with the face recognition signal as the supervisory signal; v_t, the output of the sigmoid function σ(·), is a score for each feature vector z_t lying between 0 and 1; o_t is the L1-normalized score, so that ∑_t o_t = 1; and a is the feature vector obtained by aggregating the B feature vectors.
CN201910106309.1A 2019-02-02 2019-02-02 Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network Active CN109829436B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910106309.1A CN109829436B (en) 2019-02-02 2019-02-02 Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network
PCT/CN2019/124966 WO2020155873A1 (en) 2019-02-02 2019-12-13 Deep apparent features and adaptive aggregation network-based multi-face tracking method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910106309.1A CN109829436B (en) 2019-02-02 2019-02-02 Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network

Publications (2)

Publication Number Publication Date
CN109829436A CN109829436A (en) 2019-05-31
CN109829436B true CN109829436B (en) 2022-05-13

Family

ID=66863393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910106309.1A Active CN109829436B (en) 2019-02-02 2019-02-02 Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network

Country Status (2)

Country Link
CN (1) CN109829436B (en)
WO (1) WO2020155873A1 (en)


Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8295543B2 (en) * 2007-08-31 2012-10-23 Lockheed Martin Corporation Device and method for detecting targets in images based on user-defined classifiers
CN101216885A (en) * 2008-01-04 2008-07-09 中山大学 Passerby face detection and tracing algorithm based on video
CN101777116B (en) * 2009-12-23 2012-07-25 中国科学院自动化研究所 Method for analyzing facial expressions on basis of motion tracking
US10902243B2 (en) * 2016-10-25 2021-01-26 Deep North, Inc. Vision based target tracking that distinguishes facial feature targets
CN106845385A (en) * 2017-01-17 2017-06-13 腾讯科技(上海)有限公司 The method and apparatus of video frequency object tracking
CN107292911B (en) * 2017-05-23 2021-03-30 南京邮电大学 Multi-target tracking method based on multi-model fusion and data association
CN107492116A (en) * 2017-09-01 2017-12-19 深圳市唯特视科技有限公司 A kind of method that face tracking is carried out based on more display models
CN107609512A (en) * 2017-09-12 2018-01-19 上海敏识网络科技有限公司 A kind of video human face method for catching based on neutral net
CN108509859B (en) * 2018-03-09 2022-08-26 南京邮电大学 Non-overlapping area pedestrian tracking method based on deep neural network
CN108363997A (en) * 2018-03-20 2018-08-03 南京云思创智信息科技有限公司 It is a kind of in video to the method for real time tracking of particular person
CN109101915B (en) * 2018-08-01 2021-04-27 中国计量大学 Face, pedestrian and attribute recognition network structure design method based on deep learning
CN109086724B (en) * 2018-08-09 2019-12-24 北京华捷艾米科技有限公司 Accelerated human face detection method and storage medium
CN109829436B (en) * 2019-02-02 2022-05-13 福州大学 Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
一种基于预测的实时人脸特征点定位跟踪算法 (A prediction-based real-time facial feature point localization and tracking algorithm); Weng Zhengkui et al.; Wanfang Data knowledge service platform journal database; 2015-07-22; pp. 198-202 *

Also Published As

Publication number Publication date
WO2020155873A1 (en) 2020-08-06
CN109829436A (en) 2019-05-31


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant