CN109829436B - Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network - Google Patents
- Publication number: CN109829436B (application CN201910106309.1A)
- Authority: CN (China)
- Prior art keywords: face, frame, feature, target, tracking
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/04 — Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology
- G06N3/08 — Computing arrangements based on biological models; neural networks; learning methods
- G06T7/11 — Image analysis; segmentation; region-based segmentation
- G06T7/246 — Image analysis; analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/50 — Image analysis; depth or shape recovery
Abstract
The invention relates to a multi-face tracking method based on depth appearance characteristics and a self-adaptive aggregation network, which comprises the steps of firstly adopting a face recognition data set to train the self-adaptive aggregation network; then, acquiring the position of a human face by using a human face detection method based on a convolutional neural network, initializing a human face target to be tracked, and extracting human face characteristics; then, predicting the position of each face tracking target in the next frame by adopting a Kalman filter, positioning the position of the face in the next frame again, and extracting the characteristics of the detected face; and finally, using a self-adaptive aggregation network to aggregate the face feature set in each tracked face target tracking track, dynamically generating a face depth apparent feature fused with multi-frame information, combining the predicted position and the fused feature, performing similarity calculation and matching with the face position and the feature thereof obtained by detection in the current frame, and updating the tracking state. The invention can improve the performance of face tracking.
Description
Technical Field
The invention relates to the field of pattern recognition and computer vision, in particular to a multi-face tracking method based on depth appearance characteristics and a self-adaptive aggregation network.
Background
In recent years, with social progress and the continuous development of science and technology, video face recognition has gradually become a popular research field, attracting the interest of numerous experts and scholars at home and abroad. As the entrance to and basis of video face recognition, face detection and tracking technology has developed rapidly and is widely applied in fields such as intelligent monitoring, virtual-reality perception interfaces and video conferencing.
To analyze a face, the face must first be captured, which is achieved by face detection and face tracking technology; only when a face target is accurately located and tracked in a video can it be analyzed more carefully, for example for face recognition or pose estimation. Target tracking is undoubtedly one of the most important technologies in intelligent security, and face tracking is a specific application of it: a tracking algorithm processes a moving face in a video sequence and keeps the face region locked to complete tracking. The technology therefore has good application prospects in scenes such as intelligent security and video surveillance.
Face tracking plays an important role in video surveillance, but in real scenes, large changes in face pose and the overlap and occlusion between tracked targets currently make practical application difficult.
Disclosure of Invention
In view of this, the present invention provides a multi-face tracking method based on a deep appearance feature and an adaptive aggregation network, which can improve the face tracking performance.
The invention is realized by adopting the following scheme: a multi-face tracking method based on depth appearance characteristics and a self-adaptive aggregation network specifically comprises the following steps:
step S1: training a self-adaptive aggregation network by adopting a face recognition data set;
step S2: acquiring the position of a human face by adopting a convolutional neural network according to an initial input video frame, initializing the human face targets to be tracked, and extracting and storing human face features;
step S3: predicting the position of each face target in the next frame by adopting a Kalman filter, positioning the position of the face in the next frame again, and extracting characteristics of the detected face;
step S4: using the adaptive aggregation network trained in step S1, aggregate the face feature set in the tracking track of each tracked face target to dynamically generate a face depth appearance feature that fuses multi-frame information; combining the predicted position and the fused feature, perform similarity calculation and matching with the face positions and features obtained by detection in the current frame, and update the tracking state.
Further, step S1 specifically includes the following steps:
step S11: collecting a public face recognition data set to obtain pictures and names of related persons;
step S12: integrating the pictures of persons shared across the multiple data sets by adopting a fusion strategy, carrying out face detection and facial key-point localization with the pre-trained MTCNN model, carrying out face alignment by applying similarity transformation, and subtracting the per-channel mean of the training set from all images in the training set, thereby completing data preprocessing and training the adaptive aggregation network.
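The per-channel mean subtraction of step S12 can be sketched as follows (a minimal illustration; the function name and the (N, H, W, 3) array layout are assumptions, not specified in the patent):

```python
import numpy as np

def preprocess(images):
    """Subtract the per-channel mean of the training set from every image.

    images: array of shape (N, H, W, 3) holding the whole training set;
    the mean is computed per channel over all images and pixels.
    """
    images = images.astype(np.float64)
    channel_mean = images.mean(axis=(0, 1, 2), keepdims=True)  # shape (1, 1, 1, 3)
    return images - channel_mean
```

After this step each channel of the training set has zero mean, which is the usual normalization before training a CNN backbone.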
Furthermore, the adaptive aggregation network is formed by connecting a depth feature extraction module and an adaptive feature aggregation module in series; it accepts one or more face images of the same person as input and outputs an aggregated feature. The depth feature extraction module adopts a 34-layer ResNet as its backbone network, and the adaptive feature aggregation module comprises a feature aggregation layer. Let B denote the number of input samples and {z_t}, t = 1, 2, ..., B, denote the corresponding feature vectors; the feature aggregation layer is computed as:

a = Σ_t o_t · z_t;

where q is a learnable weight vector over the components of each feature vector z_t, learned by back propagation and gradient descent using the face recognition signal as the supervisory signal; v_t is the output of the sigmoid function applied to the score of each feature vector z_t, lying in the range between 0 and 1; o_t is the L1-normalized form of v_t, such that Σ_t o_t = 1; and a is the feature vector obtained by aggregating the B feature vectors.
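A minimal sketch of the feature aggregation layer described above. The text does not spell out the exact form of the sigmoid score, so v_t = σ(q·z_t) is assumed here, and `aggregate` with a fixed `q` is purely illustrative (in the patent q is a trained parameter):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def aggregate(Z, q):
    """Adaptive feature aggregation over B features of one person.

    Z: (B, d) array of feature vectors z_t from the same tracked face.
    q: (d,) weight vector (learnable in the patent; fixed here).
    Returns a = sum_t o_t * z_t, with the weights o_t summing to 1.
    """
    v = sigmoid(Z @ q)   # v_t in (0, 1), one score per feature vector
    o = v / v.sum()      # L1 normalization, so sum_t o_t = 1
    return o @ Z         # weighted aggregation of the B feature vectors
```

Because the weights are L1-normalized, the aggregated feature is a convex combination of the inputs: if all B features are identical, the output equals that feature.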
Further, step S2 specifically includes the following steps:
step S21: let i denote the index of the i-th frame of the input video, initially i = 1; the pre-trained MTCNN model is used to simultaneously detect the positions D_i of all faces and the positions C_i of their corresponding facial key points, where D_i = {d_i^j}, j = 1, 2, ..., J_i, j is the index of the j-th detected face and J_i is the number of faces detected in the i-th frame; d_i^j = (x, y, w, h) denotes the position of the j-th face in the i-th frame, with x, y, w and h being the coordinates of the upper-left corner of the face region and its width and height, respectively; C_i = {c_i^j}, where c_i^j = (c_1, c_2, c_3, c_4, c_5) denotes the key points of the j-th face in the i-th frame, and c_1, c_2, c_3, c_4, c_5 are the coordinates of the left eye, right eye, nose, left mouth corner and right mouth corner of the face, respectively;

step S22: for each face position d_i^j and its facial key-point coordinates c_i^j, assign a unique identity ID_k, k = 1, 2, ..., K_i, where k is the index of the k-th tracking target and K_i is the number of tracked targets in frame i, and initialize the corresponding tracker T_k = {ID_k, P_k, L_k, E_k, A_k}, where ID_k is the unique identity of the k-th tracked target, P_k the face position coordinates assigned to the k-th target, L_k the facial key-point coordinates of the k-th target, E_k the list of face features of the k-th target, and A_k the life cycle of the k-th target; initialize K_i = J_i and A_k = 1;

step S23: crop the image at the position P_k of each face in T_k to obtain the corresponding face image, and use the corresponding facial key-point positions L_k to perform face alignment by similarity transformation, obtaining an aligned face image;

step S24: input the aligned face image into the adaptive aggregation network to obtain the corresponding face depth appearance feature, and add it to the feature list E_k of the tracker T_k.
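The tracker record T_k = {ID_k, P_k, L_k, E_k, A_k} initialized in step S22 might be represented as follows. The field names are hypothetical; the patent only specifies the five components:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Tracker:
    """One tracked face target T_k (field names are illustrative)."""
    track_id: int                               # ID_k: unique identity
    box: Tuple[float, float, float, float]      # P_k: face position (x, y, w, h)
    landmarks: list                             # L_k: five facial key points
    features: List = field(default_factory=list)  # E_k: face feature list
    age: int = 1                                # A_k: life cycle, initialized to 1

def init_trackers(boxes, landmarks):
    """Assign a unique ID to each detected face (step S22), IDs from 1."""
    return [Tracker(k, b, l)
            for k, (b, l) in enumerate(zip(boxes, landmarks), start=1)]
```

Each new detection in the first frame gets its own `Tracker`; later steps append aggregated features to `features` and increment `age`.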
Further, step S3 specifically includes the following steps:
step S31: the state of each tracked face target is represented as:

m = (u, v, s, r, u̇, v̇, ṡ, ṙ)^T;

where m denotes the tracked face target state, u and v are the centre coordinates of the tracked face region, s is the area of the face box, r is the aspect ratio of the face box, and u̇, v̇, ṡ and ṙ are the respective velocities of (u, v, s, r) in image coordinate space;

step S32: convert the face position P_k in each tracker T_k from the form (x, y, w, h) into m_i^k, where m_i^k denotes the converted face-position state of the k-th tracking target in the i-th frame;

step S33: take m_i^k, obtained by face detection, as the direct observation of the k-th tracking target in the i-th frame, and use a Kalman filter based on a linear constant-velocity motion model to predict the state m̂_{i+1}^k of the k-th tracking target in the (i+1)-th frame;

step S34: in the (i+1)-th frame, apply the MTCNN model again to perform face detection and facial key-point localization, obtaining the face positions D_{i+1} and facial key points C_{i+1};

step S35: for each face position d_{i+1}^j, complete face alignment by similarity transformation based on its facial key points c_{i+1}^j, and input the aligned face into the adaptive aggregation network to extract features, obtaining the feature set F_{i+1}, where F_{i+1} denotes the feature set of all faces in the (i+1)-th frame.
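The conversion between the detection form (x, y, w, h) and the Kalman observation components (u, v, s, r) used in steps S31–S32 can be sketched as below (function names are illustrative; the patent only defines the two representations):

```python
def box_to_state(x, y, w, h):
    """(x, y, w, h) -> (u, v, s, r): centre coordinates, box area, aspect ratio."""
    u = x + w / 2.0
    v = y + h / 2.0
    s = w * h          # area of the face box
    r = w / float(h)   # aspect ratio of the face box
    return u, v, s, r

def state_to_box(u, v, s, r):
    """Inverse conversion, used when reporting predicted positions (step S42)."""
    w = (s * r) ** 0.5
    h = s / w
    return u - w / 2.0, v - h / 2.0, w, h
```

The two conversions are exact inverses of each other, so predicted states can be mapped back to boxes for matching against new detections.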
Further, step S4 specifically includes the following steps:
step S41: for each face tracker T_k, input the set E_k of all features along its historical motion trajectory into the adaptive aggregation network to obtain the aggregated feature f_k, where f_k denotes the aggregated feature output after all feature vectors in the historical motion trajectory of the k-th tracking target are fused;

step S42: convert the position state m̂_{i+1}^k of the k-th target in the next frame, predicted by the Kalman filter in the i-th frame, into the form (x, y, w, h);

step S43: combining m̂_{i+1}^k, the aggregated feature f_k of target k, and the face positions D_{i+1} and feature set F_{i+1} obtained by face detection in the (i+1)-th frame, compute the following association matrix:

G = [g_jk], j = 1, 2, ..., J_{i+1}, k = 1, 2, ..., K_i;

g_jk = λ · IoU(d_{i+1}^j, m̂_{i+1}^k) + (1 − λ) · cos(f_{i+1}^j, f_k);

where J_{i+1} is the number of faces detected in the (i+1)-th frame, K_i is the number of tracked targets in the i-th frame, IoU(d_{i+1}^j, m̂_{i+1}^k) is the degree of overlap between the j-th face detection box in the (i+1)-th frame and the position state of the k-th target predicted by the Kalman filter for the (i+1)-th frame, cos(f_{i+1}^j, f_k) is the cosine similarity between the j-th face feature in the (i+1)-th frame and the aggregated feature of the k-th target in the i-th frame, and λ is a hyper-parameter used to balance the weights of the two metrics;

step S44: taking the association matrix G as the cost matrix, compute the matching result with the Hungarian algorithm, associating the face detection d_{i+1}^j in the (i+1)-th frame with the k-th tracking target;

step S45: map the indices in the matching result to the entries of the association matrix G, filter out all entries g_jk smaller than T_similarity and delete them from the matching result, where T_similarity is a set hyper-parameter, the minimum similarity threshold for a match to be considered successful;

step S46: in the matching result, if the detection box d_{i+1}^j is successfully associated with the k-th tracking target, update the corresponding tracker T_k: its position state P_k, its facial key-point positions L_k, its life cycle A_k = A_k + 1, and add the corresponding face feature f_{i+1}^j to the feature list E_k; if a detection box d_{i+1}^j fails to be associated, create a new tracker for it;

step S47: for each tracker T_k, if its life cycle satisfies A_k > T_age, delete the tracker, where T_age is a set hyper-parameter representing the maximum time a tracked target can survive.
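Steps S43–S45 — building the association matrix from spatial overlap and cosine similarity, solving it with the Hungarian algorithm, and filtering weak matches — can be sketched as follows. The convex combination g_jk = λ·IoU + (1−λ)·cosine is an assumption consistent with the description of λ as a weight-balancing hyper-parameter, and `scipy.optimize.linear_sum_assignment` stands in for the Hungarian solver:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x, y, w, h)."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def associate(pred_boxes, pred_feats, det_boxes, det_feats, lam=0.5, t_sim=0.3):
    """Match detections j to tracked targets k via g_jk (lam and t_sim play
    the roles of the hyper-parameters lambda and T_similarity)."""
    G = np.zeros((len(det_boxes), len(pred_boxes)))
    for j, (db, df) in enumerate(zip(det_boxes, det_feats)):
        for k, (pb, pf) in enumerate(zip(pred_boxes, pred_feats)):
            cos = np.dot(df, pf) / (np.linalg.norm(df) * np.linalg.norm(pf))
            G[j, k] = lam * iou(db, pb) + (1 - lam) * cos
    rows, cols = linear_sum_assignment(-G)  # negate: maximize total similarity
    # Keep only matches whose entry reaches the minimum similarity threshold.
    return [(j, k) for j, k in zip(rows, cols) if G[j, k] >= t_sim]
```

Unmatched detections would then spawn new trackers (step S46), and stale trackers are pruned by the life-cycle rule (step S47).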
Compared with the prior art, the invention has the following beneficial effects:
1. the multi-face tracking method based on the depth appearance characteristics and the self-adaptive aggregation network can effectively track the face in the video, improves the face tracking accuracy and reduces the target switching times.
2. The invention can track the human face in the video on line while ensuring the tracking effect.
3. The invention provides a method for utilizing face depth appearance features, which improves face tracking performance by combining spatial-position information with depth features.
4. Aiming at the problem that all features in the same target tracking track are difficult to be effectively utilized and a plurality of feature sets are effectively compared in the face tracking process, the invention provides a self-adaptive aggregation network, and the importance degree of each feature in the feature sets is adaptively learned and effectively fused through a feature aggregation module, so that the face tracking effect is improved.
Drawings
FIG. 1 is a schematic flow chart of an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
As shown in fig. 1, the present embodiment provides a multi-face tracking method based on depth appearance features and an adaptive aggregation network, which specifically includes the following steps:
step S1: training a self-adaptive aggregation network by adopting a face recognition data set;
step S2: acquiring the position of a human face by using a human face detection method based on a convolutional neural network according to an initial input video frame, initializing a human face target to be tracked, extracting human face characteristics and storing the human face characteristics;
step S3: predicting the position of each face target in the next frame by adopting a Kalman filter, positioning the position of the face in the next frame by using a face detection method again, and extracting features of the detected face;
step S4: and (4) using the self-adaptive aggregation network trained in the step (S1) to aggregate the face feature set in each tracked face target tracking track, dynamically generating a face depth apparent feature fused with multi-frame information, combining the predicted position and the fused feature, performing similarity calculation and matching with the face position and the feature thereof obtained through detection in the current frame, and updating the tracking state.
In this embodiment, step S1 specifically includes the following steps:
step S11: collecting a public face recognition data set to obtain pictures and names of related persons;
step S12: integrating the pictures of persons shared across the multiple data sets by adopting a fusion strategy, carrying out face detection and facial key-point localization with the pre-trained MTCNN model, carrying out face alignment by applying similarity transformation, and subtracting the per-channel mean of the training set from all images in the training set, thereby completing data preprocessing and training the adaptive aggregation network.
In this embodiment, the adaptive aggregation network is formed by connecting a depth feature extraction module and an adaptive feature aggregation module in series; it accepts one or more face images of the same person as input and outputs an aggregated feature. The depth feature extraction module adopts a 34-layer ResNet as its backbone network, and the adaptive feature aggregation module comprises a feature aggregation layer. Let B denote the number of input samples and {z_t}, t = 1, 2, ..., B, denote the corresponding feature vectors; the feature aggregation layer is computed as:

a = Σ_t o_t · z_t;

where q is a learnable weight vector over the components of each feature vector z_t, learned by back propagation and gradient descent using the face recognition signal as the supervisory signal; v_t is the output of the sigmoid function applied to the score of each feature vector z_t, lying in the range between 0 and 1; o_t is the L1-normalized form of v_t, such that Σ_t o_t = 1; and a is the feature vector obtained by aggregating the B feature vectors.
In this embodiment, step S2 specifically includes the following steps:
step S21: let i denote the index of the i-th frame of the input video, initially i = 1; the pre-trained MTCNN model is used to simultaneously detect the positions D_i of all faces and the positions C_i of their corresponding facial key points, where D_i = {d_i^j}, j = 1, 2, ..., J_i, j is the index of the j-th detected face and J_i is the number of faces detected in the i-th frame; d_i^j = (x, y, w, h) denotes the position of the j-th face in the i-th frame, with x, y, w and h being the coordinates of the upper-left corner of the face region and its width and height, respectively; C_i = {c_i^j}, where c_i^j = (c_1, c_2, c_3, c_4, c_5) denotes the key points of the j-th face in the i-th frame, and c_1, c_2, c_3, c_4, c_5 are the coordinates of the left eye, right eye, nose, left mouth corner and right mouth corner of the face, respectively;

step S22: for each face position d_i^j and its facial key-point coordinates c_i^j, assign a unique identity ID_k, k = 1, 2, ..., K_i, where k is the index of the k-th tracking target and K_i is the number of tracked targets in frame i, and initialize the corresponding tracker T_k = {ID_k, P_k, L_k, E_k, A_k}, where ID_k is the unique identity of the k-th tracked target, P_k the face position coordinates assigned to the k-th target, L_k the facial key-point coordinates of the k-th target, E_k the list of face features of the k-th target, and A_k the life cycle of the k-th target; initialize K_i = J_i and A_k = 1;

step S23: crop the image at the position P_k of each face in T_k to obtain the corresponding face image, and use the corresponding facial key-point positions L_k to perform face alignment by similarity transformation, obtaining an aligned face image;

step S24: input the aligned face image into the adaptive aggregation network to obtain the corresponding face depth appearance feature, and add it to the feature list E_k of the tracker T_k.
In this embodiment, step S3 specifically includes the following steps:
step S31: the state of each tracked face target is represented as:

m = (u, v, s, r, u̇, v̇, ṡ, ṙ)^T;

where m denotes the tracked face target state, u and v are the centre coordinates of the tracked face region, s is the area of the face box, r is the aspect ratio of the face box, and u̇, v̇, ṡ and ṙ are the respective velocities of (u, v, s, r) in image coordinate space;

step S32: convert the face position P_k in each tracker T_k from the form (x, y, w, h) into m_i^k, where m_i^k denotes the converted face-position state of the k-th tracking target in the i-th frame;

step S33: take m_i^k, obtained by face detection, as the direct observation of the k-th tracking target in the i-th frame, and use a Kalman filter based on a linear constant-velocity motion model to predict the state m̂_{i+1}^k of the k-th tracking target in the (i+1)-th frame;

step S34: in the (i+1)-th frame, apply the MTCNN model again to perform face detection and facial key-point localization, obtaining the face positions D_{i+1} and facial key points C_{i+1};

step S35: for each face position d_{i+1}^j, complete face alignment by similarity transformation based on its facial key points c_{i+1}^j, and input the aligned face into the adaptive aggregation network to extract features, obtaining the feature set F_{i+1}, where F_{i+1} denotes the feature set of all faces in the (i+1)-th frame.
In this embodiment, step S4 specifically includes the following steps:
step S41: for each face tracker T_k, input the set E_k of all features along its historical motion trajectory into the adaptive aggregation network to obtain the aggregated feature f_k, where f_k denotes the aggregated feature output after all feature vectors in the historical motion trajectory of the k-th tracking target are fused;

step S42: convert the position state m̂_{i+1}^k of the k-th target in the next frame, predicted by the Kalman filter in the i-th frame, into the form (x, y, w, h);

step S43: combining m̂_{i+1}^k, the aggregated feature f_k of target k, and the face positions D_{i+1} and feature set F_{i+1} obtained by face detection in the (i+1)-th frame, compute the following association matrix:

G = [g_jk], j = 1, 2, ..., J_{i+1}, k = 1, 2, ..., K_i;

g_jk = λ · IoU(d_{i+1}^j, m̂_{i+1}^k) + (1 − λ) · cos(f_{i+1}^j, f_k);

where J_{i+1} is the number of faces detected in the (i+1)-th frame, K_i is the number of tracked targets in the i-th frame, IoU(d_{i+1}^j, m̂_{i+1}^k) is the degree of overlap between the j-th face detection box in the (i+1)-th frame and the position state of the k-th target predicted by the Kalman filter for the (i+1)-th frame, cos(f_{i+1}^j, f_k) is the cosine similarity between the j-th face feature in the (i+1)-th frame and the aggregated feature of the k-th target in the i-th frame, and λ is a hyper-parameter used to balance the weights of the two metrics;

step S44: taking the association matrix G as the cost matrix, compute the matching result with the Hungarian algorithm, associating the face detection d_{i+1}^j in the (i+1)-th frame with the k-th tracking target;

step S45: map the indices in the matching result to the entries of the association matrix G, filter out all entries g_jk smaller than T_similarity and delete them from the matching result, where T_similarity is a set hyper-parameter, the minimum similarity threshold for a match to be considered successful;

step S46: in the matching result, if the detection box d_{i+1}^j is successfully associated with the k-th tracking target, update the corresponding tracker T_k: its position state P_k, its facial key-point positions L_k, its life cycle A_k = A_k + 1, and add the corresponding face feature f_{i+1}^j to the feature list E_k; if a detection box d_{i+1}^j fails to be associated, create a new tracker for it;

step S47: for each tracker T_k, if its life cycle satisfies A_k > T_age, delete the tracker, where T_age is a set hyper-parameter representing the maximum time a tracked target can survive.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing is directed to preferred embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow. However, any simple modification, equivalent change and modification of the above embodiments according to the technical essence of the present invention are within the protection scope of the technical solution of the present invention.
Claims (2)
1. A multi-face tracking method based on depth appearance characteristics and an adaptive aggregation network is characterized in that: the method comprises the following steps:
step S1: training a self-adaptive aggregation network by adopting a face recognition data set;
step S2: acquiring the position of a human face by adopting a convolutional neural network according to an initial input video frame, initializing the human face targets to be tracked, and extracting and storing human face features;
step S3: predicting the position of each face target in the next frame by adopting a Kalman filter, positioning the position of the face in the next frame again, and extracting characteristics of the detected face;
step S4: using the self-adaptive aggregation network trained in the step S1 to aggregate the face feature set in each tracked face target tracking track, dynamically generating a face depth apparent feature fused with multi-frame information, combining the predicted position and the fused feature, performing similarity calculation and matching with the face position and the feature thereof obtained through detection in the current frame, and updating the tracking state;
step S1 specifically includes the following steps:
step S11: collecting public face recognition data sets to obtain pictures of the persons concerned together with their names;
step S12: integrating the images of each person across the multiple data sets with a fusion strategy, performing face detection and facial key point localization with a pre-trained MTCNN model, applying a similarity transformation for face alignment, and subtracting from every image the per-channel mean computed over the training set, thereby completing the data preprocessing; then training the adaptive aggregation network;
step S2 specifically includes the following steps:
step S21: let i denote the index of the i-th frame of the input video, with i = 1 initially; a pre-trained MTCNN model is used to simultaneously detect the positions D_i of all faces and the positions C_i of their corresponding facial key points, where D_i = {d_i^j}, j = 1, 2, ..., J_i, j being the index of the j-th detected face and J_i the number of faces detected in the i-th frame; d_i^j = (x, y, w, h) denotes the position of the j-th face in the i-th frame, with x, y the coordinates of the top-left corner of the face region and w, h its width and height; C_i = {c_i^j}, where c_i^j = (c_1, c_2, c_3, c_4, c_5) denotes the key points of the j-th face in the i-th frame, and c_1, c_2, c_3, c_4, c_5 are the coordinates of the left eye, right eye, nose, left mouth corner and right mouth corner of the face, respectively;
step S22: for each face position d_i^j and its face key point coordinates c_i^j, a unique identity ID_k is assigned, k = 1, 2, ..., K_i, where k is the index of the k-th tracking target and K_i the number of tracked targets in the i-th frame, and the corresponding tracker T_k = {ID_k, P_k, L_k, E_k, A_k} is initialized, where ID_k is the unique identity of the k-th tracked target, P_k the face position coordinates assigned to the k-th target, L_k the face key point coordinates of the k-th target, E_k the list of face features of the k-th target, and A_k the life cycle of the k-th target; initialize K_i = J_i and A_k = 1;
Step S23: for the position P_k of each face in T_k, the image is cropped to obtain the corresponding face image, and the corresponding face key point positions L_k are used to perform face alignment by applying a similarity transformation, yielding an aligned face image;
step S24: the aligned face image is input into the adaptive aggregation network to obtain the corresponding deep facial appearance feature, which is added to the feature list E_k of the tracker T_k;
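Steps S21–S24 above amount to a small per-target record plus an initialization routine. The following Python sketch is illustrative only and not part of the claim; the class and function names are assumptions, and in practice the boxes and key points would come from a pre-trained MTCNN detector rather than the hard-coded examples used here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Tracker:
    """Per-target record T_k = {ID_k, P_k, L_k, E_k, A_k} (names illustrative)."""
    identity: int                                        # ID_k: unique identity
    box: Tuple[float, float, float, float]               # P_k: (x, y, w, h)
    keypoints: List[Tuple[float, float]]                 # L_k: eyes, nose, mouth corners
    features: List[List[float]] = field(default_factory=list)  # E_k: feature history
    age: int = 1                                         # A_k: life cycle, initialized to 1

def init_trackers(boxes, keypoints):
    """Assign a fresh identity to every first-frame detection (K_i = J_i)."""
    return [Tracker(identity=k + 1, box=tuple(b), keypoints=list(kp))
            for k, (b, kp) in enumerate(zip(boxes, keypoints))]

# Two hypothetical detections in the first frame:
trackers = init_trackers(
    boxes=[(10, 20, 50, 60), (100, 40, 48, 55)],
    keypoints=[[(25, 35), (45, 35), (35, 50), (28, 65), (42, 65)]] * 2,
)
```

Each tracker's `features` list then grows as aligned face crops are passed through the aggregation network in step S24.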
Step S3 specifically includes the following steps:
step S31: the target state of each tracked face is represented as m = (u, v, s, r, u̇, v̇, ṡ, ṙ), where m denotes the tracked face target state, u and v the center coordinates of the tracked face region, s the area of the face box, r the aspect ratio of the face box, and u̇, v̇, ṡ, ṙ the respective velocities of (u, v, s, r) in the image coordinate space;
step S32: the face position P_k in each tracker T_k, given in (x, y, w, h) form, is converted into the state form m_i^k, where m_i^k denotes the converted face position state of the k-th tracking target in the i-th frame;
step S33: m_i^k is taken as the direct observation of the k-th tracking target in the i-th frame obtained by face detection, and a Kalman filter based on a linear constant-velocity motion model is used to predict the state m̂_{i+1}^k of the k-th tracking target in the (i+1)-th frame;
step S34: in the (i+1)-th frame, the MTCNN model is applied again for face detection and face key point localization, yielding the face positions D_{i+1} and face key points C_{i+1};
Step S35: for each face position d_{i+1}^j, face alignment is completed by applying a similarity transformation based on its facial key points c_{i+1}^j, and the aligned face is input into the adaptive aggregation network to extract features, yielding the feature set F_{i+1}, where F_{i+1} denotes the set of features of all faces in the (i+1)-th frame;
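The state handling in steps S31–S35 reduces to a box-to-state conversion plus one constant-velocity prediction step. A minimal Python sketch, assuming the usual SORT-style (u, v, s, r) parametrization and omitting the noise and covariance machinery of a full Kalman filter:

```python
def bbox_to_state(x, y, w, h):
    """Convert a face box (top-left corner, width, height) to the
    observation (u, v, s, r): center coordinates, box area, aspect ratio."""
    return (x + w / 2.0, y + h / 2.0, w * h, w / float(h))

def state_to_bbox(u, v, s, r):
    """Inverse conversion, recovering (x, y, w, h) from (u, v, s, r),
    as needed in step S42."""
    w = (s * r) ** 0.5
    h = s / w
    return (u - w / 2.0, v - h / 2.0, w, h)

def predict_constant_velocity(pos, vel, dt=1.0):
    """One prediction step of the linear constant-velocity model of step S33:
    each of (u, v, s, r) advances by its estimated velocity per frame."""
    return [p + v_ * dt for p, v_ in zip(pos, vel)]
```

The two conversions are exact inverses of each other, so predicted states can be mapped back to boxes for the overlap computation in step S43.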
step S4 specifically includes the following steps:
step S41: for each face tracker T_k, the set E_k of all features along its historical trajectory is input into the adaptive aggregation network to obtain the aggregated feature f_k, where f_k denotes the aggregated feature output after fusing all feature vectors in the historical trajectory of the k-th tracking target;
step S42: the position state m̂_{i+1}^k of the k-th target in the next frame, predicted by the Kalman filter in the i-th frame, is converted into (x, y, w, h) form, denoted p̂_{i+1}^k;
step S43: combining p̂_{i+1}^k and the aggregated feature f_k of target k with the face positions D_{i+1} and the feature set F_{i+1} obtained by face detection in the (i+1)-th frame, the following association matrix is computed:
G = [g_jk], j = 1, 2, ..., J_{i+1}, k = 1, 2, ..., K_i,
where J_{i+1} is the number of faces detected in the (i+1)-th frame and K_i the number of tracked targets in the i-th frame; each entry g_jk combines the degree of overlap between the j-th face detection box in the (i+1)-th frame and the position state p̂_{i+1}^k of the k-th target predicted by the Kalman filter for the (i+1)-th frame, and the cosine similarity between the j-th face feature f_{i+1}^j in the (i+1)-th frame and the aggregated feature f_k of the k-th target in the i-th frame; λ is a hyper-parameter used to balance the weights of the two metrics;
step S44: taking the association matrix G as the cost matrix, the Hungarian algorithm is used to compute a matching result, associating the face detection d_{i+1}^j in the (i+1)-th frame with the k-th tracking target;
step S45: the indices in the matching result are mapped back to the entries of the association matrix G, and every entry g_jk smaller than T_similarity is filtered out and deleted from the matching result, where T_similarity is a preset hyper-parameter giving the minimum similarity threshold for a match to be considered successful;
step S46: in the matching result, if the detection box d_{i+1}^j is successfully associated with the k-th tracking target, the corresponding tracker T_k is updated: its position state P_k = d_{i+1}^j, its face key point positions L_k = c_{i+1}^j, its life cycle A_k = A_k + 1, and the corresponding face feature f_{i+1}^j is added to the feature list E_k; if a detection box d_{i+1}^j fails to be associated, a new tracker is created for it;
step S47: for each tracker T_k, if its life cycle A_k > T_age, the tracker is deleted, where T_age is a preset hyper-parameter representing the maximum time a tracked target can survive.
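The association stage of steps S43–S45 can be illustrated end to end. In the Python sketch below, the linear blend g_jk = λ·IoU + (1 − λ)·cosine is an assumption (the claim only states that λ balances the two metrics), and a brute-force assignment over tiny square matrices stands in for the Hungarian algorithm:

```python
from itertools import permutations
from math import sqrt

def iou(a, b):
    """Overlap of two (x, y, w, h) boxes."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v)))

def association_matrix(det_boxes, det_feats, pred_boxes, trk_feats, lam=0.5):
    """Association entries g_jk blending box overlap with appearance similarity;
    the linear blend with weight lam (the lambda of step S43) is an assumption."""
    return [[lam * iou(db, pb) + (1 - lam) * cosine(df, tf)
             for pb, tf in zip(pred_boxes, trk_feats)]
            for db, df in zip(det_boxes, det_feats)]

def hungarian_match(G, t_similarity=0.3):
    """Optimal assignment by brute force (a stand-in for the Hungarian
    algorithm, adequate only for tiny square matrices), followed by the
    threshold filtering of step S45."""
    n = len(G)
    best = max(permutations(range(n)),
               key=lambda perm: sum(G[j][perm[j]] for j in range(n)))
    return [(j, best[j]) for j in range(n) if G[j][best[j]] >= t_similarity]
```

An identical box and feature pair scores g = 1.0, while pairs below T_similarity are dropped even if the assignment selected them.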
2. The multi-face tracking method based on depth appearance characteristics and an adaptive aggregation network according to claim 1, characterized in that: the adaptive aggregation network is formed by connecting a deep feature extraction module and an adaptive feature aggregation module in series; it receives one or more face images of the same person as input and outputs an aggregated feature; the deep feature extraction module adopts a 34-layer ResNet as its backbone network, and the adaptive feature aggregation module comprises a feature aggregation layer; let B denote the number of input samples and {z_t}, t = 1, 2, ..., B, the corresponding input feature vectors; the feature aggregation layer is computed as follows:
a = Σ_t o_t z_t;
where q is a learnable parameter weighting the components of each feature vector z_t, learned by back-propagation and gradient descent with the face recognition signal as the supervision signal; v_t is the output of the sigmoid function, mapping each feature vector z_t to a score in the range between 0 and 1; o_t = v_t / Σ_t v_t is the L1-normalized output, such that Σ_t o_t = 1; and a is the feature vector obtained after aggregating the B feature vectors.
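As a concrete illustration of the feature aggregation layer in claim 2, the following Python sketch assumes the score is v_t = sigmoid(q · z_t), with a fixed vector q standing in for the learnable parameter (in the patent q is trained by back-propagation under a face recognition loss):

```python
import math

def aggregate(features, q):
    """Feature aggregation layer sketch: score each vector z_t with
    v_t = sigmoid(q . z_t) (assumed form), L1-normalize the scores to
    weights o_t so that sum_t o_t = 1, and return a = sum_t o_t * z_t."""
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    v = [sigmoid(sum(qi * zi for qi, zi in zip(q, z))) for z in features]
    total = sum(v)
    o = [vt / total for vt in v]          # L1 normalization: weights sum to 1
    dim = len(features[0])
    return [sum(o[t] * features[t][d] for t in range(len(features)))
            for d in range(dim)]
```

With identical inputs the weights become uniform and the aggregate reproduces the input feature; dissimilar inputs are blended in proportion to their normalized sigmoid scores.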
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910106309.1A CN109829436B (en) | 2019-02-02 | 2019-02-02 | Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network |
PCT/CN2019/124966 WO2020155873A1 (en) | 2019-02-02 | 2019-12-13 | Deep apparent features and adaptive aggregation network-based multi-face tracking method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910106309.1A CN109829436B (en) | 2019-02-02 | 2019-02-02 | Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109829436A (en) | 2019-05-31 |
CN109829436B (en) | 2022-05-13 |
Family
ID=66863393
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910106309.1A Active CN109829436B (en) | 2019-02-02 | 2019-02-02 | Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109829436B (en) |
WO (1) | WO2020155873A1 (en) |
Families Citing this family (82)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109829436B (en) * | 2019-02-02 | 2022-05-13 | 福州大学 | Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network |
TWI727337B (en) * | 2019-06-06 | 2021-05-11 | 大陸商鴻富錦精密工業(武漢)有限公司 | Electronic device and face recognition method |
CN110490901A (en) * | 2019-07-15 | 2019-11-22 | 武汉大学 | The pedestrian detection tracking of anti-attitudes vibration |
CN110414443A (en) * | 2019-07-31 | 2019-11-05 | 苏州市科远软件技术开发有限公司 | A kind of method for tracking target, device and rifle ball link tracking |
CN110705478A (en) * | 2019-09-30 | 2020-01-17 | 腾讯科技(深圳)有限公司 | Face tracking method, device, equipment and storage medium |
CN111078295B (en) * | 2019-11-28 | 2021-11-12 | 核芯互联科技(青岛)有限公司 | Mixed branch prediction device and method for out-of-order high-performance core |
CN111160202B (en) * | 2019-12-20 | 2023-09-05 | 万翼科技有限公司 | Identity verification method, device, equipment and storage medium based on AR equipment |
CN111079718A (en) * | 2020-01-15 | 2020-04-28 | 中云智慧(北京)科技有限公司 | Quick face comparison method |
CN111275741B (en) * | 2020-01-19 | 2023-09-08 | 北京迈格威科技有限公司 | Target tracking method, device, computer equipment and storage medium |
CN111325279B (en) * | 2020-02-26 | 2022-06-10 | 福州大学 | Pedestrian and personal sensitive article tracking method fusing visual relationship |
CN111476826A (en) * | 2020-04-10 | 2020-07-31 | 电子科技大学 | Multi-target vehicle tracking method based on SSD target detection |
CN111770299B (en) * | 2020-04-20 | 2022-04-19 | 厦门亿联网络技术股份有限公司 | Method and system for real-time face abstract service of intelligent video conference terminal |
CN111553234B (en) * | 2020-04-22 | 2023-06-06 | 上海锘科智能科技有限公司 | Pedestrian tracking method and device integrating facial features and Re-ID feature ordering |
CN111914613B (en) * | 2020-05-21 | 2024-03-01 | 淮阴工学院 | Multi-target tracking and facial feature information recognition method |
CN112001225B (en) * | 2020-07-06 | 2023-06-23 | 西安电子科技大学 | Online multi-target tracking method, system and application |
CN111932588B (en) * | 2020-08-07 | 2024-01-30 | 浙江大学 | Tracking method of airborne unmanned aerial vehicle multi-target tracking system based on deep learning |
CN111784746B (en) * | 2020-08-10 | 2024-05-03 | 青岛高重信息科技有限公司 | Multi-target pedestrian tracking method and device under fish-eye lens and computer system |
CN111899284B (en) * | 2020-08-14 | 2024-04-09 | 北京交通大学 | Planar target tracking method based on parameterized ESM network |
CN112036271B (en) * | 2020-08-18 | 2023-10-10 | 汇纳科技股份有限公司 | Pedestrian re-identification method, system, medium and terminal based on Kalman filtering |
CN111932661B (en) * | 2020-08-19 | 2023-10-24 | 上海艾麒信息科技股份有限公司 | Facial expression editing system and method and terminal |
CN112016440B (en) * | 2020-08-26 | 2024-02-20 | 杭州云栖智慧视通科技有限公司 | Target pushing method based on multi-target tracking |
CN112215873A (en) * | 2020-08-27 | 2021-01-12 | 国网浙江省电力有限公司电力科学研究院 | Method for tracking and positioning multiple targets in transformer substation |
CN112085767B (en) * | 2020-08-28 | 2023-04-18 | 安徽清新互联信息科技有限公司 | Passenger flow statistical method and system based on deep optical flow tracking |
CN112053386B (en) * | 2020-08-31 | 2023-04-18 | 西安电子科技大学 | Target tracking method based on depth convolution characteristic self-adaptive integration |
CN112257502A (en) * | 2020-09-16 | 2021-01-22 | 深圳微步信息股份有限公司 | Pedestrian identification and tracking method and device for surveillance video and storage medium |
CN112149557B (en) * | 2020-09-22 | 2022-08-09 | 福州大学 | Person identity tracking method and system based on face recognition |
CN112215155B (en) * | 2020-10-13 | 2022-10-14 | 北京中电兴发科技有限公司 | Face tracking method and system based on multi-feature fusion |
CN112288773A (en) * | 2020-10-19 | 2021-01-29 | 慧视江山科技(北京)有限公司 | Multi-scale human body tracking method and device based on Soft-NMS |
CN112307234A (en) * | 2020-11-03 | 2021-02-02 | 厦门兆慧网络科技有限公司 | Face bottom library synthesis method, system, device and storage medium |
CN112287877B (en) * | 2020-11-18 | 2022-12-02 | 苏州爱可尔智能科技有限公司 | Multi-role close-up shot tracking method |
CN114639129B (en) * | 2020-11-30 | 2024-05-03 | 北京君正集成电路股份有限公司 | Paper medium living body detection method for access control system |
CN112651994A (en) * | 2020-12-18 | 2021-04-13 | 零八一电子集团有限公司 | Ground multi-target tracking method |
CN112668432A (en) * | 2020-12-22 | 2021-04-16 | 上海幻维数码创意科技股份有限公司 | Human body detection tracking method in ground interactive projection system based on YoloV5 and Deepsort |
CN112597901B (en) * | 2020-12-23 | 2023-12-29 | 艾体威尔电子技术(北京)有限公司 | Device and method for effectively recognizing human face in multiple human face scenes based on three-dimensional ranging |
CN112560874B (en) * | 2020-12-25 | 2024-04-16 | 北京百度网讯科技有限公司 | Training method, device, equipment and medium for image recognition model |
CN112653844A (en) * | 2020-12-28 | 2021-04-13 | 珠海亿智电子科技有限公司 | Camera holder steering self-adaptive tracking adjustment method |
CN112597944A (en) * | 2020-12-29 | 2021-04-02 | 北京市商汤科技开发有限公司 | Key point detection method and device, electronic equipment and storage medium |
CN112669345B (en) * | 2020-12-30 | 2023-10-20 | 中山大学 | Cloud deployment-oriented multi-target track tracking method and system |
CN112581506A (en) * | 2020-12-31 | 2021-03-30 | 北京澎思科技有限公司 | Face tracking method, system and computer readable storage medium |
CN112686175A (en) * | 2020-12-31 | 2021-04-20 | 北京澎思科技有限公司 | Face snapshot method, system and computer readable storage medium |
CN112784725A (en) * | 2021-01-15 | 2021-05-11 | 北京航天自动控制研究所 | Pedestrian anti-collision early warning method and device, storage medium and forklift |
CN113076808B (en) * | 2021-03-10 | 2023-05-26 | 海纳云物联科技有限公司 | Method for accurately acquiring bidirectional traffic flow through image algorithm |
CN113158788B (en) * | 2021-03-12 | 2024-03-08 | 中国平安人寿保险股份有限公司 | Facial expression recognition method and device, terminal equipment and storage medium |
CN113033439B (en) * | 2021-03-31 | 2023-10-20 | 北京百度网讯科技有限公司 | Method and device for data processing and electronic equipment |
CN113158853A (en) * | 2021-04-08 | 2021-07-23 | 浙江工业大学 | Pedestrian's identification system that makes a dash across red light that combines people's face and human gesture |
CN113192105B (en) * | 2021-04-16 | 2023-10-17 | 嘉联支付有限公司 | Method and device for indoor multi-person tracking and attitude measurement |
CN113158909B (en) * | 2021-04-25 | 2023-06-27 | 中国科学院自动化研究所 | Behavior recognition light-weight method, system and equipment based on multi-target tracking |
CN113408348B (en) * | 2021-05-14 | 2022-08-19 | 桂林电子科技大学 | Video-based face recognition method and device and storage medium |
CN113377192B (en) * | 2021-05-20 | 2023-06-20 | 广州紫为云科技有限公司 | Somatosensory game tracking method and device based on deep learning |
CN113379795B (en) * | 2021-05-21 | 2024-03-22 | 浙江工业大学 | Multi-target tracking and segmentation method based on conditional convolution and optical flow characteristics |
CN113269098B (en) * | 2021-05-27 | 2023-06-16 | 中国人民解放军军事科学院国防科技创新研究院 | Multi-target tracking positioning and motion state estimation method based on unmanned aerial vehicle |
CN113313201A (en) * | 2021-06-21 | 2021-08-27 | 南京挥戈智能科技有限公司 | Multi-target detection and distance measurement method based on Swin transducer and ZED camera |
CN113487653B (en) * | 2021-06-24 | 2024-03-26 | 之江实验室 | Self-adaptive graph tracking method based on track prediction |
CN113486771B (en) * | 2021-06-30 | 2023-07-07 | 福州大学 | Video action uniformity evaluation method and system based on key point detection |
CN113724291B (en) * | 2021-07-29 | 2024-04-02 | 西安交通大学 | Multi-panda tracking method, system, terminal device and readable storage medium |
CN113658223B (en) * | 2021-08-11 | 2023-08-04 | 山东建筑大学 | Multi-row person detection and tracking method and system based on deep learning |
CN113807187B (en) * | 2021-08-20 | 2024-04-02 | 北京工业大学 | Unmanned aerial vehicle video multi-target tracking method based on attention feature fusion |
CN113688740B (en) * | 2021-08-26 | 2024-02-27 | 燕山大学 | Indoor gesture detection method based on multi-sensor fusion vision |
CN113723279B (en) * | 2021-08-30 | 2022-11-01 | 东南大学 | Multi-target tracking acceleration method based on time-space optimization in edge computing environment |
CN113920457A (en) * | 2021-09-16 | 2022-01-11 | 中国农业科学院农业资源与农业区划研究所 | Fruit yield estimation method and system based on space and ground information acquisition cooperative processing |
CN113723361A (en) * | 2021-09-18 | 2021-11-30 | 西安邮电大学 | Video monitoring method and device based on deep learning |
CN113808170B (en) * | 2021-09-24 | 2023-06-27 | 电子科技大学长三角研究院(湖州) | Anti-unmanned aerial vehicle tracking method based on deep learning |
CN113822211B (en) * | 2021-09-27 | 2023-04-11 | 山东睿思奥图智能科技有限公司 | Interactive person information acquisition method |
CN113936312A (en) * | 2021-10-12 | 2022-01-14 | 南京视察者智能科技有限公司 | Face recognition base screening method based on deep learning graph convolution network |
CN114627339B (en) * | 2021-11-09 | 2024-03-29 | 昆明物理研究所 | Intelligent recognition tracking method and storage medium for cross border personnel in dense jungle area |
CN114120188B (en) * | 2021-11-19 | 2024-04-05 | 武汉大学 | Multi-row person tracking method based on joint global and local features |
CN114169425B (en) * | 2021-12-03 | 2023-02-03 | 北京百度网讯科技有限公司 | Training target tracking model and target tracking method and device |
CN114339398A (en) * | 2021-12-24 | 2022-04-12 | 天翼视讯传媒有限公司 | Method for real-time special effect processing in large-scale video live broadcast |
CN114419151A (en) * | 2021-12-31 | 2022-04-29 | 福州大学 | Multi-target tracking method based on contrast learning |
CN114663796A (en) * | 2022-01-04 | 2022-06-24 | 北京航空航天大学 | Target person continuous tracking method, device and system |
CN114821702A (en) * | 2022-03-15 | 2022-07-29 | 电子科技大学 | Thermal infrared face recognition method based on face shielding |
CN115214430B (en) * | 2022-03-23 | 2023-11-17 | 广州汽车集团股份有限公司 | Vehicle seat adjusting method and vehicle |
WO2023184197A1 (en) * | 2022-03-30 | 2023-10-05 | 京东方科技集团股份有限公司 | Target tracking method and apparatus, system, and storage medium |
CN115272404B (en) * | 2022-06-17 | 2023-07-18 | 江南大学 | Multi-target tracking method based on kernel space and implicit space feature alignment |
CN114943924B (en) * | 2022-06-21 | 2024-05-14 | 深圳大学 | Pain assessment method, system, equipment and medium based on facial expression video |
CN114783043B (en) * | 2022-06-24 | 2022-09-20 | 杭州安果儿智能科技有限公司 | Child behavior track positioning method and system |
CN115994929A (en) * | 2023-03-24 | 2023-04-21 | 中国兵器科学研究院 | Multi-target tracking method integrating space motion and apparent feature learning |
CN116596958B (en) * | 2023-07-18 | 2023-10-10 | 四川迪晟新达类脑智能技术有限公司 | Target tracking method and device based on online sample augmentation |
CN117011335B (en) * | 2023-07-26 | 2024-04-09 | 山东大学 | Multi-target tracking method and system based on self-adaptive double decoders |
CN117455955B (en) * | 2023-12-14 | 2024-03-08 | 武汉纺织大学 | Pedestrian multi-target tracking method based on unmanned aerial vehicle visual angle |
CN117576166B (en) * | 2024-01-15 | 2024-04-30 | 浙江华是科技股份有限公司 | Target tracking method and system based on camera and low-frame-rate laser radar |
CN117809054B (en) * | 2024-02-29 | 2024-05-10 | 南京邮电大学 | Multi-target tracking method based on feature decoupling fusion network |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8295543B2 (en) * | 2007-08-31 | 2012-10-23 | Lockheed Martin Corporation | Device and method for detecting targets in images based on user-defined classifiers |
CN101216885A (en) * | 2008-01-04 | 2008-07-09 | 中山大学 | Passerby face detection and tracing algorithm based on video |
CN101777116B (en) * | 2009-12-23 | 2012-07-25 | 中国科学院自动化研究所 | Method for analyzing facial expressions on basis of motion tracking |
US10902243B2 (en) * | 2016-10-25 | 2021-01-26 | Deep North, Inc. | Vision based target tracking that distinguishes facial feature targets |
CN106845385A (en) * | 2017-01-17 | 2017-06-13 | 腾讯科技(上海)有限公司 | The method and apparatus of video frequency object tracking |
CN107292911B (en) * | 2017-05-23 | 2021-03-30 | 南京邮电大学 | Multi-target tracking method based on multi-model fusion and data association |
CN107492116A (en) * | 2017-09-01 | 2017-12-19 | 深圳市唯特视科技有限公司 | A kind of method that face tracking is carried out based on more display models |
CN107609512A (en) * | 2017-09-12 | 2018-01-19 | 上海敏识网络科技有限公司 | A kind of video human face method for catching based on neutral net |
CN108509859B (en) * | 2018-03-09 | 2022-08-26 | 南京邮电大学 | Non-overlapping area pedestrian tracking method based on deep neural network |
CN108363997A (en) * | 2018-03-20 | 2018-08-03 | 南京云思创智信息科技有限公司 | It is a kind of in video to the method for real time tracking of particular person |
CN109101915B (en) * | 2018-08-01 | 2021-04-27 | 中国计量大学 | Face, pedestrian and attribute recognition network structure design method based on deep learning |
CN109086724B (en) * | 2018-08-09 | 2019-12-24 | 北京华捷艾米科技有限公司 | Accelerated human face detection method and storage medium |
CN109829436B (en) * | 2019-02-02 | 2022-05-13 | 福州大学 | Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network |
2019
- 2019-02-02 CN CN201910106309.1A patent/CN109829436B/en active Active
- 2019-12-13 WO PCT/CN2019/124966 patent/WO2020155873A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
A prediction-based real-time facial feature point localization and tracking algorithm; Weng Zhengkui et al.; Wanfang Data Knowledge Service Platform journal database; 2015-07-22; pp. 198-202 *
Also Published As
Publication number | Publication date |
---|---|
WO2020155873A1 (en) | 2020-08-06 |
CN109829436A (en) | 2019-05-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109829436B (en) | Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network | |
CN110135375B (en) | Multi-person attitude estimation method based on global information integration | |
CN110472554B (en) | Table tennis action recognition method and system based on attitude segmentation and key point features | |
CN104881637B (en) | Multimodal information system and its fusion method based on heat transfer agent and target tracking | |
CN111709311B (en) | Pedestrian re-identification method based on multi-scale convolution feature fusion | |
CN114220176A (en) | Human behavior recognition method based on deep learning | |
WO2017150032A1 (en) | Method and system for detecting actions of object in scene | |
CN109685037B (en) | Real-time action recognition method and device and electronic equipment | |
Nandini et al. | Face recognition using neural networks | |
CN110135249A (en) | Human bodys' response method based on time attention mechanism and LSTM | |
CN108960047B (en) | Face duplication removing method in video monitoring based on depth secondary tree | |
CN112149557B (en) | Person identity tracking method and system based on face recognition | |
CN114067358A (en) | Human body posture recognition method and system based on key point detection technology | |
CN112989889B (en) | Gait recognition method based on gesture guidance | |
CN114582030A (en) | Behavior recognition method based on service robot | |
CN111931654A (en) | Intelligent monitoring method, system and device for personnel tracking | |
CN108830170A (en) | A kind of end-to-end method for tracking target indicated based on layered characteristic | |
CN113963032A (en) | Twin network structure target tracking method fusing target re-identification | |
CN110222607A (en) | The method, apparatus and system of face critical point detection | |
CN113378649A (en) | Identity, position and action recognition method, system, electronic equipment and storage medium | |
CN109711232A (en) | Deep learning pedestrian recognition methods again based on multiple objective function | |
CN114429646A (en) | Gait recognition method based on deep self-attention transformation network | |
Wang et al. | Thermal infrared object tracking based on adaptive feature fusion | |
Galiyawala et al. | Dsa-pr: discrete soft biometric attribute-based person retrieval in surveillance videos | |
Caetano et al. | Magnitude-Orientation Stream network and depth information applied to activity recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||