WO2020155873A1 - Deep apparent features and adaptive aggregation network-based multi-face tracking method - Google Patents

Deep apparent features and adaptive aggregation network-based multi-face tracking method

Info

Publication number
WO2020155873A1
WO2020155873A1 (PCT application PCT/CN2019/124966)
Authority
WO
WIPO (PCT)
Prior art keywords
face
frame
feature
target
tracking
Prior art date
Application number
PCT/CN2019/124966
Other languages
French (fr)
Chinese (zh)
Inventor
柯逍
郑毅腾
朱敏琛
Original Assignee
福州大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 福州大学 (Fuzhou University)
Publication of WO2020155873A1 publication Critical patent/WO2020155873A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery

Definitions

  • the invention relates to the field of pattern recognition and computer vision, in particular to a multi-face tracking method based on deep appearance features and an adaptive aggregation network.
  • Face tracking is a specific application of object tracking: a tracking algorithm processes the moving faces in a video sequence and keeps each face region locked to complete the tracking. The technology has good application prospects in scenarios such as smart security and video surveillance.
  • Face tracking plays an important role in video surveillance, but in real scenes the large changes in face pose and the overlap and occlusion between tracked targets still make practical application difficult.
  • the purpose of the present invention is to propose a multi-face tracking method based on deep appearance features and an adaptive aggregation network, which can improve the performance of face tracking.
  • the present invention adopts the following scheme to realize: a multi-face tracking method based on deep appearance features and an adaptive aggregation network, which specifically includes the following steps:
  • Step S1: Use a face recognition data set to train the adaptive aggregation network;
  • Step S2: From the initial input video frame, use a convolutional neural network to obtain the face positions, initialize the face targets to be tracked, and extract and save their face features;
  • Step S3: Use a Kalman filter to predict the position of each face target in the next frame, locate the faces again in the next frame, and extract features from the detected faces;
  • Step S4: Use the adaptive aggregation network trained in step S1 to aggregate the face feature set in each tracked target's trajectory, dynamically generating a deep apparent face feature that fuses multi-frame information; combine the predicted position and the fused feature with the face positions and features detected in the current frame, compute similarities, perform matching, and update the tracking state.
  • step S1 specifically includes the following steps:
  • Step S11: Collect public face recognition data sets to obtain pictures of the relevant persons and their names;
  • Step S12: Use a fusion strategy to merge the pictures of persons shared across the data sets, use the pre-trained MTCNN model for face detection and facial key point localization, apply a similarity transformation for face alignment, and subtract from every training image the per-channel mean computed on the training set, completing the data preprocessing used to train the adaptive aggregation network.
  • The adaptive aggregation network consists of a deep feature extraction module and an adaptive feature aggregation module connected in series. It accepts one or more face images of the same person as input and outputs an aggregated feature; the deep feature extraction module uses a 34-layer ResNet as its backbone and the adaptive feature aggregation module contains a feature aggregation layer.
  • In the feature aggregation layer, q is a learnable weight vector applied to the components of each feature vector z_t; it is trained by back-propagation and gradient descent with the face recognition signal as supervision.
  • v_t, the output of the sigmoid function, is the score of feature vector z_t and lies between 0 and 1.
  • step S2 specifically includes the following steps:
  • Step S24: Input the aligned face image into the adaptive aggregation network to obtain the corresponding deep apparent face feature, and add it to the feature list E_k of tracker T_k.
  • step S3 specifically includes the following steps:
  • Step S31: Represent the state of each tracked face target as m = (u, v, s, r, u̇, v̇, ṡ, ṙ), where u and v are the centre coordinates of the tracked face region, s is the area of the face box, r is its aspect ratio, and the dotted terms are the corresponding velocities in image coordinates.
  • Step S33: Take the converted face position of the k-th target in frame i as its direct observation (obtained from face detection), and use a Kalman filter with a linear constant-velocity motion model to predict the target's state in frame i+1;
  • Step S34: In frame i+1, run the MTCNN model again for face detection and facial key point localization, obtaining the face positions D_{i+1} and facial key points C_{i+1};
  • Step S35: For each detected face position, apply a similarity transformation based on its facial key points to complete face alignment, feed the aligned face into the adaptive aggregation network to extract features, and obtain the feature set F_{i+1} of all faces in frame i+1.
  • step S4 specifically includes the following steps:
  • Step S41: For each face tracker T_k, input the set E_k of all features along its historical trajectory into the adaptive aggregation network to obtain the aggregated feature f_k, a single feature produced by fusing all feature vectors of the k-th target's trajectory;
  • Step S42: Convert the position state of the k-th target predicted by the Kalman filter for the next frame into face-box form;
  • Step S43: Combine the predicted box and the aggregated feature f_k of target k with the face positions D_{i+1} and feature set F_{i+1} detected in frame i+1 to compute the association matrix G = [g_jk], j = 1, ..., J_{i+1}, k = 1, ..., K_i;
  • J_{i+1} is the number of faces detected in frame i+1 and K_i is the number of tracked targets in frame i;
  • each entry g_jk combines the degree of overlap between the j-th detection box in frame i+1 and the predicted position of target k with the cosine similarity between the j-th face feature in frame i+1 and the aggregated feature f_k of the k-th target, where λ is a hyperparameter that balances the weights of the two metrics;
  • Step S44: Using the association matrix G as the cost matrix, run the Hungarian algorithm to compute the matching result, associating face detection boxes in frame i+1 with tracked targets;
  • Step S45: Map the subscripts in the matching result to entries of G, filter out all entries g_jk smaller than T_similarity and delete them from the matching result, where T_similarity is a preset hyperparameter giving the minimum similarity for a successful match;
  • Step S47: For each tracker T_k, if its life cycle A_k > T_age, delete the tracker, where T_age is a preset hyperparameter giving the longest time a tracked target may survive.
  • the present invention has the following beneficial effects:
  • The multi-face tracking method based on deep apparent features and an adaptive aggregation network constructed by the invention can effectively track faces in video, improving tracking accuracy and reducing the number of identity switches.
  • the present invention can track the face in the video online while ensuring the tracking effect.
  • During tracking the predicted face position is uncertain, and faces may undergo large pose changes and occlusion; the invention therefore uses deep apparent face features and combines spatial position information with deep feature information, improving face tracking performance.
  • Because it is difficult to exploit all features along one target's trajectory and to compare multiple feature sets effectively, the invention proposes an adaptive aggregation network whose feature aggregation module adaptively learns the importance of each feature in a feature set and fuses them effectively, improving the face tracking result.
  • Fig. 1 is a schematic flowchart of an embodiment of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A multi-face tracking method based on deep apparent features and an adaptive aggregation network: first, an adaptive aggregation network is trained using a face recognition data set; next, a convolutional-neural-network-based face detection method obtains the face positions, the face targets to be tracked are initialized, and face features are extracted; then, a Kalman filter predicts the position of each tracked face target in the next frame, faces are located again in the next frame, and features are extracted from the detected faces; finally, the adaptive aggregation network aggregates the face feature set in each tracked target's trajectory, dynamically generating a deep apparent face feature that fuses multi-frame information, which, together with the predicted position, is compared by similarity with the face positions and features detected in the current frame for matching, and the tracking state is updated. The described method can improve the performance of face tracking.

Description

A multi-face tracking method based on deep apparent features and an adaptive aggregation network

Technical field

The invention relates to the field of pattern recognition and computer vision, and in particular to a multi-face tracking method based on deep apparent features and an adaptive aggregation network.
Background

In recent years, with social progress and the continuous development of technology, video face recognition has gradually become a popular research field and has attracted the interest of many experts and scholars at home and abroad. As the entry point and foundation of video face recognition, face detection and tracking technology has developed rapidly and is widely used in intelligent surveillance, virtual-reality perception interfaces, video conferencing and other fields. Because real video backgrounds are complex and changeable, and the face, as a non-rigid target, may undergo large changes of pose or expression in a video sequence, implementing a robust face tracking algorithm in real scenes remains a major challenge.

To analyse a face we must first capture it, which is achieved by face detection and face tracking; only when face targets are accurately located and tracked in the video can finer analysis, such as face recognition and pose estimation, be carried out. Object tracking is undoubtedly one of the most important technologies in intelligent security, and face tracking is a concrete application of it: a tracking algorithm processes the moving faces in a video sequence and keeps each face region locked to complete the tracking. The technology has good application prospects in scenarios such as smart security and video surveillance.
Technical problem

Face tracking plays an important role in video surveillance, but in real scenes the large changes in face pose and the overlap and occlusion between tracked targets still make practical application difficult.
Technical solution

In view of this, the purpose of the invention is to propose a multi-face tracking method based on deep apparent features and an adaptive aggregation network that can improve the performance of face tracking.
The invention is realised by the following scheme, a multi-face tracking method based on deep apparent features and an adaptive aggregation network that comprises the following steps:

Step S1: Use a face recognition data set to train an adaptive aggregation network.

Step S2: From the initial input video frame, use a convolutional neural network to obtain the face positions, initialize the face targets to be tracked, and extract and save their face features.

Step S3: Use a Kalman filter to predict the position of each face target in the next frame, locate the faces again in the next frame, and extract features from the detected faces.

Step S4: Use the adaptive aggregation network trained in step S1 to aggregate the face feature set in each tracked target's trajectory, dynamically generating a deep apparent face feature that fuses multi-frame information; combine the predicted position and the fused feature with the face positions and features detected in the current frame, compute similarities, perform matching, and update the tracking state.
Further, step S1 specifically includes the following steps:

Step S11: Collect public face recognition data sets to obtain pictures of the relevant persons and their names.

Step S12: Use a fusion strategy to merge the pictures of persons shared across the data sets, use the pre-trained MTCNN model for face detection and facial key point localization, and apply a similarity transformation for face alignment; at the same time subtract from every image in the training set the per-channel mean computed on the training set, completing the data preprocessing, and train the adaptive aggregation network.
Further, the adaptive aggregation network consists of a deep feature extraction module and an adaptive feature aggregation module connected in series. It accepts one or more face images of the same person as input and outputs an aggregated feature. The deep feature extraction module uses a 34-layer ResNet as its backbone network, and the adaptive feature aggregation module contains a feature aggregation layer. Let B denote the number of input samples and {z_t}, t = 1, 2, ..., B, the set of features output by the deep feature extraction module; the feature aggregation layer is computed as

v_t = sigmoid(qᵀ z_t);

o_t = v_t / Σ_{t'} v_{t'};

a = Σ_t o_t z_t;

where q is a learnable weight vector applied to the components of each feature vector z_t, trained by back-propagation and gradient descent with the face recognition signal as the supervisory signal; v_t, the output of the sigmoid function, is the score of feature vector z_t and lies between 0 and 1; o_t is the L1-normalised output, so that Σ_t o_t = 1; and a is the single feature vector obtained by aggregating the B feature vectors.
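A minimal Python sketch of this aggregation layer is given below for illustration only. It assumes the score v_t is the sigmoid of the inner product qᵀz_t (the patent's original formula images are not reproduced in this text), and it uses random vectors in place of real ResNet features; the function and variable names are chosen here for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def aggregate_features(Z, q):
    """Fuse B feature vectors {z_t} into one aggregated feature a.

    Z : array of shape (B, d), outputs of the deep feature extraction module
        for one person.
    q : array of shape (d,), the learnable weight vector of the
        feature aggregation layer.
    """
    v = sigmoid(Z @ q)      # v_t: score of each feature vector, in (0, 1)
    o = v / v.sum()         # o_t: L1 normalisation so that the scores sum to 1
    return o @ Z            # a = sum_t o_t * z_t

# Example with three 128-dimensional features of the same person
Z = np.random.randn(3, 128)
q = np.random.randn(128)
a = aggregate_features(Z, q)   # a has shape (128,)
```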
Further, step S2 specifically includes the following steps:

Step S21: Let i denote the index of the i-th frame of the input video, with i = 1 initially. Use the pre-trained MTCNN model to simultaneously detect the positions D_i of all faces and the positions C_i of their corresponding facial key points, where j denotes the index of the j-th detected face and J_i is the number of faces detected in frame i. The position of the j-th face in frame i is given as (x, y, w, h), i.e. the top-left corner coordinates of the face region together with its width and height, and the key points of the j-th face in frame i are given as (c_1, c_2, c_3, c_4, c_5), the coordinates of the left eye, right eye, nose, left mouth corner and right mouth corner of the face, respectively.

Step S22: Assign to each face position and its facial key point coordinates a unique identity ID_k, k = 1, 2, ..., K_i, where k is the index of the k-th tracked target and K_i is the number of tracked targets in frame i, and initialize the corresponding tracker T_k = {ID_k, P_k, L_k, E_k, A_k}, where ID_k is the unique identity of the k-th tracked target, P_k the face position coordinates assigned to it, L_k its facial key point coordinates, E_k its list of face features and A_k its life cycle. Initialization sets K_i = J_i, assigns P_k and L_k from the corresponding detected face position and key points, and sets A_k = 1.

Step S23: For the position P_k of each face in T_k, crop the image to obtain the corresponding face image, and use the corresponding facial key point positions L_k to apply a similarity transformation for face alignment, obtaining an aligned face image.

Step S24: Input the aligned face image into the adaptive aggregation network to obtain the corresponding deep apparent face feature, and add it to the feature list E_k of tracker T_k.
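The tracker record T_k = {ID_k, P_k, L_k, E_k, A_k} of step S22 can be pictured as a small data structure, for example as sketched below. This is only an illustration of the bookkeeping; the field names and the helper function are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class FaceTracker:
    """One tracked face target T_k = {ID_k, P_k, L_k, E_k, A_k}."""
    track_id: int                                   # ID_k: unique identity
    box: Tuple[float, float, float, float]          # P_k = (x, y, w, h)
    landmarks: List[Tuple[float, float]]            # L_k: five facial key points
    features: List = field(default_factory=list)    # E_k: deep apparent features
    age: int = 1                                    # A_k: life cycle

def init_trackers(boxes, landmarks, features):
    """Steps S22-S24: one tracker per face detected in the first frame."""
    trackers = []
    for k, (box, lm, feat) in enumerate(zip(boxes, landmarks, features), start=1):
        tracker = FaceTracker(track_id=k, box=box, landmarks=lm)
        tracker.features.append(feat)   # feature of the aligned face crop
        trackers.append(tracker)
    return trackers
```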
Further, step S3 specifically includes the following steps:

Step S31: Represent the state of each tracked face target as m = (u, v, s, r, u̇, v̇, ṡ, ṙ), where m is the state of the tracked face target, u and v are the centre coordinates of the tracked face region, s is the area of the face box, r is the aspect ratio of the face box, and u̇, v̇, ṡ, ṙ are the velocities of (u, v, s, r) in image coordinate space.

Step S32: Convert the face position P_k = (x, y, w, h) in each tracker T_k into the (u, v, s, r) observation form for the k-th tracked target in frame i.

Step S33: Take this converted observation as the direct measurement of the k-th tracked target in frame i, obtained from face detection, and use a Kalman filter based on a linear constant-velocity motion model to predict the state of the k-th tracked target in frame i+1.

Step S34: In frame i+1, run the MTCNN model again for face detection and facial key point localization, obtaining the face positions D_{i+1} and facial key points C_{i+1}.

Step S35: For each face position, apply a similarity transformation based on its facial key points to complete face alignment, input the aligned face into the adaptive aggregation network to extract features, and obtain the feature set F_{i+1}, where F_{i+1} denotes the feature set of all faces in frame i+1.
Further, step S4 specifically includes the following steps:

Step S41: For each face tracker T_k, input the set E_k of all features along its historical motion trajectory into the adaptive aggregation network to obtain the aggregated feature f_k, a single feature output after fusing all feature vectors in the k-th target's historical trajectory.

Step S42: Convert the position state of the k-th target in the next frame, as predicted by the Kalman filter in frame i, into the (x, y, w, h) face-box form.

Step S43: Combine the predicted box and the aggregated feature f_k of target k with the face positions D_{i+1} and the feature set F_{i+1} obtained by face detection in frame i+1, and compute the association matrix

G = [g_jk], j = 1, 2, ..., J_{i+1}, k = 1, 2, ..., K_i;

where J_{i+1} is the number of faces detected in frame i+1 and K_i is the number of tracked targets in frame i. Each entry g_jk combines the degree of overlap between the j-th face detection box in frame i+1 and the position predicted for the k-th target in frame i+1 by the Kalman filter with the cosine similarity between the j-th face feature in frame i+1 and the aggregated feature f_k of the k-th target; λ is a hyperparameter used to balance the weights of the two metrics.

Step S44: Using the association matrix G as the cost matrix, run the Hungarian algorithm to compute the matching result, associating face detection boxes in frame i+1 with tracked targets.

Step S45: Map the subscripts in the matching result to entries of the association matrix G, filter out all entries g_jk smaller than T_similarity and delete them from the matching result, where T_similarity is a preset hyperparameter giving the minimum similarity threshold for a successful match.

Step S46: In the matching result, if a detection box is successfully associated with the k-th tracked target, update the position state and the facial key point positions in the corresponding tracker T_k, set the life cycle A_k = A_k + 1, and add the corresponding face feature to the feature list E_k; if a detection box fails to be associated, create a new tracker for it.

Step S47: For each tracker T_k, if its life cycle A_k > T_age, delete the tracker, where T_age is a preset hyperparameter representing the longest time a tracked target may survive.
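A sketch of the association and matching of steps S43-S45 is given below. The patent's exact formula for g_jk is contained in an image that is not reproduced here, so a λ-weighted sum of box overlap (IoU) and cosine similarity is assumed; the values of lam and t_similarity are illustrative, not the patent's.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment   # Hungarian algorithm

def iou(a, b):
    """Overlap of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def associate(pred_boxes, agg_feats, det_boxes, det_feats, lam=0.5, t_similarity=0.3):
    """Steps S43-S45: build G = [g_jk] and match detections to trackers."""
    J, K = len(det_boxes), len(pred_boxes)
    G = np.zeros((J, K))
    for j in range(J):
        for k in range(K):
            # assumed combination: lam * overlap + (1 - lam) * cosine similarity
            G[j, k] = lam * iou(det_boxes[j], pred_boxes[k]) \
                      + (1.0 - lam) * cosine(det_feats[j], agg_feats[k])
    rows, cols = linear_sum_assignment(-G)          # maximise total similarity
    # S45: keep only pairs whose similarity reaches the threshold T_similarity
    return [(j, k) for j, k in zip(rows, cols) if G[j, k] >= t_similarity]
```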
Beneficial effects

Compared with the prior art, the invention has the following beneficial effects:

1. The multi-face tracking method based on deep apparent features and an adaptive aggregation network constructed by the invention can effectively track faces in video, improving tracking accuracy and reducing the number of identity switches.

2. The invention can track faces in a video online while maintaining the tracking quality.

3. During face tracking the predicted face position is uncertain, and faces may undergo large pose changes and occlusion; the invention therefore proposes to use deep apparent face features, and by combining spatial position information with deep feature information it improves the performance of face tracking.

4. During face tracking it is difficult to exploit all features along the trajectory of one target and to compare multiple feature sets effectively; the invention therefore proposes an adaptive aggregation network whose feature aggregation module adaptively learns the importance of each feature in a feature set and fuses them effectively, improving the face tracking result.
Description of the drawings

Fig. 1 is a schematic flowchart of an embodiment of the invention.
Embodiments of the invention

The invention is further described below in conjunction with the drawings and embodiments.

It should be pointed out that the following detailed descriptions are all exemplary and are intended to provide further explanation of the application. Unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the technical field to which the application belongs.

It should be noted that the terms used here are only for describing specific embodiments and are not intended to limit the exemplary embodiments of the application. As used herein, unless the context clearly indicates otherwise, the singular forms are also intended to include the plural forms; in addition, it should be understood that when the terms "comprising" and/or "including" are used in this specification, they indicate the presence of features, steps, operations, devices, components and/or combinations thereof.
As shown in Fig. 1, this embodiment provides a multi-face tracking method based on deep apparent features and an adaptive aggregation network, which specifically includes the following steps:

Step S1: Use a face recognition data set to train an adaptive aggregation network.

Step S2: From the initial input video frame, use a convolutional-neural-network-based face detection method to obtain the face positions, initialize the face targets to be tracked, and extract and save their face features.

Step S3: Use a Kalman filter to predict the position of each face target in the next frame, use the face detection method again to locate the faces in the next frame, and extract features from the detected faces.

Step S4: Use the adaptive aggregation network trained in step S1 to aggregate the face feature set in each tracked target's trajectory, dynamically generating a deep apparent face feature that fuses multi-frame information; combine the predicted position and the fused feature with the face positions and features detected in the current frame, compute similarities, perform matching, and update the tracking state.
In this embodiment, step S1 specifically includes the following steps:

Step S11: Collect public face recognition data sets to obtain pictures of the relevant persons and their names.

Step S12: Use a fusion strategy to merge the pictures of persons shared across the data sets, use the pre-trained MTCNN model for face detection and facial key point localization, and apply a similarity transformation for face alignment; at the same time subtract from every image in the training set the per-channel mean computed on the training set, completing the data preprocessing, and train the adaptive aggregation network.
In this embodiment, the adaptive aggregation network consists of a deep feature extraction module and an adaptive feature aggregation module connected in series. It accepts one or more face images of the same person as input and outputs an aggregated feature. The deep feature extraction module uses a 34-layer ResNet as its backbone network, and the adaptive feature aggregation module contains a feature aggregation layer. Let B denote the number of input samples and {z_t}, t = 1, 2, ..., B, the set of features output by the deep feature extraction module; the feature aggregation layer is computed as

v_t = sigmoid(qᵀ z_t);

o_t = v_t / Σ_{t'} v_{t'};

a = Σ_t o_t z_t;

where q is a learnable weight vector applied to the components of each feature vector z_t, trained by back-propagation and gradient descent with the face recognition signal as the supervisory signal; v_t, the output of the sigmoid function, is the score of feature vector z_t and lies between 0 and 1; o_t is the L1-normalised output, so that Σ_t o_t = 1; and a is the single feature vector obtained by aggregating the B feature vectors.
In this embodiment, step S2 specifically includes the following steps:

Step S21: Let i denote the index of the i-th frame of the input video, with i = 1 initially. Use the pre-trained MTCNN model to simultaneously detect the positions D_i of all faces and the positions C_i of their corresponding facial key points, where j denotes the index of the j-th detected face and J_i is the number of faces detected in frame i. The position of the j-th face in frame i is given as (x, y, w, h), i.e. the top-left corner coordinates of the face region together with its width and height, and the key points of the j-th face in frame i are given as (c_1, c_2, c_3, c_4, c_5), the coordinates of the left eye, right eye, nose, left mouth corner and right mouth corner of the face, respectively.

Step S22: Assign to each face position and its facial key point coordinates a unique identity ID_k, k = 1, 2, ..., K_i, where k is the index of the k-th tracked target and K_i is the number of tracked targets in frame i, and initialize the corresponding tracker T_k = {ID_k, P_k, L_k, E_k, A_k}, where ID_k is the unique identity of the k-th tracked target, P_k the face position coordinates assigned to it, L_k its facial key point coordinates, E_k its list of face features and A_k its life cycle. Initialization sets K_i = J_i, assigns P_k and L_k from the corresponding detected face position and key points, and sets A_k = 1.

Step S23: For the position P_k of each face in T_k, crop the image to obtain the corresponding face image, and use the corresponding facial key point positions L_k to apply a similarity transformation for face alignment, obtaining an aligned face image.

Step S24: Input the aligned face image into the adaptive aggregation network to obtain the corresponding deep apparent face feature, and add it to the feature list E_k of tracker T_k.
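Face alignment by similarity transformation from the five key points, as used in steps S12, S23 and S35, can be sketched as follows. The patent does not specify a target template or any libraries; the 112x112 five-point template below is a commonly used example, and skimage/OpenCV are used purely for illustration.

```python
import numpy as np
import cv2
from skimage import transform as trans

# Illustrative 5-point reference template for a 112x112 aligned face
# (left eye, right eye, nose, left mouth corner, right mouth corner);
# the patent does not specify these coordinates.
REFERENCE = np.array([[38.2946, 51.6963],
                      [73.5318, 51.5014],
                      [56.0252, 71.7366],
                      [41.5493, 92.3655],
                      [70.7299, 92.2041]], dtype=np.float32)

def align_face(image, landmarks, size=112):
    """Estimate the similarity transform mapping the detected key points to the
    reference template and warp the image to an aligned face crop."""
    tform = trans.SimilarityTransform()
    tform.estimate(np.asarray(landmarks, dtype=np.float32), REFERENCE)
    matrix = tform.params[:2, :]      # 2x3 affine part of the similarity transform
    return cv2.warpAffine(image, matrix, (size, size))
```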
In this embodiment, step S3 specifically includes the following steps:

Step S31: Represent the state of each tracked face target as m = (u, v, s, r, u̇, v̇, ṡ, ṙ), where m is the state of the tracked face target, u and v are the centre coordinates of the tracked face region, s is the area of the face box, r is the aspect ratio of the face box, and u̇, v̇, ṡ, ṙ are the velocities of (u, v, s, r) in image coordinate space.

Step S32: Convert the face position P_k = (x, y, w, h) in each tracker T_k into the (u, v, s, r) observation form for the k-th tracked target in frame i.

Step S33: Take this converted observation as the direct measurement of the k-th tracked target in frame i, obtained from face detection, and use a Kalman filter based on a linear constant-velocity motion model to predict the state of the k-th tracked target in frame i+1.

Step S34: In frame i+1, run the MTCNN model again for face detection and facial key point localization, obtaining the face positions D_{i+1} and facial key points C_{i+1}.

Step S35: For each face position, apply a similarity transformation based on its facial key points to complete face alignment, input the aligned face into the adaptive aggregation network to extract features, and obtain the feature set F_{i+1}, where F_{i+1} denotes the feature set of all faces in frame i+1.
In this embodiment, step S4 specifically includes the following steps:

Step S41: For each face tracker T_k, input the set E_k of all features along its historical motion trajectory into the adaptive aggregation network to obtain the aggregated feature f_k, a single feature output after fusing all feature vectors in the k-th target's historical trajectory.

Step S42: Convert the position state of the k-th target in the next frame, as predicted by the Kalman filter in frame i, into the (x, y, w, h) face-box form.

Step S43: Combine the predicted box and the aggregated feature f_k of target k with the face positions D_{i+1} and the feature set F_{i+1} obtained by face detection in frame i+1, and compute the association matrix

G = [g_jk], j = 1, 2, ..., J_{i+1}, k = 1, 2, ..., K_i;

where J_{i+1} is the number of faces detected in frame i+1 and K_i is the number of tracked targets in frame i. Each entry g_jk combines the degree of overlap between the j-th face detection box in frame i+1 and the position predicted for the k-th target in frame i+1 by the Kalman filter with the cosine similarity between the j-th face feature in frame i+1 and the aggregated feature f_k of the k-th target; λ is a hyperparameter used to balance the weights of the two metrics.

Step S44: Using the association matrix G as the cost matrix, run the Hungarian algorithm to compute the matching result, associating face detection boxes in frame i+1 with tracked targets.

Step S45: Map the subscripts in the matching result to entries of the association matrix G, filter out all entries g_jk smaller than T_similarity and delete them from the matching result, where T_similarity is a preset hyperparameter giving the minimum similarity threshold for a successful match.

Step S46: In the matching result, if a detection box is successfully associated with the k-th tracked target, update the position state and the facial key point positions in the corresponding tracker T_k, set the life cycle A_k = A_k + 1, and add the corresponding face feature to the feature list E_k; if a detection box fails to be associated, create a new tracker for it.

Step S47: For each tracker T_k, if its life cycle A_k > T_age, delete the tracker, where T_age is a preset hyperparameter representing the longest time a tracked target may survive.
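The bookkeeping of steps S46 and S47 can be summarised by a small update routine such as the one below. It is a sketch only: trackers are represented as plain dictionaries, the field names and the default t_age value are illustrative, and matches is assumed to be the list of (detection index, tracker index) pairs surviving the threshold of step S45.

```python
def update_tracks(trackers, matches, det_boxes, det_landmarks, det_feats, t_age=30):
    """Steps S46-S47 on a list of tracker dicts with keys
    'id', 'box', 'landmarks', 'features' and 'age'."""
    matched_dets = {j for j, _ in matches}

    # S46: matched pairs refresh position, key points, life cycle and feature list
    for j, k in matches:
        trackers[k]['box'] = det_boxes[j]
        trackers[k]['landmarks'] = det_landmarks[j]
        trackers[k]['age'] += 1                # A_k = A_k + 1, as stated in the patent
        trackers[k]['features'].append(det_feats[j])

    # S46: each unmatched detection box starts a new tracker
    next_id = max((t['id'] for t in trackers), default=0) + 1
    for j in range(len(det_boxes)):
        if j not in matched_dets:
            trackers.append({'id': next_id, 'box': det_boxes[j],
                             'landmarks': det_landmarks[j],
                             'features': [det_feats[j]], 'age': 1})
            next_id += 1

    # S47: delete trackers whose life cycle exceeds T_age
    return [t for t in trackers if t['age'] <= t_age]
```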
Those skilled in the art should understand that embodiments of the application may be provided as a method, a system or a computer program product. Therefore, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.

The application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The above are only preferred embodiments of the invention and do not limit the invention in other forms. Any person familiar with the field may use the technical content disclosed above to make changes or modifications into equivalent embodiments; however, any simple modification, equivalent change or adaptation made to the above embodiments according to the technical essence of the invention, without departing from the content of the technical solution of the invention, still falls within the protection scope of the technical solution of the invention.

Claims (6)

  1. 一种基于深度表观特征和自适应聚合网络的多人脸跟踪方法,其特征在于:包括以下步骤:A multi-face tracking method based on deep appearance features and an adaptive aggregation network is characterized in that it includes the following steps:
    步骤S1:采用人脸识别数据集训练自适应聚合网络;Step S1: Use the face recognition data set to train an adaptive aggregation network;
    步骤S2:根据初始的输入视频帧,采用卷积神经网络获取人脸的位置,初始化待跟踪的人脸目标,提取人脸特征并保存;Step S2: According to the initial input video frame, use the convolutional neural network to obtain the position of the face, initialize the face target to be tracked, extract the face features and save;
    步骤S3:采用卡尔曼滤波器预测每个人脸目标在下一帧的位置,并在下一帧中再次定位人脸所在位置,并对检测出的人脸提取特征;Step S3: Use the Kalman filter to predict the position of each face target in the next frame, and locate the position of the face again in the next frame, and extract features from the detected face;
    Step S4: use the adaptive aggregation network trained in step S1 to aggregate the set of face features in the tracking trajectory of each tracked face target, dynamically generating a deep apparent face feature that fuses multi-frame information; combine the predicted positions and the aggregated features with the face positions and features obtained by detection in the current frame, perform similarity computation and matching, and update the tracking states.
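For orientation only, the following is a minimal Python sketch of the per-frame loop recited in steps S2–S4 of claim 1. The callables detector, embedder, aggregator, predictor and matcher are hypothetical stand-ins for the MTCNN detector, the deep feature extraction module, the adaptive aggregation network, the Kalman predictor and the Hungarian matching step; detections are assumed to be (box, keypoints) pairs. None of these names come from the patent.

    def track_video(frames, detector, embedder, aggregator, predictor, matcher):
        """Illustrative per-frame loop of steps S2-S4 (non-normative sketch)."""
        tracks = {}                      # track id -> {"box", "features", "age"}
        next_id = 1
        for frame in frames:
            detections = detector(frame)             # list of (box, keypoints) pairs
            feats = [embedder(frame, det) for det in detections]
            # Step S3: motion prediction; step S4: per-track feature aggregation.
            predicted = {tid: predictor(t) for tid, t in tracks.items()}
            agg = {tid: aggregator(t["features"]) for tid, t in tracks.items()}
            matches, unmatched = matcher(predicted, agg, detections, feats)
            for tid, j in matches:                    # update matched tracks
                tracks[tid].update(box=detections[j][0], age=tracks[tid]["age"] + 1)
                tracks[tid]["features"].append(feats[j])
            for j in unmatched:                       # start a new track per unmatched detection
                tracks[next_id] = {"box": detections[j][0], "features": [feats[j]], "age": 1}
                next_id += 1
            yield {tid: t["box"] for tid, t in tracks.items()}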
  2. The multi-face tracking method based on deep apparent features and an adaptive aggregation network according to claim 1, characterized in that step S1 specifically comprises the following steps:
    Step S11: collect public face recognition data sets and obtain the pictures and names of the relevant persons;
    Step S12: use a fusion strategy to integrate the pictures of persons shared across the multiple data sets; use a pre-trained MTCNN model for face detection and facial key point localization, and apply a similarity transformation for face alignment; subtract from every image in the training set the per-channel mean computed on the training set to complete the data preprocessing; and train the adaptive aggregation network.
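A hedged sketch of the preprocessing in step S12, assuming OpenCV for the similarity-transform alignment; the five-point reference template and the 112x112 crop size are our assumptions, not values fixed by the claim.

    import cv2
    import numpy as np

    # Assumed canonical 112x112 five-point template (left eye, right eye, nose,
    # left mouth corner, right mouth corner); the patent does not fix these values.
    REFERENCE_5PTS = np.float32([[38.3, 51.7], [73.5, 51.5], [56.0, 71.7],
                                 [41.5, 92.4], [70.7, 92.2]])

    def align_face(image, keypoints, size=(112, 112)):
        """Similarity-transform alignment of one face using its 5 detected landmarks."""
        M, _ = cv2.estimateAffinePartial2D(np.float32(keypoints), REFERENCE_5PTS)
        return cv2.warpAffine(image, M, size)

    def channel_means(images):
        """Per-channel mean over the training set (step S12)."""
        return np.stack(images).astype(np.float64).mean(axis=(0, 1, 2))

    def preprocess(image, mean_rgb):
        """Subtract the training-set per-channel mean from an aligned crop."""
        return image.astype(np.float32) - np.asarray(mean_rgb, dtype=np.float32)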
  3. The multi-face tracking method based on deep apparent features and an adaptive aggregation network according to claim 2, characterized in that the adaptive aggregation network is formed by connecting a deep feature extraction module and an adaptive feature aggregation module in series; it accepts one or more face images of the same person as input and outputs an aggregated feature; the deep feature extraction module uses a 34-layer ResNet as the backbone network, and the adaptive feature aggregation module contains a feature aggregation layer; let B denote the number of input samples and {z_t} the set of output features of the deep feature extraction module, where t = 1, 2, ..., B is the index of the input sample; the feature aggregation layer is computed as:
    v_t = σ(q · z_t);
    o_t = v_t / ∑_t v_t;
    a = ∑_t o_t z_t;
    where q denotes the weights on the components of the feature vectors z_t, a learnable parameter that is learned by back-propagation and gradient descent with the face recognition signal as the supervisory signal; v_t is the output of the sigmoid function, representing the score of each feature vector z_t and ranging between 0 and 1; o_t is the L1-normalized output, such that ∑_t o_t = 1; and a is the single feature vector obtained by aggregating the B feature vectors.
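A minimal PyTorch sketch of the feature aggregation layer of claim 3: a learnable weight vector q scores each embedding through a sigmoid, the scores are L1-normalized, and the embeddings are summed with those weights. The module and variable names are illustrative.

    import torch
    import torch.nn as nn

    class AdaptiveAggregation(nn.Module):
        """Aggregates B per-image embeddings z_t into one vector a (claim 3)."""
        def __init__(self, dim):
            super().__init__()
            self.q = nn.Parameter(torch.zeros(dim))    # learnable component weights

        def forward(self, z):                          # z: (B, dim)
            v = torch.sigmoid(z @ self.q)              # v_t in (0, 1), shape (B,)
            o = v / v.sum()                            # L1 normalization, sum_t o_t = 1
            return (o.unsqueeze(1) * z).sum(dim=0)     # a = sum_t o_t z_t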
  4. The multi-face tracking method based on deep apparent features and an adaptive aggregation network according to claim 1, characterized in that step S2 specifically comprises the following steps:
    Step S21: let i denote the index of the current frame of the input video, with i = 1 initially; use the pre-trained MTCNN model to simultaneously detect the positions D_i of all faces and the positions C_i of their corresponding facial key points, where D_i = {d_i^j}, j = 1, 2, ..., J_i, j being the index of the j-th detected face and J_i the number of faces detected in the i-th frame; d_i^j = (x, y, w, h) denotes the position of the j-th face in the i-th frame, with x, y, w, h the coordinates of the upper-left corner of the face region and its width and height; C_i = {c_i^j}, where c_i^j = (c_1, c_2, c_3, c_4, c_5) denotes the key points of the j-th face in the i-th frame, with c_1, c_2, c_3, c_4, c_5 the coordinates of the left eye, right eye, nose, left mouth corner and right mouth corner of the face, respectively;
    Step S22: for the position d_i^j of each face and its facial key point coordinates c_i^j, assign a unique identity ID_k, k = 1, 2, ..., K_i, where k is the index of the k-th tracking target and K_i is the number of tracked targets at the i-th frame, and initialize the corresponding tracker T_k = {ID_k, P_k, L_k, E_k, A_k}, where ID_k is the unique identity of the k-th tracking target, P_k is the face position coordinates assigned to the k-th target, L_k is the facial key point coordinates of the k-th target, E_k is the face feature list of the k-th target, and A_k is the life cycle of the k-th target; initialize K_i = J_i, P_k = d_i^k, L_k = c_i^k, and A_k = 1;
    Step S23: for the face position P_k of each tracker T_k, crop the image to obtain the corresponding face image, and, using the corresponding facial key point positions L_k, apply a similarity transformation for face alignment to obtain the aligned face image;
    Step S24: input the aligned face image into the adaptive aggregation network to obtain the corresponding deep apparent face feature, and add it to the feature list E_k of the tracker T_k.
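An illustrative Python sketch of the tracker structure T_k = {ID_k, P_k, L_k, E_k, A_k} and its initialization in step S22; the field and function names are ours, not the patent's.

    from dataclasses import dataclass, field
    from typing import List, Tuple
    import numpy as np

    @dataclass
    class Tracker:
        """Per-target state T_k = {ID_k, P_k, L_k, E_k, A_k} from step S22."""
        track_id: int                                   # ID_k
        box: Tuple[float, float, float, float]          # P_k = (x, y, w, h)
        keypoints: np.ndarray                           # L_k, five facial landmarks
        features: List[np.ndarray] = field(default_factory=list)  # E_k
        age: int = 1                                    # A_k

    def init_trackers(boxes, keypoints_list, features):
        """Step S22 sketch: one tracker per detection in the first frame."""
        return [Tracker(k, box, kps, [feat])
                for k, (box, kps, feat) in enumerate(zip(boxes, keypoints_list, features), start=1)]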
  5. The multi-face tracking method based on deep apparent features and an adaptive aggregation network according to claim 1, characterized in that step S3 specifically comprises the following steps:
    Step S31: represent the state of each tracked face target as m = (u, v, s, r, u̇, v̇, ṡ, ṙ), where m denotes the state of the tracked face target, u and v are the center coordinates of the tracked face region, s is the area of the face bounding box, r is the aspect ratio of the face bounding box, and u̇, v̇, ṡ, ṙ denote the respective velocities of (u, v, s, r) in image coordinate space;
    Step S32: convert the face position P_k = (x, y, w, h) in each tracker T_k into the form m_i^k = (u, v, s, r), where m_i^k denotes the converted face position of the k-th tracking target in the i-th frame;
    Step S33: take m_i^k, which is derived from face detection, as the direct observation of the k-th tracking target in the i-th frame, and use a Kalman filter based on a linear constant-velocity motion model to predict the state m̂_{i+1}^k of the k-th tracking target in the (i+1)-th frame;
    Step S34: in the (i+1)-th frame, use the MTCNN model again to perform face detection and facial key point localization, obtaining the face positions D_{i+1} and the facial key points C_{i+1};
    Step S35: for each face position d_{i+1}^j, apply a similarity transformation based on its facial key points c_{i+1}^j to complete face alignment, and input the aligned face into the adaptive aggregation network to extract features, obtaining the feature set F_{i+1}, where F_{i+1} denotes the feature set of all faces in the (i+1)-th frame.
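A small NumPy sketch of the state handling in steps S31–S33: conversion between the detection box (x, y, w, h) and the observation (u, v, s, r), and one mean-propagation step of the linear constant-velocity model. The full Kalman covariance update is omitted for brevity, so this is a simplified illustration rather than the claimed filter.

    import numpy as np

    def box_to_state(box):
        """(x, y, w, h) -> observation (u, v, s, r) as in steps S31/S32."""
        x, y, w, h = box
        return np.array([x + w / 2.0, y + h / 2.0, w * h, w / float(h)])

    def state_to_box(state):
        """(u, v, s, r, ...) -> (x, y, w, h), inverse of box_to_state."""
        u, v, s, r = state[:4]
        w = np.sqrt(s * r)
        h = s / w
        return np.array([u - w / 2.0, v - h / 2.0, w, h])

    # Constant-velocity transition for the 8-dim state (u, v, s, r, du, dv, ds, dr):
    F = np.eye(8)
    F[:4, 4:] = np.eye(4)    # position components advance by their velocities

    def predict(state_8d):
        """One prediction step under the linear constant-velocity model (mean only)."""
        return F @ state_8d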
  6. The multi-face tracking method based on deep apparent features and an adaptive aggregation network according to claim 1, characterized in that step S4 specifically comprises the following steps:
    Step S41: for the tracker T_k of each face, input the set E_k of all features in its historical motion trajectory into the adaptive aggregation network to obtain the aggregated feature f_k, where f_k denotes the single aggregated feature output after fusing all the feature vectors in the historical motion trajectory of the k-th tracking target;
    Step S42: convert the position state m̂_{i+1}^k of the k-th target in the next frame, as predicted by the Kalman filter in the i-th frame, into the bounding-box form p̂_{i+1}^k = (x, y, w, h);
    Step S43: combining p̂_{i+1}^k and the aggregated feature f_k of target k with the face positions D_{i+1} obtained by face detection in the (i+1)-th frame and their feature set F_{i+1}, compute the following association matrix:
    G = [g_jk], j = 1, 2, ..., J_{i+1}, k = 1, 2, ..., K_i;
    where J_{i+1} is the number of faces detected in the (i+1)-th frame; K_i is the number of tracking targets in the i-th frame; the first term measures the degree of overlap between the j-th face detection box in the (i+1)-th frame and the position state p̂_{i+1}^k in the (i+1)-th frame of the k-th target predicted by the Kalman filter in the i-th frame; the second term is the cosine similarity between the j-th face feature f_{i+1}^j in the (i+1)-th frame and the aggregated feature f_k of the k-th target in the i-th frame; and λ is a hyperparameter used to balance the weights of the two metrics;
    Step S44: taking the association matrix G as the cost matrix, use the Hungarian algorithm to compute the matching result, associating the face detection box d_{i+1}^j in the (i+1)-th frame with the k-th tracking target;
    Step S45: map the indices in the matching result to the entries of the association matrix G, filter out all entries g_jk smaller than T_similarity and remove them from the matching result, where T_similarity is a preset hyperparameter denoting the minimum similarity threshold for a successful match;
    Step S46: in the matching result, if the detection box d_{i+1}^j is successfully associated with the k-th tracking target, update the position state P_k = d_{i+1}^j and the facial key point positions L_k = c_{i+1}^j in the corresponding tracker T_k, set the life cycle A_k = A_k + 1, and add the corresponding face feature f_{i+1}^j to the feature list E_k; if the detection box d_{i+1}^j fails to be associated, create a new tracker;
    Step S47: for each tracker T_k, if its life cycle A_k > T_age, delete the tracker, where T_age is a preset hyperparameter denoting the longest time a tracking target can survive.
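A hedged Python sketch of the association in steps S43–S45, assuming the overlap and cosine-similarity terms are blended as a convex combination weighted by λ (the claim only states that λ balances the two metrics) and using SciPy's Hungarian solver for the matching step; names and thresholds are illustrative.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def iou(box_a, box_b):
        """Overlap between two (x, y, w, h) boxes."""
        ax, ay, aw, ah = box_a
        bx, by, bw, bh = box_b
        ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
        iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
        inter = ix * iy
        return inter / (aw * ah + bw * bh - inter + 1e-9)

    def associate(det_boxes, det_feats, pred_boxes, track_feats, lam=0.5, t_sim=0.3):
        """Build the J x K association matrix G (step S43), solve it with the
        Hungarian algorithm (step S44), and drop pairs below T_similarity (step S45)."""
        J, K = len(det_boxes), len(pred_boxes)
        G = np.zeros((J, K))
        for j in range(J):
            for k in range(K):
                overlap = iou(det_boxes[j], pred_boxes[k])
                cos = float(np.dot(det_feats[j], track_feats[k]) /
                            (np.linalg.norm(det_feats[j]) * np.linalg.norm(track_feats[k]) + 1e-9))
                G[j, k] = lam * overlap + (1.0 - lam) * cos   # assumed blending of the two metrics
        rows, cols = linear_sum_assignment(-G)                # maximize total similarity
        return [(j, k) for j, k in zip(rows, cols) if G[j, k] >= t_sim]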
PCT/CN2019/124966 2019-02-02 2019-12-13 Deep apparent features and adaptive aggregation network-based multi-face tracking method WO2020155873A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910106309.1A CN109829436B (en) 2019-02-02 2019-02-02 Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network
CN201910106309.1 2019-02-02

Publications (1)

Publication Number Publication Date
WO2020155873A1 true WO2020155873A1 (en) 2020-08-06

Family

ID=66863393

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/124966 WO2020155873A1 (en) 2019-02-02 2019-12-13 Deep apparent features and adaptive aggregation network-based multi-face tracking method

Country Status (2)

Country Link
CN (1) CN109829436B (en)
WO (1) WO2020155873A1 (en)

Cited By (71)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111784746A (en) * 2020-08-10 2020-10-16 上海高重信息科技有限公司 Multi-target pedestrian tracking method and device under fisheye lens and computer system
CN111899284A (en) * 2020-08-14 2020-11-06 北京交通大学 Plane target tracking method based on parameterized ESM network
CN111932588A (en) * 2020-08-07 2020-11-13 浙江大学 Tracking method of airborne unmanned aerial vehicle multi-target tracking system based on deep learning
CN111932661A (en) * 2020-08-19 2020-11-13 上海交通大学 Facial expression editing system and method and terminal
CN112016440A (en) * 2020-08-26 2020-12-01 杭州云栖智慧视通科技有限公司 Target pushing method based on multi-target tracking
CN112036271A (en) * 2020-08-18 2020-12-04 汇纳科技股份有限公司 Pedestrian re-identification method, system, medium and terminal based on Kalman filtering
CN112053386A (en) * 2020-08-31 2020-12-08 西安电子科技大学 Target tracking method based on depth convolution characteristic self-adaptive integration
CN112085767A (en) * 2020-08-28 2020-12-15 安徽清新互联信息科技有限公司 Passenger flow statistical method and system based on deep optical flow tracking
CN112215155A (en) * 2020-10-13 2021-01-12 北京中电兴发科技有限公司 Face tracking method and system based on multi-feature fusion
CN112287877A (en) * 2020-11-18 2021-01-29 上海泗科智能科技有限公司 Multi-role close-up shot tracking method
CN112288773A (en) * 2020-10-19 2021-01-29 慧视江山科技(北京)有限公司 Multi-scale human body tracking method and device based on Soft-NMS
CN112541418A (en) * 2020-12-04 2021-03-23 北京百度网讯科技有限公司 Method, apparatus, device, medium, and program product for image processing
CN112560669A (en) * 2020-12-14 2021-03-26 杭州趣链科技有限公司 Face posture estimation method and device and electronic equipment
CN112560874A (en) * 2020-12-25 2021-03-26 北京百度网讯科技有限公司 Training method, device, equipment and medium for image recognition model
CN112597944A (en) * 2020-12-29 2021-04-02 北京市商汤科技开发有限公司 Key point detection method and device, electronic equipment and storage medium
CN112651994A (en) * 2020-12-18 2021-04-13 零八一电子集团有限公司 Ground multi-target tracking method
CN112669345A (en) * 2020-12-30 2021-04-16 中山大学 Cloud deployment-oriented multi-target track tracking method and system
CN112668432A (en) * 2020-12-22 2021-04-16 上海幻维数码创意科技股份有限公司 Human body detection tracking method in ground interactive projection system based on YoloV5 and Deepsort
CN112686175A (en) * 2020-12-31 2021-04-20 北京澎思科技有限公司 Face snapshot method, system and computer readable storage medium
CN113033439A (en) * 2021-03-31 2021-06-25 北京百度网讯科技有限公司 Method and device for data processing and electronic equipment
CN113076808A (en) * 2021-03-10 2021-07-06 青岛海纳云科技控股有限公司 Method for accurately acquiring bidirectional pedestrian flow through image algorithm
CN113096156A (en) * 2021-04-23 2021-07-09 中国科学技术大学 End-to-end real-time three-dimensional multi-target tracking method and device for automatic driving
CN113158788A (en) * 2021-03-12 2021-07-23 中国平安人寿保险股份有限公司 Facial expression recognition method and device, terminal equipment and storage medium
CN113158909A (en) * 2021-04-25 2021-07-23 中国科学院自动化研究所 Behavior identification lightweight method, system and equipment based on multi-target tracking
CN113158853A (en) * 2021-04-08 2021-07-23 浙江工业大学 Pedestrian's identification system that makes a dash across red light that combines people's face and human gesture
CN113192105A (en) * 2021-04-16 2021-07-30 嘉联支付有限公司 Method and device for tracking multiple persons and estimating postures indoors
CN113269098A (en) * 2021-05-27 2021-08-17 中国人民解放军军事科学院国防科技创新研究院 Multi-target tracking positioning and motion state estimation method based on unmanned aerial vehicle
CN113313201A (en) * 2021-06-21 2021-08-27 南京挥戈智能科技有限公司 Multi-target detection and distance measurement method based on Swin transducer and ZED camera
CN113377192A (en) * 2021-05-20 2021-09-10 广州紫为云科技有限公司 Motion sensing game tracking method and device based on deep learning
CN113379795A (en) * 2021-05-21 2021-09-10 浙江工业大学 Multi-target tracking and segmenting method based on conditional convolution and optical flow characteristics
CN113408348A (en) * 2021-05-14 2021-09-17 桂林电子科技大学 Video-based face recognition method and device and storage medium
CN113487653A (en) * 2021-06-24 2021-10-08 之江实验室 Adaptive graph tracking method based on track prediction
CN113486771A (en) * 2021-06-30 2021-10-08 福州大学 Video motion uniformity evaluation method and system based on key point detection
CN113658223A (en) * 2021-08-11 2021-11-16 山东建筑大学 Multi-pedestrian detection and tracking method and system based on deep learning
CN113688740A (en) * 2021-08-26 2021-11-23 燕山大学 Indoor posture detection method based on multi-sensor fusion vision
CN113723279A (en) * 2021-08-30 2021-11-30 东南大学 Multi-target tracking acceleration method based on time-space optimization in edge computing environment
CN113724291A (en) * 2021-07-29 2021-11-30 西安交通大学 Multi-panda tracking method, system, terminal equipment and readable storage medium
CN113723361A (en) * 2021-09-18 2021-11-30 西安邮电大学 Video monitoring method and device based on deep learning
CN113762013A (en) * 2020-12-02 2021-12-07 北京沃东天骏信息技术有限公司 Method and device for face recognition
CN113807187A (en) * 2021-08-20 2021-12-17 北京工业大学 Unmanned aerial vehicle video multi-target tracking method based on attention feature fusion
CN113808170A (en) * 2021-09-24 2021-12-17 电子科技大学长三角研究院(湖州) Anti-unmanned aerial vehicle tracking method based on deep learning
CN113850843A (en) * 2021-09-27 2021-12-28 联想(北京)有限公司 Target tracking method and device, electronic equipment and storage medium
CN113920457A (en) * 2021-09-16 2022-01-11 中国农业科学院农业资源与农业区划研究所 Fruit yield estimation method and system based on space and ground information acquisition cooperative processing
CN113936312A (en) * 2021-10-12 2022-01-14 南京视察者智能科技有限公司 Face recognition base screening method based on deep learning graph convolution network
CN114022509A (en) * 2021-09-24 2022-02-08 北京邮电大学 Target tracking method based on monitoring videos of multiple animals and related equipment
CN114120188A (en) * 2021-11-19 2022-03-01 武汉大学 Multi-pedestrian tracking method based on joint global and local features
CN114332909A (en) * 2021-11-16 2022-04-12 南京行者易智能交通科技有限公司 Binocular pedestrian identification method and device under monitoring scene
CN114339398A (en) * 2021-12-24 2022-04-12 天翼视讯传媒有限公司 Method for real-time special effect processing in large-scale video live broadcast
CN114419151A (en) * 2021-12-31 2022-04-29 福州大学 Multi-target tracking method based on contrast learning
CN114529577A (en) * 2022-01-10 2022-05-24 燕山大学 Multi-target tracking method for road side visual angles
CN114627339A (en) * 2021-11-09 2022-06-14 昆明物理研究所 Intelligent recognition and tracking method for border crossing personnel in dense jungle area and storage medium
CN114639129A (en) * 2020-11-30 2022-06-17 北京君正集成电路股份有限公司 Paper medium living body detection method for access control system
CN114663796A (en) * 2022-01-04 2022-06-24 北京航空航天大学 Target person continuous tracking method, device and system
CN114783043A (en) * 2022-06-24 2022-07-22 杭州安果儿智能科技有限公司 Child behavior track positioning method and system
CN114821702A (en) * 2022-03-15 2022-07-29 电子科技大学 Thermal infrared face recognition method based on face shielding
CN114863539A (en) * 2022-06-09 2022-08-05 福州大学 Portrait key point detection method and system based on feature fusion
CN114898458A (en) * 2022-04-15 2022-08-12 中国兵器装备集团自动化研究所有限公司 Factory floor number monitoring method, system, terminal and medium based on image processing
CN114943924A (en) * 2022-06-21 2022-08-26 深圳大学 Pain assessment method, system, device and medium based on facial expression video
CN114972426A (en) * 2022-05-18 2022-08-30 北京理工大学 Single-target tracking method based on attention and convolution
CN115272404A (en) * 2022-06-17 2022-11-01 江南大学 Multi-target tracking method based on nuclear space and implicit space feature alignment
CN115690545A (en) * 2021-12-03 2023-02-03 北京百度网讯科技有限公司 Training target tracking model and target tracking method and device
CN115994929A (en) * 2023-03-24 2023-04-21 中国兵器科学研究院 Multi-target tracking method integrating space motion and apparent feature learning
CN116596958A (en) * 2023-07-18 2023-08-15 四川迪晟新达类脑智能技术有限公司 Target tracking method and device based on online sample augmentation
CN117011335A (en) * 2023-07-26 2023-11-07 山东大学 Multi-target tracking method and system based on self-adaptive double decoders
CN117455955A (en) * 2023-12-14 2024-01-26 武汉纺织大学 Pedestrian multi-target tracking method based on unmanned aerial vehicle visual angle
CN117576166A (en) * 2024-01-15 2024-02-20 浙江华是科技股份有限公司 Target tracking method and system based on camera and low-frame-rate laser radar
CN117809054A (en) * 2024-02-29 2024-04-02 南京邮电大学 Multi-target tracking method based on feature decoupling fusion network
CN118072000A (en) * 2024-04-17 2024-05-24 中国科学院合肥物质科学研究院 Fish detection method based on novel target recognition algorithm
CN118379608A (en) * 2024-06-26 2024-07-23 浙江大学 High-robustness deep forgery detection method based on self-adaptive learning
CN118522058A (en) * 2024-07-22 2024-08-20 中电桑达电子设备(江苏)有限公司 Object tracking method, system and medium based on face recognition
CN114863539B (en) * 2022-06-09 2024-09-24 福州大学 Portrait key point detection method and system based on feature fusion

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109829436B (en) * 2019-02-02 2022-05-13 福州大学 Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network
TWI727337B (en) * 2019-06-06 2021-05-11 大陸商鴻富錦精密工業(武漢)有限公司 Electronic device and face recognition method
CN110490901A (en) * 2019-07-15 2019-11-22 武汉大学 The pedestrian detection tracking of anti-attitudes vibration
CN110414443A (en) * 2019-07-31 2019-11-05 苏州市科远软件技术开发有限公司 A kind of method for tracking target, device and rifle ball link tracking
CN110705478A (en) * 2019-09-30 2020-01-17 腾讯科技(深圳)有限公司 Face tracking method, device, equipment and storage medium
CN111078295B (en) * 2019-11-28 2021-11-12 核芯互联科技(青岛)有限公司 Mixed branch prediction device and method for out-of-order high-performance core
CN111160202B (en) * 2019-12-20 2023-09-05 万翼科技有限公司 Identity verification method, device, equipment and storage medium based on AR equipment
CN111079718A (en) * 2020-01-15 2020-04-28 中云智慧(北京)科技有限公司 Quick face comparison method
CN111275741B (en) * 2020-01-19 2023-09-08 北京迈格威科技有限公司 Target tracking method, device, computer equipment and storage medium
CN111325279B (en) * 2020-02-26 2022-06-10 福州大学 Pedestrian and personal sensitive article tracking method fusing visual relationship
CN111476826A (en) * 2020-04-10 2020-07-31 电子科技大学 Multi-target vehicle tracking method based on SSD target detection
CN111770299B (en) * 2020-04-20 2022-04-19 厦门亿联网络技术股份有限公司 Method and system for real-time face abstract service of intelligent video conference terminal
CN111553234B (en) * 2020-04-22 2023-06-06 上海锘科智能科技有限公司 Pedestrian tracking method and device integrating facial features and Re-ID feature ordering
CN111914613B (en) * 2020-05-21 2024-03-01 淮阴工学院 Multi-target tracking and facial feature information recognition method
CN112001225B (en) * 2020-07-06 2023-06-23 西安电子科技大学 Online multi-target tracking method, system and application
CN112215873A (en) * 2020-08-27 2021-01-12 国网浙江省电力有限公司电力科学研究院 Method for tracking and positioning multiple targets in transformer substation
CN112257502A (en) * 2020-09-16 2021-01-22 深圳微步信息股份有限公司 Pedestrian identification and tracking method and device for surveillance video and storage medium
CN112149557B (en) * 2020-09-22 2022-08-09 福州大学 Person identity tracking method and system based on face recognition
CN112307234A (en) * 2020-11-03 2021-02-02 厦门兆慧网络科技有限公司 Face bottom library synthesis method, system, device and storage medium
CN112597901B (en) * 2020-12-23 2023-12-29 艾体威尔电子技术(北京)有限公司 Device and method for effectively recognizing human face in multiple human face scenes based on three-dimensional ranging
CN112653844A (en) * 2020-12-28 2021-04-13 珠海亿智电子科技有限公司 Camera holder steering self-adaptive tracking adjustment method
CN112581506A (en) * 2020-12-31 2021-03-30 北京澎思科技有限公司 Face tracking method, system and computer readable storage medium
CN112784725B (en) * 2021-01-15 2024-06-07 北京航天自动控制研究所 Pedestrian anti-collision early warning method, device, storage medium and stacker
CN113822211B (en) * 2021-09-27 2023-04-11 山东睿思奥图智能科技有限公司 Interactive person information acquisition method
CN115214430B (en) * 2022-03-23 2023-11-17 广州汽车集团股份有限公司 Vehicle seat adjusting method and vehicle
WO2023184197A1 (en) * 2022-03-30 2023-10-05 京东方科技集团股份有限公司 Target tracking method and apparatus, system, and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101216885A (en) * 2008-01-04 2008-07-09 中山大学 Passerby face detection and tracing algorithm based on video
CN101777116B (en) * 2009-12-23 2012-07-25 中国科学院自动化研究所 Method for analyzing facial expressions on basis of motion tracking
US10902243B2 (en) * 2016-10-25 2021-01-26 Deep North, Inc. Vision based target tracking that distinguishes facial feature targets
CN106845385A (en) * 2017-01-17 2017-06-13 腾讯科技(上海)有限公司 The method and apparatus of video frequency object tracking
CN107292911B (en) * 2017-05-23 2021-03-30 南京邮电大学 Multi-target tracking method based on multi-model fusion and data association
CN107492116A (en) * 2017-09-01 2017-12-19 深圳市唯特视科技有限公司 A kind of method that face tracking is carried out based on more display models
CN107609512A (en) * 2017-09-12 2018-01-19 上海敏识网络科技有限公司 A kind of video human face method for catching based on neutral net
CN108509859B (en) * 2018-03-09 2022-08-26 南京邮电大学 Non-overlapping area pedestrian tracking method based on deep neural network

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110051999A1 (en) * 2007-08-31 2011-03-03 Lockheed Martin Corporation Device and method for detecting targets in images based on user-defined classifiers
CN108363997A (en) * 2018-03-20 2018-08-03 南京云思创智信息科技有限公司 It is a kind of in video to the method for real time tracking of particular person
CN109101915A (en) * 2018-08-01 2018-12-28 中国计量大学 Face and pedestrian and Attribute Recognition network structure design method based on deep learning
CN109086724A (en) * 2018-08-09 2018-12-25 北京华捷艾米科技有限公司 A kind of method for detecting human face and storage medium of acceleration
CN109829436A (en) * 2019-02-02 2019-05-31 福州大学 Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network

Cited By (112)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111932588A (en) * 2020-08-07 2020-11-13 浙江大学 Tracking method of airborne unmanned aerial vehicle multi-target tracking system based on deep learning
CN111932588B (en) * 2020-08-07 2024-01-30 浙江大学 Tracking method of airborne unmanned aerial vehicle multi-target tracking system based on deep learning
CN111784746A (en) * 2020-08-10 2020-10-16 上海高重信息科技有限公司 Multi-target pedestrian tracking method and device under fisheye lens and computer system
CN111784746B (en) * 2020-08-10 2024-05-03 青岛高重信息科技有限公司 Multi-target pedestrian tracking method and device under fish-eye lens and computer system
CN111899284A (en) * 2020-08-14 2020-11-06 北京交通大学 Plane target tracking method based on parameterized ESM network
CN111899284B (en) * 2020-08-14 2024-04-09 北京交通大学 Planar target tracking method based on parameterized ESM network
CN112036271A (en) * 2020-08-18 2020-12-04 汇纳科技股份有限公司 Pedestrian re-identification method, system, medium and terminal based on Kalman filtering
CN112036271B (en) * 2020-08-18 2023-10-10 汇纳科技股份有限公司 Pedestrian re-identification method, system, medium and terminal based on Kalman filtering
CN111932661A (en) * 2020-08-19 2020-11-13 上海交通大学 Facial expression editing system and method and terminal
CN111932661B (en) * 2020-08-19 2023-10-24 上海艾麒信息科技股份有限公司 Facial expression editing system and method and terminal
CN112016440A (en) * 2020-08-26 2020-12-01 杭州云栖智慧视通科技有限公司 Target pushing method based on multi-target tracking
CN112016440B (en) * 2020-08-26 2024-02-20 杭州云栖智慧视通科技有限公司 Target pushing method based on multi-target tracking
CN112085767A (en) * 2020-08-28 2020-12-15 安徽清新互联信息科技有限公司 Passenger flow statistical method and system based on deep optical flow tracking
CN112053386B (en) * 2020-08-31 2023-04-18 西安电子科技大学 Target tracking method based on depth convolution characteristic self-adaptive integration
CN112053386A (en) * 2020-08-31 2020-12-08 西安电子科技大学 Target tracking method based on depth convolution characteristic self-adaptive integration
CN112215155B (en) * 2020-10-13 2022-10-14 北京中电兴发科技有限公司 Face tracking method and system based on multi-feature fusion
CN112215155A (en) * 2020-10-13 2021-01-12 北京中电兴发科技有限公司 Face tracking method and system based on multi-feature fusion
CN112288773A (en) * 2020-10-19 2021-01-29 慧视江山科技(北京)有限公司 Multi-scale human body tracking method and device based on Soft-NMS
CN112287877A (en) * 2020-11-18 2021-01-29 上海泗科智能科技有限公司 Multi-role close-up shot tracking method
CN114639129A (en) * 2020-11-30 2022-06-17 北京君正集成电路股份有限公司 Paper medium living body detection method for access control system
CN114639129B (en) * 2020-11-30 2024-05-03 北京君正集成电路股份有限公司 Paper medium living body detection method for access control system
CN113762013A (en) * 2020-12-02 2021-12-07 北京沃东天骏信息技术有限公司 Method and device for face recognition
CN112541418B (en) * 2020-12-04 2024-05-28 北京百度网讯科技有限公司 Method, apparatus, device, medium and program product for image processing
CN112541418A (en) * 2020-12-04 2021-03-23 北京百度网讯科技有限公司 Method, apparatus, device, medium, and program product for image processing
CN112560669A (en) * 2020-12-14 2021-03-26 杭州趣链科技有限公司 Face posture estimation method and device and electronic equipment
CN112651994A (en) * 2020-12-18 2021-04-13 零八一电子集团有限公司 Ground multi-target tracking method
CN112668432A (en) * 2020-12-22 2021-04-16 上海幻维数码创意科技股份有限公司 Human body detection tracking method in ground interactive projection system based on YoloV5 and Deepsort
CN112560874A (en) * 2020-12-25 2021-03-26 北京百度网讯科技有限公司 Training method, device, equipment and medium for image recognition model
CN112560874B (en) * 2020-12-25 2024-04-16 北京百度网讯科技有限公司 Training method, device, equipment and medium for image recognition model
CN112597944A (en) * 2020-12-29 2021-04-02 北京市商汤科技开发有限公司 Key point detection method and device, electronic equipment and storage medium
CN112597944B (en) * 2020-12-29 2024-06-11 北京市商汤科技开发有限公司 Key point detection method and device, electronic equipment and storage medium
CN112669345A (en) * 2020-12-30 2021-04-16 中山大学 Cloud deployment-oriented multi-target track tracking method and system
CN112669345B (en) * 2020-12-30 2023-10-20 中山大学 Cloud deployment-oriented multi-target track tracking method and system
CN112686175A (en) * 2020-12-31 2021-04-20 北京澎思科技有限公司 Face snapshot method, system and computer readable storage medium
CN113076808B (en) * 2021-03-10 2023-05-26 海纳云物联科技有限公司 Method for accurately acquiring bidirectional traffic flow through image algorithm
CN113076808A (en) * 2021-03-10 2021-07-06 青岛海纳云科技控股有限公司 Method for accurately acquiring bidirectional pedestrian flow through image algorithm
CN113158788A (en) * 2021-03-12 2021-07-23 中国平安人寿保险股份有限公司 Facial expression recognition method and device, terminal equipment and storage medium
CN113158788B (en) * 2021-03-12 2024-03-08 中国平安人寿保险股份有限公司 Facial expression recognition method and device, terminal equipment and storage medium
CN113033439B (en) * 2021-03-31 2023-10-20 北京百度网讯科技有限公司 Method and device for data processing and electronic equipment
CN113033439A (en) * 2021-03-31 2021-06-25 北京百度网讯科技有限公司 Method and device for data processing and electronic equipment
CN113158853A (en) * 2021-04-08 2021-07-23 浙江工业大学 Pedestrian's identification system that makes a dash across red light that combines people's face and human gesture
CN113192105B (en) * 2021-04-16 2023-10-17 嘉联支付有限公司 Method and device for indoor multi-person tracking and attitude measurement
CN113192105A (en) * 2021-04-16 2021-07-30 嘉联支付有限公司 Method and device for tracking multiple persons and estimating postures indoors
CN113096156A (en) * 2021-04-23 2021-07-09 中国科学技术大学 End-to-end real-time three-dimensional multi-target tracking method and device for automatic driving
CN113096156B (en) * 2021-04-23 2024-05-24 中国科学技术大学 Automatic driving-oriented end-to-end real-time three-dimensional multi-target tracking method and device
CN113158909A (en) * 2021-04-25 2021-07-23 中国科学院自动化研究所 Behavior identification lightweight method, system and equipment based on multi-target tracking
CN113408348A (en) * 2021-05-14 2021-09-17 桂林电子科技大学 Video-based face recognition method and device and storage medium
CN113408348B (en) * 2021-05-14 2022-08-19 桂林电子科技大学 Video-based face recognition method and device and storage medium
CN113377192B (en) * 2021-05-20 2023-06-20 广州紫为云科技有限公司 Somatosensory game tracking method and device based on deep learning
CN113377192A (en) * 2021-05-20 2021-09-10 广州紫为云科技有限公司 Motion sensing game tracking method and device based on deep learning
CN113379795A (en) * 2021-05-21 2021-09-10 浙江工业大学 Multi-target tracking and segmenting method based on conditional convolution and optical flow characteristics
CN113379795B (en) * 2021-05-21 2024-03-22 浙江工业大学 Multi-target tracking and segmentation method based on conditional convolution and optical flow characteristics
CN113269098B (en) * 2021-05-27 2023-06-16 中国人民解放军军事科学院国防科技创新研究院 Multi-target tracking positioning and motion state estimation method based on unmanned aerial vehicle
CN113269098A (en) * 2021-05-27 2021-08-17 中国人民解放军军事科学院国防科技创新研究院 Multi-target tracking positioning and motion state estimation method based on unmanned aerial vehicle
CN113313201A (en) * 2021-06-21 2021-08-27 南京挥戈智能科技有限公司 Multi-target detection and distance measurement method based on Swin transducer and ZED camera
CN113487653B (en) * 2021-06-24 2024-03-26 之江实验室 Self-adaptive graph tracking method based on track prediction
CN113487653A (en) * 2021-06-24 2021-10-08 之江实验室 Adaptive graph tracking method based on track prediction
CN113486771A (en) * 2021-06-30 2021-10-08 福州大学 Video motion uniformity evaluation method and system based on key point detection
CN113486771B (en) * 2021-06-30 2023-07-07 福州大学 Video action uniformity evaluation method and system based on key point detection
CN113724291A (en) * 2021-07-29 2021-11-30 西安交通大学 Multi-panda tracking method, system, terminal equipment and readable storage medium
CN113724291B (en) * 2021-07-29 2024-04-02 西安交通大学 Multi-panda tracking method, system, terminal device and readable storage medium
CN113658223A (en) * 2021-08-11 2021-11-16 山东建筑大学 Multi-pedestrian detection and tracking method and system based on deep learning
CN113658223B (en) * 2021-08-11 2023-08-04 山东建筑大学 Multi-row person detection and tracking method and system based on deep learning
CN113807187B (en) * 2021-08-20 2024-04-02 北京工业大学 Unmanned aerial vehicle video multi-target tracking method based on attention feature fusion
CN113807187A (en) * 2021-08-20 2021-12-17 北京工业大学 Unmanned aerial vehicle video multi-target tracking method based on attention feature fusion
CN113688740B (en) * 2021-08-26 2024-02-27 燕山大学 Indoor gesture detection method based on multi-sensor fusion vision
CN113688740A (en) * 2021-08-26 2021-11-23 燕山大学 Indoor posture detection method based on multi-sensor fusion vision
CN113723279B (en) * 2021-08-30 2022-11-01 东南大学 Multi-target tracking acceleration method based on time-space optimization in edge computing environment
CN113723279A (en) * 2021-08-30 2021-11-30 东南大学 Multi-target tracking acceleration method based on time-space optimization in edge computing environment
CN113920457A (en) * 2021-09-16 2022-01-11 中国农业科学院农业资源与农业区划研究所 Fruit yield estimation method and system based on space and ground information acquisition cooperative processing
CN113723361A (en) * 2021-09-18 2021-11-30 西安邮电大学 Video monitoring method and device based on deep learning
CN113808170B (en) * 2021-09-24 2023-06-27 电子科技大学长三角研究院(湖州) Anti-unmanned aerial vehicle tracking method based on deep learning
CN113808170A (en) * 2021-09-24 2021-12-17 电子科技大学长三角研究院(湖州) Anti-unmanned aerial vehicle tracking method based on deep learning
CN114022509A (en) * 2021-09-24 2022-02-08 北京邮电大学 Target tracking method based on monitoring videos of multiple animals and related equipment
CN113850843A (en) * 2021-09-27 2021-12-28 联想(北京)有限公司 Target tracking method and device, electronic equipment and storage medium
CN113936312A (en) * 2021-10-12 2022-01-14 南京视察者智能科技有限公司 Face recognition base screening method based on deep learning graph convolution network
CN113936312B (en) * 2021-10-12 2024-06-07 南京视察者智能科技有限公司 Face recognition base screening method based on deep learning graph convolution network
CN114627339A (en) * 2021-11-09 2022-06-14 昆明物理研究所 Intelligent recognition and tracking method for border crossing personnel in dense jungle area and storage medium
CN114627339B (en) * 2021-11-09 2024-03-29 昆明物理研究所 Intelligent recognition tracking method and storage medium for cross border personnel in dense jungle area
CN114332909A (en) * 2021-11-16 2022-04-12 南京行者易智能交通科技有限公司 Binocular pedestrian identification method and device under monitoring scene
CN114120188B (en) * 2021-11-19 2024-04-05 武汉大学 Multi-row person tracking method based on joint global and local features
CN114120188A (en) * 2021-11-19 2022-03-01 武汉大学 Multi-pedestrian tracking method based on joint global and local features
CN115690545A (en) * 2021-12-03 2023-02-03 北京百度网讯科技有限公司 Training target tracking model and target tracking method and device
CN115690545B (en) * 2021-12-03 2024-06-11 北京百度网讯科技有限公司 Method and device for training target tracking model and target tracking
CN114339398A (en) * 2021-12-24 2022-04-12 天翼视讯传媒有限公司 Method for real-time special effect processing in large-scale video live broadcast
CN114419151A (en) * 2021-12-31 2022-04-29 福州大学 Multi-target tracking method based on contrast learning
CN114663796A (en) * 2022-01-04 2022-06-24 北京航空航天大学 Target person continuous tracking method, device and system
CN114529577A (en) * 2022-01-10 2022-05-24 燕山大学 Multi-target tracking method for road side visual angles
CN114821702A (en) * 2022-03-15 2022-07-29 电子科技大学 Thermal infrared face recognition method based on face shielding
CN114898458A (en) * 2022-04-15 2022-08-12 中国兵器装备集团自动化研究所有限公司 Factory floor number monitoring method, system, terminal and medium based on image processing
CN114972426A (en) * 2022-05-18 2022-08-30 北京理工大学 Single-target tracking method based on attention and convolution
CN114863539B (en) * 2022-06-09 2024-09-24 福州大学 Portrait key point detection method and system based on feature fusion
CN114863539A (en) * 2022-06-09 2022-08-05 福州大学 Portrait key point detection method and system based on feature fusion
CN115272404A (en) * 2022-06-17 2022-11-01 江南大学 Multi-target tracking method based on nuclear space and implicit space feature alignment
CN114943924A (en) * 2022-06-21 2022-08-26 深圳大学 Pain assessment method, system, device and medium based on facial expression video
CN114943924B (en) * 2022-06-21 2024-05-14 深圳大学 Pain assessment method, system, equipment and medium based on facial expression video
CN114783043A (en) * 2022-06-24 2022-07-22 杭州安果儿智能科技有限公司 Child behavior track positioning method and system
CN115994929A (en) * 2023-03-24 2023-04-21 中国兵器科学研究院 Multi-target tracking method integrating space motion and apparent feature learning
CN116596958A (en) * 2023-07-18 2023-08-15 四川迪晟新达类脑智能技术有限公司 Target tracking method and device based on online sample augmentation
CN116596958B (en) * 2023-07-18 2023-10-10 四川迪晟新达类脑智能技术有限公司 Target tracking method and device based on online sample augmentation
CN117011335B (en) * 2023-07-26 2024-04-09 山东大学 Multi-target tracking method and system based on self-adaptive double decoders
CN117011335A (en) * 2023-07-26 2023-11-07 山东大学 Multi-target tracking method and system based on self-adaptive double decoders
CN117455955B (en) * 2023-12-14 2024-03-08 武汉纺织大学 Pedestrian multi-target tracking method based on unmanned aerial vehicle visual angle
CN117455955A (en) * 2023-12-14 2024-01-26 武汉纺织大学 Pedestrian multi-target tracking method based on unmanned aerial vehicle visual angle
CN117576166B (en) * 2024-01-15 2024-04-30 浙江华是科技股份有限公司 Target tracking method and system based on camera and low-frame-rate laser radar
CN117576166A (en) * 2024-01-15 2024-02-20 浙江华是科技股份有限公司 Target tracking method and system based on camera and low-frame-rate laser radar
CN117809054B (en) * 2024-02-29 2024-05-10 南京邮电大学 Multi-target tracking method based on feature decoupling fusion network
CN117809054A (en) * 2024-02-29 2024-04-02 南京邮电大学 Multi-target tracking method based on feature decoupling fusion network
CN118072000A (en) * 2024-04-17 2024-05-24 中国科学院合肥物质科学研究院 Fish detection method based on novel target recognition algorithm
CN118379608A (en) * 2024-06-26 2024-07-23 浙江大学 High-robustness deep forgery detection method based on self-adaptive learning
CN118522058A (en) * 2024-07-22 2024-08-20 中电桑达电子设备(江苏)有限公司 Object tracking method, system and medium based on face recognition
CN118522058B (en) * 2024-07-22 2024-09-17 中电桑达电子设备(江苏)有限公司 Object tracking method, system and medium based on face recognition

Also Published As

Publication number Publication date
CN109829436A (en) 2019-05-31
CN109829436B (en) 2022-05-13

Similar Documents

Publication Publication Date Title
WO2020155873A1 (en) Deep apparent features and adaptive aggregation network-based multi-face tracking method
CN110472554B (en) Table tennis action recognition method and system based on attitude segmentation and key point features
Kudo et al. Unsupervised adversarial learning of 3d human pose from 2d joint locations
Liu et al. Human pose estimation in video via structured space learning and halfway temporal evaluation
Arif et al. Automated body parts estimation and detection using salient maps and Gaussian matrix model
CN105574510A (en) Gait identification method and device
CN110135249A (en) Human bodys' response method based on time attention mechanism and LSTM
CN109325440A (en) Human motion recognition method and system
CN112149557B (en) Person identity tracking method and system based on face recognition
Shah et al. Multi-view action recognition using contrastive learning
CN111931654A (en) Intelligent monitoring method, system and device for personnel tracking
CN113111857A (en) Human body posture estimation method based on multi-mode information fusion
Abobakr et al. Body joints regression using deep convolutional neural networks
Mu et al. Resgait: The real-scene gait dataset
CN112906520A (en) Gesture coding-based action recognition method and device
Batool et al. Telemonitoring of daily activities based on multi-sensors data fusion
Raychaudhuri et al. Prior-guided source-free domain adaptation for human pose estimation
Pang et al. Analysis of computer vision applied in martial arts
Yaseen et al. A novel approach based on multi-level bottleneck attention modules using self-guided dropblock for person re-identification
Yu Deep learning methods for human action recognition
CN112487926A (en) Scenic spot feeding behavior identification method based on space-time diagram convolutional network
Li et al. Real-time human action recognition using depth motion maps and convolutional neural networks
Wang et al. Thermal infrared object tracking based on adaptive feature fusion
Su et al. Dynamic facial expression recognition using autoregressive models
Caetano et al. Magnitude-Orientation Stream network and depth information applied to activity recognition

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19913037

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19913037

Country of ref document: EP

Kind code of ref document: A1