CN109948560B - Mobile robot target tracking system fusing skeleton recognition and IFace-TLD


Info

Publication number: CN109948560B
Application number: CN201910227611.2A
Authority: CN (China)
Prior art keywords: target, skeleton, point, center, tracking
Legal status: Expired - Fee Related
Other languages: Chinese (zh)
Other versions: CN109948560A (application publication)
Inventors: 苑晶, 蔡晶鑫, 高远兮
Current and original assignee: Nankai University
Application filed by Nankai University; priority to CN201910227611.2A; publication of application CN109948560A; application granted; publication of grant CN109948560B


Landscapes: Image Analysis (AREA)

Abstract

A mobile robot target tracking system fusing skeleton recognition and IFace-TLD. A Kinect sensor acquires an original color image of the human body and a skeleton image of the upper limbs. An IFace-TLD unit tracks and locates the target on the color image, and a skeleton recognition unit tracks and locates the target on the skeleton image; either unit produces the bounding box of the region containing the target, which is sent to an image target positioning unit. The image target positioning unit marks the target region on the original color image according to the received bounding box and feeds the target region back to the IFace-TLD unit. The invention effectively solves the short-sequence tracking problem and tracks the target face well regardless of the length of the tracking sequence. It achieves a stable recognition effect even when the face is turned away from the camera, realizes online skeleton-based recognition, and improves tracking accuracy and robustness.

Description

Mobile robot target tracking system fusing skeleton recognition and IFace-TLD

Technical Field

The present invention relates to a mobile robot target tracking system, and more particularly to a mobile robot target tracking system that fuses skeleton recognition with IFace-TLD.

Background Art

Target tracking is widely used in security, robotics, human-computer interaction and other fields. In practice, robust and efficient tracking is highly challenging because of fast target motion, illumination changes, occlusion and similar factors.

The human face is highly distinctive, so to obtain good tracking performance we choose to track the target's face. The face-based tracking-learning-detection algorithm (Face-TLD) can track a face over long periods. However, because it is a long-term tracking algorithm, its performance degrades when the tracking sequence is short: the learning component of Face-TLD is insufficiently trained, tracking quality drops, and large drift can even occur. Moreover, in real application scenarios the rotation of the head is largely random, and the face cannot be guaranteed to face the camera at every moment; in some cases the target may turn completely away from the camera, at which point any tracking algorithm based on image appearance fails.

The human face has unique biometric characteristics and has been used in many applications. However, most high-accuracy face recognition algorithms are time-consuming, and such algorithms cannot be applied to mobile robot target tracking, which has strict real-time requirements.

Built on the TLD algorithm, Face-TLD can track a face robustly over long periods. The original TLD is a single-target tracking algorithm that can follow an unknown object in a video stream for a long time; it can be divided into three parts, namely a tracking part, a learning part and a detection part. Because it tracks well, many improvements have been built on it, and Face-TLD is one of them: it combines face detection with TLD to achieve long-term face tracking. In the original Face-TLD, the detector has two parts, a face detection part and a verifier part. The face detection part processes all image patches, and the verifier outputs a confidence coefficient for patches containing the specific face. However, when the tracking sequence is short, Face-TLD cannot reach a satisfactory tracking result because the learning part is insufficiently trained. Specifically, the learning part is introduced to handle various uncertainties, but to guarantee accuracy it needs a sufficiently large amount of training data, and training is a time-consuming process; a short sequence cannot provide enough images to train the learning part. Worse, in the early stage of tracking, if the targets look similar to one another, the original Face-TLD is likely to lose the target.
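The three-part TLD loop described above can be sketched as follows. This is a minimal illustration of the data flow only; the components are stubbed out and all function and class names are invented, not the algorithm's actual implementation:

```python
# Minimal sketch of the TLD (tracking-learning-detection) control loop.
# Boxes are (position, confidence) pairs; the integrator keeps the most
# confident hypothesis and the learner is updated only on confident frames.

class Learner:
    """Stub learning part: only counts P-N style updates."""
    def __init__(self):
        self.updates = 0

    def update(self, frame, box):
        self.updates += 1


def run_tld(frames, init_box, tracker, detector, learner):
    """Track one target through a sequence of frames."""
    box = init_box
    trajectory = [box]
    for frame in frames:
        tracked = tracker(frame, box)      # motion-based estimate (may be None)
        detections = detector(frame)       # appearance-based candidates
        candidates = [c for c in ([tracked] + detections) if c is not None]
        if candidates:
            # Integrator: pick the hypothesis with the highest confidence.
            box = max(candidates, key=lambda c: c[1])
            learner.update(frame, box)     # learning part sees confident results
        else:
            box = None                     # target lost in this frame
        trajectory.append(box)
    return trajectory
```

A short sequence with a stubbed optical-flow tracker and an empty detector already exercises the loop's routing.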

The Microsoft Kinect sensor can directly capture human skeleton information and is fairly robust to human motion; even when the body is turned completely away from the Kinect, reasonably stable skeletons can still be captured. Thanks to these sensor advantages, skeleton-based human recognition is robust to changes in illumination, motion and target appearance. However, existing skeleton-based human recognition algorithms all collect a certain amount of data and then process it offline, and they require manually set training labels, making online recognition difficult. This clearly cannot meet the online tracking requirements of a mobile robot.

Summary of the Invention

The technical problem to be solved by the present invention is to provide a mobile robot target tracking system fusing skeleton recognition and IFace-TLD that can track a target face well and improve tracking accuracy and robustness.

The technical solution adopted by the present invention is a mobile robot target tracking system fusing skeleton recognition and IFace-TLD. A Kinect sensor acquires an original color image of the human body and a skeleton image of the upper limbs. An IFace-TLD unit tracks and locates the target on the color image, and a skeleton recognition unit tracks and locates the target on the skeleton image; the resulting bounding box of the region containing the target is sent to an image target positioning unit, which marks the target region on the original color image according to the received bounding box and feeds the target region back to the IFace-TLD unit.

The IFace-TLD unit comprises a tracking part, a learning part, a detection part and an integrator, each of which receives the original color image. The tracking part uses an optical-flow tracker to estimate the motion of the target between two adjacent frames of the color image and sends the estimate to the learning part and the integrator. For the first frame, the detection part independently scans and processes all image patches, separates the target face from the background, and sends the target face to the learning part and the integrator; for every frame after the first, it scans and processes only the target region fed back by the image target positioning unit and its surroundings, again separating the target face from the background and sending it to the learning part and the integrator. The integrator computes, from the inter-frame motion estimate and the detected target face, the confidence coefficient of the position in the color image most likely to contain the target, and sends the result to the learning part and to either the skeleton recognition unit or the image target positioning unit. The learning part is trained on the original color image together with the results obtained from the tracking part, the detection part and the integrator, and according to the training results it updates and corrects errors made by the tracking part and the detection part.

The detection part comprises a face detection part that detects the face regions in the original color image using the acquired color image, the target region from the learning part and the target-region information fed back by the image target positioning unit; a face recognition part that identifies the target face region among the face regions obtained from the face detection part; and a verifier part that judges whether the target face region identified by the face recognition part is correct. The verification result is sent to both the learning part and the integrator.

The skeleton recognition unit comprises a motion cycle extraction part, a skeleton feature extraction part and a support vector data description (SVDD) part. The motion cycle extraction part computes the motion cycle of the human body from the acquired skeleton images, and the skeleton feature extraction part computes skeleton features within that motion cycle. When the integrator in the IFace-TLD unit outputs a bounding box for the target, the skeleton features are sent to the training part of the SVDD part for training; when the integrator's output is empty, the skeleton features are sent to the prediction part of the SVDD part, which predicts the target's bounding box from the training results and sends the predicted box to the image target positioning unit.
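The switching between training and prediction around the SVDD can be sketched as follows. For a dependency-free illustration, a toy centroid-plus-radius data description stands in for a real SVDD (they are related one-class models, but this is not the patented classifier); all class and function names here are invented:

```python
# Sketch of the train/predict routing in the skeleton branch.
import math

class SimpleDataDescription:
    """Toy one-class model: accept a sample if it lies within the largest
    training distance from the training centroid (a stand-in for SVDD)."""
    def __init__(self):
        self.samples = []
        self.center = None
        self.radius = 0.0

    def train(self, feature):
        self.samples.append(feature)
        n, d = len(self.samples), len(feature)
        self.center = [sum(s[i] for s in self.samples) / n for i in range(d)]
        self.radius = max(self._dist(s) for s in self.samples)

    def _dist(self, f):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, self.center)))

    def is_target(self, feature, slack=1.2):
        # Prediction only makes sense once some training has happened.
        return self.center is not None and self._dist(feature) <= slack * self.radius


def skeleton_branch(model, feature, integrator_box):
    """If IFace-TLD produced a box, train on the fresh skeleton feature;
    otherwise fall back to prediction with the model trained so far."""
    if integrator_box is not None:
        model.train(feature)
        return integrator_box      # image branch already located the target
    return "target" if model.is_target(feature) else None
```

The routing mirrors the description above: a non-empty integrator output feeds training, an empty one triggers prediction.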

The motion cycle extraction part calculates the motion cycle of the human body from the acquired skeleton images using the following formula:

dist_k = \| P_{lw}^{k} - P_{sc}^{k} \|_2, \quad k = 1, 2, \ldots, N

where dist_k is the distance between the left wrist and the shoulder center in the k-th frame, in the Kinect coordinate system; P_{lw}^{k} and P_{sc}^{k} denote the 3D point coordinates of the left wrist and the shoulder center in the k-th frame; and N is the total number of image frames in the sequence.
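A minimal sketch of this per-frame distance computation (the joint names and the dictionary layout of a frame are assumptions for illustration):

```python
import math

def wrist_shoulder_distance(frames):
    """Per-frame Euclidean distance between the left-wrist and
    shoulder-center 3D points (Kinect coordinate system).
    `frames` is a list of dicts holding 3D joint positions."""
    return [
        math.dist(f["left_wrist"], f["shoulder_center"])  # ||P_lw - P_sc||
        for f in frames
    ]
```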

The skeleton feature extraction part calculates skeleton features within the obtained motion cycle of the human body, as follows.

First, denote the gait half cycle by T_w. The skeleton features based on the human upper limbs are then defined as follows:

Trajectory feature: the shoulder center point is chosen as the fixed point, and the positions of the other upper-limb skeleton points relative to it are computed by the following formula, giving a 9-dimensional feature P:

P = [\bar{p}_1^t, \bar{p}_2^t, \ldots, \bar{p}_9^t], \quad t = 1, \ldots, T_w

where \bar{p}_j^t denotes the position of the j-th upper-limb skeleton point, relative to the shoulder center, in the t-th frame of the gait half cycle T_w, and is given by

\bar{p}_j^t = p_j^t - p_{sc}^t

where p_{sc}^t is the position of the shoulder center point in the camera coordinate system and the positions of the remaining upper-limb skeleton points are denoted p_j^t. The covariance matrix of P is used as the trajectory feature matrix F_T, which captures the subject's walking habits, and the distance between two trajectory feature matrices is defined as

d(F_T^{te}, F_T^{tr}) = \sqrt{\sum_i \ln^2 \lambda_i}

where F_T^{te} and F_T^{tr} denote the trajectory feature matrices of the test data and the training data, respectively; \lambda_i are the generalized eigenvalues of F_T^{te} and F_T^{tr}, satisfying \lambda_i F_T^{tr} x_i = F_T^{te} x_i; and x_i are the corresponding generalized right eigenvectors.
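Assuming NumPy is available, the trajectory-matrix distance above can be sketched as follows (the function name is illustrative):

```python
import numpy as np

def trajectory_distance(F_te, F_tr):
    """Distance between two trajectory feature (covariance) matrices:
    sqrt(sum_i ln^2 lambda_i), where lambda_i are the generalized
    eigenvalues satisfying lambda_i * F_tr x = F_te x."""
    # Generalized eigenvalues of (F_te, F_tr) via the equivalent
    # standard eigenproblem of F_tr^{-1} F_te.
    lam = np.linalg.eigvals(np.linalg.solve(F_tr, F_te))
    lam = np.real(lam)  # a covariance pair yields a real spectrum
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))
```

Identical matrices give distance 0; scaling one matrix by e along each axis contributes ln 1 per eigenvalue.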

Area and distance features: the area feature F_A represents the area of the closed region enclosed by the upper-limb part of the body, and the distance feature F_D is represented by distances between the centers of different body parts. F_A is expressed as

F_A = \frac{1}{T_w} \sum_{t=1}^{T_w} S(p_{sc}^t, p_h^t, p_{ls}^t, p_{rs}^t)

where S(\cdot) denotes the area of the closed polygon formed by its arguments, and p_{sc}^t, p_h^t, p_{ls}^t and p_{rs}^t denote the positions of the shoulder center point, head, left shoulder and right shoulder in the t-th frame of the gait half cycle T_w;
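One standard way to evaluate the polygon area S(\cdot) for 3D joint positions is Newell's method (summed edge cross products); this helper is an illustration of that technique, not necessarily the patent's exact computation:

```python
def polygon_area_3d(points):
    """Area of a (near-)planar polygon with 3D vertices, computed as half
    the magnitude of the summed cross products (Newell's method)."""
    n = [0.0, 0.0, 0.0]
    m = len(points)
    for i in range(m):
        (x1, y1, z1), (x2, y2, z2) = points[i], points[(i + 1) % m]
        n[0] += y1 * z2 - z1 * y2
        n[1] += z1 * x2 - x1 * z2
        n[2] += x1 * y2 - y1 * x2
    return 0.5 * (n[0] ** 2 + n[1] ** 2 + n[2] ** 2) ** 0.5
```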

To compute the distance feature F_D, the centers of the three closed polygons over the upper-limb region are computed first. The three center points, namely the head center c_h^t, the right-hand center c_{rh}^t and the left-hand center c_{lh}^t, are obtained as the averages of the polygon vertices:

c_h^t = \frac{1}{2}(p_{sc}^t + p_h^t), \quad c_{rh}^t = \frac{1}{3}(p_{rs}^t + p_{re}^t + p_{rw}^t), \quad c_{lh}^t = \frac{1}{3}(p_{ls}^t + p_{le}^t + p_{lw}^t)

The head center c_h^t is the center of the polygon formed by the shoulder center point and the head point; the right-hand center c_{rh}^t is the center of the polygon formed by the right shoulder, right elbow and right wrist points; and the left-hand center c_{lh}^t is the center of the polygon formed by the left shoulder, left elbow and left wrist points. The Euclidean distances f_t^{d1} and f_t^{d2} between the right-hand center and, respectively, the head center and the left-hand center are written as

f_t^{d1} = \| c_{rh}^t - c_h^t \|_2, \quad f_t^{d2} = \| c_{rh}^t - c_{lh}^t \|_2

Let f_{di} = \{ f_t^{di} \}_{t=1}^{T_w}, i = 1, 2. The entire distance feature is then expressed as

F_D = [\mu_{d1}, \sigma_{d1}^2, m_{d1}, \mu_{d2}, \sigma_{d2}^2, m_{d2}]^T

where \mu_{di}, \sigma_{di}^2 and m_{di} denote the mean, variance and maximum of f_{di}, respectively, with i = 1, 2;
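The six-dimensional F_D can be sketched as follows (the input layout, one vertex list per frame per polygon, and the function names are assumptions):

```python
import math

def distance_feature(head_polys, right_arm_polys, left_arm_polys):
    """F_D from per-frame polygon centers: mean, variance and maximum of
    the right-hand-center-to-head-center and right-to-left-hand-center
    distances. Each argument is a list (one entry per frame) of vertex
    lists of 3D points."""
    def center(vertices):
        n = len(vertices)
        return tuple(sum(v[i] for v in vertices) / n for i in range(3))

    d1, d2 = [], []
    for h, r, l in zip(head_polys, right_arm_polys, left_arm_polys):
        ch, cr, cl = center(h), center(r), center(l)
        d1.append(math.dist(cr, ch))
        d2.append(math.dist(cr, cl))

    def stats(d):
        mu = sum(d) / len(d)
        var = sum((x - mu) ** 2 for x in d) / len(d)
        return [mu, var, max(d)]

    return stats(d1) + stats(d2)  # 6-dimensional F_D
```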

Static features: the static feature is a 5-dimensional vector F_S = [f_h, f_{lua}, f_{rua}, f_{lf}, f_{rf}]^T, where f_h is the height of the target and f_{lua}, f_{rua}, f_{lf} and f_{rf} are the lengths of the left upper arm, right upper arm, left forearm and right forearm, respectively. The limb lengths are obtained as the mean bone lengths over the gait half cycle:

f_{lua} = \frac{1}{T_w} \sum_{t=1}^{T_w} \| p_{ls}^t - p_{le}^t \|_2, \quad f_{rua} = \frac{1}{T_w} \sum_{t=1}^{T_w} \| p_{rs}^t - p_{re}^t \|_2

f_{lf} = \frac{1}{T_w} \sum_{t=1}^{T_w} \| p_{le}^t - p_{lw}^t \|_2, \quad f_{rf} = \frac{1}{T_w} \sum_{t=1}^{T_w} \| p_{re}^t - p_{rw}^t \|_2

(the equation image defining f_h could not be recovered), where the p^t terms denote the positions, in the camera coordinate system, of the shoulder center, left shoulder, left elbow, left wrist, left hand, right shoulder, right elbow, right wrist and right hand points;
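The four limb-length entries of F_S can be sketched as mean per-frame bone lengths (the joint names and frame layout are assumptions; the height f_h is omitted since its defining equation is not recoverable here):

```python
import math

def mean_bone_length(frames, joint_a, joint_b):
    """Average distance between two joints over a gait half cycle.
    `frames` is a list of dicts mapping joint names to 3D positions."""
    return sum(math.dist(f[joint_a], f[joint_b]) for f in frames) / len(frames)

def static_arm_features(frames):
    """The four limb-length entries of F_S."""
    return {
        "f_lua": mean_bone_length(frames, "left_shoulder", "left_elbow"),
        "f_rua": mean_bone_length(frames, "right_shoulder", "right_elbow"),
        "f_lf":  mean_bone_length(frames, "left_elbow", "left_wrist"),
        "f_rf":  mean_bone_length(frames, "right_elbow", "right_wrist"),
    }
```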

Frequency and amplitude features: the frequency feature F_Fre is the number of skeleton image frames in a gait half cycle, and the amplitude feature F_Amp is the difference between adjacent local maxima and local minima of the dist_k curve;

Finally, a 23-dimensional mixed feature combining the trajectory, area, distance, static, frequency and amplitude features is obtained, constituting the skeleton feature based on the human upper limbs.

The mobile robot target tracking system of the present invention, fusing skeleton recognition and IFace-TLD, adds face recognition based on principal component analysis (PCA) to the original Face-TLD algorithm, effectively solving the short-sequence tracking problem; the resulting tracker is called IFace-TLD. Regardless of the length of the tracking sequence, IFace-TLD tracks the target face well. In addition, SIFace-TLD seamlessly fuses IFace-TLD with skeleton-based human recognition: when IFace-TLD tracks successfully, the extracted skeleton features are used to train a support vector data description (SVDD); when tracking fails, the newly extracted skeleton features are fed into the trained SVDD for recognition. In this way a stable recognition effect is achieved even when the face is turned away from the camera. This not only realizes online skeleton-based recognition but also improves tracking accuracy and robustness.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of the mobile robot target tracking system fusing skeleton recognition and IFace-TLD according to the present invention;

FIG. 2 is a schematic diagram of a human skeleton with 20 joints;

FIG. 3 is a graph of dist_k.

In the figures:

101: IFace-TLD unit          101.1: tracking part
101.2: learning part         101.3: detection part
101.31: face detection part  101.32: face recognition part
101.33: verifier part        101.4: integrator
102: skeleton recognition unit   102.1: motion cycle extraction part
102.2: skeleton feature extraction part   102.3: support vector data description part
102.31: training part        102.32: prediction part
103: image target positioning unit

DETAILED DESCRIPTION

The mobile robot target tracking system fusing skeleton recognition and IFace-TLD of the present invention is described in detail below with reference to embodiments and the drawings.

As shown in FIG. 1, the system acquires an original color image A of the human body and a skeleton image B of the upper limbs through a Kinect sensor. An IFace-TLD unit 101 tracks and locates the target on color image A, and a skeleton recognition unit 102 tracks and locates the target on skeleton image B; the resulting bounding box of the target region is sent to an image target positioning unit 103, which marks the target region in the original color image A according to the received bounding box and feeds the target region back to the IFace-TLD unit 101.

The IFace-TLD unit 101 comprises a tracking part 101.1, a learning part 101.2, a detection part 101.3 and an integrator 101.4, each of which receives the original color image A. The tracking part 101.1 uses an optical-flow tracker to estimate the motion of the target between two adjacent frames of color image A and sends the estimate to the learning part 101.2 and the integrator 101.4. For the first frame, the detection part 101.3 independently scans and processes all image patches of color image A, separates the target face from the background, and sends the target face to the learning part 101.2 and the integrator 101.4; for every frame after the first, it scans and processes only the target region fed back by the image target positioning unit 103 and its surroundings, again separating the target face from the background and sending it to the learning part 101.2 and the integrator 101.4. The integrator 101.4 computes, from the inter-frame motion estimate and the detected target face, the confidence coefficient of the position in color image A most likely to contain the target, and sends the result to the learning part 101.2 and to either the skeleton recognition unit 102 or the image target positioning unit 103. The learning part 101.2 is trained on color image A together with the results obtained from the tracking part 101.1, the detection part 101.3 and the integrator 101.4, and according to the training results it updates and corrects errors made by the tracking part 101.1 and the detection part 101.3.

The detection part 101.3 comprises a face detection part 101.31 that detects the face regions in the original color image A using the acquired color image A, the target region from the learning part 101.2 and the target-region information fed back by the image target positioning unit 103; a face recognition part 101.32 that identifies the target face region among the face regions obtained from the face detection part 101.31; and a verifier part 101.33 that judges whether the target face region identified by the face recognition part 101.32 is correct. The verification result of the verifier part 101.33 is sent to both the learning part 101.2 and the integrator 101.4.

The IFace-TLD of the present invention fuses Face-TLD with face recognition based on principal component analysis. Face-TLD was proposed by Zdenek Kalal, Krystian Mikolajczyk and Jiri Matas in a 2010 article entitled "Face-TLD: Tracking-Learning-Detection Applied to Faces". Compared with the original Face-TLD, IFace-TLD adds a face recognition part, which strengthens tracking performance. In the IFace-TLD unit 101, PCA-based face recognition is added after face detection in order to distinguish similar-looking image patches. To reduce the number of image patches and speed up processing, the present invention considers only the fed-back target region, whose size is twice that of the previously obtained bounding box containing the target face. Face detection detects all patches containing faces; the face recognition part then filters out faces that are not the target. Finally, all remaining patches are sent to the verifier to further determine whether they contain the target face.
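A PCA ("eigenface") filter of the kind described, used to discard candidate patches unlike the target face, could be sketched as follows (assuming NumPy is available; the threshold and all names are illustrative, and patches are flattened pixel vectors):

```python
import numpy as np

def fit_pca(faces, k):
    """Fit a PCA (eigenface) basis on flattened target-face patches."""
    X = np.asarray(faces, dtype=float)
    mean = X.mean(axis=0)
    # SVD of the centered data gives the principal axes as rows of vt.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def pca_score(patch, mean, basis, target_coeffs):
    """Distance in PCA space between a candidate patch and the target's
    reference coefficients; small means it looks like the target."""
    coeffs = basis @ (np.asarray(patch, dtype=float) - mean)
    return float(np.linalg.norm(coeffs - target_coeffs))

def filter_candidates(patches, mean, basis, target_coeffs, thresh):
    """Keep only patches whose PCA-space distance to the target is below
    a threshold (non-target faces are filtered out)."""
    return [p for p in patches
            if pca_score(p, mean, basis, target_coeffs) < thresh]
```

The surviving patches would then go to the verifier, as in the pipeline above.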

The skeleton recognition unit 102 comprises a motion cycle extraction part 102.1, a skeleton feature extraction part 102.2 and a support vector data description part 102.3. The motion cycle extraction part 102.1 computes the motion cycle of the human body from the acquired skeleton image B, and the skeleton feature extraction part 102.2 computes skeleton features within that motion cycle. When the integrator 101.4 in the IFace-TLD unit 101 outputs a bounding box for the target, the skeleton features obtained by the skeleton feature extraction part 102.2 are sent to the training part 102.31 of the support vector data description part 102.3 for training; when the integrator 101.4 outputs an empty result, the skeleton features are sent to the prediction part 102.32, which predicts the target's bounding box from the training results of the training part 102.31 and sends the predicted box to the image target positioning unit 103.

The human skeleton with 20 joints is shown in FIG. 2; the joint numbering is given in Table 1:

Table 1

 1  Hip center          11  Right wrist
 2  Spine               12  Right hand
 3  Shoulder center     13  Left hip
 4  Head                14  Left knee
 5  Left shoulder       15  Left ankle
 6  Left elbow          16  Left foot
 7  Left wrist          17  Right hip
 8  Left hand           18  Right knee
 9  Right shoulder      19  Right ankle
10  Right elbow         20  Right foot

The present invention uses the ten upper-limb skeleton points to realize online human recognition; in FIG. 2 these ten points are connected by solid black lines, and the remaining ten points are connected by dashed black lines. The gait cycle is obtained by computing the distance between the left wrist and the shoulder center in the Kinect coordinate system.

所述运动周期提取部分102.1根据获取的骨骼图片B计算人体的运动周期是采用如下公式:The motion cycle extraction part 102.1 calculates the motion cycle of the human body according to the acquired skeleton image B using the following formula:

Figure BDA0002005702460000061
Figure BDA0002005702460000061

其中,distk为在Kinect坐标系下,第k帧图像左腕点和肩部中心点之间的距离;

$P_k^{lw}$ and $P_k^{sc}$ denote the three-dimensional coordinates of the left wrist point and the shoulder center point in the k-th frame; N is the total number of image frames in the sequence. The $\mathrm{dist}_k$ curve is shown in Figure 3: the dotted line is the raw distance curve; to reduce the interference of noise, the raw data are mean-filtered, and the filtered curve is drawn as a solid line. The present invention defines the full gait cycle as the number of frames between adjacent local maxima (or minima), and the number of frames between an adjacent local maximum and local minimum as the gait half-cycle. Since the skeleton features are extracted within a gait period, the present invention extracts them over the half-cycle in order to obtain more feature samples.
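The cycle-extraction steps above (per-frame wrist-shoulder distance, mean filtering, extrema search) can be sketched as follows. The filter width `win` is an assumption, since the patent does not specify one, and strict slope sign changes are taken as local extrema:

```python
import numpy as np

def gait_half_cycles(wrist, shoulder, win=5):
    """Sketch of motion-cycle extraction (part 102.1): distance between
    left wrist and shoulder center per frame, mean filtering, then
    half-cycles as frame gaps between adjacent local extrema."""
    dist = np.linalg.norm(np.asarray(wrist, float) - np.asarray(shoulder, float), axis=1)
    smooth = np.convolve(dist, np.ones(win) / win, mode="same")  # mean filter
    # indices where the slope changes sign: local maxima or minima
    ext = [k for k in range(1, len(smooth) - 1)
           if (smooth[k] - smooth[k - 1]) * (smooth[k + 1] - smooth[k]) < 0]
    # half-cycle = number of frames between adjacent extrema
    return [b - a for a, b in zip(ext, ext[1:])]
```

On a smoothly periodic arm swing the returned gaps cluster around half the swing period; boundary frames may produce one or two spurious extrema because of the filter's edge padding.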

The skeleton feature extraction part 102.2 computes the skeleton features within the obtained motion cycle of the human body, as follows:

First, define the gait half-cycle as $T_w$. The skeleton features based on the human upper limbs are then expressed as:

Trajectory feature: the shoulder center point is selected as the fixed point, and the relative positions of the other upper-limb skeleton points with respect to the fixed point are computed by the following formula, giving a 9-dimensional feature P:

$$P = \left[\, p_t^1, p_t^2, \dots, p_t^9 \,\right], \qquad t = 1, 2, \dots, T_w$$

where $P_t^j$ denotes the position of the j-th upper-limb skeleton point in the t-th frame of the gait half-cycle $T_w$, and $p_t^j$ is expressed as:

$$p_t^j = P_t^j - P_t^{sc}, \qquad j = 1, 2, \dots, 9$$

where $P_t^{sc}$ denotes the position of the shoulder center point in the camera coordinate system, and the positions of the remaining upper-limb skeleton points are denoted $P_t^j$. The covariance matrix of P is used as the trajectory feature matrix $F_T$, which captures the subject's walking habits; define

$$d\!\left(F_T^{te}, F_T^{tr}\right) = \sqrt{\sum_i \ln^2 \lambda_i\!\left(F_T^{te}, F_T^{tr}\right)}$$

where $F_T^{te}$ and $F_T^{tr}$ are the trajectory feature matrices of the test data and the training data, respectively; $\lambda_i$ is the i-th generalized eigenvalue of $F_T^{te}$ and $F_T^{tr}$, satisfying $\lambda_i F_T^{tr} x - F_T^{te} x = 0$, where x is the corresponding generalized right eigenvector;
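A sketch of the trajectory feature and the covariance-matrix distance follows. It assumes the per-frame relative positions are stacked coordinate-wise before taking the covariance (the exact layout is not given in the text), and computes the generalized eigenvalues of the two matrices numerically:

```python
import numpy as np

def trajectory_feature(joints, sc_index=0):
    """Covariance trajectory feature F_T from a (T_w, 10, 3) array of
    upper-limb joint positions; sc_index marks the shoulder center."""
    J = np.asarray(joints, float)
    rel = J - J[:, sc_index:sc_index + 1, :]   # positions relative to shoulder center
    rel = np.delete(rel, sc_index, axis=1)     # drop the fixed point -> 9 points
    P = rel.reshape(len(J), -1)                # (T_w, 27) stacked coordinates
    return np.cov(P, rowvar=False)

def cov_distance(Fte, Ftr):
    """Distance between two covariance matrices via the generalized
    eigenvalues lam_i of (Fte, Ftr): d = sqrt(sum of ln^2 lam_i)."""
    lam = np.linalg.eigvals(np.linalg.solve(Ftr, Fte)).real
    lam = np.clip(lam, 1e-12, None)            # guard against numerical zeros
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))
```

The distance is zero for identical matrices and grows with the log of the generalized eigenvalues, which is the standard way to compare covariance descriptors.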

Area and distance features: the area feature $F_A$ is the area of the closed region enclosed by the upper part of the human body, and the distance feature $F_D$ is given by the distances between different body centers. $F_A$ is expressed as:

$$F_A = \frac{1}{T_w} \sum_{t=1}^{T_w} S\!\left(P_t^{sc}, P_t^{h}, P_t^{ls}, P_t^{rs}\right)$$

where $S(\cdot)$ denotes the area of the polygon enclosed by the given points;

$P_t^{sc}$, $P_t^{h}$, $P_t^{ls}$ and $P_t^{rs}$ denote the positions of the shoulder center point, head point, left shoulder point and right shoulder point in the t-th frame of the gait half-cycle $T_w$, respectively;

To compute the distance feature $F_D$, the centers of the three closed polygons of the upper-limb region are computed first. The three center points, namely the head center $c_t^{h}$, the right-hand center $c_t^{rh}$ and the left-hand center $c_t^{lh}$, are obtained by the following formula:

$$c_t = \frac{1}{n} \sum_{i=1}^{n} P_t^{i}$$

where the sum runs over the n vertex points of the corresponding polygon.

The head center $c_t^{h}$ is the center of the polygon formed by the shoulder center point and the head point; the right-hand center $c_t^{rh}$ is the center of the polygon formed by the right shoulder, right elbow and right wrist points; the left-hand center $c_t^{lh}$ is the center of the polygon formed by the left shoulder, left elbow and left wrist points. The Euclidean distances $f_t^{d1}$ and $f_t^{d2}$ between the right-hand center $c_t^{rh}$ and, respectively, the head center $c_t^{h}$ and the left-hand center $c_t^{lh}$ are written as:

$$f_t^{d1} = \left\| c_t^{rh} - c_t^{h} \right\|_2, \qquad f_t^{d2} = \left\| c_t^{rh} - c_t^{lh} \right\|_2$$

Let $f^{di} = \{ f_t^{di},\ t = 1, \dots, T_w \}$ for $i = 1, 2$. The complete distance feature is then expressed as

$$F_D = \left[\, \mu_{d1},\ \sigma_{d1}^2,\ m_{d1},\ \mu_{d2},\ \sigma_{d2}^2,\ m_{d2} \,\right]^{T}$$

where $\mu_{di}$, $\sigma_{di}^2$ and $m_{di}$ denote the mean, variance and maximum of $f^{di}$, respectively, with i = 1, 2;
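The polygon centers and the six distance statistics above can be sketched as follows, assuming each frame is given as a dictionary of named 3-D joint positions (the key names are illustrative):

```python
import numpy as np

def distance_feature(frames):
    """Sketch of the distance feature F_D: per-frame centroids of the
    head / right-hand / left-hand point groups, two Euclidean distances
    from the right-hand center, then mean/variance/max over the half-cycle."""
    d1, d2 = [], []
    for f in frames:
        head_c  = np.mean([f["shoulder_center"], f["head"]], axis=0)
        right_c = np.mean([f["right_shoulder"], f["right_elbow"], f["right_wrist"]], axis=0)
        left_c  = np.mean([f["left_shoulder"], f["left_elbow"], f["left_wrist"]], axis=0)
        d1.append(np.linalg.norm(right_c - head_c))   # f_t^d1
        d2.append(np.linalg.norm(right_c - left_c))   # f_t^d2
    stats = lambda v: [float(np.mean(v)), float(np.var(v)), float(np.max(v))]
    return stats(d1) + stats(d2)   # 6-dimensional F_D
```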

Static features: the static features are represented by a 5-dimensional vector $F_S = [f_h, f_{lua}, f_{rua}, f_{lf}, f_{rf}]^T$, where $f_h$ is the height of the target, and $f_{lua}$, $f_{rua}$, $f_{lf}$ and $f_{rf}$ are the left upper-arm length, right upper-arm length, left forearm length and right forearm length, respectively, obtained by the following formulas:

$$f_h = \frac{1}{T_w} \sum_{t=1}^{T_w} h_t$$

$$f_{lua} = \frac{1}{T_w} \sum_{t=1}^{T_w} \left\| P_t^{ls} - P_t^{le} \right\|_2, \qquad f_{rua} = \frac{1}{T_w} \sum_{t=1}^{T_w} \left\| P_t^{rs} - P_t^{re} \right\|_2$$

$$f_{lf} = \frac{1}{T_w} \sum_{t=1}^{T_w} \left\| P_t^{le} - P_t^{lw} \right\|_2, \qquad f_{rf} = \frac{1}{T_w} \sum_{t=1}^{T_w} \left\| P_t^{re} - P_t^{rw} \right\|_2$$

where $h_t$ is the height of the skeleton in the t-th frame;

$P_t^{sc}$, $P_t^{ls}$, $P_t^{le}$, $P_t^{lw}$, $P_t^{lh}$, $P_t^{rs}$, $P_t^{re}$, $P_t^{rw}$ and $P_t^{rh}$ denote the positions of the shoulder center, left shoulder, left elbow, left wrist, left hand, right shoulder, right elbow, right wrist and right hand points in the camera coordinate system, respectively.
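The static vector can be sketched as half-cycle averages of joint-to-joint distances. This is an assumption about the exact equations, which appear only as figure images in the source; the key names and the externally supplied `height_fn` are illustrative:

```python
import numpy as np

def static_feature(frames, height_fn=None):
    """Sketch of the 5-D static feature F_S = [f_h, f_lua, f_rua, f_lf, f_rf].
    Segment lengths are averaged over the half-cycle frames."""
    def seg(a, b):
        # mean distance between joints a and b over all frames
        return float(np.mean([np.linalg.norm(np.asarray(f[a]) - np.asarray(f[b]))
                              for f in frames]))
    f_lua = seg("left_shoulder", "left_elbow")
    f_rua = seg("right_shoulder", "right_elbow")
    f_lf  = seg("left_elbow", "left_wrist")
    f_rf  = seg("right_elbow", "right_wrist")
    f_h   = height_fn(frames) if height_fn else 0.0  # target height, supplied externally
    return [f_h, f_lua, f_rua, f_lf, f_rf]
```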

Frequency and amplitude features: the frequency feature $F_{Fre}$ is the number of skeleton image frames in a gait half-cycle, and the difference between an adjacent local maximum and local minimum of the distance curve is the amplitude feature $F_{Amp}$;

Finally, a 23-dimensional mixed feature $F = \left[ F_T, F_A, F_D, F_S, F_{Fre}, F_{Amp} \right]$ is obtained, which constitutes the skeleton feature based on the human upper limbs.
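The 23-dimensional total is consistent with a 9 + 1 + 6 + 5 + 1 + 1 split across the trajectory, area, distance, static, frequency and amplitude parts; the patent states only the total, so this split is an assumption. A concatenation sketch:

```python
import numpy as np

def mixed_feature(f_traj, f_area, f_dist, f_static, f_fre, f_amp):
    """Assemble the 23-D mixed feature, assuming per-part dimensions
    9 (trajectory) + 1 (area) + 6 (distance) + 5 (static) + 1 + 1 = 23."""
    F = np.concatenate([np.ravel(f_traj), [f_area], np.ravel(f_dist),
                        np.ravel(f_static), [f_fre], [f_amp]])
    assert F.size == 23, "dimension mismatch"
    return F
```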

Claims (4)

1. A mobile robot target tracking system integrating skeleton recognition and IFace-TLD, characterized in that it comprises: an original color image (A) of the human body and a skeleton image (B) of the upper limbs acquired through a Kinect sensor; an IFace-TLD unit (101) for tracking and locating the target on the color image (A); and a skeleton recognition unit (102) for tracking and locating the target on the skeleton image (B); the target area box obtained by these units is sent to an image target positioning unit (103), and the image target positioning unit (103) marks the target area on the original color image (A) according to the obtained target area box and feeds the target area back to the IFace-TLD unit (101);

the IFace-TLD unit (101) comprises a tracking part (101.1), a learning part (101.2) and a detection part (101.3), each of which receives the original color image (A), and an integrator (101.4), wherein the tracking part (101.1) uses an optical-flow tracker to estimate the motion trajectory of the target between two adjacent frames of the acquired original color image (A) and sends it to the learning part (101.2) and the integrator (101.4), respectively; for the first frame of the original color image (A), the detection part (101.3) independently scans and processes all image blocks, separates the target face from the background, and sends the target face to the learning part (101.2) and the integrator (101.4), respectively; for original color images (A) after the first frame, the detection part (101.3) scans and processes only the target area fed back by the image target positioning unit (103) and its surroundings, separates the target face from the background, and sends the target face to the learning part (101.2) and the integrator (101.4), respectively; the integrator (101.4) calculates, from the obtained motion trajectory of the target between two adjacent frames and the target face, the confidence coefficient of the position in the original color image (A) most likely to contain the target, and sends the result to the learning part (101.2) and to the skeleton recognition unit (102) or the image target positioning unit (103); the learning part (101.2) is trained on the original color image (A) and the results obtained from the tracking part (101.1), the detection part (101.3) and the integrator (101.4), and updates and corrects errors of the tracking part (101.1) and the detection part (101.3) according to the training results;

the skeleton recognition unit (102) comprises a motion cycle extraction part (102.1), a skeleton feature extraction part (102.2) and a support vector data description part (102.3), wherein the motion cycle extraction part (102.1) calculates the motion cycle of the human body from the acquired skeleton image (B), and the skeleton feature extraction part (102.2) calculates the skeleton features within the obtained motion cycle of the human body; when the result output by the integrator (101.4) in the IFace-TLD unit (101) is a target area box, the skeleton features obtained by the skeleton feature extraction part (102.2) are sent to the training part (102.31) of the support vector data description part (102.3) for training; when the result output by the integrator (101.4) in the IFace-TLD unit (101) is empty, the skeleton features obtained by the skeleton feature extraction part (102.2) are sent to the prediction part (102.32) of the support vector data description part (102.3), and the prediction part (102.32) predicts the target area box according to the training results of the training part (102.31) and sends the predicted target area box to the image target positioning unit (103).

2. The mobile robot target tracking system integrating skeleton recognition and IFace-TLD according to claim 1, characterized in that the detection part (101.3) comprises: a face detection part (101.31) for detecting the region of the face in the original color image (A) according to the acquired original color image (A), the target area of the learning part (101.2) and the target area information fed back by the image target positioning unit (103); a face recognition part (101.32) for identifying the target face region according to the face region in the original color image (A) obtained from the face detection part (101.31); and a verifier part (101.33) for judging whether the target face region recognized by the face recognition part (101.32) is correct, the verification results of the verifier part (101.33) being sent to the learning part (101.2) and the integrator (101.4), respectively.

3. The mobile robot target tracking system integrating skeleton recognition and IFace-TLD according to claim 1, characterized in that the motion cycle extraction part (102.1) calculates the motion cycle of the human body from the acquired skeleton image (B) using the following formula:
$$\mathrm{dist}_k = \left\| P_k^{lw} - P_k^{sc} \right\|_2, \qquad k = 1, 2, \dots, N$$
where $\mathrm{dist}_k$ is the distance between the left wrist and the shoulder center in the k-th frame in the Kinect coordinate system; $P_k^{lw}$ and $P_k^{sc}$ denote the three-dimensional coordinates of the left wrist and the shoulder center in the k-th frame; N is the total number of image frames in the sequence.
4. The mobile robot target tracking system integrating skeleton recognition and IFace-TLD according to claim 1, characterized in that the skeleton feature extraction part (102.2) calculates the skeleton features within the obtained motion cycle of the human body, including: first, define the gait half-cycle as $T_w$; the skeleton features based on the human upper limbs are then expressed as: trajectory feature: the shoulder center point is selected as the fixed point, and the relative positions of the other upper-limb skeleton points with respect to the fixed point are computed by the following formula, giving a 9-dimensional feature P:
$$P = \left[\, p_t^1, p_t^2, \dots, p_t^9 \,\right], \qquad t = 1, 2, \dots, T_w$$
where $P_t^j$ denotes the position of the j-th upper-limb skeleton point in the t-th frame of the gait half-cycle $T_w$, and $p_t^j$ is expressed as:
$$p_t^j = P_t^j - P_t^{sc}, \qquad j = 1, 2, \dots, 9$$
where $P_t^{sc}$ denotes the position of the shoulder center point in the camera coordinate system, and the positions of the remaining upper-limb skeleton points are denoted $P_t^j$; the covariance matrix of P is used as the trajectory feature matrix $F_T$, which captures the subject's walking habits; define
$$d\!\left(F_T^{te}, F_T^{tr}\right) = \sqrt{\sum_i \ln^2 \lambda_i\!\left(F_T^{te}, F_T^{tr}\right)}$$
where $F_T^{te}$ and $F_T^{tr}$ are the trajectory feature matrices of the test data and the training data, respectively; $\lambda_i$ is the i-th generalized eigenvalue of $F_T^{te}$ and $F_T^{tr}$, satisfying $\lambda_i F_T^{tr} x - F_T^{te} x = 0$, where x is the corresponding generalized right eigenvector;
area and distance features: the area feature $F_A$ is the area of the closed region enclosed by the upper part of the human body, and the distance feature $F_D$ is given by the distances between different body centers; $F_A$ is expressed as:
$$F_A = \frac{1}{T_w} \sum_{t=1}^{T_w} S\!\left(P_t^{sc}, P_t^{h}, P_t^{ls}, P_t^{rs}\right)$$

where $S(\cdot)$ denotes the area of the polygon enclosed by the given points;
$P_t^{sc}$, $P_t^{h}$, $P_t^{ls}$ and $P_t^{rs}$ denote the positions of the shoulder center, head, left shoulder and right shoulder in the t-th frame of the gait half-cycle $T_w$, respectively;
to compute the distance feature $F_D$, the centers of the three closed polygons of the upper-limb region are computed first; the three center points, namely the head center $c_t^{h}$, the right-hand center $c_t^{rh}$ and the left-hand center $c_t^{lh}$, are obtained by the following formula:
$$c_t = \frac{1}{n} \sum_{i=1}^{n} P_t^{i}$$

where the sum runs over the n vertex points of the corresponding polygon;
the head center $c_t^{h}$ is the center of the polygon formed by the shoulder center point and the head point; the right-hand center $c_t^{rh}$ is the center of the polygon formed by the right shoulder, right elbow and right wrist points; the left-hand center $c_t^{lh}$ is the center of the polygon formed by the left shoulder, left elbow and left wrist points; the Euclidean distances $f_t^{d1}$ and $f_t^{d2}$ between the right-hand center $c_t^{rh}$ and, respectively, the head center $c_t^{h}$ and the left-hand center $c_t^{lh}$ are written as:
$$f_t^{d1} = \left\| c_t^{rh} - c_t^{h} \right\|_2, \qquad f_t^{d2} = \left\| c_t^{rh} - c_t^{lh} \right\|_2$$
let $f^{di} = \{ f_t^{di},\ t = 1, \dots, T_w \}$ for $i = 1, 2$; the complete distance feature is then expressed as
$$F_D = \left[\, \mu_{d1},\ \sigma_{d1}^2,\ m_{d1},\ \mu_{d2},\ \sigma_{d2}^2,\ m_{d2} \,\right]^{T}$$
where $\mu_{di}$, $\sigma_{di}^2$ and $m_{di}$ denote the mean, variance and maximum of $f^{di}$, respectively, with i = 1, 2;
static features: the static features are represented by a 5-dimensional vector $F_S = [f_h, f_{lua}, f_{rua}, f_{lf}, f_{rf}]^T$, where $f_h$ is the height of the target, and $f_{lua}$, $f_{rua}$, $f_{lf}$ and $f_{rf}$ are the left upper-arm length, right upper-arm length, left forearm length and right forearm length, respectively, obtained by the following formulas:
$$f_h = \frac{1}{T_w} \sum_{t=1}^{T_w} h_t$$

$$f_{lua} = \frac{1}{T_w} \sum_{t=1}^{T_w} \left\| P_t^{ls} - P_t^{le} \right\|_2, \qquad f_{rua} = \frac{1}{T_w} \sum_{t=1}^{T_w} \left\| P_t^{rs} - P_t^{re} \right\|_2$$

$$f_{lf} = \frac{1}{T_w} \sum_{t=1}^{T_w} \left\| P_t^{le} - P_t^{lw} \right\|_2, \qquad f_{rf} = \frac{1}{T_w} \sum_{t=1}^{T_w} \left\| P_t^{re} - P_t^{rw} \right\|_2$$

where $h_t$ is the height of the skeleton in the t-th frame;
$P_t^{sc}$, $P_t^{ls}$, $P_t^{le}$, $P_t^{lw}$, $P_t^{lh}$, $P_t^{rs}$, $P_t^{re}$, $P_t^{rw}$ and $P_t^{rh}$ denote the positions of the shoulder center, left shoulder, left elbow, left wrist, left hand, right shoulder, right elbow, right wrist and right hand points in the camera coordinate system, respectively;
frequency and amplitude features: the frequency feature $F_{Fre}$ is the number of skeleton image frames in a gait half-cycle, and the difference between an adjacent local maximum and local minimum of the distance curve is the amplitude feature $F_{Amp}$; finally, a 23-dimensional mixed feature $F = \left[ F_T, F_A, F_D, F_S, F_{Fre}, F_{Amp} \right]$ is obtained, which constitutes the skeleton feature based on the human upper limbs.
CN201910227611.2A 2019-03-25 2019-03-25 Mobile robot target tracking system fusing bone recognition and IFace-TLD Expired - Fee Related CN109948560B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910227611.2A CN109948560B (en) 2019-03-25 2019-03-25 Mobile robot target tracking system fusing bone recognition and IFace-TLD


Publications (2)

Publication Number Publication Date
CN109948560A CN109948560A (en) 2019-06-28
CN109948560B true CN109948560B (en) 2023-04-07


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102945375A (en) * 2012-11-20 2013-02-27 天津理工大学 Multi-view monitoring video behavior detection and recognition method under multiple constraints
CN105469113A (en) * 2015-11-19 2016-04-06 广州新节奏智能科技有限公司 Human body bone point tracking method and system in two-dimensional video stream
CN105760832A (en) * 2016-02-14 2016-07-13 武汉理工大学 Escaped prisoner recognition method based on Kinect sensor
CN106652291A (en) * 2016-12-09 2017-05-10 华南理工大学 Indoor simple monitoring and alarming system and method based on Kinect
CN108805093A (en) * 2018-06-19 2018-11-13 华南理工大学 Escalator passenger based on deep learning falls down detection algorithm




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20230407