CN107463873B - Real-time gesture analysis and evaluation method and system based on RGBD depth sensor - Google Patents

Real-time gesture analysis and evaluation method and system based on RGBD depth sensor

Info

Publication number: CN107463873B (application CN201710523575.5A)
Authority: CN (China)
Prior art keywords: palm, node, frame, initial image, image
Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN201710523575.5A
Other languages: Chinese (zh)
Other versions: CN107463873A
Inventors: 梁华刚, 易生, 孙凯, 李怀德
Current assignee: Changan University (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Changan University
Application filed by Changan University; priority and filing date: 2017-06-30
Publication of application CN107463873A: 2017-12-12
Grant of CN107463873B: 2020-02-21

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20: Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a real-time gesture analysis and evaluation method and system based on an RGBD depth sensor, comprising a static gesture recognition and evaluation system for the train driver's palm and a dynamic gesture recognition and evaluation system for the train driver's arm. The static palm gesture recognition and evaluation system comprises a palm center position determining module, a palm region image extracting module, a denoising module and a gesture recognition and evaluation module; the dynamic arm gesture recognition and evaluation system comprises an arm skeleton node motion sequence extraction module, a dynamic gesture optimal matching module and an arm dynamic gesture evaluation module. The method is strongly robust to environmental background and illumination, and adopts a palm-node-based gesture pixel search when detecting the palm gesture, which improves the palm gesture detection. The method and system can monitor the driver's gestures in real time to ensure the running safety of the train, avoid manual monitoring of the train driver's gestures, and reduce the consumption of human resources.

Description

Real-time gesture analysis and evaluation method and system based on RGBD depth sensor
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a real-time gesture analysis and evaluation method and system based on an RGBD depth sensor.
Background
Gesture recognition, as one of the key technologies of future human-computer interaction systems, has important research value and broad application prospects. Traditional gesture recognition methods usually perform gesture detection and recognition on an input two-dimensional image. However, such methods are sensitive to the input image: detection and recognition work well when the background is simple and ambient light has little influence, but degrade sharply when the background is complex and the lighting varies widely, which limits their range of application. In recent years, to overcome these defects of the traditional methods, three-dimensional image sensing devices have become increasingly popular. Such devices acquire not only an RGB image but also depth data for the image, so the influence of complex backgrounds, illumination changes and the like on gesture recognition can be avoided.
At present, gestures are widely used in the traffic field. For example, a train driver must perform prescribed gestures confirming instrument checks before, during and after a run to ensure the safe running of the train and prevent accidents, and a traffic police officer keeps road traffic safe and smooth through a series of gestures. However, the working environments of train drivers, traffic police and similar personnel are complex and lighting varies greatly, so the traditional methods cannot effectively recognize and evaluate their gestures. Moreover, the standard driver gestures are numerous, including both dynamic arm gestures and palm-region gestures, which further increases the difficulty of gesture recognition. The traditional gesture recognition method involves two steps: first detecting the palm and arm in a two-dimensional image, and then recognizing the detected gesture. Generally speaking, the quality of detection directly affects the recognition result, and the background environment and illumination usually interfere with detection; detecting palms and arms in two-dimensional images is a major difficulty. The traditional approach trains a classifier on a large number of gesture samples, but human hands are complex and deformable, and gestures are diverse, ambiguous and vary over time, so it is difficult to train an ideal gesture classifier. Traditional gesture recognition methods therefore cannot be applied to gesture recognition for drivers and similar personnel.
Disclosure of Invention
The invention aims to provide a real-time gesture analysis and evaluation method and system based on an RGBD depth sensor that solve the problem of detecting palms and arms under complex background and illumination conditions and can analyze and evaluate the various gestures of workers such as train drivers and traffic police, with broad application prospects.
The technical scheme adopted by the invention is as follows.
a real-time palm gesture analysis and evaluation method based on an RGBD depth sensor comprises the following steps:
step 1, acquiring T frames of initial images within a period of time by using an RGBD sensor, wherein each frame of initial image in the T frames of initial images comprises a palm node, a wrist node and an elbow node, and determining the coordinates of the palm node of the T frames of initial images;
the method comprises the following steps:
step 11, selecting one frame of initial image from the T frames of initial images as the current frame initial image, and obtaining the coordinates of the palm node P in the current frame initial image from the initial palm node as

$$P=\left(\frac{1}{M}\sum_{i=1}^{M}x_i,\ \frac{1}{M}\sum_{i=1}^{M}y_i,\ \frac{1}{M}\sum_{i=1}^{M}z_i\right)$$

wherein M represents the number of white pixel points in the region circle, M is a natural number greater than or equal to 1, x_i represents the abscissa of the i-th pixel, y_i represents the ordinate of the i-th pixel, and z_i represents the distance from the i-th pixel point to the RGBD sensor;

the region circle is the circle whose center is the initial palm node P_1 and whose radius is the distance between the initial palm node and the wrist node;
the palm node coordinate P is in a coordinate system which takes the center of the RGBD sensor as an origin, takes the horizontal direction as an X axis, takes the vertical direction as a Y axis, and takes the direction of the sensor pointing to the driver as a Z axis;
step 12, repeating step 11 until each frame initial image in the T frame initial images is used as a current frame initial image, and obtaining palm node coordinates of the T frame initial images;
step 2, extracting a palm region image of the T frame initial image according to the palm node coordinates of the T frame initial image, wherein the method comprises the following steps:
step 21, selecting one frame of initial image from the T frames of initial images as the current initial image; the palm region image of the current initial image is extracted as follows:

taking the palm node of the current initial image as the center, searching for palm pixel points in a rectangular region with width W and height H, and putting the palm pixel points satisfying formula (2) into the palm pixel point set S_k, obtaining the current palm region image;

$$S_k=S_{k-1}\cup\left\{\,g_i\ \middle|\ \mathrm{abs}(d_p-d_i)\le\mathrm{threshold}\,\right\}\qquad(2)$$

in formula (2), k = 1, 2, ... indicates the number of the search, d_p represents the distance between the palm node of the current initial image and the RGBD sensor, g_i represents the i-th pixel point, d_i represents the distance between the i-th pixel point in the rectangular region and the RGBD sensor, abs(d_p - d_i) represents the absolute value of the difference between the distances from the palm node of the current initial image and from the i-th pixel point in the rectangular region to the RGBD sensor, threshold represents a threshold with 25 ≤ threshold ≤ 35, S_k represents the set of gesture pixel points searched for the k-th time, and S_{k-1} represents the set of gesture pixel points searched for the (k-1)-th time;

the width W and the height H of the rectangular search region are given by an expression in terms of (x_w, y_w) and (x_e, y_e) that appears as an image in the original and is not recoverable here, wherein (x_w, y_w) and (x_e, y_e) respectively represent the coordinates of the wrist node and the elbow node corresponding to the palm node of the current initial image;
step 22, repeating step 21 until the palm area images of all T frames of initial images are extracted;
step 3, performing dilation and erosion operations on each frame of palm area image among the palm area images of the T frames of initial images to obtain T frames of denoised palm area images;
step 4, recognizing the palm area gesture in the T frames of denoised palm area images through a neural network, and obtaining the score P_palm of the recognized palm area gesture through formula (1);

formula (1) appears as an image in the original and is not recoverable here; it computes P_palm from the T per-frame outputs of the neural network, wherein inputting the t-th frame of the denoised palm region images into the neural network gives the t-th frame output, T is the total number of frames of the denoised palm region images, and round(·) denotes rounding to an integer.
A real-time arm gesture analysis and evaluation method based on an RGBD depth sensor comprises the following steps:
step 1, acquiring T frames of initial images within a period of time by using an RGBD sensor, taking each frame in turn as the t-th frame initial image, and extracting the arm skeleton node motion sequence of the T frames of initial images;
the method comprises the following steps:
the t-th frame initial image comprises an initial palm node P_1^t, a wrist node P_2^t, an elbow node P_3^t, a shoulder node P_4^t and a shoulder center node P_s^t; the distance D_sn^t from node P_n^t to the shoulder center node P_s^t is obtained through formula (3):

$$D_{sn}^{t}=\sqrt{\left(x_n^t-x_s^t\right)^2+\left(y_n^t-y_s^t\right)^2+\left(z_n^t-z_s^t\right)^2}\qquad(3)$$

in formula (3), n = 1, 2, 3, 4 and t = 1, 2, ..., T; D_sn^t indicates the distance from the node P_n^t to the shoulder center node P_s^t in the t-th frame initial image; T is the total number of frames of initial images; x_n^t, y_n^t, z_n^t respectively represent the coordinate values of the palm node, the wrist node, the elbow node and the shoulder node in the t-th frame initial image; and x_s^t, y_s^t, z_s^t represent the coordinates of the shoulder center node in the t-th frame initial image;

obtaining the motion sequence D_sn = (D_sn^1, D_sn^2, ..., D_sn^t, ..., D_sn^T) of the arm skeleton nodes in the initial images, the arm skeleton nodes comprising the palm node, the wrist node, the elbow node and the shoulder node;
step 2, finding, in the motion sequence library of the driver standard dynamic gesture samples, the motion sequence of the driver standard dynamic gesture sample whose sum of corresponding-point distances to the motion sequence of the arm skeleton nodes in the initial images is minimal;
step 3, obtaining the score P_arm of the arm dynamic gesture through formula (4);

formula (4) appears as an image in the original and is not recoverable here; it maps the minimal DTW distance found in step 2 to the score P_arm by means of the constant α;

in formula (4), α is the average of the DTW distances between the standard gesture sequence samples,

$$\alpha=\frac{1}{N(N-1)}\sum_{a=1}^{N}\sum_{\substack{b=1\\ b\neq a}}^{N}\mathrm{DTW}(D_a,D_b)$$

wherein D_a and D_b represent any motion sequences in the motion sequence library of the driver standard dynamic gesture samples, a = 1, 2, ..., N, b = 1, 2, ..., N, a ≠ b, and N is the total number of motion sequences in the motion sequence library of the driver standard dynamic gesture samples.
A real-time gesture analysis and evaluation method based on an RGBD depth sensor comprises the real-time palm gesture analysis and evaluation method and the real-time arm gesture analysis and evaluation method described above.
A real-time palm gesture analysis and evaluation system based on an RGBD depth sensor comprises:
the device comprises a palm center position determining module, a palm image acquiring module and a palm image acquiring module, wherein the palm center position determining module is used for acquiring T frames of initial images in a period of time by using an RGBD sensor, each frame of initial image in the T frames of initial images comprises a palm node, a wrist node and an elbow node, and the palm node coordinates of the T frames of initial images are determined;
the method comprises the following steps:
step 11, selecting one frame of initial image from the T frames of initial images as the current frame initial image, and obtaining the coordinates of the palm node P in the current frame initial image from the initial palm node as

$$P=\left(\frac{1}{M}\sum_{i=1}^{M}x_i,\ \frac{1}{M}\sum_{i=1}^{M}y_i,\ \frac{1}{M}\sum_{i=1}^{M}z_i\right)$$

wherein M represents the number of white pixel points in the region circle, M is a natural number greater than or equal to 1, x_i represents the abscissa of the i-th pixel, y_i represents the ordinate of the i-th pixel, and z_i represents the distance from the i-th pixel point to the RGBD sensor;

the region circle is the circle whose center is the initial palm node P_1 and whose radius is the distance between the initial palm node and the wrist node;
the palm node coordinate P is in a coordinate system which takes the center of the RGBD sensor as an origin, takes the horizontal direction as an X axis, takes the vertical direction as a Y axis, and takes the direction of the sensor pointing to the driver as a Z axis;
step 12, repeating step 11 until each frame initial image in the T frame initial images is used as a current frame initial image, and obtaining palm node coordinates of the T frame initial images;
the palm region image extracting module is used for extracting a palm region image of the T frame initial image according to the palm node coordinates of the T frame initial image:
the method comprises the following steps:
step 21, selecting one frame of initial image from the T frames of initial images as the current initial image; the palm region image of the current initial image is extracted as follows:

taking the palm node of the current initial image as the center, searching for palm pixel points in a rectangular region with width W and height H, and putting the palm pixel points satisfying formula (2) into the palm pixel point set S_k, obtaining the current palm region image;

$$S_k=S_{k-1}\cup\left\{\,g_i\ \middle|\ \mathrm{abs}(d_p-d_i)\le\mathrm{threshold}\,\right\}\qquad(2)$$

in formula (2), k = 1, 2, ... indicates the number of the search, d_p represents the distance between the palm node of the current initial image and the RGBD sensor, g_i represents the i-th pixel point, d_i represents the distance between the i-th pixel point in the rectangular region and the RGBD sensor, abs(d_p - d_i) represents the absolute value of the difference between the distances from the palm node of the current initial image and from the i-th pixel point in the rectangular region to the RGBD sensor, threshold represents a threshold with 25 ≤ threshold ≤ 35, S_k represents the set of gesture pixel points searched for the k-th time, and S_{k-1} represents the set of gesture pixel points searched for the (k-1)-th time;

the width W and the height H of the rectangular search region are given by an expression in terms of (x_w, y_w) and (x_e, y_e) that appears as an image in the original and is not recoverable here, wherein (x_w, y_w) and (x_e, y_e) respectively represent the coordinates of the wrist node and the elbow node corresponding to the palm node of the current initial image;
step 22, repeating step 21 until the palm area images of all T frames of initial images are extracted;
a denoising module: used for performing dilation and erosion operations on each frame of palm area image among the palm area images of the T frames of initial images to obtain T frames of denoised palm area images;
a gesture recognition and evaluation module: used for recognizing the palm area gesture in the T frames of denoised palm area images through a neural network and obtaining the score P_palm of the recognized palm area gesture through formula (1);
formula (1) appears as an image in the original and is not recoverable here; it computes P_palm from the T per-frame outputs of the neural network, wherein inputting the t-th frame of the denoised palm region images into the neural network gives the t-th frame output, T is the total number of frames of the denoised palm region images, and round(·) denotes rounding to an integer.
A real-time arm gesture analysis and evaluation system based on an RGBD depth sensor comprises:
an arm skeleton node motion sequence extraction module, used for acquiring T frames of initial images within a period of time by using an RGBD sensor, taking each frame in turn as the t-th frame initial image, and extracting the arm skeleton node motion sequence of the T frames of initial images;
the method comprises the following steps:
the t-th frame initial image comprises an initial palm node P_1^t, a wrist node P_2^t, an elbow node P_3^t, a shoulder node P_4^t and a shoulder center node P_s^t; the distance D_sn^t from node P_n^t to the shoulder center node P_s^t is obtained through formula (3):

$$D_{sn}^{t}=\sqrt{\left(x_n^t-x_s^t\right)^2+\left(y_n^t-y_s^t\right)^2+\left(z_n^t-z_s^t\right)^2}\qquad(3)$$

in formula (3), n = 1, 2, 3, 4 and t = 1, 2, ..., T; D_sn^t indicates the distance from the node P_n^t to the shoulder center node P_s^t in the t-th frame initial image; T is the total number of frames of initial images; x_n^t, y_n^t, z_n^t respectively represent the coordinate values of the palm node, the wrist node, the elbow node and the shoulder node in the t-th frame initial image; and x_s^t, y_s^t, z_s^t represent the coordinates of the shoulder center node in the t-th frame initial image;

obtaining the motion sequence D_sn = (D_sn^1, D_sn^2, ..., D_sn^t, ..., D_sn^T) of the arm skeleton nodes in the initial images, the arm skeleton nodes comprising the palm node, the wrist node, the elbow node and the shoulder node;
a dynamic gesture optimal matching module, used for finding, in the motion sequence library of the driver standard dynamic gesture samples, the motion sequence of the driver standard dynamic gesture sample whose sum of corresponding-point distances to the motion sequence of the arm skeleton nodes in the initial images is minimal;
an arm dynamic gesture evaluation module, used for obtaining the score P_arm of the arm dynamic gesture through formula (4);

formula (4) appears as an image in the original and is not recoverable here; it maps the minimal DTW distance found by the dynamic gesture optimal matching module to the score P_arm by means of the constant α;

in formula (4), α is the average of the DTW distances between the standard gesture sequence samples,

$$\alpha=\frac{1}{N(N-1)}\sum_{a=1}^{N}\sum_{\substack{b=1\\ b\neq a}}^{N}\mathrm{DTW}(D_a,D_b)$$

wherein D_a and D_b represent any motion sequences in the motion sequence library of the driver standard dynamic gesture samples, a = 1, 2, ..., N, b = 1, 2, ..., N, a ≠ b, and N is the total number of motion sequences in the motion sequence library of the driver standard dynamic gesture samples.
A real-time gesture analysis and evaluation system based on an RGBD depth sensor comprises the real-time palm gesture analysis and evaluation system and the real-time arm gesture analysis and evaluation system described above.
The invention has the following advantages.
First, the RGBD depth sensor provides depth data for the two-dimensional image, so key data such as the palm and the arm can be extracted well by the corresponding algorithms. Second, the invention can recognize the driver's palm gesture and arm dynamic gesture simultaneously, evaluate the normativity of the driver's gestures according to the output of the recognition algorithms, and give scores for the palm gesture and the arm dynamic gesture; it can not only monitor the driver's gestures in real time to ensure the running safety of the train, but also avoid manual monitoring of the train driver's gestures and reduce the consumption of human resources.
Drawings
FIG. 1 is a flow chart of train driver palm area gesture recognition and evaluation;
FIG. 2 is a schematic view of nodes of a train driver's palm and arm;
FIG. 3 is a flow chart of train driver arm dynamic gesture recognition and evaluation;
FIG. 4 is a diagram of an application scenario of the present invention.
Detailed Description
Train driver gestures usually include palm region gestures and arm dynamic gestures; the recognition and evaluation processes for the palm region gesture and the arm dynamic gesture are described in detail below.
Example 1
A real-time palm gesture analysis and evaluation method based on an RGBD depth sensor is characterized by comprising the following steps:
step 1, acquiring T frames of initial images within a period of time by using an RGBD sensor, wherein each frame of initial image in the T frames of initial images comprises a palm node, a wrist node and an elbow node, and determining the coordinates of the palm node of the T frames of initial images;
the method comprises the following steps:
step 11, selecting one frame of initial image from the T frames of initial images as the current frame initial image, and obtaining the coordinates of the palm node P in the current frame initial image from the initial palm node as

$$P=\left(\frac{1}{M}\sum_{i=1}^{M}x_i,\ \frac{1}{M}\sum_{i=1}^{M}y_i,\ \frac{1}{M}\sum_{i=1}^{M}z_i\right)$$

wherein M represents the number of white pixel points in the region circle, M is a natural number greater than or equal to 1, x_i represents the abscissa of the i-th pixel, y_i represents the ordinate of the i-th pixel, and z_i represents the distance from the i-th pixel point to the RGBD sensor;

as shown in fig. 2, the region circle is the circle whose center is the initial palm node P_1 and whose radius is the distance between the initial palm node and the wrist node;
as shown in fig. 4, the palm node coordinate P is in a coordinate system with the center of the RGBD sensor as the origin, the horizontal direction as the X axis, the vertical direction as the Y axis, and the direction of the sensor pointing to the driver as the Z axis;
Because the RGBD sensor is prone to node drift when tracking human skeleton nodes, the measured distance between the palm and the sensor deviates from its true value; to reduce this deviation, the initial palm node needs to be corrected as above to obtain the accurate coordinates of the palm node P.
step 12, repeating step 11 until each frame initial image in the T frame initial images is used as a current frame initial image, and obtaining palm node coordinates of the T frame initial images;
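As a concrete illustration of steps 11 and 12, the correction can be written in a few lines. This is a minimal sketch under stated assumptions: the depth image is taken to be already binarized so that white (non-zero) pixels mark hand candidates, the circle radius is computed from the 3-D node positions, and the function and argument names are illustrative, not from the patent.

```python
import numpy as np

def correct_palm_node(p1, p_wrist, xs, ys, zs):
    """Average the white pixels inside the region circle (center: the
    initial palm node P1; radius: the distance from P1 to the wrist
    node) to obtain the corrected palm node coordinate P.

    xs, ys, zs: 1-D arrays with the coordinates of the white pixels
    of the binarized image (zs = distance to the RGBD sensor)."""
    p1 = np.asarray(p1, dtype=float)
    radius = np.linalg.norm(p1 - np.asarray(p_wrist, dtype=float))
    # Keep only the white pixels falling inside the region circle.
    in_circle = (xs - p1[0]) ** 2 + (ys - p1[1]) ** 2 <= radius ** 2
    if not np.any(in_circle):
        return p1                      # no support: keep the raw node
    # P = (mean x_i, mean y_i, mean z_i) over the M pixels in the circle.
    return np.array([xs[in_circle].mean(),
                     ys[in_circle].mean(),
                     zs[in_circle].mean()])
```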
step 2, extracting a palm region image of the T frame initial image according to the palm node coordinates of the T frame initial image, wherein the method comprises the following steps:
step 21, selecting one frame of initial image from the T frames of initial images as the current initial image; the palm region image of the current initial image is extracted as follows (a code sketch of this search is given after step 22 below):

taking the palm node of the current initial image as the center, searching for palm pixel points in a rectangular region with width W and height H, and putting the palm pixel points satisfying formula (2) into the palm pixel point set S_k, obtaining the current palm region image;

$$S_k=S_{k-1}\cup\left\{\,g_i\ \middle|\ \mathrm{abs}(d_p-d_i)\le\mathrm{threshold}\,\right\}\qquad(2)$$

in formula (2), k = 1, 2, ... indicates the number of the search, d_p represents the distance between the palm node of the current initial image and the RGBD sensor, g_i represents the i-th pixel point, d_i represents the distance between the i-th pixel point in the rectangular region and the RGBD sensor, abs(d_p - d_i) represents the absolute value of the difference between the distances from the palm node of the current initial image and from the i-th pixel point in the rectangular region to the RGBD sensor, threshold represents a threshold with 25 ≤ threshold ≤ 35, S_k represents the set of gesture pixel points searched for the k-th time, and S_{k-1} represents the set of gesture pixel points searched for the (k-1)-th time;

W and H of the rectangular search region cannot be set too small, lest changes in the size of the gesture region cause incomplete gesture detection; the width and the height of the rectangular search region are given by an expression in terms of (x_w, y_w) and (x_e, y_e) that appears as an image in the original and is not recoverable here, wherein (x_w, y_w) and (x_e, y_e) respectively represent the coordinates of the wrist node and the elbow node corresponding to the palm node of the current initial image;
step 22, repeating step 21 until the palm area images of all T frames of initial images are extracted;
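Under the reconstruction of formula (2) above, the palm pixel search of step 21 can be sketched as follows. The patent describes an iterative search growing S_k out of S_{k-1}; this sketch makes a single vectorized pass over the whole W x H rectangle, which yields the same final set once every pixel of the rectangle has been examined. The function name, image layout and millimetre units are assumptions.

```python
import numpy as np

def extract_palm_region(depth, palm_uv, d_p, W, H, threshold=30):
    """Return a binary palm mask: pixels of the W x H rectangle
    centered on the palm node whose distance to the sensor differs
    from the palm node distance d_p by at most `threshold`
    (25 <= threshold <= 35 in the patent)."""
    u, v = palm_uv                      # palm node column and row
    rows, cols = depth.shape
    left, right = max(u - W // 2, 0), min(u + W // 2, cols)
    top, bottom = max(v - H // 2, 0), min(v + H // 2, rows)

    mask = np.zeros(depth.shape, dtype=np.uint8)
    window = depth[top:bottom, left:right]
    # Formula (2): keep g_i with abs(d_p - d_i) <= threshold.
    mask[top:bottom, left:right] = np.abs(window - d_p) <= threshold
    return mask * 255
```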
step 3, performing dilation and erosion operations on each frame of palm area image among the palm area images of the T frames of initial images to obtain T frames of denoised palm area images;
The palm area image usually contains noise, including burrs along the gesture edges and holes inside the image. To obtain a more accurate gesture image, dilation and erosion operations are needed: erosion removes the burrs on the edge of the binary gesture image and the scattered noise points, and dilation fills the holes inside the image.
Step 4, recognizing the palm area gesture in the T frames of denoised palm area images through a neural network, and obtaining the score P_palm of the recognized palm area gesture through formula (1);
formula (1) appears as an image in the original and is not recoverable here; it computes P_palm from the T per-frame outputs of the neural network, wherein inputting the t-th frame of the denoised palm region images into the neural network gives the t-th frame output, T is the total number of frames of the denoised palm region images, and round(·) denotes rounding to an integer.
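Because formula (1) is only an image in the published text, the following encodes one plausible reading, offered strictly as an assumption: the score is the rounded percentage of the T frames whose network output equals the standard gesture label.

```python
def palm_score(outputs, standard_label):
    """Hypothetical reading of formula (1): P_palm as the rounded
    percentage of per-frame neural-network outputs that match the
    standard gesture label (the exact formula is not recoverable)."""
    T = len(outputs)
    hits = sum(1 for o in outputs if o == standard_label)
    return round(100.0 * hits / T)
```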
Example 2
A real-time arm gesture analysis and evaluation method based on an RGBD depth sensor, as shown in FIG. 3, comprises the following steps:
step 1, acquiring T frames of initial images within a period of time by using an RGBD sensor, taking each frame in turn as the t-th frame initial image, and extracting the arm skeleton node motion sequence of the T frames of initial images;
the method comprises the following steps:
the t-th frame initial image comprises an initial palm node P_1^t, a wrist node P_2^t, an elbow node P_3^t, a shoulder node P_4^t and a shoulder center node P_s^t; the distance D_sn^t from node P_n^t to the shoulder center node P_s^t is obtained through formula (3):

$$D_{sn}^{t}=\sqrt{\left(x_n^t-x_s^t\right)^2+\left(y_n^t-y_s^t\right)^2+\left(z_n^t-z_s^t\right)^2}\qquad(3)$$

in formula (3), n = 1, 2, 3, 4 and t = 1, 2, ..., T; D_sn^t indicates the distance from the node P_n^t to the shoulder center node P_s^t in the t-th frame initial image; T is the total number of frames of initial images; x_n^t, y_n^t, z_n^t respectively represent the coordinate values of the palm node, the wrist node, the elbow node and the shoulder node in the t-th frame initial image; and x_s^t, y_s^t, z_s^t represent the coordinates of the shoulder center node in the t-th frame initial image;

obtaining the motion sequence D_sn = (D_sn^1, D_sn^2, ..., D_sn^t, ..., D_sn^T) of the arm skeleton nodes in the initial images, the arm skeleton nodes comprising the palm node, the wrist node, the elbow node and the shoulder node;
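Formula (3) and the sequences D_sn follow directly from the skeleton stream. In the sketch below, the (T, 5, 3) array layout (palm, wrist, elbow, shoulder, shoulder center per frame) is an assumption about storage, not part of the patent.

```python
import numpy as np

def arm_motion_sequences(joints):
    """Compute D_sn of formula (3) for all frames.

    joints: float array of shape (T, 5, 3) holding, per frame, the
            3-D coordinates of the palm, wrist, elbow and shoulder
            nodes (n = 1..4) and the shoulder center node (index 4).
    Returns shape (4, T): row n-1 is the sequence D_sn over frames.
    """
    nodes = joints[:, :4, :]    # P_n^t, n = 1..4
    center = joints[:, 4:5, :]  # P_s^t, broadcast against the nodes
    return np.linalg.norm(nodes - center, axis=2).T
```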
step 2, finding, in the motion sequence library of the driver standard dynamic gesture samples, the motion sequence of the driver standard dynamic gesture sample whose sum of corresponding-point distances to the motion sequence of the arm skeleton nodes in the initial images is minimal;
In this embodiment, the DTW algorithm is used to handle the different lengths of the two motion sequences. Let the motion sequence of a driver standard dynamic gesture sample be D_a = (D_a^1, D_a^2, ..., D_a^{T'}) and the motion sequence of the arm skeleton nodes in the initial images be D_sn = (D_sn^1, D_sn^2, ..., D_sn^T), and let the point pair relationship between the two sequences be φ(k) = (φ_s(k), φ_a(k)), wherein 1 ≤ φ_s(k) ≤ T, 1 ≤ φ_a(k) ≤ T' and max(T, T') ≤ k ≤ T + T'. The DTW algorithm aims to find the optimal point pair relationship φ(k) between the two sequences such that the sum of the distances between corresponding points, DTW(D_a, D_sn), is minimal:

$$\mathrm{DTW}(D_a,D_{sn})=\min_{\phi}\sum_{k}\left|D_{sn}^{\phi_s(k)}-D_a^{\phi_a(k)}\right|$$
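A standard dynamic-programming implementation of this DTW distance might look as follows; using the absolute difference as the point-to-point cost matches the scalar sequences D_sn and is otherwise an assumption.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """DTW distance between two 1-D motion sequences of possibly
    different lengths, with absolute difference as the local cost."""
    Ta, Tb = len(seq_a), len(seq_b)
    D = np.full((Ta + 1, Tb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            cost = abs(seq_a[i - 1] - seq_b[j - 1])
            # Extend the cheapest of the three admissible alignments.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[Ta, Tb])
```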
step 3, obtaining the score P_arm of the arm dynamic gesture through formula (4);

formula (4) appears as an image in the original and is not recoverable here; it maps the minimal DTW distance found in step 2 to the score P_arm by means of the constant α;

in formula (4), α is the average of the DTW distances between the standard gesture sequence samples,

$$\alpha=\frac{1}{N(N-1)}\sum_{a=1}^{N}\sum_{\substack{b=1\\ b\neq a}}^{N}\mathrm{DTW}(D_a,D_b)$$

wherein D_a and D_b represent any motion sequences in the motion sequence library of the driver standard dynamic gesture samples, a = 1, 2, ..., N, b = 1, 2, ..., N, a ≠ b, and N is the total number of motion sequences in the motion sequence library of the driver standard dynamic gesture samples.
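Reusing dtw_distance from the sketch above, α and the matching of step 2 can be computed as below. Because formula (4) itself is an image in the original, the final mapping from the best DTW distance to P_arm is a hypothetical stand-in and is marked as such in the code.

```python
import itertools
import numpy as np

def alpha_of_library(library):
    """alpha: average DTW distance over all ordered pairs (a, b),
    a != b, of standard gesture sequences in the library."""
    dists = [dtw_distance(a, b)
             for a, b in itertools.permutations(library, 2)]
    return float(np.mean(dists))

def arm_score(d_sn, library):
    """Match the observed sequence against the standard library and
    map the best DTW distance to a score in [0, 100]. The mapping
    below is NOT formula (4), which is unrecoverable; it is only an
    illustrative monotone-decreasing stand-in."""
    best = min(dtw_distance(d_sn, d_a) for d_a in library)
    alpha = alpha_of_library(library)
    return round(100.0 * alpha / (alpha + best))  # hypothetical mapping
```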
Example 3
This embodiment provides, on the basis of Embodiments 1 and 2, a real-time gesture analysis and evaluation method based on an RGBD depth sensor, which comprises the real-time palm gesture analysis and evaluation method provided in Embodiment 1 and the real-time arm gesture analysis and evaluation method provided in Embodiment 2. The embodiment can recognize the driver's palm gesture and arm dynamic gesture simultaneously, evaluate the normativity of the driver's gestures according to the output of the recognition algorithms, and give scores for the palm gesture and the arm dynamic gesture; it can supervise the driver's gestures in real time to ensure the running safety of the train, avoid manual gesture monitoring of the train driver, and reduce the consumption of human resources.
Example 4
The embodiment provides a static gesture recognition and evaluation system for train driver palms, as shown in fig. 1, including:
the device comprises a palm center position determining module, a palm image acquiring module and a palm image acquiring module, wherein the palm center position determining module is used for acquiring T frames of initial images in a period of time by using an RGBD sensor, each frame of initial image in the T frames of initial images comprises a palm node, a wrist node and an elbow node, and the palm node coordinates of the T frames of initial images are determined;
the method comprises the following steps:
step 11, selecting one frame of initial image from the T frames of initial images as the current frame initial image, and obtaining the coordinates of the palm node P in the current frame initial image from the initial palm node as

$$P=\left(\frac{1}{M}\sum_{i=1}^{M}x_i,\ \frac{1}{M}\sum_{i=1}^{M}y_i,\ \frac{1}{M}\sum_{i=1}^{M}z_i\right)$$

wherein M represents the number of white pixel points in the region circle, M is a natural number greater than or equal to 1, x_i represents the abscissa of the i-th pixel, y_i represents the ordinate of the i-th pixel, and z_i represents the distance from the i-th pixel point to the RGBD sensor;

as shown in fig. 2, the region circle is the circle whose center is the initial palm node P_1 and whose radius is the distance between the initial palm node and the wrist node;
as shown in fig. 4, the palm node coordinate P is in a coordinate system with the center of the RGBD sensor as the origin, the horizontal direction as the X axis, the vertical direction as the Y axis, and the direction of the sensor pointing to the driver as the Z axis;
Because the RGBD sensor is prone to node drift when tracking human skeleton nodes, the measured distance between the palm and the sensor deviates from its true value; to reduce this deviation, the initial palm node needs to be corrected as above to obtain the accurate coordinates of the palm node P.
the palm region image extracting module is used for extracting a palm region image of the T frame initial image according to the palm node coordinates of the T frame initial image:
the method comprises the following steps:
step 21, selecting one frame of initial image from the T frames of initial images as the current initial image; the palm region image of the current initial image is extracted as follows:

taking the palm node of the current initial image as the center, searching for palm pixel points in a rectangular region with width W and height H, and putting the palm pixel points satisfying formula (2) into the palm pixel point set S_k, obtaining the current palm region image;

$$S_k=S_{k-1}\cup\left\{\,g_i\ \middle|\ \mathrm{abs}(d_p-d_i)\le\mathrm{threshold}\,\right\}\qquad(2)$$

in formula (2), k = 1, 2, ... indicates the number of the search, d_p represents the distance between the palm node of the current initial image and the RGBD sensor, g_i represents the i-th pixel point, d_i represents the distance between the i-th pixel point in the rectangular region and the RGBD sensor, abs(d_p - d_i) represents the absolute value of the difference between the distances from the palm node of the current initial image and from the i-th pixel point in the rectangular region to the RGBD sensor, threshold represents a threshold with 25 ≤ threshold ≤ 35, S_k represents the set of gesture pixel points searched for the k-th time, and S_{k-1} represents the set of gesture pixel points searched for the (k-1)-th time;

W and H of the rectangular search region cannot be set too small, lest changes in the size of the gesture region cause incomplete gesture detection; the width and the height of the rectangular search region are given by an expression in terms of (x_w, y_w) and (x_e, y_e) that appears as an image in the original and is not recoverable here, wherein (x_w, y_w) and (x_e, y_e) respectively represent the coordinates of the wrist node and the elbow node corresponding to the palm node of the current initial image;
step 22, repeating step 21 until the palm area images of all T frames of initial images are extracted;
a denoising module: used for performing dilation and erosion operations on each frame of palm area image among the palm area images of the T frames of initial images to obtain T frames of denoised palm area images;
The palm area image usually contains noise, including burrs along the gesture edges and holes inside the image. To obtain a more accurate gesture image, dilation and erosion operations are needed: erosion removes the burrs on the edge of the binary gesture image and the scattered noise points, and dilation fills the holes inside the image.
A gesture recognition and evaluation module: used for recognizing the palm area gesture in the T frames of denoised palm area images through a neural network and obtaining the score P_palm of the recognized palm area gesture through formula (1);
formula (1) appears as an image in the original and is not recoverable here; it computes P_palm from the T per-frame outputs of the neural network, wherein inputting the t-th frame of the denoised palm region images into the neural network gives the t-th frame output, T is the total number of frames of the denoised palm region images, and round(·) denotes rounding to an integer.
Example 5
The embodiment provides a dynamic gesture recognition and evaluation system for train driver arms, as shown in fig. 3, including:
an arm skeleton node motion sequence extraction module, used for acquiring T frames of initial images within a period of time by using an RGBD sensor, taking each frame in turn as the t-th frame initial image, and extracting the arm skeleton node motion sequence of the T frames of initial images;
the method comprises the following steps:
the t-th frame initial image comprises an initial palm node P_1^t, a wrist node P_2^t, an elbow node P_3^t, a shoulder node P_4^t and a shoulder center node P_s^t; the distance D_sn^t from node P_n^t to the shoulder center node P_s^t is obtained through formula (3):

$$D_{sn}^{t}=\sqrt{\left(x_n^t-x_s^t\right)^2+\left(y_n^t-y_s^t\right)^2+\left(z_n^t-z_s^t\right)^2}\qquad(3)$$

in formula (3), n = 1, 2, 3, 4 and t = 1, 2, ..., T; D_sn^t indicates the distance from the node P_n^t to the shoulder center node P_s^t in the t-th frame initial image; T is the total number of frames of initial images; x_n^t, y_n^t, z_n^t respectively represent the coordinate values of the palm node, the wrist node, the elbow node and the shoulder node in the t-th frame initial image; and x_s^t, y_s^t, z_s^t represent the coordinates of the shoulder center node in the t-th frame initial image;

obtaining the motion sequence D_sn = (D_sn^1, D_sn^2, ..., D_sn^t, ..., D_sn^T) of the arm skeleton nodes in the initial images, the arm skeleton nodes comprising the palm node, the wrist node, the elbow node and the shoulder node;
a dynamic gesture optimal matching module, used for finding, in the motion sequence library of the driver standard dynamic gesture samples, the motion sequence of the driver standard dynamic gesture sample whose sum of corresponding-point distances to the motion sequence of the arm skeleton nodes in the initial images is minimal;
In this embodiment, the DTW algorithm is used to handle the different lengths of the two motion sequences. Let the motion sequence of a driver standard dynamic gesture sample be D_a = (D_a^1, D_a^2, ..., D_a^{T'}) and the motion sequence of the arm skeleton nodes in the initial images be D_sn = (D_sn^1, D_sn^2, ..., D_sn^T), and let the point pair relationship between the two sequences be φ(k) = (φ_s(k), φ_a(k)), wherein 1 ≤ φ_s(k) ≤ T, 1 ≤ φ_a(k) ≤ T' and max(T, T') ≤ k ≤ T + T'. The DTW algorithm aims to find the optimal point pair relationship φ(k) between the two sequences such that the sum of the distances between corresponding points, DTW(D_a, D_sn), is minimal:

$$\mathrm{DTW}(D_a,D_{sn})=\min_{\phi}\sum_{k}\left|D_{sn}^{\phi_s(k)}-D_a^{\phi_a(k)}\right|$$
an arm dynamic gesture evaluation module, used for obtaining the score P_arm of the arm dynamic gesture through formula (4);

formula (4) appears as an image in the original and is not recoverable here; it maps the minimal DTW distance found by the dynamic gesture optimal matching module to the score P_arm by means of the constant α;

in formula (4), α is the average of the DTW distances between the standard gesture sequence samples,

$$\alpha=\frac{1}{N(N-1)}\sum_{a=1}^{N}\sum_{\substack{b=1\\ b\neq a}}^{N}\mathrm{DTW}(D_a,D_b)$$

wherein D_a and D_b represent any motion sequences in the motion sequence library of the driver standard dynamic gesture samples, a = 1, 2, ..., N, b = 1, 2, ..., N, a ≠ b, and N is the total number of motion sequences in the motion sequence library of the driver standard dynamic gesture samples; D_sn = (D_sn^1, D_sn^2, ..., D_sn^t, ..., D_sn^T) is the motion sequence of the arm skeleton nodes in the initial images.
Example 6
This embodiment provides, on the basis of Embodiments 4 and 5, a real-time gesture analysis and evaluation system based on an RGBD depth sensor, which comprises the real-time palm gesture analysis and evaluation system provided in Embodiment 4 and the real-time arm gesture analysis and evaluation system provided in Embodiment 5. The embodiment can recognize the driver's palm gesture and arm dynamic gesture simultaneously, evaluate the normativity of the driver's gestures according to the output of the recognition algorithms, and give scores for the palm gesture and the arm dynamic gesture; it can supervise the driver's gestures in real time to ensure the running safety of the train, avoid manual gesture monitoring of the train driver, and reduce the consumption of human resources.

Claims (4)

1. A real-time palm gesture analysis and evaluation method based on an RGBD depth sensor is characterized by comprising the following steps:
step 1, acquiring T frames of initial images within a period of time by using an RGBD sensor, wherein each frame of initial image in the T frames of initial images comprises a palm node, a wrist node and an elbow node, and determining the coordinates of the palm node of the T frames of initial images;
the method comprises the following steps:
step 11, selecting one frame of initial image from the T frames of initial images as the current frame initial image, and obtaining the coordinates of the palm node P in the current frame initial image from the initial palm node as

$$P=\left(\frac{1}{M}\sum_{i=1}^{M}x_i,\ \frac{1}{M}\sum_{i=1}^{M}y_i,\ \frac{1}{M}\sum_{i=1}^{M}z_i\right)$$

wherein M represents the number of white pixel points in the region circle, M is a natural number greater than or equal to 1, x_i represents the abscissa of the i-th pixel, y_i represents the ordinate of the i-th pixel, and z_i represents the distance from the i-th pixel point to the RGBD sensor;

the region circle is the circle whose center is the initial palm node P_1 and whose radius is the distance between the initial palm node and the wrist node;
the palm node coordinate P is in a coordinate system which takes the center of the RGBD sensor as an origin, takes the horizontal direction as an X axis, takes the vertical direction as a Y axis, and takes the direction of the sensor pointing to the driver as a Z axis;
step 12, repeating step 11 until each frame initial image in the T frame initial images is used as a current frame initial image, and obtaining palm node coordinates of the T frame initial images;
step 2, extracting a palm region image of the T frame initial image according to the palm node coordinates of the T frame initial image, wherein the method comprises the following steps:
step 21, selecting one frame of initial image from the T frames of initial images as the current initial image; the palm region image of the current initial image is extracted as follows:

taking the palm node of the current initial image as the center, searching for palm pixel points in a rectangular region with width W and height H, and putting the palm pixel points satisfying formula (2) into the palm pixel point set S_k, obtaining the current palm region image;

$$S_k=S_{k-1}\cup\left\{\,g_i\ \middle|\ \mathrm{abs}(d_p-d_i)\le\mathrm{threshold}\,\right\}\qquad(2)$$

in formula (2), k = 1, 2, ... indicates the number of the search, d_p represents the distance between the palm node of the current initial image and the RGBD sensor, g_i represents the i-th pixel point, d_i represents the distance between the i-th pixel point in the rectangular region and the RGBD sensor, abs(d_p - d_i) represents the absolute value of the difference between the distances from the palm node of the current initial image and from the i-th pixel point in the rectangular region to the RGBD sensor, threshold represents a threshold with 25 ≤ threshold ≤ 35, S_k represents the set of gesture pixel points searched for the k-th time, and S_{k-1} represents the set of gesture pixel points searched for the (k-1)-th time;

the width W and the height H of the rectangular search region are given by an expression in terms of (x_w, y_w) and (x_e, y_e) that appears as an image in the original and is not recoverable here, wherein (x_w, y_w) and (x_e, y_e) respectively represent the coordinates of the wrist node and the elbow node corresponding to the palm node of the current initial image;
step 22, repeating step 21 until the palm area images of all T frames of initial images are extracted;
step 3, performing dilation and erosion operations on each frame of palm area image among the palm area images of the T frames of initial images to obtain T frames of denoised palm area images;
step 4, recognizing the palm area gesture in the T frames of denoised palm area images through a neural network, and obtaining the score P_palm of the recognized palm area gesture through formula (1);

formula (1) and the definition of its per-frame term appear as images in the original and are not recoverable here; formula (1) computes P_palm from the T per-frame outputs of the neural network, wherein inputting the t-th frame of the denoised palm region images into the neural network gives the t-th frame output, T is the total number of frames of the denoised palm region images, and round(·) denotes rounding to an integer.
2. A real-time gesture analysis and evaluation method based on an RGBD depth sensor, characterized by comprising the real-time palm gesture analysis and evaluation method of claim 1 and a real-time arm gesture analysis and evaluation method;
the real-time arm gesture analysis and evaluation method comprises the following steps:
step 1, acquiring T frames of initial images within a period of time by using an RGBD sensor, taking each frame in turn as the t-th frame initial image, and extracting the arm skeleton node motion sequence of the T frames of initial images;
the method comprises the following steps:
the t-th frame initial image comprises an initial palm node P_1^t, a wrist node P_2^t, an elbow node P_3^t, a shoulder node P_4^t and a shoulder center node P_s^t; the distance D_sn^t from node P_n^t to the shoulder center node P_s^t is obtained through formula (3):

$$D_{sn}^{t}=\sqrt{\left(x_n^t-x_s^t\right)^2+\left(y_n^t-y_s^t\right)^2+\left(z_n^t-z_s^t\right)^2}\qquad(3)$$

in formula (3), n = 1, 2, 3, 4 and t = 1, 2, ..., T; D_sn^t indicates the distance from the node P_n^t to the shoulder center node P_s^t in the t-th frame initial image; T is the total number of frames of initial images; x_n^t, y_n^t, z_n^t respectively represent the coordinate values of the palm node, the wrist node, the elbow node and the shoulder node in the t-th frame initial image; and x_s^t, y_s^t, z_s^t represent the coordinates of the shoulder center node in the t-th frame initial image;

obtaining the motion sequence D_sn = (D_sn^1, D_sn^2, ..., D_sn^t, ..., D_sn^T) of the arm skeleton nodes in the initial images, the arm skeleton nodes comprising the palm node, the wrist node, the elbow node and the shoulder node;
step 2, finding, in the motion sequence library of the driver standard dynamic gesture samples, the motion sequence of the driver standard dynamic gesture sample whose sum of corresponding-point distances to the motion sequence of the arm skeleton nodes in the initial images is minimal;
step 3, obtaining the score P_arm of the arm dynamic gesture through formula (4);

formula (4) appears as an image in the original and is not recoverable here; it maps the minimal DTW distance found in step 2 to the score P_arm by means of the constant α;

in formula (4), α is the average of the DTW distances between the standard gesture sequence samples,

$$\alpha=\frac{1}{N(N-1)}\sum_{a=1}^{N}\sum_{\substack{b=1\\ b\neq a}}^{N}\mathrm{DTW}(D_a,D_b)$$

wherein D_a and D_b represent any motion sequences in the motion sequence library of the driver standard dynamic gesture samples, a = 1, 2, ..., N, b = 1, 2, ..., N, a ≠ b, and N is the total number of motion sequences in the motion sequence library of the driver standard dynamic gesture samples.
3. A real-time palm gesture analysis and evaluation system based on an RGBD depth sensor is characterized by comprising:
the device comprises a palm center position determining module, a palm image acquiring module and a palm image acquiring module, wherein the palm center position determining module is used for acquiring T frames of initial images in a period of time by using an RGBD sensor, each frame of initial image in the T frames of initial images comprises a palm node, a wrist node and an elbow node, and the palm node coordinates of the T frames of initial images are determined;
the method comprises the following steps:
step 11, selecting one frame of initial image from the T frames of initial images as the current frame initial image, and obtaining the coordinates of the palm node P in the current frame initial image from the initial palm node as

$$P=\left(\frac{1}{M}\sum_{i=1}^{M}x_i,\ \frac{1}{M}\sum_{i=1}^{M}y_i,\ \frac{1}{M}\sum_{i=1}^{M}z_i\right)$$

wherein M represents the number of white pixel points in the region circle, M is a natural number greater than or equal to 1, x_i represents the abscissa of the i-th pixel, y_i represents the ordinate of the i-th pixel, and z_i represents the distance from the i-th pixel point to the RGBD sensor;

the region circle is the circle whose center is the initial palm node P_1 and whose radius is the distance between the initial palm node and the wrist node;
the palm node coordinate P is in a coordinate system which takes the center of the RGBD sensor as an origin, takes the horizontal direction as an X axis, takes the vertical direction as a Y axis, and takes the direction of the sensor pointing to the driver as a Z axis;
step 12, repeating step 11 until each frame initial image in the T frame initial images is used as a current frame initial image, and obtaining palm node coordinates of the T frame initial images;
the palm region image extracting module is used for extracting a palm region image of the T frame initial image according to the palm node coordinates of the T frame initial image:
the method comprises the following steps:
step 21, selecting one frame of initial image from the T frames of initial images as the current initial image; the palm region image of the current initial image is extracted as follows:

taking the palm node of the current initial image as the center, searching for palm pixel points in a rectangular region with width W and height H, and putting the palm pixel points satisfying formula (2) into the palm pixel point set S_k, obtaining the current palm region image;

$$S_k=S_{k-1}\cup\left\{\,g_i\ \middle|\ \mathrm{abs}(d_p-d_i)\le\mathrm{threshold}\,\right\}\qquad(2)$$

in formula (2), k = 1, 2, ... indicates the number of the search, d_p represents the distance between the palm node of the current initial image and the RGBD sensor, g_i represents the i-th pixel point, d_i represents the distance between the i-th pixel point in the rectangular region and the RGBD sensor, abs(d_p - d_i) represents the absolute value of the difference between the distances from the palm node of the current initial image and from the i-th pixel point in the rectangular region to the RGBD sensor, threshold represents a threshold with 25 ≤ threshold ≤ 35, S_k represents the set of gesture pixel points searched for the k-th time, and S_{k-1} represents the set of gesture pixel points searched for the (k-1)-th time;

the width W and the height H of the rectangular search region are given by an expression in terms of (x_w, y_w) and (x_e, y_e) that appears as an image in the original and is not recoverable here, wherein (x_w, y_w) and (x_e, y_e) respectively represent the coordinates of the wrist node and the elbow node corresponding to the palm node of the current initial image;
step 22, repeating step 21 until the palm area images of all T frames of initial images are extracted;
a denoising module: used for performing dilation and erosion operations on each frame of palm area image among the palm area images of the T frames of initial images to obtain T frames of denoised palm area images;
a gesture recognition and evaluation module: used for recognizing the palm area gesture in the T frames of denoised palm area images through a neural network and obtaining the score P_palm of the recognized palm area gesture through formula (1);
formula (1) appears as an image in the original and is not recoverable here; it computes P_palm from the T per-frame outputs of the neural network, wherein inputting the t-th frame of the denoised palm region images into the neural network gives the t-th frame output, T is the total number of frames of the denoised palm region images, and round(·) denotes rounding to an integer.
4. A real-time gesture analysis and evaluation system based on an RGBD depth sensor, characterized by comprising the real-time palm gesture analysis and evaluation system of claim 3 and a real-time arm gesture analysis and evaluation system;
the real-time arm gesture analysis and evaluation system comprises:
an arm skeleton node motion sequence extraction module, used for acquiring T frames of initial images within a period of time by using an RGBD sensor, taking each frame in turn as the t-th frame initial image, and extracting the arm skeleton node motion sequence of the T frames of initial images;
the method comprises the following steps:
the t-th frame initial image comprises an initial palm node P_1^t, a wrist node P_2^t, an elbow node P_3^t, a shoulder node P_4^t and a shoulder center node P_s^t; the distance D_sn^t from node P_n^t to the shoulder center node P_s^t is obtained through formula (3):

$$D_{sn}^{t}=\sqrt{\left(x_n^t-x_s^t\right)^2+\left(y_n^t-y_s^t\right)^2+\left(z_n^t-z_s^t\right)^2}\qquad(3)$$

in formula (3), n = 1, 2, 3, 4 and t = 1, 2, ..., T; D_sn^t indicates the distance from the node P_n^t to the shoulder center node P_s^t in the t-th frame initial image; T is the total number of frames of initial images; x_n^t, y_n^t, z_n^t respectively represent the coordinate values of the palm node, the wrist node, the elbow node and the shoulder node in the t-th frame initial image; and x_s^t, y_s^t, z_s^t represent the coordinates of the shoulder center node in the t-th frame initial image;

obtaining the motion sequence D_sn = (D_sn^1, D_sn^2, ..., D_sn^t, ..., D_sn^T) of the arm skeleton nodes in the initial images, the arm skeleton nodes comprising the palm node, the wrist node, the elbow node and the shoulder node;
a dynamic gesture optimal matching module, used for finding, in the motion sequence library of the driver standard dynamic gesture samples, the motion sequence of the driver standard dynamic gesture sample whose sum of corresponding-point distances to the motion sequence of the arm skeleton nodes in the initial images is minimal;
an arm dynamic gesture evaluation module, used for obtaining the score P_arm of the arm dynamic gesture through formula (4);

formula (4) appears as an image in the original and is not recoverable here; it maps the minimal DTW distance found by the dynamic gesture optimal matching module to the score P_arm by means of the constant α;

in formula (4), α is the average of the DTW distances between the standard gesture sequence samples,

$$\alpha=\frac{1}{N(N-1)}\sum_{a=1}^{N}\sum_{\substack{b=1\\ b\neq a}}^{N}\mathrm{DTW}(D_a,D_b)$$

wherein D_a and D_b represent any motion sequences in the motion sequence library of the driver standard dynamic gesture samples, a = 1, 2, ..., N, b = 1, 2, ..., N, a ≠ b, and N is the total number of motion sequences in the motion sequence library of the driver standard dynamic gesture samples.
CN201710523575.5A 2017-06-30 2017-06-30 Real-time gesture analysis and evaluation method and system based on RGBD depth sensor Expired - Fee Related CN107463873B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710523575.5A CN107463873B (en) 2017-06-30 2017-06-30 Real-time gesture analysis and evaluation method and system based on RGBD depth sensor


Publications (2)

Publication Number Publication Date
CN107463873A CN107463873A (en) 2017-12-12
CN107463873B (en) 2020-02-21

Family

ID=60546461

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710523575.5A Expired - Fee Related CN107463873B (en) 2017-06-30 2017-06-30 Real-time gesture analysis and evaluation method and system based on RGBD depth sensor

Country Status (1)

Country Link
CN (1) CN107463873B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6962878B2 (en) * 2018-07-24 2021-11-05 本田技研工業株式会社 Operation assistance system and operation assistance method
CN110032957B (en) * 2019-03-27 2023-10-17 长春理工大学 Gesture spatial domain matching method based on skeleton node information
CN110175566B (en) * 2019-05-27 2022-12-23 大连理工大学 Hand posture estimation system and method based on RGBD fusion network
CN110717385A (en) * 2019-08-30 2020-01-21 西安文理学院 Dynamic gesture recognition method
CN113657346A (en) * 2021-08-31 2021-11-16 深圳市比一比网络科技有限公司 Driver action recognition method based on combination of target detection and key point detection


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923669A (en) * 2008-07-18 2010-12-22 史迪芬·凯斯 Intelligent adaptive design
CN103914132A (en) * 2013-01-07 2014-07-09 富士通株式会社 Method and system for recognizing gestures based on fingers
CN103926999A (en) * 2013-01-16 2014-07-16 株式会社理光 Palm opening and closing gesture recognition method and device and man-machine interaction method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Finger detection and hand posture recognition based on depth information; Stergios Poularakis et al.; IEEE International Conference on Acoustics, Speech and Signal Processing; May 2014; pp. 1-5 *
Traffic police gesture recognition based on Kinect skeleton information (基于Kinect骨架信息的交通警察手势识别); Liu Yang et al.; Computer Engineering and Applications (计算机工程与应用); March 2015, No. 3; pp. 157-161 *

Also Published As

Publication number Publication date
CN107463873A (en) 2017-12-12

Similar Documents

Publication Publication Date Title
CN107463873B (en) Real-time gesture analysis and evaluation method and system based on RGBD depth sensor
US11643076B2 (en) Forward collision control method and apparatus, electronic device, program, and medium
CN111611643B (en) Household vectorization data acquisition method and device, electronic equipment and storage medium
WO2019232894A1 (en) Complex scene-based human body key point detection system and method
CN109559330B (en) Visual tracking method and device for moving target, electronic equipment and storage medium
CN108256421A (en) A kind of dynamic gesture sequence real-time identification method, system and device
CN106934333B (en) Gesture recognition method and system
CN106600625A (en) Image processing method and device for detecting small-sized living thing
CN109145696B (en) Old people falling detection method and system based on deep learning
Rahman et al. Person identification using ear biometrics
Kalsh et al. Sign language recognition system
CN111914832B (en) SLAM method of RGB-D camera under dynamic scene
CN113608663B (en) Fingertip tracking method based on deep learning and K-curvature method
CN110796101A (en) Face recognition method and system of embedded platform
US9160986B2 (en) Device for monitoring surroundings of a vehicle
Chansri et al. Reliability and accuracy of Thai sign language recognition with Kinect sensor
CN115527269A (en) Intelligent human body posture image identification method and system
KR20190050551A (en) Apparatus and method for recognizing body motion based on depth map information
CN103426000A (en) Method for detecting static gesture fingertip
CN102609727A (en) Fire flame detection method based on dimensionless feature extraction
CN109492573A (en) A kind of pointer read method and device
CN117409386A (en) Garbage positioning method based on laser vision fusion
CN112381747A (en) Terahertz and visible light image registration method and device based on contour feature points
CN103093481A (en) Moving object detection method under static background based on watershed segmentation
CN111860084A (en) Image feature matching and positioning method and device and positioning system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200221