CN103810480B - Method for detecting gesture based on RGB-D image

Info

Publication number: CN103810480B
Application number: CN201410073064.4A
Authority: CN
Inventors: 张维忠; 丁洁玉; 赵志刚; 张峰; 李明; 王青林
Original assignee: QINGDAO ANIMATION; Qingdao Broadcasting And Tv Wireless Media Group Co ltd; Qingdao University
Current assignee: Qingdao Fruit Science And Technology Service Platform Co ltd; Shenzhen Micagent Technology Co ltd
Priority date: 2014-02-28
Filing date: 2014-02-28
Publication date: 2017-01-18
Anticipated expiration: 2034-02-28
Also published as: CN103810480A

Abstract

The present invention provides a gesture detection method based on RGB-D images, comprising: a first step of acquiring the RGB-D image; a second step of segmenting the hand from the background; a third step of recognizing the gesture; and a fourth step of finding the optimal segmentation of the gesture. The method effectively segments the human hand region, segments accurately, obtains good gesture segmentation even when the hand is partially self-occluded or other people in the background interfere, and is robust.

Description

Gesture Detection Method Based on RGB-D Image

Technical Field

The invention relates to the technical field of digital image processing, and in particular to a gesture detection method based on RGB-D images.

Background

Human-computer interaction interfaces need to be as intuitive and natural as possible. Users should be able to interact with machines without cumbersome equipment (such as colored markers or gloves) or devices such as remote controls, mice, and keyboards. Gestures provide a simple means of communication that combines well with machine intelligence. Gesture systems have been applied successfully in many research and industrial fields, for example game control, virtual environments, smart homes, and sign language recognition.

The quality of gesture segmentation directly affects the precision and accuracy of subsequent gesture feature extraction, tracking, and recognition. In recent years, researchers in China and abroad have proposed a variety of gesture segmentation methods, mainly template matching, image differencing, skin color segmentation, and constraint-based methods. Template matching relies on a hand-shape database and compares the gesture image with the templates in that database. Because the hand is a non-rigid object, the comparison is computationally expensive and difficult, and it struggles to meet real-time requirements. Constraint-based methods simplify the separation of the gesture region (foreground) from the background by having the user wear colored gloves or by heightening the contrast between the hand and the background, but these constraints limit the convenience and freedom of gesture-based communication. Image differencing segments the gesture by subtracting a static background image from the image of the moving gesture; its drawback is that it cannot cope with offsets between corresponding pixels of the two images. Skin color segmentation segments the gesture according to the clustering properties of skin color, but the apparent skin color is strongly affected by the angle of the hand relative to the light source.

For vision-based gesture recognition, which must be fast, convenient, and practical, each of these methods used alone has limitations: none can segment gestures accurately and in real time, and the segmentation result suffers seriously. Patent CN 103226708 A also combines the depth image and the color image for gesture segmentation, but it assumes that the hand is at the very front of the body. Others have proposed a similar approach, but it requires the RGB camera and the depth camera to be calibrated first, which makes the algorithm more complex and cumbersome.

Summary of the Invention

The technical problem to be solved by the present invention is to overcome the defects of the gesture detection methods mentioned above and to provide a gesture detection method based on RGB-D images that effectively segments the hand region, segments accurately, obtains good gesture segmentation even when the hand is partially self-occluded or other people in the background interfere, and is robust.

To solve the above technical problem, the present invention provides a gesture detection method based on RGB-D images, comprising:

The first step is to acquire the RGB-D image;

The second step is to segment the hand from the background;

The third step is to optimize the segmentation with a convex function;

The fourth step is to find the optimal segmentation of the gesture.

Specifically, the first step uses a depth sensor to acquire a color image (RGB Image) stream and a depth image (Depth Image) stream, which together form the RGB-D image data stream, and converts the stream into individual frames for subsequent image processing.

Specifically, the second step maps the hand position onto the depth image using the pixel ratio between the skeleton map and the depth image, and uses the depth image information to segment the hand from the background.

Specifically, the third step uses a convex function to optimize the segmentation of the RGB-D gesture image.

Specifically, the fourth step uses the minimization functional and its constraints, solves the model with the fast Split Bregman algorithm, and finds the optimal segmentation of the RGB-D image.

Beneficial effects of the present invention:

The RGB-D gesture detection method provided by the present invention effectively segments the human hand region, segments accurately, obtains good gesture segmentation even when the hand is partially self-occluded or other people in the background interfere, and is robust.

Brief Description of the Drawings

Figures 1a-1e show segmentation results based on the color image, the depth image, and the RGB-D image: Figure 1a, color image; Figure 1b, depth image; Figure 1c, segmentation result for the color image; Figure 1d, segmentation result for the depth image; Figure 1e, segmentation result for the RGB-D image.

Figures 2a-2e show the same kinds of segmentation results for a second scene: Figure 2a, color image; Figure 2b, depth image; Figure 2c, segmentation result for the color image; Figure 2d, segmentation result for the depth image; Figure 2e, segmentation result for the RGB-D image.

Detailed Description

The present invention provides a gesture detection method based on RGB-D images, comprising:

The first step is to acquire the RGB-D image;

Specifically, the first step uses a depth sensor to acquire a color image stream (RGB Image stream) and a depth image stream (Depth Image stream), which together form the RGB-D image data stream, and converts the stream into individual frames for subsequent image processing.

The depth sensor acquires depth images and RGB color images simultaneously, supports real-time full-body and skeleton tracking, and can recognize a range of postures and actions. In this application it is used to acquire the gesture data.

The purpose of gesture detection is to segment the hand region effectively from the original image, that is, to separate the hand region (foreground) from everything else (background); this is an essential foundation for gesture recognition. The depth sensor can analyze depth data and detect the outline of a human body or player. It delivers the color and depth data streams, which are converted into individual frames for subsequent image processing. The input RGB image and depth image must be pixel-aligned and time-synchronized. Once an image pair satisfying these conditions has been obtained, the input images are preprocessed, for example by filtering, to suppress noise.
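
As an illustration of the preprocessing step, the following Python sketch applies a 3x3 median filter to a depth frame. The patent only says "filtering", so the choice of a median filter, the function name `median_filter3`, and the nested-list image representation are assumptions made here to keep the example dependency-free.

```python
from statistics import median


def median_filter3(img):
    """3x3 median filter with edge replication; returns a new grid."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the 3x3 neighborhood, clamping indices at the image border.
            window = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out
```

A median filter is a common choice for depth maps because it removes isolated outliers without blurring depth edges as much as an averaging filter would; other filters would fit the text equally well.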

Both color and depth images can be used for gesture segmentation. A color image is sharp, but it contains only two-dimensional information and is sensitive to interference. A depth image has lower resolution than a color image, but it contains three-dimensional information and is resistant to interference. Because the skeleton map tracks the coordinates of the hand, the position of the hand in the skeleton map is easy to determine. The hand position is then mapped onto the depth image using the pixel ratio between the skeleton map and the depth image, and the depth information is used to segment the hand from the background. Because the depth image has low resolution and is easily disturbed by objects at the same depth, this segmentation alone is not ideal. This application therefore proposes a detection method that combines the depth image and the color image.
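
The mapping and the depth-only segmentation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the depth band `band_mm`, and the search `window` are hypothetical, and the depth frame is a nested list of depth values.

```python
def map_to_depth(hand_xy, skeleton_size, depth_size):
    """Scale a hand joint from skeleton-map pixels to depth-image pixels."""
    sx = depth_size[0] / skeleton_size[0]  # horizontal pixel ratio
    sy = depth_size[1] / skeleton_size[1]  # vertical pixel ratio
    return int(hand_xy[0] * sx), int(hand_xy[1] * sy)


def segment_hand(depth, hand_px, band_mm=100, window=2):
    """Mark pixels near the hand whose depth is within band_mm of the hand depth."""
    hx, hy = hand_px
    hand_depth = depth[hy][hx]
    mask = [[0] * len(row) for row in depth]
    for y in range(max(0, hy - window), min(len(depth), hy + window + 1)):
        for x in range(max(0, hx - window), min(len(depth[0]), hx + window + 1)):
            if abs(depth[y][x] - hand_depth) <= band_mm:
                mask[y][x] = 1
    return mask
```

Any object inside the window at roughly the same depth as the hand ends up in the mask, which is the weakness of depth-only segmentation that the text points out.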

For the segmentation optimization, we formulate image segmentation as the minimization of a functional:

$E(u) = \int_{\Omega} f(x)\,u(x)\,dx + \int_{\Omega} |Du(x)| \quad (1)$

where u ∈ BV(ℝ^d; {0, 1}) is a binary indicator function of bounded variation; u = 1 and u = 0 mark the interior and exterior of a surface in ℝ^d, that is, a set of closed boundaries in two-dimensional image segmentation or a set of closed surfaces in three-dimensional segmentation. The second term of formula (1) is the total variation, where Du denotes the distributional derivative, which for a differentiable function u reduces to Du(x) = ∇u(x) dx. Relaxing the binary constraint lets u take values between 0 and 1, and the optimization problem becomes the minimization of the convex functional (1) over the convex set BV(ℝ^d; [0, 1]).

A global optimum can be reached by formulating the functional in a spatially continuous setting and combining convex optimization with thresholding. The thresholding theorem ensures that thresholding the solution u* of the relaxed problem yields a global optimum of the original binary labeling problem. A global minimizer of formula (1) is computed as follows: compute a global minimizer u* of formula (1) over the convex set BV(ℝ^d; [0, 1]), then threshold u* at any value θ ∈ (0, 1).
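
The thresholding step can be illustrated with a short Python sketch. It assumes the relaxed minimizer u* is already available as a nested list of values in [0, 1]; the function name `threshold` and the default θ = 0.5 are illustrative choices, not taken from the patent.

```python
def threshold(u_star, theta=0.5):
    """Binarize a relaxed labeling u* in [0, 1] at a level theta in (0, 1)."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    return [[1 if value > theta else 0 for value in row] for row in u_star]
```

According to the thresholding theorem cited in the text, the choice of θ inside (0, 1) does not affect optimality, so the default here is only a convention.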

Because the RGB-D image supplies additional depth information, the boundary length can be measured in absolute terms, by weighting |Du(x)| with the depth d(x), rather than only in the image domain. Functional (1) can then be generalized to:

$E(u) = \int_{\Omega} f(x)\,u(x)\,dx + \int_{\Omega} d(x)\,|Du(x)| \quad (2)$

Here d: Ω → ℝ is the depth value. Formula (2) compensates for the undesirable effect of perspective projection: the farther an object is from the camera, the smaller it appears in the image.
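
To make functional (2) concrete, the sketch below evaluates a discrete version of it on a pixel grid. The discretization (forward differences with zero gradient at the far border) and the name `energy` are assumptions of this example; the data term f is taken as given, since the text does not define it at this point.

```python
import math


def energy(u, f, d):
    """Discrete functional (2): data term plus depth-weighted total variation."""
    h, w = len(u), len(u[0])
    data = sum(f[y][x] * u[y][x] for y in range(h) for x in range(w))
    tv = 0.0
    for y in range(h):
        for x in range(w):
            # Forward differences; the gradient is set to zero past the border.
            ux = u[y][x + 1] - u[y][x] if x + 1 < w else 0.0
            uy = u[y + 1][x] - u[y][x] if y + 1 < h else 0.0
            tv += d[y][x] * math.hypot(ux, uy)
    return data + tv
```

Setting every d(x) to 1 recovers the unweighted functional (1), so the same function can be used to compare the two models on a given labeling.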

For the constraints on RGB-D images, we use the depth information to constrain the moments of the segmentation and show how these constraints affect the feasible set of the embedded convex optimization problem. Let B = BV(Ω; [0, 1]) denote the convex set of labeling functions of bounded variation defined over the whole image domain. Area constraint: the area of the shape u corresponds to its zeroth-order moment and can be computed by formula (3):

$\mathrm{Area}(u) := \int_{\Omega} d^{2}(x)\,u(x)\,dx \quad (3)$

where d(x) gives the depth of pixel x. Assume d(x) = K·D(x), where K is the focal length of the camera and D(x) is the measured depth of the pixel. Then d²(x) is the size of the pixel's projection in 3D space, so the integral measures surface area in space rather than projected area in the image. All pixels are treated in the same way, following the approach of Grenander, U., Chow, Y., Keenan, D.M.: Hands: A Pattern Theoretic Study of Biological Shapes. Springer, New York (1991).
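
A discrete version of formula (3) is a weighted pixel count. The sketch below assumes nested-list inputs; the function name `area` and the default scale `k=1.0` are illustrative.

```python
def area(u, depth, k=1.0):
    """Discrete formula (3): sum of d(x)^2 * u(x), with d(x) = k * D(x)."""
    return sum((k * D) ** 2 * value
               for u_row, d_row in zip(u, depth)
               for value, D in zip(u_row, d_row))
```

With k = 1, a 2x2 patch of ones at depth 1 has area 4, while the same four pixels at depth 2 have area 16. This is the intended effect: equal pixel counts at different depths correspond to different surface areas.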

The absolute area of the shape u is constrained to lie between constants c₁ ≤ c₂ by restricting u to the set in formula (4):

$C_0 = \{\, u \in B \mid c_1 \le \mathrm{Area}(u) \le c_2 \,\} \quad (4)$

Because Area(u) depends linearly on u, the set C₀ is convex for all constants c₂ ≥ c₁ ≥ 0.

Usually, either the exact area is prescribed by setting c₁ = c₂, or upper and lower bounds on the area are imposed. Alternatively, a soft area constraint can be imposed by extending functional (1) as in formula (5):

$E_{total}(u) = E(u) + \lambda \left( \int_{\Omega} d^{2}(x)\,u(x)\,dx - c \right)^{2} \quad (5)$

Formula (5) adds a soft constraint with weight λ > 0 that drives the estimated area of the shape toward c ≥ 0. Formula (5) is also convex.
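
The two convexity claims (for the set C₀ and for formula (5)) follow from the linearity of Area(u). A short derivation, written here as a reading aid and not taken from the patent text:

```latex
% Area is linear in u: for u, v \in B and t \in [0, 1],
\mathrm{Area}(t u + (1 - t) v) = t\,\mathrm{Area}(u) + (1 - t)\,\mathrm{Area}(v)

% Hence, if c_1 \le \mathrm{Area}(u) \le c_2 and c_1 \le \mathrm{Area}(v) \le c_2, then
c_1 \le \mathrm{Area}(t u + (1 - t) v) \le c_2
% so t u + (1 - t) v \in C_0, and C_0 is convex.

% Soft constraint: let g(s) = \lambda (s - c)^2 with \lambda > 0, which is convex in s. Then
E_{total}(u) = E(u) + g(\mathrm{Area}(u))
% g \circ \mathrm{Area} is a convex function composed with a linear map, hence convex,
% and the sum of the convex functional E and this term is convex.
```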

In the fast Split Bregman algorithm, maximizing a likelihood function is equivalent to maximizing its natural logarithm. This application first applies the Split method to RGB-D image segmentation and establishes the following general model:

$\min_{\omega,\; u \in \{0, 1\}} \left\{ E(\omega, u) = \alpha_1 \int_{\Omega} Q_1(x, \omega_1)\,u\,dx\,dy + \alpha_2 \int_{\Omega} Q_2(x, \omega_2)\,(1 - u)\,dx\,dy + \gamma \int_{\Omega} |\nabla u|\,dx\,dy \right\} \quad (7)$

where Q_i = −ln P_i, i = 1, 2; ω = (μ, σ) = Max(P_i), i = 1, 2; and u is a binary labeling function used to represent the curve evolution.
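
The data terms Q_i = −ln P_i can be illustrated under the assumption that each P_i is a Gaussian density with parameters ω_i = (μ_i, σ_i). The patent names the parameters μ and σ but does not state the distribution, so the Gaussian form and the function names below are assumptions of this sketch.

```python
import math


def neg_log_gaussian(x, mu, sigma):
    """Q = -ln P for a Gaussian density P with mean mu and standard deviation sigma."""
    return 0.5 * ((x - mu) / sigma) ** 2 + math.log(sigma * math.sqrt(2.0 * math.pi))


def region_cost(x, omega1, omega2, alpha1=1.0, alpha2=1.0):
    """alpha1 * Q1 - alpha2 * Q2; negative values favor the label u = 1 at this pixel."""
    return alpha1 * neg_log_gaussian(x, *omega1) - alpha2 * neg_log_gaussian(x, *omega2)
```

A pixel value close to μ₁ gives a small Q₁ and a large Q₂, so the weighted difference is negative and model (7) is cheaper with u = 1 at that pixel.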

This application introduces the Split Bregman idea into the general RGB-D segmentation model: on top of the Split method, a splitting variable w = [w₁, w₂]^T is introduced first, followed by the Bregman distance b = (b₁, b₂)^T, which transforms the functional extremum problem of formula (7) into:

$b^{k+1} = b^{k} + \nabla u^{k} - w^{k} \quad (8)$

$(u^{k+1}, w^{k+1}) = \arg\min_{w,\; u \in [0, 1]} \left\{ E(u, w) = \gamma \int_{\Omega} |w|\,dx\,dy + \frac{\theta}{2} \int_{\Omega} (w - \nabla u - b^{k+1})^{2}\,dx\,dy + \int_{\Omega} r(u_1, u_2)\,u\,dx\,dy \right\} \quad (9)$

where r(u₁, u₂) = α₁Q₁(x, ω₁) − α₂Q₂(x, ω₂). This follows from the data terms of formula (7), since α₁Q₁u + α₂Q₂(1 − u) = (α₁Q₁ − α₂Q₂)u + α₂Q₂ and the last term does not depend on u. Formula (9) is an extremum problem for an energy functional of two variables and is usually solved by alternating optimization. First, with w held fixed, the problem reduces to an extremum problem in u:

$\min_{u} E(u) = \frac{\theta}{2} \int_{\Omega} (w - \nabla u - b^{k+1})^{2}\,dx\,dy + \int_{\Omega} r(u_1, u_2)\,u\,dx\,dy \quad (10)$

Then, with u held fixed, solve the extremum problem in w:

$\min_{w} E(w) = \gamma \int_{\Omega} |w|\,dx\,dy + \frac{\theta}{2} \int_{\Omega} (w - \nabla u - b^{k+1})^{2}\,dx\,dy \quad (11)$

The variational method gives the Euler-Lagrange equation of the energy functional (10):

$\begin{cases} r(u_1, u_2) - \theta\, \nabla \cdot (\nabla u + b^{k+1} - w^{k}) = 0 & \text{in } \Omega \\ (\nabla u + b^{k+1} - w^{k}) \cdot \vec{n} = 0 & \text{on } \partial\Omega \end{cases} \quad (12)$

Formula (12) can be solved with a fast Gauss-Seidel iteration. Because u takes values in [0, 1] after convex relaxation, u is constrained to this range by the following projection:

$u^{k+1} = \max(\min(u^{k+1}, 1), 0) \quad (13)$
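
One possible discretization of the u-update is sketched below: a single in-place Gauss-Seidel sweep for equation (12) followed by the projection (13). The 5-point Laplacian, the backward-difference divergence, the edge replication for the boundary condition, and the function name are assumptions of this example; the patent does not specify the discretization.

```python
def gauss_seidel_sweep(u, r, w, b, theta):
    """One in-place Gauss-Seidel sweep for equation (12), then projection (13).

    u, r: H x W grids; w, b: pairs (x-component grid, y-component grid).
    """
    h, wd = len(u), len(u[0])
    # p = w - b is the vector field whose divergence enters the update.
    px = [[w[0][y][x] - b[0][y][x] for x in range(wd)] for y in range(h)]
    py = [[w[1][y][x] - b[1][y][x] for x in range(wd)] for y in range(h)]
    for y in range(h):
        for x in range(wd):
            # Sum of the four neighbors; edge values are replicated at the border.
            nb = (u[y][max(x - 1, 0)] + u[y][min(x + 1, wd - 1)]
                  + u[max(y - 1, 0)][x] + u[min(y + 1, h - 1)][x])
            # Backward-difference divergence of p.
            div = (px[y][x] - (px[y][x - 1] if x > 0 else 0.0)
                   + py[y][x] - (py[y - 1][x] if y > 0 else 0.0))
            # Discrete form of (12): (nb - 4u) = r / theta + div(w - b).
            beta = (nb - r[y][x] / theta - div) / 4.0
            u[y][x] = max(min(beta, 1.0), 0.0)  # projection (13)
    return u
```

Pixels with strongly negative r are pushed toward 1 and pixels with strongly positive r toward 0, while the neighbor average keeps the labeling smooth.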

After the energy functional (10) has been solved, the energy functional (11) is solved next. The Euler-Lagrange equation of formula (11) is:

$w = \nabla u^{k+1} + b^{k+1} - \frac{\gamma}{\theta} \frac{w}{|w|} \quad (14)$

Its analytical solution is given by the generalized soft-thresholding formula:

$w^{k+1} = \max\left( |\nabla u^{k+1} + b^{k+1}| - \frac{\gamma}{\theta},\, 0 \right) \frac{\nabla u^{k+1} + b^{k+1}}{|\nabla u^{k+1} + b^{k+1}|} \quad (15)$
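
The closed-form update (15) and the Bregman update (8) act independently at each pixel, which the following sketch makes explicit. The function names and the per-pixel tuple representation are illustrative; the zero-norm guard is added here to avoid a division by zero that formula (15) leaves implicit.

```python
import math


def shrink(gx, gy, bx, by, gamma, theta):
    """Formula (15) at one pixel: soft-threshold the vector grad(u) + b by gamma / theta."""
    vx, vy = gx + bx, gy + by
    norm = math.hypot(vx, vy)
    if norm == 0.0:
        return 0.0, 0.0
    scale = max(norm - gamma / theta, 0.0) / norm
    return scale * vx, scale * vy


def bregman_update(b, grad_u, w):
    """Formula (8) at one pixel: b + grad(u) - w, componentwise."""
    return b[0] + grad_u[0] - w[0], b[1] + grad_u[1] - w[1]
```

For example, the vector (3, 4) has length 5, so with γ/θ = 1 it is shortened to length 4 while keeping its direction; vectors shorter than γ/θ are set to zero.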

The following example describes the implementation of the present invention in detail, so that the way the invention applies technical means to solve the technical problem and achieve the technical effect can be fully understood and put into practice.

The experimental comparison between this method and other methods is shown in the figures. The segmentation method is tested on the two scenes of Figure 1 and Figure 2; the experiment aims to segment an individual's gesture from a group of people. The figures show that RGB-D gesture segmentation outperforms segmentation based on the color image or the depth image alone. As shown in Figure 1(c), using only the RGB color information, the algorithm segments the hand, the face, and part of the wall, and fails to isolate the desired gesture. As shown in Figure 1(d), using only the depth information, the hand and the body parts at the same depth as the hand are segmented together. Neither source of information alone gives a satisfactory segmentation. As shown in Figure 1(e), when RGB and depth information are considered together, that is, with RGB-D image information, the hand region is segmented on its own and the difficulty is resolved. The algorithm of this application is also robust in complex scenes, as shown in Figure 2: new people at different depths are added to the scene, and the target gesture is still segmented well.

The primary implementation of this intellectual property described above does not limit other forms of implementing this new product and/or new method. Those skilled in the art may use this information to modify the above content and achieve similar implementations. However, all modifications or adaptations based on the new product of the present invention fall within the reserved rights.

The above is only a preferred embodiment of the present invention and does not limit the invention to other forms. Any person skilled in the art may use the technical content disclosed above to change or modify it into an equivalent embodiment. However, any simple modification, equivalent change, or adaptation made to the above embodiment according to the technical essence of the present invention, without departing from the content of its technical solution, still falls within the protection scope of the technical solution of the present invention.

Claims

1. A gesture detection method based on RGB-D images, characterized in that it comprises:

a first step of acquiring the RGB-D image;

a second step of segmenting the hand from the background;

a third step of optimizing the segmentation with a convex function;

a fourth step of finding the optimal segmentation of the gesture;

wherein the first step specifically uses a depth sensor to acquire a color image stream (RGB Image stream) and a depth image stream (Depth Image stream), which together form the RGB-D image data stream, and converts the stream into individual frames for subsequent image processing;

the second step specifically maps the hand position onto the depth image using the pixel ratio between the skeleton map and the depth image, and uses the depth image information to segment the hand from the background;

the third step specifically uses a convex function to optimize the segmentation of the RGB-D gesture image;

for the segmentation optimization, image segmentation is defined as the minimization of a functional:

$E(u) = \int_{\Omega} f(x)\,u(x)\,dx + \int_{\Omega} d(x)\,|Du(x)| \quad (2)$

where u ∈ BV(ℝ^d; {0, 1}) is a binary indicator function of bounded variation; u = 1 and u = 0 mark the interior and exterior of a surface in ℝ^d, that is, a set of closed boundaries in two-dimensional image segmentation or a set of closed surfaces in three-dimensional segmentation; the second term of formula (2) is the total variation, where Du denotes the distributional derivative, which for a differentiable function u reduces to Du(x) = ∇u(x) dx; relaxing the binary constraint lets u take values between 0 and 1; the segmentation optimization minimizes the convex functional (2) over the convex set BV(ℝ^d; [0, 1]); d: Ω → ℝ is the depth value;

the fourth step specifically uses the minimization functional and its constraints, solves the model with the fast Split Bregman algorithm, and finds the optimal segmentation of the RGB-D image;

in the fast Split Bregman algorithm, maximizing a likelihood function is equivalent to maximizing its natural logarithm; the Split method is first applied to RGB-D image segmentation, establishing the following general model:

$\min_{\omega,\; u \in \{0, 1\}} \left\{ E(\omega, u) = \alpha_1 \int_{\Omega} Q_1(x, \omega_1)\,u\,dx\,dy + \alpha_2 \int_{\Omega} Q_2(x, \omega_2)\,(1 - u)\,dx\,dy + \gamma \int_{\Omega} |\nabla u|\,dx\,dy \right\} \quad (7)$

where Q_i = −ln P_i, i = 1, 2; ω = (μ, σ) = Max(P_i), i = 1, 2; and u is a binary labeling function used to represent the curve evolution.