A static gesture recognition method combining depth information and skin color information
Technical field
The present invention relates to the technical field of image recognition, and in particular to a static gesture recognition method combining depth information and skin color information.
Background art
With the development of related technologies such as computer vision and pattern recognition, research on human-computer interaction has been greatly advanced, and people can now communicate with computers more naturally instead of being limited to the keyboard and mouse. Vision-based static gesture recognition, as a natural mode of human-computer interaction, has increasingly become a research hotspot of human-computer interaction technology and is widely applied in fields such as smart home, robot control and sign language recognition.
In static gesture recognition based on a monocular camera, separating the hand from the background of the image is a difficulty. Early gesture recognition systems used means such as wearing gloves or markers and simplifying complex backgrounds to create environments in which the gesture was easily separated from the background, and thereby obtained high recognition rates. In practice, however, the environment is often hard to predict, so such methods cannot meet the needs of human-computer interaction under real conditions.
A static gesture is represented only by the palm and fingers, so any arm region remaining after hand segmentation is redundant and will interfere with and affect the gesture recognition result. Many static gesture recognition methods consider only the case where the arm is not exposed, and the gesture images used for training and classification do not include the arm at all. Because arm interference is not taken into account, these methods are difficult to apply widely.
To cope with problems such as gesture translation, gesture rotation and gesture scale variation that arise during gesture recognition, the features chosen to describe a gesture must be invariant to translation, rotation and scale. The gesture features extracted by existing gesture recognition methods are relatively limited; combining multiple kinds of features can effectively improve the accuracy and robustness of a gesture recognition method.
To achieve higher classification accuracy, the classifiers chosen by many gesture recognition methods must be trained on very large sample sets, which takes a long time. In addition, classification itself is slow, making it difficult to meet the real-time requirement of a gesture recognition system.
Summary of the invention
The purpose of the present invention is to overcome the above drawbacks in the prior art by providing a static gesture recognition method combining depth information and skin color information.
The purpose of the present invention can be achieved by adopting the following technical solution:
A static gesture recognition method combining depth information and skin color information, the static gesture recognition method comprising:
an image acquisition step: acquiring an RGB image and a depth image simultaneously with a Kinect, and obtaining the depth information corresponding to every pixel in the RGB image;
a hand segmentation step: extracting the hand region in the RGB image by setting a depth threshold and using human skin color information, to obtain a hand binary image;
an arm removal step: judging whether an arm region exists in the hand image by means of a distance transform operation combined with a palm cut-off circle and a threshold method, and removing any existing arm region by an XOR operation between images, to obtain a gesture binary image;
a feature extraction step: computing the Fourier descriptors and the fingertip number of the gesture image to form the gesture feature vector;
a gesture recognition step: classifying the input gesture feature vector with a support vector machine.
Further, the Kinect is located 200 mm to 3500 mm directly in front of the subject's body, and the subject's hand is the part of the body closest to the Kinect.
Further, the hand segmentation step proceeds as follows:
combining the depth image and the RGB image, and performing segmentation with a depth threshold to obtain an RGB image containing the hand;
for the RGB image containing the hand, extracting the skin color region of the RGB image using human skin color information to realize skin color segmentation, and obtaining the hand binary image.
Further, the process of combining the depth image and the RGB image and performing segmentation with a depth threshold to obtain the RGB image containing the hand is as follows: locating the pixel closest to the Kinect in the depth image, determining the pixels whose depth lies within the depth threshold of that pixel, retaining the values of the corresponding pixels in the RGB image, and setting the remaining pixels of the RGB image to black.
Further, the process of extracting the skin color region of the RGB image containing the hand using human skin color information to realize skin color segmentation and obtain the hand binary image is as follows:
the image is transformed from the RGB color space to the YCr'Cb' color space; the luminance component is computed as follows:
y(x, y) = 0.299 × r(x, y) + 0.587 × g(x, y) + 0.114 × b(x, y)
where r(x, y), g(x, y) and b(x, y) are respectively the red, green and blue components at coordinate (x, y) of the RGB image containing the hand, y(x, y) is the luminance component of the image at coordinate (x, y) in the YCr'Cb' color space, cr'(x, y) is the red chrominance component of the image at coordinate (x, y) in the YCr'Cb' color space, and cb'(x, y) is the blue chrominance component of the image at coordinate (x, y) in the YCr'Cb' color space;
skin color segmentation is then performed with a threshold method, according to the following rule:
f(x, y) = 1 if cr'min ≤ cr'(x, y) ≤ cr'max and cb'min ≤ cb'(x, y) ≤ cb'max, and f(x, y) = 0 otherwise,
where f(x, y) is the pixel value at coordinate (x, y) of the hand binary image, and cr'min, cr'max, cb'min, cb'max are the minimum and maximum values of the cr' component and the minimum and maximum values of the cb' component used by the threshold method in the YCr'Cb' color space.
Further, the arm removal step proceeds as follows:
applying a distance transform to the hand binary image to obtain a distance transform map, in which the value of each pixel of the hand region is the minimum distance from the corresponding pixel to the hand boundary, and the values of the remaining pixels outside the hand region are 0;
according to the geometric characteristics of the hand, taking the pixel with the maximum value in the distance transform map as the palm center and denoting its pixel value by R0, thereby accurately locating the palm center;
drawing a circle on the distance transform map centered at the palm center with radius R1 = 1.35 × R0 and setting the pixels inside the circle to 0; this palm cut-off circle removes the influence of the palm region on subsequent operations;
computing the maximum pixel value Rmax of the image obtained by the above processing, computing Rmax/R0 and comparing it with a threshold T: if the value is less than T, it is judged that no arm region exists, the hand binary image is the gesture binary image, and the method proceeds to the feature extraction step; if the value is greater than T, it is judged that an arm region exists, and the subsequent operations must be performed to remove it;
removing the arm region from the image obtained by the above processing using an eight-connected region labeling algorithm, to obtain a binary image containing only the fingers;
removing the arm region present in the hand binary image by an XOR operation between images, to obtain the gesture binary image.
Further, the feature extraction step proceeds as follows:
drawing a circle on the gesture binary image centered at the palm center with radius R2 = 1.95 × R0, where R0 is the pixel value of the palm center, setting the pixels inside the circle to 0, and counting the number of remaining regions in the image with the eight-connected region labeling algorithm to obtain the fingertip number;
expressing the coordinate (x, y) of each point on the gesture contour edge as the complex number u = x + yi, applying a Fourier transform to the complex sequence formed by the entire gesture contour edge to obtain a set of Fourier coefficients Z(k), and taking the magnitudes of M Fourier coefficients starting from k = 1 and normalizing them to obtain M Fourier descriptor features;
combining the M Fourier descriptors and the fingertip number into a gesture feature vector of M + 1 dimensions.
Further, before the gesture recognition step, the method further comprises:
a support vector machine model training step: multi-class classification is realized with the one-versus-one (1v1) classification algorithm of the libsvm software, and the RBF kernel function is selected so that the problem is solved in a high-dimensional feature space; the optimal parameter combination at the highest recognition rate is found by grid search, yielding the kernel parameter g and the error penalty factor C; the support vector machine is then trained on the gesture feature vectors of the training samples using the optimal parameter combination found, to obtain the optimal model parameters.
Compared with the prior art, the present invention has the following advantages and effects:
(1) The present invention uses a Kinect to acquire an RGB image and its corresponding depth image, and combines depth information with human skin color information to accurately separate the hand from a complex background.
(2) The present invention uses a distance transform operation combined with the geometric characteristics of the hand to accurately and rapidly remove arm interference, which improves the recognition accuracy of the gesture recognition system.
(3) The present invention extracts the Fourier descriptors and the fingertip number of the gesture image, and the resulting gesture feature vector is input to a support vector machine for training, achieving real-time static gesture recognition with high accuracy and strong robustness.
Brief description of the drawings
Fig. 1 is the overall flowchart of the static gesture recognition method combining depth information and skin color information disclosed by the present invention;
Fig. 2 is the flowchart of hand segmentation in the static gesture recognition method combining depth information and skin color information disclosed by the present invention;
Fig. 3 is the flowchart of arm removal in the static gesture recognition method combining depth information and skin color information disclosed by the present invention;
Fig. 4 is the flowchart of feature extraction in the static gesture recognition method combining depth information and skin color information disclosed by the present invention;
Fig. 5 is the flowchart of gesture classification in the static gesture recognition method combining depth information and skin color information disclosed by the present invention.
Specific embodiment
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part, rather than all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
Embodiment
As shown in Fig. 1, a static gesture recognition method combining depth information and skin color information comprises, in order: an image acquisition step, a hand segmentation step, an arm removal step, a feature extraction step and a gesture recognition step.
Image acquisition step:
An RGB image and its corresponding depth image are acquired simultaneously with a Kinect, so that the depth information corresponding to every pixel in the RGB image is obtained.
The Kinect should be located 200 mm to 3500 mm directly in front of the body, and the hand should be the part of the body closest to the Kinect.
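The frame grabbing can be sketched as follows (a minimal illustration, not part of the claimed method). It assumes the libfreenect Python bindings; any Kinect SDK that returns an RGB frame together with a registered depth frame can be substituted, and the raw depth units must be converted to millimetres according to the SDK in use.

    import freenect  # assumption: libfreenect Python bindings are installed

    def grab_frames():
        # sync_get_video / sync_get_depth return (frame, timestamp) tuples
        rgb, _ = freenect.sync_get_video()    # H x W x 3 uint8 RGB frame
        depth, _ = freenect.sync_get_depth()  # H x W raw depth frame (convert to mm per the SDK)
        return rgb, depth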
Hand segmentation step:
As shown in Fig. 2, the hand segmentation step comprises the following steps:
(1) The depth image and the RGB image are combined, and segmentation is performed with a depth threshold to obtain an RGB image containing the hand. The specific process is as follows: the pixel closest to the Kinect is located in the depth image, the pixels whose depth lies within 250 mm of it are determined, the values of the corresponding pixels in the RGB image are retained, and the remaining pixels of the RGB image are set to black.
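A minimal sketch of this depth-threshold segmentation, assuming an RGB image and a depth image in millimetres that are already pixel-aligned:

    import numpy as np

    def depth_segment(rgb, depth, band_mm=250):
        # Keep the RGB pixels whose depth lies within band_mm of the closest
        # valid depth value (assumed to belong to the hand); set the rest to black.
        valid = depth > 0                          # zero commonly marks missing depth
        d_min = depth[valid].min()                 # closest point to the Kinect
        mask = valid & (depth <= d_min + band_mm)  # pixels within the depth threshold
        out = np.zeros_like(rgb)
        out[mask] = rgb[mask]                      # retain the corresponding RGB values
        return out, mask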
(2) Skin color segmentation is applied to the RGB image containing the hand to obtain the hand binary image. The specific process is as follows:
i. The image is transformed from the RGB color space to the YCr'Cb' color space; the luminance component is computed as follows:
y(x, y) = 0.299 × r(x, y) + 0.587 × g(x, y) + 0.114 × b(x, y)
where r(x, y), g(x, y) and b(x, y) are respectively the red, green and blue components at coordinate (x, y) of the RGB image containing the hand, y(x, y) is the luminance component of the image at coordinate (x, y) in the YCr'Cb' color space, cr'(x, y) is the red chrominance component of the image at coordinate (x, y) in the YCr'Cb' color space, and cb'(x, y) is the blue chrominance component of the image at coordinate (x, y) in the YCr'Cb' color space.
ii. Skin color segmentation is performed with a threshold method, according to the following rule:
f(x, y) = 1 if cr'min ≤ cr'(x, y) ≤ cr'max and cb'min ≤ cb'(x, y) ≤ cb'max, and f(x, y) = 0 otherwise,
where f(x, y) is the pixel value at coordinate (x, y) of the hand binary image, and cr'min, cr'max, cb'min, cb'max are the minimum and maximum values of the cr' component and the minimum and maximum values of the cb' component used by the threshold method in the YCr'Cb' color space.
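The thresholding can be sketched as follows. This is an illustration only: it uses OpenCV's standard YCrCb conversion as a stand-in for the YCr'Cb' space described above, and the numeric bounds are commonly used skin-color ranges, not the values of this embodiment.

    import cv2
    import numpy as np

    # Placeholder bounds; the cr'/cb' ranges of the embodiment are not reproduced here.
    CR_MIN, CR_MAX = 133, 173
    CB_MIN, CB_MAX = 77, 127

    def skin_mask(rgb_hand):
        # Standard YCrCb conversion used as a stand-in for YCr'Cb'
        ycrcb = cv2.cvtColor(rgb_hand, cv2.COLOR_RGB2YCrCb)
        cr, cb = ycrcb[:, :, 1], ycrcb[:, :, 2]
        f = ((cr >= CR_MIN) & (cr <= CR_MAX) &
             (cb >= CB_MIN) & (cb <= CB_MAX)).astype(np.uint8)
        return f  # hand binary image: 1 for skin pixels, 0 otherwise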
Arm removal step:
As shown in Fig. 3, the arm removal step comprises the following steps:
i. A distance transform is applied to the hand binary image to obtain a distance transform map, in which the value of each pixel of the hand region is the minimum distance from the corresponding pixel to the hand boundary, and the values of the remaining pixels outside the hand region are 0.
ii. According to the geometric characteristics of the hand, the pixel with the maximum value in the distance transform map is taken as the palm center and its pixel value is denoted by R0, thereby accurately locating the palm center.
iii. A circle centered at the palm center with radius R1 = 1.35 × R0 is drawn on the distance transform map and the pixels inside the circle are set to 0; this palm cut-off circle removes the influence of the palm region on subsequent operations.
iv. The maximum pixel value Rmax of the image obtained by the above processing is computed, Rmax/R0 is computed and compared with the threshold T. If the value is less than T, it is judged that no arm region exists, and the hand binary image is the gesture binary image, which can be used directly in the feature extraction step; if the value is greater than T, it is judged that an arm region exists, and the subsequent operations must be performed to remove it. In the present invention T = 0.35.
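Steps i to iv can be sketched as follows, assuming hand_bin is an 8-bit single-channel binary image whose non-zero pixels mark the hand:

    import cv2

    def locate_palm_and_test_arm(hand_bin, k1=1.35, T=0.35):
        # Distance transform: each hand pixel gets its distance to the hand boundary
        dist = cv2.distanceTransform(hand_bin, cv2.DIST_L2, 5)
        _, R0, _, palm_center = cv2.minMaxLoc(dist)   # maximum value and its location = palm center
        # Palm cut-off circle: zero out a disc of radius 1.35 * R0 around the palm center
        cut = dist.copy()
        cv2.circle(cut, palm_center, int(round(k1 * R0)), 0, thickness=-1)
        Rmax = cut.max()
        has_arm = (Rmax / R0) > T                     # an arm is present if the ratio exceeds T
        return palm_center, R0, cut, has_arm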
v. The arm region is removed from the image obtained by the above processing using an eight-connected region labeling algorithm, and a binary image containing only the fingers is obtained.
vi. The arm region present in the hand binary image is removed by an XOR operation between images, and the gesture binary image is obtained.
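One possible reading of steps v and vi is sketched below; this interpretation is an assumption for illustration, not necessarily the exact rule of the embodiment: the arm is taken to be the 8-connected component of the palm-cut map that contains the maximum remaining distance value, and that component is then removed from the hand binary image by a pixel-wise XOR.

    import cv2
    import numpy as np

    def remove_arm(hand_bin, cut_map):
        # Label the 8-connected regions that remain after the palm cut-off circle
        region = (cut_map > 0).astype(np.uint8)
        n, labels = cv2.connectedComponents(region, connectivity=8)
        if n <= 1:
            return hand_bin                            # nothing besides background remains
        # Assumed arm component: the one holding the largest remaining distance value
        arm_label = labels[np.unravel_index(np.argmax(cut_map), cut_map.shape)]
        arm_mask = (labels == arm_label).astype(hand_bin.dtype)
        # XOR removes the arm pixels from the hand binary image -> gesture binary image
        return cv2.bitwise_xor(hand_bin, hand_bin * arm_mask)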
Feature extraction step:
As shown in Fig. 4, the feature extraction step comprises the following steps:
i. A circle centered at the palm center with radius R2 = 1.95 × R0 is drawn on the gesture binary image and the pixels inside the circle are set to 0; the number of remaining regions in the image is then counted with the eight-connected region labeling algorithm to obtain the fingertip number.
ii. The coordinate (x, y) of each point on the gesture contour edge is expressed as the complex number u = x + yi, and a Fourier transform is applied to the complex sequence formed by the entire gesture contour edge to obtain a set of Fourier coefficients Z(k); the magnitudes of M coefficients starting from k = 1 are taken and normalized to obtain M Fourier descriptor features. The present invention selects M = 12 as the best choice, which describes the gesture contour well without introducing excessive noise.
iii. The 12 Fourier descriptors and the fingertip number are combined into a 13-dimensional gesture feature vector, which overcomes problems such as gesture translation, gesture rotation and gesture scale variation that occur during gesture recognition and improves the accuracy and robustness of the gesture recognition method.
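A minimal sketch of this feature vector, assuming the OpenCV 4 return convention of findContours and normalization of the descriptor magnitudes by |Z(1)| (a common convention; the embodiment only states that the magnitudes are normalized):

    import cv2
    import numpy as np

    def gesture_features(gesture_bin, palm_center, R0, M=12):
        # Fingertip count: blank a disc of radius 1.95 * R0 around the palm center
        # and count the remaining 8-connected regions
        no_palm = gesture_bin.copy()
        cv2.circle(no_palm, palm_center, int(round(1.95 * R0)), 0, thickness=-1)
        n_labels, _ = cv2.connectedComponents(no_palm, connectivity=8)
        fingertips = n_labels - 1                      # subtract the background label

        # Fourier descriptors: contour points as complex numbers, FFT, normalized magnitudes
        contours, _ = cv2.findContours(gesture_bin, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        boundary = max(contours, key=cv2.contourArea).squeeze(1)   # N x 2 array of (x, y)
        z = boundary[:, 0] + 1j * boundary[:, 1]
        Z = np.fft.fft(z)
        fd = np.abs(Z[1:M + 1]) / np.abs(Z[1])

        return np.concatenate([fd, [fingertips]])      # (M + 1)-dimensional feature vector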
Gesture recognition step:
This embodiment realizes multi-class classification with the one-versus-one (1v1) classification algorithm of the libsvm software, and selects the RBF kernel function so that the problem is solved in a high-dimensional feature space. The kernel parameter g and the error penalty factor C are the two key factors that influence the performance of the support vector machine.
As shown in Fig. 5, the gesture recognition step is preceded by the following step:
A model training step: the gesture feature vectors of the training samples are input. In this embodiment, the optimal parameter combination at the highest recognition rate, found by grid search, is C = 21619, g = 12; the support vector machine is then trained with this optimal parameter combination to obtain the optimal model parameters.
The trained support vector machine model then classifies the input gesture feature vectors to obtain the final gesture classification result.
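The training and classification can be sketched with scikit-learn, whose SVC classifier wraps libsvm and uses the same one-versus-one scheme for multi-class problems; the parameter grid below is the usual exponential libsvm-style grid and is an illustration, not the grid of this embodiment:

    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    def train_gesture_svm(X_train, y_train):
        # X_train: 13-dimensional gesture feature vectors, y_train: gesture labels
        grid = {"C": [2 ** k for k in range(-5, 16, 2)],
                "gamma": [2 ** k for k in range(-15, 4, 2)]}
        search = GridSearchCV(SVC(kernel="rbf"), grid, cv=5)
        search.fit(X_train, y_train)
        return search.best_estimator_

    # model = train_gesture_svm(X, y)
    # label = model.predict([feature_vector])   # classify one input gesture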
In conclusion present embodiment discloses the static gesture identification method of a kind of combination depth information and Skin Color Information,
This method is using kinect acquisition RGB image and its corresponding depth image, in conjunction with depth information and human body complexion information realization
Hand and complex background are precisely separating.This method operate and combines the geometrical characteristic of hand using range conversion, realization accurately,
Rapidly removal arm interference, improves the recognition accuracy of gesture recognition system.In addition, this method extracts Fourier descriptor
With finger tip number, the gesture feature vector input support vector machines of composition is trained, accuracy rate height, strong robustness are realized
Real-time static gesture identification.
The above embodiment is a preferred embodiment of the present invention, but the embodiments of the present invention are not limited by the above embodiment. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be an equivalent replacement and shall be included within the protection scope of the present invention.