A static gesture recognition method combining depth information and skin color information
Technical field
The present invention relates to the technical field of image recognition, and in particular to a static gesture recognition method combining depth information and skin color information.
Background art
With the development of related technologies such as computer vision and pattern recognition, research on human-computer interaction has been greatly advanced, and people can now communicate with computers more naturally instead of being limited to the keyboard and mouse. Vision-based static gesture recognition, as a natural mode of human-computer interaction, has increasingly become a research hotspot of human-computer interaction technology and is widely applied in fields such as smart home, robot control and sign language recognition.
In static gesture recognition based on a monocular camera, separating the hand from the background of the image is a difficulty. Early gesture recognition systems used means such as wearing gloves or markers and simplifying complex backgrounds to create environments in which the gesture was easily separated from the background, and thereby obtained high recognition rates. In practice, however, the environment is often hard to predict, so such methods cannot meet the needs of human-computer interaction under real conditions.
A static gesture is represented only by the palm and fingers, so any arm region remaining after hand segmentation is redundant and will interfere with and affect the gesture recognition result. Many static gesture recognition methods consider only the case where the arm is not exposed, and the gesture images used for training and classification do not include the arm at all. Because arm interference is not taken into account, these methods are difficult to apply widely.
To cope with problems such as gesture translation, gesture rotation and gesture scale variation that arise during gesture recognition, the features chosen to describe a gesture must be invariant to translation, rotation and scale. The gesture features extracted by existing gesture recognition methods are relatively limited; combining multiple kinds of features can effectively improve the accuracy and robustness of a gesture recognition method.
To achieve higher classification accuracy, the classifiers chosen by many gesture recognition methods must be trained on very large sample sets, which takes a long time. In addition, classification itself is slow, making it difficult to meet the real-time requirement of a gesture recognition system.
Summary of the invention
The purpose of the present invention is to overcome the above drawbacks in the prior art by providing a static gesture recognition method combining depth information and skin color information.
The purpose of the present invention can be achieved by adopting the following technical solution:
A static gesture recognition method combining depth information and skin color information, the static gesture recognition method comprising:
an image acquisition step: acquiring an RGB image and a depth image simultaneously with a Kinect, and obtaining the depth information corresponding to every pixel in the RGB image;
a hand segmentation step: extracting the hand region in the RGB image by setting a depth threshold and using human skin color information, to obtain a hand binary image;
an arm removal step: judging whether an arm region exists in the hand image by means of a distance transform operation combined with a palm cut-off circle and a threshold method, and removing any existing arm region by an XOR operation between images, to obtain a gesture binary image;
a feature extraction step: computing the Fourier descriptors and the fingertip number of the gesture image to form the gesture feature vector;
a gesture recognition step: classifying the input gesture feature vector with a support vector machine.
Further, the Kinect is located 200 mm to 3500 mm directly in front of the subject's body, and the subject's hand is the part of the body closest to the Kinect.
Further, the hand segmentation step proceeds as follows:
combining the depth image and the RGB image, and performing segmentation with a depth threshold to obtain an RGB image containing the hand;
for the RGB image containing the hand, extracting the skin color region of the RGB image using human skin color information to realize skin color segmentation, and obtaining the hand binary image.
Further, the process of combining the depth image and the RGB image and performing segmentation with a depth threshold to obtain the RGB image containing the hand is as follows: locating the pixel closest to the Kinect in the depth image, determining the pixels whose depth lies within the depth threshold of that pixel, retaining the values of the corresponding pixels in the RGB image, and setting the remaining pixels of the RGB image to black.
Further, the process of extracting the skin color region of the RGB image containing the hand using human skin color information to realize skin color segmentation and obtain the hand binary image is as follows:
the image is transformed from the RGB color space to the YCr'Cb' color space; the luminance component is computed as follows:
y(x, y) = 0.299 × r(x, y) + 0.587 × g(x, y) + 0.114 × b(x, y)
where r(x, y), g(x, y) and b(x, y) are respectively the red, green and blue components at coordinate (x, y) of the RGB image containing the hand, y(x, y) is the luminance component of the image at coordinate (x, y) in the YCr'Cb' color space, cr'(x, y) is the red chrominance component of the image at coordinate (x, y) in the YCr'Cb' color space, and cb'(x, y) is the blue chrominance component of the image at coordinate (x, y) in the YCr'Cb' color space;
skin color segmentation is then performed with a threshold method, according to the following rule:
f(x, y) = 1 if cr'min ≤ cr'(x, y) ≤ cr'max and cb'min ≤ cb'(x, y) ≤ cb'max, and f(x, y) = 0 otherwise,
where f(x, y) is the pixel value at coordinate (x, y) of the hand binary image, and cr'min, cr'max, cb'min, cb'max are the minimum and maximum values of the cr' component and the minimum and maximum values of the cb' component used by the threshold method in the YCr'Cb' color space.
Further, the arm removal step proceeds as follows:
applying a distance transform to the hand binary image to obtain a distance transform map, in which the value of each pixel of the hand region is the minimum distance from the corresponding pixel to the hand boundary, and the values of the remaining pixels outside the hand region are 0;
according to the geometric characteristics of the hand, taking the pixel with the maximum value in the distance transform map as the palm center and denoting its pixel value by R0, thereby accurately locating the palm center;
drawing a circle on the distance transform map centered at the palm center with radius R1 = 1.35 × R0 and setting the pixels inside the circle to 0; this palm cut-off circle removes the influence of the palm region on subsequent operations;
computing the maximum pixel value Rmax of the image obtained by the above processing, computing Rmax/R0 and comparing it with a threshold T: if the value is less than T, it is judged that no arm region exists, the hand binary image is the gesture binary image, and the method proceeds to the feature extraction step; if the value is greater than T, it is judged that an arm region exists, and the subsequent operations must be performed to remove it;
removing the arm region from the image obtained by the above processing using an eight-connected region labeling algorithm, to obtain a binary image containing only the fingers;
removing the arm region present in the hand binary image by an XOR operation between images, to obtain the gesture binary image.
Further, the feature extraction step proceeds as follows:
drawing a circle on the gesture binary image centered at the palm center with radius R2 = 1.95 × R0, where R0 is the pixel value of the palm center, setting the pixels inside the circle to 0, and counting the number of remaining regions in the image with the eight-connected region labeling algorithm to obtain the fingertip number;
expressing the coordinate (x, y) of each point on the gesture contour edge as the complex number u = x + yi, applying a Fourier transform to the complex sequence formed by the entire gesture contour edge to obtain a set of Fourier coefficients Z(k), and taking the magnitudes of M Fourier coefficients starting from k = 1 and normalizing them to obtain M Fourier descriptor features;
combining the M Fourier descriptors and the fingertip number into a gesture feature vector of M + 1 dimensions.
Further, before the gesture recognition step, the method further comprises:
a support vector machine model training step: multi-class classification is realized with the one-versus-one (1v1) classification algorithm of the libsvm software, and the RBF kernel function is selected so that the problem is solved in a high-dimensional feature space; the optimal parameter combination at the highest recognition rate is found by grid search, yielding the kernel parameter g and the error penalty factor C; the support vector machine is then trained on the gesture feature vectors of the training samples using the optimal parameter combination found, to obtain the optimal model parameters.
Compared with the prior art, the present invention has the following advantages and effects:
(1) The present invention uses a Kinect to acquire an RGB image and its corresponding depth image, and combines depth information with human skin color information to accurately separate the hand from a complex background.
(2) The present invention uses a distance transform operation combined with the geometric characteristics of the hand to accurately and rapidly remove arm interference, which improves the recognition accuracy of the gesture recognition system.
(3) The present invention extracts the Fourier descriptors and the fingertip number of the gesture image, and the resulting gesture feature vector is input to a support vector machine for training, achieving real-time static gesture recognition with high accuracy and strong robustness.
Brief description of the drawings
Fig. 1 is the overall flowchart of the static gesture recognition method combining depth information and skin color information disclosed by the present invention;
Fig. 2 is the flowchart of hand segmentation in the static gesture recognition method combining depth information and skin color information disclosed by the present invention;
Fig. 3 is the flowchart of arm removal in the static gesture recognition method combining depth information and skin color information disclosed by the present invention;
Fig. 4 is the flowchart of feature extraction in the static gesture recognition method combining depth information and skin color information disclosed by the present invention;
Fig. 5 is the flowchart of gesture classification in the static gesture recognition method combining depth information and skin color information disclosed by the present invention.
Specific embodiment
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part, rather than all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
Embodiment
As shown in Fig. 1, a static gesture recognition method combining depth information and skin color information comprises, in order: an image acquisition step, a hand segmentation step, an arm removal step, a feature extraction step and a gesture recognition step.
Image acquisition step:
An RGB image and its corresponding depth image are acquired simultaneously with a Kinect, so that the depth information corresponding to every pixel in the RGB image is obtained.
The Kinect should be located 200 mm to 3500 mm directly in front of the body, and the hand should be the part of the body closest to the Kinect.
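The frame grabbing can be sketched as follows (a minimal illustration, not part of the claimed method). It assumes the libfreenect Python bindings; any Kinect SDK that returns an RGB frame together with a registered depth frame can be substituted, and the raw depth units must be converted to millimetres according to the SDK in use.

    import freenect  # assumption: libfreenect Python bindings are installed

    def grab_frames():
        # sync_get_video / sync_get_depth return (frame, timestamp) tuples
        rgb, _ = freenect.sync_get_video()    # H x W x 3 uint8 RGB frame
        depth, _ = freenect.sync_get_depth()  # H x W raw depth frame (convert to mm per the SDK)
        return rgb, depth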
Hand segmentation step:
As shown in Fig. 2, the hand segmentation step comprises the following steps:
(1) The depth image and the RGB image are combined, and segmentation is performed with a depth threshold to obtain an RGB image containing the hand. The specific process is as follows: the pixel closest to the Kinect is located in the depth image, the pixels whose depth lies within 250 mm of it are determined, the values of the corresponding pixels in the RGB image are retained, and the remaining pixels of the RGB image are set to black.
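A minimal sketch of this depth-threshold segmentation, assuming an RGB image and a depth image in millimetres that are already pixel-aligned:

    import numpy as np

    def depth_segment(rgb, depth, band_mm=250):
        # Keep the RGB pixels whose depth lies within band_mm of the closest
        # valid depth value (assumed to belong to the hand); set the rest to black.
        valid = depth > 0                          # zero commonly marks missing depth
        d_min = depth[valid].min()                 # closest point to the Kinect
        mask = valid & (depth <= d_min + band_mm)  # pixels within the depth threshold
        out = np.zeros_like(rgb)
        out[mask] = rgb[mask]                      # retain the corresponding RGB values
        return out, mask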
(2) Skin color segmentation is applied to the RGB image containing the hand to obtain the hand binary image. The specific process is as follows:
i. The image is transformed from the RGB color space to the YCr'Cb' color space; the luminance component is computed as follows:
y(x, y) = 0.299 × r(x, y) + 0.587 × g(x, y) + 0.114 × b(x, y)
where r(x, y), g(x, y) and b(x, y) are respectively the red, green and blue components at coordinate (x, y) of the RGB image containing the hand, y(x, y) is the luminance component of the image at coordinate (x, y) in the YCr'Cb' color space, cr'(x, y) is the red chrominance component of the image at coordinate (x, y) in the YCr'Cb' color space, and cb'(x, y) is the blue chrominance component of the image at coordinate (x, y) in the YCr'Cb' color space.
ii. Skin color segmentation is performed with a threshold method, according to the following rule:
f(x, y) = 1 if cr'min ≤ cr'(x, y) ≤ cr'max and cb'min ≤ cb'(x, y) ≤ cb'max, and f(x, y) = 0 otherwise,
where f(x, y) is the pixel value at coordinate (x, y) of the hand binary image, and cr'min, cr'max, cb'min, cb'max are the minimum and maximum values of the cr' component and the minimum and maximum values of the cb' component used by the threshold method in the YCr'Cb' color space.
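The thresholding can be sketched as follows. This is an illustration only: it uses OpenCV's standard YCrCb conversion as a stand-in for the YCr'Cb' space described above, and the numeric bounds are commonly used skin-color ranges, not the values of this embodiment.

    import cv2
    import numpy as np

    # Placeholder bounds; the cr'/cb' ranges of the embodiment are not reproduced here.
    CR_MIN, CR_MAX = 133, 173
    CB_MIN, CB_MAX = 77, 127

    def skin_mask(rgb_hand):
        # Standard YCrCb conversion used as a stand-in for YCr'Cb'
        ycrcb = cv2.cvtColor(rgb_hand, cv2.COLOR_RGB2YCrCb)
        cr, cb = ycrcb[:, :, 1], ycrcb[:, :, 2]
        f = ((cr >= CR_MIN) & (cr <= CR_MAX) &
             (cb >= CB_MIN) & (cb <= CB_MAX)).astype(np.uint8)
        return f  # hand binary image: 1 for skin pixels, 0 otherwise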
Arm removal step:
As shown in Fig. 3, the arm removal step comprises the following steps:
i. A distance transform is applied to the hand binary image to obtain a distance transform map, in which the value of each pixel of the hand region is the minimum distance from the corresponding pixel to the hand boundary, and the values of the remaining pixels outside the hand region are 0.
ii. According to the geometric characteristics of the hand, the pixel with the maximum value in the distance transform map is taken as the palm center and its pixel value is denoted by R0, thereby accurately locating the palm center.
iii. A circle centered at the palm center with radius R1 = 1.35 × R0 is drawn on the distance transform map and the pixels inside the circle are set to 0; this palm cut-off circle removes the influence of the palm region on subsequent operations.
iv. The maximum pixel value Rmax of the image obtained by the above processing is computed, Rmax/R0 is computed and compared with the threshold T. If the value is less than T, it is judged that no arm region exists, and the hand binary image is the gesture binary image, which can be used directly in the feature extraction step; if the value is greater than T, it is judged that an arm region exists, and the subsequent operations must be performed to remove it. In the present invention T = 0.35.
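Steps i to iv can be sketched as follows, assuming hand_bin is an 8-bit single-channel binary image whose non-zero pixels mark the hand:

    import cv2

    def locate_palm_and_test_arm(hand_bin, k1=1.35, T=0.35):
        # Distance transform: each hand pixel gets its distance to the hand boundary
        dist = cv2.distanceTransform(hand_bin, cv2.DIST_L2, 5)
        _, R0, _, palm_center = cv2.minMaxLoc(dist)   # maximum value and its location = palm center
        # Palm cut-off circle: zero out a disc of radius 1.35 * R0 around the palm center
        cut = dist.copy()
        cv2.circle(cut, palm_center, int(round(k1 * R0)), 0, thickness=-1)
        Rmax = cut.max()
        has_arm = (Rmax / R0) > T                     # an arm is present if the ratio exceeds T
        return palm_center, R0, cut, has_arm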
v. The arm region is removed from the image obtained by the above processing using an eight-connected region labeling algorithm, and a binary image containing only the fingers is obtained.
vi. The arm region present in the hand binary image is removed by an XOR operation between images, and the gesture binary image is obtained.
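One possible reading of steps v and vi is sketched below; this interpretation is an assumption for illustration, not necessarily the exact rule of the embodiment: the arm is taken to be the 8-connected component of the palm-cut map that contains the maximum remaining distance value, and that component is then removed from the hand binary image by a pixel-wise XOR.

    import cv2
    import numpy as np

    def remove_arm(hand_bin, cut_map):
        # Label the 8-connected regions that remain after the palm cut-off circle
        region = (cut_map > 0).astype(np.uint8)
        n, labels = cv2.connectedComponents(region, connectivity=8)
        if n <= 1:
            return hand_bin                            # nothing besides background remains
        # Assumed arm component: the one holding the largest remaining distance value
        arm_label = labels[np.unravel_index(np.argmax(cut_map), cut_map.shape)]
        arm_mask = (labels == arm_label).astype(hand_bin.dtype)
        # XOR removes the arm pixels from the hand binary image -> gesture binary image
        return cv2.bitwise_xor(hand_bin, hand_bin * arm_mask)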
Feature extraction step:
As shown in Fig. 4, the feature extraction step comprises the following steps:
i. A circle centered at the palm center with radius R2 = 1.95 × R0 is drawn on the gesture binary image and the pixels inside the circle are set to 0; the number of remaining regions in the image is then counted with the eight-connected region labeling algorithm to obtain the fingertip number.
ii. The coordinate (x, y) of each point on the gesture contour edge is expressed as the complex number u = x + yi, and a Fourier transform is applied to the complex sequence formed by the entire gesture contour edge to obtain a set of Fourier coefficients Z(k); the magnitudes of M coefficients starting from k = 1 are taken and normalized to obtain M Fourier descriptor features. The present invention selects M = 12 as the best choice, which describes the gesture contour well without introducing excessive noise.
iii. The 12 Fourier descriptors and the fingertip number are combined into a 13-dimensional gesture feature vector, which overcomes problems such as gesture translation, gesture rotation and gesture scale variation that occur during gesture recognition and improves the accuracy and robustness of the gesture recognition method.
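A minimal sketch of this feature vector, assuming the OpenCV 4 return convention of findContours and normalization of the descriptor magnitudes by |Z(1)| (a common convention; the embodiment only states that the magnitudes are normalized):

    import cv2
    import numpy as np

    def gesture_features(gesture_bin, palm_center, R0, M=12):
        # Fingertip count: blank a disc of radius 1.95 * R0 around the palm center
        # and count the remaining 8-connected regions
        no_palm = gesture_bin.copy()
        cv2.circle(no_palm, palm_center, int(round(1.95 * R0)), 0, thickness=-1)
        n_labels, _ = cv2.connectedComponents(no_palm, connectivity=8)
        fingertips = n_labels - 1                      # subtract the background label

        # Fourier descriptors: contour points as complex numbers, FFT, normalized magnitudes
        contours, _ = cv2.findContours(gesture_bin, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        boundary = max(contours, key=cv2.contourArea).squeeze(1)   # N x 2 array of (x, y)
        z = boundary[:, 0] + 1j * boundary[:, 1]
        Z = np.fft.fft(z)
        fd = np.abs(Z[1:M + 1]) / np.abs(Z[1])

        return np.concatenate([fd, [fingertips]])      # (M + 1)-dimensional feature vector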
Gesture recognition step:
This embodiment realizes multi-class classification with the one-versus-one (1v1) classification algorithm of the libsvm software, and selects the RBF kernel function so that the problem is solved in a high-dimensional feature space. The kernel parameter g and the error penalty factor C are the two key factors that influence the performance of the support vector machine.
As shown in Fig. 5, the gesture recognition step is preceded by the following step:
A model training step: the gesture feature vectors of the training samples are input. In this embodiment, the optimal parameter combination at the highest recognition rate, found by grid search, is C = 21619, g = 12; the support vector machine is then trained with this optimal parameter combination to obtain the optimal model parameters.
The trained support vector machine model then classifies the input gesture feature vectors to obtain the final gesture classification result.
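The training and classification can be sketched with scikit-learn, whose SVC classifier wraps libsvm and uses the same one-versus-one scheme for multi-class problems; the parameter grid below is the usual exponential libsvm-style grid and is an illustration, not the grid of this embodiment:

    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    def train_gesture_svm(X_train, y_train):
        # X_train: 13-dimensional gesture feature vectors, y_train: gesture labels
        grid = {"C": [2 ** k for k in range(-5, 16, 2)],
                "gamma": [2 ** k for k in range(-15, 4, 2)]}
        search = GridSearchCV(SVC(kernel="rbf"), grid, cv=5)
        search.fit(X_train, y_train)
        return search.best_estimator_

    # model = train_gesture_svm(X, y)
    # label = model.predict([feature_vector])   # classify one input gesture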
In conclusion present embodiment discloses the static gesture identification method of a kind of combination depth information and Skin Color Information,
This method is using kinect acquisition RGB image and its corresponding depth image, in conjunction with depth information and human body complexion information realization
Hand and complex background are precisely separating.This method operate and combines the geometrical characteristic of hand using range conversion, realization accurately,
Rapidly removal arm interference, improves the recognition accuracy of gesture recognition system.In addition, this method extracts Fourier descriptor
With finger tip number, the gesture feature vector input support vector machines of composition is trained, accuracy rate height, strong robustness are realized
Real-time static gesture identification.
The above embodiment is a preferred embodiment of the present invention, but the embodiments of the present invention are not limited by the above embodiment. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be an equivalent replacement and shall be included within the protection scope of the present invention.