CN103530892B - Two-hand tracking method and device based on a Kinect sensor - Google Patents

Two-hand tracking method and device based on a Kinect sensor

Info

Publication number
CN103530892B
Authority
CN
China
Prior art keywords
hand
tracking
Kinect sensor
depth
Prior art date
2013-10-21
Legal status
Active
Application number
CN201310497334.XA
Other languages
Chinese (zh)
Other versions
CN103530892A (en)
Inventor
朱艳敏 (Zhu Yanmin)
袁博 (Yuan Bo)
Current Assignee
Shenzhen Graduate School Tsinghua University
Original Assignee
Shenzhen Graduate School Tsinghua University
Priority date
2013-10-21
Filing date
2013-10-21
Publication date
2016-06-22
Application filed by Shenzhen Graduate School Tsinghua University
Priority to CN201310497334.XA
Publication of CN103530892A
Application granted
Publication of CN103530892B
Legal status: Active
Anticipated expiration

Landscapes

  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a two-hand tracking method and device based on a Kinect sensor. The tracking method includes: S1, a video acquisition step; S2, a first detection step; S3, a single-hand tracking step; S4, a second detection step; and S5, a two-hand tracking step, in which the two detected hands are tracked. Because information about the first hand is used when detecting the second hand, the tracking method of the invention can track the motion of both of the user's hands quickly and accurately, with low computational complexity.

Description

Two-hand tracking method and device based on a Kinect sensor
Technical field
The present invention relates to gesture recognition and two-hand tracking in the field of human-computer interaction, and in particular to a two-hand tracking method and device based on a Kinect sensor.
Background art
With the rapid development of computer technology and the evolution of users' expectations, ever higher demands are placed on the perceptual abilities of computers. Traditional interaction relies on the physical keyboard and mouse, and its largely text-based input can no longer satisfy users' needs. Novel "human-centered" interaction modes break the constraints of the traditional approach, shifting input to richer and more natural forms such as images and sound, and substantially improving the user experience. In recent years, significant progress has been made in research fields such as face recognition, speech recognition, human posture recognition, and gesture recognition.
Gestures play an extremely important role in everyday life, and gesture recognition based on computer vision is a key enabling technology for the next generation of human-computer interaction. The crux, and most of the difficulty, of gesture recognition lies in tracking the human hand. The main challenges are: first, interference from complex backgrounds, such as the face and other skin-colored regions; second, the hand deforms during motion; third, changes in illumination strongly affect the hand's appearance; and fourth, the system must run in real time.
Existing gesture recognition techniques mostly locate and track a single hand, using methods such as the Camshift (Continuously Adaptive Mean-Shift) algorithm and feature-space matching. These methods only achieve good results under specific conditions. Compared with the single-hand case, tracking and recognizing two hands raises new challenges, such as how to accurately distinguish and track the two hands after they occlude each other. One prior-art approach requires that the shapes of the two hands remain unchanged before and after the occlusion, and uses the pre-occlusion shape information to re-identify the hands once they separate; but this places a heavy restriction on the user and is neither natural nor convenient. Microsoft's Kinect greatly facilitates gesture tracking and recognition: the depth information it provides enormously simplifies background removal. One prior-art method uses the depth and color images provided by the Kinect to build a 3D model of the two hands and track it, and can accurately locate details such as the hand joints; however, its computational complexity is very high, and it cannot reach real-time tracking even with GPU (Graphics Processing Unit) acceleration. Because the Kinect itself can extract and track the human skeleton, another prior-art method extracts the hand positions from the hand nodes of the skeleton and tracks them; but it requires the user to hold a standard sitting or standing posture, which restricts the user too much, and its recognition and tracking performance is not ideal.
Summary of the invention
The technical problem to be solved by the invention is to overcome the defects of the aforementioned prior art and to provide a two-hand tracking method based on a Kinect sensor that can handle complex backgrounds and different lighting conditions while imposing few restrictions on the user's hand posture, including:
S1, a video acquisition step: acquiring from the Kinect sensor a color video stream and a depth video stream with the same resolution and frame rate;
S2, a first detection step: detecting, from the images of the acquired color and depth video streams, a first hand making an initial gesture;
S3, a single-hand tracking step: locating and tracking the first hand using its position and size information in the previous frame or previous two frames;
S4, a second detection step: detecting a second hand using the position and size information of the first hand;
S5, a two-hand tracking step: tracking the two detected hands.
According to embodiments, the invention may also adopt the following preferred technical solutions:
Step S2 further includes: S2-1, a sample training step; S2-2, a mode selection step; and S2-3, an initial gesture determination step.
In step S2-1, an SVM (Support Vector Machine) classifier is selected to learn and train on the morphological information of the hand, with geometric invariant moments chosen as the training features.
In step S2-2, when the lighting is suitable, a skin-color mode is selected, i.e., the first hand is extracted by skin-color filtering combined with depth filtering; when the light is too dark or too bright, a shape mode is selected, i.e., the first hand is extracted by shape filtering combined with depth filtering.
In step S2-3, the initial gesture is defined as: the hand extends forward, at a distance of more than a threshold d from the body.
The locating in step S3 is: predicting the ROI (Region of Interest) of the first hand in the current frame from the position and bounding rectangle of the first hand in the previous frame or previous two frames, and performing depth filtering within this ROI to locate the first hand in the current frame.
The two-hand tracking in step S5 includes:
1) while the two hands are separated, before any mutual occlusion, the two targets are tracked separately: the ROI of each target in the current frame is predicted from the position and size information of the two hands in the previous frame or previous two frames, and detection is performed within each of the two regions;
2) while the two hands occlude each other, the two detected target trajectories coincide, and tracking is treated as tracking a single target;
3) when the two hands separate after the mutual occlusion, they are distinguished according to the invariance of their positional relationship in the depth direction before and after the occlusion, and are tracked separately.
The invention also provides a Kinect-based two-hand tracking device, including the following modules:
a video acquisition module, for acquiring from the Kinect sensor a color video stream and a depth video stream with the same resolution and frame rate;
a first detection module, for detecting, from the acquired color and depth images, a first hand making an initial gesture;
a single-hand tracking module, including a locating unit and a tracking unit, for locating and tracking the first hand using its position and size information in the previous frame or previous two frames;
a second detection module, for detecting a second hand using the position and size information of the first hand;
a two-hand tracking module, for tracking the two detected hands.
The following preferred technical solutions may also be adopted according to embodiments:
The first detection module includes a sample training unit, a mode selection unit, and an initial gesture determination unit.
The sample training unit selects an SVM classifier to learn and train on the morphological information of the hand, with geometric invariant moments chosen as the training features.
The mode selection unit is configured to: when the lighting is suitable, select a skin-color mode, i.e., extract the first hand by skin-color filtering combined with depth filtering; when the light is too dark or too bright, select a shape mode, i.e., extract the first hand by shape filtering combined with depth filtering.
In the initial gesture determination unit, the initial gesture is defined as: the hand extends forward, at a distance of more than a threshold d from the body.
The locating unit is configured to: predict the ROI of the first hand in the current frame from the position and bounding rectangle of the first hand in the previous frame or previous two frames, and perform depth filtering within this ROI to locate the first hand in the current frame.
The two-hand tracking module is configured to:
1) while the two hands are separated, before any mutual occlusion, track the two targets separately: predict the ROI of each target in the current frame from the position and size information of the two hands in the previous frame or previous two frames, and detect within each of the two regions;
2) while the two hands occlude each other, treat the two coinciding detected target trajectories as a single target and track it;
3) when the two hands separate after the mutual occlusion, distinguish them according to the invariance of their positional relationship in the depth direction before and after the occlusion, and track them separately.
Compared with the prior art, the invention provides the following benefits:
Because information about the first hand is used when detecting the second hand, the tracking method of the invention can track the motion of both of the user's hands quickly and accurately, with low computational complexity.
In one preferred technical solution, because depth filtering is adopted and a region of interest (ROI) is set around the target during tracking, the method is not subject to interference from complex backgrounds: depth filtering removes the influence of distractors behind the target, such as the user's face, while the ROI restricts detection in the next frame to the vicinity of the target, excluding objects outside the ROI, such as the hands and faces of onlookers. This also brings a further benefit: very few restrictions are placed on the user's posture.
In another preferred technical solution, the user can select a suitable detection mode according to the lighting conditions, so the method can adapt to different lighting.
In another preferred technical solution, the hand's shape information is not used when detecting the hand during tracking, so deformation of the hand during motion does not affect the tracking.
Brief description of the drawings
Fig. 1 is a flow chart of the two-hand tracking method of one embodiment of the invention.
Fig. 2 is a flow chart of the detection of the first hand's initial gesture in an embodiment.
Detailed description of the invention
The invention is explained in detail below with reference to the accompanying drawings and in conjunction with preferred embodiments.
Embodiment 1
A Kinect-based two-hand tracking method, including:
S1, a video acquisition step: acquiring from the Kinect sensor a color video stream and a depth video stream with the same resolution and frame rate;
S2, a first detection step: detecting, from the images of the acquired color and depth video streams, a first hand making an initial gesture;
S3, a single-hand tracking step: locating and tracking the first hand using its position and size information in the previous frame or previous two frames;
S4, a second detection step: detecting a second hand using the position and size information of the first hand;
S5, a two-hand tracking step: tracking the two detected hands.
Step S2 further includes: S2-1, a sample training step; S2-2, a mode selection step; and S2-3, an initial gesture determination step.
In step S2-1, an SVM classifier is selected to learn and train on the morphological information of the hand, with geometric invariant moments chosen as the training features.
In step S2-2, when the lighting is suitable, a skin-color mode is selected, i.e., the first hand is extracted by skin-color filtering combined with depth filtering; when the light is too dark or too bright, a shape mode is selected, i.e., the first hand is extracted by shape filtering combined with depth filtering.
In step S2-3, the initial gesture is defined as: the hand extends forward, at a distance of more than a threshold d from the body.
The locating in step S3 is: predicting the ROI of the first hand in the current frame from the position and bounding rectangle of the first hand in the previous frame or previous two frames, and performing depth filtering within this ROI to locate the first hand in the current frame.
The two-hand tracking in step S5 includes:
1) while the two hands are separated, before any mutual occlusion, the two targets are tracked separately: the ROI of each target in the current frame is predicted from the position and size information of the two hands in the previous frame or previous two frames, and detection is performed within each of the two regions;
2) while the two hands occlude each other, the two detected target trajectories coincide, and tracking is treated as tracking a single target;
3) when the two hands separate after the mutual occlusion, they are distinguished according to the invariance of their positional relationship in the depth direction before and after the occlusion, and are tracked separately.
Embodiment 2
As shown in Fig. 1, the flow of two-hand tracking in this embodiment includes:
Step 1) Acquire the video streams. For example, a Kinect sensor is used to acquire a color video stream and a depth video stream, each with a resolution of 640*480 and a frame rate of 30 fps.
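For illustration only, the sketch below shows one way to obtain synchronized color and depth frames in Python. It assumes an OpenCV build with the OpenNI backend (the cv2.CAP_OPENNI constants); the patent does not specify a driver, and any stack exposing aligned color and depth streams would serve equally well.

```python
import cv2

# Minimal acquisition sketch (an assumption, not the patent's implementation):
# OpenCV's OpenNI backend exposes the Kinect's 640x480, 30 fps streams.
cap = cv2.VideoCapture(cv2.CAP_OPENNI)

while cap.grab():  # grab one synchronized color+depth pair per iteration
    ok_d, depth_mm = cap.retrieve(flag=cv2.CAP_OPENNI_DEPTH_MAP)  # 16-bit depth map (mm)
    ok_c, color = cap.retrieve(flag=cv2.CAP_OPENNI_BGR_IMAGE)     # 8-bit BGR image
    if not (ok_d and ok_c):
        break
    # ... the detection and tracking steps below consume (color, depth_mm) pairs ...
    if cv2.waitKey(30) == 27:  # Esc quits
        break
cap.release()
```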
Step 2) Detect the first hand, i.e., detect the first hand from the acquired color and depth images. Specifically, the nearest sufficiently large target object is found as the first hand according to the depth information together with skin-color information or shape information.
Step 3) Judge whether the initial gesture is detected, i.e., judge whether the first hand found is making the initial gesture. When the hand extends forward and its distance from the body exceeds a threshold d, it is determined that this hand is making a valid initial gesture, and tracking begins. The hand is labeled Hand1, and its 3D position information is stored in this hand's trajectory traj1.
Here d is typically set to 15~25 cm.
Step 4) Track the single hand, i.e., track the detected first hand Hand1. Specifically, according to the position and size information of the hand in the previous two frames, the likely position region of the target in the current frame, i.e., the region of interest (ROI), is predicted. The most probable target is found within the ROI of the depth map, and its 3D position information is stored in Hand1's trajectory traj1. The "previous two frames" are the two frames before the current frame during tracking: if the current frame is f_t, they are f_{t-1} and f_{t-2}. Using the previous two frames is a preferred choice rather than a necessity; for example, when tracking has just started and only one earlier frame (the frame in which the first hand was detected) contains hand information, only the position and size information of the hand in the previous frame is used.
The main function of this step is to predict the position of the target in the current frame from the information of the previous N frames. The larger N is, the more information can be used, but the more complicated the prediction formula becomes. Moreover, the larger N is, the smaller the reference value of the N-th previous frame, so a larger N is not always better; N = 2 or 3 is likely appropriate.
Step 5) Detect the second hand. Because the natural initial state of a two-hand gesture is that both hands are stretched out at similar depths making similar gestures, the position and size information of the first hand Hand1 in the current frame can be used to detect the second hand over the whole image. The depth map is filtered to find a target whose area and depth are both similar to those of the first hand; this is simpler and faster than re-searching for the other hand using color or shape information.
Step 6) Judge whether the target is detected, i.e., judge whether step 5 found a second hand. If found, it is labeled as the second hand Hand2, and its 3D position information is stored in Hand2's trajectory traj2; if not, single-hand tracking continues.
Step 7) Track both hands, i.e., after the second hand is found, track both hands (Hand1 and Hand2). While the two hands do not occlude or overlap each other, this amounts to tracking two independent targets: the likely position regions of the two targets in the current frame are predicted from the targets' position and size information in the previous two frames, the respective regions of interest ROI1 and ROI2 are set, and depth filtering is performed within ROI1 and ROI2 of the depth map to find the most probable target in each. While the two hands occlude each other or overlap, the same target is detected in both ROI1 and ROI2, the position information of the two hands is identical, and their trajectories coincide. After the two hands separate following the occlusion, Hand1 and Hand2 are distinguished according to the invariance of the depth relationship between the two hands before and after the occlusion: if Hand1 was in front of Hand2 before the occlusion, then Hand1 should still be in front of Hand2 after they separate.
Step 8) Output the hand trajectories, i.e., after the two hands are distinguished, their respective 3D position information is stored in the corresponding trajectories.
As shown in Fig. 2, the flow of detecting the first hand's initial gesture in this embodiment mainly includes:
Step 201, acquire the color image and depth image: an RGB color image and a grayscale depth image, each with a resolution of 640*480 and a frame rate of 30 fps, are obtained from the Kinect.
Step 202, judge whether the light is suitable. The detection mode is selected according to the illumination: if the illumination is moderate, the skin-color mode is selected; if it is too weak or too bright, the shape mode is selected.
Step 203, skin-color filtering. The skin-color mode first converts the RGB picture to the YCbCr color space, in which skin color is more concentrated; Y represents luminance (luma), and Cb and Cr represent chrominance, Cb being the blue-difference component and Cr the red-difference component. The conversion formulas are:
Y=0.299*R+0.587*G+0.114*B
Cr=(R-Y)*0.713+128
Cb=(B-Y)*0.564+128
Skin-color filtering is then performed: the YCbCr color picture is filtered with an elliptical skin model to obtain the binary skin image mask1.
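As a minimal sketch of step 203 (not the patent's exact implementation), the function below converts a BGR frame to YCbCr with OpenCV and applies an axis-aligned elliptical skin model in the (Cb, Cr) plane; the ellipse center and axes are illustrative assumptions, since the patent does not list its ellipse parameters.

```python
import cv2
import numpy as np

def skin_mask(bgr):
    """Sketch of step 203: keep pixels inside an elliptical skin region
    of the (Cb, Cr) plane. Ellipse centre/axes are assumed values."""
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)  # OpenCV channel order: Y, Cr, Cb
    cr = ycrcb[:, :, 1].astype(np.float32)
    cb = ycrcb[:, :, 2].astype(np.float32)
    cx, cy, a, b = 109.0, 152.0, 25.0, 15.0         # assumed centre (Cb, Cr) and semi-axes
    inside = ((cb - cx) / a) ** 2 + ((cr - cy) / b) ** 2 <= 1.0
    return (inside * 255).astype(np.uint8)          # binary skin image mask1
```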
Step 204, depth filtering: the skin binary image mask1 is depth-filtered in conjunction with the depth map. All connected components with area greater than the minimum hand area Amin are found in mask1, and the 3 largest of them are selected, giving a new skin binary image mask2. From mask2, the connected component SRn with the smallest depth is chosen and depth-filtered: the region with depth in the range [Dmin, Dmin+l] is taken as the candidate target region Rc. Here Dmin is the minimum depth value of the connected component SRn, and l is the threshold for segmenting the hand, typically set to 5~8 cm (meaning the depth distance from the front of the hand, e.g., the fingertips, to the wrist; the "threshold for segmenting the hand" below has the same meaning).
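The following sketch outlines step 204 under stated assumptions: a_min is a placeholder for Amin (the patent does not give a pixel value) and l_mm encodes the 5~8 cm threshold l; connected components are taken with OpenCV.

```python
import cv2
import numpy as np

def candidate_hand_region(mask1, depth_mm, a_min=800, l_mm=60):
    """Sketch of step 204: keep the 3 largest skin components above Amin
    (mask2), pick the one nearest the camera (SRn), and retain only its
    pixels with depth in [Dmin, Dmin + l]. a_min and l_mm are assumed."""
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask1)
    comps = sorted(((stats[i, cv2.CC_STAT_AREA], i) for i in range(1, n)
                    if stats[i, cv2.CC_STAT_AREA] > a_min), reverse=True)[:3]
    best, best_dmin = None, np.inf
    for _, i in comps:                              # mask2: the 3 largest components
        comp = labels == i
        d = depth_mm[comp & (depth_mm > 0)]         # ignore invalid zero depths
        if d.size and d.min() < best_dmin:
            best, best_dmin = comp, float(d.min())  # SRn: smallest-depth component
    if best is None:
        return None
    rc = best & (depth_mm >= best_dmin) & (depth_mm <= best_dmin + l_mm)
    return rc.astype(np.uint8) * 255                # candidate target region Rc
```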
Step 205, sample training: the shape mode requires that no object interferes between the user and the camera, that the user is in the foreground, and that the palm is open when making the initial gesture. A database of positive and negative samples of the open hand is collected first, and then features and a classifier are selected for learning and training on the hand. Because the geometric invariant Hu moments are invariant to translation, rotation, and scale, they are selected as the features for training the classifier. The Hu moments are defined as follows, where ηpq is the normalized central moment of order (p+q):
I1 = η20 + η02
I2 = (η20 − η02)² + 4η11²
I3 = (η30 − 3η12)² + (3η21 − η03)²
I4 = (η30 + η12)² + (η21 + η03)²
I5 = (η30 − 3η12)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²] + (3η21 − η03)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²]
I6 = (η20 − η02)[(η30 + η12)² − (η21 + η03)²] + 4η11(η30 + η12)(η21 + η03)
I7 = (3η21 − η03)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²] − (η30 − 3η12)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²]
An SVM (Support Vector Machine) has distinct advantages for small-sample, nonlinear, and high-dimensional classification problems, so an SVM is selected to learn and train on the Hu moment features of the hand, producing a hand classifier.
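A sketch of the step 205 training pipeline, assuming scikit-learn's SVC as the SVM implementation; the log scaling of the Hu moments and the RBF kernel are common conventions assumed here, not stated in the patent.

```python
import cv2
import numpy as np
from sklearn.svm import SVC

def hu_features(binary_roi):
    """The seven Hu moment invariants of a binary hand silhouette
    (log-scaled for numeric stability; the scaling is an assumption)."""
    hu = cv2.HuMoments(cv2.moments(binary_roi, binaryImage=True)).flatten()
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)

def train_hand_classifier(pos_rois, neg_rois):
    """Sketch of step 205: fit an SVM on Hu-moment features of positive
    (open hand) and negative silhouettes."""
    X = np.array([hu_features(r) for r in list(pos_rois) + list(neg_rois)])
    y = np.array([1] * len(pos_rois) + [0] * len(neg_rois))
    clf = SVC(kernel="rbf")  # kernel choice is an assumption
    clf.fit(X, y)
    return clf
```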
Step 206, depth filtering: the depth map is depth-filtered to locate the largest object contour with depth in the range [Dmin, Dmin+l]. If its area exceeds the minimum hand area Amin, the region enclosed by this contour is the probable target region Rn. Here Dmin is the minimum depth value of the depth map, and l is the threshold for segmenting the hand, typically set to 5~8 cm.
Step 207, shape discrimination: the trained SVM hand classifier classifies the probable target region Rn and determines whether it is an open hand. If so, this probable target region becomes the candidate target region Rc.
Step 208, judge whether the target hand is found: judge whether the candidate target region Rc is a human hand making the gesture, i.e., whether the hand is making the initial gesture. The depth image is first filtered, taking the region with depth in the range [Dmin, Dmin+d]; the region Rb that contains Rc is chosen, and the area ratio of the two regions is calculated:
ratio=area(Rc)/area(Rb)
Whether the target is a stretched-out hand is judged from the range of this ratio.
If the ratio falls within [0.5, 1], Rb is judged to contain no body, only the arm; the candidate target Rc is then judged to be a stretched-out hand, and the gesture is a valid initial gesture. Otherwise it is not a valid initial gesture.
Here Dmin is the minimum depth of the candidate target region Rc, and d is the threshold for judging whether the hand is stretched out, typically set to 15~25 cm.
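A sketch of the step 208 test under a simplifying assumption noted in the comments: Rb is approximated by the whole depth band [Dmin, Dmin+d] rather than by its connected region containing Rc, as in the patent.

```python
import numpy as np

def is_initial_gesture(depth_mm, rc_mask, d_mm=200):
    """Sketch of step 208. d_mm encodes the 15-25 cm threshold d; Rb is
    approximated by the whole depth band (a simplification)."""
    valid = depth_mm[(rc_mask > 0) & (depth_mm > 0)]
    if valid.size == 0:
        return False
    dmin = float(valid.min())                       # Dmin of the candidate Rc
    rb = (depth_mm >= dmin) & (depth_mm <= dmin + d_mm)
    ratio = np.count_nonzero(rc_mask) / max(np.count_nonzero(rb), 1)
    return 0.5 <= ratio <= 1.0                      # hand counts as stretched out
```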
Step 209, lock the target: if a valid initial gesture has been found, the target hand is labeled Hand1, and its 3D position information is stored in Hand1's trajectory traj1.
After the initial gesture is found, the later steps are implemented in detail as follows:
Single-hand tracking: after the first hand is found, the single-hand tracking module is entered, and the position and size information of Hand1 in the previous two frames f_{t-2} and f_{t-1} determines the rectangular region of interest ROI1 in the current frame f_t. ROI1 is defined by:
x = 2x_{t-1} − x_{t-2}
y = 2y_{t-1} − y_{t-2}
width = 1.5 × width(boundRect)
height = 1.5 × height(boundRect)
Here (x, y) is the center of ROI1, width and height are its width and height, (x_{t-2}, y_{t-2}) and (x_{t-1}, y_{t-1}) are the image-plane coordinates of Hand1's center in the previous two frames f_{t-2} and f_{t-1}, and boundRect is the bounding rectangle of the hand region in the previous frame f_{t-1}. If t < 2, i.e., the current frame is the first frame after the initial gesture was detected, only the previous frame contains hand information; the center of ROI1 is then x = x_{t-1}, y = y_{t-1}, with the same width and height as above. Within ROI1 of the depth map, the largest connected region whose depth lies in [Dmin, Dmin+l] and whose area exceeds Amin is found and determined to be the tracked target, and its 3D position information is stored in Hand1's trajectory traj1. Here Dmin is the minimum depth value within ROI1, and l is the threshold for segmenting the hand, typically set to 5~8 cm.
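The ROI prediction above reduces to a few lines; the sketch below assumes pixel coordinates and an (x, y, w, h) bounding-rectangle convention.

```python
def predict_roi(p_prev, p_prev2, bound_rect):
    """Sketch of the ROI1 prediction: linear extrapolation of the hand
    centre from frames t-1 (p_prev) and t-2 (p_prev2), with a window 1.5x
    the previous frame's bounding rectangle (x, y, w, h). For the first
    tracked frame, pass p_prev2 = p_prev."""
    x = 2 * p_prev[0] - p_prev2[0]        # x = 2*x_{t-1} - x_{t-2}
    y = 2 * p_prev[1] - p_prev2[1]        # y = 2*y_{t-1} - y_{t-2}
    w = 1.5 * bound_rect[2]
    h = 1.5 * bound_rect[3]
    return (x - w / 2, y - h / 2, w, h)   # ROI as (left, top, width, height)
```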
Detecting the second hand: a target whose depth value and area are both similar to Hand1's is sought in the current frame, i.e., a connected region whose depth lies in the range [z_t − 2 cm, z_t + l + 2 cm] and whose area lies in the range [area_t × 0.8, area_t × 1.2]. Such a region is labeled as the second hand Hand2, and its 3D position information is stored in its corresponding trajectory traj2. Here z_t is the minimum depth value of Hand1, area_t is the area of Hand1, and l is the threshold for segmenting the hand, typically set to 5~8 cm.
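A sketch of this global second-hand search; the thresholds follow the patent's ranges, while the exclusion of Hand1's own region (which would otherwise match itself) is noted but omitted.

```python
import cv2
import numpy as np

def detect_second_hand(depth_mm, z1_mm, area1, l_mm=60):
    """Sketch: find a connected region whose depth lies in
    [z1 - 20, z1 + l + 20] (mm) and whose area is within
    [0.8, 1.2] x area(Hand1). Excluding Hand1's own region
    is omitted here for brevity."""
    band = (depth_mm >= z1_mm - 20) & (depth_mm <= z1_mm + l_mm + 20)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(band.astype(np.uint8))
    for i in range(1, n):
        if 0.8 * area1 <= stats[i, cv2.CC_STAT_AREA] <= 1.2 * area1:
            return labels == i            # binary mask of the candidate Hand2
    return None
```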
Two-hand tracking: while the two hands are in the separated state before any occlusion, the respective regions of interest ROI1 and ROI2 in the current frame f_t are set according to the positions and sizes of Hand1 and Hand2 in the previous two frames f_{t-2} and f_{t-1}. Target Hand1 is sought in ROI1 and target Hand2 in ROI2, by the same method as in single-hand tracking. When the two hands occlude each other, ROI1 and ROI2 overlap, the two detected targets are actually the same target, and the trajectories of the two hands coincide. When the two hands separate after the occlusion, they are distinguished according to the invariance of their positional relationship in the depth direction before and after the occlusion:
z_s^1 < z_s^2 ⇒ z_t^1 < z_t^2
where s is a time before the occlusion, t is a time after the hands separate, z_s^1 and z_s^2 are the depth values of Hand1 and Hand2 at time s, and z_t^1 and z_t^2 are the depth values of Hand1 and Hand2 at time t. That is, if Hand1 was in front of Hand2 before the mutual occlusion, then after the occlusion ends Hand1 is still in front of Hand2. After the two hands are distinguished in this way, they are labeled and their 3D position information is stored in their respective trajectories.
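The depth-order rule can be expressed as a small re-labeling function; the candidate representation below is an assumption for illustration.

```python
def relabel_after_occlusion(z1_before, z2_before, cand_a, cand_b):
    """Sketch of the depth-order rule: the hand that was nearer the camera
    before the occlusion stays nearer after it. Each candidate is assumed
    to be a (mask, min_depth) pair; this representation is illustrative."""
    near, far = sorted((cand_a, cand_b), key=lambda c: c[1])
    if z1_before < z2_before:             # Hand1 was in front before the occlusion
        return {"Hand1": near, "Hand2": far}
    return {"Hand1": far, "Hand2": near}
```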
It should be appreciated that the specific operation of the modules and units in the device embodiment of the invention may be identical to the description in the method embodiment, and is not described in detail here.
The above content is a further detailed description of the invention in conjunction with specific preferred embodiments, and the specific implementation of the invention cannot be considered confined to these descriptions. For those of ordinary skill in the technical field of the invention, several equivalent substitutions or obvious modifications made without departing from the inventive concept, with identical performance or use, should all be considered to fall within the protection scope of the invention.

Claims (14)

1. A two-hand tracking method based on a Kinect sensor, characterized by including:
S1, a video acquisition step: acquiring from the Kinect sensor a color video stream and a depth video stream with the same resolution and frame rate;
S2, a first detection step: detecting, from the images of the acquired color and depth video streams, a first hand making an initial gesture;
S3, a single-hand tracking step: locating and tracking the first hand using its position and size information in the previous frame or previous two frames;
S4, a second detection step: filtering the depth map to find a target whose area and depth are both similar to those of the first hand, detecting a second hand using the position and size information of the first hand;
S5, a two-hand tracking step: tracking the two detected hands.
2. The two-hand tracking method based on a Kinect sensor of claim 1, characterized in that step S2 further includes:
S2-1, a sample training step;
S2-2, a mode selection step;
S2-3, an initial gesture determination step.
3. The two-hand tracking method based on a Kinect sensor of claim 2, characterized in that: in step S2-1, an SVM (Support Vector Machine) classifier is selected to learn and train on the morphological information of the hand, with geometric invariant moments chosen as the training features.
4. The two-hand tracking method based on a Kinect sensor of claim 2, characterized in that: in step S2-2, when the lighting is suitable, a skin-color mode is selected, i.e., the first hand is extracted by skin-color filtering combined with depth filtering; when the light is too dark or too bright, a shape mode is selected, i.e., the first hand is extracted by shape filtering combined with depth filtering.
5. The two-hand tracking method based on a Kinect sensor of claim 2, characterized in that: in step S2-3, the initial gesture is defined as: the hand extends forward, at a distance of more than a threshold d from the body.
6. The two-hand tracking method based on a Kinect sensor of claim 1, characterized in that the locating in step S3 is: predicting the ROI of the first hand in the current frame from the position and bounding rectangle of the first hand in the previous frame or previous two frames, and performing depth filtering within this ROI to locate the first hand in the current frame.
7. The two-hand tracking method based on a Kinect sensor of claim 1, characterized in that the two-hand tracking in step S5 includes:
1) while the two hands are separated, before any mutual occlusion, tracking the two targets separately: predicting the ROI of each target in the current frame from the position and size information of the two hands in the previous frame or previous two frames, and detecting within each of the two regions;
2) while the two hands occlude each other, treating the two coinciding detected target trajectories as a single target and tracking it;
3) when the two hands separate after the mutual occlusion, distinguishing them according to the invariance of their positional relationship in the depth direction before and after the occlusion, and tracking them separately.
8. A two-hand tracking device based on a Kinect sensor, characterized by including the following modules:
a video acquisition module, for acquiring from the Kinect sensor a color video stream and a depth video stream with the same resolution and frame rate;
a first detection module, for detecting, from the acquired color and depth images, a first hand making an initial gesture;
a single-hand tracking module, including a locating unit and a tracking unit, for locating and tracking the first hand using its position and size information in the previous frame or previous two frames;
a second detection module, for detecting a second hand using the position and size information of the first hand;
a two-hand tracking module, for tracking the two detected hands.
9. The two-hand tracking device based on a Kinect sensor of claim 8, characterized in that: the first detection module includes a sample training unit, a mode selection unit, and an initial gesture determination unit.
10. The two-hand tracking device based on a Kinect sensor of claim 9, characterized in that: the sample training unit selects an SVM classifier to learn and train on the morphological information of the hand, with geometric invariant moments chosen as the training features.
11. The two-hand tracking device based on a Kinect sensor of claim 9, characterized in that the mode selection unit is configured to:
when the lighting is suitable, select a skin-color mode, i.e., extract the first hand by skin-color filtering combined with depth filtering;
when the light is too dark or too bright, select a shape mode, i.e., extract the first hand by shape filtering combined with depth filtering.
12. The two-hand tracking device based on a Kinect sensor of claim 9, characterized in that: in the initial gesture determination unit, the initial gesture is defined as: the hand extends forward, at a distance of more than a threshold d from the body.
13. The two-hand tracking device based on a Kinect sensor of claim 8, characterized in that the locating unit is configured to: predict the ROI of the first hand in the current frame from the position and bounding rectangle of the first hand in the previous frame or previous two frames, and perform depth filtering within this ROI to locate the first hand in the current frame.
14. The two-hand tracking device based on a Kinect sensor of claim 8, characterized in that the two-hand tracking module is configured to:
1) while the two hands are separated, before any mutual occlusion, track the two targets separately: predict the ROI of each target in the current frame from the position and size information of the two hands in the previous frame or previous two frames, and detect within each of the two regions;
2) while the two hands occlude each other, treat the two coinciding detected target trajectories as a single target and track it;
3) when the two hands separate after the mutual occlusion, distinguish them according to the invariance of their positional relationship in the depth direction before and after the occlusion, and track them separately.
CN201310497334.XA 2013-10-21 2013-10-21 Two-hand tracking method and device based on a Kinect sensor Active CN103530892B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310497334.XA CN103530892B (en) 2013-10-21 2013-10-21 Two-hand tracking method and device based on a Kinect sensor


Publications (2)

Publication Number Publication Date
CN103530892A CN103530892A (en) 2014-01-22
CN103530892B (en) 2016-06-22

Family

ID=49932870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310497334.XA Active CN103530892B (en) 2013-10-21 2013-10-21 Two-hand tracking method and device based on a Kinect sensor

Country Status (1)

Country Link
CN (1) CN103530892B (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101901052A (en) * 2010-05-24 2010-12-01 华南理工大学 Target control method based on mutual reference of both hands
CN103034851A (en) * 2012-12-24 2013-04-10 清华大学深圳研究生院 Device and method of self-learning skin-color model based hand portion tracking
CN103257713A (en) * 2013-05-31 2013-08-21 华南理工大学 Gesture control method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Oikonomidis, I. "Tracking the articulated motion of two strongly interacting hands." Computer Vision and Pattern Recognition (CVPR), 2012-06-21, pp. 1862-1869. *
Zhu, Yanmin et al. "Tracking the articulated motion of two strongly interacting hands." Service Sciences, 2013-04-13, pp. 260-265. *
Li, Yi. "Hand Gesture Recognition Using Kinect." Software Engineering and Service Science, 2012, pp. 196-199. *

Also Published As

Publication number Publication date
CN103530892A (en) 2014-01-22


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant