CN104765440A - Hand detecting method and device - Google Patents

Hand detecting method and device

Info

Publication number
CN104765440A
CN104765440A (application CN201410001215.5A)
Authority
CN
China
Prior art keywords
hand
candidate region
wrist
forearm
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410001215.5A
Other languages
Chinese (zh)
Other versions
CN104765440B (en)
Inventor
赵颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd filed Critical Ricoh Co Ltd
Priority to CN201410001215.5A
Publication of CN104765440A
Application granted
Publication of CN104765440B
Legal status: Expired - Fee Related
Anticipated expiration


Abstract

The invention provides a hand detection method and device. The method comprises: detecting candidate regions of a hand in the current scene; determining a forearm region in the current scene based on a depth image of the scene; predicting corresponding wrist information from each hand candidate region, and predicting wrist information from the forearm region; and, based on the wrist information, selecting the candidate region with the highest confidence as the region where the hand is located. With the hand detection method and device, good detection results can be obtained under various complex conditions, such as image blur caused by hand motion, skin-colored objects in the background, illumination changes during human-machine interaction, and overlap between the hand and the face.

Description

Hand detection method and device
Technical field
The present invention relates generally to human-machine interaction, and more specifically to a hand detection method and device.
Background art
At present, human-machine interaction has developed from touch interaction to interaction based on detecting an operator's gestures and postures. Specifically, a scene containing the operator in front of a screen is captured, the captured scene image is processed to recognize the operator's action, and the action is converted into an operating instruction for the machine, thereby realizing human-machine interaction. This kind of interaction usually requires recognizing the operator's gestures, and the basis of gesture recognition is hand detection. In view of the characteristics of the hand itself, such as skin color and the distinctive shape of the hand, hands are usually recognized and detected based on their color or contour.
U.S. patent application US20100803369A describes a vision-based hand detection method that requires both hands to appear simultaneously. Specifically, the method computes the depth information of the scene from the left and right parallax images captured by two cameras, obtains multiple hand candidate regions by skin-color detection, combines the candidate regions in pairs, and derives the positions of the two hands from information such as their size difference, depth difference and position difference. Japanese patent application JP2005000236876 describes a method of detecting hands from color images. The method uses skin-color features to detect multiple hand candidate regions, then computes the shape of each candidate region and determines whether the region is a hand according to the complexity of its shape. In addition, US patent US7593552B2 describes a gesture recognition method that includes hand detection. Specifically, the face is first detected to obtain its position and skin-color information; then, centered on the face position, a search region is taken on each side of the face, skin-color regions are detected within each search region, and the centroid of each detected skin-color region is taken as the hand location point. However, none of the above hand detection methods copes well with situations such as motion blur, skin-colored objects in the background, or illumination changes. Moreover, these methods mostly require an initialization gesture.
Summary of the invention
According to one aspect of the present invention, a hand detection method is provided, comprising: detecting candidate regions of a hand in a current scene; determining a forearm region in the current scene based on a depth image of the current scene; predicting corresponding wrist information from each hand candidate region, and predicting wrist information from the forearm region; and, based on the wrist information, selecting the candidate region with the highest confidence as the region where the hand is located.
According to another aspect of the present invention, a hand detection device is provided, comprising: a hand candidate region detection unit configured to detect candidate regions of a hand in a current scene; a forearm region detection unit configured to determine a forearm region in the current scene based on a depth image of the current scene; a wrist information prediction unit configured to predict corresponding wrist information from each hand candidate region and to predict wrist information from the forearm region; and a hand region determination unit configured to select, based on the wrist information, the candidate region with the highest confidence as the region where the hand is located.
The hand detection technique according to embodiments of the present invention copes well with various complex situations, such as image blur caused by hand motion, skin-colored objects in the background, illumination changes during interaction, and overlap between the hand and the face. In addition, the technique can detect the hand from a single frame and does not require an initialization gesture or motion information.
Brief description of the drawings
Fig. 1 schematically shows a scenario in which the hand detection technique according to an embodiment of the present invention is applied.
Fig. 2 shows a flowchart of the hand detection method according to an embodiment of the present invention.
Fig. 3 shows a flowchart of the process of detecting hand candidate regions in the current scene in the hand detection method according to an embodiment of the present invention.
Fig. 4 shows a flowchart of determining the hue contrast map, the saturation contrast map and the depth contrast map from the foreground depth image and the foreground color image in the process shown in Fig. 3.
Fig. 5 shows a flowchart of computing the weight maps of the hue contrast map, the saturation contrast map and the depth contrast map in the process shown in Fig. 3.
Fig. 6 shows a flowchart of the process of determining the forearm region in the current scene in the hand detection method according to an embodiment of the present invention.
Fig. 7(a) shows an exemplary foreground depth image, Fig. 7(b) shows a schematic depth distribution map, Fig. 7(c) shows the depth distribution map after thresholding, and Fig. 7(d) schematically illustrates the straight line corresponding to the forearm.
Fig. 8 shows a flowchart of the process of predicting the corresponding wrist information from a hand candidate region in the hand detection method according to an embodiment of the present invention.
Fig. 9 shows a flowchart of the process of predicting wrist information from the forearm region in the hand detection method according to an embodiment of the present invention.
Fig. 10(a) schematically shows the wrist information predicted from a hand candidate region in the foreground depth image.
Fig. 10(b) schematically shows the wrist information predicted from the forearm region in the foreground depth image.
Fig. 11 shows a functional block diagram of the hand detection device according to an embodiment of the present invention.
Fig. 12 shows a general hardware block diagram of the hand detection system according to an embodiment of the present invention.
Detailed description of the embodiments
To enable those skilled in the art to better understand the present invention, the present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 schematically shows a scenario in which the hand detection technique according to an embodiment of the present invention is applied. As shown in Fig. 1, a user stands within the imaging range of a camera 101 and controls a program in a processing device 102, such as a computer, by gestures. The camera 101 can be any camera that provides both a color image and a depth image of the scene, such as a PrimeSensor or a Kinect. While the user performs gesture control, the camera 101 captures the user, and for every frame captured by the camera 101 the processing device 102 detects the position of the user's hand. Note that Fig. 1 shows only one possible application scenario of the present invention; depending on the actual situation, devices may be added or removed, configured differently, or replaced by other devices.
Below, the hand detection technique according to embodiments of the present invention is described in detail.
First, the basic idea of the hand detection technique of the present invention is briefly described. As is well known, the hand is a non-rigid object with characteristics such as fast motion and strong deformability. Therefore, in complex situations such as image blur caused by hand motion, skin-colored objects in the background, or overlap between the hand and the face, it is difficult to accurately detect the position of the hand. In view of this, the present invention proposes to verify the multiple detected hand candidate regions with forearm information, thereby obtaining an accurate hand position. Because the forearm is relatively large and deforms little, it is easier to detect than the hand, and a hand position verified with the forearm information therefore has higher reliability.
Fig. 2 shows a flowchart of the hand detection method according to an embodiment of the present invention.
As shown in Fig. 2, in step S201, candidate regions of a hand in the current scene are detected.
As mentioned above, detecting the candidate regions of one or more hands has been studied extensively in the art and is not the key point of the present invention; those skilled in the art may detect the hand candidate regions in the current scene in any suitable manner. For completeness, the exemplary approach used in this embodiment is briefly described with reference to Fig. 3.
As shown in Fig. 3, in step S2011, the depth image and the color image of the current scene captured by the camera are obtained. As known to those skilled in the art, a depth image is an image in which the value of each pixel represents the distance between a point in the scene and the camera, while the color image is an RGB image.
In step S2012, a foreground depth image and a foreground color image of the current scene are obtained based on the depth image and the color image.
Since the depth image and the color image contain not only the user but also other background content, the foreground region containing the user and/or other objects near the camera can be segmented from the depth image and the color image to reduce the computation of subsequent detection, thereby obtaining the foreground depth image and the foreground color image of the current scene. This can be done with various existing methods, such as connected-component analysis or threshold segmentation, which are not described in detail here.
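By way of illustration only, the following Python sketch shows one way such a foreground segmentation could be realized with a simple depth threshold; the threshold value and the NumPy-based implementation are assumptions made for the example and are not part of the described method.

```python
import numpy as np

def segment_foreground(depth, color, max_depth_mm=1500):
    """Keep pixels closer than an assumed depth threshold as foreground.
    A minimal stand-in for the connected-component / threshold segmentation
    mentioned in the text."""
    mask = (depth > 0) & (depth < max_depth_mm)     # valid and near the camera
    fg_depth = np.where(mask, depth, 0)             # foreground depth image
    fg_color = np.where(mask[..., None], color, 0)  # foreground color image
    return fg_depth, fg_color, mask
```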
In the subsequent steps S2013-S2016, salient regions in the foreground are detected from the hue, saturation and depth information of the hand, using the foreground depth image and the foreground color image, and are taken as the hand candidate regions. Because the color of the hand differs from that of most objects, color contrast can serve as one feature for distinguishing the hand from most objects; and by processing the hue and the saturation of the color separately, the hand can still be distinguished fairly accurately even when the foreground contains objects whose color is similar to that of the hand. The hue and saturation components can be obtained by converting the color image from the RGB color space to the HSV color space, which is prior art and not described in detail here. On the other hand, during human-machine interaction no other object lies between the hand and the camera, so the hand is also salient in depth, which serves as another feature for distinguishing the hand from other objects. The salient-region detection therefore involves computing and fusing contrasts in hue, saturation and depth, as described below with reference to steps S2013-S2016.
Specifically, in step S2013, a hue contrast map C_T, a saturation contrast map C_S and a depth contrast map C_D are determined based on the foreground depth image and the foreground color image.
The hue contrast map C_T is computed from the hue channel of the foreground color image, the saturation contrast map C_S is computed from the saturation channel of the foreground color image, and the depth contrast map C_D is computed from the foreground depth image. In each contrast map, the value of a pixel represents the saliency of that pixel relative to the other pixels in the image. The three contrast maps C_T, C_S and C_D can be computed with the same method, for example the method described below with reference to Fig. 4; that is, by executing the method of Fig. 4 three times, C_T, C_S and C_D are computed respectively. For convenience, in the following description I denotes the input image and C_d (d = D, T, S) denotes the corresponding contrast map. It will be understood that when the hue contrast map is computed, I denotes the hue channel of the foreground color image; when the saturation contrast map is computed, I denotes the saturation channel of the foreground color image; and when the depth contrast map is computed, I denotes the foreground depth image.
As shown in Fig. 4, in step S401, for each pixel i of the image I its neighborhood pixels j (j = 1...n_i) are selected, where n_i is the number of neighborhood pixels of pixel i.
Neighborhood pixels can be selected in any suitable manner. One possible way is multi-density sampling, in which more neighborhood pixels are sampled near pixel i and fewer are sampled far from it. Specifically, m equally spaced directions are chosen with pixel i as the origin, and along each of the m directions pixels are sampled with step r until the border of the image I is reached. The value of m can be chosen as needed; for example, m is typically 8 in experiments, a larger value (e.g. 16) can be used for a more accurate result, and a smaller value (e.g. 4) can be used when the accuracy requirement is low. The step r can also be chosen as needed, for example 2 pixels. Through this sampling, the hue values, saturation values and depth values of the sampled points are obtained from the hue channel of the foreground color image, the saturation channel of the foreground color image and the foreground depth image, respectively. Optionally, in this step neighborhood pixels may be selected only for pixels i with non-zero value, to reduce computation.
In step S402, for each pixel i taken as an origin in image I, the pixel-value difference between it and each corresponding neighborhood pixel j is computed.
In this step, the pixel difference d_ij between each origin pixel i and its neighborhood pixel j can be computed with expression [1]:
$d_{ij} = |I_i - I_j|^2, \quad i = 1 \ldots N$   [1]
where I_i is the value of pixel i, I_j is the value of the neighborhood pixel j, and N is the size of the image I. More specifically, when the depth contrast map C_D is computed, I_i and I_j denote the depth values of pixels i and j; when the hue contrast map C_T is computed, I_i and I_j denote the hue values of pixels i and j; and when the saturation contrast map C_S is computed, I_i and I_j denote the saturation values of pixels i and j.
In step S403, a weight is determined for each neighborhood pixel j.
In this step, the Gaussian weight w_ij of the neighborhood pixel j can be computed, for example, with expression [2]:
$w_{ij} = \exp\left(-\frac{1}{2\sigma_p^2}\,\|p_i - p_j\|^2\right)$   [2]
where σ_p is the scale factor of the Gaussian weight, for example 0.25 in experiments, p_i and p_j are the positions of pixels i and j, and ||p_i - p_j|| denotes the Euclidean distance between p_i and p_j. As can be seen from expression [2], the farther a neighborhood pixel is, the smaller its weight, and the nearer a neighborhood pixel is, the larger its weight.
Subsequently, in step S404, the contrast value of each origin pixel i in image I is determined, so as to obtain the contrast map C_d.
In this step, the contrast value of each origin pixel i in image I can be computed, for example, with expression [3]:
$C_d^i = \sum_{j=1}^{n_i} d_{ij}\, w_{ij}$   [3]
By performing the above steps S401-S404 for each pixel in each of the three foreground images I, the three contrast maps are obtained, namely the hue contrast map C_T, the saturation contrast map C_S and the depth contrast map C_D. As mentioned above, in each contrast map the value of a pixel represents the saliency of that pixel relative to the other pixels of the image; since the hand is more salient than other objects in the scene in hue, saturation and depth, a pixel with a larger value in a contrast map has a higher probability of belonging to the hand.
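By way of illustration only, a Python sketch of steps S401-S404 is given below. It assumes the input channel is a floating-point array and that pixel positions are normalized to [0, 1] so that the example value σ_p = 0.25 is meaningful; the description does not state this normalization, so it is an assumption of the sketch.

```python
import numpy as np

def contrast_map(img, m=8, r=2, sigma_p=0.25):
    """Sketch of the multi-density contrast computation of steps S401-S404
    for one channel (hue, saturation or depth)."""
    h, w = img.shape
    C = np.zeros_like(img, dtype=np.float64)
    angles = np.arange(m) * 2.0 * np.pi / m
    ys, xs = np.nonzero(img)                      # optionally only non-zero pixels
    for y, x in zip(ys, xs):
        acc = 0.0
        for a in angles:
            dy, dx = np.sin(a), np.cos(a)
            k = 1
            while True:
                ny = int(round(y + k * r * dy))
                nx = int(round(x + k * r * dx))
                if not (0 <= ny < h and 0 <= nx < w):
                    break
                d_ij = (img[y, x] - img[ny, nx]) ** 2                 # expression [1]
                # assumed normalization of positions to [0, 1]
                dist2 = ((y - ny) / h) ** 2 + ((x - nx) / w) ** 2
                w_ij = np.exp(-dist2 / (2.0 * sigma_p ** 2))          # expression [2]
                acc += d_ij * w_ij                                    # expression [3]
                k += 1
        C[y, x] = acc
    return C
```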
Returning to Fig. 3, in step S2014, the weight maps W_T, W_S and W_D of the hue contrast map C_T, the saturation contrast map C_S and the depth contrast map C_D are computed.
In this embodiment, the weight maps W_T, W_S and W_D are obtained by computing the votes of the contrast maps C_D, C_T and C_S for one another. Here, a vote is one way of describing the difference between contrast maps. The pixel values of a weight map represent the confidence of the corresponding contrast map: the larger a pixel value in a weight map, the more reliable the corresponding contrast map. The process of computing the weight maps according to this embodiment is described below with reference to Fig. 5.
As shown in Fig. 5, in step S501, the corresponding gradient vector map G_d is computed for each of the contrast maps C_D, C_T, C_S.
The gradient vector map G_d is a pair (D_d, M_d) (d = D, T, S), where D_d is the gradient direction and M_d is the gradient magnitude. The gradient vector map corresponding to a contrast map is generated by computing the gradient for each pixel in the contrast map; computing pixel gradients is a common technique in the art and is not described in detail here.
In step S502, for each contrast map, the vote of every other contrast map for it is computed.
For convenience, in the following C_d (d = D, T, S) denotes any one of the three contrast maps C_D, C_T, C_S, and C_c (c = D, T, S, c ≠ d) denotes another contrast map different from C_d. The vote of C_c for C_d describes the probability that C_d is also correct under the assumption that C_c is correct.
In this step, the probability that C_d is wrong given that C_c is correct is computed first. In general, if C_d is wrong while C_c is correct, the directions of their gradient vectors necessarily differ, i.e. there is an angle between the two vectors. According to the triangle rule of vectors, the difference of two vectors corresponds to the side opposite the angle between them. Therefore, the probability that C_d is wrong given that C_c is correct can be computed with expression [4]:
$P(C_d^- \mid C_c^+) = M_d \sin\theta \cdot F$   [4]
$F = \frac{1}{1 + \exp(-|D_c - D_d|)}$
$c, d = D, T, S; \quad c \neq d$
where $C_d^-$ denotes that C_d is wrong, $C_c^+$ denotes that C_c is correct, θ is the angle between the gradient vectors G_c and G_d, and F handles the case where the angle between the two vectors is obtuse.
Subsequently, the vote of C_c for C_d is computed with expression [5]:
$V_{dc} = \frac{1}{1 + P(C_d^- \mid C_c^+)}$   [5]
As can be seen from expression [5], the higher the probability that C_d is wrong when C_c is correct, the smaller the vote of C_c for C_d. The processing in step S502 is performed pixel by pixel for each contrast map.
In step S503, the weight map of each contrast map is computed based on the votes.
Specifically, in this step the weight map of each contrast map can be computed with expression [6]:
$W_d = \sum_{c \neq d}^{D,T,S} V_{dc}$   [6]
where W_d (d = D, T, S) is the weight map of the contrast map C_d. As can be seen from expression [6], the weight map of a contrast map is the sum of the votes of the other contrast maps for it. The processing in step S503 is performed pixel by pixel for each weight map.
Thus, the weight maps W_T, W_S and W_D of the hue contrast map C_T, the saturation contrast map C_S and the depth contrast map C_D are obtained. Optionally, for convenience of processing, the weight maps W_T, W_S and W_D can be normalized with expression [7]:
$W'_d = \frac{W_d}{\sum_{d}^{D,T,S} W_d}$   [7]
where W'_d (d = D, T, S) is the normalized weight map.
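By way of illustration only, the voting of steps S501-S503 and expressions [4]-[7] could be sketched in Python as follows; the use of np.gradient as the gradient operator and the small constant guarding the normalization are assumptions of the example.

```python
import numpy as np

def weight_maps(C_D, C_T, C_S):
    """Sketch of steps S501-S503: vote among the three contrast maps and
    return the normalized weight maps W'_d."""
    maps = {'D': C_D, 'T': C_T, 'S': C_S}
    grads = {}
    for k, C in maps.items():
        gy, gx = np.gradient(C)
        grads[k] = (np.arctan2(gy, gx), np.hypot(gy, gx))   # (direction D_d, magnitude M_d)
    W = {}
    for d in maps:
        D_d, M_d = grads[d]
        vote_sum = np.zeros_like(maps[d], dtype=np.float64)
        for c in maps:
            if c == d:
                continue
            D_c, _ = grads[c]
            theta = np.abs(D_c - D_d)                        # angle between gradient directions
            F = 1.0 / (1.0 + np.exp(-theta))                 # F term of expression [4]
            P_wrong = M_d * np.abs(np.sin(theta)) * F        # expression [4]
            vote_sum += 1.0 / (1.0 + P_wrong)                # expressions [5], [6]
        W[d] = vote_sum
    total = W['D'] + W['T'] + W['S']
    return {d: W[d] / np.maximum(total, 1e-9) for d in W}    # expression [7]
```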
Returning to Fig. 3, in step S2015, a saliency map is obtained based on the contrast maps and the corresponding weight maps.
Specifically, as shown in expression [8], in this step the saliency map SM can be obtained by a weighted summation of the contrast maps with their corresponding weight maps:
$SM = \sum_{d}^{D,T,S} C_d\, W'_d$   [8]
The saliency map SM describes the saliency obtained by jointly considering hue, saturation and depth, and a pixel value in this saliency map represents the probability that the corresponding pixel belongs to the hand region.
Subsequently, in step S2016, the hand candidate regions in the current scene are determined based on the saliency map.
In this step, the hand candidate regions can be determined by binarizing the saliency map. Specifically, as shown in expression [9], each pixel in the saliency map can be binarized with a predetermined threshold α:
$H(i) = \begin{cases} 1, & SM(i) > \alpha \\ 0, & \text{otherwise} \end{cases}$   [9]
where the pixels with H value 1 form one or more hand candidate regions. Thus, the hand candidate regions in the current scene are obtained, and likewise the hand candidate regions in the foreground depth image of the current scene are obtained.
In addition, optionally, for each hand candidate region thus determined, the probability that it is the region where the hand is located can be computed, for example with expression [10], for use in subsequent processing:
$G_H(t) = \text{average}(SM(t))$   [10]
where t is a hand candidate region and SM(t) are the saliency values of the pixels located in region t in the saliency map. According to expression [10], the probability that a candidate region is the region where the hand is located is the mean of the saliency values of the pixels contained in that region.
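By way of illustration only, steps S2015-S2016 and expressions [8]-[10] could be sketched as follows; the threshold value alpha = 0.6 and the use of scipy.ndimage.label for grouping the binarized pixels into candidate regions are assumptions of the example.

```python
import numpy as np
from scipy import ndimage

def hand_candidates(C, W, alpha=0.6):
    """Fuse the contrast maps into a saliency map, binarize it, and score
    each connected region; C and W are dicts keyed by 'D', 'T', 'S'."""
    SM = sum(C[d] * W[d] for d in ('D', 'T', 'S'))           # expression [8]
    H = (SM > alpha).astype(np.uint8)                         # expression [9]
    labels, n = ndimage.label(H)                              # connected candidate regions
    regions = []
    for t in range(1, n + 1):
        mask = labels == t
        G_H = SM[mask].mean()                                 # expression [10]
        regions.append((mask, G_H))
    return SM, regions
```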
Above, an exemplary way of detecting the hand candidate regions in the current scene using the hue, saturation and depth information of the hand has been briefly described. As mentioned above, those skilled in the art may detect the candidate regions in any other suitable manner, for example skin-color-based detection of hand candidate regions, shape-based detection of hand candidate regions, and so on.
Returning to Fig. 2, in step S202, the forearm region in the current scene is determined based on the depth image of the current scene.
Fig. 6 shows a flowchart of the specific process of determining the forearm region in the current scene based on the depth image of the current scene according to this embodiment.
As shown in Fig. 6, in step S601, a depth distribution map is generated based on the foreground depth image obtained from the depth image of the current scene.
Specifically, in this step, for each pixel i in the foreground depth image an inner region and an outer region are set centered on it, the outer region containing the inner region, and the depth distribution of the pixel is computed according to, for example, the following expressions [11]-[14]:
$G(i) = \exp(-(R_I + R_O))$   [11]
$R_I = \frac{C_I}{N_I}$   [12]
$R_O = \frac{C_O}{N_O}$   [13]
$T = \min(D_I) + \epsilon$   [14]
where D_I denotes the matrix of depth values of the inner region, min(D_I) is the minimum depth value of the pixels in the inner region (in this embodiment it is agreed that the closer a pixel is to the camera, the larger its depth value), and ε is a constant that can be set as needed. R_I denotes the ratio of the number C_I of pixels in the inner region whose depth value is greater than T to the total number N_I of pixels in the inner region. R_O denotes the ratio of the number C_O of pixels, in the part of the outer region excluding the inner region, whose depth value is greater than T to the total number N_O of pixels in that part of the outer region.
By computing the depth distribution of each pixel i in the foreground depth image as above, the depth distribution map is formed. For example, Fig. 7(a) shows an exemplary foreground depth image, and Fig. 7(b) shows the resulting schematic depth distribution map.
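By way of illustration only, expressions [11]-[14] could be sketched as follows; the inner/outer window radii and the constant eps are illustrative values chosen for the example, not values given in the description.

```python
import numpy as np

def depth_distribution(fg_depth, r_in=5, r_out=15, eps=10):
    """Sketch of expressions [11]-[14]; larger depth value = closer to the camera."""
    h, w = fg_depth.shape
    G = np.zeros((h, w), dtype=np.float64)
    for y in range(h):
        for x in range(w):
            if fg_depth[y, x] == 0:
                continue
            D_I = fg_depth[max(0, y - r_in):min(h, y + r_in + 1),
                           max(0, x - r_in):min(w, x + r_in + 1)]
            D_O = fg_depth[max(0, y - r_out):min(h, y + r_out + 1),
                           max(0, x - r_out):min(w, x + r_out + 1)]
            T = D_I.min() + eps                               # expression [14]
            R_I = np.count_nonzero(D_I > T) / D_I.size        # expression [12]
            n_out = max(D_O.size - D_I.size, 1)               # outer region minus inner region
            c_out = np.count_nonzero(D_O > T) - np.count_nonzero(D_I > T)
            R_O = c_out / n_out                               # expression [13]
            G[y, x] = np.exp(-(R_I + R_O))                    # expression [11]
    return G
```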
It will be understood that computing the depth distribution by setting an inner region and an outer region for each pixel i, as described above, is only one example. Those skilled in the art may use any other suitable way to compute the depth distribution of the pixels; for example, the distribution of pixel values can be obtained by computing the histogram of the depth image, thereby obtaining the depth distribution.
In step S602, the depth distribution map is thresholded.
How to threshold an image is known in the art and is not described in detail here. Fig. 7(c) shows the schematic depth distribution map after thresholding.
In step S603, straight lines are detected in the thresholded depth distribution map.
There are many known methods in the art for detecting straight lines in an image, such as the Hough transform, which are likewise not described in detail here.
In step S604, the straight line corresponding to the forearm among the detected straight lines is determined based on the position of the human body in the foreground depth image.
The position of the human body in the foreground depth image can be determined by any suitable method, such as head-shoulder model matching, which is not described in detail here. After the position of the human body is determined, the straight line corresponding to the forearm among the straight lines detected in step S603 can be determined according to the structure of the human figure. Fig. 7(d) schematically illustrates the straight line corresponding to the forearm.
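By way of illustration only, steps S602-S604 could be sketched with OpenCV as follows; the Hough-transform parameters and the rule of picking the line whose endpoint lies closest to a given body reference point are assumptions of the example, since the description leaves the line-detection and body-model details open.

```python
import cv2
import numpy as np

def forearm_line(depth_dist, body_point, thresh=0.5):
    """Threshold the depth distribution map, detect lines with the
    probabilistic Hough transform, and pick the line nearest an assumed
    body reference point 'body_point' = (x, y)."""
    binary = (depth_dist > thresh).astype(np.uint8) * 255      # step S602
    lines = cv2.HoughLinesP(binary, rho=1, theta=np.pi / 180,  # step S603
                            threshold=30, minLineLength=40, maxLineGap=10)
    if lines is None:
        return None
    bx, by = body_point
    def dist_to_body(l):
        x1, y1, x2, y2 = l[0]
        return min((x1 - bx) ** 2 + (y1 - by) ** 2,
                   (x2 - bx) ** 2 + (y2 - by) ** 2)
    return min(lines, key=dist_to_body)[0]                     # step S604: (x1, y1, x2, y2)
```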
Optionally, for the straight line corresponding to the forearm thus determined, the probability that it is the forearm can be computed, for example with expression [15], for use in subsequent processing:
$G_L = \text{average}(G(j))$   [15]
where j is a pixel located in the line region and G(j) is the depth distribution of each pixel j in this line region computed with expression [11] as described above. According to expression [15], the probability that the determined straight line is the forearm is the mean of the depth distribution values of the pixels contained in the line region.
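A minimal sketch of expression [15], assuming the forearm line has been rasterized into a boolean mask over the depth distribution map G:

```python
import numpy as np

def forearm_probability(G, line_mask):
    """Expression [15]: mean depth-distribution value over the forearm line pixels."""
    return float(G[line_mask].mean())
```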
In step S605, the forearm region is determined in the foreground depth image according to the straight line corresponding to the forearm.
After the straight line corresponding to the forearm has been detected in the thresholded depth distribution map, the forearm region can be determined in the foreground depth image by any suitable method, such as region growing.
Above, the specific process of determining the forearm region in the current scene based on the depth image of the current scene has been described with reference to Fig. 6. It should be understood that this is only one example; those skilled in the art may use any other suitable method to determine the forearm region, for example forearm detection based on motion constraints or arm detection based on joint variables.
Returning to Fig. 2, in step S203, corresponding wrist information is predicted from each hand candidate region, and wrist information is predicted from the forearm region.
In this step, for each detected hand candidate region, corresponding wrist information is predicted based on that candidate region, and for the determined forearm region, corresponding wrist information is predicted based on that forearm region. Optionally, this prediction can be carried out in the foreground depth image. This is described in detail below with reference to Figs. 8 and 9.
Fig. 8 shows a flowchart of the method of predicting the corresponding wrist information from a hand candidate region according to an embodiment of the present invention.
As shown in Fig. 8, in step S801, for any one hand candidate region, the bounding rectangle of the candidate region is determined and the principal direction of the hand is computed based on this bounding rectangle.
In this step, the principal direction of the bounding rectangle is computed as the principal direction of the hand. The principal direction can be determined by various methods known in the art; in this embodiment, principal component analysis (PCA) is used to compute the principal direction of the bounding rectangle. Fig. 10(a) schematically shows the wrist information predicted from a hand candidate region in the foreground depth image. As shown in Fig. 10(a), the solid rectangle in the figure is the bounding rectangle of the hand candidate region, and arrow A indicates the principal direction of the hand.
In step S802, a first auxiliary region of a predetermined size is arranged in the direction opposite to the principal direction of the hand, adjoining the bounding rectangle of the hand candidate region.
The first auxiliary region is the region predicted to contain the wrist. Its size can be preset according to the proportional relationship between the wrist and the hand; for example, the width of the first auxiliary region can be set to twice the width of the hand's bounding rectangle, and its height to half the height of the bounding rectangle. As shown in Fig. 10(a), the dashed rectangle in the figure is the first auxiliary region.
In step S803, the center of gravity of the first auxiliary region is determined as the predicted wrist position.
In this step, the center of gravity of the first auxiliary region is taken as the wrist position J_hl(x, y). As shown in Fig. 10(a), the "*" mark in the figure indicates the wrist position determined in this way.
In step S804, the principal direction of the first auxiliary region is computed as the predicted wrist direction.
In this embodiment, similarly to step S801, principal component analysis (PCA) is also used to compute the principal direction of the first auxiliary region, which is taken as the predicted wrist direction θ_h. As shown in Fig. 10(a), arrow B in the figure indicates the predicted wrist direction.
Thus, the wrist information corresponding to one hand candidate region has been predicted. By repeating steps S801-S804 for each hand candidate region, corresponding wrist information can be predicted for every candidate region.
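By way of illustration only, the auxiliary-region placement and the PCA-based directions of steps S801-S804 (and, with the direction reversed, of steps S901-S904 below) could be sketched as follows. Returning the auxiliary box centre as the wrist position and the region's principal axis as the wrist direction simplifies the second PCA described in the text, so the sketch is an approximation rather than the exact procedure; the sign convention of the PCA axis is also an assumption.

```python
import numpy as np

def principal_direction(mask):
    """Principal axis of a binary region via PCA of its pixel coordinates."""
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(np.float64)
    centre = pts.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov((pts - centre).T))
    return evecs[:, np.argmax(evals)], centre

def predict_wrist(mask, from_hand=True):
    """Sketch of steps S801-S804 (hand candidate, box placed against the
    principal direction) and S901-S904 (forearm, box placed along it).
    Assumes the principal axis is oriented toward the fingertips (hand)
    or from elbow to wrist (forearm); PCA alone leaves the sign ambiguous."""
    direction, centre = principal_direction(mask)
    ys, xs = np.nonzero(mask)
    bw, bh = xs.max() - xs.min() + 1, ys.max() - ys.min() + 1
    if from_hand:
        aux_w, aux_h = 2 * bw, bh / 2.0      # twice the width, half the height
        offset = -1.0                        # opposite to the hand's principal direction
    else:
        aux_w, aux_h = 2 * bw, float(bw)     # forearm case: height = box width
        offset = 1.0                         # same direction as the forearm
    shift = 0.5 * (max(bw, bh) + max(aux_w, aux_h))
    wrist_pos = centre + offset * shift * direction      # centre of the auxiliary box
    wrist_dir = np.arctan2(direction[1], direction[0])   # predicted wrist direction
    return tuple(wrist_pos), wrist_dir
```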
Fig. 9 shows a flowchart of the method of predicting wrist information from the forearm region according to an embodiment of the present invention.
As shown in Fig. 9, in step S901, the bounding rectangle of the forearm region is determined and the principal direction of the forearm is computed based on this bounding rectangle.
In this step, the principal direction of the bounding rectangle is computed as the principal direction of the forearm. As mentioned above, the principal direction can be determined by various methods known in the art; in this embodiment, principal component analysis (PCA) is used. Fig. 10(b) schematically shows the wrist information predicted from the forearm region in the foreground depth image. As shown in Fig. 10(b), the solid rectangle in the figure is the bounding rectangle of the forearm region, and arrow A indicates the principal direction of the forearm.
In step S902, a second auxiliary region of a predetermined size is arranged in the same direction as the principal direction of the forearm, adjoining the bounding rectangle of the forearm region.
The second auxiliary region is the region predicted to contain the wrist. Its size can be preset according to the proportional relationship between the wrist and the forearm; for example, the width of the second auxiliary region can be set to twice the width of the forearm's bounding rectangle, and its height to the width of the bounding rectangle. As shown in Fig. 10(b), the dashed rectangle in the figure is the second auxiliary region.
In step S903, the center of gravity of the second auxiliary region is determined as the predicted wrist position.
In this step, the center of gravity of the second auxiliary region is taken as the wrist position J_lh(x, y). As shown in Fig. 10(b), the "*" mark in the figure indicates the wrist position determined in this way.
In step S904, the principal direction of the second auxiliary region is computed as the predicted wrist direction.
In this embodiment, similarly to step S901, principal component analysis (PCA) is also used to compute the principal direction of the second auxiliary region, which is taken as the predicted wrist direction θ_l. As shown in Fig. 10(b), arrow B in the figure indicates the predicted wrist direction.
Returning to Fig. 2, in step S204, based on the wrist information, the candidate region with the highest confidence is selected as the region where the hand is located.
In this step, for each determined hand candidate region, the confidence of that candidate region is computed according to the wrist information predicted from that candidate region and the wrist information predicted from the forearm region, and the candidate region with the highest confidence is selected as the region where the hand is located.
The confidence of a candidate region can be computed in various suitable ways. For example, one possible way is, for each hand candidate region, to weight the probability that this candidate region is the region where the hand is located with the probability that the straight line corresponding to the forearm is the forearm, and to take the weighted result as the confidence of the candidate region. Specifically, the confidence can be computed according to expression [16]:
$G_{H|L}(t) = G_H(t) + R(H_t, L)\, G_L$   [16]
where t is a hand candidate region, G_{H|L}(t) is the confidence of candidate region t, G_H(t) is the probability, computed with expression [10] as described above, that candidate region t is the region where the hand is located, G_L is the probability, computed with expression [15] as described above, that the straight line corresponding to the forearm is the forearm, and R(H_t, L) is the weight of G_L.
R(H_t, L) is determined according to the wrist information predicted from candidate region t and the wrist information predicted from the forearm region. Optionally, the smaller the difference between the wrist information predicted from the forearm region and the wrist information predicted from hand candidate region t, the larger R(H_t, L). Specifically, the weight R(H_t, L) can be determined according to expression [17]:
$R(H_t, L) = \exp(-E)$   [17]
$E = w_J \|J_{hl}(x, y) - J_{lh}(x, y)\| + w_D |\theta_l - \theta_h|$
where θ_l and θ_h are the wrist directions predicted from the forearm region and from the hand candidate region as described above, w_D is the weight of the deviation of the predicted wrist directions, J_lh(x, y) and J_hl(x, y) are the wrist positions predicted from the forearm region and from the hand candidate region as described above, and w_J is the weight of the deviation of the predicted wrist positions. w_D and w_J can be set according to the actual situation; for example, in experiments w_J can be 1/3 and w_D can be 2/3.
The above describes one method of determining the confidence of each hand candidate region. It will be understood that this method is only an example and does not limit the present invention; the confidence of a candidate region can be computed in any other suitable way. For example, the confidence of each candidate region can be determined simply from the difference between the wrist information predicted from the hand candidate region and the wrist information predicted from the forearm region, i.e. the smaller the difference from the wrist information predicted from the forearm region, the larger the confidence of the candidate region.
Optionally, the selected candidate region with the highest confidence can be verified. Specifically, this highest confidence is compared with a preset threshold; if it is greater than the threshold, the candidate region with the highest confidence is determined to be the region where the hand is located; otherwise, the confidences of all detected hand candidate regions are too low, i.e. it is considered that no hand is detected.
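By way of illustration only, step S204 and expressions [16]-[17], together with the optional verification against a preset threshold, could be sketched as follows; the threshold value min_conf is an assumption of the example.

```python
import numpy as np

def select_hand_region(candidates, G_L, wrist_from_forearm,
                       w_J=1/3, w_D=2/3, min_conf=0.5):
    """Each candidate is a tuple (mask, G_H, wrist_pos, wrist_dir);
    'wrist_from_forearm' is the (position, direction) pair predicted from
    the forearm region."""
    J_lh, theta_l = wrist_from_forearm
    best_mask, best_conf = None, -np.inf
    for mask, G_H, J_hl, theta_h in candidates:
        E = (w_J * np.linalg.norm(np.subtract(J_hl, J_lh))
             + w_D * abs(theta_l - theta_h))                 # expression [17]
        R = np.exp(-E)
        conf = G_H + R * G_L                                 # expression [16]
        if conf > best_conf:
            best_mask, best_conf = mask, conf
    # optional verification against a preset threshold
    return (best_mask, best_conf) if best_conf > min_conf else (None, best_conf)
```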
The hand detection method according to the embodiment of the present invention has been described above. In this method, the detected hand candidate regions are verified with the forearm information, i.e. the forearm information is used to compute the confidences of the multiple detected hand candidate regions, and the candidate region with the highest confidence among them is selected as the region where the hand is located. Because the forearm is easier to detect than the hand, the detected forearm has higher confidence, and a hand region verified with the forearm information is therefore more reliable. Consequently, even in complex situations such as image blur caused by hand motion, skin-colored objects in the background, illumination changes during interaction, or overlap between the hand and the face, the hand detection method according to the embodiment of the present invention can obtain good detection results. In addition, this hand detection method does not require an initialization gesture or motion information.
The hand detection device according to an embodiment of the present invention is described below with reference to Fig. 11.
Fig. 11 shows a functional block diagram of the hand detection device 1100 according to an embodiment of the present invention.
As shown in Fig. 11, the hand detection device 1100 can comprise: a hand candidate region detection unit 1110 configured to detect candidate regions of a hand in a current scene; a forearm region detection unit 1120 configured to determine a forearm region in the current scene based on a depth image of the current scene; a wrist information prediction unit 1130 configured to predict corresponding wrist information from each hand candidate region and to predict wrist information from the forearm region; and a hand region determination unit 1140 configured to select, based on the wrist information, the candidate region with the highest confidence as the region where the hand is located.
For the specific functions and operations of the hand candidate region detection unit 1110, the forearm region detection unit 1120, the wrist information prediction unit 1130 and the hand region determination unit 1140, reference may be made to the descriptions of Figs. 1 to 10 above, which are not repeated here.
The general hardware block diagram of the hand detection system 1200 according to an embodiment of the present invention is described below with reference to Fig. 12. As shown in Fig. 12, the hand detection system 1200 can comprise: an input device 1210 for inputting relevant images or information from outside, such as the depth image and the color image captured by the camera, which can be, for example, a camera; a processing device 1220 for implementing the above hand detection method according to the embodiment of the present invention, or implemented as the above hand detection device, which can be, for example, a central processing unit of a computer or another chip with processing capability; an output device 1230 for externally outputting the results obtained by the above hand detection process, such as the coordinates of the determined hand location point and the direction of the hand, which can be, for example, a display or a printer; and a storage device 1240 for storing, in a volatile or non-volatile manner, the various images and data involved in the above hand detection process, such as the depth image, the color image, the foreground depth image, the foreground color image, the contrast maps, the weight maps, the saliency map, the wrist information predicted from each hand candidate region, the wrist information predicted from the forearm region, and the various preset thresholds and weights, which can be, for example, a random access memory (RAM), a read-only memory (ROM), a hard disk, or various volatile or non-volatile memories such as semiconductor memory.
The basic principles of the present invention have been described above in connection with specific embodiments. It should be noted, however, that those of ordinary skill in the art will understand that all or any of the steps or components of the method and device of the present invention can be implemented in hardware, firmware, software or a combination thereof in any computing device (including processors, storage media, etc.) or in a network of computing devices, which those of ordinary skill in the art can achieve with their basic programming skills after having read the description of the present invention.
Therefore, the object of the present invention can also be achieved by running a program or a group of programs on any computing device. The computing device may be a known general-purpose device. Therefore, the object of the present invention can also be achieved merely by providing a program product containing program code that implements the method or device. That is, such a program product also constitutes the present invention, and a storage medium storing such a program product also constitutes the present invention. Obviously, the storage medium may be any known storage medium or any storage medium developed in the future.
It should also be pointed out that, in the device and method of the present invention, the components or steps can obviously be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent solutions of the present invention. Moreover, the steps of the above series of processes can naturally be performed in the chronological order described, but need not necessarily be performed in that order; some steps can be performed in parallel or independently of one another.
The above embodiments do not limit the scope of protection of the present invention. It should be understood that, depending on design requirements and other factors, various modifications, combinations, sub-combinations and substitutions may occur to those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (10)

1. A hand detection method, comprising:
detecting candidate regions of a hand in a current scene;
determining a forearm region in the current scene based on a depth image of the current scene;
predicting corresponding wrist information from each of the hand candidate regions, and predicting wrist information from the forearm region; and
based on the wrist information, selecting the candidate region with the highest confidence as the region where the hand is located.
2. The hand detection method as claimed in claim 1, wherein determining the forearm region in the current scene based on the depth image of the current scene comprises:
performing foreground segmentation on the depth image to generate a foreground depth image;
generating a depth distribution map based on the foreground depth image;
thresholding the depth distribution map;
detecting straight lines in the thresholded depth distribution map;
determining, among the detected straight lines, the straight line corresponding to the forearm based on the position of the human body in the foreground depth image; and
determining the forearm region in the foreground depth image according to the straight line corresponding to the forearm.
3. The hand detection method as claimed in claim 1 or 2, wherein predicting corresponding wrist information from each of the hand candidate regions further comprises, for each hand candidate region:
determining the bounding rectangle of the hand candidate region, and computing the principal direction of the hand based on the bounding rectangle;
arranging a first auxiliary region of a predetermined size in the direction opposite to the principal direction of the hand, the first auxiliary region adjoining the bounding rectangle of the hand candidate region;
determining the center of gravity of the first auxiliary region as the predicted wrist position; and
computing the principal direction of the first auxiliary region as the predicted wrist direction.
4. The hand detection method as claimed in claim 3, wherein predicting wrist information from the forearm region further comprises:
determining the bounding rectangle of the forearm region, and computing the principal direction of the forearm based on the bounding rectangle;
arranging a second auxiliary region of a predetermined size in the same direction as the principal direction of the forearm, the second auxiliary region adjoining the bounding rectangle of the forearm region;
determining the center of gravity of the second auxiliary region as the predicted wrist position; and
computing the principal direction of the second auxiliary region as the predicted wrist direction.
5. The hand detection method as claimed in claim 1, further comprising, for each of the hand candidate regions, computing the probability that it is the region where the hand is located.
6. The hand detection method as claimed in claim 5, further comprising, for the determined straight line corresponding to the forearm, computing the probability that this straight line is the forearm.
7. The hand detection method as claimed in claim 6, wherein selecting, based on the wrist information, the candidate region with the highest confidence as the region where the hand is located comprises:
for each of the hand candidate regions, weighting the probability that the candidate region is the region where the hand is located with the probability that the straight line corresponding to the forearm is the forearm, and taking the weighted result as the confidence of the candidate region; and
selecting the candidate region with the largest weighted result as the candidate region with the highest confidence.
8. The hand detection method as claimed in claim 7, wherein the smaller the difference between the wrist information predicted from the forearm region and the wrist information predicted from the hand candidate region, the larger the weight of the probability that the straight line corresponding to the forearm is the forearm.
9. The hand detection method as claimed in claim 1, wherein detecting the candidate regions of the hand in the current scene comprises: detecting the candidate regions of the hand in the current scene using the hue, saturation and depth information of the hand.
10. A hand detection device, comprising:
a hand candidate region detection unit configured to detect candidate regions of a hand in a current scene;
a forearm region detection unit configured to determine a forearm region in the current scene based on a depth image of the current scene;
a wrist information prediction unit configured to predict corresponding wrist information from each of the hand candidate regions and to predict wrist information from the forearm region; and
a hand region determination unit configured to select, based on the wrist information, the candidate region with the highest confidence as the region where the hand is located.
CN201410001215.5A 2014-01-02 2014-01-02 Hand detection method and equipment Expired - Fee Related CN104765440B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410001215.5A CN104765440B (en) 2014-01-02 2014-01-02 Hand detection method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410001215.5A CN104765440B (en) 2014-01-02 2014-01-02 Hand detection method and equipment

Publications (2)

Publication Number Publication Date
CN104765440A true CN104765440A (en) 2015-07-08
CN104765440B CN104765440B (en) 2017-08-11

Family

ID=53647331

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410001215.5A Expired - Fee Related CN104765440B (en) 2014-01-02 2014-01-02 Hand detection method and equipment

Country Status (1)

Country Link
CN (1) CN104765440B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017101380A1 (en) * 2015-12-15 2017-06-22 乐视控股(北京)有限公司 Method, system, and device for hand recognition
CN107292904A (en) * 2016-03-31 2017-10-24 北京市商汤科技开发有限公司 A kind of palm tracking and system based on depth image
WO2018072483A1 (en) * 2016-10-17 2018-04-26 京东方科技集团股份有限公司 Image segmentation method, image segmentation system and storage medium, and device comprising same
CN108255351A (en) * 2017-12-22 2018-07-06 潍坊歌尔电子有限公司 Determining method and device, projecting apparatus, the optical projection system of user's finger location information
CN109101860A (en) * 2017-06-21 2018-12-28 富泰华工业(深圳)有限公司 Electronic equipment and its gesture identification method
CN111259757A (en) * 2020-01-13 2020-06-09 支付宝实验室(新加坡)有限公司 Image-based living body identification method, device and equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120194422A1 (en) * 2007-02-08 2012-08-02 Edge 3 Technologies, Inc. Method and system for vision-based interaction in a virtual environment
CN102778953A (en) * 2012-06-28 2012-11-14 华东师范大学 Motion sensing control method of shadow play remote digital performing based on Kinect
DE102011075877A1 (en) * 2011-05-16 2012-11-22 Siemens Aktiengesellschaft Evaluation method for a sequence of temporally successive depth images
CN103424105A (en) * 2012-05-16 2013-12-04 株式会社理光 Object detection method and device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120194422A1 (en) * 2007-02-08 2012-08-02 Edge 3 Technologies, Inc. Method and system for vision-based interaction in a virtual environment
DE102011075877A1 (en) * 2011-05-16 2012-11-22 Siemens Aktiengesellschaft Evaluation method for a sequence of temporally successive depth images
CN103424105A (en) * 2012-05-16 2013-12-04 株式会社理光 Object detection method and device
CN102778953A (en) * 2012-06-28 2012-11-14 华东师范大学 Motion sensing control method of shadow play remote digital performing based on Kinect

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017101380A1 (en) * 2015-12-15 2017-06-22 乐视控股(北京)有限公司 Method, system, and device for hand recognition
CN107292904A (en) * 2016-03-31 2017-10-24 北京市商汤科技开发有限公司 A kind of palm tracking and system based on depth image
CN107292904B (en) * 2016-03-31 2018-06-15 北京市商汤科技开发有限公司 A kind of palm tracking and system based on depth image
WO2018072483A1 (en) * 2016-10-17 2018-04-26 京东方科技集团股份有限公司 Image segmentation method, image segmentation system and storage medium, and device comprising same
US10650523B2 (en) 2016-10-17 2020-05-12 Boe Technology Group Co., Ltd. Image segmentation method, image segmentation system and storage medium and apparatus including the same
CN109101860A (en) * 2017-06-21 2018-12-28 富泰华工业(深圳)有限公司 Electronic equipment and its gesture identification method
CN109101860B (en) * 2017-06-21 2022-05-13 富泰华工业(深圳)有限公司 Electronic equipment and gesture recognition method thereof
CN108255351A (en) * 2017-12-22 2018-07-06 潍坊歌尔电子有限公司 Determining method and device, projecting apparatus, the optical projection system of user's finger location information
CN111259757A (en) * 2020-01-13 2020-06-09 支付宝实验室(新加坡)有限公司 Image-based living body identification method, device and equipment
CN111259757B (en) * 2020-01-13 2023-06-20 支付宝实验室(新加坡)有限公司 Living body identification method, device and equipment based on image

Also Published As

Publication number Publication date
CN104765440B (en) 2017-08-11

Similar Documents

Publication Publication Date Title
CN109993160B (en) Image correction and text and position identification method and system
US9426449B2 (en) Depth map generation from a monoscopic image based on combined depth cues
CN108388879B (en) Target detection method, device and storage medium
CN104765440A (en) Hand detecting method and device
CN104123529B (en) human hand detection method and system
EP2864933B1 (en) Method, apparatus and computer program product for human-face features extraction
US9195904B1 (en) Method for detecting objects in stereo images
WO2019023921A1 (en) Gesture recognition method, apparatus, and device
CN107633226B (en) Human body motion tracking feature processing method
CN105912126B (en) A kind of gesture motion is mapped to the adaptive adjusting gain method at interface
Lee et al. Real-time stereo matching network with high accuracy
Li et al. Estimating visual saliency through single image optimization
CN112489143A (en) Color identification method, device, equipment and storage medium
CN105740751A (en) Object detection and identification method and system
CN109919149A (en) Object mask method and relevant device based on object detection model
JP2018120283A (en) Information processing device, information processing method and program
CN111144215B (en) Image processing method, device, electronic equipment and storage medium
CN110443228B (en) Pedestrian matching method and device, electronic equipment and storage medium
Loutas et al. Probabilistic multiple face detection and tracking using entropy measures
CN110245660A (en) Webpage based on significant characteristics fusion sweeps path prediction technique
CN110287970A (en) A kind of Weakly supervised object positioning method based on CAM and cover
Moroto et al. User-specific visual attention estimation based on visual similarity and spatial information in images
KR101601660B1 (en) Hand part classification method using depth images and apparatus thereof
KR20210124012A (en) Image recognition method, device and storage medium
US20200242410A1 (en) System for Training Descriptor with Active Sample Selection

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170811

Termination date: 20200102

CF01 Termination of patent right due to non-payment of annual fee