CN104765440B

CN104765440B - Hand detection method and equipment

Info

Publication number: CN104765440B
Application number: CN201410001215.5A
Authority: CN
Inventors: 赵颖
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2014-01-02
Filing date: 2014-01-02
Publication date: 2017-08-11
Anticipated expiration: 2034-01-02
Also published as: CN104765440A

Abstract

There is provided hand detection method and equipment, the hand detection method includes：Detect the candidate region of hand in current scene；Depth image based on current scene, determines the underarm area in current scene；Corresponding wrist information is predicted respectively using the candidate region of each hand, and predicts the wrist information using the underarm area；With based on wrist information each described, selection confidence level highest candidate region is used as the region where hand.The hand detection method and equipment due to the motion of hand causing in the object for having the similar colour of skin in image blurring, background, interactive process illumination to change, can obtain preferable testing result in the case of hand and the Various Complex such as face is overlapping.

Description

Hand detection method and equipment

Technical field

This patent disclosure relates generally to man-machine interaction, relate more specifically to hand detection method and equipment.

Background technology

At present, man-machine interaction is from interactive development is touched to by detecting that the gesture and posture of operator is man-machine to carry out Interaction.Specifically, exactly by capturing the scene image of the operator before screen and screen, and to the scene image of acquisition Handled, to obtain the action of operator, the action of operator is then converted into the operational order of machine, so as to realize people Machine is interacted.This man-machine interaction usually requires to recognize the gesture of operator, and the basis of gesture identification is the detection of hand.In view of hand The characteristic of itself, for example, skin color and the distinctive shape of hand, people are typically based on the color or profile of hand to recognize and examine Survey hand.

In U.S. Patent application US20100803369A, a kind of human hand detection method of view-based access control model, the party are described Method requires that two hands occur simultaneously.Specifically, the left and right anaglyph that this method is obtained according to two video cameras calculates scene Depth information, goes out multiple human hand candidate regions, and the candidate region combination of two to obtaining, by calculating it according to Face Detection The information such as size is poor, depth difference, alternate position spike draw the positions of both hands.In Japanese patent application JP2005000236876, Describe a kind of method based on Color images detecting human hand.This method detects multiple human hand candidate regions using features of skin colors Domain, then calculates the shape of candidate region, and determines whether the region is human hand according to the complexity of shape.In addition, in the U.S. In patent US7593552B2, a kind of gesture identification method is described, including the detection of human hand.Specifically, in this method In, face is detected first so as to obtain position and the Skin Color Information of face, then centered on the position of face, in its left and right two Side takes a region of search respectively, and detects area of skin color in this region, and the focus point of the area of skin color detected is people Hand position point.But, above-mentioned hand detection method can not all tackle well have the similar colour of skin in motion blur, background object, Situations such as illumination variation.In addition, needing initiation gesture the above method more.

The content of the invention

According to an aspect of the invention, there is provided a kind of hand detection method, including：Detect the candidate of hand in current scene Region；Depth image based on current scene, determines the underarm area in current scene；Utilize the candidate region of each hand Corresponding wrist information is predicted respectively, and predicts the wrist information using the underarm area；With based on wrist each described Information, selection confidence level highest candidate region is used as the region where hand.

According to another aspect of the present invention there is provided a kind of hand detection device, including：Hand candidate region detection part, It is configured to detect the candidate region of hand in current scene；Underarm area detection part, is configured to the depth based on current scene Image is spent, the underarm area in current scene is determined；Wrist information prediction unit, is configured to the candidate using each hand Corresponding wrist information is predicted in region respectively, and predicts the wrist information using the underarm area；With hand region determining section Part, is configured to be based on each described wrist information, selection confidence level highest candidate region is used as the region where hand.

Hand detection technique according to embodiments of the present invention can be good at processing because the motion of hand causes the image blurring, back of the body There is in the object of the similar colour of skin, interactive process illumination change in scape, hand and the Various Complex situation such as face is overlapping.Separately Outside, hand detection technique according to embodiments of the present invention can carry out the detection of hand on single-frame images, and need not start hand Gesture and movable information.

Brief description of the drawings

Fig. 1 schematically shows the scene using hand detection technique according to embodiments of the present invention.

Fig. 2 shows the flow chart of hand detection method according to embodiments of the present invention.

Fig. 3 shows the place that the candidate region of hand in current scene is detected in hand detection method according to embodiments of the present invention The flow chart of reason.

Fig. 4 is shown in the processing shown in Fig. 3 determines shade contrast based on foreground depth image and prospect coloured image The flow chart of figure, saturation degree contrast figure and Depth contrasts' figure.

Fig. 5, which is shown, calculates shade contrast's figure, saturation degree contrast figure and Depth contrasts' figure in the processing shown in Fig. 3 Weights figure flow chart.

Fig. 6 show in hand detection method according to embodiments of the present invention determine current scene in underarm area it is specific The flow chart of processing.

Fig. 7（a）Show exemplary foreground depth image, Fig. 7（b）Show schematical depth profile, Fig. 7 （c）Show the depth profile after schematical thresholding processing, Fig. 7（d）Schematically illustrate corresponding to the straight of forearm Line.

Fig. 8 shows the corresponding wrist of candidate region prediction using hand in hand detection method according to embodiments of the present invention The flow chart of the processing of information.

Fig. 9 shows the processing for predicting wrist information in hand detection method according to embodiments of the present invention using underarm area Flow chart.

Figure 10 (a), which schematically shows the wrist predicted in foreground depth image using the candidate region of hand, to be believed Breath.

Figure 10 (b) schematically shows the wrist information predicted in foreground depth image using underarm area.

Figure 11 shows the functional configuration block diagram of hand detection device according to embodiments of the present invention.

Figure 12 shows the general hardware block diagram of hand detecting system according to embodiments of the present invention.

Embodiment

In order that those skilled in the art more fully understand the present invention, with reference to the accompanying drawings and detailed description to this hair It is bright to be described in further detail.

Fig. 1 schematically shows the scene using hand detection technique according to embodiments of the present invention.As shown in figure 1, with Family is just stood in the image pickup scope of video camera 101, and the program in the processing equipment 102 of such as computer is controlled using gesture. Video camera 101 can be the video camera of any one coloured image that can provide scene and depth image, such as PrimeSensor, Kinect etc..When user carries out gesture control, video camera 101 is shot to user, processing equipment 102 The every two field picture photographed for video camera 101, detects the positional information of the hand of user.It should be noted that Fig. 1 only schemes Shown the present invention a kind of possible application scenarios, according to actual conditions, the equipment in application scenarios can correspondingly increase or Reduce, and with different configurations, or other different equipment can be used.

Below, hand detection technique according to embodiments of the present invention will be described in detail.

Brief description is carried out to the basic thought of the hand detection technique of the present invention first.It is well known that hand is non-rigid Object, with move it is fast, yielding the features such as.Therefore, causing have the similar colour of skin in image blurring, background due to the motion of hand Object, hand and the Various Complex such as face is overlapping in the case of, it is difficult to accurately detect the position sold.For this case, this hair The bright candidate region for proposing multiple hands using forearm information to detecting is verified, so as to obtain the position of accurate hand Technology.Because forearm volume is larger and with less deformation, therefore its detection is easy relative to the detection of hand, so that The position of the hand obtained using forearm Information Authentication has higher confidence level.

As shown in Fig. 2 in step S201, detecting the candidate region of hand in current scene.

As it was noted above, detecting candidate region existing many researchs in the art of one or more hands, it is not The key point of the present invention, those skilled in the art can detect the candidate of hand in current scene in any suitable manner Region.Herein, it is complete for explanation, the way of example used in the present embodiment will simply be described with reference to Fig. 3.

As shown in figure 3, in step S2011, obtaining the depth image and coloured image for the current scene that video camera is shot.Such as Well known to a person skilled in the art the value that, depth image is each pixel in image represent in scene certain point and video camera it Between distance image, and coloured image is then a width RGB image.

In step S2012, based on the depth image and coloured image, the foreground depth image of current scene is obtained with before Scape coloured image.

Due to not only including user in depth image and coloured image, also include other background image contents, therefore In order to reduce subsequent detection processing amount of calculation, can be partitioned into by foreground segmentation from depth image and coloured image including Other in user and/or scene close to the foreground area of the object of video camera, that is, obtain current scene foreground depth image and Prospect coloured image.This processing can be realized by various existing methods in this area, such as connected domain analysis, threshold value Segmentation etc., is no longer described in detail herein.

In subsequent step S2013-S2016, by from the tone of hand, saturation degree and depth information, Utilization prospects The salient region that depth image and prospect Color images detecting go out in foreground area, is used as the candidate region of hand.Due to hand Color and most of objects have a distinction, thus color contrast can as distinguish hand and most of objects feature it One.Moreover, tone and saturation degree by handling color respectively, the object and hand included for foreground area have similar color Situation, can also accurately distinguish and sell.On tone and saturation degree component, can by by coloured image from RGB Color space conversion is obtained to hsv color space, and this conversion belongs to prior art, does not carry out detailed retouch to it herein State.On the other hand, due in man-machine interactive operation, not having other objects in the distance of hand to video camera, thus hand is in depth On be significant, it can be used as the another feature for distinguishing hand and other objects.Therefore, the detection of salient region includes color The calculating and fusion of tune, saturation degree and the aspect of depth three.This is described below in conjunction with the specific processing in step S2013-S2016 One salient region detection process.

Specifically, in step S2013, based on the foreground depth image and prospect coloured image, determining shade contrast Scheme C^T, saturation degree contrast figure C^SScheme C with Depth contrasts^D。

Shade contrast's degree figure C^TIt is to be obtained by the tone path computation of prospect coloured image, saturation degree contrast figure C^SIt is Obtained by the saturation degree path computation of prospect coloured image, and Depth contrasts' figure C^DIt is then to be calculated by foreground depth image Arrive, wherein in each contrast figure the pixel value of each pixel represent the pixel relative in image other pixels it is aobvious Work value.Shade contrast schemes C^T, saturation degree contrast figure C^SScheme C with Depth contrasts^DIt can be calculated using same method, example As using below in conjunction with the method described in Fig. 4.That is, by performing the method shown in three times Fig. 4, calculating respectively Go out tone comparison diagram C^T, saturation degree contrast figure C^SScheme C with Depth contrasts^D.To have below to the method shown in Fig. 4 The explanation of body.For convenience of description, input picture is represented with I in the following description, and corresponding contrast figure then uses C^d(d= D, T, S) represent.It is understood that when calculating shade contrast's figure, I represents the tone channel image of prospect coloured image, in meter When calculating saturation degree contrast figure, I represents the saturation degree channel image of prospect coloured image, when calculating Depth contrasts' figure, I tables Show foreground depth image.

As shown in figure 4, in step S401, its neighborhood territory pixel j (j=1...n are selected to image I each pixel i_i), Wherein n_iIt is pixel i neighborhood territory pixel number.

In this step, neighborhood territory pixel can be selected in any suitable manner.A kind of possible mode is to use The multi-density method of sampling.The so-called multi-density method of sampling be exactly range pixel i it is nearer position sampling neighborhood territory pixel more It is many, in the fewer of position sampling more remote range pixel i.Specifically, m deciles direction is chosen by origin of pixel i.At this On m direction, sampled respectively by step-length of r, until image I border.M value can be come really according to specific needs Fixed, for example, m value takes usually 8 in an experiment, and if it is desired to obtaining more accurate result, m value can be bigger（Such as 16）, or if not high to the accuracy requirement of result, m value can also be smaller（Such as 4）.Step-length r value can also root Determined according to specific needs, such as r value can be 2 pixel distances.By this sampling, prospect cromogram is corresponded to respectively The tone channel image of picture, the saturation degree channel image of prospect coloured image and foreground depth image, obtain adopting in each image Tone value, intensity value and the depth value of sampling point.Optionally, in this step can only for each non-zero value pixel i Its neighborhood territory pixel is selected, to reduce amount of calculation.

In step S402, to each pixel i as origin in image I, calculate itself and corresponding neighborhood territory pixel j it Between pixel value difference.

In this step, it is possible to use expression formula [1] calculates each pixel i as origin and corresponding neighborhood territory pixel Margin of image element d between j_ij。

d_ij=|I_i-I_j|²,i=1...N [1]

Wherein, I_iIt is pixel i pixel value, I_jIt is neighborhood territory pixel j pixel value, N is image I size.More specifically, Calculating Depth contrasts' figure C^DWhen, I_iAnd I_jRepresent pixel i and j depth value；Calculating shade contrast's degree figure C^TWhen, I_iAnd I_j Pixel i and j shade of color value are represented respectively；And calculating saturation degree contrast figure C^SWhen, I_iAnd I_jPixel i and j are represented respectively Color saturation value.

In step S403, for each neighborhood territory pixel j, its weights is determined.

In this step, for example, it is possible to use expression formula [2] calculates neighborhood territory pixel j Gauss weight w_ij。

Wherein, σ_pThe scale factor of Gauss weights, such as can be using value as 0.25, p in testing_iAnd p_jIt is pixel i respectively With j position.||p_i-p_j| | represent position p_iAnd p_jEuclidean distance.Apart from more remote neighborhood it can be seen from expression formula [2] Pixel, its weights are smaller, and apart from nearer neighborhood territory pixel, its weights is bigger.

Then, in step S404, to each pixel i as origin in image I, its contrast value is determined, to obtain Contrast figure C^d。

In this step, for example, it is possible to use expression formula [3] calculates in image I each pixel i as origin pair Compare angle value.

By each pixel in each for three kinds of foreground image I, above-mentioned steps S401-S404 is performed, can be with Obtain three kinds of contrast figures, i.e. shade contrast's figure C^T, saturation degree contrast figure C^SScheme C with Depth contrasts^D.As it was previously stated, each right Pixel value than each pixel in degree figure represents saliency value of the pixel relative to other pixels of image, because hand exists In terms of tone, saturation degree and depth all than other objects in scene significantly, so the bigger pixel of contrast figure intermediate value belongs to The probability of hand is higher.

Fig. 3 is returned, in step S2014, shade contrast's figure C is calculated respectively^T, saturation degree contrast figure C^SAnd depth correlation Degree figure C^DWeights figure W_T、W_SAnd W_D。

In the present embodiment, by calculating contrast figure C^D、C^TAnd C^SBallot each other obtains weights figure W_T、W_SAnd W_D.Herein, ballot is a kind of description to difference between contrast figure.The pixel value of weights figure represents its corresponding contrast figure Confidence level, the pixel value in weights figure is bigger, then corresponding contrast figure is more credible.Below, will be with reference to Fig. 5 to according to this reality The processing for applying the calculating weights figure of example is described.

As shown in figure 5, in step S501, for contrast figure C^D,C^T,C^S, calculate corresponding gradient vector figure G^d。

Gradient vector figure G^dIt is two tuple (D^d,M^d) (d=D, T, S), wherein, D^dIt is gradient direction, M^dIt is gradient magnitude.It is logical Cross and calculate gradient for each pixel in contrast figure, gradient vector figure corresponding with contrast figure can be generated.And calculate The gradient of pixel is common technical means in the art, therefore herein without being described in detail.

In step S502, for every width contrast figure, other each ballots of contrast figure to it are calculated.

For the ease of description, hereinafter, C is used^d(d=D, T, S) represents three width contrast figure C^D,C^T,C^SIn it is any right Than degree figure, C is used^c(c=D, T, S, and c ≠ d) represents to be different from C^dAnother contrast figure.C^cTo C^dBallot describe in vacation If C^cUnder the conditions of correct, C^dIt is also correct probability.

In this step, C is calculated first^dIn C^cFor wrong probability under the conditions of correct.Generally, if C^cIt is correct In the case of C^dIt is wrong, then the direction of their gradient vector is inevitable different, there is angle between two vectors.According to The Vector triangle of amount, two vectorial differences are the length on the side corresponding to two vector angles.Therefore, C^dIn C^cFor correct bar Wrong probability can be calculated by expression formula [4] under part.

c,d=D,T,S;c≠d.

Wherein,Represent C^dBe it is wrong,Represent C^cIt is correct, θ is gradient vector G^cAnd G^dAngle, F be used for should To the situation that two vector angles are obtuse angle.

Then, C is calculated by expression formula [5]^cTo C^dBallot,

The C it can be seen from expression formula [5]^cBe it is correct in the case of C^dIt is that wrong probability is higher, C^cTo C^dBallot get over It is small.Each pixel that processing in step S502 is directed in every width contrast figure is carried out one by one.

In step S503, based on the ballot, the weights figure of every width contrast figure is calculated.

Specifically, the weights figure of each contrast figure can be calculated by expression formula [6] in this step.

Wherein, W_d(d=D, T, S) is contrast figure C^dWeights figure.Each contrast figure it can be seen from expression formula [6] Weights figure be remaining summation of contrast figure to the ballot of the contrast figure.Processing in step S503 is directed to every width weights Each pixel in figure is carried out one by one.

Thus, shade contrast's figure C is obtained^T, saturation degree contrast figure C^SScheme C with Depth contrasts^DWeights figure W_T、W_SAnd W_D.Optionally, in order to handle conveniently, it is possible to use expression formula [7] is to weights figure W_T、W_SAnd W_DIt is normalized

Wherein, W'_d(d=D, T, S) is normalized weights figure.

Referring back to Fig. 3, in step S2015, based on the contrast figure and corresponding weights figure, conspicuousness is obtained Figure.

Specifically, as shown in expression formula [8], in this step, can be contrasted by using corresponding weights figure to each Degree figure is weighted summation, obtains Saliency maps SM.

Saliency maps SM is to consider the conspicuousness description obtained after tone, saturation degree and the aspect of depth three, and this is notable Pixel value in property figure represents that respective pixel belongs to the probability in human hand region.

Then, in step S2016, the candidate region of hand in current scene is determined based on the Saliency maps.

In this step, the candidate region of hand can be determined by carrying out binary conversion treatment to the Saliency maps.Tool Body, such as shown in expression formula [9], binaryzation can be carried out to each pixel in Saliency maps using predetermined threshold alpha Processing.

Wherein, the one or more candidate regions for each pixel formation hand that H values are 1.So far, current scene can be obtained The candidate region of middle hand, can also equally obtain the candidate region of hand in the foreground depth image of current scene.

In addition, it is optional, can be for example by expression formula [10], for the candidate region of each hand thereby determined that, meter The probability that it is the region where hand is calculated, for follow-up processing.

G_H(t)=average(SM(t)) [10]

Wherein t is the candidate region of hand, and SM (t) is the significance value for each pixel being located in Saliency maps in the t of region. According to expression formula [10], each candidate region is that the probability in the region where hand is the conspicuousness of each pixel included in the region The average value of value.

Simply describe using the tone of hand, saturation degree and depth information to detect the time of hand in current scene above The way of example of favored area.As it was previously stated, in fact, those skilled in the art can be examined using other any appropriate modes Astronomical observation favored area, the detection for example with the candidate region of the hand based on the colour of skin, the detection of the candidate region of the hand based on shape Deng.

Fig. 2 is returned to, in step S202, the depth image based on current scene determines the underarm area in current scene.

Fig. 6 shows the underarm area determined according to the depth image based on current scene of the present embodiment in current scene Specific processing flow chart.

As shown in fig. 6, in step S601, based on the foreground depth image of the depth image acquisition by current scene, generation Depth profile.

Specifically, in this step, for each pixel i in foreground depth image, being set respectively centered on it One interior zone and a perimeter, wherein perimeter include interior zone, and according to such as following expression [11]- [14], the depth profile of the pixel is calculated.

G(i)=exp(-(R_I+R_O)) [11]

T=min(D_I)+ε [14]

Wherein, D_IRepresent the depth value matrix of interior zone, min (D_I) be pixel in interior zone minimum depth value （In the present embodiment, the distance of agreement pixel to video camera is nearer, then the depth value of the pixel is bigger）, ε is constant value, and it can Set with as needed.R_IRepresent in region internally, depth value is more than the number C of T pixel_IAccount for the total pixel of interior zone Number N_IRatio.R_ORepresent to remove in the outer region in the region outside interior zone, depth value is more than of T pixel Number C_OAccount for total number of pixels N in the region in the perimeter outside removing interior zone_ORatio.

By the way that the depth profile of each pixel i in foreground depth image is calculated as above, depth profile is formed.Example Such as, Fig. 7（a）Show exemplary foreground depth image, Fig. 7（b）Show the schematical depth profile to be formed.

It is understood that it is described above by for each pixel i set an interior zone and a perimeter come The mode for calculating depth profile is only an example.Those skilled in the art can count by the way of any other is appropriate The depth profile of pixel is calculated, for example, can obtain the distribution situation of pixel value by calculating the histogram of depth image, so that Obtain depth profile.

In step S602, thresholding processing is carried out to the depth profile.

It is it is well known in the art that being not be described in detail herein that thresholding processing is how carried out to image.Fig. 7（c）Show The depth profile gone out after schematical thresholding processing.

The detection of straight lines on step S603, the depth profile after thresholding processing.

The method that a variety of known detection of straight lines from image are there are in this area, such as hough transform, herein Without being described in detail.

In step S604, based on the position of human body in the foreground depth image, determine to correspond to forearm in the straight line Straight line.

Position of the human body in foreground depth image can be come for example, by any appropriate method such as head and shoulder Model Matching It is determined that, it is not be described in detail herein., can be according to human figure structure, it is determined that in step after the position of human body is determined Correspond to the straight line of forearm in a plurality of straight line detected in S603.Fig. 7（d）Schematically illustrate corresponding to the straight of forearm Line.

Optionally, can be for example by expression formula [15], for the straight line corresponding to forearm thereby determined that, calculating it is The probability of forearm, for follow-up processing.

G_L=average(G(j)) [15]

Wherein j is the pixel being located in the linearity region, and G (j) is as described previously by being somebody's turn to do that expression formula [11] is calculated The depth profile of each pixel j in linearity region.According to expression formula [15], identified straight line is that the probability of forearm is this The average value of the depth profile of each pixel included in linearity region.

In step S605, according to the straight line corresponding to forearm, underarm area is determined in the foreground depth image.

, can be for example, by region after detecting the straight line corresponding to forearm in depth profile after thresholding processing Any appropriate methods such as growth determine underarm area in foreground depth image.

More than, describe the tool that the depth image based on current scene determines the underarm area in current scene with reference to Fig. 6 Body processing.It should be appreciated that this is only a kind of example, those skilled in the art can be using any other appropriate method come really Underarm area is determined, such as the forearm detection based on kinematic constraint, the arm detection based on joint variable.

Fig. 2 is returned to, in step S203, corresponding wrist information is predicted respectively using the candidate region of each hand, And predict the wrist information using the underarm area.

In this step, for the candidate region of each hand detected, be based on the candidate region predict it is corresponding Wrist information, and for identified underarm area, corresponding wrist information is predicted also based on the underarm area.It is optional , the prediction steps can be carried out in foreground depth image.Below, it will be described in detail with reference to Fig. 8 and 9.

Fig. 8 shows that the method for corresponding wrist information is predicted in the candidate region of utilization hand according to embodiments of the present invention Flow chart.

As shown in figure 8, in step S801, for the candidate region of any one hand, determine the hand candidate region it is outer Rectangle is connect, and based on the principal direction of boundary rectangle calculating hand.

In this step, the principal direction of the boundary rectangle is calculated, the principal direction of hand is used as.Principal direction can pass through this area In known a variety of methods determine, in the present embodiment, use PCA（PCA）To calculate the master of the boundary rectangle Direction.Figure 10 (a) schematically shows the wrist information predicted in foreground depth image using the candidate region of hand.Such as Shown in Figure 10 (a), the solid-line rectangle frame in figure is the boundary rectangle of the candidate region of hand, and arrow A represents the principal direction of hand.

In step S802, on the direction opposite with the principal direction of the hand, the first auxiliary region of a predefined size is set Domain, first auxiliary area connects with the boundary rectangle of the candidate region of the hand.

First auxiliary area is the region that prediction includes wrist.The size of first auxiliary area can be according to wrist and hand Limbs proportionate relationship preset.For example, can by wide twice of the wide boundary rectangle for being set to hand of the first auxiliary area, High setting is the high half of the boundary rectangle of hand.As shown in Figure 10 (a), the dotted rectangle in figure is first auxiliary region Domain.

In step S803, the center of gravity of first auxiliary area is determined, the position of the wrist of prediction is used as.

In this step, using the center of gravity of the first auxiliary area as wrist position J_hl(x,y).As shown in Figure 10 (a), figure In " * " mark represent the position of wrist thereby determined that.

In step S804, the principal direction of first auxiliary area is calculated, the direction of the wrist of prediction is used as.

In the present embodiment, it is similar with step S801, also using PCA（PCA）To calculate first auxiliary The principal direction of rectangle, is used as the direction θ of the wrist of prediction_h.As shown in Figure 10 (a), the arrow B in figure represents the wrist of prediction Direction.

Thus, wrist information corresponding with the candidate region of a hand has been predicted.Pass through the candidate regions for each hand Domain repeats the above steps S801-S804 processing, can be directed to the candidate region of each hand, predict corresponding wrist information.

Fig. 9 shows the flow chart of the method for prediction wrist information in utilization underarm area according to embodiments of the present invention.

As shown in figure 9, in step S901, determining the boundary rectangle of underarm area, and forearm is calculated based on the boundary rectangle Principal direction.

In this step, the principal direction of the boundary rectangle is calculated, the principal direction of forearm is used as.As it was previously stated, principal direction can To be determined by a variety of methods as known in the art, in the present embodiment, PCA is used（PCA）To calculate this The principal direction of boundary rectangle.Figure 10 (b) schematically shows the hand predicted in foreground depth image using underarm area Wrist information.As shown in Figure 10 (b), the solid-line rectangle frame in figure is the boundary rectangle of underarm area, and arrow A represents the main side of forearm To.

In step S902, on the principal direction identical direction with the forearm, the second of one predefined size of setting is auxiliary Region is helped, second auxiliary area connects with the boundary rectangle of underarm area this described.

Second auxiliary area is the region that prediction includes wrist.The size of second auxiliary area can according to wrist with it is small The limbs proportionate relationship of arm is preset.For example, the width of the second auxiliary area can be disposed as to the external square of underarm area Wide twice of shape, high setting for the boundary rectangle of underarm area width.As shown in Figure 10 (b), the dotted rectangle in figure is institute State the second auxiliary area.

In step S903, the center of gravity of second auxiliary area is determined, the position of the wrist of prediction is used as.

In this step, using the center of gravity of the second auxiliary area as wrist position J_lh(x,y).As shown in Figure 10 (b), figure In " * " mark represent the position of wrist thereby determined that.

In step S904, the principal direction of second auxiliary area is calculated, the direction of the wrist of prediction is used as.

In the present embodiment, it is similar with step S901, also using PCA（PCA）To calculate second auxiliary The principal direction of rectangle, is used as the direction θ of the wrist of prediction_l.As shown in Figure 10 (b), the arrow B in figure represents the wrist of prediction Direction.

Fig. 2 is returned to, in step S204, based on wrist information each described, selection confidence level highest candidate region is made For the region where hand.

In this step, for identified each hand candidate region, believed according to the wrist predicted using the candidate region Breath and the wrist information predicted using underarm area, calculate the confidence level of the candidate region, and select with highest confidence level Candidate region is used as the region where hand.

The confidence level of candidate region can using it is various it is appropriate by the way of calculate.For example, a kind of possible mode is pair In the candidate region of each hand, the probability and the straight line corresponding to forearm for by the candidate region being region where hand are forearms Probability is weighted processing, and using weighted results as the candidate region confidence level.Specifically, can be according to expression formula [16] To calculate confidence level.

G_H|L(t)=G_H(t)+R(H_t,L)G_L [16]

Wherein, t is the candidate region of hand, G_H|L(t) be candidate region t confidence level, G_H(t) it is as described previously by table The candidate region t calculated up to formula [10] is the probability in the region where hand, G_LIt is to be calculated as described previously by expression formula [15] Corresponding to forearm straight line be forearm probability, R (H_t, L) and it is G_LWeights.

R(H_t, L) and it is according to the wrist predicted using the candidate region t wrist informations predicted and using underarm area What information was determined.Optionally, the wrist that the wrist information predicted using underarm area is predicted with the candidate region t using hand The difference of information is smaller, then R (H_t, L) and it is bigger.Specifically, weights R (H can be determined according to expression formula [17]_t,L)。

R(H_t,L)=exp(-E) [17]

E=w^J||J_hl(x,y)-J_lh(x,y)||+w^D|θ_l-θ_h|

Wherein, θ_lAnd θ_hIt is the wrist direction predicted as previously described using the candidate region of underarm area and hand, w respectively^D For the weights of the deviation in wrist direction predicted, J_lh(x, y) and J_hl(x, y) be respectively as previously described using underarm area and The wrist location that the candidate region of hand is predicted, w^JFor the weights of the deviation of wrist location predicted.w^DAnd w^JCan be according to reality Border situation is set, such as in testing, w^JCan be using value as 1/3, w^DCan be using value as 2/3.

The foregoing describe a kind of method of the confidence level for the candidate region for determining each hand.It is understood that this method is only Be an example, and be not limitation of the present invention, the confidence level of candidate region can using other it is any it is appropriate by the way of To calculate.For example, the wrist information that simply can be predicted according to the candidate region using hand is with utilizing underarm area prediction The difference of wrist information, to determine the confidence level of each candidate region, i.e., with the difference for the wrist information predicted using underarm area Not smaller candidate region, its confidence level is bigger.

Optionally, the selected candidate region with highest confidence level can be verified.Specifically, by the highest Confidence level is compared with threshold value set in advance, if highest confidence level is more than the threshold value, it is determined that put with the highest The candidate region of reliability is the region where hand；Conversely, then illustrate that the confidence level of the candidate region of each hand detected is too low, Think to be not detected by hand.

The foregoing describe hand detection method according to embodiments of the present invention.In the method, using forearm message to detection To the candidate regions of multiple hands verified that the candidate region that the multiple hands detected are calculated using forearm information is put Reliability, and select that there is the candidate region of highest confidence level in the candidate region of multiple hands as the region where hand.Because small The detection of arm is easy relative to the detection of hand, therefore the forearm detected has higher confidence level, and then utilizes forearm letter Ceasing the region of the hand of checking has higher confidence level.Therefore, cause to have in image blurring, background even in due to the motion of hand Illumination is changed, in the case of hand and the Various Complex such as face is overlapping in the object of the similar colour of skin, interactive process, according to this hair The hand detection method of bright embodiment can also obtain preferable testing result.In addition, the hand detection method do not need initiation gesture and Movable information.

Hand detection device according to embodiments of the present invention is described below with reference to Figure 11.

Figure 11 shows the functional configuration block diagram of hand detection device 1100 according to embodiments of the present invention.

As shown in figure 11, hand detection device 1100 can include：Hand candidate region detection part 1110, is configured to detection The candidate region of hand in current scene；Underarm area detection part 1120, is configured to the depth image based on current scene, really Determine the underarm area in current scene；Wrist information prediction unit 1130, is configured to the candidate region using each hand Corresponding wrist information is predicted respectively, and predicts the wrist information using the underarm area；Part is determined with hand region 1140, it is configured to be based on each described wrist information, selection confidence level highest candidate region is used as the region where hand.

Above-mentioned hand candidate region detection part 1110, underarm area detection part 1120, wrist information prediction unit 1130 And hand region determines that the concrete function of part 1140 and operation may be referred to above-mentioned Fig. 1 to Figure 10 associated description, herein not Repeat description.

The general hardware block diagram of hand detecting system 1200 according to embodiments of the present invention is described below with reference to Figure 12.Such as Figure 12 Shown, hand detecting system 1200 can include：Input equipment 1210, for from the relevant image of outside input or information, such as taking the photograph Depth image, coloured image that camera is shot etc., the input equipment for example can be such as video camera；Processing equipment 1220, is used In the above-mentioned hand detection method according to the embodiment of the present invention of implementation, or above-mentioned hand detection device is embodied as, the processing is set Standby for example can be the central processing unit of computer or other chips with disposal ability etc.；Output equipment 1230, is used Result obtained by above-mentioned hand detection process is implemented to outside output, the position point coordinates of the hand of such as determination, the direction of hand Etc., the output equipment for example can be display, printer etc.；And storage device 1240, for volatile or non-easy The mode of mistake stores various images and data involved by above-mentioned hand detection process, such as depth image, coloured image, foreground depth Image, prospect coloured image, comparison diagram, weights figure, Saliency maps, using each hand candidate region predict wrist information, Wrist information, various threshold values set in advance and weights for being predicted using underarm area etc., the storage device for example can be Random access memory（RAM）, read-only storage（ROM）, hard disk or semiconductor memory etc. it is various volatile and nonvolatile Property memory.

The general principle of the present invention is described above in association with specific embodiment, however, it is desirable to, it is noted that to this area For those of ordinary skill, it is to be understood that the whole or any steps or part of methods and apparatus of the present invention, Ke Yi Any computing device（Including processor, storage medium etc.）Or in the network of computing device, with hardware, firmware, software or Combinations thereof is realized that this is that those of ordinary skill in the art use them in the case where having read the explanation of the present invention Basic programming skill can be achieved with.

Therefore, the purpose of the present invention can also by run on any computing device a program or batch processing come Realize.The computing device can be known fexible unit.Therefore, the purpose of the present invention can also be included only by offer Realize that the program product of the program code of methods described or device is realized.That is, such program product is also constituted The present invention, and the storage medium for such program product that is stored with also constitutes the present invention.Obviously, the storage medium can be Any known storage medium or any storage medium developed in the future.

It may also be noted that in apparatus and method of the present invention, it is clear that each part or each step are to decompose And/or reconfigure.These decompose and/or reconfigured the equivalents that should be regarded as the present invention.Also, perform above-mentioned series The step of processing can order naturally following the instructions perform in chronological order, but and need not necessarily sequentially in time Perform.Some steps can be performed parallel or independently of one another.

Above-mentioned embodiment, does not constitute limiting the scope of the invention.Those skilled in the art should be bright It is white, depending on design requirement and other factors, can occur various modifications, combination, sub-portfolio and replacement.It is any Modifications, equivalent substitutions and improvements made within the spirit and principles in the present invention etc., should be included in the scope of the present invention Within.

Claims

1. a kind of hand detection method, including：

Detect the candidate region of hand in current scene；

Depth image based on current scene, determines the underarm area in current scene；

Corresponding wrist information is predicted respectively using the candidate region of each hand, and it is described using underarm area prediction Wrist information；With

Based on wrist information each described, selection confidence level highest candidate region is used as the region where hand.

2. hand detection method as claimed in claim 1, wherein the depth image based on current scene determines current scene In underarm area include：

Foreground segmentation is carried out to the depth image, to generate foreground depth image；

Based on the foreground depth image, depth profile is generated；

Thresholding processing is carried out to the depth profile；

Detection of straight lines in depth profile after thresholding processing；

Based on the position of human body in the foreground depth image, determine to correspond to the straight line of forearm in the straight line；

According to the straight line corresponding to forearm, underarm area is determined in the foreground depth image.

3. hand detection method as claimed in claim 1 or 2, wherein predicting correspondence respectively using the candidate region of each hand Wrist information further comprise candidate region for each hand：

The boundary rectangle of the candidate region of the hand is determined, and based on the principal direction of boundary rectangle calculating hand；

On the direction opposite with the principal direction of the hand, the first auxiliary area of a predefined size, first auxiliary region are set Domain connects with the boundary rectangle of the candidate region of the hand；

The center of gravity of first auxiliary area is determined, the position of the wrist of prediction is used as；

The principal direction of first auxiliary area is calculated, the direction of the wrist of prediction is used as.

4. hand detection method as claimed in claim 3, wherein predicting that the wrist information is further using the underarm area Including：

The boundary rectangle of the underarm area is determined, and based on the principal direction of boundary rectangle calculating forearm；

On the principal direction identical direction with the forearm, the second auxiliary area of a predefined size is set, second auxiliary Region connects with the boundary rectangle of underarm area this described；

The center of gravity of second auxiliary area is determined, the position of the wrist of prediction is used as；

The principal direction of second auxiliary area is calculated, the direction of the wrist of prediction is used as.

5. hand detection method as claimed in claim 1, further comprises the candidate region for each hand, calculating it is The probability in the region where hand.

6. hand detection method as claimed in claim 5, further comprises, for the identified straight line corresponding to forearm, calculating The straight line is the probability of forearm.

7. hand detection method as claimed in claim 6, wherein being waited based on wrist information selection confidence level highest each described Favored area includes as the region where hand：

For the candidate region of each hand, the probability for being region where hand is with the straight line corresponding to forearm The probability of forearm is weighted, and using weighted results as the candidate region confidence level；

Candidate region of the selection with maximum weighted result, is used as confidence level highest candidate region.

8. hand detection method as claimed in claim 7, wherein the wrist information predicted using underarm area is with utilizing hand The difference for the wrist information that candidate region is predicted is smaller, then the straight line corresponding to forearm is that the weights of the probability of forearm are got over Greatly.

9. hand detection method as claimed in claim 1, wherein the candidate region of hand includes in detection current scene：Utilize hand The candidate region of hand in tone, saturation degree and depth information, detection current scene.

10. a kind of hand detection device, including：

Hand candidate region detection part, is configured to detect the candidate region of hand in current scene；

Underarm area detection part, is configured to the depth image based on current scene, determines the underarm area in current scene；

Wrist information prediction unit, is configured to predict corresponding wrist information respectively using the candidate region of each hand, And predict the wrist information using the underarm area；With

Hand region determines part, is configured to be based on each described wrist information, selects the candidate region conduct of confidence level highest Region where hand.