Summary of the invention
The technical problem to be solved by the present invention is to overcome the defects of the existing gesture detection methods described above by providing a gesture detection method based on RGB-D images. The method segments the human hand region effectively and accurately, still obtains a good gesture segmentation when the hand is partially self-occluded or when other people in the background cause interference, and is robust.
To solve the above technical problem, the invention provides a gesture detection method based on RGB-D images, comprising:
Step 1: acquiring an RGB-D image;
Step 2: segmenting the hand from the background;
Step 3: optimizing the segmentation with a convex function;
Step 4: finding the optimal segmentation of the gesture.
Step 1 specifically uses a depth sensor to acquire a color image (RGB image) stream and a depth image stream, i.e. an RGB-D image data stream, and converts the stream into individual frames for subsequent image processing.
Step 2 specifically maps the hand position onto the depth image using the pixel ratio between the skeleton map and the depth image, and then segments the hand from the background using the depth image information.
Step 3 specifically uses a convex function to optimize the segmentation of the RGB-D gesture image.
Step 4 specifically uses a minimization functional and its constraints, solves the model with the fast Split Bregman algorithm, and finds the optimal segmentation of the RGB-D image.
Beneficial effects of the invention:
The gesture detection method based on RGB-D images provided by the invention segments the human hand region effectively and accurately, still obtains a good gesture segmentation when the hand is partially self-occluded or when other people in the background cause interference, and is robust.
Brief description of the drawings
Figs. 1a-1e show segmentation results based on the color image, the depth image, and the RGB-D image, wherein Fig. 1a is the color image; Fig. 1b is the depth image; Fig. 1c is the color image segmentation result; Fig. 1d is the depth image segmentation result; and Fig. 1e is the RGB-D image segmentation result;
Figs. 2a-2e show segmentation results based on the color image, the depth image, and the RGB-D image in another scene, wherein Fig. 2a is the color image; Fig. 2b is the depth image; Fig. 2c is the color image segmentation result; Fig. 2d is the depth image segmentation result; and Fig. 2e is the RGB-D image segmentation result.
Detailed description of the embodiments
The invention provides a gesture detection method based on RGB-D images, comprising:
Step 1: acquiring an RGB-D image;
Step 2: segmenting the hand from the background;
Step 3: optimizing the segmentation with a convex function;
Step 4: finding the optimal segmentation of the gesture.
Step 1 specifically uses a depth sensor to acquire a color image stream (RGB image stream) and a depth image stream, i.e. an RGB-D image data stream, and converts the stream into individual frames for subsequent image processing.
A depth sensor provides depth images and RGB color image data at the same time, supports real-time full-body and skeleton tracking, and can recognize a range of postures and actions. In this application it is used to obtain gesture data.
The purpose of gesture detection is to segment the hand region effectively from the original image, that is, to separate the hand region (foreground) from the rest of the image (background). It is an important basic task in gesture recognition. The depth sensor can analyze depth data and detect the outline of a human body or player. The color and depth data streams can be obtained through it simultaneously and converted into individual frames for subsequent image processing. The RGB image and the depth image of each input pair are required to be aligned pixel by pixel and synchronized in time. After an image pair meeting these conditions is obtained, the input images are preprocessed, for example by filtering, to suppress noise.
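The filtering mentioned above is not specified further. As a minimal sketch, assuming a 3x3 median filter is an acceptable choice for suppressing isolated depth outliers, the preprocessing could look as follows. The function name `median_filter_3x3` and the sample values are hypothetical.

```python
import numpy as np

def median_filter_3x3(depth):
    """Apply a 3x3 median filter to a 2D depth frame (edges are replicated)."""
    padded = np.pad(depth, 1, mode="edge")
    h, w = depth.shape
    # Stack the nine shifted views of the padded frame, then take the per-pixel median.
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3)])
    return np.median(windows, axis=0)

# Hypothetical frame: constant depth with a single outlier pixel.
depth = np.full((5, 5), 100.0)
depth[2, 2] = 0.0
filtered = median_filter_3x3(depth)
```

A median filter is used in this sketch because it removes single-pixel outliers without blurring depth edges as strongly as an averaging filter would.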
Step 2 specifically maps the hand position onto the depth image using the pixel ratio between the skeleton map and the depth image, and then segments the hand from the background using the depth image information.
Both color images and depth images can be used for gesture segmentation. A color image has the advantage of being sharp, but it contains only two-dimensional information and is sensitive to interference. A depth image has lower resolution than a color image, but it contains three-dimensional information and is resistant to interference. Because the skeleton map tracks the coordinates of the human hand, the position of the hand in the skeleton map is easy to determine. The hand position is then mapped onto the depth image using the pixel ratio between the skeleton map and the depth image, and the hand is segmented from the background using the depth image information. Because the depth image has low resolution and is easily disturbed by objects at the same depth as the hand, the segmentation obtained this way is unsatisfactory. This application therefore proposes a detection method that combines the depth image and the color image.
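The mapping by pixel ratio and the depth-based split can be sketched as follows. This is an illustrative sketch only: the function names, the image sizes, the depth unit, and the tolerance band around the hand depth are all assumptions introduced here.

```python
import numpy as np

def map_joint_to_depth(joint_xy, skeleton_size, depth_size):
    """Scale a hand joint (x, y) from skeleton-map pixels to depth-image pixels."""
    sx = depth_size[0] / skeleton_size[0]
    sy = depth_size[1] / skeleton_size[1]
    x = min(int(joint_xy[0] * sx), depth_size[0] - 1)
    y = min(int(joint_xy[1] * sy), depth_size[1] - 1)
    return x, y

def segment_hand_by_depth(depth, hand_xy, tolerance):
    """Keep pixels whose depth lies within `tolerance` of the depth at the hand."""
    x, y = hand_xy
    hand_depth = depth[y, x]
    return np.abs(depth - hand_depth) <= tolerance

# Hypothetical scene: background at depth 2000, a 40x40 hand patch at depth 800.
depth = np.full((240, 320), 2000.0)
depth[100:140, 150:190] = 800.0
hand_xy = map_joint_to_depth((340, 240), skeleton_size=(640, 480),
                             depth_size=(320, 240))
mask = segment_hand_by_depth(depth, hand_xy, tolerance=100.0)
```

Any other object inside the same depth band would also pass this test, which is the weakness of depth-only segmentation noted above.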
Step 3 specifically uses a convex function to optimize the segmentation of the RGB-D gesture image.
For the segmentation optimization, we pose image segmentation as the minimization of the following functional:
$$E(u) = \int_\Omega f(x)\,u(x)\,dx + \int_\Omega |Du(x)| \qquad (1)$$
where $u \in BV(\mathbb{R}^d; \{0,1\})$ is an indicator function, i.e. a binary function of bounded variation; $u = 1$ and $u = 0$ denote the inside and outside of a surface in $\mathbb{R}^d$, that is, a set of closed boundaries in the case of two-dimensional image segmentation or a set of closed surfaces in the case of three-dimensional segmentation. The second term of formula (1) is the total variation, where $Du$ denotes the distributional derivative; for a differentiable function $u$ it reduces to $\int_\Omega |\nabla u(x)|\,dx$. By relaxing the binary constraint, the values of $u$ are allowed to lie between 0 and 1. The optimization problem then becomes that of minimizing the convex functional (1) over the convex set $BV(\mathbb{R}^d; [0,1])$.
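As a minimal illustration, functional (1) can be evaluated on a pixel grid as shown below. The forward-difference discretization of the total variation, the function names, and the sample data term `f` are assumptions introduced here, not part of the original disclosure.

```python
import numpy as np

def total_variation(u):
    """Isotropic discrete total variation using forward differences."""
    gx = np.diff(u, axis=1, append=u[:, -1:])
    gy = np.diff(u, axis=0, append=u[-1:, :])
    return float(np.sum(np.sqrt(gx ** 2 + gy ** 2)))

def energy(u, f):
    """Discrete counterpart of formula (1): data term plus total variation."""
    return float(np.sum(f * u)) + total_variation(u)

# Hypothetical example: a 2x2 foreground block in a 6x6 image.
u = np.zeros((6, 6))
u[2:4, 2:4] = 1.0
f = np.where(u > 0, -1.0, 1.0)   # negative values favour foreground
e_block = energy(u, f)
e_empty = energy(np.zeros_like(u), f)
```

Negative values of `f` reward labeling a pixel as foreground, while the total variation term penalizes the length of the boundary between the two labels.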
Convex relaxation followed by thresholding makes global optimization of the spatially continuous functional possible. The thresholding theorem guarantees that a solution $u^*$ of the relaxed problem remains globally optimal for the original binary labeling problem after thresholding. A global minimizer of formula (1) is computed as follows: compute a global minimizer $u^*$ of formula (1) over the convex set $BV(\mathbb{R}^d; [0,1])$, then threshold $u^*$ at any value $\theta \in (0,1)$.
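The thresholding step itself is simple. A minimal sketch, assuming the relaxed solution is stored as a NumPy array with values in [0, 1], is given below; the function name and the sample values are hypothetical.

```python
import numpy as np

def threshold_labeling(u_relaxed, theta=0.5):
    """Binarize a relaxed labeling with values in [0, 1] at a level theta in (0, 1)."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    return (u_relaxed > theta).astype(np.uint8)

# Hypothetical relaxed solution on a 2x3 grid.
u_star = np.array([[0.05, 0.20, 0.90],
                   [0.10, 0.70, 0.95]])
labels = threshold_labeling(u_star, theta=0.5)
```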
Because the RGB-D image provides additional depth information, the boundary length can be measured in the absolute (world) domain, $d(x)\,|Du(x)|$, rather than in the image domain, $|Du(x)|$. Functional (1) can thus be generalized to:
$$E(u) = \int_\Omega f(x)\,u(x)\,dx + \int_\Omega d(x)\,|Du(x)| \qquad (2)$$
with depth values $d: \Omega \to \mathbb{R}$. Formula (2) compensates for an unwanted effect of perspective projection: the farther an object is from the camera, the smaller it appears in the image.
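The effect of the depth weight in formula (2) can be illustrated with a discrete sketch. The forward-difference discretization and the function name are assumptions introduced here; the example only shows that the same image-space boundary counts as longer when it lies farther from the camera.

```python
import numpy as np

def weighted_total_variation(u, d):
    """Discrete version of the integral of d(x)|Du(x)| using forward differences."""
    gx = np.diff(u, axis=1, append=u[:, -1:])
    gy = np.diff(u, axis=0, append=u[-1:, :])
    return float(np.sum(d * np.sqrt(gx ** 2 + gy ** 2)))

# Hypothetical example: the same 2x2 block seen at depth 1 and at depth 2.
u = np.zeros((6, 6))
u[2:4, 2:4] = 1.0
near = weighted_total_variation(u, np.full(u.shape, 1.0))
far = weighted_total_variation(u, np.full(u.shape, 2.0))
```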
Step 4 specifically uses a minimization functional and its constraints, solves the model with the fast Split Bregman algorithm, and finds the optimal segmentation of the RGB-D image.
For the constraints on the RGB-D image, we use the depth information to constrain the moments of the segmentation, and describe how these constraints affect the corresponding convex sets in the convex optimization. We denote by $\mathcal{B} = BV(\Omega; [0,1])$ the convex set of relaxed binary labeling functions of bounded variation defined on the whole image domain $\Omega$. Area constraint: the zeroth-order moment corresponds to the area of the shape $u$ and can be computed by formula (3):
$$\mathrm{Area}(u) := \int_\Omega d^2(x)\,u(x)\,dx \qquad (3)$$
where $d(x)$ gives the depth of pixel $x$. Assume $d(x) = k\,D(x)$, where $k$ is given by the focal length of the camera and $D(x)$ is the measured depth of the pixel. Then $d^2(x)$ is the size of the patch that the pixel covers when projected into 3D space, so the integral measures surface area in world space rather than the projected area in the image. Following the method of (Grenander, U., Chow, Y., Keenan, D.M.: Hands: A Pattern Theoretic Study of Biological Shapes. Springer, New York (1991)), all pixels are treated in the same way.
The absolute area of the shape $u$ is bounded between constants $c_1 \le c_2$ by constraining $u$ to the set in formula (4):
$$\mathcal{C}_0 = \{\, u \in \mathcal{B} \mid c_1 \le \mathrm{Area}(u) \le c_2 \,\} \qquad (4)$$
The set $\mathcal{C}_0$ is linear in $u$ and is therefore convex for all constants $c_2 \ge c_1 \ge 0$.
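A discrete sketch of the area of formula (3) and the membership test of formula (4) follows. It writes D(x) for the measured depth, so that d(x) = k * D(x); the function names and the sample values of k, the depth, and the bounds are hypothetical.

```python
import numpy as np

def world_area(u, measured_depth, k):
    """Discrete form of Area(u): sum of d(x)^2 * u(x) with d(x) = k * D(x)."""
    d = k * measured_depth
    return float(np.sum(d ** 2 * u))

def in_area_set(u, measured_depth, k, c1, c2):
    """Check the constraint c1 <= Area(u) <= c2 of formula (4)."""
    return c1 <= world_area(u, measured_depth, k) <= c2

# Hypothetical example: the same 100-pixel blob at two measured depths.
u = np.zeros((20, 20))
u[5:15, 5:15] = 1.0
near = world_area(u, np.full(u.shape, 2.0), k=0.5)   # d = 1 at every pixel
far = world_area(u, np.full(u.shape, 4.0), k=0.5)    # d = 2 at every pixel
near_ok = in_area_set(u, np.full(u.shape, 2.0), 0.5, c1=50.0, c2=150.0)
far_ok = in_area_set(u, np.full(u.shape, 4.0), 0.5, c1=50.0, c2=150.0)
```

The blob covers the same number of pixels in both cases, but its world-space area grows with the square of the depth, so only the nearer blob satisfies the bounds chosen here.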
In general, the exact area can be fixed by setting $c_1 = c_2$, the area can be bounded from above and below, or a soft area constraint can be applied by extending functional (1) as in formula (5):
$$E_{\mathrm{total}}(u) = E(u) + \lambda \left( \int_\Omega d^2\,u\,dx - c \right)^2 \qquad (5)$$
Formula (5) adds a soft constraint with weight $\lambda > 0$ that drives the estimated area of the shape toward $c \ge 0$. Formula (5) is also convex.
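The soft constraint of formula (5) adds a quadratic penalty to an already computed energy. A minimal sketch, assuming the base energy E(u) is supplied by the caller, is shown below; the function name and all sample values are hypothetical.

```python
import numpy as np

def soft_area_energy(base_energy, u, d, c, lam):
    """Formula (5) on a pixel grid: E(u) + lam * (sum(d^2 * u) - c)^2."""
    area = float(np.sum(d ** 2 * u))
    return base_energy + lam * (area - c) ** 2

# Hypothetical example: a 4-pixel shape with d = 1, so its area is 4.
u = np.zeros((4, 4))
u[1:3, 1:3] = 1.0
d = np.ones_like(u)
matched = soft_area_energy(-1.5, u, d, c=4.0, lam=0.5)   # target area met
missed = soft_area_energy(-1.5, u, d, c=6.0, lam=0.5)    # area off by 2
```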
The fast Split Bregman algorithm relies on the fact that maximizing a likelihood function is equivalent to maximizing its natural logarithm. This application first applies the Split method to RGB-D image segmentation and sets up the following general model:
where $Q_i = -\ln P_i$, $i = 1, 2$; $\omega = (\mu, \sigma)$ maximizes $P_i$, $i = 1, 2$; and $u$ is a binary labeling function that represents the evolution of the curve.
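The equivalence between maximizing a likelihood and minimizing its negative logarithm can be illustrated as follows. The sketch assumes that each P_i is a Gaussian density with parameters (mu_i, sigma_i); the text names the parameters but not the distribution, so the Gaussian form, the function name, and the sample values are assumptions.

```python
import numpy as np

def neg_log_gaussian(x, mu, sigma):
    """Q = -ln P for a Gaussian density P with mean mu and standard deviation sigma."""
    return 0.5 * np.log(2.0 * np.pi * sigma ** 2) + (x - mu) ** 2 / (2.0 * sigma ** 2)

# Hypothetical example: two pixel values scored against two region models.
pixels = np.array([0.1, 0.9])
q1 = neg_log_gaussian(pixels, mu=0.0, sigma=0.2)
q2 = neg_log_gaussian(pixels, mu=1.0, sigma=0.2)
# The region with the smaller Q_i is the one with the larger likelihood P_i.
labels = np.where(q1 < q2, 1, 2)
```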
This application incorporates the Split Bregman idea into the general model for RGB-D image segmentation: on the basis of the Split method, a splitting variable $w = [w_1, w_2]^T$ is introduced first, followed by the Bregman distance $b = (b_1, b_2)^T$, so that the functional extremum problem of formula (7) is converted into:
where $r(u_1, u_2) = \alpha_1 Q_1(x, \omega_1) - \alpha_2 Q_2(x, \omega_2)$. Formula (9) is an extremum problem for an energy functional of two variables and is usually solved by alternating optimization. First, $w$ is held fixed, and the problem reduces to an extremum problem in $u$:
Then $u$ is held fixed, and the extremum problem in $w$ is solved:
The Euler-Lagrange equation of energy functional (10) is obtained by the variational method:
Formula (12) can be solved with fast Gauss-Seidel iteration. Because convex relaxation gives $u$ the range $[0, 1]$, $u$ must be kept within this range by the following projection:
$$u^{k+1} = \max(\min(u^{k+1}, 1), 0) \qquad (13)$$
After energy functional (10) has been solved, energy functional (11) is solved. The Euler-Lagrange equation of formula (11) is:
Its analytic solution is obtained from the generalized soft-threshold formula, of the form:
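The alternating scheme described above (a u-update, the projection of formula (13), a soft-threshold w-update, and a Bregman update) can be sketched for a generic functional of the form sum(r*u) + gamma*sum(|w|) + (theta/2)*sum(|w - grad(u) - b|^2). This generic functional, the parameters `gamma` and `theta`, and all function names are assumptions made for illustration and are not taken from formulas (7) to (15). A damped Jacobi sweep is used in place of the Gauss-Seidel iteration because it vectorizes in NumPy.

```python
import numpy as np

def grad(u):
    """Forward differences; the gradient is zero across the last row and column."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def div(px, py):
    """Discrete divergence, the negative adjoint of grad."""
    dx = np.zeros_like(px)
    dy = np.zeros_like(py)
    dx[:, 0] = px[:, 0]
    dx[:, 1:-1] = px[:, 1:-1] - px[:, :-2]
    dx[:, -1] = -px[:, -2]
    dy[0, :] = py[0, :]
    dy[1:-1, :] = py[1:-1, :] - py[:-2, :]
    dy[-1, :] = -py[-2, :]
    return dx + dy

def split_bregman_segment(r, gamma=0.1, theta=1.0, iterations=200):
    """Relaxed two-region segmentation for a region term r (negative = foreground)."""
    u = np.zeros_like(r)
    wx, wy = np.zeros_like(r), np.zeros_like(r)
    bx, by = np.zeros_like(r), np.zeros_like(r)
    for _ in range(iterations):
        # u-update: damped Jacobi sweep on theta * Laplace(u) = r + theta * div(w - b).
        p = np.pad(u, 1, mode="edge")
        neighbours = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
        residual = neighbours - 4.0 * u - r / theta - div(wx - bx, wy - by)
        # Projection onto [0, 1], as in formula (13).
        u = np.clip(u + residual / 8.0, 0.0, 1.0)
        # w-update: vector soft threshold of grad(u) + b with threshold gamma / theta.
        gx, gy = grad(u)
        sx, sy = gx + bx, gy + by
        norm = np.sqrt(sx ** 2 + sy ** 2)
        scale = np.maximum(norm - gamma / theta, 0.0) / np.maximum(norm, 1e-12)
        wx, wy = scale * sx, scale * sy
        # Bregman update: b <- b + grad(u) - w.
        bx, by = sx - wx, sy - wy
    return u

# Hypothetical region term: an 8x8 square favours foreground, the rest background.
inside = np.zeros((16, 16), dtype=bool)
inside[4:12, 4:12] = True
r = np.where(inside, -1.0, 1.0)
u = split_bregman_segment(r)
segmentation = u > 0.5
```

The damping factor of 1/8 keeps the u-update a conservative step; a Gauss-Seidel sweep, as named in the text, would update the pixels in place one after another instead.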
The embodiments of the invention are described in detail below by way of example, so that the way in which the invention applies technical means to solve the technical problem and achieve the technical effect can be fully understood and implemented accordingly.
The invention presents experimental results comparing this method with other methods. The segmentation method is demonstrated on the two scenes of Fig. 1 and Fig. 2; the experiments aim to segment the gesture of one individual from a crowd. The figures show that RGB-D-based gesture segmentation outperforms segmentation based on the color image or the depth image alone. As shown in Fig. 1c, when only the RGB color image information is used, the algorithm segments the hand, the face, and part of the wall, and fails to isolate the desired gesture. As shown in Fig. 1d, when only the depth image information is used, the hand is segmented together with the body parts at the same depth as the hand. Thus, when only one of these two kinds of information is considered, the segmentation is unsatisfactory. As shown in Fig. 1e, when RGB and depth information are considered together, i.e. when RGB-D image information is used, the hand region is segmented on its own and the segmentation problem is solved. In complex scenes the algorithm of this application is also robust, as shown in Fig. 2: a new person at a different depth is added to the scene, and the target gesture is still segmented well.
The foregoing is the primary implementation of this intellectual property and does not limit other forms of implementing this new product and/or new method. Those skilled in the art may use this information to modify the above content in order to achieve a similar implementation. However, all modifications or alterations based on the new product of the present invention fall within the reserved rights.
The above is only a preferred embodiment of the present invention and does not limit the invention in any other form. Any person skilled in the art may use the technical content disclosed above to make changes or modifications that result in equivalent embodiments. However, any simple modification, equivalent change, or adaptation made to the above embodiments in accordance with the technical essence of the invention, without departing from the content of its technical solution, still falls within the scope of protection of the technical solution of the invention.