CN108268819A - A motion gesture detection and recognition method based on skin color detection - Google Patents
- Publication number
- CN108268819A (application CN201611262542.1A)
- Authority
- CN
- China
- Prior art keywords
- gesture
- hand
- detection
- skin color
- region
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
Abstract
The present invention proposes a motion gesture detection and recognition method based on skin color detection, comprising the following steps. S1, shake detection: detect and locate the position of the hand to find a candidate hand region. S2, hand skin color modeling: perform skin color modeling on the hand region found by shake detection, distinguish the hand from the background, and establish a skin similarity model. S3, gesture tracking: track the gesture with the mean shift algorithm, using a search window to follow and identify it. S4, gesture region segmentation: further segment the tracked hand region with the hand skin color model to obtain a binary image of the gesture. S5, gesture recognition: recognize the gesture motion with a support vector machine.
Description
Technical field
The present invention relates to the field of human-computer interaction, and in particular to a motion gesture detection and recognition method based on skin color detection.
Background technology
To help disabled and elderly people keep communicating and exchanging with the outside world, improve their ability to live independently, and lighten the burden on families and society, many scientists around the world have begun to explore and develop novel human-computer interaction methods. Interaction technology in this sense covers both the interaction between people and an actuator (such as a robot) and the interaction between the actuator and its environment. The significance of the former is that a person can carry out the planning and decision-making that the actuator cannot manage under unknown or uncertain conditions; the significance of the latter is that a robot can complete a person's tasks in harsh or remote environments that people cannot reach.
Traditional human-computer interaction devices mainly include keyboards, mice, handwriting pads, touch screens and game controllers, all of which realize interaction through the hand movements of the user. Gesture interaction supports more natural interaction modes: it provides a human-centered rather than device-centered interaction technique, letting users concentrate on the task and content at hand rather than on the equipment.
Common gesture interaction technology falls into two categories: gesture interaction based on data glove sensors and gesture interaction based on computer vision.
Gesture interaction based on data glove sensors requires the user to wear hardware such as a data glove or position sensors. The sensors acquire finger states, movement trajectories and similar information, which the computer processes to recognize gesture motions and realize various interactive controls. The advantages of this approach are accurate and robust recognition, relatively simple algorithms, fast computation on little data, and precise capture of the hand's motion in three-dimensional space, completely free from the interference of lighting changes and complex backgrounds that trouble vision systems. The disadvantages are that the equipment is cumbersome to wear, costly and inconvenient to operate, and that gesture motion is somewhat constrained, so the approach is difficult to deploy in production at scale.
Gesture interaction based on computer vision uses machine vision to process and recognize the gesture image sequence collected by a camera, and thereby interact with the computer. This method acquires gesture information with a camera, then segments the hand from the background using a skin color model to achieve gesture detection and recognition, and finally tracks the moving gesture with the inter-frame difference method. Its effectiveness depends on the accuracy of the skin color model, yet human skin colors vary, so a general, efficient skin color model is hard to obtain. Moreover, when the hand moves at an uneven speed, tracking with the inter-frame difference method suffers interruptions, and the tracked gesture is lost.
Invention content
The purpose of the present invention is to overcome the deficiencies of the prior art, in particular to solve two problems of existing computer-vision-based gesture interaction: the difficulty of establishing a general, efficient skin color model for varying skin colors, and the interruptions that occur when motion is tracked with the inter-frame difference method.
To solve the above technical problems, the present invention proposes a motion gesture detection and recognition method based on skin color detection, whose main steps are:
S1, shake detection: detect and locate the position of the hand to find a candidate hand region;
S2, hand skin color modeling: perform skin color modeling on the hand region found by shake detection, distinguish the hand from the background, and establish a skin similarity model;
S3, gesture tracking: track the gesture with the mean shift algorithm, using a search window to follow and identify it;
S4, gesture region segmentation: further segment the tracked hand region with the hand skin color model to obtain a binary image of the gesture;
S5, gesture recognition: recognize the gesture motion with a support vector machine.
Compared with the prior art, the present invention has the following advantageous effects. The scheme collects acceleration data with an acceleration sensor and computes the tilt angle to eliminate the interference that head shaking or movement introduces into the extracted hand coordinates; it finds the gesture contour using connected regions and filters out non-gesture areas by searching for fingertip points and the point between the fingers and the palm; finally it tracks the moving gesture with a simplified mean shift procedure. This reduces the interference caused by body shaking while avoiding tracking interruptions.
Description of the drawings
Fig. 1 is a flow chart of one embodiment of the motion gesture detection and recognition method based on skin color detection of the present invention.
Fig. 2 is a schematic diagram of the acceleration tilt angle of the embodiment of the present invention.
Fig. 3 compares the principles of four-connected and eight-connected regions in the embodiment of the present invention.
Fig. 4 is a schematic diagram of the fingertip point set of the embodiment of the present invention.
Fig. 5 is the program flow chart of motion gesture detection in the embodiment of the present invention.
Specific embodiment
The present invention is explained below in further detail, with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention, not to limit it.
Referring to Fig. 1, the main steps of the motion gesture detection and recognition method based on skin color detection of the embodiment of the present invention include the following.
S1, shake detection: detect and locate the position of the hand to find a candidate hand region.
Referring to Fig. 2, when a hand shakes, the average brightness of the pixels in the region the hand passes through fluctuates violently and continuously. No other region of the image shows this variation, so the system only needs to find, within a sequence of consecutive input frames, those regions that change most, to obtain the approximate position where the hand shakes.
In the system, the image of each frame is first partitioned into sub-blocks of size m × n. For each sub-block (i, j) of frame t, its degree of variation over 10 consecutive frames is computed as S(i, j, t):
S(i, j, t) = Σn=0..9 w(n) · |I(i, j, t − n) − I(i, j, t − n − 1)|
where I(i, j, t) denotes the average brightness of sub-block (i, j) in frame t, and w(n), n = 0, …, 9 are weights. Referring to Fig. 3, the intuitive definition of I(i, j, t) is:
I(i, j, t) = (1 / (m × n)) · Σ(p,q) ∈ (i,j) luminance(p, q, t)
where luminance(p, q, t) is the brightness of pixel (p, q) in frame t. To reflect the influence of time when accumulating each block's variation over the 10 consecutive frames, each frame is assigned a weight w(n), n = 0, …, 9, that increases with time, so that recent frames contribute more.
After this computation, the block with the maximum S(t) is the region that varied most violently within the last 10 frames.
The positioning above is based entirely on motion. Next, a relatively simple skin color decision rule finds the candidate hand region from the skin color: if a considerable number of pixels around this region have a color close to the empirical skin color, the area around this block is judged to be the hand region.
S2, hand skin color modeling: perform skin color modeling on the hand region found by shake detection, distinguish the hand from the background, and establish a skin similarity model.
A rectangular skin color model determined by linear equations in the maximum and minimum values of Cb and Cr is used. The rectangular model can be expressed by four straight lines L1, L2, L3, L4 as follows:
L1: Cb × T1 + T2 < Cr (2)
L2: Cb × T3 + T4 < Cr (3)
L3: Cb × T5 + T6 > Cr (4)
L4: Cb × T7 + T8 > Cr (5)
where T1 = −1.22265625, T2 = 267.3330078125, T3 = 0.875, T4 = 29.375, T5 = −1.3330078125, T6 = 316.3330078125, T7 = 0.064453125, T8 = 170.612903225. The parameters of this linear segmentation model are obtained by off-line training on a limited set of images, and the off-line trained parameters detect the skin color of different application scenes. When a pixel of the image falls within this rectangle, it is taken to be human skin. Fig. 4 shows the skin color detection result of this embodiment.
S3, gesture tracking: track the gesture with the mean shift algorithm, using a search window to follow and identify it.
An initial search window is input, namely the region located by the shake detector. Then:
(1) Compute the skin color probability map of the search window.
(2) Compute the zeroth-order moment M00 and the first-order moments M10, M01 of the skin color probability:
M00 = Σx Σy P(x, y), M10 = Σx Σy x · P(x, y), M01 = Σx Σy y · P(x, y)
(3) Compute the position (xc, yc) of the high-probability skin color centroid within the search window:
xc = M10 / M00, yc = M01 / M00
(4) Compute the size of the high-probability skin color region in the search window.
(5) Adjust the center and size of the search window according to the size of the high-probability skin color region.
(6) Repeat steps 1-5 until the change of the search window's center and size in an adjustment falls below some threshold. At that point, the position of the high-probability skin color centroid is the position of the tracked object (the hand).
S4, gesture region segmentation: further segment the tracked hand region with the hand skin color model to obtain a binary image of the gesture. The skin color model of step S2 is applied once more to judge every pixel of the hand region obtained by tracking: if a pixel's value lies within the skin color model it is judged to be hand region, otherwise background. After this judgement, noise causes some hand-region pixels to be mistaken for background, so the morphological operations of dilation and erosion are used to complete the segmentation of the hand region. The process is:
(1) The erosion operator is Θ; the erosion of set A by structuring element B is defined as
A Θ B = {z | (B)z ⊆ A}
(2) The dilation operator is ⊕; the dilation of set A by B is defined as
A ⊕ B = {z | (B̂)z ∩ A ≠ ∅}
Applying the dilation-erosion gradient operator, i.e. subtracting the eroded image from the dilated image, yields the edges of the image. Since these edges are not single-pixel-wide connected curves, they must further be thinned with a region skeleton extraction algorithm:
1) Let B be the structuring element and S(A) the skeleton of image A; then
S(A) = ∪k=0..K Sk(A)
where K is the number of erosions before A becomes the empty set:
K = max{k | A Θ kB ≠ ∅}
Sk(A) is called a skeleton subset and can be written as
Sk(A) = (A Θ kB) − (A Θ kB) ∘ B
where A Θ kB denotes k successive erosions of A by B and ∘ denotes opening. The final result is the binary image of the gesture.
S5, gesture recognition: recognize the gesture motion with a support vector machine.
All gesture features form an n-dimensional feature vector x with class label w. By defining a separating hyperplane, the discriminant function of the two classes, w · x + b = 0, is obtained. To maximize the margin, two parallel hyperplanes w · x + b = 1 and w · x + b = −1 are defined, passing through the support vectors, with no training patterns between them. Every training pattern xi must then satisfy the inequality
wi(w · xi + b) ≥ 1 (15)
The distance between the two hyperplanes is 2 / ||w||. Maximizing the margin therefore means minimizing ||w||. Stating this minimization problem with the Lagrange principle simplifies the optimization, and the discriminant function finally obtained is
f(x) = sgn(Σi αi wi (xi · x) + b) (16)
Using the kernel trick, the method generalizes to linearly non-separable problems: the dot product of the linear support vector classifier is replaced with a nonlinear kernel function
k(xi, xj) = Φ(xi) · Φ(xj) (17)
and the resulting discriminant function is
f(x) = sgn(Σi αi wi k(xi, x) + b) (18)
In practical operation, this embodiment trains on 150 groups of gesture images as training samples using the support vector machine method to obtain a gesture classifier, where the nonlinear kernel function is the sigmoid function, and then classifies 50 groups of test gesture images.
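Equation (18) can be sketched as a decision function over an already-trained dual solution. The gamma and c parameters of the sigmoid kernel are illustrative assumptions (the patent does not list them), and the training that produces the alphas and b (e.g. by SMO) is omitted; any kernel can be passed in.

```python
import math

def sigmoid_kernel(xi, xj, gamma=0.01, c=0.0):
    """Sigmoid kernel k(xi, xj) = tanh(gamma * xi . xj + c);
    gamma and c are hypothetical values for illustration."""
    return math.tanh(gamma * sum(a * b for a, b in zip(xi, xj)) + c)

def svm_decision(x, support_vectors, labels, alphas, b, kernel):
    """Evaluate f(x) = sgn(sum_i alpha_i * w_i * k(x_i, x) + b),
    equation (18), given a trained dual solution."""
    s = sum(a * w * kernel(sv, x)
            for a, w, sv in zip(alphas, labels, support_vectors)) + b
    return 1 if s >= 0 else -1
```

With a linear kernel and a hand-set toy solution (support vectors (1, 0) and (−1, 0) with labels +1 and −1), points on the positive side of the axis are classified +1 and points on the negative side −1.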
Claims (1)
1. A motion gesture detection and recognition method based on skin color detection, characterized by comprising the following steps:
S1, shake detection: detect and locate the position of the hand to find a candidate hand region;
S2, hand skin color modeling: perform skin color modeling on the hand region found by shake detection, distinguish the hand from the background, and establish a skin similarity model;
S3, gesture tracking: track the gesture with the mean shift algorithm, using a search window to follow and identify it;
S4, gesture region segmentation: further segment the tracked hand region with the hand skin color model to obtain a binary image of the gesture;
S5, gesture recognition: recognize the gesture motion with a support vector machine.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611262542.1A CN108268819A (en) | 2016-12-31 | 2016-12-31 | A motion gesture detection and recognition method based on skin color detection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108268819A true CN108268819A (en) | 2018-07-10 |
Family
ID=62755226
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611262542.1A Pending CN108268819A (en) | 2016-12-31 | 2016-12-31 | A kind of motion gesture detection and recognition methods based on Face Detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108268819A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110163055A (en) * | 2018-08-10 | 2019-08-23 | Tencent Technology (Shenzhen) Company Limited | Gesture recognition method, device and computer equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180710 ||