CN102831439B - Gesture tracking method and system - Google Patents

Publication number
CN102831439B
CN102831439B (application CN201210290337.1A)
Authority
CN
China
Prior art keywords
gesture
tracking
target
prediction
histogram
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210290337.1A
Other languages
Chinese (zh)
Other versions
CN102831439A (en)
Inventor
宋展
赵颜果
聂磊
杨卫
郑锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN201210290337.1A
Publication of CN102831439A
Application granted
Publication of CN102831439B
Legal status: Active
Anticipated expiration

Abstract

The invention provides a gesture tracking method comprising the following steps: designing an appearance model of the gesture, including the image description modes used for tracking prediction and for prediction verification; detecting the gesture to obtain the initial state of the target, namely the position and size of the target; initializing the target's tracker according to the initial state, including initializing the appearance model (that is, the image description templates used for tracking prediction and prediction verification) and initializing the gesture class, state, and visibility; making a final estimate of the target's state and visibility through the tracking process according to the tracker information; and judging the visibility of the target, wherein if the target is permanently lost, detection is restarted to obtain a new tracking target, and otherwise tracking continues. The invention also provides a gesture tracking system. The gesture tracking method and system of the invention are simple, fast, and stable.

Description

Gesture tracking method and system
Technical field
The present invention relates to the fields of vision- and image-based target tracking and human-computer interaction, and in particular to a gesture tracking method and system suitable for embedded television platforms.
Background art
Gesture-based human-computer interaction has attracted wide attention in recent years as an important interaction method. For example, a common camera captures images of the user in motion; a pattern-recognition algorithm detects and tracks the hand features in the images; the motion of the hand is converted into motion of a cursor on the television screen and fed back to the smart-TV terminal, triggering corresponding operating commands such as switching TV programs, moving the TV cursor, and simple game interaction. Gesture recognition relies only on the camera already fitted to the smart terminal plus recognition software installed on it, so it has great advantages in both hardware cost and ease of operation, and this technology is gradually becoming a standard module of smart televisions. A key problem involved is how to track the hand features accurately and smoothly, so that the displayed mouse or TV-screen cursor moves exactly with the movement of the hand; this process is referred to as gesture tracking.
Existing vision-based gesture tracking methods, however, share the following problems: 1) poor stability: they are easily affected by factors such as ambient lighting and complex backgrounds, and by the changes in hand shape in the image caused by the changing orientation of the hand during motion, which readily lead to loss of the tracking target and interruption of operation; 2) low computational efficiency: methods based on hand features such as skin color and shape are easily disturbed by external factors, while methods based on complex online machine learning and training greatly increase the complexity of the tracking algorithm, and the huge computational load prevents them from running stably and smoothly on embedded platforms of low computing power, such as smart-TV platforms.
How to develop a simple, fast, and stable gesture target tracking algorithm that can run on embedded platforms of low computing power is therefore an urgent problem. For any gesture-interaction system, the stability and accuracy of tracking directly determine the fluency of operation and the user experience, making this one of the key problems of gesture-based human-computer interaction.
Summary of the invention
In view of the above problems, the present invention proposes a feasible, simple, fast, and stable gesture tracking method. The method comprises: designing an appearance model of the gesture, including the image description modes used for tracking prediction and prediction verification; detecting the gesture to obtain the initial state of the target, namely its position and size; initializing the target's tracker according to the initial state, including initializing the appearance model (the image description templates used for tracking prediction and prediction verification) and initializing the class, state (position and size), and visibility of the tracked gesture recorded by the tracker; making a final estimate of the target's state and visibility through the tracking process according to the tracker information; and judging the visibility of the target, wherein if the target is permanently lost, detection is restarted to obtain a new tracking target, and otherwise tracking continues.
Preferably, the method further comprises: setting a tracking restricted area R according to the target state of the previous frame, used for tracking the target in the current frame.
Preferably, the operations of the tracking process, namely prediction, verification, and local detection, are confined to the tracking restricted area R.
Preferably, the method further comprises: detecting, within the tracking restricted area R, gestures other than the tracked gesture, so that the appearance model of the gesture can be updated when the gesture changes abruptly.
Preferably, when the local detection result shows that the gesture class has changed, the original gesture model is discarded, and the tracker information and appearance model are re-initialized from the detection result.
Preferably, in the step of predicting the target state, a color histogram combined with CamShift is used to predict the target state in the current frame from the target state of the previous frame or several previous frames.
Preferably, in the step of verifying the prediction, two description modes are used: a block-wise LBP histogram and an edge gradient orientation histogram.
Preferably, the method further comprises: updating the tracker information according to the result of the tracking process, including updating the appearance model and updating the gesture type, state, and visibility recorded by the tracker.
Preferably, when the target is temporarily lost, the tracking process is not stopped immediately; instead, a larger tracking restricted area is set according to the state of the previous frame, and the tracking process continues within this restricted area for several subsequent frames.
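The claimed control flow (detection initializes the tracker, per-frame tracking tolerates temporary loss, and permanent loss restarts detection) can be sketched as follows. This is only an illustrative sketch: the class name, the loss threshold, and the state-handling details are assumptions, not the patent's implementation.

```python
# Sketch of the claimed control flow: a verified prediction keeps the target
# "visible"; failed frames accumulate toward "temporarily lost" and then
# "permanently lost". MAX_LOST_FRAMES is an assumed threshold.

VISIBLE, TEMP_LOST, PERM_LOST = "visible", "temporarily lost", "permanently lost"
MAX_LOST_FRAMES = 8  # assumed: temporarily-lost frames before permanent loss

class GestureTracker:
    def __init__(self, gesture_class, state):
        self.gesture_class = gesture_class   # e.g. "closed palm"
        self.state = state                   # (x, y, w, h) from detection
        self.visibility = VISIBLE
        self.lost_frames = 0

    def update(self, predicted_state, verified):
        """One frame of tracking: accept the prediction only if verified."""
        if verified:
            self.state = predicted_state
            self.visibility = VISIBLE
            self.lost_frames = 0
        else:
            # keep the last successfully tracked state and count lost frames
            self.lost_frames += 1
            self.visibility = (PERM_LOST if self.lost_frames >= MAX_LOST_FRAMES
                               else TEMP_LOST)
        return self.visibility
```

When `update` returns `"permanently lost"`, the caller would leave the tracking module and re-enter the detection stage, matching the flow of Fig. 2.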
The invention also proposes a gesture tracking system, whose tracking module comprises the following sub-modules: the appearance model of the gesture, a tracker initialization module, a tracking prediction module, a prediction verification module, a local detection module, and a model update module. The appearance model of the gesture contains the image description modes used for tracking prediction and prediction verification. The tracker initialization module uses the gesture detection module to detect predefined gestures and, when a gesture of some class is detected, initializes the tracker, including initializing the appearance model and the gesture class, state, and visibility recorded by the tracker. The tracking prediction module, using the appearance-model description of the gesture, predicts the target state in the current frame from the target state in the previous frame or several previous frames. The prediction verification module extracts the verification features from the target image corresponding to the predicted current-frame state and compares them with the verification features of the appearance model of the gesture, determining whether the prediction is valid. The model update module updates, according to the result of the tracking process, the gesture type, state, and visibility recorded by the tracker in the tracker initialization module, and updates the appearance model of the gesture.
Preferably, the system further comprises a local detection module, which determines the tracking restricted area according to the target state of the previous frame and detects gestures other than the tracked gesture within it.
Based on above problem, the present invention proposes one and stablize and efficient gesture method for tracking target, make it can stablize smooth operation on the embedded platforms such as intelligent television.From technological layer, 1) first reduce following range by arranging tracking restricted area, reduce image procossing amount on the one hand, effectively can reduce the large scene background interference factor that global follow is brought on the other hand; 2) describing method by using various features to merge, verifies the result of tracking prediction, effectively suppresses erroneous matching; 3) detected when gesture is suddenlyd change by local, in time trace model is upgraded; 4) after track rejection, the basis of nearest state is continued follow the tracks of, reduce the tracking termination that target transient loss causes, thus make whole operation more efficiently smooth.
Brief description of the drawings
Fig. 1 is a structural diagram of an embodiment of the gesture tracking system of the present invention.
Fig. 2 is the overall flowchart of the gesture tracking method of the present invention.
Fig. 3 is the flowchart of single-frame processing in the tracking module of the present invention.
Fig. 4 illustrates the tracking restricted area used during tracking in the present invention.
Fig. 5 shows examples of the four gestures used in one implementation of the system of the present invention.
Detailed description
As shown in Fig. 1, a gesture tracking system 10 of the present invention is applied to a platform system 1 such as a smart-TV platform.
In the present embodiment, the platform system 1 containing the gesture tracking system 10 also comprises at least an image acquisition module 20 and a gesture detection module 30. The image acquisition module 20 is usually a camera and captures the user's gestures; in other embodiments it may instead be part of the gesture tracking system 10. The gesture detection module 30 detects predefined gestures and obtains the initial gesture state.
The gesture tracking system 10 comprises: the appearance model 11 of the gesture, a tracker initialization module 12, a tracking prediction module 13, a prediction verification module 14, a local detection module 15, and a model update module 16.
The appearance model 11 of the gesture contains the image description modes used for tracking prediction and prediction verification.
In the present embodiment, the target's appearance model is expressed by combining multiple features: two sets of description modes, Ωp and Ωv, are selected; the feature templates built from Ωp are used for similarity measurement during tracking prediction, while the feature templates built from Ωv are used to further check the tracking prediction and prevent false matches.
The tracker initialization module 12 uses the pre-trained gesture detection module 30 to detect predefined gestures in a preset region (or in the whole image); once a gesture of some class is stably detected, the tracker parameters are initialized from the detection result.
In the present embodiment, the tracker records not only the hand-shape class of the tracked gesture, its state (position and size), and its visibility, but also the parameters of the tracked target's appearance model. The concrete initialization procedure is described under "tracking target initialization" in the embodiments below.
The tracking prediction module 13 uses a color histogram combined with the CamShift method and the model description of the tracked target to predict the target state in the current frame from the target state in the previous frame or several previous frames; tracking prediction is confined to the tracking restricted area. The concrete procedure is described under "tracking prediction" in the embodiments below.
The prediction verification module 14 extracts the verification features from the target image corresponding to the predicted current-frame state and compares them with the verification features of the appearance model of the gesture; if the similarity is within a set range, tracking is considered successful, otherwise it is considered failed. The concrete procedure is described under "verification of the prediction" in the embodiments below.
The local detection module 15 determines a tracking restricted area according to the target state of the previous frame (since hand motion is continuous, sudden large jumps in position do not occur in normal operation, so the detection region can be reduced and computational efficiency improved) and detects gestures other than the tracked gesture within it. This serves both to determine whether the hand shape has switched and to improve the accuracy of gesture classification during tracking. The concrete procedure is described under "setting of the tracking restricted area" and the related sections in the embodiments below.
In the present embodiment, the tracking restricted area set during tracking by the local detection module 15 reduces the tracking range, which both reduces the amount of image processing and effectively suppresses the large-scene background interference that global tracking would introduce.
The model update module 16 updates, according to the result of the tracking process, the gesture type, state, and visibility recorded by the tracker in the tracker initialization module 12, and updates the appearance model 11 of the gesture. Its concrete implementation is described under "update of the target model" in the embodiments below.
In the present embodiment, after tracking is lost, tracking continues from the most recent state, reducing the tracking terminations caused by momentary target loss and making the whole operation more efficient and fluent.
As shown in Fig. 2, the flowchart of the gesture tracking method of the present invention applied to a video stream shows the alternation between tracking and the detection used for initialization.
In step S201, the image acquisition module 20 acquires a video image.
In step S202, the gesture detection module 30 searches for a specific gesture within the detection restricted area.
In step S203, the gesture detection module 30 judges whether a specific gesture has been detected. If no gesture is detected, the process returns to step S201 to continue acquiring video images; if a specific gesture is detected, step S204 is executed and the process enters the steps performed by the gesture tracking system 10.
In step S204, the tracker initialization module 12 initializes the tracker information and the appearance model of the gesture.
In the present embodiment, this initialization comprises extracting image features from the target image and initializing the appearance model, that is, the template used for tracking prediction and the image features used for prediction verification, while also initializing the gesture class, state (size and position), and visibility in the tracker.
In step S205, the image acquisition module 20 continues to acquire video images.
In step S206, the gesture tracking system 10 tracks the gesture according to the current tracker information and the appearance model of the tracked gesture. The algorithm of the tracking module is detailed in Fig. 3.
In the present embodiment, the appearance model 11 of the gesture contains the set Ωp of image description modes used for tracking and matching, and the set Ωv of feature description modes used to verify the prediction.
In step S207, the gesture tracking system 10 judges whether the target has permanently disappeared. If it has, the process returns to step S201; otherwise it returns to step S205.
In the present invention, the visibility of the target takes three states: "visible", "temporarily lost", and "permanently lost"; see the discussion of visibility below. In the present embodiment, permanent disappearance means that the target has remained in the temporarily-lost stage for a certain time, or has not appeared in a number of consecutive frames thereafter.
As shown in Fig. 3, the flowchart shows single-frame processing in the gesture tracking method of the present invention.
In step S301, the appearance model of the gesture and the tracker information are first initialized from the detection result.
In step S302, the image acquisition module 20 acquires a video image.
In step S303, whether the previous frame was tracked successfully is judged from the information recorded by the tracker. If it was, step S304 is executed; otherwise, step S309 is executed.
In step S304, the tracking restricted area is set according to the state of the previous frame.
In step S305, the tracking prediction module 13 predicts the gesture state of the current frame within the tracking restricted area.
In step S306, the prediction verification module 14 verifies the prediction and determines whether it is valid. If it is, the appearance model is updated gradually.
In step S307, the model update module 16 updates the tracker information and the appearance model of the tracked target according to the tracking and detection results, and identifies the gesture state of the current frame.
In step S308, the gesture tracking system 10 judges the current visibility of the target, providing the basis for the permanent-loss decision in Fig. 2.
In step S309, the gesture tracking system 10 sets a larger tracking restricted area according to the last successfully tracked state.
In step S310, the local detection module 15 detects gestures other than the tracked gesture within the tracking restricted area and supplies the detection result to step S307 as a basis for the update.
In summary, the model of the tracked target is first initialized. For each subsequent video image, a tracking restricted area is determined from the target's position in the previous frame; within this region, the current state of the tracked target is predicted from the tracking model on the one hand, and a sliding window is used to detect other predefined gestures that may appear on the other; the target model is then corrected and updated with the tracking and detection results. After tracking and detection are complete for each frame, the target's visibility is judged: if prediction verification fails and no new gesture is detected in the local area, the target enters the "temporarily lost" state; if the target is "temporarily lost", the last successfully tracked state is carried forward as the current state, and tracking continues from it for several subsequent frames; if the target remains "temporarily lost" for several consecutive frames, it enters the "permanently lost" state; if the target is "permanently lost", the tracking module is exited and the tracking initialization stage is re-entered, in which the offline-trained gesture detection module detects the predefined gestures. The judgment of target visibility is described under "visibility of the target" in the embodiments below.
The model design, workflow, and functional modules of the gesture tracking system are detailed below:
(1) Design of the appearance model of the gesture
The appearance model of the gesture is the basis of target tracking; it records a characterization of the target's attributes, which serve on the one hand as the standard of similarity measurement during tracking and on the other as the benchmark for verifying the prediction. Description modes commonly used for the target image in gesture tracking include:
(a) descriptions based on geometric features, such as region features, contour, curvature, and convexity;
(b) descriptions based on histograms, such as the color histogram, texture histogram, and gradient orientation histogram;
(c) descriptions based on the skin-color membership image;
(d) descriptions based on pixel/super-pixel contrast, such as point-pair features and Haar/Haar-like features.
In general, the description modes used for prediction verification differ from those used for prediction. Let Ωp be the set of description modes used for prediction and Ωv the set used for verification. In the implementation example of the present system, Ωp contains the color histograms of the H and S channels in HSV space, and Ωv contains a block-wise LBP histogram representation and a block-wise gradient orientation histogram representation.
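As a rough illustration of how the two description-mode sets might be held together and gradually updated after a verified prediction, the following sketch blends histogram templates with a plain convex combination. The structure, the names, and the blend rate are all assumptions; the patent does not specify the update rule.

```python
# Hedged sketch: templates built from Omega-p drive prediction, templates
# from Omega-v drive verification. The convex blend below is one simple way
# to realize a "gradual" template update; alpha is an assumed rate.

def blend(old, new, alpha=0.1):
    """Pull a histogram template gradually toward a fresh observation."""
    return [(1.0 - alpha) * o + alpha * n for o, n in zip(old, new)]

class AppearanceModel:
    def __init__(self, omega_p, omega_v):
        self.omega_p = omega_p   # e.g. {"hs_hist": [...]}, for prediction
        self.omega_v = omega_v   # e.g. {"lbp_hist": [...], "grad_hist": [...]}

    def gradual_update(self, new_p, new_v, alpha=0.1):
        """Would be applied only after a verified prediction."""
        self.omega_p = {k: blend(v, new_p[k], alpha)
                        for k, v in self.omega_p.items()}
        self.omega_v = {k: blend(v, new_v[k], alpha)
                        for k, v in self.omega_v.items()}
```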
(2) Tracking target initialization
The tracking target is initialized by gesture detection: when the target is detected in a predefined region of the image or in the whole image, features are extracted from the target image to describe the target's attributes, serving as the basis for prediction matching and prediction verification in the subsequent tracking stage.
Gesture detection at this stage may run over the whole image or over a particular local region. To reduce the search range and increase detection speed, and considering that the user generally stands directly in front of the camera to operate, the present invention detects within a specific region, which may for example be set to the middle 1/4 part of the image. The benefits of setting this specific region are:
(a) It matches natural operating habits: when operating a smart TV, the user generally stands directly in front of the screen (camera), first raises the hand to some comfortable position P, and only then begins a gesture; the tracking start position in the user's mind is therefore P, not some position passed through while the hand is being raised. Detecting within the specific region thus helps achieve correct initialization and matches normal operating habits.
(b) It reduces the false detection rate: the specific detection region greatly reduces the area searched, effectively suppressing the interference of complex and dynamic backgrounds, facilitating the operation of the intended user, and suppressing the interference of other users and of unintentional gestures.
(c) It improves initialization quality: if initialization occurred while the hand was being raised, the motion blur caused by rapid motion could reduce the accuracy of the initialized target model and harm subsequent tracking; detecting within the specific region effectively suppresses this situation.
(d) Detecting within a small region significantly improves the efficiency and accuracy of detection.
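One reading of the "middle 1/4 part of the image" mentioned above is a centered rectangle of half the width and half the height; that interpretation is an assumption, and under it the specific detection region can be computed as:

```python
def central_detection_region(frame_w, frame_h):
    """Centered rectangle covering 1/4 of the frame area (half the width,
    half the height) - one reading of 'middle 1/4 part of the image'."""
    w, h = frame_w // 2, frame_h // 2
    x, y = (frame_w - w) // 2, (frame_h - h) // 2
    return x, y, w, h
```

For a 640x480 camera frame this gives the rectangle (160, 120, 320, 240), in which the initialization detector would search.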
The present invention addresses the tracking of a single gesture: whenever the system is not executing a tracking task (for example, it has just started, or a tracking task has ended), gesture detection runs until a new tracking target is found. The initialization stage may detect several predefined gestures or only certain specific ones, depending on the needs of the application. For example, when dynamic gestures are recognized from the motion trajectory alone, a single gesture class suffices, which increases detection efficiency without affecting the application; if dynamic-gesture recognition also depends on the hand shape during tracking, the initialized gesture class affects the recognition result, and several gestures may need to be detected. One implementation of the present system detects only the closed palm shown in Fig. 5.
The detection method used for initialization may combine skin-color information with the motion or texture information of the gesture. Common methods are:
(a) finding candidate gesture regions by segmentation and recognizing the hand shape by analyzing the geometry of the candidate regions;
(b) detecting with a sliding-window method combined with appearance features such as LBP histograms, Haar features, and point-pair features.
In one implementation of the present system, Haar features are extracted from sample data offline, and an AdaBoost classifier is trained for each gesture to distinguish gesture from non-gesture; in the target initialization stage, this classifier is combined with the sliding-window method to detect that gesture class.
(3) Setting of the tracking restricted area
Based on the continuity of target motion, the region where the target may appear in the current frame is estimated from the target's state at the previous moment, and the search for the best match with the model is confined to this region; in practice, under normal circumstances the target's position does fall within this tracking restricted area. This approach not only greatly reduces the search region and improves tracking efficiency, but also, by avoiding matching at unnecessary positions, helps suppress drift and false matches in target tracking.
The setting of this region also implicitly reminds the user not to move the gesture too fast, avoiding the camera blur caused by overly fast motion and the resulting tracking failure.
In the implementation of the present system, we tested a color histogram + CamShift tracking scheme and an LBP histogram + particle filter tracking scheme, and verified that adding the tracking restricted area suppresses false matches on skin-colored regions such as the face, neck, and arms.
As shown in Fig. 4, the inner box marks the target gesture state (the position and size of the hand) obtained by tracking in the current frame, and the outer box marks the tracking restricted area determined from that state; the target-state prediction for the next frame is performed only within this restricted area.
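A simple way to derive such a restricted area from the previous frame's state is to scale the target box about its center and clip it to the frame; the scale factor below is an assumption, since the patent does not give one.

```python
def tracking_restricted_area(state, frame_w, frame_h, scale=2.5):
    """Expand the previous target box (x, y, w, h) about its center by
    'scale' and clip to the frame; prediction then searches only here."""
    x, y, w, h = state
    cx, cy = x + w / 2.0, y + h / 2.0
    rw, rh = w * scale, h * scale
    x0 = max(0.0, cx - rw / 2.0)
    y0 = max(0.0, cy - rh / 2.0)
    x1 = min(float(frame_w), cx + rw / 2.0)
    y1 = min(float(frame_h), cy + rh / 2.0)
    return int(x0), int(y0), int(x1 - x0), int(y1 - y0)
```

For temporary loss (step S309), the same function would simply be called with a larger `scale`.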
(4) Tracking prediction
Tracking prediction is the process of estimating the current state of the target from the model of the tracked target and the target's state in the previous frame or several previous frames. Several practical fast prediction methods are listed here:
(a) represent the distribution of the target's pixel values with a color histogram, compute the back-projection image P of the source image from this histogram, and run CamShift tracking on P;
(b) compute a skin-color membership map P from a skin-color model, where the value of P at a pixel is the probability that the pixel belongs to skin, and run CamShift tracking on P;
(c) use the source image, block-wise LBP histograms, block-wise gradient orientation histograms, Haar features, etc. as the image description, and track with a particle filter;
(d) select points on the image (random points, the nodes of a uniform grid, or detected feature points such as Harris corners or SIFT/SURF keypoints), track them by optical flow, and jointly analyze the tracking results to obtain the target's state.
The present invention uses the color histogram combined with the CamShift prediction mechanism: during tracking, for each new video image, the back-projection image of the region corresponding to the tracking restricted area is computed from the color histogram in the model, and the best match is found in this back-projection image with the CamShift scheme.
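The back-projection and CamShift-style search can be illustrated on a hue image (values 0 to 179, as in OpenCV's HSV convention): each pixel is replaced by its model-histogram value, and the search window is shifted to the probability centroid inside it. The sketch below uses plain lists and a fixed-size window (a single mean-shift step only); full CamShift would iterate and also adapt the window size and orientation, and the bin count is an assumption.

```python
NBINS = 16  # assumed hue quantization

def back_project(hue_img, hue_hist):
    """Replace each hue pixel (0-179) with its model-histogram value."""
    return [[hue_hist[p * NBINS // 180] for p in row] for row in hue_img]

def mean_shift_step(prob, window):
    """Move the window so it is centered on the probability centroid."""
    x, y, w, h = window
    m00 = m10 = m01 = 0.0
    for j in range(y, y + h):
        for i in range(x, x + w):
            p = prob[j][i]
            m00 += p
            m10 += p * i
            m01 += p * j
    if m00 == 0.0:
        return window                      # nothing to lock onto
    cx, cy = m10 / m00, m01 / m00
    return int(cx - w / 2.0), int(cy - h / 2.0), w, h
```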
(5) Verifying the prediction result
Tracking-prediction algorithms essentially search, among all candidate states within a certain region, for the one that best matches the model; in other words, a series of candidate states is generated from this region in some way and the best match S is chosen from them. This best match, however, is not necessarily the true target state, so it must be verified; this is what the present invention calls prediction verification.
According to the set of describing modes Ω_v used for verification in the object model, features are extracted from the target image corresponding to state S and compared against the corresponding reference descriptions in the model. If the similarity lies within a certain range, tracking is considered successful; otherwise it is considered failed. This scheme rests on the assumption that the true target state should agree with the reference image in multiple attributes. The prediction verification stage may find that the tracking prediction is invalid; when verification of the prediction result fails, the target is considered to have entered the "temporarily lost" state.
In the prediction scheme adopted in the present invention (color histogram + cam-shift), two image describing modes are used for prediction verification: the block-wise LBP histogram and the contour HOG histogram. The current state is considered a tracking success only if it agrees with the model under both describing modes; otherwise tracking is considered failed.
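A sketch of this two-descriptor acceptance test. The histogram-intersection similarity and the 0.7 thresholds are illustrative assumptions; the patent only requires that both descriptors agree with the model within some range.

```python
import numpy as np

def hist_intersection(h1, h2):
    """Similarity of two L1-normalized histograms, in [0, 1]."""
    return float(np.minimum(h1, h2).sum())

def verify_prediction(lbp_model, lbp_obs, hog_model, hog_obs,
                      lbp_thresh=0.7, hog_thresh=0.7):
    """Accept the predicted state only if BOTH the block-wise LBP
    histogram and the contour HOG histogram match the model."""
    return (hist_intersection(lbp_model, lbp_obs) >= lbp_thresh and
            hist_intersection(hog_model, hog_obs) >= hog_thresh)
```

If `verify_prediction` returns False, the tracker would mark the target "temporarily lost" rather than accept a drifted state.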
(6) Local detection
In dynamic hand-target tracking, the system must not only obtain the position of the moving hand by tracking, but also recognize the gesture shape in every frame of the process.
Many systems recognize the static gesture during tracking by classifying the image region corresponding to the predicted state S, but this has two problems: (a) when tracking drifts gradually, the image region corresponding to S no longer coincides exactly with the true gesture region; it may, for example, be centered on the wrist and contain the hand together with part of the arm, and recognition on such a region yields mostly specious results; (b) even when tracking is correct, a single classification of the image corresponding to S still carries a fairly high probability of error.
In view of this, the present invention proposes detecting the predefined gesture types other than the tracked gesture within the above tracking restricted area, using multi-scale sliding-window detection. For each gesture class, the detected target windows are clustered into several clusters; among the windows of all gesture clusters, the one with the highest confidence is selected, and the gesture position and type corresponding to it are computed and output as the detection result. If no class yields any target window, or clustering produces no satisfactory cluster, local detection outputs nothing in the current frame.
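The window-clustering step can be sketched as below. The greedy IoU grouping, the 0.3 overlap threshold, and the minimum cluster support of 3 are assumptions for illustration; the patent specifies clustering and highest-confidence selection but not the exact algorithm.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) windows."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def cluster_and_select(detections, iou_thresh=0.3, min_support=3):
    """detections: list of ((x, y, w, h), confidence, gesture_class).
    Greedily cluster overlapping windows, keep clusters with enough
    support, and return the most confident window overall, or None."""
    clusters = []
    for det in sorted(detections, key=lambda d: -d[1]):
        for cluster in clusters:
            if iou(det[0], cluster[0][0]) >= iou_thresh:
                cluster.append(det)
                break
        else:
            clusters.append([det])
    supported = [c for c in clusters if len(c) >= min_support]
    if not supported:
        return None  # local detection has no output this frame
    # clusters were filled in descending confidence, so c[0] is each
    # cluster's best window
    return max(supported, key=lambda c: c[0][1])[0]
```

A `None` result corresponds to the "no output" case below, where the current frame keeps the gesture type recorded in the tracked model.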
If local detection outputs nothing, the gesture classification of the current frame is the gesture type recorded in the tracked model. If there is an output, the gesture type is considered to have changed during tracking: the classification result is the gesture class output by detection, the gesture attitude recorded in the tracking model is assigned the detection result, and the tracking model is reinitialized from this detection result.
Using sliding-window detection results for classification improves accuracy because this process produces a large number of windows containing the target gesture, and the confidence of repeated classification is higher than that of a single classification.
The beneficial effects of this method are as follows:
(a) It improves the precision of static-gesture classification during tracking;
(b) It handles gesture mutation, that is, the tracking failure caused when the model has no time to learn an abrupt change of gesture;
(c) Compared with an online-learned model, the classifier used for detection is trained by supervised learning, so its confidence is high and false detections are less likely.
(7) Updating the object model
When tracking verification succeeds, the object model must be updated gradually so that it can adapt to slow changes in the target's appearance during motion. The update algorithm depends on the specific features used in the model and on the prediction and verification methods. In one implementation of the present invention, tracking prediction is done with a color histogram and verification with a block-wise LBP histogram and an edge gradient orientation histogram; these features are updated as follows:
H_c(i) = a·H_c(i) + (1 - a)·H_c^t(i),  i = 1, ..., N_c
H_l(j) = b·H_l(j) + (1 - b)·H_l^t(j),  j = 1, ..., N_l
H_e(k) = g·H_e(k) + (1 - g)·H_e^t(k),  k = 1, ..., N_e
where H_c, H_l, H_e respectively denote the color histogram, the block-wise LBP histogram, and the edge gradient orientation histogram in the model representation; H_c^t, H_l^t, H_e^t denote the corresponding description histograms of the current-frame target image; N_c, N_l, N_e denote the dimension of each histogram; H_c(i) denotes the component in the i-th dimension of the histogram; and a, b, g are the update rates of the respective describing modes.
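In code, each of the three updates is one step of an exponential moving average; a minimal sketch (the function name is assumed):

```python
import numpy as np

def update_histogram(model_hist, frame_hist, rate):
    """H <- rate*H + (1 - rate)*H_t: the model histogram drifts slowly
    toward the current frame's description; a rate near 1 means slow
    adaptation."""
    return rate * np.asarray(model_hist) + (1.0 - rate) * np.asarray(frame_hist)
```

The same function serves the color, LBP, and edge histograms, each with its own rate (a, b, or g above).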
During tracking, if the gesture type does not change, the model is updated according to the scheme above; if the gesture class changes, the target is reinitialized according to the current tracking state.
Using the local detection and tracking results, the update rules for the object model are as follows:
(a) If the local detection stage successfully detects a target of another gesture class, a gesture mutation has occurred and the original model is completely invalid; the object-model parameters are then reinitialized from the detection result;
(b) If the local detection stage detects no other gesture target and tracking in the current frame is unsuccessful (the target is temporarily lost, or prediction verification shows the prediction has failed), the object model is not updated;
(c) If the current frame is tracked successfully, that is, the prediction of the current frame passes verification, the target model is updated gradually.
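Rules (a) through (c) amount to a three-way dispatch; a sketch, where the descriptor dictionaries and the 0.9 blend rate are illustrative assumptions:

```python
def update_object_model(model, detection, track_ok, frame_desc, rate=0.9):
    """model / frame_desc map descriptor names ('color', 'lbp', 'edge')
    to histogram values; detection is a replacement model produced by
    local detection, or None."""
    if detection is not None:
        return dict(detection)   # (a) gesture mutation: re-initialize
    if not track_ok:
        return model             # (b) tracking failed: leave model frozen
    return {name: rate * model[name] + (1.0 - rate) * frame_desc[name]
            for name in model}   # (c) tracking verified: gradual blend
```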
(8) Target visibility
The present system divides target visibility into three states: "visible", "temporarily lost", and "permanently lost". "Visible" means the target is tracked in the current frame and passes prediction verification. If, at some frame of the tracking stage, verification of the prediction result fails, tracking has failed for that frame and the target enters the "temporarily lost" stage. In this stage, a tracking restricted area can still be determined from the last successfully tracked state, and local detection and tracking continue within it; during this period the target state may return to "visible" under either of two conditions: (a) the target is tracked again, or (b) local detection finds one of the predefined gestures. Otherwise, if the target remains temporarily lost for a certain time, its state changes from "temporarily lost" to "permanently lost"; the object model is then destroyed, the tracking process stops, and the system re-enters the initialization detection stage.
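The three-state visibility logic can be sketched as a per-frame transition function; the 30-frame budget standing in for "a certain time" is an assumption:

```python
VISIBLE, TEMP_LOST, PERM_LOST = "visible", "temporarily lost", "permanently lost"

def next_visibility(tracked_ok, detected_ok, frames_lost, max_lost=30):
    """One per-frame transition: re-acquisition by tracking or by local
    detection restores 'visible'; too many lost frames gives up."""
    if tracked_ok or detected_ok:
        return VISIBLE        # condition (a) or (b): target re-acquired
    if frames_lost >= max_lost:
        return PERM_LOST      # destroy the model, restart detection
    return TEMP_LOST          # keep predicting inside the restricted area
```

When PERM_LOST is returned, the system would destroy the object model and re-enter the initialization detection stage.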
Results and beneficial effects of the present invention
The gesture tracking method and system were tested on the Android platform of smart-TV hardware with the following configuration: a processor clocked at 700 MHz, 200 MB of system memory, and an ordinary WEB camera connected over a USB interface for video capture, with the video image displayed in the upper-left corner of the TV. The experiments confirm the following beneficial effects:
(1) Fast processing and good real-time performance. Based on the continuity of gesture motion, the gesture recognition system of the present invention sets a tracking restricted area, which narrows the scope of tracking prediction; detection is performed only in this local region, which reduces the number of sliding windows and improves efficiency on the embedded platform. Experiments show that after tracking is initialized, the television system described above runs the whole pipeline, including tracking prediction, local detection, and model update, at a speed of 30 ms/frame.
(2) Good stability and strong robustness of the tracker. The tracking restricted area is adjusted in real time during tracking, which reduces matching against irrelevant background regions; tracking prediction continues while the target is temporarily lost; and detection in the local region solves the object-model replacement problem when the gesture mutates. Together these ensure the stability and robustness of the tracker and avoid the tracking interruptions that hand deformation, environmental interference, and similar factors cause in existing methods.
(3) High gesture recognition precision during tracking. Because sliding-window detection is performed in a local region, a gesture that is present produces many windows containing it, so a large number of successful detections corroborate its existence. Compared with systems that classify the target image corresponding to the tracking state only once, the probability of correct classification is greatly improved. Experiments show that when arbitrary switching among the four gestures shown in Figure 5 is allowed during tracking, the recognition accuracy exceeds 99%.
(4) The method is based on an ordinary camera and works in real time through image recognition; the user needs no extra auxiliary equipment to wear, no expensive 3D scanning device is required, and hardware cost is not increased.
Notes on the use of the present invention
The embodiment of the present invention is a smart-TV system, but the method can also be used in other smart appliances. For example: on a mobile phone, the phone camera captures the hand-motion picture and hand motion controls the cursor on the phone screen; in an air conditioner, gesture tracking lets hand motion control the wind direction; on a PC platform, tracking the hand target provides motion control of the screen mouse. In addition, based on the tracked motion trajectory, other kinds of interactive operation can be carried out through trajectory recognition technology.
The foregoing are only preferred embodiments of the present invention and do not thereby limit its scope of claims; every equivalent structure or flow transformation made using the contents of the specification and drawings of the present invention, or its direct or indirect use in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (9)

1. A gesture tracking method, characterized in that it comprises the steps of:
designing the apparent model of the gesture, comprising the image describing modes for tracking prediction and prediction verification: the set of describing modes for prediction is denoted Ω_p and the set of describing modes for verification Ω_v; Ω_p contains the color histogram of the H and S channels in HSV space, and Ω_v contains a block-wise LBP histogram representation and a block-wise gradient orientation histogram representation;
obtaining the initial state of the target by gesture detection, namely the position and size information of the target;
initializing the tracker of the target according to said initial state, comprising initializing the apparent model, namely initializing the image description templates used for tracking prediction and prediction verification, and initializing the class, state, and visibility of the tracked gesture recorded by the tracker, wherein the state comprises position and size information;
making a final estimate of the state and visibility of the target through the tracking process, according to said tracker information;
updating the tracker information and the apparent model of the tracked target according to the tracking and detection results, wherein tracking prediction is done with the color histogram and verification with the block-wise LBP histogram and the edge gradient orientation histogram:
H_c(i) = a·H_c(i) + (1 - a)·H_c^t(i),  i = 1, ..., N_c;
H_l(j) = b·H_l(j) + (1 - b)·H_l^t(j),  j = 1, ..., N_l;
H_e(k) = g·H_e(k) + (1 - g)·H_e^t(k),  k = 1, ..., N_e;
wherein H_c, H_l, H_e respectively denote the color histogram, the block-wise LBP histogram, and the edge gradient orientation histogram in the model representation; H_c^t, H_l^t, H_e^t denote the corresponding description histograms of the current-frame target image; N_c, N_l, N_e denote the dimension of each histogram; H_c(i) denotes the component in the i-th dimension of the color histogram; and a, b, g are the update rates of the respective describing modes;
judging the visibility of the target, wherein, when the target is temporarily lost, the tracking process is not stopped immediately; instead a larger tracking restricted area is set according to the state of the previous frame, and the tracking process continues within this restricted area for some subsequent frames; and wherein, if the target is permanently lost, detection must be restarted to obtain a new tracking target, and otherwise tracking continues.
2. The gesture tracking method of claim 1, characterized in that it further comprises the step of: setting a tracking restricted area R according to the target state of the previous frame, for tracking the target in the current frame.
3. The gesture tracking method of claim 2, characterized in that it further comprises the step of:
confining the operations of the tracking process, comprising prediction, verification, and local detection, to said tracking restricted area R.
4. The gesture tracking method of claim 3, characterized in that it further comprises the step of:
detecting, within said tracking restricted area R, gestures other than the tracked gesture, for updating the apparent model of the gesture when the gesture mutates.
5. The gesture tracking method of claim 3, characterized in that, when the local detection result shows that the gesture class has changed, the original gesture model is discarded and the tracker information and apparent model are reinitialized from the detection result.
6. The gesture tracking method of claim 3, characterized in that, in the step of predicting the target state, a color histogram combined with cam-shift is used to predict the target state in the current frame from the target state of the previous frame or of several previous frames.
7. The gesture tracking method of claim 3, characterized in that, in the step of verifying the prediction result, two describing modes are used: the block-wise LBP histogram and the edge gradient orientation histogram.
8. The gesture tracking method of claim 1, characterized in that it further comprises the step of:
updating the information of said tracker according to the result of said tracking process, comprising updating the apparent model, and updating the gesture type, state, and visibility recorded by the tracker.
9. A gesture tracking system, applied in a system platform provided with an image acquisition module and a gesture detection module, characterized in that the gesture tracking system comprises:
an apparent model of the gesture, comprising the image describing modes for tracking prediction and prediction verification: the set of describing modes for prediction is denoted Ω_p and the set of describing modes for verification Ω_v; Ω_p contains the color histogram of the H and S channels in HSV space, and Ω_v contains a block-wise LBP histogram representation and a block-wise gradient orientation histogram representation;
a tracker initialization module, for using said gesture detection module to detect predefined gestures and, when a gesture of some class is detected, initializing the tracker, comprising initializing the apparent model and initializing the gesture class, state, and visibility recorded in the tracker;
a tracking prediction module, for predicting the target state in the current frame from the target state of the previous frame or of several previous frames, in combination with the apparent-model description of the gesture;
a prediction verification module, for extracting, from the target image corresponding to the predicted current-frame state, the features used for prediction verification, comparing them with the prediction-verification image features in the apparent model of the gesture, and determining whether the prediction result is valid;
a local detection module, for determining the tracking restricted area according to the target state of the previous frame, and detecting gestures other than the tracked gesture;
a model update module, for updating, according to the result of said tracking process, the gesture type, state, and visibility recorded by the tracker in said tracker initialization module, and updating the apparent model of the gesture, wherein tracking prediction is done with the color histogram and verification with the block-wise LBP histogram and the edge gradient orientation histogram:
H_c(i) = a·H_c(i) + (1 - a)·H_c^t(i),  i = 1, ..., N_c;
H_l(j) = b·H_l(j) + (1 - b)·H_l^t(j),  j = 1, ..., N_l;
H_e(k) = g·H_e(k) + (1 - g)·H_e^t(k),  k = 1, ..., N_e;
wherein H_c, H_l, H_e respectively denote the color histogram, the block-wise LBP histogram, and the edge gradient orientation histogram in the model representation; H_c^t, H_l^t, H_e^t denote the corresponding description histograms of the current-frame target image; N_c, N_l, N_e denote the dimension of each histogram; H_c(i) denotes the component in the i-th dimension of the color histogram; and a, b, g are the update rates of the respective describing modes.
CN201210290337.1A 2012-08-15 2012-08-15 Gesture tracking method and system Active CN102831439B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210290337.1A CN102831439B (en) 2012-08-15 2012-08-15 Gesture tracking method and system


Publications (2)

Publication Number Publication Date
CN102831439A CN102831439A (en) 2012-12-19
CN102831439B true CN102831439B (en) 2015-09-23

Family

ID=47334567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210290337.1A Active CN102831439B (en) 2012-08-15 2012-08-15 Gesture tracking method and system

Country Status (1)

Country Link
CN (1) CN102831439B (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103366188B (en) * 2013-07-08 2017-07-07 中科创达软件股份有限公司 It is a kind of to be detected as the gesture tracking method of auxiliary information based on fist
CN103353935B (en) * 2013-07-19 2016-06-08 电子科技大学 A kind of 3D dynamic gesture identification method for intelligent domestic system
CN103413323B (en) * 2013-07-25 2016-01-20 华南农业大学 Based on the object tracking methods of component-level apparent model
CN103413120B (en) * 2013-07-25 2016-07-20 华南农业大学 Tracking based on object globality and locality identification
CN104424634B (en) * 2013-08-23 2017-05-03 株式会社理光 Object tracking method and device
KR101534742B1 (en) * 2013-12-10 2015-07-07 현대자동차 주식회사 System and method for gesture recognition of vehicle
CN103677541A (en) * 2013-12-25 2014-03-26 乐视网信息技术(北京)股份有限公司 Method and system for setting position of moving cursor in display page with links
EP3811891A3 (en) * 2014-05-14 2021-05-05 Stryker European Holdings I, LLC Navigation system and processor arrangement for tracking the position of a work target
CN104159040B (en) * 2014-08-28 2019-07-05 努比亚技术有限公司 Image pickup method and filming apparatus
CN104731323B (en) * 2015-02-13 2017-07-04 北京航空航天大学 A kind of gesture tracking method of many direction of rotation SVM models based on HOG features
CN106469293A (en) * 2015-08-21 2017-03-01 上海羽视澄蓝信息科技有限公司 The method and system of quick detection target
CN105528078B (en) * 2015-12-15 2019-03-22 小米科技有限责任公司 The method and device of controlling electronic devices
CN107292223A (en) * 2016-04-13 2017-10-24 芋头科技(杭州)有限公司 A kind of online verification method and system of real-time gesture detection
CN106920251A (en) 2016-06-23 2017-07-04 阿里巴巴集团控股有限公司 Staff detecting and tracking method and device
CN106875428A (en) * 2017-01-19 2017-06-20 博康智能信息技术有限公司 A kind of multi-object tracking method and device
CN109697394B (en) * 2017-10-24 2021-12-28 京东方科技集团股份有限公司 Gesture detection method and gesture detection device
CN108108709B (en) * 2017-12-29 2020-10-16 纳恩博(北京)科技有限公司 Identification method and device and computer storage medium
CN108108707A (en) * 2017-12-29 2018-06-01 北京奇虎科技有限公司 Gesture processing method and processing device based on video data, computing device
CN108491767B (en) * 2018-03-06 2022-08-09 北京因时机器人科技有限公司 Autonomous rolling response method and system based on online video perception and manipulator
CN110291775B (en) * 2018-05-29 2021-07-06 深圳市大疆创新科技有限公司 Tracking shooting method, device and storage medium
CN109101872B (en) * 2018-06-20 2023-04-18 济南大学 Method for generating 3D gesture mouse
CN109145793A (en) * 2018-08-09 2019-01-04 东软集团股份有限公司 Establish method, apparatus, storage medium and the electronic equipment of gesture identification model
CN109613930B (en) * 2018-12-21 2022-05-24 中国科学院自动化研究所南京人工智能芯片创新研究院 Control method and device for unmanned aerial vehicle, unmanned aerial vehicle and storage medium
CN109960403A (en) * 2019-01-07 2019-07-02 西南科技大学 For the visualization presentation of medical image and exchange method under immersive environment
CN110095111A (en) * 2019-05-10 2019-08-06 广东工业大学 A kind of construction method of map scene, building system and relevant apparatus
CN110493618A (en) * 2019-08-02 2019-11-22 广州长嘉电子有限公司 Android method for intelligently controlling televisions and system based on USB3.0 interface
CN111061367B (en) * 2019-12-05 2023-04-07 神思电子技术股份有限公司 Method for realizing gesture mouse of self-service equipment
CN112132017B (en) * 2020-09-22 2024-04-02 广州方硅信息技术有限公司 Image processing method and device and electronic equipment
CN113989611B (en) * 2021-12-20 2022-06-28 北京优幕科技有限责任公司 Task switching method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102216957A (en) * 2008-10-09 2011-10-12 埃西斯创新有限公司 Visual tracking of objects in images, and segmentation of images
WO2012103874A1 (en) * 2011-02-04 2012-08-09 Eads Deutschland Gmbh Camera system for recording and tracking remote moving objects

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Sun Xiaoxiao, Jia Qiuling. Research on tracking of deformable targets. Modern Electronics Technique, 2011-12-15, Vol. 34, No. 24, pp. 130-131 *

Also Published As

Publication number Publication date
CN102831439A (en) 2012-12-19


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant