CN107357427A - A gesture recognition control method for a virtual reality device - Google Patents
A gesture recognition control method for a virtual reality device
- Publication number
- CN107357427A (application CN201710535118.8A)
- Authority
- CN
- China
- Prior art keywords
- hand
- virtual reality
- depth
- gesture
- reality device
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2203/00—Indexing scheme relating to G06F3/00 - G06F3/048
- G06F2203/01—Indexing scheme relating to G06F3/01
- G06F2203/012—Walk-in-place systems for allowing a user to walk in a virtual environment while constraining him to a given position in the physical environment
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Software Systems (AREA)
- Human Computer Interaction (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The invention discloses a gesture recognition control method for a virtual reality device. The virtual reality device includes multiple 3D depth cameras and a controller. The method includes establishing a skeleton model and a three-dimensional model of the hand; tracking the hand with the depth cameras and optimizing the hand skeleton model, which in turn drives the deformation of the three-dimensional model; and, from the deformation information of the three-dimensional model, detecting in real time the interactions between the hand and the environment, thereby realizing interaction between the user and the virtual environment. The method of the invention reduces the amount of computation in gesture control of a virtual reality device, improves the precision of gesture understanding, and achieves fast, lifelike interaction between the user and the virtual environment.
Description
Technical field
The invention belongs to the technical field of virtual reality, and in particular relates to a gesture recognition control method for a virtual reality device.
Background technology
In virtual maintenance applications, human-machine interactivity is particularly important. Vision-based somatosensory interaction is one of the key technologies of human-machine interaction and one of the most natural means of interacting with a computer: it uses the position, orientation, and posture of the human hand and body as computer input, merging the input space with the output space and making virtual and real objects behave consistently. Somatosensory interaction has therefore increasingly become a key technology for entering interactive commands in virtual maintenance.
Gesture interaction is a very important part of somatosensory interaction. Research on gesture analysis and interaction dates back to the 1980s. Grimes obtained the "digital data glove" patent at AT&T in 1983; a data glove can directly provide spatio-temporal information such as hand joint shape, position, and motion trajectory. Early gesture and sign-language recognition research was typically based on data gloves, but because a wearable device is required, interaction is inconvenient and the hardware is expensive, so most researchers gradually shifted their focus to vision-based gesture interaction. Since gesture is inherently a visual language, vision-based gesture interaction also better matches people's "what you see is what you get" cognitive expectations. To simplify the gesture detection and analysis problem, researchers used gloves with color markers, converting hand motion into the motion of color markers in an image sequence. But color gloves require special markers and are not natural enough, so they likewise failed to gain widespread adoption. With the rapid development of computer vision, researchers gradually focused on purely vision-based gesture analysis, i.e., capturing the posture and motion sequence images of the hand or fingers with one or more cameras, and then detecting and recognizing specific gestures with computer vision algorithms.
In the gesture understanding problem, the gesture appearance is a set of pixels in some image region. The appearance is affected by the hand's own motion (rotation, deformation, translation, etc.), by changes in imaging parameters (such as viewing angle and distance), and by changes in external conditions (such as illumination), and it varies as these factors vary. At the same time, as the hand moves, its background also keeps changing. How to exclude these uncertain and changing factors, quickly and accurately separate the hand from the video sequence (hand detection and tracking), effectively handle hand occlusion, and extract the parameters that characterize gesture appearance is one of the difficulties and important research topics of gesture understanding. Based on this, the present invention focuses on gesture capture and on the gesture understanding problem for fine interactive tasks on a virtual maintenance platform.
A major problem in gesture tracking is occlusion. Some work has handled occlusion explicitly. For example, Awad maintains the mutual occlusion states between the two hands and the face through occlusion early warning between any tracked object and a newly detected object. Gui et al. unified single-hand segmentation and subsequent gesture recognition in one framework, using an active contour method to handle occlusion; although this method recovers the target shape well after occlusion occurs, its computational cost is too high.
Methods based on 2D image appearance features are the most popular way to characterize the hand; under a uniform background, such methods can model the hand region effectively. However, research shows that no matter which gesture description is used, when vision-based methods process video data captured by 2D cameras, gesture detection, tracking, and subsequent gesture understanding are strongly affected by environmental illumination and background, and occlusion is difficult to handle.
Compared with the uncertainty of 2D appearance features, depth information provides richer data and effectively distinguishes foreground from background in a scene. To simplify the gesture analysis task and improve its precision, many researchers have incorporated depth information into gesture analysis and understanding tasks.
To obtain useful depth information, one approach is to use multiple cameras, i.e., stereo vision, to capture 3D body shape and motion. This approach extracts 2D image features (such as contour information or key feature points) from each camera to update the surface structure parameters of a 3D hand or body model, but because the target object lacks texture, it generally cannot provide sufficiently high resolution. Another approach is to directly use a 3D sensor that provides depth images: for example, using a stereo camera containing two lenses to reconstruct human depth information for body tracking, or using depth information to segment the hand. Compared with skin color, hand segmentation based on depth information is more accurate, but relying on a single depth cue alone can still cause segmentation errors due to interference from other objects within a similar depth range.
The required fineness of gesture understanding varies with the concrete gesture interaction task. For simple control gestures, a coarse model characterization is sufficient; but to complete natural human-machine interaction tasks of some complexity, a finer descriptive model is needed to ensure that each gesture mode has a certain degree of discriminability in the model space. From coarse to fine, gesture analysis can be divided into: simple isolated static gesture recognition, dynamic gesture recognition, and fine gesture model recovery (e.g., shape, orientation, position, and motion information).
A static gesture has only spatial attributes, and the number of predefined static gesture classes is usually small; therefore, on the basis of accurate gesture tracking, appearance-based gesture recognition methods can solve this problem relatively effectively. A dynamic gesture includes the spatio-temporal trajectory attributes of the gesture, and may also include attributes such as shape and position; but it usually treats the hand as a whole, does not need fine finger or joint details, and can generally be solved by gesture motion tracking. Here the present invention focuses on fine gesture model recovery and understanding methods.
To obtain model parameters directly usable for gesture understanding and interaction, a 3D hand (arm) model is usually used to describe the gesture, such as a volumetric model, mesh model, geometric model, or skeleton model. Among these, the skeleton model is the most common; its parameters are simplified joint angles and phalanx lengths. The physical properties of the human hand provide two classes of constraints for a 3D skeleton model, namely static constraints and dynamic constraints. To recover a 3D model from 2D images, considering the 2D-to-3D mapping, reference images of several known hand poses are sampled in advance, and machine learning or image retrieval techniques are used to find, in a pre-collected gesture database, the reference (template) image whose features are most similar to the input image. Modeling based on 3D models is accurate; if the parameters can be optimized, the various motions of the hand can be simulated precisely. But because the dimensionality of the parameter space is very high, obtaining the model parameters through vision techniques is relatively difficult and has high time complexity.
On the whole, vision-based gesture capture and understanding is an indispensable key technology for human-machine interaction in virtual reality applications. But so far, gesture understanding has found it hard to cross the distance from the laboratory to real commercial applications. To meet the urgent demand for gesture interaction technology in virtual reality applications, researchers still need breakthroughs on the key technologies and bottleneck problems of gesture capture and understanding, such as fast gesture detection and tracking, and accurate, occlusion-free 3D gesture understanding. Effectively combining the depth information and image information from depth sensors at multiple viewing angles to improve the speed and precision of gesture capture and understanding is a prerequisite for the successful application of gesture interaction in virtual reality devices, and an important research question in the field.
Content of the invention
The main purpose of the present invention is to improve fast and accurate gesture capture and understanding in a virtual reality environment, including the handling of two-hand occlusion in the gesture modeling process and the rendering of parameter-driven gesture animation, and to provide a new method that achieves higher-precision, more real-time gesture control.
Specifically, the present invention adopts the following technical scheme:
A gesture recognition control method for a virtual reality device, the virtual reality device including multiple 3D depth cameras and a controller, the multiple 3D depth cameras acquiring depth images of the hand from different directions, and the controller acquiring depth image information from the 3D depth cameras, tracking the hand according to the acquired information, recovering the hand model, and understanding the gesture, characterized in that the method includes: 1) selecting hand key points, establishing a skeleton model of the hand, establishing a three-dimensional model, and selecting feature points on the basis of this three-dimensional hand model; 2) tracking the hand with the multiple 3D depth cameras, acquiring initial pixel information of both hands, and matching the pixel information against the key points of the above skeleton model to obtain accurate initial key point positions; 3) continuing to track the hand, acquiring the coordinates of the key points, optimizing the skeleton model of the hand, obtaining hand appearance information and hand feature information from the acquired parameter information, applying them to the three-dimensional hand model, and controlling the deformation of the three-dimensional hand model; 4) according to the acquired deformation information of the three-dimensional hand model, detecting in real time the interactions between the hand and the environment, realizing interaction between the user and the virtual environment.
Preferably, establishing the hand skeleton model includes selecting hand key points, the key points including the palm center, the palm and finger joints, and the finger endpoints. Further, optimizing the hand skeleton model includes first obtaining relatively accurate key point positions as the initial values of the optimization: the palm is described by a circular feature and each finger by an elliptical feature, the finger joint points are obtained by equally subdividing the elliptical features, and then, with the obtained joint positions as initial values, a careful optimization is carried out by the analysis-by-synthesis method.
In one embodiment, establishing the three-dimensional model includes: a) scanning the hand to generate a hand point cloud; b) converting the point cloud into triangle data to obtain a triangular mesh; c) generating from the triangular mesh a free-form surface model with topology information.
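As a toy illustration of step b), a structured scan (an h-by-w grid of samples) can be stitched into a triangular mesh by splitting each grid cell into two triangles; real point-cloud meshing is of course far more involved, and this helper is our own, not from the patent:

```python
def grid_to_triangles(h, w):
    """Index triangles for an h-by-w grid of scan samples, two per cell."""
    tris = []
    for r in range(h - 1):
        for c in range(w - 1):
            i = r * w + c                           # top-left corner of the cell
            tris.append((i, i + 1, i + w))          # upper triangle
            tris.append((i + 1, i + w + 1, i + w))  # lower triangle
    return tris
```

A 3x3 grid yields 2x2 cells and therefore eight triangles.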
In a preferred scheme, the multiple 3D depth cameras tracking the hand are under time-sharing control: a shutter is placed in front of the infrared emitter of each 3D depth camera, and the shutters are controlled so that only the shutter of the currently operating depth camera is open, while the shutters of the other depth cameras are closed. Further, the depth data obtained under time-sharing control by the different depth cameras also undergoes space-time alignment.
In the method for the invention, methods described is tracked to both hands, that is, a kind of multiple target tracking.Preferably, hand
Portion's tracking is tracked using the multiple target tracking based on core tracking framework to both hands.
Beneficial effects: the invention discloses a gesture recognition control method for a virtual reality device. The virtual reality device includes multiple 3D depth cameras and a controller. The method includes establishing the skeleton and three-dimensional model of the hand; the depth cameras track the hand and the hand skeleton model is optimized, which in turn controls the deformation of the three-dimensional model; then, from the deformation information of the three-dimensional model, the interactions between the hand and the environment are detected in real time, realizing interaction between the user and the virtual environment. In virtual reality device applications, not only must the model parameters be recovered in real time, but the mutual occlusion between the two hands during interaction must also be handled. To solve this problem, the present invention uses multiple depth sensors to simplify hand segmentation and resolve hand occlusion, and carries out parameter optimization from good initial model parameter values, obtaining satisfactory model parameter precision while greatly reducing the complexity of model parameter recovery. The method of the present invention therefore reduces the amount of computation in gesture control of a virtual reality device, improves the precision of gesture understanding, and achieves fast, lifelike interaction between the user and the virtual environment.
Brief description of the drawings
Fig. 1 is a schematic diagram of the 3D-HOG computation during gesture tracking in the method of the invention;
Fig. 2 is a schematic layout of the multiple depth cameras in the virtual reality device of the invention;
Fig. 3 is a schematic diagram of the time alignment of the depth images obtained by the multiple depth cameras in the invention;
Fig. 4 is a schematic diagram of the hand skeleton model.
Embodiment
The solution of the present invention will be described in more detail below. For the content and key issues of virtual reality device applications mentioned above, we adopt the following technical route.
(1) Multi-target (two-hand) collaborative tracking based on a kernel tracking framework
The present invention solves the fast two-hand collaborative tracking problem within the kernel-based tracking framework. To achieve reliable tracking, two important factors must be considered: first, a robust description of hand region features; second, an objective function for the collaborative tracking optimization.
The 2D appearance of an image provides information characterizing texture and shape, while 3D depth information provides the distance to the camera of every visible point of each object in the scene. Therefore, the present invention starts from the texture and structural characteristics of the target, analyzes combinations of different spatial features, and forms a description from the features that best characterize the target at different scales. A relatively straightforward scheme would be to fuse position features, depth features (including gradient features in the depth dimension), color features on the 2D image, gradient features robust to illumination change, LBP features characterizing small-scale first-order gradient directions, and larger-scale HOG features. But because different features have different dimensionalities, direct concatenation of features of different dimensions should be avoided. Because HOG features characterize object edge directions excellently, the present invention combines them with 3D depth information and proposes a new 3D feature representation, the 3D-HOG feature, to characterize the hand target.
The essence of the HOG feature is a statistical count of the directions and strengths of image edges. In previous research, because the data was generally 2D image data, HOG features were also computed from image information alone. But since the front-end sensors used in this invention provide high-resolution depth data along with color images, we can incorporate the depth information to form a more robust target description. The 3D-HOG computation is shown in Fig. 1.
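A minimal sketch of the idea (our own simplification, not the patent's exact Fig. 1 pipeline): build an unsigned orientation histogram over a cell, but let the depth gradient magnitude reinforce the 2D gradient magnitude when weighting the bins:

```python
import numpy as np

def hog_cell(gray, depth, nbins=9):
    """Orientation histogram over one cell, weighted by a blend of the image
    gradient magnitude and the depth gradient magnitude (the '3D' extension)."""
    gy, gx = np.gradient(gray.astype(float))
    dy, dx = np.gradient(depth.astype(float))
    ang = np.arctan2(gy, gx) % np.pi                # unsigned orientation in [0, pi)
    mag = np.hypot(gx, gy) + np.hypot(dx, dy)       # fuse 2D and depth edge strength
    bins = (ang / np.pi * nbins).astype(int).clip(0, nbins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=nbins)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist
```

A full descriptor would tile the hand region into cells and concatenate block-normalized cell histograms, exactly as for ordinary HOG.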
For gesture tracking, the present invention adopts the idea of kernel tracking. Kernel tracking is an effective tracking method for non-rigid objects. It applies a spatial mask to the target with a kernel window function and, on this basis, defines a spatially smooth similarity function, so that the target localization problem is transformed into a search problem in a function space. In virtual reality gesture interaction applications, the two hands often occlude each other. We therefore define this problem as a multi-target (two-hand) collaborative tracking problem: while tracking either target (T1), the influence of the other target (T2) on T1 must be considered, which is especially important for the occluded target.
Assume the tracked target T1 is a rectangular region with center x_0 and window width h. The pixel positions of the target in the image are denoted (x_i)_{i=1...M}. With a K-dimensional feature description, the target model can be expressed as Q = \{q_u\}_{u=1...K} with \sum_{u=1}^{K} q_u = 1, and a target candidate as P(x) = \{p_u(x)\}_{u=1...K} with \sum_{u=1}^{K} p_u(x) = 1. To locate the target in the current frame, the target region center can be estimated by:

    \hat{x} = \frac{\sum_{i=1}^{M} x_i \, w_i \, k(\|(x_0 - x_i)/h\|^2)}{\sum_{i=1}^{M} w_i \, k(\|(x_0 - x_i)/h\|^2)}

where w_i is a weighting function, defined as:

    w_i = \sum_{u=1}^{K} \sqrt{q_u / p_u(x_0)} \; \delta[b(x_i) - u]

Here the function b(x_i) gives the histogram-feature bin index of pixel x_i, \delta(x) is the Kronecker delta, and k(\cdot) is a window (kernel profile) function representing the weight of pixel x_i in the target region centered at x_0: pixels at the target periphery are easily affected by the background and are relatively unreliable, so the closer a pixel lies to the target center, the larger its weight.
Considering, however, that T1 may be occluded by the other tracked target T2, we redefine the weighting function as:

    w_i = o(x_i) \sum_{u=1}^{K} \sqrt{q_u / p_u(x_0)} \; \delta[b(x_i) - u]

where o(x_i) is an occlusion indicator equal to 0 when pixel x_i is occluded and 1 otherwise. Whether pixel x_i is occluded can be judged from the depth of the tracked target and the depth information at that pixel's position.
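The occlusion-aware weighting can be sketched as below. This is our simplification: the kernel profile k(.) is omitted for brevity, and a pixel counts as occluded when its measured depth lies clearly in front of the tracked hand's depth, at which point its weight is zeroed:

```python
import numpy as np

def mean_shift_step(pixels, bin_idx, depths, q, p, target_depth, margin=0.2):
    """One weighted mean-shift centroid update.  q and p are the target and
    candidate histograms; pixels whose depth is more than `margin` in front
    of the tracked target are treated as occluders and ignored."""
    w = np.sqrt(q[bin_idx] / np.maximum(p[bin_idx], 1e-12))
    w[depths < target_depth - margin] = 0.0   # zero out occluded pixels
    if w.sum() == 0.0:
        return pixels.mean(axis=0)            # degenerate case: fully occluded
    return (pixels * w[:, None]).sum(axis=0) / w.sum()
```

In the example below the third pixel sits 0.5 m in front of the hand and so contributes nothing to the update.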
(2) Time-sharing control and space-time alignment among multiple depth sensors
In the case of hand occlusion, a single 3D depth sensor cannot obtain the complete information of the two hands. For this problem, this project intends to use multiple depth sensors to acquire the information of the occluded region; the device layout is shown in Fig. 2. But whatever kind of 3D depth sensor device is used, i.e., whether its ranging principle is time-of-flight or structured light, if the sensing ranges of different sensors overlap, they will necessarily interfere, causing deviations in the measured depth. Therefore, the present invention considers both software and hardware to achieve time-sharing control between the different sensors and suppress the interference between devices. Taking Kinect devices as an example, if the working time of each device is controlled only from software, the initialization time at every start-up is too long to meet the requirements of real-time interaction. The present invention instead adds a shutter in front of the infrared emitter of each 3D depth camera: only when a depth camera is working is its shutter open, while the shutters of the other depth cameras in the environment are closed. This guarantees that the working depth camera is not interfered with by the infrared sources of the other cameras, so that accurate depth information is obtained in real time.
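The time-sharing rule amounts to a round-robin schedule in which exactly one shutter is open per time slot. A sketch (function name ours):

```python
from itertools import cycle

def shutter_schedule(n_cameras, n_slots):
    """Round-robin shutter plan: in each slot exactly one camera's IR shutter
    is open and the rest stay closed, so the emitters never interfere."""
    order = cycle(range(n_cameras))
    plan = []
    for _ in range(n_slots):
        active = next(order)
        plan.append([i == active for i in range(n_cameras)])
    return plan
```

With 3 cameras and 6 slots, camera C1 is active in slots 0 and 3, C2 in slots 1 and 4, and so on.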
Although time-shared multiple sensors can obtain the complete information of the hand, a problem that follows is that the 2D images and 3D depth image frames from the different sources are not synchronized, i.e., there is a space-time offset. Since during hand 3D modeling we are concerned with the depth data, the present invention performs space-time alignment of the depth data from the different sources. The specific scheme is as follows:
For the data obtained under time-sharing control (taking 3 depth cameras as an example), the depth image frame captured by the front camera device C1 at time t is taken as the reference frame, and the images captured by the other camera devices are aligned to time t. That is, for device C2, the optical flow between its two frames bracketing time t is computed, and the motion flow is interpolated according to time to obtain the depth image frame corresponding to time t, realizing temporal alignment of the data of the different devices. For spatial alignment, with the reference frame as the template, the interpolated frames of the other devices are non-rigidly registered to the reference template, so as to obtain more complete hand depth information. The space-time alignment is illustrated in Fig. 3.
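The temporal half of the alignment can be caricatured with plain linear interpolation between a device's two frames that bracket the reference time. The patent's actual scheme interpolates along the optical-flow motion field rather than blending pixel values directly, so this helper (name ours) is only a stand-in:

```python
import numpy as np

def align_to_reference(frame_before, frame_after, t_before, t_after, t_ref):
    """Interpolate a camera's depth frames to the reference timestamp t_ref
    by blending the two bracketing frames linearly in time."""
    a = (t_ref - t_before) / (t_after - t_before)
    return (1 - a) * frame_before + a * frame_after
```

At the midpoint between a zero frame and a frame of constant depth 2, the interpolated frame has constant depth 1.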
(3) Hierarchical optimization of fine hand model parameters
From the complete hand region detection/tracking results obtained, we can recover the parameters of the hand model. The present invention characterizes the hand with a skeleton model, as shown in Fig. 4; the model includes 1 palm center, 5 fingertip endpoints, and 14 finger joint points, 20 key feature points in total. Our goal is therefore to optimize these 20 key points and obtain their accurate positions. To reduce the time complexity of the parameter optimization, the present invention adopts a coarse-to-fine hierarchical optimization strategy: first, relatively accurate key point positions are obtained and taken as the initial values of the objective function; then a careful optimization is carried out near the initial values to obtain the accurate positions of all parameters.
Here, the 3D coordinate positions of the key points are required; fortunately, the depth distance can be obtained directly from the 3D depth sensor, so we only need to locate the positions of the key points on the image. We consider that no matter which gesture the hand presents, its palm part can be described by an approximately circular geometric feature, and each finger can be described by an elliptical feature; each joint point on a finger can be obtained approximately by equally subdividing the corresponding elliptical feature.
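The equal-subdivision initialization can be sketched as follows (a hypothetical helper of ours: interior joint guesses are dropped at equal fractions of the finger ellipse's major axis, between its base and tip):

```python
import numpy as np

def finger_joints(base, tip, n_joints=3):
    """Place n_joints interior joint estimates by equally dividing the
    finger's major axis between its base and its tip."""
    base, tip = np.asarray(base, float), np.asarray(tip, float)
    ts = np.linspace(0, 1, n_joints + 2)[1:-1]     # skip the two endpoints
    return [tuple(base + t * (tip - base)) for t in ts]
```

For a finger running from (0, 0) to (4, 0), the three joint estimates fall at x = 1, 2, and 3.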
To obtain fine hand parameters, with the joint positions obtained above as initial values, we carry out a careful parameter optimization by the analysis-by-synthesis method. The objective function of the optimization is to minimize the difference between the given image I and the image I_syn(a) synthesized according to the parameters a.
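Analysis-by-synthesis can be sketched as a tiny coordinate-descent loop over the parameter vector a, keeping a perturbation whenever it lowers the synthesis error ||I - I_syn(a)||^2. This is a drastically simplified stand-in for the patent's optimizer, with names of our own:

```python
import numpy as np

def refine(params0, image, synthesize, step=0.5, iters=50):
    """Coordinate-descent sketch of analysis-by-synthesis: perturb each
    parameter in turn and keep the change when the squared synthesis error
    against the observed image decreases."""
    a = np.array(params0, float)
    best = np.sum((image - synthesize(a)) ** 2)
    for _ in range(iters):
        for i in range(len(a)):
            for d in (step, -step):
                trial = a.copy()
                trial[i] += d
                err = np.sum((image - synthesize(trial)) ** 2)
                if err < best:
                    a, best = trial, err
    return a, best
```

With a toy renderer that paints a constant image from a single parameter, the loop walks that parameter to the observed value.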
Because the initial parameter values in this scheme are relatively accurate, stable and accurate parameter results can be obtained by searching only a very small parameter space, which greatly reduces the complexity of the parameter optimization while meeting the parameter accuracy demands of fine gesture interaction tasks. According to design needs, this scheme can also provide hand geometric shape information such as the length, width, and thickness of the fingers and palm, providing data input for the fine hand modeling of the next step.
(4) Accurate hand modeling applied to virtual maintenance
To establish a parameter-driven three-dimensional hand model, a universal hand model serving as a template is established first. The three-dimensional model of the hand can be established in two ways: as a mathematical model or as a free-form surface model. For the interaction demands of a virtual environment, a free-form surface model with topology information is more suitable for the present invention. The free-form surface model currently most widely recognized by industry is the NURBS model. NURBS models are defined by piecewise rational B-spline polynomial basis functions, and their control points must form a strict tensor-product network; in complex surface modeling, reaching the required precision and smoothness causes a large number of redundant control points to appear, and the amount of computation is huge. T-splines, first proposed by Sederberg in 2003, are an improvement on NURBS; the main improvement is that the control mesh of a T-spline need not form a strict tensor-product network, so it possesses the ability of local mesh refinement. T-splines have received wide attention from academia and industry. Research shows that, compared with NURBS, and especially for complex geometric modeling problems, T-splines have obvious advantages in improving the fairness of the model and reducing redundant control points.
After the general hand model is established, the hand feature information of the user is needed to correct the model; in addition, the hand must interact with the virtual environment; both require deforming the hand model. At present there are roughly the following research directions for deforming a free-form surface model: (a) surface line manipulation, which achieves deformation by adjusting curves on the surface; (b) control point editing, which achieves surface deformation mainly by directly adjusting the spline function control point positions/control point weights (NURBS); (c) lattice manipulation, which first establishes a mapping between the geometric information of the surface and a reference lattice, and then, through this mapping, adjusts the geometric shape of the surface by changing the shape of the reference lattice. Because lattice manipulation can make holistic adjustments to the control points of the surface according to the designer's intention, avoiding the uncertain and inaccurate results that free surface deformation may encounter in methods (a) and (b), it is more recognized by industry. Considering the maturity of the technology and the design requirements, the research team believes that (c) better suits the needs of the present invention. As for the main design difficulty of (c), namely the design of the reference lattice, the research team believes that, given the geometric properties and deformation modes of the hand, it is most suitable to design the reference lattice according to the joint information of the hand and the key positions that determine finger and palm configuration and size.
After the universal hand model is established, the general free-form surface model can be deformed using the characteristic information of the user's own hand, such as size and joint positions, to obtain a three-dimensional hand model that more accurately reflects the user's hand characteristics, which is then input into the virtual environment system.
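A nearest-neighbour caricature of scheme (c): each surface point inherits the displacement of its closest reference-lattice node. A real implementation would blend the contributions of several nodes smoothly, e.g. with spline weights; this helper (name ours) only illustrates the lattice-to-surface mapping:

```python
import numpy as np

def lattice_deform(points, lattice_from, lattice_to):
    """Move surface points by the displacement of their nearest
    reference-lattice node."""
    points = np.asarray(points, float)
    src = np.asarray(lattice_from, float)
    disp = np.asarray(lattice_to, float) - src
    out = []
    for pt in points:
        d2 = np.sum((src - pt) ** 2, axis=1)   # squared distance to each node
        out.append(pt + disp[np.argmin(d2)])   # inherit nearest node's motion
    return np.array(out)
```

Moving one lattice node upward drags only the surface points nearest to it, leaving the rest of the surface fixed.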
In summary analyze, to design a kind of hand threedimensional model for being suitably applied virtual reality system, research team
Following steps are taken in plan:
First, a universal hand model is established through accurate surveying and modeling: (a) a professional scanner generates hand point-cloud data; (b) the point-cloud data are converted into triangle data; (c) a free-form surface model carrying topology information is generated from the triangle mesh. In view of the complexity of the hand's geometry, the plan is to build the free-form surface model with the T-spline geometric modeling method, which requires fewer control points for complex surfaces and achieves better fairness.
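Step (b) above can be sketched for the simplest case of a structured scan, where the scanner's grid adjacency makes triangulation direct: each 2x2 cell of samples splits into two triangles. `triangulate_grid` is a hypothetical helper for illustration only; a real pipeline for unordered point clouds would need a surface-reconstruction algorithm, and step (c), fitting a T-spline surface, is beyond a short sketch:

```python
import numpy as np

def triangulate_grid(points):
    """Triangulate an organized (H, W, 3) point cloud into a triangle list.

    Returns the flattened (H*W, 3) vertex array and an (M, 3) array of
    vertex indices, i.e. the mesh topology carried alongside the geometry.
    """
    h, w, _ = points.shape
    idx = np.arange(h * w).reshape(h, w)
    a = idx[:-1, :-1].ravel()   # top-left corner of each grid cell
    b = idx[:-1, 1:].ravel()    # top-right
    c = idx[1:, :-1].ravel()    # bottom-left
    d = idx[1:, 1:].ravel()     # bottom-right
    tris = np.concatenate([np.stack([a, b, c], axis=1),
                           np.stack([b, d, c], axis=1)])
    return points.reshape(-1, 3), tris
```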
Second, feature points are selected on this hand model. The feature points should be chosen at the key positions that determine the shape and deformation of the hand, such as the point groups that determine finger and palm width and the various joints of the hand. On the basis of the selected feature points, the reference control lattice of the hand model is designed.
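The mapping between the surface geometry and the reference lattice described earlier can be sketched, in its crudest form, as binding each surface vertex to its nearest key point, so that moving a key point later moves the vertices bound to it. This nearest-point assignment and the function name are illustrative assumptions; a practical binding would use smooth, blended weights:

```python
import numpy as np

def bind_vertices_to_joints(vertices, joints):
    """Assign each surface vertex to its nearest hand key point.

    vertices: (N, 3) surface points; joints: (K, 3) key-point positions.
    Returns an (N,) index array giving the controlling key point.
    """
    # pairwise distances via broadcasting: (N, K)
    d = np.linalg.norm(vertices[:, None, :] - joints[None, :, :], axis=2)
    return np.argmin(d, axis=1)
```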
Finally, using the hand characteristic information obtained in point (3) described above, namely the acquisition of fine hand-model parameters, the model is deformed through the reference lattice to generate a three-dimensional hand model matched to the user. To display the user's gestures in the virtual environment, the deformation of the user's hand model is controlled through the deformed reference lattice.
Because the mesh of the three-dimensional hand model itself carries topology information, collisions and interference between the hand and the environment can be detected quickly and in real time, achieving fast and lifelike interaction between the user and the virtual environment.
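Because the triangle index list carries the mesh topology, a coarse collision query can iterate over faces directly rather than over a raw point set. The following is a minimal broad-phase sketch assuming a spherical proxy for the environment object; the names are illustrative, and a real-time system would add a spatial acceleration structure and exact triangle tests:

```python
import numpy as np

def sphere_mesh_collision(center, radius, vertices, triangles):
    """Coarse collision test between a sphere and a triangle mesh.

    For each face, tests whether the face's axis-aligned bounding box
    intersects the sphere; the triangle list is the mesh topology.
    """
    for tri in triangles:
        lo = vertices[tri].min(axis=0)
        hi = vertices[tri].max(axis=0)
        # closest point of the face's AABB to the sphere centre
        closest = np.clip(center, lo, hi)
        if np.linalg.norm(closest - center) <= radius:
            return True
    return False
```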
Embodiments of the present invention have been described in detail above with reference to the accompanying drawings and examples, but the invention is not limited to the above embodiments; within the scope of the knowledge possessed by those of ordinary skill in the art, various changes can also be made without departing from the purpose of the invention.
Claims (8)
1. A gesture recognition control method for a virtual reality device, the virtual reality device comprising a plurality of 3D depth cameras and a controller, the plurality of 3D depth cameras acquiring depth images of the hand from different dimensions, and the controller obtaining the depth image information from the 3D depth cameras, tracking the hand according to the obtained information, recovering a hand model and understanding the gesture, characterized in that the method comprises: 1) selecting hand key points, establishing a skeleton model of the hand, establishing a three-dimensional model, and selecting feature points on the basis of this three-dimensional hand model; 2) tracking the hand with the plurality of 3D depth cameras, obtaining initial pixel information of both hands, and matching the pixel information with the key points on the above skeleton model to obtain accurate initial key-point positions; 3) continuously tracking the hand, obtaining the coordinates of the key points, optimizing the skeleton model of the hand, obtaining hand pose information and hand characteristic information according to the obtained parameter information, applying them to the three-dimensional hand model, and controlling the deformation of the three-dimensional hand model; 4) according to the obtained deformation information of the three-dimensional hand model, detecting quickly and in real time the interaction occurring between the hand and the environment, realizing interaction between the user and the virtual environment.
2. The gesture recognition control method for a virtual reality device according to claim 1, characterized in that establishing the hand skeleton model comprises selecting hand key points, the key points including the palm center, the palm and finger joints, and the fingertips.
3. The gesture recognition control method for a virtual reality device according to claim 2, characterized in that optimizing the hand skeleton model comprises first obtaining relatively accurate key-point positions as the initial values of the optimization, describing the palm with a circular feature and the fingers with elliptical features, obtaining the finger joint points by approximating each finger with an elliptical feature, and then, taking the obtained joint-point positions as initial values, performing fine optimization by an analysis-by-synthesis method.
4. The gesture recognition control method for a virtual reality device according to claim 1, characterized in that establishing the three-dimensional model comprises: a) scanning the hand to generate hand point-cloud data; b) converting the point-cloud data into triangle data to obtain a triangle mesh; c) generating a free-form surface model with topology information from the triangle mesh.
5. The gesture recognition control method for a virtual reality device according to claim 1, characterized in that the plurality of 3D depth cameras used for hand tracking are controlled in a time-sharing manner, that is, a shutter is provided in front of the infrared transmitter of each 3D depth camera, and the shutters are controlled so that only the shutter of the currently operating depth camera is open while the shutters of the other depth cameras are closed.
6. The gesture recognition control method for a virtual reality device according to claim 5, characterized in that the depth data obtained by the different depth cameras under time-sharing control are additionally aligned in space and time.
7. The gesture recognition control method for a virtual reality device according to claim 1, characterized in that the method tracks both hands.
8. The gesture recognition control method for a virtual reality device according to claim 7, characterized in that the hand tracking uses multi-target tracking based on a kernel tracking framework to track both hands.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710535118.8A CN107357427A (en) | 2017-07-03 | 2017-07-03 | A kind of gesture identification control method for virtual reality device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107357427A true CN107357427A (en) | 2017-11-17 |
Family
ID=60292166
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710535118.8A Pending CN107357427A (en) | 2017-07-03 | 2017-07-03 | A kind of gesture identification control method for virtual reality device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107357427A (en) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107942717A (en) * | 2018-01-11 | 2018-04-20 | 深圳市晟达机械设计有限公司 | A kind of intelligent home control system based on gesture identification |
CN108196679A (en) * | 2018-01-23 | 2018-06-22 | 河北中科恒运软件科技股份有限公司 | Gesture-capture and grain table method and system based on video flowing |
CN108765577A (en) * | 2018-04-09 | 2018-11-06 | 华南农业大学 | A kind of four limbs farming animals skeleton augmented reality tracking of real-time point cloud data driving |
CN108932053A (en) * | 2018-05-21 | 2018-12-04 | 腾讯科技(深圳)有限公司 | Drawing practice, device, storage medium and computer equipment based on gesture |
CN109196438A (en) * | 2018-01-23 | 2019-01-11 | 深圳市大疆创新科技有限公司 | A kind of flight control method, equipment, aircraft, system and storage medium |
CN109190461A (en) * | 2018-07-23 | 2019-01-11 | 中南民族大学 | A kind of dynamic gesture identification method and system based on gesture key point |
CN109717878A (en) * | 2018-12-28 | 2019-05-07 | 上海交通大学 | A kind of detection system and application method paying attention to diagnosing normal form jointly for autism |
CN110020681A (en) * | 2019-03-27 | 2019-07-16 | 南开大学 | Point cloud feature extracting method based on spatial attention mechanism |
CN110070063A (en) * | 2019-04-29 | 2019-07-30 | 北京字节跳动网络技术有限公司 | Action identification method, device and the electronic equipment of target object |
CN110458046A (en) * | 2019-07-23 | 2019-11-15 | 南京邮电大学 | A kind of human body motion track analysis method extracted based on artis |
CN110515463A (en) * | 2019-08-29 | 2019-11-29 | 南京泛在地理信息产业研究院有限公司 | A kind of 3D model insertion method based on monocular vision in gesture interaction formula video scene |
CN111191322A (en) * | 2019-12-10 | 2020-05-22 | 中国航空工业集团公司成都飞机设计研究所 | Virtual maintainability simulation method based on depth perception gesture recognition |
CN111758119A (en) * | 2018-02-27 | 2020-10-09 | 夏普株式会社 | Image processing device, display device, image processing method, control program, and recording medium |
CN112099635A (en) * | 2020-09-18 | 2020-12-18 | 四川轻化工大学 | Gesture interaction assessment model in immersive environment |
CN113326751A (en) * | 2021-05-19 | 2021-08-31 | 中国科学院上海微系统与信息技术研究所 | Hand 3D key point labeling method |
CN113542832A (en) * | 2021-07-01 | 2021-10-22 | 深圳创维-Rgb电子有限公司 | Display control method, display device, and computer-readable storage medium |
CN113646736A (en) * | 2021-07-17 | 2021-11-12 | 华为技术有限公司 | Gesture recognition method, device and system and vehicle |
CN113838177A (en) * | 2021-09-22 | 2021-12-24 | 上海拾衷信息科技有限公司 | Hand animation production method and system |
CN114115544A (en) * | 2021-11-30 | 2022-03-01 | 杭州海康威视数字技术股份有限公司 | Human-computer interaction method, three-dimensional display device and storage medium |
CN114241546A (en) * | 2021-11-18 | 2022-03-25 | 南通大学 | Face recognition method based on multi-direction local binary pattern |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102982557A (en) * | 2012-11-06 | 2013-03-20 | 桂林电子科技大学 | Method for processing space hand signal gesture command based on depth camera |
CN104317391A (en) * | 2014-09-24 | 2015-01-28 | 华中科技大学 | Stereoscopic vision-based three-dimensional palm posture recognition interactive method and system |
CN104408760A (en) * | 2014-10-28 | 2015-03-11 | 燕山大学 | Binocular-vision-based high-precision virtual assembling system algorithm |
CN104700433A (en) * | 2015-03-24 | 2015-06-10 | 中国人民解放军国防科学技术大学 | Vision-based real-time general movement capturing method and system for human body |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107357427A (en) | A kind of gesture identification control method for virtual reality device | |
CN109255813B (en) | Man-machine cooperation oriented hand-held object pose real-time detection method | |
CN100407798C (en) | Three-dimensional geometric mode building system and method | |
Hackenberg et al. | Lightweight palm and finger tracking for real-time 3D gesture control | |
CN104115192B (en) | Three-dimensional closely interactive improvement or associated improvement | |
CN104317391B (en) | A kind of three-dimensional palm gesture recognition exchange method and system based on stereoscopic vision | |
Erol et al. | A review on vision-based full DOF hand motion estimation | |
CN108256504A (en) | A kind of Three-Dimensional Dynamic gesture identification method based on deep learning | |
CN107688391A (en) | A kind of gesture identification method and device based on monocular vision | |
CN104063677B (en) | For estimating the device and method of human body attitude | |
CN104992171A (en) | Method and system for gesture recognition and man-machine interaction based on 2D video sequence | |
CN110688965A (en) | IPT (inductive power transfer) simulation training gesture recognition method based on binocular vision | |
CN111444764A (en) | Gesture recognition method based on depth residual error network | |
CN103443826A (en) | Mesh animation | |
Chen et al. | A particle filtering framework for joint video tracking and pose estimation | |
CN105912126A (en) | Method for adaptively adjusting gain, mapped to interface, of gesture movement | |
Wang et al. | Immersive human–computer interactive virtual environment using large-scale display system | |
Xu et al. | Robust hand gesture recognition based on RGB-D Data for natural human–computer interaction | |
CN110751097A (en) | Semi-supervised three-dimensional point cloud gesture key point detection method | |
Hu et al. | Human interaction recognition using spatial-temporal salient feature | |
Huang et al. | Network algorithm real-time depth image 3D human recognition for augmented reality | |
Amrutha et al. | Human Body Pose Estimation and Applications | |
Feng et al. | Research and application of multifeature gesture recognition in human-computer interaction based on virtual reality technology | |
Xu et al. | 3D joints estimation of the human body in single-frame point cloud | |
CN109544632A (en) | A kind of semantic SLAM method of mapping based on hierarchical subject model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||