CN109255324A - Gesture processing method, interaction control method and equipment - Google Patents

Gesture processing method, interaction control method and equipment

Info

Publication number
CN109255324A
CN109255324A (application CN201811032593.4A)
Authority
CN
China
Prior art keywords
hand
target
gesture
gestures
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811032593.4A
Other languages
Chinese (zh)
Inventor
冷芝莹
卢杨
王平平
于洋
梁晓辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Research Institute Of Beihang University
Original Assignee
Qingdao Research Institute Of Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Research Institute Of Beihang University
Priority: CN201811032593.4A
Publication: CN109255324A
Legal status: Pending

Links

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 — Movements or behaviour, e.g. gesture recognition
    • G06V40/28 — Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 — Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 — Gesture based interaction, e.g. based on a set of recognized hand gestures

Abstract

This application discloses a gesture processing method, an interaction control method, and corresponding devices. The gesture processing method includes: capturing video data of a hand in motion and obtaining a target image frame from the video data; extracting hand features from the target image frame; recognizing the hand features with a gesture recognition model to obtain gesture information; determining the control command corresponding to the gesture information; determining the target position of a predetermined hand part in the image frame; and mapping the target position to a display screen to obtain a corresponding operating position, at which the control command is executed. The application reduces the number of required devices while improving control accuracy.

Description

Gesture processing method, interaction control method and equipment
Technical field
Embodiments of the present application relate to the field of computer technology, and in particular to a gesture processing method, an interaction control method, and corresponding devices.
Background technique
In the field of human-computer interaction, a gesture generally refers to a human hand movement that conveys information recognizable by a smart device. For example, extending the index finger while bending the other fingers toward the palm constitutes a gesture, and that gesture can be assigned specific gesture information. Supposing the gesture information corresponding to this gesture is a click operation, the smart device performs the click operation after recognizing the gesture. Gesture recognition identifies gesture information through mathematical algorithms, so that a smart device can understand human hand movements and, based on the gesture information, realize interactive control between the user and the device.
In the prior art, gesture recognition generally relies on a data glove to collect hand data. Sensors built into the glove sense the motion data as the hand moves. The smart device obtains the motion data collected by the glove, analyzes and processes it to identify the corresponding gesture information, and then completes the device interaction based on that information.
However, collecting hand motion data with a data glove requires equipping the smart device with one, which increases the number of devices in the interactive system and thus its complexity. In actual use, the data glove also raises the difficulty of use: operation is complicated and inconvenient.
Summary of the invention
In view of this, the present application provides a gesture processing method, an interaction control method, and corresponding devices, to solve the technical problem in the prior art that separate sensing equipment such as a data glove is required to sense the corresponding motion data for interactive control of a smart device, which increases equipment cost.
To solve the above technical problem, a first aspect of the present application provides a gesture processing method, comprising:
capturing video data of a hand in motion, and obtaining a target image frame from the video data; extracting hand features from the target image frame; recognizing the hand features with a gesture recognition model to obtain gesture information; determining the control command corresponding to the gesture information; determining the target position of a predetermined hand part in the target image frame; and mapping the target position to a display screen to obtain a corresponding operating position, at which the control command is executed.
In a second aspect, the present application further provides an interaction control method, comprising:
capturing video data of a hand in motion, and obtaining a target image frame from the video data;
extracting hand features from the target image frame;
recognizing the hand features with a gesture recognition model to obtain gesture information;
determining the control command corresponding to the gesture information;
determining the target position of a predetermined hand part in the target image frame;
mapping the target position to a display screen to obtain a corresponding operating position;
executing the control command at the operating position.
In a third aspect, the present application further provides a gesture processing device, comprising: a processing component, and a storage component and a display component each connected to the processing component;
the storage component stores one or more computer program instructions, which are called and executed by the processing component;
the processing component is configured to:
capture video data of a hand in motion and obtain a target image frame from the video data; extract hand features from the target image frame; recognize the hand features with a gesture recognition model to obtain gesture information; determine the control command corresponding to the gesture information; determine the target position of a predetermined hand part in the target image frame; and map the target position to the display component to obtain a corresponding operating position;
wherein the control command is executed at the operating position.
In a fourth aspect, the present application further provides an interaction control device, comprising: a processing component, and a storage component and a display component each connected to the processing component;
the storage component stores one or more computer program instructions, which are called and executed by the processing component;
the processing component is configured to:
capture video data of a hand in motion and obtain a target image frame from the video data; extract hand features from the target image frame; recognize the hand features with a gesture recognition model to obtain gesture information; determine the control command corresponding to the gesture information; determine the target position of a predetermined hand part in the target image frame; map the target position to the display component to obtain a corresponding operating position; and execute the control command at the operating position of the display component.
In the embodiments of the present application, video data of a hand in motion can be captured and a target image frame obtained from it; hand features are then extracted from the target image frame, and the target position of the predetermined hand part within that frame is determined. After the gesture recognition model identifies the hand features to obtain gesture information and the corresponding control command is determined, that command is executed at the display-screen position corresponding to the target position. External sensing equipment such as a data glove is no longer needed to collect the user's hand motion data; instead, a camera captures video of the hand moving in a natural environment and the corresponding gesture recognition is performed on that video, which reduces device complexity and operating difficulty. At the same time, because the position of the hand's predetermined part within the target image frame is used to obtain the corresponding operating position, precise positioning is achieved and control precision is improved.
Detailed description of the invention
The drawings described here are provided to aid further understanding of the present application and constitute a part of it; the illustrative embodiments of the application and their descriptions serve to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of an embodiment of a gesture processing method provided by an embodiment of the present application;
Fig. 2 is an example diagram of various gestures provided by an embodiment of the present application;
Fig. 3 is a flowchart of another embodiment of a gesture processing method provided by an embodiment of the present application;
Fig. 4 is a flowchart of an embodiment of an interaction control method provided by an embodiment of the present application;
Fig. 5 is a structural schematic diagram of an embodiment of a gesture processing device provided by an embodiment of the present application;
Fig. 6 is a structural schematic diagram of an embodiment of an interaction control device provided by an embodiment of the present application.
Specific embodiment
Embodiments of the present application are described in detail below with reference to the accompanying drawings and examples, so that the way the application uses technical means to solve technical problems and achieve technical effects can be fully understood and implemented.
The embodiments of the present application are mainly used in human-computer interaction scenarios: by capturing video data of hand movements in a natural environment, intelligent interactive control is achieved with simple equipment.
In the prior art, a data glove or similar equipment is usually used to capture the user's hand movements, and data processing such as coordinate mapping is then performed on the collected motion data to complete the interactive control. This approach, however, requires additional sensing equipment such as the data glove, makes the system more complex, and hinders the extension of interaction scenarios. To solve these problems, the inventors realized that a camera could be used to capture the user's hand movements, so that motion recognition, and in turn interactive control, can be performed on the captured images. The technical solution of the present application was proposed accordingly.
In the embodiments of the present application, a camera captures video data of the user's hand in motion, and target image frames are obtained from the video. For each target image frame, hand features are extracted and the target position of the predetermined hand part in the frame is determined; the gesture recognition model then identifies the hand features to obtain gesture information and, from it, the corresponding control command. After the target position is mapped to the display screen, the command is executed at the resulting operating position on the screen. Capturing the user's hand movements with a camera in a natural environment to complete the interaction provides a more natural interaction scenario, and because only an ordinary camera is needed to acquire the recognition data, device complexity is reduced. The mapping accurately determines the user's precise location on the display screen, so the interaction can be executed accurately, improving both precision and accuracy.
The embodiment of the present application is described in detail below in conjunction with attached drawing.
As shown in Fig. 1, a flowchart of an embodiment of a gesture processing method provided by an embodiment of the present application, the method may include the following steps:
101: Capture video data of a hand in motion, and obtain a target image frame from the video data.
The gesture processing method provided by the embodiments of the present application can be applied to smart devices such as smart televisions, smart refrigerators, and computers. The smart device may include a display screen on which content such as buttons, dialog boxes, files, and application icons can be shown. Through hand movements, the user can perform operations such as clicks, double-clicks, and slides on the display screen: clicks and double-clicks mainly target displayed content such as buttons, dialog boxes, and application icons, while a slide corresponds to a change of the user's hand position relative to the screen.
The gesture processing method of the embodiments can also be applied to virtual smart devices such as VR (Virtual Reality) and AR (Augmented Reality) equipment. A virtual smart device can be externally equipped with a camera to capture video data of the user's hand in motion; the motion trajectory, gesture, and other information of the hand in the outside world are then mapped into the virtual scene, so that virtual content in the scene is controlled by the hand's movements.
Optionally, the video data of the hand in motion can be captured with a camera. The camera may be integrated into the smart device or independent of it; the present application places no particular restrictions on the camera's position or type.
After the camera captures the video data of the hand in motion, the smart device can obtain the video by reading the camera data. Obtaining the target image frames from the video data may include taking every single frame: tracking all image frames yields more accurate gesture tracking. However, since image processing speed is limited, tracking every frame involves a very large amount of computation and easily slows gesture processing down. To improve tracking speed, obtaining the target image frames may therefore include sampling them from the video data at a preset frame interval: reducing the number of image frames reduces the total tracking computation and improves processing speed. The sampling interval can be set in advance as needed. For example, if the frame rate of the video is 24 frames per second and the interval is set to 2 (one frame is taken, then two are skipped), 8 target image frames are collected from the 24 frames shown in one second.
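The interval-based sampling can be sketched as follows; the function name and the list-based frame representation are illustrative assumptions, since a real pipeline would read frames from a camera stream:

```python
def sample_target_frames(frames, skip=2):
    """Keep one target frame, then skip `skip` frames, repeatedly.

    With skip=2 this keeps every third frame, so the 24 frames shown
    in one second yield 8 target image frames, as in the example above.
    """
    return [frame for i, frame in enumerate(frames) if i % (skip + 1) == 0]
```

A larger `skip` trades tracking smoothness for lower computational load.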
Optionally, the camera can capture video data in the RGB color space, in which case each frame of the video is in RGB format. After the smart device obtains the video data, it can split the video into individual image frames and perform gesture recognition on the frames selected as target image frames.
In the present invention, the target image frames can be acquired in real time: video data of the hand in motion is captured continuously, target image frames are obtained from it, and subsequent gesture recognition or gesture control is performed for each target image frame.
102: Extract the hand features from the target image frame.
103: Recognize the hand features with a gesture recognition model to obtain gesture information.
104: Determine the control command corresponding to the gesture information. Optionally, the hand features can be extracted from the region where the hand is located in each target image frame; restricting feature extraction to the hand region reduces the amount of data to be computed and improves efficiency.
Hand features can be extracted from every target image frame of the video data, so that the hand features of each frame are obtained and continuous control is realized.
Optionally, the hand features can be input into the gesture recognition model and the gesture information obtained by computation.
The gesture recognition model can be trained in advance. As one possible implementation, the smart device stores the gesture recognition model beforehand; after the hand features are obtained, the stored model is called to recognize them, and the gesture information is obtained from the model's computed output.
Optionally, recognizing the hand features with the gesture recognition model may mean that after the hand features are input into the model, the corresponding model output is obtained. The output may be data information; a corresponding key value can be determined from the recognized data information, and the data information or its key value determines the corresponding gesture information.
Gesture information refers to the meaning assigned to a gesture. For example, an extended index finger may represent click information or open information, and such meanings can be defined according to the user's needs. Once the meaning of the gesture information is determined, its corresponding control command can be determined.
In practical applications, the gesture recognition model can be a support vector machine (SVM) model; recognizing the hand features with a pre-trained SVM yields high recognition accuracy.
As shown in Fig. 2, an example diagram of various gestures provided by an embodiment of the present invention, the gestures may include a first gesture 201, a second gesture 202, and a third gesture 203, each corresponding to its own gesture information. In this embodiment, for example, the gesture information of the first gesture 201 can be defined as translation, that of the second gesture 202 as a left-button click, and that of the third gesture 203 as a right-button click. In practical applications, different gestures can be assigned different gesture information as needed.
Optionally, gesture information can be represented by distinct key values, each corresponding to one control command. For example, gesture information represented by key value 001 may correspond to a move command, and key value 002 to a click command. When the gesture recognition model recognizes gesture information, it is the key value representation of that information that is identified.
Determining the control command corresponding to the gesture information may include: determining, among the different key values, the target key value corresponding to the gesture information, and then determining the control command corresponding to that target key value.
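A minimal sketch of the key-value lookup; the concrete key values and command names below are assumptions for illustration, since the patent only specifies that each key value corresponds to one control command:

```python
# Hypothetical mapping from recognized gesture key values to control
# commands; the specific entries are illustrative assumptions.
GESTURE_COMMANDS = {
    "001": "move",         # e.g. translate / move the cursor
    "002": "left_click",   # e.g. left-button click
    "003": "right_click",  # e.g. right-button click
}

def command_for_key(key_value):
    """Return the control command registered for a gesture key value."""
    command = GESTURE_COMMANDS.get(key_value)
    if command is None:
        raise KeyError(f"no control command registered for key {key_value!r}")
    return command
```

Unregistered key values are rejected explicitly, so an unrecognized gesture cannot silently trigger an operation.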
105: Determine the target position of the predetermined hand part in the target image frame.
106: Map the target position to the display screen to obtain the corresponding operating position.
The control command is executed at the operating position.
Within the target image frame of the video data, the target position of the predetermined hand part can be determined.
The predetermined hand part may include a fingertip, the palm center, a special marker on the hand, or similar positions, and hand tracking can be performed for these different parts. The target position of the predetermined hand part in the target image frame refers to that part's position within the frame of the video data, realizing gesture tracking or gesture control for each target image frame.
Optionally, determining the target position of the predetermined hand part in the target image frame can mean determining the coordinate position, within the frame, of a fingertip of the hand, of the palm center, or of a special marker on the hand.
The target position is the position of the predetermined hand part in the target image frame, and may specifically be its coordinates within the frame. This position can be mapped onto the display screen of the smart device to obtain the corresponding operating position, and determining the operating position allows the corresponding operation to be controlled accurately.
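One plausible reading of the mapping step is a proportional mapping from frame coordinates to screen coordinates; the patent does not fix the exact mapping, so this sketch is an assumption:

```python
def map_to_screen(target_pos, frame_size, screen_size):
    """Map a hand position in the image frame to a display-screen position.

    The position's relative offset inside the frame is preserved on the
    screen. target_pos, frame_size, and screen_size are (x, y) pairs.
    """
    (x, y), (fw, fh), (sw, sh) = target_pos, frame_size, screen_size
    return (x / fw * sw, y / fh * sh)
```

For example, the center of a 640x480 frame maps to the center of a 1920x1080 screen.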
In the embodiments of the present application, external sensing equipment such as a data glove is no longer needed to collect the user's hand motion data; instead, a camera captures video of the user's hand moving in a natural environment, and the corresponding gesture recognition is performed on the video, which reduces device complexity and operating difficulty. At the same time, the position of the hand's predetermined part within the target image frame can be used to obtain the corresponding operating position, so precise positioning is possible and control precision is improved.
As shown in Fig. 3, a flowchart of another embodiment of a gesture processing method provided by an embodiment of the present invention, the method includes:
301: Capture video data of a hand in motion, and obtain a target image frame from the video data.
This step is identical to the corresponding step of the embodiment shown in Fig. 1 and is not repeated here.
302: Perform image segmentation on the target image frame to obtain a gesture image of the hand region.
Optionally, performing image segmentation on the target image frame includes separating the background of the frame from the hand region to obtain the gesture image.
Separating the background of the target image frame from the hand region to obtain the gesture image may include: segmenting the hand region from the background in the target image frame using the Otsu algorithm. The Otsu algorithm splits the frame into the hand region and the remaining background, yielding the gesture image.
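The Otsu segmentation named above can be sketched as follows. This is a minimal NumPy implementation of Otsu's threshold selection for a grayscale image; in practice one would typically call OpenCV's `cv2.threshold` with the `THRESH_OTSU` flag instead:

```python
import numpy as np

def otsu_threshold(gray):
    """Return Otsu's threshold for a uint8 grayscale image.

    Otsu's method picks the threshold that maximizes the between-class
    variance of the resulting background/foreground split.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # class-0 probability
    mu = np.cumsum(prob * np.arange(256))      # class-0 partial mean
    mu_total = mu[-1]                          # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
    return int(np.argmax(np.nan_to_num(sigma_b)))
```

The gesture image is then the set of pixels above (or below) the returned threshold, depending on whether the hand is brighter or darker than the background.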
Once the gesture image region is determined, the gesture image can be converted and the hand features in it extracted.
303: Extract the hand features from the gesture image.
Extracting the hand features from the gesture image may include convolving the gesture image with feature filters and taking the computed convolution results as the hand features of the gesture image.
304: Recognize the hand features with the gesture recognition model to obtain gesture information.
305: Determine the control command corresponding to the gesture information.
306: Based on the gesture image, determine the target position of the predetermined hand part in the target image frame.
307: Map the target position to the display screen to obtain the corresponding operating position.
The control command is executed at the operating position.
In this embodiment, image segmentation of the target image frame yields the gesture image of the hand region, which reduces the amount of data in subsequent image processing and thus improves processing speed.
In certain embodiments, the target image frame of the video data may be in RGB format, but since RGB images are strongly affected by illumination, the RGB image can be converted to another color space, obtaining a YCrCb image. Hand features can then be extracted, and the target position of the predetermined hand part in the target image frame determined, from the converted YCrCb image.
As one embodiment, extracting the hand features from the gesture image may include: performing color space conversion on the gesture image to obtain a target converted image, and extracting the hand features from the target converted image.
As another embodiment, determining, based on the gesture image, the target position of the predetermined hand part in the target image frame includes: performing color space conversion on the gesture image to obtain a target converted image, and determining, based on the target converted image, the target position of the predetermined hand part in the target image frame.
Optionally, performing color space conversion on the gesture image to obtain the target converted image may include converting the color space of the gesture image and extracting the target channel of the converted color space to obtain the target converted image.
As one possible implementation, performing color space conversion on the gesture image to obtain the target converted image may include: converting the color space of the target image frame of the video data from RGB to YCrCb, and extracting the Cb channel to obtain the target converted image.
In some embodiments, the RGB target image frame can be converted to a YCrCb target image frame with the following formulas:
Y = 0.2990R + 0.5870G + 0.1140B
Cr = 0.5000R - 0.4187G - 0.0813B + 128
Cb = -0.1687R - 0.3313G + 0.5000B + 128
Here R, G, and B are the red, green, and blue color channels; Y is the luminance, Cr is the difference between the red component of the RGB input signal and the luminance value, and Cb is the difference between the blue component and the luminance value.
Converting each gesture image from RGB space to YCrCb space yields images in YCrCb format, which removes much of the influence of illumination on the image; the negative effect of lighting on feature extraction is thereby reduced and extraction accuracy improved.
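The conversion can be applied per pixel with NumPy; this sketch uses the standard full-range RGB-to-YCrCb transform, with Cr the red-difference and Cb the blue-difference component:

```python
import numpy as np

def rgb_to_ycrcb(rgb):
    """Convert an H x W x 3 float RGB image (values 0-255) to YCrCb.

    Y is luminance; Cr and Cb are the red- and blue-difference chroma
    components, offset by 128 as in the full-range JPEG transform.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.2990 * r + 0.5870 * g + 0.1140 * b
    cr =  0.5000 * r - 0.4187 * g - 0.0813 * b + 128.0
    cb = -0.1687 * r - 0.3313 * g + 0.5000 * b + 128.0
    return np.stack([y, cr, cb], axis=-1)
```

A neutral gray pixel maps to chroma values near 128, which is the zero point of the Cr and Cb axes.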
In the embodiments of the present application, converting the color space of the gesture image reduces the negative influence of illumination on the target image frame, so hand features largely unaffected by lighting can be obtained and the gesture recognition results improved.
In one possible implementation, determining, based on the gesture image, the target position of the predetermined hand part in the target image frame includes:
determining the detection position of the predetermined hand part in the gesture image; and determining, based on the spatial relationship between the gesture image and the target image frame, the target position of that detection position in the target image frame.
Optionally, determining the target position of the detection position in the target image frame based on the spatial relationship between the gesture image and the target image frame includes:
determining, from that relationship, the position of the gesture image's center point in the target image frame; and mapping the detection position from the gesture image into the target image frame according to the positional relationship between the detection position and the center point.
In certain embodiments, determining the detection position of the predetermined hand part in the gesture image includes:
determining the closed region of the hand in the gesture image; obtaining at least one fingertip position within the closed region; and obtaining, from the at least one fingertip position, the detection position of the predetermined hand part in the gesture image.
In certain embodiments, determining the closed region of the hand in the gesture image includes:
detecting at least one contour in the gesture image with a contour detection algorithm; determining, among the detected contours, the target contour with the largest area; and taking the region enclosed by the target contour as the closed region of the hand in the gesture image.
Optionally, detecting at least one contour in the gesture image with a contour detection algorithm includes:
performing color space conversion on the gesture image to obtain a target converted image; and determining, with the contour detection algorithm, the contour region of the hand in the target converted image.
In certain embodiments, obtaining at least one fingertip position within the closed region may include:
detecting the vertices of the closed region's contour with a convexity defect detection algorithm; determining fingertip points among the vertices according to the distances and angles between pairs of vertices; and determining the fingertip coordinates of those points within the closed region to obtain the at least one fingertip position.
It may include multiple finger tip points and multiple palms point in the multiple vertex.Highest point in the multiple finger tip point It can be with the point to need to track in the embodiment of the present invention.
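The angle test used to separate fingertip vertices from other hull vertices can be sketched as follows; the 60° threshold is an assumed illustrative value, not one specified in the disclosure:

```python
import math

def angle_at(p, a, b):
    # Angle (degrees) at vertex p formed by neighbouring points a and b.
    v1 = (a[0] - p[0], a[1] - p[1])
    v2 = (b[0] - p[0], b[1] - p[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    return math.degrees(math.acos(dot / (n1 * n2)))

def is_fingertip(p, a, b, max_angle=60.0):
    # A convex-hull vertex with a sharp angle between its neighbouring
    # defect points is treated as a fingertip candidate.
    return angle_at(p, a, b) < max_angle
```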
Optionally, detecting, using the convexity defect detection algorithm, the multiple vertices corresponding to the contour of the closed region includes:
Wrapping the contour of the closed region with a convex hull to obtain a closed image of the hand contour; and detecting, using the convexity defect detection algorithm, the multiple vertices in the closed image.
In certain embodiments, the predetermined hand part may include the palm center of the hand.
Optionally, after detecting, using the convexity defect detection algorithm, the multiple vertices corresponding to the contour of the closed region, the method may further include:
Determining multiple palm points among the multiple vertices; determining the palm coordinates of the multiple palm points in the target image frame corresponding to the gesture image respectively; and determining the coordinate point with the smallest sum of Euclidean distances to the multiple palm coordinates as the coordinate of the palm center point, which is the target position.
The palm coordinates of the multiple palm points in the target conversion image may be determined based on the distribution relation between the gesture image and the corresponding target image frame.
Optionally, the palm center point is the point with the smallest sum of squared Euclidean distances to each point. In practical applications, the point among the multiple palm points with the smallest Euclidean distance to the other palm points may be determined as a first palm point, a distance transform may be performed based on the Euclidean distances and angles between the first palm point and the other palm points, and the point with the smallest sum of Euclidean distances to the multiple palm points may be determined as the palm center point.
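Choosing, among the candidate palm points, the one with the smallest summed Euclidean distance can be sketched as (the restriction of candidates to the palm points themselves is our simplification):

```python
import math

def palm_center(palm_points):
    # Return the candidate whose summed Euclidean distance
    # to all palm points is smallest.
    def total_dist(c):
        return sum(math.hypot(c[0] - p[0], c[1] - p[1]) for p in palm_points)
    return min(palm_points, key=total_dist)
```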
In actual tracking, the palm center can be tracked to obtain a more centered tracking position and thus a more accurate tracking effect. Palm tracking is more suitable for operations such as translation, left swipe and right swipe, while fingertip tracking is more suitable for operations such as click, left click and right click.
By using the contour detection algorithm and the convexity defect detection algorithm, an accurate target position can be determined, and once the target position is determined, the accuracy of recognition can be further improved.
In certain embodiments, the predetermined hand part includes the highest fingertip of the user's hand.
Obtaining, according to the at least one fingertip position, the detection position of the predetermined hand part in the gesture image includes:
Determining the coordinate of the highest fingertip point among the multiple fingertip coordinates as the detection position of the predetermined hand part in the gesture image.
In certain embodiments, mapping the target position to the display screen to obtain the corresponding operating position may include:
Inputting the target position into a Kalman filter, and calculating the operating position.
Through the mapping of the Kalman filter, the embodiment of the present invention can accurately predict the position of the hand relative to the display screen and obtain an accurate control result.
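A minimal scalar Kalman filter of the kind that could smooth a tracked coordinate is sketched below; the noise parameters q and r are assumed values, and the disclosure does not specify the filter design:

```python
class Kalman1D:
    """Minimal scalar Kalman filter for smoothing one tracked coordinate.
    q: process noise, r: measurement noise (assumed illustrative values)."""
    def __init__(self, q=1e-3, r=1e-1):
        self.q, self.r = q, r
        self.x = None   # state estimate
        self.p = 1.0    # estimate covariance

    def update(self, z):
        if self.x is None:       # first measurement initialises the state
            self.x = z
            return self.x
        self.p += self.q                  # predict
        k = self.p / (self.p + self.r)    # Kalman gain
        self.x += k * (z - self.x)        # correct
        self.p *= (1.0 - k)
        return self.x
```

In 2-D tracking, one such filter per coordinate axis (or a vector-state filter) would smooth the mapped operating position between frames.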
In certain embodiments, extracting the hand feature in the target conversion image may include:
Performing Gaussian blur denoising on the target conversion image to obtain a denoised image; converting the denoised image into a binary image; and extracting the hand feature in the binary image.
Optionally, after the binary image is obtained, a morphology algorithm may be used to eliminate discrete points in the binary image and fill in missing points in the binary image, so that the binary image can better characterize the gesture and the accuracy of the extracted hand feature is improved. That is, the hand feature can be extracted after discrete-point elimination and missing-point filling are performed on the binary image.
Performing Gaussian blur denoising on the gesture image to obtain the denoised image may include: passing the gesture image through a Gaussian filter, thereby completing the Gaussian blur denoising and obtaining the denoised image. The Gaussian filter can filter out the high-frequency noise in the gesture image to obtain the denoised image.
Optionally, converting the denoised image into the binary image may include: converting the denoised image into the binary image using a threshold conversion algorithm. As a possible implementation, the pixel value of each pixel in the denoised image may be determined, and a pixel threshold may be determined; if the pixel value of any pixel is greater than the pixel threshold, that pixel may be set to 1; if the pixel value of any pixel is less than the pixel threshold, that pixel may be set to 0. The gesture image shown in Fig. 2 is a binary gesture image.
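The threshold conversion step can be sketched in Python; using the mean pixel value as the default threshold is our simplification, not a choice made in the disclosure:

```python
def to_binary(pixels, threshold=None):
    # Convert a grayscale image (list of rows of pixel values) to a
    # binary image. If no threshold is given, use the mean pixel value
    # as a simple default.
    flat = [v for row in pixels for v in row]
    if threshold is None:
        threshold = sum(flat) / len(flat)
    return [[1 if v > threshold else 0 for v in row] for row in pixels]
```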
As a possible implementation, extracting the hand feature in the binary image may include: performing feature extraction in multiple directions on the binary image using multi-directional Gabor filters to obtain multiple wavelet features; and determining the multiple wavelet features as the hand feature.
A Gabor filter may refer to a filter constituted by a two-dimensional Gabor kernel function, where the two-dimensional kernel function may be obtained by multiplying a Gaussian function and a cosine function. The kernel function can be expressed by the following formulas:
g(x, y; λ, θ, φ, σ, γ) = exp(−(x'² + γ²y'²) / (2σ²)) · cos(2πx'/λ + φ)
x' = x cos θ + y sin θ
y' = −x sin θ + y cos θ
Wherein θ, φ, γ, λ and σ are parameters: θ indicates the orientation of the parallel stripes in the filter kernel, φ indicates the phase parameter of the cosine function in the filter kernel, γ indicates the ellipticity (aspect ratio) of the filter kernel, λ indicates the wavelength parameter of the filter kernel, and σ indicates the standard deviation of the Gaussian function in the filter kernel.
The main reason for using Gabor filters is that, when the other parameters are fixed and only the direction θ changes, Gabor filters in multiple directions can be obtained, so that hand features of the binary image in multiple directions can be acquired. Optionally, θ can be chosen as 0°, 45°, 90° and 135°. In practical applications, the other parameters, such as φ, γ, λ and σ, can be determined as optimal parameter values through multiple tests in advance, so as to collect the most accurate feature data. For example, testing showed that with φ = 180, γ = 2, λ = 5, σ = 2 and a kernel size of 12×12, the Gabor filter extracts gesture features well.
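The kernel above can be sampled directly on a grid; the sketch below builds a small filter bank for the four orientations, using the tested parameter values mentioned in the text (φ = 180°, γ = 2, λ = 5, σ = 2, 12×12 kernel). This is an illustrative reconstruction, not code from the disclosure:

```python
import math

def gabor_kernel(size=12, theta=0.0, lam=5.0, sigma=2.0, gamma=2.0, phi=math.pi):
    # Sample g(x, y; lam, theta, phi, sigma, gamma) on a size x size grid
    # roughly centred on the origin.
    half = size // 2
    kernel = []
    for y in range(-half, size - half):
        row = []
        for x in range(-half, size - half):
            xp = x * math.cos(theta) + y * math.sin(theta)
            yp = -x * math.sin(theta) + y * math.cos(theta)
            g = math.exp(-(xp * xp + gamma * gamma * yp * yp) / (2 * sigma * sigma))
            row.append(g * math.cos(2 * math.pi * xp / lam + phi))
        kernel.append(row)
    return kernel

# One kernel per orientation: 0, 45, 90 and 135 degrees.
bank = [gabor_kernel(theta=math.radians(t)) for t in (0, 45, 90, 135)]
```

Convolving the binary image with each kernel in the bank yields one directional response per orientation.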
As a possible implementation, performing the feature extraction in multiple directions on the binary image using the multi-directional Gabor filters to obtain the multiple wavelet features may specifically include:
Determining the multiple filter functions corresponding to the multi-directional Gabor filters; convolving the binary image with the multiple filter functions to obtain convolution features in multiple directions; and flattening each convolution feature into one dimension to obtain the multiple wavelet features.
In this embodiment, Gabor wavelets are used to extract wavelet features from the binary image, and the multiple wavelet features are determined as the hand feature. Since Gabor wavelets can be acquired in multiple directions, multi-directional feature acquisition is realized, the feature data becomes more comprehensive, and the accuracy of the obtained recognition result is higher.
As one embodiment, recognizing the hand feature using the gesture recognition model to obtain the gesture information may include:
Inputting the multiple wavelet features into the gesture recognition model respectively to obtain multiple recognition results; and
Taking the same recognition result with the largest count among the multiple recognition results as the gesture information.
The multiple wavelet features are extracted from the binary image, and the binary image is obtained from a target image frame of the video data; therefore, the multiple wavelet features actually correspond to a single piece of gesture information. However, since the gesture recognition model has recognition errors, the multiple recognition results obtained by inputting the multiple wavelet features into the gesture recognition model may differ. The multiple recognition results can therefore be counted, and the same recognition result with the largest count taken as the gesture information. This ensures the accuracy of gesture recognition: even with wavelet features in different directions, a single recognition result for the multiple wavelet features can be determined, thereby improving the accuracy of recognition.
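The majority vote over the per-orientation recognition results can be sketched as (an illustration of ours, not code from the disclosure):

```python
from collections import Counter

def vote(results):
    # Take the recognition result that occurs most often among the
    # per-orientation predictions as the final gesture information.
    return Counter(results).most_common(1)[0][0]
```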
As one embodiment, the gesture recognition model can be obtained by training in advance in the following manner:
Obtaining sample images corresponding to different gesture information;
Extracting multiple wavelet features of each sample image using the multi-directional Gabor filters; and
Training the gesture recognition model based on the wavelet features of each sample image and the gesture information corresponding to each sample image.
The gesture information corresponding to each sample image is known gesture information; that is, the gesture recognition model is trained based on the wavelet features of each sample image and the known gesture information of each sample image.
When collecting the sample images, image acquisition can be carried out according to set hand gestures, with the gesture information corresponding to each gesture already set; that is, the gesture information of each sample image is known information.
In certain embodiments, the sample images may include training images and test images.
The gesture information corresponding to the training images is set as known, and the training images can be used to train multiple groups of candidate gesture recognition models. The gesture information corresponding to the test images is set as unknown, and the test images can be used to test the multiple groups of candidate gesture recognition models obtained by training, yielding gesture test results; based on the comparison between the gesture test results and the gesture information of the test images, the candidate gesture recognition model with the best recognition rate is selected as the target gesture recognition model. Optionally, extracting the multiple wavelet features of each sample image using the multi-directional Gabor filters may include:
Extracting, using the multi-directional Gabor filters, the multiple wavelet features of each training image and the multiple wavelet features of each test image respectively.
Training the gesture recognition model based on the wavelet features of each sample image and the gesture information corresponding to each sample image may include:
Training multiple groups of candidate models of the gesture recognition model based on the wavelet features of the training images and the gesture information corresponding to each training image; testing the multiple groups of candidate models respectively using the wavelet features of the test images to obtain multiple groups of candidate results; obtaining, based on the matching degree between the multiple groups of candidate results and the gesture information of the test images, the recognition rate of the gesture recognition model corresponding to each group of candidate model parameters; and determining the group of candidate model parameters with the highest recognition rate as the model parameters of the gesture recognition model.
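The recognition-rate-based model selection can be sketched as follows, treating each candidate model as a callable from feature to label (an abstraction of ours):

```python
def select_model(candidates, test_features, test_labels):
    # Pick the candidate model with the highest recognition rate
    # (fraction of correct predictions) on the held-out test images.
    def rate(model):
        hits = sum(1 for f, y in zip(test_features, test_labels) if model(f) == y)
        return hits / len(test_labels)
    return max(candidates, key=rate)
```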
Obtaining the sample images corresponding to different gesture information may include: acquiring images of the different gestures with a camera to obtain acquisition images; extracting the gesture image corresponding to the hand region from each acquisition image; converting the gesture image into the YCrCb color space to obtain a target conversion image; and performing noise filtering and binary conversion on the target conversion image to obtain the corresponding sample images.
The gesture recognition model can be trained in advance. Each gesture can be acquired multiple times to obtain acquisition images, and the acquisition images are subjected to the same processing as each target image frame of the video data containing the gesture, yielding binary template images containing the gesture region, namely the sample images. The gesture information corresponding to each sample image is known.
In the embodiment of the present application, the gesture recognition model is trained in advance and can be called directly when needed, so that the smart device does not need to execute the training process of the gesture recognition model in real time, which can reduce the amount of calculation, reduce the processing time, and speed up the gesture recognition process.
As shown in Fig. 4, which is a flowchart of one embodiment of an interaction control method provided by the embodiments of the present application, the method may include:
401: Acquiring video data during hand movement, and obtaining target image frames of the video data.
402: Extracting the hand feature in a target image frame.
403: Recognizing the hand feature using a gesture recognition model to obtain gesture information.
404: Determining the control command corresponding to the gesture information.
405: Determining the target position of the predetermined hand part in the target image frame.
406: Mapping the target position to the display screen to obtain the corresponding operating position.
407: Executing the control command at the operating position.
In the embodiment of the present application, by acquiring video data during hand movement and performing hand feature extraction and target position determination for the predetermined hand part on the target image frames of the video data, the gesture information of the hand feature can be obtained accurately, the control command corresponding to the gesture information can be obtained, and the target position can be mapped to the display screen to obtain the corresponding operating position, so that the corresponding control command is executed at the operating position and accurate operation control is realized. Gesture control actions can be executed simply, without additional data acquisition devices, while an accurate hand recognition result is obtained through the accurate judgment method.
The interaction control method of the embodiment of the present application can also be realized based on any of the gesture processing methods described above.
As shown in Fig. 5, which is a schematic structural diagram of one embodiment of a gesture processing device provided by the embodiments of the present application, the device includes: a processing component 501, and a storage component 502 and a display component 503 each connected to the processing component 501.
The storage component 502 stores one or more computer program instructions, which are called and executed by the processing component 501.
The processing component 501 is configured to:
Acquire video data during hand movement, and obtain target image frames of the video data; extract the hand feature in a target image frame; recognize the hand feature using a gesture recognition model to obtain gesture information; determine the control command corresponding to the gesture information; determine the target position of the predetermined hand part in the target image frame; and map the target position to the display component to obtain the corresponding operating position, wherein the control command is executed at the operating position.
The gesture processing device provided by the embodiments of the present application can be a smart device, such as a smart TV, a smart refrigerator, a computer, or various other types of smart devices. The gesture processing device may include a display screen, on which display content such as buttons, dialog boxes, files and application icons can be shown. Through hand movement, the user can execute operations such as click, double-click and slide on the display screen, where operations such as click and double-click are mainly directed at display content such as buttons, dialog boxes and application icons, and slide refers to the position change of the user's hand on the screen.
The gesture processing device of the embodiment of the present application can also be a virtual smart device such as a VR (Virtual Reality) device or an AR (Augmented Reality) device. The virtual smart device can be externally configured with a camera to acquire video data during the user's hand movement, and map information such as the motion trajectory and gesture of the user's hand in the outside world into the virtual scene, so as to control the virtual content in the virtual scene through the movement of the hand.
Optionally, the gesture processing device can use a camera to acquire the video data during hand movement. The camera can be integrated into the device, or can be a camera independent of the device; the position and type of the camera are not limited in this application.
After the camera acquires the video data during hand movement, the processing component of the device can obtain the video data acquired by the camera by reading the camera data. Obtaining the target image frames in the video data by the processing component may specifically be: obtaining each image frame in the video data. By tracking all the image frames, more accurate gesture tracking is obtained. However, since the image processing speed is limited, performing gesture tracking on all the image frames entails a very large amount of calculation, which easily reduces the gesture processing speed. Therefore, in order to improve the tracking processing speed, obtaining the target image frames in the video data by the processing component may specifically be: sampling the target image frames from the video data according to a preset image frame sampling interval. Reducing the number of image frames reduces the total tracking calculation amount and improves the processing speed. The image sampling frequency can be set in advance as required.
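Sampling at a preset frame interval can be sketched as (the interval of 3 is an assumed example value):

```python
def sample_frames(frames, interval=3):
    # Keep every `interval`-th frame to cut the tracking workload.
    return frames[::interval]
```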
Optionally, the camera can acquire video data in the RGB color space, in which case each image frame of the video data is in RGB format. After the processing component obtains the video data, it can divide the video data into individual image frames and perform gesture recognition on the corresponding target image frames.
Optionally, the processing component can extract the hand feature from the hand region of a target image frame of the video data. Performing feature extraction on the hand region only reduces the amount of data to be calculated and improves computational efficiency. The processing component can determine the target position of the predetermined hand part in the target image frame of the video data.
The hand feature can be extracted by the processing component for each target image frame of the video data, so as to obtain the hand feature in every target image frame and realize continuous control.
Optionally, the processing component can input the hand feature into the gesture recognition model and obtain the gesture information through calculation. The gesture recognition model can be trained in advance. As a possible implementation, the processing component can store the gesture recognition model in advance; after obtaining the hand feature, the processing component can call the pre-stored gesture recognition model to recognize the hand feature, and obtain the gesture information from the calculated output of the gesture recognition model.
Optionally, recognizing the hand feature with the gesture recognition model can mean that, after the hand feature is input into the gesture recognition model, the corresponding model output result is obtained. The model output result can be data information; the corresponding key value can be determined according to the recognized data information, and the corresponding gesture information is then determined from the data information or from the key value corresponding to the data information.
The gesture information can refer to the meaning of the gesture, and each piece of gesture information can have a corresponding control command.
In practical applications, the gesture recognition model can be a support vector machine model. Recognizing the hand feature with a pre-trained support vector machine model can yield a better recognition result. Different gestures can be defined as different gesture information, which can be set as required.
Optionally, the gesture information can be represented by different key values, with different key values representing the corresponding control commands. When the gesture recognition model recognizes a given piece of gesture information, that gesture information can be represented by its key value.
Determining the control command corresponding to the gesture information by the processing component may specifically be: determining, among the different key values, the target key value corresponding to the gesture information, and determining the control command corresponding to the target key value.
The predetermined hand part may include parts such as a fingertip, the palm center, or a special mark on the hand, and hand tracking can be performed for these different parts of the hand.
Optionally, determining the target position of the predetermined hand part in the target image frame by the processing component can refer to determining the coordinate position of the fingertip of the hand in the target image frame, determining the coordinate position of the palm center of the hand in the target conversion image, or determining the coordinate position of the special mark of the hand in the target image frame.
The target position is the position of the predetermined hand part in the target image frame, and can specifically refer to the coordinate of the predetermined hand part in the target image frame. The position of the predetermined hand part in the target image frame can be mapped to the display screen of the smart device to obtain the corresponding operating position. Accurate operation control can be realized through the determination of the operating position.
In the embodiment of the present application, it is no longer necessary to acquire the user's hand movement data with external sensing devices such as data gloves; instead, a camera is used to acquire video data during the user's hand movement in the natural environment, and the corresponding gesture recognition processing is performed on the video data, which reduces the complexity of the device and the difficulty of operation. Meanwhile, based on the position of the target feature of the hand in the target image frame, the corresponding operating position of the hand can be obtained, accurate position positioning can be realized, and the control precision is improved.
As one embodiment, the processing component is further configured to:
Perform image segmentation on the target image frame to obtain the gesture image of the hand region.
Extracting the hand feature in the target image frame by the processing component is specifically:
Extracting the hand feature in the gesture image.
Determining the target position of the predetermined hand part in the target image frame by the processing component is specifically:
Determining, based on the gesture image, the target position of the predetermined hand part in the target image frame.
Optionally, performing image segmentation on the target image frame includes: segmenting the background image of the target image frame from the region where the hand is located, to obtain the gesture image.
Segmenting the background image of the target image frame from the hand region by the processing component may specifically be:
Segmenting the hand region and the background image in the target image frame using the Otsu algorithm, to obtain the gesture image. The gesture image in the target image frame can be segmented out by the Otsu algorithm; specifically, the hand region can be separated from the other background regions to obtain the gesture image.
After the gesture image region is determined, the gesture image can be subjected to conversion processing, and the hand feature in the gesture image can be extracted.
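Otsu's method, which the disclosure invokes for hand/background segmentation, selects the threshold maximising the between-class variance of the grayscale histogram; a self-contained sketch for 8-bit values:

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold (maximising between-class variance)
    for a flat list of 8-bit grayscale values."""
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    sum_b = 0.0   # running sum of background intensities
    w_b = 0       # background pixel count
    best_t, best_var = 0, -1.0
    for t in range(256):
        w_b += hist[t]
        if w_b == 0:
            continue
        w_f = total - w_b
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b              # background mean
        m_f = (sum_all - sum_b) / w_f  # foreground mean
        var = w_b * w_f * (m_b - m_f) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

Pixels above the returned threshold would be treated as one class (e.g. hand) and the rest as background.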
Extracting the hand feature in the gesture image by the processing component may specifically be: convolving the gesture image with a feature filter, and taking the calculated convolution result as the hand feature of the gesture image.
In the embodiment of the present application, by performing image segmentation on the target image frame of the video data, the gesture image where the hand is located is obtained, which reduces the amount of data in subsequent image processing and thus improves the processing speed.
In certain embodiments, the target image frames of the video data can be in RGB format. However, since an RGB image is strongly affected by illumination, the RGB image can be converted into a YCrCb image through color space conversion. The hand feature can then be extracted, and the target position of the target feature of the hand in the target image frame determined, on the converted YCrCb image.
As one embodiment, extracting the hand feature in the gesture image by the processing component may specifically be:
Performing color space conversion on the gesture image to obtain a target conversion image; and extracting the hand feature in the target conversion image.
As another embodiment, determining, based on the gesture image, the target position of the predetermined hand part in the target image frame by the processing component may specifically be:
Performing color space conversion on the gesture image to obtain a target conversion image; and determining, based on the target conversion image, the target position of the predetermined hand part in the target image frame.
Optionally, performing color space conversion on the gesture image by the processing component to obtain the target conversion image may specifically be: performing color space conversion on the gesture image, and extracting the color space corresponding to the target channel after conversion to obtain the target conversion image.
As a possible implementation, performing color space conversion on the gesture image by the processing component to obtain the target conversion image may specifically be: converting the color space of the target image frame of the video data from RGB format to the YCrCb space, and extracting the Cb channel to obtain the target conversion image.
In certain embodiments, the processing component can convert a target image frame in RGB format into a YCrCb target image frame through the following formulas:
Y = 0.2990R + 0.5870G + 0.1140B
Cr = 0.5000R − 0.4187G − 0.0813B + 128
Cb = −0.1687R − 0.3313G + 0.5000B + 128
Wherein R, G and B respectively represent the red, green and blue color channels, and Y, Cr and Cb respectively represent the luminance, the difference between the red component of the RGB input signal and the luminance value, and the difference between the blue component of the RGB input signal and the luminance value.
By converting each gesture image from the RGB space to the YCrCb space, the processing component obtains an image in YCrCb format, which can eliminate the influence of illumination on the image, reduce the negative effect of illumination on feature extraction, and improve the accuracy of extraction.
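These coefficients correspond to the standard full-range BT.601 RGB-to-YCrCb conversion, which can be sketched per pixel as (note that pure white and pure black both map to Cr = Cb = 128, i.e. zero chroma):

```python
def rgb_to_ycrcb(r, g, b):
    # Full-range RGB -> YCrCb conversion with BT.601 coefficients.
    y = 0.2990 * r + 0.5870 * g + 0.1140 * b
    cr = 0.5000 * r - 0.4187 * g - 0.0813 * b + 128
    cb = -0.1687 * r - 0.3313 * g + 0.5000 * b + 128
    return y, cr, cb
```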
In the embodiment of the present application, performing color space conversion on the gesture image reduces the negative effect of illumination on the target image frame, so that a hand feature largely unaffected by illumination can be obtained, and the gesture recognition result can be improved.
As a possible implementation, determining, based on the gesture image, the target position of the predetermined hand part in the target image frame by the processing component may specifically be:
Determining the detection position of the predetermined hand part in the gesture image; and determining, based on the distribution relation between the gesture image and the target image frame, the target position of the detection position in the target image frame.
Optionally, determining, based on the distribution relation between the gesture image and the target image frame, the target position of the detection position in the target image frame by the processing component may specifically be:
Determining, based on the distribution relation between the gesture image and the target image frame, the position of the center point of the gesture image in the target image frame; and mapping, according to the positional relationship between the detection position and the center point, the detection position from the gesture image into the target image frame.
In certain embodiments, determining the detection position of the predetermined hand part in the gesture image by the processing component may specifically be:
Determining the closed region of the hand in the gesture image; obtaining at least one fingertip position in the closed region; and obtaining, according to the at least one fingertip position, the detection position of the predetermined hand part in the gesture image.
In certain embodiments, determining the closed region of the hand in the gesture image by the processing component may specifically be:
Detecting, using a contour detection algorithm, at least one contour in the gesture image; determining the target contour with the largest area among the at least one contour; and determining the region corresponding to the target contour as the closed region of the hand in the gesture image.
Optionally, detecting, using the contour detection algorithm, the at least one contour in the gesture image by the processing component may specifically be:
Performing color space conversion on the gesture image to obtain a target conversion image; and determining, using the contour detection algorithm, the contour region of the hand in the target conversion image.
In certain embodiments, obtaining the at least one fingertip position in the closed region by the processing component may specifically be:
Detecting, using a convexity defect detection algorithm, multiple vertices corresponding to the contour of the closed region; determining multiple fingertip points according to the distance and angle correspondence between any two of the multiple vertices; and determining the fingertip coordinates of the multiple fingertip points in the closed region respectively, to obtain the at least one fingertip position.
The multiple vertices may include multiple fingertip points and multiple palm points. The highest point among the multiple fingertip points may be the point to be tracked in the embodiment of the present invention.
Optionally, detecting, using the convexity defect detection algorithm, the multiple vertices corresponding to the contour of the closed region by the processing component may specifically be:
Wrapping the contour of the closed region with a convex hull to obtain a closed image of the hand contour; and detecting, using the convexity defect detection algorithm, the multiple vertices in the closed image.
In some embodiments, the predetermined hand position may include the palm center of the hand.
Optionally, after detecting the plurality of vertices of the contour of the closed region using the convexity-defect detection algorithm, the processing component may further be configured to:
determine a plurality of palm points among the plurality of vertices; determine the palm coordinates of each palm point in the target image frame corresponding to the gesture image; and determine that the coordinate point with the smallest sum of Euclidean distances to the plurality of palm coordinates is the palm-center point, whose coordinates form the target position.
The palm coordinates of the plurality of palm points in the target converted image may be determined based on the distribution relationship between the gesture image and the corresponding target image frame.
Optionally, the palm-center point is the point with the smallest sum of Euclidean distances to the other points. In practical applications, the processing component may determine, among the plurality of palm points, the point with the smallest Euclidean distance to the other palm points as a first palm point, perform a distance transform based on the Euclidean distances and angles between the first palm point and the other palm points, and determine the point with the smallest sum of Euclidean distances to the plurality of palm points as the palm-center point.
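The palm-center rule above (the point whose summed Euclidean distance to all palm points is smallest) can be sketched directly; restricting the search to the candidate palm points themselves is an assumption of this sketch:

```python
import numpy as np

def palm_center(points):
    """Candidate point with the smallest summed Euclidean distance
    to all palm points."""
    pts = np.asarray(points, dtype=float)
    # Pairwise distance matrix, then the row with the smallest sum.
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    return pts[dists.sum(axis=1).argmin()]

palm_points = [(0, 0), (10, 0), (5, 4), (0, 8), (10, 8)]
print(palm_center(palm_points))  # the central point (5, 4)
```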
In actual tracking, the processing component may track the palm center to obtain a more centered tracking position and hence an accurate tracking result. Palm tracking is better suited to operations such as panning, swiping left, and swiping right, while fingertip tracking is better suited to operations such as clicking, left-clicking, and right-clicking.
By using the contour-detection algorithm and the convexity-defect detection algorithm, an accurate target position can be determined, which in turn further improves the accuracy of recognition once the target position is determined.
In some embodiments, the predetermined hand position includes the highest fingertip point of the user's hand.
The processing component obtaining, according to the at least one fingertip position, the detection position of the predetermined hand position in the gesture image may specifically be: determining the coordinates of the highest fingertip point among the plurality of fingertip coordinates as the detection position of the predetermined hand position in the gesture image.
In some embodiments, the processing component mapping the target position to the display screen to obtain the corresponding operating position may specifically be:
inputting the target position into a Kalman filter and computing the operating position.
Through the mapping of the Kalman filter, embodiments of the invention can accurately predict the position of the hand relative to the display screen and obtain an accurate control result.
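One plausible reading of the Kalman-filter step is a constant-velocity filter over (x, y): noisy target positions go in, smoothed operating positions come out. All matrix values below are illustrative assumptions, not the patent's parameters:

```python
import numpy as np

class Kalman2D:
    """Constant-velocity Kalman filter over (x, y) positions."""
    def __init__(self, q=1e-3, r=1e-1):
        self.x = np.zeros(4)              # state [x, y, vx, vy]
        self.P = np.eye(4)
        self.F = np.eye(4)
        self.F[0, 2] = self.F[1, 3] = 1.0  # position += velocity per step
        self.H = np.eye(2, 4)              # we observe position only
        self.Q = q * np.eye(4)             # process noise (assumed)
        self.R = r * np.eye(2)             # measurement noise (assumed)

    def step(self, z):
        # Predict, then correct with the measured target position z.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        innov = np.asarray(z, float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ innov
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2].copy()

kf = Kalman2D()
# Jittery target positions moving diagonally across the screen.
track = [kf.step((t + 0.3 * np.sin(t), t)) for t in range(20)]
print(np.round(track[-1], 1))
```

Scaling the filtered position from image coordinates to screen coordinates would follow as a final affine step.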
In some embodiments, the processing component extracting the hand feature in the target converted image may specifically be:
performing Gaussian-blur denoising on the target converted image to obtain a denoised image; converting the denoised image into a binary image; and extracting the hand feature from the binary image.
Optionally, after obtaining the binary image, the processing component may use a morphology algorithm to eliminate discrete points in the binary image and fill in missing points, so that the binary image characterizes the gesture better and the accuracy of the extracted hand feature improves. That is, the processing component may extract the hand feature of the hand after performing discrete-point elimination and missing-point filling on the binary image.
The processing component performing Gaussian-blur denoising on the gesture image to obtain the denoised image may specifically be: passing the gesture image through a Gaussian filter to obtain the denoised image. The Gaussian filter filters out high-frequency noise in the gesture image, yielding the denoised image.
Optionally, the processing component converting the denoised image into the binary image may specifically be: converting the denoised image into the binary image using a threshold-conversion algorithm. As one possible implementation, the processing component may determine the pixel value of each pixel in the denoised image and determine a pixel threshold; if the pixel value of a pixel is greater than the pixel threshold, that pixel may be set to 1, and if it is less than the pixel threshold, that pixel may be set to 0.
As one possible implementation, the processing component extracting the hand feature from the binary image may specifically be:
performing feature extraction in multiple directions on the binary image using multi-directional Gabor filters to obtain a plurality of wavelet features; and determining the plurality of wavelet features as the hand feature.
A Gabor filter is a filter constituted by a two-dimensional Gabor kernel function, obtained by multiplying a Gaussian function by a cosine function. The kernel can be expressed as:

g(x, y; λ, θ, φ, σ, γ) = exp(−(x′² + γ²y′²) / (2σ²)) · cos(2πx′/λ + φ)

x′ = x cos θ + y sin θ

y′ = −x sin θ + y cos θ

where θ, φ, γ, λ, and σ are parameters: θ is the orientation of the parallel stripes of the filter kernel, φ is the phase offset of the cosine function, γ is the spatial aspect ratio (ellipticity), λ is the wavelength, and σ is the standard deviation of the Gaussian envelope.
The main reason for using Gabor filters is that, with the other parameters fixed, varying only the orientation θ yields Gabor filters in multiple directions, from which the hand features of the binary image in those directions can be collected. Optionally, θ may be chosen as 0°, 45°, 90°, and 135°. In practical applications, the other parameters (φ, γ, λ, σ, etc.) can be determined in advance through repeated tests to find the values that collect the most accurate feature data. For example, tests show that with φ = 180°, γ = 2, λ = 5, σ = 2, and a kernel size of 12×12, the Gabor filter extracts gesture features well.
As one possible implementation, the processing component performing feature extraction in multiple directions on the binary image using the multi-directional Gabor filters to obtain the plurality of wavelet features may specifically be:
determining the filter functions corresponding to the multi-directional Gabor filters; convolving the binary image with each of the filter functions to obtain convolution features in multiple directions; and flattening each convolution feature into one dimension to obtain the plurality of wavelet features.
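The Gabor bank above can be built directly from the kernel formula; the parameter values follow the θ ∈ {0°, 45°, 90°, 135°} choice and the example values given in the text. Using a frequency-domain product (circular convolution) instead of exact spatial convolution is a simplification of this sketch:

```python
import numpy as np

def gabor_kernel(theta, lam=5.0, sigma=2.0, gamma=2.0, phi=np.pi, size=12):
    """2-D Gabor kernel: exp(-(x'^2 + gamma^2 y'^2)/(2 sigma^2)) * cos(2 pi x'/lam + phi)."""
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half]
    xp = x * np.cos(theta) + y * np.sin(theta)
    yp = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xp**2 + gamma**2 * yp**2) / (2 * sigma**2)) \
        * np.cos(2 * np.pi * xp / lam + phi)

def gabor_features(img):
    """One flattened response map per orientation theta in {0, 45, 90, 135} degrees."""
    feats = []
    for theta in np.deg2rad([0, 45, 90, 135]):
        k = gabor_kernel(theta)
        # Frequency-domain product; close enough to spatial convolution here.
        conv = np.real(np.fft.ifft2(np.fft.fft2(img) *
                                    np.fft.fft2(k, img.shape)))
        feats.append(conv.ravel())  # the one-dimensional "wavelet feature"
    return feats

img = np.random.default_rng(0).random((24, 24))
feats = gabor_features(img)
print(len(feats), feats[0].shape)  # 4 orientations, one flattened map each
```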
In the present embodiment, wavelet features are extracted from the binary image using Gabor wavelets, and the plurality of wavelet features is determined as the hand feature. Since Gabor wavelets can be computed for multiple directions, multi-directional feature collection is achieved, making the feature data more comprehensive and the recognition results more accurate.
As one embodiment, the processing component recognizing the hand feature using the gesture-recognition model to obtain the gesture information may specifically be:
inputting each of the plurality of wavelet features into the gesture-recognition model to obtain a plurality of recognition results; and
taking the identical recognition result that occurs most often among the plurality of recognition results as the gesture information.
The plurality of wavelet features is extracted from the binary image, which is in turn obtained from a target image frame of the video data; the plurality of wavelet features therefore corresponds to a single gesture. However, because the gesture-recognition model has recognition error, the recognition results obtained by inputting the wavelet features may differ. The processing component therefore counts the plurality of recognition results and takes the identical result with the largest count as the gesture information. This ensures the accuracy of gesture recognition: even across wavelet features of different directions, a consistent recognition result can be determined, improving recognition accuracy.
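The counting step amounts to a majority vote over the per-orientation results; tie-breaking by first-seen order is an assumption of this sketch:

```python
from collections import Counter

def vote(results):
    """Return the recognition result that occurs most often."""
    return Counter(results).most_common(1)[0][0]

# One recognition result per Gabor orientation; the majority wins.
print(vote(["fist", "palm", "fist", "fist"]))  # → fist
```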
As one embodiment, the processing component may train the gesture-recognition model in advance as follows:
obtain sample images corresponding to different gesture information; extract a plurality of wavelet features from each sample image using the multi-directional Gabor filters; and train the gesture-recognition model based on the wavelet features of each sample image and the gesture information corresponding to each sample image.
The gesture information corresponding to each sample image is known gesture information; that is, the gesture-recognition model is trained from the wavelet features of each sample image together with the known gesture information of each sample image.
When collecting the sample images, image acquisition can be carried out for set hand gestures, with the gesture information for each gesture set in advance; the gesture information of each sample image is therefore known.
In some embodiments, the sample images may include training images and test images.
The gesture information corresponding to the training images is set as known, and the training images can be used to train multiple groups of candidate gesture-recognition models. The gesture information corresponding to the test images is set as unknown; the test images can be used to test the trained candidate models, producing gesture test results. By comparing the gesture test results against the gesture information of the test images, the candidate model with the best recognition rate is selected as the target gesture-recognition model. Optionally, the processing component extracting the plurality of wavelet features of each sample image using the multi-directional Gabor filters may specifically be:
extracting, using the multi-directional Gabor filters, the plurality of wavelet features of each training image and the plurality of wavelet features of each test image, respectively.
The processing component training the gesture-recognition model based on the wavelet features of each sample image and the corresponding gesture information may specifically be:
training multiple groups of candidate models of the gesture-recognition model based on the wavelet features and the corresponding gesture information of the training images; testing the multiple groups of candidate models using the wavelet features of the test images to obtain multiple groups of candidate results; obtaining, from the degree to which the candidate results match the gesture information of the test images, the recognition rate of the gesture-recognition model corresponding to each group of candidate model parameters; and determining the group of candidate model parameters with the highest recognition rate as the model parameters of the gesture-recognition model.
The processing component obtaining the sample images corresponding to the different gesture information may specifically be: acquiring the different gesture information with a camera to obtain captured images; extracting the gesture image corresponding to the hand region; converting the gesture image into the YCrCb color space to obtain a target converted image; and performing noise filtering and binary conversion on the target converted image to obtain the corresponding sample image.
The gesture-recognition model can be trained in advance: after each gesture is captured multiple times with the camera, the captured images undergo the same processing as each target image frame of the gesture video data, yielding binary template images, i.e. sample images, containing the gesture region. The gesture information corresponding to each sample image is known.
In this embodiment of the application, the gesture-recognition model is trained in advance and can be called directly when needed, so the smart device need not run the training process in real time; this reduces the computation load, shortens the processing time, and speeds up gesture recognition.
As shown in FIG. 6, which is a structural schematic diagram of one embodiment of an interaction-control device provided by an embodiment of the application, the device includes: a processing component 601, and a storage component 602 and a display component 603 each connected to the processing component 601.
The storage component 602 stores one or more computer program instructions; the one or more computer program instructions are to be called and executed by the processing component 601.
The processing component 601 is configured to:
acquire video data of a hand in motion and target image frames of the video data; extract the hand feature in each target image frame; recognize the hand feature using a gesture-recognition model to obtain gesture information; determine the control command corresponding to the gesture information; determine the target position of the predetermined hand position in the target image frame; map the target position to the display screen to obtain the corresponding operating position; and execute the control command at the operating position.
In this embodiment of the application, video data of the hand in motion is acquired, the hand feature is extracted from the target image frames of the video data, and the target position of the predetermined hand position is determined. The gesture information of the hand feature can thus be obtained accurately, and the corresponding control command obtained, so that mapping the target position to the display screen yields the corresponding operating position, at which the control command is executed, achieving accurate operation control. No additional data-acquisition device is needed: gesture-control actions can be performed simply, yet the accurate decision procedure yields an accurate hand-recognition result.
An embodiment of the application further provides a computer-readable storage medium storing a computer program which, when executed by a computer, implements the gesture processing method of any of the above embodiments.
An embodiment of the application further provides a computer-readable storage medium storing a computer program which, when executed by a computer, implements the interaction control method of any of the above embodiments.
An embodiment of the application further provides a computer-readable storage medium storing a computer program which, when executed by a computer, implements the smart-device interaction-control method of any of the above embodiments.
In a typical configuration, a smart device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. Certain terms are used throughout the specification and claims to refer to particular components; those skilled in the art will appreciate that hardware manufacturers may call the same component by different names. This specification and the claims distinguish components by functional difference rather than by difference in name. "Comprising", as used throughout the specification and claims, is an open term and should be construed as "including but not limited to". The description that follows describes preferred embodiments of the application; it is given for the purpose of illustrating the general principles of the application and is not intended to limit its scope, which is defined by the appended claims.
It should also be noted that the terms "include", "comprise", and their variants are intended to cover non-exclusive inclusion, so that a product or system including a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a product or system. In the absence of further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the product or system that includes it.
The above shows and describes several preferred embodiments of the application. As noted, however, the application is not limited to the forms disclosed herein; they should not be taken to exclude other embodiments, and the application can be used in various other combinations, modifications, and environments, and can be altered within the scope contemplated herein in light of the above teachings or the skill and knowledge of the related art. Changes and modifications made by those skilled in the art that do not depart from the spirit and scope of the application shall all fall within the protection scope of the appended claims.

Claims (16)

1. A gesture processing method, characterized by comprising:
acquiring video data of a hand in motion, and obtaining a target image frame in the video data;
extracting a hand feature in the target image frame;
recognizing the hand feature using a gesture-recognition model to obtain gesture information;
determining a control command corresponding to the gesture information;
determining a target position of a predetermined hand position in the target image frame; and
mapping the target position to a display screen to obtain a corresponding operating position, wherein the control command is executed at the operating position.
2. The method according to claim 1, characterized by further comprising, after acquiring the video data of the hand in motion and obtaining the target image frame in the video data:
performing image segmentation on the target image frame to obtain a gesture image of the hand region;
wherein extracting the hand feature in the target image frame comprises:
extracting the hand feature in the gesture image; and
wherein determining the target position of the predetermined hand position in the target image frame comprises:
determining, based on the gesture image, the target position of the predetermined hand position in the target image frame.
3. The method according to claim 2, characterized in that determining, based on the gesture image, the target position of the predetermined hand position in the target image frame comprises:
determining a detection position of the predetermined hand position in the gesture image; and
determining, based on a distribution relationship between the gesture image and the target image frame, a target position of the detection position in the target image frame.
4. The method according to claim 2, characterized in that extracting the hand feature in the gesture image comprises:
performing color-space conversion on the gesture image to obtain a target converted image; and
extracting the hand feature in the target converted image.
5. The method according to claim 3, characterized in that determining the detection position of the predetermined hand position in the gesture image comprises:
determining a closed region of the hand in the gesture image;
obtaining at least one fingertip position in the closed region; and
obtaining, according to the at least one fingertip position, the detection position of the predetermined hand position in the gesture image.
6. The method according to claim 5, characterized in that determining the closed region of the hand in the gesture image comprises:
detecting at least one contour in the gesture image using a contour-detection algorithm;
determining a target contour with the largest area among the at least one contour; and
determining a region corresponding to the target contour as the closed region of the hand in the gesture image.
7. The method according to claim 5, characterized in that obtaining the at least one fingertip position in the closed region comprises:
detecting a plurality of vertices corresponding to a contour of the closed region using a convexity-defect detection algorithm;
determining a plurality of fingertip points according to distance and angle relationships between any two of the plurality of vertices; and
determining a plurality of fingertip coordinates of the plurality of fingertip points in the closed region, respectively, to obtain the at least one fingertip position.
8. The method according to claim 7, characterized in that the predetermined hand position comprises a highest fingertip point of the user's hand; and
obtaining, according to the at least one fingertip position, the detection position of the predetermined hand position in the gesture image comprises: determining coordinates of the highest fingertip point among the plurality of fingertip coordinates as the detection position of the predetermined hand position in the gesture image.
9. The method according to claim 4, characterized in that extracting the hand feature in the target converted image comprises:
performing Gaussian-blur denoising on the target converted image to obtain a denoised image;
converting the denoised image into a binary image; and
extracting the hand feature in the binary image.
10. The method according to claim 9, characterized in that extracting the hand feature in the binary image comprises:
performing feature extraction in a plurality of directions on the binary image using multi-directional Gabor filters to obtain a plurality of wavelet features; and
determining the plurality of wavelet features as the hand feature.
11. The method according to claim 9, characterized in that recognizing the hand feature using the gesture-recognition model to obtain the gesture information comprises:
inputting each of the plurality of wavelet features into the gesture-recognition model to obtain a plurality of recognition results; and
taking an identical recognition result occurring most often among the plurality of recognition results as the gesture information.
12. The method according to claim 1, characterized in that the gesture-recognition model is trained in advance as follows:
obtaining sample images corresponding to different gesture information;
extracting a plurality of wavelet features of each sample image using multi-directional Gabor filters; and
training the gesture-recognition model based on each wavelet feature of each sample image and the gesture information corresponding to each sample image.
13. The method according to claim 1, characterized in that mapping the target position to the display screen to obtain the corresponding operating position comprises:
inputting the target position into a Kalman filter and computing the operating position of the hand on the display screen.
14. An interaction control method, characterized by comprising:
acquiring video data of a hand in motion and a target image frame of the video data;
extracting a hand feature in the target image frame;
recognizing the hand feature using a gesture-recognition model to obtain gesture information;
determining a control command corresponding to the gesture information;
determining a target position of a predetermined hand position in the target image frame;
mapping the target position to a display screen to obtain a corresponding operating position; and
executing the control command at the operating position.
15. A gesture processing device, characterized by comprising: a processing component, and a storage component and a display component each connected to the processing component; the storage component storing one or more computer program instructions, the one or more computer program instructions to be called and executed by the processing component;
the processing component being configured to:
acquire video data of a hand in motion and obtain a target image frame of the video data; extract a hand feature in the target image frame; recognize the hand feature using a gesture-recognition model to obtain gesture information; determine a control command corresponding to the gesture information; determine a target position of a predetermined hand position in the target image frame; and map the target position to the display component to obtain a corresponding operating position, wherein the control command is executed at the operating position.
16. An interaction control device, characterized by comprising: a processing component, and a storage component and a display component each connected to the processing component; the storage component storing one or more computer program instructions, the one or more computer program instructions to be called and executed by the processing component;
the processing component being configured to:
acquire video data of a hand in motion and obtain a target image frame in the video data; extract a hand feature in the target image frame; recognize the hand feature using a gesture-recognition model to obtain gesture information; determine a control command corresponding to the gesture information; determine a target position of a predetermined hand position in the target image frame; map the target position to the display component to obtain a corresponding operating position; and execute the control command at the operating position of the display component.
CN201811032593.4A 2018-09-05 2018-09-05 Gesture processing method, interaction control method and equipment Pending CN109255324A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811032593.4A CN109255324A (en) 2018-09-05 2018-09-05 Gesture processing method, interaction control method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811032593.4A CN109255324A (en) 2018-09-05 2018-09-05 Gesture processing method, interaction control method and equipment

Publications (1)

Publication Number Publication Date
CN109255324A true CN109255324A (en) 2019-01-22

Family

ID=65047523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811032593.4A Pending CN109255324A (en) 2018-09-05 2018-09-05 Gesture processing method, interaction control method and equipment

Country Status (1)

Country Link
CN (1) CN109255324A (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871781A (en) * 2019-01-28 2019-06-11 山东大学 Dynamic gesture identification method and system based on multi-modal 3D convolutional neural networks
CN110263776A (en) * 2019-06-11 2019-09-20 南京邮电大学 A kind of equipment identifying system and method

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102142084A (en) * 2011-05-06 2011-08-03 北京网尚数字电影院线有限公司 Method for gesture recognition
US20110221974A1 (en) * 2010-03-11 2011-09-15 Deutsche Telekom Ag System and method for hand gesture recognition for remote control of an internet protocol tv
CN102591533A (en) * 2012-03-01 2012-07-18 桂林电子科技大学 Method and device for implementing a multi-touch screen system based on computer vision
CN103077381A (en) * 2013-01-08 2013-05-01 郑州威科姆科技股份有限公司 Monocular dynamic gesture recognition method based on fractional Fourier transform
CN104038799A (en) * 2014-05-21 2014-09-10 南京大学 Gesture control method for three-dimensional television
US20150153836A1 (en) * 2012-08-09 2015-06-04 Tencent Technology (Shenzhen) Company Limited Method for operating terminal device with gesture and device
US20170153711A1 (en) * 2013-03-08 2017-06-01 Fastvdo Llc Visual Language for Human Computer Interfaces
CN107357428A (en) * 2017-07-07 2017-11-17 京东方科技集团股份有限公司 Human-computer interaction method, device and system based on gesture recognition

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871781A (en) * 2019-01-28 2019-06-11 山东大学 Dynamic gesture recognition method and system based on multi-modal 3D convolutional neural networks
CN109871781B (en) * 2019-01-28 2020-11-06 山东大学 Dynamic gesture recognition method and system based on multi-modal 3D convolutional neural networks
CN111752378A (en) * 2019-03-29 2020-10-09 福建天泉教育科技有限公司 Gesture instruction execution method and storage medium
CN111860082A (en) * 2019-04-30 2020-10-30 阿里巴巴集团控股有限公司 Information processing method, device and system
CN110263776B (en) * 2019-06-11 2022-09-16 南京邮电大学 Device identification system and method
CN110263776A (en) * 2019-06-11 2019-09-20 南京邮电大学 Device identification system and method
CN110414495A (en) * 2019-09-24 2019-11-05 图谱未来(南京)人工智能研究院有限公司 Gesture recognition method and apparatus, electronic device, and readable storage medium
CN111126279A (en) * 2019-12-24 2020-05-08 深圳市优必选科技股份有限公司 Gesture interaction method and gesture interaction device
CN111126279B (en) * 2019-12-24 2024-04-16 深圳市优必选科技股份有限公司 Gesture interaction method and gesture interaction device
CN111142664A (en) * 2019-12-27 2020-05-12 恒信东方文化股份有限公司 Multi-user real-time hand tracking system and method
CN111142664B (en) * 2019-12-27 2023-09-01 恒信东方文化股份有限公司 Multi-user real-time hand tracking system and tracking method
CN111273778A (en) * 2020-02-14 2020-06-12 北京百度网讯科技有限公司 Method and device for controlling electronic equipment based on gestures
CN111273778B (en) * 2020-02-14 2023-11-07 北京百度网讯科技有限公司 Method and device for controlling electronic equipment based on gestures
US11227151B2 (en) 2020-03-05 2022-01-18 King Fahd University Of Petroleum And Minerals Methods and systems for computerized recognition of hand gestures
CN114153308B (en) * 2020-09-08 2023-11-21 阿里巴巴集团控股有限公司 Gesture control method, gesture control device, electronic equipment and computer readable medium
CN114153308A (en) * 2020-09-08 2022-03-08 阿里巴巴集团控股有限公司 Gesture control method and device, electronic equipment and computer readable medium
CN112286360A (en) * 2020-11-04 2021-01-29 北京沃东天骏信息技术有限公司 Method and apparatus for operating a mobile device
WO2022095674A1 (en) * 2020-11-04 2022-05-12 北京沃东天骏信息技术有限公司 Method and apparatus for operating mobile device
CN112351325A (en) * 2020-11-06 2021-02-09 惠州视维新技术有限公司 Gesture-based display terminal control method, terminal and readable storage medium
CN112351325B (en) * 2020-11-06 2023-07-25 惠州视维新技术有限公司 Gesture-based display terminal control method, terminal and readable storage medium
CN112506340A (en) * 2020-11-30 2021-03-16 北京市商汤科技开发有限公司 Device control method, device, electronic device and storage medium
CN112506340B (en) * 2020-11-30 2023-07-25 北京市商汤科技开发有限公司 Device control method and apparatus, electronic device and storage medium
CN112817447A (en) * 2021-01-25 2021-05-18 暗物智能科技(广州)有限公司 AR content display method and system
CN113032282A (en) * 2021-04-29 2021-06-25 北京字节跳动网络技术有限公司 Testing method, device and equipment of gesture recognition device
CN113032282B (en) * 2021-04-29 2024-04-09 北京字节跳动网络技术有限公司 Method, device and equipment for testing gesture recognition device
CN113448485A (en) * 2021-07-12 2021-09-28 交互未来(北京)科技有限公司 Large-screen window control method and device, storage medium and equipment
CN113642413A (en) * 2021-07-16 2021-11-12 新线科技有限公司 Control method, apparatus, device and medium
CN114217728A (en) * 2021-11-26 2022-03-22 广域铭岛数字科技有限公司 Control method, system, equipment and storage medium for visual interactive interface
CN114489341A (en) * 2022-01-28 2022-05-13 北京地平线机器人技术研发有限公司 Gesture determination method and apparatus, electronic device and storage medium
CN116071687A (en) * 2023-03-06 2023-05-05 四川港通医疗设备集团股份有限公司 Hand cleanliness detection method and system
CN116071687B (en) * 2023-03-06 2023-06-06 四川港通医疗设备集团股份有限公司 Hand cleanliness detection method and system

Similar Documents

Publication Publication Date Title
CN109255324A (en) Gesture processing method, interaction control method and equipment
CN107808143B (en) Dynamic gesture recognition method based on computer vision
CN105718878B (en) First-person-view in-air handwriting and interaction method based on cascaded convolutional neural networks
US10372226B2 (en) Visual language for human computer interfaces
CN104202547B (en) Method for extracting a target object from a projected image, projection interaction method, and system
US10216979B2 (en) Image processing apparatus, image processing method, and storage medium to detect parts of an object
CN109344701A (en) Dynamic gesture recognition method based on Kinect
Dominio et al. Hand gesture recognition with depth data
CN108647625A (en) Facial expression recognition method and device
CN102402680A (en) Method for locating hands and pointing points and confirming gestures in a human-computer interaction system
CN109598234A (en) Keypoint detection method and apparatus
CN105354862A (en) Method and system for detecting shadow of moving object in surveillance video
CN111460976B (en) Data-driven real-time hand motion assessment method based on RGB video
CN104821010A (en) Binocular-vision-based real-time extraction method and system for three-dimensional hand information
CN112906550B (en) Static gesture recognition method based on watershed transformation
CN109948450A (en) Image-based user behavior detection method, device and storage medium
CN107330354A (en) Natural gesture recognition method
CN109939432A (en) Intelligent rope-skipping counting method
WO2018076484A1 (en) Method for tracking pinched fingertips based on video
CN109325408A (en) Gesture determination method and storage medium
Störring et al. Computer vision-based gesture recognition for an augmented reality interface
CN110599463A (en) Tongue image detection and positioning algorithm based on lightweight cascade neural network
Rady et al. Smart gesture-based control in human computer interaction applications for special-need people
CN110633666A (en) Gesture track recognition method based on finger color patches
Hong et al. A real-time critical part detection for the blurred image of infrared reconnaissance balloon with boundary curvature feature analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2019-01-22