CN107368820B - Refined gesture recognition method, device and equipment - Google Patents

Refined gesture recognition method, device and equipment

Info

Publication number
CN107368820B
CN107368820B
Authority
CN
China
Prior art keywords
characteristic
relative position
displacement
features
time sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710656434.0A
Other languages
Chinese (zh)
Other versions
CN107368820A (en)
Inventor
姬晓鹏
程俊
潘亮亮
张丰
方琎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Shenzhen Tencent Computer Systems Co Ltd
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Shenzhen Tencent Computer Systems Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS, Shenzhen Tencent Computer Systems Co Ltd filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN201710656434.0A priority Critical patent/CN107368820B/en
Publication of CN107368820A publication Critical patent/CN107368820A/en
Application granted granted Critical
Publication of CN107368820B publication Critical patent/CN107368820B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V40/28 - Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Social Psychology (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Psychiatry (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A refined gesture recognition method comprises the following steps: extracting local features of hand joint positions, wherein the local features comprise relative position features and/or time sequence displacement features; performing clustering calculation according to the relative position features and/or the time sequence displacement features to obtain clustering features corresponding to them; and training a clustering feature model according to the correspondence between the clustering features and the gesture categories, and performing gesture recognition according to the trained clustering feature model. Dynamic gestures of different lengths can thus be converted into features of fixed length, which facilitates the classifier's similarity measurement of gesture categories and supports both the judgment of the refined motion process of the fingers and the detection of large-amplitude motion.

Description

Refined gesture recognition method, device and equipment
Technical Field
The invention belongs to the field of gesture recognition, and in particular relates to a refined gesture recognition method, device and equipment.
Background
In human-computer interaction systems such as smart televisions, wearable mobile terminals, personal computers and virtual reality devices, online gesture recognition is often used as an interactive input.
According to the data acquisition mode, current gesture recognition methods at home and abroad can be divided into wearable-device-based methods and vision-based methods:
The wearable-device-based gesture recognition method mainly uses sensors such as accelerometers and gyroscopes to acquire the motion trajectory of gestures in three-dimensional space. Its advantage is that multiple sensors can be arranged to acquire accurate relative position information and the spatial motion trajectory of the hand joints, so recognition accuracy is high. However, the method requires wearing complex equipment or devices, such as data gloves and position trackers; wearing them is cumbersome and impairs the naturalness of the human-computer interaction system.
The vision-based gesture recognition method solves the naturalness problem of human-computer interaction well: image data of the hand region is acquired by a visible-light camera, and the hand target region is then segmented, its features extracted and classified. However, existing vision-based methods can only handle a single type of static gesture (such as digit recognition from a single image) or dynamic gesture (such as sliding and page-turning with the palm). For gesture recognition over sequences of unequal length, most existing methods adopt a dynamic time warping algorithm to measure the similarity of gesture motion trajectories; this algorithm can resolve differences between hand motion trajectories under large-amplitude motion, but its computational complexity is high and it cannot achieve refined and diversified finger motion recognition.
Disclosure of Invention
In view of this, embodiments of the present invention provide a refined gesture recognition method, apparatus and device, so as to solve the problem that gesture recognition methods in the prior art cannot achieve refined and diversified finger motion recognition owing to their high computational complexity.
A first aspect of an embodiment of the present invention provides a refined gesture recognition method, comprising:
extracting local features of hand joint positions, wherein the local features comprise relative position features and/or time sequence displacement features;
performing clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
and performing clustering feature model training according to the corresponding relation between the clustering features and the gesture categories, and performing gesture recognition according to the trained clustering feature model.
With reference to the first aspect, in a first possible implementation manner of the first aspect, the step of extracting a relative position feature of the local features of the hand joint positions includes:
acquiring T frames of dynamic gesture images, and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
and calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes in the nodes of the hand.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, the step of calculating, according to the positions of the root node and the corresponding child nodes included in the nodes of the hand, a relative position feature of the child node with respect to the root node includes:
obtaining the relative position feature of the t′-th frame in the T-frame dynamic gesture image, $P^{t'} = \{ p_u^{t',h} \}$, wherein:
$p_u^{t',h} = j_{c(u)}^{t',h} - j_{r(u)}^{t',h}$,
where $j_{r(u)}^{t',h}$ indicates the position of the root node, $j_{c(u)}^{t',h}$ represents the position of the child node corresponding to the root node, $u \in \{ i \mid 1 \le i \le N \}$, h = 1 represents the left hand, h = 2 represents the right hand, $1 \le t' \le T$, and N is the number of nodes.
With reference to the first aspect, in a third possible implementation manner of the first aspect, the step of extracting a time-series displacement feature from local features of hand joint positions includes:
acquiring T frames of dynamic gesture images, and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images.
With reference to the third possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, the determining, according to the position of the displacement feature reference point in each two adjacent frames of images, a time-series displacement feature corresponding to the dynamic gesture image includes:
obtaining the time sequence displacement feature of the t″-th frame, $D^{t''} = \{ d_v^{t'',h} \}$, wherein:
$d_v^{t'',h} = q_v^{t'',h} - q_v^{t''-1,h}$,
where $q_v^{t'',h}$ is the displacement feature reference point of the t″-th frame, $q_v^{t''-1,h}$ is the displacement feature reference point of the (t″ − 1)-th frame, $1 < t'' \le T$, $v \in \{ i \mid 1 \le i \le N \}$, M is the number of displacement feature reference points and $1 \le v \le M$, h = 1 represents the left hand, h = 2 represents the right hand, and N is the number of nodes.
With reference to the first aspect, in a fifth possible implementation manner of the first aspect, the performing cluster calculation according to the relative position feature and/or the time series displacement feature to obtain a cluster feature corresponding to the relative position feature and/or the time series displacement feature includes:
representing the relative position features and/or timing displacement features as a set of uniform local features;
selecting a predetermined number of cluster sets from the set of uniform local features;
and performing transformation calculation on each cluster set to obtain cluster characteristics corresponding to the relative position characteristics and/or the time sequence displacement characteristics.
A second aspect of an embodiment of the present invention provides a refined gesture recognition apparatus, including:
a local feature extraction unit for extracting local features of hand joint positions, the local features including relative position features and/or time sequence displacement features;
the clustering feature calculating unit is used for carrying out clustering calculation according to the relative position feature and/or the time sequence displacement feature to obtain a clustering feature corresponding to the relative position feature and/or the time sequence displacement feature;
and the training recognition unit is used for carrying out clustering characteristic model training according to the corresponding relation between the clustering characteristics and the gesture categories and carrying out gesture recognition according to the trained clustering characteristic model.
With reference to the second aspect, in a first possible implementation manner of the second aspect, the local feature extraction unit includes:
the first image acquisition subunit is used for acquiring T frames of dynamic gesture images and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
the first calculating subunit is used for calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
and/or,
the second image acquisition subunit is used for acquiring the T frames of dynamic gesture images and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and the second calculation subunit is used for determining the time sequence displacement characteristic corresponding to the dynamic gesture image according to the position of the displacement characteristic reference point in every two adjacent frames of images.
A third aspect of an embodiment of the present invention provides a refined gesture recognition device, comprising: a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the refined gesture recognition method according to any implementation of the first aspect.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the refined gesture recognition method according to any implementation of the first aspect.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects: describing the hand joint positions by relative position features and/or time sequence displacement features facilitates the judgment of the refined motion process of the fingers and/or the detection of large-amplitude motion.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
FIG. 1 is a schematic diagram illustrating an implementation flow of a method for refining gesture recognition according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a dynamic gesture provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the position of the hand joint according to the embodiment of the present invention;
fig. 4 is a schematic diagram of a correspondence relationship between a root node and a child node according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a displacement signature reference node provided in accordance with an embodiment of the present invention;
FIG. 6 is a schematic diagram of an implementation process for performing clustering calculation to obtain clustering characteristics according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a refined gesture recognition apparatus according to an embodiment of the present invention;
FIG. 8 is a diagram illustrating an apparatus for refining gesture recognition according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
As shown in fig. 1, the method for refining gesture recognition according to the embodiment of the present invention includes:
in step S101, local features of hand joint positions are extracted, the local features including relative position features and/or time series displacement features.
Specifically, the extraction of local features is based on acquired multi-frame images containing the user's hand. FIG. 2 is a schematic diagram of a T-frame sequence of an "OK" gesture: from frame 1 to frame T, multiple images are included, and the gesture in the images may change position as the time frame advances. This change in position includes changes in the relative positions of the fingers and changes in the overall position of the palm, which may be described by the relative position feature and the time sequence displacement feature, respectively.
In order to obtain the relative position features and/or time sequence displacement features, the nodes of the hand need to be extracted in advance; both features are reflected by changes in the positions of these nodes. As an alternative embodiment of the present application, as shown in FIG. 3, the extracted nodes of the hand include the joint positions of the hand, the fingertip positions and the center of the palm: the thumb contributes 2 joint points and 1 fingertip node, each of the other four fingers contributes 3 joint points and 1 fingertip node, the wrist contributes one joint point, and the center of the palm contributes one node, for 22 feature reference points in total. To distinguish the left hand from the right hand, R1, R2, ..., R22 and L1, L2, ..., L22 denote the nodes of the right hand and the left hand, respectively.
Wherein the step of extracting relative position features in the local features of the hand joint positions comprises:
acquiring T frames of dynamic gesture images, and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
and calculating the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand.
For a dynamic gesture $G = \{ g_t \mid 1 \le t \le T \}$ comprising T frames (T is a natural number greater than or equal to 2), such as the right-hand "OK" gesture shown in FIG. 2, each frame of gesture data contains the positions of the nodes of the hand in the world coordinate system estimated by the depth camera:
$j_i^{t,h}$, with $i \in \{1, 2, 3, \dots, N\}$,
where h = 1 denotes the left hand and h = 2 the right hand, and i denotes the i-th node of the corresponding hand. As shown in FIG. 3, each hand includes 22 nodes, so N may take the value 22.
Given the node positions of the t′-th frame, $J^{t'} = \{ j_i^{t',h} \mid 1 \le i \le N \}$, where $1 \le t' \le T$: as shown in FIG. 4, select the N − 2 (N is 22 in FIG. 3) root nodes $j_{r(u)}^{t',h}$ and corresponding child nodes $j_{c(u)}^{t',h}$ among the nodes shown in FIG. 3, and compute for each pair the relative position vector of the child node with respect to the root node, $p_u^{t',h}$, i.e.
$p_u^{t',h} = j_{c(u)}^{t',h} - j_{r(u)}^{t',h}$.
The relative position feature of the t′-th frame, $P^{t'}$, can then be expressed as:
$P^{t'} = \{ p_u^{t',h} \mid 1 \le u \le N - 2,\ h \in \{1, 2\} \}$.
When selecting the root node and the child node, two nodes separated by one joint point are chosen as a root/child pair according to the positions of the joints where the nodes are located, as shown in FIG. 4.
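As a minimal illustrative sketch of this computation, assuming the joint positions are given as a NumPy array of shape (T, 2, N, 3), i.e. T frames, two hands, N nodes and xyz world coordinates, and assuming a hypothetical ROOT_CHILD_PAIRS table of (root, child) index pairs (the actual pairing follows FIG. 4 and is not reproduced here):

```python
import numpy as np

# Hypothetical (root, child) index pairs; the actual N-2 pairs are those
# of FIG. 4, chosen so that root and child are separated by one joint.
ROOT_CHILD_PAIRS = [(0, 1), (1, 2), (2, 3)]  # extend to all N-2 pairs

def relative_position_features(joints: np.ndarray) -> np.ndarray:
    """joints: (T, 2, N, 3) world coordinates for T frames and both hands.

    Returns (T, 2, len(ROOT_CHILD_PAIRS), 3): the relative position
    vectors p_u = j_c(u) - j_r(u) for every frame and hand."""
    roots = np.array([r for r, _ in ROOT_CHILD_PAIRS])
    children = np.array([c for _, c in ROOT_CHILD_PAIRS])
    return joints[:, :, children, :] - joints[:, :, roots, :]
```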
In this step, the step of extracting the time sequence displacement feature from the local features of the hand joint position includes:
acquiring T frames of dynamic gesture images, and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images.
For the selection of displacement feature reference points, nodes on the palm can be chosen, so that the overall movement of the hand is determined more accurately and errors in the overall motion direction and magnitude caused by selecting finger nodes are avoided. As shown in FIG. 5, M nodes (M is 7 in FIG. 5) can be selected as displacement feature reference points. Given the node positions of the t″-th frame, where $1 < t'' \le T$, N is the number of nodes and M is the number of selected reference points, the selected displacement feature reference points are $q_v^{t'',h}$, $1 \le v \le M$. Compute the displacement between the current frame $q_v^{t'',h}$ and the previous frame $q_v^{t''-1,h}$, i.e.
$d_v^{t'',h} = q_v^{t'',h} - q_v^{t''-1,h}$.
The time sequence displacement feature of the t″-th frame, $D^{t''}$, can then be expressed as:
$D^{t''} = \{ d_v^{t'',h} \mid 1 \le v \le M,\ h \in \{1, 2\} \}$.
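Under the same assumed (T, 2, N, 3) joint array, the time sequence displacement feature reduces to a frame difference over an assumed set of palm node indices; PALM_REFERENCE_NODES below is a placeholder, the actual M = 7 reference points being those of FIG. 5:

```python
import numpy as np

# Placeholder indices for the M = 7 palm reference nodes of FIG. 5;
# palm nodes are preferred over finger nodes for whole-hand motion.
PALM_REFERENCE_NODES = [0, 4, 8, 12, 16, 20, 21]

def displacement_features(joints: np.ndarray) -> np.ndarray:
    """joints: (T, 2, N, 3). Returns (T-1, 2, M, 3): the displacements
    d_v = q_v(t) - q_v(t-1) of the reference points between frames."""
    q = joints[:, :, PALM_REFERENCE_NODES, :]  # (T, 2, M, 3)
    return q[1:] - q[:-1]
```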
in step S102, performing cluster calculation according to the relative position feature and/or the time sequence displacement feature to obtain a cluster feature corresponding to the relative position feature and/or the time sequence displacement feature.
The process of clustering the extracted local features may be as shown in fig. 6, and includes:
in step S601, the relative position feature and/or the time-series displacement feature are represented as a set of unified local features.
For the relative position features $p_u^{t',h}$ and time sequence displacement features $d_v^{t'',h}$ extracted in step S101, a unified local feature expression can be used:
$F = \{ f_s^{t} \mid 1 \le s \le 4,\ 1 < t \le \tau \}$,
where it can be set that $1 \le s \le 4$ and $1 < t \le \tau$ (the four feature classes corresponding to the relative position and time sequence displacement features of the left and right hands). For sets containing only relative position features or only time sequence displacement features, the corresponding unified local feature expressions, such as $\{ f_s^t \mid s \in \{1, 2\} \}$ or $\{ f_s^t \mid s \in \{3, 4\} \}$, can be used.
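A sketch of assembling this unified set, under the assumption stated above that the four classes are the (feature type, hand) combinations; the function name and layout are illustrative only:

```python
import numpy as np

def unified_local_features(rel, disp):
    """rel: (T, 2, P, 3) relative position vectors; disp: (T-1, 2, M, 3)
    time sequence displacements. Returns the four class sample matrices
    f_s, assuming one class per (feature type, hand) combination."""
    classes = []
    for h in range(2):                            # hands give s = 1, 2
        classes.append(rel[:, h].reshape(-1, 3))
    for h in range(2):                            # hands give s = 3, 4
        classes.append(disp[:, h].reshape(-1, 3))
    return classes  # list of four (num_samples, 3) arrays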
In step S602, a predetermined number of cluster sets are selected from the set of uniform local features.
In the present invention, the s-th class feature set $F_s$ can be processed as follows: K initial cluster center points $\mu_{s,k}$, $1 \le k \le K$, are selected using the K-means++ algorithm, and the sum of squared errors is used as the clustering criterion to obtain K cluster sets $C_s = \{ C_{s,k} \mid 1 \le k \le K \}$, i.e.:
$C_s = \arg\min_C \sum_{k=1}^{K} \sum_{f \in C_{s,k}} \lVert f - \mu_{s,k} \rVert^2$,
where $\mu_{s,k}$ is the updated center point of each cluster and K is a natural number greater than 2.
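This step maps directly onto scikit-learn's KMeans, which uses k-means++ initialization and minimizes the same sum-of-squared-errors criterion; a sketch for one feature class:

```python
from sklearn.cluster import KMeans

def cluster_feature_class(features, k):
    """features: (num_samples, dim) matrix of one class F_s.

    Fits K clusters with k-means++ initialization; cluster_centers_
    are the updated centers mu_{s,k} and inertia_ is the sum of
    squared errors used as the clustering criterion."""
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    km.fit(features)
    return km
```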
In step S603, each cluster set is transformed and calculated to obtain a cluster feature corresponding to the relative position feature and/or the time sequence displacement feature.
For each cluster $C_{s,k}$, a Principal Component Analysis (PCA) operation is performed and all components are retained, yielding the transformed cluster $C'_{s,k}$. From the transformed cluster $C'_{s,k}$ and its center point $\mu'_{s,k}$, the clustering feature expression $\nu_{s,k}$ is obtained in compact form. Combining the clustering features of the K cluster center points forms the compact-form expression of the s-th class feature:
$V_s = \{ \nu_{s,k} \mid 1 \le k \le K \}$.
By repeating steps S602-S603 for each class, the local feature set $F$ can be expressed as a fixed-length feature representation independent of the sequence length τ:
$V = \{ V_s \mid 1 \le s \le 4 \}$.
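The exact compact expression for $\nu_{s,k}$ is given only in the original formula image; the sketch below substitutes one plausible stand-in (the PCA-transformed cluster center) purely to illustrate how K per-cluster descriptors combine into a fixed-length vector independent of τ:

```python
import numpy as np
from sklearn.decomposition import PCA

def class_descriptor(features, km):
    """features: (num_samples, dim); km: fitted KMeans from step S602.

    For each cluster C_{s,k}, PCA is fit with all components retained
    (assumes each cluster holds at least dim samples) and the transformed
    center stands in for nu_{s,k}; the K results are concatenated into
    the fixed-length class descriptor V_s."""
    labels = km.predict(features)
    parts = []
    for k in range(km.n_clusters):
        cluster = features[labels == k]
        pca = PCA().fit(cluster)                 # retain all components
        center = km.cluster_centers_[k].reshape(1, -1)
        parts.append(pca.transform(center).ravel())  # stand-in nu_{s,k}
    return np.concatenate(parts)                 # length K * dim
```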
in step S103, according to the corresponding relationship between the cluster feature and the gesture category, performing cluster feature model training, and performing gesture recognition according to the trained cluster feature model.
After the fixed-length feature representation independent of the sequence length τ has been generated, model training and testing can be performed with a support vector machine or another training model to obtain a trained model, and gestures can then be judged and recognized by the trained model.
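A sketch of this final step with a support vector machine, one of the training models the description names; the linear kernel is an assumption:

```python
import numpy as np
from sklearn.svm import SVC

def train_and_recognize(train_V, train_labels, test_V):
    """train_V: (num_gestures, fixed_len) fixed-length features V;
    train_labels: gesture category of each training sample.

    Trains an SVM classifier and predicts categories for test gestures."""
    clf = SVC(kernel="linear")
    clf.fit(np.asarray(train_V), np.asarray(train_labels))
    return clf.predict(np.asarray(test_V))
```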
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an execution order; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
Fig. 7 is a schematic structural diagram of a refined gesture recognition apparatus according to an embodiment of the present invention, as shown in fig. 7, the refined gesture recognition apparatus includes:
a local feature extraction unit 701, configured to extract local features of hand joint positions, where the local features include relative position features and/or time sequence displacement features;
a clustering feature calculating unit 702, configured to perform clustering calculation according to the relative position feature and/or the time sequence displacement feature to obtain a clustering feature corresponding to the relative position feature and/or the time sequence displacement feature;
and the training recognition unit 703 is configured to perform clustering feature model training according to the correspondence between the clustering features and the gesture categories, and perform gesture recognition according to the trained clustering feature model.
Preferably, the local feature extraction unit includes:
the first image acquisition subunit is used for acquiring T frames of dynamic gesture images and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
the first calculating subunit is used for calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
and/or,
the second image acquisition subunit is used for acquiring the T frames of dynamic gesture images and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and the second calculation subunit is used for determining the time sequence displacement characteristic corresponding to the dynamic gesture image according to the position of the displacement characteristic reference point in every two adjacent frames of images.
FIG. 8 is a diagram illustrating an apparatus for refining gesture recognition according to an embodiment of the present invention. As shown in fig. 8, the refined gesture recognition apparatus 8 of this embodiment includes: a processor 80, a memory 81, and a computer program 82, such as a refined gesture recognition program, stored in the memory 81 and executable on the processor 80. The processor 80, when executing the computer program 82, implements the steps in the various refinement gesture recognition method embodiments described above, such as steps 101 to 103 shown in fig. 1. Alternatively, the processor 80, when executing the computer program 82, implements the functions of each module/unit in each device embodiment described above, for example, the functions of the modules 701 to 703 shown in fig. 7.
Illustratively, the computer program 82 may be partitioned into one or more modules/units that are stored in the memory 81 and executed by the processor 80 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 82 in the refined gesture recognition apparatus 8. For example, the computer program 82 may be divided into a local feature extraction unit, a cluster feature calculation unit, and a training identification unit, and each unit has the following specific functions:
a local feature extraction unit for extracting local features of hand joint positions, the local features including relative position features and/or time sequence displacement features;
the clustering characteristic calculating unit is used for carrying out clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
and the training recognition unit is used for performing clustering characteristic model training according to the corresponding relation between the clustering characteristics and the gesture classes and performing gesture recognition according to the trained clustering characteristic model.
The refined gesture recognition device 8 may be a desktop computer, a notebook, a palmtop computer, a cloud server, or other computing device. The refined gesture recognition device may include, but is not limited to, a processor 80 and a memory 81. Those skilled in the art will appreciate that FIG. 8 is merely an example of the refined gesture recognition device 8 and does not constitute a limitation of it; the device may include more or fewer components than shown, combine some components, or have different components; for example, the refined gesture recognition device may also include input and output devices, network access devices, a bus, etc.
The Processor 80 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 81 may be an internal storage unit of the refined gesture recognition device 8, such as a hard disk or memory of the refined gesture recognition device 8. The memory 81 may also be an external storage device of the refined gesture recognition device 8, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a Flash Card provided on the refined gesture recognition device 8. Further, the memory 81 may also include both an internal storage unit and an external storage device of the refined gesture recognition device 8. The memory 81 is used for storing the computer program and other programs and data required by the refined gesture recognition device. The memory 81 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. For the specific working processes of the units and modules in the system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated module/unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the methods of the above embodiments may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when executed by a processor, the computer program implements the steps of the method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be suitably increased or decreased as required by legislation and patent practice in the relevant jurisdiction; for example, in some jurisdictions, computer-readable media may not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (5)

1. A refined gesture recognition method, comprising:
extracting local features of hand joint positions, wherein the local features comprise relative position features and/or time sequence displacement features;
performing clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
performing clustering feature model training according to the corresponding relation between the clustering features and the gesture categories, and performing gesture recognition according to the trained clustering feature model;
the step of extracting relative position features among the local features of the hand joint positions includes:
acquiring T frames of dynamic gesture images, and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
calculating the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
the step of calculating the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand part comprises the following steps:
obtaining the relative position feature of the t′-th frame in the T-frame dynamic gesture image, $P^{t'} = \{ p_u^{t',h} \}$, wherein:
$p_u^{t',h} = j_{c(u)}^{t',h} - j_{r(u)}^{t',h}$,
where $j_{r(u)}^{t',h}$ indicates the position of the root node, $j_{c(u)}^{t',h}$ represents the position of the child node corresponding to the root node, $u \in \{ i \mid 1 \le i \le N \}$, h = 1 represents the left hand, h = 2 represents the right hand, $1 \le t' \le T$, and N is the number of nodes;
the step of extracting the time sequence displacement characteristic in the local characteristic of the hand joint position comprises the following steps:
acquiring T frames of dynamic gesture images, and determining a displacement characteristic reference point in each frame of dynamic gesture image;
determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images;
the step of determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images comprises the following steps:
obtaining the time sequence displacement feature of the t″-th frame, $D^{t''} = \{ d_v^{t'',h} \}$, wherein:
$d_v^{t'',h} = q_v^{t'',h} - q_v^{t''-1,h}$,
where $q_v^{t'',h}$ is the displacement feature reference point of the t″-th frame, $q_v^{t''-1,h}$ is the displacement feature reference point of the (t″ − 1)-th frame, $1 < t'' \le T$, $v \in \{ i \mid 1 \le i \le N \}$, M is the number of displacement feature reference points and $1 \le v \le M$, h = 1 represents the left hand, h = 2 represents the right hand, and N is the number of nodes.
2. The refined gesture recognition method according to claim 1, wherein the step of performing clustering calculation according to the relative position feature and/or the time sequence displacement feature to obtain a clustering feature corresponding to the relative position feature and/or the time sequence displacement feature comprises:
representing the relative position features and/or time series displacement features as a set of uniform local features;
selecting a cluster set with a preset number from the set of the unified local features;
and performing transformation calculation on each cluster set to obtain cluster characteristics corresponding to the relative position characteristics and/or the time sequence displacement characteristics.
3. A refined gesture recognition device, the refined gesture recognition device comprising:
a local feature extraction unit for extracting local features of hand joint positions, the local features including relative position features and/or time sequence displacement features;
the clustering characteristic calculating unit is used for carrying out clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
the training recognition unit is used for carrying out clustering feature model training according to the corresponding relation between the clustering features and the gesture categories and carrying out gesture recognition according to the trained clustering feature model;
the local feature extraction unit includes:
the first image acquisition subunit is used for acquiring T frames of dynamic gesture images and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
the first calculating subunit is used for calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
and/or,
the second image acquisition subunit is used for acquiring the T frames of dynamic gesture images and determining a displacement characteristic reference point in each frame of dynamic gesture image;
the second calculation subunit is used for determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent frames of images;
the first calculating subunit is configured to obtain the relative position feature of the t′-th frame in the T-frame dynamic gesture image, $P^{t'} = \{ p_u^{t',h} \}$, wherein:
$p_u^{t',h} = j_{c(u)}^{t',h} - j_{r(u)}^{t',h}$,
where $j_{r(u)}^{t',h}$ indicates the position of the root node, $j_{c(u)}^{t',h}$ represents the position of the child node corresponding to the root node, $u \in \{ i \mid 1 \le i \le N \}$, h = 1 represents the left hand, h = 2 represents the right hand, $1 \le t' \le T$, and N is the number of nodes;
the second calculating subunit is configured to obtain the time sequence displacement feature of the t″-th frame, $D^{t''} = \{ d_v^{t'',h} \}$, wherein:
$d_v^{t'',h} = q_v^{t'',h} - q_v^{t''-1,h}$,
where $q_v^{t'',h}$ is the displacement feature reference point of the t″-th frame, $q_v^{t''-1,h}$ is the displacement feature reference point of the (t″ − 1)-th frame, $1 < t'' \le T$, $v \in \{ i \mid 1 \le i \le N \}$, M is the number of displacement feature reference points and $1 \le v \le M$, h = 1 represents the left hand, h = 2 represents the right hand, and N is the number of nodes.
4. A refined gesture recognition apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the refined gesture recognition method according to any one of claims 1 to 2.
5. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for refining gesture recognition according to any one of claims 1 to 2.
CN201710656434.0A 2017-08-03 2017-08-03 Refined gesture recognition method, device and equipment Active CN107368820B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710656434.0A CN107368820B (en) 2017-08-03 2017-08-03 Refined gesture recognition method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710656434.0A CN107368820B (en) 2017-08-03 2017-08-03 Refined gesture recognition method, device and equipment

Publications (2)

Publication Number Publication Date
CN107368820A CN107368820A (en) 2017-11-21
CN107368820B true CN107368820B (en) 2023-04-18

Family

ID=60309287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710656434.0A Active CN107368820B (en) 2017-08-03 2017-08-03 Refined gesture recognition method, device and equipment

Country Status (1)

Country Link
CN (1) CN107368820B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109992093B (en) * 2017-12-29 2024-05-03 博世汽车部件(苏州)有限公司 Gesture comparison method and gesture comparison system
CN108346168B (en) * 2018-02-12 2019-08-13 腾讯科技(深圳)有限公司 A kind of images of gestures generation method, device and storage medium
CN108921101A (en) 2018-07-04 2018-11-30 百度在线网络技术(北京)有限公司 Processing method, equipment and readable storage medium storing program for executing based on gesture identification control instruction
CN109117766A (en) * 2018-07-30 2019-01-01 上海斐讯数据通信技术有限公司 A kind of dynamic gesture identification method and system
CN109117771B (en) * 2018-08-01 2022-05-27 四川电科维云信息技术有限公司 System and method for detecting violence events in image based on anchor nodes
CN110163130B (en) * 2019-05-08 2021-05-28 清华大学 Feature pre-alignment random forest classification system and method for gesture recognition
CN111222486B (en) * 2020-01-15 2022-11-04 腾讯科技(深圳)有限公司 Training method, device and equipment for hand gesture recognition model and storage medium
TWI777153B (en) * 2020-04-21 2022-09-11 和碩聯合科技股份有限公司 Image recognition method and device thereof and ai model training method and device thereof

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100071965A1 (en) * 2008-09-23 2010-03-25 Panasonic Corporation System and method for grab and drop gesture recognition
JP2010176380A (en) * 2009-01-29 2010-08-12 Sony Corp Information processing device and method, program, and recording medium
TW201123031A (en) * 2009-12-24 2011-07-01 Univ Nat Taiwan Science Tech Robot and method for recognizing human faces and gestures thereof
CN101976330B (en) * 2010-09-26 2013-08-07 中国科学院深圳先进技术研究院 Gesture recognition method and system
US9916538B2 (en) * 2012-09-15 2018-03-13 Z Advanced Computing, Inc. Method and system for feature detection
CN103246891B (en) * 2013-05-28 2016-07-06 重庆邮电大学 A kind of Chinese Sign Language recognition methods based on Kinect
CN104598915B (en) * 2014-01-24 2017-08-11 深圳奥比中光科技有限公司 A kind of gesture identification method and device
EP3699736B1 (en) * 2014-06-14 2023-03-29 Magic Leap, Inc. Methods and systems for creating virtual and augmented reality
CN106886751A (en) * 2017-01-09 2017-06-23 深圳数字电视国家工程实验室股份有限公司 A kind of gesture identification method and system

Also Published As

Publication number Publication date
CN107368820A (en) 2017-11-21

Similar Documents

Publication Publication Date Title
CN107368820B (en) Refined gesture recognition method, device and equipment
CN111815754B (en) Three-dimensional information determining method, three-dimensional information determining device and terminal equipment
EP2903256B1 (en) Image processing device, image processing method and program
WO2020244075A1 (en) Sign language recognition method and apparatus, and computer device and storage medium
CN110348412B (en) Key point positioning method and device, electronic equipment and storage medium
CN114186632B (en) Method, device, equipment and storage medium for training key point detection model
CN103907139A (en) Information processing device, information processing method, and program
Ruan et al. Dynamic gesture recognition based on improved DTW algorithm
CN111667005B (en) Human interactive system adopting RGBD visual sensing
WO2017116879A1 (en) Recognition of hand poses by classification using discrete values
CN110490444A (en) Mark method for allocating tasks, device, system and storage medium
CN107272899B (en) VR (virtual reality) interaction method and device based on dynamic gestures and electronic equipment
JP2016014954A (en) Method for detecting finger shape, program thereof, storage medium of program thereof, and system for detecting finger shape
GB2462903A (en) Single Stroke Character Recognition
CN117523659A (en) Skeleton-based multi-feature multi-stream real-time action recognition method, device and medium
Mahmud et al. On-air English Capital Alphabet (ECA) recognition using depth information
CN108392207B (en) Gesture tag-based action recognition method
Oszust et al. Isolated sign language recognition with depth cameras
KR20140035271A (en) Method and system for gesture recognition
CN114674328B (en) Map generation method, map generation device, electronic device, storage medium, and vehicle
CN111931794B (en) Sketch-based image matching method
Hisham et al. Arabic sign language recognition using Microsoft Kinect and leap motion controller
CN113553884B (en) Gesture recognition method, terminal device and computer-readable storage medium
CN114202799A (en) Method and device for determining change speed of controlled object, electronic equipment and storage medium
CN109213322B (en) Method and system for gesture recognition in virtual reality

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant