CN107368820B - Refined gesture recognition method, device and equipment - Google Patents
- Publication number
- CN107368820B CN107368820B CN201710656434.0A CN201710656434A CN107368820B CN 107368820 B CN107368820 B CN 107368820B CN 201710656434 A CN201710656434 A CN 201710656434A CN 107368820 B CN107368820 B CN 107368820B
- Authority
- CN
- China
- Prior art keywords
- characteristic
- relative position
- displacement
- features
- time sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Abstract
A refined gesture recognition method comprises the following steps: extracting local features of hand joint positions, the local features comprising relative position features and/or time sequence displacement features; performing clustering calculation on the relative position features and/or the time sequence displacement features to obtain corresponding clustering features; and training a clustering feature model according to the correspondence between the clustering features and gesture categories, then performing gesture recognition with the trained model. The method converts dynamic gestures of different lengths into fixed-length features, which facilitates the classifier's similarity measurement between gesture types, and supports both the judgment of refined finger motion processes and the detection of large-amplitude motion.
Description
Technical Field
The invention belongs to the field of gesture recognition, and particularly relates to a refined gesture recognition method, apparatus and device.
Background
In human-computer interaction systems such as smart televisions, wearable mobile terminals, personal computers and virtual reality devices, online gesture recognition is often used as an interaction input.
According to the data acquisition mode, current gesture recognition methods at home and abroad can be divided into wearable-device-based methods and vision-based methods. Wherein:
the gesture recognition methods based on wearable devices mainly use sensors such as accelerometers and gyroscopes to acquire the motion trajectory of a gesture in three-dimensional space. Their advantage is that multiple sensors can be arranged to obtain accurate relative position information and spatial motion trajectories of the hand joints, so the recognition accuracy is high. However, these methods require the user to wear complex equipment or devices, such as data gloves and position trackers; the wearing is cumbersome and impairs the naturalness of the human-computer interaction system.
The vision-based gesture recognition methods solve the naturalness problem of human-machine interaction well: image data of the hand region is acquired by a visible-light camera, after which the hand target region is segmented, and features are extracted and classified. However, existing vision-based methods can only handle a single type of static gesture (such as digit recognition from a single image) or dynamic gesture (such as sliding or page-turning with the palm). For recognizing gesture sequences of unequal length, most existing methods use a dynamic time warping algorithm to measure the similarity of gesture motion trajectories. This algorithm can handle the differences in hand trajectories under large-amplitude motion, but its computational complexity is high, and it cannot achieve refined and diversified finger motion recognition.
Disclosure of Invention
In view of this, embodiments of the present invention provide a refined gesture recognition method, apparatus and device, so as to solve the problem that gesture recognition methods in the prior art, due to their high computational complexity, cannot achieve refined and diversified finger motion recognition.
A first aspect of an embodiment of the present invention provides a method for fine gesture recognition, where the method for fine gesture recognition includes:
extracting local features of hand joint positions, wherein the local features comprise relative position features and/or time sequence displacement features;
performing clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
and performing clustering feature model training according to the corresponding relation between the clustering features and the gesture categories, and performing gesture recognition according to the trained clustering feature model.
With reference to the first aspect, in a first possible implementation manner of the first aspect, the step of extracting a relative position feature of the local features of the hand joint positions includes:
acquiring T frames of dynamic gesture images, and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
and calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes in the nodes of the hand.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, the step of calculating, according to the positions of the root node and the corresponding child nodes included in the nodes of the hand, a relative position feature of the child node with respect to the root node includes:
obtaining the relative position feature of the t'-th frame in the T-frame dynamic gesture image as r^h_{t',u} = p^h_{t',u} − p^h_{t',root(u)}, wherein:
p^h_{t',root(u)} denotes the position of the root node, p^h_{t',u} denotes the position of the child node corresponding to that root node, u ∈ {i | 1 ≤ i ≤ N}, h = 1 denotes the left hand, h = 2 denotes the right hand, 1 ≤ t' ≤ T, and N is the number of nodes.
With reference to the first aspect, in a third possible implementation manner of the first aspect, the step of extracting a time-series displacement feature from local features of hand joint positions includes:
acquiring T frames of dynamic gesture images, and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images.
With reference to the third possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, the determining, according to the position of the displacement feature reference point in each two adjacent frames of images, a time-series displacement feature corresponding to the dynamic gesture image includes:
obtaining the time sequence displacement feature of the t''-th frame as d^h_{t'',v} = q^h_{t'',v} − q^h_{t''−1,v}, wherein q^h_{t'',v} is the displacement feature reference point of the t''-th frame, q^h_{t''−1,v} is the displacement feature reference point of the (t''−1)-th frame, 1 < t'' ≤ T, v ∈ {i | 1 ≤ i ≤ M}, M is the number of displacement feature reference points, h = 1 denotes the left hand, and h = 2 denotes the right hand.
With reference to the first aspect, in a fifth possible implementation manner of the first aspect, the performing cluster calculation according to the relative position feature and/or the time series displacement feature to obtain a cluster feature corresponding to the relative position feature and/or the time series displacement feature includes:
representing the relative position features and/or timing displacement features as a set of uniform local features;
selecting a predetermined number of cluster sets from the set of uniform local features;
and performing transformation calculation on each cluster set to obtain cluster characteristics corresponding to the relative position characteristics and/or the time sequence displacement characteristics.
A second aspect of an embodiment of the present invention provides a refined gesture recognition apparatus, including:
a local feature extraction unit for extracting local features of hand joint positions, the local features including relative position features and/or time sequence displacement features;
the clustering feature calculating unit is used for carrying out clustering calculation according to the relative position feature and/or the time sequence displacement feature to obtain a clustering feature corresponding to the relative position feature and/or the time sequence displacement feature;
and the training recognition unit is used for carrying out clustering characteristic model training according to the corresponding relation between the clustering characteristics and the gesture categories and carrying out gesture recognition according to the trained clustering characteristic model.
With reference to the second aspect, in a first possible implementation manner of the second aspect, the local feature extraction unit includes:
the first image acquisition subunit is used for acquiring T frames of dynamic gesture images and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
the first calculating subunit is used for calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
and/or the presence of a gas in the atmosphere,
the second image acquisition subunit is used for acquiring the T frames of dynamic gesture images and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and the second calculation subunit is used for determining the time sequence displacement characteristic corresponding to the dynamic gesture image according to the position of the displacement characteristic reference point in every two adjacent frames of images.
A third aspect of an embodiment of the present invention provides a refined gesture recognition apparatus, including: memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method of refining gesture recognition according to any one of the first aspect when executing the computer program.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the method for refining gesture recognition according to any one of the first aspect.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects: describing the hand joint positions by relative position features and/or time sequence displacement features facilitates both the judgment of the refined motion process of the fingers and the detection of large-amplitude motion.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
FIG. 1 is a schematic diagram illustrating an implementation flow of a method for refining gesture recognition according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a dynamic gesture provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the position of the hand joint according to the embodiment of the present invention;
fig. 4 is a schematic diagram of a correspondence relationship between a root node and a child node according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a displacement signature reference node provided in accordance with an embodiment of the present invention;
FIG. 6 is a schematic diagram of an implementation process for performing clustering calculation to obtain clustering characteristics according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a refined gesture recognition apparatus according to an embodiment of the present invention;
FIG. 8 is a diagram illustrating an apparatus for refining gesture recognition according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
As shown in fig. 1, the method for refining gesture recognition according to the embodiment of the present invention includes:
in step S101, local features of hand joint positions are extracted, the local features including relative position features and/or time series displacement features.
Specifically, the extraction of the local features is based on acquired multi-frame images containing the user's hand. Fig. 2 is a schematic diagram of T frames of an "OK" gesture; from frame 1 to frame T, the gesture in the images may change position as the time frame changes. This change in position comprises changes in the relative positions of the fingers and changes in the overall position of the palm, which can be described by the relative position features and the time sequence displacement features, respectively.
In order to obtain the relative position features and/or the time sequence displacement features, the nodes of the hand need to be extracted in advance; both kinds of features are reflected by the changes in the positions of these nodes. As an alternative embodiment of the present application, as shown in fig. 3, the extracted nodes of the hand include the joint positions of the hand, the fingertip positions and the centre position of the palm: the thumb contributes 2 joint points and 1 fingertip node, each of the other four fingers contributes 3 joint points and 1 fingertip node, the wrist contributes one joint point, and the palm centre contributes one node, for a total of 22 feature reference points. To distinguish the left hand from the right, R1–R22 and L1–L22 denote the nodes of the right and left hands, respectively.
Wherein the step of extracting relative position features in the local features of the hand joint positions comprises:
acquiring T frames of dynamic gesture images, and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
and calculating the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand.
For a dynamic gesture G = {g_t | 1 ≤ t ≤ T} comprising T frame images (T being a natural number of 2 or more), such as the right-hand "OK" gesture shown in fig. 2, each frame of gesture data contains the positions of the nodes of the hand in the world coordinate system, estimated by the depth camera, denoted p^h_{t,i}.
Here h = 1 denotes the left hand and h = 2 denotes the right hand; i ∈ {1, 2, 3, ..., N} denotes the i-th node of the corresponding hand. As shown in fig. 3, each hand includes 22 nodes, so N may take the value 22.
Given the node positions of the t'-th frame, where 1 ≤ t' ≤ T, the N − 2 groups of root nodes p^h_{t',root(u)} and corresponding child nodes p^h_{t',u} shown in fig. 4 (N being 22 in fig. 3) are selected, and the relative position vector of each child node with respect to its root node is calculated as r^h_{t',u} = p^h_{t',u} − p^h_{t',root(u)}.
As shown in fig. 4, the root node and the child node are selected such that, according to the positions of the joints at which the nodes are located, two nodes separated by one joint point are taken as a root node and its child node.
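The relative position computation described above can be sketched as follows (a minimal illustration, assuming the node positions are stored as a NumPy array of shape (T, N, 3) and that the root/child pairing is supplied as index pairs — both layout choices are assumptions, not part of the patent):

```python
import numpy as np

def relative_position_features(joints, pairs):
    """Compute relative position vectors of child nodes w.r.t. root nodes.

    joints : (T, N, 3) array of 3-D node positions per frame (assumed layout).
    pairs  : list of (root_idx, child_idx) tuples, e.g. nodes separated by
             one joint point along each finger chain.
    Returns a (T, len(pairs), 3) array of child-minus-root vectors.
    """
    roots = joints[:, [r for r, _ in pairs], :]
    children = joints[:, [c for _, c in pairs], :]
    return children - roots

# Toy example: T = 2 frames, N = 3 nodes, one (root, child) pair
joints = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)
rel = relative_position_features(joints, [(0, 2)])
print(rel.shape)  # (2, 1, 3)
```

In practice the pairs would enumerate the N − 2 root/child groups of fig. 4, one array per hand.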
In this step, the step of extracting the time sequence displacement feature from the local features of the hand joint position includes:
acquiring T frames of dynamic gesture images, and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images.
For the displacement feature reference points, nodes in the palm area may be selected, so that the overall movement of the hand can be determined more accurately, avoiding the errors in overall movement direction and magnitude that would arise from selecting nodes on the fingers. As shown in fig. 5, for a given t''-th frame, M nodes (M being 7 in fig. 5) may be selected as displacement feature reference points q^h_{t'',v}, where 1 < t'' ≤ T, N is the number of nodes, M is the number of selected displacement feature reference points, and 1 ≤ v ≤ M. The time sequence displacement feature is then calculated as the difference between the reference point of the current frame and that of the previous frame: d^h_{t'',v} = q^h_{t'',v} − q^h_{t''−1,v}.
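This frame-to-frame differencing of the palm reference points can be sketched as (assuming, as above, the reference points are stored as a (T, M, 3) NumPy array — an assumed layout):

```python
import numpy as np

def timing_displacement_features(ref_points):
    """Frame-to-frame displacement of the palm-area reference points.

    ref_points : (T, M, 3) array of the M displacement feature reference
                 points per frame (assumed layout).
    Returns a (T-1, M, 3) array: position in frame t minus frame t-1.
    """
    return ref_points[1:] - ref_points[:-1]

# Toy example: T = 3 frames, M = 1 reference point
ref = np.array([[[0., 0., 0.]],
                [[1., 0., 0.]],
                [[1., 2., 0.]]])
disp = timing_displacement_features(ref)
print(disp.shape)  # (2, 1, 3)
```

The result has T − 1 entries, matching the 1 < t'' ≤ T range of the displacement feature.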
in step S102, performing cluster calculation according to the relative position feature and/or the time sequence displacement feature to obtain a cluster feature corresponding to the relative position feature and/or the time sequence displacement feature.
The process of clustering the extracted local features may be as shown in fig. 6, and includes:
in step S601, the relative position feature and/or the time-series displacement feature are represented as a set of unified local features.
The relative position features r^h_{t',u} and the time sequence displacement features d^h_{t'',v} extracted in step S101 can be expressed in a unified local feature form x^s_t. Here it can be set that 1 ≤ s ≤ 4 (one feature class per hand and per feature type) and 1 < t ≤ τ. When only relative position features, or only time sequence displacement features, are included, the corresponding subset of unified local feature expressions is used.
In step S602, a predetermined number of cluster sets are selected from the set of uniform local features.
In the present invention, for the s-th class feature set X_s, K initial cluster centre points μ_{s,k} (1 ≤ k ≤ K) can be selected using the K-means++ algorithm, and, with the sum of squared errors as the clustering metric, K cluster sets C_s = {C_{s,k} | 1 ≤ k ≤ K} are obtained,
where μ_{s,k} is the updated centre point of each cluster and K is a natural number greater than 2.
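A sketch of this clustering step using scikit-learn (whose KMeans supports k-means++ initialization and minimizes the sum of squared errors, exposed as `inertia_`); the synthetic feature set and K = 2 are illustrative stand-ins, not values from the patent:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in for one class of local features X_s: tau samples of 3-D vectors,
# drawn from two well-separated blobs
X_s = np.vstack([rng.normal(0.0, 0.1, (50, 3)),
                 rng.normal(5.0, 0.1, (50, 3))])

K = 2
km = KMeans(n_clusters=K, init="k-means++", n_init=10, random_state=0).fit(X_s)

clusters = [X_s[km.labels_ == k] for k in range(K)]  # the sets C_{s,k}
centers = km.cluster_centers_                        # updated centres mu_{s,k}
print(len(clusters), centers.shape)
```

Each class s (left/right hand, relative-position/displacement) would be clustered independently this way.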
In step S603, each cluster set is transformed and calculated to obtain a cluster feature corresponding to the relative position feature and/or the time sequence displacement feature.
For each cluster C_{s,k}, a Principal Component Analysis (PCA) operation is performed, retaining all components, to obtain the transformed cluster C'_{s,k}.
From the transformed cluster C'_{s,k} and its cluster centre point μ'_{s,k}, a compact clustering feature expression ν_{s,k} is calculated.
The clustering features of the K cluster centre points are combined to form a compact expression of the s-th class feature:
V_s = {ν_{s,k} | 1 ≤ k ≤ K}.
By repeating steps S602–S603 for each class, the set of local features can be expressed as a fixed-length feature, independent of the time sequence length τ:
V = {V_s | 1 ≤ s ≤ 4}.
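The per-cluster PCA step can be sketched as follows. Note the patent's exact expression for ν_{s,k} is not recoverable from this text, so the compact summary below (projected centre plus per-component spread) is only one plausible choice, labelled as an assumption; the key property it illustrates is that the output length is fixed regardless of how many frames fell into the cluster:

```python
import numpy as np

def compact_cluster_feature(cluster, center):
    """PCA-transform one cluster (keeping all components) and summarize it,
    together with its centre, as a fixed-size vector.

    The summary used here (projected centre concatenated with per-component
    spread) is an assumed stand-in for the patent's nu_{s,k}.
    """
    mean = cluster.mean(axis=0)
    centered = cluster - mean
    # PCA via SVD; rows of `components` are the principal axes (all kept)
    _, _, components = np.linalg.svd(centered, full_matrices=False)
    transformed = centered @ components.T          # the cluster C'_{s,k}
    centre_proj = (center - mean) @ components.T   # the centre mu'_{s,k}
    spread = transformed.std(axis=0)
    return np.concatenate([centre_proj, spread])

cluster = np.random.default_rng(1).normal(size=(20, 3))
v = compact_cluster_feature(cluster, cluster.mean(axis=0))
print(v.shape)  # (6,)
```

Concatenating the K such vectors per class, over all four classes, yields the fixed-length representation V.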
in step S103, according to the corresponding relationship between the cluster feature and the gesture category, performing cluster feature model training, and performing gesture recognition according to the trained cluster feature model.
After generating the fixed-length feature representation independent of the time sequence length τ, model training and testing can be performed with a support vector machine or another training model to obtain a trained model, with which gestures can then be judged and recognized.
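The training and recognition stage can be sketched with scikit-learn's SVM classifier; the random feature vectors stand in for the fixed-length representations V of two hypothetical gesture categories (dimensions and values are illustrative, not from the patent):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
# Stand-in fixed-length cluster-feature vectors for two gesture categories
X = np.vstack([rng.normal(0.0, 0.3, (30, 8)),
               rng.normal(2.0, 0.3, (30, 8))])
y = np.array([0] * 30 + [1] * 30)  # gesture category labels

# Train on the (feature, category) correspondence, then recognize new samples
clf = SVC(kernel="rbf").fit(X, y)
new_gestures = rng.normal(2.0, 0.3, (5, 8))  # unseen samples near category 1
pred = clf.predict(new_gestures)
print(pred.tolist())
```

Because the features have fixed length, dynamic gestures of different durations can all be fed to the same classifier.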
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Fig. 7 is a schematic structural diagram of a refined gesture recognition apparatus according to an embodiment of the present invention, as shown in fig. 7, the refined gesture recognition apparatus includes:
a local feature extraction unit 701, configured to extract local features of hand joint positions, where the local features include relative position features and/or time sequence displacement features;
a clustering feature calculating unit 702, configured to perform clustering calculation according to the relative position feature and/or the time sequence displacement feature to obtain a clustering feature corresponding to the relative position feature and/or the time sequence displacement feature;
and the training recognition unit 703 is configured to perform clustering feature model training according to the correspondence between the clustering features and the gesture categories, and perform gesture recognition according to the trained clustering feature model.
Preferably, the local feature extraction unit includes:
the first image acquisition subunit is used for acquiring T frames of dynamic gesture images and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
the first calculating subunit is used for calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
and/or the presence of a gas in the atmosphere,
the second image acquisition subunit is used for acquiring the T frames of dynamic gesture images and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and the second calculation subunit is used for determining the time sequence displacement characteristic corresponding to the dynamic gesture image according to the position of the displacement characteristic reference point in every two adjacent frames of images.
FIG. 8 is a diagram illustrating an apparatus for refining gesture recognition according to an embodiment of the present invention. As shown in fig. 8, the refined gesture recognition apparatus 8 of this embodiment includes: a processor 80, a memory 81, and a computer program 82, such as a refined gesture recognition program, stored in the memory 81 and executable on the processor 80. The processor 80, when executing the computer program 82, implements the steps in the various refinement gesture recognition method embodiments described above, such as steps 101 to 103 shown in fig. 1. Alternatively, the processor 80, when executing the computer program 82, implements the functions of each module/unit in each device embodiment described above, for example, the functions of the modules 701 to 703 shown in fig. 7.
Illustratively, the computer program 82 may be partitioned into one or more modules/units that are stored in the memory 81 and executed by the processor 80 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 82 in the refined gesture recognition apparatus 8. For example, the computer program 82 may be divided into a local feature extraction unit, a cluster feature calculation unit, and a training identification unit, and each unit has the following specific functions:
a local feature extraction unit for extracting local features of hand joint positions, the local features including relative position features and/or time sequence displacement features;
the clustering characteristic calculating unit is used for carrying out clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
and the training recognition unit is used for performing clustering characteristic model training according to the corresponding relation between the clustering characteristics and the gesture classes and performing gesture recognition according to the trained clustering characteristic model.
The refined gesture recognition device 8 may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. The refined gesture recognition device may include, but is not limited to, a processor 80 and a memory 81. Those skilled in the art will appreciate that fig. 8 is merely an example of the refined gesture recognition device 8 and does not constitute a limitation of it: the device may include more or fewer components than shown, combine certain components, or use different components; for example, the refined gesture recognition device may also include input/output devices, network access devices, a bus, etc.
The Processor 80 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 81 may be an internal storage unit of the refinement gesture recognition device 8, such as a hard disk or a memory of the refinement gesture recognition device 8. The memory 81 may also be an external storage device of the gesture recognition apparatus 8, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the gesture recognition apparatus 8. Further, the memory 81 may also include both an internal storage unit and an external storage device of the refined gesture recognition apparatus 8. The memory 81 is used for storing the computer program and other programs and data required by the refined gesture recognition apparatus. The memory 81 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. For the specific working processes of the units and modules in the system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated module/unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in the jurisdiction; for example, in some jurisdictions, computer-readable media may not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.
Claims (5)
1. A refined gesture recognition method, comprising:
extracting local features of hand joint positions, wherein the local features comprise relative position features and/or time sequence displacement features;
performing clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
performing clustering feature model training according to the corresponding relation between the clustering features and the gesture categories, and performing gesture recognition according to the trained clustering feature model;
the step of extracting relative position features among the local features of the hand joint positions includes:
acquiring T frames of dynamic gesture images, and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
calculating the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
the step of calculating the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand comprises:
obtaining the relative position characteristic of the t'-th frame in the T-frame dynamic gesture image as r^h_{t',u} = p^h_{t',u} − p^h_{t',0}, wherein p^h_{t',0} indicates the position of the root node, p^h_{t',u} represents the position of the child node corresponding to the root node, u ∈ {i | 1 ≤ i ≤ N}, h = 1 represents the left hand, h = 2 represents the right hand, 1 ≤ t' ≤ T, and N is the number of nodes;
the step of extracting the time sequence displacement characteristic in the local characteristic of the hand joint position comprises the following steps:
acquiring T frames of dynamic gesture images, and determining a displacement characteristic reference point in each frame of dynamic gesture image;
determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images;
the step of determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images comprises the following steps:
obtaining the time sequence displacement characteristic of the t″-th frame as d^h_{t″,v} = q^h_{t″,v} − q^h_{t″−1,v}, wherein q^h_{t″,v} is the displacement characteristic reference point of the t″-th frame, q^h_{t″−1,v} is the displacement characteristic reference point of the (t″−1)-th frame, 1 < t″ ≤ T, v ∈ {i | 1 ≤ i ≤ M}, M is the number of displacement characteristic reference points, h = 1 represents the left hand, and h = 2 represents the right hand.
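The two feature-extraction steps of claim 1 are defined only mathematically; a minimal NumPy sketch of them is given below. The array shapes, the choice of joint index 0 as the root node, and the function names are assumptions for illustration, not part of the claim.

```python
import numpy as np

def relative_position_features(joints):
    """Relative position of each child node w.r.t. the root node, per frame.

    joints: array of shape (T, N, 3) -- T frames, N hand joints
    (index 0 assumed to be the root node, e.g. the wrist), 3-D coordinates.
    Returns an array of shape (T, N - 1, 3): child position minus root position.
    """
    root = joints[:, :1, :]          # (T, 1, 3) root node position per frame
    return joints[:, 1:, :] - root   # broadcast subtraction over child nodes

def temporal_displacement_features(refs):
    """Displacement of each reference point between adjacent frames.

    refs: array of shape (T, M, 3) -- M displacement reference points.
    Returns an array of shape (T - 1, M, 3): frame t minus frame t-1.
    """
    return refs[1:] - refs[:-1]
```

One such pair of arrays would be computed per hand (h = 1 for the left hand, h = 2 for the right) and then pooled for the clustering step.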
2. The method for identifying refined gestures according to claim 1, wherein the step of performing cluster calculation based on the relative position feature and/or the time sequence displacement feature to obtain a cluster feature corresponding to the relative position feature and/or the time sequence displacement feature comprises:
representing the relative position features and/or time series displacement features as a set of uniform local features;
selecting a cluster set with a preset number from the set of the unified local features;
and performing transformation calculation on each cluster set to obtain cluster characteristics corresponding to the relative position characteristics and/or the time sequence displacement characteristics.
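Claim 2 leaves both the clustering algorithm and the "transformation calculation" open. As one assumed instantiation, the sketch below pools all local features into a uniform set of rows, partitions them into a preset number of cluster sets with plain k-means, and takes each set's centroid as its cluster feature; the centroid choice and all names here are illustrative, not taken from the patent.

```python
import numpy as np

def cluster_features(local_features, k=8, iters=20, seed=0):
    """Pool local features (relative position and time sequence displacement
    vectors, flattened to rows), split them into k cluster sets via k-means,
    and return one representative feature (the centroid) per cluster set.
    """
    X = np.asarray(local_features, dtype=float).reshape(len(local_features), -1)
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct feature vectors.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each feature vector to its nearest centroid.
        labels = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        # Recompute each non-empty cluster set's centroid.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids  # shape (k, d): one cluster feature per cluster set
```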
3. A refined gesture recognition device, the refined gesture recognition device comprising:
a local feature extraction unit for extracting local features of hand joint positions, the local features including relative position features and/or time sequence displacement features;
the clustering characteristic calculating unit is used for carrying out clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
the training recognition unit is used for carrying out clustering feature model training according to the corresponding relation between the clustering features and the gesture categories and carrying out gesture recognition according to the trained clustering feature model;
the local feature extraction unit includes:
the first image acquisition subunit is used for acquiring T frames of dynamic gesture images and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
the first calculating subunit is used for calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
and/or the presence of a gas in the gas,
the second image acquisition subunit is used for acquiring the T frames of dynamic gesture images and determining a displacement characteristic reference point in each frame of dynamic gesture image;
the second calculation subunit is used for determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent frames of images;
the first calculating subunit is used for obtaining the relative position characteristic of the t'-th frame in the T-frame dynamic gesture image as r^h_{t',u} = p^h_{t',u} − p^h_{t',0}, wherein p^h_{t',0} indicates the position of the root node, p^h_{t',u} represents the position of the child node corresponding to the root node, u ∈ {i | 1 ≤ i ≤ N}, h = 1 represents the left hand, h = 2 represents the right hand, 1 ≤ t' ≤ T, and N is the number of nodes;
the second calculating subunit is used for obtaining the time sequence displacement characteristic of the t″-th frame as d^h_{t″,v} = q^h_{t″,v} − q^h_{t″−1,v}, wherein q^h_{t″,v} is the displacement characteristic reference point of the t″-th frame, q^h_{t″−1,v} is the displacement characteristic reference point of the (t″−1)-th frame, 1 < t″ ≤ T, v ∈ {i | 1 ≤ i ≤ M}, M is the number of displacement characteristic reference points, h = 1 represents the left hand, and h = 2 represents the right hand.
4. A refined gesture recognition apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the refined gesture recognition method according to any one of claims 1 to 2.
5. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for refining gesture recognition according to any one of claims 1 to 2.
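The final step of claim 1 — training a cluster-feature model against gesture categories and recognising gestures with it — names no specific model. Purely as an illustration, a nearest-mean classifier over flattened cluster features could fill that role; the class and method names below are assumptions, not the patent's.

```python
import numpy as np

class ClusterFeatureModel:
    """Assumed stand-in for the trained cluster-feature model: stores the
    mean cluster-feature vector per gesture category and recognises a new
    sample by its nearest category mean."""

    def fit(self, feature_sets, labels):
        # feature_sets: one cluster-feature array per training sample.
        X = np.asarray([np.ravel(f) for f in feature_sets], dtype=float)
        y = np.asarray(labels)
        self.classes_ = np.unique(y)
        self.means_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, feature_set):
        # Recognise the gesture whose mean cluster feature is closest.
        x = np.ravel(np.asarray(feature_set, dtype=float))
        d = ((self.means_ - x) ** 2).sum(axis=1)
        return self.classes_[int(np.argmin(d))]
```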
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710656434.0A CN107368820B (en) | 2017-08-03 | 2017-08-03 | Refined gesture recognition method, device and equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107368820A CN107368820A (en) | 2017-11-21 |
CN107368820B true CN107368820B (en) | 2023-04-18 |
Family
ID=60309287
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710656434.0A Active CN107368820B (en) | 2017-08-03 | 2017-08-03 | Refined gesture recognition method, device and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107368820B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109992093B (en) * | 2017-12-29 | 2024-05-03 | 博世汽车部件(苏州)有限公司 | Gesture comparison method and gesture comparison system |
CN108346168B (en) * | 2018-02-12 | 2019-08-13 | 腾讯科技(深圳)有限公司 | A kind of images of gestures generation method, device and storage medium |
CN108921101A (en) | 2018-07-04 | 2018-11-30 | 百度在线网络技术(北京)有限公司 | Processing method, equipment and readable storage medium storing program for executing based on gesture identification control instruction |
CN109117766A (en) * | 2018-07-30 | 2019-01-01 | 上海斐讯数据通信技术有限公司 | A kind of dynamic gesture identification method and system |
CN109117771B (en) * | 2018-08-01 | 2022-05-27 | 四川电科维云信息技术有限公司 | System and method for detecting violence events in image based on anchor nodes |
CN110163130B (en) * | 2019-05-08 | 2021-05-28 | 清华大学 | Feature pre-alignment random forest classification system and method for gesture recognition |
CN111222486B (en) * | 2020-01-15 | 2022-11-04 | 腾讯科技(深圳)有限公司 | Training method, device and equipment for hand gesture recognition model and storage medium |
TWI777153B (en) * | 2020-04-21 | 2022-09-11 | 和碩聯合科技股份有限公司 | Image recognition method and device thereof and ai model training method and device thereof |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100071965A1 (en) * | 2008-09-23 | 2010-03-25 | Panasonic Corporation | System and method for grab and drop gesture recognition |
JP2010176380A (en) * | 2009-01-29 | 2010-08-12 | Sony Corp | Information processing device and method, program, and recording medium |
TW201123031A (en) * | 2009-12-24 | 2011-07-01 | Univ Nat Taiwan Science Tech | Robot and method for recognizing human faces and gestures thereof |
CN101976330B (en) * | 2010-09-26 | 2013-08-07 | 中国科学院深圳先进技术研究院 | Gesture recognition method and system |
US9916538B2 (en) * | 2012-09-15 | 2018-03-13 | Z Advanced Computing, Inc. | Method and system for feature detection |
CN103246891B (en) * | 2013-05-28 | 2016-07-06 | 重庆邮电大学 | A kind of Chinese Sign Language recognition methods based on Kinect |
CN104598915B (en) * | 2014-01-24 | 2017-08-11 | 深圳奥比中光科技有限公司 | A kind of gesture identification method and device |
EP3699736B1 (en) * | 2014-06-14 | 2023-03-29 | Magic Leap, Inc. | Methods and systems for creating virtual and augmented reality |
CN106886751A (en) * | 2017-01-09 | 2017-06-23 | 深圳数字电视国家工程实验室股份有限公司 | A kind of gesture identification method and system |
- 2017-08-03: CN application CN201710656434.0A filed; granted as patent CN107368820B, status Active
Also Published As
Publication number | Publication date |
---|---|
CN107368820A (en) | 2017-11-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107368820B (en) | Refined gesture recognition method, device and equipment | |
CN111815754B (en) | Three-dimensional information determining method, three-dimensional information determining device and terminal equipment | |
EP2903256B1 (en) | Image processing device, image processing method and program | |
WO2020244075A1 (en) | Sign language recognition method and apparatus, and computer device and storage medium | |
CN110348412B (en) | Key point positioning method and device, electronic equipment and storage medium | |
CN114186632B (en) | Method, device, equipment and storage medium for training key point detection model | |
CN103907139A (en) | Information processing device, information processing method, and program | |
Ruan et al. | Dynamic gesture recognition based on improved DTW algorithm | |
CN111667005B (en) | Human interactive system adopting RGBD visual sensing | |
WO2017116879A1 (en) | Recognition of hand poses by classification using discrete values | |
CN110490444A (en) | Mark method for allocating tasks, device, system and storage medium | |
CN107272899B (en) | VR (virtual reality) interaction method and device based on dynamic gestures and electronic equipment | |
JP2016014954A (en) | Method for detecting finger shape, program thereof, storage medium of program thereof, and system for detecting finger shape | |
GB2462903A (en) | Single Stroke Character Recognition | |
CN117523659A (en) | Skeleton-based multi-feature multi-stream real-time action recognition method, device and medium | |
Mahmud et al. | On-air English Capital Alphabet (ECA) recognition using depth information | |
CN108392207B (en) | Gesture tag-based action recognition method | |
Oszust et al. | Isolated sign language recognition with depth cameras | |
KR20140035271A (en) | Method and system for gesture recognition | |
CN114674328B (en) | Map generation method, map generation device, electronic device, storage medium, and vehicle | |
CN111931794B (en) | Sketch-based image matching method | |
Hisham et al. | Arabic sign language recognition using Microsoft Kinect and leap motion controller | |
CN113553884B (en) | Gesture recognition method, terminal device and computer-readable storage medium | |
CN114202799A (en) | Method and device for determining change speed of controlled object, electronic equipment and storage medium | |
CN109213322B (en) | Method and system for gesture recognition in virtual reality |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||