CN107368820B - Refined gesture recognition method, device and equipment - Google Patents

Refined gesture recognition method, device and equipment

Info

Publication number
CN107368820B
CN107368820B
Authority
CN
China
Prior art keywords
characteristic
relative position
displacement
features
time sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710656434.0A
Other languages
Chinese (zh)
Other versions
CN107368820A (en)
Inventor
姬晓鹏
程俊
潘亮亮
张丰
方琎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Shenzhen Tencent Computer Systems Co Ltd
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Shenzhen Tencent Computer Systems Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS, Shenzhen Tencent Computer Systems Co Ltd filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN201710656434.0A priority Critical patent/CN107368820B/en
Publication of CN107368820A publication Critical patent/CN107368820A/en
Application granted granted Critical
Publication of CN107368820B publication Critical patent/CN107368820B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V40/28 - Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Social Psychology (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Psychiatry (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A refined gesture recognition method comprises the following steps: extracting local features of hand joint positions, wherein the local features comprise relative position features and/or time sequence displacement features; performing clustering calculation according to the relative position features and/or the time sequence displacement features to obtain clustering features corresponding to them; and training a clustering feature model according to the correspondence between the clustering features and the gesture categories, and performing gesture recognition according to the trained clustering feature model. Dynamic gestures of different lengths can thus be converted into features of fixed length, which facilitates the classifier's similarity measurement of gesture categories and supports both the judgment of the refined motion process of the fingers and the detection of large-amplitude motion.

Description

Refined gesture recognition method, device and equipment
Technical Field
The invention belongs to the field of gesture recognition, and in particular relates to a refined gesture recognition method, device and equipment.
Background
In human-computer interaction systems such as smart televisions, wearable mobile terminals, personal computers and virtual reality devices, online gesture recognition is often used as an interactive input.
According to the data acquisition mode, current gesture recognition methods at home and abroad can be divided into wearable-device-based methods and vision-based methods:
The wearable-device-based gesture recognition method mainly uses sensors such as accelerometers and gyroscopes to acquire the motion trajectory of gestures in three-dimensional space. Its advantage is that multiple sensors can be arranged to acquire accurate relative position information and the spatial motion trajectory of the hand joints, so recognition accuracy is high. However, the method requires wearing complex equipment or devices, such as data gloves and position trackers; wearing them is cumbersome and impairs the naturalness of the human-computer interaction system.
The vision-based gesture recognition method solves the naturalness problem of human-computer interaction well: image data of the hand region is acquired by a visible-light camera, and the hand target region is then segmented, its features extracted and classified. However, existing vision-based methods can only handle a single type of static gesture (such as digit recognition from a single image) or dynamic gesture (such as sliding and page-turning with the palm). For gesture recognition over sequences of unequal length, most existing methods adopt a dynamic time warping algorithm to measure the similarity of gesture motion trajectories; this algorithm can resolve differences between hand motion trajectories under large-amplitude motion, but its computational complexity is high and it cannot achieve refined and diversified finger motion recognition.
Disclosure of Invention
In view of this, embodiments of the present invention provide a refined gesture recognition method, apparatus and device, so as to solve the problem that gesture recognition methods in the prior art cannot achieve refined and diversified finger motion recognition owing to their high computational complexity.
A first aspect of an embodiment of the present invention provides a refined gesture recognition method, comprising:
extracting local features of hand joint positions, wherein the local features comprise relative position features and/or time sequence displacement features;
performing clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
and performing clustering feature model training according to the corresponding relation between the clustering features and the gesture categories, and performing gesture recognition according to the trained clustering feature model.
With reference to the first aspect, in a first possible implementation manner of the first aspect, the step of extracting a relative position feature of the local features of the hand joint positions includes:
acquiring T frames of dynamic gesture images, and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
and calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes in the nodes of the hand.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, the step of calculating, according to the positions of the root node and the corresponding child nodes included in the nodes of the hand, a relative position feature of the child node with respect to the root node includes:
obtaining the relative position feature of the t′-th frame in the T-frame dynamic gesture image, $P^{t'} = \{ p_u^{t',h} \}$, wherein:
$p_u^{t',h} = j_{c(u)}^{t',h} - j_{r(u)}^{t',h}$,
where $j_{r(u)}^{t',h}$ indicates the position of the root node, $j_{c(u)}^{t',h}$ represents the position of the child node corresponding to the root node, $u \in \{ i \mid 1 \le i \le N \}$, h = 1 represents the left hand, h = 2 represents the right hand, $1 \le t' \le T$, and N is the number of nodes.
With reference to the first aspect, in a third possible implementation manner of the first aspect, the step of extracting a time-series displacement feature from local features of hand joint positions includes:
acquiring T frames of dynamic gesture images, and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images.
With reference to the third possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, the determining, according to the position of the displacement feature reference point in each two adjacent frames of images, a time-series displacement feature corresponding to the dynamic gesture image includes:
obtaining the time sequence displacement feature of the t″-th frame, $D^{t''} = \{ d_v^{t'',h} \}$, wherein:
$d_v^{t'',h} = q_v^{t'',h} - q_v^{t''-1,h}$,
where $q_v^{t'',h}$ is the displacement feature reference point of the t″-th frame, $q_v^{t''-1,h}$ is the displacement feature reference point of the (t″ − 1)-th frame, $1 < t'' \le T$, $v \in \{ i \mid 1 \le i \le N \}$, M is the number of displacement feature reference points and $1 \le v \le M$, h = 1 represents the left hand, h = 2 represents the right hand, and N is the number of nodes.
With reference to the first aspect, in a fifth possible implementation manner of the first aspect, the performing cluster calculation according to the relative position feature and/or the time series displacement feature to obtain a cluster feature corresponding to the relative position feature and/or the time series displacement feature includes:
representing the relative position features and/or timing displacement features as a set of uniform local features;
selecting a predetermined number of cluster sets from the set of uniform local features;
and performing transformation calculation on each cluster set to obtain cluster characteristics corresponding to the relative position characteristics and/or the time sequence displacement characteristics.
A second aspect of an embodiment of the present invention provides a refined gesture recognition apparatus, including:
a local feature extraction unit for extracting local features of hand joint positions, the local features including relative position features and/or time sequence displacement features;
the clustering feature calculating unit is used for carrying out clustering calculation according to the relative position feature and/or the time sequence displacement feature to obtain a clustering feature corresponding to the relative position feature and/or the time sequence displacement feature;
and the training recognition unit is used for carrying out clustering characteristic model training according to the corresponding relation between the clustering characteristics and the gesture categories and carrying out gesture recognition according to the trained clustering characteristic model.
With reference to the second aspect, in a first possible implementation manner of the second aspect, the local feature extraction unit includes:
the first image acquisition subunit is used for acquiring T frames of dynamic gesture images and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
the first calculating subunit is used for calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
and/or,
the second image acquisition subunit is used for acquiring the T frames of dynamic gesture images and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and the second calculation subunit is used for determining the time sequence displacement characteristic corresponding to the dynamic gesture image according to the position of the displacement characteristic reference point in every two adjacent frames of images.
A third aspect of an embodiment of the present invention provides a refined gesture recognition device, comprising: a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the refined gesture recognition method according to any implementation of the first aspect.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the refined gesture recognition method according to any implementation of the first aspect.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects: describing the hand joint positions by relative position features and/or time sequence displacement features facilitates the judgment of the refined motion process of the fingers and/or the detection of large-amplitude motion.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
FIG. 1 is a schematic diagram illustrating an implementation flow of a method for refining gesture recognition according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a dynamic gesture provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the position of the hand joint according to the embodiment of the present invention;
fig. 4 is a schematic diagram of a correspondence relationship between a root node and a child node according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a displacement signature reference node provided in accordance with an embodiment of the present invention;
FIG. 6 is a schematic diagram of an implementation process for performing clustering calculation to obtain clustering characteristics according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a refined gesture recognition apparatus according to an embodiment of the present invention;
FIG. 8 is a diagram illustrating an apparatus for refining gesture recognition according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
As shown in fig. 1, the method for refining gesture recognition according to the embodiment of the present invention includes:
in step S101, local features of hand joint positions are extracted, the local features including relative position features and/or time series displacement features.
Specifically, the extraction of local features is based on acquired multi-frame images containing the user's hand. FIG. 2 is a schematic diagram of a T-frame sequence of an "OK" gesture: from frame 1 to frame T, multiple images are included, and the gesture in the images may change position as the time frame advances. This change in position includes changes in the relative positions of the fingers and changes in the overall position of the palm, which may be described by the relative position feature and the time sequence displacement feature, respectively.
In order to obtain the relative position features and/or time sequence displacement features, the nodes of the hand need to be extracted in advance; both features are reflected by changes in the positions of these nodes. As an alternative embodiment of the present application, as shown in FIG. 3, the extracted nodes of the hand include the joint positions of the hand, the fingertip positions and the center of the palm: the thumb contributes 2 joint points and 1 fingertip node, each of the other four fingers contributes 3 joint points and 1 fingertip node, the wrist contributes one joint point, and the center of the palm contributes one node, for 22 feature reference points in total. To distinguish the left hand from the right hand, R1, R2, ..., R22 and L1, L2, ..., L22 denote the nodes of the right hand and the left hand, respectively.
Wherein the step of extracting relative position features in the local features of the hand joint positions comprises:
acquiring T frames of dynamic gesture images, and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
and calculating the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand.
For a dynamic gesture $G = \{ g_t \mid 1 \le t \le T \}$ comprising T frames (T is a natural number greater than or equal to 2), such as the right-hand "OK" gesture shown in FIG. 2, each frame of gesture data contains the positions of the nodes of the hand in the world coordinate system estimated by the depth camera:
$j_i^{t,h}$, with $i \in \{1, 2, 3, \dots, N\}$,
where h = 1 denotes the left hand and h = 2 the right hand, and i denotes the i-th node of the corresponding hand. As shown in FIG. 3, each hand includes 22 nodes, so N may take the value 22.
Given the node positions of the t′-th frame, $J^{t'} = \{ j_i^{t',h} \mid 1 \le i \le N \}$, where $1 \le t' \le T$: as shown in FIG. 4, select the N − 2 (N is 22 in FIG. 3) root nodes $j_{r(u)}^{t',h}$ and corresponding child nodes $j_{c(u)}^{t',h}$ among the nodes shown in FIG. 3, and compute for each pair the relative position vector of the child node with respect to the root node, $p_u^{t',h}$, i.e.
$p_u^{t',h} = j_{c(u)}^{t',h} - j_{r(u)}^{t',h}$.
The relative position feature of the t′-th frame, $P^{t'}$, can then be expressed as:
$P^{t'} = \{ p_u^{t',h} \mid 1 \le u \le N - 2,\ h \in \{1, 2\} \}$.
When selecting the root node and the child node, two nodes separated by one joint point are chosen as a root/child pair according to the positions of the joints where the nodes are located, as shown in FIG. 4.
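As a minimal illustrative sketch of this computation, assuming the joint positions are given as a NumPy array of shape (T, 2, N, 3), i.e. T frames, two hands, N nodes and xyz world coordinates, and assuming a hypothetical ROOT_CHILD_PAIRS table of (root, child) index pairs (the actual pairing follows FIG. 4 and is not reproduced here):

```python
import numpy as np

# Hypothetical (root, child) index pairs; the actual N-2 pairs are those
# of FIG. 4, chosen so that root and child are separated by one joint.
ROOT_CHILD_PAIRS = [(0, 1), (1, 2), (2, 3)]  # extend to all N-2 pairs

def relative_position_features(joints: np.ndarray) -> np.ndarray:
    """joints: (T, 2, N, 3) world coordinates for T frames and both hands.

    Returns (T, 2, len(ROOT_CHILD_PAIRS), 3): the relative position
    vectors p_u = j_c(u) - j_r(u) for every frame and hand."""
    roots = np.array([r for r, _ in ROOT_CHILD_PAIRS])
    children = np.array([c for _, c in ROOT_CHILD_PAIRS])
    return joints[:, :, children, :] - joints[:, :, roots, :]
```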
In this step, the step of extracting the time sequence displacement feature from the local features of the hand joint position includes:
acquiring T frames of dynamic gesture images, and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images.
For the selection of displacement feature reference points, nodes on the palm can be chosen, so that the overall movement of the hand is determined more accurately and errors in the overall motion direction and magnitude caused by selecting finger nodes are avoided. As shown in FIG. 5, M nodes (M is 7 in FIG. 5) can be selected as displacement feature reference points. Given the node positions of the t″-th frame, where $1 < t'' \le T$, N is the number of nodes and M is the number of selected reference points, the selected displacement feature reference points are $q_v^{t'',h}$, $1 \le v \le M$. Compute the displacement between the current frame $q_v^{t'',h}$ and the previous frame $q_v^{t''-1,h}$, i.e.
$d_v^{t'',h} = q_v^{t'',h} - q_v^{t''-1,h}$.
The time sequence displacement feature of the t″-th frame, $D^{t''}$, can then be expressed as:
$D^{t''} = \{ d_v^{t'',h} \mid 1 \le v \le M,\ h \in \{1, 2\} \}$.
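Under the same assumed (T, 2, N, 3) joint array, the time sequence displacement feature reduces to a frame difference over an assumed set of palm node indices; PALM_REFERENCE_NODES below is a placeholder, the actual M = 7 reference points being those of FIG. 5:

```python
import numpy as np

# Placeholder indices for the M = 7 palm reference nodes of FIG. 5;
# palm nodes are preferred over finger nodes for whole-hand motion.
PALM_REFERENCE_NODES = [0, 4, 8, 12, 16, 20, 21]

def displacement_features(joints: np.ndarray) -> np.ndarray:
    """joints: (T, 2, N, 3). Returns (T-1, 2, M, 3): the displacements
    d_v = q_v(t) - q_v(t-1) of the reference points between frames."""
    q = joints[:, :, PALM_REFERENCE_NODES, :]  # (T, 2, M, 3)
    return q[1:] - q[:-1]
```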
in step S102, performing cluster calculation according to the relative position feature and/or the time sequence displacement feature to obtain a cluster feature corresponding to the relative position feature and/or the time sequence displacement feature.
The process of clustering the extracted local features may be as shown in fig. 6, and includes:
in step S601, the relative position feature and/or the time-series displacement feature are represented as a set of unified local features.
For the relative position features $p_u^{t',h}$ and time sequence displacement features $d_v^{t'',h}$ extracted in step S101, a unified local feature expression can be used:
$F = \{ f_s^{t} \mid 1 \le s \le 4,\ 1 < t \le \tau \}$,
where it can be set that $1 \le s \le 4$ and $1 < t \le \tau$ (the four feature classes corresponding to the relative position and time sequence displacement features of the left and right hands). For sets containing only relative position features or only time sequence displacement features, the corresponding unified local feature expressions, such as $\{ f_s^t \mid s \in \{1, 2\} \}$ or $\{ f_s^t \mid s \in \{3, 4\} \}$, can be used.
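A sketch of assembling this unified set, under the assumption stated above that the four classes are the (feature type, hand) combinations; the function name and layout are illustrative only:

```python
import numpy as np

def unified_local_features(rel, disp):
    """rel: (T, 2, P, 3) relative position vectors; disp: (T-1, 2, M, 3)
    time sequence displacements. Returns the four class sample matrices
    f_s, assuming one class per (feature type, hand) combination."""
    classes = []
    for h in range(2):                            # hands give s = 1, 2
        classes.append(rel[:, h].reshape(-1, 3))
    for h in range(2):                            # hands give s = 3, 4
        classes.append(disp[:, h].reshape(-1, 3))
    return classes  # list of four (num_samples, 3) arrays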
In step S602, a predetermined number of cluster sets are selected from the set of uniform local features.
In the present invention, the s-th class feature set $F_s$ can be processed as follows: K initial cluster center points $\mu_{s,k}$, $1 \le k \le K$, are selected using the K-means++ algorithm, and the sum of squared errors is used as the clustering criterion to obtain K cluster sets $C_s = \{ C_{s,k} \mid 1 \le k \le K \}$, i.e.:
$C_s = \arg\min_C \sum_{k=1}^{K} \sum_{f \in C_{s,k}} \lVert f - \mu_{s,k} \rVert^2$,
where $\mu_{s,k}$ is the updated center point of each cluster and K is a natural number greater than 2.
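This step maps directly onto scikit-learn's KMeans, which uses k-means++ initialization and minimizes the same sum-of-squared-errors criterion; a sketch for one feature class:

```python
from sklearn.cluster import KMeans

def cluster_feature_class(features, k):
    """features: (num_samples, dim) matrix of one class F_s.

    Fits K clusters with k-means++ initialization; cluster_centers_
    are the updated centers mu_{s,k} and inertia_ is the sum of
    squared errors used as the clustering criterion."""
    km = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    km.fit(features)
    return km
```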
In step S603, each cluster set is transformed and calculated to obtain a cluster feature corresponding to the relative position feature and/or the time sequence displacement feature.
For each cluster $C_{s,k}$, a Principal Component Analysis (PCA) operation is performed and all components are retained, yielding the transformed cluster $C'_{s,k}$. From the transformed cluster $C'_{s,k}$ and its center point $\mu'_{s,k}$, the clustering feature expression $\nu_{s,k}$ is obtained in compact form. Combining the clustering features of the K cluster center points forms the compact-form expression of the s-th class feature:
$V_s = \{ \nu_{s,k} \mid 1 \le k \le K \}$.
By repeating steps S602-S603 for each class, the local feature set $F$ can be expressed as a fixed-length feature representation independent of the sequence length τ:
$V = \{ V_s \mid 1 \le s \le 4 \}$.
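The exact compact expression for $\nu_{s,k}$ is given only in the original formula image; the sketch below substitutes one plausible stand-in (the PCA-transformed cluster center) purely to illustrate how K per-cluster descriptors combine into a fixed-length vector independent of τ:

```python
import numpy as np
from sklearn.decomposition import PCA

def class_descriptor(features, km):
    """features: (num_samples, dim); km: fitted KMeans from step S602.

    For each cluster C_{s,k}, PCA is fit with all components retained
    (assumes each cluster holds at least dim samples) and the transformed
    center stands in for nu_{s,k}; the K results are concatenated into
    the fixed-length class descriptor V_s."""
    labels = km.predict(features)
    parts = []
    for k in range(km.n_clusters):
        cluster = features[labels == k]
        pca = PCA().fit(cluster)                 # retain all components
        center = km.cluster_centers_[k].reshape(1, -1)
        parts.append(pca.transform(center).ravel())  # stand-in nu_{s,k}
    return np.concatenate(parts)                 # length K * dim
```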
in step S103, according to the corresponding relationship between the cluster feature and the gesture category, performing cluster feature model training, and performing gesture recognition according to the trained cluster feature model.
After the fixed-length feature representation independent of the sequence length τ has been generated, model training and testing can be performed with a support vector machine or another training model to obtain a trained model, and gestures can then be judged and recognized by the trained model.
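A sketch of this final step with a support vector machine, one of the training models the description names; the linear kernel is an assumption:

```python
import numpy as np
from sklearn.svm import SVC

def train_and_recognize(train_V, train_labels, test_V):
    """train_V: (num_gestures, fixed_len) fixed-length features V;
    train_labels: gesture category of each training sample.

    Trains an SVM classifier and predicts categories for test gestures."""
    clf = SVC(kernel="linear")
    clf.fit(np.asarray(train_V), np.asarray(train_labels))
    return clf.predict(np.asarray(test_V))
```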
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an execution order; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
Fig. 7 is a schematic structural diagram of a refined gesture recognition apparatus according to an embodiment of the present invention, as shown in fig. 7, the refined gesture recognition apparatus includes:
a local feature extraction unit 701, configured to extract local features of hand joint positions, where the local features include relative position features and/or time sequence displacement features;
a clustering feature calculating unit 702, configured to perform clustering calculation according to the relative position feature and/or the time sequence displacement feature to obtain a clustering feature corresponding to the relative position feature and/or the time sequence displacement feature;
and the training recognition unit 703 is configured to perform clustering feature model training according to the correspondence between the clustering features and the gesture categories, and perform gesture recognition according to the trained clustering feature model.
Preferably, the local feature extraction unit includes:
the first image acquisition subunit is used for acquiring T frames of dynamic gesture images and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
the first calculating subunit is used for calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
and/or,
the second image acquisition subunit is used for acquiring the T frames of dynamic gesture images and determining a displacement characteristic reference point in each frame of dynamic gesture image;
and the second calculation subunit is used for determining the time sequence displacement characteristic corresponding to the dynamic gesture image according to the position of the displacement characteristic reference point in every two adjacent frames of images.
FIG. 8 is a diagram illustrating an apparatus for refining gesture recognition according to an embodiment of the present invention. As shown in fig. 8, the refined gesture recognition apparatus 8 of this embodiment includes: a processor 80, a memory 81, and a computer program 82, such as a refined gesture recognition program, stored in the memory 81 and executable on the processor 80. The processor 80, when executing the computer program 82, implements the steps in the various refinement gesture recognition method embodiments described above, such as steps 101 to 103 shown in fig. 1. Alternatively, the processor 80, when executing the computer program 82, implements the functions of each module/unit in each device embodiment described above, for example, the functions of the modules 701 to 703 shown in fig. 7.
Illustratively, the computer program 82 may be partitioned into one or more modules/units that are stored in the memory 81 and executed by the processor 80 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 82 in the refined gesture recognition apparatus 8. For example, the computer program 82 may be divided into a local feature extraction unit, a cluster feature calculation unit, and a training identification unit, and each unit has the following specific functions:
a local feature extraction unit for extracting local features of hand joint positions, the local features including relative position features and/or time sequence displacement features;
the clustering characteristic calculating unit is used for carrying out clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
and the training recognition unit is used for performing clustering characteristic model training according to the corresponding relation between the clustering characteristics and the gesture classes and performing gesture recognition according to the trained clustering characteristic model.
The refined gesture recognition device 8 may be a desktop computer, a notebook, a palmtop computer, a cloud server, or other computing device. The refined gesture recognition device may include, but is not limited to, a processor 80 and a memory 81. Those skilled in the art will appreciate that FIG. 8 is merely an example of the refined gesture recognition device 8 and does not constitute a limitation of it; the device may include more or fewer components than shown, combine some components, or have different components; for example, the refined gesture recognition device may also include input and output devices, network access devices, a bus, etc.
The Processor 80 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 81 may be an internal storage unit of the refined gesture recognition device 8, such as a hard disk or memory of the refined gesture recognition device 8. The memory 81 may also be an external storage device of the refined gesture recognition device 8, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a Flash Card provided on the refined gesture recognition device 8. Further, the memory 81 may also include both an internal storage unit and an external storage device of the refined gesture recognition device 8. The memory 81 is used for storing the computer program and other programs and data required by the refined gesture recognition device. The memory 81 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. For the specific working processes of the units and modules in the system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated module/unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the methods of the above embodiments may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when executed by a processor, the computer program implements the steps of the method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be suitably increased or decreased as required by legislation and patent practice in the relevant jurisdiction; for example, in some jurisdictions, computer-readable media may not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (5)

1. A refined gesture recognition method, comprising:
extracting local features of hand joint positions, wherein the local features comprise relative position features and/or time sequence displacement features;
performing clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
performing clustering feature model training according to the corresponding relation between the clustering features and the gesture categories, and performing gesture recognition according to the trained clustering feature model;
the step of extracting relative position features among the local features of the hand joint positions includes:
acquiring T frames of dynamic gesture images, and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
calculating the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
the step of calculating the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand part comprises the following steps:
obtaining the relative position feature of the t′-th frame in the T-frame dynamic gesture image, $P^{t'} = \{ p_u^{t',h} \}$, wherein:
$p_u^{t',h} = j_{c(u)}^{t',h} - j_{r(u)}^{t',h}$,
where $j_{r(u)}^{t',h}$ indicates the position of the root node, $j_{c(u)}^{t',h}$ represents the position of the child node corresponding to the root node, $u \in \{ i \mid 1 \le i \le N \}$, h = 1 represents the left hand, h = 2 represents the right hand, $1 \le t' \le T$, and N is the number of nodes;
the step of extracting the time sequence displacement characteristic in the local characteristic of the hand joint position comprises the following steps:
acquiring T frames of dynamic gesture images, and determining a displacement characteristic reference point in each frame of dynamic gesture image;
determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images;
the step of determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent images comprises the following steps:
obtaining the time sequence displacement feature of the t″-th frame, $D^{t''} = \{ d_v^{t'',h} \}$, wherein:
$d_v^{t'',h} = q_v^{t'',h} - q_v^{t''-1,h}$,
where $q_v^{t'',h}$ is the displacement feature reference point of the t″-th frame, $q_v^{t''-1,h}$ is the displacement feature reference point of the (t″ − 1)-th frame, $1 < t'' \le T$, $v \in \{ i \mid 1 \le i \le N \}$, M is the number of displacement feature reference points and $1 \le v \le M$, h = 1 represents the left hand, h = 2 represents the right hand, and N is the number of nodes.
2. The refined gesture recognition method according to claim 1, wherein the step of performing clustering calculation according to the relative position feature and/or the time sequence displacement feature to obtain a clustering feature corresponding to the relative position feature and/or the time sequence displacement feature comprises:
representing the relative position features and/or time series displacement features as a set of uniform local features;
selecting a cluster set with a preset number from the set of the unified local features;
and performing transformation calculation on each cluster set to obtain cluster characteristics corresponding to the relative position characteristics and/or the time sequence displacement characteristics.
3. A refined gesture recognition device, the refined gesture recognition device comprising:
a local feature extraction unit for extracting local features of hand joint positions, the local features including relative position features and/or time sequence displacement features;
the clustering characteristic calculating unit is used for carrying out clustering calculation according to the relative position characteristic and/or the time sequence displacement characteristic to obtain a clustering characteristic corresponding to the relative position characteristic and/or the time sequence displacement characteristic;
the training recognition unit is used for carrying out clustering feature model training according to the corresponding relation between the clustering features and the gesture categories and carrying out gesture recognition according to the trained clustering feature model;
the local feature extraction unit includes:
the first image acquisition subunit is used for acquiring T frames of dynamic gesture images and determining the positions of the nodes of the hand in each frame of dynamic gesture image;
the first calculating subunit is used for calculating to obtain the relative position characteristics of the child nodes relative to the root node according to the positions of the root node and the corresponding child nodes included in the nodes of the hand;
and/or,
the second image acquisition subunit is used for acquiring the T frames of dynamic gesture images and determining a displacement characteristic reference point in each frame of dynamic gesture image;
the second calculation subunit is used for determining the time sequence displacement characteristics corresponding to the dynamic gesture images according to the positions of the displacement characteristic reference points in every two adjacent frames of images;
the first calculating subunit is configured to obtain the relative position feature of the t′-th frame in the T-frame dynamic gesture image, $P^{t'} = \{ p_u^{t',h} \}$, wherein:
$p_u^{t',h} = j_{c(u)}^{t',h} - j_{r(u)}^{t',h}$,
where $j_{r(u)}^{t',h}$ indicates the position of the root node, $j_{c(u)}^{t',h}$ represents the position of the child node corresponding to the root node, $u \in \{ i \mid 1 \le i \le N \}$, h = 1 represents the left hand, h = 2 represents the right hand, $1 \le t' \le T$, and N is the number of nodes;
the second calculating subunit is configured to obtain the time sequence displacement feature of the t″-th frame, $D^{t''} = \{ d_v^{t'',h} \}$, wherein:
$d_v^{t'',h} = q_v^{t'',h} - q_v^{t''-1,h}$,
where $q_v^{t'',h}$ is the displacement feature reference point of the t″-th frame, $q_v^{t''-1,h}$ is the displacement feature reference point of the (t″ − 1)-th frame, $1 < t'' \le T$, $v \in \{ i \mid 1 \le i \le N \}$, M is the number of displacement feature reference points and $1 \le v \le M$, h = 1 represents the left hand, h = 2 represents the right hand, and N is the number of nodes.
4. A refined gesture recognition apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the refined gesture recognition method according to any one of claims 1 to 2.
5. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for refining gesture recognition according to any one of claims 1 to 2.
CN201710656434.0A 2017-08-03 2017-08-03 Refined gesture recognition method, device and equipment Active CN107368820B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710656434.0A CN107368820B (en) 2017-08-03 2017-08-03 Refined gesture recognition method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710656434.0A CN107368820B (en) 2017-08-03 2017-08-03 Refined gesture recognition method, device and equipment

Publications (2)

Publication Number Publication Date
CN107368820A CN107368820A (en) 2017-11-21
CN107368820B true CN107368820B (en) 2023-04-18

Family

ID=60309287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710656434.0A Active CN107368820B (en) 2017-08-03 2017-08-03 Refined gesture recognition method, device and equipment

Country Status (1)

Country Link
CN (1) CN107368820B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109992093B (en) * 2017-12-29 2024-05-03 博世汽车部件(苏州)有限公司 Gesture comparison method and gesture comparison system
CN108346168B (en) * 2018-02-12 2019-08-13 腾讯科技(深圳)有限公司 A kind of images of gestures generation method, device and storage medium
CN108921101A (en) 2018-07-04 2018-11-30 百度在线网络技术(北京)有限公司 Processing method, equipment and readable storage medium storing program for executing based on gesture identification control instruction
CN109117766A (en) * 2018-07-30 2019-01-01 上海斐讯数据通信技术有限公司 A kind of dynamic gesture identification method and system
CN109117771B (en) * 2018-08-01 2022-05-27 四川电科维云信息技术有限公司 System and method for detecting violence events in image based on anchor nodes
CN110163130B (en) * 2019-05-08 2021-05-28 清华大学 Feature pre-alignment random forest classification system and method for gesture recognition
CN111222486B (en) * 2020-01-15 2022-11-04 腾讯科技(深圳)有限公司 Training method, device and equipment for hand gesture recognition model and storage medium
TWI777153B (en) * 2020-04-21 2022-09-11 和碩聯合科技股份有限公司 Image recognition method and device thereof and ai model training method and device thereof

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100071965A1 (en) * 2008-09-23 2010-03-25 Panasonic Corporation System and method for grab and drop gesture recognition
JP2010176380A (en) * 2009-01-29 2010-08-12 Sony Corp Information processing device and method, program, and recording medium
TW201123031A (en) * 2009-12-24 2011-07-01 Univ Nat Taiwan Science Tech Robot and method for recognizing human faces and gestures thereof
CN101976330B (en) * 2010-09-26 2013-08-07 中国科学院深圳先进技术研究院 Gesture recognition method and system
US9916538B2 (en) * 2012-09-15 2018-03-13 Z Advanced Computing, Inc. Method and system for feature detection
CN103246891B (en) * 2013-05-28 2016-07-06 重庆邮电大学 A kind of Chinese Sign Language recognition methods based on Kinect
CN104598915B (en) * 2014-01-24 2017-08-11 深圳奥比中光科技有限公司 A kind of gesture identification method and device
EP3699736B1 (en) * 2014-06-14 2023-03-29 Magic Leap, Inc. Methods and systems for creating virtual and augmented reality
CN106886751A (en) * 2017-01-09 2017-06-23 深圳数字电视国家工程实验室股份有限公司 A kind of gesture identification method and system

Also Published As

Publication number Publication date
CN107368820A (en) 2017-11-21

Similar Documents

Publication Publication Date Title
CN107368820B (en) Refined gesture recognition method, device and equipment
CN111815754B (en) Three-dimensional information determining method, three-dimensional information determining device and terminal equipment
EP2903256B1 (en) Image processing device, image processing method and program
WO2020244075A1 (en) Sign language recognition method and apparatus, and computer device and storage medium
CN110348412B (en) Key point positioning method and device, electronic equipment and storage medium
CN114186632B (en) Method, device, equipment and storage medium for training key point detection model
CN103907139A (en) Information processing device, information processing method, and program
Ruan et al. Dynamic gesture recognition based on improved DTW algorithm
CN111667005B (en) Human interactive system adopting RGBD visual sensing
WO2017116879A1 (en) Recognition of hand poses by classification using discrete values
CN110490444A (en) Mark method for allocating tasks, device, system and storage medium
CN107272899B (en) VR (virtual reality) interaction method and device based on dynamic gestures and electronic equipment
JP2016014954A (en) Method for detecting finger shape, program thereof, storage medium of program thereof, and system for detecting finger shape
GB2462903A (en) Single Stroke Character Recognition
CN117523659A (en) Skeleton-based multi-feature multi-stream real-time action recognition method, device and medium
Mahmud et al. On-air English Capital Alphabet (ECA) recognition using depth information
CN108392207B (en) Gesture tag-based action recognition method
Oszust et al. Isolated sign language recognition with depth cameras
KR20140035271A (en) Method and system for gesture recognition
CN114674328B (en) Map generation method, map generation device, electronic device, storage medium, and vehicle
CN111931794B (en) Sketch-based image matching method
Hisham et al. Arabic sign language recognition using Microsoft Kinect and leap motion controller
CN113553884B (en) Gesture recognition method, terminal device and computer-readable storage medium
CN114202799A (en) Method and device for determining change speed of controlled object, electronic equipment and storage medium
CN109213322B (en) Method and system for gesture recognition in virtual reality

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant