CN117423161A - Gesture recognition method, device, equipment and storage medium - Google Patents

Gesture recognition method, device, equipment and storage medium

Info

Publication number
CN117423161A
CN117423161A (application number CN202311321438.5A)
Authority
CN
China
Prior art keywords
hand
gesture
target
gesture recognition
key point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311321438.5A
Other languages
Chinese (zh)
Inventor
Xue Peng (薛鹏)
Current Assignee
Goertek Techology Co Ltd
Original Assignee
Goertek Techology Co Ltd
Priority date
Filing date
Publication date
Application filed by Goertek Techology Co Ltd
Priority to CN202311321438.5A
Publication of CN117423161A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20: Movements or behaviour, e.g. gesture recognition
    • G06V40/28: Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74: Image or video pattern matching; proximity measures in feature spaces
    • G06V10/75: Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; coarse-fine approaches, e.g. multi-scale approaches; using context analysis; selection of dictionaries
    • G06V10/757: Matching configurations of points or features
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V40/107: Static hand or arm
    • G06V40/11: Hand-related biometrics; hand pose recognition

Abstract

The invention relates to the technical field of image recognition and discloses a gesture recognition method, device, equipment and storage medium. The method comprises the following steps: acquiring a hand image of a user; recognizing the hand image through a target gesture recognition model to obtain thermal (heatmap) data for each key point of the hand, the target gesture recognition model being built on the hierarchical structure of the hand skeleton; and projecting each key point of the hand according to its thermal data and the connection relations of the hand joints to obtain the user's current gesture. In this way, after the user's hand image is acquired, it is recognized by a target gesture recognition model built on the hierarchical structure of the hand skeleton, and the hand key points are then projected in combination with the connection relations of the hand joints to obtain the user's current gesture, which effectively improves the accuracy of recognizing the user's gesture.

Description

Gesture recognition method, device, equipment and storage medium
Technical Field
The present invention relates to the field of image recognition technologies, and in particular, to a gesture recognition method, apparatus, device, and storage medium.
Background
Gesture recognition technology is widely applied in fields such as virtual reality, augmented reality, smart homes and medical treatment. However, commonly used gesture recognition approaches treat the hand key points as a whole: after a depth model outputs each key point, the outline of the gesture is obtained by linearly fitting the key points, and the final gesture is then determined directly from this outline. Yet a gesture is determined by factors such as the wrist and the finger joints, which such approaches do not take into account, so the accuracy of recognizing a user's gesture is low.
The foregoing is provided merely to facilitate understanding of the technical solutions of the present invention and does not constitute an admission that it is prior art.
Disclosure of Invention
The invention mainly aims to provide a gesture recognition method, device, equipment and storage medium, so as to solve the technical problem of low accuracy in recognizing user gestures in the prior art.
In order to achieve the above object, the present invention provides a gesture recognition method, including the steps of:
acquiring a hand image of a user;
recognizing the hand image through a target gesture recognition model to obtain thermal data of each key point of the hand, the target gesture recognition model being built on the hierarchical structure of a hand skeleton;
and projecting each key point of the hand according to the thermal data of each key point and the connection relations of the hand joints to obtain the current gesture of the user.
Optionally, the recognizing the hand image through a target gesture recognition model to obtain thermal data of each key point of the hand, the target gesture recognition model being built on the hierarchical structure of a hand skeleton, includes:
recognizing the hand image through a target image recognition model to obtain a background image;
filtering the background image out of the hand image to obtain a target hand image;
and recognizing the target hand image through the target gesture recognition model to obtain thermal data of each key point of the hand, the target gesture recognition model being built on the hierarchical structure of a hand skeleton.
Optionally, the recognizing the target hand image through the target gesture recognition model to obtain thermal data of each key point of the hand includes:
extracting features of the target hand image through a feature extraction module of the target gesture recognition model to obtain hand feature information;
and recognizing the hand characteristic information through a key point generating module of the target gesture recognition model to obtain thermal data of each key point of the hand.
Optionally, the feature extraction module of the target gesture recognition model includes a convolution extraction module, an inverted residual extraction module, a deconvolution module and a fully connected layer;
the performing feature extraction on the target hand image through the feature extraction module of the target gesture recognition model to obtain hand feature information includes:
downsampling the target hand image through the convolution extraction module to obtain a downsampled hand image;
upsampling the downsampled hand image through the deconvolution module and the fully connected layer to obtain an upsampled hand image;
and performing feature extraction on the downsampled hand image and the upsampled hand image respectively through the inverted residual extraction module to obtain hand feature information.
Optionally, the projecting each key point of the hand according to the thermal data of each key point of the hand and the connection relations of the hand joints to obtain the current gesture of the user includes:
obtaining position data of each piece of thermal data in the hand image according to the thermal data of each key point of the hand;
determining the position of each key point of the hand according to the position data of each piece of thermal data in the hand image;
projecting each key point of the hand onto the hand image according to the positions of the key points and the connection relations of the hand joints;
and determining the current gesture of the user according to the hand key points on the hand image.
Optionally, the determining the current gesture of the user according to each hand key point on the hand image includes:
generating a current gesture prototype according to the hand key points on the hand image;
querying a target gesture set by means of big data, and obtaining a gesture prototype set according to the target gesture set;
matching the current gesture prototype with each gesture prototype in the gesture prototype set to obtain a matching degree for each gesture prototype;
obtaining the highest gesture prototype matching degree from the gesture prototype matching degrees;
and determining the current gesture of the user according to the highest gesture prototype matching degree.
Optionally, before the recognizing the hand image through the target gesture recognition model to obtain the thermal data of each key point of the hand, the method further includes:
acquiring hand images of various skin tones in various scenes, and obtaining a hand feature information set according to those hand images;
determining target thermal data of each key point of the hand corresponding to the hand feature information set;
querying skeleton information of hands by means of big data, and determining the hierarchical structure of the hand skeleton according to the skeleton information;
obtaining a hand feature training set and a hand feature test set according to the hand feature information set;
constructing an initial gesture recognition model according to the hand feature training set, the target thermal data of each key point of the hand and the hierarchical structure;
calculating a loss value of the initial gesture recognition model according to the hand feature test set and the tuning parameters;
and when the loss value of the initial gesture recognition model is smaller than a preset loss value threshold, determining the initial gesture recognition model as a target gesture recognition model.
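The split-then-accept procedure above can be sketched in a few lines; the 80/20 split ratio and the 0.05 loss threshold below are illustrative assumptions, since the text only speaks of a "preset proportion" and a "preset loss value threshold":

```python
# Hedged sketch of the training/test split and the model-acceptance rule.
# train_ratio and threshold are assumed values, not taken from the patent.

def split_features(features, train_ratio=0.8):
    """Divide the hand feature set by a preset proportion."""
    n_train = int(len(features) * train_ratio)
    return features[:n_train], features[n_train:]

def is_target_model(loss_value, threshold=0.05):
    """The initial model becomes the target model once its loss drops below the preset threshold."""
    return loss_value < threshold

train_set, test_set = split_features(list(range(100)), train_ratio=0.8)
```

Training would iterate (updating the model and re-evaluating the loss on the test set) until `is_target_model` returns true.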
In addition, in order to achieve the above object, the present invention also provides a gesture recognition apparatus, including:
the acquisition module is used for acquiring the hand image of the user;
the recognition module is used for recognizing the hand image through a target gesture recognition model to obtain thermal data of each key point of the hand, and the target gesture recognition model is built through a hierarchical structure of a hand skeleton;
and the projection module is used for projecting each key point of the hand according to the thermal data of each key point of the hand and the connection relation of the hand joint to obtain the current gesture of the user.
In addition, to achieve the above object, the present invention also proposes a gesture recognition apparatus including: a memory, a processor, and a gesture recognition program stored on the memory and executable on the processor, the gesture recognition program configured to implement a gesture recognition method as described above.
In addition, in order to achieve the above object, the present invention also proposes a storage medium having stored thereon a gesture recognition program which, when executed by a processor, implements the gesture recognition method as described above.
According to the gesture recognition method of the invention, a hand image of the user is acquired; the hand image is recognized through a target gesture recognition model to obtain thermal data of each key point of the hand, the target gesture recognition model being built on the hierarchical structure of the hand skeleton; and each key point of the hand is projected according to its thermal data and the connection relations of the hand joints to obtain the user's current gesture. In this way, after the user's hand image is acquired, it is recognized by a target gesture recognition model built on the hierarchical structure of the hand skeleton, and the hand key points are then projected in combination with the connection relations of the hand joints to obtain the user's current gesture, which effectively improves the accuracy of recognizing the user's gesture.
Drawings
FIG. 1 is a schematic diagram of a gesture recognition device of a hardware running environment according to an embodiment of the present invention;
FIG. 2 is a flowchart of a gesture recognition method according to a first embodiment of the present invention;
FIG. 3 is a flowchart of a gesture recognition method according to a second embodiment of the present invention;
FIG. 4 is a flow chart illustrating a thermal data generation process according to an embodiment of the gesture recognition method of the present invention;
FIG. 5 is a flow chart illustrating feature extraction according to an embodiment of the gesture recognition method of the present invention;
FIG. 6 is a schematic functional block diagram of a gesture recognition apparatus according to a first embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a gesture recognition device in a hardware running environment according to an embodiment of the present invention.
As shown in fig. 1, the gesture recognition apparatus may include: a processor 1001, such as a central processing unit (Central Processing Unit, CPU), a communication bus 1002, a user interface 1003, a network interface 1004 and a memory 1005. The communication bus 1002 is used to enable communication connections among these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard), and optionally may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed random access memory (Random Access Memory, RAM) or a stable non-volatile memory (Non-Volatile Memory, NVM), such as a disk memory. The memory 1005 may optionally also be a storage device separate from the aforementioned processor 1001.
Those skilled in the art will appreciate that the structure shown in fig. 1 does not constitute a limitation of the gesture recognition apparatus, and may include more or fewer components than shown, or may combine certain components, or may be arranged in a different arrangement of components.
As shown in fig. 1, an operating system, a network communication module, a user interface module, and a gesture recognition program may be included in the memory 1005 as one type of storage medium.
In the gesture recognition apparatus shown in fig. 1, the network interface 1004 is mainly used for data communication with a network integration platform workstation; the user interface 1003 is mainly used for data interaction with a user; and the processor 1001 invokes the gesture recognition program stored in the memory 1005 and executes the gesture recognition method provided by the embodiments of the present invention.
Based on the hardware structure, the embodiment of the gesture recognition method is provided.
Referring to fig. 2, fig. 2 is a flowchart illustrating a gesture recognition method according to a first embodiment of the present invention.
In a first embodiment, the gesture recognition method includes the steps of:
step S10, acquiring a hand image of the user.
It should be noted that the execution body of this embodiment is a gesture recognition device, and may also be another device capable of implementing the same or similar functions, such as a gesture recognition controller; this embodiment is not limited in this respect, and a gesture recognition controller is taken as an example for explanation here.
It should be understood that the hand image refers to an image of the user's hand, which may also include the background in which the hand is located; it may be acquired by a camera or a sensor, or may be a frame extracted from a captured hand video.
Step S20, recognizing the hand image through a target gesture recognition model to obtain thermal data of each key point of the hand, wherein the target gesture recognition model is built through a hierarchical structure of a hand skeleton.
It can be understood that the target gesture recognition model refers to a model for recognizing the thermal (heatmap) data of key points, and it can be built on the hierarchical structure of a hand skeleton. The hierarchy can be divided into the wrist, front-end joints, end joints and so on: the key point of the wrist joint can affect all the finger sub-joints behind it, but the rear-end sub-joints of the hand cannot affect the joints in front of them, which conforms to the real motion rules of the hand.
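The hierarchy described above can be sketched as a parent-pointer tree rooted at the wrist; the joint indices and the 21-keypoint layout below are illustrative assumptions (a common hand-landmark convention), not a numbering given in the patent:

```python
# Hypothetical sketch of the hand-skeleton hierarchy: PARENT[i] is the joint
# that joint i hangs from; the wrist (index 0) is the root, so it influences
# every rear-end joint, while an end joint influences nothing in front of it.
PARENT = {
    0: None,                         # wrist (root)
    1: 0, 2: 1, 3: 2, 4: 3,         # thumb: front-end joints -> end joint
    5: 0, 6: 5, 7: 6, 8: 7,         # index finger
    9: 0, 10: 9, 11: 10, 12: 11,    # middle finger
    13: 0, 14: 13, 15: 14, 16: 15,  # ring finger
    17: 0, 18: 17, 19: 18, 20: 19,  # little finger
}

def ancestors(joint):
    """All joints whose movement influences `joint`, ordered wrist-first."""
    chain = []
    p = PARENT[joint]
    while p is not None:
        chain.append(p)
        p = PARENT[p]
    return chain[::-1]
```

For example, `ancestors(8)` (the index fingertip) is `[0, 5, 6, 7]`, and `ancestors(0)` is empty: the wrist drives the whole chain but is driven by nothing, matching the one-way influence rule described in the text.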
Further, before step S20, the method further includes: acquiring hand images of various skin tones in various scenes; obtaining a hand feature information set according to those hand images; determining target thermal data of each key point of the hand corresponding to the hand feature information set; querying skeleton information of hands by means of big data, and determining the hierarchical structure of the hand skeleton according to the skeleton information; obtaining a hand feature training set and a hand feature test set according to the hand feature information set; constructing an initial gesture recognition model according to the hand feature training set, the target thermal data of each key point of the hand and the hierarchical structure; calculating a loss value of the initial gesture recognition model according to the hand feature test set and the tuning parameters; and when the loss value of the initial gesture recognition model is smaller than a preset loss threshold, determining the initial gesture recognition model to be the target gesture recognition model.
It should be understood that when hand images of various skin tones in various scenes are acquired, a hand feature information set is obtained from them; the scenes may be outdoor, indoor, well-lit, dimly-lit and so on. The skeleton information of hands is then queried through big data to determine the hierarchical structure of the hand skeleton, for example wrist -> front-end joints -> end joints. The hand feature information set is then divided into a hand feature training set and a hand feature test set according to a preset proportion, an initial gesture recognition model is built using the target thermal data of each key point of the hand and the hierarchical structure, and the loss value of the initial gesture recognition model is calculated according to the hand feature test set and the tuning parameters, specifically as follows: the predicted hand feature categories, ground-truth hand feature categories, predicted joint positions, ground-truth joint positions, predicted hand feature depth information and ground-truth hand feature depth information are obtained from the hand feature test set, where the depth information refers to the distance between each hand key point and the camera; the hand feature class loss value, the joint position loss value and the depth loss value are then calculated in combination with the number of key points, specifically:
where L_cla denotes the hand feature class loss value, N the number of key points, cla_pred the predicted hand feature category, cla_gt the ground-truth hand feature category, L_pos the joint position loss value, pos_pred the predicted joint position, pos_gt the ground-truth joint position, L_depth the depth loss value, dep_pred the predicted hand feature depth information, and dep_gt the ground-truth hand feature depth information.
It can be understood that after the hand feature class loss value, the joint position loss value and the depth loss value are obtained, the loss value of the initial gesture recognition model is calculated in combination with the tuning parameters, specifically:
L = α1·L_cla + α2·L_pos + α3·L_depth
where L denotes the loss value of the initial gesture recognition model, α1 the tuning parameter for the hand feature category, α2 the tuning parameter for the joint position, and α3 the tuning parameter for the depth information.
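The weighted combination above can be sketched numerically; note that the patent does not reproduce the formulas for the three individual loss terms, so the negative-log-likelihood and mean-squared-error forms below are illustrative stand-ins only:

```python
import numpy as np

def class_loss(cla_pred, cla_gt):
    # illustrative stand-in for L_cla: mean negative log-likelihood of the
    # ground-truth category over the N key points
    n = len(cla_gt)
    return -np.mean(np.log(cla_pred[np.arange(n), cla_gt] + 1e-12))

def position_loss(pos_pred, pos_gt):
    # illustrative stand-in for L_pos: mean squared distance between
    # predicted and ground-truth joint positions
    return np.mean(np.sum((pos_pred - pos_gt) ** 2, axis=-1))

def depth_loss(dep_pred, dep_gt):
    # illustrative stand-in for L_depth: mean squared error on the
    # key-point-to-camera distances
    return np.mean((dep_pred - dep_gt) ** 2)

def total_loss(cla, pos, dep, alphas=(1.0, 1.0, 1.0)):
    # L = a1*L_cla + a2*L_pos + a3*L_depth, per the formula above
    a1, a2, a3 = alphas
    return a1 * class_loss(*cla) + a2 * position_loss(*pos) + a3 * depth_loss(*dep)
```

With perfect predictions each term vanishes and L is zero; the α weights let the training procedure trade off category, position and depth accuracy.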
And step S30, projecting each key point of the hand according to the thermal data of each key point of the hand and the connection relation of the hand joints to obtain the current gesture of the user.
It should be understood that the connection relations refer to the connections between the joints of the hand; for example, the wrist is connected to a front-end joint. After the thermal data of each key point of the hand is obtained, each key point is projected in combination with the connection relations of the hand joints, and the current gesture of the user is then determined from the hand key points on the hand image.
Further, step S30 includes: obtaining position data of each piece of thermal data in the hand image according to the thermal data of each key point of the hand; determining the position of each key point of the hand according to that position data; projecting each key point of the hand onto the hand image according to the positions of the key points and the connection relations of the hand joints; and determining the current gesture of the user according to the hand key points on the hand image.
It can be understood that after the thermal data of each key point of the hand is obtained, the position data of each piece of thermal data in the hand image is derived from it; the location of the maximum value in the position data is taken as the position of the key point, thereby determining the position of each key point of the hand. The key points are then projected onto the hand image according to their positions and the connection relations of the hand joints, and the current gesture of the user is determined from the hand key points on the hand image; the number of hand key points may be 21.
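The maximum-value rule above amounts to taking each heatmap's argmax. A minimal sketch, with synthetic heatmaps standing in for the model's output (the 21-keypoint count follows the text; the 64x64 resolution is an assumption):

```python
import numpy as np

def keypoints_from_heatmaps(heatmaps):
    """heatmaps: (K, H, W) array -> (K, 2) array of (row, col) peak positions."""
    k, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(k, -1).argmax(axis=1)  # maximum of each map
    return np.stack([flat_idx // w, flat_idx % w], axis=1)

# synthetic example: 21 low-amplitude heatmaps with one bright peak each
rng = np.random.default_rng(0)
maps = rng.random((21, 64, 64)) * 0.1
true_pos = rng.integers(0, 64, size=(21, 2))
for i, (r, c) in enumerate(true_pos):
    maps[i, r, c] = 1.0              # plant the peak at the known position
recovered = keypoints_from_heatmaps(maps)  # recovers true_pos exactly
```

The recovered (row, col) pairs are the positions that would then be projected onto the hand image along the joint connection relations.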
Further, the determining the current gesture of the user according to each key point of the hand on the hand image includes: generating a current gesture prototype according to the hand key points on the hand image; querying a target gesture set by means of big data, and obtaining a gesture prototype set according to the target gesture set; matching the current gesture prototype with each gesture prototype in the gesture prototype set to obtain a matching degree for each gesture prototype; obtaining the highest gesture prototype matching degree from the gesture prototype matching degrees; and determining the current gesture of the user according to the highest gesture prototype matching degree.
It should be understood that the current gesture prototype refers to the gesture prototype generated from the hand key points on the hand image. A gesture prototype set is obtained from the target gesture set queried through big data, the current gesture prototype is matched against each gesture prototype in the set, and the highest matching degree is selected from the resulting matching degrees; the gesture corresponding to the highest matching degree in the gesture prototype set is then taken as the current gesture of the user, and the gesture together with its annotation is displayed on the hand image.
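The matching step can be sketched as follows; cosine similarity is an illustrative choice of "matching degree" (the patent does not specify the measure), and the gesture names and keypoint layouts are made up for the example:

```python
import numpy as np

def matching_degree(a, b):
    # cosine similarity over flattened keypoint layouts, as an
    # illustrative stand-in for the patent's unspecified matching degree
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_match(current, prototype_set):
    """Return the prototype with the highest matching degree and that degree."""
    degrees = {name: matching_degree(current, proto)
               for name, proto in prototype_set.items()}
    best = max(degrees, key=degrees.get)
    return best, degrees[best]

# toy prototype set: two hypothetical gestures as (21, 2) keypoint arrays
rng = np.random.default_rng(1)
prototypes = {"fist": rng.random((21, 2)), "open_palm": rng.random((21, 2))}
current = prototypes["fist"] + 0.01 * rng.random((21, 2))  # near-fist layout
name, degree = best_match(current, prototypes)
```

A current layout close to the "fist" prototype matches it with a degree near 1.0, and that best-matching gesture is taken as the user's current gesture.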
In this embodiment, a hand image of the user is acquired; the hand image is recognized through a target gesture recognition model to obtain thermal data of each key point of the hand, the target gesture recognition model being built on the hierarchical structure of the hand skeleton; and each key point of the hand is projected according to its thermal data and the connection relations of the hand joints to obtain the user's current gesture. In this way, after the user's hand image is acquired, it is recognized by a target gesture recognition model built on the hierarchical structure of the hand skeleton, and the hand key points are then projected in combination with the connection relations of the hand joints to obtain the user's current gesture, which effectively improves the accuracy of recognizing the user's gesture.
In an embodiment, as shown in fig. 3, a second embodiment of the gesture recognition method according to the present invention is provided based on the first embodiment, and the step S20 includes:
step S201, identifying the hand image through a target image identification model, so as to obtain a background image.
It should be understood that the background image refers to an image of the background in which the hand is located, which is mixed in the hand image, and thus, after the hand image of the user is acquired, the background image in the hand image needs to be recognized by the target image recognition model, which may be the YOLOv5 model.
Step S202, filtering the background image from the hand image to obtain a target hand image.
It can be understood that the target hand image refers to the hand image remaining after the background image is filtered out; after the background image is obtained, it is filtered out of the hand image.
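The filtering step can be sketched as masking out background pixels; the hand-made boolean mask below stands in for the output of the image recognition model:

```python
import numpy as np

def filter_background(hand_image, background_mask):
    """hand_image: (H, W, C); background_mask: (H, W), True where background.
    Returns the target hand image with background pixels zeroed out."""
    target = hand_image.copy()
    target[background_mask] = 0
    return target

# toy example: a uniform image whose left half is marked as background
image = np.full((4, 4, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[:, :2] = True
target = filter_background(image, mask)
```

Only the unmasked region survives, giving the target hand image passed to the gesture recognition model in step S203.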
Step S203, recognizing the target hand image through a target gesture recognition model to obtain thermal data of each key point of the hand, the target gesture recognition model being built on the hierarchical structure of a hand skeleton.
It should be understood that after the target hand image is obtained, it is input into the target gesture recognition model, which outputs the thermal data of each key point of the hand through a series of processes such as feature extraction and information recognition; it should be emphasized that the construction of the target gesture recognition model takes the hierarchical structure of the hand skeleton into account.
Further, the recognizing the target hand image through the target gesture recognition model to obtain thermal data of each key point of the hand includes: performing feature extraction on the target hand image through the feature extraction module of the target gesture recognition model to obtain hand feature information; and recognizing the hand feature information through the key point generation module of the target gesture recognition model to obtain the thermal data of each key point of the hand.
It may be understood that the hand feature information refers to features capable of uniquely identifying the target hand image. Referring to fig. 4, fig. 4 is a schematic flow chart of thermal data generation, specifically: after the target hand image is input, the feature extraction module of the target gesture recognition model first performs feature extraction on it; the extracted hand feature information is then input to the key point generation module of the target gesture recognition model, which recognizes the hand feature information to obtain the thermal data of each key point of the hand. The thermal data can represent the positional relations among the hand key points.
Further, the feature extraction module of the target gesture recognition model comprises a convolution extraction module, an inverted residual extraction module, a deconvolution module and a fully connected layer. The performing feature extraction on the target hand image through the feature extraction module of the target gesture recognition model to obtain hand feature information includes: downsampling the target hand image through the convolution extraction module to obtain a downsampled hand image; upsampling the downsampled hand image through the deconvolution module and the fully connected layer to obtain an upsampled hand image; and performing feature extraction on the downsampled hand image and the upsampled hand image respectively through the inverted residual extraction module to obtain hand feature information.
It should be understood that the convolution extraction module is configured to expand the receptive field of the convolution kernel, and the inverted residual module is configured to ensure good information extraction capability with a low parameter count. Because the generation module for each key point is an autoencoder, downsampling is performed before upsampling, the purpose of which is to extract effective hand feature information. Referring to fig. 5, fig. 5 is a schematic diagram of the extraction flow, specifically: after the target hand image is obtained, it is first downsampled through the convolution extraction module; the downsampled hand image is then input into the deconvolution module and the fully connected layer respectively and upsampled through them, after which the key point generation sub-module is used; the downsampled hand image and the upsampled hand image are then each passed through the inverted residual extraction module for feature extraction to obtain hand feature information. In this way, the generation module for each end hand feature only needs to output the thermodynamic diagram (heatmap) of its key point and does not need to output other redundant feature maps.
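The autoencoder shape of this module (downsample first, then restore resolution) can be sketched numerically; stride-2 average pooling and nearest-neighbour upsampling below are simple stand-ins for the real convolution and deconvolution modules, whose learned weights the patent does not disclose:

```python
import numpy as np

def downsample(x):
    """Stride-2 'convolutional' downsampling of an (H, W) feature map
    (average pooling as an illustrative stand-in)."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour 2x upsampling as a stand-in for the
    deconvolution module plus fully connected layer."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

x = np.arange(64, dtype=float).reshape(8, 8)  # toy target-hand feature map
down = downsample(x)   # (4, 4): compressed features for extraction
up = upsample(down)    # (8, 8): resolution restored, as in fig. 5
```

Both the compressed and the restored maps would then be fed to the inverted residual extraction module, matching the two-branch flow described above.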
In this embodiment, the hand image is recognized through the target image recognition model to obtain a background image; the background image is filtered from the hand image to obtain a target hand image; and the target hand image is recognized through the target gesture recognition model to obtain thermal data of each key point of the hand, the target gesture recognition model being built from the hierarchical structure of the hand skeleton. In this way, the target image recognition model identifies the background image, the background image is filtered from the hand image, and the filtered target hand image is recognized by the target gesture recognition model to obtain the thermal data of each key point of the hand, which can effectively improve the accuracy of the thermal data.
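A minimal sketch of the background-filtering step, assuming the target image recognition model yields a per-pixel background mask; the mask, the threshold, and the pixel values here are illustrative assumptions only:

```python
import numpy as np

def filter_background(hand_image, background_mask):
    """Zero out the pixels the image recognition model flagged as
    background, leaving only the hand region (the target hand image)."""
    return np.where(background_mask, 0, hand_image)

frame = np.array([[10, 10, 200],
                  [10, 210, 205],
                  [10, 10, 198]])
# Hypothetical per-pixel background mask as a recognition model might
# produce it (True = background); a simple brightness threshold here.
mask = frame < 100
target = filter_background(frame, mask)
```

The filtered `target` keeps only the hand pixels, so the gesture recognition model is not distracted by background content.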
In addition, an embodiment of the present invention further provides a storage medium on which a gesture recognition program is stored; when executed by a processor, the gesture recognition program implements the steps of the gesture recognition method described above.
Because the storage medium adopts all the technical solutions of all the above embodiments, it has at least all the beneficial effects brought by those technical solutions, which are not repeated here.
In addition, referring to fig. 6, an embodiment of the present invention further provides a gesture recognition apparatus, where the gesture recognition apparatus includes:
an acquisition module 10, configured to acquire a hand image of a user.
The recognition module 20 is configured to recognize the hand image through a target gesture recognition model, and obtain thermal data of each key point of the hand, where the target gesture recognition model is built through a hierarchical structure of a hand skeleton.
And the projection module 30 is used for projecting each key point of the hand according to the thermal data of each key point of the hand and the connection relation of the hand joints to obtain the current gesture of the user.
This embodiment acquires a hand image of the user; recognizes the hand image through a target gesture recognition model to obtain thermal data of each key point of the hand, the target gesture recognition model being built from the hierarchical structure of a hand skeleton; and projects each key point of the hand according to the thermal data of each key point of the hand and the connection relation of the hand joints to obtain the current gesture of the user. In this way, after the hand image of the user is obtained, it is recognized by a target gesture recognition model built from the hierarchical structure of the hand skeleton, and the key points of the hand are then projected in combination with the connection relation of the hand joints to obtain the current gesture of the user, which can effectively improve the accuracy of recognizing the user's gesture.
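The heatmap-to-gesture flow above can be illustrated with a toy numpy sketch: the argmax of each key point's thermal data gives that key point's image position, and a joint-connection table turns the positions into skeleton segments. The connection table and heatmaps below are hypothetical miniatures (a real hand skeleton has many more joints):

```python
import numpy as np

# Hypothetical hand-skeleton connections: each tuple links a parent
# joint index to a child joint index (e.g. wrist -> knuckle -> tip).
HAND_CONNECTIONS = [(0, 1), (1, 2)]

def keypoints_from_heatmaps(heatmaps):
    """Take the argmax of each per-keypoint heatmap as that keypoint's
    (row, col) position in the image."""
    return [np.unravel_index(np.argmax(h), h.shape) for h in heatmaps]

def project_skeleton(keypoints, connections=HAND_CONNECTIONS):
    """Pair up keypoint positions according to the joint connections,
    yielding the line segments that draw the current gesture."""
    return [(keypoints[a], keypoints[b]) for a, b in connections]

# Three toy 4x4 heatmaps, each peaking at a different pixel.
heatmaps = np.zeros((3, 4, 4))
heatmaps[0, 0, 0] = 1.0
heatmaps[1, 1, 2] = 1.0
heatmaps[2, 3, 3] = 1.0
points = keypoints_from_heatmaps(heatmaps)   # [(0, 0), (1, 2), (3, 3)]
segments = project_skeleton(points)
```

The resulting `segments` are what gets projected back onto the hand image to determine the current gesture.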
It should be noted that the above working procedure is merely illustrative and does not limit the scope of the present invention; in practical applications, those skilled in the art may select part or all of it according to actual needs to achieve the purpose of this embodiment, which is not limited herein.
In addition, technical details that are not described in detail in the present embodiment may refer to the gesture recognition method provided in any embodiment of the present invention, and are not described herein.
In an embodiment, the recognition module 20 is further configured to: acquire hand images of various skin colors in each scene; obtain a hand feature information set from the hand images of the skin colors in each scene; determine target thermal data of each key point of the hand corresponding to the hand feature information set; query skeleton information of each hand using big data and determine the hierarchical structure of the hand skeleton from the skeleton information of each hand; obtain a hand feature training set and a hand feature test set from the hand feature information set; construct an initial gesture recognition model from the hand feature training set, the target thermal data of each key point of the hand and the hierarchical structure; calculate a loss value of the initial gesture recognition model from the hand feature test set and the tuning parameters; and when the loss value of the initial gesture recognition model is smaller than a preset loss value threshold, determine the initial gesture recognition model as the target gesture recognition model.
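The train-until-threshold loop described above can be sketched as follows. The geometric loss decay standing in for one round of parameter tuning is purely illustrative; the real loss comes from evaluating the model on the hand feature test set:

```python
def train_until_converged(initial_loss, threshold=0.05, max_epochs=100):
    """Toy version of the training loop: tune parameters, recompute the
    loss, and stop once the loss drops below the preset threshold, at
    which point the initial model is accepted as the target model."""
    loss = initial_loss
    for epoch in range(max_epochs):
        if loss < threshold:
            return loss, epoch   # accepted as target gesture model
        loss *= 0.8              # stand-in for one round of tuning
    return loss, max_epochs      # budget exhausted without converging

final_loss, epochs = train_until_converged(1.0, threshold=0.05)
```

The key control point is the comparison against the preset loss value threshold: only a model that passes it is promoted to the target gesture recognition model.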
In an embodiment, the recognition module 20 is further configured to recognize the hand image through a target image recognition model to obtain a background image; filtering the background image from the hand image to obtain a target hand image; and identifying the target hand image through a target gesture identification model to obtain thermal data of each key point of the hand, wherein the target gesture identification model is built through a hierarchical structure of a hand skeleton.
In an embodiment, the recognition module 20 is further configured to perform feature extraction on the target hand image through a feature extraction module of the target gesture recognition model to obtain hand feature information; and recognizing the hand characteristic information through a key point generating module of the target gesture recognition model to obtain thermal data of each key point of the hand.
In an embodiment, the feature extraction module of the target gesture recognition model includes a convolution extraction module, an inverted residual extraction module, a deconvolution module, and a fully connected layer; the recognition module 20 is further configured to downsample the target hand image through the convolution extraction module to obtain a downsampled hand image; upsample the downsampled hand image through the deconvolution module and the fully connected layer to obtain an upsampled hand image; and perform feature extraction on the downsampled hand image and the upsampled hand image respectively through the inverted residual extraction module to obtain hand feature information.
In an embodiment, the projection module 30 is further configured to obtain the position data of each thermal data in the hand image according to the thermal data of each key point of the hand; determining the position of each key point of the hand according to the position data of each thermal data in the hand image; projecting each key point of the hand to a hand image according to the positions of each key point of the hand and the connection relation of the hand joints; and determining the current gesture of the user according to each key point of the hand on the hand image.
In an embodiment, the projection module 30 is further configured to: generate a current gesture prototype from each key point of the hand on the hand image; query a target gesture set using big data and obtain a gesture prototype set from the target gesture set; match the current gesture prototype with each gesture prototype in the gesture prototype set to obtain a matching degree for each gesture prototype; obtain the highest gesture prototype matching degree from the gesture prototype matching degrees; and determine the current gesture of the user according to the highest gesture prototype matching degree.
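The prototype-matching step can be sketched with a toy matching degree based on the mean distance between key points. Both the score function and the two-entry prototype set are illustrative assumptions, not the patent's actual metric or gesture library:

```python
import numpy as np

def matching_degree(current, prototype):
    """Score how well the current gesture prototype matches a stored
    one: inverse of the mean keypoint distance (higher = better)."""
    cur = np.asarray(current, dtype=float)
    pro = np.asarray(prototype, dtype=float)
    return 1.0 / (1.0 + np.linalg.norm(cur - pro, axis=1).mean())

# Hypothetical gesture prototype set: name -> keypoint positions.
prototypes = {
    "fist": [(0, 0), (1, 1), (2, 1)],
    "open": [(0, 0), (2, 4), (5, 8)],
}
current = [(0, 0), (1, 1), (2, 2)]
scores = {name: matching_degree(current, p)
          for name, p in prototypes.items()}
best = max(scores, key=scores.get)   # highest matching degree wins
```

Picking the prototype with the highest matching degree is exactly the "highest gesture prototype matching degree" selection the module performs.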
Other embodiments of the gesture recognition apparatus or the implementation method thereof may refer to the above method embodiments, and are not repeated herein.
Furthermore, it should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or system that comprises that element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, and of course may also be implemented by hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and including several instructions for causing a terminal device (which may be a mobile phone, a computer, an integrated platform workstation, a network device, or the like) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (10)

1. A gesture recognition method, characterized in that the gesture recognition method comprises the steps of:
acquiring a hand image of a user;
the hand image is identified through a target gesture identification model, thermal data of each key point of the hand is obtained, and the target gesture identification model is built through a hierarchical structure of a hand skeleton;
and projecting each key point of the hand according to the thermal data of each key point of the hand and the connection relation of the hand joint to obtain the current gesture of the user.
2. The gesture recognition method of claim 1, wherein recognizing the hand image through a target gesture recognition model to obtain thermal data of each key point of the hand, the target gesture recognition model being built through a hierarchical structure of a hand skeleton, comprises:
identifying the hand image through a target image identification model to obtain a background image;
filtering the background image from the hand image to obtain a target hand image;
and identifying the target hand image through a target gesture identification model to obtain thermal data of each key point of the hand, wherein the target gesture identification model is built through a hierarchical structure of a hand skeleton.
3. The gesture recognition method of claim 2, wherein the recognizing the target hand image by the target gesture recognition model to obtain thermal data of each key point of the hand comprises:
extracting features of the target hand image through a feature extraction module of the target gesture recognition model to obtain hand feature information;
and recognizing the hand characteristic information through a key point generating module of the target gesture recognition model to obtain thermal data of each key point of the hand.
4. The gesture recognition method of claim 3, wherein the feature extraction module of the target gesture recognition model comprises a convolution extraction module, an inverted residual extraction module, a deconvolution module, and a fully connected layer;
and wherein performing feature extraction on the target hand image through the feature extraction module of the target gesture recognition model to obtain the hand feature information comprises:
downsampling the target hand image through the convolution extraction module to obtain a downsampled hand image;
upsampling the downsampled hand image through the deconvolution module and the fully connected layer to obtain an upsampled hand image;
and performing feature extraction on the downsampled hand image and the upsampled hand image respectively through the inverted residual extraction module to obtain hand feature information.
5. The gesture recognition method of claim 1, wherein the projecting each key point of the hand according to the thermal data of each key point of the hand and the connection relation of the hand joint to obtain the current gesture of the user comprises:
obtaining the position data of each thermal data in the hand image according to the thermal data of each key point of the hand;
determining the position of each key point of the hand according to the position data of each thermal data in the hand image;
projecting each key point of the hand to a hand image according to the positions of each key point of the hand and the connection relation of the hand joints;
and determining the current gesture of the user according to each key point of the hand on the hand image.
6. The gesture recognition method of claim 5, wherein determining the current gesture of the user from each key point of the hand on the hand image comprises:
generating a current gesture prototype according to each key point of the hand on the hand image;
inquiring a target gesture set by adopting big data, and obtaining a gesture prototype set according to the target gesture set;
matching the current gesture prototype with each gesture prototype in the gesture prototype set to obtain the matching degree of each gesture prototype;
obtaining the highest gesture prototype matching degree from the gesture prototype matching degrees;
and determining the current gesture of the user according to the highest gesture prototype matching degree.
7. The gesture recognition method of claim 1, wherein before the hand image is recognized by the target gesture recognition model to obtain thermal data of each key point of the hand, the gesture recognition method further comprises:
acquiring hand images of various skin colors in each scene; and obtaining a hand characteristic information set according to the hand images of the skin colors in each scene;
determining target thermal data of each key point of the hand corresponding to the hand characteristic information set;
inquiring skeleton information of each hand by adopting big data, and determining a hierarchical structure of the hand skeleton according to the skeleton information of each hand;
obtaining a hand feature training set and a hand feature test set according to the hand characteristic information set;
constructing an initial gesture recognition model according to the hand feature training set, target thermal data of each key point of the hand and the hierarchical structure;
calculating a loss value of the initial gesture recognition model according to the hand feature test set and the tuning parameters;
and when the loss value of the initial gesture recognition model is smaller than a preset loss value threshold, determining the initial gesture recognition model as a target gesture recognition model.
8. A gesture recognition apparatus, the gesture recognition apparatus comprising:
the acquisition module is used for acquiring the hand image of the user;
the recognition module is used for recognizing the hand image through a target gesture recognition model to obtain thermal data of each key point of the hand, and the target gesture recognition model is built through a hierarchical structure of a hand skeleton;
and the projection module is used for projecting each key point of the hand according to the thermal data of each key point of the hand and the connection relation of the hand joint to obtain the current gesture of the user.
9. A gesture recognition apparatus, the gesture recognition apparatus comprising: memory, a processor and a gesture recognition program stored on the memory and executable on the processor, the gesture recognition program being configured to implement the gesture recognition method of any one of claims 1 to 7.
10. A storage medium having stored thereon a gesture recognition program which, when executed by a processor, implements the gesture recognition method of any one of claims 1 to 7.
CN202311321438.5A 2023-10-12 2023-10-12 Gesture recognition method, device, equipment and storage medium Pending CN117423161A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311321438.5A CN117423161A (en) 2023-10-12 2023-10-12 Gesture recognition method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117423161A true CN117423161A (en) 2024-01-19



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination