CN111753764A - Gesture recognition method at the edge based on pose estimation - Google Patents

Gesture recognition method at the edge based on pose estimation

Info

Publication number
CN111753764A
CN111753764A (application CN202010601428.7A)
Authority
CN
China
Prior art keywords
gesture
key frame
action
actions
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010601428.7A
Other languages
Chinese (zh)
Inventor
李雪 (Li Xue)
李锐 (Li Rui)
金长新 (Jin Changxin)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan Inspur Hi Tech Investment and Development Co Ltd
Original Assignee
Jinan Inspur Hi Tech Investment and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan Inspur Hi Tech Investment and Development Co Ltd filed Critical Jinan Inspur Hi Tech Investment and Development Co Ltd
Priority to CN202010601428.7A priority Critical patent/CN111753764A/en
Publication of CN111753764A publication Critical patent/CN111753764A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks

Abstract

The invention discloses a gesture recognition method for edge devices based on pose estimation, in the technical field of image recognition. The captured image is preprocessed and key frame images are screened; the gesture action in each key frame image is identified and judged to be either a static or a dynamic gesture action. For dynamic gesture actions, motion information is compensated using optical-flow motion compensation. A gesture recognition model analyzes static gesture actions in the key frame images with a pose estimation algorithm to obtain their hand key points; dynamic gesture actions are likewise analyzed with the pose estimation algorithm, and their hand key points are obtained by combining the compensated motion information. The gesture actions are then recognized and classified.

Description

Gesture recognition method at the edge based on pose estimation
Technical Field
The invention discloses a gesture recognition method in the technical field of image recognition, and in particular a gesture recognition method for edge devices based on pose estimation.
Background
Gesture communication is most common among people with hearing and speech impairments; however, gesture-based human-computer interaction systems have a wide variety of application scenarios, for example motion-sensing games, gesture-controlled aircraft and robots, or special environments where speech is inconvenient or direct contact is impossible. Gesture recognition therefore facilitates human-computer interaction. At present, one approach to gesture recognition is based on wearable electromagnetic equipment, such as special gloves, and is mainly applied to fields such as motion capture for film; the other uses computer vision to detect, recognize and classify gesture images. However, existing gesture recognition based on image feature extraction usually performs poorly: it is not sensitive enough, its error is large, and its results are unsatisfactory.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a gesture recognition method for edge devices based on pose estimation: pose estimation is used to extract features of gesture key points to complete action recognition, and model compression together with optical-flow information compensation allows deployment on edge devices, improving real-time performance and avoiding phenomena such as stuttering during recognition.
The specific scheme provided by the invention is as follows:
a gesture recognition method based on posture estimation at edge end comprises preprocessing captured image, screening key frame image, recognizing gesture action of key frame image,
determining whether the gesture motion is a static gesture motion or a dynamic gesture motion,
motion information compensation is performed by using motion compensation of optical flow for dynamic gesture motion,
and analyzing the static gesture actions in the key frame image by utilizing a gesture recognition model through a gesture estimation algorithm to obtain hand key points of the static gesture actions, analyzing the dynamic gesture actions in the key frame image through the gesture estimation algorithm, combining compensated action information to obtain hand key points of the dynamic gesture actions, and recognizing and classifying the gesture actions.
In this method, an action whose action-sequence combination in the key frame images has a position change range smaller than a threshold is a static gesture action, and an action whose position change range is at or above the threshold is a dynamic gesture action.
Based on the key frame images, the frames from one key frame image up to the frame before the next key frame image are combined as an action sequence.
In this method, the gesture recognition model is compressed by quantization-aware training.
In this method, the gesture recognition model is a neural network model.
A gesture recognition system at the edge based on pose estimation comprises a preprocessing module, a judging module, a compensation module and a model recognition module:
the preprocessing module preprocesses the captured image, screens key frame images, and identifies the gesture action in each key frame image;
the judging module determines whether the gesture action is a static or a dynamic gesture action;
the compensation module compensates motion information for dynamic gesture actions using optical-flow motion compensation;
the model recognition module analyzes static gesture actions in the key frame images with a gesture recognition model through a pose estimation algorithm to obtain their hand key points, analyzes dynamic gesture actions through the pose estimation algorithm and combines the compensated motion information to obtain their hand key points, and recognizes and classifies the gesture actions.
A gesture recognition apparatus at the edge based on pose estimation comprises: at least one memory and at least one processor;
the at least one memory stores a machine-readable program;
the at least one processor invokes the machine-readable program to execute the above gesture recognition method at the edge based on pose estimation.
A computer-readable medium has computer instructions stored thereon which, when executed by a processor, cause the processor to perform the above gesture recognition method at the edge based on pose estimation.
The invention has the advantages that:
the invention provides a gesture recognition method of an edge terminal based on gesture estimation, which is characterized in that a gesture action of a key frame image is recognized by screening the key frame image, a static gesture action and a dynamic gesture action are judged, a gesture recognition model is introduced into a gesture estimation algorithm to respectively acquire key points of the static gesture action and the dynamic gesture action, and action recognition is completed by combining rules to effectively improve the recognition speed.
Drawings
FIG. 1 is a diagram illustrating the selection of action sequence combinations in the present invention;
FIG. 2 is a schematic flow diagram of the process of the present invention;
FIG. 3 is a schematic illustration of key points of the hand.
Detailed Description
The present invention is further described below in conjunction with the figures and specific examples so that those skilled in the art can better understand and practice it; the examples are not intended to limit the invention.
The invention provides a gesture recognition method at the edge based on pose estimation: the captured image is preprocessed, key frame images are screened, and the gesture action in each key frame image is identified;
it is determined whether the gesture action is a static or a dynamic gesture action;
motion information is compensated for dynamic gesture actions using optical-flow motion compensation;
and static gesture actions in the key frame images are analyzed with a gesture recognition model through a pose estimation algorithm to obtain their hand key points, dynamic gesture actions are analyzed through the pose estimation algorithm and the compensated motion information is combined to obtain their hand key points, and the gesture actions are recognized and classified.
In the field of action recognition, a Kinect camera is often adopted because it can obtain RGB images, depth images and skeleton point information simultaneously; however, Kinect has drawbacks such as high cost, low resolution and limited recognition distance, and because the hand is small in the image and its motion amplitude is small, the images and information obtained by Kinect cannot meet the requirements. Current gesture recognition typically first segments the gesture image from a complex scene, then extracts features from the gesture image, and finally performs classification. In the invention, a pose estimation algorithm replaces Kinect for obtaining hand key point information: key points are acquired directly from the hand and action recognition is completed by combining rules, which effectively improves recognition speed. Meanwhile, distinguishing static from dynamic gesture actions and applying optical-flow motion information compensation to the dynamic ones guarantees the recognition accuracy and speed of the network and avoids severe stuttering during recognition.
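As a concrete illustration of the key-frame screening step mentioned above, the following is a minimal sketch. The frame representation (grayscale 2-D lists), the mean-absolute-difference criterion and the threshold value are illustrative assumptions, not taken from the patent.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two grayscale frames (2-D lists)."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))

def screen_key_frames(frames, threshold=20.0):
    """Keep frame 0, then every frame that differs enough from the last
    key frame; returns the indices of the selected key frames."""
    keys = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[keys[-1]]) >= threshold:
            keys.append(i)
    return keys
```

A near-duplicate frame is skipped, while a frame that changes substantially relative to the last key frame starts a new key frame; the frames between two key frames then form one action-sequence combination, as in the embodiment described below.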
In one embodiment of the invention, key frame selection focuses on the context between frames rather than on the number of frames: the frames from one key frame up to the frame before the next key frame are combined as an action sequence, see figure 1.
A static gesture action only involves features in the spatial dimension, i.e., it is an action whose action-sequence combination in the key frame images has a position change range smaller than the threshold.
A dynamic gesture action involves, besides spatial features, the logical relationship in the time dimension, i.e., it is an action whose action-sequence combination has a position change range at or above the threshold; when identifying its hand key points, optical-flow motion information compensation must be added: the frame at the start of the action is identified, and the position changes of the action sequence before the next key frame are then recorded.
By learning and recognizing this information, the gesture recognition model can effectively improve recognition speed.
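The static/dynamic threshold rule and the recording of position changes can be sketched as follows. The key-point format, the pixel threshold and the helper names are assumptions for illustration; a real system would obtain dense motion from an optical-flow algorithm rather than from key-point differences.

```python
import math

def max_displacement(sequence):
    """Largest movement of any hand key point across an action sequence.

    `sequence` is a list of frames; each frame is a list of (x, y)
    hand key points in a fixed order (same joint at the same index).
    """
    first, last = sequence[0], sequence[-1]
    return max(math.dist(p0, p1) for p0, p1 in zip(first, last))

def classify_action(sequence, threshold=10.0):
    """Static if every key point moved less than `threshold` pixels."""
    return "static" if max_displacement(sequence) < threshold else "dynamic"

def accumulate_motion(sequence):
    """Per-key-point displacement between consecutive frames, summed.

    A stand-in for optical-flow motion compensation: it records how each
    key point of a dynamic action drifts before the next key frame.
    """
    totals = [(0.0, 0.0)] * len(sequence[0])
    for prev, cur in zip(sequence, sequence[1:]):
        totals = [(tx + (cx - px), ty + (cy - py))
                  for (tx, ty), (px, py), (cx, cy) in zip(totals, prev, cur)]
    return totals
```

For a dynamic action, the accumulated per-key-point displacements would be combined with the key points detected at the start of the action when recognizing the gesture.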
In another embodiment of the invention, the hand key points are specified, referring to fig. 3: they comprise the fingertips, each joint point, and one key point at the carpal bone. The coordinates of each key point can be obtained with a pose estimation algorithm, and the distances between key points are then computed; the bending degree of a finger can be represented by measuring the distance from a finger joint point to the finger root, from which the gesture posture is determined. Accuracy can be further improved by adding an intermediate layer to the gesture recognition model and using a multi-resolution gesture depth map as the network input: a depth map is insensitive to the lighting changes that affect RGB images, so the applicable scenes are wider.
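The finger-bend measurement can be sketched as a ratio of distances between key points. The 21-key-point layout (wrist plus four points per finger) and the specific indices are assumptions borrowed from common hand pose models, not from the patent, and the ratio is one plausible variant of the joint-to-root distance measure described above.

```python
import math

# Assumed 21-key-point layout: index 0 is the wrist (carpal) point and
# each finger contributes four points from root joint to fingertip,
# e.g. the index finger occupies indices 5 (root) .. 8 (tip).
INDEX_ROOT, INDEX_TIP = 5, 8

def finger_bend(keypoints, root=INDEX_ROOT, tip=INDEX_TIP):
    """Bend measure in (0, 1]: ratio of tip-to-root distance to the
    summed bone lengths along the finger.

    Near 1.0 means the finger is straight; small values mean it is curled.
    """
    straight = math.dist(keypoints[root], keypoints[tip])
    bones = sum(math.dist(keypoints[i], keypoints[i + 1])
                for i in range(root, tip))
    return straight / bones
```

Thresholding this ratio per finger yields a coarse per-finger open/closed state, which can then be matched against rules to determine the gesture posture.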
In an embodiment of the invention, a neural network model can be selected as the gesture recognition model, and the model is compressed and quantized. Quantization-aware training is chosen over direct (post-training) quantization, which can lose neurons and cause missed or erroneous recognition; a model trained with quantization in the loop effectively reduces such missed or erroneous recognition and preserves a certain recognition accuracy.
The invention also provides a gesture recognition system at the edge based on pose estimation, comprising a preprocessing module, a judging module, a compensation module and a model recognition module:
the preprocessing module preprocesses the captured image, screens key frame images, and identifies the gesture action in each key frame image;
the judging module determines whether the gesture action is a static or a dynamic gesture action;
the compensation module compensates motion information for dynamic gesture actions using optical-flow motion compensation;
the model recognition module analyzes static gesture actions in the key frame images with a gesture recognition model through a pose estimation algorithm to obtain their hand key points, analyzes dynamic gesture actions through the pose estimation algorithm and combines the compensated motion information to obtain their hand key points, and recognizes and classifies the gesture actions.
The information interaction and execution processes between the modules of the system are based on the same concept as the method embodiments of the present invention; for details, refer to the description of the method embodiments, which is not repeated here.
The invention also provides a gesture recognition apparatus at the edge based on pose estimation, comprising: at least one memory and at least one processor;
the at least one memory stores a machine-readable program;
the at least one processor invokes the machine-readable program to execute the above gesture recognition method at the edge based on pose estimation.
A computer-readable medium has computer instructions stored thereon which, when executed by a processor, cause the processor to perform the above method. Specifically, a system or apparatus may be provided that is equipped with a storage medium storing software program code realizing the functions of any of the above embodiments, and a computer (or CPU or MPU) of that system or apparatus reads out and executes the program code stored in the storage medium.
In this case, the program code itself read from the storage medium can realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code constitute a part of the present invention.
Examples of the storage medium for supplying the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD + RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer via a communications network.
Further, it should be clear that the functions of any one of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform a part or all of the actual operations based on instructions of the program code.
Further, it is to be understood that the program code read out from the storage medium is written to a memory provided in an expansion board inserted into the computer or to a memory provided in an expansion unit connected to the computer, and then causes a CPU or the like mounted on the expansion board or the expansion unit to perform part or all of the actual operations based on instructions of the program code, thereby realizing the functions of any of the above-described embodiments.
It should be noted that not all steps and modules in the above flows and system structure diagrams are necessary, and some steps or modules may be omitted according to actual needs. The execution order of the steps is not fixed and can be adjusted as required. The system structure described in the above embodiments may be a physical structure or a logical structure, that is, some modules may be implemented by the same physical entity, or some modules may be implemented by a plurality of physical entities, or some components in a plurality of independent devices may be implemented together.
In the above embodiments, the hardware unit may be implemented mechanically or electrically. For example, a hardware element may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware elements may also comprise programmable logic or circuitry, such as a general purpose processor or other programmable processor, that may be temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, or dedicated permanent, or temporarily set) may be determined based on cost and time considerations.
The above-mentioned embodiments are merely preferred embodiments for fully illustrating the present invention, and the scope of the present invention is not limited thereto. Equivalent substitutions or changes made by those skilled in the art on the basis of the invention all fall within its protection scope. The protection scope of the invention is subject to the claims.

Claims (8)

1. A gesture recognition method at the edge based on pose estimation, characterized in that the captured image is preprocessed, key frame images are screened, and the gesture action in each key frame image is identified;
it is determined whether the gesture action is a static or a dynamic gesture action;
motion information is compensated for dynamic gesture actions using optical-flow motion compensation;
and static gesture actions in the key frame images are analyzed with a gesture recognition model through a pose estimation algorithm to obtain their hand key points, dynamic gesture actions are analyzed through the pose estimation algorithm and the compensated motion information is combined to obtain their hand key points, and the gesture actions are recognized and classified.
2. The method of claim 1, wherein an action whose action-sequence combination in the key frame images has a position change range smaller than a threshold is a static gesture action, and an action whose position change range is at or above the threshold is a dynamic gesture action.
3. The method of claim 2, wherein the frames from one key frame image up to the frame before the next key frame image are combined as an action sequence.
4. The method of any of claims 1-3, wherein the gesture recognition model is compressed by quantization-aware training.
5. The method of claim 4, wherein the gesture recognition model is a neural network model.
6. A gesture recognition system at the edge based on pose estimation, characterized by comprising a preprocessing module, a judging module, a compensation module and a model recognition module:
the preprocessing module preprocesses the captured image, screens key frame images, and identifies the gesture action in each key frame image;
the judging module determines whether the gesture action is a static or a dynamic gesture action;
the compensation module compensates motion information for dynamic gesture actions using optical-flow motion compensation;
the model recognition module analyzes static gesture actions in the key frame images with a gesture recognition model through a pose estimation algorithm to obtain their hand key points, analyzes dynamic gesture actions through the pose estimation algorithm and combines the compensated motion information to obtain their hand key points, and recognizes and classifies the gesture actions.
7. A gesture recognition apparatus at the edge based on pose estimation, characterized by comprising: at least one memory and at least one processor;
the at least one memory stores a machine-readable program;
the at least one processor invokes the machine-readable program to execute the gesture recognition method at the edge based on pose estimation according to any one of claims 1 to 5.
8. A computer-readable medium, characterized in that computer instructions are stored thereon which, when executed by a processor, cause the processor to execute the gesture recognition method at the edge based on pose estimation according to any one of claims 1 to 5.
Application CN202010601428.7A, filed 2020-06-29, published as CN111753764A (pending).

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010601428.7A CN111753764A (en) 2020-06-29 2020-06-29 Gesture recognition method of edge terminal based on attitude estimation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010601428.7A CN111753764A (en) 2020-06-29 2020-06-29 Gesture recognition method of edge terminal based on attitude estimation

Publications (1)

Publication Number Publication Date
CN111753764A (en) 2020-10-09

Family

ID=72677729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010601428.7A Pending CN111753764A (en) 2020-06-29 2020-06-29 Gesture recognition method of edge terminal based on attitude estimation

Country Status (1)

Country Link
CN (1) CN111753764A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112926454A (en) * 2021-02-26 2021-06-08 重庆长安汽车股份有限公司 Dynamic gesture recognition method
CN113111844A (en) * 2021-04-28 2021-07-13 中德(珠海)人工智能研究院有限公司 Operation posture evaluation method and device, local terminal and readable storage medium
CN114384926A (en) * 2020-10-19 2022-04-22 上海航空电器有限公司 Unmanned aerial vehicle ground guiding system and method
CN114489462A (en) * 2021-12-31 2022-05-13 广州视声智能股份有限公司 Wireless key switch panel and control method thereof
CN114510142A (en) * 2020-10-29 2022-05-17 舜宇光学(浙江)研究院有限公司 Gesture recognition method based on two-dimensional image, system thereof and electronic equipment
WO2022266853A1 (en) * 2021-06-22 2022-12-29 Intel Corporation Methods and devices for gesture recognition

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679512A (en) * 2017-10-20 2018-02-09 济南大学 A kind of dynamic gesture identification method based on gesture key point
CN108537147A (en) * 2018-03-22 2018-09-14 东华大学 A kind of gesture identification method based on deep learning
CN110688965A (en) * 2019-09-30 2020-01-14 北京航空航天大学青岛研究院 IPT (inductive power transfer) simulation training gesture recognition method based on binocular vision

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679512A (en) * 2017-10-20 2018-02-09 济南大学 A kind of dynamic gesture identification method based on gesture key point
CN108537147A (en) * 2018-03-22 2018-09-14 东华大学 A kind of gesture identification method based on deep learning
CN110688965A (en) * 2019-09-30 2020-01-14 北京航空航天大学青岛研究院 IPT (inductive power transfer) simulation training gesture recognition method based on binocular vision

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114384926A (en) * 2020-10-19 2022-04-22 上海航空电器有限公司 Unmanned aerial vehicle ground guiding system and method
CN114510142A (en) * 2020-10-29 2022-05-17 舜宇光学(浙江)研究院有限公司 Gesture recognition method based on two-dimensional image, system thereof and electronic equipment
CN114510142B (en) * 2020-10-29 2023-11-10 舜宇光学(浙江)研究院有限公司 Gesture recognition method based on two-dimensional image, gesture recognition system based on two-dimensional image and electronic equipment
CN112926454A (en) * 2021-02-26 2021-06-08 重庆长安汽车股份有限公司 Dynamic gesture recognition method
CN112926454B (en) * 2021-02-26 2023-01-06 重庆长安汽车股份有限公司 Dynamic gesture recognition method
CN113111844A (en) * 2021-04-28 2021-07-13 中德(珠海)人工智能研究院有限公司 Operation posture evaluation method and device, local terminal and readable storage medium
CN113111844B (en) * 2021-04-28 2022-02-15 中德(珠海)人工智能研究院有限公司 Operation posture evaluation method and device, local terminal and readable storage medium
WO2022266853A1 (en) * 2021-06-22 2022-12-29 Intel Corporation Methods and devices for gesture recognition
CN114489462A (en) * 2021-12-31 2022-05-13 广州视声智能股份有限公司 Wireless key switch panel and control method thereof

Similar Documents

Publication Publication Date Title
CN111753764A (en) Gesture recognition method of edge terminal based on attitude estimation
JP6581068B2 (en) Image processing apparatus, image processing method, program, operation control system, and vehicle
CN109344793B (en) Method, apparatus, device and computer readable storage medium for recognizing handwriting in the air
KR100580626B1 (en) Face detection method and apparatus and security system employing the same
CN112149636B (en) Method, device, electronic equipment and storage medium for detecting target object
US8254630B2 (en) Subject extracting method and device by eliminating a background region using binary masks
US20150339536A1 (en) Collaborative text detection and recognition
US20160314368A1 (en) System and a method for the detection of multiple number-plates of moving cars in a series of 2-d images
CN109727275B (en) Object detection method, device, system and computer readable storage medium
JP2007072620A (en) Image recognition device and its method
Tian et al. Scene Text Detection in Video by Learning Locally and Globally.
KR101330636B1 (en) Face view determining apparatus and method and face detection apparatus and method employing the same
CN114550177A (en) Image processing method, text recognition method and text recognition device
CN103198311A (en) Method and apparatus for recognizing a character based on a photographed image
CN112381104A (en) Image identification method and device, computer equipment and storage medium
CN111079613B (en) Gesture recognition method and device, electronic equipment and storage medium
CN111767853A (en) Lane line detection method and device
CN111738263A (en) Target detection method and device, electronic equipment and storage medium
JP5656768B2 (en) Image feature extraction device and program thereof
US11709914B2 (en) Face recognition method, terminal device using the same, and computer readable storage medium
CN103336579A (en) Input method of wearable device and wearable device
CN110674671A (en) System, method and computer readable medium for capturing stroke ink
CN109934185B (en) Data processing method and device, medium and computing equipment
EP4332910A1 (en) Behavior detection method, electronic device, and computer readable storage medium
CN115953744A (en) Vehicle identification tracking method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201009