CN112183424A - Real-time hand tracking method and system based on video - Google Patents

Real-time hand tracking method and system based on video

Info

Publication number
CN112183424A
Authority
CN
China
Prior art keywords
image
palm
model
training
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011074015.4A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed (不公告发明人)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Huayan Mutual Entertainment Technology Co ltd
Original Assignee
Beijing Huayan Mutual Entertainment Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Huayan Mutual Entertainment Technology Co ltd filed Critical Beijing Huayan Mutual Entertainment Technology Co ltd
Priority to CN202011074015.4A
Publication of CN112183424A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107 Static hand or arm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a video-based real-time hand tracking method and system. The method comprises the following steps: inputting a video frame image; performing real-time palm detection on the video frame image through a palm detection model, and cropping the detected palm to obtain a palm image; performing finger keypoint localization on the palm image through a hand identification model to obtain the coordinate position of each finger keypoint on the palm image and mark those positions; and performing gesture recognition on the gesture image marked by the hand identification model through a gesture recognition model to obtain a real-time hand gesture recognition result. The invention thereby achieves real-time, effective tracking of the hand.

Description

Real-time hand tracking method and system based on video
Technical Field
The invention relates to the technical field of image recognition and animation, in particular to a real-time hand tracking method and system based on video.
Background
Today, millions of people communicate using sign language, yet progress in capturing complex gestures and translating them into spoken language has so far been limited. Hand motion is usually fast and subtle, the hand is often occluded during movement, and hand images typically lack high contrast against the background, so quickly identifying the hand in a video frame image is not easy. Even when multiple cameras capture the hand from several angles, or depth-sensing devices sense the hand region, dynamically tracking the hand in real time remains difficult.
Disclosure of Invention
The present invention provides a video-based real-time hand tracking method and system to solve the above problems.
To achieve this purpose, the invention adopts the following technical scheme:
A video-based real-time hand tracking method is provided, comprising the following steps:
inputting a video frame image;
performing real-time palm detection on the video frame image through a palm detection model, and cropping the detected palm to obtain a palm image;
performing finger keypoint localization on the palm image through a hand identification model to obtain the coordinate position of each finger keypoint on the palm image and mark those positions;
and performing gesture recognition on the gesture image marked by the hand identification model through a gesture recognition model to obtain a real-time hand gesture recognition result.
Preferably, the method of training the palm detection model comprises the following steps:
selecting 30000 video frame images containing a palm as training samples for the palm detection model;
inputting the video frame images serving as training samples into a deep learning network and training an initial palm detection model;
performing palm detection on the video frame images through the initial palm detection model and outputting the detection results;
manually checking the detection results output by the initial palm detection model to evaluate the model performance, then adjusting the training parameters of the deep learning network according to the evaluation result;
and iteratively updating the initial palm detection model with the adjusted training parameters, again using the video frame images as training samples, to finally obtain the palm detection model.
Preferably, the deep learning network is a neural network based on the RPN (Region Proposal Network) structure.
Preferably, the size of the video frame image is 256 × 256.
Preferably, the deep learning network comprises 5 sequentially cascaded convolutional layers: the first convolutional layer of the deep learning network extracts image features from the 256 × 256 video frame image and outputs a 128 × 128 feature map; the second convolutional layer extracts image features from the 128 × 128 feature map and outputs a 64 × 64 feature map; the third convolutional layer extracts image features from the 64 × 64 feature map and outputs a 32 × 32 feature map; the fourth convolutional layer extracts image features from the 32 × 32 feature map and outputs a 16 × 16 feature map; and the fifth convolutional layer extracts image features from the 16 × 16 feature map and outputs an 8 × 8 feature map.
Preferably, the finger keypoints comprise 21 finger keypoints with 3D coordinates that can characterize the shape of the palm.
Preferably, the method by which the gesture recognition model recognizes gestures comprises the following steps:
cropping the gesture image from the palm image at a preset size, based on the marked finger keypoints;
and matching the gesture image against the classification template images stored in an image database, each classification template image being associated with a gesture type; if the image matching succeeds, outputting the gesture type associated with the matched classification template image as the gesture recognition result for the gesture image.
The invention also provides a video-based real-time hand tracking system capable of implementing the above real-time hand tracking method, the system comprising:
an image input module for inputting video frame images;
a palm detection module, connected to the image input module, for performing real-time palm detection on the video frame image through a palm detection model and cropping the detected palm to obtain a palm image;
a hand identification module, connected to the palm detection module, for performing finger keypoint localization on the palm image through a hand identification model to obtain the coordinate position of each finger keypoint on the palm image and mark those positions;
and a gesture recognition module, connected to the hand identification module, for performing gesture recognition on the gesture image marked by the hand identification module through a gesture recognition model to obtain a real-time hand gesture recognition result.
Preferably, the real-time hand tracking system further comprises:
palm detection model training module connects palm detection module is used for the training palm detection model, specifically include in the palm detection model training module:
the sample marking unit is used for providing marking personnel with palm positions identified in the video frame images;
the sample selecting unit is connected with the sample labeling unit and used for selecting an image sample used for training the palm detection model from each video frame image labeled by the palm position;
the palm detection initial model training unit is connected with the sample selection unit and used for inputting each video frame image serving as a training sample into a deep learning network and training to form a palm detection initial model;
the model performance verification unit is connected with the palm detection initial model training unit and used for verifying the model performance of the palm detection initial model;
the model parameter adjusting unit is connected with the model performance verifying unit and used for providing a model training person with an adjusting model training parameter according to a model performance verifying result;
and the model iteration updating unit is respectively connected with the sample selecting unit and the model parameter adjusting unit and is used for performing iteration updating on the palm detection initial model according to the adjusted model training parameters and by taking each selected video frame image as a training sample, and finally training to form the palm detection model.
Preferably, the deep learning network comprises 5 sequentially cascaded convolutional layers: the first convolutional layer of the deep learning network extracts image features from the 256 × 256 video frame image and outputs a 128 × 128 feature map; the second convolutional layer extracts image features from the 128 × 128 feature map and outputs a 64 × 64 feature map; the third convolutional layer extracts image features from the 64 × 64 feature map and outputs a 32 × 32 feature map; the fourth convolutional layer extracts image features from the 32 × 32 feature map and outputs a 16 × 16 feature map; and the fifth convolutional layer extracts image features from the 16 × 16 feature map and outputs an 8 × 8 feature map.
The invention adopts deep learning technology: it first detects the most distinctive and reliable part of the hand, the palm, through a palm detection model; it then detects the finger keypoints of the detected palm to obtain hand posture information; and it finally recognizes the hand posture through gesture matching, thereby achieving real-time, effective tracking of the hand.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the embodiments are briefly described below. Obviously, the drawings described below show only some embodiments of the invention, and a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a diagram of the method steps of a video-based real-time hand tracking method according to an embodiment of the invention;
FIG. 2 is a diagram of method steps for training the palm detection model;
FIG. 3 is a diagram of method steps by which the gesture recognition model recognizes gestures;
FIG. 4 is a schematic diagram of a video-based real-time hand tracking system according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the internal structure of the palm detection model training module in the real-time hand tracking system;
fig. 6 is a network architecture diagram of the deep learning network.
Detailed Description
The technical scheme of the invention is further explained below through specific embodiments in combination with the accompanying drawings.
The drawings are for the purpose of illustration only, are shown in schematic rather than actual form, and are not to be construed as limiting this patent; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced and do not represent the size of an actual product; and it will be understood by those skilled in the art that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numerals in the drawings of the embodiments of the present invention denote the same or similar components. In the description of the present invention, it should be understood that terms such as "upper", "lower", "left", "right", "inner" and "outer" indicate orientations or positional relationships based on those shown in the drawings, and are used only for convenience and simplification of description; they do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation. Terms describing positional relationships in the drawings are therefore illustrative only, are not to be construed as limitations of this patent, and their specific meanings can be understood by those skilled in the art according to the specific situation.
In the description of the present invention, unless otherwise explicitly specified or limited, the term "connected" and the like, where it indicates a connection relationship between components, is to be understood broadly: for example, as a fixed, detachable or integral connection; as a mechanical or electrical connection; as a direct connection or an indirect connection through intervening media; or as a connection through one or more other components or an interactive relationship between components. The specific meanings of the above terms in the present invention can be understood by those skilled in the art in specific cases.
As shown in fig. 1, the video-based real-time hand tracking method according to an embodiment of the present invention comprises the following steps:
step S1, inputting a video frame image;
step S2, performing real-time palm detection on the video frame image through a palm detection model, and cropping the detected palm to obtain a palm image;
step S3, performing finger keypoint localization on the palm image through a hand identification model to obtain the coordinate position of each finger keypoint on the palm image and mark those positions;
and step S4, performing gesture recognition on the gesture image marked by the hand identification model through a gesture recognition model to obtain a real-time hand gesture recognition result.
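The following minimal Python sketch illustrates how the per-frame pipeline of steps S1 to S4 fits together. The three model functions are stand-ins for the patent's palm detection model, hand identification model and gesture recognition model; their names, signatures and dummy return values are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def detect_palm(frame: np.ndarray):
    """Stand-in for the palm detection model: returns a palm bounding box
    (x, y, w, h), or None if no palm is found in the frame."""
    return (64, 64, 128, 128)  # dummy box for illustration

def locate_keypoints(palm_image: np.ndarray) -> np.ndarray:
    """Stand-in for the hand identification model: returns the 21 (x, y, z)
    finger keypoint coordinates on the palm image."""
    return np.zeros((21, 3), dtype=np.float32)  # dummy keypoints

def recognize_gesture(gesture_image: np.ndarray) -> str:
    """Stand-in for the gesture recognition model: matches the gesture image
    against classification templates and returns a gesture type."""
    return "open_palm"  # dummy result

def track_frame(frame: np.ndarray):
    box = detect_palm(frame)                  # step S2: real-time palm detection
    if box is None:
        return None
    x, y, w, h = box
    palm_image = frame[y:y + h, x:x + w]      # step S2: crop the palm image
    keypoints = locate_keypoints(palm_image)  # step S3: 21 finger keypoints
    gesture = recognize_gesture(palm_image)   # step S4: gesture recognition
    return keypoints, gesture

if __name__ == "__main__":
    dummy_frame = np.zeros((256, 256, 3), dtype=np.uint8)  # step S1: input frame
    print(track_frame(dummy_frame))
```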
As shown in fig. 2, in step S2 the palm detection model is trained through the following steps:
step S21, selecting 30000 video frame images containing a palm as training samples for the palm detection model;
step S22, inputting the video frame images serving as training samples into a deep learning network and training an initial palm detection model;
step S23, performing palm detection on the video frame images through the initial palm detection model and outputting the detection results;
step S24, manually checking the detection results output by the initial palm detection model to evaluate the model performance, then adjusting the training parameters of the deep learning network according to the evaluation result;
and step S25, iteratively updating the initial palm detection model with the adjusted training parameters, again using the video frame images as training samples, to finally obtain the palm detection model.
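As a rough illustration of how steps S22 and S25 could be organized in code, the sketch below assumes PyTorch and a generic detection loss; the loss function, optimizer, learning rate and epoch count are placeholders rather than values specified by the patent. Steps S23 and S24 (detection, manual verification and parameter adjustment) happen between calls to this function.

```python
import torch
from torch import nn, optim

def train_palm_detector(model: nn.Module,
                        loader,             # yields (frames, labels) batches
                        lr: float = 1e-3,   # adjusted after each manual review (step S24)
                        epochs: int = 10) -> nn.Module:
    """Steps S22/S25: fit, or iteratively refine, the palm detection model."""
    criterion = nn.BCEWithLogitsLoss()      # stand-in for the real detection loss
    optimizer = optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for frames, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(frames), labels)
            loss.backward()
            optimizer.step()
    return model

# Outer loop of steps S23-S25: run detection on the 30000 training frames,
# review the detection results by hand, adjust `lr` or other hyperparameters,
# and call train_palm_detector again until the model performance is acceptable.
```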
The deep learning network adopted in this embodiment is an improvement on the RPN (Region Proposal Network) neural network structure. Specifically, as shown in fig. 6, the network comprises 5 sequentially cascaded convolutional layers: the first convolutional layer extracts image features from the 256 × 256 video frame image and outputs a 128 × 128 feature map; the second convolutional layer extracts image features from the 128 × 128 feature map and outputs a 64 × 64 feature map; the third convolutional layer outputs a 32 × 32 feature map; the fourth outputs a 16 × 16 feature map; and the fifth outputs an 8 × 8 feature map. Repeated experiments show that a 64 × 64 feature map is sufficient to represent a palm with all five fingers spread, while an 8 × 8 feature map is sufficient to represent a clenched fist. The palm detection model therefore outputs the detected palm to the hand identification model at an image size of 64 × 64, 32 × 32, 16 × 16 or 8 × 8, according to how far the five fingers are spread, for further finger keypoint localization.
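A minimal PyTorch sketch of this five-layer backbone follows; the patent specifies only the feature-map resolutions, so the channel widths, kernel size, stride and activation are assumptions chosen to halve the resolution at each stage.

```python
import torch
from torch import nn

class PalmBackbone(nn.Module):
    """Five cascaded convolutional layers, 256 -> 128 -> 64 -> 32 -> 16 -> 8."""
    def __init__(self):
        super().__init__()
        channels = [3, 16, 32, 64, 128, 256]  # illustrative channel widths
        self.layers = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1),
                nn.ReLU(inplace=True),
            )
            for c_in, c_out in zip(channels[:-1], channels[1:])
        )

    def forward(self, x):
        feature_maps = []           # one entry per convolutional stage
        for layer in self.layers:
            x = layer(x)
            feature_maps.append(x)  # 128, 64, 32, 16, 8 resolutions in order
        return feature_maps

if __name__ == "__main__":
    maps = PalmBackbone()(torch.zeros(1, 3, 256, 256))
    print([m.shape[-1] for m in maps])  # -> [128, 64, 32, 16, 8]
```

Returning every intermediate feature map is what allows the model to pass the 64 × 64 map onward for a spread palm, or the 8 × 8 map for a clenched fist, as described above.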
The finger keypoints detected by the invention comprise 21 finger keypoints with 3D coordinates, which fully characterize the shape of the palm. These 21 palm-shape keypoints are established in prior-art research, so their specific positions on the palm are not explained here. After the 21 finger keypoints are detected and located, the position of the gesture image within the video frame image can be derived from their coordinate positions, and the gesture image is then cropped and output to the gesture recognition model.
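A simple way to realize this cropping is to take the 2D bounding box of the 21 keypoint coordinates, as in the sketch below; the padding margin is an illustrative assumption, not a value from the patent.

```python
import numpy as np

def crop_gesture_image(frame: np.ndarray, keypoints: np.ndarray,
                       pad: int = 16) -> np.ndarray:
    """keypoints: (21, 3) array of (x, y, z); the crop uses only x and y,
    expanded by `pad` pixels and clipped to the frame bounds."""
    xs, ys = keypoints[:, 0], keypoints[:, 1]
    h, w = frame.shape[:2]
    x0 = max(int(xs.min()) - pad, 0)
    y0 = max(int(ys.min()) - pad, 0)
    x1 = min(int(xs.max()) + pad, w)
    y1 = min(int(ys.max()) + pad, h)
    return frame[y0:y1, x0:x1]
```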
As shown in fig. 3, the gesture recognition model recognizes a gesture through the following steps:
step S41, cropping the gesture image from the palm image at a preset size, based on the marked finger keypoints;
and step S42, matching the gesture image against the classification template images stored in an image database, each classification template image being associated with a gesture type; if the image matching succeeds, outputting the gesture type associated with the matched classification template image as the gesture recognition result for the gesture image.
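The patent does not name a particular matching algorithm, so the sketch below uses OpenCV's normalized cross-correlation (cv2.matchTemplate) with a score threshold as one plausible realization of step S42; the threshold value is an assumption.

```python
from typing import Dict, Optional

import cv2
import numpy as np

def match_gesture(gesture_image: np.ndarray,
                  templates: Dict[str, np.ndarray],
                  threshold: float = 0.8) -> Optional[str]:
    """Step S42: `templates` maps each gesture type to its classification
    template image; returns the best-matching gesture type, or None."""
    best_type, best_score = None, threshold
    for gesture_type, template in templates.items():
        # Resize the gesture image to the template's preset size (step S41).
        resized = cv2.resize(gesture_image, template.shape[1::-1])
        score = cv2.matchTemplate(resized, template, cv2.TM_CCOEFF_NORMED).max()
        if score > best_score:
            best_type, best_score = gesture_type, score
    return best_type  # None if no template matches well enough
```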
The present invention further provides a video-based real-time hand tracking system capable of implementing the above real-time hand tracking method. As shown in fig. 4, the system comprises:
an image input module 1 for inputting video frame images;
a palm detection module 2, connected to the image input module 1, for performing real-time palm detection on the video frame image through a palm detection model and cropping the detected palm to obtain a palm image;
a hand identification module 3, connected to the palm detection module 2, for performing finger keypoint localization on the palm image through a hand identification model to obtain the coordinate position of each finger keypoint on the palm image and mark those positions;
and a gesture recognition module 4, connected to the hand identification module 3, for performing gesture recognition on the gesture image marked by the hand identification module through a gesture recognition model to obtain a real-time hand gesture recognition result.
To enable training of the palm detection model, the real-time hand tracking system preferably further comprises:
a palm detection model training module, connected to the palm detection module, for training the palm detection model; as shown in fig. 5, the palm detection model training module specifically comprises:
a sample annotation unit 51 for annotators to mark the palm positions identified in the video frame images;
a sample selection unit 52, connected to the sample annotation unit 51, for selecting image samples for training the palm detection model from the video frame images annotated with palm positions;
an initial palm detection model training unit 53, connected to the sample selection unit 52, for inputting the video frame images serving as training samples into a deep learning network and training an initial palm detection model;
a model performance verification unit 54, connected to the initial palm detection model training unit 53, for verifying the model performance of the initial palm detection model;
a model parameter adjustment unit 55, connected to the model performance verification unit 54, for model trainers to adjust the model training parameters according to the model performance verification results;
and a model iteration update unit 56, connected to the sample selection unit 52 and the model parameter adjustment unit 55 respectively, for iteratively updating the initial palm detection model with the adjusted training parameters, using the selected video frame images as training samples, to finally obtain the palm detection model.
As a preferred scheme, the deep learning network adopted in this embodiment is an improvement on the RPN neural network structure. Specifically, the network comprises 5 sequentially cascaded convolutional layers: the first convolutional layer extracts image features from the 256 × 256 video frame image and outputs a 128 × 128 feature map; the second convolutional layer extracts image features from the 128 × 128 feature map and outputs a 64 × 64 feature map; the third convolutional layer outputs a 32 × 32 feature map; the fourth outputs a 16 × 16 feature map; and the fifth outputs an 8 × 8 feature map.
In conclusion, the invention achieves real-time tracking and detection of the hand.
It should be understood that the above-described embodiments are merely preferred embodiments of the invention, illustrating the technical principles applied. Those skilled in the art will understand that various modifications, equivalents and changes can be made to the present invention; such variations fall within the scope of the invention as long as they do not depart from its spirit. In addition, certain terms used in the specification and claims of the present application are not limiting but are used merely for convenience of description.

Claims (10)

1. A video-based real-time hand tracking method, comprising:
inputting a video frame image;
performing real-time palm detection on the video frame image through a palm detection model, and cropping the detected palm to obtain a palm image;
performing finger keypoint localization on the palm image through a hand identification model to obtain the coordinate position of each finger keypoint on the palm image and mark those positions;
and performing gesture recognition on the gesture image marked by the hand identification model through a gesture recognition model to obtain a real-time hand gesture recognition result.
2. The video-based real-time hand tracking method of claim 1, wherein the method of training the palm detection model comprises the following steps:
selecting 30000 video frame images containing a palm as training samples for the palm detection model;
inputting the video frame images serving as training samples into a deep learning network and training an initial palm detection model;
performing palm detection on the video frame images through the initial palm detection model and outputting the detection results;
manually checking the detection results output by the initial palm detection model to evaluate the model performance, then adjusting the training parameters of the deep learning network according to the evaluation result;
and iteratively updating the initial palm detection model with the adjusted training parameters, again using the video frame images as training samples, to finally obtain the palm detection model.
3. The video-based real-time hand tracking method of claim 2, wherein the deep learning network is a neural network based on the RPN network structure.
4. The video-based real-time hand tracking method of claim 3, wherein the video frame image has a size of 256 x 256.
5. The video-based real-time hand tracking method of claim 4, wherein the deep learning network comprises 5 sequentially cascaded convolutional layers: the first convolutional layer of the deep learning network extracts image features from the 256 × 256 video frame image and outputs a 128 × 128 feature map; the second convolutional layer extracts image features from the 128 × 128 feature map and outputs a 64 × 64 feature map; the third convolutional layer extracts image features from the 64 × 64 feature map and outputs a 32 × 32 feature map; the fourth convolutional layer extracts image features from the 32 × 32 feature map and outputs a 16 × 16 feature map; and the fifth convolutional layer extracts image features from the 16 × 16 feature map and outputs an 8 × 8 feature map.
6. The video-based real-time hand tracking method of claim 1, wherein the finger keypoints comprise 21 finger keypoints with 3D coordinates that can characterize the shape of the palm.
7. The video-based real-time hand tracking method of claim 1, wherein the method by which the gesture recognition model recognizes gestures comprises the following steps:
cropping the gesture image from the palm image at a preset size, based on the marked finger keypoints;
and matching the gesture image against the classification template images stored in an image database, each classification template image being associated with a gesture type; if the image matching succeeds, outputting the gesture type associated with the matched classification template image as the gesture recognition result for the gesture image.
8. A video-based real-time hand tracking system that implements the real-time hand tracking method of any one of claims 1-7, comprising:
an image input module for inputting video frame images;
a palm detection module, connected to the image input module, for performing real-time palm detection on the video frame image through a palm detection model and cropping the detected palm to obtain a palm image;
a hand identification module, connected to the palm detection module, for performing finger keypoint localization on the palm image through a hand identification model to obtain the coordinate position of each finger keypoint on the palm image and mark those positions;
and a gesture recognition module, connected to the hand identification module, for performing gesture recognition on the gesture image marked by the hand identification module through a gesture recognition model to obtain a real-time hand gesture recognition result.
9. The video-based real-time hand tracking system of claim 8, further comprising:
a palm detection model training module, connected to the palm detection module, for training the palm detection model, the palm detection model training module specifically comprising:
a sample annotation unit for annotators to mark the palm positions identified in the video frame images;
a sample selection unit, connected to the sample annotation unit, for selecting image samples for training the palm detection model from the video frame images annotated with palm positions;
an initial palm detection model training unit, connected to the sample selection unit, for inputting the video frame images serving as training samples into a deep learning network and training an initial palm detection model;
a model performance verification unit, connected to the initial palm detection model training unit, for verifying the model performance of the initial palm detection model;
a model parameter adjustment unit, connected to the model performance verification unit, for model trainers to adjust the model training parameters according to the model performance verification results;
and a model iteration update unit, connected to the sample selection unit and the model parameter adjustment unit respectively, for iteratively updating the initial palm detection model with the adjusted training parameters, using the selected video frame images as training samples, to finally obtain the palm detection model.
10. The video-based real-time hand tracking system of claim 9, wherein the deep learning network comprises 5 sequentially cascaded convolutional layers: the first convolutional layer of the deep learning network extracts image features from the 256 × 256 video frame image and outputs a 128 × 128 feature map; the second convolutional layer extracts image features from the 128 × 128 feature map and outputs a 64 × 64 feature map; the third convolutional layer extracts image features from the 64 × 64 feature map and outputs a 32 × 32 feature map; the fourth convolutional layer extracts image features from the 32 × 32 feature map and outputs a 16 × 16 feature map; and the fifth convolutional layer extracts image features from the 16 × 16 feature map and outputs an 8 × 8 feature map.
CN202011074015.4A 2020-10-12 2020-10-12 Real-time hand tracking method and system based on video Pending CN112183424A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011074015.4A CN112183424A (en) 2020-10-12 2020-10-12 Real-time hand tracking method and system based on video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011074015.4A CN112183424A (en) 2020-10-12 2020-10-12 Real-time hand tracking method and system based on video

Publications (1)

Publication Number Publication Date
CN112183424A true CN112183424A (en) 2021-01-05

Family

ID=73948657

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011074015.4A Pending CN112183424A (en) 2020-10-12 2020-10-12 Real-time hand tracking method and system based on video

Country Status (1)

Country Link
CN (1) CN112183424A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113589928A (en) * 2021-07-27 2021-11-02 东莞理工学院 Gesture recognition method for smart television
US20220076433A1 (en) * 2019-12-10 2022-03-10 Google Llc Scalable Real-Time Hand Tracking
CN114581535A (en) * 2022-03-03 2022-06-03 北京深光科技有限公司 Method, device, storage medium and equipment for marking key points of user bones in image
WO2022166243A1 (en) * 2021-02-07 2022-08-11 青岛小鸟看看科技有限公司 Method, apparatus and system for detecting and identifying pinching gesture

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107871047A (en) * 2017-11-21 2018-04-03 中国人民解放军战略支援部队航天工程大学 A kind of complex spatial system safety management parallel computing method
CN108171112A (en) * 2017-12-01 2018-06-15 西安电子科技大学 Vehicle identification and tracking based on convolutional neural networks
CN109255375A (en) * 2018-08-29 2019-01-22 长春博立电子科技有限公司 Panoramic picture method for checking object based on deep learning
CN109376674A (en) * 2018-10-31 2019-02-22 北京小米移动软件有限公司 Method for detecting human face, device and storage medium
CN109635630A (en) * 2018-10-23 2019-04-16 百度在线网络技术(北京)有限公司 Hand joint point detecting method, device and storage medium
CN109918635A (en) * 2017-12-12 2019-06-21 中兴通讯股份有限公司 A kind of contract text risk checking method, device, equipment and storage medium
KR20190132885A (en) * 2018-05-21 2019-11-29 주식회사 케이티 Apparatus, method and computer program for detecting hand from video
CN110781668A (en) * 2019-10-24 2020-02-11 腾讯科技(深圳)有限公司 Text information type identification method and device
CN111104820A (en) * 2018-10-25 2020-05-05 中车株洲电力机车研究所有限公司 Gesture recognition method based on deep learning

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107871047A (en) * 2017-11-21 2018-04-03 中国人民解放军战略支援部队航天工程大学 A kind of complex spatial system safety management parallel computing method
CN108171112A (en) * 2017-12-01 2018-06-15 西安电子科技大学 Vehicle identification and tracking based on convolutional neural networks
CN109918635A (en) * 2017-12-12 2019-06-21 中兴通讯股份有限公司 A kind of contract text risk checking method, device, equipment and storage medium
KR20190132885A (en) * 2018-05-21 2019-11-29 주식회사 케이티 Apparatus, method and computer program for detecting hand from video
CN109255375A (en) * 2018-08-29 2019-01-22 长春博立电子科技有限公司 Panoramic picture method for checking object based on deep learning
CN109635630A (en) * 2018-10-23 2019-04-16 百度在线网络技术(北京)有限公司 Hand joint point detecting method, device and storage medium
CN111104820A (en) * 2018-10-25 2020-05-05 中车株洲电力机车研究所有限公司 Gesture recognition method based on deep learning
CN109376674A (en) * 2018-10-31 2019-02-22 北京小米移动软件有限公司 Method for detecting human face, device and storage medium
CN110781668A (en) * 2019-10-24 2020-02-11 腾讯科技(深圳)有限公司 Text information type identification method and device

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220076433A1 (en) * 2019-12-10 2022-03-10 Google Llc Scalable Real-Time Hand Tracking
US11783496B2 (en) * 2019-12-10 2023-10-10 Google Llc Scalable real-time hand tracking
WO2022166243A1 (en) * 2021-02-07 2022-08-11 青岛小鸟看看科技有限公司 Method, apparatus and system for detecting and identifying pinching gesture
US11776322B2 (en) 2021-02-07 2023-10-03 Qingdao Pico Technology Co., Ltd. Pinch gesture detection and recognition method, device and system
CN113589928A (en) * 2021-07-27 2021-11-02 东莞理工学院 Gesture recognition method for smart television
CN113589928B (en) * 2021-07-27 2023-11-24 东莞理工学院 Gesture recognition method for intelligent television
CN114581535A (en) * 2022-03-03 2022-06-03 北京深光科技有限公司 Method, device, storage medium and equipment for marking key points of user bones in image

Similar Documents

Publication Publication Date Title
CN112183424A (en) Real-time hand tracking method and system based on video
CN108399367B (en) Hand motion recognition method and device, computer equipment and readable storage medium
CN110135249B (en) Human behavior identification method based on time attention mechanism and LSTM (least Square TM)
CN107679446B (en) Human face posture detection method, device and storage medium
CN108898063B (en) Human body posture recognition device and method based on full convolution neural network
Raheja et al. Android based portable hand sign recognition system
CN111444488A (en) Identity authentication method based on dynamic gesture
CN111857334A (en) Human body gesture letter recognition method and device, computer equipment and storage medium
CN113420690A (en) Vein identification method, device and equipment based on region of interest and storage medium
Ben Jmaa et al. A new approach for hand gestures recognition based on depth map captured by rgb-d camera
Hu et al. Trajectory image based dynamic gesture recognition with convolutional neural networks
Ding et al. Designs of human–robot interaction using depth sensor-based hand gesture communication for smart material-handling robot operations
CN111160308B (en) Gesture recognition method, device, equipment and readable storage medium
CN112286360A (en) Method and apparatus for operating a mobile device
Cohen et al. Recognition of continuous sign language alphabet using leap motion controller
Zamora-Mora et al. Real-time hand detection using convolutional neural networks for costa rican sign language recognition
WO2020224127A1 (en) Video stream capturing method and apparatus, and storage medium
Kadhim et al. A multimodal biometric database and case study for face recognition based deep learning
Barhate et al. A Survey of fingertip character identification in open-air using Image Processing and HCI
CN113792569B (en) Object recognition method, device, electronic equipment and readable medium
Dhamanskar et al. Human computer interaction using hand gestures and voice
Karthik et al. Survey on Gestures Translation System for Hearing Impaired People in Emergency Situation using Deep Learning Approach
CN109508523A (en) A kind of social contact method based on recognition of face
Munir et al. Hand Gesture Recognition: A Review
Pradeep et al. Advancement Of Sign Language Recognition Through Technology Using Python And OpenCV

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210105