CN111860275A - Gesture recognition data acquisition system and method - Google Patents


Info

Publication number
CN111860275A
CN111860275A (application CN202010674342.7A)
Authority
CN
China
Prior art keywords
infrared
gesture recognition
tracking camera
coordinate
virtual reality
Prior art date
Legal status
Granted
Application number
CN202010674342.7A
Other languages
Chinese (zh)
Other versions
CN111860275B (en)
Inventor
Wu Tao (吴涛)
Zhou Fengyi (周锋宜)
Current Assignee
Qingdao Xiaoniao Kankan Technology Co Ltd
Original Assignee
Qingdao Xiaoniao Kankan Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Qingdao Xiaoniao Kankan Technology Co Ltd filed Critical Qingdao Xiaoniao Kankan Technology Co Ltd
Publication of CN111860275A publication Critical patent/CN111860275A/en
Application granted granted Critical
Publication of CN111860275B publication Critical patent/CN111860275B/en
Status: Active

Classifications

    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F3/014 Hand-worn input/output arrangements, e.g. data gloves
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G06N3/045 Neural networks; combinations of networks
    • G06N3/08 Neural network learning methods
    • G06V10/147 Image acquisition; details of sensors, e.g. sensor lenses
    • G06V10/95 Image or video understanding architectures structured as a network, e.g. client-server architectures
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Biology (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Vascular Medicine (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention provides a gesture recognition data acquisition system comprising an enclosed space, infrared tracking cameras, a glove whose surface carries infrared marker points M1, and a VR headset whose surface carries infrared marker points M2 and which is equipped with gesture recognition tracking cameras. The infrared tracking cameras and the VR headset are connected to a server client. The server client comprises a spatial positioning module, a fitting module, and a shifting module. The positioning module determines the positional relationship ΔP between the centroid of the marker points M1 and the centroid of the marker points M2. The fitting module performs curve fitting of the coordinate data P_cam (the glove's coordinates relative to the gesture recognition tracking camera) against ΔP to obtain a rotation matrix and a translation vector. The shifting module translates and rotates the marker points M1 into the coordinate system whose origin is the gesture recognition tracking camera. Manual labeling is thereby made unnecessary, and both the precision and the efficiency of data labeling are improved.

Description

Gesture recognition data acquisition system and method
Technical Field
The invention relates to the field of computer vision, in particular to a gesture recognition data acquisition system and method.
Background
To strengthen the immersion of virtual-real combination in VR/AR/MR and deliver a better experience, a human-computer interaction module is indispensable. In particular, high-precision real-time restoration of 3D hand gestures in a VR/AR/MR scene strongly influences the user's sense of immersion.
Gesture recognition is critical in the VR/AR/MR field, especially for lightweight interaction during scene experiences, so the requirements on the accuracy, latency, environmental compatibility, and stability of bare-hand tracking are high. In current VR/AR/MR gesture recognition, device manufacturers increasingly reuse the environment-capture cameras on the all-in-one headset to track and identify the user's hand. The mainstream hand-tracking approach adopts an AI-based algorithmic framework: a large amount of image training data must be collected, each image labeled, and a convolutional neural network trained on it; a high-precision, high-stability gesture recognition model is finally obtained through repeated training on large data sets.
When a large amount of image data is collected and labeled, the precision of the labeled position of each hand skeleton point on each image is critical. Current practice combines manual labeling with semi-supervised learning: a small amount of data is labeled by hand, a network model is trained on it, the trained model labels the remaining data, the labels are checked and corrected manually, and the model is retrained; this cycle is repeated until a high-precision network model is obtained. With this method, labeling precision depends heavily on human annotators. In particular, when some labeled points of a gesture are occluded from the camera's viewpoint, their image positions must be estimated by hand, so labeling precision is hard to guarantee and the training precision of the network model suffers.
Therefore, there is a need for a gesture recognition data acquisition system and method that does not require manual labeling, improves the precision of data labeling, and improves the efficiency of data labeling.
Disclosure of Invention
In view of the above problems, the present invention provides a gesture recognition data acquisition system to address the shortcomings of the prior-art workflow — manual labeling of a small data set, iterative model training, model-assisted labeling of the remainder, and manual supervision and correction of wrongly placed marker points — in which labeling precision depends heavily on human annotators, marker points occluded from the camera's viewpoint must have their image positions estimated by hand, labeling precision is therefore hard to guarantee, and the training precision of the network model is low.
The invention provides a gesture recognition data acquisition system comprising an enclosed space, infrared tracking cameras, a glove whose surface carries infrared marker points M1, and a VR headset whose surface carries infrared marker points M2, wherein:
the infrared marker points M1 are placed at positions corresponding to hand skeleton points;
the infrared tracking cameras and the VR headset are connected to a server client;
the infrared tracking cameras are mounted on the wall surfaces of the enclosed space and scan the marker points M1 and M2 to obtain the glove's position coordinates P_glove and the headset's position coordinates P_head within the enclosed space, and transmit P_glove and P_head to the server client;
at least two gesture recognition tracking cameras are arranged on the VR headset; the headset photographs the glove with these cameras to obtain the glove's coordinate data P_cam relative to the gesture recognition tracking camera, and transmits P_cam to the server client;
the server client comprises a spatial positioning module, a fitting module, and a shifting module, wherein:
the positioning module determines, from the position coordinates P_glove and P_head, the positional relationship ΔP between the centroid of the marker points M1 and the centroid of the marker points M2;
the fitting module takes the gesture recognition tracking camera as the coordinate origin and performs curve-fitting estimation of the coordinate data P_cam against the relationship ΔP to obtain a rotation matrix and a translation vector of the centroid of the marker points M2 relative to that origin;
the shifting module translates and rotates the marker points M1, according to the rotation matrix and translation vector, into the coordinate system whose origin is the gesture recognition tracking camera, so that the hand skeleton points are marked on the gesture pictures shot by that camera.
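The shifting step above is a rigid-body change of coordinates: each marker point is mapped by p′ = R·p + t. A minimal numpy sketch (function and variable names such as `shift_markers`, `R`, `t` are illustrative, not from the patent):

```python
import numpy as np

def shift_markers(markers_m1: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Apply p' = R @ p + t to each 3D marker point in an N x 3 array."""
    return markers_m1 @ R.T + t

# Example: a 90-degree rotation about the z-axis plus a unit translation along x.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 0.0, 0.0])
markers = np.array([[1.0, 0.0, 0.0]])
shifted = shift_markers(markers, R, t)  # [[1., 1., 0.]]
```

In the patent's pipeline, `R` and `t` would come from the fitting module, and `markers_m1` would hold the glove's marker positions before re-expression in the gesture camera's frame.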
Preferably, the number of the infrared tracking cameras is 40-50.
Preferably, the infrared tracking camera adopts a high-precision infrared tracking camera with a visual angle range of at least 55 degrees by 45 degrees, a frame rate of at least 180Hz, an exposure mode of Global Shutter and an image resolution of 1080P.
Preferably, the curve-fitting estimation is based on a least squares estimation algorithm.
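The patent only states that the fit is "based on a least squares estimation algorithm". One standard least-squares way to recover a rotation matrix and translation vector between corresponding 3D point sets is the SVD-based Kabsch/Procrustes solver sketched below; this is a common realization, not necessarily the inventors' exact implementation:

```python
import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares R, t minimizing sum ||R @ src_i + t - dst_i||^2 (Kabsch)."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)   # point-set centroids
    H = (src - c_src).T @ (dst - c_dst)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))              # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t

# Demo: recover a known 90-degree rotation about z plus a translation.
theta = np.pi / 2
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([1.0, 2.0, 3.0])
src = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
dst = src @ R_true.T + t_true
R_est, t_est = fit_rigid_transform(src, dst)
```

With noisy real marker data the same solver returns the least-squares optimal rigid transform rather than an exact one.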
Preferably, the infrared tracking camera is connected with the server client through a switch; the switch is used for transmitting the data collected by the infrared tracking camera to the server client in real time.
Preferably, the gesture recognition tracking camera has a field of view of at least 130° × 100°, a frame rate of at least 60 Hz, a Global Shutter exposure mode, and VGA image resolution.
Preferably, the centroid of the infrared marker point M1 is the centroid of the geometric figure formed by all infrared marker points M1;
the centroid of the infrared marker point M2 is the centroid of the geometric figure formed by all the infrared marker points M2.
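A simple way to compute such a centroid is the mean of the marker coordinates, i.e. the centroid of the point set formed by the markers (for symmetric layouts this coincides with the centroid of the geometric figure). The function name below is illustrative:

```python
import numpy as np

def marker_centroid(points):
    """Centroid of an N x 3 array of marker positions (mean of the points)."""
    return np.asarray(points, dtype=float).mean(axis=0)

# Four markers at the corners of a unit square in the z = 0 plane:
square = np.array([[0.0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]])
center = marker_centroid(square)  # [0.5, 0.5, 0.0]
```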
The position coordinates P_glove, the position coordinates P_head, and the positional relationship ΔP are expressed relative to the enclosed space, in a coordinate system whose origin is a designated infrared tracking camera.
The invention also provides a gesture recognition data acquisition method, which comprises the following steps:
scanning the infrared marker points M1 on a glove and the infrared marker points M2 on a VR headset with infrared tracking cameras, to acquire the glove's position coordinates P_glove and the headset's position coordinates P_head within an enclosed space;
The infrared mark point M1 corresponds to the position of a hand skeleton point;
determining, from the position coordinates P_glove and P_head, the positional relationship ΔP between the centroid of the marker points M1 and the centroid of the marker points M2;
photographing the glove with a gesture recognition tracking camera on the VR headset to acquire the glove's coordinate data P_cam relative to that camera; then, taking the gesture recognition tracking camera as the coordinate origin, performing curve-fitting estimation of P_cam against ΔP to obtain a rotation matrix and a translation vector of the centroid of the marker points M2 relative to the origin;
and translating and rotating the infrared marker points M1, according to the rotation matrix and translation vector, into the coordinate system whose origin is the gesture recognition tracking camera, so that the hand skeleton points are marked on the gesture pictures shot by that camera.
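Once the M1 points are expressed in the gesture camera's coordinate system, marking them on the gesture picture amounts to projecting them through the camera model onto the image plane. The patent does not detail this projection; the pinhole sketch below is a standard assumption, with hypothetical intrinsic parameters `fx`, `fy`, `cx`, `cy` and lens distortion ignored:

```python
import numpy as np

def project_to_image(points_cam, fx, fy, cx, cy):
    """Pinhole projection of 3D camera-frame points (N x 3, z > 0) to pixels."""
    p = np.asarray(points_cam, dtype=float)
    u = fx * p[:, 0] / p[:, 2] + cx   # horizontal pixel coordinate
    v = fy * p[:, 1] / p[:, 2] + cy   # vertical pixel coordinate
    return np.stack([u, v], axis=1)

# A point on the optical axis projects to the principal point (cx, cy):
pixels = project_to_image(np.array([[0.0, 0.0, 1.0]]), 500.0, 500.0, 320.0, 240.0)
```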
Preferably, while the glove is photographed by a gesture recognition tracking camera on the VR headset to acquire the glove's coordinate data P_cam, the camera captures at least 20 of the infrared marker points M1 on the glove.
According to the technical scheme above, the provided gesture recognition data acquisition system and method arrange infrared tracking cameras on five surfaces of an enclosed space and place infrared marker points M1 and M2 on a glove and a VR headset respectively. The infrared tracking cameras scan M1 and M2 to acquire the glove's position coordinates P_glove and the headset's position coordinates P_head within the enclosed space, from which the positional relationship ΔP between the centroid of M1 and the centroid of M2 is determined. A gesture recognition tracking camera 121 on the headset then photographs the glove to acquire the glove's coordinate data P_cam relative to that camera, and curve-fitting estimation of P_cam against ΔP yields a rotation matrix and a translation vector of the centroid of M2 relative to the origin. Finally, the marker points M1 are translated and rotated, according to the rotation matrix and translation vector, into the coordinate system whose origin is the gesture recognition tracking camera, so that the hand skeleton points are calibrated on the gesture photos shot by that camera, forming high-precision image training data. Manual involvement is reduced, labeling efficiency and precision are improved, the precision of the network model is raised accordingly, and the user's immersion in VR/AR/MR scenes is enhanced.
Drawings
Other objects and results of the present invention will become more apparent and more readily appreciated as the same becomes better understood by reference to the following specification taken in conjunction with the accompanying drawings. In the drawings:
FIG. 1 is a schematic diagram of a gesture recognition data acquisition system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an application of a gesture recognition data acquisition system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a glove in a gesture recognition data acquisition system according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a gesture recognition data acquisition method according to an embodiment of the invention.
Detailed Description
The traditional method labels a small portion of data manually, trains a network model on it, uses the trained model to identify and label the remaining data, then supervises and inspects the results manually, corrects erroneous marker positions identified by the model, and continues training the network model; this cycle is repeated until a high-precision network model is finally obtained.
In view of the above problems, the present invention provides a gesture recognition data collecting system, and the following describes in detail specific embodiments of the present invention with reference to the accompanying drawings.
For explaining the gesture recognition data acquisition system provided by the present invention, fig. 1 exemplarily shows a system structure of the gesture recognition data acquisition system according to the embodiment of the present invention, and fig. 2 exemplarily shows an application of the gesture recognition data acquisition system according to the embodiment of the present invention.
The following description of the exemplary embodiment(s) is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses. Techniques and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but are intended to be considered a part of the specification where appropriate.
As shown in fig. 1, the gesture recognition data collecting system 100 provided by the present invention includes a closed space 110 formed by wall surfaces and infrared tracking cameras 111. The walls may be actual walls or any supporting material capable of forming a closed space, and the ambient brightness inside the space 110 is adjustable. The infrared tracking cameras are arranged on the wall surfaces — four walls or five, the number not being particularly limited. To enhance the tracking effect, in this embodiment the closed space 110 carries infrared tracking cameras 111 on all five surfaces other than the floor; the space 110 is a sealed room of 3 m × 3 m, i.e., except for the floor, all five remaining surfaces carry cameras 111. The camera specification is not particularly limited; this embodiment uses a high-precision infrared tracking camera with a field of view of at least 55° × 45°, a frame rate of at least 180 Hz, a Global Shutter exposure mode, and an image resolution of 1080p, so that each infrared marker point is captured accurately and the position of the marked object is determined precisely.
In the gesture recognition data collection system 100 shown in figs. 1 and 2, the number of infrared tracking cameras 111 is 40 to 50; this embodiment uses 45. The cameras are installed throughout the enclosed space 110 at preset positions and angles, each camera 111 fixed at a definite angle in three-dimensional space, ensuring that under any condition at least one of the 45 cameras can scan the visible area of 3 × 3 × 2.5 m (length × width × height) inside the space 110. After installation, a coordinate system for the whole space is established: a fixed position in the 3 m × 3 m room is set as the coordinate origin of the enclosed space 110 — in practice, one camera of the infrared tracking system is arbitrarily designated as the origin — so that the relative position of every other infrared tracking camera with respect to the designated camera can be acquired.
As shown in fig. 3, in the gesture recognition data collecting system 100 provided by the present invention, the glove's surface carries infrared marker points M1. The glove's size and specification are not particularly limited, provided that each marker point M1 can be placed accurately at a joint of the human hand. 26 marker points M1 are arranged on each glove according to a prescribed marker layout and finger-joint distribution, so that the infrared tracking cameras 111 can accurately capture all 26 points.
As shown in figs. 1, 2, and 3, the gesture recognition data acquisition system 100 includes a glove with infrared marker points M1 on its surface. The glove's size can be adjusted to fit different hands, and the marker points M1 are placed at positions corresponding to hand skeleton points, ensuring that hand and glove fit as tightly as possible when worn so that the hand's skeleton-point distribution is obtained accurately. A single worker wearing the glove then only needs to perform several preset hand motions inside the enclosed space 110 to produce image training data bearing skeleton-point labels, with which a convolutional neural network model can be trained, enhancing the immersion of VR/AR/MR virtual-real combination.
As shown in figs. 1 and 2, the gesture recognition data collecting system 100 includes a VR headset 120 with infrared marker points M2 on its surface. The number of points M2 is not specifically limited; in this embodiment it is ensured that the infrared tracking cameras 111 can capture at least 4 marker points M2 on the headset 120, so that the cameras 111 can accurately determine the headset's position. At least two gesture recognition tracking cameras 121 are arranged on the headset 120; their model and specification are not specifically limited, and this embodiment uses cameras with a field of view of at least 130° × 100°, a frame rate of at least 60 Hz, a Global Shutter exposure mode, and VGA image resolution, to collect hand image data accurately and prevent image distortion.
In addition, in the gesture recognition data collecting system 100 shown in figs. 1 and 2, the mounting positions of the infrared tracking cameras and the gesture recognition tracking cameras are not particularly limited. In this embodiment both kinds of camera are calibrated — i.e., numbered cameras with known position information — using Zhang Zhengyou's (Zhang's) calibration method.
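Zhang's calibration method starts from homographies between a planar calibration target (e.g. a checkerboard) and its images. A minimal DLT homography estimate — one building block of that method, not the full calibration — can be sketched in plain numpy; complete intrinsics-plus-distortion calibration is what libraries such as OpenCV implement:

```python
import numpy as np

def estimate_homography(src, dst):
    """DLT: find 3x3 H with dst ~ H @ src for >= 4 2D point correspondences."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on vec(H).
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)         # null-space vector = vec(H), up to scale
    return H / H[2, 2]               # normalize so H[2, 2] == 1
```

From several such homographies of the planar target in different poses, Zhang's method then solves for the camera intrinsics.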
In the gesture recognition data acquisition system 100 shown in figs. 1 and 2, the infrared tracking cameras 111 and the VR headset 120 are connected to the server client 130; the specific connection is not particularly limited. In this embodiment the headset 120 is wired to the server client 130, and the cameras 111 connect to it through a switch, which transmits the data collected by the cameras 111 to the server client 130 in real time so that the server client obtains the cameras' image data promptly.
Specifically, in the gesture recognition data acquisition system 100 shown in figs. 1 and 2, the infrared tracking cameras 111 scan the infrared marker points M1 and M2 to obtain the glove's position coordinates P_glove and the headset 120's position coordinates P_head within the enclosed space 110, and transmit P_glove and P_head to the server client.
In the gesture recognition data acquisition system 100 shown in figs. 1 and 2, at least two gesture recognition tracking cameras are arranged on the VR headset 120. The headset photographs the glove with the gesture recognition tracking camera 121 to acquire the glove's coordinate data P_cam relative to that camera — that is, the glove data within the headset 120's view; equivalently, P_cam is derived from the image data of the glove captured by camera 121. The coordinate data P_cam is then transmitted to the server client 130.
In the gesture recognition data acquisition system 100 shown in figs. 1 and 2, the server client 130 comprises a spatial positioning module 131, a fitting module 132, and a shifting module 133. The spatial positioning module 131 determines, from the position coordinates P_glove and P_head, the positional relationship ΔP between the centroid of the marker points M1 and the centroid of the marker points M2. The centroid of the points M1 is the centroid of the geometric figure formed by all points M1, and likewise for M2 (centroid in the conventional sense). The coordinates P_glove and P_head and the relationship ΔP locate the glove and the headset 120 relative to the enclosed space, in a coordinate system whose origin is an infrared tracking camera 111; in other words, the spatial positioning module 131 obtains the relative positions of the glove and the headset 120 within the whole enclosed space 110 from the positions of the marker points M1 and M2.
In the gesture recognition data acquisition system 100 shown in figs. 1 and 2, the fitting module 132 takes the gesture recognition tracking camera 121 as the coordinate origin and performs a curve-fitting estimation on the coordinate data Pc and the positional relationship T12 to obtain the rotation matrix and translation vector of the centroid of the marker points M2 relative to that origin. The curve-fitting estimation is based on a least-squares estimation algorithm: the relationship between the centroid of the marker points M2 on the headset 120 and the camera 121 is obtained by least-squares fitting of the glove image information captured by the gesture recognition tracking camera against the relative position information of the headset 120 and the glove captured by the infrared tracking camera 111.
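The patent does not spell out the least-squares procedure; a standard way to recover a rotation matrix and translation vector from corresponding point sets is the rigid-alignment (Kabsch/orthogonal Procrustes) solution via SVD, which minimizes Σ‖R·aᵢ + t − bᵢ‖². The sketch below illustrates that assumed technique with synthetic data, not the patent's actual implementation:

```python
import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src[i] + t ~ dst[i]
    (Kabsch algorithm via SVD)."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)           # cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cd - R @ cs
    return R, t

# Illustrative check: recover a known rotation about z and a translation.
theta = np.pi / 6
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.1, -0.2, 0.05])
src = np.random.default_rng(0).normal(size=(20, 3))
dst = src @ R_true.T + t_true
R, t = fit_rigid_transform(src, dst)
```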
In the gesture recognition data acquisition system 100 shown in figs. 1 and 2, the shifting module 133 translates and rotates the marker points M1 into the coordinate system whose origin is the gesture recognition tracking camera 121, according to the rotation matrix and the translation vector, so as to mark the hand skeleton points on the gesture pictures shot by the camera 121. The positional relationship T12 between the centroid of the marker points M1 on the glove and the centroid of the marker points M2 on the headset 120 is already known, and the relationship between the centroid of the points M2 and the camera 121 is supplied by the fitting module 132. The shifting module therefore chains the two relationships (as with A related to B and B related to C, the relationship between A and C follows) and projects the glove's marker points M1 onto the gesture pictures shot by the gesture recognition tracking camera, yielding hand image data accurately annotated with hand skeleton points. These images are then fed into a convolutional neural network model as training data, completing high-precision training of the model.
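The chaining just described ("A relates to B, B relates to C, hence A relates to C") corresponds to composing two rigid transforms and applying the result to the marker points. A minimal sketch, with illustrative function names and demo transforms of my own choosing:

```python
import numpy as np

def compose(R_bc, t_bc, R_ab, t_ab):
    """Chain two rigid transforms: if x_b = R_ab @ x_a + t_ab (glove -> headset)
    and x_c = R_bc @ x_b + t_bc (headset -> camera), then x_c = R_ac @ x_a + t_ac."""
    return R_bc @ R_ab, R_bc @ t_ab + t_bc

def transform_points(points, R, t):
    """Translate and rotate marker points into the target coordinate system."""
    return np.asarray(points, float) @ R.T + t

# Demo: 90-degree rotation about z plus a shift (glove -> headset),
# followed by a pure shift (headset -> camera).
R_ab = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t_ab = np.array([1.0, 0.0, 0.0])
R_bc, t_bc = np.eye(3), np.array([0.0, 1.0, 0.0])
R_ac, t_ac = compose(R_bc, t_bc, R_ab, t_ab)
m1_cam = transform_points([[1.0, 0.0, 0.0]], R_ac, t_ac)  # a marker M1 in the camera frame
```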
As shown in figs. 1, 2 and 3, with the gesture recognition data acquisition system provided by the invention, a single worker wearing the gloves marked with the infrared marker points M1 on both hands and the VR virtual reality headset marked with the marker points M2 performs a number of specific hand motions inside the enclosed space. The infrared tracking cameras arranged on the walls of the enclosed space scan the marker points M1 and M2 to obtain the position coordinates P1 of the glove and the position coordinates P2 of the headset, and transmit P1 and P2 to the server client. The gesture recognition tracking camera arranged on the headset photographs the glove to acquire the coordinate data Pc of the glove relative to the camera 121 and transmits Pc to the server client. The server client performs curve fitting on Pc and the positional relationship T12 between the centroids of the points M1 and M2, then applies the resulting rotation and translation to obtain a large number of hand images accurately annotated with hand skeleton points. These images are fed into a convolutional neural network model as training data, so that once trained the model recognizes gestures automatically. In this process a large amount of image training data can be collected by one person performing only a few specific hand actions, which breaks away from the limitations of the traditional method of manually annotating skeleton points, improves the accuracy of the image skeleton-point annotation, improves the training accuracy of the convolutional neural network model, and thereby deepens the user's sense of immersion in VR/AR/MR scenes.
As shown in fig. 4, corresponding to the gesture recognition data acquisition system, the present invention further provides a gesture recognition data acquisition method, including:
S110: the infrared tracking camera scans the infrared marker points M1 on the glove and the infrared marker points M2 on the VR virtual reality headset, so as to acquire the position coordinates P1 of the glove in the enclosed space and the position coordinates P2 of the headset in the enclosed space; the marker points M1 correspond to the positions of the hand skeleton points;
S120: determining, from the position coordinates P1 and the position coordinates P2, the positional relationship T12 between the centroid of the marker points M1 and the centroid of the marker points M2;
S130: shoot the glove through the gesture recognition tracking camera worn on the VR virtual reality so as to acquire the coordinate data of the glove relative to the gesture recognition tracking camera 121
Figure BDA00025835084400001011
And using the gesture recognition tracking camera as an origin coordinate to obtain coordinate data
Figure BDA00025835084400001012
In relation to the position
Figure BDA00025835084400001013
Performing curve fitting estimation based on least square estimation algorithm to obtain the coordinates of the centroid of the infrared mark point M2 relative to the origin pointThe rotation matrix and translation vector of (a);
S140: translating and rotating the marker points M1, according to the rotation matrix and the translation vector, into the coordinate system with the gesture recognition tracking camera as the coordinate origin, so as to mark the hand skeleton points on the gesture pictures shot by the camera.
In step S110, the worker wears the glove during data acquisition and performs several specific gesture actions in the enclosed space, ensuring that both the infrared tracking camera and the gesture recognition tracking camera on the VR virtual reality headset can capture the infrared marker points on the glove for each gesture action.
In step S120, the positional relationship T12 between the centroid of the marker points M1 and the centroid of the marker points M2 is determined from the position coordinates P1 and P2. The coordinates P1 and P2 and the relationship T12 are all positions relative to the enclosed space, expressed in a coordinate system with the infrared tracking camera as the coordinate origin.
In step S130, the gesture recognition tracking camera photographs the glove to acquire the coordinate data Pc of the glove relative to the camera 121, i.e., the positional relationship between the glove (the centroid of its marker points M1) and the camera. The relationship T12 is the positional relationship, within the enclosed space, between the centroid of the glove's marker points M1 and the centroid of the headset's marker points M2. Curve fitting of Pc against T12 yields the relationship between the centroid of the headset's marker points M2 and the headset's gesture recognition tracking camera, namely the rotation matrix and translation vector of the centroid of the points M2 relative to the origin coordinate.
In step S140, the marker points M1 are translated and rotated, according to the rotation matrix and the translation vector, into the coordinate system with the gesture recognition tracking camera as the coordinate origin. The relationship between the centroid of the headset's marker points M2 and the headset's gesture recognition tracking camera is known from step S130, and the positional relationship between the headset's points M2 and the glove in the enclosed-space coordinates is known from step S120; the translation and rotation therefore bring the marker points M1 into exact registration with the hand image shot by the gesture recognition tracking camera. Because the points M1 correspond to the hand skeleton points, this completes the skeleton-point annotation of the hand image.
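Marking the skeleton points on the 2-D gesture picture additionally requires projecting the camera-frame 3-D marker positions to pixel coordinates. The patent does not spell this step out; a common way to do it is a pinhole projection with the gesture camera's intrinsics (the focal lengths and principal point below are made-up values for a VGA sensor):

```python
import numpy as np

def project_to_image(points_cam, fx, fy, cx, cy):
    """Project 3-D points in the gesture-camera frame to pixel coordinates
    with an undistorted pinhole model: u = fx*x/z + cx, v = fy*y/z + cy."""
    pts = np.asarray(points_cam, float)
    u = fx * pts[:, 0] / pts[:, 2] + cx
    v = fy * pts[:, 1] / pts[:, 2] + cy
    return np.stack([u, v], axis=1)

# Hypothetical intrinsics for a VGA gesture recognition tracking camera.
fx = fy = 300.0
cx, cy = 320.0, 240.0
pix = project_to_image([[0.0, 0.0, 0.5], [0.1, -0.1, 0.5]], fx, fy, cx, cy)
```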
According to the gesture recognition data acquisition method provided by the embodiment of the invention, the infrared tracking camera scans the marker points M1 on the glove and the marker points M2 on the VR virtual reality headset to acquire the position coordinates P1 of the glove and P2 of the headset in the enclosed space, from which the positional relationship T12 between the centroids of the points M1 and M2 is determined. The gesture recognition tracking camera on the headset then photographs the glove to acquire the coordinate data Pc of the glove relative to the camera 121. Curve-fitting estimation on Pc and T12 yields the rotation matrix and translation vector of the centroid of the points M2 relative to the origin coordinate, and the hand skeleton points are marked on the gesture pictures shot by the gesture recognition tracking camera according to this rotation matrix and translation vector. The annotated hand images are fed into a convolutional neural network model as image training data, completing the model's training so that it subsequently recognizes gestures automatically. In this process a large amount of image training data can be obtained by one person performing only a few specific hand actions, which breaks away from the conventional method of manually annotating skeleton points, improves the accuracy of the image skeleton-point annotation, improves the training accuracy of the convolutional neural network model, and further deepens the user's sense of immersion in VR/AR/MR scenes.
The gesture recognition data acquisition system and method proposed by the present invention have been described above by way of example with reference to the accompanying drawings. However, those skilled in the art should understand that various modifications can be made to the system and method without departing from the scope of the invention. Therefore, the scope of the present invention should be determined by the contents of the appended claims.

Claims (10)

1. A gesture recognition data acquisition system, characterized by comprising an enclosed space, infrared tracking cameras, a glove whose surface is provided with infrared marker points M1, and a VR virtual reality headset whose surface is provided with infrared marker points M2, wherein,
the infrared marker points M1 are arranged at positions corresponding to hand skeleton points;
the infrared tracking cameras and the VR virtual reality headset are connected to a server client;
the infrared tracking cameras are arranged on the wall surfaces of the enclosed space and are used for scanning the marker points M1 and M2 to obtain the position coordinates P1 of the glove in the enclosed space and the position coordinates P2 of the VR virtual reality headset in the enclosed space, and for transmitting the position coordinates P1 and P2 to the server client;
at least two gesture recognition tracking cameras are arranged on the VR virtual reality headset; the headset is used for photographing the glove through the gesture recognition tracking cameras to acquire the coordinate data Pc of the glove relative to the gesture recognition tracking cameras, and for transmitting the coordinate data Pc to the server client;
the server client comprises a spatial positioning module, a fitting module and a shifting module, wherein,
the spatial positioning module is used for determining, from the position coordinates P1 and the position coordinates P2, the positional relationship T12 between the centroid of the infrared marker points M1 and the centroid of the infrared marker points M2;
the fitting module is used for taking the gesture recognition tracking camera as the coordinate origin and performing a curve-fitting estimation on the coordinate data Pc and the positional relationship T12 to obtain a rotation matrix and a translation vector of the centroid of the marker points M2 relative to the origin coordinate;
the shifting module is used for translating and rotating the marker points M1, according to the rotation matrix and the translation vector, into a coordinate system with the gesture recognition tracking camera as the coordinate origin, so that the hand skeleton points are marked on the gesture pictures shot by the gesture recognition tracking camera.
2. The gesture recognition data collection system of claim 1,
the number of the infrared tracking cameras is 40-50.
3. The gesture recognition data collection system of claim 1,
the infrared tracking camera is a high-precision infrared tracking camera with a field of view of at least 55° × 45°, a frame rate of at least 180 Hz, a Global Shutter exposure mode, and an image resolution of 1080P.
4. The gesture recognition data collection system of claim 1,
the curve fitting estimation is based on a least squares estimation algorithm.
5. The gesture recognition data collection system of claim 1,
the infrared tracking camera is connected with the server client through the switch; the switch is used for transmitting the data collected by the infrared tracking camera to the server client in real time.
6. The gesture recognition data collection system of claim 1,
the gesture tracking camera is a camera with a field of view of at least 130° × 100°, a frame rate of at least 60 Hz, a Global Shutter exposure mode, and VGA image resolution.
7. The gesture recognition data collection system of claim 1,
the mass center of the infrared marker point M1 is the mass center of a geometric figure formed by all the infrared marker points M1;
the centroid of the infrared marker point M2 is the centroid of the geometric figure formed by all the infrared marker points M2.
8. The gesture recognition data collection system of claim 1,
the position coordinates P1, the position coordinates P2 and the positional relationship T12 are all positions relative to the enclosed space, expressed in a coordinate system with the infrared tracking camera as the coordinate origin.
9. A gesture recognition data acquisition method is characterized by comprising the following steps:
scanning infrared marker points M1 on a glove and infrared marker points M2 on a VR virtual reality headset with an infrared tracking camera, to acquire the position coordinates P1 of the glove in an enclosed space and the position coordinates P2 of the VR virtual reality headset in the enclosed space, the marker points M1 corresponding to the positions of hand skeleton points;
determining, from the position coordinates P1 and the position coordinates P2, the positional relationship T12 between the centroid of the infrared marker points M1 and the centroid of the infrared marker points M2;
photographing the glove with a gesture recognition tracking camera on the VR virtual reality headset to acquire the coordinate data Pc of the glove relative to the gesture recognition tracking camera, and, taking the gesture recognition tracking camera as the coordinate origin, performing a curve-fitting estimation on the coordinate data Pc and the positional relationship T12 to obtain a rotation matrix and a translation vector of the centroid of the marker points M2 relative to the origin coordinate;
and translating and rotating the marker points M1, according to the rotation matrix and the translation vector, into a coordinate system with the gesture recognition tracking camera as the coordinate origin, so as to mark the hand skeleton points on a gesture picture shot by the gesture recognition tracking camera.
10. The method of claim 9, wherein, in the process of photographing the glove with the gesture recognition tracking camera on the VR virtual reality headset to obtain the coordinate data Pc of the glove relative to the gesture recognition tracking camera,
the gesture recognition tracking camera on the VR virtual reality headset photographs at least 20 infrared marker points M1 on the glove.
CN202010674342.7A 2020-05-12 2020-07-14 Gesture recognition data acquisition system and method Active CN111860275B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010397091 2020-05-12
CN2020103970912 2020-05-12

Publications (2)

Publication Number Publication Date
CN111860275A true CN111860275A (en) 2020-10-30
CN111860275B CN111860275B (en) 2023-11-03

Family

ID=72983413

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010674342.7A Active CN111860275B (en) 2020-05-12 2020-07-14 Gesture recognition data acquisition system and method

Country Status (1)

Country Link
CN (1) CN111860275B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112416134A (en) * 2020-12-10 2021-02-26 华中科技大学 Device and method for quickly generating hand key point data set

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103927016A (en) * 2014-04-24 2014-07-16 西北工业大学 Real-time three-dimensional double-hand gesture recognition method and system based on binocular vision
WO2016132371A1 (en) * 2015-02-22 2016-08-25 Technion Research & Development Foundation Limited Gesture recognition using multi-sensory data
US20180088889A1 (en) * 2016-09-29 2018-03-29 Jiang Chang Three-dimensional image formation and color correction system and method
CN108196679A (en) * 2018-01-23 2018-06-22 河北中科恒运软件科技股份有限公司 Gesture-capture and grain table method and system based on video flowing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
付倩; 沈俊辰; 张茜颖; 武仲科; 周明全: "Kinect-based gesture recognition for automatic sign-language translation", Journal of Beijing Normal University (Natural Science), no. 06
吕美玉; 侯文君; 陈军: "Virtual gestures and their spatial position tracking based on data gloves and binocular vision technology", Journal of Beijing University of Posts and Telecommunications, no. 06



Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant