CN114495273A - Robot gesture teleoperation method and related device - Google Patents

Robot gesture teleoperation method and related device

Info

Publication number
CN114495273A
CN114495273A
Authority
CN
China
Prior art keywords
gesture
information
hand
key point
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210081710.6A
Other languages
Chinese (zh)
Inventor
高庆
陈勇全
池楚亮
王启文
沈文心
房俊雯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chinese University of Hong Kong Shenzhen
Original Assignee
Chinese University of Hong Kong Shenzhen
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chinese University of Hong Kong Shenzhen filed Critical Chinese University of Hong Kong Shenzhen
Priority to CN202210081710.6A priority Critical patent/CN114495273A/en
Publication of CN114495273A publication Critical patent/CN114495273A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/017 - Gesture based interaction, e.g. based on a set of recognized hand gestures

Abstract

The embodiment of the application discloses a robot gesture teleoperation method, which comprises the following steps: acquiring gesture image information of a user; obtaining gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information according to the gesture image information; estimating key point center information and gesture direction information through the gesture image information; obtaining gesture category information according to the gesture box information, the gesture confidence and the key point center information, wherein the gesture category information represents the left and right categories of the hand of the user; and controlling the action of the manipulator of the robot through first target gesture information, which comprises the gesture category information, the gesture three-dimensional position information, the gesture direction information and the gesture posture information.

Description

Robot gesture teleoperation method and related device
Technical Field
The embodiment of the application relates to the field of vision, in particular to a robot gesture teleoperation method and a related device.
Background
Teleoperation refers to controlling remote equipment, with human supervision and participation, to accomplish complex operations in an environment far away from the operator. Teleoperation shows remarkable social value in post-disaster rescue sites and other special scenarios. Teleoperation that imitates the gestures of a person's two hands retains the dexterity of human hands and allows mechanical equipment to be remotely controlled to complete complex operations.
Existing schemes based on data gloves and electromyography (EMG) wristbands can remotely control a robot to complete corresponding actions. However, vision-based detection and differentiation of the two hands remains a difficulty in visual teleoperation. When the two hands show different gestures, the difference between the gestures is far greater than the difference caused by the asymmetry of the hands; existing methods only extract features of the hands themselves and therefore cannot reliably distinguish the left hand from the right hand, which brings inconvenience to users.
Disclosure of Invention
The embodiment of the application provides a robot gesture teleoperation method and a related device.
A method of gesture teleoperation of a robot, comprising:
acquiring gesture image information of a user, wherein the gesture image information is image information representing hand motions of the user;
obtaining gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information according to the gesture image information, wherein the gesture box information is information representing the range of the hand of the user, the gesture confidence represents the probability that the gesture posture information conforms to the finger posture of the user, and the gesture posture information represents the posture with the maximum probability that the gesture posture conforms to the finger posture of the user in a gesture database;
estimating key point center information and gesture direction information through the gesture image information, wherein the key point center information is center position information representing the estimated hand, and the gesture direction information is direction information representing the hand direction of the user;
obtaining gesture category information according to the gesture box information, the gesture confidence and the key point center information, wherein the gesture category information represents left and right categories of the hand of the user;
and controlling the action of a manipulator of the robot through first target gesture information, wherein the first target gesture information comprises the gesture category information, the gesture three-dimensional position information, the gesture direction information and the gesture posture information.
Optionally, obtaining gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information according to the gesture image information includes:
performing feature extraction on a color image to obtain a feature map, wherein the color image is part of the information of the gesture image;
detecting the feature map to obtain the gesture three-dimensional position information and the gesture box information based on a depth image, wherein the depth image is partial information of the gesture image information;
and classifying the gesture box information to obtain the gesture posture information and the gesture confidence.
Optionally, the estimating of the center information of the key point and the gesture direction information through the gesture image information includes:
determining two-dimensional information of key points according to the color image;
determining the three-dimensional information of the key points through the two-dimensional information of the key points based on the depth image;
and calculating the center information of the key points and the gesture direction information according to the three-dimensional information of the key points.
Optionally, obtaining gesture category information according to the gesture box information, the gesture confidence and the key point center information includes:
judging whether the gesture confidence is greater than a confidence threshold, wherein the confidence threshold is a preset constant value;
if yes, judging whether the key point center information is matched with the gesture box information;
if so, determining the gesture category information according to the key point center information matched with the gesture box information;
if not, judging whether the distance between the estimated center position of the hand in the key point center information and the center position of the hand judged in the gesture three-dimensional position information is smaller than a distance threshold value or not;
and if so, determining the gesture category information according to the category of the estimated hand center position whose distance to the determined hand center position is smaller than the distance threshold.
Optionally, the controlling the robot arm action of the robot through the first target gesture information includes:
calculating according to the gesture three-dimensional position information to obtain manipulator position information;
calculating according to the gesture direction information to obtain manipulator direction information;
and controlling the action of the manipulator through second target gesture information, wherein the second target gesture information comprises the gesture category information, the manipulator position information, the manipulator direction information and the gesture posture information.
Optionally, determining the three-dimensional information of the keypoint through the two-dimensional information of the keypoint based on the depth image includes:
calculating to obtain the three-dimensional information of the key points by the following formula:
z_i = d_i / s
x_i = (u_i - cx) · z_i / fx
y_i = (v_i - cy) · z_i / fy
wherein z_i, x_i and y_i are the three-dimensional coordinate information of the i-th key point, namely the key point three-dimensional information;
d_i is the depth value of the i-th key point obtained from the depth image;
s is the depth range of the depth image;
u_i and v_i are the two-dimensional coordinate information of the i-th key point, namely the key point two-dimensional information;
cx and cy are the coordinate values of the central point of the depth image;
fx is the focal length of the camera on the x-axis, and fy is the focal length of the camera on the y-axis.
Optionally, calculating the key point center information through the key point three-dimensional information includes:
calculating the key point center information by the following formula:
X = (x_1 + x_2 + … + x_k) / k
Y = (y_1 + y_2 + … + y_k) / k
Z = (z_1 + z_2 + … + z_k) / k
wherein X, Y and Z are the estimated center coordinate information of the hand of the corresponding category;
z_i, x_i and y_i are the three-dimensional coordinate information of the i-th key point of the hand of the corresponding category;
k is the total number of key points of the hand of the corresponding category.
A robotic gesture teleoperational device, comprising:
the device comprises an acquisition unit, a display unit and a control unit, wherein the acquisition unit is used for acquiring gesture image information of a user, and the gesture image information is image information representing hand motions of the user;
the processing unit is used for obtaining gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information according to the gesture image information, wherein the gesture box information is information representing the range of the hand of the user, the gesture confidence represents the probability that the gesture posture information conforms to the finger posture of the user, and the gesture posture information represents the posture with the maximum probability that the gesture posture conforms to the finger posture of the user in a gesture database;
the estimation unit is used for estimating key point center information and gesture direction information through the gesture image information, wherein the key point center information is center position information representing estimated hands, and the gesture direction information is direction information representing the hand direction of the user;
the processing unit is further configured to obtain gesture category information according to the gesture box information, the gesture confidence and the key point center information, where the gesture category information represents left and right categories of the hand of the user;
and the control unit is used for controlling the action of a manipulator of the robot through first target gesture information, and the first target gesture information comprises the gesture category information, the gesture three-dimensional position information, the gesture direction information and the gesture posture information.
A robotic gesture teleoperational device, comprising:
the system comprises a central processing unit, a memory and an input/output interface;
the memory is a transient storage memory or a persistent storage memory;
the central processor is configured to communicate with the memory and execute the instruction operations in the memory to perform the aforementioned methods.
A computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the aforementioned method.
According to the technical scheme, the embodiment of the application has the following advantages:
after gesture image information of a user is collected, gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information are obtained according to the gesture image information, and then key point center information and gesture direction information are estimated through the gesture image information. And then obtaining gesture category information according to the gesture box information, the gesture confidence and the key point center information. And finally, controlling the action of the manipulator of the robot through the first target gesture information. The left hand and the right hand can be distinguished through the gesture box information, the gesture confidence and the key point center information, and left and right categories of the hand of the user, namely gesture category information, are obtained. And corresponding remote operation is carried out on the left manipulator and the right manipulator of the robot by combining other information, so that the efficiency is improved, and better experience is provided for a user.
Drawings
FIG. 1 is a schematic diagram illustrating a gesture remote operation according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of an embodiment of a method for robot gesture teleoperation according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of another embodiment of a method for robot gesture teleoperation according to an embodiment of the present disclosure;
FIG. 4 is a topological diagram of key points of a human body according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an embodiment of a robot gesture teleoperation device according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of another embodiment of a robot gesture teleoperation device according to the present application.
Detailed Description
The embodiment of the application provides a robot gesture teleoperation method and a related device.
Traditional robot teleoperation controls a conventional mechanical arm through devices such as a mouse and a keyboard. However, such devices cannot easily and effectively control a robot with two manipulators and cannot distinguish between the left and right hands of the user. Existing remote control methods for dual-arm robots that imitate a person's two hands and arms rely on wearable interactive equipment, which restricts the motion of the user's hands; moreover, these methods only extract features of the hands and therefore cannot reliably distinguish the left hand from the right hand. The gesture teleoperation method provided by the embodiment of the application frees the user from interactive equipment and distinguishes the user's left and right hands to achieve efficient control.
Referring to fig. 1, in the teleoperation according to the embodiment of the present application, gesture image information of a user is acquired by an RGB-D camera, processed, and then mapped to the manipulators of a dual-arm robot, that is, five-finger dexterous hands. The five-finger dexterous hand grips an object on the table according to the user's gesture to complete the remote operation. The dual-arm robot may be a Baxter robot or another robot, which is not limited herein. Each arm of the Baxter dual-arm robot has 6 degrees of freedom, supports basic path planning and motion control, and the two arms can cooperate and avoid colliding with each other during complex operations. The manipulator at the end of each arm also has 6 degrees of freedom, including 5 bending degrees of freedom for the 5 fingers and 1 rotational degree of freedom for the thumb. Different objects can be grasped or manipulated through different gesture postures.
To specifically describe the gesture teleoperation method in the embodiment of the present application, referring to fig. 2, an embodiment of the gesture teleoperation method for a robot in the embodiment of the present application includes:
201. acquiring gesture image information of a user;
Gesture image information of the user is acquired. The gesture image information is image information representing the hand motions of the user. Specifically, the user does not need to wear any interactive device and simply performs gesture motions within the shooting range of the RGB-D camera, which acquires the related image information. The RGB-D camera captures RGB color images and depth images of the user's hands to generate the relevant three-dimensional information, so the gesture image information includes a color image and a depth image.
It is understood that the RGB-D camera may be a RealSense D435i camera, a Kinect camera, or any other camera capable of acquiring a depth image and a two-dimensional color image, which is not limited herein.
202. Obtaining gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information according to the gesture image information;
The gesture image information is processed to obtain gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information. Specifically, feature extraction is performed on the color image in the gesture image information to obtain a feature map, and the feature map is detected in combination with the depth image to obtain the gesture box information and the gesture three-dimensional position information. The gesture box information is then input into an EfficientNet-B5 classifier, compared against the gesture postures in a gesture posture library, and the most similar gesture posture is selected to determine the gesture posture information; the corresponding gesture confidence is obtained from this comparison. The gesture posture information is the information representing the posture in the gesture database with the maximum probability of conforming to the finger posture of the user.
203. Estimating key point center information and gesture direction information through the gesture image information;
Key point center information and gesture direction information are estimated from the gesture image information. Specifically, the color image can be analyzed by a BlazePose network to determine the two-dimensional information of the key points, which is then combined with the depth image to obtain the three-dimensional information of the key points. BlazePose is a lightweight convolutional neural network framework for human pose estimation and real-time inference; during inference, the network generates 33 body key points and runs at more than 30 frames per second. The center position information of the hand, namely the key point center information, can be estimated from the three-dimensional information of the hand key points. A coordinate system can be established from the key point three-dimensional information, and the gesture direction information can be determined using the quaternion rotation principle. The gesture direction information is information indicating the orientation of the user's hand.
204. Obtaining gesture category information according to the gesture box information, the gesture confidence and the key point center information;
and analyzing and judging according to the gesture box information, the gesture confidence and the key point center information to obtain gesture category information. The gesture type information represents left and right types of the user's hand and is used for distinguishing the left hand from the right hand of the user. Specifically, the gesture category information is obtained by comparing the gesture confidence with a preset confidence threshold, and judging whether the center information of the key point is matched with the gesture frame information or judging whether the distance between the estimated center position of the hand in the center information of the key point and the center position of the hand judged in the three-dimensional position information of the gesture is smaller than a distance threshold.
205. Controlling the action of a manipulator of the robot through the first target gesture information;
and after the gesture type information is obtained, controlling the action of a manipulator of the robot through the first target gesture information. The first target gesture information comprises gesture category information, gesture three-dimensional position information, gesture direction information and gesture posture information. And processing the first target gesture information to obtain second target gesture information, and mapping the second target gesture information to a manipulator of the double-arm robot so as to control the manipulator to execute corresponding actions.
In the embodiment of the application, after gesture image information of a user is collected, gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information are obtained from the gesture image information, and key point center information and gesture direction information are then estimated from the gesture image information. Gesture category information is then obtained from the gesture box information, the gesture confidence and the key point center information. Finally, the action of the manipulator is controlled through the first target gesture information. The left hand and the right hand can be distinguished through the gesture box information, the gesture confidence and the key point center information, yielding the left and right categories of the user's hand, namely the gesture category information. Combined with the other information, the left and right manipulators of the robot can be teleoperated correspondingly, which improves efficiency. In addition, because the hand motions are captured without wearable interactive equipment, that is, gesture image information is collected and processed into first target gesture information according to which the manipulator is remotely controlled to execute the corresponding actions, the user's hands can move freely during teleoperation, providing a better experience for the user.
Referring to fig. 3, another embodiment of a method for robot gesture remote operation according to the embodiment of the present application includes:
301. acquiring gesture image information of a user;
Gesture image information of the user is acquired. The gesture image information is image information representing the hand motions of the user. Specifically, the user does not need to wear any interactive device and simply performs gesture motions within the shooting range of the RGB-D camera, which acquires the related image information. The RGB-D camera captures RGB color images and depth images of the user's hands to generate the relevant three-dimensional information, so the gesture image information includes a color image and a depth image.
It is understood that the RGB-D camera may be a RealSense D435i camera, a Kinect camera, or any other camera capable of acquiring a depth image and a two-dimensional color image, which is not limited herein.
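As a rough illustration of this acquisition step, the following Python sketch grabs one aligned color/depth frame pair from a RealSense camera through the pyrealsense2 SDK; the stream resolutions, frame rate and alignment choice are illustrative assumptions rather than values taken from the patent.

```python
import numpy as np
import pyrealsense2 as rs

# Minimal RGB-D capture sketch (assumed configuration, not the patent's exact setup).
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)            # align depth pixels to the color image

frames = align.process(pipeline.wait_for_frames())
color_image = np.asanyarray(frames.get_color_frame().get_data())   # H x W x 3, BGR
depth_image = np.asanyarray(frames.get_depth_frame().get_data())   # H x W, uint16 depth
pipeline.stop()
```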
302. Carrying out feature extraction on the color image to obtain a feature map;
and performing feature extraction on the color image to obtain a feature map. Specifically, the color image is input into an input layer of the convolutional neural network, and then feature extraction is performed on the convolutional layer, so that a feature map of the hand of the user is obtained finally. It is understood that the type of convolutional neural network may be a network such as a DLA-34 network, which can be used for detecting and performing feature extraction, and is not limited herein.
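The feature-extraction step can be sketched with any detection backbone; the snippet below uses a torchvision ResNet-18 trunk purely as a stand-in for the DLA-34-style network mentioned above, so the layer choice and output shape are illustrative assumptions, not the patent's network.

```python
import torch
from torchvision import models

# Stand-in backbone: ResNet-18 up to its last convolutional stage, used only to
# illustrate "color image in, feature map out"; the patent mentions a DLA-34-type network.
backbone = torch.nn.Sequential(*list(models.resnet18(weights=None).children())[:-2])
backbone.eval()

def extract_feature_map(color_tensor):
    """color_tensor: float tensor of shape (1, 3, H, W) built from the RGB color image."""
    with torch.no_grad():
        return backbone(color_tensor)    # feature map of shape (1, 512, H/32, W/32)
```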
303. Detecting the feature map based on the depth image to obtain the gesture three-dimensional position information and the gesture box information;
The feature map is detected based on the depth image to obtain the gesture three-dimensional position information and the gesture box information. Specifically, a coordinate system is established, and the two-dimensional center coordinates of the hand and the width and height of the box representing the range of the hand are detected from the feature map. An offset can be set according to practical experience. Coordinate mapping with the depth image then yields the gesture three-dimensional position information and the gesture box information. For example, if the two-dimensional center coordinates of the hand detected from the feature map are (x, y), the width of the box is w, the height of the box is h, the offset is (Δx, Δy) and the depth value obtained from the depth image is z, then the corrected two-dimensional center coordinates of the hand are (x + Δx, y + Δy), the coordinates of one corner of the hand box are (x + Δx + w/2, y + Δy + h/2), and the other three corners follow by analogy. The gesture three-dimensional position information is (x + Δx, y + Δy, z), and the gesture box information H_box = (x, y, z, w, h) represents the coordinate range with (x + Δx, y + Δy, z) as the center coordinate and w, h as the width and height of the hand.
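To make the coordinate bookkeeping in this example concrete, the sketch below assembles the corrected hand center, the gesture three-dimensional position and the gesture box information from one detection; the depth lookup at the box center and the variable names are illustrative assumptions.

```python
import numpy as np

def build_gesture_box(x, y, w, h, dx, dy, depth_image):
    """(x, y): detected 2-D hand center; (w, h): box size; (dx, dy): offset.
    Returns the gesture three-dimensional position and the gesture box information H_box."""
    uc, vc = x + dx, y + dy                       # corrected 2-D center of the hand
    z = float(depth_image[int(vc), int(uc)])      # depth value looked up at the hand center
    position_3d = (uc, vc, z)                     # gesture three-dimensional position
    h_box = (x, y, z, w, h)                       # gesture box information
    corner = (uc + w / 2, vc + h / 2)             # one corner point; the others follow by analogy
    return position_3d, h_box, corner
```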
304. Classifying the gesture box information to obtain the gesture posture information and the gesture confidence;
The gesture box information is classified to obtain the gesture posture information. Specifically, after the range of the hand box is determined, the feature map is cropped according to the box, and the cropped hand image is input into an EfficientNet-B5 classifier for classification. The gesture posture library is preset and can include a number of existing gesture postures, such as 24 hand-signed English letters or 10 hand-signed Arabic numerals. The classifier compares against the gesture posture library, finds the gesture posture with the maximum probability of matching the user's hand, determines that posture as the gesture posture information, and determines the probability as the gesture confidence. For example, suppose the gesture posture in the cropped image is the hand sign for the number 1, but the gesture posture library happens to contain no posture for the number 1, only postures for the numbers 2, 3, 4 and 5, with probabilities of 90%, 80%, 70% and 60% respectively. By comparison, the posture of the number 2, which has the highest probability, is determined as the gesture posture information, and the gesture confidence is determined to be 90%.
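A hedged sketch of this classification step is shown below: the torchvision EfficientNet-B5 model is used as an assumed stand-in for the patent's classifier, with its head resized to the gesture posture library; the number of classes is an illustrative value.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Hypothetical classifier: an EfficientNet-B5 backbone whose final layer has been
# re-trained on the preset gesture posture library (num_classes categories).
num_classes = 34          # e.g. 24 letters + 10 digits; illustrative value
model = models.efficientnet_b5(weights=None)
model.classifier[1] = torch.nn.Linear(model.classifier[1].in_features, num_classes)
model.eval()

def classify_gesture(hand_crop):
    """hand_crop: float tensor of shape (1, 3, H, W), the image cropped by the gesture box."""
    with torch.no_grad():
        probs = F.softmax(model(hand_crop), dim=1)
    confidence, class_id = probs.max(dim=1)       # top-1 probability serves as the gesture confidence
    return class_id.item(), confidence.item()
```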
305. Determining two-dimensional information of key points according to the color image;
Two-dimensional information of the key points is determined from the color image, which can be analyzed with a BlazePose network. BlazePose is a lightweight convolutional neural network framework for human pose estimation and real-time inference. Referring to fig. 4, the network generates the two-dimensional key point information according to a topological graph of human body key points: during inference it produces 33 body key points and runs at more than 30 frames per second. In practice, the 11 face key points are not needed; only the remaining 22 key points, especially those of the left and right hands, need to be acquired.
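For reference, BlazePose is available through the MediaPipe Python package; the sketch below extracts the 33 key points in pixel coordinates, with the model options chosen as illustrative assumptions.

```python
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

def extract_keypoints_2d(color_image_bgr):
    """Return the 33 BlazePose key points in pixel coordinates, or None if no person is found."""
    h, w = color_image_bgr.shape[:2]
    with mp_pose.Pose(static_image_mode=True, model_complexity=1) as pose:
        results = pose.process(cv2.cvtColor(color_image_bgr, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks is None:
        return None
    # Landmarks are normalized to [0, 1]; convert them to pixel coordinates (u_i, v_i).
    return [(lm.x * w, lm.y * h) for lm in results.pose_landmarks.landmark]
```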
306. Determining the three-dimensional information of the key points through the two-dimensional information of the key points based on the depth image;
and determining three-dimensional information of the key points through the two-dimensional information of the key points based on the depth image. Specifically, the three-dimensional information of the key points is calculated by the following formula:
z_i = d_i / s
x_i = (u_i - cx) · z_i / fx
y_i = (v_i - cy) · z_i / fy
wherein z_i, x_i and y_i are the three-dimensional coordinate information of the i-th key point, namely the key point three-dimensional information;
d_i is the depth value of the i-th key point obtained from the depth image;
s is the depth range of the depth image;
u_i and v_i are the two-dimensional coordinate information of the i-th key point, namely the key point two-dimensional information;
cx and cy are the coordinate values of the central point of the depth image;
fx is the focal length of the camera on the x-axis, and fy is the focal length of the camera on the y-axis.
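A minimal sketch of the back-projection written out above; dividing the depth value by s follows the reconstruction of the formula given here, and fx, fy, cx, cy are assumed to come from the depth camera intrinsics.

```python
import numpy as np

def backproject_keypoints(keypoints_2d, depth_image, fx, fy, cx, cy, s):
    """keypoints_2d: list of (u_i, v_i) pixel coordinates; returns a list of (x_i, y_i, z_i)."""
    points_3d = []
    for u, v in keypoints_2d:
        d = float(depth_image[int(v), int(u)])    # depth value d_i at the key point
        z = d / s                                 # z_i = d_i / s (reconstructed reading of the formula)
        x = (u - cx) * z / fx                     # x_i = (u_i - cx) * z_i / fx
        y = (v - cy) * z / fy                     # y_i = (v_i - cy) * z_i / fy
        points_3d.append((x, y, z))
    return points_3d
```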
307. Calculating the center information of the key points and the gesture direction information according to the three-dimensional information of the key points;
and calculating key point center information and gesture direction information through the key point three-dimensional information. Specifically, the key point center information is calculated by the following formula:
X = (x_1 + x_2 + … + x_k) / k
Y = (y_1 + y_2 + … + y_k) / k
Z = (z_1 + z_2 + … + z_k) / k
wherein X, Y and Z are the estimated center coordinate information of the hand of the corresponding category;
z_i, x_i and y_i are the three-dimensional coordinate information of the i-th key point of the hand of the corresponding category;
k is the total number of key points of the hand of the corresponding category.
Taking the left hand as an example, please refer to fig. 4: the key points of the left hand are the 15th, 17th, 19th and 21st points, so the value of k is 4, and the center coordinate information of the left hand is:
X_left = (x_15 + x_17 + x_19 + x_21) / 4
Y_left = (y_15 + y_17 + y_19 + y_21) / 4
Z_left = (z_15 + z_17 + z_19 + z_21) / 4
and so on for the right hand.
A coordinate system is then established from the key point three-dimensional information. Specifically, the z-axis is defined as the unit normal vector of the plane formed by the 15th, 17th and 19th points, the x-axis is defined as the unit direction vector from the 15th point to the left-hand center point, and the y-axis is defined as the unit vector obtained by the cross product of the x-axis and the z-axis. From this coordinate system, the gesture direction information can be obtained using the quaternion rotation principle.
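The frame construction and quaternion conversion can be sketched as follows; the re-orthogonalization of the x-axis and the SciPy quaternion convention (x, y, z, w) are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def left_hand_orientation(p15, p17, p19, hand_center):
    """p15, p17, p19: 3-D key points of the left hand; hand_center: estimated left-hand center."""
    p15, p17, p19, hand_center = map(np.asarray, (p15, p17, p19, hand_center))
    z_axis = np.cross(p17 - p15, p19 - p15)           # normal of the plane through points 15, 17, 19
    z_axis /= np.linalg.norm(z_axis)
    x_axis = hand_center - p15                        # from point 15 towards the hand center
    x_axis -= np.dot(x_axis, z_axis) * z_axis         # project into the plane so the frame is orthonormal (assumption)
    x_axis /= np.linalg.norm(x_axis)
    y_axis = np.cross(z_axis, x_axis)                 # completes the right-handed frame
    rotation = np.column_stack((x_axis, y_axis, z_axis))
    return R.from_matrix(rotation).as_quat()          # gesture direction as a quaternion (x, y, z, w)
```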
308. Judging whether the gesture confidence is greater than a confidence threshold; if yes, executing step 309, and if not, executing step 310. The confidence threshold is a preset constant value in the range of 0.5 to 1, and is generally set to 0.85 according to practical experience;
309. judging whether the key point center information is matched with the gesture box information, if so, executing a step 312, and if not, executing a step 311;
specifically, whether the estimated center position of the hand in the key point center information is in a square frame of the hand is judged, and if the estimated center position of the hand is in the square frame, the type of the hand can be determined.
310. Judging whether the distance between the estimated center position of the hand in the center information of the key points and the center position of the hand judged in the three-dimensional position information of the gesture is smaller than a distance threshold value, if so, executing a step 313, and if not, executing a step 311, wherein the distance threshold value is a constant value preset according to practical experience;
311. ending the control of the manipulator;
the control of the robot is ended. The type of the hand cannot be determined, and the remote control is ended to prevent the manipulator from performing a malfunction.
312. Determining the gesture category information according to the key point center information matched with the gesture box information;
and determining gesture category information according to the key point center information matched with the gesture box information. If the category of the key point center information matched with the gesture frame information is left, the gesture category information represents the information of the left hand, and if the category of the key point center information matched with the gesture frame information is right, the gesture category information represents the information of the right hand. For example, if the estimated center position of the left hand is within the frame of the hand, the gesture category information is determined as the left category, i.e., the hand of the user is the left hand. And determining the gesture category information as a right category if the estimated center position of the right hand is in the square frame of the hand, namely the hand of the user is the right hand.
313. Determining the gesture category information according to the category of the estimated central position of the hand, wherein the distance between the estimated central position of the hand and the determined central position of the hand is smaller than the distance threshold;
and determining gesture category information according to the category of the estimated central position of the hand, wherein the distance between the estimated central position of the hand and the determined central position of the hand is smaller than the distance threshold. For example, the estimated center position of the left hand in the center information of the key points is point P1, the estimated center position of the right hand is point P2, the center position of the hand determined in the three-dimensional position information of the gesture is point P3, and the distance threshold is set to 5. If the distance between P1 and P3 is 2, the distance between P2 and P3 is 7, and 2 < 5 < 7, the gesture type information is determined to be in a left type, that is, the hand of the user is the left hand.
314. Calculating according to the gesture three-dimensional position information to obtain manipulator position information;
The manipulator position information is calculated from the gesture three-dimensional position information. In order to control the manipulator efficiently, the gesture three-dimensional position information needs to be processed into manipulator position information. Specifically, the manipulator position information is calculated by a position increment formula (the formula is given as an image in the original publication), in which:
the three-dimensional center coordinate information of the manipulator in the i-th frame is the manipulator position information, where i is greater than or equal to 2;
the three-dimensional center coordinate information of the user's hand in the i-th frame is the gesture three-dimensional position information;
k_p is the position increment coefficient, which can be preset according to practical experience and is generally set to 1.
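Since the position formula itself is only available as an image in the publication, the sketch below shows one plausible reading of a position increment mapping with coefficient k_p; it is an assumption, not the patent's exact formula.

```python
import numpy as np

def update_manipulator_position(p_manip_prev, p_hand_curr, p_hand_prev, k_p=1.0):
    """Assumed increment form: move the manipulator by k_p times the change in the
    hand's three-dimensional center between consecutive frames."""
    delta = np.asarray(p_hand_curr, dtype=float) - np.asarray(p_hand_prev, dtype=float)
    return np.asarray(p_manip_prev, dtype=float) + k_p * delta
```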
315. Calculating according to the gesture direction information to obtain manipulator direction information;
The manipulator direction information is calculated from the gesture direction information. In order to control the manipulator efficiently, the gesture direction information needs to be processed into manipulator direction information. Specifically, the manipulator direction information is calculated by a direction increment formula (the formula is given as an image in the original publication), in which:
the direction information of the manipulator in the i-th frame is the manipulator direction information;
the direction information of the user's hand in the i-th frame is the gesture direction information;
k_o is the direction increment coefficient, which can be preset according to practical experience and is generally set to 1.
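Similarly, the direction formula is only given as an image; the sketch below applies one plausible increment update in which the hand's frame-to-frame rotation, scaled by k_o, is applied to the manipulator orientation, again as an assumption rather than the patent's exact formula.

```python
from scipy.spatial.transform import Rotation as R

def update_manipulator_orientation(q_manip_prev, q_hand_curr, q_hand_prev, k_o=1.0):
    """Quaternions use SciPy's (x, y, z, w) order; k_o scales the rotation vector of the
    hand's rotation between consecutive frames (assumed form)."""
    rel = R.from_quat(q_hand_curr) * R.from_quat(q_hand_prev).inv()   # hand rotation between frames
    scaled = R.from_rotvec(k_o * rel.as_rotvec())                     # scale the increment by k_o
    return (scaled * R.from_quat(q_manip_prev)).as_quat()
```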
316. Controlling the action of the manipulator through second target gesture information;
and controlling the action of the manipulator through second target gesture information. The second target gesture information comprises gesture category information, manipulator position information, manipulator direction information and gesture posture information. Specifically, the second target gesture information increment is mapped to the manipulator on the double-arm robot so as to remotely control the action of the manipulator.
In this embodiment, after gesture image information of a user is acquired, gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information are obtained from the gesture image information, and key point center information and gesture direction information are then estimated from the gesture image information. Gesture category information is then obtained from the gesture box information, the gesture confidence and the key point center information. Finally, the action of the manipulator is controlled through the second target gesture information. Because the hand motions are captured without wearable interactive equipment, that is, gesture image information is collected and processed into first target gesture information according to which the manipulator is remotely controlled to execute the corresponding actions, the user's hands can move freely during teleoperation, providing a better experience for the user. Introducing the key point center information and the gesture confidence into the judgment, on top of the detected gesture box information and gesture three-dimensional position information, allows the left and right hand categories to be determined and improves the accuracy of the executed actions.
Referring to fig. 5, an embodiment of a robot gesture remote operation device according to the embodiment of the present application includes:
the device comprises an acquisition unit 501, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring gesture image information of a user, and the gesture image information is image information representing hand motions of the user;
the processing unit 502 is configured to obtain gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information according to the gesture image information, where the gesture box information is information representing a range of a hand of the user, the gesture confidence represents a probability that the gesture posture information coincides with a finger posture of the user, and the gesture posture information is information representing a posture with a highest probability that the gesture posture coincides with the finger posture of the user in a gesture database;
an estimating unit 503, configured to estimate, through the gesture image information, key point center information and gesture direction information, where the key point center information is center position information representing an estimated hand, and the gesture direction information is orientation information representing a hand of the user;
the processing unit 502 is further configured to obtain gesture category information according to the gesture box information, the gesture confidence and the key point center information, where the gesture category information represents left and right categories of the hand of the user;
a control unit 504, configured to control a manipulator action of the robot through first target gesture information, where the first target gesture information includes the gesture category information, the gesture three-dimensional position information, the gesture direction information, and the gesture posture information.
In the embodiment of the application, after the acquisition unit 501 acquires gesture image information of a user, the processing unit 502 obtains gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information from the gesture image information, and the estimation unit 503 estimates key point center information and gesture direction information from the gesture image information. Gesture category information is then obtained from the gesture box information, the gesture confidence and the key point center information. Finally, the control unit 504 controls the manipulator action according to the first target gesture information. The left hand and the right hand can be distinguished through the gesture box information, the gesture confidence and the key point center information, yielding the left and right categories of the user's hand, namely the gesture category information. In addition, because the hand motions are captured without wearable interactive equipment, that is, gesture image information is acquired and processed into first target gesture information according to which the manipulator is remotely controlled to execute the corresponding actions, the user's hands can move freely during teleoperation, providing a better experience for the user.
The functions and processes executed by each unit in the robot gesture remote operation device of this embodiment are similar to those executed by the gesture remote operation device in fig. 1 to 5, and are not described again here.
Fig. 6 is a schematic structural diagram of a gesture remote operation device according to an embodiment of the present disclosure, where the gesture remote operation device 600 may include one or more Central Processing Units (CPUs) 601 and a memory 605, and the memory 605 stores one or more applications or data therein.
The memory 605 may be volatile storage or persistent storage, among other things. The program stored in memory 605 may include one or more modules, each of which may include a series of instruction operations in a gesture teleoperation device. Still further, the central processor 601 may be configured to communicate with the memory 605 to execute a series of command operations in the memory 605 on the gesture teleoperation device 600.
Gesture teleoperation device 600 may also include one or more power supplies 602, one or more wired or wireless network interfaces 603, one or more input/output interfaces 604, and/or one or more operating systems, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The central processor 601 can perform the operations performed by the gesture teleoperation device in the embodiments shown in fig. 1 to fig. 5, which are not described herein again.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and the like.

Claims (10)

1. A robot gesture teleoperation method is characterized by comprising the following steps:
acquiring gesture image information of a user, wherein the gesture image information is image information representing hand motions of the user;
obtaining gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information according to the gesture image information, wherein the gesture box information is information representing the range of the hand of the user, the gesture confidence represents the probability that the gesture posture information conforms to the finger posture of the user, and the gesture posture information represents the posture with the maximum probability that the gesture posture conforms to the finger posture of the user in a gesture database;
estimating key point center information and gesture direction information through the gesture image information, wherein the key point center information is center position information representing the estimated hand, and the gesture direction information is direction information representing the hand direction of the user;
obtaining gesture category information according to the gesture box information, the gesture confidence and the key point center information, wherein the gesture category information represents left and right categories of the hand of the user;
and controlling the action of a manipulator of the robot through first target gesture information, wherein the first target gesture information comprises the gesture category information, the gesture three-dimensional position information, the gesture direction information and the gesture posture information.
2. The gesture teleoperation method according to claim 1, wherein obtaining gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information according to the gesture image information comprises:
performing feature extraction on a color image to obtain a feature map, wherein the color image is part of the information of the gesture image;
detecting the feature map to obtain the gesture three-dimensional position information and the gesture box information based on a depth image, wherein the depth image is partial information of the gesture image information;
and classifying the gesture box information to obtain the gesture posture information and the gesture confidence.
3. The gesture teleoperation method according to claim 2, wherein the estimating of the center information of the key point and the gesture direction information through the gesture image information comprises:
determining two-dimensional information of key points according to the color image;
determining the three-dimensional information of the key points through the two-dimensional information of the key points based on the depth image;
and calculating the center information of the key points and the gesture direction information according to the three-dimensional information of the key points.
4. The gesture teleoperation method according to claim 3, wherein obtaining gesture category information according to the gesture box information, the gesture confidence and the key point center information comprises:
judging whether the gesture confidence is greater than a confidence threshold, wherein the confidence threshold is a preset constant value;
if yes, judging whether the key point center information is matched with the gesture box information;
if so, determining the gesture category information according to the key point center information matched with the gesture box information;
if not, judging whether the distance between the estimated center position of the hand in the key point center information and the center position of the hand judged in the gesture three-dimensional position information is smaller than a distance threshold value or not;
and if so, determining the gesture category information according to the category of the estimated hand center position whose distance to the determined hand center position is smaller than the distance threshold.
5. The gesture teleoperation method of claim 4, wherein controlling the manipulator motion of the robot through the first target gesture information comprises:
calculating according to the gesture three-dimensional position information to obtain manipulator position information;
calculating according to the gesture direction information to obtain manipulator direction information;
and controlling the action of the manipulator through second target gesture information, wherein the second target gesture information comprises the gesture category information, the manipulator position information, the manipulator direction information and the gesture posture information.
6. The gesture teleoperation method of claim 3, wherein determining the three-dimensional keypoint information from the two-dimensional keypoint information based on the depth image comprises:
calculating to obtain the three-dimensional information of the key points by the following formula:
z_i = d_i / s
x_i = (u_i - cx) · z_i / fx
y_i = (v_i - cy) · z_i / fy
wherein z_i, x_i and y_i are the three-dimensional coordinate information of the i-th key point, namely the key point three-dimensional information;
d_i is the depth value of the i-th key point obtained from the depth image;
s is the depth range of the depth image;
u_i and v_i are the two-dimensional coordinate information of the i-th key point, namely the key point two-dimensional information;
cx and cy are the coordinate values of the central point of the depth image;
fx is the focal length of the camera on the x-axis, and fy is the focal length of the camera on the y-axis.
7. The gesture teleoperation method according to claim 3, wherein the step of calculating the key point center information through the key point three-dimensional information comprises:
calculating the key point center information by the following formula:
X = (x_1 + x_2 + … + x_k) / k
Y = (y_1 + y_2 + … + y_k) / k
Z = (z_1 + z_2 + … + z_k) / k
wherein X, Y and Z are the estimated center coordinate information of the hand of the corresponding category;
z_i, x_i and y_i are the three-dimensional coordinate information of the i-th key point of the hand of the corresponding category;
k is the total number of key points of the hand of the corresponding category.
8. A robotic gesture teleoperation device, comprising:
the device comprises an acquisition unit, a display unit and a control unit, wherein the acquisition unit is used for acquiring gesture image information of a user, and the gesture image information is image information representing hand motions of the user;
the processing unit is used for obtaining gesture box information, gesture three-dimensional position information, gesture confidence and gesture posture information according to the gesture image information, wherein the gesture box information is information representing the range of the hand of the user, the gesture confidence represents the probability that the gesture posture information conforms to the finger posture of the user, and the gesture posture information represents the posture with the maximum probability of conforming to the finger posture of the user in a gesture database;
the estimation unit is used for estimating key point center information and gesture direction information through the gesture image information, wherein the key point center information is center position information representing estimated hands, and the gesture direction information is direction information representing the hand direction of the user;
the processing unit is further configured to obtain gesture category information according to the gesture box information, the gesture confidence and the key point center information, where the gesture category information represents left and right categories of the hand of the user;
and the control unit is used for controlling the action of a manipulator of the robot through first target gesture information, and the first target gesture information comprises the gesture category information, the gesture three-dimensional position information, the gesture direction information and the gesture posture information.
9. A robotic gesture teleoperation device, comprising:
the system comprises a central processing unit, a memory and an input/output interface;
the memory is a transient memory or a persistent memory;
the central processor is configured to communicate with the memory and execute the operations of the instructions in the memory to perform the method of any of claims 1 to 7.
10. A computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the method of any one of claims 1 to 7.
CN202210081710.6A 2022-01-24 2022-01-24 Robot gesture teleoperation method and related device Pending CN114495273A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210081710.6A CN114495273A (en) 2022-01-24 2022-01-24 Robot gesture teleoperation method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210081710.6A CN114495273A (en) 2022-01-24 2022-01-24 Robot gesture teleoperation method and related device

Publications (1)

Publication Number Publication Date
CN114495273A true CN114495273A (en) 2022-05-13

Family

ID=81475639

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210081710.6A Pending CN114495273A (en) 2022-01-24 2022-01-24 Robot gesture teleoperation method and related device

Country Status (1)

Country Link
CN (1) CN114495273A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116269455A (en) * 2023-03-20 2023-06-23 瑞石心禾(河北)医疗科技有限公司 Detection method and system for automatically acquiring human body contour in SPECT (single photon emission computed tomography)
CN116269455B (en) * 2023-03-20 2023-12-12 瑞石心禾(河北)医疗科技有限公司 Detection method and system for automatically acquiring human body contour in SPECT (single photon emission computed tomography)
CN116884095A (en) * 2023-09-08 2023-10-13 烟台大学 Gesture recognition control method, system, equipment and storage medium of bionic manipulator
CN116884095B (en) * 2023-09-08 2023-11-21 烟台大学 Gesture recognition control method, system, equipment and storage medium of bionic manipulator

Similar Documents

Publication Publication Date Title
Bhuyan et al. Fingertip detection for hand pose recognition
Hasanuzzaman et al. Real-time vision-based gesture recognition for human robot interaction
CN111694428B (en) Gesture and track remote control robot system based on Kinect
CN114495273A (en) Robot gesture teleoperation method and related device
Triesch et al. Robotic gesture recognition
CN108171133A (en) A kind of dynamic gesture identification method of feature based covariance matrix
JP6487642B2 (en) A method of detecting a finger shape, a program thereof, a storage medium of the program, and a system for detecting a shape of a finger.
CN111367415B (en) Equipment control method and device, computer equipment and medium
CN110807391A (en) Human body posture instruction identification method for human-unmanned aerial vehicle interaction based on vision
Kabir et al. A novel dynamic hand gesture and movement trajectory recognition model for non-touch HRI interface
CN110910426A (en) Action process and action trend identification method, storage medium and electronic device
Li et al. Visual interpretation of natural pointing gestures in 3D space for human-robot interaction
Chai et al. Human gait recognition: approaches, datasets and challenges
Takimoto et al. Classification of hand postures based on 3d vision model for human-robot interaction
Liao et al. Design of real-time face position tracking and gesture recognition system based on image segmentation algorithm
Xu et al. A novel method for hand posture recognition based on depth information descriptor
JPH09179988A (en) Gesture recognition device
Tu et al. Face and gesture based human computer interaction
CN113822946B (en) Mechanical arm grabbing method based on computer vision
Hiyadi et al. Adaptive dynamic time warping for recognition of natural gestures
Thomas et al. A comprehensive review on vision based hand gesture recognition technology
Shah et al. Gesture recognition technique: a review
Ardizzone et al. Pose classification using support vector machines
CN112975993A (en) Robot teaching method, device, storage medium and equipment
Dornaika et al. Three-dimensional face pose detection and tracking using monocular videos: Tool and application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination