CN113524173A - End-to-end intelligent capture method for extraterrestrial detection sample - Google Patents

End-to-end intelligent capture method for extraterrestrial detection sample

Info

Publication number
CN113524173A
CN113524173A (application CN202110674012.2A)
Authority
CN
China
Prior art keywords
grabbing
extraterrestrial
environment
target object
mechanical arm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110674012.2A
Other languages
Chinese (zh)
Other versions
CN113524173B (en)
Inventor
黄煌
高锡珍
汤亮
刘昊
谢心如
刘乃龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Control Engineering
Original Assignee
Beijing Institute of Control Engineering
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Control Engineering filed Critical Beijing Institute of Control Engineering
Priority to CN202110674012.2A priority Critical patent/CN113524173B/en
Publication of CN113524173A publication Critical patent/CN113524173A/en
Application granted granted Critical
Publication of CN113524173B publication Critical patent/CN113524173B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00 Programme-controlled manipulators
    • B25J9/16 Programme controls
    • B25J9/1628 Programme controls characterised by the control loop
    • B25J9/163 Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00 Programme-controlled manipulators
    • B25J9/16 Programme controls
    • B25J9/1679 Programme controls characterised by the tasks executed

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Image Analysis (AREA)

Abstract

An end-to-end intelligent capture method for an extraterrestrial detection sample, in which a digital-physical test is carried out by first performing digital training and then a physical test. The method comprises the following steps: a reinforcement-learning-based sample collection method is designed, a digital simulation training environment for sample collection is then constructed to train a model, and the model is finally transferred to a physical environment for verification. The results show that the model can capture objects of unknown and irregular geometric shape with a high success rate, guaranteeing the success of extraterrestrial sampling tasks.

Description

End-to-end intelligent capture method for extraterrestrial detection sample
Technical Field
The invention relates to an end-to-end intelligent capture method for an extraterrestrial exploration sample, and belongs to the technical field of aerospace.
Background
Extraterrestrial exploration is an important means for humanity to investigate the origin of the universe and the evolution of celestial bodies and to develop space resources peacefully, and it is a main future development direction of the world aerospace field. Extraterrestrial exploration is gradually extending from the nearby Moon to ever more distant bodies such as Mars and asteroids, and the exploration mode is progressing from flyby to landing, roving and sample return. The collection of extraterrestrial detection samples is a core link of sample return and has important scientific value and engineering significance.
At present, extraterrestrial detection samples are collected mainly by on-site sampling with a mechanical arm, laser or drilling equipment, but the process still depends on ground commands or human-in-the-loop operation, making it difficult to perform various complex detection tasks autonomously in unknown, changing environments. Meanwhile, conventional sampling methods suffer from characteristic problems: detecting irregular, unknown objects is time-consuming and error-prone, and the grabbing pose of an irregular object is hard to describe accurately and depends on manual setting. Against the background of a new generation of artificial intelligence, introducing artificial intelligence technology is an extremely effective way to improve the sampling autonomy of extraterrestrial probes.
Disclosure of Invention
The invention aims to solve the problem of extraterrestrial detection sample acquisition by providing an end-to-end intelligent grabbing method for extraterrestrial detection samples. Taking sample acquisition in Mars detection as the application background, learning and training of the full digital-physical integrated grabbing, analyzing and boxing process are carried out, achieving fully autonomous target finding, grabbing and fine operation.
The purpose of the invention is realized by the following technical scheme:
an end-to-end intelligent capture method for an extraterrestrial exploration sample comprises the following steps:
selecting a reinforcement learning method;
constructing an extraterrestrial exploration sample acquisition simulation training environment;
performing digital training in the constructed simulation training environment to obtain a grabbing model;
and transferring the obtained grabbing model to an extraterrestrial exploration sample grabbing physical experiment system, and carrying out an extraterrestrial exploration sample acquisition physical experiment based on reinforcement learning, so as to finish the intelligent grabbing of the end-to-end extraterrestrial exploration sample.
Further, the proximal policy optimization method PPO is adopted as the selected reinforcement learning method.
Further, the multi-platform robot simulation software Webots is adopted to construct the extraterrestrial exploration sample acquisition simulation training environment.
Further, when the extraterrestrial detection sample acquisition simulation training environment is established, models of a target mechanical arm, a gripper, a camera, a target object, a box and a desktop are established;
the gripper is arranged at the front end of the target mechanical arm and used for grabbing a target object on the desktop;
the camera is arranged above the desktop and used for observing the target object to be grabbed;
the box is used for placing the target object after the gripper grabs it.
Further, the digital training specifically comprises: a deep neural network is trained by designing a reward function and a network structure; RGB-D images obtained from the camera are taken as input, and the optimal grabbing pose in the corresponding image coordinate system is output.
Further, the reward function is as follows:
(The reward function is given as an equation image, Figure BDA0003120319820000021, in the original publication.)
In the proximal policy optimization method PPO, the dense neural network DenseNet is adopted for both the execution network Actor and the evaluation network Critic, with the following specific parameters: a DenseNet-121 network is selected, whose 121 layers comprise an initialization layer, densely connected layers, transition layers and a fully connected layer.
Further, the training process comprises the following steps:
(1) according to the current article grabbing environment state, the mechanical arm selects and executes grabbing actions according to an initial grabbing strategy; the initial grabbing strategy is obtained according to the selected reinforcement learning method;
(2) after the grabbing action is executed, the grabbing environment is transferred to a new state, and corresponding action rewards are obtained through a reward function;
(3) repeating the process until all the objects in the training environment are successfully grabbed;
(4) and obtaining a deep neural network model, namely a grabbing model.
Further, the extraterrestrial detection sample grabbing physical experiment system comprises a target mechanical arm, a gripper, a camera, a target object, a box and a table;
the gripper is arranged at the front end of the target mechanical arm and used for grabbing a target object on the desktop;
the camera is arranged above the desktop and used for observing the target object to be grabbed;
the box is used for placing the target object after the gripper grabs it;
transferring the obtained grabbing model to the extraterrestrial exploration sample grabbing physical experiment system means establishing a one-to-one correspondence between grabbing poses in the simulation environment and grabbing poses in the physical experiment environment;
in the physical test environment, the pose of the camera relative to the mechanical arm base coordinate system is solved with a calibration plate, and the grabbing pose obtained in the simulation environment is converted into the mechanical arm base coordinate system, so that the mechanical arm is controlled to complete the sample grabbing.
Further, the reinforcement-learning-based extraterrestrial exploration sample collection physical test specifically comprises: the trained neural network parameters are transferred to the physical environment for test verification, and the mechanical arm keeps updating the grabbing model through continuous interaction with the environment, realizing continual learning and improving the success rate of sample collection.
Furthermore, the invention also provides an intelligent selection system for the capture pose of the extraterrestrial detection sample, which comprises:
a reinforcement learning method determination module: selecting the proximal policy optimization method PPO as the reinforcement learning method;
a simulation training environment construction module: constructing an extraterrestrial exploration sample acquisition simulation training environment with the multi-platform robot simulation software Webots; establishing models of a target mechanical arm, a gripper, a camera, a target object, a box and a desktop; the gripper is arranged at the front end of the target mechanical arm and used for grabbing a target object on the desktop; the camera is arranged above the desktop and used for observing the target object to be grabbed; the box is used for placing the target object after the gripper grabs it;
a training module: performing digital training in the constructed simulation training environment to obtain a grabbing model, specifically: training a deep neural network by designing a reward function and a network structure, taking RGB-D images obtained from the camera as input and outputting the corresponding optimal grabbing pose;
the reward function is as follows:
(The reward function is given as an equation image, Figure BDA0003120319820000041, in the original publication.)
In PPO, the dense neural network DenseNet is adopted for both the execution network Actor and the evaluation network Critic, with the following specific parameters: a DenseNet-121 network is selected, whose 121 layers comprise an initialization layer, densely connected layers, transition layers and a fully connected layer;
a test verification module: transferring the obtained grabbing model to the extraterrestrial exploration sample grabbing physical experiment system and carrying out a reinforcement-learning-based extraterrestrial exploration sample acquisition physical experiment, so as to complete the end-to-end intelligent selection of the extraterrestrial exploration sample grabbing pose.
Compared with the prior art, the invention has the following beneficial effects:
(1) The end-to-end intelligent capture method for the extraterrestrial exploration sample disclosed by the invention requires no supervised training with labeled samples, is a self-learning mechanism, and can be improved online.
(2) The method needs no prior information such as the shape and size of the sample during capture training, and can capture objects of unknown and irregular geometric shape with a high success rate.
(3) The trained neural network parameters are transferred to the physical environment for test verification, and the mechanical arm can keep updating the model through continuous interaction with the environment, realizing continual learning and continuously improving the success rate of sample collection.
Drawings
FIG. 1 is a flow chart of the end-to-end intelligent grabbing method for an extraterrestrial probe sample according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the extraterrestrial exploration sample collection simulation training environment according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of grabbing irregular stones in the physical environment for extraterrestrial exploration sample acquisition according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Fig. 1 is a flow chart of the end-to-end intelligent grabbing method for an extraterrestrial probe sample according to an embodiment of the present invention, in which digital-physical test verification is performed by first carrying out digital training and then a physical test; the method comprises the following steps:
step one, selecting a reinforcement learning method;
step two, constructing an extraterrestrial detection sample acquisition simulation training environment;
step three, performing digital training in the constructed simulation training environment to obtain a grabbing model;
and step four, transferring the obtained grabbing model to an extraterrestrial exploration sample grabbing physical experiment system, carrying out an extraterrestrial exploration sample acquisition physical experiment based on reinforcement learning, and grabbing articles such as stones, so as to finish end-to-end extraterrestrial exploration sample grabbing pose intelligent selection.
In the embodiment of the invention, the proximal policy optimization method PPO is adopted as the reinforcement learning algorithm.
In the embodiment of the invention, a digital simulation environment is constructed by adopting multi-platform robot simulation software Webots.
In the embodiment of the invention, models of a target mechanical arm, a gripper, a camera, a target object, a box, a desktop and the like are established in the simulation environment.
The gripper is arranged at the front end of the target mechanical arm and used for grabbing a target object on the desktop;
the camera is arranged above the desktop and used for observing the target object to be grabbed;
the box is used for placing the target object after the gripper grabs it.
In the embodiment of the present invention, the digital training specifically comprises: a deep neural network is trained by designing a reward function and a network structure; RGB-D images obtained from the camera are taken as input, and the corresponding optimal grabbing pose is output.
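As an illustration only, the trained network could be queried for a grabbing pose roughly as in the following sketch. It assumes a PyTorch policy network, the 200 × 200 × 4 RGB-D input described later in the embodiment, a [0, 1] normalisation, and an output layout of three means followed by three log-standard-deviations; none of these details are specified in the patent.

```python
import numpy as np
import torch

def predict_grasp(policy_net: torch.nn.Module, rgb: np.ndarray, depth: np.ndarray):
    """Map one RGB-D observation to a grabbing pose in image coordinates.

    rgb: HxWx3 uint8 colour image; depth: HxW float depth map (200x200 here).
    The network is assumed to output 3 means followed by 3 log-stds for the
    grabbing action (u, v, alpha); this layout is an assumption, not from the patent.
    """
    # Stack colour and depth into a 4-channel observation; normalisation to [0, 1] is an assumption.
    obs = np.dstack([rgb.astype(np.float32) / 255.0, depth.astype(np.float32)])
    x = torch.from_numpy(obs).permute(2, 0, 1).unsqueeze(0)   # 1 x 4 x 200 x 200
    with torch.no_grad():
        out = policy_net(x).squeeze(0)
    u, v, alpha = out[:3].tolist()    # act on the mean, i.e. deterministically, at test time
    return u, v, alpha
```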
In the embodiment of the present invention, the training process includes the following steps:
(1) according to the current article grabbing environment state, the mechanical arm selects and executes grabbing actions according to an initial grabbing strategy; the initial grabbing strategy is obtained according to the selected reinforcement learning method;
(2) after the grabbing action is executed, the grabbing environment is transferred to a new state, and corresponding action rewards are obtained through a reward function;
(3) repeating the process until all the objects in the training environment are successfully grabbed;
(4) and obtaining a deep neural network model, namely a grabbing model.
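A hedged sketch of this interaction loop follows, assuming a Gym-style wrapper (env.reset / env.step) around the simulated scene, a policy object with a sample method, and a 0.99 discount factor; these interfaces and values are assumptions, not the patent's code.

```python
def collect_episode(env, policy, gamma=0.99):
    """One training episode following steps (1)-(4) above: act with the current
    grabbing policy, collect rewards, and stop once every object is grabbed."""
    states, actions, rewards = [], [], []
    state = env.reset()                                  # RGB-D observation of the grabbing scene
    done = False
    while not done:                                      # (3) repeat until all objects are grabbed
        action = policy.sample(state)                    # (1) grabbing action from the current policy
        state_next, reward, done, _ = env.step(action)   # (2) execute it, receive the action reward
        states.append(state); actions.append(action); rewards.append(reward)
        state = state_next
    # Discounted returns, later consumed by the PPO update.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.insert(0, g)
    return states, actions, returns                      # (4) training data for the grabbing model
```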
In the embodiment of the invention, the action reward function is designed as follows:
(The action reward function is given as an equation image, Figure BDA0003120319820000061, in the original publication.)
in the embodiment of the invention, a dense neural network DenseNet is adopted in both the execution network Actor and the evaluation network Critic in PPO, and the specific parameters are as follows:
a DenseNet-121 network is selected; its 121 layers comprise an initialization layer, densely connected layers, Transition Layers (TL) and a fully connected layer.
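As an illustration, such a DenseNet-121 Actor and Critic could be built with torchvision roughly as in the sketch below; the head dimensions, the 4-channel first convolution for the RGB-D input, and the 6-value actor output are assumptions, not specified in the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import densenet121

class DenseNetHead(nn.Module):
    """DenseNet-121 backbone (initialization layer, dense blocks, transition
    layers) followed by a small fully connected head, used for both networks."""
    def __init__(self, out_dim: int):
        super().__init__()
        backbone = densenet121()
        # Accept a 4-channel RGB-D image instead of the default 3-channel RGB.
        backbone.features.conv0 = nn.Conv2d(4, 64, kernel_size=7, stride=2,
                                            padding=3, bias=False)
        self.features = backbone.features
        self.head = nn.Linear(1024, out_dim)          # replaces the ImageNet classifier

    def forward(self, x):                             # x: N x 4 x 200 x 200
        h = F.relu(self.features(x))
        h = F.adaptive_avg_pool2d(h, 1).flatten(1)    # 1024-dimensional feature vector
        return self.head(h)

actor = DenseNetHead(out_dim=6)    # 3 means + 3 log-stds of the grabbing action
critic = DenseNetHead(out_dim=1)   # scalar state value
value = critic(torch.zeros(1, 4, 200, 200))           # example forward pass
```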
In the embodiment of the invention, the extraterrestrial detection sample grabbing physical experiment system comprises a target mechanical arm, a gripper, a camera, a target object, a box and a table;
the gripper is arranged at the front end of the target mechanical arm and used for grabbing a target object on the desktop;
the camera is arranged above the desktop and used for observing the target object to be grabbed;
the box is used for placing the target object after the gripper grabs it;
transferring the obtained grabbing model to the extraterrestrial exploration sample grabbing physical experiment system means establishing a one-to-one correspondence between grabbing poses in the simulation environment and grabbing poses in the physical environment;
in the physical experiment system, the pose of the camera relative to the mechanical arm base coordinate system is solved with a calibration plate, and the grabbing pose obtained in the simulation environment is converted into the mechanical arm base coordinate system, so that the mechanical arm is controlled to complete the sample grabbing.
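A minimal sketch of this coordinate conversion is given below, assuming a pinhole camera model: the pixel grabbing point is back-projected into the camera frame using its depth value and the camera intrinsics, then mapped into the arm base frame with the camera-to-base transform obtained from the calibration plate. The intrinsic values and the transform in the example are placeholders, not values from the patent.

```python
import numpy as np

def image_grasp_to_base(u, v, depth, K, T_base_cam):
    """Convert a grabbing point (u, v) in image coordinates plus its depth value
    into the mechanical arm base coordinate system.

    K          : 3x3 camera intrinsic matrix.
    T_base_cam : 4x4 homogeneous transform of the camera in the base frame,
                 solved beforehand with the calibration plate.
    """
    # Back-project the pixel into the camera frame.
    x_cam = (u - K[0, 2]) * depth / K[0, 0]
    y_cam = (v - K[1, 2]) * depth / K[1, 1]
    p_cam = np.array([x_cam, y_cam, depth, 1.0])
    # Express the point in the arm base frame.
    p_base = T_base_cam @ p_cam
    return p_base[:3]

# Example with placeholder values (not from the patent):
K = np.array([[525.0, 0.0, 100.0],
              [0.0, 525.0, 100.0],
              [0.0,   0.0,   1.0]])
T_base_cam = np.eye(4)
print(image_grasp_to_base(120, 80, 0.6, K, T_base_cam))
```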
In the embodiment of the invention, the trained neural network parameters are transferred to the physical environment for test verification, and the mechanical arm can keep updating the model through continuous interaction with the environment, realizing continual learning and continuously improving the success rate of sample collection.
Example:
Sample collection in Mars detection is used as the application background; learning and training of the whole digital-physical integrated process of grabbing, analyzing and boxing is carried out, and fully autonomous target finding, grabbing and fine operation are achieved. The sample collection training and grabbing process is shown in fig. 1. On the basis of the selected reinforcement learning training method and initial network, iterative training is repeatedly carried out to determine a suitable network structure, reward design and training hyper-parameters, so as to obtain the optimal grabbing-point network model and finally realize mechanical arm sampling control.
Step 1: design of the reinforcement-learning-based extraterrestrial exploration sample collection method.
The grab task is regarded as a Markov decision process: given the state s_t at time t, the mechanical arm selects and executes an action a_t according to the policy π(s_t), transitions to a new state s_{t+1}, and obtains a corresponding reward r_t. The grabbing task is to find the policy that maximizes the accumulated reward
R_t = Σ_{k=0}^{∞} γ^k · r_{t+k},
where γ is the discount factor. The test adopts the proximal policy optimization method PPO, whose optimization problem is defined as shown in formula (1):
max_θ  E_t[ (π_θ(a_t|s_t) / π_θold(a_t|s_t)) · A_t - β · KL(π_θold(·|s_t), π_θ(·|s_t)) ]        (1)
where θ_old denotes the policy parameter vector before the update, A_t denotes the estimated advantage function at time t, and β denotes the penalty coefficient of the KL divergence. The optimal parameters can be solved directly by gradient descent.
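As an illustration, formula (1) could be implemented as a loss for gradient descent roughly as follows. This is a generic PPO-with-KL-penalty sketch using PyTorch Gaussian policies; the tensor shapes and example values are assumptions, not the patent's implementation.

```python
import torch
from torch.distributions import Normal, kl_divergence

def ppo_kl_penalty_loss(new_dist: Normal, old_dist: Normal,
                        actions: torch.Tensor, advantages: torch.Tensor,
                        beta: float) -> torch.Tensor:
    """Negative of objective (1), so that gradient descent maximizes it."""
    # Probability ratio pi_theta(a|s) / pi_theta_old(a|s).
    log_ratio = new_dist.log_prob(actions).sum(-1) - old_dist.log_prob(actions).sum(-1)
    ratio = log_ratio.exp()
    # KL divergence between the old and new policies, penalized with coefficient beta.
    kl = kl_divergence(old_dist, new_dist).sum(-1)
    objective = ratio * advantages - beta * kl
    return -objective.mean()

# Example with placeholder tensors (batch of 8 three-dimensional actions):
old = Normal(torch.zeros(8, 3), torch.ones(8, 3))
new = Normal(torch.zeros(8, 3, requires_grad=True), torch.ones(8, 3))
a = old.sample()
adv = torch.randn(8)
loss = ppo_kl_penalty_loss(new, old, a, adv, beta=1.0)
loss.backward()
```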
In deep learning networks, increasing the network depth easily causes gradient vanishing. The densely connected network (DenseNet) establishes connections between different layers; through these connections it achieves a structure that outperforms ResNet, further alleviates the gradient-vanishing problem, keeps the network narrow, greatly reduces the number of parameters, and helps suppress over-fitting, so the policy network is designed on the basis of DenseNet. Because the action space of the mechanical arm is continuous, direct discretization leads to poor policy-training performance or the curse of dimensionality; therefore PPO is combined with DenseNet to directly output the mean and variance of the continuous grabbing-pose control quantity f. Considering the planar grabbing task on the planetary surface, the position (x, y) of the mechanical arm gripper at the moment of grabbing and its rotation angle α about the z axis are obtained by randomly sampling the continuous control quantity f.
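A minimal sketch of this sampling step follows, assuming the actor outputs three means followed by three log-standard-deviations for f; this 6-value layout matches the DenseNet sketch above and is an assumption, as the patent only states that the mean and variance of f are produced.

```python
import torch
from torch.distributions import Normal

def sample_grasp_action(actor_output: torch.Tensor):
    """Randomly sample the continuous control quantity f = (x, y, alpha) from the
    Gaussian whose statistics the actor outputs (3 means then 3 log-stds)."""
    mean, log_std = actor_output[:3], actor_output[3:]
    dist = Normal(mean, log_std.exp())
    f = dist.sample()                        # random sampling of f
    x, y, alpha = f.tolist()                 # grabbing position (x, y) and rotation about z
    return (x, y, alpha), dist.log_prob(f).sum()

# Example with a placeholder actor output (zero means, unit standard deviations):
(x, y, alpha), log_prob = sample_grasp_action(torch.zeros(6))
```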
Step 2: construction of the extraterrestrial exploration sample acquisition simulation training environment.
Directly using the mechanical arm for physical training is costly and inefficient, so a digital simulation environment is constructed with the open-source multi-platform robot simulation software Webots, and training is carried out based on the reinforcement learning algorithm and the deep neural network of step 1. By designing the reward function and the network structure, the deep neural network takes RGB-D images as input and outputs the corresponding action states.
The mechanical arm grabbing training simulation system is built on the existing UR5 seven-degree-of-freedom mechanical arm model in Webots; each joint angle can be controlled in joint space, the pose of the arm's end effector relative to the base can be controlled in Cartesian space, and the arm's own inverse kinematics is used to solve the corresponding joint-space displacements. Considering that the shapes of the objects to be manipulated are irregular, a collaborative mechanical arm with a three-finger gripper at its end is adopted to grab and place objects, and the gripper is closed by controlling the jaw angle, which improves the grabbing success rate. In addition, models of the target objects, the box and the desktop are built in the simulation environment, as shown in FIG. 2. To compensate for information missing due to the viewing angle, the RGB-D image of the objects acquired by a depth camera is used as the input of the policy network, i.e., the state of the Markov decision process. The camera is mounted pointing 45° downward, and the image size is 200 × 200 × 4.
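As an illustration only, a Webots controller for such a scene could read the RGB and depth streams and command the arm roughly as follows; the device names ("camera", "range-finder", "shoulder_pan_joint") depend on the actual world file and are assumptions, since the patent does not give them.

```python
from controller import Robot   # Webots Python controller API

robot = Robot()
timestep = int(robot.getBasicTimeStep())

# Device names are placeholders and must match the Webots world file.
camera = robot.getDevice("camera")                 # RGB camera
range_finder = robot.getDevice("range-finder")     # depth image source
camera.enable(timestep)
range_finder.enable(timestep)
shoulder = robot.getDevice("shoulder_pan_joint")   # one UR5 joint motor

while robot.step(timestep) != -1:
    rgb = camera.getImage()                        # raw BGRA bytes
    depth = range_finder.getRangeImage()           # list of float depths
    # ...feed (rgb, depth) to the policy network, map its (x, y, alpha) output to
    # joint targets via inverse kinematics, then command the motors, e.g.:
    shoulder.setPosition(0.0)                      # placeholder joint command
```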
Step 3: physical test of the reinforcement-learning-based extraterrestrial exploration sample collection method.
The neural network parameters obtained from the training of step 2 are transferred to the physical environment for test verification; through continuous interaction with the environment, the robot keeps updating the model, realizing continual learning and continuously improving the success rate of sample collection.
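A minimal sketch of that continual-learning loop on the physical system is given below; the buffer handling, the update cadence, and the collect_fn / update_fn hooks (for example the collect_episode and ppo_kl_penalty_loss sketches above) are assumptions, not the patent's code.

```python
def continual_update(env, policy, optimizer, collect_fn, update_fn, episodes_per_update=5):
    """Keep refining the simulation-trained grabbing model on the physical arm.

    collect_fn gathers one real grabbing episode; update_fn performs one PPO
    update from the collected episodes. Both are injected assumptions.
    """
    buffer, episode = [], 0
    while True:                                    # run for as long as sampling continues
        buffer.append(collect_fn(env, policy))     # interact with the real environment
        episode += 1
        if episode % episodes_per_update == 0:
            update_fn(policy, optimizer, buffer)   # continual learning: refresh the model online
            buffer.clear()
```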
On this basis, a pile of objects with complex shapes is placed into a 20 × 10 cm box in a given order and pose, the limited space being required to hold as many objects as possible. In this unstructured environment it is difficult to obtain the accurate position, pose and shape of each object directly; the model trained in simulation is transferred to the actual scene to grab arbitrary irregular stones directly, stones are continually added during grabbing, and the grabbing success rate reaches 83.33%. The grabbing results in the real environment are shown in fig. 3.
Those skilled in the art will appreciate that those matters not described in detail in the present specification are well known in the art.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to limit the present invention, and those skilled in the art can make variations and modifications of the present invention without departing from the spirit and scope of the present invention by using the methods and technical contents disclosed above.

Claims (10)

1. An end-to-end intelligent capture method for an extraterrestrial exploration sample is characterized by comprising the following steps:
selecting a reinforcement learning method;
constructing an extraterrestrial exploration sample acquisition simulation training environment;
performing digital training in the constructed simulation training environment to obtain a grabbing model;
and transferring the obtained grabbing model to an extraterrestrial exploration sample grabbing physical experiment system, and carrying out an extraterrestrial exploration sample acquisition physical experiment based on reinforcement learning, so as to finish end-to-end extraterrestrial exploration sample grabbing.
2. The method for intelligently grabbing the end-to-end extra-terrestrial exploration samples according to claim 1, is characterized in that: the proximal policy optimization method PPO is adopted as the selected reinforcement learning method.
3. The method for intelligently grabbing the end-to-end extra-terrestrial exploration samples according to claim 1, is characterized in that: and constructing an extraterrestrial detection sample acquisition simulation training environment by adopting multi-platform robot simulation software Webots.
4. The method for intelligently grabbing the end-to-end extra-terrestrial exploration samples according to claim 3, wherein the method comprises the following steps: when the extraterrestrial detection sample acquisition simulation training environment is established, models of a target mechanical arm, a gripper, a camera, a target object, a box and a desktop are established;
the gripper is arranged at the front end of the target mechanical arm and used for grabbing a target object on the desktop;
the camera is arranged above the desktop and used for observing the target object to be grabbed;
the box is used for placing the target object after the gripper grabs it.
5. The method for intelligently grabbing the end-to-end extra-terrestrial exploration samples according to claim 1, is characterized in that: the digital training is specifically as follows: by designing a reward function and a network structure, a deep neural network is trained, RGB-D images obtained through a camera are input, and the optimal grabbing pose under a corresponding image coordinate system is output.
6. The method for intelligently grabbing the end-to-end extra-terrestrial exploration samples according to claim 5, wherein the method comprises the following steps: the reward function is as follows:
(The reward function is given as an equation image, Figure FDA0003120319810000021, in the original publication.)
in PPO, the dense neural network DenseNet is adopted for both the execution network Actor and the evaluation network Critic, with the following specific parameters: a DenseNet-121 network is selected, whose 121 layers comprise an initialization layer, densely connected layers, transition layers and a fully connected layer.
7. The method for intelligently grabbing end-to-end extra-terrestrial exploration samples according to claim 6, wherein the training process comprises the following steps:
(1) according to the current article grabbing environment state, the mechanical arm selects and executes grabbing actions according to an initial grabbing strategy; the initial grabbing strategy is obtained according to the selected reinforcement learning method;
(2) after the grabbing action is executed, the grabbing environment is transferred to a new state, and corresponding action rewards are obtained through a reward function;
(3) repeating the process until all the objects in the training environment are successfully grabbed;
(4) and obtaining a deep neural network model, namely a grabbing model.
8. The method for intelligently grabbing the end-to-end extra-terrestrial exploration samples according to claim 1, is characterized in that: the extraterrestrial detection sample grabbing physical experiment system comprises a target mechanical arm, a gripper, a camera, a target object, a box and a table;
the gripper is arranged at the front end of the target mechanical arm and used for grabbing a target object on the desktop;
the camera is arranged above the desktop and used for observing the target object to be grabbed;
the box is used for placing the target object after the gripper grabs it;
transferring the obtained grabbing model to the extraterrestrial exploration sample grabbing physical experiment system means establishing a one-to-one correspondence between grabbing poses in the simulation environment and grabbing poses in the physical experiment environment;
in the physical test environment, the pose of the camera relative to the mechanical arm base coordinate system is solved with a calibration plate, and the grabbing pose obtained in the simulation environment is converted into the mechanical arm base coordinate system, so that the mechanical arm is controlled to complete the sample grabbing.
9. The method for intelligently grabbing the end-to-end extra-terrestrial exploration samples according to claim 1, is characterized in that: the sample collection physical test for the extraterrestrial exploration based on reinforcement learning specifically comprises the following steps: the neural network parameters obtained by training are tested and verified in the migration physical environment, and the mechanical arm continuously updates the grabbing model through continuous interaction with the environment, so that continuous learning is realized, and the success rate of sample collection is improved.
10. An intelligent capture system for the extraterrestrial exploration sample, which is implemented by the intelligent capture method for the end-to-end extraterrestrial exploration sample according to any one of claims 1 to 9, and is characterized by comprising:
the reinforcement learning method determination module: selecting the proximal policy optimization method PPO as the reinforcement learning method;
the simulation training environment construction module: constructing an extraterrestrial exploration sample acquisition simulation training environment with the multi-platform robot simulation software Webots; establishing models of a target mechanical arm, a gripper, a camera, a target object, a box and a desktop; the gripper is arranged at the front end of the target mechanical arm and used for grabbing a target object on the desktop; the camera is arranged above the desktop and used for observing the target object to be grabbed; the box is used for placing the target object after the gripper grabs it;
a training module: in the constructed simulation training environment, carrying out digital training to obtain a grabbing model, which specifically comprises the following steps: training a deep neural network by designing a reward function and a network structure, inputting an RGB-D image obtained by a camera, and outputting a corresponding optimal grabbing pose;
the reward function is as follows:
(The reward function is given as an equation image, Figure FDA0003120319810000031, in the original publication.)
in PPO, the dense neural network DenseNet is adopted for both the execution network Actor and the evaluation network Critic, with the following specific parameters: a DenseNet-121 network is selected, whose 121 layers comprise an initialization layer, densely connected layers, transition layers and a fully connected layer;
a test verification module: and transferring the obtained grabbing model to an extraterrestrial exploration sample grabbing physical experiment system, and carrying out an extraterrestrial exploration sample acquisition physical experiment based on reinforcement learning, so as to finish the end-to-end extraterrestrial exploration sample grabbing pose intelligent selection.
CN202110674012.2A 2021-06-17 2021-06-17 End-to-end intelligent capture method for extraterrestrial exploration sample Active CN113524173B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110674012.2A CN113524173B (en) 2021-06-17 2021-06-17 End-to-end intelligent capture method for extraterrestrial exploration sample

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110674012.2A CN113524173B (en) 2021-06-17 2021-06-17 End-to-end intelligent capture method for extraterrestrial exploration sample

Publications (2)

Publication Number Publication Date
CN113524173A true CN113524173A (en) 2021-10-22
CN113524173B CN113524173B (en) 2022-12-27

Family

ID=78125077

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110674012.2A Active CN113524173B (en) 2021-06-17 2021-06-17 End-to-end intelligent capture method for extraterrestrial exploration sample

Country Status (1)

Country Link
CN (1) CN113524173B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114888801A (en) * 2022-05-16 2022-08-12 南京邮电大学 Mechanical arm control method and system based on offline strategy reinforcement learning

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111496794A (en) * 2020-04-29 2020-08-07 华中科技大学 Kinematics self-grabbing learning method and system based on simulation industrial robot
CN112102405A (en) * 2020-08-26 2020-12-18 东南大学 Robot stirring-grabbing combined method based on deep reinforcement learning
CN112338921A (en) * 2020-11-16 2021-02-09 西华师范大学 Mechanical arm intelligent control rapid training method based on deep reinforcement learning
CN112605983A (en) * 2020-12-01 2021-04-06 浙江工业大学 Mechanical arm pushing and grabbing system suitable for intensive environment
CN112631131A (en) * 2020-12-19 2021-04-09 北京化工大学 Motion control self-generation and physical migration method for quadruped robot

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111496794A (en) * 2020-04-29 2020-08-07 华中科技大学 Kinematics self-grabbing learning method and system based on simulation industrial robot
CN112102405A (en) * 2020-08-26 2020-12-18 东南大学 Robot stirring-grabbing combined method based on deep reinforcement learning
CN112338921A (en) * 2020-11-16 2021-02-09 西华师范大学 Mechanical arm intelligent control rapid training method based on deep reinforcement learning
CN112605983A (en) * 2020-12-01 2021-04-06 浙江工业大学 Mechanical arm pushing and grabbing system suitable for intensive environment
CN112631131A (en) * 2020-12-19 2021-04-09 北京化工大学 Motion control self-generation and physical migration method for quadruped robot

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114888801A (en) * 2022-05-16 2022-08-12 南京邮电大学 Mechanical arm control method and system based on offline strategy reinforcement learning
CN114888801B (en) * 2022-05-16 2023-10-13 南京邮电大学 Mechanical arm control method and system based on offline strategy reinforcement learning

Also Published As

Publication number Publication date
CN113524173B (en) 2022-12-27

Similar Documents

Publication Publication Date Title
CN112102405B (en) Robot stirring-grabbing combined method based on deep reinforcement learning
CN111203878B (en) Robot sequence task learning method based on visual simulation
CN111515961B (en) Reinforcement learning reward method suitable for mobile mechanical arm
Xu et al. An end-to-end differentiable framework for contact-aware robot design
Kiatos et al. Robust object grasping in clutter via singulation
CN106651949A (en) Teleoperation method and system for grabbing objects using space mechanical arm based on simulation
CN113826051A (en) Generating digital twins of interactions between solid system parts
CN111325768B (en) Free floating target capture method based on 3D vision and simulation learning
Zhou et al. 6dof grasp planning by optimizing a deep learning scoring function
CN112605983A (en) Mechanical arm pushing and grabbing system suitable for intensive environment
CN112183188B (en) Method for simulating learning of mechanical arm based on task embedded network
McConachie et al. Bandit-based model selection for deformable object manipulation
CN113524173B (en) End-to-end intelligent capture method for extraterrestrial exploration sample
CN111152227A (en) Mechanical arm control method based on guided DQN control
Si et al. Grasp stability prediction with sim-to-real transfer from tactile sensing
CN113650015A (en) Dynamic task planning method for lunar surface sampling mechanical arm
CN114131603B (en) Deep reinforcement learning robot grabbing method based on perception enhancement and scene migration
CN111814823A (en) Transfer learning method based on scene template generation
Li et al. Hierarchical learning from demonstrations for long-horizon tasks
Zhou et al. Robot manipulator visual servoing based on image moments and improved firefly optimization algorithm-based extreme learning machine
US11383386B2 (en) Robotic drawing
CN116541701A (en) Training data generation method, intelligent body training device and electronic equipment
Guo et al. Learning pushing skills using object detection and deep reinforcement learning
CN111331598A (en) Robot attitude control method based on genetic algorithm optimization neural network structure
CN114882113A (en) Five-finger mechanical dexterous hand grabbing and transferring method based on shape correspondence of similar objects

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant