CN113232019A - Mechanical arm control method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN113232019A (Application number CN202110521680.1A)
- Authority
- CN
- China
- Prior art keywords
- determining
- mechanical arm
- path
- pose information
- target object
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1656—Programme controls characterised by programming, planning systems for manipulators
- B25J9/1664—Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1694—Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
- B25J9/1697—Vision controlled systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
Abstract
The application provides a mechanical arm control method and device, an electronic device, and a storage medium. A target object image corresponding to an object to be taken is acquired; type information and pose information of the object to be taken are determined according to the target object image; a grabbing path is determined according to the type information and the pose information, the grabbing path being a moving path from the current position of the mechanical arm to the position of the object to be taken; and each joint of the mechanical arm is controlled to perform angle adjustment according to the grabbing path so as to grab the object to be taken. The vision sensor is combined with the mechanical arm, and the grabbing of objects whose structure is not fixed is completed under visual guidance; the grabbing precision is high and stable, the restriction conditions are few, the adaptability is strong, and the method is relatively flexible.
Description
Technical Field
The present disclosure relates to the field of robot arm grabbing control, and in particular, to a robot arm control method and apparatus, an electronic device, and a storage medium.
Background
As technology continues to advance, industrial robots are moving into factories to replace human work.
At present, a mechanical arm grabs a planar object with a fixed structure according to a set control program so as to improve the working efficiency.
However, when the structure of the planar object changes even slightly, the mechanical arm cannot continue to operate, and the control program must be reset.
Disclosure of Invention
The application provides a mechanical arm control method, a mechanical arm control device, electronic equipment and a storage medium, which are used for solving the problem that a mechanical arm cannot continue to work when the structure of a planar object changes.
In a first aspect, the present application provides a method for controlling a robot arm, the method including:
acquiring a target object image corresponding to an object to be taken;
determining type information and pose information of the object to be taken according to the target object image;
determining a grabbing path according to the type information and the pose information, wherein the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be taken;
and controlling each joint of the mechanical arm to perform angle adjustment according to the grabbing path so as to grab the object to be taken.
Optionally, determining type information of the object to be taken according to the target object image includes:
obtaining a classification result by using a pre-trained classification model according to the target object image;
and determining the type information of the object to be taken according to the classification result.
Optionally, determining pose information of the object to be taken according to the target object image includes:
determining the characteristics of a target image according to the target object image;
and determining the pose information of the object to be taken by using a pose calculation model according to the characteristics of the target image.
Optionally, determining a grab path according to the type information and the pose information includes:
acquiring current pose information of each joint of the mechanical arm;
and determining a grabbing path by using a decision model according to the current pose information, the type information and the pose information of the object to be taken.
Optionally, determining the grabbing path by using the decision model according to the current pose information and the type information and pose information of the object to be taken includes:
the decision model is Q; the number of iterations is assumed to be Rounds, where Rounds is a positive integer; the batch_size for batch gradient descent is m; and the maximum size of the experience replay pool is n;
taking the current pose information and the type information and pose information of the object to be taken as the state vector φ(S) in the state S, wherein the state S is an initialization state;
executing the current action A in the state S to obtain the next state S′, the feature vector φ(S′) corresponding to the next state S′, the reward R, and the termination flag is_end;
adding the quintuple (φ(S), A, R, φ(S′), is_end) to the experience replay pool; if the size of the experience replay pool is larger than m, sampling a batch from the experience replay pool and updating the network parameters in the decision model; and if the size of the experience replay pool is larger than n, removing the earliest-added quintuple from the experience replay pool before adding the new quintuple;
updating the state S to the state S′;
judging whether is_end indicates the final state; if not, continuing the loop of randomly taking samples from the experience replay pool; if so, ending the loop to obtain the final decision model;
and determining the grabbing path according to the final decision model.
Optionally, the state vector φ(S) further comprises a specific scene, wherein the specific scene comprises a scene in which the structure of the object to be taken is not fixed.
Optionally, according to the grabbing path, each joint of the mechanical arm is controlled to perform angle adjustment so as to grab the object to be taken, including:
obtaining the motion trail of the mechanical arm by using a smooth trail interpolation method according to the grabbing path;
and controlling each joint of the mechanical arm to adjust the angle according to the motion trail so as to grab the object to be picked.
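The patent names a "smooth trajectory interpolation method" without specifying which scheme is used, so the following is only a hedged sketch: it assumes cubic time scaling between joint-angle waypoints, which gives zero velocity at the endpoints of each segment. All waypoint values are made up for illustration.

```python
# Hypothetical sketch of the smooth-trajectory-interpolation step; cubic time
# scaling between joint-angle waypoints is an assumption, not taken from the patent.

def cubic_blend(s):
    """Cubic time scaling 3s^2 - 2s^3: zero velocity at both segment endpoints."""
    return 3 * s**2 - 2 * s**3

def interpolate_joint_path(waypoints, steps_per_segment=10):
    """Expand a sparse list of joint-angle tuples into a denser motion trail."""
    trail = []
    for start, end in zip(waypoints, waypoints[1:]):
        for k in range(steps_per_segment):
            s = cubic_blend(k / steps_per_segment)
            trail.append(tuple(a + s * (b - a) for a, b in zip(start, end)))
    trail.append(tuple(waypoints[-1]))          # include the final waypoint exactly
    return trail

# Three hypothetical waypoints for a 3-joint arm (radians)
path = [(0.0, 0.0, 0.0), (0.3, 0.6, -0.2), (0.5, 1.0, 0.1)]
trail = interpolate_joint_path(path, steps_per_segment=5)
```

Each joint then tracks its own interpolated angle sequence, so all joints start and stop together without velocity jumps at the waypoints.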
Optionally, the method further comprises:
and according to the target object image, if the type of the target object cannot be determined, acquiring the target object image again through the vision sensor.
In a second aspect, the present application provides an arm control apparatus, the apparatus comprising:
the acquisition module is used for acquiring a target object image corresponding to a to-be-taken object;
the processing module is used for determining the type information and the pose information of the object to be taken according to the target object image;
the processing module is further used for determining a grabbing path according to the type information and the pose information, wherein the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be taken;
and the processing module is also used for controlling each joint of the mechanical arm to carry out angle adjustment according to the grabbing path so as to grab the object to be grabbed.
In a third aspect, the present application provides an electronic device, comprising: a memory, a processor;
a memory for storing processor-executable instructions; and
a processor for implementing the robot arm control method according to the first aspect and the alternative aspects, according to executable instructions stored in a memory.
In a fourth aspect, the present application provides a computer-readable storage medium having computer-executable instructions stored thereon, where the computer-executable instructions are executed by a processor to implement the robot arm control method according to the first aspect and the alternative.
In a fifth aspect, the present application provides a computer program product comprising instructions which, when executed by a processor, implement the robot arm control method of the first aspect and the alternatives.
The application provides a mechanical arm control method and device, an electronic device, and a storage medium. A target object image corresponding to an object to be taken is acquired; type information and pose information of the object to be taken are determined according to the target object image; a grabbing path is determined according to the type information and the pose information, the grabbing path being a moving path from the current position of the mechanical arm to the position of the object to be taken; and each joint of the mechanical arm is controlled to perform angle adjustment according to the grabbing path so as to grab the object to be taken. The vision sensor is combined with the mechanical arm, and the grabbing of objects whose structure is not fixed is completed under visual guidance; the grabbing precision is high and stable, the restriction conditions are few, the adaptability is strong, and the method is relatively flexible.
Drawings
FIG. 1 is a schematic view of a robotic arm control system shown herein according to an exemplary embodiment;
FIG. 2 is a flow diagram illustrating a method for controlling a robotic arm according to an exemplary embodiment of the present application;
FIG. 3 is a flow diagram illustrating a method of controlling a robotic arm according to another exemplary embodiment of the present application;
FIG. 4 is a schematic diagram illustrating the construction of a robot arm control apparatus according to an exemplary embodiment of the present application;
fig. 5 is a schematic diagram of a hardware structure of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the present application, and it is obvious that the described embodiments are some, but not all embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As technology continues to advance, industrial robots are moving into factories to replace human work. A robotic arm is a mechanical structure that mimics a human hand, such as a planar multi-joint robot, a palletizer, and the like. The mechanical arm generally has a plurality of joint arms and an execution end arranged on the last joint arm, various execution components are arranged on the execution end, and the execution end is moved to a specified coordinate in space through automatic control to realize functions provided by the execution components, such as writing, grabbing, testing and the like.
At present, a mechanical arm grabs a planar object with a fixed structure according to a set control program so as to improve the working efficiency.
However, when the structure of the planar object changes even slightly, the mechanical arm cannot continue to operate, and the control program must be reset.
In order to solve the above problems, the application provides a mechanical arm control method: a target object image corresponding to an object to be taken is obtained through a vision sensor, and the type information and pose information of the object to be taken in the target object image are obtained using deep learning and computer image processing methods. The pose information is then converted into pose information under a universal coordinate system, the grabbing path of the mechanical arm is automatically calculated using a decision model, each joint of the mechanical arm is controlled to adjust its angle according to the grabbing path, and the mechanical arm is controlled to grab the object to be taken. The vision sensor is combined with the mechanical arm, and the grabbing of objects whose structure is not fixed is completed under visual guidance; the grabbing precision is high and stable, the restriction conditions are few, the adaptability is strong, and the method is relatively flexible.
FIG. 1 is a schematic view of a robotic arm control system according to an exemplary embodiment of the present application. As shown in fig. 1, the robot arm control system provided in this embodiment includes: a vision sensor 110, a master server 120, and a robotic arm 130. The vision sensor 110 is configured to acquire a target object image corresponding to an object to be acquired, and send the target object image to the main control server 120. The main control server 120 receives the target object image sent by the vision sensor 110, determines a motion path from the current position of the mechanical arm 130 to the position of the object to be taken according to the target object image, controls each joint of the mechanical arm 130 to perform angle adjustment according to the motion path, and sends a control signal to the mechanical arm 130. The mechanical arm 130 is configured to receive a control signal sent by the main control server 120, and perform angle adjustment on each joint according to the control signal to grab an object to be picked.
Fig. 2 is a flow chart diagram illustrating a robot arm control method according to an exemplary embodiment of the present application. As shown in fig. 2, the robot arm control method provided in this embodiment is based on the robot arm control system shown in fig. 1, and includes the following steps:
s101, obtaining a target object image corresponding to the object to be acquired.
More specifically, the target object image is an RGB picture in a three-dimensional space. One or more vision sensors shoot the object to be taken to obtain a target object image at a corresponding angle, and the target object image is sent to the master control server. Wherein the vision sensor comprises an RGB video camera or an industrial camera. The vision sensor can shoot and acquire target object images from one or more angles, and the multi-angle target object images can determine the type information and the pose information of the object to be taken from multiple angles, so that the mechanical arm can grab the object more accurately. The master control server receives a target object image corresponding to an object to be taken.
And S102, determining the type information and the pose information of the object to be taken according to the target object image.
More specifically, the master control server determines type information of the object to be taken by using the classification model according to the target object image information sent by the vision sensor, wherein the type information comprises a type number. And inputting the image information of the target object into a classification model, and outputting the type number of the object to be taken by the classification model so as to determine the type information of the object to be taken. The master control server obtains relative coordinate positions of at least four three-dimensional space points according to target object image information sent by the vision sensor, and the pose calculation model is used for determining the poses of the three-dimensional space points under the vision sensor. And converting the pose under the vision sensor into pose information under a universal coordinate system. The pose information comprises spatial position information and direction information of the object to be taken. The spatial position information of the object to be taken is the spatial coordinates of a preset number of points on the surface of the object to be taken, and the preset number of points comprises at least four three-dimensional spatial points.
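Step S102 mentions converting the pose under the vision sensor into pose information under a universal coordinate system. A minimal sketch of that conversion, assuming the camera-to-base extrinsic transform is already known from calibration, applies a 4×4 homogeneous transform to each three-dimensional space point; the extrinsic matrix below is a made-up calibration result, not taken from the patent.

```python
# Illustrative camera-frame to base-frame conversion; the extrinsic matrix is
# a hypothetical calibration result used only for demonstration.

def transform_point(T, p):
    """Apply a 4x4 homogeneous transform T to a 3-D point p, returning (x, y, z)."""
    x, y, z = p
    v = (x, y, z, 1.0)
    return tuple(sum(T[i][k] * v[k] for k in range(4)) for i in range(3))

# Example extrinsics: camera rotated 180 degrees about x, offset 0.5 m along base z
T_base_from_cam = [
    [1.0,  0.0,  0.0, 0.0],
    [0.0, -1.0,  0.0, 0.0],
    [0.0,  0.0, -1.0, 0.5],
    [0.0,  0.0,  0.0, 1.0],
]

p_cam = (0.1, 0.2, 0.3)                     # a surface point in the camera frame
p_base = transform_point(T_base_from_cam, p_cam)
```

In practice each of the at least four surface points would be transformed this way, giving the spatial position information of the object to be taken in the universal coordinate system.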
S103, determining a grabbing path according to the type information and the pose information, wherein the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be fetched.
More specifically, the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be taken, and the moving path also comprises moving paths of all joints of the mechanical arm. And determining the motion path of each joint of the mechanical arm at each moment by adopting a reinforcement learning algorithm according to the type information and the pose information until the mechanical arm successfully grabs the object to be taken.
For a multi-joint mechanical arm, many joint-angle configurations allow the arm to grab the object to be taken, so the grabbing path has countless solutions. The traditional approach generally performs path planning with sampling-based planning, a method that does not seek an optimal solution but quickly finds a feasible one; finding the optimal solution takes more time because there are countless paths. A feasible solution, however, may not be the optimal one, meaning that when the machine grabs the object along the calculated path, a non-optimal grabbing path is not the shortest, i.e., a single joint of the mechanical arm, or several joints, must rotate through unnecessary angles. Meanwhile, because the method is sampling-based, when interpolation sampling has errors the sampled plan is sometimes an invalid path for the actual mechanical arm, i.e., a path that cannot be executed. Therefore, this embodiment adopts a reinforcement learning algorithm to complete end-to-end grabbing-path planning for the real-time scene, determining each decision action of each joint of the mechanical arm so as to obtain an optimal complete path.
And S104, controlling each joint of the mechanical arm to adjust the angle according to the grabbing path so as to grab the object to be grabbed.
More specifically, assuming the time taken by the mechanical arm from the start of grabbing to its completion is t, the grabbing path includes the movement path of the mechanical arm at times 1, 2, …, t. At times 1, 2, …, t−1, each joint of the mechanical arm is controlled to adjust its angle according to the grabbing path at the corresponding time; at time t, while the joint angles are adjusted, the execution end of the mechanical arm is controlled to grab the object to be taken.
In the method provided by this embodiment, a target object image corresponding to an object to be taken is acquired; type information and pose information of the object to be taken are determined according to the target object image; a grabbing path is determined according to the type information and the pose information, the grabbing path being a moving path from the current position of the mechanical arm to the position of the object to be taken; and each joint of the mechanical arm is controlled to perform angle adjustment according to the grabbing path so as to grab the object to be taken. The vision sensor is combined with the mechanical arm, and the grabbing of objects whose structure is not fixed is completed under visual guidance; the grabbing precision is high and stable, the restriction conditions are few, the adaptability is strong, and the method is relatively flexible.
Fig. 3 is a flow chart diagram illustrating a robot arm control method according to another exemplary embodiment of the present application. As shown in fig. 3, the method for controlling a robot arm provided in this embodiment includes the following steps:
s201, obtaining a target object image corresponding to the object to be taken.
Step S201 is similar to the step S101 in the embodiment of fig. 2, and this embodiment is not described herein again.
S202, obtaining a classification result by using a pre-trained classification model according to the target object image; and determining the type information of the object to be taken according to the classification result.
More specifically, the pre-trained classification model may be a Yolo model, a Convolutional Neural Network (CNN) model, a Mask R-CNN (Region-based CNN) model, a Fast/Faster R-CNN model, or the like.
Taking the CNN model as an example, high-dimensional features of the target object image are extracted through convolution; different types of objects to be taken have different features, i.e., the features can be extracted from the RGB images captured by the vision sensor using the convolution operations of the neural network. The fully connected layer of the model then classifies these features, and each judgment is rewarded or penalized against prior knowledge (from manual labeling performed in advance), so that the convolutional neural network learns under supervision and is continuously optimized, i.e., adjusted to an optimal parameter state, yielding the pre-trained classification model. The target object image is input into the pre-trained classification model, which outputs a classification result, and the type information of the object to be taken is determined according to the classification result.
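The CNN itself is not reproduced here; the following hedged sketch shows only the final step described above, turning the classifier's output scores into the type number of the object to be taken. The score vector and type table are hypothetical placeholders standing in for the network's output and the factory's part catalogue.

```python
# Hypothetical post-processing of classifier scores; the scores and type table
# are placeholders, not values from the patent.
from math import exp

def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable form)."""
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def type_from_scores(scores, type_table):
    """Pick the highest-probability class and look up its type information."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return type_table[best], probs[best]

TYPE_TABLE = {0: "type-01 bolt", 1: "type-02 bracket", 2: "type-03 gear"}
type_info, confidence = type_from_scores([0.2, 2.9, 0.4], TYPE_TABLE)
```

A low maximum probability here would correspond to the optional branch in the method where the type cannot be determined and the image is reacquired.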
S203, determining the characteristics of the target image according to the target object image; and determining the pose information of the object to be taken by using a pose calculation model according to the characteristics of the target image.
More specifically, the pose calculation model may be a Perspective-N-Point (PNP) model. The fully connected layers of the classification model in step S202 are removed, and the remaining convolutional layers are used as a feature extraction model. The target object image is input into the feature extraction model, which outputs the target image features. These features are compared with the features extracted from the previously hand-labeled RGB images so that the two sets can be matched one by one. The matched target image features comprise the relative coordinate positions of at least four feature points in three-dimensional space. The matched target image features are input into the PNP model to determine the poses of the feature points in three-dimensional space under the vision sensor, and the pose under the vision sensor is then converted into pose information under the universal coordinate system. The pose information comprises the spatial position information and direction information of the object to be taken; the spatial position information is the spatial coordinates of a preset number of points on the surface of the object to be taken, the preset number comprising at least four three-dimensional space points.
In this embodiment, the steps S202 and S203 are not limited by the described operation sequence, and the steps S202 and S203 may be performed in other sequences or simultaneously.
And S204, acquiring current pose information of each joint of the mechanical arm.
More specifically, the current pose information includes spatial position information and orientation information where the mechanical arm is currently located.
And S205, determining a grabbing path by using a decision model according to the current pose information, the type information and the pose information of the object to be taken.
More specifically, the decision model includes a Deep Reinforcement Learning (DRL) algorithm. DRL-related algorithms include the Deep Q-Network (DQN) algorithm and the Q-Learning (QL) algorithm. The DQN algorithm is one of the deep reinforcement learning algorithms; it combines deep learning with reinforcement learning to realize end-to-end learning from perception to action. The QL algorithm is a classical reinforcement learning algorithm, but because it requires a huge Q table, it occupies enormous memory in a high-dimensional space and does not converge easily. This embodiment therefore uses the DQN algorithm. Unlike most previous collision detection algorithms, the DQN algorithm is model-free and does not need a model of each scene. The grabbing path is determined with the DQN algorithm according to the current pose information and the type information and pose information of the object to be taken, so as to realize end-to-end control of the multi-joint mechanical arm.
The action value function of the DQN algorithm is approximated by a neural network, which is a nonlinear approximation; a network structure of three convolutional layers and two fully connected layers is adopted. The decision model is formulated as Q(φ(S), A, θ). Updating the network is in fact updating the parameter θ; once θ is determined, the network parameters are determined.
The main characteristic of the DQN algorithm is the introduction of experience replay: the quintuple (φ(S), A, R, φ(S′), is_end) is added to an experience replay pool, which is later used to update the network parameter θ in the decision model Q(φ(S), A, θ). Here φ(S) and φ(S′) are both tensors, the action A and reward R are scalars, and is_end is a Boolean value.
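The experience replay pool described above can be sketched directly: quintuples are stored up to a maximum size n, after which the earliest-added quintuple is removed before a new one is added, and mini-batches of size m are drawn at random. A `collections.deque` with `maxlen` gives exactly this eviction behavior; the tiny example states below are placeholders.

```python
# Sketch of the experience replay pool; the stored states are placeholder tuples.
from collections import deque
import random

class ReplayPool:
    def __init__(self, max_size):
        # deque with maxlen automatically evicts the oldest entry when full
        self.pool = deque(maxlen=max_size)

    def add(self, phi_s, a, r, phi_s_next, is_end):
        """Store one quintuple (phi(S), A, R, phi(S'), is_end)."""
        self.pool.append((phi_s, a, r, phi_s_next, is_end))

    def sample(self, m):
        """Random mini-batch of m quintuples (requires len(pool) >= m)."""
        return random.sample(list(self.pool), m)

    def __len__(self):
        return len(self.pool)

pool = ReplayPool(max_size=3)
for step in range(5):                 # add 5 quintuples to a pool of capacity 3
    pool.add((step,), 0, 1.0, (step + 1,), False)
```

After the loop the pool holds only the three most recent quintuples, matching the "remove the earliest-added quintuple" rule in the method.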
Optionally, determining a grabbing path by using the decision model according to the current pose information, the type information of the object to be taken, and the pose information, including:
the decision model is Q, and the number of iterations is assumed to be Rounds, where Rounds is a positive integer, batch _ size when the batch gradient is decreasing is m, and the empirical playback pool maximum size n.
Taking the current pose information, the type information of the object to be taken and the pose information as state vectors in the state SWherein, the state S is an initialization state,tensors are formed by the current pose information of each joint of the mechanical arm, the type information of the object to be taken and the pose information.
The state vector φ(S) is input into the decision model Q to obtain the current action A. Executing the current action A in the state S yields the next state S', the feature vector φ(S') corresponding to the next state S', the reward R, and the termination flag is_end.
If the size of the experience replay pool is larger than m, batch samples are drawn from the experience replay pool and the network parameters in the decision model are updated, which specifically includes:
Step 1: randomly take m samples (φ(S_j), A_j, R_j, φ(S'_j), is_end_j), j = 1, 2, 3, …, m, from the experience replay pool, and calculate the target value y_j:

y_j = R_j, if is_end_j is true;
y_j = R_j + γ · max_{A'} Q(φ(S'_j), A', θ), otherwise,

where y_j denotes the target value of the jth sample, R_j denotes the reward of the jth sample, is_end_j indicates whether the jth sample terminated, γ denotes the attenuation coefficient, Q(φ(S'_j), A', θ) denotes the decision model evaluated on the jth sample, φ(S'_j) denotes the feature vector of the jth sample, A' denotes a candidate action, and θ denotes the network parameter.
Step 2: update the network parameter θ in the decision model Q using the mean square error loss function L(θ) = (1/m) Σ_{j=1}^{m} (y_j − Q(φ(S_j), A_j, θ))².
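Steps 1 and 2 can be illustrated with a small numerical sketch. The toy `q_value` function stands in for the trained network, and all sample values, γ, and θ below are assumptions for demonstration only:

```python
# Hedged sketch of the DQN target value y_j and the mean square error loss
# over a mini-batch of quintuples (phi_s, action, reward, phi_s_next, is_end).
GAMMA = 0.9  # attenuation (discount) coefficient gamma, assumed value

def q_value(phi_s, action, theta):
    # toy stand-in for Q(phi(S), A; theta); the real model is the
    # convolution + fully-connected network described above
    return theta * (sum(phi_s) + action)

def target_value(reward, is_end, phi_s_next, actions, theta):
    # y_j = R_j for terminal samples, otherwise
    # y_j = R_j + gamma * max_{A'} Q(phi(S'_j), A'; theta)
    if is_end:
        return reward
    return reward + GAMMA * max(q_value(phi_s_next, a, theta) for a in actions)

def mse_loss(batch, actions, theta):
    # L(theta) = (1/m) * sum_j (y_j - Q(phi(S_j), A_j; theta))^2
    total = 0.0
    for phi_s, a, r, phi_s_next, is_end in batch:
        y_j = target_value(r, is_end, phi_s_next, actions, theta)
        total += (y_j - q_value(phi_s, a, theta)) ** 2
    return total / len(batch)

batch = [([1.0], 0, 1.0, [2.0], False),
         ([2.0], 1, 0.5, [0.0], True)]   # second sample is terminal
loss = mse_loss(batch, actions=[0, 1], theta=0.5)
```

In a real implementation θ would then be adjusted by gradient descent on this loss; here the sketch only shows how the terminal and non-terminal branches of y_j feed into L(θ).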
If the size of the experience replay pool is larger than n, the earliest-added quintuple is removed from the experience replay pool and the new quintuple is added.
The state S is updated to the state S'.
Whether is_end is a final state is then judged; if not, samples continue to be randomly drawn from the experience replay pool in a loop, and if so, the loop ends and the final decision model is obtained.
The grabbing path is determined according to the final decision model. The optimal grabbing path obtained through the above steps avoids, to a certain extent, unnecessary rotation of a single joint or of multiple joints of the mechanical arm, thereby reducing wear on the joints of the mechanical arm.
Optionally, the state vector φ(S) further includes a specific scene, where the specific scene includes a scene in which the structure of the object to be taken is not fixed.
More specifically, the specific scene may be a scene in which the structure and size of the object are changing. The specific scene, the current pose information, and the type information and pose information of the object to be taken are used as the state vector φ(S) in the state S, where the state S is an initialization state.
S206, obtaining the motion trail of the mechanical arm by using a smooth trajectory interpolation method according to the grabbing path; and controlling each joint of the mechanical arm to adjust its angle according to the motion trail so as to grab the object to be taken.
More specifically, the smooth trajectory interpolation method includes a polynomial curve method, which makes the motion of the mechanical arm more continuous and smooth and reduces noise.
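A polynomial-curve interpolation of this kind can be sketched for a single joint; the cubic with zero start and end velocity, the waypoint angles, and the duration below are illustrative assumptions, not parameters specified in this application:

```python
# Sketch of smooth trajectory interpolation for one joint using a cubic
# polynomial with zero boundary velocities, so the joint accelerates and
# decelerates smoothly between two waypoints of the grabbing path.
def cubic_interpolate(q_start, q_end, duration, t):
    """Joint angle at time t for a cubic trajectory q(t) with
    q(0) = q_start, q(T) = q_end, q'(0) = q'(T) = 0."""
    s = t / duration                # normalized time in [0, 1]
    blend = 3 * s**2 - 2 * s**3     # smoothstep polynomial
    return q_start + (q_end - q_start) * blend

# sample a 0° -> 90° joint motion over an assumed 2-second segment
angles = [cubic_interpolate(0.0, 90.0, 2.0, t * 0.5) for t in range(5)]
```

The blend term guarantees zero angular velocity at both endpoints, which is what makes the concatenated segments of the motion trail continuous and smooth.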
Optionally, according to the target object image, if the type of the target object cannot be determined, the target object image is obtained again through the vision sensor.
More specifically, when the pre-trained classification model cannot identify a classification result from the target object image captured by the vision sensor, the main control server sends an instruction to the vision sensor to re-acquire the target object image; after receiving the instruction, the vision sensor re-acquires the target image and sends it to the main control server.
In the method provided by this embodiment, the path is planned in real time based on a deep reinforcement learning algorithm, and end-to-end real-time path planning can also be performed for a specific scene. Each decision action of the mechanical arm in the specific scene is obtained by training the decision model, from which the optimal complete path is obtained. In practical application, the target object image acquired by the visual sensor is input into the trained decision model to obtain the path information for the mechanical arm's movement. Robustness is ensured while dependence on the scene is reduced.
Fig. 4 is a schematic structural diagram of a robot arm control device according to an exemplary embodiment of the present application. As shown in fig. 4, the present application provides a robot arm control apparatus 40, the apparatus 40 including:
the obtaining module 41 is configured to obtain a target object image corresponding to the object to be taken.
And the processing module 42 is configured to determine type information and pose information of the object to be taken according to the target object image.
And the processing module 42 is further configured to determine a grabbing path according to the type information and the pose information, where the grabbing path is a moving path from the current position of the robot arm to the position of the object to be fetched.
And the processing module 42 is further configured to control each joint of the mechanical arm to perform angle adjustment according to the grabbing path, so as to grab the object to be grabbed.
Specifically, for this embodiment, reference may be made to the above method embodiments; the principles and technical effects are similar and are not repeated here.
Fig. 5 is a schematic diagram of a hardware structure of an electronic device according to an exemplary embodiment of the present application. As shown in fig. 5, the electronic apparatus 50 of the present embodiment includes: a processor 51 and a memory 52, wherein the memory 52 is used for storing processor-executable instructions.
The processor 51 is configured to implement the robot arm control method in the above embodiments according to executable instructions stored in the memory. Reference may be made in particular to the description relating to the method embodiments described above.
Alternatively, the memory 52 may be separate or integrated with the processor 51.
When the memory 52 is provided separately, the electronic device 50 further includes a bus 53 for connecting the memory 52 and the processor 51.
The present application also provides a computer readable storage medium, in which computer instructions are stored, and the computer instructions are executed by a processor to implement the methods provided by the above-mentioned various embodiments.
The computer-readable storage medium may be a computer storage medium or a communication medium. Communication media includes any medium that facilitates transfer of a computer program from one place to another. Computer storage media may be any available media that can be accessed by a general purpose or special purpose computer. For example, a computer readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the computer readable storage medium. Of course, the computer readable storage medium may also be integral to the processor. The processor and the computer-readable storage medium may reside in an Application Specific Integrated Circuit (ASIC). Additionally, the ASIC may reside in user equipment. Of course, the processor and the computer-readable storage medium may also reside as discrete components in a communication device.
The computer-readable storage medium may be implemented by any type of volatile or nonvolatile Memory device or combination thereof, such as Static Random-Access Memory (SRAM), Electrically-Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic Memory, flash Memory, magnetic disk or optical disk. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
The present application also provides a computer program product comprising execution instructions stored in a computer readable storage medium. The at least one processor of the device may read the execution instructions from the computer-readable storage medium, and the execution of the execution instructions by the at least one processor causes the device to implement the methods provided by the various embodiments described above.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.
Claims (12)
1. A method of controlling a robot arm, the method comprising:
acquiring a target object image corresponding to an object to be acquired;
determining the type information and the pose information of the object to be acquired according to the target object image;
determining a grabbing path according to the type information and the pose information, wherein the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be fetched;
and controlling each joint of the mechanical arm to carry out angle adjustment according to the grabbing path so as to grab the object to be fetched.
2. The method according to claim 1, wherein the determining the type information of the object to be taken according to the target object image comprises:
obtaining a classification result by using a pre-trained classification model according to the target object image;
and determining the type information of the object to be taken according to the classification result.
3. The method according to claim 1, wherein the determining pose information of the object to be fetched according to the target object image comprises:
determining the characteristics of a target image according to the target object image;
and determining the pose information of the object to be acquired by using a pose calculation model according to the characteristics of the target image.
4. The method of claim 1, wherein determining a grab path from the type information and pose information comprises:
acquiring current pose information of each joint of the mechanical arm;
and determining the grabbing path by using a decision model according to the current pose information, the type information and the pose information of the object to be taken.
5. The method of claim 4, wherein determining the grasp path using a decision model based on the current pose information, the type information of the object to be taken, and the pose information comprises:
the decision model is Q; assuming the number of iterations is Rounds, wherein Rounds is a positive integer, the batch size batch_size for mini-batch gradient descent is m, and the maximum size of the experience replay pool is n;
taking the current pose information and the type information and pose information of the object to be taken as a state vector φ(S) in a state S, wherein the state S is an initialization state;
inputting the state vector φ(S) into the decision model Q to obtain a current action A;
executing the current action A in the state S to obtain a next state S', a feature vector φ(S') corresponding to the next state S', a reward R, and a termination flag is_end;
adding the quintuple (φ(S), A, R, φ(S'), is_end) to an experience replay pool; if the size of the experience replay pool is larger than m, sampling in batches from the experience replay pool and updating network parameters in the decision model; and if the size of the experience replay pool is larger than n, removing the earliest-added quintuple from the experience replay pool and adding a new quintuple;
updating the state S to a state S';
judging whether is_end is a final state; if not, continuing to loop and randomly draw samples from the experience replay pool; and if so, ending the loop to obtain a final decision model;
and determining the grabbing path according to the final decision model.
7. The method according to claim 1, wherein the controlling each joint of the mechanical arm to perform angle adjustment according to the grabbing path to grab the object to be taken comprises:
obtaining the motion track of the mechanical arm by using a smooth track interpolation method according to the grabbing path;
and controlling each joint of the mechanical arm to adjust the angle according to the motion track so as to grab the object to be fetched.
8. The method of any one of claims 1-7, further comprising:
and according to the target object image, if the type of the target object cannot be determined, acquiring the target object image again through the visual sensor.
9. An apparatus for controlling a robot arm, comprising:
the acquisition module is used for acquiring a target object image corresponding to a to-be-taken object;
the processing module is used for determining the type information and the pose information of the object to be acquired according to the target object image;
the processing module is further used for determining a grabbing path according to the type information and the pose information, wherein the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be fetched;
and the processing module is also used for controlling each joint of the mechanical arm to carry out angle adjustment according to the grabbing path so as to grab the object to be fetched.
10. An electronic device, comprising: a memory, a processor;
a memory; a memory for storing the processor-executable instructions;
a processor for implementing the robot arm control method of any one of claims 1 to 8 in accordance with executable instructions stored in the memory.
11. A computer-readable storage medium having computer-executable instructions stored therein, which when executed by a processor, are configured to implement the robot arm control method of any one of claims 1 to 8.
12. A computer program product comprising instructions which, when executed by a processor, carry out the robot arm control method of any of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110521680.1A CN113232019A (en) | 2021-05-13 | 2021-05-13 | Mechanical arm control method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110521680.1A CN113232019A (en) | 2021-05-13 | 2021-05-13 | Mechanical arm control method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113232019A true CN113232019A (en) | 2021-08-10 |
Family
ID=77133957
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110521680.1A Pending CN113232019A (en) | 2021-05-13 | 2021-05-13 | Mechanical arm control method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113232019A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113753562A (en) * | 2021-08-24 | 2021-12-07 | 深圳市长荣科机电设备有限公司 | Carrying method, system and device based on linear motor and storage medium |
CN113942009A (en) * | 2021-09-13 | 2022-01-18 | 苏州大学 | Robot bionic hand grabbing method and system |
CN114523470A (en) * | 2021-12-30 | 2022-05-24 | 浙江图盛输变电工程有限公司 | Robot operation path planning method based on bearing platform linkage |
CN114683251A (en) * | 2022-03-31 | 2022-07-01 | 上海节卡机器人科技有限公司 | Robot grabbing method and device, electronic equipment and readable storage medium |
CN115648232A (en) * | 2022-12-30 | 2023-01-31 | 广东隆崎机器人有限公司 | Mechanical arm control method and device, electronic equipment and readable storage medium |
CN115847488A (en) * | 2023-02-07 | 2023-03-28 | 成都秦川物联网科技股份有限公司 | Industrial Internet of things system for cooperative robot monitoring and control method |
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106970615A (en) * | 2017-03-21 | 2017-07-21 | 西北工业大学 | A kind of real-time online paths planning method of deeply study |
CN107883929A (en) * | 2017-09-22 | 2018-04-06 | 中冶赛迪技术研究中心有限公司 | Monocular vision positioner and method based on multi-joint mechanical arm |
CN109176521A (en) * | 2018-09-19 | 2019-01-11 | 北京因时机器人科技有限公司 | A kind of mechanical arm and its crawl control method and system |
CN109333536A (en) * | 2018-10-26 | 2019-02-15 | 北京因时机器人科技有限公司 | A kind of robot and its grasping body method and apparatus |
CN109483554A (en) * | 2019-01-22 | 2019-03-19 | 清华大学 | Robotic Dynamic grasping means and system based on global and local vision semanteme |
CN109521774A (en) * | 2018-12-27 | 2019-03-26 | 南京芊玥机器人科技有限公司 | A kind of spray robot track optimizing method based on intensified learning |
CN109531584A (en) * | 2019-01-31 | 2019-03-29 | 北京无线电测量研究所 | A kind of Mechanical arm control method and device based on deep learning |
CN110315258A (en) * | 2019-07-24 | 2019-10-11 | 广东工业大学 | A kind of welding method based on intensified learning and ant group algorithm |
CN110977967A (en) * | 2019-11-29 | 2020-04-10 | 天津博诺智创机器人技术有限公司 | Robot path planning method based on deep reinforcement learning |
KR20200059111A (en) * | 2018-11-20 | 2020-05-28 | 한양대학교 산학협력단 | Grasping robot, grasping method and learning method for grasp based on neural network |
CN111251294A (en) * | 2020-01-14 | 2020-06-09 | 北京航空航天大学 | Robot grabbing method based on visual pose perception and deep reinforcement learning |
CN111275063A (en) * | 2018-12-04 | 2020-06-12 | 广州中国科学院先进技术研究所 | Robot intelligent grabbing control method and system based on 3D vision |
CN111383263A (en) * | 2018-12-28 | 2020-07-07 | 阿里巴巴集团控股有限公司 | System, method and device for grabbing object by robot |
CN111496770A (en) * | 2020-04-09 | 2020-08-07 | 上海电机学院 | Intelligent carrying mechanical arm system based on 3D vision and deep learning and use method |
CN111618847A (en) * | 2020-04-22 | 2020-09-04 | 南通大学 | Mechanical arm autonomous grabbing method based on deep reinforcement learning and dynamic motion elements |
CN111923039A (en) * | 2020-07-14 | 2020-11-13 | 西北工业大学 | Redundant mechanical arm path planning method based on reinforcement learning |
CN112605983A (en) * | 2020-12-01 | 2021-04-06 | 浙江工业大学 | Mechanical arm pushing and grabbing system suitable for intensive environment |
2021-05-13: CN CN202110521680.1A patent/CN113232019A/en active Pending
Patent Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106970615A (en) * | 2017-03-21 | 2017-07-21 | 西北工业大学 | A kind of real-time online paths planning method of deeply study |
CN107883929A (en) * | 2017-09-22 | 2018-04-06 | 中冶赛迪技术研究中心有限公司 | Monocular vision positioner and method based on multi-joint mechanical arm |
CN109176521A (en) * | 2018-09-19 | 2019-01-11 | 北京因时机器人科技有限公司 | A kind of mechanical arm and its crawl control method and system |
CN109333536A (en) * | 2018-10-26 | 2019-02-15 | 北京因时机器人科技有限公司 | A kind of robot and its grasping body method and apparatus |
KR20200059111A (en) * | 2018-11-20 | 2020-05-28 | 한양대학교 산학협력단 | Grasping robot, grasping method and learning method for grasp based on neural network |
CN111275063A (en) * | 2018-12-04 | 2020-06-12 | 广州中国科学院先进技术研究所 | Robot intelligent grabbing control method and system based on 3D vision |
CN109521774A (en) * | 2018-12-27 | 2019-03-26 | 南京芊玥机器人科技有限公司 | A kind of spray robot track optimizing method based on intensified learning |
CN111383263A (en) * | 2018-12-28 | 2020-07-07 | 阿里巴巴集团控股有限公司 | System, method and device for grabbing object by robot |
CN109483554A (en) * | 2019-01-22 | 2019-03-19 | 清华大学 | Robotic Dynamic grasping means and system based on global and local vision semanteme |
CN109531584A (en) * | 2019-01-31 | 2019-03-29 | 北京无线电测量研究所 | A kind of Mechanical arm control method and device based on deep learning |
CN110315258A (en) * | 2019-07-24 | 2019-10-11 | 广东工业大学 | A kind of welding method based on intensified learning and ant group algorithm |
CN110977967A (en) * | 2019-11-29 | 2020-04-10 | 天津博诺智创机器人技术有限公司 | Robot path planning method based on deep reinforcement learning |
CN111251294A (en) * | 2020-01-14 | 2020-06-09 | 北京航空航天大学 | Robot grabbing method based on visual pose perception and deep reinforcement learning |
CN111496770A (en) * | 2020-04-09 | 2020-08-07 | 上海电机学院 | Intelligent carrying mechanical arm system based on 3D vision and deep learning and use method |
CN111618847A (en) * | 2020-04-22 | 2020-09-04 | 南通大学 | Mechanical arm autonomous grabbing method based on deep reinforcement learning and dynamic motion elements |
CN111923039A (en) * | 2020-07-14 | 2020-11-13 | 西北工业大学 | Redundant mechanical arm path planning method based on reinforcement learning |
CN112605983A (en) * | 2020-12-01 | 2021-04-06 | 浙江工业大学 | Mechanical arm pushing and grabbing system suitable for intensive environment |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113753562A (en) * | 2021-08-24 | 2021-12-07 | 深圳市长荣科机电设备有限公司 | Carrying method, system and device based on linear motor and storage medium |
CN113753562B (en) * | 2021-08-24 | 2023-07-25 | 深圳市长荣科机电设备有限公司 | Linear motor-based carrying method, system, device and storage medium |
CN113942009A (en) * | 2021-09-13 | 2022-01-18 | 苏州大学 | Robot bionic hand grabbing method and system |
CN114523470A (en) * | 2021-12-30 | 2022-05-24 | 浙江图盛输变电工程有限公司 | Robot operation path planning method based on bearing platform linkage |
CN114523470B (en) * | 2021-12-30 | 2024-05-17 | 浙江图盛输变电工程有限公司 | Robot operation path planning method based on bearing platform linkage |
CN114683251A (en) * | 2022-03-31 | 2022-07-01 | 上海节卡机器人科技有限公司 | Robot grabbing method and device, electronic equipment and readable storage medium |
CN115648232A (en) * | 2022-12-30 | 2023-01-31 | 广东隆崎机器人有限公司 | Mechanical arm control method and device, electronic equipment and readable storage medium |
CN115847488A (en) * | 2023-02-07 | 2023-03-28 | 成都秦川物联网科技股份有限公司 | Industrial Internet of things system for cooperative robot monitoring and control method |
CN115847488B (en) * | 2023-02-07 | 2023-05-02 | 成都秦川物联网科技股份有限公司 | Industrial Internet of things system for collaborative robot monitoring and control method |
US11919166B2 (en) | 2023-02-07 | 2024-03-05 | Chengdu Qinchuan Iot Technology Co., Ltd. | Industrial internet of things for monitoring collaborative robots and control methods, storage media thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113232019A (en) | Mechanical arm control method and device, electronic equipment and storage medium | |
Dasari et al. | Robonet: Large-scale multi-robot learning | |
TWI776113B (en) | Object pose estimation method, device and computer readable storage medium thereof | |
CN110076772B (en) | Grabbing method and device for mechanical arm | |
Sadeghi et al. | Sim2real viewpoint invariant visual servoing by recurrent control | |
Finn et al. | Guided cost learning: Deep inverse optimal control via policy optimization | |
CN111203878B (en) | Robot sequence task learning method based on visual simulation | |
WO2022100363A1 (en) | Robot control method, apparatus and device, and storage medium and program product | |
US20180290298A1 (en) | Apparatus and methods for training path navigation by robots | |
CN113076615B (en) | High-robustness mechanical arm operation method and system based on antagonistic deep reinforcement learning | |
Bohez et al. | Sensor fusion for robot control through deep reinforcement learning | |
Chen et al. | Combining reinforcement learning and rule-based method to manipulate objects in clutter | |
CN112164112B (en) | Method and device for acquiring pose information of mechanical arm | |
CN114387513A (en) | Robot grabbing method and device, electronic equipment and storage medium | |
CN114564009A (en) | Surgical robot path planning method and system | |
CN114789454B (en) | Robot digital twin track completion method based on LSTM and inverse kinematics | |
CN115781685A (en) | High-precision mechanical arm control method and system based on reinforcement learning | |
Luo et al. | Balance between efficient and effective learning: Dense2sparse reward shaping for robot manipulation with environment uncertainty | |
Hu et al. | Grasping living objects with adversarial behaviors using inverse reinforcement learning | |
Liu et al. | Sim-and-real reinforcement learning for manipulation: A consensus-based approach | |
Xu et al. | Deep reinforcement learning for parameter tuning of robot visual servoing | |
CN111015676B (en) | Grabbing learning control method, system, robot and medium based on hand-free eye calibration | |
Hwang et al. | Image base visual servoing base on reinforcement learning for robot arms | |
Xu et al. | A fast and straightforward hand-eye calibration method using stereo camera | |
Zhao et al. | A robot demonstration method based on LWR and Q-learning algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20210810 |