CN113232019A - Mechanical arm control method and device, electronic equipment and storage medium - Google Patents

Mechanical arm control method and device, electronic equipment and storage medium Download PDF

Info

Publication number
CN113232019A
CN113232019A · Application CN202110521680.1A
Authority
CN
China
Prior art keywords
determining
mechanical arm
path
pose information
target object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110521680.1A
Other languages
Chinese (zh)
Inventor
王洛威
王恺
廉士国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Unicom Big Data Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Unicom Big Data Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd, Unicom Big Data Co Ltd filed Critical China United Network Communications Group Co Ltd
Priority to CN202110521680.1A priority Critical patent/CN113232019A/en
Publication of CN113232019A publication Critical patent/CN113232019A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1656Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1664Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1694Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697Vision controlled systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mechanical Engineering (AREA)
  • Robotics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Manipulator (AREA)

Abstract

The application provides a mechanical arm control method and device, electronic equipment and a storage medium. A target object image corresponding to an object to be taken is acquired; type information and pose information of the object to be taken are determined according to the target object image; a grabbing path is determined according to the type information and the pose information, the grabbing path being a moving path from the current position of the mechanical arm to the position of the object to be taken; and each joint of the mechanical arm is controlled to perform angle adjustment according to the grabbing path so as to grab the object to be taken. By combining a vision sensor with the mechanical arm, the grabbing of objects whose structure is not fixed is completed under visual guidance; the grabbing precision is high and stable, the restriction conditions are few, and the method is highly self-adaptive and relatively flexible.

Description

Mechanical arm control method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of robot arm grabbing control, and in particular, to a robot arm control method and apparatus, an electronic device, and a storage medium.
Background
As technology continues to advance, industrial robots are moving into factories to replace human work.
At present, a mechanical arm grabs a planar object with a fixed structure according to a set control program so as to improve the working efficiency.
However, when the structure of the planar object is slightly changed, the robot cannot continue to operate, and the control program must be reset.
Disclosure of Invention
The application provides a mechanical arm control method, a mechanical arm control device, electronic equipment and a storage medium, which are used for solving the problem that a mechanical arm cannot continue to work when the structure of a planar object changes.
In a first aspect, the present application provides a method for controlling a robot arm, the method including:
acquiring a target object image corresponding to an object to be taken;
determining type information and pose information of the object to be taken according to the target object image;
determining a grabbing path according to the type information and the pose information, wherein the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be taken;
and controlling each joint of the mechanical arm to perform angle adjustment according to the grabbing path so as to grab the object to be taken.
Optionally, determining type information of the object to be taken according to the target object image includes:
obtaining a classification result by using a pre-trained classification model according to the target object image;
and determining the type information of the object to be taken according to the classification result.
Optionally, determining pose information of the object to be taken according to the target object image includes:
determining the characteristics of a target image according to the target object image;
and determining the pose information of the object to be taken by using a pose calculation model according to the characteristics of the target image.
Optionally, determining a grabbing path according to the type information and the pose information includes:
acquiring current pose information of each joint of the mechanical arm;
and determining a grabbing path by using a decision model according to the current pose information, the type information and the pose information of the object to be taken.
Optionally, determining a grabbing path by using the decision model according to the current pose information, the type information of the object to be taken, and the pose information includes:
the decision model is Q; it is assumed that the number of iterations is Rounds, wherein Rounds is a positive integer, the batch size batch_size for batch gradient descent is m, and the maximum size of the experience replay pool is n;
taking the current pose information, the type information of the object to be taken and the pose information as the state vector φ(S) in state S, wherein the state S is an initialization state;
inputting the state vector φ(S) into the decision model Q to obtain the current action A;
executing the current action A in the state S to obtain the next state S', wherein the next state S' corresponds to the feature vector φ(S'), the reward R, and the termination flag is_end;
adding the quintuple {φ(S), A, R, φ(S'), is_end} into the experience replay pool; if the size of the experience replay pool is larger than m, sampling in batches from the experience replay pool and updating the network parameters in the decision model; if the size of the experience replay pool is larger than n, removing the quintuple added earliest from the experience replay pool before adding the new quintuple;
updating the state S to the state S';
judging whether is_end indicates the final state; if not, continuing to loop, randomly taking samples from the experience replay pool; if so, ending the loop to obtain the final decision model;
and determining a grabbing path according to the final decision model.
Optionally, the state vector φ(S) further comprises a specific scene, wherein the specific scene comprises a scene in which the structure of the object to be taken is not fixed.
Optionally, controlling each joint of the mechanical arm to perform angle adjustment according to the grabbing path so as to grab the object to be taken includes:
obtaining the motion trajectory of the mechanical arm by using a smooth trajectory interpolation method according to the grabbing path;
and controlling each joint of the mechanical arm to adjust its angle according to the motion trajectory so as to grab the object to be taken.
Optionally, the method further comprises:
and according to the target object image, if the type of the target object cannot be determined, acquiring the target object image again through the vision sensor.
In a second aspect, the present application provides an arm control apparatus, the apparatus comprising:
the acquisition module is used for acquiring a target object image corresponding to an object to be taken;
the processing module is used for determining the type information and the pose information of the object to be taken according to the target object image;
the processing module is further used for determining a grabbing path according to the type information and the pose information, wherein the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be taken;
and the processing module is further used for controlling each joint of the mechanical arm to carry out angle adjustment according to the grabbing path so as to grab the object to be taken.
In a third aspect, the present application provides an electronic device, comprising: a memory, a processor;
a memory for storing processor-executable instructions;
a processor for implementing the robot arm control method according to the first aspect and the alternative aspects, according to executable instructions stored in a memory.
In a fourth aspect, the present application provides a computer-readable storage medium having computer-executable instructions stored thereon, where the computer-executable instructions are executed by a processor to implement the robot arm control method according to the first aspect and the alternative.
In a fifth aspect, the present application provides a computer program product comprising instructions which, when executed by a processor, implement the robot arm control method of the first aspect and the alternatives.
The application provides a mechanical arm control method and device, electronic equipment and a storage medium. A target object image corresponding to an object to be taken is acquired; type information and pose information of the object to be taken are determined according to the target object image; a grabbing path is determined according to the type information and the pose information, the grabbing path being a moving path from the current position of the mechanical arm to the position of the object to be taken; and each joint of the mechanical arm is controlled to perform angle adjustment according to the grabbing path so as to grab the object to be taken. By combining a vision sensor with the mechanical arm, the grabbing of objects whose structure is not fixed is completed under visual guidance; the grabbing precision is high and stable, the restriction conditions are few, and the method is highly self-adaptive and relatively flexible.
Drawings
FIG. 1 is a schematic view of a robotic arm control system shown herein according to an exemplary embodiment;
FIG. 2 is a flow diagram illustrating a method for controlling a robotic arm according to an exemplary embodiment of the present application;
FIG. 3 is a flow diagram illustrating a method of controlling a robotic arm according to another exemplary embodiment of the present application;
FIG. 4 is a schematic diagram illustrating the construction of a robot arm control apparatus according to an exemplary embodiment of the present application;
fig. 5 is a schematic diagram of a hardware structure of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the present application, and it is obvious that the described embodiments are some, but not all embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As technology continues to advance, industrial robots are moving into factories to replace human work. A robotic arm is a mechanical structure that mimics a human hand, such as a planar multi-joint robot or a palletizer. A mechanical arm generally has a plurality of joint arms and an execution end arranged on the last joint arm; various execution components are mounted on the execution end, and the execution end is moved to a specified coordinate in space through automatic control to realize the functions provided by the execution components, such as writing, grabbing and testing.
At present, a mechanical arm grabs a planar object with a fixed structure according to a set control program so as to improve the working efficiency.
However, when the structure of the planar object is slightly changed, the robot cannot continue to operate, and the control program must be reset.
In order to solve the above problems, the application provides a mechanical arm control method. A target object image corresponding to an object to be taken is obtained through a vision sensor, and the type information and pose information of the object to be taken in the target object image are obtained by means of deep learning and computer image processing. Meanwhile, the pose information is converted into pose information under a universal coordinate system. Then, the grabbing path of the mechanical arm is automatically calculated by using a decision model. Each joint of the mechanical arm is controlled to adjust its angle according to the grabbing path, and the mechanical arm is controlled to grab the object to be taken. By combining a vision sensor with the mechanical arm, the grabbing of objects whose structure is not fixed is completed under visual guidance; the grabbing precision is high and stable, the restriction conditions are few, and the method is highly self-adaptive and relatively flexible.
FIG. 1 is a schematic view of a robotic arm control system according to an exemplary embodiment of the present application. As shown in fig. 1, the robot arm control system provided in this embodiment includes: a vision sensor 110, a master server 120, and a robotic arm 130. The vision sensor 110 is configured to acquire a target object image corresponding to an object to be acquired, and send the target object image to the main control server 120. The main control server 120 receives the target object image sent by the vision sensor 110, determines a motion path from the current position of the mechanical arm 130 to the position of the object to be taken according to the target object image, controls each joint of the mechanical arm 130 to perform angle adjustment according to the motion path, and sends a control signal to the mechanical arm 130. The mechanical arm 130 is configured to receive a control signal sent by the main control server 120, and perform angle adjustment on each joint according to the control signal to grab an object to be picked.
Fig. 2 is a flow chart diagram illustrating a robot arm control method according to an exemplary embodiment of the present application. As shown in fig. 2, the robot arm control method provided in this embodiment is based on the robot arm control system shown in fig. 1, and includes the following steps:
s101, obtaining a target object image corresponding to the object to be acquired.
More specifically, the target object image is an RGB picture of a three-dimensional scene. One or more vision sensors photograph the object to be taken to obtain target object images at the corresponding angles and send them to the master control server. The vision sensor may be an RGB video camera or an industrial camera. The vision sensor can capture target object images from one or more angles; multi-angle target object images allow the type information and pose information of the object to be taken to be determined from multiple viewpoints, so that the mechanical arm can grab the object more accurately. The master control server receives the target object image corresponding to the object to be taken.
S102, determining the type information and the pose information of the object to be taken according to the target object image.
More specifically, the master control server determines the type information of the object to be taken by using a classification model according to the target object image information sent by the vision sensor, wherein the type information comprises a type number: the target object image information is input into the classification model, and the classification model outputs the type number of the object to be taken, thereby determining its type information. The master control server also obtains the relative coordinate positions of at least four three-dimensional space points according to the target object image information sent by the vision sensor, and uses a pose calculation model to determine the pose of these three-dimensional space points under the vision sensor. The pose under the vision sensor is then converted into pose information under a universal coordinate system. The pose information comprises the spatial position information and direction information of the object to be taken; the spatial position information of the object to be taken is the spatial coordinates of a preset number of points on the surface of the object, the preset number comprising at least four three-dimensional space points.
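As an illustration of the final conversion step above, the following sketch applies an extrinsic camera-to-world transform to a pose estimated under the vision sensor. This is a minimal sketch assuming a known extrinsic calibration; the matrix T_WORLD_CAM, the function name and all numeric values are example assumptions, not values from the patent.

import numpy as np

# Assumed extrinsic calibration: pose of the vision sensor in the universal
# (world) coordinate system, written as a 4x4 homogeneous transform.
T_WORLD_CAM = np.array([
    [0.0, -1.0, 0.0, 0.30],
    [1.0,  0.0, 0.0, 0.10],
    [0.0,  0.0, 1.0, 0.50],
    [0.0,  0.0, 0.0, 1.00],
])

def camera_pose_to_world(R_cam_obj, t_cam_obj):
    # Convert an object pose expressed under the vision sensor into pose
    # information under the universal coordinate system.
    T_cam_obj = np.eye(4)
    T_cam_obj[:3, :3] = R_cam_obj   # rotation of the object in the camera frame
    T_cam_obj[:3, 3] = t_cam_obj    # translation of the object in the camera frame
    T_world_obj = T_WORLD_CAM @ T_cam_obj
    return T_world_obj[:3, :3], T_world_obj[:3, 3]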
S103, determining a grabbing path according to the type information and the pose information, wherein the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be taken.
More specifically, the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be taken, and the moving path also comprises moving paths of all joints of the mechanical arm. And determining the motion path of each joint of the mechanical arm at each moment by adopting a reinforcement learning algorithm according to the type information and the pose information until the mechanical arm successfully grabs the object to be taken.
For a multi-joint mechanical arm there are many joint angle configurations that allow the arm to grab the object to be taken, so the grabbing path has countless solutions. The traditional approach generally uses sampling-based planning for path planning; this approach does not necessarily find the optimal solution, but it quickly finds an effective one. Finding the optimal solution takes more time because there are innumerable candidate paths. However, an effective solution may not be the optimal solution, which means that when the machine grabs the object according to the calculated grabbing path, the path obtained from a non-optimal solution is not the shortest; that is, a single joint of the mechanical arm rotates through an unnecessary angle, or multiple joints rotate through unnecessary angles. Meanwhile, because the method is based on sampling, when interpolation sampling has errors the sampled path is sometimes invalid for the mechanical arm, namely a path that the actual mechanical arm cannot execute. Therefore, in this embodiment, a reinforcement learning algorithm is adopted to complete end-to-end grabbing path planning for the real-time scene of the environment, determining each decision action of each joint of the mechanical arm so as to obtain an optimal complete path.
S104, controlling each joint of the mechanical arm to perform angle adjustment according to the grabbing path so as to grab the object to be taken.
More specifically, assuming the time taken by the mechanical arm from the start of grabbing to the completion of grabbing is t, the grabbing path includes the movement path of the mechanical arm at times 1, 2, …, t. At times 1, 2, …, t-1, each joint of the mechanical arm is controlled to perform angle adjustment according to the grabbing path at the corresponding moment; at time t, the execution end of the mechanical arm is controlled to grab the object to be taken while the joint angles are still being adjusted.
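The following minimal sketch illustrates this time-stepped execution; the arm.move_joints and gripper.close interface is hypothetical, standing in for the control signals the main control server actually sends, since the patent does not name a concrete robot API.

def execute_grab(arm, gripper, grab_path):
    # grab_path[k-1] holds the joint angles planned for time k = 1, ..., t.
    t = len(grab_path)
    for k, joint_angles in enumerate(grab_path, start=1):
        arm.move_joints(joint_angles)   # angle adjustment at time k
        if k == t:                      # at time t, grab while still adjusting
            gripper.close()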
In the method provided by this embodiment, a target object image corresponding to an object to be taken is acquired; type information and pose information of the object to be taken are determined according to the target object image; a grabbing path is determined according to the type information and the pose information, the grabbing path being a moving path from the current position of the mechanical arm to the position of the object to be taken; and each joint of the mechanical arm is controlled to perform angle adjustment according to the grabbing path so as to grab the object to be taken. By combining a vision sensor with the mechanical arm, the grabbing of objects whose structure is not fixed is completed under visual guidance; the grabbing precision is high and stable, the restriction conditions are few, and the method is highly self-adaptive and relatively flexible.
Fig. 3 is a flow chart diagram illustrating a robot arm control method according to another exemplary embodiment of the present application. As shown in fig. 3, the method for controlling a robot arm provided in this embodiment includes the following steps:
s201, obtaining a target object image corresponding to the object to be taken.
Step S201 is similar to the step S101 in the embodiment of fig. 2, and this embodiment is not described herein again.
S202, obtaining a classification result by using a pre-trained classification model according to the target object image; and determining the type information of the object to be taken according to the classification result.
More specifically, the pre-trained classification model may be a YOLO model, a Convolutional Neural Network (CNN) model, a Mask R-CNN (Region-based CNN) model, a Faster R-CNN model, or the like.
Taking the CNN model as an example, high-dimensional features of the target object image are extracted through convolution, and different types of objects to be taken have different features; that is, the features can be extracted from the RGB images captured by the vision sensor by using neural-network convolution operations. The fully-connected layer of the model then classifies these features, and the resulting judgments are rewarded or penalized against prior knowledge (from manual labeling in advance), so that the convolutional neural network can learn by itself under this supervision and is continuously optimized; that is, the network is adjusted to an optimal parameter state through self-optimization, yielding the pre-trained classification model. The target object image is input into the pre-trained classification model, which outputs a classification result, and the type information of the object to be taken is determined according to the classification result.
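A minimal sketch of such a classification model follows, assuming a PyTorch implementation; the layer sizes, the number of types and the training snippet are illustrative assumptions rather than the exact architecture used in the patent.

import torch
import torch.nn as nn

class ObjectClassifier(nn.Module):
    # Convolutions extract high-dimensional features from the RGB image;
    # a fully-connected layer classifies them into object type numbers.
    def __init__(self, num_types=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_types)

    def forward(self, rgb):
        x = self.features(rgb)                 # (N, 64, 1, 1)
        return self.classifier(x.flatten(1))   # logits over type numbers

# Supervised training against manually labeled images (the "prior knowledge"):
model = ObjectClassifier()
images = torch.randn(4, 3, 224, 224)           # stand-in batch of RGB images
labels = torch.tensor([0, 2, 1, 3])            # stand-in type numbers
loss = nn.CrossEntropyLoss()(model(images), labels)
loss.backward()                                # drives the self-optimization of the parameters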
S203, determining the characteristics of the target image according to the target object image; and determining the pose information of the object to be taken by using a pose calculation model according to the characteristics of the target image.
More specifically, the pose calculation model may be a Perspective-n-Point (PnP) model. The fully-connected layers of the classification model in step S202 are removed, and the remaining convolutional layers are used as a feature extraction model. The target object image is input into the feature extraction model, which outputs the target image features. The target image features are compared with the features extracted from the previously hand-labeled RGB images so that the two sets of features can be matched one by one. The target image features obtained after matching comprise the relative coordinate positions of at least four feature points in three-dimensional space. These matched target image features are input into the PnP model to determine the pose of the feature points in three-dimensional space under the vision sensor. The pose under the vision sensor is then converted into pose information under a universal coordinate system. The pose information comprises the spatial position information and direction information of the object to be taken; the spatial position information is the spatial coordinates of a preset number of points on the surface of the object, the preset number comprising at least four three-dimensional space points.
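For illustration, the sketch below estimates the pose under the vision sensor with OpenCV's solvePnP from four matched feature points, the minimum the text requires. All coordinates and the intrinsic matrix are made-up example values, and solvePnP is offered as one concrete PnP solver, not necessarily the exact model used in the patent.

import cv2
import numpy as np

# Four matched feature points: positions on the object surface (metres)
# and the corresponding pixel locations detected in the target object image.
object_points = np.array([[0.00, 0.00, 0.0],
                          [0.10, 0.00, 0.0],
                          [0.10, 0.10, 0.0],
                          [0.00, 0.10, 0.0]])
image_points = np.array([[320.0, 240.0],
                         [400.0, 242.0],
                         [398.0, 318.0],
                         [322.0, 316.0]])

# Assumed intrinsic calibration of the vision sensor; zero lens distortion.
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
dist = np.zeros(5)

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)
R_cam_obj, _ = cv2.Rodrigues(rvec)  # rotation of the object under the vision sensor
# (R_cam_obj, tvec) can then be converted into the universal coordinate system.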
In this embodiment, the steps S202 and S203 are not limited by the described operation sequence, and the steps S202 and S203 may be performed in other sequences or simultaneously.
S204, acquiring current pose information of each joint of the mechanical arm.
More specifically, the current pose information includes spatial position information and orientation information where the mechanical arm is currently located.
S205, determining a grabbing path by using a decision model according to the current pose information and the type information and pose information of the object to be taken.
More specifically, the decision model is based on a Deep Reinforcement Learning (DRL) algorithm. DRL algorithms include the Deep Q-Learning (DQN) algorithm and the Q-Learning (QL) algorithm. The DQN algorithm is a deep reinforcement learning algorithm that combines deep learning and reinforcement learning to realize end-to-end learning from perception to action. The QL algorithm is a classical reinforcement learning algorithm, but because it requires a huge Q table, it occupies a large amount of memory in high-dimensional spaces and does not converge easily. Therefore, this embodiment uses the DQN algorithm. Unlike most collision-detection algorithms in the past, the DQN algorithm is model-free and does not need a model of each scene. A grabbing path is determined by the DQN algorithm according to the current pose information and the type information and pose information of the object to be taken, so as to realize end-to-end control of the multi-joint mechanical arm.
The action value function of the DQN algorithm is approximated by a neural network, which is a nonlinear approximation; the network structure adopts three convolution layers and two fully-connected layers. The decision model is formulated as Q(φ(S), A, θ), where θ denotes the network parameters. Updating the network is in fact updating the parameter θ; once θ is determined, the network is determined.
The DQN algorithm is mainly characterized by introducing experience replay: the quintuple {φ(S), A, R, φ(S'), is_end} is added to an experience replay pool, which is later used to update the network parameter θ in the decision model Q(φ(S), A, θ). Here φ(S) and φ(S') are tensors, the action A and the reward R are scalars, and is_end is a boolean value.
Optionally, determining a grabbing path by using the decision model according to the current pose information, the type information of the object to be taken, and the pose information includes the following.
The decision model is Q; assume the number of iterations is Rounds, wherein Rounds is a positive integer, the batch size batch_size for batch gradient descent is m, and the maximum size of the experience replay pool is n.
The current pose information, the type information of the object to be taken and the pose information are taken as the state vector φ(S) in state S, wherein state S is the initialization state and φ(S) is a tensor formed from the current pose information of each joint of the mechanical arm, the type information of the object to be taken and the pose information.
The state vector φ(S) is input into the decision model Q to obtain the current action A.
The current action A is executed in state S to obtain the next state S', where the next state S' corresponds to the feature vector φ(S'), the reward R, and the termination flag is_end.
The quintuple {φ(S), A, R, φ(S'), is_end} is added to the experience replay pool.
If the size of the experience replay pool is larger than m, batch samples are drawn from the experience replay pool and the network parameters in the decision model are updated, specifically comprising:
Step 1: randomly take m samples {φ(S_j), A_j, R_j, φ(S'_j), is_end_j}, where j = 1, 2, 3, …, m, from the experience replay pool, and calculate the target value y_j:
y_j = R_j, if is_end_j is true;
y_j = R_j + γ · max_{A'} Q(φ(S'_j), A', θ), otherwise;
where y_j denotes the target value of the jth sample, R_j the reward of the jth sample, is_end_j whether the jth sample terminates, γ the attenuation (discount) coefficient, Q the decision model, φ(S'_j) the feature vector of the jth sample, A' a candidate action, and θ the network parameters.
Step 2: update the network parameter θ in the decision model Q using the mean square error loss function
L(θ) = (1/m) · Σ_{j=1}^{m} (y_j − Q(φ(S_j), A_j, θ))².
If the size of the experience replay pool is larger than n, the quintuple added earliest is removed from the pool before the new quintuple is added.
The state S is updated to the state S'.
Whether is_end indicates the final state is judged; if not, random sampling from the experience replay pool continues in a loop; if so, the loop ends and the final decision model is obtained.
The grabbing path is determined according to the final decision model. Because the optimal grabbing path is obtained through the above steps, unnecessary rotation of a single joint or of multiple joints of the mechanical arm is avoided to a certain extent, reducing wear on the joints.
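The sketch below strings the steps above into a training loop, assuming a PyTorch implementation. The state dimension, the discrete action set, the reward and the simplified fully-connected Q network are illustrative assumptions (the patent describes three convolution layers and two fully-connected layers), and env_step is a hypothetical stand-in for executing an action on the mechanical arm.

import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM, NUM_ACTIONS = 16, 8                 # assumed dimensions
ROUNDS, m, n, gamma = 200, 32, 10000, 0.99     # Rounds, batch_size, pool size, discount

class QNet(nn.Module):
    # Decision model Q(phi(S), A, theta); a flat state vector and
    # fully-connected layers stand in for the conv + FC structure.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, NUM_ACTIONS))

    def forward(self, phi_s):
        return self.net(phi_s)                 # one Q value per joint action

def env_step(phi_s, action):
    # Hypothetical stand-in: execute action A in state S and observe
    # phi(S'), the reward R and the termination flag is_end.
    phi_s2 = phi_s + 0.01 * torch.randn(STATE_DIM)
    reward = -float(phi_s2.norm())             # e.g. negative distance to the object
    return phi_s2, reward, reward > -0.1

q = QNet()
opt = torch.optim.Adam(q.parameters(), lr=1e-3)
replay = deque(maxlen=n)                       # oldest quintuple drops automatically

for _ in range(ROUNDS):
    phi_s = torch.randn(STATE_DIM)             # phi(S): joint poses, type and object pose
    for _ in range(100):                       # cap on steps per episode (assumption)
        action = int(q(phi_s).argmax())        # input phi(S) into Q to get action A
        phi_s2, reward, is_end = env_step(phi_s, action)
        replay.append((phi_s, action, reward, phi_s2, is_end))   # quintuple
        if len(replay) > m:                    # batch sampling and update of theta
            s, a, r, s2, e = zip(*random.sample(replay, m))
            s, s2 = torch.stack(s), torch.stack(s2)
            a, r = torch.tensor(a), torch.tensor(r)
            e = torch.tensor(e, dtype=torch.float32)
            with torch.no_grad():              # y_j = R_j + gamma * max Q(phi(S'_j), A', theta)
                y = r + gamma * (1.0 - e) * q(s2).max(dim=1).values
            pred = q(s).gather(1, a.unsqueeze(1)).squeeze(1)
            loss = nn.functional.mse_loss(pred, y)   # mean square error loss L(theta)
            opt.zero_grad(); loss.backward(); opt.step()
        phi_s = phi_s2                         # update state S to S'
        if is_end:
            break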
Optionally, the state vector φ(S) further includes a specific scene, where the specific scene includes a scene in which the structure of the object to be taken is not fixed.
More specifically, the specific scene may be a scene in which the structure and size of the object are changing. The specific scene, the current pose information, and the type information and pose information of the object to be taken are together taken as the state vector φ(S) in state S, where state S is the initialization state.
S206, obtaining the motion trajectory of the mechanical arm by using a smooth trajectory interpolation method according to the grabbing path, and controlling each joint of the mechanical arm to adjust its angle according to the motion trajectory so as to grab the object to be taken.
More specifically, the smooth trajectory interpolation method includes a polynomial curve method, which makes the motion of the mechanical arm more continuous and smooth and reduces noise.
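A minimal sketch of one polynomial-curve variant follows: a cubic segment per joint between consecutive waypoint angles, with zero boundary velocities so that consecutive segments join smoothly. The waypoint values and the timing are assumptions for the example.

import numpy as np

def cubic_segment(theta0, theta1, T, steps):
    # Cubic polynomial theta(t) = a0 + a2*t^2 + a3*t^3 (a1 = 0) with
    # theta(0) = theta0, theta(T) = theta1 and zero velocity at both ends.
    a2 = 3.0 * (theta1 - theta0) / T**2
    a3 = -2.0 * (theta1 - theta0) / T**3
    t = np.linspace(0.0, T, steps)
    return theta0 + a2 * t**2 + a3 * t**3

# Interpolate one joint's angle along the waypoints of the grabbing path.
waypoints = [0.0, 0.4, 0.9, 1.2]               # hypothetical joint angles (rad)
trajectory = np.concatenate([cubic_segment(q0, q1, T=1.0, steps=50)
                             for q0, q1 in zip(waypoints, waypoints[1:])])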
Optionally, according to the target object image, if the type of the target object cannot be determined, the target object image is obtained again through the vision sensor.
More specifically, when the pre-trained classification model cannot identify a classification result from the target object image captured by the vision sensor, the main control server sends an instruction to the vision sensor to re-acquire the target object image; after receiving the instruction, the vision sensor re-acquires the target image and sends it to the main control server.
In the method provided by this embodiment, the path is planned in real time based on a deep reinforcement learning algorithm, and end-to-end real-time path planning can also be performed for a specific scene. Each decision action of the mechanical arm in the specific scene is obtained by training the decision model, and an optimal complete path is thereby obtained. In practical application, the target object image acquired by the vision sensor is input into the trained decision model to obtain the path information for the mechanical arm's movement. Robustness is ensured while dependence on the scene is reduced.
Fig. 4 is a schematic structural diagram of a robot arm control device according to an exemplary embodiment of the present application. As shown in fig. 4, the present application provides a robot arm control apparatus 40, the apparatus 40 including:
the obtaining module 41 is configured to obtain a target object image corresponding to the object to be taken.
And the processing module 42 is configured to determine type information and pose information of the object to be taken according to the target object image.
And the processing module 42 is further configured to determine a grabbing path according to the type information and the pose information, where the grabbing path is a moving path from the current position of the robot arm to the position of the object to be fetched.
And the processing module 42 is further configured to control each joint of the mechanical arm to perform angle adjustment according to the grabbing path, so as to grab the object to be taken.
Specifically, the present embodiment may refer to the above method embodiments, and the principle and the technical effect are similar, which are not described again.
Fig. 5 is a schematic diagram of a hardware structure of an electronic device according to an exemplary embodiment of the present application. As shown in fig. 5, the electronic apparatus 50 of this embodiment includes: a processor 51 and a memory 52; wherein,
the memory 52 is used for storing processor-executable instructions.
The processor 51 is configured to implement the robot arm control method in the above embodiments according to executable instructions stored in the memory. Reference may be made in particular to the description relating to the method embodiments described above.
Alternatively, the memory 52 may be separate or integrated with the processor 51.
When the memory 52 is provided separately, the electronic device 50 further includes a bus 53 for connecting the memory 52 and the processor 51.
The present application also provides a computer readable storage medium, in which computer instructions are stored, and the computer instructions are executed by a processor to implement the methods provided by the above-mentioned various embodiments.
The computer-readable storage medium may be a computer storage medium or a communication medium. Communication media includes any medium that facilitates transfer of a computer program from one place to another. Computer storage media may be any available media that can be accessed by a general purpose or special purpose computer. For example, a computer readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the computer readable storage medium. Of course, the computer readable storage medium may also be integral to the processor. The processor and the computer-readable storage medium may reside in an Application Specific Integrated Circuit (ASIC). Additionally, the ASIC may reside in user equipment. Of course, the processor and the computer-readable storage medium may also reside as discrete components in a communication device.
The computer-readable storage medium may be implemented by any type of volatile or nonvolatile Memory device or combination thereof, such as Static Random-Access Memory (SRAM), Electrically-Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic Memory, flash Memory, magnetic disk or optical disk. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
The present application also provides a computer program product comprising execution instructions stored in a computer readable storage medium. The at least one processor of the device may read the execution instructions from the computer-readable storage medium, and the execution of the execution instructions by the at least one processor causes the device to implement the methods provided by the various embodiments described above.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (12)

1. A method of controlling a robot arm, the method comprising:
acquiring a target object image corresponding to an object to be taken;
determining type information and pose information of the object to be taken according to the target object image;
determining a grabbing path according to the type information and the pose information, wherein the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be taken;
and controlling each joint of the mechanical arm to perform angle adjustment according to the grabbing path so as to grab the object to be taken.
2. The method according to claim 1, wherein the determining the type information of the object to be taken according to the target object image comprises:
obtaining a classification result by using a pre-trained classification model according to the target object image;
and determining the type information of the object to be taken according to the classification result.
3. The method according to claim 1, wherein the determining pose information of the object to be taken according to the target object image comprises:
determining target image features according to the target object image;
and determining the pose information of the object to be taken by using a pose calculation model according to the target image features.
4. The method of claim 1, wherein the determining a grabbing path according to the type information and the pose information comprises:
acquiring current pose information of each joint of the mechanical arm;
and determining the grabbing path by using a decision model according to the current pose information, the type information and the pose information of the object to be taken.
5. The method of claim 4, wherein the determining the grabbing path by using a decision model according to the current pose information, the type information of the object to be taken, and the pose information comprises:
the decision model is Q; it is assumed that the number of iterations is Rounds, wherein Rounds is a positive integer, the batch size batch_size for batch gradient descent is m, and the maximum size of the experience replay pool is n;
taking the current pose information, the type information of the object to be taken and the pose information as the state vector φ(S) in state S, wherein the state S is an initialization state;
inputting the state vector φ(S) into the decision model Q to obtain the current action A;
executing the current action A in the state S to obtain a next state S', wherein the next state S' corresponds to the feature vector φ(S'), the reward R, and the termination flag is_end;
adding the quintuple {φ(S), A, R, φ(S'), is_end} into an experience replay pool; if the size of the experience replay pool is larger than m, sampling in batches from the experience replay pool and updating network parameters in the decision model; if the size of the experience replay pool is larger than n, removing the quintuple added earliest from the experience replay pool before adding the new quintuple;
updating the state S to a state S';
judging whether is_end indicates the final state; if not, continuing to loop, randomly taking samples from the experience replay pool; if so, ending the loop to obtain a final decision model;
and determining the grabbing path according to the final decision model.
6. The method of claim 5, wherein the state vector φ(S) further comprises a specific scene, wherein the specific scene comprises a scene in which the structure of the object to be taken is not fixed.
7. The method according to claim 1, wherein the controlling each joint of the mechanical arm to perform angle adjustment according to the grabbing path to grab the object to be taken comprises:
obtaining a motion trajectory of the mechanical arm by using a smooth trajectory interpolation method according to the grabbing path;
and controlling each joint of the mechanical arm to adjust its angle according to the motion trajectory so as to grab the object to be taken.
8. The method of any one of claims 1-7, further comprising:
and according to the target object image, if the type of the target object cannot be determined, acquiring the target object image again through the visual sensor.
9. An apparatus for controlling a robot arm, comprising:
the acquisition module is used for acquiring a target object image corresponding to an object to be taken;
the processing module is used for determining the type information and the pose information of the object to be taken according to the target object image;
the processing module is further used for determining a grabbing path according to the type information and the pose information, wherein the grabbing path is a moving path from the current position of the mechanical arm to the position of the object to be taken;
and the processing module is further used for controlling each joint of the mechanical arm to perform angle adjustment according to the grabbing path so as to grab the object to be taken.
10. An electronic device, comprising: a memory, a processor;
a memory for storing the processor-executable instructions;
a processor for implementing the robot arm control method of any one of claims 1 to 8 in accordance with executable instructions stored in the memory.
11. A computer-readable storage medium having computer-executable instructions stored therein, which when executed by a processor, are configured to implement the robot arm control method of any one of claims 1 to 8.
12. A computer program product comprising instructions which, when executed by a processor, carry out the robot arm control method of any of claims 1 to 8.
CN202110521680.1A 2021-05-13 2021-05-13 Mechanical arm control method and device, electronic equipment and storage medium Pending CN113232019A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110521680.1A CN113232019A (en) 2021-05-13 2021-05-13 Mechanical arm control method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110521680.1A CN113232019A (en) 2021-05-13 2021-05-13 Mechanical arm control method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113232019A true CN113232019A (en) 2021-08-10

Family

ID=77133957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110521680.1A Pending CN113232019A (en) 2021-05-13 2021-05-13 Mechanical arm control method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113232019A (en)


Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106970615A (en) * 2017-03-21 2017-07-21 西北工业大学 A kind of real-time online paths planning method of deeply study
CN107883929A (en) * 2017-09-22 2018-04-06 中冶赛迪技术研究中心有限公司 Monocular vision positioner and method based on multi-joint mechanical arm
CN109176521A (en) * 2018-09-19 2019-01-11 北京因时机器人科技有限公司 A kind of mechanical arm and its crawl control method and system
CN109333536A (en) * 2018-10-26 2019-02-15 北京因时机器人科技有限公司 A kind of robot and its grasping body method and apparatus
KR20200059111A (en) * 2018-11-20 2020-05-28 한양대학교 산학협력단 Grasping robot, grasping method and learning method for grasp based on neural network
CN111275063A (en) * 2018-12-04 2020-06-12 广州中国科学院先进技术研究所 Robot intelligent grabbing control method and system based on 3D vision
CN109521774A (en) * 2018-12-27 2019-03-26 南京芊玥机器人科技有限公司 A kind of spray robot track optimizing method based on intensified learning
CN111383263A (en) * 2018-12-28 2020-07-07 阿里巴巴集团控股有限公司 System, method and device for grabbing object by robot
CN109483554A (en) * 2019-01-22 2019-03-19 清华大学 Robotic Dynamic grasping means and system based on global and local vision semanteme
CN109531584A (en) * 2019-01-31 2019-03-29 北京无线电测量研究所 A kind of Mechanical arm control method and device based on deep learning
CN110315258A (en) * 2019-07-24 2019-10-11 广东工业大学 A kind of welding method based on intensified learning and ant group algorithm
CN110977967A (en) * 2019-11-29 2020-04-10 天津博诺智创机器人技术有限公司 Robot path planning method based on deep reinforcement learning
CN111251294A (en) * 2020-01-14 2020-06-09 北京航空航天大学 Robot grabbing method based on visual pose perception and deep reinforcement learning
CN111496770A (en) * 2020-04-09 2020-08-07 上海电机学院 Intelligent carrying mechanical arm system based on 3D vision and deep learning and use method
CN111618847A (en) * 2020-04-22 2020-09-04 南通大学 Mechanical arm autonomous grabbing method based on deep reinforcement learning and dynamic motion elements
CN111923039A (en) * 2020-07-14 2020-11-13 西北工业大学 Redundant mechanical arm path planning method based on reinforcement learning
CN112605983A (en) * 2020-12-01 2021-04-06 浙江工业大学 Mechanical arm pushing and grabbing system suitable for intensive environment

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113753562A (en) * 2021-08-24 2021-12-07 深圳市长荣科机电设备有限公司 Carrying method, system and device based on linear motor and storage medium
CN113753562B (en) * 2021-08-24 2023-07-25 深圳市长荣科机电设备有限公司 Linear motor-based carrying method, system, device and storage medium
CN113942009A (en) * 2021-09-13 2022-01-18 苏州大学 Robot bionic hand grabbing method and system
CN114523470A (en) * 2021-12-30 2022-05-24 浙江图盛输变电工程有限公司 Robot operation path planning method based on bearing platform linkage
CN114523470B (en) * 2021-12-30 2024-05-17 浙江图盛输变电工程有限公司 Robot operation path planning method based on bearing platform linkage
CN114683251A (en) * 2022-03-31 2022-07-01 上海节卡机器人科技有限公司 Robot grabbing method and device, electronic equipment and readable storage medium
CN115648232A (en) * 2022-12-30 2023-01-31 广东隆崎机器人有限公司 Mechanical arm control method and device, electronic equipment and readable storage medium
CN115847488A (en) * 2023-02-07 2023-03-28 成都秦川物联网科技股份有限公司 Industrial Internet of things system for cooperative robot monitoring and control method
CN115847488B (en) * 2023-02-07 2023-05-02 成都秦川物联网科技股份有限公司 Industrial Internet of things system for collaborative robot monitoring and control method
US11919166B2 (en) 2023-02-07 2024-03-05 Chengdu Qinchuan Iot Technology Co., Ltd. Industrial internet of things for monitoring collaborative robots and control methods, storage media thereof

Similar Documents

Publication Publication Date Title
CN113232019A (en) Mechanical arm control method and device, electronic equipment and storage medium
Dasari et al. Robonet: Large-scale multi-robot learning
TWI776113B (en) Object pose estimation method, device and computer readable storage medium thereof
CN110076772B (en) Grabbing method and device for mechanical arm
Sadeghi et al. Sim2real viewpoint invariant visual servoing by recurrent control
Finn et al. Guided cost learning: Deep inverse optimal control via policy optimization
CN111203878B (en) Robot sequence task learning method based on visual simulation
WO2022100363A1 (en) Robot control method, apparatus and device, and storage medium and program product
US20180290298A1 (en) Apparatus and methods for training path navigation by robots
CN113076615B (en) High-robustness mechanical arm operation method and system based on antagonistic deep reinforcement learning
Bohez et al. Sensor fusion for robot control through deep reinforcement learning
Chen et al. Combining reinforcement learning and rule-based method to manipulate objects in clutter
CN112164112B (en) Method and device for acquiring pose information of mechanical arm
CN114387513A (en) Robot grabbing method and device, electronic equipment and storage medium
CN114564009A (en) Surgical robot path planning method and system
CN114789454B (en) Robot digital twin track completion method based on LSTM and inverse kinematics
CN115781685A (en) High-precision mechanical arm control method and system based on reinforcement learning
Luo et al. Balance between efficient and effective learning: Dense2sparse reward shaping for robot manipulation with environment uncertainty
Hu et al. Grasping living objects with adversarial behaviors using inverse reinforcement learning
Liu et al. Sim-and-real reinforcement learning for manipulation: A consensus-based approach
Xu et al. Deep reinforcement learning for parameter tuning of robot visual servoing
CN111015676B (en) Grabbing learning control method, system, robot and medium based on hand-free eye calibration
Hwang et al. Image base visual servoing base on reinforcement learning for robot arms
Xu et al. A fast and straightforward hand-eye calibration method using stereo camera
Zhao et al. A robot demonstration method based on LWR and Q-learning algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210810)