Autonomous learning method for a mechanical arm control method
Technical Field
The invention relates to the field of mechanical arm control, in particular to an autonomous learning method of a mechanical arm control method.
Background
Automatically moving a robotic arm by having it execute a designated procedure is used in many applications; one of these is that remotely controlled robotic arms can perform both non-medical and medical procedures. As one particular use, a teleoperated surgical manipulator can perform minimally invasive medical procedures. It is desirable in the medical arts to reduce the amount of tissue damaged during a procedure, thereby reducing the patient's recovery time, discomfort, and harmful side effects. However, when a robot arm is used in this field to move around a remote center of motion (also referred to as a "remote center"), the arm is affected by surrounding tissue and its control accuracy is lowered, so that the actual posture of the arm deviates from the posture or motion the command is intended to produce, and the motion of the arm is subject to error.
Chinese patent application publication No. CN111315309A, published on 6/19/2020, discloses a system and method for controlling a robotic manipulator or associated tool. There, the factor by which external tissue degrades the control accuracy of a robotic arm is taken to be the vibration generated by that tissue, and control accuracy is therefore improved by reducing the vibration to which the robotic arm is subjected.
However, since the vibration cannot be completely eliminated, the accuracy of the robot arm cannot be improved beyond a certain point, and the improvement obtainable by reducing vibration alone is extremely limited.
Disclosure of Invention
In order to overcome the problem of low control precision at the remote motion center of the mechanical arm in the prior art, the invention provides an autonomous learning method for a mechanical arm control method, in which the control precision of the mechanical arm is improved through learning feedback.
In order to solve the above technical problem, the invention adopts the following technical scheme. The autonomous learning method of the mechanical arm control method is applied to a mechanical arm and an executing element arranged on the mechanical arm, wherein the mechanical arm comprises a plurality of driving motors for driving the executing element to move, and the executing element further comprises a force feedback sensor. The autonomous learning method of the mechanical arm control method comprises the following steps:
step one: initializing the mechanical arm and the force feedback sensor;
step two: setting a moving path of the executing element;
step three: operating the executing element to move along the moving path to a remote motion center point, and having the executing element complete one rotation period within a given rotation range to obtain remote motion center point data;
step four: establishing a learning network and giving a learning step number n; the executing element then performs n rotational motions on the moving path, and after each motion the remote motion center point and the learning network are updated according to the force feedback sensor data.
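The four steps above can be wired together as a high-level loop. Every callable name in this sketch is a hypothetical stand-in (the text names no API); it is shown only to make the control flow concrete.

```python
# High-level sketch of steps one to four; all callables are hypothetical
# stand-ins wired together only to show the control flow.
def autonomous_learning(initialize, set_path, collect_centers, make_network, n):
    initialize()                        # step one: arm and force sensor init
    path = set_path()                   # step two: set the moving path
    centers = collect_centers(path)     # step three: rotation period at the RCM
    net = make_network(centers)         # step four: build the learning network
    for _ in range(n):                  # n rotational motions
        net["updates"] += 1             # placeholder for the per-step update of
                                        # the center point and learning network
    return net

net = autonomous_learning(lambda: None, lambda: "path",
                          lambda p: [p], lambda c: {"updates": 0}, n=3)
```

After three learning steps, the placeholder network records three updates.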
When the mechanical arm drives the executing element through the remote-motion-center motion along the set path, the motion displacement of the executing element is output through the learning network. The learning network is then updated according to the force feedback reported by the force feedback sensor during the motion, and the motion displacement of the executing element is readjusted. In this way the executing element continuously adjusts its displacement according to the external environment during the motion, so that its actual motion stays as consistent as possible with the motion output by the control instruction.
Preferably, the force feedback sensor is initialized by reading the force feedback signal with no external force applied and defining that reading as the zero point; during online learning, the force feedback signal undergoes difference-value conversion, and the difference from the zero point is taken as the output.
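The zero-point initialization and difference-value conversion described above can be sketched as follows; the numeric readings are made-up illustrative values.

```python
# Minimal sketch of the sensor zeroing described above: the reading taken with
# no external force is stored as the zero point, and every later reading is
# reported as its difference from that zero point. Readings are made-up values.
class ForceFeedback:
    def __init__(self, raw_zero_reading):
        self.zero = raw_zero_reading       # signal read with no external force

    def output(self, raw_reading):
        return raw_reading - self.zero     # difference-value conversion

sensor = ForceFeedback(raw_zero_reading=0.13)
zeroed = sensor.output(0.93)               # roughly 0.8 after subtracting the zero point
```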
Preferably, the mechanical arm comprises a clamping part for clamping the executing element, a first linear motor and a second linear motor for driving the clamping part to rotate, and a third linear motor, arranged on the clamping part, for driving the executing element to move linearly along the clamping part. In step one, the parameters of the mechanical arm are measured during initialization, including the vertical distance dm from the third linear motor to the horizontal axis of the first or second linear motor, the distance Ltool from the horizontal center of the force feedback sensor to the end of the executing element, and the linear distance h between the first and second linear motors; position calibration and zero resetting are then performed on the first, second, and third linear motors.
Preferably, in step three, the given rotation range is the positive and negative maximum rotation angle [Θ, -Θ] reachable within the stroke of the first and second linear motors. The state in which the end of the executing element is perpendicular to the horizontal plane is defined as the angle zero point, and one rotation period is defined as the motion in which the executing element starts from the angle zero point, rotates to the angle Θ, reverses through the angle zero point to reach the angle -Θ, and then rotates back to the angle zero point.
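One rotation period as defined above (zero point, then +Θ, then through zero to -Θ, then back to zero) can be generated as an angle sequence. The step size is an assumed sampling resolution, not a parameter from the text.

```python
# Sketch of one rotation period: start at the angle zero point, rotate to
# +theta_max, reverse through zero to -theta_max, then return to zero.
# step is an assumed sampling resolution.
def rotation_period(theta_max, step):
    m = int(theta_max / step)
    ascend  = [i * step for i in range(m + 1)]                     # 0 .. +theta_max
    descend = [theta_max - i * step for i in range(1, 2 * m + 1)]  # +theta_max .. -theta_max
    back    = [-theta_max + i * step for i in range(1, m + 1)]     # -theta_max .. 0
    return ascend + descend + back

angles = rotation_period(theta_max=10.0, step=5.0)
# one full period: [0.0, 5.0, 10.0, 5.0, 0.0, -5.0, -10.0, -5.0, 0.0]
```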
Preferably, the remote motion center points are defined as a 3 x k matrix, k being the total number of center-point groups, wherein each column represents one group of center points (CL1_i, CL2_i, CL3_i), and CL1, CL2, and CL3 are the mark positions of the first, second, and third linear motors, respectively. The remote motion center points are selected by recording the actual stroke ranges of the first, second, and third linear motors during the movement of the executing element in step three, namely [minL1, maxL1], [minL2, maxL2], [minL3, maxL3]; the k center points can then be represented as:
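The 3 x k center matrix could be built from the recorded stroke ranges as follows. How the k points are drawn from each range is not spelled out in this text, so even spacing is an assumption of this sketch, and the ranges are made-up numbers.

```python
import numpy as np

# Hedged sketch: one row per motor mark position (CL1, CL2, CL3); column i is a
# center-point group (CL1_i, CL2_i, CL3_i). Even spacing across each recorded
# stroke range [min, max] is an assumption of this sketch.
def center_matrix(stroke_ranges, k):
    return np.vstack([np.linspace(lo, hi, k) for lo, hi in stroke_ranges])

S = center_matrix([(0.0, 4.0), (1.0, 3.0), (10.0, 14.0)], k=5)  # a 3 x 5 matrix
```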
preferably, the specific steps of updating the learning network in the fourth step are as follows:
s1: calculating the angle θ between the end of the current executing element and the vertical plane:
Wherein, L1 and L2 are the current positions of the first linear motor and the second linear motor respectively;
s2: calculating the Jacobian matrix at the current position, using the following formula:
wherein L3 is the current position of the third linear motor;
s3: obtaining the estimated relative displacements of the first, second, and third linear motors in this training step from the Jacobian matrix calculation, using the following formula:
wherein ΔL1, ΔL2, and ΔL3 are the estimated displacement amounts of the first, second, and third linear motors, respectively, and (x, y) are the coordinates of the RCM point;
s4: inputting the estimated displacements ΔL1, ΔL2, ΔL3 and the center matrix into the learning network, and adjusting the estimated displacements according to the weight matrix W to obtain the target displacements; the calculation method is as follows:
wherein ΔL1', ΔL2', and ΔL3' are the target displacement amounts of the first, second, and third linear motors, respectively; S is the matrix formed by the remote motion center points; and b is a 3 x 1 matrix representing the offset value, modified by the learning network, in each displacement direction;
s5: moving the first, second, and third linear motors according to the target displacement amounts obtained in step S4, so that the executing element forms a new angle with the vertical plane, specifically:
s6: acquiring and analyzing the force feedback sensor data to obtain the components of the force feedback along the coordinate directions of the remote motion center:
τx = cos(θ) * τ';
τy = sin(θ) * τ';
wherein τ' is the force feedback value of the executing element, and τx, τy are its components along the x and y directions, respectively;
s7: updating the weight matrix according to the force feedback sensor data, wherein the updating formula is as follows:
W' = W - a * learnrate * (τx * J1 + τy * J2) * S;
b = a * learnrate * (τx * J1 + τy * J2)
wherein W' is the updated weight matrix; W is the original weight matrix; a ∈ {1, -1} is the flag of the current rotation direction; J1 is the first column of the Jacobian matrix; and J2 is the second column of the Jacobian matrix;
s8: steps S1 to S7 constitute one update cycle of the learning network; the next update cycle executes S1 to S7 again, taking the parameters updated in the previous cycle as the parameters of the learning network.
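The core of one update cycle (S3, S6, S7) might be sketched as below. The text does not fix the matrix shapes, so the following are assumptions of this sketch: the Jacobian J is taken as 3 x 2 with columns J1 and J2, S is the 3 x k center matrix, and the product (τx*J1 + τy*J2)*S is read as broadcasting a 3-vector across the columns of S.

```python
import numpy as np

# Hedged sketch of the S3/S6/S7 calculations; shapes are assumptions, since the
# original formulas for theta, the Jacobian, and the target displacements are
# given as images and not reproduced in this text.
def update_cycle(W, S, J, theta, rcm_xy, tau_prime, a, learnrate):
    x, y = rcm_xy
    dL = J @ np.array([x, y])                    # S3: estimated displacements (assumed form)
    tau_x = np.cos(theta) * tau_prime            # S6: x component of force feedback
    tau_y = np.sin(theta) * tau_prime            # S6: y component
    g = tau_x * J[:, 0] + tau_y * J[:, 1]        # tau_x*J1 + tau_y*J2, a 3-vector
    W_new = W - a * learnrate * g[:, None] * S   # S7: W' = W - a*learnrate*(...)*S
    b_new = a * learnrate * g                    # S7: b  = a*learnrate*(...)
    return W_new, b_new, dL

W = np.zeros((3, 3))
S = np.ones((3, 3))
J = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
W2, b2, dL = update_cycle(W, S, J, theta=0.0, rcm_xy=(2.0, 3.0),
                          tau_prime=2.0, a=1, learnrate=0.1)
```

At θ = 0 the force resolves entirely into the x direction, so only the first row of W and the first entry of b change in this example.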
Preferably, when the executing element performs the remote-motion-center motion, before each driving step the target displacement amounts of the first, second, and third linear motors are calculated from the Jacobian matrix at the end position of the executing element as computed in step four, and the motion of the executing element is completed according to the given target displacement amounts.
Preferably, the force feedback sensor data is provided with a threshold T: when the force feedback value of the executing element after a movement exceeds the threshold T, an update of the learning network weight W is triggered; otherwise, no weight update is performed. By updating the learning network only when needed, the force on the executing element is continuously reduced and the degree of squeezing of the surrounding tissue is lowered.
Preferably, if the force feedback value of the executing element does not exceed the threshold T over one full rotation period, the training process of the learning network is considered complete. Learning thus guarantees that the needle tip moves about the remote motion center within a force range smaller than T, dynamically improving the precision of the remote-motion-center motion.
The autonomous learning system for the mechanical arm control method is used for realizing the autonomous learning method for the mechanical arm control method and comprises a mechanical arm and a controller module, wherein the controller module is electrically connected with the mechanical arm, acquires force feedback sensor data and controls the motion of the mechanical arm; the controller module includes a processor for performing operations.
Compared with the prior art, the beneficial effects are as follows: the executing element is highly adapted to the actual operating environment, deviations caused by deformation of the environment can be taken into account, the motion adjustment of the remote motion center is guided dynamically, the precision with which the mechanical arm executes the motion is greatly improved, and the error between the actual motion of the mechanical arm and the standard motion output by the control instruction is reduced.
Drawings
FIG. 1 is a schematic view of a robotic arm of the present invention;
fig. 2 is a flowchart of an autonomous learning method of a robot control method of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent; for the purpose of better illustrating the present embodiments, certain elements of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted. The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there are terms such as "upper", "lower", "left", "right", "long", "short", etc., indicating orientations or positional relationships based on the orientations or positional relationships shown in the drawings, it is only for convenience of description and simplicity of description, but does not indicate or imply that the device or element referred to must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationships in the drawings are only used for illustrative purposes and are not to be construed as limitations of the present patent, and specific meanings of the terms may be understood by those skilled in the art according to specific situations.
The technical scheme of the invention is further described in detail by the following specific embodiments in combination with the attached drawings:
example 1
Fig. 1-2 show an embodiment of an autonomous learning method for a robot arm control method, involving a robot arm 1 and an actuator 2 mounted on the robot arm 1, where the robot arm 1 includes a plurality of driving motors for driving the actuator 2 to move, and the actuator 2 further comprises a force feedback sensor 201. The robot arm 1 comprises a clamping part 3 for clamping the actuator 2, a first linear motor 4 and a second linear motor 5 for driving the clamping part 3 to rotate, the first linear motor 4 and the second linear motor 5 each being hinged to the clamping part 3, and a third linear motor 6, arranged on the clamping part 3, for driving the actuator 2 to move linearly along the clamping part 3. In step one, the parameters of the robot arm 1 are measured during initialization, including the vertical distance dm from the third linear motor 6 to the horizontal axis of the first linear motor 4 or the second linear motor 5, the distance Ltool from the horizontal center of the force feedback sensor 201 to the end of the actuator 2, and the linear distance h between the first linear motor 4 and the second linear motor 5; position calibration and zero resetting are then performed on the first linear motor 4, the second linear motor 5, and the third linear motor 6.
The autonomous learning method of the control method of the mechanical arm 1 includes the steps of:
step one: initializing the mechanical arm 1 and the force feedback sensor 201;
step two: setting a moving path of the actuator 2;
step three: operating the actuator 2 to move along the moving path to a remote motion center point, and having the actuator 2 complete one rotation period within a given rotation range to obtain remote motion center point data. The given rotation range is the positive and negative maximum rotation angle [Θ, -Θ] reachable within the stroke of the first linear motor 4 and the second linear motor 5. The state in which the end of the actuator 2 is perpendicular to the horizontal plane is defined as the angle zero point, and one rotation period is defined as the motion in which the actuator 2 starts from the angle zero point, rotates to the angle Θ, reverses through the angle zero point to reach the angle -Θ, and rotates back to the angle zero point. The remote motion center points are defined as a 3 x k matrix, k being the total number of center-point groups, wherein each column represents one group of center points (CL1_i, CL2_i, CL3_i), and CL1, CL2, and CL3 are the mark positions of the first linear motor 4, the second linear motor 5, and the third linear motor 6, respectively. The remote motion center points are selected by recording the actual stroke ranges of the first linear motor 4, the second linear motor 5, and the third linear motor 6 during the movement of the actuator 2 in step three, namely [minL1, maxL1], [minL2, maxL2], [minL3, maxL3]; the k remote motion center points can then be expressed as:
step four: establishing a learning network and giving a learning step number n; the actuator 2 then performs n rotational motions on the moving path, and after each motion the remote motion center point and the learning network are updated according to the force feedback sensor 201 data. The specific steps are as follows:
s1: calculating the angle θ between the end of the current actuator 2 and the vertical plane:
Wherein L1 and L2 are the current positions of the first linear motor 4 and the second linear motor 5, respectively;
s2: calculating the Jacobian matrix at the current position, using the following formula:
wherein L3 is the current position of the third linear motor 6;
s3: obtaining the estimated relative displacements of the first linear motor 4, the second linear motor 5, and the third linear motor 6 in this training step from the Jacobian matrix calculation, using the following formula:
wherein ΔL1, ΔL2, and ΔL3 are the estimated displacement amounts of the first linear motor 4, the second linear motor 5, and the third linear motor 6, respectively, and (x, y) are the coordinates of the RCM point;
s4: inputting the estimated displacements ΔL1, ΔL2, ΔL3 and the center matrix into the learning network, and adjusting the estimated displacements according to the weight matrix W to obtain the target displacements; the calculation method is as follows:
wherein ΔL1', ΔL2', and ΔL3' are the target displacement amounts of the first linear motor 4, the second linear motor 5, and the third linear motor 6, respectively; S is the matrix formed by the remote motion center points; and b is a 3 x 1 matrix representing the offset value, modified by the learning network, in each displacement direction;
s5: moving the first linear motor 4, the second linear motor 5, and the third linear motor 6 according to the target displacement amounts obtained in step S4, so that the actuator 2 forms a new angle with the vertical plane, specifically:
s6: acquiring and analyzing the force feedback sensor 201 data to obtain the components of the force feedback along the coordinate directions of the remote motion center:
τx = cos(θ) * τ';
τy = sin(θ) * τ';
wherein τ' is the force feedback value of the actuator 2, and τx, τy are its components along the x and y directions, respectively;
s7: the weight matrix is updated according to the data of the force feedback sensor 201, and the updating formula is as follows:
W' = W - a * learnrate * (τx * J1 + τy * J2) * S;
b = a * learnrate * (τx * J1 + τy * J2)
wherein W' is the updated weight matrix; W is the original weight matrix; a ∈ {1, -1} is the flag of the current rotation direction; J1 is the first column of the Jacobian matrix; and J2 is the second column of the Jacobian matrix;
s8: steps S1-S7 constitute one update cycle of the learning network; the next update cycle executes S1-S7 again, taking the parameters updated in the previous cycle as the parameters of the learning network. That is, the weight matrix W used in S4 of the n-th cycle is the updated weight matrix W' obtained from S7 of the (n-1)-th cycle.
Further, the force feedback sensor 201 data is provided with a threshold T: when the force feedback value of the actuator 2 after the current movement exceeds the threshold T, an update of the learning network weight W is triggered; otherwise, no weight update is performed. By updating the learning network only when needed, the force on the actuator 2 is continuously reduced and the degree of squeezing of the surrounding tissue is lowered. If the force feedback value of the actuator 2 does not exceed the threshold T over one full rotation period, the training process of the learning network is considered complete. Learning thus guarantees that the needle tip moves about the remote motion center within a force range smaller than T, dynamically improving the precision of the remote-motion-center motion.
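The threshold logic of this embodiment can be sketched as below: the weight update fires only when the post-movement force feedback exceeds T, and training counts as complete once a full rotation period passes with no reading above T. The readings and the value of T here are made-up numbers.

```python
# Illustrative sketch of the threshold-gated update and the completion test.
def run_period(readings, T, update):
    updated = False
    for tau in readings:
        if abs(tau) > T:       # force feedback exceeds threshold: trigger update
            update(tau)
            updated = True
    return not updated         # True once a whole rotation period stays within T

log = []
done = run_period([0.05, 0.30, 0.10], T=0.2, update=log.append)
# done is False here: one reading (0.30) exceeded T and triggered an update
```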
The working principle of this embodiment is as follows: when the mechanical arm 1 drives the actuator 2 through the remote-motion-center motion along the set path, the motion displacement of the actuator 2 is output through the learning network. The learning network is then updated according to the force feedback reported by the force feedback sensor 201 during the motion, and the motion displacement of the actuator 2 is readjusted. In this way the actuator 2 continuously adjusts its displacement according to the external environment during the motion, so that its actual motion stays as consistent as possible with the motion output by the control instruction.
The beneficial effects of this embodiment are as follows: real-time learning is performed on the operation object in the actual environment, and the motion of the remote motion center is calibrated according to the force feedback of the actuator 2. The motion of the actuator 2 thus becomes highly adapted to the actual operating environment, deviations caused by deformation of the environment can be taken into account, the motion adjustment of the remote motion center is guided dynamically, the precision with which the mechanical arm 1 executes the motion is greatly improved, and the error between the actual motion of the mechanical arm 1 and the standard motion output by the control instruction is reduced.
Example 2
An autonomous learning system of a robot arm control method is used for realizing the autonomous learning method of the robot arm control method of embodiment 1, and comprises a robot arm and a controller module, wherein the controller module is electrically connected with the robot arm, acquires force feedback sensor data and controls the motion of the robot arm; the controller module includes a processor for performing operations.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to exhaustively list all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the claims of the present invention.