WO2022199146A1 - Robot control method based on spiking neural network, robot, and storage medium - Google Patents

Publication number
WO2022199146A1
Authority
WO
WIPO (PCT)
Prior art keywords
robot
neural network
instruction
trajectory
spiking neural
Prior art date
Application number
PCT/CN2021/137977
Other languages
English (en)
French (fr)
Inventor
陈鑫
李骁健
岳斌
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Publication of WO2022199146A1

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1656: Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1664: Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00: Programme-controlled manipulators
    • B25J9/16: Programme controls
    • B25J9/1628: Programme controls characterised by the control loop
    • B25J9/163: Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P80/00: Climate change mitigation technologies for sector-wide applications
    • Y02P80/10: Efficient use of energy, e.g. using compressed air or pressurized fluid as energy carrier

Definitions

  • the present application relates to the technical field of robot control, and in particular, to a robot control method, robot and storage medium based on a spiking neural network.
  • when the robot is a robotic arm, the arm is fixed to a base, so its workspace is limited and fixed. Within that workspace, control of the arm must complete the task while improving efficiency, reducing energy consumption, and limiting wear that shortens the arm's service life.
  • the main technical problem to be solved by the present application is to provide a robot control method, robot and storage medium based on a spiking neural network, which can improve the stability and robustness of robot control.
  • a technical solution adopted in this application is to provide a method for controlling a robot based on a spiking neural network.
  • the method for controlling the robot includes: generating a first instruction according to a preset motion trajectory, where the first instruction is used to control the robot to move to a first position on the preset motion trajectory; obtaining feedback data of the robot moving to the first position; calculating trajectory correction data in the spiking neural network based on the first instruction and the feedback data; and generating a second instruction according to the preset motion trajectory, where the second instruction and the trajectory correction data are used to control the robot to move from the first position to a second position on the preset motion trajectory.
  • before generating the first instruction according to the preset motion trajectory, the method includes: acquiring the target position to which the robot is to move; and determining the preset motion trajectory of the robot according to the target position and the starting position of the robot.
  • calculating the trajectory correction data in the spiking neural network based on the first instruction and the feedback data includes: using the first instruction and the feedback data to update the weights of the spiking neural network; and using the updated spiking neural network to calculate the trajectory correction data.
  • using the first instruction and the feedback data to update the weights of the spiking neural network includes: encoding the first instruction and the feedback data to obtain the activity of the neurons in the spiking neural network; calculating a decoder from the activity of the neurons; calculating a decoding estimate from the decoder and the activity of the neurons; obtaining a first difference from the decoding estimate and the feedback data; obtaining a weight correction value of the spiking neural network from the first difference and the activity of the neurons; and updating the weights of the spiking neural network with the weight correction value.
  • using the updated spiking neural network to calculate the trajectory correction data includes: obtaining the trajectory correction data from the updated weights, the decoder, and the activity of the neurons.
  • obtaining the trajectory correction data from the updated weights, the decoder, and the activity of the neurons includes using the following formula to calculate the trajectory correction data: where $a$ represents the activity of the neurons, $\omega$ represents the updated weights, $d$ represents the decoder, and $\Gamma_{\mathrm{adapt}}$ represents the trajectory correction data.
  • another technical solution adopted in the present application is to provide a robot, which includes a processor and a memory coupled to the processor; the memory is used to store program data, and the processor is used to execute the program data to implement the method provided by the above technical solution.
  • the robot is a robotic arm.
  • another technical solution adopted in this application is to provide a computer-readable storage medium, where the computer-readable storage medium is used to store program data, and the program data, when executed by a processor, implements the method provided by the above technical solution.
  • the beneficial effects of the present application are as follows. Unlike the prior art, the spiking-neural-network-based robot control method of the present application includes: generating a first instruction according to a preset motion trajectory, where the first instruction is used to control the robot to move to a first position on the preset motion trajectory; obtaining feedback data of the robot moving to the first position; calculating trajectory correction data in the spiking neural network based on the first instruction and the feedback data; and generating a second instruction according to the preset motion trajectory, where the second instruction and the trajectory correction data are used to control the robot to move from the first position to a second position on the preset motion trajectory.
  • in this way, on the one hand, the spiking neural network corrects the trajectory of the robot in real time, so that the robot moves stably and accurately and the stability and robustness of robot control are improved; on the other hand, using the spiking neural network improves the efficiency of calculating the trajectory correction data, which in turn improves the motion efficiency of the robot.
  • FIG. 1 is a schematic flowchart of an embodiment of a method for controlling a robot based on a spiking neural network provided by the present application;
  • Fig. 2 is the schematic flow chart before step 11 in Fig. 1 provided by this application;
  • FIG. 3 is a schematic flowchart of another embodiment of the spiking neural network-based robot control method provided by the present application.
  • Fig. 4 is the specific flow chart of step 33 in Fig. 3 provided by this application;
  • Fig. 5 is the specific flow chart of step 332 in Fig. 4 provided by this application;
  • Fig. 6 is the schematic diagram of application result of the robot control method based on spiking neural network provided by this application;
  • FIG. 7 is a schematic structural diagram of an embodiment of a robot provided by the present application.
  • FIG. 8 is a schematic structural diagram of another embodiment of the robot provided by the present application.
  • FIG. 9 is a schematic structural diagram of an embodiment of a computer-readable storage medium provided by the present application.
  • FIG. 1 is a schematic flowchart of an embodiment of a method for controlling a robot based on a spiking neural network provided by the present application.
  • the method includes:
  • Step 11 Generate a first instruction according to the preset motion trajectory, where the first instruction is used to control the robot to move to a first position on the preset motion trajectory.
  • the preset motion trajectory consists of a series of coordinate points.
  • the first command may be a control signal for each joint, and by controlling each joint, the robot can move to the first position on the preset movement trajectory.
  • Step 21 Obtain the target position to which the robot is to move.
  • the robot moves according to a given target position.
  • for a robotic arm, the arm can be moved to the specified target position.
  • Step 22 Determine the preset motion trajectory of the robot according to the target position and the starting position of the robot.
  • Step 22 may establish the most reasonable preset motion trajectory according to the target position and the starting position. If there is an obstacle between the target position and the starting position, the preset motion trajectory can be made to bypass the obstacle.
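As a minimal sketch of step 22, the preset motion trajectory can be represented as a list of coordinate points between the starting position and the target position. The function name `plan_straight_line`, the 2-D points, and the straight-line interpolation are illustrative assumptions; the source does not say how the trajectory is planned, and obstacle avoidance is not modelled here.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def plan_straight_line(start: Point, target: Point, n_points: int) -> List[Point]:
    """Return n_points evenly spaced waypoints from start to target, inclusive.

    Hypothetical helper: a straight line is only one possible preset trajectory.
    """
    if n_points < 2:
        raise ValueError("need at least two waypoints (start and target)")
    waypoints = []
    for k in range(n_points):
        s = k / (n_points - 1)  # interpolation parameter in [0, 1]
        waypoints.append((start[0] + s * (target[0] - start[0]),
                          start[1] + s * (target[1] - start[1])))
    return waypoints

# Five waypoints from the starting position (0, 0) to the target position (1, 2).
trajectory = plan_straight_line((0.0, 0.0), (1.0, 2.0), 5)
```

Each consecutive pair of waypoints then plays the role of the "first position" and "second position" in the method.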
  • Step 12 Acquire feedback data of the robot moving to the first position.
  • the position information of each joint is collected by sensors of the robot.
  • the sensors can be encoders at the robot joints or motor ends to obtain position information of the joints.
  • the speed and direction of the robot at the current moment can be obtained.
  • the actual trajectory data of the robot can be obtained.
  • the speed of the robot at the current moment can also be detected by these sensors.
  • the actual position of the robot after it moves toward the first position may not coincide with the first position; the feedback data may include the robot's current actual position, actual speed, actual direction, actual joint torque, and so on.
  • Step 13 Calculate trajectory correction data in the spiking neural network based on the first instruction and the feedback data.
  • the spiking neural network may be constructed based on the Hodgkin-Huxley model, or may be constructed based on the Leaky Integrate and Fire model or the Izhikevich model.
  • the spiking neural network can be trained based on unsupervised learning algorithms and/or supervised learning algorithms.
  • Step 14 Generate a second instruction according to the preset motion trajectory, where the second instruction and the trajectory correction data are used to control the robot to move from the first position to the second position on the preset motion trajectory.
  • after the trajectory correction data is obtained, the robot generates an optimized second instruction from the trajectory correction data and the second instruction, so as to control the robot to move from the first position to the second position on the preset motion trajectory.
  • in this way, the actual position reached when the robot moves from the first position toward the second position on the preset motion trajectory is closer to the second position.
  • in this embodiment, the robot control method includes: generating a first instruction according to a preset motion trajectory, where the first instruction is used to control the robot to move to a first position on the preset motion trajectory; obtaining feedback data of the robot moving to the first position; calculating trajectory correction data in the spiking neural network based on the first instruction and the feedback data; and generating a second instruction according to the preset motion trajectory, where the second instruction and the trajectory correction data are used to control the robot to move from the first position to a second position on the preset motion trajectory.
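The four steps can be read as a loop in which the correction computed after one move is applied to the next instruction. The sketch below illustrates only that data flow in one dimension. The names `run_control_loop`, `toy_move`, and `toy_correction` are hypothetical, and the correction function is a plain error accumulator standing in for the spiking neural network, which it does not implement.

```python
def run_control_loop(waypoints, move, compute_correction):
    """Step through waypoints, feeding each step's correction into the next move."""
    correction = 0.0  # no correction is available for the first instruction
    log = []
    for target in waypoints:
        instruction = target                    # steps 11 and 14: instruction from the trajectory
        actual = move(instruction, correction)  # robot executes instruction plus correction
        # steps 12 and 13: feedback data and new trajectory correction data
        correction = compute_correction(instruction, actual, correction)
        log.append((target, actual))
    return log

# Toy plant (assumption): every command falls short by a constant, unmodelled offset.
def toy_move(instruction, correction):
    return instruction + correction - 0.2

# Placeholder for the spiking network (assumption): accumulate the observed error.
def toy_correction(instruction, actual, previous):
    return previous + (instruction - actual)

log = run_control_loop([1.0, 2.0, 3.0], toy_move, toy_correction)
```

In this toy setup the first move misses its waypoint by the offset, and later moves land on their waypoints once the correction has absorbed it.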
  • FIG. 3 is a schematic flowchart of another embodiment of the method for controlling a robot based on a spiking neural network provided by the present application.
  • the method includes:
  • Step 31 Generate a first instruction according to the preset motion trajectory, where the first instruction is used to control the robot to move to a first position on the preset motion trajectory.
  • Step 32 Acquire feedback data of the robot moving to the first position.
  • Step 33 Using the first instruction and the feedback data to update the weight of the spiking neural network.
  • step 33 may be the following process:
  • Step 331 Encode the first instruction and the feedback data to obtain the activity of the neurons in the spiking neural network.
  • the activity of neurons can be expressed by the following formula:
  • G[ ] is the nonlinear neural activation function
  • $\alpha$ is the scaling factor (gain) associated with the neuron
  • e is the neuron's encoder
  • x is the vector to be encoded, that is, the first instruction and the feedback data.
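A small sketch of the encoding in step 331, built only from the symbols defined above: each neuron's activity is a nonlinearity G applied to the gain times the dot product of its encoder with x. The rectified-linear choice for G and the optional `bias` term are assumptions for illustration; the source names neither, and the original formula is not reproduced in this text.

```python
def neuron_activity(x, encoder, gain, bias=0.0):
    """Activity of one neuron: G[gain * (encoder . x) + bias].

    G is taken to be a rectified linear function here (an assumption).
    """
    current = gain * sum(e_k * x_k for e_k, x_k in zip(encoder, x)) + bias
    return max(0.0, current)

def population_activity(x, encoders, gains, biases):
    """Activities of all neurons for the same input vector x."""
    return [neuron_activity(x, e, g, b) for e, g, b in zip(encoders, gains, biases)]

# x stands for the concatenated first instruction and feedback data (toy values).
x = [0.5, -1.0]
encoders = [[1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
activities = population_activity(x, encoders, gains=[2.0, 2.0, 2.0], biases=[0.0, 0.0, 0.0])
```

With these toy values, the neuron whose encoder points away from x stays silent, while the other two fire in proportion to how well their encoders align with x.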
  • Step 332 Calculate the decoder by using the activity of the neuron.
  • step 332 may use the following process to calculate the decoder:
  • Step 3321 Obtain the first parameter by using the first instruction and feedback data and the activity of the neuron.
  • step 3321 can use the following formula to obtain the first parameter:
  • a j is the activity of neuron j
  • x is the input first instruction and feedback data
  • r is the first parameter.
  • Step 3322 Obtain the second parameter by using the activities of the plurality of neurons.
  • step 3322 can use the following formula to obtain the second parameter:
  • $T_{ij} = \int a_i a_j \, dx$.
  • a j is the activity of neuron j
  • a i is the activity of neuron i
  • T ij is the second parameter between neuron j and neuron i.
  • Step 3323 Calculate the decoder using the first parameter and the second parameter.
  • step 3323 can use the following formula to find the decoder:
  • Step 333 Calculate a decoding estimate from the decoder and the activity of the neurons.
  • a dot product of the decoder and the neuron activities is taken to obtain the decoding estimate. It can be expressed using the following formula:
  • Step 334 Obtain the first difference value by using the decoding estimate and the feedback data.
  • since the decoding estimate is the optimal motion data of the robot predicted by the spiking neural network, it can be compared with the actual motion data in the feedback data; the difference between the two is the first difference.
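Steps 332 to 334 can be sketched as a least-squares fit. The code assumes the first parameter is r_j = ∫ a_j x dx, by analogy with the stated T_ij = ∫ a_i a_j dx, and that the decoder solves T d = r. Both are assumptions consistent with a least-squares decoder, not formulas quoted from the source. Integrals are approximated by sums over sample points, x is one-dimensional for brevity, and the sign convention of the first difference is also assumed.

```python
def solve_linear(T, r):
    """Solve T d = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    aug = [list(T[i]) + [r[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("T is singular; use more samples or regularise")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    d = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * d[k] for k in range(i + 1, n))
        d[i] = (aug[i][n] - tail) / aug[i][i]
    return d

def compute_decoders(activity_fns, samples):
    """Steps 3321-3323: r_j = sum a_j(x) x, T_ij = sum a_i(x) a_j(x), then d = T^-1 r."""
    acts = [[f(x) for f in activity_fns] for x in samples]
    n = len(activity_fns)
    r = [sum(a[j] * x for a, x in zip(acts, samples)) for j in range(n)]
    T = [[sum(a[i] * a[j] for a in acts) for j in range(n)] for i in range(n)]
    return solve_linear(T, r)

def decode(activities, decoders):
    """Step 333: decoding estimate as the dot product of activities and decoders."""
    return sum(a * d for a, d in zip(activities, decoders))

# Two toy neurons with opposite preferred directions (hypothetical tuning curves).
tuning = [lambda x: max(0.0, x), lambda x: max(0.0, -x)]
decoders = compute_decoders(tuning, [-1.0, -0.5, 0.0, 0.5, 1.0])
estimate = decode([f(0.25) for f in tuning], decoders)
first_difference = 0.3 - estimate  # step 334, with 0.3 as a made-up feedback value
```

For this pair of tuning curves the fit recovers decoders of equal magnitude and opposite sign, so the decoding estimate reproduces the encoded value.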
  • Step 335 Obtain the weight correction value of the spiking neural network by using the first difference and the activity of the neuron.
  • online supervised learning rules may be used to determine weight correction values.
  • $\Delta\omega_{ij} = \kappa \alpha_j \, e_j \cdot E \, a_i$;
  • $\Delta\omega_{ij}$ represents the weight correction value of the connection weight between neuron j and neuron i
  • $\kappa$ is the scalar learning rate
  • E represents the first difference, that is, the difference between the decoding estimate and x.
  • the decoder correction value $\Delta d_i$ corresponding to the neuron can be obtained according to the first difference.
  • unsupervised learning rules may be used to determine weight correction values.
  • $\Delta\omega_{ij}$ represents the weight correction value of the connection weight between neuron j and neuron i
  • $\theta$ represents the modification threshold, which is used to limit the modification range of neuron j.
  • the weight correction value may be determined using a combination of unsupervised learning rules and online supervised learning rules.
  • the weight correction value is calculated using the following formula:
  • $\Delta\omega_{ij} = \kappa \alpha_j a_i \left(S\, e_j \cdot E + (1 - S)\, a_j (a_j - \theta)\right)$.
  • $\kappa$ is the scalar learning rate
  • $\alpha_j$ is the scaling factor of neuron j
  • $a_i$ is the activity of neuron i
  • S is the control parameter, which represents the relative weight of the supervised learning term with respect to the unsupervised learning term
  • E represents the first difference
  • $\theta$ represents the modification threshold
  • Step 336 Update the weight of the spiking neural network using the weight correction value.
  • the weights of the spiking neural network are defined on the connections between neurons, and the weight correction value is used to update them. A negative weight correction value means the original weight must be reduced; a positive weight correction value means the original weight must be increased.
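The combined rule and the update in step 336 translate directly into code. The sketch below follows the formula Δω_ij = κ α_j a_i (S e_j·E + (1 − S) a_j (a_j − θ)) given in the source; the function names and all numeric values are illustrative assumptions.

```python
def hpes_weight_delta(kappa, alpha_j, a_i, a_j, e_j, error, S, theta):
    """Weight correction for the connection between neuron i and neuron j.

    S = 1 keeps only the supervised term, S = 0 only the unsupervised term.
    """
    supervised = sum(e_k * err_k for e_k, err_k in zip(e_j, error))  # e_j . E
    unsupervised = a_j * (a_j - theta)
    return kappa * alpha_j * a_i * (S * supervised + (1.0 - S) * unsupervised)

def apply_update(weight, delta):
    """Step 336: a positive delta increases the weight, a negative one reduces it."""
    return weight + delta

# Toy values shared by all three calls (assumptions, not taken from the source).
common = dict(kappa=0.1, alpha_j=2.0, a_i=0.5, a_j=1.0,
              e_j=[1.0, 0.0], error=[0.4, 0.3], theta=0.5)
delta_supervised = hpes_weight_delta(S=1.0, **common)
delta_unsupervised = hpes_weight_delta(S=0.0, **common)
delta_mixed = hpes_weight_delta(S=0.5, **common)
new_weight = apply_update(0.3, delta_supervised)
```

Setting S between 0 and 1 blends the two terms linearly, so the mixed correction lies between the purely supervised and purely unsupervised values.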
  • Step 34 Calculate the trajectory correction data using the updated spiking neural network.
  • Trajectory correction data is obtained using the updated weights, decoders, and neuron activity.
  • the trajectory correction data can be calculated by multiplying the activity of the neuron by the weight and then dot-multiplying the decoder.
  • $\omega$ represents the updated weights
  • d represents the decoder
  • $\Gamma_{\mathrm{adapt}}$ represents the trajectory correction data
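One possible reading of step 34, sketched under stated assumptions: the activities are first multiplied by a weight matrix, and the result is then dotted with the decoders to give one correction value per output dimension. The matrix shapes, the ordering of the products, and the name `trajectory_correction` are assumptions; the original formula is not reproduced in this text.

```python
def trajectory_correction(activities, weights, decoders):
    """Gamma_adapt = (a . W) . d, with W indexed [pre][post] and d indexed [post][output]."""
    n_post = len(weights[0])
    # Multiply the neuron activities by the (updated) connection weights.
    weighted = [sum(a_i * weights[i][j] for i, a_i in enumerate(activities))
                for j in range(n_post)]
    n_out = len(decoders[0])
    # Dot the weighted activities with the decoders, one value per output dimension.
    return [sum(weighted[j] * decoders[j][k] for j in range(n_post))
            for k in range(n_out)]

gamma_adapt = trajectory_correction(
    activities=[1.0, 2.0],
    weights=[[1.0, 0.0], [0.0, 0.5]],
    decoders=[[0.1, 0.0], [0.0, 0.2]],
)
```

In a robotic arm each output dimension would correspond to one joint, so `gamma_adapt` holds one correction value per joint.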
  • Step 35 Generate a second instruction according to the preset motion trajectory, and the second instruction and the trajectory correction data are used to control the robot to move from the first position to the second position on the preset motion trajectory.
  • the torque required by each joint of the robot can be calculated according to the second instruction and the trajectory correction data.
  • the following formula can be used to calculate the torque used for motion control of the robot under the second instruction.
  • q represents the coordinates of each joint of the robot, $\dot{q}$ represents the angular velocity of each joint of the robot, M(q) represents the inertial force on each joint caused by the acceleration of each joint of the robot, the velocity-dependent term represents the inertial force that the speed of each joint exerts on the other joints, that is, the Coriolis or centrifugal force, and G(q) represents the gravity of the robotic arm that each joint of the robot needs to overcome.
  • $\Gamma_{\mathrm{adapt}}$ represents the trajectory correction data
  • the model-based torque term represents the torque that each joint driver must apply, according to the robot dynamics model, to make the joint follow the planned trajectory (position, velocity, acceleration).
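The terms described above match the standard rigid-body manipulator equation. As a hedged reconstruction, assuming the correction torque is simply added to the model-based torque, the applied torque could be written as follows. The symbols $\tau$, $\tau_{\mathrm{model}}$, and $C(q, \dot{q})$ are names introduced here for illustration, and the additive form is an assumption, since the original formula is not reproduced in this text.

```latex
\tau_{\mathrm{model}} = M(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + G(q),
\qquad
\tau = \tau_{\mathrm{model}} + \Gamma_{\mathrm{adapt}}
```

Under this reading, $\Gamma_{\mathrm{adapt}}$ compensates for whatever the dynamics model leaves out, such as friction or an unknown payload.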
  • the abscissa in Fig. 6 represents the time during which the robot moves, and the ordinate represents the distance between the robot's actual position and the corresponding position on the preset motion trajectory. It can be seen that, with the control method of the above embodiment, the robot gradually converges to the positions on the preset motion trajectory as it moves, so that it can move stably and accurately.
  • in this way, on the one hand, the spiking neural network corrects the trajectory of the robot in real time, so that the robot moves stably and accurately and the stability and robustness of robot control are improved; on the other hand, using the spiking neural network improves the efficiency of calculating the trajectory correction data, which in turn improves the motion efficiency of the robot.
  • FIG. 7 is a schematic structural diagram of an embodiment of the robot provided by the present application.
  • the robot 70 includes a processor 71 and a memory 72 coupled to the processor 71 .
  • the memory 72 is used to store the program data
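  • the processor 71 is used to execute the program data to implement the following method:
  • generating a first instruction according to a preset motion trajectory, where the first instruction is used to control the robot to move to a first position on the preset motion trajectory; obtaining feedback data of the robot moving to the first position; calculating trajectory correction data in the spiking neural network based on the first instruction and the feedback data;
  • generating a second instruction according to the preset motion trajectory, where the second instruction and the trajectory correction data are used to control the robot to move from the first position to a second position on the preset motion trajectory.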
  • processor 71 in this embodiment is further configured to execute program data to implement the method in any of the foregoing embodiments, and the specific implementation steps may refer to the foregoing embodiments, which will not be repeated here.
  • the robot 70 is a robotic arm.
  • FIG. 8 is a schematic structural diagram of another embodiment of the robot provided by the present application.
  • the robot 80 includes a trajectory generator 81, a control signal generator 82, an adaptive regulator 83, and a robotic arm 84.
  • the trajectory generator 81 is configured to generate a preset motion trajectory of the robotic arm 84 according to the starting position and the target position of the robotic arm 84 .
  • the control signal generator 82 is connected to the trajectory generator 81, and is used for generating the first instruction according to the preset motion trajectory.
  • the adaptive regulator 83 is connected to the control signal generator 82 and the robotic arm 84 , and the adaptive regulator 83 is constructed based on the spiking neural network.
  • the robotic arm 84 is connected to the control signal generator 82 and the adaptive regulator 83 .
  • when the robotic arm 84 receives the first instruction, it moves to the first position on the preset motion trajectory.
  • the trajectory generator 81 acquires the target position, generates a preset motion trajectory according to the target position, and represents the positions on the preset motion trajectory as a series of (x, y) coordinates.
  • the control signal generator 82 acquires the target positions sent by the trajectory generator 81 and combines them with the locally calculated Jacobian matrix to convert the required end-effector motion commands into low-level signals (that is, the first instruction in the above embodiments); the low-level signals are sent to the robotic arm 84 and the adaptive regulator 83.
  • the adaptive regulator 83 compensates for the speed and motion errors of the robotic arm 84 by sending an adaptive signal (that is, the trajectory correction data) to the robotic arm 84.
  • the robot arm 84 sends the feedback data to the adaptive regulator 83 .
  • the trajectory generator 81 can be modeled using the trajectory generation framework of dynamic motion primitives, which specifies the required trajectory in the operation space.
  • Dynamic motion primitives are simple controllers that can be used to quickly learn and generate complex trajectories.
  • the control signal generator 82 is used to map high-level control signals defined in the abstract space to low-level control signals that can be sent to the robotic arm 84 .
  • the forces of the end effector of the robotic arm 84 are mapped to joint moments.
  • the adaptive Jacobian matrix includes the inertia matrix; it uses the high-level control signal $u_x$ and the system velocity $\dot{q}$ as training signals, and uses a recurrent neural network to train the connections that generate the Jacobian matrix. This ensures that the Jacobian stays up to date if the properties of the system or its environment change.
  • the approximate Jacobian will be projected along with the high-level control signals into a collective array, where a dot product operation is performed to compute the low-level control signals.
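The mapping from end-effector forces to joint torques is the standard Jacobian-transpose relation, tau = J^T F. The sketch below uses the analytical Jacobian of a two-link planar arm as a stand-in, which is an assumption: the source describes a learned, approximate Jacobian, and the link lengths and function names here are hypothetical.

```python
import math

def planar_jacobian(q1, q2, l1=1.0, l2=1.0):
    """Jacobian of the end-effector position (x, y) of a two-link planar arm."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def joint_torques(jacobian, force):
    """tau = J^T F: dot each Jacobian column with the end-effector force."""
    n_joints = len(jacobian[0])
    return [sum(jacobian[row][j] * force[row] for row in range(len(force)))
            for j in range(n_joints)]

# First link along the x axis, second link pointing straight up.
J = planar_jacobian(0.0, math.pi / 2)
# A unit force in +y at the end effector (the high-level control signal).
tau = joint_torques(J, [0.0, 1.0])
```

In this pose the end effector sits directly above the second joint, so a vertical force loads only the first joint.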
  • the resulting low-level control signal u is sent to the adaptive regulator 83 as a training signal, and the trajectory correction data is obtained and then sent to the robotic arm 84 .
  • trajectory correction data is provided to the robotic arm 84 to eliminate unmodeled errors in its movement.
  • the adaptive regulator 83 receives the control signal generated by the control signal generator 82 and feedback data regarding the current state of the robotic arm 84 .
  • the adaptive regulator 83 uses this information to learn the outcome of an action and to produce the corresponding trajectory correction data.
  • the trajectory correction data combines forward and inverse models to produce a corrective control signal.
  • the spiking neural network in this embodiment uses the open-source Neural Engineering Framework (NEF), uses the first instruction generated by the control signal generator 82 as training data, uses the currently expected joint angles and angular velocities of the robotic arm 84 as learning data, and uses the combined learning rule homeostatic Prescribed Error Sensitivity (hPES) as the weight update rule of the spiking neural network.
  • the activity of neurons can be expressed as:
  • G[ ] is the nonlinear neural activation function
  • $\alpha$ is the scaling factor (gain) associated with the neuron
  • e is the encoder of the neuron
  • x is the vector to be encoded, i.e. the input first instruction and feedback data.
  • Decoding estimate is the sum of the activity of each neuron, weighted by an n-dimensional decoder.
  • d is the decoder and a is the activity of the neuron.
  • the decoder d is found by least squares minimizing the difference between the decoded estimate and the actual encoded vector.
  • the decoder d can be calculated according to the following formula:
  • $T_{ij} = \int a_i a_j \, dx$
  • d is the decoder
  • $a_i$ is the activity of neuron i
  • $a_j$ is the activity of neuron j
  • x is the input data
  • r is the first parameter
  • $T_{ij}$ is the second parameter between neuron j and neuron i.
  • $\Delta\omega_{ij} = \kappa \alpha_j a_i \left(S\, e_j \cdot E + (1 - S)\, a_j (a_j - \theta)\right)$.
  • S is the relative weight of the online supervised learning item relative to the unsupervised learning item, that is, the control parameter in the above embodiment.
  • the weight between neurons is obtained, and then the trajectory correction data is obtained by using the weight, decoder and neuron activity.
  • the trajectory correction data can be calculated according to the following formula:
  • a is the activity of the neuron, obtained by encoding the input data
  • $\omega$ is the connection weight between neurons
  • d is the neuron decoder
  • the robotic arm 84 can calculate the motion data for moving to the second position according to the trajectory correction data and the second instruction.
  • q represents the coordinates of each joint of the robotic arm 84
  • M(q) represents the inertial force on each joint caused by the acceleration of the motion of each joint of the manipulator 84
  • G(q) represents the self-gravity of the robot arm 84 that each joint of the robot arm 84 needs to overcome.
  • $\Gamma_{\mathrm{adapt}}$ represents the trajectory correction data (that is, the correction torque) calculated by the adaptive regulator 83
  • the model-based torque term represents the torque that each joint driver must apply, based on the dynamic model of the robotic arm 84, to make the joints of the robotic arm 84 move along the predetermined trajectory (position, velocity, acceleration).
  • each joint of the robotic arm 84 moves according to this torque, so the actual position reached is closer to the position on the preset motion trajectory.
  • the above approach improves the ability of the multiple joints of the robotic arm 84 to move in coordination at the same time, so that the robotic arm 84 moves more flexibly and more efficiently.
  • FIG. 9 is a schematic structural diagram of an embodiment of a computer-readable storage medium provided by the present application.
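  • the computer-readable storage medium 90 is used to store program data 91, and the program data 91, when executed by a processor, implements the following method steps:
  • generating a first instruction according to a preset motion trajectory, where the first instruction is used to control the robot to move to a first position on the preset motion trajectory; obtaining feedback data of the robot moving to the first position; calculating trajectory correction data in the spiking neural network based on the first instruction and the feedback data;
  • generating a second instruction according to the preset motion trajectory, where the second instruction and the trajectory correction data are used to control the robot to move from the first position to a second position on the preset motion trajectory.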
  • the computer-readable storage medium 90 in this embodiment is applied to the robot 70 or the robot 80 in the above-mentioned embodiments, and the specific implementation steps thereof may refer to the above-mentioned embodiments, which will not be repeated here.
  • the disclosed method and apparatus may be implemented in other manners.
  • the device implementations described above are only illustrative.
  • the division of the modules or units is only a logical function division. In actual implementation, there may be other divisions.
  • multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this implementation manner.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • if the integrated units in the embodiments described above are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
  • the technical solutions of the present application, in essence, or the parts that contribute over the prior art, or all or part of the technical solutions, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Manipulator (AREA)

Abstract

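A robot control method based on a spiking neural network, including: generating a first instruction according to a preset motion trajectory, where the first instruction is used to control a robot to move to a first position on the preset motion trajectory; obtaining feedback data of the robot moving to the first position; calculating trajectory correction data in the spiking neural network based on the first instruction and the feedback data; and generating a second instruction according to the preset motion trajectory, where the second instruction and the trajectory correction data are used to control the robot to move from the first position to a second position on the preset motion trajectory. This control method can improve the stability and robustness of robot control. A robot and a storage medium based on the spiking neural network are also provided.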

Description

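Robot control method based on spiking neural network, robot, and storage medium
Technical Field
The present application relates to the technical field of robot control, and in particular to a robot control method based on a spiking neural network, a robot, and a storage medium.
Background
In current applications of all kinds of robots, stable and accurate movement of the robot is of great importance. For example, industrial applications such as glue dispensing, welding, and monitoring the movement of products on conveyor belts all require the robot to converge quickly to a given trajectory. In the field of service robots, a mobile robot needs to follow a human in real time at the same speed, or move along a given trajectory at a given speed.
For example, when the robot is a robotic arm, the arm is fixed to a base, so its workspace is limited and fixed. Within that workspace, control of the arm must complete the task while improving efficiency, reducing energy consumption, and limiting wear that shortens the arm's service life.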
发明内容
本申请主要解决的技术问题是提供基于脉冲神经网络的机器人控制方法、机器人及存储介质,能够提高对机器人控制的稳定性和鲁棒性。
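One technical solution adopted by the present application is to provide a robot control method based on a spiking neural network, the method comprising: generating a first instruction according to a preset motion trajectory, the first instruction being used to control the robot to move to a first position on the preset motion trajectory; acquiring feedback data of the robot's movement to the first position; calculating trajectory correction data in the spiking neural network based on the first instruction and the feedback data; and generating a second instruction according to the preset motion trajectory, the second instruction and the trajectory correction data being used to control the robot to move from the first position to a second position on the preset motion trajectory.
Before the first instruction is generated according to the preset motion trajectory, the method includes: acquiring a target position to which the robot is to move; and determining the preset motion trajectory of the robot according to the target position and the starting position of the robot.
Calculating the trajectory correction data in the spiking neural network based on the first instruction and the feedback data includes: updating the weights of the spiking neural network using the first instruction and the feedback data; and calculating the trajectory correction data using the updated spiking neural network.
Updating the weights of the spiking neural network using the first instruction and the feedback data includes: encoding the first instruction and the feedback data to obtain the activities of the neurons in the spiking neural network; calculating a decoder using the neuron activities; calculating a decoded estimate using the decoder and the neuron activities; obtaining a first difference using the decoded estimate and the feedback data; obtaining a weight correction value of the spiking neural network using the first difference and the neuron activities; and updating the weights of the spiking neural network using the weight correction value.
Obtaining the weight correction value using the first difference and the neuron activities includes calculating the weight correction value with the following formula: $\Delta\omega_{ij} = \kappa\,\alpha_j\,a_i\,\bigl(S\,e_j\cdot E + (1-S)\,a_j\,(a_j-\theta)\bigr)$, where $\kappa$ denotes the scalar learning rate, $\alpha_j$ denotes the scaling factor of neuron $j$, $a_i$ denotes the activity of neuron $i$, $S$ denotes the control parameter, $E$ denotes the first difference, and $\theta$ denotes the modification threshold.
Calculating the trajectory correction data using the updated spiking neural network includes: obtaining the trajectory correction data using the updated weights, the decoder, and the neuron activities.
Obtaining the trajectory correction data using the updated weights, the decoder, and the neuron activities includes calculating the trajectory correction data with the following formula:
[Equation image: PCTCN2021137977-appb-000001]
where $a$ denotes the neuron activity, $\omega$ denotes the updated weight, $d$ denotes the decoder, and $\Gamma_{adapt}$ denotes the trajectory correction data.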
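Another technical solution adopted by the present application is to provide a robot comprising a processor and a memory coupled to the processor, where the memory is used to store program data and the processor is used to execute the program data to implement the method provided by the technical solution above.
The robot is a robotic arm.
Another technical solution adopted by the present application is to provide a computer-readable storage medium used to store program data that, when executed by a processor, implements the method provided by the technical solution above.
The beneficial effects of the present application are as follows. In contrast to the prior art, the robot control method based on a spiking neural network of the present application comprises: generating a first instruction according to a preset motion trajectory, the first instruction being used to control the robot to move to a first position on the preset motion trajectory; acquiring feedback data of the robot's movement to the first position; calculating trajectory correction data in the spiking neural network based on the first instruction and the feedback data; and generating a second instruction according to the preset motion trajectory, the second instruction and the trajectory correction data being used to control the robot to move from the first position to a second position on the preset motion trajectory. In this way, on the one hand, the spiking neural network corrects the robot's trajectory in real time, so that the robot moves stably and accurately and the stability and robustness of robot control improve; on the other hand, the spiking neural network makes the calculation of the trajectory correction data more efficient, which in turn improves the robot's movement efficiency.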
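Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic flowchart of an embodiment of the robot control method based on a spiking neural network provided by the present application;
FIG. 2 is a schematic flowchart of the steps preceding step 11 in FIG. 1;
FIG. 3 is a schematic flowchart of another embodiment of the robot control method based on a spiking neural network provided by the present application;
FIG. 4 is a detailed schematic flowchart of step 33 in FIG. 3;
FIG. 5 is a detailed schematic flowchart of step 332 in FIG. 4;
FIG. 6 is a schematic diagram of results of applying the robot control method based on a spiking neural network provided by the present application;
FIG. 7 is a schematic structural diagram of an embodiment of the robot provided by the present application;
FIG. 8 is a schematic structural diagram of another embodiment of the robot provided by the present application;
FIG. 9 is a schematic structural diagram of an embodiment of the computer-readable storage medium provided by the present application.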
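Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. It should be understood that the specific embodiments described here serve only to explain the present application, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present application rather than the entire structure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
The terms "first", "second", and so on in the present application are used to distinguish different objects, not to describe a particular order. In addition, the terms "include" and "have" and any variants of them are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.
Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. A person skilled in the art understands, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.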
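Referring to FIG. 1, FIG. 1 is a schematic flowchart of an embodiment of the robot control method based on a spiking neural network provided by the present application. The method includes:
Step 11: Generate a first instruction according to a preset motion trajectory; the first instruction is used to control the robot to move to a first position on the preset motion trajectory.
In some embodiments, the preset motion trajectory consists of a series of coordinate points.
Because the robot has many joints, the first instruction may be a control signal for each joint; controlling each joint moves the robot to the first position on the preset motion trajectory.
Referring to FIG. 2, the following steps may precede step 11:
Step 21: Acquire the target position to which the robot is to move.
In this embodiment, the robot moves according to a given target position.
For example, if the robot is a robotic arm, the arm can be moved to the specified target position.
Step 22: Determine the preset motion trajectory of the robot according to the target position and the starting position of the robot.
Step 22 may establish the most reasonable preset motion trajectory from the target position and the starting position. For example, if an obstacle lies between the target position and the starting position, the preset motion trajectory can be routed around the obstacle.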
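As an illustration of step 22, the following minimal sketch builds a preset motion trajectory as evenly spaced waypoints on a straight line between the starting position and the target position. The function name `linear_trajectory` and the straight-line interpolation are assumptions made for this example; the source does not specify how the trajectory is planned, and obstacle avoidance is not handled here.

```python
def linear_trajectory(start, target, num_points):
    """Return num_points evenly spaced (x, y) waypoints from start to target.

    Hypothetical helper: straight-line planning is an assumption, not the
    planner described in the source.
    """
    if num_points < 2:
        raise ValueError("need at least two waypoints")
    (x0, y0), (x1, y1) = start, target
    waypoints = []
    for k in range(num_points):
        s = k / (num_points - 1)  # interpolation fraction in [0, 1]
        waypoints.append((x0 + s * (x1 - x0), y0 + s * (y1 - y0)))
    return waypoints


path = linear_trajectory((0.0, 0.0), (1.0, 2.0), 5)
```

Each element of `path` would then play the role of one position (first position, second position, and so on) on the preset motion trajectory.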
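Step 12: Acquire feedback data of the robot's movement to the first position.
In some embodiments, when the robot moves to the first position based on the first instruction, the robot's sensors collect the position of each joint. The sensors may be encoders at the robot's joints or motors. Further processing of the position information yields the robot's current velocity, direction, and so on. The robot's actual trajectory data can be obtained from the data these sensors collect, and the sensors can also be used to measure the robot's current velocity.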
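One common way to derive velocity from position readings is a finite difference between two consecutive encoder samples. The sketch below assumes joint angles in radians sampled at a fixed period `dt`; the function name and the sample values are hypothetical and are not taken from the source.

```python
def joint_velocities(prev_angles, curr_angles, dt):
    """Estimate joint angular velocities (rad/s) from two encoder samples."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    # Backward difference per joint: (current - previous) / sample period.
    return [(c - p) / dt for p, c in zip(prev_angles, curr_angles)]


# Two joints sampled 10 ms apart (illustrative values).
vel = joint_velocities([0.10, 0.50], [0.12, 0.47], 0.01)
```

In practice encoder noise usually calls for filtering the differenced signal; that step is omitted here.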
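It can be understood that the data obtained by the sensors differs from robot to robot; suitable data is acquired according to the characteristics of the robot.
In some embodiments, because of errors in the robot's own structure, the actual position the robot reaches when it moves to the first position is not exactly the first position. The feedback data may then be the robot's current actual position together with its actual velocity, actual direction, actual joint torque, and so on at that moment.
Step 13: Calculate trajectory correction data in the spiking neural network based on the first instruction and the feedback data.
In some embodiments, the spiking neural network may be built on the Hodgkin-Huxley model, or on the Leaky Integrate-and-Fire model or the Izhikevich model.
The spiking neural network may be trained with an unsupervised learning algorithm and/or a supervised learning algorithm.
Step 14: Generate a second instruction according to the preset motion trajectory; the second instruction and the trajectory correction data are used to control the robot to move from the first position to a second position on the preset motion trajectory.
After the trajectory correction data is obtained, the robot combines it with the second instruction to generate an optimal second instruction that controls the robot to move from the first position to the second position on the preset motion trajectory.
It can be understood that combining the trajectory correction data with the second instruction brings the actual position reached by the robot, when it moves from the first position toward the second position on the preset motion trajectory, closer to the second position.
The other positions on the preset motion trajectory can be handled in the same way: trajectory correction data based on the previous position is obtained and used to compensate the actual position at the current position, so that the actual position approaches the current position on the preset motion trajectory.
In this embodiment, the robot control method comprises: generating a first instruction according to a preset motion trajectory, the first instruction being used to control the robot to move to a first position on the preset motion trajectory; acquiring feedback data of the robot's movement to the first position; calculating trajectory correction data in the spiking neural network based on the first instruction and the feedback data; and generating a second instruction according to the preset motion trajectory, the second instruction and the trajectory correction data being used to control the robot to move from the first position to a second position on the preset motion trajectory. In this way, on the one hand, the spiking neural network corrects the robot's trajectory in real time, so that the robot moves stably and accurately and the stability and robustness of robot control improve; on the other hand, the spiking neural network makes the calculation of the trajectory correction data more efficient, which in turn improves the robot's movement efficiency.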
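The loop formed by steps 11 to 14 can be sketched on a one-dimensional toy problem. In the sketch below, a simple error-accumulating corrector stands in for the spiking neural network, and the "robot" is a plant with a constant unmodeled offset; both are assumptions chosen to keep the example short, and neither is the method described in the source.

```python
def run_corrected_loop(waypoints, plant_bias, gain=0.5):
    """Track 1-D waypoints while learning an additive correction.

    The error-accumulating corrector is a stand-in for the spiking
    neural network; it is NOT the learning rule of the source.
    """
    correction = 0.0
    errors = []
    for target in waypoints:
        command = target + correction   # instruction combined with correction
        actual = command + plant_bias   # robot with an unmodeled offset
        error = target - actual         # feedback: deviation from the waypoint
        errors.append(error)
        correction += gain * error      # adapt before the next waypoint
    return errors


errors = run_corrected_loop([0.1 * k for k in range(20)], plant_bias=-0.2)
```

The deviation at each waypoint shrinks as the correction absorbs the offset, which mirrors the idea that correction data computed at the previous position compensates the move to the next one.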
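Referring to FIG. 3, FIG. 3 is a schematic flowchart of another embodiment of the robot control method based on a spiking neural network provided by the present application. The method includes:
Step 31: Generate a first instruction according to a preset motion trajectory; the first instruction is used to control the robot to move to a first position on the preset motion trajectory.
Step 32: Acquire feedback data of the robot's movement to the first position.
Step 33: Update the weights of the spiking neural network using the first instruction and the feedback data.
In some embodiments, referring to FIG. 4, step 33 may proceed as follows:
Step 331: Encode the first instruction and the feedback data to obtain the activities of the neurons in the spiking neural network.
Specifically, the neuron activity can be expressed by the following formula:
$a = G[\alpha\, e \cdot x]$;
where $G[\cdot]$ is the nonlinear neural activation function, $\alpha$ is the scaling factor (gain) associated with the neuron, $e$ is the neuron's encoder, and $x$ is the vector to be encoded, that is, the first instruction and the feedback data.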
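The encoding formula can be illustrated with a small population of rate neurons. The sketch below uses a rectified-linear function as a stand-in for $G[\cdot]$; that choice, the function name, and the example encoders and gains are assumptions for illustration (the source names spiking models such as Leaky Integrate-and-Fire but does not fix $G$).

```python
def neuron_activities(x, encoders, gains):
    """Compute a = G[alpha * (e . x)] for each neuron.

    G is a rectified-linear stand-in here (an assumption, not the
    source's neuron model).
    """
    acts = []
    for e, alpha in zip(encoders, gains):
        drive = alpha * sum(ei * xi for ei, xi in zip(e, x))
        acts.append(max(0.0, drive))  # rectification: no negative rates
    return acts


# Three neurons encoding a 2-D input vector (illustrative values).
acts = neuron_activities(
    [0.5, -1.0],
    encoders=[[1.0, 0.0], [0.0, 1.0], [0.0, -1.0]],
    gains=[2.0, 1.0, 3.0],
)
```

Each encoder picks out the input direction its neuron responds to, so the second neuron, whose encoder points along the positive second axis, stays silent for this input.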
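Step 332: Calculate a decoder using the neuron activities.
In some embodiments, referring to FIG. 5, step 332 may calculate the decoder as follows:
Step 3321: Obtain a first parameter using the first instruction, the feedback data, and the neuron activities.
Specifically, step 3321 may obtain the first parameter with the following formula:
$r = \int a_j\, x\, dx$.
where $a_j$ is the activity of neuron $j$, $x$ is the input first instruction and feedback data, and $r$ is the first parameter.
Step 3322: Obtain a second parameter using the activities of multiple neurons.
Specifically, step 3322 may obtain the second parameter with the following formula:
$T_{ij} = \int a_i\, a_j\, dx$.
where $a_j$ is the activity of neuron $j$, $a_i$ is the activity of neuron $i$, and $T_{ij}$ is the second parameter between neuron $j$ and neuron $i$.
Step 3323: Calculate the decoder using the first parameter and the second parameter.
Specifically, step 3323 may obtain the decoder with the following formula:
$d = r^{-1} T$.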
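For orientation, the short derivation below shows how a least-squares decoder follows from the two integrals defined above. It is a sketch under the assumption that the decoder minimizes the squared difference between $x$ and the decoded estimate $\sum_i a_i d_i$. Note that this standard derivation yields $d = T^{-1} r$, with the matrix $T$ inverted, whereas the source text writes the relation as $d = r^{-1} T$; the derivation is offered only as a reference point and does not alter the formula as filed.

```latex
\begin{aligned}
\mathcal{L}(d) &= \tfrac{1}{2}\int \Bigl(x - \sum_i a_i(x)\, d_i\Bigr)^{2} dx \\
\frac{\partial \mathcal{L}}{\partial d_j}
  &= -\int a_j(x)\Bigl(x - \sum_i a_i(x)\, d_i\Bigr) dx = 0 \\
\Rightarrow\quad \sum_i \underbrace{\int a_i\, a_j\, dx}_{T_{ij}}\; d_i
  &= \underbrace{\int a_j\, x\, dx}_{r_j} \\
\Rightarrow\quad d &= T^{-1} r .
\end{aligned}
```

Here $\mathcal{L}$ is a name introduced for this sketch; $T$ is the matrix with entries $T_{ij}$ and $r$ the vector with entries $r_j$.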
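Step 333: Calculate a decoded estimate using the decoder and the neuron activities.
Specifically, the decoded estimate is obtained as the dot product of the decoder and the neuron activities. This can be expressed with the following formula:
$\hat{x} = \sum_i a_i\, d_i$
Step 334: Obtain a first difference using the decoded estimate and the feedback data.
It can be understood that the decoded estimate is the optimal robot motion data predicted by the spiking neural network. It can be compared with the actual motion data in the feedback data to obtain the first difference between the optimal motion data and the actual motion data.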
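Steps 333 and 334 can be sketched for a scalar quantity as follows. The function name, the example values, and in particular the sign convention for the first difference (measured value minus decoded estimate) are assumptions; the source only states that the two are compared.

```python
def decoded_estimate(acts, decoders):
    """Decoded estimate: activities weighted by decoders and summed."""
    return sum(a * d for a, d in zip(acts, decoders))


acts = [1.0, 0.0, 3.0]      # neuron activities (illustrative)
decoders = [0.2, 0.5, 0.1]  # one scalar decoder per neuron (illustrative)
measured = 0.8              # actual value taken from the feedback data

x_hat = decoded_estimate(acts, decoders)
# Assumed sign convention: positive when the estimate is too small.
first_difference = measured - x_hat
```

With vector-valued quantities each decoder would itself be a vector and the sum would be taken per dimension.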
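Step 335: Obtain a weight correction value of the spiking neural network using the first difference and the neuron activities.
In some embodiments, an online supervised learning rule may be used to determine the weight correction value.
Specifically, it can be expressed with the following formulas:
$\Delta d_i = \kappa\, E\, a_i$
$\Delta\omega_{ij} = \kappa\,\alpha_j\, e_j \cdot E\, a_i$
where $\Delta\omega_{ij}$ denotes the weight correction value of the connection weight between neuron $j$ and neuron $i$, $\kappa$ is the scalar learning rate, and $E$ denotes the first difference, that is, the difference between the decoded estimate $\hat{x}$ and $x$.
It can be understood that different neurons have different decoders, so the decoder correction value $\Delta d_i$ for a given neuron can be obtained from the first difference.
In some embodiments, an unsupervised learning rule may be used to determine the weight correction value.
Specifically, it can be expressed with the following formula:
$\Delta\omega_{ij} = a_i\, a_j\,(a_j - \theta)$;
where $\Delta\omega_{ij}$ denotes the weight correction value of the connection weight between neuron $j$ and neuron $i$, and $\theta$ denotes the modification threshold, which limits the modification range of neuron $j$.
In some embodiments, a combination of the unsupervised learning rule and the online supervised learning rule may be used to determine the weight correction value.
Specifically, the weight correction value is calculated with the following formula:
$\Delta\omega_{ij} = \kappa\,\alpha_j\,a_i\,\bigl(S\,e_j\cdot E + (1-S)\,a_j\,(a_j-\theta)\bigr)$.
where $\kappa$ denotes the scalar learning rate, $\alpha_j$ denotes the scaling factor of neuron $j$, $a_i$ denotes the activity of neuron $i$, $S$ denotes the control parameter, which sets the relative weighting of the supervised learning term against the unsupervised learning term, $E$ denotes the first difference, and $\theta$ denotes the modification threshold.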
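The combined rule can be written out directly for a single connection. The sketch below follows the formula term by term; the function name and the numeric values are hypothetical. Setting `S = 1` leaves only the supervised term and `S = 0` only the unsupervised term.

```python
def hpes_delta(a_i, a_j, alpha_j, e_j, E, kappa, S, theta):
    """Weight correction for one connection under the combined rule.

    delta = kappa * alpha_j * a_i * (S * (e_j . E) + (1 - S) * a_j * (a_j - theta))
    """
    supervised = sum(ek * Ek for ek, Ek in zip(e_j, E))  # e_j . E
    unsupervised = a_j * (a_j - theta)
    return kappa * alpha_j * a_i * (S * supervised + (1.0 - S) * unsupervised)


# Shared illustrative parameters for one pair of neurons.
params = dict(a_i=2.0, a_j=1.0, alpha_j=1.0, e_j=[1.0, 0.0],
              E=[0.5, 0.25], kappa=0.1, theta=0.25)

delta_supervised = hpes_delta(S=1.0, **params)
delta_unsupervised = hpes_delta(S=0.0, **params)
delta_mixed = hpes_delta(S=0.5, **params)

# Applying the correction to an existing weight (step 336).
weight = 0.3
weight += delta_mixed
```

Because the rule is linear in `S`, the mixed correction is the average of the two pure corrections when `S = 0.5`.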
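Step 336: Update the weights of the spiking neural network using the weight correction value.
The weights of the spiking neural network are set between neurons, so the correction value can be used to update the weights between neurons. A negative weight correction value means the original weight should decrease; a positive one means it should increase.
Step 34: Calculate the trajectory correction data using the updated spiking neural network.
The trajectory correction data is obtained using the updated weights, the decoder, and the neuron activities.
Specifically, the neuron activities are multiplied by the weights and the result is dot-multiplied with the decoder, which yields the trajectory correction data.
Specifically, the following formula is used:
[Equation image: PCTCN2021137977-appb-000004]
where $a$ denotes the neuron activity, $\omega$ denotes the updated weight, $d$ denotes the decoder, and $\Gamma_{adapt}$ denotes the trajectory correction data.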
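The formula itself is an image in the source, so the sketch below implements only one plausible reading of the verbal description: the activities are first propagated through the connection weights, and the result is then dot-multiplied with the decoders. The index convention (`weights[i][j]` connects neuron `i` to neuron `j`), the scalar decoders, and the function name are all assumptions.

```python
def trajectory_correction(acts, weights, decoders):
    """One plausible reading: Gamma_adapt = sum_j (sum_i a_i * w_ij) * d_j.

    This interpretation is an assumption; the source formula is not
    reproduced in the text.
    """
    n_out = len(decoders)
    weighted = [
        sum(acts[i] * weights[i][j] for i in range(len(acts)))
        for j in range(n_out)
    ]
    return sum(w * d for w, d in zip(weighted, decoders))


gamma_adapt = trajectory_correction(
    acts=[1.0, 2.0],
    weights=[[1.0, 0.0], [0.5, 1.0]],
    decoders=[0.1, -0.2],
)
```

For a multi-joint arm the decoders would be vectors with one component per joint, giving one correction torque per joint.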
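Step 35: Generate a second instruction according to the preset motion trajectory; the second instruction and the trajectory correction data are used to control the robot to move from the first position to a second position on the preset motion trajectory.
The torque required at each joint of the robot can be calculated from the second instruction and the trajectory correction data.
Specifically, the torque for controlling the robot's movement in the second instruction can be calculated with the following formula:
[Equation image: PCTCN2021137977-appb-000005]
where $q$ denotes the coordinates of each joint of the robot, $\dot{q}$ denotes the angular velocity of each joint, $M(q)$ denotes the inertial force on each joint caused by the acceleration of the joints' motion, $C(q,\dot{q})$ denotes the inertial force that the velocity of each joint's motion exerts on the other joints, that is, the Coriolis or centrifugal force, and $G(q)$ denotes the gravity of the arm itself that each joint must overcome. $\Gamma_{adapt}$ denotes the trajectory correction data, and $\Gamma$ denotes the torque that each joint actuator must apply, according to the robot dynamics model, for the joints to follow the prescribed trajectory (position, velocity, acceleration).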
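The torque formula is an image in the source and is not reproduced in the text. For readers who want a reference point, the block below gives one plausible form assembled from the symbol definitions above and from the standard rigid-body dynamics of a manipulator. The joint acceleration symbol $\ddot{q}$, the additive placement and sign of $\Gamma_{adapt}$, and the treatment of $C(q,\dot{q})$ as a force vector are assumptions, not statements of the filed formula.

```latex
\Gamma \;=\; M(q)\,\ddot{q} \;+\; C(q,\dot{q}) \;+\; G(q) \;+\; \Gamma_{adapt}
```

Under this reading the first three terms are the model-based torque for the desired motion, and $\Gamma_{adapt}$ compensates for whatever the model fails to capture.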
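Referring to FIG. 6, the horizontal axis represents the time over which the robot moves, and the vertical axis represents the distance between the robot's actual position and the corresponding position on the preset motion trajectory. It can be seen that, with the control method of the embodiments above, the robot gradually approaches the positions on the preset motion trajectory as it moves, so that it moves stably and accurately.
In this way, on the one hand, the spiking neural network corrects the robot's trajectory in real time, so that the robot moves stably and accurately and the stability and robustness of robot control improve; on the other hand, the spiking neural network makes the calculation of the trajectory correction data more efficient, which in turn improves the robot's movement efficiency.
Referring to FIG. 7, FIG. 7 is a schematic structural diagram of an embodiment of the robot provided by the present application. The robot 70 includes a processor 71 and a memory 72 coupled to the processor 71.
The memory 72 is used to store program data, and the processor 71 is used to execute the program data to implement the following method:
generating a first instruction according to a preset motion trajectory, the first instruction being used to control the robot to move to a first position on the preset motion trajectory; acquiring feedback data of the robot's movement to the first position; calculating trajectory correction data in the spiking neural network based on the first instruction and the feedback data; and generating a second instruction according to the preset motion trajectory, the second instruction and the trajectory correction data being used to control the robot to move from the first position to a second position on the preset motion trajectory.
It can be understood that the processor 71 in this embodiment is also used to execute the program data to implement the method of any of the embodiments above; for the specific implementation steps, refer to those embodiments, which are not repeated here.
In some embodiments, the robot 70 is a robotic arm.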
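Referring to FIG. 8, FIG. 8 is a schematic structural diagram of another embodiment of the robot provided by the present application. The robot 80 includes a trajectory generator 81, a control signal generator 82, an adaptive controller 83, and a robotic arm 84.
The trajectory generator 81 is used to generate the preset motion trajectory of the robotic arm 84 according to the starting position and the target position of the robotic arm 84.
The control signal generator 82 is connected to the trajectory generator 81 and is used to generate the first instruction according to the preset motion trajectory.
The adaptive controller 83 is connected to the control signal generator 82 and the robotic arm 84, and is built on a spiking neural network.
The robotic arm 84 is connected to the control signal generator 82 and the adaptive controller 83.
On receiving the first instruction, the robotic arm 84 moves to the first position on the preset motion trajectory.
A practical implementation of the robot 80 is described below:
The trajectory generator 81 acquires the target position, generates the preset motion trajectory from it, and represents the positions on the trajectory as a series of (x, y) coordinates. The control signal generator 82 receives these target positions from the trajectory generator 81 and combines them with a locally computed Jacobian matrix to convert the desired end-effector motion command into a low-level signal (the first instruction in the embodiments above), which it sends to the robotic arm 84 and the adaptive controller 83.
The adaptive controller 83 sends an adaptive signal (the trajectory correction data) to the robotic arm 84 to compensate for the arm's velocity and motion errors. The robotic arm 84 sends feedback data to the adaptive controller 83.
The trajectory generator 81 may be modeled with dynamic movement primitives, a trajectory generation framework that specifies the desired trajectory in operational space. Dynamic movement primitives are simple controllers that can be used to quickly learn and generate complex trajectories.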
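The core of a dynamic movement primitive is a damped spring system that pulls the state toward a goal; a learned forcing term is normally added to shape the path. The sketch below keeps only the point-attractor part and omits the forcing term, so it shows the convergence behavior rather than a full dynamic movement primitive. The function name, the gain values, and the integration step are assumptions.

```python
def run_point_attractor(y0, goal, duration, dt=0.001, alpha=25.0):
    """Integrate y'' = alpha * (beta * (goal - y) - y') with beta = alpha / 4.

    Point-attractor core of a dynamic movement primitive; the learned
    forcing term is omitted in this sketch.
    """
    beta = alpha / 4.0  # this ratio makes the system critically damped
    y, v = y0, 0.0
    path = [y]
    for _ in range(int(round(duration / dt))):
        accel = alpha * (beta * (goal - y) - v)
        v += accel * dt  # semi-implicit Euler: update velocity first
        y += v * dt      # then position with the new velocity
        path.append(y)
    return path


path = run_point_attractor(y0=0.0, goal=1.0, duration=2.0)
```

Running one such system per coordinate produces a smooth path from the starting position to the target position.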
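The control signal generator 82 maps the high-level control signal defined in the abstract space to a low-level control signal that can be sent to the robotic arm 84. In this process, the force at the end effector of the robotic arm 84 is mapped to joint torques. The implementation has two parts. First, the adaptive Jacobian, which incorporates the inertia matrix, is trained adaptively: with the high-level control signal $u_x$ and the system velocity $\dot{q}$ as training signals, a recurrent neural network adaptively trains the connections that generate the Jacobian. This keeps the Jacobian up to date if the properties of the system or the environment change. Second, the approximate Jacobian is projected together with the high-level control signal into an ensemble array, where a dot product is performed to compute the low-level control signal. The resulting low-level control signal $u$ is sent to the adaptive controller 83 as a training signal, and the trajectory correction data obtained is then sent to the robotic arm 84.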
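The mapping from end-effector force to joint torques is the Jacobian-transpose relation $\tau = J^{T} F$. The sketch below applies it to a hypothetical two-link planar arm with an analytic Jacobian; the source instead learns an approximate Jacobian with a neural network, so the arm model, link lengths, and function names here are assumptions for illustration only.

```python
import math


def planar_jacobian(q1, q2, l1, l2):
    """Analytic Jacobian of a two-link planar arm (hypothetical model)."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [
        [-l1 * s1 - l2 * s12, -l2 * s12],  # d(x)/d(q1), d(x)/d(q2)
        [l1 * c1 + l2 * c12, l2 * c12],    # d(y)/d(q1), d(y)/d(q2)
    ]


def joint_torques(jacobian, force):
    """Map an end-effector force to joint torques: tau = J^T F."""
    n_joints = len(jacobian[0])
    return [
        sum(jacobian[r][c] * force[r] for r in range(len(force)))
        for c in range(n_joints)
    ]


# Elbow bent 90 degrees, unit links, unit force along +y at the tip.
J = planar_jacobian(0.0, math.pi / 2, 1.0, 1.0)
tau = joint_torques(J, [0.0, 1.0])
```

In this pose the tip sits at (1, 1), so the force has a unit lever arm about the base joint and passes through the elbow joint, which is why only the first joint needs torque.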
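The adaptive controller 83 provides the robotic arm 84 with trajectory correction data that cancels unmodeled errors arising in the arm's motion. It receives the control signal produced by the control signal generator 82 and feedback data on the current state of the robotic arm 84, uses this information to learn the outcome of an action, and outputs the corresponding trajectory correction data. This trajectory correction data is a corrective control signal produced by combining forward and inverse models.
The spiking neural network in this embodiment uses the open-source Neural Engineering Framework (NEF). It uses the first instruction generated by the control signal generator 82 as training data, uses the currently desired joint angles and angular velocities of the robotic arm 84 as learning data, and uses the combined learning rule homeostatic Prescribed Error Sensitivity (hPES) as the weight update rule of the spiking neural network.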
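The neuron activity can be expressed as:
$a = G[\alpha\, e \cdot x]$;
where $G[\cdot]$ is the nonlinear neural activation function, $\alpha$ is the scaling factor (gain) associated with the neuron, $e$ is the neuron's encoder, and $x$ is the vector to be encoded, that is, the input first instruction and feedback data.
The decoded estimate $\hat{x}$ is the sum of the activities of the neurons, weighted by the $n$-dimensional decoders:
$\hat{x} = \sum_i a_i\, d_i$
where $d$ is the decoder and $a$ is the neuron activity. The decoder $d$ is found by least-squares minimization of the difference between the decoded estimate and the actual encoded vector.
The decoder $d$ can be calculated with the following formulas:
$d = r^{-1} T$;
$T_{ij} = \int a_i\, a_j\, dx$;
$r = \int a_j\, x\, dx$;
where $d$ is the decoder, $a_i$ is the activity of neuron $i$, $a_j$ is the activity of neuron $j$, $x$ is the input data, $r$ is the first parameter, and $T_{ij}$ is the second parameter between neuron $j$ and neuron $i$.
The weight correction value is obtained with the following formula:
$\Delta\omega_{ij} = \kappa\,\alpha_j\,a_i\,\bigl(S\,e_j\cdot E + (1-S)\,a_j\,(a_j-\theta)\bigr)$.
where $0 \le S \le 1$, and $S$ is the relative weighting of the online supervised learning term against the unsupervised learning term, that is, the control parameter in the embodiments above.
The weights between neurons are obtained from the weight correction value, and the trajectory correction data is then obtained using the weights, the decoder, and the neuron activities, for example with the following formula:
[Equation image: PCTCN2021137977-appb-000010]
where $a$ is the neuron activity, encoded from the input data, $\omega$ is the connection weight between neurons, and $d$ is the neuron decoder.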
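After the trajectory correction data is obtained, the robotic arm 84 can calculate the motion data for moving to the second position from the trajectory correction data and the second instruction.
For example, the calculation can use the following formula:
[Equation image: PCTCN2021137977-appb-000011]
where $q$ denotes the coordinates of each joint of the robotic arm 84, $\dot{q}$ denotes the angular velocity of each joint of the robotic arm 84, $M(q)$ denotes the inertial force on each joint caused by the acceleration of the joints' motion, $C(q,\dot{q})$ denotes the inertial force that the velocity of each joint's motion exerts on the other joints, that is, the Coriolis or centrifugal force, and $G(q)$ denotes the gravity of the robotic arm 84 itself that each joint must overcome.
$\Gamma_{adapt}$ denotes the trajectory correction data (the correction torque) calculated by the adaptive controller 83, and $\Gamma$ denotes the torque that each joint actuator must apply, according to the dynamics model of the robotic arm 84, for the joints of the arm to follow the prescribed trajectory (position, velocity, acceleration).
When each joint of the robotic arm 84 moves according to this torque, the actual position reached comes closer to the position on the preset motion trajectory.
In this embodiment, the approach above improves the ability of the multiple joints of the robotic arm 84 to move simultaneously and cooperatively, making the arm's motion more flexible and improving movement efficiency.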
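Referring to FIG. 9, FIG. 9 is a schematic structural diagram of an embodiment of the computer-readable storage medium provided by the present application. The computer-readable storage medium 90 is used to store program data 91 that, when executed by a processor, implements the following method steps:
generating a first instruction according to a preset motion trajectory, the first instruction being used to control the robot to move to a first position on the preset motion trajectory; acquiring feedback data of the robot's movement to the first position; calculating trajectory correction data in the spiking neural network based on the first instruction and the feedback data; and generating a second instruction according to the preset motion trajectory, the second instruction and the trajectory correction data being used to control the robot to move from the first position to a second position on the preset motion trajectory.
It can be understood that the computer-readable storage medium 90 in this embodiment is applied to the robot 70 or the robot 80 of the embodiments above; for the specific implementation steps, refer to those embodiments, which are not repeated here.
In the several implementations provided by the present application, it should be understood that the disclosed method and device may be implemented in other ways. For example, the device implementations described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this implementation.
In addition, the functional units in the implementations of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit in the other implementations above is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods described in the implementations of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.
The above describes only implementations of the present application and does not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present application.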

Claims (10)

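  1. A robot control method based on a spiking neural network, characterized in that the robot control method comprises:
    generating a first instruction according to a preset motion trajectory, the first instruction being used to control a robot to move to a first position on the preset motion trajectory;
    acquiring feedback data of the robot's movement to the first position;
    calculating trajectory correction data in a spiking neural network based on the first instruction and the feedback data;
    generating a second instruction according to the preset motion trajectory, the second instruction and the trajectory correction data being used to control the robot to move from the first position to a second position on the preset motion trajectory.
  2. The method according to claim 1, characterized in that,
    before the generating of the first instruction according to the preset motion trajectory, the method comprises:
    acquiring a target position to which the robot is to move;
    determining the preset motion trajectory of the robot according to the target position and a starting position of the robot.
  3. The method according to claim 1, characterized in that
    the calculating of the trajectory correction data in the spiking neural network based on the first instruction and the feedback data comprises:
    updating weights of the spiking neural network using the first instruction and the feedback data;
    calculating the trajectory correction data using the updated spiking neural network.
  4. The method according to claim 3, characterized in that
    the updating of the weights of the spiking neural network using the first instruction and the feedback data comprises:
    encoding the first instruction and the feedback data to obtain activities of neurons in the spiking neural network;
    calculating a decoder using the activities of the neurons;
    calculating a decoded estimate using the decoder and the activities of the neurons;
    obtaining a first difference using the decoded estimate and the feedback data;
    obtaining a weight correction value of the spiking neural network using the first difference and the activities of the neurons;
    updating the weights of the spiking neural network using the weight correction value.
  5. The method according to claim 4, characterized in that
    the obtaining of the weight correction value of the spiking neural network using the first difference and the activities of the neurons comprises:
    calculating the weight correction value with the following formula:
    $\Delta\omega_{ij} = \kappa\,\alpha_j\,a_i\,\bigl(S\,e_j\cdot E + (1-S)\,a_j\,(a_j-\theta)\bigr)$;
    where $\kappa$ denotes the scalar learning rate, $\alpha_j$ denotes the scaling factor of neuron $j$, $a_i$ denotes the activity of neuron $i$, $S$ denotes the control parameter, $E$ denotes the first difference, and $\theta$ denotes the modification threshold.
  6. The method according to claim 4, characterized in that
    the calculating of the trajectory correction data using the updated spiking neural network comprises:
    obtaining the trajectory correction data using the updated weights, the decoder, and the activities of the neurons.
  7. The method according to claim 6, characterized in that
    the obtaining of the trajectory correction data using the updated weights, the decoder, and the activities of the neurons comprises:
    calculating the trajectory correction data with the following formula:
    [Equation image: PCTCN2021137977-appb-100001]
    where $a$ denotes the activities of the neurons, $\omega$ denotes the updated weights, $d$ denotes the decoder, and $\Gamma_{adapt}$ denotes the trajectory correction data.
  8. A robot, characterized in that the robot comprises a processor and a memory coupled to the processor;
    where the memory is used to store program data, and the processor is used to execute the program data to implement the method according to any one of claims 1-7.
  9. The robot according to claim 8, characterized in that the robot is a robotic arm.
  10. A computer-readable storage medium, characterized in that the computer-readable storage medium is used to store program data that, when executed by a processor, implements the method according to any one of claims 1-7.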
PCT/CN2021/137977 2021-03-26 2021-12-14 Robot control method based on a spiking neural network, robot, and storage medium WO2022199146A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110326516.5A CN113070878B (zh) 2021-03-26 2021-03-26 Robot control method based on a spiking neural network, robot, and storage medium
CN202110326516.5 2021-03-26

Publications (1)

Publication Number Publication Date
WO2022199146A1 true WO2022199146A1 (zh) 2022-09-29

Family

ID=76610522

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/137977 WO2022199146A1 (zh) 2021-03-26 2021-12-14 Robot control method based on a spiking neural network, robot, and storage medium

Country Status (2)

Country Link
CN (1) CN113070878B (zh)
WO (1) WO2022199146A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113070878B (zh) * 2021-03-26 2022-06-07 中国科学院深圳先进技术研究院 基于脉冲神经网络的机器人控制方法、机器人及存储介质
CN113977580B (zh) * 2021-10-29 2023-06-27 浙江工业大学 基于动态运动原语和自适应控制的机械臂模仿学习方法
CN116100537A (zh) * 2021-11-11 2023-05-12 中国科学院深圳先进技术研究院 机器人的控制方法、机器人、存储介质及抓取系统

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582445A (zh) * 2020-04-24 2020-08-25 浙江大学 一种基于脉冲神经网络的高效学习系统及学习方法
CN111890350A (zh) * 2020-06-12 2020-11-06 深圳先进技术研究院 机器人及其控制方法、计算机可读存储介质
CN112140101A (zh) * 2019-06-28 2020-12-29 鲁班嫡系机器人(深圳)有限公司 轨迹规划方法、装置及系统
WO2021009293A1 (en) * 2019-07-17 2021-01-21 Deepmind Technologies Limited Training a neural network to control an agent using task-relevant adversarial imitation learning
KR20210012672A (ko) * 2019-07-26 2021-02-03 한국생산기술연구원 인공지능 기반 로봇 매니퓰레이터의 자동 제어 시스템 및 방법
CN113070878A (zh) * 2021-03-26 2021-07-06 中国科学院深圳先进技术研究院 基于脉冲神经网络的机器人控制方法、机器人及存储介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150269482A1 (en) * 2014-03-24 2015-09-24 Qualcomm Incorporated Artificial neural network and perceptron learning using spiking neurons
US10496922B1 (en) * 2015-05-15 2019-12-03 Hrl Laboratories, Llc Plastic neural networks
CN110524544A (zh) * 2019-10-08 2019-12-03 深圳前海达闼云端智能科技有限公司 一种机械臂运动的控制方法、终端和可读存储介质
CN111993416B (zh) * 2020-07-30 2021-09-14 浙江大华技术股份有限公司 一种控制机械臂运动的方法、设备、系统以及装置

Also Published As

Publication number Publication date
CN113070878A (zh) 2021-07-06
CN113070878B (zh) 2022-06-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21932753

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21932753

Country of ref document: EP

Kind code of ref document: A1