CN112757275A - Method, system and device for controlling musculoskeletal system based on speed precision balance - Google Patents
Method, system and device for controlling musculoskeletal system based on speed precision balance
- Publication number
- CN112757275A (application number CN202011610884.4A)
- Authority
- CN
- China
- Prior art keywords
- time
- activation signal
- muscle activation
- musculoskeletal system
- moment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/10—Programme-controlled manipulators characterised by positioning means for manipulator elements
- B25J9/1075—Programme-controlled manipulators characterised by positioning means for manipulator elements with muscles or tendons
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1628—Programme controls characterised by the control loop
- B25J9/1633—Programme controls characterised by the control loop compliant, force, torque control, e.g. combined with position control
Landscapes
- Engineering & Computer Science (AREA)
- Robotics (AREA)
- Mechanical Engineering (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Orthopedic Medicine & Surgery (AREA)
- Rheumatology (AREA)
- Feedback Control In General (AREA)
Abstract
The invention belongs to the technical field of control, and particularly relates to a method, a system and a device for controlling a musculoskeletal system based on speed precision balance, aiming to solve the problem that existing control methods for human-like musculoskeletal robots cannot adequately control the co-contraction of antagonistic muscles. The invention comprises the following steps: obtaining an estimated motion precision of the musculoskeletal system through Fitts' law; calculating a supervision-term moment from the estimated motion precision through a striatum-inspired speed modulation strategy; calculating a muscle activation signal vector through a muscle activation signal network; calculating an action reward from the muscle activation signal vector and the supervision-term moment, and from it a loss function; adjusting the parameters of the muscle activation signal network based on the loss function so that the action reward increases; and iterating repeatedly to obtain the muscle activation signal sequence required for control. The invention uses the structural information of the musculoskeletal system to construct a general antagonistic-muscle co-contraction control strategy and ensures smooth movement.
Description
Technical Field
The invention belongs to the technical field of control, and particularly relates to a method, a system and a device for controlling a musculoskeletal system based on speed precision balance.
Background
The adaptability of living beings allows them to adjust and execute behaviors flexibly, so that learned, skilled movements can vary with the environment and the task requirements. One typical strategy for achieving such motion variability is the speed-accuracy trade-off, which reflects the balance between the rapidity and the accuracy of a movement. Implementing such a flexible behavioral strategy in a human-like musculoskeletal robot, so that the robot can adapt generally to its environment and tasks, is an attractive challenge. Moreover, in a human-like musculoskeletal robot system the number of muscles is generally far greater than the number of joints, and the redundant muscles make both motor learning and the generation of new movements difficult. Constructing a general antagonistic-muscle co-contraction control strategy from the structural information of the musculoskeletal system, especially when some muscles are damaged, is a further challenge.
Disclosure of Invention
In order to solve the above problem in the prior art, namely that existing control methods for human-like musculoskeletal robots cannot adequately control antagonistic-muscle co-contraction, the present invention provides a musculoskeletal system control method based on speed precision balance, the method comprising:
setting the training count k = 1;
Step S100, obtaining the estimated motion precision W of the musculoskeletal system at time t through Fitts' law;
Step S200, calculating a supervision-term moment through a striatum-inspired speed modulation strategy based on the estimated motion precision W;
Step S300, calculating a muscle activation signal vector u_t through a muscle activation signal network based on the supervision-term moment;
Step S400, calculating an action reward R_t based on the muscle activation signal vector u_t and the supervision-term moment, then calculating a preset loss function L, and adjusting the parameters of the muscle activation signal network based on the preset loss function L so that the action reward R_t increases; setting k = k + 1 and repeating steps S100 to S400 until k = K, where K is the preset maximum number of training iterations, to obtain the muscle activation signal sequence required for control.
In some preferred embodiments, step S100 includes:
presetting an accumulation time T;
Step S110, obtaining, through a cortex model, the perceptual evidence x_i(t_1) ~ N(μ_i, σ^2) at time t_1, and from it the accumulated perceptual evidence Y_i(T): [equation not reproduced]
Step S120, inputting the accumulated perceptual evidence Y_i(T) into a basal ganglia model to obtain the output OUT_i of the basal ganglia model;
Step S130, based on the output OUT_i of the basal ganglia model, obtaining a first effective accumulation time and a second effective accumulation time by means of a preset decision threshold -ln P_i(T); if T > 0, setting T = T - 1 and repeating steps S110 to S130;
wherein, the first time the output of the basal ganglia model satisfies OUT_i ≥ -ln P_i(T), the accumulation time T corresponding to OUT_i is set as the first effective accumulation time; whenever the output of the basal ganglia model satisfies OUT_i < -ln P_i(T), the accumulation time T corresponding to OUT_i is set as the second effective accumulation time; each second effective accumulation time newly generated during the iteration overwrites the one generated previously; -ln P_i(T) is the decision threshold;
Step S140, obtaining a final decision time T_out from the first effective accumulation time and the second effective accumulation time: [equation not reproduced]
Step S150, estimating the motion precision W through Fitts' law based on the final decision time T_out. The precision of the subsequent muscle control is adjusted by calculating an appropriate final decision time.
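As a non-limiting illustration, steps S110 to S130 can be sketched in Python as follows. The basal ganglia output formula is not reproduced in this text, so the form OUT_i = -Y_i + ln Σ_j exp(Y_j) used here is an assumption, as are the function names `sample_evidence`, `basal_ganglia_output` and `effective_accumulation_times`. The rule for combining the two effective times into T_out is likewise not available and is therefore left out.

```python
import math
import random


def sample_evidence(mus, sigma, steps, seed=0):
    """Draw x_i(t1) ~ N(mu_i, sigma^2) for every option i at each time step."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, sigma) for mu in mus] for _ in range(steps)]


def basal_ganglia_output(accumulated, i):
    """ASSUMED form: OUT_i = -Y_i + ln(sum_j exp(Y_j)), computed stably."""
    peak = max(accumulated)
    log_sum = peak + math.log(sum(math.exp(y - peak) for y in accumulated))
    return log_sum - accumulated[i]


def effective_accumulation_times(samples, i, p_correct):
    """Scan T from its maximum down to 1 and record both effective times."""
    threshold = -math.log(p_correct)          # decision threshold -ln P_i(T)
    n_options = len(samples[0])
    first_time, second_time = None, None
    for T in range(len(samples), 0, -1):      # T = T - 1 on each pass
        # Accumulated evidence Y_j(T): sum of the first T samples of option j.
        Y = [sum(step[j] for step in samples[:T]) for j in range(n_options)]
        out = basal_ganglia_output(Y, i)
        if out >= threshold:
            if first_time is None:            # only the first occurrence counts
                first_time = T
        else:
            second_time = T                   # each new value overwrites the last
    return first_time, second_time
```

With noiseless evidence favouring option 0, for example `effective_accumulation_times([[1.0, 0.0]] * 5, 0, 0.8)`, the scan stays below the threshold for T = 5 down to T = 2 and first reaches it at T = 1.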
In some preferred embodiments, the striatum-inspired speed modulation strategy is: [equation not reproduced]
wherein the joint angle of the end position is calculated from the estimated motion precision, q_s represents the joint angle of the initial position, t_S is the starting time of the movement, V_M(λ, t) is a bell-shaped velocity modulation model, t denotes the time of the modulation, and T_out represents the decision time;
the bell-shaped velocity modulation model V_M(λ, t) is: [equation not reproduced]
wherein λ is a parameter of the modulation model and t represents time t;
wherein, in the dynamic relation used to calculate the supervision-term moment, q_t is the joint angle at time t, the desired joint angular velocity and the desired joint angular acceleration are used, M(q_t) is the inertia matrix of the musculoskeletal system, a centripetal and Coriolis force term is included, and G(q_t) is the gravity matrix of the musculoskeletal system;
the inertia matrix M(q_t) of the musculoskeletal system is: [equation not reproduced]
the gravity matrix G(q_t) of the musculoskeletal system is: [equation not reproduced]
wherein m_1 represents the mass of the first link of the robotic arm, m_2 represents the mass of the second link of the robotic arm, d_1 represents the length of the first link of the robotic arm, d_2 represents the length of the second link of the robotic arm, q_{1,t} represents the angle of the first joint of the robotic arm, and q_{2,t} represents the angle of the second joint of the robotic arm, together with the angular velocities of the first joint and the second joint of the robotic arm.
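As a non-limiting illustration of a bell-shaped velocity profile between an initial joint angle q_s and an end joint angle, the following Python sketch uses a minimum-jerk profile. This is a stand-in chosen for illustration only: the model V_M(λ, t) of the invention is not reproduced in this text, and the parameter λ has no counterpart here. The function names and the normalisation of time by T_out are assumptions.

```python
def min_jerk_position(s):
    """Normalised minimum-jerk position for s in [0, 1]; goes from 0 to 1."""
    return 10.0 * s ** 3 - 15.0 * s ** 4 + 6.0 * s ** 5


def min_jerk_velocity(s):
    """Derivative of min_jerk_position; bell-shaped, zero at both ends."""
    return 30.0 * s ** 2 - 60.0 * s ** 3 + 30.0 * s ** 4


def desired_joint_state(q_start, q_end, t, t_start, t_out):
    """Desired joint angle and angular velocity at time t.

    The movement starts at t_start and lasts t_out (ASSUMED to play the
    role of the decision time T_out).
    """
    s = (t - t_start) / t_out
    s = min(max(s, 0.0), 1.0)                 # clamp outside the movement
    q = q_start + (q_end - q_start) * min_jerk_position(s)
    dq = (q_end - q_start) * min_jerk_velocity(s) / t_out
    return q, dq
```

A shorter `t_out` yields a higher peak velocity for the same displacement, which is the qualitative behaviour a speed modulation strategy trades against precision.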
In some preferred embodiments, the muscle activation signal vector u_t is calculated as: [equation not reproduced]
wherein u_{t-1} is the muscle activation signal at time t-1, τ_{t-1} is the joint moment at time t-1, and the policy network μ(·|θ_μ) is a neural network for solving for the muscle activation signals.
In some preferred embodiments, step S400 includes:
Step S410, calculating the action reward R_t based on the muscle activation signal vector u_t and the supervision-term moment, with R_t to be made as large as possible: [equation not reproduced]
wherein γ is a discount factor; for the flexors at time t, the formula uses the moment generated, the force produced and the muscle activation signal generated; for the extensors at time t, it likewise uses the moment generated, the force produced and the muscle activation signal generated; p is the number of flexors, q is the number of extensors, and ω_1 and ω_2 are proportional parameters;
Step S420, calculating the loss function L of the evaluation network Q_μ based on the action reward R_t: [equation not reproduced]
wherein θ_Q denotes the parameters of the evaluation network, Δu_{t+1} is the muscle activation signal at time t+1, and Δu_t is the muscle activation signal at time t;
Step S430, updating the parameters θ_Q of the evaluation network Q_μ based on the loss function L: [equation not reproduced]
wherein η_1 represents the update step size;
updating the parameters θ_μ of the policy network μ(·|θ_μ) based on the evaluation network Q_μ: [equation not reproduced]
wherein η_2 represents the update step size, and the gradient of the policy network μ(·|θ_μ) is: [equation not reproduced]
Step S440, if t ≠ T_out, setting t = t + 1 and repeating steps S200 to S400 until t = T_out;
Step S450, if k ≠ K, setting k = k + 1 and repeating steps S100 to S400 until k = K, at which point the muscle activation signals u_t (t ∈ [1, T]) form the muscle activation signal sequence required to accomplish control.
In some preferred embodiments, the output OUT_i of the basal ganglia model is: [equation not reproduced]
In some preferred embodiments, the motion precision W estimated by Fitts' law is: [equation not reproduced]
where a and b are two constant parameters and D is the distance moved by the joint end.
In another aspect of the invention, a musculoskeletal system control system based on speed precision balance is provided, the system comprising a precision estimation module, an expected moment calculation module, an activation signal calculation module and a speed precision balance module;
the training count is set to k = 1;
the precision estimation module is configured to obtain the estimated motion precision W of the musculoskeletal system at time t through Fitts' law;
the expected moment calculation module is configured to calculate a supervision-term moment through a striatum-inspired speed modulation strategy based on the estimated motion precision W;
the activation signal calculation module is configured to calculate a muscle activation signal vector u_t through a muscle activation signal network based on the supervision-term moment;
the speed precision balance module is configured to calculate an action reward R_t based on the muscle activation signal vector u_t and the supervision-term moment, then calculate a preset loss function L, and adjust the parameters of the muscle activation signal network based on the preset loss function L so that the action reward R_t increases; with k = k + 1, the functions of the precision estimation module through the speed precision balance module are repeated until k = K, where K is the preset maximum number of training iterations, to obtain the muscle activation signal sequence required for control.
In a third aspect of the present invention, a storage device is provided, in which a plurality of programs are stored, the programs being adapted to be loaded and executed by a processor to implement the above-mentioned musculoskeletal system control method based on speed accuracy trade-off.
In a fourth aspect of the present invention, a processing apparatus is provided, which includes a processor, a storage device; the processor is suitable for executing various programs; the storage device is suitable for storing a plurality of programs; the program is adapted to be loaded and executed by a processor to implement the aforementioned musculoskeletal system control method based on speed accuracy trade-offs.
The invention has the beneficial effects that:
(1) The musculoskeletal system control method based on speed precision balance designs an antagonistic-muscle co-contraction strategy by combining Fitts' law with the speed modulation strategy of the FSI-SPN neuronal circuit in the striatum, realizes a general redundant-muscle control algorithm, and improves the adaptability of neuro-inspired movement on the musculoskeletal system.
(2) The musculoskeletal system control method based on speed precision balance constructs a general antagonistic-muscle co-contraction control strategy by designing an action reward for the co-contraction strategy and using that reward, together with an evaluation network, to update the parameters of the policy network, which benefits the motor learning and control of the redundant muscles of a human-like musculoskeletal robot.
(3) The musculoskeletal system control method based on speed precision balance constructs a supervised Markov decision process algorithm by introducing a supervision term into the Markov decision process, and divides the control process into two stages, motion planning and motion execution, as the basis for realizing motion variability, thereby achieving efficient training and control of the musculoskeletal robot system.
(4) The musculoskeletal system control method based on speed precision balance calculates an appropriate movement execution time by combining the antagonistic-muscle co-contraction strategy with an active speed precision balance model, which in turn affects the precision of muscle control, and realizes the adaptability of neuro-inspired movement on the musculoskeletal system.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is a flow diagram of a musculoskeletal system control method based on speed accuracy trade-off in accordance with an embodiment of the present invention;
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
The invention provides a musculoskeletal system control method based on speed precision balance.
The musculoskeletal system control method based on speed precision balance of the present invention comprises the following steps:
setting the training count k = 1; because a musculoskeletal robot system has a redundant number of joints, a single motor task can be accomplished in several ways.
Step S100, obtaining the estimated motion precision W of the musculoskeletal system at time t through Fitts' law;
Step S200, calculating a supervision-term moment through a striatum-inspired speed modulation strategy based on the estimated motion precision W;
Step S300, calculating a muscle activation signal vector u_t through a muscle activation signal network based on the supervision-term moment;
Step S400, calculating an action reward R_t based on the muscle activation signal vector u_t and the supervision-term moment, calculating a loss value through a preset loss function L, and adjusting the parameters of the muscle activation signal network based on that loss value so that the action reward R_t increases; if k < K, setting k = k + 1 and repeating steps S100 to S400 until k = K, where K is the preset maximum number of training iterations, to obtain the muscle activation signal sequence required for control.
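As a non-limiting illustration, the control flow of steps S100 to S400 can be sketched in Python as follows. The four callables passed to `train_controller` are hypothetical placeholders for the corresponding steps, and the assumption that step S100 returns both the precision W and an integer decision time T_out is made only to keep the sketch self-contained.

```python
def train_controller(max_training_times, estimate_precision,
                     supervision_moment, policy, update):
    """Outer training loop over k and inner loop over t (illustrative only)."""
    k = 1                                        # training count
    while True:
        width, t_out = estimate_precision()      # S100: W and decision time
        sequence, u_prev = [], None
        for t in range(1, t_out + 1):
            tau_sup = supervision_moment(width, t, t_out)   # S200
            u_t = policy(tau_sup, u_prev)                   # S300
            update(u_t, tau_sup)                            # S400
            sequence.append(u_t)
            u_prev = u_t
        if k == max_training_times:              # stop when k = K
            return sequence                      # activation sequence u_1..u_T
        k += 1
```

The sequence returned after the last training iteration corresponds to the muscle activation signal sequence used for control.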
In order to more clearly describe the musculoskeletal system control method based on speed precision balance, the following describes an embodiment of the present invention in detail with reference to fig. 1.
The musculoskeletal system control method based on speed precision balance according to an embodiment of the invention comprises steps S100 to S400, which are described in detail as follows:
setting the training count k = 1;
Step S100, obtaining the estimated motion precision W of the musculoskeletal system at time t through Fitts' law;
in the present embodiment, step S100 includes:
presetting an accumulation time T; the accumulation time T set in the first iteration must be the maximum settable time so that the subsequent iterations can proceed; preferably, T may be chosen as 30 s.
Step S110, obtaining, through a cortex model, the perceptual evidence x_i(t_1) ~ N(μ_i, σ^2) at time t_1, and from it the accumulated perceptual evidence Y_i(T), as shown in equation (1): [equation not reproduced]
Step S120, inputting the accumulated perceptual evidence Y_i(T) into the basal ganglia model to obtain the output OUT_i of the basal ganglia model; the accumulated perceptual evidence is passed into the striatum of the basal ganglia model and, after travelling through the direct and indirect pathways, converges in the substantia nigra and the globus pallidus to yield the output OUT_i of the basal ganglia model;
Step S130, based on the output OUT_i of the basal ganglia model, obtaining a first effective accumulation time and a second effective accumulation time by means of a preset decision threshold -ln P_i(T); if T > 0, setting T = T - 1 and repeating steps S110 to S130;
wherein, the first time the output of the basal ganglia model satisfies OUT_i ≥ -ln P_i(T), the accumulation time T corresponding to OUT_i is set as the first effective accumulation time; whenever the output of the basal ganglia model satisfies OUT_i < -ln P_i(T), the accumulation time T corresponding to OUT_i is set as the second effective accumulation time; each second effective accumulation time newly generated during the iteration overwrites the one generated previously; -ln P_i(T) is the decision threshold;
in the present embodiment, the decision threshold -ln P_i(T) is used to determine whether the evidence is sufficient to make a decision, where P_i(T) indicates the accuracy of the decision; for example, P_i(T) = 0.8 means that the decision has an 80% probability of being correct.
In this embodiment, the output OUT_i of the basal ganglia model is as shown in equation (2): [equation not reproduced]
Step S140, obtaining a final decision time T_out from the first effective accumulation time and the second effective accumulation time: [equation not reproduced]
Step S150, estimating the motion precision W through Fitts' law based on the final decision time T_out.
In this embodiment, the motion precision W estimated by Fitts' law is as shown in equation (3): [equation not reproduced]
where a and b are two constant parameters and D is the distance moved by the joint end.
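As a non-limiting illustration, the relationship between decision time and precision can be sketched in Python using the classical formulation of Fitts' law, MT = a + b·log2(2D / W). Equation (3) is not reproduced in this text, so its exact form may differ; the function names and the use of T_out as the movement time MT are assumptions.

```python
import math


def fitts_movement_time(a, b, distance, width):
    """Classical Fitts' law: MT = a + b * log2(2 * D / W)."""
    return a + b * math.log2(2.0 * distance / width)


def fitts_width(a, b, distance, t_out):
    """Invert the classical law for W, taking t_out as the movement time."""
    return 2.0 * distance / (2.0 ** ((t_out - a) / b))
```

With b > 0, a longer decision time gives a smaller W, that is, a more precise movement, which is the speed-accuracy trade-off the method exploits.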
Step S200, calculating a supervision-term moment through a striatum-inspired speed modulation strategy based on the estimated motion precision W;
In this embodiment, the striatum-inspired speed modulation strategy is as shown in equation (4): [equation not reproduced]
wherein the joint angle of the end position is calculated from the estimated motion precision, q_s represents the joint angle of the initial position, t_S is the starting time of the movement, V_M(λ, t) is a bell-shaped velocity modulation model, t denotes the time of the modulation, and T_out represents the decision time;
the bell-shaped velocity modulation model V_M(λ, t) is as shown in equation (5): [equation not reproduced]
wherein λ is a parameter of the modulation model and t represents time t;
wherein, in the dynamic relation used to calculate the supervision-term moment, q_t is the joint angle at time t, the desired joint angular velocity and the desired joint angular acceleration are used, M(q_t) is the inertia matrix of the musculoskeletal system, a centripetal and Coriolis force term is included, and G(q_t) is the gravity matrix of the musculoskeletal system;
the inertia matrix M(q_t) of the musculoskeletal system is as shown in equation (7): [equation not reproduced]
the gravity matrix G(q_t) of the musculoskeletal system is as shown in equation (9): [equation not reproduced]
wherein m_1 represents the mass of the first link of the robotic arm, m_2 represents the mass of the second link of the robotic arm, d_1 represents the length of the first link of the robotic arm, d_2 represents the length of the second link of the robotic arm, q_{1,t} represents the angle of the first joint of the robotic arm, and q_{2,t} represents the angle of the second joint of the robotic arm, together with the angular velocities of the first joint and the second joint of the robotic arm.
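As a non-limiting illustration of how a moment can be computed from an inertia matrix, a centripetal and Coriolis term and a gravity term, the following Python sketch implements the textbook inverse dynamics of a planar two-link arm. It assumes point masses m_1 and m_2 at the link tips and joint angles measured from the horizontal; the matrices of equations (7) and (9) are not reproduced in this text and may differ from these textbook expressions.

```python
import math


def inverse_dynamics(q, dq, ddq, m1, m2, d1, d2, g=9.81):
    """Joint moments tau = M(q) * ddq + C(q, dq) + G(q) for a two-link arm."""
    q1, q2 = q
    dq1, dq2 = dq
    cos2, sin2 = math.cos(q2), math.sin(q2)

    # Inertia matrix M(q) (symmetric, so only three distinct entries).
    m11 = (m1 + m2) * d1 ** 2 + m2 * d2 ** 2 + 2.0 * m2 * d1 * d2 * cos2
    m12 = m2 * d2 ** 2 + m2 * d1 * d2 * cos2
    m22 = m2 * d2 ** 2

    # Centripetal and Coriolis terms.
    h = m2 * d1 * d2 * sin2
    c1 = -h * (2.0 * dq1 * dq2 + dq2 ** 2)
    c2 = h * dq1 ** 2

    # Gravity terms.
    g1 = (m1 + m2) * g * d1 * math.cos(q1) + m2 * g * d2 * math.cos(q1 + q2)
    g2 = m2 * g * d2 * math.cos(q1 + q2)

    tau1 = m11 * ddq[0] + m12 * ddq[1] + c1 + g1
    tau2 = m12 * ddq[0] + m22 * ddq[1] + c2 + g2
    return tau1, tau2
```

Feeding the desired joint angle, angular velocity and angular acceleration into such a function yields a moment that can serve as a supervision term for the activation network.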
Step S300, calculating a muscle activation signal vector u_t through a muscle activation signal network based on the supervision-term moment;
In this embodiment, the muscle activation signal vector u_t is calculated as shown in equation (10): [equation not reproduced]
wherein u_{t-1} is the muscle activation signal at time t-1, τ_{t-1} is the joint moment at time t-1, and the policy network μ(·|θ_μ) is a neural network for solving for the muscle activation signals. Preferably, the policy network may be the policy network of the classical DDPG method.
Step S400, calculating an action reward R_t based on the muscle activation signal vector u_t and the supervision-term moment, calculating a loss value through a preset loss function L, and adjusting the parameters of the muscle activation signal network based on that loss value so that the action reward R_t increases; if k < K, setting k = k + 1 and repeating steps S100 to S400 until k = K, where K is the preset maximum number of training iterations, to obtain the muscle activation signal sequence required for control.
In this embodiment, step S400 includes:
Step S410, calculating the action reward R_t based on the muscle activation signal vector u_t and the supervision-term moment, with R_t to be made as large as possible, as shown in equation (11): [equation not reproduced]
wherein γ is a discount factor; for the flexors at time t, the formula uses the moment generated, the force produced and the muscle activation signal generated; for the extensors at time t, it likewise uses the moment generated, the force produced and the muscle activation signal generated; p is the number of flexors, q is the number of extensors, and ω_1 and ω_2 are proportional parameters. This step achieves fine control of the redundant muscles.
In this embodiment, a higher action reward indicates a more accurate output. The aim is to bring the resultant moment closer to the supervision term while minimizing the change in the muscle activation signals of the flexors and extensors, so that a stable co-contraction forms between the flexors and the extensors.
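As a non-limiting illustration of a reward with the two stated aims, tracking the supervision term and keeping activation changes small, the following Python sketch may be considered. Equation (11) is not reproduced in this text; the absolute-value penalties, the sign convention (flexor moments minus extensor moments), the default weights and the function names are all assumptions made for illustration.

```python
def step_reward(tau_sup, flexor_moments, extensor_moments, delta_u,
                w1=1.0, w2=0.1):
    """Hypothetical per-step reward; larger (closer to zero) is better."""
    # Resultant joint moment: flexors and extensors act in opposition.
    resultant = sum(flexor_moments) - sum(extensor_moments)
    tracking_error = abs(tau_sup - resultant)
    # Total change in activation across all flexors and extensors.
    activation_change = sum(abs(d) for d in delta_u)
    return -(w1 * tracking_error + w2 * activation_change)


def discounted_return(rewards, gamma=0.99):
    """Discounted sum of rewards, R_t = r_t + gamma * R_{t+1}."""
    ret = 0.0
    for r in reversed(rewards):
        ret = r + gamma * ret
    return ret
```

Here `w1` and `w2` play the role of the proportional parameters ω_1 and ω_2, and `gamma` the role of the discount factor γ.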
Step S420, calculating the loss function L of the evaluation network Q_μ based on the action reward R_t, as shown in equation (12): [equation not reproduced]
wherein θ_Q denotes the parameters of the evaluation network, Δu_{t+1} is the muscle activation signal at time t+1, and Δu_t is the muscle activation signal at time t; the muscle activation signal Δu_t here includes the flexor muscle activation signals and the extensor muscle activation signals; the evaluation network Q_μ represents a state-action value function;
Step S430, updating the parameters θ_Q of the evaluation network Q_μ based on the loss function L, as shown in equation (13): [equation not reproduced]
wherein η_1 represents the update step size;
updating the parameters θ_μ of the policy network μ(·|θ_μ) based on the evaluation network Q_μ, as shown in equation (14): [equation not reproduced]
wherein η_2 represents the update step size, and the gradient of the policy network μ(·|θ_μ) is as shown in equation (15): [equation not reproduced]
Step S440, if t ≠ T_out, setting t = t + 1 and repeating steps S200 to S400 until t = T_out;
Step S450, if k ≠ K, setting k = k + 1 and repeating steps S100 to S400 until k = K, at which point the muscle activation signals u_t (t ∈ [1, T]) form the muscle activation signal sequence required to accomplish control. The calculation proceeds as follows: the initial activation signal is taken to be u_{t-1}; the model calculates the variation Δu_t at each moment; at the next moment, the signal input to the muscle becomes u_{t-1} + Δu_t; this is repeated until t = T. To distinguish the sequence from a single value, u_t (t ∈ [1, T]) denotes the sequence and u_t denotes a single value.
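As a non-limiting illustration, the accumulation of activation increments and a temporal-difference loss of the kind used by DDPG-style evaluation networks can be sketched in Python as follows. Equations (12) to (15) are not reproduced in this text; the clipping of activations to [0, 1], the mean-squared form of the loss and the function names are assumptions.

```python
def rollout_activations(u_initial, increments):
    """Apply u_t = u_{t-1} + delta_u_t step by step; return the sequence."""
    sequence = []
    u = list(u_initial)
    for delta in increments:
        # ASSUMPTION: activations are kept within the range [0, 1].
        u = [min(1.0, max(0.0, ui + di)) for ui, di in zip(u, delta)]
        sequence.append(u)
    return sequence


def td_loss(rewards, q_values, next_q_values, gamma=0.99):
    """Mean squared error between Q(s_t, a_t) and r_t + gamma * Q(s_t+1, a_t+1)."""
    targets = [r + gamma * nq for r, nq in zip(rewards, next_q_values)]
    squared = [(y - q) ** 2 for y, q in zip(targets, q_values)]
    return sum(squared) / len(squared)
```

In a full implementation the gradient of such a loss would drive the update of θ_Q with step size η_1, and the evaluation network would in turn provide the gradient used to update θ_μ with step size η_2.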
For highly redundant, highly coupled musculoskeletal robot systems, on the one hand, the invention draws on the neural mechanism of the speed-accuracy trade-off in living beings and, by simulating the cortico-basal ganglia neural circuit, proposes a biologically plausible, computable decision model of the basal ganglia. In addition, an active speed precision balance model is proposed by combining Fitts' law with the speed modulation strategy of the FSI-SPN neuronal circuit in the striatum, which makes it possible to adjust skilled motor performance flexibly according to environmental information and task-related parameters. On the other hand, to achieve efficient training and control of the musculoskeletal robot system, a supervision term is introduced into the Markov decision process (MDP) to construct a supervised MDP algorithm, and the control process is divided into two stages, motion planning and motion execution, as the basis for realizing motion variability. In the motion execution stage, an antagonistic-muscle co-contraction strategy is designed to explore the cooperative relationship between antagonistic muscles, yielding a general redundant-muscle control algorithm. Finally, this algorithm is combined with the active speed precision balance model to realize the adaptability of neuro-inspired movement on the musculoskeletal system.
The musculoskeletal system control system based on speed precision balance comprises a precision estimation module, an expected moment calculation module, an activation signal calculation module and a speed precision balance module;
the training count is set to k = 1;
the precision estimation module is configured to obtain the estimated motion precision W of the musculoskeletal system at time t through Fitts' law;
the expected moment calculation module is configured to calculate a supervision-term moment through a striatum-inspired speed modulation strategy based on the estimated motion precision W;
the activation signal calculation module is configured to calculate a muscle activation signal vector u_t through a muscle activation signal network based on the supervision-term moment;
the speed precision balance module is configured to calculate an action reward R_t based on the muscle activation signal vector u_t and the supervision-term moment, then calculate a preset loss function L, and adjust the parameters of the muscle activation signal network based on the preset loss function L so that the action reward R_t increases; with k = k + 1, the functions of the precision estimation module through the speed precision balance module are repeated until k = K, where K is the preset maximum number of training iterations, to obtain the muscle activation signal sequence required for control.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process and related description of the system described above may refer to the corresponding process in the foregoing method embodiments, and will not be described herein again.
It should be noted that the musculoskeletal system control system based on speed precision tradeoff provided in the above embodiment is only illustrated by the division of the above functional modules, and in practical applications, the above functions may be allocated to different functional modules according to needs, that is, the modules or steps in the embodiment of the present invention are further decomposed or combined, for example, the modules in the above embodiment may be combined into one module, or may be further split into multiple sub-modules, so as to complete all or part of the above described functions. The names of the modules and steps involved in the embodiments of the present invention are only for distinguishing the modules or steps, and are not to be construed as unduly limiting the present invention.
A storage device according to a third embodiment of the invention has stored therein a plurality of programs adapted to be loaded and executed by a processor to implement the method of musculoskeletal system control based on speed accuracy trade-off described above.
A processing apparatus according to a fourth embodiment of the present invention includes a processor, a storage device; a processor adapted to execute various programs; a storage device adapted to store a plurality of programs; the program is adapted to be loaded and executed by a processor to implement the aforementioned musculoskeletal system control method based on speed accuracy trade-offs.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes and related descriptions of the storage device and the processing device described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Those skilled in the art will appreciate that the illustrative modules and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both, and that the programs corresponding to the software modules and method steps may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. To clearly illustrate this interchangeability of electronic hardware and software, the components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as electronic hardware or software depends on the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope of the present invention.
The terms "first," "second," and the like are used for distinguishing between similar elements and not necessarily for describing or implying a particular order or sequence.
The terms "comprises," "comprising," or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.
Claims (10)
1. A musculoskeletal system control method based on speed accuracy trade-off, the control method comprising:
initializing the training iteration counter k to 1;
step S100, obtaining the estimated motion accuracy W of the musculoskeletal system at time t by means of Fitts' law;
step S200, calculating the supervision-term torque, based on the estimated motion accuracy W, by means of a striatum-inspired speed modulation strategy;
step S300, computing the muscle activation signal vector u_t from the supervision-term torque by means of a muscle activation signal network;
step S400, calculating an action reward R_t based on the muscle activation signal vector u_t and the supervision-term torque, calculating a loss value with a preset loss function L, and adjusting the parameters of the muscle activation signal network based on that loss value so that the action reward R_t increases; if k < K, setting k = k + 1 and repeating steps S100 to S400 until k = K, where K is the preset maximum number of training iterations, to obtain the muscle activation signal sequence required for control.
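The control flow of steps S100 to S400 can be sketched as a simple loop. This is only an illustrative skeleton: the function name `train_controller` and its four callable parameters are hypothetical stand-ins for the accuracy estimate, the supervision torque, the activation network and the network update, none of which are specified here in executable form.

```python
from typing import Callable, List, Sequence


def train_controller(
    estimate_accuracy: Callable[[], float],
    supervision_torque: Callable[[float], Sequence[float]],
    activation_network: Callable[[Sequence[float]], Sequence[float]],
    update_network: Callable[[Sequence[float], Sequence[float]], float],
    max_iterations: int,
) -> List[Sequence[float]]:
    """Repeat steps S100-S400 for k = 1..K and collect one activation per pass."""
    activations: List[Sequence[float]] = []
    k = 1  # training iteration counter starts at 1
    while True:
        width = estimate_accuracy()            # S100: estimated motion accuracy W
        torque = supervision_torque(width)     # S200: supervision torque from W
        u_t = activation_network(torque)       # S300: muscle activation vector
        update_network(u_t, torque)            # S400: reward, loss, parameter update
        activations.append(u_t)
        if k >= max_iterations:                # stop once k reaches K
            break
        k += 1
    return activations


if __name__ == "__main__":
    # Hypothetical stubs, used only to exercise the loop structure.
    result = train_controller(
        estimate_accuracy=lambda: 0.05,
        supervision_torque=lambda w: [w, 2.0 * w],
        activation_network=lambda tau: [0.5 for _ in tau],
        update_network=lambda u, tau: 0.0,
        max_iterations=3,
    )
    print(len(result))
```

Passing the four steps in as callables keeps the loop independent of how each step is realized, which mirrors the statement that the modules may be split or merged as needed.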
2. The method for controlling a musculoskeletal system based on a speed accuracy tradeoff according to claim 1, wherein the step S100 comprises:
presetting an accumulation time T;
step S110, obtaining, through a cortex model, the perceptual evidence x_i(t_1) ~ N(μ_i, σ²) at time t_1, and from it the accumulated perceptual evidence Y_i(T):
step S120, inputting the accumulated perceptual evidence Y_i(T) into the basal ganglia model to obtain the output OUT_i of the basal ganglia model;
step S130, comparing the output OUT_i of the basal ganglia model with the preset decision threshold -ln P_i(T) to obtain a first effective accumulation time and a second effective accumulation time; if T > 0, setting T = T - 1 and repeating steps S110 to S130;
wherein, when OUT_i ≥ -ln P_i(T) occurs for the first time, the accumulation time T corresponding to OUT_i is set as the first effective accumulation time; when OUT_i < -ln P_i(T), the accumulation time T corresponding to OUT_i is set as the second effective accumulation time; each second effective accumulation time newly generated during the iteration overwrites the one generated previously; and -ln P_i(T) is the decision threshold;
step S140, obtaining the final decision time T_out from the first effective accumulation time and the second effective accumulation time;
step S150, estimating the motion accuracy W from the final decision time T_out by means of Fitts' law.
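The two ingredients of this claim, accumulating Gaussian evidence until a threshold is crossed and mapping a time to a precision through Fitts' law, can be illustrated with a minimal sketch. Several simplifications are assumed: the basal ganglia model is replaced by a plain threshold on the running sum, the evidence is accumulated forward in time, and the common form MT = a + b·log2(2D/W) of Fitts' law is inverted with the decision time used as MT. The function names and the constants a, b and D are hypothetical.

```python
import math
import random


def accumulate_evidence(mu, sigma, steps, rng):
    """Return the running sums of `steps` evidence samples x ~ N(mu, sigma^2)."""
    total = 0.0
    sums = []
    for _ in range(steps):
        total += rng.gauss(mu, sigma)
        sums.append(total)
    return sums


def first_crossing_time(values, threshold):
    """Return the 1-based index of the first value >= threshold, or None."""
    for step, value in enumerate(values, start=1):
        if value >= threshold:
            return step
    return None


def fitts_target_width(movement_time, distance, a, b):
    """Invert Fitts' law MT = a + b * log2(2 * D / W) for the width W."""
    return 2.0 * distance / 2.0 ** ((movement_time - a) / b)


if __name__ == "__main__":
    rng = random.Random(0)
    evidence = accumulate_evidence(mu=0.3, sigma=1.0, steps=50, rng=rng)
    threshold = -math.log(0.05)  # decision threshold of the form -ln P
    print(first_crossing_time(evidence, threshold))
    print(fitts_target_width(movement_time=0.9, distance=0.4, a=0.1, b=0.2))
```

In this form a longer available time yields a smaller W, that is, a more precise movement, which is the qualitative speed-accuracy relationship the claim relies on.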
3. The method of musculoskeletal system control based on speed accuracy trade-off of claim 2, wherein the striatal inspired speed modulation strategy is:
wherein the joint angle of the end position is calculated from the estimated motion accuracy, q_s denotes the joint angle of the initial position, t_S is the starting time of the motion, V_M(λ, t) is a bell-shaped velocity modulation model, t denotes the time, and T_out denotes the decision time;
the bell-shaped velocity modulation model V_M(λ, t) is:
wherein λ is a parameter of the modulation model and t denotes the time;
wherein q_t is the joint angle at time t, and the other terms are the desired joint angular velocity, the desired joint angular acceleration, the inertia matrix M(q_t) of the musculoskeletal system, the centripetal and Coriolis force term, and the gravity matrix G(q_t) of the musculoskeletal system;
the inertia matrix M(q_t) of the musculoskeletal system is:
wherein m_1 denotes the mass of the first link of the robotic arm, m_2 denotes the mass of the second link, d_1 denotes the length of the first link, d_2 denotes the length of the second link, q_{1,t} denotes the angle of the first joint, q_{2,t} denotes the angle of the second joint, and the remaining two symbols are the angular velocities of the first and second joints of the robotic arm.
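The quantities listed in this claim (inertia matrix, centripetal and Coriolis term, gravity term, two link masses and lengths) match the usual rigid-body inverse dynamics of a planar two-link arm. The sketch below uses the textbook model with point masses at the ends of the links and joint angles measured from the horizontal; that model is an assumption and is not taken from the claim, whose own matrices are not reproduced in this text. The function names are hypothetical.

```python
import math


def inertia_matrix(q2, m1, m2, d1, d2):
    """Inertia matrix of a planar two-link arm with point masses at the link ends."""
    c2 = math.cos(q2)
    m11 = (m1 + m2) * d1 ** 2 + m2 * d2 ** 2 + 2.0 * m2 * d1 * d2 * c2
    m12 = m2 * d2 ** 2 + m2 * d1 * d2 * c2
    m22 = m2 * d2 ** 2
    return [[m11, m12], [m12, m22]]


def inverse_dynamics(q, dq, ddq, m1, m2, d1, d2, g=9.81):
    """Joint torques M(q)*ddq + (centripetal/Coriolis term) + G(q)."""
    q1, q2 = q
    M = inertia_matrix(q2, m1, m2, d1, d2)
    h = m2 * d1 * d2 * math.sin(q2)
    # Velocity-dependent (centripetal and Coriolis) torques.
    coriolis = [-h * (2.0 * dq[0] * dq[1] + dq[1] ** 2), h * dq[0] ** 2]
    # Gravity torques, with angles measured from the horizontal axis.
    gravity = [
        (m1 + m2) * g * d1 * math.cos(q1) + m2 * g * d2 * math.cos(q1 + q2),
        m2 * g * d2 * math.cos(q1 + q2),
    ]
    return [
        M[i][0] * ddq[0] + M[i][1] * ddq[1] + coriolis[i] + gravity[i]
        for i in range(2)
    ]


if __name__ == "__main__":
    torque = inverse_dynamics(
        q=(0.3, 0.6), dq=(0.5, -0.2), ddq=(1.0, 0.4),
        m1=1.0, m2=0.8, d1=0.3, d2=0.25,
    )
    print(torque)
```

Feeding a desired velocity and acceleration profile through such a function gives a torque trajectory that can act as a supervision signal for the activation network.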
4. The musculoskeletal system control method based on speed accuracy trade-off of claim 3, wherein the muscle activation signal vector u_t is calculated as follows:
wherein u_{t-1} is the muscle activation signal at time t-1, τ_{t-1} is the joint torque at time t-1, and the policy network μ(·|θ^μ) is the neural network used to solve for the muscle activation signals.
5. The method for controlling a musculoskeletal system based on a speed accuracy tradeoff according to claim 4, wherein the step S400 comprises:
step S410, calculating the action reward R_t based on the muscle activation signal vector u_t and the supervision-term torque, with R_t made as large as possible:
wherein γ is a discount factor; for the flexors, the quantities are the torque generated at time t, the force generated at time t, and the muscle activation signal generated at time t; for the extensors, the quantities are likewise the torque, the force, and the muscle activation signal generated at time t; p is the number of flexors, q is the number of extensors, and ω_1 and ω_2 are proportional parameters;
step S420, computing the loss function L of the evaluation network Q^μ based on the action reward R_t:
wherein θ^Q denotes the parameters of the evaluation network, Δu_{t+1} is the muscle activation signal at time t+1, and Δu_t is the muscle activation signal at time t;
step S430, updating the parameters θ^Q of the evaluation network Q^μ based on the loss function L:
wherein η_1 denotes the update step size;
updating the parameters θ^μ of the policy network μ(·|θ^μ) based on the evaluation network Q^μ:
wherein η_2 denotes the update step size, and the remaining term is the gradient of the policy network μ(·|θ^μ):
step S440, if t ≠ T_out, setting t = t + 1 and repeating steps S200 to S400 until t = T_out;
step S450, if k ≠ K, setting k = k + 1 and repeating steps S100 to S400 until k = K, at which point the muscle activation signals u_t (t ∈ [1, T]) form the muscle activation signal sequence required for control.
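The pattern of an evaluation network updated by descending a loss and a policy network updated through the evaluation network resembles a deterministic actor-critic scheme. The toy sketch below illustrates that general pattern only, under strong assumptions: both networks are reduced to single scalar weights, the loss is taken to be a squared temporal-difference error, and the reward is given as a number. None of these choices, nor the function names, come from the claim, whose own reward and loss formulas are not reproduced in this text.

```python
def critic_step(theta_q, u, reward, u_next, gamma, eta1):
    """One descent step on L = (reward + gamma * Q(u_next) - Q(u))**2.

    The critic is the toy model Q(u) = theta_q * u, and the target is held
    constant when differentiating (a semi-gradient step).
    Returns the updated weight and the loss before the update.
    """
    target = reward + gamma * theta_q * u_next
    td_error = target - theta_q * u
    grad = -2.0 * td_error * u            # dL/dtheta_q with the target fixed
    return theta_q - eta1 * grad, td_error ** 2


def actor_step(theta_mu, theta_q, torque, eta2):
    """One ascent step on Q(mu(torque)) for the toy policy mu = theta_mu * torque."""
    grad = theta_q * torque               # dQ/du * du/dtheta_mu
    return theta_mu + eta2 * grad


if __name__ == "__main__":
    theta_q, theta_mu = 0.0, 0.5
    for _ in range(5):
        theta_q, loss = critic_step(theta_q, u=1.0, reward=1.0,
                                    u_next=0.0, gamma=0.9, eta1=0.1)
        theta_mu = actor_step(theta_mu, theta_q, torque=2.0, eta2=0.1)
        print(round(loss, 4), round(theta_q, 4), round(theta_mu, 4))
```

The two step sizes play the roles of η_1 and η_2: the critic moves against the gradient of its loss, while the policy moves along the gradient of the critic's value.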
8. A musculoskeletal system control system based on speed accuracy trade-off, the system comprising an accuracy estimation module, an expected torque calculation module, an activation signal calculation module and a speed-accuracy trade-off module;
the training iteration counter k is initialized to 1;
the accuracy estimation module is configured to obtain the estimated motion accuracy W of the musculoskeletal system at time t by means of Fitts' law;
the expected torque calculation module is configured to calculate the supervision-term torque, based on the estimated motion accuracy W, by means of a striatum-inspired speed modulation strategy;
the activation signal calculation module is configured to compute the muscle activation signal vector u_t from the supervision-term torque by means of a muscle activation signal network;
the speed-accuracy trade-off module is configured to calculate an action reward R_t based on the muscle activation signal vector u_t and the supervision-term torque, then evaluate a preset loss function L and adjust the parameters of the muscle activation signal network based on L so that the action reward R_t increases; k is then set to k + 1 and the functions of the accuracy estimation module through the speed-accuracy trade-off module are repeated until k = K, where K is the preset maximum number of training iterations, yielding the muscle activation signal sequence required for control.
9. A storage device having stored therein a plurality of programs, wherein said programs are adapted to be loaded and executed by a processor to implement the method of musculoskeletal system control based on speed accuracy trade-offs of any one of claims 1-7.
10. A processing apparatus comprising a processor adapted to execute programs; and a storage device adapted to store a plurality of programs, wherein the programs are adapted to be loaded and executed by a processor to implement the method of musculoskeletal system control based on speed accuracy trade-off of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011610884.4A CN112757275B (en) | 2020-12-30 | 2020-12-30 | Method, system and device for controlling musculoskeletal system based on speed precision balance |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112757275A (en) | 2021-05-07 |
CN112757275B (en) | 2022-02-25 |
Family
ID=75695918
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011610884.4A Active CN112757275B (en) | 2020-12-30 | 2020-12-30 | Method, system and device for controlling musculoskeletal system based on speed precision balance |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112757275B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040229198A1 (en) * | 2003-05-15 | 2004-11-18 | Cns Vital Signs, Llc | Methods and systems for computer-based neurocognitive testing |
CN107199569A (en) * | 2017-06-22 | 2017-09-26 | 华中科技大学 | A kind of articulated robot method for planning track distributed based on joint balancing energy |
CN108115681A (en) * | 2017-11-14 | 2018-06-05 | 深圳先进技术研究院 | Learning by imitation method, apparatus, robot and the storage medium of robot |
CN108724191A (en) * | 2018-06-27 | 2018-11-02 | 芜湖市越泽机器人科技有限公司 | A kind of robot motion's method for controlling trajectory |
JP2020031508A (en) * | 2018-08-24 | 2020-02-27 | 株式会社日立産機システム | Control device of ac motor and control method thereof |
CN111515929A (en) * | 2020-04-15 | 2020-08-11 | 深圳航天科技创新研究院 | Human motion state estimation method, device, terminal and computer readable storage medium |
Non-Patent Citations (2)
Title |
---|
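李俊佑: "Research on the speed-accuracy trade-off based on time constraints and force feedback", China Master's Theses Full-text Database, Information Science and Technology * |
郭小军: "Speed, accuracy and their trade-off: evaluation and modeling of subjects' response states", China Doctoral Dissertations Full-text Database, Philosophy and Humanities * |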
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113199460A (en) * | 2021-05-24 | 2021-08-03 | 中国科学院自动化研究所 | Nonlinear musculoskeletal robot control method, system and equipment |
CN114918914A (en) * | 2022-04-26 | 2022-08-19 | 中国科学院自动化研究所 | Human body musculoskeletal simulation control system and simulation device |
CN114918914B (en) * | 2022-04-26 | 2024-03-22 | 中国科学院自动化研究所 | Simulation control system and simulation device for human musculature |
CN115070760A (en) * | 2022-06-16 | 2022-09-20 | 中国科学院自动化研究所 | Method and device for controlling musculoskeletal mechanical arm |
Also Published As
Publication number | Publication date |
---|---|
CN112757275B (en) | 2022-02-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112757275B (en) | Method, system and device for controlling musculoskeletal system based on speed precision balance | |
CN110909859B (en) | Bionic robot fish motion control method and system based on antagonistic structured control | |
Nguyen-Tuong et al. | Using model knowledge for learning inverse dynamics | |
Kober et al. | Reinforcement learning to adjust robot movements to new situations | |
CN110119844A (en) | Introduce robot motion's decision-making technique, the system, device of Feeling control mechanism | |
CN108115681A (en) | Learning by imitation method, apparatus, robot and the storage medium of robot | |
Higuera et al. | Synthesizing neural network controllers with probabilistic model-based reinforcement learning | |
CN113199460B (en) | Nonlinear musculoskeletal robot control method, system and device | |
Katliar et al. | Nonlinear model predictive control of a cable-robot-based motion simulator | |
CN112405542B (en) | Musculoskeletal robot control method and system based on brain inspiring multitask learning | |
Wu et al. | Semi-parametric Gaussian process for robot system identification | |
Wochner et al. | Optimality principles in human point-to-manifold reaching accounting for muscle dynamics | |
CN105205533A (en) | Development automatic machine with brain cognition mechanism and learning method of development automatic machine | |
CN110516389A (en) | Learning method, device, equipment and the storage medium of behaviour control strategy | |
CN110059439A (en) | A kind of spacecraft orbit based on data-driven determines method | |
JP2023548964A (en) | Methods and systems for modeling and controlling partially measurable systems | |
CN114474078B (en) | Friction force compensation method and device for mechanical arm, electronic equipment and storage medium | |
Polydoros et al. | Online multi-target learning of inverse dynamics models for computed-torque control of compliant manipulators | |
CN114802817A (en) | Satellite attitude control method and device based on multi-flywheel array | |
Bae et al. | Curriculum learning for vehicle lateral stability estimations | |
CN111531543B (en) | Robot self-adaptive impedance control method based on biological heuristic neural network | |
CN110515297B (en) | Staged motion control method based on redundant musculoskeletal system | |
Zhang et al. | Trajectory-tracking control of robotic system via proximal policy optimization | |
Wang et al. | Model-free event-triggered optimal control with performance guarantees via goal representation heuristic dynamic programming | |
CN115421387A (en) | Variable impedance control system and control method based on inverse reinforcement learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||