CN112894809B - Impedance controller design method and system based on reinforcement learning - Google Patents
- Publication number
- CN112894809B (application CN202110061914.9A)
- Authority
- CN
- China
- Prior art keywords
- learning
- control
- function
- impedance
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1628—Programme controls characterised by the control loop
- B25J9/163—Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1602—Programme controls characterised by the control system, structure, architecture
- B25J9/1605—Simulation of manipulator lay-out, design, modelling of manipulator
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1628—Programme controls characterised by the control loop
- B25J9/1633—Programme controls characterised by the control loop compliant, force, torque control, e.g. combined with position control
Landscapes
- Engineering & Computer Science (AREA)
- Robotics (AREA)
- Mechanical Engineering (AREA)
- Automation & Control Theory (AREA)
- Feedback Control In General (AREA)
Abstract
The invention discloses an impedance controller design method and system based on reinforcement learning, belonging to the field of robot control. The method comprehensively considers the control input, the position and velocity of the controlled system, and the influence of the external force, and designs an effective reward function and value function by exploiting the proportional relation between the external force and the position of the controlled system. An optimal impedance controller can be designed through reinforcement learning even when the system model and the environment model are unknown, and the response characteristics of the system can be modified by adjusting parameters, so that an ideal robot impedance controller is generated. The method fixes the form of the value function, which greatly reduces the number of undetermined coefficients; no complex deep network is needed to fit the value function, which greatly accelerates the learning process.
Description
Technical Field
The invention belongs to the field of robot control, and particularly relates to an impedance controller design method and system based on reinforcement learning.
Background
With the advent of compliant operation and human-robot interaction scenarios, the goal of robot control is no longer solely to reduce position error, and robot compliance is receiving more and more attention. Impedance control is a very effective robot compliance control method: when an external force is present, a balance is automatically maintained between the external force and the target position, avoiding rigid collisions and excessive contact forces and thereby protecting the robot, the workpiece and the user; when no external force is present, high position accuracy can be achieved, meeting the requirements of a wide variety of tasks.
Patent CN202010771033.1 proposes an adaptive impedance control method for a manipulator based on an RBF neural network, but it requires a nominal dynamics model to design the impedance controller and an error-compensation controller, and its structure is complicated. Patent CN201910352004.9 needs to identify the dynamic parameters of the robot through preprocessing. Patent CN201910287227.1 proposes a model-free robot multi-axis hole-assembly control optimization method using environment prediction, applying reinforcement learning to robot assembly, but the adopted deep reinforcement learning method converges slowly, requires long training times, and is therefore of limited practical use.
Disclosure of Invention
In view of the above drawbacks and needs of the prior art, the present invention provides a method and system for designing an impedance controller based on reinforcement learning, which aims to obtain an optimal impedance controller quickly without knowing a system dynamics model.
To achieve the above object, according to one aspect of the present invention, there is provided an impedance controller design method based on reinforcement learning, including:
S1, designing a reward function and a value function: the reward function is set as r = -(u^T u + Q_x q^2 + Q_v q̇^2 + Q_f f^2), and the value function is set as V(X, u) = θ^T (Z ⊗ Z) with Z = [X^T, u^T]^T, where q, q̇ and f respectively denote the current position, velocity and external force of the controlled system; Q_f, Q_x, Q_v are respectively the weights of the external force, position and velocity in the impedance controller design objective; u = KX is the control input of the system; K is the impedance control parameter to be designed and optimized; ⊗ denotes the Kronecker product of matrices; and θ is the value function parameter;
and S2, estimating θ with a reinforcement learning method based on the reward function and the value function to obtain the optimal impedance control parameter K, thereby completing the design of the impedance controller.
Further, step S1 specifically includes:
S101, the external force f borne by the controlled system is regarded as part of the system state, giving the augmented state vector X = [q, q̇, f]^T, where q, q̇ and f respectively denote the current position, velocity and external force of the controlled system; the impedance controller takes the form u = KX, where u is the control input of the system and K is the impedance control parameter to be designed and optimized;
S102, the control input u, the current position q, the velocity q̇ and the external force f of the controlled system are regarded as the cost of controlling the system, and the cost function is set as c_k = u^T u + Q_1 q^2 + Q_2 q̇^2 + Q_3 f^2, where Q_1, Q_2, Q_3 are the control weights and are all positive real numbers;
S104, the reward function is designed as the negative of the cost function, r = -c_k, and the value function is designed as the accumulation of the reward functions.
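As a non-limiting illustration of steps S101 to S104, a minimal sketch of the augmented state, the cost and the reward is given below. The function names (cost, reward), the state dimension and the numerical weights are illustrative assumptions and are not part of the claimed method.

```python
import numpy as np

# Augmented state X = [q, dq, f]^T and linear controller u = K X, as in S101.
def cost(X, u, Q1, Q2, Q3):
    """Quadratic cost c_k of S102: control effort plus weighted position,
    velocity and external-force terms (Q1, Q2, Q3 are positive reals)."""
    q, dq, f = X
    return float(u @ u) + Q1 * q**2 + Q2 * dq**2 + Q3 * f**2

def reward(X, u, Q1, Q2, Q3):
    """Reward of S104: the negative of the cost."""
    return -cost(X, u, Q1, Q2, Q3)

# Example for a 1-DOF system with a 1 x 3 gain row K.
K = np.zeros((1, 3))
X = np.array([0.05, 0.0, 2.0])        # position, velocity, external force
u = K @ X
print(reward(X, u, Q1=10.0, Q2=1.0, Q3=0.5))
```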
Further, the order of the elements in the augmented state vector X is arbitrary; the specific forms of K, c_k, r, V(X, u) and θ vary with the order of the elements in X.
Further, step S2 is specifically:
S201, learning process initialization: K is set to a zero vector, θ is set to a zero vector, and the update period i_update is set, i_update being a positive integer;
S202, learning period initialization: the controlled system is set to the initial state X_ΔT, and the learning parameter P = δH is set, where δ is a positive integer and H is the n×n identity matrix;
S203, the control input u = KX_iΔT + σ·Rand is calculated, where Rand is a random number and σ is a weighting factor; X_iΔT is the system state of the current control period, i = 1, 2, 3, …, and ΔT is the control period of the controlled system;
S205, the system state X_(i+1)ΔT of the next control period is obtained, and the value function parameter θ and the learning parameter P are updated: θ = θ + gradient, where gradient is an intermediate quantity and γ is a prediction factor with 0 < γ < 1;
S206, the impedance control parameter K is updated: when i is a multiple of i_update, the elements of θ are arranged in order into an n×n matrix H, and H is partitioned into blocks H_11, H_12, H_21, H_22, where H_21 is a matrix with the same dimensions as K; let K_updated = K - l·(H_21 + K·H_22), K = K_updated; l is an update weight;
S207, learning period termination judgment: if iΔT ≥ T_max, the learning period ends; otherwise let i = i + 1 and return to S203; T_max is the maximum learning period length;
S208, learning termination judgment: the control laws before and after the k-th learning period are u = K_(k-1)X and u = K_kX respectively; if max(abs(K_k - K_(k-1))) ≤ ε, the learning process terminates and the obtained impedance controller is u = K_kX; otherwise, return to S202; ε is the termination decision threshold.
Corresponding to the implementation process of the above method, the invention also provides an impedance controller design system based on reinforcement learning, which comprises:
the control target design module, used for designing the reward function and the value function: the reward function is set as r = -(u^T u + Q_x q^2 + Q_v q̇^2 + Q_f f^2), and the value function is set as V(X, u) = θ^T (Z ⊗ Z) with Z = [X^T, u^T]^T, where q, q̇ and f respectively denote the current position, velocity and external force of the controlled system; Q_f, Q_x, Q_v are respectively the weights of the external force, position and velocity in the impedance controller design objective; u = KX is the control input of the system; K is the impedance control parameter to be designed and optimized; ⊗ denotes the Kronecker product of matrices; θ is the value function parameter;
and the impedance control parameter optimization module is used for estimating theta by adopting a reinforcement learning method based on the reward function and the value function to obtain an optimal impedance control parameter K and complete the design of the impedance controller.
In general, the above technical solutions contemplated by the present invention can achieve the following advantageous effects compared to the prior art.
(1) The method comprehensively considers the control input, the position and velocity of the controlled system, and the influence of the external force, and designs an effective reward function and value function by exploiting the proportional relation between the external force and the position of the controlled system. An optimal impedance controller can be designed through reinforcement learning even when the system model and the environment model are unknown, and the response characteristics of the system can be modified by adjusting parameters, so that an ideal robot impedance controller is generated.
(2) The method of the invention fixes the form of the value function, which greatly reduces the number of undetermined coefficients; no complex deep network is needed to fit the value function, which greatly accelerates the learning process.
Drawings
FIG. 1 is a flow chart of a method for designing an impedance controller based on reinforcement learning according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Referring to fig. 1, the method for designing an impedance controller based on reinforcement learning according to the present invention includes:
s1, designing a reward function and a value function;
S101, the external force f borne by the controlled system is regarded as part of the system state, giving the augmented state vector X = [q, q̇, f]^T, where q, q̇ and f respectively denote the current position, velocity and external force of the controlled system; the impedance controller is configured in the form u = KX, where u is the control input of the system and K is the impedance control parameter to be designed and optimized.
S102, the control input u, the current position q, the velocity q̇ and the external force f of the controlled system are regarded as the cost of controlling the system, and the cost function is set as c_k = u^T u + Q_1 q^2 + Q_2 q̇^2 + Q_3 f^2, where Q_1, Q_2, Q_3 are the control weights and are all positive real numbers.
S103, the external force f borne by the controlled system is proportional to the position of the controlled system, i.e. f = F_e·q. Substituting this into the cost function gives c_k = u^T u + (Q_1 + Q_3·F_e^2) q^2 + Q_2 q̇^2, which can be written in the matrix form c_k = X^T Q X + u^T u.
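To make the substitution of S103 concrete, the following sketch checks numerically that absorbing the force term into the position weight leaves the cost unchanged. The environment stiffness Fe, the function names and all numbers are illustrative assumptions.

```python
import numpy as np

def cost_with_force(q, dq, u, Q1, Q2, Q3, Fe):
    """Cost of S102 with the external force modelled as f = Fe * q (S103)."""
    f = Fe * q
    return float(u @ u) + Q1 * q**2 + Q2 * dq**2 + Q3 * f**2

def cost_absorbed(q, dq, u, Q1, Q2, Q3, Fe):
    """Equivalent cost with the force weight absorbed into the position weight."""
    return float(u @ u) + (Q1 + Q3 * Fe**2) * q**2 + Q2 * dq**2

u = np.array([0.3])
print(cost_with_force(0.05, 0.1, u, 10.0, 1.0, 0.5, Fe=200.0))
print(cost_absorbed(0.05, 0.1, u, 10.0, 1.0, 0.5, Fe=200.0))    # same value
```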
S104, the reward function is designed as the negative of the cost function, r = -c_k, and the value function is designed as the accumulation of the reward functions. The value function is parameterized as V(X, u) = θ^T (Z ⊗ Z), where Z = [X^T, u^T]^T and ⊗ denotes the Kronecker product of matrices; θ is the value function parameter to be estimated for the given control weights Q_1, Q_2, Q_3.
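A minimal sketch of the value-function parameterization of S104 is given below. It assumes the quadratic form in which the value is linear in the Kronecker product of the stacked state-and-input vector with itself; the variable names and dimensions are illustrative.

```python
import numpy as np

def value(theta, X, u):
    """Value function V(X, u) = theta^T (Z kron Z), with Z = [X; u]."""
    Z = np.concatenate([X, np.atleast_1d(u)])
    return float(theta @ np.kron(Z, Z))

# With n = len(X) + len(u), theta has n*n entries and reshapes into the
# n x n matrix H used later in S206 to update K.
n = 4                                  # e.g. three state components + one input
theta = np.zeros(n * n)
X = np.array([0.05, 0.0, 2.0])
u = np.array([0.3])
print(value(theta, X, u))              # 0.0 before any learning
```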
The reward function expresses the design objective of the impedance controller. The reward function proposed by the invention comprehensively considers the control input, the position and velocity of the controlled system, and the influence of the external force, and, by exploiting the proportional relation between the external force and the position of the controlled system, its form has a clear physical meaning, so that the stability and convergence of the learning process can be guaranteed in actual use. At the same time, Q_f, Q_x and Q_v are respectively the weights of the external force, position and velocity in the impedance controller design objective, so the user can flexibly adjust the design objective according to the requirements of the application scenario and obtain the optimal impedance controller for that scenario.
In the reinforcement learning process, the true value function is obtained through continuous learning iterations. A wrong value function form leads to learning failure, while an overly complex value function form (such as a deep neural network) makes the learning process tedious. The value function form proposed by the invention greatly reduces the number of undetermined coefficients; no complex deep network is needed to fit the value function, which greatly accelerates the learning process.
And S2, estimating theta by adopting a reinforcement learning method based on the reward function and the value function to obtain an optimal impedance control parameter K, and finishing the design of the impedance controller.
Completing the controller design with the reinforcement learning method comprises the following steps:
s201, initializing a learning process, setting K as a zero vector, setting theta as a zero vector, and setting an updating period i update ,i update Is a positive integer.
S202, learning period initialization: the controlled system is set to the initial state X_ΔT, and the learning parameter P = δH is set, where δ is a positive integer and H is the n×n identity matrix. The system state is recorded as X_iΔT, where i = 1, 2, 3, … and ΔT is the control period of the controlled system. Initially, i = 1.
S203, the control input u = KX_iΔT + σ·Rand is calculated, where Rand is a random number and σ is a weighting factor.
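A sketch of the exploratory control input of S203 is shown below. The uniform noise distribution and its range are assumptions, since the text only states that Rand is a random number weighted by σ.

```python
import numpy as np

rng = np.random.default_rng(0)

def control_input(K, X, sigma):
    """u = K X + sigma * Rand: feedback term plus exploration noise (S203)."""
    rand = rng.uniform(-1.0, 1.0, size=K.shape[0])   # assumed noise distribution
    return K @ X + sigma * rand

K = np.zeros((1, 3))
X = np.array([0.05, 0.0, 2.0])
print(control_input(K, X, sigma=0.1))
```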
S205, the system state X_(i+1)ΔT of the next control period is obtained, and the value function parameter θ and the learning parameter P are updated: θ = θ + gradient, where gradient is an intermediate quantity.
This way of updating the value function parameters requires little computation, so the value function can be updated in real time during the learning process, which improves the convergence speed of the value function.
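The exact update equations for θ and P are not reproduced in the text above (only θ = θ + gradient is stated), so the following is merely a sketch of one common recursive least-squares temporal-difference form that uses the quantities named in S205 (the reward r, the prediction factor γ and the learning parameter P); it should not be read as the patented update itself. Here P is assumed to have the same dimension as the feature vector Z ⊗ Z.

```python
import numpy as np

def td_rls_update(theta, P, Z_now, Z_next, r, gamma):
    """One recursive least-squares style step: fit theta so that
    V(Z_now) ~ r + gamma * V(Z_next), with P acting as the gain matrix.
    Z_now / Z_next are the stacked [X; u] vectors of consecutive periods."""
    phi_now = np.kron(Z_now, Z_now)
    phi_next = np.kron(Z_next, Z_next)
    phi = phi_now - gamma * phi_next            # temporal-difference feature
    error = r - theta @ phi                     # temporal-difference error
    gain = P @ phi / (1.0 + phi @ P @ phi)      # recursive least-squares gain
    theta = theta + gain * error                # corresponds to "theta = theta + gradient"
    P = P - np.outer(gain, phi @ P)             # shrink P as data accumulate
    return theta, P

n = 4
theta, P = np.zeros(n * n), 100.0 * np.eye(n * n)
Z_now = np.array([0.05, 0.0, 2.0, 0.3])          # [q, dq, f, u] of the current period
Z_next = np.array([0.049, -0.02, 1.96, 0.3])
theta, P = td_rls_update(theta, P, Z_now, Z_next, r=-0.5, gamma=0.95)
```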
S206, the impedance control parameter K is updated: when i is a multiple of i_update, the elements of θ are arranged in order into an n×n matrix H, and H is partitioned into blocks H_11, H_12, H_21, H_22, where H_21 is a matrix with the same dimensions as K; let K_updated = K - l·(H_21 + K·H_22), K = K_updated; l is the update weight.
The method provides an analytic control-parameter update strategy with low computational complexity and fast convergence. At the same time, the update frequency of the control parameters can be adjusted through i_update, and the update speed of the control parameters can be adjusted through l, giving effective control over the learning process and avoiding both a redundant learning process caused by learning too slowly and non-convergence of the control parameters caused by learning too quickly.
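The following sketch illustrates the analytic parameter step of S206 under two assumptions: θ reshapes row-wise into the n×n matrix H, and H is partitioned with the control-input block last so that H21 has the same shape as K. It is shown for a single control input, where H22 reduces to a scalar.

```python
import numpy as np

def update_K(K, theta, n, l):
    """Reshape theta into the n x n matrix H, partition it so that H21 has the
    same shape as K, and take one analytic step K <- K - l*(H21 + K*H22) (S206)."""
    H = theta.reshape(n, n)               # assumed row-major ordering
    m = K.shape[0]                        # number of control inputs (here 1)
    H21 = H[n - m:, :n - m]               # block with the same shape as K
    H22 = H[n - m:, n - m:].item()        # scalar for a single control input
    return K - l * (H21 + K * H22)

K = np.zeros((1, 3))
theta = np.arange(16, dtype=float)        # n = 4: three states + one input
print(update_K(K, theta, n=4, l=0.01))
```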
S207, learning period termination judgment: if iΔT ≥ T_max, the learning period ends; otherwise let i = i + 1 and return to S203. T_max is the maximum length of a learning period.
S208, learning termination judgment: the control laws before and after the k-th learning period are u = K_(k-1)X and u = K_kX respectively; if max(abs(K_k - K_(k-1))) ≤ ε, the learning process terminates and the obtained impedance controller is u = K_kX; otherwise, return to S202. ε is the termination decision threshold.
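The two stopping tests of S207 and S208 can be written directly; a minimal sketch with illustrative numbers:

```python
import numpy as np

def period_finished(i, dt, T_max):
    """S207: the learning period ends once i * dt reaches the maximum length."""
    return i * dt >= T_max

def learning_finished(K_new, K_old, eps):
    """S208: learning stops when no element of K changed by more than eps."""
    return np.max(np.abs(K_new - K_old)) <= eps

print(period_finished(250, 0.01, 2.5))                                              # True
print(learning_finished(np.array([[1.0, 2.0]]), np.array([[1.0005, 2.0]]), 1e-3))   # True
```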
To verify the effectiveness of the method, simulations and experiments were carried out. The results show that, with the proposed method, the dynamic parameters of the controlled system do not need to be obtained, and an optimal impedance controller can be generated after 10 learning periods of length T_max = 250ΔT, which is a clear advantage over deep reinforcement learning, which often requires thousands of training runs.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (6)
1. An impedance controller design method based on reinforcement learning is characterized by comprising the following steps:
S1, designing a reward function and a value function: the reward function is set as r = -(u^T u + Q_x q^2 + Q_v q̇^2 + Q_f f^2), and the value function is set as V(X, u) = θ^T (Z ⊗ Z) with Z = [X^T, u^T]^T, where q, q̇ and f respectively denote the current position, velocity and external force of the controlled system; Q_f, Q_x, Q_v are respectively the weights of the external force, position and velocity in the impedance controller design objective; u = KX is the control input of the system; K is the impedance control parameter to be designed and optimized; ⊗ denotes the Kronecker product of matrices; θ is the value function parameter;
S2, estimating θ with a reinforcement learning method based on the reward function and the value function to obtain the optimal impedance control parameter K, thereby completing the design of the impedance controller; step S2 specifically comprises:
S201, learning process initialization: K is set to a zero vector, θ is set to a zero vector, and the update period i_update is set, i_update being a positive integer;
S202, learning period initialization: the controlled system is set to the initial state X_ΔT, and the learning parameter P = δH is set, where δ is a positive integer and H is the n×n identity matrix;
S203, the control input u = KX_iΔT + σ·Rand is calculated, where Rand is a random number and σ is a weighting factor; X_iΔT is the system state of the current control period, i = 1, 2, 3, …, and ΔT is the control period of the controlled system;
S205, the system state X_(i+1)ΔT of the next control period is obtained, and the value function parameter θ and the learning parameter P are updated;
S206, the impedance control parameter K is updated: when i is a multiple of i_update, the elements of θ are arranged in order into an n×n matrix H, and H is partitioned into blocks H_11, H_12, H_21, H_22, where H_21 is a matrix with the same dimensions as K; let K_updated = K - l·(H_21 + K·H_22), K = K_updated; l is an update weight;
S207, learning period termination judgment: if iΔT ≥ T_max, the learning period ends; otherwise let i = i + 1 and return to S203; T_max is the maximum learning period length;
S208, learning termination judgment: the control laws before and after the k-th learning period are u = K_(k-1)X and u = K_kX respectively; if max(abs(K_k - K_(k-1))) ≤ ε, the learning process terminates and the obtained impedance controller is u = K_kX; otherwise, return to S202; ε is the termination decision threshold.
2. The method as claimed in claim 1, wherein the step S1 specifically includes:
S101, the external force f borne by the controlled system is regarded as part of the system state, giving the augmented state vector X = [q, q̇, f]^T, where q, q̇ and f respectively denote the current position, velocity and external force of the controlled system; the impedance controller takes the form u = KX, where u is the control input of the system and K is the impedance control parameter to be designed and optimized;
S102, the control input u, the current position q, the velocity q̇ and the external force f of the controlled system are regarded as the cost of controlling the system, and the cost function is set as c_k = u^T u + Q_1 q^2 + Q_2 q̇^2 + Q_3 f^2, where Q_1, Q_2, Q_3 are the control weights and are all positive real numbers;
4. An impedance controller design system based on reinforcement learning, comprising:
the control target design module, used for designing the reward function and the value function: the reward function is set as r = -(u^T u + Q_x q^2 + Q_v q̇^2 + Q_f f^2), and the value function is set as V(X, u) = θ^T (Z ⊗ Z) with Z = [X^T, u^T]^T, where q, q̇ and f respectively denote the current position, velocity and external force of the controlled system; Q_f, Q_x, Q_v are respectively the weights of the external force, position and velocity in the impedance controller design objective; u = KX is the control input of the system; K is the impedance control parameter to be designed and optimized; ⊗ denotes the Kronecker product of matrices; θ is the value function parameter;
the impedance control parameter optimization module is used for estimating theta by adopting a reinforcement learning method based on the reward function and the value function to obtain an optimal impedance control parameter K and complete the design of the impedance controller; the implementation process of the impedance control parameter optimization module specifically comprises the following steps:
S201, learning process initialization: K is set to a zero vector, θ is set to a zero vector, and the update period i_update is set, i_update being a positive integer;
S202, learning period initialization: the controlled system is set to the initial state X_ΔT, and the learning parameter P = δH is set, where δ is a positive integer and H is the n×n identity matrix;
S203, the control input u = KX_iΔT + σ·Rand is calculated, where Rand is a random number and σ is a weighting factor; X_iΔT is the system state of the current control period, i = 1, 2, 3, …, and ΔT is the control period of the controlled system;
S205, the system state X_(i+1)ΔT of the next control period is obtained, and the value function parameter θ and the learning parameter P are updated: θ = θ + gradient;
S206, the impedance control parameter K is updated: when i is a multiple of i_update, the elements of θ are arranged in order into an n×n matrix H, and H is partitioned into blocks H_11, H_12, H_21, H_22, where H_21 is a matrix with the same dimensions as K; let K_updated = K - l·(H_21 + K·H_22), K = K_updated; l is an update weight;
S207, learning period termination judgment: if iΔT ≥ T_max, the learning period ends; otherwise let i = i + 1 and return to S203; T_max is the maximum learning period length;
S208, learning termination judgment: the control laws before and after the k-th learning period are u = K_(k-1)X and u = K_kX respectively; if max(abs(K_k - K_(k-1))) ≤ ε, the learning process terminates and the obtained impedance controller is u = K_kX; otherwise, return to S202; ε is the termination decision threshold.
5. The reinforcement learning-based impedance controller design system according to claim 4, wherein the control target design module is implemented by:
the external force f borne by the controlled system is regarded as part of the system state, giving the augmented state vector X = [q, q̇, f]^T, where q, q̇ and f respectively denote the current position, velocity and external force of the controlled system; the impedance controller takes the form u = KX, where u is the control input of the system and K is the impedance control parameter to be designed and optimized;
the control input u, the current position q, the velocity q̇ and the external force f of the controlled system are regarded as the cost of controlling the system, and the cost function is set as c_k = u^T u + Q_1 q^2 + Q_2 q̇^2 + Q_3 f^2, where Q_1, Q_2, Q_3 are the control weights and are all positive real numbers;
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110061914.9A CN112894809B (en) | 2021-01-18 | 2021-01-18 | Impedance controller design method and system based on reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110061914.9A CN112894809B (en) | 2021-01-18 | 2021-01-18 | Impedance controller design method and system based on reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112894809A CN112894809A (en) | 2021-06-04 |
CN112894809B true CN112894809B (en) | 2022-08-02 |
Family
ID=76114670
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110061914.9A Active CN112894809B (en) | 2021-01-18 | 2021-01-18 | Impedance controller design method and system based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112894809B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114789444B (en) * | 2022-05-05 | 2022-12-16 | 山东省人工智能研究院 | Compliant human-computer contact method based on deep reinforcement learning and impedance control |
- 2021-01-18: Application CN202110061914.9A filed in China; granted as CN112894809B, status Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107020636A (en) * | 2017-05-09 | 2017-08-08 | 重庆大学 | A kind of Learning Control Method for Robot based on Policy-Gradient |
US10766136B1 (en) * | 2017-11-03 | 2020-09-08 | Amazon Technologies, Inc. | Artificial intelligence system for modeling and evaluating robotic success at task performance |
CN108255182A (en) * | 2018-01-30 | 2018-07-06 | 上海交通大学 | A kind of service robot pedestrian based on deeply study perceives barrier-avoiding method |
CN111401556A (en) * | 2020-04-22 | 2020-07-10 | 清华大学深圳国际研究生院 | Selection method of opponent type imitation learning winning incentive function |
CN111531543A (en) * | 2020-05-12 | 2020-08-14 | 中国科学院自动化研究所 | Robot self-adaptive impedance control method based on biological heuristic neural network |
CN111613200A (en) * | 2020-05-26 | 2020-09-01 | 辽宁工程技术大学 | Noise reduction method based on reinforcement learning |
CN111782870A (en) * | 2020-06-18 | 2020-10-16 | 湖南大学 | Antagonistic video time retrieval method and device based on reinforcement learning, computer equipment and storage medium |
CN111708355A (en) * | 2020-06-19 | 2020-09-25 | 中国人民解放军国防科技大学 | Multi-unmanned aerial vehicle action decision method and device based on reinforcement learning |
Non-Patent Citations (1)
Title |
---|
Research on human-robot teaching programming and robot force control for compliant machining of complex structures; Li Kelin; China Master's Theses Full-text Database (Information Science and Technology Series); 2020-01-15 (No. 01); pp. 43-46 *
Also Published As
Publication number | Publication date |
---|---|
CN112894809A (en) | 2021-06-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11537897B2 (en) | Artificial neural network circuit training method, training program, and training device | |
EP3136304A1 (en) | Methods and systems for performing reinforcement learning in hierarchical and temporally extended environments | |
CN108375907B (en) | Adaptive compensation control method of hypersonic aircraft based on neural network | |
CN110647042A (en) | Robot robust learning prediction control method based on data driving | |
CN110286595B (en) | Fractional order system self-adaptive control method influenced by saturated nonlinear input | |
CN111665853A (en) | Unmanned vehicle motion planning method for planning control joint optimization | |
CN112894809B (en) | Impedance controller design method and system based on reinforcement learning | |
CN113406886B (en) | Fuzzy self-adaptive control method and system for single-link mechanical arm and storage medium | |
CN110488603B (en) | Rigid aircraft adaptive neural network tracking control method considering actuator limitation problem | |
CN112085050A (en) | Antagonistic attack and defense method and system based on PID controller | |
CN113043251A (en) | Robot teaching reproduction track learning method | |
CN114326405B (en) | Neural network backstepping control method based on error training | |
CN112904726B (en) | Neural network backstepping control method based on error reconstruction weight updating | |
CN113627075B (en) | Projectile pneumatic coefficient identification method based on adaptive particle swarm optimization extreme learning | |
CN113346552A (en) | Self-adaptive optimal AGC control method based on integral reinforcement learning | |
CN110991606B (en) | Piezoelectric ceramic driver composite control method based on radial basis function neural network | |
CN112947090A (en) | Data-driven iterative learning control method for wheeled robot under DOS attack | |
CN109709809B (en) | Modeling method and tracking method of electromagnetic/magneto-rheological actuator based on hysteresis kernel | |
CN114559429B (en) | Neural network control method of flexible mechanical arm based on self-adaptive iterative learning | |
CN110554605A (en) | complex mechanical system adaptive robust control method based on constraint tracking | |
CN112685835B (en) | Elastic event trigger control method and system for autonomous driving of vehicle | |
CN112346342B (en) | Single-network self-adaptive evaluation design method of non-affine dynamic system | |
CN115047769A (en) | Unmanned combat platform obstacle avoidance-arrival control method based on constraint following | |
CN112305916B (en) | Self-adaptive control method and system for mobile robot based on barrier function | |
CN114139282A (en) | Underwater impact load modeling method of cross-medium aircraft |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||