CN116661478A - Four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning - Google Patents

Four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning Download PDF

Info

Publication number
CN116661478A
CN116661478A CN202310930078.2A CN202310930078A CN116661478A CN 116661478 A CN116661478 A CN 116661478A CN 202310930078 A CN202310930078 A CN 202310930078A CN 116661478 A CN116661478 A CN 116661478A
Authority
CN
China
Prior art keywords
aerial vehicle
unmanned aerial
neural network
rotor unmanned
attitude
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310930078.2A
Other languages
Chinese (zh)
Other versions
CN116661478B (en
Inventor
赵冬
苏延旭
黄大荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University filed Critical Anhui University
Priority to CN202310930078.2A priority Critical patent/CN116661478B/en
Publication of CN116661478A publication Critical patent/CN116661478A/en
Application granted granted Critical
Publication of CN116661478B publication Critical patent/CN116661478B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/08Control of attitude, i.e. control of roll, pitch, or yaw
    • G05D1/0808Control of attitude, i.e. control of roll, pitch, or yaw specially adapted for aircraft
    • G05D1/0816Control of attitude, i.e. control of roll, pitch, or yaw specially adapted for aircraft to ensure stability
    • G05D1/0825Control of attitude, i.e. control of roll, pitch, or yaw specially adapted for aircraft to ensure stability using mathematical models
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/10Simultaneous control of position or course in three dimensions
    • G05D1/101Simultaneous control of position or course in three dimensions specially adapted for aircraft
    • G05D1/106Change initiated in response to external conditions, e.g. avoidance of elevated terrain or of no-fly zones
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Algebra (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention discloses a four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning, which comprises the following steps: constructing an attitude tracking error model; constructing a long-term cost function of the quadrotor unmanned aerial vehicle based on the discretized attitude tracking error model, and forming a real-time reward function of integral reinforcement learning; constructing an evaluation neural network, constructing an error model of integral reinforcement learning based on an estimated value of the evaluation neural network on a long-term cost function, and constructing an evaluation neural network-action neural network integral reinforcement learning control model by combining a real-time rewarding function; and respectively designing a weight update law for the evaluation neural network and the action neural network in the control model, and tracking and controlling the gesture of the four-rotor unmanned aerial vehicle by using an integral reinforcement learning control model adopting the weight update law. The invention can ensure that the transient performance, the closed loop stability and the output tracking of the four-rotor unmanned aerial vehicle are improved, and the autonomy and the adaptability to new scenes of the four-rotor unmanned aerial vehicle are improved.

Description

Four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning
Technical Field
The invention belongs to the technical field of unmanned aerial vehicle automatic control, and particularly relates to a four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning.
Background
With the development and progress of aerospace technology, in various types of quadrotors unmanned aerial vehicle, the quadrotors are taken as a special aircraft in an unmanned aerial vehicle family, and the quadrotors are mainly used for monitoring and reconnaissance, emergency rescue, aerial photography, atmosphere monitoring and other purposes due to the characteristics of low cost, small size, simple structure and strong maneuverability, so that huge application prospects are shown in military and civil fields, research hot flashes are formed worldwide, and the research of a control system is the core in the research of the quadrotors.
Considering that a four-rotor aircraft is a multivariable, under-actuated and strongly coupled nonlinear system, some students adopt intelligent control strategies to identify and compensate the nonlinear system, but transient performance under the condition of strong nonlinearity is not considered yet, and insufficient control over the transient performance can lead to poor system response, including overshoot, convergence rate and other relevant factors, endanger the stability of the system and even possibly lead to system faults. Therefore, the comprehensive research on the transient performance of the four-rotor unmanned aerial vehicle control system has a vital meaning, and how to enhance the capability of the control system for effectively processing dynamic abrupt changes so as to improve the safety performance of the system becomes a research hot spot.
At present, the tracking control method of the four-rotor unmanned aerial vehicle mainly focuses on the following aspects: (1) external tamper-resistant control based on a disturbance observer; (2) Self-adaptive obstacle avoidance control based on potential functions or images; and (3) attitude control based on adaptive dynamic programming, and the like. In the past four rotor unmanned aerial vehicle design process, the robustness, security and maneuverability to four rotor unmanned aerial vehicle flight under the general circumstances are studied, aim at improving four rotor unmanned aerial vehicle complex environment's adaptability, but the past method still has fresh research in the aspect of improving the transient performance and intelligent autonomy of system.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides the four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning, which improves the performance of the four-rotor unmanned aerial vehicle in terms of dynamic performance and autonomy on the basis of the traditional steady-control four-rotor unmanned aerial vehicle and provides powerful support for subsequent intelligent autonomous application.
In order to achieve the technical purpose, the invention adopts the following technical scheme:
a four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning comprises the following steps:
step 1: building a posture power model of the four-rotor unmanned aerial vehicle, building a posture angle state constraint on the posture power model by adopting a preset performance function, and building a posture tracking error model meeting the requirement of transient response performance of the four-rotor unmanned aerial vehicle by combining a posture angle error variable;
step 2: discretizing the attitude tracking error model constructed in the step 1, constructing a long-term cost function of the four-rotor unmanned aerial vehicle based on the discretized attitude tracking error model, and forming a real-time reward function of integral reinforcement learning;
step 3: constructing an evaluation neural network for controlling the performance of the four-rotor unmanned aerial vehicle system, constructing an error model of integral reinforcement learning based on an estimated value of the evaluation neural network on a long-term cost function, and constructing an evaluation neural network-action neural network integral reinforcement learning control model by combining the real-time reward function formed in the step 2;
step 4: and respectively designing weight updating rules for the evaluation neural network and the action neural network in the evaluation neural network-action neural network integral reinforcement learning control model, and tracking and controlling the gesture of the four-rotor unmanned aerial vehicle by using the integral reinforcement learning control model adopting the weight updating rules.
In order to optimize the technical scheme, the specific measures adopted further comprise:
the step 1 comprises the following substeps:
step 11: building a gesture power model of the four-rotor unmanned aerial vehicle:
wherein ,the change rate of the attitude angle of the four-rotor unmanned aerial vehicle is the change rate;
the change rate of the attitude angular rate of the quadrotor unmanned aerial vehicle is the change rate of the attitude angular rate of the quadrotor unmanned aerial vehicle;
a rotation matrix of an attitude angle system of the quadrotor unmanned aerial vehicle;
、/>the attitude angular rate and the moment of inertia of the four-rotor unmanned aerial vehicle are adopted;
the attitude angular rate matrix is the attitude angular rate matrix of the quadrotor unmanned aerial vehicle;
the control moment of the four-rotor unmanned aerial vehicle is;
external bounded interference to the quad-rotor unmanned helicopter;
step 12: establishing attitude angle state constraint on an attitude power model by adopting a preset performance function:
wherein The attitude angle of the four-rotor unmanned aerial vehicle under i is;
,/>、/>、/>roll angle, pitch angle and yaw angle, respectively, represent the subscript +.>Refers to one of a roll angle, a pitch angle, and a yaw angle;
for presetting performance index function, satisfy->,/>;/>,/>Is constant, satisfy->,/>Is a time variable;
the amplitude adjusting parameter of the function for presetting the performance index satisfies +.>
Step 13: combining with an attitude angle error variable, constructing an attitude tracking error model meeting the transient response performance requirement of the four-rotor unmanned aerial vehicle:
wherein , and />The preset performance tracking error vectors respectively +.>About time->First and second derivatives of (2);
the state constraint and control model of the four-rotor unmanned aerial vehicle is considered for the attitude angle error variable, and specifically comprises the following steps:
for four rotor unmanned aerial vehicle attitude angle vector, +.>Intermediate variable +.>Wherein the auxiliary attitude angle constraint variable +.>,/>Is->First derivative with respect to time, < >>Is aboutBundle posture angle->Is a preset performance index function vector of +.>Is->First derivative with respect to time, < >>Is->Second derivative with respect to time,/>Rotation matrix of attitude angle system for quadrotor unmanned aerial vehicle, +.>For the first derivative of the rotation matrix of the attitude angle system of a quadrotor unmanned aerial vehicle with respect to time,is an intermediate variable introduced.
Above mentioned,/>And->The method comprises the following steps of:
wherein The roll angle rate, pitch angle rate and yaw angle rate, respectively; />、/>、/>The roll angle, pitch angle and yaw angle, respectively.
Above mentionedThe method comprises the following steps:
wherein ,representing a hyperbolic secant function->
,/> and />For performance parameters selected according to the transient performance of a quadrotor unmanned aerial vehicle>,/>Determining the initial boundary and the end boundary of the attitude angle operation of the quadrotor unmanned aerial vehicle, < ->And determining the convergence speed of the attitude angle of the quadrotor unmanned aerial vehicle under the constraint of the preset performance function.
The step 2 comprises the following substeps:
step 21: discretizing the attitude tracking error model constructed in the step 1 to obtain a discretized attitude tracking error model:
wherein ,discretizing the first part based on the forward difference method>Step preset Performance tracking error vector +.>And presetting a first derivative of the performance tracking error vector with respect to time
、/>Discretized according to forward difference method>Step preset Performance tracking error vector +.>And presetting a first derivative of the performance tracking error vector with respect to time
Control input torque for discretization;
a model matrix and a control distribution matrix of the discretization model respectively;
the external bounded disturbance of the discretized four-rotor unmanned aerial vehicle;
step 22: constructing a long-term cost function of the four-rotor unmanned aerial vehicle based on the error state quantity and the control quantity in the discretized attitude tracking error model obtained in the step 21:
wherein ,as a positive function, reflecting whether the attitude angle of the current quadrotor unmanned aerial vehicle is out of range;
to be at the present->Based on the steps, the control performance prediction step number is carried out backwards in time;
is discretized->Step (3) a preset performance tracking error vector and a first derivative thereof;
for->Is->The function value of the power of the second order>Is a discount factor, satisfy->
The weight matrix is a positive definite matrix, and the tracking error performance and the energy consumption of the four-rotor unmanned aerial vehicle model are balanced;
is discretized->Controlling input moment;
the initial moment is based on a forward difference discretization error model;
step 23: long term cost function according to step 22Form integration reinforcement learning->Real-time bonus function of steps->
wherein ,representing the output quantity of the four-rotor unmanned aerial vehicle attitude model; />A weight matrix is positively defined; />Is a desired four-rotor unmanned aerial vehicle attitude angle signal.
The step 3 comprises the following sub-steps:
step 31: constructing an evaluation neural network for controlling behavior of the four-rotor unmanned aerial vehicle model:
wherein ,the method is an ideal weight matrix for evaluating the neural network;
for a desired long-term performance index function, +.>A vector that is all zeros;
to evaluate the activation function of the neural network;
to evaluate the neural network for the desired long-term performance index function +.>Is determined by the estimation error of (a);
satisfy the following requirements,/> and />,/>Are all unknown constants;
step 32: based on the estimated value of the long-term cost function of the evaluation neural network, constructing an error model of integral reinforcement learning:
wherein ,to evaluate the long-term cost function of a neural network>Estimated value of ∈10->Error for integral reinforcement learning;
step 33: an evaluation neural network-action neural network integral reinforcement learning control model is established based on an error model of integral reinforcement learning and a real-time rewarding function:
error model based on integral reinforcement learning, in the firstStep, establish four rotor unmanned aerial vehicle attitude angle tracking error +.>
wherein , and />;/>Tracking signals for the attitude angles of the expected four-rotor unmanned aerial vehicle;
furthermore, the first step is introduced on the basis of the attitude angle tracking errorStep attitude angular rate tracking error:
according to the design method of the state feedback control law, an ideal controller is designed as follows:
wherein ,control gain for the design;
the following action neural network design is introduced:
wherein ,weights for ideal action neural network, +.>An activation function for the action neural network;
action neural networkThe input of (1) is defined as +.>
Representing the weight of hidden layers in the action neural network,thus, the establishment of the evaluation neural network-action neural network integral reinforcement learning control model is completed.
In the step 4, the weight update law design process for evaluating the neural network is as follows:
for evaluating neural networksThe following approximation errors of the evaluation neural network are introduced: />
Wherein, the weight approximation error of the neural network;/>Ideal weight for evaluating neural network>Is a function of the estimated value of (2);
the error of integral reinforcement learning is further evolved by combining the Bellman iterative equation and the empirical difference method:
to minimize long-term cost functionIt is mapped to the error cost function +.>In (a):
further, in combination with the self-adaptive gradient descent method of the discrete model, the following weight update gradient of the evaluation neural network is designed:
wherein ,in combination with the chain law,/->The method comprises the following steps:
wherein ,learning gain for evaluating the neural network weight update law;
thus, the following evaluation neural network weight update law can be obtained:
in the step 4, the weight update law design process of the action neural network is as follows:
aiming at an action neural network with ideal weight, the following neural network estimation strategy is designed to solve the problem that the action neural network cannot be directly applied:
wherein ,the output of the action neural network;
,/> and />,/>Is an unknown constant value;
further, by combining the tracking error of the attitude angular rate of the quadrotor unmanned aerial vehicle, the following steps are obtained:
wherein ,,/>is the desired attitude angular rate;
meanwhile, the estimation error of the action neural network weightDefined as->
Combining external bounded interference and action neural network estimation errors received by the four-rotor unmanned aerial vehicle, and defining the attitude angle tracking error of the four-rotor unmanned aerial vehicle caused by the external bounded interference and action neural network estimation errors received by the unmanned aerial vehicle as
Further, an estimation error of the action neural network is introducedIn combination with the aim of minimizing the output of the evaluation neural network, the following action neural network errors are designed:
wherein ,,/>an estimated value that is an output of the ideal evaluation neural network;
according to minimizing action neural network errorsIs subject to introduction of quadratic errors +.>In combination with the chain rule, the update gradient of the action neural network weight is designed as follows:
further, an action neural network weight update law can be obtained:
the invention has the following beneficial effects:
aiming at the strict transient response performance requirements faced by a four-rotor unmanned aerial vehicle attitude control model, the tracking error of the four-rotor unmanned aerial vehicle attitude is measured through a preset performance function, and the tracking error with performance constraint is dynamically converted into an equivalent 'state constraint' model; further, constructing a performance index function based on integral reinforcement learning, and balancing the long-term optimal performance and flexible transient response performance of the attitude control of the four-rotor unmanned aerial vehicle; and then, by designing a motion neural network aiming at minimizing the long-term performance and the transient performance cost function of the gesture, the integrated reinforcement learning preset performance four-rotor unmanned aerial vehicle gesture controller under the evaluation neural network-motion neural network self-adaptive neural network control architecture is formed.
According to the four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning, a fusion application framework for fusion preset performance control and integral reinforcement learning is established, on one hand, the transient performance, system closed loop stability and output tracking of the four-rotor unmanned aerial vehicle can be guaranteed to be improved, on the other hand, the autonomy and the adaptability to new scenes of the four-rotor unmanned aerial vehicle can be improved, and the adaptive neural network tracking control framework based on the preset performance integral reinforcement learning has a simple structure and is easy to realize.
Drawings
FIG. 1 is a frame diagram of a four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning;
FIG. 2 is a graph of circular position trajectory tracking based on preset performance integral reinforcement learning;
FIG. 3 is a graph of attitude angle preset constraint control for preset performance integral reinforcement learning;
FIG. 4 is a control output closed loop response curve for preset performance integral reinforcement learning.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Although the steps of the present invention are arranged by reference numerals, the order of the steps is not limited, and the relative order of the steps may be adjusted unless the order of the steps is explicitly stated or the execution of a step requires other steps as a basis. It is to be understood that the term "and/or" as used in this disclosure relates to and encompasses any and all possible combinations of one or more of the associated listed items.
As shown in fig. 1, the preset performance tracking control method of the four-rotor unmanned aerial vehicle based on reinforcement learning specifically comprises the following steps:
step 1: building a posture power model of the four-rotor unmanned aerial vehicle, building a posture angle state constraint on the posture power model by adopting a preset performance function, and building a posture tracking error model meeting the requirement of transient response performance of the four-rotor unmanned aerial vehicle by combining a posture angle error variable;
according to the structure and the flight environment of the four-rotor unmanned aerial vehicle, the characteristic of a gesture dynamics system of the four-rotor unmanned aerial vehicle is described by adopting a normal differential equation, the tracking error of the gesture of the four-rotor unmanned aerial vehicle is measured by adopting a preset performance function, the tracking error with performance constraint is dynamically converted into an equivalent 'state constraint' system, and a gesture tracking error system meeting the harsh transient response performance requirement of the four-rotor unmanned aerial vehicle is constructed;
step 1 comprises the following sub-steps:
step 11: building a gesture power model of the four-rotor unmanned aerial vehicle:
according to the structure and flying environment of the quadrotor unmanned aerial vehicle, the gesture motion of the quadrotor unmanned aerial vehicle is represented by the gesture angle of the quadrotor unmanned aerial vehicleAttitude angular rate->Moment of inertia->And control moment->The composition, therefore, the gestural motion of a quadrotor unmanned can be described by the following nonlinear very differential model:
wherein ,the change rate of the attitude angle of the four-rotor unmanned aerial vehicle is the change rate;
the change rate of the attitude angular rate of the quadrotor unmanned aerial vehicle is the change rate of the attitude angular rate of the quadrotor unmanned aerial vehicle;
external bounded interference to the quad-rotor unmanned helicopter;
rotation matrix of attitude angle model of four-rotor unmanned aerial vehicleAnd attitude angular rate matrix->The method comprises the following steps of:
in addition, the attitude angle matrix of the quadrotor unmanned aerial vehicle consists of a rolling angle, a pitch angle and a yaw angle, namely,、/>、/>respectively a roll angle, a pitch angle and a yaw angle;
likewise, the attitude angular rate of a quad-rotor unmanned helicopter is comprised of a roll angle rate, a pitch angle rate, and a yaw angle rate, specifically;/>The roll angle rate, pitch angle rate and yaw angle rate, respectively;
the control moment of the four-rotor unmanned aerial vehicle is;
step 12: establishing attitude angle state constraint on an attitude power model by adopting a preset performance function:
in order to meet the harsh transient response performance requirement of the quadrotor unmanned aerial vehicle, a preset performance function is adopted to measure the tracking error of the posture of the quadrotor unmanned aerial vehicle, and the method specifically comprises the following steps:
wherein Representing one of the four-rotor unmanned aerial vehicle attitude angles; i.e. < ->,/>、/>、/>Roll angle, pitch angle and yaw angle, respectively, represent the subscript +.>Refers to one of a roll angle, a pitch angle, and a yaw angle;
presetting a performance index functionIs a function of positive definite and monotonically increasing, satisfying +.>;/>,/>Is constant, satisfy->,/>Is a time variable;
the amplitude adjusting parameter of the function for presetting the performance index satisfies +.>
In the present invention, the following is designedFunction:
wherein ,representing a hyperbolic secant function->
,/> and />For performance parameters selected according to the transient performance of a quadrotor unmanned aerial vehicle>,/>Determining the initial boundary and the end boundary of the attitude angle operation of the quadrotor unmanned aerial vehicle, < ->Determining the convergence speed of the attitude angle of the four-rotor unmanned aerial vehicle under the constraint of a preset performance function;
step 13: combining with an attitude angle error variable, constructing an attitude tracking error model meeting the transient response performance requirement of the four-rotor unmanned aerial vehicle:
because the attitude angle state of the quadrotor unmanned aerial vehicle is constrained in the step 12, the segmentation characteristic of the method enables the attitude angle of the quadrotor unmanned aerial vehicle with the constraint not to be directly applied to the control model design based on the differential equation of the attitude dynamics of the quadrotor unmanned aerial vehicle, and the invention designs the following attitude angle error variablesMeanwhile, the four-rotor unmanned aerial vehicle state constraint and control model design are considered:
further, based on the introduced variablesDynamically converting tracking errors with performance constraints into an equivalent 'state constraint' model, and constructing an attitude tracking error model meeting the severe transient response performance requirements of the four-rotor unmanned aerial vehicle:
wherein , and />The preset performance tracking error vectors respectively +.>About time->First and second derivatives of (2);
wherein ,refers to the attitude angle tracking error variable +.>Regarding variables->Is a partial derivative of (c).
In addition, in the case of the optical fiber,
for convenience of explanation, the time variable in the four-rotor unmanned aerial vehicle attitude angle dynamic model isRemoving, such as: />In the present invention, the abbreviations described above are all carried out unless otherwise specified.
For four rotor unmanned aerial vehicle attitude angle vector, +.>Intermediate variable +.>Wherein the auxiliary attitude angle constraint variable +.>,/>Is->First derivative with respect to time, < >>For constraining attitude angle->Is a preset performance index function vector of +.>Is->First derivative with respect to time, < >>Is->Second derivative with respect to time,/>Rotation matrix of attitude angle system for quadrotor unmanned aerial vehicle, +.>For the first derivative of the rotation matrix of the attitude angle system of a quadrotor unmanned aerial vehicle with respect to time,is an intermediate variable introduced.
Step 2: discretizing the attitude tracking error model constructed in the step 1, constructing a long-term cost function integrating the four-rotor unmanned aerial vehicle state constraint out-of-range penalty term and controlling energy consumption based on the discretized attitude tracking error model, and forming a real-time reward function of integral reinforcement learning;
according to the attitude dynamic system model of the four-rotor unmanned aerial vehicle attitude tracking error 'state constraint' established in the step 1, based on analysis of the long-term optimal control performance of the four-rotor unmanned aerial vehicle attitude, constructing a long-term cost function integrating the four-rotor unmanned aerial vehicle state constraint out-of-range penalty term and the control energy consumption, and forming real-time rewards of integral reinforcement learning;
step 2 comprises the following sub-steps:
step 21: discretizing the attitude tracking error model constructed in the step 1 to obtain a discretized attitude tracking error model:
according to the attitude tracking error model established in the step 1, before integral reinforcement learning real-time rewarding design is carried out, in order to improve the calculation efficiency of the model, the continuous model equation established in the step 1 is discretized into the following discrete model by a forward difference method:
wherein ,step number of discretization model; />Discretizing the first part based on the forward difference method>Step preset Performance tracking error vector +.>And presetting the first derivative of the performance tracking error vector with respect to time +.>
、/>Discretized according to forward difference method>Step preset Performance tracking error vector +.>And presetting a first derivative of the performance tracking error vector with respect to time
Control input torque for discretization;
a model matrix and a control distribution matrix of the discretization model respectively;
is a discretized four-rotor unmanned aerial vehicle external bounded disturbance.
Step 22: based on the error state quantity in the discretized attitude tracking error model obtained in the step 21And control amount->Constructing a long-term cost function for fusing the state constraint out-of-range penalty term of the four-rotor unmanned aerial vehicle and controlling energy consumption:
according to the discretized four-rotor unmanned aerial vehicle attitude error tracking model, based on analysis of the four-rotor unmanned aerial vehicle attitude long-term optimal control performance, a long-term cost function integrating the four-rotor unmanned aerial vehicle state constraint out-of-range penalty term and the control energy consumption is constructed
wherein ,as a positive function, reflecting whether the attitude angle of the current quadrotor unmanned aerial vehicle is out of range or not>To be at the present->Based on the steps, the control performance prediction step number is carried out backwards in time;
is discretized->Step (3) a preset performance tracking error vector and a first derivative thereof;
is a discount factor, satisfy->;/>For->Is->The value of the power function;
weight matrix of positive definite matrix, balance fourTracking error performance and energy consumption of the rotor unmanned aerial vehicle model;
is discretized->Controlling input moment;
the initial moment is based on a forward difference discretization error model;
step 23: based on the long-term cost function designed in step 22Integral reinforcement learning->Real-time bonus function of steps->The design is as follows:
wherein ,representing the output quantity of the four-rotor unmanned aerial vehicle attitude model; />A weight matrix is positively defined; />Is a desired four-rotor unmanned aerial vehicle attitude angle signal.
Note that based on Lyapunov stability theorem, in the present invention, the desired long-term cost function is set to translate the reinforcement learning maximization rewards mechanism into an evaluation-control mechanism that aims at minimizing the long-term cost function.
Step 3: constructing an evaluation neural network for controlling the performance of the four-rotor unmanned aerial vehicle system, constructing an error model of integral reinforcement learning based on an estimated value of the evaluation neural network on a long-term cost function, and constructing an evaluation neural network-action neural network integral reinforcement learning control model by combining the real-time reward function formed in the step 2;
establishing an evaluation neural network-action neural network control architecture of integral reinforcement learning: based on the real-time rewarding function obtained in the step 2, an evaluation neural network for integrating reinforcement learning taking account of rewarding values at all future moments is constructed, and further, a four-rotor unmanned aerial vehicle gesture tracking control strategy based on an action neural network is provided with the aim of minimizing system evaluation neural network output, so that an integration reinforcement learning control framework of the evaluation neural network-the action neural network is formed, and the transient performance and autonomous intelligence of the four-rotor unmanned aerial vehicle are improved;
step 3 comprises the following sub-steps:
step 31: constructing an evaluation neural network for controlling behavior of the four-rotor unmanned aerial vehicle model:
based on the real-time rewarding function established in the step 2And long-term cost function->Because the state quantity of all the four-rotor unmanned aerial vehicle in the future cannot be directly obtained, the neural network is designed to estimate and predict, and the control behavior of the current four-rotor unmanned aerial vehicle model is evaluated. Further, the specific design of the evaluation neural network is as follows:
wherein ,for a desired long-term performance index function, +.>A vector that is all zeros;
weight matrix for ideal evaluation of neural network, < ->To evaluate the activation function of the neural network, +.>To evaluate the neural network for the desired long-term performance index function +.>Is used for the estimation error of (a). Furthermore, assume that +.>,/> and />,/>Are all unknown constants.
Note that: based on Lyapunov stability theorem, in the present invention, a long-term cost function is expectedIs set asThe reinforcement learning maximization rewards mechanism is converted into an evaluation-control mechanism that aims at minimizing the long-term cost function.
Step 32: based on the estimated value of the long-term cost function of the evaluation neural network, constructing an error model of integral reinforcement learning:
wherein ,to evaluate the long-term cost function of a neural network>Estimated value of ∈10->Is the error of integral reinforcement learning.
Step 33: an evaluation neural network-action neural network integral reinforcement learning control model is established based on an error model of integral reinforcement learning and a real-time rewarding function:
on the basis of establishing an evaluation neural network, a control strategy network of the four-rotor unmanned aerial vehicle based on reinforcement learning is further designed;
error model based on integral reinforcement learning, in the firstStep, establish four rotor unmanned aerial vehicle attitude angle tracking error +.>
wherein , and />;/>Tracking signals for the attitude angles of the expected four-rotor unmanned aerial vehicle;
furthermore, the first step is introduced on the basis of the attitude angle tracking errorStep attitude angular rate tracking error:
according to the design method of the state feedback control law, the ideal controller designed by the invention is as follows:
wherein ,control gain for the design;
as a result of the fact that,,/>unknown, including model subject to external bounded disturbance, model uncertainty, etc., ideal state feedback controller->Are not directly available. To solve this problem, the following action neural network design was introduced:
wherein ,weights for ideal action neural network, +.>An activation function for the action neural network;
action neural networkThe input of (1) is defined as +.>
The weight of the hidden layer in the action neural network is represented, and in the invention, the weight of the fixed hidden layer network is a fixed constant value. Thus, the design of the evaluation neural network-action neural network integral reinforcement learning control architecture aiming at minimizing the output of the model evaluation neural network is completed.
Step 4: and respectively designing weight updating rules for the evaluation neural network and the action neural network in the evaluation neural network-action neural network integral reinforcement learning control model, and tracking and controlling the gesture of the four-rotor unmanned aerial vehicle by using the integral reinforcement learning control model adopting the weight updating rules.
On the basis of the step 3, the factors such as uncertainty, external environment interference and the like existing in the four-rotor unmanned aerial vehicle system and evaluation neural network output containing unknown rewarding values are considered, the self-adaptive neural network is adopted to respectively approximate the evaluation neural network and the action neural network, in addition, strict theoretical analysis is provided based on the Lyapunov theory, stability of the closed-loop system and semi-global consistency and bounty of all states are proved, and the safety of the four-rotor unmanned aerial vehicle is ensured while the performance of the four-rotor unmanned aerial vehicle is improved.
Step 4 comprises the following sub-steps:
step 41: evaluating a neural network weight update law:
because the evaluation neural network and the action neural network established in the step 3 both contain ideal neural network weights, the method cannot be directly applied to an attitude angle tracking error model of the quadrotor unmanned aerial vehicle, and in addition, the calculation load problem caused by numerous neural network weight parameters is considered, the method adopts a less-parameterized neural network weight update law design method, and the specific steps are as follows:
evaluation neural network for designIntroduction ofThe approximation error of the neural network is evaluated as follows: />
Wherein, the weight approximation error of the neural network;/>Ideal weight for evaluating neural network>Is a function of the estimated value of (2);
combining the Bellman iterative equation and the empirical difference method, the Bellman error equation is further developed as:
to minimize long-term cost functionIt is mapped to the error cost function +.>In (a):
further, in combination with the self-adaptive gradient descent method of the discrete model, the following weight update gradient of the evaluation neural network is designed:
wherein ,in combination with the chain law,/->The method comprises the following steps:
;/>
wherein ,learning gain for evaluating the neural network weight update law;
thus, the following evaluation neural network weight update law can be obtained:
step 42: action neural network weight update law:
aiming at the action neural network with ideal weight, the invention designs the following neural network estimation strategy to solve the problem that the neural network estimation strategy cannot be directly applied:
wherein ,the output of the action neural network; furthermore, assume +.>,/> and />,/>Is an unknown constant value;
further, by combining the tracking error of the attitude angular rate of the quadrotor unmanned aerial vehicle, the following steps are obtained:
wherein ,,/>is the desired attitude angular rate; meanwhile, the estimation error of the action neural network weight is +.>Defined as->The method comprises the steps of carrying out a first treatment on the surface of the Meanwhile, the attitude angle tracking error of the four-rotor unmanned aerial vehicle caused by external bounded interference and action neural network estimation error of the four-rotor unmanned aerial vehicle is defined as
Further, an estimation error of the action neural network is introducedIn combination with the aim of minimizing the output of the evaluation neural network, the following action neural network errors are designed:
wherein ,,/>an estimated value that is an output of the ideal evaluation neural network;
according to minimizing action neural network errorsIs subject to introduction of quadratic errors +.>Binding chainThe rule, the update gradient of the action neural network weight is designed as follows:
further, an action neural network weight update law can be obtained:
。/>
therefore, the design of the four-rotor unmanned aerial vehicle gesture preset performance integral reinforcement learning self-adaptive neural network tracking control algorithm is realized, and the four-rotor unmanned aerial vehicle gesture can be tracked and controlled by using the integral reinforcement learning control model adopting the weight updating law. Meanwhile, lyapunov functions can be combined, and the stability of the model and the bouncy of all closed-loop model signals are proved, so that the performance of the four-rotor unmanned aerial vehicle is improved, and meanwhile, the safety of the four-rotor unmanned aerial vehicle is guaranteed.
The following simulation experiment is carried out on the preset performance integral reinforcement learning self-adaptive neural network tracking control method:
initial conditions of the four-rotor unmanned aerial vehicle attitude dynamic model are respectively as follows:
the control input constraint of the unmanned aerial vehicle is that,/>. The track tracked is: />
In addition, in the case of the optical fiber,,/>,/>,/>
and (3) constructing a four-rotor unmanned aerial vehicle model and a corresponding actuator fault model in the Matlab/Simulink by adopting Matlab/Simulink simulation, and performing simulation verification based on designing a corresponding controller.
According to the parameters designed in the method, the preset performance tracking control method of the four-rotor unmanned aerial vehicle based on reinforcement learning is simulated, so that an output tracking error curve of a four-rotor unmanned aerial vehicle motion model is obtained, as shown in fig. 2 and 3-4, the position change curve of the four-rotor unmanned aerial vehicle is shown in fig. 2, wherein the tracking effect is good, the method designed by the method is designed, the other curve is a comparison tracking curve, and the stability and the tracking performance of the four-rotor unmanned aerial vehicle are ensured; the preset performance of the attitude angle and the control moment output of the quadrotor unmanned aerial vehicle are shown in fig. 3-4, and the state safety constraint boundary of the quadrotor unmanned aerial vehicle is shown from top to bottom in fig. 3-4, and the attitude angle motion curve and the comparison group curve of the quadrotor unmanned aerial vehicle under the method can be found that the quadrotor unmanned aerial vehicle can keep good attitude angle constraint and energy consumption, and the comparison method can exceed the state constraint boundary and cause larger control quantity fluctuation, so that the energy consumption of the unmanned aerial vehicle is increased and the performance is reduced. Simulation results show that the real-time effectiveness of the design method provided by the invention expands a new visual angle for subsequent researches.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present disclosure describes embodiments, not every embodiment is provided with a separate embodiment, and that this description is provided for clarity only, and that the disclosure is not limited to the embodiments described in detail below, and that the embodiments described in the examples may be combined as appropriate to form other embodiments that will be apparent to those skilled in the art.

Claims (8)

1. The four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning is characterized by comprising the following steps of:
step 1: building a posture power model of the four-rotor unmanned aerial vehicle, building a posture angle state constraint on the posture power model by adopting a preset performance function, and building a posture tracking error model meeting the requirement of transient response performance of the four-rotor unmanned aerial vehicle by combining a posture angle error variable;
step 2: discretizing the attitude tracking error model constructed in the step 1, constructing a long-term cost function of the four-rotor unmanned aerial vehicle based on the discretized attitude tracking error model, and forming a real-time reward function of integral reinforcement learning;
step 3: constructing an evaluation neural network for controlling the performance of the four-rotor unmanned aerial vehicle system, constructing an error model of integral reinforcement learning based on an estimated value of the evaluation neural network on a long-term cost function, and constructing an evaluation neural network-action neural network integral reinforcement learning control model by combining the real-time reward function formed in the step 2;
step 4: and respectively designing weight updating rules for the evaluation neural network and the action neural network in the evaluation neural network-action neural network integral reinforcement learning control model, and tracking and controlling the gesture of the four-rotor unmanned aerial vehicle by using the integral reinforcement learning control model adopting the weight updating rules.
2. The four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning according to claim 1, wherein the step 1 comprises the following substeps:
step 11: building a gesture power model of the four-rotor unmanned aerial vehicle:
wherein ,the change rate of the attitude angle of the four-rotor unmanned aerial vehicle is the change rate;
the change rate of the attitude angular rate of the quadrotor unmanned aerial vehicle is the change rate of the attitude angular rate of the quadrotor unmanned aerial vehicle;
a rotation matrix of an attitude angle system of the quadrotor unmanned aerial vehicle;
、/>the attitude angular rate and the moment of inertia of the four-rotor unmanned aerial vehicle are adopted;
the attitude angular rate matrix is the attitude angular rate matrix of the quadrotor unmanned aerial vehicle;
the control moment of the four-rotor unmanned aerial vehicle is;
external bounded interference to the quad-rotor unmanned helicopter;
step 12: establishing attitude angle state constraint on an attitude power model by adopting a preset performance function:
wherein The attitude angle of the four-rotor unmanned aerial vehicle under i is;
,/>、/>、/>roll angle, pitch angle and yaw angle, respectively, represent the subscript +.>Refers to one of a roll angle, a pitch angle, and a yaw angle;
for presetting performance index function, satisfy->,/>;/>,/>Is constant, satisfy->,/>Is a time variable;
the amplitude adjusting parameter of the function for presetting the performance index satisfies +.>
Step 13: combining with an attitude angle error variable, constructing an attitude tracking error model meeting the transient response performance requirement of the four-rotor unmanned aerial vehicle:
wherein , and />The preset performance tracking error vectors respectively +.>About time->First and second derivatives of (2);
four-rotor-wing nothing is considered for attitude angle error variableThe state constraint and control model of the man-machine is specifically as follows:
for four rotor unmanned aerial vehicle attitude angle vector, +.>Intermediate variable +.>Wherein the auxiliary attitude angle constraint variable +.>,/>Is->First derivative with respect to time, < >>For constraining attitude angle->Is a preset performance index function vector of +.>Is->First derivative with respect to time, < >>Is->Second derivative with respect to time,/>Rotation matrix of attitude angle system for quadrotor unmanned aerial vehicle, +.>For the first derivative of the rotation matrix of the attitude angle system of a quadrotor unmanned aerial vehicle with respect to time,is an intermediate variable introduced.
3. The four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning according to claim 2, wherein,,/>and->The method comprises the following steps of:
wherein The roll angle rate, pitch angle rate and yaw angle rate, respectively; />、/>、/>The roll angle, pitch angle and yaw angle, respectively.
4. The four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning according to claim 2, wherein the four-rotor unmanned aerial vehicle preset performance tracking control method is characterized in that
wherein ,representing a hyperbolic secant function->
,/> and />For performance parameters selected according to the transient performance of a quadrotor unmanned aerial vehicle>,/>Determining the initial boundary and the end boundary of the attitude angle operation of the quadrotor unmanned aerial vehicle, < ->Determining attitude angle of quadrotor unmanned aerial vehicle under preset performance function constraintConvergence speed.
5. The four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning according to claim 1, wherein the step 2 comprises the following substeps:
step 21: discretizing the attitude tracking error model constructed in the step 1 to obtain a discretized attitude tracking error model:
wherein ,discretizing the first part based on the forward difference method>Step preset Performance tracking error vector +.>And presetting a first derivative of the performance tracking error vector with respect to time
、/>Discretized according to forward difference method>Step preset Performance tracking error vector +.>And presetting a first derivative of the performance tracking error vector with respect to time
Control input torque for discretization;
a model matrix and a control distribution matrix of the discretization model respectively;
the external bounded disturbance of the discretized four-rotor unmanned aerial vehicle;
step 22: constructing a long-term cost function of the four-rotor unmanned aerial vehicle based on the error state quantity and the control quantity in the discretized attitude tracking error model obtained in the step 21:
wherein ,as a positive function, reflecting whether the attitude angle of the current quadrotor unmanned aerial vehicle is out of range;
to be at the present->Based on the steps, the control performance prediction step number is carried out backwards in time;
is discretized->Step (3) a preset performance tracking error vector and a first derivative thereof;
for->Is->The function value of the power of the second order>Is a discount factor, satisfy->
The weight matrix is a positive definite matrix, and the tracking error performance and the energy consumption of the four-rotor unmanned aerial vehicle model are balanced;
is discretized->Controlling input moment;
the initial moment is based on a forward difference discretization error model;
step 23: long term cost function according to step 22Form integration reinforcement learning->Real-time rewarding function of steps
wherein ,representing the output quantity of the four-rotor unmanned aerial vehicle attitude model; />A weight matrix is positively defined;is a desired four-rotor unmanned aerial vehicle attitude angle signal.
6. The four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning according to claim 1, wherein the step 3 comprises the following sub-steps:
step 31: constructing an evaluation neural network for controlling behavior of the four-rotor unmanned aerial vehicle model:
wherein ,the method is an ideal weight matrix for evaluating the neural network;
for a desired long-term performance index function, +.>A vector that is all zeros;
to evaluate the activation function of the neural network;
to evaluate the neural network for the desired long-term performance index function +.>Is determined by the estimation error of (a);
satisfy the following requirements,/> and />,/>Are all unknown constants;
step 32: based on the estimated value of the long-term cost function of the evaluation neural network, constructing an error model of integral reinforcement learning:
wherein ,to evaluate the long-term cost function of a neural network>Estimated value of ∈10->Error for integral reinforcement learning;
step 33: an evaluation neural network-action neural network integral reinforcement learning control model is established based on an error model of integral reinforcement learning and a real-time rewarding function:
error model based on integral reinforcement learning, in the firstStep, the following four-rotor unmanned aerial vehicle attitude angle tracking error is established
wherein , and />;/>Tracking signals for the attitude angles of the expected four-rotor unmanned aerial vehicle;
furthermore, the first step is introduced on the basis of the attitude angle tracking errorStep attitude angular rate tracking error:
according to the design method of the state feedback control law, an ideal controller is designed as follows:
wherein ,control gain for the design;
the following action neural network design is introduced:
wherein ,weights for ideal action neural network, +.>An activation function for the action neural network;
action neural networkThe input of (1) is defined as +.>
Representing the weight of hidden layers in the action neural network, and thus finishing the establishment of the evaluation neural network-action neural network integral reinforcement learning control model.
7. The four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning according to claim 1, wherein in the step 4, the weight update law design process for evaluating the neural network is as follows:
for evaluating neural networksThe following approximation errors of the evaluation neural network are introduced: />
Wherein, the weight approximation error of the neural network;/>Ideal weight for evaluating neural network>Is a function of the estimated value of (2);
the error of integral reinforcement learning is further evolved by combining the Bellman iterative equation and the empirical difference method:
to minimize long-term cost functionIt is mapped to the error cost function +.>In (a):
further, in combination with the self-adaptive gradient descent method of the discrete model, the following weight update gradient of the evaluation neural network is designed:
wherein ,in combination with the chain law,/->The method comprises the following steps:
wherein ,learning gain for evaluating the neural network weight update law;
thus, the following evaluation neural network weight update law can be obtained:
8. the four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning according to claim 1, wherein in the step 4, the weight update law design process of the action neural network is as follows:
aiming at an action neural network with ideal weight, the following neural network estimation strategy is designed to solve the problem that the action neural network cannot be directly applied:
wherein ,the output of the action neural network;
,/> and />,/>Is an unknown constant value;
further, by combining the tracking error of the attitude angular rate of the quadrotor unmanned aerial vehicle, the following steps are obtained:
wherein ,,/>is the desired attitude angular rate;
meanwhile, the estimation error of the action neural network weightDefined as->
Combining external bounded interference and action neural network estimation errors received by the four-rotor unmanned aerial vehicle, and defining the attitude angle tracking error of the four-rotor unmanned aerial vehicle caused by the external bounded interference and action neural network estimation errors received by the unmanned aerial vehicle as
Further, an estimation error of the action neural network is introducedIn combination with the aim of minimizing the output of the evaluation neural network, the following action neural network errors are designed:
wherein ,,/>an estimated value that is an output of the ideal evaluation neural network;
according to minimizing action neural network errorsIs subject to introduction of quadratic errors +.>In combination with the chain rule, the update gradient of the action neural network weight is designed as follows:
further, an action neural network weight update law can be obtained:
CN202310930078.2A 2023-07-27 2023-07-27 Four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning Active CN116661478B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310930078.2A CN116661478B (en) 2023-07-27 2023-07-27 Four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310930078.2A CN116661478B (en) 2023-07-27 2023-07-27 Four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN116661478A true CN116661478A (en) 2023-08-29
CN116661478B CN116661478B (en) 2023-09-22

Family

ID=87709994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310930078.2A Active CN116661478B (en) 2023-07-27 2023-07-27 Four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN116661478B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160280369A1 (en) * 2013-11-01 2016-09-29 The University Of Queensland Rotorcraft
US20180164124A1 (en) * 2016-09-15 2018-06-14 Syracuse University Robust and stable autonomous vision-inertial navigation system for unmanned vehicles
US20200183339A1 (en) * 2018-12-10 2020-06-11 California Institute Of Technology Systems and Methods for Robust Learning-Based Control During Forward and Landing Flight Under Uncertain Conditions
CN111650830A (en) * 2020-05-20 2020-09-11 天津大学 Four-rotor aircraft robust tracking control method based on iterative learning
CN112363519A (en) * 2020-10-20 2021-02-12 天津大学 Four-rotor unmanned aerial vehicle reinforcement learning nonlinear attitude control method
CN113238572A (en) * 2021-05-31 2021-08-10 上海海事大学 Preset-time quadrotor unmanned aerial vehicle attitude tracking method based on preset performance control
CN113467255A (en) * 2021-08-12 2021-10-01 天津大学 Self-adaptive multivariable fixed time preset control method for reusable carrier
CN113848980A (en) * 2021-10-14 2021-12-28 北京航空航天大学 Rigid body aircraft attitude tracking method and system based on iterative learning
CN114594776A (en) * 2022-03-14 2022-06-07 安徽大学 Navigation obstacle avoidance method based on layering and modular learning
CN115079574A (en) * 2022-07-19 2022-09-20 安徽大学 Distributed fault compensation method for flexible hypersonic aircraft
US20230069480A1 (en) * 2020-02-13 2023-03-02 Tinamu Labs Ag Uav positioning system and method for controlling the position of an uav
CN116088550A (en) * 2022-12-31 2023-05-09 南京理工大学 Fixed time attitude tracking control method for four-rotor unmanned aerial vehicle

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160280369A1 (en) * 2013-11-01 2016-09-29 The University Of Queensland Rotorcraft
US20180164124A1 (en) * 2016-09-15 2018-06-14 Syracuse University Robust and stable autonomous vision-inertial navigation system for unmanned vehicles
US20200183339A1 (en) * 2018-12-10 2020-06-11 California Institute Of Technology Systems and Methods for Robust Learning-Based Control During Forward and Landing Flight Under Uncertain Conditions
US20230069480A1 (en) * 2020-02-13 2023-03-02 Tinamu Labs Ag Uav positioning system and method for controlling the position of an uav
CN111650830A (en) * 2020-05-20 2020-09-11 天津大学 Four-rotor aircraft robust tracking control method based on iterative learning
CN112363519A (en) * 2020-10-20 2021-02-12 天津大学 Four-rotor unmanned aerial vehicle reinforcement learning nonlinear attitude control method
CN113238572A (en) * 2021-05-31 2021-08-10 上海海事大学 Preset-time quadrotor unmanned aerial vehicle attitude tracking method based on preset performance control
CN113467255A (en) * 2021-08-12 2021-10-01 天津大学 Self-adaptive multivariable fixed time preset control method for reusable carrier
CN113848980A (en) * 2021-10-14 2021-12-28 北京航空航天大学 Rigid body aircraft attitude tracking method and system based on iterative learning
CN114594776A (en) * 2022-03-14 2022-06-07 安徽大学 Navigation obstacle avoidance method based on layering and modular learning
CN115079574A (en) * 2022-07-19 2022-09-20 安徽大学 Distributed fault compensation method for flexible hypersonic aircraft
CN116088550A (en) * 2022-12-31 2023-05-09 南京理工大学 Fixed time attitude tracking control method for four-rotor unmanned aerial vehicle

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
JACOB HWHITE: "An Iterative Pose Estimation Algorithm Based on Epipolar Geometry With Application to Multi-Target Tracking", IEEE_CAA JOURNAL OF AUTOMATICA SINICA, pages 942 - 953 *
MU CHAOXU: "Learning-Based_Robust_Tracking_Control_of_Quadrotor_With_Time-Varying_and_Coupling_Uncertainties", IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, pages 259 - 273 *
史豪斌: "基于强化学习的旋翼无人机智能追踪方法", 《电子科技大学学报》, vol. 48, no. 4, pages 553 - 559 *
沈林武: "基于快速终端滑模面的两旋翼飞行器有限时间姿态控制", 《计算机测量与控制》, pages 137 - 142 *
闫锦龙: "基于深度学习的无人机目标检测与跟踪算法研究", 《工程科技Ⅱ辑》, pages 1 - 77 *

Also Published As

Publication number Publication date
CN116661478B (en) 2023-09-22

Similar Documents

Publication Publication Date Title
Xu et al. Robust adaptive neural control of nonminimum phase hypersonic vehicle model
Chen et al. Human-in-the-loop consensus tracking control for UAV systems via an improved prescribed performance approach
CN110806759B (en) Aircraft route tracking method based on deep reinforcement learning
Elkhatem et al. Robust LQR and LQR-PI control strategies based on adaptive weighting matrix selection for a UAV position and attitude tracking control
CN106444799A (en) Quadrotor unmanned plane control method based on fuzzy expansion state observer and adaptive sliding formwork
Han et al. Online policy iteration ADP-based attitude-tracking control for hypersonic vehicles
Song et al. Adaptive compensation control for attitude adjustment of quad-rotor unmanned aerial vehicle
Ding et al. Robust fixed-time sliding mode controller for flexible air-breathing hypersonic vehicle
Elhaki et al. A novel model-free robust saturated reinforcement learning-based controller for quadrotors guaranteeing prescribed transient and steady state performance
CN110908281A (en) Finite-time convergence reinforcement learning control method for attitude motion of unmanned helicopter
Zhen et al. Deep reinforcement learning attitude control of fixed-wing UAVs
Shen et al. Dynamic surface control for tracking of unmanned surface vessel with prescribed performance and asymmetric time-varying full state constraints
Li et al. Leader-follower formation of light-weight UAVs with novel active disturbance rejection control
Wang et al. Intelligent control of air-breathing hypersonic vehicles subject to path and angle-of-attack constraints
Suresh et al. An on-line learning neural controller for helicopters performing highly nonlinear maneuvers
Zhen et al. Information fusion based optimal control for large civil aircraft system
Liu et al. Antisaturation fixed-time attitude tracking control based low-computation learning for uncertain quadrotor UAVs with external disturbances
Qian et al. Sliding mode control‐based distributed fault tolerant tracking control for multiple unmanned aerial vehicles with input constraints and actuator faults
Liu et al. Fixed-time self-structuring neural network fault-tolerant tracking control of underactuated surface vessels with state constraints
CN115981149B (en) Hypersonic aircraft optimal control method based on safety reinforcement learning
CN116661478B (en) Four-rotor unmanned aerial vehicle preset performance tracking control method based on reinforcement learning
An et al. Dual-mode control of hypersonic vehicles
An et al. Event-triggered adaptive control of hypersonic vehicles subject to actuator nonlinearities
CN116449704A (en) Nonlinear Gao Jiequan-driven multi-agent system distributed containment control method
CN114003052B (en) Fixed wing unmanned aerial vehicle longitudinal movement robust self-adaptive control method based on dynamic compensation system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant