CN117911414A - Automatic driving automobile motion control method based on reinforcement learning - Google Patents

Automatic driving automobile motion control method based on reinforcement learning

Publication number: CN117911414A
Application number: CN202410315976.1A
Legal status: Pending
Applicant and current assignee: Anhui University
Inventors: He Shuping (何舒平), Cheng Weidi (程纬地), Ren Chengcheng (任乘乘), Wang Guangyu (王广宇)
Original language: Chinese (zh)
Classification landscape: Feedback Control In General
Abstract

The invention relates to the technical field of tracking control, and in particular to a reinforcement-learning-based motion control method for autonomous vehicles. In the first stage, a robust steering controller based on reinforcement learning is designed with backstepping control, building on a reference path model, a vehicle dynamics model and a kinematic model, so that the lateral path tracking error is suppressed, unknown external disturbances are rejected, and the yaw stability of the autonomous vehicle is guaranteed. In the second stage, an adaptive control mechanism based on Lyapunov stability theory is combined with a radial basis function neural network, which can approximate any continuous nonlinear function, to compensate for the uncertainty of the tire cornering stiffness and guarantee the global asymptotic stability of the closed-loop system.

Description

Automatic driving automobile motion control method based on reinforcement learning
Technical Field
The invention relates to the technical field of tracking control, in particular to an automatic driving automobile motion control method based on reinforcement learning.
Background
One of the most important considerations in designing a motion control scheme for an autonomous vehicle is eliminating the lateral path tracking error while ensuring the stability of the vehicle during driving. In general, motion control of an autonomous vehicle can be decomposed into longitudinal control and lateral control based on the current vehicle state and road information; longitudinal control aims to maintain a desired cruising speed and to keep a safe distance between the preceding vehicle and the controlled vehicle to avoid collision.
However, when autonomous vehicles leave the research laboratory, they must be able to react to emergencies, some of which demand aggressive maneuvers, such as emergency collision avoidance performed in a short time window with large actuator inputs and high yaw rates. The tires then become highly saturated and begin to slip. In this regime the tire force characteristic becomes highly nonlinear: the cornering force no longer increases linearly with the slip angle, but hardly changes, or even decreases, as the slip angle grows. Lateral motion control of autonomous vehicles therefore still faces two major challenges that arise in actual vehicle systems: parametric modeling uncertainty and unknown external disturbances. If the tire lateral forces in the nonlinear region are treated as linear forces, or the driving environment changes suddenly, the behavior of the vehicle may become uncontrollable, causing the autonomous vehicle to lose its path tracking capability and stability.
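The tire-force saturation described above can be made concrete with a simplified "magic formula" tire model; the coefficients below are illustrative textbook values, not parameters from the patent, and serve only to show how the linear tire assumption breaks down at large slip angles.

```python
import math

def lateral_force(alpha, mu=0.9, fz=4000.0, b=10.0, c=1.9, e=0.97):
    """Normalized Pacejka 'magic formula' tire model (illustrative
    coefficients, not from the patent): lateral force [N] for slip
    angle alpha [rad]; the peak is limited by the adhesion mu*fz."""
    d = mu * fz
    x = b * alpha
    return d * math.sin(c * math.atan(x - e * (x - math.atan(x))))

def linear_force(alpha, c_alpha=68400.0):
    """Linear tire model; c_alpha matches the small-slip stiffness
    (b * c * mu * fz) of the nonlinear model above."""
    return c_alpha * alpha

# At small slip angles the two models agree; near saturation the
# linear model vastly over-predicts the available lateral force.
for deg in (1, 4, 10):
    a = math.radians(deg)
    print(f"{deg:>2} deg  nonlinear {lateral_force(a):7.0f} N"
          f"  linear {linear_force(a):7.0f} N")
```

At 1 degree the two models nearly agree; at 10 degrees the linear model predicts roughly three times the force the saturating tire can actually deliver, which is exactly the modeling error the patent's robust design is meant to absorb.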
Disclosure of Invention
Therefore, the invention aims to provide a reinforcement-learning-based motion control method for autonomous vehicles, so as to solve the instability caused by parameter uncertainty in existing autonomous driving algorithms.
Based on the above purpose, the invention provides an automatic driving automobile motion control method based on reinforcement learning, which comprises the following steps:
S1, establishing an autonomous control automobile system dynamic model;
S2, establishing a steering system model of the automobile;
S3, combining an autonomous control automobile system dynamic model and a steering system model to obtain a man-machine control mapping model of torque input;
S4, selecting state variables of the man-machine control mapping model, designing an adaptive optimizing controller based on the backstepping technique and reinforcement learning, and performing autonomous vehicle motion control with the resulting optimized control strategy, which is:
where x is the system state, n is the system order, the design gains are positive constants, z_i is the tracking error, α_i* represents the optimized virtual control law, u* represents the optimized actual control law, S_a(Z) and S_c(Z) are basis function vectors, ε_a and ε_c are bounded approximation errors, Z represents the system variables, W_a is the actor neural network weight, W_c is the critic neural network weight, and W_d is the identifier neural network weight; the adaptive update law of the critic-actor mechanism controller is:
where γ_a and γ_c are positive constants;
The identifier update law for the disturbance and nonlinear term approximation is:
where Γ_d is a positive constant matrix and σ_d is a constant.
Preferably, establishing the autonomous control automobile system dynamic model includes:
S11, establishing a two-degree-of-freedom vehicle dynamics model:
where β is the sideslip angle of the vehicle body, r is the yaw rate of the vehicle body, m is the total mass of the vehicle, v_x is the longitudinal velocity, I_z is the yaw moment of inertia, l_f and l_r are the distances from the center of gravity to the front and rear axles, respectively, and F_yf and F_yr are the tire lateral forces of the front and rear axles, respectively;
S12, converting the vehicle dynamics state parameters into state parameters related to the reference trajectory, using the prediction error e_p, which incorporates the lateral error e_y and the yaw error e_ψ; the vehicle kinematic model is obtained as:
where v_y is the lateral velocity, ψ is the heading of the vehicle, ψ_r is the heading of the reference path, d_s is the constant projection distance, and s represents the distance along the reference path;
S13, differentiating e_p, and differentiating e_y and e_ψ of formula (1.5), yields:
where κ represents the curvature of the path;
S14, to eliminate the prediction error e_p and the yaw-rate oscillation, the following relationship is obtained:
S15, substituting formula (1.4) into formula (1.7) yields:
S16, in the presence of unknown external disturbances, the tire lateral forces of the front and rear axles are computed as:
where f_f and f_r are the tire cornering forces of the front and rear axles, respectively, Δf_f and Δf_r are the unknown lateral disturbance forces on the front and rear axles, respectively, α_f and α_r are the tire slip angles of the front and rear axles, respectively, C_f is the front-axle tire cornering stiffness, C_r is the rear-axle tire cornering stiffness, and μ is the adhesion coefficient between the tire and the road surface;
The slip angles of the front and rear tires satisfy:
where δ_f is the front wheel steering angle and v is the vehicle speed;
And the cornering stiffness, which has a nonlinear characteristic, satisfies:
where C_f0 and C_r0 are the nominal cornering stiffnesses of the front and rear axles, and ΔC_f and ΔC_r are the cornering stiffness uncertainties of the front and rear tires, respectively;
S17, combining formulas (1.8)-(1.11) yields the nonlinear vehicle-road system model:
where S_1, S_2 and S_3 are system variables, and the smooth function d(S) represents the equivalent stochastic disturbance of the vehicle.
Preferably, building a steering system model of the automobile includes:
The steering system model is initially built as follows:
restated as:
where J_eq and B_eq are the equivalent moment of inertia and equivalent damping of the steering system, respectively, i_m is the reduction ratio of the motor reduction mechanism, i_s is the reduction ratio of the steering system, δ_f is the front wheel angle, T_d is the driver input torque, and T_l is the steering load torque;
The fitting equation is:
Regarding δ_f and its derivative as measurable in real time and J_eq as a known value, the corresponding terms in formula (1.14) are treated as a known term plus a smooth function, where T_l is obtained from fitting formula (1.15), T_d is measured by a sensor, ε_T is the torque fitting error, and ε_m is the measurement error; the simplified model of the steering system is thus obtained:
where S_4 and S_5 are the system variables of the steering subsystem.
Preferably, combining the autonomous control vehicle system dynamic model and the steering system modeling, the man-machine control mapping model for deriving the torque input comprises:
Combining formulas (1.12) and (1.16) yields the man-machine control mapping model of the torque input:
where ω_f denotes the front wheel angular velocity.
Preferably, the process of deriving an optimized control strategy comprises:
Selecting the front wheel steering angular velocity, the front wheel steering angle δ_f, the projection error e_p and the derivative of the projection error as the state variables of the strict-feedback system, i.e., of formula (1.17), formula (1.17) is converted into:
where the functions on the right-hand side are defined from the terms of formula (1.17);
Adopting a coordinate transformation and introducing a first-order filter, the tracking error equation is obtained:
The first-order filter is designed as τ ς̇ + ς = α_i*, where y_r is the reference signal, ς is the filter output signal, α_i* is the filter input signal, i.e., the optimized virtual control law, τ is a design constant, and ς̇ is the first-order derivative of the filter output signal;
Introducing finite-time convergence as a constraint in the controller design, i.e., making the system achieve the control objective within a finite time T, where T satisfies:
where the bounding parameters are constants and V(0) is the initial Lyapunov function value;
For the fourth-order system of formula (1.18), take α_i as the virtual control law of the i-th step, where i = 1, 2, 3; the optimal performance index function is obtained as:
where the integrand is the cost function of the i-th step;
Let α_i* be the optimal virtual controller; then:
where Ω is a predefined compact set;
Regarding α_i* as the optimal virtual control signal, the HJB equation corresponding to formula (1.21) is obtained as:
where α_i* is obtained by solving the stationarity condition of the HJB equation, i.e., setting its derivative with respect to α_i to zero;
Decompose α_i* into:
where the two components are unknown continuous functions;
Approximating the unknown continuous functions with neural networks, i.e., on the compact set we have:
where W_1* and W_2* are the ideal neural network weights, S_1(Z) and S_2(Z) are basis function vectors, and ε_1 and ε_2 are bounded approximation errors;
Substituting formula (1.24) into formula (1.23) gives:
where the lumped unknown term collects the ideal weights and approximation errors;
The corresponding optimized controller is obtained as:
Reinforcement learning with an identifier-critic-actor structure is then introduced; the identifier used to approximate the unknown dynamics is designed as:
where y_d is the identifier output and W_d is the identifier neural network weight;
The identifier update law is constructed as:
where Γ_d is a positive constant matrix and σ_d is a constant;
Based on the critic-actor architecture and formula (1.25), the critic that evaluates the control performance is constructed as:
where J_i is the estimate of J_i* and W_c is the critic neural network weight;
According to formula (1.28), the actor that executes the control action is designed as:
where α_i* and u* are the optimized virtual control law and the optimized actual control law, respectively, and W_a is the actor neural network weight;
The critic and actor neural network weight update laws are:
where γ_c and γ_a are positive constants.
The invention has the beneficial effects that:
(1) The invention provides a controller design based on a critic-actor reinforcement learning mechanism, realizing stable motion control of an autonomous vehicle.
(2) To counter the adverse effects of vehicle parameter modeling uncertainty and unknown external disturbances on the lateral motion control of an autonomous vehicle under extreme driving conditions, the invention provides an identifier-critic-actor mechanism that effectively improves the control performance of the system under large uncertainty.
(3) For complex scenarios, addressing the problems of suppressing the lateral path tracking error, rejecting unknown external disturbances and guaranteeing the yaw stability of the autonomous vehicle, the invention provides an adaptive controller design based on reinforcement learning and the backstepping method; reinforcement learning can dynamically adjust the adaptive parameters to improve the robustness of motion control.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the drawings used in the description of the embodiments or of the prior art are briefly introduced below; it is obvious that the drawings described below are only some embodiments of the invention, and that a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a modeling block diagram of a steering system module according to an embodiment of the present invention;
FIG. 2 is a diagram of a kinematic model of a vehicle according to an embodiment of the present invention;
FIG. 3 is an architecture diagram of an autonomous vehicle lateral motion controller according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a motion control method applied to an adaptive lateral motion system according to an embodiment of the present invention.
Detailed Description
The present invention will be further described in detail with reference to specific embodiments in order to make the objects, technical solutions and advantages of the present invention more apparent.
It is to be noted that, unless otherwise defined, technical or scientific terms used herein should be taken in their ordinary sense as understood by one of ordinary skill in the art to which the present invention belongs. The terms "first," "second," and the like, as used herein, do not denote any order, quantity, or importance, but are used to distinguish one element from another. The word "comprising," "comprises," or the like means that the element or item preceding the word encompasses the elements or items listed after the word and their equivalents, without excluding other elements or items. The terms "connected" and the like are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper," "lower," "left," "right," etc. are used merely to indicate relative positional relationships, which may change when the absolute position of the described object changes.
The embodiment of the specification provides an automatic driving automobile motion control method based on reinforcement learning, which is characterized by comprising the following steps:
S1, establishing an autonomous control automobile system dynamic model;
The method specifically comprises the following steps: S11, establishing a two-degree-of-freedom vehicle dynamics model:
where β is the sideslip angle of the vehicle body, r is the yaw rate of the vehicle body, m is the total mass of the vehicle, v_x is the longitudinal velocity, I_z is the yaw moment of inertia, l_f and l_r are the distances from the center of gravity to the front and rear axles, respectively, and F_yf and F_yr are the tire lateral forces of the front and rear axles, respectively;
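The model equation itself appears only as an image in the original text; a standard two-degree-of-freedom (bicycle) formulation consistent with the symbols just defined, offered here as a reconstruction rather than the patent's exact formula (1.4), reads:

```latex
\dot{\beta} = \frac{F_{yf} + F_{yr}}{m v_x} - r,
\qquad
\dot{r} = \frac{l_f F_{yf} - l_r F_{yr}}{I_z}
```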
S12, converting the vehicle dynamics state parameters into state parameters related to the reference trajectory, so as to focus on trajectory tracking. The conversion uses the prediction error e_p, which incorporates the lateral error e_y and the yaw error e_ψ; the vehicle kinematic model, shown in fig. 2, is:
where v_y is the lateral velocity, ψ is the heading of the vehicle, ψ_r is the heading of the reference path, d_s is the constant projection distance, and s represents the distance along the reference path;
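The kinematic model is likewise an image in the original; a common form of path-tracking error kinematics matching these definitions, again a reconstruction rather than the exact formula (1.5), is (with κ the path curvature):

```latex
\dot{e}_y = v_x \sin e_\psi + v_y \cos e_\psi,
\qquad
\dot{e}_\psi = \dot{\psi} - \kappa\,\dot{s},
\qquad
e_p = e_y + d_s \sin e_\psi
```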
S13, differentiating e_p, and differentiating e_y and e_ψ of formula (1.5), yields:
where κ represents the curvature of the path;
S14, under high lateral acceleration or a low adhesion coefficient the closed-loop steering response is underdamped, causing significant yaw-rate oscillation; to eliminate the prediction error e_p and the yaw-rate oscillation, the following relationship is obtained from formula (1.6):
S15, substituting formula (1.4) into formula (1.7) yields:
S16, in the presence of unknown external disturbances, the tire lateral forces of the front and rear axles are computed as:
where f_f and f_r are the tire cornering forces of the front and rear axles, respectively, Δf_f and Δf_r are the unknown lateral disturbance forces on the front and rear axles, respectively, α_f and α_r are the tire slip angles of the front and rear axles, respectively, C_f is the front-axle tire cornering stiffness, C_r is the rear-axle tire cornering stiffness, and μ is the adhesion coefficient between the tire and the road surface;
The slip angles of the front and rear tires satisfy:
where δ_f is the front wheel steering angle and v is the vehicle speed;
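For reference, the standard small-angle slip-angle relations and a disturbed cornering-force model matching these symbol definitions, reconstructed here since formulas (1.9) and (1.10) are images in the original, take the form:

```latex
\alpha_f = \delta_f - \frac{v_y + l_f r}{v_x},
\quad
\alpha_r = \frac{l_r r - v_y}{v_x},
\quad
F_{yf} = \mu C_f \alpha_f + \Delta f_f,
\quad
F_{yr} = \mu C_r \alpha_r + \Delta f_r
```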
And the cornering stiffness, which has a nonlinear characteristic, satisfies:
where C_f0 and C_r0 are the nominal cornering stiffnesses of the front and rear axles, and ΔC_f and ΔC_r are the cornering stiffness uncertainties of the front and rear tires, respectively;
S17, combining formulas (1.8)-(1.11) yields the nonlinear vehicle-road system model:
where S_1, S_2 and S_3 are system variables, and the smooth function d(S) represents the equivalent stochastic disturbance of the vehicle.
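A minimal numerical sketch of the linear part of this vehicle model can illustrate the dynamics; all parameters are illustrative, not taken from the patent, and the simulation simply shows the lateral dynamics settling to a steady-state yaw rate under constant steering.

```python
import math

# Illustrative parameters (assumptions, not from the patent)
m, iz = 1500.0, 2500.0          # mass [kg], yaw inertia [kg m^2]
lf, lr = 1.2, 1.4               # CG-to-axle distances [m]
cf, cr = 80000.0, 80000.0       # cornering stiffnesses [N/rad]
vx = 20.0                       # longitudinal speed [m/s]

def step(beta, r, delta_f, dt=0.001):
    """One Euler step of the linear 2-DOF (bicycle) model."""
    alpha_f = delta_f - beta - lf * r / vx   # front slip angle
    alpha_r = -beta + lr * r / vx            # rear slip angle
    fyf, fyr = cf * alpha_f, cr * alpha_r    # linear tire forces
    dbeta = (fyf + fyr) / (m * vx) - r       # sideslip dynamics
    dr = (lf * fyf - lr * fyr) / iz          # yaw dynamics
    return beta + dt * dbeta, r + dt * dr

beta, r = 0.0, 0.0
for _ in range(5000):                        # 5 s at 2 deg steering
    beta, r = step(beta, r, math.radians(2))
print(f"steady-state yaw rate ~ {r:.3f} rad/s")
```

With these (assumed) parameters the closed-form steady-state yaw rate is about 0.22 rad/s, which the simulation reproduces.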
S2, building a steering system model of an automobile, wherein the steering system structure of the automobile is shown in fig. 1, and specifically comprises the following steps:
The steering system model is initially built as follows:
restated as:
where J_eq and B_eq are the equivalent moment of inertia and equivalent damping of the steering system, respectively, i_m is the reduction ratio of the motor reduction mechanism, i_s is the reduction ratio of the steering system, δ_f is the front wheel angle, T_d is the driver input torque, and T_l is the steering load torque; both torques can be obtained from sensors, but under extreme conditions the sensor data become inaccurate, so the fitting equation is used:
Regarding δ_f and its derivative as measurable in real time and J_eq as a known value, the corresponding terms in formula (1.14) are treated as a known term plus a smooth function, where T_l is obtained from fitting formula (1.15), T_d is measured by a sensor, ε_T is the torque fitting error, and ε_m is the measurement error; the simplified model of the steering system is thus obtained:
where S_4 and S_5 are the system variables of the steering subsystem.
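Formulas (1.13), (1.14) and (1.16) are images in the original; a common steer-by-wire form consistent with the symbol list, offered as a reconstruction rather than the patent's exact equations, is:

```latex
J_{eq}\,\ddot{\delta}_f + B_{eq}\,\dot{\delta}_f = i_m i_s T_d - T_l,
\qquad
\dot{S}_4 = S_5,
\quad
\dot{S}_5 = \frac{i_m i_s T_d - T_l - B_{eq} S_5}{J_{eq}}
```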
S3, combining an autonomous control automobile system dynamic model and a steering system model to obtain a man-machine control mapping model of torque input, wherein the man-machine control mapping model specifically comprises the following steps:
Combining formulas (1.12) and (1.16) yields the man-machine control mapping model of the torque input:
where ω_f denotes the front wheel angular velocity.
S4, selecting state variables of the man-machine control mapping model, designing an adaptive optimizing controller based on the backstepping technique and reinforcement learning, and performing autonomous vehicle motion control with the resulting optimized control strategy, which is:
where x is the system state, n is the system order, the design gains are positive constants, z_i is the tracking error, α_i* represents the optimized virtual control law, u* represents the optimized actual control law, S_a(Z) and S_c(Z) are basis function vectors, ε_a and ε_c are bounded approximation errors, Z represents the system variables, W_a is the actor neural network weight, W_c is the critic neural network weight, and W_d is the identifier neural network weight; the adaptive update law of the critic-actor mechanism controller is:
where γ_a and γ_c are positive constants;
The identifier update law for the disturbance and nonlinear term approximation is:
where Γ_d is a positive constant matrix and σ_d is a constant.
Specifically, the process of deriving an optimized control strategy includes:
Selecting the front wheel steering angular velocity, the front wheel steering angle δ_f, the projection error e_p and the derivative of the projection error as the state variables of the strict-feedback system, i.e., of formula (1.17), formula (1.17) is converted into:
where the functions on the right-hand side are defined from the terms of formula (1.17);
Adopting a coordinate transformation and, to suppress chattering of the control signal, introducing a first-order filter via the dynamic surface technique, the tracking error equation is obtained:
where y_r is the reference signal, ς is the filter output signal, and α_i* is the optimized virtual control law; the filter is designed as τ ς̇ + ς = α_i*, where τ is a design constant and ς̇ is the first-order derivative of the filter output signal;
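The dynamic surface idea is easy to demonstrate: passing the virtual control through a first-order filter yields both a smoothed signal and its derivative without analytic differentiation. The sketch below uses assumed values for the design constant and step size, not values from the patent.

```python
def first_order_filter(alpha_samples, tau=0.05, dt=0.001, s0=0.0):
    """Discrete first-order filter  tau * s' + s = alpha  used by the
    dynamic surface technique: smooths a virtual control signal and
    returns (output, output derivative) at each step."""
    s, out = s0, []
    for a in alpha_samples:
        s_dot = (a - s) / tau
        s += dt * s_dot
        out.append((s, s_dot))
    return out

# A step in the virtual control: the filter output converges smoothly.
samples = [0.0] * 100 + [1.0] * 900
hist = first_order_filter(samples)
print(f"final output {hist[-1][0]:.3f}")
```

The filter output converges to the stepped virtual control with time constant τ, and its derivative, which the backstepping recursion needs, comes out of the filter for free.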
To guarantee fast motion control, a finite-time convergence constraint is introduced into the controller design, i.e., the system achieves the control objective within a finite time T, where T satisfies:
where the bounding parameters are constants and V(0) is the initial Lyapunov function value.
For the fourth-order system of formula (1.18), take α_i as the virtual control law of the i-th step, where i = 1, 2, 3; the optimal performance index function is obtained as:
where the integrand is the cost function of the i-th step;
Let α_i* be the optimal virtual controller; then:
where Ω is a predefined compact set;
Regarding α_i* as the optimal virtual control signal, the HJB equation corresponding to formula (1.21) is obtained as:
where α_i* is obtained by solving the stationarity condition of the HJB equation, i.e., setting its derivative with respect to α_i to zero;
Decompose α_i* into:
where the two components are unknown continuous functions;
Approximating the unknown continuous functions with neural networks, i.e., on the compact set we have:
where W_1* and W_2* are the ideal neural network weights, S_1(Z) and S_2(Z) are basis function vectors, and ε_1 and ε_2 are bounded approximation errors;
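The universal-approximation property used here can be illustrated with a least-squares fit standing in for the ideal weights W_1*, W_2*; the target function, centers and widths below are assumptions for illustration, not the patent's network.

```python
import numpy as np

def gaussian_basis(z, centers, width=0.5):
    """Vector S(z) of Gaussian radial basis functions."""
    return np.exp(-((z - centers) ** 2) / (2 * width ** 2))

# Fit ideal weights W* by least squares so that W*^T S(z) approximates
# an "unknown" nonlinear function on the compact set [-1, 1].
centers = np.linspace(-1.0, 1.0, 11)
zs = np.linspace(-1.0, 1.0, 201)
f = np.sin(2 * zs)                                        # target function
phi = np.stack([gaussian_basis(z, centers) for z in zs])  # (201, 11)
w, *_ = np.linalg.lstsq(phi, f, rcond=None)

err = np.max(np.abs(phi @ w - f))     # bounded approximation error
print(f"max approximation error on [-1, 1]: {err:.4f}")
```

On the compact set [-1, 1] the residual plays the role of the bounded approximation error ε.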
Substituting formula (1.24) into formula (1.23) gives:
where the lumped unknown term collects the ideal weights and approximation errors;
The corresponding optimized controller is obtained as follows:
; introduction of reinforcement learning with recognizer-criticizer-actor structure will be used to approximate/> Is designed as follows:
Wherein the method comprises the steps of Output for identifier,/>Neural network weights for the identified person;
The identifier update law is constructed as follows:
Wherein the method comprises the steps of Representing a positive constant matrix,/>Is a constant;
Based on the critic-actor architecture and formula (1.25), the critic that evaluates the control performance is constructed as:
where J_i is the estimate of J_i* and W_c is the critic neural network weight;
According to formula (1.28), the actor that executes the control action is designed as:
where α_i* and u* are the optimized virtual control law and the optimized actual control law, respectively, and W_a is the actor neural network weight;
The critic and actor neural network weight update laws are:
where γ_c and γ_a are positive constants.
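The structure of these update laws can be sketched on the simplest possible plant. For the scalar system x' = u with cost ∫(x² + u²)dt, the HJB equation gives the optimal control u* = -x, so the ideal critic and actor weights are both 1; the gradient rules below are an illustrative stand-in for the patent's update laws, not the laws themselves.

```python
# Scalar plant  x' = u,  cost  integral(x^2 + u^2) dt.
# With critic V = wc * x^2 and actor u = -wa * x, the ideal weights
# solving the HJB equation are wc = wa = 1.

def train(wc=0.5, wa=0.5, lr_c=0.05, lr_a=0.05, iters=3000):
    for _ in range(iters):
        # HJB residual at a persistently exciting state (normalized x = 1):
        # delta = x^2 + u^2 + (dV/dx) * x'
        delta = 1.0 + wa ** 2 - 2.0 * wc * wa
        wc += lr_c * delta * 2.0 * wa      # critic: descend Bellman error
        wa += lr_a * (wc - wa)             # actor: track u = -0.5 dV/dx
    return wc, wa

wc, wa = train()
print(f"critic weight {wc:.3f}, actor weight {wa:.3f} (ideal: 1, 1)")
```

Both weights converge to the analytically optimal value 1, mirroring how the critic descends the Bellman (HJB) residual while the actor tracks the control implied by the critic's value estimate.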
The optimized robust steering controller is designed with a reinforcement learning algorithm based on the critic-actor mechanism together with an identifier approximator based on a radial basis function neural network. In the first stage, a robust steering controller based on reinforcement learning is designed with backstepping control, building on a reference path model, a vehicle dynamics model and a kinematic model, so that the lateral path tracking error is suppressed, unknown external disturbances are rejected, and the yaw stability of the autonomous vehicle is guaranteed. In the second stage, an adaptive control mechanism based on Lyapunov stability theory is combined with a radial basis function neural network, which can approximate any continuous nonlinear function, to compensate for the uncertainty of the tire cornering stiffness and guarantee the global asymptotic stability of the closed-loop system; the system architecture is shown in fig. 3.
As one implementation, the control method is applicable to the following adaptive lateral motion system, shown in fig. 4, which mainly comprises a steering wheel assembly and a steering execution assembly. The steering wheel assembly is provided with a steering column 10; a steering wheel 1 is fixed at the upper end of the steering column 10, a hand wheel angle sensor 2 is sleeved on the steering column 10, and a road-feel feedback motor 3 is connected at the bottom end of the steering column 10. The steering motor 4 is arranged at the top of the steering transmission shaft 14, the gear angle sensor 5 is sleeved in the middle of the steering transmission shaft 14, one end of the rack-and-pinion steering gear 7 is connected with the steering transmission shaft 14, the other end of the rack-and-pinion steering gear 7 is connected with the tie rod 8, the two ends of the tie rod 8 are each connected with a steering knuckle arm 13 through a ball head 12, and the other end of each steering knuckle arm 13 is connected to a front wheel 6. The hand wheel angle sensor 2 measures the angle through which the driver turns the steering wheel 1 and transmits the signal to the main controller 9; the main controller 9 transmits road condition information to the road-feel feedback motor 3 in the form of an electric signal, the road-feel feedback motor 3 drives the steering column 10 to rotate, the steering wheel 1 rotates with it, and the driver feels the condition of the road surface.
According to the adaptive motion control method provided by the embodiments of this specification, an optimized motion control strategy is obtained through adaptive learning and the control signal is input to the main controller 9. The steering motor 4 receives the steering signal sent by the main controller 9 and acts accordingly: the steering motor 4 drives the steering transmission shaft 14 to rotate, the steering transmission shaft 14 drives the rack-and-pinion steering gear 7, the rack-and-pinion steering gear 7 drives the tie rod 8 to move left and right, the tie rod 8 drives the steering knuckle arm 13 through the ball head 12, and the steering knuckle arm 13 steers the front wheels 6. The gear steering angle measured by the gear angle sensor 5 is transmitted back to the main controller 9, thereby closing the loop.
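The closed loop just described (command in, gear-angle sensor back to the main controller) can be sketched as a simple regulation loop; the gain, rate limit and actuator model are assumptions for illustration only, not the patent's controller.

```python
def simulate(delta_cmd=0.1, kp=8.0, dt=0.001, steps=2000):
    """Closed loop: gear angle sensor -> main controller -> steering
    motor (modeled as a rate-limited integrator) -> front wheel angle."""
    delta = 0.0
    for _ in range(steps):
        u = kp * (delta_cmd - delta)      # main controller command
        rate = max(-1.0, min(1.0, u))     # actuator rate limit [rad/s]
        delta += dt * rate
    return delta

final = simulate()
print(f"front wheel angle after 2 s: {final:.4f} rad (command 0.1)")
```

The front-wheel angle converges to the commanded 0.1 rad, mirroring the sensor-to-controller feedback loop that fig. 4 describes.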
Those of ordinary skill in the art will appreciate that: the discussion of any of the embodiments above is merely exemplary and is not intended to suggest that the scope of the invention (including the claims) is limited to these examples; the technical features of the above embodiments or in the different embodiments may also be combined within the idea of the invention, the steps may be implemented in any order and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity.
The present invention is intended to embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omission, modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included in the scope of protection of the present invention.

Claims (5)

1. An automatic driving automobile motion control method based on reinforcement learning, which is characterized by comprising the following steps:
S1, establishing an autonomous control automobile system dynamic model;
S2, establishing a steering system model of the automobile;
S3, combining an autonomous control automobile system dynamic model and a steering system model to obtain a man-machine control mapping model of torque input;
S4, selecting state variables of the man-machine control mapping model, designing an adaptive optimizing controller based on the backstepping technique and reinforcement learning, and performing autonomous vehicle motion control with the resulting optimized control strategy, which is:
where x is the system state, n is the system order, the design gains are positive constants, z_i is the tracking error, α_i* represents the optimized virtual control law, u* represents the optimized actual control law, S_a(Z) and S_c(Z) are basis function vectors, ε_a and ε_c are bounded approximation errors, Z represents the system variables, W_a is the actor neural network weight, W_c is the critic neural network weight, and W_d is the identifier neural network weight; the adaptive update law of the critic-actor mechanism controller is:
where γ_a and γ_c are positive constants;
The identifier update law for the disturbance and nonlinear term approximation is:
where Γ_d is a positive constant matrix and σ_d is a constant.
2. The reinforcement learning based automatic driving car motion control method of claim 1, wherein the establishing an autonomous control car system dynamic model comprises:
s11, establishing a two-degree-of-freedom vehicle dynamics model:
wherein, Is the roll angle of the vehicle body,/>For the yaw rate of the vehicle body,/>For the total mass of the vehicle,/>For longitudinal speed,/>Is yaw moment of inertia,/>And/>The distance between the center of gravity and the front and rear axes,/>, respectivelyAnd/>Tire side forces of the front and rear axles, respectively;
s12, converting the vehicle dynamics state parameters into state parameters related to the reference track, and using the prediction error Incorporating lateral error/>And yaw error/>The obtained vehicle kinematic model is as follows:
wherein, Is the transverse velocity,/>Is the heading of the carrier,/>Is the heading of the reference path,/>Distance of constant projection,/>Representing the distance along the reference path;
s13, according to the pair And to the/>, of formula (1.5)And/>And (3) deriving to obtain:
wherein, Representing the curvature of the path;
S14, eliminating prediction error And oscillation of yaw rate, the following relationship is obtained:
S15, substituting formula (1.4) into formula (1.7) yields:
S16, under unknown external force disturbance, the tire lateral forces of the front and rear axles are calculated as:
wherein F_cf and F_cr are the tire cornering forces of the front and rear axles, respectively; ΔF_f and ΔF_r are the unknown lateral external force disturbances on the front and rear axles, respectively; α_f and α_r are the tire slip angles of the front and rear axles, respectively; C_f is the front-axle tire cornering stiffness; C_r is the rear-axle tire cornering stiffness; and μ is the adhesion coefficient between the tire and the road surface;
The slip angles of the front and rear tires satisfy:
wherein δ_f is the front wheel steering angle and v is the vehicle speed;
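The slip-angle formula is an image in the source; under the usual small-angle bicycle-model geometry (symbols as assumed in the surrounding text, with v_y the lateral velocity and r the yaw rate) it commonly takes the form:

```latex
\alpha_f = \delta_f - \frac{v_y + l_f\, r}{v_x} ,\qquad
\alpha_r = -\,\frac{v_y - l_r\, r}{v_x}
```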
The cornering stiffness has a nonlinear characteristic and satisfies:
wherein C_f0 and C_r0 are the nominal cornering stiffnesses of the front and rear axles, and ΔC_f and ΔC_r are the cornering stiffness uncertainties of the front and rear tires, respectively;
S17, combining formulas (1.8)-(1.11) to obtain the nonlinear vehicle-road system model:
wherein S_1, S_2 and S_3 are system variables, and a smooth function represents the equivalent random disturbance of the vehicle.
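As an illustration of the linear part of such a vehicle-road model, the sketch below assembles the standard 2-DOF state matrix from nominal cornering stiffnesses and integrates one explicit-Euler step; all parameter names and values (m, Iz, lf, lr, Cf, Cr, vx) are illustrative assumptions, not values from the patent.

```python
import numpy as np

def bicycle_A(m, Iz, lf, lr, Cf, Cr, vx):
    """State matrix of the standard linear 2-DOF (sideslip, yaw-rate)
    bicycle model obtained by linearizing the tire forces."""
    a11 = -(Cf + Cr) / (m * vx)
    a12 = -1.0 - (Cf * lf - Cr * lr) / (m * vx ** 2)
    a21 = -(Cf * lf - Cr * lr) / Iz
    a22 = -(Cf * lf ** 2 + Cr * lr ** 2) / (Iz * vx)
    return np.array([[a11, a12], [a21, a22]])

# Illustrative passenger-car parameters (assumed).
A = bicycle_A(m=1500.0, Iz=2500.0, lf=1.2, lr=1.4,
              Cf=80000.0, Cr=80000.0, vx=20.0)
x = np.array([0.02, 0.1])         # state: [sideslip beta, yaw rate r]
x_next = x + 0.01 * (A @ x)       # one explicit-Euler step, dt = 0.01 s
```

With these values the open-loop lateral dynamics are stable, so a small sideslip perturbation decays even before control is applied.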
3. The reinforcement learning-based automatic driving car motion control method according to claim 2, wherein building the steering system model of the car comprises:
The steering system model is initially built as follows:
restated as:
wherein J_eq and B_eq are the equivalent moment of inertia and damping of the steering system, respectively; G_m is the reduction ratio of the motor reduction mechanism; G_s is the reduction ratio of the steering system; δ_f is the front wheel angle; T_d is the driver input torque; and T_l is the steering load torque;
The fitting equation is:
Regarding the front wheel angle and its derivative as measurable in real time and T_d as a known value, the terms in formula (1.14) are regarded as a known term and a smooth function, wherein the steering load torque is obtained by the fit of formula (1.15), T_d is measured by a sensor, and the torque fitting error and the measurement error are lumped together; the simplified model of the steering system is then obtained as:
wherein the system variables are defined accordingly.
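As a numeric illustration of a simplified steering model of this kind (a second-order inertia-damping system driven by geared driver torque against a load torque), the sketch below integrates it by explicit Euler; all names and values (Jeq, Beq, gain, Td, Tl) are illustrative assumptions, not patent values.

```python
def simulate_steering(Jeq, Beq, gain, Td, Tl, dt=0.001, steps=1000):
    """Explicit-Euler integration of
    Jeq * dd_delta + Beq * d_delta = gain * Td - Tl,
    returning the front wheel angle and its angular velocity."""
    delta, d_delta = 0.0, 0.0
    for _ in range(steps):
        dd_delta = (gain * Td - Tl - Beq * d_delta) / Jeq
        d_delta += dt * dd_delta
        delta += dt * d_delta
    return delta, d_delta

delta, d_delta = simulate_steering(Jeq=0.05, Beq=0.5, gain=16.0,
                                   Td=2.0, Tl=10.0)
```

For constant inputs the angular velocity settles toward (gain·Td − Tl)/Beq, which makes the steady-state behavior easy to check against the model.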
4. The reinforcement learning-based automatic driving vehicle motion control method according to claim 3, wherein combining the autonomous control vehicle system dynamic model and the steering system model to obtain the man-machine control mapping model of the torque input comprises:
Combining formulas (1.12) and (1.16) yields the man-machine control mapping model of the torque input:
wherein ω_f represents the front wheel angular velocity.
5. The reinforcement learning based autopilot motion control method of claim 4 wherein the process of deriving an optimized control strategy includes:
Selecting the front wheel steering angular velocity, the front wheel steering angle, the projection error and the derivative of the projection error as the state variables of formula (1.17), i.e., as a strict-feedback system, formula (1.17) is converted into:
wherein,
Coordinate transformation is adopted, and a first-order filter is introduced to obtain a tracking error equation:
The first-order filter is designed with the optimal control law as the filter input signal, wherein α is the reference signal, α_f is the filter output signal, τ is a design constant, and the first-order derivative of the filter output signal appears in the filter dynamics;
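The filter equation itself is an image in the source; a discrete sketch of the common dynamic-surface form τ·dα_f/dt + α_f = α is below. The function name, τ and dt values are illustrative assumptions.

```python
def first_order_filter(alpha_seq, tau, dt, alpha_f0=0.0):
    """Explicit-Euler realization of tau * d(alpha_f)/dt + alpha_f = alpha,
    i.e. the low-pass filter used to avoid differentiating the virtual
    control law in dynamic-surface / backstepping designs."""
    alpha_f = alpha_f0
    out = []
    for alpha in alpha_seq:
        alpha_f += dt * (alpha - alpha_f) / tau
        out.append(alpha_f)
    return out

# Step reference: the output rises smoothly toward the input value.
filtered = first_order_filter([1.0] * 2000, tau=0.05, dt=0.001)
```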
Finite-time convergence is introduced into the controller design as a constraint, i.e., the control objective is achieved within a finite time T, wherein T satisfies:
wherein the parameters are all constants and V(0) is the initial value of the Lyapunov function;
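The settling-time condition is an image in the source; a standard finite-time bound of the kind referred to (with λ and b assumed constants and V(0) the initial Lyapunov value) is:

```latex
\dot V(t) \le -\lambda\, V(t)^{\,b},\;\; 0<b<1
\quad\Longrightarrow\quad
T \le \frac{V(0)^{\,1-b}}{\lambda\,(1-b)}
```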
For the fourth-order system of formula (1.18), take α_i as the virtual control law of the i-th step; the optimal performance index function is obtained as:
wherein,
Letting α_i* be the optimal virtual controller, one obtains:
wherein Ω is a predefined compact set;
Regarding α_i* as the optimal virtual control signal, the HJB equation corresponding to formula (1.21) is obtained as:
wherein the optimal virtual control is obtained by solving the stationarity condition of the HJB equation, i.e., by setting the partial derivative of the Hamiltonian with respect to the control to zero;
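The HJB equation and its stationarity condition are images in the source; for a scalar error subsystem ż = f(z) + α with cost integrand z² + α² (a simplified illustration, not the exact claimed form), they read:

```latex
H = z^{2} + \alpha^{*2} + \frac{\partial J^{*}}{\partial z}\bigl(f(z)+\alpha^{*}\bigr)=0 ,
\qquad
\frac{\partial H}{\partial \alpha^{*}} = 2\alpha^{*} + \frac{\partial J^{*}}{\partial z} = 0
\;\Rightarrow\;
\alpha^{*} = -\tfrac{1}{2}\,\frac{\partial J^{*}}{\partial z}
```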
The gradient of the optimal performance index function is decomposed as:
wherein the remaining terms are unknown continuous functions;
Approximating the unknown continuous functions with neural networks, one has:
wherein W_a* and W_c* are the ideal neural network weights, S_a and S_c are basis function vectors, and ε_a and ε_c are bounded approximation errors;
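Basis function vectors such as S_a and S_c are typically Gaussian radial basis functions in RBF-neural-network approximators; the sketch below builds such a vector, with the centers and width chosen purely for illustration.

```python
import numpy as np

def rbf_basis(x, centers, width):
    """Gaussian radial-basis-function vector S(x): one Gaussian bump
    per center, evaluated at the (possibly vector-valued) input x."""
    x = np.atleast_1d(x)
    return np.exp(-np.sum((x - centers) ** 2, axis=1) / (2.0 * width ** 2))

centers = np.linspace(-2.0, 2.0, 5).reshape(-1, 1)  # assumed node layout
S = rbf_basis(0.0, centers, width=1.0)
approx = S @ np.ones(5)   # NN output W^T S(x) with unit weights
```

The approximation W^T S(x) is then linear in the weights W, which is what makes adaptive update laws for W tractable.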
Substituting formula (1.24) into formula (1.23) gives:
wherein
The corresponding optimized controller is then obtained; reinforcement learning with an identifier-critic-actor structure is introduced, and the identifier used for approximation is designed as follows:
wherein Ŝ is the identifier output and W_d is the identifier neural network weight;
The identifier update law is constructed as follows:
wherein Γ_d represents a positive constant matrix and σ_d is a constant;
Based on the critic-actor architecture and formula (1.25), the critic that evaluates control performance is constructed as:
wherein Ĵ is the estimate of the optimal performance index function and W_c is the critic neural network weight;
According to formula (1.28), the actor that executes the control action is designed as:
wherein α̂ and û are the optimized virtual control law and the optimized actual control law, respectively, and W_a is the actor neural network weight;
The update laws of the critic and actor neural network weights are:
wherein γ_c and γ_a are positive constants.
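The update-law formulas above are images in the source; the sketch below discretizes the commonly used optimized-backstepping form (an assumed standard form, not necessarily the exact claimed law) and shows that the projections of both weight estimates onto the basis vector converge. The gains, basis vector and seed are illustrative assumptions.

```python
import numpy as np

# Assumed continuous-time laws being discretized:
#   dWc/dt = -gamma_c * S S^T Wc
#   dWa/dt = -S S^T [ gamma_a (Wa - Wc) + gamma_c Wc ]
# S is held fixed here purely to make the convergence visible.
rng = np.random.default_rng(0)
S = np.array([0.2, 0.5, 0.3])          # frozen basis-function vector
Wc = rng.normal(size=3)                # critic weight estimate
Wa = rng.normal(size=3)                # actor weight estimate
gamma_c, gamma_a, dt = 2.0, 5.0, 0.01

for _ in range(5000):
    dWc = -gamma_c * S * (S @ Wc)      # S S^T Wc = S * (S @ Wc)
    dWa = -S * (gamma_a * (S @ (Wa - Wc)) + gamma_c * (S @ Wc))
    Wc = Wc + dt * dWc
    Wa = Wa + dt * dWa
```

Only the components of Wc and Wa along S are adapted, so S @ Wc and S @ Wa decay toward zero while the orthogonal components remain untouched.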
CN202410315976.1A 2024-03-20 2024-03-20 Automatic driving automobile motion control method based on reinforcement learning Pending CN117911414A (en)


Publications (1)

Publication Number Publication Date
CN117911414A true CN117911414A (en) 2024-04-19

Family

ID=90686199


Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019031268A (en) * 2017-05-12 2019-02-28 トヨタ モーター エンジニアリング アンド マニュファクチャリング ノース アメリカ,インコーポレイティド Control policy learning and vehicle control method based on reinforcement learning without active exploration
US20190266489A1 (en) * 2017-10-12 2019-08-29 Honda Motor Co., Ltd. Interaction-aware decision making
US20190291726A1 (en) * 2018-03-20 2019-09-26 Mobileye Vision Technologies Ltd. Systems and methods for navigating a vehicle
US20200241542A1 (en) * 2019-01-25 2020-07-30 Bayerische Motoren Werke Aktiengesellschaft Vehicle Equipped with Accelerated Actor-Critic Reinforcement Learning and Method for Accelerating Actor-Critic Reinforcement Learning
CN112026763A (en) * 2020-07-23 2020-12-04 南京航空航天大学 Automobile track tracking control method
CN113320542A (en) * 2021-06-24 2021-08-31 厦门大学 Tracking control method for automatic driving vehicle
WO2021248641A1 (en) * 2020-06-10 2021-12-16 北京理工大学 Multi-sensor information fusion-based model adaptive lateral velocity estimation method
CN114379583A (en) * 2021-12-10 2022-04-22 江苏大学 Automatic driving vehicle trajectory tracking system and method based on neural network dynamics model
US20220219691A1 (en) * 2018-03-04 2022-07-14 Traxen Inc. Automated cruise control system
CN115097736A (en) * 2022-08-10 2022-09-23 东南大学 Active disturbance rejection controller parameter optimization method based on deep reinforcement learning
CN115202341A (en) * 2022-06-16 2022-10-18 同济大学 Transverse motion control method and system for automatic driving vehicle
US20220363279A1 (en) * 2021-04-21 2022-11-17 Foundation Of Soongsil University-Industry Cooperation Method for combating stop-and-go wave problem using deep reinforcement learning based autonomous vehicles, recording medium and device for performing the method
WO2023102962A1 (en) * 2021-12-06 2023-06-15 深圳先进技术研究院 Method for training end-to-end autonomous driving strategy
CN117008995A (en) * 2023-07-13 2023-11-07 广东工业大学 Industrial software component service function chain assembly integration method
CN117246299A (en) * 2023-10-09 2023-12-19 清华大学 Auxiliary emergency braking method and system after braking failure
CN117580063A (en) * 2023-08-03 2024-02-20 北京邮电大学 Multi-dimensional resource collaborative management method in vehicle-to-vehicle network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XU CAN等: "An actor-critic based learning method for decision-making and planning of autonomous vehicles", 《SCIENCE CHINA TECHNOLOGICAL SCIENCES》, vol. 64, no. 5, 31 May 2021 (2021-05-31), pages 984 - 994, XP037445340, DOI: 10.1007/s11431-020-1729-2 *
ZHU YADONG: "Research on Resource Allocation Methods for a Cloud-Edge Collaborative Smart Grid Fault Monitoring System Based on Deep Reinforcement Learning", 《China Masters' Theses Full-text Database, Engineering Science and Technology II》, no. 02, 15 February 2022 (2022-02-15), pages 042 - 928 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination