CN114228690B - Automatic driving vehicle roll control method based on DDPG and iterative control - Google Patents

Automatic driving vehicle roll control method based on DDPG and iterative control

Info

Publication number
CN114228690B
Authority
CN
China
Prior art keywords
vehicle
road
error
control
action
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111353270.7A
Other languages
Chinese (zh)
Other versions
CN114228690A (en
Inventor
唐晓峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangzhou University
Original Assignee
Yangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yangzhou University filed Critical Yangzhou University
Priority to CN202111353270.7A priority Critical patent/CN114228690B/en
Publication of CN114228690A publication Critical patent/CN114228690A/en
Application granted granted Critical
Publication of CN114228690B publication Critical patent/CN114228690B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W10/00 Conjoint control of vehicle sub-units of different type or different function
    • B60W10/20 Conjoint control of vehicle sub-units of different type or different function including control of steering systems
    • B60W10/04 Conjoint control of vehicle sub-units of different type or different function including control of propulsion units
    • B60W10/06 Conjoint control of vehicle sub-units of different type or different function including control of propulsion units including control of combustion engines
    • B60W10/18 Conjoint control of vehicle sub-units of different type or different function including control of braking systems
    • B60W40/00 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/10 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to vehicle motion
    • B60W40/105 Speed
    • B60W50/00 Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W50/0098 Details of control systems ensuring comfort, safety or stability not otherwise provided for
    • B60W2050/0001 Details of the control system
    • B60W2050/0043 Signal treatments, identification of variables or parameters, parameter estimation or state estimation
    • B60W2520/00 Input parameters relating to overall vehicle dynamics
    • B60W2520/10 Longitudinal speed
    • B60W2520/14 Yaw
    • B60W2520/18 Roll
    • B60W2552/00 Input parameters relating to infrastructure
    • B60W2552/50 Barriers
    • B60W2552/53 Road markings, e.g. lane marker or crosswalk
    • B60W2554/00 Input parameters relating to objects
    • B60W2554/40 Dynamic objects, e.g. animals, windblown objects
    • B60W2554/402 Type
    • B60W2554/4026 Cycles
    • B60W2554/4029 Pedestrians
    • B60W2554/404 Characteristics
    • B60W2554/4041 Position
    • B60W2554/4042 Longitudinal speed
    • B60W2554/80 Spatial relation or speed relative to objects
    • B60W2555/00 Input parameters relating to exterior conditions, not covered by groups B60W2552/00, B60W2554/00
    • B60W2555/20 Ambient conditions, e.g. wind or rain
    • B60W2710/00 Output or target parameters relating to a particular sub-units
    • B60W2710/06 Combustion engines, Gas turbines
    • B60W2710/0605 Throttle position
    • B60W2710/18 Braking system
    • B60W2710/20 Steering systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Chemical & Material Sciences (AREA)
  • Combustion & Propulsion (AREA)
  • Automation & Control Theory (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Control Of Driving Devices And Active Controlling Of Vehicle (AREA)

Abstract

The invention discloses an automatic driving vehicle roll control method based on DDPG and iterative control. A DDPG algorithm is trained on driving maps of the automatic driving vehicle in different scenes; the vehicle generates real-time vehicle states by interacting with the map environment of each scene, and these states determine its actions. During action training, the action space is initialized, the online policy network in the actor network generates the state space information and outputs an action, and action noise is added so that the action space becomes exploratory. A predicted path of the vehicle state is generated from the LSTM's historical memory and the road planning attributes; the DDPG algorithm performs path tracking control under both normal and extreme driving conditions, and an iterative control method provides compensation control. The invention avoids the instability of the reinforcement learning algorithm when the vehicle drives in extreme road environments and improves driving safety and robustness.

Description

Automatic driving vehicle roll control method based on DDPG and iterative control
Technical Field
The invention belongs to the field of intelligent vehicle control, and particularly relates to an automatic driving vehicle roll control method based on DDPG and iterative control.
Background
With the development of artificial intelligence, automatic driving technology has advanced considerably. Current deployments concentrate on closed scenes such as campuses and logistics industry parks, and are especially common in harbour road environments with fewer structured road features, pedestrians and vehicles. An automatic driving vehicle achieves intelligent operation through environment sensing, navigation, map positioning, decision making, motion planning and trajectory tracking control. However, when the vehicle drives in complex weather and a complex driving environment such as a cross-sea bridge, severe weather affects the road condition of the bridge and can cause the vehicle to swerve, sideslip or roll over: rain, snow and wind change the road adhesion coefficient, the tires slip, and the accuracy of path tracking, lane keeping and vehicle control degrades. In addition, the bridge may vibrate in windy weather, which can make the vehicle roll and lead to loss of control. Control is therefore a complex task when the vehicle is subject to uncertainties such as sideslip on a wet road surface, yaw caused by bridge vibration and roll dynamics at high speed, and the overall control design must also account for road vibration characteristics, road angle and aerodynamic characteristics. Research on driving safety and stability based on roll control of automatic driving vehicles in severe weather is therefore a key technology.
Reinforcement learning is a branch of artificial intelligence in which an agent explores an unknown dynamic environment, tries different actions and interacts with that environment, without needing an accurate vehicle model or a given description of the surroundings. The agent learns the unknown environment and captures complex vehicle dynamics through the actions and states of its interaction, which makes reinforcement learning a new way to handle both dynamic road environment cognition and complex vehicle dynamics. Using a reinforcement learning algorithm to control an automatic driving vehicle under cross-sea bridge road conditions therefore helps achieve intelligent vehicle safety and the large-scale industrial development of automatic driving vehicles.
Disclosure of Invention
Purpose of the invention: the invention provides an automatic driving vehicle roll control method based on DDPG and iterative control that enables a vehicle to drive safely across cross-sea bridges in environments of any complexity level, and that raises the vehicle's level of intelligence by controlling its behavior under instantaneous roll angles.
Technical solution: the invention provides an automatic driving vehicle roll control method based on DDPG and iterative control, which specifically comprises the following steps:
(1) Install lidar, vision, millimeter-wave radar and ultrasonic radar sensors, a positioning system and an inertial navigation system on the automatic driving vehicle;
(2) Use the vision sensor, the positioning system and the inertial navigation system to obtain the vehicle's position and map in each of the different scenes, so as to generate driving maps of the automatic driving vehicle for those scenes and provide the environment required for the vehicle's driving trajectory;
(3) Drive the vehicle on a cross-sea bridge while controlling the steering wheel, accelerator and brake pedal, record the corresponding driving tracks in rain and snow, in strong wind and severe weather, and on sunny days, and construct a data set;
(4) Train the DDPG algorithm on the driving maps of the automatic driving vehicle in the different scenes, for driving on the cross-sea bridge under different levels of road condition complexity in severe weather. The automatic driving vehicle generates real-time vehicle states by interacting with the map environment of each scene, and these states determine its actions. During action training, the action space is initialized, the online policy network in the actor network generates the state space information and outputs an action, and action noise is added so that the action space becomes exploratory;
(5) Generate a predicted path of the vehicle state from the LSTM's historical memory and the road planning attributes, use the DDPG algorithm for path tracking control of the automatic driving vehicle under normal driving road conditions and extreme driving road conditions, and use an iterative control method for compensation control of the automatic driving vehicle.
Further, the lidar sensor in step (1) is used to detect dynamic and static obstacles on the road, including pedestrians, motorcycles and vehicles of various kinds, as well as the drivable road area; the vision sensor is used to perceive lane lines, pedestrians and vehicles and to perform simultaneous localization and mapping; the millimeter-wave radar sensor is used to measure the distance between the vehicle and pedestrians and between the vehicle and moving vehicles; the ultrasonic radar is used to measure the distance to nearby vehicles; the vision sensor, the positioning system and the inertial navigation system together implement vehicle positioning.
Further, the different scenes in step (2) comprise five scenes: cross-sea bridge road conditions in rain and snow, cross-sea bridge road conditions in strong wind and severe weather, road conditions when the bridge vibrates on a sunny day, road conditions with a single vehicle driving in frequently changing weather, and road conditions with multiple vehicles driving in frequently changing weather.
Further, the data set of step (3) includes vehicle speed, travel track, vehicle position, heading angle, slip angle, yaw rate and roll angle.
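As an illustration of how one record of such a data set might be laid out, the following Python sketch groups the listed quantities into a per-time-step sample and a per-weather track. The class names, field names and units are assumptions made for this example; the source only lists the quantities.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DrivingSample:
    """One logged time step. Field names and units are illustrative assumptions."""
    speed_kmh: float                 # vehicle speed
    position: Tuple[float, float]    # vehicle position (x, y) in the map frame
    heading_rad: float               # heading angle
    slip_angle_rad: float            # slip angle
    yaw_rate_rad_s: float            # yaw rate
    roll_angle_rad: float            # roll angle


@dataclass
class DrivingTrack:
    """A travel track recorded under one weather condition."""
    weather: str                     # e.g. "rain_snow", "strong_wind", "sunny"
    samples: List[DrivingSample]

    def max_abs_roll(self) -> float:
        # Largest roll-angle magnitude observed along the track.
        return max(abs(s.roll_angle_rad) for s in self.samples)
```

A track summary such as `max_abs_roll` is one way the recorded roll angles could later be compared across weather conditions.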
Further, the network design of the DDPG algorithm in step (4) is as follows:
An actor network is constructed that takes the vehicle state and the environment state as input and outputs a vector of steering angle, accelerator and brake signals, which correspond to the 3 neurons of the actor policy network's output layer. The activation function for the accelerator and brake outputs is Sigmoid, and the activation function for the steering output is Tanh. The hidden layers are structured as follows: the first layer is a convolution with kernel size 7×7, filter size 48 and stride 4, with 200 neurons in total; the second layer is a convolution with kernel size 5×5, filter size 16 and stride 2, with a ReLU activation function and 400 neurons in total; the third layer is an LSTM layer with 100 neurons; the fourth layer is a fully connected layer with 128 units; the fifth layer is a fully connected layer with 128 units. The critic network takes the state and action spaces as input; they are concatenated and passed through two hidden layers with ReLU activation, 200 neurons in the first layer and 400 in the second, to obtain the Q value. Define $h_i \in (S_{t-T}, S_{t-T+1}, \ldots, S_t)$, where $S_{t-T}$ and $S_t$ denote the state information at the historical time $t-T$ and at the current time $t$, respectively. The encoded state is $s = f(h_i; \beta)$, and the policy of the modified actor network is defined as $a = \mu(h_i/\beta, \gamma_\pi) + \eta$.
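The output stage described above (Tanh for steering, Sigmoid for accelerator and brake, plus an additive noise term $\eta$) can be sketched in a few lines of Python. This is a minimal sketch of the output head only, not of the convolutional and LSTM layers. The Gaussian form of the noise, the clipping step and the function names are assumptions; the source states only that an action noise is added.

```python
import math
import random
from typing import List, Sequence


def sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def actor_output_head(pre_activations: Sequence[float]) -> List[float]:
    """Map the 3 raw output-layer values to [steering, throttle, brake]."""
    steer_raw, throttle_raw, brake_raw = pre_activations
    return [
        math.tanh(steer_raw),   # Tanh: steering in (-1, 1)
        sigmoid(throttle_raw),  # Sigmoid: throttle in (0, 1)
        sigmoid(brake_raw),     # Sigmoid: brake in (0, 1)
    ]


def explore(action: Sequence[float], sigma: float, rng: random.Random) -> List[float]:
    """Add noise eta to the policy output, then clip to the valid ranges.

    Gaussian noise and clipping are assumptions made for this sketch.
    """
    lows, highs = (-1.0, 0.0, 0.0), (1.0, 1.0, 1.0)
    noisy = [a + rng.gauss(0.0, sigma) for a in action]
    return [min(max(n, lo), hi) for n, lo, hi in zip(noisy, lows, highs)]
```

Clipping after adding the noise keeps the explored action inside the range that the activation functions would otherwise guarantee.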
Further, the path tracking control of the automatic driving vehicle under normal driving road conditions in step (5) is implemented as follows:
Under normal driving road conditions, namely the road conditions when the bridge vibrates on a sunny day, a vehicle dynamics model is established that accounts for the roll, sideslip and yaw dynamics of the vehicle. Vehicle state constraints are set, and the lateral stability range, the maximum steering angle range and the allowable vehicle control range for preventing roll are determined so as to reduce the lateral offset error of the vehicle:
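$\omega_{z\text{-}min} \le \omega_z \le \omega_{z\text{-}max}$, $\omega_{x\text{-}min} \le \omega_x \le \omega_{x\text{-}max}$, $u_{x\text{-}min} \le u_x \le u_{x\text{-}max}$, $e_{r\text{-}x\text{-}min} \le e_r \le e_{r\text{-}x\text{-}max}$
where $\omega_z$ is the yaw rate; $\omega_x$ is the roll angle of the vehicle; $u_x$ is the steering angle; $e_r$ is the lateral tracking offset error;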
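The four box constraints above amount to a range check on each quantity. The following Python sketch shows one way to represent and test them; the numeric limits and the dictionary keys are hypothetical values chosen for illustration, since the source gives no numbers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Bounds:
    lo: float
    hi: float

    def contains(self, value: float) -> bool:
        return self.lo <= value <= self.hi

    def clamp(self, value: float) -> float:
        # Project a value back into [lo, hi].
        return min(max(value, self.lo), self.hi)


# Hypothetical limits for illustration only; the source gives no numbers.
LIMITS: Dict[str, Bounds] = {
    "yaw_rate": Bounds(-0.5, 0.5),       # omega_z
    "roll_angle": Bounds(-0.1, 0.1),     # omega_x
    "steering": Bounds(-0.6, 0.6),       # u_x
    "lateral_error": Bounds(-0.5, 0.5),  # e_r
}


def violated(state: Dict[str, float]) -> List[str]:
    """Return the names of constrained quantities outside their range."""
    return [name for name, b in LIMITS.items() if not b.contains(state[name])]
```

For example, a state with a roll angle of 0.15 under these assumed limits would be reported as violating only the `roll_angle` constraint.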
According to the road state information predicted by the LSTM, an objective function is constructed that accounts for the steering angle, tire adhesion coefficient, roll angle error and path tracking error. With the dynamics constraint of the vehicle's maximum allowable error fully taken into account, the physical constraints of the vehicle determined by the lateral tracking offset error are established so as to reduce the tracking control error of the vehicle:
Figure BDA0003356573650000031
where $w_1, w_2, w_3, w_4$ are parameter variables; $\mu_r$ is the road adhesion coefficient;
An iterative control algorithm provides compensation control of vehicle roll. A reference vehicle state, a reference control input and a reference output value are set so that tracking is maintained under multiple constraints, namely the vehicle's physical constraints and the road constraints, which improves the vehicle's disturbance rejection while driving and reduces the model error rate;
The state space, action space and reward function required by the DDPG algorithm are constructed according to the driving conditions of the vehicle. The action space mainly comprises the steering wheel angle, throttle and brake signals; the state space comprises the vehicle's lateral tracking error and its rate of change, the roll angle error and its rate of change, and the yaw rate error and its rate of change. For the reward function: when the cross-sea bridge vibrates, the actual trajectory of the vehicle changes and the road is inclined at a certain angle, so the reward function is set equal to the cumulative product of the discount factor and the change in speed.
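The six-component state described above (three errors and their rates of change) can be assembled as in the following Python sketch. The ordering of the components, the backward-difference estimate of each rate and the function name are assumptions; the source does not say how the rates are computed.

```python
from typing import List, Tuple

# (lateral tracking error, roll angle error, yaw rate error)
Errors = Tuple[float, float, float]


def build_state(prev: Errors, curr: Errors, dt: float) -> List[float]:
    """Return [e_y, de_y, e_roll, de_roll, e_yaw, de_yaw].

    Rates are estimated with a backward difference over one time step,
    which is an assumption made for this sketch.
    """
    if dt <= 0.0:
        raise ValueError("dt must be positive")
    state: List[float] = []
    for e_prev, e_curr in zip(prev, curr):
        state.append(e_curr)                  # the error itself
        state.append((e_curr - e_prev) / dt)  # its rate of change
    return state
```

Calling `build_state` once per control step with the previous and current error triples yields the vector that would be fed to the actor and critic networks.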
Further, the path tracking control of the automatic driving vehicle under extreme driving road conditions in step (5) is implemented as follows:
Under extreme driving road conditions, that is, when severe weather and other factors affect the road, the vehicle is prone to slipping on the wet surface and to vibration, which changes the driving trajectory and affects the actual driving speed, so that the actual vehicle speed deviates from the planned vehicle speed. The change in the actual vehicle speed is therefore set as follows:
Figure BDA0003356573650000041
where $v_{ref}$ is the reference vehicle speed; $v_{act}$ is the actual vehicle speed; $\alpha$ is an influencing factor;
When the error between the actual vehicle speed and the reference vehicle speed is within 2 km/h, the actual vehicle speed can be assumed equal to the reference vehicle speed; when the error is in the [2 ] km/h interval, the actual vehicle speed is obtained from the reference vehicle speed and the difference between the two speeds, where $\alpha$ is the speed change factor; when the error is greater than 5 km/h, the vehicle must brake immediately to ensure driving safety;
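The three speed-error cases can be read as a small supervisory rule. The Python sketch below is one possible reading, not the patent's formula, which is given only as an image. In particular, the middle-case expression `v_ref + alpha * (v_act - v_ref)`, the treatment of the 2 km/h and 5 km/h thresholds as inclusive or exclusive, and the zero target speed on braking are all assumptions.

```python
from enum import Enum
from typing import Tuple


class SpeedDecision(Enum):
    TRACK_REFERENCE = "track_reference"
    CORRECT = "correct"
    EMERGENCY_BRAKE = "emergency_brake"


def supervise_speed(
    v_ref: float,
    v_act: float,
    alpha: float,
    small_kmh: float = 2.0,
    brake_kmh: float = 5.0,
) -> Tuple[SpeedDecision, float]:
    """Return (decision, speed value used for planning) in km/h."""
    err = abs(v_act - v_ref)
    if err > brake_kmh:
        # Large deviation: brake immediately (target speed 0 is an assumption).
        return SpeedDecision.EMERGENCY_BRAKE, 0.0
    if err <= small_kmh:
        # Small deviation: treat the actual speed as equal to the reference.
        return SpeedDecision.TRACK_REFERENCE, v_ref
    # ASSUMED form: reference speed plus a scaled speed difference.
    return SpeedDecision.CORRECT, v_ref + alpha * (v_act - v_ref)
```

Keeping the thresholds as parameters makes it easy to substitute the exact interval bounds once they are known.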
At time $t$, the set of vehicle actions and states is expressed as $\{v_1, \ldots, v_i, \ldots, v_n\}$, $i = 1, \ldots, n$, and path planning is implemented with Bézier curves to produce a collision-free predicted trajectory. To judge the accuracy of vehicle speed planning, a small sample of vehicle states and actions is taken from the experience buffer of the DDPG algorithm as reference values, $\{v_{r1}, \ldots, v_{ri}, \ldots, v_{rn}\}$, $i = 1, \ldots, n$, and the error between the two is calculated as:
Figure BDA0003356573650000042
where $v_i$ is the vehicle speed; $l$ is the vehicle speed error rate; $v_{ri}$ is the reference vehicle speed obtained from the experience buffer;
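The Bézier-curve path planning mentioned above can be illustrated by evaluating a curve from its control points with De Casteljau's algorithm. This Python sketch covers only curve evaluation; collision checking is not shown, and the control points used in the usage note are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def bezier_point(control: Sequence[Point], t: float) -> Point:
    """Evaluate a Bezier curve at t in [0, 1] with De Casteljau's algorithm."""
    if not control:
        raise ValueError("need at least one control point")
    pts = list(control)
    # Repeatedly interpolate between neighbours until one point remains.
    while len(pts) > 1:
        pts = [
            ((1.0 - t) * x0 + t * x1, (1.0 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(pts, pts[1:])
        ]
    return pts[0]


def sample_path(control: Sequence[Point], n: int) -> List[Point]:
    """Return n evenly spaced (in t) points along the curve, n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return [bezier_point(control, i / (n - 1)) for i in range(n)]
```

With four hypothetical control points such as `[(0.0, 0.0), (10.0, 0.0), (20.0, 3.5), (30.0, 3.5)]`, the sampled curve describes a smooth lateral shift of 3.5 over a longitudinal distance of 30, which is the kind of shape a lane-change segment might take.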
Under extreme driving conditions, the desired state reference trajectory is given as $S_r$ and the output error is $e_k(t) = S_r(t) - S(t)$; the learning law $u_{k+1}(t) = L(u_k(t), e_k(t))$ yields the compensation control action $a_k$. Under normal driving, the DDPG algorithm performs vehicle control and the output action is $a_\pi = \mu(h_i/\beta, \gamma_\pi) + \eta$. The total control action is $a = a_k + a_\pi$, with the reference formula as follows:
Figure BDA0003356573650000051
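To make the learning law $u_{k+1}(t) = L(u_k(t), e_k(t))$ concrete, the following Python sketch uses a P-type iterative learning update, $L(u, e) = u + \text{gain} \cdot e$, on a hypothetical static plant with a disturbance that repeats on every trial. The P-type form, the plant model, the gains and all function names are assumptions; the source does not specify $L$.

```python
from typing import List, Sequence


def ilc_update(u_k: Sequence[float], e_k: Sequence[float], gain: float) -> List[float]:
    """P-type learning law (assumed): u_{k+1}(t) = u_k(t) + gain * e_k(t)."""
    return [u + gain * e for u, e in zip(u_k, e_k)]


def run_trial(u: Sequence[float], plant_gain: float, disturbance: Sequence[float]) -> List[float]:
    """Hypothetical static plant: S(t) = plant_gain * u(t) + d(t)."""
    return [plant_gain * ui + di for ui, di in zip(u, disturbance)]


def learn_compensation(
    s_ref: Sequence[float],
    disturbance: Sequence[float],
    plant_gain: float = 0.8,
    gain: float = 0.5,
    trials: int = 30,
) -> List[float]:
    """Iterate over repeated trials and return the learned input profile."""
    u = [0.0] * len(s_ref)
    for _ in range(trials):
        s = run_trial(u, plant_gain, disturbance)
        e = [r - y for r, y in zip(s_ref, s)]  # e_k(t) = S_r(t) - S(t)
        u = ilc_update(u, e, gain)
    return u


def total_action(a_k: float, a_pi: float) -> float:
    """Total control action a = a_k + a_pi."""
    return a_k + a_pi
```

For this plant the error shrinks by the factor $|1 - \text{plant\_gain} \cdot \text{gain}|$ on every trial, so the update converges whenever that factor is below 1. The learned value at each time step then plays the role of the compensation action $a_k$ that is added to the DDPG action $a_\pi$.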
To judge the accuracy of path planning, the trajectory error between the vehicle's actual driving trajectory and the reference trajectory when the roll angle changes is calculated as:
Figure BDA0003356573650000052
where $r_{act}$ is the actual trajectory of the vibrating vehicle; $r_{ref}$ is the reference trajectory;
Figure BDA0003356573650000053
is the maximum angle difference between the actual trajectory and the reference trajectory; $\mu$ is the road adhesion coefficient; $\sigma$ is the deviation angle when the vehicle sideslips laterally; $\phi$ is the vehicle roll angle; $d$ is the vertical vibration distance; $\chi$ is the deviation angle when the bridge vibrates vertically, that is, the vertical inclination angle of the bridge deck;
The maximum vertical inclination angle $\chi$ produced by bridge vibration can be set to:
Figure BDA0003356573650000054
An optimal road driving area is searched for on the high-precision map built from the vision sensor; within that area, the reference vehicle state, control input values and parameter output values are designed, constraint conditions on multiple types of states are designed, and roll control of the vehicle is realized with an iterative learning control algorithm;
A limited road driving area is found from the high-precision map built from the vision sensor and the states predicted by the LSTM, and ranges of layered uncertain state parameters and action parameters are designed. When the vehicle drives in the limited road driving area, DDPG and the iterative control algorithm together realize roll control of the vehicle, with the iterative control algorithm providing compensation. A reward function is constructed according to the constraint ranges of the vehicle state, as follows:
$R = v \cdot (R_1 + R_2 + \cdots + R_6)$
Figure BDA0003356573650000055
where $R_1$ and $R_2$ represent the lateral distance error and its rate of change; $R_3$ and $R_4$ represent the lateral angular velocity and its rate of change; $R_5$ and $R_6$ represent the roll angle and its rate of change; $x$ is an angle value; $v$ is the vehicle speed; $e_y$ is the lateral distance; $k_i$ and $k_j$ are the reward factors.
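The aggregate reward $R = v \cdot (R_1 + R_2 + \cdots + R_6)$ can be written as a short Python function. Because the expression for the individual terms $R_i$ is available only as an image, this sketch takes the six terms as already computed inputs; the function name and the input check are assumptions.

```python
from typing import Sequence


def total_reward(v: float, terms: Sequence[float]) -> float:
    """Compute R = v * (R_1 + ... + R_6) from six precomputed terms."""
    if len(terms) != 6:
        raise ValueError("expected six reward terms R_1..R_6")
    return v * sum(terms)
```

Scaling the summed terms by the vehicle speed means that the same tracking quality earns more reward at higher speed, and that any negative term is penalized more strongly as speed increases.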
Beneficial effects: compared with the prior art, the invention has the following beneficial effects: 1. The invention designs a comprehensive roll control method for automatic driving vehicles based on a reinforcement learning algorithm (DDPG); reinforcement learning controls vehicle roll in complex road environments, and through exploration and exploitation the automatic driving vehicle can drive exploratorily under complex road conditions and extreme weather. 2. For extreme driving conditions, iterative learning control compensates the DDPG algorithm, which achieves a comprehensive control effect and ensures that the vehicle ultimately drives safely.
Drawings
Fig. 1 is a schematic diagram of the DDPG network architecture;
Fig. 2 is a schematic diagram of the integrated control of vehicle roll;
Fig. 3 is a flow chart of the integrated roll control of the automatic driving vehicle.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings.
The invention provides a roll control method for an autonomous vehicle based on DDPG and iterative control, which comprises the following steps:
Step 1: Lidar, vision, millimeter-wave radar and ultrasonic radar sensors, a positioning system and an inertial navigation system are installed on the autonomous vehicle.
The invention targets the road environment of a cross-sea bridge. Its aim is to keep the vehicle driving safely at low to medium speed (5-80 km/h) and to reach a high level of vehicle intelligence by controlling the vehicle's behavior during instantaneous roll. To this end, the invention mounts several lidar, machine vision, millimeter-wave radar and ultrasonic radar sensors on the autonomous vehicle, together with a positioning system, an inertial navigation system (IMU) and the like. The lidar sensors detect dynamic and static obstacles on the road, including pedestrians, motorcycles and vehicles of various kinds, as well as the drivable road area; the machine vision sensors perceive lane lines, pedestrians and vehicles and perform simultaneous localization and mapping; the millimeter-wave radar sensors measure the distance between the vehicle and pedestrians and between the vehicle and moving vehicles; the ultrasonic radar measures the distance to vehicles at short range; the positioning system and the inertial navigation system (IMU) provide vehicle localization.
Step 2: The vision sensor, the positioning system and the inertial navigation system are used to obtain the vehicle's position and map in each of the different scenes, so as to generate driving maps of the autonomous vehicle for the different scenes and to provide the environment required for the vehicle's driving trajectory.
The visual sensor, the positioning system and the inertial navigation system (IMU) are used to obtain the vehicle's position and map on sunny days and in severe weather such as rain, snow, fog and strong wind. This generates autonomous-vehicle driving maps for five scenes: the cross-sea bridge in rain and snow; the cross-sea bridge in strong wind and severe weather; the bridge vibrating on a sunny day; a single vehicle driving in frequently changing weather; and multiple vehicles driving in frequently changing weather. These maps provide the environment required for the vehicle's driving trajectory.
Step 3: The steering wheel, accelerator and pedal are operated to drive on the cross-sea bridge, the corresponding driving trajectories are acquired in rain and snow, in strong wind and severe weather, and on sunny days, and a dataset is constructed.
An experienced driver drives on the cross-sea bridge in each of the five scenes by operating the steering wheel, accelerator and pedal, and the corresponding driving trajectories are recorded to build the datasets. Each dataset includes the vehicle speed, driving trajectory, vehicle position, heading angle, sideslip angle, yaw rate and roll angle, which provide the reference data needed for training on the dataset and for evaluating the controllability of the vehicle.
Step 4: The DDPG algorithm is trained on the driving maps of the autonomous vehicle for the different scenes, for driving on the cross-sea bridge under different grades of complex road conditions in severe weather. The autonomous vehicle generates real-time vehicle states by interacting with the map environments of the different scenes and determines its actions. During action training, the action space is initialized, the online policy network in the actor network generates state space information and outputs an action, and action noise is added to obtain an exploratory action space.
As shown in FIG. 1, an actor network is constructed. It takes the vehicle state and the environment state as input and outputs a vector consisting of the steering angle, throttle and brake, which correspond to the 3 neurons of the output layer of the actor policy network. The activation function for throttle and brake is Sigmoid, and the activation function for the steering action value is Tanh. The hidden layers are structured as follows: the first layer is a convolution with size 7×7, a filter size of 48 and a stride of 4, with 200 neurons in total; the second layer is a convolution with size 5×5, a filter size of 16 and a stride of 2, with a ReLU activation function and 400 neurons in total; the third layer is an LSTM layer with 100 neurons; the fourth layer is a fully connected layer of 128 units; the fifth layer is a fully connected layer of 128 units. The input of the critic network is the state and action space, which pass through two hidden layers with ReLU activation (200 neurons in the first layer and 400 in the second) and are concatenated to finally yield the Q value. Define $h_i \in (S_{t-T}, S_{t-T+1}, \dots, S_t)$, where $S_{t-T}$ and $S_t$ denote the state information at time $t-T$ and at the current time $t$, respectively. The encoded state is $s = f(h_i; \beta)$, and the policy of the modified actor network is then defined as $a = \mu(h_i/\beta, \gamma_\pi) + \eta$.
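The output stage described above (Tanh for steering, Sigmoid for throttle and brake, plus the noise term $\eta$) can be illustrated with a minimal sketch. It covers only the three output neurons, not the convolutional and LSTM layers. The function names, the Gaussian form of the noise and the clipping step are assumptions made for illustration; the patent does not specify them.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function, output in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def clip(value, lower, upper):
    return max(lower, min(upper, value))


def actor_output_head(pre_steer, pre_throttle, pre_brake, noise_std=0.0, rng=None):
    """Map three pre-activation values to (steering, throttle, brake).

    Steering uses tanh, throttle and brake use sigmoid, as in the text.
    The additive Gaussian noise stands in for eta (an assumption), and the
    clipping keeps the noisy action inside the activation ranges.
    """
    rng = rng or random.Random()
    steer = math.tanh(pre_steer) + rng.gauss(0.0, noise_std)
    throttle = sigmoid(pre_throttle) + rng.gauss(0.0, noise_std)
    brake = sigmoid(pre_brake) + rng.gauss(0.0, noise_std)
    return (clip(steer, -1.0, 1.0), clip(throttle, 0.0, 1.0), clip(brake, 0.0, 1.0))
```

With `noise_std=0.0` the function returns the deterministic policy output; a positive `noise_std` gives the exploratory action used during training.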
The action space of the vehicle is designed to comprise the steering wheel angle $\delta$, the braking signal of the vehicle
Figure BDA0003356573650000082
and the throttle signal
Figure BDA0003356573650000083
The driving environment of the vehicle is complex. When the road environment is complex, the vehicle drives at variable speed, and the braking signal is an action taken under extreme driving conditions to prevent roll and rollover caused by braking and by a wet, slippery road surface; in that case the action space is set to three actions: steering wheel angle, braking signal and throttle signal. Under normal driving conditions, the vehicle is assumed to travel at constant speed; to prevent roll caused by a wet, slippery road surface, the action space is set to the steering wheel angle and the throttle signal. The constraint ranges of the three actions are set according to these two driving conditions, so that the vehicle remains controllable within the drivable road area.
The vehicle obtains state data by exploring and exploiting the environment. The generated state data generally comprise the lateral distance and its rate of change and the vehicle roll angle and its rate of change, and they are stored in the experience buffer. When iterative learning is used for vehicle control, reference parameters such as the reference vehicle state and reference trajectory must be designed. The reference state and trajectory can be taken from the experience buffer, and the reference trajectory can be adjusted according to the complexity of the road environment.
In the five scenes (the cross-sea bridge in rain and snow; the cross-sea bridge in strong wind and severe weather; the bridge vibrating on a sunny day; a single vehicle driving in frequently changing weather; and multiple vehicles driving in frequently changing weather), the different recorded trajectories serve as reference paths, and the path actually planned by the vehicle is compared with the reference trajectory to obtain the error. Constraints that satisfy the vehicle dynamics must also be added to the reference trajectory, which is modified and adjusted accordingly before being used as the actual path. This can be expressed as follows:
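The two action-space configurations described above can be sketched as follows. The action names and the numeric constraint ranges are hypothetical placeholders; the patent states that constraint ranges are set per driving condition but gives no values.

```python
def select_action_space(extreme_condition):
    """Return the action names used for the given driving condition.

    Extreme conditions use steering, brake and throttle; normal
    (constant-speed) driving uses steering and throttle only.
    """
    if extreme_condition:
        return ("steering_wheel_angle", "brake_signal", "throttle_signal")
    return ("steering_wheel_angle", "throttle_signal")


def clip_action(action, bounds):
    """Clip each action component to its (lower, upper) constraint range."""
    return {
        name: max(bounds[name][0], min(bounds[name][1], value))
        for name, value in action.items()
    }


# Hypothetical constraint ranges, for illustration only.
EXAMPLE_BOUNDS = {
    "steering_wheel_angle": (-1.0, 1.0),
    "brake_signal": (0.0, 1.0),
    "throttle_signal": (0.0, 1.0),
}
```

For example, `clip_action({"steering_wheel_angle": 1.4, "throttle_signal": 0.3}, EXAMPLE_BOUNDS)` limits the steering command to the upper bound and leaves the throttle unchanged.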
Figure BDA0003356573650000081
where $\sigma$ is a path influencing factor, $p_{ref}$ is the reference trajectory and $p_{act}$ is the actual trajectory. That is, a driving trajectory recorded in a given road environment must be modified and adjusted so that it conforms to the vehicle dynamics during autonomous driving before it can be used as the driving trajectory of the autonomous vehicle.
Step 5: As shown in FIG. 2, a predicted path of the autonomous vehicle's state is generated based on LSTM historical memory and road planning attributes. The DDPG algorithm realizes tracking control of the autonomous vehicle's path trajectory under normal and extreme driving road conditions, and an iterative control method realizes compensation control of the autonomous vehicle.
To ensure the safety and stability of the vehicle under normal driving road conditions (of the five scenes, the one in which the bridge vibrates on a sunny day), the roll, sideslip and yaw dynamics of the vehicle must be considered in the vehicle dynamics model. Vehicle state constraints are set, and the lateral stability range, the maximum steering angle range and the allowable vehicle control range for preventing roll are determined, so as to reduce the lateral offset error of the vehicle:
Figure BDA0003356573650000091
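$\omega_{z\text{-min}} \le \omega_z \le \omega_{z\text{-max}}$, $\omega_{x\text{-min}} \le \omega_x \le \omega_{x\text{-max}}$, $u_{x\text{-min}} \le u_x \le u_{x\text{-max}}$, $e_{r\text{-}x\text{-min}} \le e_r \le e_{r\text{-}x\text{-max}}$
where $\psi = [v_y \ \omega_z \ \omega_x \ \phi]^T$ is the state vector, $u_1$ is the control input and $u_2$ is the auxiliary control input; $\omega_z$ is the yaw rate; $\omega_x$ is the roll angle of the vehicle; $u_x$ is the steering angle; and $e_r$ is the lateral tracking offset error.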
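The four box constraints above can be checked with a small helper, sketched below. The class and function names and all numeric limits are hypothetical; the patent does not give numeric bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    lower: float
    upper: float

    def contains(self, value):
        return self.lower <= value <= self.upper


def violated_constraints(state, limits):
    """Return the names of the state variables that fall outside their range."""
    return [name for name, rng in limits.items() if not rng.contains(state[name])]


# Hypothetical limits for omega_z, omega_x, u_x and e_r (illustration only).
EXAMPLE_LIMITS = {
    "omega_z": Range(-0.5, 0.5),
    "omega_x": Range(-0.1, 0.1),
    "u_x": Range(-0.6, 0.6),
    "e_r": Range(-0.3, 0.3),
}
```

An empty list from `violated_constraints` means the state satisfies all four inequalities; otherwise the returned names identify which bounds a controller would need to react to.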
Based on the road state information predicted by the LSTM, an objective function is constructed that takes into account the steering angle, the tire adhesion coefficient, the roll angle error and the path tracking error. With full regard to the dynamic constraint on the vehicle's maximum allowable error, the physical constraint of the vehicle imposed by the lateral tracking offset error is determined, so as to reduce the vehicle's tracking control error:
Figure BDA0003356573650000092
where $w_1, w_2, w_3, w_4$ are parameter variables and $\mu_r$ is the road adhesion coefficient.
As shown in FIG. 3, an iterative control algorithm realizes compensation control of vehicle roll. A reference vehicle state, a reference control input and a reference output value are set so that tracking is maintained under multiple constraints, namely the physical constraints of the vehicle and the road constraints; this improves the vehicle's disturbance rejection while driving and reduces the model error rate. First, the DDPG algorithm trains the network model through interaction between the vehicle and the dynamic road conditions of the cross-sea bridge. If the training task is completed, the trained action is saved. If the task is not completed satisfactorily, the iterative control algorithm compensates the parameters of the output action space until the training task is completed and a better action is obtained, which finally yields better roll control of the autonomous vehicle.
The extreme driving conditions are four of the five scenes: the cross-sea bridge in rain and snow, the cross-sea bridge in strong wind and severe weather, a single vehicle driving in frequently changing weather, and multiple vehicles driving in frequently changing weather. In these scenes, severe weather makes the road environment uncertain, which interferes with the normal driving of the vehicle.
Under extreme driving road conditions, the vehicle is prone to skidding on wet surfaces and to vibration. The driving trajectory changes, which affects the actual driving speed of the vehicle, so the actual vehicle speed deviates from the planned vehicle speed. The change of the actual vehicle speed can therefore be set as follows:
Figure BDA0003356573650000101
where $v_{ref}$ is the reference vehicle speed, $v_{act}$ is the actual vehicle speed and $\alpha$ is an influencing factor.
When the error between the actual vehicle speed and the reference vehicle speed is within 2 km/h, the actual vehicle speed can be assumed equal to the reference vehicle speed. When the error exceeds 2 km/h but does not exceed the 5 km/h braking threshold, the actual vehicle speed is determined from the reference vehicle speed and the difference between the two speeds, where $\alpha$ is the speed change factor. When the error is greater than 5 km/h, the vehicle must brake immediately to ensure driving safety.
For path prediction, the autonomous vehicle passes through the LSTM historical memory state and then uses the road planning attributes to generate the predicted speed path. At time $t$, the set of actions and states of the vehicle is expressed as $\{v_1, \dots, v_i, \dots, v_n\}$, $i = 1, \dots, n$, and path planning is implemented with Bezier curves to produce a collision-free predicted trajectory. To judge the accuracy of the vehicle speed planning, a small sample of vehicle states and actions is taken from the experience buffer of the DDPG algorithm as reference values, $\{v_{r1}, \dots, v_{ri}, \dots, v_{rn}\}$, $i = 1, \dots, n$, and the error between the two is calculated as:
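The three-band speed rule can be sketched as below. The exact adjustment formula appears only as an image in the source, so the middle-band expression `v_ref + alpha * (v_act - v_ref)` is an assumption, as are the function name, the return format and the treatment of the band edges.

```python
def speed_decision(v_ref, v_act, alpha, small_error=2.0, large_error=5.0):
    """Classify the speed error (km/h) and return (mode, target_speed).

    - error within small_error: treat the actual speed as the reference speed
    - error up to large_error: blend reference speed and speed difference
      using the speed change factor alpha (assumed form)
    - error above large_error: request immediate braking
    """
    error = abs(v_act - v_ref)
    if error <= small_error:
        return ("track_reference", v_ref)
    if error <= large_error:
        return ("adjust", v_ref + alpha * (v_act - v_ref))
    return ("brake", v_act)
```

Here `small_error` and `large_error` default to the 2 km/h and 5 km/h thresholds given in the text.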
Figure BDA0003356573650000102
where $v_i$ is the vehicle speed, $L$ is the vehicle speed error rate, and $v_{ri}$ is the reference vehicle speed obtained from the experience buffer.
Under extreme conditions, given the desired state reference trajectory $S_r$, the output error is $e_k(t) = S_r(t) - S(t)$ and the learning law is $u_{k+1}(t) = L(u_k(t), e_k(t))$, which yields the compensation control action $a_k$. Under normal driving, the DDPG algorithm realizes vehicle control, and its output action is $a_\pi = \mu(h_i/\beta, \gamma_\pi) + \eta$. The total control action is $a = a_k + a_\pi$; the reference formula is as follows:
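The Bezier-curve path planning mentioned above can be illustrated by evaluating a curve with de Casteljau's algorithm. This is a generic sketch of Bezier evaluation only: the choice of control points and any collision checking, which the patent requires for a collision-free trajectory, are outside its scope, and the function names are illustrative.

```python
def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] (de Casteljau)."""
    points = [tuple(float(c) for c in p) for p in control_points]
    while len(points) > 1:
        # Repeated linear interpolation between neighbouring points.
        points = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


def sample_bezier(control_points, n):
    """Return n evenly spaced (in t) points along the curve, n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return [bezier_point(control_points, i / (n - 1)) for i in range(n)]
```

The curve starts at the first control point and ends at the last, so the current vehicle position and the target position can serve as the end control points.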
Figure BDA0003356573650000103
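The compensation scheme $e_k(t) = S_r(t) - S(t)$, $u_{k+1}(t) = L(u_k(t), e_k(t))$, $a = a_k + a_\pi$ can be sketched with a simple P-type learning law, $u_{k+1}(t) = u_k(t) + g\,e_k(t)$. The P-type form of $L$, the static toy plant and all numeric values are assumptions for illustration; the patent's learning law is given only as an image.

```python
def ilc_update(u_k, e_k, gain):
    """P-type learning law (assumed form of L): u_{k+1}(t) = u_k(t) + gain * e_k(t)."""
    return [u + gain * e for u, e in zip(u_k, e_k)]


def run_trial(a_pi, u_k, plant_gain):
    """Toy static plant: the state responds to the total action a = a_k + a_pi."""
    return [plant_gain * (ap + u) for ap, u in zip(a_pi, u_k)]


def iterate_compensation(s_ref, a_pi, plant_gain=0.8, gain=1.0, trials=10):
    """Repeat the trial, learning a compensation action on top of a_pi.

    Returns the learned compensation and the maximum absolute tracking
    error recorded in each trial.
    """
    u_k = [0.0] * len(s_ref)
    max_errors = []
    for _ in range(trials):
        s = run_trial(a_pi, u_k, plant_gain)
        e_k = [r - y for r, y in zip(s_ref, s)]  # e_k(t) = S_r(t) - S(t)
        max_errors.append(max(abs(e) for e in e_k))
        u_k = ilc_update(u_k, e_k, gain)
    return u_k, max_errors
```

For this toy plant the error contracts by a factor of `1 - plant_gain * gain` per trial, so the compensation converges whenever that factor has magnitude below one.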
To judge the accuracy of path planning, the trajectory error arising when the actual driving trajectory of the vehicle and the reference trajectory differ by a change in roll angle is calculated as follows. Vehicle roll occurs mainly in two cases. In the first case, the bridge vibrates and the autonomously driving vehicle deviates from the planned path; the vehicle then rolls easily and a roll angle develops. The trajectory error between the actual driving trajectory and the reference trajectory caused by the change in roll angle can therefore be expressed as:
Figure BDA0003356573650000104
where $r_{act}$ is the actual trajectory of the vehicle under vibration; $r_{ref}$ is the reference trajectory;
Figure BDA0003356573650000111
is the maximum angle difference between the actual trajectory and the reference trajectory; $\sigma$ is the deviation angle when the vehicle sideslips laterally; $\phi$ is the vehicle roll angle; $d$ is the vertical vibration distance; and $\chi$ is the deviation angle when the bridge vibrates vertically, that is, the vertical inclination angle of the bridge deck.
The maximum value of the vertical inclination angle $\chi$ produced by bridge vibration can be set to:
Figure BDA0003356573650000112
In the second case, severe weather makes the road surface wet and slippery, which changes the road adhesion coefficient and causes the vehicle to roll, sideslip and roll over. The trajectory error between the actual driving trajectory and the reference trajectory caused by the change in roll angle can therefore be expressed as:
Figure BDA0003356573650000113
where $r_{act}$ is the actual trajectory of the vehicle under vibration; $r_{ref}$ is the reference trajectory;
Figure BDA0003356573650000114
is the maximum angle difference between the actual trajectory and the reference trajectory; $\sigma$ is the deviation angle when the vehicle sideslips laterally; $\phi$ is the vehicle roll angle; $d$ is the vertical vibration distance; $\chi$ is the deviation angle when the bridge vibrates vertically, that is, the vertical inclination angle of the bridge deck; and $\mu$ is the road adhesion coefficient, which lies in the range $[0, 1]$.
An optimal drivable road area is found from the high-precision map built with the visual sensor. Within this area, the reference vehicle state, control input and parameter output values are designed together with constraints on multiple types of states, and an iterative learning control algorithm realizes roll control of the vehicle, compensating the DDPG algorithm so that the vehicle drives safely within the controllable road area. A limited drivable road area is found from the high-precision map built with the visual sensor and from the state predicted by the LSTM, and ranges are designed for the layered uncertain state parameters and action parameters. When the vehicle drives within the limited drivable road area, DDPG and the iterative control algorithm together realize roll control of the vehicle, with the iterative control algorithm acting as compensation. When the vehicle is in a limit condition, constraints on the vehicle state must be added, and the reward function is constructed as follows:
$R = v \cdot (R_1 + R_2 + \dots + R_6)$
Figure BDA0003356573650000121
where $R_1$ and $R_2$ represent the lateral distance error and its rate of change; $R_3$ and $R_4$ represent the transverse angular velocity (yaw rate) and its rate of change; $R_5$ and $R_6$ represent the roll angle and its rate of change; $x$ is an angle value; $v$ is the vehicle speed; $e_y$ is the lateral distance; and $k_i$, $k_j$ are the reward factors.
The above describes only specific practical embodiments of the invention and is not intended to limit its scope; all equivalent implementations or modifications that do not depart from the technology of the invention fall within the scope of the invention.
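The outer form of the reward, $R = v \cdot (R_1 + \dots + R_6)$, can be sketched as below. The six component terms are defined by a formula that appears only as an image in the source, so this sketch takes them as precomputed inputs; the function name and the length check are illustrative assumptions.

```python
def total_reward(v, terms):
    """Combine six precomputed reward terms as R = v * (R_1 + ... + R_6).

    `terms` holds R_1..R_6 in order: lateral distance error and its rate,
    transverse angular velocity and its rate, roll angle and its rate.
    """
    if len(terms) != 6:
        raise ValueError("expected six reward terms R_1..R_6")
    return v * sum(terms)
```

Because the sum is scaled by the vehicle speed, the same tracking quality earns a larger reward magnitude at higher speed.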

Claims (5)

1. An automatic driving vehicle roll control method based on DDPG and iterative control, characterized by comprising the following steps:
(1) installing lidar, vision, millimeter-wave radar and ultrasonic radar sensors, a positioning system and an inertial navigation system on the autonomous vehicle;
(2) using the visual sensor, the positioning system and the inertial navigation system to obtain the vehicle's position and map in each of the different scenes, so as to generate driving maps of the autonomous vehicle for the different scenes and to provide the environment required for the vehicle's driving trajectory;
(3) driving on the cross-sea bridge by operating the steering wheel, accelerator and pedal, acquiring the corresponding driving trajectories in rain and snow, in strong wind and severe weather, and on sunny days, and constructing a dataset;
(4) training the DDPG algorithm on the driving maps of the autonomous vehicle for the different scenes, for driving on the cross-sea bridge under different grades of complex road conditions in severe weather; the autonomous vehicle generates real-time vehicle states by interacting with the map environments of the different scenes and determines its actions; during action training, the action space is initialized, the online policy network in the actor network generates state space information and outputs an action, and action noise is added to obtain an exploratory action space;
(5) generating a predicted path of the autonomous vehicle's state based on LSTM historical memory and road planning attributes, using the DDPG algorithm to realize tracking control of the autonomous vehicle's path trajectory under normal and extreme driving road conditions, and using an iterative control method to realize compensation control of the autonomous vehicle;
the network of the DDPG algorithm in step (4) is designed as follows:
an actor network is constructed that takes the vehicle state and the environment state as input and outputs a vector consisting of the steering angle, throttle and brake signals, which correspond to the 3 neurons of the output layer of the actor policy network; the activation function for throttle and brake is Sigmoid, and the activation function for the steering action value is Tanh; the hidden layers are structured as follows: the first layer is a convolution with size 7×7, a filter size of 48 and a stride of 4, with 200 neurons in total; the second layer is a convolution with size 5×5, a filter size of 16 and a stride of 2, with a ReLU activation function and 400 neurons in total; the third layer is an LSTM layer with 100 neurons; the fourth layer is a fully connected layer of 128 units; the fifth layer is a fully connected layer of 128 units; the input of the critic network is the state and action space, which pass through two hidden layers with ReLU activation (200 neurons in the first layer and 400 in the second) and are concatenated to finally yield the Q value; define $h_i \in (S_{t-T}, S_{t-T+1}, \dots, S_t)$, where $S_{t-T}$ and $S_t$ denote the state information at time $t-T$ and at the current time $t$, respectively; the encoded state is $s = f(h_i; \beta)$, and the policy of the modified actor network is then defined as the action $a = \mu(h_i/\beta, \gamma_\pi) + \eta$;
the tracking control of the autonomous vehicle's path trajectory under normal driving road conditions in step (5) is implemented as follows:
under normal driving road conditions, namely the road condition in which the bridge vibrates on a sunny day, a vehicle dynamics model is established that considers the roll, sideslip and yaw dynamics of the vehicle; vehicle state constraints are set, and the lateral stability range, the maximum steering angle range and the allowable vehicle control range for preventing roll are determined, so as to reduce the lateral offset error of the vehicle:
$\omega_{z\text{-min}} \le \omega_z \le \omega_{z\text{-max}}$, $\omega_{x\text{-min}} \le \omega_x \le \omega_{x\text{-max}}$, $u_{x\text{-min}} \le u_x \le u_{x\text{-max}}$, $e_{r\text{-}x\text{-min}} \le e_r \le e_{r\text{-}x\text{-max}}$
where $\omega_z$ is the yaw rate; $\omega_x$ is the roll angle of the vehicle; $u_x$ is the steering angle; $e_r$ is the lateral tracking offset error;
based on the road state information predicted by the LSTM, an objective function is constructed that takes into account the steering angle, the tire adhesion coefficient, the roll angle error and the path tracking error; with full regard to the dynamic constraint on the vehicle's maximum allowable error, the physical constraint of the vehicle imposed by the lateral tracking offset error is determined, so as to reduce the vehicle's tracking control error:
Figure FDA0004186690960000021
where $w_1, w_2, w_3, w_4$ are parameter variables and $\mu_r$ is the road adhesion coefficient;
an iterative control algorithm realizes compensation control of vehicle roll, and a reference vehicle state, a reference control input and a reference output value are set so that tracking is maintained under multiple constraints, namely the physical constraints of the vehicle and the road constraints, thereby improving the vehicle's disturbance rejection while driving and reducing the model error rate;
the state space, action space and reward function required by the DDPG algorithm are constructed according to the driving condition of the vehicle; the action space mainly comprises the steering wheel angle, throttle and braking signals; the state space comprises the vehicle's lateral tracking error and its rate of change, the vehicle's roll angle error and its rate of change, and the yaw rate error and its rate of change; the reward function is constructed as follows: when the cross-sea bridge vibrates, the actual trajectory of the vehicle changes and the road tilts by a certain angle, so the reward function equals the cumulative product of the discount factor and the change in speed.
2. The DDPG and iterative control-based automatic driving vehicle roll control method of claim 1, wherein the lidar sensor of step (1) is used to detect dynamic and static obstacles on the road, including pedestrians, motorcycles and vehicles of various kinds, as well as the drivable road area; the visual sensor is used to perceive lane lines, pedestrians and vehicles and to perform simultaneous localization and mapping; the millimeter-wave radar sensor is used to measure the distance between the vehicle and pedestrians and between the vehicle and moving vehicles; the ultrasonic radar is used to measure the distance to vehicles at short range; and the visual sensor, the positioning system and the inertial navigation system are used for vehicle localization.
3. The DDPG and iterative control-based automatic driving vehicle roll control method of claim 1, wherein the different scenes in step (2) comprise five scenes: the cross-sea bridge in rain and snow; the cross-sea bridge in strong wind and severe weather; the bridge vibrating on a sunny day; a single vehicle driving in frequently changing weather; and multiple vehicles driving in frequently changing weather.
4. The DDPG and iterative control-based automatic driving vehicle roll control method of claim 1, wherein the dataset of step (3) comprises the vehicle speed, driving trajectory, vehicle position, heading angle, sideslip angle, yaw rate and roll angle.
5. The DDPG and iterative control-based automatic driving vehicle roll control method of claim 1, wherein the tracking control of the autonomous vehicle's path trajectory under extreme driving road conditions in step (5) is implemented as follows:
under extreme driving road conditions, namely when severe weather and other factors affect the road, the vehicle is prone to skidding on wet surfaces and to vibration; the driving trajectory changes, which affects the actual driving speed of the vehicle, so the actual vehicle speed deviates from the planned vehicle speed, and the change of the actual vehicle speed is set as follows:
Figure FDA0004186690960000031
where $v_{ref}$ is the reference vehicle speed; $v_{act}$ is the actual vehicle speed; $\alpha$ is an influencing factor;
when the error between the actual vehicle speed and the reference vehicle speed is within 2 km/h, the actual vehicle speed is taken as equal to the reference vehicle speed; when the error exceeds 2 km/h but does not exceed the 5 km/h braking threshold, the actual vehicle speed is determined from the reference vehicle speed and the difference between the two speeds, where $\alpha$ is the speed change factor; when the error is greater than 5 km/h, the vehicle must brake immediately to ensure driving safety;
at time $t$, the set of actions and states of the vehicle is expressed as $\{v_1, \dots, v_i, \dots, v_n\}$, $i = 1, \dots, n$, and path planning is implemented with Bezier curves to produce a collision-free predicted trajectory; to judge the accuracy of the vehicle speed planning, a small sample of vehicle states and actions is taken from the experience buffer of the DDPG algorithm as reference values, $\{v_{r1}, \dots, v_{ri}, \dots, v_{rn}\}$, $i = 1, \dots, n$, and the error between the two is calculated as:
Figure FDA0004186690960000032
where $v_i$ is the vehicle speed; $L$ is the vehicle speed error rate; $v_{ri}$ is the reference vehicle speed obtained from the experience buffer;
under extreme driving conditions, given the desired state reference trajectory $S_r$, the output error is $e_k(t) = S_r(t) - S(t)$ and the learning law is $u_{k+1}(t) = L(u_k(t), e_k(t))$, which yields the compensation control action $a_k$; under normal driving, the DDPG algorithm realizes vehicle control, and its output action is $a_\pi = \mu(h_i/\beta, \gamma_\pi) + \eta$; the total control action is $a = a_k + a_\pi$, and the reference formula is as follows:
Figure FDA0004186690960000041
to judge the accuracy of path planning, the trajectory error arising when the actual driving trajectory of the vehicle and the reference trajectory differ by a change in roll angle is calculated as:
Figure FDA0004186690960000042
where $r_{act}$ is the actual trajectory of the vehicle under vibration; $r_{ref}$ is the reference trajectory;
Figure FDA0004186690960000043
is the maximum angle difference between the actual trajectory and the reference trajectory; $\mu$ is the road adhesion coefficient; $\sigma$ is the deviation angle when the vehicle sideslips laterally; $\phi$ is the vehicle roll angle; $d$ is the vertical vibration distance; $\chi$ is the deviation angle when the bridge vibrates vertically, that is, the vertical inclination angle of the bridge deck;
the maximum value of the vertical inclination angle $\chi$ produced by bridge vibration is set to:
Figure FDA0004186690960000044
an optimal drivable road area is found from the high-precision map built with the visual sensor; within this area, the reference vehicle state, control input values and parameter output values are designed together with constraints on multiple types of states, and an iterative learning control algorithm realizes roll control of the vehicle;
a limited drivable road area is found from the high-precision map built with the visual sensor and from the state predicted by the LSTM, and ranges are designed for the layered uncertain state parameters and action parameters; when the vehicle drives within the limited drivable road area, DDPG and the iterative control algorithm together realize roll control of the vehicle, with the iterative control algorithm acting as compensation, and a reward function is constructed according to the constraint range of the vehicle state, as follows:
$R = v \cdot (R_1 + R_2 + \dots + R_6)$
Figure FDA0004186690960000051
where $R_1$ and $R_2$ represent the lateral distance error and its rate of change; $R_3$ and $R_4$ represent the transverse angular velocity (yaw rate) and its rate of change; $R_5$ and $R_6$ represent the roll angle and its rate of change; $x$ is an angle value; $v$ is the vehicle speed; $e_y$ is the lateral distance; and $k_i$, $k_j$ are the reward factors.
CN202111353270.7A 2021-11-16 2021-11-16 Automatic driving vehicle roll control method based on DDPG and iterative control Active CN114228690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111353270.7A CN114228690B (en) 2021-11-16 2021-11-16 Automatic driving vehicle roll control method based on DDPG and iterative control

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111353270.7A CN114228690B (en) 2021-11-16 2021-11-16 Automatic driving vehicle roll control method based on DDPG and iterative control

Publications (2)

Publication Number Publication Date
CN114228690A CN114228690A (en) 2022-03-25
CN114228690B CN114228690B (en) 2023-05-23

Family

ID=80749618

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111353270.7A Active CN114228690B (en) 2021-11-16 2021-11-16 Automatic driving vehicle roll control method based on DDPG and iterative control

Country Status (1)

Country Link
CN (1) CN114228690B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115571160A (en) * 2022-10-26 2023-01-06 Tsinghua University Chassis domain controller and control method for automatic driving and vehicle
WO2024113087A1 (en) * 2022-11-28 2024-06-06 Beijing Baidu Netcom Science Technology Co., Ltd. On-board parameter tuning for control module for autonomous vehicles
CN115973131B (en) * 2023-03-20 2023-06-13 Shanghai Boonray Intelligent Technology Co., Ltd. Mining area unmanned vehicle rollover prevention method and related device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110647839A (en) * 2019-09-18 2020-01-03 Shenzhen Institute of Information Technology Method and device for generating automatic driving strategy and computer readable storage medium
CN110850861A (en) * 2018-07-27 2020-02-28 GM Global Technology Operations LLC Attention-based hierarchical lane change deep reinforcement learning
DE102019115707A1 (en) * 2018-11-01 2020-05-07 Carnegie Mellon University Spatial and temporal attention-based deep reinforcement learning of hierarchical lane-change policies for controlling an autonomous vehicle
CN111845741A (en) * 2020-06-28 2020-10-30 Jiangsu University Automatic driving decision control method and system based on hierarchical reinforcement learning
CN113386790A (en) * 2021-06-09 2021-09-14 Yangzhou University Automatic driving decision-making method for cross-sea bridge road condition

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11554785B2 (en) * 2019-05-07 2023-01-17 Foresight Ai Inc. Driving scenario machine learning network and driving environment simulation
US11157784B2 (en) * 2019-05-08 2021-10-26 GM Global Technology Operations LLC Explainable learning system and methods for autonomous driving
US11493926B2 (en) * 2019-05-15 2022-11-08 Baidu Usa Llc Offline agent using reinforcement learning to speedup trajectory planning for autonomous vehicles

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110850861A (en) * 2018-07-27 2020-02-28 GM Global Technology Operations LLC Attention-based hierarchical lane change deep reinforcement learning
DE102019115707A1 (en) * 2018-11-01 2020-05-07 Carnegie Mellon University Spatial and temporal attention-based deep reinforcement learning of hierarchical lane-change policies for controlling an autonomous vehicle
CN110647839A (en) * 2019-09-18 2020-01-03 Shenzhen Institute of Information Technology Method and device for generating automatic driving strategy and computer readable storage medium
CN111845741A (en) * 2020-06-28 2020-10-30 Jiangsu University Automatic driving decision control method and system based on hierarchical reinforcement learning
CN113386790A (en) * 2021-06-09 2021-09-14 Yangzhou University Automatic driving decision-making method for cross-sea bridge road condition

Also Published As

Publication number Publication date
CN114228690A (en) 2022-03-25

Similar Documents

Publication Publication Date Title
CN114228690B (en) Automatic driving vehicle roll control method based on DDPG and iterative control
CN111845774B (en) Automatic driving automobile dynamic trajectory planning and tracking method based on transverse and longitudinal coordination
Gao et al. Robust lateral trajectory following control of unmanned vehicle based on model predictive control
JP4586795B2 (en) Vehicle control device
CN107264534B (en) Based on the intelligent driving control system and method for driver experience's model, vehicle
CN114407931A (en) Decision-making method for safe driving of highly-humanoid automatic driving commercial vehicle
CN114564016A (en) Navigation obstacle avoidance control method, system and model combining path planning and reinforcement learning
CN114379583B (en) Automatic driving vehicle track tracking system and method based on neural network dynamics model
CN110286681A (en) A kind of dynamic auto driving lane-change method for planning track of variable curvature bend
CN109733474A (en) A kind of intelligent vehicle steering control system and method based on piecewise affine hierarchical control
CN114771563A (en) Method for realizing planning control of track of automatic driving vehicle
CN113386790B (en) Automatic driving decision-making method for cross-sea bridge road condition
CN115683145A (en) Automatic driving safety obstacle avoidance method based on track prediction
CN113715842A (en) High-speed moving vehicle control method based on simulation learning and reinforcement learning
CN112249008A (en) Unmanned automobile early warning method aiming at complex dynamic environment
Kapania Trajectory planning and control for an autonomous race vehicle
CN113255998A (en) Expressway unmanned vehicle formation method based on multi-agent reinforcement learning
CN108711285B (en) Hybrid traffic simulation method based on road intersection
CN116486356A (en) Narrow scene track generation method based on self-adaptive learning technology
CN116182884A (en) Intelligent vehicle local path planning method based on transverse and longitudinal decoupling of frenet coordinate system
CN114179818A (en) Intelligent automobile transverse control method based on adaptive preview time and sliding mode control
CN113033902B (en) Automatic driving lane change track planning method based on improved deep learning
CN117826590A (en) Unmanned vehicle formation control method and system based on prepositive following topological structure
CN115214697A (en) Adaptive second-order sliding mode control intelligent automobile transverse control method
CN115447615A (en) Trajectory optimization method based on vehicle kinematics model predictive control

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant