CN114228690B - Automatic driving vehicle roll control method based on DDPG and iterative control - Google Patents
- Publication number: CN114228690B
- Application number: CN202111353270.7A
- Authority: CN (China)
- Prior art keywords: vehicle, road, error, control, action
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W10/00—Conjoint control of vehicle sub-units of different type or different function
- B60W10/04—Conjoint control of vehicle sub-units of different type or different function including control of propulsion units
- B60W10/06—Conjoint control of vehicle sub-units of different type or different function including control of propulsion units including control of combustion engines
- B60W10/18—Conjoint control of vehicle sub-units of different type or different function including control of braking systems
- B60W10/20—Conjoint control of vehicle sub-units of different type or different function including control of steering systems
- B60W40/00—Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
- B60W40/10—Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to vehicle motion
- B60W40/105—Speed
- B60W50/00—Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
- B60W50/0098—Details of control systems ensuring comfort, safety or stability not otherwise provided for
- B60W2050/0001—Details of the control system
- B60W2050/0043—Signal treatments, identification of variables or parameters, parameter estimation or state estimation
- B60W2520/00—Input parameters relating to overall vehicle dynamics
- B60W2520/10—Longitudinal speed
- B60W2520/14—Yaw
- B60W2520/18—Roll
- B60W2552/00—Input parameters relating to infrastructure
- B60W2552/50—Barriers
- B60W2552/53—Road markings, e.g. lane marker or crosswalk
- B60W2554/00—Input parameters relating to objects
- B60W2554/40—Dynamic objects, e.g. animals, windblown objects
- B60W2554/402—Type
- B60W2554/4026—Cycles
- B60W2554/4029—Pedestrians
- B60W2554/404—Characteristics
- B60W2554/4041—Position
- B60W2554/4042—Longitudinal speed
- B60W2554/80—Spatial relation or speed relative to objects
- B60W2555/00—Input parameters relating to exterior conditions, not covered by groups B60W2552/00, B60W2554/00
- B60W2555/20—Ambient conditions, e.g. wind or rain
- B60W2710/00—Output or target parameters relating to a particular sub-units
- B60W2710/06—Combustion engines, Gas turbines
- B60W2710/0605—Throttle position
- B60W2710/18—Braking system
- B60W2710/20—Steering systems
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a roll control method for an automatic driving vehicle based on DDPG and iterative control. A DDPG algorithm is trained on running maps of the automatic driving vehicle in different scenes; the vehicle generates real-time vehicle states through interaction with the map environments in those scenes, and these states determine the action behaviors of the vehicle. When action training is carried out, the action space is initialized, the online policy network in the actor network generates the state space information and outputs an action, and an action noise is added to obtain an exploratory action space. Based on LSTM historical memory and road planning attributes, a predicted path of the vehicle state is generated; the DDPG algorithm realizes tracking control of the path track under both normal and extreme running conditions, and an iterative control method realizes compensation control of the vehicle. The invention avoids the instability of the reinforcement learning algorithm when the vehicle drives in extreme road environments, and improves the driving safety and robustness of the vehicle.
Description
Technical Field
The invention belongs to the field of intelligent vehicle control, and particularly relates to an automatic driving vehicle roll control method based on DDPG and iterative control.
Background
With the development of artificial intelligence technology, automatic driving technology has advanced greatly. It is currently focused on closed scenes, such as closed campuses and logistics industrial parks, and is especially common in harbour road environments with fewer structured road features, pedestrians and vehicles. An automatic driving vehicle relies on environment sensing, navigation, map positioning, decision making, motion planning and track tracking control to achieve vehicle intelligence. However, when an automatic driving vehicle operates in complex weather and complex driving environments such as a cross-sea bridge, severe weather affects the road condition of the bridge and can cause the vehicle to swerve, sideslip or roll over. Rain, snow and wind, for example, change the road adhesion coefficient, make the tires slip, and degrade path tracking, lane keeping and vehicle control precision. In addition, the bridge may vibrate under wind, so that the vehicle rolls and may become uncontrollable. Control is therefore a complex task when the vehicle exhibits uncertain characteristics such as sideslip caused by a wet road surface, yaw caused by bridge vibration, and roll dynamics at high speed, while road vibration characteristics, road angle and aerodynamic characteristics must also be considered in the overall control design. Research on driving safety and stability based on roll control of automatic driving vehicles under severe weather conditions is thus an important key technology.
Reinforcement learning is one application category of artificial intelligence technology. Without an accurate vehicle model or a given description of the surrounding environment, an intelligent agent can explore an unknown dynamic environment, try different actions, and learn from its interaction with that environment, capturing complex vehicle dynamics through the actions and states exchanged with it. It is therefore a new way to adapt to dynamic road environments and complex vehicle dynamics. Using a reinforcement learning algorithm to control an automatic driving vehicle on cross-sea bridge road conditions is beneficial to the intelligent safety of the vehicle and to the large-scale industrial development of automatic driving vehicles.
Disclosure of Invention
Object of the invention: the invention provides an automatic driving vehicle roll control method based on DDPG and iterative control, which enables a vehicle to run safely in cross-sea bridge environments of any complexity level and improves the intelligence level of the vehicle through control based on the instantaneous roll angle.
The technical scheme is as follows: the invention provides an automatic driving vehicle roll control method based on DDPG and iterative control, which specifically comprises the following steps:
(1) Installing a laser radar, a vision sensor, a millimeter wave radar, an ultrasonic radar sensor, a positioning system and an inertial navigation system on the automatic driving vehicle;
(2) Using the visual sensor, the positioning system and the inertial navigation system to obtain the position of the vehicle and the map in each of the different scenes, so as to generate automatic driving vehicle running maps for the different scenes and provide the environment required for the vehicle running tracks;
(3) Controlling the steering wheel, the accelerator and the brake pedal while driving on a cross-sea bridge, acquiring the corresponding driving tracks in rainy and snowy weather, in strong wind and severe weather, and on sunny days, and constructing a data set;
(4) Training the DDPG algorithm on the running maps of the automatic driving vehicle in the different scenes, for the running states of the cross-sea bridge at different complex road condition grades in severe weather; the automatic driving vehicle generates real-time vehicle states through interaction with the map environments in the different scenes, and these states determine the action behaviors of the vehicle; when action training is carried out, the action space is initialized, the online policy network in the actor network generates the state space information and outputs an action, and an action noise is added to obtain an exploratory action space;
(5) Based on LSTM historical memory and road planning attributes, generating a predicted path of the state of the automatic driving vehicle, adopting the DDPG algorithm to realize tracking control of the path track of the automatic driving vehicle under normal driving road conditions and extreme driving road conditions, and adopting an iterative control method to realize compensation control of the automatic driving vehicle.
Further, the laser radar sensor in step (1) is used to detect dynamic and static obstacles on the road, including pedestrians, motorcycles and various vehicles, as well as drivable road areas; the visual sensor is used for sensing lane lines, pedestrians and vehicles and for performing positioning and synchronous map creation; the millimeter wave radar sensor is used for detecting the distance between the vehicle and pedestrians and the distance between the vehicle and other traveling vehicles; the ultrasonic radar is used for detecting the distance to vehicles at short range; the vision sensor, the positioning system and the inertial navigation system are used for realizing vehicle positioning.
Further, the different scenes in step (2) comprise five scenes: cross-sea bridge road conditions in rainy and snowy weather; cross-sea bridge road conditions in strong wind and severe weather; road conditions when the bridge vibrates on a sunny day; road conditions with a single bicycle running in frequently changing weather; and road conditions with a plurality of bicycles running in frequently changing weather.
Further, the data set of step (3) includes vehicle speed, travel track, vehicle position, heading angle, slip angle, yaw rate and roll angle.
Further, the network design of the DDPG algorithm in the step (4) is as follows:
An actor network is constructed that takes the vehicle state and the environment state as input and outputs a vector formed by the steering angle, accelerator and brake signals, which correspond to the 3 neurons of the output layer of the actor policy network. The activation function of the accelerator and brake outputs is Sigmoid, and the activation function of the steering action value is Tanh. The hidden layers are structured as follows: the first layer is a convolution with kernel size 7*7, 48 filters and step size 4, 200 neurons in total; the second layer is a convolution with kernel size 5*5, 16 filters, step size 2 and a ReLU activation function, 400 neurons in total; the third layer is an LSTM layer with 100 neurons; the fourth layer is a fully connected layer with 128 units; the fifth layer is a fully connected layer with 128 units.
The input of the critic network is the state and the action; they pass through two hidden layers with ReLU activation, 200 neurons in the first layer and 400 neurons in the second layer, and are concatenated to finally obtain the Q value.
Define $h_i \in (S_{t-T}, S_{t-T+1}, \dots, S_t)$, where $S_{t-T}$ and $S_t$ represent the state information at time $t-T$ and at the current time $t$, respectively. The encoded state is $s = f(h_i; \beta)$, and the policy of the modified actor network is then defined as $a = \mu(h_i/\beta, \gamma_{\pi}) + \eta$.
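The output stage of the actor described above can be illustrated with a minimal Python sketch. It covers only the last step: three raw output values are mapped to steering (Tanh), accelerator and brake (Sigmoid), and a noise term $\eta$ is added for exploration. The Gaussian form of the noise, the standard deviation `noise_std`, the final clipping, and the function name `actor_head` are assumptions made for illustration; the patent only states that an action noise is added.

```python
import math
import random


def sigmoid(x):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def actor_head(raw_outputs, noise_std=0.1, rng=None):
    """Map 3 raw output-layer values to (steering, accelerator, brake).

    Steering uses Tanh and accelerator/brake use Sigmoid, as in the text.
    The Gaussian noise and the clipping are assumptions for illustration.
    """
    rng = rng or random.Random()
    steer_raw, accel_raw, brake_raw = raw_outputs

    steering = math.tanh(steer_raw)   # in [-1, 1]
    accel = sigmoid(accel_raw)        # in (0, 1)
    brake = sigmoid(brake_raw)        # in (0, 1)

    # a = mu(...) + eta: add exploration noise to every action component.
    steering += rng.gauss(0.0, noise_std)
    accel += rng.gauss(0.0, noise_std)
    brake += rng.gauss(0.0, noise_std)

    # Clip back to the valid actuator ranges (assumed, not from the patent).
    steering = min(max(steering, -1.0), 1.0)
    accel = min(max(accel, 0.0), 1.0)
    brake = min(max(brake, 0.0), 1.0)
    return steering, accel, brake
```

With `noise_std=0.0` the function returns the deterministic policy output, which is the form one would typically use at evaluation time.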
Further, in step (5), tracking control of the path track under normal running road conditions of the automatic driving vehicle is implemented as follows:
Under normal running road conditions, namely road conditions when the bridge vibrates on a sunny day, a vehicle dynamics model is established that considers the roll, sideslip and yaw dynamics of the vehicle. Vehicle state constraints are set, and the lateral stability range, the maximum steering angle range and the range of allowable vehicle control for preventing roll are determined, so as to reduce the lateral deviation error of the vehicle:
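$\omega_{z\text{-}\min} \le \omega_z \le \omega_{z\text{-}\max}$, $\omega_{x\text{-}\min} \le \omega_x \le \omega_{x\text{-}\max}$, $u_{x\text{-}\min} \le u_x \le u_{x\text{-}\max}$, $e_{r\text{-}x\text{-}\min} \le e_r \le e_{r\text{-}x\text{-}\max}$
where $\omega_z$ is the yaw rate; $\omega_x$ is the roll angle of the vehicle; $u_x$ is the steering angle; $e_r$ is the lateral tracking offset error;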
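As a rough illustration of how the four box constraints above might be checked at run time, the following Python sketch tests a vehicle state against lower and upper bounds. The numeric limits in `LIMITS`, the dictionary keys and the helper names are hypothetical placeholders; the patent does not give concrete bound values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bounds:
    """Closed interval [lo, hi] for one constrained quantity."""
    lo: float
    hi: float

    def contains(self, value):
        return self.lo <= value <= self.hi


# Hypothetical limits, for illustration only (not values from the patent).
LIMITS = {
    "yaw_rate": Bounds(-0.5, 0.5),        # omega_z
    "roll_angle": Bounds(-0.1, 0.1),      # omega_x
    "steering_angle": Bounds(-0.6, 0.6),  # u_x
    "lateral_error": Bounds(-0.3, 0.3),   # e_r
}


def violated_constraints(state):
    """Return the names of all quantities that lie outside their bounds."""
    return [name for name, b in LIMITS.items() if not b.contains(state[name])]
```

An empty list means the state satisfies every constraint; a non-empty list could, for example, be used to trigger the compensation control.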
According to the road state information predicted by the LSTM, an objective function is constructed that considers the steering angle, the tire adhesion coefficient, the roll angle error and the path tracking error. With the dynamics constraint of the maximum allowable vehicle error fully taken into account, the physical constraints of the vehicle determined by the lateral tracking offset error are established, so as to reduce the tracking control error of the vehicle:
where $w_1, w_2, w_3, w_4$ are parameter variables and $\mu_r$ is the road adhesion coefficient;
An iterative control algorithm is adopted to realize compensation control of vehicle roll. Under multiple constraints, a reference vehicle state, a reference control input and a reference output value are set to guarantee the tracking function subject to the vehicle physical constraints and the road constraints, thereby improving the anti-interference performance of the vehicle while running and reducing the model error rate;
According to the running condition of the vehicle, the state space, action space and reward function required by the DDPG algorithm are constructed. The action space mainly comprises the steering wheel angle, throttle and braking signals. The state space comprises the vehicle lateral tracking error and its rate of change, the vehicle roll angle error and its rate of change, and the yaw rate error and its rate of change. For the construction of the reward function: when the cross-sea bridge vibrates, the actual track of the vehicle changes and the road is inclined at a certain angle, so the reward function is equal to the cumulative product of the discount factor and the change of speed.
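The six-dimensional state space described above (three errors, each paired with its rate of change) could be assembled as in the following Python sketch. Estimating the rates by a backward finite difference over one sampling period `dt` is an assumption; the patent does not say how the rates of change are obtained, and the name `build_state` is hypothetical.

```python
def build_state(errors, prev_errors, dt):
    """Build the state vector [e1, de1, e2, de2, e3, de3].

    errors / prev_errors: (lateral tracking error, roll angle error,
    yaw rate error) at the current and the previous sampling instant.
    Rates are approximated by a backward difference (an assumption).
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    if len(errors) != 3 or len(prev_errors) != 3:
        raise ValueError("expected exactly three error terms")

    state = []
    for e, e_prev in zip(errors, prev_errors):
        state.append(e)                  # error itself
        state.append((e - e_prev) / dt)  # its estimated rate of change
    return state
```

The resulting list can be fed directly to the actor and critic as the state input.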
Further, in step (5), tracking control of the path track of the automatic driving vehicle under extreme driving road conditions is implemented as follows:
Under extreme driving road conditions, namely when severe weather and other factors affect the road, the vehicle easily slips on the wet surface and vibrates, so the driving track changes, the actual running speed of the vehicle is affected, and the actual vehicle speed deviates from the planned vehicle speed. The change of the actual vehicle speed is therefore set as follows:
where $v_{ref}$ is the reference vehicle speed; $v_{act}$ is the actual vehicle speed; $\alpha$ is an influencing factor;
When the error between the actual vehicle speed and the reference vehicle speed is within 2 km/h, the actual vehicle speed can be assumed to be equal to the reference vehicle speed. When the error is 2 km/h or more but does not exceed the 5 km/h braking threshold given below, the actual vehicle speed is set from the reference vehicle speed and the difference between the two vehicle speeds, where $\alpha$ is a speed change factor. When the error between the actual vehicle speed and the reference vehicle speed is greater than 5 km/h, the vehicle must be braked immediately to ensure running safety;
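The three speed-error regimes described above can be sketched as a small Python classifier. The thresholds of 2 km/h and 5 km/h come from the text; treating the boundaries as shown (strictly below 2 for the first band, up to and including 5 for the middle band) and the regime labels are assumptions. The speed-adjustment formula itself is not reproduced in the source, so the sketch only decides which regime applies.

```python
def speed_regime(v_ref_kmh, v_act_kmh, track_band=2.0, brake_band=5.0):
    """Classify the speed error into one of three regimes.

    "track":  error within track_band, actual speed treated as the reference.
    "adjust": error between the two bands, speed is corrected using alpha.
    "brake":  error above brake_band, the vehicle must brake immediately.
    Boundary handling is an assumption made for this sketch.
    """
    error = abs(v_act_kmh - v_ref_kmh)
    if error < track_band:
        return "track"
    if error <= brake_band:
        return "adjust"
    return "brake"
```

For example, a vehicle planned at 60 km/h and measured at 63.5 km/h falls into the middle regime, while one measured at 66 km/h triggers braking.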
At time $t$, the set of actions and states of the vehicle is expressed as $\{v_1, \dots, v_i, \dots, v_n\}$, $i = 1, \dots, n$. Path planning is implemented using Bézier curves to produce collision-free predicted trajectories. To judge the accuracy of vehicle speed planning, a small sample of vehicle states and actions is taken from the experience buffer of the DDPG algorithm as reference values, $\{v_{r1}, \dots, v_{ri}, \dots, v_{rn}\}$, $i = 1, \dots, n$, and the error between the two is calculated as:
where $v_i$ is the vehicle speed; $l$ is the vehicle speed error rate; $v_{ri}$ is the reference vehicle speed obtained from the experience buffer;
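To illustrate the Bézier-curve path planning mentioned above, the following Python sketch evaluates a Bézier curve of any degree with de Casteljau's algorithm and samples it into a list of waypoints. The control points used in the usage note are hypothetical, and the sketch does not perform any collision checking; it only shows how a smooth candidate path is generated from control points.

```python
def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] (de Casteljau)."""
    points = [tuple(p) for p in control_points]
    if not points:
        raise ValueError("at least one control point is required")
    # Repeatedly interpolate between neighbours until one point remains.
    while len(points) > 1:
        points = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


def bezier_path(control_points, n_samples=20):
    """Sample the curve at n_samples evenly spaced parameter values."""
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2")
    return [
        bezier_point(control_points, i / (n_samples - 1))
        for i in range(n_samples)
    ]
```

A lane-change-like path could be sampled with `bezier_path([(0, 0), (10, 0), (20, 3.5), (30, 3.5)])`, where the four control points are illustrative values in metres.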
Under extreme driving conditions, the expected state reference track is given as $S_r$, the output error is $e_k(t) = S_r(t) - S(t)$, and the learning law is $u_{k+1}(t) = L(u_k(t), e_k(t))$, which yields the compensation control action $a_k$. Under normal running, the DDPG algorithm is adopted to realize vehicle control, and the output action is $a_{\pi} = \mu(h_i/\beta, \gamma_{\pi}) + \eta$. The total action control is $a = a_k + a_{\pi}$, and the reference formula is as follows:
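The learning law $u_{k+1}(t) = L(u_k(t), e_k(t))$ is stated only in general form. As one possible instance, the Python sketch below uses a P-type update $u_{k+1}(t) = u_k(t) + \Gamma\, e_k(t)$ and shows how the resulting compensation $a_k$ is added to the DDPG action $a_{\pi}$. The P-type form, the gain, the toy static plant `run_trial` and all function names are assumptions for illustration, not the patent's specific design.

```python
def ilc_update(u_k, e_k, gain=1.0):
    """P-type learning law (assumed form): u_{k+1}(t) = u_k(t) + gain * e_k(t)."""
    return [u + gain * e for u, e in zip(u_k, e_k)]


def run_trial(u, plant_gain=0.5):
    """Toy stand-in for the vehicle: output S(t) = plant_gain * u(t)."""
    return [plant_gain * ui for ui in u]


def learn_compensation(s_ref, n_trials=10, gain=1.0):
    """Repeat the same trajectory and refine the control input each trial."""
    u = [0.0] * len(s_ref)
    max_errors = []
    for _ in range(n_trials):
        s = run_trial(u)
        e = [r - y for r, y in zip(s_ref, s)]  # e_k(t) = S_r(t) - S(t)
        max_errors.append(max(abs(x) for x in e))
        u = ilc_update(u, e, gain)
    return u, max_errors


def total_action(a_pi, a_k):
    """Total control a = a_k + a_pi (compensation plus DDPG action)."""
    return a_k + a_pi
```

With the toy plant and a gain of 1.0, the tracking error halves on every trial, which illustrates how repeated passes over the same road segment can refine the compensation term.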
To judge the accuracy of path planning, the track error between the actual running track of the vehicle and the reference track when the roll angle changes is calculated as follows:
where $r_{act}$ is the actual track of the vibrating vehicle; $r_{ref}$ is the reference track; the angle difference between the actual track and the reference track is the largest; $\mu$ is the road adhesion coefficient; $\sigma$ is the deviation angle when the vehicle sideslips laterally; $\varphi$ is the vehicle roll angle; $d$ is the vertical vibration distance; $\chi$ is the deviation angle when the bridge vibrates vertically, namely the vertical dip angle produced by the bridge deck;
The maximum value of the vertical dip angle $\chi$ produced by bridge vibration can be set to:
An optimal road running area is searched for according to the high-precision map built from the visual sensor. Within this area, a reference vehicle state, a control input value and a parameter output value are designed, constraint conditions for multiple types of states are designed, and an iterative learning control algorithm is adopted to realize roll control of the vehicle;
Using the high-precision map built from the visual sensor and the states predicted by the LSTM, a limited road driving area is found, and ranges of layered uncertain state parameters and action parameters are designed. When the vehicle drives in the limited road driving area, the DDPG and the iterative control algorithm together realize roll control of the vehicle, with the iterative control algorithm providing compensation. A reward function is constructed according to the constraint range of the vehicle state, as follows:
$R = v \cdot (R_1 + R_2 + \dots + R_6)$
where $R_1$ and $R_2$ represent the lateral distance error and its rate of change; $R_3$ and $R_4$ represent the lateral angular velocity and its rate of change; $R_5$ and $R_6$ represent the roll angle and its rate of change; $x$ is an angle value; $v$ is the vehicle speed; $e_y$ is the lateral distance; and $k_i$, $k_j$ are the reward factors.
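The overall reward $R = v \cdot (R_1 + R_2 + \dots + R_6)$ can be written as the short Python sketch below. The expressions for the individual terms $R_1$ to $R_6$ are not reproduced in the source, so the sketch takes them as already-computed inputs; the function name `total_reward` is hypothetical.

```python
def total_reward(v, terms):
    """Combine six reward terms as R = v * (R1 + R2 + ... + R6).

    v:     vehicle speed.
    terms: the six terms (lateral distance error and its rate, lateral
           angular velocity and its rate, roll angle and its rate),
           computed elsewhere; their formulas are not given here.
    """
    if len(terms) != 6:
        raise ValueError("expected exactly six reward terms")
    return v * sum(terms)
```

Because the sum is scaled by the speed, the same per-term scores earn more reward at higher speed and none when the vehicle is stationary.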
Beneficial effects: compared with the prior art, the invention has the following beneficial effects: 1. the invention designs a comprehensive roll control method for automatic driving vehicles based on a reinforcement learning algorithm (DDPG); the roll of the vehicle is controlled in a complex road environment through reinforcement learning, so that the automatic driving vehicle can drive under complex road conditions and extreme weather by balancing exploration and exploitation; 2. for extreme driving conditions, iterative learning control compensates the DDPG algorithm, so that comprehensive control of the vehicle is achieved and safe driving of the vehicle is ensured.
Drawings
Fig. 1 is a schematic diagram of the DDPG network architecture;
Fig. 2 is a schematic diagram of the integrated control of vehicle roll;
Fig. 3 is a flow chart of the integrated roll control of the autonomous vehicle.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings.
The invention provides an automatic driving vehicle roll control method based on DDPG and iterative control, which specifically comprises the following steps:
Step 1: Install lidar, vision, millimeter-wave radar, and ultrasonic radar sensors, a positioning system, and an inertial navigation system on the autonomous vehicle.
The invention targets the road conditions of a cross-sea bridge. Its aim is to keep the vehicle running safely at low to medium speed (5-80 km/h) and, by controlling the vehicle's behavior during momentary roll, to achieve a high level of vehicle intelligence. To this end, several lidar, machine vision, millimeter-wave radar, and ultrasonic radar sensors are mounted on the autonomous vehicle, together with a positioning system, an inertial navigation system (IMU), and the like. The lidar sensors detect dynamic and static obstacles on the road, including pedestrians, motorcycles, and vehicles of various kinds, as well as the drivable road area; the machine vision sensors perceive lane lines, pedestrians, and vehicles and perform simultaneous localization and mapping; the millimeter-wave radar sensors measure the distance from the vehicle to pedestrians and to moving vehicles; the ultrasonic radar measures the distance to vehicles at short range; and the positioning system and the inertial navigation system (IMU) provide vehicle positioning.
Step 2: Use the vision sensor, the positioning system, and the inertial navigation system to obtain the vehicle's position and a map in each scene, so as to generate driving maps of the autonomous vehicle for the different scenes and provide the environment required for the vehicle's driving trajectory.
The visual sensor, the positioning system, and the inertial navigation system (IMU) are used to obtain the vehicle's position and a map in sunny, rainy, snowy, foggy, strong-wind, and other severe weather, so as to generate driving maps of the autonomous vehicle for five scenes: cross-sea bridge road conditions in rain and snow; cross-sea bridge road conditions in strong wind and severe weather; road conditions when the bridge vibrates on a sunny day; road conditions for a single vehicle driving in frequently changing weather; and road conditions for multiple vehicles driving in frequently changing weather. These maps provide the environment required for the vehicle's driving trajectory.
Step 3: Drive on the cross-sea bridge by operating the steering wheel, accelerator, and brake pedal, collect the corresponding driving trajectories in rain and snow, in strong wind and severe weather, and on sunny days, and construct a data set.
An experienced driver drives on the cross-sea bridge in each of the five scenes by operating the steering wheel, accelerator, and brake pedal, and the corresponding driving trajectories are recorded to construct the data sets. Each data set includes the vehicle speed, driving trajectory, vehicle position, heading angle, slip angle, yaw rate, and roll angle, which provide the reference data needed for training and for evaluating the controllability of the vehicle.
Step 4: Train the DDPG algorithm on the driving maps of the autonomous vehicle for the different scenes, covering the driving states on the cross-sea bridge at different road-condition complexity levels in severe weather. The autonomous vehicle generates real-time vehicle states by interacting with the map environment of each scene and determines its actions. During action training, the action space is initialized, the online policy network in the actor network processes the state-space information and outputs an action, and action noise is added to obtain an exploratory action space.
As shown in Fig. 1, an actor network is constructed that takes the vehicle state and the environment state as input and outputs a vector of steering angle, throttle, and brake, corresponding to the 3 neurons of the output layer of the actor policy network. The activation function for throttle and brake is Sigmoid, and the activation function for the steering action value is Tanh. The hidden layers are structured as follows: the first layer is a convolutional layer with a 7×7 kernel, 48 filters, and stride 4, 200 neurons in total; the second layer is a convolutional layer with a 5×5 kernel, 16 filters, stride 2, and ReLU activation, 400 neurons in total; the third layer is an LSTM layer with 100 neurons; the fourth layer is a fully connected layer of 128 units; and the fifth layer is a fully connected layer of 128 units. The critic network takes the state and the action as input; they are concatenated and passed through two hidden layers with ReLU activation, 200 neurons in the first and 400 in the second, to obtain the Q value. Define $h_i \in (S_{t-T}, S_{t-T+1}, \dots, S_t)$, where $S_{t-T}$ and $S_t$ are the state information at the start of the history window and at the current time, respectively. The encoded state is $s = f(h_i; \beta)$, and the policy of the modified actor network is defined as $a = \mu(h_i/\beta, \gamma_\pi) + \eta$.
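As a rough illustration of the actor's output layer only (not of the full network), the sketch below maps three raw output values to steering, throttle, and brake with the activations named above and adds an exploration term η. The function names, the noise values, and the final clipping to the activation ranges are assumptions made for the example.

```python
import math


def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def actor_output_head(pre_activations, noise=(0.0, 0.0, 0.0)):
    """Turn 3 raw output-layer values into (steering, throttle, brake).

    Steering uses Tanh, throttle and brake use Sigmoid, as in the text.
    `noise` stands in for the exploration term eta; clipping the noisy
    action back into range is an assumption, not stated in the source.
    """
    z_steer, z_throttle, z_brake = pre_activations
    raw = (math.tanh(z_steer), sigmoid(z_throttle), sigmoid(z_brake))
    bounds = ((-1.0, 1.0), (0.0, 1.0), (0.0, 1.0))
    return tuple(
        min(max(value + eta, low), high)
        for value, eta, (low, high) in zip(raw, noise, bounds)
    )


# Hypothetical inputs: zero pre-activations, then the same with noise added.
deterministic = actor_output_head((0.0, 0.0, 0.0))
exploratory = actor_output_head((0.0, 0.0, 0.0), noise=(0.1, 0.6, -0.7))
```

Keeping the noise outside the network, as here, leaves the deterministic policy μ unchanged while still producing exploratory actions.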
The action space of the vehicle is designed to comprise the steering wheel angle δ, the braking signal, and the throttle signal. The driving environment of the vehicle is complex. When the road environment is complex, the vehicle drives at varying speed, and braking is an action taken under extreme driving conditions to prevent roll and rollover caused by braking on a wet, slippery road surface; the action space is then set to three actions: steering wheel angle, braking signal, and throttle signal. Under normal driving conditions the vehicle is assumed to run at constant speed, and, to prevent roll caused by a wet, slippery road surface, the action space is set to the steering wheel angle and the throttle signal. The constraint ranges of the three actions are set according to these two driving conditions, so that the vehicle drives controllably within the drivable road area.
The vehicle obtains state data by exploring and exploiting the environment. The generated state data generally include the lateral distance and its rate of change and the vehicle roll angle and its rate of change, and are stored in an experience buffer. When iterative learning is used for vehicle control, parameters such as the reference vehicle state and the vehicle trajectory must be designed; the reference state and trajectory can be taken from the experience buffer, and the reference trajectory can be adjusted according to the complexity of the road environment.
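A minimal sketch of such an experience buffer is given below, assuming a fixed capacity and uniform random sampling; neither detail is specified in the text, and the class and field names are hypothetical.

```python
import random
from collections import deque


class ExperienceBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest entry when full.
        self._items = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self._items.append((state, action, reward, next_state))

    def sample(self, batch_size):
        """Return up to batch_size transitions chosen uniformly at random."""
        items = list(self._items)
        return random.sample(items, min(batch_size, len(items)))

    def __len__(self):
        return len(self._items)


# Hypothetical usage: states are (lateral distance, roll angle) pairs.
buffer = ExperienceBuffer(capacity=3)
for step in range(5):
    buffer.add((0.1 * step, 0.01 * step), (0.0, 0.5), 1.0, (0.1 * (step + 1), 0.0))
batch = buffer.sample(2)
```

A reference state or trajectory for the iterative learning step could then be read from the sampled transitions.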
In the five scenes (cross-sea bridge road conditions in rain and snow, cross-sea bridge road conditions in strong wind and severe weather, road conditions when the bridge vibrates on a sunny day, road conditions for a single vehicle driving in frequently changing weather, and road conditions for multiple vehicles driving in frequently changing weather), the trajectories generated are used as reference paths, and the vehicle's actually planned path is compared with the reference trajectory to obtain the error. The reference trajectory must also be modified and adjusted under constraint conditions that satisfy the vehicle dynamics before it is set as the actual path, which can be expressed as follows:
where σ is a path influence factor, $p_{ref}$ is the reference trajectory, and $p_{act}$ is the actual trajectory. That is, when the vehicle drives in different road environments, the collected driving trajectory must be modified and adjusted to conform to the vehicle dynamics during automatic driving before it can be used as the driving trajectory of the autonomous vehicle.
Step 5: As shown in Fig. 2, a predicted path of the autonomous vehicle state is generated from the LSTM historical memory and the road planning attributes; the DDPG algorithm provides tracking control of the path trajectory of the autonomous vehicle under normal and extreme driving road conditions, and an iterative control method provides compensation control of the autonomous vehicle.
To ensure safety and stability under normal driving road conditions, that is, the road conditions when the bridge vibrates on a sunny day among the five scenes, the vehicle dynamics model must account for the roll, sideslip, and yaw dynamics of the vehicle. Vehicle state constraints are set, and the lateral stability range, the maximum steering angle range, and the allowable range of vehicle control for preventing roll are determined, so as to reduce the lateral deviation error of the vehicle:
$\omega_{z\text{-min}} \le \omega_z \le \omega_{z\text{-max}}$, $\omega_{x\text{-min}} \le \omega_x \le \omega_{x\text{-max}}$, $u_{x\text{-min}} \le u_x \le u_{x\text{-max}}$, $e_{r\text{-}x\text{-min}} \le e_r \le e_{r\text{-}x\text{-max}}$
where $\psi = [v_y\ \omega_z\ \omega_x\ \phi]^T$ is the state vector, $u_1$ is the control input, and $u_2$ is an auxiliary control input; $\omega_z$ is the yaw rate; $\omega_x$ is the roll angle of the vehicle; $u_x$ is the steering angle; and $e_r$ is the lateral tracking offset error.
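The range constraints above amount to a per-variable bounds check. The sketch below shows one way to express it; the variable names and all numeric limits are hypothetical placeholders, since the text gives no values.

```python
def check_state_constraints(state, limits):
    """Return {name: bool}, True where low <= state[name] <= high."""
    return {
        name: low <= state[name] <= high
        for name, (low, high) in limits.items()
    }


# Hypothetical limits, for illustration only (the source gives no numbers).
limits = {
    "yaw_rate": (-0.5, 0.5),        # omega_z
    "roll": (-0.1, 0.1),            # omega_x
    "steering_angle": (-0.6, 0.6),  # u_x
    "lateral_error": (-0.3, 0.3),   # e_r
}
state = {
    "yaw_rate": 0.2,
    "roll": 0.15,
    "steering_angle": -0.1,
    "lateral_error": 0.0,
}
result = check_state_constraints(state, limits)
violations = [name for name, ok in result.items() if not ok]
```

In a controller, a non-empty `violations` list would be the signal to restrict or correct the commanded action.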
Based on the road state information predicted by the LSTM, an objective function is constructed that considers the steering angle, tire adhesion coefficient, roll angle error, and path tracking error. With the dynamic constraint on the maximum allowable vehicle error fully taken into account, the physical constraint of the vehicle determined by the lateral tracking offset error is established, so as to reduce the tracking control error of the vehicle:
where $w_1, w_2, w_3, w_4$ are parameter variables and $\mu_r$ is the road adhesion coefficient.
As shown in Fig. 3, an iterative control algorithm provides compensation control of vehicle roll. The reference vehicle state, reference control input, and reference output value are set so that tracking is maintained under multiple constraints, namely the physical constraints of the vehicle and the road constraints, which improves the disturbance rejection of the vehicle while driving and reduces the model error rate. First, the DDPG algorithm trains the network model through interaction between the vehicle and the dynamic cross-sea bridge road conditions until the training task is completed. If the task is completed, the trained action is saved. If the task is not completed satisfactorily, the iterative control algorithm compensates the parameters of the output action space until the training task is completed with a better action, which ultimately yields better roll control of the autonomous vehicle.
The extreme driving road conditions are, among the five scenes, the cross-sea bridge road conditions in rain and snow, the cross-sea bridge road conditions in strong wind and severe weather, the road conditions for a single vehicle driving in frequently changing weather, and the road conditions for multiple vehicles driving in frequently changing weather. Under the influence of bad weather the road environment is uncertain, which disturbs the normal driving of the vehicle.
Under extreme driving road conditions the vehicle is prone to skidding on wet surfaces and to vibration. The driving trajectory changes, which affects the actual running speed of the vehicle, so that the actual vehicle speed deviates from the planned vehicle speed. The change in the actual vehicle speed can therefore be set as follows:
where $v_{ref}$ is the reference vehicle speed, $v_{act}$ is the actual vehicle speed, and α is an influence factor.
When the error between the actual vehicle speed and the reference vehicle speed is within 2 km/h, the actual vehicle speed can be assumed equal to the reference vehicle speed. When the error lies in the interval from 2 km/h to 5 km/h, the actual vehicle speed is taken as the reference vehicle speed together with the difference between the two speeds, where α is a speed change factor. When the error is greater than 5 km/h, the vehicle must brake immediately to ensure driving safety.
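The three speed cases can be sketched as below. The 2 km/h and 5 km/h thresholds come from the text; how α enters the middle case is not recoverable from this text, so the blend `v_ref + alpha * (v_act - v_ref)`, the handling of the exact boundary values, and the function name are assumptions.

```python
def adjust_speed(v_ref, v_act, alpha, small=2.0, large=5.0):
    """Return (speed_to_use_kmh, brake_now) for one speed comparison.

    small / large are the 2 km/h and 5 km/h thresholds from the text.
    The middle-case formula is an assumed form, not the patent's formula.
    """
    error = abs(v_act - v_ref)
    if error <= small:
        # Small deviation: treat the actual speed as the reference speed.
        return v_ref, False
    if error <= large:
        # Medium deviation: reference speed plus a scaled speed difference.
        return v_ref + alpha * (v_act - v_ref), False
    # Large deviation: request immediate braking.
    return v_act, True


# Hypothetical speeds in km/h.
small_case = adjust_speed(60.0, 61.0, alpha=0.5)
medium_case = adjust_speed(60.0, 64.0, alpha=0.5)
large_case = adjust_speed(60.0, 66.0, alpha=0.5)
```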
The path prediction of the autonomous vehicle passes through the LSTM historical memory state, and a predicted speed path is then generated using the road planning attributes. At time t, the set of vehicle actions and states is expressed as $\{v_1, \dots, v_i, \dots, v_n\}$, $i = 1, \dots, n$, and path planning is implemented with Bézier curves to produce a collision-free predicted trajectory. To judge the accuracy of the speed planning, a small sample of vehicle states and actions is taken from the experience buffer of the DDPG algorithm as reference values, $\{v_{r1}, \dots, v_{ri}, \dots, v_{rn}\}$, $i = 1, \dots, n$, and the error between the two is calculated as:
where $v_i$ is the vehicle speed, $l$ is the vehicle speed error rate, and $v_{ri}$ is the reference vehicle speed obtained from the experience buffer.
Under extreme conditions, given a desired state reference trajectory $S_r$, the output error is $e_k(t) = S_r(t) - S(t)$ and the learning law is $u_{k+1}(t) = L(u_k(t), e_k(t))$, which yields the compensation control action $a_k$. Under normal driving, the DDPG algorithm controls the vehicle and outputs the action $a_\pi = \mu(h_i/\beta, \gamma_\pi) + \eta$. The total control action is $a = a_k + a_\pi$; the reference formula is as follows:
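As an illustration only, and not the reference formula itself, the sketch below instantiates the generic learning law L with a P-type update, $u_{k+1}(t) = u_k(t) + \Gamma e_k(t)$, and forms the total action as the sum of the compensation and DDPG actions. The P-type choice, the gain, and the toy plant S = 0.5·u are assumptions made so the example can run.

```python
def ilc_update(u_k, e_k, gain):
    """P-type learning law (assumed form): u_{k+1}(t) = u_k(t) + gain * e_k(t)."""
    return [u + gain * e for u, e in zip(u_k, e_k)]


def total_action(a_k, a_pi):
    """Total control action a = a_k + a_pi, element by element."""
    return [comp + ddpg for comp, ddpg in zip(a_k, a_pi)]


def run_ilc(reference, plant, gain, iterations):
    """Repeat the same task, correcting the input after every trial."""
    u = [0.0] * len(reference)
    errors = list(reference)
    for _ in range(iterations):
        output = [plant(value) for value in u]               # S(t) for this trial
        errors = [r - y for r, y in zip(reference, output)]  # e_k(t) = S_r(t) - S(t)
        u = ilc_update(u, errors, gain)
    return u, errors


# Toy plant S = 0.5 * u (hypothetical); the error halves on every trial.
u_final, last_errors = run_ilc([1.0, 2.0], lambda u: 0.5 * u, gain=1.0, iterations=30)
combined = total_action([0.1, -0.2], [0.3, 0.5])
```

With this toy plant the learned input approaches twice the reference, which is the input that makes the output match the reference trajectory.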
To judge the accuracy of path planning, the trajectory error arising when the actual driving trajectory of the vehicle and the reference trajectory differ by a change in roll angle is calculated as follows. Roll motion of the vehicle occurs mainly in two cases. In the first, the bridge vibrates and the autonomously driven vehicle deviates from the planned path; the vehicle then rolls easily, producing a roll angle. The trajectory error in which the actual driving trajectory and the reference trajectory produce a change in roll angle can therefore be expressed as:
where $r_{act}$ is the actual trajectory of the vehicle under vibration; $r_{ref}$ is the reference trajectory; the angle-difference term is the maximum angle difference between the actual trajectory and the reference trajectory; σ is the deviation angle when the vehicle sideslips laterally; φ is the vehicle roll angle; d is the vertical vibration distance; and χ is the deviation angle when the bridge vibrates vertically, that is, the vertical inclination angle produced by the bridge deck.
The maximum vertical inclination angle χ produced by bridge vibration can be set to:
In the second case, bad weather makes the road surface wet and slippery and changes the road adhesion coefficient, causing roll, sideslip, and rollover of the vehicle. The trajectory error in which the actual driving trajectory and the reference trajectory produce a change in roll angle can therefore be expressed as:
where $r_{act}$ is the actual trajectory of the vehicle under vibration; $r_{ref}$ is the reference trajectory; the angle-difference term is the maximum angle difference between the actual trajectory and the reference trajectory; σ is the deviation angle when the vehicle sideslips laterally; φ is the vehicle roll angle; d is the vertical vibration distance; χ is the deviation angle when the bridge vibrates vertically, that is, the vertical inclination angle produced by the bridge deck; and μ is the road adhesion coefficient, with range $[0, 1]$.
According to the high-precision map built with the visual sensor, an optimal road driving area is searched for. Within it, the reference vehicle state, control input, and parameter output values are designed together with constraint conditions for multiple types of states, and roll control of the vehicle is realized with an iterative learning control algorithm, so that the DDPG algorithm plays a compensating role and the vehicle drives safely in the controllable road area. A limited road driving area is found from the high-precision map built with the visual sensor and the states predicted by the LSTM, and the ranges of the layered uncertain state parameters and action parameters are designed. When the vehicle drives in the limited road driving area, roll control is realized with DDPG together with the iterative control algorithm, the iterative control algorithm playing a compensating role. When the vehicle is in a limit condition, constraint conditions on the vehicle state must be added, and the reward function is constructed as follows:
$R = v \cdot (R_1 + R_2 + \dots + R_6)$
where $R_1$ and $R_2$ represent the lateral distance error and its rate of change; $R_3$ and $R_4$ represent the transverse angular velocity and its rate of change; $R_5$ and $R_6$ represent the roll angle and its rate of change; $x$ is an angle value; $v$ is the vehicle speed; $e_y$ is the lateral distance; and $k_i$, $k_j$ are the reward factors.
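The outer form of the reward, the speed multiplied by the sum of six terms, can be written directly, as in the sketch below. The individual terms are passed in as already-computed numbers because their expressions in terms of $e_y$, $x$, $k_i$, and $k_j$ are not reproduced in this text; the function name and the sample values are hypothetical.

```python
def roll_reward(speed, terms):
    """R = v * (R_1 + R_2 + ... + R_6), given six precomputed terms."""
    if len(terms) != 6:
        raise ValueError("expected exactly six reward terms R_1..R_6")
    return speed * sum(terms)


# Hypothetical term values, chosen only to make the arithmetic easy to follow.
example_reward = roll_reward(2.0, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
```

Scaling by the speed means that the same error terms count for more at higher speed.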
The above describes only specific practical embodiments of the invention and is not intended to limit its scope; all equivalent embodiments or modifications that do not depart from the technical spirit of the invention fall within its scope.
Claims (5)
1. An automatic driving vehicle roll control method based on DDPG and iterative control, which is characterized by comprising the following steps:
(1) Installing lidar, vision, millimeter-wave radar, and ultrasonic radar sensors, a positioning system, and an inertial navigation system on an autonomous vehicle;
(2) Using the visual sensor, the positioning system, and the inertial navigation system to obtain the vehicle's position and a map in each of different scenes, so as to generate driving maps of the autonomous vehicle for the different scenes and provide the environment required for the vehicle's driving trajectory;
(3) Driving on a cross-sea bridge by operating the steering wheel, accelerator, and brake pedal, collecting the corresponding driving trajectories in rain and snow, in strong wind and severe weather, and on sunny days, and constructing a data set;
(4) Training the DDPG algorithm on the driving maps of the autonomous vehicle for the different scenes, covering the driving states on the cross-sea bridge at different road-condition complexity levels in severe weather; the autonomous vehicle generates real-time vehicle states by interacting with the map environment of each scene and determines its actions; during action training, the action space is initialized, the online policy network in the actor network processes the state-space information and outputs an action, and action noise is added to obtain an exploratory action space;
(5) Generating a predicted path of the autonomous vehicle state from the LSTM historical memory and the road planning attributes, using the DDPG algorithm for tracking control of the path trajectory of the autonomous vehicle under normal and extreme driving road conditions, and using an iterative control method for compensation control of the autonomous vehicle;
the network design of the DDPG algorithm in the step (4) is as follows:
an actor network is constructed that takes the vehicle state and the environment state as input and outputs a vector of steering angle, throttle, and brake signals, corresponding to the 3 neurons of the output layer of the actor policy network; the activation function for throttle and brake is Sigmoid, and the activation function for the steering action value is Tanh; the hidden layers are structured as follows: the first layer is a convolutional layer with a 7×7 kernel, 48 filters, and stride 4, 200 neurons in total; the second layer is a convolutional layer with a 5×5 kernel, 16 filters, stride 2, and ReLU activation, 400 neurons in total; the third layer is an LSTM layer with 100 neurons; the fourth layer is a fully connected layer of 128 units; the fifth layer is a fully connected layer of 128 units; the critic network takes the state and the action as input, which are concatenated and passed through two hidden layers with ReLU activation, 200 neurons in the first and 400 in the second, to obtain the Q value; define $h_i \in (S_{t-T}, S_{t-T+1}, \dots, S_t)$, where $S_{t-T}$ and $S_t$ are the state information at the start of the history window and at the current time, respectively; the encoded state is $s = f(h_i; \beta)$, and the policy of the modified actor network is defined as the action $a = \mu(h_i/\beta, \gamma_\pi) + \eta$;
The implementation process of the tracking control of the path track of the automatic driving vehicle under the normal driving road condition in the step (5) is as follows:
the method comprises the steps of establishing a vehicle dynamics model by considering roll, sideslip and yaw dynamics characteristics of a vehicle under normal running road conditions, namely road conditions when a bridge vibrates on a sunny day, setting vehicle state constraint conditions, and determining a lateral stability range, a maximum steering angle range and a range of allowable vehicle control for preventing roll so as to reduce a lateral deviation error of the vehicle:
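$\omega_{z\text{-min}} \le \omega_z \le \omega_{z\text{-max}}$, $\omega_{x\text{-min}} \le \omega_x \le \omega_{x\text{-max}}$, $u_{x\text{-min}} \le u_x \le u_{x\text{-max}}$, $e_{r\text{-}x\text{-min}} \le e_r \le e_{r\text{-}x\text{-max}}$
where $\omega_z$ is the yaw rate; $\omega_x$ is the roll angle of the vehicle; $u_x$ is the steering angle; and $e_r$ is the lateral tracking offset error;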
according to LSTM predicted road state information, constructing an objective function considering steering angle, tire attachment coefficient, roll angle error and path tracking error, and determining physical constraint of the vehicle determined by transverse tracking offset error under the condition of full consideration of the dynamics constraint condition of maximum allowable error of the vehicle so as to reduce the error of tracking control of the vehicle:
where $w_1, w_2, w_3, w_4$ are parameter variables and $\mu_r$ is the road adhesion coefficient;
adopting an iterative control algorithm to realize compensation control of vehicle roll, and setting a reference vehicle state, a reference control input and a reference output value to ensure a tracking function under the conditions of vehicle physical constraint and road constraint under the condition of multiple constraints, thereby improving the anti-interference performance of the vehicle during running and reducing a model error rate;
according to the driving conditions of the vehicle, the state space, action space, and reward function required by the DDPG algorithm are constructed; the action space mainly comprises the steering wheel angle, throttle, and braking signals; the state space comprises the lateral tracking error of the vehicle and its rate of change, the roll angle error of the vehicle and its rate of change, and the yaw rate error and its rate of change; the reward function is constructed as follows: when the cross-sea bridge vibrates, the actual trajectory of the vehicle changes and the road tilts by a certain angle, so the reward function equals the cumulative product of the discount factor and the change in speed.
2. The DDPG and iterative control-based roll control method for an autonomous vehicle of claim 1, wherein the lidar sensor of step (1) is used to detect dynamic and static obstacles on the road, including pedestrians, motorcycles, and vehicles of various kinds, as well as the drivable road area; the visual sensor is used to perceive lane lines, pedestrians, and vehicles and to perform simultaneous localization and mapping; the millimeter-wave radar sensor is used to measure the distance from the vehicle to pedestrians and to moving vehicles; the ultrasonic radar is used to measure the distance to vehicles at short range; and the vision sensor, the positioning system, and the inertial navigation system are used to provide vehicle positioning.
3. The DDPG and iterative control-based automatic driving vehicle roll control method of claim 1, wherein the different scenes in the step (2) comprise five scenes including a cross-sea bridge road condition in rainy and snowy weather, a cross-sea bridge road condition in strong wind and severe weather, a road condition when a bridge vibrates on a sunny day, a driving road condition of a single vehicle in frequent variable weather, and a driving road condition of multiple vehicles in frequent variable weather.
4. The DDPG and iterative control-based automatic driving vehicle roll control method of claim 1, wherein the dataset of step (3) comprises vehicle speed, travel trajectory, vehicle position, heading angle, slip angle, yaw rate, roll angle.
5. The DDPG and iterative control-based roll control method for an autonomous vehicle according to claim 1, wherein the implementation of the step (5) for implementing the tracking control of the path trajectory of the autonomous vehicle under the extreme driving road condition is as follows:
under extreme driving road conditions, namely under severe weather and other factors affecting road conditions, the phenomenon that the vehicle easily generates wet skid and vibration is considered, the driving track is changed, the actual running speed of the vehicle is affected, and the actual vehicle speed and the planned vehicle speed are deviated, so that the change of the actual vehicle speed is set as follows:
where $v_{ref}$ is the reference vehicle speed; $v_{act}$ is the actual vehicle speed; and α is an influence factor;
when the error between the actual vehicle speed and the reference vehicle speed is within 2 km/h, the actual vehicle speed is equal to the reference vehicle speed; when the error lies in the interval from 2 km/h to 5 km/h, the actual vehicle speed is taken as the reference vehicle speed together with the difference between the two speeds, where α is a speed change factor; when the error is greater than 5 km/h, the vehicle must brake immediately to ensure driving safety;
at time t, the set of vehicle actions and states is expressed as $\{v_1, \dots, v_i, \dots, v_n\}$, $i = 1, \dots, n$, and path planning is implemented with Bézier curves to produce a collision-free predicted trajectory; to judge the accuracy of the speed planning, a small sample of vehicle states and actions is taken from the experience buffer of the DDPG algorithm as reference values, $\{v_{r1}, \dots, v_{ri}, \dots, v_{rn}\}$, $i = 1, \dots, n$, and the error between the two is calculated as:
where $v_i$ is the vehicle speed; $l$ is the vehicle speed error rate; and $v_{ri}$ is the reference vehicle speed obtained from the experience buffer;
under extreme driving conditions, given a desired state reference trajectory $S_r$, the output error is $e_k(t) = S_r(t) - S(t)$ and the learning law is $u_{k+1}(t) = L(u_k(t), e_k(t))$, which yields the compensation control action $a_k$; under normal driving, the DDPG algorithm controls the vehicle and outputs the action $a_\pi = \mu(h_i/\beta, \gamma_\pi) + \eta$; the total control action is $a = a_k + a_\pi$, and the reference formula is as follows:
in order to judge the accuracy of path planning, the track error when the actual running track and the reference track of the vehicle generate the change of the roll angle is calculated as follows:
where $r_{act}$ is the actual trajectory of the vehicle under vibration; $r_{ref}$ is the reference trajectory; the angle-difference term is the maximum angle difference between the actual trajectory and the reference trajectory; μ is the road adhesion coefficient; σ is the deviation angle when the vehicle sideslips laterally; φ is the vehicle roll angle; d is the vertical vibration distance; and χ is the deviation angle when the bridge vibrates vertically, that is, the vertical inclination angle produced by the bridge deck;
the maximum vertical inclination angle χ produced by bridge vibration is set to:
searching for an optimal road driving area according to the high-precision map built with the visual sensor; designing, within the optimal road driving area, the reference vehicle state, the control input values, and the parameter output values, together with constraint conditions for multiple types of states; and realizing roll control of the vehicle with an iterative learning control algorithm;
finding a limited road driving area from the high-precision map built with the visual sensor and the states predicted by the LSTM, and designing the ranges of the layered uncertain state parameters and action parameters; when the vehicle drives in the limited road driving area, roll control is realized with DDPG together with the iterative control algorithm, the iterative control algorithm playing a compensating role; a reward function is constructed according to the constraint range of the vehicle state, as follows:
$R = v \cdot (R_1 + R_2 + \dots + R_6)$
where $R_1$ and $R_2$ represent the lateral distance error and its rate of change; $R_3$ and $R_4$ represent the transverse angular velocity and its rate of change; $R_5$ and $R_6$ represent the roll angle and its rate of change; $x$ is an angle value; $v$ is the vehicle speed; $e_y$ is the lateral distance; and $k_i$, $k_j$ are the reward factors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111353270.7A CN114228690B (en) | 2021-11-16 | 2021-11-16 | Automatic driving vehicle roll control method based on DDPG and iterative control |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111353270.7A CN114228690B (en) | 2021-11-16 | 2021-11-16 | Automatic driving vehicle roll control method based on DDPG and iterative control |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114228690A (en) | 2022-03-25 |
CN114228690B (en) | 2023-05-23 |
Family
ID=80749618
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111353270.7A Active CN114228690B (en) | 2021-11-16 | 2021-11-16 | Automatic driving vehicle roll control method based on DDPG and iterative control |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114228690B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115571160A (en) * | 2022-10-26 | 2023-01-06 | 清华大学 | Chassis domain controller and control method for automatic driving and vehicle |
WO2024113087A1 (en) * | 2022-11-28 | 2024-06-06 | Beijing Baidu Netcom Science Technology Co., Ltd. | On-board parameter tuning for control module for autonomous vehicles |
CN115973131B (en) * | 2023-03-20 | 2023-06-13 | 上海伯镭智能科技有限公司 | Mining area unmanned vehicle rollover prevention method and related device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110647839A (en) * | 2019-09-18 | 2020-01-03 | 深圳信息职业技术学院 | Method and device for generating automatic driving strategy and computer readable storage medium |
CN110850861A (en) * | 2018-07-27 | 2020-02-28 | 通用汽车环球科技运作有限责任公司 | Attention-based hierarchical lane-change deep reinforcement learning |
DE102019115707A1 (en) * | 2018-11-01 | 2020-05-07 | Carnegie Mellon University | Spatial and temporal attention-based deep reinforcement learning of hierarchical lane-change strategies for controlling an autonomous vehicle |
CN111845741A (en) * | 2020-06-28 | 2020-10-30 | 江苏大学 | Automatic driving decision control method and system based on hierarchical reinforcement learning |
CN113386790A (en) * | 2021-06-09 | 2021-09-14 | 扬州大学 | Automatic driving decision-making method for cross-sea bridge road condition |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11554785B2 (en) * | 2019-05-07 | 2023-01-17 | Foresight Ai Inc. | Driving scenario machine learning network and driving environment simulation |
US11157784B2 (en) * | 2019-05-08 | 2021-10-26 | GM Global Technology Operations LLC | Explainable learning system and methods for autonomous driving |
US11493926B2 (en) * | 2019-05-15 | 2022-11-08 | Baidu Usa Llc | Offline agent using reinforcement learning to speedup trajectory planning for autonomous vehicles |
Also Published As
Publication number | Publication date |
---|---|
CN114228690A (en) | 2022-03-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114228690B (en) | | Automatic driving vehicle roll control method based on DDPG and iterative control |
CN111845774B (en) | | Automatic driving automobile dynamic trajectory planning and tracking method based on transverse and longitudinal coordination |
Gao et al. | | Robust lateral trajectory following control of unmanned vehicle based on model predictive control |
JP4586795B2 (en) | | Vehicle control device |
CN107264534B (en) | | Based on the intelligent driving control system and method for driver experience's model, vehicle |
CN114407931A (en) | | Decision-making method for safe driving of highly-humanoid automatic driving commercial vehicle |
CN114564016A (en) | | Navigation obstacle avoidance control method, system and model combining path planning and reinforcement learning |
CN114379583B (en) | | Automatic driving vehicle track tracking system and method based on neural network dynamics model |
CN110286681A (en) | | A kind of dynamic auto driving lane-change method for planning track of variable curvature bend |
CN109733474A (en) | | A kind of intelligent vehicle steering control system and method based on piecewise affine hierarchical control |
CN114771563A (en) | | Method for realizing planning control of track of automatic driving vehicle |
CN113386790B (en) | | Automatic driving decision-making method for cross-sea bridge road condition |
CN115683145A (en) | | Automatic driving safety obstacle avoidance method based on track prediction |
CN113715842A (en) | | High-speed moving vehicle control method based on simulation learning and reinforcement learning |
CN112249008A (en) | | Unmanned automobile early warning method aiming at complex dynamic environment |
Kapania | | Trajectory planning and control for an autonomous race vehicle |
CN113255998A (en) | | Expressway unmanned vehicle formation method based on multi-agent reinforcement learning |
CN108711285B (en) | | Hybrid traffic simulation method based on road intersection |
CN116486356A (en) | | Narrow scene track generation method based on self-adaptive learning technology |
CN116182884A (en) | | Intelligent vehicle local path planning method based on transverse and longitudinal decoupling of frenet coordinate system |
CN114179818A (en) | | Intelligent automobile transverse control method based on adaptive preview time and sliding mode control |
CN113033902B (en) | | Automatic driving lane change track planning method based on improved deep learning |
CN117826590A (en) | | Unmanned vehicle formation control method and system based on prepositive following topological structure |
CN115214697A (en) | | Adaptive second-order sliding mode control intelligent automobile transverse control method |
CN115447615A (en) | | Trajectory optimization method based on vehicle kinematics model predictive control |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |