CN114954455A - Electric vehicle following running control method based on multi-step reinforcement learning - Google Patents
- Publication number: CN114954455A (application CN202210770539.XA)
- Authority
- CN
- China
- Legal status: Granted
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W30/00—Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units
- B60W30/14—Adaptive cruise control
- B60W30/16—Control of distance between vehicles, e.g. keeping a distance to preceding vehicle
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60L—PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
- B60L15/00—Methods, circuits, or devices for controlling the traction-motor speed of electrically-propelled vehicles
- B60L15/20—Methods, circuits, or devices for controlling the traction-motor speed of electrically-propelled vehicles for control of the vehicle or its driving motor to achieve a desired performance, e.g. speed, torque, programmed variation of speed
- B60L15/2045—Methods, circuits, or devices for controlling the traction-motor speed of electrically-propelled vehicles for control of the vehicle or its driving motor to achieve a desired performance, e.g. speed, torque, programmed variation of speed for optimising the use of energy
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W2556/00—Input parameters relating to data
- B60W2556/45—External transmission of data to or from the vehicle
- B60W2556/65—Data transmitted between vehicles
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/60—Other road transportation technologies with climate change mitigation effect
- Y02T10/72—Electric energy management in electromobility
Abstract
The invention discloses an electric vehicle following (car-following) control method based on multi-step reinforcement learning. The method comprises the following steps: 1. determine a state variable through an information acquisition module and a controller design module, and determine a control variable from the control target; 2. discretize the state variable and the equivalent inter-vehicle distance to obtain a Q table; 3. input the state variable and the control variable into the Q table to obtain an expected value function; 4. solve for the state variable at the next moment through the longitudinal dynamics module and the electric-vehicle energy storage module; 5. according to the acceleration change of the preceding vehicle at the next moment, select the Q-table entry with the minimum expected cost over the next n steps to obtain the control variable; 6. judge whether the control variable solved from the Q table is the optimal or suboptimal control variable. The invention uses the communication link between the preceding vehicle and the host vehicle, so that the host vehicle can obtain information such as the acceleration of the preceding vehicle and achieve the following effect, thereby reducing power consumption during car following.
Description
Technical Field
The invention belongs to the technical field of intelligent driving, and particularly relates to a multi-step-reinforcement-learning-based following control method for an electric vehicle following a preceding vehicle.
Background
In order to slow the trend of global warming and reduce carbon dioxide emissions, and as battery capacity and economy continue to improve, the pure electric vehicle has become one of the main development directions for new-energy vehicles, and its popularization and application are receiving ever more attention.
With the development of the times, global non-renewable resources are increasingly scarce, and their rational use must be taken more seriously even as development continues. The automobile has become an indispensable means of travel, so balancing automobile fuel consumption against the rational allocation of petroleum resources has become an important problem for human society. In addition, air pollution, global warming and related problems caused by petroleum combustion are increasingly prominent, and regulations on exhaust and fuel-consumption standards are ever stricter, so the development of new-energy electric intelligent automobiles has become necessary. New-energy vehicles have therefore attracted attention from both the automobile manufacturing industry and governments; with continuing scientific and technical progress they have made great strides, and the electric vehicle has become one of the important means of travel, occupying a large share of the market. Therefore, in order to further improve the power-consumption efficiency of new-energy electric vehicles and prolong battery service life, a following control method for electric vehicles based on multi-step reinforcement learning is provided.
Disclosure of Invention
The invention mainly considers that, with the wide use of electric vehicles in China and the gradual opening of the new-energy vehicle market, the number of electric vehicles in China will continue to rise. How to make better use of the development of electric vehicles, thereby reducing the exhaust emissions of fuel vehicles, improving the ecological environment and reducing battery power consumption, is a question worth discussing.
The invention aims to provide an intelligent vehicle following control method based on multi-step reinforcement learning, which uses the communication link between the preceding vehicle and the host vehicle so that the host vehicle can obtain information such as the acceleration of the preceding vehicle and achieve the following effect, thereby reducing power consumption during car following.
The above object is achieved by the following technical solutions:
Step 1. Determine the state variable X(t) through the information acquisition module and the controller design module of the vehicle, determine the control variable U(t) from the control target, and initialize the relevant vehicle parameters.
Step 2. Divide the speed V in the state variable X(t) equally into 50000 grid cells over the range -3.7 to 4.399, divide the equivalent inter-vehicle distance Δd equally into 50000 grid cells over the range -2 to 2.2999, and divide the acceleration (the control variable) into 21 grid cells from -1 to 1, thus constructing a Q table of size 50000 × 21.
Step 3. Input the state variable X(t) and the control variable U(t) into the Q table to obtain the expected value function.
Step 4. Solve for the state variable X(t+1) at the next moment through the longitudinal dynamics module and the electric-vehicle energy storage module.
Step 5. According to the acceleration change of the preceding vehicle at the next moment, select the Q-table entry with the minimum expected cost over the next n steps to obtain the control variable U(t+1).
Step 6. Judge whether the Q table has reached the maximum number of iterations or whether the tolerance satisfies the adaptive iteration value; if yes, take the solved control variable U(t+1) as the optimal or suboptimal control variable; otherwise return to step 4.
Step 7. After the iteration ends, obtain the control input from the Q table, compute the optimal required power Pe, and apply it to the host vehicle.
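For illustration only (not part of the patent text), the discretization of step 2 can be sketched as follows. The grid sizes and ranges are those stated above; the helper names (`build` logic, `state_to_index`) are hypothetical.

```python
import numpy as np

# Grid sizes and ranges as stated in step 2.
N_STATE = 50000      # grid cells per state dimension
N_ACTION = 21        # grid cells for the acceleration (control variable)

# Equal-width discretization grids (names are illustrative, not from the patent).
v_bins = np.linspace(-3.7, 4.399, N_STATE + 1)    # speed grid
d_bins = np.linspace(-2.0, 2.2999, N_STATE + 1)   # equivalent inter-vehicle distance grid
a_grid = np.linspace(-1.0, 1.0, N_ACTION)         # candidate accelerations a_min..a_max

def state_to_index(x, bins):
    """Map a continuous value to its grid cell, clipping to the table bounds."""
    i = int(np.digitize(x, bins)) - 1
    return min(max(i, 0), len(bins) - 2)

# Q table of size 50000 x 21, initialised to zero.
Q = np.zeros((N_STATE, N_ACTION))
```

A value such as a speed of 0.0 then maps to a row index via `state_to_index(0.0, v_bins)`, and `Q[row]` holds the 21 expected-cost entries for the candidate accelerations.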
The advantages and beneficial results of the method of the invention:
1. With the rapid development of China's intelligent-vehicle industry and the wide use of electric vehicles, electricity saving can be treated as a latent resource to be developed and exploited: the electric vehicle can act on market-demand signals to improve electricity-use efficiency and reduce battery loss, maintaining the stability and economy of the whole electric vehicle system. Compared with the traditional fuel vehicle, the broad development of the electric vehicle is one of the important targets of the current new-energy market.
2. The control variable of the electric vehicle is limited to a certain range, so that driving safety is not reduced by large changes during acceleration or deceleration, and passenger comfort is guaranteed to a certain extent.
3. The state variable is controlled within a certain range, so that a proper distance from the preceding vehicle is always kept while driving, guaranteeing the final driving target: driving safety.
4. Using the multi-step reinforcement learning algorithm, the minimum value function over a multi-step horizon can be obtained, so that the overall cost function is minimized and power-use efficiency is optimized, improving on the previous single-step update.
Drawings
FIG. 1 is a view of a vehicle following scene provided by the present invention.
Fig. 2 is a multi-step backtracking tree algorithm diagram based on the ACC policy provided by the present invention.
FIG. 3 is a simulation platform and parameter setting diagram of the multi-step reinforcement learning-based ecological ACC strategy provided by the invention.
Fig. 4 is a simulation experiment diagram of the intelligent automobile following driving control based on multi-step reinforcement learning under the WLTC driving cycle.
Detailed Description
The present invention will be described in detail with reference to specific embodiments.
The intelligent vehicle following control method based on the multi-step reinforcement learning algorithm is implemented according to the following steps.
Step 1. Determine the state variable X(t) through the information acquisition module and the controller design module of the vehicle, determine the control variable U(t) from the control target, and initialize the relevant vehicle parameters.
Step 2. Divide the speed V in the state variable X(t) equally into 50000 grid cells over the range -3.7 to 4.399, divide the equivalent inter-vehicle distance Δd equally into 50000 grid cells over the range -2 to 2.2999, and divide the acceleration (the control variable) into 21 grid cells from -1 to 1, thus constructing a Q table of size 50000 × 21.
Step 3. Input the state variable X(t) and the control variable U(t) into the Q table to obtain the expected value function.
Step 4. Solve for the state variable X(t+1) at the next moment through the longitudinal dynamics module and the electric-vehicle energy storage module.
Step 5. According to the acceleration change of the preceding vehicle at the next moment, select the Q-table entry with the minimum expected cost over the next n steps to obtain the control variable U(t+1).
Step 6. Judge whether the Q table has reached the maximum number of iterations or whether the tolerance satisfies the adaptive iteration value; if yes, take the solved control variable U(t+1) as the optimal or suboptimal control variable; otherwise return to step 4.
Step 7. After the iteration ends, obtain the control input from the Q table, compute the optimal required power Pe, and apply it to the host vehicle.
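The seven steps above can be sketched as the following control loop. This is a reduced, illustrative sketch: the dynamics, stage cost, table size and stopping rule are simple placeholders, since the patent's exact modules and tolerances are not reproduced here.

```python
import numpy as np

N_STATE, N_ACTION, N_STEPS = 200, 21, 3    # reduced sizes for illustration
a_grid = np.linspace(-1.0, 1.0, N_ACTION)
Q = np.zeros((N_STATE, N_ACTION))           # expected-cost table (steps 2-3)

def next_state(s, a):
    """Placeholder for the longitudinal-dynamics / energy-storage modules (step 4)."""
    return min(max(s + int(round(a * 5)), 0), N_STATE - 1)

def stage_cost(s, a):
    """Placeholder running cost (distance-keeping plus control effort)."""
    return abs(s - N_STATE // 2) * 0.01 + a * a

def n_step_cost(s, a, n):
    """Expected cost of taking a, then acting greedily for n-1 more steps (step 5)."""
    total, cur = stage_cost(s, a), next_state(s, a)
    for _ in range(n - 1):
        k = int(np.argmin(Q[cur]))
        total += stage_cost(cur, a_grid[k])
        cur = next_state(cur, a_grid[k])
    return total

def control_step(s, n=N_STEPS):
    """Pick the control with minimum expected cost over the next n steps (steps 5-6)."""
    costs = [n_step_cost(s, a, n) for a in a_grid]
    k = int(np.argmin(costs))
    Q[s, k] = costs[k]                      # update the table entry
    return a_grid[k], next_state(s, a_grid[k])

s = 50
for _ in range(10):                         # steps 6-7: iterate to a stop criterion
    u, s = control_step(s)
```

After the loop, `u` plays the role of the final control input drawn from the Q table, from which the required power Pe would be computed (step 7).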
Further, the initialization parameters in step 1 include the vehicle mass m, the air density ρ, the gravitational acceleration g, the rolling resistance coefficient μ, the nominal aerodynamic drag coefficient C_h,d of the host vehicle, the body length L_car of the electric vehicle, the motor efficiency η_m, the fixed gear ratio G_r, the minimum acceleration a_min, the maximum acceleration a_max, and the like.
Further, the step 1 is specifically realized as follows:
the method comprises the steps of modeling longitudinal dynamics of a vehicle, and modeling basic information of the vehicle and physical quantities of the vehicle. The scene of the vehicle following is shown in figure 1.
1-1. Establish a second-order vehicle longitudinal dynamics model as follows:
dS(t)/dt = V(t),  dV(t)/dt = U(t)  (1)
wherein S represents the position of the vehicle and dS/dt its time derivative; V represents the longitudinal speed of the vehicle and dV/dt its time derivative; U is the control input.
1-2. Establish the vehicle longitudinal force-balance equation of motion as follows:
m·dV(t)/dt = F_hf(t) + F_hr(t) - F_a(t) - F_r(t)  (2)
wherein F_hf(t) and F_hr(t) are the longitudinal tire forces of the front and rear wheels at time t, respectively, F_a(t) is the air resistance at time t, F_r(t) is the rolling resistance at time t, m is the vehicle mass, and V is the vehicle longitudinal speed.
The air resistance acting on the vehicle at time t can be expressed as:
F_a(t) = (1/2)·ρ·C_D(d_h)·A_v·V²(t)  (3)
where ρ is the air density, C_D(d_h) is the aerodynamic drag coefficient related to the distance between the two vehicles, A_v is the frontal windward area of the host vehicle, and V²(t) is the square of the host-vehicle speed.
The rolling resistance at time t is expressed as follows:
F_r(t) = m·g·μ  (4)
wherein g is the gravitational acceleration, μ is the rolling resistance coefficient, and m is the vehicle mass.
The aerodynamic drag coefficient can be expressed as:
C_D(d_h) = C_h,d·(1 - c_1/(c_2 + d_h))  (5)
wherein C_h,d represents the nominal aerodynamic drag coefficient of the host vehicle; parameters c_1 and c_2 are obtained by regression of experimental data, and d_h is the distance between the two vehicles.
The following distance d_h between the two vehicles is calculated as:
d_h = S_p - S_h - L_car  (6)
wherein L_car is the body length of the electric vehicle, and S_p and S_h are the positions of the preceding vehicle and the host vehicle, respectively.
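Equations (3)-(6) can be checked numerically with a sketch like the following. The parameter values are made-up placeholders, and the distance-dependent drag form C_D(d_h) = C_h,d·(1 - c_1/(c_2 + d_h)) is an assumption inferred from the surrounding text, not a value taken from the patent.

```python
# Illustrative parameter values (placeholders, not from the patent).
rho, A_v, m, g, mu = 1.2, 2.2, 1500.0, 9.81, 0.012
C_hd, c1, c2, L_car = 0.30, 0.5, 10.0, 4.5

def following_distance(S_p, S_h):
    """Eq. (6): bumper-to-bumper gap between preceding and host vehicle."""
    return S_p - S_h - L_car

def drag_coefficient(d_h):
    """Assumed form of eq. (5): drag falls as the gap d_h shrinks (platooning)."""
    return C_hd * (1.0 - c1 / (c2 + d_h))

def air_resistance(V, d_h):
    """Eq. (3): distance-dependent aerodynamic drag."""
    return 0.5 * rho * drag_coefficient(d_h) * A_v * V * V

def rolling_resistance():
    """Eq. (4): F_r = m*g*mu (constant for level road)."""
    return m * g * mu

d = following_distance(120.0, 100.0)   # a 15.5 m gap
F_a = air_resistance(20.0, d)          # drag at 20 m/s behind a leader
```

At any finite gap the drag stays below the free-stream value 0.5·ρ·C_h,d·A_v·V², which is the platooning benefit the distance-dependent coefficient captures.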
1-3. Establish the motor drive model of the electric vehicle.
The controlled subject is an electric vehicle whose driving force is provided by a motor. In order to accurately describe the dynamic characteristics of the vehicle, and without considering the motor-efficiency constraint, the relation between the actual motor output torque T_m and the motor speed ω_m can be expressed as:
T_m(t) = R·T_ω(t)/G_r,  ω_m(t) = G_r·V(t)/R  (7)
wherein R and G_r are the tire radius and the fixed gear ratio, respectively, and T_ω(t) is the traction force at time t.
1-4. Establish the battery power model of the electric vehicle. Neglecting the effect of auxiliaries on battery power, the desired battery output power can be taken as equivalent to the desired motor input power, expressed as follows:
P_bat(t) = T_m(t)·ω_m(t)/η_m(t)  (8)
wherein P_bat is the battery power and η_m is the motor efficiency, where the motor efficiency can be described as:
η_m(t) = f_m(ω_m(t), T_m(t))  (9)
wherein f_m represents the power-conversion efficiency function of the motor.
1-5. Establish the charge/discharge resistance model of the electric vehicle, as follows:
wherein SoC_bat(t) is the state of charge of the battery at time t, R_bat(t) is the resistance of the battery at time t, the remaining coefficients are the charging-model coefficients and the discharging-model coefficients of the battery pack, and I_bat(t) is the battery-pack current at time t.
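A minimal numeric sketch of the motor/battery relations (7)-(9) follows. The gear-ratio mapping, the flat 0.9 efficiency and the regeneration branch are assumptions standing in for the patent's efficiency map and figures.

```python
R, G_r = 0.3, 8.0      # tire radius [m] and fixed gear ratio (placeholder values)

def motor_from_wheel(T_w, V):
    """Assumed eq. (7): motor torque and speed from traction force and speed."""
    T_m = T_w * R / G_r    # traction force -> wheel torque -> motor torque
    w_m = V * G_r / R      # vehicle speed -> wheel speed -> motor speed [rad/s]
    return T_m, w_m

def motor_efficiency(w_m, T_m):
    """Eq. (9): efficiency map f_m; a flat 0.9 stands in for the real map."""
    return 0.9

def battery_power(T_w, V):
    """Eq. (8): desired battery output power ~ motor input power.

    Driving: mechanical power divided by efficiency (battery supplies losses).
    Braking: mechanical power times efficiency (regeneration recovers less).
    """
    T_m, w_m = motor_from_wheel(T_w, V)
    eta = motor_efficiency(w_m, T_m)
    mech = T_m * w_m
    return mech / eta if mech >= 0 else mech * eta

P = battery_power(800.0, 20.0)   # 800 N traction at 20 m/s
```

Note that `motor_from_wheel` preserves mechanical power (T_m·ω_m equals T_ω·V), so the battery draw exceeds the mechanical power exactly by the motor losses.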
Step 2. Perform ecological adaptive cruise control (Eco-ACC) of the electric vehicle based on multi-step reinforcement learning, and determine the optimization objectives.
2-1. Optimization objective based on safe driving of the vehicle. In order to ensure driving safety, the inter-vehicle distance must satisfy:
d_min(t) ≤ d_h(t) ≤ d_max(t)  (11)
wherein d_h(t) is the inter-vehicle distance between the host vehicle and the preceding vehicle at time t, and d_min(t) and d_max(t) are the minimum and maximum permissible inter-vehicle distances, both obtained from the Q table.
d_min(t) and d_max(t) can be expressed as:
2-2. Optimization objective based on the driving comfort of the vehicle. To ensure driving comfort, the control input of the electric vehicle must satisfy:
a_min ≤ U(t) ≤ a_max  (13)
wherein a_min and a_max are the minimum and maximum allowed accelerations, respectively. In the present invention, a_min = -1 m/s² and a_max = 1 m/s².
2-3. Optimization objective based on prolonging the service life of the vehicle battery. Reducing the accumulated squared battery current can extend the battery life. Therefore, in order to extend the battery life as much as possible, the following quantity should be made as small as possible:
J_bat = ∫ from t_0 to T_cyc of I_bat²(t) dt  (14)
wherein I_bat²(t) is the square of the battery-pack current at time t, t_0 is the start time of the driving cycle, and T_cyc is the end time of the driving cycle.
2-4. Optimization objective based on the energy economy of the vehicle.
In order to improve the energy economy of the electric vehicle, the following quantity should be made as small as possible:
J_e = ∫ from t_0 to T_cyc of P_bat(t) dt  (15)
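Over a sampled driving cycle, the battery-life term (14) and the energy term (15) reduce to discrete sums. The current and power traces below are synthetic, purely to show the computation.

```python
import numpy as np

dt = 1.0                                             # sampling interval [s]
I_bat = np.array([10.0, 12.0, -8.0, 5.0])            # synthetic current trace [A]
P_bat = np.array([2000.0, 2500.0, -1500.0, 900.0])   # synthetic power trace [W]

# Eq. (14): accumulated squared current over the cycle (battery-ageing proxy).
J_bat = float(np.sum(I_bat**2) * dt)

# Eq. (15): accumulated battery power, i.e. net energy drawn over the cycle.
J_energy = float(np.sum(P_bat) * dt)
```

Squaring in (14) makes charge and discharge current count equally, while (15) lets regenerated power (negative entries) offset consumption.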
and 3, determining the algorithm based on the multi-step reinforcement learning as an n-step tree backtracking algorithm.
3-1, determining state variables and control variables under the research of the Eco-ACC strategy based on multi-step reinforcement learning.
State variable X(t): in order for the electric vehicle to follow the preceding vehicle within a reasonable inter-vehicle distance range, equation (11) must be satisfied. The following performance can therefore be evaluated using the inter-vehicle distance deviation Δd and the speed deviation ΔV, which can be defined as:
X(t) = [ΔV(t), Δd(t)]^T  (16)
wherein,
ΔV(t) = V_p(t) - V(t)  (18)
wherein BSF(·) is a band-stop function; α, β and c_f are coefficients related to the band-stop function, see Table 1; V_p(t) is the speed of the preceding vehicle at time t.
In order to express the state information of the vehicle more intuitively, the band-stop function is improved, and the inter-vehicle distance is described as follows:
where δd(t) is the equivalent inter-vehicle distance deviation at time t, and α, β, c_f and z are coefficients related to the band-stop function, see Table 1.
Control variable: the control variable of the present invention is the acceleration.
U(t) = a(t)  (20)
where a(t) is the acceleration of the host vehicle at time t.
3-2. Determine the reward function and the value function for the Eco-ACC strategy based on multi-step reinforcement learning.
Reward function: to achieve the control objective, the reward function is given as follows:
r(X(t), U(t)) = α_1·L_1(t) + α_2·L_2(t) + α_3·L_3(t)  (21)
wherein α_1, α_2 and α_3 are the weighting coefficients of the reward function, see Table 1. L_1, L_2 and L_3 can be expressed as:
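The weighted reward (21) can be sketched as follows. The concrete forms of the three terms L_1 to L_3 (here: safety, comfort, energy) and the weight values are placeholders, since their exact definitions and the Table 1 values are not reproduced in the text.

```python
# Placeholder weights (the real values come from Table 1 of the patent).
alpha1, alpha2, alpha3 = 1.0, 0.5, 0.1

def L1(delta_d):
    """Safety term (assumed): penalise deviation from the desired gap."""
    return delta_d**2

def L2(a):
    """Comfort term (assumed): penalise large accelerations."""
    return a**2

def L3(P_bat):
    """Energy term (assumed): penalise battery power draw, scaled to O(1)."""
    return abs(P_bat) * 1e-4

def reward(delta_d, a, P_bat):
    """Eq. (21): r = alpha1*L1 + alpha2*L2 + alpha3*L3 (a cost to be minimised)."""
    return alpha1 * L1(delta_d) + alpha2 * L2(a) + alpha3 * L3(P_bat)

r = reward(0.5, 0.2, 3000.0)
```

Tuning the three weights trades off gap-keeping accuracy against ride comfort and energy use, which is exactly the role Table 1 plays in the patent.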
Value function: the value function of the ecological ACC strategy based on multi-step reinforcement learning can be expressed as follows:
wherein γ is the discount factor, and α and β are parameters of the band-stop function, detailed in Table 1.
In reinforcement learning, the ultimate goal of the agent is to maximize the cumulative reward. The reward function judges whether an action is good or bad over a short horizon; the value function can therefore be described as:
3-3. The multi-step learning algorithm adopts an off-policy method that does not require importance sampling, namely the tree-backup method; the n-step tree-backup diagram is shown in Fig. 2.
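The n-step tree-backup update (the off-policy multi-step method that avoids importance sampling) can be sketched as follows. The toy trajectory, greedy target policy and step size are illustrative, not the patent's vehicle model.

```python
import numpy as np

gamma = 0.9
nS, nA = 5, 2
Q = np.zeros((nS, nA))

def greedy_policy(s):
    """Greedy target policy pi(.|s): probability 1 on the argmax action."""
    p = np.zeros(nA)
    p[int(np.argmax(Q[s]))] = 1.0
    return p

def tree_backup_target(trajectory, t, n):
    """n-step tree-backup return G for the transition at time t.

    trajectory: list of (state, action, reward, next_state) tuples.
    Non-taken actions contribute their Q values weighted by pi (the leaves);
    the taken action's branch recurses one level deeper.
    """
    s, a, r, s_next = trajectory[t]
    pi = greedy_policy(s_next)
    if n == 1 or t == len(trajectory) - 1:
        return r + gamma * float(np.dot(pi, Q[s_next]))
    a_next = trajectory[t + 1][1]
    leaf = sum(pi[b] * Q[s_next, b] for b in range(nA) if b != a_next)
    return r + gamma * (leaf + pi[a_next] * tree_backup_target(trajectory, t + 1, n - 1))

# One update on a toy 3-transition trajectory.
traj = [(0, 1, 1.0, 1), (1, 0, 0.5, 2), (2, 1, 2.0, 3)]
alpha = 0.5
s, a, _, _ = traj[0]
G = tree_backup_target(traj, 0, n=3)
Q[s, a] += alpha * (G - Q[s, a])
```

Because branches the target policy would not take get probability zero, no importance-sampling ratio is needed: the backup tree simply truncates along improbable branches, which is what makes the method off-policy yet stable.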
3-4. The simulation platform and parameter settings of the Eco-ACC strategy based on multi-step reinforcement learning are shown in Fig. 3.
The band-stop-function parameters and the reward-function weight coefficients are shown in Table 1.
TABLE 1
Step 4. The intelligent vehicle following control method based on the multi-step reinforcement learning algorithm is verified under different driving cycles, as shown in the table below.
TABLE 2. Intelligent vehicle following control method based on the multi-step reinforcement learning algorithm under different driving cycles
The simulation results are shown in Fig. 4 and show that: the speed of the vehicle controlled by the Eco-ACC system based on the multi-step reinforcement learning algorithm closely tracks that of the preceding vehicle, and its acceleration is smoother than under a conventional ACC system, so passengers feel more comfortable; the actual distance to the preceding vehicle always remains within the safe range, ensuring driving safety; and vehicles controlled by the Eco-ACC system are more energy-efficient than vehicles controlled by conventional ACC systems.
Claims (4)
1. An electric vehicle following running control method based on multi-step reinforcement learning is characterized by comprising the following steps:
step 1, determining a state variable X (t) through an information acquisition module and a controller design module of a vehicle, determining a control variable U (t) through a control target, and initializing relevant parameters of the vehicle;
step 2, dividing the speed V in the state variable X(t) equally into 50000 grid cells over the range -3.7 to 4.399, dividing the equivalent inter-vehicle distance Δd equally into 50000 grid cells over the range -2 to 2.2999, and dividing the acceleration of the control variable into 21 grid cells from -1 to 1, thus constructing a Q table of size 50000 × 21;
step 3, inputting the state variable X(t) and the control variable U(t) into the Q table to obtain an expected value function;
step 4, solving a state variable X (t +1) at the next moment through the longitudinal dynamics module and the electric automobile energy storage module;
step 5, selecting a Q table with the minimum expected cost value within the next n steps in the future to obtain a control variable U (t +1) according to the acceleration change condition of the front vehicle at the next moment;
step 6, judging whether the Q table meets the maximum iteration times or whether the tolerance meets the self-adaptive iteration value; if yes, the solved control variable U (t +1) is used as the optimal or suboptimal control variable, otherwise, the step 4 is returned;
and 7, after iteration is ended, obtaining control input from the Q table, obtaining the optimal required power Pe through calculation, and applying the optimal required power Pe to the main vehicle.
2. The electric vehicle following running control method based on multi-step reinforcement learning according to claim 1, characterized in that longitudinal dynamics modeling is performed on the vehicle, and the basic information and physical quantities of the vehicle are modeled;
1-1, establishing a second-order vehicle longitudinal dynamics model as follows:
dS(t)/dt = V(t),  dV(t)/dt = U(t)  (1)
wherein S represents the position of the vehicle and dS/dt its derivative; V represents the longitudinal speed of the vehicle and dV/dt its derivative; U is the control input;
1-2, establishing the vehicle longitudinal force-balance equation of motion as follows:
m·dV(t)/dt = F_hf(t) + F_hr(t) - F_a(t) - F_r(t)  (2)
wherein F_hf(t) and F_hr(t) are the longitudinal tire forces of the front and rear wheels at time t, respectively, F_a(t) is the air resistance at time t, F_r(t) is the rolling resistance at time t, m is the vehicle mass, and V is the vehicle longitudinal speed;
the air resistance acting on the vehicle at time t can be expressed as:
F_a(t) = (1/2)·ρ·C_D(d_h)·A_v·V²(t)  (3)
where ρ is the air density, C_D(d_h) is the aerodynamic drag coefficient related to the distance between the two vehicles, A_v is the frontal windward area of the host vehicle, and V²(t) is the square of the host-vehicle speed;
the rolling resistance at time t is expressed as follows:
F_r(t) = m·g·μ  (4)
wherein g is the gravitational acceleration, μ is the rolling resistance coefficient, and m is the vehicle mass;
the aerodynamic drag coefficient can be expressed as:
C_D(d_h) = C_h,d·(1 - c_1/(c_2 + d_h))  (5)
wherein C_h,d represents the nominal aerodynamic drag coefficient of the host vehicle; parameters c_1 and c_2 are obtained by regression of experimental data, and d_h is the following distance between the two vehicles;
the following distance d_h between the two vehicles is calculated as:
d_h = S_p - S_h - L_car  (6)
wherein L_car is the body length of the electric vehicle, and S_p and S_h are the positions of the preceding vehicle and the host vehicle, respectively;
1-3, establishing a motor drive model of the electric vehicle;
in order to accurately describe the dynamic characteristics of the vehicle, and without considering the motor-efficiency constraint, the relation between the actual motor output torque T_m and the motor speed ω_m can be expressed as:
T_m(t) = R·T_ω(t)/G_r,  ω_m(t) = G_r·V(t)/R  (7)
wherein R and G_r are the tire radius and the fixed gear ratio, respectively, and T_ω(t) is the traction force at time t;
1-4, establishing a battery power model of the electric vehicle; neglecting the effect of auxiliaries on battery power, the desired battery output power can be taken as equivalent to the desired motor input power, expressed as follows:
P_bat(t) = T_m(t)·ω_m(t)/η_m(t)  (8)
wherein P_bat is the battery power and η_m is the motor efficiency, where the motor efficiency can be described as:
η_m(t) = f_m(ω_m(t), T_m(t))  (9)
wherein f_m represents the power-conversion efficiency function of the motor;
1-5, establishing a charge and discharge resistance model of the electric vehicle, as follows:
3. The electric vehicle following running control method based on multi-step reinforcement learning according to claim 2, characterized in that ecological adaptive cruise control of the electric vehicle based on multi-step reinforcement learning is performed to determine the optimization objectives, specifically realized as follows:
2-1, optimization objective based on safe driving of the vehicle; in order to ensure driving safety, the inter-vehicle distance must satisfy:
d_min(t) ≤ d_h(t) ≤ d_max(t)  (11)
wherein d_h(t) is the inter-vehicle distance between the host vehicle and the preceding vehicle at time t, and d_min(t) and d_max(t) are the minimum and maximum permissible inter-vehicle distances, both obtained from the Q table;
d_min(t) and d_max(t) can be expressed as:
2-2, optimization objective based on the driving comfort of the vehicle; to ensure driving comfort, the control input of the electric vehicle must satisfy:
a_min ≤ U(t) ≤ a_max  (13)
wherein a_min and a_max are the minimum and maximum allowed accelerations, respectively; in the present invention, a_min = -1 m/s² and a_max = 1 m/s²;
2-3. Optimization target based on prolonging the service life of the vehicle battery; reducing the accumulated squared battery current can prolong the battery service life; therefore, in order to extend the battery life as far as possible, the following equation (14) is to be made as small as possible:
wherein I_bat²(t) is the square of the battery pack current at time t, t_0 is the start time of the driving cycle, and T_cyc is the end time of the driving cycle;
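A discrete-time approximation of the battery-aging objective (14) simply accumulates the squared pack current over the driving cycle (rectangle rule with step dt; a sketch, not the patent's exact calculation):

```python
def battery_aging_cost(i_bat, dt):
    """Approximate the integral of I_bat^2(t) from t_0 to T_cyc.

    i_bat : sampled battery pack current [A], one value per time step
    dt    : sampling period [s]
    """
    return sum(i * i for i in i_bat) * dt
```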
2-4. Optimization target based on vehicle energy economy; in order to improve the energy economy of the electric vehicle, the following formula is to be made as small as possible:
4. The electric vehicle following running control method based on multi-step reinforcement learning according to claim 3, characterized in that the algorithm based on multi-step reinforcement learning is determined to be the n-step tree-backup algorithm;
3-1. Determining the state variables and control variables of the Eco-ACC strategy based on multi-step reinforcement learning;
① State variable X(t): in order for the electric vehicle to follow the leading vehicle within a reasonable inter-vehicle distance range, equation (11) must be satisfied; the following performance is therefore evaluated with the inter-vehicle distance deviation Δd and the speed deviation ΔV, which can be defined as:
X(t) = [ΔV(t), Δd(t)]^T (16)
wherein,
ΔV(t) = V_p(t) - V(t) (18)
wherein BSF(·) is a band-stop function, α, β, c, f are coefficients related to the band-stop function, and V_p(t) is the speed of the preceding vehicle at time t;
In order to express the state information of the vehicle more intuitively, the band-stop function is improved, and the inter-vehicle distance is described as follows:
wherein δd(t) is the equivalent inter-vehicle distance deviation at time t, and α, β, c, f, z are coefficients related to the band-stop function;
② Control variable: the control variable is the acceleration;
U(t)=a(t) (20)
wherein a(t) is the acceleration of the host vehicle at time t;
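The state construction of Eqs. (16)-(19) can be sketched as below. The band-stop shaping of the distance deviation (Eqs. 17 and 19) appears only as images in the source, so `bsf` and `d_ref` are placeholders for that mapping, with identity shaping by default:

```python
import numpy as np

def build_state(v_host, v_pred, d_h, d_ref, bsf=None):
    """State X(t) = [dV(t), dd(t)]^T of Eq. (16).

    v_host, v_pred : host and preceding vehicle speeds [m/s]
    d_h            : measured inter-vehicle distance [m]
    d_ref          : reference distance (placeholder for the patent's bounds)
    bsf            : optional band-stop shaping of the distance deviation
    """
    dv = v_pred - v_host              # Eq. (18): speed deviation
    dd = d_h - d_ref                  # raw inter-vehicle distance deviation
    if bsf is not None:
        dd = bsf(dd)                  # equivalent deviation delta_d(t), Eq. (19)
    return np.array([dv, dd])
```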
3-2. Determining the reward function and the value function of the Eco-ACC strategy based on multi-step reinforcement learning;
③ Reward function: to achieve the control objective, the reward function is given as follows:
r(X(t), U(t)) = α_1 L_1(t) + α_2 L_2(t) + α_3 L_3(t) (21)
wherein α_1, α_2 and α_3 are the weighting factors of the reward function; L_1, L_2 and L_3 can be expressed as:
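The weighted reward of Eq. (21) can be sketched as follows; the sub-terms L_1..L_3 are image-only in the source, so they are passed in as already-evaluated numbers, and the default equal weights are placeholders:

```python
def reward(L1, L2, L3, alphas=(1.0, 1.0, 1.0)):
    """Weighted reward of Eq. (21): r = a1*L1 + a2*L2 + a3*L3.

    L1, L2, L3 : evaluated sub-objectives (forms not given in this extract)
    alphas     : weighting factors (a1, a2, a3); placeholders by default
    """
    a1, a2, a3 = alphas
    return a1 * L1 + a2 * L2 + a3 * L3
```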
④ Value function: the value function of the ecological ACC strategy based on multi-step reinforcement learning can be expressed as follows:
In reinforcement learning, the ultimate goal of the agent is to maximize the cumulative reward; the reward function judges whether an action is good or bad only over the short term; the value function can therefore be described as:
3-3. The multi-step learning algorithm adopts an off-policy algorithm that does not require importance sampling, namely the n-step tree-backup method.
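The n-step tree-backup target named in step 3-3 can be sketched in tabular form after Sutton and Barto's formulation; the greedy target policy here is an assumption, since this extract does not state the patent's target policy:

```python
import numpy as np

def greedy_probs(Q, s):
    """Greedy target policy: all probability on argmax_a Q[s, a] (assumption)."""
    p = np.zeros(Q.shape[1])
    p[int(np.argmax(Q[s]))] = 1.0
    return p

def tree_backup_target(Q, states, actions, rewards, gamma, policy=greedy_probs):
    """n-step tree-backup return for updating Q[states[0], actions[0]].

    states  : [S_t, ..., S_{t+n}]      (n+1 states)
    actions : [A_t, ..., A_{t+n-1}]    (n actions actually taken)
    rewards : [R_{t+1}, ..., R_{t+n}]  (n rewards)

    No importance-sampling ratios appear: off-policy corrections enter only
    through the target-policy probabilities in the backup.
    """
    n = len(rewards)
    # bootstrap at the horizon with the expected action value under the policy
    p = policy(Q, states[n])
    G = rewards[n - 1] + gamma * p @ Q[states[n]]
    # fold the backup backwards over the intermediate (state, taken-action) pairs
    for k in range(n - 2, -1, -1):
        s, a = states[k + 1], actions[k + 1]
        p = policy(Q, s)
        leaves = p @ Q[s] - p[a] * Q[s, a]   # expected value of non-taken actions
        G = rewards[k] + gamma * (leaves + p[a] * G)
    return G
```

A one-step call reduces to the familiar Q-learning target r + γ·max_a Q(s', a); when an intermediate taken action has zero probability under the target policy, the backup cuts the trajectory at that point, which is exactly what removes the need for importance sampling.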
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210770539.XA CN114954455B (en) | 2022-06-30 | 2022-06-30 | Electric vehicle following driving control method based on multi-step reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114954455A true CN114954455A (en) | 2022-08-30 |
CN114954455B CN114954455B (en) | 2024-07-02 |
Family
ID=82966644
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210770539.XA Active CN114954455B (en) | 2022-06-30 | 2022-06-30 | Electric vehicle following driving control method based on multi-step reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114954455B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109484407A (en) * | 2018-11-14 | 2019-03-19 | 北京科技大学 | A kind of adaptive follow the bus method that electric car auxiliary drives |
CN112046484A (en) * | 2020-09-21 | 2020-12-08 | 吉林大学 | Q learning-based vehicle lane-changing overtaking path planning method |
CN112989553A (en) * | 2020-12-28 | 2021-06-18 | 郑州大学 | Construction and application of CEBs (common electronic devices and controllers) speed planning model based on battery capacity loss control |
WO2021197246A1 (en) * | 2020-03-31 | 2021-10-07 | 长安大学 | V2x-based motorcade cooperative braking method and system |
Non-Patent Citations (1)
Title |
---|
LI Wenchang; GUO Jinghua; WANG Jin: "Adaptive fuzzy sliding mode control for longitudinal motion of intelligent electric vehicles under a hierarchical architecture", Journal of Xiamen University (Natural Science), no. 03, 28 May 2019 (2019-05-28) * |
Also Published As
Publication number | Publication date |
---|---|
CN114954455B (en) | 2024-07-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chen et al. | Optimal strategies of energy management integrated with transmission control for a hybrid electric vehicle using dynamic particle swarm optimization | |
Wang et al. | Fuzzy adaptive-equivalent consumption minimization strategy for a parallel hybrid electric vehicle | |
CN112896161B (en) | Electric automobile ecological self-adaptation cruise control system based on reinforcement learning | |
Zhuang et al. | Integrated energy-oriented cruising control of electric vehicle on highway with varying slopes considering battery aging | |
CN109532513A (en) | A kind of optimal driving torque allocation strategy generation method of Two axle drive electric car | |
CN104175891B (en) | Pure electric automobile energy regenerating regenerating brake control method | |
Pu et al. | An adaptive stochastic model predictive control strategy for plug-in hybrid electric bus during vehicle-following scenario | |
Zhang et al. | Powertrain design and energy management of a novel coaxial series-parallel plug-in hybrid electric vehicle | |
CN115027290A (en) | Hybrid electric vehicle following energy management method based on multi-objective optimization | |
Yu et al. | Braking energy management strategy for electric vehicles based on working condition prediction | |
Zhang et al. | Deep reinforcement learning based multi-objective energy management strategy for a plug-in hybrid electric bus considering driving style recognition | |
CN114954455B (en) | Electric vehicle following driving control method based on multi-step reinforcement learning | |
Zhang et al. | Multi-objective optimization for pure electric vehicle during a car-following process | |
Hong et al. | Energy-Saving Driving Assistance System Integrated With Predictive Cruise Control for Electric Vehicles | |
Zhou et al. | Research on Design Optimization and Simulation of Regenerative Braking Control Strategy for Pure Electric Vehicle Based on EMB Systems | |
Shi et al. | Energy Management Strategy based on Driving Style Recognition for Plug-in Hybrid Electric Bus | |
Omar et al. | Design and optimization of powertrain system for prototype fuel cell electric vehicle | |
Keskin et al. | Fuzzy control of dual storage system of an electric drive vehicle considering battery degradation | |
Li et al. | Research on PHEV logic threshold energy management strategy based on Engine Optimal Working Curve | |
Bravo et al. | The influences of energy storage and energy management strategies on fuel consumption of a fuel cell hybrid vehicle | |
CN116424317A (en) | Electric automobile economic self-adaptive cruise control method based on multi-step DDQN | |
Bozhkov et al. | Modelling the hybrid electric vehicle energy efficiency | |
Wang et al. | Powertrain analysis and optimized power follower control strategy in a series hybrid electric vehicle | |
Ma et al. | Power demand for fuel cell system in hybrid vehicles | |
CN112659922B (en) | Hybrid power rail vehicle and direct current bus voltage control method and system thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |