CN114954455A - Electric vehicle following running control method based on multi-step reinforcement learning - Google Patents

Info

Publication number: CN114954455A (application CN202210770539.XA)
Authority: CN (China)
Prior art keywords: vehicle, time, reinforcement learning, control, battery
Legal status: Granted; Active
Other languages: Chinese (zh)
Other versions: CN114954455B (granted publication)
Inventors: 翟春杰, 王栎, 裘健鋆
Current Assignee: Hangzhou Dianzi University
Original Assignee: Hangzhou Dianzi University
Application filed by Hangzhou Dianzi University


Classifications

    • B — PERFORMING OPERATIONS; TRANSPORTING
    • B60 — VEHICLES IN GENERAL
    • B60W — CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W30/00 — Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units
    • B60W30/14 — Adaptive cruise control
    • B60W30/16 — Control of distance between vehicles, e.g. keeping a distance to preceding vehicle
    • B — PERFORMING OPERATIONS; TRANSPORTING
    • B60 — VEHICLES IN GENERAL
    • B60L — PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
    • B60L15/00 — Methods, circuits, or devices for controlling the traction-motor speed of electrically-propelled vehicles
    • B60L15/20 — Methods, circuits, or devices for controlling the traction-motor speed of electrically-propelled vehicles for control of the vehicle or its driving motor to achieve a desired performance, e.g. speed, torque, programmed variation of speed
    • B60L15/2045 — Methods, circuits, or devices for controlling the traction-motor speed of electrically-propelled vehicles for control of the vehicle or its driving motor to achieve a desired performance, e.g. speed, torque, programmed variation of speed, for optimising the use of energy
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 — Machine learning
    • B — PERFORMING OPERATIONS; TRANSPORTING
    • B60 — VEHICLES IN GENERAL
    • B60W — CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2556/00 — Input parameters relating to data
    • B60W2556/45 — External transmission of data to or from the vehicle
    • B60W2556/65 — Data transmitted between vehicles
    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T — CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 — Road transport of goods or passengers
    • Y02T10/60 — Other road transportation technologies with climate change mitigation effect
    • Y02T10/72 — Electric energy management in electromobility

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Power Engineering (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Automation & Control Theory (AREA)
  • Electric Propulsion And Braking For Vehicles (AREA)

Abstract

The invention discloses an electric-vehicle following-driving control method based on multi-step reinforcement learning. The method comprises the following steps: 1. determine the state variable via an information-acquisition module and a controller-design module, and determine the control variable from the control target; 2. discretize the state variable and the equivalent inter-vehicle distance to obtain a Q table; 3. input the state variable and the control variable into the Q table to obtain an expected value function; 4. solve the state variable at the next time step via the longitudinal-dynamics module and the electric-vehicle energy-storage module; 5. according to the acceleration change of the preceding vehicle at the next time step, select the Q-table entry with the minimum expected cost over the next n steps to obtain the control variable; 6. take the control variable solved from the Q table as the optimal or suboptimal control variable. Using the communication link between the preceding vehicle and the host vehicle, the vehicle obtains the preceding vehicle's acceleration and related information, achieves the car-following effect, and thereby reduces power consumption during the following process.

Description

Electric vehicle following running control method based on multi-step reinforcement learning
Technical Field
The invention belongs to the technical field of intelligent driving, and particularly relates to a multi-step reinforcement-learning-based following-driving control method for a preceding vehicle and a following vehicle.
Background
To slow the trend of global warming and reduce carbon-dioxide emissions, and as battery capacity and economy continue to improve, pure electric vehicles have become one of the main development directions for new-energy vehicles, and their popularization and application are receiving ever more attention.
As global non-renewable resources grow increasingly scarce, their rational use must accompany further development. The automobile has become an indispensable means of travel, so balancing vehicle fuel consumption against the rational allocation of petroleum resources has become an important problem for human society. Moreover, air pollution, global warming, and related problems caused by vehicle emissions are increasingly prominent, and regulations on exhaust and fuel-consumption standards grow ever stricter, making the development of new-energy electric intelligent vehicles a necessity. New-energy vehicles have therefore attracted attention from both the automobile industry and governments; with continuing scientific and technical progress they have advanced greatly, and electric vehicles now serve as an important means of travel and occupy a substantial share of the market. To further improve the electricity-use efficiency of new-energy electric vehicles and extend battery service life, this invention provides an electric-vehicle following-driving control method based on multi-step reinforcement learning.
Disclosure of Invention
The invention considers that, with the wide adoption of electric vehicles in China and the gradual opening of the new-energy-vehicle market, the number of electric vehicles will keep rising. How to make better use of electric-vehicle development, thereby reducing the exhaust emissions of fuel vehicles, improving the ecological environment, and reducing battery power consumption, is a question worth studying.
The invention aims to provide an intelligent-vehicle following-driving control method based on multi-step reinforcement learning, which uses the communication link between the preceding vehicle and the host vehicle so that the vehicle obtains the preceding vehicle's acceleration and related information, achieves the following effect, and reduces power consumption during the following process.
The above object is achieved by the following technical solutions:
step 1, determining a state variable X (t) through an information acquisition module and a controller design module of a vehicle, determining a control variable U (t) through a control target, and initializing relevant parameters of the vehicle, wherein the detailed parameters are shown in FIG. 3.
And 2, dividing the speed V in the state variable X (t) into 50000 grids in size from-3.7 to 4.399 in an equal proportion, dividing the equivalent inter-vehicle distance delta d into 50000 grids in size from-2 to 2.2999 in an equal proportion, and dividing the acceleration of the control variable into 21 grids from-1 to 1. Thus, a Q-table of 50000 × 21 table size is constructed.
Step 3, inputting the state variable X (t) and the control variable U (t) into a Q table to obtain an expected value function;
and 4, solving a state variable X (t +1) at the next moment through the longitudinal dynamics module and the electric automobile energy storage module.
And 5, selecting a Q table with the minimum expected cost value within the next n steps in the future to obtain a control variable U (t +1) according to the acceleration change condition of the front vehicle at the next moment.
And 6, judging whether the Q table meets the maximum iteration times or whether the tolerance meets the self-adaptive iteration value. If yes, the solved control variable U (t +1) is used as the optimal or suboptimal control variable, otherwise, the step 4 is returned.
And 7, after iteration is ended, obtaining control input from the Q table, obtaining the optimal required power Pe through calculation, and applying the optimal required power Pe to the main vehicle.
The method of the invention has the advantages and beneficial results that:
1. With the rapid development of the intelligent-vehicle industry in China and the wide use of electric vehicles, electricity saving in electric vehicles can be treated as a latent resource to be developed: vehicles can respond to market-demand signals to improve electricity-use efficiency and reduce battery wear, maintaining the stability and economy of the whole electric-vehicle system. Compared with traditional fuel vehicles, the broad development of electric vehicles is one of the important goals of the current new-energy market.
2. Limiting the control variable of the electric vehicle to a certain range prevents large changes during acceleration or deceleration from reducing driving safety, and to a certain extent guarantees passenger comfort.
3. Controlling the state variable within a certain range keeps an appropriate distance from the preceding vehicle throughout driving, guaranteeing the ultimate driving target: safety.
4. The multi-step reinforcement-learning algorithm finds the minimum value function over a multi-step horizon, so that the overall cost function is minimized and electricity-use efficiency is optimized, improving on the previous single-step update.
Drawings
FIG. 1 is a view of a vehicle following scene provided by the present invention.
Fig. 2 is a multi-step backtracking tree algorithm diagram based on the ACC policy provided by the present invention.
FIG. 3 is a simulation platform and parameter setting diagram of the multi-step reinforcement learning-based ecological ACC strategy provided by the invention.
Fig. 4 is a simulation experiment diagram of the intelligent automobile following driving control based on multi-step reinforcement learning under the WLTC driving cycle.
Detailed Description
The present invention will be described in detail with reference to specific embodiments.
The intelligent automobile following driving control method based on the multi-step reinforcement learning algorithm is implemented according to the following steps.
Step 1. Determine the state variable X(t) through the vehicle's information-acquisition module and controller-design module, determine the control variable U(t) through the control target, and initialize the relevant vehicle parameters; the detailed parameters are shown in FIG. 3.
Step 2. Divide the speed V in the state variable X(t) into 50000 equal grid cells from −3.7 to 4.399, divide the equivalent inter-vehicle distance δd into 50000 equal grid cells from −2 to 2.2999, and divide the control-variable acceleration into 21 grid cells from −1 to 1, thus constructing a Q table of size 50000 × 21.
Step 3. Input the state variable X(t) and the control variable U(t) into the Q table to obtain an expected value function.
Step 4. Solve the state variable X(t+1) at the next time step through the longitudinal-dynamics module and the electric-vehicle energy-storage module.
Step 5. According to the acceleration change of the preceding vehicle at the next time step, select the Q-table entry with the minimum expected cost over the next n steps to obtain the control variable U(t+1).
Step 6. Judge whether the Q table has reached the maximum number of iterations or whether the tolerance satisfies the adaptive iteration value. If so, take the solved control variable U(t+1) as the optimal or suboptimal control variable; otherwise, return to step 4.
Step 7. After the iteration ends, obtain the control input from the Q table, compute the optimal demanded power Pe, and apply it to the host vehicle.
Further, the initialization parameters in step 1 include the vehicle mass m, the air density ρ, the gravitational acceleration g, the rolling-resistance coefficient μ, the nominal aerodynamic drag coefficient $C_{h,d}$ of the host vehicle, the body length $L_{car}$ of the electric vehicle, the motor efficiency $\eta_m$, the fixed gear ratio $G_r$, the minimum acceleration $a_{min}$, the maximum acceleration $a_{max}$, and so on.
Further, step 1 is specifically realized as follows:
Model the longitudinal dynamics of the vehicle, together with the vehicle's basic information and physical quantities. The car-following scene is shown in FIG. 1.
1-1. Establish a second-order vehicle longitudinal dynamics model:

$\dot{S}(t) = V(t),\quad \dot{V}(t) = U(t)$ (1)

where S is the vehicle position and $\dot{S}$ its time derivative, V is the vehicle longitudinal speed and $\dot{V}$ its time derivative, and U is the control input.
1-2. Establish the vehicle longitudinal force-balance equation:

$m\dot{V}(t) = F_{hf}(t) + F_{hr}(t) - F_a(t) - F_r(t)$ (2)

where $F_{hf}(t)$ and $F_{hr}(t)$ are the longitudinal tire forces of the front and rear wheels at time t, $F_a(t)$ is the air resistance at time t, $F_r(t)$ is the rolling resistance at time t, m is the vehicle mass, and V is the vehicle longitudinal speed.
The air resistance acting on the vehicle at time t can be expressed as:

$F_a(t) = \tfrac{1}{2}\rho\, C_D(d_h)\, A_v V^2(t)$ (3)

where ρ is the air density, $C_D(d_h)$ is the aerodynamic drag coefficient, which depends on the distance between the two vehicles, $A_v$ is the frontal area of the host vehicle, and $V^2(t)$ is the square of the host-vehicle speed.
the rolling resistance at time t is expressed as follows:
F r (t)=mgμ (4)
wherein g is the gravity acceleration, mu is the rolling resistance coefficient, and m is the automobile mass.
The aerodynamic drag coefficient can be expressed as:

$C_D(d_h) = C_{h,d}\left(1 - \dfrac{c_1}{c_2 + d_h}\right)$ (5)

where $C_{h,d}$ is the nominal aerodynamic drag coefficient of the host vehicle, the parameters $c_1$ and $c_2$ are obtained by regression of experimental data, and $d_h$ is the distance between the two vehicles.
The following distance $d_h$ between the two vehicles is calculated as:

$d_h = S_p - S_h - L_{car}$ (6)

where $L_{car}$ is the body length of the electric vehicle, and $S_p$ and $S_h$ are the positions of the preceding vehicle and the host vehicle, respectively.
1-3. Establish the motor drive model of the electric vehicle.

The controlled plant is an electric vehicle whose driving force is provided by a motor. To describe the vehicle's dynamic characteristics accurately while leaving the motor-efficiency constraint aside, the relation between the actual motor output torque $T_m$ and the motor speed $\omega_m$ can be expressed as:

$\omega_m(t) = \dfrac{G_r V(t)}{R},\quad T_m(t) = \dfrac{R\,T_\omega(t)}{G_r}$ (7)

where R and $G_r$ are the tire radius and the fixed gear ratio, respectively, and $T_\omega(t)$ is the traction force at time t.
1-4. Establish the battery power model of the electric vehicle. Neglecting the effect of auxiliaries on battery power, the desired battery output power can be equated with the desired motor input power:

[equation (8), given only as an image in the original]

where $P_{bat}$ is the battery power and $\eta_m$ is the motor efficiency. The motor efficiency can be described as:

$\eta_m(t) = f_m(\omega_m(t), T_m(t))$ (9)

where $f_m$ is the motor power-conversion efficiency map.
1-5. Establish the charge/discharge resistance model of the electric vehicle:

[equation (10), given only as an image in the original]

where $SoC_{bat}(t)$ is the battery state of charge at time t, $R_{bat}(t)$ is the battery internal resistance at time t, the remaining coefficients are the charging-model and discharging-model coefficients of the battery pack, and $I_{bat}(t)$ is the battery-pack current at time t.
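Equation (8) is given only as an image. A common convention for mapping mechanical motor power to battery power, assumed here rather than taken from the patent, is to divide by the efficiency when driving and multiply by it when regenerating, so that losses always work against the battery.

```python
def battery_power(t_m, omega_m, eta_m):
    """Battery power (W) from motor torque (Nm), speed (rad/s), efficiency.

    Sign convention is an assumption: positive mechanical power (driving)
    draws more than it delivers; negative power (regeneration) returns less.
    """
    p_mech = t_m * omega_m
    return p_mech / eta_m if p_mech >= 0 else p_mech * eta_m

# Driving: 10 Nm at 100 rad/s with 80% efficiency draws 1250 W.
# Regenerating the same mechanical power returns only 800 W.
```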
Further, ecological adaptive cruise control of the electric vehicle is performed based on multi-step reinforcement learning, and the optimization targets are determined, as follows.
2-1. Optimization target based on safe vehicle driving. To ensure driving safety, the inter-vehicle distance must be limited by:

$d_{min}(t) \le d_h(t) \le d_{max}(t)$ (11)

where $d_h(t)$ is the inter-vehicle distance between the host vehicle and the preceding vehicle at time t, and $d_{min}(t)$ and $d_{max}(t)$ are the minimum and maximum permissible inter-vehicle distances, both derived from the Q table.
$d_{min}(t)$ and $d_{max}(t)$ can be expressed as:

[equation (12), given only as an image in the original]
2-2. Optimization target based on driving comfort. To ensure driving comfort, the control input of the electric vehicle must be limited by:

$a_{min} \le U(t) \le a_{max}$ (13)

where $a_{min}$ and $a_{max}$ are the minimum and maximum allowed accelerations. In the present invention, $a_{min} = -1\,\mathrm{m/s^2}$ and $a_{max} = 1\,\mathrm{m/s^2}$.
2-3. Optimization target based on extending the service life of the vehicle battery. Reducing the accumulated squared battery current extends battery life; therefore, to extend battery life as much as possible, the following quantity should be made as small as possible:

$\int_{t_0}^{T_{cyc}} I_{bat}^2(t)\,dt$ (14)

where $I_{bat}^2(t)$ is the square of the battery-pack current at time t, $t_0$ is the start time of the driving cycle, and $T_{cyc}$ is its end time.
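The battery-life objective (14), the integral of the squared battery current over the driving cycle, can be approximated from sampled currents with a Riemann sum; the 1 s sample period is an assumption.

```python
def battery_stress(currents, dt=1.0):
    """Riemann-sum approximation of the integral of I_bat(t)^2 dt
    over the driving cycle (the quantity minimized to extend battery life).

    currents -- sampled battery-pack currents (A), dt -- sample period (s).
    """
    return sum(i * i for i in currents) * dt

# Constant 2 A over three 1 s samples -> 3 * 4 = 12 A^2*s.
stress = battery_stress([2.0, 2.0, 2.0])
```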
2-4. Optimization target based on vehicle energy economy. To improve the energy economy of the electric vehicle, the following quantity should be made as small as possible:

$\int_{t_0}^{T_{cyc}} P_{bat}(t)\,dt$ (15)
and 3, determining the algorithm based on the multi-step reinforcement learning as an n-step tree backtracking algorithm.
3-1, determining state variables and control variables under the research of the Eco-ACC strategy based on multi-step reinforcement learning.
State variable X(t): for the electric vehicle to follow the preceding vehicle within a reasonable inter-vehicle distance range, equation (11) must be satisfied. The following performance can therefore be evaluated using the inter-vehicle distance deviation Δd and the speed deviation ΔV, defined as:

$X(t) = [\Delta V(t), \Delta d(t)]^T$ (16)

where

[equation (17), defining Δd(t) via the band-stop function, given only as an image in the original]

$\Delta V(t) = V_p(t) - V(t)$ (18)

Here BSF(·) is a band-stop function, and α, β, and the other coefficients associated with it are listed in Table 1. $V_p(t)$ is the speed of the preceding vehicle at time t.
To express the state information of the vehicle more intuitively, the band-stop function is improved and the inter-vehicle distance is described as:

[equation (19), given only as an image in the original]

where $\delta d(t)$ is the equivalent inter-vehicle distance deviation at time t, and α, β, and the remaining band-stop-function coefficients are listed in Table 1.
Control variable: the control variable of the present invention is the acceleration:

$U(t) = a(t)$ (20)

where a(t) is the acceleration of the host vehicle at time t.
3-2. Determine the reward function and value function of the Eco-ACC strategy based on multi-step reinforcement learning.

Reward function: to achieve the control objective, the reward function is given as:

$r(X(t), U(t)) = \alpha_1 L_1(t) + \alpha_2 L_2(t) + \alpha_3 L_3(t)$ (21)

where $\alpha_1$, $\alpha_2$, and $\alpha_3$ are the weighting coefficients of the reward function, listed in Table 1, and $L_1$, $L_2$, and $L_3$ can be expressed as:

[equation (22), given only as an image in the original]
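The reward (21) is simply a weighted sum of three penalty terms. The terms L1–L3 of equation (22) and the weights of Table 1 appear only as images in the original, so both the term values and the default weights below are placeholders.

```python
def reward(l1, l2, l3, alpha=(1.0, 1.0, 1.0)):
    """r = a1*L1 + a2*L2 + a3*L3 (equation (21)).

    l1, l2, l3 -- the three penalty terms (safety/comfort/energy in the
                  original; their exact forms are given only as an image).
    alpha      -- weighting coefficients; placeholder values, not Table 1's.
    """
    a1, a2, a3 = alpha
    return a1 * l1 + a2 * l2 + a3 * l3
```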
Value function: the value function of the ecological ACC strategy based on multi-step reinforcement learning can be expressed as:

[equation (23), given only as an image in the original]

where γ is the discount factor, and α, β, and the remaining parameters of the band-stop function are detailed in Table 1.

In reinforcement learning, the ultimate goal of the agent is to maximize the cumulative reward; the reward function only judges whether an action is good or bad in the short term. The value function can therefore be described as:

[equation (24), given only as an image in the original]
3-3. The multi-step learning algorithm adopts an off-policy method that does not require importance sampling, namely the tree-backup method; the n-step tree-backup diagram is shown in fig. 2.
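An n-step tree-backup target for a greedy, cost-minimizing target policy (which indeed needs no importance sampling) can be sketched as below. This is the generic textbook form of the backtracking-tree method, not necessarily the patent's exact formulation; with a deterministic greedy policy the tree collapses to a chain that truncates at the first non-greedy action.

```python
import numpy as np

def tree_backup_target(Q, steps, s_final, gamma=0.95):
    """n-step tree-backup return for a deterministic greedy target policy
    (argmin, since the rewards here act as costs to be minimized).

    Q       -- 2-D array, Q[state, action]
    steps   -- list of (state, action, cost) transitions actually taken
    s_final -- state reached after the last transition (bootstrap point)
    """
    g = Q[s_final].min()                      # bootstrap with the best value
    for k in range(len(steps) - 1, -1, -1):
        s, a, c = steps[k]
        if k == len(steps) - 1:
            g = c + gamma * g                 # last transition: plain backup
        else:
            s_nxt, a_nxt, _ = steps[k + 1]
            if a_nxt == Q[s_nxt].argmin():    # behaviour matched the target
                g = c + gamma * g             # ... keep backing up the chain
            else:                             # mismatch: the tree truncates
                g = c + gamma * Q[s_nxt].min()
    return g
```

With a small 3-state, 2-action table, a trajectory whose logged actions are all greedy backs up the full chain, while a non-greedy action truncates the return at that branch with the greedy leaf value.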
3-4, the simulation platform and parameter setting of the ecological ACC strategy based on multi-step reinforcement learning are shown in FIG. 3.
The band-stop-function parameters and the reward-function weight coefficients are listed in Table 1 (given only as an image in the original).
Further, the intelligent-vehicle following-driving control method based on the multi-step reinforcement-learning algorithm is verified under different driving cycles, as shown in the following table.
TABLE 2. Intelligent-vehicle following-driving control method based on the multi-step reinforcement-learning algorithm under different driving cycles (given only as an image in the original)
The simulation results are shown in fig. 4 and show that: the speed of the vehicle controlled by the Eco-ACC system based on the multi-step reinforcement-learning algorithm is essentially consistent with that of the preceding vehicle, and its acceleration is smoother than under a conventional ACC system, so passengers feel more comfortable; the actual distance between the Eco-ACC-controlled vehicle and the preceding vehicle always stays within the safe range, ensuring driving safety; and the Eco-ACC-controlled vehicle consumes less energy than a vehicle controlled by a conventional ACC system.

Claims (4)

1. An electric-vehicle following-driving control method based on multi-step reinforcement learning, characterized by comprising the following steps:
step 1, determining a state variable X(t) through an information-acquisition module and a controller-design module of the vehicle, determining a control variable U(t) through the control target, and initializing the relevant vehicle parameters;
step 2, dividing the speed V in the state variable X(t) into 50000 equal grid cells from −3.7 to 4.399, dividing the equivalent inter-vehicle distance δd into 50000 equal grid cells from −2 to 2.2999, and dividing the control-variable acceleration into 21 grid cells from −1 to 1, thus forming a Q table of size 50000 × 21;
step 3, inputting the state variable X(t) and the control variable U(t) into the Q table to obtain an expected value function;
step 4, solving the state variable X(t+1) at the next time step through the longitudinal-dynamics module and the electric-vehicle energy-storage module;
step 5, selecting, according to the acceleration change of the preceding vehicle at the next time step, the Q-table entry with the minimum expected cost over the next n steps to obtain the control variable U(t+1);
step 6, judging whether the Q table has reached the maximum number of iterations or whether the tolerance satisfies the adaptive iteration value; if so, taking the solved control variable U(t+1) as the optimal or suboptimal control variable, otherwise returning to step 4;
step 7, after the iteration ends, obtaining the control input from the Q table, computing the optimal demanded power Pe, and applying it to the host vehicle.
2. The electric-vehicle following-driving control method based on multi-step reinforcement learning according to claim 1, characterized in that longitudinal-dynamics modeling is performed on the vehicle, together with the vehicle's basic information and physical quantities;
1-1, establishing a second-order vehicle longitudinal dynamics model:

$\dot{S}(t) = V(t),\quad \dot{V}(t) = U(t)$ (1)

where S is the vehicle position and $\dot{S}$ its time derivative; V is the vehicle longitudinal speed and $\dot{V}$ its time derivative; U is the control input;
1-2, establishing the vehicle longitudinal force-balance equation:

$m\dot{V}(t) = F_{hf}(t) + F_{hr}(t) - F_a(t) - F_r(t)$ (2)

where $F_{hf}(t)$ and $F_{hr}(t)$ are the longitudinal tire forces of the front and rear wheels at time t, $F_a(t)$ is the air resistance at time t, $F_r(t)$ is the rolling resistance at time t, m is the vehicle mass, and V is the vehicle longitudinal speed;
the air resistance acting on the vehicle at time t can be expressed as:

$F_a(t) = \tfrac{1}{2}\rho\, C_D(d_h)\, A_v V^2(t)$ (3)

where ρ is the air density, $C_D(d_h)$ is the aerodynamic drag coefficient, which depends on the distance between the two vehicles, $A_v$ is the frontal area of the host vehicle, and $V^2(t)$ is the square of the host-vehicle speed;
the rolling resistance expression at time t is as follows:
F r (t)=mgμ (4)
wherein g is the gravity acceleration, mu is the rolling resistance coefficient, and m is the automobile mass;
the aerodynamic drag coefficient can be expressed as:

$C_D(d_h) = C_{h,d}\left(1 - \dfrac{c_1}{c_2 + d_h}\right)$ (5)

where $C_{h,d}$ is the nominal aerodynamic drag coefficient of the host vehicle, the parameters $c_1$ and $c_2$ are obtained by regression of experimental data, and $d_h$ is the following distance between the two vehicles;
the following distance $d_h$ between the two vehicles is calculated as:

$d_h = S_p - S_h - L_{car}$ (6)

where $L_{car}$ is the body length of the electric vehicle, and $S_p$ and $S_h$ are the positions of the preceding vehicle and the host vehicle, respectively;
1-3, establishing a motor drive model of the electric vehicle;
to describe the vehicle's dynamic characteristics accurately while leaving the motor-efficiency constraint aside, the relation between the actual motor output torque $T_m$ and the motor speed $\omega_m$ can be expressed as:

$\omega_m(t) = \dfrac{G_r V(t)}{R},\quad T_m(t) = \dfrac{R\,T_\omega(t)}{G_r}$ (7)

where R and $G_r$ are the tire radius and the fixed gear ratio, respectively, and $T_\omega(t)$ is the traction force at time t;
1-4, establishing a battery power model of the electric vehicle; neglecting the effect of auxiliaries on battery power, the desired battery output power can be equated with the desired motor input power:

[equation (8), given only as an image in the original]

where $P_{bat}$ is the battery power and $\eta_m$ is the motor efficiency; the motor efficiency can be described as:

$\eta_m(t) = f_m(\omega_m(t), T_m(t))$ (9)

where $f_m$ is the motor power-conversion efficiency map;
1-5, establishing a charge/discharge resistance model of the electric vehicle:

[equation (10), given only as an image in the original]

where $SoC_{bat}(t)$ is the battery state of charge at time t, $R_{bat}(t)$ is the battery internal resistance at time t, the remaining coefficients are the charging-model and discharging-model coefficients of the battery pack, and $I_{bat}(t)$ is the battery-pack current at time t.
3. The electric vehicle following running control method based on multi-step reinforcement learning according to claim 2, characterized in that the electric vehicle based on multi-step reinforcement learning performs ecological adaptive cruise control to determine an optimization target, and specifically realizes the following:
2-1, optimization target based on safe driving of the vehicle; in order to ensure driving safety, the inter-vehicle distance must satisfy:

d min (t)≤d h (t)≤d max (t) (11)

wherein d_h(t) is the inter-vehicle distance between the host vehicle and the preceding vehicle at time t, and d_min(t) and d_max(t) are the minimum and maximum permissible inter-vehicle distances, both derived from the Q table and expressed as speed-dependent functions given by equation (12);
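Constraint (11) can be checked at every control step; the signed-violation helper below is an illustrative sketch, not part of the claim:

```python
def spacing_violation(d_h: float, d_min: float, d_max: float) -> float:
    """Return 0.0 when constraint (11) holds, otherwise the signed excess."""
    if d_h < d_min:
        return d_h - d_min   # negative: dangerously close to the lead vehicle
    if d_h > d_max:
        return d_h - d_max   # positive: falling too far behind
    return 0.0
```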
2-2, optimization target based on driving comfort of the vehicle; to ensure driving comfort, the control input of the electric vehicle must satisfy:

a min ≤U(t)≤a max (13)

wherein a_min and a_max are the minimum and maximum allowed accelerations, respectively; in the present invention, a_min = −1 m/s² and a_max = 1 m/s²;
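Constraint (13) amounts to saturating the commanded acceleration; a one-line sketch using the ±1 m/s² bounds stated above:

```python
A_MIN, A_MAX = -1.0, 1.0  # m/s^2, per constraint (13)

def saturate_accel(a: float) -> float:
    """Clip a commanded acceleration to the comfort band of equation (13)."""
    return max(A_MIN, min(A_MAX, a))
```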
2-3, optimization target based on prolonging the service life of the vehicle battery; reducing the accumulated square of the battery current prolongs the battery life; therefore, in order to extend the battery life as far as possible, the following equation (14) must be made as small as possible:

∫_{t_0}^{T_cyc} I_bat²(t) dt (14)

wherein I_bat²(t) is the square of the battery pack current at time t, t_0 is the start time of the driving cycle, and T_cyc is the end time of the driving cycle;
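Over a sampled drive cycle, the integral in equation (14) reduces to a sum of squared current samples; `dt` is the sampling period:

```python
def battery_aging_cost(currents, dt: float = 1.0) -> float:
    """Discrete approximation of equation (14): the integral of I_bat^2 dt."""
    return sum(i * i for i in currents) * dt
```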
2-4, optimization target based on vehicle energy economy; in order to improve the energy economy of the electric vehicle, the net battery energy consumed over the driving cycle, equation (15), must be reduced as much as possible:

∫_{t_0}^{T_cyc} P_bat(t) dt (15)
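Equation (15) discretizes the same way as equation (14); note that regenerative samples (negative P_bat) reduce the total, which is what makes eco-driving pay off:

```python
def energy_cost(powers, dt: float = 1.0) -> float:
    """Discrete approximation of equation (15): net battery energy [J]."""
    return sum(powers) * dt
```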
4. the electric vehicle following running control method based on multi-step reinforcement learning according to claim 3, wherein the multi-step reinforcement learning algorithm is determined to be the n-step tree-backup algorithm;
3-1, determining the state variables and control variables of the Eco-ACC strategy based on multi-step reinforcement learning;
① state variable X(t): in order for the electric vehicle to follow the leading vehicle within a reasonable inter-vehicle distance range, equation (11) must be satisfied; the following performance is therefore evaluated with the inter-vehicle distance deviation Δd and the speed deviation ΔV, which can be defined as:

X(t)=[ΔV(t),Δd(t)] T (16)

wherein

Δd(t) = BSF(d_h(t)) (17)

ΔV(t)=V p (t)-V(t) (18)

wherein BSF(·) is a band-stop function, α, β and cf are coefficients of the band-stop function, and V_p(t) is the speed of the preceding vehicle at time t;
in order to express the state information of the vehicle more intuitively, the band-stop function is improved, and the inter-vehicle distance is then described by the equivalent inter-vehicle distance deviation δd(t) given by equation (19); wherein δd(t) is the equivalent inter-vehicle distance deviation at time t, and α_z, β_z and cf_z are the coefficients of the improved band-stop function;
② control variable: the control variable is the acceleration;

U(t)=a(t) (20)

wherein a(t) is the acceleration of the host vehicle at time t;
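The state construction of equations (16) through (18) can be sketched as below. The true band-stop function uses the coefficients α, β and cf given only in the patent's figure; the piecewise-linear stand-in here is a hypothetical simplification that is zero inside the safe band and grows outside it:

```python
def band_stop(d_h: float, d_min: float, d_max: float,
              alpha: float = 1.0, beta: float = 1.0) -> float:
    """Hypothetical stand-in for the band-stop function BSF of equation (17)."""
    if d_h < d_min:
        return alpha * (d_h - d_min)   # too close: negative deviation
    if d_h > d_max:
        return beta * (d_h - d_max)    # too far: positive deviation
    return 0.0                         # inside the safe band

def state(V_p: float, V: float, d_h: float, d_min: float, d_max: float):
    """State vector X(t) = [dV, dd]^T of equation (16)."""
    dV = V_p - V                       # speed deviation, equation (18)
    dd = band_stop(d_h, d_min, d_max)  # distance deviation, equation (17)
    return (dV, dd)
```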
3-2, determining the reward function and value function of the Eco-ACC strategy based on multi-step reinforcement learning;
③ reward function: to achieve the control objectives, the reward function is given as follows:

r(X(t),U(t))=α 1 L 1 (t)+α 2 L 2 (t)+α 3 L 3 (t) (21)

wherein α_1, α_2 and α_3 are weighting factors of the reward function, and L_1, L_2 and L_3 are the per-step cost terms given by equation (22);
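Equation (21) is a weighted scalarization of the objectives of claim 3. The weight values below, and the assumption that L_1 to L_3 correspond to following error, comfort, and battery/energy cost per equation (22), are placeholders:

```python
def reward(L1: float, L2: float, L3: float,
           a1: float = 0.5, a2: float = 0.3, a3: float = 0.2) -> float:
    """Equation (21): weighted sum of the per-step cost terms L1..L3.
    The weight values are illustrative, not the patent's."""
    return a1 * L1 + a2 * L2 + a3 * L3
```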
④ value function: the value function of the ecological ACC strategy based on multi-step reinforcement learning can be expressed as equation (23), wherein γ is the discount factor and α, β and cf are parameters of the band-stop function; in reinforcement learning, the ultimate goal of the agent is to maximize the cumulative reward, whereas the reward function only judges whether an action is good over the short term; the value function can therefore be described as the expected discounted return:

Q(X(t),U(t)) = E[ Σ_{k=0}^{∞} γ^k · r(X(t+k),U(t+k)) ] (24)
3-3, the multi-step learning algorithm adopts an off-policy algorithm that does not require importance sampling, namely the n-step tree-backup method.
CN202210770539.XA 2022-06-30 2022-06-30 Electric vehicle following driving control method based on multi-step reinforcement learning Active CN114954455B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210770539.XA CN114954455B (en) 2022-06-30 2022-06-30 Electric vehicle following driving control method based on multi-step reinforcement learning


Publications (2)

Publication Number Publication Date
CN114954455A true CN114954455A (en) 2022-08-30
CN114954455B CN114954455B (en) 2024-07-02

Family

ID=82966644


Country Status (1)

Country Link
CN (1) CN114954455B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109484407A * 2018-11-14 2019-03-19 University of Science and Technology Beijing Adaptive car-following method for electric vehicle assisted driving
CN112046484A * 2020-09-21 2020-12-08 Jilin University Q-learning-based vehicle lane-changing and overtaking path planning method
CN112989553A * 2020-12-28 2021-06-18 Zhengzhou University Construction and application of a CEBs speed planning model based on battery capacity loss control
WO2021197246A1 * 2020-03-31 2021-10-07 Chang'an University V2X-based platoon cooperative braking method and system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Wenchang; Guo Jinghua; Wang Jin: "Adaptive fuzzy sliding mode control for longitudinal motion of intelligent electric vehicles under a hierarchical architecture", Journal of Xiamen University (Natural Science Edition), no. 03, 28 May 2019 (2019-05-28) *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant