CN114954455A - Electric vehicle following running control method based on multi-step reinforcement learning - Google Patents
- Publication number: CN114954455A (application CN202210770539.XA)
- Authority: CN (China)
- Legal status: Granted
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W30/00—Purposes of road vehicle drive control systems not related to the control of a particular sub-unit, e.g. of systems using conjoint control of vehicle sub-units
- B60W30/14—Adaptive cruise control
- B60W30/16—Control of distance between vehicles, e.g. keeping a distance to preceding vehicle
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60L—PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
- B60L15/00—Methods, circuits, or devices for controlling the traction-motor speed of electrically-propelled vehicles
- B60L15/20—Methods, circuits, or devices for controlling the traction-motor speed of electrically-propelled vehicles for control of the vehicle or its driving motor to achieve a desired performance, e.g. speed, torque, programmed variation of speed
- B60L15/2045—Methods, circuits, or devices for controlling the traction-motor speed of electrically-propelled vehicles for control of the vehicle or its driving motor to achieve a desired performance, e.g. speed, torque, programmed variation of speed for optimising the use of energy
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60W—CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
- B60W2556/00—Input parameters relating to data
- B60W2556/45—External transmission of data to or from the vehicle
- B60W2556/65—Data transmitted between vehicles
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/60—Other road transportation technologies with climate change mitigation effect
- Y02T10/72—Electric energy management in electromobility
Abstract
Description
Technical Field

The invention belongs to the technical field of intelligent driving, in particular car-following between a preceding vehicle and a host vehicle, and specifically relates to an electric vehicle car-following control method based on multi-step reinforcement learning.

Background Art

To slow the trend of global warming and reduce carbon dioxide emissions, and as battery capacity and economy continue to improve, pure electric vehicles have become one of the main directions of new-energy vehicle development. As a means of curbing global warming, the promotion and application of electric vehicles is receiving increasing attention.

As the times develop, global non-renewable resources grow increasingly scarce, and humanity must pay more attention to the rational use of existing non-renewable resources while developing. With the rapid growth of the modern economy, the automobile has become a near-universal and indispensable means of travel, so balancing vehicle fuel consumption against the rational allocation of society's oil resources has become an important problem. Moreover, air pollution and global warming caused by oil combustion are increasingly prominent, and regulations on vehicle exhaust and fuel-consumption standards are becoming ever stricter, making the development of new-energy electric intelligent vehicles a necessity. New-energy vehicles have therefore attracted the attention of both the automobile industry and governments. With continued scientific and technological progress, new-energy vehicles have advanced greatly, and electric vehicles have become an important means of travel, accounting for a significant share of the market. Therefore, to further improve the electricity-consumption efficiency of new-energy electric vehicles and extend battery life, this invention proposes an electric vehicle car-following control method based on multi-step reinforcement learning.
Summary of the Invention

The present invention mainly considers that, with the widespread use of electric vehicles in China and the gradual opening of the new-energy vehicle market, the number of electric vehicles in China will continue to rise. How to make better use of the development of electric vehicles, so as to reduce the exhaust emissions of fuel vehicles, improve the ecological environment, and reduce battery power consumption, is a problem worth exploring.

The purpose of the present invention is to provide an intelligent vehicle car-following control method based on multi-step reinforcement learning. It uses the communication link between the preceding vehicle and the host vehicle, so that the host vehicle can obtain information such as the preceding vehicle's acceleration, achieve car-following, and reduce power consumption in the process.

The above objectives are achieved through the following technical solution:
Step 1. Determine the state variable X(t) through the vehicle's on-board information acquisition module and controller design module, and determine the control variable U(t) from the control objective; the initialized vehicle parameters are listed in Fig. 3.

Step 2. Divide the speed V in the state variable X(t) proportionally from -3.7 to 4.399 into 50000 cells, divide the equivalent inter-vehicle spacing δd proportionally from -2 to 2.2999 into 50000 cells, and divide the control variable (acceleration) from -1 to 1 into 21 cells. This constitutes a Q-table of size 50000 × 21.

Step 3. Input the state variable X(t) and the control variable U(t) into the Q-table to obtain the expected value function.

Step 4. Solve for the state variable X(t+1) at the next time step through the longitudinal dynamics module and the electric-vehicle energy-storage module.

Step 5. According to the acceleration change of the preceding vehicle at the next time step, select from the Q-table the minimum expected cost value within the next n steps to obtain the control variable U(t+1).

Step 6. Judge whether the Q-table has reached the maximum number of iterations or whether the tolerance satisfies the adaptive iteration value. If so, take the solved control variable U(t+1) as the optimal or sub-optimal control variable; otherwise, return to Step 4.

Step 7. After the iteration terminates, obtain the control input from the Q-table, compute the optimal demanded power Pe, and apply it to the host vehicle.
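As an illustration only, the Q-table set-up of Steps 2 and 5 can be sketched as follows. The grid ranges and sizes come from the text; how the two 50000-cell state grids combine into a single row index is not specified in the patent, so this sketch indexes the table by a single quantized state channel, and the function names are placeholders.

```python
import numpy as np

N_V, N_D, N_A = 50_000, 50_000, 21        # grid sizes from Step 2
v_grid = np.linspace(-3.7, 4.399, N_V)    # speed grid
d_grid = np.linspace(-2.0, 2.2999, N_D)   # equivalent-spacing grid
a_grid = np.linspace(-1.0, 1.0, N_A)      # acceleration (control) grid

# A 50000 x 21 table of expected cost values, one row per discretized state.
Q = np.zeros((N_V, N_A))

def nearest(grid, x):
    """Index of the grid cell closest to x (simple uniform quantizer)."""
    return int(np.clip(np.searchsorted(grid, x), 0, len(grid) - 1))

def greedy_control(Q, state_idx):
    """Step 5 in miniature: pick the control with the minimum expected cost."""
    return a_grid[int(np.argmin(Q[state_idx]))]
```

With an all-zero table the first grid value is returned; in practice the table would be filled in by the n-step updates of Steps 4 through 6 before the greedy selection is meaningful.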
The advantages and beneficial results of the method of the present invention are:

1. With the rapid development of China's intelligent vehicle industry and the widespread use of electric vehicles, the power-saving capability of electric vehicles can be treated as a latent resource to be developed: in response to market demand, electric vehicles can actively improve their electricity-consumption efficiency and reduce battery wear, maintaining the stability and economy of the whole electric vehicle system. Compared with traditional fuel vehicles, the broad development of electric vehicles is one of the important goals of the current new-energy market.

2. Limiting the control variable of the electric vehicle to a certain range prevents large changes during acceleration or deceleration that would reduce driving safety, while also guaranteeing a degree of passenger comfort.

3. Keeping the state variable within a certain range maintains an appropriate inter-vehicle distance from the preceding vehicle throughout driving, which safeguards the ultimate goal of driving: safety.

4. Using the multi-step reinforcement learning algorithm, the minimum value function over a multi-step horizon can be obtained, minimizing the overall cost function, optimizing electricity-consumption efficiency, and improving on the efficiency of previous single-step updates.
Description of Drawings

Fig. 1 is the car-following scenario diagram provided by the present invention.

Fig. 2 is a diagram of the multi-step tree-backup algorithm under the ACC strategy provided by the present invention.

Fig. 3 shows the simulation platform and parameter settings of the eco-ACC strategy based on multi-step reinforcement learning provided by the present invention.

Fig. 4 shows the simulation of the intelligent vehicle car-following control based on multi-step reinforcement learning under the WLTC driving cycle.

Detailed Description

The present invention is described in detail below with reference to specific embodiments.

The intelligent vehicle car-following control method based on the multi-step reinforcement learning algorithm proposed by the present invention is implemented in the following steps.
Step 1. Determine the state variable X(t) through the vehicle's on-board information acquisition module and controller design module, and determine the control variable U(t) from the control objective; the initialized vehicle parameters are listed in Fig. 3.

Step 2. Divide the speed V in the state variable X(t) proportionally from -3.7 to 4.399 into 50000 cells, divide the equivalent inter-vehicle spacing δd proportionally from -2 to 2.2999 into 50000 cells, and divide the control variable (acceleration) from -1 to 1 into 21 cells. This constitutes a Q-table of size 50000 × 21.

Step 3. Input the state variable X(t) and the control variable U(t) into the Q-table to obtain the expected value function.

Step 4. Solve for the state variable X(t+1) at the next time step through the longitudinal dynamics module and the electric-vehicle energy-storage module.

Step 5. According to the acceleration change of the preceding vehicle at the next time step, select from the Q-table the minimum expected cost value within the next n steps to obtain the control variable U(t+1).

Step 6. Judge whether the Q-table has reached the maximum number of iterations or whether the tolerance satisfies the adaptive iteration value. If so, take the solved control variable U(t+1) as the optimal or sub-optimal control variable; otherwise, return to Step 4.

Step 7. After the iteration terminates, obtain the control input from the Q-table, compute the optimal demanded power Pe, and apply it to the host vehicle.
Further, the initialization parameters of Step 1 include the vehicle mass m, air density ρ, gravitational acceleration g, rolling resistance coefficient μ, the host vehicle's nominal aerodynamic drag coefficient Ch,d, the body length of the electric vehicle Lcar, motor efficiency ηm, fixed gear ratio Gr, minimum acceleration amin, maximum acceleration amax, etc.

Further, Step 1 is implemented as follows.

Perform longitudinal dynamics modeling of the vehicle, modeling its basic information and physical quantities. The car-following scenario is shown in Fig. 1.
1-1. Establish a second-order vehicle longitudinal dynamics model:

dS(t)/dt = V(t), dV(t)/dt = U(t) (1)

where S denotes the vehicle position, V the vehicle longitudinal speed, and U the control input.
1-2. Establish the longitudinal force-balance equation of the vehicle:

m·dV(t)/dt = Fhf(t) + Fhr(t) − Fa(t) − Fr(t) (2)

where Fhf(t) and Fhr(t) are the longitudinal tire forces on the front and rear wheels at time t, respectively, Fa(t) is the air resistance at time t, Fr(t) is the rolling resistance at time t, m is the vehicle mass, and V is the vehicle longitudinal speed.
The air resistance acting on the vehicle at time t can be expressed as:

Fa(t) = 0.5·ρ·CD(dh)·Av·Vh²(t) (3)

where ρ is the air density, CD(dh) is the aerodynamic drag coefficient, which depends on the inter-vehicle distance, Av is the frontal area of the host vehicle, and Vh²(t) is the square of the host vehicle's speed.
The rolling resistance at time t is:

Fr(t) = mgμ (4)

where g is the gravitational acceleration, μ is the rolling resistance coefficient, and m is the vehicle mass.
The aerodynamic drag coefficient can be expressed as:

CD(dh) = Ch,d·(1 − c1/(c2 + dh)) (5)

where Ch,d is the nominal aerodynamic drag coefficient of the host vehicle, the parameters c1 and c2 are obtained by regression on experimental data, and dh is the inter-vehicle distance.
The inter-vehicle distance dh is calculated as:

dh = Sp − Sh − Lcar (6)

where Lcar is the body length of the electric vehicle, and Sp and Sh are the positions of the preceding vehicle and the host vehicle, respectively.
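As a worked illustration of equations (3), (4) and (6), the sketch below evaluates the resistive forces. All numeric parameter values are assumptions, and the drafting-style form of CD(dh), in which a smaller gap gives lower drag, is likewise an assumption, since the text only names the regression parameters c1 and c2.

```python
RHO, A_V = 1.206, 2.2            # air density [kg/m^3], frontal area [m^2] (assumed)
M, G, MU = 1500.0, 9.81, 0.015   # mass [kg], gravity [m/s^2], rolling coeff. (assumed)
L_CAR = 4.5                      # body length [m] (assumed)
CHD, C1, C2 = 0.32, 3.0, 10.0    # nominal drag coeff. and regression params (assumed)

def following_distance(s_p, s_h):
    """Eq. (6): d_h = S_p - S_h - L_car."""
    return s_p - s_h - L_CAR

def drag_coefficient(d_h):
    """Assumed distance-dependent drag: closer following gives lower drag."""
    return CHD * (1.0 - C1 / (C2 + d_h))

def air_resistance(v_h, d_h):
    """Eq. (3): F_a = 0.5 * rho * C_D(d_h) * A_v * V_h^2."""
    return 0.5 * RHO * drag_coefficient(d_h) * A_V * v_h ** 2

def rolling_resistance():
    """Eq. (4): F_r = m * g * mu."""
    return M * G * MU
```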
1-3. Establish the motor drive model of the electric vehicle.

In the present invention the controlled subject is an electric vehicle whose driving force is provided by a motor. To describe the vehicle dynamics accurately, and assuming the motor-efficiency constraint is not considered here, the actual motor output torque Tm and motor speed ωm can be expressed as:

Tm(t) = Tω(t)·R/Gr, ωm(t) = Gr·V(t)/R (7)

where R and Gr are the tire radius and the fixed gear ratio, respectively, and Tω(t) is the traction force at time t.
1-4. Establish the battery power model of the electric vehicle. Ignoring the influence of auxiliaries on battery power, the desired battery output power can be equated with the desired motor input power:

Pbat(t) = Tm(t)·ωm(t)/ηm(t) (8)

where Pbat is the battery power and ηm is the motor efficiency. The motor efficiency can be described as:

ηm(t) = fm(ωm(t), Tm(t)) (9)

where fm is the power-conversion efficiency function of the motor.
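A minimal sketch of equations (7) and (8), with the efficiency map fm of equation (9) replaced by an assumed constant. The gear ratio, wheel radius, and the reading of Tω as a traction force are illustrative assumptions, not values from the patent.

```python
R_WHEEL, G_RATIO = 0.3, 7.94   # tire radius [m] and fixed gear ratio (assumed)
ETA_M = 0.9                    # assumed constant motor efficiency

def motor_torque(traction_force):
    """Eq. (7), reading T_w as traction force: T_m = F * R / G_r."""
    return traction_force * R_WHEEL / G_RATIO

def motor_speed(v):
    """Eq. (7): omega_m = V * G_r / R."""
    return v * G_RATIO / R_WHEEL

def battery_power(traction_force, v):
    """Eq. (8): P_bat = T_m * omega_m / eta_m when driving; efficiency is
    applied in the opposite direction when regenerating (an assumption)."""
    mech = motor_torque(traction_force) * motor_speed(v)   # equals F * V
    return mech / ETA_M if mech >= 0 else mech * ETA_M
```

Note that the gear ratio and wheel radius cancel in the mechanical power, so Pbat reduces to F·V scaled by the efficiency, which is a useful sanity check on the model.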
1-5. Establish the charge/discharge resistance model of the electric vehicle battery:

where SoCbat(t) is the state of charge of the battery pack at time t, Rbat(t) is the battery resistance at time t, the remaining coefficients are the charging-model and discharging-model coefficients of the battery pack, and Ibat(t) is the battery pack current at time t.
Step 2. Perform eco-adaptive cruise control of the electric vehicle based on multi-step reinforcement learning, and determine the optimization objectives.

2-1. Optimization objective based on safe driving. To ensure driving safety, the inter-vehicle distance must be constrained by:

dmin(t) ≤ dh(t) ≤ dmax(t) (11)

where dh(t) is the distance between the host vehicle and the preceding vehicle at time t, and dmin(t) and dmax(t) are the minimum and maximum allowed inter-vehicle distances, respectively, both obtained from the Q-table.

dmin(t) and dmax(t) can be expressed as:
2-2. Optimization objective based on driving comfort. To ensure driving comfort, the control input of the electric vehicle must be constrained by:

amin ≤ U(t) ≤ amax (13)

where amin and amax are the minimum and maximum allowed accelerations, respectively. In the present invention, amin = -1 m/s² and amax = 1 m/s².
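The constraints of equations (11) and (13) amount to simple range checks; a minimal sketch, assuming nothing beyond the bounds given in the text:

```python
A_MIN, A_MAX = -1.0, 1.0   # comfort bounds from eq. (13)

def clamp_control(u):
    """Project a candidate control input onto the comfort bounds of eq. (13)."""
    return max(A_MIN, min(A_MAX, u))

def spacing_ok(d_h, d_min, d_max):
    """Eq. (11): the inter-vehicle gap must stay inside the allowed band."""
    return d_min <= d_h <= d_max
```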
2-3. Optimization objective based on extending battery life. Reducing the squared battery current extends the battery's life. Therefore, to prolong battery life as far as possible, the following quantity should be minimized:

∫ from P0 to Tcyc of Ibat²(t) dt (14)

where Ibat²(t) is the square of the battery pack current at time t, P0 is the start time of the driving cycle, and Tcyc is the end time of the driving cycle.
2-4. Optimization objective based on vehicle energy economy. To improve the energy economy of the electric vehicle, the following quantity should be minimized:

∫ from P0 to Tcyc of Pbat(t) dt (15)
Step 3. Determine the multi-step reinforcement learning algorithm as the n-step tree-backup algorithm.

3-1. Determine the state variables and control variables of the Eco-ACC strategy based on multi-step reinforcement learning.

① State variable X(t): for the electric vehicle to follow the preceding vehicle within a reasonable inter-vehicle distance, equation (11) must be satisfied. The car-following performance can therefore be evaluated by the inter-vehicle distance deviation Δd and the speed deviation ΔV, defined as:

X(t) = [ΔV(t), Δd(t)]^T (16)

where

ΔV(t) = Vp(t) − V(t) (18)

where BSF(·) is the band-stop function, α, β, cf are coefficients related to the band-stop function (see Table 1), and Vp(t) is the speed of the preceding vehicle at time t.
To express the vehicle state information more intuitively, the band-stop function is improved and the inter-vehicle distance is described as:

where δd(t) is the equivalent inter-vehicle distance deviation at time t, and α, β, cfz are coefficients related to the band-stop function (see Table 1).

② Control variable: the control variable of the present invention is the acceleration.

U(t) = a(t) (20)

where a(t) is the acceleration of the host vehicle at time t.
3-2. Determine the reward function and value function of the Eco-ACC strategy based on multi-step reinforcement learning.

③ Reward function: to achieve the control objectives, the reward function is given as:

r(X(t), U(t)) = α1L1(t) + α2L2(t) + α3L3(t) (21)

where α1, α2 and α3 are the weight coefficients of the reward function (see Table 1). L1, L2 and L3 can be expressed as:
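A sketch of the weighted reward of equation (21). The term expressions L1 through L3 and the Table 1 weights are not reproduced in this text, so the costs below (tracking, comfort, battery current) and the weight values are illustrative assumptions only.

```python
ALPHA = (1.0, 0.5, 0.1)  # placeholder weights alpha1..alpha3 (Table 1 values not given here)

def reward(dv, dd, u, i_bat):
    """Eq. (21): r = alpha1*L1 + alpha2*L2 + alpha3*L3, with assumed L-terms."""
    l1 = dv ** 2 + dd ** 2   # assumed car-following tracking cost
    l2 = u ** 2              # assumed comfort / control-effort cost
    l3 = i_bat ** 2          # battery-current term, in the spirit of eq. (14)
    return ALPHA[0] * l1 + ALPHA[1] * l2 + ALPHA[2] * l3
```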
④ Value function: the value function of the eco-ACC strategy based on multi-step reinforcement learning can be expressed as:

where γ is the discount factor and α, β are parameters of the band-stop function (see Table 1 for details).

In reinforcement learning, the agent's ultimate goal is to maximize the cumulative reward. The reward function judges in the short term whether an action is good or bad. The value function can therefore be described as:

3-3. The multi-step learning algorithm adopts an off-policy method that does not require importance sampling, namely the tree-backup method; the n-step tree-backup diagram is shown in Fig. 2.
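The n-step tree-backup update can be sketched as follows for a cost-minimizing agent with a greedy (deterministic) target policy, in which case the tree-backup recursion needs no importance sampling and truncates whenever the behaviour action differs from the greedy one. All names and the discount value are illustrative, not taken from the patent.

```python
import numpy as np

GAMMA = 0.95  # illustrative discount factor

def tree_backup_target(Q, costs, states, actions):
    """n-step tree-backup return for a greedy, cost-minimising target policy.

    costs[k]  : cost received after step k (k = 0 .. n-1)
    states[k] : state index reached after step k
    actions[k]: action index actually taken in states[k], for k < n-1
    With a deterministic greedy target policy, the recursion continues
    through greedy actions and truncates at non-greedy ones.
    """
    n = len(costs)
    # Bootstrap with the greedy value of the last reached state.
    g = costs[-1] + GAMMA * float(np.min(Q[states[-1]]))
    for k in range(n - 2, -1, -1):
        greedy = int(np.argmin(Q[states[k]]))
        if actions[k] == greedy:
            g = costs[k] + GAMMA * g                             # keep backing up
        else:
            g = costs[k] + GAMMA * float(np.min(Q[states[k]]))   # truncate here
    return g
```

The returned target would then be used to update Q at the state-action pair where the n-step trajectory began, as in the tree-backup recursion of standard multi-step reinforcement learning.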
3-4. The simulation platform and parameter settings of the eco-ACC strategy based on multi-step reinforcement learning are shown in Fig. 3.

The band-stop function parameters and the reward-function weight coefficients are listed in Table 1.

Table 1

Step 4. The intelligent vehicle car-following control method based on the multi-step reinforcement learning algorithm is verified under different driving cycles, as shown in the table below.

Table 2. The intelligent vehicle car-following control method based on the multi-step reinforcement learning algorithm under different driving cycles

The simulation results are shown in Fig. 4. They show that the speed of the vehicle controlled by the invented Eco-ACC system based on the multi-step reinforcement learning algorithm closely tracks that of the preceding vehicle, and its acceleration is smoother than under a conventional ACC system, making passengers more comfortable; the actual distance between the controlled vehicle and the preceding vehicle always remains within the safe range, ensuring driving safety; and the vehicle controlled by the Eco-ACC system consumes less energy than one controlled by a conventional ACC system.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210770539.XA CN114954455B (en) | 2022-06-30 | 2022-06-30 | A method for controlling electric vehicle following vehicle based on multi-step reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114954455A true CN114954455A (en) | 2022-08-30 |
CN114954455B CN114954455B (en) | 2024-07-02 |
Family
ID=82966644
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210770539.XA Active CN114954455B (en) | 2022-06-30 | 2022-06-30 | A method for controlling electric vehicle following vehicle based on multi-step reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114954455B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109484407A (en) * | 2018-11-14 | 2019-03-19 | 北京科技大学 | A kind of adaptive follow the bus method that electric car auxiliary drives |
CN112046484A (en) * | 2020-09-21 | 2020-12-08 | 吉林大学 | Q learning-based vehicle lane-changing overtaking path planning method |
CN112989553A (en) * | 2020-12-28 | 2021-06-18 | 郑州大学 | Construction and application of CEBs (common electronic devices and controllers) speed planning model based on battery capacity loss control |
WO2021197246A1 (en) * | 2020-03-31 | 2021-10-07 | 长安大学 | V2x-based motorcade cooperative braking method and system |
Non-Patent Citations (1)
Title |
---|
李文昌;郭景华;王进;: "分层架构下智能电动汽车纵向运动自适应模糊滑模控制", 厦门大学学报(自然科学版), no. 03, 28 May 2019 (2019-05-28) * |
Also Published As
Publication number | Publication date |
---|---|
CN114954455B (en) | 2024-07-02 |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |