CN113264031A - Hybrid power system control method based on road surface identification and deep reinforcement learning - Google Patents

Info

Publication number: CN113264031A (application CN202110766400.3A; granted as CN113264031B)
Authority: CN (China)
Prior art keywords: neural network, road surface, power system, hybrid power, control method
Inventors: 唐小林, 陈佳信, 汪锋, 胡晓松, 邓忠伟, 李佳承
Applicant and current assignee: Chongqing University
Legal status: Active (granted)

Classifications

    • B60W20/00 Control systems specially adapted for hybrid vehicles
    • B60W40/00 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems
    • B60W60/00 Drive control systems specially adapted for autonomous road vehicles
    • G06F18/214 Pattern recognition; generating training patterns
    • G06N3/045 Neural networks; combinations of networks
    • G06V20/588 Recognition of the road, e.g. of lane markings


Abstract

The invention relates to a hybrid power system control method based on road surface identification and deep reinforcement learning, and belongs to the field of intelligent control of new energy vehicles. The method comprises: S1: establishing a parallel hybrid power system with a P3 structure and a driving environment model; S2: building a VGG convolutional neural network, collecting pictures of different road surface types, and training the network to extract road-surface-type features; S3: determining the optimal slip ratio for the braking phase from the slip ratio-adhesion coefficient characteristic curve and using it as the reference value of the motor-speed fine-tuning strategy; S4: establishing a three-dimensional neural network suitable for multi-objective control based on the DQN algorithm; S5: defining the state variable space, action variable space, and reward function of the three-dimensional neural network and completing iterative training; S6: extracting and saving the neural network that synchronously fits three parameterized control strategies, thereby cooperatively guaranteeing the fuel economy and braking safety of the hybrid electric vehicle.

Description

Hybrid power system control method based on road surface identification and deep reinforcement learning
Technical Field
The invention belongs to the technical field of intelligent control of new energy automobiles, and relates to a hybrid power system control method based on road surface identification and deep reinforcement learning.
Background
At present, the main types of new energy vehicles include battery electric vehicles, plug-in hybrid electric vehicles, and fuel cell vehicles. By comparison, hybrid electric vehicles offer good fuel economy, effective emission reduction, long driving range, mature control technology, and low demands on battery performance, making them well suited for future development. Meanwhile, the key technical route of intelligent vehicles comprises, in order, environment perception, intelligent decision-making, and control execution; this route is not specific to any particular vehicle type but treats the vehicle macroscopically as a controlled entity. On this basis, if the environment perception technology of intelligent vehicles is organically integrated with the energy management strategy of hybrid electric vehicles, a vehicle equipped with a visual recognition function can acquire environment information in real time and thus perform more intelligent and reasonable system control.
Most existing control methods for hybrid electric vehicle systems consider only traffic information or road gradient, and rarely account for the road surface type.
Disclosure of Invention
In view of the above, the present invention provides a hybrid power system control method based on road surface identification and deep reinforcement learning. A computer-vision convolutional network model completes online identification of the road surface type, after which a deep reinforcement learning algorithm performs economy-oriented energy management and safety-oriented brake control over multiple components of the hybrid power system, achieving a cooperative guarantee of fuel economy and braking safety. The method is suitable for unmanned hybrid vehicles.
In order to achieve the purpose, the invention provides the following technical scheme:
a hybrid power system control method based on road surface identification and deep reinforcement learning specifically comprises the following steps:
S1: establishing a parallel hybrid power system with a P3 structure (i.e., the motor is located between the transmission and the final drive) and a driving environment model fusing multiple kinds of time-varying state information, completing the construction of the training environment;
S2: building a VGG16 convolutional neural network for road surface identification, acquiring images of five typical road surfaces, and training the convolutional neural network to extract road-surface-type features;
S3: after identifying the road surface type online, determining the optimal slip ratio for the braking phase from the slip ratio-adhesion coefficient characteristic curve, and using it as the reference value of the motor-speed fine-tuning strategy in the subsequent technical system;
S4: establishing a three-dimensional neural network suitable for multi-objective control based on the Deep Q-Network (DQN) algorithm;
S5: defining the state variable space, action variable space, and reward function of the three-dimensional neural network, then performing iterative training on it;
S6: extracting and saving the neural network that synchronously fits the three parameterized control strategies, realizing the cooperative guarantee of fuel economy and braking safety of the hybrid electric vehicle; the three parameterized control strategies comprise a motor-speed fine-tuning strategy for the braking phase, an engine power control strategy, and a mechanical continuously variable transmission (CVT) gear-shifting strategy.
Further, in step S1, the time-varying state information includes the longitudinal driving speed, the road gradient, the number of passengers, and the driving images captured by the on-board camera.
Further, in step S2, pictures of five typical road surfaces are collected: dry asphalt, dry cobblestone, wet asphalt, wet cobblestone, and snow-covered surfaces. Batch cropping removes the surrounding environment so that only the road surface portion remains, ensuring that the pixel information fed to the convolutional neural network is valid information.
Further, in step S3, determining the optimal slip ratio for the braking phase specifically includes: after the current road surface type is identified online, the optimal slip ratio that fully exploits the road adhesion condition is determined from the slip ratio-adhesion coefficient characteristic curve. The aim is that, when the vehicle is braking, adjusting the motor speed (which is directly related to the wheel speed) in regenerative braking mode maintains a certain slip ratio, thereby matching the effectiveness of an anti-lock braking system (ABS) without invoking the friction brakes. The slip ratio is determined according to the following formula:

s = (v_veh − r·ω_wheel) / v_veh

where s is the slip ratio, v_veh is the longitudinal vehicle speed, r is the wheel radius, and ω_wheel is the wheel rotational speed. The motor speed corresponding to the optimal slip ratio is then used as the reference value of the motor-speed fine-tuning strategy in the subsequent technical system.
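The slip-ratio relation above can be sketched directly; `reference_wheel_speed` is an illustrative helper not named in the patent, and the treatment of standstill is an assumption.

```python
def slip_ratio(v_veh, r, omega_wheel):
    """Braking-phase slip ratio s = (v_veh - r * omega_wheel) / v_veh."""
    if v_veh <= 0.0:
        return 0.0  # slip is undefined at standstill; treat as zero
    return (v_veh - r * omega_wheel) / v_veh

def reference_wheel_speed(v_veh, r, s_opt):
    """Wheel speed (rad/s) that realizes the optimal slip ratio s_opt.

    The motor-speed reference then follows through the fixed drivetrain
    ratio between motor and wheels in the P3 layout.
    """
    return v_veh * (1.0 - s_opt) / r
```

For example, at v_veh = 20 m/s with r = 0.3 m, holding the wheels at `reference_wheel_speed(20.0, 0.3, 0.2)` keeps the slip ratio at an assumed optimum of 0.2.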
Further, in step S4, establishing the three-dimensional neural network specifically includes: establishing a deep Q-network algorithm framework and defining hyperparameters that maximize computational efficiency and learning effect. The framework comprises an environment module and an agent module. The environment module contains the parallel hybrid power system established in step S1 and the driving environment model fusing multiple kinds of time-varying state information, serving as the training environment from which the optimal control strategy is extracted. The agent module contains the deep-reinforcement-learning-based deep Q-network algorithm, specifically including a target network and an experience replay mechanism. The hyperparameters include the learning rate, the decay rate of the greedy coefficient, and the experience pool capacity.
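The experience replay mechanism and decaying greedy coefficient named above can be sketched as follows; the capacity, batch size, and decay constants are illustrative assumptions, not values from the patent.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity experience pool; the oldest transitions are evicted first."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # uniform random minibatch for the DQN update
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)

def greedy_epsilon(step, eps_start=1.0, eps_end=0.05, decay=0.995):
    """Exponentially decaying greedy coefficient (assumed schedule)."""
    return max(eps_end, eps_start * decay ** step)
```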
Further, in step S5, the state variable space S is defined as:

S = {soc, vel, acc, ω_mg, i_CVT, P_eng, θ, Road_surface, N_people}

where soc is the battery state of charge, vel is the speed, acc is the acceleration, ω_mg is the motor speed, i_CVT is the transmission ratio of the continuously variable transmission, P_eng is the engine power, θ is the road gradient, Road_surface is the road surface type, and N_people is the number of passengers. Within this nine-dimensional state space, the battery state of charge, speed, and acceleration belong to the vehicle system state; the motor speed, CVT ratio, and engine power belong to the controlled-component state; and the gradient, road surface type, and number of passengers belong to the driving environment state.
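The nine-dimensional state vector can be assembled following the grouping above; the function name and argument ordering are illustrative.

```python
def make_state(soc, vel, acc, omega_mg, i_cvt, p_eng, theta, road_surface, n_people):
    """Nine-dimensional DQN state vector S, grouped as in the patent."""
    return [
        soc, vel, acc,                  # vehicle system state
        omega_mg, i_cvt, p_eng,         # controlled-component state
        theta, road_surface, n_people,  # driving-environment state
    ]
```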
Further, in step S5, the action variable space A is defined as:

A = {Δω_mg, Δi_CVT, ΔP_eng}

where Δω_mg is the change in motor speed, Δi_CVT is the change in the transmission ratio of the CVT, and ΔP_eng is the change in engine power.
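With discrete increments for each actuator, the DQN action set is their Cartesian product; the step sizes below are illustrative assumptions, not values from the patent.

```python
from itertools import product

# assumed discrete increments for each controlled component
D_OMEGA_MG = (-50.0, 0.0, 50.0)   # motor-speed change, rpm
D_I_CVT    = (-0.05, 0.0, 0.05)   # CVT-ratio change
D_P_ENG    = (-2.0, 0.0, 2.0)     # engine-power change, kW

# each index into ACTIONS selects one (d_omega_mg, d_i_cvt, d_p_eng) triple
ACTIONS = list(product(D_OMEGA_MG, D_I_CVT, D_P_ENG))
```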
Further, in step S5, the reward function R is defined by a formula [given as an image in the original publication] with the following symbols: α, β, γ, χ, and ξ are weighting coefficients; t is the time; soc_target is the target state of charge; abs denotes the absolute value; ω_ref is the reference motor speed; i_CVT is the CVT transmission ratio; i_ref is the reference transmission ratio; ṁ_fuel is the instantaneous fuel consumption; T_eng is the engine torque; n_eng is the engine speed; and η_eng is the engine efficiency.
Further, in step S6, extracting and saving the neural network that synchronously fits the three parameterized control strategies specifically includes: after the total cumulative reward value converges stably (marking the end of training), the parameters of the constructed three-dimensional deep neural network are extracted and saved as a persistent model. Three parameterized control strategies are stored in this model simultaneously, namely the motor-speed fine-tuning strategy for the braking phase, the engine power control strategy, and the mechanical CVT gear-shifting strategy, thereby realizing synchronous learning of three different types of control strategies.
The invention has the following beneficial effects: by incorporating computer vision technology, it provides the hybrid electric vehicle with an intelligent energy management strategy that simultaneously guarantees fuel economy and braking safety. A computer-vision convolutional network model completes online identification of the road surface type, and a deep reinforcement learning algorithm performs economy-oriented energy management and safety-oriented brake control over multiple components of the hybrid power system, realizing their cooperative guarantee. The method is suitable for unmanned hybrid vehicles.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of a control strategy of the method of the present invention;
FIG. 2 is a parallel hybrid powertrain of the P3 configuration;
FIG. 3 is a driving environment modeling diagram;
FIG. 4 is a network architecture diagram of the VGG16;
FIG. 5 is a graph of slip versus road adhesion coefficient for different road types;
FIG. 6 is a diagram of the DQN algorithm structure;
FIG. 7 is a diagram of the three-dimensional deep neural network structure.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
Referring to FIG. 1 to FIG. 7, a preferred embodiment of the hybrid power system control method based on road surface identification and deep reinforcement learning specifically includes the following steps:
s1: and establishing a parallel hybrid power system with a P3 structure and a driving environment model fusing various time-varying state information, wherein the driving environment model comprises longitudinal driving speed, gradient, passenger number, driving pictures acquired by a vehicle-mounted camera and the like, and completing the establishment of a training environment.
A parallel hybrid power system (comprising an engine, a clutch, a hydraulic torque converter, a motor/generator, a lithium ion power battery, a mechanical continuously variable transmission, a rear axle and the like) with a P3 structure shown in FIG. 2 is built, and a driving environment model (comprising longitudinal driving speed, gradient, passenger number, driving pictures collected by an on-board camera and the like) with a plurality of time-varying states is fused as shown in FIG. 3. In the former case, since the motor is installed between the mechanical continuously variable transmission and the final drive in the P3 configuration, the rotation speed of the motor is directly related to the rotation speed of the wheels, and the engine operating speed can be controlled by adjusting the real-time transmission ratio of the transmission. For the latter, the total running time of the driving environment model is 3602 seconds (namely the combination of a forward WLTC working condition and a reverse WLTC working condition), and the divided 10 modules are distributed with respective driving pictures and the number of passengers while the WLTC speed track and the real road gradient are included, so that an environment model containing various time-varying state information is built.
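The segmentation of the 3602-second cycle into 10 environment modules can be sketched as a simple time-to-module lookup; the uniform split below is an assumption, since the patent does not state how the modules are divided.

```python
TOTAL_TIME = 3602  # forward WLTC + reverse WLTC, in seconds
N_MODULES = 10     # environment modules, each with its own images and passenger count

def module_index(t):
    """Map a time step t (seconds, 0 <= t < TOTAL_TIME) to a module index 0..9.

    Assumes the 10 modules partition the cycle uniformly, which the
    patent does not specify.
    """
    if not 0 <= t < TOTAL_TIME:
        raise ValueError("t is outside the driving cycle")
    return min(t * N_MODULES // TOTAL_TIME, N_MODULES - 1)
```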
S2: Establish a VGG convolutional network model for road surface identification, collect picture material of five typical road surfaces, and train the convolutional neural network to extract road-surface-type features.
A VGG16 convolutional network model as shown in FIG. 4 is built in Python with the PyTorch deep learning toolkit for online road-surface-type identification. Driving pictures are recorded by the on-board camera to assemble a large deep learning dataset: for each of the five typical road surfaces (dry asphalt, dry cobblestone, wet asphalt, wet cobblestone, and snow), more than 2000 pictures are collected. Batch cropping removes the surrounding environment and keeps only the road surface portion, ensuring that all pixel information fed to the neural network is valid information. The VGG16 network is then trained for road-surface-type identification with 90% of the material defined as the training set and the remaining 10% as the test set.
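The 90%/10% split of the collected pictures can be sketched as below; the fixed seed and the function name are illustrative assumptions.

```python
import random

def split_dataset(image_paths, train_frac=0.9, seed=0):
    """Shuffle the picture list and split it into training and test sets."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    paths = list(image_paths)
    rng.shuffle(paths)
    n_train = int(len(paths) * train_frac)
    return paths[:n_train], paths[n_train:]
```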
Following the conventional deep learning training scheme, each original picture is compressed to the size of the neural network input layer before feature extraction; after a series of convolutional, pooling, and fully connected layers, the classification layer outputs the road surface type to which the picture belongs.
S3: After the road surface type is identified online, determine the optimal slip ratio for the braking phase from the slip ratio-adhesion coefficient characteristic curve, and use it as the reference value of the motor-speed fine-tuning strategy in the subsequent technical system.
After driving pictures from the on-board camera are fed into the trained VGG16 network to identify the road surface type online, the optimal slip ratio that fully exploits the road adhesion condition is determined from the slip ratio-adhesion coefficient characteristic curve shown in FIG. 5. The aim is that, when the vehicle is braking, controlling the motor speed (which is directly related to the wheel speed) in regenerative braking mode maintains a certain slip ratio, thereby matching the effectiveness of an anti-lock braking system (ABS) without invoking the friction brakes. The slip ratio is determined according to the following formula:

s = (v_veh − r·ω_wheel) / v_veh

where s is the slip ratio, v_veh is the longitudinal vehicle speed, r is the wheel radius, and ω_wheel is the wheel rotational speed. The optimal slip ratio and the corresponding optimal motor speed also serve as reference values of the motor-speed fine-tuning strategy in the subsequent technical system.
S4: Establish a three-dimensional neural network model suitable for multi-objective control based on the Deep Q-Network (DQN) algorithm.
The DQN algorithm framework shown in FIG. 6 is built in Python with the PyTorch deep learning toolkit. In FIG. 6, the solid line represents the agent's reinforcement learning training loop, in which the ε-greedy policy selects a random action with probability ε and the best action known so far with probability 1 − ε; the greedy coefficient ε decays gradually as iterative training progresses. The dashed line indicates the computation flow of the loss function, and the dash-dot line indicates the gradient computation and back-propagation update. The environment module comprises the previously established hybrid power system model and driving environment model (serving as the training environment from which the optimal control strategy is extracted), and the agent module comprises the deep-reinforcement-learning DQN algorithm (including a target network, an experience replay mechanism, and so on). Hyperparameters that maximize computational efficiency and learning effect are also defined, including the learning rate, the decay rate of the greedy coefficient, and the experience pool capacity. Because the P3 parallel hybrid power system contains three controlled components (the engine, the motor, and the mechanical CVT), the online networks of three DQN frameworks are combined in a three-dimensional manner, in keeping with the purpose of learning control strategies with an online network, finally establishing the three-dimensional neural network shown in FIG. 7 as the basic framework for the subsequent multi-objective control strategy of the hybrid power system.
In general, the three-dimensional neural network structure contains three branches in its hidden layers, each corresponding to one of the three control strategies to be learned. The branches have identical structures: each contains three layers of 100 neurons, and every neuron uses the same activation function.
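A minimal NumPy sketch of the three-branch forward pass, following the structure just described (three branches, three hidden layers of 100 neurons each); the random weight initialization, the ReLU activation, and the three Q-value outputs per branch are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_branch(n_in=9, hidden=(100, 100, 100), n_out=3):
    """Random (weight, bias) pairs for one branch: 9-dim state in, Q-values out."""
    sizes = (n_in, *hidden, n_out)
    return [(rng.standard_normal((a, b)) * 0.1, np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(branch, x):
    """Forward pass with ReLU on hidden layers and a linear output layer."""
    for i, (w, b) in enumerate(branch):
        x = x @ w + b
        if i < len(branch) - 1:
            x = np.maximum(x, 0.0)
    return x

# one branch per controlled component: motor, engine, CVT
branches = [make_branch() for _ in range(3)]
state = rng.standard_normal(9)
q_values = [forward(br, state) for br in branches]
```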
S5: After defining the state variable space, the action variable space, and the reward function, train the three-dimensional neural network.
S51: the state variable space S is defined as follows:
S={soc,vel,acc,ωmg,iCVT,Peng,θ,Roadsurface,Npeople}
where soc is the battery state of charge, vel is the speed, acc is the acceleration, ω ismgIs the motor speed, iCVTIs the transmission ratio of the continuously variable transmission, PengIs engine power, θ is slope, RoadsurfaceIs of the road surface type, NpeopleIs the number of passengers. Meanwhile, in a nine-dimensional state variable space, the battery charge state, speed and acceleration are subordinate to the vehicle system state, the motor rotation speed, the continuously variable transmission ratio and the engine power are subordinate to the control part state, and the gradient, the road surface type and the number of passengers are subordinate to the driving environment state.
S52: the motion variable space A is defined as follows
Figure BDA0003151794970000061
Wherein, Δ ωmgIs the amount of change, Δ i, in the rotational speed of the motorCVTIs the amount of change in the transmission ratio of the transmission,ΔPengis the amount of change in engine power.
S53: reward function definition R is as follows
Figure BDA0003151794970000062
Wherein α, β, γ, χ and ξ are weight coefficients, t is time, soctargetIs the target charge state, abs denotes the absolute value, ωrefIs a reference motor speed, irefReference is made to the transmission ratio of the transmission,
Figure BDA0003151794970000063
is instantaneous oil consumption, TengIs the engine torque, nengIs the engine speed, ηengIs the engine efficiency.
After the state space, action space, and reward function have been defined, the iterative training process begins.
S6: Extract and save the neural network that synchronously fits the three parameterized control strategies (the motor-speed fine-tuning strategy, the engine power control strategy, and the mechanical CVT gear-shifting strategy), realizing the cooperative guarantee of fuel economy and braking safety.
Guided by the reward function, the training model gradually increases its cumulative reward value until it converges stably, which marks the end of the training process and the successful learning of the optimal control strategy. The parameters of the three-dimensional deep neural network are then extracted and saved as a persistent model for practical testing and application. Three parameterized control strategies are stored simultaneously in this model: the motor-speed fine-tuning strategy for the braking phase, the engine power control strategy, and the mechanical CVT gear-shifting strategy. During normal driving (in particular the pure motor driving mode, the driving-while-charging mode, and the hybrid driving mode), optimal or near-optimal fuel economy is obtained through the engine power control strategy and the CVT gear-shifting strategy; during braking, when the braking demand is within the motor's braking capability, braking-energy recovery and safe braking are realized through the motor-speed fine-tuning strategy. Synchronous learning of three different types of control strategies is thus achieved, along with the cooperative guarantee of fuel economy and braking safety.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims (9)

1.一种基于路面识别与深度强化学习的混合动力系统控制方法,其特征在于,该方法具体包括以下步骤:1. a hybrid system control method based on road surface identification and deep reinforcement learning, is characterized in that, this method specifically comprises the following steps: S1:建立P3结构的并联式混合动力系统以及融合多种时变状态信息的驾驶环境模型,完成训练环境的搭建;S1: Establish a parallel hybrid power system with a P3 structure and a driving environment model that integrates various time-varying state information to complete the construction of the training environment; S2:建立用于路面识别的VGG卷积神经网络,采集不同类型路面的图片,并对卷积神经网络进行关于路面类型特征提取的训练;S2: Establish a VGG convolutional neural network for road surface recognition, collect pictures of different types of road surfaces, and train the convolutional neural network on the feature extraction of road surface types; S3:通过在线识别路面类型后,根据滑动率-附着系数特征曲线确定制动阶段的最优滑动率,并且作为电机转速微调策略的参考值;S3: After identifying the road surface type online, determine the optimal slip rate in the braking stage according to the slip rate-adhesion coefficient characteristic curve, and use it as a reference value for the fine-tuning strategy of the motor speed; S4:基于深度值网络(Deep Q-Network,DQN)算法建立适用于多目标控制的立体神经网络;S4: Establish a stereo neural network suitable for multi-objective control based on the Deep Q-Network (DQN) algorithm; S5:定义立体神经网络的状态变量空间、动作变量空间以及奖励函数,然后对立体神经网络进行迭代训练;S5: Define the state variable space, action variable space and reward function of the stereo neural network, and then iteratively train the stereo neural network; S6:提取并保存同步拟合了三种参数化控制策略的神经网络,实现混合动力汽车燃油经济性与制动安全性的协同保证;其中,三种参数化控制策略包括制动阶段的电机转速微调策略、发动机功率控制策略与机械式无极变速器换挡策略。S6: Extract and save the neural network that synchronously fits three parametric control strategies, to achieve synergistic guarantee of fuel economy and braking safety of HEVs; among them, the three parametric control strategies include the motor speed in the braking phase Fine-tuning strategy, engine power control strategy and mechanical continuously variable transmission shifting strategy. 
2. The hybrid power system control method according to claim 1, wherein in step S1 the multiple kinds of time-varying state information comprise: the longitudinal driving speed, the road gradient, the number of passengers, and the driving images captured by the on-board camera.
3. The hybrid power system control method according to claim 1, wherein in step S2 collecting pictures of different road surface types comprises collecting multiple pictures of dry asphalt, dry cobblestone, wet asphalt, wet cobblestone and snow-covered road surfaces, and batch-cropping them to discard the surrounding environment and retain only the road surface, so that all pixel information input to the convolutional neural network is valid information.
4. The hybrid power system control method according to claim 1, wherein in step S3 determining the optimal slip rate for the braking phase specifically comprises: after identifying the current road surface type online, determining from the slip rate-adhesion coefficient characteristic curve the optimal slip rate that makes full use of the road adhesion conditions; wherein the slip rate is determined according to the following formula:
s = (v_veh − ω_wheel · r) / v_veh
where s is the slip rate, v_veh is the vehicle longitudinal speed, r is the wheel radius, and ω_wheel is the wheel rotational speed; the motor speed corresponding to the optimal slip rate is then used as the reference value of the motor-speed fine-tuning strategy.
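For illustration only (not part of the claims), the slip-rate computation of claim 4 and the wheel speed that realizes a target slip rate can be sketched as below; all function names are assumptions:

```python
def slip_rate(v_veh, omega_wheel, r):
    """Braking slip rate s = (v_veh - omega_wheel * r) / v_veh.

    v_veh: vehicle longitudinal speed [m/s]
    omega_wheel: wheel rotational speed [rad/s]
    r: wheel radius [m]
    """
    if v_veh <= 0.0:
        return 0.0  # slip is undefined at standstill; report zero
    return (v_veh - omega_wheel * r) / v_veh


def reference_wheel_speed(v_veh, s_opt, r):
    """Wheel speed [rad/s] that yields the optimal slip rate s_opt,
    obtained by inverting the slip-rate formula."""
    return v_veh * (1.0 - s_opt) / r
```

The reference wheel speed (scaled by the drivetrain ratio) would serve as the reference value for the motor-speed fine-tuning strategy during braking.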
5. The hybrid power system control method according to claim 1, wherein in step S4 establishing the stereo neural network specifically comprises: establishing a deep Q-network algorithm framework and defining the hyperparameters that maximize computational efficiency and learning performance; the deep Q-network algorithm framework comprises an environment module and an agent module; the environment module comprises the parallel hybrid power system established in step S1 and the driving environment model fusing multiple kinds of time-varying state information, serving as the training environment for extracting the optimal control strategy; the agent module comprises the deep Q-network algorithm based on deep reinforcement learning, specifically including a target network and an experience replay mechanism; the hyperparameters comprise: the learning rate, the decay rate of the greedy coefficient, and the experience pool capacity.
6. The hybrid power system control method according to claim 1, wherein in step S5 the defined state variable space S is expressed as:
S = {soc, vel, acc, ω_mg, i_CVT, P_eng, θ, Road_surface, N_people}
where soc is the battery state of charge, vel is the speed, acc is the acceleration, ω_mg is the motor speed, i_CVT is the continuously variable transmission ratio, P_eng is the engine power, θ is the road gradient, Road_surface is the road surface type, and N_people is the number of passengers.
7. The hybrid power system control method according to claim 1, wherein in step S5 the defined action variable space A is expressed as:
A = {Δω_mg, Δi_CVT, ΔP_eng}
where Δω_mg is the change in motor speed, Δi_CVT is the change in transmission ratio, and ΔP_eng is the change in engine power.
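For illustration only, the discrete action space of claim 7 can be enumerated as the Cartesian product of per-variable increments, with one Q-network output per tuple; the increment values below are assumptions, not taken from the patent:

```python
from itertools import product

# Illustrative increment grids (values assumed):
D_OMEGA = [-50.0, 0.0, 50.0]   # motor-speed changes Δω_mg, rpm
D_RATIO = [-0.1, 0.0, 0.1]     # CVT-ratio changes Δi_CVT
D_POWER = [-5.0, 0.0, 5.0]     # engine-power changes ΔP_eng, kW

# Joint action space A = {Δω_mg, Δi_CVT, ΔP_eng}: the DQN picks one tuple per step.
ACTIONS = list(product(D_OMEGA, D_RATIO, D_POWER))
```

With three levels per variable this yields 27 joint actions, which keeps the Q-network output head small while still covering all three control targets.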
8. The hybrid power system control method according to claim 1, wherein in step S5 the defined reward function R is expressed as:
[Formula image in original: R is a weighted penalty combining the instantaneous fuel consumption, the deviation of soc from soc_target, the deviation of the motor speed from ω_ref, and the deviation of i_CVT from i_ref, with weight coefficients α, β, γ, χ and ξ.]
where α, β, γ, χ and ξ are weight coefficients, t is the time, soc is the battery state of charge, soc_target is the target state of charge, abs denotes the absolute value, ω_ref is the reference motor speed, i_CVT is the continuously variable transmission ratio, i_ref is the reference transmission ratio, ṁ_fuel is the instantaneous fuel consumption, T_eng is the engine torque, n_eng is the engine speed, and η_eng is the engine efficiency.
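Since the formula image for claim 8 is not reproduced in this text, the following is only a hedged sketch of a weighted-penalty reward consistent with the symbols listed above; the exact functional form and all default weights are assumptions:

```python
def reward(fuel, soc, soc_target, omega_mg, omega_ref, i_cvt, i_ref,
           alpha=1.0, beta=1.0, gamma=1.0, chi=1.0):
    """Assumed weighted-penalty reward: penalize instantaneous fuel use
    and the deviations of SOC, motor speed and CVT ratio from their
    reference values. Larger (less negative) is better."""
    return -(alpha * fuel
             + beta * abs(soc - soc_target)
             + gamma * abs(omega_mg - omega_ref)
             + chi * abs(i_cvt - i_ref))
```

With all deviations at zero the reward reduces to the pure fuel penalty, so the agent trades off fuel economy against tracking the braking-safety references.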
9. The hybrid power system control method according to claim 1, wherein in step S6 extracting and saving the neural network that synchronously fits three parameterized control strategies specifically comprises: once the total cumulative reward has stably converged, extracting the parameters of the constructed stereo deep neural network and saving them as a persistent model; this model simultaneously stores three parameterized control strategies, namely the motor-speed fine-tuning strategy for the braking phase, the engine power control strategy and the mechanical continuously variable transmission shift strategy, thereby realizing synchronous learning of three different types of control strategies.
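For illustration only, the experience replay mechanism and the decaying greedy-coefficient action selection named in claim 5 can be sketched as below; the capacity default and all names are assumptions:

```python
import random
from collections import deque


class ReplayBuffer:
    """Experience pool for DQN training (capacity is a hyperparameter)."""

    def __init__(self, capacity=10000):
        # deque with maxlen evicts the oldest transition when full
        self.buf = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buf.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # uniform random minibatch, breaking temporal correlation
        return random.sample(self.buf, batch_size)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one.
    The epsilon decay schedule is the 'greedy coefficient decay rate'."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

During training, epsilon would be decayed toward zero so the agent shifts from exploring the joint action space of claim 7 to exploiting the learned Q-values.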
CN202110766400.3A 2021-07-07 2021-07-07 Hybrid power system control method based on road surface identification and deep reinforcement learning Active CN113264031B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110766400.3A CN113264031B (en) 2021-07-07 2021-07-07 Hybrid power system control method based on road surface identification and deep reinforcement learning


Publications (2)

Publication Number Publication Date
CN113264031A true CN113264031A (en) 2021-08-17
CN113264031B CN113264031B (en) 2022-04-29

Family

ID=77236517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110766400.3A Active CN113264031B (en) 2021-07-07 2021-07-07 Hybrid power system control method based on road surface identification and deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN113264031B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113807503A (en) * 2021-09-28 2021-12-17 Institute of Advanced Technology, University of Science and Technology of China Autonomous decision making method, system, device and terminal suitable for intelligent automobile
CN115140059A (en) * 2022-07-19 2022-10-04 Shandong University Hybrid electric vehicle energy management method and system based on multi-objective optimization
CN115179779A (en) * 2022-07-22 2022-10-14 Fuzhou University A control method of intelligent driving fuel cell vehicle based on spatialization of road multi-dimensional information
CN115179779B (en) * 2022-07-22 2025-02-25 Fuzhou University Intelligent driving fuel cell vehicle control method integrating multi-dimensional road information spatialization

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107491736A (en) * 2017-07-20 2017-12-19 重庆邮电大学 A kind of pavement adhesion factor identifying method based on convolutional neural networks
CN108177648A (en) * 2018-01-02 2018-06-19 北京理工大学 A kind of energy management method of the plug-in hybrid vehicle based on intelligent predicting
CN110222953A (en) * 2018-12-29 2019-09-10 北京理工大学 A kind of power quality hybrid perturbation analysis method based on deep learning
JP2020091757A (en) * 2018-12-06 2020-06-11 富士通株式会社 Reinforcement learning program, reinforcement learning method, and reinforcement learning device
US20200194031A1 (en) * 2018-09-30 2020-06-18 Strong Force Intellectual Capital, Llc Intelligent transportation systems
CN111824095A (en) * 2020-06-14 2020-10-27 长春理工大学 Four-wheel hub electric vehicle electro-hydraulic composite brake anti-lock coordinated optimal control method
CN111845701A (en) * 2020-08-05 2020-10-30 重庆大学 A method of HEV energy management based on deep reinforcement learning in a car-following environment
CN112158189A (en) * 2020-09-30 2021-01-01 东南大学 Hybrid electric vehicle energy management method based on machine vision and deep learning
CN112550272A (en) * 2020-12-14 2021-03-26 重庆大学 Intelligent hybrid electric vehicle hierarchical control method based on visual perception and deep reinforcement learning



Also Published As

Publication number Publication date
CN113264031B (en) 2022-04-29

Similar Documents

Publication Publication Date Title
Chen et al. Deep reinforcement learning-based multi-objective control of hybrid power system combined with road recognition under time-varying environment
CN111845701B (en) HEV energy management method based on deep reinforcement learning in car following environment
CN110936824B (en) A dual-motor control method for electric vehicles based on adaptive dynamic programming
CN108177648B (en) A kind of energy management method of the plug-in hybrid vehicle based on intelligent predicting
Singh et al. Fuzzy logic and Elman neural network tuned energy management strategies for a power-split HEVs
CN104071161B (en) A kind of method of plug-in hybrid-power automobile operating mode's switch and energy management and control
CN109291925B (en) An energy-saving intelligent network-connected hybrid vehicle following control method
CN114103971B (en) Energy-saving driving optimization method and device for fuel cell automobile
CN113085666A (en) Energy-saving driving method for layered fuel cell automobile
CN111923897B (en) Intelligent energy management method for plug-in hybrid electric vehicle
CN113264031A (en) Hybrid power system control method based on road surface identification and deep reinforcement learning
CN113276829B (en) A variable weight method for vehicle driving energy saving optimization based on working condition prediction
CN114312845A (en) Deep reinforcement learning based hybrid electric vehicle control method based on map data
Shi et al. A cloud-based energy management strategy for hybrid electric city bus considering real-time passenger load prediction
CN113635879A (en) Vehicle braking force distribution method
CN115257691B (en) A hybrid electric vehicle mode switching control method based on reinforcement learning
CN115158094A (en) Plug-in hybrid electric vehicle energy management method based on long-short-term SOC (System on chip) planning
CN112550272A (en) Intelligent hybrid electric vehicle hierarchical control method based on visual perception and deep reinforcement learning
CN105667501A (en) Energy distribution method of hybrid electric vehicle with track optimization function
Chen et al. Regenerative braking control strategy for distributed drive electric vehicles based on slope and mass co-estimation
Halima et al. Energy management of parallel hybrid electric vehicle based on fuzzy logic control strategies
Dorri et al. Design and optimization of a new control strategy in a parallel hybrid electric vehicle in order to improve fuel economy
DE102017129018A1 (en) Method for operating a motor vehicle
CN111196270B (en) A kind of electric vehicle electro-hydraulic composite braking system cornering control method
CN115571113A (en) Hybrid power vehicle energy optimization control method and system based on speed planning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant