CN113264031B - Hybrid power system control method based on road surface identification and deep reinforcement learning - Google Patents


Info

Publication number
CN113264031B
CN113264031B (Application CN202110766400.3A)
Authority
CN
China
Prior art keywords
road surface
neural network
strategy
speed
control method
Prior art date
Legal status
Active
Application number
CN202110766400.3A
Other languages
Chinese (zh)
Other versions
CN113264031A (en)
Inventor
唐小林
陈佳信
汪锋
胡晓松
邓忠伟
李佳承
Current Assignee
Chongqing University
Original Assignee
Chongqing University
Priority date
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202110766400.3A priority Critical patent/CN113264031B/en
Publication of CN113264031A publication Critical patent/CN113264031A/en
Application granted granted Critical
Publication of CN113264031B publication Critical patent/CN113264031B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W20/00 Control systems specially adapted for hybrid vehicles
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00 Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B60 VEHICLES IN GENERAL
    • B60W CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W60/00 Drive control systems specially adapted for autonomous road vehicles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588 Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road

Abstract

The invention relates to a hybrid power system control method based on road surface identification and deep reinforcement learning, and belongs to the field of intelligent control of new energy automobiles. The method comprises the following steps: S1: establishing a parallel hybrid power system with a P3 structure and a driving environment model; S2: building a VGG convolutional neural network, acquiring pictures of different types of road surfaces, and training the convolutional neural network on road surface type feature extraction; S3: determining the optimal slip rate in the braking stage according to the slip rate-adhesion coefficient characteristic curve, and using it as the reference value of the motor speed fine-tuning strategy; S4: establishing a three-dimensional neural network suitable for multi-target control based on the DQN algorithm; S5: defining the state variable space, action variable space and reward function of the three-dimensional neural network, and completing iterative training; S6: extracting and storing the neural network that synchronously fits the three parameterized control strategies, thereby realizing the cooperative guarantee of fuel economy and braking safety of the hybrid electric vehicle.

Description

Hybrid power system control method based on road surface identification and deep reinforcement learning
Technical Field
The invention belongs to the technical field of intelligent control of new energy automobiles, and relates to a hybrid power system control method based on road surface identification and deep reinforcement learning.
Background
At present, the main development types of new energy automobiles include pure electric vehicles, plug-in hybrid electric vehicles and fuel cell vehicles. In comparison, hybrid electric vehicles offer good fuel economy, effective emission reduction, long driving range, mature control technology and low requirements on battery performance, making them a vehicle type well suited for future development. On the other hand, the key technical route of intelligent vehicles consists of environment perception, intelligent decision-making and control execution; this route is not specific to any particular type of vehicle and treats the vehicle macroscopically as a single controlled entity. On this basis, it is conceivable that if the environment perception technology of intelligent vehicles is organically integrated with the energy management strategy of hybrid electric vehicles, a vehicle equipped with visual recognition can acquire environmental information in real time and thereby perform more intelligent and reasonable system control.
Most existing system control methods for hybrid electric vehicles consider only road condition information or the road gradient, and hardly take the road surface type into account as an influencing factor.
Disclosure of Invention
In view of the above, the present invention provides a hybrid power system control method based on road surface identification and deep reinforcement learning, which uses a computer-vision-based convolutional network model to complete online identification of the road surface type, and then performs economy-oriented energy management and safety-oriented braking control on multiple components of the hybrid power system through a deep reinforcement learning algorithm, so as to achieve the cooperative guarantee of fuel economy and braking safety; the method is suitable for unmanned hybrid vehicles.
In order to achieve the purpose, the invention provides the following technical scheme:
a hybrid power system control method based on road surface identification and deep reinforcement learning specifically comprises the following steps:
s1: establishing a parallel hybrid power system with a P3 structure (namely, the motor is positioned between the transmission and the final drive) and a driving environment model fusing multiple kinds of time-varying state information, thereby completing the construction of the training environment;
s2: building a VGG16 convolutional neural network for road surface identification, acquiring images of five typical road surfaces, and training the convolutional neural network on road surface type characteristic extraction;
s3: after the road surface type is identified on line, determining the optimal slip rate in the braking stage according to a slip rate-adhesion coefficient characteristic curve, and using the optimal slip rate as a reference value of a motor rotating speed fine-tuning strategy in a subsequent technical system;
s4: establishing a three-dimensional neural network suitable for multi-target control based on the Deep Q-Network (DQN) algorithm;
s5: defining a state variable space, an action variable space and a reward function of the three-dimensional neural network, and then performing iterative training on the three-dimensional neural network;
s6: extracting and storing the neural network synchronously fitting the three parameterized control strategies, and realizing the cooperative guarantee of the fuel economy and the braking safety of the hybrid electric vehicle; the three parameterized control strategies comprise a motor rotating speed fine adjustment strategy in a braking stage, an engine power control strategy and a mechanical continuously variable transmission gear shifting strategy.
Further, in step S1, the plurality of time-varying state information includes: the longitudinal running speed, the gradient, the number of passengers, the running picture collected by the vehicle-mounted camera and the like.
Further, in step S2, pictures of five typical road surfaces are collected, including multiple pictures of dry asphalt, dry cobblestone, wet asphalt, wet cobblestone and snow-covered road surfaces; the surrounding environment is removed by batch cropping so that only the road surface portion is kept, ensuring that the pixel information input into the convolutional neural network is valid information.
Further, in step S3, determining the optimal slip rate in the braking stage specifically comprises: after the current road surface type is identified online, the optimal slip rate capable of fully utilizing the road adhesion condition is determined according to the slip rate-adhesion coefficient characteristic curve. The aim is that, when the vehicle is in a braking state, the motor speed (which is directly related to the wheel speed) is adjusted so that motor braking in the regenerative braking mode keeps a certain slip rate and achieves the effectiveness of an anti-lock braking system (ABS) without invoking the braking force of the friction brakes. The slip ratio is determined according to the following formula:
s = (v_veh - r·ω_wheel) / v_veh
where s is the slip ratio, v_veh is the longitudinal speed of the vehicle, r is the wheel radius and ω_wheel is the wheel rotational speed; the motor speed corresponding to the optimal slip rate is then used as the reference value of the motor speed fine-tuning strategy in the subsequent technical system.
Further, in step S4, establishing the three-dimensional neural network specifically comprises: establishing a deep Q-network algorithm framework and defining hyper-parameters that maximize the computational efficiency and the learning effect. The deep Q-network algorithm framework comprises an environment module and an agent module; the environment module comprises the parallel hybrid power system established in step S1 and the driving environment model fusing multiple kinds of time-varying state information, and serves as the training environment for extracting the optimal control strategy; the agent module comprises a deep Q-network algorithm based on deep reinforcement learning, specifically including a target network module, an experience replay mechanism module and the like. The hyper-parameters include: the learning rate, the decay rate of the greedy coefficient, and the experience pool capacity.
Further, in step S5, the expression of the state variable space S is defined as:
S = {soc, vel, acc, ω_mg, i_CVT, P_eng, θ, Road_surface, N_people}
where soc is the battery state of charge, vel is the speed, acc is the acceleration, ω_mg is the motor speed, i_CVT is the transmission ratio of the continuously variable transmission, P_eng is the engine power, θ is the slope, Road_surface is the road surface type and N_people is the number of passengers. Meanwhile, within this nine-dimensional state variable space, the battery state of charge, speed and acceleration belong to the vehicle system state, the motor speed, continuously variable transmission ratio and engine power belong to the controlled component state, and the slope, road surface type and number of passengers belong to the driving environment state.
Further, in step S5, the expression of the defined action variable space A is:
A = {Δω_mg, Δi_CVT, ΔP_eng}
where Δω_mg is the change in the motor speed, Δi_CVT is the change in the transmission ratio of the continuously variable transmission, and ΔP_eng is the change in the engine power.
Further, in step S5, the reward function R is defined as a negative weighted sum of the deviation of the battery state of charge from its target value, the deviation of the motor speed from its reference value, the deviation of the continuously variable transmission ratio from its reference value, and the instantaneous fuel consumption, where α, β, γ, χ and ξ are weight coefficients, t is the time, soc_target is the target state of charge, abs denotes the absolute value, ω_ref is the reference motor speed, i_CVT is the transmission ratio of the continuously variable transmission, i_ref is the reference transmission ratio, ṁ_fuel is the instantaneous fuel consumption, T_eng is the engine torque, n_eng is the engine speed and η_eng is the engine efficiency.
Further, in step S6, extracting and storing the neural network that synchronously fits the three parameterized control strategies specifically comprises: after the total accumulated reward value has stably converged (marking the end of training), extracting the constructed three-dimensional deep neural network parameters and storing them as a persistent model; three parameterized control strategies are stored simultaneously in the model, namely the motor speed fine-tuning strategy for the braking stage, the engine power control strategy and the mechanical continuously variable transmission shifting strategy, so that synchronous learning of three different types of control strategies is realized.
The invention has the beneficial effects that: by combining computer vision technology, an intelligent energy management strategy that synchronously guarantees fuel economy and braking safety is provided for the hybrid electric vehicle. Specifically, a computer-vision-based convolutional network model completes online identification of the road surface type, and a deep reinforcement learning algorithm then performs economy-oriented energy management and safety-oriented braking control on multiple components of the hybrid power system, thereby realizing the cooperative guarantee of fuel economy and braking safety; the method is suitable for unmanned hybrid vehicles.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of a control strategy of the method of the present invention;
FIG. 2 is a parallel hybrid powertrain of the P3 configuration;
FIG. 3 is a driving environment modeling diagram;
FIG. 4 is a network architecture diagram of VGG16;
FIG. 5 is a graph of slip versus road adhesion coefficient for different road types;
FIG. 6 is a diagram of the DQN algorithm structure;
FIG. 7 is a diagram of the three-dimensional deep neural network structure.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
Referring to fig. 1 to 7, the present invention preferably discloses a hybrid power system control method based on road surface identification and deep reinforcement learning, which specifically includes the following steps:
s1: and establishing a parallel hybrid power system with a P3 structure and a driving environment model fusing various time-varying state information, wherein the driving environment model comprises longitudinal driving speed, gradient, passenger number, driving pictures acquired by a vehicle-mounted camera and the like, and completing the establishment of a training environment.
A parallel hybrid power system with the P3 structure shown in FIG. 2 is built (comprising an engine, a clutch, a hydraulic torque converter, a motor/generator, a lithium-ion power battery, a mechanical continuously variable transmission, a rear axle and the like), together with the driving environment model shown in FIG. 3, which fuses multiple time-varying states (including the longitudinal driving speed, the gradient, the number of passengers and the driving pictures collected by the vehicle-mounted camera). For the former, since the motor in the P3 configuration is installed between the mechanical continuously variable transmission and the final drive, the motor speed is directly related to the wheel speed, and the engine operating speed can be controlled by adjusting the real-time transmission ratio of the transmission. For the latter, the total running time of the driving environment model is 3602 seconds (the combination of a forward and a reverse WLTC cycle); besides the WLTC speed trajectory and real road gradients, the cycle is divided into 10 segments, each assigned its own driving pictures and number of passengers, so that an environment model containing multiple kinds of time-varying state information is built.
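As a minimal illustrative sketch (not part of the patent), the time-varying driving environment described above can be organized in Python as follows; the segment split, array contents and field names are assumptions for illustration only.

```python
# Minimal sketch of the time-varying driving environment described above.
# The WLTC speed/gradient arrays, the 10-segment split and the per-segment
# passenger counts and picture folders are illustrative placeholders.
from dataclasses import dataclass

@dataclass
class EnvState:
    speed: float          # longitudinal speed from the WLTC trace (m/s)
    gradient: float       # real road gradient (rad)
    passengers: int       # number of passengers in the current segment
    camera_frame: str     # path of the driving picture for this time step

class DrivingEnvironment:
    def __init__(self, wltc_speed, road_gradient, segment_passengers, segment_frames):
        # 3602-second combined forward + reverse WLTC profile, split into 10 segments
        self.speed = wltc_speed
        self.gradient = road_gradient
        self.passengers = segment_passengers   # 10 entries, one per segment
        self.frames = segment_frames           # 10 picture folders, one per segment
        self.seg_len = len(wltc_speed) // 10

    def observe(self, t: int) -> EnvState:
        seg = min(t // self.seg_len, 9)
        return EnvState(self.speed[t], self.gradient[t],
                        self.passengers[seg], self.frames[seg])
```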
S2: a VGG convolutional network model for pavement identification is established, five typical pavement picture materials are collected, and training on pavement type feature extraction is conducted on a convolutional neural network.
The VGG16 convolutional network model shown in FIG. 4 is built with the Python language and the PyTorch deep learning tool for online road surface type identification. Driving pictures are recorded by the vehicle-mounted camera so that a large number of picture materials are collected as the deep learning data set for five typical road surfaces, for example more than 2000 pictures each of dry asphalt, dry cobblestone, wet asphalt, wet cobblestone and snow-covered road surfaces. The surrounding environment is removed by batch cropping so that only the road surface portion is retained, which guarantees that all pixel information input into the neural network is valid information. The VGG16 convolutional network is then trained for road surface type identification, with 90% of the material defined as the training set and the remaining 10% as the test set.
Following the conventional deep learning training scheme, each original picture is first compressed to the size of the neural network input layer for feature extraction; after processing by a series of convolutional layers, pooling layers and fully connected layers, the classification layer finally outputs the road surface type to which the picture belongs.
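The following is a hedged PyTorch sketch of such a five-class VGG16 training setup; the dataset folder layout, batch size, learning rate and epoch count are illustrative assumptions rather than values from the patent.

```python
# Sketch of training torchvision's VGG16 for five road-surface classes.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),          # compress pictures to the VGG16 input size
    transforms.ToTensor(),
])
# assumed folder layout: one sub-folder per road-surface class
train_set = datasets.ImageFolder("road_pictures/train", transform=tfm)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.vgg16(weights=None)          # pretrained weights could also be used
model.classifier[6] = nn.Linear(4096, 5)    # replace the final layer: 5 road-surface classes
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

for epoch in range(10):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```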
S3: after the road surface type is identified on line, the optimal slip rate in the braking stage is determined according to the slip rate-adhesion coefficient characteristic curve and is used as a reference value of a motor rotating speed fine-tuning strategy in a subsequent technical system.
After a driving picture from the vehicle-mounted camera is fed into the trained VGG16 convolutional network to identify the road surface type online, the optimal slip rate that makes full use of the road adhesion condition is determined from the slip rate-adhesion coefficient characteristic curve shown in FIG. 5. The aim is that, when the vehicle is braking, the motor speed (which is directly related to the wheel speed) is controlled so that motor braking in the regenerative braking mode maintains a certain slip rate and achieves the effectiveness of an anti-lock braking system (ABS) without invoking the friction brakes. The slip ratio is determined according to the following formula:
s = (v_veh - r·ω_wheel) / v_veh
where s is the slip ratio, v_veh is the longitudinal speed of the vehicle, r is the wheel radius and ω_wheel is the wheel rotational speed. The optimal slip rate and the corresponding optimal motor speed are also used as reference values of the motor speed fine-tuning strategy in the subsequent technical system.
S4: and establishing a three-dimensional neural network model suitable for multi-target control based on a depth value network (DQN) algorithm.
The deep Q-network algorithm framework shown in FIG. 6 is built with the Python language and the PyTorch deep learning tool. In FIG. 6, the solid line represents the reinforcement learning training loop of the agent, within which the greedy algorithm selects a random action with probability ε and the best action known so far with probability 1 - ε; the greedy coefficient decays gradually as the iterative training progresses. The other line styles indicate the computation flow of the loss function and the gradient computation with the back-propagation update process. The environment module comprises the previously established hybrid power system model and driving environment model (serving as the training environment for extracting the optimal control strategy), and the agent module comprises the deep Q-network algorithm for deep reinforcement learning (including the target network, the experience replay mechanism and the like). Meanwhile, hyper-parameters that maximize the computational efficiency and the learning effect (including the learning rate, the decay rate of the greedy coefficient and the experience pool capacity) are defined. Because the parallel hybrid power system with the P3 structure contains three controlled components, namely the engine, the motor and the mechanical continuously variable transmission, the online networks of three deep Q-network frameworks are combined into a three-dimensional structure so that a single online network can learn all three control strategies; the three-dimensional neural network shown in FIG. 7 is finally established and serves as the basic framework for the subsequent multi-target control strategy of the hybrid power system.
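A minimal sketch of the DQN building blocks named above (experience replay and ε-greedy selection with a decaying greedy coefficient) follows; the capacity, batch size and ε schedule are illustrative assumptions.

```python
# Experience replay and epsilon-greedy action selection for a DQN agent.
import random
from collections import deque

import torch

class ReplayBuffer:
    def __init__(self, capacity=10000):                 # assumed experience pool capacity
        self.buffer = deque(maxlen=capacity)

    def push(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size=64):
        return random.sample(self.buffer, batch_size)

def select_action(q_net, state, epsilon, n_actions):
    """Random action with probability epsilon, otherwise the best known action."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return int(q_net(state).argmax().item())

# the greedy coefficient decays after each training episode, e.g.:
epsilon, epsilon_min, decay = 1.0, 0.05, 0.995
# epsilon = max(epsilon_min, epsilon * decay)
```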
In terms of structure, the three-dimensional neural network contains three branches in its hidden layers, corresponding to the three control strategies to be learned. The branches have identical structures: each consists of three layers of 100 neurons, and every neuron uses the same activation function.
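A hedged PyTorch sketch of such a three-branch network is shown below; the input size of nine state variables follows step S5, while the number of discrete actions per branch is an assumed placeholder.

```python
# Three-branch ("three-dimensional") Q-network: one branch per control strategy,
# each with three hidden layers of 100 neurons sharing the same activation.
import torch.nn as nn

class ThreeBranchQNet(nn.Module):
    def __init__(self, n_state=9, n_actions_per_branch=(7, 7, 7)):
        super().__init__()
        def branch(n_actions):
            return nn.Sequential(
                nn.Linear(n_state, 100), nn.ReLU(),
                nn.Linear(100, 100), nn.ReLU(),
                nn.Linear(100, 100), nn.ReLU(),
                nn.Linear(100, n_actions),
            )
        # one branch each for motor-speed fine-tuning, engine power control
        # and CVT ratio shifting
        self.motor, self.engine, self.cvt = (branch(n) for n in n_actions_per_branch)

    def forward(self, state):
        return self.motor(state), self.engine(state), self.cvt(state)
```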
S5: after a state variable space, an action variable space and a reward function are defined, the three-dimensional neural network is trained.
S51: the state variable space S is defined as follows:
S = {soc, vel, acc, ω_mg, i_CVT, P_eng, θ, Road_surface, N_people}
where soc is the battery state of charge, vel is the speed, acc is the acceleration, ω_mg is the motor speed, i_CVT is the transmission ratio of the continuously variable transmission, P_eng is the engine power, θ is the slope, Road_surface is the road surface type and N_people is the number of passengers. Meanwhile, within this nine-dimensional state variable space, the battery state of charge, speed and acceleration belong to the vehicle system state, the motor speed, continuously variable transmission ratio and engine power belong to the controlled component state, and the slope, road surface type and number of passengers belong to the driving environment state.
S52: the motion variable space A is defined as follows
Figure BDA0003151794970000061
Wherein, Δ ωmgIs the amount of change, Δ i, in the rotational speed of the motorCVTIs the change in transmission ratio, Δ P, of the transmissionengIs the amount of change in engine power.
S53: reward function definition R is as follows
Figure BDA0003151794970000062
Wherein α, β, γ, χ and ξ are weight coefficients, t is time, soctargetIs the target charge state, abs denotes the absolute value, ωrefIs a reference motor speed, irefReference is made to the transmission ratio of the transmission,
Figure BDA0003151794970000063
is instantaneous oil consumption, TengIs the engine torque, nengIs the engine speed, ηengIs the engine efficiency.
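Since the original expression is given only as an image, the following is one plausible realization under stated assumptions, not the patent's exact formula: weighted penalties on the SOC deviation, the motor-speed and CVT-ratio tracking errors and the instantaneous fuel consumption, with illustrative weight values.

```python
# One plausible reward realization; weights and term structure are assumptions.
def reward(soc, soc_target, w_mg, w_ref, i_cvt, i_ref, fuel_rate,
           alpha=1.0, beta=0.01, gamma=1.0, chi=10.0):
    """Negative weighted penalty: SOC tracking, speed/ratio tracking, fuel use."""
    return -(alpha * abs(soc - soc_target)
             + beta * abs(w_mg - w_ref)
             + gamma * abs(i_cvt - i_ref)
             + chi * fuel_rate)
```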
After the definition of the state space, the action space and the reward function is completed, an iterative training process is started.
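A skeleton of such an iterative DQN training loop is sketched below under several assumptions: the environment exposes reset()/step() returning tensor states, the Q-network is collapsed to a single action head for brevity, and all hyper-parameter values are placeholders.

```python
# DQN training skeleton with experience replay and a periodically synced target net.
import random

import torch
import torch.nn.functional as F

def train_dqn(env, q_net, target_net, buffer, optimizer, n_actions,
              episodes=500, discount=0.99, batch_size=64, target_sync=200):
    step = 0
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # epsilon-greedy action selection (fixed epsilon here for brevity)
            if random.random() < 0.1:
                action = random.randrange(n_actions)
            else:
                with torch.no_grad():
                    action = int(q_net(state).argmax())
            next_state, r, done = env.step(action)
            buffer.push(state, torch.tensor(action), torch.tensor(r), next_state)
            state = next_state
            step += 1
            if len(buffer.buffer) >= batch_size:
                s, a, rew, s_next = map(torch.stack, zip(*buffer.sample(batch_size)))
                q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
                with torch.no_grad():
                    q_target = rew + discount * target_net(s_next).max(dim=1).values
                loss = F.mse_loss(q, q_target)          # TD error between online and target nets
                optimizer.zero_grad(); loss.backward(); optimizer.step()
            if step % target_sync == 0:                 # periodic target-network synchronization
                target_net.load_state_dict(q_net.state_dict())
```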
S6: and extracting and storing a neural network (comprising a motor rotating speed fine adjustment strategy, an engine power control strategy and a mechanical continuously variable transmission gear shifting strategy) synchronously fitting the three parameterized control strategies, and realizing the cooperative guarantee of fuel economy and braking safety.
Under the guidance of the reward function, the cumulative reward of the training model gradually increases and eventually converges to a stable value, which marks the end of the training process and the successful learning of the optimal control strategy. The three-dimensional deep neural network parameters are then extracted and stored as a persistent model for practical testing and application; this model simultaneously contains the three parameterized control strategies, namely the motor speed fine-tuning strategy for the braking stage, the engine power control strategy and the mechanical continuously variable transmission shifting strategy. In the normal driving stage, specifically in the pure motor drive mode, the driving-charging mode and the hybrid drive mode, optimal or near-optimal fuel economy is obtained through the engine power control strategy and the continuously variable transmission shifting strategy; in the braking stage, when the braking demand is within the motor's braking capability, braking energy recovery and safe braking are realized through the motor speed fine-tuning strategy. Synchronous learning of three different types of control strategies is thus achieved, together with the cooperative guarantee of fuel economy and braking safety.
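A short sketch of extracting and storing the converged network as a persistent model is given below; the convergence check, file name and thresholds are illustrative assumptions.

```python
# Persist the trained policy network once the cumulative reward has stabilized.
import torch

def maybe_save(q_net, episode_rewards, window=20, tol=1.0, path="hps_policy.pt"):
    """Save the network when the moving average of the total reward stops changing."""
    if len(episode_rewards) < 2 * window:
        return False
    recent = sum(episode_rewards[-window:]) / window
    previous = sum(episode_rewards[-2 * window:-window]) / window
    if abs(recent - previous) < tol:                   # stable convergence reached
        torch.save(q_net.state_dict(), path)           # persistent model containing all
        return True                                    # three control strategies
    return False

# At deployment time the stored strategies are restored with:
# q_net.load_state_dict(torch.load("hps_policy.pt")); q_net.eval()
```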
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims (7)

1. A hybrid power system control method based on road surface identification and deep reinforcement learning is characterized by comprising the following steps:
s1: establishing a parallel hybrid power system with a P3 structure and a driving environment model fusing various time-varying state information to complete the construction of a training environment;
s2: building a VGG convolutional neural network for pavement identification, acquiring pictures of different types of pavements, and training the convolutional neural network on pavement type feature extraction;
s3: after the road surface type is identified on line, determining the optimal slip rate in the braking stage according to the slip rate-adhesion coefficient characteristic curve, and using the optimal slip rate as a reference value of a motor rotating speed fine-adjustment strategy;
s4: establishing a three-dimensional neural network suitable for multi-target control based on the Deep Q-Network (DQN) algorithm;
s5: defining a state variable space, an action variable space and a reward function of the three-dimensional neural network, and then performing iterative training on the three-dimensional neural network; wherein, the expression of the state variable space S is:
S = {soc, vel, acc, ω_mg, i_CVT, P_eng, θ, Road_surface, N_people}
where soc is the battery state of charge, vel is the speed, acc is the acceleration, ω_mg is the motor speed, i_CVT is the transmission ratio of the continuously variable transmission, P_eng is the engine power, θ is the slope, Road_surface is the road surface type and N_people is the number of passengers;
the reward function R is defined as a negative weighted sum of the deviation of the battery state of charge from its target value, the deviation of the motor speed from its reference value, the deviation of the continuously variable transmission ratio from its reference value, and the instantaneous fuel consumption, where α, β, γ, χ and ξ are weight coefficients, t is the time, soc_target is the target state of charge, abs denotes the absolute value, ω_ref is the reference motor speed, i_CVT is the transmission ratio of the continuously variable transmission, i_ref is the reference transmission ratio, ṁ_fuel is the instantaneous fuel consumption, T_eng is the engine torque, n_eng is the engine speed and η_eng is the engine efficiency;
s6: extracting and storing the neural network synchronously fitting the three parameterized control strategies, and realizing the cooperative guarantee of the fuel economy and the braking safety of the hybrid electric vehicle; the three parameterized control strategies comprise a motor rotating speed fine adjustment strategy in a braking stage, an engine power control strategy and a mechanical continuously variable transmission gear shifting strategy.
2. The hybrid system control method according to claim 1, wherein in step S1, the plurality of time-varying state information includes: the longitudinal running speed, the gradient, the number of passengers and the running picture collected by the vehicle-mounted camera.
3. The hybrid system control method according to claim 1, wherein in step S2, images of different types of road surfaces are collected, including a plurality of images of dry asphalt, dry cobblestone, wet asphalt, wet cobblestone and snow-covered road surfaces, and the surrounding environment is removed by batch cropping so that only the road surface portion is retained, thereby ensuring that the pixel information input to the convolutional neural network is valid information.
4. The hybrid system control method according to claim 1, wherein in step S3, determining the optimal slip rate in the braking phase specifically comprises: after the current road surface type is identified on line, determining the optimal slip rate capable of fully utilizing the road surface adhesion condition according to the slip rate-adhesion coefficient characteristic curve; wherein the slip ratio is determined according to the following formula:
s = (v_veh - r·ω_wheel) / v_veh
where s is the slip ratio, v_veh is the longitudinal speed of the vehicle, r is the wheel radius and ω_wheel is the wheel rotational speed; the motor speed corresponding to the optimal slip rate is then used as the reference value of the motor speed fine-tuning strategy.
5. The hybrid system control method according to claim 1, wherein in step S4, establishing the three-dimensional neural network specifically comprises: establishing a deep Q-network algorithm framework and defining hyper-parameters that maximize the computational efficiency and the learning effect; the deep Q-network algorithm framework comprises an environment module and an agent module; the environment module comprises the parallel hybrid power system established in step S1 and the driving environment model fusing various time-varying state information, and serves as the training environment for extracting the optimal control strategy; the agent module comprises a deep Q-network algorithm based on deep reinforcement learning, specifically including a target network and an experience replay mechanism; the hyper-parameters include: the learning rate, the decay rate of the greedy coefficient, and the experience pool capacity.
6. The hybrid system control method according to claim 1, wherein in step S5, the expression of the action variable space A is defined as:
A = {Δω_mg, Δi_CVT, ΔP_eng}
where Δω_mg is the change in the motor speed, Δi_CVT is the change in the transmission ratio of the continuously variable transmission, and ΔP_eng is the change in the engine power.
7. The hybrid power system control method according to claim 1, wherein in step S6, extracting and storing the neural network synchronously fitting three parameterized control strategies specifically comprises: after the total accumulated reward value has stably converged, extracting the constructed three-dimensional deep neural network parameters and storing them as a persistent model, in which three parameterized control strategies are stored simultaneously, namely the motor speed fine-tuning strategy for the braking stage, the engine power control strategy and the mechanical continuously variable transmission shifting strategy, thereby realizing synchronous learning of three different types of control strategies.
CN202110766400.3A 2021-07-07 2021-07-07 Hybrid power system control method based on road surface identification and deep reinforcement learning Active CN113264031B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110766400.3A CN113264031B (en) 2021-07-07 2021-07-07 Hybrid power system control method based on road surface identification and deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110766400.3A CN113264031B (en) 2021-07-07 2021-07-07 Hybrid power system control method based on road surface identification and deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN113264031A CN113264031A (en) 2021-08-17
CN113264031B true CN113264031B (en) 2022-04-29

Family

ID=77236517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110766400.3A Active CN113264031B (en) 2021-07-07 2021-07-07 Hybrid power system control method based on road surface identification and deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN113264031B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113807503B (en) * 2021-09-28 2024-02-09 中国科学技术大学先进技术研究院 Autonomous decision making method, system, device and terminal suitable for intelligent automobile
CN115179779A (en) * 2022-07-22 2022-10-14 福州大学 Intelligent driving fuel cell vehicle control method integrating road multidimensional information spatialization

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108177648A (en) * 2018-01-02 2018-06-19 北京理工大学 A kind of energy management method of the plug-in hybrid vehicle based on intelligent predicting
CN110222953A (en) * 2018-12-29 2019-09-10 北京理工大学 A kind of power quality hybrid perturbation analysis method based on deep learning
JP2020091757A (en) * 2018-12-06 2020-06-11 富士通株式会社 Reinforcement learning program, reinforcement learning method, and reinforcement learning device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107491736A (en) * 2017-07-20 2017-12-19 重庆邮电大学 A kind of pavement adhesion factor identifying method based on convolutional neural networks
JP7465484B2 (en) * 2018-09-30 2024-04-11 ストロング フォース ティーピー ポートフォリオ 2022,エルエルシー Highly functional transportation system
CN111824095B (en) * 2020-06-14 2022-07-05 长春理工大学 Four-wheel hub electric automobile electro-hydraulic composite brake anti-lock coordination optimization control method
CN111845701B (en) * 2020-08-05 2021-03-30 重庆大学 HEV energy management method based on deep reinforcement learning in car following environment
CN112158189A (en) * 2020-09-30 2021-01-01 东南大学 Hybrid electric vehicle energy management method based on machine vision and deep learning
CN112550272B (en) * 2020-12-14 2021-07-30 重庆大学 Intelligent hybrid electric vehicle hierarchical control method based on visual perception and deep reinforcement learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108177648A (en) * 2018-01-02 2018-06-19 北京理工大学 A kind of energy management method of the plug-in hybrid vehicle based on intelligent predicting
JP2020091757A (en) * 2018-12-06 2020-06-11 富士通株式会社 Reinforcement learning program, reinforcement learning method, and reinforcement learning device
CN110222953A (en) * 2018-12-29 2019-09-10 北京理工大学 A kind of power quality hybrid perturbation analysis method based on deep learning

Also Published As

Publication number Publication date
CN113264031A (en) 2021-08-17

Similar Documents

Publication Publication Date Title
CN111845701B (en) HEV energy management method based on deep reinforcement learning in car following environment
CN113264031B (en) Hybrid power system control method based on road surface identification and deep reinforcement learning
CN110696815B (en) Prediction energy management method of network-connected hybrid electric vehicle
CN112116156B (en) Hybrid train energy management method and system based on deep reinforcement learning
Maia et al. Electrical vehicle modeling: A fuzzy logic model for regenerative braking
CN104210383B (en) A kind of four-wheel drive electric automobile torque distribution control method and system
CN113085666B (en) Energy-saving driving method for layered fuel cell automobile
CN109291925B (en) Energy-saving intelligent network-connection hybrid electric vehicle following control method
CN111923897B (en) Intelligent energy management method for plug-in hybrid electric vehicle
CN111959492B (en) HEV energy management hierarchical control method considering lane change behavior in internet environment
CN113635879B (en) Vehicle braking force distribution method
Chen et al. Deep reinforcement learning-based multi-objective control of hybrid power system combined with road recognition under time-varying environment
CN114103971B (en) Energy-saving driving optimization method and device for fuel cell automobile
CN110936824A (en) Electric automobile double-motor control method based on self-adaptive dynamic planning
CN105151040A (en) Energy management method of hybrid electric vehicle based on power spectrum self-learning prediction
CN112550272B (en) Intelligent hybrid electric vehicle hierarchical control method based on visual perception and deep reinforcement learning
CN115158094A (en) Plug-in hybrid electric vehicle energy management method based on long-short-term SOC (System on chip) planning
CN115534929A (en) Plug-in hybrid electric vehicle energy management method based on multi-information fusion
CN113911101A (en) Online energy distribution method based on coaxial parallel structure
CN115675102A (en) Particle swarm algorithm optimized hybrid electric vehicle regenerative braking control method
CN113276829B (en) Vehicle running energy-saving optimization weight-changing method based on working condition prediction
CN113682152B (en) Traction control method for distributed drive automobile
CN114228507A (en) Intelligent electrically-driven vehicle regenerative braking control method utilizing front vehicle information
Chen et al. An energy management strategy for through-the-road type plug-in hybrid electric vehicles
CN113459829A (en) Intelligent energy management method for double-motor electric vehicle based on road condition prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant