CN116424332B - Energy management strategy enhancement updating method for deep reinforcement learning type hybrid electric vehicle - Google Patents


Info

Publication number
CN116424332B
CN116424332B (application CN202310378883.9A)
Authority
CN
China
Prior art keywords
vehicle
speed
hybrid electric
energy management
driving
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310378883.9A
Other languages
Chinese (zh)
Other versions
CN116424332A (en)
Inventor
唐小林
陈佳信
杨为
胡晓松
杨亚联
谢翌
李佳承
Current Assignee
Chongqing University
Original Assignee
Chongqing University
Priority date
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202310378883.9A priority Critical patent/CN116424332B/en
Publication of CN116424332A publication Critical patent/CN116424332A/en
Application granted granted Critical
Publication of CN116424332B publication Critical patent/CN116424332B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • B60W20/00 — Control systems specially adapted for hybrid vehicles
    • B60W40/00 — Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W50/00 — Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W2050/0001 — Details of the control system
    • B60W2050/0002 — Automatic control, details of type of controller or control system architecture
    • B60W2050/0004 — In digital systems, e.g. discrete-time systems involving sampling
    • B60W2050/0005 — Processor details or data handling, e.g. memory registers or chip architecture
    • B60W2050/0019 — Control system elements or transfer functions
    • B60W2050/0028 — Mathematical models, e.g. for simulation
    • B60W2050/0031 — Mathematical model of the vehicle
    • G06F30/20 — Design optimisation, verification or simulation
    • G06F30/27 — Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G06N3/02 — Neural networks
    • G06N3/04 — Architecture, e.g. interconnection topology
    • G06N3/08 — Learning methods


Abstract

The invention relates to an energy management strategy enhancement updating method for a deep reinforcement learning type hybrid electric vehicle, belonging to the technical field of hybrid electric vehicles. The method comprises the following steps. S1: acquire historical speed data from different types of vehicles. S2: divide the acquired data into an initial stage, a reinforcement stage, and a final stage, then merge the data to generate a speed state-transition feature matrix for each stage. S3: from the speed state-transition feature matrix, generate a characteristic driving condition based on a state sequence and use it to train the energy management strategy of the deep reinforcement learning hybrid electric vehicle. S4: define the variable spaces and reward function required for strategy training, and use Matlab m-files as the data interface to realize joint simulation training. S5: complete the online enhanced-update iterative training process of the deep reinforcement learning energy management strategy on a cloud server, download the latest strategy after training, and load it into the hybrid power system model for subsequent testing.

Description

Energy management strategy enhancement updating method for deep reinforcement learning type hybrid electric vehicle
Technical Field
The invention belongs to the technical field of hybrid electric vehicles, and relates to a deep reinforcement learning hybrid electric vehicle energy management strategy enhancement updating method based on state-sequence driving conditions.
Background
The global automobile industry is entering a new period of development opportunity: new energy sources, intelligent functions, and related technologies are profoundly changing vehicle powertrains and their control. New energy vehicles are regarded as an important measure for realizing the energy transition and relieving the energy crisis. Currently, both mainstream and emerging automobile manufacturers offer corresponding battery electric, hybrid electric, and fuel cell vehicles. Battery electric vehicles attract consumers with low charging cost and an environmentally friendly driving mode, and they meet most urban travel demands; however, the public remains concerned about driving range, charging infrastructure, and safety assurance. Although battery electric vehicles may eventually replace conventional fuel vehicles as the dominant vehicle type, their key technologies still need further improvement. Fuel cell vehicles use hydrogen instead of gasoline to generate electricity and drive motors, and are regarded in China, the United States, Europe, and elsewhere as a main powertrain for future commercial vehicles. At present, hybrid electric vehicles have the most mature technology; they satisfy requirements for driving range, convenient energy replenishment, energy saving, and emission reduction, making them an ideal transitional product that has long occupied a large share of the new energy vehicle market.
In the technical route of a hybrid electric vehicle, powertrain component selection and parameter matching are completed at an early stage, with the solution determined by the vehicle's service environment and customer requirements. The energy management strategy is one of the core technologies for achieving energy saving and emission reduction and for improving the fuel economy of the hybrid power system. Its main principle is to distribute power flow reasonably among multiple power sources while satisfying the demands and constraints of the powertrain, so as to reach the desired optimization target. In addition, some research has begun to consider other important factors affecting powertrain operation, such as battery aging and motor heating, so that the energy management strategy is gradually becoming a control strategy that accounts for the whole vehicle operating environment. Generally, a reliable energy management strategy can be designed from the experience of researchers or experts, yielding rule-based strategies; alternatively, optimization algorithms such as dynamic programming, Pontryagin's minimum principle, the equivalent consumption minimization strategy, and model predictive control yield optimization-based strategies. However, both kinds of energy management strategies have drawbacks in adaptability, computational efficiency, and optimization effect.
Disclosure of Invention
In view of the above, the invention aims to provide, for deep reinforcement learning based hybrid electric vehicle energy management strategies, an entirely new training concept better suited to the principles of reinforcement learning algorithms. Using joint simulation between an agent model in a Python environment and a hybrid power system model in a Simulink environment, it provides a deep reinforcement learning control strategy enhancement updating method based on state-sequence driving conditions (rather than time-series speed conditions), so that the finally trained control strategy achieves a more complete application effect.
In order to achieve the above purpose, the present invention provides the following technical solutions:
A method for enhancing and updating the energy management strategy of a deep reinforcement learning type hybrid electric vehicle, comprising the following steps:
S1: obtaining historical speed data of different vehicle types from diversified driving information sources, mainly covering simulation data from the autonomous driving simulator CARLA, the real driving data set DBNet, the racing video game Gran Turismo, and standard driving cycles used for vehicle performance testing (HWFET, US06, WLTC, etc.);
S2: dividing each type of acquired historical speed data into three stages (initial, reinforcement, and final), and then merging the data to generate the speed state-transition feature matrix for each stage;
S3: generating a characteristic driving condition based on a state sequence from the speed state-transition feature matrix, and using it to train the deep reinforcement learning hybrid electric vehicle energy management strategy;
S4: for the deep reinforcement learning hybrid electric vehicle energy management strategy, defining the state space S, action space A, and reward function R required by the training process, and using Matlab m-files as the data interface to realize joint simulation training between the deep reinforcement learning agent in the Python environment and the parallel hybrid power system in the Simulink environment;
S5: completing the online enhanced-update iterative training process of the energy management strategy on a cloud server (such as a Tencent Cloud virtual machine); after training finishes, downloading the latest hybrid electric vehicle energy management strategy and loading it into the hybrid power system model for subsequent testing.
Further, in step S1, the acquired historical speed data of different vehicle types include:
(1) Autonomous driving data based on virtual simulation, sourced from CARLA (an autonomous driving research simulator): with the official vehicles and maps as the environment, vehicles are controlled by the autonomous driving function to travel within the area; the environment around the target vehicle includes surrounding vehicles, pedestrians, and traffic management equipment, yielding simulated speed data that represents autonomous driving control characteristics;
(2) Vehicle speed data of real human drivers, sourced from DBNet: the data set of real drivers in urban areas, released online by Shanghai Jiao Tong University, is downloaded to obtain real speed data representing human driving characteristics;
(3) Vehicle speed data based on a racing video game, sourced from Gran Turismo: by running Gran Turismo Sport, a realistic driving simulator, on the PlayStation platform, simulated speed data that fully characterizes vehicles in a racing environment is obtained across different tracks, vehicles, and player driving styles;
(4) Standard driving cycles used for vehicle performance testing: several standard speed cycles commonly used in the vehicle testing field, including HWFET, US06, and WLTC, are selected and merged, providing real speed data released by the authorities.
Further, in step S2, the speed state-transition feature matrices of the different stages are generated, specifically including the following steps:
S21: based on the above four types of historical vehicle speed data, dividing the data by time into three stages: initial, reinforcement, and final; over time, when a driver enters an unfamiliar driving environment or repeatedly drives in a known environment, driving habits and driving style change, and these changes serve as the main basis for dividing the stages;
S22: merging the four types of historical vehicle speed data within each stage to form a complete speed condition;
S23: constructing the speed-transition feature matrices corresponding to the three stages (initial, reinforcement, final), in which the four types of vehicle speed data reflect comprehensive driving characteristics.
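The stage-wise speed state-transition feature matrix of steps S21-S23 can be sketched as a first-order Markov transition-count matrix built from merged speed traces. The 1 km/h bin width, speed cap, and function name below are illustrative assumptions, not values given in the patent.

```python
import numpy as np

def transition_matrix(speed_traces, v_max=120, bin_kmh=1):
    """Count-based speed state-transition matrix from sampled speed traces.

    speed_traces: iterable of 1-D sequences of vehicle speed in km/h.
    Returns a row-normalized matrix P where P[i, j] is the observed
    probability of moving from speed bin i to speed bin j in one step.
    """
    n = int(v_max // bin_kmh) + 1
    counts = np.zeros((n, n))
    for trace in speed_traces:
        bins = np.clip((np.asarray(trace) // bin_kmh).astype(int), 0, n - 1)
        for i, j in zip(bins[:-1], bins[1:]):
            counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed transitions stay all-zero.
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)
```

Rows that stay all-zero correspond to speeds never observed in the merged data, which is exactly the "uncovered" region of the matrix that the envelope construction of step S3 works around.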
Further, in step S3, the generated characteristic condition based on a state sequence is used to train the deep reinforcement learning hybrid electric vehicle energy management strategy, specifically including the following steps:
S31: in the initial stage, the historical vehicle speed data cover part of the range of the speed state-transition matrix; an envelope curve is then obtained around this known range. The enclosed region embodies the vehicle's historical driving characteristics, i.e., the driving behavior experienced so far, from which one can judge whether the driver habitually drives at high speed, accelerates rapidly, decelerates rapidly, and so on;
S32: obtaining the boundary state-transition feature points of the known driving characteristic range from the envelope region, and randomly generating several discrete state-transition feature points inside the envelope region;
S33: with acceleration change and vehicle speed change as indexes, connecting the boundary points of the envelope region with the internal random points to jointly construct a speed trajectory generated from the speed-transition feature points, i.e., a state-sequence driving condition;
S34: when the vehicle enters a new driving environment, new driving habits, i.e., new speed-transition features, may appear, so the previous envelope is extended. The reinforced driving condition is therefore generated, periodically or aperiodically, from the extended speed-transition feature envelope region of the reinforcement stage, while the driving condition generated in the theoretical final stage would include all speed-transition features of the vehicle.
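A minimal sketch of the envelope-based generation in steps S31-S34: boundary (speed, acceleration) feature points plus random interior points are chained into a continuous speed trace, giving a state-sequence driving condition rather than a recorded time series. The bounding-box envelope, the visiting order, and all thresholds below are simplifying assumptions.

```python
import numpy as np

def state_sequence_cycle(boundary_pts, n_random=50, dt=1.0, seed=0):
    """Chain (speed, acceleration) transition feature points into a speed
    trajectory: a 'state-sequence' driving condition.

    boundary_pts: iterable of (v, a) points on the envelope of observed
    speed-transition features, v in m/s, a in m/s^2.
    """
    rng = np.random.default_rng(seed)
    pts = np.asarray(boundary_pts, dtype=float)
    v_lo, v_hi = pts[:, 0].min(), pts[:, 0].max()
    a_lo, a_hi = pts[:, 1].min(), pts[:, 1].max()
    # Random interior feature points inside the envelope bounding box.
    interior = np.column_stack([rng.uniform(v_lo, v_hi, n_random),
                                rng.uniform(a_lo, a_hi, n_random)])
    feats = np.vstack([pts, interior])
    # Visit target speeds in ascending order, driving toward each target
    # with the feature point's acceleration magnitude.
    order = feats[np.argsort(feats[:, 0])]
    speed, v = [], 0.0
    for v_tgt, a in order:
        a = max(abs(a), 0.1) * np.sign(v_tgt - v or 1.0)
        while (v < v_tgt if a > 0 else v > v_tgt):
            v = float(np.clip(v + a * dt, 0.0, v_hi))
            speed.append(v)
            if len(speed) > 10000:  # safety cap against runaway loops
                return np.array(speed)
    return np.array(speed)
```

The resulting trace respects the observed speed bounds by construction, which is the point of generating conditions from the transition-feature envelope instead of replaying a fixed cycle.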
Further, in step S4, the variable spaces and the reward function required by the training process are defined, specifically as follows: to ensure a fair comparison with the hybrid electric vehicle energy management strategy based on the equivalent consumption minimization strategy released officially by MathWorks, taking the fuel economy of the hybrid electric vehicle as the main optimization target and using a deep value network suited to discrete control tasks as the main control algorithm, the state space S, the action space A, and the reward function R involved in the training process are defined as follows:
S = (T_wheel, SOC, Voc_batt, Gear_trans, ω_mot, Vel_car, Temp_env)
A = Throttle = [0, 0.1, 0.2, …, 0.9, 1]
where T_wheel is the wheel torque demand, SOC is the battery state of charge, Voc_batt is the battery open-circuit voltage, Gear_trans is the transmission gear, ω_mot is the motor speed, Vel_car is the longitudinal vehicle speed, and Temp_env is the ambient temperature, defined as a constant 313 K; Throttle is the throttle, discretized into 11 action points [0, 0.1, 0.2, …, 0.9, 1]. In the reward function R, α, β, and γ are weight coefficients, T_eng is the engine torque, n_eng is the engine speed, ṁ_fuel is the instantaneous fuel consumption of the engine, BSFC is the brake-specific fuel consumption, and SOC_target is the target battery state of charge.
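The 11-point discrete throttle action space is stated explicitly above, but the reward equation itself is not reproduced in this text; the sketch below therefore assumes a common form in DRL energy management — a negative weighted sum of instantaneous fuel consumption and squared SOC deviation. The weights, the SOC target value, and the functional form are illustrative assumptions only, not the patent's exact formula.

```python
import numpy as np

# 11-point discrete throttle action space, as defined above.
ACTIONS = np.round(np.arange(0.0, 1.01, 0.1), 1)

def reward(fuel_gps, soc, soc_target=0.6, alpha=1.0, beta=350.0):
    """Assumed reward shape for the agent: penalize instantaneous fuel
    use (g/s) and deviation from the target state of charge. alpha and
    beta play the role of the weight coefficients described in the text.
    """
    return -(alpha * fuel_gps + beta * (soc - soc_target) ** 2)
```

With this shape, a state that burns no extra fuel and sits at the target SOC receives the least-negative (best) reward, which is the behavior the charge-sustaining objective requires.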
Further, in step S4, the joint simulation setup specifically includes: writing, in the Matlab environment, four m-files that respectively open the model, transfer data, continue running the model, and close the model, as the function files for joint-simulation data interaction; in this way, the control command DRL_action of the deep reinforcement learning agent in the Python environment is transmitted to the hybrid power system, and the state parameters of the hybrid power system after executing the control command in the Simulink environment are transmitted back to the agent. The state parameters include the brake-specific fuel consumption BSFC, instantaneous fuel consumption FuelFlw, battery state of charge BattSoc, battery voltage BattV, transmission gear, longitudinal running speed xdot, motor speed MotSpd, motor torque MotTrq, engine speed EngSpd, engine torque EngTrq, ambient temperature Temp, wheel demand torque WhlTrq, and simulation time SimuTime.
Further, in step S5, training ends after the total cumulative reward function reaches a stable maximum convergence state.
The invention has the following beneficial effects: for the hybrid electric vehicle and the corresponding deep reinforcement learning energy management strategy, the invention adopts an entirely new training concept better suited to the principles of reinforcement learning algorithms; by using state-sequence speed conditions rather than time-series speed conditions as the data basis, the finally trained control strategy achieves a more complete application effect.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
For the purpose of making the objects, technical solutions, and advantages of the present invention more apparent, the invention is described in detail below with reference to the accompanying drawings, in which:
FIG. 1 is an overall flow chart of a hybrid vehicle energy management strategy enhancement update method of the present invention;
FIG. 2 is an overall frame diagram of a hybrid vehicle energy management strategy enhancement update method of the present invention;
FIG. 3 shows diversified historical vehicle speed data, where (a) is driving data based on CARLA, (b) is driving data based on DBNet, (c) is driving data based on Gran Turismo, and (d) is driving data based on standard driving cycles (HWFET, US06, WLTC);
FIG. 4 is the speed-transition feature matrix jointly constructed from the four types of historical vehicle speed data;
FIG. 5 is a block diagram of the deep value network algorithm;
FIG. 6 is a schematic diagram of a joint simulation data interface.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes embodiments of the invention with reference to specific examples. The invention may also be practiced or applied in other, different embodiments, and the details of this description may be modified or varied without departing from the spirit and scope of the invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the invention schematically, and the following embodiments and their features may be combined with each other in the absence of conflict.
The drawings are for illustrative purposes only, are schematic rather than physical, and are not intended to limit the invention; to better illustrate the embodiments, certain elements of the drawings may be omitted, enlarged, or reduced, and do not represent the size of the actual product; it will be appreciated by those skilled in the art that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numbers in the drawings of the embodiments correspond to the same or similar components. In the description of the invention, terms such as "upper", "lower", "left", "right", "front", and "rear" indicate orientations or positional relationships based on those shown in the drawings; they are used only for convenience and simplification of description and do not indicate or imply that the referred device or element must have a specific orientation or be constructed and operated in a specific orientation. Such terms are therefore merely exemplary, should not be construed as limiting the invention, and their specific meaning can be understood by those of ordinary skill in the art according to the specific circumstances.
Referring to FIGS. 1 to 6, the present invention provides a method for enhancing and updating a deep reinforcement learning hybrid electric vehicle energy management strategy based on state-sequence driving conditions; the flow is shown in FIG. 1 and the framework in FIG. 2. The method specifically comprises the following steps:
S1: obtaining historical speed data of different vehicle types from diversified driving information sources, mainly covering simulation data from the autonomous driving simulator CARLA, the real driving data set DBNet, the racing video game Gran Turismo, and standard driving cycles used for vehicle performance testing (HWFET, US06, WLTC, etc.):
(1) Autonomous driving data based on virtual simulation, sourced from CARLA: with the official vehicles and maps as the environment, vehicles are controlled by the autonomous driving function to travel within the area; the environment around the target vehicle includes surrounding vehicles, pedestrians, and traffic management equipment, yielding simulated speed data representing autonomous driving control characteristics, as shown in FIG. 3(a);
(2) Vehicle speed data of real human drivers, sourced from DBNet: the data set of real drivers in urban areas, released by Shanghai Jiao Tong University, is downloaded to obtain real speed data representing human driving characteristics, as shown in FIG. 3(b);
(3) Vehicle speed data based on a racing video game, sourced from Gran Turismo: by running Gran Turismo Sport, a realistic driving simulator, on the PlayStation platform, simulated speed data fully characterizing vehicles in a racing environment is obtained across different tracks, vehicles, and player driving styles, as shown in FIG. 3(c);
(4) Standard driving cycles used for vehicle performance testing: several standard speed cycles commonly used in the vehicle testing field, including HWFET, US06, and WLTC, are selected and merged, providing real speed data released by the authorities, as shown in FIG. 3(d).
S2: dividing each acquired vehicle history speed data into three stages (an initial stage, a strengthening stage and a final stage) respectively, and then merging to jointly construct a corresponding speed state transition feature matrix; the method specifically comprises the following steps:
s21: based on the four types of vehicle history speed information, the vehicle history speed information is divided into three stages, namely an initial stage, a strengthening stage and a final stage by taking time as a standard. Over time, when a driver enters a strange driving environment or repeatedly runs in a known driving environment for a plurality of times, the driving habit and the driving style can be changed, and the change is sequentially used as a main basis of a dividing stage;
s22: combining the four types of vehicle historical speed information according to different stages to form a complete speed working condition;
s23: the speed transfer feature matrices corresponding to the three phases (initial phase, reinforcement phase, final phase) are respectively constructed, and four kinds of vehicle speed information contained therein can reflect more comprehensive driving features, as shown in fig. 4.
S3: generating a characteristic driving working condition based on a state sequence according to a speed state transition characteristic matrix generated by vehicle historical speed data, and training an energy management strategy of the deep reinforcement learning type hybrid electric vehicle; the method specifically comprises the following steps:
s31: the method comprises the steps that a speed transfer characteristic matrix in an initial stage is used, historical vehicle speed data cover a part of the range of a state transfer matrix, an envelope curve is obtained from the known range, the related area of the curve can embody the historical driving characteristics of a vehicle, namely the driving behavior which is experienced at present, and whether a driver has habits such as high-speed driving, rapid acceleration and rapid deceleration or not can be judged;
s32: acquiring boundary state transfer characteristic points of a known driving characteristic range based on the envelope region, and randomly generating a plurality of discrete state transfer characteristic points in the envelope region;
s33: connecting boundary points of an envelope area with internal random points by taking acceleration change and vehicle speed change conditions as indexes to jointly construct a speed track generated based on speed transfer characteristic points, namely a state sequence driving condition;
s34: when the vehicle enters a new driving environment, new driving habits, i.e. new speed transfer features, may be generated, whereby the previous envelope will be extended. Thus, the enhanced morphology driving regime is generated periodically or aperiodically through the extended speed transfer characteristic envelope region generated by the enhanced phase, whereas the morphology driving regime generated by the theoretical final phase would include all the speed transfer characteristics of the vehicle.
S4: the energy management strategy of the hybrid electric vehicle facing deep reinforcement learning defines a state space S, an action space A and a reward function R required by a training process, and an interface environment and an interaction scheme facing joint simulation are set;
in order to ensure that the energy management strategy of the hybrid electric vehicle based on the equivalent fuel consumption minimum strategy (ECMS) can be issued with the MathWork authorities to have fair setting conditions, by taking the optimization of the fuel economy of the hybrid electric vehicle as a main goal and utilizing a depth value network suitable for discrete control tasks as a main control algorithm, as shown in fig. 5, a state space S, an action space A and a reward function R involved in the training process are defined as follows:
S=(T wheel ,SOC,Voc batt ,Gear transmot ,Vel car ,Temp env )
A=Throttle=[0,0.1,0.2,......,0.9,1]
wherein T is wheel Is the torque demand at the vehicle, SOC is the battery state of charge, voc batt Is the open circuit voltage of the battery, gear trans Is the gear of the speed changer, omega mot Is the motor speed, vel car Is the longitudinal speed of the vehicle, temp env Is ambient temperature, defined as a constant 313K; throttle is the Throttle, discretized into 11 operating points [0,0.1,0.2, …,0.9,1]the method comprises the steps of carrying out a first treatment on the surface of the Alpha, beta and gamma are weight coefficients, T eng Is engine torque, n eng Is the rotational speed of the engine,is the instantaneous oil consumption of the engine, BSFC(s) is the effective fuel consumption rate, SOC target Is the target battery state of charge.
Then, four m-files, whose respective purposes are opening the model, transferring data, continuing to run the model and closing the model, are written in the Matlab environment as function files for co-simulation data interaction. Thus, as shown in fig. 6, the control command DRL_action of the deep reinforcement learning agent in the Python environment is transmitted to the hybrid power system, and the hybrid system state parameters obtained after the control command is executed in the Simulink environment are transmitted back to the deep reinforcement learning agent. The state parameters include the brake-specific fuel consumption BSFC, instantaneous fuel consumption FuelFlw, battery state of charge BattSoc, battery voltage BattV, transmission gear, longitudinal running speed xdot, motor speed MotSpd, motor torque MotTrq, engine speed EngSpd, engine torque EngTrq, ambient temperature Temp, wheel demand torque WhlTrq, and simulation time SimuTime.
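A minimal sketch of this interaction loop, assuming the MATLAB Engine API for Python (which dispatches any m-file on the path as a method of the engine session) and treating the four m-file names (`open_model`, `transfer_data`, `continue_model`, `close_model`) and the model name as illustrative stand-ins for the files described above:

```python
# Ordered names of the Simulink output vector, as listed in the text.
STATE_KEYS = ["BSFC", "FuelFlw", "BattSoc", "BattV", "Gear", "xdot",
              "MotSpd", "MotTrq", "EngSpd", "EngTrq", "Temp",
              "WhlTrq", "SimuTime"]

def unpack_state(raw):
    """Map the ordered Simulink output vector to named state parameters."""
    return dict(zip(STATE_KEYS, raw))

def run_episode(eng, agent, model="hev_model", steps=1000):
    """eng: a matlab.engine session (e.g. matlab.engine.start_matlab());
    agent: a DRL agent exposing act(state). The four calls below stand
    for the four m-files: open, exchange data, continue, close."""
    eng.open_model(model, nargout=0)                 # m-file 1: open model
    state = unpack_state(eng.transfer_data(0.0))     # m-file 2: exchange data
    for _ in range(steps):
        action = agent.act(state)                    # DRL_action
        state = unpack_state(eng.transfer_data(action))
        eng.continue_model(nargout=0)                # m-file 3: advance sim
    eng.close_model(model, nargout=0)                # m-file 4: close model
    return state
```

Only the vector-unpacking part is exercised here; the engine calls are a sketch of how the Python agent and the Simulink plant would exchange `DRL_action` and the state vector.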
S5: the online enhanced-update iterative training process of the deep reinforcement learning type control strategy is completed on a Tencent Cloud virtual machine; after training is finished, the latest control strategy is downloaded and loaded into the hybrid power system model for subsequent testing. The method specifically comprises the following steps:
S51: purchase the right to use the Tencent Cloud virtual server shown in Table 1, and upload the deep-reinforcement-learning-based hybrid electric vehicle energy management strategy training program together with the state-sequence speed characteristic working conditions of the corresponding stage;
TABLE 1 Tencent Cloud server configuration
S52: configure the training environment, installing and setting up the Python/TensorFlow environment required by the deep reinforcement learning algorithm and the Matlab/Simulink environment required by the hybrid power system;
S53: the control strategy is updated by iterative trial and error in the cloud server environment; after the total cumulative reward function reaches a stable maximum convergence state, the neural network parameter file corresponding to the energy management strategy is downloaded to the local environment, and the latest energy management strategy is loaded into the hybrid power system model in a brand-new test environment for subsequent testing and verification.
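The stopping criterion of S53, a total cumulative reward that has settled at a stable maximum, can be checked with a simple moving-average heuristic (a sketch only; the window size and tolerance are assumed values, not the patent's):

```python
import numpy as np

def has_converged(rewards, window=50, tol=0.01):
    """Heuristic stopping rule: training is considered converged when the
    moving average of the total episode reward over the last `window`
    episodes differs from the previous window by less than `tol`
    (relative). `window` and `tol` are assumptions."""
    r = np.asarray(rewards, dtype=float)
    if len(r) < 2 * window:
        return False  # not enough episodes to compare two windows
    recent = r[-window:].mean()
    previous = r[-2 * window:-window].mean()
    return abs(recent - previous) <= tol * max(abs(previous), 1e-9)
```

Once this test passes, the network parameter file would be saved and downloaded to the local environment as described above.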
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present invention, which is intended to be covered by the claims of the present invention.

Claims (5)

1. The method for enhanced updating of the energy management strategy of a deep reinforcement learning type hybrid electric vehicle, characterized by comprising the following steps:
s1: acquiring different types of vehicle historical speed data through diversified driving information sources;
S2: dividing each type of acquired vehicle historical speed data into three stages respectively, and then merging to generate the speed state transition feature matrix of the corresponding stage, specifically comprising the following steps:
S21: based on the four types of vehicle historical speed data, dividing each into three stages by time: an initial stage, an enhancement stage and a final stage;
S22: combining the four types of vehicle historical speed data stage by stage to form a complete speed working condition;
S23: constructing the speed transition feature matrices corresponding to the three stages respectively, wherein the four types of vehicle speed data reflect more comprehensive driving features;
S3: generating characteristic driving conditions based on state sequences according to the speed state transition feature matrices generated from the vehicle historical speed data, and training the deep reinforcement learning type hybrid electric vehicle energy management strategy;
S4: for the deep reinforcement learning type hybrid electric vehicle energy management strategy, defining the state space S, action space A and reward function R required by the training process, and using Matlab m-files as the data interface to realize co-simulation training between the deep reinforcement learning agent in the Python environment and the parallel hybrid power system in the Simulink environment;
defining the variable space and the rewarding function required by the training process, specifically comprising: the state space S, action space a and reward function R involved in the training process are defined as follows:
S = (T_wheel, SOC, Voc_batt, Gear_trans, ω_mot, Vel_car, Temp_env)
A = Throttle = [0, 0.1, 0.2, …, 0.9, 1]
wherein T_wheel is the wheel torque demand, SOC is the battery state of charge, Voc_batt is the battery open-circuit voltage, Gear_trans is the transmission gear, ω_mot is the motor speed, Vel_car is the longitudinal vehicle speed, and Temp_env is the ambient temperature; Throttle is the throttle opening, discretized into 11 operating points [0, 0.1, 0.2, …, 0.9, 1]; α, β and γ are weight coefficients, T_eng is the engine torque, n_eng is the engine speed, ṁ_fuel is the instantaneous fuel consumption of the engine, BSFC(s) is the brake-specific fuel consumption, and SOC_target is the target battery state of charge;
S5: completing the online enhanced-update iterative training process of the deep reinforcement learning type hybrid electric vehicle energy management strategy based on the cloud server, and after training is finished, downloading the latest hybrid electric vehicle energy management strategy and loading it into the hybrid power system model for subsequent testing.
2. The hybrid vehicle energy management strategy enhancement updating method according to claim 1, wherein in step S1, the acquired different types of vehicle history speed data include:
(1) Automatic driving data based on virtual simulation, from CARLA: using the official vehicles and maps as the environment, the vehicle is controlled to drive in the area by the automatic driving function, with the environment of the target vehicle including surrounding vehicles, pedestrians and traffic management facilities, thereby obtaining simulated speed data representing automatic driving control characteristics;
(2) Vehicle speed data of real human drivers, from DBNet: downloading a published data set of real drivers driving within urban areas from the internet, and obtaining real speed data characterizing human driving behavior;
(3) Speed data based on racing video games, from Gran Turismo: by running Gran Turismo Sport, a realistic driving simulator, on the PlayStation platform, simulated speed data fully characterizing the vehicle in a racing environment are obtained according to differences in track, vehicle and player driving style;
(4) Standard working conditions specially used for testing vehicle performance: several standard speed working conditions commonly used in the vehicle testing field are selected and combined to obtain real speed data released by the authorities.
3. The method for enhancing and updating the energy management strategy of the hybrid electric vehicle according to claim 1, wherein in step S3, the deep reinforcement learning type hybrid electric vehicle energy management strategy is trained by using the generated characteristic working condition based on the state sequence, and specifically comprises the following steps:
S31: in the speed state transition feature matrix of the initial stage, the vehicle historical speed data covers part of the range of the state transition matrix; an envelope curve is then obtained for this known range, and the region enclosed by the envelope reflects the historical driving characteristics of the vehicle;
S32: acquiring the boundary state transition feature points of the known driving characteristic range based on the envelope region, and randomly generating a plurality of discrete state transition feature points inside the envelope region;
S33: connecting the boundary points of the envelope region with the internal random points, taking acceleration change and vehicle speed change conditions as indexes, to jointly construct a speed trajectory generated from the speed transition feature points, namely the state-sequence driving condition;
S34: when the vehicle enters a new driving environment, the enhanced driving condition is generated, periodically or aperiodically, from the extended speed transfer characteristic envelope region produced in the enhancement stage.
4. The method for enhanced updating of the hybrid electric vehicle energy management strategy according to claim 1, wherein in step S4, the co-simulation setting specifically comprises: writing, in the Matlab environment, four m-files whose respective purposes are opening the model, transferring data, continuing to run the model and closing the model, as function files for co-simulation data interaction; thereby the control command DRL_action of the deep reinforcement learning agent in the Python environment is transmitted to the hybrid power system, and the hybrid system state parameters obtained after the control command is executed in the Simulink environment are transmitted back to the deep reinforcement learning agent; the state parameters include the brake-specific fuel consumption BSFC, instantaneous fuel consumption FuelFlw, battery state of charge BattSoc, battery voltage BattV, transmission gear, longitudinal running speed xdot, motor speed MotSpd, motor torque MotTrq, engine speed EngSpd, engine torque EngTrq, ambient temperature Temp, wheel demand torque WhlTrq, and simulation time SimuTime.
5. The method of claim 1, wherein in step S5, training is completed when the total jackpot function is in a stable maximum convergence state.
CN202310378883.9A 2023-04-10 2023-04-10 Energy management strategy enhancement updating method for deep reinforcement learning type hybrid electric vehicle Active CN116424332B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310378883.9A CN116424332B (en) 2023-04-10 2023-04-10 Energy management strategy enhancement updating method for deep reinforcement learning type hybrid electric vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310378883.9A CN116424332B (en) 2023-04-10 2023-04-10 Energy management strategy enhancement updating method for deep reinforcement learning type hybrid electric vehicle

Publications (2)

Publication Number Publication Date
CN116424332A CN116424332A (en) 2023-07-14
CN116424332B true CN116424332B (en) 2023-11-21

Family

ID=87084933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310378883.9A Active CN116424332B (en) 2023-04-10 2023-04-10 Energy management strategy enhancement updating method for deep reinforcement learning type hybrid electric vehicle

Country Status (1)

Country Link
CN (1) CN116424332B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117184095B (en) * 2023-10-20 2024-05-14 燕山大学 Hybrid electric vehicle system control method based on deep reinforcement learning
CN117911670A (en) * 2023-12-01 2024-04-19 重庆大学 Algorithm for detecting small target by improving YOLOv5
CN118082891B (en) * 2024-04-26 2024-06-18 广汽埃安新能源汽车股份有限公司 Gear optimization method

Citations (8)

Publication number Priority date Publication date Assignee Title
CN102717797A (en) * 2012-06-14 2012-10-10 北京理工大学 Energy management method and system of hybrid vehicle
CN111845701A (en) * 2020-08-05 2020-10-30 重庆大学 HEV energy management method based on deep reinforcement learning in car following environment
WO2021083785A1 (en) * 2019-10-31 2021-05-06 Psa Automobiles Sa Method for training at least one algorithm for a control device of a motor vehicle, computer program product, and motor vehicle
WO2021092639A1 (en) * 2019-11-12 2021-05-20 Avl List Gmbh Method and system for analysing and/or optimizing a configuration of a vehicle type
CN114103971A (en) * 2021-11-23 2022-03-01 北京理工大学 Energy-saving driving optimization method and device for fuel cell vehicle
CN114802180A (en) * 2022-05-19 2022-07-29 广西大学 Mode prediction system and method for hybrid electric vehicle power system coordination control
CN115470700A (en) * 2022-09-01 2022-12-13 吉泰车辆技术(苏州)有限公司 Hybrid vehicle energy management method based on reinforcement learning training network model
CN115495997A (en) * 2022-10-28 2022-12-20 东南大学 New energy automobile ecological driving method based on heterogeneous multi-agent deep reinforcement learning

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
JP7314831B2 (en) * 2020-02-17 2023-07-26 トヨタ自動車株式会社 VEHICLE CONTROL DATA GENERATION METHOD, VEHICLE CONTROL DEVICE, VEHICLE CONTROL SYSTEM, AND VEHICLE LEARNING DEVICE
EP4244770A1 (en) * 2020-11-12 2023-09-20 Umnai Limited Architecture for explainable reinforcement learning

Patent Citations (8)

Publication number Priority date Publication date Assignee Title
CN102717797A (en) * 2012-06-14 2012-10-10 北京理工大学 Energy management method and system of hybrid vehicle
WO2021083785A1 (en) * 2019-10-31 2021-05-06 Psa Automobiles Sa Method for training at least one algorithm for a control device of a motor vehicle, computer program product, and motor vehicle
WO2021092639A1 (en) * 2019-11-12 2021-05-20 Avl List Gmbh Method and system for analysing and/or optimizing a configuration of a vehicle type
CN111845701A (en) * 2020-08-05 2020-10-30 重庆大学 HEV energy management method based on deep reinforcement learning in car following environment
CN114103971A (en) * 2021-11-23 2022-03-01 北京理工大学 Energy-saving driving optimization method and device for fuel cell vehicle
CN114802180A (en) * 2022-05-19 2022-07-29 广西大学 Mode prediction system and method for hybrid electric vehicle power system coordination control
CN115470700A (en) * 2022-09-01 2022-12-13 吉泰车辆技术(苏州)有限公司 Hybrid vehicle energy management method based on reinforcement learning training network model
CN115495997A (en) * 2022-10-28 2022-12-20 东南大学 New energy automobile ecological driving method based on heterogeneous multi-agent deep reinforcement learning

Non-Patent Citations (1)

Title
Han Shaojian; Zhang Fengqi; Ren Yanfei; Xi Junqiang. Predictive energy management of hybrid electric vehicles based on deep learning. China Journal of Highway and Transport. 2020, (No. 08), page 3, column 1, line 15 to page 7, column 2, line 17. *

Also Published As

Publication number Publication date
CN116424332A (en) 2023-07-14

Similar Documents

Publication Publication Date Title
CN116424332B (en) Energy management strategy enhancement updating method for deep reinforcement learning type hybrid electric vehicle
Liessner et al. Deep reinforcement learning for advanced energy management of hybrid electric vehicles.
Dextreit et al. Game theory controller for hybrid electric vehicles
Phan et al. Intelligent energy management system for conventional autonomous vehicles
Liu et al. Formula-E race strategy development using artificial neural networks and Monte Carlo tree search
Wu et al. Fast velocity trajectory planning and control algorithm of intelligent 4WD electric vehicle for energy saving using time‐based MPC
CN105539423A (en) Hybrid vehicle torque distribution control method and system for protecting battery based on environment temperature
Nguyen et al. Optimal drivetrain design methodology for enhancing dynamic and energy performances of dual-motor electric vehicles
Zhu et al. A deep reinforcement learning framework for eco-driving in connected and automated hybrid electric vehicles
Zhu et al. Energy management of hybrid electric vehicles via deep Q-networks
CN113554337B (en) Plug-in hybrid electric vehicle energy management strategy construction method integrating traffic information
CN115495997B (en) New energy automobile ecological driving method based on heterogeneous multi-agent deep reinforcement learning
CN115793445B (en) Hybrid electric vehicle control method based on multi-agent deep reinforcement learning
CN115534929A (en) Plug-in hybrid electric vehicle energy management method based on multi-information fusion
CN112498334B (en) Robust energy management method and system for intelligent network-connected hybrid electric vehicle
Johri et al. Self-learning neural controller for hybrid power management using neuro-dynamic programming
Li et al. Distributed cooperative energy management system of connected hybrid electric vehicles with personalized non-stationary inference
CN112473151A (en) Information providing device, information providing method, and storage medium
Xu et al. Real-time energy optimization of HEVs under-connected environment: a benchmark problem and receding horizon-based solution
You et al. Real-time energy management strategy based on predictive cruise control for hybrid electric vehicles
CN117698685B (en) Dynamic scene-oriented hybrid electric vehicle self-adaptive energy management method
Sim et al. A control algorithm of an idle stop and go system with traffic conditions for hybrid electric vehicles
Yadav et al. Intelligent energy management strategies for hybrid electric transportation
CN117807714B (en) Adaptive online lifting method for deep reinforcement learning type control strategy
Zlocki et al. Methodology for quantification of fuel reduction potential for adaptive cruise control relevant driving strategies

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant