CN115503559A - Learning type collaborative energy management method for fuel cell automobile considering air conditioning system - Google Patents
- Publication number
- CN115503559A (application number CN202211385462.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60L—PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
- B60L58/00—Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles
- B60L58/30—Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles for monitoring or controlling fuel cells
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60H—ARRANGEMENTS OF HEATING, COOLING, VENTILATING OR OTHER AIR-TREATING DEVICES SPECIALLY ADAPTED FOR PASSENGER OR GOODS SPACES OF VEHICLES
- B60H1/00—Heating, cooling or ventilating [HVAC] devices
- B60H1/00357—Air-conditioning arrangements specially adapted for particular vehicles
- B60H1/00385—Air-conditioning arrangements specially adapted for particular vehicles for vehicles having an electrical drive, e.g. hybrid or fuel cell
- B60H1/00392—Air-conditioning arrangements specially adapted for particular vehicles for vehicles having an electrical drive, e.g. hybrid or fuel cell for electric vehicles having only electric drive means
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60H—ARRANGEMENTS OF HEATING, COOLING, VENTILATING OR OTHER AIR-TREATING DEVICES SPECIALLY ADAPTED FOR PASSENGER OR GOODS SPACES OF VEHICLES
- B60H1/00—Heating, cooling or ventilating [HVAC] devices
- B60H1/32—Cooling devices
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60L—PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
- B60L58/00—Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles
- B60L58/10—Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles for monitoring or controlling batteries
- B60L58/12—Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles for monitoring or controlling batteries responding to state of charge [SoC]
Landscapes
- Engineering & Computer Science (AREA)
- Mechanical Engineering (AREA)
- Life Sciences & Earth Sciences (AREA)
- Sustainable Development (AREA)
- Sustainable Energy (AREA)
- Physics & Mathematics (AREA)
- Thermal Sciences (AREA)
- Power Engineering (AREA)
- Transportation (AREA)
- Fuel Cell (AREA)
Abstract
The invention relates to a learning-based collaborative energy management method for a fuel cell vehicle that takes the air conditioning system into account, and belongs to the field of new energy vehicles. The method comprises the following steps: S1: acquiring vehicle state parameter information, fuel cell parameter information, power battery parameter information and air conditioning system parameter information of a fuel cell vehicle; S2: establishing a collaborative energy management model of the fuel cell vehicle; S3: establishing a collaborative energy management optimization control strategy for the fuel cell vehicle that considers the air conditioning system, solving a multi-objective optimization problem covering hydrogen consumption economy and cabin temperature comfort with the soft actor-critic (SAC) algorithm, and, while performing energy flow optimization control, controlling the change in the cooling/heating capacity of the air conditioner so that the cabin temperature stays within a comfortable range. The invention effectively resolves the trade-off between hydrogen consumption and cabin temperature comfort, and optimizes both the hydrogen consumption economy and the cabin temperature comfort of the fuel cell vehicle.
Description
Technical Field
The invention belongs to the field of new energy automobiles, and relates to a learning type collaborative energy management method for a fuel cell automobile considering an air conditioning system.
Background
Faced with increasingly severe problems such as environmental pollution and fossil fuel shortages, automobile manufacturers are striving to develop new energy vehicles. With the development of fuel cell technology, fuel cell vehicles offer zero emissions, low energy consumption and long driving range, and are considered one of the important research directions for the sustainable development of vehicles. The energy management strategy is a core control technology of the multi-power-source system of a fuel cell vehicle, and its performance directly determines the economy of the whole vehicle. In current research, energy management methods fall into three main types: rule-based, optimization-based, and learning-based strategies. However, rule-based and optimization-based methods cannot achieve real-time capability and optimality at the same time. Traditional deep reinforcement learning algorithms can achieve both in energy flow optimization, but they have shortcomings with respect to training data and hyper-parameter setting. The soft actor-critic (SAC) algorithm offers a way to address these problems.
On the other hand, the air conditioning system is an indispensable auxiliary device for a fuel cell vehicle, and contributes to providing a comfortable riding environment for passengers in the vehicle. However, the use of the air conditioning system inevitably increases the energy consumption of the fuel cell vehicle, thereby affecting the economic performance of the entire vehicle. In the current research on the energy management method of the fuel cell automobile, the energy consumption of the air conditioning system is generally regarded as a fixed value or ignored. However, due to the change of driving environment, the heat exchange quantity inside and outside the cab changes, and the power used by the air conditioning system changes.
Therefore, a new energy management method for a fuel cell vehicle is needed to coordinate and control the air conditioning system and the power source components, and to optimize the energy flow in the vehicle while considering the energy consumption variation of the air conditioning system.
Disclosure of Invention
In view of the above, the present invention provides a learning-based collaborative energy management method for a fuel cell vehicle considering the air conditioning system. By applying the soft actor-critic (SAC) algorithm, the method coordinates the control of the air conditioning system and the power source components of the fuel cell vehicle, optimizing the energy flow of the whole vehicle while ensuring cabin comfort, and thereby reducing the overall energy consumption of the vehicle.
To achieve this purpose, the invention provides the following technical solution:
A learning-based collaborative energy management method for a fuel cell vehicle considering the air conditioning system, specifically comprising the following steps:
S1: acquiring vehicle state parameter information, fuel cell parameter information, power battery parameter information and air conditioning system parameter information of the fuel cell vehicle;
S2: establishing a collaborative energy management model of the fuel cell vehicle, comprising: a whole-vehicle longitudinal dynamics model, a fuel cell model, a power battery model, a motor model, an air conditioning system model and a cabin thermal load model;
S3: establishing a collaborative energy management optimization control strategy for the fuel cell vehicle that considers the air conditioning system, solving a multi-objective optimization problem covering hydrogen consumption economy and cabin temperature comfort with the SAC algorithm, and, while performing energy flow optimization control, controlling the change in the cooling/heating capacity of the air conditioner so that the cabin temperature stays within a comfortable range; here SAC denotes the soft actor-critic algorithm.
Further, in step S1, the vehicle state parameter information includes: vehicle speed, cabin thermal load parameters, motor operating efficiency and transmission system characteristic parameters; the fuel cell parameter information includes: the power, efficiency and hydrogen consumption of the fuel cell; the power battery parameter information includes: the state of charge, internal resistance and open-circuit voltage of the power battery; the air conditioning system parameter information includes: the cooling/heating capacity of the air conditioning system and the corresponding power.
Further, in step S2, the established longitudinal dynamics model of the entire vehicle is:
$P_{drive} = (F_{air} + F_f + F_i + m_0 a) \cdot v$
$P_{dem} = P_b + P_{fc} \cdot \eta_{DC/DC}$
where $m_0$ is the vehicle mass; $v$ is the vehicle speed; $a$ is the vehicle acceleration; $F_{air}$ is the air resistance; $F_f$ is the rolling resistance; $F_i$ is the acceleration resistance; $\eta_m$, $\eta_{DC/AC}$, $\eta_{DC/DC}$ and $\eta_{motor}$ are the transmission efficiency, DC/AC converter efficiency, DC/DC converter efficiency and motor efficiency, respectively; and $P_{drive}$, $P_{dem}$, $P_b$ and $P_{fc}$ are the driving power at the vehicle wheels, the required power, the battery output power and the fuel cell output power, respectively.
Further, in step S2, the fuel cell model is established as follows:
$\eta_{fc} = f_\eta(P_{fc})$
where $f_\eta(\cdot)$ and the corresponding hydrogen consumption function are fitting functions of the fuel cell efficiency and hydrogen consumption, respectively; both efficiency and hydrogen consumption can be calculated by interpolation.
Further, in step S2, the power battery model established is:
where $I_L$ is the power battery current; $V_{oc}$ is the open-circuit voltage of the power battery; $R_{in}$ is the equivalent internal resistance of the power battery; $SOC_0$ is the initial SOC; $Q_t$ is the maximum capacity of the power battery; $t_0$ is the initial time; and $t_f$ is the final time.
Further, in step S2, the established motor model is:
$\eta_m = f_m(\omega_m, T_m)$
where $\omega_m$ and $T_m$ are the motor speed and torque, respectively; $P_m$ is the motor output power; and $f_m(\cdot)$ is a fitting function of the motor operating efficiency, which can be obtained by interpolation.
Further, in step S2, the established air conditioning system model is:
$P_{ac} = Q_{ac} / \eta_{cop}$
where $Q_{ac}$ is the cooling capacity or heating capacity of the air conditioning system; $P_{ac}$ is the corresponding power consumption of the air conditioning system; and $\eta_{cop}$ is the coefficient of performance of the air conditioning system.
Further, in step S2, the built cabin thermal load model is:
$Q_c = \sum K F (T_{out} - T_{in})$
$Q_h = 145 + 116 n$
$Q_n = m_e \xi Cp_{air} (T_{out} - T_{in})$
where $Q_c$, $Q_r$, $Q_h$ and $Q_n$ are the heat conduction load, the radiant heat load, the heat generated by the vehicle occupants (empirically, about 145 W for the driver and about 116 W for each passenger) and the ventilation system heat load, respectively; $K$ is the heat transfer coefficient; $F$ is the heat transfer area of the respective enclosure surface; $T_{out}$ is the ambient temperature; $T_{in}$ is the cabin air temperature; $\eta$ is the transmittance; $I$ is the sunlight intensity; $A_i$ is the area of the windshield, the left and right side windows, and the rear window; $\theta_i$ is the sunlight incidence angle; $\beta$ is the shading factor; $n$ is the number of passengers in the vehicle; $m_e$ is the mass of air passing through the evaporator; $\xi$ is the air recirculation coefficient; $Cp_{air}$ is the heat capacity of the cabin air; and $\rho_{air}$ and $V_{air}$ are the air density in the cabin and the cabin volume, respectively.
Further, in step S3, establishing a fuel cell vehicle cooperative energy management optimization control strategy considering an air conditioning system, specifically including the following steps:
S301: Determining the state space: to reflect key environmental information, the power battery SOC, the fuel cell output power $P_{fc}$, the vehicle speed $v$ and the cooling/heating capacity $Q_{ac}$ of the air conditioning system are set as state variables, and the state space $S$ is constructed, which can be expressed as:
$S = \{SOC, P_{fc}, v, Q_{ac}\}$
S302: Determining the action space: collaborative energy management that considers the air conditioning system must not only distribute power among the power sources but also maintain the thermal comfort of the cabin by changing the cooling/heating capacity of the air conditioning system. For this purpose, the change in fuel cell output power $\Delta P_{fc}$ and the change in air conditioning cooling/heating capacity $\Delta Q_{ac}$ are set as action variables, and the action space $A$ is constructed, which can be expressed as:
$A = \{\Delta P_{fc}, \Delta Q_{ac}\}$
S303: Establishing the reward function: to ensure cabin temperature comfort, the temperature in the vehicle cabin is maintained at about 24 °C, so the reward function also includes an optimization term for the cabin temperature change. The reward function $R$ is therefore set as the weighted sum of three indexes, namely hydrogen consumption, SOC change and cabin temperature change, expressed as:
$R = -(\zeta \cdot fuel(t) + \psi \cdot (SOC(t) - 0.7)^2 + \gamma \cdot (T_{in} - 24)^2)$
where $\zeta$, $\psi$ and $\gamma$ are the weight factors of the optimization terms; adjusting these weights resolves the trade-off between hydrogen consumption and cabin temperature comfort and thereby solves the multi-objective optimization problem; $fuel(t)$ is the hydrogen consumption at the current time; and $SOC(t)$ is the state of charge of the power battery at the current time.
Further, in step S3, the multi-objective optimization problem covering hydrogen consumption economy and cabin temperature comfort is solved with the SAC algorithm, specifically comprising the following steps:
S311: The multi-objective optimization problem in energy management is solved with the SAC algorithm. The SAC algorithm introduces the action entropy so that the output actions are more dispersed, which improves the exploration capability, the ability to learn new tasks and the stability of the algorithm. The entropy is expressed as:
$H(\pi(\cdot|s_t)) = -\log \pi(\cdot|s_t)$
where $H$ is the entropy of the policy $\pi(\cdot|s_t)$.
S312: During the solution process, the actor network in the agent takes the state $s_t$ as input and outputs the mean and variance of a Gaussian distribution over actions; the action $a_t$ is then generated with the reparameterization technique:
where $\tau_t$ is a noise signal sampled from a standard normal distribution, and the remaining terms are the mean and variance output by the function, which are the mean and variance of the Gaussian distribution, respectively.
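The reparameterized sampling in S312 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `sample_action`, the use of a log standard deviation instead of a variance, and the omission of any action squashing or clipping are all assumptions.

```python
import math
import random

def sample_action(mean, log_std, rng=random):
    """Draw a_t = mean + std * tau with tau ~ N(0, 1), per action dimension.

    mean, log_std: lists as an actor network might output them (assumed
    parameterization). Returns the action and its Gaussian log-probability.
    """
    action, log_prob = [], 0.0
    for mu, ls in zip(mean, log_std):
        std = math.exp(ls)
        tau = rng.gauss(0.0, 1.0)          # noise from a standard normal
        action.append(mu + std * tau)      # reparameterized sample
        # log N(a; mu, std) written in terms of tau
        log_prob += -0.5 * tau * tau - ls - 0.5 * math.log(2.0 * math.pi)
    return action, log_prob

# Example: two action dimensions, e.g. changes in P_fc and Q_ac
a_t, logp = sample_action([0.5, -0.2], [-1.0, -1.0], rng=random.Random(0))
```

Because the noise is drawn independently of the network output, the action remains differentiable with respect to the mean and standard deviation, which is what allows the actor to be trained by gradient descent.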
S313: After action $a_t$ is performed, the vehicle environment feeds back a reward $r_t$ to the agent and moves to the next state $s_{t+1}$; the resulting interaction data $(s_t, a_t, r_t, s_{t+1})$ between the environment and the agent are stored in the experience pool.
S314: A mini-batch of experience samples is drawn at random from the experience pool. Maximizing the action-value function tends to overestimate it, and computing the target with the same network overestimates it further. To avoid this, two evaluation critic networks with parameters $\theta_1, \theta_2$ and two target critic networks with parameters $\theta'_1, \theta'_2$ are introduced, and the smaller of the action-value outputs of the target critic networks is taken as the target value. For a given state $s_t$ and action $a_t$, the soft action-value function $Q_{soft}(s_t, a_t)$ in the SAC algorithm is updated as follows:
where $r$ is the reward obtained by the vehicle; $\gamma$ is the discount factor; and $\alpha$ is the temperature coefficient.
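For reference, the standard SAC form of this update, written with the symbols defined here, is sketched below. The notation follows the published SAC algorithm and is an assumption; it may differ from the patent's own formula.

```latex
Q_{soft}(s_t, a_t) = r + \gamma \, \mathbb{E}_{s_{t+1},\, a_{t+1} \sim \pi}
  \left[ \min_{i=1,2} Q_{\theta'_i}(s_{t+1}, a_{t+1})
         - \alpha \log \pi(a_{t+1} \mid s_{t+1}) \right]
```

The minimum over the two target critics is the mechanism that counteracts overestimation, and the $-\alpha \log \pi$ term adds the entropy bonus to the value of the next state.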
S315: When the policy network is updated, the evaluation critic networks are updated by minimizing the loss function $L(\theta_i)$, which is defined as the mean squared error between the evaluation critic output and the target critic output, expressed as:
where the two terms are the evaluation function of the evaluation critic network with parameters $\theta_i$ and the evaluation function of the target critic network with parameters $\theta'_i$, respectively.
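The target value of S314 and the loss of S315 can be sketched together on plain numbers. This is an illustrative sketch under assumptions: the function names, the default values of `gamma` and `alpha`, and the use of scalar critic outputs in place of neural networks are not from the patent.

```python
def soft_q_target(r, q1_next, q2_next, log_prob_next, gamma=0.99, alpha=0.2):
    """Target y = r + gamma * (min(Q'_1, Q'_2) - alpha * log pi(a'|s')).

    q1_next, q2_next: outputs of the two target critics at (s_{t+1}, a_{t+1}).
    log_prob_next: log-probability of a_{t+1} under the current policy.
    """
    return r + gamma * (min(q1_next, q2_next) - alpha * log_prob_next)

def critic_loss(q_pred, targets):
    """Mean squared error between evaluation-critic outputs and targets."""
    return sum((q - y) ** 2 for q, y in zip(q_pred, targets)) / len(q_pred)

# Example on a mini-batch of two transitions (made-up numbers)
targets = [soft_q_target(-1.0, 5.0, 4.0, -0.5),
           soft_q_target(-0.5, 3.0, 3.5, -1.2)]
loss = critic_loss([3.0, 2.5], targets)
```

In a full implementation each evaluation critic would be trained against the same target with its own loss $L(\theta_i)$.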
S316: The actor network parameters are updated by minimizing the KL divergence; the smaller the KL value, the smaller the difference between the rewards corresponding to the output actions, and the better the policy converges. The objective function of the actor network is defined as:
where $D_{KL}$ denotes the KL divergence; $Z(s_t)$ is the partition function that normalizes the distribution; the expectation is taken over the vehicle state $s_t$ at the current time and the action $a_t$ performed in it; and the remaining symbols denote the policy function in state $s_t$ and the parameters of the policy function.
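For reference, the actor objective in the published SAC algorithm is sketched below. The policy parameter symbol $\phi$ and the critic parameter $\theta$ are assumed names, and the form is the standard one rather than a transcription of the patent's formula.

```latex
J_\pi(\phi) = \mathbb{E}_{s_t}\left[ D_{KL}\left( \pi_\phi(\cdot \mid s_t)
  \,\Big\|\, \frac{\exp\left(\tfrac{1}{\alpha} Q_\theta(s_t, \cdot)\right)}{Z(s_t)} \right) \right]

% Expanding the KL divergence gives, up to a positive scale factor
% and a term that does not depend on \phi:
J_\pi(\phi) = \mathbb{E}_{s_t,\, a_t \sim \pi_\phi}\left[
  \alpha \log \pi_\phi(a_t \mid s_t) - Q_\theta(s_t, a_t) \right]
```

The second form shows why $Z(s_t)$ never needs to be computed: it does not depend on the policy parameters and drops out of the gradient.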
S317: The actor network parameters are updated by gradient descent, expressed as:
where the two gradient terms are the gradient with respect to the policy function parameters and the gradient with respect to the action $a_t$ performed at the current time $t$.
S318: In the SAC algorithm, the temperature coefficient $\alpha$ strongly affects training, and its optimal value differs across reinforcement learning tasks and training stages. To adjust the temperature coefficient automatically, the objective function of the following optimization problem is minimized, which yields an updated temperature coefficient at each step. The objective function is expressed as:
where $H_0$ is a predefined threshold for the minimum policy entropy; the expectation is taken over actions $a_t$ performed according to the policy $\pi_t$; $\pi_t(a_t|s_t)$ is the policy function; $s_t$ is the state of the fuel cell vehicle at the current time $t$; and $a_t$ is the action performed according to the policy function at the current time $t$.
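The automatic temperature adjustment in S318 can be sketched with the standard SAC temperature objective, $J(\alpha) = \mathbb{E}[-\alpha \log \pi_t(a_t|s_t) - \alpha H_0]$. The objective form, the function name `temperature_step`, the learning rate, and the clamp at zero are assumptions for illustration; practical implementations often optimize $\log \alpha$ instead.

```python
def temperature_step(alpha, log_probs, h0, lr=1e-3):
    """One gradient-descent step on J(alpha) = mean(-alpha * (log_prob + h0)).

    log_probs: log pi(a_t|s_t) for a mini-batch of sampled actions.
    h0: minimum policy entropy threshold H_0.
    """
    # dJ/dalpha = -mean(log_prob + h0)
    grad = -sum(lp + h0 for lp in log_probs) / len(log_probs)
    return max(alpha - lr * grad, 0.0)   # keep the temperature non-negative

# Policy entropy (about 1.0) is below the threshold 2.0, so alpha increases
alpha = temperature_step(0.2, [-1.0, -1.0], h0=2.0, lr=0.1)
```

When the sampled entropy falls below $H_0$ the temperature rises and exploration is rewarded more strongly; when the entropy is above $H_0$ the temperature falls.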
The invention has the beneficial effects that:
1) The invention designs an energy management strategy based on the soft actor-critic algorithm, which effectively eliminates the dependence of traditional deep reinforcement learning algorithms on training data and hyper-parameter settings in fuel cell vehicle energy management, and helps improve the stability of control tasks in a continuous action space.
2) Whereas the change in air conditioning energy consumption is generally ignored when fuel cell vehicle energy management problems are formulated, the invention builds a collaborative energy management optimization control framework that considers the air conditioning system, taking hydrogen consumption, SOC maintenance and cabin temperature comfort as optimization objectives, and thereby realizes coordinated control of energy management and the air conditioning system.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of a fuel cell vehicle collaborative energy management method of the present invention;
FIG. 2 is a schematic structural diagram of a multi-power-source system of a fuel cell vehicle;
FIG. 3 is a schematic diagram of a cabin thermal load model and an air conditioning system configuration;
FIG. 4 is a diagram of the collaborative energy management framework considering the air conditioning system, built by applying the SAC algorithm in the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
Referring to FIGS. 1 to 4, the invention designs a collaborative energy management optimization method for a fuel cell vehicle that considers the air conditioning system, based on the soft actor-critic algorithm. Because the change in air conditioning energy consumption is generally ignored in fuel cell vehicle energy management, the main factors influencing temperature comfort in the vehicle cabin are analyzed, and an air conditioning system model and a cabin thermal load model are established. With hydrogen consumption, SOC maintenance and cabin temperature as optimization objectives, a collaborative energy management optimization control framework considering the air conditioning system is built by applying the soft actor-critic algorithm, which is suited to control tasks in a continuous action space. This realizes coordinated control of energy management and the air conditioning system, and optimizes the hydrogen consumption economy and cabin temperature comfort of the fuel cell vehicle. As shown in FIG. 1, the collaborative energy management optimization method specifically includes the following steps:
S1: acquiring key parameter information of the fuel cell vehicle, including:
the vehicle state parameter information, which includes: vehicle speed, cabin thermal load parameters, motor operating efficiency and transmission system characteristic parameters;
the fuel cell parameter information, which includes: the power, efficiency and hydrogen consumption of the fuel cell;
the power battery parameter information, which includes: the state of charge, internal resistance and open-circuit voltage of the power battery;
the air conditioning system parameter information, which includes: the cooling/heating capacity of the air conditioning system and the corresponding power.
S2: establishing a collaborative energy management model of the fuel cell vehicle, as shown in FIGS. 2 and 3, with the following specific steps:
S21: establishing the longitudinal dynamics model of the whole vehicle:
$P_{drive} = (F_{air} + F_f + F_i + m_0 a) \cdot v$
$P_{dem} = P_b + P_{fc} \cdot \eta_{DC/DC}$
where $m_0$ is the vehicle mass; $v$ is the vehicle speed; $a$ is the vehicle acceleration; $F_{air}$ is the air resistance; $F_f$ is the rolling resistance; $F_i$ is the acceleration resistance; $\eta_m$, $\eta_{DC/AC}$, $\eta_{DC/DC}$ and $\eta_{motor}$ are the transmission efficiency, DC/AC converter efficiency, DC/DC converter efficiency and motor efficiency, respectively; and $P_{drive}$, $P_{dem}$, $P_b$ and $P_{fc}$ are the driving power at the vehicle wheels, the required power, the battery output power and the fuel cell output power, respectively.
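The two power relations in S21 can be sketched as follows. The drag and rolling-resistance formulas, the default coefficients, and the function names are assumptions added for illustration; the patent gives only the summed form, so $F_i$ is passed in directly.

```python
def drive_power(v, a, m0, f_i=0.0, c_d=0.3, area=2.2, f_roll=0.012,
                rho_air=1.2, g=9.81):
    """P_drive = (F_air + F_f + F_i + m0 * a) * v, in watts.

    v in m/s, a in m/s^2, m0 in kg. The aerodynamic and rolling terms use
    common textbook forms with placeholder coefficients (assumed).
    """
    f_air = 0.5 * rho_air * c_d * area * v ** 2   # aerodynamic drag (assumed form)
    f_f = m0 * g * f_roll                         # rolling resistance (assumed form)
    return (f_air + f_f + f_i + m0 * a) * v

def battery_power(p_dem, p_fc, eta_dcdc=0.95):
    """Rearranged P_dem = P_b + P_fc * eta_DC/DC: the battery covers the rest."""
    return p_dem - p_fc * eta_dcdc

# Example: 1000 kg vehicle cruising at 10 m/s, then a 30 kW demand split
p_drive = drive_power(v=10.0, a=0.0, m0=1000.0)
p_b = battery_power(p_dem=30000.0, p_fc=20000.0)
```

The second function reflects how the controller's choice of fuel cell power determines the battery power once the demand is known; a negative result corresponds to the battery being charged.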
S22: establishing the fuel cell model:
$\eta_{fc} = f_\eta(P_{fc})$
where $f_\eta(\cdot)$ and the corresponding hydrogen consumption function are fitting functions of the fuel cell efficiency and hydrogen consumption, respectively; both efficiency and hydrogen consumption can be calculated by interpolation.
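The interpolation mentioned in S22 can be sketched with a piecewise-linear lookup. The grid values below are illustrative placeholders, not data from the patent, and the names `interp1`, `fuel_cell_efficiency` and `hydrogen_rate` are hypothetical.

```python
from bisect import bisect_right

# Placeholder map (NOT from the patent): power in kW, efficiency, H2 rate in g/s
P_GRID = [0.0, 10.0, 20.0, 40.0, 60.0]
ETA_GRID = [0.0, 0.50, 0.55, 0.50, 0.45]
H2_GRID = [0.0, 0.17, 0.30, 0.67, 1.11]

def interp1(x, xs, ys):
    """Piecewise-linear interpolation, clamped at both ends of the grid."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, x)                       # xs[j-1] <= x < xs[j]
    w = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + w * (ys[j] - ys[j - 1])

def fuel_cell_efficiency(p_fc_kw):
    """eta_fc = f_eta(P_fc), evaluated on the placeholder map."""
    return interp1(p_fc_kw, P_GRID, ETA_GRID)

def hydrogen_rate(p_fc_kw):
    """Hydrogen consumption rate at P_fc, evaluated on the placeholder map."""
    return interp1(p_fc_kw, P_GRID, H2_GRID)

eta = fuel_cell_efficiency(15.0)   # halfway between the 10 kW and 20 kW points
```

The motor efficiency map $f_m(\omega_m, T_m)$ could be handled the same way with a two-dimensional lookup.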
S23: establishing the power battery model:
where $I_L$ is the power battery current; $V_{oc}$ is the open-circuit voltage of the power battery; $R_{in}$ is the equivalent internal resistance of the power battery; $SOC_0$ is the initial SOC; $Q_t$ is the maximum capacity of the power battery; $t_0$ is the initial time; and $t_f$ is the final time.
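The symbols defined in S23 match the common equivalent-circuit (internal resistance) battery model, which can be sketched as below. The specific equations, $P_b = V_{oc} I_L - R_{in} I_L^2$ and SOC by current integration, are an assumption based on that standard model, and the function names are hypothetical.

```python
import math

def battery_current(p_b, v_oc, r_in):
    """Solve P_b = V_oc * I_L - R_in * I_L**2 for I_L (smaller root), in A."""
    disc = v_oc ** 2 - 4.0 * r_in * p_b
    if disc < 0:
        raise ValueError("requested power exceeds what the battery can deliver")
    return (v_oc - math.sqrt(disc)) / (2.0 * r_in)

def soc_step(soc, i_l, q_t_ah, dt):
    """Coulomb counting over dt seconds; Q_t is given in Ah (assumed unit)."""
    return soc - i_l * dt / (q_t_ah * 3600.0)

# Example: 2990 W from a 300 V, 0.1 ohm pack, then 100 s of discharge
i_l = battery_current(2990.0, 300.0, 0.1)
soc = soc_step(0.7, i_l, q_t_ah=10.0, dt=100.0)
```

Applying `soc_step` repeatedly from $t_0$ to $t_f$, starting at $SOC_0$, is the discrete-time counterpart of integrating the current over the trip.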
S24: establishing the motor model:
$\eta_m = f_m(\omega_m, T_m)$
where $\omega_m$ and $T_m$ are the motor speed and torque, respectively; $P_m$ is the motor output power; and $f_m(\cdot)$ is a fitting function of the motor operating efficiency, which can be obtained by interpolation.
S25: establishing the air conditioning system model:
$P_{ac} = Q_{ac} / \eta_{cop}$
where $Q_{ac}$ is the cooling capacity or heating capacity of the air conditioning system; $P_{ac}$ is the corresponding power consumption of the air conditioning system; and $\eta_{cop}$ is the coefficient of performance of the air conditioning system.
S26: establishing the cabin thermal load model:
$Q_c = \sum K F (T_{out} - T_{in})$
$Q_h = 145 + 116 n$
$Q_n = m_e \xi Cp_{air} (T_{out} - T_{in})$
where $Q_c$, $Q_r$, $Q_h$ and $Q_n$ are the heat conduction load, the radiant heat load, the heat generated by the vehicle occupants (empirically, about 145 W for the driver and about 116 W for each passenger) and the ventilation system heat load, respectively; $K$ is the heat transfer coefficient; $F$ is the heat transfer area of the respective enclosure surface; $T_{out}$ is the ambient temperature; $T_{in}$ is the cabin air temperature; $\eta$ is the transmittance; $I$ is the sunlight intensity; $A_i$ is the area of the windshield, the left and right side windows, and the rear window; $\theta_i$ is the sunlight incidence angle; $\beta$ is the shading factor; $n$ is the number of passengers in the vehicle; $m_e$ is the mass of air passing through the evaporator; $\xi$ is the air recirculation coefficient; $Cp_{air}$ is the heat capacity of the cabin air; and $\rho_{air}$ and $V_{air}$ are the air density in the cabin and the cabin volume, respectively.
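The thermal load terms of S26 and the air conditioning power of S25 can be combined into a simple cabin temperature update, sketched below. Three parts are assumptions rather than content of the patent: the radiant term is written as $\sum \eta I A_i \cos\theta_i \beta$ from the listed symbols, $m_e$ is treated as a mass flow in kg/s, and the lumped energy balance $\rho_{air} V_{air} Cp_{air} \, dT_{in}/dt = Q_{load} - Q_{ac}$ (with $Q_{ac} > 0$ meaning cooling) is a common simplification.

```python
import math

def cabin_heat_load(t_out, t_in, n, kf_pairs, m_e, xi, cp_air,
                    eta, sun_i, windows, beta):
    """Total load Q_c + Q_r + Q_h + Q_n in watts.

    kf_pairs: (K, F) per enclosure surface; windows: (A_i, theta_i in rad).
    """
    q_c = sum(k * f for k, f in kf_pairs) * (t_out - t_in)    # conduction
    q_r = sum(eta * sun_i * a_i * math.cos(th) * beta
              for a_i, th in windows)                         # radiation (assumed form)
    q_h = 145.0 + 116.0 * n                                   # driver + n passengers
    q_n = m_e * xi * cp_air * (t_out - t_in)                  # ventilation
    return q_c + q_r + q_h + q_n

def ac_power(q_ac, eta_cop):
    """Electrical power for a given cooling/heating capacity: |Q_ac| / COP."""
    return abs(q_ac) / eta_cop

def cabin_temp_step(t_in, q_load, q_ac, rho_air, v_air, cp_air, dt):
    """Explicit Euler step of the assumed lumped cabin energy balance."""
    return t_in + (q_load - q_ac) * dt / (rho_air * v_air * cp_air)

# Example with placeholder parameters: 35 C outside, 28 C cabin, one passenger
q = cabin_heat_load(35.0, 28.0, 1, [(3.0, 8.0)], 0.05, 0.3, 1005.0,
                    0.6, 800.0, [(1.2, 0.5), (0.8, 1.0)], 0.5)
t_next = cabin_temp_step(28.0, q, 3000.0, 1.2, 3.0, 1005.0, 1.0)
p_ac = ac_power(3000.0, 2.5)
```

This makes the coupling explicit: the air conditioning capacity chosen by the controller changes both the cabin temperature and, through the coefficient of performance, the electrical power that the fuel cell and battery must supply.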
S3: a fuel cell automobile collaborative energy management optimization control framework considering an air conditioning system is established based on a SAC algorithm, and a multi-objective optimization problem including hydrogen combustion economy and cabin temperature comfort is solved. As shown in fig. 3, the cooperative control of the energy management and air conditioning system is realized by applying the soft constraint actor critic algorithm, and the hydrogen-burning economy and cabin temperature comfort of the fuel cell vehicle are optimized, specifically:
S301: To reflect the key environment information, the SOC of the power battery, the fuel cell output power $P_{fc}$, the vehicle speed $v$, and the air conditioning cooling/heating capacity $Q_{ac}$ are set as state variables, and the state space is constructed as:
$S = \{SOC, P_{fc}, v, Q_{ac}\}$
S302: Collaborative energy management that accounts for the air conditioning system must not only distribute power between the power sources but also maintain cabin thermal comfort by changing the cooling/heating capacity of the air conditioning system. For this reason, the variation of the fuel cell output power and the variation of the air conditioning cooling/heating capacity are set as action variables, and the action space is constructed from these two variations.
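One way to picture the state and the incremental actions is the sketch below. The upper limits `p_fc_max` and `q_ac_max` and the clipping itself are assumptions added for illustration; the patent text does not state actuator bounds.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class State:
    soc: float   # power battery state of charge
    p_fc: float  # fuel cell output power (W)
    v: float     # vehicle speed (m/s)
    q_ac: float  # air conditioning cooling/heating capacity (W)


def apply_action(state: State, d_p_fc: float, d_q_ac: float,
                 p_fc_max: float = 60e3,
                 q_ac_max: float = 5e3) -> Tuple[float, float]:
    """Add the two action increments to the current set points.

    The results are clipped to [0, max]; both limits are hypothetical.
    """
    p_fc = min(max(state.p_fc + d_p_fc, 0.0), p_fc_max)
    q_ac = min(max(state.q_ac + d_q_ac, 0.0), q_ac_max)
    return p_fc, q_ac
```

Using increments rather than absolute set points as actions limits how fast the fuel cell power and the air conditioning capacity can change from one step to the next.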
S303: To ensure cabin temperature comfort, the temperature in the vehicle cabin is kept at about 24 °C. The reward function therefore also includes an optimization term for the cabin temperature change, and is set as a weighted sum of three indices (hydrogen consumption, SOC change and cabin temperature change), expressed as:
$R = -\left(\zeta \cdot fuel(t) + \psi \cdot (SOC(t) - 0.7)^2 + \gamma \cdot (T_{in} - 24)^2\right)$
where $\zeta$, $\psi$ and $\gamma$ are the weight factors of the optimization terms; adjusting them sets the trade-off between hydrogen consumption and cabin temperature comfort, which is how the multi-objective optimization problem is handled; $fuel(t)$ is the hydrogen consumption at the current time; and $SOC(t)$ is the state of charge of the power battery at the current time.
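The reward can be written directly from the expression above. The default weights of 1.0 are placeholders chosen for this sketch; the patent does not give numerical weight values.

```python
def reward(fuel: float, soc: float, t_in: float,
           zeta: float = 1.0, psi: float = 1.0, gamma: float = 1.0,
           soc_ref: float = 0.7, t_ref: float = 24.0) -> float:
    """R = -(zeta*fuel + psi*(SOC - 0.7)^2 + gamma*(T_in - 24)^2).

    zeta, psi and gamma are the weight factors (placeholder defaults).
    """
    soc_term = (soc - soc_ref) ** 2
    temp_term = (t_in - t_ref) ** 2
    return -(zeta * fuel + psi * soc_term + gamma * temp_term)
```

The reward is never positive and reaches zero only when no hydrogen is consumed, the SOC is 0.7 and the cabin is at 24 °C. Raising `gamma` relative to `zeta` makes the agent favor comfort over hydrogen consumption.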
S304: The multi-objective optimization problem in energy management is solved with the SAC algorithm. SAC introduces the action entropy so that the action output is more dispersed, which improves the exploration capability, the ability to learn new tasks, and the stability of the algorithm. The entropy is expressed as:
$H(\pi(\cdot|s_t)) = -\log \pi(\cdot|s_t)$
where $H$ is the entropy of the policy $\pi(\cdot|s_t)$.
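To illustrate why a larger entropy means more dispersed actions, the sketch below evaluates the log-density and the entropy of a one-dimensional Gaussian policy. Treating the entropy as the expected value of the negative log-density and restricting the policy to one dimension are assumptions made for this example.

```python
import math


def gaussian_log_prob(a: float, mean: float, std: float) -> float:
    """log pi(a|s) for a 1-D Gaussian policy with the given mean and std."""
    var = std * std
    return -0.5 * math.log(2.0 * math.pi * var) - (a - mean) ** 2 / (2.0 * var)


def gaussian_entropy(std: float) -> float:
    """Closed-form E[-log pi(a|s)] of a 1-D Gaussian: 0.5 * ln(2*pi*e*std^2)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * std * std)
```

Doubling the standard deviation raises the entropy by ln 2, so an entropy bonus pushes the policy toward wider action distributions and hence more exploration.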
S305: During solving, the actor network in the agent takes the state $s_t$ as input and outputs the mean and variance of the Gaussian distribution of the action, and the action $a_t$ is generated using the reparameterization technique:
where $\tau_t$ is a noise signal sampled from a standard normal distribution, and the other two quantities in the expression are the mean and the variance of the Gaussian distribution output by the actor network.
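The reparameterization step can be sketched as follows. The form "mean plus standard deviation times noise" is the standard Gaussian reparameterization and is assumed here, since the patent's own expression is not reproduced in the text; any squashing or scaling of the action is omitted.

```python
import math
import random
from typing import Tuple


def sample_action(mean: float, variance: float,
                  rng: random.Random) -> Tuple[float, float]:
    """Draw a_t = mean + sqrt(variance) * tau_t with tau_t ~ N(0, 1).

    Returns the action together with the noise sample that produced it.
    """
    tau = rng.gauss(0.0, 1.0)
    return mean + math.sqrt(variance) * tau, tau
```

Because the randomness enters only through `tau`, the action is a deterministic function of the actor outputs, which lets a gradient be taken with respect to the mean and variance.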
S306: After the action $a_t$ is executed, the vehicle environment returns a reward $r_t$ to the agent and transitions to the next state $s_{t+1}$; the resulting interaction data $(s_t, a_t, r_t, s_{t+1})$ between the environment and the agent are stored in the experience pool.
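A minimal experience pool could look like the sketch below. The class name `ReplayBuffer`, the fixed capacity with oldest-first eviction, and the uniform sampling are assumptions for illustration.

```python
import random
from collections import deque
from typing import Any, List, Optional, Tuple

Transition = Tuple[Any, Any, float, Any]  # (s_t, a_t, r_t, s_{t+1})


class ReplayBuffer:
    """Fixed-size experience pool; the oldest transitions are dropped first."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        self._data: deque = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def store(self, s: Any, a: Any, r: float, s_next: Any) -> None:
        self._data.append((s, a, r, s_next))

    def sample(self, batch_size: int) -> List[Transition]:
        """Draw a mini-batch uniformly at random, without replacement."""
        return self._rng.sample(list(self._data), batch_size)

    def __len__(self) -> int:
        return len(self._data)
```

Sampling at random rather than in time order breaks the correlation between consecutive driving steps before the samples are used for training.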
S307: A mini-batch of experience samples is drawn at random from the experience pool. To avoid overestimation when maximizing the action-value function, and further overestimation when the network computes the target, evaluation critic networks with parameters $\theta_1$, $\theta_2$ and target critic networks with parameters $\theta'_1$, $\theta'_2$ are introduced, and the smaller of the action-value outputs of the two target critic networks is selected as the target value. For a particular state $s_t$ and action $a_t$, the update formula of the soft action-value function $Q_{soft}(s_t, a_t)$ in the SAC algorithm is as follows:
where $r$ is the reward obtained by the vehicle; $\gamma$ is the discount factor; and $\alpha$ is the temperature coefficient.
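The update formula is not reproduced in the text. The sketch below uses the commonly published SAC target, which takes the smaller of two target-critic values and subtracts the temperature-weighted log-probability of the next action. That this matches the patent's exact expression is an assumption, and the default `discount` and `alpha` values are placeholders.

```python
def soft_q_target(r: float, q1_next: float, q2_next: float,
                  log_pi_next: float,
                  discount: float = 0.99, alpha: float = 0.2) -> float:
    """Target value r + discount * (min(Q1', Q2') - alpha * log pi(a'|s')).

    q1_next and q2_next are the two target-critic outputs for the next state
    and a freshly sampled next action; log_pi_next is that action's log-prob.
    """
    return r + discount * (min(q1_next, q2_next) - alpha * log_pi_next)
```

Taking the minimum of the two target critics keeps a single optimistic critic from inflating the target, which is the overestimation safeguard described in S307.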
S308: When the networks are updated, the evaluation critic networks are updated by minimizing the loss function $L(\theta_i)$, defined as the mean square error between the value function of the evaluation critic network and that of the target critic network, expressed as:
where the two value functions are the evaluation function obtained with the evaluation critic network parameters $\theta_i$ and the evaluation function obtained with the target critic network parameters $\theta'_i$, respectively.
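A mean-square-error critic loss over a mini-batch can be sketched as below. The inputs are assumed to be the evaluation critic's predictions and the precomputed target values for the same samples; the function name is illustrative.

```python
from typing import Sequence


def critic_loss(q_pred: Sequence[float], q_target: Sequence[float]) -> float:
    """Mean square error between predicted and target action values."""
    if len(q_pred) != len(q_target) or not q_pred:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((p - t) ** 2 for p, t in zip(q_pred, q_target))
    return total / len(q_pred)
```

In a full implementation this loss would be evaluated once per evaluation critic (i = 1, 2) and minimized with an automatic-differentiation framework, with the targets treated as constants.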
S309: The actor network parameters are updated by minimizing the KL divergence; the smaller the KL value, the smaller the difference between the rewards corresponding to the output actions, and the better the policy converges. The objective function of the actor network is defined as:
where $D_{KL}$ denotes the KL divergence; $Z(s_t)$ is the partition function used to normalize the distribution; the expectation is taken over the vehicle state $s_t$ at the current time and the executed action $a_t$; and the policy function is evaluated at the current state $s_t$ and depends on the policy function parameters.
S310: The actor network parameters are updated by gradient descent, expressed as:
where the two gradient terms are the gradient with respect to the policy function parameters and the gradient with respect to the action $a_t$ executed at the current time $t$.
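The actor objective and its gradient are not reproduced in the text. A commonly published form of the SAC actor objective, obtained from minimizing the KL divergence to the normalized exponential of the soft Q-function, is the expectation of alpha * log pi(a_t|s_t) - Q(s_t, a_t). The sketch below estimates that quantity over a mini-batch; that it matches the patent's exact expression is an assumption.

```python
from typing import Sequence


def actor_loss(log_pis: Sequence[float], q_values: Sequence[float],
               alpha: float) -> float:
    """Mini-batch estimate of E[alpha * log pi(a|s) - Q(s, a)].

    log_pis are log-probabilities of reparameterized actions and q_values
    are the critic's values for the same state-action pairs.
    """
    if len(log_pis) != len(q_values) or not log_pis:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(alpha * lp - q for lp, q in zip(log_pis, q_values))
    return total / len(log_pis)
```

Minimizing this quantity raises the critic's value of the chosen actions while keeping their log-probability low, that is, the entropy high. In practice the gradient would be obtained by automatic differentiation through the reparameterized action.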
S311: In the SAC algorithm, tuning the temperature coefficient $\alpha$ is important to the training performance, and the optimal temperature coefficient differs across reinforcement learning tasks and training stages. To adjust the temperature coefficient automatically, the objective function of an optimization problem is minimized, so that the optimal temperature coefficient is obtained at each update step. The objective function is expressed as:
where $H_0$ is the predefined minimum policy entropy threshold; the expectation is taken over the actions $a_t$ executed under the policy $\pi_t$; $\pi_t(a_t|s_t)$ is the policy function; $s_t$ is the state of the fuel cell vehicle at the current time $t$; and $a_t$ is the action executed according to the policy function at the current time $t$.
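The temperature objective is not reproduced in the text. The sketch below assumes the commonly published SAC form, the expectation of -alpha * (log pi_t(a_t|s_t) + H_0), and takes one plain gradient-descent step on it. The learning rate and the clamp at zero are illustrative choices.

```python
from typing import Sequence


def temperature_loss(alpha: float, log_pis: Sequence[float],
                     h0: float) -> float:
    """Mini-batch estimate of E[-alpha * (log pi(a|s) + H_0)]."""
    return sum(-alpha * (lp + h0) for lp in log_pis) / len(log_pis)


def update_temperature(alpha: float, log_pis: Sequence[float],
                       h0: float, lr: float = 1e-3) -> float:
    """One gradient-descent step on the temperature loss.

    d(loss)/d(alpha) = -(mean(log_pi) + H_0), i.e. the entropy estimate
    minus H_0. The result is clamped at zero (an assumption).
    """
    grad = -(sum(log_pis) / len(log_pis) + h0)
    return max(alpha - lr * grad, 0.0)
```

When the estimated entropy, minus the mean of `log_pis`, exceeds `h0`, the step lowers `alpha` and the entropy bonus weakens; when the entropy falls below `h0`, `alpha` grows and exploration is encouraged again.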
Finally, although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that various changes and modifications may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A learning type collaborative energy management method for a fuel cell automobile considering an air conditioning system is characterized by comprising the following steps:
S1: acquiring vehicle state parameter information, fuel cell parameter information, power battery parameter information and air conditioning system parameter information of the fuel cell vehicle;
S2: establishing a fuel cell vehicle collaborative energy management model, comprising: a whole-vehicle longitudinal dynamics model, a fuel cell model, a power battery model, a motor model, an air conditioning system model and a vehicle cabin thermal load model;
S3: establishing a fuel cell vehicle collaborative energy management optimization control strategy that considers the air conditioning system, solving the multi-objective optimization problem comprising hydrogen consumption economy and cabin temperature comfort with the SAC algorithm, and controlling the change of the air conditioning cooling/heating capacity so as to keep the cabin temperature within a comfort interval while performing energy flow optimization control; the SAC algorithm being the soft actor-critic algorithm.
2. The fuel cell automobile learning-type collaborative energy management method according to claim 1, wherein in step S1, the vehicle state parameter information includes: the vehicle speed, vehicle cabin thermal load parameters, motor operating efficiency and transmission system characteristic parameters; the fuel cell parameter information includes: the power, efficiency and hydrogen consumption of the fuel cell; the power battery parameter information includes: the state of charge, internal resistance and open-circuit voltage of the power battery; and the air conditioning system parameter information includes: the cooling/heating capacity of the air conditioning system and the corresponding power.
3. The fuel cell automobile learning-type collaborative energy management method according to claim 1, wherein in step S2, the established whole-vehicle longitudinal dynamics model is:
$P_{drive} = (F_{air} + F_f + F_i + m_0 a) \cdot v$
$P_{dem} = P_b + P_{fc} \cdot \eta_{DC/DC}$
wherein $m_0$ is the vehicle mass; $v$ is the vehicle speed; $a$ is the vehicle acceleration; $F_{air}$ is the air resistance; $F_f$ is the rolling resistance; $F_i$ is the acceleration resistance; $\eta_m$, $\eta_{DC/AC}$, $\eta_{DC/DC}$ and $\eta_{motor}$ are respectively the transmission efficiency, the DC/AC converter efficiency, the DC/DC converter efficiency and the motor efficiency; and $P_{drive}$, $P_{dem}$, $P_b$ and $P_{fc}$ are respectively the driving power at the vehicle wheels, the required power, the battery output power and the fuel cell output power.
4. The fuel cell vehicle learning-type collaborative energy management method according to claim 3, wherein in step S2, the fuel cell model is established as:
$\eta_{fc} = f_{\eta}(P_{fc})$
5. The fuel cell vehicle learning-type collaborative energy management method according to claim 3, wherein in step S2, the power battery model is established as:
wherein $I_L$ is the power battery current; $V_{oc}$ is the open-circuit voltage of the power battery; $R_{in}$ is the equivalent internal resistance of the power battery; $SOC_0$ is the initial SOC; $Q_t$ is the maximum capacity of the power battery; $t_0$ is the initial time; and $t_f$ is the final time.
6. The fuel cell vehicle learning-type collaborative energy management method according to claim 3, wherein in step S2, the motor model is established as:
$\eta_m = f_m(\omega_m, T_m)$
wherein $\omega_m$ and $T_m$ are respectively the rotational speed and the torque of the motor; $P_m$ is the motor output power; and $f_m(\cdot)$ is the fitting function of the motor operating efficiency, the motor operating efficiency being obtained by interpolation.
7. The fuel cell vehicle learning-type collaborative energy management method according to claim 1, wherein in step S2, the air conditioning system model is established as follows:
wherein $Q_{ac}$ is the cooling or heating capacity of the air conditioning system; $P_{ac}$ is the corresponding power consumption of the air conditioning system; and $\eta_{cop}$ is the coefficient of performance of the air conditioning system.
8. The fuel cell vehicle learning-type collaborative energy management method according to claim 1, wherein in step S2, the vehicle cabin thermal load model is established as follows:
$Q_c = \sum K F (T_{out} - T_{in})$
$Q_h = 145 + 116n$
$Q_n = m_e \xi Cp_{air} (T_{out} - T_{in})$
wherein $Q_c$, $Q_r$, $Q_h$ and $Q_n$ are respectively the heat conduction load, the radiant heat load, the heat generated by the vehicle occupants, and the ventilation system heat load; $K$ is the heat transfer coefficient; $F$ is the heat transfer area of the respective enclosure surface; $T_{out}$ is the ambient temperature; $T_{in}$ is the cabin air temperature; $\eta$ is the permeability; $I$ is the sunlight intensity; $A_i$ is the area of the windshield, the left and right side windows, and the rear window; $\theta_i$ is the sunlight incident angle; $\beta$ is the shading factor; $n$ is the number of passengers in the vehicle; $m_e$ is the mass of air passing through the evaporator; $\xi$ is the air recirculation coefficient; $Cp_{air}$ is the heat capacity of the cabin air; and $\rho_{air}$ and $V_{air}$ are respectively the air density in the cabin and the cabin volume.
9. The fuel cell vehicle learning-type collaborative energy management method according to claim 1, wherein in step S3, establishing the fuel cell vehicle collaborative energy management optimization control strategy that considers the air conditioning system specifically comprises the following steps:
S301: determining the state space: the SOC of the power battery, the fuel cell output power $P_{fc}$, the vehicle speed $v$ and the cooling/heating capacity $Q_{ac}$ of the air conditioning system are set as state variables, and the state space $S$ is constructed as:
$S = \{SOC, P_{fc}, v, Q_{ac}\}$
S302: determining the action space: the variation of the fuel cell output power and the variation of the air conditioning cooling/heating capacity are set as action variables, and the action space $A$ is constructed from these two variations;
S303: establishing the reward function: the reward function $R$ is set as a weighted sum of three indices, namely hydrogen consumption, SOC change and cabin temperature change, expressed as:
$R = -\left(\zeta \cdot fuel(t) + \psi \cdot (SOC(t) - 0.7)^2 + \gamma \cdot (T_{in} - 24)^2\right)$
wherein $\zeta$, $\psi$ and $\gamma$ are the weight factors of the optimization terms, and the trade-off between hydrogen consumption and cabin temperature comfort is set by adjusting the weight factors, thereby solving the multi-objective optimization problem; $fuel(t)$ is the hydrogen consumption at the current time; and $SOC(t)$ is the state of charge of the power battery at the current time.
10. The fuel cell automobile learning-type collaborative energy management method according to claim 9, wherein in step S3, solving the multi-objective optimization problem comprising hydrogen consumption economy and cabin temperature comfort with the SAC algorithm specifically comprises the following steps:
S311: the multi-objective optimization problem in energy management is solved with the SAC algorithm, in which the action entropy is introduced so that the action output is more dispersed, thereby improving the exploration capability, the ability to learn new tasks, and the stability of the algorithm, the entropy being expressed as:
$H(\pi(\cdot|s_t)) = -\log \pi(\cdot|s_t)$
wherein $H$ is the entropy of the policy $\pi(\cdot|s_t)$;
S312: during solving, the actor network in the agent takes the state $s_t$ as input and outputs the mean and variance of the Gaussian distribution of the action, and the action $a_t$ is generated using the reparameterization technique:
wherein $\tau_t$ is a noise signal sampled from a standard normal distribution, and the other two quantities in the expression are the mean and the variance of the Gaussian distribution output by the actor network;
S313: after the action $a_t$ is executed, the vehicle environment returns a reward $r_t$ to the agent and transitions to the next state $s_{t+1}$, and the resulting interaction data $(s_t, a_t, r_t, s_{t+1})$ between the environment and the agent are stored in the experience pool;
S314: a mini-batch of experience samples is drawn at random from the experience pool; evaluation critic networks with parameters $\theta_1$, $\theta_2$ and target critic networks with parameters $\theta'_1$, $\theta'_2$ are introduced, and the smaller of the action-value outputs of the two target critic networks is selected as the target value; for a particular state $s_t$ and action $a_t$, the update formula of the soft action-value function $Q_{soft}(s_t, a_t)$ in the SAC algorithm is as follows:
wherein $r$ is the reward obtained by the vehicle; $\gamma$ is the discount factor; and $\alpha$ is the temperature coefficient;
S315: when the networks are updated, the evaluation critic networks are updated by minimizing the loss function $L(\theta_i)$, the loss function being defined as the mean square error between the value function of the evaluation critic network and that of the target critic network, expressed as:
wherein the two value functions are the evaluation function obtained with the evaluation critic network parameters $\theta_i$ and the evaluation function obtained with the target critic network parameters $\theta'_i$, respectively;
S316: the actor network parameters are updated by minimizing the KL divergence; the objective function of the actor network is defined as:
wherein $D_{KL}$ denotes the KL divergence; $Z(s_t)$ is the partition function used to normalize the distribution; the expectation is taken over the vehicle state $s_t$ at the current time and the executed action $a_t$; and the policy function is evaluated at the current state $s_t$ and depends on the policy function parameters;
S317: the actor network parameters are updated by gradient descent, expressed as:
wherein the two gradient terms are the gradient with respect to the policy function parameters and the gradient with respect to the action $a_t$ executed at the current time $t$;
S318: the objective function of an optimization problem is minimized so that the optimal temperature coefficient is obtained at each update step, the objective function being expressed as:
wherein $H_0$ is the predefined minimum policy entropy threshold; the expectation is taken over the actions $a_t$ executed under the policy $\pi_t$; $\pi_t(a_t|s_t)$ is the policy function; $s_t$ is the state of the fuel cell vehicle at the current time $t$; and $a_t$ is the action executed according to the policy function at the current time $t$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211385462.0A CN115503559B (en) | 2022-11-07 | 2022-11-07 | Fuel cell automobile learning type cooperative energy management method considering air conditioning system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115503559A (en) | 2022-12-23 |
CN115503559B (en) | 2023-05-02 |
Family
ID=84512880
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211385462.0A Active CN115503559B (en) | 2022-11-07 | 2022-11-07 | Fuel cell automobile learning type cooperative energy management method considering air conditioning system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115503559B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||