CN115503559A - Learning type collaborative energy management method for fuel cell automobile considering air conditioning system - Google Patents

Publication number
CN115503559A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211385462.0A
Other languages
Chinese (zh)
Other versions
CN115503559B (en)
Inventor
唐小林
邓磊
甘炯鹏
朱和龙
胡晓松
李佳承
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Chongqing University
Priority to CN202211385462.0A
Publication of CN115503559A
Application granted
Publication of CN115503559B
Legal status: Active

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60L: PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
    • B60L58/00: Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles
    • B60L58/30: Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles for monitoring or controlling fuel cells
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60H: ARRANGEMENTS OF HEATING, COOLING, VENTILATING OR OTHER AIR-TREATING DEVICES SPECIALLY ADAPTED FOR PASSENGER OR GOODS SPACES OF VEHICLES
    • B60H1/00: Heating, cooling or ventilating [HVAC] devices
    • B60H1/00357: Air-conditioning arrangements specially adapted for particular vehicles
    • B60H1/00385: Air-conditioning arrangements specially adapted for particular vehicles for vehicles having an electrical drive, e.g. hybrid or fuel cell
    • B60H1/00392: Air-conditioning arrangements specially adapted for particular vehicles for vehicles having an electrical drive, e.g. hybrid or fuel cell for electric vehicles having only electric drive means
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60H: ARRANGEMENTS OF HEATING, COOLING, VENTILATING OR OTHER AIR-TREATING DEVICES SPECIALLY ADAPTED FOR PASSENGER OR GOODS SPACES OF VEHICLES
    • B60H1/00: Heating, cooling or ventilating [HVAC] devices
    • B60H1/32: Cooling devices
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60L: PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
    • B60L58/00: Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles
    • B60L58/10: Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles for monitoring or controlling batteries
    • B60L58/12: Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles for monitoring or controlling batteries responding to state of charge [SoC]

Landscapes

  • Engineering & Computer Science (AREA)
  • Mechanical Engineering (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Sustainable Development (AREA)
  • Sustainable Energy (AREA)
  • Physics & Mathematics (AREA)
  • Thermal Sciences (AREA)
  • Power Engineering (AREA)
  • Transportation (AREA)
  • Fuel Cell (AREA)

Abstract

The invention relates to a learning-based collaborative energy management method for a fuel cell vehicle that takes the air conditioning system into account, and belongs to the field of new energy vehicles. The method comprises the following steps. S1: acquiring vehicle state parameter information, fuel cell parameter information, power battery parameter information and air conditioning system parameter information of the fuel cell vehicle. S2: establishing a collaborative energy management model of the fuel cell vehicle. S3: establishing a collaborative energy management optimization control strategy that considers the air conditioning system, solving a multi-objective optimization problem covering hydrogen consumption economy and cabin temperature comfort with the soft actor-critic (SAC) algorithm, and adjusting the cooling/heating capacity of the air conditioner so that the cabin temperature stays within a comfort interval while the energy flow is optimized. The invention addresses the trade-off between hydrogen consumption and cabin temperature comfort, and optimizes both the hydrogen consumption economy and the cabin temperature comfort of the fuel cell vehicle.

Description

Learning type collaborative energy management method for fuel cell automobile considering air conditioning system
Technical Field
The invention belongs to the field of new energy automobiles, and relates to a learning type collaborative energy management method for a fuel cell automobile considering an air conditioning system.
Background
Facing increasingly severe environmental pollution and fossil fuel shortages, automobile manufacturers are striving to develop new energy vehicles. With the development of fuel cell technology, fuel cell vehicles offer zero emissions, low energy consumption and long driving range, and are regarded as an important research direction for the sustainable development of vehicles. The energy management strategy is the core control technology of the multi-power-source system of a fuel cell vehicle, and its performance directly determines the economy of the whole vehicle. In current research, energy management methods fall into three types: rule-based, optimization-based and learning-based strategies. Rule-based and optimization-based methods, however, cannot achieve real-time operation and optimality at the same time. Conventional deep reinforcement learning algorithms can achieve both, but they have shortcomings with respect to training data and hyperparameter settings. The soft actor-critic (SAC) algorithm offers a way to address these problems.
On the other hand, the air conditioning system is an indispensable auxiliary device for a fuel cell vehicle, and contributes to providing a comfortable riding environment for passengers in the vehicle. However, the use of the air conditioning system inevitably increases the energy consumption of the fuel cell vehicle, thereby affecting the economic performance of the entire vehicle. In the current research on the energy management method of the fuel cell automobile, the energy consumption of the air conditioning system is generally regarded as a fixed value or ignored. However, due to the change of driving environment, the heat exchange quantity inside and outside the cab changes, and the power used by the air conditioning system changes.
Therefore, a new energy management method for a fuel cell vehicle is needed to coordinate and control the air conditioning system and the power source components, and to optimize the energy flow in the vehicle while considering the energy consumption variation of the air conditioning system.
Disclosure of Invention
In view of the above, the present invention provides a learning-based collaborative energy management method for a fuel cell vehicle that considers the air conditioning system. By applying the soft actor-critic (SAC) algorithm, the method coordinates the control of the air conditioning system and the power source components of the fuel cell vehicle, optimizing the energy flow of the whole vehicle while ensuring cabin comfort and thereby reducing the overall energy consumption of the vehicle.
In order to achieve the purpose, the invention provides the following technical scheme:
A learning-based collaborative energy management method for a fuel cell vehicle considering an air conditioning system specifically comprises the following steps:
S1: acquiring vehicle state parameter information, fuel cell parameter information, power battery parameter information and air conditioning system parameter information of the fuel cell vehicle;
S2: establishing a collaborative energy management model of the fuel cell vehicle, comprising a vehicle longitudinal dynamics model, a fuel cell model, a power battery model, a motor model, an air conditioning system model and a cabin thermal load model;
S3: establishing a collaborative energy management optimization control strategy for the fuel cell vehicle that considers the air conditioning system, solving a multi-objective optimization problem covering hydrogen consumption economy and cabin temperature comfort with the SAC algorithm, and adjusting the cooling/heating capacity of the air conditioner so that the cabin temperature stays within a comfort interval while the energy flow is optimized; the SAC algorithm is the soft actor-critic algorithm.
Further, in step S1, the vehicle state parameter information includes the vehicle speed, cabin thermal load parameters, motor operating efficiency and transmission system characteristic parameters; the fuel cell parameter information includes the power, efficiency and hydrogen consumption of the fuel cell; the power battery parameter information includes the state of charge (SOC), internal resistance and open-circuit voltage of the power battery; and the air conditioning system parameter information includes the cooling/heating capacity of the air conditioning system and the corresponding power.
Further, in step S2, the established vehicle longitudinal dynamics model is:
$$P_{drive} = (F_{air} + F_f + F_i + m_0 a)\cdot v$$
Figure BDA0003929529130000021
$$P_{dem} = P_b + P_{fc}\cdot\eta_{DC/DC}$$
where $m_0$ is the vehicle mass; $v$ is the vehicle speed; $a$ is the vehicle acceleration; $F_{air}$ is the air resistance; $F_f$ is the rolling resistance; $F_i$ is the acceleration resistance; $\eta_m$, $\eta_{DC/AC}$, $\eta_{DC/DC}$ and $\eta_{motor}$ are the transmission efficiency, DC/AC converter efficiency, DC/DC converter efficiency and motor efficiency, respectively; and $P_{drive}$, $P_{dem}$, $P_b$ and $P_{fc}$ are the driving power at the wheels, the required power, the battery output power and the fuel cell output power, respectively.
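The two relations that survive in text form can be illustrated with a short Python sketch. It evaluates the wheel drive power from given resistance forces and solves $P_{dem} = P_b + P_{fc}\cdot\eta_{DC/DC}$ for the battery power. The function names and all numeric values are hypothetical; the source gives no formulas for the individual resistances, so they are passed in as inputs.

```python
def drive_power(f_air, f_f, f_i, m0, a, v):
    """Wheel drive power P_drive = (F_air + F_f + F_i + m0*a) * v, in W."""
    return (f_air + f_f + f_i + m0 * a) * v


def battery_power(p_dem, p_fc, eta_dcdc):
    """Solve P_dem = P_b + P_fc * eta_dcdc for the battery power P_b, in W."""
    return p_dem - p_fc * eta_dcdc


# Hypothetical operating point (illustrative numbers only).
p_drive = drive_power(f_air=300.0, f_f=200.0, f_i=0.0, m0=1800.0, a=0.5, v=20.0)
p_b = battery_power(p_dem=30000.0, p_fc=20000.0, eta_dcdc=0.95)
```

A negative `p_b` in this sketch would mean the fuel cell delivers more than the demand, so the surplus charges the battery.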
Further, in step S2, the established fuel cell model is:
$$\eta_{fc} = f_\eta(P_{fc})$$
Figure BDA0003929529130000022
where $f_\eta(\cdot)$ and [Figure BDA0003929529130000023] are the fitting functions for the efficiency and the hydrogen consumption, respectively; both quantities can be calculated by interpolation.
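As an illustration of the interpolation mentioned here, the following Python sketch evaluates efficiency and hydrogen flow from lookup tables by piecewise-linear interpolation. The table values, units and function names are assumptions made for the example; the source does not publish the fitted maps.

```python
from bisect import bisect_right

# Hypothetical lookup tables (not from the source): fuel cell power in kW,
# efficiency as a fraction, hydrogen mass flow in g/s.
P_FC_KW = [0.0, 10.0, 20.0, 40.0, 60.0]
ETA_FC = [0.0, 0.50, 0.55, 0.50, 0.45]
H2_G_PER_S = [0.0, 0.17, 0.30, 0.67, 1.11]


def interp(x, xs, ys):
    """Piecewise-linear interpolation, clamped to the table ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


def fuel_cell_model(p_fc_kw):
    """Return (efficiency, hydrogen flow) for a fuel cell output power."""
    return interp(p_fc_kw, P_FC_KW, ETA_FC), interp(p_fc_kw, P_FC_KW, H2_G_PER_S)


eta_fc, h2_rate = fuel_cell_model(30.0)
```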
Further, in step S2, the power battery model established is:
Figure BDA0003929529130000024
Figure BDA0003929529130000031
where $I_L$ is the power battery current; $V_{oc}$ is the open-circuit voltage of the power battery; $R_{in}$ is the equivalent internal resistance of the power battery; $SOC_0$ is the initial SOC; $Q_t$ is the maximum capacity of the power battery; $t_0$ is the initial time; and $t_f$ is the final time.
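The battery equations appear only as figure references in this text, so the sketch below assumes the widely used internal-resistance (Rint) model, which is consistent with the listed symbols but is not confirmed by the source: the current follows from $P_b = V_{oc} I_L - R_{in} I_L^2$, and the SOC is obtained by integrating the current over time (coulomb counting). Function names and pack values are hypothetical.

```python
import math


def battery_current(p_b, v_oc, r_in):
    """Current I_L (A) from the Rint relation P_b = V_oc*I_L - R_in*I_L**2."""
    disc = v_oc ** 2 - 4.0 * r_in * p_b
    if disc < 0.0:
        raise ValueError("requested power exceeds the battery's capability")
    return (v_oc - math.sqrt(disc)) / (2.0 * r_in)


def soc_step(soc, i_l, q_t_ah, dt_s):
    """One Euler step of SOC = SOC_0 - (1/Q_t) * integral(I_L dt)."""
    return soc - i_l * dt_s / (q_t_ah * 3600.0)


# Hypothetical pack: 350 V open-circuit voltage, 0.1 ohm, 40 Ah.
i_l = battery_current(p_b=10000.0, v_oc=350.0, r_in=0.1)
soc_next = soc_step(soc=0.7, i_l=i_l, q_t_ah=40.0, dt_s=1.0)
```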
Further, in step S2, the established motor model is:
$$\eta_m = f_m(\omega_m, T_m)$$
Figure BDA0003929529130000032
where $\omega_m$ and $T_m$ are the rotational speed and torque of the motor, respectively; $P_m$ is the motor output power; and $f_m(\cdot)$ is the fitting function for the motor operating efficiency, which can be obtained by interpolation.
Further, in step S2, the air conditioning system model is established as follows:
Figure BDA0003929529130000033
where $Q_{ac}$ is the cooling or heating capacity of the air conditioning system; $P_{ac}$ is the corresponding power consumption of the air conditioning system; and $\eta_{cop}$ is the coefficient of performance of the air conditioning system.
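The model equation itself is an image in this text. Assuming it follows the usual definition of the coefficient of performance (thermal capacity delivered per unit of electrical power), the relation can be sketched in Python as below. The function name and numbers are hypothetical.

```python
def ac_power(q_ac_w, eta_cop):
    """Electrical power P_ac (W) needed to deliver thermal capacity Q_ac (W).

    Assumes the standard definition eta_cop = Q_ac / P_ac.
    """
    if eta_cop <= 0.0:
        raise ValueError("coefficient of performance must be positive")
    return abs(q_ac_w) / eta_cop


# Hypothetical operating point: 3 kW of cooling at a COP of 2.5.
p_ac = ac_power(q_ac_w=3000.0, eta_cop=2.5)
```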
Further, in step S2, the established cabin thermal load model is:
$$Q_c = \sum K F (T_{out} - T_{in})$$
Figure BDA0003929529130000034
$$Q_h = 145 + 116n$$
$$Q_n = m_e \xi Cp_{air} (T_{out} - T_{in})$$
Figure BDA0003929529130000035
where $Q_c$, $Q_r$, $Q_h$ and $Q_n$ are the heat conduction load, the radiant heat load, the heat generated by the occupants (empirically, about 145 W for the driver and about 116 W for each passenger) and the ventilation heat load, respectively; $K$ is the heat transfer coefficient; $F$ is the heat transfer area of the corresponding enclosure surface; $T_{out}$ is the ambient temperature; $T_{in}$ is the cabin air temperature; $\eta$ is the transmittance; $I$ is the sunlight intensity; $A_i$ is the area of the windshield, the left and right side windows, and the rear window; $\theta_i$ is the sunlight incidence angle; $\beta$ is the shading factor; $n$ is the number of passengers in the vehicle; $m_e$ is the mass of air passing through the evaporator; $\xi$ is the air recirculation coefficient; $Cp_{air}$ is the heat capacity of the cabin air; and $\rho_{air}$ and $V_{air}$ are the air density in the cabin and the cabin volume, respectively.
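The three load terms that survive in text form can be evaluated with a short Python sketch. The surface list, the numeric values and the treatment of $m_e$ as a mass flow rate in kg/s are assumptions for illustration. The radiant load and the cabin temperature dynamics are omitted because their equations are not recoverable from this text.

```python
def conduction_load(surfaces, t_out, t_in):
    """Q_c = sum(K * F) * (T_out - T_in); surfaces is a list of (K, F) pairs."""
    return sum(k * f for k, f in surfaces) * (t_out - t_in)


def occupant_load(n_passengers):
    """Q_h = 145 + 116 * n: driver plus n passengers, in W."""
    return 145.0 + 116.0 * n_passengers


def ventilation_load(m_e, xi, cp_air, t_out, t_in):
    """Q_n = m_e * xi * Cp_air * (T_out - T_in)."""
    return m_e * xi * cp_air * (t_out - t_in)


# Hypothetical cabin: (K in W/(m^2*K), F in m^2) for roof, doors and floor.
surfaces = [(2.0, 2.0), (3.0, 3.0), (1.0, 4.0)]
q_c = conduction_load(surfaces, t_out=35.0, t_in=24.0)
q_h = occupant_load(n_passengers=2)
q_n = ventilation_load(m_e=0.05, xi=0.5, cp_air=1005.0, t_out=35.0, t_in=24.0)
```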
Further, in step S3, establishing the collaborative energy management optimization control strategy for the fuel cell vehicle considering the air conditioning system specifically comprises the following steps:
S301: determining the state space: to reflect key environmental information, the power battery SOC, the fuel cell output power $P_{fc}$, the vehicle speed $v$ and the cooling/heating capacity $Q_{ac}$ of the air conditioning system are set as state variables to construct the state space $S$, which can be expressed as:
$$S = \{SOC, P_{fc}, v, Q_{ac}\}$$
S302: determining the action space: collaborative energy management that considers the air conditioning system must not only distribute power among the power sources but also maintain the thermal comfort of the cabin by changing the cooling/heating capacity of the air conditioning system. To this end, the change in fuel cell output power [Figure BDA0003929529130000047] and the change in air conditioning cooling/heating capacity [Figure BDA0003929529130000046] are set as action variables to construct the action space $A$, which can be expressed as:
Figure BDA0003929529130000041
S303: establishing the reward function: to ensure cabin temperature comfort, the cabin temperature is to be maintained at about 24 °C, so the reward function also includes a cabin temperature term. The reward function $R$ is set as the weighted sum of three indices, namely hydrogen consumption, SOC deviation and cabin temperature deviation:
$$R = -\left(\zeta\cdot fuel(t) + \psi\cdot(SOC(t) - 0.7)^2 + \gamma\cdot(T_{in} - 24)^2\right)$$
where $\zeta$, $\psi$ and $\gamma$ are the weight factors of the optimization terms; adjusting them sets the trade-off between hydrogen consumption and cabin temperature comfort and thereby solves the multi-objective optimization problem; $fuel(t)$ is the hydrogen consumption at the current time; and $SOC(t)$ is the state of charge of the power battery at the current time.
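The reward function can be written directly as a small Python function. The default weight values below are hypothetical placeholders, since the source does not state the weights it uses; the reference values 0.7 and 24 come from the formula.

```python
def reward(fuel, soc, t_in, zeta=1.0, psi=100.0, gamma=0.1,
           soc_ref=0.7, t_ref=24.0):
    """R = -(zeta*fuel + psi*(SOC - 0.7)**2 + gamma*(T_in - 24)**2)."""
    return -(zeta * fuel + psi * (soc - soc_ref) ** 2 + gamma * (t_in - t_ref) ** 2)


# No hydrogen used, SOC and cabin temperature exactly on target.
r_ideal = reward(fuel=0.0, soc=0.7, t_in=24.0)
# Some hydrogen used, SOC slightly low, cabin 3 degrees too warm.
r_warm = reward(fuel=0.3, soc=0.65, t_in=27.0)
```

Raising `gamma` relative to `zeta` makes the agent spend more hydrogen to hold the cabin near 24 °C, which is the trade-off the weight factors control.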
Further, in step S3, solving the multi-objective optimization problem covering hydrogen consumption economy and cabin temperature comfort with the SAC algorithm specifically comprises the following steps:
S311: the multi-objective optimization problem in energy management is solved with the SAC algorithm. SAC introduces the action entropy so that the action output is more dispersed, which improves the exploration ability, the ability to learn new tasks and the stability of the algorithm. The entropy is expressed as:
$$H(\pi(\cdot|s_t)) = -\log\pi(\cdot|s_t)$$
where $H$ is the entropy of the policy $\pi(\cdot|s_t)$.
S312: during the solution process, the actor network in the agent takes the state $s_t$ as input, outputs the mean and variance of the Gaussian distribution of the action, and generates the action $a_t$ using the reparameterization technique:
Figure BDA0003929529130000042
where $\tau_t$ is a noise signal sampled from a standard normal distribution; [Figure BDA0003929529130000043] denotes the mean and variance output by the function; and [Figure BDA0003929529130000044] and [Figure BDA0003929529130000045] are the mean and variance of the Gaussian distribution, respectively.
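A minimal Python sketch of reparameterized sampling follows, assuming the common form $a_t = \mu + \sigma\cdot\tau_t$ with $\tau_t \sim N(0, 1)$. The equation in the source is an image, so this form, the use of a standard deviation rather than a variance, and all names and values are assumptions. Practical SAC implementations usually also squash the action with tanh, which is left out here.

```python
import math
import random


def sample_action(mean, std, rng):
    """Reparameterized sample a = mean + std * tau, with tau ~ N(0, 1)."""
    tau = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + s * t for m, s, t in zip(mean, std, tau)]


def gaussian_log_prob(action, mean, std):
    """Log-density of a diagonal Gaussian, summed over action dimensions."""
    return sum(
        -0.5 * ((a - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        for a, m, s in zip(action, mean, std)
    )


rng = random.Random(0)
mean, std = [0.5, -0.2], [0.1, 0.3]  # hypothetical actor outputs for two actions
a_t = sample_action(mean, std, rng)
log_pi = gaussian_log_prob(a_t, mean, std)
```

Because the noise is drawn separately from the network outputs, gradients can flow through `mean` and `std` when this is implemented in an automatic-differentiation framework.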
S313: after action $a_t$ is executed, the vehicle environment feeds back a reward $r_t$ to the agent and transitions to the next state $s_{t+1}$. The resulting interaction data $(s_t, a_t, r_t, s_{t+1})$ between the environment and the agent are stored in the experience pool [Figure BDA0003929529130000051].
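An experience pool of this kind is commonly implemented as a fixed-size replay buffer. The Python sketch below is one such implementation under assumed details: the class name, the capacity and the example transitions are hypothetical and not taken from the source.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size experience pool of (s_t, a_t, r_t, s_next) transitions."""

    def __init__(self, capacity, seed=None):
        self._data = deque(maxlen=capacity)  # oldest entries are dropped first
        self._rng = random.Random(seed)

    def store(self, s, a, r, s_next):
        self._data.append((s, a, r, s_next))

    def sample(self, batch_size):
        """Draw a mini-batch uniformly at random without replacement."""
        return self._rng.sample(list(self._data), batch_size)

    def __len__(self):
        return len(self._data)


buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    # Hypothetical transition: state = (SOC, P_fc, v, Q_ac), action = (dP_fc, dQ_ac)
    state = (0.7, 20.0, float(step), 2.0)
    next_state = (0.7, 20.1, step + 1.0, 2.0)
    buffer.store(state, (0.1, 0.0), -1.0, next_state)
batch = buffer.sample(2)
```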
S314: randomly extracting a small batch of experience samples from an experience pool, and introducing a parameter theta to avoid overestimation when the function value of the action state is maximized and further overestimation when the target is calculated by utilizing the network of the user 12 Is the evaluation critic network and the parameter is θ' 1 ,θ′ 2 The target critic network selects the target critic network to output a smaller action state function value as a target value; for a particular state s t And action a t Soft constrained action value function Q in SAC algorithm soft (s t ,a t ) The update formula is as follows:
Figure BDA0003929529130000052
wherein r represents a reward earned by the vehicle; gamma represents a discount factor; α represents a temperature coefficient.
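The update formula is an image in this text. The Python sketch below uses the target from the standard SAC formulation, $y = r + \gamma\,(\min(Q_{\theta'_1}, Q_{\theta'_2}) - \alpha \log\pi(a_{t+1}|s_{t+1}))$, which matches the symbols listed here but is an assumption about the exact expression in the source. The numeric inputs are hypothetical.

```python
def soft_q_target(r, gamma, alpha, q1_next, q2_next, log_pi_next):
    """y = r + gamma * (min(Q1', Q2') - alpha * log pi(a'|s')) for one sample."""
    return r + gamma * (min(q1_next, q2_next) - alpha * log_pi_next)


# Hypothetical values: the two target critics disagree, so the smaller one is used.
y = soft_q_target(r=-1.0, gamma=0.99, alpha=0.2,
                  q1_next=-10.0, q2_next=-12.0, log_pi_next=-1.5)
```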
S315: during the update, the evaluation critic networks are updated by minimizing the loss function $L(\theta_i)$, which is defined as the mean square error between [Figure BDA0003929529130000053] and [Figure BDA0003929529130000054], expressed as:
Figure BDA0003929529130000055
Figure BDA0003929529130000056
where [Figure BDA0003929529130000057] is the evaluation function of the evaluation critic network with parameters $\theta_i$, and [Figure BDA0003929529130000058] is the evaluation function of the target critic network with parameters $\theta'_i$.
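The mean-square-error loss over a mini-batch can be sketched in Python as follows. The batch values are hypothetical, and some formulations include an extra factor of 1/2, which only rescales the gradient.

```python
def critic_loss(q_pred, q_target):
    """Mean squared error between critic predictions and soft Q targets."""
    if len(q_pred) != len(q_target):
        raise ValueError("batch sizes differ")
    return sum((q - y) ** 2 for q, y in zip(q_pred, q_target)) / len(q_pred)


# Hypothetical mini-batch of two samples.
loss = critic_loss(q_pred=[-11.0, -9.0], q_target=[-12.0, -9.0])
```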
S316: the actor network parameters are updated by minimizing the KL divergence; the smaller the KL value, the smaller the difference between the rewards corresponding to the output actions and the better the policy converges. The objective function of the actor network [Figure BDA0003929529130000059] is defined as:
Figure BDA00039295291300000510
where $D_{KL}$ denotes the KL divergence; $Z(s_t)$ is the partition function that normalizes the distribution; [Figure BDA00039295291300000511] denotes the mathematical expectation when the vehicle is in state $s_t$ at the current time and executes action $a_t$; [Figure BDA00039295291300000512] denotes the policy function when the current state is $s_t$; and [Figure BDA00039295291300000513] denotes the parameters of the policy function.
S317: the actor network parameters are updated by gradient descent, expressed as:
Figure BDA00039295291300000514
where [Figure BDA00039295291300000515] denotes the gradient with respect to the policy function parameters [Figure BDA00039295291300000516], and [Figure BDA00039295291300000517] denotes the gradient with respect to the action $a_t$ executed at the current time $t$.
S318: in SAC, tuning the temperature coefficient $\alpha$ is important for the training result, and the optimal temperature coefficient differs across reinforcement learning tasks and training stages. To adjust the temperature coefficient automatically, the minimum of an objective function is sought at each step, which yields the updated optimal temperature coefficient. The objective function is expressed as:
Figure BDA0003929529130000061
where $H_0$ is a predefined threshold for the minimum policy entropy; [Figure BDA0003929529130000062] denotes the mathematical expectation when action $a_t$ is executed according to the policy $\pi_t$; $\pi_t(a_t|s_t)$ is the policy function; $s_t$ is the state of the fuel cell vehicle at the current time $t$; and $a_t$ is the action executed according to the policy function at the current time $t$.
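The objective function is an image in this text. The Python sketch below assumes the standard SAC temperature objective, $J(\alpha) = E[-\alpha\log\pi_t(a_t|s_t) - \alpha H_0]$, and applies one plain gradient-descent step to $\alpha$. The objective form, the learning rate and the batch values are assumptions, not details confirmed by the source.

```python
def alpha_loss(alpha, log_pi_batch, h0):
    """J(alpha) = mean(-alpha * log_pi - alpha * H0) over a mini-batch."""
    return sum(-alpha * lp - alpha * h0 for lp in log_pi_batch) / len(log_pi_batch)


def alpha_step(alpha, log_pi_batch, h0, lr=0.01):
    """One gradient-descent step on J(alpha), keeping alpha positive."""
    grad = sum(-lp - h0 for lp in log_pi_batch) / len(log_pi_batch)
    return max(alpha - lr * grad, 1e-8)


# Hypothetical batch: the entropy estimate (mean of -log_pi = 0.5) is below H0 = 1.0.
alpha_new = alpha_step(alpha=0.2, log_pi_batch=[-0.4, -0.6], h0=1.0)
```

In this example the policy entropy is below the threshold, so the step increases `alpha` and entropy is weighted more heavily in later updates; when the entropy exceeds the threshold, `alpha` shrinks instead.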
The beneficial effects of the invention are as follows:
1) The invention designs an energy management strategy based on the soft actor-critic algorithm, which effectively removes the dependence of conventional deep reinforcement learning algorithms on training data and hyperparameter settings in fuel cell vehicle energy management, and helps improve the stability of control tasks in a continuous action space.
2) Because changes in air conditioning energy consumption are usually ignored when the energy management problem of a fuel cell vehicle is formulated, the invention builds a collaborative energy management optimization control framework that considers the air conditioning system, taking hydrogen consumption, SOC maintenance and cabin temperature comfort as optimization objectives, and thereby achieves coordinated control of energy management and the air conditioning system.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of a fuel cell vehicle collaborative energy management method of the present invention;
FIG. 2 is a schematic structural diagram of a multi-power-source system of a fuel cell vehicle;
FIG. 3 is a schematic diagram of a cabin thermal load model and an air conditioning system configuration;
FIG. 4 is a diagram of the collaborative energy management framework considering the air conditioning system, built by applying the SAC algorithm in the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
Referring to FIG. 1 to FIG. 4, the collaborative energy management optimization method for a fuel cell vehicle considering the air conditioning system is designed on the basis of the soft actor-critic algorithm. Because changes in air conditioning energy consumption are usually ignored in fuel cell vehicle energy management, the main factors influencing cabin temperature comfort are analyzed, and an air conditioning system model and a cabin thermal load model are established. Taking hydrogen consumption, SOC maintenance and cabin temperature as optimization objectives, a collaborative energy management optimization control framework considering the air conditioning system is built with the soft actor-critic algorithm, which is suited to control tasks in a continuous action space. This achieves coordinated control of energy management and the air conditioning system and optimizes the hydrogen consumption economy and cabin temperature comfort of the fuel cell vehicle. As shown in FIG. 1, the collaborative energy management optimization method specifically includes the following steps:
S1: acquiring key parameter information of the fuel cell vehicle, including:
the vehicle state parameter information: vehicle speed, cabin thermal load parameters, motor operating efficiency and transmission system characteristic parameters;
the fuel cell parameter information: power, efficiency and hydrogen consumption of the fuel cell;
the power battery parameter information: state of charge, internal resistance and open-circuit voltage of the power battery;
the air conditioning system parameter information: cooling/heating capacity of the air conditioning system and the corresponding power.
S2: establishing the collaborative energy management model of the fuel cell vehicle, as shown in FIG. 2 and FIG. 3, with the following specific steps:
S21: establishing the vehicle longitudinal dynamics model:
$$P_{drive} = (F_{air} + F_f + F_i + m_0 a)\cdot v$$
Figure BDA0003929529130000071
$$P_{dem} = P_b + P_{fc}\cdot\eta_{DC/DC}$$
where $m_0$ is the vehicle mass; $v$ is the vehicle speed; $a$ is the vehicle acceleration; $F_{air}$ is the air resistance; $F_f$ is the rolling resistance; $F_i$ is the acceleration resistance; $\eta_m$, $\eta_{DC/AC}$, $\eta_{DC/DC}$ and $\eta_{motor}$ are the transmission efficiency, DC/AC converter efficiency, DC/DC converter efficiency and motor efficiency, respectively; and $P_{drive}$, $P_{dem}$, $P_b$ and $P_{fc}$ are the driving power at the wheels, the required power, the battery output power and the fuel cell output power, respectively.
S22: establishing the fuel cell model:
$$\eta_{fc} = f_\eta(P_{fc})$$
Figure BDA0003929529130000072
where $f_\eta(\cdot)$ and [Figure BDA0003929529130000073] are the fitting functions for the efficiency and the hydrogen consumption, respectively; both quantities can be calculated by interpolation.
S23: establishing a power battery model:
Figure BDA0003929529130000074
Figure BDA0003929529130000081
where $I_L$ is the power battery current; $V_{oc}$ is the open-circuit voltage of the power battery; $R_{in}$ is the equivalent internal resistance of the power battery; $SOC_0$ is the initial SOC; $Q_t$ is the maximum capacity of the power battery; $t_0$ is the initial time; and $t_f$ is the final time.
S24: establishing the motor model:
$$\eta_m = f_m(\omega_m, T_m)$$
Figure BDA0003929529130000082
where $\omega_m$ and $T_m$ are the rotational speed and torque of the motor, respectively; $P_m$ is the motor output power; and $f_m(\cdot)$ is the fitting function for the motor operating efficiency, which can be obtained by interpolation.
S25: establishing an air conditioning system model:
Figure BDA0003929529130000083
where $Q_{ac}$ is the cooling or heating capacity of the air conditioning system; $P_{ac}$ is the corresponding power consumption of the air conditioning system; and $\eta_{cop}$ is the coefficient of performance of the air conditioning system.
S26: establishing the cabin thermal load model:
$$Q_c = \sum K F (T_{out} - T_{in})$$
Figure BDA0003929529130000084
$$Q_h = 145 + 116n$$
$$Q_n = m_e \xi Cp_{air} (T_{out} - T_{in})$$
Figure BDA0003929529130000085
where $Q_c$, $Q_r$, $Q_h$ and $Q_n$ are the heat conduction load, the radiant heat load, the heat generated by the occupants (empirically, about 145 W for the driver and about 116 W for each passenger) and the ventilation heat load, respectively; $K$ is the heat transfer coefficient; $F$ is the heat transfer area of the corresponding enclosure surface; $T_{out}$ is the ambient temperature; $T_{in}$ is the cabin air temperature; $\eta$ is the transmittance; $I$ is the sunlight intensity; $A_i$ is the area of the windshield, the left and right side windows, and the rear window; $\theta_i$ is the sunlight incidence angle; $\beta$ is the shading factor; $n$ is the number of passengers in the vehicle; $m_e$ is the mass of air passing through the evaporator; $\xi$ is the air recirculation coefficient; $Cp_{air}$ is the heat capacity of the cabin air; and $\rho_{air}$ and $V_{air}$ are the air density in the cabin and the cabin volume, respectively.
S3: a fuel cell automobile collaborative energy management optimization control framework considering an air conditioning system is established based on a SAC algorithm, and a multi-objective optimization problem including hydrogen combustion economy and cabin temperature comfort is solved. As shown in fig. 3, the cooperative control of the energy management and air conditioning system is realized by applying the soft constraint actor critic algorithm, and the hydrogen-burning economy and cabin temperature comfort of the fuel cell vehicle are optimized, specifically:
s301: in order to reflect key environmental information, the SOC of the power battery and the output power P of the fuel battery are measured fc Vehicle speed v, air-conditioning cooling/heating capacity Q ac Setting as a state variable, a state space is constructed, which can be expressed as:
S={SOC,P fc ,v,Q ac }
S302: Collaborative energy management that considers the air conditioning system not only distributes power among the power sources but also maintains cabin thermal comfort by varying the cooling/heating capacity of the air conditioning system. For this reason, the variation of the fuel cell output power
Figure BDA0003929529130000091
and the variation of the air conditioning system cooling/heating capacity
Figure BDA0003929529130000092
are set as action variables to construct the action space, which can be expressed as:
Figure BDA0003929529130000093
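As a hedged illustration of how the state variables and the two incremental actions could be represented, the following Python sketch stores the four state variables and applies the two increments with clipping. The names `State`, `apply_action`, `d_p_fc`, `d_q_ac` and the bounds `p_fc_max` and `q_ac_max` are hypothetical and do not come from the patent; the SOC and vehicle speed are assumed to be updated by the vehicle model rather than by the action.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    soc: float   # power battery state of charge
    p_fc: float  # fuel cell output power, W (assumed unit)
    v: float     # vehicle speed, m/s (assumed unit)
    q_ac: float  # air conditioning cooling/heating capacity, W (assumed unit)


def clip(x, lo, hi):
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def apply_action(state, d_p_fc, d_q_ac, p_fc_max=60e3, q_ac_max=5e3):
    """Add the two increments to P_fc and Q_ac, keeping both within assumed bounds."""
    return replace(
        state,
        p_fc=clip(state.p_fc + d_p_fc, 0.0, p_fc_max),
        q_ac=clip(state.q_ac + d_q_ac, 0.0, q_ac_max),
    )


s0 = State(soc=0.7, p_fc=20e3, v=15.0, q_ac=2e3)
s1 = apply_action(s0, d_p_fc=5e3, d_q_ac=-3e3)
```

Using increments rather than absolute set-points as actions limits how fast the fuel cell power and the air conditioning capacity can change from one step to the next.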
S303: To ensure cabin temperature comfort, the temperature in the vehicle cabin is maintained at about 24 °C. For this reason, the reward function also includes an optimization term for the cabin temperature change, and the reward function is set as a weighted sum of three indicators, namely hydrogen consumption, SOC change and cabin temperature change, expressed as:
$$R = -\left(\zeta \cdot fuel(t) + \psi \cdot (SOC(t) - 0.7)^2 + \gamma \cdot (T_{in} - 24)^2\right)$$
where $\zeta$, $\psi$ and $\gamma$ are the weight factors of the respective optimization terms; the trade-off between hydrogen consumption and cabin temperature comfort is handled by adjusting these weight factors, which resolves the multi-objective optimization problem; $fuel(t)$ is the hydrogen consumption at the current time; $SOC(t)$ is the state of charge of the power battery at the current time.
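The reward expression above can be sketched in Python as follows. The reference values 0.7 and 24 are those of the expression itself; the default weight values are placeholders chosen only for the example, since the patent does not state numerical weights.

```python
def reward(fuel, soc, t_in, zeta=1.0, psi=100.0, gamma=0.1,
           soc_ref=0.7, t_ref=24.0):
    """R = -(zeta*fuel + psi*(SOC - 0.7)^2 + gamma*(T_in - 24)^2).

    zeta, psi and gamma are weight factors; the defaults are illustrative only.
    """
    return -(zeta * fuel
             + psi * (soc - soc_ref) ** 2
             + gamma * (t_in - t_ref) ** 2)


# At the SOC and temperature references, only hydrogen consumption is penalized.
r_on_target = reward(fuel=0.5, soc=0.7, t_in=24.0)
# Deviations in SOC and cabin temperature add quadratic penalties.
r_off_target = reward(fuel=0.0, soc=0.6, t_in=26.0)
```

Because every term is non-negative before the sign change, the reward is never positive, and raising one weight relative to the others shifts the compromise toward that objective.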
S304: The multi-objective optimization problem in energy management is solved with the SAC algorithm. The SAC algorithm introduces the action entropy so that the action outputs are more dispersed, which improves the exploration ability, the ability to learn new tasks, and the stability of the algorithm. The entropy is expressed as:
$$H(\pi(\cdot \mid s_t)) = -\log \pi(\cdot \mid s_t)$$
where $H$ is the entropy of the policy $\pi(\cdot \mid s_t)$.
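For illustration only, the sketch below evaluates this entropy term for a one-dimensional Gaussian policy. It assumes the usual convention that the policy entropy is the expectation of $-\log \pi(a \mid s_t)$ over actions drawn from the policy, and compares a sampled estimate with the known closed form for a Gaussian, $\tfrac{1}{2}\log(2\pi e \sigma^2)$. The function names and sample count are hypothetical.

```python
import math
import random


def gaussian_log_prob(a, mean, std):
    """Log-density of a scalar action under N(mean, std^2)."""
    z = (a - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def entropy_monte_carlo(mean, std, n_samples=20000, seed=0):
    """Estimate E[-log pi(a|s)] by sampling actions from the policy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        a = rng.gauss(mean, std)
        total -= gaussian_log_prob(a, mean, std)
    return total / n_samples


def entropy_closed_form(std):
    """Differential entropy of a Gaussian: 0.5 * log(2*pi*e*std^2)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * std * std)
```

A wider action distribution (larger standard deviation) has a larger entropy, which is the sense in which the entropy term rewards more dispersed actions.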
S305: During the solution process, the actor network in the agent takes the state $s_t$ as input, outputs the mean and variance of the Gaussian distribution of the action, and generates the action $a_t$ using the reparameterization technique:
Figure BDA0003929529130000094
where $\tau_t$ is a noise signal sampled from a standard normal distribution;
Figure BDA0003929529130000095
is the function that outputs the mean and variance;
Figure BDA0003929529130000096
and
Figure BDA0003929529130000097
are the mean and variance of the Gaussian distribution, respectively.
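A minimal sketch of the reparameterization step is given below, assuming the common form in which the action equals the mean plus the standard deviation times standard normal noise. The patent text speaks of the variance; the use of the standard deviation as the scale, and the names `sample_action`, `mean`, `std` and `rng`, are assumptions of this example.

```python
import random


def sample_action(mean, std, rng):
    """Reparameterized sample: a = mean + std * tau, with tau ~ N(0, 1).

    mean, std: per-dimension outputs of the actor network (plain lists here).
    rng: a random.Random instance supplying the noise tau.
    """
    tau = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + s * t for m, s, t in zip(mean, std, tau)]


# Two action dimensions, e.g. the fuel cell power increment and the
# air conditioning capacity increment (illustrative values).
action = sample_action([1.0, -2.0], [0.5, 0.1], random.Random(0))
```

Because the randomness enters only through the noise, the action is a differentiable function of the network outputs, which is what allows gradients to pass through the sampling step.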
S306: After the action $a_t$ is executed, the vehicle environment feeds back a reward $r_t$ to the agent and transitions to the next state $s_{t+1}$; that is, the interaction data $(s_t, a_t, r_t, s_{t+1})$ between the environment and the agent are generated. These data are stored in the experience pool denoted by:
Figure BDA0003929529130000101
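The experience pool can be illustrated with a simple fixed-capacity buffer. The sketch below is one possible realization, not the patent's implementation; the class name `ReplayBuffer`, its capacity handling (oldest samples are discarded first) and the uniform sampling are assumptions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity pool of (s_t, a_t, r_t, s_{t+1}) transitions."""

    def __init__(self, capacity, seed=None):
        self._data = deque(maxlen=capacity)  # oldest entries drop out first
        self._rng = random.Random(seed)

    def store(self, s, a, r, s_next):
        self._data.append((s, a, r, s_next))

    def sample(self, batch_size):
        """Draw a mini-batch uniformly at random, without replacement."""
        return self._rng.sample(list(self._data), batch_size)

    def __len__(self):
        return len(self._data)


buf = ReplayBuffer(capacity=3, seed=0)
for i in range(5):
    buf.store(s=i, a=0, r=0.0, s_next=i + 1)
batch = buf.sample(2)
```

Sampling at random from stored transitions breaks the time correlation between consecutive driving steps before they are used for network updates.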
S307: randomly extracting small batch of experience samples from the experience pool to avoid overestimation when maximizing the action state function value and to utilize the experience samplesFurther overestimation when the network calculates the target, and the introduced parameter is theta 12 Is the evaluation critic network and the parameter is θ' 1 ,θ′ 2 The target critic network of (4) selects the target critic network to output a small action state function value as a target value. For a particular state s t And action a t Soft constrained action value function Q in SAC algorithm soft (s t, a t ) The update formula is as follows:
Figure BDA0003929529130000102
wherein r represents the reward earned for the vehicle; gamma is expressed as a discount factor; α is expressed as a temperature coefficient.
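As a sketch under stated assumptions, the target value described in this step can be computed as below. The formula used is the standard SAC form (reward plus the discounted minimum of the two target critic outputs, minus the temperature-weighted log-probability of the next action); the patent's own expression is available only as an equation image, so the exact form and all names here are assumptions.

```python
def soft_q_target(r, q1_next, q2_next, log_pi_next, gamma=0.99, alpha=0.2):
    """Standard SAC target: r + gamma * (min(Q1', Q2') - alpha * log pi(a'|s')).

    q1_next, q2_next: outputs of the two target critic networks at (s', a').
    log_pi_next: log-probability of the next action under the current policy.
    gamma: discount factor; alpha: temperature coefficient.
    """
    soft_value = min(q1_next, q2_next) - alpha * log_pi_next
    return r + gamma * soft_value


y = soft_q_target(r=1.0, q1_next=10.0, q2_next=8.0, log_pi_next=-1.0,
                  gamma=0.9, alpha=0.5)
```

Taking the minimum of the two target outputs means the more pessimistic estimate is always used, which counteracts the overestimation mentioned in this step.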
S308: when updating a policy network, by minimizing a loss function L (theta) i ) Updating the evaluation critic network, the loss function being defined as
Figure BDA0003929529130000103
And
Figure BDA0003929529130000104
mean square error between, expressed as:
Figure BDA0003929529130000105
Figure BDA0003929529130000106
wherein ,
Figure BDA0003929529130000107
expressed as an evaluation critic network parameter θ i An evaluation function of time, and
Figure BDA0003929529130000108
the list is a target comment family network parameter of theta' i Evaluation letter of timeAnd (4) counting.
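The mean square error loss for one evaluation critic can be sketched as follows. The sketch operates on plain lists of already-computed values; the name `critic_loss` is hypothetical, and some implementations additionally scale the loss by one half, which does not change the minimizer.

```python
def critic_loss(q_pred, targets):
    """Mean square error between critic outputs and target values.

    q_pred: evaluation critic outputs Q_theta_i(s_t, a_t) for a mini-batch.
    targets: corresponding target values built from the target critics.
    """
    if len(q_pred) != len(targets):
        raise ValueError("q_pred and targets must have the same length")
    return sum((q - y) ** 2 for q, y in zip(q_pred, targets)) / len(q_pred)


loss = critic_loss([1.0, 2.0], [0.0, 4.0])
```

The same function would be evaluated once per evaluation critic (i = 1, 2), each against the shared target values.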
S309: The actor network parameters are updated by minimizing the KL divergence; the smaller the KL value, the smaller the difference between the rewards corresponding to the output actions and the better the policy converges. The objective function of the actor network
Figure BDA0003929529130000109
is defined as:
Figure BDA00039295291300001010
where $D_{KL}$ is the KL divergence; $Z(s_t)$ is the partition function used to normalize the distribution;
Figure BDA00039295291300001011
is the mathematical expectation when the vehicle is in state $s_t$ at the current time and performs action $a_t$;
Figure BDA00039295291300001012
is the policy function when the current state is $s_t$; and
Figure BDA00039295291300001013
denotes the parameters of the policy function.
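In the standard SAC formulation, minimizing this KL divergence is equivalent, up to the partition function term that does not depend on the policy parameters, to minimizing the expectation of $\alpha \log \pi(a_t \mid s_t) - Q(s_t, a_t)$. The sketch below estimates that quantity over a mini-batch; it assumes this standard form rather than reproducing the patent's equation image, and the names are hypothetical.

```python
def actor_objective(log_pis, q_values, alpha=0.2):
    """Mini-batch estimate of E[alpha * log pi(a|s) - Q(s, a)].

    log_pis: log-probabilities of reparameterized actions under the policy.
    q_values: critic values for the same state-action pairs
              (commonly the minimum over the two evaluation critics).
    """
    terms = [alpha * lp - q for lp, q in zip(log_pis, q_values)]
    return sum(terms) / len(terms)


j_pi = actor_objective([-1.0, -2.0], [3.0, 5.0], alpha=0.5)
```

Lower values of this objective correspond to actions with higher critic value and higher entropy, so the temperature coefficient sets the balance between the two.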
S310: The actor network parameters are updated by the gradient descent method, expressed as:
Figure BDA00039295291300001014
where
Figure BDA00039295291300001015
is the descent gradient with respect to the policy function parameters
Figure BDA00039295291300001016
and
Figure BDA00039295291300001017
is the descent gradient with respect to the action $a_t$ executed at the current time $t$.
S311: In the SAC algorithm, adjusting the temperature coefficient $\alpha$ is important for the training result, and the optimal value of the temperature coefficient differs across reinforcement learning tasks and training stages. To adjust the temperature coefficient automatically, the minimum of the objective function of the following optimization problem is solved, so that the optimal temperature coefficient at each step is obtained by updating. The objective function is expressed as:
Figure BDA0003929529130000111
where $H_0$ is the predefined threshold of the minimum policy entropy,
Figure BDA0003929529130000112
is the mathematical expectation when action $a_t$ is performed according to the policy $\pi_t$; $\pi_t(a_t \mid s_t)$ is the policy function; $s_t$ is the state of the fuel cell vehicle at the current time $t$; and $a_t$ is the action performed according to the policy function at the current time $t$.
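A hedged sketch of the automatic temperature adjustment is shown below. It assumes the standard SAC temperature objective, the expectation of $-\alpha \log \pi_t(a_t \mid s_t) - \alpha H_0$, and a plain gradient step on $\alpha$; the patent's own objective is available only as an equation image, and the function names, learning rate and non-negativity clamp are assumptions.

```python
def temperature_loss(alpha, log_pis, h0):
    """Mini-batch estimate of E[-alpha * log pi(a|s) - alpha * H0]."""
    return sum(-alpha * (lp + h0) for lp in log_pis) / len(log_pis)


def update_temperature(alpha, log_pis, h0, lr=1e-3):
    """One gradient descent step on alpha, kept non-negative.

    The gradient of the loss with respect to alpha is E[-(log pi + H0)].
    """
    grad = sum(-(lp + h0) for lp in log_pis) / len(log_pis)
    return max(alpha - lr * grad, 0.0)


# Average entropy estimate is 2.0, above the threshold H0 = 1.0,
# so the temperature is reduced.
alpha_new = update_temperature(0.2, [-1.0, -3.0], h0=1.0, lr=0.1)
```

When the sampled entropy falls below the threshold, the gradient changes sign and the same step raises the temperature, which pushes the policy back toward more exploratory actions.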
Finally, although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that various changes and modifications may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A learning type collaborative energy management method for a fuel cell automobile considering an air conditioning system is characterized by comprising the following steps:
S1: acquiring vehicle state parameter information, fuel cell parameter information, power battery parameter information and air conditioning system parameter information of a fuel cell vehicle;
S2: establishing a fuel cell vehicle collaborative energy management model, which comprises: a whole-vehicle longitudinal dynamics model, a fuel cell model, a power battery model, a motor model, an air conditioning system model and a vehicle cabin thermal load model;
S3: establishing a fuel cell vehicle collaborative energy management optimization control strategy that considers the air conditioning system, solving a multi-objective optimization problem comprising hydrogen consumption economy and cabin temperature comfort by combining the SAC algorithm, and controlling the variation of the air conditioning cooling/heating capacity to maintain the cabin temperature within a comfort interval while performing energy flow optimization control; the SAC algorithm is the soft actor-critic algorithm.
2. The fuel cell automobile learning-type collaborative energy management method according to claim 1, wherein in step S1, the vehicle state parameter information includes: vehicle speed, vehicle cabin thermal load parameters, motor operating efficiency and transmission system characteristic parameters; the fuel cell parameter information includes: the power, efficiency and hydrogen consumption of the fuel cell; the power battery parameter information includes: the state of charge, internal resistance and open-circuit voltage of the power battery; the air conditioning system parameter information includes: the cooling/heating capacity of the air conditioning system and the corresponding power.
3. The fuel cell automobile learning type collaborative energy management method according to claim 1, wherein in the step S2, the established overall automobile longitudinal dynamics model is:
$$P_{drive} = (F_{air} + F_f + F_i + m_0 a) \cdot v$$
Figure FDA0003929529120000011
$$P_{dem} = P_b + P_{fc} \cdot \eta_{DC/DC}$$
where $m_0$ is the mass of the whole vehicle; $v$ is the vehicle speed; $a$ is the vehicle acceleration; $F_{air}$ is the air resistance; $F_f$ is the rolling resistance; $F_i$ is the acceleration resistance; $\eta_m$, $\eta_{DC/AC}$, $\eta_{DC/DC}$ and $\eta_{motor}$ are the transmission efficiency, the DC/AC converter efficiency, the DC/DC converter efficiency and the motor efficiency, respectively; $P_{drive}$, $P_{dem}$, $P_b$ and $P_{fc}$ are the driving power at the vehicle wheels, the required power, the battery output power and the fuel cell output power, respectively.
4. The fuel cell vehicle learning-type collaborative energy management method according to claim 3, wherein in step S2, the fuel cell model is established as:
$$\eta_{fc} = f_\eta(P_{fc})$$
Figure FDA0003929529120000012
where $f_\eta(\cdot)$ and
Figure FDA0003929529120000013
are the fitting functions of efficiency and hydrogen consumption, respectively, and the efficiency and hydrogen consumption are calculated by interpolation.
5. The fuel cell vehicle learning-type collaborative energy management method according to claim 3, wherein in step S2, the power battery model is established as:
Figure FDA0003929529120000021
Figure FDA0003929529120000022
where $I_L$ is the power battery current; $V_{oc}$ is the open-circuit voltage of the power battery; $R_{in}$ is the equivalent internal resistance of the power battery; $SOC_0$ is the initial SOC; $Q_t$ is the maximum capacity of the power battery; $t_0$ is the initial time; $t_f$ is the final time.
6. The fuel cell vehicle learning-type collaborative energy management method according to claim 3, wherein in step S2, the motor model is established as:
$$\eta_m = f_m(\omega_m, T_m)$$
Figure FDA0003929529120000023
where $\omega_m$ and $T_m$ are the rotating speed and the torque of the motor, respectively; $P_m$ is the motor output power; $f_m(\cdot)$ is the fitting function of the motor operating efficiency, and the motor operating efficiency is obtained by interpolation.
7. The fuel cell vehicle learning-type collaborative energy management method according to claim 1, wherein in step S2, the air conditioning system model is established as follows:
Figure FDA0003929529120000024
where $Q_{ac}$ is the cooling capacity or heating capacity of the air conditioning system; $P_{ac}$ is the corresponding power consumption of the air conditioning system; $\eta_{cop}$ is the coefficient of performance of the air conditioning system.
8. The fuel cell vehicle learning-type collaborative energy management method according to claim 1, wherein in step S2, the vehicle cabin thermal load model is established as follows:
$$Q_c = \sum K F (T_{out} - T_{in})$$
Figure FDA0003929529120000025
$$Q_h = 145 + 116n$$
$$Q_n = m_e \xi Cp_{air} (T_{out} - T_{in})$$
Figure FDA0003929529120000031
where $Q_c$, $Q_r$, $Q_h$ and $Q_n$ denote the heat conduction load, the radiant heat load, the heat generated by the vehicle occupants and the ventilation system heat load, respectively; $K$ is the heat transfer coefficient; $F$ is the heat transfer area of the respective enclosure surface; $T_{out}$ is the ambient temperature; $T_{in}$ is the cabin air temperature; $\eta$ is the permeability; $I$ is the sunlight intensity; $A_i$ is the area of the windshield, the left and right side windows, and the rear window; $\theta_i$ is the sunlight incidence angle; $\beta$ is the shading factor; $n$ is the number of passengers in the vehicle; $m_e$ is the mass of air passing through the evaporator; $\xi$ is the air recirculation coefficient; $Cp_{air}$ is the heat capacity of the cabin air; $\rho_{air}$ and $V_{air}$ are the air density in the cabin and the cabin volume, respectively.
9. The fuel cell vehicle learning type collaborative energy management method according to claim 1, wherein in step S3, a fuel cell vehicle collaborative energy management optimization control strategy considering an air conditioning system is established, and specifically comprises the following steps:
S301: determining a state space: the SOC of the power battery, the fuel cell output power $P_{fc}$, the vehicle speed $v$ and the cooling/heating capacity $Q_{ac}$ of the air conditioning system are set as state variables to construct a state space $S$, expressed as:
$$S = \{SOC, P_{fc}, v, Q_{ac}\}$$
S302: determining an action space: the variation of the fuel cell output power
Figure FDA0003929529120000032
and the variation of the air conditioning system cooling/heating capacity
Figure FDA0003929529120000033
are set as action variables to construct an action space $A$, expressed as:
Figure FDA0003929529120000034
S303: establishing a reward function: the reward function $R$ is set as a weighted sum of three indicators, namely hydrogen consumption, SOC variation and cabin temperature variation, expressed as:
$$R = -\left(\zeta \cdot fuel(t) + \psi \cdot (SOC(t) - 0.7)^2 + \gamma \cdot (T_{in} - 24)^2\right)$$
where $\zeta$, $\psi$ and $\gamma$ are the weight factors of the respective optimization terms; the trade-off between hydrogen consumption and cabin temperature comfort is handled by adjusting these weight factors, which resolves the multi-objective optimization problem; $fuel(t)$ is the hydrogen consumption at the current time; $SOC(t)$ is the state of charge of the power battery at the current time.
10. The fuel cell automobile learning type collaborative energy management method according to claim 9, wherein in step S3, the multi-objective optimization problem including hydrogen consumption economy and cabin temperature comfort is solved by combining the SAC algorithm, which specifically includes the following steps:
S311: the multi-objective optimization problem in energy management is solved by combining the SAC algorithm; the SAC algorithm introduces the action entropy so that the action outputs are more dispersed, which improves the exploration ability, the ability to learn new tasks, and the stability of the algorithm, wherein the entropy is expressed as:
$$H(\pi(\cdot \mid s_t)) = -\log \pi(\cdot \mid s_t)$$
where $H$ is the entropy of the policy $\pi(\cdot \mid s_t)$;
S312: during the solution process, the actor network in the agent takes the state $s_t$ as input, outputs the mean and variance of the Gaussian distribution of the action, and generates the action $a_t$ using the reparameterization technique:
Figure FDA0003929529120000041
where $\tau_t$ is a noise signal sampled from a standard normal distribution;
Figure FDA0003929529120000042
is the function that outputs the mean and variance;
Figure FDA0003929529120000043
and
Figure FDA0003929529120000044
are the mean and variance of the Gaussian distribution, respectively;
S313: after the action $a_t$ is executed, the vehicle environment feeds back a reward $r_t$ to the agent and transitions to the next state $s_{t+1}$; that is, the interaction data $(s_t, a_t, r_t, s_{t+1})$ between the environment and the agent are generated and stored in the experience pool denoted by
Figure FDA00039295291200000417
S314: a mini-batch of experience samples is drawn at random from the experience pool; evaluation critic networks with parameters $\theta_1, \theta_2$ and target critic networks with parameters $\theta'_1, \theta'_2$ are introduced, and the smaller of the action-value outputs of the target critic networks is selected as the target value; for a particular state $s_t$ and action $a_t$, the soft action-value function $Q_{soft}(s_t, a_t)$ in the SAC algorithm is updated as follows:
Figure FDA0003929529120000045
where $r$ is the reward obtained by the vehicle; $\gamma$ is the discount factor; $\alpha$ is the temperature coefficient;
S315: when the policy network is updated, the evaluation critic networks are updated by minimizing the loss function $L(\theta_i)$, which is defined as the mean square error between
Figure FDA0003929529120000046
and
Figure FDA0003929529120000047
expressed as:
Figure FDA0003929529120000048
Figure FDA0003929529120000049
where
Figure FDA00039295291200000410
is the evaluation function when the evaluation critic network parameter is $\theta_i$, and
Figure FDA00039295291200000411
is the evaluation function when the target critic network parameter is $\theta'_i$;
S316: the actor network parameters are updated by minimizing the KL divergence; the objective function of the actor network
Figure FDA00039295291200000412
is defined as:
Figure FDA00039295291200000413
where $D_{KL}$ is the KL divergence; $Z(s_t)$ is the partition function used to normalize the distribution;
Figure FDA00039295291200000414
is the mathematical expectation when the vehicle is in state $s_t$ at the current time and performs action $a_t$;
Figure FDA00039295291200000415
is the policy function when the current state is $s_t$; and
Figure FDA00039295291200000416
denotes the parameters of the policy function;
S317: the actor network parameters are updated by the gradient descent method, expressed as:
Figure FDA0003929529120000051
where
Figure FDA0003929529120000052
is the descent gradient with respect to the policy function parameters
Figure FDA0003929529120000053
and
Figure FDA0003929529120000054
is the descent gradient with respect to the action $a_t$ executed at the current time $t$;
S318: the minimum of the objective function of the following optimization problem is solved, so that the optimal temperature coefficient at each step is obtained by updating, the objective function being expressed as:
Figure FDA0003929529120000055
where $H_0$ is the predefined threshold of the minimum policy entropy,
Figure FDA0003929529120000056
is the mathematical expectation when action $a_t$ is performed according to the policy $\pi_t$; $\pi_t(a_t \mid s_t)$ is the policy function; $s_t$ is the state of the fuel cell vehicle at the current time $t$; and $a_t$ is the action performed according to the policy function at the current time $t$.
CN202211385462.0A 2022-11-07 2022-11-07 Fuel cell automobile learning type cooperative energy management method considering air conditioning system Active CN115503559B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211385462.0A CN115503559B (en) 2022-11-07 2022-11-07 Fuel cell automobile learning type cooperative energy management method considering air conditioning system

Publications (2)

Publication Number Publication Date
CN115503559A true CN115503559A (en) 2022-12-23
CN115503559B CN115503559B (en) 2023-05-02

Family

ID=84512880

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211385462.0A Active CN115503559B (en) 2022-11-07 2022-11-07 Fuel cell automobile learning type cooperative energy management method considering air conditioning system

Country Status (1)

Country Link
CN (1) CN115503559B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116639135A (en) * 2023-05-26 2023-08-25 中国第一汽车股份有限公司 Cooperative control method and device for vehicle and vehicle
CN117968208A (en) * 2024-03-29 2024-05-03 中建安装集团有限公司 Environment system control method and control system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111731303A (en) * 2020-07-09 2020-10-02 重庆大学 HEV energy management method based on deep reinforcement learning A3C algorithm
CN111785045A (en) * 2020-06-17 2020-10-16 南京理工大学 Distributed traffic signal lamp combined control method based on actor-critic algorithm
CN112287463A (en) * 2020-11-03 2021-01-29 重庆大学 Fuel cell automobile energy management method based on deep reinforcement learning algorithm
CN113071506A (en) * 2021-05-20 2021-07-06 吉林大学 Fuel cell automobile energy consumption optimization system considering cabin temperature
CN113085665A (en) * 2021-05-10 2021-07-09 重庆大学 Fuel cell automobile energy management method based on TD3 algorithm
CN113246805A (en) * 2021-07-02 2021-08-13 吉林大学 Fuel cell power management control method considering temperature of automobile cab
US20210270622A1 (en) * 2020-02-27 2021-09-02 Cummins Enterprise Llc Technologies for energy source schedule optimization for hybrid architecture vehicles

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
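Wang Zhe; Xie Yi; Zang Pengfei; Wang Yao: "Energy management strategy for fuel cell buses based on the minimum principle" (基于极小值原理的燃料电池客车能量管理策略), Journal of Jilin University (Engineering and Technology Edition) *
Qi Wenkai; Sang Guoming: "Maximum-entropy advantage actor-critic algorithm based on a delayed policy" (基于延迟策略的最大熵优势演员评论家算法), Journal of Chinese Computer Systems *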

Also Published As

Publication number Publication date
CN115503559B (en) 2023-05-02

Similar Documents

Publication Publication Date Title
CN115503559A (en) Learning type collaborative energy management method for fuel cell automobile considering air conditioning system
Xie et al. A Self-learning intelligent passenger vehicle comfort cooling system control strategy
CN111267831B (en) Intelligent time-domain-variable model prediction energy management method for hybrid electric vehicle
CN111731303B (en) HEV energy management method based on deep reinforcement learning A3C algorithm
CN112287463B (en) Fuel cell automobile energy management method based on deep reinforcement learning algorithm
CN111845701B (en) HEV energy management method based on deep reinforcement learning in car following environment
CN110936824B (en) Electric automobile double-motor control method based on self-adaptive dynamic planning
CN113071506B (en) Fuel cell automobile energy consumption optimization system considering cabin temperature
WO2021159660A1 (en) Energy management method and system for hybrid vehicle
CN110406526A (en) Parallel hybrid electric energy management method based on adaptive Dynamic Programming
CN109591659A (en) A kind of pure electric automobile energy management control method of intelligence learning
CN111767896A (en) Chassis loading cooperative control method and perception recognition implementation device for sweeper
CN113110052B (en) Hybrid energy management method based on neural network and reinforcement learning
Deng et al. Battery thermal-and cabin comfort-aware collaborative energy management for plug-in fuel cell electric vehicles based on the soft actor-critic algorithm
CN115793445A (en) Hybrid electric vehicle control method based on multi-agent deep reinforcement learning
JPH09109648A (en) Advance air conditioner for electric vehicle
CN113147321A (en) Vehicle-mounted air conditioner and regenerative braking coordination control method
CN114969982A (en) Fuel cell automobile deep reinforcement learning energy management method based on strategy migration
Hu et al. A transfer-based reinforcement learning collaborative energy management strategy for extended-range electric buses with cabin temperature comfort consideration
Wu et al. Multi-objective reinforcement learning-based energy management for fuel cell vehicles considering lifecycle costs
Zhang et al. A novel online prediction method for vehicle cabin temperature and passenger thermal sensation
Yang et al. Variable optimization domain-based cooperative energy management strategy for connected plug-in hybrid electric vehicles
Haskara et al. Reinforcement learning based EV energy management for integrated traction and cabin thermal management considering battery aging
Wang et al. Deep reinforcement learning with deep-Q-network based energy management for fuel cell hybrid electric truck
Chen et al. Reinforcement learning-based energy management control strategy of hybrid electric vehicles

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant