CN115503559B - Fuel cell automobile learning type cooperative energy management method considering air conditioning system

Publication number: CN115503559B
Application number: CN202211385462.0A
Other versions: CN115503559A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Active
Inventors: 唐小林, 邓磊, 甘炯鹏, 朱和龙, 胡晓松, 李佳承
Current assignee: Chongqing University
Original assignee: Chongqing University

Classifications

    • B60L58/30: Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles, for monitoring or controlling fuel cells
    • B60H1/00392: Air-conditioning arrangements specially adapted for particular vehicles, for vehicles having an electrical drive, e.g. hybrid or fuel cell, for electric vehicles having only electric drive means
    • B60H1/32: Heating, cooling or ventilating [HVAC] devices; cooling devices
    • B60L58/12: Methods or circuit arrangements for monitoring or controlling batteries or fuel cells, specially adapted for electric vehicles, for monitoring or controlling batteries responding to state of charge [SoC]

Landscapes

  • Engineering & Computer Science (AREA)
  • Mechanical Engineering (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Sustainable Development (AREA)
  • Sustainable Energy (AREA)
  • Power Engineering (AREA)
  • Transportation (AREA)
  • Physics & Mathematics (AREA)
  • Thermal Sciences (AREA)
  • Fuel Cell (AREA)

Abstract

The invention relates to a learning-based cooperative energy management method for a fuel cell vehicle that takes the air conditioning system into account, and belongs to the field of new energy vehicles. The method comprises the following steps: S1: acquiring vehicle state parameter information, fuel cell parameter information, power battery parameter information and air conditioning system parameter information of the fuel cell vehicle; S2: establishing a cooperative energy management model of the fuel cell vehicle; S3: establishing a cooperative energy management optimization control strategy that considers the air conditioning system, solving a multi-objective optimization problem comprising hydrogen economy and cabin temperature comfort with the soft actor-critic (SAC) algorithm, and controlling the change in the cooling/heating capacity of the air conditioner so that the cabin temperature stays in the comfort zone while the energy flow is optimized. The invention addresses the trade-off between hydrogen consumption and cabin temperature comfort, and optimizes both the hydrogen economy and the cabin temperature comfort of the fuel cell vehicle.

Description

Fuel cell automobile learning type cooperative energy management method considering air conditioning system
Technical Field
The invention belongs to the field of new energy automobiles, and relates to a fuel cell automobile learning type collaborative energy management method considering an air conditioning system.
Background
In response to increasingly serious environmental pollution and the shortage of fossil fuels, major automakers are striving to develop new energy vehicles. With the development of fuel cell technology, fuel cell vehicles offer zero emissions, low energy consumption and long driving range, and are regarded as one of the important research directions for the sustainable development of automobiles. The energy management strategy is the core control technology of the multi-power-source system of a fuel cell vehicle, and its performance directly determines the economy of the whole vehicle. In current research, energy management methods fall into three categories: rule-based, optimization-based and learning-based strategies. Rule-based and optimization-based methods cannot satisfy real-time capability and optimality at the same time. Traditional deep reinforcement learning algorithms can achieve both real-time capability and optimality in energy flow optimization, but they have shortcomings in their dependence on training data and hyperparameter settings. The soft actor-critic (SAC) algorithm provides a way to address these problems.
On the other hand, the air conditioning system is an indispensable auxiliary device of a fuel cell vehicle and provides a comfortable riding environment for the occupants. However, using the air conditioning system inevitably increases the energy consumption of the fuel cell vehicle and therefore affects the economy of the whole vehicle. In existing research on fuel cell vehicle energy management, the energy consumption of the air conditioning system is usually treated as constant or neglected. In practice, changes in the driving environment change the heat exchange between the inside and outside of the cabin, and therefore the power drawn by the air conditioning system.
Accordingly, a new energy management method for fuel cell vehicles is needed that coordinates the control of the air conditioning system and the power source components, and that optimizes the energy flow of the vehicle while accounting for changes in the energy consumption of the air conditioning system.
Disclosure of Invention
In view of the above, the present invention aims to provide a learning-based cooperative energy management method for a fuel cell vehicle considering the air conditioning system. The method uses the soft actor-critic (SAC) algorithm to coordinate the control of the air conditioning system and the power source components of the fuel cell vehicle, so that the energy flow of the whole vehicle is optimized while cabin comfort is ensured, thereby reducing the energy consumption of the fuel cell vehicle.
To achieve the above purpose, the present invention provides the following technical solution:
The learning-based cooperative energy management method for a fuel cell vehicle considering the air conditioning system specifically comprises the following steps:
S1: acquiring vehicle state parameter information, fuel cell parameter information, power battery parameter information and air conditioning system parameter information of the fuel cell vehicle;
S2: establishing a cooperative energy management model of the fuel cell vehicle, comprising: a whole-vehicle longitudinal dynamics model, a fuel cell model, a power battery model, a motor model, an air conditioning system model and a cabin thermal load model;
S3: establishing a cooperative energy management optimization control strategy for the fuel cell vehicle that considers the air conditioning system, solving a multi-objective optimization problem comprising hydrogen economy and cabin temperature comfort with the SAC algorithm, and controlling the change in the cooling/heating capacity of the air conditioner so that the cabin temperature stays in the comfort zone while the energy flow is optimized; the SAC algorithm is the soft actor-critic algorithm.
Further, in step S1, the vehicle state parameter information includes: vehicle speed, cabin thermal load parameters, motor operating efficiency and transmission system characteristic parameters; the fuel cell parameter information includes: power, efficiency, and hydrogen energy consumption of the fuel cell; the power battery parameter information includes: the state of charge, internal resistance and open circuit voltage of the power battery; the air conditioning system parameter information includes: air conditioning system cooling capacity/heating capacity and corresponding power.
Further, in step S2, the established longitudinal dynamics model of the whole vehicle is:
$P_{drive} = (F_{air} + F_f + F_i + m_0 a) \cdot v$
[Figure BDA0003929529130000021]
$P_{dem} = P_b + P_{fc} \cdot \eta_{DC/DC}$
where $m_0$ is the vehicle mass; $v$ is the vehicle speed; $a$ is the vehicle acceleration; $F_{air}$ is the air resistance; $F_f$ is the rolling resistance; $F_i$ is the acceleration resistance; $\eta_m$, $\eta_{DC/AC}$, $\eta_{DC/DC}$ and $\eta_{motor}$ are the transmission efficiency, DC/AC converter efficiency, DC/DC converter efficiency and motor efficiency, respectively; and $P_{drive}$, $P_{dem}$, $P_b$ and $P_{fc}$ are the driving power at the wheels, the required power, the battery output power and the fuel cell output power, respectively.
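The two recovered equations of the longitudinal model can be exercised with a minimal Python sketch. The drag and rolling-resistance formulas, the default coefficients (`c_d`, `area`, `f_roll`, `rho_air`, `eta_dcdc`) and the function names are illustrative assumptions and are not taken from the patent; $F_i$ is passed in directly because its formula is not stated in the text.

```python
import math

def drive_power(v, a, m0, f_i=0.0, c_d=0.30, area=2.2, f_roll=0.012,
                rho_air=1.2, g=9.81):
    """Wheel power P_drive = (F_air + F_f + F_i + m0*a) * v, in watts.

    F_air and F_f use textbook formulas with placeholder coefficients
    (assumptions); F_i is supplied by the caller.
    """
    f_air = 0.5 * rho_air * c_d * area * v ** 2   # aerodynamic drag, N
    f_f = m0 * g * f_roll                         # rolling resistance, N
    return (f_air + f_f + f_i + m0 * a) * v

def battery_power(p_dem, p_fc, eta_dcdc=0.95):
    """Solve P_dem = P_b + P_fc * eta_DC/DC for the battery power P_b."""
    return p_dem - p_fc * eta_dcdc

# Example: an 1800 kg vehicle at 20 m/s accelerating at 0.5 m/s^2.
p_drive = drive_power(v=20.0, a=0.5, m0=1800.0)
p_b = battery_power(p_dem=30_000.0, p_fc=20_000.0)
```

Whatever the fuel cell does not cover after DC/DC losses is drawn from (or, if negative, charged into) the battery, which is the degree of freedom the energy management strategy acts on.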
Further, in step S2, the established fuel cell model is:
$\eta_{fc} = f_\eta(P_{fc})$
[Figure BDA0003929529130000022]
where $f_\eta(\cdot)$ and [Figure BDA0003929529130000023] are the fitted functions of efficiency and hydrogen consumption, respectively; the efficiency and hydrogen consumption can be calculated by interpolation.
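A lookup-table interpolation of the kind described for the fuel cell can be sketched as follows. The table values are made-up placeholders, and the names `interp1`, `fuel_cell_efficiency` and `hydrogen_rate` are hypothetical; the patent does not give the fitted data or the interpolation scheme, so piecewise-linear interpolation is assumed here.

```python
from bisect import bisect_right

def interp1(x, xs, ys):
    """Piecewise-linear interpolation, clamped to the ends of the table."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

# Placeholder tables (NOT from the patent): power in kW, hydrogen rate in g/s.
P_FC_KW = [0.0, 10.0, 20.0, 40.0, 60.0]
ETA_FC = [0.00, 0.55, 0.58, 0.52, 0.45]
H2_G_PER_S = [0.00, 0.15, 0.29, 0.64, 1.11]

def fuel_cell_efficiency(p_fc_kw):
    """eta_fc = f_eta(P_fc), evaluated by table lookup."""
    return interp1(p_fc_kw, P_FC_KW, ETA_FC)

def hydrogen_rate(p_fc_kw):
    """Hydrogen consumption rate as a function of P_fc, by table lookup."""
    return interp1(p_fc_kw, P_FC_KW, H2_G_PER_S)
```

Clamping at the table ends is a design choice of this sketch; an implementation could instead reject out-of-range power requests.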
Further, in step S2, the established power battery model is:
[Figure BDA0003929529130000024]
[Figure BDA0003929529130000031]
where $I_L$ is the power battery current; $V_{oc}$ is the power battery open-circuit voltage; $R_{in}$ is the equivalent internal resistance of the power battery; $SOC_0$ is the initial SOC; $Q_t$ is the maximum capacity of the power battery; $t_0$ is the initial time; and $t_f$ is the final time.
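The battery equations survive only as image placeholders. A commonly used equivalent-circuit (internal resistance) model that matches the listed symbols is sketched below; it is an assumption about what the images contain, not a transcription. It solves $P_b = V_{oc} I_L - R_{in} I_L^2$ for the current and integrates the current to update the SOC. The function names and the ampere-hour capacity unit are illustrative.

```python
import math

def battery_current(p_b, v_oc, r_in):
    """Current I_L (A) that delivers terminal power p_b (W).

    Assumed model: p_b = v_oc * I - r_in * I**2, smaller root taken.
    """
    disc = v_oc ** 2 - 4.0 * r_in * p_b
    if disc < 0.0:
        raise ValueError("requested power exceeds the battery's capability")
    return (v_oc - math.sqrt(disc)) / (2.0 * r_in)

def soc_step(soc, i_l, q_t_ah, dt_s):
    """Coulomb counting over one step: SOC decreases while discharging."""
    return soc - i_l * dt_s / (q_t_ah * 3600.0)

# Example: 10 kW discharge from a 350 V pack with 0.1 ohm internal resistance.
i_l = battery_current(10_000.0, 350.0, 0.1)
soc_next = soc_step(0.70, i_l, q_t_ah=40.0, dt_s=1.0)
```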
Further, in step S2, the established motor model is:
$\eta_m = f_m(\omega_m, T_m)$
[Figure BDA0003929529130000032]
where $\omega_m$ and $T_m$ are the motor speed and the motor torque, respectively; $P_m$ is the motor output power; and $f_m(\cdot)$ is the fitted function of the motor operating efficiency, which can be obtained by interpolation.
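For a two-dimensional efficiency map such as $f_m(\omega_m, T_m)$, one simple option is bilinear interpolation on a rectangular grid, sketched below. The grid, the map values and the helper names are hypothetical; the patent states only that the efficiency is obtained by interpolation and does not specify the scheme. The mechanical power helper uses the standard relation of speed times torque, which is an assumption about the elided power equation.

```python
from bisect import bisect_right

def _bracket(x, grid):
    """Return the lower cell index and the fractional position of x in it."""
    x = min(max(x, grid[0]), grid[-1])            # clamp to the map limits
    i = min(bisect_right(grid, x) - 1, len(grid) - 2)
    return i, (x - grid[i]) / (grid[i + 1] - grid[i])

def motor_efficiency(omega, torque, omegas, torques, eta_map):
    """Bilinear interpolation of eta_map[i][j] defined at (omegas[i], torques[j])."""
    i, u = _bracket(omega, omegas)
    j, w = _bracket(torque, torques)
    low = eta_map[i][j] * (1 - w) + eta_map[i][j + 1] * w
    high = eta_map[i + 1][j] * (1 - w) + eta_map[i + 1][j + 1] * w
    return low * (1 - u) + high * u

def mechanical_power(omega, torque):
    """Shaft power in watts for speed in rad/s and torque in N*m."""
    return omega * torque

# Placeholder 2x2 map (NOT from the patent).
OMEGAS = [0.0, 100.0]
TORQUES = [0.0, 50.0]
ETA_MAP = [[0.80, 0.90], [0.85, 0.95]]
eta = motor_efficiency(50.0, 25.0, OMEGAS, TORQUES, ETA_MAP)
```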
Further, in step S2, the established air conditioning system model is:
[Figure BDA0003929529130000033]
where $Q_{ac}$ is the cooling capacity or heating capacity of the air conditioning system; $P_{ac}$ is the corresponding power consumption of the air conditioning system; and $\eta_{cop}$ is the coefficient of performance of the air conditioning system.
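The air conditioning equation is present only as an image placeholder. Under the usual definition of the coefficient of performance as thermal output divided by electrical input (an assumption about the elided formula, not a transcription of it), the three symbols above would be related by:

```latex
P_{ac} = \frac{Q_{ac}}{\eta_{cop}}
```

For instance, under this relation a cooling demand of 3 kW at a coefficient of performance of 2.5 would draw 1.2 kW of electrical power.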
Further, in step S2, the established cabin thermal load model is:
$Q_c = \sum K F (T_{out} - T_{in})$
[Figure BDA0003929529130000034]
$Q_h = 145 + 116 n$
$Q_n = m_e \xi Cp_{air} (T_{out} - T_{in})$
[Figure BDA0003929529130000035]
where $Q_c$, $Q_r$, $Q_h$ and $Q_n$ are the heat conduction load, the radiant heat load, the heat generated by the occupants (empirically, about 145 W for the driver and about 116 W for each passenger) and the ventilation system heat load, respectively; $K$ is the heat transfer coefficient; $F$ is the heat transfer area of the corresponding body panel; $T_{out}$ is the ambient temperature; $T_{in}$ is the cabin air temperature; $\eta$ is the transmittance; $I$ is the intensity of sunlight; $A_i$ is the area of the windshield, the left and right side windows, and the rear window; $\theta_i$ is the incident angle of sunlight; $\beta$ is the shading factor; $n$ is the number of passengers in the vehicle; $m_e$ is the mass of air passing through the evaporator; $\xi$ is the air recirculation coefficient; $Cp_{air}$ is the heat capacity of the cabin air; and $\rho_{air}$ and $V_{air}$ are the air density in the cabin and the cabin volume, respectively.
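The three load terms that are recoverable from the text can be written down directly. The sketch below also includes a lumped cabin-temperature update; that update is an assumption about the final elided equation (it is suggested by the symbols for air density and cabin volume) and is not taken from the patent. The radiant load is treated as an input because its formula is only an image placeholder, and the mass term is interpreted here as a mass flow rate in kg/s.

```python
def conduction_load(panels, t_out, t_in):
    """Q_c = sum(K * F * (T_out - T_in)); panels is a list of (K, F) pairs."""
    return sum(k * f * (t_out - t_in) for k, f in panels)

def occupant_load(n):
    """Q_h = 145 + 116 * n: driver plus n passengers, in watts."""
    return 145.0 + 116.0 * n

def ventilation_load(m_e, xi, cp_air, t_out, t_in):
    """Q_n = m_e * xi * Cp_air * (T_out - T_in); m_e assumed in kg/s."""
    return m_e * xi * cp_air * (t_out - t_in)

def cabin_temp_step(t_in, q_loads, q_ac, rho_air, v_air, cp_air, dt):
    """Assumed lumped balance: rho*V*Cp * dT/dt = q_loads - q_ac (cooling)."""
    return t_in + (q_loads - q_ac) * dt / (rho_air * v_air * cp_air)

# Example with placeholder numbers (NOT from the patent).
q_c = conduction_load([(3.0, 2.0), (5.0, 1.0)], t_out=35.0, t_in=25.0)
q_h = occupant_load(3)
q_n = ventilation_load(0.1, 0.5, 1005.0, t_out=35.0, t_in=25.0)
q_r = 400.0  # radiant load supplied as an input
t_next = cabin_temp_step(25.0, q_c + q_r + q_h + q_n, q_ac=2000.0,
                         rho_air=1.2, v_air=3.0, cp_air=1005.0, dt=1.0)
```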
Further, in step S3, establishing the cooperative energy management optimization control strategy for the fuel cell vehicle considering the air conditioning system specifically includes the following steps:
S301: Determining the state space: to reflect the key environment information, the power battery SOC, the fuel cell output power $P_{fc}$, the vehicle speed $v$ and the cooling/heating capacity $Q_{ac}$ of the air conditioning system are set as state variables to construct the state space $S$, expressed as:
$S = \{SOC, P_{fc}, v, Q_{ac}\}$
S302: Determining the action space: cooperative energy management that considers the air conditioning system must not only allocate power among the power sources but also maintain the thermal comfort of the cabin by changing the cooling/heating capacity of the air conditioning system. Therefore, the variation of the fuel cell output power [Figure BDA0003929529130000047] and the variation of the cooling/heating capacity of the air conditioning system [Figure BDA0003929529130000046] are set as action variables to construct the action space $A$, expressed as:
[Figure BDA0003929529130000041]
S303: Establishing the reward function: to ensure cabin temperature comfort, the cabin temperature is maintained at about 24 °C, and an optimization term for the cabin temperature deviation is included in the reward function. The reward function $R$ is therefore set as a weighted sum of three indexes, namely hydrogen consumption, SOC deviation and cabin temperature deviation, expressed as:
$R = -\left(\zeta \cdot fuel(t) + \psi \cdot (SOC(t) - 0.7)^2 + \gamma \cdot (T_{in} - 24)^2\right)$
where $\zeta$, $\psi$ and $\gamma$ are the weight factors of the optimization terms; adjusting them balances hydrogen consumption against cabin temperature comfort and thereby resolves the multi-objective optimization problem; $fuel(t)$ is the hydrogen consumption at the current time; and $SOC(t)$ is the state of charge of the power battery at the current time.
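The reward function translates directly into code. In the sketch below the default weights are arbitrary placeholders (the patent gives no numerical values for the three weight factors), and the parameter names are illustrative; the SOC reference of 0.7 and the temperature reference of 24 °C come from the formula.

```python
def reward(fuel, soc, t_in, w_fuel=1.0, w_soc=100.0, w_temp=0.1,
           soc_ref=0.7, t_ref=24.0):
    """R = -(zeta*fuel + psi*(SOC - 0.7)**2 + gamma*(T_in - 24)**2).

    w_fuel, w_soc and w_temp stand for zeta, psi and gamma; their default
    values here are placeholders, not values from the patent.
    """
    return -(w_fuel * fuel
             + w_soc * (soc - soc_ref) ** 2
             + w_temp * (t_in - t_ref) ** 2)

# A step with some hydrogen use, SOC slightly low and cabin 2 degrees too warm.
r = reward(fuel=0.3, soc=0.65, t_in=26.0)
```

Raising the temperature weight relative to the fuel weight shifts the learned policy toward comfort at the cost of hydrogen, which is how the trade-off described above is tuned.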
Further, in step S3, solving the multi-objective optimization problem comprising hydrogen economy and cabin temperature comfort with the SAC algorithm specifically includes the following steps:
S311: The multi-objective optimization problem in energy management is solved with the SAC algorithm. SAC introduces the entropy of the action distribution so that the action outputs are more dispersed, which improves the exploration capability, the ability to learn new tasks and the stability of the algorithm. The entropy is expressed as:
$H(\pi(\cdot \mid s_t)) = -\log \pi(\cdot \mid s_t)$
where $H$ is the entropy of the policy $\pi(\cdot \mid s_t)$.
S312: during the solving process, the actor network in the agent is in the state s t As input, the mean and variance of the Gaussian distribution of the motion is output, and the motion a is generated by utilizing a re-parameterization technology t
Figure BDA0003929529130000042
wherein ,τt Representing noise signals sampled from a standard normal distribution;
Figure BDA0003929529130000043
representing the mean and variance of the function output; />
Figure BDA0003929529130000044
and />
Figure BDA0003929529130000045
Mean and variance of the gaussian distribution are shown, respectively.
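The reparameterization step can be illustrated with a small sketch: the action is a deterministic function of the network outputs and an independent standard normal noise sample. The exact expression in the patent is an image placeholder, so the familiar form of mean plus standard deviation times noise is assumed; the function name and the two-element action layout are illustrative.

```python
import math
import random

def sample_action(mean, variance, rng=random):
    """a_t = mean + sqrt(variance) * tau, with tau ~ N(0, 1) per dimension.

    Because the randomness enters only through tau, gradients can flow
    through mean and variance when this is implemented in an autodiff
    framework (this plain-Python version only shows the sampling itself).
    """
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0)
            for m, v in zip(mean, variance)]

# Two action dimensions, e.g. the fuel cell power change and the
# cooling/heating capacity change (illustrative numbers).
action = sample_action([0.5, -0.2], [0.04, 0.01], rng=random.Random(0))
```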
S313: executing action a t Thereafter, the vehicle environment feeds back the reward r to the agent t And transitions to the next state s t+1 Can generate the interactive data { s } of the environment and the intelligent agent t ,a t ,r t ,s t+1 And store in experience pool
Figure BDA0003929529130000051
Is a kind of medium.
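An experience pool of this kind is often implemented as a fixed-capacity buffer with uniform random sampling. The sketch below is one minimal way to do it; the class name, the capacity and the tuple layout are illustrative choices rather than details from the patent.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size pool of (s_t, a_t, r_t, s_next) transitions."""

    def __init__(self, capacity, seed=None):
        self._data = deque(maxlen=capacity)   # oldest entries drop out first
        self._rng = random.Random(seed)

    def store(self, s, a, r, s_next):
        self._data.append((s, a, r, s_next))

    def sample(self, batch_size):
        """Draw a mini-batch uniformly at random, without replacement."""
        return self._rng.sample(list(self._data), batch_size)

    def __len__(self):
        return len(self._data)

# Store five transitions in a pool that holds three, then draw a mini-batch.
pool = ReplayBuffer(capacity=3, seed=0)
for k in range(5):
    pool.store(s=k, a=0.0, r=-1.0, s_next=k + 1)
batch = pool.sample(2)
```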
S314: randomly extracting small-batch experience samples from an experience pool, and introducing parameters theta to avoid overestimation when maximizing action state function values and further overestimation when calculating targets by utilizing own network 12 Is evaluated by a critics network and has a parameter of theta' 1 ,θ′ 2 Selecting a target critic network which outputs a smaller action state function value as a target value; for a specific state s t And action a t Soft constraint action value function Q in SAC algorithm soft (s t ,a t ) The update formula is as follows:
Figure BDA0003929529130000052
wherein r represents a reward earned by the vehicle; gamma represents a discount factor; alpha represents a temperature coefficient.
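The update formula itself is an image placeholder. In the standard SAC formulation, which is assumed here rather than transcribed from the patent, the target takes the smaller of the two target-critic values and subtracts the temperature-weighted log-probability of the next action. The function name and default values below are illustrative.

```python
def soft_q_target(r, q1_next, q2_next, logp_next, gamma=0.99, alpha=0.2):
    """Clipped double-Q soft target (standard SAC form, assumed):

        y = r + gamma * (min(Q1', Q2') - alpha * log pi(a'|s'))

    q1_next and q2_next are the two target-critic outputs for the next
    state and an action sampled from the current policy; logp_next is that
    action's log-probability.
    """
    soft_value = min(q1_next, q2_next) - alpha * logp_next
    return r + gamma * soft_value

# Example: the second target critic is the more pessimistic one.
y = soft_q_target(r=1.0, q1_next=10.0, q2_next=8.0, logp_next=-1.0,
                  gamma=0.5, alpha=0.5)
```

Note that the discount factor in this step shares its symbol with the cabin temperature weight in the reward function; the two are separate quantities.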
S315: when updating the policy network, the policy network is updated by minimizing the loss function L (θ i ) Updating an evaluation critic network, the loss function being defined as
Figure BDA0003929529130000053
And->
Figure BDA0003929529130000054
The mean square error between them is expressed as:
Figure BDA0003929529130000055
Figure BDA0003929529130000056
wherein ,
Figure BDA0003929529130000057
represented as evaluation critics network parameter θ i Evaluation function at time, and->
Figure BDA0003929529130000058
The network parameters of the critics with the table as the target are theta' i Evaluation function at the time.
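As a concrete reading of "mean squared error", the critic loss over a mini-batch can be sketched as below. The exact expression in the patent is an image placeholder; some formulations include an extra factor of one half, which is omitted here. The function name is illustrative.

```python
def critic_loss(q_values, targets):
    """Mean squared error between critic outputs and their target values."""
    if len(q_values) != len(targets):
        raise ValueError("q_values and targets must have the same length")
    return sum((q - y) ** 2 for q, y in zip(q_values, targets)) / len(q_values)

# Two samples with errors of 1 and 2 give a loss of (1 + 4) / 2.
loss = critic_loss([1.0, 3.0], [0.0, 1.0])
```

In training, this loss would be evaluated separately for each of the two evaluation critic networks against the same target values.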
S316: the updating of actor network parameters is realized by minimizing KL divergence, and the smaller the KL value is, the smaller the difference between rewards corresponding to the output actions is, the better the convergence effect of the strategy is; objective function of actor network
Figure BDA0003929529130000059
The definition is as follows:
Figure BDA00039295291300000510
wherein ,DKL Representing a KL divergence calculation expression; z(s) t ) Is a distribution function for normalizing the distribution;
Figure BDA00039295291300000511
representing the state s of the vehicle at the current time t Executing action a t Mathematical expectation function of time +.>
Figure BDA00039295291300000512
Representing the current state as s t Policy function at time->
Figure BDA00039295291300000513
Parameters expressed as policy functions.
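The actor objective is an image placeholder in this text. For orientation, the standard SAC form of such an objective is given below; it is assumed, not transcribed, and the parameter symbol $\phi$ and the pool symbol $\mathcal{D}$ are stand-ins for the symbols hidden in the images.

```latex
J_\pi(\phi) = \mathbb{E}_{s_t \sim \mathcal{D}}\left[
  D_{KL}\!\left( \pi_\phi(\cdot \mid s_t) \,\Big\|\,
  \frac{\exp\!\left(\tfrac{1}{\alpha} Q_{soft}(s_t, \cdot)\right)}{Z(s_t)} \right)
\right]
```

Expanding the KL divergence and dropping the term in $Z(s_t)$, which does not depend on $\phi$, gives the equivalent objective that is minimized in practice:

```latex
J_\pi(\phi) = \mathbb{E}_{s_t \sim \mathcal{D},\; a_t \sim \pi_\phi}\left[
  \alpha \log \pi_\phi(a_t \mid s_t) - Q_{soft}(s_t, a_t)
\right]
```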
S317: updating actor network parameters according to a gradient descent method, wherein the actor network parameters are expressed as follows:
Figure BDA00039295291300000514
wherein ,
Figure BDA00039295291300000515
expressed as about policy function parameters->
Figure BDA00039295291300000516
Gradient of decline of->
Figure BDA00039295291300000517
Represented as action a being performed in relation to the current time t t Is a gradient of the decline of (c).
S318: in the SAC algorithm system, the adjustment of the temperature coefficient alpha is critical to the training effect of the SAC algorithm, and the optimal temperature coefficient is different in value in different reinforcement learning tasks and training periods. In order to realize automatic adjustment of the temperature coefficient, the optimal temperature coefficient of each step can be updated by solving the minimum value of the objective function in the optimization problem, and the objective function is expressed as:
Figure BDA0003929529130000061
wherein ,H0 A threshold representing a predefined minimum policy entropy,
Figure BDA0003929529130000062
represented as a function of policy pi t Executing action a t Mathematical expectation function of time, pi t (a t |s t ) Expressed as a policy function, s t Is expressed as the state of the fuel cell automobile at the current time t, a t Then expressed as the current time t-hour basis policyThe action performed by the thumbnail function.
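The temperature objective is an image placeholder. The standard SAC form, assumed here, is $J(\alpha) = \mathbb{E}\left[-\alpha \log \pi_t(a_t \mid s_t) - \alpha H_0\right]$, whose gradient with respect to $\alpha$ is the batch mean of $-\log \pi_t(a_t \mid s_t) - H_0$. The sketch below applies one plain gradient-descent step; the function names, the learning rate and the clipping at zero are illustrative simplifications (practical implementations often optimize the logarithm of the temperature instead).

```python
def alpha_loss(alpha, log_probs, h0):
    """J(alpha) = mean(-alpha * log_pi - alpha * H0) over a mini-batch."""
    return sum(-alpha * lp - alpha * h0 for lp in log_probs) / len(log_probs)

def alpha_step(alpha, log_probs, h0, lr=1e-3):
    """One gradient-descent step on J(alpha), kept non-negative."""
    grad = sum(-lp - h0 for lp in log_probs) / len(log_probs)
    return max(alpha - lr * grad, 0.0)

# Policy entropy (about -1) is above the target H0 = -2: alpha decreases.
alpha_down = alpha_step(0.2, [1.0, 1.0], h0=-2.0, lr=0.1)
# Policy entropy (about -3) is below the target: alpha increases.
alpha_up = alpha_step(0.2, [3.0, 3.0], h0=-2.0, lr=0.1)
```

The direction of the update is the useful part: when the policy is less random than the target entropy allows, the temperature rises and pushes the actor toward more exploration, and vice versa.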
The beneficial effects of the invention are as follows:
1) The invention designs an energy management strategy based on the soft actor-critic (SAC) algorithm, which reduces the dependence of traditional deep reinforcement learning algorithms on training data and hyperparameter settings in fuel cell vehicle energy management, and helps improve the stability of control tasks in a continuous action space.
2) Whereas changes in the energy consumption of the air conditioning system are usually ignored when the energy management problem of a fuel cell vehicle is formulated, the invention takes hydrogen consumption, SOC maintenance and cabin temperature comfort as optimization targets, builds a cooperative energy management optimization control framework that takes the air conditioning system into account, and realizes cooperative control of energy management and the air conditioning system.
Additional advantages, objects and features of the invention will be set forth in part in the description that follows, and in part will become apparent to those of ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the description that follows.
Drawings
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in detail below with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of the fuel cell vehicle cooperative energy management method of the present invention;
FIG. 2 is a schematic diagram of the fuel cell vehicle multi-power-source system;
FIG. 3 is a schematic diagram of the cabin thermal load model and the air conditioning system;
FIG. 4 is a framework diagram of the cooperative energy management considering the air conditioning system, constructed by applying the SAC algorithm.
Detailed Description
The embodiments of the present invention are described below with reference to specific examples; other advantages and effects of the invention will become apparent to those skilled in the art from this disclosure. The invention may also be practiced or applied in other embodiments, and the details of this description may be modified or varied without departing from the spirit and scope of the invention. It should be noted that the illustrations provided in the following embodiments only illustrate the basic idea of the invention, and that the following embodiments and the features in them may be combined with each other where no conflict arises.
Referring to FIG. 1 to FIG. 4, the invention designs a cooperative energy management optimization method for a fuel cell vehicle that considers the air conditioning system and is based on the soft actor-critic (SAC) algorithm. Because changes in the energy consumption of the air conditioning system are usually ignored in fuel cell vehicle energy management, the main factors influencing cabin temperature comfort are analyzed, and an air conditioning system model and a cabin thermal load model are established. With hydrogen consumption, SOC maintenance and cabin temperature as optimization targets, a cooperative energy management optimization control framework that takes the air conditioning system into account is built by applying the SAC algorithm, which is suited to control tasks in a continuous action space. This realizes cooperative control of energy management and the air conditioning system, and optimizes the hydrogen economy and cabin temperature comfort of the fuel cell vehicle. As shown in FIG. 1, the cooperative energy management optimization method specifically includes the following steps:
s1: the method for acquiring the key parameter information of the fuel cell automobile comprises the following steps:
the vehicle state parameter information includes: vehicle speed, cabin thermal load parameters, motor operating efficiency and transmission system characteristic parameters;
the fuel cell parameter information includes: power, efficiency, and hydrogen energy consumption of the fuel cell;
the power battery parameter information includes: the state of charge, internal resistance and open circuit voltage of the power battery;
the air conditioning system parameter information includes: air conditioning system cooling capacity/heating capacity and corresponding power.
S2: the method for establishing the collaborative energy management model of the fuel cell automobile comprises the following specific steps of:
s21: establishing a longitudinal dynamics model of the whole vehicle:
P drive =(F air +F f +F i +m 0 a)·v
Figure BDA0003929529130000071
P dem =P b +P fc ·η DC/DC
wherein ,m0 Representing the quality of the whole vehicle; v is the speed of the whole vehicle; a represents vehicle acceleration; f (F) air Expressed as air resistance; f (F) f Expressed as rolling resistance; f (F) i Expressed as acceleration resistance; η (eta) m 、η DC/AC 、η DC/DC η motor Respectively representing transmission efficiency, DC/AC converter efficiency, DC/DC converter efficiency and motor efficiency; p (P) drive 、P dem 、P b P fc Respectively representing the driving power at the wheels of the vehicle, the required power, the battery output power, and the fuel cell output power.
S22: and (3) establishing a fuel cell model:
η fc =f η (P fc )
Figure BDA0003929529130000072
wherein ,fη(·) and
Figure BDA0003929529130000073
Expressed as a fitted function of efficiency and hydrogen energy consumption, respectively, the efficiency and hydrogen consumption can be calculated by interpolation.
S23: and (3) establishing a power battery model:
Figure BDA0003929529130000074
Figure BDA0003929529130000081
wherein ,IL Expressed as power cell current; v (V) oc Expressed as power cell open circuit voltage; r is R in Expressed as the equivalent internal resistance of the power battery; SOC (State of Charge) 0 Denoted as initial SOC; q (Q) t Expressed as power cell maximum capacity; t is t 0 Denoted as initial time; t is t f Represented as the final time.
S24: establishing a motor model:
η m =f mm ,T m )
Figure BDA0003929529130000082
wherein ,ωm and Tm Respectively representing the motor rotation speed and the motor torque; p (P) m Expressed as motor output power, f m (. Cndot.) is expressed as a fitting function of the motor operating efficiency, which can be obtained by interpolation.
S25: establishing an air conditioning system model:
Figure BDA0003929529130000083
wherein ,Qac Expressed as cooling capacity or heating capacity of the air conditioning system; p (P) ac Expressed as the corresponding power consumption of the air conditioning system; η (eta) cop Expressed as an air conditioning system coefficient of performance.
S26: and (3) building a thermal load model of the vehicle cabin:
Q c =∑KF(T out -T in )
Figure BDA0003929529130000084
Q h =145+116n
Q n =m e ξCp air (T out -T in )
Figure BDA0003929529130000085
wherein ,Qc 、Q r 、Q h and Qn respectively, heat conduction load, radiant heat load, in-vehicle occupant generated heat (empirically, the driver generated heat is about 145W, each occupant generates about 116W), and ventilation system heat load; k is expressed as a heat transfer coefficient; f represents the heat transfer area of the corresponding housing; t (T) out Expressed as ambient temperature; t (T) in Expressed as cabin air temperature; η is expressed as permeability; i is expressed as the intensity of sunlight; a is that i Represented as windshield, left and right side windows, and rear window area; θ i Expressed as the incident angle of sunlight; beta is denoted as a shading factor; n represents the number of passengers in the vehicle; m is m e Represented as the mass of air passing through the evaporator; ζ is the air recirculation coefficient; cp air Expressed as indoor air heat capacity; ρ air and Vair Respectively as the air density in the cabin and the cabin volume.
S3: a fuel cell automobile collaborative energy management optimization control framework considering an air conditioning system is established based on a SAC algorithm, and a multi-objective optimization problem comprising hydrogen economy and cabin temperature comfort is solved. As shown in fig. 3, the collaborative control of energy management and an air conditioning system is realized by applying a soft constraint actor commentator algorithm, so that the hydrogen economy and cabin temperature comfort of the fuel cell automobile are optimized, specifically:
s301: to reflect the key environment information, the power battery SOC and the fuel battery output power P fc Vehicle speed v, air conditioning cooling/heating capacity Q ac Set as state variables, build state space, can be expressed as:
S={SOC,P fc ,v,Q ac }
s302: taking into account the coordinated energy management of the air conditioning system not only allocates power to the power source, but also maintains the thermal comfort of the cabin temperature according to the change of the refrigerating/heating capacity of the air conditioning system, and thus, the fuel is electrically chargedPool output power variation
Figure BDA0003929529130000091
And the amount of change in the refrigerating/heating capacity of the air conditioning system +.>
Figure BDA0003929529130000092
Set as action variables, build action space, can be expressed as:
Figure BDA0003929529130000093
/>
S303: To ensure cabin temperature comfort, the cabin temperature is held at about 24 °C, and a term penalizing cabin temperature deviation is included in the reward function. The reward function is therefore a weighted sum of three terms, namely hydrogen consumption, SOC deviation, and cabin temperature deviation:
$$R = -\left(\zeta \cdot fuel(t) + \psi \cdot (SOC(t) - 0.7)^2 + \gamma \cdot (T_{in} - 24)^2\right)$$
where $\zeta$, $\psi$ and $\gamma$ are the weight factors of the optimization terms; adjusting them sets the trade-off between hydrogen consumption and cabin temperature comfort, which is how the multi-objective optimization problem is handled; $fuel(t)$ is the hydrogen consumption at the current time; and $SOC(t)$ is the state of charge of the power battery at the current time.
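The reward above translates directly into code. In the sketch below the parameter names and the default weights of 1.0 are placeholders chosen for illustration; the patent does not state the weight values.

```python
def reward(fuel, soc, t_in, w_fuel=1.0, w_soc=1.0, w_temp=1.0,
           soc_ref=0.7, t_ref=24.0):
    """R = -(zeta*fuel + psi*(SOC - 0.7)^2 + gamma*(T_in - 24)^2).

    w_fuel, w_soc and w_temp stand for zeta, psi and gamma; their default
    values here are placeholders, not values from the patent.
    """
    return -(w_fuel * fuel
             + w_soc * (soc - soc_ref) ** 2
             + w_temp * (t_in - t_ref) ** 2)
```

Raising `w_temp` relative to `w_fuel` makes the agent favour cabin comfort over hydrogen economy, and vice versa.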
S304: The multi-objective optimization problem in energy management is solved with the SAC algorithm. SAC introduces an action entropy term that makes the action output more dispersed, which improves the algorithm's exploration capability, its ability to learn new tasks, and its stability. The entropy is expressed as:
$$H(\pi(\cdot|s_t)) = -\log \pi(\cdot|s_t)$$
where $H$ is the entropy of the policy $\pi(\cdot|s_t)$.
S305: During the solving process, the actor network in the agent takes the state $s_t$ as input and outputs the mean and variance of the Gaussian distribution of the action; the action $a_t$ is then generated with the reparameterization technique:
$$a_t = f_\phi(\tau_t; s_t) = f_\phi^{\mu}(s_t) + \tau_t \odot f_\phi^{\sigma}(s_t)$$
where $\tau_t$ is a noise signal sampled from a standard normal distribution; $f_\phi(\cdot)$ is the function that outputs the mean and variance; and $f_\phi^{\mu}(s_t)$ and $f_\phi^{\sigma}(s_t)$ are the mean and variance of the Gaussian distribution, respectively.
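The reparameterization step can be illustrated with the standard library alone. In this sketch the mean and variance are passed in as plain lists standing in for the actor network output, and the noise is scaled by the square root of the variance; both choices are assumptions made for the example.

```python
import math
import random


def sample_action(mean, variance, rng=random):
    """Reparameterized Gaussian sample: a = mean + sqrt(variance) * tau.

    tau is drawn from N(0, 1) independently of the network output, so the
    action is a deterministic, differentiable function of mean and variance.
    """
    action = []
    for mu, var in zip(mean, variance):
        tau = rng.gauss(0.0, 1.0)
        action.append(mu + math.sqrt(var) * tau)
    return action


# Two-dimensional action: (change in P_fc, change in Q_ac), illustrative numbers
a_t = sample_action([500.0, -100.0], [100.0 ** 2, 50.0 ** 2], random.Random(0))
```

Because the randomness enters only through `tau`, gradients of a loss with respect to the mean and variance can flow through the sampled action.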
S306: After action $a_t$ is executed, the vehicle environment feeds back the reward $r_t$ to the agent and transitions to the next state $s_{t+1}$, generating the environment-agent interaction data $\{s_t, a_t, r_t, s_{t+1}\}$, which are stored in the experience pool $\mathcal{D}$.
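A minimal experience pool could look like the following sketch. The class name, the fixed capacity with first-in-first-out eviction, and uniform random sampling are assumptions; the text only states that transitions are stored and later sampled in small batches.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size experience pool holding (s_t, a_t, r_t, s_next) tuples."""

    def __init__(self, capacity, seed=None):
        self._data = deque(maxlen=capacity)  # oldest transitions are evicted first
        self._rng = random.Random(seed)

    def store(self, s, a, r, s_next):
        self._data.append((s, a, r, s_next))

    def sample(self, batch_size):
        """Draw a mini-batch uniformly at random, without replacement."""
        return self._rng.sample(list(self._data), batch_size)

    def __len__(self):
        return len(self._data)
```

Sampling uniformly from a large pool breaks the temporal correlation between consecutive driving-cycle steps before the networks are updated.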
S307: Small batches of experience samples are drawn at random from the experience pool. To avoid overestimation when maximizing the action-value function, and further overestimation when a network computes targets from its own estimates, two evaluation critic networks with parameters $\theta_1, \theta_2$ and two target critic networks with parameters $\theta'_1, \theta'_2$ are introduced, and the smaller of the action-value outputs of the target critic networks is selected as the target value. For a specific state $s_t$ and action $a_t$, the soft action-value function $Q_{soft}(s_t, a_t)$ in the SAC algorithm is updated as:
$$Q_{soft}(s_t, a_t) = r + \gamma \, E_{s_{t+1}, a_{t+1}}\left[\min_{i=1,2} Q_{\theta'_i}(s_{t+1}, a_{t+1}) - \alpha \log \pi(a_{t+1}|s_{t+1})\right]$$
where $r$ is the reward earned by the vehicle; $\gamma$ is the discount factor; and $\alpha$ is the temperature coefficient.
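For a single sampled transition, the target value described above can be computed as in this sketch. It follows the usual clipped double-Q form of SAC; the function name and the scalar arguments (stand-ins for the outputs of the two target critics and the policy log-probability) are illustrative assumptions.

```python
def soft_q_target(r, gamma, alpha, q1_next, q2_next, log_pi_next):
    """Single-sample soft Q target.

    q1_next, q2_next: target-critic values Q_theta'_1 and Q_theta'_2 at
                      (s_next, a_next), with a_next drawn from the current policy
    log_pi_next:      log pi(a_next | s_next)
    Taking the minimum of the two critics counteracts overestimation.
    """
    return r + gamma * (min(q1_next, q2_next) - alpha * log_pi_next)
```

With `r = 1.0`, `gamma = 0.5`, `alpha = 0.2`, target values 4.0 and 3.0, and `log_pi_next = -1.0`, the smaller value 3.0 is used and the entropy bonus adds 0.2, giving a target of about 2.6.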
S308: When the networks are updated, the evaluation critic networks are updated by minimizing the loss function $L(\theta_i)$, defined as the mean square error between $Q_{\theta_i}(s_t, a_t)$ and $Q_{soft}(s_t, a_t)$:
$$L(\theta_i) = E_{(s_t, a_t, r_t, s_{t+1}) \sim \mathcal{D}}\left[\left(Q_{\theta_i}(s_t, a_t) - Q_{soft}(s_t, a_t)\right)^2\right], \quad i = 1, 2$$
$$Q_{soft}(s_t, a_t) = r + \gamma \, E_{s_{t+1}, a_{t+1}}\left[\min_{i=1,2} Q_{\theta'_i}(s_{t+1}, a_{t+1}) - \alpha \log \pi(a_{t+1}|s_{t+1})\right]$$
where $Q_{\theta_i}(s_t, a_t)$ is the evaluation function when the evaluation critic network parameters are $\theta_i$, and $Q_{\theta'_i}(s_{t+1}, a_{t+1})$ is the evaluation function when the target critic network parameters are $\theta'_i$.
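Over a mini-batch, the mean-square-error loss of one evaluation critic reduces to a short function. This sketch assumes the predictions and the target values have already been computed and are passed in as lists of floats; the names are illustrative.

```python
def critic_loss(q_pred, q_target):
    """Mean squared error between critic predictions and soft Q targets.

    q_pred:   Q_theta_i(s_t, a_t) for each transition in the mini-batch
    q_target: the corresponding target values, treated as constants
    """
    if len(q_pred) != len(q_target):
        raise ValueError("prediction and target batches differ in size")
    return sum((q - y) ** 2 for q, y in zip(q_pred, q_target)) / len(q_pred)
```

The same function would be evaluated once per evaluation critic, each against the shared target built from the smaller of the two target-critic outputs.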
S309: The actor network parameters are updated by minimizing the KL divergence; the smaller the KL divergence, the smaller the difference between the rewards corresponding to the output actions and the better the policy converges. The objective function of the actor network $J_\pi(\phi)$ is defined as:
$$J_\pi(\phi) = E_{s_t \sim \mathcal{D}}\left[D_{KL}\left(\pi_\phi(\cdot|s_t) \,\middle\|\, \frac{\exp\left(\frac{1}{\alpha} Q_\theta(s_t, \cdot)\right)}{Z(s_t)}\right)\right] = E_{s_t \sim \mathcal{D},\, a_t \sim \pi_\phi}\left[\log \pi_\phi(a_t|s_t) - \frac{1}{\alpha} Q_\theta(s_t, a_t) + \log Z(s_t)\right]$$
where $D_{KL}$ denotes the KL divergence; $Z(s_t)$ is the partition function used to normalize the distribution; $E_{s_t \sim \mathcal{D},\, a_t \sim \pi_\phi}$ is the mathematical expectation when the vehicle, in state $s_t$ at the current time, executes action $a_t$; $\pi_\phi(\cdot|s_t)$ is the policy function in the current state $s_t$; and $\phi$ denotes the parameters of the policy function.
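In practice the KL objective is usually estimated from sampled actions. The sketch below evaluates the common sample form, the mean of alpha * log pi(a|s) - Q(s, a), which equals the KL objective scaled by alpha with the policy-independent normalization term dropped. The diagonal-Gaussian log-density helper and all names are assumptions made for illustration.

```python
import math


def gaussian_log_prob(action, mean, std):
    """Log-density of a diagonal Gaussian policy at the given action."""
    return sum(
        -0.5 * ((a - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        for a, m, s in zip(action, mean, std)
    )


def actor_loss(log_probs, q_values, alpha):
    """Mini-batch estimate of E[alpha * log pi(a|s) - Q(s, a)].

    Minimizing it pushes the policy toward actions with high Q-value
    while keeping its entropy high.
    """
    return sum(alpha * lp - q for lp, q in zip(log_probs, q_values)) / len(log_probs)
```

A framework with automatic differentiation would be needed to obtain the gradient of this loss with respect to the actor parameters; the sketch only evaluates the loss value.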
S310: The actor network parameters are updated by gradient descent, using the gradient:
$$\hat{\nabla}_\phi J_\pi(\phi) = \nabla_\phi \log \pi_\phi(a_t|s_t) + \left(\nabla_{a_t} \log \pi_\phi(a_t|s_t) - \frac{1}{\alpha} \nabla_{a_t} Q_\theta(s_t, a_t)\right) \nabla_\phi f_\phi(\tau_t; s_t)$$
where $\nabla_\phi$ is the descent gradient with respect to the policy function parameters $\phi$; $\nabla_{a_t}$ is the descent gradient with respect to the action $a_t$ executed at the current time $t$; and $a_t = f_\phi(\tau_t; s_t)$ is the reparameterized action.
S311: In the SAC algorithm, tuning the temperature coefficient $\alpha$ is critical to training performance, and the optimal temperature coefficient differs across reinforcement learning tasks and training stages. To adjust the temperature coefficient automatically, the optimal temperature coefficient at each step is updated by minimizing the objective function of the following optimization problem:
$$J(\alpha) = E_{a_t \sim \pi_t}\left[-\alpha \log \pi_t(a_t|s_t) - \alpha H_0\right]$$
where $H_0$ is a predefined threshold for the minimum policy entropy; $E_{a_t \sim \pi_t}$ is the mathematical expectation when action $a_t$ is executed according to the policy $\pi_t$; $\pi_t(a_t|s_t)$ is the policy function; $s_t$ is the state of the fuel cell vehicle at the current time $t$; and $a_t$ is the action executed according to the policy function at the current time $t$.
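The automatic temperature adjustment can be sketched as one gradient-descent step on the objective J(alpha) = E[-alpha * log pi(a|s) - alpha * H_0]. The learning rate, the lower bound `alpha_min`, and the choice to update alpha directly (implementations often optimize log alpha instead) are assumptions made for this example.

```python
def temperature_loss(alpha, log_probs, target_entropy):
    """Mini-batch estimate of J(alpha) = E[-alpha * (log pi + H_0)]."""
    return sum(-alpha * (lp + target_entropy) for lp in log_probs) / len(log_probs)


def update_temperature(alpha, log_probs, target_entropy, lr=1e-3, alpha_min=1e-6):
    """One gradient-descent step on J(alpha).

    dJ/dalpha = E[-(log pi + H_0)]: if the policy entropy (-E[log pi]) is
    above H_0 the gradient is positive and alpha shrinks; if the entropy
    is below H_0, alpha grows and exploration is encouraged.
    """
    grad = sum(-(lp + target_entropy) for lp in log_probs) / len(log_probs)
    return max(alpha_min, alpha - lr * grad)
```

For example, with `log_probs = [-3.0, -1.0]` the estimated entropy is 2.0, which exceeds a target of 1.0, so a step with `lr = 0.1` reduces `alpha` from 0.2 to about 0.1.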
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present invention and do not limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications and equivalent substitutions may be made to the technical solution without departing from its spirit and scope, all of which are intended to be covered by the claims of the present invention.

Claims (8)

1. A fuel cell vehicle learning type cooperative energy management method considering an air conditioning system, which is characterized by comprising the following steps:
S1: acquiring vehicle state parameter information, fuel cell parameter information, power battery parameter information and air conditioning system parameter information of a fuel cell vehicle;
S2: establishing a fuel cell vehicle collaborative energy management model, comprising: a vehicle longitudinal dynamics model, a fuel cell model, a power battery model, a motor model, an air conditioning system model and a cabin thermal load model;
S3: establishing a collaborative energy management optimization control strategy for the fuel cell vehicle that considers the air conditioning system, solving the multi-objective optimization problem comprising hydrogen economy and cabin temperature comfort with the SAC algorithm, and controlling the change in the cooling/heating capacity of the air conditioner so that the cabin temperature is maintained in the comfort zone while the energy flow is optimally controlled; the SAC algorithm being the soft actor-critic algorithm; wherein establishing the collaborative energy management optimization control strategy for the fuel cell vehicle that considers the air conditioning system specifically comprises the following steps:
S301: determining a state space: the power battery SOC, the fuel cell output power $P_{fc}$, the vehicle speed $v$, and the cooling/heating capacity $Q_{ac}$ of the air conditioning system are set as state variables to construct the state space $S$, expressed as:
$$S = \{SOC, P_{fc}, v, Q_{ac}\}$$
S302: determining an action space: the change in fuel cell output power $\nabla P_{fc}$ and the change in the cooling/heating capacity of the air conditioning system $\nabla Q_{ac}$ are set as action variables to construct the action space $A$, expressed as:
$$A = \{\nabla P_{fc}, \nabla Q_{ac}\}$$
S303: establishing a reward function: the reward function $R$ is set as a weighted sum of three terms, namely hydrogen consumption, SOC deviation and cabin temperature deviation, expressed as:
$$R = -\left(\zeta \cdot fuel(t) + \psi \cdot (SOC(t) - 0.7)^2 + \gamma \cdot (T_{in} - 24)^2\right)$$
wherein $\zeta$, $\psi$ and $\gamma$ are the weight factors of the optimization terms, and the trade-off between hydrogen consumption and cabin temperature comfort is set by adjusting the weight factors, thereby solving the multi-objective optimization problem; $fuel(t)$ represents the hydrogen consumption at the current time; $SOC(t)$ represents the state of charge of the power battery at the current time; $T_{in}$ represents the cabin air temperature;
solving the multi-objective optimization problem comprising hydrogen economy and cabin temperature comfort with the SAC algorithm specifically comprises the following steps:
S311: solving the multi-objective optimization problem in energy management with the SAC algorithm, wherein an action entropy term is introduced into the SAC algorithm to make the action output more dispersed, thereby improving the exploration capability, the new-task learning capability and the stability of the algorithm, the entropy being expressed as:
$$H(\pi(\cdot|s_t)) = -\log \pi(\cdot|s_t)$$
wherein $H$ is the entropy of the policy $\pi(\cdot|s_t)$;
S312: during the solving process, the actor network in the agent takes the state $s_t$ as input and outputs the mean and variance of the Gaussian distribution of the action, and the action $a_t$ is generated with the reparameterization technique:
$$a_t = f_\phi(\tau_t; s_t) = f_\phi^{\mu}(s_t) + \tau_t \odot f_\phi^{\sigma}(s_t)$$
wherein $\tau_t$ represents a noise signal sampled from a standard normal distribution; $f_\phi(\cdot)$ represents the function that outputs the mean and variance; $f_\phi^{\mu}(s_t)$ and $f_\phi^{\sigma}(s_t)$ represent the mean and variance of the Gaussian distribution, respectively;
S313: after action $a_t$ is executed, the vehicle environment feeds back the reward $r_t$ to the agent and transitions to the next state $s_{t+1}$, thereby generating the environment-agent interaction data $\{s_t, a_t, r_t, s_{t+1}\}$, which are stored in the experience pool $\mathcal{D}$;
S314: randomly drawing small batches of experience samples from the experience pool, introducing evaluation critic networks with parameters $\theta_1, \theta_2$ and target critic networks with parameters $\theta'_1, \theta'_2$, and selecting the smaller action-value output of the target critic networks as the target value; for a specific state $s_t$ and action $a_t$, the soft action-value function $Q_{soft}(s_t, a_t)$ in the SAC algorithm is updated as:
$$Q_{soft}(s_t, a_t) = r + \gamma \, E_{s_{t+1}, a_{t+1}}\left[\min_{i=1,2} Q_{\theta'_i}(s_{t+1}, a_{t+1}) - \alpha \log \pi(a_{t+1}|s_{t+1})\right]$$
wherein $r$ represents the reward earned by the vehicle; $\gamma$ represents the discount factor; $\alpha$ represents the temperature coefficient;
S315: when the networks are updated, the evaluation critic networks are updated by minimizing the loss function $L(\theta_i)$, the loss function being defined as the mean square error between $Q_{\theta_i}(s_t, a_t)$ and $Q_{soft}(s_t, a_t)$, expressed as:
$$L(\theta_i) = E_{(s_t, a_t, r_t, s_{t+1}) \sim \mathcal{D}}\left[\left(Q_{\theta_i}(s_t, a_t) - Q_{soft}(s_t, a_t)\right)^2\right], \quad i = 1, 2$$
$$Q_{soft}(s_t, a_t) = r + \gamma \, E_{s_{t+1}, a_{t+1}}\left[\min_{i=1,2} Q_{\theta'_i}(s_{t+1}, a_{t+1}) - \alpha \log \pi(a_{t+1}|s_{t+1})\right]$$
wherein $Q_{\theta_i}(s_t, a_t)$ represents the evaluation function when the evaluation critic network parameters are $\theta_i$; $Q_{\theta'_i}(s_{t+1}, a_{t+1})$ represents the evaluation function when the target critic network parameters are $\theta'_i$;
S316: the actor network parameters are updated by minimizing the KL divergence; the objective function of the actor network $J_\pi(\phi)$ is defined as:
$$J_\pi(\phi) = E_{s_t \sim \mathcal{D}}\left[D_{KL}\left(\pi_\phi(\cdot|s_t) \,\middle\|\, \frac{\exp\left(\frac{1}{\alpha} Q_\theta(s_t, \cdot)\right)}{Z(s_t)}\right)\right] = E_{s_t \sim \mathcal{D},\, a_t \sim \pi_\phi}\left[\log \pi_\phi(a_t|s_t) - \frac{1}{\alpha} Q_\theta(s_t, a_t) + \log Z(s_t)\right]$$
wherein $D_{KL}$ represents the KL divergence; $Z(s_t)$ is the partition function used to normalize the distribution; $E_{s_t \sim \mathcal{D},\, a_t \sim \pi_\phi}$ represents the mathematical expectation when the vehicle, in state $s_t$ at the current time, executes action $a_t$; $\pi_\phi(\cdot|s_t)$ represents the policy function in the current state $s_t$; $\phi$ represents the parameters of the policy function;
S317: updating the actor network parameters by gradient descent, using the gradient:
$$\hat{\nabla}_\phi J_\pi(\phi) = \nabla_\phi \log \pi_\phi(a_t|s_t) + \left(\nabla_{a_t} \log \pi_\phi(a_t|s_t) - \frac{1}{\alpha} \nabla_{a_t} Q_\theta(s_t, a_t)\right) \nabla_\phi f_\phi(\tau_t; s_t)$$
wherein $\nabla_\phi$ represents the descent gradient with respect to the policy function parameters $\phi$; $\nabla_{a_t}$ represents the descent gradient with respect to the action $a_t$ executed at the current time $t$; $a_t = f_\phi(\tau_t; s_t)$ is the reparameterized action;
S318: the optimal temperature coefficient at each step is updated by minimizing the objective function of the optimization problem, the objective function being expressed as:
$$J(\alpha) = E_{a_t \sim \pi_t}\left[-\alpha \log \pi_t(a_t|s_t) - \alpha H_0\right]$$
wherein $H_0$ represents a predefined threshold for the minimum policy entropy; $E_{a_t \sim \pi_t}$ represents the mathematical expectation when action $a_t$ is executed according to the policy $\pi_t$; $\pi_t(a_t|s_t)$ represents the policy function; $s_t$ represents the state of the fuel cell vehicle at the current time $t$; and $a_t$ represents the action executed according to the policy function at the current time $t$.
2. The fuel cell vehicle learning collaborative energy management method according to claim 1, wherein in step S1, the vehicle state parameter information includes: vehicle speed, cabin thermal load parameters, motor operating efficiency and transmission system characteristic parameters; the fuel cell parameter information includes: power, efficiency, and hydrogen energy consumption of the fuel cell; the power battery parameter information includes: the state of charge, internal resistance and open circuit voltage of the power battery; the air conditioning system parameter information includes: air conditioning system cooling capacity/heating capacity and corresponding power.
3. The fuel cell vehicle learning collaborative energy management method according to claim 1, wherein in step S2, a vehicle longitudinal dynamics model is established as follows:
$$P_{drive} = (F_{air} + F_f + F_i + m_0 a) \cdot v$$
Figure FDA0004128743520000037
$$P_{dem} = P_b + P_{fc} \cdot \eta_{DC/DC}$$
wherein $m_0$ represents the vehicle mass; $v$ is the vehicle speed; $a$ represents the vehicle acceleration; $F_{air}$ represents the air resistance; $F_f$ represents the rolling resistance; $F_i$ represents the acceleration resistance; $\eta_m$, $\eta_{DC/AC}$, $\eta_{DC/DC}$ and $\eta_{motor}$ represent the transmission efficiency, the DC/AC converter efficiency, the DC/DC converter efficiency and the motor efficiency, respectively; $P_{drive}$, $P_{dem}$, $P_b$ and $P_{fc}$ represent the driving power at the wheels of the vehicle, the required power, the battery output power and the fuel cell output power, respectively.
4. The fuel cell vehicle learning collaborative energy management method according to claim 3 wherein in step S2, a fuel cell model is established as:
$$\eta_{fc} = f_\eta(P_{fc})$$
Figure FDA0004128743520000038
wherein ,fη(·) and
Figure FDA0004128743520000039
Expressed as a fitted function of efficiency and hydrogen energy consumption, respectively, the efficiency and hydrogen consumption were calculated by interpolation.
5. The fuel cell vehicle learning collaborative energy management method according to claim 3, wherein in step S2, a power cell model is established as follows:
$$I_L = \frac{V_{oc} - \sqrt{V_{oc}^2 - 4 R_{in} P_b}}{2 R_{in}}$$
$$SOC = SOC_0 - \frac{\int_{t_0}^{t_f} I_L \, dt}{Q_t}$$
wherein $I_L$ represents the power battery current; $V_{oc}$ represents the power battery open-circuit voltage; $R_{in}$ represents the equivalent internal resistance of the power battery; $SOC_0$ represents the initial SOC; $Q_t$ represents the maximum capacity of the power battery; $t_0$ represents the initial time; $t_f$ represents the final time.
6. The fuel cell vehicle learning collaborative energy management method according to claim 3, wherein in step S2, a motor model is built as follows:
$$\eta_m = f_m(\omega_m, T_m)$$
Figure FDA0004128743520000043
wherein $\omega_m$ and $T_m$ represent the motor speed and the motor torque, respectively; $P_m$ represents the motor output power; $f_m(\cdot)$ represents the fitted function of the motor operating efficiency, which is obtained by interpolation.
7. The fuel cell vehicle learning collaborative energy management method according to claim 1, wherein in step S2, an air conditioning system model is established as follows:
$$P_{ac} = \frac{Q_{ac}}{\eta_{cop}}$$
wherein $Q_{ac}$ represents the cooling capacity or heating capacity of the air conditioning system; $P_{ac}$ represents the corresponding power consumption of the air conditioning system; $\eta_{cop}$ represents the coefficient of performance of the air conditioning system.
8. The fuel cell vehicle learning collaborative energy management method according to claim 1, wherein in step S2, a vehicle cabin thermal load model is established as follows:
$$Q_c = \sum K F (T_{out} - T_{in})$$
Figure FDA0004128743520000045
$$Q_h = 145 + 116 n$$
$$Q_n = m_e \xi Cp_{air} (T_{out} - T_{in})$$
Figure FDA0004128743520000051
wherein $Q_c$, $Q_r$, $Q_h$ and $Q_n$ represent the heat conduction load, the radiant heat load, the heat generated by the occupants and the heat load of the ventilation system, respectively; $K$ represents the heat transfer coefficient; $F$ represents the heat transfer area of the corresponding shell element; $T_{out}$ represents the ambient temperature; $T_{in}$ represents the cabin air temperature; $\eta$ represents the permeability; $I$ represents the sunlight intensity; $A_i$ represents the area of the windshield, the left and right side windows, and the rear window; $\theta_i$ represents the incident angle of sunlight; $\beta$ represents the shading factor; $n$ represents the number of passengers in the vehicle; $m_e$ represents the mass of air passing through the evaporator; $\xi$ represents the air recirculation coefficient; $Cp_{air}$ represents the heat capacity of the cabin air; $\rho_{air}$ and $V_{air}$ represent the air density in the cabin and the cabin volume, respectively.
CN202211385462.0A 2022-11-07 2022-11-07 Fuel cell automobile learning type cooperative energy management method considering air conditioning system Active CN115503559B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211385462.0A CN115503559B (en) 2022-11-07 2022-11-07 Fuel cell automobile learning type cooperative energy management method considering air conditioning system

Publications (2)

Publication Number Publication Date
CN115503559A CN115503559A (en) 2022-12-23
CN115503559B true CN115503559B (en) 2023-05-02

Family

ID=84512880

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211385462.0A Active CN115503559B (en) 2022-11-07 2022-11-07 Fuel cell automobile learning type cooperative energy management method considering air conditioning system

Country Status (1)

Country Link
CN (1) CN115503559B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116639135A (en) * 2023-05-26 2023-08-25 中国第一汽车股份有限公司 Cooperative control method and device for vehicle and vehicle
CN117968208A (en) * 2024-03-29 2024-05-03 中建安装集团有限公司 Environment system control method and control system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111731303A (en) * 2020-07-09 2020-10-02 重庆大学 HEV energy management method based on deep reinforcement learning A3C algorithm
CN111785045A (en) * 2020-06-17 2020-10-16 南京理工大学 Distributed traffic signal lamp combined control method based on actor-critic algorithm
CN112287463A (en) * 2020-11-03 2021-01-29 重庆大学 Fuel cell automobile energy management method based on deep reinforcement learning algorithm
CN113071506A (en) * 2021-05-20 2021-07-06 吉林大学 Fuel cell automobile energy consumption optimization system considering cabin temperature
CN113085665A (en) * 2021-05-10 2021-07-09 重庆大学 Fuel cell automobile energy management method based on TD3 algorithm
CN113246805A (en) * 2021-07-02 2021-08-13 吉林大学 Fuel cell power management control method considering temperature of automobile cab

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210270622A1 (en) * 2020-02-27 2021-09-02 Cummins Enterprise Llc Technologies for energy source schedule optimization for hybrid architecture vehicles

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于延迟策略的最大熵优势演员评论家算法;祁文凯;桑国明;;小型微型计算机系统(第08期);90-98 *

Also Published As

Publication number Publication date
CN115503559A (en) 2022-12-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant