CN117863948B - Distributed electric vehicle charging control method and device for auxiliary frequency modulation - Google Patents

Distributed electric vehicle charging control method and device for auxiliary frequency modulation

Info

Publication number
CN117863948B
CN117863948B
Authority
CN
China
Prior art keywords
preset
state information
decision network
training
charge
Prior art date
Legal status
Active
Application number
CN202410067438.5A
Other languages
Chinese (zh)
Other versions
CN117863948A (en)
Inventor
赵卓立
谭翰袁
徐家文
张泽翰
卢健钊
Current Assignee
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202410067438.5A
Publication of CN117863948A
Application granted
Publication of CN117863948B


Abstract

The invention discloses a distributed electric vehicle charging control method and device for auxiliary frequency modulation, comprising the following steps: acquiring current state information, the state information comprising the frequency deviation of a micro-grid and the state of charge of an electric vehicle; inputting the current state information into a decision network model, the decision network model being obtained by training based on a preset target reward function, and the preset target reward function being constructed from the state information; and controlling the charging power of the electric vehicle based on the output of the decision network model while storing the charging operation experience. The method can reduce communication cost and improve the comprehensiveness of electric vehicle participation in the micro-grid frequency modulation control strategy.

Description

Distributed electric vehicle charging control method and device for auxiliary frequency modulation
Technical Field
The invention relates to the field of electric vehicles, and in particular to a distributed electric vehicle charging control method and device for auxiliary frequency modulation.
Background
At present, strategies for electric vehicles to participate in micro-grid frequency modulation control have the following shortcomings:
1. In the prior art, the electric vehicles selected to provide micro-grid frequency modulation service are mostly those parked at public charging stations, while electric vehicles connected to the grid through private chargers are ignored;
2. Existing frequency modulation control strategies usually aggregate electric vehicles, which reduces the complexity of coordinating a fleet but easily overlooks the individual requirements of each vehicle;
3. Existing control strategies are mostly centralized or distributed and depend on a reliable communication environment, which incurs additional communication cost and performs poorly when communication is interrupted.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a distributed electric vehicle charging control method and device for auxiliary frequency modulation, which can reduce communication cost and improve the comprehensiveness of electric vehicle participation in the micro-grid frequency modulation control strategy.
An embodiment of the invention provides a distributed electric vehicle charging control method for auxiliary frequency modulation, comprising the following steps:
acquiring current state information, the state information comprising the frequency deviation of the micro-grid and the state of charge of the electric vehicle;
inputting the current state information into the latest decision network model, the latest decision network model being obtained by training based on a preset target reward function, and the preset target reward function being constructed from the state information;
and controlling the charging power of the electric vehicle based on the output of the latest decision network model while storing the current charging operation experience.
Further, the preset target reward function is constructed from the state information, specifically comprising:
constructing a first reward function according to the frequency deviation of the micro-grid;
constructing a second reward function according to the state of charge of the electric vehicle;
and weighting and summing the first reward function and the second reward function according to preset weight coefficients to obtain the reward value.
Preferably, constructing the first reward function according to the frequency deviation of the micro-grid specifically comprises:
assuming the frequency deviation of the micro-grid is Δf, the first reward function r1 is calculated as:
wherein f1, f2 and f3 denote the frequency deviation boundaries of the micro-grid under normal operation, auxiliary control and emergency control, respectively, and α1, α2 and α3 are the preset weight coefficients corresponding to f1, f2 and f3.
Preferably, constructing the second reward function according to the state of charge of the electric vehicle specifically comprises:
assuming the state of charge of the electric vehicle is SOC, the second reward function r2 is calculated as:
wherein rmax is a preset maximum reward value, SOCmin is a preset minimum state of charge, SOC* is a preset target state of charge, and SOCmax is a preset maximum state of charge.
Further, the latest decision network model is obtained by training based on the preset target reward function, specifically comprising:
initializing a prediction decision network, a prediction value network, a target decision network and a target value network;
randomly selecting charging operation experience data from a preset experience pool and training the prediction value network according to a preset loss function; the charging operation experience data is calculated from collected actual operation data and the preset target reward function;
copying the trained parameters of the prediction value network to the target value network by soft update;
constructing an objective function according to the updated target value network and training the prediction decision network with the objective function;
copying the trained parameters of the prediction decision network to the target decision network by soft update;
and re-selecting charging operation experience data and performing a new round of training, until the number of training rounds reaches a preset training threshold, whereupon training ends and the target decision network obtained in the last round is output as the latest decision network.
Further, the charging operation experience data is calculated from the collected actual operation data and the preset target reward function, specifically comprising:
denoting the current state information as S1 and the reference power as A, and obtaining the post-charging state information S2 after the electric vehicle has been charged at the reference power;
calculating a reward value R with the preset target reward function according to the post-charging state information S2;
and taking [S1, A, R, S2] as the charging operation experience data.
Preferably, when the amount of charging operation experience data in the preset experience pool is smaller than a preset quantity threshold, the preset experience pool is filled with simulated charging operation experience data; the simulated charging operation experience data is obtained specifically as follows:
establishing a load frequency model according to preset configuration information, wherein the preset configuration information comprises the state information at each moment;
calculating the reference power At at time t with the prediction decision network according to the state information St at time t;
simulating the load frequency model with the reference power At at time t to obtain the state information St+1 at time t+1, and calculating a reward value Rt according to the state information St+1;
and outputting [St, At, Rt, St+1] to the preset experience pool as the simulated charging operation experience data.
Further, the method further comprises:
uploading the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
Another embodiment of the present invention provides a distributed electric vehicle charging control device for auxiliary frequency modulation, comprising an acquisition module, an input module and a charging module;
the acquisition module is configured to acquire current state information, the state information comprising the frequency deviation of a micro-grid and the state of charge of an electric vehicle;
the input module is configured to input the current state information into a latest decision network model, the latest decision network model being obtained by training based on a preset target reward function, and the preset target reward function being constructed from the state information;
the charging module is configured to control the charging power of the electric vehicle based on the output of the latest decision network model and to store the current charging operation experience.
Further, the charging module is further configured to upload the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
Compared with the prior art, the invention has the beneficial effects that:
By performing centralized training and using private bidirectional chargers for decentralized control, the invention also incorporates electric vehicles connected to the micro-grid through private chargers into the micro-grid frequency modulation control strategy, improving the comprehensiveness of electric vehicle participation in micro-grid frequency modulation control.
In addition, the decentralized control provided by the invention realizes charging control of distributed electric vehicles with information exchange only once per preset period, which reduces communication cost compared with existing centralized control.
Drawings
Fig. 1 is a flow chart of a distributed electric vehicle charging control method for auxiliary frequency modulation according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a distributed electric vehicle charging control device for auxiliary frequency modulation according to another embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a distributed electric vehicle charging control architecture for auxiliary frequency modulation according to another embodiment of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
It will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Referring to fig. 1, a distributed electric vehicle charging control method for auxiliary frequency modulation according to an embodiment of the present invention comprises the following steps:
S1: acquiring current state information, the state information comprising the frequency deviation of the micro-grid and the state of charge of the electric vehicle;
S2: inputting the current state information into the latest decision network model, the latest decision network model being obtained by training based on a preset target reward function, and the preset target reward function being constructed from the state information;
S3: controlling the charging power of the electric vehicle based on the output of the latest decision network model while storing the current charging operation experience.
For step S2, specifically, the preset target reward function is constructed from the state information, specifically comprising:
constructing a first reward function according to the frequency deviation of the micro-grid;
constructing a second reward function according to the state of charge of the electric vehicle;
and weighting and summing the first reward function and the second reward function according to preset weight coefficients to obtain the reward value.
Preferably, constructing the first reward function according to the frequency deviation of the micro-grid specifically comprises:
assuming the frequency deviation of the micro-grid is Δf, the first reward function r1 is calculated as:
wherein f1, f2 and f3 denote the frequency deviation boundaries of the micro-grid under normal operation, auxiliary control and emergency control, respectively, and α1, α2 and α3 are the preset weight coefficients corresponding to f1, f2 and f3.
Preferably, constructing the second reward function according to the state of charge of the electric vehicle specifically comprises:
assuming the state of charge of the electric vehicle is SOC, the second reward function r2 is calculated as:
wherein rmax is a preset maximum reward value, SOCmin is a preset minimum state of charge, SOC* is a preset target state of charge, and SOCmax is a preset maximum state of charge.
In a preferred embodiment, the target reward function accounts for both the frequency deviation of the micro-grid and the state of charge of the electric vehicle; by adjusting the weight coefficients of the two reward terms, the interests of both the micro-grid operator and the electric vehicle user can be balanced.
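To make the construction concrete, the following Python sketch shows a composite reward of this form. The specific piecewise expressions, the boundary values f1 to f3, and all numeric weights are illustrative assumptions; the patent's own formulas appear only in its original figures and are not reproduced here.

def frequency_reward(delta_f, f1=0.05, f2=0.2, f3=0.5, a1=1.0, a2=2.0, a3=5.0):
    # Assumed piecewise penalty on the micro-grid frequency deviation (Hz).
    d = abs(delta_f)
    if d <= f1:            # normal operation band
        return -a1 * d
    if d <= f2:            # auxiliary control band
        return -a2 * d
    if d <= f3:            # emergency control band
        return -a3 * d
    return -a3 * f3        # beyond the emergency boundary: maximum penalty (assumption)

def soc_reward(soc, r_max=1.0, soc_min=0.2, soc_target=0.8, soc_max=1.0):
    # Assumed SOC term: largest near the target state of charge, penalized outside limits.
    if soc < soc_min or soc > soc_max:
        return -r_max
    return r_max * (1.0 - abs(soc - soc_target) / (soc_max - soc_min))

def target_reward(delta_f, soc, w1=0.7, w2=0.3):
    # Weighted sum of the two reward terms, as described in the text.
    return w1 * frequency_reward(delta_f) + w2 * soc_reward(soc)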
For step S2, specifically, the latest decision network model is obtained by training on the existing charging operation experience, specifically comprising:
initializing a prediction decision network, a prediction value network, a target decision network and a target value network;
randomly selecting several pieces of charging operation experience data from a preset experience pool and training the prediction value network according to a preset loss function;
copying the trained parameters of the prediction value network to the target value network by soft update;
constructing an objective function according to the updated target value network and training the prediction decision network with the objective function;
copying the trained parameters of the prediction decision network to the target decision network by soft update;
and re-selecting several pieces of charging operation experience data and performing a new round of training, until the number of training rounds reaches a preset training threshold, whereupon training ends and the target decision network obtained in the last round is output as the latest decision network.
In a preferred embodiment, the preset loss function contains the parameters of the prediction value network to be optimized; during training, these parameters are optimized with the objective of minimizing the preset loss function, yielding the optimized parameters of the prediction value network.
After optimization, the optimized parameters of the prediction value network are applied to the corresponding parameters of the target value network by soft update. Let the optimized parameters of the prediction value network be w, the current corresponding parameters of the target value network be v, and the updated parameters be v'; the soft update formula is:
v' = a·w + (1 - a)·v
wherein a is a preset learning coefficient.
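The soft update can be applied directly to the network parameters. The following is a minimal sketch assuming PyTorch-style modules; the patent does not specify a framework, and the coefficient value is illustrative.

import torch

def soft_update(target_net: torch.nn.Module, source_net: torch.nn.Module, a: float = 0.005):
    # Blend the trained parameters into the target network: v' = a*w + (1 - a)*v.
    with torch.no_grad():
        for v, w in zip(target_net.parameters(), source_net.parameters()):
            v.mul_(1.0 - a).add_(a * w)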
Similarly, the objective function contains the parameters of the prediction decision network to be optimized; during training, these parameters are optimized with the objective of maximizing the objective function, yielding the optimized parameters of the prediction decision network.
After optimization, the optimized parameters of the prediction decision network are likewise applied to the corresponding parameters of the target decision network by soft update, completing one round of training.
After each round, charging operation experience data is again selected from the preset experience pool and a new round of training is performed, until training ends and the final target decision network is output as the latest decision network.
Further, the charging operation experience data is calculated based on the preset target reward function, specifically comprising:
denoting the current state information as S1 and the reference power as A, and collecting the post-charging state information S2 after the private charger has charged the electric vehicle at the reference power;
calculating a reward value R with the preset target reward function according to the post-charging state information S2;
and taking [S1, A, R, S2] as the charging operation experience data.
In a preferred embodiment, the reward value evaluates the reference power, i.e. it quantifies how much charging the electric vehicle at the reference power contributes to stabilizing the micro-grid frequency and to the vehicle's charging progress. Including the reward value in the operation experience improves the training effect of the decision network.
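A minimal sketch of how such [S1, A, R, S2] tuples could be collected and stored follows; the ExperiencePool class, the record_experience helper and their interfaces are illustrative assumptions rather than names from the patent.

import random
from collections import deque

class ExperiencePool:
    # Fixed-size pool of [S1, A, R, S2] charging operation experience tuples.
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, s1, a, r, s2):
        self.buffer.append((s1, a, r, s2))

    def sample(self, batch_size):
        # Random selection used for each round of prediction value network training.
        return random.sample(list(self.buffer), batch_size)

def record_experience(pool, s1, a, observe_after_charging, reward_fn):
    # One real-operation experience: charge at reference power A, observe S2, score it.
    s2 = observe_after_charging()   # measured (frequency deviation, SOC) after charging
    r = reward_fn(*s2)              # preset target reward evaluated on the new state
    pool.add(s1, a, r, s2)
    return s2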
Preferably, when the amount of charging operation experience data in the preset experience pool is smaller than a preset quantity threshold, the preset experience pool is filled with simulated charging operation experience data; the simulated charging operation experience data is obtained specifically as follows:
establishing a load frequency model according to preset configuration information, wherein the preset configuration information comprises the state information at each moment;
calculating the reference power At at time t with the prediction decision network according to the state information St at time t;
simulating the load frequency model with the reference power At at time t to obtain the state information St+1 at time t+1, and calculating a reward value Rt from the state information St+1 with the preset target reward function;
and outputting [St, At, Rt, St+1] to the preset experience pool as the simulated charging operation experience data.
In a preferred embodiment, the load frequency model is a mathematical model that reflects the frequency-load relationship based on the characteristics of the actual micro-grid and the electric vehicles. In the invention, the load frequency model serves as the interaction environment for multi-agent deep reinforcement learning: it simulates the micro-grid frequency deviation after the chargers adjust their charging and discharging power, and thereby yields the simulated charging operation experience data.
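The following sketch illustrates how such simulated experience might be generated. The first-order frequency response, the sign conventions, the configuration fields and all coefficients are assumptions made for illustration; the patent does not disclose the concrete load frequency model used.

def simulate_experience(actor, pool, reward_fn, configs, dt_s=4.0, inertia=10.0, damping=1.0):
    # Roll the prediction decision network through an assumed load frequency model
    # and store [St, At, Rt, St+1] tuples in the preset experience pool.
    for cfg in configs:                              # one preset configuration per episode
        delta_f, soc = cfg["delta_f0"], cfg["soc0"]  # initial frequency deviation and SOC
        for _ in range(cfg["steps"]):
            s_t = (delta_f, soc)
            a_t = actor(s_t)                         # reference charging power in kW (signed)
            # Assumed first-order response: charging load deepens a low-frequency event,
            # discharging supports the grid (all coefficients are illustrative).
            imbalance_kw = cfg["load_disturbance_kw"] + a_t
            delta_f += dt_s * (-damping * delta_f - imbalance_kw / inertia)
            soc = min(1.0, max(0.0, soc + a_t * dt_s / 3600.0 / cfg["battery_kwh"]))
            s_t1 = (delta_f, soc)
            r_t = reward_fn(delta_f, soc)            # preset target reward on the new state
            pool.add(s_t, a_t, r_t, s_t1)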
Further, the method further comprises:
uploading the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
In a preferred embodiment, the preset period may be set to days, weeks or months according to actual needs. After the charging operation experience is uploaded to the experience pool, the central controller retrains the decision network with the updated experience pool and finally sends the updated latest decision network to the private chargers. Such an architecture has a lower communication cost than existing centralized or distributed control architectures.
For step S3, specifically, controlling the charging power of the electric vehicle based on the output of the latest decision network model specifically comprises:
letting the state information be S, the reference power A is calculated as:
A = μ(S, θ)
wherein μ is the output function of the latest decision network and θ is its network parameter.
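A minimal sketch of this mapping follows, assuming a small fully connected actor network in PyTorch whose output is scaled to the charger's power rating; the layer sizes, the 7 kW limit and the example state are assumptions, not values from the patent.

import torch
import torch.nn as nn

class DecisionNetwork(nn.Module):
    # Actor mu(S; theta): maps (frequency deviation, SOC) to a reference charging power.
    def __init__(self, state_dim=2, hidden=64, p_max_kw=7.0):
        super().__init__()
        self.p_max_kw = p_max_kw
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Tanh(),   # output in [-1, 1]
        )

    def forward(self, state):
        # Negative values correspond to discharging through the bidirectional charger.
        return self.p_max_kw * self.net(state)

mu = DecisionNetwork()
A = mu(torch.tensor([[0.08, 0.55]]))   # A = mu(S, theta) for S = [Δf (Hz), SOC]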
Compared with the prior art, the invention has the beneficial effects that:
By performing centralized training and using private bidirectional chargers for decentralized control, the invention also incorporates electric vehicles connected to the micro-grid through private chargers into the micro-grid frequency modulation control strategy, improving the comprehensiveness of electric vehicle participation in micro-grid frequency modulation control.
In addition, the decentralized control provided by the invention realizes charging control of distributed electric vehicles with information exchange only once per preset period, which reduces communication cost compared with existing centralized control.
Referring to fig. 2, a schematic structural diagram of a distributed electric vehicle charging control device for auxiliary frequency modulation according to another embodiment of the present invention comprises an acquisition module 201, an input module 202 and a charging module 203;
the acquisition module 201 is configured to acquire current state information, the state information comprising the frequency deviation of a micro-grid and the state of charge of an electric vehicle;
the input module 202 is configured to input the current state information into a latest decision network model, the latest decision network model being obtained by training based on a preset target reward function, and the preset target reward function being constructed from the state information;
the charging module 203 is configured to control the charging power of the electric vehicle based on the output of the latest decision network model and to store the current charging operation experience.
Further, the charging module 203 is further configured to upload the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
Referring to fig. 3, a distributed electric vehicle charging control architecture for auxiliary frequency modulation according to another embodiment of the present invention comprises distributed electric vehicles and a central server.
The central server serves as the training center of the network: after receiving experience from the chargers, it trains the network with a multi-agent deep reinforcement learning algorithm and transmits the trained prediction decision network parameters to the corresponding private chargers.
Each distributed electric vehicle is connected to a private charger. After loading the prediction decision network from the central server, the private charger automatically controls the charging and discharging power of the electric vehicle according to the state information and saves the experience. At regular intervals, the private charger packages the experience saved in its local experience pool and sends it to the central server, as outlined in the sketch below.
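A minimal sketch of the charger-side control loop under this architecture; the measurement, actuation and upload callbacks, the 4-second control step and the weekly upload interval are illustrative assumptions, and the pool is assumed to be the ExperiencePool sketched above.

import time

def charger_loop(mu, pool, measure_state, apply_power, upload, reward_fn,
                 control_dt_s=4.0, upload_every_s=7 * 24 * 3600):
    # Decentralized control on a private charger: act locally, upload experience periodically.
    last_upload = time.time()
    s1 = measure_state()                       # (frequency deviation, SOC)
    while True:
        a = mu(s1)                             # reference power from the loaded decision network
        apply_power(a)                         # set the charger's charging/discharging power
        time.sleep(control_dt_s)
        s2 = measure_state()
        pool.add(s1, a, reward_fn(*s2), s2)    # save [S1, A, R, S2] locally
        s1 = s2
        if time.time() - last_upload >= upload_every_s:
            upload(list(pool.buffer))          # send packaged experience to the central server
            last_upload = time.time()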
It is to be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications based on the above teachings will be apparent to those of ordinary skill in the art. It is neither necessary nor possible to exhaustively list all embodiments here. Any modification, equivalent replacement, improvement, etc. which comes within the spirit and principles of the invention is intended to be protected by the following claims.

Claims (8)

1. A distributed electric vehicle charging control method for auxiliary frequency modulation, characterized by comprising the following steps:
acquiring current state information; the state information comprises the frequency deviation of a micro-grid and the state of charge of an electric vehicle;
inputting the current state information into a latest decision network model; the latest decision network model is obtained by training based on a preset target reward function, and the preset target reward function is constructed from the state information;
controlling the charging power of the electric vehicle based on the output of the latest decision network model while storing the current charging operation experience data;
wherein the latest decision network model is obtained by training based on the preset target reward function, specifically comprising:
initializing a prediction decision network, a prediction value network, a target decision network and a target value network;
randomly selecting charging operation experience data from a preset experience pool and training the prediction value network according to a preset loss function; the charging operation experience data is calculated from collected actual operation data and the preset target reward function;
copying the trained parameters of the prediction value network to the target value network by soft update;
constructing an objective function according to the updated target value network and training the prediction decision network with the objective function;
copying the trained parameters of the prediction decision network to the target decision network by soft update;
re-selecting charging operation experience data and performing a new round of training, until the number of training rounds reaches a preset training threshold, whereupon training ends and the target decision network obtained in the last round is output as the latest decision network;
and wherein the charging operation experience data is calculated from the collected actual operation data and the preset target reward function, specifically comprising:
denoting the current state information as S1, inputting S1 into the latest decision network to obtain a reference power A, and obtaining post-charging state information S2 after the electric vehicle has been charged at the reference power;
calculating a reward value R with the preset target reward function according to the post-charging state information S2;
and taking [S1, A, R, S2] as the charging operation experience data.
2. The distributed electric vehicle charging control method for auxiliary frequency modulation according to claim 1, wherein the preset target reward function is constructed from the state information, specifically comprising:
constructing a first reward function according to the frequency deviation of the micro-grid;
constructing a second reward function according to the state of charge of the electric vehicle;
and weighting and summing the first reward function and the second reward function according to preset weight coefficients to obtain the preset target reward function.
3. The distributed electric vehicle charging control method for auxiliary frequency modulation according to claim 2, wherein constructing the first reward function according to the frequency deviation of the micro-grid specifically comprises:
assuming the frequency deviation of the micro-grid is Δf, the first reward function r1 is calculated as:
wherein f1, f2 and f3 denote the frequency deviation boundaries of the micro-grid under normal operation, auxiliary control and emergency control, respectively, and α1, α2 and α3 are the preset weight coefficients corresponding to f1, f2 and f3.
4. The distributed electric vehicle charging control method for auxiliary frequency modulation according to claim 2, wherein constructing the second reward function according to the state of charge of the electric vehicle specifically comprises:
assuming the state of charge of the electric vehicle is SOC, the second reward function r2 is calculated as:
wherein rmax is a preset maximum reward value, SOCmin is a preset minimum state of charge, SOC* is a preset target state of charge, and SOCmax is a preset maximum state of charge.
5. The distributed electric vehicle charging control method for auxiliary frequency modulation according to claim 1, wherein when the amount of charging operation experience data in the preset experience pool is smaller than a preset quantity threshold, the preset experience pool is filled with simulated charging operation experience data; the simulated charging operation experience data is obtained specifically as follows:
establishing a load frequency model according to preset configuration information; wherein the preset configuration information comprises the state information at each moment;
denoting the state information at time t as St, and calculating the reference power At at time t with the prediction decision network;
simulating the load frequency model with the reference power At at time t to obtain the state information St+1 at time t+1, and calculating a reward value Rt according to the state information St+1;
and outputting [St, At, Rt, St+1] to the preset experience pool as the simulated charging operation experience data.
6. The distributed electric vehicle charging control method for auxiliary frequency modulation according to claim 1, further comprising:
uploading the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
7. A distributed electric vehicle charging control device for auxiliary frequency modulation, characterized by comprising an acquisition module, an input module and a charging module;
the acquisition module is configured to acquire current state information; the state information comprises the frequency deviation of a micro-grid and the state of charge of an electric vehicle;
the input module is configured to input the current state information into a latest decision network model; the latest decision network model is obtained by training based on a preset target reward function, and the preset target reward function is constructed from the state information;
the charging module is configured to control the charging power of the electric vehicle based on the output of the latest decision network model and to store the current charging operation experience data;
wherein the latest decision network model is obtained by training based on the preset target reward function, specifically comprising:
initializing a prediction decision network, a prediction value network, a target decision network and a target value network;
randomly selecting charging operation experience data from a preset experience pool and training the prediction value network according to a preset loss function; the charging operation experience data is calculated from collected actual operation data and the preset target reward function;
copying the trained parameters of the prediction value network to the target value network by soft update;
constructing an objective function according to the updated target value network and training the prediction decision network with the objective function;
copying the trained parameters of the prediction decision network to the target decision network by soft update;
re-selecting charging operation experience data and performing a new round of training, until the number of training rounds reaches a preset training threshold, whereupon training ends and the target decision network obtained in the last round is output as the latest decision network;
and wherein the charging operation experience data is calculated from the collected actual operation data and the preset target reward function, specifically comprising:
denoting the current state information as S1, inputting S1 into the latest decision network to obtain a reference power A, and obtaining post-charging state information S2 after the electric vehicle has been charged at the reference power;
calculating a reward value R with the preset target reward function according to the post-charging state information S2;
and taking [S1, A, R, S2] as the charging operation experience data.
8. The device according to claim 7, wherein the charging module is further configured to upload the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
CN202410067438.5A 2024-01-17 Distributed electric vehicle charging control method and device for auxiliary frequency modulation Active CN117863948B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410067438.5A CN117863948B (en) 2024-01-17 Distributed electric vehicle charging control method and device for auxiliary frequency modulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410067438.5A CN117863948B (en) 2024-01-17 Distributed electric vehicle charging control method and device for auxiliary frequency modulation

Publications (2)

Publication Number Publication Date
CN117863948A CN117863948A (en) 2024-04-12
CN117863948B true CN117863948B (en) 2024-06-11



Patent Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110809306A (en) * 2019-11-04 2020-02-18 电子科技大学 Terminal access selection method based on deep reinforcement learning
CN111934335A (en) * 2020-08-18 2020-11-13 华北电力大学 Cluster electric vehicle charging behavior optimization method based on deep reinforcement learning
CN112187074A (en) * 2020-09-15 2021-01-05 电子科技大学 Inverter controller based on deep reinforcement learning
CN113270937A (en) * 2021-03-30 2021-08-17 鹏城实验室 Standby battery scheduling method, computer readable storage medium and system
CN113141017A (en) * 2021-04-29 2021-07-20 福州大学 Control method for energy storage system to participate in primary frequency modulation of power grid based on DDPG algorithm and SOC recovery
CN112989017A (en) * 2021-05-17 2021-06-18 南湖实验室 Method for generating high-quality simulation experience for dialogue strategy learning
CN113627993A (en) * 2021-08-26 2021-11-09 东北大学秦皇岛分校 Intelligent electric vehicle charging and discharging decision method based on deep reinforcement learning
CN113872198A (en) * 2021-09-29 2021-12-31 电子科技大学 Active power distribution network fault recovery method based on reinforcement learning method
WO2023064474A1 (en) * 2021-10-14 2023-04-20 University Of Pittsburgh - Of The Commonwealth System Of Higher Education Systems and methods for controlling magnetic microdevices with machine learning
CN114091879A (en) * 2021-11-15 2022-02-25 浙江华云电力工程设计咨询有限公司 Multi-park energy scheduling method and system based on deep reinforcement learning
CN114423061A (en) * 2022-01-20 2022-04-29 重庆邮电大学 Wireless route optimization method based on attention mechanism and deep reinforcement learning
CN115051403A (en) * 2022-03-16 2022-09-13 国网浙江省电力有限公司丽水供电公司 Island microgrid load frequency control method and system based on deep Q learning
CN114742453A (en) * 2022-05-06 2022-07-12 江苏大学 Micro-grid energy management method based on Rainbow deep Q network
CN115097729A (en) * 2022-06-21 2022-09-23 广东工业大学 Boiler soot blower optimization control method and system based on reinforcement learning
CN115257745A (en) * 2022-07-21 2022-11-01 同济大学 Automatic driving lane change decision control method based on rule fusion reinforcement learning
CN115238891A (en) * 2022-07-29 2022-10-25 腾讯科技(深圳)有限公司 Decision model training method, and target object strategy control method and device
CN115366099A (en) * 2022-08-18 2022-11-22 江苏科技大学 Mechanical arm depth certainty strategy gradient training method based on forward kinematics
CN116185584A (en) * 2023-01-09 2023-05-30 西北工业大学 Multi-tenant database resource planning and scheduling method based on deep reinforcement learning
CN116456493A (en) * 2023-04-20 2023-07-18 无锡学院 D2D user resource allocation method and storage medium based on deep reinforcement learning algorithm
CN116454902A (en) * 2023-05-09 2023-07-18 广东电网有限责任公司 Power distribution network voltage regulating method, device, equipment and storage medium based on reinforcement learning
CN116824848A (en) * 2023-06-08 2023-09-29 甘肃紫光智能交通与控制技术有限公司 Traffic signal optimization control method based on Bayesian deep Q network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A deep Q-network method with maximum upper-confidence-bound experience sampling; Zhu Fei; Wu Wen; Liu Quan; Fu Yuchen; Journal of Computer Research and Development; 2018-08-15 (No. 08); full text *
Adaptive radio resource allocation algorithm for heterogeneous cloud radio access networks based on deep reinforcement learning; Chen Qianbin; Guan Lingjin; Li Ziyu; Wang Zhaokun; Yang Heng; Tang Lun; Journal of Electronics & Information Technology; 2020-06-15 (No. 06); full text *

Similar Documents

Publication Publication Date Title
JP2000209707A (en) Charging plan equipment for electric vehicle
CN113515884A (en) Distributed electric vehicle real-time optimization scheduling method, system, terminal and medium
CN103078389B (en) Integrated power system control method and the relevant device with energy storage elements
CN104321947B (en) Charge rate optimizes
CN112117760A (en) Micro-grid energy scheduling method based on double-Q-value network deep reinforcement learning
CN113103905B (en) Intelligent charging distribution adjusting method, device, equipment and medium for electric automobile
CN108596667B (en) Electric automobile real-time charging electricity price calculation method based on Internet of vehicles
CN113511082A (en) Hybrid electric vehicle energy management method based on rule and double-depth Q network
CN108471139B (en) Regional power grid dynamic demand response method containing new energy and temperature control load
CN114069612A (en) Charging pile access control method and device, computer equipment and storage medium
CN113997805A (en) Charging control method and system of new energy automobile, vehicle-mounted terminal and medium
CN106165186A (en) Accumulator control device and accumulator control method
CN117863948B (en) Distributed electric vehicle charging control method and device for auxiliary frequency modulation
CN115587645A (en) Electric vehicle charging management method and system considering charging behavior randomness
CN117863948A (en) Distributed electric vehicle charging control method and device for auxiliary frequency modulation
CN110535196B (en) Charging method, charging device, and remote server performed in a power conversion facility
CN116993031A (en) Charging decision optimization method, device, equipment and medium for electric vehicle
CN114619907A (en) Coordinated charging method and coordinated charging system based on distributed deep reinforcement learning
CN112018847A (en) Charging processing method and device for rechargeable battery and electric vehicle
CN107925244A (en) Based on the horizontal definite control load of future energy and energy source
CN113650515B (en) Electric automobile charging control method and device, terminal equipment and storage medium
CN113561834B (en) Ordered charging management method and system for charging piles
WO2014120250A1 (en) Battery maintenance system
CN117863969B (en) Electric automobile charge and discharge control method and system considering battery loss
WO2021056662A1 (en) Charging regulation and control method and apparatus, charging system, computer device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant