CN117863948B - Distributed electric vehicle charging control method and device for auxiliary frequency modulation - Google Patents

Distributed electric vehicle charging control method and device for auxiliary frequency modulation

Info

Publication number
CN117863948B
CN117863948B
Authority
CN
China
Prior art keywords
preset
state information
decision network
training
charge
Prior art date
Legal status
Active
Application number
CN202410067438.5A
Other languages
Chinese (zh)
Other versions
CN117863948A (en)
Inventor
赵卓立
谭翰袁
徐家文
张泽翰
卢健钊
Current Assignee
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202410067438.5A
Publication of CN117863948A
Application granted
Publication of CN117863948B


Abstract

The invention discloses a distributed electric vehicle charging control method and device for auxiliary frequency modulation, comprising the following steps: acquiring current state information, the state information comprising the frequency deviation of a micro-grid and the state of charge of an electric vehicle; inputting the current state information into a decision network model, the decision network model being obtained by training based on a preset target reward function, and the preset target reward function being constructed from the state information; and controlling the charging power of the electric vehicle based on the output of the decision network model while storing the charging operation experience. The method can reduce communication cost and improve the comprehensiveness of electric vehicle participation in the micro-grid frequency modulation control strategy.

Description

Distributed electric vehicle charging control method and device for auxiliary frequency modulation
Technical Field
The invention relates to the field of electric vehicles, and in particular to a distributed electric vehicle charging control method and device for auxiliary frequency modulation.
Background
At present, strategies for electric vehicles to participate in micro-grid frequency modulation control have the following shortcomings:
1. In the prior art, the electric vehicles selected to provide micro-grid frequency modulation service are mostly those parked at public charging stations, while electric vehicles connected to the grid through private chargers are ignored;
2. Existing frequency modulation control strategies usually aggregate electric vehicles, which reduces the complexity of coordinating a fleet but easily overlooks the individual requirements of each vehicle;
3. Existing control strategies are mostly centralized or distributed and depend on a reliable communication environment, which incurs additional communication cost and performs poorly when communication is interrupted.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a distributed electric vehicle charging control method and device for auxiliary frequency modulation, which can reduce communication cost and improve the comprehensiveness of electric vehicle participation in the micro-grid frequency modulation control strategy.
An embodiment of the invention provides a distributed electric vehicle charging control method for auxiliary frequency modulation, comprising the following steps:
acquiring current state information, the state information comprising the frequency deviation of the micro-grid and the state of charge of the electric vehicle;
inputting the current state information into the latest decision network model, the latest decision network model being obtained by training based on a preset target reward function, and the preset target reward function being constructed from the state information;
and controlling the charging power of the electric vehicle based on the output of the latest decision network model while storing the current charging operation experience.
Further, the preset target reward function is constructed from the state information, specifically comprising:
constructing a first reward function according to the frequency deviation of the micro-grid;
constructing a second reward function according to the state of charge of the electric vehicle;
and weighting and summing the first reward function and the second reward function according to preset weight coefficients to obtain the reward value.
Preferably, constructing the first reward function according to the frequency deviation of the micro-grid specifically comprises:
assuming the frequency deviation of the micro-grid is Δf, the first reward function r1 is calculated as:
wherein f1, f2 and f3 denote the frequency deviation boundaries of the micro-grid under normal operation, auxiliary control and emergency control, respectively, and α1, α2 and α3 are the preset weight coefficients corresponding to f1, f2 and f3.
Preferably, constructing the second reward function according to the state of charge of the electric vehicle specifically comprises:
assuming the state of charge of the electric vehicle is SOC, the second reward function r2 is calculated as:
wherein rmax is a preset maximum reward value, SOCmin is a preset minimum state of charge, SOC* is a preset target state of charge, and SOCmax is a preset maximum state of charge.
Further, the latest decision network model is obtained by training based on the preset target reward function, specifically comprising:
initializing a prediction decision network, a prediction value network, a target decision network and a target value network;
randomly selecting charging operation experience data from a preset experience pool and training the prediction value network according to a preset loss function; the charging operation experience data is calculated from collected actual operation data and the preset target reward function;
copying the trained parameters of the prediction value network to the target value network by soft update;
constructing an objective function according to the updated target value network and training the prediction decision network with the objective function;
copying the trained parameters of the prediction decision network to the target decision network by soft update;
and re-selecting charging operation experience data and performing a new round of training, until the number of training rounds reaches a preset training threshold, whereupon training ends and the target decision network obtained in the last round is output as the latest decision network.
Further, the charging operation experience data is calculated from the collected actual operation data and the preset target reward function, specifically comprising:
denoting the current state information as S1 and the reference power as A, and obtaining the post-charging state information S2 after the electric vehicle has been charged at the reference power;
calculating a reward value R with the preset target reward function according to the post-charging state information S2;
and taking [S1, A, R, S2] as the charging operation experience data.
Preferably, when the amount of charging operation experience data in the preset experience pool is smaller than a preset quantity threshold, the preset experience pool is filled with simulated charging operation experience data; the simulated charging operation experience data is obtained specifically as follows:
establishing a load frequency model according to preset configuration information, wherein the preset configuration information comprises the state information at each moment;
calculating the reference power At at time t with the prediction decision network according to the state information St at time t;
simulating the load frequency model with the reference power At at time t to obtain the state information St+1 at time t+1, and calculating a reward value Rt according to the state information St+1;
and outputting [St, At, Rt, St+1] to the preset experience pool as the simulated charging operation experience data.
Further, the method further comprises:
uploading the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
Another embodiment of the present invention provides a distributed electric vehicle charging control device for auxiliary frequency modulation, comprising an acquisition module, an input module and a charging module;
the acquisition module is configured to acquire current state information, the state information comprising the frequency deviation of a micro-grid and the state of charge of an electric vehicle;
the input module is configured to input the current state information into a latest decision network model, the latest decision network model being obtained by training based on a preset target reward function, and the preset target reward function being constructed from the state information;
the charging module is configured to control the charging power of the electric vehicle based on the output of the latest decision network model and to store the current charging operation experience.
Further, the charging module is further configured to upload the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
Compared with the prior art, the invention has the beneficial effects that:
By performing centralized training and using private bidirectional chargers for decentralized control, the invention also incorporates electric vehicles connected to the micro-grid through private chargers into the micro-grid frequency modulation control strategy, improving the comprehensiveness of electric vehicle participation in micro-grid frequency modulation control.
In addition, the decentralized control provided by the invention realizes charging control of distributed electric vehicles with information exchange only once per preset period, which reduces communication cost compared with existing centralized control.
Drawings
Fig. 1 is a flow chart of a distributed electric vehicle charging control method for auxiliary frequency modulation according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a distributed electric vehicle charging control device for auxiliary frequency modulation according to another embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a distributed electric vehicle charging control architecture for auxiliary frequency modulation according to another embodiment of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
It will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Referring to fig. 1, a distributed electric vehicle charging control method for auxiliary frequency modulation according to an embodiment of the present invention comprises the following steps:
S1: acquiring current state information, the state information comprising the frequency deviation of the micro-grid and the state of charge of the electric vehicle;
S2: inputting the current state information into the latest decision network model, the latest decision network model being obtained by training based on a preset target reward function, and the preset target reward function being constructed from the state information;
S3: controlling the charging power of the electric vehicle based on the output of the latest decision network model while storing the current charging operation experience.
For step S2, specifically, the preset target reward function is constructed from the state information, specifically comprising:
constructing a first reward function according to the frequency deviation of the micro-grid;
constructing a second reward function according to the state of charge of the electric vehicle;
and weighting and summing the first reward function and the second reward function according to preset weight coefficients to obtain the reward value.
Preferably, constructing the first reward function according to the frequency deviation of the micro-grid specifically comprises:
assuming the frequency deviation of the micro-grid is Δf, the first reward function r1 is calculated as:
wherein f1, f2 and f3 denote the frequency deviation boundaries of the micro-grid under normal operation, auxiliary control and emergency control, respectively, and α1, α2 and α3 are the preset weight coefficients corresponding to f1, f2 and f3.
Preferably, constructing the second reward function according to the state of charge of the electric vehicle specifically comprises:
assuming the state of charge of the electric vehicle is SOC, the second reward function r2 is calculated as:
wherein rmax is a preset maximum reward value, SOCmin is a preset minimum state of charge, SOC* is a preset target state of charge, and SOCmax is a preset maximum state of charge.
In a preferred embodiment, the target reward function accounts for both the frequency deviation of the micro-grid and the state of charge of the electric vehicle; by adjusting the weight coefficients of the two reward terms, the interests of both the micro-grid operator and the electric vehicle user can be balanced.
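To make the construction concrete, the following Python sketch shows a composite reward of this form. The specific piecewise expressions, the boundary values f1 to f3, and all numeric weights are illustrative assumptions; the patent's own formulas appear only in its original figures and are not reproduced here.

def frequency_reward(delta_f, f1=0.05, f2=0.2, f3=0.5, a1=1.0, a2=2.0, a3=5.0):
    # Assumed piecewise penalty on the micro-grid frequency deviation (Hz).
    d = abs(delta_f)
    if d <= f1:            # normal operation band
        return -a1 * d
    if d <= f2:            # auxiliary control band
        return -a2 * d
    if d <= f3:            # emergency control band
        return -a3 * d
    return -a3 * f3        # beyond the emergency boundary: maximum penalty (assumption)

def soc_reward(soc, r_max=1.0, soc_min=0.2, soc_target=0.8, soc_max=1.0):
    # Assumed SOC term: largest near the target state of charge, penalized outside limits.
    if soc < soc_min or soc > soc_max:
        return -r_max
    return r_max * (1.0 - abs(soc - soc_target) / (soc_max - soc_min))

def target_reward(delta_f, soc, w1=0.7, w2=0.3):
    # Weighted sum of the two reward terms, as described in the text.
    return w1 * frequency_reward(delta_f) + w2 * soc_reward(soc)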
For step S2, specifically, the latest decision network model is obtained by training on the existing charging operation experience, specifically comprising:
initializing a prediction decision network, a prediction value network, a target decision network and a target value network;
randomly selecting several pieces of charging operation experience data from a preset experience pool and training the prediction value network according to a preset loss function;
copying the trained parameters of the prediction value network to the target value network by soft update;
constructing an objective function according to the updated target value network and training the prediction decision network with the objective function;
copying the trained parameters of the prediction decision network to the target decision network by soft update;
and re-selecting several pieces of charging operation experience data and performing a new round of training, until the number of training rounds reaches a preset training threshold, whereupon training ends and the target decision network obtained in the last round is output as the latest decision network.
In a preferred embodiment, the preset loss function contains the parameters of the prediction value network to be optimized; during training, these parameters are optimized with the objective of minimizing the preset loss function, yielding the optimized parameters of the prediction value network.
After optimization, the optimized parameters of the prediction value network are applied to the corresponding parameters of the target value network by soft update. Let the optimized parameters of the prediction value network be w, the current corresponding parameters of the target value network be v, and the updated parameters be v'; the soft update formula is:
v' = a·w + (1 - a)·v
wherein a is a preset learning coefficient.
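The soft update can be applied directly to the network parameters. The following is a minimal sketch assuming PyTorch-style modules; the patent does not specify a framework, and the coefficient value is illustrative.

import torch

def soft_update(target_net: torch.nn.Module, source_net: torch.nn.Module, a: float = 0.005):
    # Blend the trained parameters into the target network: v' = a*w + (1 - a)*v.
    with torch.no_grad():
        for v, w in zip(target_net.parameters(), source_net.parameters()):
            v.mul_(1.0 - a).add_(a * w)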
Similarly, the objective function contains the parameters of the prediction decision network to be optimized; during training, these parameters are optimized with the objective of maximizing the objective function, yielding the optimized parameters of the prediction decision network.
After optimization, the optimized parameters of the prediction decision network are likewise applied to the corresponding parameters of the target decision network by soft update, completing one round of training.
After each round, charging operation experience data is again selected from the preset experience pool and a new round of training is performed, until training ends and the final target decision network is output as the latest decision network.
Further, the charging operation experience data is calculated based on the preset target reward function, specifically comprising:
denoting the current state information as S1 and the reference power as A, and collecting the post-charging state information S2 after the private charger has charged the electric vehicle at the reference power;
calculating a reward value R with the preset target reward function according to the post-charging state information S2;
and taking [S1, A, R, S2] as the charging operation experience data.
In a preferred embodiment, the reward value evaluates the reference power, i.e. it quantifies how much charging the electric vehicle at the reference power contributes to stabilizing the micro-grid frequency and to the vehicle's charging progress. Including the reward value in the operation experience improves the training effect of the decision network.
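A minimal sketch of how such [S1, A, R, S2] tuples could be collected and stored follows; the ExperiencePool class, the record_experience helper and their interfaces are illustrative assumptions rather than names from the patent.

import random
from collections import deque

class ExperiencePool:
    # Fixed-size pool of [S1, A, R, S2] charging operation experience tuples.
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, s1, a, r, s2):
        self.buffer.append((s1, a, r, s2))

    def sample(self, batch_size):
        # Random selection used for each round of prediction value network training.
        return random.sample(list(self.buffer), batch_size)

def record_experience(pool, s1, a, observe_after_charging, reward_fn):
    # One real-operation experience: charge at reference power A, observe S2, score it.
    s2 = observe_after_charging()   # measured (frequency deviation, SOC) after charging
    r = reward_fn(*s2)              # preset target reward evaluated on the new state
    pool.add(s1, a, r, s2)
    return s2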
Preferably, when the amount of charging operation experience data in the preset experience pool is smaller than a preset quantity threshold, the preset experience pool is filled with simulated charging operation experience data; the simulated charging operation experience data is obtained specifically as follows:
establishing a load frequency model according to preset configuration information, wherein the preset configuration information comprises the state information at each moment;
calculating the reference power At at time t with the prediction decision network according to the state information St at time t;
simulating the load frequency model with the reference power At at time t to obtain the state information St+1 at time t+1, and calculating a reward value Rt from the state information St+1 with the preset target reward function;
and outputting [St, At, Rt, St+1] to the preset experience pool as the simulated charging operation experience data.
In a preferred embodiment, the load frequency model is a mathematical model that reflects the frequency-load relationship based on the characteristics of the actual micro-grid and the electric vehicles. In the invention, the load frequency model serves as the interaction environment for multi-agent deep reinforcement learning: it simulates the micro-grid frequency deviation after the chargers adjust their charging and discharging power, and thereby yields the simulated charging operation experience data.
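The following sketch illustrates how such simulated experience might be generated. The first-order frequency response, the sign conventions, the configuration fields and all coefficients are assumptions made for illustration; the patent does not disclose the concrete load frequency model used.

def simulate_experience(actor, pool, reward_fn, configs, dt_s=4.0, inertia=10.0, damping=1.0):
    # Roll the prediction decision network through an assumed load frequency model
    # and store [St, At, Rt, St+1] tuples in the preset experience pool.
    for cfg in configs:                              # one preset configuration per episode
        delta_f, soc = cfg["delta_f0"], cfg["soc0"]  # initial frequency deviation and SOC
        for _ in range(cfg["steps"]):
            s_t = (delta_f, soc)
            a_t = actor(s_t)                         # reference charging power in kW (signed)
            # Assumed first-order response: charging load deepens a low-frequency event,
            # discharging supports the grid (all coefficients are illustrative).
            imbalance_kw = cfg["load_disturbance_kw"] + a_t
            delta_f += dt_s * (-damping * delta_f - imbalance_kw / inertia)
            soc = min(1.0, max(0.0, soc + a_t * dt_s / 3600.0 / cfg["battery_kwh"]))
            s_t1 = (delta_f, soc)
            r_t = reward_fn(delta_f, soc)            # preset target reward on the new state
            pool.add(s_t, a_t, r_t, s_t1)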
Further, the method further comprises:
uploading the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
In a preferred embodiment, the preset period may be set to days, weeks or months according to actual needs. After the charging operation experience is uploaded to the experience pool, the central controller retrains the decision network with the updated experience pool and finally sends the updated latest decision network to the private chargers. Such an architecture has a lower communication cost than existing centralized or distributed control architectures.
For step S3, specifically, controlling the charging power of the electric vehicle based on the output of the latest decision network model specifically comprises:
letting the state information be S, the reference power A is calculated as:
A = μ(S, θ)
wherein μ is the output function of the latest decision network and θ is its network parameter.
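A minimal sketch of this mapping follows, assuming a small fully connected actor network in PyTorch whose output is scaled to the charger's power rating; the layer sizes, the 7 kW limit and the example state are assumptions, not values from the patent.

import torch
import torch.nn as nn

class DecisionNetwork(nn.Module):
    # Actor mu(S; theta): maps (frequency deviation, SOC) to a reference charging power.
    def __init__(self, state_dim=2, hidden=64, p_max_kw=7.0):
        super().__init__()
        self.p_max_kw = p_max_kw
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Tanh(),   # output in [-1, 1]
        )

    def forward(self, state):
        # Negative values correspond to discharging through the bidirectional charger.
        return self.p_max_kw * self.net(state)

mu = DecisionNetwork()
A = mu(torch.tensor([[0.08, 0.55]]))   # A = mu(S, theta) for S = [Δf (Hz), SOC]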
Compared with the prior art, the invention has the beneficial effects that:
By performing centralized training and using private bidirectional chargers for decentralized control, the invention also incorporates electric vehicles connected to the micro-grid through private chargers into the micro-grid frequency modulation control strategy, improving the comprehensiveness of electric vehicle participation in micro-grid frequency modulation control.
In addition, the decentralized control provided by the invention realizes charging control of distributed electric vehicles with information exchange only once per preset period, which reduces communication cost compared with existing centralized control.
Referring to fig. 2, a schematic structural diagram of a distributed electric vehicle charging control device for auxiliary frequency modulation according to another embodiment of the present invention comprises an acquisition module 201, an input module 202 and a charging module 203;
the acquisition module 201 is configured to acquire current state information, the state information comprising the frequency deviation of a micro-grid and the state of charge of an electric vehicle;
the input module 202 is configured to input the current state information into a latest decision network model, the latest decision network model being obtained by training based on a preset target reward function, and the preset target reward function being constructed from the state information;
the charging module 203 is configured to control the charging power of the electric vehicle based on the output of the latest decision network model and to store the current charging operation experience.
Further, the charging module 203 is further configured to upload the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
Referring to fig. 3, a distributed electric vehicle charging control architecture for auxiliary frequency modulation according to another embodiment of the present invention comprises distributed electric vehicles and a central server.
The central server serves as the training center of the network: after receiving experience from the chargers, it trains the network with a multi-agent deep reinforcement learning algorithm and transmits the trained prediction decision network parameters to the corresponding private chargers.
Each distributed electric vehicle is connected to a private charger. After loading the prediction decision network from the central server, the private charger automatically controls the charging and discharging power of the electric vehicle according to the state information and saves the experience. At regular intervals, the private charger packages the experience saved in its local experience pool and sends it to the central server, as outlined in the sketch below.
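A minimal sketch of the charger-side control loop under this architecture; the measurement, actuation and upload callbacks, the 4-second control step and the weekly upload interval are illustrative assumptions, and the pool is assumed to be the ExperiencePool sketched above.

import time

def charger_loop(mu, pool, measure_state, apply_power, upload, reward_fn,
                 control_dt_s=4.0, upload_every_s=7 * 24 * 3600):
    # Decentralized control on a private charger: act locally, upload experience periodically.
    last_upload = time.time()
    s1 = measure_state()                       # (frequency deviation, SOC)
    while True:
        a = mu(s1)                             # reference power from the loaded decision network
        apply_power(a)                         # set the charger's charging/discharging power
        time.sleep(control_dt_s)
        s2 = measure_state()
        pool.add(s1, a, reward_fn(*s2), s2)    # save [S1, A, R, S2] locally
        s1 = s2
        if time.time() - last_upload >= upload_every_s:
            upload(list(pool.buffer))          # send packaged experience to the central server
            last_upload = time.time()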
It is to be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications based on the above teachings will be apparent to those of ordinary skill in the art. It is neither necessary nor possible to exhaustively list all embodiments here. Any modification, equivalent replacement, improvement, etc. which comes within the spirit and principles of the invention is intended to be protected by the following claims.

Claims (8)

1. A distributed electric vehicle charging control method for auxiliary frequency modulation, characterized by comprising the following steps:
acquiring current state information; the state information comprises the frequency deviation of a micro-grid and the state of charge of an electric vehicle;
inputting the current state information into a latest decision network model; the latest decision network model is obtained by training based on a preset target reward function, and the preset target reward function is constructed from the state information;
controlling the charging power of the electric vehicle based on the output of the latest decision network model while storing the current charging operation experience data;
wherein the latest decision network model is obtained by training based on the preset target reward function, specifically comprising:
initializing a prediction decision network, a prediction value network, a target decision network and a target value network;
randomly selecting charging operation experience data from a preset experience pool and training the prediction value network according to a preset loss function; the charging operation experience data is calculated from collected actual operation data and the preset target reward function;
copying the trained parameters of the prediction value network to the target value network by soft update;
constructing an objective function according to the updated target value network and training the prediction decision network with the objective function;
copying the trained parameters of the prediction decision network to the target decision network by soft update;
re-selecting charging operation experience data and performing a new round of training, until the number of training rounds reaches a preset training threshold, whereupon training ends and the target decision network obtained in the last round is output as the latest decision network;
and wherein the charging operation experience data is calculated from the collected actual operation data and the preset target reward function, specifically comprising:
denoting the current state information as S1, inputting S1 into the latest decision network to obtain a reference power A, and obtaining post-charging state information S2 after the electric vehicle has been charged at the reference power;
calculating a reward value R with the preset target reward function according to the post-charging state information S2;
and taking [S1, A, R, S2] as the charging operation experience data.
2. The distributed electric vehicle charging control method for auxiliary frequency modulation according to claim 1, wherein the preset target reward function is constructed from the state information, specifically comprising:
constructing a first reward function according to the frequency deviation of the micro-grid;
constructing a second reward function according to the state of charge of the electric vehicle;
and weighting and summing the first reward function and the second reward function according to preset weight coefficients to obtain the preset target reward function.
3. The distributed electric vehicle charging control method for auxiliary frequency modulation according to claim 2, wherein constructing the first reward function according to the frequency deviation of the micro-grid specifically comprises:
assuming the frequency deviation of the micro-grid is Δf, the first reward function r1 is calculated as:
wherein f1, f2 and f3 denote the frequency deviation boundaries of the micro-grid under normal operation, auxiliary control and emergency control, respectively, and α1, α2 and α3 are the preset weight coefficients corresponding to f1, f2 and f3.
4. The distributed electric vehicle charging control method for auxiliary frequency modulation according to claim 2, wherein constructing the second reward function according to the state of charge of the electric vehicle specifically comprises:
assuming the state of charge of the electric vehicle is SOC, the second reward function r2 is calculated as:
wherein rmax is a preset maximum reward value, SOCmin is a preset minimum state of charge, SOC* is a preset target state of charge, and SOCmax is a preset maximum state of charge.
5. The distributed electric vehicle charging control method for auxiliary frequency modulation according to claim 1, wherein when the amount of charging operation experience data in the preset experience pool is smaller than a preset quantity threshold, the preset experience pool is filled with simulated charging operation experience data; the simulated charging operation experience data is obtained specifically as follows:
establishing a load frequency model according to preset configuration information; wherein the preset configuration information comprises the state information at each moment;
denoting the state information at time t as St, and calculating the reference power At at time t with the prediction decision network;
simulating the load frequency model with the reference power At at time t to obtain the state information St+1 at time t+1, and calculating a reward value Rt according to the state information St+1;
and outputting [St, At, Rt, St+1] to the preset experience pool as the simulated charging operation experience data.
6. The distributed electric vehicle charging control method for auxiliary frequency modulation according to claim 1, further comprising:
uploading the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
7. A distributed electric vehicle charging control device for auxiliary frequency modulation, characterized by comprising an acquisition module, an input module and a charging module;
the acquisition module is configured to acquire current state information; the state information comprises the frequency deviation of a micro-grid and the state of charge of an electric vehicle;
the input module is configured to input the current state information into a latest decision network model; the latest decision network model is obtained by training based on a preset target reward function, and the preset target reward function is constructed from the state information;
the charging module is configured to control the charging power of the electric vehicle based on the output of the latest decision network model and to store the current charging operation experience data;
wherein the latest decision network model is obtained by training based on the preset target reward function, specifically comprising:
initializing a prediction decision network, a prediction value network, a target decision network and a target value network;
randomly selecting charging operation experience data from a preset experience pool and training the prediction value network according to a preset loss function; the charging operation experience data is calculated from collected actual operation data and the preset target reward function;
copying the trained parameters of the prediction value network to the target value network by soft update;
constructing an objective function according to the updated target value network and training the prediction decision network with the objective function;
copying the trained parameters of the prediction decision network to the target decision network by soft update;
re-selecting charging operation experience data and performing a new round of training, until the number of training rounds reaches a preset training threshold, whereupon training ends and the target decision network obtained in the last round is output as the latest decision network;
and wherein the charging operation experience data is calculated from the collected actual operation data and the preset target reward function, specifically comprising:
denoting the current state information as S1, inputting S1 into the latest decision network to obtain a reference power A, and obtaining post-charging state information S2 after the electric vehicle has been charged at the reference power;
calculating a reward value R with the preset target reward function according to the post-charging state information S2;
and taking [S1, A, R, S2] as the charging operation experience data.
8. The device according to claim 7, wherein the charging module is further configured to upload the charging operation experience data stored during a preset period to a preset experience pool once every preset period.
CN202410067438.5A 2024-01-17 Distributed electric vehicle charging control method and device for auxiliary frequency modulation Active CN117863948B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410067438.5A CN117863948B (en) 2024-01-17 Distributed electric vehicle charging control method and device for auxiliary frequency modulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410067438.5A CN117863948B (en) 2024-01-17 Distributed electric vehicle charging control method and device for auxiliary frequency modulation

Publications (2)

Publication Number Publication Date
CN117863948A CN117863948A (en) 2024-04-12
CN117863948B true CN117863948B (en) 2024-06-11



Patent Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110809306A (en) * 2019-11-04 2020-02-18 电子科技大学 Terminal access selection method based on deep reinforcement learning
CN111934335A (en) * 2020-08-18 2020-11-13 华北电力大学 Cluster electric vehicle charging behavior optimization method based on deep reinforcement learning
CN112187074A (en) * 2020-09-15 2021-01-05 电子科技大学 Inverter controller based on deep reinforcement learning
CN113270937A (en) * 2021-03-30 2021-08-17 鹏城实验室 Standby battery scheduling method, computer readable storage medium and system
CN113141017A (en) * 2021-04-29 2021-07-20 福州大学 Control method for energy storage system to participate in primary frequency modulation of power grid based on DDPG algorithm and SOC recovery
CN112989017A (en) * 2021-05-17 2021-06-18 南湖实验室 Method for generating high-quality simulation experience for dialogue strategy learning
CN113627993A (en) * 2021-08-26 2021-11-09 东北大学秦皇岛分校 Intelligent electric vehicle charging and discharging decision method based on deep reinforcement learning
CN113872198A (en) * 2021-09-29 2021-12-31 电子科技大学 Active power distribution network fault recovery method based on reinforcement learning method
WO2023064474A1 (en) * 2021-10-14 2023-04-20 University Of Pittsburgh - Of The Commonwealth System Of Higher Education Systems and methods for controlling magnetic microdevices with machine learning
CN114091879A (en) * 2021-11-15 2022-02-25 浙江华云电力工程设计咨询有限公司 Multi-park energy scheduling method and system based on deep reinforcement learning
CN114423061A (en) * 2022-01-20 2022-04-29 重庆邮电大学 Wireless route optimization method based on attention mechanism and deep reinforcement learning
CN115051403A (en) * 2022-03-16 2022-09-13 国网浙江省电力有限公司丽水供电公司 Island microgrid load frequency control method and system based on deep Q learning
CN114742453A (en) * 2022-05-06 2022-07-12 江苏大学 Micro-grid energy management method based on Rainbow deep Q network
CN115097729A (en) * 2022-06-21 2022-09-23 广东工业大学 Boiler soot blower optimization control method and system based on reinforcement learning
CN115257745A (en) * 2022-07-21 2022-11-01 同济大学 Automatic driving lane change decision control method based on rule fusion reinforcement learning
CN115238891A (en) * 2022-07-29 2022-10-25 腾讯科技(深圳)有限公司 Decision model training method, and target object strategy control method and device
CN115366099A (en) * 2022-08-18 2022-11-22 江苏科技大学 Mechanical arm depth certainty strategy gradient training method based on forward kinematics
CN116185584A (en) * 2023-01-09 2023-05-30 西北工业大学 Multi-tenant database resource planning and scheduling method based on deep reinforcement learning
CN116456493A (en) * 2023-04-20 2023-07-18 无锡学院 D2D user resource allocation method and storage medium based on deep reinforcement learning algorithm
CN116454902A (en) * 2023-05-09 2023-07-18 广东电网有限责任公司 Power distribution network voltage regulating method, device, equipment and storage medium based on reinforcement learning
CN116824848A (en) * 2023-06-08 2023-09-29 甘肃紫光智能交通与控制技术有限公司 Traffic signal optimization control method based on Bayesian deep Q network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A deep Q-network method with maximum upper-confidence-bound experience sampling; Zhu Fei; Wu Wen; Liu Quan; Fu Yuchen; Journal of Computer Research and Development; 2018-08-15 (No. 08); full text *
Adaptive radio resource allocation algorithm for heterogeneous cloud radio access networks based on deep reinforcement learning; Chen Qianbin; Guan Lingjin; Li Ziyu; Wang Zhaokun; Yang Heng; Tang Lun; Journal of Electronics & Information Technology; 2020-06-15 (No. 06); full text *

Similar Documents

Publication Publication Date Title
JP2000209707A (en) Charging plan equipment for electric vehicle
CN113515884A (en) Distributed electric vehicle real-time optimization scheduling method, system, terminal and medium
CN103078389B (en) Integrated power system control method and the relevant device with energy storage elements
CN104321947B (en) Charge rate optimizes
CN112117760A (en) Micro-grid energy scheduling method based on double-Q-value network deep reinforcement learning
CN113103905B (en) Intelligent charging distribution adjusting method, device, equipment and medium for electric automobile
CN108596667B (en) Electric automobile real-time charging electricity price calculation method based on Internet of vehicles
CN113511082A (en) Hybrid electric vehicle energy management method based on rule and double-depth Q network
CN108471139B (en) Regional power grid dynamic demand response method containing new energy and temperature control load
CN114069612A (en) Charging pile access control method and device, computer equipment and storage medium
CN113997805A (en) Charging control method and system of new energy automobile, vehicle-mounted terminal and medium
CN106165186A (en) Accumulator control device and accumulator control method
CN117863948B (en) Distributed electric vehicle charging control method and device for auxiliary frequency modulation
CN115587645A (en) Electric vehicle charging management method and system considering charging behavior randomness
CN117863948A (en) Distributed electric vehicle charging control method and device for auxiliary frequency modulation
CN110535196B (en) Charging method, charging device, and remote server performed in a power conversion facility
CN116993031A (en) Charging decision optimization method, device, equipment and medium for electric vehicle
CN114619907A (en) Coordinated charging method and coordinated charging system based on distributed deep reinforcement learning
CN112018847A (en) Charging processing method and device for rechargeable battery and electric vehicle
CN107925244A (en) Based on the horizontal definite control load of future energy and energy source
CN113650515B (en) Electric automobile charging control method and device, terminal equipment and storage medium
CN113561834B (en) Ordered charging management method and system for charging piles
WO2014120250A1 (en) Battery maintenance system
CN117863969B (en) Electric automobile charge and discharge control method and system considering battery loss
WO2021056662A1 (en) Charging regulation and control method and apparatus, charging system, computer device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant