CN116703062A - Ordered charging method for electric vehicles based on the deep deterministic policy gradient algorithm

Publication number
CN116703062A
Authority
CN
China
Prior art keywords
charging
electric automobile
gradient algorithm
network
ordered
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310522988.7A
Other languages
Chinese (zh)
Inventor
韩帅
肖静
吴宁
陈卫东
郭敏
吴晓锐
龚文兰
卢健斌
姚知洋
莫宇鸿
郭小璇
孙乐平
赵立夏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of Guangxi Power Grid Co Ltd
Original Assignee
Electric Power Research Institute of Guangxi Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of Guangxi Power Grid Co Ltd
Priority to CN202310522988.7A
Publication of CN116703062A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06315Needs-based resource requirements planning or analysis
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60LPROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
    • B60L53/00Methods of charging batteries, specially adapted for electric vehicles; Charging stations or on-board charging equipment therefor; Exchange of energy storage elements in electric vehicles
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60LPROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
    • B60L53/00Methods of charging batteries, specially adapted for electric vehicles; Charging stations or on-board charging equipment therefor; Exchange of energy storage elements in electric vehicles
    • B60L53/60Monitoring or controlling charging stations
    • B60L53/62Monitoring or controlling charging stations in response to charging parameters, e.g. current, voltage or electrical charge
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/092Reinforcement learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/60Other road transportation technologies with climate change mitigation effect
    • Y02T10/70Energy storage systems for electromobility, e.g. batteries

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Power Engineering (AREA)
  • Marketing (AREA)
  • General Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Tourism & Hospitality (AREA)
  • Transportation (AREA)
  • General Business, Economics & Management (AREA)
  • Mechanical Engineering (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • Water Supply & Treatment (AREA)
  • Operations Research (AREA)
  • Primary Health Care (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Charge And Discharge Circuits For Batteries Or The Like (AREA)

Abstract

The application discloses an ordered charging method for electric vehicles based on the deep deterministic policy gradient (DDPG) algorithm. The method takes into account the data fed back in real time by a charging monitoring system and the time-of-use electricity price signal, accounts for the uncertainty of electric vehicle travel patterns and charging demands, and optimizes the charging behavior of electric vehicles at the load aggregator level. The charging process of a single electric vehicle is modeled, and the resulting optimal scheduling model is solved with the DDPG algorithm to obtain an optimal charging plan accurately and quickly, so as to optimize operation within the charging station and effectively reduce the station's daily operating cost. The method protects the electric vehicle battery while meeting the charging demand of the electric vehicle user.

Description

Ordered charging method for electric vehicles based on the deep deterministic policy gradient algorithm
Technical Field
The application relates to the technical field of electric vehicles, and in particular to an ordered charging method for electric vehicles based on the deep deterministic policy gradient algorithm.
Background
Electric vehicles have developed rapidly in recent years. As a new type of load, electric vehicles are both random and flexible. Their charging times overlap heavily with people's work and rest schedules, so the charging load is very likely to coincide with the base load of the existing power grid and produce a peak-on-peak overload. This further increases the grid load, causes the rate of current rise at each node of the grid to increase sharply, makes heavily loaded lines and substations harder to operate, increases grid losses, and accelerates the aging of grid operation, distribution and transmission equipment, with a considerable negative effect on the safe operation of the power system and on the service experienced by electricity consumers. Moreover, meeting the maximum load at the moment of peak consumption places higher demands on grid capacity expansion, which is unfavorable to the economy and development of grid construction.
To address the widening peak-valley difference of the load curve caused by connecting electric vehicles to the grid, guiding users to charge in an orderly, coordinated manner is a focus of current research, and economic guidance through peak-valley electricity prices is an important viable direction. The smart grid is an important direction in the construction of current power systems: by analyzing and evaluating data from three parties (the grid, the charging equipment and the electric vehicle users), an ordered guidance strategy based on peak-valley electricity prices is applied to charging users. This achieves peak shaving and valley filling, further reduces grid operating losses, improves the stability and safety of the power system, and maximizes the benefit to all three parties, ultimately promoting larger-scale adoption of electric vehicles and the clean transformation of the energy demand side. However, current ordered charging plans for charging stations do not take into account the uncertainty in time-of-use electricity prices and electric vehicle user behavior.
Disclosure of Invention
The embodiments of the application provide an ordered charging method for electric vehicles based on the deep deterministic policy gradient algorithm, which at least solves the problem of how to formulate an ordered charging plan for a charging station while taking into account the uncertainty of time-of-use electricity prices and electric vehicle user behavior.
According to one aspect of the embodiments of the application, an ordered charging method for electric vehicles based on the deep deterministic policy gradient algorithm is provided, including:
from the perspective of a load aggregator, comprehensively considering the charging demands of community electric vehicle users, adjusting the charging behavior of the electric vehicles in the charging station, integrating the state of charge (SOC) information fed back by the charging monitoring system and the users' predicted pick-up time information, and establishing an ordered charging optimization model of the community electric vehicle cluster;
and solving the ordered charging optimization model of the community electric vehicle cluster with the deep deterministic policy gradient algorithm to obtain an optimal charging plan, the optimal charging plan achieving the goals of optimizing operation within the charging station and effectively reducing the daily operating cost of the station.
Optionally, the optimization objective of the ordered charging optimization model of the community electric automobile cluster is as follows:
where $P_{n,t}$ is the charging power of the $n$-th electric vehicle in period $t$; $\rho_t$ is the time-of-use electricity price in period $t$; $N_t$ is the total number of electric vehicles connected to the power grid in the charging station in period $t$; $t_{n,arr}$ and $t_{n,lea}$ are the times at which the $n$-th electric vehicle arrives at and leaves the charging station, respectively; and $F$ is the total electricity cost of the electric vehicle cluster over all periods.
Optionally, the constraints of the ordered charging optimization model of the community electric vehicle cluster include: an electric vehicle state-of-charge constraint, a user charging expectation constraint, an electric vehicle charging pile operation constraint and an electric vehicle charging time constraint.
Optionally, solving the ordered charging optimization model of the community electric vehicle cluster with the deep deterministic policy gradient algorithm comprises:
approximating the policy function and the action-value function with deep neural networks, i.e., determining the parameters $\theta^{Q}$ of the value network and the parameters $\theta^{\mu}$ of the policy network;
adding target networks with the same structure as the policy network and the value network to improve the performance of the deep deterministic policy gradient algorithm, the optimized target network parameters being $\theta^{Q'}$ and $\theta^{\mu'}$, respectively;
and training the target networks and the value network with the optimized parameters, so that the ordered charging optimization model of the community electric vehicle cluster outputs an optimal policy.
Optionally, the parameters $\theta^{Q}$ of the value network are updated by minimizing the loss function $L_Q$:
$L_Q = E\left[\left(y_t - Q(s_t, a_t \mid \theta^{Q})\right)^2\right]$
where $Q(s_t, a_t \mid \theta^{Q})$ is the output of the value network, i.e., the expected return when action $a_t$ is performed in state $s_t$ in period $t$, and $y_t$ is the target Q value:
$y_t = r_t + \gamma Q'\left(s_{t+1}, \mu'(s_{t+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\right)$
where $r_t$ is the reward value in period $t$, $\gamma$ is the discount factor, and $Q'$ and $\mu'$ are the target value network and the target policy network, respectively.
Optionally, $r_t$ is expressed as the negative of the cumulative reward $J$ obtained from reinforcement training:
$r_t = -J = -\omega_1 J_1 - \omega_2 J_2 - \omega_3 J_3 - \cdots$
where $J_1$, $J_2$, $J_3$ are the rewards obtained in each training, and $\omega_1$, $\omega_2$, $\omega_3$ are the corresponding weights.
Optionally, the parameters $\theta^{\mu}$ of the policy network are updated by minimizing the loss function $L_\mu$:
$L_\mu = -E\left[Q(s_t, \mu(s_t))\right]$
where $Q(s_t, \mu(s_t))$ is the Q value, i.e., the value of the action-state value function for the action that the policy network outputs in state $s_t$ in period $t$;
the target network parameters $\theta^{Q'}$ and $\theta^{\mu'}$ are updated as follows:
$\theta^{\mu'} \leftarrow \tau\theta^{\mu} + (1-\tau)\theta^{\mu'}$
$\theta^{Q'} \leftarrow \tau\theta^{Q} + (1-\tau)\theta^{Q'}$
where $\tau$ is the soft update rate factor; the larger $\tau$ is, the faster the value network parameters $\theta^{Q}$ and the policy network parameters $\theta^{\mu}$ are transferred to the corresponding target network parameters $\theta^{Q'}$ and $\theta^{\mu'}$.
According to another aspect of the embodiments of the application, an ordered charging device for electric vehicles based on the deep deterministic policy gradient algorithm is also provided, including:
a community electric vehicle cluster ordered charging optimization model building module, configured to comprehensively consider the charging demands of community electric vehicle users from the perspective of a load aggregator, adjust the charging behavior of the electric vehicles in the charging station, integrate the SOC information fed back by the charging monitoring system and the users' predicted pick-up time information, and establish an ordered charging optimization model of the community electric vehicle cluster;
and a model solving module, configured to solve the ordered charging optimization model of the community electric vehicle cluster with the deep deterministic policy gradient algorithm to obtain an optimal charging plan, the optimal charging plan achieving the goals of optimizing operation within the charging station and effectively reducing the daily operating cost of the station.
According to another aspect of the embodiments of the application, a computer-readable storage medium is further provided. The computer-readable storage medium includes a stored program, and when the program runs, it controls the device in which the computer-readable storage medium is located to execute the ordered charging method for electric vehicles based on the deep deterministic policy gradient algorithm according to any one of the above embodiments.
According to another aspect of the embodiments of the application, a processor is further provided. The processor is configured to run a program, and the program, when run, executes any one of the above ordered charging methods for electric vehicles based on the deep deterministic policy gradient algorithm.
Compared with the prior art, the application has the following beneficial effects:
In the embodiments of the application, the data fed back in real time by the charging monitoring system and the time-of-use electricity price signal are taken into account, the uncertainty of electric vehicle travel patterns and charging demands is considered, and the charging behavior of electric vehicles is optimized at the load aggregator level. The charging process of a single electric vehicle is modeled, and the resulting optimal scheduling model is solved with the deep deterministic policy gradient (DDPG) algorithm to obtain an optimal charging plan accurately and quickly, so as to optimize operation within the charging station and effectively reduce the station's daily operating cost. The electric vehicle battery is protected while the charging demand of the electric vehicle user is met.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawing in the description below is only one embodiment of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of an ordered charging method for electric vehicles based on the deep deterministic policy gradient algorithm according to an embodiment of the present application;
Fig. 2 is a schematic diagram of the relationship between the time-of-use electricity price, the electric vehicle user and the electric vehicle charging pile according to an embodiment of the present application;
Fig. 3 is a schematic diagram of the flow by which DDPG optimizes the ordered charging optimization model of a community electric vehicle cluster according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a time sampling scenario according to an embodiment of the present application;
Fig. 5 is a schematic diagram of a DDPG algorithm training scenario according to an embodiment of the present application.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
In order that those skilled in the art will better understand the present application, a technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate in order to describe the embodiments of the application herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, article, or device that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or device.
Example 1
According to an embodiment of the present application, an embodiment of an ordered charging method for electric vehicles based on the deep deterministic policy gradient algorithm is provided. It should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order.
Fig. 1 is a flowchart of an ordered charging method for electric vehicles based on the deep deterministic policy gradient algorithm according to an embodiment of the present application. As shown in Fig. 1, the method includes the following steps:
Step S1: for the ordered charging problem of large-scale electric vehicles in a community charging station, comprehensively considering the charging demands of community electric vehicle users from the perspective of a load aggregator, adjusting the charging behavior of the electric vehicles in the charging station, integrating the SOC information fed back by the charging monitoring system and the users' predicted pick-up time information, and establishing an ordered charging optimization model of the community electric vehicle cluster;
Step S2: solving the ordered charging optimization model of the community electric vehicle cluster with the deep deterministic policy gradient (DDPG) algorithm to obtain an optimal charging plan, so as to optimize operation within the charging station and effectively reduce the daily operating cost of the station.
The application takes into account the data fed back in real time by the charging monitoring system and the time-of-use electricity price signal, considers the uncertainty of electric vehicle travel patterns and charging demands, and optimizes the charging behavior of electric vehicles at the load aggregator level. The charging process of a single electric vehicle is modeled, and the resulting optimal scheduling model is solved with the DDPG algorithm to obtain an optimal charging plan accurately and quickly, so as to optimize operation within the charging station and effectively reduce the station's daily operating cost. The electric vehicle battery is protected while the charging demand of the electric vehicle user is met.
As an alternative embodiment, as shown in Fig. 2, the load aggregator is the intermediate link between the power grid and the user, and its profit comes mainly from the difference between the charging management service fee charged to electric vehicle users and the cost of the electricity purchased from the power grid. Once the charging management service fee has been determined, optimizing the charging behavior of the electric vehicle cluster in response to the time-of-use electricity price reduces the cost of the electricity purchased from the power grid and gives the load aggregator a larger profit margin. Therefore, the optimization objective of the ordered charging optimization model of the community electric vehicle cluster is as follows:
where $P_{n,t}$ is the charging power of the $n$-th electric vehicle in period $t$; $\rho_t$ is the time-of-use electricity price in period $t$; $N_t$ is the total number of electric vehicles connected to the power grid in the charging station in period $t$; $t_{n,arr}$ and $t_{n,lea}$ are the times at which the $n$-th electric vehicle arrives at and leaves the charging station, respectively; and $F$ is the total electricity cost of the electric vehicle cluster over all periods.
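The objective formula itself did not survive in this text. As a minimal sketch, assuming the objective has the form $F = \sum_{n} \sum_{t=t_{n,arr}}^{t_{n,lea}} \rho_t P_{n,t} \Delta t$ (an assumption inferred from the symbol definitions above, not the patent's confirmed formula), the total cost could be computed as follows. The function name, argument names and sample data are hypothetical.

```python
from typing import Sequence


def total_charging_cost(
    power_kw: Sequence[Sequence[float]],  # power_kw[n][t]: charging power P_{n,t}
    price: Sequence[float],               # price[t]: time-of-use price rho_t
    arrive: Sequence[int],                # arrive[n]: period index t_{n,arr}
    leave: Sequence[int],                 # leave[n]: period index t_{n,lea}
    dt_hours: float = 1.0,                # length of one period
) -> float:
    """Assumed form of F: sum of price * power * dt over each EV's connected periods."""
    cost = 0.0
    for n, row in enumerate(power_kw):
        for t in range(arrive[n], leave[n] + 1):
            cost += price[t] * row[t] * dt_hours
    return cost


# Hypothetical data: 2 vehicles, 4 one-hour periods.
price = [0.5, 0.5, 1.0, 1.0]
power = [
    [7.0, 7.0, 0.0, 0.0],
    [0.0, 3.0, 3.0, 0.0],
]
F = total_charging_cost(power, price, arrive=[0, 1], leave=[1, 2])
```

Minimizing this quantity over the charging powers, subject to the constraints listed next, is what the scheduling model is described as doing.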
As an optional embodiment, the constraints of the ordered charging optimization model of the community electric vehicle cluster include: the state of charge (SOC) constraint of the electric vehicle, the charging expectation constraint of the user, the electric vehicle charging pile operation constraint and the electric vehicle charging time constraint. Specifically:
1) In period $t$, the state of charge constraint of the electric vehicle can be expressed as:
where $SOC_{n,t}$ is the SOC of the $n$-th electric vehicle in period $t$; $SOC^{\max}$ and $SOC^{\min}$ are the upper and lower limits of $SOC_{n,t}$; $Q_n$ is the battery capacity of the $n$-th electric vehicle; $\eta_{n,t}$ is the charging efficiency corresponding to the charging power $P_{n,t}$ of the $n$-th electric vehicle in period $t$; and $\Delta t$ is the length of a time slot.
2) For electric vehicle charging piles with continuously adjustable power, the average charging power of the charging pile is strongly correlated with the charging power $P_{n,t}$, and the relationship between the two is approximated as:
To meet users' travel needs and reasonably avoid overcharging and undercharging, the battery SOC of the electric vehicle should lie within the range expected by the user when the user picks up the vehicle. The user's charging expectation constraint is therefore:
where $SOC_n^{exp}$ is the SOC expected by the user when the electric vehicle leaves; $\varepsilon$ is the allowable difference between the SOC when the electric vehicle leaves and the expected SOC; and $t$ is the current time.
3) For the safe and stable operation of the electric vehicle charging pile, the charging power of the electric vehicle is constrained (the electric vehicle charging pile operation constraint):
$0 \le P_{n,t} \le P_{\max}$
where $P_{\max}$ is the upper limit of the charging power of the electric vehicle charging pile.
4) Because the period during which the electric vehicle is connected to the power grid through the charging pile is the time range within which the power system can schedule it freely, the electric vehicle charging time $t$ is constrained by:
$t_{n,arr} \le t \le t_{n,lea}$
where $t_{n,arr}$ and $t_{n,lea}$ are the times at which the $n$-th electric vehicle arrives at and leaves the charging station, respectively.
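The four constraints above can be sketched as simple checks. The formulas for the SOC update and the expectation constraint are missing from this text, so the forms used here are assumptions inferred from the symbol definitions: SOC advances by $\eta P \Delta t / Q_n$ per period, and the departure SOC must lie within $\varepsilon$ of the expected SOC. All function and parameter names are hypothetical.

```python
def next_soc(soc: float, power_kw: float, efficiency: float,
             capacity_kwh: float, dt_hours: float = 1.0) -> float:
    """Assumed SOC update: energy delivered in one period divided by battery capacity."""
    return soc + efficiency * power_kw * dt_hours / capacity_kwh


def is_feasible(soc: float, power_kw: float, t: int, arrive: int, leave: int,
                *, soc_min: float, soc_max: float, p_max: float) -> bool:
    """Check the SOC bounds, the charging pile power limit and the charging time window."""
    soc_ok = soc_min <= soc <= soc_max      # constraint 1: SOC within limits
    power_ok = 0.0 <= power_kw <= p_max     # constraint 3: 0 <= P <= P_max
    time_ok = arrive <= t <= leave          # constraint 4: t_arr <= t <= t_lea
    return soc_ok and power_ok and time_ok


def meets_expectation(soc_at_departure: float, soc_expected: float, eps: float) -> bool:
    """Assumed form of the user expectation constraint: |SOC_lea - SOC_exp| <= eps."""
    return abs(soc_at_departure - soc_expected) <= eps


# Hypothetical numbers: 63 kWh battery, 7 kW charger, 90 % efficiency, 1-hour period.
soc_after = next_soc(0.4, 7.0, 0.9, 63.0)
```

With these numbers the SOC rises by about 0.1 per hour, so a vehicle arriving at 0.4 needs roughly five hours of full-power charging to reach an expected SOC of 0.9.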
As an alternative embodiment, the reinforcement learning process is described by a Markov decision process (MDP), generally represented by a five-tuple $(S, A, P, R, \gamma)$, where $S$ is the state set, $A$ is the action set, $P$ is the transition probability, $R$ is the reward function and $\gamma$ is the discount factor.
The state space $S$ should contain all the information about the environment without being redundant; adding too many factors to the state space makes the model too complex to train. For the problem studied here, the arrival time $t_{n,arr}$ of the electric vehicle, its leaving time $t_{n,lea}$, its state of charge $SOC_{n,t}$ and the current period $t$ are therefore included in the state space. The state $s_t$ in period $t$ can thus be expressed as $(t_{n,arr}, t_{n,lea}, SOC_{n,t}, t)$.
The action space $A$ is the decision variable of the model, i.e., the next action that the agent obtains from the state at the current time. Here the action is the charging/discharging power of the electric vehicle, so the action $a_t$ at time $t$ can be expressed as $(P_{n,t})$.
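One way to hold the state and action described above is sketched below. The class name, the normalization of times by a 24-period horizon and the clipping helper are illustrative assumptions; the source only specifies which quantities enter the state and that the action is the charging power.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EVState:
    """State s_t = (t_arr, t_lea, SOC, t) for one electric vehicle."""
    t_arr: int   # arrival period
    t_lea: int   # leaving period
    soc: float   # state of charge in [0, 1]
    t: int       # current period

    def as_vector(self, horizon: int = 24) -> List[float]:
        # Scaling times by the horizon is an assumption that keeps all
        # network inputs in a similar range; the source does not specify it.
        return [self.t_arr / horizon, self.t_lea / horizon, self.soc, self.t / horizon]


def clip_action(raw_power_kw: float, p_max: float) -> float:
    """Project a raw actor output onto the charging pile limit 0 <= P <= P_max."""
    return min(max(raw_power_kw, 0.0), p_max)


state = EVState(t_arr=8, t_lea=18, soc=0.5, t=12)
vector = state.as_vector()
action = clip_action(9.0, p_max=7.0)
```

Clipping the action is a simple way to enforce the charging pile operation constraint regardless of what the policy network outputs.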
As an alternative embodiment, as shown in Fig. 3, step S2 of solving the ordered charging optimization model of the community electric vehicle cluster with the deep deterministic policy gradient algorithm includes:
Step S21: approximating the policy function and the action-value function with deep neural networks, i.e., determining the parameters $\theta^{Q}$ of the value network and the parameters $\theta^{\mu}$ of the policy network; the neural networks comprise a value network and a policy network, the value network comprising a target value network and an updated value network, and likewise for the policy network;
Step S22: adding target networks with the same structure as the policy network (Actor network) and the value network (Critic network) to improve the performance of the deep deterministic policy gradient algorithm (i.e., to optimize $\theta^{Q}$ and $\theta^{\mu}$), the optimized target network parameters being $\theta^{Q'}$ and $\theta^{\mu'}$, respectively;
Step S23: training the target networks and the value network with the optimized parameters, so that the ordered charging optimization model of the community electric vehicle cluster outputs an optimal policy. The output of the ordered charging optimization model of the community electric vehicle cluster is deterministic and represents the optimal action in the current state.
As an alternative embodiment, in step S22, the parameters $\theta^{Q}$ of the value network (Critic network) are updated by minimizing the loss function $L_Q$:
$L_Q = E\left[\left(y_t - Q(s_t, a_t \mid \theta^{Q})\right)^2\right]$
where $Q(s_t, a_t \mid \theta^{Q})$ is the output of the value network, i.e., the expected return when action $a_t$ is performed in state $s_t$ in period $t$, and $y_t$ is the target Q value:
$y_t = r_t + \gamma Q'\left(s_{t+1}, \mu'(s_{t+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\right)$
where $r_t$ is the reward value in period $t$, $\gamma$ is the discount factor, and $Q'$ and $\mu'$ are the target value network and the target policy network, respectively.
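The target value and the critic loss can be illustrated with plain Python, replacing the expectation by a mean over a small batch. The toy functions standing in for the networks, the scalar states and the batch contents are hypothetical; a real implementation would use neural networks and automatic differentiation.

```python
from typing import Callable, List, Tuple

Transition = Tuple[float, float, float, float]  # (s_t, a_t, r_t, s_{t+1})


def td_targets(batch: List[Transition],
               target_actor: Callable[[float], float],
               target_critic: Callable[[float, float], float],
               gamma: float) -> List[float]:
    """y_t = r_t + gamma * Q'(s_{t+1}, mu'(s_{t+1}))."""
    return [r + gamma * target_critic(s_next, target_actor(s_next))
            for (_, _, r, s_next) in batch]


def critic_loss(batch: List[Transition],
                targets: List[float],
                critic: Callable[[float, float], float]) -> float:
    """Batch mean of (y_t - Q(s_t, a_t))^2, standing in for the expectation E."""
    errors = [(y - critic(s, a)) ** 2 for (s, a, _, _), y in zip(batch, targets)]
    return sum(errors) / len(errors)


# Toy stand-ins for the networks (hypothetical, scalar state and action).
def mu_target(s: float) -> float:
    return 2.0 * s


def q_any(s: float, a: float) -> float:
    return -(s + a)


batch = [(0.0, 1.0, -1.0, 1.0), (1.0, 1.0, 0.0, 2.0)]
y = td_targets(batch, mu_target, q_any, gamma=0.5)
loss = critic_loss(batch, y, q_any)
```

Because the targets are computed from the slowly changing target networks rather than from the networks being trained, the regression target for the critic moves more gradually during training.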
The DDPG algorithm belongs to model-free reinforcement learning, and the learning process can be completed without an explicit expression of the state transition function. That is, $r_t$ is expressed as the negative of the cumulative reward $J$ obtained from reinforcement training:
$r_t = -J = -\omega_1 J_1 - \omega_2 J_2 - \omega_3 J_3 - \cdots$
where $J_1$, $J_2$, $J_3$ are the rewards obtained in each training, and $\omega_1$, $\omega_2$, $\omega_3$ are the corresponding weights.
With the above equation, the minimization objective is converted into the form of obtaining the maximum reward by optimizing the decision function.
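The weighted reward can be sketched in a few lines. The source does not spell out what each term $J_i$ represents, so the values and weights below are purely hypothetical placeholders.

```python
from typing import Sequence


def reward(terms: Sequence[float], weights: Sequence[float]) -> float:
    """r_t = -J, with J the weighted sum of the terms J_1, J_2, J_3, ..."""
    if len(terms) != len(weights):
        raise ValueError("terms and weights must have the same length")
    return -sum(w * j for w, j in zip(weights, terms))


# Hypothetical values for J_1..J_3 and their weights.
r_t = reward([10.0, 2.0, 0.5], [1.0, 0.5, 2.0])
```

Negating the weighted sum means that a lower value of $J$ yields a higher reward, which is how a minimization problem is handed to an agent that maximizes return.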
As an alternative embodiment, in step S22, the parameter $\theta^{\mu}$ of the policy network (Actor network) is updated by minimizing the loss function $L_{\mu}$:
$$L_{\mu} = -E\left[Q(s_{t}, \mu(s_{t}))\right]$$
where $Q(s_{t}, \mu(s_{t}))$ is the value of the action-state value function (i.e., the Q value) corresponding to the output of the policy network in state $s_{t}$ during period $t$.
The target network parameters $\theta^{Q'}$ and $\theta^{\mu'}$ are updated as follows:
$$\theta^{\mu'} \leftarrow \tau \theta^{\mu} + (1 - \tau)\, \theta^{\mu'}$$
$$\theta^{Q'} \leftarrow \tau \theta^{Q} + (1 - \tau)\, \theta^{Q'}$$
where $\tau$ is the soft update rate factor; the larger $\tau$ is, the faster the value network parameter $\theta^{Q}$ and the policy network parameter $\theta^{\mu}$ are transferred to the corresponding target network parameters $\theta^{Q'}$ and $\theta^{\mu'}$.
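The actor loss and the soft update rule can be sketched in a few lines of pure Python. Parameters are represented as flat lists of floats and the networks as plain callables; the names `soft_update` and `actor_loss`, the toy networks, and the numbers are illustrative assumptions only.

```python
from typing import Callable, List, Sequence

def soft_update(target_params: Sequence[float],
                online_params: Sequence[float],
                tau: float) -> List[float]:
    """theta' <- tau * theta + (1 - tau) * theta', applied element-wise."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return [tau * p + (1.0 - tau) * tp
            for tp, p in zip(target_params, online_params)]

def actor_loss(states: Sequence[float],
               actor: Callable[[float], float],
               critic: Callable[[float, float], float]) -> float:
    """Sample mean of -Q(s_t, mu(s_t)), approximating L_mu."""
    return -sum(critic(s, actor(s)) for s in states) / len(states)

# Hypothetical online and target parameters of the policy network.
theta_mu = [1.0, 2.0]
theta_mu_target = [0.0, 0.0]
for _ in range(3):
    theta_mu_target = soft_update(theta_mu_target, theta_mu, tau=0.5)
print(theta_mu_target)  # moves toward theta_mu: [0.875, 1.75]

# Hypothetical stand-ins for the policy and value networks.
toy_actor = lambda s: 2.0 * s
toy_critic = lambda s, a: s + a
l_mu = actor_loss([1.0, 3.0], toy_actor, toy_critic)
print(l_mu)  # -(3.0 + 9.0) / 2 = -6.0
```

With a larger `tau` the target parameters track the online parameters more quickly, which matches the description of $\tau$ above; `tau=1.0` copies the online parameters outright.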
As an alternative embodiment, the method of embodiment 1 of the present application is applied as follows. Based on the electric vehicle arrival and departure data shown in FIG. 4, that is, the numbers of electric vehicles arriving and departing at different times of the sampled day, and in combination with the time-of-use electricity price information, the scheduling period of the power distribution network is set to 24 hours with an interval of 1 hour between adjacent time periods. The agent is trained with the DDPG algorithm, the training converges, and a corresponding operation strategy is obtained; the training result is shown in FIG. 5. FIG. 5 shows that the curve converges, which indicates that the problem is solved and that the method used is reasonable.
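To make the 24-hour, 1-hour-step scheduling setup concrete, the sketch below evaluates the electricity cost of a charging schedule under a time-of-use tariff. It assumes the cost is the sum over vehicles and periods of charging power times price times the period length; the tariff values, the 7 kW charging power, and the names `tou_price`, `schedule`, and `charging_cost` are hypothetical and do not come from the application.

```python
from typing import List, Sequence

HOURS = 24       # scheduling period of 24 hours
DT_HOURS = 1.0   # interval between adjacent time periods

def tou_price(hour: int) -> float:
    """Hypothetical time-of-use tariff: peak price from 08:00 to 22:00, off-peak otherwise."""
    return 1.0 if 8 <= hour < 22 else 0.4

def schedule(start: int, end: int, power_kw: float) -> List[float]:
    """Charging power per period for one vehicle charging during hours [start, end)."""
    return [power_kw if start <= h < end else 0.0 for h in range(HOURS)]

def charging_cost(power_kw: Sequence[Sequence[float]],
                  price: Sequence[float]) -> float:
    """Sum over vehicles n and periods t of P[n][t] * price[t] * DT_HOURS (assumed cost form)."""
    for row in power_kw:
        if len(row) != len(price):
            raise ValueError("each schedule must cover every period")
    return sum(p * price[t] * DT_HOURS
               for row in power_kw for t, p in enumerate(row))

price = [tou_price(h) for h in range(HOURS)]

# One vehicle parked from 18:00 to 24:00 that needs two hours of charging at 7 kW.
cost_peak = charging_cost([schedule(18, 20, 7.0)], price)     # charge on arrival
cost_offpeak = charging_cost([schedule(22, 24, 7.0)], price)  # shift to off-peak hours
print(cost_peak, cost_offpeak)
```

Shifting the same energy into the off-peak window lowers the cost in this toy case, which is the kind of behavior an ordered charging strategy is expected to learn.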
Example 2
According to another aspect of the embodiment of the present application, there is also provided an electric vehicle ordered charging device based on a depth deterministic strategy gradient algorithm, the electric vehicle ordered charging device applying the electric vehicle ordered charging method based on the depth deterministic strategy gradient algorithm, the device including:
a community electric automobile cluster ordered charging optimization model building module, used for addressing the ordered charging problem of large-scale electric automobiles in a community charging station by comprehensively considering, from the perspective of a load aggregator, the charging demands of community electric automobile users, adjusting the charging behaviors of the electric automobiles in the charging station, integrating the SOC information fed back by the charging monitoring system and the predicted vehicle pick-up time information of the users, and building an ordered charging optimization model of the community electric automobile cluster;
and the model solving module is used for solving the ordered charging optimization model of the community electric automobile cluster by adopting a depth deterministic strategy gradient algorithm (DDPG) to obtain an optimal charging plan, and the optimal charging plan achieves the aims of optimizing operation in a charging station and effectively reducing daily operation cost in the station.
The present application is not limited to the above embodiments, but is to be accorded the widest scope consistent with the principles and other features of the present application.
Example 3
According to another aspect of the embodiment of the present application, there is further provided a computer readable storage medium, where the computer readable storage medium includes a stored program, and when the program runs, the device where the computer readable storage medium is located is controlled to execute the method for orderly charging an electric automobile based on the depth deterministic strategy gradient algorithm according to any one of the above.
Alternatively, in this embodiment, the above-mentioned computer readable storage medium may be located in any one of the computer terminals in the computer terminal group in the computer network or in any one of the mobile terminals in the mobile terminal group, and the above-mentioned computer readable storage medium includes a stored program.
Optionally, the computer readable storage medium is controlled to perform the following functions when the program is run: from the aspect of load aggregation, comprehensively considering the charging demands of community electric automobile users, adjusting the charging behaviors of electric automobiles in a charging station, integrating SOC information fed back by a charging monitoring system and predicted vehicle taking time information of the users, and establishing an ordered charging optimization model of a community electric automobile cluster; and solving the ordered charging optimization model of the community electric automobile cluster by adopting a depth deterministic strategy gradient algorithm to obtain an optimal charging plan, wherein the optimal charging plan achieves the aims of optimizing operation in a charging station and effectively reducing daily operation cost in the station.
Example 4
According to another aspect of the embodiment of the present application, there is further provided a processor for running a program, wherein the program executes the method for orderly charging an electric vehicle based on the depth deterministic strategy gradient algorithm according to any one of the above.
The embodiment of the application provides equipment, which comprises a processor, a memory and a program stored on the memory and capable of running on the processor, wherein the processor realizes the step of the electric vehicle ordered charging method based on a depth deterministic strategy gradient algorithm when executing the program.
The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present application, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology may be implemented in other manners. The apparatus embodiments described above are merely exemplary. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections between units or modules through some interfaces, and may be electrical or take other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium that can store program code.
The foregoing is merely a preferred embodiment of the present application and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present application, which are intended to be comprehended within the scope of the present application.

Claims (10)

1. An ordered charging method of an electric vehicle based on a depth deterministic strategy gradient algorithm is characterized by comprising the following steps:
from the aspect of load aggregation, comprehensively considering the charging demands of community electric automobile users, adjusting the charging behaviors of electric automobiles in a charging station, integrating SOC information fed back by a charging monitoring system and predicted vehicle taking time information of the users, and establishing an ordered charging optimization model of a community electric automobile cluster;
and solving the ordered charging optimization model of the community electric automobile cluster by adopting a depth deterministic strategy gradient algorithm to obtain an optimal charging plan, wherein the optimal charging plan achieves the aims of optimizing operation in a charging station and effectively reducing daily operation cost in the station.
2. The ordered charging method of the electric vehicles based on the depth deterministic strategy gradient algorithm according to claim 1, wherein the optimization objective of the ordered charging optimization model of the community electric vehicle cluster is:
where $P_{n,t}$ is the charging power of the $n$-th electric automobile in period $t$; $\rho_{t}$ is the time-of-use electricity price for period $t$; $N_{t}$ is the total number of electric automobiles connected to the power grid in the charging station during period $t$; $t_{n,arr}$ and $t_{n,lea}$ are the times at which the $n$-th electric automobile arrives at and leaves the charging station, respectively; and $F$ is the total electricity cost of the electric automobile cluster over all time periods.
3. The ordered charging method of electric vehicles based on depth deterministic strategy gradient algorithm according to claim 1, wherein the constraints of the ordered charging optimization model of the community electric vehicle cluster comprise: a state-of-charge constraint of the electric automobile, a user charging expectation constraint, an operation constraint of the electric automobile charging pile, and a charging time constraint of the electric automobile.
4. The method for orderly charging electric vehicles based on the depth deterministic strategy gradient algorithm according to claim 1, wherein solving the orderly charging optimization model of the community electric vehicle cluster by using the depth deterministic strategy gradient algorithm comprises:
approximating the policy function and the action value function, respectively, by using deep neural networks, i.e., determining the parameter $\theta^{Q}$ of the value network and the parameter $\theta^{\mu}$ of the policy network;
adding a target network with the same structure as each of the policy network and the value network to improve the performance of the depth deterministic strategy gradient algorithm, the optimized target network parameters being $\theta^{Q'}$ and $\theta^{\mu'}$, respectively; and
training the target network and the value network with the optimized parameters, so that the ordered charging optimization model of the community electric automobile cluster outputs an optimal strategy.
5. The ordered charging method for electric vehicles based on depth deterministic strategy gradient algorithm according to claim 4, wherein the value network parameter $\theta^{Q}$ is updated by minimizing the loss function $L_{Q}$:
$$L_{Q} = E\left[\left(y_{t} - Q(s_{t}, a_{t} \mid \theta^{Q})\right)^{2}\right]$$
where $Q(s_{t}, a_{t} \mid \theta^{Q})$ is the output of the value network, i.e., the expected return when action $a_{t}$ is performed in state $s_{t}$ during period $t$, and $y_{t}$ is the target Q value:
$$y_{t} = r_{t} + \gamma\, Q'\left(s_{t+1}, \mu'(s_{t+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\right)$$
where $r_{t}$ is the reward value for period $t$, and $Q'$ and $\mu'$ are the target value network and the target policy network, respectively.
6. The ordered charging method for electric vehicles based on depth deterministic strategy gradient algorithm according to claim 5, wherein $r_{t}$ is expressed as the negative of the cumulative reward $J$ obtained from reinforcement training:
$$r_{t} = -J = -\omega_{1} J_{1} - \omega_{2} J_{2} - \omega_{3} J_{3} - \cdots$$
where $J_{1}$, $J_{2}$, $J_{3}$ are the rewards obtained for each training, and $\omega_{1}$, $\omega_{2}$, $\omega_{3}$ are the corresponding weight values.
7. The ordered charging method for electric vehicles based on depth deterministic strategy gradient algorithm according to claim 4, wherein the parameter $\theta^{\mu}$ of the policy network is updated by minimizing the loss function $L_{\mu}$:
$$L_{\mu} = -E\left[Q(s_{t}, \mu(s_{t}))\right]$$
where $Q(s_{t}, \mu(s_{t}))$ is the value of the action-state value function (i.e., the Q value) corresponding to the output of the policy network in state $s_{t}$ during period $t$;
the target network parameters $\theta^{Q'}$ and $\theta^{\mu'}$ are updated as follows:
$$\theta^{\mu'} \leftarrow \tau \theta^{\mu} + (1 - \tau)\, \theta^{\mu'}$$
$$\theta^{Q'} \leftarrow \tau \theta^{Q} + (1 - \tau)\, \theta^{Q'}$$
where $\tau$ is the soft update rate factor; the larger $\tau$ is, the faster the value network parameter $\theta^{Q}$ and the policy network parameter $\theta^{\mu}$ are transferred to the corresponding target network parameters $\theta^{Q'}$ and $\theta^{\mu'}$.
8. An electric automobile ordered charging device based on depth deterministic strategy gradient algorithm is characterized by comprising:
a community electric automobile cluster ordered charging optimization model building module, used for comprehensively considering the charging demands of community electric automobile users from the perspective of a load aggregator, adjusting the charging behaviors of the electric automobiles in a charging station, integrating the SOC information fed back by the charging monitoring system and the predicted vehicle pick-up time information of the users, and building an ordered charging optimization model of the community electric automobile cluster;
and the model solving module is used for solving the ordered charging optimization model of the community electric automobile cluster by adopting a depth deterministic strategy gradient algorithm to obtain an optimal charging plan, and the optimal charging plan achieves the aims of optimizing operation in a charging station and effectively reducing daily operation cost in the station.
9. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored program, wherein the program when run controls a device in which the computer readable storage medium is located to execute the method for orderly charging an electric vehicle based on the depth deterministic strategy gradient algorithm according to any one of claims 1 to 7.
10. A processor, wherein the processor is configured to run a program, wherein the program when run performs the depth deterministic strategy gradient algorithm-based electric vehicle ordered charging method according to any one of claims 1 to 7.
CN202310522988.7A 2023-05-10 2023-05-10 Ordered charging method for electric automobile based on depth deterministic strategy gradient algorithm Pending CN116703062A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310522988.7A CN116703062A (en) 2023-05-10 2023-05-10 Ordered charging method for electric automobile based on depth deterministic strategy gradient algorithm

Publications (1)

Publication Number Publication Date
CN116703062A true CN116703062A (en) 2023-09-05

Family

ID=87824764

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310522988.7A Pending CN116703062A (en) 2023-05-10 2023-05-10 Ordered charging method for electric automobile based on depth deterministic strategy gradient algorithm

Country Status (1)

Country Link
CN (1) CN116703062A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination