NL2026738B1 - Cooperative-optimization control method of charging station based on double-center q-learning method - Google Patents
Cooperative-optimization control method of charging station based on double-center Q-learning method
- Publication number
- NL2026738B1
- Authority
- NL
- Netherlands
- Prior art keywords
- charging
- peak regulation
- control
- state
- electric vehicle
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06315—Needs-based resource requirements planning or analysis
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60L—PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
- B60L53/00—Methods of charging batteries, specially adapted for electric vehicles; Charging stations or on-board charging equipment therefor; Exchange of energy storage elements in electric vehicles
- B60L53/30—Constructional details of charging stations
- B60L53/31—Charging columns specially adapted for electric vehicles
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60L—PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
- B60L53/00—Methods of charging batteries, specially adapted for electric vehicles; Charging stations or on-board charging equipment therefor; Exchange of energy storage elements in electric vehicles
- B60L53/60—Monitoring or controlling charging stations
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60L—PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
- B60L53/00—Methods of charging batteries, specially adapted for electric vehicles; Charging stations or on-board charging equipment therefor; Exchange of energy storage elements in electric vehicles
- B60L53/60—Monitoring or controlling charging stations
- B60L53/63—Monitoring or controlling charging stations in response to network capacity
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60L—PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
- B60L53/00—Methods of charging batteries, specially adapted for electric vehicles; Charging stations or on-board charging equipment therefor; Exchange of energy storage elements in electric vehicles
- B60L53/60—Monitoring or controlling charging stations
- B60L53/64—Optimising energy costs, e.g. responding to electricity rates
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B60—VEHICLES IN GENERAL
- B60L—PROPULSION OF ELECTRICALLY-PROPELLED VEHICLES; SUPPLYING ELECTRIC POWER FOR AUXILIARY EQUIPMENT OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRODYNAMIC BRAKE SYSTEMS FOR VEHICLES IN GENERAL; MAGNETIC SUSPENSION OR LEVITATION FOR VEHICLES; MONITORING OPERATING VARIABLES OF ELECTRICALLY-PROPELLED VEHICLES; ELECTRIC SAFETY DEVICES FOR ELECTRICALLY-PROPELLED VEHICLES
- B60L53/00—Methods of charging batteries, specially adapted for electric vehicles; Charging stations or on-board charging equipment therefor; Exchange of energy storage elements in electric vehicles
- B60L53/60—Monitoring or controlling charging stations
- B60L53/65—Monitoring or controlling charging stations involving identification of vehicles or their battery types
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06312—Adjustment or analysis of established resource schedule, e.g. resource or task levelling, or dynamic rescheduling
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E60/00—Enabling technologies; Technologies with a potential or indirect contribution to GHG emissions mitigation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/60—Other road transportation technologies with climate change mitigation effect
- Y02T10/70—Energy storage systems for electromobility, e.g. batteries
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/60—Other road transportation technologies with climate change mitigation effect
- Y02T10/7072—Electromobility specific charging systems or methods for batteries, ultracapacitors, supercapacitors or double-layer capacitors
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T90/00—Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation
- Y02T90/10—Technologies relating to charging of electric vehicles
- Y02T90/12—Electric charging stations
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T90/00—Enabling technologies or technologies with a potential or indirect contribution to GHG emissions mitigation
- Y02T90/10—Technologies relating to charging of electric vehicles
- Y02T90/16—Information or communication technologies improving the operation of electric vehicles
- Y02T90/167—Systems integrating technologies related to power network operation and communication or information technologies for supporting the interoperability of electric or hybrid vehicles, i.e. smartgrids as interface for battery charging of electric vehicles [EV] or hybrid vehicles [HEV]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/12—Monitoring or controlling equipment for energy generation units, e.g. distributed energy generation [DER] or load-side generation
- Y04S10/126—Monitoring or controlling equipment for energy generation units, e.g. distributed energy generation [DER] or load-side generation the energy generation units being or involving electric vehicles [EV] or hybrid vehicles [HEV], i.e. power aggregation of EV or HEV, vehicle to grid arrangements [V2G]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S30/00—Systems supporting specific end-user applications in the sector of transportation
- Y04S30/10—Systems supporting the interoperability of electric or hybrid vehicles
- Y04S30/14—Details associated with the interoperability, e.g. vehicle recognition, authentication, identification or billing
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Entrepreneurship & Innovation (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Power Engineering (AREA)
- Transportation (AREA)
- Mechanical Engineering (AREA)
- Game Theory and Decision Science (AREA)
- Educational Administration (AREA)
- Development Economics (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Electric Propulsion And Braking For Vehicles (AREA)
- Charge And Discharge Circuits For Batteries Or The Like (AREA)
Abstract
The present invention discloses a cooperative-optimization control method of a charging station based on a double-center Q-learning method, including: 1, describing a control process of charging service requests of two charging forms of electric vehicles which arrive randomly as an event-driven decision-making process; 2, describing a process of controlling the electric vehicles which are charged in the charging station to respond to a power-grid peak regulation electricity price plan as a sequential decision-making process; 3, taking a peak regulation electricity price and the online service state of each charging point as a system state, taking the arrival of an electric vehicle making a service request as an event, and selecting whether the electric vehicle is admitted and provided with charging service as an admission control action; 4, at the epoch when the peak regulation electricity price is issued, selecting charging and discharging actions of all the served AC charging electric vehicles as a peak regulation control action; and 5, performing online cooperative optimization on an electric-vehicle admission control center and a control center for peak regulation response of the system with a Q-learning algorithm. With the present invention, effective intelligent admission control of electric vehicles and peak regulation response control may be performed on the charging station, thereby adapting to peak regulation demands of a power grid.
Description
COOPERATIVE-OPTIMIZATION CONTROL METHOD OF CHARGING STATION BASED ON DOUBLE-CENTER Q-LEARNING METHOD
[0001] The present invention pertains to the technical field of intelligent control and optimization, and particularly relates to a cooperative-optimization control method of a charging station based on a double-center Q-learning method.
[0002] At present, China is the largest vehicle consumption market in the world. Vehicle manufacturers are shifting their research, development and production emphasis from vehicles powered by traditional energy to new energy vehicles. Electric vehicles will remain the mainstream of new energy vehicle development for a long period of time, with huge consumption potential and an increasing market share. Charging points are important infrastructure for providing charging service for the electric vehicles, and are also an important link in the industrialization and commercialization of the electric vehicles. With the rapid development of the electric vehicle industry and the great increase in the number of electric vehicles in use, a charging station in which a plurality of charging points are centrally managed and operated will be an important business mode and service form in the future. In addition, with the increasing penetration of new energy sources, such as wind power and photovoltaic energy, and the maturing of the interaction technology between the electric vehicles and a power grid (V2G technology), electricity production and service will become more intelligent and adaptable, and effective management and guidance of the electricity consumption of power consumers, such as charging stations, will be a trend. For example, a dispatching center at each level may make an electricity peak regulation plan according to source-load prediction data and issue a real-time electricity price, thereby guiding a power consumer, for example, a charging station for electric vehicles, to consume electricity reasonably and perform V2G electricity feedback, and promoting autonomous peak shaving or peak shifting at the consumer side.
[0003] The existing power-grid electricity price adopts a time-of-use mechanism which is quite simple and fixed: a power-grid peak regulation electricity price plan is not dynamically made or regulated according to actual source-load prediction conditions of the power grid, and a service system of the charging station does not adaptively perform admission control on charging requests of the electric vehicles, or peak regulation response control on charging and discharging actions of the electric vehicles, according to the actual power-grid peak regulation electricity price plan. Therefore, under a real-time power-grid peak regulation electricity price mechanism, a problem remains to be researched and solved: how the intelligent service system of the charging station should adaptively respond to the charging service requests of randomly arriving electric vehicles and to the real-time issuing of the power-grid peak regulation electricity price, that is, how it should control admission of the electric vehicles into the service and energy interaction between the electric vehicles and the power grid, according to the real-time power-grid peak regulation electricity price and the online service states of all the charging points in the station, so as to improve the running economy of the charging station and adapt to the power-grid peak regulation demand.
[0004] In order to remedy the defects in the prior art, the present invention provides a cooperative-optimization control method of a charging station based on a double-center Q-learning method, so as to cooperatively optimize online the admission control performed by an admission control center on service requests of electric vehicles and the peak regulation control performed by a control center for peak regulation response in the charging station, thereby improving the running economy of the charging station and adapting to the power-grid peak regulation demand.
[0005] In order to solve the technical problems, the following technical solution is adopted in the present invention:
[0006] the cooperative-optimization control method of the charging station based on the double-center Q-learning method according to the present invention is characterized by being applied to a service system of the charging station, which is provided with $J_D$ DC charging points, $J_A$ AC charging points and $J_{AD}$ AC and DC-hybrid charging points and provides paid charging service for $M_D$ DC fast-charging electric vehicles which arrive randomly and $M_A$ AC slow-charging electric vehicles which arrive randomly;
[0007] each DC charging point is enabled to meet charging power demands of the $M_D$ DC fast-charging electric vehicles, each AC charging point is enabled to meet charging power demands of the $M_A$ AC slow-charging electric vehicles, each AC and DC-hybrid charging point is enabled to meet the charging power demands of both the $M_D$ DC fast-charging electric vehicles and the $M_A$ AC slow-charging electric vehicles, and one charging point is enabled to provide the charging service for only one electric vehicle at a time;
[0008] the $J_D$ DC charging points are denoted as $CS_1^D, CS_2^D, \ldots, CS_j^D, \ldots, CS_{J_D}^D$ respectively, the $J_A$ AC charging points are denoted as $CS_1^A, CS_2^A, \ldots, CS_j^A, \ldots, CS_{J_A}^A$ respectively, and the $J_{AD}$ AC and DC-hybrid charging points are denoted as $CS_1^{AD}, CS_2^{AD}, \ldots, CS_j^{AD}, \ldots, CS_{J_{AD}}^{AD}$ respectively; $CS_j^D$ represents the $j$th DC charging point, $j \in D_{J_D}$, and $D_{J_D} = \{1, 2, \ldots, J_D\}$ represents a set of codes of the DC charging points; $CS_j^A$ represents the $j$th AC charging point, $j \in D_{J_A}$, and $D_{J_A} = \{1, 2, \ldots, J_A\}$ represents a set of codes of the AC charging points; $CS_j^{AD}$ represents the $j$th AC and DC-hybrid charging point, $j \in D_{J_{AD}}$, and $D_{J_{AD}} = \{1, 2, \ldots, J_{AD}\}$ represents a set of codes of the AC and DC-hybrid charging points;
[0009] the charging power demands of the $M_D$ DC fast-charging electric vehicles are denoted as $P_1^D, P_2^D, \ldots, P_m^D, \ldots, P_{M_D}^D$, and the charging power demands of the $M_A$ AC slow-charging electric vehicles are denoted as $P_1^A, P_2^A, \ldots, P_m^A, \ldots, P_{M_A}^A$; $P_m^D$ represents the charging power demand of the $m$th DC charging electric vehicle, $m \in D_{M_D}$, and $D_{M_D} = \{1, 2, \ldots, M_D\}$ represents a set of codes of types of the DC charging electric vehicles; $P_m^A$ represents the charging power demand of the $m$th AC charging electric vehicle, $m \in D_{M_A}$, and $D_{M_A} = \{1, 2, \ldots, M_A\}$ represents a set of codes of types of the AC charging electric vehicles;
[0010] it is assumed that the power-grid peak regulation electricity price is periodically issued according to a dispatching instruction, $K$ is the number of issuing periods of the dispatching instruction in one day, and the corresponding total time length is $T$; the power-grid peak regulation electricity price at any epoch $t$ within the total time length $T$ is denoted as $PR_t$, with $PR_t \in D_{PR}$, where $D_{PR}$ is a finite electricity-price state space; if $T_k$ is the epoch when the $k$th peak regulation electricity price $PR_k$ is issued, the peak regulation electricity price sequence is denoted as $\{PR_k \mid k = 0, 1, 2, \ldots, K\}$, with $T_0 = 0$, $PR_k \in D_{PR}$ and $T_K = T$;
[0011] it is assumed that the $m_t$th electric vehicle randomly arrives at the charging station at the epoch $t$ to apply for charging service, with $m_t \in D_{M_D} \cup D_{M_A}$; if the current state of charge (SOC) of the battery of the $m_t$th electric vehicle is $SOC_{m_t}(t)$, the arrival event of the $m_t$th electric vehicle is denoted as $E(m_t, SOC_{m_t}(t))$;
[0012] the combined state of the three types of charging points at the epoch $t$ is denoted as $C_t = (CS_t^D, CS_t^A, CS_t^{AD})$; $CS_t^D = (CS_1^D(t), CS_2^D(t), \ldots, CS_j^D(t), \ldots, CS_{J_D}^D(t))$, and $CS_j^D(t) = (m_j^D(t), SOC_{m_j^D}(t))$ represents the service state of the $j$th DC charging point; $m_j^D(t)$ represents the type of the electric vehicle which is served by the $j$th DC charging point $CS_j^D$ at the epoch $t$, $m_j^D(t) = 0$ indicates that no vehicle is admitted at the $j$th DC charging point $CS_j^D$ at the epoch $t$, and $m_j^D(t) \in D_{M_D}$ indicates that the $j$th DC charging point $CS_j^D$ is charging one electric vehicle in $D_{M_D}$; $SOC_{m_j^D}(t)$ represents the current SOC of the battery of the $m_j^D(t)$th electric vehicle which is served by the $j$th DC charging point $CS_j^D$ at the epoch $t$;
[0013] $CS_t^A = (CS_1^A(t), CS_2^A(t), \ldots, CS_j^A(t), \ldots, CS_{J_A}^A(t))$, and $CS_j^A(t) = (m_j^A(t), SOC_{m_j^A}(t))$ represents the service state of the $j$th AC charging point; $m_j^A(t)$ represents the type of the electric vehicle which is served by the $j$th AC charging point $CS_j^A$ at the epoch $t$, $m_j^A(t) = 0$ indicates that no vehicle is admitted at the $j$th AC charging point $CS_j^A$ at the epoch $t$, and $m_j^A(t) \in D_{M_A}$ indicates that the $j$th AC charging point $CS_j^A$ is charging one electric vehicle in $D_{M_A}$; $SOC_{m_j^A}(t)$ represents the current SOC of the battery of the $m_j^A(t)$th AC charging electric vehicle which is served by the $j$th AC charging point $CS_j^A$ at the epoch $t$;
[0014] $CS_t^{AD} = (CS_1^{AD}(t), CS_2^{AD}(t), \ldots, CS_j^{AD}(t), \ldots, CS_{J_{AD}}^{AD}(t))$, and $CS_j^{AD}(t) = (m_j^{AD}(t), SOC_{m_j^{AD}}(t))$ represents the service state of the $j$th AC and DC-hybrid charging point; $m_j^{AD}(t)$ represents the type of the electric vehicle which is served by the $j$th AC and DC-hybrid charging point $CS_j^{AD}$ at the epoch $t$, $m_j^{AD}(t) = 0$ indicates that no vehicle is admitted at the $j$th AC and DC-hybrid charging point $CS_j^{AD}$ at the epoch $t$, and $m_j^{AD}(t) \in D_{M_D} \cup D_{M_A}$ indicates that the $j$th AC and DC-hybrid charging point $CS_j^{AD}$ is charging one electric vehicle in $D_{M_D}$ or $D_{M_A}$; $SOC_{m_j^{AD}}(t)$ represents the current SOC of the battery of the $m_j^{AD}(t)$th DC or AC charging electric vehicle which is served by the $j$th AC and DC-hybrid charging point $CS_j^{AD}$ at the epoch $t$;
[0015] the state of the service system of the charging station at any epoch $t$ is denoted as $s_t = \{t, C_t, PR_t\}$;
[0016] the epoch $T_k$ of issuing the $k$th peak regulation electricity price $PR_k$ is taken as a decision-making epoch for peak regulation of the control center for peak regulation response, and the energy exchange directions of all the AC charging electric vehicles admitted by the service system of the charging station are denoted as actions $d_k$ at the decision-making epoch for peak regulation, $d_k = \{(d_k^A(1), d_k^A(2), \ldots, d_k^A(j), \ldots, d_k^A(J_A)), (d_k^{AD}(1), d_k^{AD}(2), \ldots, d_k^{AD}(j), \ldots, d_k^{AD}(J_{AD}))\}$, where $d_k^A(j) \in D_r = \{-1, 0, 1\}$ for $j \in D_{J_A}$, and $d_k^{AD}(j) \in D_r = \{-1, 0, 1\}$ for $j \in D_{J_{AD}}$; $-1$ represents discharging peak regulation, $0$ represents no charging and discharging action, and $1$ represents a charging action; $D_r$ represents a set of peak regulation control actions for each electric vehicle;
[0017] it is assumed that the DC charging electric vehicle admitted into the service in the service system of the charging station does not participate in peak regulation, and the AC charging electric vehicle admitted into the service may participate in peak regulation;
[0018] at the $k$th decision-making epoch $T_k$ for peak regulation, if $m_j^{AD}(T_k) \in D_{M_D}$ in the $j$th AC and DC-hybrid charging point, $d_k^{AD}(j) = 1$, and if $m_j^{AD}(T_k) = 0$, $d_k^{AD}(j) = 0$, with $j \in D_{J_{AD}}$; if $m_j^A(T_k) = 0$ in the $j$th AC charging point, $d_k^A(j) = 0$, with $j \in D_{J_A}$;
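The action encoding and the restrictions described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the occupancy encoding (`None` for a free point, `"AC"` or `"DC"` for the kind of vehicle being served) and the function names are hypothetical and do not appear in the specification.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

# Per-vehicle peak regulation actions: -1 discharge, 0 no action, 1 charge.
ACTIONS: Tuple[int, ...] = (-1, 0, 1)


def allowed(kind: Optional[str]) -> Tuple[int, ...]:
    """Actions permitted at one charging point (assumed occupancy encoding)."""
    if kind is None:      # free point: no charging and discharging action
        return (0,)
    if kind == "DC":      # DC vehicle at a hybrid point: always charging
        return (1,)
    return ACTIONS        # AC vehicle: may discharge, idle, or charge


def feasible_actions(ac_points: Sequence[Optional[str]],
                     hybrid_points: Sequence[Optional[str]]) -> List[Tuple[int, ...]]:
    """Enumerate joint actions over the AC points followed by the hybrid points."""
    per_point = [allowed(k) for k in ac_points] + [allowed(k) for k in hybrid_points]
    return list(product(*per_point))


if __name__ == "__main__":
    for action in feasible_actions([None, "AC"], ["DC", "AC"]):
        print(action)
```

Because the joint action set grows as a product over the served AC vehicles, a practical implementation would probably need to limit or aggregate actions; the sketch enumerates them exhaustively only for clarity.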
[0019] at the $k$th decision-making epoch $T_k$ for peak regulation, a set of feasible peak regulation control actions $d_k$ of the charging station is defined as $D_r^k$, with $D_r^k \subseteq D_R$, wherein $D_R$ is the Cartesian product of $J_A + J_{AD}$ sets $D_r$ of peak regulation control actions;
[0020] the occurrence epoch $t$ of the arrival event $E(m_t, SOC_{m_t}(t))$ of the $m_t$th electric vehicle is taken as a decision-making epoch for admission control of the admission control center for the electric vehicle, and the event information of the decision-making epoch for admission control and the state information of the service system of the charging station are combined and defined as an event-extended state $s_t^e = \{t, C_t, PR_t, m_t, SOC_{m_t}(t)\}$;
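As an illustration of how the system state and the event-extended state defined above might be held in software, the following Python sketch uses immutable data classes and derives a hashable key for a Q-value table. All class names, the `"DC"`/`"AC"` encoding of the two type sets, the number of SOC bins, and the choice to omit the epoch from the key are assumptions made for this sketch only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Vehicle:
    kind: str      # "DC" or "AC" (assumed encoding of the two type sets)
    type_id: int   # type code within its set, starting at 1
    soc: float     # state of charge in [0, 1]


Point = Optional[Vehicle]  # None plays the role of m_j(t) = 0 (free point)


@dataclass(frozen=True)
class StationState:
    """System state: epoch, states of the three point groups, and price."""
    t: float
    dc: Tuple[Point, ...]
    ac: Tuple[Point, ...]
    hybrid: Tuple[Point, ...]
    price: float


@dataclass(frozen=True)
class ExtendedState:
    """Event-extended state: system state plus the arriving vehicle."""
    station: StationState
    arrival: Vehicle


def soc_bin(soc: float, n_bins: int = 5) -> int:
    """Map a continuous SOC to one of n_bins equal-width bins."""
    return min(int(soc * n_bins), n_bins - 1)


def discretize(s: ExtendedState, n_bins: int = 5) -> tuple:
    """Hashable key for a Q-value table (the epoch is omitted by assumption)."""
    def key(p: Point):
        return None if p is None else (p.kind, p.type_id, soc_bin(p.soc, n_bins))
    st = s.station
    return (tuple(key(p) for p in st.dc),
            tuple(key(p) for p in st.ac),
            tuple(key(p) for p in st.hybrid),
            st.price,
            key(s.arrival))
```

Binning the SOC keeps the table finite, since the electricity price already takes values in a finite set while the SOC is continuous.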
[0021] at the decision-making epoch for admission control, whether the service system of the charging station admits the electric vehicle and provides the charging service is denoted as an admission control action $a$, and the action at the $n$th decision-making epoch $T_n$ for admission control is denoted as $a_n$, with $a_n \in D_a = \{0, 1\}$, wherein $0$ represents service refusal, $1$ represents service admission, and $D_a$ represents a set of actions of the admission control center;
[0022] at the $n$th decision-making epoch $T_n$ for admission control, if the arriving electric vehicle is of the type $m_t \in D_{M_D}$, and all the DC charging points and all the AC and DC-hybrid charging points are in service, i.e., $m_j^D(T_n) \in D_{M_D}$ for all $j \in D_{J_D}$ and $m_j^{AD}(T_n) \in D_{M_D} \cup D_{M_A}$ for all $j \in D_{J_{AD}}$, then $a_n = 0$; if the arriving electric vehicle is of the type $m_t \in D_{M_A}$, and all the AC charging points and all the AC and DC-hybrid charging points are in service, i.e., $m_j^A(T_n) \in D_{M_A}$ for all $j \in D_{J_A}$ and $m_j^{AD}(T_n) \in D_{M_D} \cup D_{M_A}$ for all $j \in D_{J_{AD}}$, then $a_n = 0$;
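The forced-refusal rule just stated is easy to express in code. The following Python sketch is a hedged illustration; the function names and the representation of point occupancy as sequences of booleans are assumptions, not part of the specification.

```python
from typing import Sequence, Tuple


def must_refuse(kind: str,
                dc_busy: Sequence[bool],
                ac_busy: Sequence[bool],
                hybrid_busy: Sequence[bool]) -> bool:
    """True when no compatible charging point is free for the arriving vehicle.

    kind is "DC" or "AC" (assumed encoding); each *_busy sequence holds one
    flag per charging point, True meaning the point is in service.
    """
    own = dc_busy if kind == "DC" else ac_busy
    return all(own) and all(hybrid_busy)


def admission_actions(kind: str,
                      dc_busy: Sequence[bool],
                      ac_busy: Sequence[bool],
                      hybrid_busy: Sequence[bool]) -> Tuple[int, ...]:
    """Admissible admission actions: 0 refuses service, 1 admits the vehicle."""
    if must_refuse(kind, dc_busy, ac_busy, hybrid_busy):
        return (0,)
    return (0, 1)
```

For example, with both DC points and the single hybrid point busy but one AC point free, `admission_actions("DC", [True, True], [False], [True])` yields `(0,)`, while the same station still offers `(0, 1)` to an arriving AC vehicle.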
[0023] the cooperative-optimization control method of the charging station based on the double-center Q-learning method is divided into electric-vehicle admission control and peak regulation response control;
[0024] in the electric-vehicle admission control, the control process for charging service requests of the DC charging electric vehicles and the AC charging electric vehicles which arrive randomly is described as an event-driven decision-making process: the arrival of an electric vehicle making a service request is taken as an event; the peak regulation electricity price and the online service states of the charging points are taken as the state of the service system of the charging station; when the event occurs, the event information and the state information of the service system are combined into the event-extended state; whether the electric vehicle is admitted and provided with the charging service is selected as the admission control action; sample data feedback is thus obtained, a Q-value table for admission control is updated with the Q-learning method, and finally, a strategy table for admission control is obtained;
[0025] in the peak regulation response control, the process of controlling the electric vehicles which are charged in the charging station to respond to a power-grid peak regulation electricity price plan is described as a sequential decision-making process: at the epoch when the peak regulation electricity price is issued, the charging and discharging actions of all the served AC charging electric vehicles are selected as the peak regulation control actions according to the state of the service system of the charging station; sample data feedback is thus obtained, a Q-value table for peak regulation control is updated with the Q-learning method, and finally, a strategy table for peak regulation control is obtained.
[0026] The cooperative-optimization control method of the charging station based on the double-center Q-learning method is also characterized in that the admission control of the electric vehicle includes the following steps:
[0027] step 1: defining and initializing an exploration rate of the admission control action at the $n$th decision-making epoch $T_n$ for admission control as $\varepsilon_n$, and letting $0 < \varepsilon_n < 1$;
[0028] defining elements in the Q-value table for admission control as discretization event-extended state-action pair learning values, and initializing the elements in the Q-value table for admission control;
[0029] defining a current greedy strategy table $v$ for admission control as a set formed by the actions corresponding to the maximum discretization event-extended state-action pair learning value of each row in the Q-value table for admission control;
[0030] step 2: initializing $t = 0$ and $n = 1$; assigning the current exploration rate $\varepsilon_n$ for the admission control action to an initial exploration rate $\hat{\varepsilon}$; assigning the current greedy strategy table $v$ for admission control to an original strategy table $v^0$;
[0031] step 3: at the $n$th decision-making epoch $T_n$ for admission control of the service system of the charging station, when the arrival event $E(m_t, SOC_{m_t}(t))$ occurs, observing the current state $s_t$ of the service system of the charging station to form the event-extended state $s_t^e$;
[0032] denoting the discretization state corresponding to the event-extended state $s_t^e$ of the $n$th decision-making epoch $T_n$ for admission control in the Q-value table as $s_n^e$;
[0033] denoting the action which is actually taken in the event-extended state $s_t^e$ at the $n$th decision-making epoch $T_n$ for admission control as $v(s_t^e)$, wherein $v(s_t^e) \in D_a$;
[0034] in the event-extended state $s_t^e$ at the $n$th decision-making epoch $T_n$ for admission control, extracting a greedy action in the discretization state $s_n^e$ corresponding to $s_t^e$ from the Q-value table and denoting it as $v(s_n^e)$;
[0035] in the event-extended state $s_t^e$ at the $n$th decision-making epoch $T_n$ for admission control, if the arriving electric vehicle is of the type $m_t \in D_{M_D}$ and all the DC charging points and all the AC and DC-hybrid charging points are in service, or the arriving electric vehicle is of the type $m_t \in D_{M_A}$ and all the AC charging points and all the AC and DC-hybrid charging points are in service, letting $v(s_t^e) = 0$; otherwise, assigning $v(s_n^e)$ to $v(s_t^e)$ with a probability $1 - \varepsilon_n$, or selecting an action other than $v(s_n^e)$ from the action set $D_a$ at the exploration rate $\varepsilon_n$ as an exploration action and assigning that action to $v(s_t^e)$;
[0036] after the admission control center of the charging station takes the action $v(s_t^e)$, observing and obtaining a system transition sample track $\langle s_t^e, v(s_t^e), s_{t'}^e \rangle$ transited from the $n$th decision-making epoch $T_n$ for admission control to the $(n+1)$th decision-making epoch $T_{n+1}$ for admission control or the epoch $T$, wherein $t = T_n$, and $t' = T_{n+1} < T$ or $t' = T$; when $t' = T$, letting $s_{t'}^e = \{T, C_T, PR_T, 0, 0\}$;
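The action selection in step 3 corresponds to what is commonly called epsilon-greedy exploration, restricted by the forced-refusal rule. A minimal Python sketch follows; the function name, the list-based Q-value row and the injectable random source are assumptions made for illustration.

```python
import random
from typing import Sequence


def select_admission_action(q_row: Sequence[float],
                            epsilon: float,
                            forced_refusal: bool,
                            rng: random.Random = random.Random()) -> int:
    """Pick the admission action actually taken in the current state.

    q_row[0] and q_row[1] are the learning values of refusing (0) and
    admitting (1) in the current discretized event-extended state.
    """
    if forced_refusal:
        # No compatible charging point is free: the vehicle must be refused.
        return 0
    # Greedy action; ties resolve to 0 because max() keeps the first maximum.
    greedy = max((0, 1), key=lambda a: q_row[a])
    if rng.random() < epsilon:
        # Exploration: with only two actions, the other one is 1 - greedy.
        return 1 - greedy
    return greedy
```

Since the admission action set holds only two actions, exploring reduces to flipping the greedy choice; with a larger action set the exploration action would be drawn from the remaining actions.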
[0037] step 4: observing and calculating the combined quantity $r(s_t^e, v(s_t^e), s_{t'}^e)$ of charging rewards and peak regulation rewards obtained in the state transition process of the service system of the charging station from taking the current action $v(s_t^e)$ in the state $s_t^e = \{t, C_t, PR_t, m_t, SOC_{m_t}(t)\}$ at the $n$th decision-making epoch $T_n$ for admission control to the state $s_{t'}^e = \{t', C_{t'}, PR_{t'}, m_{t'}, SOC_{m_{t'}}(t')\}$ at the $(n+1)$th decision-making epoch $T_{n+1}$ for admission control or the epoch $T$;
[0038] step 5: updating the discretization event-extended state-action pair learning value $Q(s_n^e, v(s_t^e))$ for taking the action $v(s_t^e)$ in the discretization state $s_n^e$ corresponding to $s_t^e$ in the Q-value table for admission control by using the difference formula shown in Equ. (1) and the Q-value updating formula shown in Equ. (2), and assigning the value to $Q(s_n^e, v(s_t^e))$:
[0039] $d(s_t^e, v(s_t^e), s_{t'}^e) = r(s_t^e, v(s_t^e), s_{t'}^e) + \max_{a \in D_a} Q(s_{n+1}^e, a) - Q(s_n^e, v(s_t^e))$ (1)
[0040] $Q(s_n^e, v(s_t^e)) := Q(s_n^e, v(s_t^e)) + \gamma(s_n^e, v(s_t^e))\, d(s_t^e, v(s_t^e), s_{t'}^e)$ (2)
[0041] wherein in Equ. (1), $Q(s_{n+1}^e, a)$ represents the discretization event-extended state-action pair learning value for taking the action $a$ in the discretization state $s_{n+1}^e$ corresponding to the state $s_{t'}^e$ of transition to the (n+1)th decision-making epoch $T_{n+1}$ for admission control or the epoch $T$;
[0042] in Equ. (2), the operator ":=" indicates that the value of the right-hand side is calculated first and then assigned to the variable on the left; $\gamma(s_n^e, v(s_t^e))$ is a learning step length for taking the action $v(s_t^e)$ in the discretization state $s_n^e$ at the nth decision-making epoch $T_n$ for admission control;
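As a minimal sketch of how Equ. (1) and Equ. (2) act on one observed transition, the following Python function keeps the Q-value table in a dictionary keyed by (discretization state, action). The name `td_update`, the dictionary layout and the numeric values are illustrative assumptions and are not taken from the method itself.

```python
from collections import defaultdict


def td_update(q, s_n, action, reward, s_next, actions, step):
    """Apply the difference formula and the Q-value updating formula once."""
    # Equ. (1): difference term; the formula as written uses no discount factor.
    best_next = max(q[(s_next, a)] for a in actions)
    diff = reward + best_next - q[(s_n, action)]
    # Equ. (2): shift the stored value by the learning step length times the difference.
    q[(s_n, action)] += step * diff
    return diff


# Hypothetical transition: state 0, action 1 (admit), reward 2.0, next state 1.
q = defaultdict(float)  # every learning value starts at 0
d = td_update(q, s_n=0, action=1, reward=2.0, s_next=1, actions=(0, 1), step=0.5)
```

In this toy call the difference is 2.0 and the stored value for (state 0, admit) becomes 1.0, because only half of the difference is applied.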
[0043] step 6: selecting the action corresponding to the maximum discretization event-extended state-action pair learning value of each row in the updated Q-value table for admission control to form the current action set for admission control, taking the current action set as the updated greedy strategy table for admission control, and assigning it to the current greedy strategy $v$ for admission control; decreasing the exploration rate $\varepsilon_n$, thereby obtaining the updated exploration rate and assigning it to $\varepsilon_{n+1}$;
[0044] step 7: if $t' < T$, assigning n+1 to n, and returning to the step 3; otherwise, indicating $t' = T$, and performing step 8; and
[0045] step 8: judging whether the strategy table $v$ for admission control is equal to $v_0$ or not; if so, stopping updating and performing admission control on the random charging service requests of the M electric vehicles with the current strategy table $v$ for admission control; otherwise, returning to the step 2 for execution;
[0046] the peak regulation response control includes the following steps:
[0047] step -1: defining and initializing an exploration rate of the peak regulation control action at the kth decision-making epoch $\tau_k$ for peak regulation control as $\varepsilon_k$, and letting $0 < \varepsilon_k < 1$;
[0048] defining elements in the Q-value table for peak regulation control as state-action pair learning values of the service system of the charging station, and initializing the elements in the Q-value table for peak regulation control;
[0049] defining a current greedy strategy table $V$ for peak regulation control as a set formed by actions corresponding to the maximum discretization state-action pair learning value of each row in the Q-value table for peak regulation control;
[0050] step -2: initializing $t = 0$ and $k = 0$; assigning the current exploration rate $\varepsilon_k$ for the peak regulation control action to an original exploration rate $\varepsilon_0$; assigning the current greedy strategy table $V$ for peak regulation control to an original strategy table $V_0$;
[0051] step -3: at the kth decision-making epoch $\tau_k$ for peak regulation control of the service system of the charging station, observing the current state $s_t$ of the service system of the charging station;
[0052] denoting the discretization state corresponding to the system state $s_t$ of the kth decision-making epoch $\tau_k$ for peak regulation control in the Q-value table for peak regulation control as $s_k$;
[0053] denoting the peak regulation control action which is actually taken in the system state $s_t$ at the kth decision-making epoch $\tau_k$ for peak regulation control as $V(s_t)$, wherein $V(s_t) \in D_f$;
[0054] in the system state $s_t$ at the kth decision-making epoch $\tau_k$ for peak regulation control, extracting a greedy action in the discretization state $s_k$ corresponding to the current state $s_t$ from the Q-value table for peak regulation control and denoting it as $V(s_k)$;
[0055] in the system state $s_t$ at the kth decision-making epoch $\tau_k$ for peak regulation control, randomly selecting an action $V_{\varepsilon_k}$ from the feasible action set $D_f$ according to the current exploration rate $\varepsilon_k$ for peak regulation control and assigning the action to $V(s_t)$, and assigning $V(s_k)$ to $V(s_t)$ with the probability $1 - \varepsilon_k$;
[0056] after the control center for peak regulation of the charging station takes the action $V(s_t)$, observing and obtaining a system transition sample track $(s_t, V(s_t), s_{t'})$ transited from the kth decision-making epoch $\tau_k$ for peak regulation control to the (k+1)th decision-making epoch $\tau_{k+1}$ for peak regulation control, wherein $t = \tau_k$ and $t' = \tau_{k+1}$;
[0057] step -4: observing and calculating the combined quantity $r(s_t, V(s_t), s_{t'})$ of charging rewards and peak regulation rewards obtained, under the current action $V(s_t)$, in the state transition process of the service system of the charging station from the state $s_t$ at the kth decision-making epoch $\tau_k$ for peak regulation control to the state $s_{t'}$ at the (k+1)th decision-making epoch $\tau_{k+1}$ for peak regulation control;
[0058] step -5: updating the discretization state-action pair learning value $Q(s_k, V(s_t))$ for taking the action $V(s_t)$ in the discretization state $s_k$ corresponding to $s_t$ in the Q-value table for peak regulation control by using a difference formula and a Q-value updating formula shown in Equ. (3) and Equ. (4), and assigning the value to $Q(s_k, V(s_t))$:
[0059] $$d(s_t, V(s_t), s_{t'}) = r(s_t, V(s_t), s_{t'}) + \max_{d \in D_f} Q(s_{k+1}, d) - Q(s_k, V(s_t)) \qquad (3)$$
[0060] $$Q(s_k, V(s_t)) := Q(s_k, V(s_t)) + \gamma(s_k, V(s_t))\, d(s_t, V(s_t), s_{t'}) \qquad (4)$$
[0061] wherein in Equ. (3), $Q(s_{k+1}, d)$ represents the discretization state-action pair learning value for taking the feasible action $d$ in the discretization state $s_{k+1}$ corresponding to the state $s_{t'}$ of transition to the (k+1)th decision-making epoch $\tau_{k+1}$ for peak regulation control;
[0062] in Equ. (4), the operator ":=" indicates that the value of the right-hand side is calculated first and then assigned to the variable on the left; $\gamma(s_k, V(s_t))$ is a learning step length for taking the action $V(s_t)$ in the discretization state $s_k$ at the kth decision-making epoch $\tau_k$ for peak regulation control;
[0063] step -6: selecting the action corresponding to the maximum discretization state-action pair learning value of each row in the updated Q-value table for peak regulation control to form the current action set for peak regulation control, taking the current action set as the updated greedy strategy table for peak regulation control, and assigning it to the current greedy strategy $V$ for peak regulation control; decreasing the exploration rate $\varepsilon_k$, thereby obtaining the updated exploration rate and assigning it to $\varepsilon_{k+1}$;
[0064] step -7: if $k < K$, assigning k+1 to k, and returning to the step -3; otherwise, performing step -8; and
[0065] step -8: judging whether the strategy table $V$ for peak regulation control is equal to $V_0$ or not; if so, stopping updating and performing peak regulation control on the AC charging electric vehicles served by the charging station with the current greedy strategy table $V$ for peak regulation control; otherwise, returning to the step -2 for execution;
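The outer loop formed by steps -2 to -8 can be sketched as follows. This is only an illustration under stated assumptions: the Q-value table is a list of dictionaries (one per discretization state), `run_one_day` is a hypothetical callback standing in for steps -3 to -7 over one day, and `max_days` is an added safety bound that the method itself does not mention.

```python
def greedy_table(q, actions):
    """Greedy action of every row; ties go to the first action listed."""
    return [max(actions, key=lambda d: row[d]) for row in q]


def learn_until_stable(q, actions, run_one_day, max_days=1000):
    """Repeat steps -2 to -8 until the greedy table survives one full day."""
    for day in range(1, max_days + 1):
        v0 = greedy_table(q, actions)        # step -2: remember the original table
        run_one_day(q)                       # steps -3 to -7 (hypothetical callback)
        if greedy_table(q, actions) == v0:   # step -8: compare the table with the original
            return v0, day
    return greedy_table(q, actions), max_days


# Toy check: two states, one AC vehicle, actions -1 / 0 / 1.
q = [{-1: 0.0, 0: 0.0, 1: 0.0} for _ in range(2)]
calls = []


def fake_day(table):
    """Stand-in for one day of learning: changes the table only on the first call."""
    calls.append(1)
    if len(calls) == 1:
        table[0][1] = 5.0


policy, days = learn_until_stable(q, (-1, 0, 1), fake_day)
```

With the toy callback the greedy table changes on the first day and is unchanged on the second, so the loop stops after two days.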
[0066] Compared with the prior art, the present invention has the following beneficial effects.
[0067] 1. In the present invention, the epoch when the power-grid peak regulation electricity price is issued is taken as the decision-making epoch for peak regulation of the control center for peak regulation response, the energy exchange directions of all the AC charging electric vehicles admitted by the service system of the charging station are taken as the decision-making actions, decisions are made according to the system state including the starting epoch of a peak regulation period, the real-time state of the charging points in the system and the current power-grid peak regulation electricity price, and the starting epoch of the peak regulation period and the current power-grid peak regulation electricity price are taken as part of the system state, thus facilitating reflection of the time sequence characteristic of peak regulation of the power grid, enabling the control strategy to adapt to the peak regulation demands of the power grid and better conform to actual situations, and improving the feasibility of the method.
[0068] 2. In the present invention, the power-grid peak regulation electricity price and the online service state of the charging point are taken as the state of the service system of the charging station; the charging service request of the electric vehicle which arrives randomly is taken as the event; the random event and the state of the service system of the charging station are combined into the event-extended state; whether the arriving electric vehicle is admitted into the charging station to be provided with the charging service is taken as the system action; the epoch when the charging service request of the electric vehicle arrives randomly is taken as the decision-making epoch for admission control; the intelligent admission control process of the electric vehicle at
the charging station where the electric vehicle arrives randomly is described as a discrete event-driven decision-making process, and a corresponding action is taken according to the real-time event-extended state of the system; therefore, admission control of the electric vehicle of the charging station where the service request of the electric vehicle arrives randomly is processed effectively, and by optimization, the system may reasonably select the admission action, thus improving the running economy of the service system of the charging station, and adapting to the peak load regulation demands of the power grid.
[0069] 3. In the present invention, admission of the electric vehicle of the charging station is intelligently controlled and optimized with a Q-learning method of the electric-vehicle admission control center, and the energy interaction between the service AC electric vehicle of the charging station and the power grid is intelligently controlled and optimized with a Q-learning method of the control center for peak regulation response, compared with a theoretical solution method, in the present invention, a complete mathematical modeling process is not required to be performed on a control system, and particularly, the random characteristics in the system are not required to be modeled precisely. With the present invention, a better control strategy may be obtained by observing running samples of the system to perform a real-time online learning process. In addition, when random parameters of the system change, operators are not required to modify an algorithm, the online learning process may still be performed according to the actual running process of the system, and a better intelligent admission control strategy of the electric vehicle may be obtained adaptively; particularly, the double-center Q-learning method in the present invention solves the asynchronous decision problem in the cooperative-optimization control of the charging station and overcomes the defects of a centralized synchronous decision method.
[0070] 4. The cooperative-optimization control method of the charging station based on the double-center Q-learning method according to the present invention is also suitable for the situation where charging prices are different in different periods of time and the situation where the power-grid peak regulation electricity price is issued non-periodically (or randomly).
[0071] Fig. 1 is a flow chart of an electric-vehicle admission control center in a method according to the present invention;
[0072] Fig. 2 is a flow chart of a control center for peak regulation response in the method according to the present invention; and
[0073] Fig. 3 is a schematic diagram of a service system of a charging station according to the present invention.
[0074] In this embodiment, as shown in Fig. 3, a cooperative-optimization control method of a charging station based on a double-center Q-learning method is applied to a service system of the charging station, which includes $J_D$ DC charging points 1, $J_A$ AC charging points 2, $J_{AD}$ AC and DC-hybrid charging points 3, $M_D$ DC fast-charging electric vehicles 4 which arrive randomly, $M_A$ AC slow-charging electric vehicles 5 which arrive randomly, a power-grid peak regulation electricity price plan 6, an admission control center 7 and a control center 8 for peak regulation response;
[0075] each DC charging point is enabled to adaptively meet charging power demands of the $M_D$ DC fast-charging electric vehicles, each AC charging point is enabled to adaptively meet charging power demands of the $M_A$ AC slow-charging electric vehicles, each AC and DC-hybrid charging point is enabled to meet the charging power demands of the $M_D$ DC fast-charging electric vehicles and the $M_A$ AC slow-charging electric vehicles, and one charging point is enabled to provide charging service for only one electric vehicle at a time;
[0076] the jth DC charging point is denoted as $CS_j^D$, $j \in \Phi_{J_D} = \{1, 2, \cdots, J_D\}$, and $\Phi_{J_D}$ represents a set of codes of the DC charging points, thereby denoting the $J_D$ DC charging points as $CS_1^D, CS_2^D, \cdots, CS_j^D, \cdots, CS_{J_D}^D$ respectively; the jth AC charging point is denoted as $CS_j^A$, $j \in \Phi_{J_A} = \{1, 2, \cdots, J_A\}$, and $\Phi_{J_A}$ represents a set of codes of the AC charging points, thereby denoting the $J_A$ AC charging points as $CS_1^A, CS_2^A, \cdots, CS_j^A, \cdots, CS_{J_A}^A$ respectively; the jth AC and DC-hybrid charging point is denoted as $CS_j^{AD}$, $j \in \Phi_{J_{AD}} = \{1, 2, \cdots, J_{AD}\}$, and $\Phi_{J_{AD}}$ represents a set of codes of the AC and DC-hybrid charging points, thereby denoting the $J_{AD}$ AC and DC-hybrid charging points as $CS_1^{AD}, CS_2^{AD}, \cdots, CS_j^{AD}, \cdots, CS_{J_{AD}}^{AD}$ respectively;
[0077] the charging power demand of the mth DC charging electric vehicle is denoted as $P_m^D$ kW, the total capacity of a battery of the electric vehicle is denoted as $E_m^D$ kWh, and the charging power demand and the total capacity are determined by the configuration of the electric vehicle; thus, the charging power demands of the $M_D$ DC fast-charging electric vehicles are denoted as $P_1^D, P_2^D, \cdots, P_m^D, \cdots, P_{M_D}^D$, $m \in \Phi_{M_D} = \{1, 2, \cdots, M_D\}$, and $\Phi_{M_D}$ represents a set of codes of all types of the DC charging electric vehicles;
[0078] the charging power demand of the mth AC charging electric vehicle is denoted as $P_m^A$ kW, the total capacity of a battery of the electric vehicle is denoted as $E_m^A$ kWh, and the charging power demand and the total capacity are determined by the configuration of the electric vehicle; thus, the charging power demands of the $M_A$ AC slow-charging electric vehicles are denoted as $P_1^A, P_2^A, \cdots, P_m^A, \cdots, P_{M_A}^A$, $m \in \Phi_{M_A} = \{1, 2, \cdots, M_A\}$, and $\Phi_{M_A}$ represents a set of codes of all types of the AC charging electric vehicles;
[0079] K is set as the maximum period number in one day, a corresponding total time length is T, a power-grid peak regulation electricity price at any epoch t under the total time length T is denoted as $PR_t$ yuan/kWh, $PR_t \in \Phi_{PR}$, and $\Phi_{PR}$ is a finite electricity-price state space; it is assumed that the power-grid peak regulation electricity price is periodically issued according to a dispatching instruction, $\tau_k$ is the epoch when the kth peak regulation electricity price $PR_k$ is issued, and the price is maintained until the epoch $\tau_{k+1}$ when the next peak regulation electricity price is issued; that is, $PR_t = PR_k$ for $\tau_k \le t < \tau_{k+1}$, $k = 0, 1, 2, \cdots, K-1$, and $\tau_0 = 0$; a peak regulation electricity price sequence is denoted as $\{(\tau_k, PR_k) \mid k = 0, 1, 2, \cdots, K-1;\ \tau_0 = 0\}$, wherein $PR_k \in \Phi_{PR}$ and $\tau_K = T$;
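Because each peak regulation electricity price is held from its issue epoch until the next issue epoch, the price at an arbitrary epoch t can be read off a sorted list of issue epochs. The sketch below uses Python's standard `bisect` module; the function name `price_at`, the hour-based epochs and the price values are hypothetical and only illustrate the piecewise-constant rule.

```python
from bisect import bisect_right


def price_at(t, issue_epochs, prices):
    """Return (k, PR_k) for the period whose issue epoch is the latest one not after t.

    issue_epochs must be sorted and start at 0 (the first issue epoch).
    """
    k = bisect_right(issue_epochs, t) - 1
    return k, prices[k]


issue_epochs = [0.0, 6.0, 12.0, 18.0]  # hypothetical issue epochs, in hours
prices = [0.3, 0.6, 0.9, 0.5]          # hypothetical prices, in yuan/kWh

k_morning, pr_morning = price_at(6.0, issue_epochs, prices)
k_night, pr_night = price_at(23.5, issue_epochs, prices)
```

An epoch that coincides with an issue epoch belongs to the newly issued price, which matches the left-closed, right-open periods.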
[0080] the charging station provides paid charging service, and the price of the charging service of the charging station is $PR_{ev}$ yuan/kWh; $PR_{ev}$ is at least lower than the maximum peak regulation electricity price;
[0081] the event that the $m_t$th electric vehicle, whose battery has the state of charge (SOC) $SOC_{m_t}(t)$ at the epoch t, randomly arrives at the charging station to apply for the charging service is denoted as an arrival event $E(m_t, SOC_{m_t}(t))$, and $m_t \in \Phi_{M_D} \cup \Phi_{M_A}$;
[0082] the service state of the jth DC charging point at the epoch t is denoted as $CS_j^D(t) = (m_j^D(t), SOC_{m_j^D}(t))$, thereby denoting the combined state of the $J_D$ DC charging points at the epoch t as $CS_t^D = (CS_1^D(t), CS_2^D(t), \cdots, CS_j^D(t), \cdots, CS_{J_D}^D(t))$; $m_j^D(t)$ represents the type of the electric vehicle which is served by the jth DC charging point $CS_j^D$ at the epoch t, $m_j^D(t) = 0$ indicates that no vehicle is admitted at the jth DC charging point $CS_j^D$ at the epoch t, and $m_j^D(t) \in \Phi_{M_D}$ indicates that the jth DC charging point $CS_j^D$ is charging one electric vehicle in $\Phi_{M_D}$; $SOC_{m_j^D}(t)$ represents the SOC of the battery of the $m_j^D(t)$th DC charging electric vehicle which is served by the jth DC charging point $CS_j^D$ at the epoch t;
[0083] the service state of the jth AC charging point at the epoch t is denoted as $CS_j^A(t) = (m_j^A(t), SOC_{m_j^A}(t))$, thereby denoting the combined state of the $J_A$ AC charging points at the epoch t as $CS_t^A = (CS_1^A(t), CS_2^A(t), \cdots, CS_j^A(t), \cdots, CS_{J_A}^A(t))$; $m_j^A(t)$ represents the type of the electric vehicle which is served by the jth AC charging point $CS_j^A$ at the epoch t, $m_j^A(t) = 0$ indicates that no vehicle is admitted at the jth AC charging point $CS_j^A$ at the epoch t, and $m_j^A(t) \in \Phi_{M_A}$ indicates that the jth AC charging point $CS_j^A$ is charging one electric vehicle in $\Phi_{M_A}$; $SOC_{m_j^A}(t)$ represents the SOC of the battery of the $m_j^A(t)$th AC charging electric vehicle which is served by the jth AC charging point $CS_j^A$ at the epoch t;
[0084] the service state of the jth AC and DC-hybrid charging point at the epoch t is denoted as $CS_j^{AD}(t) = (m_j^{AD}(t), SOC_{m_j^{AD}}(t))$, thereby denoting the combined state of the $J_{AD}$ AC and DC-hybrid charging points at the epoch t as $CS_t^{AD} = (CS_1^{AD}(t), CS_2^{AD}(t), \cdots, CS_j^{AD}(t), \cdots, CS_{J_{AD}}^{AD}(t))$; $m_j^{AD}(t)$ represents the type of the electric vehicle which is served by the jth AC and DC-hybrid charging point $CS_j^{AD}$ at the epoch t, $m_j^{AD}(t) = 0$ indicates that no vehicle is admitted at the jth AC and DC-hybrid charging point $CS_j^{AD}$ at the epoch t, and $m_j^{AD}(t) \in \Phi_{M_D} \cup \Phi_{M_A}$ indicates that the jth AC and DC-hybrid charging point $CS_j^{AD}$ is charging one electric vehicle in $\Phi_{M_D}$ or $\Phi_{M_A}$; $SOC_{m_j^{AD}}(t)$ represents the current SOC of the battery of the $m_j^{AD}(t)$th DC or AC charging electric vehicle which is served by the jth AC and DC-hybrid charging point $CS_j^{AD}$ at the epoch t;
[0085] the combined state of the three types of charging points at the epoch t is denoted as $C_t = \{CS_t^D, CS_t^A, CS_t^{AD}\}$;
[0086] the state of the service system of the charging station at any epoch t is denoted as $s_t = \{t, C_t, PR_t\}$;
[0087] it is assumed that the DC charging electric vehicle admitted into the service in the service system of the charging station does not participate in peak regulation, and the AC charging electric vehicle admitted into the service may participate in peak regulation; it is assumed that discharge power of one AC charging electric vehicle is equal to charging power, and the discharge reward per unit time per unit discharge power at any epoch is equal to a real-time power-grid electricity price;
[0088] the epoch $\tau_k$ of issuing the kth peak regulation electricity price $PR_k$ is taken as a decision-making epoch for peak regulation of the control center for peak regulation response, and the energy exchange directions of all the AC charging electric vehicles admitted by the service system of the charging station are denoted as actions $d_k$ at the decision-making epoch for peak regulation, $d_k = \{(d_k^A(1), d_k^A(2), \cdots, d_k^A(j), \cdots, d_k^A(J_A)), (d_k^{AD}(1), d_k^{AD}(2), \cdots, d_k^{AD}(j), \cdots, d_k^{AD}(J_{AD}))\}$, wherein $d_k^A(j), d_k^{AD}(j) \in D_s = \{-1, 0, 1\}$; -1 represents discharging peak regulation, 0 represents no charging and discharging action, 1 represents a charging action, and $D_s$ represents a set of peak regulation control actions for each electric vehicle;
[0089] at the kth decision-making epoch $\tau_k$ for peak regulation, in the jth AC and DC-hybrid charging point, if $m_j^{AD}(t) \in \Phi_{M_D}$, $d_k^{AD}(j) = 0$, and if $m_j^{AD}(t) = 0$, $d_k^{AD}(j) = 0$, and $j \in \Phi_{J_{AD}}$; in the jth AC charging point, if $m_j^A(t) = 0$, $d_k^A(j) = 0$, and $j \in \Phi_{J_A}$;
[0090] at the kth decision-making epoch $\tau_k$ for peak regulation, a set of feasible peak regulation control actions $d_k$ of the charging station is denoted as $D_f^k$, and $D_f^k \subseteq D_f$, wherein $D_f$ is a Cartesian product of $J_A + J_{AD}$ sets $D_s$ of peak regulation control actions, i.e., $D_f = D_s \times D_s \times \cdots \times D_s$; the total number of actions in the set $D_f$ is denoted as $C$;
[0091] all the actions in $D_f$ are encoded, $d(c)$ is set to represent the cth action, and $d(c) \in D_f$, $c = 1, 2, \cdots, C$;
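To make the encoded action set concrete, the sketch below enumerates the Cartesian product of per-vehicle directions with `itertools.product` and then filters the feasible actions. It assumes, in line with the statement that only admitted AC charging electric vehicles participate in peak regulation, that any charging point not serving an AC vehicle is fixed at 0; the function names and the three-point example are hypothetical.

```python
from itertools import product

PER_VEHICLE_ACTIONS = (-1, 0, 1)  # discharge, idle, charge


def all_actions(n_points):
    """Every combination of directions over the AC and hybrid charging points."""
    return list(product(PER_VEHICLE_ACTIONS, repeat=n_points))


def is_feasible(action, serves_ac_vehicle):
    """Assumed rule: a point without an AC vehicle must take direction 0."""
    return all(d == 0 for d, ac in zip(action, serves_ac_vehicle) if not ac)


# Hypothetical station: 2 AC points + 1 hybrid point; the second point is idle.
actions = all_actions(3)
feasible = [d for d in actions if is_feasible(d, (True, False, True))]
```

The full set grows as 3 to the power of the number of points (27 here), while the feasible subset at this epoch has only 9 members, which is why infeasible entries of the Q-value table are masked rather than learned.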
[0092] the occurrence epoch t of the arrival event $E(m_t, SOC_{m_t}(t))$ of the $m_t$th electric vehicle is taken as a decision-making epoch for admission control of the admission control center for the electric vehicle, and event information of the current epoch and current state information of the service system of the charging station are combined and denoted as an event-extended state $s_t^e = \{t, C_t, PR_t, m_t, SOC_{m_t}(t)\}$;
[0093] the epoch when the nth arrival event occurs is denoted as $T_n$, i.e., $t = T_n$, and a corresponding peak regulation period for the power-grid electricity price is denoted as $[\tau_{k_n}, \tau_{k_n+1})$, $k_n \in \{0, 1, \cdots, K-1\}$, i.e., $T_n \in [\tau_{k_n}, \tau_{k_n+1})$;
[0094] a change interval $[0, 1]$ of the SOC of the battery of the electric vehicle is discretized by using a small constant $\delta$ to obtain a discretization event-extended state $s_n^e = \{k_n, \bar{C}_n, PR_n, m_n, \overline{SOC}_{m_n}(n)\}$ corresponding to $s_t^e$, wherein n represents a numerical value or discretization value corresponding to the nth decision-making epoch $T_n$ for admission control; $\bar{C}_n = \{\overline{CS}_n^D, \overline{CS}_n^A, \overline{CS}_n^{AD}\}$ is the discretization combined state of the charging points corresponding to $C_t$; $\overline{CS}_n^D = (\overline{CS}_1^D(n), \overline{CS}_2^D(n), \cdots, \overline{CS}_j^D(n), \cdots, \overline{CS}_{J_D}^D(n))$ is the discretization combined state of the DC charging points corresponding to $CS_t^D$, $\overline{CS}_n^A = (\overline{CS}_1^A(n), \overline{CS}_2^A(n), \cdots, \overline{CS}_j^A(n), \cdots, \overline{CS}_{J_A}^A(n))$ is the discretization combined state of the AC charging points corresponding to $CS_t^A$, and $\overline{CS}_n^{AD} = (\overline{CS}_1^{AD}(n), \overline{CS}_2^{AD}(n), \cdots, \overline{CS}_j^{AD}(n), \cdots, \overline{CS}_{J_{AD}}^{AD}(n))$ is the discretization combined state of the AC and DC-hybrid charging points corresponding to $CS_t^{AD}$; $\overline{CS}_j^D(n) = (m_j^D(n), \overline{SOC}_{m_j^D}(n))$, $\overline{CS}_j^A(n) = (m_j^A(n), \overline{SOC}_{m_j^A}(n))$ and $\overline{CS}_j^{AD}(n) = (m_j^{AD}(n), \overline{SOC}_{m_j^{AD}}(n))$ are the discretization states corresponding to $CS_j^D(t)$, $CS_j^A(t)$ and $CS_j^{AD}(t)$ respectively, and $\overline{SOC}_{m_n}(n), \overline{SOC}_{m_j^D}(n), \overline{SOC}_{m_j^A}(n), \overline{SOC}_{m_j^{AD}}(n) \in \{0, \delta, 2\delta, \cdots, 1-\delta, 1\}$; $PR_n \in \Phi_{PR}$;
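The SOC discretization can be illustrated with a short Python sketch. The text fixes only the grid of multiples of the small constant between 0 and 1; mapping a continuous SOC to the nearest grid point is an assumption made here, and the function name `discretize_soc` is hypothetical.

```python
def discretize_soc(soc, delta):
    """Map a continuous SOC in [0, 1] to (grid index, grid value).

    Nearest-grid-point rounding is an assumed rule; the grid itself is
    0, delta, 2*delta, ..., 1 with 1/delta taken to be an integer.
    """
    if not 0.0 <= soc <= 1.0:
        raise ValueError("SOC must lie in [0, 1]")
    levels = round(1.0 / delta)   # number of grid steps between 0 and 1
    index = round(soc * levels)   # integer index avoids accumulating float error
    return index, index / levels


index, value = discretize_soc(0.43, 0.1)
```

Working with the integer index rather than the floating-point grid value gives a convenient key for rows of the Q-value table.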
[0095] the state space formed by all possible discretization event-extended states is denoted as $\Phi^e$, i.e., $s_n^e \in \Phi^e$, and the total number of the discretization event-extended states of the system is denoted as $S^e$;
[0096] all the possible discretization event-extended states are encoded, $s_n^e(s)$ represents the sth discretization event-extended state, and $s_n^e(s) \in \Phi^e$, $s = 1, 2, \cdots, S^e$; a set of all possible discretization event-extended states where a DC-charging-electric-vehicle arriving event occurs and all the DC charging points and all the AC and DC-hybrid charging points are busy is denoted as $\Phi_b^D$; a set of all possible discretization event-extended states where an AC-charging-electric-vehicle arriving event occurs and all the AC charging points and all the AC and DC-hybrid charging points are busy is denoted as $\Phi_b^A$;
[0097] under the same discretization rule, the discretization state corresponding to the state $s_t$ of the service system of the charging station at any epoch t is denoted as $s_k$, $s_k = \{k, \bar{C}_k, PR_k\}$, and $\bar{C}_k$ and $\bar{C}_n$ have a consistent value space; the state space formed by all the possible discretization states of the service system of the charging station is denoted as $\Phi$, i.e., $s_k \in \Phi$, and the total number of the discretization states of the system is denoted as $S$;
[0098] all the possible discretization states of the service system of the charging station are encoded, $s_k(s)$ represents the sth discretization state, and $s_k(s) \in \Phi$, $s = 1, 2, \cdots, S$;
[0099] the decision-making epoch for admission control of the system is defined as the arrival epoch of any electric vehicle, i.e., the event occurrence epoch;
[00100] whether the service system of the charging station admits the charging request of the electric vehicle which arrives randomly and provides the charging service is taken as an admission control action $a$, and the action at the nth decision-making epoch $T_n$ for admission control is denoted as $a_n$, and $a_n \in D_a = \{0, 1\}$, wherein 0 represents service refusal, 1 represents service admission, and $D_a$ represents a set of actions of the admission control center;
[00101] at any decision-making epoch $T_n$ for admission control, if the arriving electric vehicle is of type $m_t \in \Phi_{M_D}$ and all the DC charging points and all the AC and DC-hybrid charging points are in service, i.e., $m_j^D(t) \in \Phi_{M_D}$ for all $j \in \Phi_{J_D}$ and $m_j^{AD}(t) \in \Phi_{M_D} \cup \Phi_{M_A}$ for all $j \in \Phi_{J_{AD}}$, $a_n = 0$; if the arriving electric vehicle is of type $m_t \in \Phi_{M_A}$ and all the AC charging points and all the AC and DC-hybrid charging points are in service, i.e., $m_j^A(t) \in \Phi_{M_A}$ for all $j \in \Phi_{J_A}$ and $m_j^{AD}(t) \in \Phi_{M_D} \cup \Phi_{M_A}$ for all $j \in \Phi_{J_{AD}}$, $a_n = 0$;
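The forced-refusal rule of this paragraph reduces to a check on the occupancy of the charging points the arriving vehicle could use. In the sketch below each list holds the type of the vehicle served by each point, with 0 meaning idle; the function name `must_refuse` and the list-based representation are illustrative assumptions.

```python
def must_refuse(vehicle_is_dc, dc_points, ac_points, hybrid_points):
    """True when every point usable by the arriving vehicle is in service.

    A DC vehicle can use DC or hybrid points; an AC vehicle can use AC or
    hybrid points. A stored value of 0 marks an idle point.
    """
    own_points = dc_points if vehicle_is_dc else ac_points
    return all(m != 0 for m in own_points) and all(m != 0 for m in hybrid_points)


# Hypothetical snapshot: both DC points and the hybrid point busy, one AC point idle.
dc_refused = must_refuse(True, dc_points=[1, 2], ac_points=[0], hybrid_points=[3])
ac_refused = must_refuse(False, dc_points=[1, 2], ac_points=[0], hybrid_points=[3])
```

In this snapshot a DC arrival must be refused, while an AC arrival still has an idle AC point and is left to the learned strategy.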
[00102] at the nth decision-making epoch $T_n$ for admission control, if $m_t \in \Phi_{M_D}$ and $a_n = 1$, the arriving DC charging electric vehicle is preferentially admitted into any idle DC charging point and charged immediately; if $m_t \in \Phi_{M_A}$ and $a_n = 1$, the arriving AC charging electric vehicle is preferentially admitted into any idle AC charging point and is charged immediately; it is assumed that the electric vehicle leaves the charging station once full;
[00103] the cooperative-optimization control method of the charging station is divided into Q-learning control of the electric-vehicle admission control center and Q-learning control of the control center for peak regulation response;
[00104] as shown in Fig. 1, the Q-learning control method of the electric-vehicle admission control center of the charging station includes the following steps:
[00105] step 1: defining and initializing an exploration rate of the admission control action at the nth decision-making epoch $T_n$ for admission control as $\varepsilon_n$, and letting $0 < \varepsilon_n < 1$, for example, letting $\varepsilon_n = 0.8$;
[00106] defining elements in a Q-value table for admission control as discretization event-extended state-action pair learning values, and initializing the elements in the Q-value table for admission control, for example, randomly initializing the value of each element or setting it to 0, wherein the Q-value table for admission control takes the discretization event-extended state of the system at the time of the event as a row and the admission action of the system as a column, i.e.,
$$\begin{bmatrix} Q(s_n^e(1), 0) & Q(s_n^e(1), 1) \\ Q(s_n^e(2), 0) & Q(s_n^e(2), 1) \\ \vdots & \vdots \\ Q(s_n^e(s), 0) & Q(s_n^e(s), 1) \\ \vdots & \vdots \\ Q(s_n^e(S^e), 0) & Q(s_n^e(S^e), 1) \end{bmatrix}$$
and if $s_n^e(s) \in \Phi_b^D \cup \Phi_b^A$, $s = 1, 2, \cdots, S^e$, $Q(s_n^e(s), 1)$ is a negative infinite value;
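A minimal sketch of this initialization is given below, with one row per discretization event-extended state and the two admission actions as columns. Zero initialization is one of the options named above; the function name `init_admission_q` and the use of a set of row indices to mark the states in which all usable charging points are busy are assumptions for illustration.

```python
import math


def init_admission_q(n_states, blocked_states):
    """Two-column table: column 0 = refuse, column 1 = admit, all values 0.

    Rows listed in blocked_states get -inf in the admit column, so admission
    can never be the greedy action in those states.
    """
    q = [[0.0, 0.0] for _ in range(n_states)]
    for s in blocked_states:
        q[s][1] = -math.inf
    return q


q = init_admission_q(4, blocked_states={2})
greedy_in_blocked_state = max((0, 1), key=lambda a: q[2][a])
```

Because negative infinity is smaller than any finite learning value, the blocked rows keep refusal as their greedy action throughout learning.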
[00107] defining a current greedy control strategy table v as an action set formed by actions corresponding to the maximum discretization event-extended state-action pair learning value of each row in the Q-value table for admission control;
[00108] step 2: initializing variables $t = 0$ and $n = 1$; assigning the current exploration rate $\varepsilon_n$ for the admission control action to $\varepsilon_1$; letting an original strategy table $v_0 = v$;
[00109] step 3: at the nth decision-making epoch $T_n$ of the service system of the charging station when the arrival event $E(m_t, SOC_{m_t}(t))$ occurs, observing the current state $s_t$ of the service system, and denoting the event-extended state as $s_t^e$;
[00110] denoting the discretization state corresponding to the current event-extended state $s_t^e$ of the nth decision-making epoch $T_n$ in the Q-value table as $s_n^e$;
[00111] denoting the admission control action which is actually taken in the current event-extended state $s_t^e$ at the nth decision-making epoch $T_n$ as $v(s_t^e)$, wherein $v(s_t^e) \in D_a$;
[00112] in the event-extended state $s_t^e$ at the nth decision-making epoch $T_n$ for admission control, if the corresponding discretization state $s_n^e$ meets $s_n^e \in \Phi_b^D \cup \Phi_b^A$, letting $v(s_t^e) = 0$; otherwise, in the current event-extended state $s_t^e$, extracting a greedy action in the discretization state $s_n^e$ corresponding to $s_t^e$ from the Q-value table, denoting the greedy action as $v(s_n^e)$, assigning $v(s_n^e)$ to $v(s_t^e)$ with a probability $1 - \varepsilon_n$, or selecting an action other than $v(s_n^e)$ from the action set $D_a$ at the exploration rate $\varepsilon_n$ as an exploration action $v_{\varepsilon_n}$ and assigning the action to $v(s_t^e)$;
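The action selection of step 3 can be sketched as follows. Since the admission action set contains only refusal (0) and admission (1), "an action other than the greedy action" is simply the opposite action. The function name, the `blocked` flag and the injectable random source are illustrative assumptions.

```python
import random


def choose_admission_action(q_row, blocked, eps, rng=random):
    """Pick the admission action for one arrival.

    q_row   -- [Q(state, refuse), Q(state, admit)] for the current row
    blocked -- True when all usable charging points are in service
    eps     -- current exploration rate, between 0 and 1
    """
    if blocked:
        return 0                      # forced refusal
    greedy = max((0, 1), key=lambda a: q_row[a])
    if rng.random() < eps:
        return 1 - greedy             # explore: the only non-greedy action
    return greedy                     # exploit with probability 1 - eps


row = [0.2, 0.7]                      # hypothetical learning values
exploit = choose_admission_action(row, blocked=False, eps=0.0)
explore = choose_admission_action(row, blocked=False, eps=1.0)
forced = choose_admission_action(row, blocked=True, eps=1.0)
```

Setting the exploration rate to 0 or 1 makes the choice deterministic, which is convenient for checking the two branches.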
[00113] after the service system of the charging station takes the action $v(s_t^e)$, observing and obtaining a transition sample track $(s_t^e, v(s_t^e), s_{t'}^e)$ transited from the nth decision-making epoch $T_n$ for admission control to the (n+1)th decision-making epoch $T_{n+1}$ for admission control or the epoch $T$, wherein $t = T_n$, and $t' = T_{n+1} < T$ or $t' = T$; when $t' = T$, assuming $s_{t'}^e = \{T, C_T, PR_T, 0, 0\}$;
[00114] step 4: observing the service system of the charging station, and with Equ. (1), calculating the combined quantity $r(s_t^e, v(s_t^e), s_{t'}^e)$ of accumulated charging rewards and peak regulation rewards obtained, under the current action $v(s_t^e)$, in the state transition process of the system from the state $s_t^e = \{t, C_t, PR_t, m_t, SOC_{m_t}(t)\}$ at the nth decision-making epoch $T_n$ for admission control to the state $s_{t'}^e = \{t', C_{t'}, PR_{t'}, m_{t'}, SOC_{m_{t'}}(t')\}$ at the (n+1)th decision-making epoch $T_{n+1}$ for admission control or the epoch $T$;
[00115] $$r(s_t^e, v(s_t^e), s_{t'}^e) = \int_{T_n}^{t'} \Big[ \sum_{j \in \Phi_{J_D}} \mathrm{sgn}(m_j^D(t))\, P^D_{m_j^D(t)} + \sum_{j \in \Phi_{J_{AD}}} \chi_{\Phi_{M_D}}(m_j^{AD}(t))\, P^D_{m_j^{AD}(t)} + \sum_{j \in \Phi_{J_A}} \mathrm{sgn}(m_j^A(t))\, d_t^A(j)\, P^A_{m_j^A(t)} + \sum_{j \in \Phi_{J_{AD}}} \chi_{\Phi_{M_A}}(m_j^{AD}(t))\, d_t^{AD}(j)\, P^A_{m_j^{AD}(t)} \Big] (PR_{ev} - PR_t)\, dt \qquad (1)$$
[00116] wherein in Equ. (1), it is defined that when $m_j^D(t) = 0$, $\mathrm{sgn}(m_j^D(t)) = 0$, and when $m_j^D(t) > 0$, $\mathrm{sgn}(m_j^D(t)) = 1$; when $m_j^A(t) = 0$, $\mathrm{sgn}(m_j^A(t)) = 0$, and when $m_j^A(t) > 0$, $\mathrm{sgn}(m_j^A(t)) = 1$; when $m_j^{AD}(t) \in \Phi_{M_D}$, $\chi_{\Phi_{M_D}}(m_j^{AD}(t)) = 1$, otherwise, $\chi_{\Phi_{M_D}}(m_j^{AD}(t)) = 0$; when $m_j^{AD}(t) \in \Phi_{M_A}$, $\chi_{\Phi_{M_A}}(m_j^{AD}(t)) = 1$, otherwise, $\chi_{\Phi_{M_A}}(m_j^{AD}(t)) = 0$; $P^D_{m_j^D(t)}$ and $P^A_{m_j^A(t)}$ represent the charging power demands of the $m_j^D(t)$th DC charging electric vehicle and the $m_j^A(t)$th AC charging electric vehicle respectively; $d_t^A(j), d_t^{AD}(j) \in D_s$ represent the power directions of the electric vehicles which are served by the jth AC charging point $CS_j^A$ and the jth AC and DC-hybrid charging point $CS_j^{AD}$ of the service system of the charging station at the current epoch t under the peak regulation control action of the control center for peak regulation response;
[00117] step 5: updating the discretization event-extended state-action pair learning value $Q(s_n^e, v(s_t^e))$ for taking the action $v(s_t^e)$ in the discretization state $s_n^e$ corresponding to $s_t^e$ in the Q-value table for admission control by using a difference formula and a Q-value updating formula shown in Equ. (2) and Equ. (3), obtaining the updated learning value and assigning it to $Q(s_n^e, v(s_t^e))$:
[00118] $$d(s_t^e, v(s_t^e), s_{t'}^e) = r(s_t^e, v(s_t^e), s_{t'}^e) + \max_{a \in D_a} Q(s_{n+1}^e, a) - Q(s_n^e, v(s_t^e)) \qquad (2)$$
[00119] $$Q(s_n^e, v(s_t^e)) := Q(s_n^e, v(s_t^e)) + \gamma(s_n^e, v(s_t^e))\, d(s_t^e, v(s_t^e), s_{t'}^e) \qquad (3)$$
[00120] wherein in Equ. (2), $Q(s_{n+1}^e, a)$ represents the discretization event-extended state-action pair learning value for taking the action $a$ in the discretization state $s_{n+1}^e$ corresponding to the state $s_{t'}^e$ of transition to the (n+1)th decision-making epoch $T_{n+1}$ for admission control or the epoch $T$;
[00121] in Equ. (3), the operator ":=" indicates that the value of the right-hand side is calculated first and then assigned to the variable on the left; $\gamma(s_n^e, v(s_t^e))$ is a learning step length for taking the action $v(s_t^e)$ in the discretization event-extended state $s_n^e$ at the nth decision-making epoch $T_n$ for admission control;
[00122] step 6: selecting the action corresponding to the maximum discretization event-extended state-action pair learning value of each row in the updated Q-value table for admission control to form the current action set for admission control, taking the current action set as the updated greedy strategy table for admission control, and assigning it to the current greedy strategy $v$ for admission control; decreasing the exploration rate $\varepsilon_n$, thereby obtaining the updated exploration rate and assigning it to $\varepsilon_{n+1}$;
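Step 6 can be illustrated with two small helpers: one reads the greedy strategy table off the Q-value table, the other lowers the exploration rate. The text does not state how the exploration rate is lowered; the multiplicative decay with a lower bound used here, together with the names `greedy_policy` and `decay_exploration` and the default constants, is an assumption for illustration only.

```python
import math


def greedy_policy(q):
    """Column index of the largest learning value in each row (first on ties)."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


def decay_exploration(eps, factor=0.99, floor=0.01):
    """Assumed schedule: shrink eps by a constant factor, never below floor."""
    return max(floor, eps * factor)


q = [[0.0, 1.0], [2.0, -math.inf]]   # hypothetical table; row 1 is a blocked state
policy = greedy_policy(q)
eps_next = decay_exploration(0.8)
```

For the toy table the greedy strategy admits in the first state and refuses in the blocked state, and an exploration rate of 0.8 drops to about 0.792 after one decay.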
[00123] step 7: if $t' < T$, assigning n+1 to n and returning to the step 3; otherwise, indicating $t' = T$ and performing step 8; and
[00124] step 8: judging whether the strategy table $v$ for admission control is equal to $v_0$ or not; if so, stopping updating and performing admission control on the random charging service requests of the M electric vehicles with the current strategy table $v$ for admission control; otherwise, returning to the step 2 for execution.
[00125] As shown in Fig. 2, the Q-learning control method of the control center for peak regulation of the charging station includes the following steps:
[00126] step -1: defining an exploration rate of the peak regulation control action at the kth decision-making epoch $T_k$ for peak regulation control as $\epsilon_k$, and initializing it such that $0 < \epsilon_k < 1$, for example, letting $\epsilon_k = 0.9$;
[00127] defining elements in a Q-value table for peak regulation control as state-action pair learning values of the service system of the charging station, and initializing the elements in the Q-value table for peak regulation control, for example, randomly initializing the value of each element or setting it to 0, wherein the Q-value table for peak regulation control takes the discretization state of the service system of the charging station as a row and the peak regulation control action of the system as a column, i.e.,
$$\begin{bmatrix}
Q(s_k(1), d(1)) & Q(s_k(1), d(2)) & \cdots & Q(s_k(1), d(c)) & \cdots & Q(s_k(1), d(C)) \\
Q(s_k(2), d(1)) & Q(s_k(2), d(2)) & \cdots & Q(s_k(2), d(c)) & \cdots & Q(s_k(2), d(C)) \\
\vdots & \vdots & & \vdots & & \vdots \\
Q(s_k(\varsigma), d(1)) & Q(s_k(\varsigma), d(2)) & \cdots & Q(s_k(\varsigma), d(c)) & \cdots & Q(s_k(\varsigma), d(C)) \\
\vdots & \vdots & & \vdots & & \vdots \\
Q(s_k(S), d(1)) & Q(s_k(S), d(2)) & \cdots & Q(s_k(S), d(c)) & \cdots & Q(s_k(S), d(C))
\end{bmatrix}$$
for any element $Q(s_k(\varsigma), d(c))$, $\varsigma = 1, 2, \ldots, S$, if $d(c)$ is not a peak regulation control action which is feasible in the system state $s_k(\varsigma)$, i.e., $d(c) \notin D_t$, then $Q(s_k(\varsigma), d(c))$ is a negative infinite value;
[00128] defining a current greedy strategy table $v$ for peak regulation control as an action set formed by actions corresponding to the maximum discretization state-action pair learning value of each row in the Q-value table for peak regulation control;
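The table layout of step -1 can be sketched in Python as follows: rows are discretization states, columns are peak regulation control actions, infeasible entries are set to negative infinity, and the greedy strategy table takes the column of the row maximum. The function names and the feasibility rule used in the example are hypothetical and serve only to exercise the sketch.

```python
import math


def init_q_table(num_states, num_actions, feasible):
    """Rows: discretization states; columns: peak regulation control actions."""
    return [[0.0 if feasible(s, c) else -math.inf for c in range(num_actions)]
            for s in range(num_states)]


def greedy_strategy(q):
    """Column index of the largest learning value in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


def feasible(s, c):
    # Hypothetical rule for illustration: state s allows only actions 0..s.
    return c <= s


q = init_q_table(3, 3, feasible)
q[2][1] = 0.7            # pretend one learning value has already been updated
v = greedy_strategy(q)
```

Because infeasible entries hold negative infinity, they can never be selected as the row maximum, so the greedy strategy table contains feasible actions only.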
[00129] step -2: initializing $t = 0$ and $k = 0$, assigning the current exploration rate $\epsilon_k$ for the peak regulation control action to $\epsilon_0$, and setting an original greedy strategy table for peak regulation control as $v_0 = v$;
[00130] step -3: at the kth decision-making epoch $T_k$ for peak regulation control of the service system of the charging station, observing the current state $s_t$ of the service system;
[00131] denoting the discretization state corresponding to the system state $s_t$ of the kth decision-making epoch $T_k$ for peak regulation control in the Q-value table for peak regulation control as $s_k$;
[00132] denoting the peak regulation control action which is actually taken in the system state $s_t$ at the kth decision-making epoch $T_k$ for peak regulation control as $v(s_t)$, wherein $v(s_t) \in D_t$;
[00133] in the system state $s_t$ at the kth decision-making epoch $T_k$ for peak regulation control, extracting a greedy action in the discretization state $s_k$ corresponding to $s_t$ from the Q-value table for peak regulation control and denoting it as $v(s_k)$;
[00134] in the system state $s_t$ at the kth decision-making epoch $T_k$ for peak regulation control, randomly selecting an action $v_a$ from the current feasible action set $D_t$ according to the exploration rate $\epsilon_k$ and assigning the action to $v(s_t)$, and assigning $v(s_k)$ to $v(s_t)$ with the probability $1 - \epsilon_k$;
[00135] after the control center for peak regulation of the charging station takes the action $v(s_t)$, observing and obtaining a system transition sample track $\langle s_t, v(s_t), s_{t'} \rangle$ transited from the kth decision-making epoch $T_k$ for peak regulation control to the (k+1)th decision-making epoch $T_{k+1}$ for peak regulation control, wherein $t = T_k$ and $t' = T_{k+1}$;
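The action selection of step -3 can be sketched in Python as follows, assuming that one row of the Q-value table is available as a list indexed by action and that the feasible action set is given as a list of column indices. The function name `select_action` and the example values are hypothetical.

```python
import random


def select_action(q_row, feasible_actions, eps, rng=random):
    """Explore with probability eps, otherwise take the greedy feasible action."""
    greedy = max(feasible_actions, key=lambda a: q_row[a])
    if rng.random() < eps:
        # Exploration: any action from the current feasible action set.
        return rng.choice(feasible_actions)
    # Exploitation: the greedy action, taken with probability 1 - eps.
    return greedy


rng = random.Random(0)
q_row = [0.1, 0.9, 0.5]
action = select_action(q_row, feasible_actions=[0, 2], eps=0.9, rng=rng)
```

Restricting both branches to the feasible action set keeps the selected action inside $D_t$ even when an infeasible column happens to hold a large value.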
[00136] step -4: observing the service system of the charging station, and with Equ. (4), calculating the combined quantity $r(s_t, v(s_t), s_{t'})$ of charging rewards and peak regulation rewards obtained in the state transition process of the system from the state $s_t$, in which the current action $v(s_t)$ is taken at the kth decision-making epoch $T_k$ for peak regulation control, to the state $s_{t'}$ at the (k+1)th decision-making epoch $T_{k+1}$ for peak regulation control;
[00137] $r(s_t, v(s_t), s_{t'}) = \int [\,\cdots\,]\,\mathrm{d}t$ (4) [integrand illegible in the source text; the surviving fragments show sgn(·)-weighted charging terms and a difference of two $P^R$ terms]
[00138] step -5: updating the discretization state-action pair learning value $Q(s_k, v(s_t))$ for taking the action $v(s_t)$ in the discretization state $s_k$ corresponding to $s_t$ in the Q-value table for peak regulation control by using a difference formula and a Q-value updating formula shown in Equ. (5) and Equ. (6), obtaining the updated learning value and assigning it to $Q(s_k, v(s_t))$;
[00139] $d(s_t, v(s_t), s_{t'}) = r(s_t, v(s_t), s_{t'}) + \max_{d \in D_{t'}} Q(s_{k+1}, d) - Q(s_k, v(s_t))$ (5)
[00140] $Q(s_k, v(s_t)) := Q(s_k, v(s_t)) + \gamma(s_k, v(s_t))\, d(s_t, v(s_t), s_{t'})$ (6)
[00141] wherein in Equ. (5), $Q(s_{k+1}, d)$ represents the discretization state-action pair learning value for taking the feasible action $d$ in the discretization state $s_{k+1}$ corresponding to the state $s_{t'}$ of transition of the system to the (k+1)th decision-making epoch $T_{k+1}$ for peak regulation control;
[00142] in Equ. (6), the operator ":=" indicates that the value of the right formula is calculated first and then given to the left variable; $\gamma(s_k, v(s_t))$ is a learning step length for taking the action $v(s_t)$ in the discretization state $s_k$ at the kth decision-making epoch $T_k$ for peak regulation control;
[00143] step -6: selecting the action corresponding to the maximum discretization state-action pair learning value of each row in the updated Q-value table for peak regulation control to form the current action set for peak regulation control, taking the current action set as the updated greedy strategy table for peak regulation control, and assigning it to the current greedy strategy $v$ for peak regulation control; degrading the exploration rate $\epsilon_k$, thereby obtaining the updated exploration rate and assigning it to $\epsilon_{k+1}$;
[00144] step -7: if $k < K$, assigning k+1 to k and returning to step -3; otherwise, performing step -8; and
[00145] step -8: judging whether the strategy table $v$ for peak regulation control is equal to $v_0$ or not; if so, stopping updating and performing peak regulation control on the AC charging electric vehicles served by the charging station with the current greedy strategy table $v$ for peak regulation control; otherwise, returning to step -2 for execution.
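The overall loop of steps -2 to -8 can be sketched in Python as follows. The two-state toy environment `toy_step`, its rewards, the constant learning step length, the multiplicative degradation of the exploration rate, the cap `max_rounds`, and the reset of the state at the start of each round are all assumptions made so that the sketch is self-contained; they are not part of the method as described.

```python
import random


def train(env_step, states, actions, horizon_k, eps0=0.9, step=0.1,
          decay=0.9, max_rounds=200, seed=0):
    """Repeat steps -2 to -8 until the greedy strategy table stops changing."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in actions} for s in states}

    def greedy_table():
        # Step -6: action with the largest learning value in each row.
        return {s: max(actions, key=lambda a: q[s][a]) for s in states}

    v = greedy_table()
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        v0, eps, s = dict(v), eps0, states[0]             # step -2
        for _ in range(horizon_k):                        # steps -3 to -7
            a = rng.choice(actions) if rng.random() < eps else v[s]
            r, s_next = env_step(s, a)                    # step -4
            d = r + max(q[s_next].values()) - q[s][a]     # Equ. (5)
            q[s][a] += step * d                           # Equ. (6)
            v = greedy_table()
            eps *= decay
            s = s_next
        if v == v0:                                       # step -8
            break
    return v, rounds


def toy_step(state, action):
    """Hypothetical station: +1 for the action that suits the load, else -1."""
    good = {"low": "charge", "high": "curtail"}
    reward = 1.0 if action == good[state] else -1.0
    return reward, ("high" if state == "low" else "low")


policy, rounds = train(toy_step, ["low", "high"], ["charge", "curtail"],
                       horizon_k=50)
```

The stopping test compares the strategy table at the end of a round with the copy saved at its start, mirroring the comparison of $v$ with $v_0$ in step -8.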
Claims (1)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911316131.XA CN110991931B (en) | 2019-12-19 | 2019-12-19 | Charging station cooperative optimization control method based on double-center Q learning |
Publications (2)
Publication Number | Publication Date |
---|---|
NL2026738A NL2026738A (en) | 2021-08-18 |
NL2026738B1 NL2026738B1 (en) | 2022-03-18 |
Family
ID=70096022
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
NL2026738A NL2026738B1 (en) | 2019-12-19 | 2020-10-23 | Cooperative-optimization control method of charging station based on double-center q-learning method |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110991931B (en) |
NL (1) | NL2026738B1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114548518A (en) * | 2022-01-21 | 2022-05-27 | 广州蔚景科技有限公司 | Ordered charging control method for electric automobile |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10889196B2 (en) * | 2017-06-02 | 2021-01-12 | CarFlex Corporation | Autonomous vehicle servicing and energy management |
CN110068337B (en) * | 2019-04-25 | 2023-09-19 | 安徽师范大学 | Unmanned aerial vehicle scheduling method and system for sensor node charging |
CN110443415B (en) * | 2019-07-24 | 2022-07-15 | 三峡大学 | Electric vehicle charging station multi-objective optimization scheduling method considering dynamic electricity price strategy |
CN110428165B (en) * | 2019-07-31 | 2022-03-25 | 电子科技大学 | Electric vehicle charging scheduling method considering reservation and queuing in charging station |
- 2019-12-19: CN, application CN201911316131.XA, patent CN110991931B (active)
- 2020-10-23: NL, application NL2026738A, patent NL2026738B1 (active)
Also Published As
Publication number | Publication date |
---|---|
CN110991931B (en) | 2022-03-15 |
NL2026738A (en) | 2021-08-18 |
CN110991931A (en) | 2020-04-10 |