CN114444802A - Electric vehicle charging guide optimization method based on graph neural network reinforcement learning - Google Patents

Info

Publication number
CN114444802A
Authority
CN
China
Prior art keywords
electric vehicle
node
neural network
charging
reinforcement learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210109887.2A
Other languages
Chinese (zh)
Other versions
CN114444802B (en
Inventor
江昌旭
卢玥君
林铮
邵振国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN202210109887.2A
Publication of CN114444802A
Application granted
Publication of CN114444802B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0283 Price estimation or determination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Accounting & Taxation (AREA)
  • Game Theory and Decision Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Tourism & Hospitality (AREA)
  • Finance (AREA)
  • Evolutionary Computation (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Operations Research (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Primary Health Care (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Charge And Discharge Circuits For Batteries Or The Like (AREA)
  • Electric Propulsion And Braking For Vehicles (AREA)

Abstract

The present invention provides an electric vehicle charging guidance optimization method based on graph neural network reinforcement learning, comprising the following steps: step S1: initialize the collaborative optimization model of the power-transportation fusion network; step S2: update the electric vehicle charging load; step S3: generate the action a_{i,t} according to the epsilon-greedy algorithm and the graph neural network reinforcement learning algorithm; step S4: execute the charging guidance action a_{i,t}; step S5: compute the reward function of the graph neural network reinforcement learning algorithm; step S6: update the state x_{i,t} of the partially observable Markov decision process; step S7: store the information of the current step (x_{i,t}, a_{i,t}, r_{i,t}, x'_{i,t}) in the memory unit D; step S8: determine whether the predetermined time T_end has been reached; if not, repeat steps S2 to S7; if so, output the parameters of the graph neural network reinforcement learning algorithm and the corresponding results. Applying this technical solution effectively reduces the total cost of electric vehicle charging and achieves orderly charging of electric vehicles and coordinated optimal dispatch of the power system.

Figure 202210109887

Description

Electric vehicle charging guidance optimization method based on graph neural network reinforcement learning

Technical Field

The invention relates to the technical field of collaborative optimization of power-transportation fusion networks, and in particular to an electric vehicle charging guidance optimization method based on graph neural network reinforcement learning.

Background

With the large-scale deployment of electric vehicles, the power system and the transportation system will interact and integrate extensively, forming a power-transportation fusion network. This network involves multiple entities, including electric vehicles, the power system, and the transportation system, and contains many random, uncertain factors. The interaction among these entities, the influence of the random factors, and the coupling among those factors make it harder both to understand how the power and transportation systems affect each other and to solve the collaborative optimization of the power-transportation fusion network. For example, the travel behavior, psychology, and driving behavior of electric vehicle users are all random to some degree. This affects the flow distribution in the transportation system and makes traffic flow uncertain, which in turn affects when electric vehicles arrive at charging stations, so that the charging start time, queuing time, and charging duration are also highly uncertain. Unlike a conventional electric load, an electric vehicle is a mobile load; it is more random than a conventional load and harder to predict.

Current research on the power-transportation fusion network falls into three directions: 1) from the power system perspective, guiding electric vehicles to charge at the lowest cost by computing nodal marginal-cost electricity prices or by optimizing charging station service pricing; 2) from the transportation system perspective, minimizing charging cost by optimizing the charging route; 3) considering the interests of electric vehicles, the power system, and the transportation system together, and maximizing the overall benefit by optimizing the vehicles' charging strategy and the power system's dispatch decisions. However, most existing studies treat the problem as static optimization and do not consider the coupling among electric vehicles, charging stations, and the power system on a continuous time scale. Most existing studies also do not consider how multiple uncertain factors and their coupling affect the collaborative optimization of the power-transportation fusion network. More importantly, existing studies do not consider how interactions among electric vehicles affect that collaborative optimization.

Summary of the Invention

In view of this, the purpose of the present invention is to provide an electric vehicle charging guidance optimization method based on graph neural network reinforcement learning that, while accounting for the multiple uncertainty factors of the power-transportation fusion network, effectively reduces the total cost of electric vehicle charging and achieves orderly charging of electric vehicles and coordinated optimal dispatch of the power system.

To achieve the above purpose, the present invention adopts the following technical solution: an electric vehicle charging guidance optimization method based on graph neural network reinforcement learning, comprising the following steps:

Step S1: initialize the collaborative optimization model of the power-transportation fusion network;

Step S2: update the electric vehicle charging load, and compute the marginal-cost electricity price at the nodes where electric vehicle charging stations are located, based on second-order cone relaxation optimization and duality theory;

Step S3: generate the electric vehicle charging guidance action a_{i,t} according to the epsilon-greedy algorithm and the graph neural network reinforcement learning algorithm;

Step S4: execute the charging guidance action a_{i,t}, and check and update the state of the electric vehicle;

Step S5: compute the reward function of the graph neural network reinforcement learning algorithm from the power-transportation fusion environment;

Step S6: update the state x_{i,t} of the partially observable Markov decision process;

Step S7: store the information of the current step (x_{i,t}, a_{i,t}, r_{i,t}, x'_{i,t}) in the memory unit D, and update the weights of the graph neural network reinforcement learning algorithm by stochastic gradient descent; here x_{i,t} denotes the current state, a_{i,t} the electric vehicle action, r_{i,t} the reward function value, and x'_{i,t} the next state of the graph neural network reinforcement learning;

Step S8: determine whether the predetermined time T_end has been reached; if not, repeat steps S2 to S7; if so, output the parameters of the graph neural network reinforcement learning algorithm and the corresponding results.
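The control flow of steps S1 to S8 can be sketched as a single loop. This is only an illustrative skeleton: the `env` and `agent` objects and all of their method names are hypothetical placeholders, not interfaces defined by the patent.

```python
from collections import deque


def run_training(env, agent, t_end, memory_capacity=10000):
    """Run steps S2 to S7 repeatedly until the horizon t_end (step S8).

    `env` and `agent` are hypothetical objects; their method names are
    assumptions made for this sketch only.
    """
    memory = deque(maxlen=memory_capacity)      # memory unit D
    state = env.reset()                         # S1: initialization
    for _ in range(t_end):                      # S8: stop once T_end is reached
        env.update_charging_load()              # S2: load update and nodal prices
        action = agent.select_action(state)     # S3: epsilon-greedy + GNN policy
        env.step(action)                        # S4: execute the guidance action
        reward = env.reward()                   # S5: reward from the environment
        next_state = env.observe()              # S6: state update
        memory.append((state, action, reward, next_state))  # S7: store transition
        agent.learn(memory)                     # S7: stochastic gradient update
        state = next_state
    return agent, memory
```

Keeping the environment and the agent behind small interfaces like this makes it possible to swap the power flow solver or the network architecture without touching the loop.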

In a preferred embodiment, initializing the collaborative optimization model of the power-transportation fusion network comprises the following steps:

Step 21: determine the topology and parameters of the power network and the transportation network; for the power system these include nodes, lines, initial voltages, and the upper and lower limits of the optimization variables, and for the transportation network they include traffic nodes, road parameters, road capacity, and maximum travel speed;

Step 22: initialize the neural network parameters, including the network weights and hyperparameters such as the learning rate α, the discount factor γ, the batch size B, and the capacity of the memory unit D;

Step 23: treat each electric vehicle in the study area as an agent and as a node n∈N, and treat the connections between electric vehicles as edges e∈E, forming the graph structure G=(N,E); then initialize the current state x_{i,t} of each electric vehicle i and the adjacency matrix A.
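As an illustration of Step 23, the sketch below builds an adjacency matrix A over the vehicles. The text does not state how edges between vehicles are chosen, so the rule used here (two vehicles are connected when they are within a given distance of each other) and the names `build_adjacency`, `positions`, and `radius` are assumptions made purely for the example.

```python
import math


def build_adjacency(positions, radius):
    """Return a symmetric 0/1 adjacency matrix with self-loops.

    positions: list of (x, y) vehicle coordinates (hypothetical input).
    radius: distance threshold for connecting two vehicles (assumed rule).
    """
    n = len(positions)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1  # self-loop: each vehicle also sees its own state
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= radius:
                adj[i][j] = adj[j][i] = 1
    return adj
```

Because vehicles move, a matrix built this way would have to be recomputed at each time step, which matches the dynamic graph structure described in the text.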

In a preferred embodiment, updating the electric vehicle charging load and computing the marginal-cost electricity price at the nodes where electric vehicle charging stations are located, based on second-order cone relaxation optimization and duality theory, comprises the following steps:

Step 31: update the electric vehicle charging load: compute the charging load of each charging station from the number of electric vehicles at the station and the charging power, then add the base load of the node to obtain the node's total load;
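Step 31 amounts to a simple accumulation, sketched below. The function name, the dictionary layout, and the use of a single charging power for all vehicles are assumptions for illustration.

```python
def node_loads(base_load_kw, evs_at_station, charging_power_kw, station_node):
    """Total load per node = base load + EV charging load at that node.

    base_load_kw: dict node -> base load in kW
    evs_at_station: dict station -> number of EVs currently charging
    charging_power_kw: charging power per vehicle (assumed identical for all)
    station_node: dict station -> distribution network node it connects to
    """
    loads = dict(base_load_kw)  # copy so the input is not modified
    for station, count in evs_at_station.items():
        node = station_node[station]
        loads[node] = loads.get(node, 0.0) + count * charging_power_kw
    return loads
```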

Step 32: establish the distribution network optimal power flow model based on the branch flow model:

min f(p,q,P,Q,V,I) (1)

Figure BDA0003494780970000031

Figure BDA0003494780970000032

Figure BDA0003494780970000033

Figure BDA0003494780970000034

Figure BDA0003494780970000035

Figure BDA0003494780970000036

Figure BDA0003494780970000037

Figure BDA0003494780970000041

In these equations, E_N and E_L denote the sets of distribution network nodes and lines, respectively; P_{ij} and Q_{ij} denote the active and reactive power flowing on the branch from node i to node j; P_{jk} denotes the active power flowing on the branch from node j to node k; the symbols in Figure BDA0003494780970000042 and Figure BDA0003494780970000043 denote the generator's active and reactive output, that is, the active and reactive power injected at node j; the symbols in Figure BDA0003494780970000044 and Figure BDA0003494780970000045 denote the active and reactive power injected at node j by the wind turbine; Q_{js} denotes the reactive power flowing on the branch from node j to node s; r_{ij} and x_{ij} denote the resistance and reactance of the branch from node i to node j; I_{ij} denotes the current on the branch from node i to node j; π(j) denotes the set of branches connected to node j; the symbols in Figure BDA0003494780970000046 and Figure BDA0003494780970000047 denote the active and reactive load connected at node j; V_i and V_j denote the voltage magnitudes at nodes i and j; z_{ij} denotes the impedance of the branch connecting nodes i and j, with z_{ij} = r_{ij} + j x_{ij}; the symbol in Figure BDA0003494780970000048 denotes the maximum current of the branch connecting nodes i and j; V_j and the symbol in Figure BDA0003494780970000049 denote the minimum and maximum voltage at node j; the symbol in Figure BDA00034947809700000410 denotes the maximum active output of the wind turbine connected to node j; the symbol in Figure BDA00034947809700000411 denotes the power factor of the wind turbine connected to node j;

The load at distribution network node j (Figure BDA00034947809700000412) comprises the base load (Figure BDA00034947809700000413) and the electric vehicle charging load (Figure BDA00034947809700000414), that is,

Figure BDA00034947809700000415

According to the actual needs of the distribution network, the objective function min f(p,q,P,Q,V,I) can finally be defined as:

Figure BDA00034947809700000416

In this equation, the symbol in Figure BDA00034947809700000417 denotes the active output of the generator injected at node i; a_i and b_i denote the generator's quadratic and linear coal-consumption coefficients, respectively; the symbols in Figure BDA00034947809700000418 and Figure BDA00034947809700000419 denote, respectively, the electricity price and the active power of the electricity purchased from the main grid;

Step 33: convert the above nonlinear distribution network optimal power flow model into a second-order cone relaxation programming model:

Because the branch-flow-model optimal power flow (BFM-OPF) is a nonlinear program, define the branch current magnitude variable as in Figure BDA00034947809700000420 and the voltage magnitude variable as in Figure BDA00034947809700000421, and apply the second-order cone relaxation (SOCR) to the model, which gives:

Figure BDA0003494780970000051

Figure BDA0003494780970000052

Figure BDA0003494780970000053

Figure BDA0003494780970000054

Figure BDA0003494780970000055

Figure BDA0003494780970000056

Figure BDA0003494780970000057

Here ||·||_2 denotes the second-order cone (Euclidean norm) operation; the equations above constitute the basic form of the relaxed distribution network optimal power flow;

Step 34: use the Gurobi solver to solve the primal problem and the dual variables of the above model, and obtain the marginal-cost electricity price λ_k at the node where each charging station is located.
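The relaxed constraint itself appears only as an image, so the following is a sketch of the standard branch-flow second-order cone relaxation and not necessarily the exact form used in the patent. It assumes the usual substitution of squared magnitudes, l = |I_ij|^2 and v = |V_i|^2, under which the nonconvex equality P^2 + Q^2 = l·v is relaxed to P^2 + Q^2 ≤ l·v, which is equivalent to the norm constraint ||[2P, 2Q, l − v]||_2 ≤ l + v. The two helper functions below (hypothetical names) evaluate both forms so the equivalence can be checked numerically; no solver is called.

```python
import math


def product_form_holds(P, Q, l, v):
    """Relaxed rotated-cone form: P^2 + Q^2 <= l * v (l, v >= 0 assumed)."""
    return P ** 2 + Q ** 2 <= l * v


def norm_form_holds(P, Q, l, v):
    """Standard second-order cone form: ||[2P, 2Q, l - v]||_2 <= l + v."""
    return math.sqrt((2 * P) ** 2 + (2 * Q) ** 2 + (l - v) ** 2) <= l + v
```

Squaring both sides of the norm form gives 4P^2 + 4Q^2 + (l − v)^2 ≤ (l + v)^2, which simplifies to P^2 + Q^2 ≤ l·v, so the two checks agree whenever l + v ≥ 0. In a solver, the nodal marginal-cost price is then read from the dual variable of the active power balance constraint at the charging station's node, as Step 34 describes.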

In a preferred embodiment, the epsilon-greedy algorithm comprises the following steps:

Step 41: generate a random number u and compare it with the decay factor ξ of the epsilon-greedy algorithm;

Step 42: if u<ξ, randomly generate an action a_{i,t} for each electric vehicle in the current state; in this patent the action represents the electric vehicle's charging route decision;

a_{i,t} = randint(N_action) (19)

where N_action denotes the number of available electric vehicle actions;

Step 43: if u≥ξ, generate an action a_{i,t} for each electric vehicle i from the experience of the graph neural network reinforcement learning algorithm, given the current state x_{i,t} and the adjacency matrix A, that is,

Figure BDA0003494780970000058

where θ_t denotes the parameters of the graph neural network reinforcement learning algorithm; argmax() returns the argument that attains the maximum; x_{i,t} denotes the state of the i-th electric vehicle at time t, which consists of the electric vehicle state EV_{i,t}, the neighboring road information Ro_{i,t}, the neighboring electric vehicle states Ne_{i,t}, and the charging station information CS_t, that is,

x_{i,t} = [EV_{i,t}, Ro_{i,t}, Ne_{i,t}, CS_t] (21)

Figure BDA0003494780970000061

Figure BDA0003494780970000062

Figure BDA0003494780970000063

Figure BDA0003494780970000064

Here, the state EV_{i,t} of the i-th electric vehicle includes the next node on its way to the charging station (Figure BDA0003494780970000065), the road number (Figure BDA0003494780970000066), the vehicle's travel speed v_{i,t}, and its remaining charge SOC_{i,t}; the neighboring road information Ro_{i,t} includes, for each next road connected to the next node of electric vehicle i (Figure BDA0003494780970000067), its start node (Figure BDA0003494780970000068), end node (Figure BDA0003494780970000069), road length (Figure BDA00034947809700000610), and the number of electric vehicles on the road (Figure BDA00034947809700000611); the neighboring electric vehicle state Ne_{i,t} includes the state of each neighboring electric vehicle k, namely the next node of the k-th electric vehicle near the i-th electric vehicle (Figure BDA00034947809700000612), the number of the road it is on (Figure BDA00034947809700000613), its travel speed v_{i,k,t}, and its remaining charge SOC_{i,k,t}; the charging station information CS_t includes the charging price p_{c,t} of each charging station and the number of electric vehicles at each station (Figure BDA00034947809700000614).
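Steps 41 to 43 can be sketched for a single vehicle as follows. The function name `select_action` and the representation of the network output as a plain list of Q values are assumptions; in the method described here those values would come from the graph neural network evaluated on x_{i,t} and A.

```python
import random


def select_action(q_values, xi, rng=random):
    """Epsilon-greedy choice over a list of Q values for one vehicle.

    q_values: estimated value of each action (hypothetical network output)
    xi: exploration threshold, the decay factor of the epsilon-greedy rule
    """
    u = rng.random()                             # Step 41: u in [0, 1)
    if u < xi:                                   # Step 42: explore
        return rng.randrange(len(q_values))      # uniform over N_action actions
    # Step 43: exploit, pick the action with the largest Q value
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With xi = 0 the choice is always greedy, and with xi = 1 it is always random; a training run would typically decay xi between those extremes.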

The neural network of the graph neural network reinforcement learning algorithm consists of an input layer; a fully connected layer that extracts features x'_{i,t} from the input state x_{i,t}; two graph neural network layers that take the extracted features x'_{i,t} together with the adjacency matrix A and extract further features; and a final fully connected layer that outputs the electric vehicle charging route action a_{i,t}. The graph neural network used is a graph attention network.
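To make the role of the adjacency matrix concrete, here is a minimal single-head graph attention layer in plain Python. It follows the commonly used graph attention formulation (linear projection, LeakyReLU attention scores, softmax over neighbors); the patent does not give layer sizes, head counts, or activation details, so the function signature and every parameter name here are assumptions.

```python
import math


def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z


def gat_layer(features, adjacency, weight, attn):
    """One single-head graph attention layer (illustrative sketch).

    features: n x f_in node features (list of lists)
    adjacency: n x n 0/1 matrix; every row must contain a self-loop
    weight: f_in x f_out projection matrix
    attn: attention vector of length 2 * f_out
    """
    n, f_out = len(features), len(weight[0])
    # Linear projection of every node's features
    proj = [[sum(h[k] * weight[k][m] for k in range(len(h))) for m in range(f_out)]
            for h in features]
    out = []
    for i in range(n):
        neighbors = [j for j in range(n) if adjacency[i][j]]
        # Unnormalized attention score for each neighbor j of node i
        scores = [leaky_relu(sum(a * z for a, z in zip(attn, proj[i] + proj[j])))
                  for j in neighbors]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of the neighbors' projected features
        out.append([sum(al * proj[j][d] for al, j in zip(alphas, neighbors))
                    for d in range(f_out)])
    return out
```

Each vehicle's new feature vector is a weighted average over the vehicles it is connected to in A, which is how information about nearby vehicles reaches an individual agent's decision.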

In a preferred embodiment, the reward function r_{i,t} of the graph neural network reinforcement learning algorithm is given by:

Figure BDA0003494780970000071

where node_cur and node_tar denote the node where the electric vehicle currently is and any charging station node the electric vehicle is heading to; step denotes the number of steps the electric vehicle has already traveled; penalty denotes a large penalty factor; w_i denotes the cost per unit time of the i-th electric vehicle; the symbols in Figure BDA0003494780970000072 and Figure BDA0003494780970000073 denote the travel time, the charging waiting time, and the time required for charging when the i-th electric vehicle heads to the k-th charging station at time t; λ_{k,t} denotes the marginal-cost electricity price at the node of charging station k at time t; SOC_{i,k,t} denotes the remaining charge of the i-th electric vehicle when it reaches charging station k at time t; the symbol in Figure BDA0003494780970000074 denotes the rated battery capacity of the i-th electric vehicle;

As the equation shows, the reward function r_{i,t} is piecewise. If the i-th electric vehicle has not reached a charging station (node_cur ≠ node_tar) and the number of steps taken so far is within the given maximum (step < N_step), the reward is r_{i,t} = 0. If the number of steps is greater than or equal to the given maximum (step ≥ N_step), this charging exploration has failed and the vehicle receives a large negative reward r_{i,t} = -penalty. If the i-th electric vehicle has reached a charging station (node_cur = node_tar) within the given maximum number of steps (step < N_step), the reward is computed from the vehicle's travel time (Figure BDA0003494780970000075), its charging time (Figure BDA0003494780970000076), and the electricity cost of charging;
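The three branches of the reward can be sketched as below. The exact expression for the arrival case is given only as an image, so the form used here (negative of time cost plus energy cost, with the energy purchased taken as (1 − SOC) times the rated capacity) is an assumption consistent with the quantities the text defines; the function and parameter names are hypothetical.

```python
def reward(node_cur, node_tar, step, n_step, penalty,
           w_i, t_drive, t_wait, t_charge, price, soc, capacity_kwh):
    """Piecewise reward for one vehicle (illustrative sketch).

    The arrival branch assumes cost = time cost + electricity cost,
    which is not confirmed by the source text.
    """
    if step >= n_step:
        return -penalty                  # exploration failed: large negative reward
    if node_cur != node_tar:
        return 0.0                       # still on the way, within the step limit
    time_cost = w_i * (t_drive + t_wait + t_charge)
    energy_cost = price * (1.0 - soc) * capacity_kwh
    return -(time_cost + energy_cost)    # arrived: negative total charging cost
```

Returning the negative cost means that maximizing the cumulative reward corresponds to minimizing the total charging cost, which is the stated goal of the method.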

The travel time t_{a,t} of the i-th electric vehicle on road segment a is computed with the U.S. Bureau of Public Roads (BPR) function, that is,

Figure BDA0003494780970000077

where n_{a,t} denotes the number of electric vehicles on road segment a at time t; c_a and the symbol in Figure BDA0003494780970000078 denote, respectively, the capacity limit of road segment a and the free-flow travel time of an electric vehicle at time t; from this the time required for the i-th electric vehicle to reach charging station k (Figure BDA0003494780970000081) can be obtained, that is,

Figure BDA0003494780970000082
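A sketch of the BPR travel time and the resulting route time follows. The coefficients 0.15 and 4 are the conventional BPR defaults and are an assumption here, since the patent's own equation is available only as an image; summing segment times along the route is likewise an assumed reading of the route-time equation.

```python
def bpr_time(free_flow_time, n_vehicles, capacity, alpha=0.15, beta=4):
    """BPR link travel time: t0 * (1 + alpha * (n / c) ** beta).

    alpha and beta are the conventional defaults (assumed, not from the source).
    """
    return free_flow_time * (1.0 + alpha * (n_vehicles / capacity) ** beta)


def route_time(segments):
    """Total time over a route given (free_flow_time, n_vehicles, capacity) per segment."""
    return sum(bpr_time(t0, n, c) for (t0, n, c) in segments)
```

An empty road yields exactly the free-flow time, and travel time grows steeply once the vehicle count approaches the segment capacity, which is what couples one vehicle's route choice to the choices of the others.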

In addition, the charging waiting time of the i-th electric vehicle (Figure BDA0003494780970000083) can be obtained from the following equation;

Figure BDA0003494780970000084

where SOC_t denotes the remaining charge of the electric vehicle; the symbol in Figure BDA0003494780970000085 denotes the rated capacity of the electric vehicle battery; η denotes the charging power factor; and P_charging denotes the rated charging power of the electric vehicle.
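The quantities defined above (remaining charge, rated capacity, η, and P_charging) suggest a time of the form (1 − SOC) · capacity / (η · P_charging). The equation itself is only available as an image, so the sketch below is written under that assumption, with a hypothetical function name and with units chosen for the example.

```python
def charging_time_hours(soc, capacity_kwh, eta, p_charging_kw):
    """Time to charge from `soc` (0..1) to full, in hours.

    Assumes time = (1 - SOC) * capacity / (eta * P_charging); this form is
    inferred from the defined symbols, not confirmed by the source.
    """
    return (1.0 - soc) * capacity_kwh / (eta * p_charging_kw)
```

For example, a 60 kWh battery at 50% charge on a 60 kW charger with η = 0.9 needs 30 kWh delivered at an effective 54 kW, a little over half an hour.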

In a preferred embodiment, updating the weights of the graph neural network reinforcement learning algorithm by stochastic gradient descent includes:

Step 61: Randomly draw a number of samples (Sample) from the memory unit D;

Step 62: Construct the loss function given by the first formula below and, on the drawn samples, update the weights of the graph neural network reinforcement learning algorithm by stochastic gradient descent as given by the second formula below;

Figure BDA0003494780970000086

In the formula, x, a, x' and a' are the current state, the current action, the next state, and the next action, respectively; r denotes the immediate reward of the graph neural network reinforcement learning; θ_t denotes the parameters of the graph neural network reinforcement learning algorithm at the current time t; 0 ≤ γ ≤ 1 is the discount factor, which reflects the influence of future Q values on the current action;
Figure BDA0003494780970000087
denotes the state-action value under the parameters θ′_t of the target graph neural network reinforcement learning algorithm;

Figure BDA0003494780970000088

In the formula, θ_t denotes the parameters of the graph neural network reinforcement learning algorithm at the current time t;
Figure BDA0003494780970000089
denotes the gradient with respect to θ_t; α denotes the learning rate;

Step 63: Every fixed number of steps, update the target parameters θ′_t of the graph neural network reinforcement learning algorithm from the current parameters θ_t.
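Steps 62 and 63 can be illustrated with a small tabular sketch. The patent's loss and update formulas are images, so the squared temporal-difference form used here (target r + γ·max Q under the target parameters) is an assumption based on the symbols defined above, and the tabular dictionaries stand in for the graph neural network. The function names are hypothetical.

```python
def td_target(reward, next_q_target, gamma):
    """Bootstrapped target r + gamma * max_a' Q(x', a'; theta')."""
    return reward + gamma * max(next_q_target)


def sgd_step(q_online, q_target, batch, alpha, gamma):
    """Apply one SGD update per sample to a tabular Q (assumed form).

    q_online, q_target: dict mapping state -> list of action values.
    batch: list of (x, a, r, x_next) tuples.
    Returns the mean squared TD error, each term measured just before
    its own update.
    """
    loss = 0.0
    for x, a, r, x_next in batch:
        y = td_target(r, q_target[x_next], gamma)
        err = y - q_online[x][a]
        loss += err ** 2
        # d(err^2)/dQ(x, a) = -2 * err, so stepping against the gradient adds.
        q_online[x][a] += alpha * 2.0 * err
    return loss / len(batch)
```

With a neural network the same target would be used, and the gradient would be taken with respect to the network weights instead of a table entry.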

Compared with the prior art, the present invention has the following beneficial effects:

The invention provides an electric vehicle charging guidance optimization method based on graph neural network reinforcement learning. Using graph theory, the mutual influence among electric vehicles is converted into a dynamic network graph structure, and an attention-based graph neural network reinforcement learning approach is proposed to handle irregular, non-Euclidean structured data, so as to study communication and cooperation among multiple agents and the interactions among electric vehicles. On the basis of an active distribution network that accounts for renewable generation output, the optimal power flow of the distribution network is solved by second-order cone optimization and duality theory to obtain the nodal marginal cost electricity prices, which supports the collaborative optimization of the integrated power-transportation network. While accounting for multiple sources of uncertainty in the integrated power-transportation network, the proposed method can effectively reduce the total cost of electric vehicle charging and achieve orderly charging of electric vehicles together with coordinated optimal dispatch of the power system.

Description of drawings

FIG. 1 is a flow chart of the electric vehicle charging guidance optimization method based on graph neural network reinforcement learning according to a preferred embodiment of the present invention.

Detailed description

The present invention is further described below with reference to the accompanying drawings and embodiments.

It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the present application. Unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.

It should be noted that the terminology used herein is for describing particular embodiments only and is not intended to limit the exemplary embodiments of the present application. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. It should further be understood that the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components, and/or combinations thereof.

As shown in FIG. 1, the electric vehicle charging guidance optimization method based on graph neural network reinforcement learning of the present invention includes the following steps:

S11: Initialize the collaborative optimization model of the integrated power-transportation network;

S12: Update the electric vehicle charging load, and compute the marginal cost electricity price at the nodes where the electric vehicle charging stations are located, based on second-order cone relaxation and duality theory;

S13: Generate the electric vehicle charging guidance action a_{i,t} according to the epsilon-greedy algorithm and the graph neural network reinforcement learning algorithm;

S14: Execute the charging guidance action a_{i,t}, and determine and update the state of the electric vehicle;

S15: Compute the reward function of the graph neural network reinforcement learning algorithm from the integrated power-transportation environment;

S16: Update the state x_{i,t} of the partially observable Markov decision process;

S17: Store the information of the current step (x_{i,t}, a_{i,t}, r_{i,t}, x_{i,t}') in the memory unit D, and update the weights of the graph neural network reinforcement learning algorithm by stochastic gradient descent;

S18: Determine whether the predetermined time T_end has been reached. If not, repeat steps S12 to S17; if so, output the parameters of the graph neural network reinforcement learning algorithm and the corresponding results.

Specifically:

1. Initialization of the collaborative optimization model of the integrated power-transportation network. The main step is to determine the topology and parameters of the power network and the transportation network. The power system parameters include nodes, lines, initial voltages, and the upper and lower limits used in the optimization; the transportation network parameters include traffic nodes, road parameters, capacity, and maximum driving speed.

Neural network initialization includes weight initialization and hyperparameter settings such as the learning rate α, the discount factor γ, the batch size B, and the capacity of the memory unit D;

Each electric vehicle in the study area is treated as an agent and as a node n ∈ N, and the connections between electric vehicles are treated as edges e ∈ E, forming the graph structure G = (N, E). The current state x_{i,t} of each electric vehicle i and the adjacency matrix A are then initialized.
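The graph construction can be sketched as follows. The text does not state when two electric vehicles count as connected, so the distance threshold used here is purely an assumption for illustration, as are the function name and the planar coordinates.

```python
import math


def build_adjacency(positions, radius):
    """Build a symmetric 0/1 adjacency matrix A for the vehicle graph.

    positions: list of (x, y) coordinates, one per electric vehicle.
    radius: hypothetical linking distance; vehicles closer than this
            are treated as neighbours (an assumption, not from the source).
    """
    n = len(positions)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1  # self-loop so each agent also sees its own features
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= radius:
                adj[i][j] = adj[j][i] = 1
    return adj
```

Because the vehicles move, a matrix like this would be rebuilt at every time step, which is what makes the graph structure dynamic.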

2. Update the electric vehicle charging load, and compute the marginal cost electricity price at the nodes where the electric vehicle charging stations are located, based on second-order cone relaxation and duality theory. This mainly includes the following steps:

Step 21: Update the electric vehicle charging load. The charging load of each charging station is computed from the number of electric vehicles in the station and the charging power; adding the base load of the node to the station charging load gives the final electrical load of that node;

Step 22: Establish the distribution network optimal power flow model based on the branch flow model:

min f(p, q, P, Q, V, I)    (1)

Figure BDA0003494780970000111

Figure BDA0003494780970000112

Figure BDA0003494780970000113

Figure BDA0003494780970000114

Figure BDA0003494780970000115

Figure BDA0003494780970000116

Figure BDA0003494780970000117

Figure BDA0003494780970000118

In the formula, E_N and E_L denote the sets of distribution network nodes and lines, respectively; P_ij and Q_ij denote the active and reactive power of the branch flowing from node i to node j;
Figure BDA0003494780970000119
and
Figure BDA00034947809700001110
denote the generator active and reactive outputs, i.e. the active and reactive power injected at node j;
Figure BDA00034947809700001111
and
Figure BDA00034947809700001112
denote the active and reactive power injected at node j by the wind turbine; r_ij and x_ij denote the resistance and reactance of the branch from node i to node j; I_ij denotes the current of the branch from node i to node j; π(j) denotes the set of branches connected to node j;
Figure BDA0003494780970000121
and
Figure BDA0003494780970000122
denote the active and reactive loads connected at node j; V_i denotes the voltage magnitude at node i; z_ij denotes the impedance of the branch connecting node i and node j, satisfying z_ij = r_ij + j·x_ij;
Figure BDA0003494780970000123
denotes the maximum current of the branch connecting node i and node j; V_j and
Figure BDA0003494780970000124
denote the minimum and maximum voltage at node j;
Figure BDA0003494780970000125
denotes the maximum active output of the wind turbine connected to node j;
Figure BDA0003494780970000126
denotes the power factor of the wind turbine connected to node j.

The load at distribution network node j,
Figure BDA0003494780970000127
consists of the base load
Figure BDA0003494780970000128
and the electric vehicle charging load
Figure BDA0003494780970000129
that is,

Figure BDA00034947809700001210
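The load update of Step 21 (node load equals base load plus charging load) can be sketched as below. The dictionary layout, the function name, and the use of a single rated charger power for every vehicle are assumptions made for the example.

```python
def update_node_loads(base_load_kw, station_node, evs_charging, charger_power_kw):
    """Return the total active load per node: base load + EV charging load.

    base_load_kw: dict node -> base load in kW.
    station_node: dict station id -> node the station is connected to.
    evs_charging: dict station id -> number of vehicles actually plugged in
                  (queued vehicles are assumed to draw no power).
    charger_power_kw: rated charging power per vehicle (assumed uniform).
    """
    loads = dict(base_load_kw)  # copy so the base profile is left untouched
    for station, node in station_node.items():
        loads[node] += evs_charging.get(station, 0) * charger_power_kw
    return loads
```

These node loads would then enter the optimal power flow model as the demand at each node.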

According to the actual requirements of the distribution network, the objective function min f(p, q, P, Q, V, I) can finally be defined as:

Figure BDA00034947809700001211

In the formula, a_i and b_i denote the quadratic and linear coal consumption coefficients of the generator, respectively;
Figure BDA00034947809700001212
and
Figure BDA00034947809700001213
denote the price and the active power of the electricity purchased from the main grid, respectively.

Step 23: Convert the above nonlinear distribution network optimal power flow model into a second-order cone relaxation programming model:

Because the branch flow model optimal power flow (BFM-OPF) is a nonlinear program, let
Figure BDA00034947809700001214
and
Figure BDA00034947809700001215
and apply a second-order cone relaxation (SOCR) to the formulas above, which yields the following model:

Figure BDA00034947809700001216

Figure BDA00034947809700001217

Figure BDA00034947809700001218

Figure BDA00034947809700001219

Figure BDA0003494780970000131

Figure BDA0003494780970000132

Figure BDA0003494780970000133

In the formula, ||·||_2 denotes the second-order cone (Euclidean norm) operation; the formulas above constitute the basic form of the relaxed distribution network optimal power flow.
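The relaxation in Step 23 can be checked numerically. The patent's formulas are images, so this sketch assumes the standard branch flow substitution l = I² and v = V², under which the relaxed constraint P² + Q² ≤ l·v is equivalent to the cone constraint ||(2P, 2Q, l − v)||_2 ≤ l + v. All names are hypothetical.

```python
import math


def cone_lhs(p, q, l, v):
    """Euclidean norm of the vector (2P, 2Q, l - v)."""
    return math.sqrt((2.0 * p) ** 2 + (2.0 * q) ** 2 + (l - v) ** 2)


def satisfies_cone(p, q, l, v, tol=1e-9):
    """True if ||(2P, 2Q, l - v)||_2 <= l + v (assumed standard form)."""
    return cone_lhs(p, q, l, v) <= l + v + tol


def relaxation_gap(p, q, l, v):
    """l*v - (P^2 + Q^2); zero means the relaxation is tight (exact)."""
    return l * v - (p * p + q * q)
```

A gap of zero at the optimum indicates that the relaxed solution also satisfies the original nonlinear equality, so it is physically meaningful.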

Step 24: Use the Gurobi solver to solve the primal problem of the above model and obtain its dual variables, which give the marginal cost electricity price λ_k at the node where each charging station is located.

3. Generate the electric vehicle charging guidance action a_{i,t} according to the epsilon-greedy algorithm and the graph neural network reinforcement learning algorithm. This mainly includes the following steps:

Step 31: Generate a random number u and compare it with the decay factor ξ of the epsilon-greedy algorithm.

Step 32: If u < ξ, randomly generate an action a_{i,t} for each electric vehicle in its current state; in this patent the action represents the charging route decision of the electric vehicle;

a_{i,t} = randint(N_action)    (19)

In the formula, N_action denotes the number of available actions for the electric vehicle.

Step 33: If u ≥ ξ, generate an action a_{i,t} for each electric vehicle i from the experience learned by the graph neural network reinforcement learning algorithm, given the current state x_{i,t} and the adjacency matrix A, that is,

Figure BDA0003494780970000134

In the formula, θ_t denotes the parameters of the graph neural network reinforcement learning algorithm; argmax() returns the argument that attains the maximum value; x_{i,t} denotes the state of the i-th electric vehicle at time t, which consists of the electric vehicle state EV_{i,t}, the neighboring road information Ro_{i,t}, the neighboring electric vehicle state Ne_{i,t}, and the charging station information CS_t, that is,

x_{i,t} = [EV_{i,t}, Ro_{i,t}, Ne_{i,t}, CS_t]    (21)

Figure BDA0003494780970000135

Figure BDA0003494780970000141

Figure BDA0003494780970000142

Figure BDA0003494780970000143

In the formula, the state EV_{i,t} of the i-th electric vehicle includes the next node on its way to the charging station
Figure BDA0003494780970000144
the road number
Figure BDA0003494780970000145
the driving speed v_{i,t}, and the remaining charge SOC_{i,t}; the neighboring road information Ro_{i,t} includes, for each road connected to the next node
Figure BDA0003494780970000146
of electric vehicle i, the start node
Figure BDA0003494780970000147
the end node
Figure BDA0003494780970000148
the road length
Figure BDA0003494780970000149
and the number of electric vehicles on the road
Figure BDA00034947809700001410
the neighboring electric vehicle state Ne_{i,t} includes the state of each neighboring electric vehicle k, namely the next node of the k-th electric vehicle near the i-th electric vehicle
Figure BDA00034947809700001411
the number of the road it is on
Figure BDA00034947809700001412
its driving speed v_{i,k,t}, and its remaining charge SOC_{i,k,t}; the charging station information CS_t includes the charging price p_{c,t} of each charging station and the number of electric vehicles
Figure BDA00034947809700001413
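The action selection of Steps 31 to 33 (random action when u < ξ, otherwise the argmax of the Q-values) can be sketched as follows. The function name and the `rng` parameter are hypothetical, and the Q-values are passed in as a plain list where the patent would evaluate the graph neural network on x_{i,t} and A.

```python
import random


def select_action(q_values, xi, rng=random):
    """Epsilon-greedy choice over N_action = len(q_values) actions.

    q_values: Q(x_{i,t}, a; theta_t) for each candidate action a.
    xi: exploration threshold (the decay factor of the text).
    """
    u = rng.random()  # uniform in [0, 1)
    if u < xi:
        # Exploration branch, corresponding to equation (19).
        return rng.randrange(len(q_values))
    # Exploitation branch: index of the largest Q-value.
    return max(range(len(q_values)), key=q_values.__getitem__)
```

Reducing ξ over the course of training shifts the behaviour from exploration toward exploitation; the decay schedule itself is not specified in the text.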

The neural network of the graph neural network reinforcement learning algorithm consists of an input layer; one fully connected layer that extracts features x_{i,t}' from the input state x_{i,t}; a two-layer graph neural network that takes the extracted features x_{i,t}' together with the adjacency matrix A and performs further feature extraction; and a final fully connected layer that outputs the electric vehicle charging route action a_{i,t}. The graph neural network used in this patent is a graph attention network.
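A single graph attention layer can be sketched in plain Python to show how the adjacency matrix A restricts which vehicles exchange information. This follows the usual single-head graph attention formulation (LeakyReLU scores, softmax over neighbours); the patent does not give layer sizes, head counts, or activation choices, so those details and all names are assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z


def gat_layer(features, adj, W, a_src, a_dst):
    """Single-head attention: h_i' = sum_j alpha_ij * (W h_j), j in N(i).

    features: list of node feature vectors h_j.
    adj: 0/1 adjacency matrix with self-loops (each row needs one nonzero).
    W: weight matrix as a list of rows; a_src, a_dst: attention vectors.
    The output nonlinearity is omitted to keep the sketch short.
    """
    z = [[dot(row, h) for row in W] for h in features]  # z_j = W h_j
    out = []
    for i, z_i in enumerate(z):
        nbrs = [j for j, linked in enumerate(adj[i]) if linked]
        # e_ij = LeakyReLU(a_src . z_i + a_dst . z_j)
        scores = [leaky_relu(dot(a_src, z_i) + dot(a_dst, z[j])) for j in nbrs]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * z[j][k] for w, j in zip(weights, nbrs))
                    for k in range(len(z_i))])
    return out
```

Stacking two such layers lets each vehicle incorporate information from the neighbours of its neighbours, which matches the two-layer structure described above.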

4. Execute the charging guidance action a_{i,t}, and determine and update the state of the electric vehicle. An electric vehicle is in one of three states: decision, running, or charging. If the electric vehicle has arrived at an intersection (node_cur = node_next) and the intersection is not the charging station node (node_cur ≠ node_tar), the vehicle is in the decision state: it executes the charging guidance action a_{i,t}; the road state (such as the number of electric vehicles and the ideal driving speed) is updated; and the vehicle state (such as its road position, driving speed, and distance) is updated. If the electric vehicle has not arrived at an intersection (node_cur ≠ node_next), it is in the running state: it continues along the current road according to the previous charging guidance action a_{i,t-1}, and its position, speed, and SOC are updated. If the node where the electric vehicle is located is the charging station node (node_cur = node_tar), it is in the charging state: if the number of electric vehicles at the station exceeds the number of charging piles, the vehicle must queue for charging; if a charging pile is available, the vehicle starts charging immediately. The charging waiting time, charging time, and SOC of the electric vehicle are then updated.
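The three-way state test described above can be written as a small function. The string labels and function names are hypothetical; the comparisons on node_cur, node_next, and node_tar follow the conditions in the text.

```python
def classify_ev(node_cur, node_next, node_tar):
    """Return 'charging', 'decision', or 'running' for one vehicle."""
    if node_cur == node_tar:
        return "charging"  # vehicle is at its target charging station node
    if node_cur == node_next:
        return "decision"  # at an intersection that is not the station
    return "running"       # still travelling along the current road


def vehicles_waiting(evs_at_station, n_piles):
    """Number of vehicles that must queue because all piles are busy."""
    return max(0, evs_at_station - n_piles)
```

Checking the charging condition first covers the case where the station node is also the next node of the route.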

5. Compute the reward function of the graph neural network reinforcement learning algorithm from the integrated power-transportation environment. Specifically, the reward function r_{i,t} is piecewise. If the i-th electric vehicle has not reached the charging station (node_cur ≠ node_tar) and its number of steps toward the charging station is within the given maximum (step < N_step), then r_{i,t} = 0. If the number of steps is greater than or equal to the given maximum (step ≥ N_step), the exploration of this charging trip has failed and a large negative reward r_{i,t} = -penalty is given. If the i-th electric vehicle has reached the charging station (node_cur = node_tar) within the given maximum number of steps (step < N_step), the reward is computed from the driving time
Figure BDA0003494780970000151
the charging waiting time
Figure BDA0003494780970000152
the charging time
Figure BDA0003494780970000153
and the electricity cost of charging, as given by the following expression.

Figure BDA0003494780970000154
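The piecewise structure of the reward can be sketched as below. The three cases follow the text; the exact expression for the arrival case is an image in the patent, so the negative weighted sum used here, including the hypothetical `time_value` weight that converts time into cost, is an assumption.

```python
def reward(arrived, step, n_step, penalty,
           t_drive=0.0, t_wait=0.0, t_charge=0.0,
           energy_cost=0.0, time_value=1.0):
    """Piecewise reward r_{i,t} for one electric vehicle (assumed form).

    arrived: True if node_cur == node_tar.
    step, n_step: steps taken so far and the allowed maximum N_step.
    time_value: hypothetical cost per unit of time.
    """
    if step >= n_step:
        return -penalty  # exploration failed: large negative reward
    if not arrived:
        return 0.0       # still on the way, within the step budget
    # Arrived in time: lower total time and cost give a higher reward.
    return -(time_value * (t_drive + t_wait + t_charge) + energy_cost)
```

Under this assumed form, maximizing the reward amounts to minimizing the combined time and electricity cost of a charging trip.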

The driving time
Figure BDA0003494780970000155
the charging waiting time
Figure BDA0003494780970000156
and the charging time
Figure BDA0003494780970000157
are computed by the expressions that follow.

The travel time of the i-th electric vehicle on road segment a is computed with the U.S. Bureau of Public Roads (BPR) function, that is,

Figure BDA0003494780970000158

In the formula, n_{a,t} denotes the number of electric vehicles on road segment a at time t; c_a and
Figure BDA0003494780970000159
denote the capacity limit of road segment a and the free-flow travel time of electric vehicles at time t, respectively. From this, the time required for the i-th electric vehicle to travel to charging station k,
Figure BDA00034947809700001510
is obtained as

Figure BDA00034947809700001511
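The travel-time calculation can be sketched as below. The patent's formula is an image; the coefficients α = 0.15 and β = 4 are the conventional BPR defaults and are assumed here, as is the representation of a route as a list of road segments whose times are summed.

```python
def bpr_travel_time(n_vehicles, capacity, free_flow_time, alpha=0.15, beta=4.0):
    """BPR link time: t0 * (1 + alpha * (n / c) ** beta).

    n_vehicles: number of vehicles on the segment (n_{a,t}).
    capacity: capacity limit of the segment (c_a).
    free_flow_time: travel time on the empty segment.
    """
    return free_flow_time * (1.0 + alpha * (n_vehicles / capacity) ** beta)


def route_travel_time(route):
    """Sum of link times over a route given as (n, c, t0) tuples."""
    return sum(bpr_travel_time(n, c, t0) for n, c, t0 in route)
```

For example, a segment loaded exactly to capacity takes about 15% longer than its free-flow time under these default coefficients.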

In addition, the charging waiting time of the i-th electric vehicle
Figure BDA0003494780970000161
can be obtained from the following formula.

Figure BDA0003494780970000162

In the formula, SOC_t denotes the remaining charge of the electric vehicle;
Figure BDA0003494780970000163
denotes the rated capacity of the electric vehicle battery; η denotes the charging power factor; and P_charging denotes the rated charging power of the electric vehicle.
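The quantities defined above (remaining charge, rated battery capacity, charging power factor, rated charging power) are the ingredients of an energy-deficit-over-power expression. Since the formula itself is an image, the sketch below is only one plausible reading: it assumes the vehicle charges up to a target state of charge, and `soc_target` is a hypothetical parameter not named in the source.

```python
def charging_duration_h(soc, capacity_kwh, eta, p_charging_kw, soc_target=1.0):
    """Hours needed to raise the battery from soc to soc_target (assumed form).

    soc: current state of charge as a fraction in [0, 1].
    capacity_kwh: rated battery capacity.
    eta: charging power factor applied to the rated power.
    p_charging_kw: rated charging power.
    """
    deficit_kwh = max(0.0, soc_target - soc) * capacity_kwh
    return deficit_kwh / (eta * p_charging_kw)
```

For a vehicle that has to queue, one simple approach would be to add up such durations for the vehicles ahead of it on the same pile; the text does not spell this out.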

6. Update the state x_{i,t} of the partially observable Markov decision process, including the electric vehicle state EV_{i,t}, the neighboring road information Ro_{i,t}, the neighboring electric vehicle state Ne_{i,t}, and the charging station information CS_t.

7. Store the information of the current step (x_{i,t}, a_{i,t}, r_{i,t}, x_{i,t}') in the memory unit D, and update the weights of the graph neural network reinforcement learning algorithm by stochastic gradient descent. This mainly includes the following steps:

Step 71: Randomly draw a number of samples (Sample) from the memory unit D;

Step 72: Construct the loss function given by the first formula below and, on the drawn samples, update the weights of the graph neural network reinforcement learning algorithm by stochastic gradient descent as given by the second formula below;

Figure BDA0003494780970000164

In the formula, x, a, x' and a' are the current state, the current action, the next state, and the next action, respectively; θ_t denotes the parameters of the graph neural network reinforcement learning algorithm at the current time t; 0 ≤ γ ≤ 1 is the discount factor, which reflects the influence of future Q values on the current action;
Figure BDA0003494780970000165
denotes the state-action value under the parameters θ′_t of the target graph neural network reinforcement learning algorithm.

Figure BDA0003494780970000166

In the formula, θ_t denotes the parameters of the graph neural network reinforcement learning algorithm at the current time t;
Figure BDA0003494780970000167
denotes the gradient with respect to θ_t; α denotes the learning rate.

Step 73: Every fixed number of steps, update the target parameters θ′_t of the graph neural network reinforcement learning algorithm from the current parameters θ_t.
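The memory unit D of Step 71 and the periodic target update of Step 73 can be sketched as follows. The class and function names, the fixed-capacity behaviour (oldest entries dropped first), and the `sync_every` interval are assumptions for illustration; the text only says that samples are drawn at random and that the target parameters are refreshed every fixed number of steps.

```python
import copy
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity store of (x, a, r, x_next) transitions."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest entries are dropped

    def store(self, x, a, r, x_next):
        self.buffer.append((x, a, r, x_next))

    def sample(self, batch_size, rng=random):
        """Draw up to batch_size distinct transitions uniformly at random."""
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)


def maybe_sync_target(step, sync_every, online_params, target_params):
    """Return a copy of the online parameters every sync_every steps."""
    if step % sync_every == 0:
        return copy.deepcopy(online_params)
    return target_params
```

Keeping the target parameters fixed between synchronizations keeps the bootstrapped targets stable while the online parameters change.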

8. Determine whether the predetermined time T_end has been reached. If not, repeat steps 2 to 7; if so, output the parameters of the graph neural network reinforcement learning algorithm and the corresponding results.

The present invention is an electric vehicle charging guidance optimization method based on graph neural network reinforcement learning. Using graph theory, the mutual influence among electric vehicles is converted into a dynamic network graph structure, and an attention-based graph neural network reinforcement learning approach is proposed to handle irregular, non-Euclidean structured data, so as to study communication and cooperation among multiple agents and the interactions among electric vehicles. On the basis of an active distribution network that accounts for renewable generation output, the optimal power flow of the distribution network is solved by second-order cone optimization and duality theory to obtain the nodal marginal cost electricity prices, which supports the collaborative optimization of the integrated power-transportation network. While accounting for multiple sources of uncertainty in the integrated power-transportation network, the proposed method can effectively reduce the total cost of electric vehicle charging and achieve orderly charging of electric vehicles together with coordinated optimal dispatch of the power system.

The embodiments described above represent only several implementations of the present invention. Their description is specific and detailed, but it should not be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.

Claims (6)

1. An electric vehicle charging guidance optimization method based on graph neural network reinforcement learning, characterized by comprising the following steps:
Step S1: initialize the collaborative optimization model of the coupled power-transportation network;
Step S2: update the electric vehicle charging load, and compute the marginal-cost electricity price at each node hosting an electric vehicle charging station, based on second-order cone relaxation optimization and duality theory;
Step S3: generate the electric vehicle charging guidance action a_{i,t} according to the epsilon-greedy algorithm and the graph neural network reinforcement learning algorithm;
Step S4: execute the charging guidance action a_{i,t}, then check and update the state of the electric vehicle;
Step S5: compute the reward r_{i,t} of the graph neural network reinforcement learning algorithm from the power-transportation environment;
Step S6: update the state x_{i,t} of the partially observable Markov decision process;
Step S7: store the information of the current step (x_{i,t}, a_{i,t}, r_{i,t}, x'_{i,t}) in the memory unit D, and update the weights of the graph neural network reinforcement learning algorithm by stochastic gradient descent; where x_{i,t} denotes the current state of the graph neural network reinforcement learning; a_{i,t} denotes the electric vehicle action; r_{i,t} denotes the reward value; and x'_{i,t} denotes the next state;
Step S8: determine whether the predetermined time T_end has been reached; if not, repeat steps S2 to S7; if so, output the parameters of the graph neural network reinforcement learning algorithm and the corresponding results.
2. The electric vehicle charging guidance optimization method based on graph neural network reinforcement learning according to claim 1, characterized in that initializing the collaborative optimization model of the coupled power-transportation network comprises the following steps:
Step 21: determine the topology and parameters of the power network and the transportation network; the power network data include the power system nodes, lines, initial voltages, and the upper and lower limits used in the optimization, and the transportation network data include the traffic nodes, road parameters, capacities, and maximum travel speeds;
Step 22: initialize the neural network parameters, including the network weights and the hyperparameters, such as the learning rate α, the discount factor γ, the batch size B, and the capacity of the memory unit D;
Step 23: treat each electric vehicle in the study area as an agent and as a node n ∈ N, and treat the connections between electric vehicles as edges e ∈ E, which forms the graph structure G = (N, E); initialize the current state x_{i,t} of each electric vehicle i and the adjacency matrix A.
3. The electric vehicle charging guidance optimization method based on graph neural network reinforcement learning according to claim 1, characterized in that updating the electric vehicle charging load and computing the marginal-cost electricity price at the nodes hosting electric vehicle charging stations, based on second-order cone relaxation optimization and duality theory, comprises:
Step 31: update the electric vehicle charging load: compute the charging load of each charging station from the number of electric vehicles at the station and the charging power, then add the base load of the node to obtain the node's final electrical load;
Step 32: establish the optimal power flow model of the distribution network based on the branch flow model:
min f(p,q,P,Q,V,I) (1)
s.t.
Figure FDA0003494780960000021
Figure FDA0003494780960000022
Figure FDA0003494780960000023
Figure FDA0003494780960000024
Figure FDA0003494780960000025
Figure FDA0003494780960000026
Figure FDA0003494780960000027
Figure FDA0003494780960000028
where E_N and E_L denote the sets of distribution network nodes and lines, respectively; P_ij and Q_ij denote the active and reactive power on the branch from node i to node j; P_jk denotes the active power on the branch from node j to node k;
Figure FDA0003494780960000031
and
Figure FDA0003494780960000032
denote the active and reactive output of the generator, i.e. the active and reactive power injected at node j;
Figure FDA0003494780960000033
and
Figure FDA0003494780960000034
denote the active and reactive power injected at node j by the wind turbine; Q_js denotes the reactive power on the branch from node j to node s; r_ij and x_ij denote the resistance and reactance of the branch from node i to node j; I_ij denotes the current on the branch from node i to node j; π(j) denotes the set of branches connected to node j;
Figure FDA0003494780960000035
and
Figure FDA0003494780960000036
denote the active and reactive load connected at node j; V_i denotes the voltage magnitude at node i; V_j denotes the voltage magnitude at node j; z_ij denotes the impedance of the branch connecting node i and node j, with z_ij = r_ij + j x_ij;
Figure FDA0003494780960000037
denotes the maximum current on the branch connecting node i and node j; \underline{V}_j and
Figure FDA0003494780960000038
denote the minimum and maximum voltage at node j;
Figure FDA0003494780960000039
denotes the maximum active output of the wind turbine connected to node j;
Figure FDA00034947809600000310
denotes the power factor of the wind turbine connected to node j;
The load at distribution network node j,
Figure FDA00034947809600000311
comprises the base load
Figure FDA00034947809600000312
and the electric vehicle charging load
Figure FDA00034947809600000313
that is,
Figure FDA00034947809600000314
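The load update described in Step 31 can be sketched as follows. This is a minimal illustration that assumes every vehicle at a station draws the same charging power; the function name `node_load_kw` and the sample numbers are hypothetical and not taken from the patent.

```python
def node_load_kw(base_load_kw, n_vehicles, charging_power_kw):
    """Total active load at a node: base load plus EV charging load.

    Assumes each of the n_vehicles charging at the station draws the same
    charging_power_kw (an illustrative simplification).
    """
    if n_vehicles < 0:
        raise ValueError("n_vehicles must be non-negative")
    ev_load_kw = n_vehicles * charging_power_kw
    return base_load_kw + ev_load_kw


# Hypothetical node: 500 kW base load, 8 vehicles charging at 60 kW each.
total = node_load_kw(500.0, 8, 60.0)
print(total)  # 980.0
```

The result is the nodal load that would enter the power flow constraints in the next iteration of Step S2.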
According to the actual requirements of the distribution network, the objective function min f(p,q,P,Q,V,I) is finally defined as:
Figure FDA00034947809600000315
where a_i and b_i denote the quadratic and linear coal-consumption coefficients of the generator, respectively;
Figure FDA00034947809600000316
denotes the active output of the generator injected at node i;
Figure FDA00034947809600000317
and
Figure FDA00034947809600000318
denote the price and the active power of the electricity purchased from the main grid, respectively;
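One way to read this objective is as generation cost plus the cost of energy bought from the main grid. The sketch below assumes the usual quadratic form a_i·p² + b_i·p for each generator and a single price for main-grid purchases. The original formula is only available as an image, so this form, the function name `operating_cost`, and the numbers are assumptions.

```python
def operating_cost(gen_outputs, quad_coeffs, lin_coeffs, grid_price, grid_power):
    """Generation cost (quadratic in output) plus main-grid purchase cost.

    gen_outputs : active output of each generator
    quad_coeffs : assumed quadratic coefficients a_i
    lin_coeffs  : assumed linear coefficients b_i
    """
    if not (len(gen_outputs) == len(quad_coeffs) == len(lin_coeffs)):
        raise ValueError("one (a_i, b_i) pair is needed per generator")
    generation = sum(a * p * p + b * p
                     for p, a, b in zip(gen_outputs, quad_coeffs, lin_coeffs))
    purchase = grid_price * grid_power
    return generation + purchase


# Two hypothetical generators plus 3 units bought from the main grid at price 50.
cost = operating_cost([2.0, 1.0], [0.5, 1.0], [10.0, 20.0], 50.0, 3.0)
print(cost)  # 193.0
```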
Step 33: convert the above nonlinear optimal power flow model of the distribution network into a second-order cone relaxation programming model:
Since the BFM-OPF is a nonlinear programming model, introduce the branch current magnitude variable
Figure FDA00034947809600000319
and the branch voltage magnitude variable
Figure FDA00034947809600000320
and apply a second-order cone relaxation (SOCR) transformation to the model, which yields:
Figure FDA00034947809600000321
s.t.
Figure FDA00034947809600000322
Figure FDA0003494780960000041
Figure FDA0003494780960000042
Figure FDA0003494780960000043
Figure FDA0003494780960000044
Figure FDA0003494780960000045
where ||·||_2 denotes the second-order cone operation (the Euclidean norm); the above equations constitute the basic form of the relaxed optimal power flow of the distribution network;
Step 34: use the Gurobi solver to solve the primal problem and the dual variables of the above model, and obtain the marginal-cost electricity price λ_k at the node where each charging station is located.
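In the branch flow literature, the relaxation usually replaces the equality l·v = P² + Q² (with l the squared branch current and v the squared voltage) by the inequality P² + Q² ≤ l·v, written as the cone constraint ||[2P, 2Q, l - v]||_2 ≤ l + v. The patent's own constraints are only available as images, so the sketch below illustrates that standard form rather than the exact equations of the claim; all function and variable names are hypothetical.

```python
import math


def soc_constraint_holds(p, q, l, v, tol=1e-9):
    """Check the standard cone form ||[2p, 2q, l - v]||_2 <= l + v.

    For l + v >= 0 this is algebraically equivalent to p^2 + q^2 <= l * v.
    """
    norm = math.sqrt((2 * p) ** 2 + (2 * q) ** 2 + (l - v) ** 2)
    return norm <= l + v + tol


def relaxation_gap(p, q, l, v):
    """l*v - (p^2 + q^2); a value of zero means the relaxation is tight."""
    return l * v - (p * p + q * q)


# Tight point: p^2 + q^2 equals l*v, so the original equality is recovered.
print(soc_constraint_holds(0.3, 0.4, 0.25, 1.0))  # True
# Infeasible point: l is too small for the given power flow.
print(soc_constraint_holds(0.3, 0.4, 0.20, 1.0))  # False
```

A solution of the relaxed model is physically meaningful only when the gap is (numerically) zero on every branch, which is why the gap is worth checking after solving.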
4. The electric vehicle charging guidance optimization method based on graph neural network reinforcement learning according to claim 3, characterized in that the epsilon-greedy algorithm comprises the following steps:
Step 41: generate a random number u and compare it with the decay factor ξ of the epsilon-greedy algorithm;
Step 42: if u < ξ, randomly generate an action a_{i,t} for each electric vehicle in its current state; in this patent the action represents the electric vehicle charging route strategy:
a_{i,t} = randint(N_action) (19)
where N_action denotes the number of actions available to the electric vehicle;
Step 43: if u ≥ ξ, generate an action a_{i,t} for each electric vehicle i under the current state x_{i,t} and the adjacency matrix A from the experience of the graph neural network reinforcement learning algorithm, that is,
Figure FDA0003494780960000046
where θ_t denotes the parameters of the graph neural network reinforcement learning algorithm; argmax() returns the argument that attains the maximum; x_{i,t} denotes the state of the i-th electric vehicle at time t, which consists of the electric vehicle state EV_{i,t}, the neighboring road information Ro_{i,t}, the neighboring electric vehicle state Ne_{i,t}, and the charging station information CS_t, that is,
x_{i,t} = [EV_{i,t}, Ro_{i,t}, Ne_{i,t}, CS_t] (21)
Figure FDA0003494780960000051
Figure FDA0003494780960000052
Figure FDA0003494780960000053
Figure FDA0003494780960000054
where the state EV_{i,t} of the i-th electric vehicle includes the next node on its way to the charging station
Figure FDA0003494780960000055
the road number
Figure FDA0003494780960000056
the driving speed v_{i,t} and the remaining charge SOC_{i,t}; the neighboring road information Ro_{i,t} includes, for the next road connected to the next node
Figure FDA0003494780960000057
of electric vehicle i, the start node
Figure FDA0003494780960000058
the end node
Figure FDA0003494780960000059
the road length
Figure FDA00034947809600000510
and the number of electric vehicles on the road
Figure FDA00034947809600000511
the neighboring electric vehicle state Ne_{i,t} includes the state of each neighboring electric vehicle k, such as the next node of the k-th electric vehicle near the i-th electric vehicle
Figure FDA00034947809600000512
the number of the road it is on
Figure FDA00034947809600000513
its driving speed v_{i,k,t} and its remaining charge SOC_{i,k,t}; the charging station information CS_t includes the charging price p_{c,t} of each charging station and the number of electric vehicles
Figure FDA00034947809600000514
The neural network of the graph neural network reinforcement learning algorithm comprises one input layer; one fully connected layer that extracts features x'_{i,t} from the input state x_{i,t}; two graph neural network layers that take the extracted features x'_{i,t} together with the adjacency matrix A and extract further features; and a final fully connected layer that outputs the electric vehicle charging route strategy a_{i,t}; the graph neural network used is a graph attention network.
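Steps 41 to 43 can be sketched in a few lines. The example assumes the Q-values for one vehicle have already been produced by the network described above and are passed in as a plain list; the function name `epsilon_greedy` and its parameters are hypothetical.

```python
import random


def epsilon_greedy(q_values, xi, rng=random):
    """Pick an action index: random with probability xi, greedy otherwise.

    q_values : list of Q-values, one per action (assumed to come from the network)
    xi       : exploration threshold (the decay factor of the claim)
    rng      : any object with random() and randrange(), e.g. random.Random(seed)
    """
    n_action = len(q_values)
    u = rng.random()                      # Step 41: random number in [0, 1)
    if u < xi:                            # Step 42: explore with a random action
        return rng.randrange(n_action)
    # Step 43: exploit, i.e. take the argmax over the Q-values
    return max(range(n_action), key=lambda a: q_values[a])


# With xi = 0 the choice is always greedy, so the second action is selected.
print(epsilon_greedy([0.1, 0.9, 0.3], xi=0.0))  # 1
```

In training, ξ would typically be reduced over time so that exploration gives way to exploitation; the claim does not specify the schedule, so none is shown here.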
5. The electric vehicle charging guidance optimization method based on graph neural network reinforcement learning according to claim 2, characterized in that the reward function r_{i,t} of the graph neural network reinforcement learning algorithm is given by:
Figure FDA00034947809600000515
Figure FDA0003494780960000061
where node_cur and node_tar denote the node where the electric vehicle is currently located and any charging station node it is heading to; step denotes the number of steps the electric vehicle has already traveled; penalty denotes a large penalty factor; w_i denotes the cost per unit time of the i-th electric vehicle;
Figure FDA0003494780960000062
and
Figure FDA0003494780960000063
denote the travel time, the charging waiting time, and the time required for charging of the i-th electric vehicle heading to the k-th charging station at time t; λ_{k,t} denotes the marginal-cost electricity price at the node of charging station k at time t; SOC_{i,k,t} denotes the remaining charge of the i-th electric vehicle when it reaches charging station k at time t;
Figure FDA0003494780960000064
denotes the rated battery capacity of the i-th electric vehicle;
As the formula shows, the reward function r_{i,t} is piecewise. If the i-th electric vehicle has not reached a charging station (node_cur ≠ node_tar) and its step count is still within the given maximum (step < N_step), the reward is r_{i,t} = 0. If its step count reaches or exceeds the given maximum (step ≥ N_step), the charging exploration has failed and a large negative reward r_{i,t} = -penalty is given. If the i-th electric vehicle reaches a charging station (node_cur = node_tar) within the given maximum number of steps (step < N_step), the reward is computed from the travel time
Figure FDA0003494780960000065
the charging time
Figure FDA0003494780960000066
and the electricity cost of charging;
The travel time t_{a,t} of the i-th electric vehicle on road segment a is calculated with the U.S. Bureau of Public Roads (BPR) function, that is,
Figure FDA0003494780960000067
where n_{a,t} denotes the number of electric vehicles on road segment a at time t; c_a and
Figure FDA0003494780960000068
denote the capacity limit of road segment a and the free-flow travel time of electric vehicles at time t, respectively; the time required for the i-th electric vehicle to reach charging station k,
Figure FDA0003494780960000069
then follows as
Figure FDA00034947809600000610
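The travel-time computation can be illustrated with the commonly used form of the BPR function, t = t0·(1 + α·(n/c)^β). The patent's own formula is only available as an image, so the coefficients α = 0.15 and β = 4 below are the conventional defaults and an assumption here, as are the function names and sample values.

```python
def bpr_travel_time(free_flow_time, n_vehicles, capacity, alpha=0.15, beta=4.0):
    """Travel time on one road segment using the common BPR form.

    alpha and beta are the conventional defaults, assumed for illustration.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + alpha * (n_vehicles / capacity) ** beta)


def route_travel_time(segments):
    """Sum BPR times over a route given (free_flow_time, n_vehicles, capacity) tuples."""
    return sum(bpr_travel_time(t0, n, c) for t0, n, c in segments)


# Hypothetical route: an empty segment followed by one loaded to its capacity.
route = [(10.0, 0, 100), (10.0, 100, 100)]
print(route_travel_time(route))
```

An empty segment costs exactly its free-flow time, while a segment at capacity costs 15% more under the assumed coefficients, which is how congestion feeds into the reward.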
In addition, the charging waiting time of the i-th electric vehicle,
Figure FDA00034947809600000611
is obtained from the following formula:
Figure FDA0003494780960000071
where SOC_t denotes the remaining charge of the electric vehicle;
Figure FDA0003494780960000072
denotes the rated capacity of the electric vehicle battery; η denotes the charging power factor; and P_charging denotes the rated charging power of the electric vehicle.
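The piecewise reward of this claim can be sketched as below. The reward and time formulas are only available as images, so the concrete expressions are assumptions reconstructed from the symbol definitions: the time is taken as (1 - SOC)·E/(η·P_charging), and the arrival reward as the negative of time cost plus energy cost. All function and parameter names are hypothetical.

```python
def charging_time_h(soc, capacity_kwh, eta, p_charging_kw):
    """Assumed form: energy still needed divided by effective charging power."""
    return (1.0 - soc) * capacity_kwh / (eta * p_charging_kw)


def reward(node_cur, node_tar, step, n_step, penalty, w_i,
           t_drive_h, t_wait_h, t_charge_h, price, soc, capacity_kwh):
    """Piecewise reward following the three cases described in the claim."""
    if step >= n_step:
        return -penalty                   # exploration failed: step limit reached
    if node_cur != node_tar:
        return 0.0                        # still on the way, within the step limit
    # Arrived in time: assumed cost = time cost + cost of the energy to be charged.
    time_cost = w_i * (t_drive_h + t_wait_h + t_charge_h)
    energy_cost = price * (1.0 - soc) * capacity_kwh
    return -(time_cost + energy_cost)


# Hypothetical arrival: 1.5 h in total at 20 per hour, 30 kWh to charge at 0.5 per kWh.
r = reward(7, 7, step=4, n_step=10, penalty=1000.0, w_i=20.0,
           t_drive_h=0.5, t_wait_h=0.25, t_charge_h=0.75,
           price=0.5, soc=0.5, capacity_kwh=60.0)
print(r)  # -45.0
```

Because the arrival reward is the negative of a cost, maximizing it steers vehicles toward stations that are quick to reach and cheap at the current nodal price.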
6. The electric vehicle charging guidance optimization method based on graph neural network reinforcement learning according to claim 1, characterized in that updating the weights of the graph neural network reinforcement learning algorithm by stochastic gradient descent comprises:
Step 61: randomly draw a given number of samples Sample from the memory unit D;
Step 62: construct the loss function shown in the first formula below, and update the weights of the graph neural network reinforcement learning algorithm on the drawn samples by stochastic gradient descent, as shown in the second formula;
Figure FDA0003494780960000073
where x, a, x' and a' are the current state and action and the next state and action, respectively; r denotes the immediate reward of the graph neural network reinforcement learning; θ_t denotes the parameters of the graph neural network reinforcement learning algorithm at the current time t; 0 ≤ γ ≤ 1 is the discount factor, which reflects the influence of future Q-values on the current action;
Figure FDA0003494780960000074
denotes the state-action value under the target network parameters θ'_t;
Figure FDA0003494780960000075
where θ_t denotes the parameters of the graph neural network reinforcement learning algorithm at the current time t;
Figure FDA0003494780960000076
denotes the gradient with respect to θ_t; α denotes the learning rate;
Step 63: every fixed number of steps, update the target network parameters θ'_t from the current network parameters θ_t.
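Steps 61 to 63 match the usual deep Q-learning update with a target network. The sketch below assumes the standard target y = r + γ·max_a' Q(x', a'; θ') and a mean squared error loss, since the patent's formulas are only available as images. It works on plain lists rather than a real network, and every name in it is hypothetical.

```python
def td_targets(rewards, next_q_target, gamma):
    """y = r + gamma * max over a' of Q(x', a'; theta'), one target per sample."""
    return [r + gamma * max(q_next) for r, q_next in zip(rewards, next_q_target)]


def mse_loss(q_taken, targets):
    """Mean squared TD error over the sampled batch."""
    return sum((y - q) ** 2 for q, y in zip(q_taken, targets)) / len(targets)


def sgd_step(theta, grad, alpha):
    """theta <- theta - alpha * gradient of the loss with respect to theta."""
    return [w - alpha * g for w, g in zip(theta, grad)]


def maybe_sync_target(step, sync_every, theta, theta_target):
    """Step 63: copy the online parameters into the target every sync_every steps."""
    return list(theta) if step % sync_every == 0 else theta_target


# Hypothetical batch of two transitions with two actions each.
targets = td_targets([0.0, -45.0], [[1.0, 2.0], [0.0, -1.0]], gamma=0.5)
print(targets)                            # [1.0, -45.0]
print(mse_loss([0.5, -44.0], targets))    # 0.625
```

Keeping the target parameters fixed between syncs stabilizes the regression target, which is the reason for the separate θ'_t in Step 63.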
CN202210109887.2A 2022-01-29 2022-01-29 Electric vehicle charging guide optimization method based on graph neural network reinforcement learning Active CN114444802B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210109887.2A CN114444802B (en) 2022-01-29 2022-01-29 Electric vehicle charging guide optimization method based on graph neural network reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210109887.2A CN114444802B (en) 2022-01-29 2022-01-29 Electric vehicle charging guide optimization method based on graph neural network reinforcement learning

Publications (2)

Publication Number Publication Date
CN114444802A true CN114444802A (en) 2022-05-06
CN114444802B CN114444802B (en) 2024-06-04

Family

ID=81372174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210109887.2A Active CN114444802B (en) 2022-01-29 2022-01-29 Electric vehicle charging guide optimization method based on graph neural network reinforcement learning

Country Status (1)

Country Link
CN (1) CN114444802B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115016938A (en) * 2022-06-09 2022-09-06 北京邮电大学 An automatic partitioning method of computational graph based on reinforcement learning
CN116436019A (en) * 2023-04-12 2023-07-14 国网江苏省电力有限公司电力科学研究院 A multi-resource coordination optimization method, device and storage medium
CN118098000A (en) * 2024-04-24 2024-05-28 哈尔滨华鲤跃腾科技有限公司 Urban comprehensive management method based on artificial intelligence
CN118438918A (en) * 2024-05-09 2024-08-06 烟台开发区德联软件有限责任公司 A mobile energy storage device charging control method and device
WO2024165229A1 (en) * 2023-02-08 2024-08-15 E.On Se Edge computing with ai at mains supply network interconnection points
CN119231508A (en) * 2024-09-26 2024-12-31 北京智芯微电子科技有限公司 A method, device and system for orderly charging control of distribution network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110570050A (en) * 2019-09-25 2019-12-13 国网浙江省电力有限公司经济技术研究院 A charging guidance method for electric vehicles considering road-network-vehicle
TWI687785B (en) * 2019-02-25 2020-03-11 華碩電腦股份有限公司 Method of returning to charging station
CN111934335A (en) * 2020-08-18 2020-11-13 华北电力大学 Cluster electric vehicle charging behavior optimization method based on deep reinforcement learning
WO2021143075A1 (en) * 2020-01-17 2021-07-22 南京东博智慧能源研究院有限公司 Demand response method taking space-time distribution of electric vehicle charging loads into consideration
CN113159578A (en) * 2021-04-22 2021-07-23 杭州电子科技大学 Charging optimization scheduling method of large-scale electric vehicle charging station based on reinforcement learning
CN113515884A (en) * 2021-04-19 2021-10-19 国网上海市电力公司 Distributed electric vehicle real-time optimization scheduling method, system, terminal and medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI687785B (en) * 2019-02-25 2020-03-11 華碩電腦股份有限公司 Method of returning to charging station
CN110570050A (en) * 2019-09-25 2019-12-13 国网浙江省电力有限公司经济技术研究院 A charging guidance method for electric vehicles considering road-network-vehicle
WO2021143075A1 (en) * 2020-01-17 2021-07-22 南京东博智慧能源研究院有限公司 Demand response method taking space-time distribution of electric vehicle charging loads into consideration
CN111934335A (en) * 2020-08-18 2020-11-13 华北电力大学 Cluster electric vehicle charging behavior optimization method based on deep reinforcement learning
CN113515884A (en) * 2021-04-19 2021-10-19 国网上海市电力公司 Distributed electric vehicle real-time optimization scheduling method, system, terminal and medium
CN113159578A (en) * 2021-04-22 2021-07-23 杭州电子科技大学 Charging optimization scheduling method of large-scale electric vehicle charging station based on reinforcement learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
夏冬: "多信息融合下电动汽车充电路径规划", 电测与仪器, vol. 57, no. 22, 25 December 2019 (2019-12-25), pages 24 - 32 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115016938A (en) * 2022-06-09 2022-09-06 北京邮电大学 An automatic partitioning method of computational graph based on reinforcement learning
WO2024165229A1 (en) * 2023-02-08 2024-08-15 E.On Se Edge computing with ai at mains supply network interconnection points
CN116436019A (en) * 2023-04-12 2023-07-14 国网江苏省电力有限公司电力科学研究院 A multi-resource coordination optimization method, device and storage medium
CN116436019B (en) * 2023-04-12 2024-01-23 国网江苏省电力有限公司电力科学研究院 A multi-resource coordination and optimization method, device and storage medium
CN118098000A (en) * 2024-04-24 2024-05-28 哈尔滨华鲤跃腾科技有限公司 Urban comprehensive management method based on artificial intelligence
CN118438918A (en) * 2024-05-09 2024-08-06 烟台开发区德联软件有限责任公司 A mobile energy storage device charging control method and device
CN118438918B (en) * 2024-05-09 2024-12-31 烟台开发区德联软件有限责任公司 Mobile energy storage device charging regulation and control method and device
CN119231508A (en) * 2024-09-26 2024-12-31 北京智芯微电子科技有限公司 A method, device and system for orderly charging control of distribution network

Also Published As

Publication number Publication date
CN114444802B (en) 2024-06-04

Similar Documents

Publication Publication Date Title
CN114444802A (en) Electric vehicle charging guide optimization method based on graph neural network reinforcement learning
Li et al. Probabilistic charging power forecast of EVCS: Reinforcement learning assisted deep learning approach
CN109523051B (en) A real-time optimal scheduling method for electric vehicle charging
CN109347149B (en) Microgrid energy storage scheduling method and device based on deep Q-value network reinforcement learning
CN104463701B (en) A kind of distribution system and the coordinated planning method of charging electric vehicle network
Luo et al. Joint deployment of charging stations and photovoltaic power plants for electric vehicles
CN113078641B (en) A method and device for reactive power optimization of distribution network based on estimator and reinforcement learning
CN110570050A (en) A charging guidance method for electric vehicles considering road-network-vehicle
Chu et al. A multiagent federated reinforcement learning approach for plug-in electric vehicle fleet charging coordination in a residential community
CN106651059A (en) Optimal configuration method for electric automobile charging pile
CN106130007A (en) A kind of active distribution network energy storage planing method theoretical based on vulnerability
CN105591433A (en) Electric automobile charging load optimization method based on electric automobile charging power dynamic distribution
CN106096757A (en) Based on the microgrid energy storage addressing constant volume optimization method improving quantum genetic algorithm
CN107067190A (en) The micro-capacitance sensor power trade method learnt based on deeply
CN109840635A (en) Electric automobile charging station planing method based on voltage stability and charging service quality
CN112116125A (en) A method for electric vehicle charging and navigation based on deep reinforcement learning
CN114707292B (en) Analysis method for voltage stability of distribution network containing electric automobile
CN114123256B (en) Distributed energy storage configuration method and system adapting to random optimization decision
CN115344653A (en) A site selection method for electric vehicle charging stations based on user behavior
CN106408452A (en) Optimal configuration method for electric vehicle charging station containing multiple distributed power distribution networks
Yang et al. Dynamic incentive pricing on charging stations for real-time congestion management in distribution network: an adaptive model-based safe deep reinforcement learning method
CN117522444A (en) V2G-based dynamic electricity price setting method and system for electric vehicle charging station
Ma et al. IMOCS based EV charging station planning optimization considering stakeholders’ interests balance
CN112097783A (en) Planning method for electric taxi charging navigation path based on deep reinforcement learning
CN107776433A (en) A kind of discharge and recharge optimal control method of electric automobile group

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant