CN117541030B - Virtual power plant optimized operation method, device, equipment and medium - Google Patents
- Publication number: CN117541030B
- Application number: CN202410028978.2A
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06Q10/0631: Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06N3/048: Activation functions
- G06N3/0499: Feedforward networks
- G06N3/08: Learning methods
- G06N7/01: Probabilistic graphical models, e.g. probabilistic networks
- G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06Q10/067: Enterprise or organisation modelling
- G06Q50/06: Energy or water supply
Abstract
The invention relates to the technical field of artificial intelligence, and provides a virtual power plant optimizing operation method, device, equipment and medium.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a virtual power plant optimized operation method, device, equipment and medium.
Background
With the rapid growth of renewable and distributed energy sources, electrical power systems face increasingly complex management challenges. A virtual power plant is an intelligent system that integrates various energy resources, and it enables efficient operation of the power system through optimal scheduling. Research on the optimal scheduling of virtual power plants not only helps improve the economy of the power system and reduce energy production costs, but also helps cope with the fluctuation and uncertainty of renewable energy sources and improves the reliability and stability of the power grid.
However, the prior art has the following drawbacks when optimally scheduling a virtual power plant:
(1) High prediction accuracy is required for both renewable energy output and load;
(2) The action space supports only discrete operations, so the action quantity of each unit must be discretized to suit the algorithm, which greatly reduces the selectable action range;
(3) The state and action value functions are prone to overestimation, which invalidates the strategy learned by the model.
In view of the foregoing, there is a need for a more stable and efficient scheme for the optimized operation of a virtual power plant.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a method, apparatus, device and medium for optimizing operation of a virtual power plant, which are capable of reasonably and stably optimizing the operation of the virtual power plant.
A virtual power plant optimization operation method, the virtual power plant optimization operation method comprising:
constructing an objective function and constraint conditions, and constructing an initial virtual power plant model according to the objective function and the constraint conditions;
creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a Markov decision model and a TD3 algorithm to obtain an intermediate virtual power plant model;
updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
optimizing and training the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;
acquiring current power data, and inputting the power data into the target virtual power plant model;
and generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model, and executing the virtual power plant optimizing operation strategy.
According to a preferred embodiment of the present invention, the constructing objective functions and constraints includes:
Taking the minimum total cost in the virtual power plant operation period as the objective function on the premise of meeting the constraint condition;
The total cost is the sum of wind power generation cost, gas unit cost, energy storage cost and electricity selling cost of an electric power market;
the constraint conditions comprise virtual power plant power balance constraint, energy storage equipment charge and discharge constraint, wind turbine generator set output constraint and gas turbine generator set operation constraint.
According to a preferred embodiment of the invention:
The objective function is expressed as follows:
$$\min C = \sum_{t=1}^{T}\left[C_{wt}(t) + C_{gas}(t) + C_{es}(t) + C_{maket}(t)\right]$$
wherein $\min C$ represents the minimum total cost, $C_{wt}(t)$ represents the wind power generation cost, $C_{gas}(t)$ represents the gas turbine unit cost, $C_{es}(t)$ represents the energy storage cost, and $C_{maket}(t)$ represents the electricity market electricity selling cost; $N_{wt}$ represents the number of fans, $N_{gas}$ represents the number of gas turbines, and $N_{es}$ represents the number of energy storage devices; $c_{wt}$ represents the running cost of power generation of a single fan, and is applied to the output of the kth fan at time t; $c_{CH4}$ represents the natural gas unit price; $L_{HVNG}$ represents the lower heating value of natural gas, $\eta_{gas}$ represents the power generation efficiency, and $P_{gas,i,t}$ represents the output of the ith gas turbine at time t; $c_{es}$ represents the cost coefficient of charging and discharging of a single energy storage device, and $P_{es,n,t}$ represents the charging and discharging power of the nth energy storage unit at time t; $c_{maket,t}$ represents the market electricity price at time t, $P_{maket,t}$ represents the market electricity trading volume at time t, $\Delta t$ represents the time step, and $T$ represents the operation period of the virtual power plant;
the virtual power plant power balance constraint balances supply and demand at each time t, wherein $N_{load}$ represents the number of loads and $P_{load,m,t}$ represents the load power of the mth user at time t; the constraint also involves the heat generation power of the ith gas turbine at time t and the thermal load of the mth user at time t;
the energy storage device charge-discharge constraints limit the charging power and the discharge power of the energy storage device at time t to the maximum charging power and the maximum discharge power of the energy storage device, respectively, and keep the SOC value of the stored energy in period J between its lower limit and its upper limit;
the wind turbine generator output constraint limits the output of each fan to the maximum value of the actual output of the wind turbine generator;
the gas unit operation constraints keep the power generation of the gas unit between the lower output limit and the upper output limit of the gas unit, and take the heat dissipation loss rate into account.
According to a preferred embodiment of the present invention, the creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model includes:
acquiring an energy storage SOC value, the maximum value of the actual output of the wind turbine generator, the mth user thermal load at the moment t, the mth user load power at the moment t and the market electricity price at the moment t as the state space;
Acquiring the output of the ith gas turbine at the moment t, the charge and discharge power of the nth energy storage unit at the moment t, the market electric energy trading volume at the moment t and the output of the kth fan at the moment t as the action space;
Acquiring a certain state of an intelligent agent corresponding to the initial virtual power plant model, and constructing the rewarding function by the total cost of any action under the corresponding state and the penalty coefficient exceeding the constraint;
Updating the initial virtual power plant model based on the state space, the action space and the rewarding function to obtain the intermediate virtual power plant model;
wherein the state space $S_t$ consists of the state quantities listed above, SOC representing the SOC value of the stored energy;
wherein the action space $a_t$ consists of the action quantities listed above;
wherein the reward function $r_t$ is constructed from the total cost of the system when the action taken in state $S_t$ is $a_t$ and from the penalty coefficient for exceeding the constraints, $S_t$ representing a certain state of the intelligent agent corresponding to the initial virtual power plant model.
According to a preferred embodiment of the present invention, the updating the intermediate virtual power plant model based on the noise mechanism and the attention mechanism, to obtain the model to be trained includes:
configuring target parameters;
Generating a random number based on the Logistic mapping;
Comparing the target parameter with the random number to obtain a comparison result;
When the comparison result is that the target parameter is greater than or equal to the random number, adding noise into an action network of the intermediate virtual power plant model; or when the comparison result is that the target parameter is smaller than the random number, no noise is added to the action network;
Acquiring an attention mechanism network, and updating the action network based on the attention mechanism network to obtain the model to be trained; wherein the attention mechanism network consists of a fully connected layer and a Softmax activation function.
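The attention mechanism network described above (a fully connected layer followed by a Softmax activation function) can be sketched as follows. This is a minimal illustration only: the text does not say how the attention weights are applied, so the element-wise rescaling of the state, the layer size, the random initialisation and the names `softmax` and `AttentionBlock` are all assumptions.

```python
import math
import random


def softmax(values):
    """Numerically stable Softmax of a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


class AttentionBlock:
    """One fully connected layer followed by Softmax (hypothetical layout)."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        # Small random weights; a trained network would learn these values.
        self.weight = [[rng.gauss(0.0, 0.1) for _ in range(dim)]
                       for _ in range(dim)]
        self.bias = [0.0] * dim

    def weights(self, state):
        """Attention weights: positive and summing to 1."""
        scores = [
            sum(w * x for w, x in zip(row, state)) + b
            for row, b in zip(self.weight, self.bias)
        ]
        return softmax(scores)

    def __call__(self, state):
        # Assumption: each state feature is rescaled by its attention weight
        # before being passed on to the action network.
        return [x * a for x, a in zip(state, self.weights(state))]


block = AttentionBlock(dim=5)
attended_state = block([0.5, 0.8, 0.3, 0.6, 0.4])
```

In this sketch the five inputs stand for the five state quantities; how the attended features enter the action network is left open.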
According to a preferred embodiment of the invention:
The target parameter λ is determined by a first coefficient and a second coefficient that adjust the target parameter λ, the reward value $r_t$ at time t, and the average reward value over the virtual power plant operating period;
the random number $L_{n+1}$ is expressed as follows:
$$L_{n+1} = y\,L_n\,(1 - L_n)$$
wherein $L_n$ represents an iteration sequence value of the Logistic map, and $y$ represents the bifurcation parameter.
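The noise mechanism can be illustrated with the standard Logistic map, L_{n+1} = y * L_n * (1 - L_n), used as the random number source. The sketch below is hedged: the formula for the target parameter is not reproduced in this text, so it is passed in as a plain number, and the Gaussian noise, its standard deviation `sigma`, the initial value 0.3 and the choice y = 4.0 are assumptions made only for illustration.

```python
import random


def logistic_sequence(l0, y=4.0):
    """Yield successive Logistic map values L_{n+1} = y * L_n * (1 - L_n)."""
    value = l0
    while True:
        value = y * value * (1.0 - value)
        yield value


def noisy_action(action, target_param, chaos, sigma=0.1, rng=None):
    """Add noise to the action only when target_param >= the next random number.

    Returns the (possibly perturbed) action and a flag telling whether noise
    was added. Gaussian noise is an assumption, not taken from the text.
    """
    rng = rng or random.Random(0)
    threshold = next(chaos)
    if target_param >= threshold:
        return [a + rng.gauss(0.0, sigma) for a in action], True
    return list(action), False


chaos = logistic_sequence(0.3)
explored, was_noisy = noisy_action([1.0, 2.0], target_param=0.9, chaos=chaos)
```

Because the comparison is made against a fresh Logistic value at every call, a larger target parameter makes exploration noise more likely.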
According to a preferred embodiment of the present invention, the optimizing training the model to be trained based on the priority experience storage policy, and obtaining the optimized target virtual power plant model includes:
in each round of training process, acquiring the current virtual power plant environment state of the current round and the current action obtained through the action network;
detecting whether noise is added to the action network based on the noise mechanism to obtain random actions;
Executing the random action based on the current virtual power plant environment state to obtain a current rewarding value and a virtual power plant environment state at the next moment;
if the current experience value obtained by exploration is larger than the average of all other experience values explored within a specified period, determining the current experience value to be an excellent experience and adding it to an experience replay pool; or, if the current experience value is smaller than or equal to the average, determining according to a preset probability whether to add it to the experience replay pool;
randomly extracting experience samples from the experience replay pool, and inputting the experience samples into the action network to obtain random actions at the next moment;
acquiring the current value of the Q function of the current-round model corresponding to the model to be trained, and updating the model parameters of the current-round model according to the current value of the Q function;
And stopping training when detecting that the value of the Q function reaches the maximum in any round of training, and determining a model obtained by the round of training as the target virtual power plant model.
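The priority experience storage rule in the steps above can be sketched as a small buffer. This is an illustration under assumptions: the "experience value" is taken to be a single number such as the reward, the "specified period" is modelled as a sliding window of recent values, the first experience is always stored, and the class name `PriorityStoreBuffer` and its parameters are hypothetical.

```python
import random
from collections import deque


class PriorityStoreBuffer:
    """Store above-average experiences always, others with a preset probability."""

    def __init__(self, capacity=10000, window=100, keep_prob=0.5, seed=0):
        self.pool = deque(maxlen=capacity)   # experience pool used for sampling
        self.recent = deque(maxlen=window)   # values seen in the period
        self.keep_prob = keep_prob
        self.rng = random.Random(seed)

    def offer(self, transition, value):
        """Decide whether to store the transition; return True if stored."""
        if self.recent:
            # Average of the other values in the window (current one excluded).
            mean = sum(self.recent) / len(self.recent)
            excellent = value > mean
        else:
            excellent = True  # assumption: nothing to compare against yet
        self.recent.append(value)
        stored = excellent or self.rng.random() < self.keep_prob
        if stored:
            self.pool.append(transition)
        return stored

    def sample(self, batch_size):
        """Uniform random sample from the pool."""
        size = min(batch_size, len(self.pool))
        return self.rng.sample(list(self.pool), size)


buffer = PriorityStoreBuffer(keep_prob=0.2)
buffer.offer(("s0", "a0", -12.0, "s1"), value=-12.0)
```

Setting `keep_prob` to 0 keeps only above-average experiences, while a value of 1 reduces the rule to ordinary storage of every experience.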
A virtual power plant optimal operation device, the virtual power plant optimal operation device comprising:
the construction unit is used for constructing an objective function and constraint conditions and constructing an initial virtual power plant model according to the objective function and the constraint conditions;
The establishing unit is used for establishing a state space, an action space and a reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model;
The updating unit is used for updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
the training unit is used for carrying out optimization training on the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;
the input unit is used for acquiring current power data and inputting the power data into the target virtual power plant model;
and the optimizing unit is used for generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model and executing the virtual power plant optimizing operation strategy.
A computer device, the computer device comprising:
A memory storing at least one instruction; and
And the processor executes the instructions stored in the memory to realize the virtual power plant optimized operation method.
A computer-readable storage medium having stored therein at least one instruction for execution by a processor in a computer device to implement the virtual power plant optimized operation method.
According to the technical scheme, an initial virtual power plant model is built from the constructed objective function and constraint conditions; a state space, an action space and a reward function are created based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model; the intermediate virtual power plant model is updated based on the noise mechanism and the attention mechanism to obtain a model to be trained; and the model to be trained is optimized and trained based on the priority experience storage strategy to obtain the optimized target virtual power plant model. The current power data is then input into the target virtual power plant model, and a virtual power plant optimized operation strategy is generated and executed. The resulting operation strategy is more stable and can effectively reduce the power cost.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the virtual power plant optimization method of the present invention.
Fig. 2 is a power supply and demand balance diagram of the present invention.
FIG. 3 is a comparison of the virtual power plant optimization method of the present invention with other optimization methods.
FIG. 4 is a functional block diagram of a preferred embodiment of the virtual power plant optimized operation apparatus of the present invention.
FIG. 5 is a schematic diagram of a computer device implementing a preferred embodiment of the method of optimizing operation of a virtual power plant according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
FIG. 1 is a flow chart of a preferred embodiment of the virtual power plant optimization method of the present invention. The order of the steps in the flowchart may be changed and some steps may be omitted according to various needs.
The virtual power plant optimized operation method is applied to one or more computer devices, wherein a computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and the hardware of the computer device includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device and the like.
The computer device may be any electronic product that can interact with a user in a human-computer manner, such as a personal computer, a tablet computer, a smart phone, a personal digital assistant (PDA), a game console, an interactive Internet Protocol television (IPTV), a smart wearable device, etc.
The computer device may also include a network device and/or a user device, wherein the network device includes, but is not limited to, a single network server, a server group composed of a plurality of network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing.
The server may be an independent server, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
The network in which the computer device is located includes, but is not limited to, the internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (Virtual Private Network, VPN), and the like.
S10, constructing an objective function and constraint conditions, and constructing an initial virtual power plant model according to the objective function and the constraint conditions.
In this embodiment, the constructing the objective function and the constraint condition includes:
Taking the minimum total cost in the virtual power plant operation period as the objective function on the premise of meeting the constraint condition;
The total cost is the sum of wind power generation cost, gas unit cost, energy storage cost and electricity selling cost of an electric power market;
the constraint conditions comprise virtual power plant power balance constraint, energy storage equipment charge and discharge constraint, wind turbine generator set output constraint and gas turbine generator set operation constraint.
Specifically, the objective function is expressed as follows:
$$\min C = \sum_{t=1}^{T}\left[C_{wt}(t) + C_{gas}(t) + C_{es}(t) + C_{maket}(t)\right]$$
wherein $\min C$ represents the minimum total cost, $C_{wt}(t)$ represents the wind power generation cost, $C_{gas}(t)$ represents the gas turbine unit cost, $C_{es}(t)$ represents the energy storage cost, and $C_{maket}(t)$ represents the electricity market electricity selling cost; $N_{wt}$ represents the number of fans, $N_{gas}$ represents the number of gas turbines, and $N_{es}$ represents the number of energy storage devices; $c_{wt}$ represents the running cost of power generation of a single fan, and is applied to the output of the kth fan at time t; $c_{CH4}$ represents the natural gas unit price; $L_{HVNG}$ represents the lower heating value of natural gas, $\eta_{gas}$ represents the power generation efficiency, and $P_{gas,i,t}$ represents the output of the ith gas turbine at time t; $c_{es}$ represents the cost coefficient of charging and discharging of a single energy storage device, and $P_{es,n,t}$ represents the charging and discharging power of the nth energy storage unit at time t; $c_{maket,t}$ represents the market electricity price at time t, $P_{maket,t}$ represents the market electricity trading volume at time t, $\Delta t$ represents the time step, and $T$ represents the operation period of the virtual power plant;
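The component cost formulas are not legible in this text. One reading that is consistent with the symbol definitions above is given below as an assumption, not as the patent's exact expressions. In particular, the symbol $P_{wt,k,t}$ for the output of the kth fan, the absolute value on the storage power and the sign convention of the market term are guesses; $\eta_{gas}$ denotes the power generation efficiency.

```latex
\begin{aligned}
C_{wt}(t)    &= \sum_{k=1}^{N_{wt}} c_{wt}\, P_{wt,k,t}\, \Delta t \\
C_{gas}(t)   &= \sum_{i=1}^{N_{gas}} c_{CH4}\, \frac{P_{gas,i,t}}{L_{HVNG}\, \eta_{gas}}\, \Delta t \\
C_{es}(t)    &= \sum_{n=1}^{N_{es}} c_{es}\, \lvert P_{es,n,t} \rvert\, \Delta t \\
C_{maket}(t) &= c_{maket,t}\, P_{maket,t}\, \Delta t
\end{aligned}
```

Under this reading the gas term converts electrical output into fuel consumption through the heating value and the efficiency, and each term is an energy cost over one time step.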
the virtual power plant power balance constraint balances supply and demand at each time t, wherein $N_{load}$ represents the number of loads and $P_{load,m,t}$ represents the load power of the mth user at time t; the constraint also involves the heat generation power of the ith gas turbine at time t and the thermal load of the mth user at time t;
the energy storage device charge-discharge constraints limit the charging power and the discharge power of the energy storage device at time t to the maximum charging power and the maximum discharge power of the energy storage device, respectively, and keep the SOC value of the stored energy in period J between its lower limit and its upper limit;
Through the constraint of the charge and discharge of the energy storage equipment, the normal and stable operation of the energy storage system can be ensured, the irreversible damage and other dangers to the battery caused by overcharge and discharge are prevented, and the charge and discharge power and the energy storage charge state of the energy storage equipment are constrained.
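The effect of the charge-discharge constraints can be sketched as a clipping step applied to a requested storage power. The sketch is illustrative and rests on assumptions not stated in the text: positive power means charging, negative power means discharging, the SOC is a fraction of a known capacity, and conversion losses are ignored. The function name and parameters are hypothetical.

```python
def clamp_storage_power(power, soc, p_ch_max, p_dis_max,
                        soc_min, soc_max, capacity, dt=1.0):
    """Clip a requested storage power so that power and SOC limits both hold.

    Returns the feasible power and the SOC reached after one time step.
    """
    # Power limits: at most p_ch_max when charging, p_dis_max when discharging.
    power = max(-p_dis_max, min(p_ch_max, power))
    # SOC limits, expressed as the largest power sustainable for one step.
    max_charge = (soc_max - soc) * capacity / dt
    max_discharge = (soc - soc_min) * capacity / dt
    power = max(-max_discharge, min(max_charge, power))
    new_soc = soc + power * dt / capacity
    return power, new_soc


# A request of 80 (charging) is cut first by the power limit, then by the SOC.
feasible_power, next_soc = clamp_storage_power(
    80.0, soc=0.5, p_ch_max=50.0, p_dis_max=50.0,
    soc_min=0.1, soc_max=0.9, capacity=100.0)
```

Clipping is only one way to enforce the limits; a penalty on violations in the reward is an alternative that leaves the action unchanged.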
the wind turbine generator output constraint limits the output of each fan to the maximum value of the actual output of the wind turbine generator;
the gas unit operation constraints keep the power generation of the gas unit between the lower output limit and the upper output limit of the gas unit, and take the heat dissipation loss rate into account.
In the above embodiment, the initial virtual power plant model is built under a minimum-cost objective and constraints in multiple dimensions, which ensures that the model operates at minimum cost while satisfying all constraint conditions, so that operational stability is ensured at the same time.
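As a hedged illustration of how the output constraints might be checked, the sketch below measures by how much a candidate set of unit outputs leaves its allowed ranges. It covers only the wind and gas output limits, assumes a lower bound of zero for the fan output, and uses hypothetical function and parameter names.

```python
def violation(value, lower, upper):
    """Distance by which value lies outside [lower, upper]; 0.0 if inside."""
    if value < lower:
        return lower - value
    if value > upper:
        return value - upper
    return 0.0


def total_violation(p_wt, p_wt_max, p_gas, p_gas_min, p_gas_max):
    """Sum of limit violations over all fans and gas turbines."""
    # Assumption: a fan may produce anything from 0 up to its maximum output.
    wind = sum(violation(p, 0.0, p_wt_max) for p in p_wt)
    gas = sum(violation(p, p_gas_min, p_gas_max) for p in p_gas)
    return wind + gas


excess = total_violation(
    p_wt=[3.0, 6.0], p_wt_max=5.0,
    p_gas=[1.0, 12.0], p_gas_min=2.0, p_gas_max=10.0)
```

A value of zero means every output is within its limits; a positive value could be fed to a penalty term during training.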
S11, creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a Markov decision model and the TD3 (Twin Delayed Deep Deterministic Policy Gradient) algorithm to obtain an intermediate virtual power plant model.
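For context, TD3 as generally described in the reinforcement learning literature limits value overestimation by keeping two critics and using the smaller of their estimates when forming the learning target. The sketch below shows only that target computation; it is not taken from this text, and the function name, the discount factor `gamma` and the scalar inputs are illustrative assumptions.

```python
def td3_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """Clipped double-Q target: reward plus discounted minimum of two critics."""
    if done:
        return reward  # no bootstrapping past the end of an episode
    return reward + gamma * min(q1_next, q2_next)


# The more pessimistic critic (3.0) determines the target.
target = td3_target(reward=-10.0, q1_next=5.0, q2_next=3.0, gamma=0.5)
```

Taking the minimum means a single critic that overestimates a state-action value cannot inflate the target on its own.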
In this embodiment, creating the state space, the action space and the reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain the intermediate virtual power plant model includes:
acquiring an energy storage SOC value, the maximum value of the actual output of the wind turbine generator, the mth user thermal load at the moment t, the mth user load power at the moment t and the market electricity price at the moment t as the state space;
Acquiring the output of the ith gas turbine at the moment t, the charge and discharge power of the nth energy storage unit at the moment t, the market electric energy trading volume at the moment t and the output of the kth fan at the moment t as the action space;
Acquiring a state of the intelligent agent corresponding to the initial virtual power plant model, and constructing the reward function from the total cost of an action taken in that state and the penalty coefficient for exceeding the constraints;
Updating the initial virtual power plant model based on the state space, the action space and the rewarding function to obtain the intermediate virtual power plant model;
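wherein the state space $S_t$ is represented as follows:
$$S_t = \{SOC_t,\ P^{WT}_{\max},\ H^{L}_{m,t},\ P^{L}_{m,t},\ \rho_t\}$$;
where $SOC_t$ represents the SOC value of the stored energy, $P^{WT}_{\max}$ represents the maximum value of the actual output of the wind turbine generator, $H^{L}_{m,t}$ represents the mth user thermal load at time t, $P^{L}_{m,t}$ represents the mth user load power at time t, and $\rho_t$ represents the market electricity price at time t;
wherein the action space $a_t$ is represented as follows:
$$a_t = \{P^{GT}_{i,t},\ P^{ES}_{n,t},\ P^{M}_{t},\ P^{WT}_{k,t}\}$$;
where $P^{GT}_{i,t}$ represents the output of the ith gas turbine at time t, $P^{ES}_{n,t}$ represents the charge and discharge power of the nth energy storage unit at time t, $P^{M}_{t}$ represents the market electric energy trading volume at time t, and $P^{WT}_{k,t}$ represents the output of the kth fan at time t;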
wherein the reward function $r_t$ is represented as follows:
$$r_t = -\left(C(S_t, a_t) + \xi\right)$$;
where $S_t$ represents a state of the intelligent agent corresponding to the initial virtual power plant model, $C(S_t, a_t)$ represents the total cost of the system when the action taken in state $S_t$ is $a_t$, and $\xi$ represents the penalty coefficient for exceeding the constraints.
In this embodiment, the reward function jointly accounts for the influence of action rewards and environment rewards, which makes the learning target more definite. Specifically, after an action is selected from the action space in a given environment state, the environment returns a reward. The problem of minimizing the total operating cost of the virtual power plant is thereby converted into maximizing the reward on the premise that the constraints are satisfied.
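The conversion from cost minimization to reward maximization can be illustrated with a short sketch. It is an assumption-laden illustration rather than the patented formula: the per-step cost terms follow the four cost components named in the claims (wind, gas, storage, market), while the exact functional forms, the absolute value on storage power, and the names `Action`, `step_cost` and `reward` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Action:
    """Action at time t, mirroring the four components of the action space."""
    p_gas: List[float]      # output of each gas turbine
    p_storage: List[float]  # charge/discharge power of each storage unit (signed, assumed)
    p_market: float         # market electric energy trading volume
    p_wind: List[float]     # output of each fan


def step_cost(a: Action, price: float, c_wind: float, c_gas: float,
              lhv: float, eta: float, c_es: float, dt: float = 1.0) -> float:
    """Assumed per-step total cost: wind + gas + storage + market terms."""
    wind = sum(c_wind * p * dt for p in a.p_wind)
    gas = sum(c_gas * p * dt / (eta * lhv) for p in a.p_gas)   # fuel cost via efficiency and heating value
    storage = sum(c_es * abs(p) * dt for p in a.p_storage)     # same cost for charging and discharging
    market = price * a.p_market * dt
    return wind + gas + storage + market


def reward(cost: float, constraint_exceeded: bool, penalty: float) -> float:
    """Negative cost, with the penalty applied only when a constraint is exceeded."""
    return -(cost + (penalty if constraint_exceeded else 0.0))
```

Because the reward is the negated cost, an agent that maximizes cumulative reward also minimizes total cost, and the penalty steers it away from infeasible actions.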
S12, updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained.
In this embodiment, updating the intermediate virtual power plant model based on the noise mechanism and the attention mechanism, to obtain the model to be trained includes:
configuring target parameters;
Generating a random number based on the Logistic mapping;
Comparing the target parameter with the random number to obtain a comparison result;
When the comparison result is that the target parameter is greater than or equal to the random number, adding noise into an action network of the intermediate virtual power plant model; or when the comparison result is that the target parameter is smaller than the random number, no noise is added to the action network;
Acquiring an attention mechanism network, and updating the action network based on the attention mechanism network to obtain the model to be trained; wherein the attention mechanism network consists of a fully connected layer and a Softmax activation function.
Specifically, the target parameter λ is expressed as follows:;
where the first coefficient and the second coefficient adjust the target parameter λ; $r_t$ represents the reward value at time t; and $\bar{r}$ represents the average reward value over the virtual power plant operating period;
the random number $L_{n+1}$ is represented as follows:
$$L_{n+1} = Y L_n (1 - L_n)$$;
where $L_n$ represents the iteration sequence value of the Logistic map, and Y represents the bifurcation parameter.
When 3.5699456 ≤ Y ≤ 4, the Logistic map enters a chaotic state, and the generated chaotic sequence has good random distribution characteristics.
The action network is improved based on an attention mechanism. The observed state is used as the input vector. After the input passes through the first fully connected layer of the attention network, the output of that layer is activated by a Softmax activation function to obtain a weight for each component of the input vector; each weight is multiplied by its corresponding component to obtain a new vector, which is the output of the attention network. The new vector then passes through a fully connected layer and a Sigmoid activation function to obtain the action for the corresponding state.
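The forward pass described above can be sketched in plain Python. This is an illustrative sketch only: the weight matrices are passed in explicitly, the attention layer is assumed to have as many outputs as the state has components, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def linear(x: Sequence[float], w: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Fully connected layer: one output per row of w."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def softmax(z: Sequence[float]) -> List[float]:
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def attention_actor(state, w_att, b_att, w_out, b_out) -> Tuple[List[float], List[float]]:
    """Attention weights over state components, then a sigmoid-bounded action."""
    weights = softmax(linear(state, w_att, b_att))       # one weight per state component
    attended = [w * s for w, s in zip(weights, state)]   # re-weighted state vector
    action = [sigmoid(v) for v in linear(attended, w_out, b_out)]
    return weights, action
```

The Sigmoid output lies in (0, 1), so in practice each component would still need to be scaled to the physical range of the corresponding device.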
During policy training of the TD3 algorithm model, noise is superimposed on the output of the action network to expand the exploration capability of the algorithm model. However, unnecessary noise superposition adds computational cost and causes excessive exploration. This embodiment therefore improves the action exploration mechanism of the conventional TD3 algorithm so that the model can better balance exploration and exploitation in a dynamic environment.
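The gated noise mechanism (compare the target parameter with a Logistic-map random number, and add noise only when the target parameter is greater than or equal to it) can be sketched as follows. The target parameter is taken as an input because its formula is not reproduced here; Gaussian noise, the standard deviation `sigma`, and the function names are assumptions.

```python
import random
from typing import List, Optional, Tuple


def logistic_next(l_n: float, y: float = 4.0) -> float:
    """One iteration of the Logistic map L_{n+1} = Y * L_n * (1 - L_n)."""
    return y * l_n * (1.0 - l_n)


def gated_action(action: List[float], target_param: float, l_n: float,
                 y: float = 4.0, sigma: float = 0.1,
                 rng: Optional[random.Random] = None) -> Tuple[List[float], float]:
    """Add exploration noise only when target_param >= the next chaotic value.

    Returns the (possibly noisy) action and the new Logistic-map value,
    which the caller feeds back in on the next step.
    """
    rng = rng or random.Random(0)
    l_next = logistic_next(l_n, y)
    if target_param >= l_next:
        action = [a + rng.gauss(0.0, sigma) for a in action]  # Gaussian noise is an assumption
    return list(action), l_next
```

Carrying `l_next` forward between calls keeps the chaotic sequence going, so the gate opens and closes irregularly rather than on a fixed schedule.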
S13, optimizing and training the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model.
In this embodiment, the optimizing training the model to be trained based on the priority experience storage policy, and obtaining the optimized target virtual power plant model includes:
in each round of training process, acquiring the current virtual power plant environment state of the current round and the current action obtained through the action network;
determining, based on the noise mechanism, whether to add noise to the action network, to obtain a random action;
Executing the random action based on the current virtual power plant environment state to obtain a current reward value and the virtual power plant environment state at the next moment;
If the current experience value obtained by exploration is greater than the average of all other experience values explored within a specified period, determining the current experience to be an excellent experience and adding it to an experience replay pool; or, if the current experience value is less than or equal to the average, determining according to a preset probability whether to add it to the experience replay pool;
randomly sampling experience samples from the experience replay pool, and inputting the experience samples into the action network to obtain the random action at the next moment;
Acquiring the current value of the Q function of the current-round model corresponding to the model to be trained, and updating the model parameters of the current-round model according to the current value of the Q function;
And stopping training when detecting that the value of the Q function reaches the maximum in any round of training, and determining a model obtained by the round of training as the target virtual power plant model.
The preset probability can be configured according to an actual running environment.
Through the embodiment, the optimized target virtual power plant model can be trained by further combining with the priority experience storage strategy.
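The admission rule of the priority experience storage strategy can be sketched as a small replay pool. This is a hedged illustration: the class name `PriorityStore`, the use of a sliding window as the "specified period", the use of the reward as the experience value, and the rule that the very first experience is always stored are all assumptions.

```python
import random
from collections import deque
from typing import Any, List


class PriorityStore:
    """Replay pool that always keeps above-average experiences and
    keeps the rest only with a preset probability."""

    def __init__(self, capacity: int = 10000, window: int = 100,
                 keep_prob: float = 0.3, seed: int = 0) -> None:
        self.pool = deque(maxlen=capacity)          # experience replay pool
        self.recent_values = deque(maxlen=window)   # values seen in the specified period
        self.keep_prob = keep_prob                  # preset probability for ordinary experiences
        self.rng = random.Random(seed)

    def offer(self, transition: Any, value: float) -> bool:
        """Try to store a transition; return True if it was stored."""
        if self.recent_values:
            mean = sum(self.recent_values) / len(self.recent_values)
            excellent = value > mean
        else:
            excellent = True  # assumption: nothing to compare against yet
        self.recent_values.append(value)
        if excellent or self.rng.random() < self.keep_prob:
            self.pool.append(transition)
            return True
        return False

    def sample(self, batch_size: int) -> List[Any]:
        """Uniformly sample stored transitions without replacement."""
        return self.rng.sample(list(self.pool), min(batch_size, len(self.pool)))
```

Setting `keep_prob` to 0 stores only excellent experiences, while setting it to 1 reduces the pool to an ordinary uniform replay buffer; intermediate values trade sample quality against diversity.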
S14, acquiring current power data, and inputting the power data into the target virtual power plant model.
In this embodiment, the power data may include, but is not limited to, one or more of the following combinations of data:
the current time period, the current regional characteristics, the number of fans, the number of gas turbines, the number of energy storage devices, and the like.
S15, generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model, and executing the virtual power plant optimizing operation strategy.
In this embodiment, operation strategies such as whether the electricity demand of the current period needs to be supplemented and whether surplus electric energy needs to be stored may be determined according to the output data of the target virtual power plant model, so as to ensure reasonable use of electric power resources and thereby maintain the stability of the electric power system.
For example, refer to fig. 2, which is a power supply and demand balance diagram of the present invention. As can be seen from fig. 2, the virtual power plant optimizing operation strategy varies with supply and demand as follows. In the periods 1:00-7:00 and 16:00-18:00, the electricity price is low; the internal power supply of the virtual power plant comes mainly from wind power and gas turbine power generation, and the remaining electricity demand is met by purchasing electricity. In the period 2:00-4:00, the electric energy demand is low, so low-priced surplus electric energy is stored. In the period 8:00-10:00, the virtual power plant no longer purchases electricity; the energy storage discharges a small amount in part of the period, and the load demand is still met mainly by wind power and gas turbine power generation. In the periods 11:00-15:00 and 19:00-21:00, the overall electric load demand is high and the electricity price is at its peak; to reduce the total operating cost, electricity purchasing is reduced, the system is supplied by wind power and gas turbines, and the remaining demand is met by energy storage discharge. In the period 22:00-23:00, wind power and gas turbine power generation cannot meet the power demand, but the electricity price is low, so the system balances the electric load by purchasing electricity. At 24:00, the electric load demand is low and can be met by wind power and gas turbine power generation.
Refer to fig. 3, which compares the virtual power plant optimizing operation method with other optimization methods. Specifically, scheme 1 is a virtual power plant optimizing operation method based on the DDPG (Deep Deterministic Policy Gradient) algorithm; scheme 2 is a virtual power plant optimizing operation method based only on the conventional TD3 algorithm; scheme 3 is the virtual power plant optimizing operation method proposed in this embodiment. As can be seen, the average daily operating cost of scheme 3 is 16582 yuan, which is 7.55% and 3.4% lower than that of scheme 1 and scheme 2, respectively; the minimum daily operating cost of scheme 3 is 14554 yuan, which is 6.58% and 4.45% lower than that of scheme 1 and scheme 2, respectively; the maximum daily operating cost of scheme 3 is 18762 yuan, which is 10.53% and 3.97% lower than that of scheme 1 and scheme 2, respectively. Therefore, compared with scheme 1 and scheme 2, the virtual power plant optimizing operation method provided by this embodiment performs better and can effectively reduce the system operating cost.
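The reported percentages can be turned back into the approximate costs of the comparison schemes. The sketch below assumes that each reduction is expressed relative to the comparison scheme's own cost, which the text does not state explicitly; the function name `implied_baseline` is hypothetical, and the resulting figures are derived estimates, not values reported in fig. 3.

```python
def implied_baseline(cost_after: float, reduction_pct: float) -> float:
    """Cost of the comparison scheme, assuming
    cost_after = baseline * (1 - reduction_pct / 100)."""
    return cost_after / (1.0 - reduction_pct / 100.0)


average_scheme3 = 16582.0  # average daily operating cost of scheme 3, in yuan
baseline_scheme1 = implied_baseline(average_scheme3, 7.55)  # implied DDPG average cost
baseline_scheme2 = implied_baseline(average_scheme3, 3.4)   # implied conventional TD3 average cost
```

Under this assumption the implied average daily cost is roughly 17.9 thousand yuan for scheme 1 and roughly 17.2 thousand yuan for scheme 2.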
According to the above technical solution, an initial virtual power plant model is constructed from a self-constructed objective function and constraint conditions; a state space, an action space and a reward function are created based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model; the intermediate virtual power plant model is updated based on the noise mechanism and the attention mechanism to obtain a model to be trained; the model to be trained is optimized and trained based on the priority experience storage strategy to obtain the optimized target virtual power plant model; and current power data is input into the target virtual power plant model to generate and execute a virtual power plant optimizing operation strategy. The resulting operation strategy is more stable and can effectively reduce the electric power cost.
FIG. 4 is a functional block diagram of a preferred embodiment of the virtual power plant optimized operation apparatus of the present invention. The virtual power plant optimizing operation device 11 comprises a construction unit 110, a creation unit 111, an updating unit 112, a training unit 113, an input unit 114 and an optimizing unit 115. The module/unit referred to in the present invention refers to a series of computer program segments, which are stored in a memory, capable of being executed by a processor and of performing a fixed function. In the present embodiment, the functions of the respective modules/units will be described in detail in the following embodiments.
The construction unit 110 is configured to construct an objective function and a constraint condition, and construct an initial virtual power plant model according to the objective function and the constraint condition;
the creating unit 111 is configured to create a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a markov decision model and a TD3 algorithm, so as to obtain an intermediate virtual power plant model;
the updating unit 112 is configured to update the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
The training unit 113 is configured to perform optimization training on the model to be trained based on a priority experience storage policy, so as to obtain an optimized target virtual power plant model;
The input unit 114 is configured to obtain current power data, and input the power data to the target virtual power plant model;
the optimizing unit 115 is configured to generate a virtual power plant optimizing operation policy according to the output data of the target virtual power plant model, and execute the virtual power plant optimizing operation policy.
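The division into six units amounts to a linear pipeline in which each unit consumes the artifact produced by the previous one. The following sketch only illustrates that data flow: the unit bodies are placeholders, and the names `make_unit`, `UNITS` and `run` are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Context = Dict[str, Any]
Unit = Callable[[Context], Context]


def make_unit(name: str, produces: str) -> Unit:
    """Build a placeholder unit that records which artifact it would produce."""
    def unit(ctx: Context) -> Context:
        out = dict(ctx)
        out[produces] = f"<{produces} from {name}>"
        out["trace"] = ctx.get("trace", []) + [name]
        return out
    return unit


# One placeholder per unit of the apparatus, in execution order.
UNITS: List[Unit] = [
    make_unit("construction_unit_110", "initial_model"),
    make_unit("creation_unit_111", "intermediate_model"),
    make_unit("updating_unit_112", "model_to_be_trained"),
    make_unit("training_unit_113", "target_model"),
    make_unit("input_unit_114", "model_output"),
    make_unit("optimizing_unit_115", "operation_strategy"),
]


def run(units: List[Unit], ctx: Optional[Context] = None) -> Context:
    """Run the units in order, threading the context through each one."""
    ctx = dict(ctx or {})
    for unit in units:
        ctx = unit(ctx)
    return ctx
```

Calling `run(UNITS)` yields a context whose `trace` lists the six units in order and whose final artifact is the operation strategy.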
According to the above technical solution, an initial virtual power plant model is constructed from a self-constructed objective function and constraint conditions; a state space, an action space and a reward function are created based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model; the intermediate virtual power plant model is updated based on the noise mechanism and the attention mechanism to obtain a model to be trained; the model to be trained is optimized and trained based on the priority experience storage strategy to obtain the optimized target virtual power plant model; and current power data is input into the target virtual power plant model to generate and execute a virtual power plant optimizing operation strategy. The resulting operation strategy is more stable and can effectively reduce the electric power cost.
FIG. 5 is a schematic diagram of a computer device for implementing a preferred embodiment of the method for optimizing operation of a virtual power plant according to the present invention.
The computer device 1 may comprise a memory 12, a processor 13 and a bus, and may further comprise a computer program stored in the memory 12 and executable on the processor 13, such as a virtual power plant optimization run program.
It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the computer device 1 and does not constitute a limitation of the computer device 1. The computer device 1 may have a bus-type structure or a star-type structure, and may comprise more or fewer hardware or software components than illustrated, or a different arrangement of components; for example, the computer device 1 may further comprise an input-output device, a network access device, etc.
It should be noted that the computer device 1 is only an example; other existing or future electronic products that are applicable to the present invention are also included within the scope of protection of the present invention and are incorporated herein by reference.
The memory 12 includes at least one type of readable storage medium, including flash memory, a removable hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 12 may be an internal storage unit of the computer device 1, such as a removable hard disk of the computer device 1. In other embodiments, the memory 12 may also be an external storage device of the computer device 1, such as a plug-in removable hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the computer device 1. Further, the memory 12 may include both an internal storage unit and an external storage device of the computer device 1. The memory 12 may be used not only to store application software installed on the computer device 1 and various types of data, such as the code of the virtual power plant optimization running program, but also to temporarily store data that has been output or is to be output.
The processor 13 may in some embodiments be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, various control chips, and the like. The processor 13 is the control unit (Control Unit) of the computer device 1; it connects the components of the entire computer device 1 using various interfaces and lines, runs or executes programs or modules stored in the memory 12 (for example, the virtual power plant optimization running program), and invokes data stored in the memory 12 to perform the various functions of the computer device 1 and to process data.
The processor 13 executes the operating system of the computer device 1 and various types of applications installed. The processor 13 executes the application program to implement the steps of the various virtual power plant optimization operation method embodiments described above, such as the steps shown in fig. 1.
Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory 12 and executed by the processor 13 to complete the present invention. The one or more modules/units may be a series of computer readable instruction segments capable of performing the specified functions, which instruction segments describe the execution of the computer program in the computer device 1. For example, the computer program may be divided into a construction unit 110, a creation unit 111, an update unit 112, a training unit 113, an input unit 114, an optimization unit 115.
The integrated units implemented in the form of software functional modules described above may be stored in a computer readable storage medium. The software functional module is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a computer device, or a network device, etc.) or a processor to execute part of the virtual power plant optimization operation method according to the embodiments of the present invention.
The modules/units integrated in the computer device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on this understanding, the present invention may also be implemented by a computer program for instructing a relevant hardware device to implement all or part of the procedures of the above-mentioned embodiment method, where the computer program may be stored in a computer readable storage medium and the computer program may be executed by a processor to implement the steps of each of the above-mentioned method embodiments.
Wherein the computer program comprises computer program code which may be in source code form, object code form, executable file or some intermediate form etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory, or the like.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The blockchain (Blockchain), essentially a de-centralized database, is a string of data blocks that are generated in association using cryptographic methods, each of which contains information from a batch of network transactions for verifying the validity (anti-counterfeit) of its information and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
The bus may be a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, etc. For ease of illustration, only one straight line is shown in fig. 5, but this does not mean that there is only one bus or one type of bus. The bus is arranged to enable connection and communication between the memory 12 and the at least one processor 13, etc.
Although not shown, the computer device 1 may further comprise a power source (such as a battery) for powering the various components, preferably the power source may be logically connected to the at least one processor 13 via a power management means, whereby the functions of charge management, discharge management, and power consumption management are achieved by the power management means. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure detection circuit, power converter or inverter, power status indicator, etc. The computer device 1 may further include various sensors, bluetooth modules, wi-Fi modules, etc., which will not be described in detail herein.
Further, the computer device 1 may also comprise a network interface, optionally comprising a wired interface and/or a wireless interface (e.g. WI-FI interface, bluetooth interface, etc.), typically used for establishing a communication connection between the computer device 1 and other computer devices.
The computer device 1 may optionally further comprise a user interface, which may be a display (Display) or an input unit such as a keyboard (Keyboard); optionally, the user interface may also be a standard wired interface or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, as appropriate, and is used for displaying information processed in the computer device 1 and for displaying a visual user interface.
It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited by this configuration.
Fig. 5 shows only a computer device 1 with components 12-13, it will be understood by those skilled in the art that the structure shown in fig. 5 is not limiting of the computer device 1 and may include fewer or more components than shown, or may combine certain components, or a different arrangement of components.
In connection with fig. 1, the memory 12 in the computer device 1 stores a plurality of instructions for implementing a virtual power plant optimized operation method, and the processor 13 can execute the plurality of instructions to implement:
constructing an objective function and constraint conditions, and constructing an initial virtual power plant model according to the objective function and the constraint conditions;
creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a Markov decision model and a TD3 algorithm to obtain an intermediate virtual power plant model;
Updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
optimizing and training the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;
acquiring current power data, and inputting the power data into the target virtual power plant model;
and generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model, and executing the virtual power plant optimizing operation strategy.
Specifically, the specific implementation method of the above instructions by the processor 13 may refer to the description of the relevant steps in the corresponding embodiment of fig. 1, which is not repeated herein.
The data in this case were obtained legally.
In the several embodiments provided in the present invention, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The invention is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. The units or means stated in the invention may also be implemented by one unit or means, either by software or hardware. The terms first, second, etc. are used to denote a name, but not any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.
Claims (4)
1. A virtual power plant optimizing operation method, characterized by comprising the following steps:
constructing an objective function and constraint conditions, and constructing an initial virtual power plant model according to the objective function and the constraint conditions;
creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a Markov decision model and a TD3 algorithm to obtain an intermediate virtual power plant model;
Updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
optimizing and training the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;
acquiring current power data, and inputting the power data into the target virtual power plant model;
Generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model, and executing the virtual power plant optimizing operation strategy;
Wherein constructing the objective function and the constraint conditions comprises:
Taking the minimum total cost in the virtual power plant operation period as the objective function on the premise of meeting the constraint condition;
The total cost is the sum of wind power generation cost, gas unit cost, energy storage cost and electricity selling cost of an electric power market;
The constraint conditions comprise virtual power plant power balance constraint, energy storage equipment charge-discharge constraint, wind turbine generator set output constraint and gas turbine generator set operation constraint;
Wherein the objective function is expressed as follows:
$$\min C = C_{WT} + C_{GT} + C_{ES} + C_{M}$$;
Wherein:
$$C_{WT} = \sum_{t=1}^{T} \sum_{k=1}^{N_{WT}} c_{WT}\, P^{WT}_{k,t}\, \Delta t$$;
$$C_{GT} = \sum_{t=1}^{T} \sum_{i=1}^{N_{GT}} \frac{c_{gas}\, P^{GT}_{i,t}}{\eta_{GT}\, L_{HV}}\, \Delta t$$;
$$C_{ES} = \sum_{t=1}^{T} \sum_{n=1}^{N_{ES}} c_{ES}\, \left| P^{ES}_{n,t} \right| \Delta t$$;
$$C_{M} = \sum_{t=1}^{T} \rho_{t}\, P^{M}_{t}\, \Delta t$$;
Wherein min C represents the smallest of said total costs, $C_{WT}$ represents the wind power generation cost, $C_{GT}$ represents the gas unit cost, $C_{ES}$ represents the energy storage cost, and $C_{M}$ represents the electricity market electricity selling cost; $N_{WT}$ represents the number of fans, $N_{GT}$ represents the number of gas turbines, and $N_{ES}$ represents the number of energy storage devices; $c_{WT}$ represents the running cost of power generation of a single fan, and $P^{WT}_{k,t}$ represents the output of the kth fan at time t; $c_{gas}$ represents the unit price of natural gas; $L_{HV}$ represents the natural gas low heating value, $\eta_{GT}$ represents the power generation efficiency, and $P^{GT}_{i,t}$ represents the output of the ith gas turbine at time t; $c_{ES}$ represents the cost coefficient of charge and discharge of a single energy storage device, and $P^{ES}_{n,t}$ represents the charge and discharge power of the nth energy storage unit at time t; $\rho_{t}$ represents the market price at time t, $P^{M}_{t}$ represents the market electric energy trading quantity at time t, $\Delta t$ represents the time variation quantity, and T represents the operation period of the virtual power plant;
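the virtual power plant power balance constraint is expressed as follows:
$$\sum_{k=1}^{N_{w}} P^{w}_{k,t} + \sum_{i=1}^{N_{g}} P^{g}_{i,t} + \sum_{n=1}^{N_{es}} P^{es}_{n,t} + P^{m}_{t} = \sum_{m=1}^{N_{L}} P^{L}_{m,t}$$;
$$\sum_{i=1}^{N_{g}} H^{g}_{i,t} = \sum_{m=1}^{N_{L}} H^{L}_{m,t}$$;
wherein $N_{L}$ represents the number of loads; $P^{L}_{m,t}$ represents the mth user load power at time t; $H^{g}_{i,t}$ represents the heat generation power of the ith gas turbine at time t; $H^{L}_{m,t}$ represents the mth user thermal load at time t; and $P^{w}_{k,t}$, $P^{g}_{i,t}$, $P^{es}_{n,t}$ and $P^{m}_{t}$ are the fan output, the gas turbine output, the energy storage charge-discharge power and the market electric energy trading quantity at time t, summed over the $N_{w}$ fans, $N_{g}$ gas turbines and $N_{es}$ energy storage devices;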
the energy storage device charge-discharge constraints are expressed as follows:
$$0 \le P^{ch}_{t} \le P^{ch}_{\max}$$;
$$0 \le P^{dis}_{t} \le P^{dis}_{\max}$$;
$$SOC^{\min}_{J} \le SOC_{J} \le SOC^{\max}_{J}$$;
wherein $P^{ch}_{t}$ represents the charging power of the energy storage device at time t, $P^{dis}_{t}$ represents the discharging power of the energy storage device at time t, $P^{ch}_{\max}$ represents the maximum charging power of the energy storage device, $P^{dis}_{\max}$ represents the maximum discharging power of the energy storage device, $SOC_{J}$ represents the SOC value of the energy storage in period J, $SOC^{\max}_{J}$ represents the upper limit of the SOC value of the energy storage in period J, and $SOC^{\min}_{J}$ represents the lower limit of the SOC value of the energy storage in period J;
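The output constraint of the wind turbine generator is expressed as follows:
$$0 \le P^{w}_{k,t} \le P^{w}_{\max}$$;
wherein $P^{w}_{k,t}$ is the output of the kth fan at time t and $P^{w}_{\max}$ represents the maximum value of the actual output of the wind turbine generator;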
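the gas unit operation constraints are expressed as follows:
$$P^{g}_{\min} \le P^{g}_{i,t} \le P^{g}_{\max}$$;
$$H^{g}_{i,t} = \frac{P^{g}_{i,t} \left( 1 - \eta_{g} - \eta_{loss} \right)}{\eta_{g}}$$;
wherein $P^{g}_{i,t}$ represents the power generation of the gas unit, $P^{g}_{\max}$ represents the upper output limit of the gas unit, $P^{g}_{\min}$ represents the lower output limit of the gas unit, $\eta_{loss}$ represents the heat dissipation loss rate, $\eta_{g}$ is the power generation efficiency, and $H^{g}_{i,t}$ is the heat generation power of the ith gas turbine at time t;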
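The bound-type constraints listed above (storage charge and discharge power, SOC limits, wind output, gas unit output) can be checked with a small helper. The following is a sketch under assumptions: the dictionary keys are hypothetical, the lower bound of zero on wind, charging and discharging power is assumed, and the power and heat balance equalities are left out for brevity.

```python
def bound_excess(value, lower, upper):
    """Amount by which value falls outside [lower, upper]; 0.0 if inside."""
    return max(lower - value, 0.0) + max(value - upper, 0.0)

def constraint_violation(step, limits):
    """Total bound violation for one time step.

    step and limits are plain dicts; all key names are illustrative only.
    """
    return (
        bound_excess(step["p_wind"], 0.0, limits["p_wind_max"])
        + bound_excess(step["p_gas"], limits["p_gas_min"], limits["p_gas_max"])
        + bound_excess(step["p_charge"], 0.0, limits["p_charge_max"])
        + bound_excess(step["p_discharge"], 0.0, limits["p_discharge_max"])
        + bound_excess(step["soc"], limits["soc_min"], limits["soc_max"])
    )
```

A return value of zero means every bound is respected; any positive value can be fed into a penalty term.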
Wherein creating the state space, the action space and the reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain the intermediate virtual power plant model comprises:
acquiring an energy storage SOC value, the maximum value of the actual output of the wind turbine generator, the mth user thermal load at the moment t, the mth user load power at the moment t and the market electricity price at the moment t as the state space;
Acquiring the output of the ith gas turbine at the moment t, the charge and discharge power of the nth energy storage unit at the moment t, the market electric energy trading volume at the moment t and the output of the kth fan at the moment t as the action space;
Acquiring a state of the agent corresponding to the initial virtual power plant model, and constructing the reward function from the total cost of any action taken in that state and a penalty coefficient for exceeding the constraints;
Updating the initial virtual power plant model based on the state space, the action space and the reward function to obtain the intermediate virtual power plant model;
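wherein the state space is represented as follows:
$$S_{t} = \left\{ SOC_{t},\ P^{w}_{\max},\ H^{L}_{m,t},\ P^{L}_{m,t},\ \rho_{t} \right\}$$;
$S_{t}$ represents the state space; $SOC_{t}$ represents the SOC value of the stored energy; the remaining elements are, in order, the maximum value of the actual output of the wind turbine generator, the mth user thermal load at time t, the mth user load power at time t and the market electricity price at time t;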
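wherein the action space is represented as follows:
$$A_{t} = \left\{ P^{g}_{i,t},\ P^{es}_{n,t},\ P^{m}_{t},\ P^{w}_{k,t} \right\}$$;
$A_{t}$ represents the action space, whose elements are, in order, the output of the ith gas turbine, the charge-discharge power of the nth energy storage unit, the market electric energy trading quantity and the output of the kth fan at time t;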
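wherein the reward function is expressed as follows:
$$r_{t} = -\left[ C(s_{t}, a_{t}) + \sigma \right]$$;
$r_{t}$ denotes the reward function; $C(s_{t}, a_{t})$ represents the total cost of the system when the action taken in state $s_{t}$ is $a_{t}$, and $\sigma$ represents the penalty coefficient for exceeding the constraints;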
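Because the objective is cost minimization, a reward of this kind is typically the negative of the step cost with a penalty applied when a constraint is exceeded. A minimal sketch, assuming an additive fixed penalty (the function name, the `violated` flag and the default penalty value are hypothetical, not taken from the patent):

```python
def reward(step_cost, violated, penalty=100.0):
    """Negative total cost, reduced further by a penalty if any constraint is exceeded."""
    # Assumption: the penalty is a constant added to the cost before negation.
    return -(step_cost + (penalty if violated else 0.0))
```

A lower cost therefore yields a higher reward, and infeasible actions are discouraged without being forbidden outright.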
wherein updating the intermediate virtual power plant model based on the noise mechanism and the attention mechanism to obtain the model to be trained comprises:
configuring target parameters;
Generating a random number based on the Logistic mapping;
Comparing the target parameter with the random number to obtain a comparison result;
When the comparison result is that the target parameter is greater than or equal to the random number, adding noise into an action network of the intermediate virtual power plant model; or when the comparison result is that the target parameter is smaller than the random number, no noise is added to the action network;
acquiring an attention mechanism network, and updating the action network based on the attention mechanism network to obtain the model to be trained; wherein the attention mechanism network consists of a fully connected layer and a Softmax activation function;
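The claim states only that the attention mechanism network consists of a fully connected layer and a Softmax activation; how its output is wired into the action network is not specified. A minimal NumPy sketch, assuming the attention weights rescale the input features element-wise before they reach the action network (`attention_reweight`, `weight` and `bias` are hypothetical names):

```python
import numpy as np

def softmax(z):
    """Numerically stable Softmax over the last axis."""
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def attention_reweight(features, weight, bias):
    """Score features with one fully connected layer, normalize with Softmax, rescale.

    features: (batch, d) state features
    weight:   (d, d) fully connected layer weights
    bias:     (d,) fully connected layer bias
    Returns the reweighted features and the attention weights.
    """
    scores = features @ weight + bias
    alpha = softmax(scores)
    # Assumption: attention acts as an element-wise gate on the features.
    return alpha * features, alpha
```

In a full TD3 setup the `weight` and `bias` arrays would be trainable parameters updated together with the action network.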
wherein the target parameters are expressed as follows:
;
wherein λ represents the target parameter; the remaining symbols represent a first coefficient adjusting the target parameter λ, a second coefficient adjusting the target parameter λ, and the average reward value over the virtual power plant operating period;
the random number $x_{j+1}$ is expressed as follows:
$$x_{j+1} = y\, x_{j} \left( 1 - x_{j} \right)$$;
wherein $x_{j}$ represents an iteration sequence value of the Logistic map; y represents the bifurcation parameter;
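The noise gate described above (draw a number from the Logistic map, compare it with the target parameter, add noise only when the target parameter is greater than or equal to it) can be sketched as follows. The standard Logistic map recurrence is used; the function names, the initial value and the choice y = 4.0 are assumptions, and the target parameter is passed in as a plain number because its formula is not reproduced in the text.

```python
def logistic_sequence(x0, y=4.0):
    """Yield successive Logistic map values x <- y * x * (1 - x)."""
    x = x0
    while True:
        x = y * x * (1.0 - x)
        yield x

def maybe_add_noise(action, target_param, random_number, noise):
    """Add noise to the action only if target_param >= random_number."""
    if target_param >= random_number:
        return action + noise
    return action
```

A typical use is to create one generator per training run, call `next()` once per step, and pass the result together with the current target parameter to `maybe_add_noise`.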
Wherein performing optimization training on the model to be trained based on the priority experience storage strategy to obtain the optimized target virtual power plant model comprises:
in each round of training process, acquiring the current virtual power plant environment state of the current round and the current action obtained through the action network;
detecting whether noise is added to the action network based on the noise mechanism to obtain random actions;
Executing the random action based on the current virtual power plant environment state to obtain a current reward value and a virtual power plant environment state at the next moment;
Determining the current experience value as excellent experience if the explored current experience value is larger than the average value of all the explored experience values except the current experience value in a specified period, and adding the current experience value into an experience replay pool; or if the current experience value is smaller than or equal to the average value, determining whether to add the current experience value to the experience replay pool according to a preset probability;
randomly extracting experience samples from the experience replay pool, and inputting the experience samples into the action network to obtain random actions at the next moment;
Acquiring the current value of a Q function of a current-round model corresponding to the model to be trained, and updating model parameters of the current-round model according to the current value of the Q function;
And stopping training when detecting that the value of the Q function reaches the maximum in any round of training, and determining a model obtained by the round of training as the target virtual power plant model.
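The priority experience storage rule in the steps above (always keep an experience whose value beats the average of the other recent values, otherwise keep it with a preset probability) can be sketched with the standard library. The class name, the window length standing in for the "specified period", and the default probability are all hypothetical.

```python
import random
from collections import deque

class SelectiveReplayPool:
    """Replay pool that filters experiences at insertion time."""

    def __init__(self, capacity=10000, window=100, keep_prob=0.3, rng=None):
        self.pool = deque(maxlen=capacity)
        self.recent = deque(maxlen=window)  # values seen in the specified period
        self.keep_prob = keep_prob
        self.rng = rng or random.Random()

    def offer(self, transition, value):
        """Store transition if value exceeds the recent mean, else with keep_prob."""
        if self.recent:
            mean = sum(self.recent) / len(self.recent)  # excludes the current value
            keep = value > mean or self.rng.random() < self.keep_prob
        else:
            keep = True  # assumption: nothing to compare against yet
        self.recent.append(value)
        if keep:
            self.pool.append(transition)
        return keep

    def sample(self, batch_size):
        """Uniformly sample up to batch_size stored transitions."""
        return self.rng.sample(list(self.pool), min(batch_size, len(self.pool)))
```

Filtering at insertion time keeps the sampling step uniform, which matches the "randomly extracting experience samples" step; this differs from prioritized experience replay, where the priority is applied at sampling time.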
2. A virtual power plant optimizing operation apparatus that performs the virtual power plant optimizing operation method according to claim 1, characterized in that the virtual power plant optimizing operation apparatus comprises:
the construction unit is used for constructing an objective function and constraint conditions and constructing an initial virtual power plant model according to the objective function and the constraint conditions;
The establishing unit is used for establishing a state space, an action space and a reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model;
The updating unit is used for updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
the training unit is used for carrying out optimization training on the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;
the input unit is used for acquiring current power data and inputting the power data into the target virtual power plant model;
and the optimizing unit is used for generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model and executing the virtual power plant optimizing operation strategy.
3. A computer device, the computer device comprising:
A memory storing at least one instruction; and
A processor executing instructions stored in the memory to implement the virtual power plant optimization operating method of claim 1.
4. A computer-readable storage medium, characterized by: the computer readable storage medium having stored therein at least one instruction for execution by a processor in a computer device to implement the virtual power plant optimized operation method of claim 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410028978.2A CN117541030B (en) | 2024-01-09 | 2024-01-09 | Virtual power plant optimized operation method, device, equipment and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410028978.2A CN117541030B (en) | 2024-01-09 | 2024-01-09 | Virtual power plant optimized operation method, device, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117541030A (en) | 2024-02-09 |
CN117541030B (en) | 2024-04-26 |
Family
ID=89794211
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410028978.2A Active CN117541030B (en) | 2024-01-09 | 2024-01-09 | Virtual power plant optimized operation method, device, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117541030B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113326994A (en) * | 2021-07-06 | 2021-08-31 | 华北电力大学 | Virtual power plant energy collaborative optimization method considering source load storage interaction |
CN114036825A (en) * | 2021-10-27 | 2022-02-11 | 南方电网科学研究院有限责任公司 | Collaborative optimization scheduling method, device, equipment and storage medium for multiple virtual power plants |
CN115423207A (en) * | 2022-09-26 | 2022-12-02 | 中国长江三峡集团有限公司 | Wind storage virtual power plant online scheduling method and device |
CN115663804A (en) * | 2022-11-02 | 2023-01-31 | 深圳先进技术研究院 | Electric power system regulation and control method based on deep reinforcement learning |
CN115879983A (en) * | 2023-02-07 | 2023-03-31 | 长园飞轮物联网技术(杭州)有限公司 | Virtual power plant scheduling method and system |
CN116914732A (en) * | 2023-07-13 | 2023-10-20 | 广东工业大学 | Deep reinforcement learning-based low-carbon scheduling method and system for cogeneration system |
CN117291095A (en) * | 2023-09-15 | 2023-12-26 | 国网上海能源互联网研究院有限公司 | Collaborative interaction method, device, equipment and medium for virtual power plant and power distribution network |
Also Published As
Publication number | Publication date |
---|---|
CN117541030A (en) | 2024-02-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110705863A (en) | Energy optimization scheduling device, equipment and medium | |
Zhang et al. | Semi-asynchronous personalized federated learning for short-term photovoltaic power forecasting | |
Bertini et al. | Soft computing based optimization of combined cycled power plant start-up operation with fitness approximation methods | |
Liang et al. | An energy-aware resource deployment algorithm for cloud data centers based on dynamic hybrid machine learning | |
Liao et al. | Energy consumption optimization scheme of cloud data center based on SDN | |
CN117522087B (en) | Virtual power plant resource allocation method, device, equipment and medium | |
Wang et al. | Research on short‐term and mid‐long term optimal dispatch of multi‐energy complementary power generation system | |
CN117541030B (en) | Virtual power plant optimized operation method, device, equipment and medium | |
CN115528750B (en) | Power grid safety and stability oriented data model hybrid drive unit combination method | |
CN116401602A (en) | Event detection method, device, equipment and computer readable medium | |
Zhang et al. | Short‐Term Power Load Forecasting Model Design Based on EMD‐PSO‐GRU | |
Lv et al. | Exponential hybrid mutation differential evolution for economic dispatch of large-scale power systems considering valve-point effects | |
Sun et al. | HEFT-dynamic scheduling algorithm in workflow scheduling | |
CN113869944A (en) | Revenue prediction method and device based on machine learning and readable storage medium | |
CN112381333A (en) | Micro-grid optimization method based on distributed improved bat algorithm | |
Yin et al. | Deep neural network accelerated-group African vulture optimization algorithm for unit commitment considering uncertain wind power | |
CN114997659B (en) | Resource scheduling model construction method and system based on dynamic multi-objective optimization | |
Mackie et al. | Reinforcement learning based load balancing for geographically distributed data centres | |
CN116629596B (en) | Supply chain risk prediction method, device, equipment and medium | |
CN118572795B (en) | Micro-grid group optimal scheduling method and system based on MADDPG and pareto front edge combination | |
CN118300195B (en) | Intelligent monitoring and scheduling method for micro-grid | |
CN115964499B (en) | Knowledge graph-based social management event mining method and device | |
CN118281959B (en) | Virtual power-based micro-grid distributed regulation and control method | |
CN118300207A (en) | Control method and device for hydrogen-electricity hybrid energy supply system and electronic equipment | |
CN118761575A (en) | Virtual power plant cluster resource replacement method and device, electronic equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||