CN117541030A - Virtual power plant optimized operation method, device, equipment and medium - Google Patents

Info

Publication number
CN117541030A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410028978.2A
Other languages
Chinese (zh)
Other versions
CN117541030B (en)
Inventor
屈蓉
李任戈
张欣
唐琛捷
沈旺旺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Science and Industry Corp Ltd
Original Assignee
China Construction Science and Industry Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Science and Industry Corp Ltd
Priority to CN202410028978.2A
Publication of CN117541030A
Application granted
Publication of CN117541030B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0631Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0499Feedforward networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067Enterprise or organisation modelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply

Abstract

The invention relates to the technical field of artificial intelligence, and provides a virtual power plant optimizing operation method, device, equipment and medium.

Description

Virtual power plant optimized operation method, device, equipment and medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a virtual power plant optimized operation method, device, equipment and medium.
Background
With the rapid growth of renewable and distributed energy sources, electrical power systems face increasingly complex management challenges. As an intelligent system that integrates various energy resources, the virtual power plant enables efficient operation of the power system through optimal scheduling. Research on the optimal scheduling of virtual power plants not only helps improve the economy of the power system and reduce energy production costs, but also helps cope with the fluctuation and uncertainty of renewable energy sources and improves the reliability and stability of the power grid.
However, existing approaches to the optimal scheduling of virtual power plants have the following drawbacks:
(1) They require high prediction accuracy for renewable energy output and load;
(2) The action space supports only discrete operations, so the action quantity of each unit must be discretized to suit the algorithm, which greatly reduces the range of selectable actions;
(3) The state and action value functions are prone to overestimation, which can invalidate the strategy learned by the model.
In view of the foregoing, there is a need for a more stable and efficient scheme for the optimized operation of a virtual power plant.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a method, apparatus, device and medium for optimizing operation of a virtual power plant, which are capable of reasonably and stably optimizing the operation of the virtual power plant.
A virtual power plant optimization operation method, the virtual power plant optimization operation method comprising:
constructing an objective function and constraint conditions, and constructing an initial virtual power plant model according to the objective function and the constraint conditions;
creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a Markov decision model and a TD3 algorithm to obtain an intermediate virtual power plant model;
updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
optimizing and training the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;
acquiring current power data, and inputting the power data into the target virtual power plant model;
and generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model, and executing the virtual power plant optimizing operation strategy.
According to a preferred embodiment of the present invention, the constructing objective functions and constraints includes:
taking the minimum total cost in the virtual power plant operation period as the objective function on the premise of meeting the constraint condition;
the total cost is the sum of wind power generation cost, gas unit cost, energy storage cost and electricity selling cost of an electric power market;
the constraint conditions comprise virtual power plant power balance constraint, energy storage equipment charge and discharge constraint, wind turbine generator set output constraint and gas turbine generator set operation constraint.
According to a preferred embodiment of the invention:
the objective function is expressed as follows:
$$\min C = \sum_{t=1}^{T} \left[ C_{wt}(t) + C_{gas}(t) + C_{es}(t) + C_{market}(t) \right]$$
wherein $\min C$ represents the minimum of the total cost; $C_{wt}(t)$ represents the wind power generation cost; $C_{gas}(t)$ represents the cost of the gas turbine units; $C_{es}(t)$ represents the energy storage cost; $C_{market}(t)$ represents the electricity cost of the electricity market; $N_{wt}$ represents the number of fans; $N_{gas}$ represents the number of gas turbines; $N_{es}$ represents the number of energy storage devices; $c_{wt}$ represents the operating cost of power generation of a single fan, which applies to the output of the k-th fan at time t; $c_{CH4}$ represents the unit price of natural gas; $L_{HVNG}$ represents the low calorific value of natural gas; $\eta_{gas}$ represents the power generation efficiency; $P_{gas,i,t}$ represents the output of the i-th gas turbine at time t; $c_{es}$ represents the charge and discharge cost coefficient of a single energy storage device; $P_{es,n,t}$ represents the charge and discharge power of the n-th energy storage unit at time t; $c_{market,t}$ represents the market electricity price at time t; $P_{market,t}$ represents the market electric energy trading quantity at time t; $\Delta t$ represents the time interval; and $T$ represents the operation period of the virtual power plant;
the virtual power plant power balance constraint is expressed as follows:
wherein $N_{load}$ represents the number of loads; $P_{load,m,t}$ represents the load power of the m-th user at time t; and the remaining terms represent the heat generation power of the i-th gas turbine at time t and the thermal load of the m-th user at time t;
the energy storage device charge-discharge constraints are expressed as follows:
wherein the terms represent, in order: the charging power of the energy storage device at time t; the discharging power of the energy storage device at time t; the maximum charging power of the energy storage device; the maximum discharging power of the energy storage device; the SOC value of the energy storage in period J; the upper limit of the SOC value of the energy storage in period J; and the lower limit of the SOC value of the energy storage in period J;
the output constraint of the wind turbine generator is expressed as follows:
wherein the constraint uses the maximum value of the actual output of the wind turbine generator;
the gas unit operation constraints are expressed as follows:
wherein the terms represent, in order: the power generation of the gas unit; the upper limit of the output of the gas unit; the lower limit of the output of the gas unit; and the heat dissipation loss rate.
According to a preferred embodiment of the present invention, the creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model includes:
acquiring the energy storage SOC value, the maximum value of the actual output of the wind turbine generator, the thermal load of the m-th user at time t, the load power of the m-th user at time t and the market electricity price at time t as the state space;
acquiring the output of the i-th gas turbine at time t, the charge and discharge power of the n-th energy storage unit at time t, the market electric energy trading volume at time t and the output of the k-th fan at time t as the action space;
acquiring a state of the agent corresponding to the initial virtual power plant model, and constructing the reward function from the total cost of taking any action in that state and the penalty coefficient for exceeding the constraints;
updating the initial virtual power plant model based on the state space, the action space and the reward function to obtain the intermediate virtual power plant model;
wherein the state space $S_t$ is expressed as follows:
wherein SOC represents the SOC value of the stored energy;
wherein the action space $a_t$ is expressed as follows:
wherein the reward function $r_t$ is expressed as follows:
wherein $S_t$ represents a state of the agent corresponding to the initial virtual power plant model; the cost term represents the total system cost when action $a_t$ is taken in state $S_t$; and the penalty term represents the penalty coefficient for exceeding the constraints.
According to a preferred embodiment of the present invention, the updating the intermediate virtual power plant model based on the noise mechanism and the attention mechanism, to obtain the model to be trained includes:
configuring target parameters;
generating a random number based on the Logistic mapping;
comparing the target parameter with the random number to obtain a comparison result;
when the comparison result is that the target parameter is greater than or equal to the random number, adding noise into an action network of the intermediate virtual power plant model; or when the comparison result is that the target parameter is smaller than the random number, no noise is added to the action network;
Acquiring an attention mechanism network, and updating the action network based on the attention mechanism network to obtain the model to be trained; wherein the attention mechanism network consists of a fully connected layer and a Softmax activation function.
According to a preferred embodiment of the invention:
the target parameter $\lambda$ is expressed as follows:
wherein the two coefficients are a first coefficient and a second coefficient for adjusting the target parameter $\lambda$; $r_t$ represents the reward value at time t; and the average term represents the average reward value over the virtual power plant operating period;
the random number $L_{n+1}$ is expressed as follows:
$$L_{n+1} = Y L_n (1 - L_n)$$
wherein $L_n$ represents an iteration sequence value of the Logistic map; and $Y$ represents the bifurcation parameter.
According to a preferred embodiment of the present invention, the optimizing training the model to be trained based on the priority experience storage policy, and obtaining the optimized target virtual power plant model includes:
in each round of training process, acquiring the current virtual power plant environment state of the current round and the current action obtained through the action network;
detecting whether noise is added to the action network based on the noise mechanism to obtain random actions;
executing the random action based on the current virtual power plant environment state to obtain a current rewarding value and a virtual power plant environment state at the next moment;
if the current experience value found during exploration is larger than the average of all other experience values found within a specified period, determining the current experience as an excellent experience and adding it to the experience replay pool; or if the current experience value is smaller than or equal to the average, determining whether to add it to the experience replay pool according to a preset probability;
randomly extracting experience samples from the experience replay pool, and inputting the experience samples into the action network to obtain the random action at the next moment;
acquiring the current value of the Q function of the current-round model corresponding to the model to be trained, and updating the model parameters of the current-round model according to the current value of the Q function;
and stopping training when the value of the Q function is detected to reach its maximum in any round of training, and determining the model obtained in that round as the target virtual power plant model.
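The experience storage rule in the training steps above can be sketched in Python as follows. This is a minimal illustration rather than the claimed implementation: it assumes that the "experience value" is a scalar score such as the reward, that the "specified period" is a sliding window of recent values, and that the first experience is always stored. The names `PriorityExperienceStore`, `offer`, `sample`, `window` and `keep_prob` are hypothetical.

```python
import random
from collections import deque


class PriorityExperienceStore:
    """Stores above-average experiences always, others with a preset probability."""

    def __init__(self, capacity=10000, window=100, keep_prob=0.3, rng=None):
        self.pool = deque(maxlen=capacity)        # experience replay pool
        self.recent_values = deque(maxlen=window)  # values seen in the specified period
        self.keep_prob = keep_prob                 # preset probability for ordinary experiences
        self.rng = rng or random.Random()

    def offer(self, transition, value):
        """Decide whether to store a transition; returns True if it was stored."""
        if self.recent_values:
            mean = sum(self.recent_values) / len(self.recent_values)
            excellent = value > mean
        else:
            excellent = True  # assumption: nothing to compare against yet
        stored = excellent or self.rng.random() < self.keep_prob
        if stored:
            self.pool.append(transition)
        # The current value joins the window only after the comparison,
        # so it is excluded from its own average.
        self.recent_values.append(value)
        return stored

    def sample(self, batch_size):
        """Draw a uniform random batch from the replay pool."""
        items = list(self.pool)
        return self.rng.sample(items, min(batch_size, len(items)))
```

With `keep_prob=0.0` only above-average experiences are kept, which makes the rule easy to check by hand; a larger value keeps more ordinary experiences in the pool.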
A virtual power plant optimal operation device, the virtual power plant optimal operation device comprising:
the construction unit is used for constructing an objective function and constraint conditions and constructing an initial virtual power plant model according to the objective function and the constraint conditions;
The establishing unit is used for establishing a state space, an action space and a reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model;
the updating unit is used for updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
the training unit is used for carrying out optimization training on the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;
the input unit is used for acquiring current power data and inputting the power data into the target virtual power plant model;
and the optimizing unit is used for generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model and executing the virtual power plant optimizing operation strategy.
A computer device, the computer device comprising:
a memory storing at least one instruction; and
And the processor executes the instructions stored in the memory to realize the virtual power plant optimized operation method.
A computer-readable storage medium having stored therein at least one instruction for execution by a processor in a computer device to implement the virtual power plant optimized operation method.
According to the technical scheme, an initial virtual power plant model is built from a self-built objective function and constraint conditions; a state space, an action space and a reward function are created based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model; the intermediate virtual power plant model is updated based on the noise mechanism and the attention mechanism; and the model to be trained is optimized and trained based on the priority experience storage strategy to obtain the optimized target virtual power plant model. The current power data is then input into the target virtual power plant model, and a virtual power plant optimized operation strategy is generated and executed. The resulting operation strategy is more stable and can effectively reduce the power cost.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the virtual power plant optimization method of the present invention.
Fig. 2 is a power supply and demand balance diagram of the present invention.
FIG. 3 is a comparison of the virtual power plant optimization method of the present invention with other optimization methods.
FIG. 4 is a functional block diagram of a preferred embodiment of the virtual power plant optimized operation apparatus of the present invention.
FIG. 5 is a schematic diagram of a computer device implementing a preferred embodiment of the method of optimizing operation of a virtual power plant according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
FIG. 1 is a flow chart of a preferred embodiment of the virtual power plant optimization method of the present invention. The order of the steps in the flowchart may be changed and some steps may be omitted according to various needs.
The virtual power plant optimized operation method is applied to one or more computer devices. A computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device and the like.
The computer device may be any electronic product that can interact with a user, such as a personal computer, tablet computer, smart phone, Personal Digital Assistant (PDA), game console, interactive Internet Protocol Television (IPTV), smart wearable device, etc.
The computer device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group composed of a plurality of network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing.
The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, a Content Delivery Network (CDN), and big data and artificial intelligence platforms.
Artificial Intelligence (AI) is the theory, method, technology and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
The network in which the computer device is located includes, but is not limited to, the internet, a wide area network, a metropolitan area network, a local area network, a Virtual Private Network (VPN), and the like.
S10, constructing an objective function and constraint conditions, and constructing an initial virtual power plant model according to the objective function and the constraint conditions.
In this embodiment, the constructing the objective function and the constraint condition includes:
taking the minimum total cost in the virtual power plant operation period as the objective function on the premise of meeting the constraint condition;
the total cost is the sum of wind power generation cost, gas unit cost, energy storage cost and electricity selling cost of an electric power market;
the constraint conditions comprise virtual power plant power balance constraint, energy storage equipment charge and discharge constraint, wind turbine generator set output constraint and gas turbine generator set operation constraint.
Specifically, the objective function is expressed as follows:
$$\min C = \sum_{t=1}^{T} \left[ C_{wt}(t) + C_{gas}(t) + C_{es}(t) + C_{market}(t) \right]$$
wherein $\min C$ represents the minimum of the total cost; $C_{wt}(t)$ represents the wind power generation cost; $C_{gas}(t)$ represents the cost of the gas turbine units; $C_{es}(t)$ represents the energy storage cost; $C_{market}(t)$ represents the electricity cost of the electricity market; $N_{wt}$ represents the number of fans; $N_{gas}$ represents the number of gas turbines; $N_{es}$ represents the number of energy storage devices; $c_{wt}$ represents the operating cost of power generation of a single fan, which applies to the output of the k-th fan at time t; $c_{CH4}$ represents the unit price of natural gas; $L_{HVNG}$ represents the low calorific value of natural gas; $\eta_{gas}$ represents the power generation efficiency; $P_{gas,i,t}$ represents the output of the i-th gas turbine at time t; $c_{es}$ represents the charge and discharge cost coefficient of a single energy storage device; $P_{es,n,t}$ represents the charge and discharge power of the n-th energy storage unit at time t; $c_{market,t}$ represents the market electricity price at time t; $P_{market,t}$ represents the market electric energy trading quantity at time t; $\Delta t$ represents the time interval; and $T$ represents the operation period of the virtual power plant;
the virtual power plant power balance constraint is expressed as follows:
wherein $N_{load}$ represents the number of loads; $P_{load,m,t}$ represents the load power of the m-th user at time t; and the remaining terms represent the heat generation power of the i-th gas turbine at time t and the thermal load of the m-th user at time t;
the energy storage device charge-discharge constraints are expressed as follows:
wherein the terms represent, in order: the charging power of the energy storage device at time t; the discharging power of the energy storage device at time t; the maximum charging power of the energy storage device; the maximum discharging power of the energy storage device; the SOC value of the energy storage in period J; the upper limit of the SOC value of the energy storage in period J; and the lower limit of the SOC value of the energy storage in period J;
the energy storage device charge-discharge constraints bound the charge and discharge power and the state of charge of the energy storage device, which ensures normal and stable operation of the energy storage system and prevents irreversible damage and other hazards to the battery caused by overcharging or over-discharging.
The output constraint of the wind turbine generator is expressed as follows:
wherein the constraint uses the maximum value of the actual output of the wind turbine generator;
the gas unit operation constraints are expressed as follows:
wherein the terms represent, in order: the power generation of the gas unit; the upper limit of the output of the gas unit; the lower limit of the output of the gas unit; and the heat dissipation loss rate.
In the above embodiment, the initial virtual power plant model is built with a minimum-cost objective and constraints along multiple dimensions, so that the model can operate at minimum cost while satisfying multiple constraint conditions, which also ensures operating stability.
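The component cost formulas are not reproduced in this text, so the following Python sketch is only one plausible reading of the symbol definitions in step S10. It assumes that each cost term is a unit cost multiplied by energy over one time step, that the gas cost is the fuel price times the output divided by the product of the low calorific value and the generation efficiency, that the storage cost uses the absolute charge/discharge power, that positive storage power means discharging, and that a positive traded quantity means a purchase from the market. The names `CostParams`, `step_cost` and `balance_residual` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CostParams:
    c_wt: float     # operating cost of one fan per unit of energy
    c_ch4: float    # unit price of natural gas
    l_hvng: float   # low calorific value of natural gas
    eta_gas: float  # power generation efficiency of the gas turbines
    c_es: float     # charge/discharge cost coefficient of one storage device
    dt: float       # length of one time step


def step_cost(p: CostParams, wind: Sequence[float], gas: Sequence[float],
              storage: Sequence[float], price: float, traded: float) -> float:
    """Total cost of one time step under the assumed cost forms."""
    c_wind = sum(p.c_wt * x * p.dt for x in wind)
    # Assumed fuel cost: gas volume = electrical energy / (efficiency * calorific value).
    c_gas = sum(p.c_ch4 * x * p.dt / (p.l_hvng * p.eta_gas) for x in gas)
    # Assumed storage cost: both charging and discharging wear the device.
    c_es = sum(p.c_es * abs(x) * p.dt for x in storage)
    c_market = price * traded * p.dt
    return c_wind + c_gas + c_es + c_market


def balance_residual(wind: Sequence[float], gas: Sequence[float],
                     storage: Sequence[float], traded: float,
                     load: Sequence[float]) -> float:
    """Supply minus demand; zero means the electric power balance holds."""
    supply = sum(wind) + sum(gas) + sum(storage) + traded
    return supply - sum(load)
```

Summing `step_cost` over all time steps of the operation period gives the quantity that the objective function minimizes, and a non-zero `balance_residual` is one way to measure a constraint violation.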
S11, creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a Markov decision model and a TD3 (Twin Delayed Deep Deterministic Policy Gradient) algorithm to obtain an intermediate virtual power plant model.
In this embodiment, the creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain the intermediate virtual power plant model includes:
acquiring the energy storage SOC value, the maximum value of the actual output of the wind turbine generator, the thermal load of the m-th user at time t, the load power of the m-th user at time t and the market electricity price at time t as the state space;
acquiring the output of the i-th gas turbine at time t, the charge and discharge power of the n-th energy storage unit at time t, the market electric energy trading volume at time t and the output of the k-th fan at time t as the action space;
acquiring a state of the agent corresponding to the initial virtual power plant model, and constructing the reward function from the total cost of taking any action in that state and the penalty coefficient for exceeding the constraints;
updating the initial virtual power plant model based on the state space, the action space and the reward function to obtain the intermediate virtual power plant model;
wherein the state space $S_t$ is expressed as follows:
wherein SOC represents the SOC value of the stored energy;
wherein the action space $a_t$ is expressed as follows:
wherein the reward function $r_t$ is expressed as follows:
wherein $S_t$ represents a state of the agent corresponding to the initial virtual power plant model; the cost term represents the total system cost when action $a_t$ is taken in state $S_t$; and the penalty term represents the penalty coefficient for exceeding the constraints.
In this embodiment, the reward function comprehensively considers the influence of action rewards and environment rewards, which makes the learning target more explicit. Specifically, after any action is selected based on the distribution of the action space over the set of environmental states, the environment returns a reward. In this way, the problem of minimizing the total operating cost of the virtual power plant is converted into the problem of maximizing the reward on the premise that the constraints are satisfied.
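The conversion from cost minimization to reward maximization can be sketched as follows. This is a minimal illustration, assuming the reward is the negative of the total cost plus a penalty proportional to the amount by which the constraints are exceeded; the exact expression is not shown in this text, and `reward`, `violations` and `penalty_coeff` are hypothetical names.

```python
from typing import Sequence


def reward(total_cost: float, violations: Sequence[float], penalty_coeff: float) -> float:
    """Negative cost with a penalty for constraint violations (assumed form).

    Each entry of `violations` is the amount by which one constraint is
    exceeded; values at or below zero mean the constraint is satisfied.
    """
    penalty = penalty_coeff * sum(max(0.0, v) for v in violations)
    return -(total_cost + penalty)
```

Under this form, a cheaper feasible action always earns a higher reward than a more expensive one, and an infeasible action is pushed down in proportion to how far it exceeds the limits.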
And S12, updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained.
In this embodiment, updating the intermediate virtual power plant model based on the noise mechanism and the attention mechanism, to obtain the model to be trained includes:
configuring target parameters;
generating a random number based on the Logistic mapping;
comparing the target parameter with the random number to obtain a comparison result;
when the comparison result is that the target parameter is greater than or equal to the random number, adding noise into an action network of the intermediate virtual power plant model; or when the comparison result is that the target parameter is smaller than the random number, no noise is added to the action network;
Acquiring an attention mechanism network, and updating the action network based on the attention mechanism network to obtain the model to be trained; wherein the attention mechanism network consists of a fully connected layer and a Softmax activation function.
Specifically, the target parameter $\lambda$ is expressed as follows:
wherein the two coefficients in the expression are a first coefficient and a second coefficient for adjusting the target parameter $\lambda$; $r_t$ represents the reward value at time t; and the remaining symbol represents the average reward value over the virtual power plant operating period;
the random number $L_{n+1}$ is expressed as follows:
$$L_{n+1} = Y \, L_n \, (1 - L_n)$$
wherein $L_n$ represents the iteration sequence value of the Logistic map, and $Y$ represents the branching parameter.
When $3.5699456 \le Y \le 4$, the Logistic map enters a chaotic state, and the generated chaotic sequence has good random distribution characteristics.
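The noise-gating rule above can be sketched as follows. This is a minimal sketch under stated assumptions: the target parameter is passed in as a number because its expression is not reproduced in the text, the noise is assumed to be zero-mean Gaussian with standard deviation `sigma`, and the function names are hypothetical.

```python
import random


def logistic_next(l_n: float, y: float = 4.0) -> float:
    """One iteration of the Logistic map: L_{n+1} = Y * L_n * (1 - L_n)."""
    return y * l_n * (1.0 - l_n)


def maybe_add_noise(action, target_param, l_n, sigma=0.1, y=4.0, rng=None):
    """Add noise to the action only when target_param >= the chaotic random number.

    Returns (action, next Logistic value, whether noise was added).
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed keeps the sketch reproducible
    l_next = logistic_next(l_n, y)
    if target_param >= l_next:
        noisy = [a + rng.gauss(0.0, sigma) for a in action]  # assumed Gaussian noise
        return noisy, l_next, True
    return list(action), l_next, False
```

The returned Logistic value is meant to be fed back in as `l_n` on the next call, so the sequence of thresholds follows the chaotic iteration rather than an independent random draw.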
The action network is improved based on an attention mechanism. The observed state is taken as the input vector. After it passes through the first fully connected layer of the attention network, a Softmax activation function is applied to the layer output to obtain a weight for each component of the input vector; each weight is multiplied by its corresponding component to form a new vector, which is the output of the attention network. This new vector then passes through a fully connected layer and a Sigmoid activation function to obtain the action for the corresponding state.
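The forward pass described above can be illustrated in plain Python. This is a sketch only: the class name `AttentionActor`, the random weight initialization and the layer sizes are assumptions, the state is assumed to be normalized, and no training logic is included.

```python
import math
import random


def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(weights, bias, xs):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, xs)) + b for row, b in zip(weights, bias)]


class AttentionActor:
    def __init__(self, state_dim, action_dim, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.w_att, self.b_att = init(state_dim, state_dim), [0.0] * state_dim
        self.w_out, self.b_out = init(action_dim, state_dim), [0.0] * action_dim

    def attention(self, state):
        """Fully connected layer, then Softmax weights, then component-wise product."""
        weights = softmax(dense(self.w_att, self.b_att, state))
        return [w * s for w, s in zip(weights, state)], weights

    def act(self, state):
        """Attention output, then fully connected layer and Sigmoid, gives the action."""
        attended, _ = self.attention(state)
        return [sigmoid(z) for z in dense(self.w_out, self.b_out, attended)]
```

Because the Sigmoid output lies between 0 and 1, each action component would still need to be scaled to the physical range of the corresponding device; that scaling is outside this sketch.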
During policy training of the TD3 algorithm model, noise is superimposed on the output of the action network in order to expand the exploration capability of the model. However, unnecessary noise adds computational cost and causes excessive exploration. This embodiment therefore improves the action exploration mechanism of the conventional TD3 algorithm so that the model can better balance exploration and exploitation in a dynamic environment.
And S13, optimizing and training the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model.
In this embodiment, the optimizing training the model to be trained based on the priority experience storage policy, and obtaining the optimized target virtual power plant model includes:
in each round of training process, acquiring the current virtual power plant environment state of the current round and the current action obtained through the action network;
determining, based on the noise mechanism, whether to add noise to the action network, to obtain a random action;
executing the random action in the current virtual power plant environment state to obtain a current reward value and the virtual power plant environment state at the next moment;
if the current experience value obtained by exploration is greater than the average of all other experience values explored within a specified period, determining the current experience to be an excellent experience and adding it to the experience replay pool; or, if the current experience value is less than or equal to that average, determining according to a preset probability whether to add it to the experience replay pool;
randomly sampling experiences from the experience replay pool and inputting them into the action network to obtain a random action for the next moment;
acquiring the current value of the Q function of the current-round model corresponding to the model to be trained, and updating the model parameters of the current-round model according to that value;
and stopping training when the value of the Q function is detected to reach its maximum in any round of training, and determining the model obtained in that round as the target virtual power plant model.
The preset probability can be configured according to an actual running environment.
Through the embodiment, the optimized target virtual power plant model can be trained by further combining with the priority experience storage strategy.
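The storage rule of the priority experience storage strategy can be sketched as a small replay pool. The following is illustrative only: the class name `PriorityExperiencePool`, the use of a sliding window for the "specified period", the scalar `value` attached to each experience, and the choice to always store the very first experience are assumptions not stated in the text.

```python
import random
from collections import deque


class PriorityExperiencePool:
    def __init__(self, capacity=10000, window=100, keep_prob=0.3, seed=0):
        self.pool = deque(maxlen=capacity)        # experience replay pool
        self.recent_values = deque(maxlen=window) # values seen in the specified period
        self.keep_prob = keep_prob                # preset probability for ordinary experiences
        self.rng = random.Random(seed)

    def offer(self, transition, value):
        """Store excellent experiences always, ordinary ones with probability keep_prob."""
        others = list(self.recent_values)  # excludes the current value
        mean_others = sum(others) / len(others) if others else float("-inf")
        if value > mean_others:
            stored = True
        else:
            stored = self.rng.random() < self.keep_prob
        if stored:
            self.pool.append(transition)
        self.recent_values.append(value)
        return stored

    def sample(self, batch_size):
        """Uniformly sample stored experiences for training."""
        return self.rng.sample(list(self.pool), min(batch_size, len(self.pool)))
```

A lower `keep_prob` biases the pool toward above-average experiences, while a higher one keeps more diversity; the text leaves this value to the actual operating environment.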
S14, acquiring current power data, and inputting the power data into the target virtual power plant model.
In this embodiment, the power data may include, but is not limited to, one or more of the following combinations of data:
the current time period, the current regional characteristics, the number of fans, the number of gas turbines, the number of energy storage devices, and the like.
And S15, generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model, and executing the virtual power plant optimizing operation strategy.
In this embodiment, the output data of the target virtual power plant model can be used to determine operation strategies such as whether the electricity demand in the current period needs to be supplemented and whether surplus electric energy needs to be stored, so as to ensure reasonable use of power resources and maintain the stability of the power system.
For example, refer to fig. 2, which is a power supply and demand balance diagram of the present invention. As can be seen from fig. 2, depending on supply and demand, the virtual power plant optimizing operation strategy may include the following. In the periods 1:00-7:00 and 16:00-18:00, the electricity price is low; the internal power supply of the virtual power plant comes mainly from wind power and gas turbine generation, and the remaining electricity demand is met by purchasing electricity. In the period 2:00-4:00, the electricity demand is low, so low-price surplus electric energy is stored. In the period 8:00-10:00, the virtual power plant no longer purchases electricity; the energy storage discharges a small amount during part of the period, and the load demand is still met mainly by wind power and gas turbine generation. In the periods 11:00-15:00 and 19:00-21:00, the overall electric load demand is high and the electricity price is at its peak; to reduce the total operating cost, electricity purchases are reduced, the system is supplied by wind power and gas turbines, and the remaining demand is met by energy storage discharge. In the period 22:00-23:00, wind power and gas turbine generation cannot meet the electricity demand, but the electricity price is low, so the system balances the electric load by purchasing electricity. At 24:00, the electric load demand is low and can be met by wind power and gas turbine generation.
Refer to fig. 3, which compares the virtual power plant optimizing operation method with other optimizing methods. Specifically, scheme 1 is a virtual power plant optimization operation method based on DDPG (Deep Deterministic Policy Gradient); scheme 2 is a virtual power plant optimization operation method based only on the conventional TD3; scheme 3 is the virtual power plant optimization operation method proposed in the present embodiment. As can be seen, the average daily operating cost of scheme 3 is 16582 yuan, which is 7.55% and 3.4% lower than scheme 1 and scheme 2, respectively; the minimum daily operating cost of scheme 3 is 14554 yuan, which is 6.58% and 4.45% lower than scheme 1 and scheme 2, respectively; the maximum daily operating cost of scheme 3 is 18762 yuan, which is 10.53% and 3.97% lower than scheme 1 and scheme 2, respectively. Therefore, compared with scheme 1 and scheme 2, the virtual power plant optimizing operation method provided by this embodiment performs better and can effectively reduce the system operating cost.
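The reported percentages can be related back to the costs of the comparison schemes with simple arithmetic. The sketch below assumes that "lower by x%" means (baseline cost - scheme 3 cost) / baseline cost; the function name `implied_baseline` is hypothetical, and the values it computes are derived under that assumption rather than figures stated in the text.

```python
def implied_baseline(cost: float, reduction_pct: float) -> float:
    """Baseline cost implied by a cost that is reduction_pct percent lower than it."""
    return cost / (1.0 - reduction_pct / 100.0)


# (scheme 3 cost in yuan, reduction vs scheme 1 in %, reduction vs scheme 2 in %)
reported = {
    "average": (16582, 7.55, 3.40),
    "minimum": (14554, 6.58, 4.45),
    "maximum": (18762, 10.53, 3.97),
}

for name, (cost3, vs1, vs2) in reported.items():
    print(name,
          round(implied_baseline(cost3, vs1), 1),   # implied scheme 1 cost
          round(implied_baseline(cost3, vs2), 1))   # implied scheme 2 cost
```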
According to the above technical scheme, an initial virtual power plant model is constructed from a self-built objective function and constraint conditions; a state space, an action space and a reward function are created based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model; the intermediate virtual power plant model is updated based on the noise mechanism and the attention mechanism to obtain a model to be trained; and the model to be trained is optimized and trained based on the priority experience storage strategy to obtain an optimized target virtual power plant model. Current power data is then input into the target virtual power plant model, and a virtual power plant optimizing operation strategy is generated and executed. The optimized operation strategy is more stable and can effectively reduce the electricity cost.
FIG. 4 is a functional block diagram of a preferred embodiment of the virtual power plant optimized operation apparatus of the present invention. The virtual power plant optimizing operation device 11 comprises a construction unit 110, a creation unit 111, an updating unit 112, a training unit 113, an input unit 114 and an optimizing unit 115. The module/unit referred to in the present invention refers to a series of computer program segments, which are stored in a memory, capable of being executed by a processor and of performing a fixed function. In the present embodiment, the functions of the respective modules/units will be described in detail in the following embodiments.
The construction unit 110 is configured to construct an objective function and a constraint condition, and construct an initial virtual power plant model according to the objective function and the constraint condition;
the creating unit 111 is configured to create a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a markov decision model and a TD3 algorithm, so as to obtain an intermediate virtual power plant model;
the updating unit 112 is configured to update the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
the training unit 113 is configured to perform optimization training on the model to be trained based on a priority experience storage policy, so as to obtain an optimized target virtual power plant model;
The input unit 114 is configured to obtain current power data, and input the power data to the target virtual power plant model;
the optimizing unit 115 is configured to generate a virtual power plant optimizing operation policy according to the output data of the target virtual power plant model, and execute the virtual power plant optimizing operation policy.
According to the above technical scheme, an initial virtual power plant model is constructed from a self-built objective function and constraint conditions; a state space, an action space and a reward function are created based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model; the intermediate virtual power plant model is updated based on the noise mechanism and the attention mechanism to obtain a model to be trained; and the model to be trained is optimized and trained based on the priority experience storage strategy to obtain an optimized target virtual power plant model. Current power data is then input into the target virtual power plant model, and a virtual power plant optimizing operation strategy is generated and executed. The optimized operation strategy is more stable and can effectively reduce the electricity cost.
FIG. 5 is a schematic diagram of a computer device for implementing a preferred embodiment of the method for optimizing operation of a virtual power plant according to the present invention.
The computer device 1 may comprise a memory 12, a processor 13 and a bus, and may further comprise a computer program stored in the memory 12 and executable on the processor 13, such as a virtual power plant optimization run program.
It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the computer device 1 and does not constitute a limitation of it. The computer device 1 may have a bus-type structure or a star-type structure, and may comprise more or fewer hardware or software components than illustrated, or a different arrangement of components; for example, the computer device 1 may further comprise an input-output device, a network access device, etc.
It should be noted that the computer device 1 is only an example; other existing or future electronic products that are adaptable to the present invention are also included within the scope of protection of the present invention and are incorporated herein by reference.
The memory 12 includes at least one type of readable storage medium including flash memory, a removable hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 12 may in some embodiments be an internal storage unit of the computer device 1, such as a removable hard disk of the computer device 1. The memory 12 may in other embodiments also be an external storage device of the computer device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the computer device 1. Further, the memory 12 may also include both an internal storage unit and an external storage device of the computer device 1. The memory 12 may be used not only for storing application software installed on the computer device 1 and various types of data, such as codes of virtual power plant optimization running programs, but also for temporarily storing data that has been output or is to be output.
The processor 13 may in some embodiments be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 13 is the control unit (Control Unit) of the computer device 1; it connects the components of the entire computer device 1 using various interfaces and lines, runs or executes programs or modules stored in the memory 12 (for example, a virtual power plant optimization running program), and invokes data stored in the memory 12 to perform the various functions of the computer device 1 and to process data.
The processor 13 executes the operating system of the computer device 1 and various types of applications installed. The processor 13 executes the application program to implement the steps of the various virtual power plant optimization operation method embodiments described above, such as the steps shown in fig. 1.
Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory 12 and executed by the processor 13 to complete the present invention. The one or more modules/units may be a series of computer readable instruction segments capable of performing the specified functions, which instruction segments describe the execution of the computer program in the computer device 1. For example, the computer program may be divided into a construction unit 110, a creation unit 111, an update unit 112, a training unit 113, an input unit 114, an optimization unit 115.
The integrated units implemented in the form of software functional modules described above may be stored in a computer readable storage medium. The software functional module is stored in a storage medium, and includes several instructions for causing a computer device (which may be a personal computer, a computer device, or a network device, etc.) or a processor (processor) to execute a portion of the virtual power plant optimization operation method according to the embodiments of the present invention.
The modules/units integrated in the computer device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on this understanding, the present invention may also be implemented by a computer program for instructing a relevant hardware device to implement all or part of the procedures of the above-mentioned embodiment method, where the computer program may be stored in a computer readable storage medium and the computer program may be executed by a processor to implement the steps of each of the above-mentioned method embodiments.
Wherein the computer program comprises computer program code which may be in source code form, object code form, executable file or some intermediate form etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory, or the like.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The Blockchain (Blockchain), which is essentially a decentralised database, is a string of data blocks that are generated by cryptographic means in association, each data block containing a batch of information of network transactions for verifying the validity of the information (anti-counterfeiting) and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
The bus may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. For ease of illustration, only one straight line is shown in fig. 5, but this does not mean that there is only one bus or one type of bus. The bus is arranged to enable connection and communication between the memory 12, the at least one processor 13, and other components.
Although not shown, the computer device 1 may further comprise a power source (such as a battery) for powering the various components, preferably the power source may be logically connected to the at least one processor 13 via a power management means, whereby the functions of charge management, discharge management, and power consumption management are achieved by the power management means. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure detection circuit, power converter or inverter, power status indicator, etc. The computer device 1 may further include various sensors, bluetooth modules, wi-Fi modules, etc., which will not be described in detail herein.
Further, the computer device 1 may also comprise a network interface, optionally comprising a wired interface and/or a wireless interface (e.g. WI-FI interface, bluetooth interface, etc.), typically used for establishing a communication connection between the computer device 1 and other computer devices.
The computer device 1 may optionally further comprise a user interface, which may be a display (Display) or an input unit such as a keyboard (Keyboard), and may optionally be a standard wired interface or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used for displaying information processed in the computer device 1 and for displaying a visual user interface.
It should be understood that the described embodiments are for illustration only, and the scope of the patent application is not limited by this configuration.
Fig. 5 shows only a computer device 1 with components 12-13, it will be understood by those skilled in the art that the structure shown in fig. 5 is not limiting of the computer device 1 and may include fewer or more components than shown, or may combine certain components, or a different arrangement of components.
In connection with fig. 1, the memory 12 in the computer device 1 stores a plurality of instructions for implementing a virtual power plant optimized operation method, and the processor 13 can execute the plurality of instructions to implement:
constructing an objective function and constraint conditions, and constructing an initial virtual power plant model according to the objective function and the constraint conditions;
creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a Markov decision model and a TD3 algorithm to obtain an intermediate virtual power plant model;
updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
optimizing and training the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;
Acquiring current power data, and inputting the power data into the target virtual power plant model;
and generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model, and executing the virtual power plant optimizing operation strategy.
Specifically, the specific implementation method of the above instructions by the processor 13 may refer to the description of the relevant steps in the corresponding embodiment of fig. 1, which is not repeated herein.
The data in this case were obtained legally.
In the several embodiments provided in the present invention, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The invention is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means stated in the invention may also be implemented by one unit or means, through software or hardware. Terms such as first and second are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (10)

1. The virtual power plant optimizing operation method is characterized by comprising the following steps of:
constructing an objective function and constraint conditions, and constructing an initial virtual power plant model according to the objective function and the constraint conditions;
creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a Markov decision model and a TD3 algorithm to obtain an intermediate virtual power plant model;
updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
Optimizing and training the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;
acquiring current power data, and inputting the power data into the target virtual power plant model;
and generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model, and executing the virtual power plant optimizing operation strategy.
2. The method for optimizing operation of a virtual power plant of claim 1, wherein the constructing objective functions and constraints comprises:
taking the minimum total cost in the virtual power plant operation period as the objective function on the premise of meeting the constraint condition;
the total cost is the sum of wind power generation cost, gas unit cost, energy storage cost and electricity selling cost of an electric power market;
the constraint conditions comprise virtual power plant power balance constraint, energy storage equipment charge and discharge constraint, wind turbine generator set output constraint and gas turbine generator set operation constraint.
3. The virtual power plant optimization operating method of claim 2, wherein:
the objective function is expressed as follows:
wherein:
wherein $\min C$ represents the minimum of the total cost; $C_{wt}(t)$ represents the wind power generation cost; $C_{gas}(t)$ represents the gas unit cost; $C_{es}(t)$ represents the energy storage cost; $C_{maket}(t)$ represents the electricity market cost; $N_{wt}$ represents the number of fans; $N_{gas}$ represents the number of gas turbines; $N_{es}$ represents the number of energy storage devices; $C_{wt}$ represents the operating cost of power generation for a single fan, and the accompanying power term represents the output of the k-th fan at time t; $C_{CH4}$ represents the unit price of natural gas; $L_{HVNG}$ represents the low calorific value of natural gas; $n_{gas}$ represents the power generation efficiency; $P_{gas,i,t}$ represents the output of the i-th gas turbine at time t; $c_{es}$ represents the charge and discharge cost coefficient of a single energy storage device; $P_{es,n,t}$ represents the charge and discharge power of the n-th energy storage unit at time t; $c_{maket,t}$ represents the market electricity price at time t; $P_{maket,t}$ represents the market electric energy trading volume at time t; $\Delta T$ represents the time increment; and $T$ represents the operation period of the virtual power plant;
the virtual power plant power balance constraint is expressed as follows:
wherein $N_{load}$ represents the number of loads; $P_{load,m,t}$ represents the load power of the m-th user at time t; and the remaining two symbols represent, respectively, the heat generation power of the i-th gas turbine at time t and the thermal load of the m-th user at time t;
the energy storage device charge-discharge constraints are expressed as follows:
wherein the symbols represent, in order: the charging power of the energy storage device at time t; the discharge power of the energy storage device at time t; the maximum charging power of the energy storage device; the maximum discharge power of the energy storage device; the SOC value of the stored energy during period J; the upper limit of the SOC value of the stored energy during period J; and the lower limit of the SOC value of the stored energy during period J;
the output constraint of the wind turbine generator is expressed as follows:
wherein the symbol represents the maximum value of the actual output of the wind turbine generator;
the gas unit operation constraints are expressed as follows:
wherein the symbols represent, in order: the power generation of the gas unit; the upper limit of the gas unit output; the lower limit of the gas unit output; and the heat dissipation loss rate.
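As an illustration of how the total cost and the constraints of claims 2 and 3 could be evaluated for a single period, the following sketch may help. It rests on several assumptions: the dictionary keys are hypothetical names, only one unit of each type is modeled, storage power is taken as positive when discharging, the electrical balance is written as supply equal to load, and the heat balance is omitted. None of these forms is taken from the original expressions, which are not reproduced in the text.

```python
def total_cost(c_wt, c_gas, c_es, c_maket):
    """Sum the four per-period cost series over the operation period."""
    return sum(w + g + e + m for w, g, e, m in zip(c_wt, c_gas, c_es, c_maket))


def box_violation(value, lower, upper):
    """Amount by which value falls outside [lower, upper]; 0.0 when inside."""
    return max(lower - value, 0.0) + max(value - upper, 0.0)


def constraint_violation(dispatch, limits):
    """Total violation of the output, charge/discharge, SOC and balance constraints."""
    v = 0.0
    v += box_violation(dispatch["p_wt"], 0.0, limits["p_wt_max"])
    v += box_violation(dispatch["p_gas"], limits["p_gas_min"], limits["p_gas_max"])
    v += box_violation(dispatch["p_es"], -limits["p_ch_max"], limits["p_dis_max"])
    v += box_violation(dispatch["soc"], limits["soc_min"], limits["soc_max"])
    # Assumed electrical balance: wind + gas + storage discharge + market purchase = load
    supply = dispatch["p_wt"] + dispatch["p_gas"] + dispatch["p_es"] + dispatch["p_maket"]
    v += abs(supply - dispatch["p_load"])
    return v
```

A violation amount of this kind is one possible input to the penalty term of a reward function, so that infeasible actions are discouraged during training.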
4. The method for optimizing operation of a virtual power plant according to claim 3, wherein creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain the intermediate virtual power plant model comprises:
acquiring the energy storage SOC value, the maximum value of the actual output of the wind turbine generator, the thermal load of the mth user at time t, the load power of the mth user at time t, and the market electricity price at time t as the state space;
acquiring the output of the ith gas turbine at time t, the charge and discharge power of the nth energy storage unit at time t, the market electric energy trading volume at time t, and the output of the kth fan at time t as the action space;
acquiring a state of the agent corresponding to the initial virtual power plant model, and constructing the reward function from the total cost of an action taken in that state and the penalty coefficient for exceeding the constraints;
updating the initial virtual power plant model based on the state space, the action space and the reward function to obtain the intermediate virtual power plant model;
wherein the state space S_t is expressed as follows:
SOC represents the SOC value of the stored energy;
wherein the action space a_t is expressed as follows:
wherein the reward function r_t is expressed as follows:
S_t represents a state of the agent corresponding to the initial virtual power plant model; the cost term represents the total cost of the system when action a_t is taken in state S_t; and the penalty term represents the penalty coefficient for exceeding the constraints.
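The state space, action space and reward function of claim 4 can be sketched in Python as follows. The expressions themselves are not reproduced in this text, so this is only an illustration under stated assumptions: the reward is taken as the negative of the total cost plus a penalty, and the penalty is assumed to scale with the amount by which a bound is exceeded. The names `State`, `Action`, `bound_violation` and `reward` are hypothetical.

```python
from typing import NamedTuple, Sequence


class State(NamedTuple):
    soc: float                   # energy storage SOC value
    wt_max: float                # maximum actual output of the wind turbines
    heat_load: Sequence[float]   # thermal load of each user at time t
    load: Sequence[float]        # load power of each user at time t
    price: float                 # market electricity price at time t


class Action(NamedTuple):
    p_gas: Sequence[float]       # output of each gas turbine at time t
    p_es: Sequence[float]        # charge/discharge power of each storage unit
    p_market: float              # market electric energy trading volume
    p_wt: Sequence[float]        # output of each fan at time t


def bound_violation(value: float, lower: float, upper: float) -> float:
    """Distance by which value lies outside [lower, upper]; 0.0 if inside."""
    if value < lower:
        return lower - value
    if value > upper:
        return value - upper
    return 0.0


def reward(total_cost: float,
           violations: Sequence[float],
           penalty_coeff: float) -> float:
    """Assumed reward: negative cost minus a penalty for constraint violations."""
    return -(total_cost + penalty_coeff * sum(violations))
```

With this sign convention, an agent that maximizes the reward minimizes the total cost while being discouraged from leaving the feasible region.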
5. The method for optimizing operation of a virtual power plant according to claim 1, wherein updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained comprises:
configuring target parameters;
generating a random number based on the Logistic mapping;
comparing the target parameter with the random number to obtain a comparison result;
when the comparison result is that the target parameter is greater than or equal to the random number, adding noise to an action network of the intermediate virtual power plant model; or, when the comparison result is that the target parameter is smaller than the random number, adding no noise to the action network;
Acquiring an attention mechanism network, and updating the action network based on the attention mechanism network to obtain the model to be trained; wherein the attention mechanism network consists of a fully connected layer and a Softmax activation function.
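An attention mechanism network consisting of a fully connected layer followed by a Softmax activation function can be sketched in plain Python as follows. The claim does not state how the attention output is combined with the action network, so the element-wise re-weighting of the input features in `apply_attention` is an assumption, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(z: Sequence[float]) -> List[float]:
    """Numerically stable Softmax over a score vector."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(x: Sequence[float],
                      weights: Sequence[Sequence[float]],
                      bias: Sequence[float]) -> List[float]:
    """Fully connected layer (weights @ x + bias) followed by Softmax."""
    scores = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)


def apply_attention(x: Sequence[float],
                    weights: Sequence[Sequence[float]],
                    bias: Sequence[float]) -> List[float]:
    """Assumed use: re-weight each input feature by its attention weight."""
    a = attention_weights(x, weights, bias)
    return [ai * xi for ai, xi in zip(a, x)]
```

In this sketch the weight matrix is square, so that one attention weight is produced per input feature; the re-weighted features would then be passed to the action network.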
6. The method for optimized operation of a virtual power plant as claimed in claim 5, wherein:
the target parameter λ is expressed as follows:
wherein the two coefficients are, respectively, a first coefficient and a second coefficient for adjusting the target parameter λ; r_t represents the reward value at time t; and the averaged term represents the average reward value over the virtual power plant operating period;
the random number L_{n+1} is expressed as follows:
wherein L_n represents an iteration sequence value of the Logistic map; y represents a bifurcation parameter.
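The comparison between the target parameter and the Logistic-map random number can be sketched as follows. The expressions are not reproduced in this text, so the sketch assumes the standard form of the Logistic map, L_{n+1} = y · L_n · (1 − L_n), where y is the branching (bifurcation) parameter. The expression for λ is not reconstructed here; λ is simply passed in as an argument. The function names are hypothetical.

```python
from typing import Tuple


def logistic_next(l_n: float, y: float = 4.0) -> float:
    """One iteration of the standard Logistic map (assumed form)."""
    return y * l_n * (1.0 - l_n)


def should_add_noise(target_param: float,
                     l_n: float,
                     y: float = 4.0) -> Tuple[bool, float]:
    """Return (add_noise, L_{n+1}).

    Noise is added to the action network when the target parameter is
    greater than or equal to the newly generated random number.
    """
    l_next = logistic_next(l_n, y)
    return target_param >= l_next, l_next
```

For example, starting from L_n = 0.25 with y = 4.0 gives L_{n+1} = 0.75, so a target parameter of 0.8 would trigger noise and a target parameter of 0.5 would not. The default y = 4.0 is an illustrative choice, not a value taken from the claims.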
7. The method for optimizing operation of a virtual power plant according to claim 5, wherein performing optimization training on the model to be trained based on the priority experience storage strategy to obtain the optimized target virtual power plant model comprises:
in each round of training, acquiring the current virtual power plant environment state of the current round and the current action output by the action network;
detecting, based on the noise mechanism, whether noise is added to the action network, to obtain a random action;
executing the random action in the current virtual power plant environment state to obtain a current reward value and the virtual power plant environment state at the next moment;
if the current explored experience value is greater than the average of all other experience values explored within a specified period, determining the current experience value to be an excellent experience and adding it to an experience playback pool; or, if the current experience value is less than or equal to the average, determining whether to add the current experience value to the experience playback pool according to a preset probability;
randomly extracting experience samples from the experience playback pool, and inputting the experience samples into the action network to obtain the random action at the next moment;
acquiring the current value of the Q function of the current-round model corresponding to the model to be trained, and updating the model parameters of the current-round model according to the current value of the Q function;
and stopping training when the value of the Q function is detected to reach its maximum in any round of training, and determining the model obtained in that round as the target virtual power plant model.
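The priority experience storage step of claim 7 can be sketched in Python as follows. This is an illustration under stated assumptions: the "experience value" is taken to be a scalar such as the reward, the "specified period" is modelled as a sliding window of the most recent values, and the very first experience is always stored because no average exists yet. The class name `PriorityStore` and its parameters are hypothetical.

```python
import random
from collections import deque
from typing import Any, List, Optional


class PriorityStore:
    """Experience playback pool with value-based admission."""

    def __init__(self, capacity: int, window: int, keep_prob: float,
                 rng: Optional[random.Random] = None) -> None:
        self.pool = deque(maxlen=capacity)   # stored experiences
        self.recent = deque(maxlen=window)   # values seen in the period
        self.keep_prob = keep_prob           # preset admission probability
        self.rng = rng or random.Random()

    def offer(self, experience: Any, value: float) -> bool:
        """Store the experience if excellent, else with probability keep_prob."""
        if self.recent:
            # Average of the other explored values, excluding the current one.
            mean = sum(self.recent) / len(self.recent)
            excellent = value > mean
        else:
            excellent = True  # assumption: nothing to compare against yet
        stored = excellent or self.rng.random() < self.keep_prob
        if stored:
            self.pool.append(experience)
        self.recent.append(value)
        return stored

    def sample(self, k: int) -> List[Any]:
        """Randomly extract up to k experience samples from the pool."""
        return self.rng.sample(list(self.pool), min(k, len(self.pool)))
```

In a training loop, `offer` would be called after each environment step, and `sample` would supply the batch used to update the action network and the Q function.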
8. A virtual power plant optimal operation device, characterized in that the virtual power plant optimal operation device comprises:
The construction unit is used for constructing an objective function and constraint conditions and constructing an initial virtual power plant model according to the objective function and the constraint conditions;
the establishing unit is used for establishing a state space, an action space and a reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model;
the updating unit is used for updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;
the training unit is used for carrying out optimization training on the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;
the input unit is used for acquiring current power data and inputting the power data into the target virtual power plant model;
and the optimizing unit is used for generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model and executing the virtual power plant optimizing operation strategy.
9. A computer device, the computer device comprising:
a memory storing at least one instruction; and
A processor executing instructions stored in the memory to implement the virtual power plant optimization operation method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized by: the computer-readable storage medium having stored therein at least one instruction for execution by a processor in a computer device to implement the virtual power plant optimized operation method of any one of claims 1 to 7.
CN202410028978.2A 2024-01-09 2024-01-09 Virtual power plant optimized operation method, device, equipment and medium Active CN117541030B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410028978.2A CN117541030B (en) 2024-01-09 2024-01-09 Virtual power plant optimized operation method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410028978.2A CN117541030B (en) 2024-01-09 2024-01-09 Virtual power plant optimized operation method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN117541030A true CN117541030A (en) 2024-02-09
CN117541030B CN117541030B (en) 2024-04-26

Family

ID=89794211

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410028978.2A Active CN117541030B (en) 2024-01-09 2024-01-09 Virtual power plant optimized operation method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN117541030B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326994A (en) * 2021-07-06 2021-08-31 华北电力大学 Virtual power plant energy collaborative optimization method considering source load storage interaction
CN114036825A (en) * 2021-10-27 2022-02-11 南方电网科学研究院有限责任公司 Collaborative optimization scheduling method, device, equipment and storage medium for multiple virtual power plants
CN115423207A (en) * 2022-09-26 2022-12-02 中国长江三峡集团有限公司 Wind storage virtual power plant online scheduling method and device
CN115663804A (en) * 2022-11-02 2023-01-31 深圳先进技术研究院 Electric power system regulation and control method based on deep reinforcement learning
CN115879983A (en) * 2023-02-07 2023-03-31 长园飞轮物联网技术(杭州)有限公司 Virtual power plant scheduling method and system
CN116914732A (en) * 2023-07-13 2023-10-20 广东工业大学 Deep reinforcement learning-based low-carbon scheduling method and system for cogeneration system
CN117291095A (en) * 2023-09-15 2023-12-26 国网上海能源互联网研究院有限公司 Collaborative interaction method, device, equipment and medium for virtual power plant and power distribution network

Also Published As

Publication number Publication date
CN117541030B (en) 2024-04-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant