CN117541030A

CN117541030A - Virtual power plant optimized operation method, device, equipment and medium

Info

Publication number: CN117541030A
Application number: CN202410028978.2A
Authority: CN
Inventors: 屈蓉; 李任戈; 张欣; 唐琛捷; 沈旺旺
Original assignee: China Construction Science and Industry Corp Ltd
Current assignee: China Construction Science and Industry Corp Ltd
Priority date: 2024-01-09
Filing date: 2024-01-09
Publication date: 2024-02-09
Anticipated expiration: 2044-01-09
Also published as: CN117541030B

Abstract

The invention relates to the technical field of artificial intelligence, and provides a virtual power plant optimizing operation method, device, equipment and medium.

Description

Virtual power plant optimized operation method, device, equipment and medium

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a virtual power plant optimized operation method, device, equipment and medium.

Background

With the rapid growth of renewable and distributed energy sources, electric power systems face increasingly complex management challenges. A virtual power plant is an intelligent system that integrates various energy resources and can achieve efficient operation of the power system through optimal scheduling. Research on the optimal scheduling of virtual power plants not only improves the economy of the power system and reduces energy production costs, but also helps cope with the fluctuation and uncertainty of renewable energy sources and improves the reliability and stability of the power grid.

However, prior-art approaches to the optimal scheduling of virtual power plants have the following drawbacks:

(1) They require high prediction accuracy for renewable energy output and load;

(2) The action space supports only discrete operations, so the action of each unit must be discretized to suit the algorithm, which greatly reduces the range of selectable actions;

(3) The state and action value functions are prone to overestimation, which can invalidate the strategy learned by the model.

In view of the foregoing, there is a need for a more stable and efficient virtual power plant optimization operating scheme.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a method, apparatus, device and medium for optimizing operation of a virtual power plant, which are capable of reasonably and stably optimizing the operation of the virtual power plant.

A virtual power plant optimization operation method, the virtual power plant optimization operation method comprising:

constructing an objective function and constraint conditions, and constructing an initial virtual power plant model according to the objective function and the constraint conditions;

creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a Markov decision model and a TD3 algorithm to obtain an intermediate virtual power plant model;

updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;

optimizing and training the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;

acquiring current power data, and inputting the power data into the target virtual power plant model;

and generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model, and executing the virtual power plant optimizing operation strategy.

According to a preferred embodiment of the present invention, the constructing objective functions and constraints includes:

taking the minimum total cost in the virtual power plant operation period as the objective function on the premise of meeting the constraint condition;

the total cost is the sum of wind power generation cost, gas unit cost, energy storage cost and electricity selling cost of an electric power market;

the constraint conditions comprise virtual power plant power balance constraint, energy storage equipment charge and discharge constraint, wind turbine generator set output constraint and gas turbine generator set operation constraint.

According to a preferred embodiment of the invention:

the objective function is expressed as follows:

$$\min C = \sum_{t=1}^{T}\left[C_{wt}(t) + C_{gas}(t) + C_{es}(t) + C_{maket}(t)\right]$$

Wherein:；

wherein min C represents the minimum of the total cost, C_wt(t) represents the wind power generation cost, C_gas(t) represents the cost of the gas turbine unit, C_es(t) represents the energy storage cost, and C_maket(t) represents the electricity cost of the electricity market; N_wt represents the number of fans, N_gas represents the number of gas turbines, and N_es represents the number of energy storage devices; C_wt represents the running cost of power generation of a single fan, and P_wt,k,t represents the output of the k-th fan at time t; C_CH4 represents the unit price of natural gas; L_HVNG represents the lower heating value of natural gas, η_gas represents the power generation efficiency, and P_gas,i,t represents the output of the i-th gas turbine at time t; c_es represents the charge and discharge cost coefficient of a single energy storage device, and P_es,n,t represents the charge and discharge power of the n-th energy storage unit at time t; c_maket,t represents the market electricity price at time t, P_maket,t represents the market electric energy trading volume at time t, Δt represents the time interval, and T represents the operation period of the virtual power plant;
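The component cost formulas do not survive in this text. One form that is consistent with the symbol definitions above is sketched below. The exact expressions, the absolute value on the storage power, and the symbol P_{wt,k,t} for the fan output are assumptions, not the patent's own formulas; C_{wt} without an argument denotes the per-fan running cost, as in the definitions.

```latex
\begin{aligned}
C_{wt}(t)    &= \sum_{k=1}^{N_{wt}} C_{wt}\, P_{wt,k,t}\, \Delta t \\
C_{gas}(t)   &= \sum_{i=1}^{N_{gas}} C_{CH4}\, \frac{P_{gas,i,t}\, \Delta t}{\eta_{gas}\, L_{HVNG}} \\
C_{es}(t)    &= \sum_{n=1}^{N_{es}} c_{es}\, \lvert P_{es,n,t} \rvert\, \Delta t \\
C_{maket}(t) &= c_{maket,t}\, P_{maket,t}\, \Delta t
\end{aligned}
```

Under this reading, the gas term converts electrical output into fuel volume through the efficiency and heating value before applying the gas price, and the market term is positive when electricity is purchased.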

the virtual power plant power balance constraint is expressed as follows:

wherein N_load represents the number of loads; P_load,m,t represents the load power of the m-th user at time t; and the two remaining quantities in the constraint represent the heat generation power of the i-th gas turbine at time t and the thermal load of the m-th user at time t, respectively;

the energy storage device charge-discharge constraints are expressed as follows: ；

wherein the quantities in the constraints represent, respectively: the charging power of the energy storage device at time t; the discharging power of the energy storage device at time t; the maximum charging power of the energy storage device; the maximum discharging power of the energy storage device; the SOC value of the energy storage in period J; the upper limit of the SOC value of the energy storage in period J; and the lower limit of the SOC value of the energy storage in period J;

the output constraint of the wind turbine generator is expressed as follows:；

wherein the remaining symbol represents the maximum value of the actual output of the wind turbine generator;

the gas unit operation constraints are expressed as follows:；

wherein the quantities in the constraints represent, respectively: the power generation of the gas unit; the upper limit of the output of the gas unit; the lower limit of the output of the gas unit; and the heat dissipation loss rate.
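The constraint formulas themselves are not legible in this text. As a minimal sketch of how the electrical constraints described above could be checked in code, the following Python uses hypothetical names (`StorageLimits`, `power_balance_residual`, and so on) and an assumed sign convention: storage power is positive when discharging and market power is positive when purchasing. The thermal balance and the heat dissipation loss rate are omitted because their exact form is not recoverable here.

```python
from dataclasses import dataclass


@dataclass
class StorageLimits:
    """Hypothetical container for the energy storage limits named in the text."""
    p_charge_max: float     # maximum charging power
    p_discharge_max: float  # maximum discharging power
    soc_min: float          # lower limit of the SOC value
    soc_max: float          # upper limit of the SOC value


def power_balance_residual(p_wind, p_gas, p_storage, p_market, p_load):
    """Supply minus demand at one time step; zero means the balance holds.

    Assumed convention: storage > 0 when discharging, market > 0 when buying.
    """
    return sum(p_wind) + sum(p_gas) + sum(p_storage) + p_market - sum(p_load)


def storage_ok(p_charge, p_discharge, soc, limits):
    """Charging power, discharging power and SOC must all stay within limits."""
    return (0.0 <= p_charge <= limits.p_charge_max
            and 0.0 <= p_discharge <= limits.p_discharge_max
            and limits.soc_min <= soc <= limits.soc_max)


def wind_ok(p_wind_k, p_wind_max):
    """Fan output may not exceed the maximum actual output."""
    return 0.0 <= p_wind_k <= p_wind_max


def gas_ok(p_gas_i, p_gas_min, p_gas_max):
    """Gas unit output must lie between its lower and upper limits."""
    return p_gas_min <= p_gas_i <= p_gas_max
```

A scheduler or a reward function can call these checks at every time step and treat any failure as a constraint violation.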

According to a preferred embodiment of the present invention, creating the state space, the action space and the reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain the intermediate virtual power plant model includes:

acquiring the energy storage SOC value, the maximum value of the actual output of the wind turbine generator, the thermal load of the m-th user at time t, the load power of the m-th user at time t, and the market electricity price at time t as the state space;

acquiring the output of the ith gas turbine at the moment t, the charge and discharge power of the nth energy storage unit at the moment t, the market electric energy trading volume at the moment t and the output of the kth fan at the moment t as the action space;

acquiring a state of the agent corresponding to the initial virtual power plant model, and constructing the reward function from the total cost of any action taken in that state and the penalty coefficient for exceeding the constraints;

updating the initial virtual power plant model based on the state space, the action space and the rewarding function to obtain the intermediate virtual power plant model;

wherein the state space S_t is expressed as follows:

SOC represents the SOC value of the stored energy;

wherein the action space a_t is expressed as follows:

wherein the reward function r_t is expressed as follows:

S_t represents a state of the agent corresponding to the initial virtual power plant model; the cost term represents the total system cost when action a_t is taken in state S_t; and the penalty term represents the penalty coefficient for exceeding the constraints.
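The state and action expressions are not legible in this text, but their components are listed in the steps above. The sketch below assembles them into flat vectors in the listed order and maps a normalized continuous action onto physical bounds. The function names and the [0, 1] normalization are assumptions for illustration, not the patent's own definitions.

```python
def build_state(soc, p_wind_max, heat_loads, elec_loads, price):
    """State vector: SOC, max wind output, thermal loads, electrical loads, price."""
    return [soc, p_wind_max, *heat_loads, *elec_loads, price]


def build_action(p_gas, p_storage, p_market, p_wind):
    """Action vector: gas outputs, storage powers, market trade, fan outputs."""
    return [*p_gas, *p_storage, p_market, *p_wind]


def scale_action(unit_action, lower, upper):
    """Map each component from [0, 1] to its continuous range [lower, upper].

    No discretization is involved, so every value inside the range is reachable.
    """
    return [lo + u * (hi - lo) for u, lo, hi in zip(unit_action, lower, upper)]
```

For example, `scale_action([0.5], [-10.0], [10.0])` places a storage unit at zero net power when its range runs from 10 units of charging to 10 units of discharging.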

According to a preferred embodiment of the present invention, the updating the intermediate virtual power plant model based on the noise mechanism and the attention mechanism, to obtain the model to be trained includes:

configuring target parameters;

generating a random number based on the Logistic mapping;

comparing the target parameter with the random number to obtain a comparison result;

when the comparison result is that the target parameter is greater than or equal to the random number, adding noise into an action network of the intermediate virtual power plant model; or when the comparison result is that the target parameter is smaller than the random number, no noise is added to the action network;

Acquiring an attention mechanism network, and updating the action network based on the attention mechanism network to obtain the model to be trained; wherein the attention mechanism network consists of a fully connected layer and a Softmax activation function.

According to a preferred embodiment of the invention:

the target parameter λ is expressed as follows:；

wherein the two coefficients in the expression are a first coefficient and a second coefficient for adjusting the target parameter λ; r_t represents the reward value at time t; and the averaged quantity represents the average reward value over the virtual power plant operating period;

the random number L_{n+1} is expressed as follows:

$$L_{n+1} = Y \cdot L_n \cdot (1 - L_n)$$

wherein L_n represents the n-th value of the iteration sequence of the Logistic map, and Y represents the bifurcation parameter.

According to a preferred embodiment of the present invention, the optimizing training the model to be trained based on the priority experience storage policy, and obtaining the optimized target virtual power plant model includes:

in each round of training process, acquiring the current virtual power plant environment state of the current round and the current action obtained through the action network;

detecting whether noise is added to the action network based on the noise mechanism to obtain random actions;

executing the random action based on the current virtual power plant environment state to obtain a current rewarding value and a virtual power plant environment state at the next moment;

determining the current experience as an excellent experience and adding it to the experience replay pool if its experience value is greater than the average of all other experience values found within a specified period; or, if the current experience value is less than or equal to that average, determining according to a preset probability whether to add it to the experience replay pool;

randomly extracting experience samples from the experience replay pool, and inputting the experience samples into the action network to obtain random actions at the next moment;

acquiring the current value of the Q function of the current-round model corresponding to the model to be trained, and updating the model parameters of the current-round model according to the current value of the Q function;

and stopping training when detecting that the value of the Q function reaches the maximum in any round of training, and determining a model obtained by the round of training as the target virtual power plant model.
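The storage rule in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `PriorityStorageBuffer` is hypothetical, the "experience value" is taken to be a scalar such as the reward, and the "specified period" is modeled as a sliding window of recent values. The experience playback (replay) pool is a bounded queue.

```python
import random
from collections import deque


class PriorityStorageBuffer:
    """Replay pool that always keeps above-average experiences."""

    def __init__(self, capacity, window, keep_prob, seed=None):
        self.pool = deque(maxlen=capacity)   # experience replay pool
        self.recent = deque(maxlen=window)   # values seen in the specified period
        self.keep_prob = keep_prob           # preset probability for ordinary experiences
        self.rng = random.Random(seed)

    def offer(self, transition, value):
        """Decide whether to store a transition; return True if it was stored."""
        # Average of the other recent values, excluding the current one.
        baseline = sum(self.recent) / len(self.recent) if self.recent else None
        self.recent.append(value)
        if baseline is None or value > baseline:
            store = True                                   # excellent experience
        else:
            store = self.rng.random() < self.keep_prob     # kept with preset probability
        if store:
            self.pool.append(transition)
        return store

    def sample(self, batch_size):
        """Uniformly sample stored transitions without replacement."""
        k = min(batch_size, len(self.pool))
        return self.rng.sample(list(self.pool), k)
```

Filtering at storage time, rather than reweighting at sampling time, keeps sampling uniform and cheap; the preset probability still lets some ordinary experiences through so the pool does not contain only high-value transitions.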

A virtual power plant optimal operation device, the virtual power plant optimal operation device comprising:

the construction unit is used for constructing an objective function and constraint conditions and constructing an initial virtual power plant model according to the objective function and the constraint conditions;

The establishing unit is used for establishing a state space, an action space and a reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain an intermediate virtual power plant model;

the updating unit is used for updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;

the training unit is used for carrying out optimization training on the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model;

the input unit is used for acquiring current power data and inputting the power data into the target virtual power plant model;

and the optimizing unit is used for generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model and executing the virtual power plant optimizing operation strategy.

A computer device, the computer device comprising:

a memory storing at least one instruction; and

And the processor executes the instructions stored in the memory to realize the virtual power plant optimized operation method.

A computer-readable storage medium having stored therein at least one instruction for execution by a processor in a computer device to implement the virtual power plant optimized operation method.

According to the above technical scheme, the invention builds the initial virtual power plant model from the self-built objective function and constraint conditions; creates the state space, the action space and the reward function based on the Markov decision model and the TD3 algorithm to obtain the intermediate virtual power plant model; updates the intermediate virtual power plant model based on the noise mechanism and the attention mechanism; and performs optimization training on the model to be trained based on the priority experience storage strategy to obtain the optimized target virtual power plant model. Current power data is then input into the target virtual power plant model to generate and execute the virtual power plant optimized operation strategy. The resulting operation strategy is more stable and can effectively reduce electricity costs.

Drawings

FIG. 1 is a flow chart of a preferred embodiment of the virtual power plant optimization method of the present invention.

Fig. 2 is a power supply and demand balance diagram of the present invention.

FIG. 3 is a comparison of the virtual power plant optimization method of the present invention with other optimization methods.

FIG. 4 is a functional block diagram of a preferred embodiment of the virtual power plant optimized operation apparatus of the present invention.

FIG. 5 is a schematic diagram of a computer device implementing a preferred embodiment of the method of optimizing operation of a virtual power plant according to the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.

FIG. 1 is a flow chart of a preferred embodiment of the virtual power plant optimization method of the present invention. The order of the steps in the flowchart may be changed and some steps may be omitted according to various needs.

The virtual power plant optimized operation method is applied to one or more computer devices. A computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA), a digital signal processor (Digital Signal Processor, DSP), an embedded device and the like.

The computer device may be any electronic product that can interact with a user in a human-computer manner, such as a personal computer, tablet computer, smart phone, personal digital assistant (Personal Digital Assistant, PDA), game console, interactive internet protocol television (Internet Protocol Television, IPTV), smart wearable device, etc.

The computer device may also include a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group composed of a plurality of network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing (Cloud Computing).

The server may be an independent server, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.

Among these, artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain optimal results.

Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.

The network in which the computer device is located includes, but is not limited to, the internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (Virtual Private Network, VPN), and the like.

S10, constructing an objective function and constraint conditions, and constructing an initial virtual power plant model according to the objective function and the constraint conditions.

In this embodiment, the constructing the objective function and the constraint condition includes:

Specifically, the objective function is expressed as follows:

$$\min C = \sum_{t=1}^{T}\left[C_{wt}(t) + C_{gas}(t) + C_{es}(t) + C_{maket}(t)\right]$$

wherein:；

wherein min C represents the minimum of the total cost, C_wt(t) represents the wind power generation cost, C_gas(t) represents the cost of the gas turbine unit, C_es(t) represents the energy storage cost, and C_maket(t) represents the electricity cost of the electricity market; N_wt represents the number of fans, N_gas represents the number of gas turbines, and N_es represents the number of energy storage devices; C_wt represents the running cost of power generation of a single fan, and P_wt,k,t represents the output of the k-th fan at time t; C_CH4 represents the unit price of natural gas; L_HVNG represents the lower heating value of natural gas, η_gas represents the power generation efficiency, and P_gas,i,t represents the output of the i-th gas turbine at time t; c_es represents the charge and discharge cost coefficient of a single energy storage device, and P_es,n,t represents the charge and discharge power of the n-th energy storage unit at time t; c_maket,t represents the market electricity price at time t, P_maket,t represents the market electric energy trading volume at time t, Δt represents the time interval, and T represents the operation period of the virtual power plant;
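To make the cost structure concrete, the following Python sketch evaluates the four cost components for each time step and sums them over the operation period. The per-component formulas are assumptions inferred from the symbol definitions (the patent's own expressions are not legible here), and the function and parameter names are hypothetical.

```python
def step_cost(p_wind, p_gas, p_storage, p_market, price, *,
              c_wt, c_ch4, lhv_ng, eta_gas, c_es, dt=1.0):
    """Total cost of one time step under the assumed component formulas."""
    # Wind: per-fan running cost times output.
    wind = sum(c_wt * p * dt for p in p_wind)
    # Gas: fuel volume = electrical energy / (efficiency * lower heating value).
    gas = sum(c_ch4 * p * dt / (eta_gas * lhv_ng) for p in p_gas)
    # Storage: wear cost on charge and discharge alike (absolute value assumed).
    storage = sum(c_es * abs(p) * dt for p in p_storage)
    # Market: positive trade means purchase (assumed sign convention).
    market = price * p_market * dt
    return wind + gas + storage + market


def total_cost(steps, **coeffs):
    """Sum the step costs over the whole operation period."""
    return sum(step_cost(**step, **coeffs) for step in steps)


# Usage with made-up numbers, for illustration only.
coeffs = dict(c_wt=0.1, c_ch4=2.0, lhv_ng=10.0, eta_gas=0.5, c_es=0.05)
one_step = dict(p_wind=[100.0], p_gas=[50.0], p_storage=[-20.0],
                p_market=10.0, price=0.5)
example_total = total_cost([one_step, one_step], **coeffs)
```

The numbers in the usage lines are arbitrary and are not taken from the patent.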

the virtual power plant power balance constraint is expressed as follows:

the energy storage device charge-discharge constraints are expressed as follows:

wherein the quantities in the charge-discharge constraints represent, respectively: the charging power of the energy storage device at time t; the discharging power of the energy storage device at time t; the maximum charging power of the energy storage device; the maximum discharging power of the energy storage device; the SOC value of the energy storage in period J; the upper limit of the SOC value of the energy storage in period J; and the lower limit of the SOC value of the energy storage in period J;

The energy storage device charge and discharge constraints limit the charge and discharge power and the state of charge of the energy storage device. They ensure normal and stable operation of the energy storage system and prevent irreversible damage to the battery and other hazards caused by overcharging and over-discharging.

The output constraint of the wind turbine generator is expressed as follows:；

the gas unit operation constraints are expressed as follows:；

In the above embodiment, the initial virtual power plant model is built with minimum cost as the objective and with constraints in multiple dimensions, so that the model can operate at minimum cost while satisfying multiple constraint conditions, which also ensures operational stability.

And S11, creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a Markov decision model and a TD3 (Twin Delayed Deep Deterministic Policy Gradient) algorithm to obtain an intermediate virtual power plant model.
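For readers unfamiliar with TD3, the sketch below shows how the standard algorithm computes its critic target: the target action is smoothed with clipped noise, and the smaller of two target critic estimates is used, which is how TD3 counters value overestimation. This illustrates generic TD3 only; the function name, the callable interfaces and the hyperparameter defaults are assumptions and are not taken from the patent.

```python
import random


def td3_target(reward, next_state, done,
               actor_target, critic1_target, critic2_target, *,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=0.0, act_high=1.0, rng=random):
    """Critic target y = r + gamma * min(Q1', Q2')(s', smoothed target action)."""
    smoothed = []
    for a in actor_target(next_state):
        # Target policy smoothing: clipped Gaussian noise on the target action.
        eps = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
        smoothed.append(max(act_low, min(act_high, a + eps)))
    # Clipped double-Q: take the more pessimistic of the two target critics.
    q_next = min(critic1_target(next_state, smoothed),
                 critic2_target(next_state, smoothed))
    return reward + gamma * (0.0 if done else q_next)
```

In full TD3 the actor and the target networks are also updated less often than the critics; that delayed update is omitted here for brevity.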

In this embodiment, creating the state space, the action space and the reward function corresponding to the initial virtual power plant model based on the Markov decision model and the TD3 algorithm to obtain the intermediate virtual power plant model includes:

wherein the state space S_t is expressed as follows:

SOC represents the SOC value of the stored energy;

wherein the action space a_t is expressed as follows:

wherein the reward function r_t is expressed as follows:

In this embodiment, the reward function comprehensively considers both action rewards and environment rewards, which makes the learning target more definite. Specifically, after any action is selected according to the distribution over the action space in a given environment state, the environment returns a reward. The problem of minimizing the total operating cost of the virtual power plant can thus be converted into maximizing the reward, provided that the constraints are satisfied.
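The conversion from cost minimization to reward maximization can be illustrated with a short sketch. The reward expression itself is not legible in this text, so the form below (negative of total cost plus a penalty proportional to the constraint violation) is an assumption, and the function name is hypothetical.

```python
def reward(total_cost, violations, penalty_coeff):
    """Negative of (cost + penalty), so a lower cost yields a higher reward.

    `violations` holds signed constraint excesses; only positive values,
    meaning a limit was actually exceeded, are penalized.
    """
    excess = sum(max(0.0, v) for v in violations)
    return -(total_cost + penalty_coeff * excess)
```

With this shape, an agent that maximizes cumulative reward is pushed toward low-cost schedules, and a sufficiently large penalty coefficient makes constraint violations unattractive.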

And S12, updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained.

In this embodiment, updating the intermediate virtual power plant model based on the noise mechanism and the attention mechanism, to obtain the model to be trained includes:

configuring target parameters;

generating a random number based on the Logistic mapping;

Specifically, the target parameter λ is expressed as follows:；

the random number L_{n+1} is expressed as follows:

$$L_{n+1} = Y \cdot L_n \cdot (1 - L_n)$$

When 3.5699456 ≤ Y ≤ 4, the Logistic map enters a chaotic state, and the generated chaotic sequence has good random distribution characteristics.
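The noise-gating step can be sketched as follows. The Logistic map is iterated in its standard form x ← Y·x·(1 − x), and exploration noise is added to the action only when the target parameter λ is greater than or equal to the chaotic value, as the comparison rule in this embodiment states. The function names, the Gaussian noise and its scale are assumptions; the expression for λ is not legible in this text, so λ is passed in as a plain number.

```python
import random


def logistic_sequence(x0, y, n):
    """Return n successive iterates of the Logistic map starting from x0."""
    values = []
    x = x0
    for _ in range(n):
        x = y * x * (1.0 - x)
        values.append(x)
    return values


def maybe_add_noise(action, lam, chaos_value, noise_std=0.1, rng=random):
    """Add Gaussian noise to the action only if lam >= chaos_value."""
    if lam >= chaos_value:
        return [a + rng.gauss(0.0, noise_std) for a in action]
    return list(action)  # no noise: exploit the current policy output
```

For instance, `logistic_sequence(0.3, 3.9, 100)` yields values that stay strictly between 0 and 1, so each value can be compared directly with a λ in the same range.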

The action network is improved based on the attention mechanism. The observed state is used as the input vector. It passes through the fully connected layer of the attention network, and the output of that layer is activated by a Softmax activation function to obtain the weights corresponding to the components of the input vector. Each weight is multiplied by its corresponding component to obtain a new vector, which is the output of the attention network. The new vector then passes through a fully connected layer and a Sigmoid activation function to obtain the action in the corresponding state.
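The forward pass described above can be written out in a few lines of dependency-free Python. This is a sketch under stated assumptions: the function names and the list-of-rows weight layout are hypothetical, the attention layer is assumed to produce one score per state component, and the layer sizes and trained parameters are not specified in the text.

```python
import math


def linear(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(scores):
    """Numerically stable Softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def attention_actor(state, att_w, att_b, out_w, out_b):
    """State -> FC -> Softmax weights -> weighted state -> FC -> Sigmoid action."""
    weights = softmax(linear(state, att_w, att_b))        # one weight per component
    attended = [w * s for w, s in zip(weights, state)]    # output of attention network
    return [sigmoid(v) for v in linear(attended, out_w, out_b)]
```

Because the Sigmoid output lies in (0, 1), each action component would still need to be rescaled to the physical range of the corresponding unit.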

During policy training of the TD3 algorithm model, noise is superimposed on the output of the action network in order to expand the exploration capability of the algorithm model. However, unnecessary noise superposition adds computational cost to the model and causes excessive exploration. This embodiment therefore improves the action exploration mechanism of the traditional TD3 algorithm so that the model can better balance exploration and exploitation in a dynamic environment.

And S13, optimizing and training the model to be trained based on a priority experience storage strategy to obtain an optimized target virtual power plant model.

In this embodiment, the optimizing training the model to be trained based on the priority experience storage policy, and obtaining the optimized target virtual power plant model includes:

The preset probability can be configured according to an actual running environment.

Through the embodiment, the optimized target virtual power plant model can be trained by further combining with the priority experience storage strategy.

S14, acquiring current power data, and inputting the power data into the target virtual power plant model.

In this embodiment, the power data may include, but is not limited to, one or more of the following combinations of data:

the current time period, the current regional characteristics, the number of fans, the number of gas turbines, the number of energy storage devices, and the like.

And S15, generating a virtual power plant optimizing operation strategy according to the output data of the target virtual power plant model, and executing the virtual power plant optimizing operation strategy.

In this embodiment, operation strategies such as whether electricity needs to be purchased to supplement demand in the current period and whether surplus electric energy needs to be stored can be determined from the output data of the target virtual power plant model, so as to ensure reasonable use of electric power resources and maintain the stability of the electric power system.

For example, please refer to Fig. 2, which is a power supply and demand balance diagram of the present invention. As can be seen from Fig. 2, the virtual power plant optimized operation strategy may vary with supply and demand as follows. In the periods 1:00-7:00 and 16:00-18:00, the electricity price is low; the internal power supply of the virtual power plant comes mainly from wind power and gas turbine generation, and the remaining electricity demand is met by purchasing electricity. In the period 2:00-4:00, electricity demand is low, so low-priced surplus electric energy is stored. In the period 8:00-10:00, the virtual power plant no longer purchases electricity; the energy storage discharges a small amount in part of the period, and the load demand is still met mainly by wind power and gas turbine generation. In the periods 11:00-15:00 and 19:00-21:00, the overall electric load demand is high and the electricity price is at its peak; to reduce the total operating cost, electricity purchases are reduced, the system is supplied by wind power and the gas turbine, and the remaining demand is met by energy storage discharge. In the period 22:00-23:00, wind power and gas turbine generation cannot meet the electricity demand, but the electricity price is low, so the system balances the electric load by purchasing electricity. At 24:00, the electric load demand is low and can be met by wind power and gas turbine generation.

Please refer to Fig. 3, which compares the virtual power plant optimized operation method with other optimization methods. Specifically, scheme 1 is a virtual power plant optimized operation method based on DDPG (Deep Deterministic Policy Gradient); scheme 2 is a virtual power plant optimized operation method based only on the traditional TD3; scheme 3 is the virtual power plant optimized operation method proposed in this embodiment. As can be seen, the average daily operating cost of scheme 3 is 16582 yuan, which is 7.55% and 3.4% lower than that of scheme 1 and scheme 2, respectively; the minimum daily operating cost of scheme 3 is 14554 yuan, which is 6.58% and 4.45% lower than that of scheme 1 and scheme 2, respectively; and the maximum daily operating cost of scheme 3 is 18762 yuan, which is 10.53% and 3.97% lower than that of scheme 1 and scheme 2, respectively. Therefore, compared with scheme 1 and scheme 2, the virtual power plant optimized operation method provided by this embodiment performs better and can effectively reduce the system operating cost.

FIG. 4 is a functional block diagram of a preferred embodiment of the virtual power plant optimized operation apparatus of the present invention. The virtual power plant optimizing operation device 11 comprises a construction unit 110, a creation unit 111, an updating unit 112, a training unit 113, an input unit 114 and an optimizing unit 115. The module/unit referred to in the present invention refers to a series of computer program segments, which are stored in a memory, capable of being executed by a processor and of performing a fixed function. In the present embodiment, the functions of the respective modules/units will be described in detail in the following embodiments.

The construction unit 110 is configured to construct an objective function and a constraint condition, and construct an initial virtual power plant model according to the objective function and the constraint condition;

the creating unit 111 is configured to create a state space, an action space and a reward function corresponding to the initial virtual power plant model based on a Markov decision model and a TD3 algorithm, so as to obtain an intermediate virtual power plant model;

the updating unit 112 is configured to update the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained;

the training unit 113 is configured to perform optimization training on the model to be trained based on a priority experience storage policy, so as to obtain an optimized target virtual power plant model;

The input unit 114 is configured to obtain current power data, and input the power data to the target virtual power plant model;

the optimizing unit 115 is configured to generate a virtual power plant optimizing operation policy according to the output data of the target virtual power plant model, and execute the virtual power plant optimizing operation policy.

FIG. 5 is a schematic diagram of a computer device for implementing a preferred embodiment of the method for optimizing operation of a virtual power plant according to the present invention.

The computer device 1 may comprise a memory 12, a processor 13 and a bus, and may further comprise a computer program stored in the memory 12 and executable on the processor 13, such as a virtual power plant optimized operation program.

It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the computer device 1 and does not constitute a limitation of the computer device 1. The computer device 1 may have a bus-type structure or a star-type structure, and may comprise more or fewer hardware or software components than illustrated, or a different arrangement of components. For example, the computer device 1 may further comprise an input-output device, a network access device, and the like.

It should be noted that the computer device 1 is only an example. Other existing or future electronic products that are applicable to the present invention are also included in the scope of protection of the present invention and are incorporated herein by reference.

The memory 12 includes at least one type of readable storage medium, including flash memory, a removable hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 12 may in some embodiments be an internal storage unit of the computer device 1, such as a removable hard disk of the computer device 1. The memory 12 may in other embodiments be an external storage device of the computer device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card) or the like provided on the computer device 1. Further, the memory 12 may include both an internal storage unit and an external storage device of the computer device 1. The memory 12 may be used not only for storing application software installed on the computer device 1 and various types of data, such as the code of the virtual power plant optimized operation program, but also for temporarily storing data that has been output or is to be output.

The processor 13 may in some embodiments be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 13 is the control unit (Control Unit) of the computer device 1; it connects the respective components of the entire computer device 1 using various interfaces and lines, runs or executes programs or modules stored in the memory 12 (for example, the virtual power plant optimized operation program), and invokes data stored in the memory 12 to perform the various functions of the computer device 1 and to process data.

The processor 13 executes the operating system of the computer device 1 and the various installed applications. The processor 13 executes the application program to implement the steps of the virtual power plant optimized operation method embodiments described above, such as the steps shown in FIG. 1.

Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory 12 and executed by the processor 13 to carry out the present invention. The one or more modules/units may be a series of computer readable instruction segments capable of performing specified functions, the instruction segments describing the execution of the computer program in the computer device 1. For example, the computer program may be divided into a construction unit 110, a creation unit 111, an updating unit 112, a training unit 113, an input unit 114 and an optimizing unit 115.

The integrated units implemented in the form of software functional modules described above may be stored in a computer readable storage medium. The software functional module is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute a portion of the virtual power plant optimized operation method according to the embodiments of the present invention.

The modules/units integrated in the computer device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on this understanding, the present invention may also be implemented by a computer program for instructing a relevant hardware device to implement all or part of the procedures of the above-mentioned embodiment method, where the computer program may be stored in a computer readable storage medium and the computer program may be executed by a processor to implement the steps of each of the above-mentioned method embodiments.

The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.

The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another by cryptographic means, each data block containing a batch of network transaction information that is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.

The bus may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. For ease of illustration, only one straight line is shown in FIG. 5, but this does not mean that there is only one bus or one type of bus. The bus is arranged to enable connection and communication between the memory 12, the at least one processor 13, and the like.

Although not shown, the computer device 1 may further comprise a power source (such as a battery) for powering the various components. Preferably, the power source is logically connected to the at least one processor 13 via a power management device, so that charge management, discharge management and power consumption management are achieved by the power management device. The power source may also include one or more direct current or alternating current power supplies, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and the like. The computer device 1 may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which are not described in detail herein.

Further, the computer device 1 may also comprise a network interface, optionally comprising a wired interface and/or a wireless interface (e.g. a Wi-Fi interface, a Bluetooth interface, etc.), typically used for establishing a communication connection between the computer device 1 and other computer devices.

The computer device 1 may optionally further comprise a user interface, which may be a display (Display) or an input unit such as a keyboard (Keyboard), and may optionally also be a standard wired interface or a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch display, or the like. The display may also be referred to as a display screen or display unit, and is used for displaying information processed in the computer device 1 and for displaying a visual user interface.

It should be understood that the embodiments described are for illustrative purposes only, and that the scope of the patent application is not limited by this configuration.

FIG. 5 shows only a computer device 1 with components 12-13. It will be understood by those skilled in the art that the structure shown in FIG. 5 does not limit the computer device 1, which may include fewer or more components than shown, may combine certain components, or may have a different arrangement of components.

In connection with FIG. 1, the memory 12 in the computer device 1 stores a plurality of instructions for implementing a virtual power plant optimized operation method, which the processor 13 can execute to implement:

Specifically, the specific implementation method of the above instructions by the processor 13 may refer to the description of the relevant steps in the corresponding embodiment of fig. 1, which is not repeated herein.

The data used in this application were obtained legally.

In the several embodiments provided in the present invention, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.

The invention is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional modules in the embodiments of the present invention may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware, or in the form of hardware plus software functional modules.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.

The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means stated in the invention may also be implemented by one unit or means, through software or hardware. The terms first, second, etc. are used to denote names and do not indicate any particular order.

Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims

1. The virtual power plant optimizing operation method is characterized by comprising the following steps of:

2. The method for optimizing operation of a virtual power plant of claim 1, wherein the constructing objective functions and constraints comprises:

3. The virtual power plant optimization operating method of claim 2, wherein:

the objective function is expressed as follows:；

wherein:；

wherein $\min C$ represents the minimum of the total cost, $C_{wt}(t)$ represents the wind power generation cost, $C_{gas}(t)$ represents the cost of the gas turbine units, $C_{es}(t)$ represents the energy storage cost, and $C_{maket}(t)$ represents the cost of electricity traded in the electricity market; $N_{wt}$ represents the number of fans, $N_{gas}$ represents the number of gas turbines, and $N_{es}$ represents the number of energy storage devices; $C_{wt}$ represents the operating cost of power generation of a single fan, and a further symbol (not reproduced in this text) represents the output of the k-th fan at time t; $C_{CH4}$ represents the unit price of natural gas; $L_{HVNG}$ represents the low calorific value of natural gas, $n_{gas}$ represents the power generation efficiency, and $P_{gas,i,t}$ represents the output of the i-th gas turbine at time t; $c_{es}$ represents the charge and discharge cost coefficient of a single energy storage device, and $P_{es,n,t}$ represents the charge/discharge power of the n-th energy storage unit at time t; $c_{maket,t}$ represents the market electricity price at time t, $P_{maket,t}$ represents the market electric energy trading quantity at time t, $\Delta T$ represents the time increment, and $T$ represents the operation period of the virtual power plant;
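The formula for the objective is not reproduced in this text, so its exact form cannot be confirmed here. As an illustration only, the following Python sketch assembles a total cost from the quantities defined above under assumed functional forms: wind cost linear in fan output, gas cost as fuel price times output divided by (calorific value times efficiency), storage cost on the absolute charge/discharge power, and market cost as price times traded quantity, each multiplied by the time increment. These forms, the function name `total_cost`, and all parameter names are assumptions.

```python
from typing import Sequence


def total_cost(p_wt: Sequence[Sequence[float]],
               p_gas: Sequence[Sequence[float]],
               p_es: Sequence[Sequence[float]],
               p_market: Sequence[float],
               price_market: Sequence[float],
               *, c_wt: float, c_ch4: float, l_hvng: float,
               eta_gas: float, c_es: float, dt: float) -> float:
    """Sum the four assumed cost terms over all time steps.

    p_wt[t][k], p_gas[t][i], p_es[t][n] are per-device powers at step t;
    p_market[t] and price_market[t] are the traded quantity and price at step t.
    """
    total = 0.0
    for t in range(len(p_market)):
        cost_wt = c_wt * sum(p_wt[t])                           # assumed: linear in fan output
        cost_gas = c_ch4 * sum(p_gas[t]) / (l_hvng * eta_gas)   # assumed: fuel burned times price
        cost_es = c_es * sum(abs(p) for p in p_es[t])           # assumed: charged or discharged power
        cost_market = price_market[t] * p_market[t]             # negative quantity means selling
        total += (cost_wt + cost_gas + cost_es + cost_market) * dt
    return total
```

A scheduler would minimize this value over the decision variables subject to the constraints that follow; the sketch only evaluates the cost of a given schedule.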

the virtual plant power balance constraint is expressed as follows:；

wherein the symbols represent, in order: the charging power of the energy storage device at time t; the discharge power of the energy storage device at time t; the maximum charging power of the energy storage device; the maximum discharge power of the energy storage device; the SOC value of the energy storage in period J; the upper limit of the SOC value of the energy storage in period J; and the lower limit of the SOC value of the energy storage in period J;
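The quantities listed above (charging and discharge power, their maxima, and the SOC bounds) can be illustrated with a small feasibility projection. This is a sketch under assumptions: the SOC update is taken to be linear in net charging power with no efficiency losses, and the function name `apply_storage_limits` and its parameters are hypothetical, not taken from the source.

```python
from typing import Tuple


def apply_storage_limits(p_ch: float, p_dis: float, soc: float, *,
                         p_ch_max: float, p_dis_max: float,
                         soc_min: float, soc_max: float,
                         capacity: float, dt: float) -> Tuple[float, float, float]:
    """Clamp requested powers to the power limits and the SOC headroom.

    Assumes soc_min <= soc <= soc_max on entry and a loss-free linear SOC model:
    new_soc = soc + (p_ch - p_dis) * dt / capacity.
    """
    # Power limits: 0 <= p_ch <= p_ch_max and 0 <= p_dis <= p_dis_max.
    p_ch = min(max(p_ch, 0.0), p_ch_max)
    p_dis = min(max(p_dis, 0.0), p_dis_max)
    # SOC limits: do not charge above soc_max or discharge below soc_min.
    p_ch = min(p_ch, (soc_max - soc) * capacity / dt)
    p_dis = min(p_dis, (soc - soc_min) * capacity / dt)
    new_soc = soc + (p_ch - p_dis) * dt / capacity
    return p_ch, p_dis, new_soc
```

In a reinforcement learning setting, a projection of this kind is one common way to keep a continuous action inside the feasible set; a penalty term in the reward is another.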

the output constraint of the wind turbine generator is expressed as follows:；

the gas unit operation constraints are expressed as follows:；

4. The method for optimizing operation of a virtual power plant according to claim 3, wherein creating a state space, an action space and a reward function corresponding to the initial virtual power plant model based on the markov decision model and the TD3 algorithm, and obtaining the intermediate virtual power plant model comprises:

wherein the state space $S_t$ is expressed as follows:；

SOC represents the state of charge (SOC) value of the energy storage;

wherein the action space $a_t$ is expressed as follows:；

wherein the reward function $r_t$ is expressed as follows:；
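For orientation, TD3 (twin delayed deep deterministic policy gradient) trains its critics against a target value built from two target critics and a smoothed target action. The sketch below shows that standard target computation for a scalar action in plain Python. The networks are stand-in callables, and the function name `td3_target`, the default hyperparameters, and the action bounds are assumptions; the state, action and reward definitions of the claim are not reproduced in this text.

```python
import random
from typing import Callable


def clip(x: float, lo: float, hi: float) -> float:
    """Limit x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward: float, next_state: float, done: bool, *,
               target_actor: Callable[[float], float],
               target_q1: Callable[[float, float], float],
               target_q2: Callable[[float, float], float],
               gamma: float = 0.99, policy_noise: float = 0.2,
               noise_clip: float = 0.5,
               a_low: float = -1.0, a_high: float = 1.0,
               rng: random.Random = random.Random()) -> float:
    """Return r + gamma * min(Q1', Q2') evaluated at a smoothed target action."""
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = clip(rng.gauss(0.0, policy_noise), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise, a_low, a_high)
    # Clipped double Q-learning: take the smaller of the two target critics.
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    return reward + gamma * (0.0 if done else 1.0) * q_min
```

Because the action is a bounded continuous value rather than a discrete index, device set-points such as charging power or gas turbine output can be used directly without discretization.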

5. The method for optimizing operation of a virtual power plant according to claim 1, wherein updating the intermediate virtual power plant model based on a noise mechanism and an attention mechanism to obtain a model to be trained comprises:

configuring target parameters;

generating a random number based on the Logistic mapping;

6. The method for optimized operation of a virtual power plant as claimed in claim 5, wherein:

the target parameter λ is expressed as follows:；

wherein the first two symbols represent, respectively, a first coefficient and a second coefficient for adjusting the target parameter λ; $r_t$ represents the reward value at time t; and the last symbol represents the average reward value over the virtual power plant operating period;

the random number $L_{n+1}$ is expressed as follows:；
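The expression for the random number is not reproduced in this text. As an illustration, the sketch below uses the textbook Logistic map, $x_{n+1} = \mu x_n (1 - x_n)$, which is chaotic for $\mu = 4$ and keeps values in $[0, 1]$. The choice of $\mu$, the seed value, and the function name `logistic_sequence` are assumptions and may differ from the claimed expression.

```python
from typing import List


def logistic_sequence(x0: float, n: int, mu: float = 4.0) -> List[float]:
    """Iterate the Logistic map n times from seed x0 and return the iterates."""
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    if not 0.0 <= mu <= 4.0:
        raise ValueError("mu must lie in [0, 4] to keep values in [0, 1]")
    values = []
    x = x0
    for _ in range(n):
        x = mu * x * (1.0 - x)  # x_{n+1} = mu * x_n * (1 - x_n)
        values.append(x)
    return values
```

A sequence of this kind is deterministic given its seed, which makes exploration noise reproducible; scaling it to the action range is left to the caller.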

7. The method for optimizing operation of a virtual power plant according to claim 5, wherein the optimizing training the model to be trained based on the priority experience storage strategy, to obtain the optimized target virtual power plant model, comprises:
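The steps of the priority experience storage strategy are not listed in this text. For context, a common realization of prioritized experience storage is proportional prioritized replay, in which a transition is sampled with probability proportional to a power of its absolute TD error. The sketch below shows that general technique only; the class name `PrioritizedBuffer`, the exponent `alpha`, and the first-in-first-out eviction rule are assumptions and are not the claimed procedure.

```python
import random
from typing import Any, List, Optional


class PrioritizedBuffer:
    """Proportional prioritized replay: P(i) = p_i / sum(p), p_i = (|td| + eps) ** alpha."""

    def __init__(self, capacity: int, alpha: float = 0.6,
                 eps: float = 1e-6, seed: Optional[int] = None) -> None:
        self.capacity = capacity
        self.alpha = alpha
        self.eps = eps
        self.items: List[Any] = []
        self.priorities: List[float] = []
        self.rng = random.Random(seed)

    def add(self, transition: Any, td_error: float) -> None:
        """Store a transition, evicting the oldest one when the buffer is full."""
        if len(self.items) >= self.capacity:
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def probabilities(self) -> List[float]:
        """Return the sampling probability of each stored transition."""
        total = sum(self.priorities)
        return [p / total for p in self.priorities]

    def sample(self, k: int) -> List[Any]:
        """Draw k transitions with replacement, weighted by priority."""
        return self.rng.choices(self.items, weights=self.priorities, k=k)
```

Sampling high-error transitions more often concentrates training on experiences the critics currently predict poorly; full implementations usually add importance-sampling weights to correct the resulting bias.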

8. A virtual power plant optimal operation device, characterized in that the virtual power plant optimal operation device comprises:

9. A computer device, the computer device comprising:

a memory storing at least one instruction; and

a processor executing the instructions stored in the memory to implement the virtual power plant optimized operation method of any one of claims 1 to 7.

10. A computer-readable storage medium, characterized by: the computer-readable storage medium having stored therein at least one instruction for execution by a processor in a computer device to implement the virtual power plant optimized operation method of any one of claims 1 to 7.