CN116306864A - Method for training deep learning model of power system - Google Patents
- Publication number: CN116306864A
- Application number: CN202310033698.6A
- Authority
- CN
- China
- Prior art keywords
- time
- gru
- architecture
- encoder
- decoder
- Prior art date
- Legal status: Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0637—Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
A method of training a deep learning model of a power system, comprising the following steps: step 1: constructing a loss function based on the mean absolute error (MAE) for an Encoder-Decoder architecture; step 2: using the Adam algorithm as the update algorithm for the neuron weights, so as to train each parameter of the GRU neurons in the Encoder-Decoder architecture; step 3: adaptively finding the learning rate of each parameter with the Adam algorithm. The invention addresses the technical problem that, when a deep learning model based on the Seq2Seq technique is used to train differentiated sample data, training efficiency is low because the unit start-stop state matrix and the output state matrix of an actual power system are high-dimensional sample matrices.
Description
Technical Field
The invention belongs to the field of power systems and automation, and in particular to unit commitment decision methods based on deep learning algorithms. It is a divisional application of the data-driven unit commitment intelligent decision method based on GRU and Seq2Seq technology (application No. 2019108724540).
Background
An open and mature electricity market usually requires an independent system operator (ISO) to maintain its safe and reliable operation, and security-constrained unit commitment (SCUC) is an important theoretical basis for ISO decision making. In recent years the electricity market has developed rapidly worldwide; on the one hand, the ISO needs powerful computational tools to maintain market operation and to produce intelligent and refined day-ahead generation schedules; on the other hand, with the large-scale application of new energy technologies such as electric vehicles, intermittent energy sources and demand-side management, the challenges faced by ISO decision making keep emerging. Research on SCUC decision theory with high adaptability and high accuracy therefore has important theoretical and engineering significance for the development and marketization of the power industry.
Depending on the factors considered, current unit commitment problems can be broadly classified into multi-objective unit commitment, uncertain unit commitment, unit commitment considering diversified constraints and decision variables, and so on. Although the emphases differ, research on the unit commitment problem generally starts from the actual engineering problem, first proposing a mathematical model on the basis of mechanism analysis and then developing a corresponding mathematical method to solve it. This line of research is based on strict logical deduction and mechanism study and is driven by mathematical models and algorithms, so it can be called a unit commitment decision approach driven by physical models. Because the model itself has to be modified and reconstructed whenever new problems appear, this research approach may lack adaptability against a background in which the energy landscape changes daily and theoretical challenges keep emerging.
In engineering practice, once a unit commitment decision method is put into use, a large amount of structured historical data is accumulated, and in the long run unit commitment decisions also show a certain repeatability. If a data-driven unit commitment decision method could be provided that does not study the underlying mechanism but directly constructs the mapping between the known inputs and the decision results using deep learning, trains this mapping with massive historical decision data, and continuously corrects the model as historical data accumulate, unit commitment decision making would be endowed with self-evolution and self-learning capabilities. Such a data-driven decision method would not only greatly simplify the modeling and solution process of the unit commitment problem and reduce its complexity, but could also cope with continuously emerging theoretical problems and challenges through self-learning; however, research in this direction is still relatively rare. Addressing the time-series characteristics of unit commitment sample data, the paper "Research on a data-driven unit commitment intelligent decision method with self-learning capability" adopted a recurrent neural network, namely the long short-term memory (LSTM) network, as the core training tool for the first time, successfully constructed a data-driven unit commitment decision model, and verified the self-evolution characteristics of the method and its adaptability to different unit commitment problems. The method in that paper still has the following problems:
1) Because the LSTM model is overly complex, it not only requires a large amount of computing resources when processing high-dimensional training samples, but is also prone to overfitting. By contrast, the latest improved member of the recurrent neural network family, the gated recurrent unit (GRU), merges the input gate and the forget gate of the LSTM and simplifies the memory cell, which effectively reduces model complexity;
2) If a single recurrent neural network architecture is used directly for offline training, a single compromise mapping model is inevitably produced when the historical sample data differ greatly (for example, when load characteristics differ strongly between seasons), so the accuracy of online decisions is hard to guarantee. For this, the cited paper proposes the idea of clustered training: the historical scheduling data are first clustered, each class of samples is then trained separately to obtain several mapping models, and at decision time the class of the input data is determined first and the mapping model of the corresponding class is used for the online decision. Although this alleviates the fitting-accuracy problem of the single recurrent neural network architecture in the face of differentiated samples, it requires training several deep learning models, which greatly reduces training and decision efficiency.
As an effective means of handling sequence problems, the sequence-to-sequence (Seq2Seq) technique has in recent years been widely used in machine translation, intelligent question answering and similar tasks. Unlike the conventional single recurrent neural network architecture, in which a single neuron reads all inputs and outputs the result, the Seq2Seq technique uses two recurrent neural networks to construct an Encoder-Decoder architecture. The Encoder reads the input sequence step by step according to the time steps and then outputs the intermediate state C of the whole sequence. Because a recurrent neural network records the information of every training step, the intermediate state C can in theory represent the information of the entire input sequence. In the Decoder, the other recurrent neural network performs the inverse operation of the Encoder and decodes the received intermediate state C step by step to form the final output sequence. Since the intermediate state C can fully preserve the class information of the input and output sequences and the probabilities relating them, the Seq2Seq technique is, in theory, a feasible way to overcome the inability of a single recurrent neural network model to train differentiated sample data accurately. However, because the dimension of the unit start-stop matrix is proportional to the number of units in the system, the unit start-stop state matrix and the output state matrix of an actual power system are high-dimensional sample matrices, and directly training such data with a Seq2Seq-based deep learning model yields low training efficiency. Therefore, while introducing the Seq2Seq technique, it is also necessary to study a dimension-reduction strategy for the unit start-stop sample data in order to further improve training efficiency.
Disclosure of Invention
The technical problem to be solved by the invention is that, when a deep learning model based on the Seq2Seq technique is used to train differentiated sample data, training efficiency is low because the unit start-stop state matrix and the output state matrix of an actual power system are high-dimensional sample matrices.
The technical scheme adopted by the invention is as follows:
A data-driven unit commitment intelligent decision method based on GRU and Seq2Seq technology comprises the following steps:
1. for the high-dimensional unit commitment training sample matrix, compressing the dimension of the historical unit commitment decision data using a sample encoding technique;
2. introducing the Seq2Seq technique on the basis of the gated recurrent unit and establishing a composite neural network architecture oriented to unit commitment decision;
3. on this basis, building a unit commitment deep learning model and training a mapping model between the system daily load and the unit start-stop scheme from historical data;
4. making unit commitment decisions with the generated mapping model, obtaining the unit start-stop states and unit outputs under the optimal power flow model, and taking the resulting unit commitment decision results as new historical sample data with which the deep learning model is trained again, thereby realizing continuous correction of the model (an overall sketch of this loop is given below).
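To make the loop formed by steps 1-4 concrete, the following is a minimal Python sketch of the decision pipeline. The callables passed in (encode, train, decide, solve_opf) are hypothetical placeholders standing in for the sample encoding, the Seq2Seq + GRU training, the online decision and the optimal power flow solver; they are not part of the patent.

```python
from typing import Callable, List, Tuple
import numpy as np

def decision_loop(
    history: List[Tuple[np.ndarray, np.ndarray]],              # (daily load P_L, start-stop scheme U_G)
    encode: Callable[[np.ndarray], np.ndarray],                 # step 1: sample encoding (dimension compression)
    train: Callable[[list], object],                            # steps 2-3: offline training of the Seq2Seq + GRU model
    decide: Callable[[object, np.ndarray], np.ndarray],         # mapping model -> unit start-stop scheme
    solve_opf: Callable[[np.ndarray, np.ndarray], np.ndarray],  # unit output under the optimal power flow model
    new_loads: List[np.ndarray],
) -> object:
    """Sketch of the data-driven unit commitment decision loop (steps 1-4)."""
    samples = [(p_l, encode(u_g)) for p_l, u_g in history]      # step 1: compress the U_G dimension
    model = train(samples)                                       # steps 2-3: build the load -> start-stop mapping
    for p_l in new_loads:                                        # step 4: online decision and continuous correction
        u_g = decide(model, p_l)
        p_g = solve_opf(p_l, u_g)                                # p_g: dispatch part of the decision result
        history.append((p_l, u_g))                               # decision result becomes new historical data
        model = train([(p, encode(u)) for p, u in history])      # retrain -> self-correcting model
    return model
```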
In step 1, when the high-dimensional unit commitment training sample matrix is processed, the unit start-stop state vector of each period is encoded so that identical start-stop states always receive identical codes.
The unit start-stop state vector of each period is converted into its corresponding decimal code, thereby compressing the dimension of the sample matrix.
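As an illustration of this encoding, the following NumPy sketch interprets each period's 0/1 start-stop vector as a binary number and stores its decimal value; this is one straightforward way to obtain the per-period code described above and is given only as an example.

```python
import numpy as np

def encode_period(u: np.ndarray) -> int:
    """Convert one period's unit start-stop vector (one 0/1 entry per unit) into a decimal code."""
    bits = "".join(str(int(v)) for v in u)   # e.g. [1, 0, 1, 1] -> "1011"
    return int(bits, 2)                      # "1011" -> 11

def encode_schedule(U_G: np.ndarray) -> np.ndarray:
    """Compress a T x N start-stop matrix (T periods, N units) into a T x 1 code vector."""
    return np.array([[encode_period(row)] for row in U_G])

# Example: a 24 x 54 start-stop matrix is compressed into a 24 x 1 code vector;
# identical start-stop states always map to identical codes.
U_G = np.random.randint(0, 2, size=(24, 54))
codes = encode_schedule(U_G)
print(U_G.shape, "->", codes.shape)          # (24, 54) -> (24, 1)
```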
In step 2, an Encoder-Decoder composite neural network architecture is constructed based on GRU and Seq2Seq technology, specifically adopting the following steps:
1) A historical mapping sample (P_L, U_G) is substituted into the Encoder-Decoder architecture, where P_L is the daily load data and U_G is the corresponding unit start-stop scheme. The Encoder architecture reads P_L step by step according to the time steps; the GRU neuron hidden-layer state at time t is jointly determined by the GRU neuron hidden-layer state at time t-1 and the daily load at time t:
h_t = f(h_{t-1}, P_{Lt})    (1)
where h_t is the GRU neuron hidden-layer state at time t, h_{t-1} is the GRU neuron hidden-layer state at time t-1, and P_{Lt} is the daily load input at time t;
2) In the Encoder architecture, the GRU neuron hidden-layer state h_t at time t is the same as the intermediate state of the Encoder architecture; in the Decoder architecture, the GRU neuron hidden-layer state h_k at time k is the same as the intermediate state of the Decoder architecture:
C_t = h_t,  C_k = h_k    (2)
where C_t is the intermediate state of the Encoder architecture at time t and C_k is the intermediate state of the Decoder architecture at time k;
3) The intermediate state output by the Encoder architecture at time T is the intermediate state C of the input sequence; its value is C_T and it represents the complete information of the input sequence:
C = C_T    (3);
4) The sequence intermediate state C is input into the Decoder architecture, where the initial value C_0 of the Decoder intermediate state equals the sequence intermediate state C. After C_0 is input, the GRU neuron hidden-layer state h_k at time k is obtained; it is jointly determined by the GRU neuron hidden-layer state at time k-1 and the GRU neuron input at time k:
h_k = f(h_{k-1}, x_k)    (4)
where h_{k-1} is the GRU neuron hidden-layer state at time k-1 and x_k is the GRU neuron input at time k;
5) The Decoder architecture output at time k-1 serves as the GRU neuron input at time k:
x_k = U_{Gk-1}    (5)
where U_{Gk-1} is the Decoder architecture output at time k-1;
6) Substituting equation (5) into equation (4), the Decoder architecture performs the operation opposite to that of the Encoder and decodes the input-sequence intermediate state C step by step according to the time steps to form the final output sequence; at time k-1 the Decoder intermediate state C_{k-1} equals h_{k-1}, so the Decoder architecture output at time k is jointly determined by h_{k-1}, U_{Gk-1} and h_k, as described by equation (6):
where U_{Gk} is the Decoder architecture output at time k, p denotes probability, g denotes the softmax function, and f denotes the conversion function;
7) With the GRU neuron input x_k at time k and the Decoder architecture intermediate state C_{k-1} at time k-1 as variables, the update gate z_k, the reset gate r_k and the pending output value h̃_k of the GRU neuron are constructed, as given by equation (7):
where W_r is the weight coefficient between x_k and r_k, W_z is the weight coefficient between x_k and z_k, W_h is the weight coefficient between x_k and h̃_k, and α denotes the sigmoid activation function of the neural network;
8) Combining z_k, r_k and h̃_k yields the GRU neuron hidden-layer output h_k, as given by equation (8):
where h_{k-1} is the GRU neuron hidden-layer output at time k-1;
through the above steps, the Encoder-Decoder composite neural network architecture is constructed.
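Because the formula images for equations (6)-(8) are not reproduced above, the following NumPy sketch of the forward pass in steps 1)-8) assumes the standard GRU update equations (sigmoid gates, tanh pending output) and a softmax output layer over the encoded start-stop codes; it illustrates the architecture rather than reproducing the patent's exact formulas.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """One GRU neuron layer: h_k = f(h_{k-1}, x_k) with update and reset gates."""
    def __init__(self, n_in, n_hid, seed=0):
        rng = np.random.default_rng(seed)
        self.Wz = rng.normal(0.0, 0.1, (n_hid, n_in + n_hid))   # update-gate weights W_z
        self.Wr = rng.normal(0.0, 0.1, (n_hid, n_in + n_hid))   # reset-gate weights W_r
        self.Wh = rng.normal(0.0, 0.1, (n_hid, n_in + n_hid))   # pending-output weights W_h

    def step(self, h_prev, x):
        xc = np.concatenate([h_prev, x])
        z = sigmoid(self.Wz @ xc)                                # update gate z_k
        r = sigmoid(self.Wr @ xc)                                # reset gate r_k
        h_tilde = np.tanh(self.Wh @ np.concatenate([r * h_prev, x]))  # pending output value
        return (1.0 - z) * h_prev + z * h_tilde                  # hidden-layer output h_k

def seq2seq_forward(encoder, decoder, W_out, P_L, n_steps, u_start):
    """Encoder reads the daily load P_L into C; Decoder unrolls C into a start-stop code sequence."""
    h = np.zeros(encoder.Wz.shape[0])
    for p_t in P_L:                                              # equation (1): h_t = f(h_{t-1}, P_Lt)
        h = encoder.step(h, np.atleast_1d(p_t))
    C = h                                                        # equations (2)-(3): C = C_T
    h, u_prev, outputs = C, u_start, []                          # C_0 = C
    for _ in range(n_steps):
        h = decoder.step(h, u_prev)                              # equations (4)-(5): x_k = U_{Gk-1}
        logits = W_out @ h
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                                     # softmax g(.) over start-stop codes
        outputs.append(int(probs.argmax()))
        u_prev = np.eye(len(probs))[probs.argmax()]              # feed the decision back as the next input
    return outputs

# Tiny usage example with made-up dimensions: 24 load points, 8 possible start-stop codes.
n_hid, n_codes = 16, 8
enc, dec = GRUCell(1, n_hid, seed=1), GRUCell(n_codes, n_hid, seed=2)
W_out = np.random.default_rng(3).normal(0.0, 0.1, (n_codes, n_hid))
print(seq2seq_forward(enc, dec, W_out, np.random.rand(24), 24, np.zeros(n_codes)))
```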
In step 3, the daily load data P_L of a typical day and the corresponding unit start-stop scheme U_G are taken as a historical mapping sample. In a historical mapping sample, the relation between the unit start-stop scheme U_G and the daily load P_L is described by U_G = F(p(P_L)), where p represents the probability between the daily load and the corresponding unit start-stop scheme and F represents the conversion function.
A large amount of historical data is accumulated and the Seq2Seq- and GRU-based deep learning model is trained offline, thereby obtaining a mapping model that describes the probability relation between U_G and P_L.
The deep learning model is trained with the Adam algorithm, specifically through the following steps:
1) A loss function based on the mean absolute error (MAE) is constructed for the Encoder-Decoder architecture. Let the output of the Encoder-Decoder architecture at time k be U_{Gok} and the target value be U_{Gdk}; the total error E of a sample during training is given by equation (9):
2) The Adam algorithm is used as the update algorithm for the neuron weights to train each parameter of the GRU neurons in the Encoder-Decoder architecture; its basic formula is given by equation (10):
where θ_k is the parameter variable to be updated at time k, δ is the learning rate, and the bias-corrected gradient weighted average and gradient weighted variance are calculated as shown in equation (11):
3) Substituting equation (11) into equation (10), the learning rate of each parameter is found adaptively with the Adam algorithm, realizing the correction of the three weight coefficients W_r, W_z and W_h of the GRU neurons in the Encoder-Decoder architecture, as shown in equation (12):
Training of the Encoder-Decoder architecture is achieved through equation (12) on the basis of the continual correction of each weight coefficient.
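Because the formula images for equations (9)-(12) are not reproduced above, the following NumPy sketch assumes the standard form of the MAE loss and of the Adam update (with the usual β1, β2 and ε hyper-parameters, which the patent text does not specify); it is an illustration of the training step, not the patent's exact implementation.

```python
import numpy as np

def mae_loss(U_Go, U_Gd):
    """Equation (9): total sample error as the mean absolute error of the Decoder outputs."""
    return float(np.mean(np.abs(np.asarray(U_Go) - np.asarray(U_Gd))))

class Adam:
    """Standard Adam update, used here to correct a weight coefficient such as W_r, W_z or W_h."""
    def __init__(self, delta=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.delta, self.b1, self.b2, self.eps = delta, beta1, beta2, eps
        self.m = self.v = None
        self.k = 0

    def update(self, theta, grad):
        if self.m is None:
            self.m, self.v = np.zeros_like(theta), np.zeros_like(theta)
        self.k += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grad           # gradient weighted average
        self.v = self.b2 * self.v + (1 - self.b2) * grad ** 2      # gradient weighted variance
        m_hat = self.m / (1 - self.b1 ** self.k)                   # bias-corrected average
        v_hat = self.v / (1 - self.b2 ** self.k)                   # bias-corrected variance
        return theta - self.delta * m_hat / (np.sqrt(v_hat) + self.eps)  # parameter update (cf. equation (10))

# Usage sketch: one Adam state per weight matrix (W_r, W_z, W_h); gradients would come from backpropagation.
rng = np.random.default_rng(0)
W_r, opt = rng.normal(0.0, 0.1, (16, 24)), Adam()
for _ in range(3):
    grad_W_r = rng.normal(size=W_r.shape)                          # placeholder gradient for illustration
    W_r = opt.update(W_r, grad_W_r)
print(mae_loss([1.0, 0.0, 1.0], [1.0, 1.0, 1.0]))
```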
The composite neural network architecture oriented to unit commitment decision is constructed based on GRU and Seq2Seq technology, specifically through the following steps:
1) A historical mapping sample (P_L, U_G) is substituted into the Encoder-Decoder architecture. The Encoder architecture reads the daily load sequence P_L step by step according to the time steps; the GRU neuron hidden-layer state at time t is jointly determined by the GRU neuron hidden-layer state at time t-1 and the daily load at time t:
h_t = f(h_{t-1}, P_{Lt})    (1)
where h_t is the GRU neuron hidden-layer state at time t, h_{t-1} is the GRU neuron hidden-layer state at time t-1, and P_{Lt} is the daily load input at time t;
2) In the Encoder architecture, the GRU neuron hidden-layer state h_t at time t is the same as the intermediate state of the Encoder architecture; in the Decoder architecture, the GRU neuron hidden-layer state h_k at time k is the same as the intermediate state of the Decoder architecture:
C_t = h_t,  C_k = h_k    (2)
where C_t is the intermediate state of the Encoder architecture at time t and C_k is the intermediate state of the Decoder architecture at time k;
3) The intermediate state output by the Encoder architecture at time T is the intermediate state C of the input sequence; its value is C_T and it represents the complete information of the input sequence:
C = C_T    (3);
4) The sequence intermediate state C is input into the Decoder architecture, where the initial value C_0 of the Decoder intermediate state equals the sequence intermediate state C. After C_0 is input, the GRU neuron hidden-layer state h_k at time k is obtained; it is jointly determined by the GRU neuron hidden-layer state at time k-1 and the GRU neuron input at time k:
h_k = f(h_{k-1}, x_k)    (4)
where h_{k-1} is the GRU neuron hidden-layer state at time k-1 and x_k is the GRU neuron input at time k;
5) The Decoder architecture output at time k-1 serves as the GRU neuron input at time k:
x_k = U_{Gk-1}    (5)
where U_{Gk-1} is the Decoder architecture output at time k-1;
6) Substituting equation (5) into equation (4), the Decoder architecture performs the operation opposite to that of the Encoder and decodes the input-sequence intermediate state C step by step according to the time steps to form the final output sequence; at time k-1 the Decoder intermediate state C_{k-1} equals h_{k-1}, so the Decoder architecture output at time k is jointly determined by h_{k-1}, U_{Gk-1} and h_k, as described by equation (6):
where U_{Gk} is the Decoder architecture output at time k, p denotes probability, g denotes the softmax function, and f denotes the conversion function;
7) With the GRU neuron input x_k at time k and the Decoder architecture intermediate state C_{k-1} at time k-1 as variables, the update gate z_k, the reset gate r_k and the pending output value h̃_k of the GRU neuron are constructed, as given by equation (7):
where W_r is the weight coefficient between x_k and r_k, W_z is the weight coefficient between x_k and z_k, W_h is the weight coefficient between x_k and h̃_k, and α denotes the sigmoid activation function of the neural network;
8) Combining z_k, r_k and h̃_k yields the GRU neuron hidden-layer output h_k, as given by equation (8):
where h_{k-1} is the GRU neuron hidden-layer output at time k-1;
through the above steps, the Encoder-Decoder composite neural network architecture is constructed.
A method for training a deep learning model of a power system trains the deep learning model with the Adam algorithm, specifically through the following steps:
1) A loss function based on the mean absolute error (MAE) is constructed for the Encoder-Decoder architecture. Let the output of the Encoder-Decoder architecture at time k be U_{Gok} and the target value be U_{Gdk}; the total error E of a sample during training is given by equation (9):
2) The Adam algorithm is used as the update algorithm for the neuron weights to train each parameter of the GRU neurons in the Encoder-Decoder architecture; its basic formula is given by equation (10):
where θ_k is the parameter variable to be updated at time k, δ is the learning rate, and the bias-corrected gradient weighted average and gradient weighted variance are calculated as shown in equation (11):
3) Substituting equation (11) into equation (10), the learning rate of each parameter is found adaptively with the Adam algorithm, realizing the correction of the three weight coefficients W_r, W_z and W_h of the GRU neurons in the Encoder-Decoder architecture, as shown in equation (12):
Training of the Encoder-Decoder architecture is achieved through equation (12) on the basis of the continual correction of each weight coefficient.
Compared with the prior art, the data-driven unit commitment decision method provided by the invention has the following advantages and beneficial effects:
1) The invention constructs a GRU-based unit commitment decision deep learning model whose training efficiency is higher than that of the LSTM model used in the existing literature;
2) The invention introduces the Seq2Seq technique on the basis of the GRU and proposes an Encoder-Decoder composite neural network architecture for unit commitment decision. Compared with the clustering of samples in the prior art, the proposed method needs no clustering preprocessing of the sample data: training on all differentiated sample data can be completed directly with a single deep learning model, so training and decision efficiency are higher;
3) The invention proposes a sample encoding technique for the high-dimensional unit commitment sample matrix, which effectively compresses the dimension of the unit commitment sample data and further improves the training efficiency of the unit commitment deep learning model.
Drawings
FIG. 1 is the framework of the data-driven unit commitment decision method based on the composite neural network architecture.
FIG. 2 is a schematic diagram of the sample encoding technique.
FIG. 3 is the mapping model between the daily load and the unit start-stop scheme.
FIG. 4 is the Encoder-Decoder composite neural network architecture.
FIG. 5 is a diagram of the internal structure of a GRU neuron.
FIG. 6 shows the training error curves of the GRU model and of the model combining the Seq2Seq technique with the GRU model.
Detailed Description
As shown in FIG. 1, the data-driven unit commitment intelligent decision method based on GRU and Seq2Seq technology comprises the following steps:
1. for the high-dimensional unit commitment training sample matrix, compressing the dimension of the historical unit commitment decision data using a sample encoding technique;
2. introducing the Seq2Seq technique on the basis of the gated recurrent unit and establishing a composite neural network architecture oriented to unit commitment decision;
3. on this basis, building a unit commitment deep learning model and training a mapping model between the system daily load and the unit start-stop scheme from historical data;
4. making unit commitment decisions with the generated mapping model, obtaining the unit start-stop states and unit outputs under the optimal power flow model, and taking the resulting unit commitment decision results as new historical sample data with which the deep learning model is trained again, thereby realizing continuous correction of the model.
As shown in FIG. 3, the daily load data P_L of a typical day and the corresponding unit start-stop scheme U_G are taken as a historical mapping sample. In a mapping sample, the relation between the unit start-stop scheme U_G and the daily load P_L can be described by U_G = F(p(P_L)).
Here p represents the probability between the daily load and the corresponding unit start-stop scheme, and F represents the conversion function. For the daily load P_L, a large amount of historical data is accumulated and the Seq2Seq- and GRU-based deep learning model is trained offline, thereby obtaining a mapping model that describes the probability relation between U_G and P_L.
As shown in FIG. 2, the unit start-stop state vector of each period is encoded so that identical start-stop states receive identical codes. The main purpose of sample encoding is to convert the unit start-stop state vector of each period into its corresponding decimal code, thereby compressing the dimension of the sample matrix and ultimately improving the training efficiency of the deep learning model.
The Encoder-Decoder composite neural network architecture is constructed based on GRU and Seq2Seq techniques; the specific architecture is shown in FIG. 4.
The structure of the GRU neurons is shown in FIG. 5.
For the specific construction, a historical mapping sample (P_L, U_G) is substituted into the Encoder-Decoder architecture. The Encoder architecture reads the daily load sequence P_L step by step according to the time steps; the GRU neuron hidden-layer state at time t is jointly determined by the GRU neuron hidden-layer state at time t-1 and the daily load at time t:
h_t = f(h_{t-1}, P_{Lt})    (1)
where h_t is the GRU neuron hidden-layer state at time t, h_{t-1} is the GRU neuron hidden-layer state at time t-1, and P_{Lt} is the daily load input at time t.
According to the characteristics of the GRU model, in the Encoder architecture the GRU neuron hidden-layer state h_t at time t is the same as the intermediate state of the Encoder architecture, and in the Decoder architecture the GRU neuron hidden-layer state h_k at time k is the same as the intermediate state of the Decoder architecture:
C_t = h_t,  C_k = h_k    (2)
where C_t is the intermediate state of the Encoder architecture at time t and C_k is the intermediate state of the Decoder architecture at time k.
According to the characteristics of the Encoder architecture, the intermediate state output by the Encoder architecture at time T is the intermediate state C of the input sequence; its value is C_T and it represents the complete information of the input sequence:
C = C_T    (3)
inputting the sequence intermediate state C into a Decoder architecture, wherein the initial value C of the Decoder intermediate state 0 The same as the intermediate state C of the sequence. C is C 0 After input, the hidden layer state h of GRU neuron at k moment can be obtained k The hidden layer state of the GRU neuron at the moment k-1 and the input of the GRU neuron at the moment k are determined together, and the specific formula is as follows:
h k =f(h k-1 ,x k ) (4)
wherein: h is a k-1 The hidden layer state of GRU neurons at the moment k-1 is represented; x is x k The GRU neuron input at time k is represented.
According to the Decoder architecture characteristics, the k-1 time Decoder architecture output is used as the input of the k time GRU neuron, and the following specific formula is as follows:
x k =U Gk-1 (5)
wherein: u (U) Gk-1 Representing the Decoder architecture output at time k-1.
Substituting equation (5) into equation (4), the Decoder architecture performs the operation opposite to that of the Encoder and decodes the input-sequence intermediate state C step by step according to the time steps to form the final output sequence; at time k-1 the Decoder intermediate state C_{k-1} equals h_{k-1}. The Decoder architecture output at time k is therefore jointly determined by h_{k-1}, U_{Gk-1} and h_k, as described by equation (6):
where U_{Gk} is the Decoder architecture output at time k, p denotes probability, g denotes the softmax function, and f denotes the conversion function.
As can be seen from FIG. 4, with the GRU neuron input x_k at time k and the Decoder architecture intermediate state C_{k-1} at time k-1 as variables, the update gate z_k, the reset gate r_k and the pending output value h̃_k of the GRU neuron are constructed, as given by equation (7):
where W_r is the weight coefficient between x_k and r_k, W_z is the weight coefficient between x_k and z_k, W_h is the weight coefficient between x_k and h̃_k, and α denotes the sigmoid activation function of the neural network.
Combining z_k, r_k and h̃_k yields the GRU neuron hidden-layer output h_k, as given by equation (8):
where h_{k-1} is the GRU neuron hidden-layer output at time k-1.
In step 4, the deep learning model is trained with the Adam algorithm.
A loss function based on the mean absolute error (MAE) is constructed for the Encoder-Decoder architecture. Let the output of the Encoder-Decoder architecture at time k be U_{Gok} and the target value be U_{Gdk}; the total error E of a sample during training is given by equation (9).
The Adam algorithm is used as the update algorithm for the neuron weights to train each parameter of the GRU neurons in the Encoder-Decoder architecture; its basic formula is given by equation (10),
where θ_k is the parameter variable to be updated at time k, δ is the learning rate, and the bias-corrected gradient weighted average and gradient weighted variance are calculated as shown in equation (11).
Substituting equation (11) into equation (10), the learning rate of each parameter is found adaptively with the Adam algorithm, realizing the correction of the three weight coefficients W_r, W_z and W_h of the GRU neurons in the Encoder-Decoder architecture, as shown in equation (12).
Training of the Encoder-Decoder architecture is achieved through equation (12) on the basis of the continual correction of each weight coefficient.
Examples:
To verify the validity and correctness of the invention, simulation tests were carried out based on the IEEE 118-bus standard test case and actual data from the Hunan power grid. In the IEEE 118-bus case, load samples for 93 typical days applicable to the IEEE 118-bus system were constructed based on the daily load characteristic curves of the Hunan power grid. Among these daily load samples, samples No. 1-90 are used as training samples and samples No. 91-93 as test samples. For the convenience of subsequent calculation and analysis, samples No. 1-90 can be clustered into three clustered sample sets with the method of the paper "Research on a data-driven unit commitment intelligent decision method with self-learning capability": samples No. 1-30 form clustered sample set 1, to which test sample No. 91 also belongs; samples No. 31-60 form clustered sample set 2, to which test sample No. 92 also belongs; and samples No. 61-90 form clustered sample set 3, to which test sample No. 93 also belongs.
All unit commitment deep learning models of the invention were trained and tested on the TensorFlow 1.6.0 platform. The related simulation calculations were all carried out on a computer with an Intel Core i5-4460 processor (3.20 GHz) and 8 GB of memory.
To verify the correctness of the proposed method, four methods are compared: method 1 is a unit commitment decision method based on the LSTM model; method 2 is a unit commitment decision method based on the GRU model; method 3 is a unit commitment decision method based on the Seq2Seq technique and the GRU model; and method 4 is the unit commitment decision method based on the Seq2Seq technique and the GRU model with the sample encoding technique integrated.
1) Process simulation and correctness verification of the proposed method
First, samples No. 1-90 were trained with the method of the invention, and the mapping model obtained from training was used to make unit commitment decisions for test samples No. 91-93. The resulting unit start-stop schemes were compared with those of the physical-model-driven unit commitment decision method in the paper "Network-Constrained AC Unit Commitment Under Uncertainty: A Benders' Decomposition Approach". The solution results for test sample No. 91 are shown in Table 1.
Table 1 Unit start-stop schemes for test sample No. 91 obtained by the method of the invention and by the method of "Network-Constrained AC Unit Commitment Under Uncertainty: A Benders' Decomposition Approach"
As can be seen from Table 1, the unit start-stop scheme obtained by the method of the invention is the same as that obtained by the method of "Network-Constrained AC Unit Commitment Under Uncertainty: A Benders' Decomposition Approach", which shows that the invention can fully learn the mapping relation between the daily load and the unit start-stop scheme, and that the trained mapping model can make correct start-stop decisions for arbitrary input daily load data.
For test samples No. 91-93, the invention solves the optimal power flow model on the basis of the obtained start-stop scheme to obtain the unit commitment decision result. The unit commitment decision results of the method of the invention and of the method of "Network-Constrained AC Unit Commitment Under Uncertainty: A Benders' Decomposition Approach" are compared, and the total costs are shown in Table 2.
Table 2 Comparison of the total costs of the method of the invention and of the method of "Network-Constrained AC Unit Commitment Under Uncertainty: A Benders' Decomposition Approach"
As can be seen from Table 2, the unit output scheme and the total cost obtained by the method of the invention are the same as those of "Network-Constrained AC Unit Commitment Under Uncertainty: A Benders' Decomposition Approach". This shows that, after the unit start-stop scheme is obtained, solving the optimal power flow model yields the same unit output scheme as the cited method.
This is because the unit start-stop scheme U_G is a decision variable in both the method of the invention and the method of "Network-Constrained AC Unit Commitment Under Uncertainty: A Benders' Decomposition Approach", and the same optimal power flow model is used when solving for the unit output scheme. Therefore, for the same unit start-stop scheme U_G, both methods yield the same unit output scheme P_G. The self-learning and self-evolution ability of the data-driven unit commitment intelligent decision method and its applicability to different types of unit commitment problems have already been verified in the paper "Research on a data-driven unit commitment intelligent decision method with self-learning capability", and are not repeated here. The decision accuracy in the subsequent examples always denotes the agreement between the unit start-stop scheme obtained by the method of the invention and that obtained by the method of "Network-Constrained AC Unit Commitment Under Uncertainty: A Benders' Decomposition Approach".
2) Verification of the effectiveness of introducing the GRU model
To verify the effectiveness of introducing the GRU model, methods 1 and 2 were trained with the clustered (preprocessed) training samples, the number of training iterations was set to 500, and the two methods were then used to solve test samples No. 91-93. The specific results are shown in Table 3.
Table 3 Comparison of the decision accuracy and training time of methods 1 and 2
As can be seen from Table 3, in terms of decision accuracy, method 2 reached 100% on all three test samples, whereas method 1 was below 100% on sample No. 93 and its total cost was higher than that of method 2. This shows that after 500 training iterations method 2 is already able to generate an accurate mapping model for every clustered training sample set, while method 1 cannot yet generate an accurate mapping model for clustered sample set 3 and needs more training iterations. In terms of training time, the training time of method 2 for the three clustered sample sets is 77 s, 91 s and 82 s shorter, respectively, than that of method 1, i.e. method 2 needs less training time for the same number of iterations.
The main reason for this is that the GRU merges the input gate and the forget gate, recombining them into the update gate and the reset gate, and simplifies the memory cell of the LSTM so that the output can be computed directly; the overall structure of the model is therefore simpler, and training and decision accuracy are higher under the same training parameters. It is thus correct and efficient to construct the unit commitment decision deep learning model with the GRU instead of the LSTM.
3) Verification of the effectiveness of introducing the Seq2Seq technique
To verify the effectiveness of introducing the Seq2Seq technique, clustered (preprocessed) samples and non-clustered samples were used as the training samples of method 2 and method 3, respectively, and the two methods were used to solve test samples No. 91-93. The specific results are shown in Table 4.
Table 4 Comparison of the decision accuracy and training time of methods 2 and 3
As can be seen from Table 4, in terms of decision accuracy, if method 2 is trained directly without clustering preprocessing of the unit commitment training samples, its decision accuracy after training is generally below 90%, whereas training method 2 with clustered samples raises the decision accuracy to 100%. Method 3 is different: even when trained with non-clustered samples it still reaches 100% decision accuracy. This shows that, after the Seq2Seq technique is introduced, a single deep learning model can complete the training of all differentiated sample data. The reason is that if a single deep learning model is built directly with the GRU, a single compromise mapping model is unavoidable in the face of training sample data with large differences, so the accuracy of online decisions is hard to guarantee; to ensure online decision accuracy, the training samples then have to be clustered first and each class trained with its own deep learning model. If instead the Seq2Seq technique is introduced to construct a GRU-based Encoder-Decoder composite neural network architecture, the intermediate state C fully preserves the class information of the input and output sequences and the probabilities relating them, so accurate training of all differentiated samples can be achieved with a single deep learning model.
In terms of training time, training method 2 with non-clustered samples requires the least total time. Training method 3 with non-clustered samples takes 110.02 s more than the former, but 179.23 s less than training method 2 with clustered samples. The reason is that when method 2 is trained with non-clustered samples it cannot converge to the most accurate mapping model and the training process terminates prematurely, so the total training time is short; when method 2 is trained with clustered samples, training has to be completed on several deep learning models and the clustering preprocessing itself also takes time, so the total training time in that case is the longest.
To further analyse the cause of the time difference between methods 2 and 3, the actual convergence curves of the two methods when trained with non-clustered samples are shown in FIG. 6.
As can be seen from FIG. 6, the total training error of method 2 essentially converges to about 0.09 after more than 100 training iterations and cannot be reduced further, so its training process ends early, whereas the error of method 3 finally converges to about 0.0002 after more than 100 training iterations. Method 3 therefore needs more time to train the non-clustered samples because it performs more training iterations, while method 2, although its training time is shorter, cannot guarantee the training accuracy of the model.
In summary, if method 2 is trained directly with training samples that have not been clustered, the decision accuracy of the model is hard to guarantee. Introducing the clustering preprocessing strategy solves the training-accuracy problem of method 2 in the face of differentiated training samples, but greatly increases the complexity of its offline training and hence the required offline training time and total time. Method 3, by introducing the Seq2Seq technique, achieves accurate training of differentiated samples with only a single deep learning model; the training process is simpler, and the training and decision efficiency of the deep learning model is improved while the training accuracy is preserved.
4) Verification of the effectiveness of introducing the sample encoding technique
To verify the effect of the proposed sample encoding technique on the training efficiency of the deep learning model, methods 3 and 4 were trained with non-clustered training samples; the training and test results are shown in Table 5.
Table 5 Comparison of the decision accuracy and training time of methods 3 and 4
As can be seen from Table 5, after training, methods 3 and 4 have the same decision accuracy, reaching 100% on all three test samples, but the training time of method 4 is 351 s shorter than that of method 3. The sample encoding technique proposed by the invention directly compresses the data dimension of the training samples, reducing the unit start-stop state matrix of one training sample from 24 x 54 dimensions to 24 x 1 dimensions, which directly reduces the number of variables to be computed during training of the deep learning model and thus its training time. Therefore, although the sample encoding process itself takes a certain time, the method effectively reduces the overall training time of the deep learning model.
In summary, the sample encoding technique proposed by the invention effectively compresses the data dimension of the unit commitment training samples and directly reduces the number of variables to be computed during training, so the training time of the deep learning model is effectively reduced while its training accuracy is preserved; compared with the LSTM neural network adopted in the paper "Research on a data-driven unit commitment intelligent decision method with self-learning capability", the GRU model introduced by the invention achieves higher training and decision accuracy with the same training parameters; and by introducing the Seq2Seq technique to construct an Encoder-Decoder composite neural network architecture with GRU neurons, the invention achieves accurate training of differentiated samples with only a single deep learning model, the training process is simpler, and training and decision efficiency are improved while training accuracy is guaranteed.
Claims (4)
1. A method for training a deep learning model of a power system, characterized in that the deep learning model is trained with the Adam algorithm, specifically through the following steps:
step 1: constructing a loss function based on the mean absolute error (MAE) for the Encoder-Decoder architecture; letting the output of the Encoder-Decoder architecture at time k be U_{Gok} and the target value be U_{Gdk}, the total error E of a sample during training is given by equation (9):
step 2: using the Adam algorithm as the update algorithm for the neuron weights to train each parameter of the GRU neurons in the Encoder-Decoder architecture, its basic formula being given by equation (10):
where θ_k is the parameter variable to be updated at time k, δ is the learning rate, and the bias-corrected gradient weighted average and gradient weighted variance are calculated as shown in equation (11):
step 3: substituting equation (11) into equation (10) and adaptively finding the learning rate of each parameter with the Adam algorithm, so as to realize the correction of the three weight coefficients W_r, W_z and W_h of the GRU neurons in the Encoder-Decoder architecture, as shown in equation (12):
training of the Encoder-Decoder architecture being achieved through equation (12) on the basis of the continual correction of each weight coefficient.
2. The method according to claim 1, characterized in that the deep learning model is obtained on the basis of a composite neural network architecture oriented to unit commitment decision, which is an Encoder-Decoder composite neural network architecture constructed based on the GRU and Seq2Seq techniques.
3. The method according to claim 1, characterized in that the composite neural network architecture oriented to unit commitment decision is constructed through the following steps:
1) a historical mapping sample (P_L, U_G) is substituted into the Encoder-Decoder architecture; the Encoder architecture reads the daily load sequence P_L step by step according to the time steps, and the GRU neuron hidden-layer state at time t is jointly determined by the GRU neuron hidden-layer state at time t-1 and the daily load at time t:
h_t = f(h_{t-1}, P_{Lt})    (1)
where h_t is the GRU neuron hidden-layer state at time t, h_{t-1} is the GRU neuron hidden-layer state at time t-1, and P_{Lt} is the daily load input at time t;
2) in the Encoder architecture, the GRU neuron hidden-layer state h_t at time t is the same as the intermediate state of the Encoder architecture; in the Decoder architecture, the GRU neuron hidden-layer state h_k at time k is the same as the intermediate state of the Decoder architecture:
C_t = h_t,  C_k = h_k    (2)
where C_t is the intermediate state of the Encoder architecture at time t and C_k is the intermediate state of the Decoder architecture at time k;
3) the intermediate state output by the Encoder architecture at time T is the intermediate state C of the input sequence; its value is C_T and it represents the complete information of the input sequence:
C = C_T    (3)
4) The sequence intermediate state C is input into the Decoder architecture, where the initial value C_0 of the Decoder intermediate state equals the sequence intermediate state C; after C_0 is input, the GRU neuron hidden layer state h_k at time k can be obtained, jointly determined by the GRU neuron hidden layer state at time k-1 and the GRU neuron input at time k, the specific formula being as follows:
h_k = f(h_{k-1}, x_k)    (4)
wherein: h_{k-1} represents the hidden layer state of the GRU neuron at time k-1; x_k represents the GRU neuron input at time k;
5) The Decoder architecture output at time k-1 is used as the GRU neuron input at time k, specifically as follows:
x_k = U_{Gk-1}    (5)
wherein: U_{Gk-1} represents the Decoder architecture output at time k-1;
6) Formula (5) is substituted into formula (4); at the same time, the Decoder architecture performs the operation opposite to that of the Encoder and decodes the input sequence intermediate state C step by step according to the time step to form the final output sequence, where the Decoder architecture intermediate state C_{k-1} at time k-1 equals h_{k-1}, and the Decoder architecture output at time k is jointly determined by h_{k-1}, U_{Gk-1} and h_k, specifically as follows:
wherein: U_{Gk} represents the Decoder architecture output at time k; p represents probability; g represents a softmax function; f represents a conversion function;
7) Taking the GRU neuron input x_k at time k and the Decoder architecture intermediate state C_{k-1} at time k-1 as variables, the update gate z_k, the reset gate r_k and the pending output value h̃_k in the GRU neuron are constructed, the specific models of the three being as follows:
wherein: W_r represents the weight coefficient between x_k and r_k; W_z represents the weight coefficient between x_k and z_k; W_h represents the weight coefficient between x_k and h̃_k; α represents the sigmoid activation function in the neural network;
8) z_k, r_k and h̃_k are combined to obtain the GRU neuron hidden layer output h_k, the specific formula being as follows:
wherein: h_{k-1} represents the GRU neuron hidden layer output at time k-1.
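Formulas (2) and (6)–(8) referenced in claim 3 are not reproduced in this text. A hedged sketch of the standard GRU and Seq2Seq forms that match the stated variable definitions is given below, writing [·,·] for concatenation, ⊙ for the element-wise product, α for the sigmoid activation and h̃_k for the pending output value; the concatenation order and the omission of bias terms are assumptions.

```latex
% Assumed standard forms matching steps 2) and 6)-8) (not verbatim from the patent)
\begin{align*}
  C_t &= h_t, \qquad C_k = h_k
      && \text{cf. formula (2)}\\
  z_k &= \alpha\bigl(W_z\,[\,C_{k-1},\,x_k\,]\bigr), \qquad
  r_k  = \alpha\bigl(W_r\,[\,C_{k-1},\,x_k\,]\bigr)
      && \text{update and reset gates, cf. formula (7)}\\
  \tilde{h}_k &= \tanh\bigl(W_h\,[\,r_k\odot C_{k-1},\,x_k\,]\bigr)
      && \text{pending output value, cf. formula (7)}\\
  h_k &= (1-z_k)\odot h_{k-1} + z_k\odot\tilde{h}_k
      && \text{hidden layer output, cf. formula (8)}\\
  p\bigl(U_{Gk}\mid U_{G1},\dots,U_{Gk-1},C\bigr) &= g\bigl(h_{k-1},\,U_{Gk-1},\,h_k\bigr)
      && \text{Decoder output, cf. formula (6)}
\end{align*}
```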
4. A composite neural network architecture for unit combination decision-making, characterized by being an Encoder-Decoder composite neural network architecture constructed based on the GRU and Seq2Seq technologies, the construction of which specifically comprises the following steps:
step S1: a history map sample (P L ,U G ) Substituting into the Encoder-Decode architecture, the Encoder architecture loads the sequence of daily loads P L The GRU neuron hidden layer state at the time t is jointly determined by the GRU neuron hidden layer state at the time t-1 and the daily load at the time t by steps, and the specific formula is as follows:
h t =f(h t-1 ,P Lt ) (1)
wherein: h is a t The hidden layer state of the GRU neuron at the moment t is represented; h is a t-1 The hidden layer state of GRU neurons at the time t-1 is represented; p (P) Lt The daily load input at the time t is represented;
Step S2: the GRU neuron hidden layer state h_t at time t in the Encoder architecture is made identical to the intermediate state of the Encoder architecture, and in the Decoder architecture the GRU neuron hidden layer state h_k at time k is made identical to the intermediate state of the Decoder architecture, the specific formula being as follows:
wherein: C_t represents the intermediate state of the Encoder architecture at time t; C_k represents the intermediate state of the Decoder architecture at time k;
step S3: the intermediate state of the output of the Encoder framework at the moment T is the intermediate state C of the input sequence, and the value is C T Representing the complete information of the input sequence, specifically the following formula:
C=C T (3)
step S4: inputting the sequence intermediate state C into a Decoder architecture, wherein the initial value C of the Decoder intermediate state 0 Like the intermediate state C of the sequence, will C 0 After input, the hidden layer state h of GRU neuron at k moment can be obtained k The hidden layer state of the GRU neuron at the moment k-1 and the input of the GRU neuron at the moment k are determined together, and the specific formula is as follows:
h k =f(h k-1 ,x k ) (4)
wherein: h is a k-1 The hidden layer state of GRU neurons at the moment k-1 is represented; x is x k A GRU neuron input at time k is represented;
step S5: the k-1 time Decoder architecture output will be used as the k time GRU neuron input, specifically as follows:
x k =U Gk-1 (5)
wherein: u (U) Gk-1 The Decoder architecture output at time k-1 is represented;
step S6: substituting the formula (5) into the formula (4), and simultaneously, executing the operation opposite to the operation of the Encoder by the Decoder framework, and performing step-by-step decoding on the input sequence intermediate state C according to the time step to form a final output sequence, wherein the time of k-1 is the intermediate state C of the Decoder framework k-1 And h k-1 Equal, the Decoder architecture output at time k is defined by h k-1 、U Gk-1 H k The common decision is specifically described as follows:
wherein: u (U) Gk The Decoder architecture output at the time k is represented; p represents probability; g represents a softmax function; f represents a conversion function;
step S7: input x with GRU neuron at time k k And k-1 time Decoder architecture intermediate state C k-1 Construction of update gate z in GRU neurons for variables k Reset gate r k Pending output valueThe concrete model of the three is as follows:
wherein: w (W) r Represents x k And r k Weight coefficient between; w (W) z Represents x k And z k Weight coefficient between; w (W) h Represents x k Andweight coefficient between; alpha represents an activation function sigmoid in the neural network;
step S8: will z k 、r k and The three are combined to obtain the output h of the hidden layer of the GRU neuron k The specific formula is as follows:
wherein: h is a k-1 The GRU neuron hidden layer output at the moment k-1 is represented;
through the steps, an Encoder-Decode composite neural network architecture is constructed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310033698.6A CN116306864A (en) | 2019-09-16 | 2019-09-16 | Method for training deep learning model of power system |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910872454.0A CN110674459B (en) | 2019-09-16 | 2019-09-16 | Data driving type unit combination intelligent decision-making method based on GRU and Seq2Seq technology |
CN202310033698.6A CN116306864A (en) | 2019-09-16 | 2019-09-16 | Method for training deep learning model of power system |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910872454.0A Division CN110674459B (en) | 2019-09-16 | 2019-09-16 | Data driving type unit combination intelligent decision-making method based on GRU and Seq2Seq technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116306864A true CN116306864A (en) | 2023-06-23 |
Family
ID=69077953
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310033698.6A Pending CN116306864A (en) | 2019-09-16 | 2019-09-16 | Method for training deep learning model of power system |
CN201910872454.0A Active CN110674459B (en) | 2019-09-16 | 2019-09-16 | Data driving type unit combination intelligent decision-making method based on GRU and Seq2Seq technology |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910872454.0A Active CN110674459B (en) | 2019-09-16 | 2019-09-16 | Data driving type unit combination intelligent decision-making method based on GRU and Seq2Seq technology |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN116306864A (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111426933A (en) * | 2020-05-19 | 2020-07-17 | 浙江巨磁智能技术有限公司 | Safety type power electronic module and safety detection method thereof |
CN113393119B (en) * | 2021-06-11 | 2022-08-30 | 河海大学 | Stepped hydropower short-term scheduling decision method based on scene reduction-deep learning |
CN113420508B (en) * | 2021-07-07 | 2024-02-27 | 华北电力大学 | Unit combination calculation method based on LSTM |
CN113408648B (en) * | 2021-07-07 | 2024-08-23 | 华北电力大学 | Unit combination calculation method combined with deep learning |
CN117291109B (en) * | 2023-11-24 | 2024-04-09 | 中汽研汽车检验中心(广州)有限公司 | Modelica fluid model intelligent prediction method |
CN117439146B (en) * | 2023-12-06 | 2024-03-19 | 广东车卫士信息科技有限公司 | Data analysis control method and system for charging pile |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11009536B2 (en) * | 2016-10-05 | 2021-05-18 | Telecom Italia S.P.A. | Method and system for estimating energy generation based on solar irradiance forecasting |
CN110070224A (en) * | 2019-04-20 | 2019-07-30 | 北京工业大学 | A kind of Air Quality Forecast method based on multi-step recursive prediction |
- 2019-09-16 CN CN202310033698.6A patent/CN116306864A/en active Pending
- 2019-09-16 CN CN201910872454.0A patent/CN110674459B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN110674459A (en) | 2020-01-10 |
CN110674459B (en) | 2023-03-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116306864A (en) | Method for training deep learning model of power system | |
US11409270B1 (en) | Optimization decision-making method of industrial process fusing domain knowledge and multi-source data | |
CN111832825B (en) | Wind power prediction method and system integrating long-term memory network and extreme learning machine | |
CN111159638A (en) | Power distribution network load missing data recovery method based on approximate low-rank matrix completion | |
CN111191856A (en) | Regional comprehensive energy system multi-energy load prediction method considering time sequence dynamic characteristics and coupling characteristics | |
CN113988449A (en) | Wind power prediction method based on Transformer model | |
CN115481788B (en) | Phase change energy storage system load prediction method and system | |
CN116957698A (en) | Electricity price prediction method based on improved time sequence mode attention mechanism | |
CN111198550A (en) | Cloud intelligent production optimization scheduling on-line decision method and system based on case reasoning | |
CN106453294A (en) | Security situation prediction method based on niche technology with fuzzy elimination mechanism | |
CN115409258A (en) | Hybrid deep learning short-term irradiance prediction method | |
CN116384572A (en) | Sequence-to-sequence power load prediction method based on multidimensional gating circulating unit | |
CN114817773A (en) | Time sequence prediction system and method based on multi-stage decomposition and fusion | |
CN108694480A (en) | Finance data prediction technique based on improved length memory network in short-term | |
CN117709540A (en) | Short-term bus load prediction method and system for identifying abnormal weather | |
CN116843057A (en) | Wind power ultra-short-term prediction method based on LSTM-ViT | |
Guo et al. | Short-term EV charging load forecasting based on GA-GRU model | |
CN113762591B (en) | Short-term electric quantity prediction method and system based on GRU and multi-core SVM countermeasure learning | |
CN113112089A (en) | Power consumption prediction method and prediction system for cement raw material grinding system | |
Cai et al. | Short-term forecasting of user power load in China based on XGBoost | |
CN110674460B (en) | E-Seq2Seq technology-based data driving type unit combination intelligent decision method | |
CN117556949A (en) | Traffic prediction method based on continuous evolution graph nerve controlled differential equation | |
CN116613740A (en) | Intelligent load prediction method based on transform and TCN combined model | |
Li et al. | A Novel Short-term Load Forecasting Model by TCN-LSTM Structure with Attention Mechanism | |
CN110909254A (en) | Method and system for predicting question popularity of question-answering community based on deep learning model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||