CN110674459A - GRU and Seq2Seq technology-based data-driven unit combination intelligent decision-making method

Info

Publication number: CN110674459A
Application number: CN201910872454.0A
Authority: CN (China)
Other languages: Chinese (zh)
Other versions: CN110674459B (granted)
Prior art keywords: gru, formula, time, architecture, encoder
Legal status: Granted; Active
Inventors: 杨楠, 邓逸天, 叶迪, 贾俊杰, 黄悦华, 邾玢鑫, 李振华, 张涛, 刘颂凯, 张磊, 王灿
Current and original assignee: China Three Gorges University CTGU
Application filed by China Three Gorges University CTGU
Priority applications: CN201910872454.0A (CN110674459B); CN202310033698.6A (CN116306864A)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0637 Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications


Abstract

A unit combination intelligent decision method based on GRU and Seq2Seq technology comprises the following steps: 1. for the high-dimensional unit combination training sample matrix, compressing the dimensionality of the historical unit combination decision data with a sample encoding technique; 2. introducing the Seq2Seq technique on the basis of a gated recurrent unit (GRU) network, and establishing a composite neural network architecture oriented to unit combination decisions; 3. on this basis, building a unit combination deep learning model and training, from historical data, a mapping model between the system daily load and the unit start-stop scheme; 4. making unit combination decisions with the generated mapping model, solving the unit outputs under the obtained start-stop states with the optimal power flow model, and taking the obtained unit combination decision results as new historical sample data for training the deep learning model, thereby continuously correcting the model. The invention aims to solve the technical problem of low training efficiency when a Seq2Seq-based deep learning model is trained on differentiated sample data, which arises because the unit start-stop state matrix and the output state matrix of an actual power system are high-dimensional sample matrices.

Description

GRU and Seq2Seq technology-based data-driven unit combination intelligent decision-making method
Technical Field
The invention belongs to the field of power systems and automation, and particularly relates to a unit combination decision method based on a deep learning intelligent algorithm.
Background
In an open and mature power market, the independent system operator (ISO) must maintain safe and reliable operation, and the security-constrained unit commitment (SCUC) problem is an important theoretical basis for ISO decision making. In recent years, power markets have developed rapidly worldwide. On the one hand, the ISO needs powerful computational tools to maintain market operation and to formulate intelligent, fine-grained day-ahead generation plans; on the other hand, with the wide application of new energy technologies such as electric vehicles, intermittent energy sources and demand-side management, the challenges facing ISO decision making emerge endlessly. Research on SCUC decision theory with high adaptability and high precision therefore has important theoretical and engineering significance for the development and marketization of the power industry.
According to the factors considered, current unit combination research can be roughly divided into multi-objective unit combination, uncertain unit combination, unit combination considering diversified constraint conditions and decision variables, and so on. Although current research on the unit combination problem has different emphases, it generally starts from the actual engineering problem, first proposes a mathematical model on the premise of mechanism study, and then develops a corresponding mathematical method to solve it. Because this research idea rests on strict logical derivation and mechanism study and is driven by mathematical models and algorithms, it can be called a physical-model-driven unit combination decision method. Since the model itself must be modified and reconstructed whenever newly emerging problems are faced, this research idea may lack applicability against the background of an energy revolution that changes with each passing day and theoretical challenges that emerge endlessly.
In engineering practice, once a unit combination decision method is put into use, a large amount of structured historical data accumulates, and over the long term unit combination decisions show a certain repeatability. A data-driven unit combination decision method could therefore dispense with studying the inherent mechanism: based on deep learning, it would train on massive historical decision data, directly construct the mapping relation between the known input quantities and the decision results, and continuously correct the model as historical data accumulate, thereby endowing unit combination decision making with self-evolution and self-learning capabilities. Such a data-driven decision method can greatly simplify the process and complexity of modeling and solving the unit combination problem, and can also cope with continuously emerging theoretical problems and challenges through self-learning; research in this direction, however, is still rare. Addressing the time-series characteristics of unit combination sample data, the paper "Research on a data-driven unit combination intelligent decision method with self-learning ability" first adopted a recurrent neural network, the Long Short-Term Memory (LSTM) network, as the core training tool, successfully constructed a data-driven unit combination decision model, and verified the self-evolution characteristics of the method and its adaptability to different unit combination problems. However, that method still has the following problems:
1) Because the LSTM model is overly complex, processing high-dimensional training samples with it not only requires a large amount of computing resources but is also prone to overfitting. By contrast, the Gated Recurrent Unit (GRU), a recent improvement among recurrent neural networks, merges the input gate and the forget gate of the LSTM and simplifies the memory unit, which effectively reduces model complexity;
2) Power system historical data differ enormously (for example, load characteristics differ greatly across seasons). If a single recurrent neural network architecture is trained offline directly, it inevitably produces a single compromise mapping model when faced with hugely different historical samples, making online decision accuracy hard to guarantee. That paper therefore proposed cluster training: the historical scheduling data are clustered, each class of samples is trained separately to obtain several mapping models, and at decision time the class of the input data is identified first and the mapping model of the corresponding class is used for the online decision. Although this idea partly solves the fitting-accuracy problem of a single recurrent neural network architecture facing different samples, it requires training several deep learning models, which greatly reduces training and decision efficiency.
As an effective means of solving sequence-type problems, Sequence-to-Sequence (Seq2Seq) technology has been widely used in recent years in machine translation, intelligent question answering, and the like. Unlike the traditional single recurrent neural network architecture, which uses a single neuron chain to read all input quantities and output results, the Seq2Seq technique uses two recurrent neural networks to form an Encoder-Decoder architecture. The Encoder reads the input sequence step by step over the time steps and then outputs the intermediate state C of the whole sequence; since a recurrent neural network records the process information of every training step, the intermediate state C can in theory take the information of the entire input sequence into account. In the Decoder, another recurrent neural network performs the inverse of the Encoder operation, decoding the resulting intermediate state C step by step to form the final output sequence. Because the intermediate state C can fully store the class information of the input and output sequences and the pointing probabilities between them, the Seq2Seq technique is, in theory, a feasible way to solve the problem that a single recurrent neural network model cannot accurately train diverse sample data. However, because the dimensionality of the unit start-stop matrix is proportional to the number of units in the system, the unit start-stop state matrix and the output state matrix of an actual power system are high-dimensional sample matrices, and directly training such data with a Seq2Seq-based deep learning model yields low training efficiency. It is therefore necessary, while introducing the Seq2Seq technique, to study a dimensionality-reduction strategy for the unit start-stop sample data so as to further improve training efficiency.
Disclosure of Invention
The invention aims to solve the technical problem of low training efficiency when a Seq2Seq-based deep learning model is trained on differentiated sample data, which arises because the unit start-stop state matrix and the output state matrix of an actual power system are high-dimensional sample matrices.
The technical scheme adopted by the invention is as follows:
a data driving type unit combination intelligent decision method based on GRU and Seq2Seq technology comprises the following steps:
1. compressing the dimensionality of the unit combination historical decision data by using a sample coding technology aiming at a high-dimensional unit combination training sample matrix;
2. introducing a Seq2Seq technology on the basis of a threshold cycle network, and establishing a composite neural network architecture facing unit combination decision;
3. on the basis, a unit combination deep learning model is built, and a mapping model between the daily load of the system and the unit start-stop scheme is built through historical data training;
4. and performing unit combination decision by using the generated mapping model, solving unit output under the start-stop state and the optimal power flow model of the unit, taking the obtained unit combination decision result as new historical sample data, and training the deep learning model, thereby realizing continuous correction of the model.
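A self-contained toy sketch of this four-step loop, under drastic simplifying assumptions: a nearest-neighbour lookup stands in for the trained GRU/Seq2Seq mapping model, and a capacity-proportional dispatch rule stands in for the optimal power flow solve. Only the data flow of steps 1-4 is faithful; none of the stand-ins is the patented component.

```python
import numpy as np

def encode(u_g):
    # Step 1: compress each period's 0/1 start-stop vector into one decimal code.
    return [int("".join(map(str, period)), 2) for period in u_g]

class NearestNeighbourMapper:
    # Toy stand-in for the step-2/3 GRU/Seq2Seq mapping model.
    def __init__(self):
        self.loads, self.schemes = [], []

    def fit(self, load, scheme_code):
        self.loads.append(np.asarray(load, dtype=float))
        self.schemes.append(scheme_code)

    def predict(self, load):
        dists = [np.linalg.norm(np.asarray(load, dtype=float) - l) for l in self.loads]
        return self.schemes[int(np.argmin(dists))]

def decision_cycle(mapper, new_load, capacities):
    # Step 4: decide the start-stop scheme, dispatch committed units in
    # proportion to capacity (placeholder for the optimal power flow model),
    # then feed the result back as a new sample (continuous correction).
    n = len(capacities)
    scheme_code = mapper.predict(new_load)
    on = np.array([[(code >> (n - 1 - i)) & 1 for i in range(n)]
                   for code in scheme_code], dtype=float)
    committed = on * capacities
    output = committed * (np.asarray(new_load, dtype=float)[:, None]
                          / committed.sum(axis=1, keepdims=True))
    mapper.fit(new_load, scheme_code)
    return on, output

# Two-unit, three-period usage example.
mapper = NearestNeighbourMapper()
mapper.fit([100, 150, 120], encode([[1, 0], [1, 1], [1, 0]]))
on, p_g = decision_cycle(mapper, [105, 145, 118], np.array([200.0, 100.0]))
print(on)   # recovered start-stop matrix per period
print(p_g)  # toy "unit output" per period and unit
```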
In step 1, when the high-dimensional unit combination training sample matrix is processed, the unit combination start-stop state vector of each period is encoded so that vectors with exactly the same start-stop state receive the same code.
The unit combination start-stop state vector of each period is converted into its corresponding decimal code, thereby compressing the dimensionality of the sample matrix.
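A minimal sketch of this encoding, assuming (as an illustrative convention not fixed by the text) that unit 1 occupies the most significant bit of each period's code:

```python
def encode_startstop(u_g):
    """Compress a T x N 0/1 start-stop matrix (T periods, N units) into a
    length-T vector of decimal codes; identical period vectors map to
    identical codes, as required in step 1."""
    return [int("".join(str(bit) for bit in period), 2) for period in u_g]

# A 3-period, 4-unit example: each 4-bit on/off vector becomes one integer,
# so a T x N sample matrix shrinks to T x 1 (e.g. 24 x 54 -> 24 x 1 in the
# IEEE 118-bus example later in the text).
u_g = [[1, 0, 1, 1],   # -> 0b1011 = 11
       [1, 1, 1, 1],   # -> 0b1111 = 15
       [0, 0, 1, 0]]   # -> 0b0010 = 2
print(encode_startstop(u_g))  # [11, 15, 2]
```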
In step 2, an Encoder-Decoder composite neural network architecture is constructed on the basis of the GRU and Seq2Seq techniques, specifically through the following steps:

1) Substitute a historical mapping sample $(P_L, U_G)$ into the Encoder-Decoder architecture, where $P_L$ is the daily load data and $U_G$ is the corresponding unit start-stop scheme. The Encoder reads $P_L$ step by step; the hidden-layer state of the GRU neuron at time $t$ is jointly determined by the hidden-layer state at time $t-1$ and the daily load at time $t$:

$$h_t = f(h_{t-1}, P_{Lt}) \tag{1}$$

where $h_t$ and $h_{t-1}$ are the GRU hidden-layer states at times $t$ and $t-1$, and $P_{Lt}$ is the daily load input at time $t$.

2) In the Encoder architecture, the GRU hidden-layer state $h_t$ at time $t$ is identical to the Encoder intermediate state; in the Decoder architecture, the GRU hidden-layer state $h_k$ at time $k$ is identical to the Decoder intermediate state:

$$C_t = h_t, \qquad C_k = h_k \tag{2}$$

where $C_t$ is the Encoder intermediate state at time $t$ and $C_k$ is the Decoder intermediate state at time $k$.

3) The intermediate state output by the Encoder at the final time $T$ is the intermediate state $C$ of the whole input sequence; its value $C_T$ carries the complete information of the input sequence:

$$C = C_T \tag{3}$$

4) Input the sequence intermediate state $C$ into the Decoder architecture, whose initial intermediate state $C_0$ equals $C$. After $C_0$ is fed in, the GRU hidden-layer state $h_k$ at time $k$ is jointly determined by the hidden-layer state at time $k-1$ and the GRU input at time $k$:

$$h_k = f(h_{k-1}, x_k) \tag{4}$$

where $h_{k-1}$ is the GRU hidden-layer state at time $k-1$ and $x_k$ is the GRU neuron input at time $k$.

5) The Decoder output at time $k-1$ serves as the GRU neuron input at time $k$:

$$x_k = U_{G,k-1} \tag{5}$$

where $U_{G,k-1}$ is the Decoder output at time $k-1$.

6) Substituting formula (5) into formula (4), the Decoder performs the inverse of the Encoder operation, decoding the input-sequence intermediate state $C$ step by step over the time steps to form the final output sequence. Since the Decoder intermediate state $C_{k-1}$ at time $k-1$ equals $h_{k-1}$, the Decoder output at time $k$ is jointly determined by $h_{k-1}$, $U_{G,k-1}$ and $h_k$:

$$p(U_{Gk} \mid U_{G,k-1}, \ldots, U_{G1}, C) = g(h_k, U_{G,k-1}, C_{k-1}) \tag{6}$$

where $U_{Gk}$ is the Decoder output at time $k$, $p$ denotes a probability, $g$ is the softmax function, and $f$ is a conversion function.

7) With the GRU neuron input $x_k$ at time $k$ and the Decoder intermediate state $C_{k-1}$ at time $k-1$ as variables, the update gate $z_k$, the reset gate $r_k$ and the pending (candidate) output $\tilde{h}_k$ of the GRU neuron are constructed as:

$$z_k = \alpha(W_z \cdot [C_{k-1}, x_k]), \quad r_k = \alpha(W_r \cdot [C_{k-1}, x_k]), \quad \tilde{h}_k = \tanh(W_h \cdot [r_k \odot C_{k-1}, x_k]) \tag{7}$$

where $W_r$ is the weight coefficient between $x_k$ and $r_k$, $W_z$ the weight coefficient between $x_k$ and $z_k$, $W_h$ the weight coefficient between $x_k$ and $\tilde{h}_k$, and $\alpha$ is the sigmoid activation function of the neural network.

8) Combining $z_k$, $r_k$ and $\tilde{h}_k$ yields the GRU hidden-layer output $h_k$:

$$h_k = (1 - z_k) \odot h_{k-1} + z_k \odot \tilde{h}_k \tag{8}$$

where $h_{k-1}$ is the GRU hidden-layer output at time $k-1$.

The Encoder-Decoder composite neural network architecture is constructed through the above steps and is sketched in code below.
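A NumPy sketch of equations (1)-(8): a single-layer GRU cell shared in form by Encoder and Decoder, with random illustrative weights and an affine-plus-softmax readout standing in for $g$. The dimensions, initialization and greedy arg-max readback are all assumptions for illustration, not prescriptions of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def make_gru(n_in, n_h):
    # One weight matrix per gate/candidate, acting on [h_prev, x].
    shape = (n_h, n_h + n_in)
    return {k: 0.1 * rng.standard_normal(shape) for k in ("Wz", "Wr", "Wh")}

def gru_cell(p, h_prev, x):
    # Equations (7)-(8): update gate, reset gate, candidate, hidden output.
    v = np.concatenate([h_prev, x])
    z = sigmoid(p["Wz"] @ v)                                   # z_k
    r = sigmoid(p["Wr"] @ v)                                   # r_k
    h_tilde = np.tanh(p["Wh"] @ np.concatenate([r * h_prev, x]))
    return (1.0 - z) * h_prev + z * h_tilde                    # eq. (8)

def encoder(p, p_load):
    # Equation (1), read step by step; C = C_T = h_T by eqs (2)-(3).
    h = np.zeros(p["Wz"].shape[0])
    for p_t in p_load:
        h = gru_cell(p, h, np.atleast_1d(p_t))
    return h

def decoder(p, W_out, C, n_steps, n_codes, start_code=0):
    # Equations (4)-(6): C_0 = C, previous output fed back as x_k (eq. 5).
    h, u_prev, outputs = C, start_code, []
    for _ in range(n_steps):
        x = np.eye(n_codes)[u_prev]                    # one-hot U_{G,k-1}
        h = gru_cell(p, h, x)                          # eq. (4)
        logits = W_out @ h
        probs = np.exp(logits) / np.exp(logits).sum()  # softmax g, eq. (6)
        u_prev = int(np.argmax(probs))
        outputs.append(u_prev)
    return outputs

# Usage: a 24-period load, hidden size 16, 16 possible 4-unit codes (0..15).
n_h, n_codes = 16, 16
enc, dec = make_gru(1, n_h), make_gru(n_codes, n_h)
W_out = 0.1 * rng.standard_normal((n_codes, n_h))
C = encoder(enc, rng.uniform(0.5, 1.0, size=24))
print(decoder(dec, W_out, C, n_steps=24, n_codes=n_codes))
```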
In step 3, the daily load data $P_L$ of a typical day and the corresponding unit start-stop scheme $U_G$ are taken as a historical mapping sample; in the historical mapping sample, the relationship between the unit start-stop scheme $U_G$ and the daily load $P_L$ is described by $U_G = F(p(P_L))$, where $p$ denotes the probability between the daily load and the corresponding unit start-stop scheme, and $F$ denotes a conversion function.
Historical data are accumulated and a deep learning model based on Seq2Seq and GRU is trained offline, thereby obtaining a mapping model that describes the probabilistic relationship between $U_G$ and $P_L$.
The deep learning model is trained with the Adam algorithm, specifically through the following steps:

1) The Encoder-Decoder architecture constructs its loss function on the basis of the mean absolute error (MAE). Let the Encoder-Decoder output at time $k$ be $U_{Gok}$ and the target value be $U_{Gdk}$; the total error $E$ of a sample during training is:

$$E = \frac{1}{T} \sum_{k=1}^{T} \left| U_{Gok} - U_{Gdk} \right| \tag{9}$$

2) The Adam algorithm is used as the update algorithm of the neuron weights to train each parameter of the GRU neurons in the Encoder-Decoder architecture. Its basic formula is:

$$\theta_k = \theta_{k-1} - \delta \, \frac{\hat{m}_k}{\sqrt{\hat{v}_k} + \varepsilon} \tag{10}$$

where $\theta_k$ is the parameter variable to be updated at time $k$, $\delta$ is the learning rate, and $\hat{m}_k$ and $\hat{v}_k$ are the bias-corrected mean of the gradient and the bias-corrected uncentered variance of the gradient, calculated as:

$$m_k = \beta_1 m_{k-1} + (1-\beta_1) g_k, \quad v_k = \beta_2 v_{k-1} + (1-\beta_2) g_k^2, \quad \hat{m}_k = \frac{m_k}{1-\beta_1^k}, \quad \hat{v}_k = \frac{v_k}{1-\beta_2^k} \tag{11}$$

where $g_k$ is the gradient of the error $E$ with respect to the parameter at step $k$, and $\beta_1$ and $\beta_2$ are exponential decay rates.

3) Substituting formula (11) into formula (10) and letting the Adam algorithm adaptively find the learning rate of each parameter realizes the correction of the three weight coefficients $W_r$, $W_z$ and $W_h$ of the GRU neurons in the Encoder-Decoder architecture:

$$W_{r,k} = W_{r,k-1} - \delta \, \frac{\hat{m}_k}{\sqrt{\hat{v}_k} + \varepsilon}, \quad W_{z,k} = W_{z,k-1} - \delta \, \frac{\hat{m}_k}{\sqrt{\hat{v}_k} + \varepsilon}, \quad W_{h,k} = W_{h,k-1} - \delta \, \frac{\hat{m}_k}{\sqrt{\hat{v}_k} + \varepsilon} \tag{12}$$

Training of the Encoder-Decoder architecture is realized on the basis of continuously correcting each weight coefficient through formula (12), as sketched below.
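A NumPy sketch of the MAE loss (9) and the Adam correction (10)-(12) applied to one weight matrix. The decay rates beta1 = 0.9, beta2 = 0.999 and eps = 1e-8 are the customary defaults, assumed here because the text leaves them unstated, and the gradient is a random stand-in rather than a true backpropagated dE/dW.

```python
import numpy as np

def mae_loss(u_out, u_target):
    # Equation (9): mean absolute error over the T output steps.
    return np.mean(np.abs(u_out - u_target))

class Adam:
    """Equations (10)-(12): bias-corrected first/second moment update."""
    def __init__(self, shape, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, beta1, beta2, eps
        self.m = np.zeros(shape)   # running mean of gradients
        self.v = np.zeros(shape)   # running uncentered variance of gradients
        self.k = 0                 # update step counter

    def step(self, theta, grad):
        self.k += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grad
        self.v = self.b2 * self.v + (1 - self.b2) * grad ** 2
        m_hat = self.m / (1 - self.b1 ** self.k)        # eq. (11)
        v_hat = self.v / (1 - self.b2 ** self.k)
        return theta - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)  # eq. (10)

# Usage: correct one weight matrix W_r against a toy stand-in gradient.
rng = np.random.default_rng(1)
W_r = rng.standard_normal((4, 8))
opt = Adam(W_r.shape)
for _ in range(3):
    grad = np.sign(rng.standard_normal(W_r.shape))  # stand-in for dE/dW_r
    W_r = opt.step(W_r, grad)
print(W_r.shape)
```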
The invention further provides a unit combination decision-oriented composite neural network architecture constructed on the basis of GRU and Seq2Seq technology through steps 1) to 8) above, and a power system deep learning model training method that applies the Adam algorithm through steps 1) to 3) above.
Compared with the prior art, the data-driven unit combination decision method provided by the invention has the following advantages and beneficial effects:
1) the invention constructs a GRU-based unit combination decision deep learning model whose training efficiency is higher than that of the LSTM model used in the existing literature;
2) the invention introduces the Seq2Seq technique on the basis of the GRU and provides an Encoder-Decoder composite neural network architecture oriented to unit combination decisions; compared with the sample-clustering approach in the existing literature, the proposed method needs no clustering preprocessing of the sample data and can complete training on all the differentiated sample data with a single deep learning model, so its training and decision efficiency is higher;
3) the invention provides a sample encoding technique for the high-dimensional unit combination sample matrix, which effectively compresses the dimensionality of the unit combination sample data and further improves the training efficiency of the unit combination deep learning model.
Drawings
Fig. 1 is a data-driven unit combination decision method framework based on a composite neural network architecture.
Fig. 2 is a schematic diagram of a sample encoding technique.
Fig. 3 is the mapping model between the daily load and the unit start-stop scheme.
Fig. 4 is an Encoder-Decoder composite neural network architecture.
Fig. 5 is a diagram of the internal structure of a GRU neuron.
Fig. 6 compares the training error curves of the GRU-based model and of the model based on the Seq2Seq technique and the GRU.
Detailed Description
As shown in Fig. 1, the data-driven unit combination intelligent decision method based on GRU and Seq2Seq technology comprises the following steps:
1. for the high-dimensional unit combination training sample matrix, compressing the dimensionality of the historical unit combination decision data with a sample encoding technique;
2. introducing the Seq2Seq technique on the basis of a gated recurrent unit (GRU) network, and establishing a composite neural network architecture oriented to unit combination decisions;
3. on this basis, building a unit combination deep learning model and training, from historical data, a mapping model between the system daily load and the unit start-stop scheme;
4. making unit combination decisions with the generated mapping model, solving the unit outputs under the obtained start-stop states with the optimal power flow model, and taking the obtained unit combination decision results as new historical sample data for training the deep learning model, thereby continuously correcting the model.
As shown in Fig. 3, the daily load data $P_L$ of a typical day and the corresponding unit start-stop scheme $U_G$ are taken as a historical mapping sample. In a mapping sample, the relationship between the unit start-stop scheme $U_G$ and the daily load $P_L$ can be described by $U_G = F(p(P_L))$, where $p$ denotes the probability between the daily load and the corresponding unit start-stop scheme, and $F$ denotes a conversion function. In the process of establishing the mapping model from the daily load $P_L$, a deep learning model based on Seq2Seq and GRU is trained offline on a large amount of accumulated historical data, thereby obtaining a mapping model that describes the probabilistic relationship between $U_G$ and $P_L$.
As shown in Fig. 2, the unit combination start-stop state vector of each period is encoded so that vectors with exactly the same start-stop state receive the same code. The main purpose of sample encoding is to convert the unit combination start-stop state vector of each period into its corresponding decimal code, thereby compressing the dimensionality of the sample matrix and ultimately improving the training efficiency of the deep learning model.
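Conversely, a code read back from the mapping model expands into the period's start-stop vector. A minimal sketch of this inverse step, assuming the same most-significant-bit-first ordering as the encoding sketch earlier and a known unit count N:

```python
def decode_startstop(codes, n_units):
    """Expand a length-T vector of decimal codes back into the T x N
    0/1 start-stop matrix (inverse of the sample encoding)."""
    return [[(code >> (n_units - 1 - i)) & 1 for i in range(n_units)]
            for code in codes]

# Round trip for the 4-unit example codes [11, 15, 2]:
print(decode_startstop([11, 15, 2], n_units=4))
# [[1, 0, 1, 1], [1, 1, 1, 1], [0, 0, 1, 0]]
```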
An Encoder-Decoder composite neural network architecture is constructed based on GRU and Seq2Seq technology, and the specific architecture is shown in FIG. 4.
The structure of a GRU neuron is shown in fig. 5.
In the specific construction, a historical mapping sample $(P_L, U_G)$ is substituted into the Encoder-Decoder architecture, and the Encoder reads the daily load sequence $P_L$ step by step. The hidden-layer state of the GRU neuron at time $t$ is jointly determined by the hidden-layer state at time $t-1$ and the daily load at time $t$:

$$h_t = f(h_{t-1}, P_{Lt}) \tag{1}$$

where $h_t$ and $h_{t-1}$ are the GRU hidden-layer states at times $t$ and $t-1$, and $P_{Lt}$ is the daily load input at time $t$.

According to the characteristics of the GRU model, in the Encoder architecture the GRU hidden-layer state $h_t$ at time $t$ is identical to the Encoder intermediate state, and in the Decoder architecture the GRU hidden-layer state $h_k$ at time $k$ is identical to the Decoder intermediate state:

$$C_t = h_t, \qquad C_k = h_k \tag{2}$$

where $C_t$ is the Encoder intermediate state at time $t$ and $C_k$ is the Decoder intermediate state at time $k$.

According to the Encoder architecture characteristics, the intermediate state output by the Encoder at the final time $T$ is the intermediate state $C$ of the input sequence; its value $C_T$ carries the complete information of the input sequence:

$$C = C_T \tag{3}$$

The sequence intermediate state $C$ is input into the Decoder architecture, whose initial intermediate state $C_0$ equals $C$. After $C_0$ is fed in, the GRU hidden-layer state $h_k$ at time $k$ is jointly determined by the hidden-layer state at time $k-1$ and the GRU input at time $k$:

$$h_k = f(h_{k-1}, x_k) \tag{4}$$

where $h_{k-1}$ is the GRU hidden-layer state at time $k-1$ and $x_k$ is the GRU neuron input at time $k$.

According to the Decoder architecture characteristics, the Decoder output at time $k-1$ serves as the GRU neuron input at time $k$:

$$x_k = U_{G,k-1} \tag{5}$$

where $U_{G,k-1}$ is the Decoder output at time $k-1$.

Substituting formula (5) into formula (4), the Decoder performs the inverse of the Encoder operation, decoding the input-sequence intermediate state $C$ step by step over the time steps to form the final output sequence; the Decoder intermediate state $C_{k-1}$ at time $k-1$ equals $h_{k-1}$. Thus the Decoder output at time $k$ is jointly determined by $h_{k-1}$, $U_{G,k-1}$ and $h_k$:

$$p(U_{Gk} \mid U_{G,k-1}, \ldots, U_{G1}, C) = g(h_k, U_{G,k-1}, C_{k-1}) \tag{6}$$

where $U_{Gk}$ is the Decoder output at time $k$, $p$ denotes a probability, $g$ is the softmax function, and $f$ denotes a conversion function.

As can be seen from Fig. 4, with the GRU neuron input $x_k$ at time $k$ and the Decoder intermediate state $C_{k-1}$ at time $k-1$ as variables, the update gate $z_k$, the reset gate $r_k$ and the pending (candidate) output $\tilde{h}_k$ of the GRU neuron are constructed as:

$$z_k = \alpha(W_z \cdot [C_{k-1}, x_k]), \quad r_k = \alpha(W_r \cdot [C_{k-1}, x_k]), \quad \tilde{h}_k = \tanh(W_h \cdot [r_k \odot C_{k-1}, x_k]) \tag{7}$$

where $W_r$ is the weight coefficient between $x_k$ and $r_k$, $W_z$ the weight coefficient between $x_k$ and $z_k$, $W_h$ the weight coefficient between $x_k$ and $\tilde{h}_k$, and $\alpha$ is the sigmoid activation function of the neural network.

Combining $z_k$, $r_k$ and $\tilde{h}_k$ yields the GRU hidden-layer output $h_k$:

$$h_k = (1 - z_k) \odot h_{k-1} + z_k \odot \tilde{h}_k \tag{8}$$

where $h_{k-1}$ is the GRU hidden-layer output at time $k-1$.
In step 4, the deep learning model is trained with the Adam algorithm.

The Encoder-Decoder architecture constructs its loss function on the basis of the mean absolute error (MAE). Let the Encoder-Decoder output at time $k$ be $U_{Gok}$ and the target value be $U_{Gdk}$; the total error $E$ of a sample during training is:

$$E = \frac{1}{T} \sum_{k=1}^{T} \left| U_{Gok} - U_{Gdk} \right| \tag{9}$$

The Adam algorithm is used as the update algorithm of the neuron weights to train each parameter of the GRU neurons in the Encoder-Decoder architecture. Its basic formula is:

$$\theta_k = \theta_{k-1} - \delta \, \frac{\hat{m}_k}{\sqrt{\hat{v}_k} + \varepsilon} \tag{10}$$

where $\theta_k$ is the parameter variable to be updated at time $k$, $\delta$ is the learning rate, and $\hat{m}_k$ and $\hat{v}_k$ are the bias-corrected mean of the gradient and the bias-corrected uncentered variance of the gradient:

$$m_k = \beta_1 m_{k-1} + (1-\beta_1) g_k, \quad v_k = \beta_2 v_{k-1} + (1-\beta_2) g_k^2, \quad \hat{m}_k = \frac{m_k}{1-\beta_1^k}, \quad \hat{v}_k = \frac{v_k}{1-\beta_2^k} \tag{11}$$

Substituting formula (11) into formula (10) and letting the Adam algorithm adaptively find the learning rate of each parameter realizes the correction of the three weight coefficients $W_r$, $W_z$ and $W_h$ of the GRU neurons in the Encoder-Decoder architecture:

$$W_{r,k} = W_{r,k-1} - \delta \, \frac{\hat{m}_k}{\sqrt{\hat{v}_k} + \varepsilon}, \quad W_{z,k} = W_{z,k-1} - \delta \, \frac{\hat{m}_k}{\sqrt{\hat{v}_k} + \varepsilon}, \quad W_{h,k} = W_{h,k-1} - \delta \, \frac{\hat{m}_k}{\sqrt{\hat{v}_k} + \varepsilon} \tag{12}$$

Training of the Encoder-Decoder architecture is realized on the basis of continuously correcting each weight coefficient through formula (12).
Example:
To verify the validity and correctness of the method of the invention, simulation tests were carried out on the IEEE 118-bus standard test case together with actual data of the Hunan power grid. In the IEEE 118-bus case, 93 typical-day load samples suitable for the IEEE 118-bus system were constructed on the basis of the daily load characteristic curves of the Hunan power grid. Among these daily load samples, samples No. 1-90 serve as training samples and samples No. 91-93 as test samples. To facilitate subsequent calculation and analysis, samples No. 1-90 are clustered into three cluster sample sets by the method of "Research on a data-driven unit combination intelligent decision method with self-learning ability": samples No. 1-30 form cluster sample set 1, to which test sample No. 91 also belongs; samples No. 31-60 form cluster sample set 2, to which test sample No. 92 also belongs; and samples No. 61-90 form cluster sample set 3, to which test sample No. 93 also belongs.
All unit combination deep learning models were trained and tested on the TensorFlow 1.6.0 platform. The related simulation calculations were carried out on a computer with an Intel Core i5-4460 processor (3.20 GHz) and 8 GB of memory.
To verify the correctness of the method of the invention, four methods are set up: method 1 is a unit combination decision method based on the LSTM model; method 2 is a unit combination decision method based on the GRU model; method 3 is a unit combination decision method based on the Seq2Seq technique and the GRU model; method 4 fuses the sample encoding technique into the unit combination decision method based on the Seq2Seq technique and the GRU model.
1) Procedural simulation and correctness verification of the method of the invention
First, samples No. 1-90 are trained with the method of the invention, and the mapping model obtained from training is used to make unit combination decisions on test samples No. 91-93. The resulting unit start-stop schemes are compared with those of the physical-model-driven unit combination decision method of "Network-Constrained AC Unit Commitment Under Uncertainty: A Benders' Decomposition Approach" (hereinafter the Benders' decomposition method). The solution results for test sample No. 91 are shown in Table 1.
Table 1. Unit start-stop schemes for test sample No. 91 solved by the method of the invention and by the Benders' decomposition method
As can be seen from Table 1, the unit start-stop schemes obtained by the method of the invention and by the Benders' decomposition method are the same, which shows that the method of the invention can fully learn the mapping relation between the daily load and the unit start-stop scheme, and that the mapping model obtained from training can make correct unit start-stop decisions for any type of daily load data.
For test samples No. 91-93, the optimal power flow model is then solved on the basis of the solved start-stop schemes to obtain the unit combination decision results. The decision results of the method of the invention and of the Benders' decomposition method are compared; the total costs are shown in Table 2.
Table 2. Total cost comparison between the method of the invention and the Benders' decomposition method
As can be seen from Table 2, the unit output schemes and total costs solved by the method of the invention are the same as the solution results of the Benders' decomposition method. This shows that, after the method of the invention obtains the unit start-stop scheme, solving the optimal power flow model yields the same unit output scheme as the Benders' decomposition method.
This is because the unit start-stop scheme $U_G$ is a decision variable in both the method of the invention and the Benders' decomposition method, and both adopt the same optimal power flow model when solving the unit output scheme. For the same unit start-stop scheme $U_G$, the two methods therefore solve for the same unit output scheme $P_G$. The self-learning and self-evolution abilities of the data-driven unit combination intelligent decision method, and its applicability to different types of unit combination problems, were already verified in "Research on a data-driven unit combination intelligent decision method with self-learning ability" and are not repeated here. In the subsequent examples, decision accuracy denotes the agreement between the unit start-stop scheme obtained by the method of the invention and that obtained by the Benders' decomposition method.
2) Validity verification of the GRU model introduced by the method of the invention
To verify the effectiveness of the GRU model introduced by the invention, method 1 and method 2 are first trained with the training samples after clustering preprocessing, with the number of training iterations set to 500; the two methods are then used to solve test samples No. 91-93. The specific results are shown in Table 3.
Table 3. Decision accuracy and training time comparison of method 1 and method 2
As can be seen from Table 3, in terms of decision accuracy, method 2 reaches 100% on all three test samples, whereas method 1 falls below 100% on sample No. 93 and its total cost is higher than that of method 2. This shows that, within 500 training iterations, method 2 generates an accurate mapping model for every cluster training sample set, while method 1 cannot generate an accurate mapping model for cluster sample set 3 and needs more iterations. In terms of training time, method 2 reduces the training time on the three cluster sample sets by 77 s, 91 s and 82 s respectively compared with method 1, showing that method 2 needs less training time for the same number of iterations.
The main reason for this phenomenon is that the GRU merges the input gate and the forget gate of the LSTM, reorganizing them into an update gate and a reset gate, and simplifies the memory unit so that results can be computed and output directly. The overall structure of the model is therefore simpler, and training and decision accuracy are higher under the same training parameters. Constructing the unit combination decision deep learning model with the GRU instead of the LSTM is thus correct and effective.
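The simplification shows up directly in the parameter count: an LSTM cell carries four weight blocks (input, forget and output gates plus the candidate state) where a GRU carries three (update and reset gates plus the candidate). A quick check, with biases included and with the single-feature load input and a hidden size of 64 as illustrative assumptions:

```python
def lstm_params(n_in, n_h):
    # 4 blocks: input gate, forget gate, output gate, candidate state.
    return 4 * (n_h * (n_in + n_h) + n_h)

def gru_params(n_in, n_h):
    # 3 blocks: update gate, reset gate, candidate state.
    return 3 * (n_h * (n_in + n_h) + n_h)

# For a hidden size of 64 and a scalar load input:
print(lstm_params(1, 64), gru_params(1, 64))  # 16896 12672 (a 25% reduction)
```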
3) Validity verification of the Seq2Seq technique introduced by the method of the invention
To verify the effectiveness of introducing the Seq2Seq technique into the method of the invention, training samples with and without clustering preprocessing are adopted respectively to train method 2 and method 3, and the two methods are used to solve test samples No. 91-93. The specific results are shown in Table 4.
Table 4. Decision accuracy and training time comparison of method 2 and method 3
As can be seen from Table 4, in terms of decision accuracy, if the unit combination training samples are used to train method 2 directly without clustering preprocessing, the decision accuracy of the trained method 2 is generally below 90%, whereas training method 2 with clustered samples brings the decision accuracy to 100%. Method 3 is different: even when trained with non-clustered samples it still reaches 100% decision accuracy. This shows that, with the Seq2Seq technique introduced, a single deep learning model can complete the training of all the differentiated sample data. The reason is that a single deep learning model built directly on the GRU inevitably produces a single compromise mapping model when faced with hugely different training samples, so online decision accuracy is hard to guarantee; to ensure online accuracy one can only cluster the training samples first and then train a separate deep learning model for each class. If instead the Seq2Seq technique is introduced to build a GRU-based Encoder-Decoder composite neural network architecture, the intermediate state C can fully store the class information and pointing probabilities of the input and output sequences, so a single deep learning model achieves accurate training on all the differentiated samples.
In terms of training time, method 2 trained on non-clustered samples needs the least total time. Method 3 on non-clustered samples needs 110.02 s more total training time than that, but 179.23 s less than method 2 trained on clustered samples. The reason is that when method 2 is trained on non-clustered samples, the training process terminates early because it cannot converge to an accurate mapping model, so the overall training time is short; when method 2 is trained on clustered samples, training must be completed on several deep learning models and the clustering preprocessing itself takes time, so its total training time is the longest.
To further analyze the cause of the training time difference between method 2 and method 3, the convergence curves of the two methods trained on non-clustered samples are shown in Fig. 6.
As can be seen from Fig. 6, the total training error of method 2 essentially converges to about 0.09 once the number of iterations exceeds 100 and cannot be reduced further, so its training process ends early, whereas the error of method 3 finally converges to about 0.0002 beyond 100 iterations. Method 3 therefore needs more iterations and a longer time when training non-clustered samples, while method 2 trains faster but cannot guarantee the training accuracy of the model.
In summary, if method 2 is trained directly on training samples without clustering preprocessing, the decision accuracy of the model is hard to guarantee. Introducing a clustering preprocessing strategy solves method 2's training-accuracy problem on differentiated samples, but greatly increases the complexity of its offline training and the total time it requires. By introducing the Seq2Seq technique, method 3 achieves accurate training of the differentiated samples with only a single deep learning model; the training process is simpler, and the training and decision efficiency of the deep learning model is improved while training accuracy is guaranteed.
4) Validity verification of the sample encoding technique introduced by the method of the invention
To verify the influence of the proposed sample encoding technique on the training efficiency of the deep learning model, method 3 and method 4 are each trained with non-clustered training samples; the training and test results are shown in Table 5.
Table 5. Decision accuracy and training time comparison of method 3 and method 4
As can be seen from Table 5, after training, method 3 and method 4 have the same decision accuracy, reaching 100% on the three test samples, but the training time of method 4 is 351 s shorter than that of method 3. The reason is that the proposed sample encoding technique directly compresses the data dimensionality of the training samples, reducing the unit start-stop state matrix of one training sample from 24 × 54 to 24 × 1, which directly reduces the number of variables to be calculated during training and hence the training time of the model. Although the encoding process itself takes some time, the technique effectively reduces the overall training time of the deep learning model.
In conclusion: the proposed sample encoding technique effectively compresses the data dimensionality of the unit combination training samples and directly reduces the number of variables to be calculated during training, thereby effectively reducing the training time of the deep learning model while preserving its training accuracy; compared with the LSTM network adopted in "Research on a data-driven unit combination intelligent decision method with self-learning ability", the GRU model introduced by the invention achieves higher training and decision accuracy under the same training parameters; and by introducing the Seq2Seq technique and building an Encoder-Decoder composite neural network architecture with GRU neurons, accurate training of differentiated samples is achieved with only a single deep learning model, the training process is simpler, and training and decision efficiency are improved while training accuracy is guaranteed.

Claims (9)

1. A data-driven unit combination intelligent decision method based on GRU and Seq2Seq technology, characterized by comprising the following steps:
1. for the high-dimensional unit combination training sample matrix, compressing the dimensionality of the historical unit combination decision data with a sample encoding technique;
2. introducing the Seq2Seq technique on the basis of a gated recurrent unit (GRU) network, and establishing a composite neural network architecture oriented to unit combination decisions;
3. on this basis, building a unit combination deep learning model and training, from historical data, a mapping model between the system daily load and the unit start-stop scheme;
4. making unit combination decisions with the generated mapping model, solving the unit outputs under the obtained start-stop states with the optimal power flow model, and taking the obtained unit combination decision results as new historical sample data for training the deep learning model, thereby continuously correcting the model.
2. The data-driven unit combination intelligent decision method based on GRU and Seq2Seq technology according to claim 1, characterized in that: in step 1, when the high-dimensional unit combination training sample matrix is processed, the unit combination start-stop state vector of each period is encoded so that vectors with exactly the same start-stop state receive the same code.
3. The data-driven unit combination intelligent decision method based on GRU and Seq2Seq technology according to claim 2, characterized in that: the unit combination start-stop state vector of each period is converted into its corresponding decimal code, thereby compressing the dimensionality of the sample matrix.
4. The data-driven type unit combination intelligent decision method based on GRU and Seq2Seq technology as claimed in claim 1, characterized in that: in step 2, an Encoder-Decoder composite neural network architecture is constructed based on GRU and Seq2Seq technology, and the following steps are specifically adopted:
1) mapping a history to a sample (P)L,UG) Substituted into Encode-Decoder architecture, where PLFor daily load data, UGFor the corresponding unit start-stop scheme, the Encoder framework converts P into PLReading in step by step, wherein the hidden layer state of the GRU neuron at the time t is jointly determined by the hidden layer state of the GRU neuron at the time t-1 and the daily load at the time t, and the specific formula is as follows:
ht=f(ht-1,PLt) (1)
in the formula: h istRepresenting the state of a hidden layer of a GRU neuron at the time t; h ist-1Representing the state of a hidden layer of a GRU neuron at the t-1 moment; pLtRepresenting the daily load input at time t;
2) in the Encoder architecture, the hidden layer state h of GRU neuron at time ttSame as Encoder architecture intermediate stateIn the Decoder architecture, the k-time GRU neuron hidden layer state hkThe intermediate state of the Decoder architecture is the same as that of the Decoder architecture, and the specific formula is as follows:
Figure FDA0002203241670000021
in the formula: ctRepresenting the intermediate state of the Encoder architecture at the time t; ckRepresenting the intermediate state of the Encoder architecture at the k moment;
3) the intermediate state output by the Encoder framework at the time T is an intermediate state C of the input sequence, and the value is CTThe complete information of the input sequence is represented by the following formula:
C=CT(3)
4) the sequence intermediate state C is input into the Decoder architecture, where the initial value C_0 of the Decoder intermediate state is the same as the sequence intermediate state C; after C_0 is input, the hidden layer state h_k of the GRU neuron at time k can be obtained, jointly determined by the hidden layer state of the GRU neuron at time k-1 and the GRU neuron input at time k, as follows:

h_k = f(h_{k-1}, x_k)    (4)

in the formula: h_{k-1} represents the hidden layer state of the GRU neuron at time k-1; x_k represents the GRU neuron input at time k;
5) the Decoder architecture output at time k-1 is used as the GRU neuron input at time k, as follows:

x_k = U_{Gk-1}    (5)

in the formula: U_{Gk-1} represents the Decoder architecture output at time k-1;
6) formula (5) is substituted into formula (4); meanwhile, the Decoder architecture performs the inverse operation of the Encoder, decoding the input sequence intermediate state C step by step along the time steps to form the final output sequence, where the Decoder architecture intermediate state C_{k-1} at time k-1 is equal to h_{k-1}, and the Decoder architecture output at time k is jointly decided by h_{k-1}, U_{Gk-1} and h_k, as follows:

U_{Gk} = p(U_{Gk} | h_{k-1}, U_{Gk-1}) = g(h_k, U_{Gk-1}, C_{k-1})    (6)

in the formula: U_{Gk} represents the Decoder architecture output at time k; p represents a probability; g represents a softmax function; f represents a conversion function;
7) with the GRU neuron input x_k at time k and the Decoder architecture intermediate state C_{k-1} at time k-1 as variables, the update gate z_k, the reset gate r_k and the pending output value h̃_k in the GRU neuron are constructed as follows:

z_k = α(W_z · [C_{k-1}, x_k])
r_k = α(W_r · [C_{k-1}, x_k])
h̃_k = tanh(W_h · [r_k ⊙ C_{k-1}, x_k])    (7)

in the formula: W_r represents the weight coefficient between x_k and r_k; W_z represents the weight coefficient between x_k and z_k; W_h represents the weight coefficient between x_k and h̃_k; α represents the sigmoid activation function in the neural network;
8) after z_k, r_k and h̃_k are combined, the hidden layer output h_k of the GRU neuron is obtained as follows:

h_k = z_k ⊙ h_{k-1} + (1 - z_k) ⊙ h̃_k    (8)

in the formula: h_{k-1} represents the GRU neuron hidden layer output at time k-1;

the Encoder-Decoder composite neural network architecture is constructed through the above steps.
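For illustration only, the eight steps above admit a compact sketch using PyTorch's built-in GRU primitives, which already implement gate equations of the form (7)-(8). Every dimension, name, and the zero start token below are assumptions of this sketch, not limitations of the claims:

import torch
import torch.nn as nn

class Seq2SeqUC(nn.Module):
    # Hedged sketch of the claim-4 architecture; sizes are assumptions.
    def __init__(self, n_codes, hidden=64):
        super().__init__()
        # Encoder: reads the daily load step by step (formula 1);
        # its final hidden state is the sequence intermediate state C (3).
        self.encoder = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        # Decoder: a GRU cell fed the embedded previous output U_{Gk-1} (5).
        self.embed = nn.Embedding(n_codes, hidden)
        self.decoder = nn.GRUCell(hidden, hidden)
        self.out = nn.Linear(hidden, n_codes)    # scores for softmax g (6)

    def forward(self, p_l, horizon=24):
        _, h = self.encoder(p_l)                 # h: (1, batch, hidden)
        c = h.squeeze(0)                         # C_0 = C (step 4)
        x = torch.zeros_like(c)                  # assumed start-token input
        logits = []
        for _ in range(horizon):
            c = self.decoder(x, c)               # h_k = f(h_{k-1}, x_k) (4)
            step = self.out(c)
            logits.append(step)
            x = self.embed(step.argmax(dim=-1))  # feed U_Gk back in (5)
        return torch.stack(logits, dim=1)        # (batch, horizon, n_codes)

model = Seq2SeqUC(n_codes=8)                     # 8 codes = 3 units (claim 3)
y = model(torch.rand(2, 24, 1))                  # two 24-point load curves
print(y.shape)                                   # torch.Size([2, 24, 8])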
5. The data-driven type unit combination intelligent decision method based on GRU and Seq2Seq technology as claimed in claim 1, characterized in that: in step 3, the daily load data P_L of a typical day and the corresponding unit start-stop scheme U_G are taken as a historical mapping sample; in the historical mapping sample, the relationship between the unit start-stop scheme U_G and the daily load P_L is described by U_G = F(p(P_L)), where p represents the probability between the daily load and the corresponding unit start-stop scheme, and F represents a conversion function.
6. The GRU and Seq2Seq technology-based data-driven type unit combination intelligent decision method according to claim 5, characterized in that: historical data are accumulated, and the deep learning model based on Seq2Seq and GRU is trained offline, so as to obtain a mapping model describing the probability relationship between U_G and P_L.
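As a purely illustrative example of one historical mapping sample (all numbers invented for the example): a 24-point daily load curve P_L is paired with 24 decimal start-stop codes U_G of the claim-3 kind, and many such pairs form the offline training set:

P_L = [420, 400, 390, 385, 390, 430, 520, 610, 650, 660, 655, 640,
       630, 625, 640, 660, 700, 720, 710, 680, 620, 560, 500, 450]   # MW
U_G = [3, 3, 3, 3, 3, 3, 7, 7, 7, 7, 7, 7,
       7, 7, 7, 7, 7, 7, 7, 7, 7, 3, 3, 3]   # code 3 = units [0,1,1] on
sample = (P_L, U_G)   # one training pair for the mapping U_G = F(p(P_L))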
7. The GRU and Seq2Seq technology-based data-driven type unit combination intelligent decision method according to any one of claims 1 to 6, characterized in that the Adam algorithm is adopted to train the deep learning model, specifically through the following steps:
1) a loss function is constructed for the Encoder-Decoder architecture based on the mean absolute error (MAE); with the Encoder-Decoder architecture output at time k denoted U_{Gok} and the target value denoted U_{Gdk}, the total error E of a sample during training is:

E = (1/T) Σ_{k=1}^{T} |U_{Gok} - U_{Gdk}|    (9)
2) the Adam algorithm is used as the update algorithm for the neuron weights, realizing the training of each parameter of the GRU neurons in the Encoder-Decoder architecture, with the basic formula:

θ_k = θ_{k-1} - δ · m̂_k / (√(v̂_k) + ε)    (10)

in the formula: θ_k is the parameter variable to be updated at time k; δ is the learning rate; m̂_k and v̂_k are the bias-corrected mean (first moment) and uncentered variance (second moment) of the gradients, calculated as:

m_k = β_1 m_{k-1} + (1 - β_1) g_k,  v_k = β_2 v_{k-1} + (1 - β_2) g_k^2
m̂_k = m_k / (1 - β_1^k),  v̂_k = v_k / (1 - β_2^k)    (11)

in the formula: g_k is the gradient at step k; β_1 and β_2 are the exponential decay rates of the two moments; ε is a small constant preventing division by zero;
3) formula (11) is substituted into formula (10), and the Adam algorithm adaptively finds the learning rate of each parameter, realizing the correction of the three weight coefficients W_r, W_z and W_h of the GRU neurons in the Encoder-Decoder architecture, specifically:

W_r ← W_r - δ·m̂_k/(√(v̂_k) + ε),  W_z ← W_z - δ·m̂_k/(√(v̂_k) + ε),  W_h ← W_h - δ·m̂_k/(√(v̂_k) + ε)    (12)

The training of the Encoder-Decoder architecture is realized on the basis of continuously correcting each weight coefficient through formula (12).
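For illustration only, claim 7 corresponds to a standard MAE-plus-Adam training loop. The sketch below is hedged: the two-layer stand-in model and all sizes are assumptions, and torch.optim.Adam supplies the bias-corrected moments of formula (11):

import torch
import torch.nn as nn

# Hedged sketch of claim 7: MAE loss (formula 9) and Adam updates
# (formulas 10-12) applied to all trainable weights of the network.
model = nn.Sequential(nn.Linear(24, 64), nn.Tanh(), nn.Linear(64, 24))
loss_fn = nn.L1Loss()                       # E = mean |U_Gok - U_Gdk|
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3,
                             betas=(0.9, 0.999), eps=1e-8)

p_l = torch.rand(32, 24)                    # batch of daily load curves
u_gd = torch.rand(32, 24)                   # target start-stop sequences

for epoch in range(100):
    u_go = model(p_l)                       # architecture output U_Gok
    loss = loss_fn(u_go, u_gd)              # total error E (formula 9)
    optimizer.zero_grad()
    loss.backward()                         # gradients g_k
    optimizer.step()                        # parameter update (formula 10)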
8. A unit combination decision-oriented composite neural network architecture, characterized in that an Encoder-Decoder composite neural network architecture is constructed based on GRU and Seq2Seq technology, specifically through the following steps: 1) a historical mapping sample (P_L, U_G) is substituted into the Encoder-Decoder architecture, which reads the daily load sequence P_L in step by step; the hidden layer state of the GRU neuron at time t is jointly determined by the hidden layer state of the GRU neuron at time t-1 and the daily load at time t, as follows:

h_t = f(h_{t-1}, P_{Lt})    (1)

in the formula: h_t represents the hidden layer state of the GRU neuron at time t; h_{t-1} represents the hidden layer state of the GRU neuron at time t-1; P_{Lt} represents the daily load input at time t;
2) in the Encoder architecture, the hidden layer state h_t of the GRU neuron at time t is made the same as the Encoder architecture intermediate state; in the Decoder architecture, the hidden layer state h_k of the GRU neuron at time k is made the same as the Decoder architecture intermediate state, as follows:

C_t = h_t,  C_k = h_k    (2)

in the formula: C_t represents the intermediate state of the Encoder architecture at time t; C_k represents the intermediate state of the Decoder architecture at time k;
3) the intermediate state output by the Encoder architecture at the final time T is taken as the intermediate state C of the input sequence; its value C_T carries the complete information of the input sequence, as follows:

C = C_T    (3)
4) the sequence intermediate state C is input into the Decoder architecture, where the initial value C_0 of the Decoder intermediate state is the same as the sequence intermediate state C; after C_0 is input, the hidden layer state h_k of the GRU neuron at time k can be obtained, jointly determined by the hidden layer state of the GRU neuron at time k-1 and the GRU neuron input at time k, as follows:

h_k = f(h_{k-1}, x_k)    (4)

in the formula: h_{k-1} represents the hidden layer state of the GRU neuron at time k-1; x_k represents the GRU neuron input at time k;
5) the Decoder architecture output at time k-1 is used as the GRU neuron input at time k, as follows:

x_k = U_{Gk-1}    (5)

in the formula: U_{Gk-1} represents the Decoder architecture output at time k-1;
6) formula (5) is substituted into formula (4); meanwhile, the Decoder architecture performs the inverse operation of the Encoder, decoding the input sequence intermediate state C step by step along the time steps to form the final output sequence, where the Decoder architecture intermediate state C_{k-1} at time k-1 is equal to h_{k-1}, and the Decoder architecture output at time k is jointly decided by h_{k-1}, U_{Gk-1} and h_k, as follows:

U_{Gk} = p(U_{Gk} | h_{k-1}, U_{Gk-1}) = g(h_k, U_{Gk-1}, C_{k-1})    (6)

in the formula: U_{Gk} represents the Decoder architecture output at time k; p represents a probability; g represents a softmax function; f represents a conversion function;
7) with the GRU neuron input x_k at time k and the Decoder architecture intermediate state C_{k-1} at time k-1 as variables, the update gate z_k, the reset gate r_k and the pending output value h̃_k in the GRU neuron are constructed as follows:

z_k = α(W_z · [C_{k-1}, x_k])
r_k = α(W_r · [C_{k-1}, x_k])
h̃_k = tanh(W_h · [r_k ⊙ C_{k-1}, x_k])    (7)

in the formula: W_r represents the weight coefficient between x_k and r_k; W_z represents the weight coefficient between x_k and z_k; W_h represents the weight coefficient between x_k and h̃_k; α represents the sigmoid activation function in the neural network;
8) after z_k, r_k and h̃_k are combined, the hidden layer output h_k of the GRU neuron is obtained as follows:

h_k = z_k ⊙ h_{k-1} + (1 - z_k) ⊙ h̃_k    (8)

in the formula: h_{k-1} represents the GRU neuron hidden layer output at time k-1;

the Encoder-Decoder composite neural network architecture is constructed through the above steps.
9. A method for training a deep learning model of an electric power system, characterized in that the Adam algorithm is adopted to train the deep learning model, specifically through the following steps:
1) a loss function is constructed for the Encoder-Decoder architecture based on the mean absolute error (MAE); with the Encoder-Decoder architecture output at time k denoted U_{Gok} and the target value denoted U_{Gdk}, the total error E of a sample during training is:

E = (1/T) Σ_{k=1}^{T} |U_{Gok} - U_{Gdk}|    (9)
2) the Adam algorithm is used as the update algorithm for the neuron weights, realizing the training of each parameter of the GRU neurons in the Encoder-Decoder architecture, with the basic formula:

θ_k = θ_{k-1} - δ · m̂_k / (√(v̂_k) + ε)    (10)

in the formula: θ_k is the parameter variable to be updated at time k; δ is the learning rate; m̂_k and v̂_k are the bias-corrected mean (first moment) and uncentered variance (second moment) of the gradients, calculated as:

m_k = β_1 m_{k-1} + (1 - β_1) g_k,  v_k = β_2 v_{k-1} + (1 - β_2) g_k^2
m̂_k = m_k / (1 - β_1^k),  v̂_k = v_k / (1 - β_2^k)    (11)

in the formula: g_k is the gradient at step k; β_1 and β_2 are the exponential decay rates of the two moments; ε is a small constant preventing division by zero;
3) formula (11) is substituted into formula (10), and the Adam algorithm adaptively finds the learning rate of each parameter, realizing the correction of the three weight coefficients W_r, W_z and W_h of the GRU neurons in the Encoder-Decoder architecture, specifically:

W_r ← W_r - δ·m̂_k/(√(v̂_k) + ε),  W_z ← W_z - δ·m̂_k/(√(v̂_k) + ε),  W_h ← W_h - δ·m̂_k/(√(v̂_k) + ε)    (12)

The training of the Encoder-Decoder architecture is realized on the basis of continuously correcting each weight coefficient through formula (12).
CN201910872454.0A 2019-09-16 2019-09-16 Data driving type unit combination intelligent decision-making method based on GRU and Seq2Seq technology Active CN110674459B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202310033698.6A CN116306864A (en) 2019-09-16 2019-09-16 Method for training deep learning model of power system
CN201910872454.0A CN110674459B (en) 2019-09-16 2019-09-16 Data driving type unit combination intelligent decision-making method based on GRU and Seq2Seq technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910872454.0A CN110674459B (en) 2019-09-16 2019-09-16 Data driving type unit combination intelligent decision-making method based on GRU and Seq2Seq technology

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202310033698.6A Division CN116306864A (en) 2019-09-16 2019-09-16 Method for training deep learning model of power system

Publications (2)

Publication Number Publication Date
CN110674459A (en) 2020-01-10
CN110674459B CN110674459B (en) 2023-03-10

Family

ID=69077953

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910872454.0A Active CN110674459B (en) 2019-09-16 2019-09-16 Data driving type unit combination intelligent decision-making method based on GRU and Seq2Seq technology
CN202310033698.6A Pending CN116306864A (en) 2019-09-16 2019-09-16 Method for training deep learning model of power system

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202310033698.6A Pending CN116306864A (en) 2019-09-16 2019-09-16 Method for training deep learning model of power system

Country Status (1)

Country Link
CN (2) CN110674459B (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200041550A1 (en) * 2016-10-05 2020-02-06 Telecom Italia S.P.A. Method and system for estimating energy generation based on solar irradiance forecasting
CN110070224A (en) * 2019-04-20 2019-07-30 北京工业大学 A kind of Air Quality Forecast method based on multi-step recursive prediction

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
AMIN NASRI et al.: "Network-Constrained AC Unit Commitment Under Uncertainty: A Benders' Decomposition Approach", IEEE Transactions on Power Systems *
ZHENGCHAO ZHANG et al.: "Multistep speed prediction on traffic networks: A deep learning approach considering spatio-temporal dependencies", Transportation Research Part C *
YANG Nan et al.: "Research on a data-driven unit commitment intelligent decision method with self-learning ability", Proceedings of the CSEE *
LU Kuan et al.: "Short-term power load forecasting model of a Seq2seq network with multi-layer Bi-GRU", Computer Engineering and Applications *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111426933A (en) * 2020-05-19 2020-07-17 浙江巨磁智能技术有限公司 Safety type power electronic module and safety detection method thereof
CN113393119A (en) * 2021-06-11 2021-09-14 河海大学 Stepped hydropower short-term scheduling decision method based on scene reduction-deep learning
CN113393119B (en) * 2021-06-11 2022-08-30 河海大学 Stepped hydropower short-term scheduling decision method based on scene reduction-deep learning
CN113408648A (en) * 2021-07-07 2021-09-17 华北电力大学 Unit combination calculation method combined with deep learning
CN113420508A (en) * 2021-07-07 2021-09-21 华北电力大学 Unit combination calculation method based on LSTM
CN113420508B (en) * 2021-07-07 2024-02-27 华北电力大学 Unit combination calculation method based on LSTM
CN117291109A (en) * 2023-11-24 2023-12-26 中汽研汽车检验中心(广州)有限公司 Modelica fluid model intelligent prediction method
CN117291109B (en) * 2023-11-24 2024-04-09 中汽研汽车检验中心(广州)有限公司 Modelica fluid model intelligent prediction method
CN117439146A (en) * 2023-12-06 2024-01-23 广东车卫士信息科技有限公司 Data analysis control method and system for charging pile
CN117439146B (en) * 2023-12-06 2024-03-19 广东车卫士信息科技有限公司 Data analysis control method and system for charging pile

Also Published As

Publication number Publication date
CN110674459B (en) 2023-03-10
CN116306864A (en) 2023-06-23

Similar Documents

Publication Publication Date Title
CN110674459B (en) Data driving type unit combination intelligent decision-making method based on GRU and Seq2Seq technology
CN116070799B (en) Photovoltaic power generation amount prediction system and method based on attention and deep learning
CN113553768B (en) Method and device for rapidly calculating reliability of power grid driven by model data in hybrid mode
CN115018209B (en) Long-term prediction method and equipment for operation error of digital electric energy metering system
CN111159638A (en) Power distribution network load missing data recovery method based on approximate low-rank matrix completion
CN113988449A (en) Wind power prediction method based on Transformer model
CN114781744A (en) Deep learning multi-step long radiance prediction method based on codec
CN115983710A (en) High-proportion new energy access electric power system infrastructure project decision method and system
CN112803398A (en) Load prediction method and system based on empirical mode decomposition and deep neural network
CN113762591B (en) Short-term electric quantity prediction method and system based on GRU and multi-core SVM countermeasure learning
CN112785056B (en) Short-term load prediction method based on fusion of Catboost and LSTM models
CN111709519B (en) Deep learning parallel computing architecture method and super-parameter automatic configuration optimization thereof
CN110674460B (en) E-Seq2Seq technology-based data driving type unit combination intelligent decision method
Guo et al. Short-term EV charging load forecasting based on GA-GRU model
CN116613740A (en) Intelligent load prediction method based on transform and TCN combined model
CN115577647B (en) Power grid fault type identification method and intelligent agent construction method
CN112014757A (en) Battery SOH estimation method integrating capacity increment analysis and genetic wavelet neural network
CN115759343A (en) E-LSTM-based user electric quantity prediction method and device
CN110909254B (en) Method and system for predicting question popularity of question-answering community based on deep learning model
Hao et al. The Research of a Data-Driven Intelligent SCUC Decision-making Approach based on Gated Recurrent Unit and Seq2Seq Technique
CN114841472B (en) GWO optimization Elman power load prediction method based on DNA hairpin variation
Zhang et al. Electricity price forecasting method based on quantum immune optimization BP neural network algorithm
Li et al. Short-term Load Forecasting of Long-short Term Memory Neural Network Based on Genetic Algorithm
CN116451049B (en) Wind power prediction method based on agent assisted evolutionary neural network structure search
Lin et al. Evaluation and Analysis of Urban Power Grid Operation Status Based on Online Sequence Extreme Learning Machine and Self-coding Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20200110

Assignee: Hubei Yunzhihang Drone Technology Co.,Ltd.

Assignor: CHINA THREE GORGES University

Contract record no.: X2023980044730

Denomination of invention: A data-driven intelligent decision-making method for unit commitment based on GRU and Seq2Seq technology

Granted publication date: 20230310

License type: Common License

Record date: 20231027
