CN110829434B - Method for improving the extensibility of a deep neural network power flow model

Method for improving the extensibility of a deep neural network power flow model

Info

Publication number: CN110829434B
Application number: CN201910938908.XA
Authority: CN (China)
Prior art keywords: DNN, power flow, layer, power flow model, power
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN110829434A
Inventors: 余娟, 向明旭, 杨知方, 代伟, 杨燕, 余红欣, 何燕
Assignees (current and original): Chongqing University; Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd; State Grid Corp of China SGCC
Application filed by Chongqing University, Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd, and State Grid Corp of China SGCC
Priority: CN201910938908.XA
Published as application CN110829434A; granted and published as CN110829434B

Classifications

    • H: ELECTRICITY
    • H02: GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J: CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00: Circuit arrangements for AC mains or AC distribution networks
    • H02J3/04: Circuit arrangements for AC mains or AC distribution networks for connecting networks of the same frequency but supplied from different sources
    • H02J3/06: Controlling transfer of power between connected networks; controlling sharing of load between connected networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods

Abstract

The invention discloses a method for improving the extensibility of a deep neural network (DNN) power flow model, which mainly comprises the following steps: 1) acquiring basic data of a power system; 2) determining the feature vectors; 3) establishing an original DNN power flow model; 4) training the original DNN power flow model to obtain a trained original DNN power flow model; 5) expanding the original DNN power flow model to obtain an extension DNN power flow model; 6) solving the probabilistic power flow of the extension system to obtain the probabilistic power flow results. The method can be widely applied to solving the probabilistic power flow of power systems, and is particularly suitable for cases where the original system's DNN can no longer be applied because the system has been expanded.

Description

Method for improving the extensibility of a deep neural network power flow model
Technical Field
The invention relates to the field of power systems and their automation, in particular to a method for improving the extensibility of a deep neural network power flow model.
Background
In recent years, with the vigorous development of renewable energy, power systems face more and more uncertainties, which have a significant impact on their operation, planning and control. The probabilistic power flow can comprehensively consider the influence of these uncertainties on the system power flow, thereby providing valuable information for power system analysis.
However, the relationship between each uncertainty factor and the corresponding power flow result is very complex, which makes solving the probabilistic power flow very difficult. Traditional probabilistic power flow calculation methods can be divided into two classes: analytical methods and simulation methods. The solution frameworks of the two classes are shown in FIG. 1.
The core idea of the analytical method is to construct representative samples from the probability density functions of the random variables. By calculating the power flow of these typical samples, the probabilistic power flow result can be obtained. The method does not need to iteratively solve the power flow of a large number of samples, which effectively reduces the computational burden of the probabilistic power flow. However, the analytical method simplifies the power flow equations during calculation, so its accuracy cannot be guaranteed in complex systems.
The simulation method first generates a large number of samples from the probability density functions of the random variables, then iteratively solves the power flow of each sample, and finally performs statistical analysis on the power flow results of the samples to obtain the probabilistic power flow result. Compared with the analytical method, the simulation method is more accurate, but its solution process is time-consuming because the power flow of a large number of samples must be solved iteratively. To increase the computation speed of the simulation method, researchers have proposed methods based on linear power flow models and methods based on parallel computing. A linear power flow model converts the nonlinear power flow equations into linear equations, which effectively increases the computation speed of the probabilistic power flow; however, due to the errors of the linear equations, the calculation accuracy of the probabilistic power flow is reduced. Parallel computing achieves fast and accurate calculation of the probabilistic power flow, but its demand for computing resources is high, and deploying it at every utility company is impractical. In conclusion, the computational burden of the probabilistic power flow is the bottleneck of its practical application in engineering.
With the continuous development of artificial intelligence, some researchers have proposed solving the probabilistic power flow with a DNN, achieving fast, high-precision calculation. The method first trains a DNN to obtain a DNN power flow model that approximates the power flow equations with high precision, and then directly maps the power flow results of all samples with the trained DNN power flow model, so that the probabilistic power flow is calculated quickly and accurately. However, this method has difficulty handling changes in system scale: when the system scale changes, the trained DNN of the original system cannot be directly applied to the new system. With the continuous growth of load demand and new energy, the scale of the power grid keeps increasing (nodes or branches are added). The structure of the DNN model must correspond to the scale of the system being solved, so the DNN of the original system cannot be used to solve the probabilistic power flow of the new system, and a new DNN needs to be trained for the new system. However, training a DNN from scratch is very time-consuming. Therefore, it is necessary to improve the training efficiency of the extension system DNN, thereby improving the extensibility of the DNN on the extension system.
Disclosure of Invention
The present invention is directed to solving the problems of the prior art.
The technical scheme adopted to achieve the purpose of the invention is a method for improving the extensibility of a deep neural network power flow model, mainly comprising the following steps:
1) Acquiring basic data of the power system.
2) Determining the feature vectors, mainly comprising the following steps:
2.1) Setting the uncertain factors, including new energy uncertainty and load uncertainty, and setting the power flow results, which include the voltage magnitude and voltage phase angle of each node and the active and reactive power of each branch.
2.2) Calculating the active injection power P_inj,i and reactive injection power Q_inj,i of node i, the voltage magnitude v_i and voltage phase angle θ_i of node i, and the active power P_ij and reactive power Q_ij of branch i-j, namely:

P_inj,i = P_g,i + P_w,i + P_v,i − P_d,i.  (1)

Q_inj,i = Q_g,i − Q_d,i.  (2)

P_inj,i = v_i Σ_{j=1}^{n_b} v_j (G_ij cos θ_ij + B_ij sin θ_ij).  (3)

Q_inj,i = v_i Σ_{j=1}^{n_b} v_j (G_ij sin θ_ij − B_ij cos θ_ij).  (4)

P_ij = v_i² g_ij − v_i v_j (g_ij cos θ_ij + b_ij sin θ_ij).  (5)

Q_ij = −v_i² b_ij − v_i v_j (g_ij sin θ_ij − b_ij cos θ_ij).  (6)

where i and j are node numbers, i, j = 1, 2, …, n_b; n_b is the number of nodes; P_g,i and Q_g,i are the active and reactive output of the generator at node i; P_w,i is the active output of the wind farm at node i; P_v,i is the active output of the photovoltaic plant at node i; P_d,i and Q_d,i are the active and reactive load at node i; v_i and v_j are the voltage magnitudes at nodes i and j; θ_ij is the voltage phase angle difference between nodes i and j; G_ij and B_ij are elements of the conductance and susceptance matrices; g_ij and b_ij are the conductance and susceptance parameters of the transmission line.
2.3) Determining the input feature vector of a power flow sample, namely the nodal injection power x. The input feature vector is as follows:

x = [P_inj, Q_inj].  (7)

2.4) Determining the output feature vector of a power flow sample, namely the power flow calculation result y_o. The output feature vector is as follows:

y_o = [v, θ, P_ij, Q_ij].  (8)

where v is the vector of voltage magnitudes and θ is the vector of voltage phase angles. (A short sketch of assembling these vectors is given below.)
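For illustration, the two feature vectors of eqs. (7) and (8) can be assembled as follows. This is a minimal numpy sketch; the function names and per-node array arguments are assumptions made for illustration, not part of the patented method.

```python
import numpy as np

def build_input(Pg, Pw, Pv, Pd, Qg, Qd):
    """Assemble x = [P_inj, Q_inj] (eqs. 1, 2 and 7) from per-node arrays."""
    P_inj = Pg + Pw + Pv - Pd      # eq. (1): generator + wind + PV - load
    Q_inj = Qg - Qd                # eq. (2)
    return np.concatenate([P_inj, Q_inj])

def build_output(v, theta, P_br, Q_br):
    """Assemble y_o = [v, theta, P_ij, Q_ij] (eq. 8) from a solved power flow."""
    return np.concatenate([v, theta, P_br, Q_br])
```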
3) Establishing the original DNN power flow model.
Further, the main steps of establishing the original DNN power flow model are as follows:
3.1) Taking the active and reactive injection power of each node of the power system as the original input x of the DNN power flow model, and the calculation result of each power flow sample as the original output y_o of the DNN power flow model. The calculation result of each power flow sample comprises the voltage magnitude and phase angle of each node and the active and reactive power of each branch.
3.2) The original DNN power flow model adopts the stacked denoising autoencoder (SDAE) as its basic model.
The original SDAE power flow model consists of a number of denoising autoencoders (DAE). The main steps of establishing the original SDAE power flow model are as follows:
3.2.1) Establishing a denoising autoencoder (DAE), mainly comprising the following steps:
3.2.1.1) Corrupting the original input x by random mapping to obtain the partially corrupted input x̃. The corruption is expressed as:

x̃ ~ q_D(x̃ | x).  (9)

where q_D is the corruption process based on random mapping, i.e., several elements of the original input x are randomly selected and set to zero; x is the input of the original DNN power flow model.
3.2.1.2) Computing the hidden layer y according to the encoding function f_θ:

y = f_θ(x̃) = s(W·x̃ + b).  (10)

where W is the encoder weight, a d_y × d_x matrix; b is the encoder bias, a d_y-dimensional vector; d_x is the dimension of the input layer vector; d_y is the dimension of the intermediate layer vector; f_θ is the encoding function; s(·) is the activation function used in encoding.
3.2.1.3) Decoding the intermediate layer output y with the decoder's decoding function g_θ′ to obtain the output layer z, thereby establishing the DAE model.
The output of the output layer z is as follows:

z = g_θ′(y) = s(W′·y + b′).  (11)

where W′ is the decoder weight, a d_x × d_y matrix; b′ is the decoder bias, a d_x-dimensional vector.
3.2.2) Stacking the n DAE models layer by layer, with the intermediate layer of each lower DAE serving as the input layer of the DAE above it, to obtain the original SDAE power flow model, namely the original DNN power flow model.
The output ŷ of the original SDAE power flow model is as follows:

ŷ = f_θ^(n)(⋯f_θ^(2)(f_θ^(1)(x))).  (12)

where f_θ^(l) is the encoding function of the l-th DAE, l = 1, 2, …, n; n is the number of DAEs in the original SDAE; f_θ^(n) is the encoding function at the top layer of the original SDAE power flow model. (A minimal code sketch of the DAE and its stacking is given below.)
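As an illustration of eqs. (9) to (12), the sketch below builds one DAE and stacks several of them. It is a minimal PyTorch sketch under assumed layer sizes, a sigmoid activation and a zero-masking corruption; it is not the patent's reference implementation.

```python
import torch
import torch.nn as nn

class DAE(nn.Module):
    """One denoising autoencoder: corrupt (eq. 9), encode (eq. 10), decode (eq. 11)."""
    def __init__(self, d_x, d_y, corruption=0.1):
        super().__init__()
        self.encoder = nn.Linear(d_x, d_y)   # W (d_y x d_x), b
        self.decoder = nn.Linear(d_y, d_x)   # W' (d_x x d_y), b'
        self.corruption = corruption

    def corrupt(self, x):
        # q_D: randomly select elements of x and set them to zero
        mask = (torch.rand_like(x) > self.corruption).float()
        return x * mask

    def encode(self, x):
        return torch.sigmoid(self.encoder(x))     # y = s(W x~ + b)

    def forward(self, x):
        y = self.encode(self.corrupt(x))
        return torch.sigmoid(self.decoder(y))     # z = s(W' y + b')

def stack_sdae(dims):
    """Stack n DAEs layer by layer (eq. 12); dims = [d_x, d_h1, ..., d_hn]."""
    return [DAE(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])]
```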
4) Training the original DNN power flow model, namely the original SDAE power flow model. The main steps are as follows:
4.1) Performing unsupervised pre-training on the original SDAE power flow model, mainly comprising the following steps:
4.1.1) Establishing the mean square error loss function, namely:

L_H(x^l, z^l) = ||x^l − z^l||².  (13)

where x^l is the input of the l-th DAE, i.e., the intermediate layer output y^{l−1} of the (l−1)-th DAE; z^l is the output of the l-th DAE.
4.1.2) Training the parameters of each DAE layer with the root mean square propagation (RMSProp) algorithm. The parameter update formulas of each DAE layer during training are shown in eqs. (14) to (23).

After the (T+1)-th parameter update, the weight w_ij^l(T+1) from the j-th neuron of the (l−1)-th DAE intermediate layer to the i-th neuron of the l-th DAE intermediate layer is:

w_ij^l(T+1) = w_ij^l(T) + Δw_ij^l(T).  (14)

where w_ij^l(T) is the same weight after the T-th parameter update.

After the (T+1)-th parameter update, the bias b_i^l(T+1) of the i-th neuron of the l-th DAE intermediate layer is:

b_i^l(T+1) = b_i^l(T) + Δb_i^l(T).  (15)

where b_i^l(T) is the bias of the i-th neuron of the l-th DAE intermediate layer after the T-th parameter update.

The weight change Δw_ij^l(T) used in the T-th update is obtained as follows:

g_w(T) = (1/m) Σ_{k=1}^{m} ∂L_k/∂w_ij^l(T).  (16)

r_w(T) = ρ·r_w(T−1) + (1 − ρ)·g_w(T) ⊙ g_w(T).  (17)

Δw_ij^l(T) = −(η/√(σ + r_w(T))) ⊙ g_w(T).  (18)

Similarly, the bias change Δb_i^l(T) used in the T-th update is obtained as follows:

g_b(T) = (1/m) Σ_{k=1}^{m} ∂L_k/∂b_i^l(T).  (19)

r_b(T) = ρ·r_b(T−1) + (1 − ρ)·g_b(T) ⊙ g_b(T).  (20)

Δb_i^l(T) = −(η/√(σ + r_b(T))) ⊙ g_b(T).  (21)

where ⊙ denotes the Hadamard product; η is the learning rate; ρ is the gradient accumulation index; σ is a small constant; k indexes an arbitrary sample and m is the number of samples; L is the loss function being minimized (the reconstruction loss L_H(x^l, z^l) during pre-training, and the mean square error between the top-layer output ŷ and the training sample output y during supervised fine-tuning); g_w(T) and g_b(T) are the gradients of the loss with respect to the weight and the bias; r_w(T) and r_w(T−1) are the squared weight gradients accumulated over the first T and T−1 iterations, and r_b(T) and r_b(T−1) are the corresponding accumulated squared bias gradients; Δ denotes an increment, and d and ∂ denote the differential and partial derivative operators.

4.1.3) Introducing the momentum learning rate as an additional term, eqs. (18) and (21) are updated to the weight change and bias change shown in eqs. (22) and (23), namely:

Δw_ij^l(T) = p·Δw_ij^l(T−1) − (η/√(σ + r_w(T))) ⊙ g_w(T).  (22)

Δb_i^l(T) = p·Δb_i^l(T−1) − (η/√(σ + r_b(T))) ⊙ g_b(T).  (23)

where p is the momentum learning rate.
4.1.4) Computing the optimal encoding parameters θ of each DAE layer according to the unsupervised pre-training update formulas, and taking them as the initial encoding parameters for supervised fine-tuning.
4.2) Performing supervised fine-tuning on the original SDAE power flow model, mainly comprising the following steps:
4.2.1) Constructing the mean square error loss function L from the top-layer output ŷ and the training sample output y to obtain the optimization objective arg min_θ J(W, b):

arg min_θ J(W, b) = arg min_θ L.  (24)

4.2.2) According to the optimization objective arg min_θ J(W, b), fine-tuning the optimal encoding parameters θ of the original SDAE power flow model using eqs. (14) to (23) to obtain the trained SDAE power flow model. (A compact sketch of the update rule follows.)
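A compact sketch of the update rule in eqs. (14) to (23) is given below, with eta, rho, sigma and p playing the roles of η, ρ, σ and the momentum learning rate; the default values match those stated in the embodiment. Treating each parameter tensor as a flat array is an assumption made for brevity.

```python
import numpy as np

class RMSPropMomentum:
    """RMSProp with a momentum term, following eqs. (14)-(23)."""
    def __init__(self, shape, eta=0.001, rho=0.99, sigma=1e-8, p=0.9):
        self.eta, self.rho, self.sigma, self.p = eta, rho, sigma, p
        self.r = np.zeros(shape)       # accumulated squared gradient r(T)
        self.delta = np.zeros(shape)   # previous change, for the momentum term

    def step(self, w, grad):
        # eq. (17)/(20): r(T) = rho * r(T-1) + (1 - rho) * g (Hadamard) g
        self.r = self.rho * self.r + (1.0 - self.rho) * grad * grad
        # eq. (22)/(23): delta(T) = p * delta(T-1) - eta / sqrt(sigma + r(T)) * g
        self.delta = self.p * self.delta - self.eta / np.sqrt(self.sigma + self.r) * grad
        # eq. (14)/(15): parameter(T+1) = parameter(T) + delta(T)
        return w + self.delta
```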
5) Expanding the original DNN power flow model to obtain the extension DNN power flow model.
Further, the main steps of expanding the original DNN power flow model by transfer learning are as follows:
5.1) Determining the parameters that need to be initialized in the extension DNN power flow model, including hidden layer parameters, input layer parameters and output layer parameters.
5.2) Migrating the hidden layer parameters of the original DNN power flow model to the extension DNN power flow model.
5.3) Initializing the input layer and output layer parameters of the extension DNN power flow model, mainly comprising the following steps:
5.3.1) Determining the variables connected to the input layer and output layer parameters of the original DNN power flow model, including active injection power, reactive injection power, voltage magnitude, voltage phase angle, branch active power and branch reactive power.
Migrating the parameters connected to these variables in the original DNN power flow model to the extension DNN power flow model, thereby initializing part of the parameters of the extension DNN power flow model.
5.3.2) Initializing the remaining input layer and output layer parameters of the extension DNN power flow model by fitting the parameter distribution of the original DNN power flow model.
5.4) Fine-tuning the extension DNN power flow model with the power flow samples of the extension system. (A sketch of this migration is given after this list.)
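The migration of steps 5.2) to 5.3.2) can be sketched as follows for the input layer. This is a minimal numpy/scipy sketch and a deliberate simplification: the patent groups parameters by the variable they connect to and fits one distribution per variable, whereas here a single kernel density estimate over all migrated columns stands in for that per-variable fit, and scipy's default bandwidth is used.

```python
import numpy as np
from scipy.stats import gaussian_kde

def migrate_input_layer(W_old, old_cols, n_new_inputs):
    """Initialize the extension DNN's input-layer weights from the original DNN.

    W_old:     (d_y, d_x_old) encoder weight matrix of the original system DNN.
    old_cols:  columns of the new input vector occupied by the original
               system's variables; these are migrated directly (step 5.3.1).
    Remaining columns (new nodes/branches) are drawn from a distribution
    fitted to the migrated parameters (step 5.3.2).
    """
    d_y = W_old.shape[0]
    W_new = np.empty((d_y, n_new_inputs))
    W_new[:, old_cols] = W_old                      # direct parameter migration
    rest = [c for c in range(n_new_inputs) if c not in set(old_cols)]
    if rest:
        kde = gaussian_kde(W_old.ravel())           # non-parametric fit
        W_new[:, rest] = kde.resample(d_y * len(rest)).reshape(d_y, len(rest))
    return W_new
```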
6) Solving the probabilistic power flow of the extension system with the extension DNN power flow model to obtain the probabilistic power flow results.
Further, the main steps of solving the extension system's probabilistic power flow are as follows:
6.1) Collecting training samples and training the extension DNN power flow model.
6.2) Sampling the states of the extension system by the Monte Carlo method.
6.3) Computing the probabilistic power flow results of the extension system with the extension DNN power flow model, including the mean and standard deviation of the variables and the probability density functions of all output variables (see the sketch after this list).
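Under the assumption that the trained extension DNN is available as a vectorized callable, step 6 reduces to sampling and one batched forward pass. The sketch below is illustrative; `sample_state` stands in for the Monte Carlo sampling of the uncertain injections.

```python
import numpy as np

def probabilistic_power_flow(dnn, sample_state, n_samples=50_000):
    """Monte Carlo probabilistic power flow with a trained DNN (step 6)."""
    X = np.stack([sample_state() for _ in range(n_samples)])   # step 6.2
    Y = dnn(X)                        # step 6.3: map all samples in one pass
    return {"mean": Y.mean(axis=0),   # statistics of every output variable
            "std": Y.std(axis=0)}     # PDFs can be estimated from Y's columns
```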
It is worth noting that the invention first migrates the hidden layer parameters of the original system's DNN directly to the new system's DNN; then migrates the input layer and output layer parameters of the original system DNN to the new system DNN according to the random variables they connect to; and finally fits, by non-parametric estimation, the distribution of the parameters connected to each variable in the input and output layers of the original system DNN, initializing the not-yet-initialized parameters of the new system DNN by sampling from that distribution. Simulation results on the IEEE39 node system and the IEEE118 node system show that the proposed knowledge migration method greatly improves the training efficiency of the new system DNN, verifying the effectiveness of the method.
The technical effects of the present invention are evident. The invention provides a knowledge migration method for improving the training efficiency of the extension system DNN. The method mainly has the following four characteristics:
1) The parameters of the original system DNN are migrated according to the variables they connect to. The inputs and outputs of the DNN comprise the various random variables of the power system. The input layer and output layer parameters of the original system DNN have already learned the power flow characteristics of the variables connected to them, and can therefore provide valuable power flow knowledge for the parameters connected to the corresponding variables in the new system DNN. Based on this, the invention proposes to migrate the input layer and output layer parameters of the original system DNN to the new DNN according to the random variables the parameters connect to.
2) The remaining input layer and output layer parameters are initialized based on non-parametric estimation. Since the new system is larger than the original system, the input and output layers of the new system DNN have higher dimensions than those of the original system DNN. The parameters of the original system DNN therefore cannot completely cover the parameters of the new DNN, and some parameters still need to be initialized. For these parameters, the invention proposes to first obtain the parameter distribution of the original system DNN by non-parametric estimation, and then initialize the remaining parameters by sampling from that distribution.
3) The proposed knowledge migration method migrates the trained parameters of the original system DNN to the extension system DNN, helping the extension system DNN to learn, making it converge faster during training and improving training efficiency, thereby improving the extensibility of the DNN on the extension system.
4) The proposed knowledge migration method uses the trained parameters of the original system DNN to provide good initial values for the extension system DNN. These good initial values reduce the dependence of the extension system DNN on the number of training samples, further improving the extensibility of the DNN on the extension system.
The method can be widely applied to solving the probabilistic power flow of power systems, and is particularly suitable for cases where the original system's DNN can no longer be applied because the system has been expanded. With the proposed method, the power flow feature knowledge learned by the original system DNN can be migrated to the new system DNN, which greatly improves the training efficiency of the new system DNN and effectively reduces the number of samples required by the training process.
Drawings
FIG. 1 is the process framework of the analytical method and the simulation method;
FIG. 2 is the structure of the DAE;
FIG. 3 is the structure of the SDAE;
FIG. 4 is the framework of extension system DNN training based on transfer learning;
FIG. 5 is the basic framework for solving the probabilistic power flow with the DNN.
Detailed Description
The present invention is further illustrated by the following examples, but the scope of the claimed subject matter should not be construed as limited to them. Various substitutions and alterations made according to common technical knowledge and conventional means in the field, without departing from the technical idea of the invention, remain within the scope of the present invention.
Example 1:
Referring to FIG. 1 to FIG. 5, a method for improving the extensibility of a deep neural network power flow model mainly comprises the following steps:
1) Acquiring basic data of the power system.
2) Determining the feature vectors. The DNN is a model that extracts the features between sample inputs and outputs. To enable the DNN to effectively extract the features of the probabilistic power flow, the input feature vector should include features related to new energy and load, and the output feature vector should contain the power flow results of interest.
Further, the main steps of determining the feature vector are as follows:
2.1) Setting the uncertain factors, including new energy uncertainty and load uncertainty, and setting the power flow results, which include the voltage magnitude and voltage phase angle of each node and the active and reactive power of each branch.
2.2) Calculating the active injection power P_inj,i and reactive injection power Q_inj,i of node i, the voltage magnitude v_i and voltage phase angle θ_i of node i, and the active power P_ij and reactive power Q_ij of branch i-j, namely:

P_inj,i = P_g,i + P_w,i + P_v,i − P_d,i.  (1)

Q_inj,i = Q_g,i − Q_d,i.  (2)

P_inj,i = v_i Σ_{j=1}^{n_b} v_j (G_ij cos θ_ij + B_ij sin θ_ij).  (3)

Q_inj,i = v_i Σ_{j=1}^{n_b} v_j (G_ij sin θ_ij − B_ij cos θ_ij).  (4)

P_ij = v_i² g_ij − v_i v_j (g_ij cos θ_ij + b_ij sin θ_ij).  (5)

Q_ij = −v_i² b_ij − v_i v_j (g_ij sin θ_ij − b_ij cos θ_ij).  (6)

where i and j are node numbers, i, j = 1, 2, …, n_b; n_b is the number of nodes; P_g,i and Q_g,i are the active and reactive output of the generator at node i; P_w,i is the active output of the wind farm at node i; P_v,i is the active output of the photovoltaic plant at node i; P_d,i and Q_d,i are the active and reactive load at node i; v_i and v_j are the voltage magnitudes at nodes i and j; θ_ij is the voltage phase angle difference between nodes i and j; G_ij and B_ij are elements of the conductance and susceptance matrices; g_ij and b_ij are the conductance and susceptance parameters of the transmission line.
2.3) Determining the input feature vector of a power flow sample, namely the nodal injection power x. The input feature vector is as follows:

x = [P_inj, Q_inj].  (7)

2.4) Determining the output feature vector of a power flow sample, namely the power flow calculation result y_o. The output feature vector is as follows:

y_o = [v, θ, P_ij, Q_ij].  (8)

where v is the vector of voltage magnitudes and θ is the vector of voltage phase angles.
3) Establishing the original DNN power flow model.
The basic model selected by the invention is the stacked denoising autoencoder (SDAE), a special type of DNN. The SDAE is composed of a number of denoising autoencoders (DAE). A DAE is similar to a common three-layer neural network; the main differences are that the DAE contains a corruption process and that its training goal is to reconstruct its input. The corruption process forces the DAE to extract more robust features, and reconstructing the input extracts the input's latent features, which facilitates the training of the DNN.
Further, the main steps of establishing the original DNN power flow model are as follows:
3.1) Taking the active and reactive injection power of each node of the power system as the original input x of the DNN power flow model, and the calculation result of each power flow sample (namely the voltage magnitude and phase angle of each node, and the active and reactive power of each branch) as the original output y_o of the DNN power flow model.
The structure of the DAE is shown in FIG. 2, and its calculation process is as follows:
Corrupting the original input x by random mapping to obtain the partially corrupted input x̃:

x̃ ~ q_D(x̃ | x).  (9)

where q_D is the corruption process based on random mapping, i.e., several elements of the original input x are randomly selected and set to zero; x is the original input of the original DNN power flow model.
3.2) Computing the hidden layer y according to the encoding function f_θ:

y = f_θ(x̃) = s(W·x̃ + b).  (10)

where W is the encoder weight, a d_y × d_x matrix; b is the encoder bias, a d_y-dimensional vector; d_x is the dimension of the input layer vector; d_y is the dimension of the intermediate layer vector; f_θ is the encoding function; s(·) is the activation function used in encoding.
3.3) Decoding the intermediate layer output y with the decoder's decoding function g_θ′ to obtain the output layer z, thereby establishing the DAE model.
The output of the output layer z is as follows:

z = g_θ′(y) = s(W′·y + b′).  (11)

where W′ is the decoder weight, a d_x × d_y matrix; b′ is the decoder bias, a d_x-dimensional vector.
3.4) Stacking the n DAE models layer by layer, with the intermediate layer of each lower DAE serving as the input layer of the DAE above it, to obtain the SDAE power flow model (i.e., the original DNN power flow model), as shown in FIG. 3.
The output ŷ of the original SDAE power flow model is as follows:

ŷ = f_θ^(n)(⋯f_θ^(2)(f_θ^(1)(x))).  (12)

where f_θ^(l) is the encoding function of the l-th DAE, l = 1, 2, …, n; n is the number of DAEs in the original SDAE; f_θ^(n) is the encoding function at the top layer of the original SDAE power flow model.
4) Training the original SDAE power flow model to obtain the trained original SDAE power flow model (namely the original DNN power flow model).
Further, the main steps of training the original SDAE power flow model are as follows:
4.1) Performing unsupervised pre-training on the original SDAE power flow model, mainly comprising the following steps:
4.1.1) Establishing the mean square error loss function L_H(x^l, z^l), namely:

L_H(x^l, z^l) = ||x^l − z^l||².  (13)

where x^l is the input of the l-th DAE, i.e., the intermediate layer output y^{l−1} of the (l−1)-th DAE; z^l is the output of the l-th DAE; J denotes the objective function; min denotes minimization.
The mean square error loss function over the training set is shown below:

L = (1/m) Σ_{k=1}^{m} ||ŷ_k − y_k||².

where m is the number of samples, ŷ is the output value of the SDAE, and y is the actual output of the training sample; L is the simplified notation of the loss function.
4.1.2) Training each DAE layer with the root mean square propagation (RMSProp) algorithm. The parameter update formulas of each DAE layer during training are shown in eqs. (14) to (23).

After the (T+1)-th parameter update, the weight w_ij^l(T+1) from the j-th neuron of the (l−1)-th DAE intermediate layer to the i-th neuron of the l-th DAE intermediate layer is:

w_ij^l(T+1) = w_ij^l(T) + Δw_ij^l(T).  (14)

where w_ij^l(T) is the same weight after the T-th parameter update.

After the (T+1)-th parameter update, the bias b_i^l(T+1) of the i-th neuron of the l-th DAE intermediate layer is:

b_i^l(T+1) = b_i^l(T) + Δb_i^l(T).  (15)

where b_i^l(T) is the bias of the i-th neuron of the l-th DAE intermediate layer after the T-th parameter update.

The weight change Δw_ij^l(T) used in the T-th update is obtained as follows:

g_w(T) = (1/m) Σ_{k=1}^{m} ∂L_k/∂w_ij^l(T).  (16)

r_w(T) = ρ·r_w(T−1) + (1 − ρ)·g_w(T) ⊙ g_w(T).  (17)

Δw_ij^l(T) = −(η/√(σ + r_w(T))) ⊙ g_w(T).  (18)

Similarly, the bias change Δb_i^l(T) used in the T-th update is obtained as follows:

g_b(T) = (1/m) Σ_{k=1}^{m} ∂L_k/∂b_i^l(T).  (19)

r_b(T) = ρ·r_b(T−1) + (1 − ρ)·g_b(T) ⊙ g_b(T).  (20)

Δb_i^l(T) = −(η/√(σ + r_b(T))) ⊙ g_b(T).  (21)

where ⊙ denotes the Hadamard product; η is the learning rate, set to 0.001; ρ is the gradient accumulation index, set to 0.99; σ is a small constant, set to 10⁻⁸; k indexes an arbitrary sample and m is the number of samples; L is the loss function being minimized (the reconstruction loss L_H(x^l, z^l) during pre-training, and the mean square error between the top-layer output ŷ and the training sample output y during supervised fine-tuning); g_w(T) and g_b(T) are the gradients of the loss with respect to the weight and the bias; r_w(T) and r_w(T−1) are the squared weight gradients accumulated over the first T and T−1 iterations, and r_b(T) and r_b(T−1) are the corresponding accumulated squared bias gradients; Δ denotes an increment, and d and ∂ denote the differential and partial derivative operators.

4.1.3) Introducing the momentum learning rate as an additional term, eqs. (18) and (21) are updated to the weight change and bias change shown in eqs. (22) and (23), namely:

Δw_ij^l(T) = p·Δw_ij^l(T−1) − (η/√(σ + r_w(T))) ⊙ g_w(T).  (22)

Δb_i^l(T) = p·Δb_i^l(T−1) − (η/√(σ + r_b(T))) ⊙ g_b(T).  (23)

where p is the momentum learning rate, set to 0.9.
4.1.4) Computing the optimal encoding parameters θ of each DAE layer according to the unsupervised pre-training update formulas, and taking them as the initial encoding parameters for supervised fine-tuning.
4.2) Performing supervised fine-tuning on the original SDAE power flow model, mainly comprising the following steps:
4.2.1) Constructing the mean square error loss function L from the top-layer output ŷ and the training sample output y to obtain the optimization objective arg min_θ J(W, b):

arg min_θ J(W, b) = arg min_θ L.  (24)

4.2.2) According to the optimization objective arg min_θ J(W, b), fine-tuning the optimal encoding parameters θ = {W, b} of the original SDAE power flow model to obtain the final SDAE power flow model (i.e., the trained original DNN power flow model).
5) Expanding the original DNN power flow model to obtain the extension DNN power flow model.
In a real power system, the system expands as new energy and load demand develop. From eqs. (7) and (8) it can be seen that the input and output dimensions of the DNN are related to the scale of the system. The input and output dimensions of the original system DNN do not match the extension system, so the original system DNN cannot be applied to the extension system, and a new DNN needs to be trained for it.
To improve the training efficiency of the new DNN, the invention utilizes transfer learning, a method of applying useful knowledge from a related domain to assist the training of the target domain model. The new system is obtained by expanding the original system, so its power flow characteristics are very similar to those of the original system. Therefore, the original system DNN parameters, which reflect the original system's power flow characteristics, are useful knowledge for the extension system DNN. Based on this, the invention migrates the parameters learned by the original system DNN to the new DNN to improve the new DNN's training efficiency.
The traditional transfer learning method directly migrates the trained DNN parameters to the new DNN to initialize its parameters, and then fine-tunes them with new samples. However, for the extension system, the structure of the new DNN is larger than that of the original system DNN, so directly migrating the parameters of the original system DNN cannot cover all the parameters of the new DNN.
Further, the main steps of expanding the original DNN power flow model by transfer learning are as follows:
5.1) Determining the parameters that need to be initialized in the extension DNN power flow model, including hidden layer parameters, input layer parameters and output layer parameters.
5.2) Migrating the hidden layer parameters of the original DNN power flow model to the extension DNN power flow model.
5.3) Initializing the input layer and output layer parameters of the extension DNN power flow model, mainly comprising the following steps:
5.3.1) Determining the variables connected to the input layer and output layer parameters of the original DNN power flow model, including active injection power, reactive injection power, voltage magnitude, voltage phase angle, branch active power and branch reactive power.
Migrating the parameters connected to these variables in the original DNN power flow model to the extension DNN power flow model, thereby initializing part of the parameters of the extension DNN power flow model.
As shown in FIG. 4, the parameters of the input layer and the output layer connect to different variables, including active injection power, reactive injection power, voltage magnitude, voltage phase angle, branch active power and branch reactive power. The parameters connected to each variable in the original system DNN have already learned the power flow knowledge associated with that variable. The invention therefore proposes to migrate the input layer and output layer parameters according to the variables they connect to. Taking the active injection power as an example, during parameter migration the invention migrates the parameters connected to the active injection power in the original system DNN to the new DNN, to initialize part of the parameters connected to the active injection power in the new DNN, as shown by the light blue lines in FIG. 4.
5.3.2) Initializing the remaining input layer and output layer parameters of the extension DNN power flow model by fitting the parameter distribution of the original DNN power flow model.
Taking the active injection power as an example, because the extension system has more nodes than the original system, some parameters connected to the active injection power in the new system DNN are not yet initialized, as shown by the light blue solid circles in FIG. 4. Since the power flow knowledge associated with the active injection power is contained in the distribution of the parameters connected to it, the invention first fits the distribution of the parameters connected to the active injection power in the original system DNN, and then initializes the remaining parameters connected to the active injection power in the new system DNN by sampling from the fitted distribution. Common distribution fitting methods include parametric estimation and non-parametric estimation; since there is no prior knowledge of the parameter distribution, the invention chooses non-parametric estimation to fit the distribution of the parameters.
For the parameters connected to the active injection power, the probability density function can be estimated by eq. (25):

f̂_h(w) = (1/(N_w·h)) Σ_{i=1}^{N_w} K((w − w_Pinj,i)/h).  (25)

where K(x) is the kernel function; the invention adopts the Gaussian kernel:

K(x) = (1/√(2π)) exp(−x²/2).  (26)

w_Pinj,i denotes a parameter connected to the active injection power in the original system DNN; h is the bandwidth factor, set to 1.8 in the invention; N_w is the number of parameters connected to the active injection power.
Through the estimated probability density function f̂_h(w), the remaining parameters of the new DNN connected to the active injection power can be initialized by sampling (a short sketch follows).
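The estimate of eqs. (25) and (26) and the subsequent sampling are short enough to write out directly. The numpy sketch below uses the stated bandwidth h = 1.8; drawing from a Gaussian kernel density estimate amounts to picking a fitted parameter at random and perturbing it with N(0, h²) noise.

```python
import numpy as np

def kde_pdf(w, params, h=1.8):
    """Gaussian KDE of eq. (25) evaluated at point(s) w."""
    u = (np.atleast_1d(w)[:, None] - params[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(params) * h * np.sqrt(2 * np.pi))

def kde_sample(params, n, h=1.8, rng=None):
    """Draw n initial values for the remaining parameters from the fitted KDE."""
    rng = rng or np.random.default_rng()
    return rng.choice(params, size=n) + rng.normal(0.0, h, size=n)
```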
5.4) Fine-tuning the extension DNN power flow model with the power flow samples of the extension system.
By the above method, the power flow knowledge learned by the original system DNN is migrated to the new DNN, providing it with good initial parameters. The new DNN is then fine-tuned with the power flow samples of the extension system. Since the output of the DNN is the result of the interaction of all its parameters, the invention adjusts all parameters of the new DNN. Compared with training from scratch, this method completes the training of the extension system DNN faster, because the proposed method does not need to execute the unsupervised pre-training stage and the original system DNN provides good initial values for the extension system DNN. Furthermore, the proposed method also reduces the dependence of the training process on the number of training samples.
6) Solving the probabilistic power flow of the extension system with the extension DNN power flow model to obtain the probabilistic power flow results.
Further, the main steps of solving the extension system's probabilistic power flow are as follows:
6.1) Collecting training samples and training the extension DNN power flow model.
6.2) Sampling the states of the extension system by the Monte Carlo method.
6.3) Computing the probabilistic power flow results of the extension system with the extension DNN power flow model, including the mean and standard deviation of the variables and the probability density functions of all output variables.
Example 2:
An experiment verifying the method for improving the extensibility of a deep neural network power flow model mainly comprises the following steps:
1) Power flow sample acquisition
In this embodiment, the IEEE39 node system and the IEEE118 node system are used for simulation. Original systems: in the IEEE39 node system, wind farms are introduced at buses 23, 24 and 25, each with a maximum output of 260 MW, and photovoltaic plants are introduced at buses 17, 18 and 19, each with a maximum output of 200 MW. In the IEEE118 node system, wind farms are introduced at buses 59, 80 and 90, each with a maximum output of 260 MW, and photovoltaic plants are introduced at buses 13, 14, 16 and 23, each with a maximum output of 200 MW. Extension systems: in the IEEE39 node system, bus 40 and branch 26-40 are added to the original system. In the IEEE118 node system, bus 119 and a new branch 75-119 (branch 118) are added to the original system, and a wind farm with a maximum output of 200 MW is introduced at bus 119. The wind speed is assumed to follow a two-parameter Weibull distribution with scale parameter 2.016 and shape parameter 5.089. The illumination intensity follows a Beta distribution; the shape parameters of the photovoltaic plants and the cut-in, rated and cut-out wind speeds of the wind farms are shown in Table 1. Furthermore, the random characteristic of the load at each node is assumed to follow a normal distribution with a standard deviation of 10% of the load's expected value.
TABLE 1 Photovoltaic plant and wind farm related parameters
(table data not reproduced in the source text)
Then, the random variables are sampled 50,000 times by the Monte Carlo method, and the power flow of each sampled state is solved by Newton's method. The active and reactive injection power of all sampled states are taken as the training sample input x, and the power flow results of all sampled states obtained by Newton's method (i.e., the voltage magnitude and phase angle of each node, and the active and reactive power of each branch) are taken as the training sample output y. (A sketch of the sampling step follows.)
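The sampling of the uncertain sources can be sketched as follows. The Weibull scale and shape values are the ones stated above; the Beta shape parameters and the load's expected value are placeholders for the Table 1 data, which is not reproduced here. Each sampled state is then solved with Newton's method to label the training set.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50_000                                   # Monte Carlo sample count

# Wind speed: two-parameter Weibull (scale 2.016, shape 5.089 as stated above)
wind_speed = 2.016 * rng.weibull(5.089, size=N)

# Illumination intensity: Beta distribution (shape parameters assumed, see Table 1)
irradiance = rng.beta(2.0, 2.0, size=N)

# Nodal load: normal around its expected value with a 10% standard deviation
load_expected = 100.0                        # MW, illustrative placeholder
load = rng.normal(load_expected, 0.10 * load_expected, size=N)
```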
2) DNN power flow model initialization
This includes data preprocessing and the determination of the DNN model structure. The input and output data of the training samples are normalized by eq. (27). Then the layer structure of the DNN power flow model is set according to the scale and complexity of the power system to be solved, as shown in Table 2.

x* = (x − μ)/σ.  (27)

where x* is the normalized sample value, x is the sample value to be normalized, μ is the sample mean, and σ is the sample standard deviation.
TABLE 2 Model structures of the different systems

System                    IEEE39 node system    IEEE118 node system
Hidden layers             4                     4
Hidden layer neurons      200                   300
3) Original system DNN power flow model training
For the original system, the DNN needs to be trained from scratch, i.e., with the two-stage method of unsupervised pre-training and supervised fine-tuning. During training, the DNN parameters are iteratively adjusted according to the training samples using eqs. (13) to (23), until the test accuracy meets the requirement or the number of iterations reaches 500.
4) Extension system DNN power flow model training
For the extension system, the proposed knowledge migration method is adopted to migrate the parameters of the original system DNN to the extension system DNN, after which the extension system DNN is trained. During training, the DNN parameters are iteratively adjusted according to the extension system power flow samples using eqs. (14) to (23), until the test accuracy meets the requirement or the number of iterations reaches 500.
The specific simulation results are as follows:
1) Test cases and training method comparison
The example information used for the simulation is as follows:
Example 1: the modified IEEE39 node system, with a load standard deviation of 10% and new energy penetration of 20%.
Example 2: the modified IEEE118 node system, with a load standard deviation of 10% and new energy penetration of 20%.
Example 3: the modified IEEE39 node system, with a load standard deviation of 10% and new energy penetration of 25%.
Example 4: the modified IEEE118 node system, with a load standard deviation of 10% and new energy penetration of 25%.
Example 5: the extension system based on Example 1, with bus 40 and branch 26-40 added.
Example 6: the extension system based on Example 2, with bus 119 and a new branch 75-119 (branch 118) added, and a wind farm with a maximum output of 200 MW introduced at bus 119.
Examples 5 and 6 are extension systems based on examples 1 and 2, respectively. The effectiveness of the knowledge migration method provided by the invention is verified by adopting the following steps of example 5 and example 6.
The DNN power flow model training method compared by the invention is M1-M3:
m1: the DNN is trained from zero.
M2: the knowledge migration method provided by the invention is adopted to train DNN.
M3: as M2, but employs a random initialization method to initialize the remaining parameters in the input and output layers.
The accuracy of the DNN power flow model is evaluated with the following indices:
P_V,β: the probability that the absolute error of the voltage magnitude exceeds β p.u.
P_θ,β: the probability that the absolute error of the voltage phase angle exceeds β rad.
P_P,β: the probability that the absolute error of the branch active power exceeds β MW.
P_Q,β: the probability that the absolute error of the branch reactive power exceeds β MVar.
2) Accuracy analysis of the original system DNN power flow model
This section tests the accuracy of the original system's DNN power flow model, which needs to be trained both to solve the probabilistic power flow of the original system and to provide useful power flow knowledge for the extension system DNN. Table 3 lists the following accuracy indices: P_V,0.0002, P_θ,0.002, P_P,2, P_Q,2.
TABLE 3 Accuracy of the DNN power flow model trained by M1

Example     Method    P_V,0.0002    P_θ,0.002    P_P,2    P_Q,2
Example 1   M1        0.05%         0.88%        1.06%    0.01%
Example 2   M1        1.42%         1.02%        0.57%    0.02%

As can be seen from Table 3, the accuracy indices of both Example 1 and Example 2 are below 1.5%, indicating that the original system DNN can approximate the power flow equations with high accuracy.
To verify the generalization ability of the original system DNN, the DNNs trained in Examples 1 and 2 are used directly to solve the examples with higher new energy penetration (Examples 3 and 4). The test results are shown in Table 4.

TABLE 4 Accuracy of solving Examples 3 and 4 directly with the trained DNN
(table data not reproduced in the source text)

As can be seen from Table 4, all indices remain below 2% for Examples 3 and 4, verifying the generalization capability of the original system DNN.
3) Performance analysis of the proposed knowledge migration method
This section verifies the effectiveness of the proposed knowledge migration method. Because the system scales differ, the input and output dimensions of Examples 5 and 6 differ from those of Examples 1 and 2, so the DNNs trained in Examples 1 and 2 cannot be directly applied to solving the probabilistic power flows of Examples 5 and 6, and a new DNN suited to the extension system needs to be trained. For Examples 5 and 6, Table 5 lists the training results of the three methods: M1 (training from scratch), M2 (training with the proposed knowledge migration method) and M3 (the same as M2, but with random initialization of the remaining parameters). The training stop criterion is that all the indices shown in Table 3 are below 5% or that the number of iterations reaches 500.
TABLE 5 Comparison of the extension system training results of M1 to M3
(table data not reproduced in the source text)
As can be seen from Table 5, M1 cannot achieve satisfactory accuracy when the number of samples is small; with sufficient samples M1 can meet the accuracy requirement, but the training burden increases accordingly. In contrast, M2 meets the accuracy requirement faster and with fewer samples, thanks to not having to perform unsupervised pre-training and to the better initial parameter values provided by the original system DNN. The advantage of M2 shows particularly well in Example 6: for Examples 5 and 6, the training speed of M2 is 9.92 times and 61.62 times that of M1, respectively. Similar to M2, M3 also greatly improves training efficiency. However, M2 trains more efficiently than M3, demonstrating the effectiveness of initializing the remaining parameters from the fitted parameter distribution; the margin is small because the number of remaining parameters is small. Furthermore, another advantage of M2 over M3 is that M2 is not affected by the randomness of the initialization method.
The experimental results show that the proposed knowledge migration method (M2) greatly reduces the training time and the required number of training samples.
In summary, the invention provides a knowledge migration method. First, the hidden layer parameters of the original system DNN are migrated to the new DNN; then, the input and output layer parameters of the original system DNN are migrated according to the variables they connect to; finally, the remaining input and output layer parameters of the new DNN are initialized from the fitted parameter distribution of the original system DNN. In this way, the power flow knowledge learned by the original system DNN is migrated to the extension system DNN, improving the training efficiency of the extension system DNN. The simulation analysis shows that the proposed knowledge migration method greatly improves the training efficiency of the extension system DNN and reduces the number of samples required by the training process, and also demonstrates the effectiveness of initializing the remaining parameters from the fitted parameter distribution. Therefore, the invention can greatly improve the extensibility of the DNN on the extension system.

Claims (5)

1. A method for improving the expansibility of a deep neural network power flow model, characterized by mainly comprising the following steps:
1) acquiring basic data of a power system;
2) determining a feature vector;
3) establishing an original DNN power flow model;
4) training the original DNN power flow model to obtain a trained original DNN power flow model;
5) expanding the trained original DNN power flow model to obtain an expanded DNN power flow model;
the main steps of expanding the trained original DNN power flow model by the transfer learning method are as follows:
5.1) determining parameters needing initialization in the extension DNN power flow model, wherein the parameters comprise hidden layer parameters, input layer parameters and output layer parameters;
5.2) transferring hidden layer parameters in the original DNN power flow model to an extension DNN power flow model;
5.3) initializing input layer parameters and output layer parameters in the extension DNN power flow model, and mainly comprising the following steps:
5.3.1) determining variables connected with input layer parameters and output layer parameters of the original DNN power flow model, wherein the variables comprise active injection power, reactive injection power, voltage amplitude, voltage phase angle, branch active power and branch reactive power;
respectively migrating the parameters connected with these variables in the original DNN power flow model to the extension DNN power flow model, so as to initialize the corresponding part of the parameters of the extension DNN power flow model;
5.3.2) initializing the remaining parameters of the input layer and output layer in the extension DNN power flow model by fitting the parameter distribution of the original DNN power flow model;
5.4) fine-tuning the extension DNN power flow model with power flow samples of the extension system;
6) solving the probabilistic power flow of the extension system with the extension DNN power flow model to obtain the probabilistic power flow result.
2. The method for improving the expansibility of the deep neural network power flow model according to claim 1, wherein the main steps of determining the feature vectors are as follows:
1) setting the uncertain factors, including new energy uncertainty and load uncertainty; setting the power flow result, which comprises the voltage amplitude and voltage phase angle of each node and the active power and reactive power of each branch;
2) calculating the active injection power $P_{inj,i}$ of node i, the reactive injection power $Q_{inj,i}$ of node i, the voltage amplitude $v_i$ of node i, the voltage phase angle $\theta_i$ of node i, and the active power $P_{ij}$ and reactive power $Q_{ij}$ of branch i-j, namely:

$P_{inj,i} = P_{g,i} + P_{w,i} + P_{v,i} - P_{d,i}$; (1)

$Q_{inj,i} = Q_{g,i} - Q_{d,i}$; (2)

$P_{inj,i} = v_i \sum_{j=1}^{n_b} v_j (G_{ij}\cos\theta_{ij} + B_{ij}\sin\theta_{ij})$; (3)

$Q_{inj,i} = v_i \sum_{j=1}^{n_b} v_j (G_{ij}\sin\theta_{ij} - B_{ij}\cos\theta_{ij})$; (4)

$P_{ij} = v_i^2 g_{ij} - v_i v_j (g_{ij}\cos\theta_{ij} + b_{ij}\sin\theta_{ij})$; (5)

$Q_{ij} = -v_i^2 b_{ij} - v_i v_j (g_{ij}\sin\theta_{ij} - b_{ij}\cos\theta_{ij})$; (6)

where i and j are node numbers; $n_b$ is the number of nodes; $P_{g,i}$ and $Q_{g,i}$ are the active and reactive outputs of the generator at node i; $P_{w,i}$ is the active output of the wind farm at node i; $P_{v,i}$ is the active output of the photovoltaic station at node i; $P_{d,i}$ and $Q_{d,i}$ are the active and reactive loads at node i; $v_i$ and $v_j$ are the voltage amplitudes of nodes i and j; $\theta_{ij}$ is the voltage phase angle difference between node i and node j; $G_{ij}$ and $B_{ij}$ are elements of the conductance and susceptance matrices; $g_{ij}$ and $b_{ij}$ are the conductance and susceptance parameters of the transmission line;
3) determining the input feature vector of the power flow sample, namely the node injection power x:

$x = [P_{inj}, Q_{inj}]$; (7)

4) determining the output feature vector of the power flow sample, namely the power flow calculation result $y_o$:

$y_o = [v, \theta, P_{ij}, Q_{ij}]$; (8)

where v is the voltage amplitude and $\theta$ is the voltage phase angle.
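By way of a hedged illustration of formulas (1)-(8) (the function name and array layout are assumptions, not the patent's code; the branch flow expressions follow the reconstruction of formulas (5)-(6) above), one power flow sample could be assembled as follows:

```python
import numpy as np

def build_sample(Pg, Pw, Pv, Pd, Qg, Qd, v, theta, branches):
    """Assemble one power flow sample (x, y_o) per formulas (1)-(8).

    Pg..Qd, v, theta : per-node numpy arrays; branches : (i, j, g, b) tuples.
    """
    P_inj = Pg + Pw + Pv - Pd            # formula (1)
    Q_inj = Qg - Qd                      # formula (2)
    Pij, Qij = [], []
    for (i, j, g, b) in branches:        # formulas (5)-(6), pi-model lines
        t = theta[i] - theta[j]
        Pij.append(v[i]**2 * g - v[i]*v[j]*(g*np.cos(t) + b*np.sin(t)))
        Qij.append(-v[i]**2 * b - v[i]*v[j]*(g*np.sin(t) - b*np.cos(t)))
    x = np.concatenate([P_inj, Q_inj])                              # (7)
    y_o = np.concatenate([v, theta, np.array(Pij), np.array(Qij)])  # (8)
    return x, y_o
```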
3. The method for improving the expansibility of the deep neural network power flow model according to claim 2, wherein the step of establishing the original DNN power flow model comprises the following steps:
1) taking the active injection power and reactive injection power of each node in the power system as the original input x of the DNN power flow model, and taking the calculation result of each power flow sample as the original output $y_o$ of the DNN power flow model; the calculation result of each power flow sample comprises the voltage amplitude of each node, the voltage phase angle of each node, and the active power and reactive power of each branch;
2) the original DNN power flow model takes the stacked denoising autoencoder (SDAE) as its basic model;
the SDAE consists of a plurality of denoising autoencoders (DAE); the main steps of establishing the original SDAE power flow model are as follows:
2.1) establishing a denoising autoencoder (DAE), which mainly comprises the following steps:
2.1.1) corrupting the original input x by random mapping to obtain the locally corrupted input $\tilde{x}$; the corruption formula is as follows:

$\tilde{x} \sim q_D(\tilde{x} \mid x)$; (9)

where $q_D$ is the corruption process of the random-mapping mode, which randomly selects a number of components of the original input x and sets them to zero; x is the input of the original DNN power flow model;
2.1.2) computing the intermediate layer y according to the encoding function $f_\theta$, namely:

$y = f_\theta(\tilde{x}) = s(W\tilde{x} + b)$; (10)

where W is the encoder weight, a $d_y \times d_x$ matrix; b is the encoder bias, a $d_y$-dimensional vector; $d_x$ is the dimension of the input layer vector; $d_y$ is the dimension of the intermediate layer vector; $f_\theta$ is the encoding function; $s(\cdot)$ is the activation function of the encoding process;
2.1.3) decoding the intermediate layer output y with the decoder's decoding function $g_{\theta'}$ to obtain the output z of the output layer, thereby establishing the DAE model; the output z of the output layer is as follows:

$z = g_{\theta'}(y) = s(W'y + b')$; (11)

where $W'$ is the decoder weight, a $d_x \times d_y$ matrix; $b'$ is the decoder bias, a $d_x$-dimensional vector;
2.2) stacking n DAE models layer by layer, with the intermediate layer of the lower DAE model serving as the input layer of the upper DAE model, to obtain the original SDAE power flow model, i.e. the original DNN power flow model;
the output $\hat{y}$ of the original SDAE power flow model is as follows:

$\hat{y} = f_\theta^{(n)}(f_\theta^{(n-1)}(\cdots f_\theta^{(1)}(x)))$; (12)

where $f_\theta^{(l)}$ is the encoding function of the l-th layer DAE, l = 1, 2, ..., n; n is the number of DAEs in the original SDAE; l is the DAE index; and $f_\theta^{(n)}$ is the encoding function of the top layer of the original SDAE power flow model.
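A minimal sketch of formulas (9)-(12) follows; the sigmoid activation, zero-masking corruption, and function names are illustrative assumptions rather than the patent's specification:

```python
import numpy as np

def corrupt(x, q_D, rng):
    """Formula (9): randomly zero a fraction q_D of the input components."""
    mask = rng.random(x.shape) >= q_D
    return x * mask

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def dae_forward(x_tilde, W, b, W_p, b_p):
    """Formulas (10)-(11): encode to the intermediate layer, then decode."""
    y = sigmoid(W @ x_tilde + b)        # encoder f_theta
    z = sigmoid(W_p @ y + b_p)          # decoder g_theta'
    return y, z

def sdae_forward(x, weights, biases):
    """Formula (12): stack n encoders; each intermediate layer feeds the next."""
    y = x
    for W, b in zip(weights, biases):
        y = sigmoid(W @ y + b)
    return y
```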
4. The method for improving the expansibility of the deep neural network power flow model according to claim 3, wherein the training of the original DNN power flow model, that is, the original SDAE power flow model, comprises the following steps:
1) carrying out unsupervised pre-training on the original SDAE power flow model, which mainly comprises the following steps:
1.1) establishing the mean square error loss function, namely:

$L(x_l, z_l) = \frac{1}{m} \sum_{k=1}^{m} \| x_l^{(k)} - z_l^{(k)} \|^2$; (13)

where $x_l$ is the input of the l-th layer DAE, i.e. the intermediate layer output $y_{l-1}$ of the (l-1)-th layer DAE; $z_l$ is the output of the l-th layer DAE; and $\arg\min_\theta J(W,b)$ is the optimization objective function;
1.2) training the parameters of each layer of DAE with the root mean square propagation (RMSProp) algorithm; the parameter update formulas of each layer of DAE during training are shown in formulas (14) to (21);
after the (T+1)-th parameter update, the weight $W_{l,ij}^{T+1}$ from the j-th neuron of the (l-1)-th DAE intermediate layer to the i-th neuron of the l-th DAE intermediate layer is as follows:

$W_{l,ij}^{T+1} = W_{l,ij}^{T} + \Delta W_{l,ij}^{T}$; (14)

where $W_{l,ij}^{T}$ is that weight after the T-th parameter update;

the bias $b_{l,i}^{T+1}$ of the i-th neuron of the l-th DAE intermediate layer after the (T+1)-th parameter update is as follows:

$b_{l,i}^{T+1} = b_{l,i}^{T} + \Delta b_{l,i}^{T}$; (15)

where $b_{l,i}^{T}$ is that bias after the T-th parameter update;

after the T-th parameter update, the change $\Delta W_{l,ij}^{T}$ of the weight $W_{l,ij}^{T}$ in the update process is as follows:

$dW_{l,ij}^{T} = \frac{\partial L^{(k)}}{\partial W_{l,ij}^{T}}$; (16)

$r_{W}^{T} = \rho r_{W}^{T-1} + (1-\rho)\, dW_{l,ij}^{T} \odot dW_{l,ij}^{T}$; (17)

$\Delta W_{l,ij}^{T} = -\frac{\eta}{\sqrt{r_{W}^{T} + \sigma}} \odot dW_{l,ij}^{T}$; (18)

after the T-th parameter update, the change $\Delta b_{l,i}^{T}$ of the bias $b_{l,i}^{T}$ in the update process is as follows:

$db_{l,i}^{T} = \frac{\partial L^{(k)}}{\partial b_{l,i}^{T}}$; (19)

$r_{b}^{T} = \rho r_{b}^{T-1} + (1-\rho)\, db_{l,i}^{T} \odot db_{l,i}^{T}$; (20)

$\Delta b_{l,i}^{T} = -\frac{\eta}{\sqrt{r_{b}^{T} + \sigma}} \odot db_{l,i}^{T}$; (21)

where $\odot$ denotes the Hadamard product; $\eta$ is the learning rate; $\rho$ is the gradient accumulation index; k is an arbitrary sample; $W_{l,ij}^{T-1}$ is the weight after the (T-1)-th parameter update; $r_{W}^{T}$ and $r_{W}^{T-1}$ are the gradients accumulated by the first T and the first T-1 weight iterations; $\Delta$ is the increment symbol; d is the differential symbol; $\sigma$ is a constant; $\partial$ is the partial derivative symbol; $L^{(k)}$ is the mean square error loss function constructed from the top-layer output $\hat{y}^{(k)}$ and the training sample output $y^{(k)}$; $b_{l,i}^{T-1}$ is the bias after the (T-1)-th parameter update; $r_{b}^{T}$ and $r_{b}^{T-1}$ are the gradients accumulated by the first T and the first T-1 bias iterations; and m is the number of samples;
1.3) introducing the momentum learning rate as an additional term and updating formulas (16) to (21), obtaining the weight change $\Delta W_{l,ij}^{T}$ and the bias change $\Delta b_{l,i}^{T}$ shown in formulas (22) and (23), namely:

$\Delta W_{l,ij}^{T} = p \Delta W_{l,ij}^{T-1} - \frac{\eta}{\sqrt{r_{W}^{T} + \sigma}} \odot dW_{l,ij}^{T}$; (22)

$\Delta b_{l,i}^{T} = p \Delta b_{l,i}^{T-1} - \frac{\eta}{\sqrt{r_{b}^{T} + \sigma}} \odot db_{l,i}^{T}$; (23)

where p is the momentum learning rate;
1.4) calculating the optimal encoding parameters θ of each layer of DAE according to the unsupervised pre-training parameter update formulas, and taking them as the initial encoding parameters for supervised fine-tuning;
2) carrying out supervised fine-tuning on the original SDAE power flow model, which mainly comprises the following steps:
2.1) constructing the mean square error loss function L from the top-layer output $\hat{y}$ and the training sample output y, obtaining the optimization objective function $\arg\min_\theta J(W,b)$:

$\arg\min_\theta J(W,b) = \arg\min_\theta L$; (24)

2.2) according to the optimization objective function $\arg\min_\theta J(W,b)$, fine-tuning the optimal encoding parameters θ of the original SDAE power flow model with formulas (14) to (23) to obtain the trained SDAE power flow model.
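Formulas (14)-(23) correspond to an RMSProp update with a momentum term. The sketch below shows one such parameter update under that reading; the function name and the default values of η, ρ, σ, and p are assumptions, not values fixed by the patent:

```python
import numpy as np

def rmsprop_momentum_step(param, grad, r, delta_prev,
                          eta=1e-3, rho=0.9, sigma=1e-8, p=0.9):
    """One update of a weight or bias per formulas (14)-(23).

    param      : current parameter W^T or b^T
    grad       : gradient dW^T (or db^T) of the MSE loss L
    r          : squared gradient accumulated over the first T iterations
    delta_prev : previous change (the momentum term of formulas (22)-(23))
    """
    # Formulas (17)/(20): accumulate the squared gradient.
    r = rho * r + (1.0 - rho) * grad * grad
    # Formulas (22)/(23): RMS-scaled step plus the momentum learning rate p.
    delta = p * delta_prev - eta / np.sqrt(r + sigma) * grad
    # Formulas (14)/(15): apply the change.
    param = param + delta
    return param, r, delta
```

The same update would serve both the unsupervised pre-training of each DAE and the supervised fine-tuning of formula (24), only the loss whose gradient is fed in differs.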
5. The method for improving the expansibility of the deep neural network power flow model according to claim 1, wherein the main steps of solving the probabilistic power flow of the extension system are as follows:
1) collecting training samples and training the extension DNN power flow model;
2) sampling the states of the extension system by the Monte Carlo method;
3) calculating the probabilistic power flow result of the extension system with the extension DNN power flow model; the result comprises the mean and standard deviation of the output variables and the probability density functions of all output variables.
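A hedged sketch of this Monte Carlo procedure is given below; the `dnn` and `sample_state` callables stand in for the trained extension DNN power flow model and the wind/PV/load uncertainty sampler, which the claim does not further specify:

```python
import numpy as np

def probabilistic_power_flow(dnn, sample_state, n_samples=10000, seed=0):
    """Monte Carlo probabilistic power flow with a trained extension DNN.

    dnn          : callable mapping injections x to flow results y_o
    sample_state : callable drawing one x from the uncertainty model
    """
    rng = np.random.default_rng(seed)
    Y = np.array([dnn(sample_state(rng)) for _ in range(n_samples)])
    mean, std = Y.mean(axis=0), Y.std(axis=0)
    # Empirical probability density of each output variable via histograms.
    pdfs = [np.histogram(Y[:, i], bins=50, density=True)
            for i in range(Y.shape[1])]
    return mean, std, pdfs
```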
CN201910938908.XA 2019-09-30 2019-09-30 Method for improving expansibility of deep neural network tidal current model Active CN110829434B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910938908.XA CN110829434B (en) 2019-09-30 2019-09-30 Method for improving expansibility of deep neural network tidal current model


Publications (2)

Publication Number Publication Date
CN110829434A CN110829434A (en) 2020-02-21
CN110829434B true CN110829434B (en) 2021-04-06

Family

ID=69548499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910938908.XA Active CN110829434B (en) 2019-09-30 2019-09-30 Method for improving expansibility of deep neural network tidal current model

Country Status (1)

Country Link
CN (1) CN110829434B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116451085A (en) * 2023-06-19 2023-07-18 中铁电气化勘测设计研究院有限公司 Power supply flow real-time calculation method based on rail transit power monitoring platform

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104503235A (en) * 2014-12-09 2015-04-08 中国石油大学(华东) Condition monitoring method based on improved BP neural network for power plant equipment
CN106651766A (en) * 2016-12-30 2017-05-10 深圳市唯特视科技有限公司 Image style migration method based on deep convolutional neural network
CN110110434A (en) * 2019-05-05 2019-08-09 重庆大学 A kind of initial method that Probabilistic Load Flow deep neural network calculates

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5469528A (en) * 1992-09-16 1995-11-21 Syseca, Inc. Neural load disturbance analyzer
CN109873425B (en) * 2017-12-01 2023-10-20 中国电力科学研究院有限公司 Power system power flow adjustment method and system based on deep learning and user behavior
CN109117951B (en) * 2018-01-15 2021-11-16 重庆大学 BP neural network-based probability load flow online calculation method
CN108304623B (en) * 2018-01-15 2021-05-04 重庆大学 Probability load flow online calculation method based on stack noise reduction automatic encoder
CN108599172B (en) * 2018-05-18 2021-07-30 广东电网有限责任公司佛山供电局 Transmission and distribution network global load flow calculation method based on artificial neural network
CN109599872B (en) * 2018-12-29 2022-11-08 重庆大学 Power system probability load flow calculation method based on stack noise reduction automatic encoder
CN109902854B (en) * 2019-01-11 2020-11-27 重庆大学 Method for constructing optimal power flow full-linear model of electric-gas interconnection system
CN110163540B (en) * 2019-06-28 2021-06-15 清华大学 Power system transient stability prevention control method and system




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant