CN112149873A

CN112149873A - Low-voltage transformer area line loss reasonable interval prediction method based on deep learning

Info

Publication number: CN112149873A
Application number: CN202010867853.0A
Authority: CN
Inventors: 井友鼎; 付勇; 滕铁军; 郝增才; 陈小燕; 张伟
Original assignee: Beijing Hezhong Weiqi Technology Co ltd
Current assignee: Beijing Hezhong Weiqi Technology Co ltd
Priority date: 2020-08-25
Filing date: 2020-08-25
Publication date: 2020-12-29
Anticipated expiration: 2040-08-25
Also published as: CN112149873B

Abstract

The invention belongs to the technical field of transformer area line loss prediction, and particularly relates to a deep-learning-based method for predicting the reasonable line loss interval of a low-voltage transformer area. The method comprises the following steps: collecting data; constructing indexes; constructing features; selecting feature factors; constructing a line loss prediction model; and displaying the model prediction effect. By predicting line loss accurately, the method improves the ability to target loss reduction and supports lean management of line loss.

Description

Low-voltage transformer area line loss reasonable interval prediction method based on deep learning

Technical Field

The invention belongs to the technical field of line loss prediction of transformer areas, and particularly relates to a low-voltage transformer area line loss reasonable interval prediction method based on deep learning.

Background

The causes of line loss in a low-voltage distribution transformer area fall mainly into fixed losses, management causes and technical causes. Fixed losses comprise the resistance loss and excitation loss produced by the windings and iron core of the transformer; the resistive loss produced by the cabling of the transmission network; the electric energy loss produced by capacitor and reactor equipment deployed in the transmission network; the electric energy loss produced by protection devices in the power network; dielectric loss; and the loss produced by grid metering devices. Management causes mainly refer to meter reading problems, insufficient anti-electricity-theft management and the like. Technical causes mainly refer to problems such as inconsistent marketing data and inconsistent household-transformer relationships.

With the deepening of lean line loss management at the State Grid Corporation, the traditional one-size-fits-all assessment of the line loss qualification rate no longer meets the requirements of lean management. Power supply enterprises need an effective line loss calculation method that dynamically predicts a reasonable line loss value and the upper bound of the reasonable interval for each transformer area, and that gives early warning for transformer areas exceeding that upper bound.

Scholars have already carried out a large amount of research and verification on transformer area line loss calculation, and many methods for calculating the line loss rate of a power grid now exist. They fall mainly into traditional methods and machine-learning-based methods. The traditional methods mainly comprise the transformer area loss rate method, the voltage loss rate method, the equivalent resistance method, the power flow method and the like. The machine learning methods mainly comprise interpretable linear regression methods, ensemble regression methods based on decision trees, neural-network-based nonlinear regression methods that are not easy to interpret, calculation methods based on support vector machines, calculation methods based on a neural network and an improved adaptive quadratic mutation differential evolution algorithm, calculation methods based on improved K-means clustering and a BP neural network, and a short-term theoretical line loss prediction algorithm for low-voltage distribution networks based on kmeans-LightGBM.

The invention patent document with publication number CN109272176B discloses a method for predicting and calculating the line loss rate of a transformer area by using a K-means clustering algorithm, which comprises the following steps: step 1, selecting active power supply quantity X1, reactive power supply quantity X2, total power supply line length X3, power supply radius X4 and total line resistance X5 as electrical characteristic parameters; step 2, standardizing the original data of the electrical characteristic parameters; step 3, establishing a performance index function PI(i) of the transformer area from the electrical characteristic parameters, and selecting the initial clustering center points and the cluster number K; step 4, predicting the line loss rate of the transformer area by using an improved K-means clustering algorithm. That method uses the index function built from the electrical characteristic parameters of the transformer area as the criterion for cluster analysis and for choosing the initial cluster centers, which improves the accuracy of the clustering result. However, different initial values can still lead to different classification results, and inaccurate classification easily causes large parameter reconstruction errors.

Disclosure of Invention

The invention aims to address the problems in the prior art by providing a deep-learning-based method for predicting the reasonable line loss interval of a low-voltage transformer area. By means of an accurate line loss prediction model, the method improves the ability to target loss reduction and supports lean management of line loss.

The technical scheme of the invention is as follows:

A low-voltage transformer area line loss reasonable interval prediction method based on deep learning comprises the following steps:

S1, collecting data from four systems, namely the acquisition, marketing, PMS and GIS systems, wherein the data involved comprises distribution transformer side data and user side data;

S2, constructing indexes, namely selecting line loss influence factors from the data collected in S1;

S3, constructing features, namely analyzing the correlation between the selected influence factors and the line loss rate, and on that basis taking the square of the load characteristic and the absolute value of the head-to-end voltage drop as feature factors input to the model;

S4, selecting feature factors, namely constructing a Lasso regression model from the statistical index factors and the two constructed feature factors, and, according to the Lasso regression variable screening result, finally selecting eight factors as the final feature factors: the on-grid electricity ratio, the power supply radius, the end-user electricity ratio, the power factor of the transformer area, the absolute value of the head-to-end voltage drop, the load rate, the square of the load characteristic and the three-phase imbalance coefficient;

S5, constructing a line loss prediction model by adopting LSTM;

S6, displaying the model prediction effect.

Specifically, in step S1, the distribution transformer side data includes distribution transformer capacity, voltage, current, power and electric quantity, and the user side data includes user capacity, daily electric quantity and user coordinates.

Specifically, in step S2, twenty-four transformer area line loss influence factors are selected from five categories: power supply indexes, grid structure indexes, electric quantity indexes, operation indexes and capacity indexes.

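Specifically, the Lasso regression model in step S4 is a regularization method that performs parameter estimation and variable selection simultaneously, and the parameter estimation is defined as follows:

$$\hat{\beta}_{lasso} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\}$$

In the formula, $y_i$ is the line loss rate of sample $i$, $x_{ij}$ is the value of factor $j$ for sample $i$, $\lambda$ is a non-negative regularization parameter, and $\lambda \sum_{j=1}^{p} |\beta_j|$ is the penalty term.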
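As a rough illustration of how the penalty term drives variable selection, the following sketch minimises a Lasso objective of the standard form (squared error plus lambda times the sum of absolute coefficients) by coordinate descent. The function names, the choice of solver and the iteration count are assumptions made for illustration; the text does not state which solver was used.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero; return 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=500):
    """Minimise sum_i (y_i - b0 - sum_j X[i][j]*b_j)^2 + lam * sum_j |b_j|.

    X is a list of rows and y a list of targets (hypothetical inputs).
    Returns (intercept, coefficients).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                fit_without_j = intercept + sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - fit_without_j)
                norm += X[i][j] ** 2
            # For the objective above the shrinkage threshold is lam / 2.
            beta[j] = soft_threshold(rho, lam / 2.0) / norm if norm > 0 else 0.0
        # The intercept is not penalised, so it equals the mean residual.
        intercept = sum(
            y[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)
        ) / n
    return intercept, beta
```

Factors whose coefficient is shrunk exactly to zero are the ones a screening step of this kind would drop; a larger lambda removes more factors.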

Specifically, the step S4 of screening the regression variables with the Lasso regression model includes removing abnormal values based on the residual estimates, as follows:

1) Computing the prediction residual

resid = y - y_pred

where y is the actual line loss rate and y_pred is the line loss rate fitted by the regression model.

2) Calculating the mean and standard deviation of the residual

mean_resid = mean(resid)

std_resid = std(resid)

3) Calculating the z-score

z_score = (resid - mean_resid) / std_resid

4) Outlier detection

|z_score| > 3

Line loss rate samples whose z-score has an absolute value greater than 3 are eliminated from the sample data.
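The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, and the use of the population standard deviation is an assumption, since the text only writes std(resid).

```python
from statistics import mean, pstdev


def keep_indices_after_outlier_removal(actual, predicted, threshold=3.0):
    """Return the indices of samples whose residual z-score is within threshold."""
    # 1) prediction residual
    resid = [a - p for a, p in zip(actual, predicted)]
    # 2) mean and standard deviation of the residual (population std assumed)
    mean_resid = mean(resid)
    std_resid = pstdev(resid)
    if std_resid == 0:
        # All residuals are identical, so nothing can be flagged.
        return list(range(len(resid)))
    keep = []
    for idx, r in enumerate(resid):
        # 3) z-score of this residual
        z_score = (r - mean_resid) / std_resid
        # 4) keep the sample unless |z_score| > threshold
        if abs(z_score) <= threshold:
            keep.append(idx)
    return keep
```

Note that with a population standard deviation a single point among n samples can reach a z-score of at most sqrt(n - 1), so the threshold of 3 can only flag anything when there are more than ten samples.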

Specifically, step S4 further includes normalizing the selected feature factors. Each feature factor is normalized to [0.1, 0.9] by the min-max normalization method. Assuming that there are N samples $\{x^{(n)}\}_{n=1}^{N}$, for each feature dimension x the normalized feature is

$$\hat{x}^{(n)} = 0.1 + 0.8 \cdot \frac{x^{(n)} - \min(x)}{\max(x) - \min(x)}$$

where $\min(x)$ and $\max(x)$ are the minimum and maximum values of the feature factor x over all samples, respectively.
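A minimal sketch of min-max scaling onto the stated interval [0.1, 0.9] follows. The function name and the handling of a constant feature (mapped to the midpoint) are assumptions; the text does not say how a feature with max(x) equal to min(x) is treated.

```python
def min_max_scale(values, low=0.1, high=0.9):
    """Linearly map values so that min(values) -> low and max(values) -> high."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant feature carries no information,
        # so it is mapped to the midpoint of the target interval.
        return [(low + high) / 2.0 for _ in values]
    span = x_max - x_min
    return [low + (high - low) * (v - x_min) / span for v in values]
```

For example, power supply radii of 100, 200 and 300 would map to roughly 0.1, 0.5 and 0.9, which puts them on the same scale as a load rate that already lies between 0 and 1.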

Specifically, constructing the line loss prediction model in step S5 takes the eight feature factors selected by the Lasso algorithm as inputs of the LSTM model and optimizes the network structure by iterative training, including the following steps:

1) normalizing the eight feature factors to serve as the input of the LSTM model, taking the actual line loss rate as the output of the model, and dividing the data into a training set and a test set;

2) setting the basic parameters of the LSTM deep learning model, including the activation function and the number of layers of the deep neural network, wherein the gradient optimization algorithm of the model is Adam;

3) training the model on the training data to optimize the model parameters, and fine-tuning the basic parameters of the network, including increasing the number of layers and changing the activation function, until the evaluation function reaches the desired range; the network parameters finally determined are 128 layers, the sigmoid activation function, and a dropout rate of 0.5 for the Dropout layer;

4) finally, using the trained model to predict the line loss rate and the upper bound of the reasonable interval for newly input feature factors.

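Specifically, the evaluation function of the LSTM model adopts RMSLE, whose formula is as follows:

$$\mathrm{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\big(\log(\hat{y}_i + 1) - \log(y_i + 1)\big)^2}$$

where $n$ is the number of samples, $y_i$ is the actual line loss value and $\hat{y}_i$ is the predicted line loss value.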

To strengthen the daily management of power supply enterprises, China has comprehensively implemented transformer-area-based management of the low-voltage distribution network as an important component of the "four-division" management of the power grid, and the line loss of a transformer area directly reflects the grid management level of a given region. However, the large number of users in a low-voltage transformer area, the variety of loads, the uneven management level at the grass-roots level of the grid, imperfect equipment ledger management and complex, varied line layouts all increase the complexity of transformer area line loss management. Calculating the transformer area line loss rate accurately and quickly under these conditions has become an urgent problem.

The traditional line loss calculation methods have the following problems. First, simplified algorithms such as the transformer area loss rate method and the voltage loss rate method have low calculation precision and cannot meet the requirements of lean management of the transformer area. Second, accurate algorithms such as the equivalent resistance method and the power flow method place high demands on the topology, equipment parameters and operating data of the transformer area; because the distribution network is large, carrying out field measurements across transformer areas on a large scale involves a very heavy workload, so the overall line loss situation is difficult to grasp. Third, with the development of new energy, the connection of 380 V photovoltaic and similar sources within transformer areas has become common, and the existing line loss calculation methods cannot keep up with this development.

Machine learning algorithms have greatly improved the accuracy of line loss calculation and prediction, but they have certain limitations. With the calculation method based on a support vector machine, model training is inefficient when the data volume is large, and a suitable kernel function is difficult to find when the feature space has many dimensions. With the calculation method based on K-means clustering and a BP neural network and the calculation method based on kmeans-LightGBM, the initial centers of the clustering algorithm are selected randomly, so different initial values can lead to different classification results, and inaccurate classification easily causes large parameter reconstruction errors.

The invention has the following beneficial effects: 1) feature selection by Lasso regression optimizes the model input and improves the operating efficiency and stability of the model; 2) a deep learning model is constructed for line loss prediction; it has excellent nonlinear function approximation capability and can mine the deep relationships between the feature factors and the line loss rate, so the prediction result is more reasonable; 3) RMSLE replaces the traditional RMSE for model evaluation; RMSLE penalizes under-prediction more heavily than over-prediction, which effectively addresses the long tail in the distribution of the line loss rate and makes the prediction result more reasonable and accurate; 4) the upper bound of the reasonable interval of the predicted line loss is estimated from the residual between the predicted and actual line loss.

Marketing line loss management covers the widest range of users, the largest number of devices and the largest data scale. With the method provided by the invention, an accurate line loss prediction model improves the ability to target loss reduction and supports lean management of line loss, increasing the company's benefit by about one hundred million yuan per year.

Drawings

Fig. 1 is a scatter plot of the line loss rate against the load characteristic.

Fig. 2 is a scatter plot of the line loss rate against the head-to-end voltage drop.

Fig. 3 is a schematic diagram of the LSTM structure.

Fig. 4 is a comparison between the predicted and actual line loss rates of the transformer areas.

Detailed Description

The technical solution of the present invention is described in detail below with reference to the accompanying drawings and the detailed description.

Example 1

The method for predicting the low-voltage transformer area line loss reasonable interval based on deep learning comprises the following steps:

S1, collecting data from four systems, namely the acquisition, marketing, PMS and GIS systems, wherein the data involved comprises distribution transformer side data and user side data; the distribution transformer side data comprises distribution transformer capacity, voltage, current, power and electric quantity, and the user side data comprises user capacity, daily electric quantity and user coordinates;

S2, constructing indexes, namely selecting line loss influence factors from the data collected in S1; twenty-four line loss influence factors are selected from five categories: power supply indexes, grid structure indexes, electric quantity indexes, operation indexes and capacity indexes of the transformer area;

S3, constructing features, namely analyzing the correlation between the selected influence factors and the line loss rate, and on that basis taking the square of the load characteristic and the absolute value of the head-to-end voltage drop as feature factors input to the model. The correlation analysis of the 24 influence factors shows that some of them have a certain linear correlation with the line loss rate: for example, the on-grid electricity ratio and the power factor are on the whole negatively correlated with the line loss rate, while the power supply radius, the end-user electricity ratio, the load rate and the three-phase imbalance are on the whole positively correlated with it. Some influence factors are nonlinearly correlated with the line loss rate. For example, Fig. 1 is a scatter plot of the line loss rate against the load characteristic and Fig. 2 is a scatter plot of the line loss rate against the head-to-end voltage drop: the line loss rate is linearly related to the square of the load characteristic and positively linearly related to the absolute value of the head-to-end voltage drop. Therefore, the square of the load characteristic and the absolute value of the head-to-end voltage drop are used as feature inputs of the model;

S4, selecting feature factors, namely constructing a Lasso regression model from 22 statistical index factors and the two constructed feature factors, and, according to the Lasso regression variable screening result, finally selecting eight factors as the final feature factors: the on-grid electricity ratio, the power supply radius, the end-user electricity ratio, the power factor of the transformer area, the absolute value of the head-to-end voltage drop, the load rate, the square of the load characteristic and the three-phase imbalance coefficient. Too many influence factors in model training reduce the training efficiency and the stability of the model, so the invention adopts Lasso to screen the feature factors. Lasso is one of the widely used methods for parameter estimation and variable selection, and Lasso variable selection has been proved to be consistent under certain conditions. Lasso is a regularization method that performs parameter estimation and variable selection simultaneously, and the parameter estimation is defined as follows:

$$\hat{\beta}_{lasso} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\}$$

In the formula, $y_i$ is the line loss rate of sample $i$, $x_{ij}$ is the value of factor $j$ for sample $i$, $\lambda$ is a non-negative regularization parameter, and $\lambda \sum_{j=1}^{p} |\beta_j|$ is the penalty term;

S6, displaying the model prediction effect.
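The feature construction in step S3 can be illustrated with a small sketch: a signed voltage drop may show almost no linear correlation with the line loss rate, while its absolute value correlates strongly. The function names are hypothetical and any numbers fed to them here would be synthetic; this is not the correlation analysis actually performed on the 24 factors.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def build_features(load_characteristic, voltage_drop):
    """Return (squared load characteristic, absolute head-to-end voltage drop)."""
    load_squared = [v ** 2 for v in load_characteristic]
    abs_voltage_drop = [abs(v) for v in voltage_drop]
    return load_squared, abs_voltage_drop
```

Comparing `pearson(voltage_drop, loss_rate)` with `pearson(abs_voltage_drop, loss_rate)` on a given data set shows whether the constructed feature is the better linear predictor.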

The twenty-four transformer area line loss characteristic index factors in step S2 are as follows:

Power supply index: on-grid electricity ratio.

Capacity indexes: distribution transformer capacity, total single-phase user capacity, total three-phase user capacity, user capacity ratio, single-phase user capacity percentage and three-phase user capacity percentage.

Grid structure indexes: power supply radius, grid structure and average per-household power supply length.

Electric quantity indexes: total daily power supply, total daily consumption of single-phase users, total daily consumption of three-phase users, single-phase user consumption percentage, three-phase user consumption percentage and end-user electricity ratio.

Operation indexes: power factor, average bus voltage, head-to-end voltage drop, average load rate, maximum load rate, load characteristic, three-phase imbalance and maximum three-phase imbalance.

Abnormal values may affect the stability of the model, so the screening of regression variables with the Lasso regression model in step S4 includes removing abnormal values based on the residual estimates, as follows:

1) Computing the prediction residual

resid = y - y_pred

where y is the actual line loss rate and y_pred is the line loss rate fitted by the regression model.

2) Calculating the mean and standard deviation of the residual

mean_resid = mean(resid)

std_resid = std(resid)

3) Calculating the z-score

z_score = (resid - mean_resid) / std_resid

4) Outlier detection

|z_score| > 3

Example 2

In general, the dimensions of a sample's original features differ in source and unit of measurement, and the ranges of the feature values often differ greatly; the power supply radius and the load rate, for example, have very different scales. When the Euclidean distance between samples is calculated, features with a large value range dominate. This affects the convergence speed, stability and accuracy of the deep learning model built later, so the input feature factors need to be normalized.

This embodiment differs from Example 1 in that the selected feature factors are normalized in step S4. This embodiment adopts the min-max normalization method to normalize each feature factor to [0.1, 0.9]. Assuming that there are N samples $\{x^{(n)}\}_{n=1}^{N}$, for each feature dimension x the normalized feature is

$$\hat{x}^{(n)} = 0.1 + 0.8 \cdot \frac{x^{(n)} - \min(x)}{\max(x) - \min(x)}$$

where $\min(x)$ and $\max(x)$ are the minimum and maximum values of the feature factor x over all samples, respectively.

Example 3

In constructing the line loss prediction model in step S5, the eight feature factors selected by the Lasso algorithm are used as the input of the LSTM model, and the network structure is optimized by iterative training.

LSTM is a recurrent neural network. The LSTM layer is a variant of the SimpleRNN layer; the algorithm was developed by Hochreiter and Schmidhuber in 1997 and effectively solves the vanishing gradient problem of the simple recurrent neural network (SimpleRNN). It does so by adding a way to carry information across multiple time steps.

The basic principle is as follows. Imagine a conveyor belt running parallel to the time sequence. Information from the sequence can jump onto the conveyor belt at any time step, be carried to a later time step, and jump off intact when it is needed. This is the basic principle of LSTM: it saves information for later use, thereby preventing earlier signals from gradually fading during processing. The structure is shown in Fig. 3, in which h(t) is the short-term state; c(t) is the long-term state; g(t) is the output of the main layer, whose basic role is to analyze the current input x(t) and the previous short-term state h(t-1); f(t) controls which parts of the long-term state should be discarded; i(t) controls which parts of g(t) are added to the long-term state; and o(t) controls which parts of the long-term state should be read and output at this time step.

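The formulas of the LSTM are as follows:

$$i_{(t)} = \sigma\left(W_{xi}^{T} x_{(t)} + W_{hi}^{T} h_{(t-1)} + b_i\right)$$

$$f_{(t)} = \sigma\left(W_{xf}^{T} x_{(t)} + W_{hf}^{T} h_{(t-1)} + b_f\right)$$

$$o_{(t)} = \sigma\left(W_{xo}^{T} x_{(t)} + W_{ho}^{T} h_{(t-1)} + b_o\right)$$

$$g_{(t)} = \tanh\left(W_{xg}^{T} x_{(t)} + W_{hg}^{T} h_{(t-1)} + b_g\right)$$

where $W_{xi}, W_{xf}, W_{xo}, W_{xg}$ are the weight matrices of each of the four layers for its connection to the input vector $x_{(t)}$; $W_{hi}, W_{hf}, W_{ho}, W_{hg}$ are the weight matrices of each layer for its connection to the previous short-term state $h_{(t-1)}$; and $b_i, b_f, b_o, b_g$ are the bias terms of each layer.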
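To make the gate equations concrete, here is a minimal pure-Python sketch of a single LSTM time step. The four gate computations follow the formulas above. The two state updates at the end, for c(t) and h(t), are the standard LSTM formulation and are an assumption here, because the text does not print them. The function names and the parameter dictionary layout are hypothetical.

```python
from math import exp, tanh


def sigmoid(z):
    """Logistic function used by the input, forget and output gates."""
    return 1.0 / (1.0 + exp(-z))


def affine(W_x, x, W_h, h, b):
    """Compute W_x^T x + W_h^T h + b for weights stored as [inputs][units]."""
    return [
        sum(W_x[k][u] * x[k] for k in range(len(x)))
        + sum(W_h[k][u] * h[k] for k in range(len(h)))
        + b[u]
        for u in range(len(b))
    ]


def lstm_step(x_t, h_prev, c_prev, p):
    """Advance one time step; p maps names such as "W_xi" to weights."""
    i_t = [sigmoid(z) for z in affine(p["W_xi"], x_t, p["W_hi"], h_prev, p["b_i"])]
    f_t = [sigmoid(z) for z in affine(p["W_xf"], x_t, p["W_hf"], h_prev, p["b_f"])]
    o_t = [sigmoid(z) for z in affine(p["W_xo"], x_t, p["W_ho"], h_prev, p["b_o"])]
    g_t = [tanh(z) for z in affine(p["W_xg"], x_t, p["W_hg"], h_prev, p["b_g"])]
    # Assumed standard updates: forget part of the old long-term state,
    # add the gated candidate, then read out a gated short-term state.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate g(t) to 0, so the long-term state is simply halved at each step, which is an easy sanity check for an implementation.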

Example 4

Line loss prediction is a regression problem. For regression algorithms, model performance is usually evaluated with MSE, RMSE, MAPE, R² and the like. However, the actual line loss rate is the target variable for model training, and analysis of its distribution shows that it is not symmetrically distributed and has a certain long tail, so RMSE is not the best choice. In this embodiment, RMSLE is adopted as the evaluation function of the LSTM model, and its formula is as follows:

$$\mathrm{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\big(\log(\hat{y}_i + 1) - \log(y_i + 1)\big)^2}$$

where $n$ is the number of samples, $y_i$ is the actual line loss value and $\hat{y}_i$ is the predicted line loss value.

Because the actual line loss rate has a certain long tail, an evaluation based on RMSE is dominated by abnormally large values: even if many small values are predicted accurately, RMSE is large whenever a few abnormally large values are predicted poorly. RMSLE takes logarithms first and then calculates the RMSE, which effectively mitigates this problem.
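The difference between the two metrics can be checked with a short sketch. The function names are hypothetical, and the code assumes non-negative line loss rates so that the logarithm is defined; any numbers used with it here would be synthetic, not taken from the patent's data.

```python
from math import log1p, sqrt


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def rmsle(actual, predicted):
    """Root mean squared logarithmic error: RMSE computed on log(1 + value)."""
    n = len(actual)
    return sqrt(
        sum((log1p(p) - log1p(a)) ** 2 for a, p in zip(actual, predicted)) / n
    )
```

Two properties are worth trying on sample values. A single badly predicted large value inflates `rmse` far more than `rmsle`. For the same absolute error, `rmsle` is larger when the prediction falls below the actual value than when it falls above it, which is the asymmetry between under-prediction and over-prediction described in the beneficial effects.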

Example 5

Twenty transformer areas in a certain city were randomly selected, and the relationship between their actual and predicted line loss rates over 20 days was analyzed. In Fig. 4, blue is the actual line loss rate and yellow is the predicted line loss rate. As can be seen from Fig. 4, most of the predicted values are concentrated in the middle of the range of actual line loss rates and follow a distribution similar to that of the actual values, which indicates that the model predicts well.

The upper bound of the prediction interval is estimated from the residual between the predicted and actual line loss, namely the RMSE (root mean square error) of the trained model, and the interval is expanded according to the 3σ principle: an expansion of 1 times corresponds to 68.3%, 1.5 times to 93.32%, 2 times to 95.4% and 3 times to 99.7%. The method adopts an expansion of 1.5 times. The RMSE calculation formula is as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

Therefore, the upper bound of the prediction interval of the present invention is

$$\hat{y}_{upper} = \hat{y} + 1.5 \cdot \mathrm{RMSE}$$

where $\hat{y}_{upper}$ is the upper bound of the reasonable interval, $\hat{y}$ is the predicted line loss value, and RMSE is the root mean square error obtained from model training.
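The interval estimate and the early-warning check it enables can be sketched as follows. The function names are hypothetical, and the warning rule (flag a transformer area whose actual line loss rate exceeds the upper bound) is inferred from the stated purpose of the method rather than quoted from it.

```python
from math import sqrt


def training_rmse(actual, predicted):
    """RMSE between actual and predicted line loss rates on the training data."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def reasonable_upper_bound(predicted_rate, rmse_value, k=1.5):
    """Upper bound of the reasonable interval: prediction plus k times RMSE."""
    return predicted_rate + k * rmse_value


def exceeds_reasonable_interval(actual_rate, predicted_rate, rmse_value, k=1.5):
    """True if the actual line loss rate lies above the reasonable upper bound."""
    return actual_rate > reasonable_upper_bound(predicted_rate, rmse_value, k)
```

With a training RMSE of 1.0 and a predicted rate of 4.0, for instance, the upper bound is 5.5, so an observed rate of 6.0 would be flagged while 5.0 would not. The factor k is exposed as a parameter so that other expansion multiples can be compared.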

The method provided by the invention was used to build a transformer area line loss prediction model. Modeling and analysis were carried out on one month of data from a certain city, covering more than 10,000 transformer areas and more than 300,000 samples. On-site verification showed a model accuracy of more than 90%, which verifies the feasibility and effectiveness of the method.

The method provided by the invention differs from other machine learning methods in the following respects: 1) the line loss of the transformer area is predicted with a deep learning algorithm, and the prediction accuracy is significantly improved; 2) the processes of feature index construction and feature selection are novel and reasonable in design; 3) the screening of regression variables with the Lasso regression model includes removing abnormal values based on the residual estimates, which ensures the stability of the model; 4) model evaluation adopts RMSLE instead of a traditional evaluation strategy, which effectively addresses the long-tail effect; 5) the upper bound of the interval is expanded by 1.5 times the RMSE.

Finally, it should be noted that the above examples only illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art will understand that the specific embodiments may be modified, or some technical features equivalently replaced, without departing from the spirit of the present invention, and that all such modifications and replacements are intended to be covered by the scope of the appended claims.

Claims

1. A low-voltage transformer area line loss reasonable interval prediction method based on deep learning is characterized by comprising the following steps:

and S6, displaying the model prediction effect.

2. The method for predicting the reasonable line loss interval of the low voltage transformer area based on the deep learning of claim 1, wherein in the step S1, the distribution side data includes distribution capacity, voltage, current, power and electric quantity, and the user side data includes user capacity, daily electric quantity and user coordinates.

3. The low-voltage transformer area line loss reasonable interval prediction method based on deep learning of claim 1, wherein twenty-four transformer area line loss influence factors are selected from five categories of a transformer area power supply index, a grid frame index, an electric quantity index, an operation index and a capacity index in step S2.

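4. The method for predicting the reasonable line loss interval of the low-voltage transformer area based on the deep learning of claim 1, wherein the Lasso regression model in the step S4 is a regularization method that performs parameter estimation and variable selection simultaneously, and the parameter estimation is defined as follows:

$$\hat{\beta}_{lasso} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\}$$

in the formula, $y_i$ is the line loss rate of sample $i$, $x_{ij}$ is the value of factor $j$ for sample $i$, $\lambda$ is a non-negative regularization parameter, and $\lambda \sum_{j=1}^{p} |\beta_j|$ is the penalty term.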

5. The method for predicting the reasonable line loss interval of the low-voltage transformer area based on the deep learning of claim 1, wherein the step S4 of screening the regression variables with the Lasso regression model includes removing abnormal values based on the residual estimates, comprising the following steps:

1) Computing the prediction residual

resid = y - y_pred

where y is the actual line loss rate and y_pred is the line loss rate fitted by the regression model.

2) Calculating the mean and standard deviation of the residual

mean_resid = mean(resid)

std_resid = std(resid)

3) Calculating the z-score

z_score = (resid - mean_resid) / std_resid

4) Outlier detection

|z_score| > 3

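6. The method according to claim 1, wherein the step S4 further includes normalizing the selected feature factors by a min-max normalization method, each feature factor being normalized to [0.1, 0.9]; assuming that there are N samples $\{x^{(n)}\}_{n=1}^{N}$, for each feature dimension x the normalized feature is

$$\hat{x}^{(n)} = 0.1 + 0.8 \cdot \frac{x^{(n)} - \min(x)}{\max(x) - \min(x)}$$

where $\min(x)$ and $\max(x)$ are the minimum and maximum values of the feature factor x over all samples, respectively.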

7. The low-voltage transformer area line loss reasonable interval prediction method based on deep learning of claim 1, wherein the line loss prediction model constructed in the step S5 adopts a LASSO algorithm to select eight characteristic factors as input of an LSTM model, and a training network structure is optimized through loop iteration, comprising the following steps:

8. The method for predicting the reasonable line loss interval of the low-voltage transformer area based on the deep learning of claim 7, wherein the evaluation function of the LSTM model adopts RMSLE, and the formula is as follows:

$$\mathrm{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\big(\log(\hat{y}_i + 1) - \log(y_i + 1)\big)^2}$$

where $n$ is the number of samples, $y_i$ is the actual line loss value and $\hat{y}_i$ is the predicted line loss value.