CN112488395B - Method and system for predicting line loss of power distribution network

Info

Publication number: CN112488395B
Application number: CN202011386440.7A
Authority: CN
Inventors: 李勇; 郭钇秀; 乔学博; 周王峰; 段义隆
Original assignee: Hunan University
Current assignee: Hunan University
Priority date: 2020-12-01
Filing date: 2020-12-01
Publication date: 2024-04-05
Anticipated expiration: 2040-12-01
Also published as: CN112488395A

Abstract

Embodiments of the invention provide a method and a system for predicting line loss of a power distribution network. Time series data of each line and each station area in the power distribution network are acquired and cleaned: abnormal data in the time series are detected and removed by an outlier detection method, and an interpolation-improved random forest model is established to fill the missing data. Based on the variation pattern of each time series feature, the maximum mutual information coefficient (MIC) between each feature and the line loss data is calculated, and the features most strongly correlated with the line loss are selected as input features of the line loss prediction model. Based on the line loss time series of each station area, line loss data with similar characteristics are clustered by the K-means method and a data set is built for each class of line loss. A long short-term memory (LSTM) neural network prediction model is then established and trained on the training samples to obtain the line loss prediction model. The method can improve the accuracy of short-term line loss prediction for the power distribution network, and thereby guide distribution line loss management and efficient operation.

Description

Method and system for predicting line loss of power distribution network

Technical Field

The embodiment of the application relates to the technical field of analysis and management of power distribution network line loss, in particular to a power distribution network line loss prediction method and system.

Background

In recent years, as the amount of power equipment has grown, the scale of China's power distribution network has expanded continuously. With the continued growth of load, however, distribution losses have become increasingly serious. Higher distribution network line loss requires more generation and transmission capacity, which raises the cost of electricity and wastes power resources. The structure of China's distribution network has developed in an unbalanced way, and distribution loss management lacks guidance. At present, losses on medium- and low-voltage distribution lines account for about 50% of the losses of all power lines. Line loss therefore poses a serious challenge to the economic operation of the distribution network, which at the same time has great potential for energy saving and loss reduction.

The informatization of the smart grid and the widespread deployment of smart meters in the distribution network provide ample data for calculating and analyzing distribution network line loss. However, because line loss data are complex and fluctuating, traditional line loss analysis and prediction methods depend heavily on the grid structure of the distribution network and cannot mine the intrinsic patterns linking the mass of data to the line loss. At present, line loss management only calculates the line loss and compiles simple statistics, so the important features affecting the line loss and its future trend cannot be obtained, and abnormal line loss cannot be discovered and handled in time. With the rapid development of big data and artificial intelligence, statistical and artificial intelligence algorithms are gradually being applied to the analysis and evaluation of distribution network line loss. Their nonlinear fitting capability is used to mine the patterns of change in the line loss data and to predict how the line loss will change over a future period, which addresses the high computational complexity of traditional methods and their dependence on the grid structure.

By analyzing and predicting the line loss of the distribution network, a power enterprise can grasp the key factors that influence line loss during operation, obtain assistance in monitoring abnormal line loss, and guide and evaluate energy-saving and loss-reduction work. Research on the patterns of distribution network line loss is therefore of great significance for grid planning and operation, equipment maintenance and benefit improvement.

With the development of artificial intelligence, research on the analysis, mining and prediction of distribution network line loss data is gradually receiving attention in China. Mature methods exist for distribution network load prediction, but research on line loss prediction is still at an early stage. For line loss analysis, existing intelligent algorithms mine line loss correlations through Pearson correlation and gray correlation analysis. The Pearson correlation coefficient is easy to calculate but can only measure variables that are linearly related to the line loss. Gray correlation analysis judges correlation from how the variables and the line loss change together, but the optimal value of its index is difficult to determine and its objectivity is poor. For line loss prediction, traditional methods use simple mathematical statistics and least-squares regression, which give low prediction accuracy and cannot deeply mine the internal patterns of the line loss. In addition, researchers have gradually proposed predicting distribution network line loss with machine learning algorithms such as support vector machines and BP neural networks. These can fit future line loss changes to a certain extent, but the models do not exploit the dependence between earlier and later points of the time series, so the prediction accuracy still needs to be improved.

Disclosure of Invention

The main purpose of the embodiments of the application is to provide a power distribution network line loss prediction method and system that improve the accuracy of short-term line loss prediction for the power distribution network, so as to guide distribution line loss management and efficient operation.

In order to solve the above problems, in a first aspect, an embodiment of the present invention provides a method for predicting a line loss of a power distribution network, including:

Step S1, extracting, from historical data of a power distribution network, a line loss data set comprising a plurality of features, wherein the features comprise station area line loss, station area load active power, station area load reactive power, the number of station areas in operation, holiday data, temperature, humidity and wind speed;

Step S2, carrying out correlation analysis on the line loss data set, and extracting the features with the greatest correlation as prediction input variables;

Step S3, clustering the prediction input variables according to their characteristics to obtain data sample sets for a plurality of classes of line loss, and dividing the data sample set of each class of line loss into training samples and test samples;

Step S4, training a long short-term memory (LSTM) neural network on the training samples to obtain a line loss prediction model for predicting each class of power distribution network line loss;

Step S5, verifying the line loss prediction model on the test samples.

Preferably, the step S1 specifically includes:

Step S11, acquiring historical measurement data of the line-loss-related features required for power distribution network prediction;

Step S12, cleaning the time series data of each feature: detecting and removing abnormal data by an outlier detection method, and filling the missing data by an interpolation-improved random forest method, so as to obtain complete time series data of each feature and construct a time series sample set for each feature.

Preferably, the step S2 specifically includes:

Step S21, selecting line loss data containing N moments from the line loss data set obtained in step S1 as the reference object sequence of the correlation analysis;

Step S22, according to the time-sequence relation with the reference object sequence, selecting as comparison object sequences: the historical line loss data of the preceding 1 to 24 hours; the historical line loss data of the preceding 1 to 7 days; the historical line loss data at the same moment 1 year earlier and 2 years earlier; the historical load active power data of the preceding 1 to 3 days and of the preceding 1 to 3 hours; the week, month, quarter, holiday and number of station areas in operation corresponding to the same moment; and the temperature, humidity and wind speed at the same moment. Each selected sequence has the same length as the corresponding line loss data, that is, N moments, and each comparison object sequence corresponds to one influencing feature of the line loss;

Step S23, calculating the maximum mutual information coefficient (MIC) between the reference object sequence and each comparison object sequence, analyzing the correlation between each feature and the line loss, and selecting the optimal M features as the input features of the power distribution network line loss prediction model.

Preferably, the step S3 specifically includes:

Step S31, letting the cluster number i vary from 2 to 10, inputting the P station-area line loss data samples obtained in step S1, calculating the K-means sum of squared errors (SSE) for each cluster number i, plotting the SSE against the cluster number, and determining the optimal cluster number K of the station-area line loss from the inflection point of the plot according to the elbow rule;

Step S32, taking the cluster number K obtained in step S31, setting the initial cluster centers of the K-means algorithm, calculating the Euclidean distances between the P station-area line loss data samples and the K initial cluster centers, and assigning each of the P samples to its nearest center to form K clusters; calculating the mean of all objects in each cluster and taking the K means as the new cluster centers; then iterating and recalculating the SSE. When the SSE converges to a limit value, the algorithm has stabilized and the cluster centers essentially no longer change; K-means clustering is then complete and the P station-area line loss data samples have been clustered into K classes;

Step S33, adding together the line loss data samples of each class of station area obtained in step S32, constructing the corresponding data sample set for each class of station-area line loss according to step S23, and normalizing each data sample set;

Step S34, dividing each normalized data sample set by number of samples, with 70% of the samples used as the training sample set and 30% as the test sample set.
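The normalization and 70/30 division of steps S33 and S34 can be sketched in Python as follows. This is a minimal sketch under assumptions: the function names are hypothetical, the split is taken in time order (the text only specifies the proportions), a constant feature is mapped to 0, and the sample values are illustrative only.

```python
def min_max_normalize(values):
    """Scale numbers to [0, 1] with x' = (x - x_min) / (x_max - x_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: assumed convention, map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def split_train_test(samples, train_fraction=0.7):
    """Keep the first 70 % of the samples for training, the rest for testing."""
    cut = int(round(len(samples) * train_fraction))
    return samples[:cut], samples[cut:]


# Illustrative line loss values, not measured data.
losses = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 3.0, 5.0, 7.0, 9.0]
scaled = min_max_normalize(losses)
train, test = split_train_test(scaled)
print(len(train), len(test))
```

In practice the minimum and maximum would usually be taken from the training portion only, so that the test samples do not influence the scaling; the text does not say which convention is used.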

Preferably, the step S4 specifically includes:

Step S41, constructing K long short-term memory (LSTM) neural network line loss prediction models according to the number of line loss clusters, wherein the hyperparameters of the models comprise the number of hidden layers of the neural network, the number of neurons in each layer and the learning rate;

Step S42, taking the training sample set of each class of line loss as the input data of the corresponding neural network prediction model and the corresponding line loss prediction samples as the output data, training the LSTM prediction model offline, determining the optimal hyperparameters of the network, and iteratively training the LSTM network parameters to obtain the line loss prediction model for each class of line loss.

Preferably, the selecting the optimal M features as input features of the power distribution network line loss prediction model specifically includes:

Ranking the features by their maximum mutual information coefficients, selecting features in descending order of coefficient to obtain several groups of selected features, inputting each group into the line loss prediction model to predict the line loss, and selecting the group of M features with the smallest average line loss prediction error as the optimal input features of the power distribution network line loss prediction model.

Preferably, the calculating the maximum mutual information coefficient between the reference object sequence and each comparison object sequence specifically includes:

The correlation between each variable and the line loss is calculated by the maximum mutual information coefficient method, and the calculation formula of the maximum mutual information coefficient MIC is as follows:
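In its commonly used form, written with the symbols defined below, the coefficient can be stated as shown here. The mutual information I(x; y) and the grid-size bound B(n) are notation introduced for this sketch and are not given in the text; B(n) = n^{0.6} is a common choice.

$$ I(x; y) = \sum_{x, y} p(x, y) \log_2 \frac{p(x, y)}{p(x)\, p(y)} $$

$$ \mathrm{MIC}(x, y) = \max_{a \cdot b < B(n)} \frac{I(x; y)}{\log_2 \min(a, b)} $$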

wherein x represents the data of the comparison object, namely the feature data being analyzed; y represents the data of the reference object, namely the line loss data; n is the number of points in the data sample; p denotes a probability; a is the number of grid cells into which the two-dimensional space is divided in the x direction and b is the number in the y direction; and MIC(x, y) is the maximum mutual information coefficient between the comparison data x and the reference data y. Its value ranges from 0 to 1: the larger the coefficient, the stronger the correlation between x and y, and the smaller the coefficient, the weaker the correlation.

In a second aspect, an embodiment of the present invention provides a power distribution network line loss prediction system, including:

the system comprises an acquisition module, a storage module and a control module, wherein the acquisition module is configured to extract, from historical data of the power distribution network, a line loss data set comprising a plurality of features, the features comprising station area line loss, station area load active power, station area load reactive power, the number of station areas in operation, holiday data, temperature, humidity and wind speed;

a variable determining module, configured to carry out correlation analysis on the line loss data set and extract the features with the greatest correlation as prediction input variables;

a sample set module, configured to cluster the prediction input variables according to their characteristics to obtain data sample sets for a plurality of classes of line loss, and to divide the data sample set of each class of line loss into training samples and test samples;

a training module, configured to train a long short-term memory (LSTM) neural network on the training samples to obtain a line loss prediction model for predicting each class of power distribution network line loss;

a verification module, configured to verify the line loss prediction model on the test samples.

In a third aspect, an embodiment of the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor executes the program to implement the steps of the method for predicting line loss of a power distribution network according to the embodiment of the first aspect of the present invention.

In a fourth aspect, embodiments of the present invention provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a power distribution network line loss prediction method according to the embodiments of the first aspect of the present invention.

The embodiment of the invention has the beneficial effects that:

1. The power distribution network data are cleaned: abnormal line loss values are detected and removed and missing data are filled, which improves the data quality and provides the basic data for line loss prediction;

2. The selected time span contains more than 3 years of actual historical data, so the training samples cover line loss data over a long period and are suitable for line loss prediction in different seasons. Because line loss is a particular kind of time series, its future pattern of change can be mined comprehensively from data of the same period one and two years earlier and from recent data at different time scales;

3. The maximum mutual information coefficient analysis method is used to mine the association between the power distribution network line loss and the historical electrical and non-electrical feature quantities. The electrical and non-electrical feature quantities with the greatest correlation are screened out as input features of the line loss prediction model, which reduces the dimension of the model's input data and further improves prediction efficiency and accuracy;

4. The K-means clustering method is used to cluster the line loss data of each station area of the power distribution network by the similarity of their curve characteristics, so that line loss data with similar characteristics fall into one class. The pattern of change of each class of line loss is mined in depth and each class is modeled and predicted separately, which improves the accuracy of line loss prediction;

5. The power distribution network line loss prediction model is built on a long short-term memory (LSTM) neural network, which is well suited to time series data. The gate structure of its neurons passes on the useful information in the time series, and the self-learning capability of the model fully mines the patterns in the line loss data, so the prediction accuracy is high and the model is not prone to overfitting.

Drawings

One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements, and in which the figures of the drawings are not to be taken in a limiting sense, unless otherwise indicated.

FIG. 1 is a flowchart of a method for predicting line loss of a power distribution network according to an embodiment of the present invention;

FIG. 2 is a random forest power distribution network data filling matrix diagram according to an embodiment of the present invention;

FIG. 3 (a) is a graph showing the calculation result of the maximum mutual information coefficient of the historical network loss feature according to the embodiment of the invention;

FIG. 3 (b) is a graph of the maximum mutual information coefficient calculation of the historical load power characteristics according to an embodiment of the invention;

FIG. 3 (c) is a graph of the maximum mutual information coefficient calculation result of the time feature according to the embodiment of the present invention;

FIG. 3 (d) is a graph of the maximum mutual information coefficient calculation result of meteorological features according to an embodiment of the present invention;

FIG. 4 is a graph of clustered elbow according to an embodiment of the invention;

FIG. 5 is a schematic diagram of a long short-term memory (LSTM) neural network neuron according to an embodiment of the present invention;

FIG. 6 is a graph of prediction results for each type of network loss according to an embodiment of the present invention;

FIG. 7 is a graph showing the comparison of the line loss prediction results of different models according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of a server according to still another embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the embodiments of the present application will be described in detail below with reference to the accompanying drawings. However, as will be appreciated by those of ordinary skill in the art, in the various embodiments of the present application, numerous technical details have been set forth in order to provide a better understanding of the present application. However, the technical solutions claimed in the present application can be implemented without these technical details and with various changes and modifications based on the following embodiments. The following embodiments are divided for convenience of description, and should not be construed as limiting the specific implementation of the present application, and the embodiments may be mutually combined and referred to without contradiction.

In the embodiments of the present application, the term "and/or" merely describes an association between the associated objects and indicates that three relationships may exist. For example, A and/or B may indicate: A exists alone, A and B exist together, or B exists alone.

The terms "first", "second" in the embodiments of the present application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present application, the terms "comprise" and "have," along with any variations thereof, are intended to cover non-exclusive inclusions. For example, a system, article, or apparatus that comprises a list of elements is not limited to only those elements or units listed but may alternatively include other elements not listed or inherent to such article, or apparatus. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.

For line loss prediction, traditional methods use simple mathematical statistics and least-squares regression to analyze and predict the line loss of the power distribution network; they have low prediction accuracy and cannot deeply mine the internal patterns of the line loss. Researchers have proposed predicting distribution network line loss with machine learning algorithms such as support vector machines and BP neural networks. These can fit future line loss changes to a certain extent, but the models do not exploit the dependence between earlier and later points of the time series, so the prediction accuracy still needs to be improved.

Therefore, the embodiments of the invention provide a power distribution network line loss prediction method and system. The maximum mutual information coefficient analysis method is used to mine the association between the power distribution network line loss and the historical electrical and non-electrical feature quantities; the electrical and non-electrical feature quantities with the greatest correlation are screened out as input features of the line loss prediction model, which reduces the dimension of the model's input data and further improves prediction efficiency and accuracy. The K-means clustering method is used to cluster the line loss data of each station area by the similarity of their curve characteristics, so that line loss data with similar characteristics fall into one class; the pattern of change of each class of line loss is mined in depth and each class is modeled and predicted separately, which improves prediction accuracy. Various embodiments are described below.

FIG. 1 is a schematic diagram of a power distribution network line loss prediction method according to an embodiment of the present invention. The method is described in detail as follows:

Specifically, considering the influence of both short-term data and same-period data from different seasons on the line loss, this embodiment uses actual historical operation data of the power distribution network spanning three and a half years or longer for line loss prediction, and also selects non-electrical data, which strengthens the applicability of the line loss prediction model to different environments and different periods.

Specifically, because the original power distribution network data contain some abnormal and missing values, an outlier detection method is used to detect abnormal values: the local outlier factor (LOF) of each data point is calculated to detect abnormal power distribution network data. The local outlier factor is calculated as follows:
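In its standard form, written with the symbols defined below, the local outlier factor can be stated as shown here. The neighbor point P and the reachability distance d_k(O, P) are notation introduced for this sketch and do not appear in the text.

$$ \mathrm{LOF}_k(O) = \frac{1}{|N_k(O)|} \sum_{P \in N_k(O)} \frac{\rho_k(P)}{\rho_k(O)} $$

$$ \rho_k(O) = \left( \frac{1}{|N_k(O)|} \sum_{P \in N_k(O)} d_k(O, P) \right)^{-1} $$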

where LOF is the local outlier factor; $\rho_k$ is the local reachability density of a data point, that is, the reciprocal of the average reachability distance from all points in the neighborhood of O to O; k denotes the k-th neighborhood of point O; and $N_k(O)$ is the set of points in the neighborhood of O. The density reflects the spacing between points: the greater the distance, the lower the density, and the smaller the distance, the higher the density. A point whose density is much lower than that of its neighbors has a high LOF value, and the further the LOF value exceeds 1, the more likely the point is abnormal data.
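The local outlier factor calculation can be sketched in pure Python as follows. This is a brute-force illustration under assumptions: the function name, the choice k = 3 and the sample readings are hypothetical, ties at the k-th distance are ignored, and duplicate points (zero reachability distance) are not handled.

```python
import math


def local_outlier_factors(points, k=3):
    """Brute-force LOF for a list of equal-length tuples (needs len(points) > k)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])                # N_k(O): the k nearest points
        k_distance.append(dist[i][order[k - 1]])    # distance to the k-th neighbour

    def reach_dist(i, j):
        # Reachability distance from point i to its neighbour j.
        return max(k_distance[j], dist[i][j])

    # Local reachability density: reciprocal of the mean reachability distance.
    density = [
        k / sum(reach_dist(i, j) for j in neighbours[i])
        for i in range(n)
    ]
    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(density[j] for j in neighbours[i]) / (k * density[i])
        for i in range(n)
    ]


# Illustrative values only: five similar line loss readings and one spike.
readings = [(2.1,), (2.0,), (2.2,), (1.9,), (2.05,), (9.0,)]
lof = local_outlier_factors(readings, k=3)
suspect = max(range(len(readings)), key=lambda i: lof[i])
print(readings[suspect], round(lof[suspect], 1))
```

The readings that sit close together get LOF values near 1, while the isolated spike gets a much larger value, so a threshold above 1 (its exact value is a tuning choice not given in the text) separates the abnormal point.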

The missing data of the power distribution network are filled by an interpolation-improved random forest algorithm. The specific process is as follows:

First, the missing entries of the power distribution network data matrix X are counted and the daily data are traversed. Filling starts from the day with the fewest missing values; before that day is filled, the missing values of the other days are handled by linear interpolation, giving the matrix Xnew.

Then, a random forest filling model is trained on the training set Xtrain, with the training labels Ytrain taken from the non-missing part of the data. The trained model is applied to the set Xtest to predict, and thereby fill, the missing data Ytest. After each round of random forest filling, the data matrix of the power distribution network is updated with the filled values, and the loop continues with the next day that contains missing values. Each round reduces the number of days that still contain missing values, so less and less data has to be handled by interpolation. By the time the last day is reached, none of the other days needs linear interpolation any more: the random forest has already filled in a large amount of valid information, which is used to fill the day with the most missing values.

Finally, once all the data have been traversed, the power distribution network data matrix no longer contains missing values and the data filling is complete. The random forest data filling matrix of the power distribution network is shown in FIG. 2.
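The ordering and linear-interpolation parts of this procedure can be sketched as follows. This is only a partial illustration under assumptions: the function names and the small matrix are hypothetical, gaps are marked with None, interpolation runs along each day's own time axis, and the random forest regressor itself is left out (a library implementation could be substituted at the marked step).

```python
def linear_fill(series):
    """Fill None gaps by linear interpolation; edge gaps copy the nearest value."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def fill_order(daily_rows):
    """Indices of the days that contain gaps, fewest missing points first."""
    counts = [(sum(v is None for v in row), day) for day, row in enumerate(daily_rows)]
    return [day for missing, day in sorted(counts) if missing > 0]


# Illustrative matrix: one row per day, None marks a missing measurement.
days = [
    [1.0, None, 3.0, 4.0],      # one gap
    [2.0, 2.5, 3.0, 3.5],       # complete
    [None, None, 5.0, None],    # three gaps
]
order = fill_order(days)
for target in order:
    # Temporarily interpolate every other day so the model sees a complete matrix.
    helpers = [linear_fill(r) if d != target else r for d, r in enumerate(days)]
    # A random forest regressor would be trained on `helpers` here and its
    # predictions written into the gaps of days[target]; omitted in this sketch.
print(order)
print(linear_fill(days[0]))
```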

Data cleaning resolves the abnormal and missing values in the collected distribution network data that are caused by faults of the measuring equipment, noise interference, data transmission errors or abnormal power consumption, and provides high-quality basic data for the power distribution network line loss prediction model.

The line loss, load active power, load reactive power, temperature, humidity and wind speed data are collected by intelligent acquisition equipment at 15-minute intervals, that is, one data point every 15 minutes and 96 data points per day. The week, month, quarter and holiday data correspond to the time at which the intelligent acquisition equipment collected each data point.
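With 96 points per day, the lagged comparison sequences of step S22 translate into fixed offsets in the series. The following sketch shows one way to build them; the function and feature names are hypothetical, the 1-year and 2-year lags are omitted for brevity, and positions without enough history are marked None.

```python
POINTS_PER_HOUR = 4     # one sample every 15 minutes
POINTS_PER_DAY = 96


def lagged(series, lag):
    """Series delayed by `lag` steps; the first `lag` entries are None."""
    n = len(series)
    lag = min(lag, n)
    return [None] * lag + list(series[:n - lag])


def build_lag_features(line_loss, load_power):
    """Lagged line loss and load sequences keyed by descriptive names."""
    features = {}
    for h in range(1, 25):      # line loss 1 to 24 hours earlier
        features[f"loss_{h}h_ago"] = lagged(line_loss, h * POINTS_PER_HOUR)
    for d in range(1, 8):       # line loss 1 to 7 days earlier
        features[f"loss_{d}d_ago"] = lagged(line_loss, d * POINTS_PER_DAY)
    for h in range(1, 4):       # load active power 1 to 3 hours earlier
        features[f"load_{h}h_ago"] = lagged(load_power, h * POINTS_PER_HOUR)
    for d in range(1, 4):       # load active power 1 to 3 days earlier
        features[f"load_{d}d_ago"] = lagged(load_power, d * POINTS_PER_DAY)
    return features


# Illustrative series: eight days of synthetic values.
loss = [float(t) for t in range(8 * POINTS_PER_DAY)]
load = [2.0 * v for v in loss]
features = build_lag_features(loss, load)
print(len(features))
```

Every sequence keeps the length of the original series, which matches the requirement that each comparison sequence cover the same N moments as the reference line loss sequence.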

The correlation between each variable and the line loss is calculated from joint probabilities by the maximum mutual information coefficient (MIC) method, with the MIC defined as in the disclosure above.
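A simplified estimate of this coefficient can be sketched in pure Python as follows. It is an approximation under stated assumptions, not the full MIC algorithm: the grids are equal-frequency bins rather than the optimized partitions the full method searches over, the grid-size bound n^0.6 is a commonly used default that the text does not specify, and all names and data are illustrative.

```python
import math
from collections import Counter


def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index 0..n_bins-1 by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def mutual_information(bx, by):
    """Mutual information, in bits, of two sequences of bin labels."""
    n = len(bx)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log2(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def approx_mic(x, y, exponent=0.6):
    """Largest normalised mutual information over a-by-b grids with a*b < n**exponent."""
    budget = len(x) ** exponent     # assumed grid-size bound B(n)
    best = 0.0
    a = 2
    while a * 2 < budget:
        b = 2
        while a * b < budget:
            mi = mutual_information(equal_frequency_bins(x, a),
                                    equal_frequency_bins(y, b))
            best = max(best, mi / math.log2(min(a, b)))
            b += 1
        a += 1
    return best


x = list(range(200))
y_monotonic = [v * v for v in x]              # strong nonlinear dependence
y_scrambled = [(v * 37) % 200 for v in x]     # values spread evenly over the range
strong = approx_mic(x, y_monotonic)
weak = approx_mic(x, y_scrambled)
print(round(strong, 3), round(weak, 3))
```

The monotonic but nonlinear pair scores close to 1, which illustrates the advantage over a purely linear measure, while the scrambled pair scores close to 0.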

The M optimal input features of the power distribution network line loss prediction model are selected as follows: the features are ranked by their maximum mutual information coefficients and selected in descending order of coefficient to form several groups of selected features; each group is input into the line loss prediction model to predict the line loss; and the group of M features with the smallest average line loss prediction error is taken as the optimal input features of the model.
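This selection loop can be sketched as follows. It is a minimal sketch under assumptions: `evaluate_error` is a hypothetical callback that would train and test the prediction model on a feature group and return its average error, and the scores and the toy error function below are made up purely to show the control flow.

```python
def select_features(mic_scores, evaluate_error, group_sizes):
    """Return the top-M feature group with the lowest prediction error.

    mic_scores: dict mapping feature name to its MIC with the line loss
    evaluate_error: callable taking a list of names and returning a mean error
    group_sizes: candidate values of M to try
    """
    ranked = sorted(mic_scores, key=mic_scores.get, reverse=True)
    best_group, best_error = None, float("inf")
    for m in group_sizes:
        group = ranked[:m]
        error = evaluate_error(group)
        if error < best_error:
            best_group, best_error = group, error
    return best_group, best_error


# Made-up scores for illustration only.
scores = {"loss_1h_ago": 0.9, "loss_1d_ago": 0.8, "load_1h_ago": 0.7,
          "temperature": 0.4, "wind_speed": 0.1}


def toy_error(group):
    # Stand-in for training and testing the model: pretends 3 features is best.
    return abs(len(group) - 3) + 0.1


chosen, error = select_features(scores, toy_error, group_sizes=range(1, 6))
print(chosen, error)
```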

The principle of the elbow rule is as follows. As the number of clusters K increases, the sample data are divided more finely and each cluster becomes more compact, so the sum of squared errors (SSE) gradually decreases. While K is smaller than the optimal number of clusters, increasing K greatly increases the compactness of each cluster, so the SSE falls sharply. Once K reaches the optimal number of clusters, further increases in K bring a much smaller gain and the decline of the SSE levels off, so the curve takes the shape of an elbow. The K value at the elbow, up to which the error falls fastest, is generally taken as the optimal number of clusters. The K-means sum of squared errors is calculated as follows:
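In its standard form, written with the symbols defined below, the sum of squared errors is:

$$ \mathrm{SSE} = \sum_{i=1}^{k} \sum_{x_a \in c_i} \lVert x_a - m_i \rVert^2 $$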

Wherein k is a cluster number, m _i Is c _i Clustering center of class samples, x _a Is of the class c _i Samples in the class.
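The elbow rule can be illustrated with a small NumPy sketch. It is a simplified example under stated assumptions: the farthest-point initialization and the second-difference test used to locate the elbow are choices made for this illustration, since the source does not specify how the initial centers are set or how the elbow is read off the curve.

```python
import numpy as np


def assign(data, centers):
    """Index of the nearest center (Euclidean distance) for every sample."""
    dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)


def farthest_point_init(data, k):
    """Deterministic initial centers: start at sample 0, then add far points."""
    centers = [data[0]]
    for _ in range(1, k):
        dist = np.linalg.norm(data[:, None, :] - np.array(centers)[None, :, :], axis=2)
        centers.append(data[int(dist.min(axis=1).argmax())])
    return np.array(centers, dtype=float)


def kmeans(data, k, n_iter=100):
    """Plain K-means; returns labels, cluster centers and the SSE."""
    data = np.asarray(data, dtype=float)
    centers = farthest_point_init(data, k)
    for _ in range(n_iter):
        labels = assign(data, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members):                 # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    labels = assign(data, centers)
    sse = float(((data - centers[labels]) ** 2).sum())
    return labels, centers, sse


def elbow_k(sse_by_k):
    """Pick the K where the SSE curve bends most (largest second difference)."""
    ks = sorted(sse_by_k)
    best_k, best_bend = None, -np.inf
    for prev, k, nxt in zip(ks, ks[1:], ks[2:]):
        bend = sse_by_k[prev] - 2 * sse_by_k[k] + sse_by_k[nxt]
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k
```

In use, one would compute `sse_by_k = {k: kmeans(samples, k)[2] for k in range(2, 11)}` for the area line loss samples and pass the result to `elbow_k`, or simply inspect the plotted curve.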

The normalization formula of the data sample set is as follows:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x$ is the feature sample to be processed, $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the feature sample, respectively, and $x'$ is the normalized value.
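A minimal Python sketch of min-max normalization and its inverse is given here for illustration, assuming the usual form in which the sample is shifted by its minimum and scaled by its range. The function names are hypothetical.

```python
import numpy as np


def min_max_normalize(x):
    """Scale a feature sample to [0, 1]; also return the bounds for later inversion."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        raise ValueError("constant feature: range is zero, cannot normalize")
    return (x - x_min) / (x_max - x_min), x_min, x_max


def inverse_normalize(x_norm, x_min, x_max):
    """Map normalized values back to the original scale."""
    return np.asarray(x_norm, dtype=float) * (x_max - x_min) + x_min
```

The bounds returned by `min_max_normalize` must be kept, because the same values are needed to restore predicted values to physical units.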

The specific steps of the off-line model training are as follows:

The state at the previous time step, the hidden-layer state unit at the previous time step and the current input are fed into the input gate. After the input of the input gate is transformed by a nonlinear function, the state information is screened by the forget gate, so that the LSTM neural network clears the state information from the previous step that is no longer useful. The three variables fed into the input gate jointly determine which part of the state information the neural network needs to forget and which useful information enters the new current state. Finally, the output gate operates on the new current state to determine how much information is output to the current hidden-layer state unit; the current hidden-layer state unit then enters the next LSTM neuron for calculation, which establishes the connection between earlier and later time steps. The specific calculation formulas among the variables are as follows:

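$$i^{(t)} = \sigma\left(W_i h^{(t-1)} + U_i x^{(t)} + b_i\right)$$

$$a^{(t)} = \tanh\left(W_a h^{(t-1)} + U_a x^{(t)} + b_a\right)$$

$$f^{(t)} = \sigma\left(W_f h^{(t-1)} + U_f x^{(t)} + b_f\right)$$

$$c^{(t)} = i^{(t)} \odot a^{(t)} + f^{(t)} \odot c^{(t-1)}$$

$$o^{(t)} = \sigma\left(U_o x^{(t)} + W_o h^{(t-1)} + b_o\right)$$

$$h^{(t)} = o^{(t)} \odot \tanh\left(c^{(t)}\right)$$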

where the output of the input gate comprises two parts, $i^{(t)}$ and $a^{(t)}$; $W_i$, $W_a$ are the connection weights of the hidden-layer state at the previous time step, $U_i$, $U_a$ are the connection weights of the input, and $b_i$, $b_a$ are the respective biases of the input gate. The output of the forget gate is $f^{(t)}$; $W_f$, $U_f$, $b_f$ are the recurrent weight, input weight and bias of the forget gate, respectively. The new current state $c^{(t)}$ is jointly determined by the outputs of the input gate and the forget gate and the state $c^{(t-1)}$ at the previous time step. The output gate outputs the result $o^{(t)}$; $U_o$, $W_o$, $b_o$ are the input weight, recurrent weight and bias of the output gate, respectively. The hidden-layer state $h^{(t)}$ is jointly determined by the output of the output gate and the current state. $\sigma$ in the formulas is an activation function, which may be a tanh or sigmoid function.
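The gate equations can be traced with a single-step NumPy sketch. It is an illustration, not the TensorFlow implementation used in the embodiment: it assumes σ is the sigmoid function for the three gates, and the parameter dictionary layout and the helper `init_params` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_in, n_hidden, seed=0):
    """Random weights for the four blocks i, a, f, o (illustrative only)."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in "iafo":
        p[f"W_{g}"] = 0.1 * rng.standard_normal((n_hidden, n_hidden))  # recurrent weight
        p[f"U_{g}"] = 0.1 * rng.standard_normal((n_hidden, n_in))      # input weight
        p[f"b_{g}"] = np.zeros(n_hidden)                               # bias
    return p


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following the gate equations in the text."""
    i_t = sigmoid(p["W_i"] @ h_prev + p["U_i"] @ x_t + p["b_i"])   # input gate
    a_t = np.tanh(p["W_a"] @ h_prev + p["U_a"] @ x_t + p["b_a"])   # candidate state
    f_t = sigmoid(p["W_f"] @ h_prev + p["U_f"] @ x_t + p["b_f"])   # forget gate
    c_t = i_t * a_t + f_t * c_prev                                  # new cell state
    o_t = sigmoid(p["U_o"] @ x_t + p["W_o"] @ h_prev + p["b_o"])   # output gate
    h_t = o_t * np.tanh(c_t)                                        # new hidden state
    return h_t, c_t
```

Feeding the returned `h_t` and `c_t` back in as `h_prev` and `c_prev` for the next sample is what links consecutive time steps.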

To compare the prediction performance of the power distribution network loss prediction model, the evaluation uses indices common to prediction models: the deviation between the predicted value and the actual value is measured by the mean absolute percentage error (MAPE) and the root mean squared error (RMSE). The specific error formulas are as follows:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i' - y_i}{y_i}\right| \times 100\%$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i' - y_i\right)^2}$$

where $n$ is the number of samples, $y_i'$ represents the predicted value of the model, and $y_i$ represents the actual value. To demonstrate the advantage of the prediction model in network loss prediction accuracy, several machine learning prediction methods widely used in the prediction field are included for error comparison with the proposed method.
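The two indices can be computed as in the following Python sketch, which assumes the standard definitions of MAPE (in percent) and RMSE; the function names are chosen for this example.

```python
import numpy as np


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_pred - y_true) / y_true)) * 100.0)


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the line loss."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))
```

MAPE is scale-free and so allows comparison across classes of different magnitude, whereas RMSE penalizes large individual deviations more heavily.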

In addition, the standard deviation $\sigma_{MAPE}$ of the average absolute error is used to describe the degree of fluctuation of the prediction error and to check the robustness of the prediction model. It is calculated as follows:

$$\sigma_{MAPE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(m_i - \bar{m}\right)^2}$$

where $n$ is the number of samples, $m_i$ is the average absolute error of the i-th prediction sample, and $\bar{m}$ is the mean of the average absolute errors of the $n$ samples.
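A short Python sketch of this fluctuation measure follows. It assumes the population form of the standard deviation (division by n); whether the embodiment divides by n or by n − 1 cannot be confirmed from the text, and the function name is hypothetical.

```python
import numpy as np


def error_std(sample_errors):
    """Standard deviation of per-sample average errors (population form, assumed)."""
    m = np.asarray(sample_errors, dtype=float)
    return float(np.sqrt(np.mean((m - m.mean()) ** 2)))
```

A smaller value indicates that the prediction error varies less from one prediction sample to the next, i.e. a more robust model.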

Step S42: each class of line loss training sample set is used as the input data of the corresponding neural network prediction model, and the corresponding line loss prediction samples are used as the output data; the long short-term memory neural network prediction model is trained off-line, the optimal network hyperparameters are determined, and the parameters of the long short-term memory neural network are trained iteratively to finally obtain the prediction model of each class of line loss. The parameters trained iteratively include the weights and biases of the input gate, the forget gate and the output gate.

The optimal hyperparameters of the long short-term memory neural network are determined as follows: value ranges are set for the number of hidden layers, the number of neurons in each layer and the learning rate; line loss prediction is performed repeatedly with different hyperparameter values in a loop; and the hyperparameter values giving the minimum average line loss prediction error are selected as the optimal hyperparameters of the power distribution network line loss prediction model.
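The hyperparameter loop can be outlined in Python as follows. This is a sketch under assumptions: `train_and_score` is a hypothetical callable that would build and train the network for one combination and return its average line loss prediction error, and an exhaustive grid over the ranges is assumed, since the source only says the values are set in a loop.

```python
from itertools import product


def grid_search(train_and_score, layer_options, neuron_options, lr_options):
    """Try every combination and keep the one with the smallest error."""
    best_error, best_params = float("inf"), None
    for layers, neurons, lr in product(layer_options, neuron_options, lr_options):
        error = train_and_score(layers, neurons, lr)
        if error < best_error:
            best_error = error
            best_params = {
                "hidden_layers": layers,
                "neurons_per_layer": neurons,
                "learning_rate": lr,
            }
    return best_params, best_error
```

Because the line losses are clustered into classes, the search would be repeated once per class, so each class can end up with different hyperparameters.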

Step S5: verifying the line loss prediction model based on the test sample.

In this embodiment, the test samples are input to predict the line loss for the next day.

Step S51: acquiring the line loss test sample set obtained in step S33 and inputting it into the neural network line loss prediction model of each class obtained in step S42, to obtain a normalized predicted value of each class of line loss;

Step S52: performing inverse normalization on the normalized predicted value of each class of line loss to obtain the predicted line loss value of each class;

The specific formula of the inverse normalization is as follows:

$$x = x'\left(x_{\max} - x_{\min}\right) + x_{\min}$$

where $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of each class of line loss variable used for normalization, $x'$ is the normalized predicted value of the line loss, and $x$ is the predicted line loss value after inverse normalization.

Step S53: obtaining the predicted value of each class of line loss from step S52 and summing them to obtain the predicted value of the total line loss.
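Steps S52 and S53 can be illustrated together in a few lines of Python. The sketch assumes min-max inverse normalization with per-class bounds saved from the training data; the function name, the dictionary layout and the class labels are hypothetical.

```python
import numpy as np


def total_line_loss(normalized_preds, bounds):
    """Denormalize each class prediction and sum the classes point by point.

    normalized_preds: dict class name -> sequence of normalized predictions.
    bounds: dict class name -> (x_min, x_max) used when normalizing that class.
    """
    total = None
    for name, pred in normalized_preds.items():
        x_min, x_max = bounds[name]
        restored = np.asarray(pred, dtype=float) * (x_max - x_min) + x_min
        total = restored if total is None else total + restored
    return total
```

For a one-day forecast each class contributes 96 values, and the result is a 96-point total line loss curve.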

The following provides specific examples to further illustrate the technical scheme of the present invention:

The example uses actual model data of a 10 kV power distribution network in a certain area, comprising one 110 kV/10 kV main transformer substation, 46 transformer areas and 68 overhead lines. The line loss of the power distribution network is predicted one day (96 data points) in advance, and the specific prediction flow is shown in fig. 1.

S201: the features affecting the line loss of the power distribution network are analyzed and selected, and the maximum mutual information coefficient is calculated and analyzed for the line loss of a certain line in the power distribution network. The feature numbering is shown in Table 1, and the maximum mutual information coefficient results are shown in fig. 3.

TABLE 1

The maximum mutual information coefficient method shows that part of the historical line loss data and the historical load active power data are strongly correlated with the line loss. In addition, the quarter, holiday, number of transformer areas in operation and temperature features are also strongly correlated with the line loss. Therefore, by analyzing the features that influence the line loss of the power distribution network and applying the optimal input feature selection method, the following are selected as the input features of the power distribution network loss prediction: the historical network loss (the network loss of the previous 1 hour, the network loss at the same time on each of the previous 1 to 7 days, and the network loss at the same time one year and two years earlier), the historical load active power (the load active power of the previous 1 to 3 days), and the hour, quarter, holiday, number of transformer areas in operation and temperature features. With these input features, the average error of the corresponding line loss prediction is the smallest.

S301: the line loss data of the 46 transformer areas and 68 overhead lines of a certain line of the power distribution network are clustered, with the cluster number varied from 2 to 10, to obtain the curve of the cluster sum of squared errors versus cluster number shown in fig. 4. According to the elbow rule, the optimal cluster number is selected as 4, and the line losses are clustered into 4 classes;

S302: the embodiment of the invention uses 1308 days of historical data of a certain line of the power distribution network and divides them into two parts, of which 916 days form the training sample set and 392 days form the validation sample set.
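The split can be sketched in Python as follows. The sketch assumes a chronological split at a day boundary with 96 points per day; the source gives only the day counts (916 and 392), so the ordering and the function name are assumptions.

```python
import numpy as np

POINTS_PER_DAY = 96  # 15-minute sampling


def split_by_days(series, train_days, points_per_day=POINTS_PER_DAY):
    """Split a time series at a day boundary: earlier days train, later days validate."""
    series = np.asarray(series)
    cut = train_days * points_per_day
    return series[:cut], series[cut:]
```

With 1308 days of data, `split_by_days(series, 916)` gives 916 × 96 training points and 392 × 96 validation points, roughly a 70/30 division.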

S401: in the embodiment of the invention, Python code is written to construct the LSTM neural network on Google's TensorFlow learning framework. The activation function of the LSTM layer is set to tanh(), the loss function is the mean squared error (MSE), and BATCH_SIZE is set to 672. The neuron structure of the long short-term memory neural network is shown in FIG. 5.

S402: the embodiment of the invention constructs long short-term memory neural network models for the 4 classes of line loss prediction; the optimal hyperparameters obtained through model training are shown in Table 2.

TABLE 2

S50: the trained long short-term memory neural networks are applied online to predict each class of line loss; the prediction results for three days are shown in fig. 6, and the four classes of line loss predictions are summed to give the total line loss prediction. To compare the improvement of the method in prediction accuracy and robustness, a support vector machine (SVM), a random forest (RF) and a back propagation (BP) neural network, selected for their good prediction performance, are used to predict the line loss, and a long short-term memory neural network without clustering is also used to predict the line loss; the prediction errors are shown in Table 3. The line loss prediction method provided by the invention has the smallest error and stable error fluctuation, and its prediction model has the best accuracy and robustness among the compared models.

TABLE 3

The line loss prediction method based on K-means clustering and the long short-term memory neural network can improve the accuracy of short-term line loss prediction of the power distribution network, thereby guiding loss management and efficiency-enhancing operation of distribution lines and providing a decision basis for loss reduction and energy saving in the power grid.

The embodiment of the invention also provides a power distribution network line loss prediction system, which is based on the power distribution network line loss prediction method in each embodiment, and comprises the following steps:

Based on the same concept, the embodiment of the present invention further provides a schematic diagram of a server, as shown in fig. 8, where the server may include: processor 810, communication interface (Communications Interface) 820, memory 830, and communication bus 840, wherein processor 810, communication interface 820, memory 830 accomplish communication with each other through communication bus 840. The processor 810 may invoke logic instructions in the memory 830 to perform the steps of the power distribution network line loss prediction method described in the embodiments above. Examples include:

Step S5: verifying the line loss prediction model based on the test sample.

Further, the logic instructions in the memory 830 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

Based on the same conception, the embodiments of the present invention also provide a non-transitory computer readable storage medium storing a computer program, where the computer program includes at least one piece of code executable by a master control device to control the master control device to implement the steps of the power distribution network line loss prediction method according to the embodiments above. Examples include:

Step S5: verifying the line loss prediction model based on the test sample.

Based on the same technical concept, the embodiments of the present application also provide a computer program, which is used to implement the above-mentioned method embodiments when the computer program is executed by the master control device.

The program may be stored in whole or in part on a storage medium that is packaged with the processor, or in part or in whole on a memory that is not packaged with the processor.

Based on the same technical concept, the embodiment of the application also provides a processor, which is used for realizing the embodiment of the method. The processor may be a chip.

The embodiments of the present invention may be arbitrarily combined to achieve different technical effects.

In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by a wired (e.g., coaxial cable, fiber optic, digital subscriber line), or wireless (e.g., infrared, wireless, microwave, etc.). The computer readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk), etc.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. The power distribution network line loss prediction method is characterized by comprising the following steps of:

S2, carrying out correlation analysis on the line loss data set, and extracting a plurality of features with the maximum correlation as prediction input variables; S21, selecting line loss data containing N moments from the line loss data set obtained in step S1 as the reference object sequence of the correlation analysis;

s23, calculating the maximum mutual information coefficient between the reference object sequence and each comparison object sequence by adopting a maximum mutual information coefficient method, analyzing the correlation between the characteristics of the object sequences and the line loss, and selecting the optimal M characteristics as input characteristics of a power distribution network line loss prediction model;

；

wherein x represents the data of the comparison object, namely the analyzed characteristic quantity data, y represents the data of the reference object, namely the line loss data, n is the quantity contained in the data sample, p represents the probability calculation, a represents the grid quantity divided in the x direction in the two-dimensional space, b represents the grid quantity divided in the y direction in the two-dimensional space, MIC (x, y) represents the maximum mutual information coefficient between the data x of the comparison object and the reference data y, the value range is between 0 and 1, the larger the maximum mutual information coefficient is, the stronger the correlation between x and y is, and the smaller the maximum mutual information coefficient is, the weaker the correlation between x and y is;

the specific steps of the off-line model training are as follows: the state at the previous time step, the hidden-layer state unit at the previous time step and the current input are fed into the input gate; after the input of the input gate is transformed by a nonlinear function, the state information is screened by the forget gate, so that the LSTM neural network clears the state information from the previous step that is no longer useful, and the three variables fed into the input gate jointly determine which part of the state information the neural network needs to forget and which useful information enters the new current state; finally, the output gate operates on the new current state to determine how much information is output to the current hidden-layer state unit, and the current hidden-layer state unit enters the next LSTM neuron for calculation, establishing the connection between earlier and later time steps; the specific calculation formulas among the variables are as follows:

$$i^{(t)} = \sigma\left(W_i h^{(t-1)} + U_i x^{(t)} + b_i\right)$$

$$a^{(t)} = \tanh\left(W_a h^{(t-1)} + U_a x^{(t)} + b_a\right)$$

$$f^{(t)} = \sigma\left(W_f h^{(t-1)} + U_f x^{(t)} + b_f\right)$$

$$c^{(t)} = i^{(t)} \odot a^{(t)} + f^{(t)} \odot c^{(t-1)}$$

$$o^{(t)} = \sigma\left(U_o x^{(t)} + W_o h^{(t-1)} + b_o\right)$$

$$h^{(t)} = o^{(t)} \odot \tanh\left(c^{(t)}\right)$$;

wherein the output of the input gate comprises two parts, $i^{(t)}$ and $a^{(t)}$; $W_i$, $W_a$ are the connection weights of the hidden-layer state at the previous time step, $U_i$, $U_a$ are the connection weights of the input, and $b_i$, $b_a$ are the respective biases of the input gate; the output of the forget gate is $f^{(t)}$, and $W_f$, $U_f$, $b_f$ are the recurrent weight, input weight and bias of the forget gate, respectively; the new current state $c^{(t)}$ is jointly determined by the outputs of the input gate and the forget gate and the state $c^{(t-1)}$ at the previous time step; the output gate outputs the result $o^{(t)}$, and $U_o$, $W_o$, $b_o$ are the input weight, recurrent weight and bias of the output gate, respectively; the hidden-layer state $h^{(t)}$ is jointly determined by the output of the output gate and the current state; $\sigma$ in the formulas is an activation function and may be a tanh or sigmoid function;

the prediction result evaluation uses indices common to prediction models, and the deviation between the predicted value and the actual value is measured by two indices, the mean absolute percentage error (MAPE) and the root mean squared error (RMSE); the specific error formulas are as follows:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i' - y_i}{y_i}\right| \times 100\%$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i' - y_i\right)^2}$$;

where $n$ is the number of samples, $y_i'$ represents the predicted value of the model, and $y_i$ represents the actual value;

the machine learning prediction method which is widely applied in the prediction field is added to carry out error comparison with the power distribution network loss prediction method;

the degree of fluctuation of the prediction error is reflected by the standard deviation $\sigma_{MAPE}$ of the average absolute error, and the robustness of the prediction model is checked; the calculation is as follows:

$$\sigma_{MAPE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(m_i - \bar{m}\right)^2}$$;

where $n$ is the number of samples, $m_i$ is the average absolute error of the i-th prediction sample, and $\bar{m}$ is the mean of the average absolute errors of the $n$ samples;

S5, verifying the line loss prediction model based on the test sample;

obtaining a line loss test sample set and inputting it into the obtained neural network line loss prediction model of each class to obtain a normalized predicted value of each class of line loss;

performing inverse normalization on the normalized predicted value of each class of line loss to obtain the predicted line loss value of each class; the specific formula of the inverse normalization is as follows:

$$x = x'\left(x_{\max} - x_{\min}\right) + x_{\min}$$;

wherein $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of each class of line loss variable used for normalization, $x'$ is the normalized predicted value of the line loss, and $x$ is the predicted line loss value after inverse normalization;

and obtaining the predicted value of each class of line loss, and summing them to obtain the predicted value of the total line loss.

2. The method for predicting line loss of power distribution network according to claim 1, wherein the step S1 specifically includes:

3. The method for predicting line loss of power distribution network according to claim 1, wherein the step S3 specifically includes:

step S32, obtaining the number K of clustering categories according to the step S31, setting initial clustering center points of a K-means algorithm, calculating Euclidean distances between P area line loss data samples and K initial clustering center points, distributing the P samples to the nearest center points according to the calculated distance so as to form K clusters, calculating the average value of all objects in each cluster, taking the calculated K average values as new clustering centers of the K clusters, then continuously iterating and calculating a cluster error square sum, and when the error square sum reaches a limit value, indicating that the algorithm tends to be stable, the clustering center is not changed any more, and finishing K-means clustering, wherein the P area line loss data samples are clustered into K categories;

4. The method for predicting line loss of power distribution network according to claim 3, wherein the step S4 specifically includes:

5. The method for predicting line loss of a power distribution network according to claim 1, wherein the selecting the optimal M features as input features of a power distribution network line loss prediction model specifically comprises:

6. A power distribution network line loss prediction system, characterized in that the system is adapted to the power distribution network line loss prediction method according to any one of claims 1-5; the system comprises:

7. A server, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the power distribution network line loss prediction method of any one of claims 1 to 5.

8. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of the power distribution network line loss prediction method according to any one of claims 1 to 5.