CN115794805B

CN115794805B - Method for supplementing measurement data of medium-low voltage distribution network

Info

Publication number: CN115794805B
Application number: CN202310084891.2A
Authority: CN
Inventors: 黄旭; 丁琪; 祖国强; 徐智; 张春晖; 郝子源; 魏然; 李治; 张驰; 赵长伟; 陆杨; 范朕宁; 张高磊; 魏炜; 黄盼
Original assignee: State Grid Corp of China SGCC; State Grid Tianjin Electric Power Co Ltd; Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd; Chengdong Power Supply Co of State Grid Tianjin Electric Power Co Ltd
Current assignee: State Grid Corp of China SGCC; State Grid Tianjin Electric Power Co Ltd; Electric Power Research Institute of State Grid Tianjin Electric Power Co Ltd; Chengdong Power Supply Co of State Grid Tianjin Electric Power Co Ltd
Priority date: 2023-02-03
Filing date: 2023-02-03
Publication date: 2023-05-23
Anticipated expiration: 2043-02-03
Also published as: CN115794805A

Abstract

The invention provides a method for supplementing measurement data of a medium-low voltage distribution network. The original measurements are classified by K-medoids clustering, and main/standby key measurements are selected from each class as input samples. On this basis, an input vector matrix and a response vector are constructed and used as the input of an LSTM model, which is trained to obtain measurement alignment models for different measurement types and moments. Depending on whether the main/standby key measurements are missing at the preceding moments and at the current moment, the main/standby key measurements are filled in first, and the other missing measurements in each class are then filled in based on average errors. By selecting main/standby key measurements, the invention greatly reduces the scale of the LSTM neural network model, supports the filling of different types of measurements, improves the observability of the distribution network, and is suitable for distribution networks with a high measurement data loss rate.

Description

Method for supplementing measurement data of medium-low voltage distribution network

Technical Field

The invention belongs to the technical field of distribution network measurement data processing, and particularly relates to a medium-low voltage distribution network measurement data filling method.

Background

As the scale of the power system continues to grow, the volume of power system measurement data is increasing rapidly. However, data may be lost during the acquisition, measurement, transmission and storage of massive data. In particular, the quality and loss of measurement data in the distribution network are worse than in the transmission network, which seriously affects the observability of the distribution network and threatens the safety of the power grid. At the same time, the number of measurement devices is huge, so complete training occupies enormous computing resources; the measurement types are varied (current, voltage and power), and different types of measurements follow different patterns of change over time. How to effectively fill in the measurement data of the distribution network is therefore a problem to be solved. Traditional measurement data filling methods mainly predict a single measurement or small-scale measurements, and their calculation accuracy and speed can hardly meet the requirements of large-scale measurement data filling.

Therefore, it is necessary to provide a new method for supplementing measurement data of a medium-low voltage distribution network to solve the above-mentioned technical problems.

Disclosure of Invention

The present invention is directed to a method for supplementing measurement data of a medium-low voltage distribution network to solve the above-mentioned problems.

The invention realizes the above purpose through the following technical scheme:

A method for supplementing measurement data of a medium-low voltage distribution network comprises the following steps:

obtaining historical sample measurement data, and processing the sample measurement data to obtain a sample input vector matrix and a sample response vector;

constructing a measurement alignment model, inputting the sample input vector matrix and the sample response vector into the measurement alignment model for training to obtain a trained measurement alignment model, and optimizing the trained measurement alignment model to obtain a final measurement alignment model;

acquiring measurement data at the current moment, and processing the measurement data to obtain a current input vector matrix and a current response vector;

inputting the current input vector matrix and the current response vector into the final measurement alignment model to obtain a measurement predicted value;

performing filling processing on the measurement data at the current moment based on the measurement predicted value to fill in the measurement data of the medium-low voltage distribution network;

obtaining historical sample measurement data, and processing the sample measurement data to obtain a sample input vector matrix and a sample response vector, wherein the specific process comprises the following steps of:

acquiring historical sample measurement data;

classifying the sample measurement data based on a K-medoids clustering method, distinguishing main/standby key measurement data and non-key measurement data in each class of data, and calculating average errors between the non-key measurement data and the main/standby key measurement data in the same class of data;

for each item of main and standby key measurement data, constructing, based on $n+1$ consecutive values in the preset sample measurement data, a sample input vector matrix of dimension $9 \times n$:

$$S = [s_1, s_2, \ldots, s_9]^{T}, \qquad s_1 = [a_1, a_2, a_3, \ldots, a_{n-1}, a_n]$$

wherein $s_1$ represents the vector formed by the sample measurement data, $a_1$ represents the measurement data value at the first time instant, $a_2$ represents the measurement data value at the second time instant, $a_3$ represents the measurement data value at the third time instant, $a_{n-1}$ represents the measurement data value at the (n-1)-th time instant, and $a_n$ represents the measurement data value at the n-th time instant; $s_2, s_3, \ldots, s_7$ respectively represent the n-dimensional vectors formed by the time series features in one-to-one correspondence with the n measurement data values of vector $s_1$; vector $s_8$ represents the workday judgments in one-to-one correspondence with the n measurement data values of vector $s_1$; and vector $s_9$ represents the type of the measured object;

constructing a sample response vector based on the sample input vector matrix:

$$B = [b_1, b_2, b_3, \ldots, b_{n-1}, b_n], \qquad b_i = a_{i+1}, \; i = 1, 2, \ldots, n$$

wherein $b_1$ represents the first data value in the sample response vector, $b_2$ represents the second data value in the sample response vector, $b_3$ represents the third data value in the sample response vector, $b_{n-1}$ represents the (n-1)-th data value in the sample response vector, and $b_n$ represents the n-th data value in the sample response vector;

classifying the sample measurement data based on the K-medoids clustering method, distinguishing main/standby key measurement data and non-key measurement data in each class of data, and calculating average errors between the non-key measurement data and the main/standby key measurement data in the same class of data, wherein the specific process is as follows:

setting the sample measurement data as $X$ ($N \times M$), wherein N is the number of data samples, M is the feature dimension of each data sample, and K is the given number of clusters, so that K cluster centers are to be obtained;

randomly selecting, from the sample measurement data, K sample data $\{o_1, o_2, o_3, \ldots, o_K\}$ as the initial cluster centers, wherein $o_1$ represents the first sample data, $o_2$ represents the second sample data, $o_3$ represents the third sample data, and $o_K$ represents the K-th sample data;

calculating the Euclidean distances from the remaining $N-K$ sample data $\{x_1, x_2, x_3, \ldots, x_{N-K}\}$ to the K cluster centers, wherein $x_1$ represents the first sample data of the remaining sample data, $x_2$ represents the second sample data of the remaining sample data, $x_3$ represents the third sample data of the remaining sample data, and $x_{N-K}$ represents the (N-K)-th sample data of the remaining sample data; assigning each remaining sample data to the cluster of its nearest cluster center according to the minimum Euclidean distance to obtain a clustering result, thereby realizing the cluster update; the Euclidean distance calculation formula is as follows:

$$d(x_i, o_j) = \sqrt{\sum_{l=1}^{M} \left(x_{il} - o_{jl}\right)^2} \qquad (1)$$

wherein $x_{il}$ represents the $l$-th element of sample data $x_i$, and $o_{jl}$ represents the $l$-th element of sample data $o_j$;

traversing all sample points in each cluster, and updating the cluster center point with the objective of minimizing the sum of the Euclidean distances from all other points in the cluster to the center point, the objective function being:

$$\min\left\{\sum_{j=1}^{T} d_{j1},\; \sum_{j=1}^{T} d_{j2},\; \ldots,\; \sum_{j=1}^{T} d_{jT}\right\} \qquad (2)$$

wherein $T$ is the number of sample points in the cluster, $d_{j1}$ indicates the Euclidean distance from the j-th sample point to the 1st sample point taken as the cluster center, $d_{j2}$ indicates the Euclidean distance from the j-th sample point to the 2nd sample point taken as the cluster center, and $d_{jT}$ indicates the Euclidean distance from the j-th sample point to the T-th sample point taken as the cluster center;

repeating the processes of cluster updating and cluster center point updating, iterating until all cluster center points and cluster results do not change any more or reach the preset maximum iteration times, and ending the clustering;

setting the K cluster center points $\{p_1, p_2, p_3, \ldots, p_K\}$ as the main key measurement data, and respectively calculating the average errors $\{e_1, e_2, e_3, \ldots, e_K\}$ between the non-key measurement data and the main key measurement data in the K classes; letting the number of non-key measurement data in class $i$ be $n_i$, the non-key measurement data in class $i$ are $\{q_{i1}, q_{i2}, q_{i3}, \ldots, q_{in_i}\}$, wherein $q_{i1}$ represents the first non-key measurement data in class $i$, $q_{i2}$ represents the second non-key measurement data in class $i$, $q_{i3}$ represents the third non-key measurement data in class $i$, and $q_{in_i}$ represents the $n_i$-th non-key measurement data in class $i$; $e_1$ represents the average error vector of the first class, $e_2$ represents the average error vector of the second class, $e_3$ represents the average error vector of the third class, and $e_K$ represents the average error vector of the K-th class; the average error calculation formula is as follows:

（3）；

wherein $e_{ij}$ represents the j-th element of vector $e_i$, $p_{ik}$ represents the k-th element of main key measurement data $p_i$, and $q_{ijk}$ represents the k-th element of non-key measurement data $q_{ij}$;

removing the K main key measurement data from their clusters, and searching again for the cluster center point of the remaining data samples in each cluster, to obtain K new cluster center points $\{p'_1, p'_2, p'_3, \ldots, p'_K\}$, namely the standby key measurement data, wherein $p'_1$ represents the first of the K new cluster center points, $p'_2$ represents the second of the K new cluster center points, $p'_3$ represents the third of the K new cluster center points, and $p'_K$ represents the K-th of the K new cluster center points; respectively calculating the new average errors $\{e'_1, e'_2, e'_3, \ldots, e'_K\}$ between the non-key measurement data and the standby key measurement data in the K classes; letting the number of non-key measurement data in class $i$ be $n'_i$, the non-key measurement data in class $i$ are $\{q'_{i1}, q'_{i2}, q'_{i3}, \ldots, q'_{in'_i}\}$, wherein $q'_{i1}$ represents the first non-key measurement data in class $i$, $q'_{i2}$ represents the second non-key measurement data in class $i$, $q'_{i3}$ represents the third non-key measurement data in class $i$, and $q'_{in'_i}$ represents the $n'_i$-th non-key measurement data in class $i$; $e'_1$ represents the new average error vector of the first class, $e'_2$ represents the new average error vector of the second class, $e'_3$ represents the new average error vector of the third class, and $e'_K$ represents the new average error vector of the K-th class; the new average error calculation formula is as follows:

（4）；

wherein $e'_{ij}$ represents the j-th element of vector $e'_i$, $p'_{ik}$ represents the k-th element of standby key measurement data $p'_i$, and $q'_{ijk}$ represents the k-th element of non-key measurement data $q'_{ij}$.
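The clustering and key-measurement selection described above can be illustrated with a minimal sketch using only the Python standard library. The function names, the toy data and the random seed are hypothetical. The form of the average error is not recoverable from the text, so the sketch assumes a mean signed difference between a non-key measurement and the key measurement; this is an illustration under those assumptions, not the patented implementation.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors, as in formula (1)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def medoid(points, members):
    """Member index whose summed distance to the other members is smallest."""
    return min(members, key=lambda c: sum(euclidean(points[c], points[j]) for j in members))


def assign(points, centers):
    """Put every sample index into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for idx, p in enumerate(points):
        nearest = min(range(len(centers)), key=lambda c: euclidean(p, points[centers[c]]))
        clusters[nearest].append(idx)
    return clusters


def k_medoids(points, k, max_iter=100, seed=0):
    """Alternate cluster update and center update until the centers stop changing."""
    centers = random.Random(seed).sample(range(len(points)), k)
    for _ in range(max_iter):
        clusters = assign(points, centers)
        updated = [medoid(points, m) if m else c for m, c in zip(clusters, centers)]
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


def mean_error(points, key, others):
    """ASSUMPTION: average error = mean signed difference (non-key minus key)."""
    return {j: sum(q - p for q, p in zip(points[j], points[key])) / len(points[key])
            for j in others}


def key_measurements(points, centers, clusters):
    """Per class: main key (medoid), standby key (medoid of the rest), average errors."""
    summary = []
    for main, members in zip(centers, clusters):
        rest = [j for j in members if j != main]
        standby = medoid(points, rest) if rest else None
        non_key = [j for j in rest if j != standby]
        summary.append({
            "main": main,
            "standby": standby,
            "err_main": mean_error(points, main, rest),
            "err_standby": mean_error(points, standby, non_key) if standby is not None else {},
        })
    return summary


# Toy data: six measurement series of length M = 2 forming two obvious groups.
samples = [[0, 0], [1, 0], [2, 0], [10, 10], [11, 10], [12, 10]]
centers, clusters = k_medoids(samples, k=2)
summary = key_measurements(samples, centers, clusters)
```

Because each cluster center is an actual sample, the center itself can serve directly as the main key measurement, which is what allows only K (plus K standby) series to be modelled instead of all N.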

As a further optimization scheme of the invention, the time series features comprise year, quarter, month, day, hour and minute; the workday judgment is represented by 0 for a working day and 1 for a non-working day; the measured object types comprise current, voltage and power, wherein current is represented by 0, voltage is represented by 1, and power is represented by 2.
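As an illustration of these encodings, the following sketch assembles the nine rows of an input matrix from a list of values and timestamps. The function name, the row order (measurement values, six time features, workday flag, type code) and the rule that weekends and listed holidays count as non-working days are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Type codes stated in the text: current 0, voltage 1, power 2.
TYPE_CODES = {"current": 0, "voltage": 1, "power": 2}


def build_input_matrix(values, timestamps, kind, holidays=frozenset()):
    """Return the 9 x n input matrix as a list of nine rows."""
    if len(values) != len(timestamps):
        raise ValueError("values and timestamps must have the same length")

    def non_workday(ts):
        # ASSUMPTION: Saturdays, Sundays and listed holidays are non-working days.
        return 1 if ts.weekday() >= 5 or ts.date() in holidays else 0

    return [
        list(values),                                    # measurement values
        [ts.year for ts in timestamps],                  # year
        [(ts.month - 1) // 3 + 1 for ts in timestamps],  # quarter
        [ts.month for ts in timestamps],                 # month
        [ts.day for ts in timestamps],                   # day
        [ts.hour for ts in timestamps],                  # hour
        [ts.minute for ts in timestamps],                # minute
        [non_workday(ts) for ts in timestamps],          # 0 workday, 1 non-workday
        [TYPE_CODES[kind]] * len(values),                # measured-object type
    ]


start = datetime(2023, 2, 3, 8, 0)  # a Friday
stamps = [start + timedelta(days=d) for d in range(4)]
matrix = build_input_matrix([10.1, 10.3, 10.2, 10.4], stamps, "voltage")
```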

As a further optimization scheme of the invention, the measurement alignment model is an LSTM model; the LSTM model comprises an input gate, a forget gate and an output gate and adopts a double-layer structure, with the following formulas:

$$i_t = \sigma\left(W_{xi} x_t + W_{hi} h_{t-1}\right) \qquad (5)$$

$$f_t = \sigma\left(W_{xf} x_t + W_{hf} h_{t-1}\right) \qquad (6)$$

$$o_t = \sigma\left(W_{xo} x_t + W_{ho} h_{t-1}\right) \qquad (7)$$

$$\tilde{c}_t = \tanh\left(W_{xc} x_t + W_{hc} h_{t-1}\right) \qquad (8)$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \qquad (9)$$

$$h_t = o_t \odot \tanh(c_t) \qquad (10)$$

in formulas (5) to (10), $i_t$ indicates the state of the input gate at the current moment, $f_t$ indicates the state of the forget gate at the current moment, $o_t$ indicates the state of the output gate at the current moment, $c_t$ represents the state of the LSTM model at the current moment, $c_{t-1}$ represents the state of the LSTM model at the previous moment, and $\tilde{c}_t$ represents the candidate state of the LSTM model at the current moment, which is used together with $i_t$ and $f_t$ to calculate the current cell state $c_t$; $x_t$ represents the input of the LSTM model at the current moment; $W_{xi}$ is the weight from the input to the input gate $i_t$, $W_{hi}$ is the weight from the hidden layer at the previous moment to the input gate $i_t$, $W_{xf}$ is the weight from the input to the forget gate $f_t$, $W_{hf}$ is the weight from the hidden layer at the previous moment to the forget gate $f_t$, $W_{xo}$ is the weight from the input to the output gate $o_t$, and $W_{ho}$ is the weight from the hidden layer at the previous moment to the output gate $o_t$; $W_{xc}$ is the weight of the input $x_t$ in the feature extraction process, and $W_{hc}$ is the weight of the hidden layer $h_{t-1}$ at the previous moment in the feature extraction process; $\sigma$ and $\tanh$ are activation functions; and the symbol $\odot$ represents the Hadamard product.
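To make the gate interactions in formulas (5) to (10) concrete, the sketch below computes one time step of a single LSTM layer with plain Python lists. It uses the standard LSTM gate structure without bias terms, which is an assumption because the formulas themselves did not survive extraction; the dictionary keys and function names are hypothetical, and a double-layer model would feed the hidden state of this layer into a second one.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(w_x, w_h, x, h):
    """Compute W_x x + W_h h for weights stored as lists of rows."""
    return [
        sum(a * b for a, b in zip(row_x, x)) + sum(a * b for a, b in zip(row_h, h))
        for row_x, row_h in zip(w_x, w_h)
    ]


def lstm_step(x, h_prev, c_prev, w):
    """One time step of one LSTM layer; returns the new hidden state and cell state."""
    i = [sigmoid(z) for z in affine(w["xi"], w["hi"], x, h_prev)]    # input gate
    f = [sigmoid(z) for z in affine(w["xf"], w["hf"], x, h_prev)]    # forget gate
    o = [sigmoid(z) for z in affine(w["xo"], w["ho"], x, h_prev)]    # output gate
    g = [math.tanh(z) for z in affine(w["xc"], w["hc"], x, h_prev)]  # candidate state
    # Element-wise (Hadamard) products combine the old cell state and the candidate.
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


# Toy check: 2 inputs, 1 hidden unit, all weights zero, so every gate equals 0.5.
zero = {k: [[0.0, 0.0]] if k[0] == "x" else [[0.0]]
        for k in ("xi", "hi", "xf", "hf", "xo", "ho", "xc", "hc")}
h, c = lstm_step([1.0, -1.0], [0.0], [2.0], zero)
```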

As a further optimization scheme of the invention, the sample input vector matrix and the sample response vector are input into the measurement alignment model for training to obtain the trained measurement alignment model, and the specific process is as follows:

denoising the measurement data vector in the sample input vector matrix by a Symlet wavelet function, and normalizing all data in the denoised sample input vector matrix and in the denoised sample response vector to zero mean and unit variance, so as to obtain the standardized sample input vector matrix and the standardized sample response vector;

dividing the standardized sample input vector matrix and the standardized sample response vector into a training set, a validation set and a test set in proportion;

inputting the standardized sample input vector matrix and the standardized sample response vector in the training set into the measurement alignment model for training, then inputting the standardized sample input vector matrix and the standardized sample response vector in the validation set into the measurement alignment model and correcting the parameters of the measurement alignment model, to obtain the trained measurement alignment model.
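The standardization and data split can be sketched as follows. The Symlet wavelet denoising step is omitted here because it needs a wavelet library; the 70/15/15 proportions, the chronological ordering of the split and all function names are assumptions, since the text only says the data are divided "in proportion".

```python
import statistics


def standardize(values):
    """Scale values to zero mean and unit variance; also return the statistics."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against a constant series
    return [(v - mean) / std for v in values], mean, std


def destandardize(scaled, mean, std):
    """Inverse of standardize, used to map predictions back to physical units."""
    return [s * std + mean for s in scaled]


def chronological_split(samples, train_pct=70, val_pct=15):
    """ASSUMPTION: 70/15/15 split in time order; the text gives no proportions."""
    n_train = len(samples) * train_pct // 100
    n_val = len(samples) * val_pct // 100
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])


scaled, mean, std = standardize([1.0, 2.0, 3.0, 4.0, 5.0])
train, val, held_out = chronological_split(list(range(20)))
```

Keeping the mean and standard deviation returned by `standardize` matters later, because the predicted value has to be inverse-normalized before it is written back as a measurement.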

As a further optimization scheme of the invention, the parameters of the measurement alignment model comprise weight and bias values.

As a further optimization scheme of the invention, the measurement alignment model is trained by the Adam algorithm, and the weight update formulas are as follows:

$$\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon}\, \hat{m}_t \qquad (11)$$

$$\hat{m}_t = \frac{\beta_1 m_{t-1} + (1 - \beta_1)\, g_t}{1 - \beta_1^{t}} \qquad (12)$$

$$\hat{v}_t = \frac{\beta_2 v_{t-1} + (1 - \beta_2)\, g_t^{2}}{1 - \beta_2^{t}} \qquad (13)$$

in formulas (11) to (13), $\theta_t$ and $\theta_{t+1}$ are the network weight parameters to be updated at adjacent time steps, $\epsilon$ is the smoothing parameter, $\eta$ is the learning rate, $\beta_1$ and $\beta_2$ are the exponential decay rates of the first and second moment estimates respectively, and $\hat{m}_t$ and $\hat{v}_t$ are the bias-corrected values of the first and second moment estimates respectively; $m_{t-1}$ represents the first moment estimate of the gradient at time step t-1; $g_t$ represents the gradient at time step t; $v_{t-1}$ represents the second moment estimate of the gradient at time step t-1; $g_t^{2}$ represents the square of the gradient at time step t;

setting the root mean square error (RMSE) function as the loss function for training the measurement alignment model, with the following formula:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(r_i - \hat{r}_i\right)^2} \qquad (14)$$

in the formula, $N$ is the total number of measurement data to be predicted, $r_i$ is the true value of the measurement data to be predicted, and $\hat{r}_i$ is the predicted value of the measurement data output by the measurement alignment model.
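A minimal sketch of one Adam update and of the RMSE loss is given below. The learning rate 0.005 is the value stated in the embodiment; the decay rates 0.9 and 0.999 and the smoothing term 1e-8 are commonly used defaults assumed here, and the function names are hypothetical.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.005, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a flat list of weights; t is the 1-based time step."""
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grad)]      # first moment
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grad)]  # second moment
    m_hat = [mi / (1 - beta1 ** t) for mi in m]                       # bias correction
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    theta = [th - lr * mh / (math.sqrt(vh) + eps)
             for th, mh, vh in zip(theta, m_hat, v_hat)]
    return theta, m, v


def rmse(actual, predicted):
    """Root mean square error between true and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


theta, m, v = adam_step([1.0], [2.0], [0.0], [0.0], t=1)
loss = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

On the first step the bias correction cancels the decay factors, so the weight moves by almost exactly the learning rate in the direction opposite to the gradient.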

As a further optimization scheme of the invention, the trained measurement alignment model is optimized to obtain the final measurement alignment model, and the specific process is as follows:

optimizing the hyperparameters of the trained measurement alignment model by a Bayesian optimization method to obtain optimized network parameters, and reconstructing the trained measurement alignment model based on the optimized network parameters to obtain the final measurement alignment model.

As a further optimization scheme of the invention, the hyperparameters of the trained measurement alignment model comprise the number of iterations, the number of hidden layers, the number of neurons in each layer, and the learning rate.

As a further optimization scheme of the invention, the hyperparameters of the trained measurement alignment model are optimized by a Bayesian optimization method to obtain optimized network parameters, and the specific process is as follows:

setting the objective function $f(x)$ of the Bayesian framework, wherein the independent variable $x$ represents the hyperparameters;

selecting a number of observation points, and calculating the value of the objective function $f(x)$ at each observation point, namely the observed value of a preset observation model;

estimating, based on the observed values, the function distribution of the objective function $f(x)$ and the minimum of the target value;

setting the current observation data $D$, calculating a preset acquisition function based on the current observation data $D$, and determining the next observation point $x_{s+1}$; calculating the function value at $x_{s+1}$, adding the new observation to the current observation data $D$, and updating a preset probabilistic surrogate model;

repeating the above steps until the number of observations reaches the preset maximum number of observations P, to obtain the optimized network parameters.

As a further optimization scheme of the invention, the probabilistic surrogate model is a Gaussian process regression model, and the Gaussian process regression model obeys a k-dimensional normal distribution with the following formula:

$$f(x) \sim \mathcal{GP}\left(m(x),\, k(x, x')\right) \qquad (15)$$

wherein $f(x)$ represents an n-dimensional vector, $m(x)$ is the mean function, and $k(x, x')$ is the covariance function.

As a further optimization scheme of the invention, the expected improvement function is adopted as the acquisition function to determine the next observation point $x_{s+1}$, with the following formulas:

$$x_{s+1} = \arg\max_{u}\, E\left(\max\left(0,\, f_{s+1}(u) - f(x^{+})\right) \mid D\right) \qquad (16)$$

$$x^{+} = \arg\max_{x_i \in x_{1:s}} f(x_i) \qquad (17)$$

in formulas (16) to (17), $x_i$ is the position observed in step $i$; $f_{s+1}(u)$ is the posterior mean of the surrogate model at step $s+1$; $x^{+}$ represents the observation position at which the current objective function is maximized; $f(x^{+})$ is the maximum value of the current objective function; the function argmax(f(u)) returns the argument u that maximizes f(u); the function max(f(u)) returns the maximum value of f(u); D represents the current observation data set; and the function E(f(u)) is the expectation of f(u).
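For a Gaussian posterior, the expected improvement has a well-known closed form, which the sketch below evaluates for a few candidate hyperparameter settings. The posterior means and standard deviations are made-up toy numbers standing in for the output of a Gaussian process regression model, the function names are hypothetical, and the objective is treated as something to maximize; a loss such as RMSE would be negated first.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form EI for a Gaussian posterior N(mu, sigma^2) over a maximized objective."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


def next_observation(candidates, posterior, best):
    """Pick the candidate with the largest expected improvement."""
    def score(u):
        mu, sigma = posterior(u)
        return expected_improvement(mu, sigma, best)
    return max(candidates, key=score)


# Hypothetical surrogate output: learning rate -> (posterior mean, posterior std).
toy_posterior = {0.001: (0.2, 0.1), 0.005: (0.5, 0.1), 0.01: (0.4, 0.3)}
chosen = next_observation(list(toy_posterior), toy_posterior.get, best=0.45)
```

In this toy case the candidate with the highest posterior mean is not the one selected: the more uncertain candidate has a larger expected improvement, which is how the acquisition function balances exploitation against exploration.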

As a further optimization scheme of the invention, measurement data at the current moment are acquired and processed to obtain a current input vector matrix and a current response vector, and the specific process is as follows:

acquiring measurement data at the current moment;

classifying the measurement data at the current moment based on the K-medoids clustering method, distinguishing main/standby key measurement data at the current moment and non-key measurement data at the current moment in each class of data, and calculating average errors between the non-key measurement data at the current moment and the main/standby key measurement data at the current moment in the same class of data;

judging whether the main key measurement data at the current moment are missing; if not, filling in the missing non-key measurement data at the current moment based on the average error between the non-key measurement data at the current moment and the main key measurement data at the current moment; if so, comparing the degree of missing data in the preceding period for the main key measurement data at the current moment and for the standby key measurement data at the current moment, and taking whichever of the main/standby key measurement data at the current moment has the lower degree of missing data as the measurement data to be predicted;

selecting the data of the m moments preceding the moment to be predicted of the measurement data to be predicted to construct an m-dimensional vector $y_1$, and constructing a current input vector matrix of dimension $9 \times m$:

$$Y = [y_1, y_2, \ldots, y_9]^{T}, \qquad y_1 = [c_1, c_2, c_3, \ldots, c_{m-1}, c_m]$$

wherein $c_1$ represents the first value of the preceding m moments, $c_2$ represents the second value of the preceding m moments, $c_3$ represents the third value of the preceding m moments, $c_{m-1}$ represents the (m-1)-th value of the preceding m moments, and $c_m$ represents the m-th value of the preceding m moments; $y_2, y_3, y_4, y_5, y_6, y_7$ respectively represent the m-dimensional vectors formed by the time series features in one-to-one correspondence with the m measurement data values of vector $y_1$; vector $y_8$ represents the workday judgments in one-to-one correspondence with the m measurement data values of vector $y_1$; and vector $y_9$ represents the type of the measured object;

constructing the current response vector $Z = [z_1, z_2, \ldots, z_m]$, wherein $z_i = c_{i+1}$ for $i = 1, 2, \ldots, m-1$, and $z_m$ stands for the measurement value at time t, which is missing and is therefore given an assigned value.

As a further optimization scheme of the invention, the current input vector matrix and the current response vector are input into the final measurement alignment model to obtain a measurement predicted value, and the specific process is as follows:

the current input vector matrix and the current response vector are input as input parameters into the final measurement alignment model, and the output measurement predicted values are $\hat{Z} = [\hat{z}_1, \hat{z}_2, \hat{z}_3, \ldots, \hat{z}_{m-1}, \hat{z}_m]$, wherein $\hat{z}_1$ represents the measurement predicted value at the first moment, $\hat{z}_2$ represents the measurement predicted value at the second moment, $\hat{z}_3$ represents the measurement predicted value at the third moment, $\hat{z}_{m-1}$ represents the measurement predicted value at the (m-1)-th moment, and $\hat{z}_m$ is the measurement predicted value at time t.

As a further optimization scheme of the invention, the filling processing of the measurement data at the current moment based on the measurement predicted value comprises the following specific process:

based on the measurement predicted value, filling in the missing measurement data according to the average error between the non-key measurement data and the main/standby key measurement data in the same class, until all measurement data in all classes have been filled in.
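The filling of non-key measurements from a key measurement can be sketched as below. The sketch assumes that the recorded average error is a mean signed difference (non-key minus key), so that a missing value is estimated as the key value plus that error; the text does not state the exact form of the error, and the measurement names and numbers are hypothetical.

```python
def fill_class(values, key_id, key_value, mean_errors):
    """Fill missing (None) non-key values in one class from the key measurement.

    ASSUMPTION: mean_errors[j] is the mean signed difference (non-key minus key),
    so a missing value is estimated as key_value + mean_errors[j].
    """
    filled = dict(values)
    filled[key_id] = key_value  # measured or predicted key measurement
    for name, value in values.items():
        if name != key_id and value is None:
            filled[name] = key_value + mean_errors[name]
    return filled


# feeder_a is the key measurement; its value 10.0 stands for the model prediction.
current = {"feeder_a": None, "feeder_b": 10.4, "feeder_c": None}
errors = {"feeder_b": 0.2, "feeder_c": -0.3}
filled = fill_class(current, "feeder_a", key_value=10.0, mean_errors=errors)
```

Measurements that were actually received are left untouched; only the gaps are replaced by estimates.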

The invention has the beneficial effects that:

The invention makes effective use of the correlation among the measurement data, thereby reducing the scale of the problem to be solved and improving the efficiency of large-scale measurement data filling; time series features, data types and other influencing factors are added to the LSTM model, which improves the accuracy of data prediction.

Drawings

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a flow chart of a model training phase of the present invention;

FIG. 3 is a flow chart of the measurement patch phase of the present invention;

FIG. 4 is an overall block diagram of an LSTM model employed in the present invention;

FIG. 5 is an internal structure diagram of the LSTM model employed in the present invention;

FIG. 6 is a flowchart of the K-medoids clustering algorithm employed by the present invention.

Detailed Description

The present application is described in detail below with reference to the accompanying drawings. It is to be understood that the following detailed description is merely illustrative of the application and is not to be construed as limiting its scope; those skilled in the art may make numerous insubstantial modifications and adaptations of the application in light of the foregoing disclosure.

As shown in fig. 1, a method for supplementing measurement data of a medium-low voltage distribution network includes the following steps:

S1: obtaining historical sample measurement data, and processing the sample measurement data to obtain a sample input vector matrix and a sample response vector;

S2: constructing a measurement alignment model, inputting the sample input vector matrix and the sample response vector into the measurement alignment model for training to obtain a trained measurement alignment model, and optimizing the trained measurement alignment model to obtain a final measurement alignment model;

S3: acquiring measurement data at the current moment, and processing the measurement data to obtain a current input vector matrix and a current response vector;

S4: inputting the current input vector matrix and the current response vector into the final measurement alignment model to obtain a measurement predicted value;

S5: performing filling processing on the measurement data at the current moment based on the measurement predicted value, so as to fill in the measurement data of the medium-low voltage distribution network.

In this embodiment, the method specifically includes the following steps:

As shown in fig. 2, the model training phase:

Step 1: acquiring historical sample measurement data, that is, collecting the existing measurement data; classifying the existing measurements by the K-medoids clustering method, selecting main/standby key measurements from each class, and recording the average errors between the other measurements in the same class and the main/standby key measurements, wherein the other measurements are the non-key measurement data;

Step 2: for each main and standby key measurement, constructing, based on $n+1$ consecutive values $(a_1, a_2, \ldots, a_{n+1})$ in the measurement, a $9 \times n$ dimensional input vector matrix $S = [s_1, s_2, \ldots, s_9]^{T}$, wherein $s_1 = [a_1, a_2, \ldots, a_n]$ is the vector of measurement data, each element of which represents the measurement data value at one moment; $s_2, s_3, \ldots, s_7$ are respectively the n-dimensional vectors formed by the time series features (year, quarter, month, day, hour and minute) in one-to-one correspondence with the n measurement data values in vector $s_1$; vector $s_8$ represents the workday judgments (0 for a working day, 1 for a non-working day) in one-to-one correspondence with the n measurement data values in vector $s_1$; and vector $s_9$ represents the type of the measured object (0 for current, 1 for voltage, 2 for power);

Step 3: constructing the response vector $B = [b_1, b_2, \ldots, b_n]$, wherein $b_i = a_{i+1}$, $i = 1, 2, \ldots, n$; that is, in order to predict the value of the sequence at a future time step, the response vector is specified as the training sequence with its values shifted by one time step, so that at each time step of the input sequence the LSTM network learns to predict the value of the next time step;

Step 4: establishing a measurement alignment model with an LSTM model structure, which mainly comprises an input gate (controlling how much information of the candidate state at the current moment needs to be saved), a forget gate (controlling how much information of the internal state at the previous moment needs to be forgotten) and an output gate (controlling how much information of the internal state at the current moment needs to be output to the external state); setting the LSTM model training mode: the Adam optimization algorithm is adopted, the loss function is the root mean square error (RMSE) function, and the learning rate is 0.005; the input parameters are the preprocessed multidimensional input vector matrix constructed in step 2 and the response vector constructed in step 3; the model parameters are trained, with every main key measurement and every standby key measurement participating in training, to obtain measurement alignment models for different measurement types and moments;

Step 5: optimizing the hyperparameters (number of iterations, number of hidden layers, number of neurons in each layer, learning rate, etc.) of the established LSTM model by the Bayesian optimization method to obtain optimized network parameters, and reconstructing the LSTM model used for prediction with the optimized parameters.
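The one-step shift between the input sequence and the response vector in steps 2 and 3 can be shown in a few lines. The function name is hypothetical; the sketch only illustrates how n+1 consecutive values yield n input values and n targets.

```python
def shifted_pairs(series):
    """Split n+1 consecutive values into n inputs and their one-step-ahead targets."""
    if len(series) < 2:
        raise ValueError("at least two consecutive values are needed")
    return list(series[:-1]), list(series[1:])


inputs, targets = shifted_pairs([10.1, 10.3, 10.2, 10.4])
```

Each target is the value that follows the corresponding input, so the network is trained to predict the next time step at every position of the sequence.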

As shown in fig. 3, the measurement replenishment phase:

step 1: selecting quantity measurement to be predicted according to the missing condition of main key quantity measurement data at the current moment (t moment), if the quantity measurement is not missing, based on the main key quantity measurement, supplementing the quantity measurement with other missing quantity measurement according to the average error between the other quantity measurement and the main key quantity measurement in the class, jumping to the step 6, and if the quantity measurement is missing, comparing the quality (missing degree) of the data in the preamble period of the main key quantity measurement and the spare key quantity measurement, wherein the quality is good as the quantity measurement to be predicted;

step 2: based on the selected measurement to be predicted, the data of the first m times of the measurement to be predicted time (t time) are selected to construct an m-dimensional vector

And based on the m data constructs

Dimension input vector matrix +.>

The construction method is the same as the model training stage step 2;

step 3: constructing response vectors

, wherein />

，

，/>

Measurement of t time (absence) is given by ∈K>

；

Step 4: preprocessing an input vector matrix and a response vector, taking the preprocessed input vector matrix and the preprocessed response vector as input parameters, predicting based on a trained measurement alignment model for the measurement type, and outputting the output result as

Inverse normalized +.>

The measurement predicted value at the time t is obtained;

step 5: based on the main/standby key quantity measurement predicted value, compensating the measurement of other missing quantity according to the average error between the other quantity measurement in the same class and the main/standby key quantity measurement;

step 6: the above steps are repeated for each class of measurement until all of the measurement data in all classes are filled.
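The decision made in step 1 of the filling phase can be sketched as follows. The text compares the "degree of missing data" without defining it, so the sketch assumes it is the fraction of missing values in the preceding window and breaks ties in favour of the main key measurement; the function names and return labels are hypothetical.

```python
def missing_ratio(history):
    """ASSUMPTION: degree of missing data = share of None values in the window."""
    return sum(value is None for value in history) / len(history)


def choose_target(main_history, standby_history, main_current):
    """Decide how the current class is handled at time t."""
    if main_current is not None:
        return "main-available"  # fill non-key data directly, no prediction needed
    if missing_ratio(main_history) <= missing_ratio(standby_history):
        return "main"            # predict the main key measurement
    return "standby"             # predict the standby key measurement


decision = choose_target([10.1, None, 10.2, 10.4], [9.8, 9.9, 9.7, 9.9], None)
```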

In this embodiment, as shown in fig. 6, the specific steps of clustering by the K-medoids clustering method are as follows:

Step 1: setting the input data samples as $X$ ($N \times M$), wherein N is the number of data samples, M is the feature dimension of each data sample, and K is the given number of clusters, that is, the N data samples are grouped into K classes;

Step 2: randomly selecting K samples $\{o_1, o_2, \ldots, o_K\}$ from the original data samples as the initial cluster centers;

Step 3: cluster update: calculating the Euclidean distances from the remaining $N-K$ samples $\{x_1, x_2, \ldots, x_{N-K}\}$ to the K cluster centers, and assigning each remaining sample to the corresponding cluster according to the minimum Euclidean distance to obtain the clustering result. The Euclidean distance calculation formula is as follows:

$$d(x_i, o_j) = \sqrt{\sum_{l=1}^{M} \left(x_{il} - o_{jl}\right)^2} \qquad (1)$$

wherein $x_{il}$ represents the $l$-th element of sample data $x_i$, and $o_{jl}$ represents the $l$-th element of sample data $o_j$; $x_i$ and $o_j$ are both M-dimensional sample data vectors, and each element of a vector represents the measurement data at a certain moment.

Step 4: cluster center update: traverse all sample points in each class cluster and update the cluster center, taking as the objective function the minimum sum of Euclidean distances from all other (T−1) points in the cluster to the center:

$$\min\left\{ D_1 = \sum_{j=1}^{T} d_{j1},\ \ldots,\ D_T = \sum_{j=1}^{T} d_{jT} \right\} \tag{2}$$

Suppose a class cluster contains T sample points. Each sample point is taken in turn as the cluster center, and the sum of Euclidean distances from all other points in the cluster to it is calculated. Here $D_1$ is the sum of Euclidean distances from all other points in the cluster to the center when the first sample point is the cluster center, and so on; $d_{j1}$ is the Euclidean distance from the j-th sample point to the 1st sample point (the center), and $d_{jT}$ is the Euclidean distance from the j-th sample point to the T-th sample point taken as the center; $d_{11}, d_{22}, \ldots, d_{TT}$ are all 0;

Step 5: repeat the cluster update and the cluster center update until no cluster center and no clustering result changes any more, or until the maximum number of iterations specified in advance is reached; the clustering then ends;

Step 6: the K cluster center points obtained are the main key measurements. For each of the K classes, calculate and record the average error between every other measurement in the class and the main key measurement of that class, using the following formula:

（3）；

where the j-th element of the average error vector of a class is calculated from the k-th element of the main key measurement data and the k-th element of the j-th non-key measurement data; both are M-dimensional sample data vectors, each element of which is the measurement data at a certain moment;

Step 7: remove the K selected main key measurements from their class clusters, and search for the cluster center point among the remaining data samples in each class cluster;

Step 8: the K new cluster center points obtained are the standby key measurements. For each of the K classes, calculate and record the average error between every other measurement in the class and the standby key measurement of that class, using the following formula:

（4）；

where the j-th element of the average error vector of a class is calculated from the k-th element of the standby key measurement data and the k-th element of the j-th non-key measurement data; both are M-dimensional sample data vectors, each element of which is the measurement data at a certain moment.
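The clustering steps above can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: the function names, the toy data and the fixed random seed are hypothetical, and the sign convention in `mean_error` (other measurement minus key measurement) is an assumption because formulas (3) and (4) are not recoverable from the text.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two M-dimensional samples (step 3)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def medoid(members: List[int], data: List[Vector]) -> int:
    """Member whose summed distance to all other members is smallest (step 4)."""
    return min(members,
               key=lambda c: sum(euclidean(data[j], data[c]) for j in members))


def assign(data: List[Vector], centers: List[int]) -> List[List[int]]:
    """Put every non-center sample into the cluster of its nearest center."""
    clusters = [[c] for c in centers]
    for i in range(len(data)):
        if i not in centers:
            nearest = min(range(len(centers)),
                          key=lambda c: euclidean(data[i], data[centers[c]]))
            clusters[nearest].append(i)
    return clusters


def k_medoids(data: List[Vector], k: int, max_iter: int = 100,
              seed: int = 0) -> Tuple[List[int], List[List[int]]]:
    centers = random.Random(seed).sample(range(len(data)), k)  # step 2
    for _ in range(max_iter):                                  # step 5
        new_centers = [medoid(m, data) for m in assign(data, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(data, centers)


def main_and_standby(data: List[Vector], k: int):
    """Main keys are the medoids; standby keys are medoids of what remains."""
    main, clusters = k_medoids(data, k)
    standby: List[Optional[int]] = []
    for center, members in zip(main, clusters):
        rest = [i for i in members if i != center]             # step 7
        standby.append(medoid(rest, data) if rest else None)   # step 8
    return main, standby, clusters


def mean_error(other: Vector, key: Vector) -> float:
    """Average error over the M elements; the sign convention is ASSUMED."""
    return sum(o - q for o, q in zip(other, key)) / len(key)


data = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0],
        [5.0, 5.0], [5.1, 5.0], [5.2, 5.0]]
main, standby, clusters = main_and_standby(data, 2)
print(main, standby, clusters)
```

Because the cluster center is always an actual sample, the selected main and standby key measurements are real measurement series rather than synthetic averages, which is what allows them to be predicted directly.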

In this embodiment, a measurement alignment model is constructed; the sample input vector matrix and the sample response vector are input into it for training to obtain a trained measurement alignment model; and the trained model is then optimized to obtain the final measurement alignment model. The specific steps are as follows:

As shown in fig. 5, the measurement alignment model adopts an LSTM model with a double-layer structure, the number of neurons in the hidden layer being 96×3. A single LSTM unit consists of an input gate, a forget gate and an output gate, with the following formulas:

$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \tag{5}$$

$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \tag{6}$$

$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \tag{7}$$

$$\tilde{c}_t = \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \tag{8}$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{9}$$

$$h_t = o_t \odot \tanh\left(c_t\right) \tag{10}$$

In the formulas, $i_t$ is the state of the input gate at the current moment, $f_t$ is the state of the forget gate at the current moment, $o_t$ is the state of the output gate at the current moment, $c_t$ is the state of the LSTM at the current moment, $c_{t-1}$ is the state of the LSTM at the previous moment, and $\tilde{c}_t$ is the candidate state of the LSTM at the current moment; $c_{t-1}$ and $\tilde{c}_t$ are used to calculate the current cell state $c_t$; $x_t$ is the input of the LSTM at the current moment; $W_i$ is the weight from the input $x_t$ to the input gate and $U_i$ is the weight from the hidden layer $h_{t-1}$ of the previous moment to the input gate; $W_f$ is the weight from the input to the forget gate and $U_f$ is the weight from the hidden layer of the previous moment to the forget gate; $W_o$ is the weight from the input to the output gate and $U_o$ is the weight from the hidden layer of the previous moment to the output gate; $W_c$ is the weight of the input $x_t$ in the feature extraction process and $U_c$ is the weight of the hidden layer $h_{t-1}$ of the previous moment in the feature extraction process; $b_i$, $b_f$, $b_o$ and $b_c$ are bias values; $\sigma$ and $\tanh$ are the sigmoid and hyperbolic tangent activation functions; and the symbol $\odot$ denotes the Hadamard product.
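A single time step of a standard LSTM unit with input, forget and output gates can be sketched in plain Python as below. The parameter names (`W_i`, `U_i`, `b_i` and so on), the dictionary layout and the toy sizes are assumptions made for illustration; the patent's double-layer network with 96×3 hidden neurons is not reproduced.

```python
import math
from typing import Dict, List, Tuple

Vec = List[float]
Mat = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(W: Mat, x: Vec, U: Mat, h: Vec, b: Vec) -> Vec:
    """Compute W x + U h + b row by row."""
    return [sum(w * xi for w, xi in zip(w_row, x))
            + sum(u * hi for u, hi in zip(u_row, h)) + bi
            for w_row, u_row, bi in zip(W, U, b)]


def lstm_step(x: Vec, h_prev: Vec, c_prev: Vec,
              p: Dict[str, list]) -> Tuple[Vec, Vec]:
    """One LSTM time step; returns the new hidden state and cell state."""
    i = [sigmoid(z) for z in affine(p["W_i"], x, p["U_i"], h_prev, p["b_i"])]
    f = [sigmoid(z) for z in affine(p["W_f"], x, p["U_f"], h_prev, p["b_f"])]
    o = [sigmoid(z) for z in affine(p["W_o"], x, p["U_o"], h_prev, p["b_o"])]
    g = [math.tanh(z) for z in affine(p["W_c"], x, p["U_c"], h_prev, p["b_c"])]
    # Cell state: forget part of the old state, add the gated candidate state.
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    # Hidden state: output gate applied elementwise (Hadamard product).
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


# Toy example: 1 input feature, 2 hidden units, all parameters zero,
# so every gate equals 0.5 and the candidate state equals 0.
hidden, n_in = 2, 1
params = {}
for gate in ("i", "f", "o", "c"):
    params["W_" + gate] = [[0.0] * n_in for _ in range(hidden)]
    params["U_" + gate] = [[0.0] * hidden for _ in range(hidden)]
    params["b_" + gate] = [0.0] * hidden

h, c = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
print(h, c)
```

With all parameters at zero the forget gate halves the previous cell state, which makes the gating arithmetic easy to check by hand.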

The input vector matrix constructed in step 2 of the model training stage and the response vector constructed in step 3 of the model training stage are preprocessed: first, a Symlet wavelet function is selected to denoise the vector of measurement data in the input vector matrix and the data in the response vector; then all data in the input vector matrix and the response vector are normalized to zero mean and unit variance;
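The normalization part of this preprocessing, together with the inverse normalization needed later to recover predicted values, can be sketched as follows. The wavelet denoising step is deliberately omitted because it needs a wavelet library; the function names and the choice of the population standard deviation are assumptions.

```python
import statistics
from typing import List, Tuple


def fit_zscore(values: List[float]) -> Tuple[float, float]:
    """Return the mean and (population) standard deviation of the data."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0.0:
        raise ValueError("constant series cannot be scaled to unit variance")
    return mean, std


def normalize(values: List[float], mean: float, std: float) -> List[float]:
    """Scale the data to zero mean and unit variance."""
    return [(v - mean) / std for v in values]


def denormalize(values: List[float], mean: float, std: float) -> List[float]:
    """Inverse normalization, used to map model outputs back to real units."""
    return [v * std + mean for v in values]


values = [1.0, 2.0, 3.0, 4.0]
mean, std = fit_zscore(values)
z = normalize(values, mean, std)
restored = denormalize(z, mean, std)
print(z, restored)
```

The same `mean` and `std` fitted on the training data would normally be reused at prediction time, so that the inverse normalization of the model output is consistent with the training scale.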

The input vector matrix and the response vector are divided into a training set, a validation set and a test set in the ratio 8:1:1; the training set and the validation set are used to train the model and determine its parameters (the weights W and U and the bias b), and the test set is used to check the generalization ability of the model.
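An 8:1:1 split can be sketched as below. The text does not say whether the split is chronological or random; this sketch assumes a chronological split, which is a common choice for time series, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def split_8_1_1(samples: Sequence) -> Tuple[List, List, List]:
    """Split samples in order into training, validation and test sets."""
    n = len(samples)
    n_train = n * 8 // 10   # integer arithmetic avoids float rounding
    n_val = n // 10
    train = list(samples[:n_train])
    val = list(samples[n_train:n_train + n_val])
    test = list(samples[n_train + n_val:])  # whatever remains
    return train, val, test


train, val, test = split_8_1_1(list(range(20)))
print(len(train), len(val), len(test))
```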

The input vector matrix and response vector in the training set are used as the LSTM input, and the output is finally obtained through the fully connected layer; the input vector matrix and response vector in the validation set are then used as the LSTM input to correct the parameters of the LSTM model, yielding the trained LSTM model, that is, the trained measurement alignment model.

The LSTM model is trained with the Adam algorithm, with the following weight update formulas:

$$m_t = \beta_1 m_{t-1} + \left(1 - \beta_1\right) g_t \tag{11}$$

$$v_t = \beta_2 v_{t-1} + \left(1 - \beta_2\right) g_t^2 \tag{12}$$

$$\theta_t = \theta_{t-1} - \eta \frac{m_t}{\sqrt{v_t} + \epsilon} \tag{13}$$

In the formulas, $\theta_{t-1}$ and $\theta_t$ are the network weight parameters updated in adjacent time steps, $\epsilon$ is the smoothing parameter, $\eta$ is the learning rate, $\beta_1$ and $\beta_2$ are the exponential decay rates of the first and second moment estimates respectively, $m_{t-1}$ is the first moment estimate of the gradient at time step t−1, $g_t$ is the gradient at time step t, $v_{t-1}$ is the second moment estimate of the gradient at time step t−1, and $g_t^2$ is the square of the gradient at time step t;
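One Adam update can be sketched in plain Python as follows. The sketch follows the widely used form of Adam, which also applies bias correction to the two moment estimates; the text does not say whether the patent does so. The default values and the toy objective are assumptions.

```python
import math
from typing import List, Tuple


def adam_step(theta: List[float], grad: List[float],
              m: List[float], v: List[float], t: int,
              lr: float = 0.001, beta1: float = 0.9,
              beta2: float = 0.999, eps: float = 1e-8
              ) -> Tuple[List[float], List[float], List[float]]:
    """Apply one Adam update at time step t (t starts at 1)."""
    # First and second moment estimates of the gradient.
    m = [beta1 * mi + (1.0 - beta1) * g for mi, g in zip(m, grad)]
    v = [beta2 * vi + (1.0 - beta2) * g * g for vi, g in zip(v, grad)]
    # Bias correction (standard Adam; not stated in the text).
    m_hat = [mi / (1.0 - beta1 ** t) for mi in m]
    v_hat = [vi / (1.0 - beta2 ** t) for vi in v]
    theta = [th - lr * mh / (math.sqrt(vh) + eps)
             for th, mh, vh in zip(theta, m_hat, v_hat)]
    return theta, m, v


# Toy objective: minimize (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta, m, v = [0.0], [0.0], [0.0]
first_theta = None
for t in range(1, 201):
    grad = [2.0 * (theta[0] - 3.0)]
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
    if t == 1:
        first_theta = theta[0]
print(first_theta, theta[0])
```

On the first step the bias-corrected update has magnitude close to the learning rate regardless of the gradient scale, which is why `first_theta` is approximately 0.1 here.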

The root mean square error (RMSE) is defined as the loss function of model training:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{14}$$

where $n$ is the total number of measurement data to be predicted, $y_i$ is the true value of the measurement data to be predicted, and $\hat{y}_i$ is the predicted value of the measurement data output by the LSTM model.
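The RMSE loss can be computed with a few lines of Python; the function name and the sample values are illustrative only.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


loss = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(loss)
```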

In this embodiment, the Bayesian optimization method used in step 5 of the model training stage is as follows:

(1) Bayesian optimization framework

Step 1: the independent variable $x$ ranges over the hyperparameter space, and the loss function of model training (RMSE) is used as the objective function $f(x)$ of the Bayesian framework;

Step 2: select a number of observation points and calculate the value of $f(x)$ at each of them; these values are the observations of the observation model;

step 3: order the

；

Step 4: estimate the function from the finite observations (in Bayesian optimization this assumption is called the prior assumption), and from it estimate the minimum of the target value on the assumed function distribution;

Step 5: based on the current observation data D, calculate the acquisition function and determine the next observation point; calculate the function value at that point, add the new observation to D, and update the probabilistic surrogate model;

Step 6: repeat steps 4 and 5 until the target value on the assumed distribution reaches a preset standard or the preset maximum number of observations P is reached;

Step 7: output the observation point $x$ corresponding to the best value $y$ among the observations; this $x$ is the optimized hyperparameter.
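The control flow of this framework (evaluate initial points, repeatedly propose and evaluate a new point, return the best observation) can be sketched as below. The surrogate model and acquisition function are abstracted into a hypothetical `propose` callback; the midpoint rule used in the example is only a stand-in to make the loop runnable and is not itself Bayesian optimization.

```python
from typing import Callable, List, Sequence, Tuple

Observation = Tuple[float, float]  # (observation point x, objective value y)


def sequential_optimize(
    objective: Callable[[float], float],
    initial_points: Sequence[float],
    propose: Callable[[List[Observation]], float],
    max_observations: int,
) -> Tuple[Observation, List[Observation]]:
    """Loop skeleton: observe, propose the next point, observe again."""
    # Evaluate the objective at the initial observation points.
    observations = [(x, objective(x)) for x in initial_points]
    # Propose and evaluate until the observation budget P is used up.
    while len(observations) < max_observations:
        x_next = propose(observations)
        observations.append((x_next, objective(x_next)))
    # Return the point with the smallest objective value (the loss).
    best = min(observations, key=lambda obs: obs[1])
    return best, observations


def midpoint_of_two_best(observations: List[Observation]) -> float:
    """Stand-in proposal rule (NOT a Gaussian-process acquisition)."""
    ranked = sorted(observations, key=lambda obs: obs[1])
    return 0.5 * (ranked[0][0] + ranked[1][0])


best, history = sequential_optimize(
    objective=lambda x: (x - 0.3) ** 2,   # toy loss with minimum at x = 0.3
    initial_points=[0.0, 0.5, 1.0],
    propose=midpoint_of_two_best,
    max_observations=8,
)
print(best)
```

In an actual Bayesian optimization run, `propose` would fit the probabilistic surrogate model to the observations and maximize the acquisition function over the hyperparameter space.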

(2) The tool used in step 4 of the Bayesian optimization framework to estimate the function distribution (the probabilistic surrogate model) is as follows:

The probabilistic surrogate model comprises a prior probability model $p(f)$ and an observation model $p(D \mid f)$. Updating the probabilistic surrogate model means obtaining, according to the following formula, a posterior probability distribution $p(f \mid D)$ that contains more data information:

$$p(f \mid D) = \frac{p(D \mid f)\, p(f)}{p(D)} \tag{15}$$

where $f$ is the objective function, $D$ is the set of observations, $p(D \mid f)$ is the likelihood distribution of the observations, $p(f)$ is the prior probability distribution of $f$, and $p(f \mid D)$ is the posterior probability distribution of $f$.

Gaussian process regression is adopted as the probabilistic surrogate model. From a small number of observation points it estimates the distribution of the objective function (including the value at each point and the confidence level corresponding to that point). The Gaussian process regression obeys the k-dimensional normal distribution:

$$f(x) \sim \mathcal{N}\left(m(x),\ k(x, x')\right) \tag{16}$$

where $x$ is an n-dimensional vector, $m(x)$ is the mean function, and $k(x, x')$ is the covariance function.

(3) The acquisition function in step 5 of the Bayesian optimization framework is as follows:

The acquisition function measures the influence of an observation point on the fitted function distribution and selects the point with the largest influence for the next observation. Expected improvement (EI) is adopted as the acquisition function to determine the next observation point: if a point brings the largest expected improvement over the current maximum, it is selected as the next observation point. The calculation formulas are:

（17）；

（18）；

In formulae (17) to (18), $x_i$ is the position observed at step i and $\mu_{s+1}$ is the posterior mean of the surrogate model at time s+1;
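For reference, the expected improvement has a well-known closed form when the surrogate's posterior at a point is Gaussian with mean `mu` and standard deviation `sigma`. The sketch below uses the minimization form (improvement below the best loss observed so far), which suits an RMSE objective; it is not a reconstruction of formulae (17) and (18), and all names are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple


def normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu: float, sigma: float, best: float) -> float:
    """Closed-form EI for minimization under a Gaussian posterior."""
    if sigma <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    return (best - mu) * normal_cdf(z) + sigma * normal_pdf(z)


def next_point(candidates: Sequence[float],
               posterior: Callable[[float], Tuple[float, float]],
               best: float) -> float:
    """Pick the candidate with the largest expected improvement."""
    return max(candidates,
               key=lambda x: expected_improvement(*posterior(x), best))


ei_at_best = expected_improvement(mu=1.0, sigma=1.0, best=1.0)
chosen = next_point(
    candidates=[0.0, 1.0],
    # Hypothetical posterior: same mean everywhere, more uncertainty at x = 1.
    posterior=lambda x: (1.0, 0.1 if x == 0.0 else 1.0),
    best=1.0,
)
print(ei_at_best, chosen)
```

When two candidates have the same posterior mean, the one with the larger uncertainty has the larger EI, which is how the acquisition function balances exploration against exploitation.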

The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention.

Claims

1. The method for supplementing measurement data of the medium-low voltage distribution network is characterized by comprising the following steps of:

acquiring historical sample measurement data;

constructing a sample input vector matrix from the historical sample measurement data, wherein one vector of the matrix consists of the sample measurement data, its elements being the measurement data values at the first, second, third, ..., (n−1)-th and n-th time instants, and the remaining vectors represent the time sequence features, the judgment about the working day, and the type of the measurement object;

constructing a sample response vector based on the sample input vector matrix, wherein $b_1$ represents the first data value in the sample response vector, $b_2$ the second data value, $b_3$ the third data value, $b_{n-1}$ the (n−1)-th data value, and $b_n$ the n-th data value;

randomly selecting K sample data from the sample measurement data as the initial cluster centers, these being the first, second, third, ..., K-th sample data;

calculating the Euclidean distance from each of the remaining N−K sample data (the first, second, third, ... of the remaining sample data) to the K cluster centers:

$$d(x_i, c_j) = \sqrt{\sum_{l=1}^{M} \left(x_{il} - c_{jl}\right)^2} \tag{1}$$

wherein $x_{il}$ represents the l-th element of sample data $x_i$, and $c_{jl}$ represents the l-th element of sample data $c_j$;

（2）；

wherein $d_{j2}$ represents the Euclidean distance from the j-th sample point to the 2nd cluster center, and $d_{jk}$ represents the Euclidean distance from the j-th sample point to the k-th cluster center;

taking the K cluster center points as the main key measurement data, and calculating, for each of the K classes, the average error between the non-key measurement data and the main key measurement data of that class; the number of non-key measurement data in each class is set, the non-key measurement data in class i being the first, second, third, ..., up to the last non-key measurement data of that class; the average error vectors of the first, second, third, ... classes are obtained by:

（3）；

wherein the j-th element of an average error vector is calculated from the k-th element of the main key measurement data and the k-th element of the non-key measurement data;

taking the new K cluster center points (the first, second, third, ..., K-th of the new cluster centers) as the standby key measurement data, and calculating, for each of the K classes, the new average error between the non-key measurement data and the standby key measurement data of that class; the number of non-key measurement data in each class is set, the non-key measurement data in class i being the first, second, third, ..., up to the last non-key measurement data of that class; the new average error vectors of the first, second, third, ... classes are obtained by:

（4）；

wherein the j-th element of a new average error vector is calculated from the k-th element of the standby key measurement data and the k-th element of the non-key measurement data.

2. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 1, characterized in that: the time sequence features comprise year, quarter, month, day, hour and minute; the judgment about the working day is represented by 0 for a working day and 1 for a non-working day; and the measurement object types comprise current, voltage and power, represented by 0, 1 and 2 respectively.

3. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 1, characterized in that: the measurement alignment model is an LSTM model comprising an input gate, a forget gate and an output gate and adopting a double-layer structure, with the following formulas:

$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \tag{5}$$

$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \tag{6}$$

$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \tag{7}$$

$$\tilde{c}_t = \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \tag{8}$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{9}$$

$$h_t = o_t \odot \tanh\left(c_t\right) \tag{10}$$

in formulas (5) to (10), $i_t$ is the state of the input gate at the current moment, $f_t$ is the state of the forget gate at the current moment, $o_t$ is the state of the output gate at the current moment, $c_t$ is the state of the LSTM model at the current moment, and $c_{t-1}$ is the state of the LSTM model at the previous moment; $c_{t-1}$ and $\tilde{c}_t$ are used to calculate the current cell state $c_t$; $x_t$ is the input of the LSTM model at the current moment; $W_i$ is the weight from the input to the input gate and $U_i$ is the weight from the hidden layer of the previous moment to the input gate; $W_f$ is the weight from the input to the forget gate and $U_f$ is the weight from the hidden layer of the previous moment to the forget gate; $W_o$ is the weight from the input to the output gate and $U_o$ is the weight from the hidden layer of the previous moment to the output gate; $W_c$ is the weight of the input $x_t$ in the feature extraction process and $U_c$ is the weight of the hidden layer $h_{t-1}$ of the previous moment in the feature extraction process; $b_i$, $b_f$, $b_o$ and $b_c$ are bias values; $\sigma$ and $\tanh$ are the sigmoid and hyperbolic tangent activation functions; and the symbol $\odot$ denotes the Hadamard product.

4. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 1, characterized in that inputting the sample input vector matrix and the sample response vector into the measurement alignment model for training to obtain the trained measurement alignment model comprises the following specific process:

denoising the vectors in the sample input vector matrix with a Symlet wavelet function;

5. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 4, characterized in that the parameters of the measurement alignment model comprise weights and bias values.

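6. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 4, characterized in that the measurement alignment model is trained by the Adam algorithm, with the following weight update formulas:

$$m_t = \beta_1 m_{t-1} + \left(1 - \beta_1\right) g_t \tag{11}$$

$$v_t = \beta_2 v_{t-1} + \left(1 - \beta_2\right) g_t^2 \tag{12}$$

$$\theta_t = \theta_{t-1} - \eta \frac{m_t}{\sqrt{v_t} + \epsilon} \tag{13}$$

in formulae (11) to (13), $\theta_{t-1}$ and $\theta_t$ are the network weight parameters updated in adjacent time steps, $\epsilon$ is the smoothing parameter, $\eta$ is the learning rate, $\beta_1$ and $\beta_2$ are the exponential decay rates of the first and second moment estimates respectively, $m_{t-1}$ is the first moment estimate of the gradient at time step t−1, $g_t$ is the gradient at time step t, $v_{t-1}$ is the second moment estimate of the gradient at time step t−1, and $g_t^2$ is the square of the gradient at time step t;

the loss function of model training is the root mean square error:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{14}$$

in the formula, $n$ is the total number of measurement data to be predicted, $y_i$ is the true value of the measurement data to be predicted, and $\hat{y}_i$ is the predicted value of the measurement data output by the measurement alignment model.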

7. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 6, characterized in that optimizing the trained measurement alignment model to obtain the final measurement alignment model comprises the following specific steps:

8. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 7, characterized in that the hyperparameters of the trained measurement alignment model comprise the number of iterations, the number of hidden layers, the number of neurons in each layer, and the learning rate.

9. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 7, characterized in that the hyperparameters of the trained measurement alignment model are optimized by a Bayesian optimization method to obtain the optimized network parameters, the specific process being as follows:

setting the objective function of the Bayesian framework, whose independent variable represents the hyperparameters;

selecting a number of observation points and calculating the objective function value at each observation point;

estimating, based on the observations, the function distribution of the objective function and the minimum of the target value on that distribution;

setting the current observation data D, calculating a preset acquisition function based on D, determining the next observation point, calculating the function value at the next observation point, adding it to D, and updating a preset probabilistic surrogate model.

10. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 9, characterized in that the probabilistic surrogate model is a Gaussian process regression model, which obeys the k-dimensional normal distribution with the following formula:

$$f(x) \sim \mathcal{N}\left(m(x),\ k(x, x')\right) \tag{15}$$

wherein $x$ is an n-dimensional vector, $m(x)$ is the mean function, and $k(x, x')$ is the covariance function.

11. The method according to claim 9, characterized in that the expected improvement function is used as the acquisition function to determine the next observation point, with the following formulas:

（16）；

（17）；

in formulae (16) to (17), $x_i$ is the position observed at step i and $\mu_{s+1}$ is the posterior mean of the surrogate model at time s+1.

12. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 1, characterized in that acquiring the measurement data at the current moment and processing them to obtain a current input vector matrix and a current response vector comprises the following specific steps:

acquiring the measurement data at the current moment;

constructing the current input vector matrix from the measurement data;

constructing the current response vector, in which the entry for the measurement value at time t is given an assigned value.

13. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 12, characterized in that inputting the current input vector matrix and the current response vector into the final measurement alignment model to obtain the measurement predicted value comprises: obtaining an output vector of measurement predicted values, whose elements are the measurement predicted values at the first, second, third, ..., (m−1)-th time instants, and whose last element is the measurement predicted value at time t.

14. The method for supplementing measurement data of a medium-low voltage distribution network according to claim 1, wherein the method for supplementing the measurement data at the current moment based on the measurement predicted value comprises the following steps: