Disclosure of Invention
Purpose of the invention: in view of the problems in the prior art, the invention provides a method and a device for identifying dynamic parameters of a power distribution network based on a probabilistic graphical model. The method analyzes the influence of the operating state of the power distribution network and of external environmental factors on the dynamic characteristics of the network parameters, infers the probability distribution of the dynamic parameters using the probabilistic graphical model, and thereby identifies the dynamic parameters of the power distribution network accurately.
The technical scheme is as follows: in order to achieve the above object, in one aspect, the present invention provides a method for identifying dynamic parameters of a power distribution network, including the following steps:
carrying out data preprocessing on the collected power distribution network operation data and the collected external environment data to generate a power distribution network dynamic parameter identification sample;
discretizing a dynamic parameter identification sample of the power distribution network to obtain a discretization sample;
acquiring parameters of a probabilistic graphical model according to the discretization sample based on a pre-established probabilistic graphical model, wherein the probabilistic graphical model is a two-time-slice probabilistic graphical model;
and acquiring dynamic parameters of the power distribution network based on a belief propagation algorithm according to the observation variables and the probabilistic graphical model whose parameters have been acquired.
Further, the step of establishing the probabilistic graphical model comprises:
selecting the temperature, humidity, feeder section voltage drop and feeder section transmission power at a given time as observation variables of the probabilistic graphical model, and selecting the impedance of the line at that time as the hidden variable of the probabilistic graphical model;
adding the variables into the probabilistic graphical model one by one according to the causal relationships between the observation variables and the hidden variable within a single time slice, to construct a static Bayesian network;
setting an initial time slice, and specifying the prior probability distribution of each variable under that time slice;
and specifying the causal relationships of the states between adjacent time slices to construct a transition model.
Further, the generating a distribution network dynamic parameter identification sample comprises:
performing quadratic spline interpolation on the external environment data so that the sampling frequencies of the data from the different data sources are identical;
merging the data from the different data sources, and eliminating redundant fields in the data;
and removing repeated records from the power distribution network operation data and removing null values.
Further, the power distribution network dynamic parameter identification sample is discretized by maximizing the information entropy, which is calculated as follows:
$$H(Z) = -\sum_{s=1}^{m} \frac{N_{count}(Z=s)}{N_{amount}(Z)} \log \frac{N_{count}(Z=s)}{N_{amount}(Z)}$$
wherein $Z$ is a hidden variable, $m$ is the number of discrete intervals, $N_{count}(Z=s)$ is the number of samples in the data whose hidden variable is in state $s$, and $N_{amount}(Z)$ is the total number of samples.
Further, the obtaining of the parameters of the probability map model includes:
based on the probability mass function, obtaining an initial probability distribution table of each variable according to the discretization sample;
calculating a conditional probability distribution table among the variables according to the discretization sample based on the expectation-maximization (EM) algorithm;
counting a transition probability distribution table of each variable from t moment to t +1 moment from continuous data samples on a time axis;
and determining the correctness of the conditional probability distribution table by checking whether the sum of the probability distributions of each variable is 1 or not and whether the conditional probability distribution is consistent with the causal relationship in the Bayesian network or not.
Further, the expression of the probability mass function is:
$$P(Z_0 = s) = \frac{N_{count}(Z=s)}{N_{amount}(Z)}$$
wherein $P(Z_0 = s)$ is the probability that the hidden variable is initially in state $s$; $N_{count}(Z=s)$ is the number of samples in the data whose hidden variable is in state $s$; and $N_{amount}(Z)$ is the total number of samples.
Further, calculating the conditional probability distribution table among the variables according to the discretization sample based on the expectation-maximization (EM) algorithm includes:
calculating the posterior probability of the hidden variable, according to the initial value of the conditional probability or the conditional probability obtained in the previous iteration, as the current expected value of the hidden variable, wherein the expression is as follows:
$$P_{posterior}(Z) = P(Z \mid X; \theta_{cpt})$$
wherein $Z$ is the hidden variable, $P_{posterior}(Z)$ is the posterior probability of the hidden variable, $\theta_{cpt}$ is the conditional probability distribution table in the probabilistic graphical model, and $X$ is the observation variable;
updating the conditional probability distribution table with maximization of the likelihood function as the objective, wherein the expression is as follows:
$$\theta_{cpt} = \arg\max_{\theta_{cpt}} \sum_{i=1}^{m} P_{posterior}(Z_i) \log P(X, Z_i; \theta_{cpt})$$
wherein $m$ is the number of hidden variable states, $Z_i$ is the $i$-th state of the hidden variable, and $P(X, Z_i; \theta_{cpt})$ is the joint probability of the observation variable and the hidden variable evaluated from the samples;
and ending the iteration of the EM algorithm when the conditional probability distribution table in the probabilistic graphical model maximizes the probability of the training data samples.
Further, the expression of the transition probability distribution table is:
$$P(Z_t = s_2 \mid Z_{t-1} = s_1) = \frac{N_{count}(s_1, s_2)}{N_{amount}(s_1)}$$
wherein $P(Z_t = s_2 \mid Z_{t-1} = s_1)$ represents the probability that the hidden variable $Z$ transitions from state $s_1$ to state $s_2$ between time $t-1$ and time $t$; $N_{count}(s_1, s_2)$ represents the number of times in the acquired historical data that the hidden variable $Z$ transitions from state $s_1$ to state $s_2$ between time $t-1$ and time $t$; and $N_{amount}(s_1)$ represents the number of samples in the acquired historical data in which the hidden variable $Z$ is in state $s_1$.
Further, the obtaining of the dynamic parameters of the power distribution network based on the belief propagation algorithm includes:
initializing the probability distribution of each variable according to the sample;
randomly selecting a state variable $Y$ in the network, and updating the belief $b(Y_t)$ of the node:
$$b(Y_t) \propto \phi(Y_t, X_t) \prod_{x \in G} m_{xY}(Y_t)$$
wherein $\phi(Y_t, X_t)$ is the likelihood function between the state variable $Y$ and the observation variable $X$ at time $t$ and represents the joint compatibility of node $Y$ at time $t$; $G$ is the first-order neighborhood of node $Y$; and $m_{xY}(Y_t)$ is the message passed from node $x$ to node $Y$;
updating the messages between variables:
$$m_{Y_{t-1}Y_t}(Y_t) = \sum_{Y_{t-1}} \phi(Y_{t-1}, X_{t-1})\, \psi(Y_t, Y_{t-1}) \prod_{x \in G \setminus Y_t} m_{xY_{t-1}}(Y_{t-1})$$
wherein $\psi(Y_t, Y_{t-1})$ is the potential between node $Y$ at time $t-1$ and node $Y$ at time $t$, and the product runs over the neighbors of $Y_{t-1}$ other than $Y_t$;
repeating the above until the convergence condition is satisfied:
$$\left| b^{(n)}(Y_t) - b^{(n-1)}(Y_t) \right| < 10^{-5}$$
and taking the final belief $b(Y_t)$ of the hidden variable as the estimate of the probability distribution of the hidden variable over the state intervals.
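The message-passing steps above can be sketched for the simplest case, a chain of hidden states unrolled over time, where each node has one evidence potential and the only neighbors are the previous and next time slices. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the synchronous update schedule and the numeric potentials `phi` and `psi` are all hypothetical. A brute-force enumeration is included only to check the beliefs on a small example.

```python
from itertools import product

import numpy as np


def chain_belief_propagation(phi, psi, tol=1e-5, max_iter=100):
    """Sum-product message passing on a chain of hidden states.

    phi[t, a]  : evidence potential of state a at time t, i.e. phi(Y_t, X_t)
    psi[a, b]  : compatibility of Y_{t-1} = a with Y_t = b, i.e. psi(Y_t, Y_{t-1})
    """
    T, m = phi.shape
    fwd = np.ones((T, m))  # fwd[t]: message arriving at node t from node t-1
    bwd = np.ones((T, m))  # bwd[t]: message arriving at node t from node t+1
    beliefs = phi / phi.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        new_fwd, new_bwd = np.ones((T, m)), np.ones((T, m))
        for t in range(1, T):
            # Sum over Y_{t-1} of evidence * compatibility * other incoming message.
            msg = (phi[t - 1] * fwd[t - 1]) @ psi
            new_fwd[t] = msg / msg.sum()
        for t in range(T - 1):
            msg = psi @ (phi[t + 1] * bwd[t + 1])
            new_bwd[t] = msg / msg.sum()
        fwd, bwd = new_fwd, new_bwd
        # Belief is proportional to local evidence times all incoming messages.
        new_beliefs = phi * fwd * bwd
        new_beliefs /= new_beliefs.sum(axis=1, keepdims=True)
        converged = np.abs(new_beliefs - beliefs).max() < tol
        beliefs = new_beliefs
        if converged:
            break
    return beliefs


def brute_force_marginals(phi, psi):
    """Exact marginals by enumerating every state sequence (small cases only)."""
    T, m = phi.shape
    marg = np.zeros((T, m))
    for ys in product(range(m), repeat=T):
        w = np.prod([phi[t, y] for t, y in enumerate(ys)])
        w *= np.prod([psi[ys[t - 1], ys[t]] for t in range(1, T)])
        for t, y in enumerate(ys):
            marg[t, y] += w
    return marg / marg.sum(axis=1, keepdims=True)


# Hypothetical potentials for three impedance states over four time steps.
psi = np.array([[0.80, 0.15, 0.05],
                [0.10, 0.80, 0.10],
                [0.05, 0.15, 0.80]])
phi = np.array([[0.6, 0.3, 0.1],
                [0.5, 0.4, 0.1],
                [0.1, 0.3, 0.6],
                [0.1, 0.2, 0.7]])
beliefs = chain_belief_propagation(phi, psi)
exact = brute_force_marginals(phi, psi)
```

On a chain the graph has no loops, so the converged beliefs coincide with the exact marginals; on graphs with loops the same updates only give an approximation.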
In another aspect, the present invention provides a device for identifying dynamic parameters of a power distribution network, including:
the data preprocessing module is used for preprocessing the acquired running data and external environment data of the power distribution network to generate a dynamic parameter identification sample of the power distribution network;
the discretization processing module is used for discretizing the dynamic parameter identification sample of the power distribution network to obtain a discretization sample;
the model parameter determining module is used for acquiring parameters of the probabilistic graphical model according to the discretization sample based on the pre-established probabilistic graphical model, wherein the probabilistic graphical model is a two-time-slice probabilistic graphical model;
and the power distribution network dynamic parameter generation module is used for acquiring power distribution network dynamic parameters based on a belief propagation algorithm according to the observation variables and the probabilistic graphical model whose parameters have been acquired.
Beneficial effects:
1. The method analyzes the factors influencing the dynamic parameters of the power distribution network on the basis of data statistics and prior knowledge. It addresses the problem that dynamic parameters cannot be obtained in some distribution areas, improves the parameter identification accuracy when the operation mode of such areas or the external environment changes abruptly, and helps power distribution network dispatchers grasp and analyze the operating state of the network. The final identification result can provide a sound basis for upper-layer applications of the distribution automation system;
2. The invention analyzes the collected power distribution network operation data and meteorological data, provides a basis for power distribution network situation awareness, helps to realize a fully visible and controllable power distribution network, and helps power grid companies supply more reliable, safe and economical electric energy.
Detailed Description
The invention is further described with reference to specific examples. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
Fig. 1 is a flowchart of a method for identifying dynamic parameters of a power distribution network according to an embodiment of the present invention, as shown in fig. 1, the method includes the following steps:
step 1, carrying out data preprocessing on collected power distribution network operation data and external environment data to generate a power distribution network dynamic parameter identification sample.
According to one embodiment, the raw data may be preprocessed in DataFrame form using Pandas.
The data required for data-driven identification of line physical parameters comprise power distribution network operation data and external environment data. The operation data are acquired by smart meters and comprise node voltage, current, active power and reactive power collected every 15 min. The external environment data come from weather station 58238 (provided by meteomanz.com) and comprise regional temperature and humidity collected every 3 hours. The raw data present three main problems: first, they come from different data sources, so the data from the different sources must be merged and integrated into one data frame; second, the raw data have too many attributes, which hinders data modeling, so their dimensionality must be reduced; third, the data contain missing values and outliers, so data cleaning is required.
To address the above issues, in one embodiment, the following operations may be taken with respect to the raw data:
performing quadratic spline interpolation on the external environment data so that the sampling frequencies of the data from the different data sources are identical;
merging the data from the different data sources, and eliminating redundant fields in the data;
and removing repeated records from the power distribution network operation data and removing null values.
In a specific example, as shown in Fig. 3, the following may further be adopted:
considering the mismatch between the acquisition frequencies of the two types of data from different data sources, data points are inserted between every two external environment records by spline interpolation to complete the data; specifically, the temperature and humidity time series are interpolated using the interp1d function of the Python SciPy package, completing the historical meteorological data so that the frequencies of the two data sources match;
merging the data from the different data sources using the merge function of the pandas module, and removing redundant fields from the raw data using the drop function;
handling missing and repeated data: removing repeated records from the operation data using the drop_duplicates function; detecting the missing ratio of each variable using pandas isnull().sum(), and removing null values with the dropna function where the missing rate is low (less than 95%) and the importance is low, thereby finally obtaining the cleaned power distribution network dynamic parameter identification data sample. The structure of the sample is shown in Table 1 below.
TABLE 1 distribution line parameter identification data sample
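The preprocessing operations described for this step can be sketched as follows. This is a minimal illustration rather than the embodiment's actual script: the helper names `upsample_weather` and `build_samples`, the column names and the synthetic records are hypothetical, and dropping every incomplete row is a simplification of the missing-rate and importance criteria stated in the text.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d


def upsample_weather(weather, target_times):
    """Quadratic-spline interpolation of each weather column onto target_times."""
    t0 = weather["time"].iloc[0]
    x_src = (weather["time"] - t0).dt.total_seconds().to_numpy()
    x_dst = (target_times - t0).total_seconds().to_numpy()
    dense = pd.DataFrame({"time": target_times})
    for col in weather.columns.drop("time"):
        spline = interp1d(x_src, weather[col].to_numpy(), kind="quadratic")
        dense[col] = spline(x_dst)
    return dense


def build_samples(meter, weather, redundant):
    """Merge meter and weather records and clean the result."""
    meter = meter.drop_duplicates()                      # remove repeated records
    times = pd.DatetimeIndex(meter["time"].unique())
    dense = upsample_weather(weather, times)             # match the 15 min frequency
    merged = pd.merge(meter, dense, on="time", how="inner")
    merged = merged.drop(columns=redundant)              # remove redundant fields
    missing_ratio = merged.isnull().sum() / len(merged)  # per-column missing share
    return merged.dropna().reset_index(drop=True), missing_ratio


# Synthetic stand-ins for 3-hourly weather data and 15-minute meter data.
weather = pd.DataFrame({
    "time": pd.date_range("2021-01-01", periods=5, freq=pd.Timedelta(hours=3)),
    "temperature": [2.0, 4.0, 8.0, 6.0, 3.0],
    "humidity": [80.0, 75.0, 60.0, 65.0, 78.0],
})
meter = pd.DataFrame({
    "time": pd.date_range("2021-01-01", periods=49, freq=pd.Timedelta(minutes=15)),
    "voltage": np.linspace(10.2, 10.0, 49),
    "meter_id": "A01",
})
meter = pd.concat([meter, meter.iloc[[0]]], ignore_index=True)  # one repeated record
meter.loc[5, "voltage"] = np.nan                                # one missing value

samples, missing_ratio = build_samples(meter, weather, redundant=["meter_id"])
```

Note that `interp1d` does not extrapolate by default, so the meter timestamps must lie within the time span covered by the weather records.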
Step 2, discretizing the power distribution network dynamic parameter identification sample to obtain a discretization sample.
In one embodiment, the data sample may be discretized using a maximum entropy algorithm, in which the information entropy is calculated as follows:
$$H(Z) = -\sum_{s=1}^{m} \frac{N_{count}(Z=s)}{N_{amount}(Z)} \log \frac{N_{count}(Z=s)}{N_{amount}(Z)}$$
wherein $Z$ is a hidden variable, $m$ is the number of discrete intervals, $N_{count}(Z=s)$ is the number of samples in the data whose variable is in state $s$, and $N_{amount}(Z)$ is the total number of samples. Each variable in the network is assigned to its state space under the condition that the information entropy is maximum; in the absence of prior knowledge, the information entropy is maximum when every discrete interval contains the same number of samples. The discretization granularity is chosen as 10% of the number of samples, and the discretization results for each variable are shown in Table 2 below.
TABLE 2 discretization results for the variable at 10% discrete particle size
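Since the text states that, without prior knowledge, entropy is maximal when every interval holds the same number of samples, the discretization can be illustrated with equal-frequency binning. The sketch below is an assumption-laden example: the function names and the synthetic impedance values are hypothetical, and ten bins correspond to the 10% granularity mentioned above.

```python
import numpy as np


def equal_frequency_discretize(values, m=10):
    """Assign each sample to one of m states holding (nearly) equal sample counts."""
    values = np.asarray(values, dtype=float)
    # Interior quantiles split the sorted samples into m equally filled intervals.
    edges = np.quantile(values, np.linspace(0, 1, m + 1)[1:-1])
    states = np.searchsorted(edges, values, side="right")
    return states, edges


def state_entropy(states, m):
    """Entropy of the empirical state distribution N_count(Z=s) / N_amount(Z)."""
    counts = np.bincount(states, minlength=m)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())


rng = np.random.default_rng(0)
impedance = rng.normal(0.35, 0.05, size=1000)   # synthetic continuous samples
states, edges = equal_frequency_discretize(impedance, m=10)
counts = np.bincount(states, minlength=10)
entropy = state_entropy(states, m=10)
```

With 1000 distinct samples and ten bins, every state receives 100 samples and the entropy reaches its upper bound of log(10).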
Step 3, acquiring parameters of the probabilistic graphical model according to the discretization sample based on the pre-established probabilistic graphical model.
Wherein, the probabilistic graphical model is a two-time-slice probabilistic graphical model.
The probabilistic graphical model is pre-established, and according to one embodiment, the step of establishing the probabilistic graphical model may include:
selecting the temperature $T_t$, humidity $H_t$, feeder section voltage drop $\Delta V_t$ and feeder section transmission power $S_t$ at time $t$ as observation variables of the probabilistic graphical model, and selecting the impedance $Z_t$ of the line at that time as the hidden variable of the probabilistic graphical model;
adding the variables into the probabilistic graphical model one by one according to the causal relationships between the observation variables and the hidden variable within a single time slice, to construct a static Bayesian network;
setting an initial time slice, and specifying the prior probability distribution of each variable under that time slice;
and specifying the causal relationships of the states between adjacent time slices to construct a transition model.
The model building process is further described below.
Specifically, considering that the dynamic parameters of the power distribution network are related to the external environment and to the operating state of the network, the temperature $T_t$, humidity $H_t$, feeder section voltage drop $\Delta V_t$ and feeder section transmission power $S_t$ at time $t$ are selected as observation variables of the probabilistic graphical model according to the relation between these variables and the line impedance parameters. That is, the observation variable $X_t$ of the probabilistic graphical model at time $t$ is:
$$X_t = \{T_t, H_t, \Delta V_t, S_t\}$$
Considering that power distribution network lines are relatively short, the line-to-ground capacitance is neglected when the probabilistic graphical model is constructed. Therefore, the hidden variable $Y_t$ of the probabilistic graphical model at time $t$ is the impedance of the line at that time, denoted $Z_t$.
After the random variables of the model are determined, the order of the variables is selected according to the causal relationships. $T_t$ and $H_t$, which represent the external environment, are both factors affecting the line impedance and are therefore parent nodes of the line impedance $Z_t$. The expression for the voltage drop between line nodes is as follows:
$$\Delta V = V_1 - V_2 \approx \frac{PR + QX}{V_2} = I_R R + I_X X$$
wherein $V_1 - V_2$ represents the difference between the voltage amplitudes at node 1 and node 2; $P$ and $Q$ are the active and reactive power flowing between node 1 and node 2, respectively; $R$ and $X$ are the resistance and reactance of the line connecting nodes 1 and 2, respectively; and $I_R$ and $I_X$ represent the corresponding active and reactive currents. In addition, $S_t$ characterizes the apparent power and $Z_t$ is the line impedance. From the above equation, apparent power and line impedance are the factors that affect the voltage drop and are therefore the parent nodes of the line voltage drop $\Delta V_t$. Finally, starting from an empty graph, the variables are added into the probabilistic graphical model one by one according to the dependency relationships among them to form a directed acyclic graph.
On this basis, an initial time slice is selected, the prior probability distribution of each variable under that time slice is specified, the causal relationships of the states between adjacent time slices are specified, and a transition model is constructed, which completes the construction of the power distribution network dynamic parameter identification model based on the two-time-slice probabilistic graphical model. The resulting two-time-slice probabilistic graphical model for identifying the dynamic parameters of the power distribution network is shown in Fig. 2.
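The structure described above can be written down as parent sets and checked for acyclicity. The sketch below is illustrative only: the intra-slice edges follow the parent relations stated in the text, while the single inter-slice edge from the impedance at one slice to the impedance at the next is an assumption, since the exact edges of Fig. 2 are not reproduced here. Node names such as `dV_0` are hypothetical labels, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

NODES = ("T", "H", "S", "Z", "dV")
# Within one time slice: temperature and humidity drive impedance;
# apparent power and impedance drive the voltage drop.
INTRA_EDGES = [("T", "Z"), ("H", "Z"), ("S", "dV"), ("Z", "dV")]
# Between adjacent slices (assumed): impedance depends on its previous value.
INTER_EDGES = [("Z", "Z")]


def unroll(n_slices):
    """Return {node: set of parents} for the network unrolled over n_slices."""
    parents = {f"{n}_{t}": set() for t in range(n_slices) for n in NODES}
    for t in range(n_slices):
        for src, dst in INTRA_EDGES:
            parents[f"{dst}_{t}"].add(f"{src}_{t}")
        if t > 0:
            for src, dst in INTER_EDGES:
                parents[f"{dst}_{t}"].add(f"{src}_{t - 1}")
    return parents


parents = unroll(2)
# static_order() raises graphlib.CycleError if the graph is not acyclic.
order = list(TopologicalSorter(parents).static_order())
```

A valid topological order confirms that the two-slice structure is a directed acyclic graph and gives an order in which the variables can be added one by one.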
According to one embodiment, the parameters of the probabilistic graphical model may be obtained by:
based on the probability mass function, obtaining an initial probability distribution table of each variable according to the discretization sample;
calculating a conditional probability distribution table among the variables according to the discretization sample based on the expectation-maximization (EM) algorithm;
counting a transition probability distribution table of each variable from t moment to t +1 moment from continuous data samples on a time axis;
and determining the correctness of the conditional probability distribution table by checking whether the sum of the probability distributions of each variable is 1 or not and whether the conditional probability distribution is consistent with the causal relationship in the Bayesian network or not.
In one embodiment, the parameters of the probability map model, i.e., the initial probability distribution table, the conditional probability distribution table, and the transition probability distribution table, may be obtained as follows:
obtaining an initial probability distribution table of each variable from the discretized sample by calculating a probability mass function, wherein the structure of the initial probability distribution table is shown in the following table 3;
TABLE 3 initial probability distribution Table
The sum of all elements in the initial probability vector is 1, and each element $P_i$ is obtained by computing the probability mass function (PMF):
$$P(Z_0 = s) = \frac{N_{count}(Z=s)}{N_{amount}(Z)}$$
wherein $P(Z_0 = s)$ is the probability that the hidden variable is initially in state $s$; $N_{count}(Z=s)$ is the number of samples in the data whose hidden variable is in state $s$; and $N_{amount}(Z)$ is the total number of samples.
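The probability mass function is a ratio of counts, which the following sketch evaluates on a short, made-up sequence of discretized states. The function name and the data are hypothetical.

```python
import numpy as np


def initial_distribution(states, m):
    """P(Z = s) = N_count(Z = s) / N_amount(Z) for s = 0, ..., m - 1."""
    counts = np.bincount(states, minlength=m)
    return counts / counts.sum()


z_states = np.array([0, 0, 1, 2, 2, 2, 1, 0, 2, 2])   # ten discretized samples
p0 = initial_distribution(z_states, m=3)               # [0.3, 0.2, 0.5]
```

State 0 occurs three times, state 1 twice and state 2 five times among the ten samples, which gives the vector shown in the comment.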
The conditional probability distribution table of the variables is calculated from the discretized sample by the expectation-maximization (EM) algorithm. When the number of states of the hidden variable is m and the numbers of states of the observed variables are n, k, v and h, respectively, the table is an m × (n × k × v × h) matrix whose structure is shown in Table 4 below.
TABLE 4 conditional probability distribution Table
The conditional probability distribution may be obtained by an EM algorithm. The EM algorithm initializes the probability distribution first and then iterates in two steps until convergence. The two-step iteration process is as follows:
1) E step (Expectation Step): calculating the posterior probability of the hidden variable according to the initial value of the conditional probability or the conditional probability obtained in the previous iteration, and taking the posterior probability as the current expected value of the hidden variable:
$$P_{posterior}(Z) = P(Z \mid X; \theta_{cpt})$$
wherein $P_{posterior}(Z)$ is the posterior probability of the hidden variable and $\theta_{cpt}$ denotes the parameters of the conditional probability distribution table of the DBN (dynamic Bayesian network).
2) M step (Maximization Step): updating the conditional probability distribution table with maximization of the likelihood function as the objective:
$$\theta_{cpt} = \arg\max_{\theta_{cpt}} \sum_{i=1}^{m} P_{posterior}(Z_i) \log P(X, Z_i; \theta_{cpt})$$
wherein $m$ is the number of hidden variable states, $Z_i$ is the $i$-th state of the hidden variable, and $P(X, Z_i; \theta_{cpt})$ is the joint probability of the observation variable and the hidden variable evaluated from the samples. The iteration of the EM algorithm ends when the conditional probability distribution table in the probabilistic graphical model maximizes the probability of the training data samples. The EM algorithm can perform maximum likelihood estimation of parameters from an incomplete data set and is therefore suitable for calculating the conditional probability distributions of the probabilistic graphical model when power distribution network data acquisition is incomplete.
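The two-step iteration can be illustrated on a deliberately reduced model: one hidden variable Z with a prior P(Z) and one discrete observation X with table P(X | Z), where some values of Z are missing from the records. This is a sketch under those simplifying assumptions, not the network of Fig. 2; the function name `em_cpt`, the coding of missing values as -1 and the synthetic data are all hypothetical.

```python
import numpy as np


def em_cpt(x, z, n_x, m, n_iter=100, tol=1e-9):
    """EM estimate of P(Z) and P(X | Z) when some entries of z are missing (-1)."""
    x, z = np.asarray(x), np.asarray(z)
    observed = z >= 0
    rows = np.arange(len(x))
    prior = np.full(m, 1.0 / m)             # P(Z = s)
    cpt = np.full((n_x, m), 1.0 / n_x)      # cpt[i, s] = P(X = i | Z = s)
    log_liks = []
    for _ in range(n_iter):
        # E step: posterior P(Z | X; theta) of the hidden state for every sample.
        joint = prior * cpt[x]              # joint[j, s] = P(x_j, Z = s)
        resp = joint / joint.sum(axis=1, keepdims=True)
        resp[observed] = np.eye(m)[z[observed]]   # recorded states are certain
        lik = np.where(observed,
                       joint[rows, np.where(observed, z, 0)],
                       joint.sum(axis=1))
        log_liks.append(float(np.log(lik).sum()))
        # M step: re-estimate the tables from the expected counts.
        prior = resp.sum(axis=0) / len(x)
        counts = np.zeros((n_x, m))
        np.add.at(counts, x, resp)          # counts[x_j, :] += resp[j, :]
        cpt = counts / counts.sum(axis=0, keepdims=True)
        if len(log_liks) > 1 and log_liks[-1] - log_liks[-2] < tol:
            break
    return prior, cpt, log_liks


# Synthetic data: three hidden states, three observation states, 30% of Z missing.
rng = np.random.default_rng(1)
true_prior = np.array([0.5, 0.3, 0.2])
true_cpt = np.array([[0.7, 0.2, 0.1],
                     [0.2, 0.6, 0.2],
                     [0.1, 0.2, 0.7]])      # each column sums to 1
z_full = rng.choice(3, size=2000, p=true_prior)
x = np.array([rng.choice(3, p=true_cpt[:, s]) for s in z_full])
z = z_full.copy()
z[rng.random(2000) < 0.3] = -1              # simulate acquisition loss

prior, cpt, log_liks = em_cpt(x, z, n_x=3, m=3)
```

The recorded log-likelihood values should not decrease from one iteration to the next, which is the property that justifies stopping once the improvement falls below a tolerance.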
A transition probability distribution table of each variable from time $t$ to time $t+1$ is counted from data samples that are continuous on the time axis. The transition probability distribution is the parameter that expresses the temporal transition of variables in the DBN and can be calculated by the following formula:
$$P(Z_t = s_2 \mid Z_{t-1} = s_1) = \frac{N_{count}(s_1, s_2)}{N_{amount}(s_1)}$$
wherein $P(Z_t = s_2 \mid Z_{t-1} = s_1)$ represents the probability that the hidden variable $Z$ transitions from state $s_1$ to state $s_2$ between time $t-1$ and time $t$; $N_{count}(s_1, s_2)$ represents the number of times in the acquired historical data that the hidden variable $Z$ transitions from state $s_1$ to state $s_2$ between time $t-1$ and time $t$; and $N_{amount}(s_1)$ represents the number of samples in the acquired historical data in which the hidden variable $Z$ is in state $s_1$.
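Counting transitions between consecutive samples can be sketched as below. The function name and the state sequence are hypothetical. One small deviation is made explicit: the denominator counts the samples in the source state that have a successor, so that each row of the table sums to 1 even though the last sample of the sequence has no following state.

```python
import numpy as np


def transition_table(states, m):
    """Row-stochastic table A[s1, s2] = P(Z_t = s2 | Z_{t-1} = s1)."""
    states = np.asarray(states)
    counts = np.zeros((m, m))
    # counts[s1, s2] = number of moves from s1 at time t-1 to s2 at time t.
    np.add.at(counts, (states[:-1], states[1:]), 1)
    row_totals = counts.sum(axis=1, keepdims=True)
    # A state that never appears as a source keeps an all-zero row.
    return np.divide(counts, row_totals,
                     out=np.zeros_like(counts), where=row_totals > 0)


seq = [0, 0, 1, 1, 1, 2, 0, 1, 2, 2]        # consecutive discretized samples
A = transition_table(seq, m=3)
```

For this sequence, state 0 is followed once by state 0 and twice by state 1, so the first row is [1/3, 2/3, 0].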
Finally, the correctness of the obtained parameters can be checked by checking whether the sum of the probability distributions of each variable is 1 or not, and whether the conditional probability distribution is consistent with the causal relationship in the bayesian network or not.
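The two checks mentioned here can be automated in a few lines. The sketch below assumes one particular table layout (child states along the rows, one column per parent configuration), which is a convention chosen for the example; the helper names and the numeric tables are hypothetical.

```python
import numpy as np


def is_normalized(table, axis, atol=1e-8):
    """True if every distribution in the table sums to 1 along `axis`."""
    return bool(np.allclose(np.sum(table, axis=axis), 1.0, atol=atol))


def shape_matches(cpt, n_child_states, parent_state_counts):
    """True if the table has one column per joint configuration of the parents."""
    n_configs = int(np.prod(parent_state_counts)) if len(parent_state_counts) else 1
    return cpt.shape == (n_child_states, n_configs)


initial = np.array([0.3, 0.2, 0.5])
transition = np.array([[0.9, 0.1, 0.0],
                       [0.2, 0.7, 0.1],
                       [0.0, 0.3, 0.7]])     # rows: state at t-1
cpt = np.array([[0.7, 0.2],
                [0.3, 0.8]])                 # columns: parent configurations
bad_cpt = np.array([[0.6, 0.2],
                    [0.3, 0.8]])             # first column sums to 0.9

checks = {
    "initial": is_normalized(initial, axis=0),
    "transition": is_normalized(transition, axis=1),
    "cpt": is_normalized(cpt, axis=0) and shape_matches(cpt, 2, [2]),
    "bad_cpt": is_normalized(bad_cpt, axis=0),
}
```

The shape test is a coarse proxy for consistency with the causal structure: a node whose parents have 2 and 3 states, for example, needs a table with 6 columns.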
Step 4, acquiring dynamic parameters of the power distribution network based on a belief propagation algorithm according to the observation variables and the probabilistic graphical model whose parameters have been acquired.
According to one embodiment, the belief propagation algorithm is used to infer power distribution network dynamic parameters when the observed variables are known, as shown in fig. 4, which includes:
1) initializing the probability distribution of each variable according to the sample;
2) Randomly selecting a state variable $Y$ in the network. The belief of the node can be expressed as $b(Y_t)$; it is proportional to the local evidence of the node and to all the messages $m_{xY}(Y_t)$ passed to the node from adjacent nodes, so the belief of a node can be expressed as a probability:
$$b(Y_t) \propto \phi(Y_t, X_t) \prod_{x \in G} m_{xY}(Y_t)$$
wherein $\phi(Y_t, X_t)$ is the likelihood function between the state variable $Y$ and the observation variable $X$ at time $t$, and $G$ is the first-order neighborhood of the node, namely the set of all nodes adjacent to it. $m_{xY}(Y_t)$ is the message passed from node $x$ to node $Y$ and indicates the influence of node $x$ on node $Y$ at time $t$.
3) Updating the messages between variables:
$$m_{Y_{t-1}Y_t}(Y_t) = \sum_{Y_{t-1}} \phi(Y_{t-1}, X_{t-1})\, \psi(Y_t, Y_{t-1}) \prod_{x \in G \setminus Y_t} m_{xY_{t-1}}(Y_{t-1})$$
wherein $\psi(Y_t, Y_{t-1})$ is the potential between node $Y$ at time $t-1$ and node $Y$ at time $t$ and reflects the compatibility between hidden variables; the product runs over the neighbors of $Y_{t-1}$ other than $Y_t$.
4) Repeating steps 2) and 3) to iterate message propagation and belief updating until the convergence condition is met:
$$\left| b^{(n)}(Y_t) - b^{(n-1)}(Y_t) \right| < 10^{-5}$$
5) Taking the final belief of the hidden variable as the inference result for the probability distribution of the hidden variable over the state intervals. The final inference result of the DBN is a probability distribution; compared with single-point parameter identification, the DBN model can provide all the cases that may occur at that time together with their probabilities. For comparison with conventional single-point line parameter identification models, the final single-point identification result of the line impedance parameter is calculated from the samples $\{Z_1, Z_2, \ldots, Z_N\}$ in the historical data that fall in the identified state interval, and their root mean square can be used as the point identification result of the line parameter:
$$Z_{rms} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} Z_i^2}$$
wherein $N$ is the number of samples in the historical data that are in the same state as the identification result, $Z_i$ is the impedance value of the $i$-th sample, and $Z_{rms}$ is the root mean square of these sample impedance values.
The two 10 kV feeder lines connected through the interconnection switch in Fig. 5 are used as parameter identification objects, and the parameter identification results of the 14 lines are shown in Table 5 below.
TABLE 5 dynamic parameter identification results for distribution networks
Because the lengths of all lines are different, the average error rate of parameter identification of each line is taken as an evaluation standard, the average error rate of impedance parameter identification of the proposed probabilistic graphical model is 3.80%, and the average error rate of reactance parameter identification is 9.05%.
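The root-mean-square point estimate and an average error rate can be computed as in the sketch below. The text does not define the error rate, so the mean relative error used here is an assumption; the function names and the numbers are hypothetical and are not the values of Table 5.

```python
import numpy as np


def rms_point_estimate(z_samples):
    """Root mean square of the historical impedance samples in the inferred state."""
    z = np.asarray(z_samples, dtype=float)
    return float(np.sqrt(np.mean(z ** 2)))


def mean_relative_error(estimates, references):
    """Average of |estimate - reference| / |reference| over all lines (assumed metric)."""
    e = np.asarray(estimates, dtype=float)
    r = np.asarray(references, dtype=float)
    return float(np.mean(np.abs(e - r) / np.abs(r)))


point = rms_point_estimate([3.0, 4.0])                     # sqrt((9 + 16) / 2)
error = mean_relative_error([1.1, 1.8], [1.0, 2.0])        # (0.10 + 0.10) / 2
```

Using a relative rather than absolute error makes lines of different lengths, and therefore different impedance magnitudes, comparable.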
In another embodiment, the present invention provides a device for identifying dynamic parameters of a power distribution network, including:
the data preprocessing module is used for preprocessing the acquired running data and external environment data of the power distribution network to generate a dynamic parameter identification sample of the power distribution network;
the discretization processing module is used for discretizing the dynamic parameter identification sample of the power distribution network to obtain a discretization sample;
the model parameter determining module is used for acquiring parameters of the probabilistic graphical model according to the discretization sample based on the pre-established probabilistic graphical model, wherein the probabilistic graphical model is a two-time-slice probabilistic graphical model;
and the power distribution network dynamic parameter generation module is used for acquiring power distribution network dynamic parameters based on a belief propagation algorithm according to the observation variables and the probabilistic graphical model whose parameters have been acquired.
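One possible way to arrange the four modules in software is a small container that calls them in order. This is only an architectural sketch: the class name, field names and the trivial stand-in callables are hypothetical and do not implement the actual preprocessing, discretization, parameter learning or inference.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class IdentificationDevice:
    """Chains the four modules of the device in the order given in the text."""
    preprocess: Callable[[Any, Any], Any]   # data preprocessing module
    discretize: Callable[[Any], Any]        # discretization processing module
    fit_parameters: Callable[[Any], Any]    # model parameter determining module
    infer: Callable[[Any, Any], Any]        # dynamic parameter generation module

    def run(self, operation_data, environment_data, observations):
        samples = self.preprocess(operation_data, environment_data)
        discrete = self.discretize(samples)
        model = self.fit_parameters(discrete)
        return self.infer(model, observations)


# Trivial stand-ins that only demonstrate the data flow between modules.
device = IdentificationDevice(
    preprocess=lambda op, env: list(zip(op, env)),
    discretize=lambda samples: [(round(a), round(b)) for a, b in samples],
    fit_parameters=lambda discrete: {"n_samples": len(discrete)},
    infer=lambda model, obs: {"model": model, "observations": obs},
)
result = device.run([1.2, 2.7], [20.1, 21.9], observations=(3, 22))
```

Keeping each module behind a plain callable makes it possible to replace one stage, for example the inference algorithm, without touching the others.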
In conclusion, the invention constructs a novel power distribution network dynamic parameter identification probability graph model to solve the problem of power distribution network parameter identification errors caused by operation condition changes and data errors. The problem of uncertainty in the process of identifying the parameters of the power distribution network is solved by using knowledge in the field of probability theory, and accurate line physical parameters are provided for power distribution network situation perception and line loss calculation. The model provided by the invention can improve the accuracy and robustness of distribution line impedance parameter identification, improves the intelligent degree of distribution network analysis and management, and provides a parameter basis for distribution network scheduling personnel to master, analyze and control the operation mode of the distribution network.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The present invention has been disclosed in terms of the preferred embodiment, but is not intended to be limited to the embodiment, and all technical solutions obtained by substituting or converting equivalents thereof fall within the scope of the present invention.