CN109102126B - Theoretical line loss rate prediction model based on deep transfer learning


Publication number
CN109102126B
Authority
CN
China
Prior art keywords
data
dnn
deep
model
loss rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810999797.9A
Other languages
Chinese (zh)
Other versions
CN109102126A (en)
Inventor
卢志刚
杨英杰
丁艺楠
顾媛媛
杨宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yanshan University
Original Assignee
Yanshan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yanshan University
Priority to CN201810999797.9A
Publication of CN109102126A
Application granted
Publication of CN109102126B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E40/00 Technologies for an efficient electrical power generation, transmission or distribution
    • Y02E40/70 Smart grids as climate change mitigation technology in the energy generation sector
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Operations Research (AREA)
  • Primary Health Care (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Biomedical Technology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a theoretical line loss rate prediction model based on deep transfer learning, and relates to the technical field of application of artificial intelligence algorithms in power systems. First, a DBN-DNN deep learning model is built by stacking RBM models into a DBN deep belief network and connecting its output layer to a DNN, and the model is trained on source data. Then, in order to avoid being trapped in a local optimum during deep learning model training, the concept of transfer learning is introduced: combined with the data to be predicted, the distribution difference between the source and target data is measured by the MMD method, the source training data are screened, and the screened training data are used to fine-tune the trained DNN deep neural network, finally yielding the TDBN-DNN-based deep transfer learning model. Finally, power grid operation data are used as model input to predict the line loss rate. The method meets the demands of a strong power grid, efficient operation, energy conservation and environmental protection, and smart grid construction.

Description

Theoretical line loss rate prediction model based on deep transfer learning
Technical Field
The invention relates to the technical field of application of artificial intelligence algorithms in power systems, in particular to a method for predicting the theoretical line loss rate based on deep transfer learning.
Background
At the 208th scientific conference, themed 'Directions and Key Points of Artificial Intelligence Application and Research in the Electric Power Field', held by the national electric academy of sciences on December 6, 2017, experts gave special reports on the technical progress of artificial intelligence and on its application practices and prospects in power systems, reached a preliminary consensus, and focused on theoretical and technical applications such as deep learning, reinforcement learning, transfer learning, and small-sample learning. At present, driven by the demands of a strong power grid, efficient operation, energy conservation and environmental protection, and smart grid construction, power grid operation generates a large amount of data, so finding a new fast calculation method is an urgent problem in the field of line loss research. The difficulty lies in improving the speed and precision of multidimensional data processing so as to ensure the timeliness and accuracy of theoretical line loss rate calculation. Deep learning theory, as the latest research achievement in the fields of pattern recognition and machine learning, has achieved great results in big data processing in fields such as image and speech processing through its strong modeling and representation capabilities.
However, conventional machine learning methods typically rest on two premises: (1) the samples in the training dataset and the testing dataset must follow the same probability distribution; (2) sufficient training samples (on the order of tens of thousands) are required. These two preconditions often do not hold in practical applications. Transfer learning emerged to overcome these limitations of conventional machine learning.
In summary, it is necessary to provide a line loss rate algorithm based on deep transfer learning.
Disclosure of Invention
The invention aims to provide a method for predicting the theoretical line loss rate based on deep transfer learning. Transfer learning is utilized to overcome the two problems of conventional machine learning described above, with the goal of meeting the demands of a strong power grid, efficient operation, energy conservation and environmental protection, and smart grid construction.
To solve the above technical problem against the background that power grid operation generates a large amount of data, the technical scheme adopted by the invention is as follows: a method for predicting the theoretical line loss rate based on deep transfer learning, characterized by comprising the following steps:
step 1, establishing a DBN deep belief network formed by stacking a plurality of RBM models, where the parameter θ of each RBM model is obtained by taking derivatives of the log-likelihood function with the contrastive divergence algorithm;
step 2, connecting the output layer of the DBN deep belief network to the input layer of the DNN model to form the DBN-DNN deep learning model; the DNN model consists of several layers of ordinary neural networks, the last layer being the output layer;
step 3, freezing the lower DBN in the DBN-DNN deep learning model, measuring the distribution distance between the source data and the task prediction data with the MMD method, and migrating the source samples with $\rho_i > 0$; based on the migrated data, a model that fine-tunes the DNN structure in the DBN-DNN is obtained, namely the TDBN-DNN transfer deep learning model;
step 4, using the TDBN-DNN transfer deep learning model to simulate the nonlinear mapping relation between the load data, generator output data and bus voltage data of the operating power grid and the line loss rate, and predicting the line loss rate.
In a further technical scheme, the specific steps of step 1 are as follows:
firstly, building a DBN deep confidence network
The RBM model is an energy model. For a set of known states (v, h), its energy function is defined as $E_\theta(v,h)$, where θ is the set of network parameters $\theta = \{a_i, b_j, w_{ij}\}$, and the joint probability distribution of the visible and hidden layers is defined as $p(v,h)$, i.e.:

$$E_\theta(v,h) = -\sum_{i=1}^{m} a_i v_i - \sum_{j=1}^{n} b_j h_j - \sum_{i=1}^{m}\sum_{j=1}^{n} v_i w_{ij} h_j \quad (1)$$

$$p(v,h) = \frac{1}{Z_\theta}\, e^{-E_\theta(v,h)} \quad (2)$$

In formula (1), m is the number of input units, n is the number of output units, $a_i$ is the real-valued bias of each input unit, $b_j$ is the real-valued bias of each hidden unit, $v_i$ represents the input data, $h_j$ is the hidden-unit output data, and $w_{ij}$ is a real-valued weight matrix;

in formula (2), $Z_\theta$ is the partition function,

$$Z_\theta = \sum_{v,h} e^{-E_\theta(v,h)}$$
From the conditional independence between the layers of the restricted Boltzmann machine, when the input data are given, the node values of the output layer satisfy the conditional probability:

$$p(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_{i=1}^{m} v_i w_{ij}\Big) \quad (3)$$

In formula (3), $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the sigmoid activation function. After the output-layer data are determined, the conditional probability of the input-node values is:

$$p(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_{j=1}^{n} w_{ij} h_j\Big) \quad (4)$$
Given a set of training samples $G = \{g_1, g_2, \ldots, g_s\}$, training the RBM model means adjusting the parameter θ to fit the given training samples, i.e., obtaining θ by maximizing the likelihood function L(θ) of the network:

$$\theta^{*} = \arg\max_{\theta} L(\theta) = \arg\max_{\theta} \prod_{t=1}^{s} p(g_t) \quad (5)$$

To simplify the calculation, this is written in logarithmic form:

$$\ln L(\theta) = \sum_{t=1}^{s} \ln p(g_t) = \sum_{t=1}^{s}\Big(\ln \sum_{h} e^{-E_\theta(g_t,h)} - \ln \sum_{v,h} e^{-E_\theta(v,h)}\Big) \quad (6)$$
two, CD training algorithm
The contrastive divergence algorithm takes the derivatives of the log-likelihood function with respect to the parameter θ as follows:

$$\frac{\partial \ln L(\theta)}{\partial w_{ij}} = \langle v_i h_j\rangle_{data} - \langle v_i h_j\rangle_{model} \quad (7)$$

$$\frac{\partial \ln L(\theta)}{\partial a_i} = \langle v_i\rangle_{data} - \langle v_i\rangle_{model} \quad (8)$$

$$\frac{\partial \ln L(\theta)}{\partial b_j} = \langle h_j\rangle_{data} - \langle h_j\rangle_{model} \quad (9)$$

The update rule for each parameter is:

$$\Delta w_{ij} = \varepsilon_{CD}\big(\langle v_i h_j\rangle_{data} - \langle v_i h_j\rangle_{recon}\big) \quad (10)$$

$$\Delta a_i = \varepsilon_{CD}\big(\langle v_i\rangle_{data} - \langle v_i\rangle_{recon}\big) \quad (11)$$

$$\Delta b_j = \varepsilon_{CD}\big(\langle h_j\rangle_{data} - \langle h_j\rangle_{recon}\big) \quad (12)$$

In formulas (10)-(12), $\varepsilon_{CD}$ is the learning step size, $\langle \cdot \rangle_{data}$ denotes the expectation under the data distribution, and $\langle \cdot \rangle_{recon}$ denotes the expectation under the distribution defined by the model after one-step reconstruction.
In a further technical scheme, the specific steps of step 2 are as follows:
Connect the input layer of the DNN to the output layer of the DBN to establish the DBN-DNN deep learning model. The training process of the DNN comprises forward propagation and backward propagation, as follows:
one, forward propagation process
$$\delta(p) = \frac{1}{1 + e^{-p}} \quad (13)$$

$$f(p) = p \quad (14)$$

$$h_{w,b}(p) = \delta(wp + b) \quad (15)$$

$$a_j^{(l)} = \delta\Big(\sum_{i} w_{ij}^{(l)} a_i^{(l-1)} + b_j^{(l)}\Big) \quad (16)$$

$$h_j^{(k_l)} = f\Big(\sum_{i} w_{ij}^{(k_l)} a_i^{(k_l-1)} + b_j^{(k_l)}\Big) \quad (17)$$

In formula (16), $w_{ij}^{(l)}$ represents the connection weight between the jth node of the lth layer and the ith node of the (l-1)th layer, and $b_j^{(l)}$ represents the intercept term of the jth node of the lth layer; in formula (17), $h_j^{(k_l)}$ represents the output value of the jth node of the $k_l$-th (output) layer.
Two, backward propagation process
For the k-th layer as the output layerlThe residual error calculation formula of the output unit i of the layer is as follows:
Figure BDA0001782695150000052
for l ═ kl-1,kl-2,kl-3The residual error calculation for each layer …,2, i-th node of the l-th layer is as follows:
Figure BDA0001782695150000053
for a fixed training sample set { (u)1,y1),…,(uc,yc) And (4) solving the deep neural network by using a small batch gradient descent method, wherein the loss function of the deep neural network comprises c samples:
Figure BDA0001782695150000054
in the equation (20), the weight attenuation parameter λ is used for the importance of the two terms in the control formula (20); thus, the update for parameters w and b for each iteration in the small batch gradient descent method is as follows:
Figure BDA0001782695150000055
Figure BDA0001782695150000056
the further technical scheme is that the specific process of the step 3 is as follows:
firstly, measuring the distribution difference of source data and task data:
Suppose there is a source dataset $X^{(s)} = \{x_1^{(s)}, x_2^{(s)}, \ldots, x_{N_s}^{(s)}\}$ satisfying the distribution d, and a target dataset $X^{(t)} = \{x_1^{(t)}, x_2^{(t)}, \ldots, x_{N_t}^{(t)}\}$ satisfying the distribution q. The maximum mean discrepancy (MMD) of $X^{(s)}$ and $X^{(t)}$ can be expressed as:

$$\mathrm{MMD}\big(X^{(s)}, X^{(t)}\big) = \bigg\| \frac{1}{N_s}\sum_{i=1}^{N_s} \phi\big(x_i^{(s)}\big) - \frac{1}{N_t}\sum_{j=1}^{N_t} \phi\big(x_j^{(t)}\big) \bigg\|_{H} \quad (23)$$

In formula (23), H denotes the reproducing kernel Hilbert space (RKHS), and $\phi(\cdot): X \to H$ denotes the nonlinear feature mapping from the original feature space into the RKHS. Since the RKHS is typically a high-dimensional or even infinite-dimensional space, the corresponding kernel is chosen as the Gaussian kernel, which represents an infinite dimension:

$$K(x, x') = \exp\bigg(-\frac{\|x - x'\|^2}{2\sigma^2}\bigg) \quad (24)$$

In formula (24), σ is the width parameter of the kernel function. For computational convenience, the squared form of the MMD, i.e., $\mathrm{MMD}^2$, is used; it expands as:

$$\mathrm{MMD}^2 = \frac{1}{N_s^2}\sum_{i=1}^{N_s}\sum_{j=1}^{N_s} \big\langle \phi(x_i^{(s)}), \phi(x_j^{(s)})\big\rangle - \frac{2}{N_s N_t}\sum_{i=1}^{N_s}\sum_{j=1}^{N_t} \big\langle \phi(x_i^{(s)}), \phi(x_j^{(t)})\big\rangle + \frac{1}{N_t^2}\sum_{i=1}^{N_t}\sum_{j=1}^{N_t} \big\langle \phi(x_i^{(t)}), \phi(x_j^{(t)})\big\rangle \quad (25)$$

Introducing the kernel trick,

$$\big\langle \phi(x), \phi(x')\big\rangle_{H} = \big\langle K(x,\cdot), K(\cdot,x')\big\rangle_{H} = K(x, x') \quad (26)$$

equation (25) reduces to:

$$\mathrm{MMD}^2 = \frac{1}{N_s^2}\sum_{i=1}^{N_s}\sum_{j=1}^{N_s} K\big(x_i^{(s)}, x_j^{(s)}\big) - \frac{2}{N_s N_t}\sum_{i=1}^{N_s}\sum_{j=1}^{N_t} K\big(x_i^{(s)}, x_j^{(t)}\big) + \frac{1}{N_t^2}\sum_{i=1}^{N_t}\sum_{j=1}^{N_t} K\big(x_i^{(t)}, x_j^{(t)}\big) \quad (27)$$
Second, migrating source data by the maximum mean discrepancy contribution coefficient

In order to find the data in the source data whose distribution is irrelevant or weakly relevant to the target data, the source data are screened by defining a maximum mean discrepancy contribution coefficient (CCMMD) for each source sample.

Let $\rho_i$ denote the maximum mean discrepancy contribution coefficient of the ith sample, and suppose the MMD with the ith source data sample missing, $\mathrm{MMD}_{\gamma\neq i}$, is:

$$\mathrm{MMD}_{\gamma\neq i} = \bigg\| \frac{1}{N_s - 1}\sum_{\gamma\neq i} \phi\big(x_\gamma^{(s)}\big) - \frac{1}{N_t}\sum_{j=1}^{N_t} \phi\big(x_j^{(t)}\big) \bigg\|_{H} \quad (28)$$

In formula (28), $\mathrm{MMD}_{\gamma\neq i}$ represents the maximum mean distance with the ith source data sample missing, where $\gamma = 1, 2, \ldots, N_s$; then

$$\rho_i = \mathrm{MMD}_{\gamma\neq i}^2 - \mathrm{MMD}^2 \quad (29)$$

In formula (29), $\rho_i$ represents the maximum mean discrepancy contribution coefficient of the ith sample. If $\rho_i > 0$, the ith sample makes a positive contribution to the MMD (its removal would enlarge the distribution distance); conversely, if $\rho_i \leq 0$, the ith sample makes a "negative" contribution to the MMD. By calculating $\rho_i$ for each source sample and migrating the samples with $\rho_i > 0$, source data whose distribution is closer to that of the target data are obtained;
Finally, the DBN layer of the DBN-DNN is frozen, and based on the migrated data a model that fine-tunes the DNN structure in the DBN-DNN is obtained, namely the TDBN-DNN transfer deep learning model.
In a further technical scheme, the specific process of step 4 is as follows:
(1) data processing
Input data load data, power output data and bus voltage data are subjected to standardization treatment, and the formula is (32):
Figure BDA0001782695150000072
in the formula (32), αi,jThe ith sample data representing the jth data characteristic, i ═ 1,2, …, MC,
j=1,2,…,2r,α′i,jdata normalized by the ith sample data representing the jth data feature, αmin,jRepresents the minimum value of the j-th data characteristic in the sample data, alphamax,jRepresenting the maximum value of the jth data characteristic in the sample data;
(2) Simulating, with the TDBN-DNN transfer deep learning model, the nonlinear mapping relation between the load data, generator output data and bus voltage data of the operating power grid and the line loss rate

Determining the number of hidden layers and nodes of the TDBN-DNN transfer deep learning model according to the scale and real-time requirements of the samples, and initializing the network parameters; then taking the normalized load data, generator output data and bus voltage sample data as input data and the corresponding section line loss rate as label data, and training the whole DBN deep belief network model layer by layer with a greedy unsupervised learning algorithm; finally, using the feature vector output by the DBN as a good initial value for the DNN, and training the DNN with the BP algorithm, supervised by the corresponding section line loss rate, to fit the label data;

(3) Line loss rate prediction of the TDBN-DNN transfer deep learning model

After the input data of the task target are normalized, they are input into the TDBN-DNN line loss rate prediction model to obtain the predicted line loss rate value.
Compared with the prior art, the invention has the following beneficial effects:
1. The considered factors are comprehensive: the sample input data are consistent with the data used in power flow calculation, so the model can fit the nonlinear mapping relation between input and output more effectively.
2. The line loss rate prediction model needs no power flow calculation process in the line loss calculation; it computes the result directly through the model's nonlinear fitting capacity, which improves the operation speed.
3. A deep learning line loss rate prediction model is provided which, compared with a shallow network, can effectively simulate the complex nonlinear mapping relation between power grid operation data and the line loss rate.
4. Before the task to be predicted is handled, the distribution distance between the source training data and the target data is measured by the MMD method from transfer learning, and the training data in the source data whose distribution is closer to the target data are screened by the maximum mean discrepancy contribution coefficient (CCMMD) to fine-tune the DNN. This alleviates the problems that deep learning requires a large amount of sample data and that inconsistent distributions of source and task data easily cause over-fitting, and it improves the prediction accuracy of the line loss rate.
Drawings
FIG. 1 is a structural diagram of the RBM, the basic building block of the DBN, in the method of the present invention;
FIG. 2 is a DBN-DNN model structure of the method of the present invention;
fig. 3 is a flow chart of the method of the present invention.
Detailed description of the preferred embodiments
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those specifically described and will be readily apparent to those of ordinary skill in the art without departing from the spirit of the present invention, and therefore the present invention is not limited to the specific embodiments disclosed below.
As shown in fig. 3, the method of the present invention comprises the following steps:
Step 1, establishing the DBN deep belief network model and training it with the CD algorithm; the specific modeling is as follows:
one, deep belief network
The DBN is a generative graphical model with multiple hidden layers, formed by stacking multiple restricted Boltzmann machines (RBMs), and has strong feature extraction capability. The structure of the RBM is shown in FIG. 1.
The RBM model is an energy model. For a set of known states (v, h), its energy function can be defined as $E_\theta(v,h)$, where θ is the set of network parameters $\theta = \{a_i, b_j, w_{ij}\}$, and the joint probability distribution of the visible and hidden layers can be defined as $p(v,h)$, i.e.:

$$E_\theta(v,h) = -\sum_{i=1}^{m} a_i v_i - \sum_{j=1}^{n} b_j h_j - \sum_{i=1}^{m}\sum_{j=1}^{n} v_i w_{ij} h_j \quad (1)$$

$$p(v,h) = \frac{1}{Z_\theta}\, e^{-E_\theta(v,h)} \quad (2)$$

In formula (1), m is the number of input units, n is the number of output units, $a_i$ is the real-valued bias of each input unit, $b_j$ is the real-valued bias of each hidden unit, $v_i$ represents the input data, $h_j$ is the hidden-unit output data, and $w_{ij}$ is a real-valued weight matrix;

in formula (2), $Z_\theta$ is the partition function,

$$Z_\theta = \sum_{v,h} e^{-E_\theta(v,h)}$$
From the conditional independence between the layers of the restricted Boltzmann machine, when the input data are given, the node values of the output layer satisfy the conditional probability:

$$p(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_{i=1}^{m} v_i w_{ij}\Big) \quad (3)$$

In formula (3), $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the sigmoid activation function. Correspondingly, after the output-layer data are determined, the conditional probability of the input-node values is:

$$p(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_{j=1}^{n} w_{ij} h_j\Big) \quad (4)$$
Given a set of training samples $G = \{g_1, g_2, \ldots, g_s\}$, training the RBM means adjusting the parameter θ to fit the given training samples, i.e., obtaining θ by maximizing the likelihood function L(θ) of the network:

$$\theta^{*} = \arg\max_{\theta} L(\theta) = \arg\max_{\theta} \prod_{t=1}^{s} p(g_t) \quad (5)$$

To simplify the calculation, this is written in logarithmic form:

$$\ln L(\theta) = \sum_{t=1}^{s} \ln p(g_t) = \sum_{t=1}^{s}\Big(\ln \sum_{h} e^{-E_\theta(g_t,h)} - \ln \sum_{v,h} e^{-E_\theta(v,h)}\Big) \quad (6)$$
two, CD training algorithm
The contrastive divergence (CD) algorithm takes the derivatives of the log-likelihood function with respect to the parameter θ as follows:

$$\frac{\partial \ln L(\theta)}{\partial w_{ij}} = \langle v_i h_j\rangle_{data} - \langle v_i h_j\rangle_{model} \quad (7)$$

$$\frac{\partial \ln L(\theta)}{\partial a_i} = \langle v_i\rangle_{data} - \langle v_i\rangle_{model} \quad (8)$$

$$\frac{\partial \ln L(\theta)}{\partial b_j} = \langle h_j\rangle_{data} - \langle h_j\rangle_{model} \quad (9)$$

The update rule for each parameter is:

$$\Delta w_{ij} = \varepsilon_{CD}\big(\langle v_i h_j\rangle_{data} - \langle v_i h_j\rangle_{recon}\big) \quad (10)$$

$$\Delta a_i = \varepsilon_{CD}\big(\langle v_i\rangle_{data} - \langle v_i\rangle_{recon}\big) \quad (11)$$

$$\Delta b_j = \varepsilon_{CD}\big(\langle h_j\rangle_{data} - \langle h_j\rangle_{recon}\big) \quad (12)$$

In formulas (10)-(12), $\varepsilon_{CD}$ is the learning step size, $\langle \cdot \rangle_{data}$ denotes the expectation under the data distribution, and $\langle \cdot \rangle_{recon}$ denotes the expectation under the distribution defined by the model after one-step reconstruction.
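The CD-1 procedure above can be summarized in a short sketch. The following Python code (a minimal illustration with our own class and variable names, not code from the patent) applies formulas (3), (4) and (10)-(12) to train one Bernoulli RBM on a single sample per call:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Minimal Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, m, n, seed=0):
        self.rng = np.random.default_rng(seed)
        self.w = 0.01 * self.rng.standard_normal((m, n))  # real-valued weights w_ij
        self.a = np.zeros(m)                              # input (visible) biases a_i
        self.b = np.zeros(n)                              # hidden biases b_j

    def cd1_update(self, v0, eps_cd=0.1):
        # Positive phase: p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i w_ij), formula (3)
        ph0 = sigmoid(self.b + v0 @ self.w)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # One-step reconstruction: p(v_i = 1 | h) = sigmoid(a_i + sum_j w_ij h_j), formula (4)
        pv1 = sigmoid(self.a + h0 @ self.w.T)
        ph1 = sigmoid(self.b + pv1 @ self.w)
        # Update rules (10)-(12): <.>_data minus <.>_recon, scaled by the step eps_CD
        self.w += eps_cd * (np.outer(v0, ph0) - np.outer(pv1, ph1))
        self.a += eps_cd * (v0 - pv1)
        self.b += eps_cd * (ph0 - ph1)
```

Stacking several such RBMs, each trained on the hidden activations of the one below it, yields the greedy layer-wise pre-training of the DBN used in step 2.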
Step 2, building a DBN-DNN deep learning model on the basis of the DBN, as shown in fig. 2, wherein the specific process of DNN model construction is as follows:
Connect the input layer of the DNN to the output layer of the DBN to establish the DBN-DNN deep learning model. The training process of the DNN comprises forward propagation and backward propagation, as follows:
one, forward propagation process
$$\delta(p) = \frac{1}{1 + e^{-p}} \quad (13)$$

$$f(p) = p \quad (14)$$

$$h_{w,b}(p) = \delta(wp + b) \quad (15)$$

$$a_j^{(l)} = \delta\Big(\sum_{i} w_{ij}^{(l)} a_i^{(l-1)} + b_j^{(l)}\Big) \quad (16)$$

$$h_j^{(k_l)} = f\Big(\sum_{i} w_{ij}^{(k_l)} a_i^{(k_l-1)} + b_j^{(k_l)}\Big) \quad (17)$$

In formula (16), $w_{ij}^{(l)}$ represents the connection weight between the jth node of the lth layer and the ith node of the (l-1)th layer, and $b_j^{(l)}$ represents the intercept term of the jth node of the lth layer; in formula (17), $h_j^{(k_l)}$ represents the output value of the jth node of the $k_l$-th (output) layer.
Two, backward propagation process
For the k-thlThe residual error calculation formula of the output unit i of the layer (output layer) is as follows:
Figure BDA0001782695150000123
for l ═ kl-1,kl-2,kl-3The residual error calculation for each layer …,2, i-th node of the l-th layer is as follows:
Figure BDA0001782695150000124
for a fixed training sample set { (u)1,y1),…,(uc,yc) And (4) solving the deep neural network by using a small batch gradient descent method, wherein the loss function of the deep neural network comprises c samples:
Figure BDA0001782695150000125
in which (20) the weight decay parameter λ is used to control the importance of both terms in (20); thus, the update for parameters w and b for each iteration in the small batch gradient descent method is as follows:
Figure BDA0001782695150000126
Figure BDA0001782695150000127
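To make the update concrete, the following Python sketch (illustrative only; the one-hidden-layer architecture and names are our own assumptions) performs one mini-batch gradient-descent step for a regression network with the sigmoid hidden activation (13), identity output (14), weight-decay loss (20) and updates (21)-(22):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def minibatch_step(w1, b1, w2, b2, U, Y, alpha=0.05, lam=1e-4):
    """One mini-batch step: U is a (c, d) input batch, Y a (c,) target vector."""
    c = U.shape[0]
    # Forward propagation: sigmoid hidden layer, identity output f(p) = p
    z1 = U @ w1 + b1                          # (c, n_hidden) pre-activations
    a1 = sigmoid(z1)
    yhat = a1 @ w2 + b2                       # (c,) predicted line loss rates
    # Residuals: output layer (18) with f'(z) = 1, hidden layer (19)
    d2 = yhat - Y                             # (c,)
    d1 = np.outer(d2, w2) * a1 * (1.0 - a1)   # (c, n_hidden)
    # Gradients of loss (20): batch-averaged data term plus weight decay
    gw2 = a1.T @ d2 / c + lam * w2
    gb2 = d2.mean()
    gw1 = U.T @ d1 / c + lam * w1
    gb1 = d1.mean(axis=0)
    # Parameter updates (21)-(22) with learning rate alpha
    return w1 - alpha * gw1, b1 - alpha * gb1, w2 - alpha * gw2, b2 - alpha * gb2
```

In the DBN-DNN model, the same back-propagation updates fine-tune the stacked layers after the DBN pre-training has supplied their initial weights.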
Further, in step 3, the specific process of the transfer deep learning model is as follows:

The DBN-DNN deep learning model is first trained with the source data, the training process being as described in steps 1 and 2 above.
Firstly, measuring the distribution difference of source data and task data:
Suppose there is a source dataset $X^{(s)} = \{x_1^{(s)}, x_2^{(s)}, \ldots, x_{N_s}^{(s)}\}$ satisfying the distribution d, and a target dataset $X^{(t)} = \{x_1^{(t)}, x_2^{(t)}, \ldots, x_{N_t}^{(t)}\}$ satisfying the distribution q. The maximum mean discrepancy (MMD) of $X^{(s)}$ and $X^{(t)}$ can be expressed as:

$$\mathrm{MMD}\big(X^{(s)}, X^{(t)}\big) = \bigg\| \frac{1}{N_s}\sum_{i=1}^{N_s} \phi\big(x_i^{(s)}\big) - \frac{1}{N_t}\sum_{j=1}^{N_t} \phi\big(x_j^{(t)}\big) \bigg\|_{H} \quad (23)$$

In formula (23), H denotes the reproducing kernel Hilbert space (RKHS), and $\phi(\cdot): X \to H$ denotes the nonlinear feature mapping from the original feature space into the RKHS. Since the RKHS is typically a high-dimensional or even infinite-dimensional space, the corresponding kernel is typically chosen as the Gaussian kernel, which represents an infinite dimension:

$$K(x, x') = \exp\bigg(-\frac{\|x - x'\|^2}{2\sigma^2}\bigg) \quad (24)$$

In formula (24), σ is the width parameter of the kernel function. For computational convenience, the squared form of the MMD, i.e., $\mathrm{MMD}^2$, is generally used; it expands as:

$$\mathrm{MMD}^2 = \frac{1}{N_s^2}\sum_{i=1}^{N_s}\sum_{j=1}^{N_s} \big\langle \phi(x_i^{(s)}), \phi(x_j^{(s)})\big\rangle - \frac{2}{N_s N_t}\sum_{i=1}^{N_s}\sum_{j=1}^{N_t} \big\langle \phi(x_i^{(s)}), \phi(x_j^{(t)})\big\rangle + \frac{1}{N_t^2}\sum_{i=1}^{N_t}\sum_{j=1}^{N_t} \big\langle \phi(x_i^{(t)}), \phi(x_j^{(t)})\big\rangle \quad (25)$$

Introducing the kernel trick,

$$\big\langle \phi(x), \phi(x')\big\rangle_{H} = \big\langle K(x,\cdot), K(\cdot,x')\big\rangle_{H} = K(x, x') \quad (26)$$

equation (25) reduces to:

$$\mathrm{MMD}^2 = \frac{1}{N_s^2}\sum_{i=1}^{N_s}\sum_{j=1}^{N_s} K\big(x_i^{(s)}, x_j^{(s)}\big) - \frac{2}{N_s N_t}\sum_{i=1}^{N_s}\sum_{j=1}^{N_t} K\big(x_i^{(s)}, x_j^{(t)}\big) + \frac{1}{N_t^2}\sum_{i=1}^{N_t}\sum_{j=1}^{N_t} K\big(x_i^{(t)}, x_j^{(t)}\big) \quad (27)$$
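A direct implementation of formula (27) only needs the three Gram-matrix blocks. The following Python sketch (the helper names are our own; σ is a free parameter) computes the squared MMD with the Gaussian kernel (24):

```python
import numpy as np

def gaussian_gram(A, B, sigma=1.0):
    """Gram matrix K(a, b) = exp(-||a - b||^2 / (2 sigma^2)), formula (24)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * sigma**2))

def mmd2(Xs, Xt, sigma=1.0):
    """Squared maximum mean discrepancy between source Xs and target Xt, formula (27)."""
    Ns, Nt = len(Xs), len(Xt)
    return (gaussian_gram(Xs, Xs, sigma).sum() / Ns**2
            - 2.0 * gaussian_gram(Xs, Xt, sigma).sum() / (Ns * Nt)
            + gaussian_gram(Xt, Xt, sigma).sum() / Nt**2)
```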
Second, migrating source data by the maximum mean discrepancy contribution coefficient

In order to find the data in the source data whose distribution is irrelevant or weakly relevant to the target data, the source data are screened here by defining a maximum mean discrepancy contribution coefficient (CCMMD) for each source sample.

Let $\rho_i$ denote the maximum mean discrepancy contribution coefficient of the ith sample, and suppose the MMD with the ith source data sample missing, $\mathrm{MMD}_{\gamma\neq i}$, is:

$$\mathrm{MMD}_{\gamma\neq i} = \bigg\| \frac{1}{N_s - 1}\sum_{\gamma\neq i} \phi\big(x_\gamma^{(s)}\big) - \frac{1}{N_t}\sum_{j=1}^{N_t} \phi\big(x_j^{(t)}\big) \bigg\|_{H} \quad (28)$$

In formula (28), $\mathrm{MMD}_{\gamma\neq i}$ represents the maximum mean distance with the ith source data sample missing, where $\gamma = 1, 2, \ldots, N_s$; then

$$\rho_i = \mathrm{MMD}_{\gamma\neq i}^2 - \mathrm{MMD}^2 \quad (29)$$

In formula (29), $\rho_i$ represents the maximum mean discrepancy contribution coefficient of the ith sample. If $\rho_i > 0$, the ith sample makes a positive contribution to the MMD (its removal would enlarge the distribution distance); conversely, if $\rho_i \leq 0$, the ith sample makes a "negative" contribution to the MMD. By calculating $\rho_i$ for each source sample and migrating the samples with $\rho_i > 0$, source data whose distribution is closer to that of the target data are obtained.
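Built on the mmd2 helper above, a leave-one-out sketch of the CCMMD screening might look as follows (the sign convention for ρ_i and the function names reflect our reading of formulas (28)-(29); they are not code from the patent):

```python
import numpy as np

def ccmmd_screen(Xs, Xt, sigma=1.0):
    """Return the mask of source samples with rho_i > 0 and the rho values."""
    base = mmd2(Xs, Xt, sigma)                     # MMD^2 over all source samples
    rho = np.empty(len(Xs))
    for i in range(len(Xs)):
        Xs_wo = np.delete(Xs, i, axis=0)           # leave the i-th source sample out
        rho[i] = mmd2(Xs_wo, Xt, sigma) - base     # contribution coefficient rho_i, (29)
    return rho > 0, rho
```

The samples selected by the mask (rho > 0) are the migrated source data used to fine-tune the DNN.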
Finally, the DBN layer of the DBN-DNN is frozen, and the migrated source data are used to fine-tune the higher-layer DNN neural network, obtaining the TDBN-DNN deep transfer learning model.
Further, in step 4, the theoretical line loss prediction model based on TDBN-DNN is established by the following specific process:
first, theoretical line loss problem description
Using the deep learning model, the nonlinear mapping relation between the load data, generator output data and bus voltage data of the operating power grid and the line loss rate is simulated. Considering the strong feature extraction and fitting capabilities of a deep learning model, the active power data, reactive power data and bus voltage data of each node in the equivalent model of the power grid are taken as the input matrix X of the deep model, and the theoretical line loss rate of the corresponding network is taken as the output matrix T. Assume the power grid has r nodes, including e PQ nodes, r-e-1 PV nodes and one balance node, as shown in formulas (30) and (31):

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,2r} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,2r} \\ \vdots & \vdots & & \vdots \\ x_{MC,1} & x_{MC,2} & \cdots & x_{MC,2r} \end{bmatrix} \quad (30)$$

$$T = [\Delta\xi_1, \Delta\xi_2, \ldots, \Delta\xi_{MC}] \quad (31)$$

In formula (30), each row collects the 2r input features (active power, reactive power and bus voltage data) of one section, MC is the number of sections, r is the number of nodes in the power grid, and e is the number of PQ load nodes; in formula (31), $\Delta\xi_i$ is the line loss rate label data of the ith sample.
second, TDBN-DNN model construction and solution
The method comprises the following specific steps:
(1) Data processing. Because the value ranges and units of the load data, generator output data and bus voltage data differ, the input data are normalized to avoid the influence of dimension and to prevent features with large absolute values from dominating features with small ones, as shown in formula (32):

$$\alpha'_{i,j} = \frac{\alpha_{i,j} - \alpha_{min,j}}{\alpha_{max,j} - \alpha_{min,j}} \quad (32)$$

In formula (32), $\alpha_{i,j}$ represents the ith sample of the jth data feature, i = 1, 2, …, MC, j = 1, 2, …, 2r; $\alpha'_{i,j}$ represents the normalized value of the ith sample of the jth data feature; $\alpha_{min,j}$ represents the minimum value of the jth data feature in the sample data; and $\alpha_{max,j}$ represents the maximum value of the jth data feature in the sample data.
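A sketch of formula (32) in Python, applied column-wise over the 2r features (the function name is our own):

```python
import numpy as np

def normalize_features(X):
    """Min-max normalize each column of the (MC, 2r) sample matrix, formula (32)."""
    x_min = X.min(axis=0)                          # alpha_min,j for each feature j
    x_max = X.max(axis=0)                          # alpha_max,j for each feature j
    return (X - x_min) / (x_max - x_min), x_min, x_max
```

The returned x_min and x_max should be reused to normalize the task-target input data in step (4), so that source and target features share the same scale.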
(2) Pre-training the deep learning model. First, the number of hidden layers and nodes of the deep learning network is determined according to the scale and real-time requirements of the samples, and the network parameters are initialized. Then, the normalized sample data of the power grid (generator output, load, voltage, etc.) are taken as input data, the corresponding section line loss rate is taken as label data, and the whole DBN deep belief network model is trained layer by layer with a greedy unsupervised learning algorithm. Finally, the feature vector output by the DBN provides a good initial value for the DNN, and the DNN is trained with the BP algorithm, supervised by the corresponding section line loss rate, to fit the label data.
(3) Transferring the deep learning model. First, the bottom layers of the deep learning model, generally regarded as a generic network, are frozen. Then, the distribution difference between the source training data and the target data is measured with the maximum mean discrepancy (MMD) method, the maximum mean discrepancy contribution coefficient (CCMMD) of each sample in the source data is calculated, and the sample data with $\rho_i > 0$ are selected. Finally, the pre-trained DNN is fine-tuned with the selected source sample data to obtain the deep transfer learning model TDBN-DNN.
(4) Predicting the line loss rate. After the input data of the task target are normalized, they are input into the TDBN-DNN line loss rate prediction model to obtain the predicted line loss rate value.
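Putting steps (1)-(4) together, the overall workflow can be sketched as follows; train_dbn_layerwise, train_dnn, and finetune_dnn are hypothetical placeholders for the DBN pre-training, BP training and fine-tuning routines described above, not an API defined by the patent:

```python
def predict_line_loss_rate(X_source, T_source, X_target):
    # Step (1): min-max normalization, formula (32); reuse source statistics for the target
    Xs, x_min, x_max = normalize_features(X_source)
    Xt = (X_target - x_min) / (x_max - x_min)
    # Step (2): greedy layer-wise DBN pre-training, then supervised BP training of the DNN
    dbn = train_dbn_layerwise(Xs)
    dnn = train_dnn(dbn.transform(Xs), T_source)
    # Step (3): freeze the DBN, screen source samples by CCMMD, fine-tune the DNN (TDBN-DNN)
    keep, _ = ccmmd_screen(Xs, Xt)
    dnn = finetune_dnn(dnn, dbn.transform(Xs[keep]), T_source[keep])
    # Step (4): predict the line loss rate for the task-target sections
    return dnn.predict(dbn.transform(Xt))
```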
The above-mentioned embodiments are merely illustrative of the preferred embodiments of the present invention, and do not limit the scope of the present invention, and various modifications and improvements of the technical solution of the present invention by those skilled in the art should fall within the protection scope defined by the claims of the present invention without departing from the spirit of the present invention.

Claims (4)

1. A method for predicting the theoretical line loss rate based on deep transfer learning, characterized by comprising the following steps:
step 1, establishing a DBN deep belief network formed by stacking a plurality of RBM models, where the parameter θ of each RBM model is obtained by taking derivatives of the log-likelihood function with the contrastive divergence algorithm;
step 2, connecting the output layer of the DBN deep belief network to the input layer of the DNN model to form the DBN-DNN deep learning model, the DNN model consisting of several layers of ordinary neural networks with the last layer as the output layer;
step 3, freezing the lower DBN in the DBN-DNN deep learning model, measuring the distribution distance between the source data and the task prediction data with the MMD method, and migrating the source samples with $\rho_i > 0$; based on the migrated data, a model that fine-tunes the DNN structure in the DBN-DNN is obtained, namely the TDBN-DNN transfer deep learning model;
step 4, using the TDBN-DNN transfer deep learning model to simulate the nonlinear mapping relation between the load data, generator output data and bus voltage data of the operating power grid and the line loss rate, and predicting the line loss rate;
the specific steps of step 4 are as follows:
(1) data processing
The input load data, generator output data and bus voltage data are normalized according to formula (32):

$$\alpha'_{i,j} = \frac{\alpha_{i,j} - \alpha_{min,j}}{\alpha_{max,j} - \alpha_{min,j}} \quad (32)$$

In formula (32), $\alpha_{i,j}$ represents the ith sample of the jth data feature, i = 1, 2, …, MC, j = 1, 2, …, 2r; $\alpha'_{i,j}$ represents the normalized value of the ith sample of the jth data feature; $\alpha_{min,j}$ represents the minimum value of the jth data feature in the sample data; and $\alpha_{max,j}$ represents the maximum value of the jth data feature in the sample data;
(2) Simulating, with the TDBN-DNN transfer deep learning model, the nonlinear mapping relation between the load data, generator output data and bus voltage data of the operating power grid and the line loss rate

Determining the number of hidden layers and nodes of the TDBN-DNN transfer deep learning model according to the scale and real-time requirements of the samples, and initializing the network parameters; then taking the normalized load data, generator output data and bus voltage sample data as input data and the corresponding section line loss rate as label data, and training the whole DBN deep belief network model layer by layer with a greedy unsupervised learning algorithm; finally, using the feature vector output by the DBN as a good initial value for the DNN, and training the DNN with the BP algorithm, supervised by the corresponding section line loss rate, to fit the label data;

(3) Line loss rate prediction of the TDBN-DNN transfer deep learning model

After the input data of the task target are normalized, they are input into the TDBN-DNN line loss rate prediction model to obtain the predicted line loss rate value.
2. The method for predicting the theoretical line loss rate based on deep transfer learning according to claim 1, wherein the specific steps of step 1 are as follows:
firstly, building a DBN deep confidence network
The RBM model is an energy model. For a set of known states (v, h), its energy function is defined as $E_\theta(v,h)$, where θ is the set of network parameters $\theta = \{a_i, b_j, w_{ij}\}$, and the joint probability distribution of the visible and hidden layers is defined as $p(v,h)$, i.e.:

$$E_\theta(v,h) = -\sum_{i=1}^{m} a_i v_i - \sum_{j=1}^{n} b_j h_j - \sum_{i=1}^{m}\sum_{j=1}^{n} v_i w_{ij} h_j \quad (1)$$

$$p(v,h) = \frac{1}{Z_\theta}\, e^{-E_\theta(v,h)} \quad (2)$$

In formula (1), m is the number of input units, n is the number of output units, $a_i$ is the real-valued bias of each input unit, $b_j$ is the real-valued bias of each hidden unit, $v_i$ represents the input data, $h_j$ is the hidden-unit output data, and $w_{ij}$ is a real-valued weight matrix;

in formula (2), $Z_\theta$ is the partition function,

$$Z_\theta = \sum_{v,h} e^{-E_\theta(v,h)}$$
From the conditional independence between the layers of the restricted Boltzmann machine, when the input data are given, the node values of the output layer satisfy the conditional probability:

$$p(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_{i=1}^{m} v_i w_{ij}\Big) \quad (3)$$

In formula (3), $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the sigmoid activation function; after the output-layer data are determined, the conditional probability of the input-node values is:

$$p(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_{j=1}^{n} w_{ij} h_j\Big) \quad (4)$$
Given a set of training samples $G = \{g_1, g_2, \ldots, g_s\}$, training the RBM model means adjusting the parameter θ to fit the given training samples, i.e., obtaining θ by maximizing the likelihood function L(θ) of the network:

$$\theta^{*} = \arg\max_{\theta} L(\theta) = \arg\max_{\theta} \prod_{t=1}^{s} p(g_t) \quad (5)$$

To simplify the calculation, this is written in logarithmic form:

$$\ln L(\theta) = \sum_{t=1}^{s} \ln p(g_t) = \sum_{t=1}^{s}\Big(\ln \sum_{h} e^{-E_\theta(g_t,h)} - \ln \sum_{v,h} e^{-E_\theta(v,h)}\Big) \quad (6)$$
two, CD training algorithm
The contrastive divergence algorithm takes the derivatives of the log-likelihood function with respect to the parameter θ as follows:

$$\frac{\partial \ln L(\theta)}{\partial w_{ij}} = \langle v_i h_j\rangle_{data} - \langle v_i h_j\rangle_{model} \quad (7)$$

$$\frac{\partial \ln L(\theta)}{\partial a_i} = \langle v_i\rangle_{data} - \langle v_i\rangle_{model} \quad (8)$$

$$\frac{\partial \ln L(\theta)}{\partial b_j} = \langle h_j\rangle_{data} - \langle h_j\rangle_{model} \quad (9)$$

The update rule for each parameter is:

$$\Delta w_{ij} = \varepsilon_{CD}\big(\langle v_i h_j\rangle_{data} - \langle v_i h_j\rangle_{recon}\big) \quad (10)$$

$$\Delta a_i = \varepsilon_{CD}\big(\langle v_i\rangle_{data} - \langle v_i\rangle_{recon}\big) \quad (11)$$

$$\Delta b_j = \varepsilon_{CD}\big(\langle h_j\rangle_{data} - \langle h_j\rangle_{recon}\big) \quad (12)$$

In formulas (10)-(12), $\varepsilon_{CD}$ is the learning step size, $\langle \cdot \rangle_{data}$ denotes the expectation under the data distribution, and $\langle \cdot \rangle_{recon}$ denotes the expectation under the distribution defined by the model after one-step reconstruction.
3. The method for predicting the theoretical line loss rate based on deep transfer learning according to claim 1, wherein the specific steps of step 2 are as follows:
one, forward propagation process
$$\delta(p) = \frac{1}{1 + e^{-p}} \quad (13)$$

$$f(p) = p \quad (14)$$

$$h_{w,b}(p) = \delta(wp + b) \quad (15)$$

$$a_j^{(l)} = \delta\Big(\sum_{i} w_{ij}^{(l)} a_i^{(l-1)} + b_j^{(l)}\Big) \quad (16)$$

$$h_j^{(k_l)} = f\Big(\sum_{i} w_{ij}^{(k_l)} a_i^{(k_l-1)} + b_j^{(k_l)}\Big) \quad (17)$$

In formula (16), $w_{ij}^{(l)}$ represents the connection weight between the jth node of the lth layer and the ith node of the (l-1)th layer, and $b_j^{(l)}$ represents the intercept term of the jth node of the lth layer; in formula (17), $h_j^{(k_l)}$ represents the output value of the jth node of the $k_l$-th (output) layer;
Two, backward propagation process
For the k-th layer as the output layerlThe residual error calculation formula of the output unit i of the layer is as follows:
Figure FDA0003292996110000047
for l ═ kl-1,kl-2,kl-3In each layer L,2, the residual calculation formula of the ith node in the L-th layer is as follows:
Figure FDA0003292996110000051
for a fixed training sample set { (u)1,y1),L,(uc,yc) Is composed of cThe method comprises the following steps of solving a deep neural network by using a small batch gradient descent method, wherein a loss function of the deep neural network is as follows:
Figure FDA0003292996110000052
in the equation (20), the weight attenuation parameter λ is used for the importance of the two terms in the control formula (20); thus, the update for parameters w and b for each iteration in the small batch gradient descent method is as follows:
Figure FDA0003292996110000053
Figure FDA0003292996110000054
4. The method for predicting the theoretical line loss rate based on deep transfer learning according to claim 1, wherein the specific process of step 3 is as follows:
firstly, measuring the distribution difference of source data and task data:
Suppose there is a source dataset $X^{(s)} = \{x_1^{(s)}, x_2^{(s)}, \ldots, x_{N_s}^{(s)}\}$ satisfying the distribution d, and a target dataset $X^{(t)} = \{x_1^{(t)}, x_2^{(t)}, \ldots, x_{N_t}^{(t)}\}$ satisfying the distribution q. The maximum mean discrepancy MMD of $X^{(s)}$ and $X^{(t)}$ can be expressed as:

$$\mathrm{MMD}\big(X^{(s)}, X^{(t)}\big) = \bigg\| \frac{1}{N_s}\sum_{i=1}^{N_s} \phi\big(x_i^{(s)}\big) - \frac{1}{N_t}\sum_{j=1}^{N_t} \phi\big(x_j^{(t)}\big) \bigg\|_{H} \quad (23)$$

In formula (23), H denotes the reproducing kernel Hilbert space RKHS, and $\phi(\cdot): X \to H$ denotes the nonlinear feature mapping from the original feature space into the RKHS. Since the RKHS is typically a high-dimensional or even infinite-dimensional space, the corresponding kernel is chosen as the Gaussian kernel, which represents an infinite dimension:

$$K(x, x') = \exp\bigg(-\frac{\|x - x'\|^2}{2\sigma^2}\bigg) \quad (24)$$

In formula (24), σ is the width parameter of the kernel function. Here, the squared form of the MMD, i.e., $\mathrm{MMD}^2$, is used; it expands as:

$$\mathrm{MMD}^2 = \frac{1}{N_s^2}\sum_{i=1}^{N_s}\sum_{j=1}^{N_s} \big\langle \phi(x_i^{(s)}), \phi(x_j^{(s)})\big\rangle - \frac{2}{N_s N_t}\sum_{i=1}^{N_s}\sum_{j=1}^{N_t} \big\langle \phi(x_i^{(s)}), \phi(x_j^{(t)})\big\rangle + \frac{1}{N_t^2}\sum_{i=1}^{N_t}\sum_{j=1}^{N_t} \big\langle \phi(x_i^{(t)}), \phi(x_j^{(t)})\big\rangle \quad (25)$$

Introducing the kernel trick,

$$\big\langle \phi(x), \phi(x')\big\rangle_{H} = \big\langle K(x,\cdot), K(\cdot,x')\big\rangle_{H} = K(x, x') \quad (26)$$

equation (25) reduces to:

$$\mathrm{MMD}^2 = \frac{1}{N_s^2}\sum_{i=1}^{N_s}\sum_{j=1}^{N_s} K\big(x_i^{(s)}, x_j^{(s)}\big) - \frac{2}{N_s N_t}\sum_{i=1}^{N_s}\sum_{j=1}^{N_t} K\big(x_i^{(s)}, x_j^{(t)}\big) + \frac{1}{N_t^2}\sum_{i=1}^{N_t}\sum_{j=1}^{N_t} K\big(x_i^{(t)}, x_j^{(t)}\big) \quad (27)$$
Second, migrating source data by the maximum mean discrepancy contribution coefficient

In order to find the data in the source data whose distribution is irrelevant or weakly relevant to the target data, the source data are screened by defining a maximum mean discrepancy contribution coefficient CCMMD for each source sample.

Let $\rho_i$ denote the maximum mean discrepancy contribution coefficient of the ith sample, and suppose the MMD with the ith source data sample missing, $\mathrm{MMD}_{\gamma\neq i}$, is:

$$\mathrm{MMD}_{\gamma\neq i} = \bigg\| \frac{1}{N_s - 1}\sum_{\gamma\neq i} \phi\big(x_\gamma^{(s)}\big) - \frac{1}{N_t}\sum_{j=1}^{N_t} \phi\big(x_j^{(t)}\big) \bigg\|_{H} \quad (28)$$

In formula (28), $\mathrm{MMD}_{\gamma\neq i}$ represents the maximum mean distance with the ith source data sample missing, where $\gamma = 1, 2, \ldots, N_s$; then

$$\rho_i = \mathrm{MMD}_{\gamma\neq i}^2 - \mathrm{MMD}^2 \quad (29)$$

In formula (29), $\rho_i$ represents the maximum mean discrepancy contribution coefficient of the ith sample. If $\rho_i > 0$, the ith sample makes a positive contribution to the MMD (its removal would enlarge the distribution distance); conversely, if $\rho_i \leq 0$, the ith sample makes a "negative" contribution to the MMD. By calculating $\rho_i$ for each source sample and migrating the samples with $\rho_i > 0$, source data whose distribution is closer to that of the target data are obtained;

Finally, the DBN layer of the DBN-DNN is frozen, and based on the migrated data a model that fine-tunes the DNN structure in the DBN-DNN is obtained, namely the TDBN-DNN transfer deep learning model.
CN201810999797.9A 2018-08-30 2018-08-30 Theoretical line loss rate prediction model based on deep transfer learning Active CN109102126B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810999797.9A CN109102126B (en) 2018-08-30 2018-08-30 Theoretical line loss rate prediction model based on deep transfer learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810999797.9A CN109102126B (en) 2018-08-30 2018-08-30 Theoretical line loss rate prediction model based on deep transfer learning

Publications (2)

Publication Number Publication Date
CN109102126A CN109102126A (en) 2018-12-28
CN109102126B true CN109102126B (en) 2021-12-10

Family

ID=64864268

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810999797.9A Active CN109102126B (en) 2018-08-30 2018-08-30 Theoretical line loss rate prediction model based on deep transfer learning

Country Status (1)

Country Link
CN (1) CN109102126B (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109800526B (en) * 2019-01-30 2022-11-04 华侨大学 Intelligent design method and system for customizing children's garment paper pattern
CN109871622A (en) * 2019-02-25 2019-06-11 燕山大学 A kind of low-voltage platform area line loss calculation method and system based on deep learning
CN110059802A (en) * 2019-03-29 2019-07-26 阿里巴巴集团控股有限公司 For training the method, apparatus of learning model and calculating equipment
CN110473634B (en) * 2019-04-23 2021-10-08 浙江大学 Genetic metabolic disease auxiliary screening method based on multi-domain fusion learning
CN110348713A (en) * 2019-06-28 2019-10-18 广东电网有限责任公司 A kind of platform area line loss calculation method based on association analysis and data mining
CN110378035A (en) * 2019-07-19 2019-10-25 南京工业大学 Hydrocracking soft measurement modeling method based on deep learning
CN110536257B (en) * 2019-08-21 2022-02-08 成都电科慧安科技有限公司 Indoor positioning method based on depth adaptive network
CN110535146B (en) * 2019-08-27 2022-09-23 哈尔滨工业大学 Electric power system reactive power optimization method based on depth determination strategy gradient reinforcement learning
CN110399796A (en) * 2019-09-02 2019-11-01 国网上海市电力公司 A kind of electrical energy power quality disturbance recognition methods based on improvement deep learning algorithm
CN110705029B (en) * 2019-09-05 2021-09-07 西安交通大学 Flow field prediction method of oscillating flapping wing energy acquisition system based on transfer learning
CN110879917A (en) * 2019-11-08 2020-03-13 北京交通大学 Electric power system transient stability self-adaptive evaluation method based on transfer learning
CN110969293B (en) * 2019-11-22 2023-07-21 上海交通大学 Short-term generalized power load prediction method based on transfer learning
CN110990135B (en) * 2019-11-28 2023-05-12 中国人民解放军国防科技大学 Spark job time prediction method and device based on deep migration learning
CN110910969A (en) * 2019-12-04 2020-03-24 云南锡业集团(控股)有限责任公司研发中心 Tin-bismuth alloy performance prediction method based on transfer learning
CN111612029B (en) * 2020-03-30 2023-08-04 西南电子技术研究所(中国电子科技集团公司第十研究所) Airborne electronic product fault prediction method
CN111652264B (en) * 2020-04-13 2023-08-18 湖北华中电力科技开发有限责任公司 Negative migration sample screening method based on maximum mean value difference
CN111522290A (en) * 2020-04-24 2020-08-11 大唐环境产业集团股份有限公司 Denitration control method and system based on deep learning method
CN114004328A (en) * 2020-07-27 2022-02-01 华为技术有限公司 AI model updating method, device, computing equipment and storage medium
CN112560079B (en) * 2020-11-03 2024-04-19 浙江工业大学 Hidden false data injection attack method based on deep belief network and migration learning
CN112330488B (en) * 2020-11-05 2022-07-05 贵州电网有限责任公司 Power grid frequency situation prediction method based on transfer learning
CN113393051B (en) * 2021-06-30 2023-07-18 四川大学 Power distribution network investment decision-making method based on deep migration learning
CN113537244B (en) * 2021-07-23 2024-03-15 深圳职业技术学院 Livestock image target detection method and device based on lightweight YOLOv4
CN113505847B (en) * 2021-07-26 2023-04-14 云南电网有限责任公司电力科学研究院 Energy-saving online measuring system and method based on transfer learning
CN114239114A (en) * 2021-12-21 2022-03-25 浙江工业大学台州研究院 Truss stress prediction and lightweight method based on transfer learning fusion model
CN115879569B (en) * 2023-03-08 2023-05-23 齐鲁工业大学(山东省科学院) Online learning method and system for IoT observation data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104794534A (en) * 2015-04-16 2015-07-22 国网山东省电力公司临沂供电公司 Power grid security situation predicting method based on improved deep learning model
CN107769972A (en) * 2017-10-25 2018-03-06 武汉大学 A kind of power telecom network equipment fault Forecasting Methodology based on improved LSTM
CN108108858A (en) * 2018-01-22 2018-06-01 佛山科学技术学院 A kind of Short-Term Load Forecasting Method
CN108304927A (en) * 2018-01-25 2018-07-20 清华大学 Bearing fault modality diagnostic method and system based on deep learning

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10796220B2 (en) * 2016-05-24 2020-10-06 Marvell Asia Pte, Ltd. Systems and methods for vectorized FFT for multi-dimensional convolution operations
CN106199174A (en) * 2016-07-01 2016-12-07 广东技术师范学院 Extruder energy consumption predicting abnormality method based on transfer learning
US10452899B2 (en) * 2016-08-31 2019-10-22 Siemens Healthcare Gmbh Unsupervised deep representation learning for fine-grained body part recognition
CN107679859B (en) * 2017-07-18 2020-08-25 中国银联股份有限公司 Risk identification method and system based on migration deep learning
CN107506590A (en) * 2017-08-26 2017-12-22 郑州大学 A kind of angiocardiopathy forecast model based on improvement depth belief network
CN108256556A (en) * 2017-12-22 2018-07-06 上海电机学院 Wind-driven generator group wheel box method for diagnosing faults based on depth belief network
CN108154223B (en) * 2017-12-22 2022-04-15 北京映翰通网络技术股份有限公司 Power distribution network working condition wave recording classification method based on network topology and long time sequence information
CN108446711B (en) * 2018-02-01 2022-04-22 南京邮电大学 Software defect prediction method based on transfer learning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104794534A (en) * 2015-04-16 2015-07-22 国网山东省电力公司临沂供电公司 Power grid security situation predicting method based on improved deep learning model
CN107769972A (en) * 2017-10-25 2018-03-06 武汉大学 A kind of power telecom network equipment fault Forecasting Methodology based on improved LSTM
CN108108858A (en) * 2018-01-22 2018-06-01 佛山科学技术学院 A kind of Short-Term Load Forecasting Method
CN108304927A (en) * 2018-01-25 2018-07-20 清华大学 Bearing fault modality diagnostic method and system based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on transfer reinforcement learning optimization algorithms for power systems; Zhang Xiaoshun; China Doctoral Dissertations Full-text Database, Engineering Science and Technology II; 2018-07-15; C042-35 *

Also Published As

Publication number Publication date
CN109102126A (en) 2018-12-28

Similar Documents

Publication Publication Date Title
CN109102126B (en) Theoretical line loss rate prediction model based on deep transfer learning
Zhang et al. Modeling and simulating of reservoir operation using the artificial neural network, support vector regression, deep learning algorithm
CN109492822B (en) Air pollutant concentration time-space domain correlation prediction method
CN113053115B (en) Traffic prediction method based on multi-scale graph convolution network model
Li et al. An intelligent transient stability assessment framework with continual learning ability
CN111860982A (en) Wind power plant short-term wind power prediction method based on VMD-FCM-GRU
CN112990556A (en) User power consumption prediction method based on Prophet-LSTM model
CN108985515B (en) New energy output prediction method and system based on independent cyclic neural network
CN110298434B (en) Integrated deep belief network based on fuzzy partition and fuzzy weighting
CN113158572A (en) Short-term load prediction method and device
CN112488452B (en) Energy system management multi-time scale optimal decision method based on deep reinforcement learning
CN111525587A (en) Reactive load situation-based power grid reactive voltage control method and system
CN108182490A (en) A kind of short-term load forecasting method under big data environment
CN116610416A (en) Load prediction type elastic expansion system and method based on Kubernetes
Na et al. A novel heuristic artificial neural network model for urban computing
Samsudin et al. A hybrid least squares support vector machines and GMDH approach for river flow forecasting
CN112183721B (en) Construction method of combined hydrological prediction model based on self-adaptive differential evolution
CN117610689A (en) Training method of dynamic neural network model based on information entropy integrated learning
CN110705756B (en) Electric power energy consumption optimization control method based on input convex neural network
Guo et al. Short-Term Water Demand Forecast Based on Deep Neural Network:(029)
CN116663745A (en) LSTM drainage basin water flow prediction method based on PCA_DWT
CN116681159A (en) Short-term power load prediction method based on whale optimization algorithm and DRESN
CN114254828B (en) Power load prediction method based on mixed convolution feature extractor and GRU
CN110059871A (en) Photovoltaic power generation power prediction method
CN115907000A (en) Small sample learning method for optimal power flow prediction of power system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant