CN107169628A - Distribution network reliability evaluation method based on big-data mutual information attribute reduction

Info

Publication number
CN107169628A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710244420.8A
Other languages
Chinese (zh)
Other versions
CN107169628B (en
Inventor
李妍
盛梦雨
刘婉兵
杜明秋
杨秉臻
杨晨光
王少荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN201710244420.8A
Publication of CN107169628A
Application granted
Publication of CN107169628B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Game Theory and Decision Science (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the field of distribution network planning and provides a distribution network reliability evaluation method based on big-data mutual information attribute reduction. Starting from big data, the method uses the mutual information concept from rough set theory to measure the correlation between basic indices, screens a massive, multi-category set of indices for those that are strongly correlated with the reliability index and mutually independent, and uses the selected indices as inputs to a BP neural network model optimized by a genetic algorithm to evaluate distribution network reliability. The invention overcomes the limitations of traditional Monte Carlo simulation and analytical methods and, for electric power big data, realizes distribution network reliability evaluation based on big-data mutual information attribute reduction.

Description

Distribution network reliability evaluation method based on big-data mutual information attribute reduction
Technical field
The invention relates to the field of distribution network planning, and in particular to a distribution network reliability evaluation method based on big-data mutual information attribute reduction.
Background
With the development of the Internet and database technology and the automation of production environments, fields such as finance, electric power, and meteorology generate massive, highly varied, and rapidly growing data, referred to as big data. Big data has now penetrated every field, has become an important factor of production, and, because of its enormous value, has become a new engine driving industrial transformation. The value of big data can be realized only by mining and analyzing it, extracting its main information, and using that information sensibly. Distribution network reliability is a technical index strongly correlated with many factors; the related data include temperature, wind speed, electricity sales, line loss rate, and more. Conventionally, reliability is evaluated from multiple indices, such as load point indices, outage duration indices, and outage cost indices, by analytical modeling or sampling simulation. However, the analytical method has significant limitations when handling complex power systems, and Monte Carlo sampling is very time-consuming because of the redundancy of sampled states. Big data technology offers a new approach to distribution network reliability evaluation.
Summary of the invention
The object of the invention is to overcome the above shortcomings of the prior art by proposing a distribution network reliability evaluation method based on big-data mutual information attribute reduction. Starting from big data, the method uses the mutual information concept from rough set theory to measure the correlation between basic indices, screens a massive, multi-category set of indices for those that are strongly correlated with the reliability index and mutually independent, and uses these indices as inputs to a BP neural network model optimized by a genetic algorithm to evaluate distribution network reliability. The invention overcomes the limitations of traditional Monte Carlo simulation and analytical methods and, for electric power big data, realizes distribution network reliability evaluation based on big-data mutual information attribute reduction.
The object of the invention is achieved by the following technical measures.
A distribution network reliability evaluation method based on big-data mutual information attribute reduction. The method preprocesses the indices related to distribution network reliability, including discretizing continuous indices; computes the mutual information between indices based on information entropy; removes the dimension from the mutual information to obtain the entropy correlation coefficients between indices; uses these coefficients to judge the correlation between each index and the reliability index and among the indices themselves; and performs index reduction. For the indices obtained after reduction, which are strongly correlated with the reliability index and mutually independent, a BP neural network fits their nonlinear relationship, and the search capability of a genetic algorithm compensates for the shortcomings of the neural network method. The method specifically comprises the following steps:
Step 1: Collect a large amount of data related to distribution network reliability from academic, meteorological, or statistical websites;
Step 2: From the collected data, sort out the index values related to distribution network reliability, that is, compile a decision table characterizing the correspondence between the reliability index and the related indices. The table includes one decision attribute (the reliability index) representing the final level of distribution network reliability and multiple conditional attributes representing factors related to reliability;
Step 3: Preprocess the data in the decision table: from all the values of each attribute, judge whether the attribute is continuous or discrete; for a continuous attribute, compute the number of partitions it should be divided into using mathematical statistics, and discretize it with the equal-width discretization method;
Step 4: Compute the probability with which each attribute takes each discrete value, then obtain the information entropy of each attribute and the conditional entropy of the conditional attributes with respect to the decision attribute, and from these obtain the mutual information between each conditional attribute and the decision attribute and between each pair of conditional attributes;
Step 5: Normalize the mutual information between the conditional attributes and the decision attribute computed in Step 4, using the information entropy to obtain the entropy correlation coefficient between each conditional attribute and the decision attribute, and use it to judge their correlation: the smaller the entropy correlation coefficient, the weaker the correlation. Set a suitable critical value to measure the correlation between attributes, and reject the conditional attributes that are weakly correlated with the decision attribute;
Step 6: In a manner similar to Step 5, compute the pairwise entropy correlation coefficients between the conditional attributes remaining after Step 5, screen out the redundant conditional attributes that are strongly correlated with another remaining conditional attribute but more weakly correlated with the decision attribute, and delete them, obtaining a set of conditional attributes that are strongly correlated with the reliability index and mutually independent, thereby achieving attribute reduction;
Step 7: Construct a three-layer BP neural network and train it on the reduced attribute set, with the conditional attributes obtained in Step 6 as input and the decision attribute data as output; obtain the connection weights between the nodes of each layer and the thresholds of the hidden and output layers that minimize the fitting error, giving the optimal BP neural network model. To improve training accuracy, the optimal initial weights and thresholds can be obtained with a genetic algorithm.
In the above technical solution, Step 2 comprises the following steps:
Step 2.1: From the collected mass data related to the distribution network reliability of a given city, build an m × n distribution network reliability evaluation decision table, where n is the total number of decision and conditional attributes, each set of corresponding decision and conditional attribute values forms one group of attribute data, and m is the total number of groups of attribute data (the sample size);
Step 2.2: Take the index in the decision table that directly represents or determines distribution network reliability, such as power supply reliability, as the decision attribute, and take the remaining reliability-related indices, such as month, temperature, and comprehensive voltage qualification rate, as conditional attributes.
In the above technical solution, Step 3 comprises the following steps:
Step 3.1: From the values of all attributes in the decision table, judge whether each attribute's data are continuous or discrete. For example, attributes such as year and month take only a fixed set of integers and are discrete data, whereas attributes such as total electricity consumption, load factor, and comprehensive voltage qualification rate can take any value in an interval and are continuous data;
Step 3.2: According to the data distribution characteristics of each factor and the relevant objective factors, compute the number of partitions into which a continuous attribute should be divided according to formula (1);
$$k = 1.87 \times (m-1)^{2/5} \tag{1}$$
where m is the number of samples of attribute data and k is the number of partitions of the continuous attribute's value range;
Step 3.3: Compute the interval length of each continuous attribute from the number of partitions obtained in Step 3.2, divide the value range of the continuous attribute into k intervals with the equal-width discretization method, assign a discrete integer value to each interval, and compute the discretized result for the continuous attribute, completing the discretization of the continuous data.
In the above technical solution, Step 4 comprises the following steps:
Step 4.1: Count the number of samples for which each attribute takes each discrete integer value, and compute the probability that an attribute takes a specific discrete value according to formula (2);
$$p(X_i) = \frac{c(X_i)}{c(U)}, \quad i = 1, 2, \ldots, k \tag{2}$$
where k is the number of discretization partitions of attribute x, $X_i$ is the i-th value of attribute x, $c(X_i)$ is the number of samples for which attribute x takes the value $X_i$, U is the full sample set (the universe), c(U) is the total number of samples, and $p(X_i)$ is the probability that attribute x takes the value $X_i$;
Step 4.2: Obtain the information entropy of each attribute, and the conditional entropy of each conditional attribute with respect to the decision attribute and of one conditional attribute with respect to another, according to formulas (3) and (4). Note that the information entropy here measures the amount of information an attribute provides and also indicates the degree of order of the attribute's sequence of values; the conditional entropy represents how much information another attribute still carries when a given attribute is completely known;
$$H(x) = -\sum_{i=1}^{k} p(X_i) \log p(X_i) \tag{3}$$
where H(x) is the information entropy of attribute x;
$$H(y \mid x) = -\sum_{i} p(X_i) \sum_{j} p(Y_j \mid X_i) \log p(Y_j \mid X_i) \tag{4}$$
where $p(Y_j \mid X_i)$ is the probability that $Y_j$ occurs given that $X_i$ has occurred, and H(y | x) is the conditional entropy of attribute y with respect to x, that is, the conditional entropy of y given x;
Step 4.3: Using the results of Step 4.2, obtain the mutual information between each conditional attribute and the decision attribute, and between each pair of conditional attributes, according to formula (5), to represent the amount of information these attributes share,
$$I(x, y) = H(y) - H(y \mid x) \tag{5}$$
where H(y) is the information entropy of attribute y and I(x, y) is the mutual information of attributes x and y, which can be regarded as the amount of information shared by attributes y and x.
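The quantities of Step 4 can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: the function names are hypothetical, the logarithm base (2, giving bits) is an assumption because the text does not state one, and the toy columns are invented for the example.

```python
import math
from collections import Counter

def entropy(xs):
    """H(x) = -sum_i p(X_i) * log p(X_i), with p(X_i) = c(X_i) / c(U).
    Base-2 logarithm is an assumption; the source does not give the base."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())

def conditional_entropy(ys, xs):
    """H(y | x) = sum_i p(X_i) * H(y restricted to samples where x = X_i)."""
    n = len(xs)
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    return sum((len(g) / n) * entropy(g) for g in groups.values())

def mutual_information(xs, ys):
    """I(x, y) = H(y) - H(y | x)."""
    return entropy(ys) - conditional_entropy(ys, xs)

# Toy discretized columns (invented for illustration only).
x = [1, 1, 2, 2]
y_same = [1, 1, 2, 2]    # fully determined by x
y_indep = [1, 2, 1, 2]   # independent of x

mi_same = mutual_information(x, y_same)    # all of H(y) is shared
mi_indep = mutual_information(x, y_indep)  # nothing is shared
```

When y is fully determined by x the mutual information equals H(y), and when the two columns are independent it drops to zero, which is the behaviour the screening in the later steps relies on.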
In the above technical solution, Step 5 comprises the following steps:
Step 5.1: To eliminate the influence of dimension, normalize the mutual information between the conditional attributes and the decision attribute computed in Step 4.3 using formula (6) to obtain the entropy correlation coefficient, and use it to judge the correlation between each conditional attribute and the decision attribute. The smaller the entropy correlation coefficient, the weaker the correlation, and the smaller the contribution of the conditional attribute to distribution network reliability evaluation;
where $\rho_{xy}$ is the entropy correlation coefficient of attributes x and y, representing the degree of correlation between x and y;
Step 5.2: Set a critical value e1 according to the entropy correlation coefficients computed in Step 5.1. When the entropy correlation coefficient between a conditional attribute and the decision attribute is less than this critical value, the conditional attribute is considered to have little influence on distribution network reliability and is rejected from the decision table.
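Formula (6) is not reproduced in this text, so the exact normalization used for the entropy correlation coefficient cannot be confirmed here. The sketch below assumes one common choice, 2·I(x, y) / (H(x) + H(y)), which lies between 0 and 1; this form, the base-2 logarithm, and the names `H` and `entropy_corr` are all assumptions made for illustration.

```python
import math
from collections import Counter

def H(*columns):
    """Shannon entropy in bits of one column, or joint entropy of several."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in Counter(rows).values())

def entropy_corr(xs, ys):
    """ASSUMED normalization: rho_xy = 2 * I(x, y) / (H(x) + H(y)).
    The source's formula (6) may differ."""
    hx, hy = H(xs), H(ys)
    if hx + hy == 0:
        return 0.0                  # both columns constant: nothing to correlate
    mi = hx + hy - H(xs, ys)        # identical to H(y) - H(y | x)
    return 2 * mi / (hx + hy)

# Toy discretized columns (invented for illustration only).
decision = [1, 1, 2, 2]
rho_strong = entropy_corr([1, 1, 2, 2], decision)  # attribute mirrors the decision
rho_none = entropy_corr([1, 2, 1, 2], decision)    # attribute is unrelated
```

Dividing by the entropies makes coefficients comparable across attributes with different numbers of partitions, which is what allows a single critical value e1 to be applied to every conditional attribute.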
In the above technical solution, Step 6 comprises the following steps:
Step 6.1: In a manner similar to Step 5, compute the entropy correlation coefficients between the conditional attributes remaining after the rejection in Step 5.2;
Step 6.2: Set a critical value e2 according to the entropy correlation coefficients computed in Step 6.1. When the entropy correlation coefficient of two conditional attributes exceeds this critical value, the two attributes are considered so strongly correlated that each can represent the other, that is, they have roughly the same influence on distribution network reliability. In that case, compare the entropy correlation coefficients between each of the two conditional attributes and the decision attribute and delete the one more weakly correlated with the decision attribute. This reduces the redundancy of the attribute set and yields a set of conditional attributes that are strongly correlated with the reliability index and mutually independent.
In the above technical solution, Step 7 comprises the following steps:
Step 7.1: Construct a three-layer BP neural network and train it on the reduced attribute data, with the conditional attributes obtained in Step 6.2, which are strongly correlated with the reliability index, as input and the decision attribute as the final output. Assuming that p conditional attributes remain in the decision table after reduction, the input and output layers have p nodes and 1 node respectively. Randomly select b test samples from the m groups of attribute data and use the remaining samples as training samples for the neural network; each sample includes the conditional attribute and decision attribute values. Normalize the data in the samples;
Step 7.2: Randomly generate by computer h sets of initial connection weights between the nodes of each layer of the BP neural network and of thresholds for the hidden and output layers, rewrite them in binary-coded form to constitute the initial solution space, and compute the fitness of each solution in the solution space using the neural network. Select the c solutions with the highest fitness as parent solutions, and apply crossover and mutation to the parents to obtain the offspring solution space. Judge from the fitness of the offspring solutions whether the search has converged; if so, stop the search and output the optimal initial weights and thresholds; otherwise, continue with selection, crossover, and mutation;
Step 7.3: Decode the initial weights and thresholds obtained in Step 7.2, train the BP neural network on the normalized samples, and obtain the error between the estimated and actual values of the decision attribute. Judge whether the error meets the convergence condition; if not, adjust the weights and thresholds and continue training the network; if so, stop the loop and output the weights and thresholds that minimize the error.
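The genetic search of Step 7.2 can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the 8-bit encoding, the weight range [-1, 1], the sigmoid hidden layer, the fitness 1 / (1 + MSE), single-point crossover, the mutation rate, the fixed generation count in place of a convergence test, and the toy samples are all hypothetical choices. The gradient-descent refinement of Step 7.3 is not included.

```python
import math
import random

BITS = 8                  # bits per encoded parameter (assumed)
LOW, HIGH = -1.0, 1.0     # assumed range of initial weights and thresholds

def decode(bits, n_params):
    """Turn a flat list of 0/1 genes into n_params reals in [LOW, HIGH]."""
    params = []
    for i in range(n_params):
        chunk = bits[i * BITS:(i + 1) * BITS]
        value = int("".join(map(str, chunk)), 2)
        params.append(LOW + (HIGH - LOW) * value / (2 ** BITS - 1))
    return params

def predict(params, x, p, q):
    """Three-layer network: p inputs, q sigmoid hidden nodes, 1 linear output.
    Thresholds are subtracted from the weighted sums."""
    w1 = params[:p * q]
    t1 = params[p * q:p * q + q]
    w2 = params[p * q + q:p * q + 2 * q]
    t2 = params[-1]
    hidden = [1 / (1 + math.exp(-(sum(w1[j * p + i] * x[i] for i in range(p)) - t1[j])))
              for j in range(q)]
    return sum(w2[j] * hidden[j] for j in range(q)) - t2

def fitness(bits, samples, p, q):
    """Higher is better: 1 / (1 + mean squared error) on the samples."""
    params = decode(bits, p * q + 2 * q + 1)
    mse = sum((predict(params, x, p, q) - y) ** 2 for x, y in samples) / len(samples)
    return 1 / (1 + mse)

def ga_initial_weights(samples, p, q, h=20, c=6, generations=30, p_mut=0.02, seed=0):
    """Search h binary-coded solutions, keeping the c fittest as parents each round."""
    rng = random.Random(seed)
    n_params = p * q + 2 * q + 1
    n_bits = n_params * BITS
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(h)]

    def score(bits):
        return fitness(bits, samples, p, q)

    history = []
    for _ in range(generations):
        parents = sorted(population, key=score, reverse=True)[:c]   # selection
        history.append(score(parents[0]))
        children = []
        while len(children) < h - c:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)                           # single-point crossover
            child = a[:cut] + b[cut:]
            children.append([g ^ 1 if rng.random() < p_mut else g for g in child])  # mutation
        population = parents + children                              # parents survive unchanged
    best = max(population, key=score)
    return decode(best, n_params), history + [score(best)]

# Toy normalized samples (invented): two inputs, one output.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.5), ([1.0, 0.0], 0.5), ([1.0, 1.0], 1.0)]
weights, history = ga_initial_weights(samples, p=2, q=3)
```

Because the parents are carried into the next generation unchanged, the best fitness recorded in `history` never decreases. The decoded `weights` would then serve as the starting point for ordinary BP training.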
Compared with the prior art, the beneficial effects of the invention are as follows:
The invention proposes a distribution network reliability evaluation method based on mutual information and an improved BP neural network. For the large volume of varied data related to distribution network reliability that arises in the big-data context, it removes the dimension from mutual information, a concept built on information entropy, to obtain entropy correlation coefficients; screens out the indices strongly correlated with distribution network reliability; models these indices with a BP neural network; and uses the search capability of the genetic algorithm to compensate for the fact that the initial weights and thresholds of a neural network cannot otherwise be determined, achieving comprehensive, accurate, and rapid evaluation of distribution network reliability.
Brief description of the drawings
Fig. 1 is a flow chart of distribution network reliability evaluation based on big-data mutual information attribute reduction;
Fig. 2 is a flow chart of the mutual-information-based reduction of the indices related to distribution network reliability.
Detailed description of the embodiments
To make the technical means, inventive features, and purpose of the invention easy to understand, the invention is further described below.
Referring to Figs. 1 and 2, an embodiment of the invention provides a distribution network reliability evaluation method based on big-data mutual information attribute reduction, carried out in the following steps in order:
Step 1: Obtain a large amount of power distribution and consumption data for a certain city from within the electric power enterprise, and obtain data on the various aspects related to the city's distribution network reliability from meteorological, statistical, and similar websites;
Step 2: From the mass data collected in Step 1, compile a 108 × 15 distribution network reliability evaluation decision table comprising 1 decision attribute, power supply reliability (Y, %), and 14 conditional attributes: year (X1), month (X2), total electricity consumption (X3, 10^4 kWh), electricity sales (X4, 10^4 kWh), line loss rate at 220 kV and below (X5, %), load factor (X6, %), peak load (X7, 10^4 kW), comprehensive voltage qualification rate (X8, %), monthly total precipitation (X9, mm), monthly mean temperature (X10, °C), monthly sunshine hours (X11, h), monthly mean wind speed (X12, m/s), monthly windy days (X13, days), and monthly rainy days (X14, days), for a total of 108 groups of attribute data;
Step 3: From the values of all attributes in the decision table, judge whether each attribute's data are continuous or discrete. For example, attributes such as year and month take only a fixed set of integers and are discrete data, whereas the values of attributes such as total electricity consumption, load factor, and comprehensive voltage qualification rate are drawn from a continuous interval and are continuous data. To facilitate the data correlation analysis that follows, the continuous data must be discretized, as follows:
According to the data distribution characteristics of each factor and the relevant objective factors, compute the number of partitions into which a continuous attribute should be divided according to formula (1);
$$k = 1.87 \times (m-1)^{2/5} \tag{1}$$
where m is the total number of samples and k is the number of partitions of the continuous attribute.
Formula (1) gives k = 1.87 × (108 − 1)^{2/5} = 12.12, so all attributes are uniformly divided into 12 classes; the results are shown in Table 1;
The values of a continuous attribute x are divided into k intervals with the equal-width discretization method. Formula (2) gives the interval length $l_x$ used to discretize the attribute, and each interval is assigned a discrete integer value, so that after discretization the continuous data take only the discrete integer values 1, 2, ..., k. The discretized result corresponding to each original value of the attribute is then computed according to formula (3), completing the discretization; the results are shown in Table 1.
$$l_x = \frac{\max([x]) - \min([x])}{k} \tag{2}$$
In formula (2), max([x]) and min([x]) are respectively the maximum and minimum of all values of attribute x, and k is the chosen number of discretization intervals.
In formula (3), $x_i$ denotes the i-th value of attribute x before discretization, $X_i$ denotes the corresponding i-th value of attribute x after discretization, and [x] denotes rounding down, that is, taking the largest integer not exceeding x.
Table 1: Discretization results
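The partition count and the equal-width binning of Step 3 can be checked with a short sketch. The function names are hypothetical, and the handling of the maximum value (clamped into bin k so that labels stay within 1..k) is an assumption, since the discretization formula itself is not reproduced in this text. The sample values passed to the binning function are invented.

```python
import math

def partition_count(m):
    """Formula (1): k = 1.87 * (m - 1)^(2/5), before rounding."""
    return 1.87 * (m - 1) ** 0.4

def equal_width_discretize(values, k):
    """Map each value to an integer label in 1..k using k equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1 for _ in values]          # constant column: a single bin
    width = (hi - lo) / k                   # interval length l_x
    labels = []
    for v in values:
        idx = int(math.floor((v - lo) / width)) + 1
        labels.append(min(idx, k))          # assumption: the maximum falls in bin k
    return labels

k_raw = partition_count(108)                # the embodiment has m = 108 samples
k = int(k_raw)                              # 12.12 is taken as 12 classes

# Invented values, only to show the mapping onto 5 bins of width 2.
labels = equal_width_discretize([0.0, 1.0, 2.5, 5.0, 9.9, 10.0], 5)
print(round(k_raw, 2), k, labels)           # 12.12 12 [1, 1, 2, 3, 5, 5]
```

With m = 108 the formula reproduces the value 12.12 quoted in the embodiment, which is then taken as 12 classes.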
Step 4: Using the discretization results of Step 3, count the number of samples for which each attribute takes each discrete integer value, and compute the probability that an attribute takes a specific discrete value according to formula (4);
$$p(X_i) = \frac{c(X_i)}{c(U)}, \quad i = 1, 2, \ldots, k \tag{4}$$
where k is the number of discretization partitions of attribute x, $X_i$ is the i-th value of attribute x, $c(X_i)$ is the number of samples for which attribute x takes the value $X_i$, U is the full sample set (the universe), c(U) is the total number of samples, and $p(X_i)$ is the probability that attribute x takes the value $X_i$.
From the probability distributions obtained above, obtain the information entropy of each attribute, and the conditional entropy of each conditional attribute with respect to the decision attribute and of one conditional attribute with respect to another, according to formulas (5) and (6) respectively. Note that the information entropy here measures the amount of information an attribute provides and also indicates the degree of order of the attribute's sequence of values; the conditional entropy represents how much information another attribute still carries when a given attribute is completely known;
$$H(x) = -\sum_{i=1}^{k} p(X_i) \log p(X_i) \tag{5}$$
where H(x) is the information entropy of attribute x.
$$H(y \mid x) = -\sum_{i} p(X_i) \sum_{j} p(Y_j \mid X_i) \log p(Y_j \mid X_i) \tag{6}$$
where $p(Y_j \mid X_i)$ is the probability that $Y_j$ occurs given that $X_i$ has occurred, and H(y | x) is the conditional entropy of attribute y with respect to x, that is, the conditional entropy of y given x.
Using the above results, obtain the mutual information between each conditional attribute and the decision attribute, and between each pair of conditional attributes, according to formula (7), to measure the amount of information these attributes share.
$$I(x, y) = H(y) - H(y \mid x) \tag{7}$$
where H(y) is the information entropy of attribute y and I(x, y) is the mutual information of attributes x and y, which can be regarded as the amount of information shared by attributes y and x.
Step 5: To eliminate the influence of dimension, normalize the mutual information between the conditional attributes and the decision attribute computed in Step 4 using formula (8) to obtain the entropy correlation coefficient, and use it to judge the correlation between each conditional attribute and the decision attribute. The smaller the entropy correlation coefficient, the weaker the correlation, and the smaller the contribution of the conditional attribute to distribution network reliability evaluation. The entropy correlation coefficient between each conditional attribute $x_i$ (i = 1, 2, ..., 14) and the decision attribute y is shown in Table 2;
where $\rho_{xy}$ is the entropy correlation coefficient of attributes x and y, representing the degree of correlation between x and y.
Table 2: Entropy correlation coefficients between the conditional attributes and the decision attribute
Conditional attribute | X1 | X2 | X3 | X4 | X5 | X6 | X7
Entropy correlation coefficient | 0.2770 | 0.1488 | 0.1859 | 0.2027 | 0.1513 | 0.1578 | 0.1636
Conditional attribute | X8 | X9 | X10 | X11 | X12 | X13 | X14
Entropy correlation coefficient | 0.2874 | 0.1353 | 0.1112 | 0.1645 | 0.1569 | 0.0947 | 0.1652
Set a critical value e1 according to the computed entropy correlation coefficients. When the entropy correlation coefficient between a conditional attribute and the decision attribute is less than this critical value, the conditional attribute is considered to have little influence on distribution network reliability and is rejected from the decision table. Table 2 shows that the largest entropy correlation coefficient between these conditional attributes and the decision attribute does not exceed 0.3. Here e1 is chosen as 0.15, and the conditional attributes whose entropy correlation coefficient does not exceed e1 are removed, namely month X2, monthly total precipitation X9, monthly mean temperature X10, and monthly windy days X13.
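The screening against e1 can be reproduced directly from the values in Table 2. The snippet below is a small check, with hypothetical variable names, that applying the threshold 0.15 to those coefficients yields the four attributes listed above.

```python
# Entropy correlation coefficients copied from Table 2.
rho = {
    "X1": 0.2770, "X2": 0.1488, "X3": 0.1859, "X4": 0.2027, "X5": 0.1513,
    "X6": 0.1578, "X7": 0.1636, "X8": 0.2874, "X9": 0.1353, "X10": 0.1112,
    "X11": 0.1645, "X12": 0.1569, "X13": 0.0947, "X14": 0.1652,
}
E1 = 0.15  # critical value chosen in the embodiment

# Attributes whose coefficient does not exceed e1 are removed.
removed = [name for name, r in rho.items() if r <= E1]
kept = [name for name in rho if name not in removed]

print(removed)  # ['X2', 'X9', 'X10', 'X13']
print(kept)     # the ten attributes passed on to Step 6
```

Note that X5 (0.1513) survives by a small margin, so the outcome of this step is sensitive to the choice of e1.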
Step 6: In a manner similar to Step 5, compute the pairwise entropy correlation coefficients of the conditional attributes remaining after the rejection in Step 5 and build the correlation coefficient matrix; the results are shown in Table 3;
Table 3: Pairwise entropy correlation coefficients of the main conditional attributes
Set a critical value e2 according to the values of the entropy correlation coefficients in the correlation coefficient matrix. When the entropy correlation coefficient of two conditional attributes exceeds this critical value, the two attributes are considered so strongly correlated that each can represent the other, that is, they have roughly the same influence on distribution network reliability. In that case, compare the entropy correlation coefficients between each of the two conditional attributes and the decision attribute and delete the one more weakly correlated with the decision attribute, obtaining a set of conditional attributes that are strongly correlated with the reliability index and mutually independent and thereby achieving attribute reduction;
Table 3 shows that the entropy correlation coefficients between X1 and X8, between X3 and X4, and between X3 and X7 exceed 0.5, and the critical value e2 is chosen here as 0.5. The entropy correlation coefficients of these five conditional attributes with the decision attribute rank as X8 > X1 > X4 > X3 > X7, so the relatively redundant conditional attributes year X1 and total electricity consumption X3 are deleted.
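One way to read this deletion rule is sketched below, using the three pairs named in the text and the Table 2 coefficients of the five attributes involved. The pairwise values of Table 3 are not reproduced in this text, so only the pair list is used. Processing the pairs in the order they are listed, and skipping a pair once one of its members is already deleted, is an assumption; the function name is hypothetical.

```python
# Coefficients with the decision attribute, copied from Table 2.
rho_decision = {"X1": 0.2770, "X3": 0.1859, "X4": 0.2027, "X7": 0.1636, "X8": 0.2874}

# Pairs the text reports as exceeding e2 = 0.5, in the order they are listed.
redundant_pairs = [("X1", "X8"), ("X3", "X4"), ("X3", "X7")]

def drop_redundant(pairs, rho):
    """For each strongly correlated pair, delete the member that is more weakly
    correlated with the decision attribute (assumed greedy, in listed order)."""
    dropped = []
    for a, b in pairs:
        if a in dropped or b in dropped:
            continue                      # pair already broken by an earlier deletion
        dropped.append(a if rho[a] < rho[b] else b)
    return dropped

dropped = drop_redundant(redundant_pairs, rho_decision)
print(dropped)  # ['X1', 'X3']
```

Under this reading X7 is retained even though it is the weakest of the five, because its only strongly correlated partner, X3, has already been deleted. A different processing order could give a different result, so the order is a design choice rather than something the text fixes.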
Step 7:Construction three layers of BP neural network the attribute data after yojan is trained, with obtained by step 6 with The conditional attribute of reliability index strong correlation is used as final output, it is assumed that after yojan as input using decision attribute data There are p kind conditional attributes in decision table, then the node number of input layer and output layer is respectively p and 1;108 are had in this example Group sample data, 8 groups are therefrom selected at random as test sample, remaining 100 groups as training sample, sample includes condition Data in sample are normalized by attribute and decision attribute values;
The initial connection weight and hidden layer, output layer of each node layer in h group BP neural networks are generated with computer random Threshold value, be rewritten as binary coded form, constitute initial solution space, calculated with reference to neutral net and data are solved in solution space Fitness;The larger preceding c solution data of fitness are selected as parent solution data, parent data are intersected, make a variation behaviour Filial generation solution space is obtained, convergence is judged whether according to the fitness of filial generation solution data, if it is, optimizing stops and exported most Excellent initial weight and threshold value, otherwise, proceed to select, intersect, mutation operation;
The initial weights and thresholds obtained in the previous step are decoded and fed into the neural network, and the BP neural network is trained on the 100 normalized training samples. The error between the estimated and actual values of the decision attribute is obtained and checked against the convergence condition. If the condition is not met, the weights and thresholds are adjusted and training continues; if it is met, the loop stops and the weights and thresholds that minimize the error are output, which gives the optimal BP network model;
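To make the training loop concrete, here is a small pure-Python sketch of a three-layer network trained by backpropagation. It is not the implementation used in this document: the sigmoid hidden layer, linear output node, per-sample gradient descent, learning rate, hidden-layer size, and mean-squared-error stopping rule are all assumptions, and the initial weights are drawn at random here rather than taken from the genetic optimization step.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_bp(X, y, n_hidden=4, lr=0.1, tol=1e-3, max_epochs=2000, seed=0):
    """Train a p-n_hidden-1 network; return (predict function, last epoch MSE)."""
    rng = random.Random(seed)
    p = len(X[0])
    # Random initial weights and thresholds (the text would use the decoded GA result).
    W1 = [[rng.uniform(-1, 1) for _ in range(p)] for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = rng.uniform(-1, 1)

    def forward(x):
        hid = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
               for row, b in zip(W1, b1)]
        return hid, sum(w * hj for w, hj in zip(W2, hid)) + b2

    mse = float("inf")
    for _ in range(max_epochs):
        sq = 0.0
        for x, t in zip(X, y):
            hid, out = forward(x)
            err = out - t                     # estimate minus actual value
            sq += err * err
            for j in range(n_hidden):
                # Gradient through hidden node j, using W2[j] before it is updated.
                grad_h = err * W2[j] * hid[j] * (1 - hid[j])
                W2[j] -= lr * err * hid[j]
                for i in range(p):
                    W1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
        mse = sq / len(X)
        if mse < tol:                         # convergence condition (assumed form)
            break
    return (lambda x: forward(x)[1]), mse


# Toy normalized data with two conditional attributes and one decision attribute.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0.0, 0.3, 0.5, 0.8]
predict, mse = train_bp(X, y)
print(mse, predict([0.5, 0.5]))
```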
The trained BP neural network model is used to evaluate the reliability of the 8 groups of test samples. Table 4 compares the evaluation results with the actual values. As Table 4 shows, the evaluated values are very close to the actual values, with a maximum absolute error of 0.004, so the evaluation method performs well.
Table 4 Prediction results
No. | Actual value | Predicted value | Absolute error
1 | 99.989 | 99.990 | 0.001
2 | 99.973 | 99.973 | 0.000
3 | 99.974 | 99.975 | 0.001
4 | 99.989 | 99.985 | 0.004
5 | 99.994 | 99.992 | 0.002
6 | 99.980 | 99.981 | 0.001
7 | 99.988 | 99.987 | 0.001
8 | 99.987 | 99.987 | 0.000
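The absolute errors reported in Table 4 can be rechecked with a short script. The actual and predicted values below are copied from the table; the variable names and the rounding to three decimals (to match the table's precision) are choices made for this sketch.

```python
# Actual and predicted values copied from Table 4.
actual = [99.989, 99.973, 99.974, 99.989, 99.994, 99.980, 99.988, 99.987]
predicted = [99.990, 99.973, 99.975, 99.985, 99.992, 99.981, 99.987, 99.987]

# Absolute error per test sample, rounded to the table's three decimals.
abs_err = [round(abs(a - p), 3) for a, p in zip(actual, predicted)]
max_err = max(abs_err)
mean_err = sum(abs_err) / len(abs_err)

print(abs_err)
print(max_err, mean_err)
```

The maximum comes from test sample 4, which matches the 0.004 stated in the text.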
Content not described in detail in this specification belongs to the prior art known to those skilled in the art.

Claims (6)

1. A distribution network reliability evaluation method based on big data mutual information attribute reduction, characterized in that the method comprises the following steps:
(1) collecting a large amount of data related to distribution network reliability from academic, meteorological, or statistical websites;
(2) compiling from the collected data a decision table characterizing the correspondence between the reliability index and related indexes, the table comprising one decision attribute, i.e., the reliability index representing the final level of distribution network reliability, and multiple conditional attributes representing reliability-related factors;
(3) preprocessing the data in the decision table: judging from all values of each attribute whether the attribute is continuous or discrete; for each continuous attribute, calculating the number of intervals into which it should be divided and discretizing it by the equal-width discretization method;
(4) calculating the probability with which each attribute takes each specific discrete value, then obtaining the information entropy of each attribute and the conditional entropy of the conditional attributes with respect to the decision attribute, and from these the mutual information between each conditional attribute and the decision attribute and between each pair of conditional attributes;
(5) normalizing the mutual information between the conditional attributes and the decision attribute calculated in step (4) and combining it with the information entropy to obtain the entropy correlation coefficient between each conditional attribute and the decision attribute, thereby judging the correlation between them, a smaller entropy correlation coefficient indicating a weaker correlation; setting a suitable critical value to measure the correlation between attributes, and rejecting the conditional attributes that are weakly correlated with the decision attribute;
(6) calculating the pairwise entropy correlation coefficients among the conditional attributes remaining after the rejection in step (5), identifying the redundant conditional attributes that are strongly correlated with other remaining conditional attributes and more weakly correlated with the decision attribute, and deleting them, to obtain a set of conditional attributes that are strongly correlated with the reliability index and mutually independent, thereby achieving attribute reduction;
(7) constructing a three-layer BP neural network to train on the reduced attribute set, taking the conditional attributes obtained in step (6) that are strongly correlated with the reliability index as input and the decision attribute data as output, and finding the connection weights between the nodes of each layer and the thresholds of the hidden layer and output layer that minimize the fitting error, to obtain the optimal BP neural network model.
2. The distribution network reliability evaluation method based on big data mutual information attribute reduction according to claim 1, characterized in that the decision table characterizing the correspondence between the reliability index and related indexes is compiled in step (2) as follows:
Step 1, establishing an m × n distribution network reliability evaluation decision table from the collected mass data related to the distribution network reliability of a given city, where n is the total number of decision and conditional attributes, each decision attribute value together with its corresponding conditional attribute values forms one group of attribute data, and m is the total number of groups of attribute data, i.e., the number of samples;
Step 2, taking the index that directly represents or determines distribution network reliability as the decision attribute of the decision table, and the remaining reliability-related indexes as conditional attributes.
3. The distribution network reliability evaluation method based on big data mutual information attribute reduction according to claim 1, characterized in that the data in the decision table are preprocessed in step (3) as follows:
Step 1, judging from the values of all attributes in the decision table whether each attribute's data are continuous or discrete;
Step 2, according to the data distribution characteristics and relevant objective factors of each factor, calculating the number of partitions into which a continuous attribute should be divided by the following formula;
$$k = 1.87 \times (m - 1)^{2/5}$$
where m is the number of samples of attribute data and k is the number of partitions of the continuous attribute's value range;
Step 3, calculating the interval length of the continuous attribute from the calculated number of partitions, dividing the value range of the continuous attribute into k intervals by the equal-width discretization method, assigning a discrete integer value to each interval, and calculating the discretization result of the continuous attribute, which completes the discretization of the continuous data.
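The partition-count formula and the equal-width discretization of this claim can be illustrated as follows. The function names are hypothetical, and two details are assumptions because the text does not state them: k is rounded to the nearest integer, and the maximum value is placed in the last interval.

```python
def n_partitions(m):
    """k = 1.87 * (m - 1)^(2/5), rounded to the nearest integer (rounding is assumed)."""
    return round(1.87 * (m - 1) ** 0.4)


def equal_width_discretize(values, k):
    """Assign each value the 1-based index of its equal-width interval."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [1] * len(values)          # constant attribute: a single interval
    # The maximum value would fall just past the last interval, so clamp it to k.
    return [min(int((v - lo) / width) + 1, k) for v in values]


print(n_partitions(108))                                        # 12
print(equal_width_discretize([0.0, 2.5, 5.0, 7.5, 10.0], 4))    # [1, 2, 3, 4, 4]
```

For the 108 samples of the worked example, the formula evaluates to about 12.1, so this sketch would use 12 intervals.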
4. The distribution network reliability evaluation method based on big data mutual information attribute reduction according to claim 1, characterized in that the mutual information between each conditional attribute and the decision attribute, and between each pair of conditional attributes, is obtained in step (4) as follows:
Step 1, counting the number of samples in which each attribute takes each discrete integer value, and calculating the probability that an attribute takes a specific discrete value by the following formula,
$$p(X_i) = \frac{c(X_i)}{c(U)}, \quad i = 1, 2, \ldots, k$$
where $k$ is the number of discretization partitions of attribute $x$, $X_i$ is the $i$-th value of attribute $x$, $c(X_i)$ is the number of samples in which attribute $x$ takes the value $X_i$, $U$ is the whole sample set, i.e., the universe, $c(U)$ is the total number of samples, and $p(X_i)$ is the probability that attribute $x$ takes the value $X_i$;
Step 2, obtaining by the following formulas the information entropy of each attribute, and the conditional entropy of each conditional attribute with respect to the decision attribute and of one conditional attribute with respect to another; it should be noted that the information entropy here measures the amount of information an attribute provides and also indicates the degree of order of the attribute sequence, while the conditional entropy indicates how much information about another attribute remains once a given attribute is completely known;
$$H(x) = -\sum_{i=1}^{k} p(X_i) \log p(X_i)$$
where $H(x)$ is the information entropy of attribute $x$;
$$H(y \mid x) = -\sum_{i=1}^{k} p(X_i) \sum_{j=1}^{k} p(Y_j \mid X_i) \log p(Y_j \mid X_i)$$
where $p(Y_j \mid X_i)$ is the probability that $Y_j$ occurs given that $X_i$ occurs, and $H(y \mid x)$ is the conditional entropy of attribute $y$ with respect to $x$, i.e., the conditional entropy of $y$ given $x$;
Step 3, using the results of the previous step, obtaining by the following formula the mutual information between each conditional attribute and the decision attribute and between each pair of conditional attributes, to represent the amount of information these attributes share,
$$I(x, y) = H(y) - H(y \mid x)$$
where $H(y)$ is the information entropy of attribute $y$ and $I(x, y)$ is the mutual information of attributes $x$ and $y$, i.e., the amount of information shared by attributes $y$ and $x$.
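The three formulas of this claim translate directly into code operating on discretized attribute columns. The sketch below uses base-2 logarithms, which is an assumption because the claim writes "log" without a base; the function names are likewise chosen for illustration.

```python
import math
from collections import Counter


def entropy(xs):
    """H(x) = -sum_i p(X_i) log p(X_i), with p estimated from value counts."""
    m = len(xs)
    return -sum((c / m) * math.log2(c / m) for c in Counter(xs).values())


def conditional_entropy(ys, xs):
    """H(y|x): entropy of y within each group of equal x, weighted by p(X_i)."""
    m = len(xs)
    total = 0.0
    for x_val, count in Counter(xs).items():
        group = [y for x, y in zip(xs, ys) if x == x_val]
        total += (count / m) * entropy(group)
    return total


def mutual_information(xs, ys):
    """I(x, y) = H(y) - H(y|x)."""
    return entropy(ys) - conditional_entropy(ys, xs)


x = [1, 1, 2, 2]
print(mutual_information(x, [1, 1, 2, 2]))   # y is fully determined by x
print(mutual_information(x, [1, 2, 1, 2]))   # y is independent of x
```

In the first call y carries exactly the one bit of information that x does, so the mutual information equals H(y); in the second, knowing x says nothing about y and the mutual information is zero.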
5. The distribution network reliability evaluation method based on big data mutual information attribute reduction according to claim 1, characterized in that the calculated mutual information between the conditional attributes and the decision attribute is normalized and the correlation between attributes is measured in step (5) as follows:
Step 1, to eliminate the influence of dimension, normalizing the calculated mutual information between each conditional attribute and the decision attribute by the following formula to obtain the entropy correlation coefficient, and judging from it the correlation between the conditional attribute and the decision attribute; a smaller entropy correlation coefficient indicates a weaker correlation and hence a smaller contribution of the conditional attribute to distribution network reliability evaluation;
$$\rho_{xy} = \frac{I(x, y)}{\sqrt{H(x)\,H(y)}}$$
where $\rho_{xy}$ is the entropy correlation coefficient of attributes $x$ and $y$ and represents their degree of correlation;
Step 2, setting a critical value according to the calculated entropy correlation coefficients; when the entropy correlation coefficient between a conditional attribute and the decision attribute is less than the critical value, the conditional attribute is considered to have little influence on distribution network reliability and is rejected from the decision table.
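A short sketch of the normalization and the rejection rule of this claim follows. The function names, attribute names, critical value, and all numbers are hypothetical; the entropies and mutual information are passed in as plain numbers, and an attribute with zero entropy (a constant column) would need separate handling that is not shown.

```python
import math


def entropy_correlation(i_xy, h_x, h_y):
    """rho_xy = I(x, y) / sqrt(H(x) * H(y)); requires H(x) > 0 and H(y) > 0."""
    return i_xy / math.sqrt(h_x * h_y)


def reject_weak(ecc_with_decision, critical_value):
    """Keep only conditional attributes whose coefficient is not below the critical value."""
    return {name: rho for name, rho in ecc_with_decision.items()
            if rho >= critical_value}


# Hypothetical mutual information and entropy values for two conditional attributes.
ecc = {
    "attr_a": entropy_correlation(0.5, 1.0, 4.0),   # 0.25
    "attr_b": entropy_correlation(0.1, 1.0, 4.0),   # 0.05
}
print(reject_weak(ecc, critical_value=0.2))
```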
6. The distribution network reliability evaluation method based on big data mutual information attribute reduction according to claim 1, characterized in that the optimal BP neural network model is obtained in step (7) as follows:
Step 1, constructing a three-layer BP neural network to train on the reduced attribute data, taking the obtained conditional attributes strongly correlated with the reliability index as input and the decision attribute as the final output; assuming the reduced decision table contains p conditional attributes, the input layer and output layer have p nodes and 1 node, respectively; randomly selecting b test samples from the m groups of attribute data, the remaining samples serving as training samples for the neural network, each sample containing conditional attribute and decision attribute values; and normalizing the data in the samples;
Step 2, randomly generating by computer h groups of initial connection weights between the nodes of each layer of the BP neural network, together with the thresholds of the hidden layer and output layer, and rewriting them in binary-coded form to constitute the initial solution space; calculating the fitness of each solution in the solution space using the neural network; selecting the c solutions with the highest fitness as parent solutions and applying crossover and mutation operations to the parents to obtain the offspring solution space; judging convergence from the fitness of the offspring solutions; if converged, stopping the optimization and outputting the optimal initial weights and thresholds; otherwise, continuing the selection, crossover, and mutation operations;
Step 3, decoding the initial weights and thresholds obtained in the previous step, training the BP neural network on the normalized samples, obtaining the error between the estimated and actual values of the decision attribute, and judging whether the error meets the convergence condition; if not, adjusting the weights and thresholds and continuing to train the network; if so, stopping the loop and outputting the weights and thresholds that minimize the error.
CN201710244420.8A 2017-04-14 2017-04-14 Power distribution network reliability assessment method based on big data mutual information attribute reduction Active CN107169628B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710244420.8A CN107169628B (en) 2017-04-14 2017-04-14 Power distribution network reliability assessment method based on big data mutual information attribute reduction

Publications (2)

Publication Number Publication Date
CN107169628A true CN107169628A (en) 2017-09-15
CN107169628B CN107169628B (en) 2021-05-07

Family

ID=59849026

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710244420.8A Active CN107169628B (en) 2017-04-14 2017-04-14 Power distribution network reliability assessment method based on big data mutual information attribute reduction

Country Status (1)

Country Link
CN (1) CN107169628B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102879677A (en) * 2012-09-24 2013-01-16 西北工业大学 Intelligent fault diagnosis method based on rough Bayesian network classifier
CN103488802A (en) * 2013-10-16 2014-01-01 国家电网公司 EHV (Extra-High Voltage) power grid fault rule mining method based on rough set association rule
CN106503802A (en) * 2016-10-20 2017-03-15 上海电机学院 A kind of method of utilization genetic algorithm optimization BP neural network system
KR102059472B1 (en) * 2018-11-29 2019-12-30 대한민국 A System and Method for Prediction of Geomagnetic Disturbance Strength based on Solar Coronal Hole Information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
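Huang Hai: "Distribution network reliability evaluation based on rough set theory", China Master's Theses Full-text Database, Engineering Science and Technology II *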

Also Published As

Publication number Publication date
CN107169628B (en) 2021-05-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant