CN111429034A - Method for predicting power distribution network fault - Google Patents

Info

Publication number
CN111429034A
Authority
CN
China
Prior art keywords
data
fault
power distribution
distribution network
network fault
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010317146.4A
Other languages
Chinese (zh)
Inventor
黄文思
陆鑫
陈婧
薛迎卫
林超
叶强镔
胡从众
张建永
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
State Grid Jiangsu Electric Power Co Ltd
Great Power Science and Technology Co of State Grid Information and Telecommunication Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
State Grid Jiangsu Electric Power Co Ltd
Great Power Science and Technology Co of State Grid Information and Telecommunication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Information and Telecommunication Co Ltd, State Grid Jiangsu Electric Power Co Ltd, Great Power Science and Technology Co of State Grid Information and Telecommunication Co Ltd
Priority to CN202010317146.4A
Publication of CN111429034A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Theoretical Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a method for predicting power distribution network faults, comprising: a first step of data acquisition and preprocessing; a second step of feature variable selection; and a third step of building a power distribution network fault prediction model based on fault grade division. The method builds the fault prediction model on optimized data and parameters, so that power distribution network faults can be predicted accurately. It addresses two difficulties in power distribution network fault prediction, namely that data are hard to refine and extract and that prediction model parameters are hard to select, gives power grid staff a basis for responding to distribution network faults in time, and improves the stability and reliability of the whole power system.

Description

Method for predicting power distribution network fault
Technical Field
The invention relates to a method for predicting a power distribution network fault in the field of power utilization of a power grid.
Background
The power distribution network is an important component of a power system. With the rapid development of the smart grid, the connection of a large number of distributed power sources introduces uncertainty, so the fault information of the power distribution network becomes more complex and accurate, rapid fault analysis becomes more difficult. To keep the power distribution network operating in a highly intelligent manner, feeder operation data must be monitored in real time, abnormal conditions must be flagged early, and faults must be located and handled quickly.
In the prior art, both fault location methods and fault type analysis methods for power distribution networks work on a fault that has already occurred, using the fault waveform collected by a fault indicator afterwards. These methods are useful for handling an existing fault in time. However, they cannot predict a fault that is about to occur and therefore cannot prevent a possible fault in time. In addition, fault prediction is strongly affected by the many factors that influence power distribution network faults, and the original data contain abnormal and missing values, so the data quality is low and the fault prediction result ultimately suffers.
Disclosure of Invention
The invention aims to overcome the defects of the prior art, provides a power distribution network fault prediction method, solves the problems that data are difficult to refine and extract and prediction model parameters are difficult to select in power distribution network fault prediction, and improves the stability and reliability of the whole power system.
To achieve this purpose, the invention provides the following technical scheme: a method for power distribution network fault prediction, comprising the following steps:
The first step: the data preprocessing part, comprising power distribution network fault influence factor analysis and sample screening based on a particle swarm algorithm;
The second step: the characteristic variable selection part, comprising a primary feature selection on each relevant variable followed by a secondary screening with a specific selection algorithm to form an optimal characteristic variable set;
The third step: building a power distribution network fault prediction model based on fault grade division, comprising grading the faults, optimizing the parameters of a support vector machine with an improved particle swarm optimization algorithm, constructing the power distribution network fault prediction model, analyzing the various data of the power distribution network with the constructed model, and outputting abnormality and fault predictions.
Preferably, the analysis of power distribution network fault influence factors in the first step includes fault influence factor analysis and power distribution network data investigation and extraction
(1) Analysis of the fault influencing factors includes
1) Operation influencing factors
The operation influencing factors comprise feeder load and feeder temperature rise;
2) Equipment self-influencing factors
The equipment self-influencing factors comprise the number of devices on the feeder and the commissioning time of the equipment;
3) External influencing factors
The external influencing factors comprise the weather: the equipment is exposed outdoors for a long time and is affected by temperature, wind, precipitation and other factors;
(2) Power distribution network data investigation and extraction includes
The data held in the power distribution network are comprehensively investigated and, once understood, historical fault influence factor data and fault data are comprehensively collected. The data sources comprise a marketing business management system, an ERP system, a power distribution automation system, a power utilization information acquisition system, a power distribution line online monitoring system, a production management system, a power distribution geographic information system and an intelligent public distribution transformer monitoring system.
Preferably, the sample screening based on the particle swarm algorithm in the first step comprises two parts: raw data processing and outlier data sample elimination
(1) Raw data processing, which comprises three steps of data cleaning, data transformation and data integration:
1) Data cleaning
Data cleaning comprises vacancy value analysis, abnormal value analysis and repeated value analysis
2) Data transformation
Data transformation converts the original data into a form that is easy to analyze and apply, and includes feature construction, data grading and data quantization
3) Data integration
Data integration performs data statistics and merges the data into a unified data store
(2) Outlier data sample elimination
After the original data are preprocessed, abnormal samples may still remain in the resulting data samples; a clustering-based outlier sample diagnosis technique is used to diagnose and remove the outlier sample data.
Compared with existing ways of preprocessing historical power distribution network data, these steps combine fault influence factor analysis with sample screening based on the particle swarm algorithm: fault data are analyzed, the various data held in the power distribution network are comprehensively investigated and extracted, and the data set passed to the next step has been screened and cleaned several times, so the output result is more accurate.
Preferably, the selection of the characteristic variables in the second step comprises exploration and analysis of feeder fault influence factor data and selection of feeder fault characteristic variables
(1) Data exploration and analysis of feeder fault influence factors
The relationship between feeder faults and their influence factors is analyzed by data exploration, and the characteristic variables of the feeder fault influence factors (fault characteristic variables for short) are preliminarily screened out, laying a foundation for further screening the optimal fault characteristic variables with a feature selection algorithm;
(2) Selection of feeder fault characteristic variables
An optimal feature subset is selected from a large number of related features by a feature selection algorithm. A correlation-based feature subset evaluation algorithm is adopted: redundant characteristic variables and variables that are not strongly correlated are removed, and the correlations between discrete and discrete variables and between discrete and continuous variables are quantified.
With this scheme, feature selection retains the feeder fault characteristic variables, which shows on the one hand that the feeder operation influence factors are strongly correlated and not redundant with other characteristic variables, and on the other hand that the feature selection method is effective.
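The correlation-based subset evaluation mentioned above can be illustrated with a small sketch. It assumes the evaluation follows the widely used CFS merit heuristic, which rewards strong feature-to-class correlation and penalizes feature-to-feature redundancy; the text here does not give the formula, so the function name `cfs_merit` and the sample correlation values are hypothetical.

```python
import math

def cfs_merit(feature_class_corr, feature_feature_corr):
    """CFS-style merit of a feature subset (assumed heuristic, not from the source).

    feature_class_corr: correlation of each selected feature with the fault label.
    feature_feature_corr: pairwise correlations among the selected features.
    """
    k = len(feature_class_corr)
    r_cf = sum(abs(r) for r in feature_class_corr) / k
    if feature_feature_corr:
        r_ff = sum(abs(r) for r in feature_feature_corr) / len(feature_feature_corr)
    else:
        r_ff = 0.0  # a single feature has no redundancy term
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

# Two equally relevant features: independent ones score higher than redundant ones.
independent = cfs_merit([0.6, 0.6], [0.0])
redundant = cfs_merit([0.6, 0.6], [1.0])
```

Under this heuristic a fully redundant second feature adds nothing to the merit, which matches the stated goal of removing redundant and weakly correlated variables.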
Preferably, the third step includes optimizing the support vector machine parameters with a particle swarm optimization algorithm and constructing the model, specifically:
(1) Fault grading and evaluation index
1) Fault grades are divided according to outage frequency and fault range, based on statistical probability
2) The Kappa statistic is adopted as the evaluation index for power distribution network fault prediction
(2) Support vector machine
The samples are used for training: the data are mapped into a high-dimensional space and linear regression is then performed there, so that the original nonlinear fitting problem is converted into a linear regression problem. The formula is as follows:
$$f(x) = \omega \cdot \varphi(x) + b$$
Here, ω is the weight vector; x is the input vector; φ(x) is the mapping of x into the high-dimensional space; b is the fitting deviation;
(3) Power distribution network fault prediction model based on the support vector machine
The support vector machine algorithm is adopted to solve the complex problem of power grid fault prediction: the learned "hidden rules" are stored in the SVM model, and the parameters are optimized to construct the power distribution network fault prediction model.
Compared with traditional models, this scheme has stronger generalization ability and a better classification effect, and has the advantages of avoiding overfitting, having an upper bound on the generalization error, and having few parameters to tune.
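The Kappa evaluation index named in item (1) can be sketched as follows, assuming it refers to Cohen's kappa, which compares the observed agreement between predicted and actual fault grades with the agreement expected by chance. The grade labels and the function name `cohen_kappa` are illustrative only.

```python
from collections import Counter

def cohen_kappa(actual, predicted):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e). Undefined when p_e equals 1."""
    n = len(actual)
    p_o = sum(a == p for a, p in zip(actual, predicted)) / n  # observed agreement
    count_a, count_p = Counter(actual), Counter(predicted)
    # chance agreement from the marginal frequency of each grade
    p_e = sum(count_a[c] * count_p[c] for c in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical fault grades for six feeder-days.
actual = ["none", "none", "minor", "minor", "major", "major"]
predicted = ["none", "none", "minor", "major", "major", "major"]
kappa = cohen_kappa(actual, predicted)
```

A value near 1 indicates agreement well beyond chance, while a value near 0 indicates the predictions are no better than guessing from the grade frequencies.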
In summary, the invention has the following beneficial effects. Following the principles of feasibility and comprehensive rationality for power distribution network fault prediction, the method analyzes the influence factors of power distribution network faults in depth, investigates the power distribution network data, and extracts the data required for feeder fault prediction. The power distribution network fault prediction model is built on optimized data and parameters and, compared with existing models, predicts power distribution network faults accurately. This solves the problems that data are difficult to refine and extract and that prediction model parameters are difficult to select, gives power grid staff a basis for responding to distribution network faults in time, and improves the stability and reliability of the whole power system.
Drawings
FIG. 1 is a framework diagram of power distribution network fault prediction in the present scheme;
FIG. 2 is a flow chart of power distribution network data investigation in the present scheme;
FIG. 3 is a schematic diagram of the weighted summation of the 3 moving directions of a particle in the clustering algorithm of the present scheme;
FIG. 4 is a flow chart of outlier sample diagnosis in the present scheme;
FIG. 5 is a flow chart of feature selection in the present scheme;
FIG. 6 is a flow chart of SVM-based power distribution network fault prediction in the present scheme;
FIG. 7 is a diagram of search strategies commonly used for feature subsets in the present scheme.
Detailed Description
The present invention is described in further detail below with reference to fig. 1-7.
This embodiment discloses a method for predicting faults of a power distribution network, comprising the following steps:
The first step: the data preprocessing part, comprising power distribution network fault influence factor analysis and sample screening based on a particle swarm algorithm;
The second step: the characteristic variable selection part, comprising a primary feature selection on each relevant variable followed by a secondary screening with a specific selection algorithm to form an optimal characteristic variable set;
The third step: building a power distribution network fault prediction model based on fault grade division, comprising grading the faults, optimizing the parameters of a support vector machine with an improved particle swarm optimization algorithm, constructing the power distribution network fault prediction model, analyzing the various data of the power distribution network with the constructed model, and outputting abnormality and fault predictions.
Preferably, the analysis of power distribution network fault influence factors in the first step includes fault influence factor analysis and power distribution network data investigation and extraction
1) Analysis of the fault influencing factors includes
(1) Operation influencing factors
The operation influencing factor is mainly the feeder load. The voltage, current and load that equipment bears affect its performance. For example, overload raises the temperature of the equipment, which affects conductor tension and accelerates equipment aging, thereby increasing the feeder fault rate;
(2) Equipment self-influencing factors
The equipment self-influencing factors comprise the number of devices on the feeder and the commissioning time of the equipment;
(3) External influencing factors
The external influencing factors are mainly the weather. The equipment is exposed outdoors for a long time and is affected by temperature, wind, precipitation and other factors. For example: a high ambient temperature can push the temperature rise of electrical equipment beyond the allowable limit, lowering its withstand-voltage level and insulation performance and increasing the feeder fault probability; heavy precipitation directly causes high humidity, which readily degrades equipment insulation and leads to leakage current, pollution flashover and the like, increasing the feeder fault probability; windy weather directly affects the mechanical strength of the equipment and indirectly causes problems such as wind-blown objects hanging on the lines, increasing the feeder fault probability; lightning can directly subject equipment to lightning overvoltage, causing irreversible serious damage or faults, so the feeder fault rate rises in lightning weather;
2) Power distribution network data investigation and extraction includes
As shown in FIG. 2, the data held in the power distribution network are comprehensively investigated and, once understood, historical fault influence factor data and fault data are comprehensively collected. The data sources mainly comprise a marketing business management system, an ERP system, a power distribution automation system, an electricity consumption information acquisition system, a distribution line online monitoring system, a production management system, a power distribution geographic information system and an intelligent public distribution transformer monitoring system. The marketing business management system, the ERP system and the power distribution automation system belong to marketing, supply chain management and dispatching respectively; their data are unrelated to the data required for power distribution network fault prediction and are excluded. Most data of the electricity consumption information acquisition system are collected by smart meters and are likewise unrelated to the data required for fault prediction, so they are excluded. The distribution line online monitoring system acquires current data and the like of the distribution lines in real time and can be used for power distribution network fault prediction.
Preferably, the sample screening based on the particle swarm algorithm in the first step comprises two parts: raw data processing and outlier data sample elimination
1) Raw data processing, which mainly comprises three steps of data cleaning, data transformation and data integration:
(1) Data cleaning
Data cleaning mainly comprises vacancy value analysis, abnormal value analysis and repeated value analysis. Vacancy value analysis examines missing records in the original data and missing fields within records, and removes or fills them according to the characteristics of the original data. Abnormal value analysis formulates rules according to the characteristics of the original data to find unreasonable and erroneous data, which are then removed or replaced. Repeated value analysis compares the fields of different data samples according to the characteristics of the data and removes repeated data.
(2) Data transformation
Data transformation converts the original data into a form that is easy to analyze and apply; it mainly comprises feature construction, data grading and data quantization, for example quantizing position information, constructing the commissioning-time feature attribute, and grading weather data;
(3) Data integration
Data integration performs data statistics and merges the data into a unified data store (such as an Excel database). Most of the data required for feeder fault prediction come from different distribution automation systems, so the original data must be statistically analyzed and merged.
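The cleaning and transformation steps above can be sketched on a toy record set. This is a minimal illustration under stated assumptions: the field names (`feeder`, `day`, `load_kw`, `commissioned`), the rule that a negative load is abnormal, and the choice to drop rather than fill vacant values are all hypothetical and not taken from the patent.

```python
from datetime import date

# Hypothetical raw feeder records; field names are illustrative only.
raw = [
    {"feeder": "F01", "day": "2019-07-01", "load_kw": 820.0, "commissioned": "2010-05-01"},
    {"feeder": "F01", "day": "2019-07-01", "load_kw": 820.0, "commissioned": "2010-05-01"},  # repeated
    {"feeder": "F02", "day": "2019-07-01", "load_kw": None, "commissioned": "2015-03-15"},   # vacancy
    {"feeder": "F03", "day": "2019-07-01", "load_kw": -50.0, "commissioned": "2012-09-30"},  # abnormal
    {"feeder": "F04", "day": "2019-07-01", "load_kw": 430.0, "commissioned": "2018-01-20"},
]

def clean(records):
    """Drop records with a vacant or negative load, then drop repeats per (feeder, day)."""
    seen, kept = set(), []
    for r in records:
        if r["load_kw"] is None or r["load_kw"] < 0:
            continue
        key = (r["feeder"], r["day"])
        if key in seen:
            continue
        seen.add(key)
        kept.append(r)
    return kept

def add_service_years(record):
    """Feature construction: years in service on the observation day."""
    observed = date.fromisoformat(record["day"])
    commissioned = date.fromisoformat(record["commissioned"])
    return {**record, "service_years": round((observed - commissioned).days / 365.25, 2)}

cleaned = [add_service_years(r) for r in clean(raw)]
```

Filling a vacant value instead of dropping the record would be an equally valid reading of the text; the sketch drops it only to keep the example short.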
2) Outlier data sample rejection
After the original data are preprocessed, abnormal samples may still remain in the resulting data samples. These samples differ greatly from most data in the same data set and may have been generated by a completely different mechanism; such data are called outlier sample data. Outlier sample data directly affect the fitting accuracy of the model and can even lead to wrong prediction results. It is therefore important to find outlier sample data with an outlier sample diagnosis technique and remove them;
Outlier sample diagnosis is an important part of data preprocessing; its main task is to find outlier samples that differ significantly from other samples. Because of the way the data are acquired, outlier samples cannot be corrected or filled in. Removing them has little effect on building the fault prediction model and helps improve prediction accuracy, so outlier samples are treated as noise and removed directly.
Because the data samples have many dimensions, clustering-based outlier sample diagnosis is suitable: the data samples are divided into several categories, the distance between each data sample and its cluster center is calculated, and outlier samples are diagnosed from this distance and then removed.
The quality of the clustering produced by the clustering algorithm strongly affects outlier sample diagnosis. The particle swarm clustering algorithm has the following advantages:
(1) it has considerable randomness when generating the next-generation solution (cluster centers), so it does not easily fall into a local minimum and the clustering effect can be ensured;
(2) it does not suffer from the degradation seen in random optimization, so later convergence is stable, fluctuation rarely occurs, and a good clustering result can be obtained reliably.
Therefore, this section adopts the particle swarm clustering algorithm to obtain the best clustering result and so ensure correct outlier sample diagnosis. The particle swarm clustering algorithm is introduced below:
Let X = {X_i, i = 1, 2, …, N} be the sample set, where N is the number of samples and each sample is an n-dimensional vector, and let the set of cluster centers be
$$Z = \{Z_j,\ j = 1, 2, \dots, K\}$$
The sum of intra-class sample-to-center distances is calculated as:
$$J = \sum_{j=1}^{K} \sum_{X_i \in C_j} d(X_i, Z_j)$$
In the formula, $Z_j$ is the center of the j-th cluster, an n-dimensional vector; $C_j$ is the set of samples assigned to the j-th cluster; and $d(X_i, Z_j)$ is the distance from a sample to its corresponding cluster center. J is the sum of the distances from every sample to its corresponding cluster center, and the algorithm measures this distance by the Euclidean distance, $d(X_i, Z_j) = \lVert X_i - Z_j \rVert$.
After the cluster centers are determined, the cluster division of the samples follows the nearest-cluster rule: a sample $X_i$ belongs to class d if its distance to the d-th cluster center satisfies
$$\lVert X_i - Z_d \rVert = \min_{j = 1, 2, \dots, K} \lVert X_i - Z_j \rVert$$
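The distance sum J and the nearest-cluster rule can be made concrete with a short sketch. The function names and the sample coordinates are illustrative; only the Euclidean distance and the nearest-center assignment come from the description.

```python
import math

def nearest_cluster(sample, centers):
    """Index of the cluster center closest to the sample (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(sample, centers[j]))

def intra_class_distance_sum(samples, centers):
    """J: sum over all samples of the distance to the center of their own cluster."""
    return sum(math.dist(x, centers[nearest_cluster(x, centers)]) for x in samples)

samples = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
centers = [(0.0, 1.0), (10.0, 2.0)]
labels = [nearest_cluster(x, centers) for x in samples]  # [0, 0, 1, 1]
J = intra_class_distance_sum(samples, centers)           # 1 + 1 + 2 + 2 = 6.0
```

Because each sample is assigned to its nearest center, J here equals the sum of each sample's minimum distance over all centers.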
In a clustering problem with K cluster centers, the sample vector dimension is n. The particle swarm clustering algorithm constructs m particles. Each particle l (l = 1, 2, …, m) consists of a particle position zl, a velocity vl and a fitness fl. The particle position is the set of K cluster center positions; the particle velocity is the direction and magnitude of the update to the K cluster center positions; the fitness is the measure used to decide whether the particle updates the local and global optimal solutions.
In each iteration, each particle remembers the best solution it has found so far, the local optimal solution Pld, and the best solution found by the whole swarm, the global optimal solution Pg. The particle updates its velocity and searches for a new solution with reference to the local and global optimal solutions. The velocity of a particle is determined by three parts: its original velocity vl(t), the distance Pld - zl(t) to its own best position, and the distance Pg - zl(t) to the swarm's best position, weighted by the coefficients α, β1 and β2 respectively, where α is called the inertia coefficient and β1 and β2 are called acceleration constants;
As shown in FIG. 3, the velocity update and position update formulas of the particles are:
$$v_l(t+1) = \alpha\, v_l(t) + \beta_1\,\mathrm{rand}()\,\bigl(P_{ld} - z_l(t)\bigr) + \beta_2\,\mathrm{rand}()\,\bigl(P_g - z_l(t)\bigr)$$
$$z_l(t+1) = z_l(t) + v_l(t+1)$$
In the above formulas, $v_l(t+1)$ is the velocity of particle l in iteration t+1, and rand() is a random number between 0 and 1.
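A single velocity and position update can be sketched as below. The particle is stored as a flat list of coordinates, and one random number is drawn per term per update; the formulas do not say whether rand() is drawn per term or per coordinate, so that choice is an assumption. The defaults of 2 for the acceleration constants follow the values the description gives later in Step 3, and the function name is hypothetical.

```python
import random

def update_particle(z, v, p_local, p_global, alpha, beta1=2.0, beta2=2.0, rng=random):
    """Return (new_position, new_velocity) for one particle.

    z, v, p_local, p_global are equal-length lists of floats.
    """
    r1, r2 = rng.random(), rng.random()  # one draw per term (assumption)
    new_v = [alpha * vi + beta1 * r1 * (pl - zi) + beta2 * r2 * (pg - zi)
             for zi, vi, pl, pg in zip(z, v, p_local, p_global)]
    new_z = [zi + nvi for zi, nvi in zip(z, new_v)]
    return new_z, new_v

# When the particle already sits on both best positions, only inertia moves it.
new_z, new_v = update_particle([1.0, 2.0], [0.5, -0.5], [1.0, 2.0], [1.0, 2.0], alpha=0.5)
```

The three terms correspond to the three moving directions that FIG. 3 combines: the previous velocity, the pull toward the particle's own best, and the pull toward the swarm's best.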
The fitness of a particle is calculated as:
$$f_l = 1 / J$$
The smaller the sum J of intra-class sample-to-center distances for the clustering a particle represents, the greater the particle's fitness. In each iteration the particle swarm clustering algorithm updates the particles, the local optimal solutions and the global optimal solution, searching the solution space continuously; the optimal solution is obtained once the maximum number of iterations is reached.
After cluster analysis of the samples, the diagnostic threshold is defined using the box-plot criterion for identifying abnormal values. If the distance between a sample and its cluster center is greater than the diagnostic threshold, the sample is diagnosed as an outlier sample. The diagnostic threshold is calculated as follows:
$$d_f = Q_U + 1.5\,\mathrm{IQR}$$
In the above formula, $d_f$ is the diagnostic threshold and $Q_U$ is the upper quartile: one quarter of all sample-to-cluster-center distances are greater than it. IQR is the interquartile range:
$$\mathrm{IQR} = Q_U - Q_L$$
In the above formula, $Q_L$ is the lower quartile: one quarter of all sample-to-cluster-center distances are smaller than it.
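The threshold rule can be checked on a small list of distances. Quartile conventions differ between tools; this sketch uses the inclusive method of Python's `statistics.quantiles`, which is an assumption, as are the function name and the sample distances.

```python
import statistics

def diagnostic_threshold(distances):
    """d_f = Q_U + 1.5 * IQR, using inclusive quartiles (assumed convention)."""
    q_l, _, q_u = statistics.quantiles(distances, n=4, method="inclusive")
    return q_u + 1.5 * (q_u - q_l)

distances = [1, 2, 3, 4, 5, 6, 7, 8, 100]   # hypothetical sample-to-center distances
d_f = diagnostic_threshold(distances)        # Q_L = 3, Q_U = 7, so d_f = 13.0
outliers = [d for d in distances if d > d_f]
```

With these values only the distance of 100 exceeds the threshold, so only that sample would be diagnosed as an outlier.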
The specific steps of the outlier sample diagnosis algorithm based on particle swarm clustering are as follows:
Step 1: Initialize the particle swarm. Specify the number of clusters K and the number of particles m; set the maximum number of iterations itermax; randomly assign the initial categories for each particle; set the initial velocity of each particle to 0.
Step 2: Set the local optimal solution Pld and the global optimal solution Pg according to the initialization result.
Step 3: Calculate and update the velocity and position of all particles according to the update formulas. By convention the acceleration coefficients are set to β1 = 2 and β2 = 2, and the inertia coefficient α is set according to the formula:
Figure BDA0002459981790000091
Step 4: Perform a new cluster partition. Determine the new cluster assignment of each sample according to the nearest-cluster rule.
Step 5: Calculate the fitness values of all particles. For each particle, calculate the cluster centers from the new cluster partition, and calculate the fitness value of the particle in this iteration.
Step 6: Update the local optimal solution Pld and the global optimal solution Pg. For each particle, determine whether its fitness value is larger than the fitness values of the local optimal solution Pld and the global optimal solution Pg; if so, update Pld and Pg accordingly.
Step 7: Determine whether the maximum number of iterations has been reached. If not, return to Step 3 and continue iterating; if so, go to Step 8.
Step 8: Calculate the diagnostic threshold. Calculate the distance between each sample and its corresponding cluster center, and from these distances calculate the diagnostic threshold.
Step 9: Diagnose the outlier samples and output the result. If the distance between a sample and its corresponding cluster center is greater than the diagnostic threshold, the sample is diagnosed as an outlier sample. Once all samples have been diagnosed, the outlier judgment results are output.
These steps can be summarized as a flowchart; the outlier sample diagnosis process is shown in Fig. 4.
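Steps 8 and 9 can be illustrated with a minimal sketch. The threshold formula itself is not reproduced in this passage, so the code assumes the common upper-fence form df = QU + 1.5·IQR; the fence factor 1.5, the linear-interpolation quartile convention, and the names `quartile` and `diagnose_outliers` are illustrative assumptions rather than details taken from the patent. The cluster centers and labels are taken as given (for example, from the particle swarm clustering in Steps 1 to 7).

```python
import math

def quartile(sorted_values, fraction):
    """Quantile of an ascending list by linear interpolation (one of several conventions)."""
    position = fraction * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight

def diagnose_outliers(samples, centers, labels, fence=1.5):
    """Flag samples whose distance to their own cluster center exceeds the threshold.

    Assumption: df = QU + fence * IQR with fence = 1.5 (not confirmed by the source).
    """
    # Step 8: distance between each sample and its corresponding cluster center.
    distances = [math.dist(s, centers[k]) for s, k in zip(samples, labels)]
    ordered = sorted(distances)
    q_l = quartile(ordered, 0.25)   # lower quartile QL
    q_u = quartile(ordered, 0.75)   # upper quartile QU
    iqr = q_u - q_l                 # IQR = QU - QL
    threshold = q_u + fence * iqr   # assumed form of the diagnostic threshold df
    # Step 9: a sample is an outlier if its distance exceeds the threshold.
    flags = [d > threshold for d in distances]
    return flags, threshold

# Usage: nine one-dimensional samples in a single cluster centered at 0.
samples = [(float(v),) for v in (1, 2, 3, 4, 5, 6, 7, 8, 100)]
flags, threshold = diagnose_outliers(samples, [(0.0,)], [0] * len(samples))
```

In this usage example only the sample at distance 100 lies beyond the threshold, so it is the only one flagged.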
Preferably, the characteristic variable selection in the second step comprises data exploration and analysis of feeder fault influence factors and selection of feeder fault characteristic variables:
(1) Data exploration and analysis of feeder fault influence factors
The relationship between feeder faults and their influence factors is analyzed by data exploration methods, and the characteristic variables of the feeder fault influence factors (referred to as fault characteristic variables for short) are preliminarily screened, laying a foundation for further screening the optimal fault characteristic variables with a feature selection algorithm.
Data exploration and analysis explores regularities in data by means of plotting, tabulation, statistics, correlation analysis, and similar methods. It has three characteristics. First, it lets the data "speak": it emphasizes discovering the inherent regularities of the data from the data itself, rather than the traditional approach of forming a hypothesis first and then verifying it. Second, the analysis method is chosen according to the data and is not limited to traditional statistical methods. Third, the analysis results tend to be presented through data visualization so that the intrinsic value of the data can be found intuitively.
Correlation is quantified using the Pearson correlation coefficient. For a variable X and a corresponding variable Y, the Pearson correlation coefficient is calculated as follows:
Figure BDA0002459981790000101
where the degree of correlation between the variables X and Y is classified as follows: |r| < 0.4 is low linear correlation, 0.4 < |r| < 0.7 is significant linear correlation, and 0.7 < |r| ≤ 1 is high linear correlation.
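As a small illustration of this screening step, the sketch below computes the Pearson correlation coefficient in its standard form and maps |r| to the three levels described above. The function names are hypothetical, and assigning the boundary values 0.4 and 0.7 to the higher level is an assumption, since the text leaves those exact values unclassified.

```python
import math

def pearson_r(x, y):
    """Standard Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_level(r):
    """Classify |r|; boundary handling at 0.4 and 0.7 is an assumption."""
    magnitude = abs(r)
    if magnitude < 0.4:
        return "low"
    if magnitude < 0.7:
        return "significant"
    return "high"

# Usage: a perfectly linear relationship is classified as "high".
level = correlation_level(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```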
(2) Selection of feeder fault signature variables
For the preliminarily screened fault feature variables, a feature selection algorithm is used to select an optimal feature subset from the large number of related features. Feature selection is the process of searching for feature subsets within a feature set containing d features and evaluating each subset found, so as to select the optimal feature subset. The specific process is shown in Fig. 5, and the search strategy is shown in Fig. 7.
Feature subset processing mainly comprises four parts: generating feature subsets, evaluating feature subsets, terminating the search, and outputting the result. Generating feature subsets is the process of continuously producing candidate feature subsets. Evaluating a feature subset means calculating the evaluation value of the candidate feature subset and comparing it with that of the optimal feature subset obtained in previous iterations; if the new candidate subset is better, the optimal feature subset is updated. When the search termination condition is met, the feature selection result is output.
Feature subset search, i.e., the process of generating candidate subsets, is a very important step in feature selection. In each cycle, the candidate subset generated may turn out to be the optimal feature subset. The feature subset search method therefore directly determines whether feature selection can obtain the optimal feature subset.
Two problems need to be solved: (1) removing redundant feature variables and variables that are not strongly correlated; (2) quantifying the correlation between two discrete variables and between a discrete variable and a continuous variable.
By using the evaluation function of a correlation-based feature subset evaluation algorithm, an optimal feature subset can be selected in which the redundancy among features is low and the correlation between the features and the predictor variable is high (i.e., strongly correlated features); the weighted correlation coefficient calculation method that is introduced can measure the correlation among variables of various types.
The correlation-based feature subset evaluation algorithm calculates the evaluation value of a candidate feature subset F with a feature subset evaluation function, and the optimal feature subset is selected in combination with a feature subset search algorithm. The feature subset evaluation function is as follows:
Figure BDA0002459981790000102
where e(F) is the evaluation value of the candidate subset F, and d is the number of features contained in F. The quantity shown in Figure BDA0002459981790000103 is the average correlation between the feature variables in the candidate subset F and the predictor variable, and the quantity shown in Figure BDA0002459981790000111 is the average correlation among the feature variables of the candidate subset F. The higher the redundancy among the features within a feature subset, i.e., the larger the value of the quantity shown in Figure BDA0002459981790000112, the smaller the value of e(F); the more strongly correlated feature variables the candidate feature subset F contains, i.e., the larger the value of the quantity shown in Figure BDA0002459981790000113, the larger the value of e(F). Therefore, by comparing the evaluation values of the candidate feature subsets, redundant and non-strongly correlated variables can be eliminated, and an optimal feature subset strongly correlated with the predictor variable is selected.
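The behaviour described above can be sketched in code. The evaluation function itself appears only as an image in the source, so this sketch assumes the widely used correlation-based feature selection (CFS) merit, e(F) = d·r_cf / sqrt(d + d(d-1)·r_ff), where r_cf is the average feature-to-predictor correlation and r_ff is the average feature-to-feature correlation. The symbols r_cf and r_ff, the function name `subset_merit`, and the example feature names and correlation values are all hypothetical.

```python
import math
from itertools import combinations

def subset_merit(features, r_cf, r_ff):
    """Assumed CFS-style merit of a candidate feature subset.

    features: list of feature names in the candidate subset F
    r_cf:     dict mapping feature name to |correlation with the predictor variable|
    r_ff:     dict mapping frozenset({a, b}) to |correlation between features a and b|
    """
    d = len(features)
    if d == 0:
        return 0.0
    mean_cf = sum(r_cf[f] for f in features) / d
    pairs = list(combinations(features, 2))
    # With a single feature there are no pairs, so redundancy is zero.
    mean_ff = sum(r_ff[frozenset(p)] for p in pairs) / len(pairs) if pairs else 0.0
    return d * mean_cf / math.sqrt(d + d * (d - 1) * mean_ff)

# Hypothetical correlations: "temp_copy" is nearly redundant with "temp".
r_cf = {"load": 0.8, "temp": 0.7, "temp_copy": 0.7}
r_ff = {
    frozenset({"load", "temp"}): 0.1,
    frozenset({"load", "temp_copy"}): 0.1,
    frozenset({"temp", "temp_copy"}): 0.95,
}
merit_diverse = subset_merit(["load", "temp"], r_cf, r_ff)
merit_redundant = subset_merit(["temp", "temp_copy"], r_cf, r_ff)
```

Under this assumed form, the subset with two weakly related features scores higher than the subset with two nearly redundant ones, which matches the qualitative description in the text.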
Because the feature set contains both discrete and continuous variables, the correlation between each pair of variables is calculated in different ways depending on the variable types. For two feature variables X and Y:
1) When X and Y are both continuous variables, the Pearson correlation coefficient is used.
2) When Y is a continuous variable and X is a discrete variable taking i distinct values, the correlation between X and Y is calculated with the semi-weighted correlation coefficient formula:
Figure BDA0002459981790000114
where P(X = xi) is the proportion of the i-th value in X, and Xbi is the variable formed by setting the entries of the discrete variable X that equal xi to 1 and all other entries to 0.
3) When X and Y are both discrete variables taking i and j distinct values respectively, the correlation between X and Y is calculated with the fully weighted correlation coefficient formula:
Figure BDA0002459981790000115
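The two weighted formulas are given only as images, so the following is a sketch under stated assumptions: the semi-weighted coefficient is taken as the sum over values xi of P(X = xi) times the Pearson correlation between the indicator variable Xbi and Y, and the fully weighted coefficient as the sum over value pairs of P(X = xi, Y = yj) times the Pearson correlation between Xbi and Ybj. Using absolute correlations, returning 0 for a constant variable, and all function names are assumptions made for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant (assumption)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denom = math.sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    return cov / denom if denom else 0.0

def indicator(values, target):
    """Binary variable: 1 where the discrete variable equals target, 0 elsewhere."""
    return [1.0 if v == target else 0.0 for v in values]

def semi_weighted_corr(x_discrete, y_continuous):
    """Assumed form: sum_i P(X = xi) * |r(Xbi, Y)|."""
    n = len(x_discrete)
    total = 0.0
    for value in set(x_discrete):
        x_b = indicator(x_discrete, value)
        total += (sum(x_b) / n) * abs(pearson_r(x_b, y_continuous))
    return total

def fully_weighted_corr(x_discrete, y_discrete):
    """Assumed form: sum_i sum_j P(X = xi, Y = yj) * |r(Xbi, Ybj)|."""
    n = len(x_discrete)
    total = 0.0
    for x_value in set(x_discrete):
        x_b = indicator(x_discrete, x_value)
        for y_value in set(y_discrete):
            y_b = indicator(y_discrete, y_value)
            p_joint = sum(a * b for a, b in zip(x_b, y_b)) / n
            total += p_joint * abs(pearson_r(x_b, y_b))
    return total

# Usage with hypothetical data: a discrete variable that fully determines Y.
semi = semi_weighted_corr(["a", "a", "b", "b"], [1.0, 1.0, 3.0, 3.0])
full = fully_weighted_corr(["a", "a", "b", "b"], ["u", "u", "v", "v"])
```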
Preferably, the third step comprises optimizing the parameters of the support vector machine with a particle swarm optimization algorithm and constructing the model, specifically as follows:
(1) Fault grading and evaluation index
1) Because the total connected load differs from feeder to feeder, the extent of the fault range is reflected by dividing the sum of a feeder's fault loads in the month by the total rated capacity of that feeder. The fault load proportion Li is expressed as follows:
Figure BDA0002459981790000116
Here, n is the total number of faults in the month; SiN is the rated capacity of the i-th feeder; sij is the fault load of the i-th feeder at its j-th fault in the month. To take both the number of power outages and the fault range into account, a power distribution network fault grading index f is established by weighting:
fi=λ·Ni+Li
In the formula, Ni is the number of faults of the i-th feeder; λ is a weight coefficient, taken here as 0.25; and fi is the index of the i-th feeder. The fault grades are defined as follows:
Figure BDA0002459981790000121
A fault grade of 2 indicates that the feeder has a relatively high risk of being in a fault state; typically the number of faults is 3 to 8 and Li is less than 100%. A fault grade of 3 indicates that the feeder has a high risk of being in a fault state; typically the number of faults is more than 8 and Li is more than 100%.
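The index calculation can be sketched as follows. The code assumes that Li is the sum of the monthly fault loads divided by the feeder capacity, expressed as a fraction (1.0 = 100%), and applies fi = λ·Ni + Li with λ = 0.25 as stated above. The grade boundaries are given only in a figure and are therefore not reproduced; the function names and the example loads are hypothetical.

```python
def fault_load_ratio(fault_loads, feeder_capacity):
    """Li: sum of the feeder's fault loads in the month divided by its capacity.

    Assumption: the result is a fraction, so 1.0 corresponds to 100%.
    """
    return sum(fault_loads) / feeder_capacity

def fault_grading_index(fault_loads, feeder_capacity, weight=0.25):
    """fi = weight * Ni + Li, where Ni is the number of faults in the month."""
    n_faults = len(fault_loads)
    return weight * n_faults + fault_load_ratio(fault_loads, feeder_capacity)

# Usage: three faults of 200, 300 and 100 (same unit as the capacity) on a
# feeder with capacity 1000 give Li = 0.6 and fi = 0.25 * 3 + 0.6 = 1.35.
index = fault_grading_index([200.0, 300.0, 100.0], 1000.0)
```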
2) The Kappa statistic is adopted as the evaluation index of power distribution network fault prediction. The Kappa statistic is calculated from the values in the error matrix, a square matrix with r rows and r columns in which the rows represent the actual classes of the samples and the columns represent the predicted classes. It is calculated as follows:
Figure BDA0002459981790000122
Here, Ns is the total number of samples; r is the number of rows of the matrix; xii is the value on the matrix diagonal; xir and xic are the sums of the i-th row and the i-th column, respectively. The fault prediction accuracy can be obtained from this formula. The K value generally lies between 0 and 1, and the higher the K value, the higher the accuracy of the power distribution network fault prediction.
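For illustration, the sketch below computes the Kappa statistic from an error matrix using the standard Cohen's kappa expression, K = (Ns·Σxii − Σ(xir·xic)) / (Ns² − Σ(xir·xic)), which is assumed to match the formula shown in the figure. The function name and the example matrix are hypothetical.

```python
def kappa(matrix):
    """Cohen's kappa from a square error (confusion) matrix.

    matrix[i][j]: number of samples of actual class i predicted as class j.
    """
    r = len(matrix)
    n_s = sum(sum(row) for row in matrix)                       # total samples Ns
    diagonal = sum(matrix[i][i] for i in range(r))              # sum of xii
    row_sums = [sum(matrix[i]) for i in range(r)]               # xir
    col_sums = [sum(matrix[j][i] for j in range(r)) for i in range(r)]  # xic
    chance = sum(row_sums[i] * col_sums[i] for i in range(r))
    return (n_s * diagonal - chance) / (n_s * n_s - chance)

# Usage: 35 of 50 samples on the diagonal gives K = 0.4;
# a purely diagonal matrix gives K = 1.0.
k_partial = kappa([[20, 5], [10, 15]])
k_perfect = kappa([[5, 0], [0, 5]])
```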
(2) Support vector machine
Let the training samples be {x1, y1}, …, {xi, yi}. The data are mapped into a high-dimensional space and linear regression is performed there, so that the original nonlinear fitting problem is converted into a linear regression problem. The fitting function is expressed as follows:
Figure BDA0002459981790000123
Here, ω is the weight vector, x is the input vector, and b is the fitting bias. The model is trained on the samples so as to minimize the following expression:
Figure BDA0002459981790000131
Figure BDA0002459981790000132
Here, the first term (Figure BDA0002459981790000133) is the empirical error of the optimization problem, and the second term (Figure BDA0002459981790000134) is the regularization term. C is the cost parameter that balances the regularization term against the empirical error; as the penalty factor, it has a significant impact on the training effect. The remaining parameter is the parameter of the loss function of the optimization problem. By introducing the slack variables ξ and ξ*, the above expression can be transformed into:
Figure BDA0002459981790000135
Figure BDA0002459981790000136
The above problem is then converted into its dual problem:
Figure BDA0002459981790000137
Here, αi and αi* are the introduced Lagrange multipliers. The SVM model finally obtained from the above equation is as follows:
Figure BDA0002459981790000138
where k(xi, xj) is the kernel function; a radial basis function is used here as the kernel function.
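To make the final model concrete, the sketch below evaluates a support vector regression function of the usual form f(x) = Σ(αi − αi*)·k(xi, x) + b with a radial basis kernel k(a, b) = exp(−γ·‖a − b‖²). Both forms are standard assumptions, since the source shows the model only as an image; the kernel width name `gamma`, the function names, and the numeric values in the usage example are hypothetical.

```python
import math

def rbf_kernel(a, b, gamma):
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def svm_predict(x, support_vectors, coefficients, bias, gamma):
    """Evaluate f(x) = sum_i coefficient_i * k(x_i, x) + bias.

    coefficients[i] stands for the difference of the two Lagrange
    multipliers (alpha_i - alpha_i*) of support vector x_i.
    """
    return sum(
        c * rbf_kernel(sv, x, gamma)
        for sv, c in zip(support_vectors, coefficients)
    ) + bias

# Usage with two hypothetical one-dimensional support vectors.
prediction = svm_predict(
    x=[0.0],
    support_vectors=[[0.0], [1.0]],
    coefficients=[1.0, -1.0],
    bias=0.5,
    gamma=1.0,
)
```

In practice the multipliers and bias come from solving the dual problem; this sketch only shows how a trained model is evaluated.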
(3) Power distribution network fault prediction model based on support vector machine
A support vector machine algorithm is adopted to solve the complex problem of power grid fault prediction: the learned "hidden rules" are stored in the SVM model, and the parameters are optimized to construct the power distribution network fault prediction model.
Fig. 6 shows the SVM power distribution network fault prediction model with PSO-optimized parameters. The steps are as follows:
Step 1: Initialize the particle swarm size m, and set the weight factors of the algorithm, the termination condition, and the initial particle encoding;
Step 2: Set the individual extremum of each particle to its current position, calculate the fitness value of each particle with the fitness function, select the particle with the best fitness, and take its individual extremum as the initial global extremum;
Step 3: Perform the iterative calculation according to the position and velocity update formulas of the particles, updating the position and velocity of each particle;
Step 4: Calculate the fitness value of each particle after each iteration according to the fitness function;
Step 5: Compare the fitness value of each particle with the fitness value of its individual extremum; if it is better, update the individual extremum, otherwise keep the original value;
Step 6: Compare the updated individual extremum of each particle with the global extremum; if it is better, update the global extremum, otherwise keep the original value;
Step 7: Determine whether a termination condition is met. If the maximum number of iterations has been reached, the solution has converged, or the solution achieves the expected effect, terminate the iteration; otherwise, return to Step 3;
Step 8: Obtain the parameter combination that makes the model optimal, and use it to construct the optimal model.
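Steps 1 to 8 can be sketched as a generic particle swarm search over two SVM parameters. This is a minimal illustration under several assumptions: the standard velocity update v = w·v + c1·r1·(pbest − x) + c2·r2·(gbest − x) is used, lower fitness is treated as better, the only termination condition is the iteration limit, and the inertia weight 0.7 and acceleration coefficients 1.5 are illustrative values not taken from the patent. The fitness function here is a stand-in quadratic surface; in a real application it would be replaced by a model error measure such as cross-validation error.

```python
import random

def pso_minimize(fitness, bounds, n_particles=30, n_iterations=200,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize fitness(position) inside box bounds with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 1: initialize positions randomly inside the bounds, velocities at 0.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    # Step 2: individual extrema start at the current positions.
    best_pos = [p[:] for p in positions]
    best_val = [fitness(p) for p in positions]
    g = min(range(n_particles), key=best_val.__getitem__)
    global_pos, global_val = best_pos[g][:], best_val[g]
    for _ in range(n_iterations):              # Step 7: stop at the iteration limit
        for i in range(n_particles):
            for d in range(dim):               # Step 3: update velocity and position
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (inertia * velocities[i][d]
                                    + c1 * r1 * (best_pos[i][d] - positions[i][d])
                                    + c2 * r2 * (global_pos[d] - positions[i][d]))
                lo, hi = bounds[d]
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)
            value = fitness(positions[i])      # Step 4: evaluate fitness
            if value < best_val[i]:            # Step 5: update individual extremum
                best_val[i], best_pos[i] = value, positions[i][:]
                if value < global_val:         # Step 6: update global extremum
                    global_val, global_pos = value, positions[i][:]
    return global_pos, global_val              # Step 8: best parameter combination

def stand_in_error(params):
    """Hypothetical error surface over (log10 C, log10 gamma), minimal at (1, -2)."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

best_params, best_error = pso_minimize(stand_in_error, [(-5.0, 5.0), (-5.0, 5.0)])
```

Searching in logarithmic parameter space is a design choice made here because the penalty factor and kernel width typically vary over several orders of magnitude.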
Through the above steps, the following effects are achieved:
(1) The management level of the power distribution network is improved, and the various resources of the power distribution network are used rationally. Using the results of feeder fault prediction, faults are sensed in advance and a decision basis is provided for the operation, maintenance, and management of the power distribution network, which can effectively improve its reliability.
(2) The losses caused by faults are reduced. Using the results of feeder fault prediction, targeted preventive measures are taken for feeders with higher fault grades, such as intensified defect elimination, inspection, or replacement of aging equipment, reducing the losses that power distribution network faults cause to power companies and power users.
(3) Emergency repair efficiency and user satisfaction are improved. The feeder fault prediction results can guide power distribution network emergency repair work, for example by optimizing the locations of emergency repair stations. Shorter repair times give electricity customers a better experience, and the investment in emergency repair resources is expected to be reduced.
According to the method, the influence factors of power distribution network faults are analyzed in depth in accordance with the principles of feasibility and comprehensive rationality of power distribution network fault prediction; the power distribution network data are investigated and the data required for feeder fault prediction are extracted; and a power distribution network fault prediction model is established on the basis of the optimized data and parameters. Based on this model, abnormal data of distribution network feeders, such as load and temperature, are analyzed and predicted more accurately than with existing models, so that power distribution network faults are accurately judged and predicted. This addresses the problems that, in power distribution network fault prediction, data are difficult to refine and extract and prediction model parameters are difficult to select; it provides a basis for power grid staff to respond to distribution network faults in time, and it improves the stability and reliability of the whole power system.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the design concept of the present invention should be included in the scope of the present invention.

Claims (5)

1. A method for power distribution network fault prediction, characterized by comprising the following steps:
The first step: a data preprocessing part, comprising power distribution network fault influence factor analysis and sample screening based on a particle swarm algorithm;
The second step: a characteristic variable selection part, comprising performing preliminary feature selection on each relevant variable and then performing secondary screening with a specific selection algorithm to form an optimal characteristic variable set;
The third step: building a power distribution network fault prediction model based on fault grade division, comprising grading the faults, optimizing the parameters of a support vector machine with an improved particle swarm optimization algorithm, constructing the power distribution network fault prediction model, analyzing the various data of the power distribution network based on the constructed model, and outputting anomaly and fault predictions.
2. The method for power distribution network fault prediction according to claim 1, wherein the power distribution network fault influence factor analysis in the first step comprises analysis of the fault influence factors and investigation and extraction of the power distribution network data:
(1) The analysis of the fault influence factors comprises:
1) Operational influence factors
The operational influence factors comprise feeder load and feeder temperature rise;
2) Equipment-related influence factors
The equipment-related influence factors comprise the number of devices on the feeder and the commissioning time of the equipment;
3) External influence factors
The external influence factors comprise the influence of weather, since the equipment is exposed outdoors for long periods and is affected by factors such as temperature, wind, and precipitation;
(2) The investigation and extraction of the power distribution network data comprises:
The data contained in the power distribution network are comprehensively investigated and, once understood, historical fault influence factor data and fault data are comprehensively collected from sources including a marketing business management system, an ERP system, a power distribution automation system, a power utilization information acquisition system, a power distribution line online monitoring system, a production management system, a power distribution geographic information system, and an intelligent public distribution transformer monitoring system.
3. The method for power distribution network fault prediction according to claim 1, wherein the sample screening based on a particle swarm algorithm in the first step comprises two parts, original data processing and outlier data sample elimination:
(1) Original data processing, comprising three steps: data cleaning, data transformation, and data integration:
1) Data cleaning
Data cleaning comprises missing value analysis, abnormal value analysis, and duplicate value analysis;
2) Data transformation
Data transformation converts the original data into a form that is easy to analyze and apply, and comprises feature construction, data grading, and data quantization;
3) Data integration
Data integration performs data statistics and merges the data into a unified data store;
(2) Outlier data sample elimination
After the original data have been preprocessed, abnormal samples may still appear among the resulting data samples; outlier sample data are diagnosed and removed using a clustering-based outlier sample diagnosis technique.
4. The method for power distribution network fault prediction according to claim 1, wherein the characteristic variable selection in the second step comprises data exploration and analysis of feeder fault influence factors and selection of feeder fault characteristic variables:
(1) Data exploration and analysis of feeder fault influence factors
The relationship between feeder faults and their influence factors is analyzed by data exploration methods, and the characteristic variables of the feeder fault influence factors (referred to as fault characteristic variables for short) are preliminarily screened, laying a foundation for further screening the optimal fault characteristic variables with a feature selection algorithm;
(2) Selection of feeder fault characteristic variables
An optimal feature subset is selected from a large number of related features with a feature selection algorithm, using a correlation-based feature subset evaluation algorithm to remove redundant feature variables and variables that are not strongly correlated, and to quantify the correlation between two discrete variables and between a discrete variable and a continuous variable.
5. The method for power distribution network fault prediction according to claim 1, wherein the third step comprises optimizing the parameters of the support vector machine with a particle swarm optimization algorithm and constructing the model, specifically as follows:
(1) Fault grading and evaluation index
1) Fault grades are divided according to the number of power outages and the fault range, based on statistical probability;
2) The Kappa statistic is adopted as the evaluation index of power distribution network fault prediction;
(2) Support vector machine
For the training samples, the data are mapped into a high-dimensional space and linear regression is performed there, so that the original nonlinear fitting problem is converted into a linear regression problem, with the following formula:
Figure FDA0002459981780000031
Here, ω is the weight vector, x is the input vector, and b is the fitting bias;
(3) Power distribution network fault prediction model based on a support vector machine
A support vector machine algorithm is adopted to solve the complex problem of power grid fault prediction: the learned "hidden rules" are stored in the SVM model, and the parameters are optimized to construct the power distribution network fault prediction model.
CN202010317146.4A 2020-04-21 2020-04-21 Method for predicting power distribution network fault Pending CN111429034A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010317146.4A CN111429034A (en) 2020-04-21 2020-04-21 Method for predicting power distribution network fault

Publications (1)

Publication Number Publication Date
CN111429034A true CN111429034A (en) 2020-07-17

Family

ID=71554225

Country Status (1)

Country Link
CN (1) CN111429034A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160061879A1 (en) * 2014-08-28 2016-03-03 General Electric Company Systems and methods for identifying fault location using distributed communication
CN108596449A (en) * 2018-04-09 2018-09-28 南京邮电大学 It is a kind of to consider distribution network reliability prediction technique of the weather to distribution network failure impact probability
CN110750524A (en) * 2019-09-12 2020-02-04 中国电力科学研究院有限公司 Method and system for determining fault characteristics of active power distribution network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
卢育梓: "基于数据挖掘技术的配电网故障预测研究", 《中国优秀博硕士学位论文全文数据库(硕士)》 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112115180B (en) * 2020-09-11 2021-09-14 国网山东省电力公司枣庄供电公司 Power grid accident prediction method based on big data
CN112115180A (en) * 2020-09-11 2020-12-22 国网山东省电力公司枣庄供电公司 Power grid accident prediction method based on big data
CN112257937B (en) * 2020-10-28 2023-06-16 国网信通亿力科技有限责任公司 Power distribution network fault prediction system and method based on big data technology
CN112257937A (en) * 2020-10-28 2021-01-22 国网信通亿力科技有限责任公司 Power distribution network fault prediction system and method based on big data technology
CN112395178A (en) * 2020-11-18 2021-02-23 河南辉煌城轨科技有限公司 Equipment fault prediction method
CN112395178B (en) * 2020-11-18 2022-09-30 河南辉煌城轨科技有限公司 Equipment fault prediction method
CN112561189A (en) * 2020-12-23 2021-03-26 宁夏中科嘉业新能源研究院(有限公司) Method suitable for predicting power generation capacity of photovoltaic power station
CN113435101A (en) * 2021-04-01 2021-09-24 国网内蒙古东部电力有限公司 Power failure prediction method for support vector machine based on particle swarm optimization
CN113435101B (en) * 2021-04-01 2023-06-30 国网内蒙古东部电力有限公司 Particle swarm optimization-based power failure prediction method for support vector machine
CN112990776A (en) * 2021-04-26 2021-06-18 广东电网有限责任公司东莞供电局 Distribution network equipment health degree evaluation method
CN113609765A (en) * 2021-07-29 2021-11-05 国网河北省电力有限公司邯郸供电分公司 Overvoltage prediction method
CN113609765B (en) * 2021-07-29 2023-12-26 国网河北省电力有限公司邯郸供电分公司 Overvoltage prediction method
CN113985101A (en) * 2021-11-02 2022-01-28 国网江苏省电力有限公司电力科学研究院 Non-contact broadband voltage monitoring system
CN114915035B (en) * 2022-07-19 2022-09-13 北京智芯微电子科技有限公司 Power distribution network monitoring method, device and system
CN114915035A (en) * 2022-07-19 2022-08-16 北京智芯微电子科技有限公司 Power distribution network monitoring method, device and system
CN115509187B (en) * 2022-09-20 2023-04-18 北京中佳瑞通科技有限公司 Industrial big data processing method and system
CN115509187A (en) * 2022-09-20 2022-12-23 北京中佳瑞通科技有限公司 Industrial big data processing method and system
CN117491850A (en) * 2024-01-03 2024-02-02 江苏上达半导体有限公司 Circuit fault monitoring method and system based on artificial intelligence
CN117491850B (en) * 2024-01-03 2024-03-26 江苏上达半导体有限公司 Circuit fault monitoring method and system based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200717
