CN117520407A - Method and device for mining power transmission line fault causes under extreme weather conditions - Google Patents
- Publication number: CN117520407A
- Application number: CN202311329362.0A
- Authority: CN (China)
- Prior art keywords: fault, transmission line, power transmission, fuzzy, causes
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06F16/2465: Query processing support for facilitating data mining operations in structured databases
- G06F16/243: Natural language query formulation
- G06F16/2474: Sequence data queries, e.g. querying versioned data
- G06F18/2431: Classification techniques relating to the number of classes; multiple classes
- G06F40/30: Handling natural language data; semantic analysis
- G06Q50/06: Energy or water supply
Abstract
The invention discloses a method for mining power transmission line fault causes under extreme weather conditions, comprising the following steps: collecting power transmission line fault information and extreme weather information, and constructing a candidate feature library of power transmission line fault causes; based on the significance and weight of the features, calculating the influence weight of each candidate fault cause on power transmission line faults using the Relief-F algorithm, to obtain the power transmission line fault causes; integrating the power transmission line fault information with the power transmission line fault causes, and converting the fault data into a fault fuzzy set through membership functions; and, for the obtained fault fuzzy set, evaluating the association relations of power transmission line faults using a fuzzy frequent itemset mining algorithm, calculating the support and confidence of each association relation, and quantifying the influence of the fault causes on power transmission line faults. The method and device mine fault causes in depth, quantify their influence on power transmission line faults, and provide strong support for the stable operation of the power system.
Description
Technical Field
The invention belongs to the field of power transmission line fault cause identification, and particularly relates to a power transmission line fault cause mining method and device under extreme weather conditions.
Background
With the continuous expansion of power systems and the growth of power loads, extreme weather conditions such as storms and high temperatures have an increasingly obvious influence on the power system, and on power transmission lines in particular. Extreme weather events often cause power transmission line faults and thereby affect the stable supply of power. It is therefore important to mine and quantify the causes of power transmission line faults in extreme weather.
At present, research on the causes of power transmission line faults focuses on the following aspects:
1. Fault cause analysis based on statistical methods: these methods perform statistical analysis of historical fault data to determine the fault modes and fault causes of the power transmission line. However, because extreme weather is uncertain and variable, purely statistical methods often fail to fully reveal fault causes.
2. Fault cause analysis based on physical models: these methods rely on the physical characteristics of the power transmission line to study fault mechanisms in depth. However, because the power system is complex, these methods are of limited use when extreme weather factors are considered.
3. Fault cause analysis based on machine learning: in recent years these methods have received a great deal of attention. Features are extracted from historical data using big data and machine learning techniques, and a predictive model is built to analyze fault causes. However, these methods often lack deep feature fusion and mining, which makes the analysis results less accurate.
Thus, the main disadvantages of the prior art are:
Insufficient feature fusion: many methods are still at an early stage in fusing multi-source data, so they cannot make full use of the available data, which affects the accuracy of the analysis.
Insufficient application of deep mining techniques: although there are many studies in this area, deep mining techniques are still rarely applied to power transmission line fault cause analysis.
Lack of fine-grained quantification methods: most existing methods focus on mining fault causes, while quantitative analysis of these causes is relatively lacking.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a method and a device for mining power transmission line fault causes under extreme weather conditions.
To achieve the above purpose, the invention adopts the following technical solution:
A method for mining power transmission line fault causes under extreme weather conditions, comprising the following steps:
collecting power transmission line fault information and extreme weather information, and constructing a candidate feature library of power transmission line fault causes;
based on the significance and weight of the features, calculating the influence weight of each candidate fault cause on power transmission line faults using the Relief-F algorithm, to obtain the power transmission line fault causes;
integrating the power transmission line fault information with the power transmission line fault causes, and converting the fault data into a fault fuzzy set through membership functions;
for the obtained fault fuzzy set, evaluating the association relations of power transmission line faults using a fuzzy frequent itemset mining algorithm, calculating the support and confidence of each association relation, and quantifying the influence of the fault causes on power transmission line faults.
Preferably, collecting the power transmission line fault information and extreme weather information and constructing the candidate feature library of power transmission line fault causes includes:
acquiring power transmission line fault information from a power information system;
acquiring extreme weather information from a comprehensive weather observation system;
integrating the power transmission line fault information and the extreme weather information, and recording the faulty power transmission line, fault time and fault reason in a table, one entry per row;
classifying each fault sample, and screening and recording the line insulation rate, wind speed and rainfall feature data;
constructing the candidate feature library B of power transmission line fault causes based on the screened data.
Preferably, calculating the influence weight of each fault cause on power transmission line faults using the Relief-F algorithm, based on the significance and weight of the features, to obtain the power transmission line fault causes, includes:
the candidate feature library B of power transmission line fault causes contains $m$ fault samples, each sample has $n$ meteorological features $A_i$ ($i = 1, 2, \ldots, n$) of the power transmission line, the feature weight vector is $W = \{W_1, W_2, W_3, \ldots, W_n\}$, and the number of iterations is $N$; the feature weights are calculated as follows:
setting the initial weights of all features to zero, i.e. $W_i = 0$, and normalizing all fault sample data;
randomly selecting a fault sample GZ from the data set, and selecting $r$ nearest-neighbor samples $H_j$ ($j = 1, 2, \ldots, r$) from the samples of the same fault class as GZ; similarly, selecting $r$ nearest-neighbor samples $M_j$ ($j = 1, 2, \ldots, r$) from the samples whose fault class differs from that of GZ; each feature weight is calculated by the following formula:
where $l$ is the iteration count, $class(GZ)$ denotes the fault class of sample GZ, $C \neq class(GZ)$ denotes a fault class $C$ different from that of GZ, $p(C)$ denotes the proportion of samples of class $C$ in the total number of samples, $M_j(C)$ denotes the $j$-th fault sample of fault class $C$, and $diff(A_i, GZ, H_j)$ denotes the distance between sample $H_j$ and sample GZ on feature $A_i$;
repeating the feature weight calculation steps until the iteration count $l = N$, and outputting the feature weight vector $W$;
setting a threshold, and taking a feature as a power transmission line fault cause when its element in $W$ is greater than the threshold.
Preferably, the distance between sample $H_j$ and sample GZ is defined as follows:
where $A_{i,GZ}$ denotes the value of fault sample GZ on feature $A_i$; similarly, $A_{i,H_j}$ denotes the value of fault sample $H_j$ on feature $A_i$.
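The weight-update and distance formulas are not reproduced in this text, so the Python sketch below falls back on the textbook Relief-F update rather than the patent's exact expressions. It assumes that the hit term is subtracted and the miss term, scaled by $p(C) / (1 - p(class(GZ)))$, is added, each divided by $N \cdot r$, and that `diff` is the absolute difference of min-max normalized values. All function names and the sample data are illustrative.

```python
import random


def normalize(samples):
    """Min-max scale every feature column to [0, 1]."""
    n = len(samples[0])
    lo = [min(s[i] for s in samples) for i in range(n)]
    hi = [max(s[i] for s in samples) for i in range(n)]
    return [[(s[i] - lo[i]) / (hi[i] - lo[i]) if hi[i] > lo[i] else 0.0
             for i in range(n)] for s in samples]


def diff(i, a, b):
    """Assumed distance on feature i: absolute difference of normalized values."""
    return abs(a[i] - b[i])


def nearest(target, pool, r):
    """Return the r samples of pool closest to target (sum of per-feature diffs)."""
    n = len(target)
    return sorted(pool, key=lambda s: sum(diff(i, target, s) for i in range(n)))[:r]


def relief_f(samples, labels, n_iter, r, seed=0):
    """Return one weight per feature after n_iter Relief-F iterations."""
    rng = random.Random(seed)
    x = normalize(samples)
    m, n = len(x), len(x[0])
    prior = {c: labels.count(c) / m for c in set(labels)}   # p(C)
    w = [0.0] * n                                           # W_i = 0
    for _ in range(n_iter):
        g = rng.randrange(m)                                # random fault sample GZ
        gz, cls = x[g], labels[g]
        hits = nearest(gz, [x[k] for k in range(m)
                            if k != g and labels[k] == cls], r)
        if hits:
            for i in range(n):
                # Same-class neighbours that differ on feature i lower its weight.
                w[i] -= sum(diff(i, gz, h) for h in hits) / (n_iter * len(hits))
        for c, p in prior.items():
            if c == cls:
                continue
            misses = nearest(gz, [x[k] for k in range(m) if labels[k] == c], r)
            scale = p / (1.0 - prior[cls])
            for i in range(n):
                # Other-class neighbours that differ on feature i raise its weight.
                w[i] += scale * sum(diff(i, gz, s) for s in misses) / (n_iter * len(misses))
    return w


# Made-up data: feature 0 separates the two fault classes, feature 1 does not.
samples = [[0.9, 0.1], [1.0, 0.5], [0.8, 0.9],
           [0.1, 0.1], [0.0, 0.5], [0.2, 0.9]]
labels = ["wind"] * 3 + ["ice"] * 3
weights = relief_f(samples, labels, n_iter=30, r=2)
```

On this toy data the separating feature ends with a positive weight and the uninformative one with a negative weight, which is the behavior the threshold step relies on.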
Preferably, integrating the power transmission line fault information with the power transmission line fault causes, and selecting membership functions to convert the fault data into a fault fuzzy set, includes:
letting $I = \{i_1, i_2, \ldots, i_m\}$ be the set of all items in the quantitative database $QD = \{T_1, T_2, \ldots, T_n\}$, where any transaction $T_q \in QD$ is a subset of $I$; the association relation between transactions is:
$T_a \rightarrow T_b \quad (3)$
where $T_a$ and $T_b$ are both transactions, i.e. sets of items $i_k$ ($k = 1, 2, \ldots, m$), and $T_a \cap T_b = \varnothing$.
Let the quantitative value of the $i$-th item of a transaction $T_q$ in the quantitative database $QD$ be $v_{iq}$, and let its membership degrees at the semantic levels $R_i = \{R_{i1}, R_{i2}, \ldots, R_{ih}\}$ be $\mu_j(v_{iq})$ ($j = 1, 2, \ldots, h$); the fuzzy term $f_{iq}$ corresponding to the $i$-th item of transaction $T_q$ is defined as follows:
where the membership degree $\mu_j(v_{iq})$ corresponding to $v_{iq}$ is determined by the membership function and expresses how strongly the $i$-th item of transaction $T_q$ belongs to the semantic level $R_{il}$ ($l = 1, 2, \ldots, h$); a higher membership degree means that the semantic level is more representative. The fuzzy set $F_q$ of transaction $T_q$ is defined as follows:
$F_q = \{f_{1q}, f_{2q}, \ldots, f_{hq}\} \quad (5)$
Triangular distributions are selected to describe the indexes quantitatively, and the membership functions are divided into three semantic levels: Low, Middle and High. The fuzzy set corresponding to each transaction is obtained according to formulas (3)-(5), which converts the quantitative information into semantic terms with membership degrees.
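A minimal Python sketch of the fuzzification step follows. The breakpoints of the three triangles are assumptions (FIG. 1 is not reproduced here): values are taken to be already scaled to [0, 1], with Low peaking at 0, Middle at 0.5 and High at 1. The names `triangular`, `LEVELS`, `fuzzify` and `fuzzify_transaction` are illustrative.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Assumed breakpoints (a, b, c) for values already scaled to [0, 1].
LEVELS = {
    "Low": (-0.5, 0.0, 0.5),
    "Middle": (0.0, 0.5, 1.0),
    "High": (0.5, 1.0, 1.5),
}


def fuzzify(value):
    """Map one quantitative value to its membership degree at each semantic level."""
    return {level: triangular(value, *abc) for level, abc in LEVELS.items()}


def fuzzify_transaction(transaction):
    """Turn {item: value} into {item: {level: membership}}, i.e. one fuzzy set."""
    return {item: fuzzify(v) for item, v in transaction.items()}


# A made-up transaction with two normalized weather readings.
fuzzy_set = fuzzify_transaction({"wind_speed": 0.75, "rainfall": 0.25})
```

With these breakpoints the memberships of neighboring levels sum to 1 for any value in [0, 1], so a reading of 0.25 is half Low and half Middle.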
Preferably, for the obtained fault fuzzy set, evaluating the association relations of power transmission line faults using the fuzzy frequent itemset mining algorithm and calculating their support and confidence comprises the following steps:
constructing a fuzzy table (Fuzzy-list) from the fuzzy set data;
mining the Fuzzy-list to obtain the fuzzy frequent itemsets;
quantifying the influence of the fault causes on power transmission line faults.
Preferably, constructing the Fuzzy-list from the fuzzy set data includes:
1) Given a semantic level $R_{il}$ contained in transaction $T_q$, the transaction identifier (tid) is recorded as the subscript $q$ of the transaction, i.e. tid = q;
2) The membership degree, in transaction $T_q$, of the fuzzy term corresponding to the semantic level with the highest support, $R_{it}$, is called the endogenous fuzzy value, denoted $if(R_{it}, T_q)$, and is defined as:
where $\max(fsup(R_{il}))$ denotes the maximum among the supports of all semantic levels of the item;
3) In transaction $T_q$, the maximum endogenous fuzzy value other than the fuzzy value $if(R_{it}, T_q)$ of the semantic level $R_{it}$ is recorded as the residual fuzzy value, denoted $rf(R_{it}, T_q)$, and is defined as:
$rf(R_{it}, T_q) = \max\{\, if(z, T_q) \mid z \in (T_q / R_{it}) \,\} \quad (7)$
where $(T_q / R_{it})$ is the set of semantic levels in transaction $T_q$ other than $R_{it}$.
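The three construction steps can be sketched in Python as follows. This is one reading of the text, not the patent's implementation. It assumes that the support of a semantic level is the sum of its membership degrees over all transactions, that only the highest-support level of each item is kept, and that the residual value is the largest endogenous value among the other kept terms of the same transaction. The function name and the data are illustrative.

```python
def build_fuzzy_lists(fuzzy_db):
    """fuzzy_db: list of transactions, each {item: {level: membership}}.

    Returns {(item, level): [(tid, if_value, rf_value), ...]}.
    """
    # Support of each (item, level): sum of memberships over all transactions.
    fsup = {}
    for tx in fuzzy_db:
        for item, levels in tx.items():
            for level, mu in levels.items():
                fsup[(item, level)] = fsup.get((item, level), 0.0) + mu

    # Keep, per item, the semantic level with the highest support (R_it).
    best = {}
    for (item, level), s in fsup.items():
        if item not in best or s > fsup[(item, best[item])]:
            best[item] = level

    # One list per kept term; tid is the 1-based transaction subscript q.
    lists = {(item, level): [] for item, level in best.items()}
    for tid, tx in enumerate(fuzzy_db, start=1):
        internal = {(item, best[item]): tx[item].get(best[item], 0.0) for item in tx}
        for term, if_value in internal.items():
            if if_value == 0.0:
                continue
            others = [v for t, v in internal.items() if t != term]
            lists[term].append((tid, if_value, max(others, default=0.0)))
    return lists


# Made-up fuzzy sets for three transactions and two items.
fuzzy_db = [
    {"wind": {"Low": 0.0, "High": 1.0}, "rain": {"Low": 0.25, "High": 0.75}},
    {"wind": {"Low": 0.5, "High": 0.5}, "rain": {"Low": 1.0, "High": 0.0}},
    {"wind": {"Low": 0.0, "High": 1.0}, "rain": {"Low": 0.5, "High": 0.5}},
]
fuzzy_lists = build_fuzzy_lists(fuzzy_db)
```

Here wind keeps its High level (support 2.5 against 0.5) and rain keeps its Low level (support 1.75 against 1.25), so each list has one entry per transaction.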
Preferably, mining the Fuzzy-list to obtain the fuzzy frequent itemsets includes:
1) Constructing the endogenous fuzzy total value and the residual fuzzy total value, and mining fuzzy relations by comparing the Fuzzy-list with the minimum support count, where the endogenous fuzzy total value ifsum(X) and the residual fuzzy total value rfsum(X) of an itemset X are constructed as follows:
where the itemset $X = \{R_{i1}, R_{i2}, \ldots, R_{ik}\}$. If the endogenous fuzzy total value ifsum(X) of itemset X is greater than the minimum support count $\delta \times |QD|$, X is called a fuzzy frequent itemset, where $\delta$ is the minimum support and $|QD|$ is the number of transactions in the data set; if the residual fuzzy total value rfsum(X) of itemset X is less than the minimum support count, no superset of X can be a fuzzy frequent itemset;
2) Selecting two-factor association relations for analysis, calculating their support and confidence, and recording the results in a table with the columns faulty power transmission line, fault factor, support and confidence.
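A minimal Python sketch of two-factor rule mining over Fuzzy-lists follows. Because the formulas for ifsum and rfsum are not reproduced in this text, several points are assumptions: both totals are plain sums over the list entries, the endogenous value of a pair in a transaction is the minimum of its members' values, support is the pair's ifsum divided by $|QD|$, and confidence is the pair's ifsum divided by the antecedent's ifsum. The function name and data are illustrative.

```python
from itertools import permutations


def mine_pair_rules(fuzzy_lists, n_transactions, min_support):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    threshold = min_support * n_transactions          # delta * |QD|
    ifsum = {t: sum(e[1] for e in fl) for t, fl in fuzzy_lists.items()}
    rfsum = {t: sum(e[2] for e in fl) for t, fl in fuzzy_lists.items()}
    rules = []
    for a, b in permutations(fuzzy_lists, 2):
        # Skip antecedents that are not frequent, or whose supersets cannot be.
        if ifsum[a] <= threshold or rfsum[a] < threshold:
            continue
        b_if = {tid: iv for tid, iv, _ in fuzzy_lists[b]}
        # Assumed fuzzy intersection: minimum membership per shared transaction.
        joint = sum(min(iv, b_if[tid]) for tid, iv, _ in fuzzy_lists[a] if tid in b_if)
        if joint <= threshold:
            continue
        rules.append((a, b, joint / n_transactions, joint / ifsum[a]))
    return rules


# Made-up Fuzzy-lists: entries are (tid, endogenous value, residual value).
fuzzy_lists = {
    ("wind", "High"): [(1, 1.0, 0.25), (2, 0.5, 1.0), (3, 1.0, 0.5)],
    ("rain", "Low"): [(1, 0.25, 1.0), (2, 1.0, 0.5), (3, 0.5, 1.0)],
}
rules = mine_pair_rules(fuzzy_lists, n_transactions=3, min_support=0.3)
```

The residual check is a cheap upper bound: the pair's endogenous value in a transaction can never exceed the largest value among the other terms, so a small rfsum rules out every superset before any join is computed.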
Preferably, quantifying the influence of the fault causes on power transmission line faults includes:
assigning an influence value to each fault cause or combination of fault causes using the fuzzy frequent itemsets;
assigning a weight coefficient to each fault cause or combination of fault causes according to how strongly it affects the power transmission line;
multiplying each fault influence value by the corresponding weight coefficient to obtain a weighted fault influence value;
sorting the fault causes by their weighted fault influence values;
taking the sorted fault causes and their weighted fault influence values as the quantification result of the influence of the fault causes on the power transmission line.
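The weighting and sorting steps amount to a few lines of Python. The text does not say how the influence values or weight coefficients are obtained, so the numbers below are placeholders, and `rank_fault_causes` is an illustrative name.

```python
def rank_fault_causes(influence, weights):
    """Multiply each influence value by its weight and sort in descending order."""
    weighted = {cause: influence[cause] * weights[cause] for cause in influence}
    return sorted(weighted.items(), key=lambda kv: kv[1], reverse=True)


# Placeholder values; in practice the influence values would come from the
# mined itemsets and the weights from the assessed impact of each cause.
influence = {"strong wind": 0.5, "heavy rain": 0.75, "icing": 0.25}
weights = {"strong wind": 1.0, "heavy rain": 0.5, "icing": 1.0}
ranking = rank_fault_causes(influence, weights)
```

A combination of causes can be handled the same way by using a tuple of cause names as the dictionary key.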
A device for mining power transmission line fault causes under extreme weather conditions, comprising:
a fault cause candidate feature library construction module, used for collecting power transmission line fault information and extreme weather information and constructing a candidate feature library of power transmission line fault causes;
a power transmission line fault cause acquisition module, used for calculating, based on the significance and weight of the features, the influence weight of each fault cause on power transmission line faults using the Relief-F algorithm, to obtain the power transmission line fault causes;
a fault fuzzy set construction module, used for integrating the power transmission line fault information with the power transmission line fault causes and converting the fault data into a fault fuzzy set through membership functions;
a fault cause quantification module, used for evaluating the association relations of power transmission line faults using a fuzzy frequent itemset mining algorithm for the obtained fault fuzzy set, calculating the support and confidence of each association relation, and quantifying the influence of the fault causes on power transmission line faults.
The invention has the following beneficial effects:
1. The invention uses the Relief-F algorithm to calculate the influence weight of each fault cause on power transmission line faults; the algorithm is based on a verification method, which improves the accuracy of the cause weight calculation. The power transmission line fault information and fault causes are integrated, and the fault data are converted into a fault fuzzy set through membership functions. Processing the fault data with fuzzy logic achieves data fusion and preprocessing, allows fault causes to be analyzed effectively even when the data are fuzzy or incomplete, and provides a more accurate and stable data basis for subsequent analysis and mining. The fuzzy frequent itemset mining algorithm reveals hidden association relations among power transmission line faults, and calculating support and confidence makes the internal relations among the fault causes easier to understand, enabling detailed analysis and quantitative evaluation of the fault causes. By integrating power transmission line fault information with extreme weather information and applying fuzzy logic in the data preprocessing stage, the invention fuses and uses multi-source data fully. By using the Relief-F algorithm and the fuzzy frequent itemset mining algorithm, it mines the weights and association relations of fault causes in depth, which improves the depth and breadth of fault cause analysis. By introducing support and confidence calculation into the mining algorithm, it analyzes the fault causes quantitatively.
The invention therefore improves the mining, analysis and quantification of power transmission line fault causes: it mines fault causes in depth, quantifies their influence on power transmission line faults, addresses the key problems of the prior art, and provides strong support for the stable operation of the power system.
2. By accurately analyzing and quantifying power transmission line fault causes under extreme weather conditions, the invention enables power system operators to take preventive measures or design coping strategies against potential faults, which enhances the stability and reliability of the whole power system. Quantifying the influence of fault causes on power transmission line faults provides a more scientific and explicit basis for decisions, helps operators optimize operation and maintenance strategies and allocate resources effectively, and reduces the risk of large-area power failures caused by extreme weather. The in-depth analysis and evaluation of fault causes in extreme weather helps the power system quickly locate areas at risk of faults when an extreme weather event occurs, and improves the efficiency of emergency response and handling. Through long-term fault cause mining and analysis, the invention can provide data support for the construction and planning of the power system, for example where to add protective measures or how to optimize line design, so as to reduce future risks. In addition, the method promotes further research and development of related technologies in the field of power systems, such as the application of big data processing and artificial intelligence algorithms to power system analysis and optimization.
Drawings
FIG. 1 is a schematic diagram of the index membership distribution rule used in the present invention;
FIG. 2 is a schematic diagram of the construction of an initial fuzzy table embodying the present invention;
FIG. 3 is a graph of overall correlation analysis of fault factors obtained by the present invention.
Detailed Description
The technical solutions of the present invention will be clearly and completely described below with reference to the drawings and specific embodiments of the present invention.
A method for mining power transmission line fault causes under extreme weather conditions comprises the following steps:
collecting power transmission line fault information and extreme weather information, and constructing a candidate feature library of power transmission line fault causes;
based on the significance and weight of the features, calculating the influence weight of each candidate fault cause on power transmission line faults using the Relief-F algorithm, to obtain the power transmission line fault causes;
integrating the power transmission line fault information with the power transmission line fault causes, and converting the fault data into a fault fuzzy set through membership functions;
for the obtained fault fuzzy set, evaluating the association relations of power transmission line faults using a fuzzy frequent itemset mining algorithm, calculating the support and confidence of each association relation, and quantifying the influence of the fault causes on power transmission line faults.
Further, collecting the power transmission line fault information and extreme weather information and constructing the candidate feature library of power transmission line fault causes includes:
acquiring power transmission line fault information from a power information system, where the fault information includes the operating years, insulation rate and fault position data of overhead conductors, towers and insulators;
acquiring extreme weather information from a comprehensive weather observation system, where the extreme weather information includes the weather parameters wind speed, rainfall, humidity, temperature and air pressure;
integrating the obtained power transmission line fault information and extreme weather information, and recording the faulty power transmission line, fault time and fault reason in a table, one entry per row;
classifying each fault sample, and screening and recording the line insulation rate, wind speed and rainfall feature data;
constructing the candidate feature library B of power transmission line fault causes based on the screened data.
Further, calculating the influence weight of each fault cause on power transmission line faults using the Relief-F algorithm, based on the significance and weight of the features, to obtain the power transmission line fault causes, includes:
the candidate feature library B of power transmission line fault causes contains $m$ fault samples, each sample contains $n$ meteorological features $A_i$ ($i = 1, 2, \ldots, n$) of the power transmission line (including overhead conductors, towers and insulators), the feature weight vector is $W = \{W_1, W_2, W_3, \ldots, W_n\}$, and the number of iterations is $N$; the specific steps of the feature weight calculation are as follows:
setting the initial weights of all features to zero, i.e. $W_i = 0$, and normalizing all fault sample data;
randomly selecting a fault sample GZ from the data set, and selecting $r$ nearest-neighbor samples $H_j$ ($j = 1, 2, \ldots, r$) from the samples of the same fault class as GZ; similarly, selecting $r$ nearest-neighbor samples $M_j$ ($j = 1, 2, \ldots, r$) from the samples whose fault class differs from that of GZ;
The feature weights are calculated from the following equation:
where $l$ is the iteration count, $class(GZ)$ denotes the fault class of sample GZ, $C \neq class(GZ)$ denotes a fault class $C$ different from that of GZ, $p(C)$ denotes the proportion of samples of class $C$ in the total number of samples, $M_j(C)$ denotes the $j$-th fault sample of fault class $C$, and $diff(A_i, GZ, H_j)$ denotes the distance between sample $H_j$ and sample GZ on feature $A_i$;
the distance between sample $H_j$ and sample GZ is defined as follows:
where $A_{i,GZ}$ denotes the value of fault sample GZ on feature $A_i$; similarly, $A_{i,H_j}$ denotes the value of fault sample $H_j$ on feature $A_i$;
repeating the feature weight calculation steps until the iteration count $l = N$, and outputting the feature weight vector $W$, i.e. listing the $n$ features one by one, labeling each with its weight, and storing them in a table;
setting a threshold, and taking a feature as a power transmission line fault cause when its element in $W$ is greater than the threshold.
In this embodiment, the features in the candidate feature library are screened with the Relief-F algorithm, with the number of iterations N = 30 and the number of nearest-neighbor samples r = 10. Because the mean weight drops sharply between rank 7 and rank 8, the threshold is set to 0.08 and the 7 features with the highest mean weight are used as the condition features for power transmission line fault analysis (see Table 1). Adding the faulty power transmission line category gives a total of 8 features, which form the accident association feature library.
TABLE 1 Accident correlation characteristics library
Further, performing data integration on the power transmission line fault information and the power transmission line fault causes, and converting the fault data into a fault fuzzy set through membership functions, includes:
processing in a database, using the data streams of the SCADA and IMOS systems to screen out the data that share the same geocoding and time distribution, and generating the entry correspondence between the fault information table and the meteorological information table;
performing a preliminary audit of all data, deleting records that contain erroneous fields or too many missing values, and replacing specific fields, for example replacing the characters in the 'fault position' field with numbers, so as to ensure data quality and facilitate program operation;
integrating the processed meteorological data and fault data into groups of data according to time distribution, wherein each group of data comprises the operation years, typical fault causes and fault occurrence position and is recorded in the form of a row vector;
let $I = \{i_1, i_2, \dots, i_m\}$ be the set of all items in the quantitative database $QD = \{T_1, T_2, \dots, T_n\}$, where any transaction $T_q \in QD$ is a subset of I, and the association between transactions is expressed as follows:
$$T_a \rightarrow T_b \tag{3}$$
where $T_a$ and $T_b$ are both transactions, i.e., sets of items $i_k$ ($k = 1, 2, \dots, m$), and have
Let the quantity value of the i-th item of a transaction $T_q$ in the quantitative database QD be $v_{iq}$, with membership degrees $\mu_j(v_{iq})$ ($j = 1, 2, \dots, h$) at the semantic levels $R_i = \{R_{i1}, R_{i2}, \dots, R_{ih}\}$. The fuzzy item $f_{iq}$ corresponding to the i-th item of transaction $T_q$ is then defined as follows:
where the membership degree $\mu_j(v_{iq})$ corresponding to $v_{iq}$ is determined by the membership function, and each term of $f_{iq}$ gives the membership degree of transaction $T_q$ with respect to semantic level $R_{il}$ ($l = 1, 2, \dots, h$); the higher the membership degree, the more representative the semantic level. The fuzzy set $F_q$ of transaction $T_q$ is defined as follows:
$$F_q = \{f_{1q}, f_{2q}, \dots, f_{hq}\} \tag{5}$$
triangular distributions are selected to describe the indexes quantitatively, and the membership functions are divided into three semantic levels, Low, Middle and High (see Fig. 1); the fuzzy set corresponding to each transaction is obtained according to formulas (3)-(5), converting the quantity information into semantic terms with membership degrees.
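As an illustration of the fuzzification step, the sketch below maps one quantity value to membership degrees for the Low, Middle and High levels. The break points 0, 4.8 and 9.6 are the wind speed values this document quotes from Table 5; the exact shape of the functions (shoulders at both ends, a triangle in the middle) is an assumption, since Fig. 1 is not reproduced here, and the function name `fuzzify` is hypothetical.

```python
def fuzzify(v, low=0.0, mid=4.8, high=9.6):
    """Map a quantity v to membership degrees for the Low/Middle/High levels."""
    if v <= low:
        return {"Low": 1.0, "Middle": 0.0, "High": 0.0}
    if v >= high:
        return {"Low": 0.0, "Middle": 0.0, "High": 1.0}
    if v <= mid:
        m = (v - low) / (mid - low)  # rising edge of Middle, falling edge of Low
        return {"Low": 1.0 - m, "Middle": m, "High": 0.0}
    m = (v - mid) / (high - mid)  # rising edge of High, falling edge of Middle
    return {"Low": 0.0, "Middle": 1.0 - m, "High": m}


# A wind speed of 2.4 is half Low and half Middle under these assumed break points.
wind_fuzzy_item = fuzzify(2.4)
```

Each returned dictionary plays the role of one fuzzy item $f_{iq}$; collecting the dictionaries of all items of a transaction gives its fuzzy set $F_q$.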
In this embodiment, in order to mine the association relations, the processed meteorological data and fault data are integrated into groups of data according to time distribution; each group of data includes the operation years, typical fault causes and fault occurrence position and is recorded in a table as a row vector, thereby constructing the typical fault cause library shown in Table 2.
Table 2 Typical fault cause library
The typical fault cause library is then converted into a quantitative database: the fault causes in each group of data are quantified into numerical values and recorded in a table as row vectors, as shown in Table 3.
Table 3 Line fault quantitative database
Twenty groups are selected for display. Note that the operation years, fault position and line insulation rate in Table 3 are originally character fields; they are replaced by mapped values to facilitate fuzzy frequent item mining. The mapped values depend on the choice of membership functions, and the mapping is shown in Table 4.
Table 4 Character mapping replacement table
The distribution parameters of each index are determined with reference to the meteorological grading criteria for indexes such as wind speed and rainfall; the specific values are shown in Table 5.
Table 5 Membership function distribution parameter settings
Further, evaluating the power transmission line fault association relations with a fuzzy frequent item mining algorithm on the obtained fault fuzzy set, and calculating their support and confidence, includes:
constructing a fuzzy list (Fuzzy-list) from the fuzzy set data;
mining the Fuzzy-list to obtain the fuzzy frequent item sets;
and quantifying the influence of the fault causes on power transmission line faults.
Still further, constructing the Fuzzy-list from the fuzzy set data (see Fig. 2) specifically comprises:
1) Given a semantic level $R_{il}$, the transaction identifier (tid) is set to the subscript q of the transaction in the transaction set, i.e., tid = q;
2) The membership degree of the fuzzy item corresponding to the highest-support semantic level $R_{it}$ in transaction $T_q$ is called the endogenous fuzzy value (internal fuzzy value), denoted $\mathrm{if}(R_{it}, T_q)$, and defined as:
where $\mathrm{MAX}(\mathrm{fsup}(R_{il}))$ is the maximum of the support degrees of all semantic levels corresponding to the item.
3) In transaction $T_q$, the largest endogenous fuzzy value other than $\mathrm{if}(R_{it}, T_q)$, the value corresponding to semantic level $R_{it}$, is called the remaining fuzzy value (resting fuzzy value), denoted $\mathrm{rf}(R_{it}, T_q)$, and defined as:
$$\mathrm{rf}(R_{it}, T_q) = \max\{\, \mathrm{if}(z, T_q) \mid z \in (T_q / R_{it}) \,\} \tag{7}$$
where $(T_q / R_{il})$ is the set of semantic levels in transaction $T_q$ other than $R_{il}$. It should be noted that, to determine the Fuzzy-list parameters of a k-item set, it is only necessary to replace $R_{il}$ with $\{R_{i1}, R_{i2}, \dots, R_{ik}\}$ and then solve for the parameters.
Still further, mining the Fuzzy-list to obtain the fuzzy frequent item sets specifically comprises:
1) Constructing the endogenous fuzzy total value and the remaining fuzzy total value. The fuzzy relations are mined by comparing the Fuzzy-list against the minimum support base, and the endogenous fuzzy total value ifsum(X) and the remaining fuzzy total value rfsum(X) of an item set X are constructed as follows:
where the item set $X = \{R_{i1}, R_{i2}, \dots, R_{ik}\}$, $k = 1, 2, \dots, h$. If the endogenous fuzzy total value ifsum(X) of item set X is greater than the minimum support base $\delta \times |QD|$, X is called a fuzzy frequent item set, where $\delta$ is the minimum support and $|QD|$ is the number of transactions in the data set; if the remaining fuzzy total value rfsum(X) of item set X is smaller than the minimum support base, none of the supersets of X is a fuzzy frequent item set;
2) Because the association between a single fault factor and a power transmission line fault is too ambiguous, association relations involving two or more factors are selected for analysis; their support and confidence are calculated, and the results are recorded in a table by faulty power transmission line, fault factors, support and confidence.
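The frequency test and the rule metrics can be illustrated with a short sketch. The formula for ifsum(X) is not reproduced in the text, so the code assumes the common fuzzy-mining convention that the value of a multi-level item set in a transaction is the minimum of its members' membership degrees, that support is ifsum divided by the number of transactions, and that confidence is the ratio of the two ifsum values. All names and data are hypothetical, and the rfsum pruning is omitted for brevity.

```python
def itemset_ifsum(memberships, itemset):
    """Sum over transactions of the minimum membership among the item set's levels."""
    total = 0.0
    for trans in memberships:
        if all(level in trans for level in itemset):
            total += min(trans[level] for level in itemset)
    return total


def is_frequent(memberships, itemset, min_support):
    """Fuzzy frequent if ifsum exceeds the minimum support base delta * |QD|."""
    return itemset_ifsum(memberships, itemset) > min_support * len(memberships)


def rule_metrics(memberships, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    both = itemset_ifsum(memberships, antecedent + consequent)
    base = itemset_ifsum(memberships, antecedent)
    support = both / len(memberships)
    confidence = both / base if base > 0 else 0.0
    return support, confidence


# Hypothetical retained levels per transaction; "line.Fault" marks a faulted line.
memberships = [
    {"wind.High": 0.8, "rain.Middle": 0.6, "line.Fault": 1.0},
    {"wind.High": 0.5, "rain.Middle": 1.0, "line.Fault": 1.0},
    {"wind.High": 1.0, "rain.Middle": 0.7},
    {"rain.Middle": 0.4, "line.Fault": 1.0},
]
support, confidence = rule_metrics(
    memberships, ["wind.High", "rain.Middle"], ["line.Fault"])
```

The antecedent here has two factors, in line with the choice to analyze only associations involving two or more factors.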
Furthermore, quantifying the influence of the fault causes on power transmission line faults specifically comprises:
1) Evaluating each item in the fuzzy frequent item sets: using the obtained fuzzy frequent item sets, an influence magnitude is assigned to each fault cause or combination of fault causes;
2) Considering that different fault causes may affect the power transmission line to different degrees, a weight coefficient is assigned to each fault cause or combination of fault causes;
3) Calculating the weighted fault influence magnitude: each influence magnitude obtained in step 1) is multiplied by the corresponding weight coefficient to obtain the weighted fault influence magnitude;
4) Sorting all weighted fault influence magnitudes;
5) Outputting the sorted fault causes and their weighted fault influence magnitudes as the quantified result of the influence of the fault causes on power transmission line faults.
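Steps 1) to 5) amount to a weighted ranking, sketched below. The text does not say how the influence magnitudes or the weight coefficients are chosen, so both dictionaries are hypothetical placeholders (a rule's confidence would be one possible magnitude), and a missing weight is assumed to default to 1.0.

```python
def rank_fault_causes(impact, weights):
    """Multiply each influence magnitude by its weight and sort in descending order."""
    weighted = {cause: impact[cause] * weights.get(cause, 1.0) for cause in impact}
    return sorted(weighted.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical influence magnitudes and weight coefficients per cause combination.
impact = {
    "wind.High+rain.Middle": 0.6,
    "ice.High": 0.5,
    "wind.High+insulation.Low": 0.4,
}
weights = {
    "wind.High+rain.Middle": 0.5,
    "ice.High": 1.0,
    "wind.High+insulation.Low": 2.0,
}
ranked = rank_fault_causes(impact, weights)
```

The first element of `ranked` is the cause combination with the largest weighted influence, which corresponds to the output of step 5).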
In this embodiment, because the association between a single fault factor and a power transmission line fault is too ambiguous and has no definite directivity, association relations involving two or more factors are selected for analysis; their support and confidence are calculated, and the results are recorded in a table by faulty power transmission line, fault factors, support and confidence, as shown in Table 6.
Table 6 Strong association rules for power transmission line faults
Comparing the evaluation results with the fault features obtained earlier gives the causes of faults at different positions of the power transmission line. The overall association analysis of the fault factors is shown in Fig. 3, where a wider line represents a stronger association with power transmission line faults.
To further verify the accuracy of the algorithm of the invention relative to other algorithms for power transmission line fault events, an additional number of groups of power transmission line fault data and the corresponding meteorological data are selected from the database as a test set.
First, the strong association rule mining results are compared. The comparison algorithm is the classical Apriori algorithm; because traditional association rule mining algorithms cannot process quantitative data, the quantitative data are first classified directly by interval. The intervals use the same boundaries as the membership functions of the algorithm of the invention. For example, the wind speed membership levels in Table 5 are divided at $v_{low} = 0$, $v_{mid} = 4.8$ and $v_{hig} = 9.6$, so in the classical Apriori algorithm the low wind speed $v_{low}$ interval is (0, 4.8], the medium wind speed $v_{mid}$ interval is (4.8, 9.6], and the high wind speed $v_{hig}$ interval is (9.6, +∞). The comparison algorithm thus uses the same division standard as the algorithm of the invention, performs fault association analysis on the same database, and applies the same screening rule: association rules ranked in the top five by support and in the top five by confidence are taken as strong association rules. The highest support and confidence among the strong association rules obtained by the Apriori algorithm are compared with the results of the method of the invention in Table 7.
Table 7 Comparative analysis of the algorithms
As can be seen from Table 7, the highest support among the strong association rules obtained by the proposed method is higher than that obtained by the Apriori method. The classical Apriori method divides the boundaries too rigidly: hard interval division easily misclassifies data near a boundary, which lowers the support of the association rules. Introducing fuzzy set theory softens the boundaries, and describing the data by membership degrees avoids the either-or misjudgment of boundary data, so the mined association rules are more valuable and have higher support.
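The boundary effect described here can be seen on two wind speeds that lie just either side of 4.8. The sketch below uses the interval boundaries quoted above for the hard division and an assumed triangular Middle membership function with the same break points; both function names are hypothetical.

```python
def hard_level(v, mid=4.8, high=9.6):
    """Interval classification used for the Apriori comparison."""
    if v <= mid:
        return "Low"
    if v <= high:
        return "Middle"
    return "High"


def middle_membership(v, low=0.0, mid=4.8, high=9.6):
    """Assumed triangular membership of the Middle level, peaking at mid."""
    if v <= low or v >= high:
        return 0.0
    if v <= mid:
        return (v - low) / (mid - low)
    return (high - v) / (high - mid)


# Two nearly identical wind speeds on opposite sides of the 4.8 boundary.
below, above = 4.7, 4.9
hard_labels = (hard_level(below), hard_level(above))
soft_degrees = (middle_membership(below), middle_membership(above))
```

The hard division places the two readings in different classes, whereas their Middle membership degrees are almost equal, so both contribute nearly the same amount to the support of a rule involving medium wind speed.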
For the accuracy comparison, given the test set data, the membership degrees for each semantic level are obtained according to fuzzy set theory, and the representative semantic levels are selected according to the maximum membership principle and the interval division principle respectively to obtain the association indexes. These are then compared with the strong association rules mined by the method of the invention and by the classical Apriori algorithm; a judgment is counted as successful if the association rule conditions are met. The accuracy of the method of the invention and of the Apriori method under the maximum membership principle and the interval division principle is shown in Table 8.
Table 8 Algorithm accuracy analysis
As can be seen from Table 8, whether the test data are processed according to the membership principle of the algorithm of the invention or the interval division principle of the Apriori algorithm, the association rules mined by the algorithm of the invention are more accurate. This indicates that the association rules obtained by the algorithm of the invention have a degree of generality, outperform the classical Apriori method under different data division principles, and have guiding value for weak link identification and for practical production and daily life.
Example 2
A device for mining power transmission line fault causes under extreme weather conditions, comprising:
a fault cause candidate feature library construction module, used for collecting power transmission line fault information and extreme weather information and constructing a candidate feature library of power transmission line fault causes;
a power transmission line fault cause acquisition module, used for calculating, with the Relief-F algorithm and based on the significance and weight of the features, the influence weight of each fault cause on power transmission line faults to obtain the power transmission line fault causes;
a fault fuzzy set construction module, used for performing data integration on the power transmission line fault information and the power transmission line fault causes and converting the fault data into a fault fuzzy set through membership functions;
and a fault cause quantification module, used for evaluating the power transmission line fault association relations with a fuzzy frequent item mining algorithm on the obtained fault fuzzy set, calculating their support and confidence, and quantifying the influence of the fault causes on power transmission line faults.
Those skilled in the art will readily appreciate that the foregoing is merely a preferred embodiment of the invention and is not intended to limit it; any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within its scope.
Claims (10)
1. A method for mining power transmission line fault causes under extreme weather conditions, characterized by comprising the following steps:
collecting power transmission line fault information and extreme weather information, and constructing a candidate feature library of power transmission line fault causes;
calculating, with the Relief-F algorithm and based on the significance and weight of the features, the influence weight of each fault cause on power transmission line faults to obtain the power transmission line fault causes;
performing data integration on the power transmission line fault information and the power transmission line fault causes, and converting the fault data into a fault fuzzy set through membership functions;
evaluating the power transmission line fault association relations with a fuzzy frequent item mining algorithm on the obtained fault fuzzy set, calculating their support and confidence, and quantifying the influence of the fault causes on power transmission line faults.
2. The method for mining power transmission line fault causes under extreme weather conditions according to claim 1, wherein collecting the power transmission line fault information and the extreme weather information and constructing the candidate feature library of power transmission line fault causes comprises:
acquiring power transmission line fault information from a power information system;
acquiring extreme weather information from a comprehensive weather observation system;
integrating the power transmission line fault information and the extreme weather information, and recording the faulty power transmission line, fault time and fault reason entry by entry in a table;
classifying each fault sample, and screening and recording the line insulation rate, wind speed and rainfall feature data;
and constructing a candidate feature library B of power transmission line fault causes based on the screened data.
3. The method for mining power transmission line fault causes under extreme weather conditions according to claim 2, wherein calculating, with the Relief-F algorithm and based on the significance and weight of the features, the influence weight of each fault cause on power transmission line faults to obtain the power transmission line fault causes comprises:
the candidate feature library B of power transmission line fault causes contains m fault samples, each sample has n power transmission line meteorological features $A_i$ ($i = 1, 2, \dots, n$), the feature weight vector is $W = \{W_1, W_2, W_3, \dots, W_n\}$, the number of iterations is N, and the feature weights are calculated as follows:
setting the initial weights of all features to zero, i.e., $W_i = 0$, and normalizing all fault sample data;
randomly selecting a fault sample GZ from the data set, and selecting the r nearest-neighbor samples $H_j$ ($j = 1, 2, \dots, r$) from the samples with the same fault class as GZ; similarly, selecting the r nearest-neighbor samples $M_j$ ($j = 1, 2, \dots, r$) from the samples whose fault class differs from that of GZ, each feature weight being calculated by the following formula:
where l is the iteration number, class(GZ) is the fault class of sample GZ, $C \neq \mathrm{class}(GZ)$ denotes a fault class C different from that of GZ, p(C) is the proportion of samples of class C in the total number of samples, $M_j(C)$ is the j-th nearest-neighbor sample of fault class C, and $\mathrm{diff}(A_i, GZ, H_j)$ is the distance between sample $H_j$ and sample GZ on feature $A_i$;
repeating the feature weight calculation until the iteration count reaches $l = N$, and outputting the feature weight vector W;
setting a threshold, and taking a feature as a power transmission line fault cause when its element in W is larger than the threshold.
4. The method for mining power transmission line fault causes under extreme weather conditions according to claim 3, wherein the distance between sample $H_j$ and sample GZ is defined as follows:
where $A_{i,GZ}$ is the value of fault sample GZ on feature $A_i$; similarly, $A_{i,H_j}$ is the value of fault sample $H_j$ on feature $A_i$.
5. The method for mining power transmission line fault causes under extreme weather conditions according to claim 3 or 4, wherein performing data integration on the power transmission line fault information and the power transmission line fault causes, selecting membership functions, and converting the fault data into a fault fuzzy set comprises:
let $I = \{i_1, i_2, \dots, i_m\}$ be the set of all items in the quantitative database $QD = \{T_1, T_2, \dots, T_n\}$, where any transaction $T_q \in QD$ is a subset of I, and the association between transactions is:
$$T_a \rightarrow T_b \tag{3}$$
where $T_a$ and $T_b$ are both transactions, i.e., sets of items $i_k$ ($k = 1, 2, \dots, m$), and have
let the quantity value of the i-th item of a transaction $T_q$ in the quantitative database QD be $v_{iq}$, with membership degrees $\mu_j(v_{iq})$ ($j = 1, 2, \dots, h$) at the semantic levels $R_i = \{R_{i1}, R_{i2}, \dots, R_{ih}\}$; the fuzzy item $f_{iq}$ corresponding to the i-th item of transaction $T_q$ is defined as follows:
where the membership degree $\mu_j(v_{iq})$ corresponding to $v_{iq}$ is determined by the membership function, and each term of $f_{iq}$ gives the membership degree of transaction $T_q$ with respect to semantic level $R_{il}$ ($l = 1, 2, \dots, h$); the higher the membership degree, the more representative the semantic level; the fuzzy set $F_q$ of transaction $T_q$ is defined as follows:
$$F_q = \{f_{1q}, f_{2q}, \dots, f_{hq}\} \tag{5}$$
triangular distributions are selected to describe the indexes quantitatively, the membership functions are divided into three semantic levels, Low, Middle and High, and the fuzzy set corresponding to each transaction is obtained according to formulas (3)-(5), converting the quantity information into semantic terms with membership degrees.
6. The method for mining power transmission line fault causes under extreme weather conditions according to claim 5, wherein evaluating the power transmission line fault association relations with a fuzzy frequent item mining algorithm on the obtained fault fuzzy set, and calculating their support and confidence, comprises:
constructing a fuzzy list (Fuzzy-list) from the fuzzy set data;
mining the Fuzzy-list to obtain the fuzzy frequent item sets;
and quantifying the influence of the fault causes on power transmission line faults.
7. The method for mining power transmission line fault causes under extreme weather conditions according to claim 6, wherein constructing the Fuzzy-list from the fuzzy set data comprises:
1) Given a semantic level $R_{il}$, the transaction identifier (tid) is set to the subscript q of the transaction in the transaction set, i.e., tid = q;
2) The membership degree of the fuzzy item corresponding to the highest-support semantic level $R_{it}$ in transaction $T_q$ is called the endogenous fuzzy value, denoted $\mathrm{if}(R_{it}, T_q)$, and defined as:
where $\mathrm{MAX}(\mathrm{fsup}(R_{il}))$ is the maximum of the support degrees of all semantic levels corresponding to the item;
3) In transaction $T_q$, the largest endogenous fuzzy value other than $\mathrm{if}(R_{it}, T_q)$, the value corresponding to semantic level $R_{it}$, is called the remaining fuzzy value, denoted $\mathrm{rf}(R_{it}, T_q)$, and defined as:
$$\mathrm{rf}(R_{it}, T_q) = \max\{\, \mathrm{if}(z, T_q) \mid z \in (T_q / R_{it}) \,\} \tag{7}$$
where $(T_q / R_{il})$ is the set of semantic levels in transaction $T_q$ other than $R_{il}$.
8. The method for mining power transmission line fault causes under extreme weather conditions according to claim 7, wherein mining the Fuzzy-list to obtain the fuzzy frequent item sets comprises:
1) Constructing the endogenous fuzzy total value and the remaining fuzzy total value, and mining the fuzzy relations by comparing the Fuzzy-list against the minimum support base, wherein the endogenous fuzzy total value ifsum(X) and the remaining fuzzy total value rfsum(X) of an item set X are constructed as follows:
where the item set $X = \{R_{i1}, R_{i2}, \dots, R_{ik}\}$; if the endogenous fuzzy total value ifsum(X) of item set X is greater than the minimum support base $\delta \times |QD|$, X is called a fuzzy frequent item set, where $\delta$ is the minimum support and $|QD|$ is the number of transactions in the data set; if the remaining fuzzy total value rfsum(X) of item set X is smaller than the minimum support base, none of the supersets of X is a fuzzy frequent item set;
2) Selecting association relations involving two or more factors for analysis, calculating their support and confidence, and recording the results in a table by faulty power transmission line, fault factors, support and confidence.
9. The method for mining power transmission line fault causes under extreme weather conditions according to claim 8, wherein quantifying the influence of the fault causes on power transmission line faults comprises:
assigning an influence magnitude to each fault cause or combination of fault causes by using the fuzzy frequent item sets;
assigning a weight coefficient to each fault cause or combination of fault causes according to the different degrees to which different fault causes affect the power transmission line;
multiplying each influence magnitude by the corresponding weight coefficient to obtain the weighted fault influence magnitude;
sorting the fault causes by their weighted fault influence magnitudes;
and taking the sorted fault causes and their weighted fault influence magnitudes as the quantified result of the influence of the fault causes on power transmission line faults.
10. A device for mining power transmission line fault causes under extreme weather conditions, comprising:
a fault cause candidate feature library construction module, used for collecting power transmission line fault information and extreme weather information and constructing a candidate feature library of power transmission line fault causes;
a power transmission line fault cause acquisition module, used for calculating, with the Relief-F algorithm and based on the significance and weight of the features, the influence weight of each fault cause on power transmission line faults to obtain the power transmission line fault causes;
a fault fuzzy set construction module, used for performing data integration on the power transmission line fault information and the power transmission line fault causes and converting the fault data into a fault fuzzy set through membership functions;
and a fault cause quantification module, used for evaluating the power transmission line fault association relations with a fuzzy frequent item mining algorithm on the obtained fault fuzzy set, calculating their support and confidence, and quantifying the influence of the fault causes on power transmission line faults.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311329362.0A CN117520407A (en) | 2023-10-13 | 2023-10-13 | Method and device for mining power transmission line fault causes under extreme weather conditions |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311329362.0A CN117520407A (en) | 2023-10-13 | 2023-10-13 | Method and device for mining power transmission line fault causes under extreme weather conditions |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117520407A true CN117520407A (en) | 2024-02-06 |
Family
ID=89755729
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311329362.0A Pending CN117520407A (en) | 2023-10-13 | 2023-10-13 | Method and device for mining power transmission line fault causes under extreme weather conditions |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117520407A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||