CN111652448A - Method, device and system for predicting defect rate of power transmission line - Google Patents

Info

Publication number
CN111652448A
Authority
CN
China
Prior art keywords
transmission line
power transmission
training
defect rate
predicting
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010651743.0A
Other languages
Chinese (zh)
Inventor
陈松波
叶志健
李少鹏
胡金磊
叶万余
欧阳业
罗敏辉
王星华
曾勇斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingyuan Power Supply Bureau of Guangdong Power Grid Co Ltd
Original Assignee
Qingyuan Power Supply Bureau of Guangdong Power Grid Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Abstract

The embodiments of the invention disclose a method, a device and a system for predicting the defect rate of a power transmission line. The prediction method comprises: acquiring feature data of the factors that influence transmission line defects, and classifying the feature data to form data samples to be processed; forming training samples and test samples from the data samples to be processed, wherein the training samples comprise a training set and a validation set; constructing a regression prediction model and training it with the training samples to obtain an importance ranking of the influence factors; obtaining an optimal feature set according to that importance ranking; and predicting the defect rate of the power transmission line from the optimal feature set and the test samples. The technical solution predicts the future defect rate of individual sections of a power transmission line, provides an important reference for the maintenance work of the transmission management department, and facilitates maintenance of the line.

Description

Method, device and system for predicting defect rate of power transmission line
Technical Field
The embodiment of the invention relates to the technical field of prediction of defect rates of power transmission lines, in particular to a method, a device and a system for predicting defect rates of power transmission lines.
Background
A transmission line is a line connecting two public substations. It operates at high voltage and generally refers to lines with a voltage level of 35 kV and above.
At present, in power grid operation and maintenance work in China, research on transmission line defects focuses mainly on state evaluation methods; little work addresses the processing and analysis of the factors related to line defects or the establishment of prediction models. Moreover, existing work often considers only the overall state of a line and ignores the individual, regional and influence-factor differences between sections of the same line. These differences lead to different defect occurrence in different sections, and ignoring them hinders maintenance of the transmission line.
Disclosure of Invention
The embodiment of the invention provides a method, a device and a system for predicting the defect rate of a power transmission line, which are used for predicting the defect rate of the power transmission line in different sections and facilitating the maintenance of the power transmission line.
In a first aspect, an embodiment of the present invention provides a method for predicting a defect rate of a power transmission line, where the method is applied to a section of the power transmission line, and the method includes:
acquiring characteristic data of the defect influence factors of the power transmission line, and classifying the characteristic data to form a data sample to be processed;
forming a training sample and a testing sample according to the data sample to be processed; wherein the training samples comprise a training set and a validation set;
constructing a regression prediction model and training the regression prediction model through the training samples to obtain the importance degree sequence of the influence factors;
obtaining an optimal feature set according to the importance ranking of the influence factors;
and predicting the defect rate of the power transmission line by combining the optimal feature set with the test sample.
Optionally, the obtaining the characteristic data of the influence factors of the defect of the power transmission line, and classifying the characteristic data to form a to-be-processed data sample includes:
acquiring historical characteristic data of the influence factors of the defects of the power transmission line according to historical defect records;
classifying historical characteristic data of vector characteristics in the influence factors of the defects of the power transmission line by a K-means clustering method and forming corresponding clustering numbers;
and forming a data sample to be processed according to the cluster number of the vector characteristics in the electric transmission line defect influence factors and the characteristic data of the non-vector characteristics in the electric transmission line defect influence factors.
Optionally, the forming a training sample and a testing sample according to the data sample to be processed includes:
dividing a certain amount of data samples to be processed as training samples, and taking the rest data samples to be processed as test samples;
and dividing the training samples into mutually exclusive K groups of subsets, wherein one group is used as a verification set, and the rest part is used as a training set.
Optionally, the influence factors of the defect rate of the power transmission line are classified as meteorological features, topographic features, line parameter features (the line's own characteristics), and line operation features; the training samples are represented as:
{Q, D, P, O, R}, where Q denotes the meteorological features, D the topographic features, P the line's own characteristics, O the line operation features, and R the defect rate of the transmission line in the same month of the historical record.
Optionally, the defect rate is determined based on:
$R_t = N_t / T_t$, where $R_t$ denotes the defect rate of a section of the transmission line, $N_t$ denotes the total number of devices in that section found defective in the current period, and $T_t$ denotes the total number of devices in that section.
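As a minimal sketch of the ratio above, the following Python function computes the defect rate of one section for one period. The function name, the argument names and the sample counts are illustrative assumptions and do not come from the patent text.

```python
def defect_rate(defective_devices: int, total_devices: int) -> float:
    """Return R_t = N_t / T_t for one line section in one period."""
    if total_devices <= 0:
        raise ValueError("a section must contain at least one device")
    if not 0 <= defective_devices <= total_devices:
        raise ValueError("defective count must lie between 0 and the total")
    return defective_devices / total_devices


# Hypothetical section: 3 of its 40 devices were found defective this month.
rate = defect_rate(3, 40)
print(f"defect rate: {rate:.3f}")
```

The input checks are an added safeguard; the source only defines the ratio itself.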
Optionally, the constructing a regression prediction model and training the regression prediction model through the training samples to obtain the importance ranking of the impact features includes:
constructing an SVM regression prediction model based on an RBF kernel function, and searching for the optimal kernel function parameters by a grid search method combined with K-fold cross-validation;
and training the SVM regression prediction model with the training set according to the SVM-RFE (recursive feature elimination) algorithm, calculating a score for each influence factor, and removing the influence factors one by one according to their scores so as to obtain the importance ranking of the influence factors.
Optionally, the building of the regression prediction model of the SVM based on the RBF kernel function is determined based on the following:
Figure BDA0002575227560000031
where min J denotes minimization of the objective function J,
Figure BDA0002575227560000032
and $\xi_t$ are the two slack variables associated with the upper and lower hyperplanes, c is a regularization constant, and $\|\omega\|$ is the Euclidean norm of $\omega$.
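The formula images referenced above are not reproduced in this text. For orientation only, the standard primal problem of ε-insensitive support vector regression uses the same symbols (J, ω, c and two slack variables) and is usually written as below. That the patent's figure matches this form is an assumption, as are the symbols $\xi_t^{*}$, $\varepsilon$, $b$, $\varphi$ and the sample count $n$, none of which appear in the surviving text.

```latex
\min J = \frac{1}{2}\lVert \omega \rVert^{2} + c \sum_{t=1}^{n} \left( \xi_t + \xi_t^{*} \right)
\quad \text{s.t.} \quad
\begin{cases}
y_t - \omega^{\top}\varphi(x_t) - b \le \varepsilon + \xi_t, \\
\omega^{\top}\varphi(x_t) + b - y_t \le \varepsilon + \xi_t^{*}, \\
\xi_t \ge 0,\ \xi_t^{*} \ge 0, \qquad t = 1, \dots, n.
\end{cases}
```

With an RBF kernel, the mapping $\varphi$ is implicit and enters only through $K(x_i, x_j) = \exp(-\gamma \lVert x_i - x_j \rVert^{2})$, whose parameter $\gamma$ is the kind of kernel function parameter a grid search would tune.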
Optionally, the obtaining an optimal feature set according to the ranking of the importance of the impact features includes:
rotating the training set and the validation set of the prediction model by the K-fold cross-validation method; repeating this k times, and recording the division into training set and validation set that gives the minimum mean square error, together with the corresponding optimal kernel function parameters, thereby obtaining the optimal feature set.
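The K-fold rotation and the kernel parameter search can be sketched as follows. This is a simplified illustration under stated assumptions: a plain RBF-weighted average stands in for the SVM regression model so that the example needs only the Python standard library, the search averages the error over all folds rather than recording the single best division, and all function names and sample values are hypothetical.

```python
from math import exp


def rbf_predict(training, x, gamma):
    """Stand-in regressor (not an SVM): RBF-weighted average of training targets."""
    weights = [exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, x)))
               for xi, _ in training]
    total = sum(weights)
    if total == 0.0:  # every weight underflowed: fall back to the plain mean
        return sum(y for _, y in training) / len(training)
    return sum(w * y for w, (_, y) in zip(weights, training)) / total


def kfold_mse(samples, k, gamma):
    """Average validation MSE over k mutually exclusive, equal-sized folds."""
    size = len(samples) // k
    fold_errors = []
    for i in range(k):
        validation = samples[i * size:(i + 1) * size]             # subset A_i
        training = samples[:i * size] + samples[(i + 1) * size:]  # Y minus A_i
        squared = [(rbf_predict(training, x, gamma) - y) ** 2 for x, y in validation]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / k


def grid_search(samples, k, gammas):
    """Return the gamma with the lowest cross-validated MSE, plus all scores."""
    scores = {gamma: kfold_mse(samples, k, gamma) for gamma in gammas}
    best_gamma = min(scores, key=scores.get)
    return best_gamma, scores


# Hypothetical samples: ((feature_1, feature_2), defect rate).
samples = [((i / 11, (i * 5 % 12) / 11), 0.02 + 0.05 * i / 11) for i in range(12)]
best_gamma, scores = grid_search(samples, k=4, gammas=[0.1, 1.0, 10.0, 100.0])
print(best_gamma, scores[best_gamma])
```

In practice the stand-in regressor would be replaced by an RBF-kernel SVM regressor, and the grid would usually cover the penalty constant c as well as the kernel width.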
In a second aspect, an embodiment of the present invention provides a device for predicting a defect rate of a power transmission line, including:
the acquisition module is used for acquiring characteristic data of the defect influence factors of the power transmission line and classifying the characteristic data to form a data sample to be processed;
the training sample and test sample forming module is used for forming a training sample and a test sample according to the data sample to be processed; wherein the training samples comprise a training set and a validation set;
the model construction module is used for constructing a regression prediction model and training the regression prediction model through the training samples to obtain the importance degree sequence of the influence factors;
the optimal feature set obtaining module is used for obtaining an optimal feature set according to the importance degree sequence of the influence factors;
and the prediction module is used for predicting the defect rate of the power transmission line according to the optimal feature set and the test sample.
In a third aspect, an embodiment of the present invention provides a system for predicting a defect rate of a power transmission line, including the apparatus for predicting a defect rate of a power transmission line according to the second aspect, where the method for predicting a defect rate of a power transmission line according to any one of the first aspect is performed by the apparatus for predicting a defect rate of a power transmission line.
The embodiments of the invention provide a method, a device and a system for predicting the defect rate of a power transmission line. The prediction method comprises: acquiring feature data of the factors that influence transmission line defects, and classifying the feature data to form data samples to be processed; forming training samples and test samples from the data samples to be processed, wherein the training samples comprise a training set and a validation set; constructing a regression prediction model and training it with the training samples to obtain an importance ranking of the influence factors; obtaining an optimal feature set according to that importance ranking; and predicting the defect rate of the power transmission line from the optimal feature set and the test samples. In this technical solution, the feature data of the defect influence factors are classified to form the training samples and test samples, the regression prediction model is trained with the training samples to obtain the importance ranking of the influence factors, and the optimal feature subset is selected by the cross-validated average error. The future defect rates of the different sections of a power transmission line can thus be predicted, which provides an important reference for the maintenance work of the transmission management department and facilitates maintenance of the line.
Drawings
Fig. 1 is a flowchart of a method for predicting a defect rate of a power transmission line according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a method for predicting a defect rate of a power transmission line according to a second embodiment of the present invention;
Fig. 3 is a flowchart of a method for predicting a defect rate of a power transmission line according to a third embodiment of the present invention;
Fig. 4 is a block diagram of a device for predicting a defect rate of a power transmission line according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
This embodiment of the present invention provides a method for predicting the defect rate of a power transmission line. Fig. 1 is a flowchart of the method; referring to Fig. 1, the method includes:
S110: acquiring feature data of the factors that influence transmission line defects, and classifying the feature data to form data samples to be processed.
Specifically, the power transmission line is divided into a plurality of sections, and all equipment between two adjacent towers forms one section. The defect rate of each section is computed from the defects found in the equipment between the two towers: it is the ratio of the number of devices in the section currently found defective to the total number of devices in the section. The factors that influence equipment defects on a transmission line can be grouped into several aspects, for example meteorological features, topographic features, the line's own characteristics, and line operation features. Meteorological factors are key to the occurrence and deterioration of defects in transmission equipment; they may include air temperature, lightning strikes, wind speed, relative humidity and the like. The topographic features include the altitude, slope direction and slope angle of the terrain the line crosses; different terrain produces different microclimates, which affect each section of the line differently. The line's own characteristics mainly comprise line structure parameters (such as conductor length, voltage level and number of sub-conductors per bundle) and equipment defect characteristics (such as the defect rate of the previous a months, the defect elimination period and the average service life of the equipment).
The line operation features mainly cover operating-state characteristics such as line load and heating, which also strongly affect the occurrence of transmission line defects. After the feature data of the defect influence factors within a preset time are acquired, the feature data need to be classified to form the data samples to be processed.
S120, forming a training sample and a test sample according to the data sample to be processed; wherein the training samples comprise a training set and a validation set.
Specifically, the data in the data samples formed by the classification are all discrete values; that is, the feature data of the influence factors with vector attributes have been reduced to scalar values. The data samples to be processed contain the feature data of a plurality of influence factors, including both the scalar values obtained from the influence factors with vector attributes and the feature data of the influence factors that do not change with time. For example, the air temperature among the meteorological features changes with time and is an influence factor with vector attributes, whereas the slope direction among the topographic features does not change with time and is an influence factor without vector attributes. The data samples to be processed are divided into training samples and test samples, and the training samples comprise a training set and a validation set. The training samples are used to train the constructed regression prediction model. The test samples are used to verify the prediction performance of the model and to predict the defect rate of a given section of a transmission line in a future month.
S130, constructing a regression prediction model and training the regression prediction model through training samples to obtain importance degree sequences of the influencing factors.
Specifically, regression modeling is a predictive technique that studies the relationship between a dependent variable (the target) and independent variables (the predictors). It is commonly used in predictive analysis to discover relationships between the dependent and independent variables. In this embodiment, the regression prediction model represents the relationship between the defect influence factors of the power transmission line and the predicted defect rate. Training the regression prediction model with the training samples yields the importance ranking of the influence factors. The importance ranking orders the influence factors under consideration by how closely they are related to transmission line defects; the highest-ranked factors are those most likely to cause defects.
S140: obtaining an optimal feature set according to the importance ranking of the influence factors.
Specifically, after the influence factors have been ranked, an optimal feature set must be selected. The optimal feature set is the one that gives the best prediction model. In each iteration, one influence factor is removed and the corresponding average prediction error is calculated; the feature subset with the minimum average regression error is taken as the optimal feature set. Selecting the optimal feature subset by the cross-validated average error can improve the generalization ability of the regression prediction model to a certain extent.
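The elimination loop described here can be illustrated with a small sketch. It is not the SVM-RFE scoring itself: as a stand-in, each candidate factor is scored by the leave-one-out error of an RBF-weighted average computed without that factor, which keeps the example free of external libraries. The function names, the fixed `gamma` and the toy data (factor 0 drives the target, factor 1 is noise) are assumptions made for illustration.

```python
from math import exp


def loo_mse(rows, targets, features, gamma=1.0):
    """Leave-one-out MSE of an RBF-weighted average restricted to `features`."""
    total = 0.0
    for i, (x, y) in enumerate(zip(rows, targets)):
        num = den = 0.0
        for j, (z, t) in enumerate(zip(rows, targets)):
            if j != i:
                w = exp(-gamma * sum((x[f] - z[f]) ** 2 for f in features))
                num += w * t
                den += w
        total += (num / den - y) ** 2
    return total / len(rows)


def backward_elimination(rows, targets, n_features):
    """Remove one factor per iteration; return ranking and lowest-error subset."""
    remaining = list(range(n_features))
    history = [(loo_mse(rows, targets, remaining), tuple(remaining))]
    removed = []
    while len(remaining) > 1:
        # Drop the factor whose removal leaves the smallest validation error.
        drop = min(remaining, key=lambda f: loo_mse(
            rows, targets, [g for g in remaining if g != f]))
        remaining.remove(drop)
        removed.append(drop)
        history.append((loo_mse(rows, targets, remaining), tuple(remaining)))
    ranking = remaining + removed[::-1]        # most important factor first
    best_error, best_subset = min(history)     # subset with the minimum error
    return ranking, best_subset, best_error


# Hypothetical data: factor 0 determines the defect rate, factor 1 is noise.
rows = [(0.0, 0.3), (1.0, 0.9), (2.0, 0.1), (3.0, 0.7), (4.0, 0.2), (5.0, 0.8)]
targets = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
ranking, best_subset, best_error = backward_elimination(rows, targets, 2)
print(ranking, best_subset)
```

The factors removed last rank highest, and the recorded error history makes it possible to pick the subset with the minimum average error rather than simply keeping a fixed number of factors.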
S150: predicting the defect rate of the power transmission line from the optimal feature set and the test samples.
Specifically, based on the optimal feature set, the test samples are used to predict the defect rate of the power transmission line, giving the defect rate of each section for the future month. For example, to predict the defect rate of a section for the current month, the feature data of that section's defect influence factors for the previous month are acquired and classified to form the data samples to be processed; the test samples may then be the feature data of each influence factor for the most recent days in those data samples. Predicting from the optimal feature set and the test samples gives the defect rate of the section for the future month, which provides an important reference for the maintenance work of the transmission management department. The same method can be used to predict the defect rate of every section of other transmission lines, which facilitates maintenance of all transmission lines.
The method for predicting the defect rate of a power transmission line provided by this embodiment comprises: acquiring feature data of the factors that influence transmission line defects, and classifying the feature data to form data samples to be processed; forming training samples and test samples from the data samples to be processed, wherein the training samples comprise a training set and a validation set; constructing a regression prediction model and training it with the training samples to obtain an importance ranking of the influence factors; obtaining an optimal feature set according to that importance ranking; and predicting the defect rate of the power transmission line from the optimal feature set and the test samples. In this technical solution, the feature data of the defect influence factors are classified to form the training samples and test samples, the regression prediction model is trained with the training samples to obtain the importance ranking of the influence factors, and the optimal feature subset is selected by the cross-validated average error. The future defect rates of the different sections of a power transmission line can thus be predicted, which provides an important reference for the maintenance work of the transmission management department and facilitates maintenance of the line.
Example two
This embodiment of the invention provides a method for predicting the defect rate of a power transmission line that supplements and refines the method of the first embodiment. Here, classifying the feature data to form the data samples to be processed includes classifying, by a K-means clustering method, the historical feature data of the vector features among the defect influence factors and forming the corresponding cluster numbers.
Fig. 2 is a flowchart of a method for predicting a defect rate of a power transmission line according to a second embodiment of the present invention, and referring to fig. 2, the method for predicting includes:
S210: acquiring historical feature data of the defect influence factors of the power transmission line according to the historical defect records.
Specifically, the historical defect records are the defect repair records of the transmission line equipment. They record the time at which defective equipment was discovered, not the time at which the defect occurred. Moreover, a defect in transmission equipment results from the long-term accumulated effect of many factors, such as meteorological and line operation factors. The analysis therefore takes as historical feature data the time-series data of each influence factor over at least the 1 month preceding the recorded time of the defect.
S220: classifying the historical feature data of the vector features among the defect influence factors by a K-means clustering method and forming the corresponding cluster numbers.
Specifically, K-means is an iterative cluster analysis algorithm. K objects are randomly selected as the initial cluster centers; the distance between each object and each cluster center is then calculated, and each object is assigned to its nearest cluster center. A cluster center together with the objects assigned to it represents one cluster. After objects are assigned, the center of each cluster is recalculated from the objects currently in the cluster. This process repeats until a termination condition is met. The termination condition may be that no objects (or fewer than a minimum number) are reassigned to a different cluster, that no cluster centers (or fewer than a minimum number) change, or that the sum of squared errors reaches a local minimum. Objects in the same cluster are highly similar, while objects in different clusters are less similar.
In this embodiment, the K-means clustering method is applied to classify the historical feature data of the vector features among the defect influence factors. For example, the time-series data of the temperature factor among the meteorological features are clustered. Temperature has vector attributes: taking the data for the month before the defect record and the average temperature of each hour yields 720 time-series values. The historical (time-series) data for each influence factor are obtained from the historical defect records and classified by the unsupervised clustering method to obtain cluster numbers. Data with the same cluster number have similar characteristics; for example, if the third day and the sixth day have the same cluster number, their temperatures are similar. After clustering, the time-series data of every influence factor with vector attributes are replaced by cluster numbers, which reduces the dimensionality of the data.
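As a hedged illustration of how daily temperature profiles could be turned into cluster numbers, the sketch below runs a plain K-means loop in standard-library Python. Treating each day's 24 hourly values as one object is an assumption, since the text does not fix the granularity. For reproducibility the initial centers are chosen by a deterministic farthest-first rule instead of the random selection described above. The function name and the temperature values are hypothetical.

```python
from math import dist


def kmeans_labels(objects, k, max_iter=100):
    """Plain K-means; returns one cluster number per object."""
    # Assumption: deterministic farthest-first initialisation (the text
    # describes random initial centres; this keeps the example repeatable).
    centers = [list(objects[0])]
    while len(centers) < k:
        farthest = max(objects, key=lambda o: min(dist(o, c) for c in centers))
        centers.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist(o, centers[j]))
                      for o in objects]
        if new_labels == labels:   # termination: no object changed cluster
            break
        labels = new_labels
        for j in range(k):
            members = [o for o, lab in zip(objects, labels) if lab == j]
            if members:            # keep the old centre if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical data: six days of hourly temperatures (24 values per day).
cold = [4.0 + 0.1 * h for h in range(24)]
warm = [24.0 + 0.1 * h for h in range(24)]
days = [cold, warm, [t + 0.5 for t in cold], warm,
        [t - 0.5 for t in warm], [t + 0.5 for t in cold]]
cluster_numbers = kmeans_labels(days, k=2)
print(cluster_numbers)  # one cluster number per day replaces 24 hourly values
```

Here the third and sixth days receive the same cluster number because their profiles are similar, so each 24-value profile is reduced to a single discrete feature.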
S230: forming the data samples to be processed from the cluster numbers of the vector features and the feature data of the non-vector features among the defect influence factors of the power transmission line.
Specifically, the data in the data samples formed by the classification are all discrete values; that is, the feature data of the influence factors with vector attributes have been reduced to scalar values. The data samples to be processed contain the feature data of a plurality of influence factors, including both the scalar values obtained from the influence factors with vector attributes and the feature data of the influence factors that do not change with time. The scalar value obtained from an influence factor with vector attributes is the cluster number of that vector feature.
S240, forming a training sample and a test sample according to the data sample to be processed; wherein the training samples comprise a training set and a validation set.
Specifically, n vector arrays, each containing e influence factors, are input to form a data matrix X; d vector arrays are taken as the test samples Z, and the remaining n − d vector arrays are taken as the training samples Y required for modeling. Because the test sample data must correspond to the future time, the d vector arrays at the end of the time-ordered sequence may be selected as the test samples. The training samples comprise a training set and a validation set. The training samples Y are divided into k equal-sized, mutually exclusive subsets $A_j$, j = 1, 2, …, k, which satisfy:
$|A_1| = |A_2| = \dots = |A_k|$, where $|A_j|$ denotes the number of elements of the set $A_j$, i.e. its cardinality. For each i in [1, k], the subset $A_i$ is taken as the validation set and $Q_i$ is defined as the training set, so that $Q_i = Y - A_i$. Optionally, the training samples are divided equally into K mutually exclusive subsets, one of which is used as the validation set while the rest are used as the training set.
For example, 30 vector arrays (one month), each containing 8 influence factors, are input to form a data matrix X; the vector array of the last day is taken as the test sample Z, and the remaining 29 vector arrays are taken as the training samples Y required for modeling. The training samples comprise a training set and a validation set. The training samples Y are divided into 29 equal-sized, mutually exclusive subsets $A_j$; one subset is taken as the validation set and the rest as the training set. The number of influence factors is not limited here. Preferably, the influence factors include all of those mentioned in step S110, so that the defect rate prediction takes multiple aspects into account and becomes more accurate.
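The division just described, holding out the last d arrays as Z and partitioning the remaining n − d arrays (Y) into k mutually exclusive subsets, might be sketched as follows. The function names are hypothetical, contiguous subsets are an assumption (the text does not say how members are assigned to subsets), and integers stand in for the daily vector arrays.

```python
def split_samples(samples, d, k):
    """Hold out the last d samples as Z and split the rest (Y) into k subsets."""
    if not 0 < d < len(samples):
        raise ValueError("d must leave at least one sample for modeling")
    test_z = samples[-d:]          # most recent arrays in time order
    train_y = samples[:-d]
    if len(train_y) % k != 0:
        raise ValueError("Y cannot be divided into k equal-sized subsets")
    size = len(train_y) // k
    subsets = [train_y[j * size:(j + 1) * size] for j in range(k)]
    return test_z, train_y, subsets


def fold(subsets, i):
    """Use subset A_i as the validation set and Q_i = Y - A_i as the training set."""
    validation = subsets[i]
    training = [s for j, a in enumerate(subsets) if j != i for s in a]
    return training, validation


# Hypothetical month: integers 1..30 stand in for the daily vector arrays.
days = list(range(1, 31))
test_z, train_y, subsets = split_samples(days, d=1, k=29)
training, validation = fold(subsets, 0)
print(test_z, len(train_y), len(training), validation)
```

With d = 1 and k = 29 every subset holds a single day, so the rotation amounts to leave-one-out validation over the 29 modeling days.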
Optionally, the influence factors of the defects of the power transmission line are classified as follows: meteorological features, topographic features, line parameter features, and line operational features; the training samples are based on the following representation:
{ Q, D, P, O, R }; wherein Q is meteorological characteristics, D is topographic characteristics, P is line self characteristics, O is line operation characteristics, and R is the defect rate of the transmission line in the same historical month. Wherein the defect rate is determined based on:
$R_t = N_t / T_t$, where $R_t$ denotes the defect rate of a section of the power transmission line; $N_t$ denotes the total number of devices in that section that have defects in the current period; and $T_t$ denotes the total number of devices in that section.
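As a small worked example of this ratio, the function below computes the defect rate for one section and one period. The function name and the counts (3 defective devices out of 120) are made up for illustration.

```python
def defect_rate(defective_devices, total_devices):
    """Defect rate of one line section in one period: R_t = N_t / T_t."""
    if total_devices <= 0:
        raise ValueError("total_devices must be positive")
    if not 0 <= defective_devices <= total_devices:
        raise ValueError("defective_devices must lie between 0 and total_devices")
    return defective_devices / total_devices

# Hypothetical section: 3 of its 120 devices have defects in the current period.
rate = defect_rate(3, 120)
```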
S250, constructing a regression prediction model and training it on the training sample to obtain the importance ranking of the influence factors.
S260, obtaining an optimal feature set according to the importance ranking of the influence factors.
S270, predicting the defect rate of the power transmission line by combining the optimal feature set with the test sample.
In the method for predicting the defect rate of the power transmission line provided by this embodiment of the invention, classifying the feature data to form a data sample to be processed includes classifying the historical feature data of the vector features among the power transmission line defect influence factors by a K-means clustering method to form corresponding cluster numbers. A training sample and a test sample are formed from the classified feature data of the defect influence factors, the regression prediction model is trained on the training sample to obtain the importance ranking of the influence factors, and the optimal feature subset is selected by the cross-validation average error. The defect rates of different line sections of the power transmission line in the future can thus be predicted, which provides an important reference for, and facilitates, the maintenance of the power transmission line.
EXAMPLE III
This embodiment of the invention provides a method for predicting the defect rate of a power transmission line that supplements and refines the method of the foregoing embodiments. An SVM regression prediction model is constructed based on an RBF kernel function and trained on the training set. The SVM regression prediction model is trained on the training set according to the SVM-REF algorithm, the score of each influence factor is calculated, and the influence factors are removed one by one according to their scores to obtain the importance ranking of the influence factors. One feature is removed in each cycle, and the average error over the K groups of samples is obtained once all K groups have been tested; the feature subset with the minimum regression average error is taken as the optimal feature set.
Fig. 3 is a flowchart of a method for predicting a defect rate of a power transmission line according to a third embodiment of the present invention, and referring to fig. 3, the method for predicting includes:
S310, acquiring historical characteristic data of the influence factors of the defects of the power transmission line according to the historical defect records.
S320, classifying the historical characteristic data of the vector characteristics in the influence factors of the defects of the power transmission line by a K-means clustering method to form corresponding cluster numbers.
S330, forming a data sample to be processed according to the cluster number of the vector features in the electric transmission line defect influence factors and the feature data of the non-vector features in the electric transmission line defect influence factors.
S340, forming a training sample and a test sample according to the data sample to be processed; wherein the training samples comprise a training set and a validation set.
S350, constructing an SVM regression prediction model based on the RBF kernel function, and searching for the optimal kernel function parameters by combining a grid search method with a K-fold cross-validation method.
Optionally, the SVM regression prediction model based on the RBF kernel function is constructed according to the following:
Figure BDA0002575227560000121
where min J denotes minimizing the objective function J;
Figure BDA0002575227560000122
and $\xi_t$ are two different slack variables, one for the upper and one for the lower hyperplane; c is a regularization constant; and $\lVert \omega \rVert$ is the Euclidean norm.
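The objective function itself appears only as an equation image in the source and is not reproduced in this text. For orientation, the standard primal problem of ε-insensitive support vector regression, which uses the same symbols ($\omega$, c, $\xi_t$ and a second slack variable, written $\xi_t^{*}$ here), is shown below. Whether the patent's figure matches this form term for term is an assumption; the symbols b, ε, φ, and n are introduced here and do not appear in the text.

```latex
\min J = \frac{1}{2}\lVert \omega \rVert^{2} + c \sum_{t=1}^{n} \left( \xi_t + \xi_t^{*} \right)
\quad \text{s.t.} \quad
\begin{cases}
y_t - \omega^{\top}\varphi(x_t) - b \le \varepsilon + \xi_t, \\
\omega^{\top}\varphi(x_t) + b - y_t \le \varepsilon + \xi_t^{*}, \\
\xi_t \ge 0, \quad \xi_t^{*} \ge 0, \quad t = 1, \dots, n.
\end{cases}
```

In this form the first term keeps the regression function flat, while c weights the penalty on samples that fall outside the ε-tube above or below the fitted function.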
Specifically, the SVM is a widely used machine learning method whose nonlinear mapping capability maps a nonlinear problem in a low-dimensional space to a high-dimensional space, thereby enhancing the separability of the objects to be identified. The grid search method optimizes model performance by traversing given parameter combinations, and K-fold cross-validation makes the model evaluation more accurate and credible. The optimal kernel function parameters are therefore sought by combining the grid search method with the K-fold cross-validation method. The RBF kernel function introduced is as follows:
Figure BDA0002575227560000131
where $K(x, x_i)$ is the RBF kernel and σ is a free parameter. Using the grid search method, the parameters c and σ are gridded over a given range, and a combination of c and σ values is determined. For the chosen c and σ values, the K-fold cross-validation method divides the training sample equally into K groups; each subset is used once as the validation set, with the remaining K-1 subsets as the training set, giving K models. The average of the accuracies of the K models on their validation sets is taken as the prediction accuracy of the K-fold cross-validation for that parameter combination. Finally, the parameters $c_{opt}$ and $\sigma_{opt}$ that give the highest prediction accuracy on the training sample are selected as the optimal parameter combination, and a regression prediction model based on the support vector machine is constructed using the training set $Q_i$ and the optimal parameter combination $c_{opt}$, $\sigma_{opt}$.
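The search over (c, σ) described above can be sketched without external libraries. Several things in this sketch are assumptions rather than the patent's method: the kernel is written with the common $2\sigma^2$ scaling (the source formula is an image and is not reproduced), kernel ridge regression is used as a dependency-free stand-in for the SVM regressor with 1/c as the ridge term, and the lowest cross-validated mean squared error plays the role of the "highest prediction accuracy". All function names and the toy data are hypothetical.

```python
import math
from itertools import product

def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel; the 2*sigma**2 scaling is an assumed convention."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_predict(train_X, train_y, test_X, c, sigma):
    """Stand-in regressor (kernel ridge, NOT an SVM): solve (K + I/c) alpha = y."""
    n = len(train_X)
    K = [[rbf_kernel(train_X[i], train_X[j], sigma) + (1.0 / c if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, train_y)
    return [sum(a * rbf_kernel(x, xi, sigma) for a, xi in zip(alpha, train_X))
            for x in test_X]

def cv_mse(X, y, k, c, sigma):
    """Mean squared error averaged over k folds; fold i is the validation set."""
    errors = []
    for i in range(k):
        val = list(range(i, len(X), k))                   # mutually exclusive folds
        tr = [j for j in range(len(X)) if j % k != i]
        pred = fit_predict([X[j] for j in tr], [y[j] for j in tr],
                           [X[j] for j in val], c, sigma)
        errors.append(sum((p - y[j]) ** 2 for p, j in zip(pred, val)) / len(val))
    return sum(errors) / k

def grid_search(X, y, k, c_grid, sigma_grid):
    """Return the (c, sigma) pair with the lowest k-fold cross-validated error."""
    return min(product(c_grid, sigma_grid), key=lambda p: cv_mse(X, y, k, p[0], p[1]))

# Toy data (hypothetical): one feature, target y = x^2.
X = [[i / 10.0] for i in range(20)]
y = [x[0] ** 2 for x in X]
c_opt, sigma_opt = grid_search(X, y, k=5, c_grid=[0.1, 10.0, 1000.0],
                               sigma_grid=[0.1, 0.5, 2.0])
```

Swapping `fit_predict` for a real SVM regressor leaves the search loop unchanged; only the model fitted inside each fold differs.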
S360, training the SVM regression prediction model on the training set according to the SVM-REF algorithm, calculating the score of each influence factor, and removing the influence factors one by one according to their scores to obtain the importance ranking of the influence factors.
Specifically, the idea of the SVM-REF algorithm is to construct a feature ranking coefficient from the weight vector generated by the SVM during training, remove the influence factor with the smallest ranking coefficient in each iteration, and finally obtain a descending ranking of all the influence factors.
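The elimination loop described above can be sketched independently of the model that supplies the scores. The description resembles what the literature usually calls recursive feature elimination (SVM-RFE), though that identification is an assumption. In the sketch, `score_fn` is a hypothetical callback standing in for the SVM-derived ranking coefficient, and the toy weights and feature names are invented for illustration.

```python
def recursive_elimination(feature_names, score_fn):
    """Repeatedly drop the lowest-scoring feature; return names from most to least important."""
    remaining = list(feature_names)
    removed = []                                  # filled from least important onwards
    while remaining:
        scores = score_fn(remaining)              # re-score on the current subset
        worst = min(remaining, key=lambda f: scores[f])
        remaining.remove(worst)
        removed.append(worst)
    return removed[::-1]                          # the last feature removed ranks first

# Stand-in scorer: squared entries of a made-up weight vector (not from a trained SVM).
toy_weights = {"temperature": 0.9, "humidity": -0.4, "altitude": 0.1, "line_age": 0.6}

def toy_score(features):
    return {f: toy_weights[f] ** 2 for f in features}

ranking = recursive_elimination(toy_weights, toy_score)
```

In a real run the scorer would retrain the model on the remaining features at every iteration, which is why the scores are recomputed inside the loop rather than once up front.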
S370, obtaining an optimal feature set according to the importance ranking of the influence factors.
Specifically, the SVM-REF algorithm only ranks the features; an optimal feature subset still has to be selected. Therefore, on the basis of the SVM-REF algorithm, the K-fold cross-validation method is applied again to rotate the training set and the validation set of the prediction model. This is repeated k times, and the division of training set and validation set corresponding to the minimum mean square error, together with the corresponding optimal kernel function parameters, is recorded to obtain the optimal feature set. The optimal feature set corresponds to the optimal prediction model, that is, the one with the minimum regression average error. Selecting the optimal feature subset by the cross-validation average error improves the generalization capability of the regression prediction model to a certain extent.
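One way to read the selection step above is to evaluate the nested subsets along the importance ranking and keep the one with the lowest cross-validated error. The sketch below assumes exactly that reading; `cv_error_fn` is a hypothetical callback that would return the K-fold average error of a model trained on the given subset, and the error values are made up.

```python
def select_optimal_subset(ranking, cv_error_fn):
    """Evaluate ranking[:1], ranking[:2], ... and keep the subset with the lowest error."""
    best_subset, best_error = None, float("inf")
    for size in range(1, len(ranking) + 1):
        subset = ranking[:size]                   # the `size` most important features
        error = cv_error_fn(subset)
        if error < best_error:
            best_subset, best_error = subset, error
    return best_subset, best_error

# Made-up cross-validation errors, indexed by subset size, for illustration only.
ranking = ["temperature", "line_age", "humidity", "altitude"]
toy_errors = {1: 0.30, 2: 0.12, 3: 0.15, 4: 0.20}
best_subset, best_error = select_optimal_subset(ranking, lambda s: toy_errors[len(s)])
```

Here the two top-ranked features give the lowest error, so adding the remaining two would only hurt generalization on this toy input.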
S380, predicting the defect rate of the power transmission line by combining the optimal feature set with the test sample.
In the method for predicting the defect rate of the power transmission line provided by this embodiment of the invention, an SVM regression prediction model is constructed based on an RBF kernel function and trained on the training set. The SVM regression prediction model is trained on the training set according to the SVM-REF algorithm, the score of each feature is calculated, and the influence features are removed one by one according to their scores to obtain the importance ranking of the influence features. One feature is removed in each cycle, and the average error over the K groups of samples is obtained once all K groups have been tested; the feature subset with the minimum regression average error is taken as the optimal feature set. The defect rates of different line sections of the power transmission line in the future can thus be predicted, which provides an important reference for the maintenance and inspection work of the power transmission management department and facilitates maintenance of the power transmission line.
EXAMPLE IV
An embodiment of the present invention provides a device for predicting a defect rate of a power transmission line, and fig. 4 is a block diagram of a structure of the device for predicting a defect rate of a power transmission line provided in a fourth embodiment of the present invention, and referring to fig. 4, the device for predicting a defect rate of a power transmission line includes:
the acquisition module 10 is configured to acquire characteristic data of the influence factors of the defects of the power transmission line, and classify the characteristic data to form a to-be-processed data sample;
a training sample and test sample forming module 20, configured to form a training sample and a test sample according to a data sample to be processed; wherein the training samples comprise a training set and a validation set;
the model construction module 30 is configured to construct a regression prediction model and train the regression prediction model through training samples to obtain importance ranks of the influencing factors;
an optimal feature set obtaining module 40, configured to obtain an optimal feature set according to the importance ranking of the influencing factors;
and the predicting module 50 is used for predicting the defect rate of the power transmission line according to the optimal feature set and the test sample.
Specifically, the device for predicting the defect rate of the power transmission line comprises an obtaining module 10, a training sample and test sample forming module 20, a model building module 30, an optimal feature set obtaining module 40 and a predicting module 50. The obtaining module 10 is configured to obtain characteristic data of the influence factors of the defects of the power transmission line, and to classify the characteristic data to form a data sample to be processed. The power transmission line is divided into a plurality of sections, and all the equipment between two adjacent towers forms one power transmission line section. The defect rate of each section is the ratio of the total number of devices in that section that have defects in the current period to the total number of devices in that section. The influence factors that affect the occurrence of defects in equipment on the transmission line can be grouped into several aspects, for example the four aspects of meteorological characteristics, topographic characteristics, the line's own characteristics, and line operation characteristics. After the characteristic data of the defect influence factors of the power transmission line within a preset time are obtained, they need to be classified to form the data sample to be processed.
The training sample and test sample forming module 20 is configured to form a training sample and a test sample according to the data sample to be processed, wherein the training sample comprises a training set and a validation set. The data in the data sample to be processed formed after the classification processing are all discrete values; that is, the feature data corresponding to the influence factors with vector attributes among the power transmission line defect influence factors have been scaled. The data sample to be processed contains the feature data for all of the influence factors: the scaled feature data of the influence factors with vector attributes, and the feature data of the influence factors that do not change with time. For example, the temperature among the meteorological features changes with time and is an influence factor with vector attributes, whereas the orientation among the topographic features does not change with time and is an influence factor without vector attributes. The data sample to be processed is divided into a training sample and a test sample. The training sample is used to train the constructed regression prediction model. The test sample is used to verify the prediction performance of the model and to predict the defect rate of a given power transmission line section in a future month.
The model building module 30 is configured to build a regression prediction model and train it on the training sample to obtain the importance ranking of the influence factors. Regression modeling is a predictive modeling technique that studies the relationship between a dependent variable (the target) and independent variables (the predictors). This technique is commonly used in predictive analysis to discover relationships between dependent and independent variables. In this embodiment of the invention, the regression prediction model represents the relationship between the influence factors of power transmission line defects and the predicted defect rate. Training the regression prediction model on the training sample yields the importance ranking of the influence factors. The importance ranking orders the influence factors under consideration by how closely they are related to power transmission line defects. The influence factors near the top of the ranking are those most likely to cause defects of the power transmission line.
The optimal feature set obtaining module 40 is configured to obtain an optimal feature set according to the importance ranking of the influence factors. After the influence factors have been ranked as described above, an optimal feature set still needs to be selected. The optimal feature set corresponds to the optimal prediction model. One influence factor is removed in each cycle, the corresponding predicted average error is calculated, and the feature subset with the minimum regression average error is taken as the optimal feature set. Selecting the optimal feature subset by the cross-validation average error improves the generalization capability of the regression prediction model to a certain extent.
The prediction module 50 is configured to predict the defect rate of the power transmission line according to the optimal feature set in combination with the test sample. Based on the optimal feature set, the defect rate of the power transmission line is predicted using the test sample, and the defect rate of each section of the power transmission line in a future month can be obtained. For example, to predict the defect rate of a section of the power transmission line in the current month, the feature data of the defect influence factors of that section for the previous month are obtained and classified to form the data sample to be processed; the test sample may be the historical feature data of each influence factor for the last few days in that data sample. Based on the optimal feature set, the defect rate is then predicted using the test sample to obtain the defect rate of that section in the coming month, which provides an important reference for the maintenance work of the power transmission management department. The same method can be used to predict the defect rate of each section of different transmission lines, which facilitates maintenance of all the transmission lines.
The device for predicting the defect rate of the power transmission line provided by this embodiment of the invention comprises an acquisition module, a training sample and test sample forming module, a model building module, an optimal feature set obtaining module and a prediction module. After the training sample and the test sample are formed, the model building module trains the regression prediction model on the training sample, and the optimal feature set obtaining module selects the optimal feature subset by the cross-validation average error, so that the prediction module can predict the defect rates of different line sections of the power transmission line in the future. This provides an important reference for the maintenance work of the power transmission management department and facilitates maintenance of the power transmission line.
EXAMPLE V
This embodiment of the invention provides a system for predicting the defect rate of a power transmission line. The system comprises the device for predicting the defect rate of the power transmission line described above, and the method for predicting the defect rate of the power transmission line in any of the foregoing embodiments is executed by that device. The system has the same technical effects, which are not described again here.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for predicting the defect rate of a power transmission line is applied to a section of the power transmission line, and is characterized by comprising the following steps:
acquiring characteristic data of the defect influence factors of the power transmission line, and classifying the characteristic data to form a data sample to be processed;
forming a training sample and a testing sample according to the data sample to be processed; wherein the training samples comprise a training set and a validation set;
constructing a regression prediction model and training the regression prediction model through the training samples to obtain the importance degree sequence of the influence factors;
obtaining an optimal feature set according to the importance ranking of the influence factors;
and predicting the defect rate of the power transmission line by combining the optimal feature set with the test sample.
2. The method for predicting the defect rate of the power transmission line according to claim 1, wherein the step of obtaining the feature data of the influence factors of the defect of the power transmission line and classifying the feature data to form a to-be-processed data sample comprises the following steps:
acquiring historical characteristic data of the influence factors of the defects of the power transmission line according to historical defect records;
classifying historical characteristic data of vector characteristics in the influence factors of the defects of the power transmission line by a K-means clustering method and forming corresponding clustering numbers;
and forming a data sample to be processed according to the cluster number of the vector characteristics in the electric transmission line defect influence factors and the characteristic data of the non-vector characteristics in the electric transmission line defect influence factors.
3. The method for predicting the defect rate of the power transmission line according to claim 2, wherein the forming of the training samples and the testing samples according to the data samples to be processed comprises:
dividing a certain amount of data samples to be processed as training samples, and taking the rest data samples to be processed as test samples;
and dividing the training samples into mutually exclusive K groups of subsets, wherein one group is used as a verification set, and the rest part is used as a training set.
4. The method for predicting the defect rate of the power transmission line according to claim 1, wherein the influence factors of the defect of the power transmission line are classified as follows: meteorological features, topographic features, line parameter features, and line operational features; the training samples are based on the following representation:
{ Q, D, P, O, R }; wherein Q is meteorological characteristics, D is topographic characteristics, P is line self characteristics, O is line operation characteristics, and R is the defect rate of the transmission line in the same historical month.
5. The method for predicting the defect rate of the power transmission line according to claim 1, wherein the defect rate is determined based on:
$R_t = N_t / T_t$, where $R_t$ denotes the defect rate of a section of the power transmission line; $N_t$ denotes the total number of devices in that section that have defects in the current period; and $T_t$ denotes the total number of devices in that section.
6. The method for predicting the defect rate of the power transmission line according to claim 4, wherein the constructing a regression prediction model and training the regression prediction model through the training samples to obtain the importance ranking of the influence features comprises:
constructing an SVM regression prediction model based on an RBF kernel function, and searching for an optimal kernel function parameter by adopting a grid search method and combining a K-fold cross verification method;
and training the SVM regression prediction model by using the training set according to an SVM-REF algorithm, calculating the score of each influence factor, and removing the influence factors one by one according to the score so as to obtain the importance sequence of each influence factor.
7. The method for predicting the defect rate of the power transmission line according to claim 6, wherein the building of the regression prediction model of the SVM based on the RBF kernel function is determined based on:
Figure FDA0002575227550000031
where min J denotes minimizing the objective function J;
Figure FDA0002575227550000032
and $\xi_t$ are two different slack variables, one for the upper and one for the lower hyperplane; c is a regularization constant; and $\lVert \omega \rVert$ is the Euclidean norm.
8. The method for predicting the defect rate of the power transmission line according to claim 6, wherein the obtaining of the optimal feature set according to the importance ranking of the influence features comprises:
replacing the training set and the verification set of the prediction model by using the K-fold cross validation method; repeating the steps k times, recording the division mode of the training set and the verification set corresponding to the minimum mean square error and the corresponding optimal kernel function parameters, and obtaining an optimal feature set.
9. A prediction device of defect rate of a transmission line is characterized by comprising:
the acquisition module is used for acquiring characteristic data of the defect influence factors of the power transmission line and classifying the characteristic data to form a data sample to be processed;
the training sample and test sample forming module is used for forming a training sample and a test sample according to the data sample to be processed; wherein the training samples comprise a training set and a validation set;
the model construction module is used for constructing a regression prediction model and training the regression prediction model through the training samples to obtain the importance degree sequence of the influence factors;
the optimal feature set obtaining module is used for obtaining an optimal feature set according to the importance degree sequence of the influence factors;
and the prediction module is used for predicting the defect rate of the power transmission line according to the optimal feature set and the test sample.
10. A system for predicting a defect rate of a power transmission line, comprising the apparatus for predicting a defect rate of a power transmission line according to claim 9, wherein the method for predicting a defect rate of a power transmission line according to any one of claims 1 to 8 is performed by the apparatus for predicting a defect rate of a power transmission line.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010651743.0A CN111652448A (en) 2020-07-08 2020-07-08 Method, device and system for predicting defect rate of power transmission line

Publications (1)

Publication Number Publication Date
CN111652448A true CN111652448A (en) 2020-09-11

Family

ID=72350255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010651743.0A Pending CN111652448A (en) 2020-07-08 2020-07-08 Method, device and system for predicting defect rate of power transmission line

Country Status (1)

Country Link
CN (1) CN111652448A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149898A (en) * 2020-09-21 2020-12-29 广东电网有限责任公司清远供电局 Fault rate prediction model training method, fault rate prediction method and related device
CN112417763A (en) * 2020-11-25 2021-02-26 杭州凯达电力建设有限公司 Defect diagnosis method, device and equipment for power transmission line and storage medium
CN115423156A (en) * 2022-08-15 2022-12-02 博源规划设计集团有限公司 Site selection optimization method for new railway four-electric engineering

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105654207A (en) * 2016-01-07 2016-06-08 国网辽宁省电力有限公司锦州供电公司 Wind power prediction method based on wind speed information and wind direction information
CN107784393A (en) * 2017-10-27 2018-03-09 国网新疆电力公司电力科学研究院 A kind of the defects of transmission line of electricity Forecasting Methodology and device
CN108805193A (en) * 2018-06-01 2018-11-13 广东电网有限责任公司 A kind of power loss data filling method based on mixed strategy
CN109409723A (en) * 2018-10-18 2019-03-01 广西电网有限责任公司电力科学研究院 A kind of overhead transmission line method for evaluating state

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
闭应洲等: "《数据挖掘与机器学习》", vol. 48, 浙江科学技术出版社, article 曾勇斌等, pages: 80 - 81 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149898A (en) * 2020-09-21 2020-12-29 广东电网有限责任公司清远供电局 Fault rate prediction model training method, fault rate prediction method and related device
CN112149898B (en) * 2020-09-21 2023-10-31 广东电网有限责任公司清远供电局 Training of failure rate prediction model, failure rate prediction method and related device
CN112417763A (en) * 2020-11-25 2021-02-26 杭州凯达电力建设有限公司 Defect diagnosis method, device and equipment for power transmission line and storage medium
CN115423156A (en) * 2022-08-15 2022-12-02 博源规划设计集团有限公司 Site selection optimization method for new railway four-electric engineering
CN115423156B (en) * 2022-08-15 2023-09-15 博源规划设计集团有限公司 Site selection optimization method for newly built railway four-electric engineering


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination