CN110070111A

CN110070111A - Distribution network line classification method and system

Info

Publication number: CN110070111A
Application number: CN201910247484.2A
Authority: CN
Inventors: 于海平; 何安宏; 陈益果; 徐玮; 肖徐兵; 刘乐全; 王昕平; 姜晓慧
Original assignee: Nari Technology Co Ltd; NARI Nanjing Control System Co Ltd
Current assignee: Nari Technology Co Ltd; NARI Nanjing Control System Co Ltd
Priority date: 2019-03-29
Filing date: 2019-03-29
Publication date: 2019-07-30

Abstract

The invention discloses a distribution network line classification method. The method includes: collecting data for a number of distribution network lines and dividing it into a test sample set, a training sample set and a validation sample set; based on the training sample set, applying a K-center clustering algorithm followed by a decision classification algorithm to generate K member classifiers; obtaining several nearest neighbors of each test sample in the validation sample set; traversing the K member classifiers and, if a member classifier correctly classifies the nearest neighbors of all test samples, adding it to the optimal classifier set; and classifying the distribution network lines to be classified with the optimal classifier set. A corresponding system is also disclosed. The invention addresses the low accuracy, low efficiency and poor robustness of existing distribution network line classification.

Description

Distribution network line classification method and system

Technical field

The invention relates to a distribution network line classification method and system, and belongs to the technical field of power grid operation automation.

Background

As users place ever higher demands on the distribution network, the State Grid Corporation of China has in recent years proposed accelerating the construction and upgrading of urban distribution networks, with the aims of improving the safe operation of urban power grids, improving power quality and reducing line loss. It has also proposed optimizing the grid structure and raising the level of automation and management of urban distribution networks.

An important part of distribution network construction is the operation and maintenance of distribution network lines, which follows the principle of "key maintenance for important equipment". The lines therefore need to be graded, and grading presupposes that they are first classified. Existing power grid operation and maintenance systems mainly use two kinds of method: (1) single-classifier models; (2) static classifier models. These commonly used methods, however, suffer from low accuracy, low efficiency and poor robustness.

Summary of the invention

The invention provides a distribution network line classification method and system that address the low accuracy, low efficiency and poor robustness of existing distribution network line classification.

To solve the above technical problem, the invention adopts the following technical solution:

A distribution network line classification method, comprising:

collecting data for a number of distribution network lines and dividing it into a test sample set, a training sample set and a validation sample set;

based on the training sample set, applying a K-center clustering algorithm followed by a decision classification algorithm to generate K member classifiers;

obtaining several nearest neighbors of each test sample in the validation sample set;

traversing the K member classifiers and, if a member classifier correctly classifies the nearest neighbors of all test samples, adding that member classifier to the optimal classifier set;

classifying the distribution network lines to be classified with the optimal classifier set.

Each sample in the test sample set, the training sample set and the validation sample set is the data of one distribution network line. The data of a distribution network line includes the years in operation, the monthly average load rate, the monthly average line loss rate, the annual number of faults, the annual number of faults of the associated distribution transformers, the average load rate of the associated distribution transformers, and whether the line serves important users.

The K member classifiers are generated as follows:

the training sample set is partitioned with PAM to generate K training sample subsets;

the C4.5 classification algorithm is trained on the K training subsets to generate K member classifiers.

The Euclidean distance from a test sample to each validation sample is calculated, and the validation samples whose Euclidean distance is below a threshold are selected as the nearest neighbors.

A distribution network line classification system, comprising:

Collection module: collects data for a number of distribution network lines and divides it into a test sample set, a training sample set and a validation sample set;

Member classifier generation module: based on the training sample set, applies a K-center clustering algorithm followed by a decision classification algorithm to generate K member classifiers;

Nearest neighbor acquisition module: obtains several nearest neighbors of each test sample in the validation sample set;

Optimal classifier set generation module: traverses the K member classifiers and, if a member classifier correctly classifies the nearest neighbors of all test samples, adds that member classifier to the optimal classifier set;

Classification module: classifies the distribution network lines to be classified with the optimal classifier set.

The member classifier generation module includes a PAM module and a C4.5 module;

PAM module: partitions the training sample set with PAM to generate K training sample subsets;

C4.5 module: trains the C4.5 classification algorithm on the K training subsets to generate K member classifiers.

The nearest neighbor acquisition module obtains the nearest neighbors as follows: the Euclidean distance from a test sample to each validation sample is calculated, and the validation samples whose Euclidean distance is below a threshold are selected as the nearest neighbors.

A computer-readable storage medium storing one or more programs, the one or more programs comprising instructions that, when executed by a computing device, cause the computing device to perform the distribution network line classification method.

A computing device comprising one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include instructions for performing the distribution network line classification method.

Beneficial effects of the invention: the invention addresses the low accuracy, low efficiency and poor robustness of existing distribution network line classification.

Description of drawings

Fig. 1 is a flow chart of the invention;

Fig. 2 is a comparison chart of the three classification methods.

Detailed description

The invention is further described below with reference to the accompanying drawings. The following embodiments serve only to illustrate the technical solution of the invention more clearly and do not limit its scope of protection.

As shown in Fig. 1, a distribution network line classification method includes the following steps:

Step 1: collect data for a number of distribution network lines and divide it into a test sample set, a training sample set and a validation sample set.

The training, test and validation sample sets are in the ratio 8:1:1. Each sample in the three sets is the data of one distribution network line, which includes the years in operation, the monthly average load rate, the monthly average line loss rate, the annual number of faults, the annual number of faults of the associated distribution transformers, the average load rate of the associated distribution transformers, and whether the line serves important users.
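The 8:1:1 division can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_samples`, the random shuffle and the fixed seed are assumptions, since the source does not say how samples are assigned to the three sets.

```python
import random


def split_samples(samples, ratios=(8, 1, 1), seed=0):
    """Shuffle the samples and split them into (training, test, validation)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumption: random assignment
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_test = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder goes to validation
    return train, test, validation


# 1600 line records, identified here only by an index
train, test, validation = split_samples(range(1600))
print(len(train), len(test), len(validation))
```

With 1600 records this yields 1280 training, 160 test and 160 validation samples.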

Years in operation: the commissioning date of the distribution network line is obtained from the production management system; years in operation = current date - commissioning date;

Monthly average load rate: the real-time current is obtained from the dispatching automation system and the rated current of the line from the production management system; load rate = real-time current / rated current;

Monthly average line loss rate: the energy lost and the total energy of the line for the month are obtained from the electricity consumption information collection system; monthly average line loss rate = energy lost in the month / total energy in the month;

Annual number of faults: the number of faults in the year is obtained from the dispatching automation system;

Annual number of faults of the associated distribution transformers: information on all distribution transformers associated with the line is obtained from the production management system, and the number of faults in the year from the dispatching automation system;

Average load rate of the associated distribution transformers: information on all distribution transformers associated with the line is obtained from the production management system, and their load and capacity from the electricity consumption information collection system; load rate = load / capacity;

Whether the line serves important users: information on all distribution transformers associated with the line is obtained from the production management system, and the marketing system is checked for important users.
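The seven features above can be assembled into one record per line, roughly as sketched below. The class name `LineFeatures`, the helper names and all numeric values are hypothetical, and averaging the individual ratios over the period is an assumption; the source gives only the per-reading formulas.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LineFeatures:
    years_in_operation: float
    monthly_avg_load_rate: float
    monthly_avg_line_loss_rate: float
    annual_fault_count: int
    transformer_annual_fault_count: int
    transformer_avg_load_rate: float
    has_important_user: bool


def years_in_operation(commission_date: date, today: date) -> float:
    """Years in operation = current date - commissioning date."""
    return (today - commission_date).days / 365.25


def mean_ratio(numerators, denominators) -> float:
    """Average of element-wise ratios, e.g. real-time current / rated current."""
    ratios = [n / d for n, d in zip(numerators, denominators)]
    return sum(ratios) / len(ratios)


# Hypothetical raw values for a single line
features = LineFeatures(
    years_in_operation=years_in_operation(date(2009, 3, 29), date(2019, 3, 29)),
    monthly_avg_load_rate=mean_ratio([50.0, 100.0], [200.0, 200.0]),
    monthly_avg_line_loss_rate=30.0 / 1000.0,  # energy lost / total energy
    annual_fault_count=2,
    transformer_annual_fault_count=5,
    transformer_avg_load_rate=mean_ratio([120.0, 300.0], [400.0, 400.0]),
    has_important_user=True,
)
```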

Step 2: based on the training sample set, apply the K-center clustering algorithm followed by the decision classification algorithm to generate K member classifiers.

Specifically, the training sample set is partitioned with PAM to generate K training sample subsets, and the C4.5 classification algorithm is trained on the K training subsets to generate K member classifiers.

PAM partitions the training samples by K-center (K-medoids) clustering. The value of K is calculated according to the Calinski-Harabasz criterion from the between-class variance, the within-class variance and the complexity, and multiple training sample subsets are generated.
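For reference, the Calinski-Harabasz index in its commonly used form is given below. The source does not reproduce the formula, so the symbols B_K, W_K and n are assumptions of this sketch, and the "complexity" mentioned above is read here as the K - 1 and n - K normalizing terms.

```latex
\mathrm{CH}(K) = \frac{B_K / (K - 1)}{W_K / (n - K)}
```

Here B_K is the between-cluster sum of squares, W_K the within-cluster sum of squares, and n the number of training samples; under this criterion the K with the largest CH(K) would be chosen.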

The PAM partitioning proceeds as follows:

(1) arbitrarily select K training samples from the training sample set as the initial representative objects;

(2) assign each remaining object to the cluster of its nearest representative object;

(3) randomly select a non-representative object and calculate the total cost of swapping a representative object with it, where p is a point in the space corresponding to a non-representative object in cluster C_j, O_j is the representative object of cluster C_j, and k is the number of representative objects;

(4) if the total cost is less than 0, replace the representative object with the non-representative object, giving a new set of K representative objects;

(5) repeat steps (2), (3) and (4) until no cluster changes.
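Steps (1) to (5) can be sketched as below. Two points are assumptions rather than the source's method: the cost is taken to be the standard PAM criterion, the sum over all clusters of the distances |p - O_j| (the source's formula is not preserved), and step (3) tries every candidate swap instead of a random one so that the loop is deterministic and guaranteed to stop. The function names are hypothetical.

```python
import math
import random


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest representative object."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)


def assign(points, medoids):
    """Step (2): map each medoid index to the indices of its nearest points."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        nearest = min(medoids, key=lambda m: math.dist(p, points[m]))
        clusters[nearest].append(i)
    return clusters


def pam(points, k, seed=0):
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)      # step (1)
    cost = total_cost(points, medoids)
    improved = True
    while improved:                                  # step (5)
        improved = False
        for pos in range(k):
            for candidate in range(len(points)):     # step (3), exhaustive
                if candidate in medoids:
                    continue
                trial = medoids[:pos] + [candidate] + medoids[pos + 1:]
                trial_cost = total_cost(points, trial)
                if trial_cost - cost < 0:            # step (4)
                    medoids, cost = trial, trial_cost
                    improved = True
    return assign(points, medoids)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = pam(points, k=2)
```

Each value in `clusters` is one training sample subset, on which a member classifier would then be trained.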

C4.5 is a family of algorithms used for classification problems in machine learning and data mining. It performs supervised learning: given a data set in which every tuple is described by a set of attribute values and belongs to exactly one of a set of mutually exclusive classes, the goal is to learn a mapping from attribute values to classes that can then be used to classify new entities whose class is unknown.
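C4.5 chooses the attribute to split on by information gain ratio. The sketch below computes that quantity for one discrete attribute; it is an illustration of the split criterion only, not a full decision tree, and the attribute values and class labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Information gain ratio of splitting `labels` by the discrete `values`."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class labels after the split
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(values)
    if split_info == 0:
        return 0.0  # attribute has a single value and cannot split the data
    return (entropy(labels) - remainder) / split_info


# Hypothetical discretized load rate and class labels for four lines
load_rate = ["high", "high", "low", "low"]
line_class = ["key", "key", "ordinary", "ordinary"]
print(gain_ratio(load_rate, line_class))
```

An attribute that separates the classes perfectly, as in this toy data, scores 1.0; an attribute unrelated to the classes scores 0.0.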

Step 3: obtain several nearest neighbors of each test sample in the validation sample set.

Specifically, the Euclidean distance from the test sample to each validation sample is calculated, and the validation samples whose Euclidean distance is below a threshold are selected as the nearest neighbors.
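The threshold-based neighbor search can be sketched as follows. The function name and the sample values are hypothetical, and the source does not say how the threshold is chosen or whether features are normalized first, so both are left to the caller.

```python
import math


def nearest_neighbors(test_sample, validation_samples, threshold):
    """Indices of validation samples whose Euclidean distance is below threshold."""
    return [i for i, v in enumerate(validation_samples)
            if math.dist(test_sample, v) < threshold]


validation_samples = [(0.0, 1.0), (3.0, 4.0), (0.5, 0.5)]
neighbor_ids = nearest_neighbors((0.0, 0.0), validation_samples, threshold=2.0)
print(neighbor_ids)
```

Here the second validation sample lies at distance 5 and is excluded, so the neighbors are the samples at indices 0 and 2.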

Step 4: traverse the K member classifiers; if a member classifier correctly classifies the nearest neighbors of all test samples, add that member classifier to the optimal classifier set.
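A minimal sketch of this filtering step, assuming each member classifier is a callable that maps a sample to a class label. The toy classifiers, the samples and the labels "key" and "ordinary" are hypothetical; the source does not name the classes.

```python
def select_optimal_classifiers(classifiers, neighbors, neighbor_labels):
    """Keep the classifiers that label every neighbor sample correctly."""
    return [clf for clf in classifiers
            if all(clf(x) == y for x, y in zip(neighbors, neighbor_labels))]


# Toy member classifiers operating on (annual fault count, load rate)
def by_fault_count(sample):
    return "key" if sample[0] >= 3 else "ordinary"


def always_ordinary(sample):
    return "ordinary"


neighbors = [(5, 0.8), (1, 0.2)]
neighbor_labels = ["key", "ordinary"]
optimal = select_optimal_classifiers(
    [by_fault_count, always_ordinary], neighbors, neighbor_labels)
```

Only `by_fault_count` classifies both neighbors correctly, so it is the only member kept. If no member qualifies the result is empty; the source does not describe how that case is handled.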

Step 5: classify the distribution network lines to be classified with the optimal classifier set.

Based on the characteristics of the line to be classified, a combination of classifiers is adaptively selected from the optimal classifier set, or weights are assigned to the classifiers, and the resulting dynamic combination performs the final classification.
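One possible way to combine the selected classifiers is a weighted vote, sketched below. The source does not specify how the combination is chosen or how the weights are assigned, so the weights here are simply passed in; the function name, classifiers and labels are hypothetical.

```python
from collections import defaultdict


def weighted_vote(classifiers, weights, sample):
    """Return the class label with the largest total weight among the votes."""
    scores = defaultdict(float)
    for clf, weight in zip(classifiers, weights):
        scores[clf(sample)] += weight
    return max(scores, key=scores.get)


# Three toy classifiers that each always predict a fixed label
members = [lambda s: "key", lambda s: "ordinary", lambda s: "ordinary"]
sample = (5, 0.8)
equal_result = weighted_vote(members, [1.0, 1.0, 1.0], sample)
skewed_result = weighted_vote(members, [3.0, 1.0, 1.0], sample)
```

With equal weights the majority label "ordinary" wins; giving the first classifier a weight of 3 flips the result to "key".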

To verify the above method, the following experiment was carried out:

The 2018 operating data of 1,600 distribution network lines from each of three provinces was selected as the training, validation and test data for the model. Seven feature variables were chosen for each line: years in operation, monthly average load rate, monthly average line loss rate, annual number of faults, annual number of faults of the associated distribution transformers, average load rate of the associated distribution transformers, and whether the line serves important users.

The sample data was divided into training, validation and test groups. The training data set accounts for 80% of the samples (1,280 records), and the test and validation data sets each account for 10% (160 records each). The data sets were discretized. The experiment compares the method described above with conventional models: the dynamic combination classification DCC-CD (the method of the invention), the single-classifier method (C4.5), and the static combination method AdaBoost (with C4.5 as the member classifier learning algorithm). The three methods were compared in terms of classification accuracy, classification efficiency and robustness, as shown in Tables 1 and 2 and Fig. 2.

Table 1. Comparison of the classification accuracy of the three classification methods (%)

Table 2. Comparison of the classification robustness of the three classification methods (number of runs)

The results in Table 1 show that the classification accuracy of AdaBoost is better than that of C4.5, and that of DCC-CD is better than both AdaBoost and C4.5. This indicates that DCC-CD makes full use of the prediction information provided by the base classifiers and effectively improves classification accuracy, with a combination advantage far greater than that of AdaBoost.

The results in Fig. 2 show that, on the three data sources, the classification efficiency of AdaBoost is better than that of C4.5, and that of the dynamic combination classification DCC-CD is better than both AdaBoost and the base classifier C4.5.

In addition, the intervals into which the 10 classification accuracies of the cross-validation fall were tallied. The intervals chosen were (0, R-3) and (R+3, 100), where R is the mean accuracy. Table 2 shows that C4.5 fell into these intervals in 5 of the 10 validation runs, AdaBoost in 3, and DCC-CD in only 1. The results show that DCC-CD is more robust in classification than C4.5 and AdaBoost.
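The robustness count described above can be sketched as a tally of runs whose accuracy falls more than 3 percentage points from the mean R. The function name is hypothetical and the accuracy values are made up for illustration; they are not the figures from Table 2.

```python
def count_outlier_runs(accuracies, margin=3.0):
    """Count runs whose accuracy lies in (0, R - margin) or (R + margin, 100)."""
    mean_accuracy = sum(accuracies) / len(accuracies)
    return sum(1 for a in accuracies
               if a < mean_accuracy - margin or a > mean_accuracy + margin)


# Ten illustrative accuracies (%) with a mean of 90
runs = [90, 90, 90, 90, 90, 90, 90, 90, 80, 100]
print(count_outlier_runs(runs))
```

A lower count means the accuracies cluster more tightly around the mean, which is the sense in which the text calls a method more robust.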

In summary, by using combined classification the above method addresses the low accuracy, low efficiency and poor robustness of existing distribution network line classification.

A distribution network line classification system, comprising:

Collection module: collects data for a number of distribution network lines and divides it into a test sample set, a training sample set and a validation sample set. Each sample in the three sets is the data of one distribution network line, which includes the years in operation, the monthly average load rate, the monthly average line loss rate, the annual number of faults, the annual number of faults of the associated distribution transformers, the average load rate of the associated distribution transformers, and whether the line serves important users.

Member classifier generation module: based on the training sample set, applies the K-center clustering algorithm followed by the decision classification algorithm to generate K member classifiers.

The member classifier generation module includes a PAM module and a C4.5 module. PAM module: partitions the training sample set with PAM to generate K training sample subsets. C4.5 module: trains the C4.5 classification algorithm on the K training subsets to generate K member classifiers.

Nearest neighbor acquisition module: obtains several nearest neighbors of each test sample in the validation sample set.

The nearest neighbor acquisition module obtains the nearest neighbors as follows: the Euclidean distance from the test sample to each validation sample is calculated, and the validation samples whose Euclidean distance is below a threshold are selected as the nearest neighbors.

Optimal classifier set generation module: traverses the K member classifiers and, if a member classifier correctly classifies the nearest neighbors of all test samples, adds that member classifier to the optimal classifier set.

Those skilled in the art will appreciate that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM and optical storage) containing computer-usable program code.

The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present application. It will be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

The above are merely embodiments of the invention and do not limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention falls within the scope of the pending claims of the invention.

Claims

1. A distribution network line classification method, characterized by comprising:

collecting data for a number of distribution network lines and dividing it into a test sample set, a training sample set and a validation sample set;

based on the training sample set, applying a K-center clustering algorithm followed by a decision classification algorithm to generate K member classifiers;

obtaining several nearest neighbors of each test sample in the validation sample set;

traversing the K member classifiers and, if a member classifier correctly classifies the nearest neighbors of all test samples, adding that member classifier to the optimal classifier set;

classifying the distribution network lines to be classified with the optimal classifier set.

2. The distribution network line classification method according to claim 1, characterized in that: each sample in the test sample set, the training sample set and the validation sample set is the data of one distribution network line, and the data of a distribution network line includes the years in operation, the monthly average load rate, the monthly average line loss rate, the annual number of faults, the annual number of faults of the associated distribution transformers, the average load rate of the associated distribution transformers, and whether the line serves important users.

3. The distribution network line classification method according to claim 1, characterized in that the K member classifiers are generated as follows:

the training sample set is partitioned with PAM to generate K training sample subsets;

the C4.5 classification algorithm is trained on the K training subsets to generate K member classifiers.

4. The distribution network line classification method according to claim 1, characterized in that: the Euclidean distance from a test sample to each validation sample is calculated, and the validation samples whose Euclidean distance is below a threshold are selected as the nearest neighbors.

5. A distribution network line classification system, characterized by comprising:

a collection module, which collects data for a number of distribution network lines and divides it into a test sample set, a training sample set and a validation sample set;

a member classifier generation module, which, based on the training sample set, applies a K-center clustering algorithm followed by a decision classification algorithm to generate K member classifiers;

a nearest neighbor acquisition module, which obtains several nearest neighbors of each test sample in the validation sample set;

an optimal classifier set generation module, which traverses the K member classifiers and, if a member classifier correctly classifies the nearest neighbors of all test samples, adds that member classifier to the optimal classifier set;

a classification module, which classifies the distribution network lines to be classified with the optimal classifier set.

6. The distribution network line classification system according to claim 5, characterized in that: each sample in the test sample set, the training sample set and the validation sample set is the data of one distribution network line, and the data of a distribution network line includes the years in operation, the monthly average load rate, the monthly average line loss rate, the annual number of faults, the annual number of faults of the associated distribution transformers, the average load rate of the associated distribution transformers, and whether the line serves important users.

7. The distribution network line classification system according to claim 5, characterized in that the member classifier generation module includes a PAM module and a C4.5 module;

the PAM module partitions the training sample set with PAM to generate K training sample subsets;

the C4.5 module trains the C4.5 classification algorithm on the K training subsets to generate K member classifiers.

8. The distribution network line classification system according to claim 5, characterized in that the nearest neighbor acquisition module obtains the nearest neighbors as follows:

the Euclidean distance from a test sample to each validation sample is calculated, and the validation samples whose Euclidean distance is below a threshold are selected as the nearest neighbors.

9. A computer-readable storage medium storing one or more programs, characterized in that: the one or more programs include instructions that, when executed by a computing device, cause the computing device to perform any one of the methods according to claims 1 to 4.

10. A computing device, characterized by comprising:

one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include instructions for performing any one of the methods according to claims 1 to 4.