CN111191687A - Power communication data clustering method based on improved K-means algorithm - Google Patents
Power communication data clustering method based on improved K-means algorithm
- Publication number
- CN111191687A (application CN201911286973.5A)
- Authority
- CN
- China
- Prior art keywords
- initial
- classification
- distance
- clustering
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention discloses a power communication data clustering method based on an improved K-means algorithm, which comprises the following steps: S101, normalizing the power communication data; S102, manually selecting an initial classification number K for the normalized data, determining an element distance matrix according to the K value, and determining K initial clustering centers; S103, selecting an element, and determining the classification group corresponding to the element by calculating the distance between the element and each initial clustering center; S104, updating the clustering centers of the classification groups, and determining the actual clustering center of each classification group; S105, repeating step S103 until the classification groups no longer change, thereby obtaining the classification of the power communication data. On the basis of the traditional K-means clustering algorithm, the value of the initial classification number K can be dynamically adjusted according to the clustering effect so as to improve that effect; the initial elements can be selected more reasonably according to the element distance matrix, which improves the rationality of the classification and gives the method strong practicability.
Description
Technical Field
The invention belongs to the technical field of power communication, and particularly relates to a power communication data clustering method based on an improved K-means algorithm.
Background
The power communication network holds a huge volume of redundant data, and processing this redundant data is an important part of power communication data management. Data clustering is a preliminary step of redundant data processing: the massive power communication data are classified, the type of redundant data is analyzed according to the actual data in each class, and a redundant data processing method suited to each class is then adopted.
The K-means algorithm is currently the main method for data clustering in the power communication network. The implementation flow of the traditional K-means algorithm is shown in FIG. 1, and the main flow comprises the following steps:
(1) giving a K value and randomly selecting initial elements; the K value is the number of element classifications to be obtained by clustering. In the traditional K-means algorithm the classification number K is given manually, and the initial element of each initial classification is selected manually from the elements to be clustered;
(2) judging element classification; the membership of each element in each classification is judged one by one according to the distance between the element and the center position of each classification;
(3) updating the classification center positions; each time an element has been judged, the position of each classification center is updated to take the newly added element into account.
The K value and the initial elements are the key factors for element clustering in the K-means algorithm. In the traditional K-means algorithm both are given manually and lack scientific support, so the clustering effect is difficult to guarantee.
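The traditional flow in steps (1) to (3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of FIG. 1: the function names `euclidean` and `traditional_kmeans`, the `seed` and `max_iter` parameters, and the choice to recompute the centers once per pass (rather than after every single element) are all assumptions made for brevity.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def traditional_kmeans(data, k, seed=0, max_iter=100):
    """Plain K-means: K is given by the caller, initial elements are random."""
    rng = random.Random(seed)
    # Step (1): pick K initial elements at random as the initial centers.
    centers = [list(p) for p in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        # Step (2): assign every element to the classification whose center is nearest.
        new_labels = [
            min(range(k), key=lambda t: euclidean(p, centers[t])) for p in data
        ]
        if new_labels == labels:
            break  # classification groups no longer change
        labels = new_labels
        # Step (3): move each center to the mean of its current members.
        for t in range(k):
            members = [p for p, lab in zip(data, labels) if lab == t]
            if members:
                centers[t] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


labels, centers = traditional_kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
```

Because both K and the starting elements are arbitrary here, a different `seed` or a poor K can change the result, which is the weakness the improved method targets.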
Disclosure of Invention
The invention overcomes the defects of the prior art. The technical problem it solves is to provide a power communication data clustering method based on an improved K-means algorithm in which the initial classification number K and the initial clustering centers can be adjusted.
In order to solve the above technical problem, the invention adopts the following technical scheme: a power communication data clustering method based on an improved K-means algorithm comprises the following steps: S101, normalizing the power communication data; S102, manually selecting an initial classification number K for the normalized data, determining an element distance matrix according to the K value, and determining K initial clustering centers; S103, selecting an element, and determining the classification group corresponding to the element by calculating the distance between the element and each initial clustering center; S104, updating the clustering centers of the classification groups, and determining the actual clustering center of each classification group; and S105, repeating step S103 until the classification groups no longer change, and obtaining the classification of the power communication data.
Further, the method also comprises:
S106, judging whether the initial classification number K meets the optimal classification value.
Preferably, normalizing the power communication data specifically means converting the character-type values, continuous-type values and discrete-type values in the power communication data into values that are easy to process;
character-type value conversion: a character-type attribute in the power communication data takes N values in total, and the conversion formula can be expressed as follows:
in formula (1), $x_i$ and $x_i^{*}$ are the values of character-type attribute i of the power communication data before and after processing, respectively, and $Cha_1$, $Cha_2$, … are the N character values of the attribute, each of which is converted into a numerical value between 0 and 1 according to its character value type;
continuous-type value processing: continuous values in the power communication data are processed by a normalization method, and the processing formula can be expressed as follows:
$$x_i^{*} = \frac{x_i - x_i^{\min}}{x_i^{\max} - x_i^{\min}} \tag{2}$$
in formula (2), $x_i$ and $x_i^{*}$ are the values of continuous-type attribute i of the power communication data before and after processing, respectively, and $x_i^{\max}$ and $x_i^{\min}$ are the upper and lower limit values of the continuous attribute.
Preferably, manually selecting an initial classification number K for the normalized data, determining an element distance matrix according to the K value, and determining K initial clustering centers specifically includes:
S1021, manually selecting an initial classification number K;
S1022, calculating the distance between each pair of elements according to the Euclidean distance formula:
$$d(x_i, x_j) = \sqrt{\sum_{m=1}^{M} \left(x_{i,m} - x_{j,m}\right)^2} \tag{3}$$
assuming that the power communication data to be analyzed after normalization contain N items and each item has M attributes, in formula (3) $x_i$ denotes the i-th item of data, $x_{i,m}$ denotes the m-th attribute value of the i-th item, m indexes the dimension (attribute), and $d(x_i, x_j)$ denotes the distance between data $x_i$ and data $x_j$;
S1023, obtaining an element distance matrix from the distances between the elements, and determining the average value of each row of the matrix, i.e., the average distance between the data corresponding to that row and all other data;
S1024, selecting the element with the maximum average distance as the first initial clustering center, each remaining initial clustering center being selected so that its average distance to the already selected initial elements is maximum, namely:
$$\max_{i} \; \frac{1}{J} \sum_{t=1}^{J} d(x_i, x_t) \tag{4}$$
in formula (4), J is the number of initial elements already selected, $x_t$ ranges over those J selected elements and $x_i$ over the elements not yet selected; initial elements are added one by one until their total number equals the initial classification number K; if t is an initial clustering center, its position is $(x_{t,1}, x_{t,2}, \cdots, x_{t,M})$.
Preferably, selecting an element and determining the classification group corresponding to the element by calculating the distance between the element and each initial clustering center specifically includes: calculating the distance between the selected element and each initial clustering center by formula (5), namely:
$$d(x_i, x_t) = \sqrt{\sum_{m=1}^{M} \left(x_{i,m} - x_{t,m}\right)^2} \tag{5}$$
in formula (5), $d(x_i, x_t)$ is the distance between element i and initial clustering center t; according to the distances between the element and each initial clustering center, element i is clustered into the classification with the minimum distance.
Preferably, updating the clustering centers of the classification groups and determining the actual clustering centers of the classification groups specifically includes: when an element is added, the update formula for the j-th attribute value of the center position of the classification group can be expressed as:
$$x_{t,j} = \frac{(N_t - 1)\, x_{t,j}' + x_{i,j}}{N_t} \tag{6}$$
in formula (6), $x_{t,j}$ is the j-th attribute value of the actual clustering center t' of the classification group after the element is added, $x_{t,j}'$ is the j-th attribute value of the actual clustering center t' of the classification group before the element is added, $N_t$ is the number of elements in the group after the element is added, and $x_{i,j}$ is the j-th attribute value of the added element.
Preferably, determining whether the initial classification number K satisfies the optimal classification value specifically includes:
S1061, calculating the distance between the actual clustering centers t' according to the Euclidean distance formula, namely:
$$d(x_{t1}, x_{t2}) = \sqrt{\sum_{m=1}^{M} \left(x_{t1,m} - x_{t2,m}\right)^2} \tag{7}$$
in formula (7), $d(x_{t1}, x_{t2})$ is the inter-class distance between actual clustering center t1 and actual clustering center t2;
S1062, calculating the minimum value of the inter-class distances among all the actual clustering centers t', i.e., the minimum inter-class distance $TD_{min}$;
S1063, calculating the average value of the inter-class distances among all the actual clustering centers t', i.e., the average inter-class distance $TD_{ave}$;
S1064, calculating the maximum value of the distances between all elements in the same classification, i.e., the maximum intra-class distance $ITD_{max}$;
S1065, judging whether the minimum inter-class distance $TD_{min}$ is much smaller than the average inter-class distance $TD_{ave}$; if so, returning to step S102, otherwise executing step S1066;
S1066, judging whether the maximum intra-class distance $ITD_{max}$ is much larger than the average inter-class distance $TD_{ave}$; if so, returning to step S102, otherwise executing step S1067;
S1067, the initial classification number K meets the optimal classification value, and the classification of the power communication data can be output.
Preferably, the converted value ranges of the character-type value, the continuous-type value and the discrete-type value are all data between 0 and 1.
Compared with the prior art, the invention has the following beneficial effects:
The power communication data clustering method based on an improved K-means algorithm improves on the manually given initial classification number K and initial clustering centers of the traditional K-means clustering algorithm: the value of the initial classification number K can be dynamically adjusted according to the clustering effect so as to improve that effect, and the initial elements can be selected more reasonably according to the element distance matrix so as to improve the rationality of the classification.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings:
FIG. 1 is a flow chart of the traditional K-means algorithm;
FIG. 2 is a schematic flow chart of a power communication data clustering method based on an improved K-means algorithm according to a first embodiment of the present invention;
FIG. 3 is a schematic flow chart of a power communication data clustering method based on an improved K-means algorithm according to a second embodiment of the present invention;
FIG. 4 is a schematic flow chart of a power communication data clustering method based on an improved K-means algorithm according to a third embodiment of the present invention;
FIG. 5 is a schematic flow chart of a power communication data clustering method based on an improved K-means algorithm according to a fourth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some embodiments, but not all embodiments, of the present invention; all other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 2 is a schematic flow chart of a power communication data clustering method based on an improved K-means algorithm according to a first embodiment of the present invention. As shown in FIG. 2, the method includes:
S101, normalizing the power communication data;
S102, manually selecting an initial classification number K for the normalized data, determining an element distance matrix according to the K value, and determining K initial clustering centers;
S103, selecting an element, and determining the classification group corresponding to the element by calculating the distance between the element and each initial clustering center;
S104, updating the clustering centers of the classification groups, and determining the actual clustering center of each classification group;
S105, repeating step S103 until the classification groups no longer change, and obtaining the classification of the power communication data.
Specifically, this embodiment improves on the manually given initial classification number K and initial clustering centers of the traditional K-means clustering algorithm. An element distance matrix is determined for the given initial classification number K, and the group of elements with the largest average distance is selected as the initial clustering centers so as to increase the dispersion of the initial elements. Because the remaining initial clustering centers are also selected according to the element distance matrix, the selection is more reasonable, which improves both the clustering effect and the rationality of the classification.
FIG. 3 is a schematic flow chart of a power communication data clustering method based on an improved K-means algorithm according to a second embodiment of the present invention. As shown in FIG. 3, on the basis of the first embodiment, the method further includes:
S106, judging whether the initial classification number K meets the optimal classification value.
In this embodiment, the manually selected initial classification number K can be dynamically adjusted according to the clustering effect, which improves the clustering effect, the rationality of the clustering, and the efficiency of redundant data processing in the power communication network.
Further, in step S101, normalizing the power communication data specifically means converting the character-type values, continuous-type values and discrete-type values in the power communication data into values that are easy to process; after conversion, the character-type, continuous-type and discrete-type values all lie between 0 and 1.
Character-type value conversion: the character-type values can be enumerated to obtain their value range; without loss of generality, suppose a character-type attribute in the power communication data takes N values in total, and the conversion formula can be expressed as follows:
in formula (1), $x_i$ and $x_i^{*}$ are the values of character-type attribute i of the power communication data before and after processing, respectively, and $Cha_1$, $Cha_2$, … are the N character-type values of the attribute, each of which is converted into a corresponding numerical value between 0 and 1 according to its character value type;
continuous-type value processing: continuous values in the power communication data are processed by a normalization method, and the processing formula can be expressed as follows:
$$x_i^{*} = \frac{x_i - x_i^{\min}}{x_i^{\max} - x_i^{\min}} \tag{2}$$
in formula (2), $x_i$ and $x_i^{*}$ are the values of continuous-type attribute i of the power communication data before and after processing, respectively, and $x_i^{\max}$ and $x_i^{\min}$ are the upper and lower limit values of the continuous attribute;
discrete-type values are processed in a way similar to character-type values, i.e., they are likewise converted according to their possible values.
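The normalization of step S101 can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the min-max form follows the description of formula (2), and the evenly spaced codes used for character-type values are an assumption, since the text only states that each of the N character values is mapped to a number between 0 and 1.

```python
def normalize_continuous(values, lower=None, upper=None):
    """Min-max normalization of a continuous attribute into [0, 1].

    lower/upper are the attribute's limit values; if omitted they are
    taken from the data (an assumption of this sketch).
    """
    lo = min(values) if lower is None else lower
    hi = max(values) if upper is None else upper
    if hi == lo:
        return [0.0 for _ in values]  # constant attribute carries no information
    return [(v - lo) / (hi - lo) for v in values]


def encode_categories(values, categories=None):
    """Map N character (or discrete) values to evenly spaced codes in [0, 1].

    The even spacing is an assumed choice, not taken from the source.
    """
    cats = sorted(set(values)) if categories is None else list(categories)
    n = len(cats)
    code = {c: (idx / (n - 1) if n > 1 else 0.0) for idx, c in enumerate(cats)}
    return [code[v] for v in values]


scaled = normalize_continuous([10, 20, 30])
coded = encode_categories(["fiber", "microwave", "fiber", "carrier"])
```

Bringing every attribute into the same 0 to 1 range keeps any single attribute from dominating the Euclidean distances used in the later steps.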
FIG. 4 is a schematic flow chart of a power communication data clustering method based on an improved K-means algorithm according to a third embodiment of the present invention. As shown in FIG. 4, on the basis of the second embodiment, manually selecting an initial classification number K for the normalized data, determining an element distance matrix according to the K value, and determining K initial clustering centers specifically includes:
S1021, manually selecting an initial classification number K;
S1022, calculating the distance between each pair of elements according to the Euclidean distance formula:
$$d(x_i, x_j) = \sqrt{\sum_{m=1}^{M} \left(x_{i,m} - x_{j,m}\right)^2} \tag{3}$$
assuming that the power communication data to be analyzed after normalization contain N items and each item has M attributes, in formula (3) $x_i$ denotes the i-th item of data, $x_{i,m}$ denotes the m-th attribute value of the i-th item, and m indexes the dimension (attribute); the distance between data is defined as the Euclidean distance over their attribute values, so that $d(x_i, x_j)$ denotes the distance between data $x_i$ and data $x_j$;
S1023, obtaining an element distance matrix from the distances between the elements, the matrix being of order N×N, in which the element in row i and column j is the distance $d(x_i, x_j)$ between data $x_i$ and data $x_j$; and determining the average value of each row, i.e., the average distance between the data corresponding to that row and all other data;
S1024, selecting the element with the largest average distance as the first initial clustering center, each remaining initial clustering center being selected so that its average distance to the already selected initial elements is the largest, namely:
$$\max_{i} \; \frac{1}{J} \sum_{t=1}^{J} d(x_i, x_t) \tag{4}$$
in formula (4), J is the number of initial elements already selected, $x_t$ ranges over those J selected elements and $x_i$ over the elements not yet selected; initial elements are added one by one until their total number equals the initial classification number K. The initial clustering centers obtained in this way have the maximum average distance, which is most favorable for clustering. The positions of the classification centers are determined from the selected initial clustering centers: the initial center position of a classification is the attribute values of the corresponding initial element, so that if t is an initial clustering center, its position is $(x_{t,1}, x_{t,2}, \cdots, x_{t,M})$.
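Steps S1022 to S1024 can be sketched as below. The function names are hypothetical, the row average divides by N-1 (the "all other data" reading of S1023), and ties are broken by taking the lowest index; these are assumptions of the sketch rather than details given in the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two attribute vectors, as in formula (3)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def distance_matrix(data):
    """N x N matrix whose entry (i, j) is d(x_i, x_j)."""
    n = len(data)
    return [[euclidean(data[i], data[j]) for j in range(n)] for i in range(n)]


def select_initial_centers(data, k):
    """Return the indices of K initial clustering centers (needs len(data) >= 2)."""
    d = distance_matrix(data)
    n = len(data)
    # S1023: average distance from each item to all other items.
    row_avg = [sum(row) / (n - 1) for row in d]
    # S1024: the first center is the element with the largest average distance.
    chosen = [max(range(n), key=lambda i: row_avg[i])]
    # Remaining centers: maximize the average distance to the centers chosen so far.
    while len(chosen) < k:
        candidates = [i for i in range(n) if i not in chosen]
        nxt = max(
            candidates,
            key=lambda i: sum(d[i][t] for t in chosen) / len(chosen),
        )
        chosen.append(nxt)
    return chosen


centers = select_initial_centers([(0.0,), (0.1,), (0.2,), (1.0,)], k=2)
```

In this small example the isolated item at 1.0 is picked first and the item farthest from it, at 0.0, is picked second, so the two starting centers are spread across the data instead of landing in the same dense region.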
Further, in step S103, selecting an element and determining the classification group corresponding to the element by calculating the distance between the element and each initial clustering center specifically includes: defining the distance between an element and a classification as the Euclidean distance between the element and the initial clustering center of that classification, and calculating the distance between the selected element and each initial clustering center, namely:
$$d(x_i, x_t) = \sqrt{\sum_{m=1}^{M} \left(x_{i,m} - x_{t,m}\right)^2} \tag{5}$$
in formula (5), $d(x_i, x_t)$ is the distance between element i and initial clustering center t; according to the distances between the element and each initial clustering center, the element is clustered into the classification with the minimum distance.
Further, in step S104, updating the clustering centers of the classification groups and determining the actual clustering centers of the classification groups, where the actual clustering center of a classification is the average of the corresponding attributes of all elements belonging to that classification, specifically includes: when an element is added, the update formula for the j-th attribute value of the center position of the classification group can be expressed as:
$$x_{t,j} = \frac{(N_t - 1)\, x_{t,j}' + x_{i,j}}{N_t} \tag{6}$$
in formula (6), $x_{t,j}$ is the j-th attribute value of the actual clustering center t' of the classification group after the element is added, $x_{t,j}'$ is the j-th attribute value of the actual clustering center t' of the classification group before the element is added, $N_t$ is the number of elements in the group after the element is added, and $x_{i,j}$ is the j-th attribute value of the added element.
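Since the actual clustering center is the mean of the group's members, adding one element can be handled as a running-mean update instead of recomputing the mean from scratch. The sketch below assumes that reading; `update_center` and its parameter names are hypothetical.

```python
def update_center(old_center, n_after, new_element):
    """Running-mean update of a clustering center when one element joins.

    old_center  -- center before the element is added
    n_after     -- number of elements in the group after the addition (N_t)
    new_element -- attribute vector of the added element
    """
    return [
        ((n_after - 1) * c + x) / n_after
        for c, x in zip(old_center, new_element)
    ]


# A group starting with (0, 0), then receiving (1, 0.5) and (2, 1).
center = update_center([0.0, 0.0], 2, [1.0, 0.5])
center = update_center(center, 3, [2.0, 1.0])
```

After both additions the center equals the plain average of the three elements, which is the property the update relies on.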
FIG. 5 is a schematic flow chart of a power communication data clustering method based on an improved K-means algorithm according to a fourth embodiment of the present invention. As shown in FIG. 5, on the basis of the third embodiment, determining whether the initial classification number K satisfies the optimal classification value specifically includes:
S1061, calculating the distance between the actual clustering centers t' according to the Euclidean distance formula, namely:
$$d(x_{t1}, x_{t2}) = \sqrt{\sum_{m=1}^{M} \left(x_{t1,m} - x_{t2,m}\right)^2} \tag{7}$$
in formula (7), $d(x_{t1}, x_{t2})$ is the inter-class distance between actual clustering center t1 and actual clustering center t2;
S1062, calculating the minimum value of the inter-class distances among all the actual clustering centers t', i.e., the minimum inter-class distance $TD_{min}$;
S1063, calculating the average value of the inter-class distances among all the actual clustering centers t', i.e., the average inter-class distance $TD_{ave}$;
S1064, calculating the maximum value of the distances between all elements in the same classification, i.e., the maximum intra-class distance $ITD_{max}$;
S1065, judging whether the minimum inter-class distance $TD_{min}$ is much smaller than the average inter-class distance $TD_{ave}$; if so, returning to step S102, otherwise executing step S1066;
S1066, judging whether the maximum intra-class distance $ITD_{max}$ is much larger than the average inter-class distance $TD_{ave}$; if so, returning to step S102, otherwise executing step S1067;
S1067, the initial classification number K meets the optimal classification value, and the classification of the power communication data can be output.
Specifically, if the manually selected initial classification number K is too large, the data are split into more classes than actually required, and the minimum inter-class distance $TD_{min}$ will be much smaller than the average inter-class distance $TD_{ave}$; conversely, if the initial classification number K is too small, the classification is too coarse to meet the actual requirement, and the maximum intra-class distance $ITD_{max}$ of some group will be far greater than the average inter-class distance $TD_{ave}$.
If the relationship exists:
$$TD_{min} < mm \cdot TD_{ave} \tag{8}$$
in formula (8), mm is a given small number, generally 0.2; then the value of K is considered too large, K-1 replaces the original value of K, and the method returns to step S102 to cluster again;
if the relationship exists:
$$ITD_{max} > MM \cdot TD_{ave} \tag{9}$$
in formula (9), MM is a given large number, generally 8; then the value of K is considered too small, K+1 replaces the original value of K, and the method returns to step S102 to cluster again.
When the initial classification number K satisfies neither formula (8) nor formula (9), its value is reasonable, and the result is output and the method ends.
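The K check of step S106 can be sketched as a function that reports whether K should shrink, grow, or stay. The function name `check_k` and its return strings are hypothetical; the defaults 0.2 and 8 follow the typical values given for mm and MM; "much smaller" is read as $TD_{min} < mm \cdot TD_{ave}$; and the intra-class distance is taken as the largest pairwise distance between members of one group, which is one plausible reading of the text. At least two centers are required.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two attribute vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def check_k(clusters, centers, mm=0.2, MM=8.0):
    """Return 'decrease', 'increase' or 'ok' for the current number of classes.

    clusters -- list of groups, each a list of member attribute vectors
    centers  -- actual clustering centers, one per group (two or more)
    """
    inter = [euclidean(a, b) for a, b in combinations(centers, 2)]
    td_min = min(inter)               # minimum inter-class distance
    td_ave = sum(inter) / len(inter)  # average inter-class distance
    # Maximum intra-class distance over all groups (0.0 if no group has two members).
    itd_max = max(
        (euclidean(p, q) for members in clusters for p, q in combinations(members, 2)),
        default=0.0,
    )
    if td_min < mm * td_ave:
        return "decrease"  # two centers nearly coincide: K is too large, use K-1
    if itd_max > MM * td_ave:
        return "increase"  # some group is too spread out: K is too small, use K+1
    return "ok"


verdict = check_k(
    clusters=[[(0, 0), (1, 0)], [(10, 0), (9, 0)]],
    centers=[(0, 0), (10, 0)],
)
```

A caller would rerun the clustering from step S102 with the adjusted K until the function reports that no change is needed; adding an iteration cap to that loop is advisable, since the text does not state that the adjustment always terminates.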
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (8)
1. A power communication data clustering method based on an improved K-means algorithm, characterized by comprising the following steps:
S101, normalizing the power communication data;
S102, manually selecting an initial classification number K for the normalized data, determining an element distance matrix according to the K value, and determining K initial clustering centers;
S103, selecting an element, and determining the classification group corresponding to the element by calculating the distance between the element and each initial clustering center;
S104, updating the clustering centers of the classification groups, and determining the actual clustering center of each classification group;
S105, repeating step S103 until the classification groups no longer change, and obtaining the classification of the power communication data.
2. The power communication data clustering method based on the improved K-means algorithm as claimed in claim 1, characterized by further comprising:
S106, judging whether the initial classification number K meets the optimal classification value.
3. The power communication data clustering method based on the improved K-means algorithm as claimed in claim 1, wherein: normalizing the power communication data specifically means converting the character-type values, continuous-type values and discrete-type values in the power communication data into values that are easy to process;
character-type value conversion: a character-type attribute in the power communication data takes N values in total, and the conversion formula can be expressed as follows:
in formula (1), $x_i$ and $x_i^{*}$ are the values of character-type attribute i of the power communication data before and after processing, respectively, and $Cha_1$, $Cha_2$, … are the N character values of the attribute, each of which is converted into a numerical value between 0 and 1 according to its character value type;
continuous-type value processing: continuous values in the power communication data are processed by a normalization method, and the processing formula can be expressed as follows:
$$x_i^{*} = \frac{x_i - x_i^{\min}}{x_i^{\max} - x_i^{\min}} \tag{2}$$
in formula (2), $x_i$ and $x_i^{*}$ are the values of continuous-type attribute i of the power communication data before and after processing, respectively, and $x_i^{\max}$ and $x_i^{\min}$ are the upper and lower limit values of the continuous attribute.
4. The power communication data clustering method based on the improved K-means algorithm as claimed in claim 1, wherein: the normalized data is subjected to manual selection of an initial classification number K, an element distance matrix is determined according to a K value, and K initial clustering centers are determined, and the method specifically comprises the following steps:
s1021, manually selecting an initial classification number K;
s1022, calculating the distance between each element according to an Euclidean distance formula;
suppose thatThe electric power communication data to be analyzed after data normalization processing have N items, the data contain M items with attribute, and x in formula (3)iDenotes the ith item, xi,jJ attribute value representing ith item of data, m represents dimension, d (x)i,xj) Representing data xiAnd data xjThe distance between them;
S1023, obtaining an element distance matrix from the distances between the elements, and determining the mean of each row of the matrix, namely the average distance between the data item corresponding to that row and all other data items;
S1024, selecting the element with the maximum average distance as the first initial clustering center, and selecting each remaining initial clustering center so that its average distance to the already selected initial elements is maximal, namely:
in formula (4), J is the number of initial elements already selected, which is increased one by one until the total number of initial elements equals the initial classification number K, and t denotes an initial clustering center, so that initial clustering center t is represented as $(x_{t,1}, x_{t,2}, \ldots, x_{t,M})$.
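Steps S1022 to S1024 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `euclidean` and `select_initial_centers` are hypothetical, each row mean is taken over the other N - 1 items, ties are broken by the lowest index, and the greedy rule for the remaining centers follows the wording of S1024 rather than formula (4), which is not reproduced in this text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def select_initial_centers(data, k):
    """Return the indices of k initial clustering centers (needs 2 <= k <= len(data))."""
    n = len(data)
    # S1022/S1023: element distance matrix and the mean of each row.
    dist = [[euclidean(data[i], data[j]) for j in range(n)] for i in range(n)]
    # The diagonal is zero, so dividing by n - 1 averages over the other items.
    row_avg = [sum(row) / (n - 1) for row in dist]

    # S1024: the item with the largest average distance is the first center.
    chosen = [max(range(n), key=row_avg.__getitem__)]

    # Add centers one by one until k are selected, each time taking the item
    # whose average distance to the already chosen centers is largest.
    while len(chosen) < k:
        candidates = [i for i in range(n) if i not in chosen]
        best = max(
            candidates,
            key=lambda i: sum(dist[i][c] for c in chosen) / len(chosen),
        )
        chosen.append(best)
    return chosen


# Four normalized items with two attributes each (made-up values).
points = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.0, 1.0]]
centers = select_initial_centers(points, 3)
```

Building the full N x N matrix costs O(N²) time and memory, so a sketch like this is practical only for moderately sized data sets.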
5. The power communication data clustering method based on the improved K-means algorithm as claimed in claim 1, wherein: selecting an element and determining the classification group corresponding to the element by calculating the distance between the element and each initial clustering center specifically comprises: calculating the distance between the selected element and each initial clustering center by formula (5), namely:
in formula (5), $d(x_i, x_t)$ is the distance between element i and initial clustering center t; according to the distance values between the element and each initial clustering center, element i is clustered into the classification with the minimum distance.
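The assignment rule of this claim amounts to a nearest-center lookup. The sketch below assumes the Euclidean distance is used for formula (5), as in claim 4; the function name `nearest_center` and the sample values are hypothetical.

```python
import math


def nearest_center(element, centers):
    """Return the index of the clustering center closest to `element`."""
    distances = [
        math.sqrt(sum((p - q) ** 2 for p, q in zip(element, center)))
        for center in centers
    ]
    # The element joins the classification whose center is at minimum distance.
    return min(range(len(centers)), key=distances.__getitem__)


initial_centers = [[0.0, 0.0], [1.0, 1.0]]
group = nearest_center([0.2, 0.1], initial_centers)
```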
6. The power communication data clustering method based on the improved K-means algorithm as claimed in claim 1, wherein: updating the clustering centers of the classification groups and determining the actual clustering centers of the classification groups specifically comprises: when an element is added, the update formula for the j-th attribute value of the center position of the classification group can be expressed as:
in formula (6), $x_{t,j}$ is the j-th attribute value of the actual clustering center t' of the classification group after the element is added, $x_{t,j}'$ is the j-th attribute value of the actual clustering center t' of the classification group before the element is added, $N_t$ is the number of elements in the group after the element is added, and $x_{i,j}$ is the j-th attribute value of the added element.
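With the symbols defined above, one update consistent with those definitions is the standard running mean, $x_{t,j} = \left((N_t - 1)\,x_{t,j}' + x_{i,j}\right) / N_t$. Formula (6) itself is not reproduced in this text, so this form is an assumption. A minimal Python sketch, with the hypothetical function name `update_center`:

```python
def update_center(center_before, count_before, element):
    """Running-mean update of a group's center when one element is added.

    center_before: center attribute values before the addition
    count_before:  number of elements in the group before the addition
    element:       attribute values of the added element
    Returns (center_after, count_after), where count_after plays the role of N_t.
    """
    n_t = count_before + 1
    center_after = [
        ((n_t - 1) * c + x) / n_t for c, x in zip(center_before, element)
    ]
    return center_after, n_t


# A group that starts with one element (its initial center), then grows.
center, count = update_center([0.0, 1.0], 1, [1.0, 0.0])
center2, count2 = update_center(center, count, [2.0, 2.0])
```

Updating incrementally avoids recomputing the mean over every member each time an element joins the group.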
7. The power communication data clustering method based on the improved K-means algorithm as claimed in claim 2, wherein: determining whether the initial classification number K satisfies the optimal classification value specifically comprises:
S1061, calculating the distances between the actual clustering centers t' according to the Euclidean distance formula, namely:
in formula (7), $d(x_{t1}, x_{t2})$ is the inter-class distance between actual clustering center t1 and actual clustering center t2;
S1061, calculating the minimum of the inter-class distances among all the actual clustering centers t', namely the minimum inter-class distance $TD_{min}$;
S1062, calculating the average of the inter-class distances among all the actual clustering centers t', namely the average inter-class distance $TD_{ave}$;
S1063, calculating the maximum of the distances between all elements in the same classification, namely the maximum intra-class distance $ITD_{max}$;
S1064, judging whether the minimum inter-class distance $TD_{min}$ is much less than the average inter-class distance $TD_{ave}$; if so, returning to step S102; otherwise, executing step S1065;
S1065, judging whether the maximum intra-class distance $ITD_{max}$ is much larger than the average inter-class distance $TD_{ave}$; if so, returning to step S102; otherwise, executing step S1066;
S1066, determining that the initial classification number K satisfies the optimal classification value, and outputting the classification of the power communication data.
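The check on K in this claim can be sketched as follows. The claim does not quantify "much less than" or "much larger than", so the ratio parameters `small_ratio` and `large_ratio` and their default values are purely hypothetical, as is the function name `k_is_acceptable`. The sketch assumes at least two clustering centers.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def k_is_acceptable(centers, clusters, small_ratio=0.2, large_ratio=2.0):
    """Return True if K passes both checks, False if K should be reselected.

    centers:  actual clustering centers, one attribute vector per class
    clusters: one list of member attribute vectors per class
    """
    # Inter-class distances between every pair of actual clustering centers.
    center_dists = [euclidean(a, b) for a, b in combinations(centers, 2)]
    td_min = min(center_dists)
    td_ave = sum(center_dists) / len(center_dists)

    # Largest distance between two elements that share a classification.
    itd_max = max(
        (euclidean(a, b) for members in clusters
         for a, b in combinations(members, 2)),
        default=0.0,
    )

    # S1064: two centers nearly coincide relative to the average spacing.
    if td_min < small_ratio * td_ave:
        return False
    # S1065: some class is far wider than the average center spacing.
    if itd_max > large_ratio * td_ave:
        return False
    # S1066: K is accepted and the classification can be output.
    return True


ok = k_is_acceptable(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    [[[0.0, 0.0], [0.1, 0.0]], [[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0]]],
)
```

A `False` result corresponds to returning to step S102 and choosing a different K.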
8. The power communication data clustering method based on the improved K-means algorithm as claimed in claim 3, wherein: the converted values of the character-type, continuous-type, and discrete-type numerical values all lie between 0 and 1.
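As a rough illustration of conversions that keep every value between 0 and 1, the Python sketch below covers the character-type and continuous-type cases only. Formulas (1) and (2) are not reproduced in this text, so the evenly spaced character mapping and the min-max normalization are assumptions. The function names and sample values are hypothetical.

```python
def encode_characters(values, categories):
    """Map each character value to an evenly spaced number in [0, 1].

    `categories` lists the N distinct character values of the attribute;
    the k-th category (0-based) is mapped to k / (N - 1).
    """
    n = len(categories)
    if n == 1:
        return [0.0 for _ in values]
    position = {c: k / (n - 1) for k, c in enumerate(categories)}
    return [position[v] for v in values]


def min_max_normalize(values):
    """Rescale continuous values linearly so that min -> 0 and max -> 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


media = encode_characters(
    ["fiber", "microwave", "carrier"],
    ["carrier", "fiber", "microwave"],
)
delays = min_max_normalize([10.0, 20.0, 40.0])
```

Putting all attributes on the same 0 to 1 scale keeps any single attribute from dominating the Euclidean distances used in the later clustering steps.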
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911286973.5A CN111191687B (en) | 2019-12-14 | 2019-12-14 | Power communication data clustering method based on improved K-means algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111191687A (en) | 2020-05-22 |
CN111191687B (en) | 2023-02-10 |
Family
ID=70709187
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911286973.5A Active CN111191687B (en) | 2019-12-14 | 2019-12-14 | Power communication data clustering method based on improved K-means algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111191687B (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101329683A (en) * | 2008-07-25 | 2008-12-24 | 华为技术有限公司 | Recommendation system and method |
CN103440566A (en) * | 2013-08-27 | 2013-12-11 | 北京京东尚科信息技术有限公司 | Method and device for generating order picking collection lists and method for optimizing order picking route |
CN105095516A (en) * | 2015-09-16 | 2015-11-25 | 中国传媒大学 | Broadcast television subscriber grouping system and method based on spectral clustering integration |
CN106202335A (en) * | 2016-06-28 | 2016-12-07 | 银江股份有限公司 | A kind of big Data Cleaning Method of traffic based on cloud computing framework |
CN106682079A (en) * | 2016-11-21 | 2017-05-17 | 云南电网有限责任公司电力科学研究院 | Detection method of user's electricity consumption behavior of user based on clustering analysis |
WO2018157286A1 (en) * | 2017-02-28 | 2018-09-07 | 深圳市大疆创新科技有限公司 | Recognition method and device, and movable platform |
CN108629375A (en) * | 2018-05-08 | 2018-10-09 | 广东工业大学 | Power customer sorting technique, system, terminal and computer readable storage medium |
CN108898154A (en) * | 2018-09-29 | 2018-11-27 | 华北电力大学 | A kind of electric load SOM-FCM Hierarchical clustering methods |
CN109034231A (en) * | 2018-07-17 | 2018-12-18 | 辽宁大学 | The deficiency of data fuzzy clustering method of information feedback RBF network valuation |
CN109271427A (en) * | 2018-10-17 | 2019-01-25 | 辽宁大学 | A kind of clustering method based on neighbour's density and manifold distance |
US20190073416A1 (en) * | 2016-11-14 | 2019-03-07 | Ping An Technology (Shenzhen) Co., Ltd. | Method and device for processing question clustering in automatic question and answering system |
CN109685128A (en) * | 2018-12-18 | 2019-04-26 | 电子科技大学 | A kind of MB-kmeans++ clustering method and the user conversation clustering method based on it |
CN109934301A (en) * | 2019-03-22 | 2019-06-25 | 广东电网有限责任公司 | A kind of power load aggregation analysis method, device and equipment |
CN110263837A (en) * | 2019-06-13 | 2019-09-20 | 河海大学 | A kind of circuit breaker failure diagnostic method based on multilayer DBN model |
Non-Patent Citations (2)
Title |
---|
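Li Xiuxin et al.: "Research on a satellite cloud image clustering method based on an improved FCM algorithm", Infrared Technology * |
Zou Chensong et al.: "Collaborative K clustering algorithm based on maximum distance product and minimum distance sum", Computer Applications and Software * |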
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111680764A (en) * | 2020-08-13 | 2020-09-18 | 国网浙江省电力有限公司 | Industry reworking and production-resuming degree monitoring method |
CN111680937A (en) * | 2020-08-13 | 2020-09-18 | 国网浙江省电力有限公司营销服务中心 | Small and micro enterprise rework rate evaluation method based on power data grading and empowerment |
CN111680937B (en) * | 2020-08-13 | 2020-11-13 | 国网浙江省电力有限公司营销服务中心 | Small and micro enterprise rework rate evaluation method based on power data grading and empowerment |
CN112507607A (en) * | 2020-11-12 | 2021-03-16 | 中国电建集团中南勘测设计研究院有限公司 | Method for correcting pressure intensity calculation result of water-proof curtain wall |
CN112507607B (en) * | 2020-11-12 | 2023-02-10 | 中国电建集团中南勘测设计研究院有限公司 | Method for correcting pressure intensity calculation result of water-proof curtain wall |
CN116360352A (en) * | 2022-12-02 | 2023-06-30 | 山东和信智能科技有限公司 | Intelligent control method and system for power plant |
CN116360352B (en) * | 2022-12-02 | 2024-04-02 | 山东和信智能科技有限公司 | Intelligent control method and system for power plant |
Also Published As
Publication number | Publication date |
---|---|
CN111191687B (en) | 2023-02-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111191687B (en) | Power communication data clustering method based on improved K-means algorithm | |
CN107578288B (en) | Non-invasive load decomposition method considering user power consumption mode difference | |
CN106446967A (en) | Novel power system load curve clustering method | |
CN108932557A (en) | A kind of Short-term Load Forecasting Model based on temperature cumulative effect and grey relational grade | |
CN106408008A (en) | Load curve distance and shape-based load classification method | |
CN111489188B (en) | Resident adjustable load potential mining method and system | |
CN112367675B (en) | Wireless sensor network data fusion method and network system based on self-encoder | |
CN114040272B (en) | Path determination method, device and storage medium | |
CN110705685A (en) | Neural network quantitative classification method and system | |
CN115696690B (en) | Distributed intelligent building illumination self-adaptive energy-saving control method | |
CN111541628A (en) | Power communication network service resource allocation method and related device | |
CN109272058A (en) | Integrated power load curve clustering method | |
CN114781717A (en) | Network point equipment recommendation method, device, equipment and storage medium | |
CN114358378A (en) | User side energy storage optimal configuration system and method for considering demand management | |
CN113676357A (en) | Decision method for edge data processing in power internet of things and application thereof | |
Gong et al. | Adaptive interactive genetic algorithms with individual interval fitness | |
CN113112177A (en) | Transformer area line loss processing method and system based on mixed indexes | |
Lin et al. | Deployment method of power terminal edge control center based on cloud-edge cooperative mode | |
CN117034046A (en) | Flexible load adjustable potential evaluation method based on ISODATA clustering | |
CN111080164A (en) | Power load clustering result evaluation method based on daily load curve | |
CN110689452A (en) | Clustering algorithm-based power market business center service center planning method | |
CN108205721B (en) | Spline interpolation typical daily load curve selecting device based on clustering | |
CN114781703A (en) | Hierarchical multi-objective optimization method, terminal equipment and storage medium | |
CN115186882A (en) | Clustering-based controllable load spatial density prediction method | |
CN106777298A (en) | A kind of distributed clustering method based on fractal technology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||