CN117332287A - Evaluation index weight data processing method based on cluster analysis - Google Patents

Publication number
CN117332287A
Authority
CN
China
Prior art keywords
determining
classification
index
evaluation index
evaluation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311273672.5A
Other languages
Chinese (zh)
Inventor
景春阳
张俊斌
黄雪鹰
王希阔
刘洋
林洋
陆盼盼
范一凡
刘振元
吴宪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
63856 Force Of Chinese Pla
Original Assignee
63856 Force Of Chinese Pla
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Economics (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the technical field of evaluation index weight data processing, and in particular to an evaluation index weight data processing method based on cluster analysis. The method comprises the following steps: determining an evaluation target and an evaluation index set; scoring the relative weights of the evaluation indexes under the evaluation target to obtain original judgment matrices; extracting the scoring values to obtain an original weight data point set; classifying the original weight data point set by cluster analysis to obtain classification clusters; calculating the distance between the cluster center points of each two adjacent classification clusters to obtain distance values; determining a reasonable classification number and the corresponding classified data point set according to the distance values; determining a distance data set; applying elimination to the distance data set and updating it to obtain the cluster center point of the updated data point set; and determining the weight values of the evaluation indexes. The resulting weights discard the original weight data that fall in minority classes and retain the original weight data that fall in the majority class, which achieves the purpose of weight processing.

Description

Evaluation index weight data processing method based on cluster analysis
Technical Field
The application relates to the technical field of evaluation index weight data processing, in particular to an evaluation index weight data processing method based on cluster analysis.
Background
The weight of an evaluation index is the degree to which that index contributes to achieving the overall target; it is a coefficient reflecting the relative value of the index within the evaluation object. In other words, the weight measures the importance of each index, and the weights directly influence the evaluation result.
The subjective weighting method is a common way of determining weights. It relies mainly on the experience, knowledge and personal values of experts, so the original weight data come mainly from expert judgment. Determining the weight of each index in an evaluation index system usually requires several experts to judge each index independently, after which the final index weights are determined by processing the combined data. The method used to process the original weight data therefore directly influences the final index weights.
At present, the original weight data are mostly processed by the arithmetic mean or the geometric mean. In actual evaluation work, however, experts differ in expertise, working experience, knowledge structure and so on, so their weight judgments can differ considerably, especially when there are many indexes and some of them are disputed. In that case averaging ignores the differences between the experts' judgments: the mean of the weight data neither represents the judgment of all experts nor preserves the common opinion of the majority.
Disclosure of Invention
The application provides an evaluation index weight data processing method based on cluster analysis, which can solve the problems that the existing weight data processing method ignores index weight judgment differences and cannot keep most common opinions.
The technical scheme of the application is an evaluation index weight data processing method based on cluster analysis, which comprises the following steps:
S1: determining an evaluation target and determining a plurality of evaluation index sets corresponding to the evaluation target, wherein each evaluation index set comprises the same number of evaluation indexes;
scoring, for each evaluation index set, the relative weights of its evaluation indexes under the evaluation target, and correspondingly obtaining a plurality of original judgment matrices;
S2: extracting scoring values from the plurality of original judgment matrices, and correspondingly obtaining scoring value sets and an original weight data point set corresponding to the scoring value sets;
S3: determining several preset classification numbers for the original weight data point set;
classifying the original weight data point set by cluster analysis under each preset classification number to obtain a plurality of classification clusters corresponding to the different preset classification numbers;
S4: for the classification clusters corresponding to each preset classification number, calculating the distance between the cluster center points of each two adjacent classification clusters to obtain a plurality of distance values corresponding to the different classification numbers;
determining, according to the plurality of distance values, a reasonable classification number corresponding to the evaluation index set and a classified data point set corresponding to the reasonable classification number;
S5: determining the cluster center point of the classified data point set and determining a distance data set according to the cluster center point;
applying elimination to the distance data set and updating it, correspondingly obtaining an updated data point set and the cluster center point corresponding to the updated data point set;
and constructing an updated judgment matrix according to the cluster center point corresponding to the updated data point set, and determining the weight values of the evaluation indexes from the updated judgment matrix.
Optionally, S1 includes:
S11: determining an evaluation target and determining n evaluation index sets corresponding to the evaluation target;
each evaluation index set comprises the same number of evaluation indexes $X_k$ ($k = 1, 2, \ldots, m$);
selecting a reference index $X_p$ among the m evaluation indexes in each evaluation index set;
S12: in each evaluation index set, determining the basic importance score $a_{pk}$ between the reference index $X_p$ and each of the m evaluation indexes;
S13: in each evaluation index set, determining, according to the basic importance scores $a_{pk}$, the relative importance score $a_{ij}$ between any two evaluation indexes $X_i$ and $X_j$ ($i, j = 1, 2, \ldots, m$);
the relative importance score $a_{ij}$ is calculated as follows:
$a_{ij} = a_{kj} / a_{ki}$
wherein $a_{kj}$ represents the importance score of evaluation index $X_k$ relative to evaluation index $X_j$, and $a_{ki}$ represents the importance score of evaluation index $X_k$ relative to evaluation index $X_i$;
S14: constructing the original judgment matrices $A_z = (a_{ij})_{m \times m}$ ($z = 1, 2, \ldots, n$) from the relative importance scores $a_{ij}$ corresponding to the n evaluation index sets.
Optionally, S2 includes:
S21: extracting the scoring values $c_{ij}$ at the same several positions in each original judgment matrix, obtaining n scoring value sets, one for each original judgment matrix;
each scoring value set includes (m-1) scoring value points;
S22: constructing the original weight data point set $C = (c_{ij})_{n \times (m-1)}$.
Optionally, S3 includes:
S31: determining several preset classification numbers K for the original weight data point set;
K represents a preset classification number for the original weight data point set; assuming the index importance scores are divided into K classes, the number of points in each class is $n_1, n_2, \ldots, n_K$ and the weight data point set of each class is $C_1, C_2, \ldots, C_K$, and these satisfy the following relationship:
$n_1 + n_2 + \cdots + n_K = n$; $C_1 \cup C_2 \cup \cdots \cup C_K = C$; for any $i, j \in [1, K]$,
S32: taking random points from the original weight data point set $C = (c_{ij})_{n \times (m-1)}$ and using them as cluster centers to form the cluster center set $H = (h_{ij})_{K \times (m-1)}$;
$h_{ij}$ represents the random points;
S33: for each preset classification number K, calculating the Euclidean distance between each scoring value point in the original weight data point set and each of the K cluster centers, obtaining the Euclidean distances corresponding to each scoring value point;
the Euclidean distance d between the scoring value point in row i of C and the cluster center in row k of H is calculated as follows:
$d = \sqrt{\sum_{j=1}^{m-1} (c_{ij} - h_{kj})^2}$
S34: for each preset classification number K, determining, from the Euclidean distances corresponding to each scoring value point, the cluster center nearest to that point, classifying the point under that cluster center, and correspondingly obtaining a plurality of classification clusters.
Optionally, S3 further includes:
S35: for each preset classification number K, determining weight coefficients and calculating, for each classification cluster, the weighted mean of all scoring value points in the cluster according to the weight coefficients;
updating the center of each classification cluster to its weighted mean, correspondingly obtaining a plurality of updated classification clusters;
S36: for each preset classification number K, determining termination conditions in terms of the number of iterations, the minimum squared error and the rate of change of the cluster center points;
iteratively executing S34-S35 until the classification clusters meet the termination condition;
and outputting the classification clusters corresponding to the different preset classification numbers K.
Optionally, S4 includes:
S41: for each preset classification number K, calculating the distance between the cluster center points of each two adjacent classification clusters, correspondingly obtaining a plurality of distance values;
S42: determining an acceptance threshold u;
for each preset classification number K, comparing each of the distance values with the acceptance threshold u, correspondingly obtaining judgment results;
S43: determining the reasonable classification number corresponding to the evaluation index set according to the judgment results.
Optionally, S43 includes:
S431: for each preset classification number K, if there is a judgment result in which the distance value is larger than the acceptance threshold u, and the ratio between the number of judgment results in which the distance value is larger than the acceptance threshold u and the total number of judgment results is larger than 1, the corresponding preset classification number K is considered a reasonable classification number;
S432: comparing the reasonable classification numbers, and taking as the classified data point set $C_r$ the class that contains the largest number of points under the reasonable classification number; $C_r$ contains r data points.
Optionally, S5 includes:
S51: calculating the Euclidean distances $d_1, d_2, \ldots, d_r$ between the cluster center point $(h_{r1}, h_{r2}, \ldots, h_{r,m-1})$ of the classified data point set $C_r$ and each of its data points, correspondingly obtaining a distance data set D;
S52: determining the median $d_m$ of the distance data set D;
calculating the median absolute deviation MAD about the median $d_m$;
the formula for calculating MAD is as follows:
$\mathrm{MAD} = \operatorname{median}(|d_i - d_m|)$, $i = 1, 2, \ldots, r$
S53: calculating the Z-Score according to the median absolute deviation MAD;
the formula for calculating Z-Score is as follows:
S54: eliminating the data points corresponding to distance values whose Z-Score is larger than 3.5, correspondingly obtaining the post-elimination data point set $C_t$; $C_t$ contains t data points;
S55: determining the cluster center point of the post-elimination data point set $C_t$;
constructing the updated judgment matrix A according to the cluster center point of $C_t$;
S56: determining the maximum eigenvalue $\lambda_{\max}$ of the updated judgment matrix;
determining the eigenvector W according to the maximum eigenvalue $\lambda_{\max}$, normalizing W, and correspondingly determining the weight values of the evaluation indexes in the updated judgment matrix.
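The elimination in S51-S54 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the application's own implementation. The cluster center is taken as the unweighted mean of the points. The Z-Score is assumed to be the modified Z-score $0.6745\,(d_i - d_m)/\mathrm{MAD}$, which is the form commonly paired with a 3.5 cutoff. Only points that lie unusually far from the center are eliminated. The function name `eliminate_outliers` is hypothetical.

```python
import numpy as np

def eliminate_outliers(points, threshold=3.5):
    """Drop points whose distance to the cluster center is an outlier (S51-S54)."""
    pts = np.asarray(points, dtype=float)
    center = pts.mean(axis=0)                  # cluster center of C_r (assumed: plain mean)
    d = np.linalg.norm(pts - center, axis=1)   # distance data set D
    d_m = np.median(d)                         # median of D
    mad = np.median(np.abs(d - d_m))           # median absolute deviation about d_m
    if mad == 0:                               # degenerate case: score undefined, keep all points
        return pts, center
    z = 0.6745 * (d - d_m) / mad               # modified Z-score (assumed form)
    kept = pts[z <= threshold]                 # post-elimination set C_t
    return kept, kept.mean(axis=0)             # C_t and its updated center
```

The returned center plays the role of the cluster center point of $C_t$ in S55. If expert weight coefficients are in use, a weighted mean would replace both calls to `mean`.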
Optionally, the method further comprises:
S6: performing a consistency calculation according to the maximum eigenvalue, correspondingly obtaining a consistency index;
judging from the result of the consistency calculation whether the updated judgment matrix needs to be adjusted, and if so, repeating steps S3-S5 until the judgment for the updated judgment matrix is that no adjustment is needed.
Optionally, S6 includes:
S61: performing a consistency calculation according to the maximum eigenvalue, correspondingly obtaining the consistency index CI;
the consistency index CI is calculated as follows:
S62: constructing a check index RI; calculating the consistency ratio CR according to the consistency index CI and the check index RI;
the consistency ratio CR is calculated as follows:
S63: determining a consistency threshold;
comparing the consistency ratio CR with the consistency threshold; if CR is greater than or equal to the consistency threshold, the updated judgment matrix is judged to need adjustment, and steps S3-S5 are repeated until the consistency ratio CR of the updated judgment matrix is smaller than the consistency threshold.
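The eigenvector weighting of S56 and the consistency check of S61-S63 can be sketched as follows. This is a minimal sketch, assuming the forms customary for pairwise judgment matrices: $CI = (\lambda_{\max} - m)/(m - 1)$ and $CR = CI/RI$. The `RI_TABLE` values are the commonly quoted random-index values and are an assumption here, since the text above does not list them. The function name `weights_and_consistency` is hypothetical.

```python
import numpy as np

# Commonly quoted random consistency index by matrix order (assumed, not from the text above).
RI_TABLE = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
            6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def weights_and_consistency(A):
    """Return (weights, lambda_max, CI, CR) for a square judgment matrix A."""
    A = np.asarray(A, dtype=float)
    m = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    idx = int(np.argmax(eigvals.real))        # position of the maximum eigenvalue
    lam_max = float(eigvals[idx].real)
    w = np.abs(eigvecs[:, idx].real)          # principal eigenvector, sign removed
    w = w / w.sum()                           # normalize so the weights sum to 1
    ci = (lam_max - m) / (m - 1) if m > 1 else 0.0
    ri = RI_TABLE[m]                          # orders above 9 are not covered by this table
    cr = ci / ri if ri > 0 else 0.0
    return w, lam_max, ci, cr
```

A caller would compare the returned `cr` with the consistency threshold of S63; 0.1 is a conventional choice, though the text above does not fix a value.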
The beneficial effects are that:
According to the technical scheme, the evaluation indexes in each evaluation index set are scored for relative weight under a given evaluation target to obtain original judgment matrices; the scoring values are extracted to form an original weight data point set; the points are classified by a cluster analysis algorithm; the optimal classification number and the data point set of the absolute majority class are determined from the distance values; and finally the evaluation index weight values are obtained. The scheme discards the original weight data in minority classes and retains the original weight data in the majority class instead of simply averaging, so the result better matches the nature of subjective weighting and the usual way of settling disputes, and achieves the purpose of weight processing;
in summary, the application can solve the problems that existing weight data processing methods ignore differences in index weight judgments and cannot preserve the common opinion of the majority.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of a method for processing evaluation index weight data based on cluster analysis in an embodiment of the present application;
FIG. 2 is a schematic diagram illustrating the setting of evaluation indexes in the embodiment of the present application;
FIG. 3 is a schematic diagram of clustering results of the original weight data point set into 3 classes in the embodiment of the present application;
FIG. 4 is a schematic diagram of the clustering result when the original weight data point set is divided into 2 classes in the embodiment of the present application.
Detailed Description
Reference will now be made in detail to the embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The embodiments described in the examples below do not represent all embodiments consistent with the present application; they are merely examples of systems and methods consistent with some aspects of the present application as detailed in the claims.
Example 1
The application provides an evaluation index weight data processing method based on cluster analysis, as shown in fig. 1, fig. 1 is a flow chart of the evaluation index weight data processing method based on cluster analysis in the embodiment of the application, which comprises the following steps:
S1: determining an evaluation target and determining a plurality of evaluation index sets corresponding to the evaluation target, wherein each evaluation index set comprises the same number of evaluation indexes;
scoring, for each evaluation index set, the relative weights of its evaluation indexes under the evaluation target, and correspondingly obtaining a plurality of original judgment matrices.
Specifically, the judgment matrices are established as follows: n experts in the technical field related to the evaluation target form a weight scoring group, and each expert independently scores the relative weights of the m evaluation indexes under the same evaluation target or the same evaluation criterion.
In this embodiment, as shown in FIG. 2, the evaluation criterion comprises four indexes in total: index A, index B, index C and index D.
Wherein S1 comprises:
S11: determining an evaluation target and determining n evaluation index sets corresponding to the evaluation target;
each evaluation index set comprises the same number of evaluation indexes $X_k$ ($k = 1, 2, \ldots, m$);
selecting a reference index $X_p$ among the m evaluation indexes in each evaluation index set.
Specifically, from the m peer indexes ($X_1, X_2, \ldots, X_m$), the one index $X_p$ considered most important is selected as the reference index.
In the embodiment of the present application, 100 sets of relative weight scoring data with three tendencies are generated randomly by simulation, and an original judgment matrix is established, so that 100 original judgment matrices are formed, one of which is shown in table 1.
TABLE 1 One of the original judgment matrices for the simulation data
Relative weights    Index A    Index B    Index C    Index D
Index A             1.0        3.8        3.9        3.1
Index B             0.3        1.0        1.0        0.8
Index C             0.3        1.0        1.0        0.8
Index D             0.3        1.2        1.2        1.0
S12: in each evaluation index set, determining the basic importance score $a_{pk}$ between the reference index $X_p$ and each of the m evaluation indexes.
Specifically, the reference index $X_p$ is compared with each other index $X_k$ ($k = 1, 2, \ldots, m$) to obtain the relative importance relationship score $a_{pk}$; (m-1) comparisons are made in total, the scoring range is 1.0-9.0, and the relative importance levels are graded as shown in Table 2.
Table 2 evaluation index relative importance level grading table
S13: in each evaluation index set, determining, according to the basic importance scores $a_{pk}$, the relative importance score $a_{ij}$ between any two evaluation indexes $X_i$ and $X_j$ ($i, j = 1, 2, \ldots, m$).
The relative importance score $a_{ij}$ is calculated as follows:
$a_{ij} = a_{kj} / a_{ki}$
wherein $a_{kj}$ represents the importance score of evaluation index $X_k$ relative to evaluation index $X_j$, and $a_{ki}$ represents the importance score of evaluation index $X_k$ relative to evaluation index $X_i$.
Specifically, for any two indexes $X_i$ and $X_j$ ($i, j = 1, 2, \ldots, m$), the relative importance relationship scores of the evaluation indexes satisfy the following formula:
$a_{ij} = a_{kj} / a_{ki}$; $k = 1, 2, \ldots, m$.
S14: constructing the original judgment matrices $A_z = (a_{ij})_{m \times m}$ ($z = 1, 2, \ldots, n$) from the relative importance scores $a_{ij}$ corresponding to the n evaluation index sets.
Specifically, the original judgment matrices $A_z = (a_{ij})_{m \times m}$ ($z = 1, 2, \ldots, n$) are established; the n experts together produce n original judgment matrices.
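Steps S12-S14 can be illustrated with a short Python sketch. It assumes the formula above is applied with the reference index as $k = p$, so that every entry follows from one expert's (m-1) base scores: $a_{ij} = a_{pj} / a_{pi}$. The function name `judgment_matrix` is hypothetical; the input values are the first row of Table 1.

```python
import numpy as np

def judgment_matrix(base_scores):
    """Build A = (a_ij) with a_ij = a_pj / a_pi from the base scores a_pk.

    base_scores[k] is the score of the reference index X_p relative to X_k,
    with the entry for X_p itself equal to 1.0.
    """
    b = np.asarray(base_scores, dtype=float)
    return b[None, :] / b[:, None]   # entry (i, j) is b[j] / b[i]

# Base scores of reference index A against A, B, C and D (first row of Table 1).
A = judgment_matrix([1.0, 3.8, 3.9, 3.1])
print(np.round(A, 1))
```

A matrix built this way is reciprocal ($a_{ij} \cdot a_{ji} = 1$) and has ones on the diagonal, and its reference row reproduces the base scores by construction.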
S2: extracting scoring values from the plurality of original judgment matrices, and correspondingly obtaining scoring value sets and an original weight data point set corresponding to the scoring value sets.
In the embodiment of the application, the original weight data point set is established as follows: from each judgment matrix, the scoring value set {index A relative to index B, index A relative to index C, index A relative to index D} is extracted. The scoring values within a set are mutually independent. The sets together form the original weight data point set, part of which is shown in Table 3.
Table 3 Part of the original weight data point set for the simulation data
Point number    Score 1    Score 2    Score 3
1               3.8        3.9        3.1
2               3.9        3.6        3.1
3               3.3        3.5        4.0
4               4.0        3.2        4.0
5               4.0        3.5        3.8
6               4.7        4.6        4.5
7               4.5        4.3        4.7
8               4.2        4.7        4.2
9               4.4        4.6        4.8
10              4.1        4.9        4.8
Wherein S2 includes:
S21: extracting the scoring values $c_{ij}$ at the same several positions in each original judgment matrix, obtaining n scoring value sets, one for each original judgment matrix;
each scoring value set includes (m-1) scoring value points.
Specifically, based on the judgment matrices produced by the n experts, a set of scoring values is extracted from the same positions in each judgment matrix.
The scoring values in a set are independent of each other, that is, they are not related to one another by the formula $a_{ij} = a_{kj} / a_{ki}$; each set consists of (m-1) scoring values, and n sets are extracted in total.
S22: constructing the original weight data point set $C = (c_{ij})_{n \times (m-1)}$.
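The extraction in S21-S22 can be sketched as follows. The sketch assumes that the "same positions" are the off-diagonal entries of the reference index's row, which matches the set {index A relative to index B, index A relative to index C, index A relative to index D} used in the embodiment. The function names `extract_point` and `build_point_set` are hypothetical.

```python
import numpy as np

def extract_point(A, ref=0):
    """Return the (m-1) scores of the reference row, skipping the diagonal entry."""
    A = np.asarray(A, dtype=float)
    return np.delete(A[ref], ref)

def build_point_set(matrices, ref=0):
    """Stack one extracted point per judgment matrix into C, shape n x (m-1)."""
    return np.vstack([extract_point(A, ref) for A in matrices])

# The judgment matrix of Table 1; its extracted point is point 1 of Table 3.
table_1 = [[1.0, 3.8, 3.9, 3.1],
           [0.3, 1.0, 1.0, 0.8],
           [0.3, 1.0, 1.0, 0.8],
           [0.3, 1.2, 1.2, 1.0]]
C = build_point_set([table_1])
```

Each row of `C` is one expert's data point in an (m-1)-dimensional space, which is the input to the clustering in S3.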
S3: determining several preset classification numbers for the original weight data point set;
classifying the original weight data point set by cluster analysis under each preset classification number to obtain a plurality of classification clusters corresponding to the different preset classification numbers.
Specifically, the original weight data point set is divided into K classes by a cluster analysis algorithm.
In the embodiment of the application, the original weight data point set is divided into 3 classes by a cluster analysis algorithm, with the iteration terminating when the cluster center points no longer change. The clustering result is shown in FIG. 3, and the detailed clustering result is shown in Table 4.
Table 4 Detailed clustering results of the simulation data divided into 3 classes
Class number    Center score 1    Center score 2    Center score 3    Number of points
1               5.5               5.5               5.6               20
2               3.5               3.5               3.5               50
3               4.4               4.5               4.5               30
The original weight data point set was then divided into 2, 4 and 5 classes respectively, and the results are shown in Table 5.
Table 5 Detailed clustering results of the simulation data divided into 2, 4 and 5 classes
Wherein S3 includes:
S31: determining several preset classification numbers K for the original weight data point set;
K represents a preset classification number for the original weight data point set; assuming the index importance scores are divided into K classes, the number of points in each class is $n_1, n_2, \ldots, n_K$ and the weight data point set of each class is $C_1, C_2, \ldots, C_K$, and these satisfy the following relationship:
$n_1 + n_2 + \cdots + n_K = n$; $C_1 \cup C_2 \cup \cdots \cup C_K = C$; for any $i, j \in [1, K]$,
Specifically, the index importance scores of the n experts are divided into K classes; the number of points in each class is $n_1, n_2, \ldots, n_K$, the weight data point set of each class is $C_1, C_2, \ldots, C_K$, and the following relation is satisfied:
$n_1 + n_2 + \cdots + n_K = n$; $C_1 \cup C_2 \cup \cdots \cup C_K = C$; for any $i, j \in [1, K]$,
S32: taking random points from the original weight data point set $C = (c_{ij})_{n \times (m-1)}$ and using them as cluster centers to form the cluster center set $H = (h_{ij})_{K \times (m-1)}$;
$h_{ij}$ represents the random points.
Specifically, K points are randomly selected as the cluster center set $H = (h_{ij})_{K \times (m-1)}$.
S33: for each preset classification number K, calculating the Euclidean distance between each scoring value point in the original weight data point set and each of the K cluster centers, obtaining the Euclidean distances corresponding to each scoring value point;
the Euclidean distance d between the scoring value point in row i of C and the cluster center in row k of H is calculated as follows:
$d = \sqrt{\sum_{j=1}^{m-1} (c_{ij} - h_{kj})^2}$
Specifically, the Euclidean distance d from each data point to each of the K cluster centers is calculated with this formula, and each point is assigned to the cluster whose center is nearest to it.
S34: for each preset classification number K, determining, from the Euclidean distances corresponding to each scoring value point, the cluster center nearest to that point, classifying the point under that cluster center, and correspondingly obtaining a plurality of classification clusters.
S35: for each preset classification number K, determining weight coefficients and calculating, for each classification cluster, the weighted mean of all scoring value points in the cluster according to the weight coefficients;
updating the center of each classification cluster to its weighted mean, correspondingly obtaining a plurality of updated classification clusters.
Specifically, after all points have been assigned to clusters, the mean of all points in each cluster is taken as the new center of that cluster, and the cluster center set $H = (h_{ij})_{K \times (m-1)}$ is updated. Further, expert weight coefficients may be set according to the overall competence of each expert, in which case the mean of all points in a cluster, weighted by the expert weight coefficients, is taken as the center of that cluster.
S36: for each preset classification number K, determining termination conditions in terms of the number of iterations, the minimum squared error and the rate of change of the cluster center points;
iteratively executing S34-S35 until the classification clusters meet the termination condition;
and outputting the classification clusters corresponding to the different preset classification numbers K.
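Read together, steps S32-S36 resemble a k-means iteration with an optional weighted mean. The sketch below follows that reading; it is an interpretation, since the text speaks only of cluster analysis. The function name `kmeans`, the `init` argument for supplying starting centers, and the single stopping rule used here (largest center shift no greater than `tol`, or `max_iter` reached) are assumptions made for illustration.

```python
import numpy as np

def kmeans(points, k, weights=None, init=None, max_iter=100, tol=1e-9, seed=0):
    """Cluster the rows of `points` into k classes; returns (labels, centers)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    if init is None:
        # S32: take k distinct random points as the initial cluster centers.
        rng = np.random.default_rng(seed)
        centers = pts[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.asarray(init, dtype=float).copy()
    for _ in range(max_iter):
        # S33: Euclidean distance from every point to every center, shape (n, k).
        dist = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        # S34: assign each point to its nearest center.
        labels = dist.argmin(axis=1)
        # S35: move each center to the (expert-weighted) mean of its points.
        new_centers = centers.copy()
        for c in range(k):
            mask = labels == c
            if mask.any():                    # an empty cluster keeps its old center
                new_centers[c] = np.average(pts[mask], axis=0, weights=w[mask])
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        # S36: stop once the centers no longer move.
        if shift <= tol:
            break
    return labels, centers
```

Running the function once per preset classification number K gives the set of clusterings that S4 compares. Because the starting centers are random, results can differ between seeds; repeating with several seeds is a common precaution.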
S4: for the classification clusters corresponding to each preset classification number, calculating the distance between the cluster center points of each two adjacent classification clusters to obtain a plurality of distance values corresponding to the different classification numbers;
determining, according to the plurality of distance values, a reasonable classification number corresponding to the evaluation index set and a classified data point set corresponding to the reasonable classification number.
In the embodiment of the present application, the euclidean distance between each adjacent center point in each clustering result is calculated, and the result is shown in table 6.
TABLE 6 spacing between adjacent center points for each cluster result of simulation data
The acceptable spacing threshold between classification center points is set to 1.0. On this basis the 4-class and 5-class results are not acceptable. For the 2-class result, the majority class holds only 52% of the points, which is close to the 50% class average, so it is not acceptable either. For the 3-class result, the majority class holds 50% of the points, which is far greater than the 33.3% class average.
Therefore, dividing into 3 classes is reasonable; the majority class contains 50 points and its center point is (3.5, 3.5, 3.5).
Wherein S4 includes:
s41: based on different preset classification numbers K, calculating the distance between cluster center points of two adjacent classification clusters in the plurality of classification clusters, and correspondingly obtaining a plurality of distance values.
Specifically, the distance between adjacent center points in the clustering result under each K value is calculated.
S42: determining an acceptance threshold u;
based on different preset classification numbers K, each of the plurality of distance values is compared with the acceptance threshold u, and judgment results are obtained accordingly.
S43: and determining the reasonable classification number corresponding to the evaluation index set according to the judgment result.
Wherein S43 includes:
S431: based on different preset classification numbers K, if there is a judgment result in which the distance value is larger than the acceptance threshold u, and the ratio between the number of judgment results in which the distance value is larger than the acceptance threshold u and the total number of judgment results is larger than 1, the corresponding preset classification number K is considered a reasonable classification number.
Specifically, an acceptance threshold u is set (generally 1.0), and clustering results whose center-point spacing is larger than u can be considered reasonable. Further, where the spacing between adjacent center points is reasonable, the proportion B of points held by the majority class is calculated and compared with 1/K; when B·K is far greater than 1, the division into K classes is considered reasonable.
S432: comparing the plurality of reasonable classification numbers, and determining the classified data point set $C_r$ based on the reasonable classification number with the largest absolute number; the number of data points in $C_r$ is r.
Specifically, when the division into K classes is determined to be reasonable, the classified data point set with the largest number of points is denoted $C_r$, and it contains r data points.
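The two checks in S42 and S43 can be sketched as below. The function name `is_reasonable` and the cut-off `min_ratio`, which stands in for "B·K far greater than 1", are hypothetical, and "adjacent" is assumed to mean consecutive centers in the order supplied.

```python
import math

def is_reasonable(centers, counts, u=1.0, min_ratio=1.2):
    """Return True if a K-class result passes the spacing and majority checks."""
    k = len(centers)
    # Assumption: adjacent centers are consecutive entries of `centers`
    gaps = [math.dist(centers[i], centers[i + 1]) for i in range(k - 1)]
    if any(g <= u for g in gaps):       # spacing must exceed the threshold u
        return False
    b = max(counts) / sum(counts)       # proportion B held by the majority class
    return b * k >= min_ratio           # hypothetical reading of "B*K far greater than 1"

# Majority shares taken from the embodiment (50% of 3 classes, 52% of 2 classes);
# every center except (3.5, 3.5, 3.5) is made up for illustration.
three = is_reasonable([(1.0, 1.0, 1.0), (3.5, 3.5, 3.5), (6.0, 6.0, 6.0)], [25, 50, 25])
two = is_reasonable([(2.0, 2.0, 2.0), (5.0, 5.0, 5.0)], [52, 48])
```

With these inputs the 3-class result gives B·K = 1.5 and passes, while the 2-class result gives B·K = 1.04 and fails, mirroring the reasoning in the embodiment.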
S5: determining a cluster center point of the classified data point set and determining a distance data set according to the cluster center point;
removing and updating the distance data set to correspondingly obtain an updated data point set and cluster center points corresponding to the updated data point set;
and constructing an update judgment matrix according to the cluster center point corresponding to the update data point set and determining the weight value of the evaluation index in the update judgment matrix.
In the embodiment of the application, the majority-class point set of the 3-class division is reprocessed: the Euclidean distance between each point and the center, and the corresponding Z-Score, are calculated, and partial calculation results are shown in Table 7.
Table 7 Partial reprocessing results for the majority-class point set of the simulation data divided into 3 classes
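| Point number | Distance from center | Z-Score | Point number | Distance from center | Z-Score |
|---|---|---|---|---|---|
| 1 | 0.65 | 0.96 | 6 | 0.55 | 0.27 |
| 2 | 0.60 | 0.62 | 7 | 0.57 | 0.44 |
| 3 | 0.50 | -0.02 | 8 | 0.73 | 1.48 |
| 4 | 0.72 | 1.41 | 9 | 0.40 | -0.72 |
| 5 | 0.53 | 0.17 | 10 | 0.40 | -0.67 |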
All points had a Z-Score below 3.5, so no data were culled.
(6) The update judgment matrix after weight processing is constructed according to the center point (3.5, 3.5, 3.5) of the majority class, as shown in Table 8.
Table 8 Update judgment matrix after processing of the simulation data
| Relative weights | Index A | Index B | Index C | Index D |
|---|---|---|---|---|
| Index A | 1.0 | 3.5 | 3.5 | 3.5 |
| Index B | 0.3 | 1.0 | 1.0 | 0.8 |
| Index C | 0.3 | 1.0 | 1.0 | 0.8 |
| Index D | 0.3 | 1.2 | 1.2 | 1.0 |
The weight values calculated according to the method in the embodiment of the present application are shown in Table 9.
Table 9 Comparison of weight values for the simulation data
| | Index A | Index B | Index C | Index D |
|---|---|---|---|---|
| Clustering weights | 0.54 | 0.14 | 0.14 | 0.18 |
| Geometric mean weight | 0.66 | 0.11 | 0.12 | 0.11 |
Wherein S5 includes:
S51: computing the Euclidean distances $d_1, d_2, \ldots, d_r$ between the cluster center point $(h_{r1}, h_{r2}, \ldots, h_{r,m-1})$ of the classified data point set $C_r$ and each data point, and obtaining a distance dataset D accordingly.
Specifically, the Euclidean distances $d_1, d_2, \ldots, d_r$ from the cluster center point $(h_{r1}, h_{r2}, \ldots, h_{r,m-1})$ to each data point are calculated, and the resulting distance dataset is denoted D.
S52: determining the median $d_m$ of the distance dataset D;
calculating the median absolute deviation MAD of D according to the median $d_m$;
the formula for calculating MAD is shown below:
$$\mathrm{MAD} = \operatorname{median}\left(\lvert d_i - d_m \rvert\right), \quad i = 1, 2, \ldots, r$$
Specifically, the median $d_m$ of D is selected, and the median absolute deviation MAD of D is calculated according to the formula above.
S53: calculating a Z-Score according to the median absolute deviation MAD;
the formula for calculating Z-Score is shown below:
Specifically, the Z-Score of every distance value in D is calculated as follows:
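One form consistent with S52 to S54 is the commonly used modified Z-Score, which pairs the median absolute deviation with a cut-off of 3.5. The constant 0.6745 is an assumption taken from that conventional definition rather than from the application itself:

```latex
Z_i = \frac{0.6745\,(d_i - d_m)}{\mathrm{MAD}}, \qquad i = 1, 2, \ldots, r
```

Here $d_i$ is the i-th distance in D and $d_m$ is the median of D, as defined in S51 and S52.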
S54: removing the data points corresponding to distance values with a Z-Score larger than 3.5, and correspondingly obtaining the culled data point set $C_t$; the number of data points in $C_t$ is t.
Specifically, the data points corresponding to distance values with a Z-Score larger than 3.5 are removed; the data point set remaining after removal is denoted $C_t$, and it contains t data points.
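Steps S51 to S54 can be sketched as a single function. This is an illustrative sketch only: the name `cull_by_modified_z`, the sample points, and the scale constant 0.6745 (the conventional modified Z-Score factor) are assumptions, since the application's own Z-Score formula is not reproduced in this text.

```python
import math
from statistics import median

def cull_by_modified_z(points, center, threshold=3.5, scale=0.6745):
    """Return (kept_points, z_scores); points with Z-Score > threshold are removed."""
    d = [math.dist(p, center) for p in points]      # S51: distance dataset D
    d_m = median(d)                                 # S52: median of D
    mad = median(abs(x - d_m) for x in d)           # S52: median absolute deviation
    if mad == 0:                                    # identical distances: nothing to cull
        return list(points), [0.0] * len(points)
    z = [scale * (x - d_m) / mad for x in d]        # S53: assumed modified Z-Score
    kept = [p for p, zi in zip(points, z) if zi <= threshold]   # S54
    return kept, z

# Hypothetical points around the center (3.5, 3.5, 3.5), plus one far outlier
pts = [(3.4, 3.5, 3.6), (3.6, 3.4, 3.5), (3.5, 3.7, 3.4),
       (3.3, 3.5, 3.5), (3.5, 3.5, 3.8), (9.0, 9.0, 9.0)]
kept, z = cull_by_modified_z(pts, (3.5, 3.5, 3.5))
```

The comparison is one-sided, as in S54: points unusually close to the center have negative Z-Scores and are always retained.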
S55: determining the cluster center point of the culled data point set $C_t$;
constructing an update judgment matrix A according to the cluster center point of the culled data point set $C_t$.
Specifically, the cluster center point $(h_{t1}, h_{t2}, \ldots, h_{t,m-1})$ of the processed data point set $C_t$ is calculated; based on this cluster center point, the update judgment matrix A is constructed according to the formula $a_{ij} = a_{kj}/a_{ki}$ and taken as the final result of the weight data processing.
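As a sketch of S55, the center point supplies the reference index's row and every other entry follows from $a_{ij} = a_{kj}/a_{ki}$. The function name `build_judgment_matrix`, the `ref` parameter and the center values used below are hypothetical.

```python
def build_judgment_matrix(center, ref=0):
    """Build an m x m judgment matrix from the (m-1) center scores of the reference index."""
    row = list(center)
    row.insert(ref, 1.0)      # the reference index scores 1 against itself
    m = len(row)
    # a_ij = a_kj / a_ki, with k the reference index
    return [[row[j] / row[i] for j in range(m)] for i in range(m)]

# Hypothetical center: reference index scored 2.0, 4.0 and 0.5 against the others
A = build_judgment_matrix((2.0, 4.0, 0.5))
```

A matrix built this way is reciprocal by construction, since $a_{ij} \cdot a_{ji} = 1$ for every pair of indexes.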
S56: determining the maximum eigenvalue $\lambda_{\max}$ of the update judgment matrix;
determining the eigenvector W according to the maximum eigenvalue $\lambda_{\max}$, normalizing the eigenvector W, and correspondingly determining the weight values of the evaluation indexes in the update judgment matrix.
Specifically, the eigenvector W of the judgment matrix A corresponding to the maximum eigenvalue is calculated and normalized to obtain the weight values of the m peer indexes under the evaluation target.
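In symbols, with $W_i$ the components of W and $w_i$ the resulting weight of the i-th index (normalization to a sum of one is the usual convention and is assumed here):

```latex
A\,W = \lambda_{\max}\,W, \qquad w_i = \frac{W_i}{\sum_{j=1}^{m} W_j}, \quad i = 1, 2, \ldots, m
```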
S6: carrying out a consistency operation according to the maximum eigenvalue to correspondingly obtain a consistency index;
and judging whether the update judgment matrix needs to be adjusted according to the consistency operation result, and if so, repeating steps S3-S5 until the judgment result corresponding to the update judgment matrix is that no adjustment is needed.
In the embodiment of the present application, continuing with the data from the steps above, the consistency ratio of the update judgment matrix is calculated to be 0.006, which indicates good and acceptable consistency.
Wherein S6 includes:
S61: carrying out a consistency operation according to the maximum eigenvalue to correspondingly obtain a consistency index CI;
the calculation formula of the consistency index CI is as follows:
$$CI = \frac{\lambda_{\max} - m}{m - 1}$$
where m is the order of the judgment matrix A (listed as n in Table 10).
Specifically, the consistency of the judgment matrix A is checked: the consistency index CI is calculated from the maximum eigenvalue $\lambda_{\max}$ of matrix A according to the formula above.
S62: constructing a check index RI; calculating a consistency ratio CR according to the consistency index CI and the check index RI;
the calculation of the consistency ratio CR is shown below:
Specifically, the corresponding check index RI can be looked up in Table 10.
Table 10 evaluation index relative importance relationship table
| n | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| RI | 0 | 0 | 0.58 | 0.90 | 1.12 | 1.24 | 1.32 | 1.41 | 1.45 |
The consistency ratio CR can be calculated from the following equation:
$$CR = \frac{CI}{RI}$$
S63: determining a consistency threshold;
comparing the consistency ratio CR with the consistency threshold; if the consistency ratio CR is larger than or equal to the consistency threshold, judging that the update judgment matrix needs to be adjusted, and repeatedly executing S3-S5 until the consistency ratio CR corresponding to the update judgment matrix is smaller than the consistency threshold.
Specifically, the consistency threshold may be set to 0.10; if the consistency ratio is lower than 0.10, the consistency of the update judgment matrix is acceptable, otherwise the necessary adjustment is required.
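The check in S61 to S63 can be sketched as follows, using the RI values of Table 10. The function names and the sample eigenvalues are hypothetical, and the CI expression is the standard analytic hierarchy process form $(\lambda_{\max} - m)/(m - 1)$, assumed here to be the one intended.

```python
# Check index RI by matrix order, as listed in Table 10
RI_TABLE = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
            6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def consistency_ratio(lambda_max, m):
    """Return (CI, CR) for an m x m judgment matrix with largest eigenvalue lambda_max."""
    if m <= 2:                      # RI is 0 for orders 1 and 2: treated as consistent
        return 0.0, 0.0
    ci = (lambda_max - m) / (m - 1)
    return ci, ci / RI_TABLE[m]

def needs_adjustment(lambda_max, m, threshold=0.10):
    """S63: the matrix must be adjusted when CR reaches the consistency threshold."""
    return consistency_ratio(lambda_max, m)[1] >= threshold

# Hypothetical eigenvalues for a 4 x 4 matrix
ci, cr = consistency_ratio(4.05, 4)
ok = not needs_adjustment(4.05, 4)
bad = needs_adjustment(4.5, 4)
```

For a 4 × 4 matrix, a largest eigenvalue of 4.05 gives a CR of roughly 0.019 and passes, whereas 4.5 gives roughly 0.185 and would trigger a repeat of S3 to S5.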
(second) embodiment II
(1) Establishing judgment matrices: 7 experts (each with a weight coefficient of 1/7) score the relative weights of the four indexes under the evaluation criterion. The expert scoring steps are as follows:
(1.1) selecting, from the 4 peer indexes, the 1 index considered most important as the reference index;
(1.2) comparing the importance of the reference index with that of the remaining 3 indexes to obtain relative importance scores, where the relative importance grades are shown in Table 1;
(1.3) creating the original judgment matrices, forming 7 original judgment matrices in total, one of which is shown in Table 11.
Table 11 shows one of the original judgment matrices of the measured data
| Relative weights | Index A | Index B | Index C | Index D |
|---|---|---|---|---|
| Index A | 1.0 | 3.0 | 0.8 | 1.5 |
| Index B | 0.3 | 1.0 | 0.3 | 0.5 |
| Index C | 1.3 | 4.0 | 1.0 | 2.0 |
| Index D | 0.7 | 2.0 | 0.5 | 1.0 |
(2) Establishing the original weight data point set: from each judgment matrix, a scoring value set {index A relative to index B, index A relative to index C, index A relative to index D} is extracted, the scoring values within a set being mutually independent; the original weight data point set established in this way is shown in Table 12.
Table 12 Original weight data point set of the measured data
| Point number | Score 1 | Score 2 | Score 3 |
|---|---|---|---|
| 1 | 3.0 | 0.8 | 1.5 |
| 2 | 2.0 | 1.0 | 3.0 |
| 3 | 2.0 | 0.3 | 3.0 |
| 4 | 4.0 | 5.0 | 2.0 |
| 5 | 4.0 | 2.0 | 1.0 |
| 6 | 1.0 | 0.3 | 1.0 |
| 7 | 3.0 | 1.0 | 1.0 |
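The extraction in step (2) amounts to reading the reference index's row without its diagonal entry. A minimal sketch, assuming index A (row 0) is the reference index as in the scoring value set above; the function name `extract_scores` is hypothetical:

```python
def extract_scores(matrix, ref=0):
    """Return the reference index's scores against the other m-1 indexes."""
    return [v for j, v in enumerate(matrix[ref]) if j != ref]

# The original judgment matrix of Table 11; rows and columns are indexes A to D
table_11 = [
    [1.0, 3.0, 0.8, 1.5],
    [0.3, 1.0, 0.3, 0.5],
    [1.3, 4.0, 1.0, 2.0],
    [0.7, 2.0, 0.5, 1.0],
]
point_1 = extract_scores(table_11)   # [3.0, 0.8, 1.5]
```

The result matches point 1 of Table 12; repeating the extraction for all 7 matrices gives the 7 × 3 original weight data point set.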
(3) The original weight data point set is divided into 2 classes using a cluster analysis algorithm, with the iteration termination condition set to the cluster center points no longer changing. The result is shown in Fig. 4, which is a schematic diagram of the clustering result of the original weight data point set divided into 2 classes in the embodiment of the present application; the detailed clustering results are shown in Table 13:
Table 13 Detailed clustering results of the measured data divided into 2 classes
| Class number | Center score 1 | Center score 2 | Center score 3 | Number of points |
|---|---|---|---|---|
| 1 | 4.0 | 5.0 | 2.0 | 1 |
| 2 | 2.5 | 0.9 | 1.8 | 6 |
(4) The original weight data point set is further divided into 3 classes, and the results are shown in Table 14.
Table 14 Detailed clustering results of the measured data divided into 3 classes
| Class number | Center score 1 | Center score 2 | Center score 3 | Number of points |
|---|---|---|---|---|
| 1 | 3.3 | 1.3 | 1.2 | 3 |
| 2 | 1.7 | 0.6 | 2.3 | 3 |
| 3 | 4.0 | 5.0 | 2.0 | 1 |
The Euclidean distance between adjacent center points in each clustering result is calculated, and the results are shown in Table 15.
Table 15 Spacing between adjacent center points for each clustering result of the measured data
With the acceptable spacing threshold for classification center points set to 1.0, the divisions into 2 classes and 3 classes are both acceptable in terms of spacing. For the division into 3 classes, the majority class contains 43% of the points, which is greater than the classification mean of 33%, but two classes tie for the largest number of points, so this division is not acceptable. For the division into 2 classes, the majority class contains 86% of the points, which is far greater than the classification mean of 50%.
Therefore, it is reasonable to divide the data into 2 classes; the majority class contains 6 points, and its center point is (2.5, 0.9, 1.8).
(5) The majority-class point set of the 2-class division is reprocessed: the Euclidean distance between each point and the center, and the corresponding Z-Score, are calculated, and the calculation results are shown in Table 16.
Table 16 Reprocessing results for the majority-class point set of the measured data divided into 2 classes
| Point number | Distance from center | Z-Score |
|---|---|---|
| 1 | 0.6 | -1.4 |
| 2 | 1.3 | -0.1 |
| 3 | 1.5 | 0.1 |
| 4 | 2.0 | 1.0 |
| 5 | 1.8 | 0.6 |
| 6 | 0.9 | -0.8 |
All points had a Z-Score below 3.5, so no data were culled.
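The reprocessing in step (5) can be retraced from Table 12. The sketch below assumes that the majority class consists of points 1 to 3 and 5 to 7 (point 4 being the single-point class of Table 13) and that the Z-Score is the conventional modified Z-Score with the constant 0.6745; neither detail is stated explicitly in the text.

```python
import math
from statistics import median

# Points 1-3 and 5-7 of Table 12 (assumed members of the majority class)
majority = [(3.0, 0.8, 1.5), (2.0, 1.0, 3.0), (2.0, 0.3, 3.0),
            (4.0, 2.0, 1.0), (1.0, 0.3, 1.0), (3.0, 1.0, 1.0)]

# Equal expert weights (1/7 each) reduce the weighted mean to a plain mean
center = [sum(p[j] for p in majority) / len(majority) for j in range(3)]

distances = [math.dist(p, center) for p in majority]
d_m = median(distances)
mad = median(abs(d - d_m) for d in distances)
# Assumed modified Z-Score (0.6745 is the conventional constant)
z_scores = [0.6745 * (d - d_m) / mad for d in distances]
culled = [p for p, z in zip(majority, z_scores) if z > 3.5]
```

The center comes out at about (2.5, 0.9, 1.75), and the distances agree with the "Distance from center" column of Table 16 to within rounding. The Z-Scores are close to, though not identical with, the tabulated values, and none exceeds 3.5, so `culled` is empty.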
(6) The judgment matrix after weight processing is established according to the center point of the majority class, as shown in Table 17.
Table 17 Judgment matrix after processing of the measured data
| Relative weights | Index A | Index B | Index C | Index D |
|---|---|---|---|---|
| Index A | 1.0 | 2.5 | 0.9 | 1.8 |
| Index B | 0.3 | 1.0 | 0.3 | 0.5 |
| Index C | 1.3 | 4.0 | 1.0 | 2.0 |
| Index D | 0.7 | 2.0 | 0.5 | 1.0 |
(7) The consistency ratio of the judgment matrix is calculated to be 0.02, which indicates good and acceptable consistency.
(8) The weight values calculated by the method in the embodiment of the present application are shown in Table 18.
Table 18 Comparison of weight values for the measured data
| | Index A | Index B | Index C | Index D |
|---|---|---|---|---|
| Clustering weights | 0.31 | 0.10 | 0.39 | 0.20 |
| Geometric mean weight | 0.30 | 0.07 | 0.42 | 0.21 |
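The clustering weights can be retraced from the judgment matrix of Table 17 by computing its principal eigenvector. The sketch below uses power iteration as one possible numerical method; the application does not say how the eigenvector is obtained, and the function name `principal_eigen` is hypothetical.

```python
def principal_eigen(matrix, iterations=100):
    """Power iteration: return (lambda_max, weights), with weights summing to 1."""
    m = len(matrix)
    w = [1.0 / m] * m
    lam = float(m)
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(m)) for i in range(m)]
        lam = sum(v)              # sum(w) == 1, so sum(A w) estimates lambda_max
        w = [x / lam for x in v]  # renormalize so the weights sum to 1
    return lam, w

# Judgment matrix of Table 17; rows and columns are indexes A to D
table_17 = [
    [1.0, 2.5, 0.9, 1.8],
    [0.3, 1.0, 0.3, 0.5],
    [1.3, 4.0, 1.0, 2.0],
    [0.7, 2.0, 0.5, 1.0],
]
lambda_max, weights = principal_eigen(table_17)
```

The resulting weights are approximately 0.31, 0.10, 0.39 and 0.20, in line with the clustering weights row of Table 18. Because the table entries are rounded to one decimal place, quantities derived from `lambda_max` may differ slightly from the figures quoted in the text.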
The foregoing is a detailed description of the preferred embodiments of the present application and is not to be construed as limiting the scope of the present application. All equivalent changes and modifications made within the scope of the present application shall fall within its scope of protection.

Claims (10)

1. A method for processing evaluation index weight data based on cluster analysis, characterized by comprising the following steps:
S1: determining an evaluation target and determining a plurality of evaluation index sets corresponding to the evaluation target, wherein the plurality of evaluation index sets comprise the same number of evaluation indexes;
based on the evaluation indexes, performing evaluation target relative weight scoring on the evaluation indexes in each evaluation index set, and correspondingly obtaining a plurality of original judgment matrices;
S2: extracting scoring values from the plurality of original judgment matrices, and correspondingly obtaining a scoring value set and an original weight data point set corresponding to the scoring value set;
S3: determining a number of preset classification numbers for the original weight data point set;
classifying the original weight data point set by cluster analysis based on different preset classification numbers to obtain a plurality of classification clusters corresponding to the different preset classification numbers;
S4: for the plurality of classification clusters corresponding to different preset classification numbers, calculating the distance between cluster center points of two adjacent classification clusters to obtain a plurality of distance values corresponding to different classification numbers;
determining a reasonable classification number corresponding to the evaluation index set and a classified data point set corresponding to the reasonable classification number according to the plurality of distance values;
S5: determining a cluster center point of the classified data point set and determining a distance data set according to the cluster center point;
removing and updating the distance data set to correspondingly obtain an updated data point set and cluster center points corresponding to the updated data point set;
and constructing an update judgment matrix according to the cluster center point corresponding to the updated data point set and determining the weight value of the evaluation index in the update judgment matrix.
2. The method for processing evaluation index weight data based on cluster analysis according to claim 1, wherein S1 comprises:
S11: determining an evaluation target and determining n sets of evaluation indicators corresponding to the evaluation target;
each evaluation index set comprises the same number of evaluation indexes $X_k$ ($k = 1, 2, \ldots, m$);
selecting a reference index $X_p$ among the m evaluation indexes in each evaluation index set;
S12: in each evaluation index set, respectively determining a basic importance score $a_{pk}$ between the reference index $X_p$ and the m evaluation indexes;
S13: in each evaluation index set, determining, according to the basic importance score $a_{pk}$, a relative importance score $a_{ij}$ between any two evaluation indexes $X_i$ and $X_j$ ($i, j = 1, 2, \ldots, m$);
the relative importance score $a_{ij}$ is calculated as follows:
$$a_{ij} = a_{kj} / a_{ki}$$
wherein $a_{kj}$ represents the importance score of evaluation index $X_k$ relative to evaluation index $X_j$; $a_{ki}$ represents the importance score of evaluation index $X_k$ relative to evaluation index $X_i$;
S14: constructing original judgment matrices $A_z = (a_{ij})_{m \times m}$ ($z = 1, 2, \ldots, n$) according to the relative importance scores $a_{ij}$ corresponding to the n sets of evaluation indicators.
3. The method for processing evaluation index weight data based on cluster analysis according to claim 1, wherein S2 comprises:
S21: extracting the scoring values $c_{ij}$ at several identical positions in each original judgment matrix, and respectively obtaining n scoring value sets corresponding to the original judgment matrices;
each scoring value set includes (m-1) scoring value points;
S22: constructing an original weight data point set $C = (c_{ij})_{n \times (m-1)}$.
4. The method for processing evaluation index weight data based on cluster analysis according to claim 1, wherein S3 comprises:
S31: determining a plurality of preset classification numbers K for the original weight data point set;
K represents a preset classification number corresponding to the original weight data point set; assuming that the index importance scores are divided into K classes, that the numbers of points in the classes are $n_1, n_2, \ldots, n_K$ respectively, and that the weight data point sets of the classes are $C_1, C_2, \ldots, C_K$ respectively, the numbers and weight data point sets of the classes satisfy the following quantitative relations:
$n_1 + n_2 + \cdots + n_K = n$; $C_1 \cup C_2 \cup \cdots \cup C_K = C$; for any $i, j \in [1, K]$ with $i \neq j$, $C_i \cap C_j = \varnothing$;
S32: taking random points from the original weight data point set $C = (c_{ij})_{n \times (m-1)}$, using the taken random points as cluster centers, and obtaining a cluster center set $H = (h_{ij})_{K \times (m-1)}$;
$h_{ij}$ represents the random points;
S33: based on different preset classification numbers K, for the original weight data point set, calculating the Euclidean distances between each scoring value point and the K cluster centers to obtain the Euclidean distances corresponding to each scoring value point;
the calculation formula of the Euclidean distance d between the i-th scoring value point and the l-th cluster center ($l = 1, 2, \ldots, K$) is as follows:
$$d = \sqrt{\sum_{j=1}^{m-1} \left(c_{ij} - h_{lj}\right)^2}$$
S34: based on different preset classification numbers K, according to the Euclidean distances corresponding to each scoring value point, correspondingly determining the cluster center with the smallest Euclidean distance to each scoring value point, classifying according to the cluster centers, and correspondingly obtaining a plurality of classification clusters.
5. The method for processing evaluation index weight data based on cluster analysis according to claim 4, wherein S3 further comprises:
S35: determining a weight coefficient based on different preset classification numbers K, and calculating, for the plurality of classification clusters, the weighted mean of all scoring value points in each classification cluster according to the weight coefficient;
updating the center of each classification cluster with the weighted mean corresponding to that classification cluster, and correspondingly obtaining a plurality of updated classification clusters;
S36: determining termination conditions on the number of iterations, the least-squares error and the rate of change of the cluster center points, based on different preset classification numbers K;
iteratively executing S34-S35 until the classification clusters meet the termination condition;
and outputting a plurality of classification clusters corresponding to different preset classification numbers K.
6. The method for processing evaluation index weight data based on cluster analysis according to claim 1, wherein S4 comprises:
S41: based on different preset classification numbers K, calculating the distance between cluster center points of two adjacent classification clusters in the plurality of classification clusters, and correspondingly obtaining a plurality of distance values;
S42: determining an acceptance threshold u;
based on different preset classification numbers K, respectively comparing the plurality of distance values with the acceptance threshold u, and correspondingly obtaining judgment results;
S43: determining the reasonable classification number corresponding to the evaluation index set according to the judgment results.
7. The method for processing evaluation index weight data based on cluster analysis according to claim 6, wherein S43 comprises:
S431: based on different preset classification numbers K, if there is a judgment result in which the distance value is larger than the acceptance threshold u, and the ratio between the number of judgment results in which the distance value is larger than the acceptance threshold u and the total number of judgment results is larger than 1, the corresponding preset classification number K is considered a reasonable classification number;
S432: comparing the plurality of reasonable classification numbers, and determining the classified data point set $C_r$ based on the reasonable classification number with the largest absolute number; the number of data points in $C_r$ is r.
8. The method for processing evaluation index weight data based on cluster analysis according to claim 1, wherein S5 comprises:
S51: computing the Euclidean distances $d_1, d_2, \ldots, d_r$ between the cluster center point $(h_{r1}, h_{r2}, \ldots, h_{r,m-1})$ of the classified data point set $C_r$ and each data point, and correspondingly obtaining a distance dataset D;
S52: determining the median $d_m$ of the distance dataset D;
calculating the median absolute deviation MAD of D according to the median $d_m$;
the formula for calculating MAD is shown below:
$$\mathrm{MAD} = \operatorname{median}\left(\lvert d_i - d_m \rvert\right), \quad i = 1, 2, \ldots, r$$
S53: calculating a Z-Score according to the median absolute deviation MAD;
the formula for calculating Z-Score is shown below:
S54: removing the data points corresponding to distance values with a Z-Score larger than 3.5, and correspondingly obtaining the culled data point set $C_t$; the number of data points in $C_t$ is t;
S55: determining the cluster center point of the culled data point set $C_t$;
constructing an update judgment matrix A according to the cluster center point of the culled data point set $C_t$;
S56: determining the maximum eigenvalue $\lambda_{\max}$ of the update judgment matrix;
determining the eigenvector W according to the maximum eigenvalue $\lambda_{\max}$, normalizing the eigenvector W, and correspondingly determining the weight values of the evaluation indexes in the update judgment matrix.
9. The method for processing the evaluation index weight data based on the cluster analysis according to claim 1, wherein the method further comprises:
S6: carrying out a consistency operation according to the maximum eigenvalue to correspondingly obtain a consistency index;
and judging whether the update judgment matrix needs to be adjusted according to the consistency operation result, and if so, repeating steps S3-S5 until the judgment result corresponding to the update judgment matrix is that no adjustment is needed.
10. The method for processing evaluation index weight data based on cluster analysis according to claim 9, wherein S6 comprises:
S61: carrying out a consistency operation according to the maximum eigenvalue to correspondingly obtain a consistency index CI;
the calculation formula of the consistency index CI is as follows:
$$CI = \frac{\lambda_{\max} - m}{m - 1}$$
where m is the order of the update judgment matrix;
S62: constructing a check index RI; calculating a consistency ratio CR according to the consistency index CI and the check index RI;
the calculation of the consistency ratio CR is shown below:
$$CR = \frac{CI}{RI}$$
S63: determining a consistency threshold;
and comparing the consistency ratio CR with the consistency threshold; if the consistency ratio CR is larger than or equal to the consistency threshold, judging that the update judgment matrix needs to be adjusted, and repeatedly executing steps S3-S5 until the consistency ratio CR corresponding to the update judgment matrix is smaller than the consistency threshold.
CN202311273672.5A 2023-09-28 2023-09-28 Evaluation index weight data processing method based on cluster analysis Pending CN117332287A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311273672.5A CN117332287A (en) 2023-09-28 2023-09-28 Evaluation index weight data processing method based on cluster analysis

Publications (1)

Publication Number Publication Date
CN117332287A true CN117332287A (en) 2024-01-02

Family

ID=89278496

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311273672.5A Pending CN117332287A (en) 2023-09-28 2023-09-28 Evaluation index weight data processing method based on cluster analysis

Country Status (1)

Country Link
CN (1) CN117332287A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111275132A (en) * 2020-02-24 2020-06-12 电子科技大学 Target clustering method based on SA-PFCM + + algorithm
CN113378927A (en) * 2021-06-11 2021-09-10 哈尔滨理工大学 Clustering-based self-adaptive weighted oversampling method

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117610970A (en) * 2024-01-04 2024-02-27 成都开元精创信息技术有限公司 Intelligent evaluation method and system for data migration work
CN117610970B (en) * 2024-01-04 2024-04-02 成都开元精创信息技术有限公司 Intelligent evaluation method and system for data migration work

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination