CN116434880B

CN116434880B - High-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration

Info

Publication number: CN116434880B
Application number: CN202310204380.XA
Authority: CN
Inventors: 李述; 单云霄; 李帅; 崔禹欣; 李福祥
Original assignee: Harbin University of Science and Technology
Current assignee: Harbin University of Science and Technology
Priority date: 2023-03-06
Filing date: 2023-03-06
Publication date: 2023-09-08
Anticipated expiration: 2043-03-06
Also published as: CN116434880A

Abstract

The invention provides a high-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration, and belongs to the technical field of alloy hardness prediction. It addresses three shortcomings of existing methods: they cannot effectively prevent the magnitude of cluster label values from influencing the relations computed between objects; the consensus result does not accurately reflect the actual differences between base clustering results; and their ability to handle uncertain relations is weak. The method takes the base clustering results as sample point features and expresses them as scaled dummy variables; constructs a relation matrix over all sample points using a fuzzy operator; computes the local density and relative distance of each sample point from the relation matrix in order to identify cluster centers and assign non-center points; and constructs a reassignment strategy to correct the uncertain sample points in the consensus clustering result. The method eliminates the influence of partition differences between base clustering results and examines the fuzzy relations between objects from the perspective of a fuzzy operator, thereby effectively improving the ability to handle fuzzy relations.

Description

High-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration

Technical Field

The invention relates to the technical field of alloy hardness prediction, in particular to a high-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration.

Background

High-entropy alloys (HEAs) have good physical and mechanical properties, such as high hardness, good wear resistance, excellent low-temperature fracture toughness, and excellent magnetic properties. When HEA hardness is predicted for a given unknown data set, the data set contains alloys whose intrinsic properties and governing rules differ widely, and these alloys are related to one another in complex ways, so the hardness a single model predicts for a given type of alloy can deviate substantially.

Clustering is an analysis technique for unsupervised problems. It aims to place sample points of high similarity in a given data set into the same cluster and to separate sample points that differ more, and it is an active research topic in machine learning and data mining. It has been applied successfully in many practical settings, such as image pattern recognition, medical research, recommendation systems, text mining, and engineering systems. To address the difficulty of HEA hardness prediction, the prior work CN114613456A, a high-entropy alloy hardness prediction method based on an improved density peak clustering algorithm, uses the improved density peak clustering algorithm to solve the problem that a model cannot learn the internal structure of an HEA data set well because of large structural differences in the data, and thereby improves the prediction capability of the model.

Compared with a traditional single clustering model, a clustering integration (CE) model integrates multiple base clustering results to obtain a result with better clustering quality and robustness. The key bridging step in a CE model is how the base clustering result matrix is processed; the quality of this step directly determines the final clustering performance. Existing CE methods typically process the base clustering result matrix in one of two ways. The first, which is relatively mature and the most widely used, takes the row vectors as input to the consensus process. The prior work CN115691700A, a high-entropy alloy hardness prediction method using a dual-granularity clustering integration algorithm based on three consensus strategies, solves the problem that existing integrated clustering algorithms cannot effectively reconcile the partition conflicts arising among different consensus results, and achieves a satisfactory clustering effect. The second, less studied way uses the label values of the base clustering results directly as the feature representation of the original data. It overcomes the main defect of the first way, namely that clusters in different base clustering results do not correspond to one another. Because the second way is still at an early stage, it has its own disadvantages: 1) it cannot effectively prevent the magnitude of the cluster label values from influencing the relations computed between objects; 2) it assumes by default that all base clustering results contribute equally, so the final consensus result does not accurately reflect the actual differences between base clustering results.

The choice of processing mode directly determines how well useful information is extracted, and thus affects the final integration result. Furthermore, although many definitions of the similarity relation between objects have been proposed, in practice they differ little in how they are computed. In addition, most of these definitions have a fixed coupling strength, which weakens their ability to handle uncertain relations and makes them unsuitable for some data structures. Some degree of fuzziness is unavoidable when assigning cluster labels. It is therefore necessary to re-examine the uncertainty between objects from a new perspective, to increase the flexibility of the model and its ability to deal with fuzzy relations.

Disclosure of Invention

The technical problems to be solved by the invention are:

existing methods cannot effectively prevent the magnitude of the cluster label values from influencing the relations computed between objects; they assume by default that all base clustering results contribute equally, so the final consensus result does not accurately reflect the actual differences between base clustering results; and their ability to handle uncertain relations is weak, so the uncertainty in the consensus result cannot be resolved effectively.

The technical scheme adopted by the invention to solve these problems is:

the invention provides a high-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration, which comprises the following steps:

S1. For a high-entropy alloy data set, partition the data set multiple times with base clustering algorithms to generate base clustering results; take the base clustering results as sample point features, represent them as dummy variables, and apply scaling weights to obtain the scaled dummy-variable feature representation of the original data;

S2. Using a fuzzy operator, define the fuzzy relation between any two sample points under the scaled dummy-variable feature representation, to obtain a relation matrix over all sample points;

S3. Take the relation matrix as the input of the consensus strategy, compute the local density and relative distance of each sample point from the relation matrix, and on that basis identify the cluster centers and assign the non-center points, to obtain the first-order consensus clustering result π^θ;

S4. Construct a reassignment strategy to correct the uncertain sample points in the first-order consensus clustering result, to obtain the final-order consensus clustering result π^*;

S5. Establish a separate regression model for each cluster in the final-order consensus clustering result, and carry out the high-entropy alloy hardness prediction.

Further, S1 specifically includes the following steps:

S11. For a high-entropy alloy data set DX = {dx_1, dx_2, …, dx_N}, where N is the number of samples, partition DX L times using the base clustering algorithms {A_1, A_2, …, A_L} to obtain the base clustering result matrix, where π_ij denotes the label value assigned to the j-th sample point by the i-th partition; the base clustering result and the number of clusters under the i-th partition are denoted r_i and |C(r_i)| respectively; the base clustering results are taken as the sample point features;

S12. Represent the base clustering results as dummy variables, that is, any sample point dx_j is expressed as the concatenation of L blocks, one per partition, where the i-th block is a vector of length |C(r_i)| whose positions, from left to right, correspond to cluster 1 through cluster |C(r_i)|; the block takes the value 1 only at position π_ij and 0 at all other positions;

S13. On the basis of S12, apply scaling weights to the base clustering results, where the scaling coefficient ω_i is based on the DBI index and has the specific form:

where DBI_i is the DBI index score of the base clustering result of the i-th partition;

the sample point dx_j is then expressed as the concatenation of L blocks, where the i-th block is a vector of length |C(r_i)| whose positions, from left to right, correspond to cluster 1 through cluster |C(r_i)|; the block takes the value ω_i only at position π_ij and 0 at all other positions;

this finally yields the scaled dummy-variable feature representation of the original data.
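The encoding of steps S11 to S13 can be illustrated with a minimal Python sketch. It assumes that cluster labels run from 1 to the number of clusters in each partition, and it takes the scaling coefficients ω_i as given inputs rather than computing them from the DBI index. The function name `scaled_dummy_features` and the sample values are hypothetical and serve illustration only.

```python
from typing import List, Sequence


def scaled_dummy_features(
    partitions: Sequence[Sequence[int]],
    weights: Sequence[float],
) -> List[List[float]]:
    """Encode each sample as concatenated one-hot blocks scaled by omega_i.

    partitions[i][j] is the 1-based label pi_ij that partition i assigns to
    sample j; weights[i] is the scaling coefficient omega_i (supplied by the
    caller, not derived from the DBI index in this sketch).
    """
    n_samples = len(partitions[0])
    features: List[List[float]] = [[] for _ in range(n_samples)]
    for labels, omega in zip(partitions, weights):
        n_clusters = max(labels)  # |C(r_i)|, assuming labels run 1..|C(r_i)|
        for j, label in enumerate(labels):
            block = [0.0] * n_clusters
            block[label - 1] = omega  # only position pi_ij is non-zero
            features[j].extend(block)
    return features


# Two base clusterings of four samples; the weights are made-up values.
partitions = [[1, 1, 2, 2], [1, 2, 3, 3]]
weights = [0.6, 0.4]
encoded = scaled_dummy_features(partitions, weights)
```

Because each block is non-zero only at the position of the assigned cluster, the numeric size of a label (for example 3 versus 1) has no effect on the encoded feature; only cluster membership and the weight of the partition matter.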

Further, the fuzzy operator in S2 is defined as follows:

Let S: [0,1]² → [0,1] be a mapping. For any u, v ∈ [0,1], the γ-fuzzy operator is:

Further, S2 uses the γ-fuzzy operator to define the fuzzy relation f_jk between any two sample points d^*x_j and d^*x_k, specifically:

where γ ∈ [-1, 0].

Further, in S3, for any sample point d^*x_j, its local density ρ_j is calculated as:

where d_c is the cutoff distance.

Further, in S3, for any sample point d^*x_j, its relative distance δ_j is calculated as follows:

if d^*x_j is not the point of maximum local density, its relative distance δ_j is:

here δ_j is determined by the neighbor d^*x_k nearest to d^*x_j among the points with a larger local density value;

if d^*x_j is the point of maximum local density, its relative distance δ_j is recorded as δ_max, expressed as:

δ_max = max_k(f_jk).

Further, the cluster centers in S3 are identified as follows: first, a two-dimensional decision graph is drawn, whose abscissa and ordinate correspond to ρ and δ respectively; second, all sample points are mapped into the decision graph, where the coordinates of any point d^*x_j are given by ρ_j and δ_j; finally, the sample points in the upper-right region of the decision graph, for which ρ_j and δ_j are both large, are identified as cluster centers;

the assignment principle for a non-center point is that it joins the cluster of its nearest neighbor among the points with a larger local density.
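The consensus step S3 can be sketched in Python as follows. Since the local density formula is not reproduced in this text, the sketch assumes the cutoff kernel of standard density peak clustering (counting the points with f_jk < d_c). It also replaces the visual reading of the decision graph with an assumed rule: the densest point plus the points with the largest products ρ_j · δ_j are taken as centers. The function name and the toy relation matrix are hypothetical.

```python
from typing import List, Sequence


def density_peak_consensus(
    f: Sequence[Sequence[float]], d_c: float, n_centers: int
) -> List[int]:
    """Cluster points from a relation matrix f (smaller f_jk = closer)."""
    n = len(f)
    # Local density rho_j: ASSUMED cutoff kernel, counting points with f_jk < d_c.
    rho = [sum(1 for k in range(n) if k != j and f[j][k] < d_c) for j in range(n)]
    # Visit points from highest to lowest density (ties broken by index).
    order = sorted(range(n), key=lambda j: (-rho[j], j))
    delta = [0.0] * n
    parent = [-1] * n  # the "relative point" that determines delta_j
    for rank, j in enumerate(order):
        if rank == 0:
            delta[j] = max(f[j])  # delta_max = max_k f_jk for the densest point
        else:
            parent[j] = min(order[:rank], key=lambda k: f[j][k])
            delta[j] = f[j][parent[j]]
    # ASSUMED center rule: densest point plus the largest rho * delta products.
    rest = sorted(order[1:], key=lambda j: -(rho[j] * delta[j]))
    centers = [order[0]] + rest[: n_centers - 1]
    labels = [-1] * n
    for cluster_id, j in enumerate(centers):
        labels[j] = cluster_id
    # Non-center points inherit the label of their nearest denser neighbor.
    for j in order:
        if labels[j] == -1:
            labels[j] = labels[parent[j]]
    return labels


# Toy relation matrix built from six made-up positions on a line.
positions = [0, 1, 2, 8, 9, 10]
f = [[abs(a - b) / 10 for b in positions] for a in positions]
labels = density_peak_consensus(f, d_c=0.15, n_centers=2)
```

Processing the points in order of decreasing density guarantees that the denser neighbor of each non-center point already carries a label when that point is assigned.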

Further, an uncertain sample point in S4 is defined as follows: for any sample point d^*x_j, if its relative sample point d^*x_k and its nearest-neighbor sample point d^*x_q belong to different clusters in the first-order consensus clustering result π^θ, then d^*x_j is an uncertain sample point, where the relative sample point d^*x_k is the point from which the relative distance δ_j of d^*x_j is obtained.

Further, the reassignment strategy is implemented as follows:

Step 1: construct two vectors corresponding to all sample points {d^*x_1, d^*x_2, …, d^*x_N}: one vector stores the cluster labels of the relative sample points, and the other stores the cluster labels of the nearest-neighbor sample points;

Step 2: pick out the uncertain sample points, that is, those whose cluster labels differ at corresponding positions of the two vectors, and place them in a set Q;

Step 3: compute the mean vector of each cluster in π^θ, collected as V^mv, where C^* is the number of clusters in π^θ; using the γ-fuzzy operator, evaluate the fuzzy relation between each uncertain point in Q and every mean vector in V^mv;

Step 4: using the maximum membership principle, assign each uncertain point to the cluster with which it has the greatest fuzzy relation.
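Steps 1 to 4 above can be sketched in Python as follows. The relation between a point and a cluster mean is passed in as a `membership` callable for which a larger value means a stronger relation. The `inverse_distance` function used in the example is an assumed stand-in, not the γ-fuzzy operator of the invention, and all names and sample values are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def reassign_uncertain(
    points: Sequence[Vector],
    labels: Sequence[int],
    relative_idx: Sequence[int],
    nearest_idx: Sequence[int],
    membership: Callable[[Vector, Vector], float],
) -> Tuple[List[int], List[int]]:
    """Return (final labels, indices of uncertain points)."""
    n = len(points)
    # Step 1: cluster labels of each point's relative point and nearest neighbor.
    relative_labels = [labels[k] for k in relative_idx]
    nearest_labels = [labels[q] for q in nearest_idx]
    # Step 2: a point is uncertain when the two labels disagree.
    uncertain = [j for j in range(n) if relative_labels[j] != nearest_labels[j]]
    # Step 3: mean vector of every cluster in the first-order result.
    means: Dict[int, List[float]] = {}
    for c in sorted(set(labels)):
        members = [points[j] for j in range(n) if labels[j] == c]
        means[c] = [sum(col) / len(members) for col in zip(*members)]
    # Step 4: maximum membership principle.
    final = list(labels)
    for j in uncertain:
        final[j] = max(means, key=lambda c: membership(points[j], means[c]))
    return final, uncertain


def inverse_distance(p: Vector, m: Vector) -> float:
    """ASSUMED stand-in membership: larger when p is closer to the mean m."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, m)))
    return 1.0 / (1.0 + dist)


# Made-up example: point 3 is labeled 1, but its nearest neighbor is in cluster 0.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.35, 0.0), (0.9, 0.0), (1.0, 0.0)]
first_order = [0, 0, 0, 1, 1, 1]
relative_idx = [1, 1, 1, 4, 4, 4]  # point that determines each delta_j
nearest_idx = [1, 0, 1, 2, 5, 4]   # overall nearest neighbor of each point
final, uncertain = reassign_uncertain(
    points, first_order, relative_idx, nearest_idx, inverse_distance
)
```

In this example only point 3 is flagged as uncertain, and it moves to cluster 0 because its membership with respect to that cluster's mean vector is larger.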

Further, the γ-fuzzy operator satisfies the following basic conditions, which are proved as follows:

Corollary 1: the constructed γ-fuzzy operator is an S-norm (triangular conorm) and must satisfy the four basic conditions of an S-norm:

(1) Commutativity: S(u, v) = S(v, u);

(2) Associativity: S(S(u, v), p) = S(u, S(v, p));

(3) Monotonicity: if u_1 ≤ u_2 and v_1 ≤ v_2, then S(u_1, v_1) ≤ S(u_2, v_2);

(4) Boundary condition: S(u, 0) = u;

Proof: (1) Commutativity: for any u, v ∈ [0,1], we have:

that is, S(u, v) = S(v, u);

(2) Associativity: for any u, v, p ∈ [0,1], we have:

that is, S(S(u, v), p) = S(u, S(v, p));

(3) Monotonicity: for any u_1, u_2, v_1, v_2 ∈ [0,1] with u_1 ≤ u_2 and v_1 ≤ v_2, we have:

and since γ ∈ [-1, 0], it follows from the above expression that S(u_1, v_1) ≤ S(u_2, v_2);

(4) Boundary condition: for any u ∈ [0,1], we have:

that is, S(u, 0) = u.
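The four S-norm conditions can also be checked numerically. Because the formula of the γ-fuzzy operator is not reproduced in this text, the Python sketch below uses the Sugeno-Weber family S(u, v) = min(1, u + v + γuv) purely as an assumed stand-in with the same parameter range γ ∈ [-1, 0]; it is not asserted to be the operator of the invention. The function names are hypothetical.

```python
import itertools


def s_norm(u: float, v: float, gamma: float) -> float:
    """Sugeno-Weber-type S-norm, an ASSUMED stand-in for the gamma-fuzzy operator."""
    return min(1.0, u + v + gamma * u * v)


def check_s_norm_axioms(gamma: float, steps: int = 10, tol: float = 1e-9) -> bool:
    """Check the four S-norm conditions on a regular grid over [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    for u, v, p in itertools.product(grid, repeat=3):
        # (1) Commutativity
        if abs(s_norm(u, v, gamma) - s_norm(v, u, gamma)) > tol:
            return False
        # (2) Associativity
        left = s_norm(s_norm(u, v, gamma), p, gamma)
        right = s_norm(u, s_norm(v, p, gamma), gamma)
        if abs(left - right) > tol:
            return False
        # (4) Boundary condition
        if abs(s_norm(u, 0.0, gamma) - u) > tol:
            return False
    # (3) Monotonicity
    for u1, u2, v1, v2 in itertools.product(grid, repeat=4):
        if u1 <= u2 and v1 <= v2:
            if s_norm(u1, v1, gamma) > s_norm(u2, v2, gamma) + tol:
                return False
    return True


results = {gamma: check_s_norm_axioms(gamma) for gamma in (-1.0, -0.5, 0.0)}
```

A grid check of this kind is a sanity test rather than a proof. For the stand-in operator, monotonicity also follows from the partial derivative 1 + γv being non-negative when γ ∈ [-1, 0] and v ∈ [0, 1].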

Compared with the prior art, the invention has the beneficial effects that:

In the high-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration, the base clustering results are taken as sample point features, and the scaled dummy variables derived from them serve as new features of the original data points. This fundamentally eliminates the influence of cluster label values, which carry no practical meaning, on the actual relations between sample points, and eliminates the influence of partition differences between base clustering results on the accuracy of the integration result. The invention designs a fuzzy operator whose coupling strength can be adjusted according to the uncertainty of the actual problem, and for the first time re-examines the fuzzy relations between objects in the CE problem from the perspective of a fuzzy operator, thereby effectively improving the ability of the CE model to handle fuzzy relations. A reassignment strategy is provided for the fuzzy objects in the integration result, yielding a higher-quality final-order integration result. The method effectively strengthens the robustness of the traditional CE approach and its ability to handle fuzzy relations.

Drawings

FIG. 1 is the overall flow chart of clustering by fuzzy self-consistent clustering integration in an embodiment of the invention;

FIG. 2 is a flow chart of the scaled, weighted dummy-variable representation of the base clustering results in an embodiment of the invention;

FIG. 3 compares the hardness prediction results of several models in an embodiment of the invention: from top to bottom, the linear SVR model, the SKTDPC+linear SVR model, the BCESF-DSC+linear SVR model, and the method of the invention, each shown as the fit between the average prediction over 40 runs and the experimental results, with a 90% training set and a 10% test set;

FIG. 4 compares the hardness prediction results of several models in an embodiment of the invention: from top to bottom, the linear SVR model, the SKTDPC+linear SVR model, the BCESF-DSC+linear SVR model, and the method of the invention, each shown as the fit between the average prediction over 40 runs and the experimental results, with an 80% training set and a 20% test set;

FIG. 5 compares the hardness prediction results of several models in an embodiment of the invention: from top to bottom, the linear SVR model, the SKTDPC+linear SVR model, the BCESF-DSC+linear SVR model, and the method of the invention, each shown as the fit between the average prediction over 40 runs and the experimental results, with a 70% training set and a 30% test set.

Detailed Description

In the description of the present invention, it should be noted that the terms "first," "second," and "third" mentioned in the embodiments of the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first," "second," or "third" may explicitly or implicitly include one or more such features.

In order that the above objects, features and advantages of the invention will be readily understood, a more particular description of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings.

Referring to FIGS. 1 and 2, the invention provides a high-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration, which comprises the following steps:

S1. For a high-entropy alloy data set, partition the data set multiple times with base clustering algorithms to generate base clustering results; take the base clustering results as sample point features, represent them as dummy variables, and apply scaling weights to obtain the scaled dummy-variable feature representation of the original data.

S1 specifically comprises the following steps:

S11. For a high-entropy alloy data set DX = {dx_1, dx_2, …, dx_N}, where N is the number of samples, partition DX L times using the base clustering algorithms {A_1, A_2, …, A_L} to obtain the base clustering result matrix, where π_ij denotes the label value assigned to the j-th sample point by the i-th partition; the base clustering result and the number of clusters under the i-th partition are denoted r_i and |C(r_i)| respectively; the base clustering results are taken as the sample point features;

S13. On the basis of S12, apply scaling weights to the base clustering results, where the scaling coefficient ω_i is based on the DBI (Davies-Bouldin) index and has the specific form:

where DBI_i is the DBI index score of the base clustering result of the i-th partition; the smaller the DBI index score, the better the clustering;

the sample point dx_j is then expressed as:

The processing in S12 fundamentally eliminates the influence of cluster label values, which carry no practical meaning, on the relations between sample points. Existing methods assume by default that all base clustering results contribute identically, but base clustering results generated with randomness in practice inevitably differ in their partitions to some degree. To describe the original data more accurately, S13 assigns each base clustering result its own scaling coefficient, with ω_i measuring the contribution of the i-th base clustering result.

S2. Using a fuzzy operator, define the fuzzy relation between any two sample points under the scaled dummy-variable feature representation, to obtain a relation matrix over all sample points.

The fuzzy operator is defined as follows:

Let S: [0,1]² → [0,1] be a mapping. For any u, v ∈ [0,1], the γ-fuzzy operator is:

The γ-fuzzy operator satisfies the following basic conditions, which are proved as follows:

(1) Commutativity: S(u, v) = S(v, u);

(2) Associativity: S(S(u, v), p) = S(u, S(v, p));

(3) Monotonicity: if u_1 ≤ u_2 and v_1 ≤ v_2, then S(u_1, v_1) ≤ S(u_2, v_2);

(4) Boundary condition: S(u, 0) = u;

Proof: (1) Commutativity: for any u, v ∈ [0,1], we have:

that is, S(u, v) = S(v, u);

(2) Associativity: for any u, v, p ∈ [0,1], we have:

that is, S(S(u, v), p) = S(u, S(v, p));

(3) Monotonicity and (4) the boundary condition are proved as in Corollary 1 of the Disclosure; for the boundary condition, for any u ∈ [0,1], we have:

that is, S(u, 0) = u.

S2 uses the γ-fuzzy operator to define the fuzzy relation f_jk between any two sample points d^*x_j and d^*x_k, specifically:

where γ ∈ [-1, 0].

The smaller the value of f_jk, the closer the sample points d^*x_j and d^*x_k are, that is, the stronger the attraction between them. In addition, when γ takes different values, the coupling strength between the two also changes. In general, γ tends to take a smaller value when the data set contains more uncertain data points; otherwise, γ takes a slightly larger value.

The γ-fuzzy operator is a parameterized fuzzy operator. It is highly adaptable in practice: the coupling strength can be controlled by adjusting the parameter γ according to the problem at hand. Introducing the γ-fuzzy operator into the design of the CE model therefore effectively improves the handling of fuzzy relations, and in turn the clustering quality and generalization ability of the CE model.

S3. Take the relation matrix as the input of the consensus strategy, compute the local density and relative distance of each sample point from the relation matrix, and on that basis identify the cluster centers and assign the non-center points, to obtain the first-order consensus clustering result π^θ.

For any sample point d^*x_j, its local density ρ_j is calculated as:

where d_c is the cutoff distance, the only hyperparameter of the consensus strategy.

For any sample point d^*x_j, its relative distance δ_j is calculated as follows: if d^*x_j is not the point of maximum local density, δ_j is determined by its nearest neighbor among the points with a larger local density; if d^*x_j is the point of maximum local density, δ_j is recorded as δ_max:

δ_max = max_k(f_jk).

The cluster centers in S3 are identified as follows: first, a two-dimensional decision graph is drawn, whose abscissa and ordinate correspond to ρ and δ respectively; second, all sample points are mapped into the decision graph, where the coordinates of any point d^*x_j are given by ρ_j and δ_j; finally, the sample points in the upper-right region of the decision graph, for which ρ_j and δ_j are both large, are identified as cluster centers.

S4. Construct a reassignment strategy to correct the uncertain sample points in the first-order consensus clustering result, to obtain the final-order consensus clustering result π^*.

Whatever type of processing strategy is used in the consensus process, some degree of fuzziness is unavoidable when cluster labels are assigned. That is, the first-order consensus clustering result π^θ still contains some uncertain sample points. These points need to be reassigned in an effective way to obtain a stable and accurate consensus result. The invention therefore designs a reassignment strategy that reduces uncertainty and improves the clustering result.

First, an uncertain sample point is defined as follows: for any sample point d^*x_j, if its relative sample point d^*x_k and its nearest-neighbor sample point d^*x_q belong to different clusters in the first-order consensus clustering result π^θ, then d^*x_j is an uncertain sample point, where the relative sample point d^*x_k is the point from which the relative distance δ_j of d^*x_j is obtained.

The reassignment strategy is implemented as follows:

This completes the correction of the uncertain points and yields the final-order consensus clustering result π^*.

The regression model used in S5 is a linear SVR model.
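The idea of S5, one regression model per cluster, can be sketched in Python as follows. To keep the example self-contained, ordinary least squares on a single feature is used as a plainly named stand-in for the linear SVR model; the function names and the numeric values are hypothetical and are not hardness data.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares on one feature (stand-in for linear SVR)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)  # assumes xs are not all equal
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_per_cluster(
    xs: Sequence[float], ys: Sequence[float], labels: Sequence[int]
) -> Dict[int, Tuple[float, float]]:
    """Fit one (slope, intercept) model for each cluster label."""
    grouped: Dict[int, Tuple[List[float], List[float]]] = defaultdict(
        lambda: ([], [])
    )
    for x, y, c in zip(xs, ys, labels):
        grouped[c][0].append(x)
        grouped[c][1].append(y)
    return {c: fit_line(gx, gy) for c, (gx, gy) in grouped.items()}


def predict(models: Dict[int, Tuple[float, float]], x: float, cluster: int) -> float:
    """Predict with the model of the cluster the sample belongs to."""
    slope, intercept = models[cluster]
    return slope * x + intercept


# Made-up data: the two clusters follow different linear trends.
xs = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0, 10.0, 9.0, 8.0]
labels = [0, 0, 0, 1, 1, 1]
models = fit_per_cluster(xs, ys, labels)
```

A single model fitted to all six points would average the two opposing trends, whereas the per-cluster models recover each trend separately; this is the motivation for clustering before regression.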

To verify the accuracy of the method of the invention, a high-entropy alloy data set of 601 sample points was used. The sample point feature parameters fall into four types: phase parameters, mechanical parameters, processing and preparation parameters, and molar ratios of the constituent elements. The phase parameters comprise valence electron concentration, electronegativity difference, atomic radius difference, mixing enthalpy, mixing entropy, electron concentration and cohesive energy; the mechanical parameters comprise work function, modulus mismatch, shear modulus difference, shear modulus and melting point; the processing and preparation parameters comprise as-cast, additive manufacturing, powder metallurgy, work hardening and homogenization; the molar ratio parameters comprise the molar ratios of lithium, magnesium, aluminum, silicon, scandium, titanium, vanadium, chromium, manganese, iron, nickel, cobalt, copper, zinc, zirconium, niobium, molybdenum, tin, hafnium, tantalum and tungsten.

The hardness of the sample points in the data set was predicted with the linear SVR model, the SKTDPC+linear SVR model, the BCESF-DSC+linear SVR model, and the method of the invention, respectively. The linear SVR model predicts high-entropy alloy hardness directly with the SVR algorithm; the SKTDPC+linear SVR model is the method of the background reference "a high-entropy alloy hardness prediction method based on an improved density peak clustering algorithm"; and the BCESF-DSC+linear SVR model is the method of the background reference "a high-entropy alloy hardness prediction method based on a dual-granularity clustering integration algorithm of three consensus strategies". The prediction results of these models and of the method of the invention are shown in FIGS. 3, 4 and 5.

As the comparison graphs show, under the different training/test splits the prediction capability of the method of the invention is greatly improved over the original SVR model, with R² rising by up to about 50%. Compared with the existing SKTDPC+linear SVR and BCESF-DSC+linear SVR models, which already have stronger prediction capability, the R² of the method of the invention also rises by about 13% and 8%, respectively. It should be noted that once a class of methods has matured to a certain level, further gains in model performance tend to become smaller and harder to achieve. In that context, the gain of more than ten percent in the index achieved by the method of the invention is hard to obtain. It can also be seen that the prediction performance of the method is very stable across the different experimental settings. Moreover, the idea of the method is general: in situations similar to those studied here, it can be combined with other regression models to improve their prediction capability.

Although the present disclosure is disclosed above, the scope of the present disclosure is not limited thereto. Various changes and modifications may be made by one skilled in the art without departing from the spirit and scope of the disclosure, and such changes and modifications would be within the scope of the disclosure.

Claims

1. A high-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration, characterized by comprising the following steps:

S1 specifically comprises the following steps:

S11. For a high-entropy alloy data set DX = {dx_1, dx_2, …, dx_N}, where N is the number of samples, partition DX L times using the base clustering algorithms {A_1, A_2, …, A_L} to obtain the base clustering result matrix, where π_ij denotes the label value assigned to the j-th sample point by the i-th partition; the base clustering result and the number of clusters under the i-th partition are denoted r_i and |C(r_i)| respectively; the base clustering results are taken as the sample point features;

the sample point dx_j is expressed as:

where each block is a vector of length |C(r_i)| whose positions, from left to right, correspond to cluster 1 through cluster |C(r_i)|, and which takes the value ω_i only at position π_ij and 0 at all other positions;

finally, the scaled dummy-variable feature representation of the original data is obtained;

the fuzzy operator is defined as follows:

let S: [0,1]² → [0,1] be a mapping; for any u, v ∈ [0,1] and γ ∈ [-1, 0], the γ-fuzzy operator is:

the γ-fuzzy operator is used to define the fuzzy relation f_jk between any two sample points d^*x_j and d^*x_k, specifically:

where γ ∈ [-1, 0];

2. The method according to claim 1, characterized in that in S3, for any sample point d^*x_j, its local density ρ_j is calculated as:

where d_c is the cutoff distance.

3. The method according to claim 2, characterized in that in S3, for any sample point d^*x_j, its relative distance δ_j is calculated as:

δ_max = max_k(f_jk).

4. The method according to claim 3, characterized in that the cluster centers in S3 are identified as follows: first, a two-dimensional decision graph is drawn, whose abscissa and ordinate correspond to ρ and δ respectively; second, all sample points are mapped into the decision graph, where the coordinates of any point d^*x_j are given by ρ_j and δ_j; finally, the sample points in the upper-right region of the decision graph, for which ρ_j and δ_j are both large, are identified as cluster centers;

5. The method according to claim 4, characterized in that an uncertain sample point in S4 is defined as follows: for any sample point d^*x_j, if its relative sample point d^*x_k and its nearest-neighbor sample point d^*x_q belong to different clusters in the first-order consensus clustering result π^θ, then d^*x_j is an uncertain sample point, where the relative sample point d^*x_k is the point from which the relative distance δ_j of d^*x_j is obtained.

6. The method according to claim 5, characterized in that the reassignment strategy is implemented as follows:

Step 1: construct two vectors corresponding to all sample points {d^*x_1, d^*x_2, …, d^*x_N}: one vector stores the cluster labels of the relative sample points, and the other stores the cluster labels of the nearest-neighbor sample points;

7. The method according to claim 1, characterized in that the γ-fuzzy operator satisfies the following basic conditions, which are proved as follows:

(1) Commutativity: S(u, v) = S(v, u);

(2) Associativity: S(S(u, v), p) = S(u, S(v, p));

(3) Monotonicity: if u_1 ≤ u_2 and v_1 ≤ v_2, then S(u_1, v_1) ≤ S(u_2, v_2);

(4) Boundary condition: S(u, 0) = u;

Proof: (1) Commutativity: for any u, v ∈ [0,1], we have:

that is, S(u, v) = S(v, u);

(2) Associativity: for any u, v, p ∈ [0,1], we have:

that is, S(S(u, v), p) = S(u, S(v, p));

(3) Monotonicity: for any u_1, u_2, v_1, v_2 ∈ [0,1] with u_1 ≤ u_2 and v_1 ≤ v_2, we have:

and since γ ∈ [-1, 0], it follows from the above expression that S(u_1, v_1) ≤ S(u_2, v_2);

that is, S(u, 0) = u.