CN108763590B - Data clustering method based on double-variant weighted kernel FCM algorithm - Google Patents
Abstract
The invention discloses a data clustering method based on a bivariate weighted kernel FCM algorithm. The method first optimally partitions a data set so as to minimize an objective function; obtains an initial membership matrix, a typicality matrix and initial cluster centers; computes the distances between data points and cluster centers in a multi-kernel high-dimensional feature space; and iteratively updates the membership values and possibilistic typicality values, taking the partition that minimizes the objective function as the final clustering result. The invention adopts a combined-kernel distance function in place of the usual Euclidean distance function, so both linear and nonlinear data can be partitioned well; the typicality matrix strengthens the noise immunity of the algorithm and improves clustering accuracy; and the proportions of the individual kernels in the combined kernel are adjusted automatically to meet the needs of different data sets for different kernel functions, which resolves the uncertainty of ordinary kernel algorithms in choosing a kernel function.
Description
Technical Field
The invention relates to the technical field of data clustering, and in particular to a data clustering method based on a bivariate weighted kernel FCM algorithm.
Background
Clustering is an important research topic in data mining and artificial intelligence and plays a major role in fields such as big data, pattern recognition, image segmentation and machine learning. Clustering partitions data according to similarity rules of the data; the partition is determined by those rules, and the resulting groups or sets are commonly called clusters. The fuzzy c-means clustering algorithm (FCM) is the basic method of fuzzy clustering and laid the foundation for objective-function-based clustering algorithms, but it is sensitive to the initialization of cluster centers and is easily influenced by noise points. To improve the noise immunity of the algorithm, Krishnapuram and Keller proposed the PCM algorithm. PCM adopts a possibilistic partition matrix in which the possibilistic membership reflects how typical a data point is of a given cluster center. It also relaxes the FCM constraint that the columns of the partition matrix sum to 1, and compared with FCM it can handle noise points better. However, the PCM algorithm easily produces fewer cluster centers than the predetermined number of clusters, or overlapping clusters.
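As context for the FCM baseline discussed above, the standard FCM iteration (alternating updates of memberships and centers) can be sketched as follows. This is an illustrative Python sketch of textbook FCM with the usual update rules, not the patent's DWKFCM; the function name and defaults are ours.

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, eps=1e-5, seed=0):
    """Plain fuzzy c-means (textbook form), shown only as a baseline sketch."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.random((c, len(X)))
    U /= U.sum(axis=0)                      # memberships of each point sum to 1
    for _ in range(max_iter):
        Um = U ** m
        V = Um @ X / Um.sum(axis=1, keepdims=True)            # weighted centers
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(-1) + 1e-12
        inv = d2 ** (-1.0 / (m - 1))
        U_new = inv / inv.sum(axis=0)       # u_ij proportional to d_ij^(-2/(m-1))
        if np.abs(U_new - U).max() < eps:
            U = U_new
            break
        U = U_new
    return U, V
```

The sensitivity to initialization and noise mentioned above is visible here: the centers are weighted means, so a distant noise point still pulls every center toward it.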
For this reason, Zhang proposed in the literature an improved possibilistic clustering scheme that adds a new parameter η_i to reduce the error of the algorithm. Although this possibilistic clustering algorithm overcomes the problem of coincident clusters, it is exceptionally sensitive to the choice of the original parameter m_p: different values of m_p yield two distinct sets of cluster centers, even when the values differ very little. The improved possibilistic fuzzy c-means clustering algorithm PFCM proposed by Nikhil has good noise robustness and does not produce coincident clusters; however, the parameters a and b of PFCM usually need to be specified manually, lack a theoretical basis, and the results depend strongly on them.
These algorithms have a good clustering effect on linear data, but their effect on nonlinear data is not ideal. By introducing a kernel function satisfying the Mercer condition, the sample data X = {x1, …, xn} is mapped into a high-dimensional feature space F as φ(x1), …, φ(xn) ∈ F, where φ is the mapping function, and the samples are clustered in the space F; this yields kernel-based fuzzy clustering algorithms. Yang proposed the kernel-based fuzzy clustering algorithm KFCM, and Genton presented kernel machine learning from a statistical viewpoint. These algorithms map data points to a high-dimensional feature space and, by using the kernel function together with the optimized clustering error, give kernel-based fuzzy clustering good robustness to noise and outliers, solving the sensitivity of the PFCM algorithm to parameter settings. However, kernel-based fuzzy clustering works well on spherical data and does not achieve ideal results on non-spherical data.
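The kernel-based methods above rely on the kernel trick: the squared distance in the feature space F can be computed without ever forming φ explicitly, since ||φ(x) − φ(v)||² = K(x,x) − 2K(x,v) + K(v,v). A minimal sketch, assuming a Gaussian kernel with an illustrative width σ (function names are ours):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # K(x, y) = exp(-||x - y||^2 / sigma^2); sigma is an assumed width parameter
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.exp(-np.sum((x - y) ** 2) / sigma ** 2)

def feature_space_dist2(x, v, k):
    # ||phi(x) - phi(v)||^2 = K(x,x) - 2 K(x,v) + K(v,v), evaluated entirely
    # through kernel calls: the mapping phi is never constructed
    return k(x, x) - 2 * k(x, v) + k(v, v)
```

For a Gaussian kernel K(x,x) = 1, so the feature-space squared distance reduces to 2(1 − K(x,v)) and is bounded above by 2.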
Zhao et al. previously noted in the literature that partitional clustering with the largest of multiple kernels is mostly concerned with supervised and semi-supervised cluster learning based on maximum-margin clustering, and such algorithms are clearly used mostly for hard clustering. The multi-kernel method proposed by Hsin-Chien provides great flexibility in the selection and combination of basic kernels, adds information sources from different angles, and improves the ability to encode domain knowledge.
Disclosure of Invention
In order to overcome the deficiencies of the prior art, the invention provides a data clustering method based on a double-variable weighted kernel FCM algorithm that avoids both the sensitivity of FCM to noise points and the tendency of PCM to produce coincident clusters, improves the accuracy of the algorithm, and mines the structural information present in a data set more precisely. At the same time, the most suitable weight values and current membership values are found automatically, which improves the reliability and convergence of the algorithm.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention discloses a data clustering method based on a bivariate weighted kernel FCM algorithm, which is used for clustering customer information and recommending products to customers according to clustering results and comprises the following steps:
In formula (1), i denotes the i-th class and c the number of classes, with 1 ≤ i ≤ c and 2 ≤ c ≤ n; u_ij denotes the membership value of the j-th data point x_j in the i-th class; u_ij^m is the m-th power of u_ij, where m is a weighting exponent representing the degree of cluster fuzziness; t_ij denotes the possibilistic typicality value of the j-th data point x_j with respect to the i-th class; t_ij^η is the η-th power of t_ij, where η is a weighting exponent controlling the proportion of membership to typicality to realize the bivariate formulation; D_ij denotes the distance between the j-th data point x_j of the multi-kernel high-dimensional feature space and the cluster center v_i of the i-th class in that space, and:
In formula (2), w_l is the weight parameter of the kernel, M is the total number of kernels, and α_ijk is expressed by formula (3):
In formula (3), k_l(x_j, x_q) is the l-th Gaussian kernel function, expressed by formula (4):
In formula (4), the first symbol is the width parameter of the function; x_jl and x_ql respectively denote the l-th dimension feature values of the j-th and q-th data points;
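Formula (4) describes one Gaussian base kernel per feature dimension, which is consistent with the embodiment, where M equals the number of attributes. The sketch below is illustrative only: the name sigma_l for the width parameter is an assumption (the symbol is not reproduced in the text), and the simple convex weighting of the base kernels is also an assumption, since formula (2) with its α coefficients is not reproduced either.

```python
import numpy as np

def dim_gaussian_kernel(xj, xq, l, sigma_l):
    # Gaussian base kernel acting on the l-th coordinate only, in the spirit
    # of formula (4); sigma_l is an assumed name for the width parameter
    return np.exp(-((xj[l] - xq[l]) ** 2) / sigma_l ** 2)

def combined_kernel(xj, xq, sigmas, w):
    # Weighted combination of the M per-dimension base kernels (one kernel
    # per attribute). The weights w are assumed nonnegative and summing to 1;
    # the patent's exact combination in formula (2) is not reproduced here.
    assert np.isclose(sum(w), 1.0)
    return sum(w[l] * dim_gaussian_kernel(xj, xq, l, sigmas[l])
               for l in range(len(sigmas)))
```

With this convention the combined kernel of a point with itself equals 1, since every base kernel evaluates to 1 and the weights sum to 1.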
In formula (5), θ represents a constant; x_z, 1 ≤ z ≤ n, denotes the z-th data point; ||x_j − x_z|| denotes the Euclidean distance between the j-th data point x_j and the z-th data point x_z;
In formula (1), r_i represents a penalty factor, and:
In formula (6), ||x_j − v_i|| represents the Euclidean distance between the j-th data point and the i-th cluster center;
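The body of formula (1) is not reproduced in the text. By analogy with the PFCM family that the description builds on, a plausible bivariate objective combines the membership term u_ij^m and typicality term t_ij^η on the feature-space distances with a possibilistic penalty term. The following Python sketch is an assumed form of that objective, not the patent's exact formula:

```python
import numpy as np

def objective_J(U, T, D2, r, m=2.0, eta=2.0):
    # Assumed bivariate objective in the spirit of formula (1) (PFCM family):
    #   J = sum_ij (u_ij^m + t_ij^eta) * D_ij^2
    #       + sum_i r_i * sum_j (1 - t_ij)^eta
    # U, T, D2 are (c, n) arrays; r is a length-c array of penalty factors.
    data_term = ((U ** m + T ** eta) * D2).sum()
    penalty_term = (r * ((1 - T) ** eta).sum(axis=1)).sum()
    return data_term + penalty_term
```

The penalty term pushes the typicality values away from 0 and so prevents the trivial solution T = 0, which is the role attributed to r_i above.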
Step 5, obtain the parameter β_l from formula (7):
Step 7, obtain from formula (2) the square of the distance D_ij between the j-th data point x_j of the multi-kernel high-dimensional feature space and the cluster center v_i of the i-th class in that space;
Step 8, obtain from formula (9) the typicality value of the j-th data point x_j with respect to the i-th class at the iter-th iteration;
Step 9, obtain from formula (10) the membership value of the j-th data point x_j in the i-th class at the iter-th iteration;
Step 10, set a threshold ε and judge whether the algorithm stopping condition on ε, or iter > iterMax, holds. If it holds, U^(iter) contains the optimal membership values: for the j-th data point x_j, take the value of i that maximizes u_ij^(iter); the data point x_j then belongs to the i-th class. From the obtained membership matrix U^(iter), obtain the set Y of all data points belonging to the i-th class and compute its mean, where y_j is the j-th datum in the i-th class's point set Y and n′ is the total number of data points belonging to the i-th class; this mean is the cluster center of the i-th class, and the cluster centers of the other classes are obtained in the same way. Products are then recommended to new customers according to the cluster-center matrix. If the stopping condition is not satisfied, increase iter by 1 and repeat steps 4 to 10 until the condition is satisfied.
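The hard assignment and center computation of step 10 can be sketched as follows; this is an illustrative Python sketch, and the function name is ours.

```python
import numpy as np

def hard_assign_and_centers(X, U):
    """Step 10 in outline: assign each point to the class of maximal
    membership, then recompute each center as the mean of its points."""
    X = np.asarray(X, dtype=float)
    labels = U.argmax(axis=0)     # for x_j, the i maximizing u_ij
    centers = np.vstack([X[labels == i].mean(axis=0)
                         for i in range(U.shape[0])])
    return labels, centers
```

Each row of `centers` corresponds to the mean of the set Y described above, with n′ given implicitly by the number of points assigned to that class.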
Compared with the prior art, the invention has the beneficial effects that:
1. The double-variable weighted kernel FCM clustering algorithm DWKFCM (Weighted Kernel Fuzzy C-Means with Double variables) of the invention adopts a multi-kernel method and integrates the advantages of the fuzzy clustering method FCM and the possibilistic clustering algorithm PCM. It reduces the influence of kernel-function selection, to which single-kernel algorithms are sensitive, on the experimental result, and adds the possibilistic concept on the multi-kernel basis, so the algorithm has stronger noise immunity and yields more accurate clustering results.
2. The DWKFCM of the invention adopts a kernel-based algorithm that can handle nonlinear data, mapping the nonlinear operations on ordinary data into a high-dimensional feature space, which increases the robustness of the algorithm.
3. The invention extends DWKFCM to soft clustering: the attribution of the data points is fuzzified while their spatial distribution characteristics are considered at the same time, and the influence of a data point on a cluster center is judged by computing its distance to that center, the influence being smaller the farther the point lies from the center, so the method has stronger flexibility.
4. Data information is growing explosively nowadays, and clustering is the key to further data analysis and knowledge mining, so the method has good application value.
Drawings
FIG. 1 is a flow chart of a data clustering algorithm based on a bivariate weighted kernel FCM algorithm;
Fig. 2 is a Sammon map of the iris data set.
Detailed Description
Referring to fig. 1, the data clustering method based on the dual-variant weighted kernel FCM algorithm in this embodiment is used for clustering customer information, and recommending products to customers according to clustering results, and is performed according to the following steps:
In formula (1), i denotes the i-th class and c the number of classes, i.e. the number of product categories, for example c = 10, with 1 ≤ i ≤ c and 2 ≤ c ≤ n; u_ij denotes the membership value of the j-th data point x_j in the i-th class; u_ij^m is the m-th power of u_ij, where m is a weighting exponent representing the degree of cluster fuzziness and may take the value 2; t_ij denotes the possibilistic typicality value of the j-th data point x_j with respect to the i-th class; t_ij^η is the η-th power of t_ij, where η is a weighting exponent controlling the proportion of membership to typicality to realize the bivariate formulation, and may take the value 2; D_ij denotes the distance between the j-th data point x_j of the multi-kernel high-dimensional feature space and the cluster center v_i of the i-th class in that space, and:
In formula (2), w_l is the weight parameter of the kernel and M is the total number of kernels, M being the number of attributes of the customer information, for example M = 6; α_ijk is expressed by formula (3):
In formula (3), k_l(x_j, x_q) is the l-th Gaussian kernel function, expressed by formula (4):
In formula (4), the width parameter of the function is obtained by calculation from the formula given; x_jl and x_ql respectively denote the l-th dimension feature values of the j-th and q-th data points;
In formula (5), θ represents a constant, which may take the value 2; x_z, 1 ≤ z ≤ n, denotes the z-th data point; ||x_j − x_z|| denotes the Euclidean distance between the j-th data point x_j and the z-th data point x_z;
In formula (1), r_i represents a penalty factor, and:
In formula (6), ||x_j − v_i|| represents the Euclidean distance between the j-th data point and the i-th cluster center.
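The body of formula (6) is not reproduced in the text, but the symbols described here (memberships, squared Euclidean distances to the centers) match the standard PCM estimate of the penalty factor due to Krishnapuram and Keller. The following sketch assumes that form, with an assumed scale constant K:

```python
import numpy as np

def penalty_factors(X, V, U, m=2.0, K=1.0):
    """Possibilistic penalty factors under the standard PCM assumption:
    r_i = K * sum_j u_ij^m ||x_j - v_i||^2 / sum_j u_ij^m.
    This is a sketch of one common choice, not the patent's exact formula (6)."""
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    Um = U ** m                                            # (c, n)
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(-1)    # squared distances
    return K * (Um * d2).sum(axis=1) / Um.sum(axis=1)
```

Each r_i is then a membership-weighted average squared distance within cluster i, which sets the scale at which typicality values decay.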
Step 5, obtain the parameter β_l from formula (7):
Step 7, obtain from formula (2) the square of the distance D_ij between the j-th data point x_j of the multi-kernel high-dimensional feature space and the cluster center v_i of the i-th class in that space.
Step 8, obtain from formula (9) the typicality value of the j-th data point x_j with respect to the i-th class at the iter-th iteration;
Step 9, obtain from formula (10) the membership value of the j-th data point x_j in the i-th class at the iter-th iteration;
Step 10, set the threshold ε = 0.00001 and judge whether the algorithm stopping condition on ε, or iter > iterMax, holds. If it holds, U^(iter) contains the optimal membership values: for the j-th data point x_j, take the value of i that maximizes u_ij^(iter); the data point x_j then belongs to the i-th class. From the obtained membership matrix U^(iter), obtain the set Y of all data points belonging to the i-th class and compute its mean, where y_j is the j-th datum in the i-th class's point set Y and n′ is the total number of data points belonging to the i-th class, obtainable from the number of elements of Y; this mean is the cluster center of the i-th class, and the cluster centers of the other classes are obtained in the same way. Products are then recommended to new customers according to the cluster-center matrix. If the stopping condition is not satisfied, increase iter by 1 and repeat steps 4 to 10 until the condition is satisfied.
In this embodiment, catalog marketing is taken as the research object, and the data to be clustered is the customer information; that is, the set of all customer information forms the data set to be clustered, and each piece of customer information contains attribute information such as age, income, internet surfing time, gender, constellation and consumption categories. All customer information is clustered with the data clustering method of this embodiment, and marketing strategies such as recommending specific products and regularly sending out articles purchased by similar customers are then applied to the different customers according to the clustering results.
The clustering method in this embodiment was completed on the following experimental platform: a PC running Windows 7, with MATLAB R2015b as the integrated development environment. Hardware: an Intel Core CPU at 3.20 GHz and 8 GB of memory. The parameters m and η in the algorithm are both set to 2.0, and the maximum number of iterations iterMax is 100.
In order to verify the performance of the data clustering method DWKFCM based on the double-variable weighted kernel FCM algorithm, four real data sets were used: the iris data set, the wine data set, the glass data set and the Pima Indians diabetes data set. The specific information of these four data sets is shown in Table 1.
Table 1 summary of the actual data set information used in the experiment
The clustering method DWKFCM of this embodiment, the fuzzy c-means clustering algorithm FCM, the possibilistic clustering algorithm PCM, the possibilistic fuzzy clustering algorithm PFCM, the kernel-based fuzzy clustering algorithm KFCM and the multi-kernel fuzzy clustering algorithm MKFCM were each used to cluster the data sets. The clustering accuracy CR is taken as the evaluation criterion of the clustering effect and is defined as follows:
where a_i represents the number of samples of the i-th cluster that are finally correctly classified, c represents the number of clusters, and n represents the number of samples in the data set. The higher the clustering accuracy, the better the clustering effect of the clustering method. When the value of CR is 1, the clustering result of the algorithm on the data set is completely correct.
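The CR computation can be sketched as follows. The text does not specify how predicted clusters are matched to the true classes, so matching by the best one-to-one permutation (feasible for the small c used here) is an assumption of this sketch:

```python
import numpy as np
from itertools import permutations

def clustering_accuracy(true_labels, pred_labels, c):
    """CR = (sum_i a_i) / n, with predicted cluster indices mapped to true
    class indices by the best permutation (an assumed matching rule)."""
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    n = len(true_labels)
    best = 0
    for perm in permutations(range(c)):
        mapped = np.array([perm[p] for p in pred_labels])
        best = max(best, int((mapped == true_labels).sum()))
    return best / n
```

Because cluster indices are arbitrary, a clustering that swaps the two labels of a perfect partition still scores CR = 1 under this matching.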
The iris data set contains 150 data objects, each described by 4 attributes: sepal length, sepal width, petal length and petal width; the algorithm predicts to which of three categories (Setosa, Versicolour, Virginica) each iris belongs according to these 4 attributes. As can be seen from the Sammon map of the iris data set in Fig. 2, two classes of the data, marked by "Δ" and "o" in the map, overlap each other and are not easy to separate, which poses a great challenge to clustering methods. The accuracy of the clustering results obtained with the DWKFCM, FCM, PCM, PFCM, KFCM and MKFCM algorithms on the iris data set is shown in Table 2.
TABLE 2 clustering accuracy of algorithms on Iris data set
Algorithm | Cluster accuracy (CR) |
FCM | 0.877 |
PCM | 0.668 |
PFCM | 0.808 |
KFCM | 0.914 |
MKFCM | 0.932 |
DWKFCM | 0.959 |
It can be seen from Table 2 that the accuracy of the DWKFCM algorithm is 8.2, 29.1, 15.1, 4.5 and 2.7 percentage points higher than that of FCM, PCM, PFCM, KFCM and MKFCM respectively. The performance of the DWKFCM algorithm is therefore best.
The glass data set is a data set characterized by glass material and comprises 214 data objects; each object is represented by 9 attributes and can be assigned to one of 6 classes of different sizes. The accuracy of the clustering results obtained with the DWKFCM, FCM, PCM, PFCM, KFCM and MKFCM algorithms on the glass data set is shown in Table 3.
TABLE 3 clustering accuracy of algorithms on glass datasets
Algorithm | Cluster accuracy (CR) |
FCM | 0.813 |
PCM | 0.739 |
PFCM | 0.846 |
KFCM | 0.888 |
MKFCM | 0.901 |
DWKFCM | 0.953 |
The clustering results in Table 3 show that the clustering accuracy of the DWKFCM algorithm is 14, 21.4, 10.7, 6.5 and 5.2 percentage points higher than that of FCM, PCM, PFCM, KFCM and MKFCM respectively. The performance of the DWKFCM algorithm is better.
The wine data set is a data set characterized by an analysis of the composition of wines; it contains 178 data points, each with 13 attributes, classified into 3 types, and has the largest number of attributes compared with the iris and glass data sets. The accuracy of the clustering results obtained with the DWKFCM, FCM, PCM, PFCM, KFCM and MKFCM algorithms on the wine data set is shown in Table 4.
TABLE 4 clustering accuracy of algorithms on wine datasets
Algorithm | Cluster accuracy (CR) |
FCM | 0.708 |
PCM | 0.409 |
PFCM | 0.688 |
KFCM | 0.777 |
MKFCM | 0.841 |
DWKFCM | 0.925 |
The clustering results in Table 4 show that the clustering accuracy of the DWKFCM algorithm is 21.7, 51.6, 23.7, 14.8 and 8.4 percentage points higher than that of FCM, PCM, PFCM, KFCM and MKFCM respectively. The performance of the DWKFCM algorithm is better.
The diabetes data set is a diabetes diagnosis data set from the field of medicine. It contains 768 instances, each with 8 attributes: number of pregnancies, plasma glucose concentration 2 h after a meal, diastolic blood pressure (mmHg), triceps skin fold thickness, serum insulin (mu U/ml), body mass index, diabetes pedigree function, and age. The algorithm determines whether a patient is diabetic based on these 8 attributes. The data set consists of 500 instances without the disease and 268 patient instances.
The accuracy of the clustering results obtained by using the DWKFCM algorithm, the FCM algorithm, the PCM algorithm, the PFCM algorithm, the KFCM algorithm, and the MKFCM algorithm on the diabetes data set is shown in table 5.
TABLE 5 clustering accuracy of algorithms on diabetes data sets
Algorithm | Cluster accuracy (CR) |
FCM | 0.582 |
PCM | 0.355 |
PFCM | 0.754 |
KFCM | 0.773 |
MKFCM | 0.831 |
DWKFCM | 0.934 |
As can be seen from Table 5, the DWKFCM of the invention achieves 93.4% accuracy in diagnosing diabetes on the diabetes data set, while the accuracy of the other algorithms is below 90%, the PCM algorithm reaching only 35.5%. The clustering results in Table 5 show that the clustering accuracy of the DWKFCM algorithm is 35.2, 57.9, 18, 16.1 and 10.3 percentage points higher than that of FCM, PCM, PFCM, KFCM and MKFCM respectively. The performance of the DWKFCM algorithm is better.
The double-variable weighted kernel fuzzy c-means clustering algorithm DWKFCM can be well applied to fields such as marketing, flower classification, wine classification, glass classification and diabetes diagnosis, reliably mines the information in data, and has high practicability.
Claims (1)
1. A data clustering method based on a bivariate weighted kernel FCM algorithm, used for clustering customer information and recommending products to customers according to the clustering results, characterized by comprising the following steps:
step 1, the data set X is the customer information, X = {x1, x2, …, xn}, where x_j is the j-th data point, j = 1, 2, …, n, and n is the total number of data; optimally partition the data set X so that the value of the objective function J in formula (1) is minimal:
in formula (1), i denotes the i-th class and c the number of classes, with 1 ≤ i ≤ c and 2 ≤ c ≤ n; u_ij denotes the membership value of the j-th data point x_j in the i-th class; u_ij^m is the m-th power of u_ij, where m is a weighting exponent representing the degree of cluster fuzziness; t_ij denotes the possibilistic typicality value of the j-th data point x_j with respect to the i-th class; t_ij^η is the η-th power of t_ij, where η is a weighting exponent controlling the proportion of membership to typicality to realize the bivariate formulation; D_ij denotes the distance between the j-th data point x_j of the multi-kernel high-dimensional feature space and the cluster center v_i of the i-th class in that space, and:
in formula (2), w_l is the weight parameter of the kernel, M is the total number of kernels, and α_ijk is expressed by formula (3):
in formula (3), k_l(x_j, x_q) is the l-th Gaussian kernel function, expressed by formula (4):
in formula (4), the first symbol is the width parameter of the function; x_jl and x_ql respectively denote the l-th dimension feature values of the j-th and q-th data points;
in formula (5), θ represents a constant; x_z, 1 ≤ z ≤ n, denotes the z-th data point; ||x_j − x_z|| denotes the Euclidean distance between the j-th data point x_j and the z-th data point x_z;
in formula (1), r_i represents a penalty factor, and:
in formula (6), ||x_j − v_i|| represents the Euclidean distance between the j-th data point and the i-th cluster center;
step 2, process the data set X with the fuzzy c-means clustering algorithm FCM to obtain a membership matrix U and cluster centers V; obtain the penalty factors r_i by calculation according to formula (6), and take the membership matrix U obtained by FCM as the initial membership matrix U^(0) of the bivariate weighted kernel FCM algorithm;
step 3, randomly initialize the typicality values of the j-th data point x_j with respect to the i-th class; define the iteration count as iter and the maximum number of iterations as iterMax, and initialize iter = 1; the membership matrix of the iter-th iteration is U^(iter) and the typicality matrix of the iter-th iteration is T^(iter);
step 5, obtain the parameter β_l from formula (7):
step 7, obtain from formula (2) the square of the distance D_ij between the j-th data point x_j of the multi-kernel high-dimensional feature space and the cluster center v_i of the i-th class in that space;
step 8, obtain from formula (9) the typicality value of the j-th data point x_j with respect to the i-th class at the iter-th iteration;
step 9, obtain from formula (10) the membership value of the j-th data point x_j in the i-th class at the iter-th iteration;
step 10, set a threshold ε and judge whether the algorithm stopping condition on ε, or iter > iterMax, holds; if it holds, U^(iter) contains the optimal membership values: for the j-th data point x_j, take the value of i that maximizes u_ij^(iter), so that the data point x_j belongs to the i-th class; from the obtained membership matrix U^(iter), obtain the set Y of all data points belonging to the i-th class and compute its mean, where y_j is the j-th datum in the i-th class's point set Y and n′ is the total number of data points belonging to the i-th class; this mean is the cluster center of the i-th class, and the cluster centers of the other classes are obtained in the same way; recommend products to new customers according to the cluster-center matrix; if the stopping condition is not satisfied, increase iter by 1 and repeat steps 4 to 10 until the condition is satisfied.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810636707.XA CN108763590B (en) | 2018-06-20 | 2018-06-20 | Data clustering method based on double-variant weighted kernel FCM algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108763590A CN108763590A (en) | 2018-11-06 |
CN108763590B true CN108763590B (en) | 2021-07-27 |