CN110728293B - Hierarchical clustering method for tourist heading data
- Publication number: CN110728293B (application CN201910812062.5A)
- Authority
- CN
- China
- Legal status: Active (granted)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
Abstract
The invention discloses a hierarchical clustering method for tourist destination data, based on region growing and competition and oriented to data spaces of variable-scale density. Unlike conventional methods, it adopts a hierarchical clustering idea and divides the clustering process into three levels. The first level partitions the objects into a number of subclasses based on Euclidean distance and a distance threshold R1, simplifying the algorithm and reducing its complexity. The second level grows spatial data regions: the cluster centers obtained in the first level serve as growth seeds and grow under a growth criterion until a stop condition is reached, solving the clustering of data with variable-scale density. Finally, the third level calculates the weights between cluster centers based on a competition idea and a density-similarity principle and merges the clusters under suitable rules, solving the clustering of non-convex data. Compared with other clustering algorithms, the method improves clustering accuracy while reducing complexity, has clear advantages in processing massive data, and better meets the needs of practical engineering applications.
Description
Technical Field
The invention relates to the field of hierarchical clustering, and in particular to a region growing and competition based method for improving the clustering of variable-scale density data.
Background
Data mining is a hot research topic in the fields of artificial intelligence and databases, and clustering analysis is an important branch of data mining that is widely applied as a data analysis tool in many fields. Clustering is the process of dividing a collection of physical or abstract objects into classes composed of similar objects. Clustering originates in taxonomy but differs from classification: the classes into which clustering partitions the data are unknown in advance, so clustering is unsupervised. Clustering algorithms are broadly classified into (1) partition-based methods, such as the K-means algorithm; (2) hierarchy-based methods, such as the BIRCH and CURE algorithms; (3) density-based methods, such as the DBSCAN algorithm; (4) grid-based methods; and (5) neural networks and various other clustering methods. Among them, the K-means algorithm is one of the most classical. As the most widely applied partition-based clustering algorithm, K-means is simple to implement, but it has three defects: (1) the user must specify the cluster number k in advance; (2) it is not suitable for finding non-convex clusters; (3) it is very sensitive to noise and outlier data. DBSCAN decides whether to establish a new cluster with an object as its core by checking whether the density of the object's ε-neighborhood is high enough, i.e., whether the number of data points within distance ε exceeds a set threshold, and then merges density-reachable clusters, so that clusters of arbitrary shape can be found in a spatial database with noise; however, DBSCAN is sensitive to its two hard-to-determine parameters, ε and the set threshold, and its computational complexity is relatively high.
Disclosure of Invention
Traditional clustering algorithms mostly assume a uniform spatial density, but real data are often non-convex, with multi-scale changes in density. Traditional clustering algorithms applied to such data exhibit various defects; distance-based algorithms such as K-means in particular become more parameter-sensitive and less accurate. Since multi-scale data are mostly limited in space, the invention provides, by means of multi-level analysis, a novel distance-based multi-level clustering algorithm designed around actual needs, which solves the clustering of multi-scale density data through multi-level, fast, non-convex clustering. Being distance-based, the algorithm correspondingly reduces complexity and avoids computing densities; it completes the clustering of the data by growing seed regions and fusing them reasonably in a multi-stage aggregation manner. The invention reduces complexity while simplifying the algorithm, is beneficial to the clustering of massive data, and is suitable for the analysis of tourist destination data.
In order to solve the technical problems, the invention adopts the following technical scheme:
A region growing and competition based hierarchical clustering method for visitor destination data, oriented to a variable-scale data density space, comprises the following steps:
A first stage: the cluster centers are updated by drawing circles with the distance threshold R1, as follows:

Step 1.1: input an unlabeled data set X = {x_1, x_2, ..., x_i, ..., x_N} ∈ R^P. Randomly take the i-th data object x_i from X and store it as the first cluster center point in a set C = { }; then randomly take the j-th data object x_j in X and calculate the Euclidean distance d_ij between x_i and x_j by equation (1):

d_ij = sqrt(Σ_{p=1..P} (x_ip − x_jp)²)  (1)

If d_ij is less than R1 (R1 is 10% of the spatial size of the data set), points x_i and x_j belong to the same class, and a new cluster center point S is calculated according to equation (2) to replace point x_i in C; if d_ij is greater than R1, x_i and x_j are not of the same class, and x_j is also stored as a cluster center, so that the cluster center set C = {x_i, x_j};

S = (1 − β)·x_i + β·x_j  (2)

where S in equation (2) is the updated cluster center and β is a weight coefficient;

Step 1.2: from the data set X (excluding x_i, x_j), randomly take the m-th data object x_m and calculate the set of Euclidean distances {d_1, ..., d_n} from x_m to the points of C, where n is the number of points in the set C; determine the point C_i in the cluster center set closest to x_m, and use points x_m and C_i to update the cluster center by the method of equations (1) and (2);

Step 1.3: repeat steps 1.1 and 1.2 until all points in X have been traversed, obtaining the updated cluster center set C = {C_1, ..., C_i, ..., C_w}, where w is the number of clusters, and the corresponding cluster set M = {C_1{...}, ..., C_i{...}, ..., C_w{...}};
A second stage: region growing is carried out as follows:

Step 2.1: determine the seed sequence: first traverse all points in the cluster center set C and count the number n_i of points in the i-th cluster, i = 1, 2, ..., w. If n_i is less than minC, no cluster is formed: the corresponding cluster center point C_i is deleted from C and the corresponding set C_i{...} is deleted from M, and the remaining cluster center points are stored in a set D as the seed sequence B = {C_1, ..., C_i, ..., C_d}, d <= w;

Step 2.2: define the growth criterion and determine the growth stop condition: take the first cluster center C_1 in the seed sequence B and draw a circle with R1 as radius. Count the number of points n_1 in the circle; if n_1 is greater than minC, continue to draw a circle Q_B1 with C_1 as center and R = R1 + ΔR as radius, and judge whether the points entering the circle Q_B1 belong to D; if so, set i = i + 1 and continue to grow;

ΔR = (e^{sm(x)} / 10) · i² · 0.03  (3)

where sm(x) is the average distance between the data in the x-th cluster of the set M; the points entering the circle are stored in the corresponding cluster set, giving an updated M;

Step 2.3: the points obtained after each cluster center's region has grown are not processed as growth objects in the next round; the other cluster center points in C are then traversed by the method of step 2.2 to obtain each cluster center point and the data of its corresponding cluster;
A third stage: calculate the relation weights among all cluster centers based on a competition idea and merge the clusters under suitable rules.

After the second-level clustering of the data set X, suppose all cluster centers compete for each data point X_i; the two winners are the cluster centers closest and second closest to X_i. Take d as the ratio of the two winning distances; when d has a value in a certain range, the clusters of the two winners are considered to have a relation weight, whose increase criterion is: the relation weight between two small clusters is expressed as w_xy and calculated as in equation (4);

Step 3.1: for the data set X = {X_1, ..., X_i, ..., X_N}, traverse sequentially from the first datum X_1. For each specific datum, find the two winners among all cluster centers competing for it, and judge by the relation-weight existence criterion whether the clusters corresponding to the two winners have a relation weight. If so, increase the relation weight of the pair according to equation (4) and then traverse the next datum; if not, directly traverse the next datum, until all data have been traversed once in sequence;

After the calculation is completed, the relation weights form the set W = {w_xy}, where the subscript x takes values from 1 up to the number of clusters in M, and the subscript y takes values from x upward;

Step 3.2: calculate the density similarity between clusters: for the cluster set M obtained at the second stage, first calculate the intra-cluster density ρ_i of each cluster:

ρ_i = n_i / S_i  (5)

where n_i is the number of points in the i-th cluster and S_i is the area size of the i-th cluster, giving ρ = {ρ_1, ..., ρ_i, ..., ρ_d}; then calculate the density difference ρ_sim(x, y) between the x-th cluster and the y-th cluster by equation (6), where the subscript x takes values from 1 up to d and the subscript y takes values from x up to d;

Step 3.3: suppose the finally formed cluster set is M_k, where each value of the subscript k corresponds to an independent cluster; initialize the subscript of the finally formed cluster set M_k to k = 1 and the subscript x of the relation weight w_xy to x = 1. The subscript x of w_xy takes values from 1 up to the number of clusters, the subscript y takes values from x upward, and when x = y, w_xy is set to 0. When a relation weight w_xy does not satisfy both the link threshold and the density-similarity threshold, the two small clusters are not processed; when both conditions are satisfied, then if cluster x or cluster y already belongs to some M_k, both are merged into that M_k; otherwise k = k + 1 and both are merged into a new cluster M_k, where identical elements present in the two clusters are combined into the same item;

Step 3.4: the finally formed cluster set is M_k, k = 1, 2, ..., K, and clustering ends.
Region growing in the invention is the process of gradually aggregating a datum or a sub-data-set region into a complete, independent connected region according to a predefined growth rule. For a target region R of interest in the spatial data and seed points z found in advance on R, data in a certain neighborhood of z that satisfy a similarity criterion are gradually merged with z into a seed group for the next stage of growth, and the cyclic growth continues until the growth stop condition is met, completing the process of growing the region of interest from one seed point into an independent connected region. The similarity criterion may be the distance between data, density, or other related attributes. A region growing algorithm is therefore generally implemented in three steps: (1) determine the growing seed points; (2) specify the growth criterion; (3) determine the growth stop condition.
The invention adopts the idea of hierarchical clustering and divides the clustering process into three levels. The first level divides the objects into a number of subclasses based on the distance threshold R1; the second level grows these subclasses by region growing over the not-yet-clustered data; finally, the weights among all cluster centers are calculated based on a competition idea and a density-similarity principle, and the clusters are merged under suitable rules.
The beneficial effects of the invention are as follows:
(1) the first-level distance-based clustering can simplify the algorithm and reduce the complexity of the algorithm.
(2) The second stage can solve the problem of variable scale density data by using a seed region growing method.
(3) The third-level merging part provides both a relation-weight threshold and a density-similarity threshold, so that the merging of small clusters is more reasonable and doubly safeguarded. This effectively solves the non-convex clustering problem and improves merging accuracy.
(4) By utilizing the reasonable design and fusion of the three-level algorithm, the overall algorithm avoids multi-layer iteration and greatly reduces the complexity of the algorithm.
Drawings
FIG. 1 is an overall flow diagram of the method of the present invention;
FIG. 2 is a flow chart of the first level clustering of the algorithm of the present invention;
FIG. 3 is a flow chart of the second level clustering of the algorithm of the present invention;
FIG. 4 is a flow chart of the third level clustering of the algorithm of the present invention;
FIG. 5 is the final clustering result of the algorithm of the present invention applied to an occlusion data set.
Fig. 6 shows the final clustering result of the algorithm of the present invention run on the non-uniform density data set new.
Detailed Description
For the purpose of illustrating the objects, technical solutions and advantages of the present invention, the present invention will be described in further detail below with reference to specific embodiments and accompanying drawings.
Referring to FIGS. 1 to 6, a region growing and competition based hierarchical clustering method for a variable-scale data density space comprises the following steps:
A first stage: the cluster centers are updated by drawing circles with the distance threshold R1, as follows:

Step 1.1: input an unlabeled data set X = {x_1, x_2, ..., x_i, ..., x_N} ∈ R^P, where x denotes a sample point in the data set, P denotes the sample dimension, and N denotes the number of samples. Randomly take the i-th data object x_i from X and store it as the first cluster center point in a set C = { }; then randomly take the j-th data object x_j in X and calculate the Euclidean distance d_ij between x_i and x_j by equation (1):

d_ij = sqrt(Σ_{p=1..P} (x_ip − x_jp)²)  (1)

If d_ij is less than R1 (R1 is 10% of the spatial size of the data set), points x_i and x_j belong to the same class, and a new cluster center point S is calculated according to equation (2) to replace point x_i in C. If d_ij is greater than R1, x_i and x_j are not of the same class, and x_j is also stored as a cluster center, so that the cluster center set C = {x_i, x_j}.

S = (1 − β)·x_i + β·x_j  (2)

In equation (2), S is the updated cluster center and β is the weighting factor (β = 1/16).

Step 1.2: from the data set X (excluding x_i, x_j), randomly take the m-th data object x_m and calculate the set of Euclidean distances {d_1, ..., d_n} from x_m to the points of C, where n is the number of points in the set C; determine the point C_i in the cluster center set closest to x_m, and use points x_m and C_i to update the cluster center by the method of equations (1) and (2).

Step 1.3: repeat steps 1.1 and 1.2 until all points in X have been traversed, obtaining the updated cluster center set C = {C_1, ..., C_i, ..., C_w}, where w is the number of clusters, and the corresponding cluster set M = {C_1{...}, ..., C_i{...}, ..., C_w{...}}.
A second stage: region growing is carried out as follows:

Step 2.1: determine the seed sequence: first traverse all points in the cluster center set C and count the number n_i of points in the i-th cluster, i = 1, 2, ..., w. If n_i is less than minC (minC is 5% of all samples), no cluster is formed: the corresponding cluster center point C_i is deleted from C and the corresponding set C_i{...} is deleted from M, and the remaining cluster center points are stored in a set D as the seed sequence B = {C_1, ..., C_i, ..., C_d}, d <= w.

Step 2.2: define the growth criterion and determine the growth stop condition: take the first cluster center C_1 in the seed sequence B and draw a circle with R1 as radius. Count the number of points n_1 in the circle; if n_1 is greater than minC, continue to draw a circle Q_B1 with C_1 as center and R = R1 + ΔR as radius, and judge whether the points entering the circle Q_B1 belong to D; if so, set i = i + 1 and continue to grow.

ΔR = (e^{sm(x)} / 10) · i² · 0.03  (3)

Here sm(x) is the average distance between the data in the x-th cluster of the set M; the points entering the circle are stored in the corresponding cluster set to obtain the updated M.

Step 2.3: the points obtained after each cluster center's region has grown are not processed as growth objects in the next round; the other cluster center points in C are then traversed by the method of step 2.2 to obtain each cluster center point and the data of its corresponding cluster.
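A sketch of the second stage is given below. It assumes minC = 5% of the sample count, reads equation (3) as e^{sm(x)}/10 · i² · 0.03, and takes "a circle captures no new points" as the growth stop condition; these readings and all names are assumptions where the text is ambiguous.

```python
import numpy as np

def mean_pairwise_distance(P):
    """sm(x): average distance between the data in one cluster."""
    if len(P) < 2:
        return 0.0
    d = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    return float(d[np.triu_indices(len(P), k=1)].mean())

def grow_regions(X, centers, members, R1, minc_frac=0.05):
    minC = minc_frac * len(X)
    # Step 2.1: the seed sequence B keeps only centers with enough first-stage points.
    seeds = [(c, m) for c, m in zip(centers, members) if len(m) >= minC]
    claimed = np.zeros(len(X), dtype=bool)  # grown points are not regrown (step 2.3)
    clusters = []
    for c, m in seeds:
        own = list(m)
        claimed[m] = True
        sm = mean_pairwise_distance(X[m])
        i, R = 1, R1                        # start from a circle of radius R1
        while True:
            R += np.exp(sm) / 10 * i**2 * 0.03          # eq. (3): Delta R
            new = np.where((np.linalg.norm(X - c, axis=1) <= R) & ~claimed)[0]
            if len(new) == 0:               # assumed growth stop condition
                break
            claimed[new] = True
            own.extend(int(j) for j in new)
            i += 1
        clusters.append(own)
    return clusters
```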
And a third stage: and calculating the relation weight among all cluster centers of the clusters by a competition-based idea, and adopting a proper rule to merge the clusters.
After the data set X is subjected to the second-level clustering, if all cluster centers carry out the second-level clustering on the data XiIn the competition process, the winner is the heart of the cluster respectivelyAndgetWhen d has a value in a certain range, we consider the clusterHezhou clusterThere is a relational weight. When d < ═ 2.5, the algorithm has better clustering quality. And taking d < 2.5 as existence criterion of the relation weight. Increase criterion of the relational weight: by usingExpressing the weight of the relationship between the two small clusters, the calculation method is as follows (4)
Step 3.1: first, for a data set X ═ X1,...,Xi,...,XNFrom the first data X1Starting to traverse in sequence, and finding out two winners of all cluster centers in the process of competing for data for each specific dataAndthen, the clusters corresponding to the two winners are judged according to the relation weight existence criterionAndif the cluster with the weight exists, the relationship weight is increased according to a formula (4), and then the next data is traversed; if no relationship weight exists, the next data is directly traversed. Until all data has been traversed once in turn.
After the calculation of the relationship weight is completed, the relationship weight is formed asWhere subscript x takes on values from 1 up to M and superscript y takes on values from x up to M.
Step 3.2: calculating the density similarity between each cluster by first clustering the second-level clustersM, calculating the intra-cluster density ρ of each clusteri:
ρi=ni/Si (5)
niIs the number of points included in the ith cluster, SiIs the area size of the ith cluster. ρ ═ ρ1,...,ρi,...,ρdAnd calculating a density difference between the x-th cluster and the y-th clusterNamely:
subscript x takes on values from 1 up to d, and superscript y takes on values from x up to d.
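Step 3.2 can be sketched as follows. Since neither the computation of S_i nor equation (6) is reproduced in the text, the sketch assumes the area of the cluster's bounding circle for S_i and an absolute density difference for ρ_sim; both are assumptions.

```python
import numpy as np

def cluster_densities(X, clusters):
    """Sketch of step 3.2: intra-cluster densities (eq. 5) and pairwise differences."""
    rho = []
    for m in clusters:
        P = X[m]
        r = np.linalg.norm(P - P.mean(axis=0), axis=1).max()  # radius of the grown region
        S = np.pi * r**2 + 1e-12               # assumed area size S_i
        rho.append(len(m) / S)                 # eq. (5): rho_i = n_i / S_i
    rho = np.array(rho)
    sim = np.abs(rho[:, None] - rho[None, :])  # assumed form of eq. (6)
    return rho, sim
```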
Step 3.3: when in useAnd isIn the middle of the time, clusterHezhou clusterMay be combined. (experiments have found that a reasonable sum of the number of all data in two small clusters with link thresholds of about 40% to 50% of the weight of the relevant system is good.Sim represents the difference between the two densities, i.e., smaller is better, and a number less than 1.5 is used.)
Assume that the final cluster set formed is MkWherein each value of the subscript k corresponds to an independent cluster, and a finally formed cluster set M is subjected tokThe subscript of (a) is initialized to k 1,relationship weightThe subscript x is initialized to x ═ 1.
Relationship weightStarting from 1 up to M, the subscript x of (1) weights the relationshipThe superscript y takes values from x up to M whenWhen x is equal to y, letSatisfy the requirement of
Relationship weightNot satisfying the conditionAnd isIn time, the small clusters are not processed; relationship weightSatisfy the requirement ofAnd isWhen it is in condition, ifOrThenAndare simultaneously merged into MkIn (1),otherwise k is k +1, simultaneouslyAndmerge into a new cluster MkIn (1),whereinAndthe same elements present in (a) are combined into the same item;
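A sketch of the merging rule of step 3.3 follows, assuming a link threshold of 45% of the summed sizes of the two small clusters (the 40%-50% range above) and the density-similarity threshold 1.5; the grouping into the final sets M_k is done with a small union-find, which is one way to realize "merge into the same M_k".

```python
def merge_clusters(clusters, W, sim, sim_max=1.5, link_frac=0.45):
    """Sketch of step 3.3: merge small clusters that pass both thresholds."""
    parent = list(range(len(clusters)))        # union-find over the small clusters
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]      # path halving
            i = parent[i]
        return i
    n = len(clusters)
    for x in range(n):
        for y in range(x + 1, n):
            link = link_frac * (len(clusters[x]) + len(clusters[y]))  # assumed threshold
            if W[x, y] >= link and sim[x, y] <= sim_max:
                parent[find(x)] = find(y)      # merge into the same M_k
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())               # final cluster sets M_k
```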
Step 3.4: the finally formed cluster set is M_k, k = 1, 2, ..., K.
The effects of the present invention can be further illustrated by the following simulation experiments.
1) Simulation conditions
The operating system used for the experiments is Windows 10, the simulation software is Matlab R2018b (64-bit), the processor is an Intel(R) Core(TM) i7, and the installed memory is 8.00 GB.
Table 1 lists part of the UCI real data:
TABLE 1
2) Simulation results
The algorithm of the invention, the DBSCAN algorithm, and the K-means algorithm are compared on a UCI data set with scale transformation and on a group of artificial scale-transformed data sets (new). To further verify the performance of the algorithm on real data, experiments are carried out on the 4 data sets in Table 1, and the common ACC and F-measure indices are used to evaluate the clustering results; both range over [0, 1], and larger values indicate better clustering.
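As a note on the evaluation, the ACC index can be computed by Hungarian matching of predicted to true cluster labels, e.g. with the sketch below (assuming SciPy is available and the labels are integer arrays); the F-measure is computed analogously from the matched labels.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_acc(y_true, y_pred):
    """ACC in [0, 1]: accuracy under the best one-to-one label mapping."""
    D = int(max(y_pred.max(), y_true.max())) + 1
    count = np.zeros((D, D), dtype=int)
    for t, p in zip(y_true, y_pred):
        count[p, t] += 1                       # co-occurrence of predicted/true labels
    row, col = linear_sum_assignment(-count)   # maximize matched agreement
    return count[row, col].sum() / len(y_true)
```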
TABLE 2
As can be seen from Table 2, the method of the present invention obtains better results than the conventional DBSCAN and K-means algorithms, and its running time is lower than that of DBSCAN, especially when the amount of data is large. It therefore has good practical engineering application value.
Details not described in this specification belong to the prior art well known to those skilled in the art.
Claims (1)
1. A method for hierarchical clustering of tourist heading data, the method comprising the steps of:
a first stage: the cluster centers are updated by drawing circles with the distance threshold R1, as follows:
step 1.1: input an unlabeled data set X = {x_1, x_2, ..., x_i, ..., x_N} ∈ R^P; randomly take the i-th data object x_i from X and store it as the first cluster center point in a set C = { }; then randomly take the j-th data object x_j in X and calculate the Euclidean distance d_ij between x_i and x_j by equation (1):

d_ij = sqrt(Σ_{p=1..P} (x_ip − x_jp)²)  (1)

if d_ij is less than R1, R1 being 10% of the spatial size of the data set, points x_i and x_j belong to the same class, and a new cluster center point S is calculated according to equation (2) to replace point x_i in C; if d_ij is greater than R1, x_i and x_j are not of the same class, and x_j is also stored as a cluster center, so that the cluster center set C = {x_i, x_j};

S = (1 − β)·x_i + β·x_j  (2)

wherein S in equation (2) is the updated cluster center and β is a weight coefficient;
step 1.2: from the data set X, excluding x_i and x_j, randomly take the m-th data object x_m and calculate the set of Euclidean distances {d_1, ..., d_n} from x_m to the points of C, n being the number of points in the set C; determine the point C_i in the cluster center set closest to x_m, and use points x_m and C_i to update the cluster center by the method of equations (1) and (2);
step 1.3: repeat steps 1.1 and 1.2 until all points in X have been traversed, obtaining the updated cluster center set C = {C_1, ..., C_i, ..., C_w}, w being the number of clusters, and the corresponding cluster set M = {C_1{...}, ..., C_i{...}, ..., C_w{...}};
a second stage: region growing is carried out as follows:
step 2.1: determine the seed sequence: first traverse all points in the cluster center set C and count the number n_i of points in the i-th cluster, i = 1, 2, ..., w; if n_i is less than minC, delete the corresponding cluster center point C_i from C and the corresponding set C_i{...} from M, and store the remaining cluster center points in a set D as the seed sequence B = {C_1, ..., C_i, ..., C_d}, d <= w;
step 2.2: define the growth criterion and determine the growth stop condition: take the first cluster center C_1 in the seed sequence B, draw a circle with R1 as radius, and count the number of points n_1 in the circle; if n_1 is greater than minC, continue to draw a circle Q_B1 with C_1 as center and R = R1 + ΔR as radius, and judge whether the points entering the circle Q_B1 belong to D; if so, set i = i + 1 and continue to grow;

ΔR = (e^{sm(x)} / 10) · i² · 0.03  (3)

wherein sm(x) is the average distance between the data in the x-th cluster of the set M; the points entering the circle are stored in the corresponding cluster set to obtain the updated M;
step 2.3: the points obtained after each cluster center's region has grown are not processed as growth objects in the next round; the other cluster center points in C are then traversed by the method of step 2.2 to obtain each cluster center point and the data of its corresponding cluster;
a third stage: calculate the relation weights and density similarity among all cluster centers based on a competition idea, and merge the clusters under suitable rules;
after the second-level clustering of the data set X, suppose all cluster centers compete for each data point X_i; the two winners are the cluster centers closest and second closest to X_i; take d as the ratio of the two winning distances; when d has a value in a certain range, the clusters of the two winners are considered to have a relation weight; the increase criterion of the relation weight: the relation weight between two small clusters is expressed as w_xy and calculated as in equation (4);
step 3.1: for the data set X = {X_1, ..., X_i, ..., X_N}, traverse sequentially from the first datum X_1; for each specific datum, find the two winners among all cluster centers competing for it, and judge by the relation-weight existence criterion whether the clusters corresponding to the two winners have a relation weight; if so, increase the relation weight of the pair according to equation (4) and then traverse the next datum; if not, directly traverse the next datum, until all data have been traversed once in sequence;
after the calculation is completed, the relation weights form the set W = {w_xy}, wherein the subscript x takes values from 1 up to the number of clusters in M, and the subscript y takes values from x upward;
step 3.2: calculate the density similarity between clusters: for the cluster set M clustered at the second stage, first calculate the intra-cluster density ρ_i of each cluster:

ρ_i = n_i / S_i  (5)

wherein n_i is the number of points in the i-th cluster and S_i is the area size of the i-th cluster, giving ρ = {ρ_1, ..., ρ_i, ..., ρ_d}; then calculate the density difference ρ_sim(x, y) between the x-th cluster and the y-th cluster by equation (6), wherein the subscript x takes values from 1 up to d and the subscript y takes values from x up to d;
step 3.3: suppose the finally formed cluster set is M_k, wherein each value of the subscript k corresponds to an independent cluster; initialize the subscript of the finally formed cluster set M_k to k = 1 and the subscript x of the relation weight w_xy to x = 1; the subscript x of w_xy takes values from 1 up to the number of clusters, the subscript y takes values from x upward, and when x = y, w_xy is set to 0; when a relation weight w_xy does not satisfy both the link threshold and the density-similarity threshold, the two small clusters are not processed; when both conditions are satisfied, then if cluster x or cluster y already belongs to some M_k, both are merged into that M_k; otherwise k = k + 1 and both are merged into a new cluster M_k, wherein identical elements present in the two clusters are combined into the same item;
step 3.4: the finally formed cluster set is M_k, k = 1, 2, ..., K.
Priority Application (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201910812062.5A | 2019-08-30 | 2019-08-30 | Hierarchical clustering method for tourist heading data
Publications (2)

Publication Number | Publication Date
---|---
CN110728293A | 2020-01-24
CN110728293B | 2021-10-29
Legal Events

Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant