CN103207896A - Method and system for stable and efficient self-adaptive clustering - Google Patents
- Publication number
- CN103207896A
- Authority
- CN
- China
- Prior art keywords
- input data
- cluster
- candidate
- selecting
- probability
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a stable and efficient self-adaptive clustering method and system. The method comprises the following steps: a, obtaining a set $p = \{p_1, \dots, p_n\}$ of n input data and a cluster-radius threshold $\theta$; b, adding $p_i$, together with every input datum in the set whose distance from $p_i$ is less than $\theta$, to the candidate cluster $C_{p_i}$ corresponding to $p_i$, where $p_i$ denotes the i-th input datum in the set; and c, letting m be the number of input data in the candidate cluster $C_{p_i}$ and $d(p_i, p_j)$ be the distance between two input data $p_i$ and $p_j$, calculating the probability that $p_i$ is a cluster center. The method and system provide stable and efficient self-adaptive clustering: the number of final clusters need not be preset, the computation is efficient with a complexity of $O(n^2)$, and the approach is suitable for current mobile intelligent terminals.
Description
Technical field
The present invention relates to the technical field of computer information processing, and in particular to a stable and efficient self-adaptive clustering method and system.
Background art
With the rapid growth of computerized information, the demand for processing all kinds of such information keeps increasing. Clustering algorithms are an important class of information-processing algorithms: they provide the basic clustering function for data management, artificial intelligence, and machine learning, and play an important role in many kinds of information processing.
Now that intelligent mobile terminals are in general use, many information services based on intelligent mobile devices have appeared. These services must be delivered to terminals efficiently and stably, and many of them rely on clustering algorithms, for example clustering friends in mobile social networks or clustering commodities in shopping applications. Many mobile terminals can now locate themselves through GPS, base stations, WAP, and similar means, which has given rise to many location-based services. For this class of services, clustering can provide richer and more useful functions, such as clustering hot zones by category.
As a simple example, users often add geographic tags of many kinds, such as shopping, food, and scenic spots, to an electronic map, and these tags are scattered over the whole map. A smartphone user who is travelling or shopping usually wants to find a popular commercial district of interest, that is, a place where tags of a certain class are dense, such as a concentrated shopping district, and then obtain navigation to it. Querying "shopping" on a current mobile map, however, only returns "shopping" tags scattered over the whole map, which makes it hard for the user to choose a destination. By clustering these "shopping" tags effectively, that is, dividing the tags into several dense partitions (clusters), popular "shopping" districts can be found quickly. By integrating the clustering results for several tags, such as "shopping" and "food", the user can be helped to find popular districts that satisfy several requirements at once. Clustering can therefore enable many rich applications on new mobile devices, but the diversity of mobile applications and the limited computing resources of mobile terminals impose new requirements on clustering methods: they must be adaptive, stable, and efficient.
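For the map scenario above, a clustering method needs a distance between two geographic tags. The following is a minimal sketch, assuming tags are stored as (category, latitude, longitude) records and that the great-circle (haversine) distance is an acceptable metric. The record layout, the sample coordinates, and the names `haversine_m` and `points_for` are illustrative assumptions and do not come from the patent.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (common approximation)

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

# Hypothetical tag records: (category, latitude, longitude).
tags = [
    ("shopping", 39.9087, 116.3975),
    ("shopping", 39.9090, 116.3980),
    ("food", 39.9100, 116.4000),
    ("shopping", 39.9900, 116.3000),
]

def points_for(category, records):
    """Keep only the coordinates of tags in one category, e.g. 'shopping'."""
    return [(lat, lon) for cat, lat, lon in records if cat == category]

shopping = points_for("shopping", tags)
```

The filtered list `shopping` can then be passed to any threshold-based clustering routine with `haversine_m` as its distance function and a radius expressed in metres.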
Many clustering methods already exist, such as the commonly used k-means and expectation-maximization (EM) methods. Although they are fast and simple to implement, they require the number of final partitions to be set in advance, which limits how widely they can be applied, because in most applications the user cannot know the number of partitions beforehand, for example how many food centers a city actually has. In addition, both methods are unstable: repeated runs may produce inconsistent clustering results. Another method, called QT, does not need the number of partitions in advance and produces stable results, but it requires $O(n^3)$ computation. Given huge volumes of information, such a cost is often unaffordable for mobile devices with limited computing resources.
Summary of the invention
The object of the present invention is to provide a stable and efficient self-adaptive clustering method and system that solves the problem of high computational cost.
To this end, the present invention adopts the following technical solutions:
A stable and efficient self-adaptive clustering method, comprising:
a. obtaining a set $p = \{p_1, \dots, p_n\}$ of input data, the set containing n input data, and obtaining a cluster-radius threshold $\theta$;
b. adding $p_i$, and every input datum in the set whose distance from $p_i$ is less than the threshold $\theta$, to the candidate cluster $C_{p_i}$ corresponding to $p_i$, where $p_i$ denotes the i-th input datum in the set;
c. letting m be the number of input data in the candidate cluster $C_{p_i}$ and the function $d(p_i, p_j)$ be the distance between two input data $p_i$ and $p_j$, calculating the probability that $p_i$ is a cluster center as [formula not reproduced in this text], where $1 \le j \le m$;
d. selecting, from the input data of the set, the input datum with the greatest probability of being a cluster center, and adding the candidate cluster corresponding to the selected input datum to the final clusters.
Further, after the candidate cluster corresponding to the selected input datum is added to the final clusters, the method further comprises:
e. deleting from the input data set the input data that have been added to the final clusters, selecting again, from the current input data set, the input datum with the greatest probability of being a cluster center, and adding the candidate cluster corresponding to the selected input datum to the final clusters;
determining whether the number of input data in the set is zero; if so, ending; otherwise, continuing with step e.
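Steps a through e can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation. The formula for the cluster-center probability is not reproduced in this text, so the score used here (the sum of $1 - d(p_i, p_j)/\theta$ over the candidate cluster) is a stand-in chosen only so that denser, tighter neighborhoods score higher. The sketch also assumes that candidate clusters and scores are computed once, that points already assigned to a final cluster are skipped when a later candidate cluster is added, and that ties go to the lowest index. The function name `adaptive_cluster` is hypothetical.

```python
import math

def adaptive_cluster(points, theta, dist=math.dist):
    """Greedy threshold clustering following steps a-e (the scoring rule is assumed)."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    n = len(points)
    # Step b: the candidate cluster of p_i holds p_i and every point closer than theta.
    # Stored as (index, distance) pairs; this pairwise pass is the O(n^2) part.
    candidates = []
    for i in range(n):
        near = []
        for j in range(n):
            d_ij = dist(points[i], points[j])
            if d_ij < theta:
                near.append((j, d_ij))
        candidates.append(near)
    # Step c: ASSUMED stand-in score, not the formula from the patent.
    scores = [sum(1.0 - d_ij / theta for _, d_ij in near) for near in candidates]
    alive = [True] * n
    remaining = n
    clusters = []
    while remaining:  # final check of step e: stop when no input data are left
        # Steps d and e: pick the highest score among the remaining points;
        # max() returns the first maximal item, so ties go to the lowest index.
        center = max((i for i in range(n) if alive[i]), key=scores.__getitem__)
        # Assumption: members already placed in an earlier cluster are skipped.
        members = [j for j, _ in candidates[center] if alive[j]]
        for j in members:
            alive[j] = False
        remaining -= len(members)
        clusters.append((points[center], [points[j] for j in members]))
    return clusters

sample = [(0, 0), (1, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
result = adaptive_cluster(sample, theta=2.0)
```

Each entry of `result` pairs a cluster center with its members, so the centers are obtained together with the clusters. Because every choice in the loop is deterministic, running the function twice on the same input list returns the same clusters.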
A stable and efficient self-adaptive clustering system, comprising:
an initialization module, configured to obtain a set $p = \{p_1, \dots, p_n\}$ of input data, the set containing n input data, and to obtain a cluster-radius threshold $\theta$;
a candidate cluster building module, configured to add $p_i$, and every input datum in the set whose distance from $p_i$ is less than the threshold $\theta$, to the candidate cluster $C_{p_i}$ corresponding to $p_i$, where $p_i$ denotes the i-th input datum in the set;
a probability calculation module, configured to let m be the number of input data in the candidate cluster $C_{p_i}$ and the function $d(p_i, p_j)$ be the distance between two input data $p_i$ and $p_j$, and to calculate the probability that $p_i$ is a cluster center as [formula not reproduced in this text];
a cluster screening module, configured to select, from the input data of the set, the input datum with the greatest probability of being a cluster center, and to add the candidate cluster corresponding to the selected input datum to the final clusters.
Further, the system also comprises:
a deletion module, configured to delete from the input data set the input data that have been added to the final clusters, to select again, from the current input data set, the input datum with the greatest probability of being a cluster center, and to add the candidate cluster corresponding to the selected input datum to the final clusters;
a first detection module, configured to determine whether the number of input data in the set is zero; if so, ending; otherwise, continuing with step e.
The beneficial effects of the present invention are as follows. By providing a stable and efficient self-adaptive clustering method and system, the invention proposes a new clustering method. The method does not need the number of partitions to be set in advance and adapts to the input data to produce an appropriate partition. It produces stable clustering results for identical input. Compared with traditional methods, it is also computationally efficient: it achieves $O(n^2)$ computational complexity and is suitable for current mobile intelligent terminals.
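To give a rough sense of the gap between the $O(n^2)$ cost stated here and the $O(n^3)$ cost attributed to QT in the background section, the following worked arithmetic compares the two growth rates for an illustrative input size. The value $n = 10^4$ is an assumption chosen for the example, and constant factors are ignored, so the figures indicate orders of magnitude only.

```latex
\begin{aligned}
n &= 10^{4} \\
n^{2} &= 10^{8} \quad \text{(quadratic growth, as stated for this method)} \\
n^{3} &= 10^{12} \quad \text{(cubic growth, as stated for QT)} \\
\frac{n^{3}}{n^{2}} &= n = 10^{4}
\end{aligned}
```

Under these assumptions the ratio between the two costs grows linearly with the number of input data.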
Description of drawings
Fig. 1 is a flowchart of the first embodiment of the stable and efficient self-adaptive clustering method of the present invention;
Fig. 2 is a flowchart of the second embodiment of the stable and efficient self-adaptive clustering method of the present invention;
Fig. 3 is a block diagram of the first embodiment of the stable and efficient self-adaptive clustering system of the present invention;
Fig. 4 is a block diagram of the second embodiment of the stable and efficient self-adaptive clustering system of the present invention.
Embodiments
The technical solution of the present invention is further described below with reference to the accompanying drawings and specific embodiments.
The flow of the first embodiment of the stable and efficient self-adaptive clustering method of the present invention is shown in Fig. 1.
This embodiment proposes a stable and efficient self-adaptive clustering method. The user does not need to set the number of clusters in advance, and the clustering result is generated adaptively. Identical input data yield stable clustering results. Compared with traditional algorithms the method is efficient, with a computational complexity of $O(n^2)$.
The flow of the second embodiment of the stable and efficient self-adaptive clustering method of the present invention is shown in Fig. 2.
Let the probability that each input datum $p_i$ becomes a cluster center be $q_i$, and initialize every $q_i$ to 0. Let $C_{p_i}$ be the cluster corresponding to $p_i$, and initialize it [initial value not reproduced in this text]. Initialize the clustering result [initial value not reproduced in this text]. Let the function $d(p_i, p_j)$ be the measure of the distance between two input data $p_i$ and $p_j$.
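The initialization described for the second embodiment can be sketched as a small state object. The text states that every $q_i$ starts at 0. The initial values of $C_{p_i}$ and of the clustering result are not legible in this text, so starting them as empty lists is an assumption. The names `ClusterState`, `init_state`, and `d`, and the use of Euclidean distance for $d(p_i, p_j)$, are likewise illustrative choices rather than details from the patent.

```python
import math
from dataclasses import dataclass, field

@dataclass
class ClusterState:
    points: list                                    # input data p_1 .. p_n
    theta: float                                    # cluster-radius threshold
    q: list = field(default_factory=list)           # q_i: probability that p_i is a center
    candidates: list = field(default_factory=list)  # C_{p_i}: candidate cluster of p_i
    result: list = field(default_factory=list)      # final clustering result

def init_state(points, theta):
    """Build the starting state: q_i = 0 as stated; empty clusters are ASSUMED."""
    n = len(points)
    return ClusterState(
        points=list(points),
        theta=theta,
        q=[0.0] * n,
        candidates=[[] for _ in range(n)],  # one independent list per point
        result=[],
    )

def d(p_i, p_j):
    """Distance measure between two input data (Euclidean here, as an assumption)."""
    return math.dist(p_i, p_j)

state = init_state([(0, 0), (3, 4)], theta=2.0)
```

Any other metric with the same two-argument signature could replace `d` without changing the state layout.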
The block diagram of the first embodiment of the stable and efficient self-adaptive clustering system of the present invention is shown in Fig. 3. The system comprises an initialization module 310, a candidate cluster building module 320, a probability calculation module 330, and a cluster screening module 340.
The initialization module 310 is configured to obtain a set $p = \{p_1, \dots, p_n\}$ of input data, the set containing n input data, and to obtain a cluster-radius threshold $\theta$. The candidate cluster building module 320 is configured to add $p_i$, and every input datum in the set whose distance from $p_i$ is less than the threshold $\theta$, to the candidate cluster $C_{p_i}$ corresponding to $p_i$, where $p_i$ denotes the i-th input datum in the set. The probability calculation module 330 is configured to let m be the number of input data in the candidate cluster $C_{p_i}$ and the function $d(p_i, p_j)$ be the distance between two input data $p_i$ and $p_j$, and to calculate the probability that $p_i$ is a cluster center as [formula not reproduced in this text]. The cluster screening module 340 is configured to select, from the input data of the set, the input datum with the greatest probability of being a cluster center, and to add the candidate cluster corresponding to the selected input datum to the final clusters.
The block diagram of the second embodiment of the stable and efficient self-adaptive clustering system of the present invention is shown in Fig. 4. The system comprises an initialization module 310, a candidate cluster building module 320, a probability calculation module 330, a cluster screening module 340, a deletion module 350, and a first detection module 360.
The initialization module 310 is configured to obtain a set $p = \{p_1, \dots, p_n\}$ of input data, the set containing n input data, and to obtain a cluster-radius threshold $\theta$. The candidate cluster building module 320 is configured to add $p_i$, and every input datum in the set whose distance from $p_i$ is less than the threshold $\theta$, to the candidate cluster $C_{p_i}$ corresponding to $p_i$, where $p_i$ denotes the i-th input datum in the set. The probability calculation module 330 is configured to let m be the number of input data in the candidate cluster $C_{p_i}$ and the function $d(p_i, p_j)$ be the distance between two input data $p_i$ and $p_j$, and to calculate the probability that $p_i$ is a cluster center as [formula not reproduced in this text], where $1 \le j \le m$. The cluster screening module 340 is configured to select, from the input data of the set, the input datum with the greatest probability of being a cluster center, and to add the candidate cluster corresponding to the selected input datum to the final clusters. The deletion module 350 is configured to delete from the input data set the input data that have been added to the final clusters, to select again, from the current input data set, the input datum with the greatest probability of being a cluster center, and to add the candidate cluster corresponding to the selected input datum to the final clusters. The first detection module 360 is configured to determine whether the number of input data in the set is zero; if so, ending; otherwise, continuing with step e.
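One way to picture how modules 310 through 360 cooperate is a class with one method per module. The sketch below is illustrative only. Because the probability formula is not reproduced in this text, the scoring rule is passed in as a caller-supplied function `score_fn`, and the example uses a simple neighbor count as a placeholder. The class name `ClusteringSystem`, the method names, the lowest-index tie-break, and the choice to skip members that were already deleted are assumptions, not details taken from the patent.

```python
import math

class ClusteringSystem:
    def __init__(self, score_fn, dist=math.dist):
        self.score_fn = score_fn  # stand-in for the probability formula (assumed)
        self.dist = dist

    def initialize(self, points, theta):  # initialization module 310
        self.points, self.theta = list(points), theta
        self.active = set(range(len(self.points)))
        self.final = []

    def build_candidates(self):  # candidate cluster building module 320
        pts, th, d = self.points, self.theta, self.dist
        self.candidates = {
            i: [j for j in self.active if d(pts[i], pts[j]) < th] for i in self.active
        }

    def compute_probabilities(self):  # probability calculation module 330
        pts, d = self.points, self.dist
        self.q = {
            i: self.score_fn([d(pts[i], pts[j]) for j in members], self.theta)
            for i, members in self.candidates.items()
        }

    def screen(self):  # cluster screening module 340
        # Highest score wins; ties go to the lowest index (assumed rule).
        center = min(self.active, key=lambda i: (-self.q[i], i))
        members = sorted(j for j in self.candidates[center] if j in self.active)
        self.final.append((center, members))
        return members

    def delete(self, members):  # deletion module 350
        self.active.difference_update(members)

    def is_done(self):  # first detection module 360
        return not self.active

    def run(self, points, theta):
        self.initialize(points, theta)
        self.build_candidates()
        self.compute_probabilities()
        while not self.is_done():
            self.delete(self.screen())
        return self.final

# Placeholder scoring rule: the number of points in the candidate cluster.
system = ClusteringSystem(score_fn=lambda dists, theta: len(dists))
final = system.run([(0, 0), (1, 0), (0, 1), (10, 10)], theta=2.0)
```

Keeping the scoring rule behind `score_fn` means the surrounding module structure does not depend on any particular formula.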
The advantages of the stable and efficient self-adaptive clustering method proposed by this invention include the following. First, the method does not need the number of final clusters to be set in advance; it adapts to the input data to produce an appropriate clustering and can be used in many application scenarios. Second, the method produces stable clustering results for identical input, so it can be run repeatedly while keeping the service consistent. Third, compared with traditional methods, the method is computationally efficient: it achieves $O(n^2)$ computational complexity and is suitable for current mobile intelligent terminals. Fourth, when the method obtains the final clusters, it also obtains the center point of each cluster.
A person of ordinary skill in the art will understand that all or part of the flows of the method embodiments above can be carried out by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, can include the flows of the method embodiments described above. The storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random-access memory (RAM), or the like.
The technical principle of the present invention has been described above with reference to specific embodiments. These descriptions serve only to explain the principle of the invention and must not be interpreted as limiting its scope of protection in any way. Based on the explanation herein, a person skilled in the art can conceive of other embodiments of the invention without creative effort, and all such embodiments fall within the scope of protection of the present invention.
Claims (4)
1. A stable and efficient self-adaptive clustering method, characterized by comprising:
a. obtaining a set $p = \{p_1, \dots, p_n\}$ of input data, the set containing n input data, and obtaining a cluster-radius threshold $\theta$;
b. adding $p_i$, and every input datum in the set whose distance from $p_i$ is less than the threshold $\theta$, to the candidate cluster $C_{p_i}$ corresponding to $p_i$, where $p_i$ denotes the i-th input datum in the set;
c. letting m be the number of input data in the candidate cluster $C_{p_i}$ and the function $d(p_i, p_j)$ be the distance between two input data $p_i$ and $p_j$, calculating the probability that $p_i$ is a cluster center as [formula not reproduced in this text], where $1 \le j \le m$;
d. selecting, from the input data of the set, the input datum with the greatest probability of being a cluster center, and adding the candidate cluster corresponding to the selected input datum to the final clusters.
2. The stable and efficient self-adaptive clustering method according to claim 1, characterized in that, after the candidate cluster corresponding to the selected input datum is added to the final clusters, the method further comprises:
e. deleting from the input data set the input data that have been added to the final clusters, selecting again, from the current input data set, the input datum with the greatest probability of being a cluster center, and adding the candidate cluster corresponding to the selected input datum to the final clusters;
determining whether the number of input data in the set is zero; if so, ending; otherwise, continuing with step e.
3. A stable and efficient self-adaptive clustering system, characterized by comprising:
an initialization module, configured to obtain a set $p = \{p_1, \dots, p_n\}$ of input data, the set containing n input data, and to obtain a cluster-radius threshold $\theta$;
a candidate cluster building module, configured to add $p_i$, and every input datum in the set whose distance from $p_i$ is less than the threshold $\theta$, to the candidate cluster $C_{p_i}$ corresponding to $p_i$, where $p_i$ denotes the i-th input datum in the set;
a probability calculation module, configured to let m be the number of input data in the candidate cluster $C_{p_i}$ and the function $d(p_i, p_j)$ be the distance between two input data $p_i$ and $p_j$, and to calculate the probability that $p_i$ is a cluster center as [formula not reproduced in this text];
a cluster screening module, configured to select, from the input data of the set, the input datum with the greatest probability of being a cluster center, and to add the candidate cluster corresponding to the selected input datum to the final clusters.
4. The system according to claim 3, characterized by further comprising:
a deletion module, configured to delete from the input data set the input data that have been added to the final clusters, to select again, from the current input data set, the input datum with the greatest probability of being a cluster center, and to add the candidate cluster corresponding to the selected input datum to the final clusters;
a first detection module, configured to determine whether the number of input data in the set is zero; if so, ending; otherwise, continuing with step e.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310082671.2A CN103207896B (en) | 2013-03-14 | 2013-03-14 | Method and system for stable and efficient self-adaptive clustering |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310082671.2A CN103207896B (en) | 2013-03-14 | 2013-03-14 | Method and system for stable and efficient self-adaptive clustering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103207896A (en) | 2013-07-17 |
CN103207896B (en) | 2017-02-01 |
Family
ID=48755118
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310082671.2A (Expired - Fee Related), CN103207896B (en) | Method and system for stable and efficient self-adaptive clustering | 2013-03-14 | 2013-03-14 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103207896B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103559502A (en) * | 2013-10-25 | 2014-02-05 | 华南理工大学 | Pedestrian detection system and method based on adaptive clustering analysis |
CN104702432A (en) * | 2014-01-15 | 2015-06-10 | 杭州海康威视系统技术有限公司 | Alarm method based on position area division and server |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101308496A (en) * | 2008-07-04 | 2008-11-19 | 沈阳格微软件有限责任公司 | Large scale text data external clustering method and system |
EP2184692A1 (en) * | 2008-10-28 | 2010-05-12 | Sony Corporation | Information processing |
CN101989281A (en) * | 2009-08-03 | 2011-03-23 | 中国移动通信集团公司 | Clustering method and device |
CN102289478A (en) * | 2011-08-01 | 2011-12-21 | 江苏广播电视大学 | System and method for recommending video on demand based on fuzzy clustering |
2013-03-14: CN application CN201310082671.2A, patent CN103207896B (en), not active (Expired - Fee Related)
Non-Patent Citations (1)
Title |
---|
段明秀: ""结合SOFM的改进CLARA聚类算法"", 《计算机工程》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103559502A (en) * | 2013-10-25 | 2014-02-05 | 华南理工大学 | Pedestrian detection system and method based on adaptive clustering analysis |
CN104702432A (en) * | 2014-01-15 | 2015-06-10 | 杭州海康威视系统技术有限公司 | Alarm method based on position area division and server |
CN104702432B (en) * | 2014-01-15 | 2018-03-30 | 杭州海康威视系统技术有限公司 | The method and server alerted based on band of position division |
Also Published As
Publication number | Publication date |
---|---|
CN103207896B (en) | 2017-02-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105427129A (en) | Information delivery method and system | |
CN104796481A (en) | Intelligent audio and video selection method | |
Mo et al. | A two-stage clustering approach for multi-region segmentation | |
CN106162544B (en) | A kind of generation method and equipment of geography fence | |
CN106651213A (en) | Processing method and device for service orders | |
CN109902250A (en) | Sharing method, sharing means, computer equipment and the storage medium of questionnaire survey | |
CN109408522A (en) | A kind of update method and device of user characteristic data | |
Qu et al. | Fl-sec: Privacy-preserving decentralized federated learning using signsgd for the internet of artificially intelligent things | |
CN112950119A (en) | Method, device, equipment and storage medium for splitting instant logistics order | |
CN109492891A (en) | Customer churn prediction technique and device | |
CN112597399B (en) | Graph data processing method and device, computer equipment and storage medium | |
CN103207896A (en) | Method and system for stable and efficient self-adaptive clustering | |
CN112307247B (en) | Distributed face retrieval system and method | |
CN109451334A (en) | User, which draws a portrait, generates processing method, device and electronic equipment | |
CN108932525A (en) | A kind of behavior prediction method and device | |
CN111932302A (en) | Method, device, equipment and system for determining number of service sites in area | |
CN108830298A (en) | A kind of method and device of determining user characteristics label | |
CN104156475A (en) | Geographic information reading method and device | |
CN105512914A (en) | Information processing method and electronic device | |
CN112200644B (en) | Method and device for identifying fraudulent user, computer equipment and storage medium | |
CN109933679A (en) | Object type recognition methods, device and equipment in image | |
CN115292475A (en) | Cloud computing service information processing method and system based on smart city | |
CN111882421B (en) | Information processing method, wind control method, device, equipment and storage medium | |
Gelda et al. | Forecasting supply in Voronoi regions for app-based taxi hailing services | |
CN114219581A (en) | Personalized interest point recommendation method and system based on heteromorphic graph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170201 |