CN105282011A - Social group finding method based on cluster fusion algorithm - Google Patents

Info

Publication number
CN105282011A
Authority
CN
China
Prior art keywords
cluster
fusion
clustering
algorithm
benchmark
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510646011.1A
Other languages
Chinese (zh)
Inventor
刘波
余刚
肖燕珊
郝志峰
梁荣德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN201510646011.1A
Publication of CN105282011A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a social group finding method based on a cluster fusion algorithm. First, benchmark classes are generated from social network data so that the social users in the same benchmark class have similar group attributes. A sampler is applied to the base clusterings to generate a variety of base clustering sets. A cluster fusion algorithm is applied to each base clustering set, the resulting fused clusterings are fused again with a cluster fusion algorithm, and candidate benchmarks are generated. A screener is applied to the candidate benchmarks, and a benchmark is obtained according to the set screening conditions. Once the benchmark is obtained, clustering quality is evaluated against it with an external evaluation method. By fusing the decisions of the base clusterings, the method reaches a more accurate and more robust decision. It improves the accuracy of group and individual discovery in social network data, lets a service provider obtain fuller user information and thereby improve service quality, and has great practical value.

Description

Social group finding method based on a cluster fusion algorithm
Technical field
The invention belongs to the technical field of social network group mining, relates to a decision method that uses a cluster fusion algorithm, and in particular relates to a social group finding method based on a cluster fusion algorithm.
Background art
"Internet Plus" is a further practical outcome of Internet thinking. It represents an advanced form of productivity, drives the continuous evolution of economic forms, invigorates social and economic entities, and provides a broad network platform for reform, development, and innovation.
The traditional Internet is now entering a new era, the era of the Social Networking Service, moving from the era of "human and computer" to the era of "person to person". Individual social circles keep expanding and overlapping and eventually form a large social network. A distinguishing feature of a social network is that it supports a huge number of users. Facebook, for example, supports more than 300 million users, and its data centers run more than ten thousand servers to provide information and communication services to users around the world. In addition, any two social network users may interact, so data association operations between any two database users must be supported. This poses a great challenge for server-side database management.
A cloud server (Elastic Compute Service, ECS for short) is a computing service whose processing capacity scales elastically, and it is simpler and more efficient to manage than a physical server. Cloud servers help users quickly build more stable and secure applications, reduce the difficulty of development and operations as well as the overall IT cost, and let users concentrate on innovation in their core business. A fairly complete ecosystem has already been built around them.
The core idea of a cluster fusion algorithm is to fuse multiple clustering algorithms in order to reach a more accurate and more robust decision. On the one hand, the base clusterings come from different base clustering algorithms whose initialization conditions, parameter settings, and even underlying ideas all differ, so each base clustering captures part of the characteristics of the data set. Fusing these different base clusterings reflects the true characteristics of the data set more completely and more accurately. On the other hand, even if some base clusterings carry erroneous information about the data set, the correct information in the large number of other base clusterings corrects it, so a more robust clustering decision can be reached. Because of these properties, cluster fusion algorithms are currently developing vigorously in clustering research.
Summary of the invention
The object of the invention is to provide a social group finding method based on a cluster fusion algorithm. For complex social network data, the cluster fusion algorithm serves as the judgment criterion; a series of unknown social network data is then classified, and on the basis of the resulting classes marketing personnel can provide corresponding services.
The technical solution adopted by the invention is a social group finding method based on a cluster fusion algorithm, implemented according to the following steps:
Step 1: for the data in the social network, obtain the corresponding sampled base clusterings from the respective base clustering algorithms;
Step 2: fuse each sampled base clustering set obtained in step 1 to obtain candidate benchmarks;
Step 3: screen the candidate benchmarks obtained in step 2; the candidate benchmark with the highest score is taken as the optimal benchmark;
Step 4: evaluate clustering quality with the optimal benchmark obtained in step 3.
The invention is further characterized in that
step 1 is implemented according to the following steps:
Suppose there is a data set $X$ containing $m$ objects, $X = \{x_1, x_2, \ldots, x_m\}$. After $N$ base clustering algorithms are run, $N$ base clusterings are obtained, $\pi = \{\pi_1, \pi_2, \ldots, \pi_N\}$. The fusion operation is then applied to $\pi$ to obtain the fused clustering $\pi^*$, defined as $\pi^* = \varphi(\pi)$, where $\varphi$ is the cluster fusion function;
First, social network user information is sampled: a social platform account is used to obtain access rights to the platform, and the target information is collected in a directed manner by setting an initial task set;
Second, k-means is adopted as the candidate benchmark algorithm: the number of clusters is set first, the initial cluster centers are then set at random, and multiple base clusterings are generated. To generate highly diverse base clustering sets, a sampler samples the base clustering set, and by combining subsets of base clusterings, multiple sampled base clustering sets that differ strongly from one another are obtained.
The sampler samples by random roulette-wheel selection.
Step 2 is implemented according to the following steps:
The SLC algorithm is used to fuse the set of fused clusterings to obtain the candidate benchmarks.
The score of a candidate benchmark is defined as follows:
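$$
\mathrm{Eval}(\pi_B^*) =
\begin{cases}
0, & \text{if } \left|\mathrm{Acu}(\pi_u^*) - \mathrm{Acu}(\pi_v^*)\right| > \alpha \\
\lambda \cdot \mathrm{NMI}(\pi_u^*, \pi_v^*) + (1 - \lambda)\left|\mathrm{Acu}(\pi_u^*) - \mathrm{Acu}(\pi_v^*)\right|, & \text{otherwise}
\end{cases}
$$
where $\pi_B^*$ is the candidate benchmark, $\pi_u^*$ and $\pi_v^*$ are fused clusterings, and $\alpha$ is a threshold.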
When the degree of similarity between the fused clusterings is greater than α, the score is 0; this prevents the fused clusterings from being too similar. When the degree of similarity between the fused clusterings is less than α, the score is the sum of two parts. The first part is the degree of similarity between the fused clusterings and the candidate benchmark, and the second part is the degree of similarity between the fused clusterings themselves. λ is the weight between the two parts: when λ > 0.5, the first part carries more weight than the second; when λ < 0.5, the second part carries more weight than the first; when λ = 0.5, the two parts carry equal weight. In general λ = 0.5 is chosen, so that both parts contribute equally to the score. The score of each candidate benchmark is calculated accordingly, and the candidate benchmark with the highest score is taken as the final benchmark. The screened benchmark is used as the optimal benchmark in the next step to evaluate clustering quality.
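As a worked illustration of the scoring rule, the following uses λ = 0.5 as suggested above together with purely hypothetical values α = 0.2, NMI = 0.8 and Acu values of 0.7 and 0.6; none of these numbers other than λ come from the source.

```latex
\begin{aligned}
\left|\mathrm{Acu}(\pi_u^*) - \mathrm{Acu}(\pi_v^*)\right| &= |0.7 - 0.6| = 0.1 \le \alpha = 0.2, \\
\mathrm{Eval}(\pi_B^*) &= 0.5 \cdot 0.8 + (1 - 0.5) \cdot 0.1 = 0.45 .
\end{aligned}
```

If the two Acu values were instead 0.9 and 0.6, the difference 0.3 would exceed α = 0.2 and the score would be 0.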
Step 4 is implemented according to the following steps:
Using the optimal benchmark generated in the previous step, clustering quality is evaluated with the external method BCubed: given the benchmark $\pi_b$ and $K$ fused clusterings $\pi = \{\pi_1, \pi_2, \ldots, \pi_K\}$ obtained by different cluster fusion algorithms, a quality score $Q_i(\pi_i, \pi_b)$ can be obtained for each fused clustering $\pi_i$. The higher the score, the better the fusion result produced by that cluster fusion algorithm;
Suppose there is an object set $X = \{x_1, x_2, \ldots, x_n\}$, $C$ is a clustering of $X$, and $B$ is the benchmark of $X$. $C(x_i)$ $(1 \le i \le n)$ denotes the category of $x_i$ in $C$, and $B(x_i)$ $(1 \le i \le n)$ denotes the category of $x_i$ in $B$. For two objects $x_i$ and $x_j$ $(1 \le i, j \le n,\ i \ne j)$, the correctness of $x_i$ and $x_j$ in clustering $C$ is defined as follows:
$$
\mathrm{Correctness}(x_i, x_j) =
\begin{cases}
1, & \text{if } B(x_i) = B(x_j) \Leftrightarrow C(x_i) = C(x_j) \\
0, & \text{otherwise}
\end{cases}
$$
The precision of BCubed is defined as follows:
$$
P = \frac{1}{n} \sum_{i=1}^{n} \frac{\sum_{x_j :\, i \ne j,\ C(x_i) = C(x_j)} \mathrm{Correctness}(x_i, x_j)}{\left|\{x_j \mid i \ne j,\ C(x_i) = C(x_j)\}\right|}
$$
The recall of BCubed is defined as follows:
$$
R = \frac{1}{n} \sum_{i=1}^{n} \frac{\sum_{x_j :\, i \ne j,\ B(x_i) = B(x_j)} \mathrm{Correctness}(x_i, x_j)}{\left|\{x_j \mid i \ne j,\ B(x_i) = B(x_j)\}\right|}
$$
Precision and recall can both be used to evaluate a clustering, and the F-measure, which combines precision and recall, is defined as follows:
$$
F = \frac{(1 + \beta^2)\, P \cdot R}{\beta^2 \cdot P + R}
$$
The F-measure takes values between 0 and 1. When the F-measure equals 0, the clustering quality is poor; when it equals 1, the clustering quality is ideal and the clustering is completely consistent with the benchmark. The closer the F-measure is to 1, the better the clustering quality.
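To make the role of β concrete, the short derivation below substitutes β = 1 and β = 2 into the F-measure. The values P = 0.8 and R = 0.5 are hypothetical and chosen only for illustration.

```latex
\begin{aligned}
\beta = 1:&\quad F = \frac{2\,P \cdot R}{P + R} = \frac{2 \cdot 0.8 \cdot 0.5}{0.8 + 0.5} = \frac{0.8}{1.3} \approx 0.615, \\
\beta = 2:&\quad F = \frac{5\,P \cdot R}{4P + R} = \frac{5 \cdot 0.8 \cdot 0.5}{4 \cdot 0.8 + 0.5} = \frac{2.0}{3.7} \approx 0.541 .
\end{aligned}
```

With β = 1 the F-measure is the harmonic mean of precision and recall. With β > 1 recall is weighted more heavily, so the lower recall in this example pulls the score down.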
The beneficial effect of the invention is that it proposes a group discovery and identification method whose criterion is an external evaluation method that does not rely on an expert-provided benchmark. First, benchmark classes are generated from social network data so that the social users in the same benchmark class have similar group attributes. A sampler is applied to the base clusterings to generate a variety of base clustering sets. A cluster fusion algorithm is applied to each base clustering set, the resulting fused clusterings are fused again with a cluster fusion algorithm, and candidate benchmarks are generated. A screener is applied to the candidate benchmarks, and a benchmark is obtained according to the set screening conditions. The benchmark is then used to evaluate clustering quality; once it is obtained, an external evaluation method is used for the evaluation. By fusing the decisions of the base clusterings, the invention reaches a more accurate and more robust decision. It improves the accuracy of group and individual discovery in social network data, lets a service provider obtain fuller user information and thereby improve service quality, and has great practical value.
Brief description of the drawings
Fig. 1 is a framework diagram of the base clustering sampling part;
Fig. 2 is a framework diagram of the candidate benchmark generation part;
Fig. 3 is a framework diagram of the candidate benchmark screening part.
Detailed description
The invention is described in detail below with reference to the drawings and specific embodiments.
The social group finding method based on a cluster fusion algorithm according to the invention is implemented according to the following steps:
Step 1: for the data in the social network, obtain the corresponding base clustering from each base clustering algorithm (base clustering algorithm 1 yields base clustering 1), sampling by random roulette-wheel selection. Specifically, the base clusterings of the social network data are built: each base clustering algorithm divides the social network data into its own base clustering, and the base clustering set is then sampled in order to generate highly diverse sampled base clustering sets. Highly diverse sampled base clustering sets help generate diverse candidate fused clusterings later, which benefits the final screening of the fused clusterings.
Step 2: fuse each sampled base clustering set to obtain candidate benchmarks. Specifically, the cluster fusion algorithms under evaluation are run on each sampled base clustering set of the network data, and a candidate benchmark is generated from the set of fused clusterings that these algorithms produce. Repeating this for every sampled set yields the candidate benchmark set.
Step 3: screen the candidate benchmarks; the candidate benchmark with the highest score is the benchmark.
Step 4: use the benchmark to evaluate clustering quality.
The invention is further illustrated below with reference to a specific embodiment. It should be understood that the embodiment serves only to illustrate the invention and not to limit its scope; after reading this disclosure, modifications of the invention in various equivalent forms made by those skilled in the art all fall within the scope defined by the claims of this application.
Embodiment
Fig. 1 is a framework diagram of the base clustering sampling part of the embodiment of the invention. The flow is described below:
Expressed as formulas: suppose there is a data set $X$ containing $m$ objects, $X = \{x_1, x_2, \ldots, x_m\}$. After $N$ base clustering algorithms are run, $N$ base clusterings are obtained, $\pi = \{\pi_1, \pi_2, \ldots, \pi_N\}$. The fusion operation is then applied to $\pi$ to obtain the fused clustering $\pi^*$, defined as $\pi^* = \varphi(\pi)$, where $\varphi$ is the cluster fusion function.
First, social network user information is sampled: a social platform account is used to obtain access rights to the platform, and the target information is collected in a directed manner by setting an initial task set. From the collected data, occupation, hobby, gender, location, and graduating college information are extracted.
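The extracted attributes are categorical, so they need a numeric form before a distance-based algorithm such as k-means can use them. The source does not say how the attributes are encoded; the sketch below assumes a simple one-hot encoding, and the function name, field names and sample records are all hypothetical.

```python
def one_hot_encode(users, fields):
    """Turn categorical user records into numeric vectors (one-hot per field).

    Assumption: every record has a value for every field. The encoding itself
    is an illustrative choice, not something stated in the source.
    """
    # Fixed, sorted vocabulary per field so that vector positions are stable.
    vocab = {f: sorted({u[f] for u in users}) for f in fields}
    vectors = []
    for u in users:
        vec = []
        for f in fields:
            vec.extend(1.0 if u[f] == value else 0.0 for value in vocab[f])
        vectors.append(tuple(vec))
    return vectors, vocab


# Hypothetical sample records using two of the attributes named in the text.
users = [
    {"occupation": "teacher", "gender": "F"},
    {"occupation": "engineer", "gender": "M"},
]
vectors, vocab = one_hot_encode(users, ["occupation", "gender"])
```

Each resulting tuple can be passed to a clustering routine as one data object $x_i$.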
Second, k-means is adopted as the candidate benchmark algorithm. The number of clusters is set first, the initial cluster centers are then set at random, and multiple base clusterings can be generated. To generate highly diverse base clustering sets, a sampler samples the base clustering set, and by combining subsets of base clusterings, multiple sampled base clustering sets that differ strongly from one another are obtained. The sampler samples the base clustering set by roulette-wheel random sampling: the more a base clustering differs from the other base clusterings, the greater its probability of being selected. The purpose is to generate highly diverse base clustering sets, so that the following two steps can generate highly diverse fused clusterings and candidate benchmarks.
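The generation of multiple base clusterings by repeated k-means runs with random initial centers can be sketched as follows. This is a minimal pure-Python illustration, not the patent's implementation; the function names, the iteration count, the seed and the sample points are assumptions.

```python
import random


def kmeans(points, k, rng, iters=20):
    """Plain k-means (Lloyd's algorithm); returns one cluster label per point."""
    centers = rng.sample(points, k)  # random initial cluster centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def base_clusterings(points, k, n_runs, seed=0):
    """Run k-means n_runs times with different random initial centers."""
    rng = random.Random(seed)
    return [kmeans(points, k, rng) for _ in range(n_runs)]


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
pi = base_clusterings(points, k=2, n_runs=5)
```

Here `pi` plays the role of the base clustering set $\pi$, with one label list per run.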
The experimental evaluation method for cluster fusion algorithms uses normalized mutual information (NMI) to evaluate the diversity between clusterings and their correctness. By the definition of the sampler, a highly diverse base clustering has a higher probability of being selected into a sampled base clustering set. The probability that a base clustering is selected into a sampled base clustering set is defined in terms of $\mathrm{Div}(\pi_p)$, where $\pi_p$ is a base clustering and $\mathrm{Div}(\pi_p)$ denotes the degree of difference between that base clustering and the other base clusterings.
The k-means algorithm is run repeatedly with random initialization on the data set to obtain multiple base clusterings. The base clustering set is then passed to the sampler, which uses roulette-wheel random sampling, so that a more diverse base clustering is more likely to be selected into a sampled base clustering set. In this way multiple sampled base clustering sets are generated.
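A roulette-wheel sampler driven by NMI-based diversity might look like the sketch below. The exact selection-probability expression is not available in this text, so the code assumes that $\mathrm{Div}(\pi_p)$ is the average of $1 - \mathrm{NMI}$ against the other base clusterings and that the selection probability is proportional to it. Both choices, the square-root normalization of NMI, and all names are assumptions.

```python
import math
import random
from collections import Counter


def nmi(a, b):
    """Normalized mutual information of two label lists (sqrt normalization)."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(c / n * math.log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    ha = -sum(c / n * math.log(c / n) for c in ca.values())
    hb = -sum(c / n * math.log(c / n) for c in cb.values())
    if ha == 0 or hb == 0:  # at least one clustering has a single cluster
        return 1.0 if ha == hb else 0.0
    return mi / math.sqrt(ha * hb)


def diversity(p, clusterings):
    """Assumed Div: mean (1 - NMI) between clustering p and all the others."""
    others = [q for i, q in enumerate(clusterings) if i != p]
    return sum(max(0.0, 1.0 - nmi(clusterings[p], q)) for q in others) / len(others)


def roulette_sample(clusterings, size, rng):
    """Draw `size` clusterings with replacement, probability proportional to Div."""
    weights = [diversity(p, clusterings) for p in range(len(clusterings))]
    if sum(weights) <= 0:  # all clusterings identical: fall back to uniform
        weights = [1.0] * len(clusterings)
    picked = rng.choices(range(len(clusterings)), weights=weights, k=size)
    return [clusterings[i] for i in picked]


# Hypothetical base clusterings: the first two agree, the third is unrelated.
cs = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
sampled = roulette_sample(cs, size=2, rng=random.Random(0))
```

In this toy input the third clustering has the largest diversity value, so it is the most likely to be drawn.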
Fig. 2 is a framework diagram of generating candidate benchmarks from the sampler output, described as follows:
The sampler of the previous step generates multiple sampled base clustering sets. The cluster fusion algorithms under evaluation are then applied to each sampled base clustering set to obtain a set of fused clusterings. The invention uses the SLC algorithm (Single-Level Cell) to fuse the set of fused clusterings and obtain the candidate benchmarks.
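The text does not spell out how the SLC algorithm works, so the sketch below does not attempt to reproduce it. It swaps in a common consensus technique instead: build a co-association matrix (the fraction of clusterings in which two objects share a cluster) and link every pair whose co-association exceeds a threshold. The threshold value and all names are assumptions.

```python
def co_association(clusterings):
    """Fraction of clusterings in which objects i and j share a cluster."""
    n, m = len(clusterings[0]), len(clusterings)
    return [[sum(c[i] == c[j] for c in clusterings) / m for j in range(n)]
            for i in range(n)]


def fuse(clusterings, threshold=0.5):
    """Consensus clustering: link pairs with co-association above threshold."""
    n = len(clusterings[0])
    co = co_association(clusterings)
    parent = list(range(n))  # union-find forest over the objects

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if co[i][j] > threshold:
                parent[find(i)] = find(j)

    # Relabel the connected components as 0, 1, 2, ... in order of appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical input: three clusterings of six objects; label names differ
# between runs, and the third run places object 2 differently.
clusterings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
fused = fuse(clusterings)
```

Because only co-membership is counted, the result does not depend on how each run happens to name its clusters.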
The score of a candidate benchmark is defined as follows:
$$
\mathrm{Eval}(\pi_B^*) =
\begin{cases}
0, & \text{if } \left|\mathrm{Acu}(\pi_u^*) - \mathrm{Acu}(\pi_v^*)\right| > \alpha \\
\lambda \cdot \mathrm{NMI}(\pi_u^*, \pi_v^*) + (1 - \lambda)\left|\mathrm{Acu}(\pi_u^*) - \mathrm{Acu}(\pi_v^*)\right|, & \text{otherwise}
\end{cases}
$$
where $\pi_B^*$ is the candidate benchmark, $\pi_u^*$ and $\pi_v^*$ are fused clusterings, and $\alpha$ is a threshold. When the degree of similarity between the fused clusterings is greater than α, the score is 0; this prevents the fused clusterings from being too similar. When the degree of similarity between the fused clusterings is less than α, the score is the sum of two parts. The first part is the degree of similarity between the fused clusterings and the candidate benchmark, and the second part is the degree of similarity between the fused clusterings themselves. λ is the weight between the two parts: when λ > 0.5, the first part carries more weight than the second; when λ < 0.5, the second part carries more weight than the first; when λ = 0.5, the two parts carry equal weight. In general λ = 0.5 is chosen, so that both parts contribute equally to the score. The score of each candidate benchmark is calculated accordingly, and the candidate benchmark with the highest score is taken as the final benchmark. The screened benchmark is used as the benchmark in the next step to evaluate clustering quality.
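The scoring and screening step can be sketched as a small function over precomputed quantities. The text does not define how Acu is computed, so the sketch takes the NMI and Acu values as inputs. The default α = 0.2, the candidate names and all numeric values are hypothetical; only λ = 0.5 follows the text.

```python
def eval_score(nmi_uv, acu_u, acu_v, alpha=0.2, lam=0.5):
    """Score of a candidate benchmark following the piecewise rule above.

    nmi_uv: NMI between the two fused clusterings.
    acu_u, acu_v: the Acu values of the two fused clusterings
    (how Acu is obtained is not specified in the text).
    """
    gap = abs(acu_u - acu_v)
    if gap > alpha:
        return 0.0
    return lam * nmi_uv + (1 - lam) * gap


# Hypothetical candidates, each described by (NMI, Acu_u, Acu_v).
candidates = {
    "candidate_1": (0.80, 0.70, 0.60),
    "candidate_2": (0.95, 0.90, 0.50),  # gap 0.4 exceeds alpha, so score 0
    "candidate_3": (0.60, 0.65, 0.60),
}
scores = {name: eval_score(*values) for name, values in candidates.items()}
best = max(scores, key=scores.get)  # the highest score becomes the benchmark
```

Keeping the threshold and weight as parameters makes it easy to see how the chosen benchmark shifts when α or λ changes.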
Fig. 3 is a framework diagram of the candidate benchmark screening part, described as follows:
All candidate benchmarks are screened; the screener obtains the optimal benchmark from the calculated scores of the candidate benchmarks.
Using the benchmark generated in the previous step, clustering quality can be evaluated with the external method BCubed. Given the benchmark $\pi_b$ and $K$ fused clusterings $\pi = \{\pi_1, \pi_2, \ldots, \pi_K\}$ obtained by different cluster fusion algorithms, a quality score $Q_i(\pi_i, \pi_b)$ can be obtained for each fused clustering $\pi_i$. The higher the score, the better the fusion result produced by that cluster fusion algorithm.
BCubed is an external evaluation method: given a benchmark, it computes precision and recall for every object in a clustering of a given data set. Suppose there is an object set $X = \{x_1, x_2, \ldots, x_n\}$, $C$ is a clustering of $X$, and $B$ is the benchmark of $X$. $C(x_i)$ $(1 \le i \le n)$ denotes the category of $x_i$ in $C$, and $B(x_i)$ $(1 \le i \le n)$ denotes the category of $x_i$ in $B$. For two objects $x_i$ and $x_j$ $(1 \le i, j \le n,\ i \ne j)$, the correctness of $x_i$ and $x_j$ in clustering $C$ is defined as follows:
$$
\mathrm{Correctness}(x_i, x_j) =
\begin{cases}
1, & \text{if } B(x_i) = B(x_j) \Leftrightarrow C(x_i) = C(x_j) \\
0, & \text{otherwise}
\end{cases}
$$
The precision of BCubed is defined as follows:
$$
P = \frac{1}{n} \sum_{i=1}^{n} \frac{\sum_{x_j :\, i \ne j,\ C(x_i) = C(x_j)} \mathrm{Correctness}(x_i, x_j)}{\left|\{x_j \mid i \ne j,\ C(x_i) = C(x_j)\}\right|}
$$
The recall of BCubed is defined as follows:
$$
R = \frac{1}{n} \sum_{i=1}^{n} \frac{\sum_{x_j :\, i \ne j,\ B(x_i) = B(x_j)} \mathrm{Correctness}(x_i, x_j)}{\left|\{x_j \mid i \ne j,\ B(x_i) = B(x_j)\}\right|} \tag{5}
$$
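The BCubed precision and recall defined above translate fairly directly into code. The sketch below follows those definitions, with one assumption that the text does not address: when an object has no other member in its cluster (or benchmark class), the ratio is undefined, and the code counts that term as 1. The function names and sample labels are hypothetical.

```python
def correctness(C, B, i, j):
    """1 if objects i and j agree in C exactly when they agree in B, else 0."""
    return 1 if (B[i] == B[j]) == (C[i] == C[j]) else 0


def _bcubed(group, C, B):
    """Average per-object score over peers that share the object's group label."""
    n = len(C)
    total = 0.0
    for i in range(n):
        peers = [j for j in range(n) if j != i and group[i] == group[j]]
        if not peers:
            total += 1.0  # assumption: an object with no peers counts as correct
        else:
            total += sum(correctness(C, B, i, j) for j in peers) / len(peers)
    return total / n


def bcubed_precision(C, B):
    """Peers are taken from the clustering C."""
    return _bcubed(C, C, B)


def bcubed_recall(C, B):
    """Peers are taken from the benchmark B."""
    return _bcubed(B, C, B)


# Hypothetical labels: the clustering misplaces the third object.
C = [0, 0, 0, 1, 1, 1]
B = [0, 0, 1, 1, 1, 1]
P = bcubed_precision(C, B)
R = bcubed_recall(C, B)
```

For this input both values come out to 2/3: the misplaced object scores 0 on both measures and lowers the scores of the objects it is wrongly grouped with or separated from.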
Precision and recall can both be used to evaluate a clustering, and the F-measure, which combines precision and recall, is defined as follows:
$$
F = \frac{(1 + \beta^2)\, P \cdot R}{\beta^2 \cdot P + R} \tag{6}
$$
The F-measure takes values between 0 and 1. When the F-measure equals 0, the clustering quality is poor; when it equals 1, the clustering quality is ideal and the clustering is completely consistent with the benchmark. The closer the F-measure is to 1, the better the clustering quality.
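Ranking several cluster fusion algorithms by their F-measure against the benchmark can be sketched as follows. The algorithm names and the precision and recall values are hypothetical, and returning 0 when both precision and recall are 0 is an assumption made to avoid division by zero.

```python
def f_measure(p, r, beta=1.0):
    """F-measure combining precision p and recall r with weight beta."""
    denom = beta ** 2 * p + r
    if denom == 0:  # both precision and recall are zero
        return 0.0
    return (1 + beta ** 2) * p * r / denom


# Hypothetical (precision, recall) pairs for two fusion algorithms,
# each measured against the same benchmark.
results = {
    "fusion_a": (0.90, 0.60),
    "fusion_b": (0.75, 0.75),
}
ranking = sorted(results, key=lambda name: f_measure(*results[name]), reverse=True)
```

With β = 1 the balanced algorithm ranks first here (0.75 against 0.72), which shows how the F-measure penalizes a large gap between precision and recall.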

Claims (6)

1. A social group finding method based on a cluster fusion algorithm, characterized in that it is implemented according to the following steps:
Step 1: for the data in the social network, obtain the corresponding sampled base clusterings from the respective base clustering algorithms;
Step 2: fuse each sampled base clustering set obtained in step 1 to obtain candidate benchmarks;
Step 3: screen the candidate benchmarks obtained in step 2; the candidate benchmark with the highest score is taken as the optimal benchmark;
Step 4: evaluate clustering quality with the optimal benchmark obtained in step 3.
2. The social group finding method based on a cluster fusion algorithm according to claim 1, characterized in that step 1 is implemented according to the following steps:
Suppose there is a data set $X$ containing $m$ objects, $X = \{x_1, x_2, \ldots, x_m\}$. After $N$ base clustering algorithms are run, $N$ base clusterings are obtained, $\pi = \{\pi_1, \pi_2, \ldots, \pi_N\}$. The fusion operation is then applied to $\pi$ to obtain the fused clustering $\pi^*$, defined as $\pi^* = \varphi(\pi)$, where $\varphi$ is the cluster fusion function;
First, social network user information is sampled: a social platform account is used to obtain access rights to the platform, and the target information is collected in a directed manner by setting an initial task set;
Second, k-means is adopted as the candidate benchmark algorithm: the number of clusters is set first, the initial cluster centers are then set at random, and multiple base clusterings are generated. To generate highly diverse base clustering sets, a sampler samples the base clustering set, and by combining subsets of base clusterings, multiple sampled base clustering sets that differ strongly from one another are obtained.
3. The social group finding method based on a cluster fusion algorithm according to claim 2, characterized in that the sampler samples by random roulette-wheel selection.
4. The social group finding method based on a cluster fusion algorithm according to claim 2, characterized in that step 2 is implemented according to the following steps:
The SLC algorithm is used to fuse the set of fused clusterings to obtain the candidate benchmarks;
The score of a candidate benchmark is defined as follows:
$$
\mathrm{Eval}(\pi_B^*) =
\begin{cases}
0, & \text{if } \left|\mathrm{Acu}(\pi_u^*) - \mathrm{Acu}(\pi_v^*)\right| > \alpha \\
\lambda \cdot \mathrm{NMI}(\pi_u^*, \pi_v^*) + (1 - \lambda)\left|\mathrm{Acu}(\pi_u^*) - \mathrm{Acu}(\pi_v^*)\right|, & \text{otherwise}
\end{cases}
$$
where $\pi_B^*$ is the candidate benchmark, $\pi_u^*$ and $\pi_v^*$ are fused clusterings, and $\alpha$ is a threshold.
5. The social group finding method based on a cluster fusion algorithm according to claim 4, characterized in that step 3 is implemented according to the following steps: when the degree of similarity between the fused clusterings is greater than α, the score is 0, which prevents the fused clusterings from being too similar; when the degree of similarity between the fused clusterings is less than α, the score is the sum of two parts; the first part is the degree of similarity between the fused clusterings and the candidate benchmark, and the second part is the degree of similarity between the fused clusterings themselves; λ is the weight between the two parts; when λ > 0.5, the first part carries more weight than the second; when λ < 0.5, the second part carries more weight than the first; when λ = 0.5, the two parts carry equal weight; in general λ = 0.5 is chosen, so that both parts contribute equally to the score; the score of each candidate benchmark is calculated accordingly, and the candidate benchmark with the highest score is taken as the final benchmark; the screened benchmark is used as the optimal benchmark in the next step to evaluate clustering quality.
6. The social group finding method based on a cluster fusion algorithm according to claim 5, characterized in that step 4 is implemented according to the following steps:
Using the optimal benchmark generated in the previous step, clustering quality is evaluated with the external method BCubed: given the benchmark $\pi_b$ and $K$ fused clusterings $\pi = \{\pi_1, \pi_2, \ldots, \pi_K\}$ obtained by different cluster fusion algorithms, a quality score $Q_i(\pi_i, \pi_b)$ can be obtained for each fused clustering $\pi_i$; the higher the score, the better the fusion result produced by that cluster fusion algorithm;
Suppose there is an object set $X = \{x_1, x_2, \ldots, x_n\}$, $C$ is a clustering of $X$, and $B$ is the benchmark of $X$; $C(x_i)$ $(1 \le i \le n)$ denotes the category of $x_i$ in $C$, and $B(x_i)$ $(1 \le i \le n)$ denotes the category of $x_i$ in $B$; for two objects $x_i$ and $x_j$ $(1 \le i, j \le n,\ i \ne j)$, the correctness of $x_i$ and $x_j$ in clustering $C$ is defined as follows:
$$
\mathrm{Correctness}(x_i, x_j) =
\begin{cases}
1, & \text{if } B(x_i) = B(x_j) \Leftrightarrow C(x_i) = C(x_j) \\
0, & \text{otherwise}
\end{cases}
$$
The precision of BCubed is defined as follows:
$$
P = \frac{1}{n} \sum_{i=1}^{n} \frac{\sum_{x_j :\, i \ne j,\ C(x_i) = C(x_j)} \mathrm{Correctness}(x_i, x_j)}{\left|\{x_j \mid i \ne j,\ C(x_i) = C(x_j)\}\right|}
$$
The recall of BCubed is defined as follows:
$$
R = \frac{1}{n} \sum_{i=1}^{n} \frac{\sum_{x_j :\, i \ne j,\ B(x_i) = B(x_j)} \mathrm{Correctness}(x_i, x_j)}{\left|\{x_j \mid i \ne j,\ B(x_i) = B(x_j)\}\right|}
$$
Precision and recall can both be used to evaluate a clustering, and the F-measure, which combines precision and recall, is defined as follows:
$$
F = \frac{(1 + \beta^2)\, P \cdot R}{\beta^2 \cdot P + R}
$$
The F-measure takes values between 0 and 1; when the F-measure equals 0, the clustering quality is poor; when it equals 1, the clustering quality is ideal and the clustering is completely consistent with the benchmark; the closer the F-measure is to 1, the better the clustering quality.
CN201510646011.1A 2015-09-30 2015-09-30 Social group finding method based on cluster fusion algorithm Pending CN105282011A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510646011.1A CN105282011A (en) 2015-09-30 2015-09-30 Social group finding method based on cluster fusion algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510646011.1A CN105282011A (en) 2015-09-30 2015-09-30 Social group finding method based on cluster fusion algorithm

Publications (1)

Publication Number Publication Date
CN105282011A true CN105282011A (en) 2016-01-27

Family

ID=55150378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510646011.1A Pending CN105282011A (en) 2015-09-30 2015-09-30 Social group finding method based on cluster fusion algorithm

Country Status (1)

Country Link
CN (1) CN105282011A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109222965A (en) * 2018-09-21 2019-01-18 华南理工大学 A kind of synchronous EEG-fMRI's goes artefact method online
CN110110220A (en) * 2018-06-21 2019-08-09 北京交通大学 Merge the recommended models of social networks and user's evaluation

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101383748A (en) * 2008-10-24 2009-03-11 北京航空航天大学 Community division method in complex network
CN102929942A (en) * 2012-09-27 2013-02-13 福建师范大学 Social network overlapping community finding method based on ensemble learning
CN103888541A (en) * 2014-04-01 2014-06-25 中国矿业大学 Method and system for discovering cells fused with topology potential and spectral clustering

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
梁荣德: "聚类融合算法的实验评价方法", 《无线互联科技》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110110220A (en) * 2018-06-21 2019-08-09 北京交通大学 Merge the recommended models of social networks and user's evaluation
CN110110220B (en) * 2018-06-21 2021-06-01 北京交通大学 Recommendation model fusing social network and user evaluation
CN109222965A (en) * 2018-09-21 2019-01-18 华南理工大学 A kind of synchronous EEG-fMRI's goes artefact method online

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20160127