CN105653518A - Specific group discovery and expansion method based on microblog data - Google Patents

Info

Publication number
CN105653518A
Authority
CN
China
Prior art keywords
user
feature
data
group
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510997788.2A
Other languages
Chinese (zh)
Inventor
吴松泽
张华平
徐程程
王洋
王�琦
李高超
付戈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN201510997788.2A
Publication of CN105653518A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Business, Economics & Management (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Computing Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a specific group discovery and expansion method based on microblog data and belongs to the field of social network analysis and data mining. The method comprises the following steps: collecting relevant group information; integrating and mapping the information; extracting features from the text data; calculating user similarity; performing self-detection of category groups; and extracting the attributes of the specific group, determining its category, and expanding the group. Because it works from text data rather than from a network model, the method avoids the failure of group identification that occurs when relationship data are sparse or incomplete; it can be applied to large-scale data computation and is highly stable.

Description

A specific group discovery and expansion method based on microblog data
Technical field
The method relates to the discovery and expansion of specific text-based groups in social networks, in particular to the discovery and expansion of specific groups in microblog data, and belongs to the field of social network analysis and data mining.
Background art
In a social network, users can publish their own information and also see the information shared by others, and in this way a virtual social network is built. Such a sharing platform is timely, real-time, and interactive, and it also retains the propagation characteristics of traditional social interaction, so it has become part of people's work and life.
On microblog platforms, users produce a large amount of text data, and mining that data can yield information of high value. Efficient data mining methods and machine learning algorithms are therefore needed to extract the valuable information contained in social network text. One such kind of valuable information is the discovery and expansion of social media groups.
Classifying or clustering a full text corpus requires feature extraction from the text data: high-weight key words are selected as the features of a text, which facilitates similarity calculation and classification or clustering. The techniques involved include word segmentation, word-frequency statistics, and keyword extraction. The weight of each word is computed from its word frequency or its TF-IDF value. The main feature extraction algorithms are the following. Information gain computes the gain in information entropy produced by different feature selections and retains the feature vector that yields the largest gain. Mutual information takes the category information of the text into account, computes the mutual information between category and feature, and retains the features with large mutual information. CHI (the chi-square test) uses the chi-square statistic to measure the lack of independence between a word and a category: the higher the chi-square value, the less independent the word is of the category and the more distinctive the word is as a feature, so it is extracted as a feature. Cross entropy (KL distance) measures the distance between the probability distribution of text categories and the category distribution conditioned on the occurrence of a specific term; this distance measures the influence of the term on the category, and the larger the KL distance, the greater the influence of the term on category determination.
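The TF-IDF weighting mentioned above can be illustrated with a minimal sketch. It assumes the documents are already tokenized and uses the common `tf * log(N / df)` form; the function name `tf_idf` and the toy documents are illustrative and do not come from the patent.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: relative term frequency times log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Toy corpus (hypothetical): three short, already segmented posts.
docs = [
    ["machine", "learning", "model"],
    ["machine", "football", "match"],
    ["football", "match", "score"],
]
weights = tf_idf(docs)
```

In this toy corpus "learning" occurs in one document and "machine" in two, so "learning" receives the higher weight in the first document.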
In the field of social media data mining there are many mature community discovery algorithms. A community is a group whose members are closely connected and similar. Community discovery algorithms use the relationship information between users to assess their influence and homogeneity, and turn these into the features on which community detection is based. The mainstream community partitioning algorithms at present include those based on modularity, on spectral analysis, on information theory, and on label propagation. Among these, the algorithm that can be applied at scale in practical engineering is the modularity-based community discovery algorithm proposed by Newman et al. This algorithm defines a discriminant value, the community modularity Q. In each iteration it considers adding a new element (node), computes whether the value of Q gains, and finally reaches a stable result. Under this framework the social media must first be modeled as a network, and communities are then discovered and partitioned according to the network structure and its relationships. For social media such as microblogs, network modeling mainly uses the follower, following, and repost relationships among microblog users to build the edges of the graph model, and modularity is then computed on the constructed network model.
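For reference, the modularity value Q used by this family of algorithms can be computed for a fixed partition as in the sketch below. It assumes an undirected, unweighted graph and the standard per-community form Q = Σ_c [L_c / m − (d_c / 2m)²], where L_c is the number of edges inside community c, d_c the total degree of its nodes, and m the number of edges. The function name and the example graph are illustrative only; the sketch evaluates Q and does not perform the iterative optimization.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)  # edges with both endpoints in the same community
    degree = defaultdict(int)    # summed node degree per community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical graph: two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
split = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
merged = {node: 1 for node in split}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

Splitting the graph into its two triangles gives a higher Q than treating it as a single community, which is the gain the iterative algorithm looks for.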
In practice, however, data are sparse and a complete data set is hard to obtain, which poses a major challenge for traditional computational methods. The full data, such as follower relationships, following relationships, and repost information, often cannot be obtained. In that case the relationship network is incomplete, and traditional algorithms built around modularity cannot accurately compute the closeness and modularity of each group. For sparse relationship networks, a more flexible computational model is therefore needed for community and group discovery.
Summary of the invention
Considering that the information most easily obtained in bulk, and the most complete, is the text that social network users post, we propose a group discovery and expansion method based on text data. The method applies natural language processing to each user's text data to extract that user's feature information, builds a model from the feature information, and performs cluster analysis by comparing the similarity between users to obtain the groups. It then extracts the distinguishing features of each group to expand the group. This solves the problem that groups cannot be discovered accurately when user relationship link data are sparse.
The technical solution proposed by the invention comprises the following steps:
Step 1, collect relevant group information: using crawler technology or data resources published by the microblog platform, obtain the community information to be analyzed. This information mainly includes: the text of posts published by microblog users; the text of comments made by users; the interactions users carry out on the microblog, including comments, reposts, and likes; and basic user attributes, including follower count, following count, and following relationships;
Step 2, integrate and map the community information: in the sample data obtained in step 1, first remove the tags and parse the data according to its hierarchical structure to obtain the user-microblog text mapping and the user-comment text mapping, while retaining the user-following, user-follower, and user-repost relationships;
Step 3, extract features from the text data: for the microblog content posted by each user, use relative entropy (that is, KL distance) to extract features, obtain each user's feature words, and establish the corresponding mapping;
Step 4, calculate user similarity: from the text features extracted in step 3, use cosine similarity to compute the similarity between users, and cluster or classify the users according to the similarity results;
Step 5, perform self-detection of category groups: for the groups obtained in step 4 from the text feature data, summarize the signature features of each whole group to obtain the feature data of each category. Specifically, assuming a group has N features in total, majority voting is used, and the K features shared by the most members are taken as the representative features of the group;
Step 6, extract the attributes of the specific group, determine the category, and expand the group: the sample data of the specific group to be discovered are used as training data; feature words are obtained from this training data with the text feature extraction algorithm of step 3; similarity to the category features is computed as in step 4; and a list of users related to the cluster is obtained.
Beneficial effects
The method uses the text data in microblogs, which capture user information reasonably well, and applies a feature extraction algorithm to obtain each user's features. It thereby avoids the problem that group identification fails under a network model when the data are sparse or incomplete. While discovering groups, the invention also obtains the feature information of each category, which helps in understanding the data and has strong practical value. The text-based group discovery and expansion method makes full use of the data and discovers groups without building a complex network model, which reduces the complexity of the algorithm. The algorithm is also highly modular, can be applied to large-scale data computation, and is highly stable.
Brief description of the drawings
Fig. 1 is a flow chart of the group discovery and expansion method based on microblog text data;
Fig. 2 is a structural diagram of web crawler collection of microblog data;
Fig. 3 shows the analysis of the collected text data and the construction of a quantifiable text vector space model and mapping;
Fig. 4 shows text feature extraction using KL distance and the training process of the group discovery model;
Fig. 5 shows group discovery and group expansion using the trained model.
Detailed description of the embodiments
Specific embodiments of the invention are described below with reference to the drawings:
The overall flow of the group discovery and expansion method based on microblog text data is shown in Fig. 1. Take the discovery of people related to the field of "machine learning" on Sina Weibo as an example. Using steps 1 to 5, we built in advance a library of several category groups and their representative features, one of which is the "machine learning" category. We then found 50 suspected related users; the goal is to identify the users who are truly related and thereby expand the "machine learning" category. Concretely, the feature word vector of each of the 50 suspected users is obtained using steps 1 to 3, and category matching is then performed as in step 6 to decide whether each user belongs to the "machine learning" category. The detailed implementation of each step of the algorithm follows.
Collect relevant information according to step 1:
The Sina Weibo data to be studied are either crawled or obtained directly from the public data that Sina Weibo provides. As shown in Fig. 2, data collection builds a buffered URL queue and uses breadth-first search (BFS) to follow web page links. Each node page is scanned and downloaded, the page is parsed, irrelevant noise is removed, and the metadata that describe user attributes are retained: the text of the microblogs the user has posted, the text of the user's comments, the user's follower count, the user's following count, and the user's repost relationships. Alternatively, the relevant information can be extracted directly through the API provided officially by the microblog platform or through feeds such as RSS;
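The breadth-first traversal with a URL queue described above can be sketched as follows. To stay self-contained the sketch performs no network access: `fetch_links` is a hypothetical callback that returns the outgoing links of a page, and the `LINKS` table stands in for real pages. A real crawler would download and parse each page at the point where the URL is dequeued.

```python
from collections import deque

def bfs_crawl(seed, fetch_links, max_pages=100):
    """Visit pages breadth-first from `seed`, returning URLs in visit order.

    fetch_links(url) -> iterable of linked URLs (hypothetical callback).
    """
    queue = deque([seed])  # buffered URL queue
    seen = {seed}          # URLs already queued, to avoid revisiting
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)  # a real crawler would download and parse here
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

# Hypothetical link structure standing in for crawled user pages.
LINKS = {
    "u/1": ["u/2", "u/3"],
    "u/2": ["u/4"],
    "u/3": ["u/4"],
    "u/4": [],
}
visited = bfs_crawl("u/1", lambda url: LINKS.get(url, []))
```

The `seen` set ensures that a page reachable through several links, such as `u/4` here, is queued only once.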
Integrate and map the information according to step 2:
After the metadata have been obtained in step 1, they are integrated and stored, and the corresponding mappings are established, as shown in Fig. 3. The specific work includes:
1) Word segmentation: the user's microblog text (posted microblogs and comments) is segmented with the ICTCLAS word segmentation system, stop words are removed, and the corresponding text vector space model (VSM) is obtained;
2) Based on the processed data, the user-microblog text vector space mapping is built. Mappings such as the user-repost, user-follower, and user-following relationships can also be obtained at the same time. Although this algorithm does not compute on or process these mappings, they are of great value for later secondary mining of the data.
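The mapping from users to text vectors can be sketched as below. This is only an illustration under stated assumptions: whitespace splitting stands in for the ICTCLAS segmenter (whose API is not shown in the patent), the stop-word list is made up, and the vectors hold raw term counts.

```python
def build_vsm(user_posts, stop_words):
    """Build a shared vocabulary and one term-count vector per user.

    user_posts: dict mapping user -> list of post strings.
    Whitespace tokenization is an assumption standing in for a real segmenter.
    """
    tokens_by_user = {
        user: [tok for post in posts for tok in post.lower().split()
               if tok not in stop_words]
        for user, posts in user_posts.items()
    }
    vocabulary = sorted({tok for toks in tokens_by_user.values() for tok in toks})
    index = {tok: i for i, tok in enumerate(vocabulary)}
    vectors = {}
    for user, toks in tokens_by_user.items():
        vec = [0] * len(vocabulary)
        for tok in toks:
            vec[index[tok]] += 1
        vectors[user] = vec
    return vocabulary, vectors

# Hypothetical users and posts.
user_posts = {
    "alice": ["deep learning is fun", "learning models"],
    "bob": ["football is fun"],
}
vocabulary, vectors = build_vsm(user_posts, stop_words={"is"})
```

Every user's vector has the same length as the vocabulary, so vectors from different users can be compared position by position.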
Extract features from the text data according to step 3:
Using the text vector space model obtained in step 2, as shown in the feature extraction part of Fig. 4, the KL algorithm is used to compute the cross entropy:
$$CE(W) = -\sum_{i=1}^{m} P(C_i \mid W) \log \frac{P(C_i \mid W)}{P(C_i)}$$
Here m is the number of categories, a value defined by the user; C_i is the i-th category; W is the word being measured; and CE(W) is the cross entropy. The KL distance reflects the total influence of a word W on each category C_i. The larger the KL distance, the more the word affects the partition into categories, so the word can be included among the key words of the feature word vector.
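A minimal sketch of this scoring is given below. Several choices are assumptions rather than details from the patent: P(C_i | W) and P(C_i) are estimated from document counts in a small labeled corpus, categories in which the word never occurs contribute zero, and the score is computed as the sum Σ P(C_i | W) log(P(C_i | W) / P(C_i)) without the leading minus sign that appears in the printed formula, so that a larger score means a more discriminative word, as the text describes.

```python
import math
from collections import Counter, defaultdict

def kl_scores(labeled_docs):
    """Score each word by the KL distance between P(C | W) and P(C).

    labeled_docs: list of (category_label, token_list) pairs.
    Probabilities are estimated from document counts (an assumption).
    """
    n_docs = len(labeled_docs)
    class_docs = Counter(label for label, _ in labeled_docs)
    word_class = defaultdict(Counter)  # word -> category -> docs containing word
    for label, tokens in labeled_docs:
        for word in set(tokens):
            word_class[word][label] += 1
    scores = {}
    for word, per_class in word_class.items():
        total = sum(per_class.values())
        score = 0.0
        for label, count in per_class.items():
            p_c_given_w = count / total
            p_c = class_docs[label] / n_docs
            score += p_c_given_w * math.log(p_c_given_w / p_c)
        scores[word] = score
    return scores

# Hypothetical labeled corpus with two categories.
corpus = [
    ("ml", ["model", "training", "data"]),
    ("ml", ["model", "data"]),
    ("sports", ["match", "data"]),
    ("sports", ["match", "score"]),
]
scores = kl_scores(corpus)
```

In this corpus "model" occurs only in the "ml" category and scores log 2, while "data" is spread across both categories and scores close to zero, so "model" would be kept as a feature word first.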
Calculate user similarity according to step 4:
From the feature words extracted for each user, a word vector V = {v_i = 1 | v_i is a feature word} is built. For the feature vectors V_j and V_k of two users' text data, cosine similarity is used to compute the similarity of the two users, as shown in the similarity calculation part of Fig. 4:
$$\cos(V_j, V_k) = \frac{\sum_{i=1}^{c} v_{ji} v_{ki}}{\sqrt{\sum_{i=1}^{c} v_{ji}^{2}} \, \sqrt{\sum_{i=1}^{c} v_{ki}^{2}}}$$
Here c is the length of the vocabulary (that is, the dimension of the text vector space model), v_ji is the i-th word of the j-th user, and v_ki is the i-th word of the k-th user. v_ji = 1 means that the i-th word is a feature word of the j-th user, and v_ji = 0 means that it is not. The larger the cosine value between two users, the more closely related their features are, and the two users can be regarded as belonging to the same category.
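The cosine similarity of two binary feature vectors can be computed as in the following sketch. The function name and the example vectors are illustrative; returning 0.0 for an all-zero vector is a convention chosen here, not something the patent specifies.

```python
import math

def cosine_similarity(vj, vk):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(vj, vk))
    norm_j = math.sqrt(sum(a * a for a in vj))
    norm_k = math.sqrt(sum(b * b for b in vk))
    if norm_j == 0 or norm_k == 0:
        return 0.0  # assumed convention for a user with no feature words
    return dot / (norm_j * norm_k)

# Hypothetical users over a 4-word vocabulary (1 = feature word of that user).
user_j = [1, 1, 0, 1]
user_k = [1, 0, 0, 1]
similarity = cosine_similarity(user_j, user_k)
```

For binary vectors the numerator is simply the number of feature words the two users share, so users with no words in common always score 0.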
Perform self-detection of category groups according to step 5:
Using the similarities computed in step 4, the users are partitioned into groups with a classification or clustering algorithm, which gives an initial discovery of the groups in the social network. As shown in the self-detection part of Fig. 4, once the partition into groups has been obtained, each group undergoes self-detection: the most representative features of the group (category) are extracted and used as the features of the group. The purpose of this step is to prepare for detecting and expanding user-defined specific groups. Assuming a group has N features in total, majority voting is used, and the K features shared by the most members are taken as the representative features of the group.
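One way to read the majority-voting rule is to count, for each feature word, how many group members have it and keep the K most widely shared words. The sketch below follows that reading; the tie-breaking by alphabetical order and all names are assumptions made for the example.

```python
from collections import Counter

def representative_features(member_features, k):
    """Return the k feature words shared by the most members of a group.

    member_features: list of per-member feature word collections.
    Ties are broken alphabetically so the result is deterministic (an assumption).
    """
    # Each member votes at most once per feature word.
    votes = Counter(word for feats in member_features for word in set(feats))
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]

# Hypothetical group of three users with their extracted feature words.
group = [
    {"model", "training", "gpu"},
    {"model", "training"},
    {"model", "dataset"},
]
top_features = representative_features(group, k=2)
```

Here "model" is shared by all three members and "training" by two, so these two words represent the group when K = 2.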
Extract the attributes of the specific group, determine the category, and expand the group according to step 6:
As shown in Fig. 5, when the user needs to find out which group a given user belongs to, and needs to expand a related or similar user group, attributes are extracted from the samples the user provides. This mainly means obtaining the microblog text data of the users concerned and applying natural language processing to it, including word segmentation and TF-IDF word weighting. KL distance is then used in the same way to extract features, and similarity to each category is computed to determine which category the user belongs to, so that the group can be expanded with similar samples. In this process, if a user's similarities to several groups are very close, differing by no more than a threshold θ set by the user, the problem of overlapping communities arises: one user may belong to several groups. The group partition results then need to be verified, and the result sets need to be merged to complete the group expansion.
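The category matching with the overlap threshold θ can be sketched as follows. The sketch assumes that each group is represented by its set of representative feature words, that similarity is the cosine of the corresponding binary vectors (computed directly on sets), and that a user is assigned to every group whose similarity lies within θ of the best one. The group names and feature words are hypothetical.

```python
import math

def set_cosine(a, b):
    """Cosine similarity of two binary feature vectors given as word sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def assign_groups(user_features, group_features, theta):
    """Return (matched group names, all similarities) for one user.

    A user matches every group whose similarity is within theta of the best
    similarity; more than one match indicates overlapping membership.
    """
    sims = {name: set_cosine(user_features, feats)
            for name, feats in group_features.items()}
    best = max(sims.values(), default=0.0)
    if best == 0.0:
        return [], sims  # no shared feature words with any group
    matched = sorted(name for name, s in sims.items() if best - s <= theta)
    return matched, sims

# Hypothetical category library and candidate user.
groups = {
    "machine_learning": {"model", "training", "gpu"},
    "data_science": {"data", "model", "statistics"},
    "sports": {"match", "score"},
}
matched, sims = assign_groups({"model", "training", "data"}, groups, theta=0.05)
```

In this example the user is equally similar to two groups, so both are returned; such overlapping results are the ones that would need to be verified and merged before the expansion is finalized.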

Claims (1)

1. A specific group discovery and expansion method based on microblog data, characterized by comprising the following steps:
Step 1, collect relevant group information: using crawler technology or data resources published by the microblog platform, obtain the community information to be analyzed, the information comprising: the text of posts published by microblog users; the text of comments made by users; the interactions users carry out on the microblog, including comments, reposts, and likes; and basic user attributes, including follower count, following count, and following relationships;
Step 2, integrate and map the community information: in the sample data obtained in step 1, first remove the tags and parse the data according to its hierarchical structure to obtain the user-microblog text mapping and the user-comment text mapping, while retaining the user-following, user-follower, and user-repost relationships;
Step 3, extract features from the text data: for the microblog content posted by each user, use relative entropy to extract features, obtain each user's feature words, and establish the corresponding mapping;
Step 4, calculate user similarity: from the text features extracted in step 3, use cosine similarity to compute the similarity between users, and cluster or classify the users according to the similarity results;
Step 5, perform self-detection of category groups: for the groups obtained in step 4 from the text feature data, summarize the signature features of each whole group to obtain the feature data of each category; specifically, assuming a group has N features in total, majority voting is used, and the K features shared by the most members are taken as the representative features of the group;
Step 6, extract the attributes of the specific group, determine the category, and expand the group: the sample data of the specific group to be discovered are used as training data; feature words are obtained from this training data with the text feature extraction algorithm of step 3; similarity to the category features is computed as in step 4; and a list of users related to the cluster is obtained.
CN201510997788.2A 2015-12-25 2015-12-25 Specific group discovery and expansion method based on microblog data Pending CN105653518A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510997788.2A CN105653518A (en) 2015-12-25 2015-12-25 Specific group discovery and expansion method based on microblog data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510997788.2A CN105653518A (en) 2015-12-25 2015-12-25 Specific group discovery and expansion method based on microblog data

Publications (1)

Publication Number Publication Date
CN105653518A (en) 2016-06-08

Family

ID=56477053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510997788.2A Pending CN105653518A (en) 2015-12-25 2015-12-25 Specific group discovery and expansion method based on microblog data

Country Status (1)

Country Link
CN (1) CN105653518A (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446146A (en) * 2016-09-21 2017-02-22 中国国防科技信息中心 Establishing and identifying method of identification models for followers continuously concerning event in microblogs
CN106933949A (en) * 2017-01-20 2017-07-07 浙江大学 The planing method of influence power outburst in a kind of control social networks
CN107368534A (en) * 2017-06-21 2017-11-21 南京邮电大学 A kind of method for predicting social network user attribute
CN107577782A (en) * 2017-09-14 2018-01-12 国家计算机网络与信息安全管理中心 A kind of people-similarity depicting method based on heterogeneous data
CN108182639A (en) * 2017-12-29 2018-06-19 中国人民解放军火箭军工程大学 A kind of network forum microcommunity determines method and system
CN108228587A (en) * 2016-12-13 2018-06-29 北大方正集团有限公司 Stock discrimination method and Stock discrimination device
CN108647286A (en) * 2018-05-04 2018-10-12 中译语通科技股份有限公司 A kind of user information acquisition method, information data processing terminal towards microblogging
CN108664483A (en) * 2017-03-28 2018-10-16 北大方正集团有限公司 The management method and management system of specific user group
CN109389157A (en) * 2018-09-14 2019-02-26 阿里巴巴集团控股有限公司 A kind of user group recognition methods and device and groups of objects recognition methods and device
CN109446171A (en) * 2017-08-30 2019-03-08 腾讯科技(深圳)有限公司 A kind of data processing method and device
CN110046910A (en) * 2018-12-13 2019-07-23 阿里巴巴集团控股有限公司 The method and apparatus for obtaining customer group relevant to particular customer
CN110264372A (en) * 2019-05-16 2019-09-20 西安交通大学 A kind of theme Combo discovering method indicated based on node
CN110572813A (en) * 2018-05-19 2019-12-13 北京融信数联科技有限公司 mobile phone user behavior similarity analysis method based on mobile big data
CN111026976A (en) * 2019-12-13 2020-04-17 北京信息科技大学 Identification method for microblog specific event attention group
CN112328866A (en) * 2019-08-05 2021-02-05 四川大学 Specific user group mining method in network space security field
CN114357292A (en) * 2021-12-29 2022-04-15 阿里巴巴(中国)有限公司 Model training method, device and storage medium
CN114817563A (en) * 2022-04-27 2022-07-29 电子科技大学 Mining method of specific Twitter user group discovered based on maximum clique

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103823890A (en) * 2014-03-10 2014-05-28 中国科学院信息工程研究所 Microblog hot topic detection method and device aiming at specific group
CN103914493A (en) * 2013-01-09 2014-07-09 北大方正集团有限公司 Method and system for discovering and analyzing microblog user group structure
CN104850647A (en) * 2015-05-28 2015-08-19 国家计算机网络与信息安全管理中心 Microblog group discovering method and microblog group discovering device
CN104915397A (en) * 2015-05-28 2015-09-16 国家计算机网络与信息安全管理中心 Method and device for predicting microblog propagation tendencies
CN104991956A (en) * 2015-07-21 2015-10-21 中国人民解放军信息工程大学 Microblog transmission group division and account activeness evaluation method based on theme possibility model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李炤: "基于微博情感分析的网络舆情热点发现模型研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
邹鸿程: "微博话题检测与追踪技术研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446146B (en) * 2016-09-21 2019-05-17 中国国防科技信息中心 The identification model of event duration follower is established and recognition methods in a kind of microblogging
CN106446146A (en) * 2016-09-21 2017-02-22 中国国防科技信息中心 Establishing and identifying method of identification models for followers continuously concerning event in microblogs
CN108228587A (en) * 2016-12-13 2018-06-29 北大方正集团有限公司 Stock discrimination method and Stock discrimination device
CN106933949A (en) * 2017-01-20 2017-07-07 浙江大学 The planing method of influence power outburst in a kind of control social networks
CN106933949B (en) * 2017-01-20 2020-09-11 浙江大学 Planning method for controlling influence outbreak in social network
CN108664483A (en) * 2017-03-28 2018-10-16 北大方正集团有限公司 The management method and management system of specific user group
CN107368534B (en) * 2017-06-21 2020-06-12 南京邮电大学 Method for predicting social network user attributes
CN107368534A (en) * 2017-06-21 2017-11-21 南京邮电大学 A kind of method for predicting social network user attribute
CN109446171A (en) * 2017-08-30 2019-03-08 腾讯科技(深圳)有限公司 A kind of data processing method and device
CN107577782B (en) * 2017-09-14 2021-04-30 国家计算机网络与信息安全管理中心 Figure similarity depicting method based on heterogeneous data
CN107577782A (en) * 2017-09-14 2018-01-12 国家计算机网络与信息安全管理中心 A kind of people-similarity depicting method based on heterogeneous data
CN108182639A (en) * 2017-12-29 2018-06-19 中国人民解放军火箭军工程大学 A kind of network forum microcommunity determines method and system
CN108182639B (en) * 2017-12-29 2021-04-09 中国人民解放军火箭军工程大学 Method and system for determining small group of internet forum
CN108647286A (en) * 2018-05-04 2018-10-12 中译语通科技股份有限公司 A kind of user information acquisition method, information data processing terminal towards microblogging
CN110572813A (en) * 2018-05-19 2019-12-13 北京融信数联科技有限公司 mobile phone user behavior similarity analysis method based on mobile big data
CN109389157A (en) * 2018-09-14 2019-02-26 阿里巴巴集团控股有限公司 A kind of user group recognition methods and device and groups of objects recognition methods and device
CN110046910A (en) * 2018-12-13 2019-07-23 阿里巴巴集团控股有限公司 The method and apparatus for obtaining customer group relevant to particular customer
CN110264372A (en) * 2019-05-16 2019-09-20 西安交通大学 A kind of theme Combo discovering method indicated based on node
CN110264372B (en) * 2019-05-16 2022-03-08 西安交通大学 Topic community discovery method based on node representation
CN112328866A (en) * 2019-08-05 2021-02-05 四川大学 Specific user group mining method in network space security field
CN111026976A (en) * 2019-12-13 2020-04-17 北京信息科技大学 Identification method for microblog specific event attention group
CN111026976B (en) * 2019-12-13 2024-01-09 北京信息科技大学 Microblog specific event concern group identification method
CN114357292A (en) * 2021-12-29 2022-04-15 阿里巴巴(中国)有限公司 Model training method, device and storage medium
CN114357292B (en) * 2021-12-29 2023-10-13 杭州溢六发发电子商务有限公司 Model training method, device and storage medium
CN114817563A (en) * 2022-04-27 2022-07-29 电子科技大学 Mining method of specific Twitter user group discovered based on maximum clique

Similar Documents

Publication Publication Date Title
CN105653518A (en) Specific group discovery and expansion method based on microblog data
CN110781317B (en) Method and device for constructing event map and electronic equipment
CN103745000B (en) Hot topic detection method of Chinese micro-blogs
CN104615608B (en) A kind of data mining processing system and method
CN108681557B (en) Short text topic discovery method and system based on self-expansion representation and similar bidirectional constraint
CN106940732A (en) A kind of doubtful waterborne troops towards microblogging finds method
CN104484343A (en) Topic detection and tracking method for microblog
CN101980199A (en) Method and system for discovering network hot topic based on situation assessment
CN107291886A (en) A kind of microblog topic detecting method and system based on incremental clustering algorithm
CN103544255A (en) Text semantic relativity based network public opinion information analysis method
CN104239436A (en) Network hot event detection method based on text classification and clustering analysis
CN101127042A (en) Sensibility classification method based on language model
CN107066555A (en) Towards the online topic detection method of professional domain
Zhang et al. Enhancing traffic incident detection by using spatial point pattern analysis on social media
CN101819585A (en) Device and method for constructing forum event dissemination pattern
CN104866558A (en) Training method of social networking account mapping model, mapping method and system
CN103473262A (en) Automatic classification system and automatic classification method for Web comment viewpoint on the basis of association rule
CN104408033A (en) Text message extracting method and system
CN106484797A (en) Accident summary abstracting method based on sparse study
CN109918648B (en) Rumor depth detection method based on dynamic sliding window feature score
CN106503256B (en) A kind of hot information method for digging based on social networks document
CN107832467A (en) A kind of microblog topic detecting method based on improved Single pass clustering algorithms
CN104281565A (en) Semantic dictionary constructing method and device
CN104331523A (en) Conceptual object model-based question searching method
Romero et al. A framework for event classification in tweets based on hybrid semantic enrichment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160608