CN103488683A - Microblog data management system and implementation method thereof - Google Patents

Info

Publication number
CN103488683A
Authority
CN
China
Prior art keywords
user
module
good friend
grouping
community
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310367762.0A
Other languages
Chinese (zh)
Other versions
CN103488683B (en)
Inventor
王静远
高飞
李超
欧阳元新
熊璋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2013-08-21
Filing date
2013-08-21
Publication date
2014-01-01
Application filed by Beihang University
Priority to CN201310367762.0A
Publication of CN103488683A
Application granted
Publication of CN103488683B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a microblog data management system and an implementation method thereof, which provide microblog users with a service for managing microblog data through automatic friend grouping. The system consists of five modules, i.e. a user authorization module, a data extraction module, a community structure finding module, a grouping analysis and exhibition module, a feedback module and a microblog data management module. The system and method solve the problems of traditional manual microblog data management, which is time-consuming, laborious and hard to maintain. The user's friends are grouped intelligently by means of community detection, which offers high accuracy and can discover overlapping communities. The method parses the result to give the user an intuitive, easily understood basis for the friend grouping. In addition, the system provides a feedback mechanism, and incorporating user feedback further improves the reliability of the system.

Description

A microblog data management system and implementation method thereof
Technical field
The present invention relates to a microblog data management system based on community detection technology and to an implementation method thereof, and belongs to the technical field of data mining.
Background art
In social networks such as microblogs, as the number of a user's friends grows, the user faces a large amount of information every day. For a user with many friends, a good way to manage this data is to create groups that mirror the user's real-life social circles and to manage friends according to the group each belongs to. Once groups exist, operations such as information filtering and privacy settings can be applied per group. At present, major microblog service providers such as Tencent Weibo and Sina Weibo offer this mechanism for managing data. However, existing methods rely mainly on the user grouping friends by hand. This is time-consuming and requires a great deal of manual work from the user, the groups are hard to keep up to date when the user adds new friends, and manual management is prone to operating errors.
Summary of the invention
Technical problem solved by the invention: to overcome the deficiencies of the prior art by providing a microblog data management system and method that mine latent grouping information efficiently and accurately, so that users can manage their microblog data easily.
Technical solution of the invention: a microblog data management system, as shown in Fig. 1, comprising:
User authorization module: performs authorization using the OAuth protocol. Owing to the security mechanism provided by OAuth, the system never comes into contact with the user's private information.
Data capture module: uses the API provided by the microblog service to obtain the relationship data among the user's friends and the user profile data. The user's friends are captured first. Then, for each friend, the friends that it has in common with the user are captured, which yields the relationships among all friends and forms a user social network built from friend relationships. The input of this module is the user's microblog user name, and the output is the user social network, in which each node represents one of the user's friends and each edge represents a relationship between two of the user's friends. The resulting network is written to the database, where the community structure mining module can read it;
Community structure mining module: takes the graph of friend relationships obtained by the data capture module and, using community detection on the social relationships among friends, mines its latent community structure as the basis for grouping. A community is a set of friends such that friend relationships are dense among friends within the community and sparse between friends of different communities. The community detection technique used by this module consists of two parts, basic community structure search and community aggregation. The module requires no parameters to be set by the user. The input of this module is the friend network obtained by the data capture module, and the friend groups it produces are output to the group parsing and display module;
Group parsing and display module: parses the friend groups produced by the community structure mining module. Its purpose is to mine the semantic information of each group intelligently. Based on this semantic information, groups are abstracted into four broad categories: celebrities, friends, classmates and colleagues. For each group produced by the community structure mining module, the parsing step uses the group members' profile data, microblog content and reposting relationships to determine the category of the group. The display step then presents the results of the community structure mining module and of the parsing step to the user as the group parsing result.
Feedback module: provides a feedback entry for each friend group and collects the user's evaluation. The user rates the effect of the system with a score; the user ID, the grouping result and the user feedback are stored in the database as one record, providing a basis for later improving the system and the user experience.
A microblog data management method, comprising the following steps:
(1) User authorization: perform authorization using the OAuth protocol and obtain the user's microblog user name;
(2) Data capture: based on the user's microblog user name, use the API provided by the microblog service to obtain the relationship data among the user's friends and the user profile data. Specifically, first capture the user's friends; then, for each friend, capture the friends it has in common with the user, obtaining the relationships among all friends and forming a user social network built from friend relationships, in which each node represents one of the user's friends and each edge represents a relationship between two of the user's friends. The resulting network is written to the database;
(3) Community structure mining: apply community detection to the friend network obtained in step (2). First perform a depth-first search on the network to mine its basic community structures, then aggregate the basic community structures hierarchically, so that the latent community structure is mined from the social relationships among friends and serves as the basis for grouping. A community is a set of friends such that friend relationships are dense within a community and sparse between communities. This yields the user's friend groups;
(4) Group parsing and display: parse the friend groups produced in step (3) in order to mine the semantic information of each group intelligently. Groups are abstracted into four broad categories: celebrities, friends, classmates and colleagues. For each friend group produced in step (3), the group members' profile data, microblog content and reposting relationships are used to determine the category of the group, which is presented to the user as the basis for the grouping;
(5) Feedback: provide a feedback entry for each friend group and collect user feedback, providing a basis for later improving the system and the user experience.
Compared with the prior art, the present invention has the following advantages:
(1) The invention analyzes the user's friend relationships automatically and mines the latent groups, so that microblog data can be managed by group. The whole process requires no manual intervention, which spares the user a large amount of tedious, repetitive work, saves time and improves efficiency.
(2) In mining groups automatically, the invention applies community detection theory and techniques and uses only the friend relationships between users, not information such as user profiles, thereby avoiding grouping errors caused by incomplete or outdated profile data.
(3) The invention parses the grouping result into categories that users readily understand, so that users can manage their microblog data intuitively.
Description of the drawings
Fig. 1 is an architecture diagram of the system of the present invention;
Fig. 2 is a flowchart of the data capture module of the present invention;
Fig. 3 is a flowchart of the community structure mining module of the present invention;
Fig. 4 is a flowchart of the group parsing and display module of the present invention.
Detailed description of the embodiments
As shown in Fig. 1, the microblog data management system and method of the present invention, based on community detection technology, consist of a user authorization module, a data capture module, a community structure mining module, a group parsing and display module and a user feedback module.
The specific implementation of each module is as follows:
1. User authorization module
(1) The user enters his or her account;
(2) The account is sent to the microblog server for verification. If verification succeeds, an access token (accesstoken) is returned and authorization is complete; the data capture module uses this access token to obtain permission to capture data.
2. Data capture module, as shown in Fig. 2:
(1) Initialize a hash table H for storing the user social network data, and obtain the user's follow list, list;
(2) For each uid in the follow list obtained in step (1), fetch the list of accounts that uid and the user both follow, list2, and initialize the entry of the hash table with key uid to an empty set;
(3) Add each uid2 in the common follow list list2 obtained in step (2) to the set stored under key uid in the hash table;
(4) Repeat step (3) until every entry in the common follow list has been processed;
(5) Repeat step (2) until every entry in the user's follow list has been processed;
(6) Write the data in the hash table to the database and output it to the community structure mining module; data capture is then complete.
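The capture steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described by the patent: `get_follow_list` and `get_common_follows` are hypothetical callables standing in for whatever the microblog API provides, the sample data is invented, and the database write of step (6) is omitted.

```python
from typing import Callable, Dict, Iterable, Set


def build_friend_network(
    user_id: str,
    get_follow_list: Callable[[str], Iterable[str]],          # hypothetical API wrapper
    get_common_follows: Callable[[str, str], Iterable[str]],  # hypothetical API wrapper
) -> Dict[str, Set[str]]:
    """Return hash table H: friend uid -> set of the user's friends that uid follows."""
    network: Dict[str, Set[str]] = {}            # step (1): initialise H
    for uid in get_follow_list(user_id):         # steps (2) and (5)
        network[uid] = set()                     # empty set stored under key uid
        for uid2 in get_common_follows(user_id, uid):  # steps (3) and (4)
            network[uid].add(uid2)
    return network


# In-memory stand-in for the remote service (invented data, for illustration only).
FOLLOWS = {
    "me": ["alice", "bob", "carol"],
    "alice": ["bob", "carol", "dave"],
    "bob": ["alice"],
    "carol": [],
}


def fake_follow_list(uid: str):
    return FOLLOWS.get(uid, [])


def fake_common_follows(user: str, friend: str):
    # Accounts followed by both `user` and `friend`.
    mine = set(FOLLOWS.get(user, []))
    return [u for u in FOLLOWS.get(friend, []) if u in mine]


H = build_friend_network("me", fake_follow_list, fake_common_follows)
```

Passing the two lookups in as callables keeps the sketch independent of any particular microblog API; in the toy data, "dave" is dropped from alice's entry because the user does not follow him.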
3. Community structure mining module, as shown in Fig. 3:
(1) Initialize an empty set c, take one entry from the hash table produced by the data capture module, and perform steps (2) and (3);
(2) This step is a recursive depth-first traversal. Add the key of the hash table entry chosen in step (1) to the set c. Take each uid in turn from the value stored under the chosen key. If the uid is not yet in the set, check whether it appears in the hash value of every uid in the set and whether every uid in the set appears in its own hash value. If both conditions hold, add it to the set and continue step (2) starting from this uid;
(3) If the set now contains more than 3 elements, a community structure c has been found; save it in the community set S. Continue the loop from step (1).
(4) Set the threshold threshold = 0.99. For the community structures obtained in the first three steps, compute the similarity between any two community structures c_i and c_j as:
$$\mathrm{similarity}(c_i, c_j) = \frac{\sum_{m,n \in c_i \cup c_j} X_{m,n}}{|c_i \cup c_j| \cdot \left(|c_i \cup c_j| - 1\right)}$$
where
$$X_{m,n} = \begin{cases} 1, & \langle m,n \rangle \in E \\ 0, & \langle m,n \rangle \notin E \end{cases} \qquad m, n \in V$$
In these formulas, V is the set of all users in the user social network and E is the set of follow relationships between users. X_{m,n} indicates whether user m in community structures c_i, c_j follows user n: if so, X_{m,n} = 1; otherwise X_{m,n} = 0;
(4.1) If the similarity is greater than threshold, merge the two community structures. After every pair of community structures has been computed and processed, go to step (4.2);
(4.2) Reduce the threshold: threshold = threshold - 0.05;
(4.3) Check the value of threshold: if it is greater than 0.27, go to step (4.1); otherwise, go to step (5);
(5) Output the resulting community structures to the group parsing and display module.
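The two phases of this module, basic community search and threshold-driven aggregation, can be sketched in Python as below. It is an illustrative reading of the steps, with several assumptions labeled in the comments: duplicate basic communities are dropped, a merge pass at one threshold is repeated until no pair qualifies, and the thresholds are iterated as integer percentages to avoid floating-point drift. The helper `mutual_network` and the toy data are invented for the example.

```python
from typing import Dict, Iterable, List, Set

Network = Dict[str, Set[str]]  # uid -> set of uids that uid follows


def grow_community(network: Network, start: str) -> Set[str]:
    """Steps (1)-(2): recursive depth-first growth of a mutually connected set."""
    community = {start}

    def visit(key: str) -> None:
        for uid in sorted(network.get(key, ())):  # sorted only for reproducibility
            if uid in community:
                continue
            # uid must be followed by, and must follow, every current member
            if all(uid in network.get(m, ()) and m in network.get(uid, ())
                   for m in community):
                community.add(uid)
                visit(uid)

    visit(start)
    return community


def basic_communities(network: Network) -> List[Set[str]]:
    """Step (3): keep sets with more than 3 members (dropping duplicates is an assumption)."""
    found: List[Set[str]] = []
    for key in sorted(network):
        c = grow_community(network, key)
        if len(c) > 3 and c not in found:
            found.append(c)
    return found


def similarity(network: Network, ci: Set[str], cj: Set[str]) -> float:
    """Step (4): follow links inside the union divided by |U| * (|U| - 1)."""
    union = ci | cj
    if len(union) < 2:
        return 0.0
    links = sum(1 for m in union for n in union
                if m != n and n in network.get(m, ()))
    return links / (len(union) * (len(union) - 1))


def merge_communities(network: Network, communities: Iterable[Set[str]]) -> List[Set[str]]:
    """Steps (4.1)-(4.3): merge pairs while lowering the threshold from 0.99 in 0.05 steps."""
    groups = [set(c) for c in communities]
    for t in range(99, 27, -5):  # 0.99, 0.94, ..., 0.29 (all values above 0.27)
        threshold = t / 100
        merged = True
        while merged:  # assumption: rescan after each merge
            merged = False
            for i in range(len(groups)):
                for j in range(i + 1, len(groups)):
                    if similarity(network, groups[i], groups[j]) > threshold:
                        groups[i] = groups[i] | groups[j]
                        del groups[j]
                        merged = True
                        break
                if merged:
                    break
    return groups


def mutual_network(cliques: Iterable[Set[str]]) -> Network:
    """Toy helper (not from the source): everyone in a clique follows everyone else."""
    net: Network = {}
    for clique in cliques:
        for m in clique:
            net.setdefault(m, set()).update(n for n in clique if n != m)
    return net


demo = mutual_network([{"a", "b", "c", "d"}, {"e", "f", "g", "h"}])
basic = basic_communities(demo)
groups = merge_communities(demo, basic)
```

In this toy input the two disjoint four-member cliques have similarity 24/56, about 0.43, so they remain separate at the higher thresholds and are merged once the threshold falls to 0.39.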
4. Group parsing and display module, as shown in Fig. 4:
(1) For each community structure mined by the community structure mining module, store its members' profile data in vectors, where each dimension of a vector represents one item of profile information;
(2) For each group, find the dimension in which the largest number of vectors share the same value;
(2.1) If that dimension is school and the matching members exceed half of the group, the group is classified as classmates;
(2.2) If that dimension is workplace and the matching members exceed half of the group, the group is classified as colleagues;
(2.3) If that dimension is verified status ("V" badge) and the matching members exceed half of the group, the group is classified as celebrities;
(2.4) Otherwise, the group is classified as friends.
(3) Semantic parsing is complete, and the result is displayed.
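The classification rule above amounts to a majority vote over profile dimensions, which the following Python sketch illustrates. The profile field names `school`, `company` and `verified` are hypothetical, as are the choices to ignore empty values and to break ties in the order school, company, verified; the source does not specify these details.

```python
from collections import Counter
from typing import Dict, List

# Profile dimension -> category; the field names are assumptions for illustration.
LABELS = {"school": "classmates", "company": "colleagues", "verified": "celebrities"}


def classify_group(profiles: List[Dict[str, object]]) -> str:
    """Label a group by the dimension whose most common value is shared by most members."""
    best_dim, best_count = None, 0
    for dim in LABELS:
        # Ignore missing or empty values (and verified == False).
        values = [p.get(dim) for p in profiles if p.get(dim)]
        if not values:
            continue
        count = Counter(values).most_common(1)[0][1]
        if count > best_count:  # strict comparison: earlier dimensions win ties
            best_dim, best_count = dim, count
    # Steps (2.1)-(2.3): the shared value must cover more than half of the group.
    if best_dim is not None and best_count * 2 > len(profiles):
        return LABELS[best_dim]
    return "friends"  # step (2.4)


example = classify_group([
    {"school": "Beihang University", "company": "A"},
    {"school": "Beihang University", "company": "B"},
    {"school": "Other University", "company": "C"},
])
```

Here two of three members share a school, which is more than half of the group, so `example` is "classmates".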
5. Feedback module
(1) Obtain the user feedback data, i.e. the user's rating (1 to 5 points);
(2) Store the feedback information in the database.
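A feedback record of the kind described (user ID, grouping result, rating) could be stored as in the sketch below. The choice of SQLite, the table and column names, and the JSON encoding of the grouping result are all assumptions made for illustration; the source only states that the record is stored in a database.

```python
import json
import sqlite3
from typing import Dict, Set


def open_feedback_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the (hypothetical) feedback table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feedback ("
        "user_id TEXT NOT NULL, grouping_json TEXT NOT NULL, rating INTEGER NOT NULL)"
    )
    return conn


def save_feedback(conn: sqlite3.Connection, user_id: str,
                  grouping: Dict[str, Set[str]], rating: int) -> None:
    """Store one record: user ID, grouping result and a rating from 1 to 5."""
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    payload = json.dumps(
        {label: sorted(members) for label, members in grouping.items()},
        sort_keys=True,
    )
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO feedback VALUES (?, ?, ?)",
                     (user_id, payload, rating))


conn = open_feedback_store()
save_feedback(conn, "u1", {"classmates": {"b", "a"}}, 4)
rows = conn.execute("SELECT user_id, grouping_json, rating FROM feedback").fetchall()
```

Serializing member sets as sorted lists keeps stored records comparable across runs, which may help when ratings are later analyzed against grouping results.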
Parts of the present invention that are not described in detail are well known in the art.

Claims (2)

1. A microblog data management system, characterized by comprising a user authorization module, a data capture module, a community structure mining module, a group parsing and display module and a feedback module, wherein:
User authorization module: performs authorization using the OAuth protocol and obtains the user's microblog user name;
Data capture module: based on the user's microblog user name, uses the API provided by the microblog service to obtain the relationship data among the user's friends and the user profile data; specifically, the user's friends are captured first; then, for each friend, the friends it has in common with the user are captured, which yields the relationships among all friends and forms a user social network built from friend relationships; the resulting user social network is stored in the database and output to the community structure mining module for its use;
Community structure mining module: applies community detection to the friend network obtained by the data capture module and mines the latent community structure from the social relationships among friends, as the basis for grouping the user's friends; the community detection technique used consists of two parts, basic community structure search and community aggregation, and the resulting friend groups are output to the group parsing and display module;
Group parsing and display module: parses the friend groups produced by the community structure mining module; based on the semantic information of the friend groups, groups are abstracted into four broad categories: celebrities, friends, classmates and colleagues; for each friend group produced by the community structure mining module, the parsing step uses the group members' profile data to determine the category of the group; the display step presents the results of the community structure mining module and of the parsing step to the user as the group parsing result;
Feedback module: provides a feedback entry for each friend group and collects the user's evaluation; the user rates the effect of the system with a score, and the user ID, the grouping result and the user feedback are stored in the database as one record, providing a basis for later improving the system and the user experience.
2. A microblog data management method, characterized by comprising the following steps:
(1) User authorization: perform authorization using the OAuth protocol and obtain the user's microblog user name;
(2) Data capture: based on the user's microblog user name, use the API provided by the microblog service to obtain the relationship data among the user's friends and the user profile data; specifically, first capture the user's friends; then, for each friend, capture the friends it has in common with the user, obtaining the relationships among all friends and forming a user social network built from friend relationships, in which each node represents one of the user's friends and each edge represents a relationship between two of the user's friends; the resulting network is written to the database;
(3) Community structure mining: apply community detection to the friend network obtained in step (2); first perform a depth-first search on the network to mine its basic community structures, then aggregate the basic community structures hierarchically, so that the latent community structure is mined from the social relationships among friends and serves as the basis for grouping; a community is a set of friends such that friend relationships are dense within a community and sparse between communities; this yields the user's friend groups;
(4) Group parsing and display: parse the friend groups produced in step (3) in order to mine the semantic information of each group intelligently; groups are abstracted into four broad categories: celebrities, friends, classmates and colleagues; for each friend group produced in step (3), the group members' profile data, microblog content and reposting relationships are used to determine the category of the group, which is presented to the user as the basis for the grouping;
(5) Feedback: provide a feedback entry for each friend group and collect user feedback, providing a basis for later improving the system and the user experience.
CN201310367762.0A 2013-08-21 2013-08-21 Microblog data management system and implementation method thereof Active CN103488683B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310367762.0A CN103488683B (en) 2013-08-21 2013-08-21 Microblog data management system and implementation method thereof

Publications (2)

Publication Number Publication Date
CN103488683A true CN103488683A (en) 2014-01-01
CN103488683B CN103488683B (en) 2017-05-10

Family

ID=49828909

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310367762.0A Active CN103488683B (en) 2013-08-21 2013-08-21 Microblog data management system and implementation method thereof

Country Status (1)

Country Link
CN (1) CN103488683B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009092222A1 (en) * 2007-12-27 2009-07-30 Tencent Technology (Shenzhen) Company Limited A method,a client and a communication system for sharing a communication object
CN102122291A (en) * 2011-01-18 2011-07-13 浙江大学 Blog friend recommendation method based on tree log pattern analysis
CN102708176A (en) * 2012-05-08 2012-10-03 山东大学 Microblog data mining method based on active users

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104052651A (en) * 2014-06-03 2014-09-17 西安交通大学 Method and device for building social contact group
CN104052651B (en) * 2014-06-03 2017-09-12 西安交通大学 A kind of method and apparatus for setting up social groups
CN104202319A (en) * 2014-08-28 2014-12-10 北京淘友天下科技发展有限公司 Method and device for social relation recommendation
CN104202319B (en) * 2014-08-28 2018-05-29 北京淘友天下科技发展有限公司 A kind of social networks recommend method and device
CN104965878A (en) * 2015-06-12 2015-10-07 微梦创科网络科技(中国)有限公司 Method and device for carrying out user work unit digging based on grouped information
CN104965878B (en) * 2015-06-12 2018-11-27 微梦创科网络科技(中国)有限公司 A kind of method and device carrying out the excavation of user job unit based on grouping information
CN105262822A (en) * 2015-10-28 2016-01-20 维沃移动通信有限公司 Method and apparatus for assisting user to identify identity of friend
CN105430020A (en) * 2015-12-31 2016-03-23 南京邮电大学 Access group-based privacy protection-supporting access authorization method
CN106411572A (en) * 2016-09-06 2017-02-15 山东大学 Community discovery method combining node information and network structure
CN106411572B (en) * 2016-09-06 2019-05-07 山东大学 A kind of community discovery method of combination nodal information and network structure
CN109783715A (en) * 2019-01-08 2019-05-21 鑫涌算力信息科技(上海)有限公司 Network crawler system and method

Also Published As

Publication number Publication date
CN103488683B (en) 2017-05-10

Similar Documents

Publication Publication Date Title
CN103488683A (en) Microblog data management system and implementation method thereof
CN103678613B (en) Method and device for calculating influence data
CN104660594B (en) A kind of virtual malicious node and its Network Recognition method towards social networks
CN103795613B (en) Method for predicting friend relationships in online social network
CN102722709B (en) Method and device for identifying garbage pictures
CN106372072A (en) Location-based recognition method for user relations in mobile social network
CN106778876A (en) User classification method and system based on mobile subscriber track similitude
CN106372239A (en) Social network event correlation analysis method based on heterogeneous network
WO2014107988A1 (en) Method and system for discovering and analyzing micro-blog user group structure
CN102646122B (en) Automatic building method of academic social network
CN103179198B (en) Based on the topic influence individual method for digging of many relational networks
CN104915397A (en) Method and device for predicting microblog propagation tendencies
CN104268648B (en) Merge user's ranking system of a variety of interactive information of user and user's subject information
CN103631862B (en) Event characteristic evolution excavation method and system based on microblogs
CN103618652A (en) Audit and depth analysis system and audit and depth analysis method of business data
CN104765729A (en) Cross-platform micro-blogging community account matching method
CN110009416A (en) A kind of system based on big data cleaning and AI precision marketing
CN111611309A (en) Interactive visualization method for call ticket data relation network
CN104182422A (en) Unified address book information processing method and system
CN110019694A (en) Method, apparatus and computer readable storage medium for knowledge mapping
CN104317794A (en) Chinese feature word association pattern mining method based on dynamic project weight and system thereof
CN109299340B (en) Microblog user forwarding relation importing and visualizing method based on graph database
CN101840423B (en) Bill accuracy auditing system based on pair trading principle and data mining technology
CN105589916A (en) Extraction method for explicit and implicit interest knowledge
CN110704698B (en) Correlation and query method for unstructured massive network security data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant