CN103488683A - Microblog data management system and implementation method thereof - Google Patents
Microblog data management system and implementation method thereof
- Publication number
- CN103488683A (application CN201310367762.0A)
- Authority
- CN
- China
- Prior art keywords
- user
- module
- friend
- grouping
- community
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to a microblog data management system and an implementation method thereof, which provide microblog users with a service for managing microblog data through automatic friend grouping. The system consists of five modules, i.e. a user authorization module, a data extraction module, a community structure finding module, a grouping analysis and exhibition module, a feedback module and a microblog data management module. The system and method solve the problems that traditional manual microblog data management is time-consuming, laborious and hard to maintain; the user's friends are grouped intelligently by a community discovery technique, which gives high accuracy and can find overlapping communities; the results are analysed to give the user an intuitive, easily understood basis for each friend group; in addition, the system provides a feedback mechanism that brings user feedback into the system to further improve its reliability.
Description
Technical field
The present invention relates to a microblog data management system based on community discovery technology and an implementation method thereof, and belongs to the field of data mining technology.
Background art
In social networks such as microblogs, the amount of information a user faces every day grows with the number of the user's friends. For a user with many friends, a good way to manage this data is to set up groups that mirror the user's real-life social circles and to manage friends according to the groups they belong to. Once groups are set up, information filtering, privacy settings and similar operations can be performed per group. At present, major microblog service providers such as Tencent Weibo and Sina Weibo all provide this mechanism for managing data. However, the existing approach relies mainly on the user grouping friends by hand. This is time-consuming and requires a great deal of manual effort from the user. It is also hard to keep up to date when the user adds new friends. In addition, manual management carries a risk of operating errors.
Summary of the invention
Technical problem solved by the present invention: to overcome the deficiencies of the prior art by providing a microblog data management system and method that mine potential grouping information efficiently and accurately, so that a user can manage his or her microblog data conveniently.
Technical solution of the present invention: a microblog data management system which, as shown in Fig. 1, comprises:
User authorization module: authorization is performed using the OAuth protocol. Because of the security mechanism that OAuth provides, the system does not come into contact with the user's private information.
Data capture module: uses the API provided by the microblog platform to obtain the relationship data among the user's friends and the users' profile data. The user's friends are captured first. Then, for each friend, the friends that he or she has in common with the user are captured, which yields the relationships among all friends and forms a user social relationship network made up of friend relations. The input of this module is the user's user name on the microblog platform, and the output is the user social relationship network, in which each node represents one friend of the user and each edge between nodes represents a relation between two friends of the user. The resulting network is written to a database for the community structure mining module to use;
Community structure mining module: takes the graph of friend relations obtained by the data capture module and, applying community detection to the social relations among the friends, mines its potential community structure as the basis for grouping. A community is a set of friends such that friend relations are dense within the community and sparse between communities. The community detection technique used by this module consists of two parts, basic community structure search and community aggregation. The user does not need to set any parameters. The input of this module is the friend relationship network obtained by the data capture module, and the friend groups it produces are output to the group parsing and display module;
Group parsing and display module: parses the user friend groups produced by the community structure mining module. The role of this module is to mine the semantic information of each group intelligently. According to this semantic information, groups are abstracted into four broad categories: celebrities, friends, classmates and colleagues. For each group produced by the community structure mining module, the parsing step uses the group members' profiles, microblog content and reposting relationships to determine the group's category. The display step then presents the results of the community structure mining module and of the parsing step to the user as the group parsing result.
Feedback module: a feedback entry is provided for each user friend group to collect the user's evaluation. The user rates the effect of the system with a score; the user id, grouping result and user feedback are stored in the database as one record, providing a basis for improving the system and the user experience in the future.
A microblog data management method, implemented in the following steps:
(1) User authorization: authorization is performed using the OAuth protocol, and the user's user name on the microblog platform is obtained;
(2) Data capture: based on the user's user name on the microblog platform, the API provided by the platform is used to obtain the relationship data among the user's friends and the users' profile data. Specifically, the user's friends are captured first; then, for each friend, the friends that he or she has in common with the user are captured, which yields the relationships among all friends and forms a user social relationship network made up of friend relations. Each node in the network represents one friend of the user, and each edge between nodes represents a relation between two friends of the user. The resulting network is written to a database;
(3) Community structure mining: community detection is applied to the friend relationship network obtained in step (2). A depth-first search is first carried out on the network to mine its basic community structures, and the basic community structures are then aggregated hierarchically. In this way the potential community structure is mined from the social relations among the friends and serves as the basis for grouping. A community is a set of friends such that friend relations are dense within the community and sparse between communities. The result is the set of user friend groups;
(4) Group parsing and display: the user friend groups produced in step (3) are parsed in order to mine the semantic information of each group intelligently. Groups are abstracted into four broad categories: celebrities, friends, classmates and colleagues. For each user friend group produced in step (3), the group members' profiles, microblog content and reposting relationships are used to determine the group's category, which is presented to the user as the basis for the grouping;
(5) Feedback: a feedback entry is provided for each user friend group, and user feedback is collected to provide a basis for improving the system and the user experience in the future.
Compared with the prior art, the present invention has the following advantages:
(1) The present invention analyses the user's friend relations automatically and mines the potential groups, so that microblog data can be managed by group. The whole process requires no manual intervention, which spares the user a large amount of tedious, repetitive work, saves time and improves efficiency.
(2) In automatically mining groups, the present invention adopts community detection theory and techniques and uses only the friend relation information between users, not information such as user profiles, thereby avoiding grouping errors caused by incomplete or outdated profiles.
(3) The present invention parses the grouping result into categories that the user can easily understand, so that the user can manage microblog data intuitively on that basis.
Brief description of the drawings
Fig. 1 is the architecture diagram of the system of the present invention;
Fig. 2 is the implementation flow chart of the data capture module of the present invention;
Fig. 3 is the implementation flow chart of the community structure mining module of the present invention;
Fig. 4 is the implementation flow chart of the group parsing and display module of the present invention.
Detailed description of the embodiments
As shown in Fig. 1, the microblog data management system and method based on community discovery technology of the present invention consist of a user authorization module, a data capture module, a community structure mining module, a group parsing and display module and a user feedback module.
The specific implementation of each module is as follows:
1. User authorization module
(1) the user enters his or her account;
(2) the account is sent to the microblog server for verification; if verification succeeds, an access token (accesstoken) is returned and authorization is complete. The data capture module uses this accesstoken to obtain permission to capture data.
2. Data capture module, as shown in Fig. 2:
(1) initialize a hash table H for storing the user social relationship network data, and obtain the user's following list, list;
(2) for each uid in the following list list obtained in (1), fetch the list of accounts followed by both that uid and the user, list2, and initialize the entry of the hash table whose key is uid with an empty set as its value;
(3) add each uid2 in the common following list list2 obtained in (2) to the set stored under key uid in the hash table;
(4) repeat step (3) until every entry in the common following list has been processed;
(5) repeat step (2) until every entry in the user's following list has been processed;
(6) write the data in the hash table to the database and output it to the community structure mining module; data capture ends.
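Steps (1) to (5) above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: `get_following` and `get_common_following` are hypothetical callables standing in for whatever the microblog API actually provides, the toy data replaces real API responses, and the database write of step (6) is left out.

```python
from typing import Callable, Dict, Iterable, Set


def build_friend_network(
    user: str,
    get_following: Callable[[str], Iterable[str]],
    get_common_following: Callable[[str, str], Iterable[str]],
) -> Dict[str, Set[str]]:
    """Build hash table H: friend uid -> set of uids followed by both."""
    network: Dict[str, Set[str]] = {}            # step (1): hash table H
    for uid in get_following(user):              # steps (2) and (5)
        network[uid] = set()                     # empty set under key uid
        for uid2 in get_common_following(user, uid):   # steps (3) and (4)
            network[uid].add(uid2)
    return network


# Toy stand-ins for the microblog API (hypothetical data).
_FOLLOWS = {
    "me": ["a", "b", "c"],
    "a": ["b", "x"],
    "b": ["a", "c"],
    "c": ["b"],
}


def toy_following(u: str):
    return _FOLLOWS.get(u, [])


def toy_common(u: str, v: str):
    mine = set(_FOLLOWS.get(u, []))
    return [w for w in _FOLLOWS.get(v, []) if w in mine]


network = build_friend_network("me", toy_following, toy_common)
```

Passing the API access in as callables keeps the sketch independent of any particular platform; account `x` is dropped because the user does not follow it.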
3. Community structure mining module, as shown in Fig. 3:
(1) initialize an empty set c, take the hash table produced by the data capture module, select one of its entries, and carry out (2) to (3);
(2) this step is a recursive depth-first traversal. Add the key of the hash table entry selected in (1) to the set. From the value stored under the selected key in the hash table, take each uid in turn and check whether it is already in the set. If it is not, check whether it appears in the hash value of every uid in the set, and whether every uid in the set appears in its own hash value. If both conditions hold, add it to the set and, starting from this uid, continue to carry out (2);
(3) if the set now contains more than 3 elements, a community structure c has been found; save it in the community set S. Continue looping from step (1).
(4) set the threshold threshold = 0.99. For the community structures obtained in the first three steps, calculate the similarity between any two community structures $c_i$ and $c_j$; the similarity formula and its auxiliary definition are not preserved in this text.
In the formula, E is the set of all users in the user social relationship network and V is the set of follow relations between all users. $X_{m,n}$ indicates whether user m in community structures $c_i$, $c_j$ follows user n: if so, $X_{m,n} = 1$; otherwise $X_{m,n} = 0$;
(4.1) if the similarity is greater than threshold, merge the two community structures; once every pair of community structures has been evaluated, go to step (4.2);
(4.2) reduce threshold: threshold = threshold - 0.05;
(4.3) check threshold: if it is greater than 0.27, go to step (4.1); otherwise, go to step (5);
(5) output the resulting community structures to the group parsing and display module.
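The search and aggregation steps can be illustrated with the following Python sketch. It is a hedged reading of steps (1) to (5), not the authoritative algorithm. The similarity formula is not available in this text, so `cross_follow_ratio` is an assumed stand-in (the share of member pairs m, n with $X_{m,n} = 1$) that can be swapped for any other function. Skipping communities that were already found is also an assumption.

```python
from typing import Callable, Dict, List, Set

Network = Dict[str, Set[str]]


def grow(network: Network, start: str, community: Set[str]) -> None:
    """Step (2): recursive depth-first growth of one basic community."""
    community.add(start)
    for uid in network.get(start, ()):
        if uid in community:
            continue
        # uid joins only if it and every current member list each other.
        if all(uid in network.get(m, ()) and m in network.get(uid, ())
               for m in community):
            grow(network, uid, community)


def basic_communities(network: Network) -> List[Set[str]]:
    """Steps (1) and (3): try every hash-table entry as a seed."""
    found: List[Set[str]] = []
    for key in network:
        c: Set[str] = set()
        grow(network, key, c)
        # Keep sets with more than 3 members; dropping repeats is assumed.
        if len(c) > 3 and c not in found:
            found.append(c)
    return found


def cross_follow_ratio(ci: Set[str], cj: Set[str], network: Network) -> float:
    """Assumed similarity: share of pairs (m, n) where m follows n."""
    pairs = [(m, n) for m in ci for n in cj if m != n]
    if not pairs:
        return 0.0
    return sum(n in network.get(m, ()) for m, n in pairs) / len(pairs)


def merge_communities(
    communities: List[Set[str]],
    network: Network,
    similarity: Callable[[Set[str], Set[str], Network], float] = cross_follow_ratio,
) -> List[Set[str]]:
    """Steps (4) to (4.3): merge while lowering the threshold from 0.99."""
    result = [set(c) for c in communities]
    threshold = 0.99
    while True:
        merged = True
        while merged:                       # step (4.1): rescan after a merge
            merged = False
            for i in range(len(result)):
                for j in range(i + 1, len(result)):
                    if similarity(result[i], result[j], network) > threshold:
                        other = result.pop(j)
                        result[i] |= other
                        merged = True
                        break
                if merged:
                    break
        threshold -= 0.05                   # step (4.2)
        if threshold <= 0.27:               # step (4.3)
            return result


def clique(names: List[str]) -> Network:
    """Toy helper: everyone in `names` is linked to everyone else."""
    return {n: set(names) - {n} for n in names}


toy = {**clique(["a1", "a2", "a3", "a4"]), **clique(["b1", "b2", "b3", "b4"])}
groups = merge_communities(basic_communities(toy), toy)
```

On the toy network the two cliques share no relations, so they remain separate groups at every threshold level.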
4. Group parsing and display module, as shown in Fig. 4:
(1) for each community structure mined by the community structure mining module, store its members' user profiles in vectors, where each dimension of a vector represents one item of profile information;
(2) for each group, find the dimension in which the largest number of member vectors share the same value;
(2.1) if that dimension is school and the members sharing the value make up more than half of the group, the group is classified as classmates;
(2.2) if that dimension is workplace and the members sharing the value make up more than half of the group, the group is classified as colleagues;
(2.3) if that dimension is verified status (the "V" badge) and the members sharing the value make up more than half of the group, the group is classified as celebrities;
(2.4) otherwise, the group is classified as friends.
(3) semantic parsing is complete, and the results are displayed.
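The classification rule in steps (2) to (2.4) might be sketched as below. The profile field names `school`, `work` and `verified` are hypothetical, as is the choice to count only non-empty values and to resolve ties in dictionary order; the text does not specify these details.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical profile fields mapped to the categories named in the text.
LABELS = {"school": "classmates", "work": "colleagues", "verified": "celebrities"}


def classify_group(profiles: List[Dict[str, object]]) -> str:
    """Return the category of one friend group from its members' profiles."""
    best_dim, best_count = None, 0
    for dim in LABELS:
        # Only non-empty values are counted (an assumption).
        values = [p.get(dim) for p in profiles if p.get(dim)]
        if not values:
            continue
        count = Counter(values).most_common(1)[0][1]   # size of largest match
        if count > best_count:
            best_dim, best_count = dim, count
    # The winning dimension must cover more than half of the group.
    if best_dim is not None and best_count > len(profiles) / 2:
        return LABELS[best_dim]
    return "friends"


example = classify_group([
    {"school": "S1"},
    {"school": "S1"},
    {"school": "S1", "work": "W1"},
    {"school": "S2"},
])
```

Here three of four members share the same school, so the group is labelled as classmates.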
5. Feedback module
(1) obtain the user feedback data, i.e. the user's score (1 to 5 points);
(2) store the feedback information in the database.
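As an illustration of storing one feedback record (user id, grouping result, score), the following sketch uses Python's built-in `sqlite3` module. The table name, column names and JSON encoding of the grouping result are assumptions made for the example; the text does not name a database or schema.

```python
import json
import sqlite3


def save_feedback(conn: sqlite3.Connection, user_id: str,
                  groups: dict, score: int) -> None:
    """Store one record: user id, grouping result and a 1-5 score."""
    if not 1 <= score <= 5:
        raise ValueError("score must be between 1 and 5")
    # Hypothetical schema; created on first use.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feedback "
        "(user_id TEXT, groups_json TEXT, score INTEGER)"
    )
    conn.execute(
        "INSERT INTO feedback VALUES (?, ?, ?)",
        (user_id, json.dumps(groups, sort_keys=True), score),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
save_feedback(conn, "u1", {"classmates": ["a", "b"]}, 4)
rows = conn.execute(
    "SELECT user_id, groups_json, score FROM feedback"
).fetchall()
```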
Parts of the present invention that are not described in detail belong to techniques well known in the art.
Claims (2)
1. A microblog data management system, characterized by comprising a user authorization module, a data capture module, a community structure mining module, a group parsing and display module and a feedback module, wherein:
User authorization module: performs authorization using the OAuth protocol and obtains the user's user name on the microblog platform;
Data capture module: based on the user's user name on the microblog platform, uses the API provided by the platform to obtain the relationship data among the user's friends and the users' profile data; specifically, the user's friends are captured first; then, for each friend, the friends that he or she has in common with the user are captured, which yields the relationships among all friends and forms a user social relationship network made up of friend relations; the resulting user social relationship network is stored in a database and output for the community structure mining module to use;
Community structure mining module: for the friend relationship network obtained by the data capture module, applies community detection to the social relations among the friends and mines its potential community structure as the basis for user friend grouping; the community detection technique adopted consists of two parts, basic community structure search and community aggregation, and the user friend groups produced are output to the group parsing and display module;
Group parsing and display module: parses the user friend groups produced by the community structure mining module; according to the semantic information of the user friend groups, groups are abstracted into four broad categories: celebrities, friends, classmates and colleagues; for each user friend group produced by the community structure mining module, the parsing step uses the group members' profiles to determine the group's category; the display step presents the results of the community structure mining module and of the parsing step to the user as the group parsing result;
Feedback module: provides a feedback entry for each user friend group to collect the user's evaluation; the user rates the effect of the system with a score, and the user id, grouping result and user feedback are stored in the database as one record, providing a basis for improving the system and the user experience in the future.
2. A microblog data management method, characterized in that it is implemented in the following steps:
(1) User authorization: authorization is performed using the OAuth protocol, and the user's user name on the microblog platform is obtained;
(2) Data capture: based on the user's user name on the microblog platform, the API provided by the platform is used to obtain the relationship data among the user's friends and the users' profile data; specifically, the user's friends are captured first; then, for each friend, the friends that he or she has in common with the user are captured, which yields the relationships among all friends and forms a user social relationship network made up of friend relations; each node in the network represents one friend of the user, each edge between nodes represents a relation between two friends of the user, and the resulting network is written to a database;
(3) Community structure mining: community detection is applied to the friend relationship network obtained in step (2); a depth-first search is first carried out on the network to mine its basic community structures, and the basic community structures are then aggregated hierarchically; in this way the potential community structure is mined from the social relations among the friends and serves as the basis for grouping, where a community is a set of friends such that friend relations are dense within the community and sparse between communities, thereby obtaining the user friend groups;
(4) Group parsing and display: the user friend groups produced in step (3) are parsed in order to mine the semantic information of each group intelligently; groups are abstracted into four broad categories: celebrities, friends, classmates and colleagues; for each user friend group produced in step (3), the group members' profiles, microblog content and reposting relationships are used to determine the group's category, which is presented to the user as the basis for the grouping;
(5) Feedback: a feedback entry is provided for each user friend group, and user feedback is collected to provide a basis for improving the system and the user experience in the future.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310367762.0A CN103488683B (en) | 2013-08-21 | 2013-08-21 | Microblog data management system and implementation method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103488683A true CN103488683A (en) | 2014-01-01 |
CN103488683B CN103488683B (en) | 2017-05-10 |
Family
ID=49828909
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310367762.0A Active CN103488683B (en) | 2013-08-21 | 2013-08-21 | Microblog data management system and implementation method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103488683B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009092222A1 (en) * | 2007-12-27 | 2009-07-30 | Tencent Technology (Shenzhen) Company Limited | A method,a client and a communication system for sharing a communication object |
CN102122291A (en) * | 2011-01-18 | 2011-07-13 | 浙江大学 | Blog friend recommendation method based on tree log pattern analysis |
CN102708176A (en) * | 2012-05-08 | 2012-10-03 | 山东大学 | Microblog data mining method based on active users |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104052651A (en) * | 2014-06-03 | 2014-09-17 | 西安交通大学 | Method and device for building social contact group |
CN104052651B (en) * | 2014-06-03 | 2017-09-12 | 西安交通大学 | A kind of method and apparatus for setting up social groups |
CN104202319A (en) * | 2014-08-28 | 2014-12-10 | 北京淘友天下科技发展有限公司 | Method and device for social relation recommendation |
CN104202319B (en) * | 2014-08-28 | 2018-05-29 | 北京淘友天下科技发展有限公司 | A kind of social networks recommend method and device |
CN104965878A (en) * | 2015-06-12 | 2015-10-07 | 微梦创科网络科技(中国)有限公司 | Method and device for carrying out user work unit digging based on grouped information |
CN104965878B (en) * | 2015-06-12 | 2018-11-27 | 微梦创科网络科技(中国)有限公司 | A kind of method and device carrying out the excavation of user job unit based on grouping information |
CN105262822A (en) * | 2015-10-28 | 2016-01-20 | 维沃移动通信有限公司 | Method and apparatus for assisting user to identify identity of friend |
CN105430020A (en) * | 2015-12-31 | 2016-03-23 | 南京邮电大学 | Access group-based privacy protection-supporting access authorization method |
CN106411572A (en) * | 2016-09-06 | 2017-02-15 | 山东大学 | Community discovery method combining node information and network structure |
CN106411572B (en) * | 2016-09-06 | 2019-05-07 | 山东大学 | A kind of community discovery method of combination nodal information and network structure |
CN109783715A (en) * | 2019-01-08 | 2019-05-21 | 鑫涌算力信息科技(上海)有限公司 | Network crawler system and method |
Also Published As
Publication number | Publication date |
---|---|
CN103488683B (en) | 2017-05-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||