CN112241492B - Early identification method for multi-source heterogeneous online network topics - Google Patents
Early identification method for multi-source heterogeneous online network topics
- Publication number
- CN112241492B (application CN202011141881.0A)
- Authority
- CN
- China
- Prior art keywords
- network
- community
- short text
- complex network
- complex
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses an early identification method for multi-source heterogeneous online network topics, comprising the following steps: 1) obtaining a short text keyword set $D_0$; 2) constructing a complex network based on keyword overlap; 3) performing community structure division on the complex network constructed in step 2) using a dynamic community division method: the time interval $[t_0, t_{end}]$ is divided into intervals of the time increment $\Delta t$, the complex network at time $t_0 + \Delta t$ is constructed from the short texts newly crawled from the different online social network sources within $\Delta t$, and the complex network at time $t_0 + \Delta t$ is then divided into communities using the dynamic community division method; 4) computing statistics over the community division result of the complex network and constructing the finally discovered topic keyword set from them. The method can discover and extract topics at an early stage from short text data crawled from multiple online social network platforms.
Description
Technical Field
The invention belongs to the field of early identification of online network topics, and relates to an early identification method for multi-source heterogeneous online network topics.
Background
On the one hand, with the rapid and deep development of the internet, and of the mobile internet in particular, the internet has broken the space and time limits of traditional information exchange and overturned the traditional mode of information propagation. The role of internet users in propagation and diffusion has changed from information consumer to information spreader and even information producer, and information has gradually begun to spread between different online social network systems. The production, transmission and interaction of information across multi-source heterogeneous online networks are increasingly complex, which makes the early discovery of topics more difficult. At present, most topic discovery methods study the discovery and propagation patterns of hot topics, which leaves considerable room for research on early topic discovery.
On the other hand, network information sources and propagation channels are growing rapidly, and the scale and influence of online public opinion keep increasing. Determining early topics in heterogeneous online networks makes it easier for governments and regulatory departments to carry out timely and effective supervision and prevention.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an early identification method for multi-source heterogeneous online network topics, which can discover and extract topics at an early stage from short text data crawled from multiple online social network platforms.
To achieve this aim, the early identification method for multi-source heterogeneous online network topics comprises the following steps:
1) Analyze the structural characteristics of the different online social networks and design a distributed parallel crawler engine for them; crawl the original short texts published on the online social networks with the distributed parallel crawler engine; then preprocess the crawled short texts by Chinese word segmentation and text feature value extraction to obtain the short text keyword set $D_0$;
2) At the initial time $t_0$, use the short text keyword set $D_0$ to construct a complex network based on keyword overlap, according to the behavioral relationships between network users that the online social network texts represent;
3) Perform community structure division on the complex network constructed in step 2) using a dynamic community division method: divide the time interval $[t_0, t_{end}]$ into intervals of the time increment $\Delta t$; construct the complex network at time $t_0 + \Delta t$ from the short texts newly crawled from the different online social network sources within $\Delta t$; then divide the complex network at time $t_0 + \Delta t$ into communities using the dynamic community division method, thereby obtaining the community division of the complex network;
4) For each community in the community division result of the complex network, count the total number of users participating in the short texts represented by all nodes of that community, then sort the communities by this total to obtain the top $N$ communities;
5) Collect the keyword sets of the short texts in the top $N$ communities obtained in step 4), sort the collected keywords by TF-IDF, and construct the finally discovered topic keyword set from the top $n$ keywords of the ranking.
In step 1), the original short texts crawled from the online social networks include news headlines from news websites and microblog posts from microblog platforms, and the short text keyword set is constructed from the crawled short texts by Chinese word segmentation and TF-IDF text feature value extraction.
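As a rough illustration of the keyword extraction in step 1), the sketch below ranks the tokens of each short text by TF-IDF and keeps the highest-scoring ones as its keyword set. It assumes the texts have already been segmented into tokens (the patent does not name a segmenter), and the function name `tfidf_keywords`, the `top_k` parameter and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k TF-IDF keywords of each pre-segmented short text.

    docs: list of token lists (assumed output of a Chinese word segmenter).
    """
    n_docs = len(docs)
    # Document frequency: number of short texts that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    keyword_sets = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            tok: (count / len(doc)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keyword_sets.append(set(ranked[:top_k]))
    return keyword_sets


# Toy, already-segmented short texts (illustrative data only).
docs = [
    ["men", "basketball", "asian", "cup", "china", "iran"],
    ["china", "iran", "basketball", "final"],
    ["typhoon", "warning", "coast", "china"],
]
D0 = tfidf_keywords(docs, top_k=2)
```

A token that occurs in every short text (here "china") gets an IDF of zero and therefore never enters a keyword set, which is the usual reason for preferring TF-IDF over raw counts.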
Short texts serve as the nodes of the complex network, and the edges between nodes represent the association relations between short texts.
In the complex network, $i$ and $j$ denote short texts (microblog posts and news headlines) crawled before time $t_0$, and $C_i$ denotes the keyword set of short text $i$. $N_{ij}$ denotes the number of keywords shared by the keyword sets $C_i$ and $C_j$, that is, $N_{ij} = |C_i \cap C_j|$. $V_i$ is the network node representing short text $i$, and $E_{ij}$ is the association between short text $i$ and short text $j$: $N_{ij} = 0$ means there is no edge between short texts $i$ and $j$, and $N_{ij} > 0$ means there is an edge $E_{ij}$ between them with weight $N_{ij}$.
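The edge rule above can be sketched in a few lines. This is a minimal illustration under the stated definition $N_{ij} = |C_i \cap C_j|$; the function name `build_overlap_network` and the dictionary representation of the edges are assumptions, not part of the patent.

```python
from itertools import combinations


def build_overlap_network(keyword_sets):
    """Return {(i, j): N_ij} for every pair of short texts sharing keywords."""
    edges = {}
    for i, j in combinations(range(len(keyword_sets)), 2):
        n_ij = len(keyword_sets[i] & keyword_sets[j])  # N_ij = |C_i ∩ C_j|
        if n_ij > 0:  # N_ij = 0 means no edge between i and j
            edges[(i, j)] = n_ij
    return edges


# Toy keyword sets C_0, C_1, C_2 (illustrative data only).
C = [
    {"basketball", "cup", "iran"},
    {"basketball", "iran", "china"},
    {"typhoon", "coast"},
]
edges = build_overlap_network(C)
```

The pairwise loop is quadratic in the number of short texts; for a large crawl, an inverted index from keyword to text ids would avoid comparing texts that share nothing.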
In step 3), a static community discovery method is used to divide the complex network at time $t_0$ into communities.
The short texts and connection information newly added within the time increment $\Delta t$ are added to the complex network at time $t_0$ to form the new complex network at time $t_0 + \Delta t$.
The short texts and connection information newly added within $\Delta t$ are divided into two classes according to their relationship with the communities of the existing complex network: the first class is the set of newly added text nodes that are closely related to the existing communities, and the second class is the set of newly added text nodes that are only loosely related to them. The community membership of the first set with respect to the existing communities is determined from the modularity gain index $\Delta Q$; the second set is divided into communities with the static community division method, which determines the newly added communities. Together these realize the dynamic community division of the complex network.
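The patent text does not spell out the modularity formula. For orientation, the standard weighted modularity and the gain obtained by moving a single node $i$ out of its own singleton community into a community $c$ can be written as follows, taking the edge weights to be $N_{ij}$. That this is the exact form used by the inventors is an assumption.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ N_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j),
\qquad k_i = \sum_j N_{ij}, \qquad m = \frac{1}{2} \sum_{i,j} N_{ij}

\Delta Q = \frac{k_{i,\mathrm{in}}}{m} - \frac{\Sigma_{\mathrm{tot}} \, k_i}{2 m^{2}}
```

Here $k_{i,\mathrm{in}}$ is the total weight of the edges between node $i$ and the nodes of community $c$, $\Sigma_{\mathrm{tot}}$ is the total weighted degree of the nodes in $c$, and $\delta(c_i, c_j)$ is 1 when $i$ and $j$ are in the same community and 0 otherwise.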
The invention has the following beneficial effects:
In operation, the early identification method for multi-source heterogeneous online network topics crawls the original short texts published on the online social networks with the distributed parallel crawler engine and constructs the short text keyword set $D_0$ from them. The keyword set $D_0$ is then used to construct a complex network based on keyword overlap, and the community structure of this network is divided with a dynamic community division method. The complex network at time $t_0 + \Delta t$ is constructed from the short texts newly crawled from the different online social network sources within the time increment $\Delta t$ and is divided into communities in turn, which realizes community division on a time-varying dynamic network. Finally, the topic keyword set is extracted from the final community division result of the complex network, so that topics in multi-source online networks are discovered effectively and objectively.
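The time slicing described above can be illustrated with a small helper that cuts $[t_0, t_{end}]$ into windows of width $\Delta t$ and groups newly crawled short texts by window. It is a sketch only: the function names, the numeric timestamps and the half-open window convention are assumptions.

```python
def time_windows(t0, t_end, delta_t):
    """Yield consecutive (start, end) windows of width delta_t over [t0, t_end]."""
    if delta_t <= 0:
        raise ValueError("delta_t must be positive")
    start = t0
    while start < t_end:
        end = min(start + delta_t, t_end)
        yield start, end
        start = end


def batch_by_window(posts, t0, t_end, delta_t):
    """Group (timestamp, text_id) pairs by the window they fall into."""
    windows = list(time_windows(t0, t_end, delta_t))
    batches = [[] for _ in windows]
    for ts, text_id in posts:
        if t0 <= ts < t_end:  # half-open windows: an assumption of this sketch
            batches[int((ts - t0) // delta_t)].append(text_id)
    return windows, batches


# Toy crawl: (timestamp, text id) pairs (illustrative data only).
posts = [(1, "a"), (5, "b"), (9, "c"), (4, "d")]
windows, batches = batch_by_window(posts, t0=0, t_end=10, delta_t=4)
```

Each batch would then be merged into the network of the previous window before the dynamic community division is run again.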
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flowchart of a first embodiment.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings.
Referring to FIG. 1, the early identification method for multi-source heterogeneous online network topics provided by the invention comprises the following steps:
1) Analyze the structural characteristics of the different online social networks and design a distributed parallel crawler engine for them; crawl the original short texts published on the online social networks with the distributed parallel crawler engine, and preprocess them by Chinese word segmentation and text feature value extraction to obtain the short text keyword set $D_0$;
The original short texts crawled from the online social networks include news headlines from news websites and microblog posts from microblog platforms, and the short text keyword set is constructed from the crawled short texts by Chinese word segmentation and TF-IDF text feature value extraction.
2) At the initial time $t_0$, use the short text keyword set $D_0$ to construct a complex network based on keyword overlap, according to the behavioral relationships between network users that the online social network texts represent;
Short texts serve as the nodes of the complex network, and the edges between nodes represent the association relations between short texts. In the complex network, $i$ and $j$ denote short texts (microblog posts and news headlines) crawled before time $t_0$, and $C_i$ denotes the keyword set of short text $i$. $N_{ij}$ denotes the number of keywords shared by the keyword sets $C_i$ and $C_j$, that is, $N_{ij} = |C_i \cap C_j|$. $V_i$ is the network node representing short text $i$, and $E_{ij}$ is the association between short text $i$ and short text $j$: $N_{ij} = 0$ means there is no edge between short texts $i$ and $j$, and $N_{ij} > 0$ means there is an edge $E_{ij}$ between them with weight $N_{ij}$.
3) Perform community structure division on the complex network constructed in step 2) using a dynamic community division method: divide the time interval $[t_0, t_{end}]$ into intervals of the time increment $\Delta t$; construct the complex network at time $t_0 + \Delta t$ from the short texts newly crawled from the different online social network sources within $\Delta t$; then divide the complex network at time $t_0 + \Delta t$ into communities using the dynamic community division method, thereby obtaining the community division of the complex network;
Here, a static community discovery method is used to divide the complex network at time $t_0$ into communities.
The short texts and connection information newly added within the time increment $\Delta t$ are added to the complex network at time $t_0$ to form the new complex network at time $t_0 + \Delta t$. They are divided into two classes according to their relationship with the communities of the existing complex network: the first class is the set of newly added text nodes that are closely related to the existing communities, and the second class is the set of newly added text nodes that are only loosely related to them. The community membership of the first set with respect to the existing communities is determined from the modularity gain index $\Delta Q$; the second set is divided into communities with the static community division method, which determines the newly added communities. Together these realize the dynamic community division of the complex network.
The modularity gain index $\Delta Q$ is calculated as follows:
Each node $i$ in the newly added text node set is tentatively assigned to the community of each adjacent node $j$, and the modularity of the resulting complex network is computed. All nodes $i$ and $j$ are traversed, the maximum modularity gain $\max \Delta Q$ is extracted, the corresponding $i_{max}$ and $j_{max}$ are output, and the community structure of the complex network is finally determined.
4) For each community in the community division result of the complex network, count the total number of users participating in the short texts represented by all nodes of that community, then sort the communities by this total to obtain the top $N$ communities;
5) Collect the keyword sets of the short texts in the top $N$ communities obtained in step 4), sort the collected keywords by TF-IDF, and construct the finally discovered topic keyword set from the top $n$ keywords of the ranking.
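The modularity-gain assignment described in step 3) can be sketched as below: a newly added node is tried against the community of each of its neighbours, and the community with the largest gain wins. The sketch uses the standard weighted gain $\Delta Q = k_{i,\mathrm{in}}/m - \Sigma_{\mathrm{tot}} k_i / (2m^2)$, which the patent text does not state explicitly, so the formula, the function name `best_community` and the data layout are all assumptions.

```python
from collections import defaultdict


def best_community(new_edges, community, degree, total_weight):
    """Pick the neighbouring community with the largest modularity gain.

    new_edges:    {existing_node: N_ij} for the newly added node i
    community:    {existing_node: community_id}
    degree:       {existing_node: weighted degree before adding i}
    total_weight: sum of all edge weights before adding i
    """
    if not new_edges:
        return None, 0.0  # isolated node: nothing to join
    k_i = sum(new_edges.values())
    m = total_weight + k_i  # total edge weight once i is in the network
    # Weight of the edges from i into each neighbouring community (k_i,in).
    k_in = defaultdict(float)
    for j, w in new_edges.items():
        k_in[community[j]] += w
    # Total weighted degree of each community, counting the new edges.
    sigma_tot = defaultdict(float)
    for node, c in community.items():
        sigma_tot[c] += degree[node] + new_edges.get(node, 0)
    gains = {c: k_in[c] / m - sigma_tot[c] * k_i / (2 * m * m) for c in k_in}
    best = max(gains, key=gains.get)
    return best, gains[best]


# Toy network: a-b (weight 2) in community 0, c-d (weight 1) in community 1.
community = {"a": 0, "b": 0, "c": 1, "d": 1}
degree = {"a": 2, "b": 2, "c": 1, "d": 1}
# The new short text shares 2 keywords with "a" and 1 keyword with "c".
best, gain = best_community({"a": 2, "c": 1}, community, degree, total_weight=3)
```

How to treat a node whose best gain is not positive is left open here; one plausible reading of the method is that such nodes belong to the loosely related set and are clustered separately with the static method.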
Example one
Referring to FIG. 2, the specific operation process of this embodiment is:
1) Analyze the structural characteristics of the different online social networks and design a distributed parallel crawler engine for them; crawl the original short texts published on the online social networks with the distributed parallel crawler engine, and preprocess them by Chinese word segmentation and text feature value extraction to obtain the short text keyword set $D_0$;
The original short texts crawled from the online social networks include news headlines from news websites and microblog posts from microblog platforms, and the short text keyword set is constructed from the crawled short texts by Chinese word segmentation and TF-IDF text feature value extraction.
2) At the initial time $t_0$, use the short text keyword set $D_0$ to construct a complex network based on keyword overlap, according to the behavioral relationships between network users that the online social network texts represent;
Short texts serve as the nodes of the complex network, and the edges between nodes represent the association relations between short texts. In the complex network, $i$ and $j$ denote short texts (microblog posts and news headlines) crawled before time $t_0$, and $C_i$ denotes the keyword set of short text $i$. $N_{ij}$ denotes the number of keywords shared by the keyword sets $C_i$ and $C_j$, that is, $N_{ij} = |C_i \cap C_j|$. $V_i$ is the network node representing short text $i$, and $E_{ij}$ is the association between short text $i$ and short text $j$: $N_{ij} = 0$ means there is no edge between short texts $i$ and $j$, and $N_{ij} > 0$ means there is an edge $E_{ij}$ between them with weight $N_{ij}$.
3) Perform community structure division on the complex network constructed in step 2) using a dynamic community division method: divide the time interval $[t_0, t_{end}]$ into intervals of the time increment $\Delta t$; construct the complex network at time $t_0 + \Delta t$ from the short texts newly crawled from the different online social network sources within $\Delta t$; then divide the complex network at time $t_0 + \Delta t$ into communities using the dynamic community division method, thereby obtaining the community division of the complex network;
Here, a static community discovery method is used to divide the complex network at time $t_0$ into communities.
The short texts and connection information newly added within the time increment $\Delta t$ are added to the complex network at time $t_0$ to form the new complex network at time $t_0 + \Delta t$. They are divided into two classes according to their relationship with the communities of the existing complex network: the first class is the set of newly added text nodes that are closely related to the existing communities, and the second class is the set of newly added text nodes that are only loosely related to them. The community membership of the first set with respect to the existing communities is determined from the modularity gain index $\Delta Q$; the second set is divided into communities with the static community division method, which determines the newly added communities. Together these realize the dynamic community division of the complex network.
The modularity gain index $\Delta Q$ is calculated as follows:
Each node $i$ in the newly added text node set is tentatively assigned to the community of each adjacent node $j$, and the modularity of the resulting complex network is computed. All nodes $i$ and $j$ are traversed, the maximum modularity gain $\max \Delta Q$ is extracted, the corresponding $i_{max}$ and $j_{max}$ are output, and the community structure of the complex network is finally determined.
4) For each community in the final community division result of the complex network from step 3), count the total number of users participating in the short texts represented by all nodes of that community, and rank the communities by this total to output the top 1 community: C1: 391238.
5) Collect the keyword sets of all short texts in the top 1 community and take the 5 keywords with the highest TF-IDF from the corresponding keyword set;
the top-5 keyword set of community C1 is {men's basketball, suo mosaic, Iran, Chinese team, Asia};
6) Take the top n keywords of each community as the keyword set of the finally discovered topic;
the top-5 keyword set of community C1 is {men's basketball, sunday, Iran, Chinese team, Asia}, and the resulting topic is 'Chinese men's basketball Sunday'.
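The ranking and keyword selection used in this embodiment can be sketched as follows. The sketch assumes that a participating-user count is available for every short text and that a keyword's community-level score is the highest TF-IDF it reaches in any member text; the patent only says that keywords are sorted by TF-IDF, so that aggregation rule, the function name `top_topics` and the toy data are hypothetical.

```python
from collections import defaultdict


def top_topics(community, participants, keyword_scores, top_n=1, top_k=5):
    """Rank communities by total participating users and list top keywords.

    community:      {text_id: community_id}
    participants:   {text_id: number of participating users}
    keyword_scores: {text_id: {keyword: tfidf}}
    """
    totals = defaultdict(int)
    merged = defaultdict(dict)
    for text_id, c in community.items():
        totals[c] += participants[text_id]
        for kw, score in keyword_scores[text_id].items():
            # Keep the highest TF-IDF seen for a keyword within the community.
            merged[c][kw] = max(score, merged[c].get(kw, 0.0))
    ranked = sorted(totals, key=lambda c: (-totals[c], c))[:top_n]
    return [
        (c, totals[c], sorted(merged[c], key=lambda k: (-merged[c][k], k))[:top_k])
        for c in ranked
    ]


# Toy data (illustrative only, not taken from the patent's experiment).
community = {"t1": "C1", "t2": "C1", "t3": "C2"}
participants = {"t1": 120, "t2": 80, "t3": 50}
keyword_scores = {
    "t1": {"basketball": 0.9, "iran": 0.4},
    "t2": {"final": 0.7, "iran": 0.5},
    "t3": {"typhoon": 0.8},
}
result = top_topics(community, participants, keyword_scores, top_n=1, top_k=2)
```

With `top_n=1` and `top_k=5` the call mirrors the embodiment, which reports a single community and its five highest-ranked keywords.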
Claims (6)
1. An early identification method for multi-source heterogeneous online network topics, characterized by comprising the following steps:
1) Analyze the structural characteristics of the different online social networks and design a distributed parallel crawler engine for them; crawl the original short texts published on the online social networks with the distributed parallel crawler engine, and preprocess them by Chinese word segmentation and text feature value extraction to obtain the short text keyword set $D_0$;
2) At the initial time $t_0$, use the short text keyword set $D_0$ to construct a complex network based on keyword overlap, according to the behavioral relationships between network users that the online social network texts represent;
3) Perform community structure division on the complex network constructed in step 2) using a dynamic community division method: divide the time interval $[t_0, t_{end}]$ into intervals of the time increment $\Delta t$; construct the complex network at time $t_0 + \Delta t$ from the short texts newly crawled from the different online social network sources within $\Delta t$; then divide the complex network at time $t_0 + \Delta t$ into communities using the dynamic community division method, thereby obtaining the community division of the complex network;
4) For each community in the community division result of the complex network, count the total number of users participating in the short texts represented by all nodes of that community, then sort the communities by this total to obtain the top $N$ communities;
5) Collect the keyword sets of the short texts in the top $N$ communities obtained in step 4), sort the collected keywords by TF-IDF, and construct the finally discovered topic keyword set from the top $n$ keywords of the ranking.
2. The early identification method for multi-source heterogeneous online network topics according to claim 1, wherein in step 1) the original short texts crawled from the online social networks include news headlines from news websites and microblog posts from microblog platforms, and the short text keyword set is constructed from the crawled short texts by Chinese word segmentation and TF-IDF text feature value extraction.
3. The early identification method for multi-source heterogeneous online network topics according to claim 1, wherein short texts serve as the nodes of the complex network, and the edges between nodes represent the association relations between short texts;
in the complex network, $i$ and $j$ denote short texts (microblog posts and news headlines) crawled before time $t_0$, and $C_i$ denotes the keyword set of short text $i$; $N_{ij}$ denotes the number of keywords shared by the keyword sets $C_i$ and $C_j$, that is, $N_{ij} = |C_i \cap C_j|$; $V_i$ is the network node representing short text $i$, and $E_{ij}$ is the association between short text $i$ and short text $j$; $N_{ij} = 0$ means there is no edge between short texts $i$ and $j$, and $N_{ij} > 0$ means there is an edge $E_{ij}$ between them with weight $N_{ij}$.
6. The early identification method for multi-source heterogeneous online network topics according to claim 1, wherein the short texts and connection information newly added within the time increment $\Delta t$ are divided into two classes according to their relationship with the communities of the existing complex network, the first class being the set of newly added text nodes that are closely related to the existing communities and the second class being the set of newly added text nodes that are only loosely related to them; the community membership of the first set with respect to the existing communities is determined from the modularity gain index $\Delta Q$, and the second set is divided into communities with the static community division method, which determines the newly added communities and realizes the dynamic community division of the complex network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011141881.0A CN112241492B (en) | 2020-10-22 | 2020-10-22 | Early identification method for multi-source heterogeneous online network topics |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011141881.0A CN112241492B (en) | 2020-10-22 | 2020-10-22 | Early identification method for multi-source heterogeneous online network topics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112241492A (en) | 2021-01-19 |
CN112241492B (en) | 2023-04-07 |
Family
ID=74169687
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202011141881.0A | Active | CN112241492B (en) | 2020-10-22 | 2020-10-22 | Early identification method for multi-source heterogeneous online network topics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112241492B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104268230A (en) * | 2014-09-28 | 2015-01-07 | 福州大学 | Method for detecting objective points of Chinese micro-blogs based on heterogeneous graph random walk |
CN106055604A (en) * | 2016-05-25 | 2016-10-26 | 南京大学 | Short text topic model mining method based on word network to extend characteristics |
CN106372125A (en) * | 2016-08-24 | 2017-02-01 | 安阳师范学院 | Method for building case study model of educational technology microblog group under SNA perspective |
CN108804432A (en) * | 2017-04-26 | 2018-11-13 | 慧科讯业有限公司 | Method, system and device for discovering and tracking hot topics based on network media data streams |
CN110532390A (en) * | 2019-08-26 | 2019-12-03 | 南京邮电大学 | News keyword extraction method based on NER and complex network features |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20130098772A (en) * | 2012-02-28 | 2013-09-05 | 삼성전자주식회사 | Topic-based community index generation apparatus, topic-based community searching apparatus, topic-based community index generation method and topic-based community searching method |
Non-Patent Citations (2)
Title |
---|
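"Detecting popular topics in micro-blogging based on a user interest-based model"; Shuangyong Song et al.; International Joint Conference on Neural Networks; 20120730; full text * |
"Microblog topic detection method based on word co-occurrence relations and rough sets"; Lan Tian et al.; 《计算机系统应用》 (Computer Systems & Applications); 20160615; full text * |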
Also Published As
Publication number | Publication date |
---|---|
CN112241492A (en) | 2021-01-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106980692B (en) | Influence calculation method based on microblog specific events | |
CN103823844B (en) | Question forwarding system and question forwarding method on the basis of subjective and objective context and in community question-and-answer service | |
Wu et al. | User-as-Graph: User Modeling with Heterogeneous Graph Pooling for News Recommendation. | |
CN104991956B (en) | Microblog propagation group division and account activity evaluation method based on a topic probability model | |
US20150142820A1 (en) | Association strengths and value significances of ontological subjects of networks and compositions | |
CN103927398A (en) | Microblog hype group discovering method based on maximum frequent item set mining | |
Hristakieva et al. | The spread of propaganda by coordinated communities on social media | |
CN108230169B (en) | Information propagation model based on social influence and situation perception system and method | |
CN106992966B (en) | Network information transmission implementation method for true and false messages | |
CN106570763A (en) | User influence evaluation method and system | |
Arakawa et al. | Adding Twitter-specific features to stylistic features for classifying tweets by user type and number of retweets | |
CN106156117A (en) | Method and system for detecting and discovering hidden community core communication circles for a particular topic | |
CN113032557A (en) | Microblog hot topic discovery method based on frequent word set and BERT semantics | |
Sha et al. | Matching user accounts across social networks based on users message | |
CN114218457A (en) | False news detection method based on forward social media user representation | |
CN112241492B (en) | Early identification method for multi-source heterogeneous online network topics | |
CN115329078B (en) | Text data processing method, device, equipment and storage medium | |
Dong et al. | Online Burst Events Detection Oriented Real-Time Microblog Message Stream. | |
Hogan | Using Information Networks to Study Social Behavior: An Appraisal. | |
CN113849598A (en) | Social media false information detection method and system based on deep learning | |
Xiao et al. | Data analysis algorithms for mining online communities from microblogs | |
Zhao et al. | High-value user identification based on topic weight | |
Tu et al. | How to improve the rumor-confutation ability of official rumor-refuting account on social media: A Chinese case study | |
CN107577681A (en) | A kind of terrain analysis based on social media picture, recommend method and system | |
Liu et al. | Data Acquisition, Hot Issues and System of Microblog Mining |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||