CN106295681A - Event classification method and system based on a complex network label propagation algorithm

Publication number
CN106295681A
Authority
CN
China
Prior art keywords
event
classification
event information
network
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610621944.XA
Other languages
Chinese (zh)
Inventor
李平
彭欣宇
陈雁
胡栋
孙先
陈凯琪
朱鹏军
韩修龙
郭培伦
许斌
刘婷
朱婷婷
李永乐
林辉
黄飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Petroleum University
Original Assignee
Southwest Petroleum University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-08-02
Filing date
2016-08-02
Publication date
2017-01-04
Application filed by Southwest Petroleum University
Priority to CN201610621944.XA
Publication of CN106295681A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an event classification method and system based on a complex network label propagation algorithm. The method comprises: obtaining event information and extracting the keywords of the event information; calculating the ratio of shared keywords in any two pieces of event information; building an event network according to the ratio of shared keywords in any two pieces of event information; and classifying the event information according to the event network. The invention can effectively group similar event information, reduces the amount of computation needed to classify similar event information, and improves the accuracy with which similar events are grouped.

Description

Event classification method and system based on a complex network label propagation algorithm
Technical field
The present invention relates to the technical field of event classification, and in particular to an event classification method and system based on a complex network label propagation algorithm.
Background
With the spread of the Internet, the way information propagates has shifted from relying mainly on traditional media such as television and newspapers to relying on the Internet. Online media, as a new form of information dissemination, has therefore become part of people's daily lives, and users are more active in expressing opinions than ever before. Whether an event is domestic or international, it can give rise to online public opinion, through which people express views and spread ideas. This produces public-opinion pressure that no department or institution can ignore, which shows the importance of public-opinion systems. However, with so many opinions on the network, the content a crawler fetches from different websites often belongs to the same kind of data, and how to determine this algorithmically has become a problem in urgent need of a solution. Although some systems on the market address event classification, most of them rely on text analysis and suffer from problems such as heavy computation and low accuracy.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing an event classification method and system based on a complex network label propagation algorithm that reduces the amount of computation required for event classification and improves its accuracy.
The object of the present invention is achieved through the following technical solution: an event classification method based on a complex network label propagation algorithm, comprising:
Obtaining event information and extracting the keywords of the event information;
Calculating the ratio of shared keywords in any two pieces of event information;
Building an event network according to the ratio of shared keywords in any two pieces of event information;
Classifying the event information according to the event network.
The ratio of shared keywords in two pieces of event information is calculated as follows:
Ratio of shared keywords in two pieces of event information = number of shared keywords in the two pieces of event information / total number of keywords in the two pieces of event information.
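The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `shared_keyword_ratio` and the `total` parameter are hypothetical, and the text does not say whether the "total number of keywords" counts each distinct keyword once or adds the two keyword counts, so both readings are offered and the first is only an assumed default.

```python
from typing import Iterable


def shared_keyword_ratio(keywords_a: Iterable[str],
                         keywords_b: Iterable[str],
                         total: str = "union") -> float:
    """Ratio of shared keywords between two pieces of event information.

    total="union": the denominator counts each distinct keyword once
                   (an assumption; the source does not specify this).
    total="sum":   the denominator adds the two keyword counts
                   (an alternative reading of the same sentence).
    """
    a, b = set(keywords_a), set(keywords_b)
    shared = len(a & b)
    if total == "union":
        denominator = len(a | b)
    elif total == "sum":
        denominator = len(a) + len(b)
    else:
        raise ValueError("total must be 'union' or 'sum'")
    # Two events with no keywords at all share nothing; avoid dividing by zero.
    return shared / denominator if denominator else 0.0


# Hypothetical keyword lists for two pieces of event information.
event_a = ["earthquake", "sichuan", "rescue", "magnitude"]
event_b = ["earthquake", "sichuan", "rescue", "aftershock"]
ratio_union = shared_keyword_ratio(event_a, event_b)         # 3 shared / 5 distinct
ratio_sum = shared_keyword_ratio(event_a, event_b, "sum")    # 3 shared / 8 counted
```

Treating the keywords as sets means a keyword repeated within one event is counted once, which is also an assumption of this sketch.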
The event network is built as follows:
Each piece of event information is taken as one node in the event network;
For any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold: if it is greater than the edge threshold, an edge is added between the nodes corresponding to the two pieces of event information.
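The construction rule can be illustrated with a short Python sketch that turns precomputed ratios into an undirected adjacency list. The function name `build_event_network`, the node identifiers, and the ratio values are all hypothetical and chosen only for illustration; the one detail taken from the text is that an edge requires a ratio strictly greater than the threshold.

```python
from typing import Dict, Iterable, List, Tuple


def build_event_network(nodes: Iterable[str],
                        ratios: Dict[Tuple[str, str], float],
                        edge_threshold: float) -> Dict[str, List[str]]:
    """Return an undirected adjacency list for the event network."""
    # Every piece of event information becomes a node, even if it stays isolated.
    adjacency: Dict[str, List[str]] = {node: [] for node in nodes}
    for (u, v), ratio in ratios.items():
        # An edge is added only when the ratio is strictly greater than the threshold.
        if ratio > edge_threshold:
            adjacency[u].append(v)
            adjacency[v].append(u)
    return adjacency


# Hypothetical shared-keyword ratios for four pieces of event information.
example_ratios = {
    ("e1", "e2"): 0.75,
    ("e1", "e3"): 0.10,
    ("e2", "e3"): 0.60,   # equal to the threshold, so no edge is added
    ("e3", "e4"): 0.80,
}
network = build_event_network(["e1", "e2", "e3", "e4"], example_ratios, 0.6)
```

With these values, `e1` and `e2` are linked, `e3` and `e4` are linked, and no edge joins the two pairs.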
The event information is classified as follows:
S1. Assign a category to each node in the event network;
S2. For each node, find the category that contains the most nodes among the categories of all its neighbor nodes;
S3. Change the category of each node to the category that contains the most nodes among its neighbor nodes;
S4. Repeat steps S2 to S3 until the category of every node in the event network no longer changes.
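Steps S1 to S4 can be sketched as follows. The text leaves several details open, so this sketch makes assumptions that are not in the source: each node starts with its own identifier as its category, nodes are updated one at a time in place (if all nodes were updated simultaneously, two linked nodes could swap categories forever), ties keep the current category when it is among the most frequent and otherwise take the smallest one, and `max_rounds` is a hypothetical safety cap. The function name `propagate_labels` is also hypothetical.

```python
from collections import Counter
from typing import Dict, List


def propagate_labels(adjacency: Dict[int, List[int]],
                     max_rounds: int = 100) -> Dict[int, int]:
    """Label propagation over an undirected adjacency list."""
    # S1: give every node its own category to start with (assumption).
    labels = {node: node for node in adjacency}
    for _ in range(max_rounds):
        changed = False
        for node, neighbors in adjacency.items():
            if not neighbors:
                continue  # an isolated node keeps its own category
            # S2: count how many neighbors carry each category.
            counts = Counter(labels[n] for n in neighbors)
            best = max(counts.values())
            candidates = [label for label, c in counts.items() if c == best]
            # Tie-breaking (assumption): keep the current category if it is
            # already one of the most frequent ones.
            if labels[node] in candidates:
                continue
            # S3: adopt the most frequent neighbor category (smallest on ties).
            labels[node] = min(candidates)
            changed = True
        # S4: stop once a full round changes nothing.
        if not changed:
            break
    return labels


# Two separate triangles plus one isolated node (hypothetical network).
example_network = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],
    3: [4, 5], 4: [3, 5], 5: [3, 4],
    6: [],
}
example_labels = propagate_labels(example_network)
```

In this example each triangle ends up with a single category of its own and the isolated node keeps its initial category.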
An event classification system based on a complex network label propagation algorithm, comprising: an information acquisition module for obtaining event information; a keyword extraction module for extracting the keywords of each piece of event information; a shared-keyword calculation module for calculating the ratio of shared keywords in any two pieces of event information; an event network building module for building an event network according to the ratio of shared keywords in any two pieces of event information; and an event information classification module for classifying the event information according to the event network.
The ratio of shared keywords in two pieces of event information is calculated as follows: ratio of shared keywords in two pieces of event information = number of shared keywords in the two pieces of event information / total number of keywords in the two pieces of event information.
The event network is built as follows:
Each piece of event information is taken as one node in the event network;
For any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold: if it is greater than the edge threshold, an edge is added between the nodes corresponding to the two pieces of event information.
The event information is classified as follows:
S1. Assign a category to each node in the event network;
S2. For each node, find the category that contains the most nodes among the categories of all its neighbor nodes;
S3. Change the category of each node to the category that contains the most nodes among its neighbor nodes;
S4. Repeat steps S2 to S3 until the category of every node in the event network no longer changes.
The beneficial effects of the present invention are: the invention can effectively group similar event information, reduces the amount of computation needed to classify similar event information, and improves the accuracy with which similar events are grouped.
Brief description of the drawings
Fig. 1 is a flow chart of the event classification method based on a complex network label propagation algorithm in the present invention;
Fig. 2 is a flow chart of building the event network in the present invention;
Fig. 3 is a flow chart of classifying the event information in the present invention;
Fig. 4 is a schematic block diagram of the event classification system based on a complex network label propagation algorithm in the present invention.
Detailed description
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings, but the scope of protection of the present invention is not limited to the following.
As shown in Fig. 1, an event classification method based on a complex network label propagation algorithm comprises:
Step 1: obtain event information and extract the keywords of the event information.
Step 2: calculate the ratio of shared keywords in any two pieces of event information. The ratio of shared keywords in two pieces of event information is calculated as follows:
Ratio of shared keywords in two pieces of event information = number of shared keywords in the two pieces of event information / total number of keywords in the two pieces of event information.
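A worked example may help show how the formula behaves. The symbols K_A and K_B (the keyword sets of two pieces of event information A and B), r, and N are notation introduced here for illustration and do not appear in the source. The text does not say whether the "total number of keywords" N counts distinct keywords or adds the two keyword counts, so both readings are shown for two events with four keywords each, three of them shared.

```latex
\[
r(A,B) \;=\; \frac{\lvert K_A \cap K_B \rvert}{N(A,B)},
\qquad \lvert K_A \rvert = \lvert K_B \rvert = 4,
\quad \lvert K_A \cap K_B \rvert = 3
\]
% Reading 1 (assumption): N counts each distinct keyword once.
\[
N = \lvert K_A \cup K_B \rvert = 4 + 4 - 3 = 5
\quad\Rightarrow\quad r = \tfrac{3}{5} = 0.6
\]
% Reading 2 (assumption): N adds the two keyword counts.
\[
N = \lvert K_A \rvert + \lvert K_B \rvert = 8
\quad\Rightarrow\quad r = \tfrac{3}{8} = 0.375
\]
% Under reading 2 the ratio is bounded by one half:
\[
\frac{\lvert K_A \cap K_B \rvert}{\lvert K_A \rvert + \lvert K_B \rvert}
\;\le\; \frac{\min(\lvert K_A \rvert, \lvert K_B \rvert)}{\lvert K_A \rvert + \lvert K_B \rvert}
\;\le\; \frac{1}{2}
\]
```

Under the second reading the ratio can never exceed 0.5, so it could not exceed the edge threshold of 0.6 used in Embodiment one. That suggests, without settling the question, that the total is meant to count distinct keywords or that shared keywords are counted once per event.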
Step 3: build an event network according to the ratio of shared keywords in any two pieces of event information. As shown in Fig. 2, the event network is built as follows:
Each piece of event information is taken as one node in the event network;
For any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold: if it is greater than the edge threshold, an edge is added between the nodes corresponding to the two pieces of event information.
Step 4: classify the event information according to the event network. As shown in Fig. 3, the event information is classified as follows:
S1. Assign a category to each node in the event network;
S2. For each node, find the category that contains the most nodes among the categories of all its neighbor nodes;
S3. Change the category of each node to the category that contains the most nodes among its neighbor nodes;
S4. Repeat steps S2 to S3 until the category of every node in the event network no longer changes.
Embodiment one
In this embodiment, news items crawled from different portal websites are classified, with the edge threshold set to 0.6, through the following steps:
Use a crawler to fetch news from each portal website, and extract the keywords of every news item fetched;
Traverse every news item fetched, calculate its ratio of shared keywords with each of the remaining news items, and save the results;
Convert all fetched news items into nodes in a network, that is, take every news item as one node, and then connect the nodes with edges as follows: determine whether the ratio of shared keywords of the news items corresponding to two nodes is greater than the edge threshold 0.6; if so, connect the two nodes with an edge, otherwise leave them unconnected;
Run the label propagation algorithm to group similar news items, as follows: a. assign a category to each node in the network; b. for each node, find the category that contains the most nodes among the categories of all its neighbor nodes; c. change the category of each node to the category that contains the most nodes among its neighbor nodes; d. repeat steps b to c until the category of every node in the network no longer changes.
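The steps of this embodiment can be strung together in a compact sketch. Crawling and keyword extraction are outside its scope, so the keyword sets below are hypothetical stand-ins for crawled news items. The sketch also assumes that the denominator of the ratio counts distinct keywords, that each node starts in its own category, that nodes are updated one at a time, and that ties keep the current category or else take the alphabetically first one. None of these choices is stated in the source.

```python
from collections import Counter
from itertools import combinations

EDGE_THRESHOLD = 0.6  # edge threshold used in this embodiment

# Hypothetical keyword sets standing in for crawled news items.
news_keywords = {
    "n1": {"flood", "river", "evacuation", "rain"},
    "n2": {"flood", "river", "evacuation", "rain", "dam", "rescue"},
    "n3": {"evacuation", "rain", "dam", "rescue"},
    "n4": {"election", "vote", "candidate", "poll"},
    "n5": {"election", "vote", "candidate", "poll", "debate"},
}


def ratio(a, b):
    # Shared keywords over distinct keywords (assumed reading of the formula).
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Compare every news item with all remaining items and save the results.
ratios = {(u, v): ratio(news_keywords[u], news_keywords[v])
          for u, v in combinations(sorted(news_keywords), 2)}

# One node per news item; an edge only where the ratio exceeds the threshold.
adjacency = {name: [] for name in news_keywords}
for (u, v), r in ratios.items():
    if r > EDGE_THRESHOLD:
        adjacency[u].append(v)
        adjacency[v].append(u)

# Label propagation (steps a to d), capped at 100 rounds as a safety net.
labels = {name: name for name in adjacency}
for _ in range(100):
    changed = False
    for name, neighbors in adjacency.items():
        if not neighbors:
            continue
        counts = Counter(labels[n] for n in neighbors)
        top = max(counts.values())
        best = sorted(label for label, c in counts.items() if c == top)
        if labels[name] not in best:
            labels[name] = best[0]
            changed = True
    if not changed:
        break

# Collect the news items that ended up in the same category.
groups = {}
for name, label in labels.items():
    groups.setdefault(label, []).append(name)
clusters = sorted(sorted(members) for members in groups.values())
```

In this toy data `n1` and `n3` share too few keywords to be linked directly, yet they end up in the same category because both are linked to `n2`.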
As shown in Fig. 4, an event classification system based on a complex network label propagation algorithm comprises an information acquisition module, a keyword extraction module, a shared-keyword calculation module, an event network building module, and an event information classification module.
The information acquisition module is used to obtain event information.
The keyword extraction module is used to extract the keywords of each piece of event information.
The shared-keyword calculation module is used to calculate the ratio of shared keywords in any two pieces of event information. The ratio is calculated as follows: ratio of shared keywords in two pieces of event information = number of shared keywords in the two pieces of event information / total number of keywords in the two pieces of event information.
The event network building module is used to build an event network according to the ratio of shared keywords in any two pieces of event information. The event network is built as follows: each piece of event information is taken as one node in the event network; for any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold, and if it is, an edge is added between the nodes corresponding to the two pieces of event information.
The event information classification module is used to classify the event information according to the event network. The event information is classified as follows:
S1. Assign a category to each node in the event network;
S2. For each node, find the category that contains the most nodes among the categories of all its neighbor nodes;
S3. Change the category of each node to the category that contains the most nodes among its neighbor nodes;
S4. Repeat steps S2 to S3 until the category of every node in the event network no longer changes.
The above is only a preferred embodiment of the present invention. It should be understood that the present invention is not limited to the form disclosed herein, which should not be regarded as excluding other embodiments. The invention can be used in various other combinations, modifications, and environments, and can be modified within the scope of the concept described herein through the above teachings or the skill or knowledge of the related art. Changes and variations made by those skilled in the art that do not depart from the spirit and scope of the present invention shall all fall within the scope of protection of the appended claims.

Claims (8)

1. An event classification method based on a complex network label propagation algorithm, characterized by comprising:
Obtaining event information and extracting the keywords of the event information;
Calculating the ratio of shared keywords in any two pieces of event information;
Building an event network according to the ratio of shared keywords in any two pieces of event information;
Classifying the event information according to the event network.
2. The event classification method based on a complex network label propagation algorithm according to claim 1, characterized in that the ratio of shared keywords in two pieces of event information is calculated as follows:
Ratio of shared keywords in two pieces of event information = number of shared keywords in the two pieces of event information / total number of keywords in the two pieces of event information.
3. The event classification method based on a complex network label propagation algorithm according to claim 1, characterized in that the event network is built as follows:
Each piece of event information is taken as one node in the event network;
For any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold: if it is greater than the edge threshold, an edge is added between the nodes corresponding to the two pieces of event information.
4. The event classification method based on a complex network label propagation algorithm according to claim 1, characterized in that the event information is classified as follows:
S1. Assign a category to each node in the event network;
S2. For each node, find the category that contains the most nodes among the categories of all its neighbor nodes;
S3. Change the category of each node to the category that contains the most nodes among its neighbor nodes;
S4. Repeat steps S2 to S3 until the category of every node in the event network no longer changes.
5. An event classification system based on a complex network label propagation algorithm, characterized by comprising:
An information acquisition module for obtaining event information;
A keyword extraction module for extracting the keywords of each piece of event information;
A shared-keyword calculation module for calculating the ratio of shared keywords in any two pieces of event information;
An event network building module for building an event network according to the ratio of shared keywords in any two pieces of event information;
An event information classification module for classifying the event information according to the event network.
6. The event classification system based on a complex network label propagation algorithm according to claim 5, characterized in that the ratio of shared keywords in two pieces of event information is calculated as follows:
Ratio of shared keywords in two pieces of event information = number of shared keywords in the two pieces of event information / total number of keywords in the two pieces of event information.
7. The event classification system based on a complex network label propagation algorithm according to claim 5, characterized in that the event network is built as follows:
Each piece of event information is taken as one node in the event network;
For any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold: if it is greater than the edge threshold, an edge is added between the nodes corresponding to the two pieces of event information.
8. The event classification system based on a complex network label propagation algorithm according to claim 5, characterized in that the event information is classified as follows:
S1. Assign a category to each node in the event network;
S2. For each node, find the category that contains the most nodes among the categories of all its neighbor nodes;
S3. Change the category of each node to the category that contains the most nodes among its neighbor nodes;
S4. Repeat steps S2 to S3 until the category of every node in the event network no longer changes.
CN201610621944.XA 2016-08-02 2016-08-02 Event classification method and system based on a complex network label propagation algorithm Pending CN106295681A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610621944.XA CN106295681A (en) 2016-08-02 2016-08-02 Event classification method and system based on a complex network label propagation algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610621944.XA CN106295681A (en) 2016-08-02 2016-08-02 Event classification method and system based on a complex network label propagation algorithm

Publications (1)

Publication Number Publication Date
CN106295681A (en) 2017-01-04

Family

ID=57663968

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610621944.XA Pending CN106295681A (en) 2016-08-02 2016-08-02 Event classification method and system based on a complex network label propagation algorithm

Country Status (1)

Country Link
CN (1) CN106295681A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109934698A (en) * 2019-01-29 2019-06-25 华融融通(北京)科技有限公司 A kind of fraud related network feature extracting method propagated based on label

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103327092A (en) * 2012-11-02 2013-09-25 中国人民解放军国防科学技术大学 Cell discovery method and system on information networks
CN103745000A (en) * 2014-01-24 2014-04-23 福州大学 Hot topic detection method of Chinese micro-blogs
KR20140137521A (en) * 2013-05-23 2014-12-03 고영선 Tag profile system for node expansion of Social Network
CN104199852A (en) * 2014-08-12 2014-12-10 上海交通大学 Label propagation community structure mining method based on node membership degree
CN105677906A (en) * 2015-05-07 2016-06-15 浚鸿数据开发股份有限公司 Automatic collecting and analyzing system and method for network events
CN105677648A (en) * 2014-11-18 2016-06-15 四三九九网络股份有限公司 Community detection method and system based on label propagation algorithm

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20170104