CN106295681A - Event classification method and system based on a complex network label propagation algorithm - Google Patents
Event classification method and system based on a complex network label propagation algorithm
- Publication number
- CN106295681A (application CN201610621944.XA)
- Authority
- CN
- China
- Prior art keywords
- event
- classification
- event information
- network
- node
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an event classification method and system based on a complex network label propagation algorithm. The method comprises: obtaining event information and extracting keywords from the event information; calculating the ratio of shared keywords for any two pieces of event information; building an event network according to those ratios; and classifying the event information according to the event network. The invention groups similar event information effectively, reduces the amount of computation needed during classification, and improves the accuracy with which similar events are grouped.
Description
Technical field
The present invention relates to the technical field of event classification, and in particular to an event classification method and system based on a complex network label propagation algorithm.
Background
With the spread of the Internet, information that once travelled mainly through traditional media such as television and newspapers now travels mainly online. Online media, as a new form of information dissemination, have become part of people's daily lives, and users are more vocal than ever before. Whether an event is domestic or international, online public opinion forms around it: people express views and spread ideas through the network, and the resulting pressure of public opinion has reached a level that no department or institution can ignore. This shows the importance of a public opinion system. However, with so many opinions on the network, the content that a crawler collects from different websites is often about the same event, and how to determine this algorithmically has become a problem that urgently needs solving. Some systems for event classification are already on the market, but most of them rely on text analysis and suffer from problems such as heavy computation and low accuracy.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing an event classification method and system based on a complex network label propagation algorithm that reduces the amount of computation in event classification and improves its accuracy.
The object of the invention is achieved through the following technical solution: an event classification method based on a complex network label propagation algorithm, comprising:
Obtaining event information and extracting keywords from the event information;
Calculating the ratio of shared keywords for any two pieces of event information;
Building an event network according to the ratio of shared keywords for any two pieces of event information;
Classifying the event information according to the event network.
The ratio of shared keywords for two pieces of event information is calculated as follows:
ratio of shared keywords = number of keywords shared by the two pieces of event information / total number of keywords in the two pieces of event information.
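The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `keyword_ratio` is hypothetical, and it assumes that "total number of keywords" means the number of distinct keywords across both events (the set union). The text does not rule out other readings, such as the sum of the two keyword counts.

```python
def keyword_ratio(keywords_a, keywords_b):
    """Share of keywords that two events have in common.

    Assumption: the denominator is the number of distinct keywords
    across both events (set union), which makes this a Jaccard index.
    """
    a, b = set(keywords_a), set(keywords_b)
    union = a | b
    if not union:
        return 0.0  # two events without keywords share nothing
    return len(a & b) / len(union)
```

Under this reading the ratio ranges from 0 (no keyword in common) to 1 (identical keyword sets), so it can be compared directly with an edge threshold.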
The event network is built as follows:
Each piece of event information is taken as a node in the event network;
For any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold; if it is, an edge is added between the nodes corresponding to the two pieces of event information.
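The construction just described might look like the following sketch. The names `build_event_network`, `event_keywords` and `edge_threshold` are illustrative assumptions, as is the adjacency-map representation; the ratio is computed with the set-union reading of "total number of keywords", which is also an assumption.

```python
from itertools import combinations

def build_event_network(event_keywords, edge_threshold):
    """Return an adjacency map {event_id: set of neighbouring event_ids}.

    event_keywords is a hypothetical input: {event_id: iterable of keywords}.
    """
    keyword_sets = {eid: set(kws) for eid, kws in event_keywords.items()}
    network = {eid: set() for eid in keyword_sets}  # every event is a node
    for u, v in combinations(keyword_sets, 2):  # every unordered pair once
        union = keyword_sets[u] | keyword_sets[v]
        shared = keyword_sets[u] & keyword_sets[v]
        ratio = len(shared) / len(union) if union else 0.0
        if ratio > edge_threshold:  # strictly greater, as in the text
            network[u].add(v)
            network[v].add(u)
    return network
```

Comparing every pair costs on the order of n² ratio computations for n events, but each comparison is only a set intersection, which is cheap compared with full text analysis.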
The event information is classified as follows:
S1. Assign a class to each node in the event network;
S2. For each node, find the class that contains the most nodes among the classes of all its neighbor nodes;
S3. Change the class of each node to the class that contains the most nodes among its neighbors' classes;
S4. Repeat steps S2 to S3 until the class of every node in the event network no longer changes.
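Steps S1 to S4 can be sketched as below. Several details are assumptions that the text leaves open: nodes are updated one at a time in sorted order (updating all nodes simultaneously can oscillate forever on some graphs), ties are broken by keeping the current class when it is among the most frequent and otherwise taking the smallest class, isolated nodes keep their initial class, and `max_rounds` is a safety cap. The function name `propagate_labels` is hypothetical, and node identifiers are assumed to be mutually comparable.

```python
from collections import Counter

def propagate_labels(network, max_rounds=100):
    """Label propagation over an adjacency map {node: set of neighbours}."""
    labels = {node: node for node in network}  # S1: one class per node
    for _ in range(max_rounds):  # S4: repeat until stable (or cap reached)
        changed = False
        for node in sorted(network):
            neighbours = network[node]
            if not neighbours:
                continue  # isolated node keeps its own class
            # S2: count how many neighbours carry each class
            counts = Counter(labels[n] for n in neighbours)
            top = max(counts.values())
            candidates = sorted(lbl for lbl, c in counts.items() if c == top)
            # S3: adopt the most frequent class (deterministic tie-break)
            new_label = labels[node] if labels[node] in candidates else candidates[0]
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels
```

Nodes that end with the same label form one class of similar events.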
An event classification system based on a complex network label propagation algorithm, comprising: an information acquisition module for obtaining event information; a keyword extraction module for extracting the keywords of each piece of event information; a shared-keyword calculation module for calculating the ratio of shared keywords for any two pieces of event information; an event network construction module for building an event network according to the ratio of shared keywords for any two pieces of event information; and an event information classification module for classifying the event information according to the event network.
The ratio of shared keywords for two pieces of event information is calculated as follows: ratio of shared keywords = number of keywords shared by the two pieces of event information / total number of keywords in the two pieces of event information.
The event network is built as follows:
Each piece of event information is taken as a node in the event network;
For any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold; if it is, an edge is added between the nodes corresponding to the two pieces of event information.
The event information is classified as follows:
S1. Assign a class to each node in the event network;
S2. For each node, find the class that contains the most nodes among the classes of all its neighbor nodes;
S3. Change the class of each node to the class that contains the most nodes among its neighbors' classes;
S4. Repeat steps S2 to S3 until the class of every node in the event network no longer changes.
The beneficial effects of the invention are as follows: the invention groups similar event information effectively, reduces the amount of computation needed when classifying similar event information, and improves the accuracy with which similar events are grouped.
Brief description of the drawings
Fig. 1 is a flowchart of the event classification method based on a complex network label propagation algorithm according to the invention;
Fig. 2 is a flowchart of building the event network according to the invention;
Fig. 3 is a flowchart of classifying the event information according to the invention;
Fig. 4 is a schematic block diagram of the event classification system based on a complex network label propagation algorithm according to the invention.
Detailed description of the invention
The technical solution of the invention is described in further detail below with reference to the drawings, but the scope of protection of the invention is not limited to what follows.
As shown in Fig. 1, an event classification method based on a complex network label propagation algorithm comprises:
Step 1: obtain event information and extract keywords from the event information.
Step 2: calculate the ratio of shared keywords for any two pieces of event information. The ratio is calculated as follows:
ratio of shared keywords = number of keywords shared by the two pieces of event information / total number of keywords in the two pieces of event information.
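In symbols, the ratio can be written as follows. The notation is introduced here for illustration and is not part of the original text: K_i denotes the keyword set of event i, and the denominator is read as the number of distinct keywords across both events.

```latex
r(i, j) = \frac{\lvert K_i \cap K_j \rvert}{\lvert K_i \cup K_j \rvert}
\qquad\text{e.g.}\qquad
K_1 = \{\text{flood}, \text{rescue}, \text{city}\},\;
K_2 = \{\text{flood}, \text{rescue}, \text{river}\}
\;\Rightarrow\;
r(1, 2) = \frac{2}{4} = 0.5
```

If the denominator were instead the sum |K_i| + |K_j|, two events with identical keyword sets would score only 0.5, and no pair could exceed the edge threshold of 0.6 used in Embodiment 1. That suggests, without proving, that the union reading is the intended one.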
Step 3: build an event network according to the ratio of shared keywords for any two pieces of event information. As shown in Fig. 2, the event network is built as follows:
Each piece of event information is taken as a node in the event network;
For any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold; if it is, an edge is added between the nodes corresponding to the two pieces of event information.
Step 4: classify the event information according to the event network. As shown in Fig. 3, the event information is classified as follows:
S1. Assign a class to each node in the event network;
S2. For each node, find the class that contains the most nodes among the classes of all its neighbor nodes;
S3. Change the class of each node to the class that contains the most nodes among its neighbors' classes;
S4. Repeat steps S2 to S3 until the class of every node in the event network no longer changes.
Embodiment 1
In this embodiment, news crawled from different portal websites is classified, with the edge threshold set to 0.6. The steps are as follows:
Use a crawler to crawl news from each portal website, and extract the keywords of each crawled news item;
Traverse every crawled news item, calculate its ratio of shared keywords with each of the other crawled news items, and save the results;
Convert all crawled news items into nodes of a network, with each news item as one node, and then add edges between nodes as follows: determine whether the ratio of shared keywords calculated for the news items corresponding to two nodes is greater than the edge threshold 0.6; if so, connect the two nodes with an edge; otherwise, leave them unconnected;
Run the label propagation algorithm to group similar news items, as follows: a. assign a class to each node in the network; b. for each node, find the class that contains the most nodes among the classes of all its neighbor nodes; c. change the class of each node to that class; d. repeat steps b to c until the class of every node in the network no longer changes.
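The embodiment can be walked through end to end on toy data. The sketch below is an illustration under stated assumptions rather than the patented implementation: crawling and keyword extraction are skipped, so the news identifiers and keyword lists are made up; the ratio uses the set-union reading of "total number of keywords"; and label propagation updates one node at a time with a deterministic tie-break and an iteration cap, none of which the text specifies. Only the threshold 0.6 comes from the embodiment.

```python
from collections import Counter, defaultdict
from itertools import combinations

EDGE_THRESHOLD = 0.6  # value used in this embodiment

def classify_news(news_keywords, threshold=EDGE_THRESHOLD, max_rounds=100):
    """Group news items that label propagation puts in the same class."""
    sets = {nid: set(kws) for nid, kws in news_keywords.items()}
    # Build the network: one node per news item, edge if ratio > threshold.
    neighbours = {nid: set() for nid in sets}
    for u, v in combinations(sets, 2):
        union = sets[u] | sets[v]
        if union and len(sets[u] & sets[v]) / len(union) > threshold:
            neighbours[u].add(v)
            neighbours[v].add(u)
    # Label propagation (steps a to d), updating nodes one at a time.
    label = {nid: nid for nid in sets}
    for _ in range(max_rounds):
        changed = False
        for nid in sorted(sets):
            if not neighbours[nid]:
                continue  # isolated item keeps its own class
            counts = Counter(label[n] for n in neighbours[nid])
            top = max(counts.values())
            best = sorted(lbl for lbl, c in counts.items() if c == top)
            new = label[nid] if label[nid] in best else best[0]
            if new != label[nid]:
                label[nid] = new
                changed = True
        if not changed:
            break
    # Collect items by final class.
    groups = defaultdict(list)
    for nid in sorted(sets):
        groups[label[nid]].append(nid)
    return sorted(groups.values())

# Hypothetical crawl result: item id -> extracted keywords.
news = {
    "site_a/1": ["earthquake", "rescue", "magnitude", "province"],
    "site_b/7": ["earthquake", "rescue", "magnitude", "province", "aftershock"],
    "site_c/3": ["football", "final", "penalty"],
    "site_d/9": ["football", "final", "penalty", "stadium"],
    "site_e/2": ["stock", "market"],
}
print(classify_news(news))
```

Here the two earthquake items share 4 of 5 distinct keywords (0.8) and the two football items share 3 of 4 (0.75), so each pair is linked and ends up in one class, while the stock market item stays on its own.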
As shown in Fig. 4, an event classification system based on a complex network label propagation algorithm comprises an information acquisition module, a keyword extraction module, a shared-keyword calculation module, an event network construction module and an event information classification module.
The information acquisition module is used to obtain event information.
The keyword extraction module is used to extract the keywords of each piece of event information.
The shared-keyword calculation module is used to calculate the ratio of shared keywords for any two pieces of event information. The ratio is calculated as follows: ratio of shared keywords = number of keywords shared by the two pieces of event information / total number of keywords in the two pieces of event information.
The event network construction module is used to build an event network according to the ratio of shared keywords for any two pieces of event information. The event network is built as follows: each piece of event information is taken as a node in the event network; for any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold; if it is, an edge is added between the nodes corresponding to the two pieces of event information.
The event information classification module is used to classify the event information according to the event network. The event information is classified as follows:
S1. Assign a class to each node in the event network;
S2. For each node, find the class that contains the most nodes among the classes of all its neighbor nodes;
S3. Change the class of each node to the class that contains the most nodes among its neighbors' classes;
S4. Repeat steps S2 to S3 until the class of every node in the event network no longer changes.
The above is only a preferred embodiment of the present invention. It should be understood that the invention is not limited to the form disclosed herein, which should not be regarded as excluding other embodiments; the invention can be used in various other combinations, modifications and environments, and can be modified within the scope of the concept described herein through the above teachings or the skill or knowledge of the related art. Changes and variations made by those skilled in the art that do not depart from the spirit and scope of the invention shall all fall within the scope of protection of the appended claims.
Claims (8)
1. An event classification method based on a complex network label propagation algorithm, characterized by comprising:
Obtaining event information and extracting keywords from the event information;
Calculating the ratio of shared keywords for any two pieces of event information;
Building an event network according to the ratio of shared keywords for any two pieces of event information;
Classifying the event information according to the event network.
2. The event classification method based on a complex network label propagation algorithm according to claim 1, characterized in that the ratio of shared keywords for two pieces of event information is calculated as follows:
ratio of shared keywords = number of keywords shared by the two pieces of event information / total number of keywords in the two pieces of event information.
3. The event classification method based on a complex network label propagation algorithm according to claim 1, characterized in that the event network is built as follows:
Each piece of event information is taken as a node in the event network;
For any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold; if it is, an edge is added between the nodes corresponding to the two pieces of event information.
4. The event classification method based on a complex network label propagation algorithm according to claim 1, characterized in that the event information is classified as follows:
S1. Assign a class to each node in the event network;
S2. For each node, find the class that contains the most nodes among the classes of all its neighbor nodes;
S3. Change the class of each node to the class that contains the most nodes among its neighbors' classes;
S4. Repeat steps S2 to S3 until the class of every node in the event network no longer changes.
5. An event classification system based on a complex network label propagation algorithm, characterized by comprising:
An information acquisition module for obtaining event information;
A keyword extraction module for extracting the keywords of each piece of event information;
A shared-keyword calculation module for calculating the ratio of shared keywords for any two pieces of event information;
An event network construction module for building an event network according to the ratio of shared keywords for any two pieces of event information;
An event information classification module for classifying the event information according to the event network.
6. The event classification system based on a complex network label propagation algorithm according to claim 5, characterized in that the ratio of shared keywords for two pieces of event information is calculated as follows:
ratio of shared keywords = number of keywords shared by the two pieces of event information / total number of keywords in the two pieces of event information.
7. The event classification system based on a complex network label propagation algorithm according to claim 5, characterized in that the event network is built as follows:
Each piece of event information is taken as a node in the event network;
For any two pieces of event information, it is determined whether their ratio of shared keywords is greater than the edge threshold; if it is, an edge is added between the nodes corresponding to the two pieces of event information.
8. The event classification system based on a complex network label propagation algorithm according to claim 5, characterized in that the event information is classified as follows:
S1. Assign a class to each node in the event network;
S2. For each node, find the class that contains the most nodes among the classes of all its neighbor nodes;
S3. Change the class of each node to the class that contains the most nodes among its neighbors' classes;
S4. Repeat steps S2 to S3 until the class of every node in the event network no longer changes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610621944.XA CN106295681A (en) | 2016-08-02 | 2016-08-02 | A kind of event classification method and system based on complex network label propagation algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610621944.XA CN106295681A (en) | 2016-08-02 | 2016-08-02 | A kind of event classification method and system based on complex network label propagation algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106295681A (en) | 2017-01-04 |
Family
ID=57663968
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610621944.XA Pending CN106295681A (en) | 2016-08-02 | 2016-08-02 | A kind of event classification method and system based on complex network label propagation algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106295681A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109934698A (en) * | 2019-01-29 | 2019-06-25 | 华融融通(北京)科技有限公司 | A kind of fraud related network feature extracting method propagated based on label |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103327092A (en) * | 2012-11-02 | 2013-09-25 | 中国人民解放军国防科学技术大学 | Cell discovery method and system on information networks |
CN103745000A (en) * | 2014-01-24 | 2014-04-23 | 福州大学 | Hot topic detection method of Chinese micro-blogs |
KR20140137521A (en) * | 2013-05-23 | 2014-12-03 | 고영선 | Tag profile system for node expansion of Social Network |
CN104199852A (en) * | 2014-08-12 | 2014-12-10 | 上海交通大学 | Label propagation community structure mining method based on node membership degree |
CN105677906A (en) * | 2015-05-07 | 2016-06-15 | 浚鸿数据开发股份有限公司 | Automatic collecting and analyzing system and method for network events |
CN105677648A (en) * | 2014-11-18 | 2016-06-15 | 四三九九网络股份有限公司 | Community detection method and system based on label propagation algorithm |
-
2016
- 2016-08-02 CN CN201610621944.XA patent/CN106295681A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103745000B (en) | Hot topic detection method of Chinese micro-blogs | |
CN103678670B (en) | Micro-blog hot word and hot topic mining system and method | |
CN107239512B (en) | A kind of microblogging comment spam recognition methods of combination comment relational network figure | |
CN103544255A (en) | Text semantic relativity based network public opinion information analysis method | |
CN111159395A (en) | Chart neural network-based rumor standpoint detection method and device and electronic equipment | |
CN105843795A (en) | Topic model based document keyword extraction method and system | |
CN104268160A (en) | Evaluation object extraction method based on domain dictionary and semantic roles | |
CN104536956A (en) | A Microblog platform based event visualization method and system | |
CN104484343A (en) | Topic detection and tracking method for microblog | |
CN105677661A (en) | Method for detecting repetition data of social media | |
CN106980651B (en) | Crawling seed list updating method and device based on knowledge graph | |
CN103995804A (en) | Cross-media topic detection method and device based on multimodal information fusion and graph clustering | |
CN103617290A (en) | Chinese machine-reading system | |
CN104915443A (en) | Extraction method of Chinese Microblog evaluation object | |
CN104346382B (en) | Use the text analysis system and method for language inquiry | |
CN104573057A (en) | Account correlation method used for UGC (User Generated Content)-spanning website platform | |
CN109992784A (en) | A kind of heterogeneous network building and distance metric method for merging multi-modal information | |
CN106294621B (en) | A kind of method and system of the calculating event similitude based on complex network node similitude | |
CN111324801A (en) | Hot event discovery method in judicial field based on hot words | |
Cataldi et al. | Estimating domain-based user influence in social networks | |
Xu | Cultural communication in double-layer coupling social network based on association rules in big data | |
Zhao et al. | Towards events detection from microblog messages | |
CN105468780A (en) | Normalization method and device of product name entity in microblog text | |
CN108595515A (en) | A kind of microblog emotional analysis method of the weak relationship of combination microblogging | |
CN106295681A (en) | A kind of event classification method and system based on complex network label propagation algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20170104 |