CN104281608A - Emergency analyzing method based on microblogs - Google Patents

Info

Publication number
CN104281608A
Authority
CN
China
Prior art keywords
microblogging
word
time
burst
accident
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201310284163.2A
Other languages
Chinese (zh)
Inventor
肖江
王光平
李文骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI RUIYING SOFTWARE TECHNOLOGY Co Ltd
Original Assignee
SHANGHAI RUIYING SOFTWARE TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2013-07-08
Filing date
2013-07-08
Publication date
2015-01-14
Application filed by SHANGHAI RUIYING SOFTWARE TECHNOLOGY Co Ltd
Priority to CN201310284163.2A
Publication of CN104281608A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9577 Optimising the visualization of content, e.g. distillation of HTML documents

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an emergency analyzing method based on microblogs. The emergency analyzing method based on the microblogs comprises the steps that firstly, burst words are analyzed, and compared with a previous technical method, analyzing of the burst words is characterized in that two features of the time domain and the frequency domain are used for judging the non-periodic burstiness of word feature tracks; secondly, in the situation that the burst words appear in a microblog item at the same time, different subject terms are clustered, one kind of burst words correspond to one emergency, and in order to better describe the emergencies, the emergencies can be described by reading the text part of the relevant microblogs of events which are issued at the earliest; thirdly, analyzing results of the emergencies are represented in a webpage mode. According to the emergency analyzing method based on the microblogs, the emergency analyzing method can be directly applied to a microblog pre-warning system, the concern extent of the emergencies is quantized into the proportion of the events appearing in microblog user comments, pre-warning is conducted in time, and the purposes of public opinion monitoring and managing are achieved.

Description

Emergency analysis method based on microblogs
Technical field
The present invention relates to an analysis method, and in particular to an emergency analysis method based on microblogs.
Background art
With the rapid development of Internet technology, microblogs have quickly become the third-largest source of public opinion on the Chinese Internet, after news sites and forums. Because microblogs carry data volumes on the order of hundreds of billions of items and spread information at remarkable speed every day, institutions and enterprises of all kinds have to pay attention to the microblog data that concerns them. Scholars, advertisers and politicians have long regarded large-scale online social networks such as microblogs as real networks through which the spread of ideas, social ties and similar phenomena can be understood. On microblogs in particular, speech is free and propagation is fast, so an emergency may spread widely and quickly together with rumors and negative comments and develop into a serious public opinion crisis. At present, however, collecting all the relevant data manually is almost impossible, let alone analyzing and organizing such massive data. Using suitable computer-based methods to grasp emergencies on microblogs, and the patterns by which their information spreads, in a timely manner is therefore of great social significance.
There is currently little research on microblog-based emergency analysis, but there is related research on hot topics. Because hot topics can be regarded as a part of emergencies, research on hot-topic analysis methods serves as the technical background of the present invention. Whether the goal is to discover hot topics or bursty topics, the subject terms of the topic or event are in fact used as an important clue. For judging the burstiness of a word, the classical method is the text mining algorithm proposed by J. Kleinberg for identifying bursts of words in e-mail or news text streams. Its main idea is to model the text stream with the state sequence of an automaton, based on the arrival times of the messages in the stream; each state emits messages according to a time distribution function whose rate increases with the state index, and the text stream is considered to be in a burst when it is in the highest-numbered state. Q. He classifies feature words according to the spectral characteristics of the time series formed by each word feature, taking words with a high dominant period and a high dominant power spectrum as burst words; the burst time of a word can then be found from the feature time series of the burst word, and burst words are grouped into important aperiodic events according to their co-occurrence in documents. G.P.C. Fung defines an emergency as a minimal set of burst words that occur together in large numbers within a certain time window of the text stream; this method also identifies burst words from the distribution of word feature time series, then derives the minimal set of burst words describing the emergency and finds the period in which the emergency is popular.
None of the above methods targets microblog information. Moreover, when retrieving emergencies, they all judge topic words by their burst time. In microblog monitoring, however, emergencies often also include hot-spot events, so judging emergencies from time alone has certain limitations and the results are not ideal.
Summary of the invention
The technical problem to be solved by the present invention is to provide an emergency analysis method based on microblogs that can be applied directly in a microblog early-warning system, quantifies the degree of attention an emergency receives as the proportion of microblog users' posts in which the event appears, and issues timely warnings, thereby achieving public opinion monitoring and management.
The present invention solves the above technical problem through the following technical solution: an emergency analysis method based on microblogs, characterized in that the method comprises the following steps:
Step one: analyze burst words. The main difference from previous technical methods is that two features, the time domain and the frequency domain, are used to judge the aperiodic burstiness of word feature trajectories;
Step two: cluster the different subject terms according to how the burst words co-occur in microblog items. Each class of burst words corresponds to one emergency. To describe the emergency better, the body text of the earliest-published microblog related to the event can be read and used as the description of the emergency;
Step three: present the emergency analysis results as a web page.
Preferably, for the burst word feature trajectory in step one, judging the burstiness of a word within a period of time first requires recording how the word's frequency changes over that period; this record is the feature trajectory of the burst word.
Preferably, when the feature trajectory is constructed, the choice of the feature value metric and the length of the time unit are taken into account.
Preferably, the method is applied in a microblog early-warning system that comprises a microblog acquisition module and a microblog analysis module.
Preferably, the microblog acquisition module is responsible for real-time collection, tracking and monitoring of three major microblog systems on the Internet: Sina, Tencent and Twitter. A key technology in the acquisition module is intelligent information collection: intelligent, distributed, collaborative crawlers are used, the number of crawler servers and crawlers is configured dynamically, and the computing resources devoted to collection are increased or decreased dynamically under different collection demands.
Preferably, the microblog analysis module takes the information obtained by the microblog acquisition module and performs information deduplication, propagation chain analysis, trend analysis and so on to extract valuable microblog information, analyze public opinion hot spots in real time, and grasp trends in microblog information.
The positive and progressive effects of the present invention are as follows. Because microblogs are a new phenomenon and the understanding of microblog public opinion is still exploratory, the innovation of the present invention in microblog emergency analysis provides a sound theoretical basis for further research on microblog public opinion. It also supplements the traditional concept of an emergency by adding hot-spot events, forming a mature analysis and management model. In particular, microblogs are still developing rapidly and new situations and problems keep emerging, so research on microblog public opinion must keep pace with developments and continually introduce new concepts. Only then can the relevant departments grasp the situation, adjust the direction of their work in time, and adopt more effective and proactive strategies. The present invention therefore studies the analysis of emergencies in microblog public opinion comprehensively and in depth and applies it in a microblog early-warning system: data is collected from three major microblog systems on the Internet (Sina, Tencent and Twitter), effective information is extracted, and this information is then shown to the user through an interface. This further clarifies the strategy for handling microblog public opinion in the current period and has strong practical significance.
Description of the drawings
Fig. 1 is a flow chart of the microblog-based emergency analysis method of the present invention.
Detailed description
A preferred embodiment of the present invention is given below with reference to the accompanying drawing to describe the technical solution of the present invention in detail.
As shown in Fig. 1, the microblog-based emergency analysis method of the present invention comprises the following steps:
Step one: analyze burst words. The main difference from previous technical methods is that two features, the time domain and the frequency domain, are used to judge the aperiodic burstiness of burst word feature trajectories;
Step two: cluster the different subject terms according to how the burst words co-occur in microblog items. Each class of burst words corresponds to one emergency. To describe the emergency better, the body text of the earliest-published microblog related to the event can be read and used as the description of the emergency;
Step three: present the emergency analysis results as a web page.
For the burst word feature trajectory in step one, judging the burstiness of a word within a period of time first requires recording how the word's frequency changes over that period. This record, the feature trajectory of the burst word, is a numeric sequence expressed as formula (1):
formula (1)
Here Y, taken as a whole, represents the feature trajectory of the burst word.
When the feature trajectory is constructed, the choice of the feature value metric and the length of the time unit are taken into account. Given the characteristics of microblog data, the time unit is chosen to be one day, and the feature value is measured by the DF-VALUE, computed by formula (2):
formula (2)
Formula (2) is built from four quantities: the number of documents containing feature f on day t; the total number of documents on day t; the total number of documents containing feature f; and N, the total number of texts in the time window T.
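Formulas (1) and (2) can plausibly be written out as follows. This is a hedged reconstruction from the four quantities defined above, assuming the DF-IDF style weighting used in the spectral feature-trajectory approach of He that the background section refers to. The symbol names and the natural logarithm are assumptions, not the patent's verbatim notation.

```latex
% Reconstruction under stated assumptions, not the original formula images.
Y_f = \bigl[\, y_f(1),\ y_f(2),\ \ldots,\ y_f(T) \,\bigr] \tag{1}

y_f(t) = \frac{DF_f(t)}{N(t)} \times \log\frac{N}{DF_f} \tag{2}
```

Here $DF_f(t)$ is the number of documents containing feature $f$ on day $t$, $N(t)$ is the total number of documents on day $t$, $DF_f$ is the total number of documents containing $f$, and $N$ is the total number of texts in the time window $T$.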
After the documents of each time unit are segmented into words, a "word-position" inverted index is built, recording the number of each microblog item in which a word appears and the word's position within that item's text. The per-day statistics are computed from this inverted index; once all time units have been processed, the statistics over the whole time window are computed, and finally the DF-VALUE of each word in each time unit is calculated, which forms the feature trajectory. In addition, the process of computing word feature trajectories can also weight words, for example according to how often the microblog item containing the word is forwarded or commented on: each time an item is forwarded or commented on, the occurrence count is increased by one.
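The trajectory computation described above can be sketched as follows. This is a minimal illustration, assuming the DF-VALUE takes the DF-IDF form (daily document frequency ratio times the log of the inverse overall document frequency); the function names and the input layout (a list of days, each a list of tokenized microblogs) are hypothetical. Forwarding and comment weighting is left out.

```python
import math
from collections import defaultdict


def build_inverted_index(docs):
    """Map each word to a list of (microblog number, position in text)."""
    index = defaultdict(list)
    for doc_id, tokens in enumerate(docs):
        for pos, tok in enumerate(tokens):
            index[tok].append((doc_id, pos))
    return index


def feature_trajectories(days):
    """days: list of days, each a list of tokenized microblogs.

    Returns {word: [value for day 0, value for day 1, ...]}.
    Assumed formula: (DF on day t / docs on day t) * log(N / total DF).
    """
    n_total = sum(len(docs) for docs in days)   # N over the time window
    per_day_df = []
    total_df = defaultdict(int)
    for docs in days:
        index = build_inverted_index(docs)
        # Document frequency: distinct microblogs containing the word.
        df = {w: len({doc_id for doc_id, _ in postings})
              for w, postings in index.items()}
        per_day_df.append(df)
        for w, count in df.items():
            total_df[w] += count
    trajectories = {}
    for w, df_total in total_df.items():
        idf = math.log(n_total / df_total)
        trajectories[w] = [
            (df.get(w, 0) / len(docs)) * idf if docs else 0.0
            for df, docs in zip(per_day_df, days)
        ]
    return trajectories
```

Calling `feature_trajectories` on a window of tokenized days yields one numeric sequence per word, which is the input expected by the later spectral analysis step.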
Text segmentation and new word discovery are performed to form an index of words over the texts. Microblog text is segmented mainly with the backward maximum matching method, and strings that do not appear in the dictionary are split into single characters. This dictionary-based segmentation method is efficient, but it cannot handle words that are missing from the dictionary. Important information in an emergency, such as personal names and place names, may well be out-of-dictionary words, so new words must also be identified during segmentation. The new word discovery algorithm is statistics-based; since it is not the focus of the present invention, it is not described further.
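Backward maximum matching can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the dictionary contents and the `max_len` window are hypothetical, and new word discovery is not covered.

```python
def backward_max_match(text, dictionary, max_len=4):
    """Segment text from right to left, preferring the longest dictionary word.

    A string that matches no dictionary entry falls back to a single character.
    """
    tokens = []
    end = len(text)
    while end > 0:
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                end -= size
                break
    tokens.reverse()  # tokens were collected from the end of the text
    return tokens
```

Scanning from the right resolves some ambiguities differently from a left-to-right scan: with a dictionary containing both 研究生 and 生命, the right-to-left pass keeps 生命 intact.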
Burst word analysis and burst interval location. After the word feature trajectories are built, a discrete Fourier transform is applied to the normalized feature sequences. According to the time-domain and frequency-domain characteristics of their feature trajectories, words can be divided into four classes: high-intensity long-period, high-intensity short-period, low-intensity long-period and low-intensity short-period. Words whose feature trajectories are of the high-intensity long-period type are identified as burst words. The two indices used for classification are the dominant period and the dominant power spectrum of the feature trajectory time series. Although the dominant period value reveals the periodicity of a word's bursts, the time range of a burst still has to be computed from the feature trajectory. The burst interval of a word has the following characteristics:
(1) within the burst interval, the frequency of the word is usually higher than the normal value;
(2) the burst interval should contain the maximum of the word's frequency, i.e. the peak;
(3) within the burst interval, the feature trajectory generally rises to the left of the peak and falls to the right of it.
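The four-way classification by dominant period and dominant power spectrum can be sketched as follows. This is a minimal illustration using a direct discrete Fourier transform from the standard library; the two thresholds are hypothetical parameters that the source does not specify, and normalization and smoothing are assumed to have been applied to the input already.

```python
import cmath
import math


def dominant_period_and_power(y):
    """Return (dominant period, dominant power spectrum) of a trajectory.

    Only frequencies k = 1 .. T // 2 are examined; k = 0 is the mean.
    Requires len(y) >= 2.
    """
    T = len(y)
    best_k, best_power = 1, -1.0
    for k in range(1, T // 2 + 1):
        x_k = sum(y[t] * cmath.exp(-2j * math.pi * k * t / T)
                  for t in range(T)) / math.sqrt(T)
        power = abs(x_k) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return T / best_k, best_power


def classify(y, power_threshold, period_threshold):
    """Label a trajectory with one of the four classes named in the text."""
    period, power = dominant_period_and_power(y)
    intensity = "high" if power >= power_threshold else "low"
    length = "long" if period >= period_threshold else "short"
    return f"{intensity}-intensity {length}-period"


def is_burst_word(y, power_threshold, period_threshold):
    return classify(y, power_threshold, period_threshold) == "high-intensity long-period"
```

A single isolated bump concentrates its power at the lowest frequency, so its dominant period is close to the window length, whereas a word that recurs every few days has a short dominant period and is not treated as a burst word.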
Based on these characteristics, a method for delimiting the burst interval, the "peak push algorithm", can be formulated. The specific steps are as follows:
Input: a bursty time series
Output: the burst peak time th, the start time tb and the end time te of the time series
Algorithm:
1) Traverse the time series and record the time point of the maximum value as th; compute the series mean Y;
2) Set the current time under examination to the peak time, t = th;
3) Set t = t - 1; if yt > Y and yt < yt+1, repeat step 3; otherwise set tb = t and go to step 4;
4) Set the current time under examination to the peak time again, t = th;
5) Set t = t + 1; if yt > Y and yt < yt-1, repeat step 5; otherwise set te = t and terminate.
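The steps above can be sketched as follows. The function name is hypothetical, and the clamping of tb and te to the ends of the series is an added assumption, since the source does not say what happens when the walk reaches a boundary.

```python
def peak_push(y):
    """Return (th, tb, te): peak index, burst start index, burst end index."""
    mean = sum(y) / len(y)
    th = max(range(len(y)), key=lambda i: y[i])  # first index of the maximum

    # Walk left from the peak while values stay above the mean and keep rising
    # toward the peak.
    t = th - 1
    while t > 0 and y[t] > mean and y[t] < y[t + 1]:
        t -= 1
    tb = max(t, 0)

    # Walk right from the peak while values stay above the mean and keep falling.
    t = th + 1
    while t < len(y) - 1 and y[t] > mean and y[t] < y[t - 1]:
        t += 1
    te = min(t, len(y) - 1)
    return th, tb, te
```

As in the listed steps, tb and te are the first time points on each side that fail the condition, so they mark the outer edge of the burst interval rather than its last above-average point.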
Building emergencies. On the basis of burst word analysis, and given that different subject terms of the same event tend to co-occur in microblog data, burst words are clustered according to the number of microblog items in which they co-occur. The similarity measure between burst words is mutual information, and the clustering process is hierarchical; the clustering condition is that both the maximum and the minimum similarity between classes satisfy certain thresholds. The whole process uses bottom-up hierarchical clustering, and the specific algorithm is as follows:
Input: a set of words and the occurrences of each word in the texts
Output: a number of classes such that the similarities within each class satisfy the thresholds
Algorithm:
1) Compute the similarity of every pair of words from their occurrences in the texts;
2) Treat each word as an initial class;
3) REPEAT
4) Among the different classes, find the two classes that contain the pair of words with the highest similarity;
5) If the maximum and minimum word similarities between the two classes each satisfy their threshold, merge the two classes;
6) UNTIL the set of classes no longer changes.
This algorithm differs from traditional hierarchical clustering mainly in that, after the two most similar classes are found, it also checks whether the minimum and maximum similarities between the classes satisfy certain thresholds. This reflects the microblog setting, in which any two subject terms of the same event should co-occur frequently. After clustering, each class of words represents one emergency, and the burst time of the event is the burst time of the word with the highest peak in the class. Such a class of subject terms gives a rough picture of an emergency, but a more detailed event description has to draw on the existing microblog texts.
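The clustering procedure can be sketched as follows. Several points are assumptions: pointwise mutual information over co-occurring microblog items stands in for the "mutual information" measure, the two thresholds are hypothetical parameters, and candidate class pairs are tried in descending order of similarity so that one failing pair does not block later merges.

```python
import math
from itertools import combinations


def pmi(items_a, items_b, n_items):
    """Pointwise mutual information from sets of microblog item numbers."""
    co = len(items_a & items_b)
    if co == 0:
        return float("-inf")
    return math.log(co * n_items / (len(items_a) * len(items_b)))


def cluster_burst_words(occurrences, n_items, max_threshold, min_threshold):
    """occurrences: {word: set of microblog item numbers containing it}."""
    sim = {frozenset((a, b)): pmi(occurrences[a], occurrences[b], n_items)
           for a, b in combinations(occurrences, 2)}
    clusters = [{w} for w in occurrences]   # each word starts as its own class
    changed = True
    while changed:                          # REPEAT ... UNTIL no change
        changed = False
        candidates = []
        for i, j in combinations(range(len(clusters)), 2):
            sims = [sim[frozenset((a, b))]
                    for a in clusters[i] for b in clusters[j]]
            candidates.append((max(sims), min(sims), i, j))
        candidates.sort(reverse=True)       # most similar class pair first
        for highest, lowest, i, j in candidates:
            # Both the best and the worst cross-class pair must pass.
            if highest >= max_threshold and lowest >= min_threshold:
                clusters[i] |= clusters[j]
                del clusters[j]
                changed = True
                break
    return clusters
```

The minimum-similarity check is what keeps a loosely related word out of a class even when it is strongly tied to one member, which matches the requirement that any two subject terms of one event co-occur often.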
The microblog-based emergency analysis method of the present invention can be applied in a microblog early-warning system to analyze emergencies and issue warnings. To explain the invention better, the microblog early-warning system is introduced first.
The microblog early-warning system can, for example, be set up as a university microblog early-warning and monitoring system. It monitors all microblog information related to the university, follows the hot topics and prominent figures that students pay attention to, tracks university-related emergencies in a timely manner, and issues warnings about microblog content that has a negative impact on the designated university, so as to protect the university's image, improve the quality of education, and maintain social harmony and stability.
The microblog early-warning system comprises modules such as a microblog acquisition module and a microblog analysis module.
The microblog acquisition module is responsible for real-time collection, tracking and monitoring of three major microblog systems on the Internet: Sina, Tencent and Twitter. A key technology in this module is intelligent information collection: intelligent, distributed, collaborative crawlers are used, the number of crawler servers and crawlers is configured dynamically, and the computing resources devoted to collection are increased or decreased dynamically under different collection demands. The crawler module in the web page collection subsystem obtains microblog information from the Internet. The crawler module can be configured with constraints such as the number of crawlers, the crawl speed, the initial URLs, the regular expressions that URLs to be collected must match, and the crawler thread termination conditions, and it retrieves the relevant web page information. A web page cleaning module then removes noise such as advertisements, pictures and copyright notices from the retrieved pages and extracts data such as the microblog text, link address and collection time.
The microblog analysis module takes the information obtained by the microblog acquisition module and performs information deduplication, propagation chain analysis, trend analysis and so on to extract valuable microblog information, analyze public opinion hot spots in real time, and grasp trends in microblog information. The emergency analysis method of the present invention is mainly applied in the microblog analysis module and involves the following aspects:
Hot spot and keyword discovery: a hot-spot weight calculation model is used to analyze microblog popularity and automatically discover hot words in microblogs, helping users understand network hot spots intuitively;
Prominent figures: the microblog system identifies prominent figures from the microblogs obtained from the network;
Trend analysis: for high-attention events triggered by microblogs, the burst point and development of the microblog can be grasped in time, and the hot-spot events of different periods are provided;
Emergencies: events that occur within a short time (within 24 hours) and cause a strong reaction online;
Microblog early warning: the microblog system finds microblogs that match the keywords set by the user and displays them on the microblog early-warning page.
In the burst word analysis process, microblog texts are organized by day. After non-Chinese texts and texts with little information are filtered out, text segmentation and new word discovery are performed to form an index of words over the texts. Statistics are computed from the index to build the feature sequence of each word. A discrete Fourier transform is applied to the normalized and smoothed feature sequence, and the time-domain and frequency-domain features of the new sequence are examined; words that show both high burstiness and a long period are judged to be burst words. Burst words are clustered according to the number of microblog items in which they co-occur, and each resulting class of subject terms can represent one emergency. The concrete implementation steps are as follows:
One: use a microblog search engine to collect microblog data. The collected data is stored as two types: user data (User) and microblog data (Tweet). Note: the microblog search engine is not described further here; it mainly collects data with a breadth-first algorithm.
Two: store the User and Tweet data in a relational database for subsequent join queries.
Three: prepare an event table (event) and a keyword table (keywords). Because one event contains multiple keywords, a third table (event_keywords) is needed to associate them.
Four: apply Chinese word segmentation to the content field of the Tweet data, then use new word discovery to form an index of words over the texts. Deduplicate the resulting terms and traverse them; if a keyword in the keyword table contains the term, increase that keyword's term count by one (count + 1).
Five: set a threshold k. If the crawler collection speed is s, the threshold should be k = s*60/1000. This formula expresses the condition that the frequency at which all keywords corresponding to an event increase reaches one thousandth of the per-minute collection rate. At the same time, the time-domain and frequency-domain features of the new sequence are examined; words that show both high burstiness and a long period are judged to be burst words. Burst words are clustered according to the number of microblog items in which they co-occur, and each resulting class of subject terms can represent one emergency. The emergency is then described more fully using the crawled data and presented on the relevant page.
Six: set a timer that subtracts 60 from the count field in the keywords table every 60 seconds (in effect subtracting 1 per second; subtracting in batches reduces the performance cost of the timer).
Seven: set another timer that, once per minute, queries the event table for all events whose total keyword count sum is greater than the threshold k and sorts them by sum in descending order. These are the microblog emergencies that the algorithm finally produces. The results are then fed back to the relevant module or page of the microblog early-warning system.
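Steps five to seven can be sketched in memory as follows. This is a minimal illustration that replaces the relational tables with dictionaries: `event_keywords` plays the role of the event_keywords association table and `counts` the count column of keywords. The function names are hypothetical, the collection speed is assumed to be in items per second, and clamping counts at zero is an added assumption the source does not state. Scheduling of the two timers is left to the caller.

```python
def burst_threshold(crawl_rate_per_second):
    """k = s * 60 / 1000: one thousandth of the per-minute collection rate."""
    return crawl_rate_per_second * 60 / 1000


def decay_counts(counts, amount=60):
    """Timer of step six: subtract `amount` from every keyword count."""
    for keyword in counts:
        counts[keyword] = max(0, counts[keyword] - amount)  # clamp (assumption)


def detect_emergencies(event_keywords, counts, k):
    """Timer of step seven: events whose keyword count sum exceeds k,
    sorted by that sum in descending order."""
    totals = {event: sum(counts.get(kw, 0) for kw in keywords)
              for event, keywords in event_keywords.items()}
    hits = [(event, total) for event, total in totals.items() if total > k]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Because every count decays by a fixed amount per minute, only keywords whose mentions keep arriving faster than the decay can stay above the threshold, which is how the counter approximates a rate rather than a running total.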
Those skilled in the art can make various modifications and changes to the present invention. The present invention therefore covers the modifications and changes that fall within the scope of the appended claims and their equivalents.

Claims (6)

1. An emergency analysis method based on microblogs, characterized in that the method comprises the following steps:
Step one: analyze burst words. The main difference from previous technical methods is that two features, the time domain and the frequency domain, are used to judge the aperiodic burstiness of word feature trajectories;
Step two: cluster the different subject terms according to how the burst words co-occur in microblog items. Each class of burst words corresponds to one emergency. To describe the emergency better, the body text of the earliest-published microblog related to the event can be read and used as the description of the emergency;
Step three: present the emergency analysis results as a web page.
2. The emergency analysis method based on microblogs according to claim 1, characterized in that, for the burst word feature trajectory in step one, judging the burstiness of a word within a period of time first requires recording how the word's frequency changes over that period; this record is the feature trajectory of the burst word.
3. The emergency analysis method based on microblogs according to claim 2, characterized in that, when the feature trajectory is constructed, the choice of the feature value metric and the length of the time unit are taken into account.
4. The emergency analysis method based on microblogs according to claim 1, characterized in that the method is applied in a microblog early-warning system that comprises a microblog acquisition module and a microblog analysis module.
5. The emergency analysis method based on microblogs according to claim 4, characterized in that the microblog acquisition module is responsible for real-time collection, tracking and monitoring of three major microblog systems on the Internet: Sina, Tencent and Twitter; a key technology in the acquisition module is intelligent information collection, in which intelligent, distributed, collaborative crawlers are used, the number of crawler servers and crawlers is configured dynamically, and the computing resources devoted to collection are increased or decreased dynamically under different collection demands.
6. The emergency analysis method based on microblogs according to claim 4, characterized in that the microblog analysis module takes the information obtained by the microblog acquisition module and performs information deduplication, propagation chain analysis, trend analysis and so on to extract valuable microblog information, analyze public opinion hot spots in real time, and grasp trends in microblog information.
Application number: CN201310284163.2A
Priority date: 2013-07-08
Filing date: 2013-07-08
Title: Emergency analyzing method based on microblogs
Legal status: Pending
Publication number: CN104281608A (en)
Publication date: 2015-01-14
Family ID: 52256484
Country: CN

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104615718B (en) * 2015-02-05 2017-12-15 北京航空航天大学 Hierarchical analysis method for social network emergency
CN104615718A (en) * 2015-02-05 2015-05-13 北京航空航天大学 Hierarchical analysis method for social network emergency
CN105491117A (en) * 2015-11-26 2016-04-13 北京航空航天大学 Streaming graph data processing system and method for real-time data analysis
CN105491117B (en) * 2015-11-26 2018-12-21 北京航空航天大学 Streaming graph data processing system and method for real-time data analysis
CN107273496A (en) * 2017-06-15 2017-10-20 淮海工学院 Method for detecting regional emergencies in microblog networks
CN107992619A (en) * 2017-12-21 2018-05-04 联想(北京)有限公司 Clustering method, server cluster and virtual device
CN109145114B (en) * 2018-08-29 2021-08-03 电子科技大学 Social network event detection method based on Kleinberg online state machine
CN109145114A (en) * 2018-08-29 2019-01-04 电子科技大学 Social network event detection method based on Kleinberg online state machine
CN110968770B (en) * 2018-09-29 2023-09-05 北京国双科技有限公司 Method and device for stopping crawling of crawler tool
CN110968770A (en) * 2018-09-29 2020-04-07 北京国双科技有限公司 Method and device for terminating crawling of crawler tool
CN110489741A (en) * 2019-07-12 2019-11-22 北京邮电大学 Microblog burst topic detection method based on burst word detection and filtering
CN110489741B (en) * 2019-07-12 2022-06-21 北京邮电大学 Microblog burst topic detection method based on burst word detection and filtering
CN110457595A (en) * 2019-08-01 2019-11-15 腾讯科技(深圳)有限公司 Emergency event alarm method, device, system, electronic equipment and storage medium
CN110990748A (en) * 2019-12-18 2020-04-10 成都迪普曼林信息技术有限公司 National public opinion data acquisition and publishing system
CN111476025A (en) * 2020-02-28 2020-07-31 开普云信息科技股份有限公司 Government field-oriented new word automatic discovery implementation method, analysis model and system
CN111476025B (en) * 2020-02-28 2021-01-08 开普云信息科技股份有限公司 Government field-oriented new word automatic discovery implementation method, analysis model and system
CN112528024A (en) * 2020-12-15 2021-03-19 哈尔滨工程大学 Microblog emergency detection method based on multi-feature fusion
CN112528024B (en) * 2020-12-15 2022-11-18 哈尔滨工程大学 Microblog emergency detection method based on multi-feature fusion

Similar Documents

Publication Publication Date Title
CN104281608A (en) Emergency analyzing method based on microblogs
Tanwar et al. Unravelling unstructured data: A wealth of information in big data
CN105488092B (en) Time-sensitive and adaptive sub-topic online detection method and system
Tsirakis et al. Large scale opinion mining for social, news and blog data
CN103544255A (en) Text semantic relativity based network public opinion information analysis method
CN105718587A (en) Network content resource evaluation method and evaluation system
CN102663046A (en) Sentiment analysis method oriented to micro-blog short text
CN103617169A (en) Microblog hot topic extracting method based on Hadoop
CN105068991A (en) Big data based public sentiment discovery method
CN110705288A (en) Big data-based public opinion analysis system
CN104978332B (en) User-generated content label data generation method, device and correlation technique and device
CN104965823A (en) Big data based opinion extraction method
CN106126605B (en) Short text classification method based on user portrait
Xu et al. Research on topic recognition of network sensitive information based on SW-LDA model
Ouyang et al. Sentistory: multi-grained sentiment analysis and event summarization with crowdsourced social media data
Guo et al. A survey of Internet public opinion mining
CN104408083A (en) Socialized media analyzing system
CN111026804A (en) Big data analysis intelligent service system based on semantics
CN105183765A (en) Big data-based topic extraction method
CN111859065A (en) Big data-based public opinion listening system
Stankovic et al. Mapping tweets to conference talks: A goldmine for semantics
Qiu et al. Incorporate the syntactic knowledge in opinion mining in user-generated content
CN111666499A (en) Public opinion monitoring cloud service platform based on big data
TW201640383A (en) Internet events automatic collection and analysis method and system thereof
Chen et al. Popular topic detection in Chinese micro-blog based on the modified LDA model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20150114