CN104281670B - Real-time incremental detection method and system for social network events - Google Patents

Real-time incremental detection method and system for social network events Download PDF

Info

Publication number
CN104281670B
CN104281670B CN201410509359.1A CN201410509359A
Authority
CN
China
Prior art keywords
probability
theme
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410509359.1A
Other languages
Chinese (zh)
Other versions
CN104281670A (en)
Inventor
李建欣
邰振赢
于伟仁
张日崇
胡春明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN201410509359.1A priority Critical patent/CN104281670B/en
Publication of CN104281670A publication Critical patent/CN104281670A/en
Application granted granted Critical
Publication of CN104281670B publication Critical patent/CN104281670B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Machine Translation (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The present invention provides a real-time incremental detection method and system for social network events. Using a probabilistic graphical model, model learning is performed on short texts according to the time, document and topic label of each short text, yielding a likelihood function; the likelihood function is solved with the EM algorithm to obtain parameters; the obtained parameters are iteratively updated in an incremental manner until they converge; and the E-step and the M-step of the EM algorithm are executed on the converged parameters in a distributed manner to compute the content of the short documents. This resolves the technical problem that event detection in the prior art cannot simultaneously accommodate the real-time, social and fragmented nature of short texts in social networks, which leads to inaccurate detection results. In addition, a supervised short-text event detection model, an algorithm combining incremental learning with prediction, and an event detection model based on an in-memory computing platform are proposed.

Description

Real-time incremental detection method and system for social network events
Technical field
The present invention relates to information technology, and in particular to a real-time incremental detection method and system for social network events.
Background art
Short texts in social networks, such as microblog posts, typically have the following characteristics: their length is strictly limited to 140 characters; while publishing a short text, a user can interact with other users through the @ symbol; and a user can use the # symbol to indicate the topic to which the short text belongs.
As a highly interactive medium with strong propagation, social networks often see the number of short texts grow explosively along with news events, so that real-time information is replaced very frequently; at the same time, the length limit on short texts in social networks makes the text more fragmented. In short, the real-time, social and fragmented nature of short texts in social networks poses a huge challenge to event detection.
In the prior art, event detection methods include burst-word-based detection and topic-model-based detection. In burst-word-based detection, an event is represented as a set of associated burst words. First, burst words are detected by methods such as word-frequency acceleration or wavelet analysis; second, the similarity between burst words is computed; then the burst words are clustered by methods such as graph partitioning or K-means. However, burst-word-based detection has the following problems: first, it lacks a background model and a probabilistic interpretation; second, it cannot track how the topic of an event changes over time; finally, it cannot distinguish events that occur at different times in the data set.
In topic-model-based detection, an event is represented as a topic. Topic models are widely used to mine latent variables in text data sets. In classical topic models, such as Latent Dirichlet Allocation (LDA), topics are identified from the co-occurrence relations between words in documents. However, topic-model-based detection also has problems: first, classical topic models are designed for long-text data sets, whereas the word co-occurrence relations in short texts are too sparse and the computation is hard to converge; second, events occurring at different times in the data set cannot be distinguished; finally, the related techniques target offline processing, and the related algorithms are computed sequentially and are not parallel.
Therefore, event detection in the prior art cannot simultaneously accommodate the real-time, social and fragmented nature of short texts in social networks, which leads to inaccurate detection results.
Summary of the invention
The present invention provides a real-time incremental detection method and system for social network events, to solve the technical problem that event detection in the prior art cannot simultaneously accommodate the real-time, social and fragmented nature of short texts in social networks, which leads to inaccurate detection results.
A first aspect of the present invention provides a real-time incremental detection method for social network events, comprising:
using a probabilistic graphical model, performing model learning on short texts according to the time, document and topic label of each short text, to obtain a likelihood function;
solving the likelihood function with the EM algorithm to obtain parameters;
iteratively updating the obtained parameters in an incremental-update manner until the parameters converge;
executing the E-step and the M-step of the EM algorithm on the converged parameters in a distributed manner, to compute the content of the short documents.
Another aspect of the present invention provides a real-time incremental detection system for social network events, comprising:
a model learning module, configured to use a probabilistic graphical model to perform model learning on short texts according to the time, document and topic label of each short text, to obtain a likelihood function;
a likelihood function module, configured to solve the likelihood function with the EM algorithm to obtain parameters;
an incremental update module, configured to iteratively update the obtained parameters in an incremental-update manner until the parameters converge;
a distributed computation module, configured to execute the E-step and the M-step of the EM algorithm on the converged parameters in a distributed manner, to compute the content of the short documents.
The real-time incremental detection method and system for social network events provided by the present invention use a probabilistic graphical model to perform model learning on short texts according to the time, document and topic label of each short text, obtaining a likelihood function; solve the likelihood function with the EM algorithm to obtain parameters; iteratively update the obtained parameters in an incremental-update manner until they converge; and execute the E-step and the M-step of the EM algorithm on the converged parameters in a distributed manner to compute the content of the short documents. This resolves the technical problem that event detection in the prior art cannot simultaneously accommodate the real-time, social and fragmented nature of short texts in social networks, which leads to inaccurate detection results. In addition, a supervised short-text event detection model, an algorithm combining incremental learning with prediction, and an event detection model based on an in-memory computing platform are proposed.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a real-time incremental detection method for social network events provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a real-time incremental detection method for social network events provided by another embodiment of the present invention;
Fig. 3 is the probabilistic graphical model;
Fig. 4 is a schematic structural diagram of a real-time incremental detection system for social network events provided by an embodiment of the present invention.
Detailed description of the embodiments
Fig. 1 is a schematic flowchart of a real-time incremental detection method for social network events provided by an embodiment of the present invention. As shown in Fig. 1, the method includes:
101. Using a probabilistic graphical model, perform model learning on short texts according to the time, document and topic label of each short text, to obtain a likelihood function.
102. Using the Expectation Maximization (EM) algorithm, solve the likelihood function to obtain parameters.
The parameters include p(z|d), p(t_d|z), p(h|z) and p(w|z), as well as p(z|w,d,t_d,h) and p(z|w,d,t_d); here p(z|d) denotes the probability of topic z in document d; p(t_d|z) denotes the probability of topic z at time t_d; p(h|z) denotes the probability that topic label h appears in topic z; p(w|z) denotes the probability of word w in topic z; p(z|w,d,t_d,h) denotes the probability that topic z is associated with word w, document d, time t_d and topic label h; and p(z|w,d,t_d) denotes the probability that topic z is associated with word w, document d and time t_d.
103. Using an incremental-update manner, iteratively update the obtained parameters until the parameters converge.
104. Using a distributed manner, execute the E-step and the M-step of the EM algorithm on the converged parameters, and compute the content of the short documents.
In this embodiment, model learning is performed on short texts with a probabilistic graphical model according to the time, document and topic label of each short text to obtain a likelihood function; the likelihood function is solved with the EM algorithm to obtain parameters; the obtained parameters are iteratively updated in an incremental-update manner until they converge; and the E-step and the M-step of the EM algorithm are executed on the converged parameters in a distributed manner to compute the content of the short documents. This resolves the technical problem that event detection in the prior art cannot simultaneously accommodate the real-time, social and fragmented nature of short texts in social networks, which leads to inaccurate detection results. In addition, a supervised short-text event detection model, an algorithm combining incremental learning with prediction, and an event detection model based on an in-memory computing platform are proposed.
Fig. 2 is a schematic flowchart of a real-time incremental detection method for social network events provided by another embodiment of the present invention. As shown in Fig. 2, the method includes:
201. Model learning.
Fig. 3 shows the probabilistic graphical model representation of the event detection method provided by this embodiment. As shown in Fig. 3, a topic z is generated by the time t_d, the document d and the topic label h, and the content W of a short document is generated by a topic z. The likelihood function is as follows:
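(Written in LaTeX following the expression of the likelihood function given in claim 1:)

\log L = \sum_{d} p(d)\,\bigl\{\lambda_{\theta_B}\, p(t_d \mid \theta_B)\, p(h \mid \theta_B)\, p(w \mid \theta_B)\bigr\}
       + (1-\lambda_{\theta_B})\Bigl\{\sum_{d \in H} p(d) \sum_{z} \bigl\{ p(z \mid d)\, p(t_d \mid z)\, p(h \mid z) \prod_{w} p(w \mid z)^{\,n(w,d)} \bigr\}\Bigr\}
       + (1-\lambda_{\theta_B})\Bigl\{\sum_{d \in \tilde{H}} p(d) \sum_{z} \bigl\{ p(z \mid d)\, p(t_d \mid z) \prod_{w} p(w \mid z)^{\,n(w,d)} \bigr\}\Bigr\}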
where H denotes the set of texts containing topic labels, H̃ denotes the set of texts not containing topic labels, θ_B denotes the background topic of words without practical meaning, and n(w,d) denotes the frequency of word w in document d. λ_{θ_B} denotes the probability that a document has no practical meaning; p(d) denotes the probability that document d occurs, which is identical for all documents in the model and equals the reciprocal of the total number of documents; p(t_d|θ_B) denotes the probability that the background topic θ_B appears at time t_d; p(h|θ_B) denotes the probability that topic label h appears in the background topic θ_B; p(w|θ_B) denotes the probability of word w in the background topic θ_B; p(z|d) denotes the probability of topic z in document d; p(t_d|z) denotes the probability of topic z at time t_d; p(h|z) denotes the probability that topic label h appears in topic z; and p(w|z) denotes the probability of word w in topic z.
The results of solving it with the EM algorithm are as follows.
E-step:
The result of the E-step is the intermediate result of the algorithm.
p(z|w,d,t_d,h) denotes the probability that topic z is associated with word w, document d, time t_d and topic label h.
p(z|w,d,t_d) denotes the probability that topic z is associated with word w, document d and time t_d.
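Written in LaTeX following the E-step expressions given in claim 2, the posteriors are:

p(z \mid w, d, t_d, h) = \frac{p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)\, p(h \mid z)}{\sum_{z} p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)} \quad (d \in H)

p(z \mid w, d, t_d) = \frac{p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)}{\sum_{z} p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)} \quad (d \in \tilde{H})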
M-step:
In the result of the M-step, p(w|z) denotes the probability of word w in topic z; p(z|d) denotes the probability of topic z in document d; and p(t_d|z) denotes the probability of topic z at time t_d.
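Written in LaTeX following the M-step expressions given in claim 2 (the second expression is read here as the update of p(z|d) for labelled documents, matching the surrounding text), the re-estimated parameters are:

p(w \mid z) = \frac{\sum_{d \in H} n(w,d)\, p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} n(w,d)\, p(z \mid w, d, t_d)}{\sum_{w}\bigl(\sum_{d \in H} n(w,d)\, p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} n(w,d)\, p(z \mid w, d, t_d)\bigr)}

p(z \mid d) = \frac{p(z \mid w, d, t_d, h)}{\sum_{z} p(z \mid w, d, t_d, h)} \quad (d \in H), \qquad p(z \mid d) = \frac{p(z \mid w, d, t_d)}{\sum_{z} p(z \mid w, d, t_d)} \quad (d \in \tilde{H})

p(t_d \mid z) = \frac{\sum_{d \in H} p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} p(z \mid w, d, t_d)}{\sum_{t}\bigl(\sum_{d \in H} p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} p(z \mid w, d, t_d)\bigr)}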
It should be noted that, as a whole, the flow of the EM algorithm is as follows:
E-step: estimate the expected values of the unknown parameters, giving the current parameter estimates.
M-step: re-estimate the distribution parameters so as to maximize the likelihood of the data, giving the expected estimates of the unknown variables.
Iterate the E-step and the M-step until convergence.
202. Parameter update.
In order to keep event detection continuous and to detect a large amount of event content quickly, the parameters can be updated incrementally. The parameter update process is divided into the following parts.
First, the computation of the algorithm is iterative, and the variables are initialized randomly at the very beginning. Step (a) is then performed: taking p(w|z) to be either the initialization value or the value computed in the previous iteration, the posterior probabilities p(z|w,d,t_d,h) and p(z|w,d,t_d) are computed. Next, step (b) is performed: the values of p(z|d) and p(t_d|z) are determined by substituting the previously computed p(z|w,d,t_d,h) and p(z|w,d,t_d) into the corresponding M-step formulas, and p(w|z) is solved as well. Finally, after the obtained results are normalized, steps (a) and (b) are iterated until convergence.
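As a minimal illustrative sketch of this iteration of steps (a) and (b), and not the patented implementation itself, the update could be organized as follows in Python; all names (e_step, m_step, the document dictionaries, the smoothing constants) are hypothetical, the posteriors are simply normalized over the topics, and p(h|z) is kept fixed because the iteration described above only updates p(z|d), p(t_d|z) and p(w|z).

from collections import defaultdict

def e_step(docs, topics, p_w_z, p_z_d, p_t_z, p_h_z):
    # Step (a): posterior over topics for each (word, document) pair, from the
    # current p(w|z), p(z|d), p(t_d|z) and, for labelled documents, p(h|z).
    post = {}
    for d, doc in docs.items():
        for w in doc["counts"]:
            scores = {}
            for z in topics:
                s = p_w_z[z].get(w, 1e-12) * p_z_d[d].get(z, 1e-12) * p_t_z[z].get(doc["time"], 1e-12)
                if doc.get("label") is not None:
                    s *= p_h_z[z].get(doc["label"], 1e-12)
                scores[z] = s
            total = sum(scores.values()) or 1e-12
            post[(w, d)] = {z: s / total for z, s in scores.items()}
    return post

def m_step(docs, topics, post):
    # Step (b): re-estimate p(w|z), p(z|d) and p(t_d|z) from the posteriors, then normalize.
    p_w_z = {z: defaultdict(float) for z in topics}
    p_z_d = {d: defaultdict(float) for d in docs}
    p_t_z = {z: defaultdict(float) for z in topics}
    for d, doc in docs.items():
        for w, cnt in doc["counts"].items():
            for z in topics:
                r = post[(w, d)][z]
                p_w_z[z][w] += cnt * r
                p_z_d[d][z] += r
                p_t_z[z][doc["time"]] += r
    for table in list(p_w_z.values()) + list(p_z_d.values()) + list(p_t_z.values()):
        total = sum(table.values()) or 1e-12
        for k in table:
            table[k] /= total
    return p_w_z, p_z_d, p_t_z

def iterate_until_convergence(docs, topics, p_w_z, p_z_d, p_t_z, p_h_z, tol=1e-4, max_iter=100):
    # Repeat steps (a) and (b); the convergence test on p(w|z) used here is illustrative.
    for _ in range(max_iter):
        post = e_step(docs, topics, p_w_z, p_z_d, p_t_z, p_h_z)
        new_w_z, p_z_d, p_t_z = m_step(docs, topics, post)
        delta = max((abs(new_w_z[z][w] - p_w_z[z].get(w, 0.0)) for z in topics for w in new_w_z[z]),
                    default=0.0)
        p_w_z = new_w_z
        if delta < tol:
            break
    return p_w_z, p_z_d, p_t_z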
203. Distributed implementation.
In one possible implementation, the foregoing EM algorithm can be implemented in a distributed manner.
Specifically, the distributed computation is realized with MapReduce. MapReduce is a programming model for parallel computation over large-scale data sets that makes it very easy for programmers to run their own programs on a distributed system. Current software implementations specify a map function, which maps a set of key-value pairs to a new set of key-value pairs, and a concurrent reduce function, which ensures that all mapped key-value pairs sharing the same key are grouped together.
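As a rough illustration of this map/reduce pattern, and not the actual implementation used by the invention, a self-contained Python sketch might look like the following; map_reduce and the word-count usage are hypothetical.

from collections import defaultdict

def map_reduce(records, map_fn, reduce_fn):
    # Map: emit (key, value) pairs for every record; shuffle: group values that share a key;
    # reduce: fold each group of values into a single result per key.
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {key: reduce_fn(values) for key, values in groups.items()}

# Hypothetical usage: counting word frequencies across two short documents.
docs = ["event detection", "event stream"]
counts = map_reduce(docs,
                    map_fn=lambda doc: [(w, 1) for w in doc.split()],
                    reduce_fn=sum)
# counts == {"event": 2, "detection": 1, "stream": 1}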
For the E-step, the distributed implementation may specifically use the following code.
Algorithm 1 E-Step
Input: key: z; value: map<w,val>; key: z; value: map<t,val>; the pairs from M-Step
Output: key: z; value: map<(w,d,td),val>
1:  flatMapToPair:
2:    ArrayList M_map = new ArrayList;
3:    foreach d contains w
4:      M_map.add(key=d, value=map<z,val>)
5:    endforeach
6:    foreach d contains td
7:      M_map.add(key=d, value=map<z,val>)
8:    endforeach
9:    return M_map;
10: M_map union p(z|d);
11: reduceByKey(arg0, arg1):
12:   foreach key in arg1.map.keySets
13:     if arg0.map contains key
14:       arg0.map.get(key) *= arg1.map.get(key)
15:     else
16:       arg0.map.put(arg1.map.get(key))
17:     endif
18:   return arg0
19: normalize()  /* normalize the results */
The foregoing code executes as follows:
lines 1-9 are the map stage, which splits the required computation into small tasks;
line 1 gives the name of the map function;
line 2 declares the storage for the computation results;
lines 3-5 are the small tasks that partition p(w|z);
lines 6-8 are the small tasks that partition p(t_d|z);
lines 10-18 are the reduce stage, which collects the computation results of the small tasks;
line 10 allocates the storage for the collected results;
line 11 gives the name of the reduce function;
lines 13-16 collect the p(w|z) and p(t_d|z) results respectively and multiply them to obtain the quantity referred to in the E-step formula;
line 19 normalizes the results.
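Under the same illustrative assumptions, the multiply-then-normalize reduce of Algorithm 1 (lines 12-19) can be sketched in Python as follows; reduce_multiply and the example factor maps are hypothetical.

def reduce_multiply(partials):
    # Multiply, topic by topic, the partial factor maps collected for one key
    # (lines 12-18 of Algorithm 1), then normalize over the topics (line 19).
    combined = {}
    for factor_map in partials:
        for z, val in factor_map.items():
            combined[z] = combined.get(z, 1.0) * val
    total = sum(combined.values()) or 1.0
    return {z: v / total for z, v in combined.items()}

# Hypothetical example: two factor maps for the same (w, d, t_d) key,
# e.g. {z: p(w|z)} and {z: p(z|d) * p(t_d|z)}.
posterior = reduce_multiply([{"z1": 0.2, "z2": 0.8}, {"z1": 0.5, "z2": 0.5}])
# posterior == {"z1": 0.2, "z2": 0.8} after normalization (0.1/0.5 and 0.4/0.5)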
For the M-step, the distributed implementation may specifically use the following code.
Algorithm 2 M-Step
Input: key: d; value: map<z,val>; the pair from E-Step
Output: key: z; value: map<map1<w,val>, map2<t,val>>
1:  flatMapToPair:
2:    ArrayList E_map1;
3:    ArrayList E_map2;
4:    for w in d
5:      for z in map1.keysets
6:        E_map1.add(key=z, value=map<w,val>)
7:      endfor
8:    endfor
9:    E_map2.add(key=z, value=map<t,val>)
10:   return map<E_map1, E_map2>
11: reduceByKey(arg0, arg1):
12:   foreach key in arg1.map1.keySets
13:     if arg0.map1 contains key
14:       arg0.map1.get(key) += arg1.map1.get(key) * n(w,d)
15:     else
16:       arg0.map1.put(arg1.map1.get(key) * n(w,d))
17:     endif
18:   foreach key in arg1.map2.keySets
19:     if arg0.map2 contains key
20:       arg0.map2.get(key) += arg1.map2.get(key)
21:     else
22:       arg0.map2.put(arg1.map2.get(key))
23:     endif
24:   return arg0
25: normalize()  /* normalize the results */
The foregoing code executes as follows:
line 1 is the function name of the map stage, which splits the required computation into small tasks;
lines 2-3 allocate the space for storing the results;
lines 4-8 convert p(z|w,d,t_d,h) and p(z|w,d,t_d) into small tasks;
lines 9-10 return the results stored by the small tasks;
line 11 is the function name of the reduce stage, which collects the results computed by the small tasks;
lines 13-17 compute the value of p(w|z);
lines 18-23 compute the value of p(t_d|z);
line 24 returns the computation result of the algorithm;
line 25 normalizes the results.
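Analogously, the accumulate-then-normalize reduce of Algorithm 2 (lines 12-25) can be sketched as follows; reduce_accumulate and the example triples are hypothetical, with each triple carrying the word, its count n(w,d) and the posterior from the E-step.

def reduce_accumulate(triples):
    # For one topic z, sum n(w,d) * posterior per word (lines 12-17 of Algorithm 2),
    # then normalize over the words to obtain an estimate of p(w|z) (line 25).
    totals = {}
    for w, n_wd, posterior in triples:
        totals[w] = totals.get(w, 0.0) + n_wd * posterior
    norm = sum(totals.values()) or 1.0
    return {w: v / norm for w, v in totals.items()}

# Hypothetical example: contributions of two documents to one topic for the words "storm" and "rain".
p_w_given_z = reduce_accumulate([("storm", 3, 0.9), ("rain", 1, 0.6), ("storm", 2, 0.5)])
# totals: storm = 3*0.9 + 2*0.5 = 3.7, rain = 0.6; normalized by 4.3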
In this embodiment, the real-time incremental detection method for social network events performs model learning on short texts with a probabilistic graphical model according to the time, document and topic label of each short text, obtaining a likelihood function; solves the likelihood function with the EM algorithm to obtain parameters; iteratively updates the obtained parameters in an incremental-update manner until they converge; and executes the E-step and the M-step of the EM algorithm on the converged parameters in a distributed manner to compute the content of the short documents. This resolves the technical problem that event detection in the prior art cannot simultaneously accommodate the real-time, social and fragmented nature of short texts in social networks, which leads to inaccurate detection results. In addition, a supervised short-text event detection model, an algorithm combining incremental learning with prediction, and an event detection model based on an in-memory computing platform are proposed.
Fig. 4 is a schematic structural diagram of a real-time incremental detection system for social network events provided by an embodiment of the present invention, including: a model learning module 41, a likelihood function module 42, an incremental update module 43 and a distributed computation module 44.
The model learning module 41 is configured to use a probabilistic graphical model to perform model learning on short texts according to the time, document and topic label of each short text, to obtain a likelihood function.
The likelihood function module 42 is connected to the model learning module 41 and is configured to solve the likelihood function with the EM algorithm to obtain parameters.
The incremental update module 43 is connected to the likelihood function module 42 and is configured to iteratively update the obtained parameters in an incremental-update manner until the parameters converge.
The distributed computation module 44 is connected to the incremental update module 43 and is configured to execute the E-step and the M-step of the EM algorithm on the converged parameters in a distributed manner, to compute the content of the short documents.
In this embodiment, the real-time incremental detection system for social network events performs model learning on short texts with a probabilistic graphical model according to the time, document and topic label of each short text, obtaining a likelihood function; solves the likelihood function with the EM algorithm to obtain parameters; iteratively updates the obtained parameters in an incremental-update manner until they converge; and executes the E-step and the M-step of the EM algorithm on the converged parameters in a distributed manner to compute the content of the short documents. This resolves the technical problem that event detection in the prior art cannot simultaneously accommodate the real-time, social and fragmented nature of short texts in social networks, which leads to inaccurate detection results. In addition, a supervised short-text event detection model, an algorithm combining incremental learning with prediction, and an event detection model based on an in-memory computing platform are proposed.
To explain this embodiment clearly, this embodiment further provides a possible implementation. In this possible implementation,
the model learning module 41 is specifically configured to use the probabilistic graphical model to perform model learning on short texts according to the time t_d, document d and topic label h of each short text, to obtain the likelihood function log L set out above,
where, as above, H denotes the set of texts containing topic labels, H̃ denotes the set of texts not containing topic labels, θ_B denotes the background topic of words without practical meaning, n(w,d) denotes the frequency of word w in document d, λ_{θ_B} denotes the probability that a document has no practical meaning, p(d) denotes the probability that document d occurs (identical for all documents in the model and equal to the reciprocal of the total number of documents), p(t_d|θ_B) denotes the probability that the background topic θ_B appears at time t_d, p(h|θ_B) denotes the probability that topic label h appears in the background topic θ_B, p(w|θ_B) denotes the probability of word w in the background topic θ_B, p(z|d) denotes the probability of topic z in document d, p(t_d|z) denotes the probability of topic z at time t_d, p(h|z) denotes the probability that topic label h appears in topic z, and p(w|z) denotes the probability of word w in topic z.
The likelihood function module 42 is specifically configured to solve the likelihood function log L with the EM algorithm, obtaining in the E-step the posterior probabilities p(z|w,d,t_d,h) and p(z|w,d,t_d),
and obtaining in the M-step the parameters p(w|z), p(z|d) and p(t_d|z).
The incremental update module 43 is specifically configured to: compute the posterior probabilities p(z|w,d,t_d,h) and p(z|w,d,t_d) according to the previously obtained value of the parameter p(w|z); compute the values of the parameters p(z|d) and p(t_d|z) according to p(z|w,d,t_d,h) and p(z|w,d,t_d), and solve p(w|z); normalize the obtained p(w|z), p(z|d) and p(t_d|z); and iterate this process of computing the posterior probabilities from the previously obtained p(w|z), computing p(z|d) and p(t_d|z), solving p(w|z) and normalizing, until p(w|z), p(z|d) and p(t_d|z) converge.
The system provided in this embodiment is configured to perform the methods described in Fig. 1 and Fig. 2; the principle of each functional module in the system is not repeated here, and reference may be made to the corresponding method embodiments.
In this embodiment, the short-text-based social network event detection apparatus performs model learning on short texts with a probabilistic graphical model according to the time, document and topic label of each short text, obtaining a likelihood function; solves the likelihood function with the EM algorithm to obtain parameters; iteratively updates the obtained parameters in an incremental-update manner until they converge; and executes the E-step and the M-step of the EM algorithm on the converged parameters in a distributed manner to compute the content of the short documents, thereby resolving the technical problem that event detection in the prior art cannot simultaneously accommodate the real-time, social and fragmented nature of short texts in social networks, which leads to inaccurate detection results. In addition, a supervised short-text event detection model, an algorithm combining incremental learning with prediction, and an event detection model based on an in-memory computing platform are proposed.
Those of ordinary skill in the art will appreciate that all or part of the steps of the foregoing method embodiments may be implemented by hardware related to program instructions. The foregoing program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the foregoing method embodiments. The foregoing storage medium includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk or an optical disc.
Finally, it should be noted that the foregoing embodiments are merely intended to illustrate the technical solutions of the present invention rather than to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions to some or all of the technical features; such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (6)

  1. A real-time incremental detection method for social network events, characterized by comprising:
    using a probabilistic graphical model, performing model learning on short texts according to the time, document and topic label of each short text, to obtain a likelihood function;
    solving the likelihood function with the expectation-maximization (EM) algorithm to obtain parameters;
    iteratively updating the obtained parameters in an incremental-update manner until the parameters converge;
    executing the E-step and the M-step of the EM algorithm on the converged parameters in a distributed manner, to compute the content of the short documents;
    wherein said using a probabilistic graphical model and performing model learning on short texts according to the time, document and topic label to obtain a likelihood function comprises:
    using the probabilistic graphical model, performing model learning on short texts according to the time t_d, document d and topic label h of each short text, to obtain the likelihood function
    \log L = \sum_{d} p(d)\,\bigl\{\lambda_{\theta_B}\, p(t_d \mid \theta_B)\, p(h \mid \theta_B)\, p(w \mid \theta_B)\bigr\}
           + (1-\lambda_{\theta_B})\Bigl\{\sum_{d \in H} p(d) \sum_{z} \bigl\{ p(z \mid d)\, p(t_d \mid z)\, p(h \mid z) \prod_{w} p(w \mid z)^{\,n(w,d)} \bigr\}\Bigr\}
           + (1-\lambda_{\theta_B})\Bigl\{\sum_{d \in \tilde{H}} p(d) \sum_{z} \bigl\{ p(z \mid d)\, p(t_d \mid z) \prod_{w} p(w \mid z)^{\,n(w,d)} \bigr\}\Bigr\};
    wherein H denotes the set of texts containing topic labels, H̃ denotes the set of texts not containing topic labels, θ_B denotes the background topic of words without practical meaning, and n(w,d) denotes the frequency of word w in document d; λ_{θ_B} denotes the probability that a document has no practical meaning; p(d) denotes the probability that document d occurs, which is identical for all documents in the model and equals the reciprocal of the total number of documents; p(t_d|θ_B) denotes the probability that the background topic θ_B appears at time t_d; p(h|θ_B) denotes the probability that topic label h appears in the background topic θ_B; p(w|θ_B) denotes the probability of word w in the background topic θ_B; p(z|d) denotes the probability of topic z in document d; p(t_d|z) denotes the probability of topic z at time t_d; p(h|z) denotes the probability that topic label h appears in topic z; and p(w|z) denotes the probability of word w in topic z.
  2. The real-time incremental detection method for social network events according to claim 1, characterized in that said solving the likelihood function with the EM algorithm to obtain parameters comprises:
    solving the likelihood function log L with the EM algorithm, obtaining in the E-step the parameters
    p(z \mid w, d, t_d, h) = \frac{p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)\, p(h \mid z)}{\sum_{z} p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)} \quad (d \in H)
    p(z \mid w, d, t_d) = \frac{p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)}{\sum_{z} p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)} \quad (d \in \tilde{H});
    wherein p(z|w,d,t_d,h) denotes the probability that topic z is associated with word w, document d, time t_d and topic label h; and p(z|w,d,t_d) denotes the probability that topic z is associated with word w, document d and time t_d;
    and obtaining in the M-step the parameters
    p(w \mid z) = \frac{\sum_{d \in H} n(w,d)\, p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} n(w,d)\, p(z \mid w, d, t_d)}{\sum_{w}\bigl(\sum_{d \in H} n(w,d)\, p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} n(w,d)\, p(z \mid w, d, t_d)\bigr)}
    p(z \mid d) = \frac{p(z \mid w, d, t_d, h)}{\sum_{z} p(z \mid w, d, t_d, h)} \quad (d \in H)
    p(z \mid d) = \frac{p(z \mid w, d, t_d)}{\sum_{z} p(z \mid w, d, t_d)} \quad (d \in \tilde{H})
    p(t_d \mid z) = \frac{\sum_{d \in H} p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} p(z \mid w, d, t_d)}{\sum_{t}\bigl(\sum_{d \in H} p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} p(z \mid w, d, t_d)\bigr)}.
  3. The real-time incremental detection method for social network events according to claim 2, characterized in that said iteratively updating the obtained parameters in an incremental-update manner until the parameters converge comprises:
    computing the posterior probabilities p(z|w,d,t_d,h) and p(z|w,d,t_d) according to the previously obtained value of the parameter p(w|z);
    computing the values of the parameters p(z|d) and p(t_d|z) according to p(z|w,d,t_d,h) and p(z|w,d,t_d), and solving p(w|z);
    normalizing the obtained p(w|z), p(z|d) and p(t_d|z);
    iterating the process of computing the posterior probabilities p(z|w,d,t_d,h) and p(z|w,d,t_d) according to the previously obtained value of the parameter p(w|z), computing the values of the parameters p(z|d) and p(t_d|z) according to p(z|w,d,t_d,h) and p(z|w,d,t_d) and solving p(w|z), and normalizing the obtained p(w|z), p(z|d) and p(t_d|z), until p(w|z), p(z|d) and p(t_d|z) converge.
  4. A real-time incremental detection system for social network events, characterized by comprising:
    a model learning module, configured to use a probabilistic graphical model to perform model learning on short texts according to the time, document and topic label of each short text, to obtain a likelihood function;
    a likelihood function module, configured to solve the likelihood function with the expectation-maximization (EM) algorithm to obtain parameters;
    an incremental update module, configured to iteratively update the obtained parameters in an incremental-update manner until the parameters converge;
    a distributed computation module, configured to execute the E-step and the M-step of the EM algorithm on the converged parameters in a distributed manner, to compute the content of the short documents;
    wherein the model learning module is specifically configured to use the probabilistic graphical model to perform model learning on short texts according to the time t_d, document d and topic label h of each short text, to obtain the likelihood function
    \log L = \sum_{d} p(d)\,\bigl\{\lambda_{\theta_B}\, p(t_d \mid \theta_B)\, p(h \mid \theta_B)\, p(w \mid \theta_B)\bigr\}
           + (1-\lambda_{\theta_B})\Bigl\{\sum_{d \in H} p(d) \sum_{z} \bigl\{ p(z \mid d)\, p(t_d \mid z)\, p(h \mid z) \prod_{w} p(w \mid z)^{\,n(w,d)} \bigr\}\Bigr\}
           + (1-\lambda_{\theta_B})\Bigl\{\sum_{d \in \tilde{H}} p(d) \sum_{z} \bigl\{ p(z \mid d)\, p(t_d \mid z) \prod_{w} p(w \mid z)^{\,n(w,d)} \bigr\}\Bigr\};
    wherein H denotes the set of texts containing topic labels, H̃ denotes the set of texts not containing topic labels, θ_B denotes the background topic of words without practical meaning, and n(w,d) denotes the frequency of word w in document d; λ_{θ_B} denotes the probability that a document has no practical meaning; p(d) denotes the probability that document d occurs, which is identical for all documents in the model and equals the reciprocal of the total number of documents; p(t_d|θ_B) denotes the probability that the background topic θ_B appears at time t_d; p(h|θ_B) denotes the probability that topic label h appears in the background topic θ_B; p(w|θ_B) denotes the probability of word w in the background topic θ_B; p(z|d) denotes the probability of topic z in document d; p(t_d|z) denotes the probability of topic z at time t_d; p(h|z) denotes the probability that topic label h appears in topic z; and p(w|z) denotes the probability of word w in topic z.
  5. The real-time incremental detection system for social network events according to claim 4, characterized in that
    the likelihood function module is specifically configured to solve the likelihood function log L with the EM algorithm, obtaining in the E-step the parameters
    p(z \mid w, d, t_d, h) = \frac{p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)\, p(h \mid z)}{\sum_{z} p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)} \quad (d \in H)
    p(z \mid w, d, t_d) = \frac{p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)}{\sum_{z} p(w \mid z)\, p(z \mid d)\, p(t_d \mid z)} \quad (d \in \tilde{H});
    wherein p(z|w,d,t_d,h) denotes the probability that topic z is associated with word w, document d, time t_d and topic label h; and p(z|w,d,t_d) denotes the probability that topic z is associated with word w, document d and time t_d;
    and obtaining in the M-step the parameters
    p(w \mid z) = \frac{\sum_{d \in H} n(w,d)\, p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} n(w,d)\, p(z \mid w, d, t_d)}{\sum_{w}\bigl(\sum_{d \in H} n(w,d)\, p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} n(w,d)\, p(z \mid w, d, t_d)\bigr)}
    p(z \mid d) = \frac{p(z \mid w, d, t_d, h)}{\sum_{z} p(z \mid w, d, t_d, h)} \quad (d \in H)
    p(z \mid d) = \frac{p(z \mid w, d, t_d)}{\sum_{z} p(z \mid w, d, t_d)} \quad (d \in \tilde{H})
    p(t_d \mid z) = \frac{\sum_{d \in H} p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} p(z \mid w, d, t_d)}{\sum_{t}\bigl(\sum_{d \in H} p(z \mid w, d, t_d, h) + \sum_{d \in \tilde{H}} p(z \mid w, d, t_d)\bigr)}.
  6. The real-time incremental detection system for social network events according to claim 5, characterized in that the incremental update module is specifically configured to: compute the posterior probabilities p(z|w,d,t_d,h) and p(z|w,d,t_d) according to the previously obtained value of the parameter p(w|z); compute the values of the parameters p(z|d) and p(t_d|z) according to p(z|w,d,t_d,h) and p(z|w,d,t_d), and solve p(w|z); normalize the obtained p(w|z), p(z|d) and p(t_d|z); and iterate this process of computing the posterior probabilities from the previously obtained p(w|z), computing p(z|d) and p(t_d|z), solving p(w|z) and normalizing, until p(w|z), p(z|d) and p(t_d|z) converge.
CN201410509359.1A 2014-09-28 2014-09-28 Real-time incremental detection method and system for social network events Active CN104281670B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410509359.1A CN104281670B (en) 2014-09-28 2014-09-28 Real-time incremental detection method and system for social network events

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410509359.1A CN104281670B (en) 2014-09-28 2014-09-28 Real-time incremental detection method and system for social network events

Publications (2)

Publication Number Publication Date
CN104281670A CN104281670A (en) 2015-01-14
CN104281670B true CN104281670B (en) 2017-12-15

Family

ID=52256543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410509359.1A Active CN104281670B (en) 2014-09-28 2014-09-28 Real-time incremental detection method and system for social network events

Country Status (1)

Country Link
CN (1) CN104281670B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106909541A (en) * 2015-12-23 2017-06-30 神州数码信息系统有限公司 A system for automatic identification, classification and reporting of cross-domain public opinion

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102289487A (en) * 2011-08-09 2011-12-21 浙江大学 Network burst hotspot event detection method based on topic model

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102289487A (en) * 2011-08-09 2011-12-21 浙江大学 Network burst hotspot event detection method based on topic model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"EDM: An Efficient Event Detection Algorithm for Microblogs"; Tong Wei et al.; 《计算机科学与探索》; 2012-10-19; pp. 1076-1086 *
Using Incremental PLSI for Threshold-Resilient Online Event Analysis; Tzu-Chuan Chou et al.; IEEE Transactions on Knowledge and Data Engineering; 2008-03-31; vol. 20, no. 3; pp. 289-299 *

Also Published As

Publication number Publication date
CN104281670A (en) 2015-01-14

Similar Documents

Publication Publication Date Title
Zhao et al. Data mining applications with R
CN106934012A (en) A kind of question answering in natural language method and system of knowledge based collection of illustrative plates
Mazumdar et al. Query complexity of clustering with side information
Despalatović et al. Community structure in networks: Girvan-Newman algorithm improvement
CN103150383B (en) A kind of event evolution analysis method of short text data
Dowe et al. Bayes not Bust! Why Simplicity is no Problem for Bayesians1
CN106599194A (en) Label determining method and device
Akgun et al. Automated symmetry breaking and model selection in Conjure
CN105825269B (en) A kind of feature learning method and system based on parallel automatic coding machine
CN107609141A (en) It is a kind of that quick modelling method of probabilistic is carried out to extensive renewable energy source data
CN104484380A (en) Personalized search method and personalized search device
CN114357117A (en) Transaction information query method and device, computer equipment and storage medium
Blanken et al. Estimating network structures using model selection
CN101266660A (en) Reality inconsistency analysis method based on descriptive logic
Gürünlü Alma et al. On the estimation of the extreme value and normal distribution parameters based on progressive type-II hybrid-censored data
Rahman et al. kDMI: A novel method for missing values imputation using two levels of horizontal partitioning in a data set
CN110046344A (en) Add the method and terminal device of separator
CN104281670B (en) The real-time incremental formula detection method and system of a kind of social networks event
Chis Sliding hidden markov model for evaluating discrete data
Roos et al. Analysis of textual variation by latent tree structures
Moreno et al. Scalable and exact sampling method for probabilistic generative graph models
CN109871889A (en) Mass psychology appraisal procedure under emergency event
Riabi et al. β-entropy for Pareto-type distributions and related weighted distributions
CN107430600A (en) Expansible web data extraction
Patra et al. Motif discovery in biological network using expansion tree

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant