CN103810282B - Logistic-normal model topic extraction method - Google Patents

Logistic-normal model topic extraction method

Info

Publication number
CN103810282B
CN103810282B · Application CN201410056958.2A
Authority
CN
China
Prior art keywords
computing node
document
topic
parameter server
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410056958.2A
Other languages
Chinese (zh)
Other versions
CN103810282A (en)
Inventor
Jun Zhu (朱军)
Jianfei Chen (陈键飞)
Zi Wang (王紫)
Bo Zhang (张钹)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Real AI Technology Co Ltd
Original Assignee
Tsinghua University (清华大学)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201410056958.2A priority Critical patent/CN103810282B/en
Publication of CN103810282A publication Critical patent/CN103810282A/en
Application granted granted Critical
Publication of CN103810282B publication Critical patent/CN103810282B/en
Legal status: Active

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31Indexing; Data structures therefor; Storage structures
    • G06F16/313Selection or weighting of terms for indexing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a logistic-normal model topic extraction method, which includes the following steps. First, a parameter server stores the count matrix in distributed fashion across the computing nodes and distributes all documents in the training set to the computing nodes. Second, Gibbs sampling is performed on the topic corresponding to each word in the documents. Third, the feature vector of each sampled document is collected. Fourth, the sum and the sum of squares of the feature vectors of the documents on each computing node are computed, the posterior distribution obeyed by the mean and covariance of the feature vectors is computed from them, and the mean and covariance of the document feature vectors are sampled from that posterior. Fifth, whether the number of iterations has reached a predetermined constant is judged: if so, iteration stops and the sixth step is executed; if not, the iteration count is incremented by one and the second through fourth steps are executed again. In the sixth step, the second and third steps are executed once more on the documents of each computing node, a softmax transformation is applied to the feature vectors sampled in the third step, and the proportion of each topic in each document on the computing node is output. The method can increase the speed of topic extraction.

Description

A logistic-normal model topic extraction method
Technical field
The present invention relates to the field of data mining, and more particularly to a logistic-normal model topic extraction method.
Background technology
Latent topic models have shown clear advantages in mining the semantic structure of documents and in managing processing complexity. When latent topic models are used to mine the semantic structure of large document collections, the main problems to be solved are: the number of documents is very large, so algorithms usable in a distributed computing environment are needed; and the model must be flexible, for example able to extract the correlations between topics.
Today the data to which latent topic models are applied has grown from small text collections to large-scale social networks, or even the entire Internet. Traditional single-machine learning methods cannot meet the requirements of big data; fast algorithms that can run in a distributed computing environment are needed.
In the prior art, the correlated topic model uses a non-conjugate logistic-normal model to extract topic correlations. In the correlated topic model, the learning algorithm for the logistic-normal model uses variational methods and is solved by repeated numerical iteration.
As the foregoing shows, the learning algorithm for the logistic-normal model in the correlated topic model uses variational methods solved by many numerical iterations; it is inefficient and slow.
Summary of the invention
The present invention provides a logistic-normal model topic extraction method that can increase the speed of topic extraction.
The present invention provides a logistic-normal model topic extraction method, comprising:
S1: a parameter server stores, in distributed fashion across the computing nodes, the count matrix of topic-word correspondences in the training set; the parameter server distributes all documents in the training set to the computing nodes, and each computing node saves the count matrix and the documents sent by the parameter server;
S2: for each word in the documents on a computing node, the node performs Gibbs sampling of the word's topic according to the count matrix stored on that node;
S3: the computing node samples each document's feature vector according to the topics sampled for the words in that document;
S4: the computing node computes the sum and the sum of squares of the feature vectors of the documents on the node, uses the sum and the sum of squares to compute the posterior distribution obeyed by the mean and covariance of all the feature vectors, and samples the mean and covariance of the document feature vectors from the posterior distribution;
S5: the computing node judges whether the number of iterations has reached a predetermined constant; if so, iteration stops and S6 is executed; if not, the iteration count is incremented by 1 and S2, S3, S4 are executed in turn;
S6: the computing node executes S2 and S3 in turn on its documents, applies a softmax transformation to the feature vectors sampled in S3, and outputs the proportion of each topic in each document on the node.
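As an illustration only, the following minimal single-node sketch shows how the loop S2–S6 fits together; the three sampler callables are hypothetical placeholders for the samplers of S2, S3 and S4 described below, not names taken from the patent:

```python
import numpy as np

def extract_topics(docs, count_matrix, n_iters,
                   sample_topics, sample_eta, sample_mean_cov):
    """Single-node sketch of steps S2-S6; the three callables stand in
    for the Gibbs samplers that the text describes."""
    mu = cov = None                                    # set by the first S4 pass
    for _ in range(n_iters):                           # S5: fixed iteration budget
        topics = sample_topics(docs, count_matrix, mu, cov)  # S2: topic sampling
        eta = sample_eta(docs, topics, mu, cov)              # S3: feature vectors
        s, sq = eta.sum(axis=0), eta.T @ eta           # S4: sum and sum of squares
        mu, cov = sample_mean_cov(s, sq, len(docs))    # S4: draw mean/covariance
    topics = sample_topics(docs, count_matrix, mu, cov)      # S6: one more S2/S3 pass
    eta = sample_eta(docs, topics, mu, cov)
    e = np.exp(eta - eta.max(axis=1, keepdims=True))   # S6: row-wise softmax
    return e / e.sum(axis=1, keepdims=True)            # topic proportions per document
```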
Further, the method further includes:
the computing node splits the posterior distribution of a topic into a term from the count matrix stored on the node and a prior term; by introducing an auxiliary uniformly distributed random variable into the sampling, only non-zero entries are visited when sampling from the count-matrix term.
Further, the computing node sampling the feature vector of a document according to the topics sampled for the words in that document further includes:
S31: introducing an augmentation variable for each dimension of the feature vector;
S32: approximately sampling each augmentation variable, using a Gaussian distribution, from its conditional distribution given the current feature vector;
S33: sampling each dimension of the feature vector in turn from its conditional distribution given the augmentation variables and all other dimensions of the feature vector;
S34: judging whether the loop count has reached a preset number of loops; if not, incrementing the loop count by 1 and executing S32 and S33 in turn.
Further, the preset number of loops is 8.
Further, step S32 includes: approximately sampling the augmentation variable, using a transformed Polya-Gamma(1, z) distribution, from the conditional distribution of any dimension's augmentation variable given the current feature vector.
Further, the method also includes: integrating out the latent topic-word distribution matrix from the posterior distribution of any topic.
Further, the method also includes:
the computing node records the increments of its count matrix and periodically synchronizes each row of the count matrix with the parameter server corresponding to that row, wherein the parameter server is a distributed server and different rows of the count matrix are stored on different nodes.
Further, the computing node recording the increments of its count matrix and periodically synchronizing each row of the count matrix with the corresponding parameter server specifically includes:
computing, from the row number, which parameter server stores the row, and sending that row's increment on this computing node to the parameter server;
the parameter server updates its count matrix with the increment sent, and sends back to the computing node the difference between the corresponding row on the parameter server and the row on the computing node;
the computing node updates the row on this computing node according to the difference received.
The logistic-normal model topic extraction method provided by the present invention processes large-scale data by distributed computation and can increase the speed of topic extraction.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are merely some embodiments of the present invention, and persons of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flow chart of a logistic-normal model topic extraction method provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a logistic-normal model topic extraction method. Referring to Fig. 1, the method includes:
S1: a parameter server stores, in distributed fashion across the computing nodes, the count matrix of topic-word correspondences in the training set; the parameter server distributes all documents in the training set to the computing nodes, and each computing node saves the count matrix and the documents sent by the parameter server;
S2: for each word in the documents on a computing node, the node performs Gibbs sampling of the word's topic according to the count matrix stored on that node;
S3: the computing node samples each document's feature vector according to the topics sampled for the words in that document;
S4: the computing node computes the sum and the sum of squares of the feature vectors of the documents on the node, uses the sum and the sum of squares to compute the posterior distribution obeyed by the mean and covariance of all the feature vectors, and samples the mean and covariance of the document feature vectors from the posterior distribution;
S5: the computing node judges whether the number of iterations has reached a predetermined constant; if so, iteration stops and S6 is executed; if not, the iteration count is incremented by 1 and S2, S3, S4 are executed in turn;
S6: the computing node executes S2 and S3 in turn on its documents, applies a softmax transformation to the feature vectors sampled in S3, and outputs the proportion of each topic in each document on the node.
The logistic-normal model topic extraction method provided by this embodiment of the present invention processes large-scale data by distributed computation and can increase the speed of topic extraction.
A topic extraction system includes a parameter server and at least one computing node. The parameter server distributes the documents to be processed in the training set to the computing nodes and sends the count matrix to them; each computing node saves the portion of the training-set documents distributed to it by the parameter server and performs topic extraction on the saved documents.
In step S1, the parameter server stores the count matrix C of topic-word correspondences in the training set in distributed fashion across the computing nodes; the parameter server distributes all documents in the training set to the computing nodes, and each computing node saves the count matrix C and the documents sent by the parameter server.
Here C_kv = #{(d, n) : w_dn = v, z_dn = k}, D is the number of documents, N_d is the length of the d-th document, w_dn ∈ [1, V] is the index of the n-th word of the d-th document, V is the size of the vocabulary, z_dn ∈ [1, K] is the topic index of the n-th word of the d-th document, and K is the number of topics; #A denotes the number of elements of the set A.
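As an illustration of this notation (not part of the patent text), a sketch of building the K × V count matrix from the word indices w_dn and the topic assignments z_dn:

```python
import numpy as np

def build_count_matrix(w, z, K, V):
    """C[k, v] = #{(d, n) : w[d][n] == v and z[d][n] == k}."""
    C = np.zeros((K, V), dtype=np.int64)
    for w_d, z_d in zip(w, z):  # one word sequence and topic sequence per document
        for v, k in zip(w_d, z_d):
            C[k, v] += 1
    return C
```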
In step S2, for each word's topic z_dn in the documents on the computing node, the node performs Gibbs sampling according to the count matrix C stored on that node.
In step S3, the computing node samples the feature vector η_d of each document according to the topics z_dn sampled for the words in the document, where η_dk is the k-th dimension of the feature vector of document d.
In step S4, the computing node computes the sum and the sum of squares of the feature vectors η_d of the documents on this node, uses the sum and the sum of squares to compute the posterior distribution obeyed by the mean and covariance of the feature vectors, and samples the mean μ and covariance Σ of the document feature vectors from the posterior distribution.
In step S6, the computing node executes S2 and S3 in turn on its documents, applies a softmax transformation to the feature vectors η_d sampled in S3, and outputs the proportion θ_dk of each topic k in each document d on this node, where the softmax transform is defined as θ_dk = exp(η_dk) / Σ_j exp(η_dj), so that after the transform Σ_k θ_dk = 1.
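A direct implementation of this softmax transform (shifting by the maximum first is a numerical-stability detail not discussed in the patent):

```python
import numpy as np

def softmax(eta):
    """theta_k = exp(eta_k) / sum_j exp(eta_j) for one document's feature vector."""
    e = np.exp(eta - np.max(eta))  # shift by the max so exp() cannot overflow
    return e / e.sum()
```

For example, softmax(np.array([0.0, 1.0, 1.0])) is approximately [0.155, 0.422, 0.422], a valid topic-proportion vector.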
Further, the method includes: the computing node splits the posterior distribution of a topic into a term from the count matrix stored on this node and a prior term; by introducing an auxiliary uniformly distributed random variable into the sampling, only non-zero entries are visited when sampling from the count-matrix term.
Step S3 specifically includes:
S31: introducing an augmentation variable for each dimension of the feature vector;
S32: approximately sampling each augmentation variable, using a Gaussian distribution, from its conditional distribution given the current feature vector;
S33: sampling each dimension of the feature vector in turn from its conditional distribution given the augmentation variables and all other dimensions of the feature vector;
Specifically, the i-th dimension of the feature vector is sampled as η_di ~ P(η_di | the i-th augmentation variable, the dimensions of the feature vector other than i).
S34: judging whether the loop count has reached a preset number of loops; if not, incrementing the loop count by 1 and executing S32 and S33 in turn.
The preset number of loops is preferably 8. Step S32 can also be realized as follows: from the conditional distribution of any dimension's augmentation variable given the current feature vector, the augmentation variable is approximately sampled using a transformed Polya-Gamma(1, z) distribution.
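One way to draw an approximate PG(1, z) sample is the truncated sum-of-gammas representation of the Polya-Gamma distribution from Polson, Scott and Windle; the patent does not spell out its transform, so this sketch is an assumption about one workable choice:

```python
import numpy as np

def sample_pg1_approx(z, trunc=200, rng=None):
    """Approximate draw from PG(1, z) via the truncated representation
        X = (1 / (2 pi^2)) * sum_k g_k / ((k - 1/2)^2 + z^2 / (4 pi^2)),
    where the g_k ~ Gamma(1, 1) are i.i.d. and k runs from 1 to trunc."""
    if rng is None:
        rng = np.random.default_rng()
    k = np.arange(1, trunc + 1)
    g = rng.gamma(shape=1.0, scale=1.0, size=trunc)
    return (g / ((k - 0.5) ** 2 + (z / (2 * np.pi)) ** 2)).sum() / (2 * np.pi ** 2)
```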
In addition, to increase the speed of topic extraction, it is preferable to integrate out the latent topic-word distribution matrix from the posterior distribution of any topic.
The method maintains the count matrix by periodic asynchronous incremental updates, which specifically includes: the computing node records the increments of its count matrix and periodically synchronizes each row of the count matrix with the parameter server corresponding to that row, wherein the parameter server is a distributed server and different rows of the count matrix are stored on different nodes.
The computing node recording the increments of its count matrix and periodically synchronizing each row of the count matrix with the corresponding parameter server specifically includes:
computing, from the row number, which parameter server stores the row, and sending that row's increment on this computing node to the parameter server;
the parameter server updates its count matrix with the increment sent, and sends back to the computing node the difference between the corresponding row on the parameter server and the row on the computing node;
the computing node updates the row on this computing node according to the difference received.
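A minimal sketch of this three-step exchange for a single row; the modular row-to-server mapping and the dict-based server store are our assumptions, not details given in the patent:

```python
import numpy as np

def sync_row(row_id, local_row, local_increment, servers):
    """Periodic asynchronous incremental sync of one count-matrix row:
    locate the server for the row, push the local increment, then pull
    back the difference and fold it into the local copy."""
    server = servers[row_id % len(servers)]  # which parameter server stores this row
    server[row_id] = server.get(row_id, np.zeros_like(local_row)) + local_increment
    diff = server[row_id] - local_row        # difference the server sends back
    local_row += diff                        # local copy now matches the server
    local_increment[:] = 0                   # the increment has been flushed
    return local_row
```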
The sampling procedure of the method specifically includes:
a: on the computing node, Gibbs sampling is used to sample the topic z_dn of a word in a document from its posterior distribution given the topics of all other words and the feature vectors,
where the conditional probability is p(z_dn = k | rest) ∝ exp(η_dk) · (C¬dn_{k,w_dn} + β_{w_dn}) / (Σ_v C¬dn_{kv} + Σ_v β_v), with C¬dn denoting the counts excluding word n of document d, and β = (0.01, …, 0.01) is the prior set in advance.
b: using the count matrix C of topic-word correspondences, the conditional probability of each topic is split, term by term, into a sparse part A_k coming from the counts (non-zero only when C¬dn_{k,w_dn} > 0) and a dense prior part B_k coming from β, so that p(z_dn = k | rest) ∝ A_k + B_k.
The method of sampling from the conditional distribution is: draw u ~ U(0, 1); if u < (Σ_k A_k) / (Σ_k A_k + Σ_k B_k), draw z_dn ~ Mult(A / Σ_k A_k), visiting only the non-zero entries of A; otherwise draw z_dn ~ Mult(B / Σ_k B_k),
where Mult(A) is the multinomial distribution with vector A as its parameter;
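Under the two-bucket split reconstructed in step b (the variable names are ours), a sketch of a draw that visits only the non-zero count entries:

```python
import numpy as np

def sample_topic_sparse(A_vals, A_idx, B, rng=None):
    """A_vals: unnormalized masses of the sparse count term, for the topics
    listed in A_idx only; B: dense unnormalized prior-term masses (length K)."""
    if rng is None:
        rng = np.random.default_rng()
    mass_A, mass_B = A_vals.sum(), B.sum()
    if rng.uniform() < mass_A / (mass_A + mass_B):
        # sparse bucket: only topics with non-zero counts are touched
        return A_idx[rng.choice(len(A_vals), p=A_vals / mass_A)]
    return rng.choice(len(B), p=B / mass_B)  # dense prior bucket
```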
c: an augmentation variable λ_dk is introduced for each dimension of the feature vector η_d;
d: from the conditional distribution of any dimension's augmentation variable given the current feature vector, the augmentation variable λ_dk is approximately sampled using a Gaussian distribution or a transformed Polya-Gamma(1, z) distribution,
where PG(b, c) denotes the Polya-Gamma distribution;
e: given the augmentation variables and all other dimensions of the feature vector, each dimension of the feature vector is sampled in turn from its conditional distribution,
where, thanks to the Polya-Gamma augmentation, the conditional distribution of each dimension η_dk given λ_dk and the remaining dimensions is Gaussian;
f: steps d and e are repeated until the number of repetitions reaches the preset number of loops S = 8;
g: each computing node computes the mean and covariance of the feature vectors of all documents on that node, and this information is used to compute the mean and covariance of the feature vectors of all documents in the training set;
h: the parameters of the posterior distribution are computed from the mean and covariance of the feature vectors of all documents in the training set, and a new mean and covariance are sampled according to these parameters. If the prior distribution of (μ, Σ) is the Normal-Inverse-Wishart distribution NIW(μ0, ρ, κ, W), then the posterior distribution is NIW(μ', ρ', κ', W'),
where μ' = (ρ μ0 + D η̄) / (ρ + D), ρ' = ρ + D, κ' = κ + D, and W'^{-1} = W^{-1} + Q + (ρ D / (ρ + D)) (η̄ − μ0)(η̄ − μ0)^T, with sample mean η̄ = (1/D) Σ_d η_d and sample scatter Q = Σ_d (η_d − η̄)(η_d − η̄)^T.
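Assuming the NIW parameterization reconstructed above (our reading of the elided formulas, not verbatim from the patent), a sketch of step h with numpy and scipy:

```python
import numpy as np
from scipy.stats import invwishart

def sample_mu_sigma(eta, mu0, rho, kappa, W_inv, rng=None):
    """Draw (mu, Sigma) from the Normal-Inverse-Wishart posterior given the
    D x K matrix eta of document feature vectors."""
    if rng is None:
        rng = np.random.default_rng()
    D = eta.shape[0]
    eta_bar = eta.mean(axis=0)                         # sample mean
    Q = (eta - eta_bar).T @ (eta - eta_bar)            # sample scatter matrix
    rho_p, kappa_p = rho + D, kappa + D                # posterior counts
    mu_p = (rho * mu0 + D * eta_bar) / rho_p           # posterior mean parameter
    d = eta_bar - mu0
    W_inv_p = W_inv + Q + (rho * D / rho_p) * np.outer(d, d)
    sigma = invwishart.rvs(df=kappa_p, scale=W_inv_p, random_state=rng)
    mu = rng.multivariate_normal(mu_p, sigma / rho_p)  # mu | Sigma ~ N(mu', Sigma/rho')
    return mu, sigma
```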
It should be noted that during topic extraction a computing node updates the count matrix on that node and notifies the parameter server of the updates; the parameter server updates the count matrix it stores according to the update content sent by all computing nodes and, after the update is complete, sends the latest version of the count matrix to all computing nodes; a computing node receiving this latest version thereby updates the count matrix on that node. In addition, the method provided in the embodiment of the present invention uses a logistic-normal prior and thus obtains the correlations between topics.
As the foregoing shows, the logistic-normal model topic extraction method provided by the embodiment of the present invention uses distributed computation to process large-scale document collections and, by way of data augmentation, obtains an exact Gibbs sampling algorithm, improving computational efficiency and precision and increasing the speed of topic extraction.
It should be noted that relational terms such as first and second are used herein only to distinguish one entity or operation from another and do not necessarily require or imply any such actual relation or order between these entities or operations. Moreover, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. In the absence of further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes that element.
Persons of ordinary skill in the art will understand that all or part of the steps of the method embodiments above may be implemented by program instructions running on related hardware; the program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments above; the storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disk or optical disk.
Finally, it should be noted that the above are merely preferred embodiments of the present invention, intended only to illustrate the technical solutions of the present invention and not to limit its protection scope. All modifications, equivalent replacements, improvements and the like made within the spirit and principles of the present invention are included in the protection scope of the present invention.

Claims (7)

1. A logistic-normal model topic extraction method, characterized in that the method comprises:
S1: a parameter server stores, in distributed fashion across the computing nodes, the count matrix of topic-word correspondences in the training set; the parameter server distributes all documents in the training set to the computing nodes, and each computing node saves the count matrix and the documents sent by the parameter server;
S2: for each word in the documents on a computing node, the node performs Gibbs sampling of the word's topic according to the count matrix stored on that node;
S3: the computing node samples each document's feature vector according to the topics sampled for the words in that document;
S4: the computing node computes the sum and the sum of squares of the feature vectors of the documents on the node, uses the sum and the sum of squares to compute the posterior distribution obeyed by the mean and covariance of all the feature vectors, and samples the mean and covariance of the document feature vectors from the posterior distribution;
S5: the computing node judges whether the number of iterations has reached a predetermined constant; if so, iteration stops and S6 is executed; if not, the iteration count is incremented by 1 and S2, S3, S4 are executed in turn;
S6: the computing node executes S2 and S3 in turn on its documents, applies a softmax transformation to the feature vectors sampled in S3, and outputs the proportion of each topic in each document on the node.
2. The method according to claim 1, characterized in that the method further comprises:
the computing node splits the posterior distribution of a topic into a term from the count matrix stored on the node and a prior term; by introducing an auxiliary uniformly distributed random variable into the sampling, only non-zero entries are visited when sampling from the count-matrix term.
3. The method according to claim 1, characterized in that the computing node sampling the feature vector of a document according to the topics sampled for the words in that document further comprises:
S31: introducing an augmentation variable for each dimension of the feature vector;
S32: approximately sampling each augmentation variable, using a Gaussian distribution, from its conditional distribution given the current feature vector;
S33: sampling each dimension of the feature vector in turn from its conditional distribution given the augmentation variables and all other dimensions of the feature vector;
S34: judging whether the loop count has reached a preset number of loops; if not, incrementing the loop count by 1 and executing S32 and S33 in turn.
4. The method according to claim 3, characterized in that the preset number of loops is 8.
5. The method according to claim 1, characterized in that the method further comprises: integrating out the latent topic-word distribution matrix from the posterior distribution of any topic.
6. The method according to claim 1, characterized in that the method further comprises:
the computing node records the increments of its count matrix and periodically synchronizes each row of the count matrix with the parameter server corresponding to that row, wherein the parameter server is a distributed server and different rows of the count matrix are stored on different nodes.
7. The method according to claim 6, characterized in that the computing node recording the increments of its count matrix and periodically synchronizing each row of the count matrix with the corresponding parameter server specifically comprises:
computing, from the row number, which parameter server stores the row, and sending that row's increment on the computing node to the parameter server;
the parameter server updates its count matrix with the increment sent, and sends back to the computing node the difference between the corresponding row on the parameter server and the row on the computing node;
the computing node updates the row on this computing node according to the difference received.
CN201410056958.2A 2014-02-19 2014-02-19 Logistic-normal model topic extraction method Active CN103810282B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410056958.2A CN103810282B (en) 2014-02-19 2014-02-19 Logistic-normal model topic extraction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410056958.2A CN103810282B (en) 2014-02-19 2014-02-19 Logistic-normal model topic extraction method

Publications (2)

Publication Number Publication Date
CN103810282A CN103810282A (en) 2014-05-21
CN103810282B true CN103810282B (en) 2017-02-15

Family

ID=50707052

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410056958.2A Active CN103810282B (en) 2014-02-19 2014-02-19 Logistic-normal model topic extraction method

Country Status (1)

Country Link
CN (1) CN103810282B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105868186A (en) * 2016-06-01 2016-08-17 清华大学 Simple and efficient topic extracting method
CN111259081A (en) * 2020-02-04 2020-06-09 杭州数梦工场科技有限公司 Data synchronization method and device, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8463648B1 (en) * 2012-05-04 2013-06-11 Pearl.com LLC Method and apparatus for automated topic extraction used for the creation and promotion of new categories in a consultation system
CN103207856A (en) * 2013-04-03 2013-07-17 同济大学 Ontology concept and hierarchical relation generation method

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8463648B1 (en) * 2012-05-04 2013-06-11 Pearl.com LLC Method and apparatus for automated topic extraction used for the creation and promotion of new categories in a consultation system
CN103207856A (en) * 2013-04-03 2013-07-17 同济大学 Ontology concept and hierarchical relation generation method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Generalized Relational Topic Models with Data Augmentation; Ning Chen et al.; Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence; 2013; 1273-1279 *
Gibbs Max-Margin Topic Models with Fast Sampling Algorithms; Jun Zhu et al.; Proceedings of the 30th International Conference on Machine Learning; 2013; 124-132 *
Scalable Inference for Logistic-Normal Topic Models; Jianfei Chen et al.; Advances in Neural Information Processing Systems 26 (NIPS 2013); 2013; 1-9 *

Also Published As

Publication number Publication date
CN103810282A (en) 2014-05-21

Similar Documents

Publication Publication Date Title
CN104484343B (en) It is a kind of that method of the motif discovery with following the trail of is carried out to microblogging
Gillispie et al. Enumerating Markov equivalence classes of acyclic digraph models
Huang et al. Prediction of wind power by chaos and BP artificial neural networks approach based on genetic algorithm
CN102890698B (en) Method for automatically describing microblogging topic tag
CN106844424A (en) A kind of file classification method based on LDA
CN103279478B (en) A kind of based on distributed mutual information file characteristics extracting method
Cavuoti et al. Photometric redshifts with the quasi Newton algorithm (MLPQNA) Results in the PHAT1 contest
CN103870447A (en) Keyword extracting method based on implied Dirichlet model
CN105868178A (en) Multi-document automatic abstract generation method based on phrase subject modeling
CN111382276B (en) Event development context graph generation method
CN103207856A (en) Ontology concept and hierarchical relation generation method
Sandholm et al. Best experienced payoff dynamics and cooperation in the centipede game
CN106294371B (en) Character string codomain cutting method and device
Martinez Alzamora et al. Fast and practical method for model reduction of large-scale water-distribution networks
CN104317794B (en) Chinese Feature Words association mode method for digging and its system based on dynamic item weights
Zaheer et al. Exponential stochastic cellular automata for massively parallel inference
CN105740354A (en) Adaptive potential Dirichlet model selection method and apparatus
CN103810282B (en) Logistic-normal model topic extraction method
CN106485370B (en) A kind of method and apparatus of information prediction
CN105224577A (en) Multi-label text classification method and system
Schram et al. Monte Carlo methods beyond detailed balance
CN109558436A (en) Air station flight delay causality method for digging based on entropy of transition
CN105335459A (en) XBRL intelligent report platform based statement consolidation data extraction method
CN111930944A (en) File label classification method and device
Zhang et al. Extracting sample data based on Poisson distribution

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20210527

Address after: 100084 a1901, 19th floor, building 8, yard 1, Zhongguancun East Road, Haidian District, Beijing

Patentee after: Beijing Ruili Wisdom Technology Co.,Ltd.

Address before: 100084 mailbox, 100084-82 Tsinghua Yuan, Beijing, Haidian District, Beijing

Patentee before: TSINGHUA University

EE01 Entry into force of recordation of patent licensing contract
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20140521

Assignee: Beijing Intellectual Property Management Co.,Ltd.

Assignor: Beijing Ruili Wisdom Technology Co.,Ltd.

Contract record no.: X2023110000073

Denomination of invention: Logistic-normal model topic extraction method

Granted publication date: 20170215

License type: Common License

Record date: 20230531