CN109840324A - A semantic enhancement topic model and topic evolution analysis method - Google Patents
- Publication number
- CN109840324A (application CN201910020033.5A)
- Authority
- CN
- China
- Prior art keywords
- theme
- semantic
- subject
- word
- topic model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
Abstract
The invention discloses a semantic enhancement topic model and a topic evolution analysis method. The semantic enhancement topic model incorporates a conditional random field into the correlated topic model, uses the semantic enhancement property of word vectors to raise the probability that similar words are assigned to the same topic, and uses context-related words to remove noise, thereby effectively improving the semantic coherence of topic extraction. The invention also proposes an algorithm for constructing the evolution relationships between different topics: topics are built dynamically from a text data stream and a topic evolution graph is generated, enabling evolution analysis between core topics and sub-topics. Experimental results on a Sina Weibo data set show that the proposed topic extraction method outperforms five baseline topic models on the topic coherence metric and can automatically generate a topic evolution graph, effectively mining the topic evolution patterns in text.
Description
Technical field
The invention belongs to the technical field of topic evolution analysis and relates to a semantic enhancement topic model and a topic evolution analysis method. Specifically, it relates to a semantic enhancement topic model that combines the advantages of word-vector-based semantic enhancement and of topic correlation extraction based on the correlated topic model, and to a topic evolution analysis method based on this model.
Background technique
With the rapid development of the mobile Internet, applications such as WeChat, microblogs, e-mail, forums, live-streaming platforms and review websites have become increasingly widespread. Much of the information these platforms produce is generated in real time and updated dynamically in the form of data streams. Such fast-changing data streams create an urgent need for an effective monitoring tool that can analyze, in real time, the large amount of information generated in text data streams. Search engines give people an effective way to quickly retrieve useful information from large volumes of archived data. However, the results returned by a search engine are usually fragmented, unstructured information and cannot reflect how an overall topic evolves over time. With the emergence of probabilistic topic models, a large body of research on topic extraction and evolution has appeared, offering a good solution to the problems of hot-topic extraction and topic evolution over time.
Topic evolution analysis means that, given a set of high-quality topics, the evolution trends of topic content and topic survival status are analyzed over time. This is of great significance for emergency early warning, public opinion guidance, product marketing, information recommendation and similar applications.
In recent years, methods have appeared that use word vectors to improve the quality of the topics extracted by topic models. These methods exploit the rich semantic relations in word vectors to strengthen the semantic associations in text, which alleviates the sparsity problem and improves topic extraction performance. Although such methods improve model capability to some extent, they ignore the fact that the generation mechanism of word vectors gives each word only one vector. For polysemous words this mechanism introduces noise into topic inference and degrades topic extraction. This is one of the key problems the invention addresses.
In addition, for a hot event, a rich variety of news, comments and opinions from different readers is usually collected within a short time. Faced with such a large text data stream, readers cannot easily understand the event by reading all the relevant short texts. Many topic evolution analysis methods exist, but they analyze evolution in terms of topic strength and cannot analyze how the internal nodes of a topic change when the topic changes. Extracting the associations between core topics and sub-topics during topic evolution, so as to generate an easy-to-understand topic evolution graph, is therefore the other key problem the invention addresses.
Summary of the invention
To solve the above technical problems, the invention provides a semantic enhancement topic model that combines the advantages of word-vector-based semantic enhancement and of topic correlation extraction based on the correlated topic model, together with a topic evolution analysis method based on this model.
The semantic enhancement topic model provided by the invention is characterized as follows:
First, a conditional random field layer is added to the latent topic layer of the correlated topic model, and the topics z of semantically related word pairs are connected by undirected edges, including five undirected edges: (z_m1, z_m2), (z_m1, z_m4), (z_m1, z_m5), (z_m2, z_m6) and (z_m3, z_m6). Second, the context-related words of each word w are stored in x; when the cosine similarity between a semantically related word pair and the context-related words exceeds a certain threshold, the edge between their topics is treated as an invalid edge, which eliminates noise produced during topic inference. Finally, during topic inference a semantic reward function makes semantically related words belong to the same topic with high probability.
The topic evolution analysis method based on the semantic enhancement topic model provided by the invention comprises the following steps:
Step 1: preprocess the collected text corpus data set;
Step 2: identify the semantically related word pairs in the text;
First determine whether the cosine similarity between the word vectors of the two words in a pair is less than a set threshold; if it is, identify the pair as a semantically related word pair and apply semantic enhancement to its topic modeling process; otherwise, apply no semantic enhancement;
Step 3: extract topics and relations from the text based on the semantic enhancement topic model;
A conditional random field layer is added to the latent topic layer of the correlated topic model, and the topics of semantically related word pairs are connected by undirected edges, so that during topic inference semantically related words belong to the same topic with high probability, and noise produced during topic inference is eliminated using context-related words;
Step 4: perform parameter inference on the topic posterior distribution of the semantic enhancement topic model;
Step 5: divide the text corpus data set into several time slices, arrange them in chronological order, and use the online semantic enhancement topic model to construct topics from the text corpus data set and generate a topic evolution graph.
The invention has the following advantages:
1. The invention designs a new topic model that uses a conditional random field to incorporate external semantic enhancement information at the topic layer, achieving high-quality topic discovery and topic relation extraction;
2. The invention designs an online topic evolution model that can effectively identify the associations between topics in a text stream and thereby automatically generate a topic evolution graph.
Brief description of the drawings
Fig. 1 is a schematic diagram of the semantic enhancement topic model of an embodiment of the invention;
Fig. 2a compares the topic coherence of the invention with that of the baseline methods (number of topics: 5);
Fig. 2b compares the topic coherence of the invention with that of the baseline methods (number of topics: 10);
Fig. 3 is the topic evolution graph automatically generated by the online topic evolution model in an embodiment of the invention.
Detailed description of the embodiments
To help those of ordinary skill in the art understand and implement the invention, it is described in further detail below with reference to the drawings and embodiments. The embodiments described here serve only to illustrate and explain the invention and are not intended to limit it.
Building on the correlated topic model, the invention first proposes a new semantic enhancement topic model, CCTM (Conditional random field regularized Correlated Topic Model). By incorporating a conditional random field into the correlated topic model, CCTM uses the semantic enhancement property of word vectors to raise the probability that similar words are assigned to the same topic, which effectively improves the semantic coherence of topic extraction and removes noise. The invention further proposes an online version of CCTM that dynamically constructs topics from a text data stream and generates a topic evolution graph, enabling evolution analysis between core topics and sub-topics.
Referring to Fig. 1, in the semantic enhancement topic model CCTM provided by the invention, a conditional random field layer is first added to the latent topic layer of the correlated topic model, and the topics z of semantically related word pairs are connected by undirected edges. As shown in Fig. 1, there are five such undirected edges: (z_m1, z_m2), (z_m1, z_m4), (z_m1, z_m5), (z_m2, z_m6) and (z_m3, z_m6). Second, the context-related words of each word w are stored in x. When the cosine similarity between a semantically related word pair and the context-related words exceeds a certain threshold, the edge between their topics is treated as an invalid edge, which eliminates noise produced during topic inference. Finally, during topic inference a semantic reward function makes semantically related words belong to the same topic with high probability.
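The edge invalidation rule can be sketched as follows. This is a minimal illustration rather than the patented implementation: it reads the rule as "drop the edge (w_a, w_b) when any context-related word of w_a has cosine similarity to w_b above the threshold", which is how the description states it elsewhere. The function names, the `context` dictionary and the threshold name `tau` are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def valid_edges(pairs, context, vectors, tau):
    """Keep only the edges that the context-word rule does not invalidate.

    pairs   : iterable of (wa, wb) semantically related word pairs
    context : dict mapping a word to its list of context-related words (x)
    vectors : dict mapping a word to its word vector
    tau     : hypothetical name for the invalidation threshold
    """
    kept = []
    for wa, wb in pairs:
        # The edge is invalid if any context word of wa is too similar to wb.
        invalid = any(
            xa in vectors and cosine_similarity(vectors[xa], vectors[wb]) > tau
            for xa in context.get(wa, [])
        )
        if not invalid:
            kept.append((wa, wb))
    return kept
```

Only the surviving edges would then be used to connect topic labels in the conditional random field layer.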
The invention also provides a topic evolution analysis method based on the semantic enhancement topic model, comprising the following steps:
Step 1: preprocess the collected text corpus data set;
Preprocessing includes filtering out non-Chinese characters and stop words, and filtering out words that occur fewer than a set number of times.
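A minimal sketch of this preprocessing step is given below. It assumes the documents are already word-segmented into token lists, treats "Chinese" as the basic CJK Unified Ideographs block, and drops any token that is not made up entirely of such characters; the names `preprocess`, `stop_words` and `min_count` are illustrative and do not come from the patent.

```python
import re
from collections import Counter

# One or more characters from the basic CJK Unified Ideographs block (an assumption).
CHINESE_RUN = re.compile(r"[\u4e00-\u9fff]+")

def preprocess(docs, stop_words, min_count):
    """Filter a corpus given as a list of token lists.

    Removes tokens that are not purely Chinese, tokens in stop_words,
    and tokens whose corpus-wide frequency is below min_count.
    """
    # Pass 1: drop non-Chinese tokens and stop words.
    cleaned = [
        [tok for tok in doc if CHINESE_RUN.fullmatch(tok) and tok not in stop_words]
        for doc in docs
    ]
    # Pass 2: count over the whole corpus, then drop rare words.
    freq = Counter(tok for doc in cleaned for tok in doc)
    return [[tok for tok in doc if freq[tok] >= min_count] for doc in cleaned]
```

Counting frequencies after the first pass means the frequency cutoff is applied only to words that survive the character and stop-word filters.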
Step 2: identify the semantically related word pairs in the text, laying the foundation for the semantic enhancement in step 3. Specifically, first determine whether the cosine similarity between the word vectors of the two words in a pair is less than a set threshold; if it is, identify the pair as a semantically related word pair and apply semantic enhancement to its topic modeling process; otherwise, apply no semantic enhancement;
For each word pair (w_a, w_b) in a document, if the condition d(w_a, w_b) < ξ is satisfied, where d(w_a, w_b) denotes the cosine similarity of the two word vectors in the pair and ξ is a preset threshold, the pair is identified as a semantically related word pair, and a semantic enhancement relationship holds between word w_a and word w_b.
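The pair identification condition can be sketched as below. The code applies d(w_a, w_b) < ξ literally, with d defaulting to cosine similarity as the text states. The measure `d` is left as a parameter so that a different score (for example a distance such as 1 minus the cosine) can be substituted if that is what the threshold is intended for. The function and parameter names are hypothetical.

```python
import math
from itertools import combinations

def cosine_similarity(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def related_pairs(doc_words, vectors, xi, d=cosine_similarity):
    """Return the set of word pairs (wa, wb) in one document with d(wa, wb) < xi.

    doc_words : list of words in the document
    vectors   : dict mapping a word to its word vector
    xi        : the preset threshold
    d         : the pairwise measure (assumed to be symmetric)
    """
    pairs = set()
    # Each unordered pair of distinct words is examined once.
    for wa, wb in combinations(sorted(set(doc_words)), 2):
        if wa in vectors and wb in vectors and d(vectors[wa], vectors[wb]) < xi:
            pairs.add((wa, wb))
    return pairs
```

Words without a vector are skipped rather than treated as unrelated to everything, which keeps out-of-vocabulary tokens from producing spurious pairs.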
Step 3: extract topics and relations from the text based on the semantic enhancement topic model CCTM;
A conditional random field layer is added to the latent topic layer of the correlated topic model, and the topics of semantically related word pairs are connected by undirected edges, so that during topic inference semantically related words belong to the same topic with high probability, and noise produced during topic inference is eliminated using context-related words;
If a semantic enhancement relationship exists between two words, their topic labels are connected by an undirected edge in the semantic enhancement topic model CCTM;
The joint probability of the topic labels is then as follows:
The quantities in this expression are: the prior distribution of topic k in document m; the prior distribution of word w_mn; V, the size of the vocabulary; z_mn, the topic probability distribution of the n-th word in document m; z_-mn, the topic probability distribution after removing word w_mn; w_-mn, the remaining words after removing word w_mn; and x_mn, the probability distribution of the context-related words of the n-th word in document m. If the cosine similarity between a context-related word x_a of word w_a and word w_b exceeds a set threshold, the semantic enhancement relationship between w_a and w_b is released, which eliminates noise produced during topic inference. The expression also uses the number of remaining words, excluding w_mn, assigned to topic k, and the number of times word j, excluding w_mn, is assigned to topic k. ψ(·) denotes the semantic reward function, as follows:
Here λ is a balancing hyperparameter; if λ is 0, the semantic enhancement topic model CCTM is equivalent to the correlated topic model. A is the probability normalization factor. E denotes the semantic enhancement connection graph. f(z_mi, z_mj) is a counting function that indicates how many semantic enhancement words of w_mi belong to the same topic z_mi, which ultimately strengthens the probability that word w_mi belongs to topic z_mi.
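The role of the semantic reward function can be illustrated with a short sketch. Because the formula itself is not reproduced in this text, the code assumes the form ψ(z_mi = k) = exp((λ / A) · Σ f(z_mi, z_mj)) over the edges of E that touch word i, with f counting the linked words currently assigned to topic k. This assumed form is chosen only because it agrees with the stated properties: λ = 0 gives ψ = 1, so the model reduces to the correlated topic model. All function names are hypothetical.

```python
import math

def semantic_reward(i, k, topics, edges, lam, A=1.0):
    """Assumed reward: exp((lam / A) * number of neighbours of word i with topic k).

    i      : index of the word within the document
    k      : candidate topic for word i
    topics : list of current topic labels, one per word position
    edges  : list of (i, j) undirected edges from the semantic enhancement graph E
    lam    : balancing hyperparameter (lambda)
    A      : normalization factor
    """
    count = 0
    for a, b in edges:
        if a == i and topics[b] == k:
            count += 1
        elif b == i and topics[a] == k:
            count += 1
    return math.exp(lam * count / A)

def reweight(base_probs, i, topics, edges, lam, A=1.0):
    """Multiply a base topic distribution for word i by the reward and renormalize."""
    weighted = [
        p * semantic_reward(i, k, topics, edges, lam, A)
        for k, p in enumerate(base_probs)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]
```

In this sketch `base_probs` stands in for whatever the underlying correlated topic model assigns to each topic before the semantic enhancement is applied.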
Step 4: perform parameter inference on the topic posterior distribution of the semantic enhancement topic model CCTM;
A collapsed Gibbs sampling method based on data augmentation is used to perform parameter inference on the topic posterior distribution of CCTM;
The parameter inference formula is:
The quantities in this formula are: the prior distribution of the remaining topics excluding topic k; N_m, the number of words in document m; and an indicator of whether the topic of word w_mn is k.
The conditional probability is as follows:
The prior part is a univariate normal distribution. Given the known quantities, including μ and Σ, the model parameters are obtained:
where Λ = Σ^{-1} is the precision matrix. The non-conjugacy problem is solved by a data augmentation sampling method that introduces an auxiliary Polya-Gamma latent variable, giving the marginal distribution of the following complete probability distribution:
where the distribution used is the Polya-Gamma distribution.
Step 5: divide the text corpus data set into several time slices, arrange them in chronological order, and use the online semantic enhancement topic model CCTM to construct topics from the text corpus data set and generate a topic evolution graph;
The online CCTM uses KL divergence to assess the topic evolution relationship between each pair of topics in two adjacent time slices;
For topic z_i in time slice t_n and topic z_j in time slice t_{n+1}, their topic similarity is as follows:
Based on the KL-based topic similarity, the topic evolution relationships between adjacent time slices are established. If topic_sim(z_i, z_j) is less than a specific threshold ω, topic z_j is a successor of topic z_i; otherwise z_j is an emerging topic and z_i is a fading topic.
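The linking rule between adjacent time slices can be sketched as follows. Since the similarity formula is not reproduced in this text, the sketch assumes topic_sim(z_i, z_j) is the KL divergence KL(z_i || z_j) between the two topics' word distributions over a shared vocabulary; the direction of the divergence and the small smoothing constant are assumptions. It also interprets "emerging" and "fading" as topics left without any link, which is one possible reading of the rule. The function names are illustrative.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def link_topics(prev_topics, next_topics, omega):
    """Link topics of slice t_n to topics of slice t_{n+1}.

    Returns (edges, emerging, fading):
      edges    : list of (i, j) where next topic j succeeds previous topic i
      emerging : indices of next-slice topics with no predecessor
      fading   : indices of previous-slice topics with no successor
    """
    edges = []
    for i, p in enumerate(prev_topics):
        for j, q in enumerate(next_topics):
            # A divergence below the threshold marks z_j as a successor of z_i.
            if kl_divergence(p, q) < omega:
                edges.append((i, j))
    linked_prev = {i for i, _ in edges}
    linked_next = {j for _, j in edges}
    emerging = [j for j in range(len(next_topics)) if j not in linked_next]
    fading = [i for i in range(len(prev_topics)) if i not in linked_prev]
    return edges, emerging, fading
```

Applying `link_topics` to each consecutive pair of time slices yields the edges of a topic evolution graph, with unlinked topics marking where topics appear or die out.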
The effectiveness of the method is verified experimentally by comparing it with baseline topic models. The data set used in the experiments consists of 41,839 Chinese microblog posts extracted from Sina Weibo. Topic extraction quality is compared with the mainstream baseline topic models LDA, CTM, BTM, PTM and GPU-DMM by computing topic coherence for different numbers of topics and different numbers of topic words. The hyperparameters are α = 50/K and β = 0.01; the number of Gibbs sampling iterations is 1000; semantic enhancement is applied when the cosine similarity of two word vectors is less than 0.3. As shown in Fig. 2, the invention outperforms the five baseline topic models. This is because CCTM uses the semantic enhancement mechanism to encourage semantically related words to belong to the same topic and uses context-related words to remove noise words, which makes the topics more semantically coherent. In addition, the topic evolution graph automatically generated by the online topic evolution model is shown in Fig. 3. For the Malaysia Airlines Flight 370 event, the core topic on Weibo evolved from a focus on the aircraft's disappearance on March 8, 2014 to prayers for the passengers on March 9. The associated sub-topics, such as "Vietnamese military finds traces of an oil spill" and "passengers on the missing aircraft held false passports", help readers quickly grasp the topic evolution path of the whole event.
It should be understood that the parts not elaborated in this specification belong to the prior art.
It should be understood that, although the above description of the preferred embodiments is relatively detailed, it should not be regarded as limiting the scope of patent protection of the invention. Under the teaching of the invention, those of ordinary skill in the art may make substitutions or variations without departing from the scope protected by the claims, and all of these fall within the protection scope of the invention. The claimed scope of the invention is determined by the appended claims.
Claims (7)
1. A semantic enhancement topic model, characterized in that:
First, a conditional random field layer is added to the latent topic layer of the correlated topic model, and the topics z of semantically related word pairs are connected by undirected edges, including five undirected edges: (z_m1, z_m2), (z_m1, z_m4), (z_m1, z_m5), (z_m2, z_m6) and (z_m3, z_m6); second, the context-related words of each word w are stored in x, and when the cosine similarity between a semantically related word pair and the context-related words exceeds a certain threshold, the edge between their topics is treated as an invalid edge, which eliminates noise produced during topic inference; finally, during topic inference a semantic reward function makes semantically related words belong to the same topic with high probability.
2. A topic evolution analysis method based on a semantic enhancement topic model, characterized by comprising the following steps:
Step 1: preprocess the collected text corpus data set;
Step 2: identify the semantically related word pairs in the text;
Step 3: extract topics and relations from the text based on the semantic enhancement topic model;
A conditional random field layer is added to the latent topic layer of the correlated topic model, and the topics of semantically related word pairs are connected by undirected edges, so that during topic inference semantically related words belong to the same topic with high probability, and noise produced during topic inference is eliminated using context-related words;
Step 4: perform parameter inference on the topic posterior distribution of the semantic enhancement topic model;
Step 5: divide the text corpus data set into several time slices, arrange them in chronological order, and use the online semantic enhancement topic model to construct topics from the text corpus data set and generate a topic evolution graph.
3. The topic evolution analysis method based on a semantic enhancement topic model according to claim 2, characterized in that the preprocessing in step 1 includes filtering out non-Chinese characters and stop words, and filtering out words that occur fewer than a set number of times.
4. The topic evolution analysis method based on a semantic enhancement topic model according to claim 2, characterized in that in step 2 it is first determined whether the cosine similarity between the word vectors of the two words in a pair is less than a set threshold; if it is, the pair is identified as a semantically related word pair and semantic enhancement is applied to its topic modeling process; otherwise, no semantic enhancement is applied.
5. The topic evolution analysis method based on a semantic enhancement topic model according to claim 2, characterized in that in step 3, if a semantic enhancement relationship exists between two words, their topic labels are connected by an undirected edge in the semantic enhancement topic model;
The joint probability that a topic label belongs to k is then as follows:
The quantities in this expression are: the prior distribution of topic k in document m; the prior distribution of word w_mn; V, the size of the vocabulary; z_mn, the topic probability distribution of the n-th word in document m; z_-mn, the topic probability distribution after removing word w_mn; w_-mn, the remaining words after removing word w_mn; and x_mn, the probability distribution of the context-related words of the n-th word in document m. If the cosine similarity between a context-related word x_a of word w_a and word w_b exceeds a set threshold, the semantic enhancement relationship between w_a and w_b is released, which eliminates noise produced during topic inference. The expression also uses the number of remaining words, excluding w_mn, assigned to topic k, and the number of times word j, excluding w_mn, is assigned to topic k. ψ(·) denotes the semantic reward function, as follows:
Here λ is a balancing hyperparameter; if λ is 0, the semantic enhancement topic model is equivalent to the correlated topic model. A is the probability normalization factor. E denotes the semantic enhancement connection graph. f(z_mi, z_mj) is a counting function that indicates how many semantic enhancement words of w_mi belong to the same topic z_mi, which ultimately strengthens the probability that word w_mi belongs to topic z_mi.
6. The topic evolution analysis method based on a semantic enhancement topic model according to claim 5, characterized in that in step 4 a collapsed Gibbs sampling method based on data augmentation is used to perform parameter inference on the topic posterior distribution of the semantic enhancement topic model;
The parameter inference formula is:
The quantities in this formula are: the prior distribution of the remaining topics excluding topic k; N_m, the number of words in document m; and an indicator of whether the topic of word w_mn is k.
The conditional probability is as follows:
The prior part is a univariate normal distribution. Given the known quantities, including μ and Σ, the model parameters are obtained:
where Λ = Σ^{-1} is the precision matrix. The non-conjugacy problem is solved by a data augmentation sampling method that introduces an auxiliary Polya-Gamma latent variable, giving the marginal distribution of the following complete probability distribution:
where the distribution used is the Polya-Gamma distribution.
7. The topic evolution analysis method based on a semantic enhancement topic model according to claim 2, characterized in that in step 5 the online semantic enhancement topic model uses KL divergence to assess the topic evolution relationship between each pair of topics in two adjacent time slices;
For topic z_i in time slice t_n and topic z_j in time slice t_{n+1}, their topic similarity is as follows:
Based on the KL-based topic similarity, the topic evolution relationships between adjacent time slices are established. If topic_sim(z_i, z_j) is less than a specific threshold ω, topic z_j is a successor of topic z_i; otherwise z_j is an emerging topic and z_i is a fading topic.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910020033.5A CN109840324B (en) | 2019-01-09 | 2019-01-09 | Semantic enhancement topic model construction method and topic evolution analysis method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109840324A true CN109840324A (en) | 2019-06-04 |
CN109840324B CN109840324B (en) | 2023-03-24 |
Family
ID=66883725
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910020033.5A Active CN109840324B (en) | 2019-01-09 | 2019-01-09 | Semantic enhancement topic model construction method and topic evolution analysis method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109840324B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2004235636A1 (en) * | 2004-12-03 | 2006-06-22 | Panscient Inc | A Machine Learning System For Extracting Structured Records From Web Pages And Other Text Sources |
CN104268200A (en) * | 2013-09-22 | 2015-01-07 | 中科嘉速(北京)并行软件有限公司 | Unsupervised named entity semantic disambiguation method based on deep learning |
CN109086375A (en) * | 2018-07-24 | 2018-12-25 | 武汉大学 | A kind of short text subject extraction method based on term vector enhancing |
Non-Patent Citations (3)
Title |
---|
CHEN Y等: "Modeling emerging, evolving and fading topics using dynamic soft orthogonal nmf with sparse representation", 《IEEE INTERNATIONAL CONFERENCE ON DATA MINING(ICDM)》 * |
崔凯等: "一种基于LDA的在线主题演化挖掘模型", 《计算机科学》 * |
彭敏等: "基于双向LSTM语义强化的主题建模", 《中文信息学报》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110781281A (en) * | 2019-10-24 | 2020-02-11 | 北京工业大学 | Emerging theme detection method and device, computer equipment and storage medium |
CN111143511A (en) * | 2019-12-16 | 2020-05-12 | 北京工业大学 | Emerging technology prediction method, emerging technology prediction device, electronic equipment and medium |
CN111339289A (en) * | 2020-03-06 | 2020-06-26 | 西安工程大学 | Topic model inference method based on commodity comments |
CN111339289B (en) * | 2020-03-06 | 2022-10-28 | 西安工程大学 | Topic model inference method based on commodity comments |
CN111782784A (en) * | 2020-06-24 | 2020-10-16 | 京东数字科技控股有限公司 | File generation method and device, electronic equipment and storage medium |
CN111782784B (en) * | 2020-06-24 | 2023-09-29 | 京东科技控股股份有限公司 | Document generation method and device, electronic equipment and storage medium |
CN114580431A (en) * | 2022-02-28 | 2022-06-03 | 山西大学 | Dynamic theme quality evaluation method based on optimal transportation |
Also Published As
Publication number | Publication date |
---|---|
CN109840324B (en) | 2023-03-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||