CN109840324A - Semantic enhancement topic model and topic evolution analysis method - Google Patents

Info

Publication number
CN109840324A
Authority
CN
China
Prior art keywords
theme
semantic
subject
word
topic model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910020033.5A
Other languages
Chinese (zh)
Other versions
CN109840324B (en)
Inventor
高望
胡刚
韩玮光
谢倩倩
李冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN201910020033.5A
Publication of CN109840324A
Application granted
Publication of CN109840324B
Active
Anticipated expiration

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a semantic enhancement topic model and a topic evolution analysis method. The semantic enhancement topic model incorporates a conditional random field into the correlated topic model, uses the semantic enhancement property of word vectors to raise the probability that similar words belong to the same topic, and uses context-related words to remove noise, thereby effectively enhancing the semantic coherence of topic extraction. In addition, the invention proposes an algorithm for constructing the evolution relationships between different topics, which dynamically constructs topics from a text data stream and generates a topic evolution graph, enabling evolution analysis between core topics and sub-topics. Experimental results on a Sina Weibo data set show that the topic extraction method of the invention outperforms five baseline topic models on the topic coherence metric and can automatically generate topic evolution graphs, thereby effectively mining the topic evolution patterns in text.

Description

Semantic enhancement topic model and topic evolution analysis method
Technical field
The invention belongs to the technical field of topic evolution analysis and relates to a semantic enhancement topic model and a topic evolution analysis method. In particular, it relates to a semantic enhancement topic model that combines the advantages of word-vector semantic enhancement with those of topic association extraction based on the correlated topic model, and to a topic evolution analysis method based on this model.
Background art
With the rapid development of the mobile Internet, applications such as WeChat, microblogs, e-mail, forums, live-streaming platforms and review websites have become increasingly widespread. Much of the information these platforms produce is generated in real time as data streams and updated dynamically. Fast-changing data streams create an urgent need for an effective monitoring tool that can analyze, in real time, the large volume of information generated in text data streams. Search engines give people an effective way to quickly retrieve useful information from large amounts of archived data. However, the results a search engine returns are usually fragmented, unstructured information that cannot reflect how a whole topic evolves over time. With the emergence of probabilistic topic models, a large body of research on topic extraction and evolution has appeared, offering a good way to address hot topic extraction and the evolution of topics over time.
Topic evolution analysis refers to analyzing, once a high-quality topic set has been obtained, the evolution trend of topic content and its survival state over time. It is significant for emergency early warning, public opinion guidance, product marketing, information recommendation and similar applications. In recent years, some methods have used word vectors to improve the quality of topic extraction by topic models. These methods use the rich semantic relationships in word vectors to strengthen the semantic associations of text, which alleviates the sparsity problem and improves topic extraction performance. Although such methods improve model capability to some extent, they ignore the fact that the generation mechanism of word vectors assigns each word only a single vector. For polysemous words, this mechanism adds noise during topic inference and degrades topic extraction. This is one of the key problems the invention sets out to solve.
In addition, for a hot event, a rich variety of news, comments and opinions from different readers is usually collected within a short time. Faced with a large text data stream, however, readers cannot easily understand the event by reading all the relevant short texts. Many topic evolution analysis methods exist, but they analyze topics in terms of topic intensity and cannot analyze how a topic's internal nodes change when the topic changes. Therefore, extracting the associations between core topics and sub-topics during topic evolution, so as to generate an easy-to-understand topic evolution graph, is the other key problem the invention sets out to solve.
Summary of the invention
To solve the above technical problems, the invention provides a semantic enhancement topic model that combines the advantages of word-vector semantic enhancement with those of topic association extraction based on the correlated topic model, and a topic evolution analysis method based on this model.
The semantic enhancement topic model provided by the invention is characterized in that:
First, a conditional random field layer is added to the latent topic layer of the correlated topic model, and the topics z of semantically related word pairs are connected by undirected edges, including five undirected edges ((z_{m1}, z_{m2}), (z_{m1}, z_{m4}), (z_{m1}, z_{m5}), (z_{m2}, z_{m6}), (z_{m3}, z_{m6})). Second, the context-related words of each word w are stored in x; when the cosine similarity between a semantically related word pair and the context-related words exceeds a certain threshold, the edge between their topics is treated as an invalid edge, which eliminates noise generated during topic inference. Finally, during topic inference, a semantic reward function makes semantically related words belong to the same topic with high probability.
The topic evolution analysis method based on the semantic enhancement topic model provided by the invention is characterized in that it comprises the following steps:
Step 1: preprocess the collected text corpus data set;
Step 2: identify the semantically related word pairs in the text;
Step 3: extract topics and relations from the text based on the semantic enhancement topic model;
First, determine whether the cosine similarity between the word vectors of the two words in a word pair is less than a set threshold. If it is, the word pair is identified as a semantically related word pair and semantic enhancement is applied to its topic modeling process; otherwise, no semantic enhancement is performed;
Step 4: perform parameter inference on the topic posterior distribution of the semantic enhancement topic model;
A conditional random field layer is added to the latent topic layer of the correlated topic model, and the topics of semantically related word pairs are connected by undirected edges, so that during topic inference semantically related words belong to the same topic with high probability, and the noise generated during topic inference is eliminated using context-related words;
Step 5: divide the text corpus data set into several time slices, arrange them in chronological order, and use the online semantic enhancement topic model to construct topics from the text corpus data set and generate a topic evolution graph.
The invention has the following advantages:
1. The invention designs a new topic model that uses a conditional random field to incorporate external semantic enhancement information at the topic layer, achieving high-quality topic discovery and topic relation extraction;
2. The invention designs an online topic evolution model that can effectively identify the associations between topics in a text stream and thereby automatically generate a topic evolution graph.
Brief description of the drawings
Fig. 1 is a schematic diagram of the semantic enhancement topic model according to an embodiment of the invention;
Fig. 2a is a schematic comparison of topic coherence between the invention and the baseline methods (number of topics is 5);
Fig. 2b is a schematic comparison of topic coherence between the invention and the baseline methods (number of topics is 10);
Fig. 3 is the topic evolution distribution graph automatically generated by the online topic evolution model in an embodiment of the invention.
Detailed description of the embodiments
To help those of ordinary skill in the art understand and implement the invention, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here serve only to illustrate and explain the invention and are not intended to limit it.
On the basis of the correlated topic model, the invention first proposes a new semantic enhancement topic model, CCTM (Conditional random field regularized Correlated Topic Model). CCTM incorporates a conditional random field into the correlated topic model and uses the semantic enhancement property of word vectors to raise the probability that similar words belong to the same topic, thereby effectively enhancing the semantic coherence of topic extraction and removing noise. Second, the invention proposes an online CCTM that dynamically constructs topics from a text data stream and generates a topic evolution graph, enabling evolution analysis between core topics and sub-topics.
Referring to Fig. 1, in the semantic enhancement topic model CCTM provided by the invention, a conditional random field layer is first added to the latent topic layer of the correlated topic model, and the topics z of semantically related word pairs are connected by undirected edges. As shown in Fig. 1, there are five such undirected edges ((z_{m1}, z_{m2}), (z_{m1}, z_{m4}), (z_{m1}, z_{m5}), (z_{m2}, z_{m6}), (z_{m3}, z_{m6})). Second, the context-related words of each word w are stored in x. When the cosine similarity between a semantically related word pair and the context-related words exceeds a certain threshold, the edge between their topics is treated as an invalid edge, which eliminates noise generated during topic inference. Finally, during topic inference, a semantic reward function makes semantically related words belong to the same topic with high probability.
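The edge construction described above can be illustrated with a minimal sketch. It assumes that the semantically related word pairs of one document are already known as pairs of word positions, and it reads the invalid-edge rule literally: an edge (a, b) is dropped when any context-related word of word a has a cosine similarity to word b above the threshold. The function names, the data layout and this directional reading are illustrative assumptions, not details given in the specification.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def valid_edges(tokens, pairs, context, vectors, threshold):
    """Keep the undirected topic edges that are not invalidated by context words.

    tokens   : list of words of one document
    pairs    : (a, b) word positions already identified as semantically related
    context  : hypothetical mapping position -> list of context-related words (x)
    vectors  : mapping word -> word vector
    threshold: similarity above which an edge is treated as invalid
    """
    edges = []
    for a, b in pairs:
        wb = vectors.get(tokens[b])
        noisy = wb is not None and any(
            x in vectors and cosine_similarity(vectors[x], wb) > threshold
            for x in context.get(a, [])
        )
        if not noisy:
            edges.append((a, b))
    return edges
```

The returned list plays the role of the edge set of the conditional random field layer for one document; how context-related words are collected is left open here, as the text does not specify it.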
The invention also provides a topic evolution analysis method based on the semantic enhancement topic model, comprising the following steps:
Step 1: preprocess the collected text corpus data set;
Preprocessing includes filtering out non-Chinese characters and stop words, and filtering out words whose number of occurrences is less than a set number.
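The preprocessing step can be sketched as follows. The sketch assumes the texts have already been segmented into words (the specification does not name a segmenter), approximates "Chinese characters" by the CJK Unified Ideographs range U+4E00 to U+9FFF, and uses a hypothetical `min_count` parameter for the set number of occurrences.

```python
import re
from collections import Counter

# Runs of characters in the basic CJK Unified Ideographs block (an approximation).
CHINESE_RUN = re.compile(r"[\u4e00-\u9fff]+")


def preprocess(docs, stop_words, min_count):
    """Filter segmented documents as described in step 1.

    docs      : list of documents, each a list of already segmented words
    stop_words: set of stop words to remove
    min_count : words occurring fewer than this many times overall are removed
    """
    cleaned = []
    for tokens in docs:
        kept = []
        for tok in tokens:
            # Strip every non-Chinese character from the token.
            tok = "".join(CHINESE_RUN.findall(tok))
            if tok and tok not in stop_words:
                kept.append(tok)
        cleaned.append(kept)

    # Count occurrences over the whole corpus, then drop low-frequency words.
    counts = Counter(tok for doc in cleaned for tok in doc)
    return [[tok for tok in doc if counts[tok] >= min_count] for doc in cleaned]
```

Counting frequencies after the character and stop-word filters is a design choice of this sketch; the text does not fix the order of the filters.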
Step 2: identify the semantically related word pairs in the text, which lays the foundation for the semantic enhancement in step 3. Specifically, first determine whether the cosine similarity between the word vectors of the two words in a word pair is less than a set threshold. If it is, the word pair is identified as a semantically related word pair and semantic enhancement is applied to its topic modeling process; otherwise, no semantic enhancement is performed;
For each word pair (w_a, w_b) in a document, if the condition d(w_a, w_b) < ξ is satisfied, where d(w_a, w_b) denotes the cosine similarity of the two word vectors of the pair and ξ is a preset threshold, the word pair is identified as a semantically related word pair, and a semantic enhancement relationship exists between word w_a and word w_b.
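A minimal sketch of the pair identification follows. The text calls d(w_a, w_b) a cosine similarity while requiring d < ξ; the sketch therefore exposes the measure as a parameter `d` and, as an assumption, defaults to cosine distance (1 minus cosine similarity) so that a smaller value means closer words. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cosine_distance(u, v):
    """Assumed reading of d: one minus cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


def related_pairs(tokens, vectors, xi, d=cosine_distance):
    """Return position pairs (i, j), i < j, whose words satisfy d(w_a, w_b) < xi."""
    pairs = []
    for i, j in combinations(range(len(tokens)), 2):
        wa, wb = tokens[i], tokens[j]
        # Skip identical words and words without a vector.
        if wa == wb or wa not in vectors or wb not in vectors:
            continue
        if d(vectors[wa], vectors[wb]) < xi:
            pairs.append((i, j))
    return pairs
```

Passing `d=cosine_similarity` instead applies the condition exactly as worded in the text.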
Step 3: extract topics and relations from the text based on the semantic enhancement topic model CCTM;
A conditional random field layer is added to the latent topic layer of the correlated topic model, and the topics of semantically related word pairs are connected by undirected edges, so that during topic inference semantically related words belong to the same topic with high probability, and the noise generated during topic inference is eliminated using context-related words;
If a semantic enhancement relationship exists between two words, their topic labels are connected by an undirected edge in the semantic enhancement topic model CCTM;
In this case, the joint probability of the topic labels is as follows:
The quantities in this formula are: the prior distribution of topic k in the m-th text; the prior distribution of word w_{mn}; the vocabulary size V; z_{mn}, the topic probability distribution of the n-th word in the m-th text; z_{-mn}, the topic probability distribution with word w_{mn} excluded; w_{-mn}, the remaining words with w_{mn} excluded; and x_{mn}, the context-related word probability distribution of the n-th word in the m-th text. If the cosine similarity between a context-related word x_a of word w_a and word w_b exceeds the set threshold, the semantic enhancement relationship between w_a and w_b is released, which eliminates noise generated during topic inference. The two count terms give, with word w_{mn} excluded, the number of remaining words assigned to topic k and the number of times word j is assigned to topic k. ψ(·) denotes the semantic reward function, as follows:
Here λ is a balancing hyperparameter; if λ is 0, the semantic enhancement topic model CCTM coincides with the correlated topic model. A is the probability normalization factor; E denotes the semantic enhancement connection graph; f(z_{mi}, z_{mj}) is a counting function giving how many semantically enhanced words of w_{mi} belong to the same topic z_{mi}, which ultimately raises the probability that word w_{mi} belongs to topic z_{mi}.
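To make the role of λ, A, E and the counting function concrete, the following sketch computes a reward for assigning one word position to a candidate topic. The reward formula itself is not reproduced in the text, so the exponential form exp(λ/A · count) used here is purely an assumption, chosen because it equals 1 when λ is 0, which agrees with the statement that the model then coincides with the correlated topic model. The function name and argument layout are hypothetical.

```python
from math import exp


def semantic_reward(i, k, topics, edges, lam, A):
    """Assumed reward for assigning word position i to topic k.

    topics: current topic label of every word position in the document
    edges : undirected edges (a, b) of the semantic enhancement graph E
    lam   : balancing hyperparameter (lambda in the text)
    A     : normalization factor
    """
    # Counting function: neighbours of i in E whose current topic is k.
    count = sum(
        1
        for a, b in edges
        if (a == i and topics[b] == k) or (b == i and topics[a] == k)
    )
    return exp(lam / A * count)
```

In a sampler, such a factor would multiply the unnormalized probability of each candidate topic, so that topics shared by more semantically related neighbours become more likely.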
Step 4: perform parameter inference on the topic posterior distribution of the semantic enhancement topic model CCTM;
A collapsed Gibbs sampling method based on data augmentation is used to perform parameter inference on the topic posterior distribution of the semantic enhancement topic model CCTM;
The parameter inference formula is:
The formula involves the prior distribution of the remaining topics with topic k excluded; N_m denotes the number of words in the m-th text; and an indicator term marks whether or not the topic of word w_{mn} is k.
The conditional probability is as follows:
The prior part is a univariate normal distribution. Given the known quantities together with μ and Σ, the model parameters are obtained:
where Λ = Σ^{-1} is the precision matrix. A data augmentation sampling method that introduces an auxiliary Polya-Gamma latent variable resolves the non-conjugacy problem and yields the following marginal distribution of the complete probability distribution:
where the distribution involved is the Polya-Gamma distribution.
Step 5: divide the text corpus data set into several time slices, arrange them in chronological order, and use the online semantic enhancement topic model CCTM to construct topics from the text corpus data set and generate a topic evolution graph;
The online semantic enhancement topic model CCTM uses KL divergence to assess the topic evolution relationship between each pair of topics in two adjacent time slices;
For topic z_i in time slice t_n and topic z_j in time slice t_{n+1}, their topic similarity is as follows:
The topic evolution relationships between adjacent time slices are established from the KL-based topic similarity. If topic_sim(z_i, z_j) is less than a specific threshold ω, topic z_j is a successor of topic z_i; otherwise z_j is an emerging topic and z_i is a fading topic.
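The linking rule between adjacent time slices can be sketched as below. The topic similarity formula is not reproduced in the text, so the sketch assumes topic_sim(z_i, z_j) is the plain KL divergence from the word distribution of z_i to that of z_j; a symmetrized variant would be an equally plausible reading. It also aggregates the pairwise rule: a topic in the later slice with no predecessor is reported as emerging, and a topic in the earlier slice with no successor as fading. Function names and the smoothing constant `eps` are hypothetical.

```python
from math import log


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def topic_sim(p, q):
    """Assumed topic similarity: smaller values mean more similar topics."""
    return kl_divergence(p, q)


def link_topics(prev_topics, next_topics, omega):
    """Link topics of time slice t_n to topics of time slice t_{n+1}.

    prev_topics, next_topics: lists of topic-word probability vectors
    omega: threshold below which z_j is treated as a successor of z_i
    Returns (successor edges, emerging topic indices, fading topic indices).
    """
    successors = [
        (i, j)
        for i, p in enumerate(prev_topics)
        for j, q in enumerate(next_topics)
        if topic_sim(p, q) < omega
    ]
    linked_prev = {i for i, _ in successors}
    linked_next = {j for _, j in successors}
    emerging = [j for j in range(len(next_topics)) if j not in linked_next]
    fading = [i for i in range(len(prev_topics)) if i not in linked_prev]
    return successors, emerging, fading
```

Applying `link_topics` to every pair of consecutive slices gives the edges of a topic evolution graph, with emerging and fading topics marking where paths start and end.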
The effectiveness of the proposed method can be verified experimentally by comparing it with baseline topic models. The data set used in the experiments consists of 41,839 Chinese microblog posts extracted from Sina Weibo. Topic extraction quality is compared against the mainstream baseline topic models LDA, CTM, BTM, PTM and GPU-DMM by computing topic coherence under different numbers of topics and different numbers of topic words. The hyperparameters are α = 50/K and β = 0.01; the number of Gibbs sampling iterations is 1000; semantic enhancement is applied when the cosine similarity of the word vectors of two words is less than 0.3. The experimental results are shown in Fig. 2: the invention outperforms all five baseline topic models. This is because CCTM uses the semantic enhancement mechanism to encourage semantically related words to belong to the same topic and uses context-related words to eliminate noise words, so that the semantic relatedness within topics is stronger. In addition, the topic evolution distribution graph automatically generated by the online topic evolution model is shown in Fig. 3. In the Malaysia Airlines MH370 event, the core topic on Weibo evolved from the disappearance of the aircraft on March 8, 2014 to prayers for the passengers on March 9. Together with the associated sub-topics, such as "Vietnamese military finds traces of an oil spill" and "passengers on the missing aircraft held fake passports", this helps readers quickly understand the topic evolution path of the whole event.
It should be understood that the parts not elaborated in this specification belong to the prior art.
It should be understood that the above description of the preferred embodiment is relatively detailed and should not therefore be regarded as limiting the scope of patent protection of the invention. Under the teaching of the invention, those of ordinary skill in the art may make substitutions or modifications without departing from the scope protected by the claims of the invention, and all of these fall within the protection scope of the invention. The scope of protection claimed by the invention shall be determined by the appended claims.

Claims (7)

1. A semantic enhancement topic model, characterized in that:
First, a conditional random field layer is added to the latent topic layer of the correlated topic model, and the topics z of semantically related word pairs are connected by undirected edges, including five undirected edges ((z_{m1}, z_{m2}), (z_{m1}, z_{m4}), (z_{m1}, z_{m5}), (z_{m2}, z_{m6}), (z_{m3}, z_{m6})). Second, the context-related words of each word w are stored in x; when the cosine similarity between a semantically related word pair and the context-related words exceeds a certain threshold, the edge between their topics is treated as an invalid edge, which eliminates noise generated during topic inference. Finally, during topic inference, a semantic reward function makes semantically related words belong to the same topic with high probability.
2. A topic evolution analysis method based on a semantic enhancement topic model, characterized by comprising the following steps:
Step 1: preprocess the collected text corpus data set;
Step 2: identify the semantically related word pairs in the text;
Step 3: extract topics and relations from the text based on the semantic enhancement topic model;
A conditional random field layer is added to the latent topic layer of the correlated topic model, and the topics of semantically related word pairs are connected by undirected edges, so that during topic inference semantically related words belong to the same topic with high probability, and the noise generated during topic inference is eliminated using context-related words;
Step 4: perform parameter inference on the topic posterior distribution of the semantic enhancement topic model;
Step 5: divide the text corpus data set into several time slices, arrange them in chronological order, and use the online semantic enhancement topic model to construct topics from the text corpus data set and generate a topic evolution graph.
3. The topic evolution analysis method based on a semantic enhancement topic model according to claim 2, characterized in that the preprocessing in step 1 includes filtering out non-Chinese characters and stop words, and filtering out words whose number of occurrences is less than a set number.
4. The topic evolution analysis method based on a semantic enhancement topic model according to claim 2, characterized in that in step 2, it is first determined whether the cosine similarity between the word vectors of the two words in a word pair is less than a set threshold; if it is, the word pair is identified as a semantically related word pair and semantic enhancement is applied to its topic modeling process; otherwise, no semantic enhancement is performed.
5. The topic evolution analysis method based on a semantic enhancement topic model according to claim 2, characterized in that in step 3, if a semantic enhancement relationship exists between two words, their topic labels are connected by an undirected edge in the semantic enhancement topic model;
In this case, the joint probability that the topic label belongs to k is as follows:
The quantities in this formula are: the prior distribution of topic k in the m-th text; the prior distribution of word w_{mn}; the vocabulary size V; z_{mn}, the topic probability distribution of the n-th word in the m-th text; z_{-mn}, the topic probability distribution with word w_{mn} excluded; w_{-mn}, the remaining words with w_{mn} excluded; and x_{mn}, the context-related word probability distribution of the n-th word in the m-th text. If the cosine similarity between a context-related word x_a of word w_a and word w_b exceeds the set threshold, the semantic enhancement relationship between w_a and w_b is released, which eliminates noise generated during topic inference. The two count terms give, with word w_{mn} excluded, the number of remaining words assigned to topic k and the number of times word j is assigned to topic k. ψ(·) denotes the semantic reward function, as follows:
Here λ is a balancing hyperparameter; if λ is 0, the semantic enhancement topic model coincides with the correlated topic model. A is the probability normalization factor; E denotes the semantic enhancement connection graph; f(z_{mi}, z_{mj}) is a counting function giving how many semantically enhanced words of w_{mi} belong to the same topic z_{mi}, which ultimately raises the probability that word w_{mi} belongs to topic z_{mi}.
6. The topic evolution analysis method based on a semantic enhancement topic model according to claim 5, characterized in that in step 4, a collapsed Gibbs sampling method based on data augmentation is used to perform parameter inference on the topic posterior distribution of the semantic enhancement topic model;
The parameter inference formula is:
The formula involves the prior distribution of the remaining topics with topic k excluded; N_m denotes the number of words in the m-th text; and an indicator term marks whether or not the topic of word w_{mn} is k.
The conditional probability is as follows:
The prior part is a univariate normal distribution. Given the known quantities together with μ and Σ, the model parameters are obtained:
where Λ = Σ^{-1} is the precision matrix. A data augmentation sampling method that introduces an auxiliary Polya-Gamma latent variable resolves the non-conjugacy problem and yields the following marginal distribution of the complete probability distribution:
where the distribution involved is the Polya-Gamma distribution.
7. The topic evolution analysis method based on a semantic enhancement topic model according to claim 2, characterized in that in step 5, the online semantic enhancement topic model uses KL divergence to assess the topic evolution relationship between each pair of topics in two adjacent time slices;
For topic z_i in time slice t_n and topic z_j in time slice t_{n+1}, their topic similarity is as follows:
The topic evolution relationships between adjacent time slices are established from the KL-based topic similarity. If topic_sim(z_i, z_j) is less than a specific threshold ω, topic z_j is a successor of topic z_i; otherwise z_j is an emerging topic and z_i is a fading topic.
CN201910020033.5A 2019-01-09 2019-01-09 Semantic enhancement topic model construction method and topic evolution analysis method Active CN109840324B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910020033.5A CN109840324B (en) 2019-01-09 2019-01-09 Semantic enhancement topic model construction method and topic evolution analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910020033.5A CN109840324B (en) 2019-01-09 2019-01-09 Semantic enhancement topic model construction method and topic evolution analysis method

Publications (2)

Publication Number Publication Date
CN109840324A (en) 2019-06-04
CN109840324B (en) 2023-03-24

Family

ID=66883725

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910020033.5A Active CN109840324B (en) 2019-01-09 2019-01-09 Semantic enhancement topic model construction method and topic evolution analysis method

Country Status (1)

Country Link
CN (1) CN109840324B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110781281A (en) * 2019-10-24 2020-02-11 北京工业大学 Emerging theme detection method and device, computer equipment and storage medium
CN111143511A (en) * 2019-12-16 2020-05-12 北京工业大学 Emerging technology prediction method, emerging technology prediction device, electronic equipment and medium
CN111339289A (en) * 2020-03-06 2020-06-26 西安工程大学 Topic model inference method based on commodity comments
CN111782784A (en) * 2020-06-24 2020-10-16 京东数字科技控股有限公司 File generation method and device, electronic equipment and storage medium
CN114580431A (en) * 2022-02-28 2022-06-03 山西大学 Dynamic theme quality evaluation method based on optimal transportation

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2004235636A1 (en) * 2004-12-03 2006-06-22 Panscient Inc A Machine Learning System For Extracting Structured Records From Web Pages And Other Text Sources
CN104268200A (en) * 2013-09-22 2015-01-07 中科嘉速(北京)并行软件有限公司 Unsupervised named entity semantic disambiguation method based on deep learning
CN109086375A (en) * 2018-07-24 2018-12-25 武汉大学 A kind of short text subject extraction method based on term vector enhancing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHEN Y et al.: "Modeling emerging, evolving and fading topics using dynamic soft orthogonal nmf with sparse representation", 《IEEE INTERNATIONAL CONFERENCE ON DATA MINING(ICDM)》 *
崔凯 et al.: "一种基于LDA的在线主题演化挖掘模型", 《计算机科学》 *
彭敏 et al.: "基于双向LSTM语义强化的主题建模", 《中文信息学报》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110781281A (en) * 2019-10-24 2020-02-11 北京工业大学 Emerging theme detection method and device, computer equipment and storage medium
CN111143511A (en) * 2019-12-16 2020-05-12 北京工业大学 Emerging technology prediction method, emerging technology prediction device, electronic equipment and medium
CN111339289A (en) * 2020-03-06 2020-06-26 西安工程大学 Topic model inference method based on commodity comments
CN111339289B (en) * 2020-03-06 2022-10-28 西安工程大学 Topic model inference method based on commodity comments
CN111782784A (en) * 2020-06-24 2020-10-16 京东数字科技控股有限公司 File generation method and device, electronic equipment and storage medium
CN111782784B (en) * 2020-06-24 2023-09-29 京东科技控股股份有限公司 Document generation method and device, electronic equipment and storage medium
CN114580431A (en) * 2022-02-28 2022-06-03 山西大学 Dynamic theme quality evaluation method based on optimal transportation

Also Published As

Publication number Publication date
CN109840324B (en) 2023-03-24

Similar Documents

Publication Publication Date Title
Asghar et al. T‐SAF: Twitter sentiment analysis framework using a hybrid classification scheme
Gardent et al. Creating training corpora for nlg micro-planning
CN109840324A (en) Semantic enhancement topic model and topic evolution analysis method
EP2553605B1 (en) Text classifier system
Hardeniya et al. Dictionary based approach to sentiment analysis-a review
Zhan et al. Using deep learning for short text understanding
Zhang et al. Encoding conversation context for neural keyphrase extraction from microblog posts
CN104679825B (en) Macroscopic abnormity of earthquake acquisition of information based on network text and screening technique
US20150039296A1 (en) Predicate template collecting device, specific phrase pair collecting device and computer program therefor
Lo et al. An unsupervised multilingual approach for online social media topic identification
Lu et al. Sentiment analysis of film review texts based on sentiment dictionary and SVM
Altheneyan et al. Big data ML-based fake news detection using distributed learning
CN111538828A (en) Text emotion analysis method and device, computer device and readable storage medium
CN107077640B (en) System and process for analyzing, qualifying, and ingesting unstructured data sources via empirical attribution
CN111339772B (en) Russian text emotion analysis method, electronic device and storage medium
Azam et al. Twitter data mining for events classification and analysis
Zhang et al. A taxonomy, data set, and benchmark for detecting and classifying malevolent dialogue responses
EP2369504A1 (en) System
Altiti et al. Just at semeval-2020 task 11: Detecting propaganda techniques using bert pre-trained model
Isnan et al. Sentiment Analysis for TikTok Review Using VADER Sentiment and SVM Model
Vitman et al. Sarcasm detection framework using context, emotion and sentiment features
Alabdullatif et al. Classification of Arabic Twitter users: a study based on user behaviour and interests
Lee et al. Detecting suicidality with a contextual graph neural network
Kuang et al. Semantic and context-aware linguistic model for bias detection
CN110705290A (en) Webpage classification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant