CN104331506A - Multiclass emotion analyzing method and system facing bilingual microblog text - Google Patents
- Publication number: CN104331506A (CN201410670909.8A)
- Authority: CN (China)
- Prior art keywords: text, emotion, sentiment, bilingual, dictionary
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
Abstract
The invention relates to a multi-class sentiment analysis method and system for bilingual microblog text and belongs to the technical field of microblog text sentiment analysis. The method comprises the following steps: (1) bilingual dictionary construction: a corpus of a certain size with sentiment orientation is first collected and high-frequency words with sentiment orientation are extracted from it; the sentiment dictionary is then expanded using existing knowledge bases and a lexical similarity model; finally, internet slang and emoticons are added to the sentiment dictionary; (2) text preprocessing: the text to be identified is segmented into words, stop words are removed, and English word forms are normalized; (3) text feature space representation: the bilingual sentiment dictionary is used to vectorize the text; (4) emotion recognition of the corpus text is performed with a multi-class emotion model. The accuracy and F1 value of the method are higher than those of traditional classification methods; in particular, the semi-supervised Gaussian mixture model classification algorithm clearly outperforms the other methods on small training sets.
Description
Technical field
The present invention relates to a sentiment analysis method and system, in particular a multi-class sentiment analysis method and system for bilingual microblog text, and belongs to the technical field of microblog text sentiment analysis.
Background art
With the rise of social media platforms and the widespread use of mobile devices, people have become accustomed to conveying feelings and ideas in 140 characters. Posting microblogs has become an important way for individuals to express emotion, so sentiment orientation analysis of microblog text has practical significance. At present, Sina Weibo has become the main carrier of online public opinion in China, and a large number of users exchange information and express emotion through it. Developing an emotion classification system for users' microblog text, and thereby recognizing emotion, is a useful reference in fields such as public opinion monitoring and product evaluation.
Most existing sentiment analysis systems divide microblog text into two classes, positive and negative. Human emotion, however, is complex and varied: positive emotion includes moods such as trust, gratitude, and joy, while negative emotion includes pain, disdain, hatred, envy, and so on. A simple two-class split cannot guarantee accurate emotion discrimination, and a fine-grained emotion classification system that can capture public interest is still lacking. Current microblog sentiment analysis systems mainly perform statistical analysis on single-language text and on Chinese sentiment orientation. In recent years, however, with rising education levels in mainland China and the trend toward internationalization, mixing Chinese and English, or writing purely in English, has gradually become an important form of personal emotional expression. Such mixed Chinese-English microblog text poses a new challenge for microblog sentiment analysis, and classification systems built on single-language sentiment analysis no longer suit the increasingly complex language environment of microblogs.
In addition, most current work on identifying emotion vocabulary obtains it through machine translation. For microblog text, however, the 140-character short-text limit makes the vocabulary more complex, and English slang and popular internet phrases grow by the day, so the quality of machine translation cannot be guaranteed.
Summary of the invention
The object of the invention is to solve the problems of existing microblog sentiment analysis methods: coarse classification granularity, low-quality methods for identifying emotion vocabulary, and lagging analysis of mixed Chinese-English microblog text. To that end, the invention provides, in the field of microblog text sentiment, a method for constructing a bilingual Chinese-English sentiment dictionary from a microblog corpus, a multi-class microblog sentiment analysis method based on the bilingual dictionary, and a multi-class sentiment analysis system for bilingual microblog text, so that multi-class sentiment analysis can be performed on microblog text.
The idea of the technical solution is to collect a large microblog text corpus with sentiment orientation, build a Chinese-English sentiment dictionary, and build several emotion classifiers using semi-supervised and fully supervised models. After the bilingual text is preprocessed, it is represented in feature space according to the emotion classes of its words, and the classifiers are then used to recognize the emotion of the microblog text.
The specific implementation steps of the invention are as follows:
A method for constructing a bilingual Chinese-English sentiment dictionary, comprising the following steps:
Step 1: crawl microblog web pages, collect Chinese and English text with sentiment orientation from them, and extract high-frequency words with sentiment orientation from the corpus into the sentiment dictionary;
Step 2: expand the sentiment dictionary using existing knowledge bases;
Step 3: analyze the crawled microblog corpus and add emerging internet slang and emoticons to the sentiment dictionary.
Preferably, the sentiment orientations comprise five classes: social help, happiness, sadness, anger, and fear.
Preferably, the knowledge bases comprise WordNet, NTUSD, and HowNet.
Preferably, the expansion in step 2 computes, for each emotion word in each knowledge base, its average similarity to the words of each sentiment class in the sentiment dictionary, and adds the word to the class with the highest average similarity.
Preferably, the sentiment orientation of the emerging internet slang and emoticons is classified by a show-of-hands vote among several people.
A multi-class sentiment analysis method based on the bilingual dictionary, comprising the following steps:
Step 1: preprocess the corpus text;
Step 2: represent the corpus text in feature space according to the bilingual Chinese-English sentiment dictionary;
Step 3: classify the emotion of the corpus text with the established text emotion classifier model.
Preferably, the preprocessing comprises word segmentation and stop-word removal, and additionally morphological normalization for English text.
Preferably, the feature space representation expresses each text in the corpus as a five-dimensional vector, in which each element is the number of words in the text that belong to the corresponding emotion class of the bilingual Chinese-English sentiment dictionary.
Preferably, the emotion classifier model is the semi-supervised Gaussian mixture model classification algorithm (Semi-GMM) or the k-nearest-neighbor algorithm based on symmetric relative entropy (KNN-KL).
Preferably, the semi-supervised Gaussian mixture model classification algorithm learns a Gaussian mixture model from the labeled training corpus, then takes the resulting model parameters and the probability distribution of the labeled samples as the initial parameter values of the Gaussian mixture model and learns iteratively on the test corpus, until the algorithm converges or the unlabeled set is empty.
Preferably, the k-nearest-neighbor algorithm based on symmetric relative entropy uses relative entropy as the distance that measures emotional similarity between texts, and decides the class of the sample to be classified from the classes of its neighboring samples.
Preferably, the relative entropy is calculated with the following formula:
where $T_i$ is the normalized vector of a labeled text, $T_j$ is the normalized vector of an unlabeled text, $\omega_{ik}$ and $\omega_{jk}$ are the $k$-th components of $T_i$ and $T_j$ respectively, and $k$ is an integer from 1 to 5.
A multi-class sentiment analysis system for bilingual microblog text, comprising a bilingual Chinese-English sentiment dictionary, a corpus preprocessing module, a corpus text feature space representation module, and an emotion classifier recognition module. The bilingual Chinese-English sentiment dictionary is built with the dictionary construction method described above. The corpus preprocessing module performs word segmentation and stop-word removal on the corpus text to be analyzed, and additionally morphological normalization for English text. The corpus text feature space representation module vectorizes the text output by the preprocessing module, turning each text into a five-dimensional vector whose five elements are the numbers of words in the text that belong to the social help, happiness, sadness, anger, and fear classes of the bilingual Chinese-English sentiment dictionary. The emotion classifier recognition module applies the emotion classifier model to the corpus text vector and determines the emotion class of the corpus text.
Beneficial effect
For the field of microblog text sentiment analysis, the invention builds a fine-grained, five-class bilingual Chinese-English sentiment dictionary from Sina Weibo message text and existing knowledge bases, and constructs bilingual microblog multi-class emotion classifiers based on a semi-supervised Gaussian model and on the k-nearest-neighbor algorithm with symmetric relative entropy, in order to perform sentiment analysis on Chinese-English bilingual microblogs. Experimental results show that the accuracy and F1 value of the proposed classification method based on the bilingual sentiment dictionary are higher than those of traditional classification methods. In particular, the semi-supervised Gaussian mixture model classification algorithm clearly outperforms the other methods on small training sets.
Brief description of the drawings
Fig. 1 is a flow chart of the microblog text sentiment analysis method in an embodiment of the invention;
Fig. 2 is a schematic example of bilingual microblog text in an embodiment of the invention;
Fig. 3 is a schematic flow chart of the emotion classification algorithm of the semi-supervised Gaussian mixture model in an embodiment of the invention;
Fig. 4 compares the accuracy of several machine learning text emotion classification algorithms in an embodiment of the invention;
Fig. 5 compares the accuracy of several emotion classification algorithms on bilingual microblog text in an embodiment of the invention;
Fig. 6 is a structure diagram of the multi-class sentiment analysis system for bilingual microblog text in an embodiment of the invention.
Embodiment
Fig. 1 is the flow chart of a multi-class sentiment analysis method for bilingual microblog text according to an embodiment of the invention. The main workflow of text emotion recognition is as follows:
(1) Bilingual sentiment dictionary construction: first collect a corpus of a certain size with sentiment orientation and extract high-frequency words with sentiment orientation from it; then expand the sentiment dictionary using existing knowledge bases (WordNet, NTUSD, and HowNet) and a lexical similarity model; finally, add emerging internet slang and emoticons to the sentiment dictionary;
(2) Text preprocessing: segment the text to be identified into words and remove stop words. Stop words are function words that carry no substantive meaning, such as the English determiners "the", "a", "an", and "that". English text additionally undergoes lemmatization and stemming;
(3) Text feature space representation: use the constructed bilingual sentiment dictionary to extract features from the words of the text, and express the text as a five-dimensional vector according to the emotion classes of its words;
(4) Use the multi-class emotion classification models to recognize the emotion of the corpus text.
1. Bilingual sentiment dictionary construction:
Before describing how the sentiment dictionary is built, we first introduce the classification of sentiment orientation.
Classification of text sentiment orientation:
Human emotion is complex and varied, and venting emotion on microblogs has become part of people's daily life. Simply dividing emotion into positive and negative cannot guarantee accurate emotion discrimination. In this embodiment, to better cover the range of emotions in microblog text and make the classification scheme more fine-grained, we divide emotion into five classes: social help, happiness, sadness, anger, and fear.
Sentiment dictionary construction:
The construction of the sentiment dictionary depends on a microblog text collection. When text is manually labeled for sentiment orientation, a few words with sentiment orientation often play the decisive role. Sina Weibo has a very large microblog text collection, and users can instantly publish texts of up to 140 characters, so building a sentiment dictionary suited to microblog emotion classification is the basis of this invention. Current microblog research is mostly based on single-language corpora, yet mixing Chinese and English has become a popular trend in personal expression. Building only a Chinese sentiment dictionary therefore cannot meet the emotion classification needs of mixed Chinese-English microblog text. To further illustrate why an English sentiment dictionary is needed, Fig. 2 shows two posts published by microblog users; as the figure shows, users with bilingual habits usually express emotion with English emotion words when they mention a topic.
To build the bilingual sentiment dictionary, we first collect a corpus of a certain size with sentiment orientation and manually extract from it a small number of high-frequency bilingual words with sentiment orientation (such as happy, sad, happiness, beautiful, sad) into the sentiment dictionary; these words serve as seed emotion words. In this embodiment more than 7,000 microblog texts were collected from Sina Weibo as the corpus. We select some words from the corpus and manually label them with the five emotion classes to form the seed word set seedset={PA, PB, PC, PD, PE}, where PA, PB, PC, PD, and PE are the subsets for the respective emotion classes (social help, happiness, sadness, anger, fear).
Then, based on the chosen seed word set, we compute the similarity between the seed emotion words and the emotion words in each knowledge base, and add the knowledge-base words that are highly similar to the seed words to the corresponding one of the five seed classes. In this embodiment we apply the existing knowledge bases WordNet, NTUSD, and HowNet in turn to expand the sentiment dictionary. A suitable number of words can be selected from the knowledge bases as needed; in this example, most of the words in HowNet and WordNet were used. Each word $v_k$ in an existing knowledge base can be described by several concepts; each concept $C_k$ is in turn defined in the knowledge base's sememe-based description language, and each concept contains several sememe explanations.
For the semantic similarity between two Chinese words, the invention uses the HowNet lexical similarity method, defined in formula (1) and formula (2):
In formula (1), $\mathrm{similarity}(v_1, v_2)$ denotes the similarity between two words. Word $v_1$ has $n$ concepts, each containing $p_1$ sememes. Word $v_2$ has $m$ concepts, each containing $p_2$ sememes. $C_{1i}$ denotes the $i$-th concept of word $v_1$, and $C_{2j}$ denotes the $j$-th concept of word $v_2$.
In formula (2), $p_1$ and $p_2$ are the numbers of sememes contained in the two concepts, and the weight term gives the weight of the $i$-th sememe in concept $C_1$. The maximum sememe similarity between the two words is taken as the similarity of the two words. $\alpha$ is a positive adjustable parameter, and $d$ is the distance between the sememes in the HowNet sememe hierarchy tree.
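The roles of $\alpha$ and $d$ above match the distance-based sememe similarity commonly used with HowNet, $\alpha / (d + \alpha)$. The source's formula images are not available, so the sketch below assumes that form; the tiny sememe tree, the default `alpha=1.6`, and all function names are hypothetical and only illustrate how a tree distance could be turned into a word similarity by taking the maximum over sememe pairs.

```python
# Hypothetical toy sememe hierarchy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "emotion": "entity",
    "joy": "emotion",
    "sorrow": "emotion",
    "event": "entity",
}

def path_to_root(sememe):
    """Return the list of nodes from `sememe` up to the root."""
    path = []
    while sememe is not None:
        path.append(sememe)
        sememe = PARENT[sememe]
    return path

def tree_distance(s1, s2):
    """Number of edges between two sememes via their lowest common ancestor."""
    p1, p2 = path_to_root(s1), path_to_root(s2)
    depth_in_p2 = {node: i for i, node in enumerate(p2)}
    for i, node in enumerate(p1):
        if node in depth_in_p2:
            return i + depth_in_p2[node]
    raise ValueError("sememes are not in the same tree")

def sememe_similarity(s1, s2, alpha=1.6):
    """Assumed form alpha / (d + alpha); alpha is a positive tunable parameter."""
    return alpha / (tree_distance(s1, s2) + alpha)

def word_similarity(sememes1, sememes2, alpha=1.6):
    """Take the maximum sememe similarity between two words as their similarity."""
    return max(sememe_similarity(a, b, alpha) for a in sememes1 for b in sememes2)

score = word_similarity(["joy", "event"], ["sorrow"])
```

A larger `alpha` flattens the curve, so distant sememes still count as fairly similar; a smaller one makes similarity drop quickly with tree distance.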
For the semantic similarity between English words, the Lesk method in WordNet is used to measure the relatedness between words. Each concept and word sense in WordNet is defined by a short gloss. The Lesk method finds and measures the overlap between the glosses of two concepts and from it computes the similarity $\mathrm{similarity}(v_1, v_2)$ between two words. Each English word has several morphological forms; for example, happily and happiness are both related to happy. The different forms of a word therefore need to be merged by removing affixes to obtain the root, that is, morphological normalization, which improves the efficiency of text processing. The invention normalizes English words with two methods provided in NLTK, the Lancaster stemmer and the WordNet Lemmatizer.
Once the similarity measure is fixed, we compute the similarity between each emotion word in the knowledge bases and the words in the seed set, and add the knowledge-base word to the matching sentiment class of the sentiment dictionary. Writing $P_1, \ldots, P_5$ for the seed subsets PA to PE, the rule described here is:
$$\omega(v) = \arg\max_{c \in \{1,\ldots,5\}} \frac{1}{N_c} \sum_{s \in P_c} \mathrm{similarity}(v, s)$$
where $N_1, N_2, N_3, N_4, N_5$ are the numbers of seed words in the respective emotion subsets. $\omega(v)$ is the emotion class assigned to the non-seed word $v$; it is the class whose subset has the highest average similarity to $v$.
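The assignment rule can be sketched in a few lines. This is a minimal illustration rather than the patent's implementation: `toy_similarity` is a character-overlap stand-in for the HowNet and Lesk measures, and the seed words shown are hypothetical.

```python
def assign_emotion_class(word, seed_sets, similarity):
    """Return (label, score) for the seed class with the highest mean similarity to `word`."""
    best_label, best_score = None, float("-inf")
    for label, seeds in seed_sets.items():
        if not seeds:
            continue  # an empty seed subset cannot attract words
        score = sum(similarity(word, s) for s in seeds) / len(seeds)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score

def toy_similarity(a, b):
    """Stand-in similarity: Jaccard overlap of the two words' character sets."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

# Hypothetical seed subsets; PB = happiness, PC = sadness as in the text.
seed_sets = {
    "PB": ["happy", "happiness"],
    "PC": ["sad", "sadness"],
}
label, score = assign_emotion_class("happily", seed_sets, toy_similarity)
```

In practice one would probably also require the winning average to exceed some threshold, so that neutral words are not forced into a class; the source does not state such a threshold.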
Besides traditional emotion words, more and more emerging internet slang and emoticons are used in microblog text to express users' emotion. Therefore, in addition to expanding the vocabulary from existing knowledge bases, we observed and summarized a large number of microblog texts, manually added internet slang and emoticons to the sentiment dictionary, and classified their sentiment orientation by a show-of-hands vote among several people.
In total, the sentiment dictionary built in this embodiment contains 7,590 Chinese emotion words, 421 English emotion words, 613 internet slang terms, and 101 common emoticons. By class, it contains 971 "social help" entries, 2,731 "happiness" entries, 2,289 "sadness" entries, 1,458 "anger" entries, and 1,276 "fear" entries.
2. Text preprocessing:
Word segmentation is performed on the microblog message text. The invention uses the ICTCLAS word segmentation system to identify words in Chinese text, and splits English text on spaces. After a microblog message is segmented, stop words such as "a" and "the" are removed. English text additionally undergoes lemmatization and stemming, in the same way as English text is processed during sentiment dictionary construction.
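A minimal sketch of the stop-word step, assuming the Chinese part has already been segmented into space-separated tokens by an external segmenter (ICTCLAS in the source). The stop-word list contains only the English determiners named earlier, the function name is hypothetical, and the lemmatization and stemming step is deliberately left out.

```python
import string

# Illustrative stop-word list; a real system would use a much larger bilingual list.
STOP_WORDS = {"the", "a", "an", "that"}

# ASCII punctuation plus a few full-width marks common in Chinese text.
PUNCTUATION = string.punctuation + "，。！？"

def preprocess(segmented_text, stop_words=STOP_WORDS):
    """Split on whitespace, strip punctuation, lowercase, and drop stop words."""
    tokens = []
    for raw in segmented_text.split():
        token = raw.strip(PUNCTUATION).lower()
        if token and token not in stop_words:
            tokens.append(token)
    return tokens

tokens = preprocess("今天 心情 so Happy , the best !")
```

Lowercasing leaves Chinese tokens unchanged, so the same function can handle mixed Chinese-English input.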
3. Feature space representation of text:
Let $D = \{d_1, d_2, \ldots, d_n\}$ denote the set of all microblog message texts. Each microblog message $d_i$ can then be represented as a five-dimensional vector, each component being the number of words in the message that belong to the corresponding emotion class.
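The five-dimensional representation amounts to counting dictionary hits per class. The sketch below assumes the sentiment dictionary is available as a mapping from word to class name; the class identifiers and the sample entries are hypothetical.

```python
# Class order follows the text: social help, happiness, sadness, anger, fear.
CLASSES = ["social_help", "happiness", "sadness", "anger", "fear"]

def to_feature_vector(tokens, lexicon):
    """Count, per emotion class, how many tokens appear in the sentiment dictionary."""
    counts = [0] * len(CLASSES)
    for token in tokens:
        label = lexicon.get(token)
        if label is not None:
            counts[CLASSES.index(label)] += 1
    return counts

# Hypothetical bilingual dictionary entries.
lexicon = {
    "happy": "happiness",
    "开心": "happiness",
    "sad": "sadness",
    "help": "social_help",
}
vector = to_feature_vector(["今天", "开心", "so", "happy", "sad"], lexicon)
```

Tokens that are not in the dictionary contribute nothing, so a message without any emotion word maps to the all-zero vector.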
4. Emotion recognition of microblog text with the emotion classification models:
Many existing methods and models can perform sentiment analysis on text; here we introduce only two multi-class sentiment analysis models for microblog text. The semi-supervised Gaussian mixture model emotion classification algorithm requires iterative training of the model, whereas the k-nearest-neighbor emotion classification algorithm based on symmetric relative entropy only needs the labeled text vectors as input and requires no learning.
Semi-supervised Gaussian mixture emotion classifier model:
The flow of the semi-supervised Gaussian mixture model emotion classification algorithm is shown in Fig. 3. Gaussian mixture models are usually fitted with the expectation-maximization (EM) algorithm. Training a Gaussian mixture model (GMM) means estimating the probability density distribution of the samples; the estimated model is a weighted sum of several Gaussian models, each of which represents one class. Since this embodiment has five emotion classes, we train five Gaussian models. Learning the Gaussian mixture model is thus the maximum likelihood estimation of each Gaussian model's probability density and of its weight $\pi_i$. $\pi_i$ depends on the proportion of text of each emotion class in the training set, and during iteration it changes as test-set samples are added. Because the five emotion classes have equal shares of the training set, its initial value is 1/5.
The semi-supervised Gaussian mixture model is a self-training algorithm. In this embodiment the manually labeled microblog text is therefore split into two parts, a training set and a test set, so that in each training iteration texts can be chosen from the test set and added to the training set, which achieves self-training. In each iteration, the probabilities of each test-set text under the Gaussian models of the five emotion classes are compared, and the class with the highest probability is taken as the text's emotion class. From all texts correctly classified into each class, the one with the highest probability is selected and added to the training set, and the Gaussian mixture model is then re-learned on the new training set. The algorithm stops when the model's classification performance on the test set after an iteration shows no or negligible difference from that before the iteration, or when the test set is empty.
$\phi(u_j \mid \theta_k)$ is the Gaussian probability density function, and $\theta_k = (\mu_k, \sigma_k^2)$ are the model parameters obtained by training, where $\mu_k$ is the mean and $\sigma_k^2$ the variance of the $k$-th Gaussian model. The invention first learns the Gaussian mixture model from labeled microblog message text; the input is the labeled text vectors in the training set. The resulting model parameters and the probability distribution of the labeled samples then serve as the initial parameter values for iterative learning of the model. The final output is the means and variances of five Gaussian models, which represent the five emotion classes. With the learned parameters, an unlabeled text vector can be input; the five output probabilities are compared and the text is assigned to the emotion class with the highest probability.
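The self-training loop can be sketched as follows. This is a simplified illustration under several assumptions that are not in the source: each class is one Gaussian with a diagonal covariance, a variance floor avoids division by zero on count features, the pool is treated as unlabeled and each added sample takes its predicted label, and the loop stops only when the pool is empty (the convergence check is omitted). All names are hypothetical.

```python
import math

def fit_class_gaussians(samples, labels, n_classes, var_floor=1e-2):
    """Estimate weight, mean and diagonal variance for every class with samples."""
    dim = len(samples[0])
    params = []
    for c in range(n_classes):
        rows = [x for x, y in zip(samples, labels) if y == c]
        if not rows:
            params.append(None)  # class without examples scores -inf below
            continue
        n = len(rows)
        mean = [sum(r[d] for r in rows) / n for d in range(dim)]
        var = [max(sum((r[d] - mean[d]) ** 2 for r in rows) / n, var_floor)
               for d in range(dim)]
        params.append((n / len(samples), mean, var))
    return params

def log_joint(x, params):
    """Return log(pi_k * phi(x | theta_k)) for every class k."""
    scores = []
    for p in params:
        if p is None:
            scores.append(float("-inf"))
            continue
        weight, mean, var = p
        log_density = sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                          for xi, m, v in zip(x, mean, var))
        scores.append(math.log(weight) + log_density)
    return scores

def predict(x, params):
    """Index of the class with the highest score."""
    scores = log_joint(x, params)
    return max(range(len(scores)), key=scores.__getitem__)

def self_train(labelled, labels, pool, n_classes):
    """Move the most confident pool sample into the training set, refit, repeat."""
    labelled, labels, pool = list(labelled), list(labels), list(pool)
    params = fit_class_gaussians(labelled, labels, n_classes)
    while pool:
        best_i, best_label, best_score = 0, 0, float("-inf")
        for i, x in enumerate(pool):
            scores = log_joint(x, params)
            k = max(range(n_classes), key=scores.__getitem__)
            if scores[k] > best_score:
                best_i, best_label, best_score = i, k, scores[k]
        labelled.append(pool.pop(best_i))
        labels.append(best_label)
        params = fit_class_gaussians(labelled, labels, n_classes)
    return params

train_x = [[0, 3, 0, 0, 0], [0, 2, 1, 0, 0], [0, 0, 3, 0, 0], [0, 1, 2, 0, 0]]
train_y = [1, 1, 2, 2]  # 1 = happiness, 2 = sadness in this toy example
pool_x = [[0, 4, 0, 0, 0], [0, 0, 4, 0, 0]]
model = self_train(train_x, train_y, pool_x, n_classes=5)
```

Adding only one sample per iteration keeps early mistakes from spreading quickly, at the cost of one refit per pool sample.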
K-nearest-neighbor emotion classifier model based on symmetric relative entropy:
The k-nearest-neighbor classification algorithm assigns a sample to the class held by the majority of its k most similar samples, that is, its k nearest neighbors in feature space. In the classification decision, the method determines the class of the sample to be classified only from the classes of its one or several nearest samples.
The choice of k, the distance measure, and the classification decision rule are the three basic elements of the k-nearest-neighbor algorithm. In the invention we use relative entropy to measure emotional similarity between texts. Relative entropy is an asymmetric measure of the difference between two probability distributions X and Y over the same event space, denoted D(X||Y). Because it is defined on probability distributions, the text vectors are normalized, and the normalized text vector is denoted $T_i$.
The distance between microblog message texts $T_i$ and $T_j$ is defined as the relative entropy:
$$D(T_i \,\|\, T_j) = \sum_{k=1}^{5} \omega_{ik} \log \frac{\omega_{ik}}{\omega_{jk}}$$
where $\omega_{ik}$ and $\omega_{jk}$ are the $k$-th components of $T_i$ and $T_j$ respectively, and $k$ is an integer from 1 to 5. Because relative entropy is asymmetric, when it measures the difference between probability distributions X and Y, X represents the true distribution of the data and Y represents an approximation of X. Accordingly, when the distance between texts is computed, $T_i$ is the normalized vector of a labeled text and $T_j$ is the normalized vector of an unlabeled text. This asymmetric form, however, ignores how well X approximates Y. To remove the asymmetry of the traditional relative entropy, the symmetric relative entropy, which also accounts for $D(T_j \,\|\, T_i)$, is defined as follows:
The inputs are the normalized vectors of the labeled microblog texts, which serve as the training set, and the normalized vectors of the unlabeled microblog texts. The asymmetric or symmetric relative entropy distance defined above is used to compute the distance between each unlabeled text vector and the labeled texts in the training set, that is, the similarity between microblog texts. The emotion class with the largest share among the K nearest neighbors of the text vector in the training set is taken as the emotion class of the text.
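A sketch of the classifier described above. Two points are assumptions rather than details from the source: the symmetric distance is taken as the sum of the two directed relative entropies (an average would rank neighbors identically), and a small `eps` is added before normalization so that zero counts do not produce `log(0)`. Function names are hypothetical.

```python
import math
from collections import Counter

def normalize(counts, eps=1e-6):
    """Turn a count vector into a probability vector; eps avoids log(0) (assumption)."""
    smoothed = [c + eps for c in counts]
    total = sum(smoothed)
    return [s / total for s in smoothed]

def relative_entropy(p, q):
    """Directed relative entropy D(p || q) between two probability vectors."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q))

def symmetric_relative_entropy(p, q):
    """Assumed symmetric form: D(p || q) + D(q || p)."""
    return relative_entropy(p, q) + relative_entropy(q, p)

def knn_classify(query, training, k=3):
    """training: list of (count_vector, label); majority label among the k nearest."""
    q = normalize(query)
    nearest = sorted(
        training,
        key=lambda item: symmetric_relative_entropy(normalize(item[0]), q),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy labeled vectors; B = happiness, C = sadness, D = anger as in the text.
training = [
    ([0, 3, 0, 0, 0], "B"),
    ([0, 2, 1, 0, 0], "B"),
    ([0, 0, 3, 0, 0], "C"),
    ([0, 1, 2, 0, 0], "C"),
    ([0, 0, 0, 2, 0], "D"),
]
predicted = knn_classify([0, 4, 1, 0, 0], training, k=3)
```

No training step is needed: classifying a new text only requires its distances to the stored labeled vectors, which matches the remark that this model needs no learning.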
Fig. 6 shows a multi-class sentiment analysis system for bilingual microblog text according to an embodiment of the invention, comprising a bilingual Chinese-English sentiment dictionary, a corpus preprocessing module, a corpus text feature space representation module, and an emotion classifier recognition module. The bilingual Chinese-English sentiment dictionary is built with the dictionary construction method described above. The corpus preprocessing module performs word segmentation and stop-word removal on the corpus text to be analyzed, and additionally morphological normalization for English text. The corpus text feature space representation module vectorizes the text output by the preprocessing module, turning each text into a five-dimensional vector whose five elements are the numbers of words in the text that belong to the social help, happiness, sadness, anger, and fear classes of the bilingual Chinese-English sentiment dictionary. The emotion classifier recognition module applies the emotion classifier model to the corpus text vector and determines the emotion class of the corpus text.
When the multi-class sentiment analysis system of this embodiment is given a bilingual microblog text as input, it outputs the emotion class of the text after processing: output A means the microblog belongs to the social help class, B to the happiness class, C to the sadness class, D to the anger class, and E to the fear class.
Evaluation metric:
The invention invited students studying natural language processing to manually label, according to the five emotion classes, text crawled through the API provided by Sina. Part of the labeled text serves as the training set and part as the test set. After a model is trained on the training set, its classification accuracy on the test-set microblog text is used to evaluate the quality of the model.
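Accuracy, and the F1 value mentioned among the beneficial effects, can be computed as in the sketch below. The source does not say how F1 is averaged over the five classes, so the macro average used here is an assumption; function names and the toy labels are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Share of texts whose predicted class equals the labeled class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)

y_true = ["A", "B", "B", "C"]
y_pred = ["A", "B", "C", "C"]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, classes=["A", "B", "C"])
```

Macro averaging gives rare classes such as fear the same weight as frequent ones such as happiness, which matters when the class distribution is as uneven as in the test set used here.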
Data set:
When carrying out machine learning algorithm and comparing, the data set that we choose is that 7170 Chinese microblogging text messages of Sina API crawl are as experimental data.And invite 25 students studying natural language direction to carry out artificial classification mark according to 5 class emotions to text, and then the emotion classification of text is made to depend on the emotion classification that majority choose.The distribution situation of language material in each emotion classification is as shown in table 1:
Table 1. Distribution of microblog texts across the 5 emotion categories
Similarly, for the bilingual microblog text emotion classification experiment, we crawled 7000 bilingual microblog texts through the Sina API and invited 25 students working in natural language research to manually label them according to the 5 emotion classes. The distribution of the corpus across the emotion categories is shown in Table 2:
Table 2. Distribution of microblog texts across the 5 emotion categories
Experimental result:
From the 7170 Chinese microblogs we chose 3170 as the test set (see Table 1): 500 texts expressing social help, 1300 expressing happiness, 540 expressing sadness, 510 expressing anger, and 320 expressing fear. Training sets of between 1000 and 4000 microblogs were then drawn from the remaining 4000.
We first compare the k-nearest-neighbor classification algorithm based on asymmetric relative entropy with the k-nearest-neighbor classification algorithm based on symmetric relative entropy. The experimental results are shown in Table 3:
Table 3. Accuracy of the k-nearest-neighbor classification algorithm with different distance metrics under different training set sizes
The results show that the k-nearest-neighbor classification algorithm based on symmetric relative entropy is slightly more accurate. Therefore, in the subsequent comparison of multiple machine learning classification algorithms, only the k-nearest-neighbor algorithm based on symmetric relative entropy is included.
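A k-nearest-neighbor classifier with a symmetric relative entropy distance can be sketched as follows. The exact formula used by the invention is not reproduced in this text, so the sketch assumes one common symmetrization, KL(p||q) + KL(q||p). The additive smoothing constant `eps`, the value of `k`, and all function names are likewise assumptions made so that the example runs on count vectors containing zeros.

```python
import math
from collections import Counter


def normalize(vec, eps=1e-6):
    """Smooth a count vector and scale it to sum to 1 (eps is an assumption)."""
    smoothed = [v + eps for v in vec]
    total = sum(smoothed)
    return [v / total for v in smoothed]


def symmetric_kl(p, q):
    """KL(p||q) + KL(q||p), which simplifies to sum((p_k - q_k) * log(p_k / q_k))."""
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))


def knn_predict(x, train, k=3):
    """Return the majority label among the k training texts closest to x.

    `train` is a list of (five_dimensional_count_vector, label) pairs.
    """
    px = normalize(x)
    nearest = sorted(train, key=lambda item: symmetric_kl(px, normalize(item[0])))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical labeled vectors (social help, happiness, sadness, anger, fear).
train = [
    ([0, 3, 0, 0, 0], "happiness"),
    ([0, 2, 1, 0, 0], "happiness"),
    ([0, 0, 3, 0, 0], "sadness"),
    ([0, 0, 2, 0, 1], "sadness"),
    ([0, 0, 0, 3, 0], "anger"),
]
print(knn_predict([0, 4, 0, 0, 0], train, k=3))  # happiness
```

Unlike the asymmetric form, this distance gives the same value whichever of the two texts is treated as the reference, so neighbor rankings do not depend on argument order.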
We compare the majority voting algorithm (Majority vote), the support vector machine algorithm (SVM), the k-nearest-neighbor classification algorithm based on cosine distance (KNN-Cosine), and the semi-supervised Gaussian mixture model classification algorithm proposed in the present invention (Semi-GMM) with the k-nearest-neighbor algorithm based on symmetric relative entropy (KNN-KL). The comparison is shown in Figure 4. As the figure shows, when the training set contains 4000 texts, the accuracy of KNN-KL reaches as high as 85.1%. With the same k-nearest-neighbor algorithm, measuring text distance by symmetric relative entropy gives better classification results than measuring it by cosine distance.
When the training set contains fewer than 3000 texts, Semi-GMM performs better. As the number of training texts decreases, Semi-GMM is more stable than KNN-Cosine and KNN-KL. When the number of training texts drops to 1000, the accuracy of KNN-KL falls by 8.9%, whereas that of Semi-GMM falls by only 2.9%. This further demonstrates that Semi-GMM is better suited to small training sets, while a fully supervised learning algorithm such as k-nearest-neighbor is easily affected by the choice of the number of neighbors k, which influences the classification results.
SVM is designed for binary classification problems. Although SVM classifiers that support multiclass classification have appeared, their accuracy depends more heavily on the quality of the training data. Moreover, SVM has relatively high complexity and is not well suited to large-scale classification problems. Because obtaining large-scale, high-quality training data for text classification is costly, we consider Semi-GMM more suitable for text emotion classification when the labeled text set is not very large.
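The general idea behind a semi-supervised Gaussian mixture classifier can be sketched as below. This is not the patent's implementation: it assumes one diagonal-covariance Gaussian per emotion class, initializes the parameters from the labeled texts, and then runs a fixed number of EM iterations in which labeled texts keep their labels and unlabeled texts receive soft class assignments. The variance floor, the iteration count, and all names are assumptions.

```python
import math


def _log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def _posteriors(x, params):
    """Class posteriors for x under the current mixture parameters."""
    logs = {c: math.log(p["prior"]) + _log_gauss(x, p["mean"], p["var"])
            for c, p in params.items()}
    top = max(logs.values())
    weights = {c: math.exp(v - top) for c, v in logs.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def _m_step(points, resp, classes, var_floor):
    """Re-estimate the prior, mean, and variance of each class component."""
    dim = len(points[0])
    params = {}
    for c in classes:
        n_c = sum(r[c] for r in resp)
        mean = [sum(r[c] * x[d] for x, r in zip(points, resp)) / n_c
                for d in range(dim)]
        var = [max(var_floor,
                   sum(r[c] * (x[d] - mean[d]) ** 2
                       for x, r in zip(points, resp)) / n_c)
               for d in range(dim)]
        params[c] = {"prior": n_c / len(points), "mean": mean, "var": var}
    return params


def fit_semi_gmm(labeled, unlabeled, iters=20, var_floor=0.25):
    """labeled: list of (vector, label) pairs; unlabeled: list of vectors."""
    classes = sorted({y for _, y in labeled})
    xs = [x for x, _ in labeled]
    hard = [{c: 1.0 if c == y else 0.0 for c in classes} for _, y in labeled]
    params = _m_step(xs, hard, classes, var_floor)  # supervised initialization
    for _ in range(iters):
        soft = [_posteriors(x, params) for x in unlabeled]  # E-step
        params = _m_step(xs + list(unlabeled), hard + soft,
                         classes, var_floor)  # M-step
    return params


def predict(x, params):
    """Return the class with the highest posterior probability for x."""
    post = _posteriors(x, params)
    return max(post, key=post.get)


# Hypothetical five-dimensional count vectors.
labeled = [([0, 3, 0, 0, 0], "happiness"), ([0, 2, 0, 0, 0], "happiness"),
           ([0, 0, 3, 0, 0], "sadness"), ([0, 0, 2, 0, 0], "sadness")]
unlabeled = [[0, 4, 0, 0, 0], [0, 0, 4, 0, 0], [0, 3, 1, 0, 0], [0, 1, 3, 0, 0]]
params = fit_semi_gmm(labeled, unlabeled)
print(predict([0, 5, 0, 0, 0], params))  # happiness
```

The sketch stops after a fixed number of iterations rather than testing for convergence, which keeps it short; a convergence test on the log-likelihood would be a natural refinement.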
Table 4 shows the accuracy of KNN-KL on the 5 emotion classes under different training set sizes. When the number of training texts drops to 1000, the classification performance on microblog texts expressing social help and sadness declines sharply. Compared with the results for a training set of 4000 texts, 79 microblog texts expressing social help are misclassified, of which 64 are identified as happiness, 11 as sadness, and 4 as anger. Of the microblog texts expressing sadness, 60 are misclassified, of which 8 are identified as social help, 28 as happiness, 13 as anger, and 11 as fear.
Table 4. Text classification accuracy of Semi-GMM and KNN-KL under different training set sizes
Table 5 shows the F1 values of Semi-GMM and KNN-KL under different training set sizes; these further demonstrate the classification advantage of Semi-GMM on small training sets.
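Accuracy and F1 can be computed from gold and predicted labels as sketched below. The text does not state how the per-class F1 values are aggregated, so the unweighted (macro) average used here is an assumption, as are the function names and the toy labels.

```python
def accuracy(gold, pred):
    """Fraction of texts whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def f1_per_class(gold, pred):
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each label."""
    scores = {}
    for c in sorted(set(gold) | set(pred)):
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 values (an assumed aggregation)."""
    scores = f1_per_class(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical gold and predicted labels for four texts.
gold = ["happiness", "happiness", "sadness", "anger"]
pred = ["happiness", "sadness", "sadness", "anger"]
print(accuracy(gold, pred))            # 0.75
print(round(macro_f1(gold, pred), 4))  # 0.7778
```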
Table 5. Text classification F1 values of Semi-GMM and KNN-KL under different training set sizes
In the bilingual microblog text emotion classification experiment, we chose 3000 of the 7000 bilingual microblogs as the test set (see Table 2), and training sets of between 1000 and 4000 microblogs were drawn from the remaining 4000.
We compare two groups of algorithms. The first group uses only the Chinese sentiment dictionary for emotion word recognition: the semi-supervised Gaussian mixture model classification algorithm (Semi-GMM (Ch.)) and the k-nearest-neighbor algorithm based on symmetric relative entropy (KNN-KL (Ch.)). The second group uses the Chinese and English sentiment dictionaries jointly for emotion word recognition: the majority voting algorithm (Majority vote (Ch.+Eng.)), the SVM (Ch.+Eng.) algorithm, the k-nearest-neighbor classification algorithm based on cosine distance (KNN-Cosine (Ch.+Eng.)), the semi-supervised Gaussian mixture model classification algorithm proposed in the present invention (Semi-GMM (Ch.+Eng.)), and the k-nearest-neighbor algorithm based on symmetric relative entropy (KNN-KL (Ch.+Eng.)). The comparison is shown in Figure 5. As the figure shows, the text emotion classification algorithms that combine the Chinese and English sentiment dictionaries for emotion word recognition are clearly more accurate than those that use the Chinese sentiment dictionary alone, which further demonstrates the validity of the bilingual emotion word dictionary we built. When the training set drops to 1000 microblog texts, the classification accuracy of Semi-GMM (Ch.+Eng.) reaches as high as 68.3%.
Table 6. Text classification accuracy of Semi-GMM and KNN-KL under different training set sizes
Table 7. Text classification F1 values of Semi-GMM and KNN-KL under different training set sizes
Tables 6 and 7 give the accuracy and F1 values of Semi-GMM and KNN-KL for 5-class emotion recognition under different training set sizes. When the training set drops to 1000 texts, the F1 value of Semi-GMM is greater than that of KNN-KL. This further demonstrates that the presence of words from different languages in a text does not affect the stability of Semi-GMM, and that Semi-GMM has a greater classification advantage on small training sets.
Therefore, the bilingual-dictionary-based multiclass sentiment analysis method for microblogs proposed by the present invention has considerable practical application value.
To illustrate the content and implementation of the present invention, this specification presents a specific embodiment. The details introduced in the embodiment are not intended to limit the scope of the claims but to aid understanding of the method of the invention. Those skilled in the art should appreciate that various modifications, changes, or substitutions to the steps of the preferred embodiment are possible without departing from the spirit and scope of the present invention and its claims. Therefore, the present invention should not be limited to the content disclosed in the preferred embodiment and the accompanying drawings.
Claims (10)
1. A method for constructing a bilingual Chinese-English sentiment dictionary, characterized by comprising the following steps:
Step 1: crawl microblog web pages, collect Chinese and English corpora with sentiment orientation from the pages, extract high-frequency words with sentiment orientation from the corpus, and add them to the sentiment dictionary;
Step 2: expand the sentiment dictionary using existing knowledge bases;
Step 3: analyze the crawled microblog corpus, and add newly emerging Internet slang and emoticons to the sentiment dictionary.
2. The method for constructing a bilingual Chinese-English sentiment dictionary according to claim 1, characterized in that the sentiment orientations comprise 5 classes: social help, happiness, sadness, anger, and fear.
3. The method for constructing a bilingual Chinese-English sentiment dictionary according to claim 1, characterized in that the expansion in Step 2 computes, for each emotion word in each knowledge base, its average similarity to the words of each sentiment orientation in the sentiment dictionary, and adds the emotion word to the sentiment orientation category with the greatest similarity; the knowledge bases comprise WordNet, NTUSD, and HowNet.
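The expansion rule of claim 3 can be sketched as follows. The word similarity model itself is not given in this section, so `similarity` is passed in as a parameter and the demonstration uses a hypothetical lookup table of made-up scores; the names `assign_emotion` and `toy_similarity` are illustrative only.

```python
def assign_emotion(word, lexicon, similarity):
    """Return the emotion class whose seed words are most similar to `word` on average.

    `lexicon` maps each emotion class to its current list of words;
    `similarity` is any function scoring a pair of words.
    """
    averages = {
        emotion: sum(similarity(word, seed) for seed in seeds) / len(seeds)
        for emotion, seeds in lexicon.items()
        if seeds
    }
    return max(averages, key=averages.get)


# Hypothetical similarity scores standing in for a real similarity model.
TOY_SCORES = {
    ("joyful", "happy"): 0.9,
    ("joyful", "glad"): 0.8,
    ("joyful", "sad"): 0.1,
}


def toy_similarity(a, b):
    """Look up a made-up score; unknown pairs count as 0."""
    return TOY_SCORES.get((a, b), 0.0)


lexicon = {"happiness": ["happy", "glad"], "sadness": ["sad", "gloomy"]}
print(assign_emotion("joyful", lexicon, toy_similarity))  # happiness
```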
4. The method for constructing a bilingual Chinese-English sentiment dictionary according to any one of claims 1-3, characterized in that the sentiment orientation of the Internet slang and emoticons is classified by a show of hands among multiple people.
5. A multiclass sentiment analysis method based on a bilingual dictionary, comprising the following steps:
Step 1: preprocess the corpus text;
Step 2: represent the corpus text in feature space according to the bilingual Chinese-English sentiment dictionary;
Step 3: classify the emotion of the corpus text using an established multiclass text emotion classification model.
6. The microblog multiclass sentiment analysis method based on a bilingual dictionary according to claim 5, characterized in that the preprocessing comprises word segmentation and stop-word removal, and further comprises morphological normalization for English text.
7. The microblog multiclass sentiment analysis method based on a bilingual dictionary according to claim 5, characterized in that the text feature space representation expresses each text in the corpus as a five-dimensional vector, each element of which is the number of emotion words of the corresponding category of the bilingual Chinese-English sentiment dictionary contained in the text.
8. The microblog multiclass sentiment analysis method based on a bilingual dictionary according to claim 5, characterized in that:
the multiclass emotion classification model is a semi-supervised Gaussian mixture model classification algorithm or a k-nearest-neighbor algorithm based on symmetric relative entropy;
the semi-supervised Gaussian mixture model classification algorithm learns a Gaussian mixture model from the labeled training corpus, then uses the parameters of this model and the probability distribution of the labeled samples as the initial parameter values of the Gaussian mixture model and performs iterative learning on the unlabeled test corpus until the algorithm converges or the unlabeled set is empty;
the k-nearest-neighbor algorithm based on symmetric relative entropy uses relative entropy as the distance that measures the emotional similarity between texts, and determines the category of a sample to be classified according to the categories of its neighboring samples.
9. The microblog multiclass sentiment analysis method based on a bilingual dictionary according to claim 8, characterized in that the relative entropy is calculated using the following formula:
where T_i is the normalized vector representation of a labeled text, T_j is the normalized vector representation of an unlabeled text, ω_ik and ω_jk are the k-th components of T_i and T_j respectively, and k is an integer from 1 to 5.
10. A multiclass sentiment analysis system for bilingual microblog text, characterized by comprising four modules: a bilingual Chinese-English sentiment dictionary, a corpus preprocessing module, a corpus text feature space representation module, and an emotion classifier recognition module;
the bilingual Chinese-English sentiment dictionary is built using the method for constructing a bilingual Chinese-English sentiment dictionary according to claim 1;
the corpus preprocessing module performs word segmentation and stop-word removal on the corpus to be analyzed, and additionally performs morphological normalization on English text;
the corpus text feature space representation module vectorizes the text output by the corpus preprocessing module, converting each text into a five-dimensional vector whose five elements are the numbers of words in the text that belong, respectively, to the social help, happiness, sadness, anger, and fear classes of the bilingual Chinese-English sentiment dictionary;
the emotion classifier recognition module applies the emotion classifier model according to claim 8 to the text vector to determine the emotion category of the text.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410670909.8A CN104331506A (en) | 2014-11-20 | 2014-11-20 | Multiclass emotion analyzing method and system facing bilingual microblog text |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104331506A true CN104331506A (en) | 2015-02-04 |
Family
ID=52406233
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410670909.8A Pending CN104331506A (en) | 2014-11-20 | 2014-11-20 | Multiclass emotion analyzing method and system facing bilingual microblog text |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104331506A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8600858B1 (en) * | 2011-05-18 | 2013-12-03 | Fahad Kamruddin | Determining financial sentiment based on contests |
CN103559233A (en) * | 2012-10-29 | 2014-02-05 | 中国人民解放军国防科学技术大学 | Extraction method for network new words in microblogs and microblog emotion analysis method and system |
Non-Patent Citations (1)
Title |
---|
YUQING LI et al.: "A Lexicon-Based Multi-class Semantic Orientation Analysis for Microblogs", Asia-Pacific Web Conference, Springer International Publishing * |
Cited By (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105243595A (en) * | 2015-10-13 | 2016-01-13 | 宁波知微瑞驰信息科技有限公司 | Method for measuring similarity between accounts with social network depression emotion |
CN105320960A (en) * | 2015-10-14 | 2016-02-10 | 北京航空航天大学 | Voting based classification method for cross-language subjective and objective sentiments |
CN105320960B (en) * | 2015-10-14 | 2022-04-05 | 北京航空航天大学 | Voting-based cross-language subjective and objective emotion classification method |
CN105912576A (en) * | 2016-03-31 | 2016-08-31 | 北京外国语大学 | Emotion classification method and emotion classification system |
CN109155008A (en) * | 2016-05-17 | 2019-01-04 | 华为技术有限公司 | Enhanced using the feature set of knowledge engine |
WO2017198087A1 (en) * | 2016-05-17 | 2017-11-23 | Huawei Technologies Co., Ltd. | Feature-set augmentation using knowledge engine |
CN107038154A (en) * | 2016-11-25 | 2017-08-11 | 阿里巴巴集团控股有限公司 | A kind of text emotion recognition methods and device |
CN106610955A (en) * | 2016-12-13 | 2017-05-03 | 成都数联铭品科技有限公司 | Dictionary-based multi-dimensional emotion analysis method |
CN107066446A (en) * | 2017-04-13 | 2017-08-18 | 广东工业大学 | A kind of Recognition with Recurrent Neural Network text emotion analysis method of embedded logic rules |
CN107122465A (en) * | 2017-04-28 | 2017-09-01 | 中央民族大学 | The construction method and system of a kind of Tibetan language sentiment dictionary based on Tibetan language language feature |
CN107315797A (en) * | 2017-06-19 | 2017-11-03 | 江西洪都航空工业集团有限责任公司 | A kind of Internet news is obtained and text emotion forecasting system |
CN107391581A (en) * | 2017-06-21 | 2017-11-24 | 清华大学 | Community network information dissemination Forecasting Methodology and equipment |
CN107423408A (en) * | 2017-07-28 | 2017-12-01 | 广州多益网络股份有限公司 | A kind of cross-cutting sentiment analysis method and system of microblogging text |
CN107423408B (en) * | 2017-07-28 | 2020-10-23 | 广州多益网络股份有限公司 | Microblog text cross-domain emotion analysis method and system |
CN107679031A (en) * | 2017-09-04 | 2018-02-09 | 昆明理工大学 | Based on the advertisement blog article recognition methods for stacking the self-editing ink recorder of noise reduction |
CN107679031B (en) * | 2017-09-04 | 2021-01-05 | 昆明理工大学 | Advertisement and blog identification method based on stacking noise reduction self-coding machine |
CN107832663B (en) * | 2017-09-30 | 2020-03-06 | 天津大学 | Multi-modal emotion analysis method based on quantum theory |
CN107832663A (en) * | 2017-09-30 | 2018-03-23 | 天津大学 | A kind of multi-modal sentiment analysis method based on quantum theory |
CN107945033A (en) * | 2017-11-14 | 2018-04-20 | 李勇 | A kind of analysis method of network public-opinion, system and relevant apparatus |
CN108536756A (en) * | 2018-03-16 | 2018-09-14 | 苏州大学 | Mood sorting technique and system based on bilingual information |
CN108363699A (en) * | 2018-03-21 | 2018-08-03 | 浙江大学城市学院 | A kind of netizen's school work mood analysis method based on Baidu's mhkc |
CN108804512A (en) * | 2018-04-20 | 2018-11-13 | 平安科技(深圳)有限公司 | Generating means, method and the computer readable storage medium of textual classification model |
CN108846073A (en) * | 2018-06-08 | 2018-11-20 | 青岛里奥机器人技术有限公司 | A kind of man-machine emotion conversational system of personalization |
CN108846073B (en) * | 2018-06-08 | 2022-02-15 | 合肥工业大学 | Personalized man-machine emotion conversation system |
CN109344331A (en) * | 2018-10-26 | 2019-02-15 | 南京邮电大学 | A kind of user feeling analysis method based on online community network |
CN109885687A (en) * | 2018-12-29 | 2019-06-14 | 深兰科技(上海)有限公司 | A kind of sentiment analysis method, apparatus, electronic equipment and the storage medium of text |
US11468233B2 (en) * | 2019-01-29 | 2022-10-11 | Ricoh Company, Ltd. | Intention identification method, intention identification apparatus, and computer-readable recording medium |
CN109918649A (en) * | 2019-02-01 | 2019-06-21 | 杭州师范大学 | A kind of suicide Risk Identification Method based on microblogging text |
CN109918649B (en) * | 2019-02-01 | 2023-08-11 | 杭州师范大学 | Suicide risk identification method based on microblog text |
CN111723198A (en) * | 2019-03-18 | 2020-09-29 | 北京京东尚科信息技术有限公司 | Text emotion recognition method and device and storage medium |
CN111723198B (en) * | 2019-03-18 | 2023-09-01 | 北京汇钧科技有限公司 | Text emotion recognition method, device and storage medium |
CN110263170A (en) * | 2019-06-21 | 2019-09-20 | 中科软科技股份有限公司 | A kind of automatic marking method and system of text categories |
CN111522913A (en) * | 2020-04-16 | 2020-08-11 | 山东贝赛信息科技有限公司 | Emotion classification method suitable for long text and short text |
CN112489688A (en) * | 2020-11-09 | 2021-03-12 | 浪潮通用软件有限公司 | Neural network-based emotion recognition method, device and medium |
CN112966514A (en) * | 2021-03-13 | 2021-06-15 | 北京理工大学 | Natural language emotion classification method based on sememe |
CN113919340A (en) * | 2021-08-27 | 2022-01-11 | 北京邮电大学 | Self-media language emotion analysis method based on unsupervised unknown word recognition |
CN113919340B (en) * | 2021-08-27 | 2024-08-13 | 北京邮电大学 | Self-media language emotion analysis method based on unsupervised unregistered word recognition |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104331506A (en) | Multiclass emotion analyzing method and system facing bilingual microblog text | |
CN105183833B (en) | Microblog text recommendation method and device based on user model | |
CN105260356B (en) | Chinese interaction text emotion and topic detection method based on multi-task learning | |
CN103207913B (en) | The acquisition methods of commercial fine granularity semantic relation and system | |
CN109558487A (en) | Document Classification Method based on the more attention networks of hierarchy | |
CN105912576B (en) | Emotion classification method and system | |
CN107038480A (en) | A kind of text sentiment classification method based on convolutional neural networks | |
CN108984530A (en) | A kind of detection method and detection system of network sensitive content | |
CN110532379B (en) | Electronic information recommendation method based on LSTM (least Square TM) user comment sentiment analysis | |
CN103729474B (en) | Method and system for recognizing forum user vest account | |
CN107193801A (en) | A kind of short text characteristic optimization and sentiment analysis method based on depth belief network | |
CN106202372A (en) | A kind of method of network text information emotional semantic classification | |
CN107122349A (en) | A kind of feature word of text extracting method based on word2vec LDA models | |
Pong-Inwong et al. | Improved sentiment analysis for teaching evaluation using feature selection and voting ensemble learning integration | |
CN103034626A (en) | Emotion analyzing system and method | |
CN109960799A (en) | A kind of Optimum Classification method towards short text | |
CN106651696A (en) | Approximate question push method and system | |
CN108376133A (en) | The short text sensibility classification method expanded based on emotion word | |
CN103761239A (en) | Method for performing emotional tendency classification to microblog by using emoticons | |
CN106599054A (en) | Method and system for title classification and push | |
CN108280057A (en) | A kind of microblogging rumour detection method based on BLSTM | |
CN103020167B (en) | A kind of computer Chinese file classification method | |
CN102929861A (en) | Method and system for calculating text emotion index | |
CN103473262A (en) | Automatic classification system and automatic classification method for Web comment viewpoint on the basis of association rule | |
CN106126502A (en) | A kind of emotional semantic classification system and method based on support vector machine |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20150204 |