CN105069021A - Chinese short text sentiment classification method based on fields - Google Patents
Chinese short text sentiment classification method based on fields
- Publication number
- CN105069021A
- Authority
- CN
- China
- Prior art keywords
- word
- short text
- field
- emotion
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/374—Thesaurus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention discloses a field-based Chinese short text sentiment classification method, which includes: preprocessing the short text, including sentence segmentation, word segmentation, stop-word filtering, and field division; constructing a field-oriented sentiment dictionary; using the field-oriented sentiment dictionary, with a corpus as the data set, to extract and match sentiment paths, extract candidate words and determine their polarity, and calculate TF-IDF weights for the sentiment words; extracting the sentiment features of the short text; and training on the corpus, or classifying texts of unknown sentiment type, with a random forest algorithm. Experiments show that the scheme provided by the present invention achieves a high accuracy rate.
Description
Technical field
The present invention relates to the field of machine learning, and in particular to a field-based Chinese short text sentiment classification method.
Background
The rapid development of the Internet has made social networks and e-commerce shopping platforms increasingly popular with users, for example Facebook, Twitter, Sina Weibo, Douban, Jingdong (JD.com) and Taobao, at home and abroad. Data on these platforms is growing explosively and includes reviews of goods, opinions on current events, and records of interesting moments in life or changes of mood. Short text is a common and important form of this data, and it often carries emotional color or subjective views. Mining the sentiment that users express in such short texts helps different parties make better choices or provide better services, for example giving users more relevant recommendations when they make choices, giving e-commerce companies more effective support when they promote products, and giving government or news media departments reliable predictions or early notice of potential hot events.
Text sentiment analysis is a popular research direction in natural language processing (Natural Language Processing, NLP) and has been studied extensively. Many techniques have been proposed, but they fall mainly into two kinds: methods based on a sentiment dictionary and methods based on machine learning. Dictionary-based methods use sentiment words (divided into positive and negative) as the main basis for discrimination, that is, they decide the sentiment of a text from the sentiment words it contains. Machine-learning-based methods classify the sentiment of a text with a trained classifier. Both approaches have advantages and drawbacks. The former is usually simpler, has lower algorithmic complexity, and does not need a large labeled corpus; however, the sentiment dictionary is prone to omissions, ambiguity or extreme values, and differences in the sentiment that a word carries in different scenarios usually cannot be detected. The latter is usually more accurate than the former, but training a sentiment feature classifier requires a large labeled corpus, and the corpus must be chosen appropriately.
Summary of the invention
The technical problem to be solved by the present invention is how to combine a sentiment dictionary with machine learning to classify the sentiment of Chinese short text automatically and efficiently, so as to improve the efficiency of automatic text labeling and training and to give the final classifier high accuracy.
To solve the above technical problem, the present invention provides a field-based Chinese short text sentiment classification method, comprising:
Preprocessing the short text, including sentence segmentation, word segmentation, stop-word filtering and field division;
Building field sentiment dictionaries for different fields;
Calculating the sentiment value of the short text using the field sentiment dictionary and the preprocessed data;
Extracting the sentiment features of the short text;
Using random forest as the classification tool to train on the corpus, or to classify short texts of unknown sentiment type, according to the extracted sentiment features.
Further, preprocessing the short text, including sentence segmentation, word segmentation, stop-word filtering and field division, specifically comprises:
Dividing the short text into multiple sentences using punctuation marks;
Segmenting the sentences into independent words with the ICTCLAS word segmentation tool;
Filtering the segmented words with a stop-word list;
Determining the field to which the short text belongs according to the short text and its context, in combination with a domain lexicon.
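The first three preprocessing steps can be sketched as follows. This is a minimal illustration, not the patented implementation: the set of sentence delimiters is an assumption, and ICTCLAS is not called here. Instead, `segment` stands for any callable that maps a sentence to a list of words, and `whitespace_segment` is a hypothetical toy segmenter that assumes the input is already space-separated.

```python
import re

# Sentence-ending punctuation (Chinese and ASCII). The exact set is an assumption.
SENTENCE_DELIMITERS = r"[。！？；!?;\n]+"


def split_sentences(text):
    """Split a short text into sentences on punctuation marks."""
    return [s.strip() for s in re.split(SENTENCE_DELIMITERS, text) if s.strip()]


def filter_stop_words(tokens, stop_words):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in stop_words]


def preprocess(text, segment, stop_words):
    """Return one filtered token list per sentence.

    `segment` is any callable mapping a sentence to a list of words;
    the patent uses ICTCLAS for this step.
    """
    return [filter_stop_words(segment(s), stop_words) for s in split_sentences(text)]


def whitespace_segment(sentence):
    """Toy segmenter for illustration only: words are already space-separated."""
    return sentence.split()


result = preprocess("酒店 的 服务 很 好。房间 不 干净！", whitespace_segment, {"的", "很"})
print(result)
```

Passing the segmenter in as a parameter keeps the sketch independent of any particular word segmentation tool.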
Further, building field sentiment dictionaries for different fields specifically comprises:
Selecting field-independent sentiment words from an existing sentiment dictionary and deleting ambiguous and rarely used words from them to form a basic sentiment dictionary;
Extracting all nouns in the corpus, sorting them by word frequency, and using a threshold to select the higher-frequency nouns as evaluation objects;
Using dependency grammar analysis to extract all sentiment paths between the evaluation objects and the sentiment words in the basic sentiment dictionary that modify them;
According to all of the sentiment paths, matching the words that correspond to sentiment paths consistent with the evaluation objects and, after excluding words already in the basic sentiment dictionary, taking the remaining words whose part of speech is adjective, adverb or verb as candidate sentiment words;
After classifying the sentiment polarity of the candidate sentiment words with a word-similarity discrimination algorithm, merging them with the basic dictionary to form the field sentiment dictionary.
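Two of the dictionary-building steps above, selecting evaluation objects by a frequency threshold and filtering candidate sentiment words by part of speech, can be sketched as follows. The sketch assumes the corpus is already available as `(word, pos)` pairs and that the tags `n`, `a`, `d` and `v` mark nouns, adjectives, adverbs and verbs; both the tag names and the function names are illustrative assumptions. Dependency parsing and sentiment path matching are not implemented here.

```python
from collections import Counter


def select_evaluation_objects(tagged_corpus, min_count):
    """Pick nouns whose corpus frequency reaches min_count, most frequent first."""
    noun_counts = Counter(word for word, pos in tagged_corpus if pos == "n")
    return [word for word, count in noun_counts.most_common() if count >= min_count]


def candidate_sentiment_words(matched_words, base_dictionary):
    """Keep matched adjectives, adverbs and verbs that are not in the basic dictionary."""
    allowed_pos = {"a", "d", "v"}
    return [
        word
        for word, pos in matched_words
        if pos in allowed_pos and word not in base_dictionary
    ]


corpus = [
    ("房间", "n"), ("干净", "a"), ("房间", "n"), ("服务", "n"),
    ("房间", "n"), ("服务", "n"), ("前台", "n"),
]
objects = select_evaluation_objects(corpus, min_count=2)
candidates = candidate_sentiment_words(
    [("干净", "a"), ("宽敞", "a"), ("房间", "n")], base_dictionary={"干净"}
)
print(objects, candidates)
```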
Further, calculating the sentiment value of the short text using the field sentiment dictionary and the preprocessed data specifically comprises:
Calculating the TF-IDF value of each word in the field sentiment dictionary, where TF-IDF = TF * IDF, TF denotes term frequency, and IDF denotes inverse document frequency;
For the words obtained by segmenting the short text, calculating the sentiment value of each word, that is, assigning different weights to words according to their TF-IDF values;
Calculating the weighted sum of the sentiment values of all words to obtain the sentiment value of the short text.
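The product TF-IDF = TF * IDF can be sketched as follows. The text gives only the product, so the concrete definitions used here are assumptions: TF is the relative frequency of the word in one document, and IDF is the natural logarithm of the number of documents divided by the number of documents containing the word.

```python
import math


def tf_idf(word, document, corpus):
    """Return TF * IDF for `word` in `document`; `corpus` is a list of token lists."""
    tf = document.count(word) / len(document)
    document_frequency = sum(1 for doc in corpus if word in doc)
    # A word that occurs in no document gets IDF 0 to avoid division by zero.
    idf = math.log(len(corpus) / document_frequency) if document_frequency else 0.0
    return tf * idf


corpus = [["好", "房间"], ["差", "房间"], ["好", "服务"], ["服务", "一般"]]
value = tf_idf("好", corpus[0], corpus)
print(value)
```

In this toy corpus the word "好" makes up half of the first document and occurs in two of four documents, so its value is 0.5 * ln(2).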
Further, calculating the sentiment value of each of the words obtained by segmenting the short text, that is, assigning different weights to words according to their TF-IDF values, specifically comprises:
For the words obtained by segmenting the short text, recording the position at which each word occurs and its propensity value p, where p is initialized to f(TF-IDF) if the word is positive and to -f(TF-IDF) if the word is negative, and f(TF-IDF) is the preset initial sentiment value of the word;
According to the positions at which the words occur, determining whether negation words occur between words and, if so, counting them; when the number of negation words is odd, the propensity value p of the word following the negation words is reversed, otherwise p is unchanged; the final propensity value p is the sentiment value of the word;
Assigning different weights to different words according to their TF-IDF values.
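The negation rule above can be sketched as follows. Several points are assumptions made for illustration: f defaults to the identity function, negation words are counted from the previous sentiment word (or the start of the text) up to the current sentiment word, and the dictionaries `polarity` and `tfidf` are hypothetical inputs.

```python
def word_sentiment_values(tokens, polarity, tfidf, negation_words, f=lambda x: x):
    """Return (position, word, p) for each sentiment word found in `tokens`.

    polarity: maps a sentiment word to "positive" or "negative".
    tfidf: maps a sentiment word to its TF-IDF value.
    """
    values = []
    negations = 0  # negation words seen since the previous sentiment word
    for position, token in enumerate(tokens):
        if token in negation_words:
            negations += 1
        elif token in polarity:
            p = f(tfidf.get(token, 0.0))
            if polarity[token] == "negative":
                p = -p
            if negations % 2 == 1:  # an odd number of negations reverses p
                p = -p
            values.append((position, token, p))
            negations = 0
    return values


tokens = ["房间", "不", "干净", "服务", "不", "是", "不", "好"]
values = word_sentiment_values(
    tokens,
    polarity={"干净": "positive", "好": "positive"},
    tfidf={"干净": 0.4, "好": 0.2},
    negation_words={"不"},
)
print(values)
```

In the example, "干净" follows one negation word and is reversed, while "好" follows two and keeps its sign.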
Further, using random forest as the classification tool to train on the corpus, or to classify short texts of unknown sentiment type, according to the extracted sentiment features specifically comprises:
Formatting the sentiment features as a document using an ARFF feature template;
Calling the random forest algorithm in Weka as the classification tool to train on the sentiment features extracted from the corpus, or to predict the sentiment class of short texts of unknown sentiment type.
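The formatting step can be sketched as a small writer for Weka's ARFF text format. The feature names used in the example are hypothetical placeholders, since the nine features of Table 1 are not reproduced in this text, and the function name `to_arff` is likewise an assumption.

```python
def to_arff(relation, feature_names, rows, class_values):
    """Render numeric features plus a nominal class label as ARFF text.

    rows: list of (feature_values, label) pairs.
    """
    lines = [f"@relation {relation}", ""]
    for name in feature_names:
        lines.append(f"@attribute {name} numeric")
    lines.append("@attribute class {" + ",".join(class_values) + "}")
    lines += ["", "@data"]
    for features, label in rows:
        lines.append(",".join(str(v) for v in features) + "," + label)
    return "\n".join(lines) + "\n"


arff = to_arff(
    relation="short_text_sentiment",
    feature_names=["sentiment_value", "negation_count"],  # hypothetical features
    rows=[([0.8, 1], "positive"), ([-0.2, 3], "negative")],
    class_values=["positive", "negative"],
)
print(arff)
```

A file written this way can then be loaded into Weka and passed to its random forest classifier (`weka.classifiers.trees.RandomForest`) for training or prediction.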
Implementing the present invention has the following beneficial effects:
1) The field-based short text sentiment discrimination method proposed by the present invention improves the accuracy of sentiment classification of text data;
2) The accuracy obtained with the proposed field-based sentiment dictionary is clearly higher than the accuracy achieved with the basic sentiment dictionary.
Brief description of the drawings
In order to illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow diagram of an embodiment of the field-based Chinese short text sentiment classification method provided by the present invention;
Fig. 2 is a flow diagram of the concrete steps of step S101 in Fig. 1;
Fig. 3 shows the experimental comparison between the sentiment dictionary of the proposed method and a traditional sentiment dictionary;
Fig. 4 shows example test results in four fields.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flow diagram of an embodiment of the field-based Chinese short text sentiment classification method provided by the present invention, which comprises the following steps:
S101: preprocess the short text, including sentence segmentation, word segmentation, stop-word filtering and field division.
Specifically, as shown in Fig. 2, step S101 comprises the steps:
S1011: divide the short text into multiple sentences using punctuation marks;
S1012: segment the sentences into independent words with the ICTCLAS word segmentation tool;
S1013: filter the segmented words with a stop-word list;
S1014: determine the field to which the short text belongs according to the short text and its context, in combination with a domain lexicon.
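The text does not say how the domain lexicon is applied in the field division step. One simple possibility, shown here purely as an assumption, is to assign the short text to the field whose lexicon shares the most words with it. The function name `divide_field` and the example lexicons are hypothetical.

```python
def divide_field(tokens, domain_lexicons):
    """Return the field whose lexicon overlaps most with `tokens`, or None if none match."""
    best_field, best_hits = None, 0
    for field, lexicon in domain_lexicons.items():
        hits = sum(1 for token in tokens if token in lexicon)
        if hits > best_hits:
            best_field, best_hits = field, hits
    return best_field


lexicons = {
    "hotel": {"酒店", "房间", "前台"},
    "book": {"书", "作者", "情节"},
}
field = divide_field(["房间", "不", "干净"], lexicons)
print(field)
```

A text with no lexicon hits is left unassigned, so the caller can fall back to context or a default field.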
S102: build field sentiment dictionaries for different fields.
Specifically, step S102 comprises the steps:
S1021: select field-independent sentiment words from an existing sentiment dictionary and delete ambiguous and rarely used words from them to form a basic sentiment dictionary;
S1022: extract all nouns in the corpus, sort them by word frequency, and use a threshold to select the higher-frequency nouns as evaluation objects;
S1023: use dependency grammar analysis to extract all sentiment paths between the evaluation objects and the sentiment words in the basic sentiment dictionary that modify them;
S1024: according to all of the sentiment paths, match the words that correspond to sentiment paths consistent with the evaluation objects and, after excluding words already in the basic sentiment dictionary, take the remaining words whose part of speech is adjective, adverb or verb as candidate sentiment words;
S1025: after classifying the sentiment polarity of the candidate sentiment words with a word-similarity discrimination algorithm, merge them with the basic dictionary to form the field sentiment dictionary.
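The word-similarity discrimination algorithm of step S1025 is not specified in this text. The sketch below shows only the general idea under stated assumptions: a candidate word takes the polarity of the seed set to which it is more similar on average, and a character-level Jaccard similarity (`char_jaccard`) stands in for whatever similarity measure is actually used. All names and seed words are hypothetical.

```python
def char_jaccard(a, b):
    """Toy similarity: overlap of the character sets of two words."""
    set_a, set_b = set(a), set(b)
    return len(set_a & set_b) / len(set_a | set_b)


def classify_polarity(word, positive_seeds, negative_seeds, similarity=char_jaccard):
    """Return "positive", "negative", or None when the evidence is tied."""
    pos = sum(similarity(word, s) for s in positive_seeds) / len(positive_seeds)
    neg = sum(similarity(word, s) for s in negative_seeds) / len(negative_seeds)
    if pos == neg:
        return None
    return "positive" if pos > neg else "negative"


def build_field_dictionary(base_dictionary, candidates, positive_seeds, negative_seeds):
    """Merge classified candidates with the basic dictionary (word -> polarity)."""
    field_dictionary = dict(base_dictionary)
    for word in candidates:
        label = classify_polarity(word, positive_seeds, negative_seeds)
        if label is not None:
            field_dictionary[word] = label
    return field_dictionary


base = {"舒服": "positive", "开心": "positive", "难受": "negative", "糟糕": "negative"}
field_dictionary = build_field_dictionary(
    base, ["舒心", "难看"], ["舒服", "开心"], ["难受", "糟糕"]
)
print(field_dictionary)
```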
S103: calculate the sentiment value of the short text using the field sentiment dictionary and the preprocessed data.
Specifically, step S103 comprises the steps:
S1031: calculate the TF-IDF value of each word in the field sentiment dictionary, where TF-IDF = TF * IDF, TF denotes term frequency, and IDF denotes inverse document frequency;
S1032: for the words obtained by segmenting the short text, calculate the sentiment value of each word, that is, assign different weights to words according to their TF-IDF values.
Specifically, step S1032 comprises:
For the words obtained by segmenting the short text, record the position at which each word occurs and its propensity value p, where p is initialized to f(TF-IDF) if the word is positive and to -f(TF-IDF) if the word is negative, and f(TF-IDF) is the preset initial sentiment value of the word;
According to the positions at which the words occur, determine whether negation words occur between words and, if so, count them; when the number of negation words is odd, the propensity value p of the word following the negation words is reversed, otherwise p is unchanged; the final propensity value p is the sentiment value of the word;
Assign different weights to different words according to their TF-IDF values.
S1033: calculate the weighted sum of the sentiment values of all words to obtain the sentiment value of the short text.
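Step S1033 can be sketched as a weighted sum over word sentiment values. The text does not state how the weights are derived from TF-IDF, so this sketch assumes each weight is the word's TF-IDF value normalized over the sentiment words in the text, and it treats the per-word values p as already computed. The function name and the example numbers are hypothetical.

```python
def short_text_sentiment_value(word_values, tfidf):
    """Weighted sum of word sentiment values.

    word_values: list of (word, p) pairs for the sentiment words in the text.
    tfidf: maps each word to its TF-IDF value; weights are normalized to sum to 1.
    """
    total = sum(tfidf[word] for word, _ in word_values)
    if total == 0:
        return 0.0
    return sum((tfidf[word] / total) * p for word, p in word_values)


score = short_text_sentiment_value(
    [("干净", -1.0), ("好", 1.0)], tfidf={"干净": 0.3, "好": 0.1}
)
print(score)
```

Here the negative word carries three times the weight of the positive one, so the text as a whole scores close to -0.5.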
S104: extract the sentiment features of the short text.
The sentiment features comprise nine features, as shown in Table 1.
Table 1
S105: use random forest as the classification tool to train on the corpus, or to classify short texts of unknown sentiment type, according to the extracted sentiment features.
Specifically, step S105 comprises the steps:
S1051: format the sentiment features using an ARFF feature template;
S1052: call the random forest in Weka as the classification tool to train on the corpus or to classify short texts of unknown sentiment type.
The embodiment of the present invention was evaluated experimentally, and Table 2 compares the resulting accuracy with that of the algorithm of Tan et al. In the hotel and book fields, the proposed algorithm is considerably more accurate than the algorithm of Tan et al., but in the electronics field its accuracy is slightly lower.
Table 2
Fig. 3 shows the experimental comparison between the field sentiment dictionary of the proposed method and the basic sentiment dictionary. The results show that the field sentiment dictionary classifies clearly better than the basic sentiment dictionary: the average accuracy over the four fields improves by 5.3%, with improvements of 4%, 5.2%, 2.9% and 8.8% on the book, hotel, electronic product and movie data sets respectively.
Fig. 4 shows example test results in the four fields, where the horizontal axis is the proportion of data used for training and the vertical axis is the classification accuracy and F-Measure. The results show that accuracy and F-Measure are best when 80% of the data is used for training and 20% for testing.
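The two evaluation measures and the train/test split can be sketched as follows. F-Measure is taken here to be the usual harmonic mean of precision and recall for one class, which is an assumption about the exact variant used in the experiments. The labels in the example are made up for illustration and are not experimental data.

```python
def split_train_test(samples, train_ratio=0.8):
    """Split samples in order into a training part and a test part."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


def accuracy_and_f_measure(true_labels, predicted_labels, positive="positive"):
    """Return (accuracy, F-measure of the `positive` class)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)


train, test = split_train_test(list(range(10)), train_ratio=0.8)
truth = ["positive", "positive", "negative", "negative", "positive"]
guess = ["positive", "negative", "negative", "positive", "positive"]
accuracy, f_measure = accuracy_and_f_measure(truth, guess)
print(len(train), len(test), accuracy, f_measure)
```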
It should be noted that, as used herein, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that comprises a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device that comprises the element.
The serial numbers of the above embodiments are for description only and do not indicate the relative merits of the embodiments.
In the several embodiments provided in this application, the systems and methods set forth may be implemented in other ways. For example, the system embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in an actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (6)
1. A field-based Chinese short text sentiment classification method, characterized by comprising:
Preprocessing the short text, including sentence segmentation, word segmentation, stop-word filtering and field division;
Building field sentiment dictionaries for different fields;
Calculating the sentiment value of the short text using the field sentiment dictionary and the preprocessed data;
Extracting the sentiment features of the short text;
Using random forest as the classification tool to train on the corpus, or to classify short texts of unknown sentiment type, according to the extracted sentiment features.
2. The field-based Chinese short text sentiment classification method as claimed in claim 1, characterized in that preprocessing the short text, including sentence segmentation, word segmentation, stop-word filtering and field division, specifically comprises:
Dividing the short text into multiple sentences using punctuation marks;
Segmenting the sentences into independent words with the ICTCLAS word segmentation tool;
Filtering the segmented words with a stop-word list;
Determining the field to which the short text belongs according to the short text and its context, in combination with a domain lexicon.
3. The field-based Chinese short text sentiment classification method as claimed in claim 1, characterized in that building field sentiment dictionaries for different fields specifically comprises:
Selecting field-independent sentiment words from an existing sentiment dictionary and deleting ambiguous and rarely used words from them to form a basic sentiment dictionary;
Extracting all nouns in the corpus, sorting them by word frequency, and using a threshold to select the higher-frequency nouns as evaluation objects;
Using dependency grammar analysis to extract all sentiment paths between the evaluation objects and the sentiment words in the basic sentiment dictionary that modify them;
According to all of the sentiment paths, matching the words that correspond to sentiment paths consistent with the evaluation objects and, after excluding words already in the basic sentiment dictionary, taking the remaining words whose part of speech is adjective, adverb or verb as candidate sentiment words;
After classifying the sentiment polarity of the candidate sentiment words with a word-similarity discrimination algorithm, merging them with the basic dictionary to form the field sentiment dictionary.
4. The field-based Chinese short text sentiment classification method as claimed in claim 1, characterized in that calculating the sentiment value of the short text using the field sentiment dictionary and the preprocessed data specifically comprises:
Calculating the TF-IDF value of each word in the field sentiment dictionary, where TF-IDF = TF * IDF, TF denotes term frequency, and IDF denotes inverse document frequency;
For the words obtained by segmenting the short text, calculating the sentiment value of each word, that is, assigning different weights to words according to their TF-IDF values;
Calculating the weighted sum of the sentiment values of all words to obtain the sentiment value of the short text.
5. The field-based Chinese short text sentiment classification method as claimed in claim 4, characterized in that calculating the sentiment value of each of the words obtained by segmenting the short text, that is, assigning different weights to words according to their TF-IDF values, specifically comprises:
For the words obtained by segmenting the short text, recording the position at which each word occurs and its propensity value p, where p is initialized to f(TF-IDF) if the word is positive and to -f(TF-IDF) if the word is negative, and f(TF-IDF) is the preset initial sentiment value of the word;
According to the positions at which the words occur, determining whether negation words occur between words and, if so, counting them; when the number of negation words is odd, the propensity value p of the word following the negation words is reversed, otherwise p is unchanged; the final propensity value p is the sentiment value of the word;
Assigning different weights to different words according to their TF-IDF values.
6. The field-based Chinese short text sentiment classification method as claimed in claim 1, characterized in that using random forest as the classification tool to train on the corpus, or to classify short texts of unknown sentiment type, according to the extracted sentiment features specifically comprises:
Formatting the sentiment features as a document using an ARFF feature template;
Calling the random forest algorithm in Weka as the classification tool to train on the sentiment features extracted from the corpus, or to predict the sentiment class of short texts of unknown sentiment type.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510415825.4A CN105069021B (en) | 2015-07-15 | 2015-07-15 | Chinese short text sensibility classification method based on field |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510415825.4A CN105069021B (en) | 2015-07-15 | 2015-07-15 | Chinese short text sensibility classification method based on field |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105069021A (en) | 2015-11-18 |
CN105069021B (en) | 2018-04-20 |
Family
ID=54498394
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510415825.4A Expired - Fee Related CN105069021B (en) | 2015-07-15 | 2015-07-15 | Chinese short text sensibility classification method based on field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105069021B (en) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105808529A (en) * | 2016-03-10 | 2016-07-27 | 武汉传神信息技术有限公司 | Method and device of corpora division field |
CN105930359A (en) * | 2016-04-11 | 2016-09-07 | 百度在线网络技术(北京)有限公司 | Tendency monitoring method and device |
CN106096664A (en) * | 2016-06-23 | 2016-11-09 | 广州云数信息科技有限公司 | A kind of sentiment analysis method based on social network data |
CN106202200A (en) * | 2016-06-28 | 2016-12-07 | 昆明理工大学 | A kind of emotion tendentiousness of text sorting technique based on fixing theme |
CN106776686A (en) * | 2016-11-09 | 2017-05-31 | 武汉泰迪智慧科技有限公司 | Chinese domain short text understanding method and system based on many necks |
CN106844516A (en) * | 2016-12-28 | 2017-06-13 | 中央民族大学 | A kind of extracting method and system of focus word |
CN107194739A (en) * | 2017-05-25 | 2017-09-22 | 上海耐相智能科技有限公司 | A kind of intelligent recommendation system based on big data |
CN107301171A (en) * | 2017-08-18 | 2017-10-27 | 武汉红茶数据技术有限公司 | A kind of text emotion analysis method and system learnt based on sentiment dictionary |
CN108062300A (en) * | 2016-11-08 | 2018-05-22 | 中移(苏州)软件技术有限公司 | A kind of method and device that Sentiment orientation analysis is carried out based on Chinese text |
CN108399158A (en) * | 2018-02-05 | 2018-08-14 | 华南理工大学 | Attribute sensibility classification method based on dependency tree and attention mechanism |
CN108416375A (en) * | 2018-02-13 | 2018-08-17 | 中国联合网络通信集团有限公司 | Work order sorting technique and device |
CN108475261A (en) * | 2016-01-27 | 2018-08-31 | Mz知识产权控股有限责任公司 | Determine the user emotion in chat data |
CN108874937A (en) * | 2018-05-31 | 2018-11-23 | 南通大学 | A kind of sensibility classification method combined based on part of speech with feature selecting |
CN109190106A (en) * | 2018-07-16 | 2019-01-11 | 中国传媒大学 | Sentiment dictionary constructs system and construction method |
CN109214008A (en) * | 2018-09-28 | 2019-01-15 | 珠海中科先进技术研究院有限公司 | A kind of sentiment analysis method and system based on keyword extraction |
CN109241276A (en) * | 2018-07-11 | 2019-01-18 | 河海大学 | Word's kinds method, speech creativeness evaluation method and system in text |
CN109840281A (en) * | 2019-02-27 | 2019-06-04 | 浪潮软件集团有限公司 | A kind of self study intelligent decision method based on random forests algorithm |
CN109977396A (en) * | 2019-02-18 | 2019-07-05 | 深圳壹账通智能科技有限公司 | Emotion identification method, device, computer equipment and the computer storage medium of corpus participle |
CN110688836A (en) * | 2019-09-30 | 2020-01-14 | 湖南大学 | Automatic domain dictionary construction method based on supervised learning |
CN111753525A (en) * | 2020-05-21 | 2020-10-09 | 浙江口碑网络技术有限公司 | Text classification method, device and equipment |
CN112182332A (en) * | 2020-09-25 | 2021-01-05 | 科大国创云网科技有限公司 | Emotion classification method and system based on crawler collection |
CN112347259A (en) * | 2020-11-17 | 2021-02-09 | 河北工程大学 | Comment text sentiment analysis method combining dictionary and machine learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080249764A1 (en) * | 2007-03-01 | 2008-10-09 | Microsoft Corporation | Smart Sentiment Classifier for Product Reviews |
CN102760153A (en) * | 2011-04-21 | 2012-10-31 | 帕洛阿尔托研究中心公司 | Incorporating lexicon knowledge to improve sentiment classification |
CN103365867A (en) * | 2012-03-29 | 2013-10-23 | 腾讯科技(深圳)有限公司 | Method and device for emotion analysis of user evaluation |
CN103678278A (en) * | 2013-12-16 | 2014-03-26 | 中国科学院计算机网络信息中心 | Chinese text emotion recognition method |
-
2015
- 2015-07-15 CN CN201510415825.4A patent/CN105069021B/en not_active Expired - Fee Related
Cited By (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108475261A (en) * | 2016-01-27 | 2018-08-31 | Mz知识产权控股有限责任公司 | Determine the user emotion in chat data |
CN105808529A (en) * | 2016-03-10 | 2016-07-27 | 武汉传神信息技术有限公司 | Method and device of corpora division field |
CN105808529B (en) * | 2016-03-10 | 2018-06-08 | 语联网(武汉)信息技术有限公司 | The method and apparatus that a kind of language material divides field |
CN105930359A (en) * | 2016-04-11 | 2016-09-07 | 百度在线网络技术(北京)有限公司 | Tendency monitoring method and device |
CN106096664A (en) * | 2016-06-23 | 2016-11-09 | 广州云数信息科技有限公司 | A kind of sentiment analysis method based on social network data |
CN106096664B (en) * | 2016-06-23 | 2019-09-20 | 广州云数信息科技有限公司 | A kind of sentiment analysis method based on social network data |
CN106202200A (en) * | 2016-06-28 | 2016-12-07 | 昆明理工大学 | A kind of emotion tendentiousness of text sorting technique based on fixing theme |
CN106202200B (en) * | 2016-06-28 | 2019-09-27 | 昆明理工大学 | A kind of emotion tendentiousness of text classification method based on fixed theme |
CN108062300A (en) * | 2016-11-08 | 2018-05-22 | 中移(苏州)软件技术有限公司 | A kind of method and device that Sentiment orientation analysis is carried out based on Chinese text |
CN106776686A (en) * | 2016-11-09 | 2017-05-31 | 武汉泰迪智慧科技有限公司 | Chinese domain short text understanding method and system based on many necks |
CN106844516A (en) * | 2016-12-28 | 2017-06-13 | 中央民族大学 | A kind of extracting method and system of focus word |
CN107194739A (en) * | 2017-05-25 | 2017-09-22 | 上海耐相智能科技有限公司 | A kind of intelligent recommendation system based on big data |
CN107194739B (en) * | 2017-05-25 | 2018-10-26 | 广州百奕信息科技有限公司 | A kind of intelligent recommendation system based on big data |
CN107301171B (en) * | 2017-08-18 | 2020-09-01 | 武汉红茶数据技术有限公司 | Text emotion analysis method and system based on emotion dictionary learning |
CN107301171A (en) * | 2017-08-18 | 2017-10-27 | 武汉红茶数据技术有限公司 | Text emotion analysis method and system based on emotion dictionary learning |
CN108399158A (en) * | 2018-02-05 | 2018-08-14 | 华南理工大学 | Attribute emotion classification method based on dependency tree and attention mechanism |
CN108399158B (en) * | 2018-02-05 | 2021-05-14 | 华南理工大学 | Attribute emotion classification method based on dependency tree and attention mechanism |
CN108416375B (en) * | 2018-02-13 | 2020-07-07 | 中国联合网络通信集团有限公司 | Work order classification method and device |
CN108416375A (en) * | 2018-02-13 | 2018-08-17 | 中国联合网络通信集团有限公司 | Work order classification method and device |
CN108874937A (en) * | 2018-05-31 | 2018-11-23 | 南通大学 | Emotion classification method based on part of speech combination and feature selection |
CN108874937B (en) * | 2018-05-31 | 2022-05-20 | 南通大学 | Emotion classification method based on part of speech combination and feature selection |
CN109241276A (en) * | 2018-07-11 | 2019-01-18 | 河海大学 | Word classification method in text, and speech creativity evaluation method and system |
CN109241276B (en) * | 2018-07-11 | 2022-03-08 | 河海大学 | Word classification method in text, and speech creativity evaluation method and system |
CN109190106A (en) * | 2018-07-16 | 2019-01-11 | 中国传媒大学 | Emotional dictionary construction system and construction method |
CN109190106B (en) * | 2018-07-16 | 2023-01-10 | 中国传媒大学 | Emotional dictionary construction system and construction method |
CN109214008A (en) * | 2018-09-28 | 2019-01-15 | 珠海中科先进技术研究院有限公司 | Sentiment analysis method and system based on keyword extraction |
CN109977396A (en) * | 2019-02-18 | 2019-07-05 | 深圳壹账通智能科技有限公司 | Emotion identification method, device, computer equipment and computer storage medium based on corpus word segmentation |
CN109840281A (en) * | 2019-02-27 | 2019-06-04 | 浪潮软件集团有限公司 | Self-learning intelligent decision method based on random forest algorithm |
CN110688836A (en) * | 2019-09-30 | 2020-01-14 | 湖南大学 | Automatic domain dictionary construction method based on supervised learning |
CN111753525A (en) * | 2020-05-21 | 2020-10-09 | 浙江口碑网络技术有限公司 | Text classification method, device and equipment |
CN111753525B (en) * | 2020-05-21 | 2023-11-10 | 浙江口碑网络技术有限公司 | Text classification method, device and equipment |
CN112182332A (en) * | 2020-09-25 | 2021-01-05 | 科大国创云网科技有限公司 | Emotion classification method and system based on crawler collection |
CN112347259A (en) * | 2020-11-17 | 2021-02-09 | 河北工程大学 | Comment text sentiment analysis method combining dictionary and machine learning |
Also Published As
Publication number | Publication date |
---|---|
CN105069021B (en) | 2018-04-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105069021A (en) | Chinese short text sentiment classification method based on fields | |
Gao et al. | Detecting online hate speech using context aware models | |
Ruder et al. | Character-level and multi-channel convolutional neural networks for large-scale authorship attribution | |
Alwakid et al. | Challenges in sentiment analysis for Arabic social networks | |
KR101713558B1 (en) | Method of classification and analysis of sentiment in social network service | |
CN108268668B (en) | Topic diversity-based text data viewpoint abstract mining method | |
Llewellyn et al. | Summarizing newspaper comments | |
Vani et al. | Using K-means cluster based techniques in external plagiarism detection | |
Al-Kabi et al. | Arabic/English sentiment analysis: an empirical study | |
Rabab'Ah et al. | Evaluating sentistrength for arabic sentiment analysis | |
Abdelali et al. | Arabic dialect identification in the wild | |
CN103995853A (en) | Multi-language emotional data processing and classifying method and system based on key sentences | |
Fromm et al. | TACAM: topic and context aware argument mining | |
Elouardighi et al. | A machine Learning approach for sentiment analysis in the standard or dialectal Arabic Facebook comments | |
Maronikolakis et al. | Analyzing political parody in social media | |
Ibrahim et al. | Sentiment analysis of Arabic tweets: With special reference restaurant tweets | |
Chandra et al. | Anti social comment classification based on kNN algorithm | |
Al-Mahmoud et al. | Arabic text mining a systematic review of the published literature 2002-2014 | |
Rajalakshmi et al. | DLRG@ HASOC 2019: An Enhanced Ensemble Classifier for Hate and Offensive Content Identification. | |
Hussein et al. | Cluster Analysis on covid-19 outbreak sentiments from twitter data using K-means algorithm | |
Campbell et al. | Content+ context networks for user classification in twitter | |
Khalil et al. | Which configuration works best? an experimental study on supervised Arabic twitter sentiment analysis | |
Mohtasseb et al. | Mining online diaries for blogger identification | |
Castro et al. | Authorship verification, combining linguistic features and different similarity functions | |
Liyanage et al. | Hate Speech Detection in Sinhala-English Code-Mixed Language |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20180420 Termination date: 20190715 |